To list the publications that belong to one of the following topics, enter the category name (cat-1, ...) in the search field:
  • cat-1: Infrastructure, data collection (incl. scenarios) and data management (annotation and standardization)
  • cat-2: Audio and speech processing (tracking, ASR, etc)
  • cat-3: Visual and joint audio-visual processing
  • cat-4: Multimodal structure and content analysis
  • cat-5: Human factors, HCI, system evaluation, and applications
  • cat-6: Project overviews
 






All publications in the database, sorted on year



2008

A generic layout-tool for summaries of meetings in a constraint-based approach, Sandro Castronovo, Jochen Frey and Peter Poller, in: 5th Joint Workshop on Machine Learning and Multimodal Interaction (MLMI 2008), pages 248-259, Springer, Heidelberg, TNO, Utrecht, 2008.
 
Adaptive Beamforming with a Maximum Negentropy Criterion, Kenichi Kumatani, John McDonough, Dietrich Klakow, Philip N. Garner and Weifeng Li, in: Proceedings of The Joint Workshop on Hands-free Speech Communication and Microphone Arrays, 2008.
 
Annotating Subjective Content In Meetings, Theresa Wilson, in: Proceedings of the Language Resources and Evaluation Conference, Springer, LREC-2008, Marrakech, Morocco, 2008.
 
Annotations and Subjective Machines of annotators, embodied agents, users, and other humans, Dennis Reidsma, University of Twente, 2008.
 
Automatic Speech Recognition for Scientific Purposes - webASR, Thomas Hain, Asmaa El Hannani, Stuart Wrigley and Vincent Wan, in: Proc. Interspeech, 2008.
 
Bob: A Lexicon and Pronunciation Dictionary Generator, Vincent Wan, John Dines, Asmaa El Hannani and Thomas Hain, in: Proc. IEEE Workshop on Spoken Language Technology, 2008.
 
Body-Part Templates for Recovery of 2D Human Poses under Occlusion, Ronald Poppe and Mannes Poel, in: International Workshop on Articulated Motion and Deformable Objects (AMDO'08), pages 289-298, Springer-Verlag, 2008.
 
Combining Spectral Representations for Large Vocabulary Continuous Speech Recognition, Giulia Garau and Steve Renals, in: IEEE Transactions on Audio Speech and Language Processing, volume 16-3, pages 508-518, 2008.
 
Comparing word, character, and phoneme n-grams for subjective utterance recognition, Theresa Wilson and Stephan Raaijmakers, in: Interspeech 2008, Brisbane, Australia, 2008.
 
Dealing with Uncertainty in Microphone Placement in a Microphone Array Speech Recognition System, I. Himawan, S. Sridharan and I. McCowan, in: Proceedings of 2008 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), IEEE Signal Processing Society, Nevada, US, 2008.
 
Decision-Level Fusion for Audio-Visual Laughter Detection, B. Reuderink, Mannes Poel, K. P. Truong, Ronald Poppe and Maja Pantic, in: 5th Joint Workshop on Machine Learning and Multimodal Interaction, MLMI 2008, pages 137-148, Springer Verlag, 2008.
 
Design and Evaluation of Systems to Support Interaction Capture and Retrieval, Steve Whittaker, Simon Tucker, K. Swampillai and Rachel Laban, in: Personal and Ubiquitous Computing, volume 12, number 3, pages 197-221, 2008.
 
Determining Latency for on-line Dialog Act Classification, Sebastian Germesin, in: MLMI'08, 2008.
 
Discriminative human action recognition using pairwise CSP classifiers, Ronald Poppe and Mannes Poel, in: 8th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2008), 2008.
 
Domain-specific Classification Methods for Disfluency Detection, Sebastian Germesin, Tilman Becker and Peter Poller, in: Interspeech 2008, 2008.
 
Effect of sound spatialisation on multitasking in remote meetings, S. N. Wrigley, Simon Tucker, G. J. Brown and Steve Whittaker, in: Proceedings of Acoustics'08, 2008.
 
Estimating the Dominant Person in Multi-Party Conversations Using Speaker Diarization Strategies, Hayley Hung, Yan Huang, Gerald Friedland and Daniel Gatica-Perez, in: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2008.
 
Exploiting `Subjective' Annotations, Dennis Reidsma and H. J. A. op den Akker, in: Coling 2008: Proceedings of the workshop on Human Judgements in Computational Linguistics, pages 8-16, Coling 2008 Organizing Committee, 2008.
 
Extrinsic Summarization Evaluation: A Decision Audit Task, Gabriel Murray, Thomas Kleinbauer, Peter Poller, Steve Renals, Jonathan Kilgour and Tilman Becker, in: Machine Learning for Multimodal Interaction - 5th International Workshop, MLMI 2008, pages 349-361, 2008.
 
Filter Bank Design Based on Minimization of Individual Aliasing Terms for Minimum Mutual Information Subband Adaptive Beamforming, Kenichi Kumatani, John McDonough, Stefan Schacht, Dietrich Klakow, Philip N. Garner and Weifeng Li, in: Proceedings International Conference on Acoustics, Speech and Signal Processing, IEEE, 2008.
 
How do I address you? Modelling addressing behaviour based on an analysis of multi-modal corpora of conversational discourse, Rieks op den Akker and Mariet Theune, in: AISB 2008 Symposium on Multimodal Output Generation (MOG 2008), pages 10-17, Aberdeen, UK, 2008.
 
Hybrid Multi-Step Disfluency Detection, Sebastian Germesin, Tilman Becker and Peter Poller, in: MLMI'08, 2008.
 
Identifying Dominant People in Meetings from Audio-Visual Sensors, Hayley Hung and Daniel Gatica-Perez, in: Proc. IEEE Int. Conf. on Automatic Face and Gesture Recognition (FG), Special Session on Multi-Sensor HCI for Smart Environments, Amsterdam, 2008.
 
Interpretation of Multiparty Meetings: The AMI and AMIDA Projects, Steve Renals, Thomas Hain and Hervé Bourlard, in: HSCMA 2008 (6-8 May), pages 115-118, Trento, Italy, 2008.
 
Investigating Automatic Dominance Estimation in Groups from Visual Attention and Speaking Activity, in: Proc. Int. Conf. on Multimodal Interfaces (ICMI), Chania, 2008.
 
Juicer and Tracter software release, Philip N. Garner, Darren Moore, Octavian Cheng, John Dines and Danil Korchagin, 2008.
 
Making remote ‘meeting hopping’ work: assistance to initiate, join and leave meetings, A. H. M. Cremers, Maaike Duistermaat, P Groenewegen and Jacomien de Jong, in: 5th Joint Workshop on Machine Learning and Multimodal Interaction (MLMI 2008), 2008.
 
Meeting Behavior Detection in Smart Environments: Nonverbal Cues that Help to Obtain Natural Interaction, Mannes Poel, Ronald Poppe and Anton Nijholt, in: Proceedings 8th IEEE International Conference on Automatic face and Gesture Recognition (FG 2008), pages 1-6, IEEE Computer Society Press, 2008.
 
Microphone Array Calibration in Diffuse Noise Fields, I. McCowan, Mike Lincoln and I. Himawan, in: IEEE Transactions on Audio, Speech and Language Processing, 2008.
 
Microphone Array Shape Calibration in Diffuse Noise Fields, I. McCowan, Mike Lincoln and I. Himawan, in: IEEE Transactions on Audio, Speech and Language Processing, volume 16-3, pages 666-670, ISSN 1558-7916, 2008.
 
Modeling dominance in group conversations from non-verbal activity cues, Dinesh Jayagopi, Hayley Hung, Chuohao Yeo and Daniel Gatica-Perez, in: IEEE Transactions on Audio, Speech and Language Processing, accepted for publication, 2008.
 
Multimodal Subjectivity Analysis of Multiparty Conversation, Stephan Raaijmakers, K. P. Truong and Theresa Wilson, in: Proceedings of EMNLP, 2008.
 
Mutually Coordinated Anticipatory Multimodal Interaction, Anton Nijholt, Dennis Reidsma, H. van Welbergen, H. J. A. op den Akker and Z. Ruttkay, in: Nonverbal Features of Human-Human and Human-Machine Interaction, 29-31 October 2007, pages 73-93, Springer Verlag, Berlin, Patras, Greece, 2008.
 
On the Contextual Analysis of Agreement Scores, Dennis Reidsma, Dirk Heylen and H. J. A. op den Akker, in: Proceedings of the LREC Workshop on Multimodal Corpora, pages 52-55, ELRA, Marrakech, Morocco, 2008.
 
Optimizing Bottle-Neck Features for LVCSR, Frantisek Grezl and Petr Fousek, pages 4729-4732, IEEE Signal Processing Society, 2008.
 
Overlapped Speech Detection for Improved Speaker Diarization in Multiparty Meetings, Kofi Boakye, B. Trueba-Hornero, O. Vinyals and Gerald Friedland, in: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2008.
 
Packing the Meeting Summarization Knapsack, K. Riedhammer, D. Gillick, B. Favre and D. Hakkani-Tür, in: Interspeech 2008, Brisbane, Australia, 2008.
 
Physicality and Cooperative Design, Dhaval Vyas and Dirk Heylen, Springer LNCS proceedings, MLMI08, 5th Joint Workshop on Machine Learning, Utrecht, The Netherlands, 2008.
 
Predicting the Dominant Clique in Meetings through Fusion of Nonverbal Cues, in: Proc. ACM Int. Conf. on Multimedia (MM), Vancouver, 2008.
 
Predicting Two Facets of Social Verticality in Meetings from Five-minute Time Slices and Nonverbal Cues, in: Proc. Int. Conf. on Multimodal Interfaces (ICMI), Special Session on Social Signal Processing, Chania, 2008.
 
Recognition and Understanding of Meetings Overview of the European AMI and AMIDA Projects, Hervé Bourlard and Steve Renals, number 27, 2008.
 
Recognition of Dialogue Acts in Multiparty Meetings using a Switching DBN, Alfred Dielmann and Steve Renals, in: IEEE Transactions on Audio, Speech and Language Processing, volume 16-5, pages 1303-1314, ISSN 1558-7916, 2008.
 
Reliability measurement without limits, Dennis Reidsma and Jean Carletta, in: Computational Linguistics, volume 34, number 3, pages 319-326, ISSN 0891-2017, 2008. [DOI]
 
Temporal Compression Of Speech: An Evaluation, Simon Tucker and Steve Whittaker, in: IEEE Transactions on Audio, Speech and Language Processing, 2008.
 
The AMIDA Automatic Content Linking Device: Just-in-Time Document Retrieval in Meetings, Andrei Popescu-Belis, Erik Boertjes, Jonathan Kilgour, Peter Poller, Sandro Castronovo, Theresa Wilson, Alejandro Jaimes, Jean Carletta and Rainer Stiefelhagen, in: Machine Learning for Multimodal Interaction, pages 272-283, Springer, TNO, Utrecht, 2008.
 
The influence of audio presentation style on multitasking during teleconferences, Stuart Wrigley, Simon Tucker, G. J. Brown and Steve Whittaker, in: Interspeech 2008, pages 801-804, 2008.
 
Time-Compressing Speech: ASR Transcripts are an Effective Way to Support Gist Extraction, Simon Tucker, N. Kyprianou and Steve Whittaker, in: 5th Joint Workshop on Machine Learning and Multimodal Interaction (MLMI 2008), Utrecht, The Netherlands, 2008.
 
Towards an Objective Test for Meeting Browsers: the BET4TQB Pilot Experiment, Andrei Popescu-Belis, Philippe Baudrion, Mike Flynn and Pierre Wellner, Lecture Notes in Computer Science, volume LNCS 4892/2008, pages 108-119, Springer Verlag, ISBN 978-3-540-78154-7, 2008. [DOI]
 
Tracking the Visual Focus of Attention for a Varying Number of Wandering People, Kevin Smith, Sileye O. Ba, Daniel Gatica-Perez and Jean-Marc Odobez, in: IEEE Trans. on Pattern Analysis and Machine Intelligence, 2008.
 
Two's a Crowd: Improving Speaker Diarization by Automatically Identifying and Excluding Overlapped Speech, Kofi Boakye, O. Vinyals and Gerald Friedland, in: Interspeech 2008, Brisbane, Australia, 2008.
 

2007

A Fast-Match Approach for Robust, Faster than Real-Time Speaker Diarization, Yan Huang, O. Vinyals, Gerald Friedland, C. Müller, Nikki Mirghafori and Chuck Wooters, in: Proceedings of IEEE ASRU, pages 693-698, 2007.
 
A Cognitive and Unsupervised MAP Adaptation approach to the Recognition of the Focus of Attention from Head Pose, Jean-Marc Odobez and Sileye O. Ba, in: Proc. of the IEEE International Conference on Multimedia and Expo (ICME'07), 2007.
 
A Microphone Array Beamforming Approach to Blind Speech Separation, I. McCowan, I. Himawan and Mike Lincoln, in: Machine Learning for Multimodal Interaction IV, pages 294-304, Springer, 2007.
 
Accuracy of head orientation perception in triadic situations: Experiment in a virtual environment, Ronald Poppe, Rutger Rienks and Dirk Heylen, in: Perception, volume 36, number 7, pages 971-979, ISSN 0301-0066, 2007.
 
Adaboost Engine, Pavel Zemcik and Martin Zadnik, in: International Conference on Field Programmable Logic and Applications, FPL 2007, pages 656-660, IEEE Computer Society, Amsterdam, NL, 2007. [DOI]
 
Adaptive Beamforming with a Minimum Mutual Information Criterion, Kenichi Kumatani, Tobias Gehrig, Uwe Mayer, Emilian Stoimenov, John McDonough and Matthias Wolfel, in: IEEE Trans. Audio, Speech and Language Processing, volume 15, pages 2527-2541, 2007.
 
AMI/DA STT and SASTT 2007, Thomas Hain, Lukas Burget, Martin Karafiat, John Dines, David van Leeuwen, Giulia Garau, Mike Lincoln and Vincent Wan, in: Proceedings of the RT07 Workshop 2007, 2007.
 
An Analysis of Sentence Segmentation Features for Broadcast News, Broadcast Conversations, and Meetings, S. Cuendet, Elizabeth Shriberg, B. Favre, J. Fung and D. Hakkani-Tür, in: SIGIR Workshop on Searching Conversational Spontaneous Speech, 2007.
 
Analysis of feature extraction and channel compensation in GMM speaker recognition system, Lukas Burget, Pavel Matejka, Petr Schwarz, Ondřej Glembek and Jan Cernocky, in: IEEE Transactions on Audio, Speech, and Language Processing, volume 15, number 7, pages 1979-1986, ISSN 1558-7916, 2007.
 
Application of CMLLR in narrow band wide band adapted systems, Martin Karafiat, Lukas Burget, Thomas Hain and Jan Cernocky, in: 8th Annual Conference of the International Speech Communication Association, pages 4, International Speech Communication Association, Antwerp, Belgium, 2007.
 
Artefact Ecologies: Supporting Embodied Meeting Practices with Distance Access, Dhaval Vyas and Anne Bajart, in: Proceedings of UbiComp (Ubiquitous Computing) 2007 Workshops, pages 117-122, Ubicomp, University of Innsbruck, Austria, 2007.
 
Audio-based unsupervised segmentation of multiparty dialogue, Pei-Yun Hsueh, in: IEEE International Conference on Acoustics, Speech and Signal Processing, 2008 (ICASSP 2008), pages 5049-5052, Las Vegas, Nevada, USA, 2007. [DOI]
 
Audio-Visual Probabilistic Tracking of Multiple Speakers in Meetings, Daniel Gatica-Perez, Guillaume Lathoud, Jean-Marc Odobez and I. McCowan, in: IEEE Trans. on Audio, Speech, and Language Processing, volume 15, number 5, pages 1696-1710, 2007.
 
Audio-Visual Processing in Meetings: Seven Questions and Current AMI Answers, Marc Al-Hames, Thomas Hain, Jan Cernocky, Sascha Schreiber, Mannes Poel, Ronald Müller, Sebastien Marcel, David van Leeuwen, Jean-Marc Odobez, Sileye O. Ba, Hervé Bourlard, Fabien Cardinaux, Daniel Gatica-Perez, Adam Janin, Petr Motlicek, Stephan Reiter, Steve Renals, Jeroen van Rest, Rutger Rienks, Gerhard Rigoll, Kevin Smith, Andrew Thean and Pavel Zemcik, in: MLMI 2006, 3rd Joint Workshop on Multimodal Interaction and Related Machine Learning Algorithms, pages 24-35, Springer, 2007.
 
Automatic Decision Detection in Meeting Speech, Pei-Yun Hsueh and Johanna D. Moore, in: Machine Learning for Multimodal Interaction IV, Springer, 2007.
 
Automatic Dialogue Act Recognition using a Dynamic Bayesian Network, Alfred Dielmann and Steve Renals, in: Proc. Multimodal Interaction and Related Machine Learning Algorithms Workshop (MLMI--06), pages 178-189, Springer, 2007.
 
Automatic Labeling Inconsistencies Detection And Correction For Sentence Unit Segmentation In Conversational Speech, S. Cuendet, D. Hakkani-Tür and Elizabeth Shriberg, in: Proceedings of MLMI, 2007.
 
Automatic Laughter Detection Using Neural Networks, M. Knox and Nikki Mirghafori, in: Interspeech, 2007.
 
Automatic meeting segmentation using dynamic Bayesian networks, Alfred Dielmann and Steve Renals, in: IEEE Transactions on Multimedia, volume 9, number 1, pages 25-36, 2007. [DOI]
 
Automatic Multi-Modal Meeting Camera Selection for Video-Conferences and Meeting Browsing, Marc Al-Hames, Benedikt Hörnler, Ronald Müller, Joachim Schenk and Gerhard Rigoll, in: Proceedings of the 8th International Conference on Multimedia and Expo (ICME), 2007.
 
Automatic Segmentation and Summarization of Meeting Speech, Gabriel Murray, Pei-Yun Hsueh, Simon Tucker, J. Kilgour, Jean Carletta, Johanna D. Moore and Steve Renals, in: Proc. of NAACL/HLT, 2007.
 
Binaural speech separation using recurrent timing neural networks for joint F0-localisation estimation, S. N. Wrigley and G. J. Brown, Lecture Notes in Computer Science, volume LNCS 4892, pages 271-282, Springer Berlin, ISBN 978-3-540-78154-7, 2007. [DOI]
 
Can unquantised articulatory feature continuums be modelled?, Odette Scharenborg and Vincent Wan, in: Proc. Interspeech 2007, 2007.
 
Challenges for Virtual Humans in Human Computing, Dennis Reidsma, Z. Ruttkay and Anton Nijholt, volume LNAI State-of-th, pages 316-338, Springer Verlag, 2007.
 
Combining Multiple Information Layers for the Automatic Generation of Indicative Meeting Abstracts, Thomas Kleinbauer, Stephanie Becker and Tilman Becker, in: 11th European Workshop on Natural Language Generation (ENLG07), 2007.
 
Combining Multiple Knowledge Sources for Dialogue Segmentation in Multimedia Archives., Pei-Yun Hsueh and Johanna D. Moore, in: Proceedings of the 45th Annual Meeting of the ACL, Association for Computational Linguistics, 2007.
 
Constraint-basierte Generierung parametrisierbarer, multimodaler Comic-Layouts für verlaufsorientierte Meeting-Zusammenfassungen, Jochen Frey, University of Saarbrücken, 2007.
 
Cross-Genre Feature Comparisons for Spoken Sentence Segmentation, S. Cuendet, D. Hakkani-Tür, Elizabeth Shriberg, J. Fung and B. Favre, in: International Conference on Semantic Computing (ICSC), 2007.
 
DBN based joint dialogue act recognition of multiparty meetings, Alfred Dielmann and Steve Renals, in: Proc IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP '07), 2007.
 
DEMO: Automatic Decision Detection in Meeting Speech, Pei-Yun Hsueh, J. Kilgour, Jean Carletta, Johanna D. Moore and Steve Renals, in: Proc. MLMI, 2007.
 
Evaluating Example-based Pose Estimation: Experiments on the HumanEva Sets, Ronald Poppe, in: Online Proceedings of the Workshop on Evaluation of Articulated Human Motion and Pose Estimation (EHuM) at the International Conference on Computer Vision and Pattern Recognition (CVPR), number TR-CTIT-07-72, pages 1-8, 2007.
 
Evaluating Meeting Support Tools, Wilfried Post, M. A. A. HuisIntVeld and S.A.A. van den Boogaard, in: Personal and Ubiquitous Computing, 2007.
 
Evaluating the Future of HCI: Challenges for the Evaluation of Emerging Applications, Ronald Poppe, Rutger Rienks and Betsy van Dijk, Lecture Notes in Artificial Intelligence, pages 234-250, Springer Verlag, ISBN 3-540-72346-2, 2007.
 
Evaluating the Future of HCI: Challenges for the Evaluation of Upcoming Applications, Ronald Poppe and Rutger Rienks, in: Proceedings of the International Workshop on Artificial Intelligence for Human Computing at the International Joint Conference on Artificial Intelligence IJCAI'07, pages 89-96, IJCAI, 2007.
 
Evaluation and comparison of tracking methods using meeting omnidirectional images, Igor Potucek, Vítezslav Beran, Stanislav Sumec and Pavel Zemcik, in: Workshop on Multimodal Interaction and Related Machine Learning Algorithms (MLMI), pages 12, Brno, CZ, 2007.
 
Evaluation of Automatic Video Editing, Stanislav Sumec and Igor Potucek, in: Workshop on Multimodal Interaction and Related Machine Learning Algorithms (MLMI), pages 12, Brno, CZ, 2007.
 
Experiencing-in-the-World : Using Pragmatist Philosophy to Design for Aesthetic Experience, Dhaval Vyas, Dirk Heylen, Anton Eliens and Anton Nijholt, in: Proceedings of the 3rd International Conference on Designing for Users Experience, pages 1-16, ACM Press, 2007.
 
Experiencing-in-the-World: Using Pragmatist Philosophy to Design for Aesthetic Experience, Dhaval Vyas, Dirk Heylen, Anton Nijholt and Anton Eliens, pages 16, 2007.
 
Experimental Comparison of Multimodal Meeting Browsers, Wilfried Post, E. Elling, A. H. M. Cremers and Wessel Kraaij, in: HCII 2007, Beijing, China, 2007.
 
Exploring Contextual Information in a Layered Framework for Group Action Recognition, Dong Zhang and Samy Bengio, in: 2007 IEEE International Conference on Multimedia and Expo, pages 2022-2025, Beijing, 2007. [DOI]
 
Feedback loops in communication and human computing, H. J. A. op den Akker and Dirk Heylen, in: Artificial Intelligence for Human Computing, pages 215-447, Springer Verlag, 2007.
 
Filtering the Unknown: Speech Activity Detection in Heterogeneous Video Collections, Marijn Huijbregts, Chuck Wooters and R.J.F Ordelman, in: Proceedings of Interspeech 2007, pages 4, International Speech Communication Association, 2007.
 
Finding Maximum Margin Segments in Speech, Yago Pereiro Estevan, Vincent Wan and Odette Scharenborg, in: Proceedings of the 32nd International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2007), 2007.
 
Fusion of heterogeneous speaker recognition systems in the STBU submission for the NIST speaker recognition evaluation 2006, Niko Brümmer, Lukas Burget, Jan Cernocky, Ondřej Glembek, Frantisek Grezl, Martin Karafiat, David van Leeuwen, Pavel Matejka, Petr Schwarz and Albert Strasheim, in: IEEE Transactions on Audio, Speech, and Language Processing, volume 15, number 7, pages 2072-2084, ISSN 1558-7916, 2007. [DOI]
 
Hardware Acceleration of AdaBoost Classifier, Jiri Granat, Adam Herout, Michal Hradis and Pavel Zemcik, in: Workshop on Multimodal Int