INTERSPEECH 2005: Lisbon, Portugal
INTERSPEECH 2005 - Eurospeech, 9th European Conference on Speech Communication and Technology, Lisbon, Portugal, September 4-8, 2005.
ISCA 2005
Keynote Papers
- Graeme M. Clark:
The multiple-channel cochlear implant: interfacing electronic technology to human consciousness.
1-4
Speech Recognition - Language Modelling I-III
Prosody in Language Performance I, II
Spoken Language Extraction / Retrieval I, II
- Olivier Siohan, Michiel Bacchiani:
Fast vocabulary-independent audio search using path-based graph indexing.
53-56
- John Makhoul, Alex Baron, Ivan Bulyko, Long Nguyen, Lance A. Ramshaw, David Stallard, Richard M. Schwartz, Bing Xiang:
The effects of speech recognition and punctuation on information extraction performance.
57-60
- Ciprian Chelba, Alex Acero:
Indexing uncertainty for spoken document search.
61-64
- Tomoyosi Akiba, Hiroyuki Abe:
Exploiting passage retrieval for n-best rescoring of spoken questions.
65-68
- BalaKrishna Kolluru, Heidi Christensen, Yoshihiko Gotoh:
Multi-stage compaction approach to broadcast news summarisation.
69-72
- Chien-Lin Huang, Chia-Hsin Hsieh, Chung-Hsien Wu:
Audio-video summarization of TV news using speech recognition and shot change detection.
73-76
The Blizzard Challenge 2005
- Alan W. Black, Keiichi Tokuda:
The blizzard challenge - 2005: evaluating corpus-based speech synthesis on common datasets.
77-80
- Shinsuke Sakai, Han Shu:
A probabilistic approach to unit selection for corpus-based speech synthesis.
81-84
- John Kominek, Christina L. Bennett, Brian Langner, Arthur R. Toth:
The blizzard challenge 2005 CMU entry - a method for improving speech synthesis systems.
85-88
- H. Timothy Bunnell, Christopher A. Pennington, Debra Yarrington, John Gray:
Automatic personal synthetic voice construction.
89-92
- Heiga Zen, Tomoki Toda:
An overview of Nitech HMM-based speech synthesis system for blizzard challenge 2005.
93-96
- Wael Hamza, Raimo Bakis, Zhiwei Shuang, Heiga Zen:
On building a concatenative speech synthesis system from the blizzard challenge speech databases.
97-100
- Robert A. J. Clark, Korin Richmond, Simon King:
Multisyn voices from ARCTIC data for the blizzard challenge.
101-104
- Christina L. Bennett:
Large scale evaluation of corpus-based synthesizers: results and lessons from the blizzard challenge 2005.
105-108
New Applications
- Berlin Chen, Yi-Ting Chen, Chih-Hao Chang, Hung-Bin Chen:
Speech retrieval of Mandarin broadcast news via mobile devices.
109-112
- Michiaki Katoh, Kiyoshi Yamamoto, Jun Ogata, Takashi Yoshimura, Futoshi Asano, Hideki Asoh, Nobuhiko Kitawaki:
State estimation of meetings by information fusion using Bayesian network.
113-116
- Roger K. Moore:
Results from a survey of attendees at ASRU 1997 and 2003.
117-120
- Reinhold Haeb-Umbach, Basilis Kladis, Joerg Schmalenstroeer:
Speech processing in the networked home environment - a view on the amigo project.
121-124
- Masahide Sugiyama:
Fixed distortion segmentation in efficient sound segment searching.
125-128
- Tin Lay Nwe, Haizhou Li:
Identifying singers of popular songs.
129-132
- Jun Ogata, Masataka Goto:
Speech repair: quick error correction just by using selection operation for speech input interfaces.
133-136
- Dirk Olszewski, Fransiskus Prasetyo, Klaus Linhard:
Steerable highly directional audio beam loudspeaker.
137-140
- Hassan Ezzaidi, Jean Rouat:
Automatic music genre classification using second-order statistical measures for the prescriptive approach.
141-144
- Alberto Abad, Dusan Macho, Carlos Segura, Javier Hernando, Climent Nadeu:
Effect of head orientation on the speaker localization performance in smart-room environment.
145-148
- Corinne Fredouille, Gilles Pouchoulin, Jean-François Bonastre, M. Azzarello, Antoine Giovanni, Alain Ghio:
Application of automatic speaker recognition techniques to pathological voice assessment (dysphonia).
149-152
- Upendra V. Chaudhari, Ganesh N. Ramaswamy, Eddie Epstein, Sasha Caskey, Mohamed Kamal Omar:
Adaptive speech analytics: system, infrastructure, and behavior.
153-156
E-learning and Spoken Language Processing
- Katherine Forbes-Riley, Diane J. Litman:
Correlating student acoustic-prosodic profiles with student learning in spoken tutoring dialogues.
157-160
- Diane J. Litman, Katherine Forbes-Riley:
Speech recognition performance and learning in spoken dialogue tutoring.
161-164
- Satoshi Asakawa, Nobuaki Minematsu, Toshiko Isei-Jaakkola, Keikichi Hirose:
Structural representation of the non-native pronunciations.
165-168
- Fu-Chiang Chou:
Ya-ya language box - a portable device for English pronunciation training with speech recognition technologies.
169-172
- Akinori Ito, Yen-Ling Lim, Motoyuki Suzuki, Shozo Makino:
Pronunciation error detection method based on error rule clustering using a decision tree.
173-176
- Abhinav Sethy, Shrikanth Narayanan, Nicolaus Mote, W. Lewis Johnson:
Modeling and automating detection of errors in Arabic language learner speech.
177-180
- Felicia Zhang, Michael Wagner:
Effects of F0 feedback on the learning of Chinese tones by native speakers of English.
181-184
E-inclusion and Spoken Language Processing I, II
- Tom Brøndsted, Erik Aaskoven:
Voice-controlled internet browsing for motor-handicapped users. design and implementation issues.
185-188
- Briony Williams, Delyth Prys, Ailbhe Ní Chasaide:
Creating an ongoing research capability in speech technology for two minority languages: experiences from the WISPR project.
189-192
- Anestis Vovos, Basilis Kladis, Nikolaos D. Fakotakis:
Speech operated smart-home control system for users with special needs.
193-196
- Takatoshi Jitsuhiro, Shigeki Matsuda, Yutaka Ashikari, Satoshi Nakamura, Ikuko Eguchi Yairi, Seiji Igi:
Spoken dialog system and its evaluation of geographic information system for elderly persons' mobility support.
197-200
- Daniele Falavigna, Toni Giorgino, Roberto Gretter:
A frame based spoken dialog system for home care.
201-204
Acoustic Processing for ASR I-III
- Matthias Wölfel:
Frame based model order selection of spectral envelopes.
205-208
- Vivek Tyagi, Christian Wellekens, Hervé Bourlard:
On variable-scale piecewise stationary spectral analysis of speech signals for ASR.
209-212
- Arlo Faria, David Gelbart:
Efficient pitch-based estimation of VTLN warp factors.
213-216
- Yanli Zheng, Richard Sproat, Liang Gu, Izhak Shafran, Haolang Zhou, Yi Su, Daniel Jurafsky, Rebecca Starr, Su-Youn Yoon:
Accent detection and speech recognition for Shanghai-accented Mandarin.
217-220
- Loic Barrault, Renato de Mori, Roberto Gemello, Franco Mana, Driss Matrouf:
Variability of automatic speech recognition systems using different features.
221-224
- Slavomír Lihan, Jozef Juhar, Anton Cizmar:
Crosslingual and bilingual speech recognition with Slovak and Czech SpeechDat-E databases.
225-228
- Carmen Peláez-Moreno, Qifeng Zhu, Barry Y. Chen, Nelson Morgan:
Automatic data selection for MLP-based feature extraction for ASR.
229-232
- Thilo Köhler, Christian Fügen, Sebastian Stüker, Alex Waibel:
Rapid porting of ASR-systems to mobile devices.
233-236
- Hugo Meinedo, João Paulo Neto:
A stream-based audio segmentation, classification and clustering pre-processing system for broadcast news using ANN models.
237-240
- Etienne Marcheret, Karthik Visweswariah, Gerasimos Potamianos:
Speech activity detection fusing acoustic phonetic and energy features.
241-244
- Zoltán Tüske, Péter Mihajlik, Zoltán Tobler, Tibor Fegyó:
Robust voice activity detection based on the entropy of noise-suppressed spectrum.
245-248
- Masamitsu Murase, Shun'ichi Yamamoto, Jean-Marc Valin, Kazuhiro Nakadai, Kentaro Yamada, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno:
Multiple moving speaker tracking by microphone array on mobile robot.
249-252
Speech Recognition - Adaptation I, II
- Yaxin Zhang, Bian Wu, Xiaolin Ren, Xin He:
A speaker biased SI recognizer for embedded mobile applications.
253-256
- Bart Bakker, Carsten Meyer, Xavier L. Aubert:
Fast unsupervised speaker adaptation through a discriminative eigen-MLLR algorithm.
257-260
- Rusheng Hu, Jian Xue, Yunxin Zhao:
Incremental largest margin linear regression and MAP adaptation for speech separation in telemedicine applications.
261-264
- Giulia Garau, Steve Renals, Thomas Hain:
Applying vocal tract length normalization to meeting recordings.
265-268
- S. Umesh, András Zolnay, Hermann Ney:
Implementing frequency-warping and VTLN through linear transformation of conventional MFCC.
269-272
- Xiaodong Cui, Abeer Alwan:
MLLR-like speaker adaptation based on linearization of VTLN with MFCC features.
273-276
- Chandra Kant Raut, Takuya Nishimoto, Shigeki Sagayama:
Model adaptation by state splitting of HMM for long reverberation.
277-280
- Daben Liu, Daniel Kiecza, Amit Srivastava, Francis Kubala:
Online speaker adaptation and tracking for real-time speech recognition.
281-284
- Masafumi Nishida, Yasuo Horiuchi, Akira Ichikawa:
Automatic speech recognition based on adaptation and clustering using temporal-difference learning.
285-288
- Hui Ye, Steve Young:
Improving the speech recognition performance of beginners in spoken conversational interaction for language learning.
289-292
- Randy Gomez, Akinobu Lee, Hiroshi Saruwatari, Kiyohiro Shikano:
Rapid unsupervised speaker adaptation based on multi-template HMM sufficient statistics in noisy environments.
293-296
- Dong-jin Choi, Yung-Hwan Oh:
Rapid speaker adaptation for continuous speech recognition using merging eigenvoices.
297-300
Signal Analysis, Processing and Feature Estimation I-III
- Jian Liu, Thomas Fang Zheng, Jing Deng, Wenhu Wu:
Real-time pitch tracking based on combined SMDSF.
301-304
- András Bánhalmi, Kornél Kovács, András Kocsor, László Tóth:
Fundamental frequency estimation by least-squares harmonic model fitting.
305-308
- Siu Wa Lee, Frank K. Soong, Pak-Chung Ching:
Harmonic filtering for joint estimation of pitch and voiced source with single-microphone input.
309-312
- Marián Képesi, Luis Weruaga:
High-resolution noise-robust spectral-based pitch estimation.
313-316
- John-Paul Hosom:
F0 estimation for adult and children's speech.
317-320
- Ben Milner, Xu Shao, Jonathan Darch:
Fundamental frequency and voicing prediction from MFCCs for speech reconstruction from unconstrained speech.
321-324
- Nelly Barbot, Olivier Boëffard, Damien Lolive:
F0 stylisation with a free-knot B-spline model and simulated-annealing optimization.
325-328
- Friedhelm R. Drepper:
Voiced excitation as entrained primary response of a reconstructed glottal master oscillator.
329-332
- Damien Vincent, Olivier Rosec, Thierry Chonavel:
Estimation of LF glottal source parameters based on an ARX model.
333-336
- Leigh D. Alsteris, Kuldip K. Paliwal:
Some experiments on iterative reconstruction of speech from STFT phase and magnitude spectra.
337-340
- R. Muralishankar, Abhijeet Sangwan, Douglas D. O'Shaughnessy:
Statistical properties of the warped discrete cosine transform cepstrum compared with MFCC.
341-344
- Aníbal J. S. Ferreira:
New signal features for robust identification of isolated vowels.
345-348
- Jonathan Pincas, Philip J. B. Jackson:
Amplitude modulation of frication noise by voicing saturates.
349-352
- Ron M. Hecht, Naftali Tishby:
Extraction of relevant speech features using the information bottleneck method.
353-356
- Mohammad Firouzmand, Laurent Girin, Sylvain Marchand:
Comparing several models for perceptual long-term modeling of amplitude and phase trajectories of sinusoidal speech.
357-360
- Hynek Hermansky, Petr Fousek:
Multi-resolution RASTA filtering for TANDEM-based ASR.
361-364
- Woojay Jeon, Biing-Hwang Juang:
A category-dependent feature selection method for speech signals.
365-368
- Trausti Kristjansson, Sabine Deligne, Peder A. Olsen:
Voicing features for robust speech detection.
369-372
Robust Speech Recognition I-IV
- Svein Gunnar Pettersen, Magne Hallstein Johnsen, Tor André Myrvoll:
Joint Bayesian predictive classification and parallel model combination for robust speech recognition.
373-376
- Glauco F. G. Yared, Fábio Violaro, Lívio C. Sousa:
Gaussian elimination algorithm for HMM complexity reduction in continuous speech recognition systems.
377-380
- Luis Buera, Eduardo Lleida, Antonio Miguel, Alfonso Ortega:
Robust speech recognition in cars using phoneme dependent multi-environment linear normalization.
381-384
- Yi Chen, Lin-Shan Lee:
Energy-based frame selection for reliable feature normalization and transformation in robust speech recognition.
385-388
- Yoshitaka Nakajima, Hideki Kashioka, Kiyohiro Shikano, Nick Campbell:
Remodeling of the sensor for non-audible murmur (NAM).
389-392
- Amarnag Subramanya, Jeff Bilmes, Chia-Ping Chen:
Focused word segmentation for ASR.
393-396
Speech Perception I, II
Spoken Language Understanding I, II
- Ian R. Lane, Tatsuya Kawahara:
Utterance verification incorporating in-domain confidence and discourse coherence measures.
421-424
- Constantinos Boulis, Mari Ostendorf:
Using symbolic prominence to help design feature subsets for topic classification and clustering of natural human-human conversations.
425-428
- Katsuhito Sudoh, Hajime Tsukada:
Tightly integrated spoken language understanding using word-to-concept translation.
429-432
- Ruhi Sarikaya, Hong-Kwang Jeff Kuo, Vaibhava Goel, Yuqing Gao:
Exploiting unlabeled data using multiple classifiers for improved natural language call-routing.
433-436
- Hong-Kwang Jeff Kuo, Vaibhava Goel:
Active learning with minimum expected error for spoken language understanding.
437-440
- Matthias Thomae, Tibor Fábián, Robert Lieb, Günther Ruske:
Lexical out-of-vocabulary models for one-stage speech interpretation.
441-444
E-inclusion and Spoken Language Processing I, II
Paralinguistic and Nonlinguistic Information in Speech
- Nick Campbell, Hideki Kashioka, Ryo Ohara:
No laughing matter.
465-468
- Christophe Blouin, Valérie Maffiolo:
A study on the automatic detection and characterization of emotion in a voice service context.
469-472
- Raul Fernandez, Rosalind W. Picard:
Classical and novel discriminant features for affect recognition from speech.
473-476
- Jaroslaw Cichosz, Krzysztof Slot:
Low-dimensional feature space derivation for emotion recognition.
477-480
- Carlos Toshinori Ishi, Hiroshi Ishiguro, Norihiro Hagita:
Proposal of acoustic measures for automatic detection of vocal fry.
481-484
- Khiet P. Truong, David A. van Leeuwen:
Automatic detection of laughter.
485-488
- Anton Batliner, Stefan Steidl, Christian Hacker, Elmar Nöth, Heinrich Niemann:
Tales of tuning - prototyping for automatic classification of emotional user states.
489-492
- Iker Luengo, Eva Navas, Inmaculada Hernáez, Jon Sánchez:
Automatic emotion recognition using prosodic parameters.
493-496
- Sungbok Lee, Serdar Yildirim, Abe Kazemzadeh, Shrikanth Narayanan:
An articulatory study of emotional speech production.
497-500
- Gregor Hofer, Korin Richmond, Robert A. J. Clark:
Informed blending of databases for emotional speech synthesis.
501-504
- Fabio Tesser, Piero Cosi, Carlo Drioli, Graziano Tisato:
Emotional FESTIVAL-MBROLA TTS synthesis.
505-508
- Felix Burkhardt:
Emofilt: the simulation of emotional speech by prosody-transformation.
509-512
- Andrew Rosenberg, Julia Hirschberg:
Acoustic/prosodic and lexical correlates of charismatic speech.
513-516
- Yoko Greenberg, Minoru Tsuzaki, Hiroaki Kato, Yoshinori Sagisaka:
Communicative speech synthesis using constituent word attributes.
517-520
- Angelika Braun, Matthias Katerbow:
Emotions in dubbed speech: an intercultural approach with respect to F0.
521-524
- Nicolas Audibert, Véronique Aubergé, Albert Rilliard:
The prosodic dimensions of emotion in speech: the relative weights of parameters.
525-528
- Susanne Schötz:
Stimulus duration and type in perception of female and male speaker age.
529-532
- Cecilia Ovesdotter Alm, Richard Sproat:
Perceptions of emotions in expressive storytelling.
533-536
- Hideki Kawahara, Alain de Cheveigné, Hideki Banno, Toru Takahashi, Toshio Irino:
Nearly defect-free F0 trajectory extraction for expressive speech modifications based on STRAIGHT.
537-540
- Tomoko Yonezawa, Noriko Suzuki, Kenji Mase, Kiyoshi Kogure:
Gradually changing expression of singing voice based on morphing.
541-544
Issues in Large Vocabulary Decoding
- I. Lee Hetherington:
A multi-pass, dynamic-vocabulary approach to real-time, large-vocabulary speech recognition.
545-548
- George Saon, Daniel Povey, Geoffrey Zweig:
Anatomy of an extremely fast LVCSR decoder.
549-552
- Dong Yu, Li Deng, Alex Acero:
Evaluation of a long-contextual-span hidden trajectory model and phonetic recognizer using A* lattice search.
553-556
- Takaaki Hori, Atsushi Nakamura:
Generalized fast on-the-fly composition algorithm for WFST-based speech recognition.
557-560
- Hiroaki Nanjo, Teruhisa Misu, Tatsuya Kawahara:
Minimum Bayes-risk decoding considering word significance for information retrieval system.
561-564
- Arthur Chan, Mosur Ravishankar, Alexander I. Rudnicky:
On improvements to CI-based GMM selection.
565-568
- Dominique Massonié, Pascal Nocera, Georges Linares:
Scalable language model look-ahead for LVCSR.
569-572
- Miroslav Novak:
Memory efficient approximative lattice generation for grammar based decoding.
573-576
- Dong-Hoon Ahn, Su-Byeong Oh, Minhwa Chung:
Improved semi-dynamic network decoding using WFSTs.
577-580
- Janne Pylkkönen:
New pruning criteria for efficient decoding.
581-584
- Tibor Fábián, Robert Lieb, Günther Ruske, Matthias Thomae:
A confidence-guided dynamic pruning approach - utilization of confidence measurement in speech recognition.
585-588
Spoken Language Extraction / Retrieval I, II
- Toru Taniguchi, Akishige Adachi, Shigeki Okawa, Masaaki Honda, Katsuhiko Shirai:
Discrimination of speech, musical instruments and singing voices using the temporal patterns of sinusoidal segments in audio signals.
589-592
- Gabriel Murray, Steve Renals, Jean Carletta:
Extractive summarization of meeting recordings.
593-596
- Arjan van Hessen, Jaap Hinke:
IR-based classification of customer-agent phone calls.
597-600
- Benoît Favre, Frédéric Béchet, Pascal Nocera:
Mining broadcast news data: robust information extraction from word lattices.
601-604
- Mikko Kurimo, Ville T. Turunen:
To recover from speech recognition errors in spoken document retrieval.
605-608
- Edgar González, Jordi Turmo:
Unsupervised clustering of spontaneous speech documents.
609-612
- Masahide Yamaguchi, Masaru Yamashita, Shoichi Matsunaga:
Spectral cross-correlation features for audio indexing of broadcast news and meetings.
613-616
- Chiori Hori, Alex Waibel:
Spontaneous speech consolidation for spoken language applications.
617-620
- Sameer Maskey, Julia Hirschberg:
Comparing lexical, acoustic/prosodic, structural and discourse features for speech summarization.
621-624
- Te-Hsuan Li, Ming-Han Lee, Berlin Chen, Lin-Shan Lee:
Hierarchical topic organization and visual presentation of spoken documents using probabilistic latent semantic analysis (PLSA) for efficient retrieval/browsing applications.
625-628
- Janez Zibert, France Mihelic, Jean-Pierre Martens, Hugo Meinedo, João Paulo Neto, Laura Docío Fernández, Carmen García-Mateo, Petr David, Jindrich Zdánský, Matús Pleva, Anton Cizmar, Andrej Zgank, Zdravko Kacic, Csaba Teleki, Klára Vicsi:
The COST278 broadcast news segmentation and speaker clustering evaluation - overview, methodology, systems, results.
629-632
- Igor Szöke, Petr Schwarz, Pavel Matejka, Lukas Burget, Martin Karafiát, Michal Fapso, Jan Cernocký:
Comparison of keyword spotting approaches for informal continuous speech.
633-636
- Teruhisa Misu, Tatsuya Kawahara:
Dialogue strategy to clarify user's queries for document retrieval system with speech interface.
637-640
- Nicolas Moreau, Shan Jin, Thomas Sikora:
Comparison of different phone-based spoken document retrieval methods with text and spoken queries.
641-644
Signal Analysis, Processing and Feature Estimation I-III
- Pedro Gómez Vilda, Francisco Díaz, Agustín Álvarez Marquina, Rafael Martínez, Victoria Rodellar, Roberto Fernández-Baíllo, Alberto Nieto, Francisco J. Fernandez:
PCA of perturbation parameters in voice pathology detection.
645-648
- Anindya Sarkar, T. V. Sreenivas:
Dynamic programming based segmentation approach to LSF matrix reconstruction.
649-652
- T. Nagarajan, Douglas D. O'Shaughnessy:
Explicit segmentation of speech based on frequency-domain AR modeling.
653-656
- Petr Motlícek, Lukás Burget, Jan Cernocký:
Non-parametric speaker turn segmentation of meeting data.
657-660
- Petri Korhonen, Unto K. Laine:
Unsupervised segmentation of continuous speech using vector autoregressive time-frequency modeling errors.
661-664
- P. Vijayalakshmi, M. RamasubbaReddy:
The analysis on band-limited hypernasal speech using group delay based formant extraction technique.
665-668
- Jindrich Zdánský, Jan Nouza:
Detection of acoustic change-points in audio records via global BIC maximization and dynamic programming.
669-672
- Md. Khademul Islam Molla, Keikichi Hirose, Nobuaki Minematsu:
Multi-band approach of audio source discrimination with empirical mode decomposition.
673-676
- Minoru Tsuzaki, Satomi Tanaka, Hiroaki Kato, Yoshinori Sagisaka:
Application of auditory image model for speech event detection.
677-680
- José Anibal Arias:
Unsupervised identification of speech segments using kernel methods for clustering.
681-684
- Georgios Evangelopoulos, Petros Maragos:
Speech event detection using multiband modulation energy.
685-688
- John Kominek, Alan W. Black:
Measuring unsupervised acoustic clustering through phoneme pair merge-and-split tests.
689-692
- Fabio Valente, Christian Wellekens:
Variational Bayesian speaker change detection.
693-696
- Sarah Borys, Mark Hasegawa-Johnson:
Distinctive feature based SVM discriminant features for improvements to phone recognition on telephone band speech.
697-700
- P. Vijayalakshmi, M. RamasubbaReddy:
Detection of hypernasality using statistical pattern classifiers.
701-704
- Luis Weruaga, Marián Képesi:
Self-organizing chirp-sensitive artificial auditory cortical model.
705-708
- Sotiris Karabetsos, Pirros Tsiakoulis, Stavroula-Evita Fotinea, Ioannis Dologlou:
On the use of a decimative spectral estimation method based on eigenanalysis and SVD for formant and bandwidth tracking of speech signals.
709-712
- Alexei V. Ivanov, Marek Parfieniuk, Alexander A. Petrovsky:
Frequency-domain auditory suppression modelling (FASM) - a WDFT-based anthropomorphic noise-robust feature extraction algorithm for speech recognition.
713-716
Keynote Papers
Speech Recognition - Language Modelling I-III
Spoken Language Acquisition, Development and Learning I, II
Multi-modal / Multi-media Processing I, II
- Nick Campbell:
Non-verbal speech processing for a communicative agent.
769-772
- Stuart N. Wrigley, Guy J. Brown:
Physiologically motivated audio-visual localisation and tracking.
773-776
- Jing Huang, Daniel Povey:
Discriminatively trained features using fMPE for multi-stream audio-visual speech recognition.
777-780
- Graziano Tisato, Piero Cosi, Carlo Drioli, Fabio Tesser:
INTERFACE: a new tool for building emotive/expressive talking heads.
781-784
- Pascual Ejarque, Javier Hernando:
Variance reduction by using separate genuine-impostor statistics in multimodal biometrics.
785-788
- Volker Schubert, Stefan W. Hamerich:
The dialog application metalanguage GDialogXML.
789-792
- Jonas Beskow, Mikael Nordenberg:
Data-driven synthesis of expressive visual speech using an MPEG-4 talking head.
793-796
- Oytun Türk, Marc Schröder, Baris Bozkurt, Levent M. Arslan:
Voice quality interpolation for emotional text-to-speech synthesis.
797-800
- Murtaza Bulut, Carlos Busso, Serdar Yildirim, Abe Kazemzadeh, Chul Min Lee, Sungbok Lee, Shrikanth Narayanan:
Investigating the role of phoneme-level modifications in emotional speech resynthesis.
801-804
- Björn Schuller, Ronald Müller, Manfred K. Lang, Gerhard Rigoll:
Speaker independent emotion recognition by early fusion of acoustic and linguistic features within ensembles.
805-808
- Jonghwa Kim, Elisabeth André, Matthias Rehm, Thurid Vogt, Johannes Wagner:
Integrating information from speech and physiological signals to achieve emotional sensitivity.
809-812
- Ellen Douglas-Cowie, Laurence Devillers, Jean-Claude Martin, Roddy Cowie, Suzie Savvidou, Sarkis Abrilian, Cate Cox:
Multimodal databases of everyday emotion: facing up to complexity.
813-816
Spoken / Multi-modal Dialogue Systems I, II
- Francisco Torres, Emilio Sanchis, Encarna Segarra:
Learning of stochastic dialog models through a dialog simulation technique.
817-820
- Lesley-Ann Black, Michael F. McTear, Norman D. Black, Roy Harper, Michelle Lemon:
Evaluating the DI@l-log system on a cohort of elderly, diabetic patients: results from a preliminary study.
821-824
- Pavel Král, Christophe Cerisara, Jana Klecková:
Combination of classifiers for automatic recognition of dialog acts.
825-828
- Xiaojun Wu, Thomas Fang Zheng, Michael Brasser, Zhanjiang Song:
Rapidly developing spoken Chinese dialogue systems with the d-ear SDS SDK.
829-832
- Daniela Oria, Akos Vetek:
Robust algorithms and interaction strategies for voice spelling.
833-836
- Ioannis Toptsis, Axel Haasch, Sonja Hüwel, Jannik Fritsch, Gernot A. Fink:
Modality integration and dialog management for a robotic assistant.
837-840
- Norbert Reithinger, Daniel Sonntag:
An integration framework for a mobile multimodal dialogue system accessing the semantic web.
841-844
- Ryuichi Nisimura, Akinobu Lee, Masashi Yamada, Kiyohiro Shikano:
Operating a public spoken guidance system in real environment.
845-848
- Esa-Pekka Salonen, Markku Turunen, Jaakko Hakulinen, Leena Helin, Perttu Prusi, Anssi Kainulainen:
Distributed dialogue management for smart terminal devices.
849-852
- Jaakko Hakulinen, Markku Turunen, Esa-Pekka Salonen:
Visualization of spoken dialogue systems for demonstration, debugging and tutoring.
853-856
- César González Ferreras, Valentín Cardeñoso-Payo:
Development and evaluation of a spoken dialog system to access a newspaper web site.
857-860
- Olivier Pietquin, Richard Beaufort:
Comparing ASR modeling methods for spoken dialogue simulation and optimal strategy learning.
861-864
- Shiu-Wah Chu, Ian M. O'Neill, Philip Hanna, Michael F. McTear:
An approach to multi-strategy dialogue management.
865-868
- Anna Hjalmarsson:
Towards user modelling in conversational dialogue systems: a qualitative study of the dynamics of dialogue parameters.
869-872
- Kouichi Katsurada, Kazumine Aoki, Hirobumi Yamada, Tsuneo Nitta:
Reducing the description amount in authoring MMI applications.
873-876
- Kazunori Komatani, Naoyuki Kanda, Tetsuya Ogata, Hiroshi G. Okuno:
Contextual constraints based on dialogue models in database search task for spoken dialogue systems.
877-880
- Mihai Rotaru, Diane J. Litman:
Using word-level pitch features to better predict student emotions during spoken tutoring dialogues.
881-884
- Antoine Raux, Brian Langner, Dan Bohus, Alan W. Black, Maxine Eskenazi:
Let's go public! taking a spoken dialog system to the real world.
885-888
- Shinya Fujie, Kenta Fukushima, Tetsunori Kobayashi:
Back-channel feedback generation using linguistic and nonlinguistic information and its application to spoken dialogue system.
889-892
- Kallirroi Georgila, James Henderson, Oliver Lemon:
Learning user simulations for information state update dialogue systems.
893-896
- Darío Martín-Iglesias, Yago Pereiro-Estevan, Ana I. García-Moral, Ascensión Gallardo-Antolín, Fernando Díaz-de-María:
Design of a voice-enabled interface for real-time access to stock exchange from a PDA through GPRS.
897-900
- William Schuler, Tim Miller:
Integrating denotational meaning into a DBN language model.
901-904
- Louis ten Bosch:
Improving out-of-coverage language modelling in a multimodal dialogue system using small training sets.
905-908
- Olivier Galibert, Gabriel Illouz, Sophie Rosset:
Ritel: an open-domain, human-computer dialog system.
909-912
Robust Speech Recognition I-IV
- Reinhold Haeb-Umbach, Joerg Schmalenstroeer:
A comparison of particle filtering variants for speech feature enhancement.
913-916
- Ilyas Potamitis, Nikolaos D. Fakotakis:
Enhancement of mel log-power spectrum of speech using particle filtering.
917-920
- Makoto Shozakai, Goshu Nagino:
Improving robustness of speech recognition performance to aggregate of noises by two-dimensional visualization.
921-924
- Woohyung Lim, Bong Kyoung Kim, Nam Soo Kim:
Feature compensation based on switching linear dynamic model and soft decision.
925-928
- Shilei Huang, Xiang Xie, Jingming Kuang:
Using output probability distribution for improving speech recognition in adverse environment.
929-932
- Eric H. C. Choi:
A generalized framework for compensation of mel-filterbank outputs in feature extraction for robust ASR.
933-936
- Hesham Tolba, Zili Li, Douglas D. O'Shaughnessy:
Robust automatic speech recognition using a perceptually-based optimal spectral amplitude estimator speech enhancement algorithm in various low-SNR environments.
937-940
- Stephen So, Kuldip K. Paliwal:
Improved noise-robustness in distributed speech recognition via perceptually-weighted vector quantisation of filterbank energies.
941-944
- Babak Nasersharif, Ahmad Akbari:
Sub-band weighted projection measure for robust sub-band speech recognition.
945-948
- Jianping Deng, Martin Bouchard, Tet Hin Yeap:
Noise compensation using interacting multiple Kalman filters.
949-952
- Veronique Stouten, Hugo Van Hamme, Patrick Wambacq:
Kalman and unscented Kalman filter feature enhancement for noise robust ASR.
953-956
- Chia-Yu Wan, Lin-Shan Lee:
Histogram-based quantization (HQ) for robust and scalable distributed speech recognition.
957-960
- Yong-Joo Chung:
A data-driven approach for the model parameter compensation in noisy speech recognition.
961-964
- Satoshi Kobashikawa, Satoshi Takahashi, Yoshikazu Yamaguchi, Atsunori Ogawa:
Rapid response and robust speech recognition by preliminary model adaptation for additive and convolutional noise.
965-968
- Saurabh Prasad, Stephen A. Zahorian:
Nonlinear and linear transformations of speech features to compensate for channel and noise effects.
969-972
- Motoyuki Suzuki, Yusuke Kato, Akinori Ito, Shozo Makino:
Construction method of acoustic models dealing with various background noises based on combination of HMMs.
973-976
- Haitian Xu, Zheng-Hua Tan, Paul Dalsgaard, Børge Lindberg:
Robust speech recognition based on noise and SNR classification - a multiple-model framework.
977-980
- Hwa Jeon Song, Hyung Soon Kim:
Eigen-environment based noise compensation method for robust speech recognition.
981-984
- Martin Graciarena, Horacio Franco, Gregory K. Myers, Victor Abrash:
Robust feature compensation in nonstationary and multiple noise environments.
985-988
- Jasha Droppo, Alex Acero:
Maximum mutual information SPLICE transform for seen and unseen conditions.
989-992
- Sven E. Krüger, Martin Schafföner, Marcel Katz, Edin Andelic, Andreas Wendemuth:
Speech recognition with support vector machines in a hybrid system.
993-996
- Vincent Barreaud, Douglas D. O'Shaughnessy, Jean-Guy Dahan:
Experiments on speaker profile portability.
997-1000
- Daniele Colibro, Luciano Fissore, Claudio Vair, Emanuele Dalmasso, Pietro Laface:
A confidence measure invariant to language and grammar.
1001-1004
- Ken Schutte, James R. Glass:
Robust detection of sonorant landmarks.
1005-1008
Speech Production I
- Amélie Rochet-Capellan, Jean-Luc Schwartz:
The labial-coronal effect and CVCV stability during reiterant speech production: an acoustic analysis.
1009-1012
- Amélie Rochet-Capellan, Jean-Luc Schwartz:
The labial-coronal effect and CVCV stability during reiterant speech production: an articulatory analysis.
1013-1016
- Mitsuhiro Nakamura:
Articulatory constraints and coronal stops: an EPG study.
1017-1020
- Vincent Robert, Brigitte Wrobel-Dautcourt, Yves Laprie, Anne Bonneau:
Strategies of labial coarticulation.
1021-1024
- Jianwu Dang, Jianguo Wei, Takeharu Suzuki, Pascal Perrier:
Investigation and modeling of coarticulation during speech.
1025-1028
- Fang Hu:
Tongue kinematics in diphthong production in Ningbo Chinese.
1029-1032
- Takayuki Arai:
Comparing tongue positions of vowels in oral and nasal contexts.
1033-1036
- Slim Ouni:
Can we retrieve vocal tract dynamics that produced speech? toward a speaker articulatory strategy model.
1037-1040
- Pascal Perrier, Liang Ma, Yohan Payan:
Modeling the production of VCV sequences via the inversion of a biomechanical model of the tongue.
1041-1044
- Xiaochuan Niu, Alexander Kain, Jan P. H. van Santen:
Estimation of the acoustic properties of the nasal tract during the production of nasalized vowels.
1045-1048
- Kohichi Ogata:
A web-based articulatory speech synthesis system for distance education.
1049-1052
- Paavo Alku, Matti Airas, Tomas Bäckström, Hannu Pulakka:
Group delay function as a means to assess quality of glottal inverse filtering.
1053-1056
- Eva Björkner, Johan Sundberg, Paavo Alku:
Subglottal pressure and NAQ variation in voice production of classically trained baritone singers.
1057-1060
- Gunnar Fant, Anita Kruckenberg:
Covariation of subglottal pressure, F0 and intensity.
1061-1064
- Javier Pérez, Antonio Bonafonte:
Automatic voice-source parameterization of natural speech.
1065-1068
- Chakir Zeroual, John H. Esling, Lise Crevier-Buchman:
Physiological study of whispered speech in Moroccan Arabic.
1069-1072
- C. P. Moura, D. Andrade, L. M. Cunha, M. J. Cunha, H. Vilarinho, H. Barros, Diamantino Freitas, M. Pais-Clemente:
Voice quality in Down syndrome children treated with rapid maxillary expansion.
1073-1076
- Julien Hanquinet, Francis Grenez, Jean Schoentgen:
Synthesis of disordered speech.
1077-1080
- Julie Fontecave, Frédéric Berthommier:
Quasi-automatic extraction of tongue movement from a large existing speech cineradiographic database.
1081-1084
- Shimon Sapir, Ravit Cohen Mimran:
The working memory token test (WMTT): preliminary findings in young adults with and without dyslexia.
1085-1088
- Sérgio Paulo, Luís C. Oliveira:
Reducing the corpus-based TTS signal degradation due to speaker's word pronunciations.
1089-1092
- Wai-Sum Lee:
A phonetic study of the "er-hua" rimes in Beijing Mandarin.
1093-1096
Acoustic Processing for ASR I-III
- Li Deng, Dong Yu, Alex Acero:
Learning statistically characterized resonance targets in a hidden trajectory model of speech coarticulation and reduction.
1097-1100
- Daniil Kocharov, András Zolnay, Ralf Schlüter, Hermann Ney:
Articulatory motivated acoustic features for speech recognition.
1101-1104
- Shinji Watanabe, Atsushi Nakamura:
Effects of Bayesian predictive classification using variational Bayesian posteriors for sparse training data in speech recognition.
1105-1108
- Yu Tsao, Jinyu Li, Chin-Hui Lee:
A study on separation between acoustic models and its applications.
1109-1112
- Mohamed Afify:
Extended Baum-Welch reestimation of Gaussian mixture models based on reverse Jensen inequality.
1113-1116
- Asela Gunawardana, Milind Mahajan, Alex Acero, John C. Platt:
Hidden conditional random fields for phone classification.
1117-1120
Signal Analysis,
Processing and Feature Estimation I-III
- Francesco Gianfelici, Giorgio Biagetti, Paolo Crippa, Claudio Turchetti:
Asymptotically exact AM-FM decomposition based on iterated Hilbert transform.
1121-1124
- Athanassios Katsamanis, Petros Maragos:
Advances in statistical estimation and tracking of AM-FM speech components.
1125-1128
- Jonathan Darch, Ben P. Milner, Saeed Vaseghi:
Formant frequency prediction from MFCC vectors in noisy environments.
1129-1132
- S. R. Mahadeva Prasanna, B. Yegnanarayana:
Detection of vowel onset point events using excitation information.
1133-1136
- João P. Cabral, Luís C. Oliveira:
Pitch-synchronous time-scaling for prosodic and voice quality transformations.
1137-1140
- Yasunori Ohishi, Masataka Goto, Katunobu Itou, Kazuya Takeda:
Discrimination between singing and speaking voices.
1141-1144
Spoken Language Resources and Technology Evaluation I,
II
- Douglas Jones, Wade Shen, Elizabeth Shriberg, Andreas Stolcke, Teresa M. Kamm, Douglas A. Reynolds:
Two experiments comparing reading with listening for human processing of conversational telephone speech.
1145-1148
- Sylvain Galliano, Edouard Geoffrois, Djamel Mostefa, Khalid Choukri, Jean-François Bonastre, Guillaume Gravier:
The ESTER phase II evaluation campaign for the rich transcription of French broadcast news.
1149-1152
- Takashi Saito:
A method of multi-layered speech segmentation tailored for speech synthesis.
1153-1156
- Sérgio Paulo, Luís C. Oliveira:
Generation of word alternative pronunciations using weighted finite state transducers.
1157-1160
- Helmer Strik, Diana Binnenpoorte, Catia Cucchiarini:
Multiword expressions in spontaneous speech: do we really speak like that?
1161-1164
- Jáchym Kolár, Jan Svec, Stephanie Strassel, Christopher Walker, Dagmar Kozlíková, Josef Psutka:
Czech spontaneous speech corpus with structural metadata.
1165-1168
Early Language Acquisition
Multi-modal / Multi-media Processing I,
II
- Raghunandan S. Kumaran, Karthik Narayanan, John N. Gowdy:
Myoelectric signals for multimodal speech recognition.
1189-1192
- Philippe Daubias:
Is color information really useful for lip-reading? (or what is lost when color is not used).
1193-1196
- Islam Shdaifat, Rolf-Rainer Grigat:
A system for audio-visual speech recognition.
1197-1200
- Norihide Kitaoka, Hironori Oshikawa, Seiichi Nakagawa:
Multimodal interface for organization name input based on combination of isolated word recognition and continuous base-word recognition.
1201-1204
- Yosuke Matsusaka:
Recognition of (3) party conversation using prosody and gaze.
1205-1208
- Dongdong Li, Yingchun Yang, Zhaohui Wu:
Combining voiceprint and face biometrics for speaker identification using SDWS.
1209-1212
- Neil Cooke, Martin Russell:
Using the focus of visual attention to improve spontaneous speech recognition.
1213-1216
- Sabri Gurbuz:
Real-time outer lip contour tracking for HCI applications.
1217-1220
- Jing Huang, Karthik Visweswariah:
Improving lip-reading with feature space transforms for multi-stream audio-visual speech recognition.
1221-1224
- Hansjörg Mixdorff, Denis Burnham, Guillaume Vignali, Patavee Charnvivit:
Are there facial correlates of Thai syllabic tones?
1225-1228
- Rowan Seymour, Ji Ming, Darryl Stewart:
A new posterior based audio-visual integration method for robust speech recognition.
1229-1232
Bridging the Gap ASR-HSR
- Sorin Dusan, Lawrence R. Rabiner:
On integrating insights from human speech perception into automatic speech recognition.
1233-1236
- Odette Scharenborg:
Parallels between HSR and ASR: how ASR can contribute to HSR.
1237-1240
- Louis ten Bosch, Odette Scharenborg:
ASR decoding in a computational model of human word recognition.
1241-1244
- Viktoria Maier, Roger K. Moore:
An investigation into a simulation of episodic memory for automatic speech recognition.
1245-1248
- Eric Fosler-Lussier, C. Anton Rytting, Soundararajan Srinivasan:
Phonetic ignorance is bliss: investigating the effects of phonetic information reduction on ASR performance.
1249-1252
- Marcus Holmberg, David Gelbart, Ulrich Ramacher, Werner Hemmert:
Automatic speech recognition with neural spike trains.
1253-1256
- Michael J. Carey, Tuan P. Quang:
A speech similarity distance weighting for robust recognition.
1257-1260
- Takao Murakami, Kazutaka Maruyama, Nobuaki Minematsu, Keikichi Hirose:
Japanese vowel recognition based on structural representation of speech.
1261-1264
- Soundararajan Srinivasan, DeLiang Wang:
Modeling the perception of multitalker speech.
1265-1268
- Sue Harding, Jon P. Barker, Guy J. Brown:
Binaural feature selection for missing data speech recognition.
1269-1272
- Thorsten Wesker, Bernd T. Meyer, Kirsten Wagener, Jörn Anemüller, Alfred Mertins, Birger Kollmeier:
Oldenburg logatome speech corpus (OLLO) for speech recognition experiments with humans and machines.
1273-1276
Speech Recognition - Language Modelling I-III
- Jen-Wei Kuo, Berlin Chen:
Minimum word error based discriminative training of language models.
1277-1280
- A. Ghaoui, François Yvon, Chafic Mokbel, Gérard Chollet:
On the use of morphological constraints in n-gram statistical language model.
1281-1284
- Elvira I. Sicilia-Garcia, Ji Ming, F. Jack Smith:
A posteriori multiple word-domain language model.
1285-1288
- Javier Dieguez-Tirado, Carmen García-Mateo, Antonio Cardenal López:
Effective topic-tree based language model adaptation.
1289-1292
- Abhinav Sethy, Panayiotis G. Georgiou, Shrikanth Narayanan:
Building topic specific language models from webdata using competitive models.
1293-1296
- Carlos Troncoso, Tatsuya Kawahara:
Trigger-based language model adaptation for automatic meeting transcription.
1297-1300
- Jacques Duchateau, Dong Hoon Van Uytsel, Hugo Van Hamme, Patrick Wambacq:
Statistical language models for large vocabulary spontaneous speech recognition in Dutch.
1301-1304
- Alexandre Allauzen, Jean-Luc Gauvain:
Diachronic vocabulary adaptation for broadcast news transcription.
1305-1308
- Vesa Siivola, Bryan L. Pellom:
Growing an n-gram language model.
1309-1312
- Harald Hüning, Manuel Kirschner, Fritz Class, André Berton, Udo Haiber:
Embedding grammars into statistical language models.
1313-1316
- Simo Broman, Mikko Kurimo:
Methods for combining language models in speech recognition.
1317-1320
- Airenas Vaiciunas, Gailius Raskinis:
Review of statistical modeling of highly inflected Lithuanian using very large vocabulary.
1321-1324
- Genevieve Gorrell, Brandyn Webb:
Generalized Hebbian algorithm for incremental latent semantic analysis.
1325-1328
- Arnar Thor Jensson, Edward W. D. Whittaker, Koji Iwano, Sadaoki Furui:
Language model adaptation for resource deficient languages using translated data.
1329-1332
- Petra Witschel, Sergey Astrov, Gabriele Bakenecker, Josef G. Bauer, Harald Höge:
POS-based language models for large vocabulary speech recognition on embedded systems.
1333-1336
Speech Recognition - Pronunciation Modelling
- Je Hun Jeon, Minhwa Chung:
Automatic generation of domain-dependent pronunciation lexicon with data-driven rules and rule adaptation.
1337-1340
- Michael Tjalve, Mark Huckvale:
Pronunciation variation modelling using accent features.
1341-1344
- Khiet P. Truong, Ambra Neri, Febe de Wet, Catia Cucchiarini, Helmer Strik:
Automatic detection of frequent pronunciation errors made by L2-learners.
1345-1348
- Josef Psutka, Pavel Ircing, Josef V. Psutka, Jan Hajic, William J. Byrne, Jirí Mírovský:
Automatic transcription of Czech, Russian, and Slovak spontaneous speech in the MALACH project.
1349-1352
- Stéphane Dupont, Christophe Ris, Laurent Couvreur, Jean-Marc Boite:
A study of implicit and explicit modeling of coarticulation and pronunciation variation.
1353-1356
- Shinya Takahashi, Tsuyoshi Morimoto, Sakashi Maeda, Naoyuki Tsuruta:
Detection of coughs from user utterances using imitated phoneme model.
1357-1360
- V. Ramasubramanian, P. Srinivas, T. V. Sreenivas:
Stochastic pronunciation modeling by ergodic-HMM of acoustic sub-word units.
1361-1364
- Chen Liu, Lynette Melnar:
An automated linguistic knowledge-based cross-language transfer method for building acoustic models for a language without native training data.
1365-1368
- Ghazi Bouselmi, Dominique Fohr, Irina Illina, Jean Paul Haton:
Fully automated non-native speech recognition using confusion-based acoustic model integration.
1369-1372
Prosodic Structure
- Véronique Aubergé, Albert Rilliard:
The focus prosody: more than a simple binary function.
1373-1376
- Martha Dalton, Ailbhe Ní Chasaide:
Peak timing in two dialects of Connaught Irish.
1377-1380
- Janet Fletcher:
Compound rises and "uptalk" in spoken English.
1381-1384
- Li-chiung Yang:
Duration and the temporal structure of Mandarin discourse.
1385-1388
- Bei Wang:
Prosodic realization of split noun phrases in Mandarin Chinese compared in topic and focus contexts.
1389-1392
- Ziyu Xiong:
Downstep effect on disyllabic words of citation forms in standard Chinese.
1393-1396
- Jinfu Ni, Hisashi Kawai, Keikichi Hirose:
Estimation of intonation variation with constrained tone transformations.
1397-1400
- Ho-hsien Pan:
Voice quality of falling tones in Taiwan Min.
1401-1404
- Chiu-yu Tseng, Bau-Ling Fu:
Duration, intensity and pause predictions in relation to prosody organization.
1405-1408
- Jiahong Yuan, Jason M. Brenier, Daniel Jurafsky:
Pitch accent prediction: effects of genre and speaker.
1409-1412
- Hiroya Fujisaki, Sumio Ohno:
Analysis and modeling of fundamental frequency contours of Hindi utterances.
1413-1416
- Natasha Govender, Etienne Barnard, Marelie H. Davel:
Fundamental frequency and tone in isiZulu: initial experiments.
1417-1420
- Judith Bishop, Marc Peake, Dmitry Sityaev:
Intonational sequences in Tuscan Italian.
1421-1424
- Caterina Petrone:
Effects of raddoppiamento sintattico on tonal alignment in Italian.
1425-1428
- Tomás Dubeda, Jan Votrubec:
Acoustic analysis of Czech stress: intonation, duration and intensity revisited.
1429-1432
- Mohamed Yeou:
Variability of F0 peak alignment in Moroccan Arabic accentual focus.
1433-1436
- Anne Lacheret, Ch. Lyche, Michel Morel:
Phonological analysis of schwa and liaison within the PFC project (phonologie du français contemporain): how determinant are the prosodic factors?
1437-1440
- Plínio A. Barbosa, Pablo Arantes, Alexsandro R. Meireles, Jussara M. Vieira:
Abstractness in speech-metronome synchronisation: P-centres as cyclic attractors.
1441-1444
Applications of Confidence Related Measures to ASR
- Makoto Yamada, Tsuneo Kato, Masaki Naito, Hisashi Kawai:
Improvement of rejection performance of keyword spotting using anti-keywords derived from large vocabulary considering acoustical similarity to keywords.
1445-1448
- Ralf Schlüter, T. Scharrenbach, Volker Steinbiss, Hermann Ney:
Bayes risk minimization using metric loss functions.
1449-1452
- Akio Kobayashi, Kazuo Onoe, Shoei Sato, Toru Imai:
Word error rate minimization using an integrated confidence measure.
1453-1456
- Bin Dong, Qingwei Zhao, Yonghong Yan:
Fast confidence measure algorithm for continuous speech recognition.
1457-1460
- Hamed Ketabdar, Jithendra Vepa, Samy Bengio, Hervé Bourlard:
Developing and enhancing posterior based speech recognition systems.
1461-1464
- Peng Liu, Ye Tian, Jian-Lai Zhou, Frank K. Soong:
Background model based posterior probability for measuring confidence.
1465-1468
Multilingual TTS
- Laura Mayfield Tomokiyo, Alan W. Black, Kevin A. Lenzo:
Foreign accents in synthetic speech: development and evaluation.
1469-1472
- Raul Fernandez, Wei Zhang, Ellen Eide, Raimo Bakis, Wael Hamza, Yi Liu, Michael Picheny, John F. Pitrelli, Yong Qing, Zhiwei Shuang, Li Qin Shen:
Toward multiple-language TTS: experiments in English and Mandarin.
1473-1476
- Javier Latorre, Koji Iwano, Sadaoki Furui:
Cross-language synthesis with a polyglot synthesizer.
1477-1480
- Mucemi Gakuru, Frederick K. Iraki, Roger Tucker, Ksenia Shalonova, Kamanda Ngugi:
Development of a Kiswahili text to speech system.
1481-1484
- Jaime Botella Ordinas, Volker Fischer, Claire Waast-Richard:
Multilingual models in the IBM bilingual text-to-speech systems.
1485-1488
- Artur Janicki, Piotr Herman:
Reconstruction of Polish diacritics in a text-to-speech system.
1489-1492
Speech Bandwidth Extension
- Hiroyuki Ehara, Toshiyuki Morii, Masahiro Oshikiri, Koji Yoshida, Kouichi Honma:
Design of bandwidth scalable LSF quantization using interframe and intraframe prediction.
1493-1496
- Bernd Geiser, Peter Jax, Peter Vary:
Artificial bandwidth extension of speech supported by watermark-transmitted side information.
1497-1500
- Rongqiang Hu, Venkatesh Krishnan, David V. Anderson:
Speech bandwidth extension by improved codebook mapping towards increased phonetic classification.
1501-1504
- Dhananjay Bansal, Bhiksha Raj, Paris Smaragdis:
Bandwidth expansion of narrowband speech using non-negative matrix factorization.
1505-1508
- Michael L. Seltzer, Alex Acero, Jasha Droppo:
Robust bandwidth extension of noise-corrupted narrowband speech.
1509-1512
- João P. Cabral, Luís C. Oliveira:
Pitch-synchronous time-scaling for high-frequency excitation regeneration.
1513-1516
Spoken Language Resources and Technology Evaluation I,
II
- Felix Burkhardt, Astrid Paeschke, M. Rolfes, Walter F. Sendlmeier, Benjamin Weiss:
A database of German emotional speech.
1517-1520
- Philippe Boula de Mareüil, Christophe d'Alessandro, Gérard Bailly, Frédéric Béchet, Marie-Neige Garcia, Michel Morel, Romain Prudon, Jean Véronis:
Evaluating the pronunciation of proper names by four French grapheme-to-phoneme converters.
1521-1524
- Filip Jurcícek, Jirí Zahradil, Libor Jelínek:
A human-human train timetable dialogue corpus.
1525-1528
- Gloria Branco, Luís Almeida, Rui Gomes, Nuno Beires:
A Portuguese spoken and multi-modal dialog corpora.
1529-1532
- Joyce Y. C. Chan, P. C. Ching, Tan Lee:
Development of a Cantonese-English code-mixing speech corpus.
1533-1536
- Andrej Zgank, Darinka Verdonik, Aleksandra Zögling Markus, Zdravko Kacic:
BNSI Slovenian broadcast news database - speech and text corpus.
1537-1540
- Jan Volín, Radek Skarnitzl, Petr Pollák:
Confronting HMM-based phone labelling with human evaluation of speech production.
1541-1544
- Stephanie Strassel, Jáchym Kolár, Zhiyi Song, Leila Barclay, Meghan Lammie Glenn:
Structural metadata annotation: moving beyond English.
1545-1548
- Delphine Charlet, Sacha Krstulovic, Frédéric Bimbot, Olivier Boëffard, Dominique Fohr, Odile Mella, Filip Korkmazsky, Djamel Mostefa, Khalid Choukri, Arnaud Vallée:
Neologos: an optimized database for the development of new speech processing algorithms.
1549-1552
- Cheng-Yuan Lin, Kuan-Ting Chen, Jyh-Shing Roger Jang:
A hybrid approach to automatic segmentation and labeling for Mandarin Chinese speech corpus.
1553-1556
- Yuang-Chin Chiang, Min-Siong Liang, Hong-Yi Lin, Ren-Yuan Lyu:
The multiple pronunciations in Taiwanese and the automatic transcription of Buddhist sutra with augmented read speech.
1557-1560
- Marelie H. Davel, Etienne Barnard:
Bootstrapping pronunciation dictionaries: practical issues.
1561-1564
- Nigel G. Ward, Anais G. Rivera, Karen Ward, David G. Novick:
Root causes of lost time and user stress in a simple dialog system.
1565-1568
- Julie A. Parisi, Douglas Brungart:
Evaluating communication effectiveness in team collaboration.
1569-1572
- David Conejero, Alan Lounds, Carmen García-Mateo, Leandro Rodríguez Liñares, Raquel Mochales, Asunción Moreno:
Bilingual aligned corpora for speech to speech translation for Spanish, English and Catalan.
1573-1576
- Hynek Boril, Petr Pollák:
Design and collection of Czech Lombard speech database.
1577-1580
- Abe Kazemzadeh, Hong You, Markus Iseli, Barbara Jones, Xiaodong Cui, Margaret Heritage, Patti Price, Elaine Andersen, Shrikanth Narayanan, Abeer Alwan:
TBALL data collection: the making of a young children's speech corpus.
1581-1584
- Hitomi Tohyama, Shigeki Matsubara, Nobuo Kawaguchi, Yasuyoshi Inagaki:
Construction and utilization of bilingual speech corpus for simultaneous machine interpretation research.
1585-1588
- Rebecca A. Bates, Patrick Menning, Elizabeth Willingham, Chad Kuyper:
Meeting acts: a labeling system for group interaction in meetings.
1589-1592
- Marius-Calin Silaghi, Rachna Vargiya:
A new evaluation criteria for keyword spotting techniques and a new algorithm.
1593-1596
- Christoph Draxler, Alexander Steffen:
Phattsessionz: recording 1000 adolescent speakers in schools in Germany.
1597-1600
- Solomon Teferra Abate, Wolfgang Menzel, Bairu Tafila:
An Amharic speech corpus for large vocabulary continuous speech recognition.
1601-1604
- Hans Dolfing, David Reitter, Luís Almeida, Nuno Beires, Michael Cody, Rui Gomes, Kerry Robinson, Roman Zielinski:
The FASiL speech and multimodal corpora.
1605-1608
- Karin Müller:
Revealing phonological similarities between German and Dutch.
1609-1612
Large Vocabulary Speech Recognition Systems
- Dimitra Vergyri, Katrin Kirchhoff, Venkata Ramana Rao Gadde, Andreas Stolcke, Jing Zheng:
Development of a conversational telephone speech recognizer for Levantine Arabic.
1613-1616
- Bhuvana Ramabhadran:
Exploiting large quantities of spontaneous speech for unsupervised training of acoustic models.
1617-1620
- Che-Kuang Lin, Lin-Shan Lee:
Improved spontaneous Mandarin speech recognition by disfluency interruption point (IP) detection using prosodic features.
1621-1624
- Jeff Z. Ma, Spyros Matsoukas:
Improvements to the BBN RT04 Mandarin conversational telephone speech recognition system.
1625-1628
- Sakriani Sakti, Satoshi Nakamura, Konstantin Markov:
Incorporating a Bayesian wide phonetic context model for acoustic rescoring.
1629-1632
- Abdelkhalek Messaoudi, Lori Lamel, Jean-Luc Gauvain:
Modeling vowels for Arabic BN transcription.
1633-1636
- Mohamed Afify, Long Nguyen, Bing Xiang, Sherif Abdou, John Makhoul:
Recent progress in Arabic broadcast news transcription at BBN.
1637-1640
- Spyros Matsoukas, Rohit Prasad, Srinivas Laxminarayan, Bing Xiang, Long Nguyen, Richard M. Schwartz:
The 2004 BBN 1xRT recognition systems for English broadcast news and conversational telephone speech.
1641-1644
- Rohit Prasad, Spyros Matsoukas, Chia-Lin Kao, Jeff Z. Ma, D.-X. Xu, Thomas Colthurst, Owen Kimball, Richard M. Schwartz, Jean-Luc Gauvain, Lori Lamel, Holger Schwenk, Gilles Adda, Fabrice Lefèvre:
The 2004 BBN/LIMSI 20xRT English conversational telephone speech recognition system.
1645-1648
- Bing Xiang, Long Nguyen, Xuefeng Guo, Dongxin Xu:
The BBN Mandarin broadcast news transcription system.
1649-1652
- Paul Deléglise, Yannick Estève, Sylvain Meignier, Téva Merlin:
The LIUM speech transcription system: a CMU Sphinx III-based system for French broadcast news.
1653-1656
- Lori Lamel, Gilles Adda, Eric Bilinski, Jean-Luc Gauvain:
Transcribing lectures and seminars.
1657-1660
- Thomas Hain, John Dines, Giulia Garau, Martin Karafiát, Darren Moore, Vincent Wan, Roeland Ordelman, Steve Renals:
Transcription of conference room meetings: an investigation.
1661-1664
- Jean-Luc Gauvain, Gilles Adda, Martine Adda-Decker, Alexandre Allauzen, Véronique Gendner, Lori Lamel, Holger Schwenk:
Where are we in transcribing French broadcast news?
1665-1668
- Odette Scharenborg, Stephanie Seneff:
Two-pass strategy for handling OOVs in a large vocabulary recognition task.
1669-1672
- Long Nguyen, Bing Xiang, Mohamed Afify, Sherif Abdou, Spyros Matsoukas, Richard M. Schwartz, John Makhoul:
The BBN RT04 English broadcast news transcription system.
1673-1676
- Rong Zhang, Ziad Al Bawab, Arthur Chan, Ananlada Chotimongkol, David Huggins-Daines, Alexander I. Rudnicky:
Investigations on ensemble based semi-supervised acoustic model training.
1677-1680
- Jan Nouza, Jindrich Zdánský, Petr David, Petr Cerva, Jan Kolorenc, Dana Nejedlová:
Fully automated system for Czech spoken broadcast transcription with very large (300k+) lexicon.
1681-1684
- Mike Schuster, Takaaki Hori, Atsushi Nakamura:
Experiments with probabilistic principal component analysis in LVCSR.
1685-1688
- Thang Tat Vu, Dung Tien Nguyen, Chi Mai Luong, John-Paul Hosom:
Vietnamese large vocabulary continuous speech recognition.
1689-1692
- Takahiro Shinozaki, Mari Ostendorf, Les E. Atlas:
Data sampling for improved speech recognizer training.
1693-1696
Speech Perception I,
II
- Do Dat Tran, Eric Castelli, Jean-François Serignat, Van Loan Trinh, Le Xuan Hung:
Influence of F0 on Vietnamese syllable perception.
1697-1700
- Barbara Schwanhäußer, Denis Burnham:
Lexical tone and pitch perception in tone and non-tone language speakers.
1701-1704
- Isabel Falé, Isabel Hub Faria:
Intonational contrasts in EP: a categorical perception approach.
1705-1708
- Bettina Braun, Andrea Weber, Matthew W. Crocker:
Does narrow focus activate alternative referents?
1709-1712
- Kiyoaki Aikawa, Hayato Hashimoto:
Audiovisual interaction on the perception of frequency glide of linear sweep tones.
1713-1716
- Kei Omata, Ken Mogi:
Audiovisual integration in dichotic listening.
1717-1720
- Gunilla Svanfeldt, Dirk Olszewski:
Perception experiment combining a parametric loudspeaker and a synthetic talking head.
1721-1724
- Catherine Mayo, Robert A. J. Clark, Simon King:
Multidimensional scaling of listener responses to synthetic speech.
1725-1728
- Hiroko Terasawa, Malcolm Slaney, Jonathan Berger:
A timbre space for speech.
1729-1732
- Abdellah Kacha, Francis Grenez, Jean Schoentgen:
Voice quality assessment by means of comparative judgments of speech tokens.
1733-1736
- Toshio Irino, Satoru Satou, Shunsuke Nomura, Hideki Banno, Hideki Kawahara:
Speech intelligibility derived from time-frequency and source smearing.
1737-1740
- Nahoko Hayashi, Takayuki Arai, Nao Hodoshima, Yusuke Miyauchi, Kiyohiro Kurisu:
Steady-state pre-processing for improving speech intelligibility in reverberant environments: evaluation in a hall with an electrical reverberator.
1741-1744
- Patrick C. M. Wong, Kiara M. Lee, Todd B. Parrish:
Neural bases of listening to speech in noise.
1745-1748
- P. Jongmans, Frans J. M. Hilgers, Louis C. W. Pols, C. J. van As-Brooks:
The intelligibility of tracheoesophageal speech: first results.
1749-1752
- Guy J. Brown, Kalle J. Palomäki:
A computational model of the speech reception threshold for laterally separated speech and noise.
1753-1756
- Esther Janse:
Lexical inhibition effects in time-compressed speech.
1757-1760
- Caroline Jacquier, Fanny Meunier:
Perception of time-compressed rapid acoustic cues in French CV syllables.
1761-1764
- Claire-Léonie Grataloup, Michel Hoen, François Pellegrino, E. Veuillet, Lionel Collet, Fanny Meunier:
Reversed speech comprehension depends on the auditory efferent system functionality.
1765-1768
- Won Tokuma, Shinichi Tokuma:
Perceptual space of English fricatives for Japanese learners.
1769-1772
- Ioana Vasilescu, Maria Candea, Martine Adda-Decker:
Perceptual salience of language-specific acoustic differences in autonomous fillers across eight languages.
1773-1776
- Marc D. Pell:
Effects of cortical and subcortical brain damage on the processing of emotional prosody.
1777-1780
Keynote Papers
- Elizabeth Shriberg:
Spontaneous speech: how people really talk and why engineers should care.
1781-1784
Speech Recognition - Adaptation I,
II
Prosody Modelling and Speech Technology I,
II
Detecting and Synthesizing Speaker State
- Julia Hirschberg, Stefan Benus, Jason M. Brenier, Frank Enos, Sarah Friedman, Sarah Gilman, Cynthia Girand, Martin Graciarena, Andreas Kathol, Laura Michaelis, Bryan L. Pellom, Elizabeth Shriberg, Andreas Stolcke:
Distinguishing deceptive from non-deceptive speech.
1833-1836
- Jackson Liscombe, Julia Hirschberg, Jennifer J. Venditti:
Detecting certainness in spoken tutorial dialogues.
1837-1840
- Laurence Vidrascu, Laurence Devillers:
Detection of real-life emotions in call centers.
1841-1844
- Jackson Liscombe, Giuseppe Riccardi, Dilek Z. Hakkani-Tür:
Using context to improve emotion detection in spoken dialog systems.
1845-1848
- Irena Yanushevskaya, Christer Gobl, Ailbhe Ní Chasaide:
Voice quality and f0 cues for affect expression: implications for synthesis.
1849-1852
- Toru Takahashi, Takeshi Fujii, Masashi Nishi, Hideki Banno, Toshio Irino, Hideki Kawahara:
Voice and emotional expression transformation based on statistics of vowel parameters in an emotional speech database.
1853-1856
Rapid Development of Spoken Dialogue Systems
- Giuseppe Di Fabbrizio, Gökhan Tür, Dilek Z. Hakkani-Tür:
Automated wizard-of-oz for spoken dialogue systems.
1857-1860
- Kouichi Katsurada, Kunitoshi Sato, Hiroaki Adachi, Hirobumi Yamada, Tsuneo Nitta:
A rapid prototyping tool for constructing web-based MMI applications.
1861-1864
- Philip Hanna, Ian M. O'Neill, Xingkun Liu, Michael F. McTear:
Developing extensible and reusable spoken dialogue components: an examination of the Queen's communicator.
1865-1868
- Ye-Yi Wang, Alex Acero:
SGStudio: rapid semantic grammar development for spoken language understanding.
1869-1872
- Murat Akbacak, Yuqing Gao, Liang Gu, Hong-Kwang Jeff Kuo:
Rapid transition to new spoken dialogue domains: language model training using knowledge from previous domain applications and web text resources.
1873-1876
- Manny Rayner, Pierrette Bouillon, Nikos Chatzichrisafis, Beth Ann Hockey, Marianne Santaholma, Marianne Starlander, Hitoshi Isahara, Kyoko Kanzaki, Yukie Nakao:
A methodology for comparing grammar-based and robust approaches to speech understanding.
1877-1880
Text-to-Speech I,
II
- François Mairesse, Marilyn A. Walker:
Learning to personalize spoken generation for dialogue systems.
1881-1884
- S. Revelin, Didier Cadic, Claire Waast-Richard:
Optimization of text-to-speech phonetic transcriptions using a-posteriori signal comparison.
1885-1888
- Özgül Salor, Mübeccel Demirekler:
Voice transformation using principle component analysis based LSF quantization and dynamic programming approach.
1889-1892
- Hai Ping Li, Wei Zhang:
Adapt Mandarin TTS system to Chinese dialect TTS systems.
1893-1896
- Min Zheng, Qin Shi, Wei Zhang, Lianhong Cai:
Grapheme-to-phoneme conversion based on TBL algorithm in Mandarin TTS system.
1897-1900
- Paolo Massimino, Alberto Pacchiotti:
An automaton-based machine learning technique for automatic phonetic transcription.
1901-1904
- Tasanawan Soonklang, Robert I. Damper, Yannick Marchand:
Comparative objective and subjective evaluation of three data-driven techniques for proper name pronunciation.
1905-1908
- Olov Engwall:
Articulatory synthesis using corpus-based estimation of line spectrum pairs.
1909-1912
- Aoju Chen, Els den Os:
Effects of pitch accent type on interpreting information status in synthetic speech.
1913-1916
- Perttu Prusi, Anssi Kainulainen, Jaakko Hakulinen, Markku Turunen, Esa-Pekka Salonen, Leena Helin:
Towards generic spatial object model and route guidance grammar for speech-based systems.
1917-1920
- Chi-Chun Hsia, Chung-Hsien Wu, Te-Hsien Liu:
Duration-embedded bi-HMM for expressive voice conversion.
1921-1924
- Toshio Hirai, Hisashi Kawai, Minoru Tsuzaki, Nobuyuki Nishizawa:
Analysis of major factors of naturalness degradation in concatenative synthesis.
1925-1928
- Jilei Tian, Jani Nurminen, Imre Kiss:
Duration modeling and memory optimization in a Mandarin TTS system.
1929-1932
- Min-Siong Liang, Ke-Chun Chuang, Rhuei-Cheng Yang, Yuang-Chin Chiang, Ren-Yuan Lyu:
A bi-lingual Mandarin-to-Taiwanese text-to-speech system.
1933-1936
- Uwe D. Reichel, Florian Schiel:
Using morphology and phoneme history to improve grapheme-to-phoneme conversion.
1937-1940
- Olga Goubanova, Simon King:
Predicting consonant duration with Bayesian belief networks.
1941-1944
- Per-Anders Jande:
Inducing decision tree pronunciation variation models from annotated speech data.
1945-1948
- Lijuan Wang, Yong Zhao, Min Chu, Frank K. Soong, Zhigang Cao:
Phonetic transcription verification with generalized posterior probability.
1949-1952
- Hua Cheng, Fuliang Weng, Niti Hantaweepant, Lawrence Cavedon, Stanley Peters:
Training a maximum entropy model for surface realization.
1953-1956
- Tomoki Toda, Kiyohiro Shikano:
NAM-to-speech conversion with Gaussian mixture models.
1957-1960
- Michelina Savino, Mario Refice, Massimo Mitaritonna:
Which Italian do current systems speak? a first step towards pronunciation modelling of Italian varieties.
1961-1964
- Dominika Oliver, Robert A. J. Clark:
Modelling pitch accent types for Polish speech synthesis.
1965-1968
- Chatchawarn Hansakunbuntheung, Ausdang Thangthai, Chai Wutiwiwatchai, Rungkarn Siricharoenchai:
Learning methods and features for corpus-based phrase break prediction on Thai.
1969-1972
- Paul Taylor:
Hidden Markov models for grapheme to phoneme conversion.
1973-1976
Speaker Characterization and Recognition I-IV
- Longbiao Wang, Norihide Kitaoka, Seiichi Nakagawa:
Robust distant speaker recognition based on position dependent cepstral mean normalization.
1977-1980
- David A. van Leeuwen:
Speaker adaptation in the NIST speaker recognition evaluation 2004.
1981-1984
- Jacob Goldberger, Hagai Aronowitz:
A distance measure between GMMs based on the unscented transform and its application to speaker recognition.
1985-1988
- Sorin Dusan:
Estimation of speaker's height and vocal tract length from speech signal.
1989-1992
- Doroteo Torre Toledano, Carlos Fombella, Joaquin Gonzalez-Rodriguez, Luis A. Hernández Gómez:
On the relationship between phonetic modeling precision and phonetic speaker recognition accuracy.
1993-1996
- J. Fortuna, P. Sivakumaran, Aladdin M. Ariyaeeinia, Amit S. Malegaonkar:
Open-set speaker identification using adapted Gaussian mixture models.
1997-2000
- James McAuley, Ji Ming, Pat Corr:
Speaker verification in noisy conditions using correlated subband features.
2001-2004
- Mikaël Collet, Yassine Mami, Delphine Charlet, Frédéric Bimbot:
Probabilistic anchor models approach for speaker verification.
2005-2008
- Mijail Arcienega, Anil Alexander, Philipp Zimmermann, Andrzej Drygajlo:
A Bayesian network approach combining pitch and spectral envelope features to reduce channel mismatch in speaker verification and forensic speaker recognition.
2009-2012
- Kwok-Kwong Yiu, Man-Wai Mak, Sun-Yuan Kung:
Channel robust speaker verification via Bayesian blind stochastic feature transformation.
2013-2016
- Tomoko Matsui, Kunio Tanabe:
dPLRM-based speaker identification with log power spectrum.
2017-2020
- Xianxian Zhang, John H. L. Hansen, Pongtep Angkititrakul, Kazuya Takeda:
Speaker verification using Gaussian mixture models within changing real car environments.
2021-2024
- Kanae Amino, Tsutomu Sugawara, Takayuki Arai:
The correspondences between the perception of the speaker individualities contained in speech sounds and their acoustic properties.
2025-2028
- Samuel Kim, Sung-Wan Yoon, Thomas Eriksson, Hong-Goo Kang, Dae Hee Youn:
A noise-robust pitch synchronous feature extraction algorithm for speaker recognition systems.
2029-2032
- Jing Deng, Thomas Fang Zheng, Zhanjiang Song, Jian Liu:
Modeling high-level information by using Gaussian mixture correlation for GMM-UBM based speaker recognition.
2033-2036
- Xianxian Zhang, John H. L. Hansen:
In-set/out-of-set speaker identification based on discriminative speech frame selection.
2037-2040
- Zhenchun Lei, Yingchun Yang, Zhaohui Wu:
Mixture of support vector machines for text-independent speaker recognition.
2041-2044
- Shilei Zhang, Junmei Bai, Shuwu Zhang, Bo Xu:
Optimal model order selection based on regression tree in speaker identification.
2045-2048
- Marcos Faúndez-Zanuy, Jordi Solé-Casals:
Speaker verification improvement using blind inversion of distortions.
2049-2052
Single-channel Speech Enhancement
- Israel Cohen:
Supergaussian GARCH models for speech signals.
2053-2056
- Athanasios Mouchtaris, Jan Van der Spiegel, Paul Mueller, Panagiotis Tsakalides:
A spectral conversion approach to feature denoising and speech enhancement.
2057-2060
- Alfonso Ortega, Eduardo Lleida, Enrique Masgrau, Luis Buera, Antonio Miguel:
Acoustic feedback cancellation in speech reinforcement systems for vehicles.
2061-2064
- Julien Bourgeois, Jürgen Freudenberger, Guillaume Lathoud:
Implicit control of noise canceller for speech enhancement.
2065-2068
- T. M. Sunil Kumar, T. V. Sreenivas:
Speech enhancement using Markov model of speech segments.
2069-2072
- Vladimir Braquet, Takao Kobayashi:
A wavelet based noise reduction algorithm for speech signal corrupted by coloured noise.
2073-2076
- Esfandiar Zavarehei, Saeed Vaseghi:
Speech enhancement in temporal DFT trajectories using Kalman filters.
2077-2080
- Qin Yan, Saeed Vaseghi, Esfandiar Zavarehei, Ben P. Milner:
Formant-tracking linear prediction models for speech processing in noisy environments.
2081-2084
- Hui Jiang, Qian-Jie Fu:
Statistical noise compensation for cochlear implant processing.
2085-2088
- Tuan Van Pham, Gernot Kubin:
WPD-based noise suppression using nonlinearly weighted threshold quantile estimation and optimal wavelet shrinking.
2089-2092
- Weifeng Li, Katunobu Itou, Kazuya Takeda, Fumitada Itakura:
Subjective and objective quality assessment of regression-enhanced speech in real car environments.
2093-2096
- Masashi Unoki, Masaaki Kubo, Atsushi Haniu, Masato Akagi:
A model for selective segregation of a target instrument sound from the mixed sound of various instruments.
2097-2100
- Richard C. Hendriks, Richard Heusdens, Jesper Jensen:
Improved decision directed approach for speech enhancement using an adaptive time segmentation.
2101-2104
- Heinrich W. Löllmann, Peter Vary:
Generalized filter-bank equalizer for noise reduction with reduced signal delay.
2105-2108
- Nicoleta Roman, DeLiang Wang:
A pitch-based model for separation of reverberant speech.
2109-2112
- David Y. Zhao, W. Bastiaan Kleijn:
On noise gain estimation for HMM-based speech enhancement.
2113-2116
- Om Deshmukh, Carol Y. Espy-Wilson:
Speech enhancement using auditory phase opponency model.
2117-2120
Acoustic Modelling for LVCSR
- Brian Mak, Jeff Siu-Kei Au-Yeung, Yiu-Pong Lai, Man-Hung Siu:
High-density discrete HMM with the use of scalar quantization indexing.
2121-2124
- Jing Zheng, Andreas Stolcke:
Improved discriminative training using phone lattices.
2125-2128
- Qifeng Zhu, Barry Y. Chen, Frantisek Grézl, Nelson Morgan:
Improved MLP structures for data-driven feature extraction for ASR.
2129-2132
- Wolfgang Macherey, Lars Haferkamp, Ralf Schlüter, Hermann Ney:
Investigations on error minimizing training criteria for discriminative training in automatic speech recognition.
2133-2136
- K. C. Sim, M. J. F. Gales:
Temporally varying model parameters for large vocabulary continuous speech recognition.
2137-2140
- Qifeng Zhu, Andreas Stolcke, Barry Y. Chen, Nelson Morgan:
Using MLP features in SRI's conversational speech recognition system.
2141-2144
Speech Production I
- Matti Airas, Hannu Pulakka, Tomas Bäckström, Paavo Alku:
A toolkit for voice inverse filtering and parametrisation.
2145-2148
- Denisse Sciamarella, Christophe d'Alessandro:
Stylization of glottal-flow spectra produced by a mechanical vocal-fold model.
2149-2152
- Hideyuki Nomura, Tetsuo Funada:
Numerical glottal sound source model as coupled problem between vocal cord vibration and glottal flow.
2153-2156
- Marianne Pouplier, Maureen Stone:
A tagged-cine MRI investigation of German vowels.
2157-2160
- Antoine Serrurier, Pierre Badin:
A three-dimensional linear articulatory model of velum based on MRI data.
2161-2164
- Anne Cros, Didier Demolin, Ana Georgina Flesia, Antonio Galves:
On the relationship between intra-oral pressure and speech sonority.
2165-2168
Speaker Characterization and Recognition I-IV
- Mohamed Kamal Omar, Jiri Navratil, Ganesh N. Ramaswamy:
Maximum conditional mutual information modeling for speaker verification.
2169-2172
- Luciana Ferrer, M. Kemal Sönmez, Sachin S. Kajarekar:
Class-dependent score combination for speaker recognition.
2173-2176
- Hagai Aronowitz, Dror Irony, David Burshtein:
Modeling intra-speaker variability for speaker recognition.
2177-2180
- Girija Chetty, Michael Wagner:
Liveness detection using cross-modal correlations in face-voice person authentication.
2181-2184
- Taichi Asami, Koji Iwano, Sadaoki Furui:
Stream-weight optimization by LDA and AdaBoost for multi-stream speaker verification.
2185-2188
- Yosef A. Solewicz, Moshe Koppel:
Considering speech quality in speaker verification fusion.
2189-2192
Gender and Age Issues in Speech and Language Research I, II
- Matteo Gerosa, Diego Giuliani, Fabio Brugnara:
Speaker adaptive acoustic modeling with mixture of adult and children's speech.
2193-2196
- Shona D'Arcy, Martin J. Russell:
A comparison of human and computer recognition accuracy for children's speech.
2197-2200
- Piero Cosi, Bryan L. Pellom:
Italian children's speech recognition for advanced interactive literacy tutors.
2201-2204
- Martine Adda-Decker, Lori Lamel:
Do speech recognizers prefer female speakers?
2205-2208
- Serdar Yildirim, Chul Min Lee, Sungbok Lee, Alexandros Potamianos, Shrikanth Narayanan:
Detecting politeness and frustration state of a child in a conversational computer game.
2209-2212
- Diana Binnenpoorte, Christophe Van Bael, Els den Os, Lou Boves:
Gender in everyday speech and language: a corpus-based study.
2213-2216
Spoken Language Acquisition, Development and Learning I, II
- Shigeaki Amano:
Developmental change of phoneme duration in a Japanese infant and mother.
2217-2220
- Haiping Jia, Hiroki Mori, Hideki Kasuya:
Mora timing organization in producing contrastive geminate/single consonants and long/short vowels by native and non-native speakers of Japanese: effects of speaking rate.
2221-2224
- Hongyan Wang, Vincent J. van Heuven:
Mutual intelligibility of American, Chinese and Dutch-accented speakers of English.
2225-2228
- Peter Juel Henrichsen:
Deriving a bi-lingual dictionary from raw transcription data.
2229-2232
- Kei Ohta, Seiichi Nakagawa:
A statistical method of evaluating pronunciation proficiency for Japanese words.
2233-2236
Language and Dialect Identification I, II
- Pavel Matejka, Petr Schwarz, Jan Cernocký, Pavel Chytil:
Phonotactic language identification using high quality phoneme recognition.
2237-2240
- Rongqing Huang, John H. L. Hansen:
Advances in word based dialect/accent classification.
2241-2244
- Rym Hamdi, Salem Ghazali, Melissa Barkat-Defradas:
Syllable structure in spoken Arabic: a comparative investigation.
2245-2248
- J. C. Marcadet, Volker Fischer, Claire Waast-Richard:
A transformation-based learning approach to language identification for mixed-lingual text-to-speech synthesis.
2249-2252
- Shuichi Itahashi, Shiwei Zhu, Mikio Yamamoto:
Constructing family trees of multilingual speech using Gaussian mixture models.
2253-2256
- Jean-Luc Rouas:
Modeling long and short-term prosody for language identification.
2257-2260
Spoken Language Translation I, II
- Matthias Paulik, Christian Fügen, Sebastian Stüker, Tanja Schultz, Thomas Schaaf, Alex Waibel:
Document driven machine translation enhanced ASR.
2261-2264
- Shahram Khadivi, András Zolnay, Hermann Ney:
Automatic text dictation in computer-assisted translation.
2265-2268
- Luis Rodríguez, Jorge Civera, Enrique Vidal, Francisco Casacuberta, César Martínez:
On the use of speech recognition in computer assisted translation.
2269-2272
- Andreas Kathol, Kristin Precoda, Dimitra Vergyri, Wen Wang, Susanne Riehemann:
Speech translation for low-resource languages: the case of Pashto.
2273-2276
- David Picó, Jorge González, Francisco Casacuberta, Diamantino Caseiro, Isabel Trancoso:
Finite-state transducer inference for a speech-input Portuguese-to-English machine translation system.
2277-2280
- Kenko Ohta, Keiji Yasuda, Gen-ichiro Kikui, Masuzo Yanagida:
Quantitative evaluation of effects of speech recognition errors on speech translation quality.
2281-2284
Multi-channel Speech Enhancement
- Thomas Lotter, Bastian Sauert, Peter Vary:
A stereo input-output superdirective beamformer for dual channel noise reduction.
2285-2288
- Ulrich Klee, Tobias Gehrig, John W. McDonough:
Kalman filters for time delay of arrival-based source localization.
2289-2292
- Osamu Ichikawa, Masafumi Nishimura:
Simultaneous adaptation of echo cancellation and spectral subtraction for in-car speech recognition.
2293-2296
- Rong Hu, Yunxin Zhao:
Variable step size adaptive decorrelation filtering for competing speech separation.
2297-2300
- Daisuke Saitoh, Atsunobu Kaminuma, Hiroshi Saruwatari, Tsuyoki Nishikawa, Akinobu Lee:
Speech extraction in a car interior using frequency-domain ICA with rapid filter adaptations.
2301-2304
- Rongqiang Hu, Sunil D. Kamath, David V. Anderson:
Speech enhancement using non-acoustic sensors.
2305-2308
- Marc Delcroix, Takafumi Hikichi, Masato Miyoshi:
Improved blind dereverberation performance by using spatial information.
2309-2312
- Junfeng Li, Masato Akagi:
A hybrid microphone array post-filter in a diffuse noise field.
2313-2316
- Venkatesh Krishnan, Phil Spencer Whitehead, David V. Anderson, Mark A. Clements:
A framework for estimation of clean speech by fusion of outputs from multiple speech enhancement systems.
2317-2320
- Yuki Denda, Takanobu Nishiura, Yoichi Yamashita:
A study of weighted CSP analysis with average speech spectrum for noise robust talker localization.
2321-2324
- Young-Ik Kim, Sung Jun An, Rhee Man Kil, Hyung-Min Park:
Sound segregation based on binaural zero-crossings.
2325-2328
- Jürgen Freudenberger, Klaus Linhard:
A two-microphone diversity system and its application for hands-free car kits.
2329-2332
- Takahiro Murakami, Kiyoshi Kurihara, Yoshihisa Ishida:
Directionally constrained minimization of power algorithm for speech signals.
2333-2336
- Alessio Brutti, Maurizio Omologo, Piergiorgio Svaizer:
Oriented global coherence field for the estimation of the head orientation in smart rooms equipped with distributed microphone arrays.
2337-2340
- Nilesh Madhu, Rainer Martin:
Robust speaker localization through adaptive weighted pair TDOA (AWEPAT) estimation.
2341-2344
- Guillaume Lathoud, Mathew Magimai-Doss, Bertrand Mesot:
A spectrogram model for enhanced source localization and noise-robust ASR.
2345-2348
- Sriram Srinivasan, Mattias Nilsson, W. Bastiaan Kleijn:
Denoising through source separation and minimum tracking.
2349-2352
- Louisa Busca Grisoni, John H. L. Hansen:
Collaborative voice activity detection for hearing aids.
2353-2356
- Enrique Robledo-Arnuncio, Biing-Hwang Juang:
Using inter-frequency decorrelation to reduce the permutation inconsistency problem in blind source separation.
2357-2360
- Amarnag Subramanya, Zhengyou Zhang, Zicheng Liu, Jasha Droppo, Alex Acero:
A graphical model for multi-sensory speech processing in air-and-bone conductive microphones.
2361-2364
Prosody in Language Performance I, II
- Heejin Kim, Jennifer Cole:
The stress foot as a unit of planned timing: evidence from shortening in the prosodic phrase.
2365-2368
- Pauline Welby, Hélène Loevenbruck:
Segmental "anchorage" and the French late rise.
2369-2372
- Ivan Chow:
Prosodic cues for syntactically-motivated junctures.
2373-2376
- Isabel Falé, Isabel Hub Faria:
A glimpse of the time-course of intonation processing in European Portuguese.
2377-2380
- Petra Wagner:
Great expectations - introspective vs. perceptual prominence ratings and their acoustic correlates.
2381-2384
- Christian Jensen, John Tøndering:
Choosing a scale for measuring perceived prominence.
2385-2388
- Jens Edlund, David House, Gabriel Skantze:
The effects of prosodic features on the interpretation of clarification ellipses.
2389-2392
- Matthias Jilka:
Exploration of different types of intonational deviations in foreign-accented and synthesized speech.
2393-2396
- Jörg Bröggelwirth:
A rhythmic-prosodic model of poetic speech.
2397-2400
- Sonja Biersack, Vera Kempe, Lorna Knapton:
Fine-tuning speech registers: a comparison of the prosodic features of child-directed and foreigner-directed speech.
2401-2404
- Timothy Arbisi-Kelm:
An analysis of the intonational structure of stuttered speech.
2405-2408
- Britta Lintfert, Wolfgang Wokurek:
Voice quality dimensions of pitch accents.
2409-2412
- Marion Dohen, Hélène Loevenbruck:
Audiovisual production and perception of contrastive focus in French: a multispeaker study.
2413-2416
- Pashiera Barkhuysen, Emiel Krahmer, Marc Swerts:
Predicting end of utterance in multimodal and unimodal conditions.
2417-2420
- Saori Tanaka, Masafumi Nishida, Yasuo Horiuchi, Akira Ichikawa:
Production of prominence in Japanese sign language.
2421-2424
Speaker Characterization and Recognition I-IV
- Andreas Stolcke, Luciana Ferrer, Sachin S. Kajarekar, Elizabeth Shriberg, Anand Venkataraman:
MLLR transforms as features in speaker recognition.
2425-2428
- Brendan Baker, Robbie Vogt, Sridha Sridharan:
Gaussian mixture modelling of broad phonetic and syllabic events for text-independent speaker verification.
2429-2432
- Hagai Aronowitz, David Burshtein:
Efficient speaker identification and retrieval.
2433-2436
- R. Sinha, S. E. Tranter, M. J. F. Gales, Philip C. Woodland:
The Cambridge University March 2005 speaker diarisation system.
2437-2440
- Xuan Zhu, Claude Barras, Sylvain Meignier, Jean-Luc Gauvain:
Combining speaker identification and BIC for speaker diarization.
2441-2444
- Dan Istrate, Nicolas Scheffer, Corinne Fredouille, Jean-François Bonastre:
Broadcast news speaker tracking for ESTER 2005 campaign.
2445-2448
Phonetics and Phonology I, II
- Sorin Dusan:
On the nature of acoustic information in identification of coarticulated vowels.
2449-2452
- Cédric Gendrot, Martine Adda-Decker:
Impact of duration on F1/F2 formant values of oral vowels: an automatic analysis of large broadcast news corpora in French and German.
2453-2456
- Hugo Quené:
Modeling of between-speaker and within-speaker variation in spontaneous speech tempo.
2457-2460
- Masahiko Komatsu, Makiko Aoyagi:
Vowel devoicing vs. mora-timed rhythm in spontaneous Japanese - inspection of phonetic labels of OGI_TS.
2461-2464
- Jalal-Eddin Al-Tamimi, Emmanuel Ferragne:
Does vowel space size depend on language vowel inventories? evidence from two Arabic dialects and French.
2465-2468
- Chilin Shih:
Understanding phonology by phonetic implementation.
2469-2472
Spoken / Multi-modal Dialogue Systems I, II
Human Factors, User Experience and Natural Language Application Design
- Esther Levin, Alex Levin:
Spoken dialog system for real-time data capture.
2497-2500
- Michael Pucher, Peter Fröhlich:
A user study on the influence of mobile device class, synthesis method, data rate and lexicon on speech synthesis quality.
2501-2504
- Fang Chen, Yael Katzenellenbogen:
User's experience of a commercial speech dialogue system.
2505-2508
- Esther Levin, Amir M. Mané:
Voice user interface design for automated directory assistance.
2509-2512
- Maria Gabriela Alvarez-Ryan, Narendra K. Gupta, Barbara Hollister, Tirso Alonso:
Optimizing user experience through design of the spoken language understanding (SLU) module.
2513-2516
- Jeremy H. Wright, David A. Kapilow, Alicia Abella:
Interactive visualization of human-machine dialogs.
2517-2520
TTS Inventory
- Matthew P. Aylett:
Synthesising hyperarticulation in unit selection TTS.
2521-2524
- Daniel Tihelka:
Symbolic prosody driven unit selection for highly natural synthetic speech.
2525-2528
- Jindrich Matousek, Zdenek Hanzlícek, Daniel Tihelka:
Hybrid syllable/triphone speech synthesis.
2529-2532
- Francisco Campillo Díaz, José Luis Alba, Eduardo Rodríguez Banga:
A neural network approach for the design of the target cost function in unit-selection speech synthesis.
2533-2536
- Christian Weiss:
FSM and k-nearest-neighbor for corpus based video-realistic audio-visual synthesis.
2537-2540
- Gui-Lin Chen, Ke-Song Han, Zhen-Li Yu, Dong-Jian Yue, Yi-Qing Zu:
An embedded and concatenative approach to TTS of multiple languages.
2541-2544
- Tony Ezzat, Ethan Meyers, James R. Glass, Tomaso Poggio:
Morphing spectral envelopes using audio flow.
2545-2548
- Vincent Colotte, Richard Beaufort:
Linguistic features weighting for a text-to-speech system without prosody model.
2549-2552
- Ingunn Amdal, Torbjørn Svendsen:
Unit selection synthesis database development using utterance verification.
2553-2556
- Yong Zhao, Lijuan Wang, Min Chu, Frank K. Soong, Zhigang Cao:
Refining phoneme segmentations using speaker-adaptive context dependent boundary models.
2557-2560
- Yining Chen, Yong Zhao, Min Chu:
Customizing base unit set with speech database in TTS systems.
2561-2564
- Soufiane Rouibia, Olivier Rosec:
Unit selection for speech synthesis based on a new acoustic target cost.
2565-2568
- Dan Chazan, Ron Hoory, Zvi Kons, Ariel Sagi, Slava Shechtman, Alexander Sorin:
Small footprint concatenative text-to-speech synthesis system using complex spectral envelope modeling.
2569-2572
- Francesc Alías, Ignasi Iriondo Sanz, Lluís Formiga, Xavier Gonzalvo, Carlos Monzo, Xavier Sevillano:
High quality Spanish restricted-domain TTS oriented to a weather forecast application.
2573-2576
- Ingmund Bjørkan, Torbjørn Svendsen, Snorre Farner:
Comparing spectral distance measures for join cost optimization in concatenative speech synthesis.
2577-2580
- Maria João Barros, Ranniery Maia, Keiichi Tokuda, Fernando Gil Resende, Diamantino Freitas:
HMM-based European Portuguese TTS system.
2581-2584
- Wael Hamza, John F. Pitrelli:
Combining the flexibility of speech synthesis with the naturalness of pre-recorded audio: a comparison of two approaches to phrase-splicing TTS.
2585-2588
- Guntram Strecha, Oliver Jokisch, Matthias Eichner, Rüdiger Hoffmann:
Codec integrated voice conversion for embedded speech synthesis.
2589-2592
- David Sündermann, Guntram Strecha, Antonio Bonafonte, Harald Höge, Hermann Ney:
Evaluation of VTLN-based voice conversion for embedded speech synthesis.
2593-2596
- Juri Isogai, Junichi Yamagishi, Takao Kobayashi:
Model adaptation and adaptive training using ESAT algorithm for HMM-based speech synthesis.
2597-2600
- Tien Ying Fung, Yuk-Chi Li, Eddie Sio, Icarus Lee, Helen M. Meng, P. C. Ching:
Embedded Cantonese TTS for multi-device access to web content.
2601-2604
- Karl Schnell, Arild Lacroix:
Model based analysis of a diphone database for improved unit concatenation.
2605-2608
Robust Speech Recognition I-IV
- Ning Ma, Phil Green:
Context-dependent word duration modelling for robust speech recognition.
2609-2612
- Julien Epps, Eric H. C. Choi:
An energy search approach to variable frame rate front-end processing for robust ASR.
2613-2616
- Roberto Gemello, Franco Mana, Renato de Mori:
Non-linear estimation of voice activity to improve automatic recognition of noisy speech.
2617-2620
- Yusuke Kida, Tatsuya Kawahara:
Voice activity detection based on optimally weighted combination of multiple features.
2621-2624
- Pei Ding:
Soft decision strategy and adaptive compensation for robust speech recognition against impulsive noise.
2625-2628
- Nicolás Morales, Doroteo Torre Toledano, John H. L. Hansen, José Colás, Javier Garrido:
Statistical class-based MFCC enhancement of filtered and band-limited speech for robust ASR.
2629-2632
- Hemant Misra, Hervé Bourlard:
Spectral entropy feature in full-combination multi-stream for robust ASR.
2633-2636
- Wooil Kim, Richard M. Stern, Hanseok Ko:
Environment-independent mask estimation for missing-feature reconstruction.
2637-2640
- André Coy, Jon Barker:
Soft harmonic masks for recognising speech in the presence of a competing speaker.
2641-2644
- Lech Szymanski, Martin Bouchard:
Comb filter decomposition for robust ASR.
2645-2648
- Panikos Heracleous, Tomomi Kaino, Hiroshi Saruwatari, Kiyohiro Shikano:
Investigating the role of the Lombard reflex in non-audible murmur (NAM) recognition.
2649-2652
- Evan Ruzanski, John H. L. Hansen, Don Finan, James Meyerhoff, William Norris, Terry Wollert:
Improved "TEO" feature-based automatic stress detection using physiological and acoustic speech sensors.
2653-2656
- Takeshi S. Kobayakawa:
Spectral subtraction using elliptic integral for multiplication factor.
2657-2660
- Longbiao Wang, Norihide Kitaoka, Seiichi Nakagawa:
Robust distant speech recognition based on position dependent CMN using a novel multiple microphone processing technique.
2661-2664
- H. Tanaka, Hiroshi Fujimura, Chiyomi Miyajima, Takanori Nishino, Katunobu Itou, Kazuya Takeda:
Data collection and evaluation of speech recognition for motorbike riders.
2665-2668
- Agustín Álvarez Marquina, Pedro Gómez Vilda, Victor Nieto Lluis, Rafael Martínez, Victoria Rodellar:
Application of a first-order differential microphone for efficient voice activity detection in a car platform.
2669-2672
- Panji Setiawan, Suhadi Suhadi, Tim Fingscheidt, Sorel Stan:
Robust speech recognition for mobile devices in car noise.
2673-2676
- Péter Mihajlik, Zoltán Tobler, Zoltán Tüske, Géza Gordos:
Evaluation and optimization of noise robust front-end technologies for the automatic recognition of Hungarian telephone speech.
2677-2680
- Gang Chen, Douglas D. O'Shaughnessy, Hesham Tolba:
A performance investigation of noisy voice recognition over IP telephony networks.
2681-2684
- Akinori Ito, Takashi Kanayama, Motoyuki Suzuki, Shozo Makino:
Internal noise suppression for speech recognition by small robots.
2685-2688
- Florian Kraft, Robert Malkin, Thomas Schaaf, Alex Waibel:
Temporal ICA for classification of acoustic events in a kitchen environment.
2689-2692
- Jan Felix Krebber:
"hello - is anybody at home?" - about the minimum word accuracy of a smart home spoken dialogue system.
2693-2696
- Hans-Günter Hirsch, Harald Finster:
The simulation of realistic acoustic input scenarios for speech recognition systems.
2697-2700
- Michael Walsh, Gregory M. P. O'Hare, Julie Carson-Berndsen:
An agent-based framework for speech investigation.
2701-2704
Speech Coding
- Stephen So, Kuldip K. Paliwal:
Switched split vector quantisation of line spectral frequencies for wideband speech coding.
2705-2708
- Changchun Bao, Jason Lukasiak, Christian Ritz:
A novel voicing cut-off determination for low bit-rate harmonic speech coding.
2709-2712
- Hauke Krüger, Peter Vary:
A partial decorrelation scheme for improved predictive open loop quantization with noise shaping.
2713-2716
- Venkatesh Krishnan, Thomas P. Barnwell III, David V. Anderson:
Using dynamic codebook re-ordering to exploit inter-frame correlation in MELP coders.
2717-2720
- Adriane Swalm Durey, Venkatesh Krishnan, Thomas P. Barnwell III:
Enhanced speech coding based on phonetic class segmentation.
2721-2724
- Ali Erdem Ertan, Thomas P. Barnwell III:
A pitch-synchronous pitch-cycle modification method for designing a hybrid i-MELP/waveform-matching speech coder.
2725-2728
- Joon-Hyuk Chang, Jong Won Shin, Seung Yeol Lee, Nam Soo Kim:
A new structural preprocessor for low-bit rate speech coding.
2729-2732
- Tiago H. Falk, Wai-Yip Chan, Peter Kabal:
An improved GMM-based voice quality predictor.
2733-2736
- Jan S. Erkelens:
High-quality memoryless subband coding of impulse responses at 22 bits per frame.
2737-2740
- Shi-Han Chen, Kuo-Guan Wu, Chih-Chung Kuo:
A study of variable pulse allocation for MPE and CELP coders based on PESQ analysis.
2741-2744
- José L. Pérez-Córdoba, Antonio M. Peinado, Angel M. Gomez, Antonio J. Rubio:
Joint source-channel coding of LSP parameters for bursty channels.
2745-2748
Gender and Age Issues in Speech and Language Research I, II
- Daniel Elenius, Mats Blomberg:
Adaptation and normalization experiments in speech recognition for 4 to 8 year old children.
2749-2752
- Wim Jansen, Hugo Van Hamme:
PROSPECT features and their application to missing data techniques for vocal tract length normalization.
2753-2756
- Andreas Hagen, Bryan L. Pellom:
Data driven subword unit modeling for speech recognition and its application to interactive reading tutors.
2757-2760
- Anton Batliner, Mats Blomberg, Shona D'Arcy, Daniel Elenius, Diego Giuliani, Matteo Gerosa, Christian Hacker, Martin J. Russell, Stefan Steidl, Michael Wong:
The PF_STAR children's speech corpus.
2761-2764
- Linda Bell, Johan Boye, Joakim Gustafson, Mattias Heldner, Anders Lindström, Mats Wirén:
The Swedish NICE corpus - spoken dialogues between children and embodied characters in a computer game scenario.
2765-2768
- Yusuke Miyauchi, Nao Hodoshima, Keiichi Yasu, Nahoko Hayashi, Takayuki Arai, Mitsuko Shindo:
A preprocessing technique for improving speech intelligibility in reverberant environments: the effect of steady-state suppression on elderly people.
2769-2772
Discourse and Dialogue I, II
- Norbert Pfleger, Markus Löckelt:
Synchronizing dialogue contributions of human users and virtual characters in a virtual reality environment.
2773-2776
- Anand Venkataraman, Yang Liu, Elizabeth Shriberg, Andreas Stolcke:
Does active learning help automatic dialog act tagging in meeting data?
2777-2780
- Dan Bohus, Alexander I. Rudnicky:
A principled approach for rejection threshold optimization in spoken dialog systems.
2781-2784
- David Pérez-Piñar López, Carmen García-Mateo:
Application of confidence measures for dialogue systems through the use of parallel speech recognizers.
2785-2788
- Sophie Rosset, Delphine Tribout:
Multi-level information and automatic dialog acts detection in human-human spoken dialogs.
2789-2792
- Rieks op den Akker, Harry Bunt, Simon Keizer, Boris W. van Schooten:
From question answering to spoken dialogue: towards an information search assistant for interactive multimodal information extraction.
2793-2796
Text-to-Speech I, II
- Ulrich Reubold, Alexander Steffen:
Pitch-effects in diphone recording: are logatomes inappropriate?
2797-2800
- Tomoki Toda, Keiichi Tokuda:
Speech parameter generation algorithm considering global variance for HMM-based speech synthesis.
2801-2804
- Makoto Tachibana, Junichi Yamagishi, Takashi Masuko, Takao Kobayashi:
Performance evaluation of style adaptation for hidden semi-Markov model based speech synthesis.
2805-2808
- Gabriel Webster, Tina Burrows, Katherine Knill:
A comparison of methods for speaker-dependent pronunciation tuning for text-to-speech synthesis.
2809-2812
- Ann K. Syrdal, Alistair Conkie:
Perceptually-based data-driven join costs: comparing join types.
2813-2816
- Yannis Pantazis, Yannis Stylianou, Esther Klabbers:
Discontinuity detection in concatenated speech synthesis based on nonlinear speech analysis.
2817-2820
Language and Dialect Identification I, II
- Tingyao Wu, Dirk Van Compernolle, Jacques Duchateau, Qian Yang, Jean-Pierre Martens:
Improving the discrimination between native accents when recorded over different channels.
2821-2824
- Isabel Trancoso, António Joaquim Serralheiro, Céu Viana, Diamantino Caseiro:
Aligning and recognizing spoken books in different varieties of Portuguese.
2825-2828
- Bin Ma, Haizhou Li, Chin-Hui Lee:
An acoustic segment modeling approach to automatic language identification.
2829-2832
- Dong Zhu, Martine Adda-Decker, Fabien Antoine:
Different size multilingual phone inventories and context-dependent acoustic models for language identification.
2833-2836
- Sheng Gao, Bin Ma, Haizhou Li, Chin-Hui Lee:
A text categorization approach to automatic language identification.
2837-2840
- Giampiero Salvi:
Advances in regional accent clustering in Swedish.
2841-2844
Speech Recognition in Ubiquitous Networking and Context-Aware Computing
- David Pearce, Jonathan Engelsma, James C. Ferrans, John Johnson:
An architecture for seamless access to distributed multimodal services.
2845-2848
- Zheng-Hua Tan, Paul Dalsgaard, Børge Lindberg, Haitian Xu:
Robust speech recognition in ubiquitous networking and context-aware computing.
2849-2852
- Valentin Ion, Reinhold Haeb-Umbach:
Unified probabilistic approach to error concealment for distributed speech recognition.
2853-2856
- Alastair Bruce James, Ben Milner:
Combining packet loss compensation methods for robust distributed speech recognition.
2857-2860
- Trond Skogstad, Torbjørn Svendsen:
Distributed ASR using speech coder data for efficient feature vector representation.
2861-2864
- Sadaoki Furui, Tomohisa Ichiba, Takahiro Shinozaki, Edward W. D. Whittaker, Koji Iwano:
Cluster-based modeling for ubiquitous speech recognition.
2865-2868
Phonetics and Phonology I, II
- Danny R. Moates, Zinny S. Bond, Russell Fox, Verna Stockmal:
The feature [sonorant] in lexical access.
2869-2872
- Simone Mikuteit:
Voice and aspiration in German and East Bengali stops: a cross-language study.
2873-2876
- Irene Jacobi, Louis C. W. Pols, Jan Stroop:
Polder Dutch: aspects of the /ei/-lowering in Standard Dutch.
2877-2880
- Eric Castelli, René Carré:
Production and perception of Vietnamese vowels.
2881-2884
- Vu Ngoc Tuan, Christophe d'Alessandro, Alexis Michaud:
Using open quotient for the characterisation of Vietnamese glottalised tones.
2885-2888
- John Hajek, Mary Stevens:
On the acoustic characterization of ejective stops in Waima'a.
2889-2892
- Mary Stevens, John Hajek:
Spirantization of /p t k/ in Sienese Italian and so-called semi-fricatives.
2893-2896
- Barbara Gili Fivela, Claudio Zmarich:
Italian geminates under speech rate and focalization changes: kinematic, acoustic, and perception data.
2897-2900
- Sunhee Kim:
Durational characteristics of Korean Lombard speech.
2901-2904
- Toshiko Isei-Jaakkola, Satoshi Asakawa:
A cross-linguistic study of vowel quantity in different word structures: Japanese, Finnish and Czech.
2905-2908
- Laura Mori, Melissa Barkat-Defradas:
Acoustic properties of foreign accent: VOT variations in Moroccan-accented Italian.
2909-2912
- Andréia S. Rauber, Paola Escudero, Ricardo Augusto Hoffmann Bion, Barbara O. Baptista:
The interrelation between the perception and production of English vowels by native speakers of Brazilian Portuguese.
2913-2916
- Julia Hoelterhoff:
Recognition of German obstruents.
2917-2920
- Radek Skarnitzl, Jan Volín:
Czech voiced labiodental continuant discrimination from basic acoustic data.
2921-2924
- Jean-Baptiste Maj, Anne Bonneau, Dominique Fohr, Yves Laprie:
An elitist approach for extracting automatically well-realized speech sounds with high confidence.
2925-2928
- Na'im R. Tyson:
Applying multiple regression models for predicting word duration in a corpus of spontaneous speech.
2929-2932
- Catarina Oliveira, Lurdes Castro Moutinho, António J. S. Teixeira:
On European Portuguese automatic syllabification.
2933-2936
- Aimilios Chalamandaris, Spyros Raptis, Pirros Tsiakoulis:
Rule-based grapheme-to-phoneme method for the Greek.
2937-2940
- Constandinos Kalimeris, George K. Mikros, Stelios Bakamidis:
Assimilation and deletion phenomena involving word-final /n/ and word-initial /p, t, k/ in Modern Greek: a codification of the observed variation intended for use in TTS synthesis.
2941-2944
- Christian Weiss, Bianca Aschenberner:
A German viseme-set for automatic transcription of input text used for audio-visual speech synthesis.
2945-2948
- Johanna-Pascale Roy:
Visual perception of anticipatory rounding gestures in French.
2949-2952
Acoustic Processing for ASR I-III
- Michael Jonas, James G. Schmolze:
Hierarchical clustering of mixture tying using a partially observable Markov decision process.
2953-2956
- Pierre Ouellet, Gilles Boulianne, Patrick Kenny:
Flavors of Gaussian warping.
2957-2960
- Joseph Keshet, Shai Shalev-Shwartz, Yoram Singer, Dan Chazan:
Phoneme alignment based on discriminative learning.
2961-2964
- Jussi Leppänen, Imre Kiss:
Comparison of low footprint acoustic modeling techniques for embedded ASR systems.
2965-2968
- Atiwong Suchato, Proadpran Punyabukkana:
Factors in classification of stop consonant place of articulation.
2969-2972
- Arthur R. Toth, Alan W. Black:
Cross-speaker articulatory position data for phonetic feature prediction.
2973-2976
- Daniel Povey:
Improvements to fMPE for discriminative training of features.
2977-2980
- Xin Lei, Mei-Yuh Hwang, Mari Ostendorf:
Incorporating tone-related MLP posteriors in the feature representation for Mandarin ASR.
2981-2984
- Yan Han, Johan de Veth, Lou Boves:
Speech trajectory clustering for improved speech recognition.
2985-2988
- Andrey Temko, Dusan Macho, Climent Nadeu:
Selection of features and combination of classifiers using a fuzzy approach for acoustic event classification.
2989-2992
- Jan Stadermann, Wolfram Koska, Gerhard Rigoll:
Multi-task learning strategies for a recurrent neural net in a hybrid tied-posteriors acoustic model.
2993-2996
- Florian Hönig, Georg Stemmer, Christian Hacker, Fabio Brugnara:
Revising Perceptual Linear Prediction (PLP).
2997-3000
- Joel Pinto, R. N. V. Sitaram:
Confidence measures in speech recognition based on probability distribution of likelihoods.
3001-3004
- Frank Diehl, Asunción Moreno, Enric Monte:
Continuous local codebook features for multi- and cross-lingual acoustic phonetic modelling.
3005-3008
- Antonio Miguel, Eduardo Lleida, Richard C. Rose, Luis Buera, Alfonso Ortega:
Augmented state space acoustic decoding for modeling local variability in speech.
3009-3012
- Dimitrios Dimitriadis, Petros Maragos, Alexandros Potamianos:
Auditory Teager energy cepstrum coefficients for robust speech recognition.
3013-3016
- Yasser Hifny, Steve Renals, Neil D. Lawrence:
A hybrid Maxent/HMM based ASR system.
3017-3020
- Hakan Erdogan:
Regularizing linear discriminant analysis for speech recognition.
3021-3024
- Yadong Wang, Steven Greenberg, Jayaganesh Swaminathan, Ramdas Kumaresan, David Poeppel:
Comprehensive modulation representation for automatic speech recognition.
3025-3028
- Qiang Fu, Biing-Hwang Juang:
Segment-based phonetic class detection using minimum verification error (MVE) training.
3029-3032
- Yi Liu, Pascale Fung:
Acoustic and phonetic confusions in accented speech recognition.
3033-3036
- Mario E. Munich, Qiguang Lin:
Auditory image model features for automatic speech recognition.
3037-3040
- Panikos Heracleous, Tomomi Kaino, Hiroshi Saruwatari, Kiyohiro Shikano:
Applications of NAM microphones in speech recognition for privacy in human-machine communication.
3041-3044
- Joe Frankel, Simon King:
A hybrid ANN/DBN approach to articulatory feature recognition.
3045-3048
Speaker Characterization and Recognition I-IV
- Daniel Moraru, Mathieu Ben, Guillaume Gravier:
Experiments on speaker tracking and segmentation in radio broadcast news.
3049-3052
- Emanuele Dalmasso, Pietro Laface, Daniele Colibro, Claudio Vair:
Unsupervised segmentation and verification of multi-speaker conversational speech.
3053-3056
- Sacha Krstulovic, Frédéric Bimbot, Delphine Charlet, Olivier Boëffard:
Focal speakers: a speaker selection method able to deal with heterogeneous similarity criteria.
3057-3060
- Mathieu Ben, Guillaume Gravier, Frédéric Bimbot:
A model space framework for efficient speaker detection.
3061-3064
- Nicolas Scheffer, Jean-François Bonastre:
Speaker detection using acoustic event sequences.
3065-3068
- Wei-Ho Tsai, Hsin-Min Wang:
Speaker clustering of unknown utterances based on maximum purity estimation.
3069-3072
- Petra Zochová, Vlasta Radová:
Modified DISTBIC algorithm for speaker change detection.
3073-3076
- Gilles Gonon, Rémi Gribonval, Frédéric Bimbot:
Decision trees with improved efficiency for fast speaker verification.
3077-3080
- Nicolas Eveno, Laurent Besacier:
A speaker independent "liveness" test for audio-visual biometrics.
3081-3084
- Shingo Kuroiwa, Yoshiyuki Umeda, Satoru Tsuge, Fuji Ren:
Distributed speaker recognition using speaker-dependent VQ codebook and earth mover's distance.
3085-3088
- Ka-Yee Leung, Man-Wai Mak, Man-Hung Siu, Sun-Yuan Kung:
Speaker verification via articulatory feature-based conditional pronunciation modeling with vowel and consonant mixture models.
3089-3092
- Jixu Chen, Beiqian Dai, Jun Sun:
Prosodic features based on wavelet analysis for speaker verification.
3093-3096
- Mohamed Mihoubi, Douglas D. O'Shaughnessy, Pierre Dumouchel:
Relevant information extraction for discriminative training applied to speaker identification.
3097-3100
- Jérôme Louradour, Khalid Daoudi:
Conceiving a new sequence kernel and applying it to SVM speaker verification.
3101-3104
- Jing Deng, Thomas Fang Zheng, Jian Liu, Wenhu Wu:
The predictive differential amplitude spectrum for robust speaker recognition in stationary noises.
3105-3108
- Michael Mason, Robbie Vogt, Brendan Baker, Sridha Sridharan:
Data-driven clustering for blind feature mapping in speaker verification.
3109-3112
- Xi Zhou, Zhiqiang Yao, Beiqian Dai:
Improved covariance modeling for GMM in speaker identification.
3113-3116
- Robbie Vogt, Brendan Baker, Sridha Sridharan:
Modelling session variability in text-independent speaker verification.
3117-3120
- Mihalis Siafarikas, Todor Ganchev, Nikolaos D. Fakotakis, George K. Kokkinakis:
Overlapping wavelet packet features for speaker verification.
3121-3124
- An-rong Yin, Xiang Xie, Jingming Kuang:
Using Hadamard ECOC in multi-class problems based on SVM.
3125-3128
Robust Speech Recognition I-IV
Speech Coding and Quality Assessment
- Akira Takahashi, Atsuko Kurashima, Chiharu Morioka, Hideaki Yoshino:
Objective quality assessment of wideband speech by an extension of ITU-T Recommendation P.862.
3153-3156
- Marc Werner, Peter Vary:
Quality control for UMTS-AMR speech channels.
3157-3160
- Wei Chen, Peter Kabal, Turaj Zakizadeh Shabestary:
Perceptual postfilter estimation for low bit rate speech coders using Gaussian mixture models.
3161-3164
- Kengo Fujita, Tsuneo Kato, Hideaki Yamada, Hisashi Kawai:
SNR-dependent background noise compensation of PESQ values for cellular phone speech.
3165
- Gil Ho Lee, Jae Sam Yoon, Hong Kook Kim:
A MFCC-based CELP speech coder for server-based speech recognition in network environments.
3169-3172
- Volodya Grancharov, Jonas Samuelsson, W. Bastiaan Kleijn:
Distortion measures for vector quantization of noisy spectrum.
3173-3176
Spoken Language Translation I, II
- Evgeny Matusov, Stephan Kanthak, Hermann Ney:
On the integration of speech recognition and statistical machine translation.
3177-3180
- V. H. Quan, Marcello Federico, Mauro Cettolo:
Integrated n-best re-ranking for spoken language translation.
3181-3184
- Josep Maria Crego, José B. Mariño, Adrià de Gispert:
An n-gram-based statistical machine translation decoder.
3185-3188
- Liang Gu, Yuqing Gao:
Use of maximum entropy in natural word generation for statistical concept-based speech-to-speech translation.
3189-3192
- Adrià de Gispert, José B. Mariño, Josep Maria Crego:
Improving statistical machine translation by classifying and generalizing inflected verb forms.
3193-3196
- Abdulvohid Bozarov, Yoshinori Sagisaka, Ruiqiang Zhang, Gen-ichiro Kikui:
Improved speech recognition word lattice translation by confidence measure.
3197-3200
Speech Inversion
Prosody Modelling and Speech Technology I, II
- Daniel Hirst, Cyril Auran:
Analysis by synthesis of speech prosody: the ProZed environment.
3225-3228
- Stephen Cox:
A discriminative approach to phrase break modelling.
3229-3232
- Ian Read, Stephen Cox:
Stochastic and syntactic techniques for predicting phrase breaks.
3233-3236
- Gerasimos Xydas, Panagiotis Zervas, Georgios Kouroupetroglou, Nikolaos D. Fakotakis, George K. Kokkinakis:
Tree-based prediction of prosodic phrase breaks on top of shallow textual features.
3237-3240
- Honghui Dong, Jianhua Tao, Bo Xu:
Chinese prosodic phrasing with a constraint-based approach.
3241-3244
- Minghui Dong, Kim-Teng Lua, Haizhou Li:
A probabilistic approach to prosodic word prediction for Mandarin Chinese TTS.
3245-3248
- João Paulo Teixeira, Diamantino Freitas, Hiroya Fujisaki:
Evaluation of a system for F0 contour prediction for European Portuguese.
3249-3252
- Ke Li, Yoshinori Sagisaka:
Analysis on command sequences of a F0 generation model for Mandarin speech and its application to their automatic extraction.
3253-3256
- Keikichi Hirose, Yusuke Furuyama, Nobuaki Minematsu:
Corpus-based extraction of F0 contour generation process model parameters.
3257-3260
- David Escudero Mancebo, Valentín Cardeñoso-Payo:
Optimized selection of intonation dictionaries in corpus based intonation modelling.
3261-3264
- Qinghua Sun, Keikichi Hirose, Wentao Gu, Nobuaki Minematsu:
Generation of fundamental frequency contours for Mandarin speech synthesis based on tone nucleus model.
3265-3268
- Chen-Yu Chiang, Yih-Ru Wang, Sin-Horng Chen:
On the inter-syllable coarticulation effect of pitch modeling for Mandarin speech.
3269-3272
- Matej Rojc, Pablo Daniel Agüero, Antonio Bonafonte, Zdravko Kacic:
Training the tilt intonation model using the JEMA methodology.
3273-3276
- Dagen Wang, Shrikanth Narayanan:
Piecewise linear stylization of pitch via wavelet analysis.
3277-3280
- Harald Romsdorfer, Beat Pfister:
Phonetic labeling and segmentation of mixed-lingual prosody databases.
3281-3284
- Edmilson Morais, Fábio Violaro:
Exploratory analysis of linguistic data based on genetic algorithm for robust modeling of the segmental duration of speech.
3285-3288
- Dafydd Gibbon, Flaviane Romani Fernandes:
Annotation-mining for rhythm model comparison in Brazilian Portuguese.
3289-3292
- Tohru Nagano, Shinsuke Mori, Masafumi Nishimura:
A stochastic approach to phoneme and accent estimation.
3293-3296
- Jason M. Brenier, Daniel M. Cer, Daniel Jurafsky:
The detection of emphatic words using acoustic and lexical features.
3297-3300
- Dinoj Surendran, Gina-Anne Levow, Yi Xu:
Tone recognition in Mandarin using focus.
3301-3304
- Mikolaj Wypych:
An automatic intonation recognizer for the Polish language based on machine learning and expert knowledge.
3305-3308
- Atsuhiro Sakurai:
Generalized envelope matching technique for time-scale modification of speech (GEM-TSM).
3309-3312
Topics in Speech Recognition
- Yang Liu, Elizabeth Shriberg, Andreas Stolcke, Mary P. Harper:
Comparing HMM, maximum entropy, and conditional random fields for disfluency detection.
3313-3316
- Bhiksha Raj, Rita Singh, Paris Smaragdis:
Recognizing speech from simultaneous speakers.
3317-3320
- Vincent Wan, James Carmichael:
Polynomial dynamic time warping kernel support vector machines for dysarthric speech recognition with sparse training data.
3321-3324
- R. Lejeune, J. Baude, C. Tchong, Hubert Crepy, Claire Waast-Richard:
Flavoured acoustic model and combined spelling to sound for asymmetrical bilingual environment.
3325-3328
- Chris D. Bartels, Kevin Duh, Jeff Bilmes, Katrin Kirchhoff, Simon King:
Genetic triangulation of graphical models for speech and language processing.
3329-3332
- Guillermo Aradilla, Jithendra Vepa, Hervé Bourlard:
Improving speech recognition using a data-driven approach.
3333-3336
- Shigeki Matsuda, Wolfgang Herbordt, Satoshi Nakamura:
Outlier detection for acoustic model training using robust statistics.
3337-3340
- Jonathan Le Roux, Erik McDermott:
Optimization methods for discriminative training.
3341-3344
- Patrick Cardinal, Gilles Boulianne, Michel Comeau:
Segmentation of recordings based on partial transcriptions.
3345-3348
- Hussien Seid, Björn Gambäck:
A speaker independent continuous speech recognizer for Amharic.
3349-3352
- Tetsuji Ogawa, Tetsunori Kobayashi:
Optimizing the structure of partly-hidden Markov models using weighted likelihood-ratio maximization criterion.
3353-3356
- C. Santhosh Kumar, V. P. Mohandas, Haizhou Li:
Multilingual speech recognition: a unified approach.
3357-3360
- Tomás Bartos, Ludek Müller:
Detection of recognition errors based on classifiers trained on artificially created data.
3361-3364
- Jinyu Li, Chin-Hui Lee:
On designing and evaluating speech event detectors.
3365-3368
- Joseph Razik, Odile Mella, Dominique Fohr, Jean Paul Haton:
Local word confidence measure using word graph and n-best list.
3369-3372
- Xiaolin Ren, Xin He, Yaxin Zhang:
Mandarin/English mixed-lingual name recognition for mobile phone.
3373-3376
- Javier Ferreiros, Rubén San Segundo, Fernando F. Fernández-Martínez, Luis Fernando D'Haro, Valentín Sama, Roberto Barra-Chicote, Pedro Mellén:
New word-level and sentence-level confidence scoring using graph theory calculus and its evaluation on speech understanding.
3377-3380
- Masanobu Nakamura, Koji Iwano, Sadaoki Furui:
Analysis of spectral space reduction in spontaneous speech and its effects on speech recognition performances.
3381-3384
- Simon King, Chris D. Bartels, Jeff Bilmes:
SVitchboard 1: small vocabulary tasks from Switchboard.
3385-3388
Discourse and Dialogue I, II
- Wieneke Wesseling, R. J. J. H. van Son:
Timing of experimentally elicited minimal responses as quantitative evidence for the use of intonation in projecting TRPs.
3389-3392
- Shinya Yamada, Toshihiko Itoh, Kenji Araki:
Linguistic and acoustic features depending on different situations - the experiments considering speech recognition rate.
3393-3396
- Dirk Bühler, Stefan W. Hamerich:
Towards VoiceXML compilation for portable embedded applications in ubiquitous environments.
3397-3400
- Eva Strangert:
Prosody in public speech: analyses of a news announcement and a political interview.
3401-3404
- Amit Anil Nanavati, Nitendra Rajput:
Characterising dialogue call-flows for pervasive environments.
3405-3408
- Tanveer A. Faruquie, Pankaj Kankar, Nitendra Rajput, Abhishek Verma:
An architecture for pluggable disambiguation mechanism for RDC based voice applications.
3409-3412
- Nitendra Rajput, Amit Anil Nanavati, Abhishek Kumar, Neeraj Chaudhary:
Adapting dialog call-flows for pervasive devices.
3413-3416
- Ulf Krum, Hartwig Holzapfel, Alex Waibel:
Clarification questions to improve dialogue flow and speech recognition in spoken dialogue systems.
3417-3420
- Fernando F. Fernández-Martínez, Javier Ferreiros, Valentín Sama, Juan Manuel Montero, Rubén San Segundo, Javier Macías Guarasa, Rafael García:
Speech interface for controlling an hi-fi audio system based on a Bayesian belief networks approach for dialog modeling.
3421-3424
Spoken Language Understanding I, II
- Matthias Thomae, Tibor Fábián, Robert Lieb, Günther Ruske:
Hierarchical language models for one-stage speech interpretation.
3425-3428
- Nick J.-C. Wang:
Spoken language understanding using layered n-gram modeling.
3429-3432
- Mihai Surdeanu, Jordi Turmo, Eli Comelles:
Named entity recognition from spontaneous open-domain speech.
3433-3436
- Imed Zitouni, Hui Jiang, Qiru Zhou:
Discriminative training and support vector machine for natural language call routing.
3437-3440
- Jihyun Eun, Minwoo Jeong, Gary Geunbae Lee:
A multiple classifier-based concept-spotting approach for robust spoken language understanding.
3441-3444
- Robert Lieb, Matthias Thomae, Günther Ruske, Daniel Bobbert, Frank Althoff:
A flexible and integrated interface between speech recognition, speech interpretation and dialog management.
3445-3448
- Tomohiro Ohno, Shigeki Matsubara, Hideki Kashioka, Naoto Kato, Yasuyoshi Inagaki:
Incremental dependency parsing of Japanese spoken monologue based on clause boundaries.
3449-3452
- Atsushi Sako, Tetsuya Takiguchi, Yasuo Ariki:
Situation based speech recognition for structuring baseball live games.
3453-3456
- Hélène Bonneau-Maynard, Sophie Rosset, Christelle Ayache, A. Kuhn, Djamel Mostefa:
Semantic annotation of the French MEDIA dialog corpus.
3457-3460
- Ralf Engel:
Robust and efficient semantic parsing of free word order languages in spoken dialogue systems.
3461-3464
- Catherine Kobus, Géraldine Damnati, Lionel Delphin-Poulat, Renato de Mori:
Conceptual language model design for spoken language understanding.
3465-3468
- Luís Seabra Lopes, António J. S. Teixeira, Marcelo Quinderé, Mário Rodrigues:
From robust spoken language understanding to knowledge acquisition and management.
3469-3472
- Cheng Wu, Xiang Li, Hong-Kwang Jeff Kuo, E. E. Jan, Vaibhava Goel, David Lubensky:
Improving end-to-end performance of call classification through data confusion reduction and model tolerance enhancement.
3473-3476
Last update Fri May 25 08:23:10 2012 CET by the DBLP Team. Data released under the ODC-BY 1.0 license.