ICASSP 2006: Toulouse, France
- 2006 IEEE International Conference on Acoustics Speech and Signal Processing, ICASSP 2006, Toulouse, France, May 14-19, 2006. IEEE 2006, ISBN 1-4244-0469-X
Volume 1
Speech Coding for Network Applications
- Stéphane Ragot, Balázs Kövesi, David Virette, Romain Trilling, Dominique Massaloux:
A 8-32 kbit/s Scalable Wideband Speech and Audio Coding Candidate for ITU-T G729EV Standardization. 1-4 - Peter Jax, Bernd Geiser, Stefan Schandl, Hervé Taddei, Peter Vary:
An Embedded Scalable Wideband Codec Based on the GSM EFR Codec. 5-8 - Efrain Orozco, Stephane Villette, Ahmet M. Kondoz:
Multiple Description Coding for Voice over IP using Sinusoidal Speech Coding. 9-12 - Mohamed Chibani, Roch Lefebvre, Philippe Gournay:
Resynchronization of the Adaptive Codebook in a Constrained CELP Codec After a Frame Erasure. 13-16 - Philippe Gournay, Kyle D. Anderson:
Performance Analysis of a Decoder-Based Time Scaling Algorithm for Variable Jitter Buffering of Speech Over Packet Networks. 17-20 - Manohar N. Murthi, Christoffer Asgaard Rødbro, Søren Vang Andersen, Søren Holdt Jensen:
Packet Loss Concealment with Natural Variations using HMM. 21-24
Advances in Robust Speech Recognition
- Daniel Willett, Franz Gerl, Raymond Brueckner:
Discriminatively Trained Context-Dependent Duration-Bigram Models for Korean Digit Recognition. 25-28 - Hamed Ketabdar, Jithendra Vepa, Samy Bengio, Hervé Bourlard:
Using More Informative Posterior Probabilities for Speech Recognition. 29-32 - Keiichiro Oura, Heiga Zen, Yoshihiko Nankaku, Akinobu Lee, Keiichi Tokuda:
Hidden Semi-Markov Model Based Speech Recognition System using Weighted Finite-State Transducer. 33-36 - Chao Huang, Yingchun Huang, Frank K. Soong, Jianlai Zhou:
Weighted Likelihood Ratio (WLR) Hidden Markov Model for Noisy Speech Recognition. 37-40 - Yi Chen, Chia-Yu Wan, Lin-Shan Lee:
Entropy-Based Feature Parameter Weighting for Robust Speech Recognition. 41-44 - Jen-Tzung Chien, Chih-Hsien Huang, Koichi Shinoda, Sadaoki Furui:
Towards Optimal Bayes Decision for Speech Recognition. 45-48
Spoken Language Dialog
- Olivier Pietquin, Thierry Dutoit:
Dynamic Bayesian Networks for NLU Simulation with Applications to Dialog Optimal Strategy Learning. 49-52 - Srinivas Bangalore, Giuseppe Di Fabbrizio, Amanda Stent:
Towards Learning to Converse: Structuring Task-Oriented Human-Human Dialogs. 53-56 - Esther Levin, Alex Levin:
Dialog Design for User Adaptation. 57-60 - Pavel Král, Christophe Cerisara, Jana Klecková:
Automatic Dialog Acts Recognition Based on Sentence Structure. 61-64 - James Hieronymus, Greg Aist, John Dowding:
Open Microphone Speech Understanding: Correct Discrimination of in Domain Speech. 65-68 - Cheongjae Lee, Sangkeun Jung, Jihyun Eun, Minwoo Jeong, Gary Geunbae Lee:
A Situation-Based Dialogue Management using Dialogue Examples. 69-72
Speech Synthesis I
- Ellen Eide, Michael Picheny:
Towards Pooled-Speaker Concatenative Text-to-Speech. 73-76 - Junichi Yamagishi, Katsumi Ogata, Yuji Nakano, Juri Isogai, Takao Kobayashi:
HSMM-Based Model Adaptation Algorithms for Average-Voice-Based Speech Synthesis. 77-80 - David Sündermann, Harald Höge, Antonio Bonafonte, Hermann Ney, Alan W. Black, Shrikanth S. Narayanan:
Text-Independent Voice Conversion Based on Unit Selection. 81-84 - Helenca Duxans, Antonio Bonafonte:
Residual Conversion Versus Prediction on Voice Morphing Systems. 85-88 - Yi-Jian Wu, Ren-Hua Wang:
Minimum Generation Error Training for HMM-Based Speech Synthesis. 89-92 - Kenichi Nakamura, Tomoki Toda, Yoshihiko Nankaku, Keiichi Tokuda:
On the Use of Phonetic Information for Mapping from Articulatory Movements to Vocal Tract Spectrum. 93-96
Advances in Speaker Recognition
- William M. Campbell, Douglas E. Sturim, Douglas A. Reynolds, Alex Solomonoff:
SVM Based Speaker Verification using a GMM Supervector Kernel and NAP Variability Compensation. 97-100 - Luciana Ferrer, Elizabeth Shriberg, Sachin S. Kajarekar, Andreas Stolcke, M. Kemal Sönmez, Anand Venkataraman, Harry Bratt:
The Contribution of Cepstral and Stylistic Features to SRI's 2005 NIST Speaker Recognition Evaluation System. 101-104 - Zi-He Chen, Zhi-Ren Zeng, Yuan-Fu Liao, Yau-Tarng Juang:
Probabilistic Latent Prosody Analysis for Robust Speaker Verification. 105-108 - Jason W. Pelecanos, Daniel Povey, Ganesh N. Ramaswamy:
Secondary Classification for GMM Based Speaker Recognition. 109-112 - Patrick Kenny, Gilles Boulianne, Pierre Ouellet, Pierre Dumouchel:
Improvements in Factor Analysis Based Speaker Verification. 113-116 - Yanlu Xie, Beiqian Dai, Zhiqiang Yao, Minghui Liu:
Kurtosis Normalization in Feature Space for Robust Speaker Verification. 117-120
Advances in LVCSR Algorithms
- Emilian Stoimenov, John W. McDonough:
Modeling Polyphone Context with Weighted Finite-State Transducers. 121-124 - Chia-Yu Wan, Lin-Shan Lee:
Joint Uncertainty Decoding (JUD) with Histogram-Based Quantization (HQ) for Robust and/or Distributed Speech Recognition. 125-128 - Martin I. Layton, Mark J. F. Gales:
Augmented Statistical Models for Speech Recognition. 129-132 - Yao Qian, Frank K. Soong, Tan Lee:
Tone-Enhanced Generalized Character Posterior Probability (GCPP) for Cantonese LVCSR. 133-136 - Yoo Rhee Oh, Jae Sam Yoon, Hong Kook Kim:
Acoustic Model Adaptation Based on Pronunciation Variability Analysis for Non-Native Speech Recognition. 137-140 - Zhengyu Zhou, Jianfeng Gao, Frank K. Soong, Helen Meng:
A Comparative Study of Discriminative Methods for Reranking LVCSR N-Best Hypotheses in Domain Adaptation and Generalization. 141-144
Speech Enhancement for Noise Suppression
- Nicholas W. D. Evans, John S. D. Mason, Wei Ming Liu, Benoit G. B. Fauve:
An Assessment on the Fundamental Limitations of Spectral Subtraction. 145-148 - Teddy Surya Gunawan, Eliathamby Ambikairajah:
A New Forward Masking Model and its Application to Speech Enhancement. 149-152 - Yi Hu, Philipos C. Loizou:
Subjective Comparison of Speech Enhancement Algorithms. 153-156 - Siu Wa Lee, Frank K. Soong, Pak-Chung Ching:
An Iterative Trajectory Regeneration Algorithm for Separating Mixed Speech Sources. 157-160 - David Yuheng Zhao, W. Bastiaan Kleijn:
HMM-Based Speech Enhancement using Explicit Gain Modeling. 161-164 - Mark Hadley, Ben Milner, Richard W. Harvey:
Noise Reduction for Driver-To-Pit-Crew Communication in Motor Racing. 165-168
Efficient Techniques for LVCSR
- Valentin Ion, Reinhold Haeb-Umbach:
An Inexpensive Packet Loss Compensation Scheme for Distributed Speech Recognition Based on Soft-Features. 169-172 - Kaisheng Yao, Lorin Netsch, Vishu Viswanathan:
Speaker-Independent Name Recognition Using Improved Compensation and Acoustic Modeling Methods for Mobile Applications. 173-176 - Shin-ya Ishikawa, Kiyoshi Yamabana, Ryosuke Isotani, Akitoshi Okumura:
Parallel LVCSR Algorithm for Cellphone-Oriented Multicore Processors. 177-180 - Jussi Leppänen, Imre Kiss:
Gaussian Selection with Non-Overlapping Clusters for ASR in Embedded Devices. 181-184 - David Huggins-Daines, Mohit Kumar, Arthur Chan, Alan W. Black, Mosur Ravishankar, Alexander I. Rudnicky:
Pocketsphinx: A Free, Real-Time Continuous Speech Recognition System for Hand-Held Devices. 185-188 - Stephen So, Kuldip K. Paliwal:
Multi-Frame GMM-Based Block Quantisation for Distributed Speech Recognition Under Noisy Conditions. 189-192
Spoken Language Identification
- Chi-Yueh Lin, Hsiao-Chuan Wang:
Language Identification Using Pitch Contour Information in the Ergodic Markov Model. 193-196 - Pavel Matejka, Petr Schwarz, Lukás Burget, Jan Cernocký:
Use of Anti-Models to Further Improve State-of-the-Art PRLM Language Recognition System. 197-200 - Felicity Allen, Eliathamby Ambikairajah, Julien Epps:
Warped Magnitude and Phase-Based Features for Language Identification. 201-204 - Rong Tong, Bin Ma, Donglai Zhu, Haizhou Li, Engsiong Chng:
Integrating Acoustic, Prosodic and Phonotactic Features for Spoken Language Identification. 205-208 - Lukás Burget, Pavel Matejka, Jan Cernocký:
Discriminative Training Techniques for Acoustic Language Identification. 209-212 - Christopher White, Izhak Shafran, Jean-Luc Gauvain:
Discriminative Classifiers for Language Recognition. 213-216
Speaker Adaptation
- Kai Yu, Mark J. F. Gales:
Incremental Adaptation using Bayesian Inference. 217-220 - Suleyman Serdar Kozat, Karthik Visweswariah, Ramesh Gopinath:
Feature Adaptation Based on Gaussian Posteriors. 221-224 - George Saon:
A Non-Linear Speaker Adaptation Technique using Kernel Ridge Regression. 225-228 - Brian Mak, Tsz-Chung Lai, Roger Wend-Huu Hsiao:
Improving Reference Speaker Weighting Adaptation by the Use of Maximum-Likelihood Reference Speakers. 229-232 - Jeff Siu-Kei Au-Yeung, Man-Hung Siu:
Reference Speaker Weighting Adaptation for Sub-Phonetic Polynomial Segment Models. 233-236 - Xiao Li, Jeff A. Bilmes:
Regularized Adaptation of Discriminative Classifiers. 237-240
Advances in Speech Analysis and Representations
- Aren Jansen, Partha Niyogi:
Intrinsic Fourier Analysis on the Manifold of Speech Sounds. 241-244 - L. Anders Ekman, W. Bastiaan Kleijn, Manohar N. Murthi:
Spectral Envelope Estimation and Regularization. 245-248 - Patricia A. Pelle:
A Robust Pitch Extraction System Based on Phase Locked Loops. 249-252 - Patrick J. Clemins, Marek B. Trawicki, Kuntoro Adi, Jidong Tao, Michael T. Johnson:
Generalized Perceptual Features for Vocalization Analysis Across Multiple Species. 253-256 - Peter Jancovic, Münevver Köküer:
Voicing-Character Estimation of Speech Spectra: Application to Noise Robust Speech Recognition. 257-260 - Nobuaki Minematsu, Satoshi Asakawa, Keikichi Hirose:
Para-Linguistic Information Represented as Distortion of the Acoustic Universal Structure in Speech. 261-264
Discriminative Training
- Fei Sha, Lawrence K. Saul:
Large Margin Gaussian Mixture Modeling for Phonetic Classification and Recognition. 265-268 - Chaojun Liu, Hui Jiang, Luca Rigazio:
Recent Improvement on Maximum Relative Margin Estimation of HMMs for Speech Recognition. 269-272 - Milind Mahajan, Asela Gunawardana, Alex Acero:
Training Algorithms for Hidden Conditional Random Fields. 273-276 - Mohamed Kamal Omar, Ganesh N. Ramaswamy:
Maximum Conditional Mutual Information Weighted Scoring for Speech Recognition. 277-280 - Jasha Droppo, Alex Acero:
Joint Discriminative Front End and Back End Training for Improved Speech Recognition Accuracy. 281-284 - Fabrice Lefèvre, Jean-Luc Gauvain:
Discriminant Initialization for Factor Analyzed HMM Training. 285-288
Missing Data Methods in Robust Speech Recognition
- Kalle J. Palomäki, Guy J. Brown, Jon P. Barker:
Recognition of Reverberant Speech using Full Cepstral Features and Spectral Missing Data. 289-292 - Hugo Van hamme:
Handling Time-Derivative Features in a Missing Data Framework for Robust Automatic Speech Recognition. 293-296 - Soundararajan Srinivasan, DeLiang Wang:
A Supervised Learning Approach to Uncertainty Decoding for Robust Speech Recognition. 297-300 - Sébastien Demange, Christophe Cerisara, Jean-Paul Haton:
Mask Estimation for Missing Data Recognition using Background Noise Sniffing. 301-304 - Wooil Kim, Richard M. Stern:
Band-Independent Mask Estimation for Missing-Feature Reconstruction in the Presence of Unknown Background Noise. 305-308 - Nicoleta Roman, Soundararajan Srinivasan, DeLiang Wang:
Speech Recognition in Multisource Reverberant Environments with Binaural Inputs. 309-312
Feature Extraction and Modeling
- Bing Zhang, Spyros Matsoukas, Richard M. Schwartz:
Discriminatively Trained Region Dependent Feature Transforms for Speech Recognition. 313-316 - Alessandro Duminuco, Chaojun Li, David Kryze, Luca Rigazio:
Flexible Feature Spaces Based on Generalized Heteroscedastic Linear Discriminant Analysis. 317-320 - Andreas Stolcke, Frantisek Grézl, Mei-Yuh Hwang, Xin Lei, Nelson Morgan, Dimitra Vergyri:
Cross-Domain and Cross-Language Portability of Acoustic Features Estimated by Multilayer Perceptrons. 321-324 - Petr Schwarz, Pavel Matejka, Jan Cernocký:
Hierarchical Structures of Neural Networks for Phoneme Recognition. 325-328 - Frederik Stouten, Jean-Pierre Martens:
On the Use of Phonological Features for Pronunciation Scoring. 329-332 - Jun Hou, Lawrence R. Rabiner, Sorin Dusan:
Automatic Speech Attribute Transcription (ASAT) - The Front End Processor. 333-336 - Jinsong Zhang, Xinhui Hu, Satoshi Nakamura:
Automatic Derivation of a Phoneme Set with Tone Information for Chinese Speech Recognition Based on Mutual Information Criterion. 337-340 - Masahide Ariu, Takashi Masuko, Shinichi Tanaka, Akinori Kawamura:
Speech Recognition Using Syllable Duration Ratio Model. 341-344 - Ghazi Bouselmi, Dominique Fohr, Irina Illina, Jean-Paul Haton:
Fully Automated Non-Native Speech Recognition Using Confusion-Based Acoustic Model Integration and Graphemic Constraints. 345-348 - Tingyao Wu, Dirk Van Compernolle, Jacques Duchateau, Hugo Van hamme:
Maximum Likelihood Based Temporal Frame Selection. 349-352 - Alexandros Potamianos, Eduardo Sánchez-Soto, Khalid Daoudi:
Stream Weight Computation for Multi-Stream Classifiers. 353-356 - Özgür Çetin, Elizabeth Shriberg:
Speaker Overlaps and ASR Errors in Meetings: Effects Before, During, and After the Overlap. 357-360
Speech Production, Analysis and Modeling
- Asterios Toutios, Konstantinos G. Margaritis:
Learning Electropalatograms from Acoustics. 361-364 - Bruce Denby, Yacine Oussar, Gérard Dreyfus, Maureen Stone:
Prospects for a Silent Speech Interface using Ultrasound Imaging. 365-368 - Li Deng, Xiaodong Cui, Robert Pruvenok, Yanyi Chen, Safiyy Momen, Abeer Alwan:
A Database of Vocal Tract Resonance Trajectories for Research in Speech Processing. 369-372 - Huiqun Deng, Rabab K. Ward, Michael P. Beddoes, Douglas D. O'Shaughnessy:
Obtaining Lip and Glottal Reflection Coefficients from Vowel Sounds. 373-376 - Hui Li, Beiqian Dai, Wei Lu:
A Pitch Detection Algorithm Based on AMDF and ACF. 377-380 - Damien Vincent, Olivier Rosec, Thierry Chonavel:
Glottal Closure Instant Estimation using an Appropriateness Measure of the Source and Continuity Constraints. 381-384 - Abdellah Kacha, Francis Grenez, Jean Schoentgen:
Frame-Based Acoustic Cues of Vocal Dysperiodicity in Connected Speech. 385-388 - Markus Iseli, Yen-Liang Shue, Abeer Alwan:
Age- and Gender-Dependent Analysis of Voice Source Characteristics. 389-392 - Matteo Gerosa, Sungbok Lee, Diego Giuliani, Shrikanth S. Narayanan:
Analyzing Children's Speech: An Acoustic Study of Consonants and Consonant-Vowel Transition. 393-396 - Satoru Tsuge, Masami Shishibori, Kenji Kita, Fuji Ren, Shingo Kuroiwa:
Study of Intra-Speakers Speech Variability Over Long and Short Time Periods for Speech Recognition. 397-400 - Ayako Ikeno, John H. L. Hansen:
Perceptual Recognition Cues in Native English Accent Variation: "Listener Accent, Perceived Accent, and Comprehension". 401-404
Novel LVCSR Algorithms
- Øystein Birkenes, Tomoko Matsui, Kunio Tanabe:
Isolated-Word Recognition with Penalized Logistic Regression Machines. 405-408 - Alex Park, James R. Glass:
Unsupervised Word Acquisition from Speech using Pattern Discovery. 409-412 - Mithun Balakrishna, Dan I. Moldovan, Ellis Cave:
N-Best List Reranking using Higher Level Phonetic, Lexical, Syntactic and Semantic Knowledge Sources. 413-416 - I. Lee Hetherington, Han Shu, James R. Glass:
Flexible Multi-Stream Framework for Speech Recognition using Multi-Tape Finite-State Transducers. 417-420 - Rong Zhang, Alexander I. Rudnicky:
A New Data Selection Approach for Semi-Supervised Acoustic Modeling. 421-424 - Evan Ruzanski, John H. L. Hansen, James Meyerhoff, George Saviolakis, William Norris, Terry Wollert:
Stress Level Classification of Speech Using Euclidean Distance Metrics in a Novel Hybrid Multi-Dimensional Feature Space. 425-428 - Tanel Alumäe:
Sentence-Adapted Factored Language Model for Transcribing Estonian Speech. 429-432 - Petr Fousek, Hynek Hermansky:
Towards ASR Based on Hierarchical Posterior-Based Keyword Recognition. 433-436 - Louis ten Bosch, Annika Hämäläinen, Odette Scharenborg, Lou Boves:
Acoustic Scores and Symbolic Mismatch Penalties in Phone Lattices. 437-440 - Yun Tang, Wenju Liu, Hua Zhang, Bo Xu, Guo-Hong Ding:
One-Pass Coarse-to-Fine Segmental Speech Decoding Algorithm. 441-444 - Guillermo Aradilla, Jithendra Vepa, Hervé Bourlard:
Using Pitch as Prior Knowledge in Template-Based Speech Recognition. 445-448 - Man-Hung Siu, Jeff Siu-Kei Au-Yeung:
Robust Large Vocabulary Continuous Speech Recognition using Polynomial Segment Model with Unsupervised Adaptation. 449-452
Speech Enhancement in Adverse Environments
- Richard C. Hendriks, Richard Heusdens, Jesper Jensen:
Speech Enhancement Under a Combined Stochastic-Deterministic Model. 453-456 - Lutfa Akter, Md. Kamrul Hasan:
Crosscorrelation Compensated Wiener Filter for Speech Enhancement. 457-460 - Chang Huai You, Susanto Rahardja, Soo Ngee Koh:
Perceptual Kalman Filtering Speech Enhancement. 461-464 - Stefan Windmann, Reinhold Haeb-Umbach:
Iterative Speech Enhancement using a Non-Linear Dynamic State Model of Speech and its Parameters. 465-468 - Sung-il Jung, Younghun Kwon, Sung-il Yang:
Speech Enhancement by Wavelet Packet Transform with Best Fitting Regression Line in Various Noise Environments. 469-472 - Masanori Kato, Akihiko Sugiyama:
A Low-Complexity Noise Suppressor with Nonuniform Subbands and a Frequency-Domain Highpass Filter. 473-476