Takao Kobayashi, Keikichi Hirose, Satoshi Nakamura (Eds.):
INTERSPEECH 2010, 11th Annual Conference of the International Speech Communication Association, Makuhari, Chiba, Japan, September 26-30, 2010.
: Still talking to machines (cognitively speaking).
: Sound-based assistive technology supporting "seeing", "hearing" and "speaking" for the disabled and the elderly.
Models of Speech - In Search of Better Representations
Acoustic Models I-III
Khe Chai Sim: Probabilistic state clustering using conditional random field for context-dependent acoustic modelling.
, Yunxin Zhao
: Integrate template matching and statistical modeling for speech recognition.
, Yunxin Zhao
: Integrating MLP features and discriminative training in data sampling based ensemble acoustic modeling.
, Yu Shi
, Qiang Huo
: A study of irrelevant variability normalization based training and unsupervised online adaptation for LVCSR.
, Yu Hu
, Hui Jiang
: Boosted mixture learning of Gaussian mixture HMMs for speech recognition.
, Reinhold Haeb-Umbach
: On the exploitation of hidden Markov models and linear dynamic models in a hybrid decoder architecture for continuous speech recognition.
Khe Chai Sim, Shilin Liu: Semi-parametric trajectory modelling using temporally varying feature mapping for speech recognition.
, Li Deng
: Deep-structured hidden conditional random fields for phonetic recognition.
Spoken Dialogue Systems I, II
Spoken Dialogue Systems II
, David Griol
: New technique to enhance the performance of spoken dialogue systems based on dialogue states-dependent language models and grammatical rules.
Factors Influencing Perception
, Paola Escudero
: The interrelation between the stimulus range and the number of response categories in vowel categorization.
, Katalin Mády
: Nucleus position within the intonation phrase: a typological study of English, Czech and Hungarian.
Unit Selection and Others
, Simon King
: A classifier-based target cost for unit selection speech synthesis trained on perceptual data.
, Xiaodong Cui
: Applying scalable phonetic context similarity in unit selection of concatenative text-to-speech.
Search, Decoding and Confidence Measures I, II
: Incremental composition of static decoding graphs with label pushing.
, Wenju Liu
: A novel path extension framework using steady segment detection for Mandarin speech recognition.
Special-Purpose Speech Applications
, Yusuke Takano
: Estimation of two-to-one forced selection intelligibility scores by speech recognizers using noise-adapted models.
, Karen Livescu
: Modeling pronunciation variation with context-dependent articulatory feature decision trees.
Systems for LVCSR
Speaker Characterization and Recognition I-IV
, Rong Zheng
, Bo Xu
: An investigation into direct scoring methods without SVM training in speaker verification.
, Bo Xu
: On the use of Gaussian component information in the generative likelihood ratio estimation for speaker verification.
, Wei Rao
: Acoustic vector resampling for GMMSVM-based speaker verification.
: A fast speaker indexing using vector quantization and second order statistics with adaptive threshold computation.
, Brian Mak
: The use of subvector quantization and discrete densities for fast GMM computation for speaker verification.
HMM-Based Speech Synthesis I, II
: Speaker and language adaptive training for HMM-based polyglot speech synthesis.
, Martin J. Russell
: Improved modelling of speech dynamics using non-linear formant trajectories for HMM-based speech synthesis.
, Hisashi Kawai
: An unsupervised approach to creating web audio contents-based HMM voices.
Multi-Modal Signal Processing
, Marion Dohen
: Gesture and speech coordination: the influence of the relationship between manual gesture and speech.
Speaker Adaptation, Robustness Against Reverberation
, Rohit Sinha
: Enhancing children's speech recognition under mismatched condition by explicit acoustic normalization.
, Khe Chai Sim
: Comparison of discriminative input and output transformations for speaker adaptation in the hybrid NN/HMM systems.
, Climent Nadeu
: On the potential of channel selection for recognition of reverberated speech with multiple microphones.
Language Learning, TTS, and Other Applications
, Seiichi Nakagawa
: Automatic evaluation of English pronunciation by Japanese speakers using various acoustic features and pattern recognition techniques.
, John Dines
: An analysis of language mismatch in HMM state mapping-based cross-lingual speaker adaptation.
Pitch and Glottal-Waveform Estimation and Modeling I, II
: On evaluation of the f0 estimation based on time-varying complex speech analysis.
, Tan Lee
: Pitch estimation in noisy speech based on temporal accumulation of spectrum peaks.
, Abeer Alwan
: SAFE: a statistical algorithm for F0 estimation for both clean and noisy speech.
Open Vocabulary Spoken Document Retrieval (Special Session)
, Weiqiang Zhang
, Jia Liu
: Combining Chinese spoken term detection systems via side-information conditioned linear logistic regression.
, Lin-Shan Lee
: Unsupervised spoken-term detection with spoken queries using segment-based dynamic time warping.
Language and Dialect Identification
Technologies for Learning and Education
Wai Kit Lo, Shuang Zhang, Helen M. Meng: Automatic derivation of phonological rules for mispronunciation detection in a computer-assisted pronunciation training system.
, Jack Mostow
: Adapting a duration synthesis model to rate children's oral reading prosody.
, Laurence Devillers
: Incremental acoustic valence recognition: an inter-corpus perspective on features, matching, and performance in a gating paradigm.
New Paradigms in ASR I, II
, Lin-Shan Lee
: Mandarin tone recognition using affine-invariant prosodic features and tone posteriorgram.
: Similarity of effects of emotions on the speech organ configuration with and without speaking.
: Korean lenis, fortis, and aspirated stops: effect of place of articulation on acoustic realization.
, Tatsuya Hirahara
: Physics of body-conducted silent speech - production, propagation and representation of non-audible murmur.
, Qiang Fu
, Yonghong Yan
: Speech enhancement using improved generalized sidelobe canceller in frequency domain with multi-channel postfiltering.
, Weiqian Liang
: Integrated feedback and noise reduction algorithm in digital hearing aids via oscillation detection.
Fact and Replica of Speech Production (Special Session)
: Effects of wall impedance on transmission and attenuation of higher-order modes in vocal-tract model.
: Mechanical vocal-tract models for speech dynamics.
Michael C. Brady: Prosodic timing analysis for articulatory re-synthesis using a bank of resonators with an adaptive oscillator.
Single-Channel Speech Enhancement