INTERSPEECH 2004: Jeju Island, Korea
INTERSPEECH 2004 - ICSLP, 8th International Conference on Spoken Language Processing, Jeju Island, Korea, October 4-8, 2004.
ISCA 2004
Plenary Talks
- Chin-Hui Lee:
From decoding-driven to detection-based paradigms for automatic speech recognition.
- Hyun-Bok Lee:
In search of a universal phonetic alphabet - theory and application of an organic visible speech-.
- Jacqueline Vaissière:
From X-ray or MRI data to sounds through articulatory synthesis: towards an integrated view of the speech communication process.
Speech Recognition - Adaptation
- Sreeram Balakrishnan, Karthik Visweswariah, Vaibhava Goel:
Stochastic gradient adaptation of front-end parameters.
- Antoine Raux, Rita Singh:
Maximum-likelihood adaptation of semi-continuous HMMs by latent variable decomposition of state distributions.
- Chao Huang, Tao Chen, Eric Chang:
Transformation and combination of hidden Markov models for speaker selection training.
- Brian Kan-Wing Mak, Roger Wend-Huu Hsiao:
Improving eigenspace-based MLLR adaptation by kernel PCA.
- Nikos Chatzichrisafis, Vassilios Digalakis, Vassilios Diakoloukas, Costas Harizakis:
Rapid acoustic model development using Gaussian mixture clustering and language adaptation.
- Karthik Visweswariah, Ramesh A. Gopinath:
Adaptation of front end parameters in a speech recognizer.
- Diego Giuliani, Matteo Gerosa, Fabio Brugnara:
Speaker normalization through constrained MLLR based transforms.
- Xiangyu Mu, Shuwu Zhang, Bo Xu:
Multi-layer structure MLLR adaptation algorithm with subspace regression classes and tying.
- Georg Stemmer, Stefan Steidl, Christian Hacker, Elmar Nöth:
Adaptation in the pronunciation space for non-native speech recognition.
- Xuechuan Wang, Douglas D. O'Shaughnessy:
Robust ASR model adaptation by feature-based statistical data mapping.
- Zhaobing Han, Shuwu Zhang, Bo Xu:
A novel target-driven generalized JMAP adaptation algorithm.
- Brian Mak, Simon Ho, James T. Kwok:
Speedup of kernel eigenvoice speaker adaptation by embedded kernel PCA.
- Hyung Bae Jeon, Dong Kook Kim:
Maximum a posteriori eigenvoice speaker adaptation for Korean connected digit recognition.
- Wei Wang, Stephen Zahorian:
Vocal tract normalization based on spectral warping.
- Koji Tanaka, Fuji Ren, Shingo Kuroiwa, Satoru Tsuge:
Acoustic model adaptation for coded speech using synthetic speech.
- Motoyuki Suzuki, Hirokazu Ogasawara, Akinori Ito, Yuichi Ohkawa, Shozo Makino:
Speaker adaptation method for CALL system using bilingual speakers' utterances.
- Shinji Watanabe:
Acoustic model adaptation based on coarse/fine training of transfer vectors and its application to a speaker adaptation task.
- Wei-Ho Tsai, Shih-Sian Cheng, Hsin-Min Wang:
Speaker clustering of speech utterances using a voice characteristic reference space.
- Young Kuk Kim, Hwa Jeon Song, Hyung Soon Kim:
Performance improvement of connected digit recognition using unsupervised fast speaker adaptation.
- Hyung Soon Kim, Hwa Jeon Song:
Simultaneous estimation of weights of eigenvoices and bias compensation vector for rapid speaker adaptation.
- Matthias Wölfel:
Speaker dependent model order selection of spectral envelopes.
- Enrico Bocchieri, Michael Riley, Murat Saraclar:
Methods for task adaptation of acoustic models with limited transcribed in-domain data.
- Atsushi Fujii, Tetsuya Ishikawa, Katsunobu Itou, Tomoyosi Akiba:
Unsupervised topic adaptation for lecture speech retrieval.
- Haibin Liu, Zhenyang Wu:
Mean and covariance adaptation based on minimum classification error linear regression for continuous density HMMs.
- Goshu Nagino, Makoto Shozakai:
Design of ready-made acoustic model library by two-dimensional visualization of acoustic space.
Spoken Language Identification, Translation and Retrieval I
- Jean-Luc Gauvain, Abdelkhalek Messaoudi, Holger Schwenk:
Language recognition using phone lattices.
- Mark Huckvale:
ACCDIST: a metric for comparing speakers' accents.
- Michael Levit, Allen L. Gorin, Patrick Haffner, Hiyan Alshawi, Elmar Nöth:
Aspects of named entity processing.
- Josep Maria Crego, José B. Mariño, Adrià de Gispert:
Finite-state-based and phrase-based statistical machine translation.
- Tanja Schultz, Szu-Chen Stan Jou, Stephan Vogel, Shirin Saleem:
Using word lattice information for a tighter coupling in speech translation systems.
- Teruhisa Misu, Tatsuya Kawahara, Kazunori Komatani:
Confirmation strategy for document retrieval systems with spoken dialog interface.
- Shi-wook Lee, Kazuyo Tanaka, Yoshiaki Itoh:
Multilayer subword units for open-vocabulary spoken document retrieval.
- Yoshiaki Itoh, Kazuyo Tanaka, Shi-wook Lee:
An efficient partial matching algorithm toward speech retrieval by speech.
- Celestin Sedogbo, Sébastien Herry, Bruno Gas, Jean-Luc Zarader:
Language detection by neural discrimination.
- Ricardo de Córdoba, Javier Ferreiros, Valentín Sama, Javier Macías Guarasa, Luis Fernando D'Haro, Fernando Fernandez:
Language identification techniques based on full recognition in an air traffic control task.
- John H. L. Hansen, Umit H. Yapanel, Rongqing Huang, Ayako Ikeno:
Dialect analysis and modeling for automatic classification.
- Emmanuel Ferragne, François Pellegrino:
Rhythm in read British English: interdialect variability.
- Pascale Fung, Yi Liu, Yongsheng Yang, Yihai Shen, Dekai Wu:
A grammar-based Chinese to English speech translation system for portable devices.
- Gökhan Tür:
Cost-sensitive call classification.
- Mikko Kurimo, Ville T. Turunen, Inger Ekman:
An evaluation of a spoken document retrieval baseline system in Finnish.
- Hui Jiang, Pengfei Liu, Imed Zitouni:
Discriminative training of naive Bayes classifiers for natural language call routing.
- Nicolas Moreau, Hyoung-Gook Kim, Thomas Sikora:
Phonetic confusion based document expansion for spoken document retrieval.
- Euisok Chung, Soojong Lim, Yi-Gyu Hwang, Myung-Gil Jang:
Hybrid named entity recognition for question-answering system.
- Jitendra Ajmera, Iain McCowan, Hervé Bourlard:
An online audio indexing system.
- Eric Sanders, Febe de Wet:
Histogram normalisation and the recognition of names and ontology words in the MUMIS project.
- Rui Amaral, Isabel Trancoso:
Improving the topic indexation and segmentation modules of a media watch system.
- Melissa Barkat-Defradas, Rym Hamdi, Emmanuel Ferragne, François Pellegrino:
Speech timing and rhythmic structure in Arabic dialects: a comparison of two approaches.
- Hsin-Min Wang, Shih-Sian Cheng:
METRIC-SEQDAC: a hybrid approach for audio segmentation.
- Jen-Wei Kuo, Yao-Min Huang, Berlin Chen, Hsin-Min Wang:
Statistical Chinese spoken document retrieval using latent topical information.
- Masahiko Matsushita, Hiromitsu Nishizaki, Seiichi Nakagawa, Takehito Utsuro:
Keyword recognition and extraction by multiple-LVCSRs with 60,000 words in speech-driven WEB retrieval task.
- Ruiqiang Zhang, Gen-ichiro Kikui, Hirofumi Yamamoto, Frank K. Soong, Taro Watanabe, Eiichiro Sumita, Wai Kit Lo:
Improved spoken language translation using n-best speech recognition hypotheses.
- Kakeung Wong, Man-Hung Siu:
Automatic language identification using discrete hidden Markov model.
- Bowen Zhou, Daniel Déchelotte, Yuqing Gao:
Two-way speech-to-speech translation on handheld devices.
- Hervé Blanchon:
HLT modules scalability within the NESPOLE! project.
Linguistics, Phonology, and Phonetics
- Midam Kim:
Correlation between VOT and F0 in the perception of Korean stops and affricates.
- Aude Noiray, Lucie Ménard, Marie-Agnès Cathiard, Christian Abry, Christophe Savariaux:
The development of anticipatory labial coarticulation in French: a pioneering study.
- Melvyn John Hunt:
Speech recognition, syllabification and statistical phonetics.
- Jilei Tian:
Data-driven approaches for automatic detection of syllable boundaries.
- Anne Cutler, Dennis Norris, Núria Sebastián-Gallés:
Phonemic repertoire and similarity within the vocabulary.
- Sameer Maskey, Alan W. Black, Laura Tomokiya:
Bootstrapping phonetic lexicons for new languages.
- Mirjam Broersma, K. Marieke Kolkman:
Lexical representation of non-native phonemes.
- Jong-Pyo Lee, Tae-Yeoub Jang:
A comparative study on the production of inter-stress intervals of English speech by English native speakers and Korean speakers.
- Emi Zuiki Murano, Mihoko Teshigawara:
Articulatory correlates of voice qualities of good guys and bad guys in Japanese anime: an MRI study.
- Sorin Dusan:
Effects of phonetic contexts on the duration of phonetic segments in fluent read speech.
- Qiang Fang:
A study on nasal coda loss in continuous speech.
- Hua-Li Jian:
An improved pair-wise variability index for comparing the timing characteristics of speech.
- Hua-Li Jian:
An acoustic study of speech rhythm in Taiwan English.
- Sung-A. Kim:
Language specific phonetic rules: evidence from domain-initial strengthening.
- Hansang Park:
Spectral characteristics of the release bursts in Korean alveolar stops.
- Rob van Son, Olga Bolotova, Louis C. W. Pols, Mietta Lennes:
Frequency effects on vowel reduction in three typologically different languages (Dutch, Finnish, Russian).
- Julia Abresch, Stefan Breuer:
Assessment of non-native phones in anglicisms by German listeners.
- Sunhee Kim:
Phonology of exceptions for Korean grapheme-to-phoneme conversion.
- Shigeyoshi Kitazawa, Shinya Kiriyama:
Acoustic and prosodic analysis of Japanese vowel-vowel hiatus with laryngeal effect.
- Kimiko Tsukada:
A cross-linguistic acoustic comparison of unreleased word-final stops: Korean and Thai.
- Taehong Cho, Elizabeth K. Johnson:
Acoustic correlates of phrase-internal lexical boundaries in Dutch.
- Taehong Cho, James M. McQueen:
Phonotactics vs. phonetic cues in native and non-native listening: Dutch and Korean listeners' perception of Dutch and English.
- Svetlana Kaminskaia, François Poiré:
Comparing intonation of two varieties of French using normalized F0 values.
- Mira Oh, Kee-Ho Kim:
Phonetic realization of the suffix-suppressed accentual phrase in Korean.
- H. Timothy Bunnell, James B. Polikoff, Jane McNicholas:
Spectral moment vs. Bark cepstral analysis of children's word-initial voiceless stops.
- Nobuaki Minematsu:
Pronunciation assessment based upon the compatibility between a learner's pronunciation structure and the target language's lexical structure.
- Kenji Yoshida:
Spread of high tone in Akita Japanese.
Biomedical Applications of Speech Analysis
- Juan Ignacio Godino-Llorente, María Victoria Rodellar Biarge, Pedro Gómez Vilda, Francisco Díaz Pérez, Agustín Álvarez Marquina, Rafael Martínez-Olalla:
Biomechanical parameter fingerprint in the mucosal wave power spectral density.
- Cheolwoo Jo, Soo-Geon Wang, Byung-Gon Yang, Hyung-Soon Kim, Tao Li:
Classification of pathological voice including severely noisy cases.
- Qiang Fu, Peter Murphy:
A robust glottal source model estimation technique.
- Hiroki Mori, Yasunori Kobayashi, Hideki Kasuya, Hajime Hirose, Noriko Kobayashi:
F0 and formant frequency distribution of dysarthric speech - a comparative study.
- Hideki Kawahara, Yumi Hirachi, Masanori Morise, Hideki Banno:
Procedure "senza vibrato": a key component for morphing singing.
- Claudia Manfredi, Giorgio Peretti, Laura Magnoni, Fabrizio Dori, Ernesto Iadanza:
Thyroplastic medialisation in unilateral vocal fold paralysis: assessing voice quality recovering.
- Gernot Kubin, Martin Hagmüller:
Voice enhancement of male speakers with laryngeal neoplasm.
- Jong Min Choi, Myung-Whun Sung, Kwang Suk Park, Jeong-Hun Hah:
A comparison of the perturbation analysis between PRAAT and Computerized Speech Lab.
Robust Speech Recognition on AURORA
- Ming Ji, Baochun Hou:
Evaluation of universal compensation on Aurora 2 and 3 and beyond.
- Hugo Van hamme:
PROSPECT features and their application to missing data techniques for robust speech recognition.
- Hugo Van hamme, Patrick Wambacq, Veronique Stouten:
Accounting for the uncertainty of speech estimates in the context of model-based feature enhancement.
- Hans-Günter Hirsch, Harald Finster:
Applying the Aurora feature extraction schemes to a phoneme based recognition task.
- Zhipeng Zhang, Tomoyuki Ohya, Sadaoki Furui:
Evaluation of tree-structured piecewise linear transformation-based noise adaptation on AURORA2 database.
- Tor André Myrvoll, Satoshi Nakamura:
Online minimum mean square error filtering of noisy cepstral coefficients using a sequential EM algorithm.
- Akira Sasou, Kazuyo Tanaka, Satoshi Nakamura, Futoshi Asano:
HMM-based feature compensation method: an evaluation using the AURORA2.
- Xuechuan Wang, Douglas D. O'Shaughnessy:
Noise adaptation for robust AURORA 2 noisy digit recognition using statistical data mapping.
- Benjamin J. Shannon, Kuldip K. Paliwal:
MFCC computation from magnitude spectrum of higher lag autocorrelation coefficients for robust speech recognition.
- Muhammad Ghulam, Takashi Fukuda, Junsei Horikawa, Tsuneo Nitta:
A noise-robust feature extraction method based on pitch-synchronous ZCPA for ASR.
- José C. Segura, Ángel de la Torre, Javier Ramírez, Antonio J. Rubio, M. Carmen Benítez:
Including uncertainty of speech observations in robust speech recognition.
- Takeshi Yamada, Jiro Okada, Nobuhiko Kitawaki:
Integration of n-best recognition results obtained by multiple noise reduction algorithms.
- Panji Setiawan, Sorel Stan, Tim Fingscheidt:
Revisiting some model-based and data-driven denoising algorithms in Aurora 2 context.
- Guo-Hong Ding, Bo Xu:
Exploring high-performance speech recognition in noisy environments using high-order Taylor series expansion.
- Wing-Hei Au, Man-Hung Siu:
A robust training algorithm based on neighborhood information.
- Siu Wa Lee, Pak-Chung Ching:
In-phase feature induction: an effective compensation technique for robust speech recognition.
- Jeff Siu-Kei Au-Yeung, Man-Hung Siu:
Improved performance of Aurora 4 using HTK and unsupervised MLLR adaptation.
- Shang-nien Tsai, Lin-Shan Lee:
A new feature extraction front-end for robust speech recognition using progressive histogram equalization and multi-eigenvector temporal filtering.
Spoken / Multimodal Dialogue System
- Christian Fügen, Hartwig Holzapfel, Alex Waibel:
Tight coupling of speech recognition and dialog management - dialog-context dependent grammar weighting for speech recognition.
- Akinobu Lee, Keisuke Nakamura, Ryuichi Nisimura, Hiroshi Saruwatari, Kiyohiro Shikano:
Noise robust real world spoken dialogue system using GMM based rejection of unintended inputs.
- Hironori Oshikawa, Norihide Kitaoka, Seiichi Nakagawa:
Speech interface for name input based on combination of recognition methods using syllable-based n-gram and word dictionary.
- Imed Zitouni, Minkyu Lee, Hui Jiang:
Constrained minimization technique for topic identification using discriminative training and support vector machines.
- Jason D. Williams, Steve Young:
Characterizing task-oriented dialog using a simulated ASR channel.
- Takashi Konashi, Motoyuki Suzuki, Akinori Ito, Shozo Makino:
A spoken dialog system based on automatic grammar generation and template-based weighting for autonomous mobile robots.
- Akinori Ito, Takanobu Oba, Takashi Konashi, Motoyuki Suzuki, Shozo Makino:
Noise adaptive spoken dialog system based on selection of multiple dialog strategies.
- Mikko Hartikainen, Markku Turunen, Jaakko Hakulinen, Esa-Pekka Salonen, J. Adam Funk:
Flexible dialogue management using distributed and dynamic dialogue control.
- Keith Houck:
Contextual revision in information seeking conversation systems.
- Ian M. O'Neill, Philip Hanna, Xingkun Liu, Michael F. McTear:
Cross domain dialogue modelling: an object-based approach.
- Hirohiko Sagawa, Teruko Mitamura, Eric Nyberg:
A comparison of confirmation styles for error handling in a speech dialog system.
- Fan Yang, Peter A. Heeman:
Using computer simulation to compare two models of mixed-initiative.
- Fan Yang, Peter A. Heeman, Kristy Hollingshead:
Towards understanding mixed-initiative in task-oriented dialogues.
- Peter Wolf, Joseph Woelfel, Jan Van Gemert, Bhiksha Raj, David Wong:
Spokenquery: an alternate approach to choosing items with speech.
- Shona Douglas, Deepak Agarwal, Tirso Alonso, Robert M. Bell, Mazin G. Rahim, Deborah F. Swayne, Chris Volinsky:
Mining customer care dialogs for "daily news".
- Jens Edlund, Gabriel Skantze, Rolf Carlson:
Higgins - a spoken dialogue system for investigating error handling techniques.
- Fuliang Weng, Lawrence Cavedon, Badri Raghunathan, Danilo Mirkovic, Hua Cheng, Hauke Schmidt, Harry Bratt, Rohit Mishra, Stanley Peters, Sandra Upson, Elizabeth Shriberg, Carsten Bergmann, Lin Zhao:
A conversational dialogue system for cognitively overloaded users.
- Gerhard Hanrieder, Stefan W. Hamerich:
Modeling generic dialog applications for embedded systems.
- Matthew N. Stuttle, Jason D. Williams, Steve Young:
A framework for dialogue data collection with a simulated ASR channel.
- Shimei Pan:
A multi-layer conversation management approach for information seeking applications.
- Thomas K. Harris, Roni Rosenfeld:
A universal speech interface for appliances.
- Keita Hayashi, Yuki Irie, Yukiko Yamaguchi, Shigeki Matsubara, Nobuo Kawaguchi:
Speech understanding, dialogue management and response generation in corpus-based spoken dialogue system.
- Fernando Fernandez, Valentín Sama, Luis Fernando D'Haro, Rubén San Segundo, Ricardo de Córdoba, Juan Manuel Montero:
Implementation of dialog applications in an open-source VoiceXML platform.
- Chun Wai Lau, Bin Ma, Helen Mei-Ling Meng, Yiu-Sang Moon, Yeung Yam:
Fuzzy logic decision fusion in a multimodal biometric system.
- Peter Poller, Norbert Reithinger:
A state model for the realization of visual perceptive feedback in SmartKom.
- Akemi Iida, Yoshito Ueno, Ryohei Matsuura, Kiyoaki Aikawa:
A vector-based method for efficiently representing multivariate environmental information.
- Ioannis Toptsis, Shuyin Li, Britta Wrede, Gernot A. Fink:
A multi-modal dialog system for a mobile robot.
- Niels Ole Bernsen, Laila Dybkjær:
Structured interview-based evaluation of spoken multimodal conversation with H.C. Andersen.
Speech Recognition - Search
Spoken Dialogue and Systems
- Katsutoshi Ohtsuki, Nobuaki Hiroshima, Yoshihiko Hayashi, Katsuji Bessho, Shoichi Matsunaga:
Topic structure extraction for meeting indexing.
- Sophie Rosset, Lori Lamel:
Automatic detection of dialog acts based on multilevel information.
- Gina-Anne Levow:
Identifying local corrections in human-computer dialogue.
- Peter Reichl, Florian Hammer:
Hot discussion or frosty dialogue? towards a temperature metric for conversational interactivity.
- Stephanie Seneff, Chao Wang, I. Lee Hetherington, Grace Chung:
A dynamic vocabulary spoken dialogue interface.
- Matthias Denecke, Kohji Dohsaka, Mikio Nakano:
Learning dialogue policies using state aggregation in reinforcement learning.
Speech Perception
- Keren B. Shatzman:
Segmenting ambiguous phrases using phoneme duration.
- Shuichi Sakamoto, Yôiti Suzuki, Shigeaki Amano, Tadahisa Kondo, Naoki Iwaoka:
A compensation method for word-familiarity difference with SNR control in intelligibility test.
- Takashi Otake, Yoko Sakamoto, Yasuyuki Konomi:
Phoneme-based word activation in spoken-word recognition: evidence from Japanese school children.
- Belynda Brahimi, Philippe Boula de Mareüil, Cédric Gendrot:
Role of segmental and suprasegmental cues in the perception of Maghrebian-accented French.
- Hiroaki Kato, Yoshinori Sagisaka, Minoru Tsuzaki, Makiko Muto:
Effect of speaking rate on the acceptability of change in segment duration.
- Kiyoko Yoneyama:
A cross-linguistic study of diphthongs in spoken word processing in Japanese and English.
Multi-Lingual Speech-to-Speech Translation
Speech Recognition - Large Vocabulary
- Dominique Fohr, Odile Mella, Christophe Cerisara, Irina Illina:
The automatic news transcription system: ANTS, some real time experiments.
- Bhuvana Ramabhadran, Olivier Siohan, Geoffrey Zweig:
Use of metadata to improve recognition of spontaneous speech and named entities.
- Janne Pylkkönen, Mikko Kurimo:
Duration modeling techniques for continuous speech recognition.
- Tanel Alumäe:
Large vocabulary continuous speech recognition for Estonian using morpheme classes.
- Zhaobing Han, Shuwu Zhang, Bo Xu:
Combining agglomerative and tree-based state clustering for high accuracy acoustic modeling.
- William S.-Y. Wang, Gang Peng:
Parallel tone score association method for tone language speech recognition.
- Jing Zheng, Horacio Franco, Andreas Stolcke:
Effective acoustic modeling for rate-of-speech variation in large vocabulary conversational speech recognition.
- G. L. Sarada Ghadiyaram, N. Hemalatha Nagarajan, T. Nagarajan Thangavelu, Hema A. Murthy:
Automatic transcription of continuous speech using unsupervised and incremental training.
- Jan Nouza, Dana Nejedlová, Jindrich Zdánský, Jan Kolorenc:
Very large vocabulary speech recognition system for automatic transcription of Czech broadcast programs.
- Olivier Siohan, Bhuvana Ramabhadran, Geoffrey Zweig:
Speech recognition error analysis on the English MALACH corpus.
- Rong Zhang, Alexander I. Rudnicky:
A frame level boosting training scheme for acoustic modeling.
- Rong Zhang, Alexander I. Rudnicky:
Optimizing boosting with discriminative criteria.
- Xianghua Xu, Qiang Guo, Jie Zhu:
Restructuring HMM states for speaker adaptation in Mandarin speech recognition.
- Mike Matton, Mathias De Wachter, Dirk Van Compernolle, Ronald Cools:
A discriminative locally weighted distance measure for speaker independent template based speech recognition.
- Yohei Itaya, Heiga Zen, Yoshihiko Nankaku, Chiyomi Miyajima, Keiichi Tokuda, Tadashi Kitamura:
Deterministic annealing EM algorithm in parameter estimation for acoustic model.
- Frantisek Grézl, Martin Karafiát, Jan Cernocký:
TRAP based features for LVCSR of meeting data.
- Frank K. Soong, Wai Kit Lo, Satoshi Nakamura:
Optimal acoustic and language model weights for minimizing word verification errors.
- Atsushi Sako, Yasuo Ariki:
Structuring of baseball live games based on speech recognition using task dependent knowledge.
- Zhengyu Zhou, Helen M. Meng:
A two-level schema for detecting recognition errors.
- In-Jeong Choi, Nam-Hoon Kim, Su Youn Yoon:
Large vocabulary continuous speech recognition based on cross-morpheme phonetic information.
- Changxue Ma:
Automatic phonetic base form generation based on maximum context tree.
- Gustavo Hernández Ábrego, Lex Olorenshaw, Raquel Tato, Thomas Schaaf:
Dictionary refinements based on phonetic consensus and non-uniform pronunciation reduction.
- Abdelkhalek Messaoudi, Lori Lamel, Jean-Luc Gauvain:
Transcription of Arabic broadcast news.
- Takahiro Shinozaki, Sadaoki Furui:
Spontaneous speech recognition using a massively parallel decoder.
- Tanja Schultz, Qin Jin, Kornel Laskowski, Yue Pan, Florian Metze, Christian Fügen:
Issues in meeting transcription - the ISL meeting transcription system.
- Katsutoshi Ohtsuki, Nobuaki Hiroshima, Shoichi Matsunaga, Yoshihiko Hayashi:
Multi-pass ASR using vocabulary expansion.
- Vlasios Doumpiotis, William Byrne:
Pinched lattice minimum Bayes risk discriminative training for large vocabulary continuous speech recognition.
- Izhak Shafran, William Byrne:
Task-specific minimum Bayes-risk decoding using learned edit distance.
- Rong Zhang, Alexander I. Rudnicky:
Apply n-best list re-ranking to acoustic model combinations of boosting training.
- Do Yeong Kim, S. Umesh, M. J. F. Gales, Thomas Hain, Philip C. Woodland:
Using VTLN for broadcast news transcription.
- Andreas Stolcke, Chuck Wooters, Ivan Bulyko, Martin Graciarena, Scott Otterson, Barbara Peskin, Mari Ostendorf, David Gelbart, Nikki Mirghafori, Tuomo W. Pirinen:
From Switchboard to meetings: development of the 2004 ICSI-SRI-UW meeting recognition system.
- Anand Venkataraman, Andreas Stolcke, Wen Wang, Dimitra Vergyri, Jing Zheng, Venkata Ramana Rao Gadde:
An efficient repair procedure for quick transcriptions.
- Yao Qian, Tan Lee, Frank K. Soong:
Tone information as a confidence measure for improving Cantonese LVCSR.
Speech Science
- Danielle Due:
Temporal variables in parkinsonian speech.
- Olov Engwall:
Speaker adaptation of a three-dimensional tongue model.
- Nicole Cooper, Anne Cutler:
Perception of non-native phonemes in noise.
- Hideki Kawahara, Hideki Banno, Toshio Irino, Jiang Jin:
Intelligibility of degraded speech from smeared STRAIGHT spectrum.
- Young-Ik Kim, Rhee Man Kil:
Sound source localization based on zero-crossing peak-amplitude coding.
- Kajikawa Sachiyo, Fais Laurel, Shigeaki Amano, Werker Janet:
Adult and infant sensitivity to phonotactic features in spoken Japanese.
- Phil Green, James Carmichael:
Revisiting dysarthria assessment intelligibility metrics.
- Valter Ciocca, Tara L. Whitehill, Joan K.-Y. Ma:
The effect of intonation on perception of Cantonese lexical tones.
- Toshiko Isei-Jaakkola:
Maximum short quantity in Japanese and Finnish in two perception tests with F0 and dB variants.
- Paavo Alku, Matti Airas, Brad Story:
Evaluation of an inverse filtering technique using physical modeling of voice production.
- Hui-ju Hsu, Janice Fon:
Positional and phonotactic effects on the realization of Taiwan Mandarin tone 2.
- Karl Schnell, Arild Lacroix:
Speech production based on lossy tube models: unit concatenation and sound transitions.
- Qin Yan, Saeed Vaseghi, Dimitrios Rentzos, Ching-Hsiang Ho:
Modelling and ranking of differences across formants of British, Australian and American accents.
- Tatsuya Kitamura, Satoru Fujita, Kiyoshi Honda, Hironori Nishimoto:
An experimental method for measuring transfer functions of acoustic tubes.
- Takuya Tsuji, Tokihiko Kaburagi, Kohei Wakamiya, Jiji Kim:
Estimation of the vocal tract spectrum from articulatory movements using phoneme-dependent neural networks.
- Kunitoshi Motoki, Hiroki Matsuzaki:
Computation of the acoustic characteristics of vocal-tract models with geometrical perturbation.
- P. Vijayalakshmi, M. RamasubbaReddy:
Analysis of hypernasality by synthesis.
- Abdellah Kacha, Francis Grenez, Frédéric Bettens, Jean Schoentgen:
Adaptive long-term predictive analysis of disordered speech.
- Slobodan Jovicic, Sandra Antesevic, Zoran Saric:
Phoneme restoration in degraded speech communication.
- Maria Marinaki, Constantine Kotropoulos, Ioannis Pitas, Nikolaos Maglaveras:
Automatic detection of vocal fold paralysis and edema.
Novel Features in ASR
Spoken and Natural Language Understanding
- Robert Lieb, Tibor Fábián, Günther Ruske, Matthias Thomae:
Estimation of semantic confidences on lattice hierarchies.
- Fumiyo Fukumoto, Yoshimi Suzuki:
Learning subject drift for topic tracking.
- Elizabeth Shriberg, Andreas Stolcke, Dustin Hillard, Mari Ostendorf, Barbara Peskin, Mary P. Harper, Yang Liu:
The ICSI-SRI-UW metadata extraction system.
- Mark Hasegawa-Johnson, Stephen E. Levinson, Tong Zhang:
Automatic detection of contrast for speech understanding.
- Nick Jui-Chang Wang, Jia-Lin Shen, Ching-Ho Tsai:
Integrating layer concept information into n-gram modeling for spoken language understanding.
- Junyan Chen, Ji Wu, Zuoying Wang:
A robust understanding model for spoken dialogues.
- Chai Wutiwiwatchai, Sadaoki Furui:
Belief-based nonlinear rescoring in Thai speech understanding.
- Toshihiko Itoh, Atsuhiko Kai, Yukihiro Itoh, Tatsuhiro Konishi:
An understanding strategy based on plausibility score in recognition history using CSR confidence measure.
- Sangkeun Jung, Minwoo Jeong, Gary Geunbae Lee:
Speech recognition error correction using maximum entropy language model.
- Xiang Li, Juan M. Huerta:
Discriminative training of compound-word based multinomial classifiers for speech routing.
- Jihyun Eun, Changki Lee, Gary Geunbae Lee:
An information extraction approach for spoken language understanding.
- David Horowitz, Partha Lal, Pierce Gerard Buckley:
A maximum entropy shallow functional parser for spoken language understanding.
- Qiang Huang, Stephen J. Cox:
Mixture language models for call routing.
- Chung-Hsien Wu, Jui-Feng Yeh, Ming-Jun Chen:
Speech act identification using an ontology-based partial pattern tree.
- Ye-Yi Wang, Yun-Cheng Ju:
Creating speech recognition grammars from regular expressions for alphanumeric concepts.
- Isabel Trancoso, Paulo Araújo, Céu Viana, Nuno J. Mamede:
Poetry assistant.
- Tasuku Kitade, Tatsuya Kawahara, Hiroaki Nanjo:
Automatic extraction of key sentences from oral presentations using statistical measure based on discourse markers.
- Tomohiro Ohno, Shigeki Matsubara, Nobuo Kawaguchi, Yasuyoshi Inagaki:
Robust dependency parsing of spontaneous Japanese speech and its evaluation.
- Wolfgang Minker, Dirk Bühler, Christiane Beuschel:
Strategies for optimizing a stochastic spoken natural language parser.
- Tzu-Lun Lee, Ya-Fang He, Yun-Ju Huang, Shu-Chuan Tseng, Robert Eklund:
Prolongation in spontaneous Mandarin.
- Yuki Irie, Shigeki Matsubara, Nobuo Kawaguchi, Yukiko Yamaguchi, Yasuyoshi Inagaki:
Speech intention understanding based on decision tree learning.
- Satanjeev Banerjee, Alexander I. Rudnicky:
Using simple speech-based features to detect the state of a meeting and the roles of the meeting participants.
- Serdar Yildirim, Murtaza Bulut, Chul Min Lee, Abe Kazemzadeh, Zhigang Deng, Sungbok Lee, Shrikanth Narayanan, Carlos Busso:
An acoustic study of emotions expressed in speech.
- Tatsuya Kawahara, Ian Richard Lane, Tomoko Matsui, Satoshi Nakamura:
Topic classification and verification modeling for out-of-domain utterance detection.
- So-Young Park, Yong-Jae Kwak, Joon-Ho Lim, Hae-Chang Rim, Soo-Hong Kim:
Partially lexicalized parsing model utilizing rich features.
- Yoshimi Suzuki, Fumiyo Fukumoto, Yoshihiro Sekiguchi:
Clustering similar nouns for selecting related news articles.
- Leonardo Badino:
Chinese text word-segmentation considering semantic links among sentences.
- Do-Gil Lee, Hae-Chang Rim:
Syllable-based probabilistic morphological analysis model of Korean.
Speaker Segmentation and Clustering
Speech Processing in a Packet Network Environment
- Kuldip K. Paliwal, Stephen So:
Scalable distributed speech recognition using multi-frame GMM-based block quantization.
- Naveen Srinivasamurthy, Kyu Jeong Han, Shrikanth Narayanan:
Robust speech recognition over packet networks: an overview.
- Thomas Eriksson, Samuel Kim, Hong-Goo Kang, Chungyong Lee:
Theory for speaker recognition over IP.
- Wu Chou, Feng Liu:
Voice portal services in packet network and VoIP environment.
- Peter Kabal, Colm Elliott:
Synchronization of speaker selection for centralized tandem free VoIP conferencing.
- Akitoshi Kataoka, Yusuke Hiwasaki, Toru Morinaga, Jotaro Ikedo:
Measuring the perceived importance of time- and frequency-divided speech blocks for transmitting over packet networks.
- Moo Young Kim, W. Bastiaan Kleijn:
Comparison of transmitter-based packet-loss recovery techniques for voice transmission.
Acoustic Modeling
- Denis Jouvet, Ronaldo O. Messina:
Context dependent "long units" for speech recognition.
- Shinichi Yoshizawa, Kiyohiro Shikano:
Rapid EM training based on model-integration.
- Dominique Fohr, Odile Mella, Irina Illina, Christophe Cerisara:
Experiments on the accuracy of phone models and liaison processing in a French broadcast news transcription system.
- Jorge F. Silva, Shrikanth Narayanan:
A statistical discrimination measure for hidden Markov models based on divergence.
- Jan Stadermann, Gerhard Rigoll:
A hybrid SVM/HMM acoustic modeling approach to automatic speech recognition.
- Dirk Knoblauch:
Data driven number-of-states selection in HMM topologies.
- Youngkyu Cho, Sung-a Kim, Dongsuk Yook:
Hybrid model using subspace distribution clustering hidden Markov models and semi-continuous hidden Markov models for embedded speech recognizers.
- Peder A. Olsen, Karthik Visweswariah:
Fast clustering of Gaussians and the virtue of representing Gaussians in exponential model format.
- Karen Livescu, James R. Glass:
Feature-based pronunciation modeling with trainable asynchrony probabilities.
- Hong-Kwang Jeff Kuo, Yuqing Gao:
Maximum entropy direct model as a unified model for acoustic modeling in speech recognition.
- Yu Zhu, Tan Lee:
Explicit duration modeling for Cantonese connected-digit recognition.
- Arthur Chan, Mosur Ravishankar, Alexander I. Rudnicky, Jahanzeb Sherwani:
Four-layer categorization scheme of fast GMM computation techniques in large vocabulary continuous speech recognition systems.
- Junho Park, Hanseok Ko:
Compact acoustic model for embedded implementation.
- Takatoshi Jitsuhiro, Satoshi Nakamura:
Increasing the mixture components of non-uniform HMM structures based on a variational Bayesian approach.
- Panu Somervuo:
Comparison of ML, MAP, and VB based acoustic models in large vocabulary speech recognition.
- Wolfgang Macherey, Ralf Schlüter, Hermann Ney:
Discriminative training with tied covariance matrices.
- Frank Diehl, Asunción Moreno:
Acoustic phonetic modeling using local codebook features.
- Gue Jun Jung, Su-Hyun Kim, Yung-Hwan Oh:
An efficient codebook design in SDCHMM for mobile communication environments.
- Makoto Shozakai, Goshu Nagino:
Analysis of speaking styles by two-dimensional visualization of aggregate of acoustic models.
- Myoung-Wan Koo, Ho-Hyun Jeon, Sang-Hong Lee:
Context dependent phoneme duration modeling with tree-based state tying.
- John Scott Bridle:
Towards better understanding of the model implied by the use of dynamic features in HMMs.
Prosody Modeling and Generation
- Jian-Feng Li, Guoping Hu, Ren-Hua Wang:
Chinese prosody phrase break prediction based on maximum entropy model.
- Krothapalli Sreenivasa Rao, Bayya Yegnanarayana:
Intonation modeling for Indian languages.
- Yu Zheng, Gary Geunbae Lee, Byeongchang Kim:
Using multiple linguistic features for Mandarin phrase break prediction in maximum-entropy classification framework.
- Ian Read, Stephen Cox:
Using part-of-speech for predicting phrase breaks.
- David Escudero Mancebo, Valentín Cardeñoso-Payo:
A proposal to quantitatively select the right intonation unit in data-driven intonation modeling.
- Jinfu Ni, Hisashi Kawai, Keikichi Hirose:
Formulating contextual tonal variations in Mandarin.
- Salma Mouline, Olivier Boëffard, Paul C. Bagshaw:
Automatic adaptation of the Momel F0 stylisation algorithm to new corpora.
- Pablo Daniel Agüero, Klaus Wimmer, Antonio Bonafonte:
Joint extraction and prediction of Fujisaki's intonation model parameters.
- Panagiotis Zervas, Nikos Fakotakis, George K. Kokkinakis, Georgios Kouroupetroglou, Gerasimos Xydas:
Evaluation of corpus based tone prediction in mismatched environments for Greek TTS synthesis.
- Ziyu Xiong, Juanwen Chen:
The duration of pitch transition phase and its relative factors.
- Yu Hu, Ren-Hua Wang, Lu Sun:
Polynomial regression model for duration prediction in Mandarin.
- Michelle Tooher, John G. McKenna:
Prediction of the glottal LF parameters using regression trees.
- Volker Dellwo, Bianca Aschenberner, Petra Wagner, Jana Dancovicova, Ingmar Steiner:
BonnTempo-corpus and BonnTempo-tools: a database for the study of speech rhythm and rate.
- Wentao Gu, Keikichi Hirose, Hiroya Fujisaki:
Analysis of F0 contours of Cantonese utterances based on the command-response model.
- Marion Dohen, Hélène Loevenbruck:
Pre-focal rephrasing, focal enhancement and postfocal deaccentuation in French.
- Sridhar Krishna Nemala, Partha Pratim Talukdar, Kalika Bali, A. G. Ramakrishnan:
Duration modeling for Hindi text-to-speech synthesis system.
- Nemala Sridhar Krishna, Hema A. Murthy:
A new prosodic phrasing model for Indian language Telugu.
- Oliver Jokisch, Michael Hofmann:
Evolutionary optimization of an adaptive prosody model.
- Gerasimos Xydas, Georgios Kouroupetroglou:
An intonation model for embedded devices based on natural F0 samples.
- Katerina Vesela, Nino Peterek, Eva Hajicová:
Prosodic characteristics of Czech contrastive topic.
Multi-Sensor ASR
- Martin Graciarena, Federico Cesari, Horacio Franco, Gregory K. Myers, Cregg Cowan, Victor Abrash:
Combination of standard and throat microphones for robust speech recognition in highly noisy environments.
- Cenk Demiroglu, David V. Anderson:
Noise robust digit recognition using a glottal radar sensor for voicing detection.
- Dominik Raub, John W. McDonough, Matthias Wölfel:
A cepstral domain maximum likelihood beamformer for speech recognition.
- Naoya Mochiki, Tetsunori Kobayashi, Toshiyuki Sekiya, Tetsuji Ogawa:
Recognition of three simultaneous utterance of speech by four-line directivity microphone mounted on head of robot.
- Shigeki Sagayama, Okajima Takashi, Kamamoto Yutaka, Takuya Nishimoto:
Complex spectrum circle centroid for microphone-array-based noisy speech recognition.
- Larry P. Heck, Mark Mao:
Automatic speech recognition of co-channel speech: integrated speaker and speech recognition approach.
Multi-Lingual Speech Processing
- José B. Mariño, Asunción Moreno, Albino Nogueiras:
A first experience on multilingual acoustic modeling of the languages spoken in Morocco.
- Mónica Caballero, Asunción Moreno, Albino Nogueiras:
Data driven multidialectal phone set for Spanish dialects.
- Daniela Oria, Akos Vetek:
Multilingual e-mail text processing for speech synthesis.
- Harald Romsdorfer, Beat Pfister:
Multi-context rules for phonological processing in polyglot TTS synthesis.
- Leonardo Badino, Claudia Barolo, Silvia Quazza:
A general approach to TTS reading of mixed-language texts.
- Panayiotis G. Georgiou, Shrikanth S. Narayanan, Hooman Shirani Mehr:
Context dependent statistical augmentation of Persian transcripts.
Speech Enhancement
- Cenk Demiroglu, David V. Anderson:
A soft decision MMSE amplitude estimator as a noise preprocessor to speech coders using a glottal sensor.
- Rongqiang Hu, David V. Anderson:
Single acoustic-channel speech enhancement based on glottal correlation using non-acoustic sensor.
- Xianxian Zhang, John H. L. Hansen, Kathryn Hoberg Arehart, Jessica Rossi-Katz:
In-vehicle based speech processing for hearing impaired subjects.
- Sriram Srinivasan, W. Bastiaan Kleijn:
Speech enhancement using adaptive time-domain segmentation.
- Tomohiro Nakatani, Keisuke Kinoshita, Masato Miyoshi, Parham Zolfaghari:
Harmonicity based monaural speech dereverberation with time warping and F0 adaptive window.
- Marc Delcroix, Takafumi Hikichi, Masato Miyoshi:
Dereverberation of speech signals based on linear prediction.
Speech and Affect
- Nick Campbell:
Perception of affect in speech - towards an automatic processing of paralinguistic information in spoken conversation.
- Noël Chateau, Valérie Maffiolo, Christophe Blouin:
Analysis of emotional speech in voice mail messages: the influence of speakers' gender.
- Chul Min Lee, Serdar Yildirim, Murtaza Bulut, Abe Kazemzadeh, Carlos Busso, Zhigang Deng, Sungbok Lee, Shrikanth Narayanan:
Emotion recognition based on phoneme classes.
- Peter Robinson, Tal Sobol Shikler:
Visualizing dynamic features of expressions in speech.
- Aijun Li, Haibo Wang:
Friendly speech analysis and perception in standard Chinese.
- Ailbhe Ní Chasaide, Christer Gobl:
Decomposing linguistic and affective components of phonatory quality.
- Dan-Ning Jiang, Lian-Hong Cai:
Classifying emotion in Chinese speech by decomposing prosodic features.
- Chen Yu, Paul M. Aoki, Allison Woodruff:
Detecting user engagement in everyday conversations.
- Takashi X. Fujisawa, Norman D. Cook:
Identifying emotion in speech prosody using acoustical cues of harmony.
- Jianhua Tao:
Context based emotion detection from text input.
- Atsushi Iwai, Yoshikazu Yano, Shigeru Okuma:
Complex emotion recognition system for a specific user using SOM based on prosodic features.
- Hoon-Young Cho, Kaisheng Yao, Te-Won Lee:
Emotion verification for emotion detection and unknown emotion rejection.
- Keikichi Hirose:
Improvement in corpus-based generation of F0 contours using generation process model for emotional speech synthesis.
Speech Features
- Rajesh Mahanand Hegde, Hema A. Murthy, Venkata Ramana Rao Gadde:
Continuous speech recognition using joint features derived from the modified group delay function and MFCC.
- Hua Yu:
Phase-space representation of speech.
- Hema A. Murthy, Rajesh Mahanand Hegde, Venkata Ramana Rao Gadde:
The modified group delay feature: a new spectral representation of speech.
- Oh-Wook Kwon, Te-Won Lee:
ICA-based feature extraction for phoneme recognition.
- Qifeng Zhu, Barry Y. Chen, Nelson Morgan, Andreas Stolcke:
On using MLP features in LVCSR.
- Barry Y. Chen, Qifeng Zhu, Nelson Morgan:
Learning long-term temporal features in LVCSR using neural networks.
- T. V. Sreenivas, G. V. Kiran, A. G. Krishna:
Neural "spike rate spectrum" as a noise robust, speaker invariant feature for automatic speech recognition.
- Yoshihisa Nakatoh, Makoto Nishizaki, Shinichi Yoshizawa, Maki Yamada:
An adaptive MEL-LPC analysis for speech recognition.
- Kentaro Ishizuka, Noboru Miyazaki, Tomohiro Nakatani, Yasuhiro Minami:
Improvement in robustness of speech feature extraction method using sub-band based periodicity and aperiodicity decomposition.
- Carlos Toshinori Ishi:
A new acoustic measure for aspiration noise detection.
- Kris Demuynck, Oscar Garcia, Dirk Van Compernolle:
Synthesizing speech from speech recognition parameters.
- Marios Athineos, Hynek Hermansky, Daniel P. W. Ellis:
LP-TRAP: linear predictive temporal patterns.
- Xiang Li, Richard M. Stern:
Parallel feature generation based on maximizing normalized acoustic likelihood.
- Kun-Ching Wang:
An adaptive band-partitioning spectral entropy based speech detection in realistic noisy environments.
- Javier Ramírez, José C. Segura, M. Carmen Benítez, Ángel de la Torre, Antonio J. Rubio:
Improved voice activity detection combining noise reduction and subband divergence measures.
- Kiyoung Park, Changkyu Choi, Jeongsu Kim:
Voice activity detection using global soft decision with mixture of Gaussian model.
- Thomas Kemp, Climent Nadeu, Yin Hay Lam, Josep Maria Sola i Caros:
Environmental robust features for speech detection.
- Kornel Laskowski, Qin Jin, Tanja Schultz:
Crosscorrelation-based multispeaker speech activity detection.
- Shang-nien Tsai:
Improved robustness of time-frequency principal components (TFPC) by synergy of methods in different domains.
- Li Deng, Yu Dong, Alex Acero:
A quantitative model for formant dynamics and contextually assimilated reduction in fluent speech.
- Gernot Kubin, Tuan Van Pham:
DWT-based classification of acoustic-phonetic classes and phonetic units.
- Yong-Choon Cho, Seungjin Choi:
Learning nonnegative features of spectro-temporal sounds for classification.
Language Modeling, Multimodal & Multilingual Speech Processing
- Sungyup Chung, Keikichi Hirose, Nobuaki Minematsu:
N-gram language modeling of Japanese using bunsetsu boundaries.
- Langzhou Chen, Lori Lamel, Jean-Luc Gauvain, Gilles Adda:
Dynamic language modeling for broadcast news.
- Ren-Yuan Lyu, Dau-Cheng Lyu, Min-Siong Liang, Min-Hong Wang, Yuang-Chin Chiang, Chun-Nan Hsu:
A unified framework for large vocabulary speech recognition of mutually unintelligible Chinese "regionalects".
- Ielka van der Sluis, Emiel Krahmer:
The influence of target size and distance on the production of speech and gesture in multimodal referring expressions.
- Anurag Kumar Gupta, Tasos Anastasakos:
Dynamic time windows for multimodal input fusion.
- Raymond H. Lee, Anurag Kumar Gupta:
MICot: a tool for multimodal input data collection.
- Chakib Tadj, Hicham Djenidi, Madjid Haouani, Amar Ramdane-Cherif, Nicole Lévy:
Simulating multimodal applications.
- Jakob Schou Pedersen, Paul Dalsgaard, Børge Lindberg:
A multimodal communication aid for global aphasia patients.
- Hirofumi Yamamoto, Gen-ichiro Kikui, Yoshinori Sagisaka:
Mis-recognized utterance detection using hierarchical language model.
- Marko Moberg, Kimmo Pärssinen, Juha Iso-Sipilä:
Cross-lingual phoneme mapping for multilingual synthesis systems.
- Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno, Tsuyoshi Tasaki, Takeshi Yamaguchi:
Robot motion control using listener's back-channels and head gesture information.
- Sakriani Sakti, Arry Akhmad Arman, Satoshi Nakamura, Paulus Hutagaol:
Indonesian speech recognition for hearing and speaking impaired people.
- Mohsen Rashwan:
A two phase Arabic language model for speech recognition and other language applications.
- Yuya Akita, Tatsuya Kawahara:
Language model adaptation based on PLSA of topics and speakers.
- Hans J. G. A. Dolfing, Pierce Gerard Buckley, David Horowitz:
Unified language modeling using finite-state transducers with first applications.
- Katsunobu Itou, Atsushi Fujii, Tomoyosi Akiba:
Effects of language modeling on speech-driven question answering.
- Abhinav Sethy, Shrikanth Narayanan, Bhuvana Ramabhadran:
Measuring convergence in language model estimation using relative entropy.
Detection and Classification in ASR
Speech Analysis
- Toshiki Tamiya, Tetsuya Shimamura:
Reconstruction filter design for bone-conducted speech.
- Pedro J. Quintana-Morales, Juan L. Navarro-Mesa:
Frequency warped ARMA analysis of the closed and the open phase of voiced speech.
- Boris Doval, Baris Bozkurt, Christophe d'Alessandro, Thierry Dutoit:
Zeros of z-transform (ZZT) decomposition of speech for source-tract separation.
- Li Deng, Roberto Togneri:
Use of neural network mapping and extended Kalman filter to recover vocal tract resonances from the MFCC parameters of speech.
- Xiao Li, Jonathan Malkin, Jeff Bilmes:
Graphical model approach to pitch tracking.
- Bo Xu, Jianhua Tao, Yongguo Kang:
A new multicomponent AM-FM demodulation with predicting frequency boundaries and its application to formant estimation.
- Yves Laprie:
A concurrent curve strategy for formant tracking.
- Qin Yan, Esfandiar Zavarehei, Saeed Vaseghi, Dimitrios Rentzos:
A formant tracking LP model for speech processing.
- Hong You:
Application of long-term filtering to formant estimation.
- Baris Bozkurt, Thierry Dutoit, Boris Doval, Christophe d'Alessandro:
A method for glottal formant frequency estimation.
- Baris Bozkurt, Thierry Dutoit, Boris Doval, Christophe d'Alessandro:
Improved differential phase spectrum processing for formant tracking.
- Xu Shao, Ben P. Milner:
MAP prediction of pitch from MFCC vectors for speech reconstruction.
- An-Tze Yu, Hsiao-Chuan Wang:
New harmonicity measures for pitch estimation and voice activity detection.
- Takuya Nishimoto, Shigeki Sagayama, Hirokazu Kameoka:
Multi-pitch trajectory estimation of concurrent speech based on harmonic GMM and nonlinear Kalman filtering.
- Attila Ferencz, Jeongsu Kim, Yong-Beom Lee, Jae-Won Lee:
Automatic pitch marking and reconstruction of glottal closure instants from noisy and deformed electro-glotto-graph signals.
- Federico Flego, Luca Armani, Maurizio Omologo:
On the use of a weighted autocorrelation based fundamental frequency estimation for a multidimensional speech input.
- Aarthi M. Reddy, Bhiksha Raj:
A minimum mean squared error estimator for single channel speaker separation.
- Md. Khademul Islam Molla, Keikichi Hirose, Nobuaki Minematsu:
Audio source separation from the mixture using empirical mode decomposition with independent subspace analysis.
- In-Jung Oh, Hyun-Yeol Chung, Jae-Won Cho, Ho-Youl Jung, Rémy Prost:
Audio watermarking in sub-band signals using multiple echo kernels.
- Jie Zhang, Zhenyang Wu:
A piecewise interpolation method based on log-least square error criterion for HRTF.
- Juan L. Navarro-Mesa, Pedro J. Quintana-Morales:
Modified realizable frequency warped ARMA modeling and its application in synthesis structures for voiced speech.
- R. Muralishankar, A. G. Ramakrishnan, Lakshmish N. Kaushik:
Time-scaling of speech using independent subspace analysis.
- Laurent Girin, Mohammad Firouzmand, Sylvain Marchand:
Long term modeling of phase trajectories within the speech sinusoidal model framework.
- Tina Soltani, Dave Hermann, Etienne Cornu, Hamid Sheikhzadeh, Robert L. Brennan:
An acoustic shock limiting algorithm using time and frequency domain speech features.
- Jong Won Shin, Joon-Hyuk Chang, Nam Soo Kim:
Speech probability distribution based on generalized gamma distribution.
- Yanli Zheng, Mark Hasegawa-Johnson, Sarah Borys:
Stop consonant classification by dynamic formant trajectory.
- Yoshinori Shiga, Simon King:
Estimating detailed spectral envelopes using articulatory clustering.
Speech Production
- Olov Engwall:
From real-time MRI to 3D tongue movements.
- Mitsuhiro Nakamura:
Coarticulatory variability and directionality in [s, ..]: an EPG study.
- Yosuke Tanabe, Tokihiko Kaburagi:
Flow representation through the glottis having a polygonal boundary shape.
- Hannu Pulakka, Paavo Alku, Svante Granqvist, Stellan Hertegard, Hans Larsson, Anne-Maria Laukkanen, Per-Ake Lindestad, Erkki Vilkman:
Analysis of the voice source in different phonation types: simultaneous high-speed imaging of the vocal fold vibration and glottal inverse filtering.
- Peter Birkholz, Dietmar Jackel:
Influence of temporal discretization schemes on formant frequencies and bandwidths in time domain simulations of the vocal tract system.
- Tomoki Toda, Alan W. Black, Keiichi Tokuda:
Acoustic-to-articulatory inversion mapping with Gaussian mixture model.
Audio-Visual Speech Processing
- Jinyoung Kim, Jeesun Kim, Chris Davis:
Audio-visual spoken language processing.
- Kaoru Sekiyama, Denis Burnham:
Issues in the development of auditory-visual speech perception: adults, infants, and children.
- Emiel Krahmer, Marc Swerts:
Signaling and detecting uncertainty in audiovisual speech by children and adults.
- Valérie Hazan, Anke Sennema, Andrew Faulkner:
Effect of intensive audiovisual perceptual training on the perception and production of the /l/-/r/ contrast for Japanese learners of English.
- Jean Vroomen, Sabine van Linden, Béatrice de Gelder, Paul Bertelson:
Visual recalibration of auditory speech versus selective speech adaptation: different build-up courses.
- Chris Davis, Jeesun Kim:
Of the top of the head: audio-visual speech perception from the nose up.
- J. Bruce Millar, Michael Wagner, Roland Goecke:
Aspects of speaking-face data corpus design methodology.
- Jean-Luc Schwartz, Marie-Agnès Cathiard:
Modeling audio-visual speech perception: back on fusion architectures and fusion control.
- Mikko Sams, Ville Ojanen, Jyrki Tuomainen, Vasily Klucharev:
Neurocognition of speech-specific audiovisual perception.
- Adriano Vilela Barbosa, Eric Vatikiotis-Bateson, Andreas Daffertshofer:
Target practice on talking faces.
- Matthias Odisio, Gérard Bailly:
Audiovisual perceptual evaluation of resynthesised speech movements.
- Sascha Fagel:
Video-realistic synthetic speech with a parametric visual speech synthesizer.
- Patricia Scanlon, Gerasimos Potamianos, Vit Libal, Stephen M. Chu:
Mutual information based visual feature selection for lipreading.
- Bowon Lee, Mark Hasegawa-Johnson, Camille Goudeseune, Suketu Kamdar, Sarah Borys, Ming Liu, Thomas S. Huang:
AVICAR: audio-visual speech corpus in a car environment.
- Engin Erzin, Yucel Yemez, A. Murat Tekalp:
Adaptive classifier cascade for multimodal speaker identification.
- Midori Iba, Anke Sennema, Valérie Hazan, Andrew Faulkner:
Use of visual cues in the perception of a labial/labiodental contrast by Spanish-L1 and Japanese-L1 learners of English.
- Xianxian Zhang, Kazuya Takeda, John H. L. Hansen, Toshiki Maeno:
Audio-visual speaker localization for car navigation systems.
- Josef Chaloupka:
Automatic lips reading for audio-visual speech processing and recognition.
- Michael Wagner, Girija Chetty:
"Liveness" verification in audio-video authentication.
- Maria José Sanchez Martinez, Juan Pablo de la Cruz Gutiérrez:
Speech recognition using motion based lipreading.
- Frédéric Berthommier:
Comparative study of linear and non-linear models for viseme inversion: modeling of a cortical associative function.
- Petr Císar, Zdenek Krnoul, Milos Zelezný:
3D lip-tracking for audio-visual speech recognition in real applications.
- J. Bruce Millar, Roland Goecke:
The audio-video Australian English speech data corpus AVOZES.
- Ki-Hyung Hong, Yong-Ju Lee, Jae-Young Suh, Kyong-Nim Lee:
Correcting Korean vowel speech recognition errors with limited lip features.
- Kuniko Nielsen:
Segmental differences in the visual contribution to speech intelligibility.
Spoken Language Generation and Synthesis III
- Hui Ye, Steve Young:
Voice conversion for unknown speakers.
- Volker Fischer, Jaime Botella Ordinas, Siegfried Kunzmann:
Domain adaptation methods in the IBM trainable text-to-speech system.
- Yi Zhou, Yiqing Zu, Zhenli Yu, Dongjian Yue, Guilin Chen:
Applying pitch connection control in Mandarin speech synthesis.
- Hermann Ney, David Sündermann, Antonio Bonafonte, Harald Höge:
A first step towards text-independent voice conversion.
- Zhenli Yu, Kaizhi Wang, Yiqing Zu, Dongjian Yue, Guilin Chen:
Data pruning approach to unit selection for inventory generation of concatenative embeddable Chinese TTS systems.
- Jithendra Vepa, Simon King:
Subjective evaluation of join cost functions used in unit selection speech synthesis.
- Heiga Zen, Tadashi Kitamura, Murtaza Bulut, Shrikanth Narayanan, Ryosuke Tsuzuki, Keiichi Tokuda:
Constructing emotional speech synthesizers with limited speech database.
- Cheng-Yuan Lin, Jyh-Shing Roger Jang:
A two-phase pitch marking method for TD-PSOLA synthesis.
- Antonio Bonafonte, Alexander Kain, Jan P. H. van Santen, Helenca Duxans:
Including dynamic and phonetic information in voice conversion systems.
- Zixiang Wang, Ren-Hua Wang, Zhiwei Shuang, Zhen-Hua Ling:
A novel voice conversion system based on codebook mapping with phoneme-tied weighting.
- Zhen-Hua Ling, Yu Hu, Zhiwei Shuang, Ren-Hua Wang:
Compression of speech database by feature separation and pattern clustering using STRAIGHT.
- Shunsuke Kataoka, Nobuaki Mizutani, Keiichi Tokuda, Tadashi Kitamura:
Decision-tree backing-off in HMM-based speech synthesis.
- Nobuyuki Nishizawa, Hisashi Kawai:
Using a depth-restricted search to reduce delays in unit selection.
- Junichi Yamagishi, Takashi Masuko, Takao Kobayashi:
MLLR adaptation for hidden semi-Markov model based speech synthesis.
- Stefan Breuer, Julia Abresch:
Phoxsy: multi-phone segments for unit selection speech synthesis.
- Francesc Alías, Xavier Llorà, Ignasi Iriondo Sanz, Joan Claudi Socoró, Xavier Sevillano, Lluís Formiga:
Perception-guided and phonetic clustering weight tuning based on diphone pairs for unit selection TTS.
- Taoufik En-Najjary, Olivier Rosec, Thierry Chonavel:
A voice conversion method based on joint pitch and spectral envelope transformation.
- Taoufik En-Najjary, Olivier Rosec, Thierry Chonavel:
Fast GMM-based voice conversion for text-to-speech synthesis systems.
- Rohit Kumar:
A genetic algorithm for unit selection based speech synthesis.
- Jun Huang, Lex Olorenshaw, Gustavo Hernández Ábrego, Lei Duan:
A memory efficient grapheme-to-phoneme conversion system for speech processing.
- Rohit Kumar, S. Prahallad Kishore:
Automatic pruning of unit selection speech databases for synthesis without loss of naturalness.
- Tanya Lambert, Andrew P. Breen:
A database design for a TTS synthesis system using lexical diphones.
- John Kominek, Alan W. Black:
A family-of-models approach to HMM-based segmentation for unit selection speech synthesis.
- Wei Zhang, Ling Jin, Xijun Ma:
Mutual-information based segment pre-selection in concatenative text-to-speech.
- Heiga Zen, Keiichi Tokuda, Takashi Masuko, Takao Kobayashi, Tadashi Kitamura:
Hidden semi-Markov model based speech synthesis.
- Hartmut R. Pfitzinger:
DFW-based spectral smoothing for concatenative speech synthesis.
- Kyung-Joong Min, Un-Cheon Lim:
Korean prosody generation and artificial neural networks.
- Kyuchul Yoon:
A prosodic phrasing model for a Korean text-to-speech synthesis system.
- Qin Shi, Volker Fischer:
A comparison of statistical methods and features for the prediction of prosodic structures.
- Gui-Lin Chen, Ke-Song Han:
Letter-to-sound for small-footprint multilingual TTS engine.
- Jun Xu, Guohong Fu, Haizhou Li:
Grapheme-to-phoneme conversion for Chinese text-to-speech.
- Marc Schröder, Stefan Breuer:
XML representation languages as a way of interconnecting TTS modules.
- Wenjie Cao, Chengqing Zong, Bo Xu:
Approach to interchange-format based Chinese generation.
- Enrico Zovato, Stefano Sandri, Silvia Quazza, Leonardo Badino:
Prosodic analysis of a multi-style corpus in the perspective of emotional speech synthesis.
- Kyung-Joong Min, Chan-Goo Kang, Un-Cheon Lim:
Number of output nodes of artificial neural networks for Korean prosody generation.
- Sunhee Kim, Ju-Eun Ahn, Soon-Hyob Kim, Yang-Hee Lee:
A Korean grapheme-to-phoneme conversion system using selection procedure for exceptions.
- Thanate Khaorapapong, Montri Karnjanadecha, Keerati Inthavisas:
Synthesis of vowels and tones in Thai language by articulatory modeling.
- Yoshinori Shiga, Simon King:
Source-filter separation for articulation-to-speech synthesis.
- Hisako Asano, Hideharu Nakajima, Hideyuki Mizuno, Oku Masahiro:
Long vowel detection for letter-to-sound conversion for Japanese sourced words transliterated into the alphabet.
- Frantz Clermont, Thomas John Millhouse:
Inexactness and robustness in cepstral-to-formant transformation of spoken and sung vowels.
- Takeshi Saitou, Naoya Tsuji, Masashi Unoki, Masato Akagi:
Analysis of acoustic features affecting "singing-ness" and its application to singing-voice synthesis from speaking-voice.
- Vincent Pollet, Geert Coorman:
Statistical corpus-based speech segmentation.
- Jindrich Matousek, Jan Romportl, Daniel Tihelka, Zbynek Tychtl:
Recent improvements on ARTIC: Czech text-to-speech system.
- Youngim Jung, Donghun Lee, HyeonSook Nam, Ae-sun Yoon, Hyuk-Chul Kwon:
Learning for transliteration of Arabic-numeral expressions using decision tree for Korean TTS.
- Nicole Beringer:
How to integrate phonetic and linguistic knowledge in a text-to-phoneme conversion task: a syllabic TPC tool for French.
- Wael Hamza, Ellen Eide, Raimo Bakis:
Reconciling pronunciation differences between the front-end and the back-end in the IBM speech synthesis system.
- Juhong Ha, Yu Zheng, Gary Geunbae Lee, Yoon-Suk Seong, Byeongchang Kim:
High quality text-to-pinyin conversion using two-phase unknown word prediction.
- Yeon-Jun Kim, Ann K. Syrdal, Alistair Conkie:
Pronunciation lexicon adaptation for TTS voice building.
- Gabriel Webster:
Improving letter-to-pronunciation accuracy with automatic morphologically-based stress prediction.
- Wael Hamza, Ellen Eide, Raimo Bakis, Michael Picheny, John F. Pitrelli:
The IBM expressive speech synthesis system.
- Markus Schnell, Rüdiger Hoffmann:
What concept-to-speech can gain for prosody.
Speech Recognition - Language Model
- Tatsuya Kawahara, Kiyotaka Uchimoto, Hitoshi Isahara, Kazuya Shitaoka:
Dependency structure analysis and sentence boundary detection in spontaneous Japanese.
- Salma Jamoussi, David Langlois, Jean Paul Haton, Kamel Smaïli:
Statistical feature language model.
- Brigitte Bigi, Yan Huang, Renato de Mori:
Vocabulary and language model adaptation using information retrieval.
- Shinsuke Mori, Daisuke Takuma:
Word n-gram probability estimation from a Japanese raw corpus.
- Jen-Tzung Chien, Hung-Ying Chen:
Mining of association patterns for language modeling.
- Jen-Tzung Chien, Meng-Sung Wu, Hua-Jui Peng:
On latent semantic language modeling and smoothing.
- Vaibhava Goel:
Conditional maximum likelihood estimation for improving annotation performance of n-gram models incorporating stochastic finite state grammars.
- Edward James Schofield:
Fast parameter estimation for joint maximum entropy language models.
- Dimitra Vergyri, Katrin Kirchhoff, Kevin Duh, Andreas Stolcke:
Morphology-based language modeling for Arabic speech recognition.
- A. Nayeemulla Khan, B. Yegnanarayana:
Speech enhanced multi-span language model.
- Holger Schwenk, Jean-Luc Gauvain:
Neural network language models for conversational speech recognition.
- David Mrva, Philip C. Woodland:
A PLSA-based language model for conversational telephone speech.
Speaker Recognition
- Jérôme Louradour, Régine André-Obrecht, Khalid Daoudi:
Segmentation and relevance measure for speaker verification.
- Mohamed Chetouani, Bruno Gas, Jean-Luc Zarader, Marcos Faúndez-Zanuy:
A new nonlinear feature extraction algorithm for speaker verification.
- Elizabeth Shriberg, Luciana Ferrer, Anand Venkataraman, Sachin S. Kajarekar:
SVM modeling of "SNERF-grams" for speaker recognition.
- Purdy Ho, Pedro J. Moreno:
SVM kernel adaptation in speaker classification and verification.
- Koji Iwano, Taichi Asami, Sadaoki Furui:
Noise-robust speaker verification using F0 features.
- Zi-He Chen, Yuan-Fu Liao, Yau-Tarng Juang:
Eigen-prosody analysis for robust speaker recognition under mismatch handset environment.
- Aaron D. Lawson, Mark C. Huggins:
Triphone-based confidence system for speaker identification.
- Kenichi Yoshida, Kazuyuki Takagi, Kazuhiko Ozeki:
Improved model training and automatic weight adjustment for multi-SNR multi-band speaker identification system.
- Man-Wai Mak, Kwok-Kwong Yiu, Ming-Cheung Cheung, Sun-Yuan Kung:
A new approach to channel robust speaker verification via constrained stochastic feature transformation.
- Chakib Tadj, Christian S. Gargour, Nabil Badri:
Best speaker-based structure tree for speaker verification.
- David Chow, Waleed H. Abdulla:
Robust speaker identification based on perceptual log area ratio and Gaussian mixture models.
- Stanley J. Wenndt, Richard M. Floyd:
Channel frequency response correction for speaker recognition.
- Yh-Her Yang, Yuan-Fu Liao:
Unseen handset mismatch compensation based on a priori knowledge interpolation for robust speaker recognition.
- Michael T. Padilla, Thomas F. Quatieri:
A comparison of soft and hard spectral subtraction for speaker verification.
- Vlasta Radová, Ales Padrta:
Comparison of several speaker verification procedures based on GMM.
- Yong Guan, Wenju Liu, Hongwei Qi, Jue Wang:
Improving performance of text-independent speaker identification by utilizing contextual principal curves filtering.
- Jen-Tzung Chien, Chuan-Wei Ting:
Speaker identification using probabilistic PCA model selection.
- Hagai Aronowitz, David Burshtein, Amihood Amir:
Text independent speaker recognition using speaker dependent word spotting.
- Hsiao-Chuan Wang, Jyh-Min Cheng:
A study on model-based equal error rate estimation for automatic speaker verification.
- Tomoko Matsui, Kunio Tanabe:
Probabilistic speaker identification with dual penalized logistic regression machine.
- Javier R. Saeta, Javier Hernando:
Model quality evaluation during enrolment for speaker verification.
- Pasi Fränti, Evgeny Karpov, Tomi Kinnunen:
Real-time speaker identification.
- Mohammed Abu El-Yazeed, Nemat S. Abdel Kader, Mohammed El-Henawy:
Multi-codebook vector quantization algorithm for speaker identification.
- Ming-Cheung Cheung, Kwok-Kwong Yiu, Man-Wai Mak, Sun-Yuan Kung:
Multi-sample fusion with constrained feature transformation for robust speaker verification.
- Michael Betser, Frédéric Bimbot, Mathieu Ben, Guillaume Gravier:
Speaker diarization using bottom-up clustering based on a parameter-derived distance between adapted GMMs.
- Nengheng Zheng, P. C. Ching, Tan Lee:
Time-frequency analysis of vocal source signal for speaker recognition.
- Rashmi Gangadharaiah, Balakrishnan Narayanaswamy, Narayanaswamy Balakrishnan:
A novel method for two-speaker segmentation.
- Bayya Yegnanarayana, A. Shahina, M. R. Kesheorey:
Throat microphone signal for speaker recognition.
- Mohamed Faouzi BenZeghiba, Hervé Bourlard:
Posteriori probabilities and likelihoods combination for speech and speaker recognition.
- Mohamed Mihoubi, Douglas D. O'Shaughnessy, Pierre Dumouchel:
The use of typical sequences for robust speaker identification.
- KyungHwa Kim:
A forensic phonetic investigation into the duration and speech rate.
- T. V. Sreenivas, Sameer Badaskar:
Mixture Gaussian model training against impostor model parameters: an application to speaker identification.
- Jan Anguita, Javier Hernando, Alberto Abad:
Jacobian adaptation with improved noise reference for speaker verification.
- Mihalis Siafarikas, Todor Ganchev, Nikos Fakotakis:
Objective wavelet packet features for speaker verification.
- Upendra V. Chaudhari, Ganesh N. Ramaswamy:
Policy analysis framework for conversational biometrics.
- Woo-Yong Choi, Jung Gon Kim, Hyung Soon Kim, Sung Bum Pan:
A new score normalization method for speaker verification with virtual impostor model.
- Samuel Kim, Thomas Eriksson, Hong-Goo Kang:
On the time variability of vocal tract for speaker recognition.
- Veena Desai, Hema A. Murthy:
Distributed speaker recognition.
- Pongtep Angkititrakul, Sepideh Baghaii, John H. L. Hansen:
Cluster-dependent modeling and confidence measure processing for in-set/out-of-set speaker identification.
- Yoshiyuki Umeda, Shingo Kuroiwa, Satoru Tsuge, Fuji Ren:
Distributed speaker recognition using earth mover's distance.
- Michael Barlow, Mehrdad Khodai-Joopari, Frantz Clermont:
A forensically-motivated tool for selecting cepstrally-consistent steady-states from non-contemporaneous vowel utterances.
- Anil Alexander, Andrzej Drygajlo:
Scoring and direct methods for the interpretation of evidence in forensic speaker recognition.
- Tomi Kinnunen, Evgeny Karpov, Pasi Fränti:
Efficient online cohort selection method for speaker verification.
- Jiri Navratil, Ganesh N. Ramaswamy, Ran D. Zilca:
Statistical model migration in speaker recognition.
- A. Nayeemulla Khan, Bayya Yegnanarayana:
Latent semantic analysis for speaker recognition.
- Yang Shao, DeLiang Wang:
Model-based sequential organization for cochannel speaker identification.
- Ka-Yee Leung, Man-Wai Mak, Sun-Yuan Kung:
Articulatory feature-based conditional pronunciation modeling for speaker verification.
- Alex Park, Timothy J. Hazen:
A comparison of normalization and training approaches for ASR-dependent speaker identification.
- Dat Tran:
New background modeling for speaker verification.
Processing of Prosody by Humans and Machines
- Gérard Bailly, Bleicke Holm, Véronique Aubergé:
A trainable prosodic model: learning the contours implementing communicative functions within a superpositional model of intonation.
- Dung Tien Nguyen, Chi Mai Luong, Bang Kim Vu, Hansjörg Mixdorff, Huy Hoang Ngo:
Fujisaki model based F0 contours in Vietnamese TTS.
- Kazuyuki Ashimura, Hideki Kashioka, Nick Campbell:
Estimating speaking rate in spontaneous speech from z-scores of pattern durations.
- Takashi Masuko, Takao Kobayashi, Keisuke Miyanaga:
A style control technique for HMM-based speech synthesis.
- Mark Hasegawa-Johnson, Stephen E. Levinson, Tong Zhang:
Children's emotion recognition in an intelligent tutoring scenario.
- Keikichi Hirose, Nobuaki Minematsu:
Use of prosodic features for speech recognition.
Contemporary Issues in ASR
- Jochen Peters, Christina Drexel:
Transformation-based error correction for speech-to-text systems.
- Alexander Gutkin, Simon King:
Phone classification in pseudo-Euclidean vector spaces.
- Grace Chung, Chao Wang, Stephanie Seneff, Edward Filisko, Min Tang:
Combining linguistic knowledge and acoustic information in automatic pronunciation lexicon generation.
- Ken Chen, Mark Hasegawa-Johnson:
Modeling pronunciation variation using artificial neural networks for English spontaneous speech.
- Stefanie Aalburg, Harald Höge:
Foreign-accented speaker-independent speech recognition.
- Panikos Heracleous, Yoshitaka Nakajima, Akinobu Lee, Hiroshi Saruwatari, Kiyohiro Shikano:
Non-audible murmur (NAM) speech recognition using a stethoscopic NAM microphone.
- Martin Russell, Shona D'Arcy, Lit Ping Wong:
Recognition of read and spontaneous children's speech using two new corpora.
- Joe Frankel, Mirjam Wester, Simon King:
Articulatory feature recognition using dynamic Bayesian networks.
- Gies Bouwman, Bert Cranen, Lou Boves:
Predicting word correct rate from acoustic and linguistic confusability.
- Kazushi Ishihara, Yuya Hattori, Tomohiro Nakatani, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno:
Disambiguation in determining phonemes of sound-imitation words for environmental sound recognition.
- Jan Anguita, Stephane Peillon, Javier Hernando, Alexandre Bramoulle:
Word confusability prediction in automatic speech recognition.
- Szu-Chen Stan Jou, Tanja Schultz, Alex Waibel:
Adaptation for soft whisper recognition using a throat microphone.
- Rainer Gruhn, Konstantin Markov, Satoshi Nakamura:
A statistical lexicon for non-native speech recognition.
- Mathew Magimai-Doss, Shajith Ikbal, Todd A. Stephenson, Hervé Bourlard:
Modeling auxiliary features in tandem systems.
- Louis ten Bosch, Lou Boves:
Survey of spontaneous speech phenomena in a multimodal dialogue system and some implications for ASR.
- Tobias Cincarek, Rainer Gruhn, Satoshi Nakamura:
Speech recognition for multiple non-native accent groups with speaker-group-dependent acoustic models.
- Frederik Stouten, Jean-Pierre Martens:
Coping with disfluencies in spontaneous speech recognition.
- Soonil Kwon, Shrikanth Narayanan:
Speaker model quantization for unsupervised speaker indexing.
- Matteo Gerosa, Diego Giuliani:
Investigating automatic recognition of non-native children's speech.
- Yang Liu, Elizabeth Shriberg, Andreas Stolcke, Mary P. Harper:
Using machine learning to cope with imbalanced classes in natural speech: evidence from sentence boundary and disfluency detection.
- Minho Jin, Gyucheol Jang, Sungrack Yun, Chang Dong Yoo:
Hybrid utterance verification based on n-best models and model derived from Kullback-Leibler divergence.
- Masataka Goto, Koji Kitayama, Katsunobu Itou, Tetsunori Kobayashi:
Speech spotter: on-demand speech recognition in human-human conversation on the telephone or in face-to-face situations.
- Kyong-Nim Lee, Minhwa Chung:
Pronunciation lexicon modeling and design for Korean large vocabulary continuous speech recognition.
- Sebastian Möller, Jan Felix Krebber, Alexander Raake:
Performance of speech recognition and synthesis in packet-based networks.
- Alastair Bruce James, Ben P. Milner, Angel Manuel Gomez:
A comparison of packet loss compensation methods and interleaving for speech recognition in burst-like packet loss.
- Ben P. Milner, Alastair Bruce James:
An analysis of packet loss models for distributed speech recognition.
Second Language Learning and Spoken Language Processing
- Nobuaki Minematsu:
Pronunciation assessment based upon the phonological distortions observed in language learners' utterances.
- Yasuo Suzuki, Yoshinori Sagisaka, Katsuhiko Shirai, Makiko Muto:
Analysis of the phone level contributions to objective evaluation of English speech by non-natives.
- Chao Wang, Mitchell Peabody, Stephanie Seneff, Jong-mi Kim:
An interactive English pronunciation dictionary for Korean learners.
- Seok-Chae Rhee, Jeon G. Park:
Development of the knowledge-based spoken English evaluation system and its application.
- Jared Bernstein, Isabella Barbier, Elizabeth Rosenfeld, John H. A. L. de Jong:
Theory and data in spoken language assessment.
- Tatsuya Kawahara, Masatake Dantsuji, Yasushi Tsubota:
Practical use of English pronunciation system for Japanese students in the CALL classroom.
- Jonas Beskow, Olov Engwall, Björn Granström, Preben Wik:
Design strategies for a virtual language tutor.
Emerging Research: Human Factors in Speech and Communication Systems
- Ellen Campana, Michael K. Tanenhaus, James F. Allen, Roger W. Remington:
Evaluating cognitive load in spoken language interfaces using a dual-task paradigm.
- Lesley-Ann Black, Norman D. Black, Roy Harper, Michelle Lemon, Michael F. McTear:
The voice-logbook: integrating human factors for a chronic care system.
- Kristiina Jokinen:
Communicative competence and adaptation in a spoken dialogue system.
- Zhan Fu, Lay Ling Pow, Fang Chen:
Evaluation of the difference between the driving behavior of a speech based and a speech-visual based task of an in-car computer.
- Sebastian Möller, Jan Felix Krebber, Paula M. T. Smeele:
Evaluating system metaphors via the speech output of a smart home system.
- Florian Hammer, Peter Reichl, Alexander Raake:
Elements of interactivity in telephone conversations.
Interdisciplinary Topics in Spoken Language Processing
- Rubén San Segundo, Juan Manuel Montero, Javier Macías Guarasa, Ricardo de Córdoba, Javier Ferreiros, José Manuel Pardo:
Generating gestures from speech.
- Noboru Kanedera, Asuka Sumida, Takao Ikehata, Tetsuo Funada:
Subtopic segmentation in the lecture speech.
- Donna Erickson, Caroline Menezes, Akinori Fujino:
Some articulatory measurements of real sadness.
- Chen-Long Lee, Wen-Whei Chang, Yuan-Chuan Chiang:
Application of voice conversion to hearing-impaired Mandarin speech enhancement.
- Oh Pyo Kweon, Akinori Ito, Motoyuki Suzuki, Shozo Makino:
A Japanese dialogue-based CALL system with mispronunciation and grammar error detection.
- Cheolwoo Jo, Ilsuh Bak:
Statistics-based direction finding for training vowels.
- Simona Montanari, Serdar Yildirim, Elaine Andersen, Shrikanth Narayanan:
Reference marking in children's computer-directed speech: an integrated analysis of discourse and gestures.
- Jong-mi Kim, Suzanne Flynn:
What makes a non-native accent?: a study of Korean English.
- Sang-Jin Kim, Kwang-Ki Kim, Minsoo Hahn:
Study on emotional speech features in Korean with its application to voice color conversion.
- Shigeaki Amano, Tomohiro Nakatani, Tadahisa Kondo:
Developmental changes in voiced-segment ratio for Japanese infants and parents.
- Kisun You, Hoyoun Kim, Wonyong Sung:
Implementation of an intonational quality assessment system for a handheld device.
- Denis Beautemps, Thomas Burger, Laurent Girin:
Characterizing and classifying cued speech vowels from labial parameters.
- Shinya Takahashi, Tsuyoshi Morimoto, Sakashi Maeda, Naoyuki Tsuruta:
Cough detection in spoken dialogue system for home health care.
Towards Adaptive Machines: Active and Unsupervised Learning
- Dong Yu, Mei-Yuh Hwang, Peter Mau, Alex Acero, Li Deng:
Unsupervised learning from users' error correction in speech dictation.
- Gerard G. L. Meyer, Teresa M. Kamm:
Robustness aspects of active learning for acoustic modeling.
- Karthik Visweswariah, Ramesh A. Gopinath, Vaibhava Goel:
Task adaptation of acoustic and language models based on large quantities of data.
- Luc Lussier, Edward W. D. Whittaker, Sadaoki Furui:
Unsupervised language model adaptation methods for spontaneous speech.
- Masafumi Nishida, Yoshitaka Mamiya, Yasuo Horiuchi, Akira Ichikawa:
On-line incremental adaptation based on reinforcement learning for robust speech recognition.
- Tomohiro Watanabe, Hiromitsu Nishizaki, Takehito Utsuro, Seiichi Nakagawa:
Unsupervised speaker adaptation using high confidence portion recognition results by multiple recognition systems.
Speech Coding
- Sorin Dusan, James L. Flanagan, Amod Karve, Mridul Balaraman:
Speech coding using trajectory compression and multiple sensors.
- Christian Feldbauer, Gernot Kubin:
How sparse can we make the auditory representation of speech?
- David Malah, Slava Shectman:
Efficient sub-optimal temporal decomposition with dynamic weighting of speech signals for coding applications.
- Teddy Surya Gunawan, Eliathamby Ambikairajah, Julien Epps:
Perceptual wavelet packet audio coder.
- Sung-Kyo Jung, Hong-Goo Kang, Dae Hee Youn, Chang-Heon Lee:
Performance analysis of transcoding algorithms in packet-loss environments.
- Tiago H. Falk, Wai-Yip Chan, Peter Kabal:
Speech quality estimation using Gaussian mixture models.
Robust ASR
- Hong Kook Kim, Mazin G. Rahim:
Why speech recognizers make errors? a robustness view.
- Seyed Mohammad Ahadi, Hamid Sheikhzadeh, Robert L. Brennan, George Freeman:
An energy normalization scheme for improved robustness in speech recognition.
- Juan M. Huerta, Etienne Marcheret, Sreeram Balakrishnan:
Rapid on-line environment compensation for server-based speech recognition in noisy mobile environments.
- Leila Ansary, Seyyed Ali Seyyed Salehi:
Modeling phones coarticulation effects in a neural network based speech recognition system.
- Daniel Willett:
Error-weighted discriminative training for HMM parameter estimation.
- Wai Kit Lo, Frank K. Soong, Satoshi Nakamura:
Robust verification of recognized words in noise.
- Zili Li, Hesham Tolba, Douglas D. O'Shaughnessy:
Robust automatic speech recognition using an optimal spectral amplitude estimator algorithm in low-SNR car environments.
- Junhui Zhao, Jingming Kuang, Xiang Xie:
Robust speech recognition using data-driven temporal filters based on independent component analysis.
- Norihide Kitaoka, Longbiao Wang, Seiichi Nakagawa:
Robust distant speech recognition based on position dependent CMN.
- Sumitaka Sakauchi, Yoshikazu Yamaguchi, Satoshi Takahashi, Satoshi Kobashikawa:
Robust speech recognition based on HMM composition and modified Wiener filter.
- Ivan Brito, Néstor Becerra Yoma, Carlos Molina:
Feature-dependent compensation in speech recognition.
- Stephen Cox:
Using context to correct phone recognition errors.
- Yasunari Obuchi:
Improved histogram-based feature compensation for robust speech recognition and unsupervised speaker adaptation.
- Zhenyu Xiong, Thomas Fang Zheng, Wenhu Wu:
Weighting observation vectors for robust speech recognition in noisy environments.
- Masanori Tsujikawa, Ken-ichi Iso:
Hands-free speech recognition using blind source separation post-processed by two-stage spectral subtraction.
- Randy Gomez, Akinobu Lee, Hiroshi Saruwatari, Kiyohiro Shikano:
Robust speech recognition with spectral subtraction in low SNR.
- Bert Cranen, Johan de Veth:
Active perception: using a priori knowledge from clean speech models to ignore non-target features.
- Haitian Xu, Zheng-Hua Tan, Paul Dalsgaard, Børge Lindberg:
Spectral subtraction with full-wave rectification and likelihood controlled instantaneous noise estimation for robust speech recognition.
- Filip Korkmazsky, Dominique Fohr, Irina Illina:
Using linear interpolation to improve histogram equalization for speech recognition.
- Mark Hasegawa-Johnson, Ameya Deoras:
A factorial HMM approach to robust isolated digit recognition in background music.
- Yoonjae Lee, Hanseok Ko:
Multi-eigenspace normalization for robust speech recognition in noisy environments.
- Christophe Cerisara, Dominique Fohr, Odile Mella, Irina Illina:
Exploiting models intrinsic robustness for noisy speech recognition.
- Pere Pujol, Jaume Padrell, Climent Nadeu, Dusan Macho:
Speech recognition experiments with the SPEECON database using several robust front-ends.
- Shajith Ikbal, Mathew Magimai-Doss, Hemant Misra, Hervé Bourlard:
Spectro-temporal activity pattern (STAP) features for noise robust ASR.
- Byoung-Don Kim, Jin-Young Kim, Seung-Ho Choi, Young-Bum Lee, Kyoung-Rok Lee:
Improvement of confidence measure performance using background model set algorithm.
- Guillermo Aradilla, John Dines, Sunil Sivadas:
Using RASTA in task independent TANDEM feature extraction.
- Kyu Jeong Han, Shrikanth Narayanan, Naveen Srinivasamurthy:
A distributed speech recognition system in multi-user environments.
- Reinhold Haeb-Umbach, Valentin Ion:
Soft features for improved distributed speech recognition over wireless networks.
Emerging Research
- Rinzou Ebukuro:
Analysis on disappearing and thriving of speech applications for ergonomic design guidelines and recommendations.
- Paula M. T. Smeele, Sebastian Möller, Jan Felix Krebber:
Evaluation of the speech output of a smart-home system in a car environment.
- Ellen C. Haas:
How does the integration of speech recognition controls and spatialized auditory displays affect user workload?
- Fang Chen:
Speech interaction system - how to increase its usability?
- Nicole Beringer:
Human language acquisition methods in a machine learning task.
Spoken Language Resources and Technology Evaluation I
- Laila Dybkjær, Niels Ole Bernsen, Wolfgang Minker:
New challenges in usability evaluation - beyond task-oriented spoken dialogue systems.
- Owen Kimball, Chia-Lin Kao, Rukmini Iyer, Teodoro Arvizo, John Makhoul:
Using quick transcriptions to improve conversational speech models.
- Rohit Mishra, Elizabeth Shriberg, Sandra Upson, Joyce Chen, Fuliang Weng, Stanley Peters, Lawrence Cavedon, John Niekrasz, Hua Cheng, Harry Bratt:
A Wizard of Oz framework for collecting spoken human-computer dialogs.
- Mikko Hartikainen, Esa-Pekka Salonen, Markku Turunen:
Subjective evaluation of spoken dialogue systems using SERVQUAL method.
- Ioana Vasilescu, Laurence Devillers, Chloé Clavel, Thibaut Ehrette:
Fiction database for emotion detection in abnormal situations.
- Ruhi Sarikaya, Yuqing Gao, Paola Virga:
Fast semi-automatic semantic annotation for spoken dialog systems.
- Yi-Jian Wu, Hisashi Kawai, Jinfu Ni, Ren-Hua Wang:
A study on automatic detection of Japanese vowel devoicing for speech synthesis.
- Tolga Çiloglu, Dinc Acar, Ahmet Tokatli:
Orientel-Turkish: telephone speech database description and notes on the experience.
- Taejin Yoon, Sandra Chavarria, Jennifer Cole, Mark Hasegawa-Johnson:
Intertranscriber reliability of prosodic labeling on telephone conversation using ToBI.
- Jilei Tian:
Efficient compression method for pronunciation dictionaries.
- Min-Siong Liang, Dau-Cheng Lyu, Yuang-Chin Chiang, Ren-Yuan Lyu:
Construct a multi-lingual speech corpus in Taiwan with extracting phonetically balanced articles.
- Per Olav Heggtveit, Jon Emil Natvig:
Automatic prosody labeling of read Norwegian.
- Eric Sanders, Andrea Diersen, Willy Jongenburger, Helmer Strik:
Towards automatic word segmentation of dialect speech.
- Petr Fousek, Frantisek Grézl, Hynek Hermansky, Petr Svojanovsky:
New nonsense syllables database - analyses and preliminary ASR experiments.
- Jan Felix Krebber, Sebastian Möller, Alexander Raake:
Speech input and output module assessment for remote access to a smart-home spoken dialog system.
- Dong-Hyun Kim, Yong-Wan Roh, Kwang-Seok Hong:
An implement of speech DB gathering system using VoiceXML.
- Farshad Almasganj:
Precise phone boundary detection using wavelet packet and recurrent neural networks.
- Andrew Cameron Morris, Viktoria Maier, Phil Green:
From WER and RIL to MER and WIL: improved evaluation measures for connected speech recognition.
- Seok-Chae Rhee, Sook-Hyang Lee, Young-Ju Lee, Seok-Keun Kang:
Design and construction of Korean-spoken English corpus.
- Folkert de Vriend, Giulio Maltese:
Exploring XML-based technologies and procedures for quality evaluation from a real-life case perspective.
- Kuansan Wang:
Spoken language interface in ECMA/ISO telecommunication standards.
- Marelie H. Davel, Etienne Barnard:
The efficient generation of pronunciation dictionaries: machine learning factors during bootstrapping.
- Anja Geumann:
Towards a new level of annotation detail of multilingual speech corpora.
- Nobuo Kawaguchi, Shigeki Matsubara, Yukiko Yamaguchi, Kazuya Takeda, Fumitada Itakura:
CIAIR in-car speech database.
- Christophe Van Bael, Henk van den Heuvel, Helmer Strik:
Investigating speech style specific pronunciation variation in large spoken language corpora.
- Marelie H. Davel, Etienne Barnard:
The efficient generation of pronunciation dictionaries: human factors during bootstrapping.
Multi-Modal / Multi-Media Processing
- Roger K. Moore:
Modeling data entry rates for ASR and alternative input methods.
- Hiromitsu Ban, Chiyomi Miyajima, Katsunobu Itou, Fumitada Itakura, Kazuya Takeda:
Speech recognition using synchronization between speech and finger tapping.
- Anurag Kumar Gupta, Tasos Anastasakos:
Integration patterns during multimodal interaction.
- Etienne Marcheret, Stephen M. Chu, Vaibhava Goel, Gerasimos Potamianos:
Efficient likelihood computation in multi-stream HMM based audio-visual speech recognition.
- Changkyu Choi, Donggeon Kong, Hyoung-Ki Lee, Sang Min Yoon:
Separation of multiple concurrent speeches using audio-visual speaker localization and minimum variance beam-forming.
- Tokitomo Ariyoshi, Kazuhiro Nakadai, Hiroshi Tsujino:
Multimodal expression for humanoid robots by integration of human speech mimicking and facial color.
Automatic Speech Recognition in the Context of Mobile Communications
Robust Features for ASR
- Takashi Fukuda, Tsuneo Nitta:
Canonicalization of feature parameters for automatic speech recognition.
- Soundararajan Srinivasan, Nicoleta Roman, DeLiang Wang:
On binary and ratio time-frequency masks for robust speech recognition.
- Alberto Sanchís, Alfons Juan, Enrique Vidal:
New features based on multiple word graphs for utterance verification.
- Lukas Burget:
Combination of speech features using smoothed heteroscedastic linear discriminant analysis.
- Shajith Ikbal, Hemant Misra, Sunil Sivadas, Hynek Hermansky, Hervé Bourlard:
Entropy based combination of tandem representations for noise robust ASR.
- Dongsuk Yook, Donghyun Kim:
Fast speech adaptation in linear spectral domain for additive and convolutional noise.
Towards Rapid Speech and Natural Language Application Development: Tooling, Architectures, Components and Standards
- I. Lee Hetherington:
The MIT finite-state transducer toolkit for speech and language processing.
- Junlan Feng, Srinivas Bangalore, Mazin G. Rahim:
Question-answering in webtalk: an evaluation study.
- Juan M. Huerta, Chaitanya Ekanadham:
Automatic network optimization of voice applications.
- Miguel Angel Rodriguez-Moreno, Heriberto Cuayáhuitl, Juventino Montiel-Hernández:
Voicebuilder: a framework for automatic speech application development.
- Andrea Facco, Daniele Falavigna, Roberto Gretter, Marcello Viganò:
On the development of telephone applications: some practical issues and evaluation.
- Stefan W. Hamerich, Volker Schless, Basilis Kladis, Volker Schubert, Otilia Kocsis, Stefan Igel, Ricardo de Córdoba, Luis Fernando D'Haro, José Manuel Pardo:
The GEMINI platform: semi-automatic generation of dialogue applications.
Speech Coding and Enhancement
- Kazuhiro Kondo, Kiyoshi Nakagawa:
A packet loss concealment method using recursive linear prediction.
- Minkyu Lee, Imed Zitouni, Qiru Zhou:
On a n-gram model approach for packet loss concealment.
- Stephen So, Kuldip K. Paliwal:
Efficient vector quantisation of line spectral frequencies using the switched split vector quantiser.
- M. Chaitanya, S. R. Mahadeva Prasanna, B. Yegnanarayana:
Enhancement of reverberant speech using excitation source information.
- Keisuke Kinoshita, Tomohiro Nakatani, Masato Miyoshi:
Improving automatic speech recognition performance and speech intelligibility with harmonicity based dereverberation.
- Seung Yeol Lee, Nam Soo Kim, Joon-Hyuk Chang:
Inner product based-multiband vector quantization for wideband speech coding at 16 kbps.
- Alberto Abad, Javier Hernando:
Speech enhancement and recognition by integrating adaptive beamforming and Wiener filtering.
- Kyung-Tae Kim, Sung-Kyo Jung, MiSuk Lee, Hong-Goo Kang, Dae Hee Youn:
Temporal normalization techniques for transform-type speech coding and application to split-band wideband coders.
- Tatsunori Asai, Shigeki Miyabe, Hiroshi Saruwatari, Kiyohiro Shikano:
Interface for barge-in free spoken dialogue system using adaptive sound field control.
- Jong-Hark Kim, Jae-Hyun Shin, InSung Lee:
Multi-mode harmonic transform excitation LPC coding for speech and music.
- Mital Gandhi, Mark Hasegawa-Johnson:
Source separation using particle filters.
- Anssi Rämö, Jani Nurminen, Sakari Himanen, Ari Heikkinen:
Segmental speech coding model for storage applications.
- Gwo-hwa Ju, Lin-Shan Lee:
Improved speech enhancement by applying time-shift property of DFT on Hankel matrices for signal subspace decomposition.
- Jari Juhani Turunen, Juha T. Tanttu, Frank Cameron:
Minimum phase compensation in speech coding using Hammerstein model.
- Weifeng Li, Fumitada Itakura, Kazuya Takeda:
Optimizing regression for in-car speech recognition using multiple distributed microphones.
- Weifeng Li, Kazuya Takeda, Fumitada Itakura, Tran Huy Dat:
Speech enhancement based on magnitude estimation using the gamma prior.
- Andrew Errity, John McKenna, Stephen Isard:
Unscented Kalman filtering of line spectral frequencies.
- Hyoung-Gook Kim, Thomas Sikora:
Speech enhancement based on smoothing of spectral noise floor.
- Junfeng Li, Masato Akagi:
Noise reduction using hybrid noise estimation technique and post-filtering.
- Marcel Gabrea:
An adaptive Kalman filter for the enhancement of speech signals.
- T. V. Sreenivas, K. Sharath Rao, A. Sreenivasa Murthy:
Improved iterative Wiener filtering for non-stationary noise speech enhancement.
- Yasheng Qian, Peter Kabal:
Highband spectrum envelope estimation of telephone speech using hard/soft-classification.
Acoustic Modeling for Robust ASR
- Filip Korkmazsky, Murat Deviren, Dominique Fohr, Irina Illina:
Hidden factor dynamic Bayesian networks for speech recognition.
- Mark Mao, Vincent Vanhoucke:
Design of compact acoustic models through clustering of tied-covariance Gaussians.
- Chandra Kant Raut, Takuya Nishimoto, Shigeki Sagayama:
Model composition by Lagrange polynomial approximation for robust speech recognition in noisy environment.
- Jian Wu, Donglai Zhu, Qiang Huo:
A study of minimum classification error training for segmental switching linear Gaussian hidden Markov models.
- Shigeki Matsuda, Takatoshi Jitsuhiro, Konstantin Markov, Satoshi Nakamura:
Speech recognition system robust to noise and speaking styles.
- Néstor Becerra Yoma, Ivan Brito, Carlos Molina:
The stochastic weighted Viterbi algorithm: a framework to compensate additive noise and low-bit rate coding distortion.
Spoken Dialogue Technology and Systems
- Stefanie Tomko, Roni Rosenfeld:
Shaping spoken input in user-initiative systems.
- Christopher J. Pavlovski, Jennifer C. Lai, Stella Mitchell:
Etiology of user experience with natural language speech.
- Manny Rayner, Beth Ann Hockey:
Side effect free dialogue management in a voice enabled procedure browser.
- Ian Richard Lane, Tatsuya Kawahara, Shinichi Ueno:
Example-based training of dialogue planning incorporating user and situation models.
- Shinya Fujie, Tetsunori Kobayashi, Daizo Yagi, Hideaki Kikuchi:
Prosody based attitude recognition with feature selection and its application to spoken dialog system as para-linguistic information.
- David Ollason, Yun-Cheng Ju, Siddharth Bhatia, Daniel Herron, Jackie Liu:
MS connect: a fully featured auto-attendant: system design, implementation and performance.
Multi-Channel Speech Processing
- Reinhold Haeb-Umbach, Sven Peschke, Ernst Warsitz:
Adaptive beamforming combined with particle filtering for acoustic source localization.
- Hong-Seok Kwon, Siho Kim, Keun-Sung Bae:
Time delay estimation using weighted CPSP function.
- Ilyas Potamitis, Panagiotis Zervas, Nikos Fakotakis:
DOA estimation of speech signals using semi-blind source separation techniques.
- Sang-Gyun Kim, Chang D. Yoo:
Blind separation of speech and sub-Gaussian signals in underdetermined case.
- Gil-Jin Jang, Changkyu Choi, Yongbeom Lee, Yung-Hwan Oh:
Adaptive cross-channel interference cancellation on blind signal separation outputs using source absence/presence detection and spectral subtraction.
- Erik M. Visser, Kwokleung Chan, Stanley Kim, Te-Won Lee:
A comparison of simultaneous 3-channel blind source separation to selective separation on channel pairs using 2-channel BSS.
Intersection of Spoken Language Processing and Written Language Processing
Prosodic Recognition and Analysis
- Keiichi Takamaru:
Evaluation of a threshold for detecting local slower phrases in Japanese spontaneous conversational speech.
- Nazrul Effendy, Ekkarit Maneenoi, Patavee Charnvivit, Somchai Jitapunkul:
Intonation recognition for Indonesian speech based on Fujisaki model.
- Jin-Song Zhang, Satoshi Nakamura, Keikichi Hirose:
Efficient tone classification of speaker independent continuous Chinese speech using anchoring based discriminating features.
- Michiko Watanabe, Yasuharu Den, Keikichi Hirose, Nobuaki Minematsu:
Clause types and filled pauses in Japanese spontaneous monologues.
- Yohei Yabuta, Yasuhiro Katagiri, Noriko Suzuki, Yugo Takeuchi:
Effect of voice prosody on the decision making process in human-computer interaction.
- Noriko Suzuki, Yasuhiro Katagiri:
Alignment of human prosodic patterns for spoken dialogue systems.
- Shinya Kiriyama, Shigeyoshi Kitazawa:
Evaluation of a prosodic labeling system utilizing linguistic information.
- Allison Blodgett:
Functions of intonation boundaries during spoken language comprehension in English.
- Marco Kühne, Matthias Wolff, Matthias Eichner, Rüdiger Hoffmann:
Voice activation using prosodic features.
- Sahyang Kim:
The role of prosodic cues in word segmentation of Korean.
- Sun-Ah Jun:
Default phrasing and attachment preference in Korean.
- Sarah Borys, Aaron Cohen, Mark Hasegawa-Johnson, Jennifer Cole:
Modeling and recognition of phonetic and prosodic factors for improvements to acoustic speech recognition models.
- Eunjong Kong:
The role of pitch range variation in the discourse structure and intonation structure of Korean.
- Kazuyuki Takagi, Kazuhiko Ozeki:
Dependency analysis of read Japanese sentences using pause and F0 information: a speaker independent case.
- Shari R. Speer, Soyoung Kang:
Effects of prosodic boundaries on ambiguous syntactic clause boundaries in Japanese.
- Yasuko Nagasaki, Takanori Komatsu:
The superior effectiveness of the F0 range for identifying the context from sounds without phonemes.
- Tan Li, Montri Karnjanadecha, Thanate Khaorapapong:
A study of tone classification for continuous Thai speech recognition.
- Key-Seop Kim, Un Lim, Dong-Il Shin:
An acoustic-analytic role for the deviation between the scansion and reading of poems.
- Tomoko Ohsuga, Masafumi Nishida, Yasuo Horiuchi, Akira Ichikawa:
Estimating syntactic structure from prosodic features in Japanese speech.
- Masahiko Komatsu, Tsutomu Sugawara, Takayuki Arai:
Perceptual discrimination of prosodic types and their preliminary acoustic analysis.
Towards Rapid Speech and Natural Language Application Development
- Johann L'Hour, Olivier Boëffard, Jacques Siroux, Laurent Miclet, Francis Charpentier, Thierry Moudenc:
DORIS, a multiagent/IP platform for multimodal dialogue applications.
- Yu Chen:
EVITA-RAD: an extensible enterprise voice porTAl - rapid application development tool.
- Luis Fernando D'Haro, Ricardo de Córdoba, Rubén San Segundo, Juan Manuel Montero, Javier Macías Guarasa, José Manuel Pardo:
Strategies to reduce design time in multimodal/multilingual dialog applications.
- Gregory Aist:
Three-way system-user-expert interactions help you expand the capabilities of an existing spoken dialogue system.
- Giuseppe Di Fabbrizio, Charles Lewis:
Florence: a dialogue manager framework for spoken dialogue systems.
- Tatsuya Kawahara, Akinobu Lee, Kazuya Takeda, Katsunobu Itou, Kiyohiro Shikano:
Recent progress of open-source LVCSR engine Julius and Japanese model repository.
- Hiroya Murao, Nobuo Kawaguchi, Shigeki Matsubara, Yukiko Yamaguchi, Kazuya Takeda, Yasuyoshi Inagaki:
Example-based spoken dialogue system with online example augmentation.
- Dirk Bühler:
Enhancing existing form-based dialogue managers with reasoning capabilities.
- Markku Turunen, Esa-Pekka Salonen, Mikko Hartikainen, Jaakko Hakulinen:
Robust and adaptive architecture for multilingual spoken dialogue systems.
- Porfírio P. Filipe, Nuno J. Mamede:
Towards ubiquitous task management.