EUROSPEECH 1997: Rhodes, Greece
George Kokkinakis, Nikos Fakotakis, Evangelos Dermatas (Eds.):
Fifth European Conference on Speech Communication and Technology, EUROSPEECH 1997, Rhodes, Greece, September 22-25, 1997.
ISCA 1997
Keynotes
Acoustic Modelling
- Stéphane Dupont, Hervé Bourlard:
Using multiple time scales in a multi-stream speech recognition system.
- Yumi Wakita, Harald Singer, Yoshinori Sagisaka:
Speech recognition using HMM-state confusion characteristics.
- Cristina Chesta, Pietro Laface, Franco Ravera:
Bottom-up and top-down state clustering for robust acoustic modeling.
- Ralf Schlüter, Wolfgang Macherey, Stephan Kanthak, Hermann Ney, Lutz Welling:
Comparison of optimization methods for discriminative training criteria.
- Clark Z. Lee, Douglas D. O'Shaughnessy:
Clustering beyond phoneme contexts for speech recognition.
- Rathinavelu Chengalvarayan:
Influence of outliers in training the parametric trajectory models for speech recognition.
- Trym Holter, Torbjørn Svendsen:
Incorporating linguistic knowledge and automatic baseform generation in acoustic subword unit based speech recognition.
- Peter Beyerlein, Meinhard Ullrich, Patricia Wilcox:
Modelling and decoding of crossword context dependent phones in the Philips large vocabulary continuous speech recognition system.
- Philip Hanna, Ji Ming, Peter O'Boyle, F. Jack Smith:
Modelling inter-frame dependence with preceding and succeeding frames.
- Rhys James Jones, Simon Downey, John S. Mason:
Continuous speech recognition using syllables.
- Daniel Willett, Gerhard Rigoll:
A new approach to generalized mixture tying for continuous HMM-based speech recognition.
- Klaus Beulen, Elmar Bransch, Hermann Ney:
State tying for context dependent phoneme models.
- Jacques Duchateau, Kris Demuynck, Dirk Van Compernolle:
A novel node splitting criterion in decision tree construction for semi-continuous HMMs.
- Mats Blomberg:
Creating unseen triphones by phone concatenation in the spectral, cepstral and formant domains.
- Thilo Pfau, Manfred Beham, Wolfgang Reichl, Günther Ruske:
Creating large subword units for speech recognition.
- Jacob Goldberger, David Burshtein, Horacio Franco:
Segmental modeling using a continuous mixture of non-parametric models.
- Jane W. Chang, James R. Glass:
Segmentation and modeling in segment-based recognition.
- Alfred Hauenstein:
Using syllables in a hybrid HMM-ANN recognition system.
- Ramalingam Hariharan, Juha Häkkinen, Kari Laurila, Janne Suontausta:
Noise robust segment-based word recognition using vector quantisation.
- Luis Javier Rodríguez, M. Inés Torres:
Viterbi based splitting of phoneme HMM's.
- José B. Mariño, Albino Nogueiras, Antonio Bonafonte:
The demiphone: an efficient subword unit for continuous speech recognition.
- Hiroaki Kojima, Kazuyo Tanaka:
Organizing phone models based on piecewise linear segment lattices of speech samples.
- Ivica Rogina:
Automatic architecture design by likelihood-based context clustering with crossvalidation.
- Sam T. Roweis, Abeer Alwan:
Towards articulatory speech recognition: learning smooth maps to recover articulator information.
- Anastasios Tsopanoglou, Nikos Fakotakis:
Selection of the most effective set of subword units for an HMM-based speech recognition system.
- Christophe Cerisara, Jean Paul Haton, Jean-François Mari, Dominique Fohr:
Multi-band continuous speech recognition.
- Nabil N. Bitar, Carol Y. Espy-Wilson:
The design of acoustic parameters for speaker-independent speech recognition.
Dynamic Articulatory Measurements
- Laurence Candille, Henri Meloni:
Adaptation of natural articulatory movements to the control of the command parameters of a production model.
- Maureen C. Stone, Andrew J. Lundberg, Edward P. Davis, Rao P. Gullapalli, Moriel NessAiver:
Three-dimensional coarticulatory strategies of tongue movement.
- Nathalie Parlangeau, Régine André-Obrecht:
From laryngographic and acoustic signals to voicing gestures.
- Erkki Vilkman, Raija Takalo, Taisto Maatta, Anne-Maria Laukkanen, Jaana Nummenranta, Tero Lipponen:
Ultrasonographic measurement of cricothyroid space in speech.
- Didier Demolin, Martine George, Véronique Lecuit, Thierry Metens, Alain Soquet, H. Raeymaekers:
Coarticulation and articulatory compensations studied by dynamic MRI.
- Pierre Badin, Enrico Baricchi, Anne Vilain:
Determining tongue articulation: from discrete fleshpoints to continuous shadow.
Language Identification
- Marc A. Zissman:
Predicting, diagnosing and improving automatic language identification performance.
- Cristobal Corredor-Ardoy, Jean-Luc Gauvain, Martine Adda-Decker, Lori Lamel:
Language identification with language-independent acoustic models.
- Eluned S. Parris, Harvey Lloyd-Thomas, Michael J. Carey, Jerry H. Wright:
Bayesian methods for language verification.
- HingKeung Kwan, Keikichi Hirose:
Use of recurrent network for unknown language rejection in language identification system.
- Ove Andersen, Paul Dalsgaard:
Language-identification based on cross-language acoustic models and optimised information combination.
- Jiri Navratil, Werner Zühlke:
Phonetic-context mapping in language identification.
Neural Networks for Speech and Language Processing
Training Techniques; Efficient Decoding in ASR
- Koichi Shinoda, Takao Watanabe:
Acoustic modeling based on the MDL principle for speech recognition.
- Piyush Modi, Mazin Rahim:
Discriminative utterance verification using multiple confidence measures.
- Enrico Bocchieri, Brian Mak:
Subspace distribution clustering for continuous observation density hidden Markov models.
- Harriet J. Nock, Mark J. F. Gales, Steve J. Young:
A comparative study of methods for phonetic decision-tree state clustering.
- Alfred Kaltenmeier, Jürgen Franke:
Comparing Gaussian and polynomial classification in SCHMM-based recognition systems.
- Alexandre Girardi, Harald Singer, Kiyohiro Shikano, Satoshi Nakamura:
Maximum likelihood successive state splitting algorithm for tied-mixture HMNET.
- Erik McDermott, Shigeru Katagiri:
String-level MCE for continuous phoneme recognition.
- Ze'ev Rivlin, Ananth Sankar, Harry Bratt:
HMM state clustering across allophone class boundaries.
- Mehryar Mohri, Michael Riley:
Weighted determinization and minimization for large vocabulary speech recognition.
- Steven Phillips, Anne Rogers:
Parallel speech recognition.
- Stefan Ortmanns, Thorsten Firzlaff, Hermann Ney:
Fast likelihood computation methods for continuous mixture densities in large vocabulary speech recognition.
- Kris Demuynck, Jacques Duchateau, Dirk Van Compernolle:
A static lexicon network representation for cross-word context dependent phones.
- Mukund Padmanabhan, Lalit R. Bahl, David Nahamoo, Pieter de Souza:
Decision-tree based quantization of the feature space of a speech recognizer.
- Mosur Ravishankar, Roberto Bisiani, Eric H. Thayer:
Sub-vector clustering to improve memory and speed performance of acoustic likelihood computation.
- Simon Hovell:
The incorporation of path merging in a dynamic network recogniser.
- Miroslav Novak:
Improvement on connected digits recognition using duration constraints in the asynchronous decoding scheme.
- Andreas Stolcke, Yochai Konig, Mitchel Weintraub:
Explicit word error minimization in n-best list rescoring.
- Long Nguyen, Richard M. Schwartz:
Efficient 2-pass n-best decoder.
- Tomohiro Iwasaki, Yoshiharu Abe:
A memory management method for a large word network.
Prosody
- Antonio Romano:
Persistence of prosodic features between dialectal and standard Italian utterances in six sub-varieties of a region of southern Italy (Salento): first assessments of the results of a recognition test and an instrumental analysis.
- Halewijn Vereecken, Annemie Vorstermans, Jean-Pierre Martens, Bert Van Coile:
Improving the phonetic annotation by means of prosodic phrasing.
- Cecilia Odé:
A descriptive study of prosodic phenomena in Mpur (West Papuan Phylum).
- Hansjörg Mixdorff, Hiroya Fujisaki:
Automated quantitative analysis of F0 contours of utterances from a German ToBI-labeled speech database.
- Stéphanie de Tournemire:
Identification and automatic generation of prosodic contours for a text-to-speech synthesis system in French.
- Jinfu Ni, Ren-Hua Wang, Keikichi Hirose:
Quantitative analysis and formulation of tone concatenation in Chinese F0 contours.
- Christel Brindöpke, Arno Pahde, Franz Kummert, Gerhard Sagerer:
An environment for the labelling and testing of melodic aspects of speech.
- David Casacuberta, Lourdes Aguilar, Rafael Marín:
PROPAUSE: a syntactico-prosodic system designed to assign pauses.
- Volker Warnke, Ralf Kompe, Heinrich Niemann, Elmar Nöth:
Integrated dialog act segmentation and classification using prosodic features and language models.
- Monique E. van Donzel, Florien J. Koopmans-van Beinum:
Evaluation of prosodic characteristics in retold stories in Dutch by means of semantic scales.
- Gösta Bruce, Marcus Filipsson, Johan Frid, Björn Granström, Kjell Gustafson, Merle Horne, David House:
Text-to-intonation in spontaneous Swedish.
- Yann Morlec, Gérard Bailly, Véronique Aubergé:
Synthesising attitudes with global rhythmic and intonation contours.
- Dafydd Gibbon, Claudia Sassen:
Prosody-particle pairs as discourse control signs.
- Anja Elsner:
Focus detection with additional information of phrase boundaries and sentence mode.
- Laura Bosch, Núria Sebastián-Gallés:
The role of prosody in infants' native-language discrimination abilities: the case of two phonologically close languages.
- Eugene H. Buder, Anders Eriksson:
Prosodic cycles and interpersonal synchrony in American English and Swedish.
- Eva Strangert:
Relating prosody to syntax: boundary signalling in Swedish.
- Mitsuru Nakai, Hiroshi Shimodaira:
On representation of fundamental frequency of speech for prosody analysis using reliability function.
- Seong-Hwan Kim, Jin-Young Kim:
Efficient method of establishing words tone dictionary for Korean TTS system.
- Mariapaola D'Imperio, David House:
Perception of questions and statements in Neapolitan Italian.
Keyword and Topic Spotting
- Qiguang Lin, David Lubensky, Michael Picheny, P. Srinivasa Rao:
Key-phrase spotting using an integrated language model of n-grams and finite-state grammar.
- Jochen Junkawitsch, Günther Ruske, Harald Höge:
Efficient methods for detecting keywords in continuous speech.
- Raymond Lau, Stephanie Seneff:
Providing sublexical constraints for word spotting within the ANGIE framework.
- Katarina Bartkova, Denis Jouvet:
Usefulness of phonetic parameters in a rejection procedure of an HMM-based speech recognition system.
- Yoichi Yamashita, Riichiro Mizoguchi:
Keyword spotting using F0 contour matching.
- Elmar Nöth, Stefan Harbeck, Heinrich Niemann, Volker Warnke:
A frame and segment based approach for topic spotting.
Robustness in Recognition and Signal Processing
- Kuldip K. Paliwal, Yoshinori Sagisaka:
Cyclic autocorrelation-based linear prediction analysis of speech.
- Ilija Zeljkovic, Shrikanth Narayanan:
Novel filler acoustic models for connected digit recognition.
- Makoto Shozakai, Satoshi Nakamura, Kiyohiro Shikano:
A non-iterative model-adaptive e-CMN/PMC approach for speech recognition in car environments.
- Ángel de la Torre, Antonio M. Peinado, Antonio J. Rubio, Pedro García:
Discriminative feature extraction for speech recognition in noise.
- Michael K. Brendborg, Børge Lindberg:
Noise robust recognition using feature selective modeling.
- Victor Abrash:
Mixture input transformations for adaptation of hybrid connectionist speech recognizers.
- Tai-Hwei Hwang, Lee-Min Lee, Hsiao-Chuan Wang:
Adaptation of time differentiated cepstrum for noisy speech recognition.
- Noboru Kanedera, Takayuki Arai, Hynek Hermansky, Misha Pavel:
On the importance of various modulation frequencies for speech recognition.
- Wei-Tyng Hong, Sin-Horng Chen:
A robust RNN-based pre-classification for noisy Mandarin speech recognition.
- Mazin Rahim:
A parallel environment model (PEM) for speech recognition and adaptation.
- Volker Schless, Fritz Class:
Adaptive model combination for robust speech recognition in car environments.
- Stefaan Van Gerven, Fei Xie:
A comparative study of speech detection methods.
- Nikos Doukas, Patrick A. Naylor, Tania Stathaki:
Voice activity detection using source separation techniques.
- Tomohiko Taniguchi, Shoji Kajita, Kazuya Takeda, Fumitada Itakura:
Voice activity detection using source separation techniques.
- Carlos Avendaño, Sangita Tibrewala, Hynek Hermansky:
Multiresolution channel normalization for ASR in reverberant environments.
- Rafael Martínez, Agustín Álvarez Marquina, Pedro Gómez Vilda, Mercedes Pérez, Victor Nieto Lluis, Victoria Rodellar:
A speech pre-processing technique for end-point detection in highly non-stationary environments.
- Laura Docío Fernández, Carmen García-Mateo:
Application of several channel and noise compensation techniques for robust speaker recognition.
- Hany Agaiby, Thomas J. Moir:
Knowing the wheat from the weeds in noisy speech.
- Do Yeong Kim, Nam Soo Kim, Chong Kwan Un:
Model-based approach for robust speech recognition in noisy environments with multiple noise sources.
- Y. C. Chu, Charlie Jie, Vincent Tung, Ben Lin, Richard Lee:
Normalization of speaker variability by spectrum warping for robust speech recognition.
- Stéphane H. Maes:
LPC poles tracker for music/speech/noise segmentation and music cancellation.
- Doh-Suk Kim, Jae-Hoon Jeong, Soo-Young Lee, Rhee Man Kil:
Comparative evaluations of several front-ends for robust speech recognition.
- Evandro B. Gouvêa, Richard M. Stern:
Speaker normalization through formant-based warping of the frequency scale.
- Martin Westphal:
The use of cepstral means in conversational speech recognition.
- Juan M. Huerta, Richard M. Stern:
Compensation for environmental and speaker variability by normalization of pole locations.
- Jean-Baptiste Puel, Régine André-Obrecht:
Cellular phone speech recognition: noise compensation vs. robust architectures.
- Tung-Hui Chiang:
Speech recognition in noise using on-line HMM adaptation.
Modelling of Prosody
- Christos Malliopoulos, George K. Mikros:
Metrical representations of demarcation and constituency in noun phrases.
- Hannes Pirker, Kai Alter, Erhard Rank, John Matiasek, Harald Trost, Gernot Kubin:
A system of stylized intonation contours in German.
- Keikichi Hirose, Koji Iwano:
A method of representing fundamental frequency contours of Japanese using statistical models of moraic transition.
- Evita F. Fotinea, Michael A. Vlahakis, George Carayannis:
Modeling arbitrarily long sentence-spanning F0 contours by parametric concatenation of word-spanning patterns.
- R. J. J. H. van Son, Jan P. H. van Santen:
Strong interaction between factors influencing consonant duration.
- Jerneja Gros, Nikola Pavesic, France Mihelic:
Speech timing in Slovenian TTS.
Microphone Arrays for Speech Enhancement
Multilingual Recognition
- Chao Wang, James R. Glass, Helen M. Meng, Joseph Polifroni, Stephanie Seneff, Victor W. Zue:
YINHE: a Mandarin Chinese version of the GALAXY system.
- Patrizia Bonaventura, Filippo Gallocchio, Giorgio Micca:
Multilingual speech recognition for flexible vocabularies.
- Fuliang Weng, Harry Bratt, Leonardo Neumeyer, Andreas Stolcke:
A study of multilingual speech recognition.
- Jayadev Billa, Kristine W. Ma, John W. McDonough, George Zavaliagkos, David R. Miller, Kenneth N. Ross, Amro El-Jaroudi:
Multilingual speech recognition: the 1996 BYBLOS CALLHOME system.
- Tanja Schultz, Detlef Koll, Alex Waibel:
Japanese LVCSR on the spontaneous scheduling task with JANUS-3.
- Tanja Schultz, Alex Waibel:
Fast bootstrapping of LVCSR systems with multilingual phoneme sets.
Language Specific Speech Analysis
Feature Estimation, Pitch, and Prosody
- Anya Varnich Hansen:
Acoustic parameters optimised for recognition of phonetic features.
- Andrew K. Halberstadt, James R. Glass:
Heterogeneous acoustic measurements for phonetic classification.
- Ben P. Milner:
Cepstral-time matrices and LDA for improved connected digit and sub-word recognition accuracy.
- Sarel van Vuuren, Hynek Hermansky:
Data-driven design of RASTA-like filters.
- Simon Nicholson, Ben P. Milner, Stephen J. Cox:
Evaluating feature set performance using the f-ratio and j-measures.
- Javier Hernando, Climent Nadeu:
Robust speech parameters located in the frequency domain.
- François Gaillard, Frédéric Berthommier, Gang Feng, Jean-Luc Schwartz:
A modified zero-crossing method for pitch detection in presence of interfering sources.
- Jacques Simonin, Chafic Mokbel:
Using simulated annealing expectation maximization algorithm for hidden Markov model parameters estimation.
- Gunnar Fant, Stellan Hertegard, Anita Kruckenberg, Johan Liljencrants:
Covariation of subglottal pressure, F0 and glottal parameters.
- Anastasios Delopoulos, Maria Rangoussi:
The fractal behaviour of unvoiced plosives: a means for classification.
- Sumio Ohno, Hiroya Fujisaki, Hideyuki Taguchi:
A method for analysis of the local speech rate using an inventory of reference units.
- Hiroya Fujisaki, Sumio Ohno, Takashi Yagi:
Analysis and modeling of fundamental frequency contours of Greek utterances.
- Fernando Martinez, Daniel Tapias, Jorge Alvarez, Paloma Leon:
Characteristics of slow, average and fast speech and their effects in large vocabulary continuous speech recognition.
- Sungbok Lee, Alexandros Potamianos, Shrikanth Narayanan:
Analysis of children's speech: duration, pitch and formants.
- Hartmut Traunmüller, Anders Eriksson:
A method of measuring formant frequencies at high fundamental frequencies.
- Tom Brøndsted, Jens Printz Madsen:
Analysis of speaking rate variations in stress-timed languages.
- Paul Micallef, Ted Chilton:
Automatic identification of phoneme boundaries using a mixed parameter model.
- Serguei Koval, Veronika Bekasova, Mikhail Khitrov, Andrey N. Raev:
Pitch detection reliability assessment for forensic applications.
- Zhihong Hu, Etienne Barnard:
Efficient estimation of perceptual features for speech recognition.
- Narendranath Malayath, Hynek Hermansky, Alexander Kain:
Towards decomposing the sources of variability in speech.
- Rathinavelu Chengalvarayan:
Use of vector-valued dynamic weighting coefficients for speech recognition: maximum likelihood approach.
- Steve W. Beet, Ladan Baghai-Ravary:
Automatic segmentation: data-driven units of speech.
- Dejan Bajic:
On robust time-varying AR speech analysis based on t-distribution.
- Dimitris Tambakas, Iliana Tzima, Nikos Fakotakis, George Kokkinakis:
A simple phoneme energy model for the Greek language and its application to speech recognition.
- James E. H. Noad, Sandra P. Whiteside, Phil Green:
A macroscopic analysis of an emotional speech corpus.
- Hiroshi Shimodaira, Mitsuru Nakai, Akihiro Kumata:
Restoration of pitch pattern of speech based on a pitch generation model.
- A. V. Agranovski, O. Y. Berg, D. A. Lednov:
The research of correlation between pitch and skin galvanic reaction at change of human emotional state.
- Claude Montacié, Marie-José Caraty, Fabrice Lefèvre:
K-NN versus Gaussian in HMM-based recognition system.
- Boris Doval, Christophe d'Alessandro, Benoit Diard:
Spectral methods for voice source parameters estimation.
Speech Coding
- Olivier van der Vrecken, Nicolas Pierret, Thierry Dutoit, Vincent Pagel, Fabrice Malfrère:
A simple and efficient algorithm for the compression of MBROLA segment databases.
- Parham Zolfaghari, Tony Robinson:
A segmental formant vocoder based on linearly varying mixture of Gaussians.
- Samir Chennoukh, Daniel J. Sinder, Gaël Richard, James L. Flanagan:
Voice mimic system using an articulatory codebook for estimation of vocal tract shape.
- Damith J. Mudugamuwa, Alan B. Bradley:
Adaptive transform coding for linear predictive residual.
- Akira Takahashi, Nobuhiko Kitawaki, Paolino Usai, David Atkinson:
Performance evaluation of objective quality measures for coded speech.
- Mohamed Ismail, Keith Ponting:
Between recognition and synthesis - 300 bits/second speech coding.
- Stephane Villette, Milos Stefanovic, Ian A. Atkinson, Ahmet M. Kondoz:
High quality split-band LPC vocoder and its fixed point real time implementation.
- Wen-Whei Chang, Hwai-Tsu Chang, Wan-Yu Meng:
Missing packet recovery techniques for DM coded speech.
- Hai Le Vu, László Lois:
Spectral sensitivity of LSP parameters and their transformed coefficients.
- V. Ramasubramanian, Kuldip K. Paliwal:
Reducing the complexity of the LPC vector quantizer using the k-d tree search algorithm.
- Aweke N. Lemma, W. Bastiaan Kleijn, Ed F. Deprettere:
Quantization using wavelet based temporal decomposition of the LSF.
- Costas S. Xydeas, Gokhan H. Ilk:
A novel 1.7/2.4 kb/s DCT based prototype interpolation speech coding system.
- Yong-Soo Choi, Hong-Goo Kang, Sang-Wook Park, Jae-Ha Yoo, Dae Hee Youn:
Improved regular pulse VSELP coding of speech at low bit-rates.
- Yong Duk Cho, Hong Kook Kim, Moo Young Kim, Sang Ryong Kim:
Joint estimation of pitch, band magnitudes, and V/UV decisions for MBE vocoder.
- Balázs Kövesi, Samir Saoudi, Jean-Marc Boucher, Gábor Horváth:
A new distance measure in LPC coding: application for real time situations.
- Peter Veprek, Alan B. Bradley:
Consideration of processing strategies for very-low-rate compression of wideband speech signals with known text transcription.
- Norbert Görtz:
Zero-redundancy error protection for CELP speech codecs.
- Ridha Matmti, Milan Jelinek, Jean-Pierre Adoul:
Low bit rate speech coding using an improved HSX model.
- Carlos M. Ribeiro, Isabel Trancoso:
Phonetic vocoding with speaker adaptation.
- Geneviève Baudoin, Jan Cernocký, Gérard Chollet:
Quantization of spectral sequences using variable length spectral segments for speech coding at very low bit rate.
- Shahrokh Ghaemmaghami, Mohamed Deriche, Boualem Boashash:
On modeling event functions in temporal decomposition based speech coding.
- Soledad Torres, F. Javier Casajús-Quirós:
Phase quantization by pitch-cycle waveform coding in low bit rate sinusoidal coders.
- Antonis Botinis, Marios Fourakis, John W. Hawks:
A perceptual study of the Greek vowel space using synthetic stimuli.
- Woo-Jin Han, Sung-Joo Kim, Yung-Hwan Oh:
Mixed multi-band excitation coder using frequency domain mixture function (FDMF) for a low-bit rate speech coding.
- Tim Fingscheidt, Olaf Scheufen:
Robust GSM speech decoding using the channel decoder's soft output.
- Carl W. Seymour, Tony A. Robinson:
A low-bit-rate speech coder using adaptive line spectral frequency prediction.
Speech Synthesis Techniques
- Wen Ding, Nick Campbell:
Optimising unit selection with voice source and formants in the CHATR speech synthesis system.
- Masanobu Abe, Hideyuki Mizuno, Satoshi Takahashi, Shin'ya Nakajima:
A new framework to provide high-controllability speech signal and the development of a workbench for it.
- Eduardo Rodríguez Banga, Carmen García-Mateo, Xavier Fernández Salgado:
Shape-invariant prosodic modification algorithm for concatenative text-to-speech synthesis.
- Shaw-Hwa Hwang, Sin-Horng Chen, Saga Chang:
An RNN-based spectral information generation for Mandarin text-to-speech.
- Jan P. H. van Santen, Adam L. Buchsbaum:
Methods for optimal text selection.
- Francisco M. Gimenez de los Galanes, David Talkin:
High resolution prosody modification for speech synthesis.
- Orhan Karaali, Gerald Corrigan, Ira A. Gerson, Noel Massey:
Text-to-speech conversion with neural networks: a recurrent TDNN approach.
- Jesper Högberg:
Data driven formant synthesis.
- Simon King, Thomas Portele, Florian Höfer:
Speech synthesis using non-uniform units in the Verbmobil project.
- Isabel Trancoso, Céu Viana:
On the pronunciation mode of acronyms in several European languages.
- Toni C. M. Rietveld, Joop Kerkhoff, M. J. W. M. Emons, E. J. Meijer, Angelien Sanderman, Agaath M. C. Sluijter:
Evaluation of speech synthesis systems for Dutch in tele-communication applications in GSM and PSTN networks.
- Bianca Angelini, Claudia Barolo, Daniele Falavigna, Maurizio Omologo, Stefano Sandri:
Automatic diphone extraction for an Italian text-to-speech synthesis system.
- Eric Keller:
Simplification of TTS architecture vs. operational quality.
- Georg Fries, Antje Wirth:
Felix - a TTS system with improved pre-processing and source signal generation.
- Mike Edgington:
Investigating the limitations of concatenative synthesis.
- Luis Miguel Teixeira de Jesus, Gavin C. Cawley:
Speech coding and synthesis using parametric curves.
- Alan W. Black, Paul Taylor:
Automatically clustering similar units for unit selection in speech synthesis.
- Li Jiang, Hsiao-Wuen Hon, Xuedong Huang:
Improvements on a trainable letter-to-sound converter.
- Myungjin Bae, Kyuhong Kim, Woncheol Lee:
On a cepstral pitch alteration technique for prosody control in the speech synthesis system with high quality.
- Yannis Stylianou, Thierry Dutoit, Juergen Schroeter:
Diphone concatenation using a harmonic plus noise model of speech.
Technology for S&L Acquisition, Speech Processing Tools
- Gérard Sabah:
The "sketchboard": a dynamic interpretative memory and its use for spoken language understanding.
- Qiru Zhou, Chin-Hui Lee, Wu Chou, Andrew N. Pargellis:
Speech technology integration and research platform: a system study.
- Dieter Geller, Markus Lieb, Wolfgang Budde, Oliver Muelhens, Manfred Zinke:
Speech recognition on SPHERIC - an IC for command and control applications.
- Michael K. McCandless, James R. Glass:
MUSE: a scripting language for the development of interactive speech analysis and recognition tools.
- Silke M. Witt, Steve J. Young:
Language learning based on non-native speech recognition.
- Ute Kilian, Klaus Bader:
Task modelling by sentence templates.
- Shigeyoshi Kitazawa, Hideya Ichikawa, Satoshi Kobayashi, Yukihiro Nishinuma:
Extraction and representation of rhythmic components of spontaneous speech.
- Yoon Kim, Horacio Franco, Leonardo Neumeyer:
Automatic pronunciation scoring of specific phone segments for language instruction.
- Orith Ronen, Leonardo Neumeyer, Horacio Franco:
Automatic detection of mispronunciation for language instruction.
- Agustín Álvarez Marquina, Rafael Martínez, Victor Nieto Lluis, Victoria Rodellar, Pedro Gómez Vilda:
Continuous formant-tracking applied to visual representations of the speech and speech recognition.
- Goh Kawai, Keikichi Hirose:
A CALL system using speech recognition to train the pronunciation of Japanese long vowels, the mora nasal and mora obstruents.
- Jan Nouza, Miroslav Holada, Daniel Hajek:
An educational and experimental workbench for visual processing of speech data.
- Yong-Soo Choi, Hong-Goo Kang, Sung-Youn Kim, Young-Cheol Park, Dae Hee Youn:
A 3 channel digital CVSD bit-rate conversion system using a general purpose DSP.
- Rodolfo Delmonte, Mirela Petrea, Ciprian Bacalu:
SLIM prosodic module for learning activities in a foreign language.
- Bernhard Kaspar, Karlheinz Schuhmacher, Stefan Feldes:
Barge-in revised.
- Mohammad Akbar:
Waveedit, an interactive speech processing environment for Microsoft Windows platform.
- Farzad Ehsani, Jared Bernstein, Amir Najmi, Ognjen Todic:
Subarashii: Japanese interactive spoken language education.
- David Goddeau, William Goldenthal, Chris Weikart:
Deploying speech applications over the web.
- Johan Schalkwyk, Jacques de Villiers, Sarel van Vuuren, Pieter J. E. Vermeulen:
CSLUsh: an extendible research environment.
- Tibor Ferenczi, Géza Németh, Gábor Olaszy, Zoltan Gaspar:
A flexible client-server model for multilingual CTS/TTS development.
- Unto K. Laine:
Critically sampled PR filterbanks of nonuniform resolution based on block recursive FAMlet transform.
- Nobuaki Minematsu, Nariaki Ohashi, Seiichi Nakagawa:
Automatic detection of accent in English words spoken by Japanese students.
- Yasuhiro Taniguchi, Allan A. Reyes, Hideyuki Suzuki, Seiichi Nakagawa:
An English conversation and pronunciation CAI system using speech recognition technology.
- Stephen Sutton, Edward C. Kaiser, A. Cronk, Ronald A. Cole:
Bringing spoken language systems to the classroom.
- Catia Cucchiarini, Lou Boves:
Automatic assessment of foreign speakers' pronunciation of Dutch.
- John F. Holzrichter, Gregory C. Burnett:
Use of low power EM radar sensors for speech articulator measurements.
- Julien Epps, Annette Dowd, John Smith, Joe Wolfe:
Real time measurements of the vocal tract resonances during speech.
Phonetics and Phonology
- Eleonora Cavalcante Albano, Patrícia Aparecida Aquino:
Linguistic criteria for building and recording units for concatenative speech synthesis in Brazilian Portuguese.
- Knut Kvale, Arne Kjell Foldvik:
"Four-and-twenty, twenty-four". What's in a number?
- Joao Antonio de Moraes:
Vowel nasalization in Brazilian Portuguese: an articulatory investigation.
- Elena Steriopolo:
Rhythmic organization peculiarities of the spoken text.
- Bernhard Rueber:
Obtaining confidence measures from sentence probabilities.
- Yiqing Zu:
Sentence design for speech synthesis and speech recognition database by phonetic rules.
- Christoph Draxler, Susanne Burger:
Identification of regional variants of high German from digit sequences in German telephone speech.
- Darya Kavitskaya:
Aerodynamic constraints on the production of palatalized trills: the case of the Slavic trilled [r].
- Cheol-jae Seong, Sanghun Kim:
An experimental phonetic study of the interrelationship between prosodic phrase and syntactic structure.
- Sebastian J. G. G. Heid:
Individual differences between vowel systems of German speakers.
- Anton Batliner, Andreas Kießling, Ralf Kompe, Heinrich Niemann, Elmar Nöth:
Tempo and its change in spontaneous speech.
- Bojan Petek, Rastislav Sustarsic:
A corpus-based approach to diphthong analysis of standard Slovenian.
- Lourdes Aguilar, Julia A. Gimenez, Maria Machuca, Rafael Marín, Montse Riera:
Catalan vowel duration.
- Maria Rosaria Caputo:
The intonation of vocatives in spoken Neapolitan Italian.
- Emanuela Magno Caldognetto, Claudio Zmarich, Franco Ferrero:
A comparative acoustic study of spontaneous and read Italian speech.
- Mario Refice, Michelina Savino, Martine Grice:
A contribution to the estimation of naturalness in the intonation of Italian spontaneous speech.
- Sylvia Moosmüller:
Diphthongs and the process of monophthongization in Austrian German: a first approach.
- Steve Hoskins:
The prosody of broad and narrow focus in English: two experiments.
- Alice Turk, Laurence White:
The domain of accentual lengthening in Scottish English.
- Mariette Bessac, Geneviève Caelen-Haumont:
Spontaneous dialogue: some results about the F0 predictions of a pragmatic model of information processing.
- Didier Demolin, Bernard Teston:
Phonetic characteristics of double articulations in some Mangbutu-Efe languages.
- Inmaculada Hernáez, Iñaki Gaminde, Borja Etxebarria, Pilartxo Etxebarria:
Intonation modeling for the southern dialects of the Basque language.
- Peter O'Boyle, Ji Ming, Marie Owens, F. Jack Smith:
From phone identification to phone clustering using mutual information.
- Ahmed-Reda Berrah, Rafael Laboissière:
Phonetic code emergence in a society of speech robots: explaining vowel systems and the MUAF principle.
- Inger Moen, Hanne Gram Simonsen:
Effects of voicing on /t, d/ tongue/palate contact in English and Norwegian.
- Peter Ladefoged, Gunnar Fant:
Fieldwork techniques for relating formant frequency, amplitude and bandwidth.
- Xue Wang, Louis C. W. Pols:
Word juncture modelling based on the TIMIT database.
- Motoko Ueyama:
The phonology and phonetics of second language intonation: the case of "Japanese English".
Confidence Measures in ASR
Speaker and Language Identification
Perception of Prosody
Applications of Speech Technology
- Raymond Lau, Giovanni Flammia, Christine Pao, Victor W. Zue:
Webgalaxy - integrating spoken language and hypertext navigation.
- Michael J. Carey, Eluned S. Parris, Graham Tattersall:
Pitch estimation of singing for re-synthesis and musical transcription.
- Christian Martyn Jones, Satnam Singh Dlay:
Automated lip synchronisation for human-computer interaction and special effect animation.
- Charles T. Hemphill, Yeshwant K. Muthusamy:
Developing web-based speech applications.
- Werner Verhelst:
Automatic post-synchronization of speech utterances.
- Jordi Robert-Ribes, Rami G. Mukhtar:
Automatic generation of hyperlinks between audio and transcript.
- Sebastian Möller, Rainer Schönweiler:
Analysis of infant cries for the early detection of hearing impairment.
- Athanassios Hatzis, Phil D. Green, S. J. Howard:
Optical logo-therapy (OLT): a computer-based real time visual feedback application for speech training.
- Sung-Chien Lin, Lee-Feng Chien, Ming-Chiuan Chen, Lin-Shan Lee, Keh-Jiann Chen:
Intelligent retrieval of very large Chinese dictionaries with speech queries.
- Fulvio Leonardi, Giorgio Micca, Sheyla Militello, Mario Nigra:
Preliminary results of a multilingual interactive voice activated telephone service for people-on-the-move.
- Jean-Christophe Dubois, Yolande Anglade, Dominique Fohr:
Assessment of an operational dialogue system used by a blind telephone switchboard operator.
- Antonio J. Rubio, Pedro García, Ángel de la Torre, José C. Segura, Jesús E. Díaz-Verdejo, Maria C. Benitez, Victoria E. Sánchez, Antonio M. Peinado, Juan M. López-Soler, José L. Pérez-Córdoba:
STACC: an automatic service for information access using continuous speech recognition through telephone line.
- Ramón López-Cózar, Pedro García, Jesús E. Díaz-Verdejo, Antonio J. Rubio:
A voice activated dialogue system for fast-food restaurant applications.
- Paul W. Shields, Douglas R. Campbell:
Multi-microphone sub-band adaptive signal processing for improvement of hearing aid performance.
- Hans Georg Piroth, Thomas Arnhold:
Tactile transmission of intonation and stress.
- Kerttu Huttunen, Pentti Korkko, Martti Sorri:
Hearing impairment simulation: an interactive multimedia programme on the internet for students of speech therapy.
- Sorin Ciocea, Jean Schoentgen, Lise Crevier-Buchman:
Analysis of dysarthric speech by means of formant-to-area mapping.
- Boris Lobanov, Simon V. Brickle, Andrey V. Kubashin, Tatiana V. Levkovskaja:
An intelligent telephone answering system using speech recognition.
- Ulla Ackermann, Bianca Angelini, Fabio Brugnara, Marcello Federico, Diego Giuliani, Roberto Gretter, Heinrich Niemann:
Speedata: a prototype for multilingual spoken data-entry.
- Matti Karjalainen, Péter Boda, Panu Somervuo, Toomas Altosaar:
Applications for the hearing-impaired: evaluation of Finnish phoneme recognition methods.
- Nina Alarotu, Mietta Lennes, Toomas Altosaar, Anja Malm, Matti Karjalainen:
Applications for the hearing-impaired: comprehension of Finnish text with phoneme errors.
- Ute Ehrlich, Gerhard Hanrieder, Ludwig Hitzenberger, Paul Heisterkamp, Klaus Mecklenburg, Peter Regel-Brietzmann:
Access - automated call center through speech understanding system.
- E. Richard Anthony, Charles Bowen, Margot T. Peet, Susan G. Tammaro:
Integrating a radio model with a spoken language interface for military simulations.
- Daniele Falavigna, Roberto Gretter:
On field experiments of continuous digit recognition over the telephone network.
- Xavier Menéndez-Pidal, James B. Polikoff, H. Timothy Bunnell:
An HMM-based phoneme recognizer applied to assessment of dysarthric speech.
- Celinda de la Torre, Gonzalo Alonso:
Multiapplication platform based on technology for mobile telephone network services.
- Els den Os, Lou Boves, David James, Richard Winski, Kurt Fridh:
Field test of a calling card service based on speaker verification and automatic speech recognition.
- Luc E. Julia, Adam Cheyer:
Speech: a privileged modality.
Spontaneous Speech Recognition
- Jean-Luc Gauvain, Lori Lamel, Gilles Adda, Martine Adda-Decker:
Transcription of broadcast news.
- Fil Alleva, Xuedong Huang, Mei-Yuh Hwang, Li Jiang:
Can continuous speech recognizers handle isolated speech?
- Tatsuo Matsuoka, Yuichi Taguchi, Katsutoshi Ohtsuki, Sadaoki Furui, Katsuhiko Shirai:
Toward automatic transcription of Japanese broadcast news.
- Mauro Cettolo, Anna Corazza:
Automatic detection of semantic boundaries.
- Etienne Bauche, Bojana Gajic, Yasuhiro Minami, Tatsuo Matsuoka, Sadaoki Furui:
Connected digit recognition in spontaneous speech.
- Francis Kubala, Hubert Jin, Spyros Matsoukas, Long Nguyen, Richard M. Schwartz, John Makhoul:
Advances in transcription of broadcast news.
Language Specific Segmental Features
Speaker Recognition
- Wei-Ying Li, Douglas D. O'Shaughnessy:
Hybrid networks based on RBFN and GMM for speaker recognition.
- Jialong He, Li Liu, Günther Palm:
A discriminative training algorithm for Gaussian mixture speaker models.
- Douglas A. Reynolds:
Comparison of background normalization methods for text-independent speaker verification.
- Owen Kimball, Michael Schmidt, Herbert Gish, Jason Waterman:
Speaker verification with limited enrollment data.
- Frédéric Bimbot, Hans-Peter Hutter, Cédric Jaboulet, Johan Koolwaaij, Johan Lindberg, Jean-Benoît Pierrot:
Speaker verification in the telephone network: research activities in the CAVE project.
- Mark Kuitert, Lou Boves:
Speaker verification with GSM coded telephone speech.
- Aaron E. Rosenberg, S. Parthasarathy:
Speaker identification with user-selected password phrases.
- Jesper O. Olsen:
Speaker verification based on phonetic decision making.
- Aladdin M. Ariyaeeinia, P. Sivakumaran:
Analysis and comparison of score normalisation methods for text-dependent speaker verification.
- Frederic Jauquet, Patrick Verlinde, Claude Vloeberghs:
Automatic speaker recognition on a vocoder link.
- Frédéric Bimbot, Dominique Genoud:
Likelihood ratio adjustment for the compensation of model mismatch in speaker verification.
- M. Kemal Sönmez, Larry P. Heck, Mitchel Weintraub, Elizabeth Shriberg:
A lognormal tied mixture model of pitch for prosody based speaker recognition.
Speech Synthesis: Linguistic Analysis
Speech Analysis and Modelling
- Hisao Kuwabara:
Acoustic and perceptual properties of phonemes in continuous speech as a function of speaking rate.
- Shrikanth Narayanan, Abeer Alwan, Yong Song:
New results in vowel production: MRI, EPG, and acoustic data.
- Takayuki Arai, Steven Greenberg:
The temporal properties of spoken Japanese are similar to those of English.
- Anna Esposito:
The amplitudes of the peaks in the spectrum: data from /a/ context.
- Natalija Bolfan-Stosic, Mladen Hedjever:
Acoustical characteristics of speech and voice in speech pathology.
- Andreas Kipp, Maria-Barbara Wesenick, Florian Schiel:
Pronunciation modeling applied to automatic segmentation of spontaneous speech.
- Simon Downey, Richard Wiseman:
Dynamic and static improvements to lexical baseforms.
- Andreas Hauenstein:
Signal driven generation of word baseforms from few examples.
- Elizabeth C. Botha, Louis C. W. Pols:
Modeling the acoustic differences between L1 and L2 speech: the short vowels of Afrikaans and South African English.
- Béatrice Vaxelaire, Rudolph Sock:
Laryngeal movements and speech rate: an X-ray investigation.
- Anders Eriksson, Pär Wretling:
How flexible is the human voice? - a case study of mimicry.
- Helmer Strik:
The effect of low-pass filtering on estimated voice source parameters.
- Susan M. Fosnot:
Vowel development of /i/ and /u/ in 15-36 month old children at risk and not at risk to stutter.
- Alan Wrench, Alan D. McIntosh, William J. Hardcastle:
Optopalatograph: development of a device for measuring tongue movement in 3D.
- Juana M. Gutiérrez-Arriola, Francisco M. Gimenez de los Galanes, Mohammed H. Savoji, José Manuel Pardo:
Speech synthesis and prosody modification using segmentation and modelling of the excitation signal.
- Christophe Savariaux, Louis-Jean Boë, Pascal Perrier:
How can the control of the vocal tract limit the speaker's capability to produce the ultimate perceptive objectives of speech?
- Goran S. Jovanovic:
A step toward general model for symbolic description of the speech signal.
- Kiyoshi Furukawa, Masayuki Nakazawa, Takashi Endo, Ryuichi Oka:
Referring in long term speech by using orientation patterns obtained from vector field of spectrum pattern.
Dialogue Systems: Design and Applications
- J. Barnett, S. Anderson, J. Broglio, M. Singh, R. Hudson, S. W. Kuo:
Experiments in spoken queries for document retrieval.
- Frank Seide, Andreas Kellner:
Towards an automated directory information system.
- Lars Bo Larsen:
A strategy for mixed-initiative dialogue control.
- Jim Hugunin, Victor W. Zue:
On the design of effective speech-based interfaces for desktop applications.
- Matthias Denecke, Alex Waibel:
Dialogue strategies guiding users to their communicative goals.
- Sunil Issar:
A speech interface for forms on WWW.
- Giovanni Flammia, Victor W. Zue:
Learning the structure of mixed initiative dialogues using a corpus of annotated conversations.
- Roberto Pieraccini, Esther Levin, Wieland Eckert:
AMICA: the AT&T mixed initiative conversational architecture.
- Alicia Abella, Allen L. Gorin:
Generating semantically consistent inputs to a dialog manager.
- Esther Levin, Roberto Pieraccini:
A stochastic model of computer-human interaction for learning dialogue strategies.
- Manuela Boros, Maria Aretoulaki, Florian Gallwitz, Elmar Nöth, Heinrich Niemann:
Semantic processing of out-of-vocabulary words in a spoken dialogue system.
- Elisabeth Maier:
Clarification dialogues in VERBMOBIL.
Speech Production Modelling
- Levent M. Arslan, David Talkin:
Voice conversion by codebook mapping of line spectral frequencies and excitation spectrum.
- Chafic Mokbel, Guillaume Gravier, Gérard Chollet:
Optimal state dependent spectral representation for HMM modeling: a new theoretical framework.
- Alexandros Potamianos, Petros Maragos:
Speech analysis and synthesis using an AM-FM modulation model.
- Khaled Mawass, Pierre Badin, Gérard Bailly:
Synthesis of fricative consonants by audiovisual-to-articulatory inversion.
- Tom Claes, Ioannis Dologlou, Louis ten Bosch, Dirk Van Compernolle:
New transformations of cepstral parameters for automatic vocal tract length normalization in speech recognition.
- Simon Dobrisek, France Mihelic, Nikola Pavesic:
A multiresolutionally oriented approach for determination of cepstral features in speech recognition.
Speech Enhancement and Noise Mitigation
- Tim Haulick, Klaus Linhard, Peter Schrogmeier:
Residual noise suppression using psychoacoustic criteria.
- B. Yegnanarayana, Carlos Avendaño, Hynek Hermansky, P. Satyanarayana Murthy:
Processing linear prediction residual for speech enhancement.
- Stefan Gustafsson, Rainer Martin:
Combined acoustic echo control and noise reduction for mobile communications.
- Ki Yong Lee, JaeYeol Rheem:
A nonstationary autoregressive HMM and its application to speech enhancement.
- Néstor Becerra Yoma, Fergus R. McInnes, Mervyn A. Jack:
Spectral subtraction and mean normalization in the context of weighted matching algorithms.
- Dionysis E. Tsoukalas, J. Mourjopoulos, George Kokkinakis:
Improving the intelligibility of noisy speech using an audible noise suppression technique.
- Laurent Girin, Gang Feng, Jean-Luc Schwartz:
Noisy speech enhancement by fusion of auditory and visual information: a study of vowel transitions.
- Andreas Engelsberg, Thomas Gülzow:
Spectral subtraction using a non-critically decimated discrete wavelet transform.
- Jen-Tzung Chien, Hsiao-Chuan Wang, Chin-Hui Lee:
Bayesian affine transformation of HMM parameters for instantaneous and supervised adaptation in telephone speech recognition.
- Craig Lawrence, Mazin G. Rahim:
Integrated bias removal techniques for robust speech recognition.
- Detlev Langmann, Alexander Fischer, Friedhelm Wuppermann, Reinhold Haeb-Umbach, Thomas Eisele:
Acoustic front ends for speaker-independent digit recognition in car environments.
- Lionel Delphin-Poulat, Chafic Mokbel:
Signal bias removal using the multi-path stochastic equalization technique.
- Andrej Miksic, Bogomir Horvat:
Subband echo cancellation in automatic speech dialog systems.
- Hesham Tolba, Douglas D. O'Shaughnessy:
Speech enhancement via energy separation.
- Masashi Unoki, Masato Akagi:
A method of signal extraction from noisy signal.
- Jiri Sika, Vratislav Davidek:
Multi-channel noise reduction using wavelet filter bank.
- Imad Abdallah, Silvio Montrésor, Marc Baudry:
Speech signal detection in noisy environment using a local entropic criterion.
- Pedro J. Moreno, Brian S. Eberman:
A new algorithm for robust speech recognition: the delta vector Taylor series approach.
- David Cole, Miles Moody, Sridha Sridharan:
Robust enhancement of reverberant speech using iterative noise removal.
- D. J. Jones, Scott D. Watson, K. G. Evans, Barry M. G. Cheetham, R. A. Reeve:
A network speech echo canceller with comfort noise.
- Amir Hussain, Douglas R. Campbell, Thomas J. Moir:
A new metric for selecting sub-band processing in adaptive speech enhancement systems.
- Hidefumi Kobatake, Hideta Suzuki:
Estimation of LPC cepstrum vector of speech contaminated by additive noise and its application to speech enhancement.
- Sangita Tibrewala, Hynek Hermansky:
Multi-band and adaptation approaches to robust speech recognition.
- Enrique Masgrau, Eduardo Lleida, Luis Vicente:
Non-quadratic criterion algorithms for speech enhancement.
Spoken Language Understanding
- Jeremy H. Wright, Allen L. Gorin, Giuseppe Riccardi:
Automatic acquisition of salient grammar fragments for call-type classification.
- Wolfgang Minker:
Stochastically-based natural language understanding across tasks and languages.
- Michael Riley, Fernando Pereira, Mehryar Mohri:
Transducer composition for context-dependent network expansion.
- Christian Lieske, Johan Bos, Martin Emele, Björn Gambäck, C. J. Rupp:
Giving prosody a meaning.
- Kishore Papineni, Salim Roukos, Todd Ward:
Feature-based language understanding.
- Juan-Carlos Amengual, José-Miguel Benedí, Klaus Beulen, Francisco Casacuberta, M. Asunción Castaño, Antonio Castellanos, Víctor M. Jiménez, David Llorens, Andrés Marzal, Hermann Ney, Federico Prat, Enrique Vidal, Juan Miguel Vilar:
Speech translation based on automatically trainable finite-state models.
Language Model Adaptation
- Yoshihiko Gotoh, Steve Renals:
Document space models using latent semantic analysis.
- Sven C. Martin, Jörg Liermann, Hermann Ney:
Adaptive topic-dependent language modelling using word-based varigrams.
- Jerome R. Bellegarda:
A latent semantic analysis framework for large-span language modeling.
- Richard M. Schwartz, Toru Imai, Francis Kubala, Long Nguyen, John Makhoul:
A maximum likelihood model for topic classification of broadcast news.
- Cosmin Popovici, Paolo Baggia:
Language modelling for task-oriented domains.
- Sung-Chien Lin, Chi-Lung Tsai, Lee-Feng Chien, Keh-Jiann Chen, Lin-Shan Lee:
Chinese language model adaptation based on document classification and multiple domain-specific language models.
Prosody and Speech Recognition/Understanding
- Philippe Langlais:
Estimating prosodic weights in a syntactic-rhythmical prediction system.
- Kazuhiko Ozeki, Kazuyuki Kousaka, Yujie Zhang:
Syntactic information contained in prosodic features of Japanese utterances.
- Grace Chung, Stephanie Seneff:
Hierarchical duration modelling for speech recognition using the ANGIE framework.
- Volker Strom, Anja Elsner, Wolfgang Hess, Walter Kasper, Alexandra Klein, Hans-Ulrich Krieger, Jörg Spilker, Hans Weber, Günther Görz:
On the use of prosody in a speech-to-speech translator.
- Vincent J. van Heuven, Judith Haan, Jos J. A. Pacilly:
Automatic recognition of sentence type from prosody in Dutch.
- Paul Munteanu, Bertrand Caillaud, Jean-François Serignat, Geneviève Caelen-Haumont:
Automatic word demarcation based on prosody.
Wideband Speech Coding
Speech Recognition in Adverse Environments CSR and Error Analysis
- Wei-Wen Hung, Hsiao-Chuan Wang:
A comparative analysis of blind channel equalization methods for telephone speech recognition.
- Wei-Wen Hung, Hsiao-Chuan Wang:
HMM retraining based on state duration alignment for noisy speech recognition.
- Yasuhiro Komori, Tetsuo Kosaka, Hiroki Yamamoto, Masayuki Yamada:
Fast parallel model combination noise adaptation processing.
- Takashi Endo, Shigeki Nagaya, Masayuki Nakazawa, Kiyoshi Furukawa, Ryuichi Oka:
Speech recognition module for CSCW using a microphone array.
- Jiqing Han, Munsung Han, Gyu-Bong Park, Jeongue Park, Wen Gao:
Relative mel-frequency cepstral coefficients compensation for robust telephone speech recognition.
- Seiichi Yamamoto, Masaki Naito, Shingo Kuroiwa:
Robust speech detection method for speech recognition system for telecommunication networks and its field trial.
- Laurent Mauuary, Lamia Karray:
The tuning of speech detection in the context of a global evaluation of a voice response system.
- C. Julian Chen, Ramesh A. Gopinath, Michael D. Monkowski, Michael A. Picheny, Katherine Shen:
New methods in continuous Mandarin speech recognition.
- Michelle S. Spina, Victor W. Zue:
Automatic transcription of general audio data: effect of environment segmentation on phonetic recognition.
- Alfred Ying Pang Ng, Lai-Wan Chan, P. C. Ching:
Automatic recognition of continuous Cantonese speech with very large vocabulary.
- Yifan Gong:
Source normalization training for HMM applied to noisy telephone speech recognition.
- Joao P. Neto, Ciro Martins, Luís B. Almeida:
The development of a speaker independent continuous speech recognizer for Portuguese.
- Lin Chase:
Blame assignment for errors made by large vocabulary speech recognizers.
- Atsushi Nakamura:
Predicting speech recognition performance.
- Scott D. Watson, Barry M. G. Cheetham, P. A. Barrett, W. T. K. Wong, A. V. Lewi:
A voice activity detector for the ITU-T 8 kbit/s speech coding standard G.729.
- Yeshwant K. Muthusamy, John J. Godfrey:
Vocabulary-independent recognition of American Spanish phrases and digit strings.
- Charles T. Hemphill, Yeshwant K. Muthusamy:
Developing web-based speech applications.
- Michael Meyer, Hermann Hild:
Recognition of spoken and spelled proper names.
- Takao Kobayashi, Takashi Masuko, Keiichi Tokuda:
HMM compensation for noisy speech recognition based on cepstral parameter generation.
- George Nokas, Evangelos Dermatas, George Kokkinakis:
On the robustness of the critical-band adaptive filtering method for multi-source noisy speech recognition.
- Cun-tai Guan, Shu-hung Leung, Wing-hong Lau:
A space transformation approach for robust speech recognition in noisy environments.
- Tzur Vaich, Arnon Cohen:
Robust isolated word recognition using WSP-PMC combination.
Multimodal Speech Processing, Emerging Techniques and Applications
- Spyros Raptis, George Carayannis:
Fuzzy logic for rule-based formant speech synthesis.
- Pierre Jourlin, Juergen Luettin, Dominique Genoud, Hubert Wassner:
Integrating acoustic and labial information for speaker identification and verification.
- Kenney Ng, Victor W. Zue:
Subword unit representations for spoken document retrieval.
- Pascal Teissier, Jean-Luc Schwartz, Anne Guérin-Dugué:
Non-linear representations, sensor reliability estimation and context-dependent fusion in the audiovisual recognition of speech in noise.
- Philippe Renevey, Andrzej Drygajlo:
Securized flexible vocabulary voice messaging system on Unix workstation with ISDN connection.
- Houda Mokbel, Denis Jouvet:
Automatic derivation of multiple variants of phonetic transcriptions from acoustic signals.
- Satoshi Nakamura, Ron Nagai, Kiyohiro Shikano:
Improved bimodal speech recognition using tied-mixture HMMs and 5000 word audio-visual synchronous database.
- Philippe Depambour, Régine André-Obrecht, Bernard Delyon:
On the use of phone duration and segmental processing to label speech signal.
- Martin Paping, Thomas Fahnle:
Automatic detection of disturbing robot voice- and ping pong-effects in GSM transmitted speech.
- Joseph Di Martino:
Speech synthesis using phase vocoder techniques.
- Ramesh R. Sarukkai, Craig Hunter:
Integration of eye fixation information with speech recognition systems.
- Yoshihisa Nakatoh, M. Tsushima, Takeshi Norimatsu:
Generation of broadband speech from narrowband speech using piecewise linear mapping.
- Ian E. C. Rogers:
An assessment of the benefits active noise reduction systems provide to speech intelligibility in aircraft noise environments.
- Jonas Beskow, Kjell Elenius, Scott McGlashan:
OLGA - a dialogue system with an animated talking agent.
- Sandrine Robbe, Noelle Carbonell, Claude Valot:
Towards usable multimodal command languages: definition and ergonomic assessment of constraints on users' spontaneous speech and gestures.
- Bernhard Suhm, Alex Waibel:
Exploiting repair context in interactive error recovery.
- Lionel Revéret, Frederique Garcia, Christian Benoît, Eric Vatikiotis-Bateson:
An hybrid image processing approach to liptracking independent of head orientation.
- Bertrand Le Goff:
Automatic modeling of coarticulation in text-to-visual speech synthesis.
- Ali Adjoudani, Thierry Guiard-Marigny, Bertrand Le Goff, Lionel Revéret, Christian Benoît:
A multimedia platform for audio-visual speech processing.
- Hiroya Fujisaki, Hiroyuki Kameda, Sumio Ohno, Takuya Ito, Ken Tajima, Kenji Abe:
An intelligent system for information retrieval over the internet through spoken dialogue.
- Yasemin Yardimci, A. Enis Çetin, Rashid Ansari:
Data hiding in speech using phase coding.
- Denis Burnham, John Fowler, Michelle Nicol:
CAVE: an on-line procedure for creating and running auditory-visual speech perception experiments - hardware, software, and advantages.
Databases, Tools and Evaluations
- Florian Schiel, Christoph Draxler, Hans G. Tillmann:
The Bavarian archive for speech signals: resources for the speech community.
- Christoph Draxler:
WWWTranscribe - a modular transcription system based on the world wide web.
- Inger S. Engberg, Anya Varnich Hansen, Ove Andersen, Paul Dalsgaard:
Design, recording and verification of a Danish emotional speech database.
- Maxine Eskenazi, Christopher Hogan, J. Allen, Robert E. Frederking:
Issues in database creation: recording new populations, faster and better labelling.
- Stefan Feldes, Bernhard Kaspar, Denis Jouvet:
Design and analysis of a German telephone speech database for phoneme based training.
- João Paulo Neto, Ciro Martins, Hugo Meinedo, Luís B. Almeida:
The design of a large vocabulary speech corpus for Portuguese.
- Lennart Nord, Britta Hammarberg, Elisabet Lundstrom:
Continued investigations of laryngectomee speech in noise - measurements and intelligibility tests.
- Léon J. M. Rothkrantz, W. A. Th. Manintveld, M. M. M. Rats, Robert J. van Vark, J. P. M. de Vreught, Henk Koppelaar:
An appreciation study of an ASR inquiry system.
- Kamel Bensaber, Paul Munteanu, Jean-François Serignat, Pascal Perrier:
Object-oriented modeling of articulatory data for speech research information systems.
- Woosung Kim, Myoung-Wan Koo:
A Korean speech corpus for train ticket reservation aid system based on speech recognition.
- Dawn Dutton, Candace A. Kamm, Susan Boyce:
Recall memory for earcons.
- Odile Mella, Dominique Fohr:
Semi-automatic phonetic labelling of large corpora.
- Stefan Grocholewski:
CORPORA - speech database for Polish diphones.
- Christel Müller, Thomas Ziem:
Multilingual speech interfaces (MSI) and dialogue design environments for computer telephony services.
- John H. L. Hansen, Sahar E. Bou-Ghazale:
Getting started with SUSAS: a speech under simulated and actual stress database.
- Richard Sproat, Paul Taylor, Michael Tanenblatt, Amy Isard:
A markup language for text-to-speech synthesis.
- Shuichi Itahashi, Naoko Ueda, Mikio Yamamoto:
Several measures for selecting suitable speech corpora.
- Irene Chatzi, Nikos Fakotakis, George Kokkinakis:
Greek speech database for creation of voice driven teleservices.
Speaker Adaptation I
Assessment Methods
- Alvin F. Martin, George R. Doddington, Terri Kamm, Mark Ordowski, Mark A. Przybocki:
The DET curve in assessment of detection task performance.
- Harald Klaus, Ekkehard Diedrich, Astrid Dehnel, Jens Berger:
Speech quality evaluation of hands-free terminals.
- David S. Pallett, Jonathan G. Fiscus, William M. Fisher, John S. Garofolo:
Use of broadcast news materials for speech recognition benchmark tests.
- Norman M. Fraser:
Spoken dialogue system evaluation: a first framework for reporting results.
- Niels Ole Bernsen, Hans Dybkjær, Laila Dybkjær, Vytautas Zinkevicius:
Generality and transferability: two issues in putting a dialogue evaluation tool into practical use.
- David A. van Leeuwen, Herman J. M. Steeneken:
Within-speaker variability of the word error rate for a continuous speech recognition system.
Education for Language and Speech Communication
- Mark Huckvale, Christian Benoît, Chris Bowerman, Anders Eriksson, Mike Rosner, Mark Tatham, Briony Williams:
Opportunities for computer-aided instruction in phonetics and speech communication provided by the internet.
- Gerrit Bloothooft:
The landscape of future education in speech communication sciences.
- Kåre Sjölander, Joakim Gustafson:
An integrated system for teaching spoken dialogue systems technology.
- Janet Beck, Bernard Camilleri, Hilde Chantrain, Anu Klippi, Marianne Leterme, Matti Lehtihalmes, Peter Schneider, Wilhelm Vieregge, Eva Wigforss:
Communication science within education for logopedics/speech and language therapy in Europe: the state of the art.
- Phil Green, Carlos Espain:
Education in spoken language engineering in Europe.
- Valérie Hazan, Wim A. van Dommelen:
A survey of phonetics education in Europe.
Hybrid Systems for ASR
- Xin Tu, Yonghong Yan, Ronald A. Cole:
Matching training and testing criteria in hybrid speech recognition systems.
- Stéphane Dupont, Christophe Ris, Olivier Deroo, Vincent Fontaine, Jean-Marc Boite, L. Zanoni:
Context independent and context dependent hybrid HMM/ANN systems for vocabulary independent tasks.
- Jean Hennebert, Christophe Ris, Hervé Bourlard, Steve Renals, Nelson Morgan:
Estimation of global posteriors and forward-backward training of hybrid HMM/ANN systems.
- Gethin Williams, Steve Renals:
Confidence measures for hybrid HMM/ANN speech recognition.
- Gary Cook, Steve R. Waterhouse, Anthony J. Robinson:
Ensemble methods for connectionist acoustic modelling.
- Jürgen Fritsch, Michael Finke:
Improving performance on Switchboard by combining hybrid HME/HMM and mixture of Gaussians acoustic models.
Topic and Dialogue Dependent Language Modelling
Lipreading
- Juergen Luettin:
Towards speaker independent continuous speechreading.
- William Goldenthal, Keith Waters, Jean-Manuel Van Thong, Oren Glickman:
Driving synthetic mouth gestures: phonetic recognition for FaceMe!
- Alexandrina Rogozan, Paul Deléglise:
Continuous visual speech recognition using geometric lip-shape models and neural networks.
- Jonas Beskow, Martin Dahlquist, Björn Granström, Magnus Lundeberg, Karl-Erik Spens, Tobias Öhman:
The Teleface project - multi-modal speech-communication for the hearing impaired.
- Rainer Stiefelhagen, Uwe Meier, Jie Yang:
Real-time lip-tracking for lipreading.
- Lionel Revéret:
From raw images of the lips to articulatory parameters: a viseme-based prediction.
Articulatory Modelling
- Bruno Mathieu, Yves Laprie:
Adaptation of Maeda's model for acoustic to articulatory inversion.
- Yohan Payan, Pascal Perrier:
Why should speech control studies based on kinematics be considered with caution? Insights from a 2D biomechanical model of the tongue.
- Vittorio Sanguineti, Rafael Laboissière, David J. Ostry:
An integrated model of the biomechanics and neural control of the tongue, jaw, hyoid and larynx system.
- M. Mohammad, E. Moore, J. N. Carter, Christine H. Shadle, S. J. Gunn:
Using MRI to image the moving vocal tract during speech.
- Eric Vatikiotis-Bateson, Hani Yehia:
Unified physiological model of audible-visible speech production.
- Hélène Loevenbruck, Pascal Perrier:
Motor control information recovering from the dynamics with the EP hypothesis.
Front-Ends and Adaptation to Acoustics Speaker Adaptation
- Yasuhiro Komori, Tetsuo Kosaka, Masayuki Yamada, Hiroki Yamamoto:
Speaker adaptation for context-dependent HMM using spatial relation of both phoneme context hierarchy and speakers.
- Masayuki Yamada, Yasuhiro Komori, Tetsuo Kosaka, Hiroki Yamamoto:
Fast algorithm for speech recognition using speaker cluster HMM.
- Timothy J. Hazen, James R. Glass:
A comparison of novel techniques for instantaneous speaker adaptation.
- Yoshikazu Yamaguchi, Satoshi Takahashi, Shigeki Sagayama:
Fast adaptation of acoustic models to environmental noise using Jacobian adaptation algorithm.
- Ilija Zeljkovic, Shrikanth Narayanan, Alexandros Potamianos:
Unsupervised HMM adaptation based on speech-silence discrimination.
- Mohamed Afify, Yifan Gong, Jean Paul Haton:
Correlation based predictive adaptation of hidden Markov models.
- Vassilios Diakoloukas, Vassilios Digalakis:
Adaptation of hidden Markov models using multiple stochastic transformations.
- M. J. F. Gales:
Transformation smoothing for speaker and environmental adaptation.
- Vincent Fontaine, Christophe Ris, Jean-Marc Boite:
Nonlinear discriminant analysis for improved speech recognition.
- Jürgen Tchorz, Klaus Kasper, Herbert Reininger, Birger Kollmeier:
On the interplay between auditory-based features and locally recurrent neural networks for robust speech recognition in noise.
- Nelson Morgan, Eric Fosler-Lussier, Nikki Mirghafori:
Speech recognition using on-line estimation of speaking rate.
- John N. Holmes, Wendy J. Holmes, Philip N. Garner:
Using formant frequencies in speech recognition.
- Puming Zhan, Martin Westphal, Michael Finke, Alex Waibel:
Speaker normalization and speaker adaptation - a combination for conversational speech recognition.
- Yuqing Gao, Mukund Padmanabhan, Michael Picheny:
Speaker adaptation based on pre-clustering training speakers.
- Mike Lincoln, Stephen Cox, Simon Ringland:
A fast method of speaker normalisation using formant estimation.
- Lutz Welling, N. Haberland, Hermann Ney:
Acoustic front-end optimization for large vocabulary speech recognition.
- B. T. Logan, A. J. Robinson:
Improving autoregressive hidden Markov model recognition accuracy using a non-linear frequency scale with application to speech enhancement.
- Tsuneo Nitta, Akinori Kawamura:
Designing a reduced feature-vector set for speech recognition by using KL/GPD competitive training.
- Scott Shaobing Chen, Peter DeSouza:
Speaker adaptation by correlation (ABC).
Speech Perception
- William A. Ainsworth, Georg F. Meyer:
Preliminary experiments on the perception of double semivowels.
- Niels O. Schiller:
Does syllable frequency affect production time in a delayed naming task?
- Andrew C. Morris, Gerrit Bloothooft, William J. Barry, Bistra Andreeva, Jacques C. Koreman:
Human and machine identification of consonantal place of articulation from vocalic transition segments.
- Jon Barker, Martin Cooke:
Modelling the recognition of spectrally reduced speech.
- Christophe Pallier, Anne Cutler, Núria Sebastián-Gallés:
Prosodic structure and phonetic processing: a cross-linguistic study.
- R. J. J. H. van Son, Louis C. W. Pols:
The correlation between consonant identification and the amount of acoustic consonant reduction.
- Anne Bonneau:
Relevant spectral information for the identification of vowel features from bursts.
- Aijun Li:
Perceptual study of intersyllabic formant transitions in synthesized V1-V2 in standard Chinese.
- Oleg P. Skljarov:
Role of perception of rhythmically organized speech in consolidation process of long-term memory traces (LTM-traces) and in speech production controlling.
- Arie H. van der Lugt:
Sequential probabilities as a cue for segmentation.
- Susan Jansens, Gerrit Bloothooft, Guus de Krom:
Perception and acoustics of emotions in singing.
- Christophe Pallier:
Phonemes and syllables in speech perception: size of attentional focus in French.
- Shinichi Tokuma:
Quality of a vowel with formant undershoot: a preliminary perceptual study.
- Mariëtte Koster, Anne Cutler:
Segmental and suprasegmental contributions to spoken-word recognition in Dutch.
- Dawn M. Behne, Peter E. Czigler, Kirk P. H. Sullivan:
Perception of vowel duration and spectral characteristics in Swedish.
- Adrian Neagu, Gérard Bailly:
Relative contributions of noise burst and vocalic transitions to the perceptual identification of stop consonants.
- Satoshi Kitagawa, Makoto Hashimoto, Norio Higuchi:
Effect of speaker familiarity and background noise on acoustic features used in speaker identification.
- Michel Pitermann:
Dynamic versus static specification for the perceptual identity of a coarticulated vowel.
- Madelaine Plauché, Cristina Delogu, John J. Ohala:
Asymmetries in consonant confusion.
- Nicolas Dumay, Monique Radeau:
Rime and syllabic effects in phonological priming between French spoken words.
- Weizhong Zhu, Hideki Kasuya:
Roles of static and dynamic features of formant trajectories in the perception of talker individuality.
Dialogue Systems: Linguistic Structures, Modelling and Evaluation
- Chih-mei Lin, Shrikanth Narayanan, E. Russell Ritenour:
Database management and analysis for spoken dialog systems: methodology and tools.
- Candace A. Kamm, Shrikanth Narayanan, Dawn Dutton, E. Russell Ritenour:
Evaluating spoken dialog systems for telecommunication services.
- Xavier Pouteau, Emiel Krahmer, Jan Landsbergen:
Robust spoken dialogue management for driver information systems.
- Yue-Shi Lee, Hsin-Hsi Chen:
Using acoustic and prosodic cues to correct Chinese speech repairs.
- Nils Dahlbäck, Arne Jönsson:
Integrating domain specific focusing in dialogue models.
- Marilyn A. Walker, Donald Hindle, Jeanne Fromer, Giuseppe Di Fabbrizio, Craig Mestel:
Evaluating competing agent strategies for a voice email agent.
- Donna K. Byron, Peter A. Heeman:
Discourse marker use in task-oriented spoken dialog.
- Victor W. Zue, Stephanie Seneff, James R. Glass, I. Lee Hetherington, Edward Hurley, Helen M. Meng, Christine Pao, Joseph Polifroni, Rafael Schloming, Philipp Schmid:
From interface to content: translingual access and delivery of on-line information.
- Jan Alexandersson, Norbert Reithinger:
Learning dialogue structures from a corpus.
- Norbert Reithinger, Martin Klesen:
Dialogue act classification using language models.
- Didier Pernel:
User's multiple goals in spoken dialogue.
- Noriko Suzuki, Seiji Inokuchi, Kazuo Ishii, Michio Okada:
Chatting with interactive agent.
- Gavin E. Churcher, Eric Atwell, Clive Souter:
Generic template for the evaluation of dialogue management systems.
- Yasuhisa Niimi, Takuya Nishimoto, Yutaka Kobayashi:
Analysis of interactive strategy to recover from misrecognition of utterances including multiple information items.
- Francois-Arnould Mathieu, Bertrand Gaiffe, Jean-Marie Pierrel:
A referential approach to reduce perplexity in the vocal command system COMPPA.
- Aristomenis Thanopoulos, Nikos Fakotakis, George Kokkinakis:
Linguistic processor for a spoken dialogue system based on island parsing techniques.
- Brian Mellor, Chris Baber:
Modelling of speech-based user interfaces.
- Beth Ann Hockey, Deborah Rossen-Knill, Beverly Spejewski, Matthew Stone, Stephen Isard:
Can you predict responses to yes/no questions? yes, no, and stuff.
- Jens-Uwe Möller:
DIA-MOLE: an unsupervised learning approach to adaptive dialogue models for spoken dialogue systems.
- Joakim Gustafson, Anette Larsson, Rolf Carlson, K. Hellman:
How do system questions influence lexical choices in user answers?
Speaker Recognition and Language Identification
- Kuo-Hwei Yuo, Hsiao-Chuan Wang:
Gaussian mixture models with common principal axes and their application in text-independent speaker identification.
- Dominik R. Dersch, Robin W. King:
Speaker models designed from complete data sets: a new approach to text-independent speaker verification.
- Rivarol Vergin, Douglas D. O'Shaughnessy:
A double Gaussian mixture modeling approach to speaker recognition.
- Mohamed Afify, Yifan Gong, Jean Paul Haton:
An acoustic subword unit approach to non-linguistic speech feature identification.
- Chakib Tadj, Pierre Dumouchel, Yu Fang:
N-best GMM's for speaker identification.
- Guillaume Gravier, Chafic Mokbel, Gérard Chollet:
Model dependent spectral representations for speaker recognition.
- Roland Auckenthaler, John S. Mason:
Equalizing sub-band error rates in speaker recognition.
- Stefan Slomka, Sridha Sridharan:
Automatic gender identification under adverse conditions.
- Yizhar Lavner, Isak Gath, Judith Rosenhouse:
Acoustic features and perceptive processes in the identification of familiar voices.
- Leandro Rodríguez Liñares, Carmen García-Mateo:
On the use of acoustic segmentation in speaker identification.
- Herman J. M. Steeneken, David A. van Leeuwen:
Speaker recognition by humans and machines.
- Karsten Kumpf, Robin W. King:
Foreign speaker accent classification using phoneme-dependent accent discrimination models and comparisons with human perception benchmarks.
- Li Liu, Jialong He, Günther Palm:
A comparison of human and machine in speaker recognition.
- Simo M. A. Goddijn, Guus de Krom:
Evaluation of second language learners' pronunciation using hidden Markov models.
- Brian S. Eberman, Pedro J. Moreno:
Delta vector Taylor series environment compensation for speaker recognition.
- Jonathan Hume:
Wavelet-like regression features in the cepstral domain for speaker recognition.
- Rathinavelu Chengalvarayan:
Minimum classification error linear regression (MCELR) for speaker adaptation using HMM with trend functions.
- Nikos Fakotakis, Kallirroi Georgila, Anastasios Tsopanoglou:
A continuous HMM text-independent speaker recognition system based on vowel spotting.
- Johan Koolwaaij, Lou Boves:
On the independence of digits in connected digit strings.
- Johan Koolwaaij, Lou Boves:
A new procedure for classifying speakers in speaker verification systems.
- Claude Montacié, Marie-José Caraty:
Sound channel video indexing.
- Javier Hernando, Climent Nadeu:
CDHMM speaker recognition by means of frequency filtering of filter-bank energies.
Style and Accent Recognition
Towards Robust ASR for Car and Telephone Applications
Language-Specific Systems
- Briony Williams, Stephen Isard:
A keyvowel approach to the synthesis of regional accents of English.
- Attila Ferencz, Radu Arsinte, István Nagy, Teodora Ratiu, Maria Ferencz, Gavril Toderean, Diana Zaiu, Tunde-Csilla Kovács, Lajos Simon:
Experimental implementation of pitch-synchronous synthesis methods for the ROMVOX text-to-speech system.
- Bernd Möbius, Richard Sproat, Jan P. H. van Santen, Joseph P. Olive:
The Bell Labs German text-to-speech system: an overview.
- Susan Fitt:
The generation of regional pronunciations of English for speech synthesis.
- Elena Pavlova, Yuri Pavlov, Richard Sproat, Chilin Shih, Jan P. H. van Santen:
Bell Laboratories Russian text-to-speech system.
- Antonio Bonafonte, Ignasi Esquerra, Albert Febrer, Francesc Vallverdú:
A bilingual text-to-speech system in Spanish and Catalan.
Pronunciation Models
Auditory Modelling and Psychoacoustics
Voice Conversion and Data Driven F0-Models
- Jan P. H. van Santen:
Combinatorial issues in text-to-speech synthesis.
- Olivier Boëffard, F. Emerard:
Application-dependent prosodic models for text-to-speech synthesis and automatic design of learning database corpus using genetic algorithm.
- Eduardo López Gonzalo, Jose M. Rodriguez-Garcia, Luis A. Hernández Gómez, Juan M. Villar:
Automatic corpus-based training of rules for prosodic generation in text-to-speech.
- Eun-Kyoung Kim, Sangho Lee, Yung-Hwan Oh:
Hidden Markov model based voice conversion using dynamic characteristics of speaker.
- Takayoshi Yoshimura, Takashi Masuko, Keiichi Tokuda, Takao Kobayashi, Tadashi Kitamura:
Speaker interpolation in HMM-based speech synthesis system.
- Vassilios Darsinos, Dimitrios Galanis, George Kokkinakis:
Designing a speaker adaptable formant-based text-to-speech system.
Vocal Tract Analysis
F0 and Duration Modelling, Spoken Language Processing
- Marcel Riedi:
Modeling segmental duration with multivariate adaptive regression splines.
- Fabrice Malfrère, Thierry Dutoit:
High-quality speech synthesis for phonetic speech segmentation.
- Nick Campbell, Yoshiharu Itoh, Wen Ding, Norio Higuchi:
Factors affecting perceived quality and intelligibility in the CHATR concatenative speech synthesiser.
- Christoph Neukirchen, Daniel Willett, Gerhard Rigoll:
Reduced lexicon trees for decoding in a MMI-connectionist/HMM speech recognition system.
- Jean Véronis, Philippe Di Cristo, Fabienne Courtois, Benoit Lagrue:
A stochastic model of intonation for French text-to-speech synthesis.
- Angelien A. Sanderman, René Collier:
Phonetic rules for a phonetic-to-speech system.
- Jan P. H. van Santen, Chilin Shih, Bernd Möbius, Evelyne Tzoukermann, Michael Tanenblatt:
Multi-lingual duration modeling.
- Plinio A. Barbosa:
A model of segment (and pause) duration generation for Brazilian Portuguese text-to-speech synthesis.
- Ariane Halber, David Roussel:
Parsing strategy for spoken language interfaces with a lexicalized tree grammar.
- Jan W. Amtrup, Henrik Heine, Uwe Jost:
What's in a word graph? Evaluation and enhancement of word lattices.
- Christoph Tillmann, Stephan Vogel, Hermann Ney, A. Zubiaga, Hassan Sawaf:
Accelerated DP based search for statistical translation.
- Ken Fujisawa, Toshio Hirai, Norio Higuchi:
Use of pitch pattern improvement in the CHATR speech synthesis system.
- Gerald Corrigan, Noel Massey, Orhan Karaali:
Generating segment durations in a text-to-speech system: a hybrid rule-based/neural network approach.
- Yasushi Ishikawa, Takashi Ebihara:
On the global F0 shape model using a transition network for Japanese text-to-speech systems.
- José Colás, Juan Manuel Montero, Javier Ferreiros, José Manuel Pardo:
An alternative and flexible approach in robust information retrieval systems.
- Keiko Horiguchi, Alexander Franz:
A probabilistic approach to analogical speech translation.
- Marie-José Caraty, Claude Montacié, Fabrice Lefèvre:
Dynamic lexicon for a very large vocabulary vocal dictation.
Language Modelling
- Encarna Segarra, Lluís F. Hurtado:
Construction of language models using the morphic generator grammatical inference (MGGI) methodology.
- Shuwu Zhang, Taiyi Huang:
An integrated language modeling with n-gram model and WA model for speech recognition.
- Ye-Yi Wang, Alex Waibel:
Statistical analysis of dialogue structure.
- Philip Clarkson, Ronald Rosenfeld:
Statistical language modeling using the CMU-Cambridge toolkit.
- Gilles Adda, Martine Adda-Decker, Jean-Luc Gauvain, Lori Lamel:
Text normalization and speech recognition in French.
- Géraldine Damnati, Jacques Simonin:
A novel tree-based clustering algorithm for statistical language modeling.
- Shoichi Matsunaga, Shigeki Sagayama:
Variable-length language modeling integrating global constraints.
- Kamel Smaïli, Imed Zitouni, François Charpillet, Jean Paul Haton:
An hybrid language model for a continuous dictation prototype.
- Guy Perennou, L. Pousse:
Dealing with pronunciation variants at the language model level for the continuous automatic speech recognition of French.
- Ernst Günter Schukat-Talamazzini, Florian Gallwitz, Stefan Harbeck, Volker Warnke:
Rational interpolation of maximum likelihood predictors in stochastic language modeling.
- Akinori Ito, Hideyuki Saitoh, Masaharu Katoh, Masaki Kohda:
N-gram language model adaptation using small corpus for spoken dialog recognition.
- Man-Hung Siu, Mari Ostendorf:
Variable n-gram language modeling and extensions for conversational speech.
- Petra Geutner:
Fuzzy class rescoring: a part-of-speech language model.
- Akito Nagai, Yasushi Ishikawa:
Speech understanding based on integrating concepts by conceptual dependency.
- Fabio Brugnara, Marcello Federico:
Dynamic language models for interactive speech applications.
- George Demetriou, Eric Atwell, Clive Souter:
Large-scale lexical semantics for speech recognition support.
- Hajime Tsukada, Hirofumi Yamamoto, Yoshinori Sagisaka:
Integration of grammar and statistical language constraints for partial word-sequence recognition.
- Paul Taylor, Simon King, Stephen Isard, Helen Wright, Jacqueline C. Kowtko:
Using intonation to constrain language models in speech recognition.
- Peter A. Heeman, James F. Allen:
Incorporating POS tagging into language modeling.
- C. Uhrik, W. Ward:
Confidence metrics based on n-gram language model backoff behaviors.
- Ciprian Chelba, David Engle, Frederick Jelinek, Victor Jimenez, Sanjeev Khudanpur, Lidia Mangu, Harry Printz, Eric Ristad, Ronald Rosenfeld, Andreas Stolcke, Dekai Wu:
Structure and performance of a dependency language model.
- Andreas Stolcke:
Modeling linguistic segment and turn boundaries for n-best rescoring of spontaneous speech.
- P. E. Kenne, Mary O'Kane:
Hybrid language models: is simpler better?
- Thorsten Brants:
Internal and external tagsets in part-of-speech tagging.
Auditory Modelling and Psychoacoustics, Neural Networks for Speech Processing and Recognition
- Laurent Varin, Frédéric Berthommier:
A probabilistic model of double-vowel segregation.
- Habibzadeh V. Houshang, Shigeyoshi Kitazawa:
Stimulus signal estimation from auditory-neural transduction inverse processing.
- Chakib Tadj, Pierre Dumouchel, Franck Poirier:
FDVQ based keyword spotter which incorporates a semi-supervised learning for primary processing.
- V. Lublinskaja, Christian Sappok:
The initial time span of auditory processing used for speaker attribution of the speech signal.
- Nikko Ström:
Sparse connection and pruning in large dynamic artificial neural networks.
- Roxana Teodorescu, Dirk Van Compernolle, Ioannis Dologlou:
A modular initialization scheme for better speech recognition performance using hybrid systems of MLPs/HMMs.
- Tatiana V. Chernigovskaya:
Lateralization for auditory perception of foreign words.
- Yuri A. Kosarev, Pavel Jarov, Alexander Osipov:
The structural weighted sets method for continuous speech and text recognition.
- C. J. Sumner, D. F. Gillies:
Lateral inhibitory networks for auditory processing.
- Henning Reetz:
Missing fundamentals: a problem of auditory or mental processing?
- Felix Freitag, Enric Monte, Josep M. Salavedra:
Predictive neural networks applied to phoneme recognition.
- Suhardi, Klaus Fellbaum:
Empirical comparison of two multilayer perceptron-based keyword speech recognition algorithms.
- Toshiaki Fukada, Sophie Aveline, Mike Schuster, Yoshinori Sagisaka:
Segment boundary estimation using recurrent neural networks.
- Mike Schuster:
Incorporation of HMM output constraints in hybrid NN/HMM systems during training.
- Ludmila Babkina, Sergey Koval, Alexander Molchanov:
Principles of the hearing periphery functioning in new methods of pitch detection and speech enhancement.
- Christine Meunier, Alain Content, Ulrich H. Frauenfelder, Ruth Kearns:
The locus of the syllable effect: prelexical or lexical?
- Robin J. Lickley, Ellen Gurman Bard:
On not remembering disfluencies.
- T. Andringa:
Using an auditory model and leaky autocorrelators to tune in to speech.
Last update Fri May 25 08:23:03 2012 CET by the DBLP Team. Data released under the ODC-BY 1.0 license.