


SLT 2012: Miami, FL, USA
- 2012 IEEE Spoken Language Technology Workshop (SLT), Miami, FL, USA, December 2-5, 2012. IEEE 2012, ISBN 978-1-4673-5125-6

- Teruhisa Misu, Hideki Kashioka:

Simultaneous feature selection and parameter optimization for training of dialog policy by reinforcement learning. 1-6 - Filip Jurcícek:

Reinforcement learning for spoken dialogue systems using off-policy natural gradient method. 7-12 - Zhuoran Wang, Oliver Lemon:

A nonparametric Bayesian approach to learning multimodal interaction management. 13-18 - Sajad Shirali-Shahreza, Gerald Penn:

Realistic answer verification: An analysis of user errors in a sentence-repetition task. 19-24 - Svetlana Stoyanchev, Philipp Salletmayr, Jingbo Yang, Julia Hirschberg:

Localized detection of speech recognition errors. 25-30 - Milica Gasic, Matthew Henderson, Blaise Thomson, Pirros Tsiakoulis, Steve J. Young:

Policy optimisation of POMDP-based dialogue systems without state space compression. 31-36 - Blaise Thomson, Milica Gasic, Matthew Henderson, Pirros Tsiakoulis, Steve J. Young:

N-best error simulation for training spoken dialogue systems. 37-42 - Manolis Perakakis, Alexandros Potamianos:

Affective evaluation of a mobile multimodal dialogue system using brain signals. 43-48 - Fabrizio Morbini, Kartik Audhkhasi, Ron Artstein, Maarten Van Segbroeck, Kenji Sagae, Panayiotis G. Georgiou, David R. Traum, Shrikanth S. Narayanan:

A reranking approach for recognition and classification of speech input in conversational dialogue systems. 49-54 - Jason D. Williams:

A critical analysis of two statistical spoken dialog systems in public use. 55-60 - Sungjin Lee, Maxine Eskénazi:

POMDP-based Let's Go system for spoken dialog challenge. 61-66 - Gina-Anne Levow, Siwei Wang:

Employing boosting to compare cues to verbal feedback in multi-lingual dialog. 67-72 - William Yang Wang, Dan Bohus, Ece Kamar, Eric Horvitz:

Crowdsourcing the acquisition of natural language corpora: Methods and observations. 73-78 - Kornel Laskowski:

Exploiting loudness dynamics in stochastic models of turn-taking. 79-84 - Felix Stahlberg, Tim Schlippe, Stephan Vogel, Tanja Schultz:

Word segmentation through cross-lingual word-to-phoneme alignment. 85-90 - Arseniy Gorin, Denis Jouvet:

Class-based speech recognition using a maximum dissimilarity criterion and a tolerance classification margin. 91-96 - Nicolas Obin, Marco Liuni:

On the generalization of Shannon entropy for speech recognition. 97-102 - Shuji Komeiji, Takayuki Arakawa, Takafumi Koshinaka:

A noise-robust speech recognition method composed of weak noise suppression and weak Vector Taylor Series Adaptation. 103-106 - Fabian Triefenbach, Kris Demuynck, Jean-Pierre Martens:

Improving large vocabulary continuous speech recognition by combining GMM-based and reservoir-based acoustic modeling. 107-112 - Atsunori Ogawa, Takaaki Hori, Atsushi Nakamura:

Recognition rate estimation based on word alignment network and discriminative error type classification. 113-118 - Taehwan Kim, Karen Livescu, Gregory Shakhnarovich:

American sign language fingerspelling recognition with phonological feature-based tandem models. 119-124 - Satoshi Kobashikawa, Takaaki Hori, Yoshikazu Yamaguchi, Taichi Asami, Hirokazu Masataki, Satoshi Takahashi:

Efficient prior and incremental beam width control to suppress excessive speech recognition time based on score range estimation. 125-130 - Jinyu Li, Dong Yu, Jui-Ting Huang, Yifan Gong:

Improving wideband speech recognition using mixed-bandwidth training data in CD-DNN-HMM. 131-136 - Cong-Thanh Do, Mohammad Javad Taghizadeh, Philip N. Garner:

Combining cepstral normalization and cochlear implant-like speech processing for microphone array-based speech recognition. 137-142 - Gang Li, Huifeng Zhu, Gong Cheng, Kit Thambiratnam, Behrooz Chitsaz, Dong Yu, Frank Seide:

Context-dependent Deep Neural Networks for audio indexing of real-life data. 143-148 - Yosuke Kashiwagi, Masayuki Suzuki, Nobuaki Minematsu, Keikichi Hirose:

Audio-visual feature integration based on piecewise linear transformation for noise robust automatic speech recognition. 149-152 - Gopala Krishna Anumanchipalli, Luís Caldas de Oliveira, Alan W. Black:

Intent transfer in speech-to-speech machine translation. 153-158 - Alex Marin, Tom Kwiatkowski, Mari Ostendorf, Luke Zettlemoyer:

Using syntactic and confusion network structure for out-of-vocabulary word detection. 159-164 - Md. Akmal Haidar, Douglas D. O'Shaughnessy:

Topic n-gram count language model adaptation for speech recognition. 165-169 - Naoyuki Kanda, Ryu Takeda, Yasunari Obuchi:

Using rhythmic features for Japanese spoken term detection. 170-175 - Matthew Henderson, Milica Gasic, Blaise Thomson, Pirros Tsiakoulis, Kai Yu, Steve J. Young:

Discriminative spoken language understanding using word confusion networks. 176-181 - Hung-yi Lee, Tsung-Hsien Wen, Lin-Shan Lee:

Improved semantic retrieval of spoken content by language models enhanced with acoustic similarity graph. 182-187 - Tsung-Hsien Wen, Hung-yi Lee, Tai-Yuan Chen, Lin-Shan Lee:

Personalized language modeling by crowd sourcing with social network data for voice access of cloud applications. 188-193 - Fernando García, Lluís F. Hurtado, Encarna Segarra, Emilio Sanchis, Giuseppe Riccardi:

Combining multiple translation systems for Spoken Language Understanding portability. 194-198 - Ali Orkan Bayer, Giuseppe Riccardi:

Joint language models for automatic speech recognition and understanding. 199-203 - Teppei Ohno, Tomoyosi Akiba:

Incorporating syllable duration into line-detection-based spoken term detection. 204-209 - Li Deng, Gökhan Tür, Xiaodong He, Dilek Hakkani-Tür:

Use of kernel deep convex networks and end-to-end learning for spoken language understanding. 210-215 - Asli Celikyilmaz, Dilek Hakkani-Tür, Gökhan Tür:

Statistical semantic interpretation modeling for spoken language understanding with enriched semantic features. 216-221 - Timothy J. Hazen, Fred Richardson:

Modeling multiword phrases with constrained phrase trees for improved topic modeling of conversational speech. 222-227 - Larry P. Heck, Dilek Hakkani-Tür:

Exploiting the Semantic Web for unsupervised spoken language understanding. 228-233 - Tomás Mikolov, Geoffrey Zweig:

Context dependent recurrent neural network language model. 234-239 - Florian Hinterleitner, Christoph Norrenbrock, Sebastian Möller, Ulrich Heute:

What makes this voice sound so bad? A multidimensional analysis of state-of-the-art text-to-speech systems. 240-245 - Pawel Swietojanski, Arnab Ghoshal, Steve Renals:

Unsupervised cross-lingual knowledge transfer in DNN-based LVCSR. 246-251 - Maria Astrinaki, Nicolas D'Alessandro, Benjamin Picart, Thomas Drugman, Thierry Dutoit:

Reactive and continuous control of HMM-based speech synthesis. 252-257 - Oliver Jokisch, Yitagessu Birhanu, Rüdiger Hoffmann:

Syllable-based prosodic analysis of Amharic read speech. 258-262 - David Imseng, Hervé Bourlard, Holger Caesar, Philip N. Garner, Gwénolé Lecorvé, Alexandre Nanchen:

MediaParl: Bilingual mixed language accented speech database. 263-268 - Jianbo Jiang, Zhiyong Wu, Mingxing Xu, Jia Jia, Lianhong Cai:

Comparison of adaptation methods for GMM-SVM based speech emotion recognition. 269-273 - Mireia Díez, Amparo Varona, Mikel Peñagarikano, Luis Javier Rodríguez-Fuentes, Germán Bordel:

On the use of phone log-likelihood ratios as features in spoken language recognition. 274-279 - Marc Ferras, Hervé Bourlard:

Speaker diarization and linking of large corpora. 280-285 - Adriana Stan, Peter Bell, Simon King:

A grapheme-based method for automatic alignment of speech and text data. 286-290 - Benjamin Picart, Thomas Drugman, Thierry Dutoit:

Statistical methods for varying the degree of articulation in new HMM-based voices. 291-296 - Éva Székely, Tamás Gábor Csapó, Bálint Tóth, Péter Mihajlik, Julie Carson-Berndsen:

Synthesizing expressive speech from amateur audiobook recordings. 297-302 - Kyu Jeong Han, Jason W. Pelecanos:

Frame-based phonotactic Language Identification. 303-306 - Sriram Ganapathy, Mohamed Kamal Omar, Jason W. Pelecanos:

Noisy channel adaptation in language identification. 307-312 - Ryoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki:

Exemplar-based voice conversion in noisy environment. 313-317 - L. Paola García-Perera, Juan Arturo Nolazco-Flores, Bhiksha Raj, Richard M. Stern:

Optimization of the DET curve in speaker verification. 318-323 - Peter Bell, Mark J. F. Gales, Pierre Lanchantin, Xunying Liu, Yanhua Long, Steve Renals, Pawel Swietojanski, Philip C. Woodland:

Transcription of multi-genre media archives using out-of-domain data. 324-329 - Mohamed Bouallegue, Emmanuel Ferreira, Driss Matrouf, Georges Linarès, Maria Goudi, Pascal Nocera:

Acoustic modeling for under-resourced languages based on vectorial HMM-states representation using Subspace Gaussian Mixture Models. 330-335 - Karel Veselý, Martin Karafiát, Frantisek Grézl, Milos Janda, Ekaterina Egorova:

The language-independent bottleneck features. 336-341 - Stefan Ziegler, Bogdan Ludusan, Guillaume Gravier:

Towards a new speech event detection approach for landmark-based speech recognition. 342-347 - João Miranda, João Paulo Neto, Alan W. Black:

Recovery of acronyms, out-of-lattice words and pronunciations from parallel multilingual speech. 348-353 - Daniel Bolaños:

The Bavieca open-source speech recognition toolkit. 354-359 - Udhyakumar Nallasamy, Florian Metze, Tanja Schultz:

Active learning for accent adaptation in Automatic Speech Recognition. 360-365 - Kaisheng Yao, Dong Yu, Frank Seide, Hang Su, Li Deng, Yifan Gong:

Adaptation of context-dependent deep neural networks for automatic speech recognition. 366-369 - Leonardo Badino, Claudia Canevari, Luciano Fadiga, Giorgio Metta:

Deep-level acoustic-to-articulatory mapping for DBN-HMM based phone recognition. 370-375 - Andrew Rosenberg:

Modeling intensity contours and the interaction of pitch and intensity to improve automatic prosodic event detection and classification. 376-381 - Ann Lee, James R. Glass:

A comparison-based approach to mispronunciation detection. 382-387 - Mostafa Ali Shahin, Beena Ahmed, Kirrie J. Ballard:

Automatic classification of unequal lexical stress patterns using machine learning algorithms. 388-391 - Korbinian Riedhammer, Martin Gropp, Elmar Nöth:

The FAU Video Lecture Browser system. 392-397 - Ghada AlHarbi, Thomas Hain:

Automatic transcription of academic lectures from diverse disciplines. 398-403 - Heather Friedberg, Diane J. Litman, Susannah B. F. Paletz:

Lexical entrainment and success in student engineering groups. 404-409 - Sandrine Brognaux, Thomas Drugman, Richard Beaufort:

Automatic detection and correction of syntax-based prosody annotation errors. 410-415 - Sandrine Brognaux, Sophie Roekhaut, Thomas Drugman, Richard Beaufort:

Train&align: A new online tool for automatic phonetic alignment. 416-421 - Luiza Orosanu, Denis Jouvet, Dominique Fohr, Irina Illina, Anne Bonneau:

Combining criteria for the detection of incorrect entries of non-native speech in the context of foreign language learning. 422-427 - Yi Luan, Masayuki Suzuki, Yutaka Yamauchi, Nobuaki Minematsu, Shuhei Kato, Keikichi Hirose:

Performance improvement of automatic pronunciation assessment in a noisy classroom. 428-431 - Sechun Kang, Gary Geunbae Lee, Ho-Young Lee, Byeongchang Kim:

An automatic pitch accent feedback system for English learners with adaptation of an English corpus spoken by Koreans. 432-437 - Meysam Asgari, Izhak Shafran, Alireza Bayestehtashk:

Robust detection of voiced segments in samples of everyday conversations using unsupervised HMMs. 438-442 - Kyusong Lee, Soo-Ok Kweon, Hongsuck Seo, Gary Geunbae Lee:

Generating grammar questions using corpus data in L2 learning. 443-448 - Ian Kaplan, Andrew Rosenberg:

Analysis of speech transcripts to predict winners of U.S. Presidential and Vice-Presidential debates. 449-454 - Na Yang, R. Muraleedharan, J. Kohl, Ilker Demirkol, Wendi Rabiner Heinzelman, Melissa Sturge-Apple:

Speech-based emotion classification using multiclass SVM with hybrid kernel and thresholding fusion. 455-460 - Yun-Nung Chen, Florian Metze:

Two-layer mutually reinforced random walk for improved multi-party meeting summarization. 461-466 - Anthony McCallum, Gerald Penn, Cosmin Munteanu, Xiaodan Zhu:

Ecological validity and the evaluation of speech summarization quality. 467-472 - Tongmu Zhao, Akemi Hoshino, Masayuki Suzuki, Nobuaki Minematsu, Keikichi Hirose:

Automatic Chinese pronunciation error detection using SVM trained with structural features. 473-478 - Deana Pennell, Yang Liu:

Evaluating the effect of normalizing informal text on TTS output. 479-483















