


ASRU 2013: Olomouc, Czech Republic
- 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, Olomouc, Czech Republic, December 8-12, 2013. IEEE 2013, ISBN 978-1-4799-2756-2
LM: Language Modeling
- Yangyang Shi, Martha A. Larson, Catholijn M. Jonker:
K-component recurrent neural network language models using curriculum learning. 1-6
- Matti Varjokallio, Mikko Kurimo, Sami Virpioja:
Learning a subword vocabulary based on unigram likelihood. 7-12
- Berlin Chen, Yi-Wen Chen, Kuan-Yu Chen, Ea-Ee Jan:
Effective pseudo-relevance feedback for language modeling in speech recognition. 13-18
- Long Qin, Alexander I. Rudnicky:
Learning better lexical properties for recurrent OOV words. 19-24
- Abhinav Sethy, Stanley F. Chen, Ebru Arisoy, Bhuvana Ramabhadran, Kartik Audhkhasi, Shrikanth S. Narayanan, Paul Vozila:
Joint training of interpolated exponential n-gram models. 25-30
- Hasim Sak, Cyril Allauzen, Kaisuke Nakajima, Françoise Beaufays:
Mixture of mixture n-gram language models. 31-36
AM: Acoustic Modeling
- Wen-Lin Zhang, Bi-Cheng Li, Wei-Qiang Zhang:
Compact acoustic modeling based on acoustic manifold using a mixture of factor analyzers. 37-42
- Yuuki Tachioka, Shinji Watanabe, Jonathan Le Roux, John R. Hershey:
A generalized discriminative training framework for system combination. 43-48
- Vimal Manohar, Srinivas C. Bhargav, Srinivasan Umesh:
Acoustic modeling using transform-based phone-cluster adaptive training. 49-54
- George Saon, Hagen Soltau, David Nahamoo, Michael Picheny:
Speaker adaptation of neural network acoustic models using i-vectors. 55-59
- Udhyakumar Nallasamy, Mark C. Fuhs, Monika Woszczyna, Florian Metze, Tanja Schultz:
Neighbour selection and adaptation for rapid speaker-dependent ASR. 60-65
Dec: Decoder Search
- David Nolden, Ralf Schlüter, Hermann Ney:
Efficient nearly error-less LVCSR decoding based on incremental forward and backward passes. 66-71
SLU: Spoken Language Understanding
- Jingjing Liu, Panupong Pasupat, Yining Wang, Scott Cyphers, James R. Glass:
Query understanding enhanced by hierarchical parsing structures. 72-77
- Puyang Xu, Ruhi Sarikaya:
Convolutional neural network based triangular CRF for joint intent detection and slot filling. 78-83
- Jan Svec, Pavel Ircing, Lubos Smídl:
Semantic entity detection from multiple ASR hypotheses within the WFST framework. 84-89
- Ali Orkan Bayer, Giuseppe Riccardi:
On-line adaptation of semantic models for spoken language understanding. 90-95
- Juraj Pálfy, Sakhia Darjaa, Jiri Pospichal:
Dysfluent speech detection by image forensics techniques. 96-101
- Heriberto Cuayáhuitl, Nina Dethlefs, Helen Wright Hastie, Oliver Lemon:
Barge-in effects in Bayesian dialogue act recognition and simulation. 102-107
Dial: Spoken Dialog Systems
- Emmanuel Ferreira, Fabrice Lefèvre:
Expert-based reward shaping and exploration scheme for boosting policy learning of dialogue management. 108-113
- Takuya Hiraoka, Yuki Yamauchi, Graham Neubig, Sakriani Sakti, Tomoki Toda, Satoshi Nakamura:
Dialogue management for leading the conversation in persuasive dialogue systems. 114-119
- Yun-Nung Chen, William Yang Wang, Alexander I. Rudnicky:
Unsupervised induction and filling of semantic slots for spoken dialogue systems using frame-semantic parsing. 120-125
Multi: Multilingual Speech & Language Processing
- Aanchan Mohan, Richard C. Rose:
Cross-lingual context sharing and parameter-tying for multi-lingual speech recognition. 126-131
- João Miranda, João Paulo da Silva Neto, Alan W. Black:
Improved punctuation recovery through combination of multiple speech streams. 132-137
- Kate M. Knill, Mark J. F. Gales, Shakti P. Rath, Philip C. Woodland, Chao Zhang, Shi-Xiong Zhang:
Investigation of multilingual deep neural networks for spoken term detection. 138-143
- Evgeny A. Stepanov, Ilya Kashkarev, Ali Orkan Bayer, Giuseppe Riccardi, Arindam Ghosh:
Language style and domain adaptation for cross-language SLU porting. 144-149
Robust: Robustness in ASR
- Rongfeng Su, Xunying Liu, Lan Wang:
Automatic model complexity control for generalized variable parameter HMMs. 150-155
- N. Vishnu Prasad, Srinivasan Umesh:
Improved cepstral mean and variance normalization using Bayesian framework. 156-161
- Emmanuel Vincent, Jon Barker, Shinji Watanabe, Jonathan Le Roux, Francesco Nesta, Marco Matassoni:
The second 'CHiME' speech separation and recognition challenge: An overview of challenge systems and outcomes. 162-167
- Antti Hurmalainen, Tuomas Virtanen:
Learning state labels for sparse classification of speech with matrix deconvolution. 168-173
- D. S. Pavan Kumar, N. Vishnu Prasad, Vikas Joshi, Srinivasan Umesh:
Modified splice and its extension to non-stereo data for noise robust speech recognition. 174-179
- Ramón Fernandez Astudillo:
A propagation approach to modelling the joint distributions of clean and corrupted speech in the Mel-Cepstral domain. 180-185
- Soonho Baek, Hong-Goo Kang:
Vector Taylor series based HMM adaptation for generalized cepstrum in noisy environment. 186-191
SDRKWS: Spoken Document Retrieval and Keyword Spotting
- Steven Wegmann, Arlo Faria, Adam Janin, Korbinian Riedhammer, Nelson Morgan:
The TAO of ATWV: Probing the mysteries of keyword search performance. 192-197
- Yun-Chiao Li, Hung-yi Lee, Cheng-Tao Chung, Chun-an Chan, Lin-Shan Lee:
Towards unsupervised semantic retrieval of spoken content with query expansion based on automatically discovered acoustic patterns. 198-203
- Lidia Mangu, Hagen Soltau, Hong-Kwang Kuo, George Saon:
The IBM keyword search system for the DARPA RATS program. 204-209
- Damianos G. Karakos, Richard M. Schwartz, Stavros Tsakalidis, Le Zhang, Shivesh Ranjan, Tim Ng, Roger Hsiao, Guruprasad Saikumar, Ivan Bulyko, Long Nguyen, John Makhoul, Frantisek Grézl, Mirko Hannemann, Martin Karafiát, Igor Szöke, Karel Veselý, Lori Lamel, Viet Bac Le:
Score normalization and system combination for improved keyword spotting. 210-215
NewApp: New Applications of ASR
- Duc Le, Emily Mower Provost:
Emotion recognition from spontaneous speech using Hidden Markov models with deep belief networks. 216-221
- Han-Ping Shen, Nobuaki Minematsu, Takehiko Makino, Steven H. Weinberger, Teeraphon Pongkittiphan, Chung-Hsien Wu:
Automatic pronunciation clustering using a World English archive and pronunciation structure analysis. 222-227
- Alexei V. Ivanov, Shahab Jalalvand, Roberto Gretter, Daniele Falavigna:
Phonetic and anthropometric conditioning of MSA-KST cognitive impairment characterization system. 228-233
- Anna Katharina Fuchs, Juan Andres Morales-Cordovilla, Martin Hagmüller:
ASR for electro-laryngeal speech. 234-238
- Lakshmish Kaushik, Abhijeet Sangwan, John H. L. Hansen:
Automatic sentiment extraction from YouTube videos. 239-244
SPFea: Speech Signal Processing and Feature Extraction
- Hideaki Konno, Hideo Kanemitsu, Nobuyuki Takahashi, Mineichi Kudo:
Acoustic characteristics related to the perceptual pitch in whispered vowels. 245-249
- Azzedine Touazi, Mohamed Debyeche:
An SVD-based scheme for MFCC compression in distributed speech recognition system. 250-255
- Reza Sahraeian, Dirk Van Compernolle:
A study of supervised intrinsic spectral analysis for TIMIT phone classification. 256-260
- Florian Metze, Zaid Sheikh, Alex Waibel, Jonas Gehring, Kevin Kilgour, Quoc Bao Nguyen, Van Huy Nguyen:
Models of tone for tonal and non-tonal languages. 261-266
NN: Neural Networks in ASR
- Karel Veselý, Mirko Hannemann, Lukás Burget:
Semi-supervised training of Deep Neural Networks. 267-272
- Alex Graves, Navdeep Jaitly, Abdel-rahman Mohamed:
Hybrid speech recognition with Deep Bidirectional LSTM. 273-278
- Bo Li, Khe Chai Sim:
Improving robustness of deep neural networks via spectral masking for automatic speech recognition. 279-284
- Pawel Swietojanski, Arnab Ghoshal, Steve Renals:
Hybrid acoustic models for distant and multichannel large vocabulary speech recognition. 285-290
- Meng Cai, Yongzhe Shi, Jia Liu:
Deep maxout neural networks for speech recognition. 291-296
- Tara N. Sainath, Brian Kingsbury, Abdel-rahman Mohamed, Bhuvana Ramabhadran:
Learning filter banks within a deep neural network framework. 297-302
- Tara N. Sainath, Lior Horesh, Brian Kingsbury, Aleksandr Y. Aravkin, Bhuvana Ramabhadran:
Accelerating Hessian-free optimization for Deep Neural Networks by implicit preconditioning and sampling. 303-308
- Naoyuki Kanda, Ryu Takeda, Yasunari Obuchi:
Elastic spectral distortion for low resource speech recognition with deep neural networks. 309-314
- Tara N. Sainath, Brian Kingsbury, Abdel-rahman Mohamed, George E. Dahl, George Saon, Hagen Soltau, Tomás Beran, Aleksandr Y. Aravkin, Bhuvana Ramabhadran:
Improvements to Deep Convolutional Neural Networks for LVCSR. 315-320
- Pierre L. Dognin, Vaibhava Goel:
Combining stochastic average gradient and Hessian-free optimization for sequence training of deep neural networks. 321-325
- Zhiheng Huang, Geoffrey Zweig, Michael Levit, Benoît Dumoulin, Barlas Oguz, Shawn Chang:
Accelerating recurrent neural network training via two stage classes and parallelization. 326-331
- David Imseng, Petr Motlícek, Philip N. Garner, Hervé Bourlard:
Impact of deep MLP architecture on different acoustic modeling techniques for under-resourced speech recognition. 332-337
- Guangsen Wang, Khe Chai Sim:
Context-dependent modelling of deep neural network using logistic regression. 338-343
- Jonas Gehring, Quoc Bao Nguyen, Florian Metze, Alex Waibel:
DNN acoustic modeling with modular multi-lingual feature extraction networks. 344-349
- Yosuke Kashiwagi, Daisuke Saito, Nobuaki Minematsu, Keikichi Hirose:
Discriminative piecewise linear transformation based on deep learning for noise robust automatic speech recognition. 350-355
- Kris Demuynck, Fabian Triefenbach:
Porting concepts from DNNs back to GMMs. 356-361
- Raymond Brueckner, Björn W. Schuller:
Hierarchical neural networks and enhanced class posteriors for social signal classification. 362-367
- Hank Liao, Erik McDermott, Andrew W. Senior:
Large scale deep neural network acoustic modeling with semi-supervised training data for YouTube video transcription. 368-373
LowZero: ASR/Speech Search with Low or Zero Resources
- Liang Lu, Arnab Ghoshal, Steve Renals:
Acoustic data-driven pronunciation lexicon for large vocabulary speech recognition. 374-379
- William Hartmann, Anindya Roy, Lori Lamel, Jean-Luc Gauvain:
Acoustic unit discovery and pronunciation generation from a grapheme-based lexicon. 380-385
- Oliver Walter, Timo Korthals, Reinhold Haeb-Umbach, Bhiksha Raj:
A hierarchical system for word discovery exploiting DTW-based initialization. 386-391
- Bart Ons, Jort F. Gemmeke, Hugo Van hamme:
NMF-based keyword learning from scarce data. 392-397
- Yajie Miao, Florian Metze, Shourabh Rawat:
Deep maxout networks for low-resource speech recognition. 398-403
- Yanmin Qian, Kai Yu, Jia Liu:
Combination of data borrowing strategies for low-resource LVCSR. 404-409
- Keith D. Levin, Katharine Henry, Aren Jansen, Karen Livescu:
Fixed-dimensional acoustic embeddings of variable-length segments in low-resource settings. 410-415
- Guoguo Chen, Oguz Yilmaz, Jan Trmal, Daniel Povey, Sanjeev Khudanpur:
Using proxies for OOV keywords in the keyword search task. 416-421
- Fuchun Peng, Scott Roy, Ben Shahshahani, Françoise Beaufays:
Search results based N-best hypothesis rescoring with maximum entropy classification. 422-427
- Ankur Gandhe, Long Qin, Florian Metze, Alexander I. Rudnicky, Ian R. Lane, Matthias Eck:
Using web text to improve keyword spotting in speech. 428-433
- Shilin Liu, Khe Chai Sim:
Multi-stream temporally varying weight regression for cross-lingual speech recognition. 434-439
- Roger Hsiao, Tim Ng, Frantisek Grézl, Damianos G. Karakos, Stavros Tsakalidis, Long Nguyen, Richard M. Schwartz:
Discriminative semi-supervised training for keyword search in low resource languages. 440-445
- Ramya Rasipuram, Marzieh Razavi, Mathew Magimai-Doss:
Probabilistic lexical modeling and unsupervised training for zero-resourced ASR. 446-451
- Joris Driesen, Steve Renals:
Lightly supervised automatic subtitling of weather forecasts. 452-457
- Jahn Heymann, Oliver Walter, Reinhold Haeb-Umbach, Bhiksha Raj:
Unsupervised word segmentation from noisy input. 458-463
- Murat Saraclar, Abhinav Sethy, Bhuvana Ramabhadran, Lidia Mangu, Jia Cui, Xiaodong Cui, Brian Kingsbury, Jonathan Mamou:
An empirical study of confusion modeling in keyword search for low resource languages. 464-469
- Frantisek Grézl, Martin Karafiát:
Semi-supervised bootstrapping approach for neural network feature extractor training. 470-475
