| 2012 | ||
|---|---|---|
| 75 | Hao Tang, Stephen M. Chu, Mark Hasegawa-Johnson, Thomas S. Huang: Partially Supervised Speaker Clustering. IEEE Trans. Pattern Anal. Mach. Intell. 34(5): 959-971 (2012) | |
| 74 | I. Yücel Özbek, Mark Hasegawa-Johnson, Mübeccel Demirekler: On Improving Dynamic State Space Approaches to Articulatory Inversion With MAP-Based Parameter Estimation. IEEE Transactions on Audio, Speech & Language Processing 20(1): 67-81 (2012) | |
| 2011 | ||
| 73 | Po-Sen Huang, Xiaodan Zhuang, Mark Hasegawa-Johnson: Improving acoustic event detection using generalizable visual features and multi-modality modeling. ICASSP 2011: 349-352 | |
| 72 | Tim Mahrt, Jui-Ting Huang, Yoonsook Mo, Margaret Fleck, Mark Hasegawa-Johnson, Jennifer Cole: Optimal Models of Prosodic Prominence Using the Bayesian Information Criterion. INTERSPEECH 2011: 2037-2040 | |
| 71 | Xiaodan Zhuang, Xi Zhou, Mark Hasegawa-Johnson, Thomas S. Huang: Efficient Object Localization with Variation-Normalized Gaussianized Vectors. Intelligent Video Event Analysis and Understanding 2011: 93-109 | |
| 70 | Bowon Lee, Camille Goudeseune, Mark Hasegawa-Johnson: Open-loop multi-channel inversion of room impulse response. CoRR abs/1106.1199 (2011) | |
| 69 | I. Yücel Özbek, Mark Hasegawa-Johnson, Mübeccel Demirekler: Estimation of Articulatory Trajectories Based on Gaussian Mixture Model (GMM) With Audio-Visual Information Fusion and Dynamic Kalman Smoothing. IEEE Transactions on Audio, Speech & Language Processing 19(5): 1180-1195 (2011) | |
| 68 | Bryce E. Lobdell, Jont B. Allen, Mark Hasegawa-Johnson: Intelligibility predictors and neural representation of speech. Speech Communication 53(2): 185-194 (2011) | |
| 2010 | ||
| 67 | Lae-Hoon Kim, Mark Hasegawa-Johnson, Gerasimos Potamianos, Vit Libal: Joint estimation of DOA and speech based on EM beamforming. ICASSP 2010: 121-124 | |
| 66 | Hao Tang, Mark Hasegawa-Johnson, Thomas S. Huang: Toward robust learning of the Gaussian mixture state emission densities for hidden Markov models. ICASSP 2010: 5242-5245 | |
| 65 | Hao Tang, Mark Hasegawa-Johnson, Thomas S. Huang: Non-frontal view facial expression recognition based on ergodic hidden Markov model supervectors. ICME 2010: 1202-1207 | |
| 64 | Jui-Ting Huang, Mark Hasegawa-Johnson: Semi-supervised training of Gaussian mixture models by conditional entropy minimization. INTERSPEECH 2010: 1353-1356 | |
| 63 | Xiaodan Zhuang, Lijuan Wang, Frank K. Soong, Mark Hasegawa-Johnson: A minimum converted trajectory error (MCTE) approach to high quality speech-to-lips conversion. INTERSPEECH 2010: 1736-1739 | |
| 62 | Lae-Hoon Kim, Kyung-Tae Kim, Mark Hasegawa-Johnson: Robust automatic speech recognition with decoder oriented ideal binary mask estimation. INTERSPEECH 2010: 2066-2069 | |
| 61 | Chi Hu, Xiaodan Zhuang, Mark Hasegawa-Johnson: FSM-based pronunciation modeling using articulatory phonological code. INTERSPEECH 2010: 2274-2277 | |
| 60 | Heejin Kim, Panying Rong, Torrey M. Loucks, Mark Hasegawa-Johnson: Kinematic analysis of tongue movement control in spastic dysarthria. INTERSPEECH 2010: 2578-2581 | |
| 59 | Hosung Nam, Vikramjit Mitra, Mark Tiede, Elliot Saltzman, Louis Goldstein, Carol Y. Espy-Wilson, Mark Hasegawa-Johnson: A procedure for estimating gestural scores from natural speech. INTERSPEECH 2010: 30-33 | |
| 58 | Su-Youn Yoon, Mark Hasegawa-Johnson, Richard Sproat: Landmark-based automated pronunciation error detection. INTERSPEECH 2010: 614-617 | |
| 57 | Xiaodan Zhuang, Xi Zhou, Mark Hasegawa-Johnson, Thomas S. Huang: Real-world acoustic event detection. Pattern Recognition Letters 31(12): 1543-1551 (2010) | |
| 56 | Xi Zhou, Xiaodan Zhuang, Hao Tang, Mark Hasegawa-Johnson, Thomas S. Huang: Novel Gaussianized vector representation for improved natural scene categorization. Pattern Recognition Letters 31(8): 702-708 (2010) | |
| 2009 | ||
| 55 | Jui-Ting Huang, Xi Zhou, Mark Hasegawa-Johnson, Thomas S. Huang: Kernel metric learning for phonetic classification. ASRU 2009: 141-145 | |
| 54 | Xiaodan Zhuang, Jing Huang, Gerasimos Potamianos, Mark Hasegawa-Johnson: Acoustic fall detection using Gaussian mixture models and GMM supervectors. ICASSP 2009: 69-72 | |
| 53 | Hao Tang, Stephen M. Chu, Mark Hasegawa-Johnson, Thomas S. Huang: Emotion recognition from speech via boosted Gaussian mixture models. ICME 2009: 294-297 | |
| 52 | Harsh Vardhan Sharma, Mark Hasegawa-Johnson: Universal access: speech recognition for talkers with spastic dysarthria. INTERSPEECH 2009: 1451-1454 | |
| 51 | Su-Youn Yoon, Mark Hasegawa-Johnson, Richard Sproat: Automated pronunciation scoring using confidence scoring and landmark-based SVM. INTERSPEECH 2009: 1903-1906 | |
| 50 | Yoonsook Mo, Jennifer Cole, Mark Hasegawa-Johnson: Prosodic effects on vowel production: evidence from formant structure. INTERSPEECH 2009: 2535-2538 | |
| 49 | Xiaodan Zhuang, Hosung Nam, Mark Hasegawa-Johnson, Louis Goldstein, Elliot Saltzman: Articulatory phonological code for word classification. INTERSPEECH 2009: 2763-2766 | |
| 48 | I. Yücel Özbek, Mark Hasegawa-Johnson, Mübeccel Demirekler: Formant trajectories for acoustic-to-articulatory inversion. INTERSPEECH 2009: 2807-2810 | |
| 2008 | ||
| 47 | Xi Zhou, Xiaodan Zhuang, Shuicheng Yan, Shih-Fu Chang, Mark Hasegawa-Johnson, Thomas S. Huang: SIFT-Bag kernel for video event analysis. ACM Multimedia 2008: 229-238 | |
| 46 | Shuicheng Yan, Xi Zhou, Ming Liu, Mark Hasegawa-Johnson, Thomas S. Huang: Regression from patch-kernel. CVPR 2008 | |
| 45 | Xiaodan Zhuang, Xi Zhou, Thomas S. Huang, Mark Hasegawa-Johnson: Feature analysis and selection for acoustic event detection. ICASSP 2008: 17-20 | |
| 44 | Lae-Hoon Kim, Mark Hasegawa-Johnson: Optimal speech estimator considering room response as well as additive noise: Different approaches in low and high frequency range. ICASSP 2008: 4573-4576 | |
| 43 | Hao Tang, Yuxiao Hu, Yun Fu, Mark Hasegawa-Johnson, Thomas S. Huang: Real-time conversion from a single 2D face image to a 3D text-driven emotive audio-visual avatar. ICME 2008: 1205-1208 | |
| 42 | Xi Zhou, Xiaodan Zhuang, Hao Tang, Mark Hasegawa-Johnson, Thomas S. Huang: A novel Gaussianized vector representation for natural scene categorization. ICPR 2008: 1-4 | |
| 41 | Xiaodan Zhuang, Xi Zhou, Mark Hasegawa-Johnson, Thomas S. Huang: Face age estimation using patch-based hidden Markov model supervectors. ICPR 2008: 1-4 | |
| 40 | Xiaodan Zhuang, Hosung Nam, Mark Hasegawa-Johnson, Louis M. Goldstein, Elliot Saltzman: The entropy of the articulatory phonological code: recognizing gestures from tract variables. INTERSPEECH 2008: 1489-1492 | |
| 39 | Heejin Kim, Mark Hasegawa-Johnson, Adrienne Perlman, Jon Gunderson, Thomas S. Huang, Kenneth Watkin, Simone Frame: Dysarthric speech database for universal access research. INTERSPEECH 2008: 1741-1744 | |
| 38 | Bryce E. Lobdell, Mark Hasegawa-Johnson, Jont B. Allen: Human speech perception and feature extraction. INTERSPEECH 2008: 1797-1800 | |
| 37 | Hao Tang, Xi Zhou, Matthias Odisio, Mark Hasegawa-Johnson, Thomas S. Huang: Two-stage prosody prediction for emotional text-to-speech synthesis. INTERSPEECH 2008: 2138-2141 | |
| 36 | Jui-Ting Huang, Mark Hasegawa-Johnson: Maximum mutual information estimation with unlabeled data for phonetic classification. INTERSPEECH 2008: 952-955 | |
| 35 | Hao Tang, Yun Fu, Jilin Tu, Thomas S. Huang, Mark Hasegawa-Johnson: EAVA: A 3D Emotive Audio-Visual Avatar. WACV 2008: 1-6 | |
| 34 | Hao Tang, Yun Fu, Jilin Tu, Mark Hasegawa-Johnson, Thomas S. Huang: Humanoid Audio-Visual Avatar With Emotive Text-to-Speech Synthesis. IEEE Transactions on Multimedia 10(6): 969-981 (2008) | |
| 2007 | ||
| 33 | Ming Liu, Yanxiang Chen, Xi Zhou, Xiaodan Zhuang, Mark Hasegawa-Johnson, Thomas S. Huang: Multichannel and Multimodality Person Identification. CLEAR 2007: 248-255 | |
| 32 | Xi Zhou, Xiaodan Zhuang, Ming Liu, Hao Tang, Mark Hasegawa-Johnson, Thomas S. Huang: HMM-Based Acoustic Event Detection with AdaBoost Feature Selection. CLEAR 2007: 345-353 | |
| 31 | Yun Fu, Xi Zhou, Ming Liu, Mark Hasegawa-Johnson, Thomas S. Huang: Lipreading by Locality Discriminant Graph. ICIP (3) 2007: 325-328 | |
| 30 | Xi Zhou, Yun Fu, Ming Liu, Mark Hasegawa-Johnson, Thomas S. Huang: Robust Analysis and Weighting on MFCC Components for Speech Recognition and Speaker Identification. ICME 2007: 188-191 | |
| 29 | Ming Liu, Zhengyou Zhang, Mark Hasegawa-Johnson, Thomas S. Huang: Exploring Discriminative Learning for Text-Independent Speaker Recognition. ICME 2007: 56-59 | |
| 28 | Ming Liu, Xi Zhou, Mark Hasegawa-Johnson, Thomas S. Huang, Zhengyou Zhang: Frequency domain correspondence for speaker normalization. INTERSPEECH 2007: 274-277 | |
| 2006 | ||
| 27 | Rahul Chitturi, Mark Hasegawa-Johnson: Novel entropy based moving average refiners for HMM landmarks. INTERSPEECH 2006 | |
| 26 | Rahul Chitturi, Mark Hasegawa-Johnson: Novel time domain multi-class SVMs for landmark detection. INTERSPEECH 2006 | |
| 25 | Ken Chen, Mark Hasegawa-Johnson, Aaron Cohen, Sarah Borys, Sung-Suk Kim, Jennifer Cole, Jeung-Yoon Choi: Prosody dependent speech recognition on radio news corpus of American English. IEEE Transactions on Audio, Speech & Language Processing 14(1): 232-245 (2006) | |
| 24 | Tong Zhang, Mark Hasegawa-Johnson, Stephen E. Levinson: Extraction of pragmatic and semantic salience from spontaneous spoken English. Speech Communication 48(3-4): 437-462 (2006) | |
| 23 | Tong Zhang, Mark Hasegawa-Johnson, Stephen E. Levinson: Cognitive state classification in a spoken tutorial dialogue system. Speech Communication 48(6): 616-632 (2006) | |
| 2005 | ||
| 22 | Sarah Borys, Mark Hasegawa-Johnson: Distinctive feature based SVM discriminant features for improvements to phone recognition on telephone band speech. INTERSPEECH 2005: 697-700 | |
| 21 | Mark Hasegawa-Johnson, Ken Chen, Jennifer Cole, Sarah Borys, Sung-Suk Kim, Aaron Cohen, Tong Zhang, Jeung-Yoon Choi, Heejin Kim, Taejin Yoon: Simultaneous recognition of words and prosody in the Boston University Radio Speech Corpus. Speech Communication 46(3-4): 418-439 (2005) | |
| 2004 | ||
| 20 | Mark Hasegawa-Johnson, Ameya Deoras: A factorial HMM approach to robust isolated digit recognition in background music. INTERSPEECH 2004 | |
| 19 | Bowon Lee, Mark Hasegawa-Johnson, Camille Goudeseune, Suketu Kamdar, Sarah Borys, Ming Liu, Thomas S. Huang: AVICAR: audio-visual speech corpus in a car environment. INTERSPEECH 2004 | |
| 18 | Mark Hasegawa-Johnson, Stephen E. Levinson, Tong Zhang: Automatic detection of contrast for speech understanding. INTERSPEECH 2004 | |
| 17 | Mark Hasegawa-Johnson, Stephen E. Levinson, Tong Zhang: Children's emotion recognition in an intelligent tutoring scenario. INTERSPEECH 2004 | |
| 16 | Taejin Yoon, Sandra Chavarria, Jennifer Cole, Mark Hasegawa-Johnson: Intertranscriber reliability of prosodic labeling on telephone conversation using ToBI. INTERSPEECH 2004 | |
| 15 | Sarah Borys, Aaron Cohen, Mark Hasegawa-Johnson, Jennifer Cole: Modeling and recognition of phonetic and prosodic factors for improvements to acoustic speech recognition models. INTERSPEECH 2004 | |
| 14 | Ken Chen, Mark Hasegawa-Johnson: Modeling pronunciation variation using artificial neural networks for English spontaneous speech. INTERSPEECH 2004 | |
| 13 | Mital Gandhi, Mark Hasegawa-Johnson: Source separation using particle filters. INTERSPEECH 2004 | |
| 12 | Yanli Zheng, Mark Hasegawa-Johnson, Sarah Borys: Stop consonant classification by dynamic formant trajectory. INTERSPEECH 2004 | |
| 11 | Yuexi Ren, Mark Hasegawa-Johnson, Stephen E. Levinson: Semantic analysis for a speech user interface in an intelligent tutoring system. IUI 2004: 313-315 | |
| 10 | Mohamed Kamal Omar, Mark Hasegawa-Johnson: Model enforcement: a unified feature transformation framework for classification and recognition. IEEE Transactions on Signal Processing 52(10): 2701-2710 (2004) | |
| 2003 | ||
| 9 | Mohamed Kamal Omar, Mark Hasegawa-Johnson: Maximum conditional mutual information projection for speech recognition. INTERSPEECH 2003 | |
| 8 | Mohamed Kamal Omar, Mark Hasegawa-Johnson: Non-linear maximum likelihood feature transformation for speech recognition. INTERSPEECH 2003 | |
| 7 | Ken Chen, Sarah Borys, Mark Hasegawa-Johnson, Jennifer Cole: Prosody dependent speech recognition with explicit duration modelling at intonational phrase boundaries. INTERSPEECH 2003 | |
| 6 | Mohamed Kamal Omar, Mark Hasegawa-Johnson: Approximately independent factors of speech using nonlinear symplectic transformation. IEEE Transactions on Speech and Audio Processing 11(6): 660-671 (2003) | |
| 2002 | ||
| 5 | Zhinian Jing, Mark Hasegawa-Johnson: Auditory-modeling inspired methods of feature extraction for robust automatic speech recognition. ICASSP 2002: 4176 | |
| 4 | Mohamed Kamal Omar, Mark Hasegawa-Johnson: Maximum mutual information based acoustic-features representation of phonological features for speech recognition. ICASSP 2002: 81-84 | |
| 3 | Mohamed Kamal Omar, Ken Chen, Mark Hasegawa-Johnson, Yigal Brandman: An evaluation of using mutual information for selection of acoustic-features representation of phonemes for speech recognition. INTERSPEECH 2002 | |
| 2000 | ||
| 2 | Mark Hasegawa-Johnson: Time-frequency distribution of partial phonetic information measured using mutual information. INTERSPEECH 2000: 133-136 | |
| 1 | Jun Huang, Stephen E. Levinson, Mark Hasegawa-Johnson: Signal approximation in Hilbert space and its application on articulatory speech synthesis. INTERSPEECH 2000: 775-778 | |
Last update Thu May 31 18:55:10 2012 CET by the DBLP Team

Data released under the ODC-BY 1.0 license