Masato Akagi
Journal Articles
- 2023
- [j38] Kai Li, Xugang Lu, Masato Akagi, Masashi Unoki: Contributions of Jitter and Shimmer in the Voice for Fake Audio Detection. IEEE Access 11: 84689-84698 (2023)
- [j37] Huy Nguyen, Tuan Vu Ho, Masato Akagi, Masashi Unoki: Phase-Aware Speech Enhancement With Complex Wiener Filter. IEEE Access 11: 141573-141584 (2023)
- [j36] Xingfeng Li, Xiaohan Shi, Desheng Hu, Yongwei Li, Qingchen Zhang, Zhengxia Wang, Masashi Unoki, Masato Akagi: Music Theory-Inspired Acoustic Representation for Speech Emotion Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 31: 2534-2547 (2023)
- 2022
- [j35] Bagus Tris Atmaja, Akira Sasou, Masato Akagi: Speech Emotion and Naturalness Recognitions With Multitask and Single-Task Learnings. IEEE Access 10: 72381-72387 (2022)
- [j34] Maori Kobayashi, Yasuhiro Hamada, Masato Akagi: Acoustic features correlated to perceived urgency in evacuation announcements. Speech Commun. 139: 22-34 (2022)
- [j33] Bagus Tris Atmaja, Akira Sasou, Masato Akagi: Survey on bimodal speech emotion recognition from acoustic and linguistic information fusion. Speech Commun. 140: 11-28 (2022)
- 2021
- [j32] Tuan Vu Ho, Masato Akagi: Cross-Lingual Voice Conversion With Controllable Speaker Individuality Using Variational Autoencoder and Star Generative Adversarial Network. IEEE Access 9: 47503-47515 (2021)
- [j31] Yawen Xue, Michael Marxen, Masato Akagi, Peter Birkholz: Acoustic and articulatory analysis and synthesis of shouted vowels. Comput. Speech Lang. 66: 101156 (2021)
- [j30] Zhichao Peng, Jianwu Dang, Masashi Unoki, Masato Akagi: Multi-resolution modulation-filtered cochleagram feature for LSTM-based dimensional emotion recognition from speech. Neural Networks 140: 261-273 (2021)
- [j29] Bagus Tris Atmaja, Masato Akagi: Two-stage dimensional emotion recognition by fusing predictions of acoustic and text networks using SVM. Speech Commun. 126: 9-21 (2021)
- [j28] Thuan Van Ngo, Rieko Kubo, Masato Akagi: Increasing speech intelligibility and naturalness in noise based on concepts of modulation spectrum and modulation transfer function. Speech Commun. 135: 11-24 (2021)
- [j27] Yongwei Li, Jianhua Tao, Donna Erickson, Bin Liu, Masato Akagi: $F_0$-Noise-Robust Glottal Source and Vocal Tract Analysis Based on ARX-LF Model. IEEE ACM Trans. Audio Speech Lang. Process. 29: 3375-3383 (2021)
- 2020
- [j26] Zhichao Peng, Xingfeng Li, Zhi Zhu, Masashi Unoki, Jianwu Dang, Masato Akagi: Speech Emotion Recognition Using 3D Convolutions and Attention-Based Sliding Recurrent Networks With Auditory Front-Ends. IEEE Access 8: 16560-16572 (2020)
- [j25] Thuan Van Ngo, Rieko Kubo, Masato Akagi: Mimicking Lombard Effect: An Analysis and Reconstruction. IEICE Trans. Inf. Syst. 103-D(5): 1108-1117 (2020)
- [j24] Lu Yin, Junfeng Li, Yonghong Yan, Masato Akagi: A Two-Stage Phase-Aware Approach for Monaural Multi-Talker Speech Separation. IEICE Trans. Inf. Syst. 103-D(7): 1732-1743 (2020)
- [j23] Feng Li, Masato Akagi: Combining F0 and non-negative constraint robust principal component analysis for singing voice separation. Signal Process. 170: 107432 (2020)
- [j22] Thuan Van Ngo, Masato Akagi, Peter Birkholz: Effect of articulatory and acoustic features on the intelligibility of speech in noise: An articulatory synthesis study. Speech Commun. 117: 13-20 (2020)
- [j21] Yongwei Li, Ken-Ichi Sakakibara, Masato Akagi: Simultaneous Estimation of Glottal Source Waveforms and Vocal Tract Shapes from Speech Signals Based on ARX-LF Model. J. Signal Process. Syst. 92(8): 831-838 (2020)
- 2019
- [j20] Feng Li, Masato Akagi: Blind monaural singing voice separation using rank-1 constraint robust principal component analysis and vocal activity detection. Neurocomputing 350: 44-52 (2019)
- [j19] Xingfeng Li, Masato Akagi: Improving multilingual speech emotion recognition by combining acoustic features in a three-layer model. Speech Commun. 110: 1-12 (2019)
- 2018
- [j18] Yawen Xue, Yasuhiro Hamada, Masato Akagi: Voice conversion for emotional speech: Rule-based synthesis with degree of emotion controllable in dimensional space. Speech Commun. 102: 54-67 (2018)
- 2017
- [j17] Masashi Unoki, Akikazu Miyazaki, Shota Morita, Masato Akagi: Method of Blindly Estimating Speech Transmission Index in Noisy Reverberant Environments. J. Inf. Hiding Multim. Signal Process. 8(6): 1430-1445 (2017)
- [j16] Shota Morita, Xugang Lu, Masashi Unoki, Masato Akagi: Method of Estimating Signal-to-Noise Ratio Based on Optimal Design for Sub-band Voice Activity Detection. J. Inf. Hiding Multim. Signal Process. 8(6): 1446-1459 (2017)
- 2016
- [j15] Shota Morita, Masashi Unoki, Xugang Lu, Masato Akagi: Robust Voice Activity Detection Based on Concept of Modulation Transfer Function in Noisy Reverberant Environments. J. Signal Process. Syst. 82(2): 163-173 (2016)
- 2014
- [j14] Thanh-Duc Chau, Junfeng Li, Masato Akagi: Binaural Sound Source Localization in Noisy Reverberant Environments Based on Equalization-Cancellation Theory. IEICE Trans. Fundam. Electron. Commun. Comput. Sci. 97-A(10): 2011-2020 (2014)
- 2013
- [j13] Trung-Nghia Phung, Thanh Son Phan, Thang Tat Vu, Chi Mai Luong, Masato Akagi: Improving Naturalness of HMM-Based TTS Trained with Limited Data by Temporal Decomposition. IEICE Trans. Inf. Syst. 96-D(11): 2417-2426 (2013)
- 2011
- [j12] Junfeng Li, Shuichi Sakamoto, Satoshi Hongo, Masato Akagi, Yôiti Suzuki: Two-stage binaural speech enhancement with Wiener filter for high-quality speech communication. Speech Commun. 53(5): 677-689 (2011)
- 2010
- [j11] Yu Zhou, Junfeng Li, Yanqing Sun, Jianping Zhang, Yonghong Yan, Masato Akagi: A Hybrid Speech Emotion Recognition System Based on Spectral and Prosodic Features. IEICE Trans. Inf. Syst. 93-D(10): 2813-2821 (2010)
- 2008
- [j10] Junfeng Li, Masato Akagi, Yôiti Suzuki: A Two-Microphone Noise Reduction Method in Highly Non-stationary Multiple-Noise-Source Environments. IEICE Trans. Fundam. Electron. Commun. Comput. Sci. 91-A(6): 1337-1346 (2008)
- [j9] Junfeng Li, Shuichi Sakamoto, Satoshi Hongo, Masato Akagi, Yôiti Suzuki: Adaptive beta-order generalized spectral subtraction for speech enhancement. Signal Process. 88(11): 2764-2776 (2008)
- [j8] Chun-Fang Huang, Masato Akagi: A three-layered model for expressive speech perception. Speech Commun. 50(10): 810-828 (2008)
- 2007
- [j7] Phu Chien Nguyen, Masato Akagi, Binh Phu Nguyen: Limited error based event localizing temporal decomposition and its application to variable-rate speech coding. Speech Commun. 49(4): 292-304 (2007)
- 2006
- [j6] Jianwu Dang, Masato Akagi, Kiyoshi Honda: Communication Between Speech Production and Perception Within the Brain-Observation and Simulation. J. Comput. Sci. Technol. 21(1): 95-105 (2006)
- [j5] Junfeng Li, Masato Akagi: A noise reduction system based on hybrid noise estimation technique and post-filtering in arbitrary noise environments. Speech Commun. 48(2): 111-126 (2006)
- 2005
- [j4] Takeshi Saitou, Masashi Unoki, Masato Akagi: Development of an F0 control model based on F0 dynamic characteristics for singing-voice synthesis. Speech Commun. 46(3-4): 405-417 (2005)
- 2004
- [j3] Yuichi Ishimoto, Kentaro Ishizuka, Kiyoaki Aikawa, Masato Akagi: Fundamental Frequency Estimation for Noisy Speech Using Entropy-Weighted Periodic and Harmonic Features. IEICE Trans. Inf. Syst. 87-D(1): 205-214 (2004)
- 2001
- [j2] A. C. R. Nandasena, Phu Chien Nguyen, Masato Akagi: Spectral stability based event localizing temporal decomposition. Comput. Speech Lang. 15(4): 381-401 (2001)
- 1999
- [j1] Masashi Unoki, Masato Akagi: A method of signal extraction from noisy signal based on auditory scene analysis. Speech Commun. 27(3-4): 261-279 (1999)
Conference and Workshop Papers
- 2023
- [c116] Dung Kim Tran, Masato Akagi, Masashi Unoki: Increasing Speech Intelligibility by Mimicking Professional Announcers' Voices and Its Physical Correlates. APSIPA ASC 2023: 1187-1192
- [c115] Kai Li, Dung Kim Tran, Xugang Lu, Masato Akagi, Masashi Unoki: Data-driven Non-uniform Filterbanks Based on F-ratio for Machine Anomalous Sound Detection. EUSIPCO 2023: 201-205
- 2022
- [c114] Kai Li, Xugang Lu, Masato Akagi, Jianwu Dang, Sheng Li, Masashi Unoki: Relationship Between Speakers' Physiological Structure and Acoustic Speech Signals: Data-Driven Study Based on Frequency-Wise Attentional Neural Network. EUSIPCO 2022: 379-383
- [c113] Tuan Vu Ho, Maori Kobayashi, Masato Akagi: Speak Like a Professional: Increasing Speech Intelligibility by Mimicking Professional Announcer Voice with Voice Conversion. INTERSPEECH 2022: 171-175
- [c112] Tuan Vu Ho, Quoc Huy Nguyen, Masato Akagi, Masashi Unoki: Vector-quantized Variational Autoencoder for Phase-aware Speech Enhancement. INTERSPEECH 2022: 176-180
- [c111] Kai Li, Sheng Li, Xugang Lu, Masato Akagi, Meng Liu, Lin Zhang, Chang Zeng, Longbiao Wang, Jianwu Dang, Masashi Unoki: Data Augmentation Using McAdams-Coefficient-Based Speaker Anonymization for Fake Audio Detection. INTERSPEECH 2022: 664-668
- [c110] Jianhua Tao, Jiangyan Yi, Cunhang Fan, Ruibo Fu, Shan Liang, Pengyuan Zhang, Haizhou Li, Helen Meng, Dong Yu, Masato Akagi: DDAM '22: 1st International Workshop on Deepfake Detection for Audio Multimedia. ACM Multimedia 2022: 7405-7406
- 2021
- [c109] Kai Li, Masashi Unoki, Yongwei Li, Jianwu Dang, Masato Akagi: Study on Simultaneous Estimation of Glottal Source and Vocal Tract Parameters by ARMAX-LF Model for Speech Analysis/Synthesis. APSIPA ASC 2021: 36-43
- [c108] Xingfeng Li, Taiyang Guo, Xinhui Hu, Xinkang Xu, Jianwu Dang, Masato Akagi: Hierarchical Prosody Analysis Improves Categorical and Dimensional Emotion Recognition. APSIPA ASC 2021: 700-704
- [c107] Bagus Tris Atmaja, Akira Sasou, Masato Akagi: Automatic Naturalness Recognition from Acted Speech Using Neural Networks. APSIPA ASC 2021: 731-736
- 2020
- [c106] Bagus Tris Atmaja, Masato Akagi: Deep Multilayer Perceptrons for Dimensional Speech Emotion Recognition. APSIPA 2020: 325-331
- [c105] Thuan Van Ngo, Tuan Vu Ho, Masashi Unoki, Rieko Kubo, Masato Akagi: Enhancement of speech intelligibility under noisy reverberant conditions based on modulation spectrum concept. APSIPA 2020: 753-758
- [c104] Tuan Vu Ho, Masato Akagi: Non-parallel Voice Conversion based on Hierarchical Latent Embedding Vector Quantized Variational Autoencoder. Blizzard Challenge / Voice Conversion Challenge 2020
- [c103] Bagus Tris Atmaja, Masato Akagi: Multitask Learning and Multistage Fusion for Dimensional Audiovisual Emotion Recognition. ICASSP 2020: 4482-4486
- [c102] Kai Li, Masato Akagi, Yibo Wu, Jianwu Dang: Segment-Level Effects of Gender, Nationality and Emotion Information on Text-Independent Speaker Verification. INTERSPEECH 2020: 2987-2991
- [c101] Yongwei Li, Jianhua Tao, Bin Liu, Donna Erickson, Masato Akagi: Comparison of Glottal Source Parameter Values in Emotional Vowels. INTERSPEECH 2020: 4103-4107
- [c100] Bagus Tris Atmaja, Masato Akagi: Improving Valence Prediction in Dimensional Speech Emotion Recognition Using Linguistic Information. O-COCOSDA 2020: 166-171
- [c99] Bagus Tris Atmaja, Masato Akagi: On The Differences Between Song and Speech Emotion Recognition: Effect of Feature Sets, Feature Types, and Classifiers. TENCON 2020: 968-972
- [c98] Bagus Tris Atmaja, Yasuhiro Hamada, Masato Akagi: Predicting Valence and Arousal by Aggregating Acoustic Features for Acoustic-Linguistic Information Fusion. TENCON 2020: 1081-1085
- 2019
- [c97] Tuan Vu Ho, Masato Akagi: Non-parallel Voice Conversion with Controllable Speaker Individuality using Variational Autoencoder. APSIPA 2019: 106-111
- [c96] Thuan Van Ngo, Rieko Kubo, Masato Akagi: Evaluation of the Lombard effect model on synthesizing Lombard speech in varying noise level environments with limited data. APSIPA 2019: 133-137
- [c95] Bagus Tris Atmaja, Kiyoaki Shirai, Masato Akagi: Speech Emotion Recognition Using Speech Feature and Word Embedding. APSIPA 2019: 519-523
- [c94] Zhichao Peng, Zhi Zhu, Masashi Unoki, Jianwu Dang, Masato Akagi: Dimensional Emotion Recognition from Speech Using Modulation Spectral Features and Recurrent Neural Networks. APSIPA 2019: 524-528
- [c93] Feng Li, Kaizhi Qian, Mark Hasegawa-Johnson, Masato Akagi: Monaural Singing Voice Separation Using Fusion-Net with Time-Frequency Masking. APSIPA 2019: 1239-1243
- [c92] Xingfeng Li, Masato Akagi: The Contribution of Acoustic Features Analysis to Model Emotion Perceptual Process for Language Diversity. INTERSPEECH 2019: 3262-3266
- 2018
- [c91] Xingfeng Li, Masato Akagi: Maximal Information Coefficient and Predominant Correlation-Based Feature Selection Toward A Three-Layer Model for Speech Emotion Recognition. APSIPA 2018: 1428-1434
- [c90] Kyoko Takahashi, Masato Akagi: Estimation of glottal source waveforms and vocal tract shape for singing voices with wide frequency range. APSIPA 2018: 1879-1887
- [c89] Feng Li, Masato Akagi: Unsupervised Singing Voice Separation Using Gammatone Auditory Filterbank and Constraint Robust Principal Component Analysis. APSIPA 2018: 1924-1928
- [c88] Feng Li, Masato Akagi: Unsupervised Singing Voice Separation Based on Robust Principal Component Analysis Exploiting Rank-1 Constraint. EUSIPCO 2018: 1920-1924
- [c87] Zhichao Peng, Zhi Zhu, Masashi Unoki, Jianwu Dang, Masato Akagi: Auditory-Inspired End-to-End Speech Emotion Recognition Using 3D Convolutional Recurrent Neural Networks Based on Spectral-Temporal Representation. ICME 2018: 1-6
- [c86] Xingfeng Li, Masato Akagi: A Three-Layer Emotion Perception Model for Valence and Arousal-Based Detection from Multilingual Speech. INTERSPEECH 2018: 3643-3647
- [c85] Yongwei Li, Ken-Ichi Sakakibara, Masato Akagi: Estimation of glottal source waveforms and vocal tract shapes from speech signals based on ARX-LF model. ISCSLP 2018: 230-234
- 2017
- [c84] Masashi Unoki, Yuta Kashihara, Maori Kobayashi, Masato Akagi: Study on method for protecting speech privacy by actively controlling speech transmission index in simulated room. APSIPA 2017: 1199-1204
- [c83] Zhichao Peng, Zhi Zhu, Masashi Unoki, Jianwu Dang, Masato Akagi: Speech emotion recognition using multichannel parallel convolutional recurrent neural networks based on gammatone auditory filterbank. APSIPA 2017: 1750-1755
- [c82] Feng Li, Masato Akagi: Weighted Robust Principal Component Analysis with Gammatone Auditory Filterbank for Singing Voice Separation. ICONIP (6) 2017: 849-858
- [c81] Yongwei Li, Ken-Ichi Sakakibara, Daisuke Morikawa, Masato Akagi: Commonalities of Glottal Sources and Vocal Tract Shapes Among Speakers in Emotional Speech. ISSP 2017: 24-34
- [c80] Reda Elbarougy, Masato Akagi: Feature selection method for real-time speech emotion recognition. O-COCOSDA 2017: 1-6
- 2016
- [c79] Reda Elbarougy, Masato Akagi: Optimizing Fuzzy Inference Systems for Improving Speech Emotion Recognition. AISI 2016: 85-95
- [c78] Yawen Xue, Yasuhiro Hamada, Masato Akagi: Voice conversion to emotional speech based on three-layered model in dimensional approach and parameterization of dynamic features in prosody. APSIPA 2016: 1-6
- [c77] Xingfeng Li, Masato Akagi: Multilingual Speech Emotion Recognition System Based on a Three-Layer Model. INTERSPEECH 2016: 3608-3612
- 2015
- [c76] Yawen Xue, Yasuhiro Hamada, Masato Akagi: Emotional speech synthesis system based on a three-layered model using a dimensional approach. APSIPA 2015: 505-514
- [c75] Xingfeng Li, Masato Akagi: Toward improving estimation accuracy of emotion dimensions in bilingual scenario based on three-layered model. O-COCOSDA/CASLRE 2015: 21-26
- 2014
- [c74] Masato Akagi, Xiao Han, Reda Elbarougy, Yasuhiro Hamada, Junfeng Li: Toward affective speech-to-speech translation: Strategy for emotional speech recognition and synthesis in multiple languages. APSIPA 2014: 1-10
- [c73] Yasuhiro Hamada, Reda Elbarougy, Masato Akagi: A method for emotional speech synthesis based on the position of emotional state in Valence-Activation space. APSIPA 2014: 1-7
- [c72] Masato Akagi, Xiao Han, Reda Elbarougy, Yasuhiro Hamada, Junfeng Li: Emotional Speech Recognition and Synthesis in Multiple Languages toward Affective Speech-to-Speech Translation System. IIH-MSP 2014: 574-577
- [c71] Shota Morita, Masashi Unoki, Xugang Lu, Masato Akagi: Robust voice activity detection based on concept of modulation transfer function in noisy reverberant environments. ISCSLP 2014: 108-112
- [c70] Thi Duyen Ngo, Masato Akagi, The Duy Bui: Toward a Rule-Based Synthesis of Vietnamese Emotional Speech. KSE 2014: 129-142
- [c69] Reda Elbarougy, Han Xiao, Masato Akagi, Junfeng Li: Toward relaying an affective Speech-to-Speech translator: Cross-language perception of emotional state represented by emotion dimensions. O-COCOSDA 2014: 1-6
- 2013
- [c68] Reda Elbarougy, Masato Akagi: Cross-lingual speech emotion recognition system based on a three-layer model for human perception. APSIPA 2013: 1-10
- [c67] Masashi Unoki, Tomohiro Ikeda, Kyohei Sasaki, Ryota Miyauchi, Masato Akagi, Nam Soo Kim: Blind method of estimating speech transmission index in room acoustics based on concept of modulation transfer function. ChinaSIP 2013: 308-312
- [c66] Duc Thanh Chau, Masato Akagi, Junfeng Li: Improve equalization-cancellation-based sound localization in noisy reverberant environments using direct-to-reverberant energy ratio. ChinaSIP 2013: 322-326
- [c65] Junfeng Li, Masato Akagi, Yonghong Yan: Objective Japanese intelligibility prediction for noisy speech signals before and after noise-reduction processing. ChinaSIP 2013: 352-355
- [c64] Suminori Nishie, Masato Akagi: Acoustic sound source tracking for a moving object using precise Doppler-Shift measurement. EUSIPCO 2013: 1-5
- [c63] Masashi Unoki, Kyohei Sasaki, Ryota Miyauchi, Masato Akagi, Nam Soo Kim: Blind method of estimating speech transmission index from reverberant speech signals. EUSIPCO 2013: 1-5
- [c62] Masato Akagi, Hideki Hisatsune: Admissible Range for Individualization of Head-Related Transfer Function in Median Plane. IIH-MSP 2013: 326-329
- [c61] Junfeng Li, Fei Chen, Masato Akagi, Yonghong Yan: Comparative investigation of objective speech intelligibility prediction measures for noise-reduced signals in Mandarin and Japanese. INTERSPEECH 2013: 1184-1187
- [c60] Trung-Nghia Phung, Chi Mai Luong, Masato Akagi: A hybrid TTS between unit selection and HMM-based TTS under limited data conditions. SSW 2013: 279-284
- 2012
- [c59] Reda Elbarougy, Masato Akagi: Speech emotion recognition system based on a dimensional approach using a three-layered model. APSIPA 2012: 1-9
- [c58] Trung-Nghia Phung, Chi Mai Luong, Masato Akagi: A concatenative speech synthesis for monosyllabic languages with limited data. APSIPA 2012: 1-10
- [c57] Risheng Xia, Junfeng Li, Masato Akagi, Yonghong Yan: Evaluation of objective intelligibility prediction measures for noise-reduced signals in mandarin. ICASSP 2012: 4465-4468
- 2011
- [c56] Masashi Unoki, Xugang Lu, Rico Petrick, Shota Morita, Masato Akagi, Rüdiger Hoffmann: Voice Activity Detection in MTF-Based Power Envelope Restoration. INTERSPEECH 2011: 2609-2612
- 2010
- [c55] Duc Thanh Chau, Junfeng Li, Masato Akagi: A DOA estimation algorithm based on equalization-cancellation theory. INTERSPEECH 2010: 2770-2773
- [c54] Junfeng Li, Lin Yang, Yonghong Yan, Duc Thanh Chau, Masato Akagi: Intelligibility investigation of single-channel noise reduction algorithms for Chinese and Japanese. ISCSLP 2010: 7-11
- 2009
- [c53] Masashi Unoki, Yutaka Yamasaki, Masato Akagi: MTF-based power envelope restoration in noisy reverberant environments. EUSIPCO 2009: 228-232
- [c52] Junfeng Li, Qian-Jie Fu, Hui Jiang, Masato Akagi: Psychoacoustically-motivated adaptive beta-order generalized spectral subtraction for cochlear implant patients. ICASSP 2009: 4665-4668
- [c51] Binh Phu Nguyen, Masato Akagi: Efficient modeling of temporal structure of speech for applications in voice transformation. INTERSPEECH 2009: 1631-1634
- [c50] Junfeng Li, Shuichi Sakamoto, Satoshi Hongo, Masato Akagi, Yôiti Suzuki: Two-stage binaural speech enhancement with wiener filter based on equalization-cancellation model. WASPAA 2009: 133-136
- 2008
- [c49] Junfeng Li, Hui Jiang, Masato Akagi: Psychoacoustically-motivated adaptive β-order generalized spectral subtraction based on data-driven optimization. INTERSPEECH 2008: 171-174
- [c48] Rico Petrick, Xugang Lu, Masashi Unoki, Masato Akagi, Rüdiger Hoffmann: Robust front end processing for speech recognition in reverberant environments: utilization of speech characteristics. INTERSPEECH 2008: 658-661
- [c47] Binh Phu Nguyen, Takeshi Shibata, Masato Akagi: High-quality analysis/synthesis method based on temporal decomposition for speech modification. INTERSPEECH 2008: 662-665
- [c46] Junfeng Li, Shuichi Sakamoto, Satoshi Hongo, Masato Akagi, Yôiti Suzuki: The Improved TS-BASE Approaches with Interference Compensation and Their Evaluations for Speech Enhancement. ISCSLP 2008: 141-144
- 2007
- [c45] Binh Phu Nguyen, Masato Akagi: A flexible spectral modification method based on temporal decomposition and Gaussian mixture model. INTERSPEECH 2007: 538-541
- [c44] Junfeng Li, Shuichi Sakamoto, Satoshi Hongo, Masato Akagi, Yôiti Suzuki: Noise reduction based on adaptive β-order generalized spectral subtraction for speech enhancement. INTERSPEECH 2007: 802-805
- [c43] Thang Tat Vu, Germine Seide, Masashi Unoki, Masato Akagi: Method of LP-based blind restoration for improving intelligibility of bone-conducted speech. INTERSPEECH 2007: 966-969
- [c42] Chun-Fang Huang, Masato Akagi: A rule-based speech morphing for verifying a expressive speech perception model. INTERSPEECH 2007: 2661-2664
- [c41] Takeshi Saitou, Masataka Goto, Masashi Unoki, Masato Akagi: Vocal conversion from speaking voice to singing voice using STRAIGHT. INTERSPEECH 2007: 4005-4006
- [c40] Tatsuya Kitamura, Masato Akagi: Speaker Individualities in Speech Spectral Envelopes and Fundamental Frequency Contours. Speaker Classification (2) 2007: 157-176
- 2006
- [c39] Junfeng Li, Masato Akagi, Yôiti Suzuki: Improved hybrid microphone array post-filter by integrating a robust speech absence probability estimator for speech enhancement. INTERSPEECH 2006
- [c38] Xugang Lu, Masashi Unoki, Masato Akagi: A robust feature extraction based on the MTF concept for speech recognition in reverberant environment. INTERSPEECH 2006
- [c37] Junfeng Li, Masato Akagi, Yôiti Suzuki: Multi-channel Noise Reduction in Noisy Environments. ISCSLP (Selected Papers) 2006: 258-269
- 2005
- [c36] Chun-Fang Huang, Masato Akagi: Toward a Rule-Based Synthesis of Emotional Speech on Linguistic Descriptions of Perception. ACII 2005: 366-373
- [c35] Junfeng Li, Xugang Lu, Masato Akagi: A noise reduction system in arbitrary noise environments and its applications to speech enhancement and speech recognition. ICASSP (3) 2005: 277-280
- [c34] Chun-Fang Huang, Masato Akagi: A multi-layer fuzzy logical model for emotional speech perception. INTERSPEECH 2005: 417-420
- [c33] Masashi Unoki, Masaaki Kubo, Atsushi Haniu, Masato Akagi: A model for selective segregation of a target instrument sound from the mixed sound of various instruments. INTERSPEECH 2005: 2097-2100
- [c32] Junfeng Li, Masato Akagi: A hybrid microphone array post-filter in a diffuse noise field. INTERSPEECH 2005: 2313-2316
- 2004
- [c31] Masashi Unoki, Masato Toi, Masato Akagi: A speech dereverberation method based on the MTF concept using adaptive time-frequency divisions. EUSIPCO 2004: 1689-1692
- [c30] Takeshi Saitou, Naoya Tsuji, Masashi Unoki, Masato Akagi: Analysis of acoustic features affecting "singing-ness" and its application to singing-voice synthesis from speaking-voice. INTERSPEECH 2004: 1925-1928
- [c29] Junfeng Li, Masato Akagi: Noise reduction using hybrid noise estimation technique and post-filtering. INTERSPEECH 2004: 2705-2708
- 2003
- [c28] Phu Chien Nguyen, Masato Akagi, Tu Bao Ho: Temporal decomposition: a promising approach to VQ-based speaker identification. ICASSP (1) 2003: 184-187
- [c27] Masashi Unoki, Masashi Furukawa, Keigo Sakata, Masato Akagi: A method based on the MTF concept for dereverberating the power envelope from the reverberant signal. ICASSP (1) 2003: 888-891
- [c26] Masashi Unoki, Masaaki Kubo, Masato Akagi: A model for selective segregation of a target instrument sound from the mixed sound of various instruments. ICMC 2003
- [c25] Phu Chien Nguyen, Masato Akagi, Tu Bao Ho: Temporal decomposition: a promising approach to VQ-based speaker identification. ICME 2003: 617-620
- [c24] Phu Chien Nguyen, Masato Akagi: Efficient quantization of speech excitation parameters using temporal decomposition. INTERSPEECH 2003: 449-452
- [c23] Masashi Unoki, Keigo Sakata, Masato Akagi: A speech dereverberation method based on the MTF concept. INTERSPEECH 2003: 1417-1420
- 2002
- [c22] Phu Chien Nguyen, Masato Akagi: Limited Error Based Event Localizing Temporal Decomposition. EUSIPCO 2002: 1-4
- [c21] Phu Chien Nguyen, Masato Akagi: Improvement of the restricted temporal decomposition method for line spectral frequency parameters. ICASSP 2002: 265-268
- [c20] Masato Akagi, Takashi Kago: Noise reduction using a small-scale microphone array in multi noise source environment. ICASSP 2002: 909-912
- [c19] Phu Chien Nguyen, Takao Ochi, Masato Akagi: Coding speech at very low rates using straight and temporal decomposition. INTERSPEECH 2002: 1849-1852
- 2001
- [c18] Yuichi Ishimoto, Masashi Unoki, Masato Akagi: A fundamental frequency estimation method for noisy speech based on instantaneous amplitude and frequency. INTERSPEECH 2001: 2439-2442
- 2000
- [c17] Mitsunori Mizumachi, Masato Akagi, Satoshi Nakamura: Design of robust subtractive beamformer for noisy speech recognition. INTERSPEECH 2000: 57-60
- [c16] Masato Akagi, Hironori Kitakaze: Perception of synthesized singing voices with fine fluctuations in their fundamental frequency contours. INTERSPEECH 2000: 458-461
- 1999
- [c15] Masashi Unoki, Masato Akagi: Segregation of vowel in background noise using the model of segregating two acoustic sources based on auditory scene analysis. EUROSPEECH 1999: 2575-2578
- [c14] Mitsunori Mizumachi, Masato Akagi: An objective distortion estimator for hearing aids and its application to noise reduction. EUROSPEECH 1999: 2619-2622
- 1998
- [c13] A. C. R. Nandasena, Masato Akagi: Spectral stability based event localizing temporal decomposition. ICASSP 1998: 957-960
- [c12] Mitsunori Mizumachi, Masato Akagi: Noise reduction by paired-microphones using spectral subtraction. ICASSP 1998: 1001-1004
- [c11] Masato Akagi, Mamoru Iwaki, Tomoya Minakawa: Fundamental frequency fluctuation in continuous vowel utterance and its perception. ICSLP 1998
- [c10] Masato Akagi, Mamoru Iwaki, Noriyoshi Sakaguchi: Spectral sequence compensation based on continuity of spectral sequence. ICSLP 1998
- [c9] Masashi Unoki, Masato Akagi: Signal extraction from noisy signal based on auditory scene analysis. ICSLP 1998
- 1997
- [c8] Masato Akagi, Mitsunori Mizumachi: Noise reduction by paired microphones. EUROSPEECH 1997: 335-338
- [c7] Masashi Unoki, Masato Akagi: A method of signal extraction from noisy signal. EUROSPEECH 1997: 2587-2590
- 1996
- [c6] Yuji Yonezawa, Masato Akagi: Modeling of contextual effects and its application to word spotting. ICSLP 1996: 2063-2066
- 1995
- [c5] Masato Akagi, Taw Ienaga: Speaker individualities in fundamental frequency contours and its control. EUROSPEECH 1995: 439-442
- 1994
- [c4] Masato Akagi, Astrid van Wieringen, Louis C. W. Pols: Perception of central vowel with pre- and post-anchors. ICSLP 1994: 503-506
- [c3] Tatsuya Kitamura, Masato Akagi: Speaker individualities in speech spectral envelopes. ICSLP 1994: 1183-1186
- 1990
- [c2] Masato Akagi: Contextual effect models and psycho acoustic evidence for the models. ICSLP 1990: 569-572
- 1988
- [c1] Masato Akagi, Yoh'ichi Tohkura: On the application of spectrum target prediction model to speech recognition. ICASSP 1988: 139-142
Editorship
- 2022
- [e1] Jianhua Tao, Haizhou Li, Helen Meng, Dong Yu, Masato Akagi, Jiangyan Yi, Cunhang Fan, Ruibo Fu, Shan Lian, Pengyuan Zhang: DDAM@MM 2022: Proceedings of the 1st International Workshop on Deepfake Detection for Audio Multimedia, Lisboa, Portugal, 14 October 2022. ACM 2022, ISBN 978-1-4503-9496-3
Informal and Other Publications
- 2024
- [i4] Kai Li, Khalid Zaman, Xingfeng Li, Masato Akagi, Masashi Unoki: Machine Anomalous Sound Detection Using Spectral-temporal Modulation Representations Derived from Machine-specific Filterbanks. CoRR abs/2409.05319 (2024)
- 2022
- [i3] Tuan Vu Ho, Maori Kobayashi, Masato Akagi: Speak Like a Professional: Increasing Speech Intelligibility by Mimicking Professional Announcer Voice with Voice Conversion. CoRR abs/2206.13021 (2022)
- [i2] Bagus Tris Atmaja, Masato Akagi: Two-stage dimensional emotion recognition by fusing predictions of acoustic and text networks using SVM. CoRR abs/2210.14495 (2022)
- 2020
- [i1] Bagus Tris Atmaja, Masato Akagi: The Effect of Silence Feature in Dimensional Speech Emotion Recognition. CoRR abs/2003.01277 (2020)
last updated on 2024-10-23 21:25 CEST by the dblp team
all metadata released as open data under CC0 1.0 license