


Odyssey 2020: Tokyo, Japan
- Kong-Aik Lee, Takafumi Koshinaka, Koichi Shinoda:

Odyssey 2020: The Speaker and Language Recognition Workshop, 1-5 November 2020, Tokyo, Japan. ISCA 2020
Keynote: Sadaoki Furui
- Sadaoki Furui:

Modeling of Perceptual Speaker Embedding and Its Application to Speech and Speaker Recognition.
Speaker Recognition 1
- Daniel Garcia-Romero, Gregory Sell, Alan McCree:

MagNetO: X-vector Magnitude Estimation Network plus Offset for Improved Speaker Recognition. 1-8 - Shaoshi Ling, Julian Salazar, Yuzong Liu, Katrin Kirchhoff:

BERTphone: Phonetically-aware Encoder Representations for Utterance-level Speaker and Language Recognition. 9-16 - Yingke Zhu, Brian Mak:

Orthogonality Regularizations for End-to-End Speaker Verification. 17-23 - Anna Silnova, Niko Brummer, Johan Rohdin, Themos Stafylakis, Lukás Burget:

Probabilistic Embeddings for Speaker Diarization. 24-31
Speaker and Language Recognition
- Rashmi Kethireddy, Sudarsana Reddy Kadiri, Santosh Kesiraju, Suryakanth V. Gangashetty:

Zero-Time Windowing Cepstral Coefficients for Dialect Classification. 32-38 - Raphaël Duroselle, Denis Jouvet, Irina Illina:

Unsupervised Regularization of the Embedding Extractor for Robust Language Identification. 39-46 - Peng Shen, Xugang Lu, Komei Sugiura, Sheng Li, Hisashi Kawai:

Compensation on x-vector for Short Utterance Spoken Language Identification. 47-52 - Po-Chin Wang, Chia-Ping Chen, Chung-Li Lu, Bo-Cheng Chan, Shan-Wen Hsiao:

Improving Embedding-based Neural-Network Speaker Recognition. 53-59 - Min Hyun Han, Woo Hyun Kang, Sung Hwan Mun, Nam Soo Kim:

Information Preservation Pooling for Speaker Embedding. 60-66 - Ville Vestman, Kong Aik Lee, Tomi Kinnunen:

Neural i-vectors. 67-74 - Mohammad MohammadAmini, Driss Matrouf, Paul-Gauthier Noé:

Denoising x-vectors for Robust Speaker Recognition. 75-80 - Pierre-Michel Bousquet, Mickael Rouvier:

Adaptation Strategy and Clustering from Scratch for New Domains of Speaker Recognition. 81-87 - Mitchell McLaren, Md. Hafizur Rahman, Diego Castán, Mahesh Kumar Nandwana, Aaron Lawson:

Adaptive Mean Normalization for Unsupervised Adaptation of Speaker Embeddings. 88-94
Diarization
- Andreas Stolcke:

Improving Diarization Robustness using Diversification, Randomization and the DOVER Algorithm. 95-101 - Qingjian Lin, Weicheng Cai, Lin Yang, Junjie Wang, Jun Zhang, Ming Li:

DIHARD II is Still Hard: Experimental Results and Discussions from the DKU-LENOVO Team. 102-109 - Liping Chen, Kong-Aik Lee, Lei He, Frank K. Soong:

On Early-stop Clustering for Speaker Diarization. 110-116 - Nikolaos Flemotomos, Panayiotis G. Georgiou, Shrikanth Narayanan:

Linguistically Aided Speaker Diarization Using Speaker Role Information. 117-124 - Qingjian Lin, Tingle Li, Lin Yang, Junjie Wang, Ming Li:

Optimal Mapping Loss: A Faster Loss for End-to-End Speaker Diarization. 125-131
Spoofing and Countermeasure 1
- Tianxiang Chen, Avrosh Kumar, Parav Nagarsheth, Ganesh Sivaraman, Elie Khoury:

Generalization of Audio Deepfake Detection. 132-137 - Qiongqiong Wang, Kong Aik Lee, Takafumi Koshinaka:

Using Multi-Resolution Feature Maps with Convolutional Neural Networks for Anti-Spoofing in ASV. 138-142 - Madhu R. Kamble, Hemant A. Patil:

Novel Variable Length Teager Energy Profiles for Replay Spoof Detection. 143-150 - Anssi Kanervisto, Ville Hautamäki, Tomi Kinnunen, Junichi Yamagishi:

An Initial Investigation on Optimizing Tandem Speaker Verification and Countermeasure Systems Using Reinforcement Learning. 151-158 - Xiaohai Tian, Rohan Kumar Das, Haizhou Li:

Black-box Attacks on Automatic Speaker Verification using Feedback-controlled Voice Conversion. 159-164
Keynote: Mirco Ravanelli
- Mirco Ravanelli:

Towards Unsupervised Learning of Speech Representations.
Special Session: VOiCES 2020
- Mahesh Kumar Nandwana, Michael Lomnitz, Colleen Richey, Mitchell McLaren, Diego Castán, Luciana Ferrer, Aaron Lawson:

The VOiCES from a Distance Challenge 2019: Analysis of Speaker Verification Results and Remaining Challenges. 165-170 - Jee-Weon Jung, Ju-ho Kim, Hye-Jin Shim, Seung-bin Kim, Ha-Jin Yu:

Selective Deep Speaker Embedding Enhancement for Speaker Verification. 171-178 - Aleksei Gusev, Vladimir Volokhov, Tseren Andzhukaev, Sergey Novoselov, Galina Lavrentyeva, Marina Volkova, Alice Gazizullina, Andrey Shulipa, Artem Gorlanov, Anastasia Avdeeva, Artem Ivanov, Alexander Kozlov, Timur Pekhovsky, Yuri Matveev:

Deep Speaker Embeddings for Far-Field Speaker Recognition on Short Utterances. 179-186 - Ladislav Mosner, Oldrich Plchot, Johan Rohdin, Jan Cernocký:

Utilizing VOiCES Dataset for Multichannel Speaker Verification with Beamforming. 187-193 - Raghuveer Peri, Haoqi Li, Krishna Somandepalli, Arindam Jati, Shrikanth Narayanan:

An Empirical Analysis of Information Encoded in Disentangled Neural Speaker Representations. 194-201 - Shreyas Ramoji, Prashant Krishnan V, Sriram Ganapathy:

NPLDA: A Deep Neural PLDA Model for Speaker Verification. 202-209 - Weiwei Lin, Man-Wai Mak, Lu Yi:

Learning Mixture Representation for Deep Speaker Embedding Using Attention. 210-214
Voice Conversion and Synthesis
- Dongsuk Yook, Seong-Gyun Leem, Keonnyeong Lee, In-Chul Yoo:

Many-to-Many Voice Conversion Using Cycle-Consistent Variational Autoencoder with Multiple Decoders. 215-221 - Jennifer Williams, Joanna Rownicka, Pilar Oplustil, Simon King:

Comparison of Speech Representations for Automatic Quality Estimation in Multi-Speaker Text-to-Speech Synthesis. 222-229 - Kun Zhou, Berrak Sisman, Haizhou Li:

Transforming Spectrum and Prosody for Emotional Voice Conversion with Non-Parallel Training Data. 230-237 - Berrak Sisman, Haizhou Li:

Generative Adversarial Networks for Singing Voice Conversion with and without Parallel Data. 238-244 - Rui Liu, Berrak Sisman, Feilong Bao, Guanglai Gao, Haizhou Li:

WaveTTS: Tacotron-based TTS with Joint Time-Frequency Domain Loss. 245-251 - Xiaoxue Gao, Xiaohai Tian, Yi Zhou, Rohan Kumar Das, Haizhou Li:

Personalized Singing Voice Generation Using WaveRNN. 252-258
Evaluation and Benchmarking
- Seyed Omid Sadjadi, Craig S. Greenberg, Elliot Singer, Douglas A. Reynolds, Lisa P. Mason, Jaime Hernandez-Cordero:

The 2019 NIST Audio-Visual Speaker Recognition Evaluation. 259-265 - Seyed Omid Sadjadi, Craig S. Greenberg, Elliot Singer, Douglas A. Reynolds, Lisa P. Mason, Jaime Hernandez-Cordero:

The 2019 NIST Speaker Recognition Evaluation CTS Challenge. 266-272 - Jesús Antonio Villalba López, Daniel Garcia-Romero, Nanxin Chen, Gregory Sell, Jonas Borgstrom, Alan McCree, Leibny Paola García-Perera, Saurabh Kataria, Phani Sankar Nidadavolu, Pedro Torres-Carrasquillo, Najim Dehak:

Advances in Speaker Recognition for Telephone and Audio-Visual Data: the JHU-MIT Submission for NIST SRE19. 273-280 - Shreyas Ramoji, Prashant Krishnan V, Bhargavram Mysore, Prachi Singh, Sriram Ganapathy:

LEAP System for SRE 2019 CTS Challenge - Improvements and Error Analysis. 281-288 - Jahangir Alam, Gilles Boulianne, Lukás Burget, Mohamed Dahmane, Mireia Díez Sánchez, Alicia Lozano-Diez, Ondrej Glembek, Pierre-Luc St-Charles, Marc Lalonde, Pavel Matejka, Petr Mizera, João Monteiro, Ladislav Mosner, Cedric Noiseux, Ondrej Novotný, Oldrich Plchot, Johan Rohdin, Anna Silnova, Josef Slavícek, Themos Stafylakis, Shuai Wang, Hossein Zeinali:

Analysis of ABC Submission to NIST SRE 2019 CMN and VAST Challenge. 289-295
Keynote: Luciana Ferrer
- Luciana Ferrer:

The importance of Calibration in Speaker Verification.
Spoofing and Countermeasure 2
- João Monteiro, Jahangir Alam, Tiago H. Falk:

A Multi-condition Training Strategy for Countermeasures Against Spoofing Attacks to Speaker Recognizers. 296-303 - Madhu R. Kamble, Aditya Krishna Sai Pulikonda, Maddala Venkata Siva Krishna, Hemant A. Patil:

Analysis of Teager Energy Profiles for Spoof Speech Detection. 304-311 - Itshak Lapidot, Jean-François Bonastre:

Effects of Waveform PMF on Anti-spoofing Detection for Replay Data - ASVspoof 2019. 312-318 - Sung-Hyun Yoon, Min-Sung Koh, Ha-Jin Yu:

Phase Spectrum of Time-flipped Speech Signals for Robust Spoofing Detection. 319-325 - Bence Mark Halpern, Finnian Kelly, Rob van Son, Anil Alexander:

Residual Networks for Resisting Noise: Analysis of an Embeddings-based Spoofing Countermeasure. 326-332 - Hemlata Tak, Jose Patino, Andreas Nautsch, Nicholas W. D. Evans, Massimiliano Todisco:

An Explainability Study of the Constant Q Cepstral Coefficient Spoofing Countermeasure for Automatic Speaker Verification. 333-340 - Bhusan Chettri, Tomi Kinnunen, Emmanouil Benetos:

Subband Modeling for Spoofing Detection in Automatic Speaker Verification. 341-348
Speaker Recognition 2
- Joon Son Chung, Jaesung Huh, Seongkyu Mun:

Delving into VoxCeleb: Environment Invariant Speaker Recognition. 349-356 - Chau Luu, Peter Bell, Steve Renals:

Dropping Classes for Deep Speaker Representation Learning. 357-364 - Xu Li, Jinghua Zhong, Jianwei Yu, Shoukang Hu, Xixin Wu, Xunying Liu, Helen Meng:

Bayesian x-vector: Bayesian Neural Network based x-vector System for Speaker Verification. 365-371 - Luciana Ferrer, Mitchell McLaren:

A Speaker Verification Backend for Improved Calibration Performance across Varying Conditions. 372-379 - Zhongxin Bai, Xiao-Lei Zhang, Jingdong Chen:

Partial AUC Metric Learning Based Speaker Verification Back-End. 380-384
Speech Application
- Sheng Li, Xugang Lu, Raj Dabre, Peng Shen, Hisashi Kawai:

Joint Training End-to-End Speech Recognition Systems with Speaker Attributes. 385-390 - Jilong Wu, Yiteng Huang, Hyun-Jin Park, Niranjan Subrahmanya, Patrick Violette:

Small Footprint Multi-channel Keyword Spotting. 391-395 - Rasa Lileikyte, Dwight Irvin, John H. L. Hansen:

Assessing Child Communication Engagement via Speech Recognition in Naturalistic Active Learning Spaces. 396-401 - David van der Vloed, Finnian Kelly, Anil Alexander:

Exploring the Effects of Device Variability on Forensic Speaker Comparison Using VOCALISE and NFI-FRIDA, A Forensically Realistic Database. 402-407 - Kevin Wilkinghoff:

On Open-Set Speaker Identification with I-Vectors. 408-414 - Leibny Paola García-Perera, Jesús Villalba, Hervé Bredin, Jun Du, Diego Castán, Alejandrina Cristià, Latané Bullock, Ling Guo, Koji Okabe, Phani Sankar Nidadavolu, Saurabh Kataria, Sizhu Chen, Léo Galmant, Marvin Lavechin, Lei Sun, Marie-Philippe Gill, Bar Ben-Yair, Sajjad Abdoli, Xin Wang, Wassim Bouaziz, Hadrien Titeux, Emmanuel Dupoux, Kong Aik Lee, Najim Dehak:

Speaker Detection in the Wild: Lessons Learned from JSALT 2019. 415-422 - Chien-Lin Huang:

Speaker Characterization Using TDNN, TDNN-LSTM, TDNN-LSTM-Attention based Speaker Embeddings for NIST SRE 2019. 423-427 - Tianyu Liang, Yi Liu, Can Xu, Xianwei Zhang, Liang He:

Combined Vector Based on Factorized Time-delay Neural Network for Text-Independent Speaker Recognition. 428-432
Speaker Recognition 3
- Shaojin Ding, Quan Wang, Shuo-Yiin Chang, Li Wan, Ignacio López-Moreno:

Personal VAD: Speaker-Conditioned Voice Activity Detection. 433-439 - Ganesh Sivaraman, Amruta Vidwans, Elie Khoury:

Speech Bandwidth Expansion For Speaker Recognition On Telephony Audio. 440-445 - Haruna Miyamoto, Sayaka Shiota, Hitoshi Kiya:

Application of Bandwidth Extension with No Learning to Data Augmentation for Speaker Verification. 446-450 - Yanpei Shi, Qiang Huang, Thomas Hain:

Robust Speaker Recognition Using Speech Enhancement And Attention Model. 451-458 - Saurabh Kataria, Phani Sankar Nidadavolu, Jesús Villalba, Najim Dehak:

Analysis of Deep Feature Loss Based Enhancement for Speaker Verification. 459-466















