John R. Hershey

Person information

affiliation: Mitsubishi Electric Research Laboratories (MERL), Cambridge, USA
affiliation: IBM T. J. Watson Research Center, New York, USA
affiliation: University of California San Diego, Department of Cognitive Science

2020 – today

2025
[j12]
Simon Leglaive, Matthieu Fraticelli, Hend Elghazaly, Léonie Borne, Mostafa Sadeghi, Scott Wisdom, Manuel Pariente, John R. Hershey, Daniel Pressnitzer, Jon P. Barker:
Objective and subjective evaluation of speech enhancement methods in the UDASE task of the 7th CHiME challenge. Comput. Speech Lang. 89: 101685 (2025)
[c116]
Jae-Sung Bae, Anastasia Kuznetsova, Dinesh Manocha, John R. Hershey, Trausti T. Kristjansson, Minje Kim:
Generative Data Augmentation Challenge: Zero-Shot Speech Synthesis for Personalized Speech Enhancement. ICASSP Workshops 2025: 1-5
[c115]
Artem Dementyev, Chandan K. A. Reddy, Scott Wisdom, Navin Chatlani, John R. Hershey, Richard F. Lyon:
Towards Sub-millisecond Latency Real-Time Speech Enhancement Models on Hearables. ICASSP 2025: 1-5
[c114]
Jackie Lin, Georg Götz, Hermes Sampedro Llopis, Haukur Hafsteinsson, Steinar Guðjónsson, Daniel Gert Nielsen, Finnur Pind, Paris Smaragdis, Dinesh Manocha, John R. Hershey, Trausti T. Kristjansson, Minje Kim:
Generative Data Augmentation Challenge: Synthesis of Room Acoustics for Speaker Distance Estimation. ICASSP Workshops 2025: 1-5
[c113]
Shaden Naif Alshammari, John R. Hershey, Axel Feldmann, William T. Freeman, Mark Hamilton:
I-Con: A Unifying Framework for Representation Learning. ICLR 2025
[c112]
Robin Scheibler, John R. Hershey, Arnaud Doucet, Henry Li:
Source Separation by Flow Matching. WASPAA 2025: 1-5
[i49]
Jackie Lin, Georg Götz, Hermes Sampedro Llopis, Haukur Hafsteinsson, Steinar Guðjónsson, Daniel Gert Nielsen, Finnur Pind, Paris Smaragdis, Dinesh Manocha, John R. Hershey, Trausti T. Kristjansson, Minje Kim:
Generative Data Augmentation Challenge: Synthesis of Room Acoustics for Speaker Distance Estimation. CoRR abs/2501.13250 (2025)
[i48]
Jae-Sung Bae, Anastasia Kuznetsova, Dinesh Manocha, John R. Hershey, Trausti T. Kristjansson, Minje Kim:
Generative Data Augmentation Challenge: Zero-Shot Speech Synthesis for Personalized Speech Enhancement. CoRR abs/2501.13372 (2025)
[i47]
Shaden Naif Alshammari, John R. Hershey, Axel Feldmann, William T. Freeman, Mark Hamilton:
I-Con: A Unifying Framework for Representation Learning. CoRR abs/2504.16929 (2025)
[i46]
Robin Scheibler, John R. Hershey, Arnaud Doucet, Henry Li:
Source Separation by Flow Matching. CoRR abs/2505.16119 (2025)
[i45]
Daniel P. W. Ellis, Eduardo Fonseca, Ron J. Weiss, Kevin W. Wilson, Scott Wisdom, Hakan Erdogan, John R. Hershey, Aren Jansen, R. Channing Moore, Manoj Plakal:
Recomposer: Event-roll-guided generative audio editing. CoRR abs/2509.05256 (2025)
2024
[c111]
Mark Hamilton, Andrew Zisserman, John R. Hershey, William T. Freeman:
Separating the "Chirp" from the "Chat": Self-supervised Visual Grounding of Sound and Language. CVPR 2024: 13117-13127
[c110]
Cong Han, Kevin W. Wilson, Scott Wisdom, John R. Hershey:
Unsupervised Multi-Channel Separation And Adaptation. ICASSP 2024: 721-725
[c109]
Jacob Kealey, John R. Hershey, François Grondin:
Unsupervised Improved MVDR Beamforming for Sound Enhancement. INTERSPEECH 2024
[i44]
Simon Leglaive, Matthieu Fraticelli, Hend Elghazaly, Léonie Borne, Mostafa Sadeghi, Scott Wisdom, Manuel Pariente, John R. Hershey, Daniel Pressnitzer, Jon P. Barker:
Objective and subjective evaluation of speech enhancement methods in the UDASE task of the 7th CHiME challenge. CoRR abs/2402.01413 (2024)
[i43]
Mark Hamilton, Andrew Zisserman, John R. Hershey, William T. Freeman:
Separating the "Chirp" from the "Chat": Self-supervised Visual Grounding of Sound and Language. CoRR abs/2406.05629 (2024)
[i42]
Jacob Kealey, John R. Hershey, François Grondin:
Unsupervised Improved MVDR Beamforming for Sound Enhancement. CoRR abs/2406.06310 (2024)
[i41]
Artem Dementyev, Chandan K. A. Reddy, Scott Wisdom, Navin Chatlani, John R. Hershey, Richard F. Lyon:
Towards sub-millisecond latency real-time speech enhancement models on hearables. CoRR abs/2409.18239 (2024)
[i40]
Huy Tran, Yikun Bai, Ashkan Shahbazi, John R. Hershey, Soheil Kolouri:
Understanding Learning with Sliced-Wasserstein Requires Rethinking Informative Slices. CoRR abs/2411.10651 (2024)
2023
[c108]
Pradyumna Reddy, Scott Wisdom, Klaus Greff, John R. Hershey, Thomas Kipf:
Audioslots: A Slot-Centric Generative Model For Audio Separation. ICASSP Workshops 2023: 1-5
[c107]
Hakan Erdogan, Scott Wisdom, Xuankai Chang, Zalán Borsos, Marco Tagliasacchi, Neil Zeghidour, John R. Hershey:
TokenSplit: Using Discrete Speech Representations for Direct, Refined, and Transcript-Conditioned Speech Separation and Recognition. INTERSPEECH 2023: 3462-3466
[i39]
Pradyumna Reddy, Scott Wisdom, Klaus Greff, John R. Hershey, Thomas Kipf:
AudioSlots: A slot-centric generative model for audio separation. CoRR abs/2305.05591 (2023)
[i38]
Cong Han, Kevin W. Wilson, Scott Wisdom, John R. Hershey:
Unsupervised Multi-channel Separation and Adaptation. CoRR abs/2305.11151 (2023)
[i37]
Simon Leglaive, Léonie Borne, Efthymios Tzinis, Mostafa Sadeghi, Matthieu Fraticelli, Scott Wisdom, Manuel Pariente, Daniel Pressnitzer, John R. Hershey:
The CHiME-7 UDASE task: Unsupervised domain adaptation for conversational speech enhancement. CoRR abs/2307.03533 (2023)
[i36]
Hakan Erdogan, Scott Wisdom, Xuankai Chang, Zalán Borsos, Marco Tagliasacchi, Neil Zeghidour, John R. Hershey:
TokenSplit: Using Discrete Speech Representations for Direct, Refined, and Transcript-Conditioned Speech Separation and Recognition. CoRR abs/2308.10415 (2023)
2022
[c106]
Efthymios Tzinis, Scott Wisdom, Tal Remez, John R. Hershey:
AudioScopeV2: Audio-Visual Attention Architectures for Calibrated Open-Domain On-Screen Sound Separation. ECCV (37) 2022: 368-385
[c105]
Tom Denton, Scott Wisdom, John R. Hershey:
Improving Bird Classification with Unsupervised Sound Separation. ICASSP 2022: 636-640
[c104]
Aswin Sivaraman, Scott Wisdom, Hakan Erdogan, John R. Hershey:
Adapting Speech Separation to Real-World Meetings using Mixture Invariant Training. ICASSP 2022: 686-690
[c103]
Hannah Muckenhirn, Aleksandr Safin, Hakan Erdogan, Felix de Chaumont Quitry, Marco Tagliasacchi, Scott Wisdom, John R. Hershey:
CycleGAN-based Unpaired Speech Dereverberation. INTERSPEECH 2022: 196-200
[c102]
Katharine Patterson, Kevin W. Wilson, Scott Wisdom, John R. Hershey:
Distance-Based Sound Separation. INTERSPEECH 2022: 901-905
[i35]
Hannah Muckenhirn, Aleksandr Safin, Hakan Erdogan, Felix de Chaumont Quitry, Marco Tagliasacchi, Scott Wisdom, John R. Hershey:
CycleGAN-Based Unpaired Speech Dereverberation. CoRR abs/2203.15652 (2022)
[i34]
Katharine Patterson, Kevin W. Wilson, Scott Wisdom, John R. Hershey:
Distance-Based Sound Separation. CoRR abs/2207.00562 (2022)
[i33]
Efthymios Tzinis, Scott Wisdom, Tal Remez, John R. Hershey:
AudioScopeV2: Audio-Visual Attention Architectures for Calibrated Open-Domain On-Screen Sound Separation. CoRR abs/2207.10141 (2022)
2021
[c101]
Scott Wisdom, Hakan Erdogan, Daniel P. W. Ellis, Romain Serizel, Nicolas Turpault, Eduardo Fonseca, Justin Salamon, Prem Seetharaman, John R. Hershey:
What's all the Fuss about Free Universal Sound Separation Data? ICASSP 2021: 186-190
[c100]
Nicolas Turpault, Romain Serizel, Scott Wisdom, Hakan Erdogan, John R. Hershey, Eduardo Fonseca, Prem Seetharaman, Justin Salamon:
Sound Event Detection and Separation: A Benchmark on Desed Synthetic Soundscapes. ICASSP 2021: 840-844
[c99]
Soumi Maiti, Hakan Erdogan, Kevin W. Wilson, Scott Wisdom, Shinji Watanabe, John R. Hershey:
End-To-End Diarization for Variable Number of Speakers with Local-Global Networks and Discriminative Speaker Embeddings. ICASSP 2021: 7183-7187
[c98]
Efthymios Tzinis, Scott Wisdom, Aren Jansen, Shawn Hershey, Tal Remez, Dan Ellis, John R. Hershey:
Into the Wild with AudioScope: Unsupervised Audio-Visual Separation of On-Screen Sounds. ICLR 2021
[c97]
Cong Han, Yi Luo, Chenda Li, Tianyan Zhou, Keisuke Kinoshita, Shinji Watanabe, Marc Delcroix, Hakan Erdogan, John R. Hershey, Nima Mesgarani, Zhuo Chen:
Continuous Speech Separation Using Speaker Inventory for Long Recording. Interspeech 2021: 3036-3040
[c96]
Desh Raj, Pavel Denisov, Zhuo Chen, Hakan Erdogan, Zili Huang, Maokui He, Shinji Watanabe, Jun Du, Takuya Yoshioka, Yi Luo, Naoyuki Kanda, Jinyu Li, Scott Wisdom, John R. Hershey:
Integration of Speech Separation, Diarization, and Recognition for Multi-Speaker Meetings: System Description, Comparison, and Analysis. SLT 2021: 897-904
[c95]
Zhong-Qiu Wang, Hakan Erdogan, Scott Wisdom, Kevin W. Wilson, Desh Raj, Shinji Watanabe, Zhuo Chen, John R. Hershey:
Sequential Multi-Frame Neural Beamforming for Speech Separation and Enhancement. SLT 2021: 905-911
[c94]
Scott Wisdom, Aren Jansen, Ron J. Weiss, Hakan Erdogan, John R. Hershey:
Sparse, Efficient, and Semantic Mixture Invariant Training: Taming In-the-Wild Unsupervised Sound Separation. WASPAA 2021: 51-55
[c93]
Yuma Koizumi, Shigeki Karita, Scott Wisdom, Hakan Erdogan, John R. Hershey, Llion Jones, Michiel Bacchiani:
DF-Conformer: Integrated Architecture of Conv-Tasnet and Conformer Using Linear Complexity Self-Attention for Speech Enhancement. WASPAA 2021: 161-165
[c92]
Eduardo Fonseca, Aren Jansen, Daniel P. W. Ellis, Scott Wisdom, Marco Tagliasacchi, John R. Hershey, Manoj Plakal, Shawn Hershey, R. Channing Moore, Xavier Serra:
Self-Supervised Learning from Automatically Separated Sound Scenes. WASPAA 2021: 251-255
[i32]
Soumi Maiti, Hakan Erdogan, Kevin W. Wilson, Scott Wisdom, Shinji Watanabe, John R. Hershey:
End-to-End Diarization for Variable Number of Speakers with Local-Global Networks and Discriminative Speaker Embeddings. CoRR abs/2105.02096 (2021)
[i31]
Eduardo Fonseca, Aren Jansen, Daniel P. W. Ellis, Scott Wisdom, Marco Tagliasacchi, John R. Hershey, Manoj Plakal, Shawn Hershey, R. Channing Moore, Xavier Serra:
Self-Supervised Learning from Automatically Separated Sound Scenes. CoRR abs/2105.02132 (2021)
[i30]
Scott Wisdom, Aren Jansen, Ron J. Weiss, Hakan Erdogan, John R. Hershey:
Sparse, Efficient, and Semantic Mixture Invariant Training: Taming In-the-Wild Unsupervised Sound Separation. CoRR abs/2106.00847 (2021)
[i29]
Efthymios Tzinis, Scott Wisdom, Tal Remez, John R. Hershey:
Improving On-Screen Sound Separation for Open Domain Videos with Audio-Visual Self-attention. CoRR abs/2106.09669 (2021)
[i28]
Yuma Koizumi, Shigeki Karita, Scott Wisdom, Hakan Erdogan, John R. Hershey, Llion Jones, Michiel Bacchiani:
DF-Conformer: Integrated architecture of Conv-TasNet and Conformer using linear complexity self-attention for speech enhancement. CoRR abs/2106.15813 (2021)
[i27]
Aswin Sivaraman, Scott Wisdom, Hakan Erdogan, John R. Hershey:
Adapting Speech Separation to Real-World Meetings Using Mixture Invariant Training. CoRR abs/2110.10739 (2021)
2020
[c91]
Nicolas Turpault, Scott Wisdom, Hakan Erdogan, John R. Hershey, Romain Serizel, Eduardo Fonseca, Prem Seetharaman, Justin Salamon:
Improving Sound Event Detection in Domestic Environments using Sound Separation. DCASE 2020: 205-209
[c90]
Efthymios Tzinis, Scott Wisdom, John R. Hershey, Aren Jansen, Daniel P. W. Ellis:
Improving Universal Sound Separation Using Sound Classification. ICASSP 2020: 96-100
[c89]
Scott Wisdom, Efthymios Tzinis, Hakan Erdogan, Ron J. Weiss, Kevin W. Wilson, John R. Hershey:
Unsupervised Sound Separation Using Mixture Invariant Training. NeurIPS 2020
[i26]
Scott Wisdom, Efthymios Tzinis, Hakan Erdogan, Ron J. Weiss, Kevin W. Wilson, John R. Hershey:
Unsupervised Sound Separation Using Mixtures of Mixtures. CoRR abs/2006.12701 (2020)
[i25]
Nicolas Turpault, Scott Wisdom, Hakan Erdogan, John R. Hershey, Romain Serizel, Eduardo Fonseca, Prem Seetharaman, Justin Salamon:
Improving Sound Event Detection In Domestic Environments Using Sound Separation. CoRR abs/2007.03932 (2020)
[i24]
Nicolas Turpault, Romain Serizel, Scott Wisdom, Hakan Erdogan, John R. Hershey, Eduardo Fonseca, Prem Seetharaman, Justin Salamon:
Sound Event Detection and Separation: a Benchmark on Desed Synthetic Soundscapes. CoRR abs/2011.00801 (2020)
[i23]
Scott Wisdom, Hakan Erdogan, Daniel P. W. Ellis, Romain Serizel, Nicolas Turpault, Eduardo Fonseca, Justin Salamon, Prem Seetharaman, John R. Hershey:
What's All the FUSS About Free Universal Sound Separation Data? CoRR abs/2011.00803 (2020)
[i22]
Efthymios Tzinis, Scott Wisdom, Aren Jansen, Shawn Hershey, Tal Remez, Daniel P. W. Ellis, John R. Hershey:
Into the Wild with AudioScope: Unsupervised Audio-Visual Separation of On-Screen Sounds. CoRR abs/2011.01143 (2020)
[i21]
Desh Raj, Pavel Denisov, Zhuo Chen, Hakan Erdogan, Zili Huang, Mao-Kui He, Shinji Watanabe, Jun Du, Takuya Yoshioka, Yi Luo, Naoyuki Kanda, Jinyu Li, Scott Wisdom, John R. Hershey:
Integration of speech separation, diarization, and recognition for multi-speaker meetings: System description, comparison, and analysis. CoRR abs/2011.02014 (2020)
[i20]
Cong Han, Yi Luo, Chenda Li, Tianyan Zhou, Keisuke Kinoshita, Shinji Watanabe, Marc Delcroix, Hakan Erdogan, John R. Hershey, Nima Mesgarani, Zhuo Chen:
Continuous Speech Separation Using Speaker Inventory for Long Multi-talker Recording. CoRR abs/2012.09727 (2020)

2010 – 2019

2019
[j11]
Takaaki Hori, Wen Wang, Yusuke Koji, Chiori Hori, Bret Harsham, John R. Hershey:
Adversarial training and decoding strategies for end-to-end neural conversation models. Comput. Speech Lang. 54: 122-139 (2019)
[j10]
Jonathan Le Roux, Gordon Wichern, Shinji Watanabe, Andy M. Sarroff, John R. Hershey:
Phasebook and Friends: Leveraging Discrete Representations for Source Separation. IEEE J. Sel. Top. Signal Process. 13(2): 370-382 (2019)
[c88]
Jonathan Le Roux, Gordon Wichern, Shinji Watanabe, Andy M. Sarroff, John R. Hershey:
The Phasebook: Building Complex Masks via Discrete Representations for Source Separation. ICASSP 2019: 66-70
[c87]
Jonathan Le Roux, Scott Wisdom, Hakan Erdogan, John R. Hershey:
SDR - Half-baked or Well Done? ICASSP 2019: 626-630
[c86]
Scott Wisdom, John R. Hershey, Kevin W. Wilson, Jeremy Thorpe, Michael Chinen, Brian Patton, Rif A. Saurous:
Differentiable Consistency Constraints for Improved Deep Speech Enhancement. ICASSP 2019: 900-904
[c85]
Quan Wang, Hannah Muckenhirn, Kevin W. Wilson, Prashant Sridhar, Zelin Wu, John R. Hershey, Rif A. Saurous, Ron J. Weiss, Ye Jia, Ignacio López-Moreno:
VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking. INTERSPEECH 2019: 2728-2732
[c84]
Hiroshi Seki, Takaaki Hori, Shinji Watanabe, Jonathan Le Roux, John R. Hershey:
End-to-End Multilingual Multi-Speaker Speech Recognition. INTERSPEECH 2019: 3755-3759
[c83]
Ilya Kavalerov, Scott Wisdom, Hakan Erdogan, Brian Patton, Kevin W. Wilson, Jonathan Le Roux, John R. Hershey:
Universal Sound Separation. WASPAA 2019: 175-179
[i19]
Ilya Kavalerov, Scott Wisdom, Hakan Erdogan, Brian Patton, Kevin W. Wilson, Jonathan Le Roux, John R. Hershey:
Universal Sound Separation. CoRR abs/1905.03330 (2019)
[i18]
Efthymios Tzinis, Scott Wisdom, John R. Hershey, Aren Jansen, Daniel P. W. Ellis:
Improving Universal Sound Separation Using Sound Classification. CoRR abs/1911.07951 (2019)
[i17]
Zhong-Qiu Wang, Scott Wisdom, Kevin W. Wilson, John R. Hershey:
Alternating Between Spectral and Spatial Estimation for Speech Separation and Enhancement. CoRR abs/1911.07953 (2019)
2018
[c82]
Hiroshi Seki, Takaaki Hori, Shinji Watanabe, Jonathan Le Roux, John R. Hershey:
A Purely End-to-End System for Multi-speaker Speech Recognition. ACL (1) 2018: 2620-2630
[c81]
- view
  authority control:
- export record
  dblp key:
  - conf/icassp/WangRH18
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icassp/WangRH18
Zhong-Qiu Wang, Jonathan Le Roux, John R. Hershey:
Multi-Channel Deep Clustering: Discriminative Spectral and Spatial Embeddings for Speaker-Independent Speech Separation. ICASSP 2018: 1-5
[c80]
- view
  authority control:
- export record
  dblp key:
  - conf/icassp/WangRH18a
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icassp/WangRH18a
Zhong-Qiu Wang, Jonathan Le Roux, John R. Hershey:
Alternative Objective Functions for Deep Clustering. ICASSP 2018: 686-690
[c79]
- view
  authority control:
- export record
  dblp key:
  - conf/icassp/SettleRHWH18
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icassp/SettleRHWH18
Shane Settle, Jonathan Le Roux, Takaaki Hori, Shinji Watanabe, John R. Hershey:
End-to-End Multi-Speaker Speech Recognition. ICASSP 2018: 4819-4823
[c78]
- view
  authority control:
- export record
  dblp key:
  - conf/icassp/SekiWHRH18
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icassp/SekiWHRH18
Hiroshi Seki, Shinji Watanabe, Takaaki Hori, Jonathan Le Roux, John R. Hershey:
An End-to-End Language-Tracking Speech Recognizer for Mixed-Language Speech. ICASSP 2018: 4919-4923
[c77]
- view
  authority control:
- export record
  dblp key:
  - conf/icassp/OchiaiWKHH18
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icassp/OchiaiWKHH18
Tsubasa Ochiai, Shinji Watanabe, Shigeru Katagiri, Takaaki Hori, John R. Hershey:
Speaker Adaptation for Multichannel End-to-End Speech Recognition. ICASSP 2018: 6707-6711
[c76]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - conf/interspeech/WangRWH18
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/WangRWH18
Zhong-Qiu Wang, Jonathan Le Roux, DeLiang Wang, John R. Hershey:
End-to-End Speech Separation with Unfolded Iterative Phase Reconstruction. INTERSPEECH 2018: 2708-2712
[c75]
- view
  authority control:
- export record
  dblp key:
  - conf/iwaenc/WilsonCTPHSSL18
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/iwaenc/WilsonCTPHSSL18
Kevin W. Wilson, Michael Chinen, Jeremy Thorpe, Brian Patton, John R. Hershey, Rif A. Saurous, Jan Skoglund, Richard F. Lyon:
Exploring Tradeoffs in Models for Low-Latency Speech Enhancement. IWAENC 2018: 366-370
[i16]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/abs-1804-10204
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1804-10204
Zhong-Qiu Wang, Jonathan Le Roux, DeLiang Wang, John R. Hershey:
End-to-End Speech Separation with Unfolded Iterative Phase Reconstruction. CoRR abs/1804.10204 (2018)
[i15]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/abs-1805-05826
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1805-05826
Hiroshi Seki, Takaaki Hori, Shinji Watanabe, Jonathan Le Roux, John R. Hershey:
A Purely End-to-end System for Multi-speaker Speech Recognition. CoRR abs/1805.05826 (2018)
[i14]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/abs-1810-01395
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1810-01395
Jonathan Le Roux, Gordon Wichern, Shinji Watanabe, Andy M. Sarroff, John R. Hershey:
Phasebook and Friends: Leveraging Discrete Representations for Source Separation. CoRR abs/1810.01395 (2018)
[i13]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/abs-1810-04826
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1810-04826
Quan Wang, Hannah Muckenhirn, Kevin W. Wilson, Prashant Sridhar, Zelin Wu, John R. Hershey, Rif A. Saurous, Ron J. Weiss, Ye Jia, Ignacio López-Moreno:
VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking. CoRR abs/1810.04826 (2018)
[i12]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/abs-1811-02508
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1811-02508
Jonathan Le Roux, Scott Wisdom, Hakan Erdogan, John R. Hershey:
SDR - half-baked or well done? CoRR abs/1811.02508 (2018)
[i11]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/abs-1811-07030
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1811-07030
Kevin W. Wilson, Michael Chinen, Jeremy Thorpe, Brian Patton, John R. Hershey, Rif A. Saurous, Jan Skoglund, Richard F. Lyon:
Exploring Tradeoffs in Models for Low-latency Speech Enhancement. CoRR abs/1811.07030 (2018)
[i10]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/abs-1811-08521
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1811-08521
Scott Wisdom, John R. Hershey, Kevin W. Wilson, Jeremy Thorpe, Michael Chinen, Brian Patton, Rif A. Saurous:
Differentiable Consistency Constraints for Improved Deep Speech Enhancement. CoRR abs/1811.08521 (2018)
2017
[j9]
- view
  authority control:
- export record
  dblp key:
  - journals/csl/HoriCEHRMW17
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/csl/HoriCEHRMW17
Takaaki Hori, Zhuo Chen, Hakan Erdogan, John R. Hershey, Jonathan Le Roux, Vikramjit Mitra, Shinji Watanabe:
Multi-microphone speech recognition integrating beamforming, robust feature extraction, and advanced DNN/RNN backend. Comput. Speech Lang. 46: 401-418 (2017)
[j8]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/jip/TachiokaWRH17
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/jip/TachiokaWRH17
Yuuki Tachioka, Shinji Watanabe, Jonathan Le Roux, John R. Hershey:
Prior-based Binary Masking and Discriminative Methods for Reverberant and Noisy Speech Recognition Using Distant Stereo Microphones. J. Inf. Process. 25: 407-416 (2017)
[j7]
- view
  authority control:
- export record
  dblp key:
  - journals/jstsp/WatanabeHKHH17
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/jstsp/WatanabeHKHH17
Shinji Watanabe, Takaaki Hori, Suyoun Kim, John R. Hershey, Tomoki Hayashi:
Hybrid CTC/Attention Architecture for End-to-End Speech Recognition. IEEE J. Sel. Top. Signal Process. 11(8): 1240-1253 (2017)
[j6]
- view
  authority control:
- export record
  dblp key:
  - journals/jstsp/OchiaiWHHX17
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/jstsp/OchiaiWHHX17
Tsubasa Ochiai, Shinji Watanabe, Takaaki Hori, John R. Hershey, Xiong Xiao:
Unified Architecture for Multichannel End-to-End Speech Recognition With Neural Beamforming. IEEE J. Sel. Top. Signal Process. 11(8): 1274-1288 (2017)
[c74]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - conf/acl/HoriWH17
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/acl/HoriWH17
Takaaki Hori, Shinji Watanabe, John R. Hershey:
Joint CTC/attention decoding for end-to-end speech recognition. ACL (1) 2017: 518-529
[c73]
- view
  authority control:
- export record
  dblp key:
  - conf/asru/WatanabeHH17
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/asru/WatanabeHH17
Shinji Watanabe, Takaaki Hori, John R. Hershey:
Language independent end-to-end architecture for joint language identification and speech recognition. ASRU 2017: 265-271
[c72]
- view
  authority control:
- export record
  dblp key:
  - conf/asru/HoriWH17
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/asru/HoriWH17
Takaaki Hori, Shinji Watanabe, John R. Hershey:
Multi-level language modeling and decoding for open vocabulary end-to-end speech recognition. ASRU 2017: 287-293
[c71]
- view
  authority control:
- export record
  dblp key:
  - conf/asru/HoriHMH17
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/asru/HoriHMH17
Chiori Hori, Takaaki Hori, Tim K. Marks, John R. Hershey:
Early and late integration of audio features for automatic video description. ASRU 2017: 430-436
[c70]
- view
  authority control:
- export record
  dblp key:
  - conf/icassp/LuoCHRM17
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icassp/LuoCHRM17
Yi Luo, Zhuo Chen, John R. Hershey, Jonathan Le Roux, Nima Mesgarani:
Deep clustering and conventional networks for music separation: Stronger together. ICASSP 2017: 61-65
[c69]
- view
  authority control:
- export record
  dblp key:
  - conf/icassp/MengWHE17
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icassp/MengWHE17
Zhong Meng, Shinji Watanabe, John R. Hershey, Hakan Erdogan:
Deep long short-term memory adaptive beamforming networks for multichannel robust speech recognition. ICASSP 2017: 271-275
[c68]
- view
  authority control:
- export record
  dblp key:
  - conf/icassp/WatanabeHRH17
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icassp/WatanabeHRH17
Shinji Watanabe, Takaaki Hori, Jonathan Le Roux, John R. Hershey:
Student-teacher network learning with enhanced features. ICASSP 2017: 5275-5279
[c67]
- view
  authority control:
- export record
  dblp key:
  - conf/iccv/HoriHLZHHMS17
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/iccv/HoriHLZHHMS17
Chiori Hori, Takaaki Hori, Teng-Yok Lee, Ziming Zhang, Bret Harsham, John R. Hershey, Tim K. Marks, Kazuhiro Sumi:
Attention-Based Multimodal Fusion for Video Description. ICCV 2017: 4203-4212
[c66]
- view
  - electronic edition @ mlr.press (open access)
  - details & citations
- export record
  dblp key:
  - conf/icml/OchiaiWHH17
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icml/OchiaiWHH17
Tsubasa Ochiai, Shinji Watanabe, Takaaki Hori, John R. Hershey:
Multichannel End-to-end Speech Recognition. ICML 2017: 2632-2641
[p6]
- view
  authority control:
- export record
  dblp key:
  - books/sp/17/WatanabeDMH17
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/books/sp/17/WatanabeDMH17
Shinji Watanabe, Marc Delcroix, Florian Metze, John R. Hershey:
Preliminaries. New Era for Robust Speech Recognition, Exploiting Deep Learning 2017: 3-17
[p5]
- view
  authority control:
- export record
  dblp key:
  - books/sp/17/XiaoWEMLHSCZY17
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/books/sp/17/XiaoWEMLHSCZY17
Xiong Xiao, Shinji Watanabe, Hakan Erdogan, Michael I. Mandel, Liang Lu, John R. Hershey, Michael L. Seltzer, Guoguo Chen, Yu Zhang, Dong Yu:
Discriminative Beamforming with Phase-Aware Neural Networks for Speech Enhancement and Recognition. New Era for Robust Speech Recognition, Exploiting Deep Learning 2017: 79-104
[p4]
- view
  authority control:
- export record
  dblp key:
  - books/sp/17/HersheyRWWCI17
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/books/sp/17/HersheyRWWCI17
John R. Hershey, Jonathan Le Roux, Shinji Watanabe, Scott Wisdom, Zhuo Chen, Yusuf Ziya Isik:
Novel Deep Architectures in Speech Processing. New Era for Robust Speech Recognition, Exploiting Deep Learning 2017: 135-164
[p3]
- view
  authority control:
- export record
  dblp key:
  - books/sp/17/ErdoganHWR17
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/books/sp/17/ErdoganHWR17
Hakan Erdogan, John R. Hershey, Shinji Watanabe, Jonathan Le Roux:
Deep Recurrent Networks for Separation and Recognition of Single-Channel Speech in Nonstationary Background Audio. New Era for Robust Speech Recognition, Exploiting Deep Learning 2017: 165-186
[p2]
- view
  authority control:
- export record
  dblp key:
  - books/sp/17/WatanabeHMDMH17
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/books/sp/17/WatanabeHMDMH17
Shinji Watanabe, Takaaki Hori, Yajie Miao, Marc Delcroix, Florian Metze, John R. Hershey:
Toolkits for Robust Speech Processing. New Era for Robust Speech Recognition, Exploiting Deep Learning 2017: 369-382
[e1]
- view
  authority control:
- export record
  dblp key:
  - books/sp/WDMH2017
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/books/sp/WDMH2017
Shinji Watanabe, Marc Delcroix, Florian Metze, John R. Hershey:
New Era for Robust Speech Recognition, Exploiting Deep Learning. Springer 2017, ISBN 978-3-319-64679-4 [contents]
[i9]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/HoriHLSHM17
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/HoriHLSHM17
Chiori Hori, Takaaki Hori, Teng-Yok Lee, Kazuhiro Sumi, John R. Hershey, Tim K. Marks:
Attention-Based Multimodal Fusion for Video Description. CoRR abs/1701.03126 (2017)
[i8]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/OchiaiWHH17
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/OchiaiWHH17
Tsubasa Ochiai, Shinji Watanabe, Takaaki Hori, John R. Hershey:
Multichannel End-to-end Speech Recognition. CoRR abs/1703.04783 (2017)
[i7]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/abs-1711-08016
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1711-08016
Zhong Meng, Shinji Watanabe, John R. Hershey, Hakan Erdogan:
Deep Long Short-Term Memory Adaptive Beamforming Networks For Multichannel Robust Speech Recognition. CoRR abs/1711.08016 (2017)
2016
[c65]
- view
  authority control:
- export record
  dblp key:
  - conf/icassp/HersheyCRW16
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icassp/HersheyCRW16
John R. Hershey, Zhuo Chen, Jonathan Le Roux, Shinji Watanabe:
Deep clustering: Discriminative embeddings for segmentation and separation. ICASSP 2016: 31-35
[c64]
- view
  authority control:
- export record
  dblp key:
  - conf/icassp/WisdomHRW16
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icassp/WisdomHRW16
Scott Wisdom, John R. Hershey, Jonathan Le Roux, Shinji Watanabe:
Deep unfolding for multichannel source separation. ICASSP 2016: 121-125
[c63]
- view
  authority control:
- export record
  dblp key:
  - conf/icassp/XiaoWELHSCZMY16
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icassp/XiaoWELHSCZMY16
Xiong Xiao, Shinji Watanabe, Hakan Erdogan, Liang Lu, John R. Hershey, Michael L. Seltzer, Guoguo Chen, Yu Zhang, Michael I. Mandel, Dong Yu:
Deep beamforming networks for multi-channel speech recognition. ICASSP 2016: 5745-5749
[c62]
- view
  authority control:
- export record
  dblp key:
  - conf/icassp/HoriHWH16
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icassp/HoriHWH16
Takaaki Hori, Chiori Hori, Shinji Watanabe, John R. Hershey:
Minimum word error training of long short-term memory recurrent neural network language models for speech recognition. ICASSP 2016: 5990-5994
[c61]
- view
  authority control:
- export record
  dblp key:
  - conf/icmcs/HoriWHHHKFF16
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icmcs/HoriWHHHKFF16
Chiori Hori, Shinji Watanabe, Takaaki Hori, Bret A. Harsham, John R. Hershey, Yusuke Koji, Youichi Fujii, Yuki Furumoto:
Driver confusion status detection using recurrent neural networks. ICME 2016: 1-6
[c60]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - conf/interspeech/IsikRCWH16
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/IsikRCWH16
Yusuf Ziya Isik, Jonathan Le Roux, Zhuo Chen, Shinji Watanabe, John R. Hershey:
Single-Channel Multi-Speaker Separation Using Deep Clustering. INTERSPEECH 2016: 545-549
[c59]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - conf/interspeech/ErdoganHWMR16
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/ErdoganHWMR16
Hakan Erdogan, John R. Hershey, Shinji Watanabe, Michael I. Mandel, Jonathan Le Roux:
Improved MVDR Beamforming Using Single-Channel Mask Prediction Networks. INTERSPEECH 2016: 1981-1985
[c58]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - conf/interspeech/HoriHWH16
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/HoriHWH16
Chiori Hori, Takaaki Hori, Shinji Watanabe, John R. Hershey:
Context-Sensitive and Role-Dependent Spoken Language Understanding Using Bidirectional and Attention LSTMs. INTERSPEECH 2016: 3236-3240
[c57]
- view
- export record
  dblp key:
  - conf/nips/WisdomPHRA16
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/nips/WisdomPHRA16
Scott Wisdom, Thomas Powers, John R. Hershey, Jonathan Le Roux, Les E. Atlas:
Full-Capacity Unitary Recurrent Neural Networks. NIPS 2016: 4880-4888
[c56]
- view
  authority control:
- export record
  dblp key:
  - conf/slt/HoriWHWHRHKJZA16
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/slt/HoriWHWHRHKJZA16
Takaaki Hori, Hai Wang, Chiori Hori, Shinji Watanabe, Bret Harsham, Jonathan Le Roux, John R. Hershey, Yusuke Koji, Yi Jing, Zhaocheng Zhu, Takeyuki Aikawa:
Dialog state tracking with attention-based sequence-to-sequence learning. SLT 2016: 552-558
[i6]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/TuzelTH16
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/TuzelTH16
Oncel Tuzel, Yuichi Taguchi, John R. Hershey:
Global-Local Face Upsampling Network. CoRR abs/1603.07235 (2016)
[i5]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/IsikRCWH16
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/IsikRCWH16
Yusuf Ziya Isik, Jonathan Le Roux, Zhuo Chen, Shinji Watanabe, John R. Hershey:
Single-Channel Multi-Speaker Separation using Deep Clustering. CoRR abs/1607.02173 (2016)
[i4]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/WisdomPHRA16
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/WisdomPHRA16
Scott Wisdom, Thomas Powers, John R. Hershey, Jonathan Le Roux, Les E. Atlas:
Full-Capacity Unitary Recurrent Neural Networks. CoRR abs/1611.00035 (2016)
[i3]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/LuoCHRM16
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/LuoCHRM16
Yi Luo, Zhuo Chen, John R. Hershey, Jonathan Le Roux, Nima Mesgarani:
Deep Clustering and Conventional Networks for Music Separation: Stronger Together. CoRR abs/1611.06265 (2016)
2015
[c55]
- view
  authority control:
- export record
  dblp key:
  - conf/asru/HoriCEHRMW15
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/asru/HoriCEHRMW15
Takaaki Hori, Zhuo Chen, Hakan Erdogan, John R. Hershey, Jonathan Le Roux, Vikramjit Mitra, Shinji Watanabe:
The MERL/SRI system for the 3RD CHiME challenge using beamforming, robust feature extraction, and advanced speech recognition. ASRU 2015: 475-481
[c54]
- view
  authority control:
- export record
  dblp key:
  - conf/ica/WeningerEWVRHS15
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/ica/WeningerEWVRHS15
Felix Weninger, Hakan Erdogan, Shinji Watanabe, Emmanuel Vincent, Jonathan Le Roux, John R. Hershey, Björn W. Schuller:
Speech Enhancement with LSTM Recurrent Neural Networks and its Application to Noise-Robust ASR. LVA/ICA 2015: 91-99
[c53]
- view
  authority control:
- export record
  dblp key:
  - conf/icassp/RouxHW15
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icassp/RouxHW15
Jonathan Le Roux, John R. Hershey, Felix Weninger:
Deep NMF for speech separation. ICASSP 2015: 66-70
[c52]
- view
  authority control:
- export record
  dblp key:
  - conf/icassp/ErdoganHWR15
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icassp/ErdoganHWR15
Hakan Erdogan, John R. Hershey, Shinji Watanabe, Jonathan Le Roux:
Phase-sensitive and recognition-boosted speech separation using deep recurrent neural networks. ICASSP 2015: 708-712
[c51]
- view
  authority control:
- export record
  dblp key:
  - conf/icassp/RouxVHE15
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icassp/RouxVHE15
Jonathan Le Roux, Emmanuel Vincent, John R. Hershey, Daniel P. W. Ellis:
Micbots: Collecting large realistic datasets for speech and audio research using mobile robots. ICASSP 2015: 5635-5639
[c50]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - conf/interspeech/ChenWEH15
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/ChenWEH15
Zhuo Chen, Shinji Watanabe, Hakan Erdogan, John R. Hershey:
Speech enhancement and recognition using multi-task learning of long short-term memory recurrent neural networks. INTERSPEECH 2015: 3274-3278
[c49]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - conf/interspeech/AbdelazizWHVK15
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/AbdelazizWHVK15
Ahmed Hussen Abdelaziz, Shinji Watanabe, John R. Hershey, Emmanuel Vincent, Dorothea Kolossa:
Uncertainty propagation through deep neural networks. INTERSPEECH 2015: 3561-3565
[i2]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/HersheyCRW15
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/HersheyCRW15
John R. Hershey, Zhuo Chen, Jonathan Le Roux, Shinji Watanabe:
Deep clustering: Discriminative embeddings for segmentation and separation. CoRR abs/1508.04306 (2015)
2014
[c48]
- view
  authority control:
- export record
  dblp key:
  - conf/globalsip/TachiokaWRH14
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/globalsip/TachiokaWRH14
Yuuki Tachioka, Shinji Watanabe, Jonathan Le Roux, John R. Hershey:
Sequence discriminative training for low-rank deep neural networks. GlobalSIP 2014: 572-576
[c47]
- view
  authority control:
- export record
  dblp key:
  - conf/globalsip/WeningerHRS14
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/globalsip/WeningerHRS14
Felix Weninger, John R. Hershey, Jonathan Le Roux, Björn W. Schuller:
Discriminatively trained recurrent neural networks for single-channel speech separation. GlobalSIP 2014: 577-581
[c46]
- view
  authority control:
- export record
  dblp key:
  - conf/icassp/TangWMH14
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icassp/TangWMH14
Hao Tang, Shinji Watanabe, Tim K. Marks, John R. Hershey:
Log-linear dialog manager. ICASSP 2014: 4092-4096
[c45]
- view
  authority control:
- export record
  dblp key:
  - conf/icassp/SimsekliRH14
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icassp/SimsekliRH14
Umut Simsekli, Jonathan Le Roux, John R. Hershey:
Non-negative source-filter dynamical system for speech enhancement. ICASSP 2014: 6206-6210
[c44]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - conf/interspeech/WatanabeHMFK14
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/WatanabeHMFK14
Shinji Watanabe, John R. Hershey, Tim K. Marks, Youichi Fujii, Yusuke Koji:
Cost-level integration of statistical and rule-based dialog managers. INTERSPEECH 2014: 323-327
[c43]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - conf/interspeech/WeningerRHW14
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/WeningerRHW14
Felix Weninger, Jonathan Le Roux, John R. Hershey, Shinji Watanabe:
Discriminative NMF and its application to single-channel source separation. INTERSPEECH 2014: 865-869
[c42]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - conf/interspeech/TachiokaWRH14
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/TachiokaWRH14
Yuuki Tachioka, Shinji Watanabe, Jonathan Le Roux, John R. Hershey:
Sequential maximum mutual information linear discriminant analysis for speech recognition. INTERSPEECH 2014: 2415-2419
[i1]
- view
  - electronic edition @ arxiv.org (open access)
  - details & citations
- export record
  dblp key:
  - journals/corr/HersheyRW14
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/corr/HersheyRW14
John R. Hershey, Jonathan Le Roux, Felix Weninger:
Deep Unfolding: Model-Based Inspiration of Novel Deep Architectures. CoRR abs/1409.2574 (2014)
2013
[c41]
- view
  authority control:
- export record
  dblp key:
  - conf/asru/TachiokaWRH13
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/asru/TachiokaWRH13
Yuuki Tachioka, Shinji Watanabe, Jonathan Le Roux, John R. Hershey:
A generalized discriminative training framework for system combination. ASRU 2013: 43-48
[c40]
- view
  authority control:
- export record
  dblp key:
  - conf/icassp/FevotteRH13
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icassp/FevotteRH13
Cédric Févotte, Jonathan Le Roux, John R. Hershey:
Non-negative dynamical system with application to speech and audio. ICASSP 2013: 3158-3162
[c39]
- view
  authority control:
- export record
  dblp key:
  - conf/icassp/RouxBKH13
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icassp/RouxBKH13
Jonathan Le Roux, Petros T. Boufounos, Kang Kang, John R. Hershey:
Source localization in reverberant environments using sparse optimization. ICASSP 2013: 4310-4314
[c38]
- view
  authority control:
- export record
  dblp key:
  - conf/icassp/TachiokaWH13
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icassp/TachiokaWH13
Yuuki Tachioka, Shinji Watanabe, John R. Hershey:
Effectiveness of discriminative training and feature transformation for reverberated and noisy speech. ICASSP 2013: 6935-6939
[c37]
- view
  authority control:
- export record
  dblp key:
  - conf/icassp/WatanabeH13
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icassp/WatanabeH13
Shinji Watanabe, John R. Hershey:
Stereo-based feature enhancement using dictionary learning. ICASSP 2013: 7073-7077
[c36]
- view
  - electronic edition @ aclanthology.org (open access)
  - details & citations
- export record
  dblp key:
  - conf/ijcnlp/YoshinoWRH13
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/ijcnlp/YoshinoWRH13
Koichiro Yoshino, Shinji Watanabe, Jonathan Le Roux, John R. Hershey:
Statistical Dialogue Management using Intention Dependency Graph. IJCNLP 2013: 962-966
[c35]
- view
  authority control:
- export record
  dblp key:
  - conf/waspaa/RouxWH13
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/waspaa/RouxWH13
Jonathan Le Roux, Shinji Watanabe, John R. Hershey:
Ensemble learning for speech enhancement. WASPAA 2013: 1-4
[c34]
- view
  authority control:
- export record
  dblp key:
  - conf/waspaa/SimsekliRH13
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/waspaa/SimsekliRH13
Umut Simsekli, Jonathan Le Roux, John R. Hershey:
Hierarchical and coupled non-negative dynamical systems with application to audio modeling. WASPAA 2013: 1-4
2012
[j5]
- view
  authority control:
- export record
  dblp key:
  - journals/taslp/CuiXCODCHZ12
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/taslp/CuiXCODCHZ12
Xiaodong Cui, Jian Xue, Xin Chen, Peder A. Olsen, Pierre L. Dognin, Upendra V. Chaudhari, John R. Hershey, Bowen Zhou:
Hidden Markov Acoustic Modeling With Bootstrap and Restructuring for Low-Resourced Languages. IEEE Trans. Speech Audio Process. 20(8): 2252-2264 (2012)
[c33]
- view
  authority control:
- export record
  dblp key:
  - conf/icassp/RouxH12
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icassp/RouxH12
Jonathan Le Roux, John R. Hershey:
Indirect model-based speech enhancement. ICASSP 2012: 4045-4048
[p1]
- view
  authority control:
- export record
  dblp key:
  - books/wi/12/HersheyRR12
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/books/wi/12/HersheyRR12
John R. Hershey, Steven J. Rennie, Jonathan Le Roux:
Factorial Models for Noise Robust Speech Recognition. Techniques for Noise Robustness in Automatic Speech Recognition 2012: 311-345
2011
[c32]
- view
  authority control:
- export record
  dblp key:
  - conf/icassp/ChenCXOHZZ11
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/icassp/ChenCXOHZZ11
Xin Chen, Xiaodong Cui, Jian Xue, Peder A. Olsen, John R. Hershey, Bowen Zhou, Yunxin Zhao:
Clustering of bootstrapped acoustic model with full covariance. ICASSP 2011: 4496-4499
[c31]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - conf/interspeech/CuiCXOHZ11
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/CuiCXOHZ11
Xiaodong Cui, Xin Chen, Jian Xue, Peder A. Olsen, John R. Hershey, Bowen Zhou:
Acoustic Modeling with Bootstrap and Restructuring Based on Full Covariance. INTERSPEECH 2011: 1697-1700
[c30]
- view
  authority control:
- export record
  dblp key:
  - conf/iros/TaguchiMH11
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/conf/iros/TaguchiMH11
Yuichi Taguchi, Tim K. Marks, John R. Hershey:
Entropy-based motion selection for touch-based registration using Rao-Blackwellized particle filtering. IROS 2011: 4690-4697
2010
[j4]
- view
  authority control:
- export record
  dblp key:
  - journals/csl/CookeHR10
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/csl/CookeHR10
Martin Cooke, John R. Hershey, Steven J. Rennie:
Monaural speech separation and recognition challenge. Comput. Speech Lang. 24(1): 1-15 (2010)
[j3]
- view
  authority control:
- export record
  dblp key:
  - journals/csl/HersheyROK10
- ask others
- share record
  persistent URL:
  - https://dblp.org/rec/journals/csl/HersheyROK10
John R. Hershey, Steven J. Rennie, Peder A. Olsen, Trausti T. Kristjansson:
Super-human multi-talker speech recognition: A graphical modeling approach. Comput. Speech Lang. 24(1): 45-66 (2010)
[j2]
  persistent URL:
  - https://dblp.org/rec/journals/pami/MarksHM10
Tim K. Marks, John R. Hershey, Javier R. Movellan:
Tracking Motion, Deformation, and Texture Using Conditionally Gaussian Processes. IEEE Trans. Pattern Anal. Mach. Intell. 32(2): 348-363 (2010)
[j1]
  persistent URL:
  - https://dblp.org/rec/journals/spm/RennieHO10
Steven J. Rennie, John R. Hershey, Peder A. Olsen:
Single-Channel Multitalker Speech Recognition. IEEE Signal Process. Mag. 27(6): 66-80 (2010)
[c29]
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/DogninHGO10
Pierre L. Dognin, John R. Hershey, Vaibhava Goel, Peder A. Olsen:
Restructuring exponential family mixture models. INTERSPEECH 2010: 62-65
[c28]
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/HersheyOR10
John R. Hershey, Peder A. Olsen, Steven J. Rennie:
Signal interaction and the devil function. INTERSPEECH 2010: 334-337
[c27]
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/OlsenGMH10
Peder A. Olsen, Vaibhava Goel, Charles A. Micchelli, John R. Hershey:
Modeling posterior probabilities using the linear exponential family. INTERSPEECH 2010: 2994-2997

2000 – 2009

2009
[c26]
  persistent URL:
  - https://dblp.org/rec/conf/asru/RennieHO09
Steven J. Rennie, John R. Hershey, Peder A. Olsen:
Hierarchical variational loopy belief propagation for multi-talker speech recognition. ASRU 2009: 176-181
[c25]
  persistent URL:
  - https://dblp.org/rec/conf/icassp/DogninGHO09
Pierre L. Dognin, Vaibhava Goel, John R. Hershey, Peder A. Olsen:
A fast, accurate approximation to log likelihood of Gaussian mixture models. ICASSP 2009: 3817-3820
[c24]
  persistent URL:
  - https://dblp.org/rec/conf/icassp/RennieHO09
Steven J. Rennie, John R. Hershey, Peder A. Olsen:
Single-channel speech separation and recognition using loopy belief propagation. ICASSP 2009: 3845-3848
[c23]
  persistent URL:
  - https://dblp.org/rec/conf/icassp/DogninHGO09
Pierre L. Dognin, John R. Hershey, Vaibhava Goel, Peder A. Olsen:
Refactoring acoustic models using variational density approximation. ICASSP 2009: 4473-4476
[c22]
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/DogninHGO09
Pierre L. Dognin, John R. Hershey, Vaibhava Goel, Peder A. Olsen:
Refactoring acoustic models using variational expectation-maximization. INTERSPEECH 2009: 212-215
[c21]
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/RennieHO09
Steven J. Rennie, John R. Hershey, Peder A. Olsen:
Variational loopy belief propagation for multi-talker speech recognition. INTERSPEECH 2009: 1331-1334
2008
[c20]
  persistent URL:
  - https://dblp.org/rec/conf/icassp/RennieHO08
Steven J. Rennie, John R. Hershey, Peder A. Olsen:
Efficient model-based speech separation and denoising using non-negative subspace analysis. ICASSP 2008: 1833-1836
[c19]
  persistent URL:
  - https://dblp.org/rec/conf/icassp/ChenHOY08
Jia-Yu Chen, John R. Hershey, Peder A. Olsen, Emmanuel Yashchin:
Accelerated Monte Carlo for Kullback-Leibler divergence between Gaussian mixture models. ICASSP 2008: 4553-4556
[c18]
  persistent URL:
  - https://dblp.org/rec/conf/icassp/HersheyO08
John R. Hershey, Peder A. Olsen:
Variational Bhattacharyya divergence for hidden Markov models. ICASSP 2008: 4557-4560
[c17]
  persistent URL:
  - https://dblp.org/rec/conf/icassp/MohantyHOKG08
Binit Mohanty, John R. Hershey, Peder A. Olsen, Suleyman Serdar Kozat, Vaibhava Goel:
Optimizing speech recognition grammars using a measure of similarity between hidden Markov models. ICASSP 2008: 4953-4956
2007
[c16]
  persistent URL:
  - https://dblp.org/rec/conf/asru/HersheyOR07
John R. Hershey, Peder A. Olsen, Steven J. Rennie:
Variational Kullback-Leibler divergence for Hidden Markov models. ASRU 2007: 323-328
[c15]
  persistent URL:
  - https://dblp.org/rec/conf/icassp/HersheyO07
John R. Hershey, Peder A. Olsen:
Approximating the Kullback Leibler Divergence Between Gaussian Mixture Models. ICASSP (4) 2007: 317-320
[c14]
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/OlsenH07
Peder A. Olsen, John R. Hershey:
Bhattacharyya error and divergence using variational importance sampling. INTERSPEECH 2007: 46-49
[c13]
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/ChenOH07
Jia-Yu Chen, Peder A. Olsen, John R. Hershey:
Word confusability - measuring hidden Markov model similarity. INTERSPEECH 2007: 2089-2092
2006
[c12]
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/KristjanssonHORG06
Trausti T. Kristjansson, John R. Hershey, Peder A. Olsen, Steven J. Rennie, Ramesh A. Gopinath:
Super-human multi-talker speech recognition: the IBM 2006 speech separation challenge system. INTERSPEECH 2006
[c11]
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/RennieOHK06
Steven J. Rennie, Peder A. Olsen, John R. Hershey, Trausti T. Kristjansson:
The Iroquois model: using temporal dynamics to separate speakers. SAPA@INTERSPEECH 2006: 24-30
[c10]
  persistent URL:
  - https://dblp.org/rec/conf/nips/HersheyKRO06
John R. Hershey, Trausti T. Kristjansson, Steven J. Rennie, Peder A. Olsen:
Single Channel Speech Separation Using Factorial Dynamics. NIPS 2006: 593-600
2005
[b1]
  persistent URL:
  - https://dblp.org/rec/phd/basesearch/Hershey05
John R. Hershey:
Perceptual inference in generative models. University of California, San Diego, USA, 2005
2004
[c9]
  persistent URL:
  - https://dblp.org/rec/conf/cvpr/MarksHRM04
Tim K. Marks, John R. Hershey, J. Cooper Roddey, Javier R. Movellan:
3D Tracking of Morphable Objects Using Conditionally Gaussian Nonlinear Filters. CVPR Workshops 2004: 190
[c8]
  persistent URL:
  - https://dblp.org/rec/conf/eccv/KristjanssonAH04
Trausti T. Kristjansson, Hagai Attias, John R. Hershey:
Stereo Based 3D Tracking and Scene Learning, Employing Particle Filtering within EM. ECCV (4) 2004: 546-559
[c7]
  persistent URL:
  - https://dblp.org/rec/conf/icassp/HersheyAJK04
John R. Hershey, Hagai Attias, Nebojsa Jojic, Trausti T. Kristjansson:
Audio-visual graphical models for speech processing. ICASSP (5) 2004: 649-652
[c6]
  persistent URL:
  - https://dblp.org/rec/conf/icassp/KristjanssonAH04
Trausti T. Kristjansson, Hagai Attias, John R. Hershey:
Single microphone source separation using high resolution signal reconstruction. ICASSP (2) 2004: 817-820
[c5]
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/HersheyKZ04
John R. Hershey, Trausti T. Kristjansson, Zhengyou Zhang:
Model-based fusion of bone and air sensors for speech enhancement and robust speech recognition. SAPA@INTERSPEECH 2004: 139
[c4]
  persistent URL:
  - https://dblp.org/rec/conf/nips/MarksHRM04
Tim K. Marks, John R. Hershey, J. Cooper Roddey, Javier R. Movellan:
Joint Tracking of Pose, Expression, and Texture using Conditionally Gaussian Filters. NIPS 2004: 889-896
2001
[c3]
  persistent URL:
  - https://dblp.org/rec/conf/nips/HersheyC01
John R. Hershey, Michael Casey:
Audio-Visual Sound Separation Via Hidden Markov Models. NIPS 2001: 1173-1180
2000
[c2]
  persistent URL:
  - https://dblp.org/rec/conf/icip/GorodnitskyH00
Irina F. Gorodnitsky, John R. Hershey:
A Low-Level Cortical Perception Model with Applications to Image Analysis. ICIP 2000: 308-311

1990 – 1999

1999
[c1]
  persistent URL:
  - https://dblp.org/rec/conf/nips/HersheyM99
John R. Hershey, Javier R. Movellan:
Audio Vision: Using Audio-Visual Synchrony to Locate Sounds. NIPS 1999: 813-819
