Timo Gerkmann
Person information
- affiliation: University of Hamburg, Germany
Books and Theses
- 2013
- [b1]Richard C. Hendriks, Timo Gerkmann, Jesper Jensen:
DFT-Domain Based Single-Microphone Noise Reduction for Speech Enhancement. Synthesis Lectures on Speech and Audio Processing, Morgan & Claypool Publishers 2013, ISBN 9781627051439
Journal Articles
- 2024
- [j31]Navin Raj Prabhu, Nale Lehmann-Willenbrock, Timo Gerkmann:
End-to-End Label Uncertainty Modeling in Speech Emotion Recognition Using Bayesian Neural Networks and Label Distribution Learning. IEEE Trans. Affect. Comput. 15(2): 579-592 (2024)
- [j30]Kristina Tesch, Timo Gerkmann:
Multi-Channel Speech Separation Using Spatially Selective Deep Non-Linear Filters. IEEE ACM Trans. Audio Speech Lang. Process. 32: 542-553 (2024)
- [j29]Simon Welker, Henry N. Chapman, Timo Gerkmann:
DriftRec: Adapting Diffusion Models to Blind JPEG Restoration. IEEE Trans. Image Process. 33: 2795-2807 (2024)
- 2023
- [j28]Jean-Marie Lemercier, Joachim Thiemann, Raphael Koning, Timo Gerkmann:
A neural network-supported two-stage algorithm for lightweight dereverberation on hearing devices. EURASIP J. Audio Speech Music. Process. 2023(1): 18 (2023)
- [j27]Kristina Tesch, Timo Gerkmann:
Insights Into Deep Non-Linear Filters for Improved Multi-Channel Speech Enhancement. IEEE ACM Trans. Audio Speech Lang. Process. 31: 563-575 (2023)
- [j26]Huajian Fang, Dennis Becker, Stefan Wermter, Timo Gerkmann:
Integrating Uncertainty Into Neural Network-Based Speech Enhancement. IEEE ACM Trans. Audio Speech Lang. Process. 31: 1587-1600 (2023)
- [j25]Julius Richter, Simon Welker, Jean-Marie Lemercier, Bunlong Lay, Timo Gerkmann:
Speech Enhancement and Dereverberation With Diffusion-Based Generative Models. IEEE ACM Trans. Audio Speech Lang. Process. 31: 2351-2364 (2023)
- [j24]Jean-Marie Lemercier, Julius Richter, Simon Welker, Timo Gerkmann:
StoRM: A Diffusion-Based Stochastic Regeneration Model for Speech Enhancement and Dereverberation. IEEE ACM Trans. Audio Speech Lang. Process. 31: 2724-2737 (2023)
- 2021
- [j23]Kristina Tesch, Timo Gerkmann:
Nonlinear Spatial Filtering in Multichannel Speech Enhancement. IEEE ACM Trans. Audio Speech Lang. Process. 29: 1795-1805 (2021)
- [j22]Robert Rehr, Timo Gerkmann:
SNR-Based Features and Diverse Training Data for Robust DNN-Based Speech Enhancement. IEEE ACM Trans. Audio Speech Lang. Process. 29: 1937-1949 (2021)
- [j21]Tobias Knopp, Mirco Grosser, Matthias Graeser, Timo Gerkmann, Martin Möddel:
Efficient Joint Estimation of Tracer Distribution and Background Signals in Magnetic Particle Imaging Using a Dictionary Approach. IEEE Trans. Medical Imaging 40(12): 3568-3579 (2021)
- 2020
- [j20]Lux Li, Robert Rehr, Patrick Bruns, Timo Gerkmann, Brigitte Röder:
A Survey on Probabilistic Models in Human Perception and Machines. Frontiers Robotics AI 7: 85 (2020)
- 2018
- [j19]Robert Rehr, Timo Gerkmann:
On the Importance of Super-Gaussian Speech Priors for Machine-Learning Based Speech Enhancement. IEEE ACM Trans. Audio Speech Lang. Process. 26(2): 357-366 (2018)
- [j18]Martin Krawczyk-Becker, Timo Gerkmann:
On Speech Enhancement Under PSD Uncertainty. IEEE ACM Trans. Audio Speech Lang. Process. 26(6): 1140-1149 (2018)
- 2017
- [j17]Robert Rehr, Timo Gerkmann:
An Analysis of Adaptive Recursive Smoothing with Applications to Noise PSD Estimation. IEEE ACM Trans. Audio Speech Lang. Process. 25(2): 397-408 (2017)
- 2016
- [j16]Martin Krawczyk-Becker, Timo Gerkmann:
Fundamental Frequency Informed Speech Enhancement in a Flexible Statistical Framework. IEEE ACM Trans. Audio Speech Lang. Process. 24(5): 940-951 (2016)
- [j15]Martin Krawczyk-Becker, Timo Gerkmann:
On MMSE-Based Estimation of Amplitude and Complex Speech Spectral Coefficients Under Phase-Uncertainty. IEEE ACM Trans. Audio Speech Lang. Process. 24(12): 2251-2262 (2016)
- 2015
- [j14]Benjamin Cauchi, Ina Kodrasi, Robert Rehr, Stephan Gerlach, Ante Jukic, Timo Gerkmann, Simon Doclo, Stefan Goetze:
Combination of MVDR beamforming and single-channel spectral processing for enhancing noisy and reverberant speech. EURASIP J. Adv. Signal Process. 2015: 61 (2015)
- [j13]Feifei Xiong, Bernd T. Meyer, Niko Moritz, Robert Rehr, Jörn Anemüller, Timo Gerkmann, Simon Doclo, Stefan Goetze:
Front-end technologies for robust ASR in reverberant environments - spectral enhancement-based dereverberation and auditory modulation filterbank features. EURASIP J. Adv. Signal Process. 2015: 70 (2015)
- [j12]Timo Gerkmann, Martin Krawczyk-Becker, Jonathan Le Roux:
Phase Processing for Single-Channel Speech Enhancement: History and recent advances. IEEE Signal Process. Mag. 32(2): 55-66 (2015)
- [j11]Alexander Schasse, Timo Gerkmann, Rainer Martin, Wolfgang Sörgel, Thomas Pilgrim, Henning Puder:
Two-Stage Filter-Bank System for Improved Single-Channel Noise Reduction in Hearing Aids. IEEE ACM Trans. Audio Speech Lang. Process. 23(2): 383-393 (2015)
- [j10]Lin Wang, Timo Gerkmann, Simon Doclo:
Noise Power Spectral Density Estimation Using MaxNSR Blocking Matrix. IEEE ACM Trans. Audio Speech Lang. Process. 23(9): 1493-1508 (2015)
- [j9]Ante Jukic, Toon van Waterschoot, Timo Gerkmann, Simon Doclo:
Multi-Channel Linear Prediction-Based Speech Dereverberation With Sparse Priors. IEEE ACM Trans. Audio Speech Lang. Process. 23(9): 1509-1520 (2015)
- 2014
- [j8]Martin Krawczyk, Timo Gerkmann:
STFT phase reconstruction in voiced speech for an improved single-channel speech enhancement. IEEE ACM Trans. Audio Speech Lang. Process. 22(12): 1931-1940 (2014)
- [j7]Timo Gerkmann:
Bayesian Estimation of Clean Speech Spectral Coefficients Given a Priori Knowledge of the Phase. IEEE Trans. Signal Process. 62(16): 4199-4208 (2014)
- 2013
- [j6]Timo Gerkmann, Martin Krawczyk:
MMSE-Optimal Spectral Amplitude Estimation Given the STFT-Phase. IEEE Signal Process. Lett. 20(2): 129-132 (2013)
- 2012
- [j5]Richard C. Hendriks, Timo Gerkmann:
Noise Correlation Matrix Estimation for Multi-Microphone Speech Enhancement. IEEE Trans. Speech Audio Process. 20(1): 223-233 (2012)
- [j4]Timo Gerkmann, Richard C. Hendriks:
Unbiased MMSE-Based Noise Power Estimation With Low Complexity and Low Tracking Delay. IEEE Trans. Speech Audio Process. 20(4): 1383-1393 (2012)
- 2009
- [j3]Timo Gerkmann, Rainer Martin:
On the statistics of spectral amplitudes after variance reduction by temporal cepstrum smoothing and cepstral nulling. IEEE Trans. Signal Process. 57(11): 4165-4174 (2009)
- 2008
- [j2]Timo Gerkmann, Colin Breithaupt, Rainer Martin:
Improved A Posteriori Speech Presence Probability Estimation Based on a Likelihood Ratio With Fixed Priors. IEEE Trans. Speech Audio Process. 16(5): 910-919 (2008)
- 2007
- [j1]Colin Breithaupt, Timo Gerkmann, Rainer Martin:
Cepstral Smoothing of Spectral Filter Gains for Speech Enhancement Without Musical Noise. IEEE Signal Process. Lett. 14(12): 1036-1039 (2007)
Conference and Workshop Papers
- 2024
- [c106]Bunlong Lay, Jean-Marie Lemercier, Julius Richter, Timo Gerkmann:
Single and Few-Step Diffusion for Generative Speech Enhancement. ICASSP 2024: 626-630
- [c105]Tal Peer, Simon Welker, Johannes Kolhoff, Timo Gerkmann:
A Flexible Online Framework for Projection-Based STFT Phase Retrieval. ICASSP 2024: 846-850
- [c104]Simon Welker, Tal Peer, Henry N. Chapman, Timo Gerkmann:
Live Iterative Ptychography with Projection-Based Algorithms. ICASSP 2024: 2455-2459
- [c103]Navin Raj Prabhu, Bunlong Lay, Simon Welker, Nale Lehmann-Willenbrock, Timo Gerkmann:
EMOCONV-Diff: Diffusion-Based Speech Emotion Conversion for Non-Parallel and in-the-Wild Data. ICASSP 2024: 11651-11655
- [c102]Danilo de Oliveira, Timo Gerkmann:
Distilling HuBERT with LSTMs via Decoupled Knowledge Distillation. ICASSP 2024: 11711-11715
- [c101]Huajian Fang, Timo Gerkmann:
Uncertainty-Based Remixing for Unsupervised Domain Adaptation in Deep Speech Enhancement. IWAENC 2024: 45-49
- [c100]Danilo de Oliveira, Eric Grinstein, Patrick A. Naylor, Timo Gerkmann:
LASER: Language-Queried Speech Enhancer. IWAENC 2024: 90-94
- [c99]Eloi Moliner, Jean-Marie Lemercier, Simon Welker, Timo Gerkmann, Vesa Välimäki:
BUDDy: Single-Channel Blind Unsupervised Dereverberation with Diffusion Models. IWAENC 2024: 120-124
- [c98]Alina Mannanova, Kristina Tesch, Jean-Marie Lemercier, Timo Gerkmann:
Meta-Learning For Variable Array Configurations in End-to-End Few-Shot Multichannel Speech Enhancement. IWAENC 2024: 200-204
- [c97]Bunlong Lay, Sebastian Zaczek, Kristina Tesch, Timo Gerkmann:
Robustness of Speech Separation Models for Similar-Pitch Speakers. IWAENC 2024: 225-229
- 2023
- [c96]Huajian Fang, Timo Gerkmann:
Uncertainty Estimation in Deep Speech Enhancement Using Complex Gaussian Mixture Models. ICASSP 2023: 1-5
- [c95]Huajian Fang, Niklas Wittmer, Johannes Twiefel, Stefan Wermter, Timo Gerkmann:
Partially Adaptive Multichannel Joint Reduction of Ego-Noise and Environmental Noise. ICASSP 2023: 1-5
- [c94]Jean-Marie Lemercier, Julius Richter, Simon Welker, Timo Gerkmann:
Analysing Diffusion-based Generative Approaches Versus Discriminative Approaches for Speech Restoration. ICASSP 2023: 1-5
- [c93]Tal Peer, Simon Welker, Timo Gerkmann:
DiffPhase: Generative Diffusion-Based STFT Phase Retrieval. ICASSP 2023: 1-5
- [c92]Julius Richter, Simon Welker, Jean-Marie Lemercier, Bunlong Lay, Tal Peer, Timo Gerkmann:
Speech Signal Improvement Using Causal Generative Diffusion Models. ICASSP 2023: 1-2
- [c91]Kristina Tesch, Timo Gerkmann:
Spatially Selective Deep Non-Linear Filters For Speaker Extraction. ICASSP 2023: 1-5
- [c90]Ehsan Yaghoubi, André Peter Kelm, Timo Gerkmann, Simone Frintrop:
Acoustic and Visual Knowledge Distillation for Contrastive Audio-Visual Localization. ICMI 2023: 15-23
- [c89]Héctor Martel, Julius Richter, Kai Li, Xiaolin Hu, Timo Gerkmann:
Audio-Visual Speech Separation in Noisy Environments with a Lightweight Iterative Model. INTERSPEECH 2023: 1673-1677
- [c88]Danilo de Oliveira, Navin Raj Prabhu, Timo Gerkmann:
Leveraging Semantic Information for Efficient Self-Supervised Emotion Recognition with Audio-Textual Distilled Models. INTERSPEECH 2023: 3632-3636
- [c87]Bunlong Lay, Simon Welker, Julius Richter, Timo Gerkmann:
Reducing the Prior Mismatch of Stochastic Differential Equations for Diffusion-based Speech Enhancement. INTERSPEECH 2023: 3809-3813
- [c86]Jean-Marie Lemercier, Julian Tobergte, Timo Gerkmann:
Extending DNN-based Multiplicative Masking to Deep Subband Filtering for Improved Dereverberation. INTERSPEECH 2023: 4024-4028
- [c85]Jean-Marie Lemercier, Simon Welker, Timo Gerkmann:
Diffusion Posterior Sampling for Informed Single-Channel Dereverberation. WASPAA 2023: 1-5
- 2022
- [c84]Navin Raj Prabhu, Nale Lehmann-Willenbrock, Timo Gerkmann:
Label Uncertainty Modeling and Prediction for Speech Emotion Recognition using t-Distributions. ACII 2022: 1-8
- [c83]Jean-Marie Lemercier, Joachim Thiemann, Raphael Koning, Timo Gerkmann:
Customizable End-To-End Optimization Of Online Neural Network-Supported Dereverberation For Hearing Devices. ICASSP 2022: 171-175
- [c82]Huajian Fang, Tal Peer, Stefan Wermter, Timo Gerkmann:
Integrating Statistical Uncertainty into Neural Network-Based Speech Enhancement. ICASSP 2022: 386-390
- [c81]Simon Welker, Tal Peer, Henry N. Chapman, Timo Gerkmann:
Deep Iterative Phase Retrieval for Ptychography. ICASSP 2022: 1591-1595
- [c80]Julius Richter, Jeanine Liebold, Timo Gerkmann:
Continuous Phoneme Recognition based on Audio-Visual Modality Fusion. IJCNN 2022: 1-8
- [c79]Navin Raj Prabhu, Guillaume Carbajal, Nale Lehmann-Willenbrock, Timo Gerkmann:
End-To-End Label Uncertainty Modeling for Speech-based Arousal Recognition Using Bayesian Neural Networks. INTERSPEECH 2022: 151-155
- [c78]Jean-Marie Lemercier, Joachim Thiemann, Raphael Koning, Timo Gerkmann:
Neural Network-augmented Kalman Filtering for Robust Online Speech Dereverberation in Noisy Reverberant Environments. INTERSPEECH 2022: 226-230
- [c77]Kristina Tesch, Nils-Hendrik Mohrmann, Timo Gerkmann:
On the Role of Spatial, Spectral, and Temporal Processing for DNN-based Non-linear Multi-channel Speech Enhancement. INTERSPEECH 2022: 2908-2912
- [c76]Simon Welker, Julius Richter, Timo Gerkmann:
Speech Enhancement with Score-Based Generative Models in the Complex STFT Domain. INTERSPEECH 2022: 2928-2932
- [c75]Danilo de Oliveira, Tal Peer, Timo Gerkmann:
Efficient Transformer-based Speech Enhancement Using Long Frames and STFT Magnitudes. INTERSPEECH 2022: 2948-2952
- [c74]Tal Peer, Simon Welker, Timo Gerkmann:
Beyond Griffin-Lim: Improved Iterative Phase Retrieval for Speech. IWAENC 2022: 1-5
- [c73]Bunlong Lay, Timo Gerkmann:
Speech Enhancement Regularized by a Speaker Verification Model. MMSP 2022: 1-6
- 2021
- [c72]Huajian Fang, Guillaume Carbajal, Stefan Wermter, Timo Gerkmann:
Joint Reduction of Ego-noise and Environmental Noise with a Partially-adaptive Dictionary. ITG Conference on Speech Communication 2021: 1-5
- [c71]Jean-Marie Lemercier, Leroy Bartel, David Ditter, Timo Gerkmann:
An Integrated Deep Clustering-Based System for Speaker Count Agnostic Speech Separation. ITG Conference on Speech Communication 2021: 1-5
- [c70]Tal Peer, Timo Gerkmann:
Intelligibility Prediction of Speech Reconstructed From Its Magnitude or Phase. ITG Conference on Speech Communication 2021: 1-5
- [c69]Tal Peer, Klaus-Johan Ziegert, Timo Gerkmann:
Plosive Enhancement Using Phase Linearization and Smoothing. ITG Conference on Speech Communication 2021: 1-5
- [c68]Huajian Fang, Guillaume Carbajal, Stefan Wermter, Timo Gerkmann:
Variational Autoencoder for Speech Enhancement with a Noise-Aware Encoder. ICASSP 2021: 676-680
- [c67]Guillaume Carbajal, Julius Richter, Timo Gerkmann:
Guided Variational Autoencoder for Speech Enhancement with a Supervised Classifier. ICASSP 2021: 681-685
- [c66]Danu Caus, Guillaume Carbajal, Timo Gerkmann, Simone Frintrop:
See the Silence: Improving Visual-Only Voice Activity Detection by Optical Flow and RGB Fusion. ICVS 2021: 41-51
- [c65]Tobias Knopp, Mirco Grosser, Matthias Graeser, Timo Gerkmann, Martin Möddel:
Dictionary-Based Background Signal Estimation For Magnetic Particle Imaging. ISBI 2021: 1540-1543
- [c64]Xiaolin Hu, Kai Li, Weiyi Zhang, Yi Luo, Jean-Marie Lemercier, Timo Gerkmann:
Speech Separation Using an Asynchronous Fully Recurrent Convolutional Neural Network. NeurIPS 2021: 22509-22522
- [c63]Guillaume Carbajal, Julius Richter, Timo Gerkmann:
Disentanglement Learning for Variational Autoencoders Applied to Audio-Visual Speech Enhancement. WASPAA 2021: 126-130
- 2020
- [c62]David Ditter, Timo Gerkmann:
A Multi-Phase Gammatone Filterbank for Speech Separation Via TasNet. ICASSP 2020: 36-40
- [c61]Kristina Tesch, Timo Gerkmann:
Nonlinear Spatial Filtering for Multichannel Speech Enhancement in Inhomogeneous Noise Fields. ICASSP 2020: 196-200
- [c60]Quan Nguyen, Julius Richter, Mikko Lauri, Timo Gerkmann, Simone Frintrop:
Improving mix-and-separate training in audio-visual sound source separation with an object prior. ICPR 2020: 5844-5851
- [c59]Julius Richter, Guillaume Carbajal, Timo Gerkmann:
Speech Enhancement with Stochastic Temporal Convolutional Networks. INTERSPEECH 2020: 4516-4520
- [c58]Hongzhuo Liang, Chuangchuang Zhou, Shuang Li, Xiaojian Ma, Norman Hendrich, Timo Gerkmann, Fuchun Sun, Marcus Stoffel, Jianwei Zhang:
Robust Robotic Pouring using Audition and Haptics. IROS 2020: 10880-10887
- 2019
- [c57]Robert Rehr, Timo Gerkmann:
An Analysis of Noise-aware Features in Combination with the Size and Diversity of Training Data for DNN-based Speech Enhancement. ICASSP 2019: 601-605
- [c56]Kristina Tesch, Robert Rehr, Timo Gerkmann:
On Nonlinear Spatial Filtering in Multichannel Speech Enhancement. INTERSPEECH 2019: 91-95
- [c55]David Ditter, Timo Gerkmann:
Influence of Speaker-Specific Parameters on Speech Separation Systems. INTERSPEECH 2019: 4584-4588
- [c54]Hongzhuo Liang, Shuang Li, Xiaojian Ma, Norman Hendrich, Timo Gerkmann, Fuchun Sun, Jianwei Zhang:
Making Sense of Audio Vibration for Liquid Height Estimation in Robotic Pouring. IROS 2019: 5333-5339
- 2018
- [c53]Robert Rehr, Timo Gerkmann:
Robust DNN-Based Speech Enhancement with Limited Training Data. ITG Symposium on Speech Communication 2018: 1-5
- [c52]Martin Krawczyk-Becker, Timo Gerkmann:
A Study on the Benefits of Phase-Aware Speech Enhancement in Challenging Noise Scenarios. LVA/ICA 2018: 407-416
- [c51]Huy Phan, Martin Krawczyk-Becker, Timo Gerkmann, Alfred Mertins:
Weighted and Multi-Task Loss for Rare Audio Event Detection. ICASSP 2018: 336-340
- [c50]Martin Krawczyk-Becker, Timo Gerkmann:
Nonlinear Speech Enhancement Under Speech PSD Uncertainty. ICASSP 2018: 641-645
- 2017
- [c49]Robert Rehr, Timo Gerkmann:
MixMax Approximation as a Super-Gaussian Log-Spectral Amplitude Estimator for Speech Enhancement. INTERSPEECH 2017: 1983-1987
- 2016
- [c48]Dörte Fischer, Simon Doclo, Emanuël A. P. Habets, Timo Gerkmann:
Combined Single-Microphone Wiener and MVDR Filtering based on Speech Interframe Correlations and Speech Presence Probability. ITG Symposium on Speech Communication 2016: 1-5
- [c47]Robert Rehr, Timo Gerkmann:
A Combination of Pre-Trained Approaches and Generic Methods for an Improved Speech Enhancement. ITG Symposium on Speech Communication 2016: 1-5
- [c46]Dörte Fischer, Timo Gerkmann:
Single-microphone speech enhancement using MVDR filtering and Wiener post-filtering. ICASSP 2016: 201-205
- [c45]Robert Rehr, Timo Gerkmann:
Bias correction methods for adaptive recursive smoothing with applications in noise PSD estimation. ICASSP 2016: 206-210
- [c44]Benjamin Cauchi, Hamza A. Javed, Timo Gerkmann, Simon Doclo, Stefan Goetze, Patrick A. Naylor:
Perceptual and instrumental evaluation of the perceived level of reverberation. ICASSP 2016: 629-633
- [c43]Christoph Brauer, Timo Gerkmann, Dirk A. Lorenz:
Sparse reconstruction of quantized speech signals. ICASSP 2016: 5940-5944
- [c42]Ante Jukic, Zichao Wang, Toon van Waterschoot, Timo Gerkmann, Simon Doclo:
Constrained multi-channel linear prediction for adaptive speech dereverberation. IWAENC 2016: 1-5
- 2015
- [c41]Benjamin Cauchi, Patrick A. Naylor, Timo Gerkmann, Simon Doclo, Stefan Goetze:
Late reverberant spectral variance estimation using acoustic channel equalization. EUSIPCO 2015: 2481-2485
- [c40]Adam Kuklasinski, Simon Doclo, Timo Gerkmann, Søren Holdt Jensen, Jesper Jensen:
Multi-channel PSD estimators for speech dereverberation - A theoretical and experimental comparison. ICASSP 2015: 91-95
- [c39]Ante Jukic, Nasser Mohammadiha, Toon van Waterschoot, Timo Gerkmann, Simon Doclo:
Multi-channel linear prediction-based speech dereverberation with low-rank power spectrogram approximation. ICASSP 2015: 96-100
- [c38]Martin Krawczyk-Becker, Dörte Fischer, Timo Gerkmann:
Utilizing spectro-temporal correlations for an improved speech presence probability based noise power estimation. ICASSP 2015: 365-369
- [c37]Robert Rehr, Timo Gerkmann:
Cepstral noise subtraction for robust automatic speech recognition. ICASSP 2015: 375-378
- [c36]Sidsel Marie Nørholm, Martin Krawczyk-Becker, Timo Gerkmann, Steven van de Par, Jesper Rindom Jensen, Mads Græsbøll Christensen:
Least squares estimate of the initial phases in STFT based speech enhancement. INTERSPEECH 2015: 1750-1754
- [c35]Ante Jukic, Toon van Waterschoot, Timo Gerkmann, Simon Doclo:
Group sparsity for MIMO speech dereverberation. WASPAA 2015: 1-5
- [c34]Martin Krawczyk-Becker, Timo Gerkmann:
MMSE-optimal combination of Wiener filtering and harmonic model based speech enhancement in a general framework. WASPAA 2015: 1-5
- [c33]Robert Rehr, Timo Gerkmann:
On the bias of adaptive first-order recursive smoothing. WASPAA 2015: 1-5
- 2014
- [c32]Naveen Kumar Desiraju, Simon Doclo, Timo Gerkmann, Tobias Wolff:
Efficient Multi-Channel Acoustic Echo Cancellation Using Constrained Sparse Filter Updates in the Subband Domain. ITG Symposium on Speech Communication 2014: 1-4
- [c31]Balázs Fodor, Timo Gerkmann:
A speech presence probability estimator based on fixed priors and a heavy-tailed speech model. EUSIPCO 2014: 2305-2309
- [c30]Ante Jukic, Toon van Waterschoot, Timo Gerkmann, Simon Doclo:
Speech dereverberation with multi-channel linear prediction and sparse priors for the desired signal. HSCMA 2014: 23-26
- [c29]Timo Gerkmann:
MMSE-optimal enhancement of complex speech coefficients with uncertain prior knowledge of the clean speech phase. ICASSP 2014: 4478-4482
- [c28]Ina Kodrasi, Timo Gerkmann, Simon Doclo:
Frequency-domain single-channel inverse filtering for speech dereverberation: Theory and practice. ICASSP 2014: 5177-5181
- [c27]Robert Rehr, Martin Krawczyk, Timo Gerkmann:
A posteriori voiced/unvoiced probability estimation based on a sinusoidal model. ICASSP 2014: 6944-6948
- [c26]Balázs Fodor, Timo Gerkmann:
A posteriori speech presence probability estimation based on averaged observations and a super-Gaussian speech model. IWAENC 2014: 11-15
- [c25]Ante Jukic, Toon van Waterschoot, Timo Gerkmann, Simon Doclo:
Speech dereverberation with convolutive transfer function approximation using MAP and variational deconvolution approaches. IWAENC 2014: 50-54
- [c24]Tobias May, Timo Gerkmann:
Generalization of supervised learning for binary mask estimation. IWAENC 2014: 154-158
- [c23]Stefan Goetze, Anna Warzybok, Ina Kodrasi, Jan Ole Jungmann, Benjamin Cauchi, Jan Rennies, Emanuël A. P. Habets, Alfred Mertins, Timo Gerkmann, Simon Doclo, Birger Kollmeier:
A study on speech quality and speech intelligibility measures for quality assessment of single-channel dereverberation algorithms. IWAENC 2014: 233-237
- [c22]Steffen Kortlang, Stephan Dieter Ewert, Timo Gerkmann:
Single channel noise reduction based on an auditory filterbank. IWAENC 2014: 283-287
- [c21]Anna Warzybok, Ina Kodrasi, Jan Ole Jungmann, Emanuël A. P. Habets, Timo Gerkmann, Alfred Mertins, Simon Doclo, Birger Kollmeier, Stefan Goetze:
Subjective speech quality and speech intelligibility evaluation of single-channel dereverberation algorithms. IWAENC 2014: 332-336
- 2013
- [c20]Richard C. Hendriks, Zekeriya Erkin, Timo Gerkmann:
Privacy preserving distributed beamforming based on homomorphic encryption. EUSIPCO 2013: 1-5
- [c19]Martin Krawczyk, Robert Rehr, Timo Gerkmann:
Phase-sensitive real-time capable speech enhancement under voiced-unvoiced uncertainty. EUSIPCO 2013: 1-5
- [c18]Richard C. Hendriks, Zekeriya Erkin, Timo Gerkmann:
Privacy-preserving distributed speech enhancement for wireless sensor networks by processing in the encrypted domain. ICASSP 2013: 7005-7009
- [c17]Ramón Fernandez Astudillo, Timo Gerkmann:
On the relation between speech corruption models in the spectral and the cepstral domain. ICASSP 2013: 7044-7048
- 2012
- [c16]Timo Gerkmann, Richard C. Hendriks:
Improved MMSE-based noise PSD tracking using temporal cepstrum smoothing. ICASSP 2012: 105-108
- [c15]Martin Krawczyk, Timo Gerkmann:
STFT Phase Improvement for Single Channel Speech Enhancement. IWAENC 2012
- [c14]Lin Wang, Timo Gerkmann, Simon Doclo:
Noise PSD Estimation Using Blind Source Separation in a Diffuse Noise Field. IWAENC 2012
- 2011
- [c13]Timo Gerkmann:
Cepstral weighting for speech dereverberation without musical noise. EUSIPCO 2011: 2309-2313
- [c12]Richard C. Hendriks, Timo Gerkmann:
Estimation of the noise correlation matrix. ICASSP 2011: 4740-4743
- [c11]Nasser Mohammadiha, Timo Gerkmann, Arne Leijon:
A new approach for speech enhancement based on a constrained Nonnegative Matrix Factorization. ISPACS 2011: 1-5
- [c10]Jalil Taghia, Timo Gerkmann, Arne Leijon:
Blind source separation of nondisjoint sources in the time-frequency domain with model-based determination of source contribution. ISSPIT 2011: 276-280
- [c9]Nasser Mohammadiha, Timo Gerkmann, Arne Leijon:
A new linear MMSE filter for single channel speech enhancement based on Nonnegative Matrix Factorization. WASPAA 2011: 45-48
- [c8]Timo Gerkmann, Richard C. Hendriks:
Noise power estimation based on the probability of speech presence. WASPAA 2011: 145-148
- 2010
- [c7]Timo Gerkmann, Rainer Martin:
Cepstral Smoothing with Reduced Computational Complexity. Sprachkommunikation 2010: 1-4
- [c6]Anil M. Nagathil, Timo Gerkmann, Rainer Martin:
Musical genre classification based on a highly-resolved cepstral modulation spectrum. EUSIPCO 2010: 462-466
- [c5]Timo Gerkmann, Martin Krawczyk, Rainer Martin:
Speech presence probability estimation based on temporal cepstrum smoothing. ICASSP 2010: 4254-4257
- 2009
- [c4]Timo Gerkmann, Rainer Martin, Derya Dalga:
Multi-microphone maximum a posteriori fundamental frequency estimation in the cepstral domain. ICASSP 2009: 4505-4508
- 2008
- [c3]Colin Breithaupt, Timo Gerkmann, Rainer Martin:
A novel a priori SNR estimation approach based on selective cepstro-temporal smoothing. ICASSP 2008: 4897-4900
- 2006
- [c2]Justinian Rosca, Timo Gerkmann, Doru-Cristian Balcan:
Statistical Inference of Missing Speech Data in the ICA Domain. ICASSP (5) 2006: 617-620
- [c1]Timo Gerkmann, Rainer Martin:
Soft decision combining for dual channel noise reduction. INTERSPEECH 2006
Informal and Other Publications
- 2024
- [i60]Bunlong Lay, Timo Gerkmann:
An Analysis of the Variance of Diffusion-based Speech Enhancement. CoRR abs/2402.00811 (2024)
- [i59]Jean-Marie Lemercier, Julius Richter, Simon Welker, Eloi Moliner, Vesa Välimäki, Timo Gerkmann:
Diffusion Models for Audio Restoration. CoRR abs/2402.09821 (2024)
- [i58]Eloi Moliner, Jean-Marie Lemercier, Simon Welker, Timo Gerkmann, Vesa Välimäki:
BUDDy: Single-Channel Blind Unsupervised Dereverberation with Diffusion Models. CoRR abs/2405.04272 (2024)
- [i57]Danilo de Oliveira, Simon Welker, Julius Richter, Timo Gerkmann:
The PESQetarian: On the Relevance of Goodhart's Law for Speech Enhancement. CoRR abs/2406.03460 (2024)
- [i56]Julius Richter, Yi-Chiao Wu, Steven Krenn, Simon Welker, Bunlong Lay, Shinji Watanabe, Alexander Richard, Timo Gerkmann:
EARS: An Anechoic Fullband Speech Dataset Benchmarked for Speech Enhancement and Dereverberation. CoRR abs/2406.06185 (2024)
- [i55]Bunlong Lay, Sebastian Zaczek, Kristina Tesch, Timo Gerkmann:
Robustness of Speech Separation Models for Similar-pitch Speakers. CoRR abs/2407.15749 (2024)
- [i54]Jean-Marie Lemercier, Eloi Moliner, Simon Welker, Vesa Välimäki, Timo Gerkmann:
Unsupervised Blind Joint Dereverberation and Room Acoustics Estimation with Diffusion Models. CoRR abs/2408.07472 (2024)
- [i53]Navin Raj Prabhu, Maria Tsfasman, Catharine Oertel, Timo Gerkmann, Nale Lehmann-Willenbrock:
Dynamics of Collective Group Affect: Group-level Annotations and the Multimodal Modeling of Convergence and Divergence. CoRR abs/2409.08578 (2024)
- [i52]Julius Richter, Danilo de Oliveira, Timo Gerkmann:
Investigating Training Objectives for Generative Speech Enhancement. CoRR abs/2409.10753 (2024)
- 2023
- [i51]Bunlong Lay, Simon Welker, Julius Richter, Timo Gerkmann:
Reducing the Prior Mismatch of Stochastic Differential Equations for Diffusion-based Speech Enhancement. CoRR abs/2302.14748 (2023)
- [i50]Jean-Marie Lemercier, Julian Tobergte, Timo Gerkmann:
Extending DNN-based Multiplicative Masking to Deep Subband Filtering for Improved Dereverberation. CoRR abs/2303.00529 (2023)
- [i49]Julius Richter, Simon Welker, Jean-Marie Lemercier, Bunlong Lay, Tal Peer, Timo Gerkmann:
Speech Signal Improvement Using Causal Generative Diffusion Models. CoRR abs/2303.08674 (2023)
- [i48]Huajian Fang, Niklas Wittmer, Johannes Twiefel, Stefan Wermter, Timo Gerkmann:
Partially Adaptive Multichannel Joint Reduction of Ego-noise and Environmental Noise. CoRR abs/2303.15042 (2023)
- [i47]Kristina Tesch, Timo Gerkmann:
Multi-channel Speech Separation Using Spatially Selective Deep Non-linear Filters. CoRR abs/2304.12023 (2023)
- [i46]Huajian Fang, Dennis Becker, Stefan Wermter, Timo Gerkmann:
Integrating Uncertainty into Neural Network-based Speech Enhancement. CoRR abs/2305.08744 (2023)
- [i45]Danilo de Oliveira, Navin Raj Prabhu, Timo Gerkmann:
Leveraging Semantic Information for Efficient Self-Supervised Emotion Recognition with Audio-Textual Distilled Models. CoRR abs/2305.19184 (2023)
- [i44]Héctor Martel, Julius Richter, Kai Li, Xiaolin Hu, Timo Gerkmann:
Audio-Visual Speech Separation in Noisy Environments with a Lightweight Iterative Model. CoRR abs/2306.00160 (2023)
- [i43]Julius Richter, Simone Frintrop, Timo Gerkmann:
Audio-Visual Speech Enhancement with Score-Based Generative Models. CoRR abs/2306.01432 (2023)
- [i42]Navin Raj Prabhu, Nale Lehmann-Willenbrock, Timo Gerkmann:
In-the-wild Speech Emotion Conversion Using Disentangled Self-Supervised Representations and Neural Vocoder-based Resynthesis. CoRR abs/2306.01916 (2023)
- [i41]Danilo de Oliveira, Julius Richter, Jean-Marie Lemercier, Tal Peer, Timo Gerkmann:
On the Behavior of Intrusive and Non-intrusive Speech Enhancement Metrics in Predictive and Generative Settings. CoRR abs/2306.03014 (2023)
- [i40]Jean-Marie Lemercier, Simon Welker, Timo Gerkmann:
Diffusion Posterior Sampling for Informed Single-Channel Dereverberation. CoRR abs/2306.12286 (2023)
- [i39]Jean-Marie Lemercier, Joachim Thiemann, Raphael Koning, Timo Gerkmann:
Wind Noise Reduction with a Diffusion-based Stochastic Regeneration Model. CoRR abs/2306.12867 (2023)
- [i38]Tal Peer, Simon Welker, Johannes Kolhoff, Timo Gerkmann:
A Flexible Online Framework for Projection-Based STFT Phase Retrieval. CoRR abs/2309.07043 (2023)
- [i37]Navin Raj Prabhu, Bunlong Lay, Simon Welker, Nale Lehmann-Willenbrock, Timo Gerkmann:
EMOCONV-DIFF: Diffusion-based Speech Emotion Conversion for Non-parallel and In-the-wild Data. CoRR abs/2309.07828 (2023)
- [i36]Bunlong Lay, Jean-Marie Lemercier, Julius Richter, Timo Gerkmann:
Single and Few-step Diffusion for Generative Speech Enhancement. CoRR abs/2309.09677 (2023)
- [i35]Danilo de Oliveira, Timo Gerkmann:
Distilling HuBERT with LSTMs via Decoupled Knowledge Distillation. CoRR abs/2309.09920 (2023)
- 2022
- [i34]Simon Welker, Tal Peer, Henry N. Chapman, Timo Gerkmann:
Deep Iterative Phase Retrieval for Ptychography. CoRR abs/2202.10573 (2022)
- [i33]Huajian Fang, Tal Peer, Stefan Wermter, Timo Gerkmann:
Integrating Statistical Uncertainty into Neural Network-Based Speech Enhancement. CoRR abs/2203.02288 (2022)
- [i32]Tal Peer, Timo Gerkmann:
Phase-Aware Deep Speech Enhancement: It's All About The Frame Length. CoRR abs/2203.16222 (2022)
- [i31]Simon Welker, Julius Richter, Timo Gerkmann:
Speech Enhancement with Score-Based Generative Models in the Complex STFT Domain. CoRR abs/2203.17004 (2022)
- [i30]Jean-Marie Lemercier, Joachim Thiemann, Raphael Koning, Timo Gerkmann:
Customizable End-to-end Optimization of Online Neural Network-supported Dereverberation for Hearing Devices. CoRR abs/2204.02694 (2022)
- [i29]Jean-Marie Lemercier, Joachim Thiemann, Raphael Koning, Timo Gerkmann:
Neural Network-augmented Kalman Filtering for Robust Online Speech Dereverberation in Noisy Reverberant Environments. CoRR abs/2204.02741 (2022)
- [i28]Jean-Marie Lemercier, Joachim Thiemann, Raphael Koning, Timo Gerkmann:
End-To-End Optimization of Online Neural Network-supported Two-Stage Dereverberation for Hearing Devices. CoRR abs/2204.02978 (2022)
- [i27]Tal Peer, Simon Welker, Timo Gerkmann:
Beyond Griffin-Lim: Improved Iterative Phase Retrieval for Speech. CoRR abs/2205.05496 (2022)
- [i26]Kristina Tesch, Nils-Hendrik Mohrmann, Timo Gerkmann:
On the Role of Spatial, Spectral, and Temporal Processing for DNN-based Non-linear Multi-channel Speech Enhancement. CoRR abs/2206.11181 (2022)
- [i25]Danilo de Oliveira, Tal Peer, Timo Gerkmann:
Efficient Transformer-based Speech Enhancement Using Long Frames and STFT Magnitudes. CoRR abs/2206.11703 (2022)
- [i24]Kristina Tesch, Timo Gerkmann:
Insights into Deep Non-linear Filters for Improved Multi-channel Speech Enhancement. CoRR abs/2206.13310 (2022)
- [i23]Navin Raj Prabhu, Nale Lehmann-Willenbrock, Timo Gerkmann:
Label Uncertainty Modeling and Prediction for Speech Emotion Recognition using t-Distributions. CoRR abs/2207.12135 (2022)
- [i22]Julius Richter, Simon Welker, Jean-Marie Lemercier, Bunlong Lay, Timo Gerkmann:
Speech Enhancement and Dereverberation with Diffusion-based Generative Models. CoRR abs/2208.05830 (2022)
- [i21]Navin Raj Prabhu, Nale Lehmann-Willenbrock, Timo Gerkmann:
End-to-End Label Uncertainty Modeling in Speech Emotion Recognition using Bayesian Neural Networks and Label Distribution Learning. CoRR abs/2209.15449 (2022)
- [i20]Jean-Marie Lemercier, Julius Richter, Simon Welker, Timo Gerkmann:
Analysing Diffusion-based Generative Approaches versus Discriminative Approaches for Speech Restoration. CoRR abs/2211.02397 (2022)
- [i19]Kristina Tesch, Timo Gerkmann:
Spatially Selective Deep Non-linear Filters for Speaker Extraction. CoRR abs/2211.02420 (2022)
- [i18]Tal Peer, Simon Welker, Timo Gerkmann:
DiffPhase: Generative Diffusion-based STFT Phase Retrieval. CoRR abs/2211.04332 (2022)
- [i17]Simon Welker, Henry N. Chapman, Timo Gerkmann:
DriftRec: Adapting diffusion models to blind image restoration tasks. CoRR abs/2211.06757 (2022)
- [i16]Huajian Fang, Timo Gerkmann:
Uncertainty Estimation in Deep Speech Enhancement Using Complex Gaussian Mixture Models. CoRR abs/2212.04831 (2022)
- [i15]Jean-Marie Lemercier, Julius Richter, Simon Welker, Timo Gerkmann:
StoRM: A Diffusion-based Stochastic Regeneration Model for Speech Enhancement and Dereverberation. CoRR abs/2212.11851 (2022)
- 2021
- [i14]Guillaume Carbajal, Julius Richter, Timo Gerkmann:
Guided Variational Autoencoder for Speech Enhancement With a Supervised Classifier. CoRR abs/2102.06454 (2021)
- [i13]Huajian Fang, Guillaume Carbajal, Stefan Wermter, Timo Gerkmann:
Variational Autoencoder for Speech Enhancement with a Noise-Aware Encoder. CoRR abs/2102.08706 (2021)
- [i12]Kristina Tesch, Timo Gerkmann:
Nonlinear Spatial Filtering in Multichannel Speech Enhancement. CoRR abs/2104.11033 (2021)
- [i11]Guillaume Carbajal, Julius Richter, Timo Gerkmann:
Disentanglement Learning for Variational Autoencoders Applied to Audio-Visual Speech Enhancement. CoRR abs/2105.08970 (2021)
- [i10]Navin Raj Prabhu, Guillaume Carbajal, Nale Lehmann-Willenbrock, Timo Gerkmann:
End-to-end label uncertainty modeling for speech emotion recognition using Bayesian neural networks. CoRR abs/2110.03299 (2021)
- [i9]Xiaolin Hu, Kai Li, Weiyi Zhang, Yi Luo, Jean-Marie Lemercier, Timo Gerkmann:
Speech Separation Using an Asynchronous Fully Recurrent Convolutional Neural Network. CoRR abs/2112.02321 (2021)
- 2020
- [i8]Hongzhuo Liang, Chuangchuang Zhou, Shuang Li, Xiaojian Ma, Norman Hendrich, Timo Gerkmann, Fuchun Sun, Jianwei Zhang:
Robust Robotic Pouring using Audition and Haptics. CoRR abs/2003.00342 (2020)
- [i7]Robert Rehr, Timo Gerkmann:
SNR-Based Features and Diverse Training Data for Robust DNN-Based Speech Enhancement. CoRR abs/2004.03512 (2020)
- [i6]Thilo Fryen, Manfred Eppe, Phuong D. H. Nguyen, Timo Gerkmann, Stefan Wermter:
Reinforcement Learning with Time-dependent Goals for Robotic Musicians. CoRR abs/2011.05715 (2020)
- 2019
- [i5]Hongzhuo Liang, Shuang Li, Xiaojian Ma, Norman Hendrich, Timo Gerkmann, Jianwei Zhang:
Making Sense of Audio Vibration for Liquid Height Estimation in Robotic Pouring. CoRR abs/1903.00650 (2019)
- [i4]David Ditter, Timo Gerkmann:
A Multi-Phase Gammatone Filterbank for Speech Separation via TasNet. CoRR abs/1910.11615 (2019)
- 2017
- [i3]Robert Rehr, Timo Gerkmann:
On the Importance of Super-Gaussian Speech Priors for Pre-Trained Speech Enhancement. CoRR abs/1703.05003 (2017)
- [i2]Huy Phan, Martin Krawczyk-Becker, Timo Gerkmann, Alfred Mertins:
DNN and CNN with Weighted and Multi-task Loss Functions for Audio Event Detection. CoRR abs/1708.03211 (2017)
- [i1]Robert Rehr, Timo Gerkmann:
Improving the Generalizability of Deep Neural Network Based Speech Enhancement. CoRR abs/1709.02175 (2017)
last updated on 2024-10-30 21:30 CET by the dblp team
all metadata released as open data under CC0 1.0 license