


Rama Sanand Doddipatla
2020 – today
- 2025
- [i25] Youmna Farag, Svetlana Stoyanchev, Mohan Li, Simon Keizer, Rama Doddipatla:
Conditional Multi-Stage Failure Recovery for Embodied Agents. CoRR abs/2507.06016 (2025)
- 2024
- [j3] Alejandro Santorum Varela, Svetlana Stoyanchev, Simon Keizer, Rama Doddipatla, Kate M. Knill:
Entity Resolution in Situated Dialog With Unimodal and Multimodal Transformers. IEEE ACM Trans. Audio Speech Lang. Process. 32: 704-713 (2024)
- [c58] Chengzu Li, Chao Zhang, Simone Teufel, Rama Sanand Doddipatla, Svetlana Stoyanchev:
Semantic Map-based Generation of Navigation Instructions. LREC/COLING 2024: 14628-14640
- [c57] Abigail Sticha, Norbert Braunschweiler, Rama Sanand Doddipatla, Kate M. Knill:
Advancing Faithfulness of Large Language Models in Goal-Oriented Dialogue Question Answering. CUI 2024: 32
- [c56] Cong-Thanh Do, Shuhei Imai, Rama Doddipatla, Thomas Hain:
Improving Accented Speech Recognition Using Data Augmentation Based on Unsupervised Text-to-Speech Synthesis. EUSIPCO 2024: 136-140
- [c55] Tobias Cord-Landwehr, Christoph Böddeker, Catalin Zorila, Rama Doddipatla, Reinhold Haeb-Umbach:
Geodesic Interpolation of Frame-Wise Speaker Embeddings for the Diarization of Meeting Scenarios. ICASSP 2024: 11886-11890
- [c54] Mohan Li, Simon Keizer, Rama Doddipatla:
Prompting Whisper for QA-driven Zero-shot End-to-end Spoken Language Understanding. INTERSPEECH 2024
- [c53] Mohan Li, Cong-Thanh Do, Simon Keizer, Youmna Farag, Svetlana Stoyanchev, Rama Doddipatla:
WHISMA: A Speech-LLM to Perform Zero-Shot Spoken Language Understanding. SLT 2024: 1115-1122
- [i24] Chengzu Li, Chao Zhang, Simone Teufel, Rama Sanand Doddipatla, Svetlana Stoyanchev:
Semantic Map-based Generation of Navigation Instructions. CoRR abs/2403.19603 (2024)
- [i23] Cong-Thanh Do, Shuhei Imai, Rama Doddipatla, Thomas Hain:
Improving Accented Speech Recognition using Data Augmentation based on Unsupervised Text-to-Speech Synthesis. CoRR abs/2407.04047 (2024)
- [i22] Mohan Li, Cong-Thanh Do, Simon Keizer, Youmna Farag, Svetlana Stoyanchev, Rama Doddipatla:
WHISMA: A Speech-LLM to Perform Zero-shot Spoken Language Understanding. CoRR abs/2408.16423 (2024)
- 2023
- [c52] Daichi Hayakawa, Takehiko Kagoshima, Kenji Iwata, Norbert Braunschweiler, Rama Doddipatla:
Robust Recognition of Speaker Emotion With Difference Feature Extraction Using a Few Enrollment Utterances. ASRU 2023: 1-7
- [c51] Mohan Li, Catalin Zorila, Cong-Thanh Do, Rama Doddipatla:
Towards a Unified End-to-End Language Understanding System for Speech and Text Inputs. ASRU 2023: 1-8
- [c50] Norbert Braunschweiler, Rama Sanand Doddipatla, Simon Keizer, Svetlana Stoyanchev:
Enabling Semi-Structured Knowledge Access via a Question-Answering Module in Task-oriented Dialogue Systems. CUI 2023: 36:1-36:11
- [c49] Tobias Cord-Landwehr, Christoph Böddeker, Catalin Zorila, Rama Doddipatla, Reinhold Haeb-Umbach:
Frame-Wise and Overlap-Robust Speaker Embeddings for Meeting Diarization. ICASSP 2023: 1-5
- [c48] Mohan Li, Cong-Thanh Do, Rama Doddipatla:
Cumulative Attention Based Streaming Transformer ASR with Internal Language Model Joint Training and Rescoring. ICASSP 2023: 1-5
- [c47] Catalin Zorila, Rama Doddipatla:
On the Effectiveness of Monoaural Target Source Extraction for Distant end-to-end Automatic Speech Recognition. ICASSP 2023: 1-5
- [c46] Cong-Thanh Do, Rama Doddipatla, Mohan Li, Thomas Hain:
Domain Adaptive Self-supervised Training of Automatic Speech Recognition. INTERSPEECH 2023: 4389-4393
- [c45] Tobias Cord-Landwehr, Christoph Böddeker, Catalin Zorila, Rama Doddipatla, Reinhold Haeb-Umbach:
A Teacher-Student Approach for Extracting Informative Speaker Embeddings From Speech Mixtures. INTERSPEECH 2023: 4703-4707
- [i21] Mohan Li, Rama Doddipatla:
Non-autoregressive End-to-end Approaches for Joint Automatic Speech Recognition and Spoken Language Understanding. CoRR abs/2304.10869 (2023)
- [i20] Mohan Li, Rama Doddipatla, Catalin Zorila:
Self-regularised Minimum Latency Training for Streaming Transformer-based Speech Recognition. CoRR abs/2304.11985 (2023)
- [i19] Tobias Cord-Landwehr, Christoph Böddeker, Catalin Zorila, Rama Doddipatla, Reinhold Haeb-Umbach:
A Teacher-Student approach for extracting informative speaker embeddings from speech mixtures. CoRR abs/2306.00634 (2023)
- [i18] Simon Keizer, Caroline Dockes, Norbert Braunschweiler, Svetlana Stoyanchev, Rama Doddipatla:
Adversarial learning of neural user simulators for dialogue policy optimisation. CoRR abs/2306.00858 (2023)
- [i17] Norbert Braunschweiler, Rama Doddipatla, Simon Keizer, Svetlana Stoyanchev:
Evaluating Large Language Models for Document-grounded Response Generation in Information-Seeking Dialogues. CoRR abs/2309.11838 (2023)
- 2022
- [j2] Norbert Braunschweiler, Rama Doddipatla, Simon Keizer, Svetlana Stoyanchev:
Factors in Emotion Recognition With Deep Learning Models Using Speech and Text on Multiple Corpora. IEEE Signal Process. Lett. 29: 722-726 (2022)
- [c44] Catalin Zorila, Rama Doddipatla:
Speaker Reinforcement Using Target Source Extraction for Robust Automatic Speech Recognition. ICASSP 2022: 6297-6301
- [c43] Mohan Li, Shucong Zhang, Catalin Zorila, Rama Doddipatla:
Transformer-Based Streaming ASR with Cumulative Attention. ICASSP 2022: 8272-8276
- [c42] Jisi Zhang, Catalin Zorila, Rama Doddipatla, Jon Barker:
On monoaural speech enhancement for automatic recognition of real noisy speech using mixture invariant training. INTERSPEECH 2022: 1056-1060
- [c41] Mohan Li, Rama Sanand Doddipatla, Catalin Zorila:
Self-regularised Minimum Latency Training for Streaming Transformer-based Speech Recognition. INTERSPEECH 2022: 2088-2092
- [c40] Cong-Thanh Do, Mohan Li, Rama Doddipatla:
Multiple-hypothesis RNN-T Loss for Unsupervised Fine-tuning and Self-training of Neural Transducer. INTERSPEECH 2022: 4446-4450
- [c39] Tobias Cord-Landwehr, Christoph Böddeker, Thilo von Neumann, Catalin Zorila, Rama Doddipatla, Reinhold Haeb-Umbach:
Monaural Source Separation: From Anechoic To Reverberant Environments. IWAENC 2022: 1-5
- [c38] Svetlana Stoyanchev, Suraj Pandey, Simon Keizer, Norbert Braunschweiler, Rama Sanand Doddipatla:
Combining Structured and Unstructured Knowledge in an Interactive Search Dialogue System. SIGDIAL 2022: 531-540
- [c37] Mohan Li, Rama Doddipatla:
Non-Autoregressive End-to-End Approaches for Joint Automatic Speech Recognition and Spoken Language Understanding. SLT 2022: 390-397
- [i16] Norbert Braunschweiler, Rama Doddipatla, Simon Keizer, Svetlana Stoyanchev:
A study on cross-corpus speech emotion recognition and data augmentation. CoRR abs/2201.03511 (2022)
- [i15] Mohan Li, Shucong Zhang, Catalin Zorila, Rama Doddipatla:
Transformer-based Streaming ASR with Cumulative Attention. CoRR abs/2203.05736 (2022)
- [i14] Simon Keizer, Norbert Braunschweiler, Svetlana Stoyanchev, Rama Doddipatla:
Dialogue Strategy Adaptation to New Action Sets Using Multi-dimensional Modelling. CoRR abs/2204.07082 (2022)
- [i13] Jisi Zhang, Catalin Zorila, Rama Doddipatla, Jon Barker:
On monoaural speech enhancement for automatic recognition of real noisy speech using mixture invariant training. CoRR abs/2205.01751 (2022)
- [i12] Catalin Zorila, Rama Doddipatla:
Speaker Reinforcement Using Target Source Extraction for Robust Automatic Speech Recognition. CoRR abs/2205.04433 (2022)
- [i11] Cong-Thanh Do, Mohan Li, Rama Doddipatla:
Multiple-hypothesis RNN-T Loss for Unsupervised Fine-tuning and Self-training of Neural Transducer. CoRR abs/2207.14736 (2022)
- 2021
- [c36] Norbert Braunschweiler, Rama Doddipatla, Simon Keizer, Svetlana Stoyanchev:
A Study on Cross-Corpus Speech Emotion Recognition and Data Augmentation. ASRU 2021: 24-30
- [c35] Mohan Li, Rama Doddipatla:
Improving HS-DACS Based Streaming Transformer ASR with Deep Reinforcement Learning. ASRU 2021: 154-161
- [c34] Simon Keizer, Norbert Braunschweiler, Svetlana Stoyanchev, Rama Doddipatla:
Dialogue Strategy Adaptation to New Action Sets Using Multi-Dimensional Modelling. ASRU 2021: 977-983
- [c33] Shucong Zhang, Cong-Thanh Do, Rama Doddipatla, Erfan Loweimi, Peter Bell, Steve Renals:
Train Your Classifier First: Cascade Neural Networks Training from Upper Layers to Lower Layers. ICASSP 2021: 2750-2754
- [c32] Mohan Li, Catalin Zorila, Rama Doddipatla:
Head-Synchronous Decoding for Transformer-Based Streaming ASR. ICASSP 2021: 5909-5913
- [c31] Jisi Zhang, Catalin Zorila, Rama Doddipatla, Jon Barker:
Time-Domain Speech Extraction with Spatial Information and Multi Speaker Conditioning Mechanism. ICASSP 2021: 6084-6088
- [c30] Cong-Thanh Do, Rama Doddipatla, Thomas Hain:
Multiple-Hypothesis CTC-Based Semi-Supervised Adaptation of End-to-End Speech Recognition. ICASSP 2021: 6978-6982
- [c29] Svetlana Stoyanchev, Simon Keizer, Rama Doddipatla:
Action State Update Approach to Dialogue Management. ICASSP 2021: 7398-7402
- [c28] Jisi Zhang, Catalin Zorila, Rama Doddipatla, Jon Barker:
Teacher-Student MixIT for Unsupervised and Semi-Supervised Speech Separation. Interspeech 2021: 3495-3499
- [c27] Suraj Pandey, Svetlana Stoyanchev, Rama Doddipatla:
Towards Handling Unconstrained User Preferences in Dialogue. IWSDS 2021: 83-97
- [c26] Mohan Li, Catalin Zorila, Rama Doddipatla:
Transformer-Based Online Speech Recognition with Decoder-end Adaptive Computation Steps. SLT 2021: 1-7
- [c25] Catalin Zorila, Mohan Li, Rama Doddipatla:
An Investigation into the Multi-channel Time Domain Speaker Extraction Network. SLT 2021: 793-800
- [i10] Jisi Zhang, Catalin Zorila, Rama Doddipatla, Jon Barker:
Time-Domain Speech Extraction with Spatial Information and Multi Speaker Conditioning Mechanism. CoRR abs/2102.03762 (2021)
- [i9] Shucong Zhang, Cong-Thanh Do, Rama Doddipatla, Erfan Loweimi, Peter Bell, Steve Renals:
Train your classifier first: Cascade Neural Networks Training from upper layers to lower layers. CoRR abs/2102.04697 (2021)
- [i8] Cong-Thanh Do, Rama Doddipatla, Thomas Hain:
Multiple-hypothesis CTC-based semi-supervised adaptation of end-to-end speech recognition. CoRR abs/2103.15515 (2021)
- [i7] Jisi Zhang, Catalin Zorila, Rama Doddipatla, Jon Barker:
Teacher-Student MixIT for Unsupervised and Semi-supervised Speech Separation. CoRR abs/2106.07843 (2021)
- [i6] Suraj Pandey, Svetlana Stoyanchev, Rama Doddipatla:
Towards Handling Unconstrained User Preferences in Dialogue. CoRR abs/2109.08650 (2021)
- [i5] Tobias Cord-Landwehr, Christoph Böddeker, Thilo von Neumann, Catalin Zorila, Rama Doddipatla, Reinhold Haeb-Umbach:
Monaural source separation: From anechoic to reverberant environments. CoRR abs/2111.07578 (2021)
- 2020
- [c24] Jisi Zhang, Catalin Zorila, Rama Doddipatla, Jon Barker:
On End-to-end Multi-channel Time Domain Speech Separation in Reverberant Environments. ICASSP 2020: 6389-6393
- [c23] Shucong Zhang, Cong-Thanh Do, Rama Doddipatla, Steve Renals:
Learning Noise Invariant Features Through Transfer Learning For Robust End-to-End Speech Recognition. ICASSP 2020: 7024-7028
- [i4] Svetlana Stoyanchev, Simon Keizer, Rama Doddipatla:
Action State Update Approach to Dialogue Management. CoRR abs/2011.04637 (2020)
- [i3] Jisi Zhang, Catalin Zorila, Rama Doddipatla, Jon Barker:
On End-to-end Multi-channel Time Domain Speech Separation in Reverberant Environments. CoRR abs/2011.05958 (2020)
2010 – 2019
- 2019
- [c22] Catalin Zorila, Christoph Böddeker, Rama Doddipatla, Reinhold Haeb-Umbach:
An Investigation into the Effectiveness of Enhancement in ASR Training and Test for Chime-5 Dinner Party Transcription. ASRU 2019: 47-53
- [c21] Petko Nikolov Petkov, Vasileios Tsiaras, Rama Doddipatla, Yannis Stylianou:
An Unsupervised Learning Approach to Neural-net-supported Wpe Dereverberation. ICASSP 2019: 5761-5765
- [c20] Catalin Zorila, Rama Doddipatla:
On Reducing the Effect of Speaker Overlap for Chime-5. ICASSP 2019: 6645-6649
- [i2] Catalin Zorila, Christoph Böddeker, Rama Doddipatla, Reinhold Haeb-Umbach:
An Investigation into the Effectiveness of Enhancement in ASR Training and Test for CHiME-5 Dinner Party Transcription. CoRR abs/1909.12208 (2019)
- 2017
- [c19] Rama Doddipatla, Norbert Braunschweiler, Ranniery Maia:
Speaker Adaptation in DNN-Based Speech Synthesis Using d-Vectors. INTERSPEECH 2017: 3404-3408
- 2016
- [c18] Rama Doddipatla:
Speaker adaptive training in deep neural networks using speaker dependent bottleneck features. ICASSP 2016: 5290-5294
- 2015
- [c17] Madina Hasan, Rama Doddipatla, Thomas Hain:
Noise-matched training of CRF based sentence end detection models. INTERSPEECH 2015: 349-353
- [i1] Raymond W. M. Ng, Mortaza Doulaty, Rama Doddipatla, Wilker Aziz, Kashif Shah, Oscar Saz, Madina Hasan, Ghada AlHarbi, Lucia Specia, Thomas Hain:
The USFD Spoken Language Translation System for IWSLT 2014. CoRR abs/1509.03870 (2015)
- 2014
- [c16] Rama Doddipatla, Madina Hasan, Thomas Hain:
Speaker dependent bottleneck layer training for speaker adaptation in automatic speech recognition. INTERSPEECH 2014: 2199-2203
- [c15] Madina Hasan, Rama Doddipatla, Thomas Hain:
Multi-pass sentence-end detection of lecture speech. INTERSPEECH 2014: 2902-2906
- [c14] Raymond W. M. Ng, Mortaza Doulaty, Rama Doddipatla, Wilker Aziz, Kashif Shah, Oscar Saz, Madina Hasan, Ghada AlHaribi, Lucia Specia, Thomas Hain:
The USFD SLT system for IWSLT 2014. IWSLT (Evaluation Campaign) 2014
- 2013
- [c13] D. Rama Sanand, Torbjørn Svendsen:
Synthetic speaker models using VTLN to improve the performance of children in mismatched speaker conditions for ASR. INTERSPEECH 2013: 3361-3365
- [c12] Jarle Bauck Hamar, Doddipatla Rama Sanand, Torbjørn Svendsen, Thippur Sreenivas:
Non-negative durational HMM. MLSP 2013: 1-6
- 2012
- [j1] D. Rama Sanand, Srinivasan Umesh:
VTLN Using Analytically Determined Linear-Transformation on Conventional MFCC. IEEE Trans. Speech Audio Process. 20(5): 1573-1584 (2012)
- [c11] Reima Karhila, Doddipatla Rama Sanand, Mikko Kurimo, Peter Smit:
Creating synthetic voices for children by adapting adult average voice using stacked transformations and VTLN. ICASSP 2012: 4501-4504
- 2011
- [c10] Doddipatla Rama Sanand, Mikko Kurimo:
A Study on Combining VTLN and SAT to Improve the Performance of Automatic Speech Recognition. INTERSPEECH 2011: 2581-2584
- 2010
- [c9] Doddipatla Rama Sanand, Ralf Schlüter, Hermann Ney:
Revisiting VTLN using linear transformation on conventional MFCC. INTERSPEECH 2010: 538-541
2000 – 2009
- 2009
- [c8] D. Rama Sanand, Shakti Prasad Rath, Srinivasan Umesh:
Improving the performance of VTLN under mismatched speaker conditions and making it approach that of matched speaker conditions. ICASSP 2009: 4397-4400
- [c7] D. Rama Sanand, Shakti Prasad Rath, Srinivasan Umesh:
A study on the influence of covariance adaptation on jacobian compensation in vocal tract length normalization. INTERSPEECH 2009: 584-587
- [c6] A. N. Harish, D. Rama Sanand, Srinivasan Umesh:
Characterizing speaker variability using spectral envelopes of vowel sounds. INTERSPEECH 2009: 1107-1110
- 2008
- [c5] D. Rama Sanand, Srinivasan Umesh:
Study of jacobian compensation using linear transformation of conventional MFCC for VTLN. INTERSPEECH 2008: 1233-1236
- [c4] P. T. Akhil, Shakti Prasad Rath, Srinivasan Umesh, D. Rama Sanand:
A computationally efficient approach to warp factor estimation in VTLN using EM algorithm and sufficient statistics. INTERSPEECH 2008: 1713-1716
- [c3] D. Rama Sanand, V. Balaji, Rani R. Sandhya, Srinivasan Umesh:
Use of spectral centre of gravity for generating speaker invariant features for automatic speech recognition. INTERSPEECH 2008: 2258-2261
- 2007
- [c2] Srinivasan Umesh, D. Rama Sanand, G. Praveen:
Speaker-Invariant Features for Automatic Speech Recognition. IJCAI 2007: 1738-1743
- [c1] D. Rama Sanand, D. Dinesh Kumar, Srinivasan Umesh:
Linear transformation approach to VTLN using dynamic frequency warping. INTERSPEECH 2007: 1138-1141
last updated on 2025-10-22 03:25 CEST by the dblp team
all metadata released as open data under CC0 1.0 license