default search action
Naoya Takahashi
- > Home > Persons > Naoya Takahashi
Publications
- 2024
- [j8]Ryosuke Sawata, Naoya Takahashi, Stefan Uhlich, Shusuke Takahashi, Yuki Mitsufuji:
The whole is greater than the sum of its parts: improving music source separation by bridging networks. EURASIP J. Audio Speech Music. Process. 2024(1): 39 (2024) - [i32]Mayank Kumar Singh, Naoya Takahashi, Wei-Hsiang Liao, Yuki Mitsufuji:
SilentCipher: Deep Audio Watermarking. CoRR abs/2406.03822 (2024) - 2023
- [c34]Kin Wai Cheuk, Ryosuke Sawata, Toshimitsu Uesaka, Naoki Murata, Naoya Takahashi, Shusuke Takahashi, Dorien Herremans, Yuki Mitsufuji:
Diffroll: Diffusion-Based Generative Music Transcription with Unsupervised Pretraining Capability. ICASSP 2023: 1-5 - [c33]Nirmesh Shah, Mayank Kumar Singh, Naoya Takahashi, Naoyuki Onoe:
Nonparallel Emotional Voice Conversion for Unseen Speaker-Emotion Pairs Using Dual Domain Adversarial Network & Virtual Domain Pairing. ICASSP 2023: 1-5 - [c32]Naoya Takahashi, Mayank Kumar Singh, Yuki Mitsufuji:
Hierarchical Diffusion Models for Singing Voice Neural Vocoder. ICASSP 2023: 1-5 - [c30]Hao-Wen Dong, Naoya Takahashi, Yuki Mitsufuji, Julian J. McAuley, Taylor Berg-Kirkpatrick:
CLIPSep: Learning Text-queried Sound Separation with Noisy Unlabeled Videos. ICLR 2023 - [c28]Mayank Kumar Singh, Naoya Takahashi, Naoyuki Onoe:
Iteratively Improving Speech Recognition and Voice Conversion. INTERSPEECH 2023: 206-210 - [c27]Kazuki Shimada, Archontis Politis, Parthasaarathy Sudarsanam, Daniel Aleksander Krause, Kengo Uchida, Sharath Adavanne, Aapo Hakala, Yuichiro Koyama, Naoya Takahashi, Shusuke Takahashi, Tuomas Virtanen, Yuki Mitsufuji:
STARSS23: An Audio-Visual Dataset of Spatial Recordings of Real Scenes with Spatiotemporal Annotations of Sound Events. NeurIPS 2023 - [d4]Archontis Politis, Kazuki Shimada, Parthasaarathy Sudarsanam, Aapo Hakala, Shusuke Takahashi, Daniel Aleksander Krause, Naoya Takahashi, Sharath Adavanne, Yuichiro Koyama, Kengo Uchida, Yuki Mitsufuji, Tuomas Virtanen:
STARSS23: Sony-TAu Realistic Spatial Soundscapes 2023. Version 1.0.0. Zenodo, 2023 [all versions] - [d3]Archontis Politis, Kazuki Shimada, Parthasaarathy Sudarsanam, Aapo Hakala, Shusuke Takahashi, Daniel Aleksander Krause, Naoya Takahashi, Sharath Adavanne, Yuichiro Koyama, Kengo Uchida, Yuki Mitsufuji, Tuomas Virtanen:
STARSS23: Sony-TAu Realistic Spatial Soundscapes 2023. Version 1.1.0. Zenodo, 2023 [all versions] - [i31]Nirmesh Shah, Mayank Kumar Singh, Naoya Takahashi, Naoyuki Onoe:
Nonparallel Emotional Voice Conversion For Unseen Speaker-Emotion Pairs Using Dual Domain Adversarial Network & Virtual Domain Pairing. CoRR abs/2302.10536 (2023) - [i30]Naoya Takahashi, Mayank Kumar Singh, Yuki Mitsufuji:
Cross-modal Face- and Voice-style Transfer. CoRR abs/2302.13838 (2023) - [i29]Ryosuke Sawata, Naoya Takahashi, Stefan Uhlich, Shusuke Takahashi, Yuki Mitsufuji:
The Whole Is Greater than the Sum of Its Parts: Improving DNN-based Music Source Separation. CoRR abs/2305.07855 (2023) - [i28]Mayank Kumar Singh, Naoya Takahashi, Naoyuki Onoe:
Iteratively Improving Speech Recognition and Voice Conversion. CoRR abs/2305.15055 (2023) - [i27]Kazuki Shimada, Archontis Politis, Parthasaarathy Sudarsanam, Daniel Krause, Kengo Uchida, Sharath Adavanne, Aapo Hakala, Yuichiro Koyama, Naoya Takahashi, Shusuke Takahashi, Tuomas Virtanen, Yuki Mitsufuji:
STARSS23: An Audio-Visual Dataset of Spatial Recordings of Real Scenes with Spatiotemporal Annotations of Sound Events. CoRR abs/2306.09126 (2023) - 2022
- [c26]Archontis Politis, Kazuki Shimada, Parthasaarathy Sudarsanam, Sharath Adavanne, Daniel Krause, Yuichiro Koyama, Naoya Takahashi, Shusuke Takahashi, Yuki Mitsufuji, Tuomas Virtanen:
STARSS22: A Dataset of Spatial Recordings of Real Scenes with Spatiotemporal Annotations of Sound Events. DCASE 2022 - [c25]Naoya Takahashi, Yuki Mitsufuji:
Amicable Examples for Informed Source Separation. ICASSP 2022: 241-245 - [c24]Kazuki Shimada, Yuichiro Koyama, Shusuke Takahashi, Naoya Takahashi, Emiru Tsunoo, Yuki Mitsufuji:
Multi-ACCDOA: Localizing And Detecting Overlapping Sounds From The Same Class With Auxiliary Duplicating Permutation Invariant Training. ICASSP 2022: 316-320 - [c23]Naoya Takahashi, Yuki Mitsufuji:
Amicable Examples for Informed Source Separation. ICASSP 2022: 4368-4372 - [c22]Yuichiro Koyama, Kazuhide Shigemi, Masafumi Takahashi, Kazuki Shimada, Naoya Takahashi, Emiru Tsunoo, Shusuke Takahashi, Yuki Mitsufuji:
Spatial Data Augmentation with Simulated Room Impulse Responses for Sound Event Localization and Detection. ICASSP 2022: 8872-8876 - [c21]Shrutina Agarwal, Naoya Takahashi, Sriram Ganapathy:
Leveraging Symmetrical Convolutional Transformer Networks for Speech to Singing Voice Style Transfer. INTERSPEECH 2022: 3013-3017 - [d2]Adavanne Politis, Yuki Mitsufuji, Parthasaarathy Sudarsanam, Kazuki Shimada, Sharath Adavanne, Yuichiro Koyama, Daniel Krause, Naoya Takahashi, Shusuke Takahashi, Tuomas Virtanen:
STARSS22: Sony-TAu Realistic Spatial Soundscapes 2022 dataset. Version 1.0.0. Zenodo, 2022 [all versions] - [d1]Archontis Politis, Yuki Mitsufuji, Parthasaarathy Sudarsanam, Kazuki Shimada, Sharath Adavanne, Yuichiro Koyama, Daniel Aleksander Krause, Naoya Takahashi, Shusuke Takahashi, Tuomas Virtanen:
STARSS22: Sony-TAu Realistic Spatial Soundscapes 2022 dataset. Version 1.1.0. Zenodo, 2022 [all versions] - [i26]Archontis Politis, Kazuki Shimada, Parthasaarathy Sudarsanam, Sharath Adavanne, Daniel Krause, Yuichiro Koyama, Naoya Takahashi, Shusuke Takahashi, Yuki Mitsufuji, Tuomas Virtanen:
STARSS22: A dataset of spatial recordings of real scenes with spatiotemporal annotations of sound events. CoRR abs/2206.01948 (2022) - [i25]Shrutina Agarwal, Sriram Ganapathy, Naoya Takahashi:
Leveraging Symmetrical Convolutional Transformer Networks for Speech to Singing Voice Style Transfer. CoRR abs/2208.12410 (2022) - [i24]Kin Wai Cheuk, Ryosuke Sawata, Toshimitsu Uesaka, Naoki Murata, Naoya Takahashi, Shusuke Takahashi, Dorien Herremans, Yuki Mitsufuji:
DiffRoll: Diffusion-based Generative Music Transcription with Unsupervised Pretraining Capability. CoRR abs/2210.05148 (2022) - [i23]Naoya Takahashi, Mayank Kumar Singh, Yuki Mitsufuji:
Hierarchical Diffusion Models for Singing Voice Neural Vocoder. CoRR abs/2210.07508 (2022) - [i22]Naoya Takahashi, Mayank Kumar Singh, Yuki Mitsufuji:
Robust One-Shot Singing Voice Conversion. CoRR abs/2210.11096 (2022) - [i21]Hao-Wen Dong, Naoya Takahashi, Yuki Mitsufuji, Julian J. McAuley, Taylor Berg-Kirkpatrick:
CLIPSep: Learning Text-queried Sound Separation with Noisy Unlabeled Videos. CoRR abs/2212.07065 (2022) - 2021
- [c20]Naoya Takahashi, Yuki Mitsufuji:
Densely Connected Multi-Dilated Convolutional Networks for Dense Prediction Tasks. CVPR 2021: 993-1002 - [c19]Sakya Basak, Shrutina Agarwal, Sriram Ganapathy, Naoya Takahashi:
End-to-End Lyrics Recognition with Voice to Singing Style Transfer. ICASSP 2021: 266-270 - [c18]Naoya Takahashi, Shota Inoue, Yuki Mitsufuji:
Adversarial Attacks on Audio Source Separation. ICASSP 2021: 521-525 - [c17]Kazuki Shimada, Yuichiro Koyama, Naoya Takahashi, Shusuke Takahashi, Yuki Mitsufuji:
Accdoa: Activity-Coupled Cartesian Direction of Arrival Representation for Sound Event Localization And Detection. ICASSP 2021: 915-919 - [c16]Naoya Takahashi, Mayank Kumar Singh, Yuki Mitsufuji:
Hierarchical disentangled representation learning for singing voice conversion. IJCNN 2021: 1-7 - [i20]Naoya Takahashi, Mayank Kumar Singh, Yuki Mitsufuji:
Hierarchical disentangled representation learning for singing voice conversion. CoRR abs/2101.06842 (2021) - [i19]Sakya Basak, Shrutina Agarwal, Sriram Ganapathy, Naoya Takahashi:
End-to-end lyrics Recognition with Voice to Singing Style Transfer. CoRR abs/2102.08575 (2021) - [i18]Kazuki Shimada, Naoya Takahashi, Yuichiro Koyama, Shusuke Takahashi, Emiru Tsunoo, Masafumi Takahashi, Yuki Mitsufuji:
Ensemble of ACCDOA- and EINV2-based Systems with D3Nets and Impulse Response Simulation for Sound Event Localization and Detection. CoRR abs/2106.10806 (2021) - [i17]Naoya Takahashi, Mayank Kumar Singh, Yuki Mitsufuji:
Source Mixing and Separation Robust Audio Steganography. CoRR abs/2110.05054 (2021) - [i16]Naoya Takahashi, Yuki Mitsufuji:
Amicable examples for informed source separation. CoRR abs/2110.05059 (2021) - [i15]Yuichiro Koyama, Kazuhide Shigemi, Masafumi Takahashi, Kazuki Shimada, Naoya Takahashi, Emiru Tsunoo, Shusuke Takahashi, Yuki Mitsufuji:
Spatial Data Augmentation with Simulated Room Impulse Responses for Sound Event Localization and Detection. CoRR abs/2110.06501 (2021) - [i14]Kazuki Shimada, Yuichiro Koyama, Shusuke Takahashi, Naoya Takahashi, Emiru Tsunoo, Yuki Mitsufuji:
Multi-ACCDOA: Localizing and Detecting Overlapping Sounds from the Same Class with Auxiliary Duplicating Permutation Invariant Training. CoRR abs/2110.07124 (2021) - 2020
- [c14]Naoya Takahashi, Mayank Kumar Singh, Sakya Basak, Sudarsanam Parthasaarathy, Sriram Ganapathy, Yuki Mitsufuji:
Improving Voice Separation by Incorporating End-To-End Speech Recognition. ICASSP 2020: 41-45 - [i13]Kazuki Shimada, Naoya Takahashi, Shusuke Takahashi, Yuki Mitsufuji:
Sound Event Localization and Detection Using Activity-Coupled Cartesian DOA Vector and RD3net. CoRR abs/2006.12014 (2020) - [i12]Naoya Takahashi, Yuki Mitsufuji:
D3Net: Densely connected multidilated DenseNet for music source separation. CoRR abs/2010.01733 (2020) - [i11]Naoya Takahashi, Shota Inoue, Yuki Mitsufuji:
Adversarial attacks on audio source separation. CoRR abs/2010.03164 (2020) - [i10]Kazuki Shimada, Yuichiro Koyama, Naoya Takahashi, Shusuke Takahashi, Yuki Mitsufuji:
ACCDOA: Activity-Coupled Cartesian Direction of Arrival Representation for Sound Event Localization and Detection. CoRR abs/2010.15306 (2020) - [i9]Naoya Takahashi, Yuki Mitsufuji:
Densely connected multidilated convolutional networks for dense prediction tasks. CoRR abs/2011.11844 (2020) - 2019
- [c10]Naoya Takahashi, Sudarsanam Parthasaarathy, Nabarun Goswami, Yuki Mitsufuji:
Recursive Speech Separation for Unknown Number of Speakers. INTERSPEECH 2019: 1348-1352 - [i8]Naoya Takahashi, Sudarsanam Parthasaarathy, Nabarun Goswami, Yuki Mitsufuji:
Recursive speech separation for unknown number of speakers. CoRR abs/1904.03065 (2019) - [i7]Naoya Takahashi, Mayank Kumar Singh, Sakya Basak, Sudarsanam Parthasaarathy, Sriram Ganapathy, Yuki Mitsufuji:
Improving Voice Separation by Incorporating End-to-end Speech Recognition. CoRR abs/1911.12928 (2019) - 2018
- [c9]Naoya Takahashi, Purvi Agrawal, Nabarun Goswami, Yuki Mitsufuji:
PhaseNet: Discretized Phase Modeling with Deep Neural Networks for Audio Source Separation. INTERSPEECH 2018: 2713-2717 - [c8]Naoya Takahashi, Nabarun Goswami, Yuki Mitsufuji:
Mmdenselstm: An Efficient Combination of Convolutional and Recurrent Neural Networks for Audio Source Separation. IWAENC 2018: 106-110 - [i6]Naoya Takahashi, Nabarun Goswami, Yuki Mitsufuji:
MMDenseLSTM: An efficient combination of convolutional and recurrent neural networks for audio source separation. CoRR abs/1805.02410 (2018) - 2017
- [c7]Stefan Uhlich, Marcello Porcu, Franck Giron, Michael Enenkl, Thomas Kemp, Naoya Takahashi, Yuki Mitsufuji:
Improving music source separation based on deep neural networks through data augmentation and network blending. ICASSP 2017: 261-265 - [c5]Naoya Takahashi, Yuki Mitsufuji:
Multi-Scale multi-band densenets for audio source separation. WASPAA 2017: 21-25 - [i4]Naoya Takahashi, Yuki Mitsufuji:
Multi-scale Multi-band DenseNets for Audio Source Separation. CoRR abs/1706.09588 (2017) - 2016
- [i1]Naoya Takahashi, Mitsuharu Matsumoto, Shuji Hashimoto:
Noise reduction combining microphone and piezoelectric device. CoRR abs/1611.03178 (2016) - 2007
- [c1]Naoya Takahashi, Mitsuharu Matsumoto, Shuji Hashimoto:
Electric Koto by vibrating Body. ICMC 2007
manage site settings
To protect your privacy, all features that rely on external API calls from your browser are turned off by default. You need to opt-in for them to become active. All settings here will be stored as cookies with your web browser. For more information see our F.A.Q.
Unpaywalled article links
Add open access links from to the list of external document links (if available).
Privacy notice: By enabling the option above, your browser will contact the API of unpaywall.org to load hyperlinks to open access articles. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Unpaywall privacy policy.
Archived links via Wayback Machine
For web page which are no longer available, try to retrieve content from the of the Internet Archive (if available).
Privacy notice: By enabling the option above, your browser will contact the API of archive.org to check for archived content of web pages that are no longer available. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Internet Archive privacy policy.
Reference lists
Add a list of references from , , and to record detail pages.
load references from crossref.org and opencitations.net
Privacy notice: By enabling the option above, your browser will contact the APIs of crossref.org, opencitations.net, and semanticscholar.org to load article reference information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Crossref privacy policy and the OpenCitations privacy policy, as well as the AI2 Privacy Policy covering Semantic Scholar.
Citation data
Add a list of citing articles from and to record detail pages.
load citations from opencitations.net
Privacy notice: By enabling the option above, your browser will contact the API of opencitations.net and semanticscholar.org to load citation information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the OpenCitations privacy policy as well as the AI2 Privacy Policy covering Semantic Scholar.
OpenAlex data
Load additional information about publications from .
Privacy notice: By enabling the option above, your browser will contact the API of openalex.org to load additional information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the information given by OpenAlex.
last updated on 2024-08-18 00:35 CEST by the dblp team
all metadata released as open data under CC0 1.0 license
see also: Terms of Use | Privacy Policy | Imprint