Yuki Mitsufuji
Journal Articles
- 2024
- [j9]Ryosuke Sawata, Naoya Takahashi, Stefan Uhlich, Shusuke Takahashi, Yuki Mitsufuji:
The whole is greater than the sum of its parts: improving music source separation by bridging networks. EURASIP J. Audio Speech Music. Process. 2024(1): 39 (2024) - [j8]Stefan Uhlich, Giorgio Fabbro, Masato Hirano, Shusuke Takahashi, Gordon Wichern, Jonathan Le Roux, Dipam Chakraborty, Sharada P. Mohanty, Kai Li, Yi Luo, Jianwei Yu, Rongzhi Gu, Roman A. Solovyev, Alexander L. Stempkovskiy, Tatiana Habruseva, Mikhail Sukhovei, Yuki Mitsufuji:
The Sound Demixing Challenge 2023 - Cinematic Demixing Track. Trans. Int. Soc. Music. Inf. Retr. 7(1): 44-62 (2024) - [j7]Giorgio Fabbro, Stefan Uhlich, Chieh-Hsin Lai, Woosung Choi, Marco A. Martínez Ramírez, Wei-Hsiang Liao, Igor Gadelha, Geraldo Ramos, Eddie Hsu, Hugo Rodrigues, Fabian-Robert Stöter, Alexandre Défossez, Yi Luo, Jianwei Yu, Dipam Chakraborty, Sharada P. Mohanty, Roman A. Solovyev, Alexander L. Stempkovskiy, Tatiana Habruseva, Nabarun Goswami, Tatsuya Harada, Minseok Kim, Jun Hyung Lee, Yuanliang Dong, Xinran Zhang, Jiafeng Liu, Yuki Mitsufuji:
The Sound Demixing Challenge 2023 - Music Demixing Track. Trans. Int. Soc. Music. Inf. Retr. 7(1): 63-84 (2024) - 2022
- [j6]Yuhta Takida, Wei-Hsiang Liao, Chieh-Hsin Lai, Toshimitsu Uesaka, Shusuke Takahashi, Yuki Mitsufuji:
Preventing oversmoothing in VAE via generalized variance parameterization. Neurocomputing 509: 137-156 (2022) - 2021
- [j5]Yuki Mitsufuji, Norihiro Takamune, Shoichi Koyama, Hiroshi Saruwatari:
Multichannel Blind Source Separation Based on Evanescent-Region-Aware Non-Negative Tensor Factorization in Spherical Harmonic Domain. IEEE ACM Trans. Audio Speech Lang. Process. 29: 607-617 (2021) - 2020
- [j4]Yuki Mitsufuji, Stefan Uhlich, Norihiro Takamune, Daichi Kitamura, Shoichi Koyama, Hiroshi Saruwatari:
Multichannel Non-Negative Matrix Factorization Using Banded Spatial Covariance Matrices in Wavenumber Domain. IEEE ACM Trans. Audio Speech Lang. Process. 28: 49-60 (2020) - [j3]Yu Maeno, Yuki Mitsufuji, Prasanga N. Samarasinghe, Naoki Murata, Thushara D. Abhayapala:
Spherical-Harmonic-Domain Feedforward Active Noise Control Using Sparse Decomposition of Reference Signals from Distributed Sensor Arrays. IEEE ACM Trans. Audio Speech Lang. Process. 28: 656-670 (2020) - 2019
- [j2]Fabian-Robert Stöter, Stefan Uhlich, Antoine Liutkus, Yuki Mitsufuji:
Open-Unmix - A Reference Implementation for Music Source Separation. J. Open Source Softw. 4(41): 1667 (2019) - 2014
- [j1]Yuki Mitsufuji, Axel Roebel:
On the use of a spatial cue as prior information for stereo sound source separation based on spatially weighted non-negative tensor factorization. EURASIP J. Adv. Signal Process. 2014: 40 (2014)
Conference and Workshop Papers
- 2023
- [c47]Silin Gao, Beatriz Borges, Soyoung Oh, Deniz Bayazit, Saya Kanno, Hiromi Wakaki, Yuki Mitsufuji, Antoine Bosselut:
PeaCoK: Persona Commonsense Knowledge for Consistent and Engaging Narratives. ACL (1) 2023: 6569-6591 - [c46]Kin Wai Cheuk, Ryosuke Sawata, Toshimitsu Uesaka, Naoki Murata, Naoya Takahashi, Shusuke Takahashi, Dorien Herremans, Yuki Mitsufuji:
Diffroll: Diffusion-Based Generative Music Transcription with Unsupervised Pretraining Capability. ICASSP 2023: 1-5 - [c45]Junghyun Koo, Marco A. Martínez Ramírez, Wei-Hsiang Liao, Stefan Uhlich, Kyogu Lee, Yuki Mitsufuji:
Music Mixing Style Transfer: A Contrastive Learning Approach to Disentangle Audio Effects. ICASSP 2023: 1-5 - [c44]Koichi Saito, Naoki Murata, Toshimitsu Uesaka, Chieh-Hsin Lai, Yuhta Takida, Takao Fukui, Yuki Mitsufuji:
Unsupervised Vocal Dereverberation with Diffusion-Based Generative Models. ICASSP 2023: 1-5 - [c43]Naoya Takahashi, Mayank Kumar Singh, Yuki Mitsufuji:
Hierarchical Diffusion Models for Singing Voice Neural Vocoder. ICASSP 2023: 1-5 - [c42]Zhi Zhong, Masato Hirano, Kazuki Shimada, Kazuya Tateishi, Shusuke Takahashi, Yuki Mitsufuji:
An Attention-Based Approach to Hierarchical Multi-Label Music Instrument Classification. ICASSP 2023: 1-5 - [c41]Hao-Wen Dong, Naoya Takahashi, Yuki Mitsufuji, Julian J. McAuley, Taylor Berg-Kirkpatrick:
CLIPSep: Learning Text-queried Sound Separation with Noisy Unlabeled Videos. ICLR 2023 - [c40]Chieh-Hsin Lai, Yuhta Takida, Naoki Murata, Toshimitsu Uesaka, Yuki Mitsufuji, Stefano Ermon:
FP-Diffusion: Improving Score-based Diffusion Models by Enforcing the Underlying Score Fokker-Planck Equation. ICML 2023: 18365-18398 - [c39]Naoki Murata, Koichi Saito, Chieh-Hsin Lai, Yuhta Takida, Toshimitsu Uesaka, Yuki Mitsufuji, Stefano Ermon:
GibbsDDRM: A Partially Collapsed Gibbs Sampler for Solving Blind Inverse Problems with Denoising Diffusion Restoration. ICML 2023: 25501-25522 - [c38]Ryosuke Sawata, Naoki Murata, Yuhta Takida, Toshimitsu Uesaka, Takashi Shibuya, Shusuke Takahashi, Yuki Mitsufuji:
Diffiner: A Versatile Diffusion-based Generative Refiner for Speech Enhancement. INTERSPEECH 2023: 3824-3828 - [c37]Keisuke Toyama, Taketo Akama, Yukara Ikemiya, Yuhta Takida, Wei-Hsiang Liao, Yuki Mitsufuji:
Automatic Piano Transcription With Hierarchical Frequency-Time Transformer. ISMIR 2023: 215-222 - [c36]Kazuki Shimada, Archontis Politis, Parthasaarathy Sudarsanam, Daniel Aleksander Krause, Kengo Uchida, Sharath Adavanne, Aapo Hakala, Yuichiro Koyama, Naoya Takahashi, Shusuke Takahashi, Tuomas Virtanen, Yuki Mitsufuji:
STARSS23: An Audio-Visual Dataset of Spatial Recordings of Real Scenes with Spatiotemporal Annotations of Sound Events. NeurIPS 2023 - [c35]Zhi Zhong, Hao Shi, Masato Hirano, Kazuki Shimada, Kazuya Tateishi, Takashi Shibuya, Shusuke Takahashi, Yuki Mitsufuji:
Extending Audio Masked Autoencoders toward Audio Restoration. WASPAA 2023: 1-5 - 2022
- [c34]Archontis Politis, Kazuki Shimada, Parthasaarathy Sudarsanam, Sharath Adavanne, Daniel Krause, Yuichiro Koyama, Naoya Takahashi, Shusuke Takahashi, Yuki Mitsufuji, Tuomas Virtanen:
STARSS22: A Dataset of Spatial Recordings of Real Scenes with Spatiotemporal Annotations of Sound Events. DCASE 2022 - [c33]Silin Gao, Jena D. Hwang, Saya Kanno, Hiromi Wakaki, Yuki Mitsufuji, Antoine Bosselut:
ComFact: A Benchmark for Linking Contextual Commonsense Knowledge. EMNLP (Findings) 2022: 1656-1675 - [c32]Naoya Takahashi, Yuki Mitsufuji:
Amicable Examples for Informed Source Separation. ICASSP 2022: 241-245 - [c31]Yuichiro Koyama, Naoki Murata, Stefan Uhlich, Giorgio Fabbro, Shusuke Takahashi, Yuki Mitsufuji:
Music Source Separation With Deep Equilibrium Models. ICASSP 2022: 296-300 - [c30]Kazuki Shimada, Yuichiro Koyama, Shusuke Takahashi, Naoya Takahashi, Emiru Tsunoo, Yuki Mitsufuji:
Multi-ACCDOA: Localizing And Detecting Overlapping Sounds From The Same Class With Auxiliary Duplicating Permutation Invariant Training. ICASSP 2022: 316-320 - [c29]Ricardo Falcón Pérez, Kazuki Shimada, Yuichiro Koyama, Shusuke Takahashi, Yuki Mitsufuji:
Spatial Mixup: Directional Loudness Modification as Data Augmentation for Sound Event Localization and Detection. ICASSP 2022: 431-435 - [c28]Bo-Yu Chen, Wei-Han Hsu, Wei-Hsiang Liao, Marco A. Martínez Ramírez, Yuki Mitsufuji, Yi-Hsuan Yang:
Automatic DJ Transitions with Differentiable Audio Effects and Generative Adversarial Networks. ICASSP 2022: 466-470 - [c27]Naoya Takahashi, Yuki Mitsufuji:
Amicable Examples for Informed Source Separation. ICASSP 2022: 4368-4372 - [c26]Yuichiro Koyama, Kazuhide Shigemi, Masafumi Takahashi, Kazuki Shimada, Naoya Takahashi, Emiru Tsunoo, Shusuke Takahashi, Yuki Mitsufuji:
Spatial Data Augmentation with Simulated Room Impulse Responses for Sound Event Localization and Detection. ICASSP 2022: 8872-8876 - [c25]Yuhta Takida, Takashi Shibuya, Wei-Hsiang Liao, Chieh-Hsin Lai, Junki Ohmura, Toshimitsu Uesaka, Naoki Murata, Shusuke Takahashi, Toshiyuki Kumakura, Yuki Mitsufuji:
SQ-VAE: Variational Bayes on Discrete Representation with Self-annealed Stochastic Quantization. ICML 2022: 20987-21012 - [c24]Johannes Imort, Giorgio Fabbro, Marco A. Martínez Ramírez, Stefan Uhlich, Yuichiro Koyama, Yuki Mitsufuji:
Distortion Audio Effects: Learning How to Recover the Clean Signal. ISMIR 2022: 218-225 - [c23]Marco A. Martínez Ramírez, Wei-Hsiang Liao, Chihiro Nagashima, Giorgio Fabbro, Stefan Uhlich, Yuki Mitsufuji:
Automatic music mixing with deep learning and out-of-domain data. ISMIR 2022: 411-418 - 2021
- [c22]Yasuhide Hyodo, Chihiro Sugai, Junya Suzuki, Masafumi Takahashi, Masahiko Koizumi, Asako Tomura, Yuki Mitsufuji, Yota Komoriya:
Psychophysiological Effect of Immersive Spatial Audio Experience Enhanced Using Sound Field Synthesis. ACII 2021: 1-8 - [c21]Naoya Takahashi, Yuki Mitsufuji:
Densely Connected Multi-Dilated Convolutional Networks for Dense Prediction Tasks. CVPR 2021: 993-1002 - [c20]Ryosuke Sawata, Stefan Uhlich, Shusuke Takahashi, Yuki Mitsufuji:
All For One And One For All: Improving Music Separation By Bridging Networks. ICASSP 2021: 51-55 - [c19]Naoya Takahashi, Shota Inoue, Yuki Mitsufuji:
Adversarial Attacks on Audio Source Separation. ICASSP 2021: 521-525 - [c18]Kazuki Shimada, Yuichiro Koyama, Naoya Takahashi, Shusuke Takahashi, Yuki Mitsufuji:
Accdoa: Activity-Coupled Cartesian Direction of Arrival Representation for Sound Event Localization And Detection. ICASSP 2021: 915-919 - [c17]Naoya Takahashi, Mayank Kumar Singh, Yuki Mitsufuji:
Hierarchical disentangled representation learning for singing voice conversion. IJCNN 2021: 1-7 - 2020
- [c16]Naoya Takahashi, Mayank Kumar Singh, Sakya Basak, Sudarsanam Parthasaarathy, Sriram Ganapathy, Yuki Mitsufuji:
Improving Voice Separation by Incorporating End-To-End Speech Recognition. ICASSP 2020: 41-45 - [c15]Yu Maeno, Yuhta Takida, Naoki Murata, Yuki Mitsufuji:
Array-Geometry-Aware Spatial Active Noise Control Based on Direction-of-Arrival Weighting. ICASSP 2020: 8414-8418 - 2019
- [c14]Naoki Murata, Jihui Zhang, Yu Maeno, Yuki Mitsufuji:
Global and Local Mode-domain Adaptive Algorithms for Spatial Active Noise Control Using Higher-order Sources. ICASSP 2019: 526-530 - [c13]Naoya Takahashi, Sudarsanam Parthasaarathy, Nabarun Goswami, Yuki Mitsufuji:
Recursive Speech Separation for Unknown Number of Speakers. INTERSPEECH 2019: 1348-1352 - 2018
- [c12]Yu Maeno, Yuki Mitsufuji, Thushara D. Abhayapala:
Mode Domain Spatial Active Noise Control Using Sparse Signal Representation. ICASSP 2018: 211-215 - [c11]Naoya Takahashi, Purvi Agrawal, Nabarun Goswami, Yuki Mitsufuji:
PhaseNet: Discretized Phase Modeling with Deep Neural Networks for Audio Source Separation. INTERSPEECH 2018: 2713-2717 - [c10]Naoya Takahashi, Nabarun Goswami, Yuki Mitsufuji:
Mmdenselstm: An Efficient Combination of Convolutional and Recurrent Neural Networks for Audio Source Separation. IWAENC 2018: 106-110 - [c9]Yu Maeno, Yuki Mitsufuji, Prasanga N. Samarasinghe, Thushara D. Abhayapala:
Mode-Domain Spatial Active Noise Control Using Multiple Circular Arrays. IWAENC 2018: 441-445 - 2017
- [c8]Keiichi Osako, Yuki Mitsufuji, Rita Singh, Bhiksha Raj:
Supervised monaural source separation based on autoencoders. ICASSP 2017: 11-15 - [c7]Stefan Uhlich, Marcello Porcu, Franck Giron, Michael Enenkl, Thomas Kemp, Naoya Takahashi, Yuki Mitsufuji:
Improving music source separation based on deep neural networks through data augmentation and network blending. ICASSP 2017: 261-265 - [c6]Naoya Takahashi, Yuki Mitsufuji:
Multi-Scale multi-band densenets for audio source separation. WASPAA 2017: 21-25 - 2016
- [c5]Yuki Mitsufuji, Shoichi Koyama, Hiroshi Saruwatari:
Multichannel blind source separation based on non-negative tensor factorization in wavenumber domain. ICASSP 2016: 56-60 - 2015
- [c4]Xin Guo, Stefan Uhlich, Yuki Mitsufuji:
NMF-based blind source separation using a linear predictive coding error clustering criterion. ICASSP 2015: 261-265 - [c3]Stefan Uhlich, Franck Giron, Yuki Mitsufuji:
Deep neural network based instrument extraction from music. ICASSP 2015: 2135-2139 - 2014
- [c2]Yuki Mitsufuji, Marco Liuni, Alex Baker, Axel Roebel:
Online Non-negative Tensor Deconvolution for source detection in 3DTV audio. ICASSP 2014: 3082-3086 - 2013
- [c1]Yuki Mitsufuji, Axel Roebel:
Sound source separation based on non-negative tensor factorization incorporating spatial cue as prior knowledge. ICASSP 2013: 71-75
Data and Artifacts
- 2023
- [d4]Archontis Politis, Kazuki Shimada, Parthasaarathy Sudarsanam, Aapo Hakala, Shusuke Takahashi, Daniel Aleksander Krause, Naoya Takahashi, Sharath Adavanne, Yuichiro Koyama, Kengo Uchida, Yuki Mitsufuji, Tuomas Virtanen:
STARSS23: Sony-TAu Realistic Spatial Soundscapes 2023. Zenodo, 2023 - [d3]Archontis Politis, Kazuki Shimada, Parthasaarathy Sudarsanam, Aapo Hakala, Shusuke Takahashi, Daniel Aleksander Krause, Naoya Takahashi, Sharath Adavanne, Yuichiro Koyama, Kengo Uchida, Yuki Mitsufuji, Tuomas Virtanen:
STARSS23: Sony-TAu Realistic Spatial Soundscapes 2023. Zenodo, 2023 - 2022
- [d2]Archontis Politis, Yuki Mitsufuji, Parthasaarathy Sudarsanam, Kazuki Shimada, Sharath Adavanne, Yuichiro Koyama, Daniel Krause, Naoya Takahashi, Shusuke Takahashi, Tuomas Virtanen:
STARSS22: Sony-TAu Realistic Spatial Soundscapes 2022 dataset. Zenodo, 2022 - [d1]Archontis Politis, Yuki Mitsufuji, Parthasaarathy Sudarsanam, Kazuki Shimada, Sharath Adavanne, Yuichiro Koyama, Daniel Aleksander Krause, Naoya Takahashi, Shusuke Takahashi, Tuomas Virtanen:
STARSS22: Sony-TAu Realistic Spatial Soundscapes 2022 dataset. Zenodo, 2022
Informal and Other Publications
- 2024
- [i79]Yuhta Takida, Yukara Ikemiya, Takashi Shibuya, Kazuki Shimada, Woosung Choi, Chieh-Hsin Lai, Naoki Murata, Toshimitsu Uesaka, Kengo Uchida, Wei-Hsiang Liao, Yuki Mitsufuji:
HQ-VAE: Hierarchical Discrete Representation Learning with Variational Bayes. CoRR abs/2401.00365 (2024) - [i78]Yixiao Zhang, Yukara Ikemiya, Gus Xia, Naoki Murata, Marco A. Martínez Ramírez, Wei-Hsiang Liao, Yuki Mitsufuji, Simon Dixon:
MusicMagus: Zero-Shot Text-to-Music Editing via Diffusion Models. CoRR abs/2402.06178 (2024) - [i77]Silin Gao, Mete Ismayilzada, Mengjie Zhao, Hiromi Wakaki, Yuki Mitsufuji, Antoine Bosselut:
DiffuCOMET: Contextual Commonsense Knowledge Diffusion. CoRR abs/2402.17011 (2024) - [i76]Hao Hao Tan, Kin Wai Cheuk, Taemin Cho, Wei-Hsiang Liao, Yuki Mitsufuji:
MR-MT3: Memory Retaining Multi-Track Music Transcription to Mitigate Instrument Leakage. CoRR abs/2403.10024 (2024) - [i75]Yutong He, Alexander Robey, Naoki Murata, Yiding Jiang, Joshua Williams, George J. Pappas, Hamed Hassani, Yuki Mitsufuji, Ruslan Salakhutdinov, J. Zico Kolter:
Automated Black-box Prompt Engineering for Personalized Text-to-Image Generation. CoRR abs/2403.19103 (2024) - [i74]Toshimitsu Uesaka, Taiji Suzuki, Yuhta Takida, Chieh-Hsin Lai, Naoki Murata, Yuki Mitsufuji:
Understanding Multimodal Contrastive Learning Through Pointwise Mutual Information. CoRR abs/2404.19228 (2024) - [i73]Shiqi Yang, Zhi Zhong, Mengjie Zhao, Shusuke Takahashi, Masato Ishii, Takashi Shibuya, Yuki Mitsufuji:
Visual Echoes: A Simple Unified Transformer for Audio-Visual Generation. CoRR abs/2405.14598 (2024) - [i72]Dongjun Kim, Chieh-Hsin Lai, Wei-Hsiang Liao, Yuhta Takida, Naoki Murata, Toshimitsu Uesaka, Yuki Mitsufuji, Stefano Ermon:
PaGoDA: Progressive Growing of a One-Step Generator from a Low-Resolution Diffusion Teacher. CoRR abs/2405.14822 (2024) - [i71]Junyoung Seo, Kazumi Fukuda, Takashi Shibuya, Takuya Narihira, Naoki Murata, Shoukang Hu, Chieh-Hsin Lai, Seungryong Kim, Yuki Mitsufuji:
GenWarp: Single Image to Novel Views with Semantic-Preserving Generative Warping. CoRR abs/2405.17251 (2024) - [i70]Akio Hayakawa, Masato Ishii, Takashi Shibuya, Yuki Mitsufuji:
Discriminator-Guided Cooperative Diffusion for Joint Audio and Video Generation. CoRR abs/2405.17842 (2024) - [i69]Yixiao Zhang, Yukara Ikemiya, Woosung Choi, Naoki Murata, Marco A. Martínez Ramírez, Liwei Lin, Gus Xia, Wei-Hsiang Liao, Yuki Mitsufuji, Simon Dixon:
Instruct-MusicGen: Unlocking Text-to-Music Editing for Music Language Models via Instruction Tuning. CoRR abs/2405.18386 (2024) - [i68]Koichi Saito, Dongjun Kim, Takashi Shibuya, Chieh-Hsin Lai, Zhi Zhong, Yuhta Takida, Yuki Mitsufuji:
SoundCTM: Uniting Score-based and Consistency Models for Text-to-Sound Generation. CoRR abs/2405.18503 (2024) - [i67]Sungho Lee, Marco A. Martínez Ramírez, Wei-Hsiang Liao, Stefan Uhlich, Giorgio Fabbro, Kyogu Lee, Yuki Mitsufuji:
Searching For Music Mixing Graphs: A Pruning Approach. CoRR abs/2406.01049 (2024) - [i66]Kengo Uchida, Takashi Shibuya, Yuhta Takida, Naoki Murata, Shusuke Takahashi, Yuki Mitsufuji:
MoLA: Motion Generation and Editing with Latent Diffusion Enhanced by Adversarial Training. CoRR abs/2406.01867 (2024) - [i65]Mayank Kumar Singh, Naoya Takahashi, Wei-Hsiang Liao, Yuki Mitsufuji:
SilentCipher: Deep Audio Watermarking. CoRR abs/2406.03822 (2024) - [i64]Hiromi Wakaki, Yuki Mitsufuji, Yoshinori Maeda, Yukiko Nishimura, Silin Gao, Mengjie Zhao, Keiichi Yamada, Antoine Bosselut:
ComperDial: Commonsense Persona-grounded Dialogue Dataset and Benchmark. CoRR abs/2406.11228 (2024) - [i63]Yu-Hua Chen, Woosung Choi, Wei-Hsiang Liao, Marco Martínez-Ramírez, Kin Wai Cheuk, Yuki Mitsufuji, Jyh-Shing Roger Jang, Yi-Hsuan Yang:
Improving Unsupervised Clean-to-Rendered Guitar Tone Transformation Using GANs and Integrated Unaligned Clean Data. CoRR abs/2406.15751 (2024) - [i62]Marco Comunità, Zhi Zhong, Akira Takahashi, Shiqi Yang, Mengjie Zhao, Koichi Saito, Yukara Ikemiya, Takashi Shibuya, Shusuke Takahashi, Yuki Mitsufuji:
SpecMaskGIT: Masked Generative Modeling of Audio Spectrograms for Efficient Audio Synthesis and Beyond. CoRR abs/2406.17672 (2024) - 2023
- [i61]Naoki Murata, Koichi Saito, Chieh-Hsin Lai, Yuhta Takida, Toshimitsu Uesaka, Yuki Mitsufuji, Stefano Ermon:
GibbsDDRM: A Partially Collapsed Gibbs Sampler for Solving Blind Inverse Problems with Denoising Diffusion Restoration. CoRR abs/2301.12686 (2023) - [i60]Yuhta Takida, Masaaki Imaizumi, Chieh-Hsin Lai, Toshimitsu Uesaka, Naoki Murata, Yuki Mitsufuji:
Adversarially Slicing Generative Networks: Discriminator Slices Feature for One-Dimensional Optimal Transport. CoRR abs/2301.12811 (2023) - [i59]Zhi Zhong, Masato Hirano, Kazuki Shimada, Kazuya Tateishi, Shusuke Takahashi, Yuki Mitsufuji:
An Attention-based Approach to Hierarchical Multi-label Music Instrument Classification. CoRR abs/2302.08136 (2023) - [i58]Naoya Takahashi, Mayank Kumar Singh, Yuki Mitsufuji:
Cross-modal Face- and Voice-style Transfer. CoRR abs/2302.13838 (2023) - [i57]Silin Gao, Beatriz Borges, Soyoung Oh, Deniz Bayazit, Saya Kanno, Hiromi Wakaki, Yuki Mitsufuji, Antoine Bosselut:
PeaCoK: Persona Commonsense Knowledge for Consistent and Engaging Narratives. CoRR abs/2305.02364 (2023) - [i56]Masato Hirano, Kazuki Shimada, Yuichiro Koyama, Shusuke Takahashi, Yuki Mitsufuji:
Diffusion-based Signal Refiner for Speech Separation. CoRR abs/2305.05857 (2023) - [i55]Zhi Zhong, Hao Shi, Masato Hirano, Kazuki Shimada, Kazuya Tateishi, Takashi Shibuya, Shusuke Takahashi, Yuki Mitsufuji:
Extending Audio Masked Autoencoders Toward Audio Restoration. CoRR abs/2305.06701 (2023) - [i54]Ryosuke Sawata, Naoya Takahashi, Stefan Uhlich, Shusuke Takahashi, Yuki Mitsufuji:
The Whole Is Greater than the Sum of Its Parts: Improving DNN-based Music Source Separation. CoRR abs/2305.07855 (2023) - [i53]Hao Shi, Kazuki Shimada, Masato Hirano, Takashi Shibuya, Yuichiro Koyama, Zhi Zhong, Shusuke Takahashi, Tatsuya Kawahara, Yuki Mitsufuji:
Diffusion-Based Speech Enhancement with Joint Generative and Predictive Decoders. CoRR abs/2305.10734 (2023) - [i52]Chieh-Hsin Lai, Yuhta Takida, Toshimitsu Uesaka, Naoki Murata, Yuki Mitsufuji, Stefano Ermon:
On the Equivalence of Consistency-Type Models: Consistency Models, Consistent Diffusion Models, and Fokker-Planck Regularization. CoRR abs/2306.00367 (2023) - [i51]Kazuki Shimada, Archontis Politis, Parthasaarathy Sudarsanam, Daniel Krause, Kengo Uchida, Sharath Adavanne, Aapo Hakala, Yuichiro Koyama, Naoya Takahashi, Shusuke Takahashi, Tuomas Virtanen, Yuki Mitsufuji:
STARSS23: An Audio-Visual Dataset of Spatial Recordings of Real Scenes with Spatiotemporal Annotations of Sound Events. CoRR abs/2306.09126 (2023) - [i50]Keisuke Toyama, Taketo Akama, Yukara Ikemiya, Yuhta Takida, Wei-Hsiang Liao, Yuki Mitsufuji:
Automatic Piano Transcription with Hierarchical Frequency-Time Transformer. CoRR abs/2307.04305 (2023) - [i49]Giorgio Fabbro, Stefan Uhlich, Chieh-Hsin Lai, Woosung Choi, Marco A. Martínez Ramírez, Wei-Hsiang Liao, Igor Gadelha, Geraldo Ramos, Eddie Hsu, Hugo Rodrigues, Fabian-Robert Stöter, Alexandre Défossez, Yi Luo, Jianwei Yu, Dipam Chakraborty, Sharada P. Mohanty, Roman A. Solovyev, Alexander L. Stempkovskiy, Tatiana Habruseva, Nabarun Goswami, Tatsuya Harada, Minseok Kim, Jun Hyung Lee, Yuanliang Dong, Xinran Zhang, Jiafeng Liu, Yuki Mitsufuji:
The Sound Demixing Challenge 2023 - Music Demixing Track. CoRR abs/2308.06979 (2023) - [i48]Stefan Uhlich, Giorgio Fabbro, Masato Hirano, Shusuke Takahashi, Gordon Wichern, Jonathan Le Roux, Dipam Chakraborty, Sharada P. Mohanty, Kai Li, Yi Luo, Jianwei Yu, Rongzhi Gu, Roman A. Solovyev, Alexander L. Stempkovskiy, Tatiana Habruseva, Mikhail Sukhovei, Yuki Mitsufuji:
The Sound Demixing Challenge 2023 - Cinematic Demixing Track. CoRR abs/2308.06981 (2023) - [i47]Eleonora Grassucci, Yuki Mitsufuji, Ping Zhang, Danilo Comminiello:
Enhancing Semantic Communication with Deep Generative Models - An ICASSP Special Session Overview. CoRR abs/2309.02478 (2023) - [i46]Takashi Shibuya, Yuhta Takida, Yuki Mitsufuji:
BigVSAN: Enhancing GAN-based Neural Vocoders with Slicing Adversarial Network. CoRR abs/2309.02836 (2023) - [i45]Carlos Hernandez-Olivan, Koichi Saito, Naoki Murata, Chieh-Hsin Lai, Marco A. Martínez Ramírez, Wei-Hsiang Liao, Yuki Mitsufuji:
VRDMG: Vocal Restoration via Diffusion Posterior Sampling with Multiple Guidance. CoRR abs/2309.06934 (2023) - [i44]Kazuki Shimada, Kengo Uchida, Yuichiro Koyama, Takashi Shibuya, Shusuke Takahashi, Yuki Mitsufuji, Tatsuya Kawahara:
Zero- and Few-shot Sound Event Localization and Detection. CoRR abs/2309.09223 (2023) - [i43]Frank Cwitkowitz, Kin Wai Cheuk, Woosung Choi, Marco A. Martínez Ramírez, Keisuke Toyama, Wei-Hsiang Liao, Yuki Mitsufuji:
Timbre-Trap: A Low-Resource Framework for Instrument-Agnostic Music Transcription. CoRR abs/2309.15717 (2023) - [i42]Qiyu Wu, Mengjie Zhao, Yutong He, Lang Huang, Junya Ono, Hiromi Wakaki, Yuki Mitsufuji:
Towards reporting bias in visual-language datasets: bimodal augmentation by decoupling object-attribute association. CoRR abs/2310.01330 (2023) - [i41]Dongjun Kim, Chieh-Hsin Lai, Wei-Hsiang Liao, Naoki Murata, Yuhta Takida, Toshimitsu Uesaka, Yutong He, Yuki Mitsufuji, Stefano Ermon:
Consistency Trajectory Models: Learning Probability Flow ODE Trajectory of Diffusion. CoRR abs/2310.02279 (2023) - [i40]Mengjie Zhao, Junya Ono, Zhi Zhong, Chieh-Hsin Lai, Yuhta Takida, Naoki Murata, Wei-Hsiang Liao, Takashi Shibuya, Hiromi Wakaki, Yuki Mitsufuji:
On the Language Encoder of Contrastive Cross-modal Models. CoRR abs/2310.13267 (2023) - [i39]Yutong He, Naoki Murata, Chieh-Hsin Lai, Yuhta Takida, Toshimitsu Uesaka, Dongjun Kim, Wei-Hsiang Liao, Yuki Mitsufuji, J. Zico Kolter, Ruslan Salakhutdinov, Stefano Ermon:
Manifold Preserving Guided Diffusion. CoRR abs/2311.16424 (2023) - 2022
- [i38]Johannes Imort, Giorgio Fabbro, Marco A. Martínez Ramírez, Stefan Uhlich, Yuichiro Koyama, Yuki Mitsufuji:
Removing Distortion Effects in Music Using Deep Neural Networks. CoRR abs/2202.01664 (2022) - [i37]Yuhta Takida, Takashi Shibuya, Wei-Hsiang Liao, Chieh-Hsin Lai, Junki Ohmura, Toshimitsu Uesaka, Naoki Murata, Shusuke Takahashi, Toshiyuki Kumakura, Yuki Mitsufuji:
SQ-VAE: Variational Bayes on Discrete Representation with Self-annealed Stochastic Quantization. CoRR abs/2205.07547 (2022) - [i36]Archontis Politis, Kazuki Shimada, Parthasaarathy Sudarsanam, Sharath Adavanne, Daniel Krause, Yuichiro Koyama, Naoya Takahashi, Shusuke Takahashi, Yuki Mitsufuji, Tuomas Virtanen:
STARSS22: A dataset of spatial recordings of real scenes with spatiotemporal annotations of sound events. CoRR abs/2206.01948 (2022) - [i35]Marco A. Martínez Ramírez, Wei-Hsiang Liao, Giorgio Fabbro, Stefan Uhlich, Chihiro Nagashima, Yuki Mitsufuji:
Automatic music mixing with deep learning and out-of-domain data. CoRR abs/2208.11428 (2022) - [i34]Chieh-Hsin Lai, Yuhta Takida, Naoki Murata, Toshimitsu Uesaka, Yuki Mitsufuji, Stefano Ermon:
Regularizing Score-based Models with Score Fokker-Planck Equations. CoRR abs/2210.04296 (2022) - [i33]Kin Wai Cheuk, Ryosuke Sawata, Toshimitsu Uesaka, Naoki Murata, Naoya Takahashi, Shusuke Takahashi, Dorien Herremans, Yuki Mitsufuji:
DiffRoll: Diffusion-based Generative Music Transcription with Unsupervised Pretraining Capability. CoRR abs/2210.05148 (2022) - [i32]Naoya Takahashi, Mayank Kumar Singh, Yuki Mitsufuji:
Hierarchical Diffusion Models for Singing Voice Neural Vocoder. CoRR abs/2210.07508 (2022) - [i31]Naoya Takahashi, Mayank Kumar Singh, Yuki Mitsufuji:
Robust One-Shot Singing Voice Conversion. CoRR abs/2210.11096 (2022) - [i30]Silin Gao, Jena D. Hwang, Saya Kanno, Hiromi Wakaki, Yuki Mitsufuji, Antoine Bosselut:
ComFact: A Benchmark for Linking Contextual Commonsense Knowledge. CoRR abs/2210.12678 (2022) - [i29]Ryosuke Sawata, Naoki Murata, Yuhta Takida, Toshimitsu Uesaka, Takashi Shibuya, Shusuke Takahashi, Yuki Mitsufuji:
A Versatile Diffusion-based Generative Refiner for Speech Enhancement. CoRR abs/2210.17287 (2022) - [i28]Junghyun Koo, Marco A. Martínez Ramírez, Wei-Hsiang Liao, Stefan Uhlich, Kyogu Lee, Yuki Mitsufuji:
Music Mixing Style Transfer: A Contrastive Learning Approach to Disentangle Audio Effects. CoRR abs/2211.02247 (2022) - [i27]Koichi Saito, Naoki Murata, Toshimitsu Uesaka, Chieh-Hsin Lai, Yuhta Takida, Takao Fukui, Yuki Mitsufuji:
Unsupervised vocal dereverberation with diffusion-based generative models. CoRR abs/2211.04124 (2022) - [i26]Hao-Wen Dong, Naoya Takahashi, Yuki Mitsufuji, Julian J. McAuley, Taylor Berg-Kirkpatrick:
CLIPSep: Learning Text-queried Sound Separation with Noisy Unlabeled Videos. CoRR abs/2212.07065 (2022) - 2021
- [i25]Naoya Takahashi, Mayank Kumar Singh, Yuki Mitsufuji:
Hierarchical disentangled representation learning for singing voice conversion. CoRR abs/2101.06842 (2021) - [i24]Yuhta Takida, Wei-Hsiang Liao, Toshimitsu Uesaka, Shusuke Takahashi, Yuki Mitsufuji:
Preventing Posterior Collapse Induced by Oversmoothing in Gaussian VAE. CoRR abs/2102.08663 (2021) - [i23]Koichi Saito, Stefan Uhlich, Giorgio Fabbro, Yuki Mitsufuji:
Training Speech Enhancement Systems with Noisy Speech Datasets. CoRR abs/2105.12315 (2021) - [i22]Kazuki Shimada, Naoya Takahashi, Yuichiro Koyama, Shusuke Takahashi, Emiru Tsunoo, Masafumi Takahashi, Yuki Mitsufuji:
Ensemble of ACCDOA- and EINV2-based Systems with D3Nets and Impulse Response Simulation for Sound Event Localization and Detection. CoRR abs/2106.10806 (2021) - [i21]Yuki Mitsufuji, Giorgio Fabbro, Stefan Uhlich, Fabian-Robert Stöter:
Music Demixing Challenge at ISMIR 2021. CoRR abs/2108.13559 (2021) - [i20]Naoya Takahashi, Mayank Kumar Singh, Yuki Mitsufuji:
Source Mixing and Separation Robust Audio Steganography. CoRR abs/2110.05054 (2021) - [i19]Naoya Takahashi, Yuki Mitsufuji:
Amicable examples for informed source separation. CoRR abs/2110.05059 (2021) - [i18]Ricardo Falcón Pérez, Kazuki Shimada, Yuichiro Koyama, Shusuke Takahashi, Yuki Mitsufuji:
Spatial mixup: Directional loudness modification as data augmentation for sound event localization and detection. CoRR abs/2110.06126 (2021) - [i17]Yuichiro Koyama, Naoki Murata, Stefan Uhlich, Giorgio Fabbro, Shusuke Takahashi, Yuki Mitsufuji:
Music Source Separation with Deep Equilibrium Models. CoRR abs/2110.06494 (2021) - [i16]Yuichiro Koyama, Kazuhide Shigemi, Masafumi Takahashi, Kazuki Shimada, Naoya Takahashi, Emiru Tsunoo, Shusuke Takahashi, Yuki Mitsufuji:
Spatial Data Augmentation with Simulated Room Impulse Responses for Sound Event Localization and Detection. CoRR abs/2110.06501 (2021) - [i15]Bo-Yu Chen, Wei-Han Hsu, Wei-Hsiang Liao, Marco A. Martínez Ramírez, Yuki Mitsufuji, Yi-Hsuan Yang:
Automatic DJ Transitions with Differentiable Audio Effects and Generative Adversarial Networks. CoRR abs/2110.06525 (2021) - [i14]Kazuki Shimada, Yuichiro Koyama, Shusuke Takahashi, Naoya Takahashi, Emiru Tsunoo, Yuki Mitsufuji:
Multi-ACCDOA: Localizing and Detecting Overlapping Sounds from the Same Class with Auxiliary Duplicating Permutation Invariant Training. CoRR abs/2110.07124 (2021) - 2020
- [i13]Kazuki Shimada, Naoya Takahashi, Shusuke Takahashi, Yuki Mitsufuji:
Sound Event Localization and Detection Using Activity-Coupled Cartesian DOA Vector and RD3net. CoRR abs/2006.12014 (2020) - [i12]Naoya Takahashi, Yuki Mitsufuji:
D3Net: Densely connected multidilated DenseNet for music source separation. CoRR abs/2010.01733 (2020) - [i11]Naoya Takahashi, Shota Inoue, Yuki Mitsufuji:
Adversarial attacks on audio source separation. CoRR abs/2010.03164 (2020) - [i10]Ryosuke Sawata, Stefan Uhlich, Shusuke Takahashi, Yuki Mitsufuji:
All for One and One for All: Improving Music Separation by Bridging Networks. CoRR abs/2010.04228 (2020) - [i9]Kazuki Shimada, Yuichiro Koyama, Naoya Takahashi, Shusuke Takahashi, Yuki Mitsufuji:
ACCDOA: Activity-Coupled Cartesian Direction of Arrival Representation for Sound Event Localization and Detection. CoRR abs/2010.15306 (2020) - [i8]Naoya Takahashi, Yuki Mitsufuji:
Densely connected multidilated convolutional networks for dense prediction tasks. CoRR abs/2011.11844 (2020) - 2019
- [i7]Naoya Takahashi, Sudarsanam Parthasaarathy, Nabarun Goswami, Yuki Mitsufuji:
Recursive speech separation for unknown number of speakers. CoRR abs/1904.03065 (2019) - [i6]Cyril Cadoux, Stefan Uhlich, Marc Ferras, Yuki Mitsufuji:
Closing the Training/Inference Gap for Deep Attractor Networks. CoRR abs/1911.02091 (2019) - [i5]Naoya Takahashi, Mayank Kumar Singh, Sakya Basak, Sudarsanam Parthasaarathy, Sriram Ganapathy, Yuki Mitsufuji:
Improving Voice Separation by Incorporating End-to-end Speech Recognition. CoRR abs/1911.12928 (2019) - 2018
- [i4]Yu Maeno, Yuki Mitsufuji, Thushara D. Abhayapala:
Mode Domain Spatial Active Noise Control Using Sparse Signal Representation. CoRR abs/1803.00187 (2018) - [i3]Naoya Takahashi, Nabarun Goswami, Yuki Mitsufuji:
MMDenseLSTM: An efficient combination of convolutional and recurrent neural networks for audio source separation. CoRR abs/1805.02410 (2018) - [i2]Joachim Muth, Stefan Uhlich, Nathanaël Perraudin, Thomas Kemp, Fabien Cardinaux, Yuki Mitsufuji:
Improving DNN-based Music Source Separation using Phase Features. CoRR abs/1807.02710 (2018) - 2017
- [i1]Naoya Takahashi, Yuki Mitsufuji:
Multi-scale Multi-band DenseNets for Audio Source Separation. CoRR abs/1706.09588 (2017)
last updated on 2024-07-30 23:43 CEST by the dblp team
all metadata released as open data under CC0 1.0 license