APSIPA 2023: Taipei, Taiwan
- Asia Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2023, Taipei, Taiwan, October 31 - November 3, 2023. IEEE 2023, ISBN 979-8-3503-0067-3
- Yi-Li Tseng, Yi-Ling Chien, Tse-Min Chuang, Yen-Nan Chiu, Wen-Che Tsai:
Cognitive Assessment of Autism Spectrum Disorder Using an EEG-based Social Interaction Platform. i-vi - Guanxiong Pei, Bingjie Li, Taihao Li, Cunhang Fan, Chao Zhang, Zhao Lv:
Mixed Emotion Recognition Based on EEG Signals. 1-7 - Yoshiki Niimura, Jun Takemoto, Atsuhiko Kai, Seiichi Nakagawa:
Attention-based CNN and Relative Phase Feature Modeling for Improved Imagined Speech Recognition. 8-14 - Jumpei Aizawa, Masaki Ogura, Masanori Shimono, Naoki Wakamiya:
Manipulation of Neuronal Network Firing Patterns using Temporal Deep Unfolding-based MPC. 15-21 - Wakako Nakamura:
Goodness of Fit to the Convolution Model of fMRI Data and Determination of the Regularization Parameter. 22-26 - Shinya Matsumoto, Kan Okubo, Takuya Abe, Kiyoshi Nishikawa:
Detection model of sister chromatid cohesion defects based on Vision Transformer. 27-31 - Xiaomei Zou, Shiyong Hu, Taihao Li:
GRALA: modeling social information for microblog sentiment analysis from the view of balancing sparsity and smoothness of social contexts. 32-37 - Yufei Jiang, Thi-Nga Ho, Eng Siong Chng:
Adopting Neural Translation Model in Data Generation for Inverse Text Normalization. 38-45 - Zechen Liang, Qiaosong Fan, Yuan-Gen Wang:
Mismatched Semi-supervised Learning with Feature Similarity Consistency. 46-50 - Jen-Tzung Chien, Chien-Ching Chen:
Collaborative Pseudo Labeling for Prompt-Based Learning. 51-56 - Jen-Tzung Chien, Ming-Yen Chen, Jing-Hao Xue:
Learning Meta Soft Prompt for Few-Shot Language Models. 57-62 - A. Jayakrishnan, Venkatesan Meenakshi Sundaram, Prabhavathy Panneer, Alkha Mohan:
MSDF-Net: A Multi-Scale Deep Fusion Network with Dilated Convolutions for Cloud Removal from Sentinel-2 Imagery. 63-70 - Xiangqing Zhang, Yan Feng, Shun Zhang, Yuning Wang:
Instance Implant-Aided Non-uniformly Cropping for Person Detection in Aerial Images. 71-78 - Yi Pei Ng, Chi Wee Tan, Yiqi Tew:
A Futuristic Proposed Solution to a Global Challenge: The Promise of AI for Social Distancing Compliance. 79-83 - Pei-Sze Tan, Sailaja Rajanala, Arghya Pal, Raphaël C.-W. Phan, Huey Fang Ong:
Unbiased Decision-Making Framework in Long-Video Macro & Micro-Expression Spotting. 84-89 - Yongxiong Xiao, Shiqiang Zhu, Te Li, Minhong Wan, Wei Song, Jason Gu, Qiang Fu:
Adaptive Beamforming Based on Interference-Plus-Noise Covariance Matrix Reconstruction for Speech Separation. 90-95 - Hang Chen, Jun Du, Zhe Wang, Chenxi Wang, Yuling Ren, Qinglong Li, Ruibo Liu, Chin-Hui Lee:
Correlated Multi-Level Speech Enhancement for Robust Real-World ASR Applications Using Mask-Waveform-Feature Optimization. 96-101 - Haodong Zhou, Tao Li, Jie Wang, Lin Li, Qingyang Hong:
CASA-Net: Cross-attention and Self-attention for End-to-End Audio-visual Speaker Diarization. 102-106 - Aoqi Guo, Junnan Wu, Peng Gao, Wenbo Zhu, Qinwen Guo, Dazhi Gao, Yujun Wang:
Enhanced Neural Beamformer with Spatial Information for Target Speech Extraction. 107-113 - Bang Zeng, Hongbin Suo, Yulong Wan, Ming Li:
Low-complexity Multi-Channel Speaker Extraction with Pure Speech Cues. 114-118 - Jin Li, Man-Wai Mak, Nan Yan, Lan Wang:
Modeling Suprasegmental Information Using Finite Difference Network for End-to-End Speaker Verification. 119-124 - Priyanka Gupta, Piyushkumar K. Chodingala, Hemant A. Patil:
Relevance of Quadrature Phase For Replay Detection in Voice Assistants (VAs). 125-130 - Baveet Singh Hora, Krishna Parmar, Shrey Machhar, Hemant A. Patil, Kiran Praveen, Balaji Radhakrishnan:
Exploring Residual Cepstral Features for Spoken Language Identification. 131-138 - WooSeok Ko, Seyun Um, Zhenyu Piao, Hong-Goo Kang:
Consideration of Varying Training Lengths for Short-Duration Speaker Verification. 139-144 - Ritu Srivastava, Saiteja Kosgi, Sarath Sivaprasad, Neha Sahipjohn, Vineet Gandhi:
Adversarial Robustness of Mel Based Speaker Recognition Systems. 145-150 - Daichi Kamakura, Eita Nakamura, Takehisa Oyama, Kazuyoshi Yoshii:
Joint Drum Transcription and Metrical Analysis Based on Periodicity-Aware Multi-Task Learning. 151-157 - Daichi Kamakura, Eita Nakamura, Kazuyoshi Yoshii:
CTC2: End-to-End Drum Transcription Based on Connectionist Temporal Classification With Constant Tempo Constraint. 158-164 - Tsung-Ping Chen, Li Su, Kazuyoshi Yoshii:
Learning Multifaceted Self-Similarity for Musical Structure Analysis. 165-172 - Hideki Kawahara, Kohei Yatabe, Ken-Ichi Sakakibara, Mitsunori Mizumachi, Tatsuya Kitamura:
Simultaneous Measurement of Multiple Acoustic Attributes Using Structured Periodic Test Signals Including Music and Other Sound Materials. 173-180 - Jian-Jia Huang, Shih-Chieh Chang, Cheng-Hsu Cheng, Timothy Wan, Yu-Cheng Pei:
Gait Analysis in Powered Exoskeleton-Assisted Walking in Patients with Stroke: A Case Series Cohort. 187-194 - Toshiyuki Nakanishi, Koichi Fujiwara, Kazuya Sobue:
Prediction Model of Postoperative Pain Exacerbation Using a Wearable Electrocardiogram Sensor. 195-198 - Yuma Kanaizuka, Takahiro Manabe, Jian-Jia Huang, Jen-Wen Hung, Yumie Ono:
Directional Neural Connectivity during Robot Mirror Therapy in Patients with Stroke. 199-205 - Noriaki Kanayama, Makoto Miyakoshi, Maro G. Machizawa:
Evaluation of neural response recorded using scalp EEG in virtual reality environment. 206-210 - Shi-Zong Huang, Ching-Te Chiu, Yu-Jen Chang:
Machine Learning Based Action Recognition with Modular CNN. 211-216 - Ting-Jui Wong, Pei-Yun Tsai:
Real-Time Processing for Weighted Pulse Decomposition of Photoplethysmography Signals Based on Interior Point Method in Wearable Devices for Hemodynamic State. 217-221 - Chen-Yen Lin, Kuang-Hao Stanley Liu:
QoS-Aware Downlink Beamforming for Joint Transmission in Multi-Cell Networks. 222-229 - Yi-Mei Li, Jung-Chun Chi, Yuan-Hao Huang:
Deep-Learning-Based Lattice Reduction Preprocessing for Time-Correlated MIMO Systems. 230-237 - Fang-Ching Chen, Yi-Wen Liu:
Utilizing Unlabeled Data and Synthetic Data for Bird Sound Detection: Consistency Training, Mean Teacher, and Domain Adaptation Techniques. 238-242 - Muhammad H. Hyanda, Nur Ahmadi, Peter H. Charlton, Timothy G. Constandinou, Ayu Purwarianti, Trio Adiono:
A Comparative Evaluation of Video Codecs for rPPG-based Heart Rate Estimation. 243-247 - Listi Restu Triani, Nur Ahmadi, Trio Adiono:
Human Activity Recognition Based on FMCW Radar Using CNN and Transfer Learning. 248-253 - A. Sumarudin, Nana Sutisna, Infall Syafalni, Bambang Riyanto Trilaksono, Trio Adiono:
DQN Algorithm Design for Fast Efficient Shortest Path System. 254-260 - Jayandi Panggabean, Nana Sutisna, Infall Syafalni, Trio Adiono:
Comparison of MPPT based on Deep Reinforcement Learning by DQN, DDPG and TD3. 261-266 - Muhammad Dzaky Prihatmoko, Nur Ahmadi, Peter H. Charlton, Trio Adiono:
Signal Quality Assessment for Wearable Multichannel Photoplethysmography Signals. 267-271 - Van-Hieu Nguyen, Gia-Thien Luu, Thien Van Luong, Mai Xuan Trang, Philippe Ravier, Olivier Buttelli:
After-Fatigue Condition: A Novel Analysis Based on Surface EMG Signals. 272-277 - Do Hai Son, Karim Abed-Meraim, Tran Trong Duy, Nguyen Linh Trung, Tran Thi Thuy Quynh:
On the Semi-Blind Mutually Referenced Equalizers for MIMO Systems. 278-283 - Viet-Duc Le, Thi-Lich Nghiem, Thi-Lan Le:
Accurate continuous action and gesture recognition method based on skeleton and sliding windows techniques. 284-290 - Toan Gian, Tien-Hoa Nguyen, Trung Tan Nguyen, Van-Cuong Pham, Thien Van Luong:
Transformer-Based Deep Learning Detector for Dual-Mode Index Modulation 3D-OFDM. 291-296 - Trung-Hieu Le, Thai-Khanh Nguyen, Trung-Kien Tran, Thanh-Hai Tran, Cuong Pham:
GAFormer: Wearable IMU-Based Human Activity Recognition with Gramian Angular Field and Transformer. 297-303 - Eswar Adapa, Anish C. Turlapaty, Surya Naidu:
Fatigue Classification and Onset estimation using Surface EMG Signals during Strength Training. 304-310 - Yuki Yamamoto, Sou Nobukawa, Nobuhiko Wagatsuma, Keiichiro Inagaki:
P300 Event-Related Potential in Perception of Multiple Traffic Objects During Vehicle Driving. 311-316 - Akira Tamamori:
Kernel Random Projection Depth for Outlier Detection. 317-324 - Kaito Katayama, Koichi Fujiwara, Kazuki Yamamoto:
Soft-Sensor Construction Method Based on Adaptive Modeling and Transfer Learning for Manufacturing Process Including Maintenance Periods. 325-328 - Shang Li Yuen, Phooi Yee Lau, Chin Wee Wong, Muhammad Hafiz Samsuri, Zarina Hussin, Nur Afiqah Kamarudin, Muhammad Syukri Mohd Talib, Hock Woon Hon:
Detecting Wire Bonding Defects in Point Clouds on Self-Generated Dataset. 329-335 - Lily Tyszczuk, Liat Levita, Jaime Delgadillo, Haihong Zhang, Mahnaz Arvaneh:
Predicting Outcomes of Cognitive Behavioral Therapy for Depression Using Data Driven Approaches. 336-343 - Chun-Yi He, Jen-Tzung Chien:
Learning Adapters for Code-Switching Speech Recognition. 344-349 - Sandipan Dhar, MD. Tousin Akhter, Padmanabha Banerjee, Nanda Dulal Jana, Swagatam Das:
FID-RPRGAN-VC: Fréchet Inception Distance Loss based Region-wise Position Normalized Relativistic GAN for Non-Parallel Voice Conversion. 350-356 - Sin-Hong Lee, Chih-Hung Kuo, Tsai-Chun Yu:
Deformable Aligned Fusion for Video Super Resolution. 357-364 - Kuan-Hua Huang, Li-Wei Kang:
Learning Single Image Rain Streak Removal Based on Deep Attention Mechanism. 365-372 - Yi-Kai Liao, Gong-Si Lin, Mei-Chen Yeh:
A Transformer-Based Framework for Tiny Object Detection. 373-377 - Duan-Yu Chen, Yu-Hsuan Chen:
Lightweight Models Distillation with Learnable Teaching Material: An Application for Smart Table Tennis System. 378-383 - Yu-Jen Chen, Po-Chyi Su:
Selecting Suitable Data Input for Deep-Learning Sign-Language Recognition with a Small Dataset. 384-391 - Yu-Jen Wei, Jia-Hong Chen, Tien-Ying Kuo:
Analysis of the Interaction Effect on Pruning and Transfer Learning in Model Training. 392-395 - Chih-Hao Wang, Yu-Jen Wei, Ching Hsiang Chang, Tien-Ying Kuo:
Old Damaged Photo Recovery with Style Transfer-Based Data Augmentation. 396-400 - Kun-Lin Tsai, Yan-Hao Chen, Choa-Ting Huang, Guo-Wei Huang, Shih-Ting Tseng:
A Deep Learning based Sustainable Energy Scheduling System. 401-407 - Yuan Zhao, Hanmin Sheng, Jinliang Shao:
A Computational Efficient Direct Position Determination Approach of Narrow-band Emitter. 408-413 - Wenxiang Dong, H. Vicky Zhao:
Modeling and Analysis of the Epidemic-Behavior Co-evolution Dynamics with User Irrationality. 414-421 - Dongjie Zhu, Weibin Zhu, Tianrui Wang, Yingying Gao, Junlan Feng, Shilei Zhang:
Noise-robust Pitch Detection Based on Super-Resolution Harmonics. 422-426 - Yingke Zhao, Wen Zhang, Jingdong Chen:
A Subband Approach to Personal Sound Zone with Joint Optimization of Sound Pressure and Particle Velocity. 427-431 - Cuiping Cheng, Pengcheng Yue, Taihao Li:
An Multi-evidence Fusion Based on C-Distance with Uncertain Reasoning for Classification. 432-437 - Na Li, Linbo Shang, Zhichao Zhang:
On Uncertainty Principles for Lowband Graph Signals. 438-442 - Linghan Li, Heliu Chen, Renjie He, Yuchao Dai, Mingyi He:
CoA-DLinkNet: Connectivity-Enhanced Dual-Branch Road Extraction Network Based on D-LinkNet. 443-449 - Gejian Zhao, Chuan Qin:
Black-box Lossless Fragile Watermarking Based on Hidden Space Search for DNN Integrity Authentication. 450-455 - Xiaoyi Zhou, Shuai Lee:
Hiding patient information in medical images: A high-capacity and reversible hiding algorithm for E-healthcare. 456-461 - Jiarong Mao, Yuting An, Xiaoyi Zhou:
A Visually Meaningful Image Encryption Algorithm with Attention Mechanism and Artificial Bee Colony Optimization. 462-467 - Yujie Yin, Heng Yin, Zhaoxia Yin, Wanli Lyu, Sha Wei:
High-Quality Triggers Based Fragile Watermarking for Optical Character Recognition Model. 468-475 - Mengjie Qin, Zheyuan Lin, Minhong Wan, Chunlong Zhang, Jason Gu, Te Li:
Coupled Transformed Induced Tensor Nuclear Norm for Robust Tensor Completion. 476-483 - Meijiang Fu, Yixiao Zheng, Dongliang Chang, Wenpan Li, Zhanyu Ma:
Multi-Frequency Feature Enhancement for Multi-Granularity Visual Classification. 484-489 - Tongtao Ling, Lei Chen, Chen Liao, Shilei Huang, Zhipeng Yu, Yi Liu:
Improving Aspect Sentiment Classification via Retrieving from Training Data. 490-497 - Yu-Ping Ruan, Shu-Kai Zheng, Jiantao Huang, Xiaoning Zhang, Yulong Liu, Taihao Li:
CH-MEAD: A Chinese Multimodal Conversational Emotion Analysis Dataset with Fine-Grained Emotion Taxonomy. 498-505 - Eita Nakamura, Yasuyuki Saito:
Evolutionary Analysis and Cultural Transmission Models of Color Style Distributions in Painting Arts. 506-513 - Keiji Yamadera, Michiharu Niimi:
Ultimatelink Between Characters Having a Certain Meaning in Physical Space to URL in Cyberspace with Robust Print and Scan. 514-518 - Takuya Nagatoshi, Michiharu Niimi:
Human Flow Measurement System Using Floor Estimation of Depth Images for Low-End IoT Devices. 519-522 - Seongjean Kim, Seonghwa Choi, Sanghoon Lee:
Holo-QoI: A Human Factor-Based Dataset and Prediction Framework for Assessing Quality of Interaction in Augmented Reality. 523-528 - Hiroshi Higashi:
Supervised Single-channel EEG Decomposition using Detector-kernel Networks for Noise Reduction. 529-534 - Xinyu Wang, Mingtao Li, Hao Li, Sio-Hang Pun, Fei Chen:
Cross-Subject Classification of Spoken Mandarin Vowels and Tones with EEG Signals: A Study of End-to-End CNN with Fine-Tuning. 535-539 - Tung Dang, Kentaro Ono, Takafumi Sasaoka, Shigeto Yamawaki, Maro G. Machizawa:
Decoding time-course of saliency network of fMRI signals by EEG signals using optimized forward variable selection: a concurrent EEG-fMRI study. 540-545 - Mitali Ahuja, Shuji Komeiji, Takumi Mitsuhashi, Yasushi Iimura, Hiroharu Suzuki, Hidenori Sugano, Koichi Shinoda, Toshihisa Tanaka:
Multimodal recognition of speech and electrocorticogram. 546-550 - Chia-Yi Lin, Jun-Cheng Chen, Ja-Ling Wu:
Enhancing Real-Time Semantic Segmentation with Textual Knowledge of Pre-Trained Vision-Language Model: A Lightweight Approach. 551-558 - Zichen Song, Hiroshi Higashi, Shin Ishii:
EEG study on anticipation of difficulty for upcoming auditory task. 559-566 - Ingon Chanpornpakdi, Toshihisa Tanaka:
Event-Related Potential in Rapid Serial Visual Presentation-based Partial Face Cognition Depends on Visible Face Components. 567-574 - Sattaya Singkul, Theerat Sakdejayont, Tawunrat Chalothorn:
Residual, Mixer, and Attention: The Three-way Combination for Streaming Wake Word Detection Framework. 575-582 - Tengyu Deng, Eita Nakamura, Kazuyoshi Yoshii:
Audio-to-Score Singing Transcription Based on Joint Estimation of Pitches, Onsets, and Metrical Positions With Tatum-Level CTC Loss. 583-590 - Li-Jen Chang, Yu-Cheng Liao, Chia-Hui Lin, Shih-Fang Yang-Mao, Hwann-Tzong Chen:
Mask2Hand: Learning to Predict the 3D Hand Pose and Shape from Shadow. 591-598 - Daichi Nakaya, Shoko Imaizumi:
A Reversible Image Processing Method for Color Tone Control Using Data Hiding. 599-604 - Fatma Shalabi, Huy H. Nguyen, Hichem Felouat, Ching-Chun Chang, Isao Echizen:
Image-Text Out-Of-Context Detection Using Synthetic Multimodal Misinformation. 605-612 - Chinatsu Tanaka, Minoru Kuribayashi, Nobuo Funabiki:
Gait Recognition Scheme Focusing on Operating Characteristics at Feature Points Detected by OpenPose. 613-619 - Reon Akai, Minoru Kuribayashi, Nobuo Funabiki:
A Study on Eliminating Biased Node in Federated Learning. 620-627 - Koi Yee Ng, Simying Ong, Yuen Peng Loh:
Can StArtGAN withstand Image Processing Attacks? 628-634 - Chang Wang, Jun Du, Hang Chen, Ruoyu Wang, Chao-Han Huck Yang, Jiangjiang Zhao, Yuling Ren, Qinglong Li, Chin-Hui Lee:
Enhancing Privacy Preservation with Quantum Computing for Word-Level Audio-Visual Speech Recognition. 635-642 - Irina Lebedeva, Mohamed Jaward Bah, Taihao Li:
Interpretable Image Recognition in Hyperbolic Space. 643-650 - Hui Sze Liew, Yuen Peng Loh, Simying Ong:
Low-light is More Than Darkness: An Empirical Study on Illumination Types and Enhancement Methods. 651-658 - Sophia Chulif, Sue Han Lee, Yang Loong Chang, Mark Tee Kit Tsun, Kok Chin Chai, Yi Lung Then:
MoMo Strategy: Learn More from More Mistakes. 659-665 - Hamza Ahmed Ishrat, Abel Yu Hao Chai, Sue Han Lee, Patrick Hang Hui Then:
Unveiling Robust Feature Spaces: Image vs. Embedding-Oriented Approaches for Plant Disease Identification. 666-673 - Jia Xuan Gue, Chun Yong Chong, Mei Kuan Lim:
Facial Expression Recognition as markers of Depression. 674-680 - Sophia Chulif, Sue Han Lee, Yang Loong Chang, Mark Tee Kit Tsun, Kok Chin Chai, Yi Lung Then:
How Transferable are Herbarium-Field Features in Few-Shot Plant Identification with Triplet Loss? 681-687 - Hochang Rhee, Nam Ik Cho:
Resolution-Adaptive Lossless Image Compression Using Frequency Decomposition Network. 688-695 - Chaewon Lee, Won-Dong Jang, Chang-Su Kim:
Implementation and Analysis on Backpropagating Refinement Scheme for Interactive Image Segmentation. 696-702 - Jihoo Lee, Je-Won Kang:
Implicit Neural Representation for Video Coding Through Progressive Feature Extraction. 703-708 - Thuy Thi Pham, Truong Thanh Nhat Mai, Chul Lee:
Deep Unfolded Underwater Image Enhancement Based on Extreme Channels Prior. 709-713 - Young-Min Jeong, Tae-Sung Park, Jeong-Hyeok Park, Jong-Ok Kim:
Low-Light Image Enhancement via Distillation of NIR-to-RGB Conversion Knowledge. 714-718 - Wen-Nung Lie, Veasna Vann:
3D Human Skeleton Estimation from Single RGB Image Based on Fusion of Predicted Depths from Multiple Virtual-Viewpoints. 719-725 - Tzu-Hsuan Lin, Chia-Hsiang Lin, Si-Sheng Young:
GNN-Based Small-Data Learning with Area-Control Mechanism for Hyperspectral Satellite Change Detection. 726-732 - Egor Poliakov, Wei-Jie Hung, Ching-Chun Huang:
Efficient Constraint-Aware Neural Architecture Search for Object Detection. 733-740 - Ngoc-Quan Ha-Phan, Thanh-Nguyen Truong, Vu-Hoang Tran, Ching-Chun Huang:
A Reliable Feature-Based Framework for Vehicle Tracking in Advanced Driver Assistance Systems. 741-747 - Jie-Fan Chang, Kuan-Ting Lai, Cheng-Xuan Zhuang, Guo-Shiang Lin, Ku-Yaw Chang:
Light-weight Zero-Reference-based Image Enhancement for Low-Light Images. 748-752 - Cheng-Yu Lu, Heng-Cheng Hsu, Chen-Kuo Chiang:
Classwise Self-Paced Self-Training for Semi-Supervised Image Classification. 753-758 - Mahamat Moussa, Chern Hong Lim, KokSheik Wong:
CapFormer: A Space-Time Video Description Model using Joint-Attention Transformer. 759-764 - Kohei Hayashi, Yoshihiro Maeda, Norishige Fukushima:
Local Contrast Enhancement with Multiscale Filtering. 765-770 - Reina Kaneko, Yuya Sato, Takumi Ueda, Hiroshi Higashi, Yuichi Tanaka:
Marine Snow Removal Benchmarking Dataset. 771-778 - Kazuki Naganuma, Shunsuke Ono:
Cross-Frame Foreground Structural Similarity Modeling by Convolutional Sparse Representation. 779-783 - Ryunosuke Eguchi, Iori Kobayashi, Shunsuke Ono, Ryo Matsuoka:
JPEG Artifact Removal for Hyperspectral Images Based on Spatial-Spectral Regularization. 784-787 - Hsuan Lin, Keigo Hirakawa:
Data Driven Multiband Image Fusion That Preserves Wavelength-Specific Image Features. 788-794 - Seishi Takamura:
Shot-Noise-Aware Image Signal Restoration for Photoelectronic Charge-Based Sensors. 795-799 - Quang Nhat Tran, Shih-Hsuan Yang:
Generative Adversarial Network-Based Frame Interpolation with Multi-Perspective Discrimination. 800-805 - Hrishav Bakul Barua, Ganesh Krishnasamy, KokSheik Wong, Kalin Stefanov, Abhinav Dhall:
ArtHDR-Net: Perceptually Realistic and Accurate HDR Content Creation. 806-812 - Wei Wang, Xuejing Lei, Yueru Chen, Ming-Sui Lee, C.-C. Jay Kuo:
LSR++: An Efficient and Tiny Model for Image Super-Resolution. 813-819 - Yuki Shimamura, Michiharu Niimi:
High-Quality Font Generation Based on StyleGAN2 and FSFont Font Generation Model. 820-825 - Yunming Yang, Masaaki Ikehara:
Enhanced Residual Fourier Transformation Network for Lightweight Image Super-resolution. 826-832 - Ai-Fang Chai, Sailaja Rajanala, Arghya Pal, Raphaël C.-W. Phan, Chee-Ming Ting:
ELEGANT: End-to-end Language Grounded Speech Denoiser for Efficient Generation of Talking Face. 833-838 - Limhourlaurent Meam, Suradej Duangpummet, Waree Kongprawechnon:
Segmentation Enhancement for Iris Recognition Using Unit Gradient Vectors. 839-845 - Tsun-Hin Cheung, Kin-Man Lam:
FactLLaMA: Optimizing Instruction-Following Language Models with External Knowledge for Automated Fact-Checking. 846-853 - Toshio Irino, Shintaro Doan:
Auditory Representation Effective for Estimating Vocal Tract Information. 854-861 - Hikaru Yamaga, Kazuki Hatakeyama, Kazunori Kojima, Shi-wook Lee, Yoshiaki Itoh:
Accurate and Practical Query-by-Example Using Multiple Deep Learning Models and Frame Compression Methods. 862-867 - Kenta Yamada, Yoshiki Masuyama, Kouei Yamaoka, Nobutaka Ono:
Fundamental Frequency Estimation Based on Finite-Order Harmonic Constraint Differential Equation. 868-872 - Wu-Hao Li, Chen-Yu Chiang, Te-Hsin Liu:
Tone Labeling by Deep Learning-based Tone Recognizer for Mandarin Speech. 873-880 - Ting-Jia Shu, Jen-Tzung Chien:
Learning to Enhance the Position Embedding and Coherence. 881-886 - Yao-Zhong Zhang, Chiung-An Chen, Jia-Sheng Zhang, Jia-Wen Wang:
VLSI Design of Near-Lossless Image Compression using Improved LZW. 887-891 - Yu-Wen Peng, Chia-Yu Hu, Yen-Ju Chin, He-Sheng Chou, Yuan-Jin Lin, Yu-Lin Liu, Shih-Lun Chen, Tsung-Yi Chen, Kuo-Chen Li, Chiung-An Chen:
The color demosaicing and image scaling based on improve Hamilton-Adams. 892-897 - Wei-Wen Hsu, Yao-Chung Chang, Wei-Min Lee, Yu-Chuan Huang, Da-Wen Lu:
Improving Regularization of Deep Learning Models in Fundus Analysis. 898-901 - Chung-Yen Wei, Bo-Yuan Xu, Yu-Xiang Zhao:
Design of Interactive System for Acupoint Analysis Based on Augmented Reality. 902-909 - Mu-Feng Lin, Yi-Qian Li, Tsung-Yi Chen, Yu-Lin Liu, Yuan-Jin Lin, Mei-Ling Chan, Chiung-An Chen, Kuo-Chen Li, Shih-Lun Chen:
Dental Positioning Medical Assistance System for BW Radiograph Based on YOLOV4. 910-917 - Wei-Wen Hsu, Jia-Yi Lin, Hsin-Hung Lai, Wan-Lin Hsu, Jeng-Ting Jiang, Yao-Chung Chang, Yao-Feng Li:
The Development of an AI-assisted Diagnosis System for Adult Glioma Subtyping Prediction. 918-921 - Jianmin Dong, Datian Peng, Guanxiong Pei, Taihao Li:
Poisoning Attacks against Gait-based Identity Recognition. 922-926 - Jian Xu, Dongheng Zhang, Jiamu Li, Qibin Sun, Yan Chen:
STrack: Velocity Estimation Using Single Antenna WiFi Devices. 927-933 - Kai Wang, Dimitrios Hatzinakos:
SEformer: Dual-Path Conformer Neural Network is a Good Speech Denoiser. 934-940 - Pengcheng Yue, Shu-Kai Zheng, Taihao Li:
Complex Feature Information Enhanced Speech Emotion Recognition. 941-946 - Min Zhang, Xiaosong Qiao, Yanqing Zhao, Chang Su, Yinglu Li, Ming Zhu, Junhao Zhu, Yuang Li, Xiaofeng Zhao, Yilun Liu, Wenbing Ma, Mengyao Piao, Jiawei Yu, Xinglin Lv, Song Peng, Shimin Tao, Hao Yang, Yanfei Jiang:
Incorporating Pinyin into Pipeline Named Entity Recognition from Chinese Speech. 947-953 - Pan Deng, Jie Zhang, Xinyuan Zhou, Zhongyi Ye, Weitai Zhang, Jianwei Cui, Lirong Dai:
Learning Semantic Information from Machine Translation to Improve Speech-to-Text Translation. 954-959 - Jiahao Yang, Jianguo Wei, Kuntharrgyal Khysru, Junhai Xu, Wenhuan Lu, Wenjun Ke, Xiaokang Yang:
Effective Fine-tuning Method for Tibetan Low-resource Dialect Speech Recognition. 960-965 - Qi Wang, Mingkuan Liu, Xianhong Chen, Mengwen Xiong:
Multi-task Piano Transcription with Local Relative Time Attention. 966-971 - Zehua Zhang, Changjun He, Shiyun Xu, Mingjiang Wang:
Real and imaginary part interaction network for monaural speech enhancement and de-reverberation. 972-977 - Genshun Wan, Hang Chen, Tan Liu, Chenxi Wang, Jia Pan, Zhongfu Ye:
Progressive Multi-scale Self-supervised Learning for Speech Recognition. 978-982 - Genshun Wan, Hang Chen, Pengcheng Li, Jia Pan, Zhongfu Ye:
Improved Data2vec with Soft Supervised Hidden Unit for Mandarin Speech Recognition. 983-987 - Yanfeng Wu, Pengcheng Yue, Cuiping Cheng, Taihao Li:
Investigation of Ensemble of Self-Supervised Models for Speech Emotion Recognition. 988-995 - Liang Tao, Maoshen Jia, Bing Bu, Dingding Yao:
Single Source Zone Detection in the Spherical Harmonic Domain for Multisource Localization. 996-1001 - Yunrui Cai, Changhe Song, Boshi Tang, Dongyang Dai, Zhiyong Wu, Helen Meng:
Robust Representation Learning for Speech Emotion Recognition with Moment Exchange. 1002-1007 - Pengcheng Yue, Yanfeng Wu, Leyuan Qu, Shukai Zheng, Shuyuan Zhao, Taihao Li:
Few Shot Learning Guided by Emotion Distance for Cross-corpus Speech Emotion Recognition. 1008-1012 - Kiyohide Sato, Keita Kishi, Tetsuo Kosaka:
Speech Emotion Recognition by Late Fusion of Linguistic and Acoustic Features using Deep Learning Models. 1013-1018 - Bagus Tris Atmaja, Akira Sasou:
Multilingual, Cross-lingual, and Monolingual Speech Emotion Recognition on EmoFilm Dataset. 1019-1025 - Bagus Tris Atmaja, Akira Sasou:
Ensembling Multilingual Pre-Trained Models for Predicting Multi-Label Regression Emotion Share from Speech. 1026-1029 - Ngoc-Anh Nguyen Thi, Bao Thang Ta, Nhat Minh Le, Van Hai Do:
An Automatic Pipeline For Building Emotional Speech Dataset. 1030-1035 - S. Uthiraa, Hemant A. Patil:
Analysis of Emotions in Speech using AESDD. 1036-1041 - Ning Guo, Tomohiro Nakatani, Shoko Araki, Takehiro Moriya:
Modified Parametric Multichannel Wiener Filter for Low-latency Enhancement of Speech Mixtures with Unknown Number of Speakers. 1042-1049 - Sota Misawa, Norihiro Takamune, Kohei Yatabe, Daichi Kitamura, Hiroshi Saruwatari:
Blind Source Separation Using Independent Low-Rank Matrix Analysis with Spectrogram-Consistency Regularization. 1050-1057 - Shinya Furunaga, Tetsuya Ueda, Shoji Makino:
Moving Interference Speaker removal using Geometrically Constrained Independent Vector Analysis. 1058-1063 - Meng-Hsuan Wu, Yih-Liang Shen, Hsuan-Cheng Chou, Bo-Wun Shih, Tai-Shih Chi:
A Dual-Channel Three-Stage Model for DoA and Speech Enhancement. 1064-1068 - Zhongxin Bai, Chao Pan, Gong Chen, Jingdong Chen, Jacob Benesty:
A Weighted Binary Cross-Entropy for Sound Event Representation Learning and Few-Shot Classification. 1069-1074 - Tsung-Han Tsai, Tzu-Chieh Chen:
A Reconfigurable Hardware Architecture for Graph Convolution Network in Action Recognition. 1075-1078 - Chung-Chian Hsu, Chi-Yuan Chen, S. M. Salahuddin Morsalin, Arthur Chang, Wen-Lin Fan:
Automated Carina Detection in Chest X-ray Images Using Non-Overlapping and Cross-Squeeze Convolutional Neural Networks. 1079-1084 - Manman Zhang, Yuchen Ma, Ge Luo, Sheng Li, Zhenxing Qian, Xinpeng Zhang:
Identifying the Style of Chatting. 1085-1092 - Chian C. Ho, Cian-Duo Lin:
Pose-Based Visual Servoing with Lightweight Deep-Learning Binarization for Autonomous Mobile Robot Application. 1093-1099 - Ryugo Uchiyama, Nari Tanabe:
Real-Time Noise Suppression Using Harmonic/Percussive Separation with Morphological Operations for Hammering Test. 1100-1106 - Shuichi Ohno, Shenjian Wang, Kiyotsugu Takaba:
ΔΣ Modulators for Discrete-time Closed Loop Control Systems with Quantization and Saturation. 1107-1111 - Kazuaki Murayama:
Asymptotic Estimation Performance of Linear Regression Model with Sparse Bayesian Learning as Both Samples and Signals Approach Infinity. 1112-1118 - Ryoya Harashima, Ryunosuke Eguchi, Seisuke Kyochi:
Convolutional Multidimensional Amplitude Spectrum Nuclear Norm for Frequency-domain Robust Principal Component Analysis. 1119-1125 - Reza Mirzaeifard, Naveen K. D. Venkategowda, Alexander Jung, Stefan Werner:
Moreau Envelope ADMM for Decentralized Weakly Convex Optimization. 1126-1130 - Yu-Ching Chung, Ji-Yan Han, Bo-Sin Wang, Wei-Zhong Zheng, Kung-Yao Shen, Ying-Hui Lai:
An Audio-Visual Speech Enhancement System Based on 3D Image Features: An Application in Hearing Aids. 1131-1137 - Kaien Mo, Xianrui Wang, Yichen Yang, Tetsuya Ueda, Shoji Makino, Jingdong Chen:
On Joint Dereverberation and Source Separation with Geometrical Constraints and Iterative Source Steering. 1138-1142 - Leander Melroy Maben, Zixun Guo, Chen Chen, Utkarsh Chudiwal, Chng Eng Siong:
Study of Generative Adversarial Networks for Noisy Speech Simulation from Clean Speech. 1143-1149 - Kenta Iwai, Takanobu Nishiura:
Step Size Control of Shared-error Normalized Least Mean Square Algorithm for Acoustic Echo and Noise Canceller. 1150-1154 - Haoran Xing, Shiqi Zhang, Daiki Takeuchi, Daisuke Niizumi, Noboru Harada, Shoji Makino:
Enhancing Spectrogram for Audio Classification Using Time-Frequency Enhancer. 1155-1160 - Wen-Chin Huang, Tomoki Toda:
Evaluating Methods for Ground-Truth-Free Foreign Accent Conversion. 1161-1166 - Chihiro Watanabe, Hirokazu Kameoka:
DisC-VC: Disentangled and F0-Controllable Neural Voice Conversion. 1167-1171 - Matsuri Iwasaki, Sunao Hara, Masanobu Abe:
Speech Synthesis Using Ambiguous Inputs From Wearable Keyboards. 1172-1178 - Iago Lourenço Correa, Sei Ueno, Akinobu Lee:
Accent-Preserving Voice Conversion between Native-Nonnative Speakers for Second Language Learning. 1179-1186 - Dung Kim Tran, Masato Akagi, Masashi Unoki:
Increasing Speech Intelligibility by Mimicking Professional Announcers' Voices and Its Physical Correlates. 1187-1192 - Reza Mirzaeifard, Naveen K. D. Venkategowda, Stefan Werner:
Robust Networked Federated Learning for Localization. 1193-1198 - Ehsan Lari, Vinay Chakravarthi Gogineni, Reza Arablouei, Stefan Werner:
Continual Local Updates for Federated Learning with Enhanced Robustness to Link Noise. 1199-1203 - Ekin Ugurel, Shuai Huang, Cynthia Chen:
Gaussian Process Learning for Location-Based Service Data. 1204-1207 - Anthony Kuh, Tyler Baguio:
Distributed on-line anomaly detection using kernel methods. 1208-1213 - Jiacong Xu, Riley Kilfoyle, Zixiang Xiong, Ligang Lu:
Communication-Efficient Design of Learning System for Energy Demand Forecasting of Electrical Vehicles. 1214-1220 - Shota Naiki, Shumpei Miura, Kenta Iwai, Takanobu Nishiura, Yoshiharu Soeta:
Radiated Sound Field Reproduction for Surrounding Loudspeaker Array Based on Higher-order Ambisonics. 1221-1225 - Pei-Lin Zhong, You-Siang Chen, Mingsian R. Bai:
Multichannel learning-based spatially extended active noise control via model matching and sensor transfer function interpolation. 1226-1233 - Yi-Tsung Shen, Cheng-Yuan Chang:
A Study of the Microphone Protection of Active Noise Control for Axial Fan. 1234-1239 - Kenya Doi, Yoshinobu Kajikawa:
SFANC with Compensation Filter Based on MEFxDCTLMS Algorithm. 1240-1244 - Woon-Seng Gan, Dongyuan Shi, Xiaoyi Shen:
Practical Active Noise Control: Restriction of Maximum Output Power. 1245-1249 - Trio Adiono, Michael Jonathan, Erwin Setiawan, Nana Sutisna, Rahmat Mulyawan, Infall Syafalni:
A QoS Throughput Performance Measurement Comparison between UGS and BE Services of a Real-time FPGA Based OFDM Multi-user System Design Implementation. 1250-1256 - Trio Adiono, Reina Puteri Ramadhani, Clarance Amadeus, Sindy Novaria Cicilya Sinaga:
Algorithm Development for Stepwise Valve Deflation Method in Blood Pressure Measurement. 1257-1262 - Kendrik Emkel Ginting, Nana Sutisna, Infall Syafalni, Trio Adiono:
SUMO Based Hardware/Software Co-simulation for Two-Intersection Adaptive and Collaborative Traffic Signal Controller. 1263-1270 - Devi Noor Endrawati, Infall Syafalni, Nana Sutisna, Trio Adiono:
Sparsity Exploration for Structured and Unstructured Weight Formations in CNN Architecture. 1271-1278 - Binh-Duong Dinh, Thanh-Thu Nguyen, Thi-Thao Tran, Van-Truong Pham:
1M parameters are enough? A lightweight CNN-based model for medical image segmentation. 1279-1284 - Luong Thi Theu, Quang-Huy Tran, Tran Thi Thuy Quynh, Duc-Tan Tran:
Imaging Ultrasound Scattering Targets using Density-Enhanced Chaotic Compressive Sampling. 1285-1290 - Sinh-Huy Nguyen, Thi-Thu-Hong Le, Hoang-Bach Nguyen, Ngoc-Bach Duong, Hung-Cuong Nguyen, Chi-Thanh Nguyen, Van-Loi Nguyen, Hai Vu:
Segmentation and observation of hand rehabilitation exercises by supporting of acceleration signals. 1291-1295 - Thanh Thi Hien Duong, Trung-Hieu Nguyen, The Thanh-Dat Le, Thi-Lich Nghiem, Duc-Huy Pham, Thi-Lan Le:
Investigating the Role of Human Action Detector in Visual-guide Audio Source Separation System. 1296-1303 - Thi Hoai Linh Nguyen, Quoc Khuong Nguyen, Van Duc Nguyen:
A combination of time and frequency synchronization with Doppler compensation for coded OFDM-based UWA systems. 1304-1309 - Priyanka Gupta, Aastha Kachhi, Hemant A. Patil:
Classification of Normal vs. Pathological Infant Cries Using Morse Wavelets. 1310-1316 - Mary Josy John, Imad Barhumi:
Compressive Sensing Based Algorithms for Limited-View PAT Image Reconstruction. 1317-1322 - Sidharrth Nagappan, Chern Hong Lim, Anuja Thimali Dharmaratne:
Towards AST-LLDs for the Analysis of Depression in Speech Signals. 1323-1328 - Jinwei Lin:
ecVoice: Audio Text Extraction Optimization of Video Based on Idioms Similarity Replacement. 1329-1336 - Yi Yang Ang, Kirish Boodhoo, Wee Ser, Rex Xiao Tan:
Heart Rate Acquisition and Processing Techniques for a Miniature Wearable Microphone Sensor. 1337-1342 - Kraithep Sirisanwannakul, Nutchanon Siripool, Kenji Suzuki, Waree Kongprawechnon, Jessada Karnjana:
Detection and Correction of Defective Relative Humidity Data Collected from the Greenhouse Environment Using Nested Kalman Filters with Standard Deviation Analysis. 1343-1348 - Ting Wei Wang, Shang-Hong Lai:
Pedestrian Crossing Intention Prediction with Multi-Modal Transformer-Based Model. 1349-1356 - Chi Wee Tan, Khai Yin Lim:
Revolutionizing Formative Assessment in STEM Fields: Leveraging AI and NLP Techniques. 1357-1364 - Hao-Jiun Tu, Yu-Feng Ou, Yong-Tai Chen, Chao-Tsung Huang:
A Biased Mixed-Precision Convolution Engine for Hardware-Efficient Computational Imaging CNN. 1365-1371 - Ting-Wei Chen, Chia-Ping Chen, Chung-Li Lu, Bo-Cheng Chan, Yu-Han Cheng, Hsiang-Feng Chuang, Wei-Yu Chen:
A Lightweight Speaker Verification Model For Edge Device. 1372-1377 - Bagus Jati Santoso, Dwi Prasetya Armunanta, Baskoro Adi Pratomo, Hudan Studiawan:
Efficient Dictionary and Grid-Based Framework for Answering Durable k-Nearest Neighbor Queries on Time Series Data. 1378-1385 - Heri Prasetyo, Abid Ammar Mahdy, Abrar Dwi Fairuz Nadhif, Taufiqurrakhman Nur Hidayat, Rudi Hartono:
Dual-Path Residual Attention Convolution Networks for Color-Embedded-Grayscale Image. 1386-1391 - Yin-Rong Zhuo, Hui-Lin Chen, Yu-Guang Chen:
DOC: A Novel DOuble-Contour-Based Macro Placement Framework for Mixed-Size Designs. 1392-1397 - AprilPyone MaungMaung, Isao Echizen, Hitoshi Kiya:
Hindering Adversarial Attacks with Multiple Encrypted Patch Embeddings. 1398-1404 - Keita Takeda, Ryuji Miyamoto, Osamu Takyu:
Implementation of PLIM on 429MHz LoRa/FSK with improved conversion table. 1405-1409 - Soma Furusawa, Kazunori Hayashi, Kaito Kameda, Ryo Hayakawa:
Numerical Performance Evaluation of ℓ1 - ℓ2 Sparse Reconstruction Using Optical Analog Circuit. 1410-1416 - Shangdi Liao, Haipeng Liu, Dingchang Zheng, Fei Chen:
Assessing the Effects of Filtering Processing on Pulse Wave Transit Time Measured by Photoplethysmography from Earlobe. 1417-1421 - Muyang Du, Chuan Liu, Jiaxing Qi, Junjie Lai:
Efficient Incremental Text-to-Speech on GPUs. 1422-1428 - Rui Luo, Yan Feng, Mingxin He, Yuliang Zhang:
Retinex-based Low-Light Image Enhancement. 1429-1434 - Shichuang Xie, Jiasheng Wu, Yanli Chen, Meng Han, Ting Wu, Tong Qiao:
Fine-grained Face Anti-Spoofing based on Recursive Self-Attention and Multi-Scale Fusion. 1435-1442 - Xiujian Liang, Bingshan Liu, Qichao Ying, Zhenxing Qian, Hsunfang Cho, Xinpeng Zhang:
StyleStegan: Leak-free Style Transfer Based on Feature Steganography. 1443-1450 - Ruiguo Yang, Xinhui Han, Wenfa Qi, Wei Hu:
Robust Watermark Imaging via Graph-signal Optimization. 1451-1457 - Wenfa Qi, Jiameng Wang, Zichen Yuan, Xiaolong Li:
A print-scan-resilient watermarking scheme for trademark images. 1458-1462 - Ziyi Xi, Wenmin Huang, Kangkang Wei, Weiqi Luo, Peijia Zheng:
AI-Generated Image Detection using a Cross-Attention Enhanced Dual-Stream Network. 1463-1470 - Boru Chen, Waleed Abdulla:
ResNet-Based Camera Model Identification with Adaptive Preprocessing Module and Weight Fusion of Global Information. 1471-1478 - Tong Tang, Yuan Tan, Shihang Ding, Zhidu Li:
Structural Quality Assured Global Optimization for CTU-Level Rate Control of Screen Content Coding. 1479-1483 - Zhaojie Luo, Kazunori Komatani:
Multimodal Emotion Recognition based on 2D Kernel Density Estimation for Multiple Labels Fusion. 1484-1491 - Neha Sahipjohn, Neil Kumar Shah, Vishal Tambrahalli, Vineet Gandhi:
RobustL2S: Speaker-Specific Lip-to-Speech Synthesis exploiting Self-Supervised Representations. 1492-1499 - Yi-Chong Zeng:
Realizing Nipple in Profile Recognition and Nipple Detection Using a Single Classification. 1500-1505 - Siang-Ling Zhang, Huai-Hsun Cheng, Yen-Hsin Chen, Mei-Chen Yeh:
Exploring a CLIP-Enhanced Automated Approach for Video Description Generation. 1506-1511 - Wei-Chi Lin, Ming-Zhan Lee, He-Sheng Chou, Yuan-Jin Lin, Kuo-Chen Li, Ting-Lan Lin, Shin-Lun Chen:
3D Point Cloud Denoising Based on Color Attribute. 1512-1516 - Jia-Sheng Zhang, Chiung-An Chen, Shih-Lun Chen, Yao-Zhong Zhang:
The DSP and DDR4 VLSI Design for Multi-Sensor in Biomedical System. 1517-1520 - Hao-Cheng Lu, Chiung-An Chen, Jia-Sheng Zhang, Yao-Zhong Zhang:
Identification of Victims Wearing Vibrant Clothing using MATLAB. 1521-1524 - Yu-Lin Liu, He-Sheng Chou, Ming-Zhan Lee, Mei-Ling Chan, Ting-Lan Lin, Chiung-An Chen, Shin-Lun Chen:
Point Cloud Inpainting Based on Delaunay Triangulation. 1525-1529 - Zixiao Liu, Sheng Guo, Man-On Pun:
Dense Three-Dimensional Color Reconstruction for Large-Scale Outdoor Scenes. 1530-1535 - Yan-Ru Chen, Shih-Wei-Chen Tseng, Yu-Chi Chen, Yeong-Hwa Chang:
Safety Enhancement for Mobility Scooter with Rule-Based Danger Prevention. 1536-1541 - Rongjun Li, Wei Peng:
Dictionary-driven Chinese ASR Entity Correction with Controllable Decoding. 1542-1548 - Ruizhe Ou, Haotian Yan, Ming Wu, Chuang Zhang:
A Method of Efficient Synthesizing Post-disaster Remote Sensing Image with Diffusion Model and LLM. 1549-1555 - Binchen Yang, Yu Guo, Xingyan Chen:
Privacy-oriented Coded Caching in Mobile Information-centric Networking. 1556-1563 - Zhaoye Zheng, Ke Zhang, Chaojun Shi, Fei Zheng:
MKTformer: Fine-grained Meter Classification Based on Multi-modal Knowledge Transfer. 1564-1570 - Zhen Li, Lang Wang, Wenjuan An, Song Qi, Xiaoxu Li, Xuezhi Fei:
Feature Augmentation Reconstruction Network for Few-Shot Image Classification. 1571-1578 - Xiaowei Guo, Jijie Wu, Kai Ren, Qi Song, Xiaoxu Li:
Dual Feature Reconstruction Network For Few-shot Image Classification. 1579-1584 - Qi Tian, Wen Wu, Qin Zhu, Tao Cai, Siyi Jiang, Yaqing Li, Jinrun Zhou, Nan Zhu, Yina Wei, Tao Tang, Kedi Xu, Feng Lin, Linqing Feng:
A Cloud-based Data Platform for Efficient EEG Data Management, Collaboration, and Analysis. 1585-1592 - Xiajie Zhou, Candy Olivia Mawalim, Benita Angela Titalim, Masashi Unoki:
Incorporating the Digit Triplet Test in A Lightweight Speech Intelligibility Prediction for Hearing Aids. 1593-1600 - Kazuki Yamato, Shuntaro Fujisawa, Satoshi Ito:
Deep Learning-based MRI Super-Resolution Using Non-uniform Segmented Phase-Scrambling Fourier Transform Signals. 1601-1606 - Ahmed Ibrahum, Kwang-Ho Park, Jang-Eui Hong, Van-Huy Pham, Keun Ho Ryu:
An Extreme Gradient Boosting-based Prediction for Depression. 1607-1613 - Minseok Chu, Gilwon Kang, Keun Ho Ryu:
An Improved Check Digit-based Participant Identification System for Human Biorepositories. 1614-1621 - Suphachok Buaruk, Somrudee Deepaisarn:
Enhancing Snoring Detection with Statistical Analysis of Audio Features. 1622-1627 - Shih-Shuo Tung, Ming-Yu Chung, Jinn Ho, Wen-Liang Hwang:
Un-Rectifying in ReLU Networks and Applications. 1628-1635 - Wan-Chia Huang, Cheng-Liang Shih, Irin Tri Anggraini, Yanqi Xiao, Nobuo Funabiki, Chih-Peng Fan:
OpenPose Based Yoga Poses Difficulty Estimation for Dynamic and Static Yoga Exercises. 1636-1640 - Jiahao Zhao, Kazuyoshi Yoshii:
Multimodal Multifaceted Music Emotion Recognition Based on Self-Attentive Fusion of Psychology-Inspired Symbolic and Acoustic Features. 1641-1645 - Young-Long Chen, Hsin-I Huang, Tzu-Te Yen:
Learned String Quartet Music with Variational Auto Encoder. 1646-1651 - Yu-Fang Huang, Tsung-Jung Liu, Chun-An Lin, Kuan-Hsien Liu:
SOAda-YOLOR: Small Object Adaptive YOLOR Algorithm for Road Object Detection. 1652-1658 - Wei-Shen Tai, Kuan-Hsien Liu:
Badminton Self-Training System Based on Virtual Reality. 1659-1663 - Rinka Kawano, Masaki Kawamura:
Rotation Angle Detection Using a Pilot Signal from Rotated Stego-Image. 1664-1669 - Ayaka Sakamoto, Rinka Kawano, Masaki Kawamura:
Application for generating re-accessible screenshots of web pages using histogram shrinkage. 1670-1676 - Teru Nagamori, Sayaka Shiota, Hitoshi Kiya:
Domain Adaptation for Efficiently Fine-tuning Vision Transformer with Encrypted Images. 1677-1683 - Akinobu Ura, Minoru Kuribayashi, Nobuo Funabiki:
Study on Face Landmark-based Analysis for Synthetic Media Identification Generated by Adversarial Generative Networks. 1684-1690 - Ahmed Khan, Minoru Kuribayashi, KokSheik Wong, Vishnu Monn Baskaran:
HDR Image Watermarking based on Saliency Detection and Quantization Index Modulation. 1691-1696 - Hau Joan, Li Peng Tan, Yiqi Tew:
Quick Response (QR) codes embedding in VVC using Quantisation Parameter Manipulation. 1697-1704 - Chen-Hsiu Huang, Ja-Ling Wu:
CPIPS: Learning to Preserve Perceptual Distances in End-to-End Image Compression. 1705-1711 - Wei-Hsun Ke, Yu-Wen Tseng, Wen-Huang Cheng:
Task-Specific Pruning: Efficient Parameter Reduction in Multi-task Object Detection Models. 1712-1717 - Chia-Hao Kao, Yi-Hsin Chen, Cheng Chien, Wei-Chen Chiu, Wen-Hsiao Peng:
Transformer-based Image Compression with Variable Image Quality Objectives. 1718-1725 - Cong-Ty Dang, Vu-Hoang Tran, Ngoc-Hoang-Lam Le, Ching-Chun Huang:
From Synthetic To Real: Enhancing Deep Learning Models With Generative Adversarial Networks For Efficient Data Utilization In Automatic Retail Stores. 1726-1731 - Wei-Hong Su, Sze-Ann Chen, Chen-I Chin, Hsu-Feng Hsiao:
Virtual Garment Fitting Through Parsing and Context-Aware Generative Adversarial Networks with Discriminator Group. 1732-1738 - Tzu-Po Lin, Monyneath Yim, Jui-Chiu Chiang, Wen-Hsiao Peng, Wen-Nung Lie:
Sparse Tensor-based point cloud attribute compression using Augmented Normalizing Flows. 1739-1744 - Yuya Yamamoto:
Toward Leveraging Pre-Trained Self-Supervised Frontends for Automatic Singing Voice Understanding Tasks: Three Case Studies. 1745-1752 - Miki Oshio, Hokuto Munakata, Ryu Takeda, Kazunori Komatani:
Out-of-Vocabulary Word Detection in Spoken Dialogues Based on Joint Decoding with User Response Patterns. 1753-1759 - Tuan Vu Ho, Shota Horiguchi, Shinji Watanabe, Paola García, Takashi Sumiyoshi:
Synthetic Data Augmentation for ASR with Domain Filtering. 1760-1765 - Kohta Masuda, Jun Ogata, Masafumi Nishida, Masafumi Nishimura:
Multi-Self-Supervised Learning Model-Based Throat Microphone Speech Recognition. 1766-1770 - Kwok Chin Yuen, Haoyang Li, Chng Eng Siong:
ASR Model Adaptation for Rare Words Using Synthetic Data Generated by Multiple Text-To-Speech Systems. 1771-1778 - Tatsunari Takagi, Atsunori Ogawa, Norihide Kitaoka, Yukoh Wakabayashi:
Streaming End-to-End ASR Using CTC Decoder and DRA for Linguistic Information Substitution. 1779-1783 - Wataru Nakamura, Kenta Takahashi:
A Biometric Signature Scheme with Template Protection and Authenticated Sample Recoverability. 1784-1791 - Yosuke Kaga, Yusei Suzuki, Kenta Takahashi:
IPFed: Identity protected federated learning for user authentication. 1792-1797 - Yumo Ouchi, Hidetsugu Uchida, Narishige Abe:
Privacy-Preserving Image Transformation Method for Person Detection and Re-ID. 1798-1803 - Taito Tonosaki, Shokei Kawakami, Koichi Ito, Takafumi Aoki, Yoshiko Yasumura, Masakazu Fujio, Yosuke Kaga, Kenta Takahashi:
Eye Biometrics Combined with Periocular and Iris Recognition Using CNN. 1804-1810 - Takahiko Arakawa, Yuya Sato, Hitoshi Sakano, Tetsushi Ohki:
Development of a Robust Ear Recognition Algorithm using Planar Approximation. 1811-1815 - Sea Ran Cleon Liew, Ngai-Fong Law:
Word encoding for word-looking DGA-based Botnet classification. 1816-1821 - Haowei Cheng, Candy Olivia Mawalim, Kai Li, Lijun Wang, Masashi Unoki:
Analysis of Spectro-Temporal Modulation Representation for Deep-Fake Speech Detection. 1822-1829 - Ryu Takeda, Yui Sudo, Kazunori Komatani:
Flexible Evidence Model to Reduce Uncertainty Mismatch Between Speech Enhancement and ASR Based on Encoder-Decoder Architecture. 1830-1837 - Takahiro Fukumori, Taito Ishida, Yoichi Yamashita:
Investigating the Effectiveness of Speaker Embeddings for Shout Intensity Prediction. 1838-1842 - Atsuo Hiroe, Katsutoshi Itoyama, Kazuhiro Nakadai:
Is the Ideal Ratio Mask Really the Best? - Exploring the Best Extraction Performance and Optimal Mask of Mask-based Beamformers. 1843-1850 - Koharu Horii, Kengo Ohta, Ryota Nishimura, Atsunori Ogawa, Norihide Kitaoka:
Language modeling for spontaneous speech recognition based on disfluency labeling and generation of disfluent text. 1851-1856 - Shuta Taniguchi, Tsuneo Kato, Akihiro Tamura, Keiji Yasuda:
Transformer-based Automatic Speech Recognition of Simultaneous Interpretation with Auxiliary Input of Source Language Text. 1857-1861 - Lester Phillip Violeta, Tomoki Toda:
An Analysis of Personalized Speech Recognition System Development for the Deaf and Hard-of-Hearing. 1862-1867 - Ching-Chieh Chen, Wei-Cheng Hsu, Tzu-Han Lin, Kuan-Dar Chen, Yung-An Tsou, Yi-Wen Liu:
Classification of Vocal Cord Disorders: Comparison Across Voice Datasets, Speech Tasks, and Machine Learning Methods. 1868-1873 - Chen-Xin Yu, Wu-Pei Chen, Chin-Yen Ju, Tsung-Yi Chen, Kuo-Chen Li, Chiung-An Chen, Mei-Ling Chan, Shih-Lun Chen:
Application of Deep Learning Techniques for Thermal Imagery Analysis in Abnormal Identification of Floor Tiles in Heritage Environments. 1878-1884 - Shao-Hua Lu, Jia-Teng Lu, Szu-Yin Lin, Chih-Hsien Hsia:
Wavelet and Cutout in YOLO Architecture for Road Pothole Detection. 1885-1891 - Ming-Yi Wei, Yu-Chi Wang, Liang-Ying Ke, Chih-Hsien Hsia:
Robust Finger Vein Recognition Based on Lightweight Attention Convolutional Neural Networks. 1892-1895 - Ping-Han Chen, Yung-Sheng Hung, Chih-Hsien Hsia:
Lightweight CNN and Image Enhancement Using in Palm Vein Recognition. 1896-1902 - Szu-Yin Lin, Ming-Chun Chien, Edwin Tiong Kwong Meng, Yu-Chien Wang, Yu-Yi Kuo, Che-Hsuan Lin:
Breast Cancer Detection Auxiliary System Leveraging Deep Learning and Mixed Reality. 1903-1906 - Wanli Lyu, Lulu Cheng, Zhaoxia Yin, Bin Luo:
Efficient Reversible Data Hiding for 3D Mesh Models Based on Multi-LSB Substitution and Ring-prediction. 1907-1914 - Wanli Lyu, Mengjiang Wu, Zhaoxia Yin, Bin Luo:
MAEDefense: An Effective Masked AutoEncoder Defense against Adversarial Attacks. 1915-1922 - Yusheng Guo, Nan Zhong, Zhenxing Qian, Xinpeng Zhang, Hsunfang Cho:
Preemptive Image Protection against Steganography. 1923-1930 - Mingyang Zhang, Yi Zhou, Zhizheng Wu, Haizhou Li:
Zero-shot multi-speaker accent TTS with limited accent data. 1931-1936 - Xiao-Ying Zhao, Qiushi Zhu, Jie Zhang, Yeping Zhou, Peiqi Liu:
Speech Enhancement with Multi-granularity Vector Quantization. 1937-1942 - Mohan Shi, Jie Zhang, Zhihao Du, Fan Yu, Qian Chen, Shiliang Zhang, Li-Rong Dai:
A Comparative Study on Multichannel Speaker-Attributed Automatic Speech Recognition in Multi-party Meetings. 1943-1948 - Fengrun Zhang, Chengfei Li, Shuhao Deng, Yaoping Wang, Jinfeng Bai:
Hybrid Syllable and Character Representations for Mandarin ASR. 1949-1954 - Jing Li, Rui Li, Shen Guo, Aishan Wumaier:
Enhancing Whisper Model for Pronunciation Assessment with Multi-Adapters. 1955-1959 - Haowen Li, Wen Zhang, Lijun Zhang:
DoA Estimation of Room Reflections Using NN-Based MUSIC Algorithm. 1960-1965 - Junjie Chen, Yongwei Li, Ziping Zhao, Xuefei Liu, Zhengqi Wen, Jianhua Tao:
Hybrid Multi-Task Learning for End-To-End Multimodal Emotion Recognition. 1966-1971 - Neil Kumar Shah, Vivek Srivastava, Mohit Bhardwaj, Satej Kadlay, Dharmeshkumar Agrawal, Savita Bhat, Niranjan Pedanekar:
It's What You Say and How You Say It: Exploring Audio and Textual Features for Podcast Data. 1972-1977 - Lixiang Li, Xiaopeng Xue, Haipeng Peng, Yeqing Ren, Mengmeng Zhao:
Improved One-class Learning for Voice Spoofing Detection. 1978-1983 - Xingyu Chen, Fei Ma, Amy Bastine, Prasanga N. Samarasinghe, Huiyuan Sun:
Sound Field Estimation around a Rigid Sphere with Physics-informed Neural Network. 1984-1989 - Ningning Pan, Jacob Benesty, Jingdong Chen:
A Controlled Noise Reduction Wiener Filter Based on the Quadratic Eigenvalue Problem. 1990-1994 - Sijie Wang, Askar Hamdulla, Mijit Ablimit:
Target Speaker Extraction with Attention Enhancement and Gated Fusion Mechanism. 1995-2001 - Jia Qi Yip, Dianwen Ng, Bin Ma, Chng Eng Siong:
Analysis of Speech Separation Performance Degradation on Emotional Speech Mixtures. 2002-2007 - Ruifeng Zhang, Tetsuya Ueda, Shoji Makino:
Geometrically Constrained Blind Moving Source Extraction based on Constant Separation Vector and Auxiliary Function Technique. 2008-2012 - Kanta Shimonishi, Takahiro Fukumori, Yoichi Yamashita:
Universal Sound Separation Using Replay-based Data Sampling in Incremental Learning. 2013-2018 - Keisuke Takazawa, Hirokazu Kameoka, Masahiro Yukawa:
Multiple Sound Source Tracking Based on Generative Modeling and Recursive Bayesian Filtering of Spatial Gradient Spectra. 2019-2023 - Tetsuya Ueda, Tomohiro Nakatani, Rintaro Ikeshita, Shoko Araki, Shoji Makino:
Spatially-Regularized Switching Independent Vector Analysis. 2024-2030 - Arya Widyadhana, Shintami Chusnul Hidayati, Dini Adni Navastara, Yeni Anistyasari:
ASF-LLRDA: Locality-regularized Linear Regression Discriminant Analysis with Approximately Symmetrical Face Preprocessing for Face Recognition. 2031-2036 - Shaokang Wang, Songlin Sun:
Joint Optimization Algorithm for Adaptive Bit Allocation Based on Temporal-Spatial Information. 2037-2042 - Yuma Kinoshita, Kouei Yamaoka, Hitoshi Kiya:
Maximization of 2D Cross-Correlation Based on Auxiliary Function Method for Image Alignment. 2043-2047 - Jing-Ming Guo, Yu-Shun Huang, Ting-Yu Chang, Tai-Cyuan Ciou, Yun-Ching Yeh, Jeffrey Chen:
Multitask Record for Badminton Match. 2048-2052 - Jing-Ming Guo, S. Sankarasrinivasan, Let Viet Hung, Wei Liu:
Deep Residual and Classified Neural Networks for Inverse Halftoning. 2053-2060 - Yoto Fujita, Yoshiaki Bando, Keisuke Imoto, Masaki Onishi, Kazuyoshi Yoshii:
DOA-Aware Audio-Visual Self-Supervised Learning for Sound Event Localization and Detection. 2061-2067 - Shi Cheng, Jun Du, Qing Wang, Ya Jiang, Zhaoxu Nian, Shutong Niu, Chin-Hui Lee, Yu Gao, Wenbin Zhang:
Improving Sound Event Localization and Detection with Class-Dependent Sound Separation for Real-World Scenarios. 2068-2073 - Ami Igarashi, Shunsuke Tsubaki, Daisuke Niizumi, Daiki Takeuchi, Yasunori Ohishi, Noboru Harada, Keisuke Imoto:
Joint Analysis of Acoustic Scenes and Sound Events Based on Semi-Supervised Approach. 2074-2080 - Jeongsoo Park, Dong-Gyun Han, Hyoung Sul La, Sangmin Lee, Yoonchang Han, Eun-Jin Yang:
Cross-domain Sound Recognition for Efficient Underwater Data Analysis. 2081-2086 - Tomohiro Takahashi, Yuma Kinoshita, Natsuki Ueno, Yukoh Wakabayashi, Nobutaka Ono, Jun Honda, Seishi Fukuma, Aoi Kitamori, Hiroshi Nakagawa:
Augmentation of Various Speed Data by Controlling Frame Overlap for Acoustic Traffic Monitoring. 2087-2091 - Chien-Cheng Tseng, Su-Ling Lee:
Distributed Computation of Heat Kernel Smoothing Using Series Expansion Method. 2092-2098 - Chih-Chang Yu, Zi-Hang Huang, Hsu-Yung Cheng:
In-Air Handwriting for Chinese Character Recognition from Monocular Camera: A Deep Learning based Approach with Fingertip Detection and Virtual Strokes Elimination. 2099-2103 - Cyun-Bo Wang, Jian-Jiun Ding:
EffSegmentNet: Efficient Design for Real-time Semantic Segmentation. 2104-2111 - Zhichao Zhang:
Universal Optimal Parameters of the Closed-Form Linear Canonical Wigner Distribution. 2112-2117 - Chi-Kai Hsieh, Feng-Tsun Chien, Min-Kuan Chang:
Autoencoder-Enhanced Federated Learning with Reduced Overhead and Lower Latency. 2118-2123 - Masaya Kumagai, Ayano Nakai-Kasai, Tadashi Wadayama:
Deep Unfolding-based Distributed MIMO Detection. 2124-2130 - Fityanul Akhyar, Ledya Novamizanti, Koredianto Usman, Ghanes Mahesa Aditya, Farhan Nur Hakim, Mukhamad Zidni Ilman, Ferdi Ramdhon, Chih-Yang Lin:
A Comparative Analysis of the Yolo Models for Intelligent Lobster Surveillance Camera. 2131-2136 - Chun-Huang Lo, Chung-Nan Lee:
A UAV Indoor Obstacle Avoidance System Based on Deep Reinforcement Learning. 2137-2143 - Hideyoshi Miura, Shoya Abukawa, Tomotaka Kimura, Kouji Hirata:
Approximate modeling of malware diffusion on wireless mobile devices. 2144-2148 - Wen-Ping Lai, Wen-Ru Chen, Hong-Lun Lai, Hong-Yi Li:
Impacts of 5G-TDD Time Slot Configurations on the Downlink and Uplink Data Rates. 2149-2154 - Zhenzhen Tian, Xinyu Zhang, Wei Yan, Jihua Wang:
Bearing Fault Diagnosis and Interpretation Based on 2D Images and Convolutional Neural Network. 2155-2162 - An-Chi Wei, Yi-Cheng Chang, Jyh-Rou Sze:
Study on Reduction of Background Fringes for Defect Detection of Specular Surface. 2163-2167 - Latifa Nabila Harfiya, Yan-Cheng Hsu, Yung-Hui Li, Jia-Ching Wang:
On the Optimal Self-Supervised Multi-Fault Detector for Temperature Sensor Data. 2168-2172 - Chin-Wei Lee, Daniel Hládek, Matús Pleva, Yuan-Fu Liao, Ming-Hsiang Su:
Application of Wafer Defect Pattern Classification Model in the Semiconductor Industry. 2173-2177 - Ta-Fu Chen, Yi-Xing Lin, Ming-Hsiang Su, Po-Kai Chen, Tzu-Chiang Tai, Jia-Ching Wang:
Question Answering System Based on Pre-Training Model and Retrieval Reranking for Industry 4.0. 2178-2181 - Anuwat Chaiwongyen, Suradej Duangpummet, Jessada Karnjana, Waree Kongprawechnon, Masashi Unoki:
Deepfake-speech Detection with Pathological Features and Multilayer Perceptron Neural Network. 2182-2188 - Jia-Hao Hsu, Hua-Wei Tseng, Chung-Hsien Wu, Esther Ching-Lan Lin, Po See Chen:
Temporal and Type Correlation in Digital Phenotyping for Bipolar Disorder State Prediction Using Multitask Self-Supervised Learning. 2189-2195 - I-Ting Hsieh, Chung-Hsien Wu, Shu-Wei Tsai:
Data Selection Based on Phoneme Affinity Matrix for Electrolarynx Speech Recognition. 2196-2202 - Kai-Hsuan Chan, Yi-Chong Zeng:
Reduction of Annotation Effort in Medical Image Analysis Based on Self-supervised Learning. 2203-2208 - Farchan Raswa Hakim, Prabowo Yoga Wicaksana, Wenny Ramadha Putri, Agus Harjoko, Jia-Ching Wang:
STUA-Net: A Fingerprint Reconstruction with Swin Transformer and Soft Collective Attention. 2209-2212 - Yuki Kitagishi, Hosana Kamiyama, Naohiro Tawara, Atsunori Ogawa, Noboru Miyazaki, Taichi Asami:
Coarse-Age Loss: A New Training Method Using Coarse-Age Labeled Data for Speaker Age Estimation. 2213-2220 - Taiyang Guo, Sixia Li, Shunsuke Kidani, Shogo Okada, Masashi Unoki:
Contribution of modulation spectral features for cross-lingual speech emotion recognition under noisy reverberant conditions. 2221-2227 - Tomoka Wakamatsu, Sayaka Shiota, Hitoshi Kiya:
Vocal Tract Length Perturbation-based Pseudo-Speaker Augmentation for Speaker Embedding Learning. 2228-2232 - Minato Uesaka, Hideto Kawauchi, Kouei Yamaoka, Yukoh Wakabayashi, Yuma Kinoshita, Nobutaka Ono, Jun Noguchi, Satoshi Watanabe, Noritaka Ichinohe, Seico Benner, Hidenori Yamasue:
Automatic Call Classification of Autism Model Marmosets by Deep Learning and Analysis of Their Vocal Development. 2233-2237 - Lu Yi, Man-Wai Mak:
Cross-Domain adaptation in Distance Space for Speaker Verification. 2238-2243 - Bo Peng, Waleed H. Abdulla, Kevin I-Kai Wang:
Urban Noise Monitoring using Edge Computing with CNN-LSTM on Jetson Nano. 2244-2250 - Tsung-Hsi Wu, Pei-Yuan Chen, Chien-Chih Chen, Meng-Ju Chung, Zheng-Kai Ye, Ming-Hsu Li:
Random forest of Classification and Regression Tree (CART) in the estimation of SWC based on meteorological inputs and hydrodynamics behind. 2251-2255 - Chung-I Huang, Jih-Sheng Chang, Chen-Kai Sun, Taichi Wang, Wei-Yu Chen, Hui Hung Yu, Wen-Yi Chang, Fang-Pang Lin:
A Framework for Reusing Earth Science Data on Data and Model Marketplaces. 2256-2260 - Yi-Ju Lee, Fang-Yi Cheng, Chih-Yung Feng, Zhih-Min Yang:
Impact of the weighted loss function on the innovative CMAQ-CNN PM2.5 forecasting model. 2261-2266 - Xiaoquan Ke, Man-Wai Mak, Helen M. Meng:
Jointly Modelling Transcriptions and Phonemes with Optimal Features to Detect Dementia from Spontaneous Cantonese. 2267-2273 - Keigo Hojo, Daiki Mori, Yukoh Wakabayashi, Kengo Ohta, Atsunori Ogawa, Norihide Kitaoka:
Combining multiple end-to-end speech recognition models based on density ratio approach. 2274-2279 - Shunichi Kohara, Masanobu Abe, Sunao Hara:
Speech-Emotion Control for Text-to-Speech in Spoken Dialogue Systems Using Voice Conversion and x-vector Embedding. 2280-2286 - Yuting Geng, Masato Nakayama, Takanobu Nishiura:
Narrow-edged Acoustical Beamforming Utilizing Phase Inversion for Frequency Modulation-based Parametric Array Loudspeaker. 2287-2293 - Akio Kobayashi, Keiichi Yasu:
Corpus Construction for Deaf Speakers and Analysis by Automatic Speech Recognition. 2294-2298 - Hikaru Yamaga, Toshifumi Momma, Kazunori Kojima, Yoshiaki Itoh:
Ensemble of Transformer and Convolutional Recurrent Neural Network for Improving Discrimination Accuracy in Automatic Chord Recognition. 2299-2305 - Nagito Shione, Yukoh Wakabayashi, Norihide Kitaoka:
Construction of Automatic Speech Recognition Model that Recognizes Linguistic Information and Verbal/Non-verbal Phenomena. 2306-2311 - Lifan Zhong, Erica Cooper, Junichi Yamagishi, Nobuaki Minematsu:
Exploring Isolated Musical Notes as Pre-training Data for Predominant Instrument Recognition in Polyphonic Music. 2312-2319 - Hayata Nakano, Tsubasa Yoshizawa, Yuting Geng, Kenta Iwai, Takanobu Nishiura:
Speech Quality Improvement Utilizing Out-of-Focus Areas in Rolling-Shutter Video on Speech Extraction. 2320-2325 - Chung-Che Wang, Yu-Chun Lin, Yu-Teng Hsu, Jyh-Shing Roger Jang:
Personalized Audio Quality Preference Prediction. 2326-2330 - Yi-Cheng Wang, Tzu-Ting Yang, Hsin-Wei Wang, Bi-Cheng Yan, Berlin Chen:
AVATAR: Robust Voice Search Engine Leveraging Autoregressive Document Retrieval and Contrastive Learning. 2331-2335 - Chia-Chuan Liu, Chia-Ping Chen, Chung-Li Lu, Bo-Cheng Chan, Yu-Han Cheng, Hsiang-Feng Chuang, Wei-Yu Chen:
Regression-based Sound Event Detection with Semi-supervised Learning. 2336-2342 - Linh Thi Thuc Tran, Felix Albu, Hieu Trung Nguyen, Sven Nordholm:
Proportionate NLMS with Variable Step-Size for Adaptive Feedback Cancellation in Hearing Aids. 2343-2348 - Hsuan-Cheng Chou, Yih-Liang Shen, Meng-Hsuan Wu, Bo-Wun Shih, Tai-Shih Chi:
Residual Echo Suppression using Spatial Feature for Stereo Acoustic Echo Cancellation. 2349-2353 - Jen-Chieh Cheng, Jian-Jiun Ding:
Multitaper Adaptive Time-Frequency Windowed Fourier Transform Based on the Reliable Region of Window Widths. 2354-2361 - Prakash Kumar Karn, Waleed H. Abdulla:
Enhancing Retinal Disease Classification with Dual Scale Twin Vision Transformers using OCT Imaging. 2362-2369 - Kai Chieh Chang, Mark Hasegawa-Johnson, Nancy L. McElwain, Bashima Islam:
Classification of Infant Sleep/Wake States: Cross-Attention among Large Scale Pretrained Transformer Networks using Audio, ECG, and IMU Data. 2370-2377 - Yuta Iinuma, Takuto Ozawa, Sou Nobukawa, Nobuhiko Wagatsuma, Keiichiro Inagaki:
Dynamic Characteristics of Electroencephalogram Reflecting Driving-Experience-Dependent Performance Using Microstates. 2378-2383 - Madhu R. Kamble:
Quefrency Domain Features with Residual Networks for Spoof Speech Detection. 2384-2389 - Zhenyu Piao, Hyungseob Lim, Miseul Kim, Hong-Goo Kang:
PDF-NET: Pitch-adaptive Dynamic Filter Network for Intra-gender Speaker Verification. 2390-2394 - Miho Hirohashi, Yoichi Haneda:
Subjective Evaluation of a Focused Sound Source Reproducing at the positions of a Listener's Moving Hand. 2395-2401 - Xin Wang, Hongjie He, Fan Chen:
Time Sensitive Hash and Adaptive Image Recovery based Self-embedding Fragile Watermarking Scheme in Encrypted Images. 2402-2408 - Wenjiang Chi, Xiaoqin Feng, Liumeng Xue, Yunlin Chen, Lei Xie, Zhifei Li:
Multi-granularity Semantic and Acoustic Stress Prediction for Expressive TTS. 2409-2415 - Wen Wang, Dongchao Yang, Qichen Ye, Bowen Cao, Yuexian Zou:
NADiffuSE: Noise-aware Diffusion-based Model for Speech Enhancement. 2416-2423 - Binghuai Lin, Liyuan Wang:
Multi-accent pronunciation assessment based on domain adversarial training. 2424-2428 - Yunhao Zhao, Changchun Bao, Xue Yang, Jing Zhou:
GAN-Based Time-Domain Packet Loss Concealment Method with Consistent Mapping Approach. 2429-2435 - Yu Liu, Hongxia Wang:
Feature Selection Based on Clonal Selection Algorithm for Image Steganalysis. 2441-2447 - Tianci Wu, Shulin He, Hui Zhang, Xueliang Zhang:
ScaleFormer: Transformer-based speech enhancement in the multi-scale time domain. 2448-2453 - Yangfu Li, Xiaodan Lin:
UniVR: A Unified Framework for Pitch-Shifted Voice Restoration in Speaker Identification. 2454-2460