


ACM Transactions on Architecture and Code Optimization, Volume 22
Volume 22, Number 1, March 2025
- Perry Gibson, José Cano, Elliot Crowley, Amos J. Storkey, Michael F. P. O'Boyle:
DLAS: A Conceptual Model for Across-Stack Deep Learning Acceleration. 1:1-1:28
- Xinbiao Gan:
GraphService: Topology-aware Constructor for Large-scale Graph Applications. 2:1-2:24
- Renjun Zhang, Tianming Zhang, Zinuo Cai, Dongmei Li, Ruhui Ma, Rajkumar Buyya:
MemoriaNova: Optimizing Memory-Aware Model Inference for Edge Computing. 3:1-3:25
- Andrea Lepori, Alexandru Calotoiu, Torsten Hoefler:
Iterating Pointers: Enabling Static Analysis for Loop-based Pointers. 4:1-4:25
- Viktor Razilov, Ipek Geçin, Emil Matús, Gerhard P. Fettweis:
Conflict Management in Vector Register Files. 5:1-5:19
- Jingle Xu, Jiayu Fu, Lin Gan, Yaojian Chen, Zhaoqi Sun, Zhenchun Huang, Guangwen Yang:
Leveraging the Hardware Resources to Accelerate cryo-EM Reconstruction of RELION on the New Sunway Supercomputer. 6:1-6:25
- Yuta Saito, Kazunori Sakamoto, Hironori Washizaki, Yoshiaki Fukazawa:
Multiple Function Merging for Code Size Reduction. 7:1-7:26
- Peihua Zhang, Chenggang Wu, Hanzhi Hu, Lichen Jia, Mingfan Peng, Jiali Xu, Mengyao Xie, Yuanming Lai, Yan Kang, Zhe Wang:
Shining Light on the Inter-procedural Code Obfuscation: Keep Pace with Progress in Binary Diffing. 8:1-8:28
- Dengke Han, Mingyu Yan, Xiaochun Ye, Dongrui Fan:
Characterizing and Understanding HGNN Training on GPUs. 9:1-9:25
- Jingyu Wang, Ruilong Ma, Xiang Yang, Qi Qi, Zirui Zhuang, Jing Wang, Jianxin Liao, Song Guo:
DeepZoning: Re-accelerate CNN Inference with Zoning Graph for Heterogeneous Edge Cluster. 10:1-10:26
- Chenghao Ouyang, Jinhan Xin, Siqi Zeng, Guohui Li, Jianjun Li, Zhibin Yu:
Constructing a Supplementary Benchmark Suite to Represent Android Applications with User Interactions by using Performance Counters. 11:1-11:28
- Xinglei Dou, Lei Liu, Limin Xiao:
An Intelligent Scheduling Approach on Mobile OS for Optimizing UI Smoothness and Power. 12:1-12:27
- Kwanghoon Choi, Igjae Kim, Sunho Lee, Jaehyuk Huh:
ShieldCXL: A Practical Obliviousness Support with Sealed CXL Memory. 13:1-13:25
- Yun Chen, Ali Hajiabadi, Romain Poussier, Yaswanth Tavva, Andreas Diavastos, Shivam Bhasin, Trevor E. Carlson:
PARADISE: Criticality-Aware Instruction Reordering for Power Attack Resistance. 14:1-14:25
- Chunfeng Li, Feng Shi, Fei Yin, Karim Soliman, Jin Wei:
A High Scalability Memory NoC with Shared-Inside Hierarchical-Groupings for Triplet-Based Many-Core Architecture. 15:1-15:26
- Jin Zhao, Yu Zhang, Donghao He, Qikun Li, Weihang Yin, Hui Yu, Hao Qi, Xiaofei Liao, Hai Jin, Haikun Liu, Linchen Yu, Zhan Zhang:
An Efficient ReRAM-based Accelerator for Asynchronous Iterative Graph Processing. 16:1-16:26
- Xinyu Li, Guangyao Guo, Yanzhi Lan, Feng Xue, Chenji Han, Gen Niu, Fuxin Zhang:
Tiaozhuan: A General and Efficient Indirect Branch Optimization for Binary Translation. 17:1-17:26
- Jianhua Gao, Zeming Liu, Yizhuo Wang, Weixing Ji:
RaNAS: Resource-Aware Neural Architecture Search for Edge Computing. 18:1-18:18
- Adnan Hasnat, Shoaib Akram:
SPIRIT: Scalable and Persistent In-Memory Indices for Real-Time Search. 19:1-19:26
- Dezhong Yao, Sifan Zhao, Tongtong Liu, Gang Wu, Hai Jin:
ApSpGEMM: Accelerating Large-scale SpGEMM with Heterogeneous Collaboration and Adaptive Panel. 20:1-20:23
- Weiduo Chen, Xiaoshe Dong, Fan Zhang, Bowen Li, Yufei Wang, Qiang Wang:
ATP: Achieving Throughput Peak for DNN Training via Smart GPU Memory Management. 21:1-21:27
- Zhuoran Song, Jiabei Long, Li Jiang, Naifeng Jing, Xiaoyao Liang:
GCNTrain+: A Versatile and Efficient Accelerator for Graph Convolutional Neural Network Training. 22:1-22:22
- Wenjie Qi, Zhipeng Tan, Ziyue Zhang, Ying Yuan, Dan Feng:
exZNS: Extending Zoned Namespace to Support Byte-loggable Zones. 23:1-23:28
- Long Zheng, Bing Zhu, Pengcheng Yao, Yuhang Zhou, Chengao Pan, Wenju Zhao, Xiaofei Liao, Hai Jin, Jingling Xue:
PRAGA: A Priority-Aware Hardware/Software Co-design for High-Throughput Graph Processing Acceleration. 24:1-24:27
- Yingshuai Dong, Chencheng Ye, Haikun Liu, Liting Tang, Xiaofei Liao, Hai Jin, Cheng Chen, Yanjiang Li, Yi Wang:
DTAP: Accelerating Strongly-Typed Programs with Data Type-Aware Hardware Prefetching. 25:1-25:25
- Xueliang Wei, Dan Feng, Wei Tong, Bing Wu, Xu Jiang:
COVER: Alleviating Crash-Consistency Error Amplification in Secure Persistent Memory Systems. 26:1-26:27
- Xinqi Chen, Erci Xu, Dengyao Mo, Ruiming Lu, Haonan Wu, Dian Ding, Guangtao Xue:
MasterPlan: A Reinforcement Learning Based Scheduler for Archive Storage. 27:1-27:25
- Brandon Kammerdiener, J. Zach McMichael, Michael R. Jantz, Kshitij A. Doshi, Terry R. Jones:
Flexible and Effective Object Tiering for Heterogeneous Memory Systems. 28:1-28:24
- Zhiqiang Chen, Yongwen Wang, Hongwei Zhou, Jian Zhang:
Steered Bubble: An Interposer-based Deadlock Recovery Algorithm for Multi-chiplet Systems. 29:1-29:24
- Shruthi Karunakar, Rajshekar Kalayappan, Sandeep Chandran:
Consequence-based Clustered Architecture. 30:1-30:25
- Jiahui Yang, Fulin Nan, Zhirong Shen, Zhisheng Chen, Yuhui Cai, Dmitrii I. Kaplun, Xiaoli Wang, Quanqing Xu, Chuanhui Yang, Jiwu Shu:
TPRepair: Tree-based Pipelined Repair in Clustered Storage Systems. 31:1-31:25
- Jianrong Yan, Wenbin Jiang, Dongao He, Suyang Wen, Yang Li, Hai Jin, Zhiyuan Shao:
RT-GNN: Accelerating Sparse Graph Neural Networks by Tensor-CUDA Kernel Fusion. 32:1-32:27
- Yi Dai, Kai Lu, Sheng Ma, Jinshu Su, Dongsheng Li:
Bubble-Swap Flow Control. 33:1-33:26
- Dongjie Tang, Zijun Wu, Yun Wang, Yicheng Gu, Fangxin Liu, Zhengwei Qi:
gCom: Fine-grained Compressors in Graphics Memory of Mobile GPU. 34:1-34:25
- Ruixing Zong, Jiapeng Zhang, Zhuo Tang, Kenli Li:
IBing: An Efficient Interleaved Bidirectional Ring All-Reduce Algorithm for Gradient Synchronization. 35:1-35:23
- Quancheng Wang, Ming Tang, Ke Xu, Han Wang:
Unveiling and Evaluating Vulnerabilities in Branch Predictors via a Three-Step Modeling Methodology. 36:1-36:26
- Pengyu Yang, Weihao Cui, Chunyu Xue, Han Zhao, Chen Chen, Quan Chen, Jing Yang, Minyi Guo:
Taming Flexible Job Packing in Deep Learning Training Clusters. 37:1-37:24
- Zhenlin Wu, Haosong Zhao, Hongyuan Liu, Wujie Wen, Jiajia Li:
gHyPart: GPU-friendly End-to-End Hypergraph Partitioner. 38:1-38:25
- Mariano Benito, Enrique Vallejo, Ramón Beivide:
LIA: Latency-Improved Adaptive routing for Dragonfly networks. 39:1-39:26
- Yiming Gan, Jingwen Leng, Bo Yu, Yuhao Zhu:
KINDRED: Heterogeneous Split-Lock Architecture for Safe Autonomous Machines. 40:1-40:25
- Tzung-Han Juang, Christophe Dubach:
Maximizing Data and Hardware Reuse for HLS with Early-Stage Symbolic Partitioning. 41:1-41:26
- Cheng Xu, Chao Li, Xiaofeng Hou, Junyi Mei, Jing Wang, Pengyu Wang, Shixuan Sun, Minyi Guo, Baoping Hao:
Enhancing High-Throughput GPU Random Walks Through Multi-Task Concurrency Orchestration. 42:1-42:26
- Qiong Chang, Weimin Wang, Jun Miyazaki:
Accelerating Nearest Neighbor Search in 3D Point Cloud Registration on GPUs. 43:1-43:24
- Yekang Zhan, Xiangrui Yang, Haichuan Hu, Qiang Cao, Yifan Zhang, Jie Yao:
AIS: An Active Idleness I/O Scheduler to Reduce Buffer-Exhausted Degradation of Solid-State Drives. 44:1-44:26
- Coby Soss, Aravind Sukumaran-Rajam, Janet Layne, Edoardo Serra, Mahantesh Halappanavar, Assefaw H. Gebremedhin:
ScaWL: Scaling k-WL (Weisfeiler-Lehman) Algorithms in Memory and Performance on Shared and Distributed-Memory Systems. 45:1-45:25
Volume 22, Number 2, June 2025
- Yiming Wang, Weizhe Zhang, Meng Hao, Weizhi Kong, Yuan Wen:
Dynamic Power Management Through Multi-agent Deep Reinforcement Learning for Heterogeneous Systems. 46:1-46:23
- Xinyuan Wang, Xingchen Li, Yun Peng, Hejiao Huang:
Comprehensive Evaluation and Opportunity Discovery for Deterministic Concurrency Control. 47:1-47:24
- Théophile Bastian, Hugo Pompougnac, Alban Dutilleul, Fabrice Rastello:
CesASMe and Staticdeps: static detection of memory-carried dependencies for code analyzers. 48:1-48:23
- Fuyu Wang, Minghua Shen, Yutong Lu, Nong Xiao:
Ceiba: An Efficient and Scalable DNN Scheduler for Spatial Accelerators. 49:1-49:23
- Kelun Lei, Shaokang Du, Xin You, Hailong Yang, Zhongzhi Luan, Yi Liu, Depei Qian:
Exploiting Dynamic Regular Patterns in Irregular Programs for Efficient Vectorization. 50:1-50:25
- Xueying Wang, Shigang Li, Hao Qian, Fan Luo, Zhaoyang Hao, Tong Wu, Ruiyuan Xu, Huimin Cui, Xiaobing Feng, Guangli Li, Jingling Xue:
OptiFX: Automatic Optimization for Convolutional Neural Networks with Aggressive Operator Fusion on GPUs. 51:1-51:27
- Yifu He, Han Zhao, Weihao Cui, Shulai Zhang, Quan Chen, Minyi Guo:
ARACHNE: Optimizing Distributed Parallel Applications with Reduced Inter-Process Communication. 52:1-52:26
- Kailin Yang, José F. Martínez:
VersaTile: Flexible Tiled Architectures via Associative Processors. 53:1-53:26
- Changqing Shi, Yufei Sun, Rui Chen, Jiahao Wang, Qiang Guo, Chunye Gong, Yicheng Sui, Yutong Jin, Yuzhi Zhang:
TransCL: An Automatic CUDA-to-OpenCL Programs Transformation Framework. 54:1-54:24
- Zhibo Xuan, Xin You, Tianyu Feng, Hailong Yang, Zhongzhi Luan, Yi Liu, Depei Qian:
SimTrace: Exploiting Spatial and Temporal Sampling for Large-Scale Performance Analysis. 55:1-55:26
- Congyong Chen, Shengan Zheng, Yuhang Zhang, Linpeng Huang:
FusionFS: A Contention-Resilient File System for Persistent CPU Caches. 56:1-56:23
- Jingcheng Shen, Lang Yang, Linbo Long, Zhenhua Tan, Congming Gao, Kan Zhong, Masao Okita, Fumihiko Ino:
Overlapping Aware Data Placement Optimizations for LSM Tree-Based Store on ZNS SSDs. 57:1-57:25
- Minghua Shen, Aoxiang Qin, Nong Xiao:
ODGS: Dependency-Aware Scheduling for High-Level Synthesis with Graph Neural Network and Reinforcement Learning. 58:1-58:25
- Gaoyang Zhao, Qiuran Li, Rongzhen Lin, Yaohua Wang:
Shift-CIM: In-SRAM Alignment To Support General-Purpose Bit-level Sparsity Exploration in SRAM Multiplication. 59:1-59:25
- Xin Cheng, Jinpeng Ye, Haoyu Deng, Tingting Zhang, Tianyi Liu, Jian Wang:
LitTLS: Lightweight Thread-Level Speculation on Little Cores. 60:1-60:27
- Chaoyang Jia, Jingyu Liu, Shi Chen, Kai Lu, Li Shen:
TSN Cache: Exploiting Data Localities in Graph Computing Applications. 61:1-61:26
- Shantian Qin, Zhihua Fan, Wenming Li, Zhen Wang, Xuejun An, Xiaochun Ye, Dongrui Fan:
PANDA: Adaptive Prefetching and Decentralized Scheduling for Dataflow Architectures. 62:1-62:27
- Yu Tang, Lujia Yin, Qiao Li, Hongyu Zhu, Hengjie Li, Xingcheng Zhang, Linbo Qiao, Dongsheng Li, Jiaxin Li:
Koala: Efficient Pipeline Training through Automated Schedule Searching on Domain-Specific Language. 63:1-63:25
- Yuting Li, Yun Xu, Pengcheng Wang, Yonghui Xu, Weiguang Wang:
A Lock-free RDMA-friendly Index in CPU-parsimonious Environments. 64:1-64:25
- Xueliang Wei, Dan Feng, Wei Tong, Bing Wu, Xu Jiang:
SEED: Speculative Security Metadata Updates for Low-Latency Secure Memory. 65:1-65:27
- Xiaobo Lu, Jianbin Fang, Lin Peng, Chun Huang, Zixiao Yu, Tiejun Li:
Gator: Accelerating Graph Attention Networks by Jointly Optimizing Attention and Graph Processing. 66:1-66:24
- Yacine Hakimi, Riyadh Baghdadi, Yacine Challal:
Supporting Dynamic Program Sizes in Deep Learning-Based Cost Models for Code Optimization. 67:1-67:25
- Yicheng Wang, Lijie Xu, Tian Guo, Wensheng Dou, Hongbin Zeng, Wei Wang, Jun Wei, Tao Huang:
BridgeGC: An Efficient Cross-Level Garbage Collector for Big Data Frameworks. 68:1-68:27
- Zhen Du, Ying Liu, Ninghui Sun, Huimin Cui, Xiaobing Feng, Jiajia Li:
SRSparse: Generating Codes for High-Performance Sparse Matrix-Vector Semiring Computations. 69:1-69:26
- Chenji Han, Zifei Zhang, Feng Xue, Xinyu Li, Yuxuan Wu, Tingting Zhang, Tianyi Liu, Qi Guo, Fuxin Zhang:
SnsBooster: Enhancing Sampling-based μArch Evaluation Efficiency through Online Performance Sensitivity Analysis. 70:1-70:27
- Amit Tiwari, V. Krishna Nandivada:
Unleashing Parallelism with Elastic-Barriers. 71:1-71:26
- Gia Bao Thieu, Sven Gesper, Guillermo Payá-Vayá:
DCMA: Accelerating Parallel DMA Transfers with a Multi-Port Direct Cached Memory Access in a Massive-Parallel Vector Processor. 72:1-72:25
- Aurélie Saulquin, Mazdak Fatahi, Pierre Boulet, Samy Meftali:
ModNEF : An Open Source Modular Neuromorphic Emulator for FPGA for Low-Power In-Edge Artificial Intelligence. 73:1-73:24
- Zhengding Hu, Jingwei Sun, Guangzhong Sun:
GNNPilot: A Holistic Framework for High-Performance Graph Neural Network Computations on GPUs. 74:1-74:26
- Jinghao Zhao, Hongwei Yang, Meng Hao, Weizhe Zhang, Hui He, Desheng Wang:
HEngine: A High Performance Optimization Framework on a GPU for Homomorphic Encryption. 75:1-75:23
- Wen Cheng, Qianya Cheng, Yi Liu, Lingfang Zeng, André Brinkmann, Yang Wang:
9Ring: A 3D-Stacked Memory-Based Accelerator for Flexible and Efficient Deep CNN Applications. 76:1-76:26
- Cunchen Hu, Heyang Huang, Liangliang Xu, Xusheng Chen, Chenxi Wang, Jiang Xu, Shuang Chen, Hao Feng, Sa Wang, Yungang Bao, Ninghui Sun, Yizhou Shan:
ShuffleInfer: Disaggregate LLM Inference for Mixed Downstream Workloads. 77:1-77:24
- Suchita Pati, Shaizeen Aga, Nuwan Jayasena, Matthew D. Sinclair:
GOLDYLOC: Global Optimizations & Lightweight Dynamic Logic for Concurrency. 78:1-78:28
- Yi Zhang, Xiaomeng Yi, Yu Huang, Jingrui Yuan, Chuangyi Gui, Dan Chen, Long Zheng, Jianhui Yue, Xiaofei Liao, Hai Jin, Jingling Xue:
Cheetah: Accelerating Dynamic Graph Mining with Grouping Updates. 79:1-79:26
- Manolis Katsaragakis, Christos Baloukas, Lazaros Papadopoulos, Francky Catthoor, Dimitrios Soudris:
Performance, Energy and NVM Lifetime-Aware Data Structure Refinement and Placement for Heterogeneous Memory Systems. 80:1-80:27
- Farui Wang, Meng Hao, Siyu Yang, Weizhe Zhang:
Deep Learning Workload Mapping Optimization on Jetson Platforms. 81:1-81:23
- Wenlong Mu, Yue Tang, Bo Huang, Jianmei Guo:
AOBO: A Fast-Switching Online Binary Optimizer on AArch64. 82:1-82:27
