


ACM Transactions on Architecture and Code Optimization, Volume 22
Volume 22, Number 1, March 2025
- Perry Gibson, José Cano, Elliot Crowley, Amos J. Storkey, Michael F. P. O'Boyle: DLAS: A Conceptual Model for Across-Stack Deep Learning Acceleration. 1:1-1:28
- Xinbiao Gan: GraphService: Topology-aware Constructor for Large-scale Graph Applications. 2:1-2:24
- Renjun Zhang, Tianming Zhang, Zinuo Cai, Dongmei Li, Ruhui Ma, Rajkumar Buyya: MemoriaNova: Optimizing Memory-Aware Model Inference for Edge Computing. 3:1-3:25
- Andrea Lepori, Alexandru Calotoiu, Torsten Hoefler: Iterating Pointers: Enabling Static Analysis for Loop-based Pointers. 4:1-4:25
- Viktor Razilov, Ipek Geçin, Emil Matús, Gerhard P. Fettweis: Conflict Management in Vector Register Files. 5:1-5:19
- Jingle Xu, Jiayu Fu, Lin Gan, Yaojian Chen, Zhaoqi Sun, Zhenchun Huang, Guangwen Yang: Leveraging the Hardware Resources to Accelerate cryo-EM Reconstruction of RELION on the New Sunway Supercomputer. 6:1-6:25
- Yuta Saito, Kazunori Sakamoto, Hironori Washizaki, Yoshiaki Fukazawa: Multiple Function Merging for Code Size Reduction. 7:1-7:26
- Peihua Zhang, Chenggang Wu, Hanzhi Hu, Lichen Jia, Mingfan Peng, Jiali Xu, Mengyao Xie, Yuanming Lai, Yan Kang, Zhe Wang: Shining Light on the Inter-procedural Code Obfuscation: Keep Pace with Progress in Binary Diffing. 8:1-8:28
- Dengke Han, Mingyu Yan, Xiaochun Ye, Dongrui Fan: Characterizing and Understanding HGNN Training on GPUs. 9:1-9:25
- Jingyu Wang, Ruilong Ma, Xiang Yang, Qi Qi, Zirui Zhuang, Jing Wang, Jianxin Liao, Song Guo: DeepZoning: Re-accelerate CNN Inference with Zoning Graph for Heterogeneous Edge Cluster. 10:1-10:26
- Chenghao Ouyang, Jinhan Xin, Siqi Zeng, Guohui Li, Jianjun Li, Zhibin Yu: Constructing a Supplementary Benchmark Suite to Represent Android Applications with User Interactions by using Performance Counters. 11:1-11:28
- Xinglei Dou, Lei Liu, Limin Xiao: An Intelligent Scheduling Approach on Mobile OS for Optimizing UI Smoothness and Power. 12:1-12:27
- Kwanghoon Choi, Igjae Kim, Sunho Lee, Jaehyuk Huh: ShieldCXL: A Practical Obliviousness Support with Sealed CXL Memory. 13:1-13:25
- Yun Chen, Ali Hajiabadi, Romain Poussier, Yaswanth Tavva, Andreas Diavastos, Shivam Bhasin, Trevor E. Carlson: PARADISE: Criticality-Aware Instruction Reordering for Power Attack Resistance. 14:1-14:25
- Chunfeng Li, Feng Shi, Fei Yin, Karim Soliman, Jin Wei: A High Scalability Memory NoC with Shared-Inside Hierarchical-Groupings for Triplet-Based Many-Core Architecture. 15:1-15:26
- Jin Zhao, Yu Zhang, Donghao He, Qikun Li, Weihang Yin, Hui Yu, Hao Qi, Xiaofei Liao, Hai Jin, Haikun Liu, Linchen Yu, Zhan Zhang: An Efficient ReRAM-based Accelerator for Asynchronous Iterative Graph Processing. 16:1-16:26
- Xinyu Li, Guangyao Guo, Yanzhi Lan, Feng Xue, Chenji Han, Gen Niu, Fuxin Zhang: Tiaozhuan: A General and Efficient Indirect Branch Optimization for Binary Translation. 17:1-17:26
- Jianhua Gao, Zeming Liu, Yizhuo Wang, Weixing Ji: RaNAS: Resource-Aware Neural Architecture Search for Edge Computing. 18:1-18:18
- Adnan Hasnat, Shoaib Akram: SPIRIT: Scalable and Persistent In-Memory Indices for Real-Time Search. 19:1-19:26
- Dezhong Yao, Sifan Zhao, Tongtong Liu, Gang Wu, Hai Jin: ApSpGEMM: Accelerating Large-scale SpGEMM with Heterogeneous Collaboration and Adaptive Panel. 20:1-20:23
- Weiduo Chen, Xiaoshe Dong, Fan Zhang, Bowen Li, Yufei Wang, Qiang Wang: ATP: Achieving Throughput Peak for DNN Training via Smart GPU Memory Management. 21:1-21:27
- Zhuoran Song, Jiabei Long, Li Jiang, Naifeng Jing, Xiaoyao Liang: GCNTrain+: A Versatile and Efficient Accelerator for Graph Convolutional Neural Network Training. 22:1-22:22
- Wenjie Qi, Zhipeng Tan, Ziyue Zhang, Ying Yuan, Dan Feng: exZNS: Extending Zoned Namespace to Support Byte-loggable Zones. 23:1-23:28
- Long Zheng, Bing Zhu, Pengcheng Yao, Yuhang Zhou, Chengao Pan, Wenju Zhao, Xiaofei Liao, Hai Jin, Jingling Xue: PRAGA: A Priority-Aware Hardware/Software Co-design for High-Throughput Graph Processing Acceleration. 24:1-24:27
- Yingshuai Dong, Chencheng Ye, Haikun Liu, Liting Tang, Xiaofei Liao, Hai Jin, Cheng Chen, Yanjiang Li, Yi Wang: DTAP: Accelerating Strongly-Typed Programs with Data Type-Aware Hardware Prefetching. 25:1-25:25
- Xueliang Wei, Dan Feng, Wei Tong, Bing Wu, Xu Jiang: COVER: Alleviating Crash-Consistency Error Amplification in Secure Persistent Memory Systems. 26:1-26:27
- Xinqi Chen, Erci Xu, Dengyao Mo, Ruiming Lu, Haonan Wu, Dian Ding, Guangtao Xue: MasterPlan: A Reinforcement Learning Based Scheduler for Archive Storage. 27:1-27:25
- Brandon Kammerdiener, J. Zach McMichael, Michael R. Jantz, Kshitij A. Doshi, Terry R. Jones: Flexible and Effective Object Tiering for Heterogeneous Memory Systems. 28:1-28:24
- Zhiqiang Chen, Yongwen Wang, Hongwei Zhou, Jian Zhang: Steered Bubble: An Interposer-based Deadlock Recovery Algorithm for Multi-chiplet Systems. 29:1-29:24
- Shruthi Karunakar, Rajshekar Kalayappan, Sandeep Chandran: Consequence-based Clustered Architecture. 30:1-30:25
- Jiahui Yang, Fulin Nan, Zhirong Shen, Zhisheng Chen, Yuhui Cai, Dmitrii I. Kaplun, Xiaoli Wang, Quanqing Xu, Chuanhui Yang, Jiwu Shu: TPRepair: Tree-based Pipelined Repair in Clustered Storage Systems. 31:1-31:25
- Jianrong Yan, Wenbin Jiang, Dongao He, Suyang Wen, Yang Li, Hai Jin, Zhiyuan Shao: RT-GNN: Accelerating Sparse Graph Neural Networks by Tensor-CUDA Kernel Fusion. 32:1-32:27
- Yi Dai, Kai Lu, Sheng Ma, Jinshu Su, Dongsheng Li: Bubble-Swap Flow Control. 33:1-33:26
- Dongjie Tang, Zijun Wu, Yun Wang, Yicheng Gu, Fangxin Liu, Zhengwei Qi: gCom: Fine-grained Compressors in Graphics Memory of Mobile GPU. 34:1-34:25
- Ruixing Zong, Jiapeng Zhang, Zhuo Tang, Kenli Li: IBing: An Efficient Interleaved Bidirectional Ring All-Reduce Algorithm for Gradient Synchronization. 35:1-35:23
- Quancheng Wang, Ming Tang, Ke Xu, Han Wang: Unveiling and Evaluating Vulnerabilities in Branch Predictors via a Three-Step Modeling Methodology. 36:1-36:26
- Pengyu Yang, Weihao Cui, Chunyu Xue, Han Zhao, Chen Chen, Quan Chen, Jing Yang, Minyi Guo: Taming Flexible Job Packing in Deep Learning Training Clusters. 37:1-37:24
- Zhenlin Wu, Haosong Zhao, Hongyuan Liu, Wujie Wen, Jiajia Li: gHyPart: GPU-friendly End-to-End Hypergraph Partitioner. 38:1-38:25
- Mariano Benito, Enrique Vallejo, Ramón Beivide: LIA: Latency-Improved Adaptive routing for Dragonfly networks. 39:1-39:26
- Yiming Gan, Jingwen Leng, Bo Yu, Yuhao Zhu: KINDRED: Heterogeneous Split-Lock Architecture for Safe Autonomous Machines. 40:1-40:25
- Tzung-Han Juang, Christophe Dubach: Maximizing Data and Hardware Reuse for HLS with Early-Stage Symbolic Partitioning. 41:1-41:26
- Cheng Xu, Chao Li, Xiaofeng Hou, Junyi Mei, Jing Wang, Pengyu Wang, Shixuan Sun, Minyi Guo, Baoping Hao: Enhancing High-Throughput GPU Random Walks Through Multi-Task Concurrency Orchestration. 42:1-42:26
- Qiong Chang, Weimin Wang, Jun Miyazaki: Accelerating Nearest Neighbor Search in 3D Point Cloud Registration on GPUs. 43:1-43:24
- Yekang Zhan, Xiangrui Yang, Haichuan Hu, Qiang Cao, Yifan Zhang, Jie Yao: AIS: An Active Idleness I/O Scheduler to Reduce Buffer-Exhausted Degradation of Solid-State Drives. 44:1-44:26
- Coby Soss, Aravind Sukumaran-Rajam, Janet Layne, Edoardo Serra, Mahantesh Halappanavar, Assefaw H. Gebremedhin: ScaWL: Scaling k-WL (Weisfeiler-Lehman) Algorithms in Memory and Performance on Shared and Distributed-Memory Systems. 45:1-45:25
