


29th PACT 2020: Virtual Event, GA, USA
- Vivek Sarkar, Hyesoon Kim:

PACT '20: International Conference on Parallel Architectures and Compilation Techniques, Virtual Event, GA, USA, October 3-7, 2020. ACM 2020, ISBN 978-1-4503-8075-1
Keynote I
- Rick Stevens:

Overview of HPC and AI Computing for COVID-19 in the US. 1
Session 1: Optimizations for GPUs
- Jiannan Tian, Sheng Di, Kai Zhao, Cody Rivera, Megan Hickman Fulp, Robert Underwood, Sian Jin, Xin Liang, Jon Calhoun, Dingwen Tao, Franck Cappello:

cuSZ: An Efficient GPU-Based Error-Bounded Lossy Compression Framework for Scientific Data. 3-15
- Kishore Punniyamurthy, Andreas Gerstlauer:

TAFE: Thread Address Footprint Estimation for Capturing Data/Thread Locality in GPU Systems. 17-29
- Ziheng Wang:

SparseRT: Accelerating Unstructured Sparsity on GPUs for Deep Learning Inference. 31-42
- Chanyoung Oh, Zhen Zheng, Xipeng Shen, Jidong Zhai, Youngmin Yi:

GOPipe: A Granularity-Oblivious Programming Framework for Pipelined Stencil Executions on GPU. 43-54
- Changwan Hong, Laxman Dhulipala, Julian Shun:

Exploring the Design Space of Static and Incremental Graph Connectivity Algorithms on GPUs. 55-69
Session 2: Compiler Optimization and Code Generation
- Bastian Hagedorn, Archibald Samuel Elliott, Henrik Barthels, Rastislav Bodík, Vinod Grover:

Fireiron: A Data-Movement-Aware Scheduling Language for GPUs. 71-82
- Lorenzo Chelini, Tobias Gysi, Tobias Grosser, Martin Kong, Henk Corporaal:

Automatic Generation of Multi-Objective Polyhedral Compiler Transformations. 83-96
- Mingchuan Wu, Ying Liu, Huimin Cui, Qingfu Wei, Quanfeng Li, Limin Li, Fang Lv, Jingling Xue, Xiaobing Feng:

Bandwidth-Aware Loop Tiling for DMA-Supported Scratchpad Memory. 97-109
- Guixin Ye, Zhanyong Tang, Huanting Wang, Dingyi Fang, Jianbin Fang, Songfang Huang, Zheng Wang:

Deep Program Structure Modeling Through Multi-Relational Graph-based Learning. 111-123
- Linjian Ma, Jiayu Ye, Edgar Solomonik:

AutoHOOT: Automatic High-Order Optimization for Tensors. 125-137
- Tanzima Sultana, Blake Allen, Apan Qasem:

Intelligent Data Placement on Discrete GPU Nodes with Unified Memory. 139-151
Poster Session I
- Ruobing Chen, Jinping Wu, Haosen Shi, Yusen Li, Haiyan Yin, Shanjiang Tang, Xiaoguang Liu, Gang Wang:

Deep Learning Assisted Resource Partitioning for Improving Performance on Commodity Servers. 153-154
- Bokyeong Kim, Soojin Hwang, Sanghoon Cha, Chang Hyun Park, Jongse Park, Jaehyuk Huh:

Decoupled Address Translation for Heterogeneous Memory Systems. 155-156
- Jiho Kim, Sanghun Cho, Minsoo Rhu, Ali Bakhoda, Tor M. Aamodt, John Kim:

Bandwidth Bottleneck in Network-on-Chip for High-Throughput Processors. 157-158
Keynote II
- Sarita V. Adve:

Scalable Specialization: Architectures, Interfaces, & Applications. 159
Session 3: Parallel Architectures
- Mohamed Assem Ibrahim, Onur Kayiran, Yasuko Eckert, Gabriel H. Loh, Adwait Jog:

Analyzing and Leveraging Shared L1 Caches in GPUs. 161-173
- Subhankar Pal, Siying Feng, Dong-Hyeon Park, Sung Kim, Aporva Amarnath, Chi-Sheng Yang, Xin He, Jonathan Beaumont, Kyle May, Yan Xiong, Kuba Kaszyk, John Magnus Morton, Jiawen Sun, Michael F. P. O'Boyle, Murray Cole, Chaitali Chakrabarti, David T. Blaauw, Hun-Seok Kim, Trevor N. Mudge, Ronald G. Dreslinski:

Transmuter: Bridging the Efficiency Gap using Memory and Dataflow Reconfiguration. 175-190
- Xulong Tang, Ziyu Zhang, Weizheng Xu, Mahmut Taylan Kandemir, Rami G. Melhem, Jun Yang:

Enhancing Address Translations in Throughput Processors via Compression. 191-204
- Sawan Singh, Alexandra Jimborean, Alberto Ros:

Regional Out-of-Order Writes in Total Store Order. 205-216
- Stuart Byma, Akash Balasaheb Dhasade, Adrian M. Altenhoff, Christophe Dessimoz, James R. Larus:

Parallel and Scalable Precise Clustering. 217-228
Session 4: Hardware/software for Security & Machine Learning
- Omais Shafi, Janibul Bashir:

SecSched: Flexible Scheduling in Secure Processors. 229-240
- Kim-Anh Tran, Christos Sakalis, Magnus Själander, Alberto Ros, Stefanos Kaxiras, Alexandra Jimborean:

Clearing the Shadows: Recovering Lost Performance for Invisible Speculative Execution through HW/SW Co-Design. 241-254
- Yulin Zhang, Xiaoming Li:

Fast Convolutional Neural Networks with Fine-Grained FFTs. 255-265
- Masuma Akter Rumi, Xiaolong Ma, Yanzhi Wang, Peng Jiang:

Accelerating Sparse CNN Inference on GPUs with Performance-Aware Weight Pruning. 267-278
- Zhangxiaowen Gong, Houxiang Ji, Christopher W. Fletcher, Christopher J. Hughes, Josep Torrellas:

SparseTrain: Leveraging Dynamic Sparsity in Software for Training DNNs on General-Purpose SIMD Processors. 279-292
Session 5: Best Paper
- Qian Lou, Sarath Chandra Janga, Lei Jiang:

Helix: Algorithm/Architecture Co-design for Accelerating Nanopore Genome Base-calling. 293-304
- Saurabh Gupta, Niranjan Soundararajan, Ragavendra Natarajan, Sreenivas Subramoney:

Opportunistic Early Pipeline Re-steering for Data-dependent Branches. 305-316
- Abhinav Jangda, Arjun Guha:

Model-Based Warp Overlapped Tiling for Image Processing Programs on GPUs. 317-328
- Yiming Gan, Yuxian Qiu, Lele Chen, Jingwen Leng, Yuhao Zhu:

Low-Latency Proactive Continuous Vision. 329-342
Poster Session II
- Mahmut T. Kandemir, Jihyun Ryoo, Hui Zhao, Myoungsoo Jung, Mustafa Karaköy:

Collective Affinity Aware Computation Mapping. 343-344
- Feng Yu, Jiacheng Zhao, Huimin Cui, Xiaobing Feng, Jingling Xue:

VTensor: Using Virtual Tensors to Build a Layout-oblivious AI Programming Framework. 345-346
- Roberto Castañeda Lozano, Murray Cole, Björn Franke:

Parallelizing Parallel Programs: A Dynamic Pattern Analysis for Modernization of Legacy Parallel Code. 347-348
- Xinglei Dou, Lei Liu:

A New Qubits Mapping Mechanism for Multi-programming Quantum Computing. 349-350
- Matthew Rodriguez, Ahmed Hassan, Michael F. Spear:

Exploiting Locality in Scalable Ordered Maps. 351-352
- Majed Valad Beigi, Bahareh Pourshirazi, Gokhan Memik, Zhichun Zhu:

DeepSwapper: A Deep Learning Based Page Swap Management Scheme for Hybrid Memory Systems. 353-354
- Tiago T. Jost, Yves Durand, Christian Fabre, Albert Cohen, Frédéric Pétrot:

VP Float: First Class Treatment for Variable Precision Floating Point Arithmetic. 355-356
- Vignesh Adhinarayanan, Wu-chun Feng:

Approximate Pattern Matching for On-Chip Interconnect Traffic Prediction. 357-358
Keynote III
- Bradford L. Chamberlain:

Compiling Chapel: Keys to Making Parallel Programming Productive at Scale. 359
Session 6: Domain/Application-Specific Hardware/Software
- Kartik Lakshminarasimhan, Ajeya Naithani, Josué Feliu, Lieven Eeckhout:

The Forward Slice Core Microarchitecture. 361-372
- Yan Pei, Swarnendu Biswas, Donald S. Fussell, Keshav Pingali:

A Methodology for Principled Approximation in Visual SLAM. 373-386
- Jonathan M. Baker, David I. Schuster, Frederic T. Chong:

Memory-Equipped Quantum Architectures: The Power of Random Access. 387-398
- Soroush Ghodrati, Hardik Sharma, Sean Kinzer, Amir Yazdanbakhsh, Jongse Park, Nam Sung Kim, Doug Burger, Hadi Esmaeilzadeh:

Mixed-Signal Charge-Domain Acceleration of Deep Neural Networks through Interleaved Bit-Partitioned Arithmetic. 399-411
- Mohammad Alaul Haque Monil, Mehmet E. Belviranli, Seyong Lee, Jeffrey S. Vetter, Allen D. Malony:

MEPHESTO: Modeling Energy-Performance in Heterogeneous SoCs and Their Trade-Offs. 413-425
Session 7: Memory/Storage Systems
- Kai Wu, Ivy Bo Peng, Jie Ren, Dong Li:

Ribbon: High Performance Cache Line Flushing for Persistent Memory. 427-439
- Rachata Ausavarungnirun, Timothy Merrifield, Jayneel Gandhi, Christopher J. Rossbach:

PRISM: Architectural Support for Variable-granularity Memory Metadata. 441-454
- Trinayan Baruah, Yifan Sun, Saiful A. Mojumder, José L. Abellán, Yash Ukidave, Ajay Joshi, Norman Rubin, John Kim, David R. Kaeli:

Valkyrie: Leveraging Inter-TLB Locality to Enhance GPU Performance. 455-466
- Changyeon Jo, Hyunik Kim, Hexiang Geng, Bernhard Egger:

RackMem: A Tailored Caching Layer for Rack Scale Computing. 467-480
- Harsh Gugale, Nagendra Gulur, Yashwant Marathe, Lizy K. John:

ATTC (@C): Addressable-TLB based Translation Coherence. 481-492















