


30th ICS 2016: Istanbul, Turkey
- Ozcan Ozturk, Kemal Ebcioglu, Mahmut T. Kandemir, Onur Mutlu:
Proceedings of the 2016 International Conference on Supercomputing, ICS 2016, Istanbul, Turkey, June 1-3, 2016. ACM 2016, ISBN 978-1-4503-4361-9
Heterogeneous Systems
- Tobias Grosser, Torsten Hoefler:
Polly-ACC Transparent compilation to heterogeneous hardware. 1:1-1:13
- Jianqiao Liu, Nikhil Hegde, Milind Kulkarni:
Hybrid CPU-GPU scheduling and execution of tree traversals. 2:1-2:12
- Siddharth Rai, Mainak Chaudhuri:
Exploiting Dynamic Reuse Probability to Manage Shared Last-level Caches in CPU-GPU Heterogeneous Processors. 3:1-3:14
Power, Energy, Variation
- Haris Ribic, Yu David Liu:
AEQUITAS: Coordinated Energy Management Across Parallel Applications. 4:1-4:12
- Dimitrios Chasapis, Marc Casas, Miquel Moretó, Martin Schulz, Eduard Ayguadé, Jesús Labarta, Mateo Valero:
Runtime-Guided Mitigation of Manufacturing Variability in Power-Constrained Multi-Socket NUMA Nodes. 5:1-5:12
- Bilge Acun, Phil Miller, Laxmikant V. Kalé:
Variation Among Processors Under Turbo Boost in HPC Systems. 6:1-6:12
NVMs & Persistent Memory
- David Fiala, Frank Mueller, Kurt B. Ferreira, Christian Engelmann:
Mini-Ckpts: Surviving OS Failures in Persistent Memory. 7:1-7:14
- Nusrat Sharmin Islam, Md. Wasi-ur-Rahman, Xiaoyi Lu, Dhabaleswar K. Panda:
High Performance Design for HDFS with Byte-Addressability of NVM and RDMA. 8:1-8:14
- Amro Awad, Sergey Blagodurov, Yan Solihin:
Write-Aware Management of NVM-based Memory Extensions. 9:1-9:12
Data Centers
- Yang Hu, Chao Li, Longjun Liu, Tao Li:
HOPE: Enabling Efficient Service Orchestration in Software-Defined Data Centers. 10:1-10:12
- Longjun Liu, Hongbin Sun, Chao Li, Yang Hu, Nanning Zheng, Tao Li:
Towards an Adaptive Multi-Power-Source Datacenter. 11:1-11:11
- Xu Zhou, Haoran Cai, Qiang Cao, Hong Jiang, Lei Tian, Changsheng Xie:
GreenGear: Leveraging and Managing Server Heterogeneity for Improving Energy Efficiency in Green Data Centers. 12:1-12:14
- Hameedah Sultan, Arpit Katiyar, Smruti R. Sarangi:
Noise Aware Scheduling in Data Centers. 13:1-13:14
GPUs and SIMD
- Guoyang Chen, Xipeng Shen:
Coherence-Free Multiview: Enabling Reference-Discerning Data Placement on GPU. 14:1-14:13
- Ang Li, Shuaiwen Leon Song, Mark Wijtvliet, Akash Kumar, Henk Corporaal:
SFU-Driven Transparent Approximation Acceleration on GPUs. 15:1-15:14
- Peng Jiang, Linchuan Chen, Gagan Agrawal:
Reusing Data Reorganization for Efficient SIMD Parallelization of Adaptive Irregular Applications. 16:1-16:10
Communication and Coherence
- Xuehai Qian, Koushik Sen, Paul Hargrove, Costin Iancu:
SReplay: Deterministic Sub-Group Replay for One-Sided Communication. 17:1-17:13
- Konstantina Mitropoulou, Vasileios Porpodas, Xiaochun Zhang, Timothy M. Jones:
Lynx: Using OS and Hardware Support for Fast Fine-Grained Inter-Core Communication. 18:1-18:12
- Yuan Yao, Guanhua Wang, Zhiguo Ge, Tulika Mitra, Wenzhi Chen, Naxin Zhang:
Efficient Timestamp-Based Cache Coherence Protocol for Many-Core Architectures. 19:1-19:13
Tools and Libraries
- Linnan Wang, Wei Wu, Zenglin Xu, Jianxiong Xiao, Yi Yang:
BLASX: A High Performance Level-3 BLAS Library for Heterogeneous Multi-GPU Computing. 20:1-20:11
- Snehasish Kumar, Vijayalakshmi Srinivasan, Amirali Sharifian, Nick Sumner, Arrvindh Shriraman:
Peruse and Profit: Estimating the Accelerability of Loops. 21:1-21:13
- Nadav Chachmon, Daniel Richins, Robert S. Cohn, Magnus Christensson, Wenzhi Cui, Vijay Janapa Reddi:
Simulation and Analysis Engine for Scale-Out Workloads. 22:1-22:13
Potpourri
- Patrick Judd, Jorge Albericio, Tayler H. Hetherington, Tor M. Aamodt, Natalie D. Enright Jerger, Andreas Moshovos:
Proteus: Exploiting Numerical Precision Variability in Deep Neural Networks. 23:1-23:12
- Fei Lei, Dezun Dong, Xiangke Liao, Xing Su, Cunlu Li:
Galaxyfly: A Novel Family of Flexible-Radix Low-Diameter Topologies for Large-Scales Interconnection Networks. 24:1-24:12
- Zhiying Li, Ruini Xue, Lixiang Ao:
Replichard: Towards Tradeoff between Consistency and Performance for Metadata. 25:1-25:11
Memory
- Albert Esteve, Alberto Ros, Antonio Robles, María Engracia Gómez, José Duato:
TokenTLB: A Token-Based Page Classification Approach. 26:1-26:13
- Emilio G. Cota, Paolo Mantovani, Luca P. Carloni:
Exploiting Private Local Memories to Reduce the Opportunity Cost of Accelerator Integration. 27:1-27:12
- Suzhen Wu, Yanping Lin, Bo Mao, Hong Jiang:
GCaR: Garbage Collection aware Cache Management with Improved Performance for Flash-based SSDs. 28:1-28:12
Scheduling
- Changdae Kim, Jaehyuk Huh:
Fairness-oriented OS Scheduling Support for Multicore Systems. 29:1-29:12
- Yunlong Xu, Rui Wang, Tao Li, Mingcong Song, Lan Gao, Zhongzhi Luan, Depei Qian:
Scheduling Tasks with Mixed Timing Constraints in GPU-Powered Real-Time Systems. 30:1-30:13
- Mehmet E. Belviranli, Farzad Khorasani, Laxmi N. Bhuyan, Rajiv Gupta:
CuMAS: Data Transfer Aware Multi-Application Scheduling for Shared GPUs. 31:1-31:12
Parallelism Issues
- Saeed Maleki, Donald Nguyen, Andrew Lenharth, María Jesús Garzarán, David A. Padua, Keshav Pingali:
DSMR: A Parallel Algorithm for Single-Source Shortest Path Problem. 32:1-32:14
- Hao Wang, Weifeng Liu, Kaixi Hou, Wu-chun Feng:
Parallel Transposition of Sparse Data Structures. 33:1-33:13
- Kanak Mahadik, Christopher Wright, Jinyi Zhang, Milind Kulkarni, Saurabh Bagchi, Somali Chaterji:
SARVAVID: A Domain Specific Language for Developing Scalable Computational Genomics Applications. 34:1-34:12
Multiplication
- Eli Ben-Sasson, Matan Hamilis, Mark Silberstein, Eran Tromer:
Fast Multiplication in Binary Fields on GPUs via Register Cache. 35:1-35:12
- Pham Nguyen Quang Anh, Rui Fan, Yonggang Wen:
Balanced Hashing and Efficient GPU Sparse General Matrix-Matrix Multiplication. 36:1-36:12
- Daniele Buono, Fabrizio Petrini, Fabio Checconi, Xing Liu, Xinyu Que, Chris Long, Tai-Ching Tuan:
Optimizing Sparse Matrix-Vector Multiplication for Large-Scale Data Analytics. 37:1-37:12
Prefetching
- Sanyam Mehta, Rajat Garg, Nishad Trivedi, Pen-Chung Yew:
TurboTiling: Leveraging Prefetching to Boost Performance of Tiled Codes. 38:1-38:12
- Sam Ainsworth, Timothy M. Jones:
Graph Prefetching Using Data Structure Knowledge. 39:1-39:11
- Reena Panda, Yasuko Eckert, Nuwan Jayasena, Onur Kayiran, Michael Boyer, Lizy Kurian John:
Prefetching Techniques for Near-memory Throughput Processors. 40:1-40:14
GPU Architecture
- Mohammad Abdel-Majeed, Daniel Wong, Justin Kuang, Murali Annavaram:
Origami: Folding Warps for Energy Efficient GPUs. 41:1-41:12
- Yuxi Liu, Zhibin Yu, Lieven Eeckhout, Vijay Janapa Reddi, Yingwei Luo, Xiaolin Wang, Zhenlin Wang, Cheng-Zhong Xu:
Barrier-Aware Warp Scheduling for Throughput Processors. 42:1-42:12
- Lingda Li, Ari B. Hayes, Shuaiwen Leon Song, Eddy Z. Zhang:
Tag-Split Cache for Efficient GPGPU Cache Utilization. 43:1-43:12















