30th ICS 2016: Istanbul, Turkey
- Ozcan Ozturk, Kemal Ebcioglu, Mahmut T. Kandemir, Onur Mutlu:
Proceedings of the 2016 International Conference on Supercomputing, ICS 2016, Istanbul, Turkey, June 1-3, 2016. ACM 2016, ISBN 978-1-4503-4361-9
Heterogeneous Systems
- Tobias Grosser, Torsten Hoefler:
Polly-ACC Transparent compilation to heterogeneous hardware. 1:1-1:13
- Jianqiao Liu, Nikhil Hegde, Milind Kulkarni:
Hybrid CPU-GPU scheduling and execution of tree traversals. 2:1-2:12
- Siddharth Rai, Mainak Chaudhuri:
Exploiting Dynamic Reuse Probability to Manage Shared Last-level Caches in CPU-GPU Heterogeneous Processors. 3:1-3:14
Power, Energy, Variation
- Haris Ribic, Yu David Liu:
AEQUITAS: Coordinated Energy Management Across Parallel Applications. 4:1-4:12
- Dimitrios Chasapis, Marc Casas, Miquel Moretó, Martin Schulz, Eduard Ayguadé, Jesús Labarta, Mateo Valero:
Runtime-Guided Mitigation of Manufacturing Variability in Power-Constrained Multi-Socket NUMA Nodes. 5:1-5:12
- Bilge Acun, Phil Miller, Laxmikant V. Kalé:
Variation Among Processors Under Turbo Boost in HPC Systems. 6:1-6:12
NVMs & Persistent Memory
- David Fiala, Frank Mueller, Kurt B. Ferreira, Christian Engelmann:
Mini-Ckpts: Surviving OS Failures in Persistent Memory. 7:1-7:14
- Nusrat Sharmin Islam, Md. Wasi-ur-Rahman, Xiaoyi Lu, Dhabaleswar K. Panda:
High Performance Design for HDFS with Byte-Addressability of NVM and RDMA. 8:1-8:14
- Amro Awad, Sergey Blagodurov, Yan Solihin:
Write-Aware Management of NVM-based Memory Extensions. 9:1-9:12
Data Centers
- Yang Hu, Chao Li, Longjun Liu, Tao Li:
HOPE: Enabling Efficient Service Orchestration in Software-Defined Data Centers. 10:1-10:12
- Longjun Liu, Hongbin Sun, Chao Li, Yang Hu, Nanning Zheng, Tao Li:
Towards an Adaptive Multi-Power-Source Datacenter. 11:1-11:11
- Xu Zhou, Haoran Cai, Qiang Cao, Hong Jiang, Lei Tian, Changsheng Xie:
GreenGear: Leveraging and Managing Server Heterogeneity for Improving Energy Efficiency in Green Data Centers. 12:1-12:14
- Hameedah Sultan, Arpit Katiyar, Smruti R. Sarangi:
Noise Aware Scheduling in Data Centers. 13:1-13:14
GPUs and SIMD
- Guoyang Chen, Xipeng Shen:
Coherence-Free Multiview: Enabling Reference-Discerning Data Placement on GPU. 14:1-14:13
- Ang Li, Shuaiwen Leon Song, Mark Wijtvliet, Akash Kumar, Henk Corporaal:
SFU-Driven Transparent Approximation Acceleration on GPUs. 15:1-15:14
- Peng Jiang, Linchuan Chen, Gagan Agrawal:
Reusing Data Reorganization for Efficient SIMD Parallelization of Adaptive Irregular Applications. 16:1-16:10
Communication and Coherence
- Xuehai Qian, Koushik Sen, Paul Hargrove, Costin Iancu:
SReplay: Deterministic Sub-Group Replay for One-Sided Communication. 17:1-17:13
- Konstantina Mitropoulou, Vasileios Porpodas, Xiaochun Zhang, Timothy M. Jones:
Lynx: Using OS and Hardware Support for Fast Fine-Grained Inter-Core Communication. 18:1-18:12
- Yuan Yao, Guanhua Wang, Zhiguo Ge, Tulika Mitra, Wenzhi Chen, Naxin Zhang:
Efficient Timestamp-Based Cache Coherence Protocol for Many-Core Architectures. 19:1-19:13
Tools and Libraries
- Linnan Wang, Wei Wu, Zenglin Xu, Jianxiong Xiao, Yi Yang:
BLASX: A High Performance Level-3 BLAS Library for Heterogeneous Multi-GPU Computing. 20:1-20:11
- Snehasish Kumar, Vijayalakshmi Srinivasan, Amirali Sharifian, Nick Sumner, Arrvindh Shriraman:
Peruse and Profit: Estimating the Accelerability of Loops. 21:1-21:13
- Nadav Chachmon, Daniel Richins, Robert S. Cohn, Magnus Christensson, Wenzhi Cui, Vijay Janapa Reddi:
Simulation and Analysis Engine for Scale-Out Workloads. 22:1-22:13
Potpourri
- Patrick Judd, Jorge Albericio, Tayler H. Hetherington, Tor M. Aamodt, Natalie D. Enright Jerger, Andreas Moshovos:
Proteus: Exploiting Numerical Precision Variability in Deep Neural Networks. 23:1-23:12
- Fei Lei, Dezun Dong, Xiangke Liao, Xing Su, Cunlu Li:
Galaxyfly: A Novel Family of Flexible-Radix Low-Diameter Topologies for Large-Scales Interconnection Networks. 24:1-24:12
- Zhiying Li, Ruini Xue, Lixiang Ao:
Replichard: Towards Tradeoff between Consistency and Performance for Metadata. 25:1-25:11
Memory
- Albert Esteve, Alberto Ros, Antonio Robles, María Engracia Gómez, José Duato:
TokenTLB: A Token-Based Page Classification Approach. 26:1-26:13
- Emilio G. Cota, Paolo Mantovani, Luca P. Carloni:
Exploiting Private Local Memories to Reduce the Opportunity Cost of Accelerator Integration. 27:1-27:12
- Suzhen Wu, Yanping Lin, Bo Mao, Hong Jiang:
GCaR: Garbage Collection aware Cache Management with Improved Performance for Flash-based SSDs. 28:1-28:12
Scheduling
- Changdae Kim, Jaehyuk Huh:
Fairness-oriented OS Scheduling Support for Multicore Systems. 29:1-29:12
- Yunlong Xu, Rui Wang, Tao Li, Mingcong Song, Lan Gao, Zhongzhi Luan, Depei Qian:
Scheduling Tasks with Mixed Timing Constraints in GPU-Powered Real-Time Systems. 30:1-30:13
- Mehmet E. Belviranli, Farzad Khorasani, Laxmi N. Bhuyan, Rajiv Gupta:
CuMAS: Data Transfer Aware Multi-Application Scheduling for Shared GPUs. 31:1-31:12
Parallelism Issues
- Saeed Maleki, Donald Nguyen, Andrew Lenharth, María Jesús Garzarán, David A. Padua, Keshav Pingali:
DSMR: A Parallel Algorithm for Single-Source Shortest Path Problem. 32:1-32:14
- Hao Wang, Weifeng Liu, Kaixi Hou, Wu-chun Feng:
Parallel Transposition of Sparse Data Structures. 33:1-33:13
- Kanak Mahadik, Christopher Wright, Jinyi Zhang, Milind Kulkarni, Saurabh Bagchi, Somali Chaterji:
SARVAVID: A Domain Specific Language for Developing Scalable Computational Genomics Applications. 34:1-34:12
Multiplication
- Eli Ben-Sasson, Matan Hamilis, Mark Silberstein, Eran Tromer:
Fast Multiplication in Binary Fields on GPUs via Register Cache. 35:1-35:12
- Pham Nguyen Quang Anh, Rui Fan, Yonggang Wen:
Balanced Hashing and Efficient GPU Sparse General Matrix-Matrix Multiplication. 36:1-36:12
- Daniele Buono, Fabrizio Petrini, Fabio Checconi, Xing Liu, Xinyu Que, Chris Long, Tai-Ching Tuan:
Optimizing Sparse Matrix-Vector Multiplication for Large-Scale Data Analytics. 37:1-37:12
Prefetching
- Sanyam Mehta, Rajat Garg, Nishad Trivedi, Pen-Chung Yew:
TurboTiling: Leveraging Prefetching to Boost Performance of Tiled Codes. 38:1-38:12
- Sam Ainsworth, Timothy M. Jones:
Graph Prefetching Using Data Structure Knowledge. 39:1-39:11
- Reena Panda, Yasuko Eckert, Nuwan Jayasena, Onur Kayiran, Michael Boyer, Lizy Kurian John:
Prefetching Techniques for Near-memory Throughput Processors. 40:1-40:14
GPU Architecture
- Mohammad Abdel-Majeed, Daniel Wong, Justin Kuang, Murali Annavaram:
Origami: Folding Warps for Energy Efficient GPUs. 41:1-41:12
- Yuxi Liu, Zhibin Yu, Lieven Eeckhout, Vijay Janapa Reddi, Yingwei Luo, Xiaolin Wang, Zhenlin Wang, Cheng-Zhong Xu:
Barrier-Aware Warp Scheduling for Throughput Processors. 42:1-42:12
- Lingda Li, Ari B. Hayes, Shuaiwen Leon Song, Eddy Z. Zhang:
Tag-Split Cache for Efficient GPGPU Cache Utilization. 43:1-43:12