


29th ICS 2015: Newport Beach/Irvine, CA, USA
- Laxmi N. Bhuyan, Fred Chong, Vivek Sarkar:

Proceedings of the 29th ACM on International Conference on Supercomputing, ICS'15, Newport Beach/Irvine, CA, USA, June 08 - 11, 2015. ACM 2015, ISBN 978-1-4503-3559-1
Keynote Address I
- Albert Cohen:

Streaming Task Parallelism. 1
GPU Parallelism
- Javier Cabezas, Lluís Vilanova, Isaac Gelado, Thomas B. Jablin, Nacho Navarro, Wen-mei W. Hwu:

Automatic Parallelization of Kernels in Shared-Memory Multi-GPU Nodes. 3-13
- Yulong Yu, Weijun Xiao, Xubin He, He Guo, Yuxin Wang, Xin Chen:

A Stall-Aware Warp Scheduling for Dynamically Optimizing Thread-level Parallelism in GPGPUs. 15-24
- Mehmet E. Belviranli, Peng Deng, Laxmi N. Bhuyan, Rajiv Gupta, Qi Zhu:

PeerWave: Exploiting Wavefront Parallelism on GPUs with Peer-SM Synchronization. 25-35
Communication and Computation Models
- Ozan Tuncer, Vitus J. Leung, Ayse K. Coskun:

PaCMap: Topology Mapping of Unstructured Communication Patterns onto Non-contiguous Allocations. 37-46
- Raghesh Aloor, V. Krishna Nandivada:

Unique Worker model for OpenMP. 47-56
- Benjamin S. Parsons, Vijay S. Pai:

Exploiting Process Imbalance to Improve MPI Collective Operations in Hierarchical Systems. 57-66
GPU Cache Management and Datastructures
- Chao Li, Shuaiwen Leon Song, Hongwen Dai, Albert Sidelnik, Siva Kumar Sastry Hari, Huiyang Zhou:

Locality-Driven Dynamic GPU Cache Bypassing. 67-77
- Nabeel AlSaber, Milind Kulkarni:

SemCache++: Semantics-Aware Caching for Efficient Multi-GPU Offloading. 79-88
- Bin Wang, Weikuan Yu, Xian-He Sun, Xinning Wang:

DaCache: Memory Divergence-Aware GPU Cache Management. 89-98
GPU Datastructures and Scheduling
- Naser Sedaghati, Te Mu, Louis-Noël Pouchet, Srinivasan Parthasarathy, P. Sadayappan:

Automatic Selection of Sparse Matrix Representation on GPUs. 99-108
- Ang Li, Gert-Jan van den Braak, Henk Corporaal, Akash Kumar:

Fine-Grained Synchronizations and Dataflow Programming on GPUs. 109-118
- Bo Wu, Guoyang Chen, Dong Li, Xipeng Shen, Jeffrey S. Vetter:

Enabling and Exploiting Flexible Task Assignment on GPU through SM-Centric Program Transformations. 119-130
Keynote Address II
- Ricardo Bianchini:

Datacenter Efficiency: What's Next? 131
Big Data
- Ruijin Zhou, Huixiang Chen, Tao Li:

Towards Lightweight and Swift Storage Resource Management in Big Data Cloud Era. 133-142
- Wenting He, Huimin Cui, Binbin Lu, Jiacheng Zhao, Shengmei Li, Gong Ruan, Jingling Xue, Xiaobing Feng, Wensen Yang, Youliang Yan:

Hadoop+: Modeling and Evaluating the Heterogeneity for MapReduce Applications in Heterogeneous Clusters. 143-153
- Maciej Besta, Torsten Hoefler:

Active Access: A Mechanism for High-Performance Distributed Data-Centric Computations. 155-164
- Sergei Shudler, Alexandru Calotoiu, Torsten Hoefler, Alexandre Strube, Felix Wolf:

Exascaling Your Library: Will Your Implementation Meet Your Expectations? 165-175
Stencil Computation
- Tobias Gysi, Tobias Grosser, Torsten Hoefler:

MODESTO: Data-centric Analytic Optimization of Complex Stencil Programs on Heterogeneous Architectures. 177-186
- Yulong Luo, Guangming Tan, Zeyao Mo, Ninghui Sun:

FAST: A Fast Stencil Autotuning Framework Based On An Optimal-solution Space Model. 187-196
- Ian J. Bertolacci, Catherine Olschanowsky, Ben Harshbarger, Bradford L. Chamberlain, David G. Wonnacott, Michelle Mills Strout:

Parameterized Diamond Tiling for Stencil Computations with Chapel parallel iterators. 197-206
- Holger Stengel, Jan Treibig, Georg Hager, Gerhard Wellein:

Quantifying Performance Bottlenecks of Stencil Computations Using the Execution-Cache-Memory Model. 207-216
Green Computing
- Md. Enamul Haque, Iñigo Goiri, Ricardo Bianchini, Thu D. Nguyen:

GreenPar: Scheduling Parallel High Performance Applications in Green Datacenters. 217-227
- Xu Zhou, Qiang Cao, Hong Jiang, Changsheng Xie:

Underprovisioning the Grid Power Infrastructure for Green Datacenters. 229-240
- Yiqing Hua, Chao Li, Weichao Tang, Li Jiang, Xiaoyao Liang:

Building Fuel Powered Supercomputing Data Center at Low Cost. 241-250
Emerging Technologies
- Ke Chen, Sheng Li, Jung Ho Ahn, Naveen Muralimanohar, Jishen Zhao, Cong Xu, Seongil O, Yuan Xie, Jay B. Brockman, Norman P. Jouppi:

History-Assisted Adaptive-Granularity Caches (HAAG$) for High Performance 3D DRAM Architectures. 251-261
- Shen Gao, Bingsheng He, Jianliang Xu:

Real-Time In-Memory Checkpointing for Future Hybrid Memory Systems. 263-272
- Amir Kavyan Ziabari, José L. Abellán, Rafael Ubal, Chao Chen, Ajay Joshi, David R. Kaeli:

Leveraging Silicon-Photonic NoC for Designing Scalable GPUs. 273-282
Keynote Address III
- Margo I. Seltzer:

Automatically Scalable Computation. 283
Microarchitecture
- Zhaoxiang Jin, Görkem Asilioglu, Soner Önder:

Mower: A New Design for Non-blocking Misprediction Recovery. 285-294
- Shaizeen Aga, Abhayendra Singh, Satish Narayanasamy:

zFENCE: Data-less Coherence for Efficient Fences. 295-305
Heterogeneous Systems
- Christos Margiolas, Michael F. P. O'Boyle:

PALMOS: A Transparent, Multi-tasking Acceleration Layer for Parallel Heterogeneous Systems. 307-318
- Hari Sundar, Omar Ghattas:

A Nested Partitioning Algorithm for Adaptive Meshes on Heterogeneous Clusters. 319-328
- Kallia Chronaki, Alejandro Rico, Rosa M. Badia, Eduard Ayguadé, Jesús Labarta, Mateo Valero:

Criticality-Aware Dynamic Task Scheduling for Heterogeneous Architectures. 329-338
Data Structures
- Weifeng Liu, Brian Vinter:

CSR5: An Efficient Storage Format for Cross-Platform Sparse Matrix-Vector Multiplication. 339-350
- Tobias Grosser, Jagannathan Ramanujam, Louis-Noël Pouchet, P. Sadayappan, Sebastian Pop:

Optimistic Delinearization of Parametrically Sized Arrays. 351-360
- Snehasish Kumar, Naveen Vedula, Arrvindh Shriraman, Vijayalakshmi Srinivasan:

DASX: Hardware Accelerator for Software Data Structures. 361-372
Parallelization and Algorithms
- Yun Zou, Sanjay V. Rajopadhye:

Automatic Energy Efficient Parallelization of Uniform Dependence Computations. 373-382
- Kaixi Hou, Hao Wang, Wu-chun Feng:

ASPaS: A Framework for Automatic SIMDization of Parallel Sorting on x86-based Many-core Processors. 383-392
- Diego Caballero, Sara Royuela, Roger Ferrer, Alejandro Duran, Xavier Martorell:

Optimizing Overlapped Memory Accesses in User-directed Vectorization. 393-404
Applications and Modeling
- Seyong Lee, Jeremy S. Meredith, Jeffrey S. Vetter:

COMPASS: A Framework for Automated Performance Modeling and Prediction. 405-414
- Mani Zandifar, Mustafa Abdul Jabbar, Alireza Majidi, David E. Keyes, Nancy M. Amato, Lawrence Rauchwerger:

Composing Algorithmic Skeletons to Express High-Performance Scientific Applications. 415-424
- Ioannis Papadopoulos, Nathan L. Thomas, Adam Fidel, Nancy M. Amato, Lawrence Rauchwerger:

STAPL-RTS: An Application Driven Runtime System. 425-434















