


33rd IPDPS 2019: Rio de Janeiro, Brazil
- 2019 IEEE International Parallel and Distributed Processing Symposium, IPDPS 2019, Rio de Janeiro, Brazil, May 20-24, 2019. IEEE 2019, ISBN 978-1-7281-1246-6

Keynote 1
- Ian T. Foster: Coding the Continuum. 1
Session 1: Graph Algorithms
- Ariful Azad, Aydin Buluç: LACC: A Linear-Algebraic Algorithm for Finding Connected Components in Distributed Memory. 2-12
- Monika Henzinger, Alexander Noe, Christian Schulz: Shared-Memory Exact Minimum Cuts. 13-22
- Udit Agarwal, Vijaya Ramachandran: Distributed Weighted All Pairs Shortest Paths Through Pipelining. 23-32
- Philipp Bamberger, Fabian Kuhn, Yannic Maus: Local Distributed Algorithms in Highly Dynamic Networks. 33-42
Session 2: HPC Systems
- Alvaro Frank, Tim Süß, André Brinkmann: Effects and Benefits of Node Sharing Strategies in HPC Batch Systems. 43-53
- Constantino Gómez, Francesc Martínez, Adrià Armejach, Miquel Moretó, Filippo Mantovani, Marc Casas: Design Space Exploration of Next-Generation HPC Machines. 54-65
- Tal Ben-Nun, Maciej Besta, Simon Huber, Alexandros Nikolaos Ziogas, Daniel Peter, Torsten Hoefler: A Modular Benchmarking Infrastructure for High-Performance and Reproducible Deep Learning. 66-77
- Jens Domke, Kazuaki Matsumura, Mohamed Wahib, Haoyu Zhang, Keita Yashima, Toshiki Tsuchikawa, Yohei Tsuji, Artur Podobas, Satoshi Matsuoka: Double-Precision FPUs in High-Performance Computing: An Embarrassment of Riches? 78-88
Session 3: Numerical Algorithms
- Edward Hutter, Edgar Solomonik: Communication-Avoiding Cholesky-QR2 for Rectangular Matrices. 89-100
- Jordi Wolfson-Pou, Edmond Chow: Asynchronous Multigrid Methods. 101-110
- Ahmad Abdelfattah, Stanimire Tomov, Jack J. Dongarra: Fast Batched Matrix Multiplication for Small Sizes Using Half-Precision Arithmetic on GPUs. 111-122
- Israt Nisa, Jiajia Li, Aravind Sukumaran-Rajam, Richard W. Vuduc, P. Sadayappan: Load-Balanced Sparse MTTKRP on GPUs. 123-133
Session 4: Scheduling and Load Balancing I
- Kunal Agrawal, I-Ting Angelina Lee, Jing Li, Kefu Lu, Benjamin Moseley: Practically Efficient Scheduler for Minimizing Average Flow Time of Parallel Jobs. 134-144
- Klaus Jansen, Marten Maack, Alexander Mäcker: Scheduling on (Un-)Related Machines with Setup Times. 145-154
- M. Yusuf Özkaya, Anne Benoit, Bora Uçar, Julien Herrmann, Ümit V. Çatalyürek: A Scalable Clustering-Based Task Scheduler for Homogeneous Processors Using DAG Partitioning. 155-165
- Guillaume Aupy, Ana Gainaru, Valentin Honoré, Padma Raghavan, Yves Robert, Hongyang Sun: Reservation Strategies for Stochastic Jobs. 166-175
Session 5: Accelerating Neural Networks
- Bruno R. C. Magalhães, Thomas Sterling, Felix Schürmann, Michael L. Hines: Exploiting Flow Graph of System of ODEs to Accelerate the Simulation of Biologically-Detailed Neural Networks. 176-187
- Jiawen Liu, Dong Li, Gokcen Kestor, Jeffrey S. Vetter: Runtime Concurrency Control and Operation Scheduling for High Performance Neural Network Training. 188-199
- Shriram S. B, Anshuj Garg, Purushottam Kulkarni: Dynamic Memory Management for GPU-Based Training of Deep Neural Networks. 200-209
- Nikoli Dryden, Naoya Maruyama, Tom Benson, Tim Moon, Marc Snir, Brian Van Essen: Improving Strong-Scaling of CNN Training by Exploiting Finer-Grained Parallelism. 210-220
Session 6: GPU Computing I
- Pengyu Wang, Lu Zhang, Chao Li, Minyi Guo: Excavating the Potential of GPU for Accelerating Graph Traversal. 221-230
- Hartwig Anzt, Tobias Ribizel, Goran Flegar, Edmond Chow, Jack J. Dongarra: ParILUT - A Parallel Threshold ILU for GPUs. 231-241
- Jie Zhang, Xiaoyi Lu, Ching-Hsiang Chu, Dhabaleswar K. Panda: C-GDR: High-Performance Container-Aware GPUDirect MPI Communication Schemes on RDMA Networks. 242-251
- Tyler N. Allen, Xizhou Feng, Rong Ge: Slate: Enabling Workload-Aware Efficient Multiprocessing for Modern GPGPUs. 252-261
Session 7: Learning and Prediction Systems
- Jielong Xu, Jian Tang, Zhiyuan Xu, Chengxiang Yin, Kevin A. Kwiat, Charles A. Kamhoua: A Deep Recurrent Neural Network Based Predictive Control Framework for Reliable Distributed Stream Data Processing. 262-272
- Adrian Colaso, Pablo Prieto, Pablo Abad Fidalgo, José-Ángel Gregorio, Valentin Puente: Architecting Racetrack Memory Preshift through Pattern-Based Prediction Mechanisms. 273-282
- Ryan Chard, Zhuozhao Li, Kyle Chard, Logan T. Ward, Yadu N. Babuji, Anna Woodard, Steven Tuecke, Ben Blaiszik, Michael J. Franklin, Ian T. Foster: DLHub: Model and Data Serving for Science. 283-292
- Huizhang Luo, Dan Huang, Qing Liu, Zhenbo Qiao, Hong Jiang, Jing Bi, Haitao Yuan, Mengchu Zhou, Jinzhen Wang, Zhenlu Qin: Identifying Latent Reduced Models to Precondition Lossy Compression. 293-302
Session 8: Multicore Computing
- Mehrzad Nejat, Miquel Pericàs, Per Stenström: QoS-Driven Coordinated Management of Resources to Save Energy in Multi-core Systems. 303-313
- Md. Vasimuddin, Sanchit Misra, Heng Li, Srinivas Aluru: Efficient Architecture-Aware Acceleration of BWA-MEM for Multicore Systems. 314-324
- Stephanie Labasan, Matthew Larsen, Hank Childs, Barry Rountree: Power and Performance Tradeoffs for Visualization Algorithms. 325-334
- Shuai Che, Jieming Yin: Northup: Divide-and-Conquer Programming in Systems with Heterogeneous Memories and Processors. 335-344
Plenary Session: Best Papers
- T.-H. Hubert Chan, Mauro Sozio, Bintao Sun: Distributed Approximate k-Core Decomposition and Min-Max Edge Orientation: Breaking the Diameter Barrier. 345-354
- Jahanzeb Maqbool Hashmi, Sourav Chakraborty, Mohammadreza Bayatpour, Hari Subramoni, Dhabaleswar K. Panda: FALCON: Efficient Designs for Zero-Copy MPI Datatype Processing on Emerging Architectures. 355-364
- Pankaj Khanchandani, Roger Wattenhofer: Two Elementary Instructions Make Compare-and-Swap. 365-374
- James Gentry, Chavit Denninnart, Mohsen Amini Salehi: Robust Dynamic Resource Allocation via Probabilistic Task Pruning in Heterogeneous Computing Systems. 375-384
Keynote 2
- Lawrence Rauchwerger: Two Roads to Parallelism: From Serial Code to Programming with STAPL. 385
Session 9: Cloud Computing
- Zhichao Yan, Hong Jiang, Yujuan Tan, Stan Skelton, Hao Luo: Z-Dedup: A Case for Deduplicating Compressed Contents in Cloud. 386-395
- Petar Kochovski, Rizos Sakellariou, Marko Bajec, Pavel D. Drobintsev, Vlado Stankovski: An Architecture and Stochastic Method for Database Container Placement in the Edge-Fog-Cloud Continuum. 396-405
- Nikos Tziritas, Thanasis Loukopoulos, Samee Khan, Cheng-Zhong Xu, Albert Y. Zomaya: Online Live VM Migration Algorithms to Minimize Total Migration Time and Downtime. 406-417
- Nishant Saurabh, Julian Remmers, Dragi Kimovski, Radu Prodan, Jorge G. Barbosa: Semantics-Aware Virtual Machine Image Management in IaaS Clouds. 418-427
Session 10: Graph Algorithms II
- Yongzhe Zhang, Zhenjiang Hu: Composing Optimization Techniques for Vertex-Centric Graph Processing via Communication Channels. 428-438
- Loc Hoang, Roshan Dathathri, Gurbinder Gill, Keshav Pingali: CuSP: A Customizable Streaming Edge Partitioner for Distributed Graph Analytics. 439-450
- Chirag Jain, Sanchit Misra, Haowen Zhang, Alexander T. Dilthey, Srinivas Aluru: Accelerating Sequence Alignment to Graphs. 451-461
- Hanqing Zeng, Hongkuan Zhou, Ajitesh Srivastava, Rajgopal Kannan, Viktor K. Prasanna: Accurate, Efficient and Scalable Graph Embedding. 462-471
Session 11: Linear Algebra
- Ichitaro Yamazaki, Zhaojun Bai, Ding Lu, Jack J. Dongarra: Matrix Powers Kernels for Thick-Restart Lanczos with Explicit External Deflation. 472-481
- Roy Nissim, Oded Schwartz: Revisiting the I/O-Complexity of Fast Matrix Multiplication with Recomputations. 482-490
- Elad Weiss, Oded Schwartz: Computation of Matrix Chain Products on Parallel Machines. 491-500
- Hua Huang, Edmond Chow: Overlapping Communications with Other Communications and Its Application to Distributed Dense Matrix Computations. 501-510
Session 12: Storage Systems
- Woong Shin, Christopher Brumgard, Bing Xie, Sudharshan S. Vazhkudai, Devarshi Ghoshal, Sarp Oral, Lavanya Ramakrishnan: Data Jockey: Automatic Data Management for HPC Multi-tiered Storage Systems. 511-522
- Hao Fan, Song Wu, Shadi Ibrahim, Ximing Chen, Hai Jin, Jiang Xiao, Haibing Guan: NCQ-Aware I/O Scheduling for Conventional Solid State Drives. 523-532
- Junqing Gu, Chentao Wu, Xin Xie, Han Qiu, Jie Li, Minyi Guo, Xubin He, Yuanyuan Dong, Yafei Zhao: Optimizing the Parity Check Matrix for Efficient Decoding of RS-Based Cloud Storage Systems. 533-544
- Zhipeng Li, Min Lv, Yinlong Xu, Yongkun Li, Liangliang Xu: D3: Deterministic Data Distribution for Efficient Data Reconstruction in Erasure-Coded Distributed Storage Systems. 545-556
Session 13: Applications I
- Zhao Liu, Xuesen Chu, Xiaojing Lv, Hongsong Meng, Shupeng Shi, Wenji Han, Jingheng Xu, Haohuan Fu, Guangwen Yang: SunwayLB: Enabling Extreme-Scale Lattice Boltzmann Method Based Computing Fluid Dynamics Simulations on Sunway TaihuLight. 557-566
- Oleksandr Rudyy, Marta Garcia-Gasulla, Filippo Mantovani, Alfonso Santiago, Raül Sirvent, Mariano Vázquez: Containers in HPC: A Scalability and Portability Study in Production Biological Simulations. 567-577
- Priyanka Ghosh, Sriram Krishnamoorthy, Ananth Kalyanaraman: PaKman: Scalable Assembly of Large Genomes on Distributed Memory Machines. 578-589
- Md. Mostofa Ali Patwary, Milind Chabbi, Heewoo Jun, Jiaji Huang, Greg Diamos, Kenneth Church: Language Modeling at Scale. 590-599
Session 14: File Systems
- Simbarashe Dzinamarira, Florin Dinu, T. S. Eugene Ng: DYRS: Bandwidth-Aware Disk-to-Memory Migration of Cold Data in Big-Data File Systems. 600-609
- Bharti Wadhwa, Arnab Kumar Paul, Sarah Neuwirth, Feiyi Wang, Sarp Oral, Ali Raza Butt, Jon Bernard, Kirk W. Cameron: iez: Resource Contention Aware Load Balancing for Large-Scale Parallel File Systems. 610-620
- Salvatore Di Girolamo, Pirmin Schmid, Thomas C. Schulthess, Torsten Hoefler: SimFS: A Simulation Data Virtualizing File System Interface. 621-630
- Guillaume Aupy, Olivier Beaumont, Lionel Eyraud-Dubois: Sizing and Partitioning Strategies for Burst-Buffers to Reduce IO Contention. 631-640
Session 15: GPU Computing II
- Prashant Singh Rawat, Miheer Vaidya, Aravind Sukumaran-Rajam, Atanas Rountev, Louis-Noël Pouchet, P. Sadayappan: On Optimizing Complex Stencils on GPUs. 641-652
- Wenyi Zhao, Quan Chen, Hao Lin, Jianfeng Zhang, Jingwen Leng, Chao Li, Wenli Zheng, Li Li, Minyi Guo: Themis: Predicting and Reining in Application-Level Slowdown on Spatial Multitasking GPUs. 653-663
- Mohammad Khavari Tavana, Yifan Sun, Nicolas Bohm Agostini, David R. Kaeli: Exploiting Adaptive Data Compression to Improve Performance and Energy-Efficiency of Compute Workloads in Multi-GPU Systems. 664-674
- Kyung Hoon Kim, Priyank Devpura, Abhishek Nayyar, Andrew Doolittle, Ki Hwan Yum, Eun Jung Kim: Dual Pattern Compression Using Data-Preprocessing for Large-Scale GPU Architectures. 675-685
Session 16: Scheduling and Load Balancing II
- Arnaud Legrand, Denis Trystram, Salah Zrigui: Adapting Batch Scheduling to Workload Characteristics: What Can We Expect From Online Learning? 686-695
- Heng Wu, Wenbo Zhang, Yuanjia Xu, Hao Xiang, Tao Huang, Haiyang Ding, Zheng Zhang: Aladdin: Optimized Maximum Flow Management for Shared Production Clusters. 696-707
- Pawel Garncarek, Tomasz Jurdzinski, Dariusz R. Kowalski, Miguel A. Mosteiro: mmWave Wireless Backhaul Scheduling of Stochastic Packet Arrivals. 708-717
- Petra Berenbrink, Tom Friedetzky, Dominik Kaaser, Peter Kling: Tight & Simple Load Balancing. 718-726
Keynote 3
- Luiz DeRose: The Path to Delivering Programable Exascale Systems. 727
Session 17: Managing Data
- Philip Dexter, Kenneth Chiu, Bedri Sendir: An Error-Reflective Consistency Model for Distributed Data Stores. 728-737
- Jason Arnold, Boris Glavic, Ioan Raicu: A High-Performance Distributed Relational Database System for Scalable OLAP Processing. 738-748
- Ondrej Meca, Lubomír Ríha, Tomás Brzobohatý: An Approach for Parallel Loading and Pre-Processing of Unstructured Meshes Stored in Spatially Scattered Fashion. 749-760
Session 18: Message Passing
- Sayan Ghosh, Mahantesh Halappanavar, Ananth Kalyanaraman, Arif Khan, Assefaw H. Gebremedhin: Exploring MPI Communication Models for Graph Applications Using Graph Matching as a Case Study. 761-770
- Zhiqiang Zuo, Rong Gu, Xi Jiang, Zhaokang Wang, Yihua Huang, Linzhang Wang, Xuandong Li: BigSpa: An Efficient Interprocedural Static Analysis Engine in the Cloud. 771-780
- S. Mahdieh Ghazimirsaeed, Seyed Hessam Mirsadeghi, Ahmad Afsahi: An Efficient Collaborative Communication Mechanism for MPI Neighborhood Collectives. 781-792
Session 19: Managing Power and Energy
- Srinivasan Ramesh, Swann Perarnau, Sridutt Bhalachandra, Allen D. Malony, Peter H. Beckman: Understanding the Impact of Dynamic Power Capping on Application Progress. 793-804
- Mohak Chadha, Michael Gerndt: Modelling DVFS and UFS for Region-Based Energy Aware Tuning of HPC Applications. 805-814
- Wenli Zheng, Xiaorui Wang, Yue Ma, Chao Li, Hao Lin, Bin Yao, Jianfeng Zhang, Minyi Guo: SprintCon: Controllable and Efficient Computational Sprinting for Data Center Servers. 815-824
- Mathieu Bacou, Grégoire Todeschi, Alain Tchana, Daniel Hagimont, Baptiste Lepers, Willy Zwaenepoel: Drowsy-DC: Data Center Power Management System. 825-834
Session 20: Networks
- Dongxiao Yu, Yifei Zou, Yong Zhang, Feng Li, Jiguo Yu, Yu Wu, Xiuzhen Cheng, Francis C. M. Lau: Distributed Dominating Set and Connected Dominating Set Construction Under the Dynamic SINR Model. 835-844
- Linghui Luo, Christian Scheideler, Thim Strothmann: MULTISKIPGRAPH: A Self-Stabilizing Overlay Network that Maintains Monotonic Searchability. 845-854
- Soumyottam Chatterjee, Gopal Pandurangan, Peter Robinson: Network Size Estimation in Small-World Networks Under Byzantine Faults. 855-865
- Corentin Hardy, Erwan Le Merrer, Bruno Sericola: MD-GAN: Multi-Discriminator Generative Adversarial Networks for Distributed Datasets. 866-877
Session 21: Dealing with Faults
- Luanzheng Guo, Dong Li: MOARD: Modeling Application Resilience to Transient Faults on Data Objects. 878-889
- Giorgis Georgakoudis, Ignacio Laguna, Hans Vandierendonck, Dimitrios S. Nikolopoulos, Martin Schulz: SAFIRE: Scalable and Accurate Fault Injection for Parallel Multithreaded Applications. 890-899
- Zaeem Hussain, Taieb Znati, Rami G. Melhem: Optimal Placement of In-memory Checkpoints Under Heterogeneous Failure Likelihoods. 900-910
- Bogdan Nicolae, Adam Moody, Elsa Gonsiorowski, Kathryn M. Mohror, Franck Cappello: VeloC: Towards High Performance Adaptive Asynchronous Checkpointing at Large Scale. 911-920
Session 22: Optimizing Memory Behavior
- Wen Pan, Tao Xie, Xiaojia Song: HART: A Concurrent Hash-Assisted Radix Tree for DRAM-PM Hybrid Memory Systems. 921-931
- Evangelos Vasilakis, Vassilis Papaefstathiou, Pedro Trancoso, Ioannis Sourdis: LLC-Guided Data Migration in Hybrid Memory Systems. 932-942
- Matthias Hauck, Marcus Paradies, Holger Fröning: Software-Based Buffering of Associative Operations on Random Memory Addresses. 943-952
- Gongjin Sun, Junjie Shen, Alexander V. Veidenbaum: Combining Prefetch Control and Cache Partitioning to Improve Multicore Performance. 953-962
Session 23: Programming Languages
- John Bachan, Scott B. Baden, Steven A. Hofmeyr, Mathias Jacquelin, Amir Kamil, Dan Bonachea, Paul H. Hargrove, Hadia Ahmed: UPC++: A High-Performance Communication Framework for Asynchronous Computation. 963-973
- Tsung-Wei Huang, Chun-Xun Lin, Guannan Guo, Martin D. F. Wong: Cpp-Taskflow: Fast Task-Based Parallel Programming Using Modern C++. 974-983
- Laleh Aghababaie Beni, Saikiran Ramanan, Aparna Chandramowlishwaran: Portal: A High-Performance Language and Compiler for Parallel N-Body Problems. 984-995
- Thomas Macht, Clemens Grelck: SAC Goes Cluster: Fully Implicit Distributed Computing. 996-1006
Session 24: Accelerating Graph Processing
- Scott Sallinen, Roger Pearce, Matei Ripeanu: Incremental Graph Processing for On-line Analytics. 1007-1018
- Timothy A. K. Zakian, Ludovic Anthony Richard Capelli, Zhenjiang Hu: Incrementalization of Vertex-Centric Programs. 1019-1029
- Wole Jaiyeoba, Kevin Skadron: GraphTinker: A High Performance Data Structure for Dynamic Graph Processing. 1030-1041
Session 25: Applications II
- Shunjie Zhou, Fan Zhang, Hanhua Chen, Hai Jin, Bing Bing Zhou: FastJoin: A Skewness-Aware Distributed Stream Join System. 1042-1052
- Philipp Habermann, Chi Ching Chi, Mauricio Alvarez-Mesa, Ben H. H. Juurlink: A Bin-Based Bitstream Partitioning Approach for Parallel CABAC Decoding in Next Generation Video Coding. 1053-1062
- Yujing Ma, Florin Rusu, Martin Torres: Stochastic Gradient Descent on Modern Hardware: Multi-core CPU or GPU? Synchronous or Asynchronous? 1063-1072
Session 26: Security and Reliability
- Thorsten Götte, Vipin Ravindran Vijayalakshmi, Christian Scheideler: Always be Two Steps Ahead of Your Enemy. 1073-1082
- Diksha Gupta, Jared Saia, Maxwell Young: Peace Through Superior Puzzling: An Asymmetric Sybil Defense. 1083-1094
- Swarnendu Biswas, Rui Zhang, Michael D. Bond, Brandon Lucia: Rethinking Support for Region Conflict Exceptions. 1095-1106















