IPDPS 2017: Orlando / Buena Vista, FL, USA - Workshops
- 2017 IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPS Workshops 2017, Orlando / Buena Vista, FL, USA, May 29 - June 2, 2017. IEEE Computer Society 2017, ISBN 978-1-5386-3408-0
HCW: Heterogeneity in Computing Workshop
Session 1: Managing the Different Components of Heterogeneous Systems
- Oliver Jakob Arndt, Fabian David Trager, Tobias Moß, Holger Blume:
Portable Implementation of Advanced Driver-Assistance Algorithms on Heterogeneous Architectures. 6-17
- Siddharth Rai, Mainak Chaudhuri:
Improving CPU Performance Through Dynamic GPU Access Throttling in CPU-GPU Heterogeneous Processors. 18-29
Session 2: Scheduling and Resource Allocation
- Sonia López, Stavan Satish Karia:
Alternative Processor Within Threshold: Flexible Scheduling on Heterogeneous Systems. 42-53
- Dylan Machovec, Sudeep Pasricha, Anthony A. Maciejewski, Howard Jay Siegel, Gregory A. Koenig, Michael Wright, Marcia Hilton, Rajendra Rambharos, Thomas Naughton, Neena Imam:
Preemptive Resource Management for Dynamically Arriving Tasks in an Oversubscribed Heterogeneous Computing System. 54-64
- Lilia Zaourar, Massinissa Ait Aba, David Briand, Jean-Marc Philippe:
Modeling of Applications and Hardware to Explore Task Mapping and Scheduling Strategies on a Heterogeneous Micro-Server System. 65-76
- Thibaud Ecarot, Djamal Zeghlache, Cedric Brandily:
Consumer-and-Provider-Oriented Efficient IaaS Resource Allocation. 77-85
RAW: Reconfigurable Architectures Workshop
Session 1: Architectures for Convolutional Neural Networks and Sliding Window
- Marco Bacis, Giuseppe Natale, Emanuele Del Sozzo, Marco Domenico Santambrogio:
A Pipelined and Scalable Dataflow Implementation of Convolutional Neural Networks on FPGA. 90-97
- Haruyoshi Yonekawa, Hiroki Nakahara:
On-Chip Memory Based Binarized Convolutional Deep Neural Network Applying Batch Normalization Free Technique on an FPGA. 98-105
- Murad Qasaimeh, Joseph Zambreno, Phillip H. Jones:
A Modified Sliding Window Architecture for Efficient BRAM Resource Utilization. 106-114
Session 2: Design and Programming Methods
- Gary Gréwal, Shawki Areibi, Matthew Westrik, Ziad Abuowaimer, B. Zhao:
Automatic Flow Selection and Quality-of-Result Estimation for FPGA Placement. 115-123
- Javier Alejandro Varela, Norbert Wehn, Qian Liang, Songyin Tang:
Exploiting Decoupled OpenCL Work-Items with Data Dependencies on FPGAs: A Case Study. 124-131
- Luca Stornaiuolo, A. Parravicini, Gianluca Durelli, Marco D. Santambrogio:
Exploiting FPGAs from Higher Level Languages: A Signal Analysis Case Study. 132-140
- Philip Gottschling, Christian Hochberger:
ReEP: A Toolset for Generation and Programming of Reconfigurable Datapaths for Event Processing. 141-149
Session 3: Acceleration of Curran's Approximation and Elliptic Curve Crypto
- Anna Maria Nestorov, Enrico Reggiani, Hristina Palikareva, Pavel Burovskiy, Tobias Becker, Marco D. Santambrogio:
A Scalable Dataflow Implementation of Curran's Approximation Algorithm. 150-157
- Rabia Shahid, Ted Winograd, Kris Gaj:
A Generic Approach to the Development of Coprocessors for Elliptic Curve Cryptosystems. 158-167
Session 4: Acceleration of Biological Signal Processing
- Luca Cerina, Pierandrea Cancian, Giuseppe Franco, Marco Domenico Santambrogio:
A Hardware Acceleration for Surface EMG Non-Negative Matrix Factorization. 168-174
- Giovanni Pietro Seu, Gian Nicola Angotzi, Giuseppe Tuveri, Luigi Raffo, Luca Berdondini, Alessandro Maccione, Paolo Meloni:
On-FPGA Real-Time Processing of Biological Signals From High-Density MEAs: a Design Space Exploration. 175-183
Session 5: Design Methods
- Yosi Ben-Asher, Esti Stein, Ramachandran Vaidyanathan:
Combining Boolean Gates and Branching Programs in One Model can Lead to Faster Circuits. 184-191
- Utsav Agarwal, Ramachandran Vaidyanathan:
Efficient Totally-Ordered Subset Generation, with Application in Partial Reconfiguration. 192-201
Short Papers
- Godwin Enemali, Adewale Adetomi, Tughrul Arslan:
FAReP: Fragmentation-Aware Replacement Policy for Task Reuse on Reconfigurable FPGAs. 202-206
- Tejaswini Ananthanarayana, Sonia López, Marcin Lukowiak:
Power Analysis of HLS-Designed Customized Instruction Set Architectures. 207-212
- Tajas Ruschke, Lukas Johannes Jung, Christian Hochberger:
A Near Optimal Integrated Solution for Resource Constrained Scheduling, Binding and Routing on CGRAs. 213-218
- Adewale Adetomi, Godwin Enemali, Tughrul Arslan:
Clock Buffers, Nets, and Trees for On-Chip Communication: A Novel Network Access Technique in FPGAs. 219-222
- Enrico Reggiani, Eleonora D'Arnese, Andrea Purgato, Marco D. Santambrogio:
Pearson Correlation Coefficient Acceleration for Modeling and Mapping of Neural Interconnections. 223-228
- Tripti Jain, Klaus Schneider, Frederik Walk:
Out-of-Order Execution of Buffered Function Units in Exposed Data Path Architectures. 229-234
- Emanuele Del Sozzo, Lorenzo Di Tucci, Marco D. Santambrogio:
A Highly Scalable and Efficient Parallel Design of N-Body Simulation on FPGA. 241-246
- Francesca Palumbo, Carlo Sau, Danilo Pani, Paolo Meloni, Luigi Raffo:
Feasibility Study of Real-Time Spiking Neural Network Simulations on a Swarm Intelligence Based Digital Architecture. 247-250
HiCOMB: 16th IEEE International Workshop on High Performance Computational Biology
Session 1
- Cyrus Cousins, Christopher M. Pietras, Donna K. Slonim:
Scalable FRaC Variants: Anomaly Detection for Precision Medicine. 253-262
- Jae-Seung Yeom, Tanya Kostova-Vassilevska, Peter D. Barnes Jr., David R. Jefferson, Tomas Oppelstrup:
Exploratory Modeling and Simulation of the Evolutionary Dynamics of Single-Stranded RNA Virus Populations. 263-272
Session 2
- Julia D. Warnke-Sommer, Hesham H. Ali:
Parallel NGS Assembly Using Distributed Assembly Graphs Enriched with Biological Knowledge. 273-282
- Vasudevan Rengasamy, Paul Medvedev, Kamesh Madduri:
Parallel and Memory-Efficient Preprocessing for Metagenome Assembly. 283-292
Session 3
- Philip E. Davis, Adam M. Terwilliger, David Zeitler, Gregory Wolffe:
Scalable Parallelization of a Markov Coalescent Genealogy Sampler. 293-302
- Mucahid Kutlu, Gagan Agrawal, James S. Blachly:
Par-eXpress: A Tool for Analysis of Sequencing Experiments With Ambiguous Assignment of Fragments in Parallel. 303-310
EduPar: NSF/TCPP Workshop on Parallel and Distributed Computing Education
Session 1: Tools and Programming Environment
- Abdul Dakkak, Carl Pearson, Cheng Li, Wen-mei W. Hwu:
RAI: A Scalable Project Submission System for Parallel Programming Courses. 315-322
- Brian Broll, Ákos Lédeczi, Péter Völgyesi, János Sallai, Miklós Maróti, Chris Vanags:
Introducing Parallel and Distributed Computing to K12. 323-330
- Tianyi Bao, William B. Gardner:
Log Visualization Tool for Message-Passing Programming in Pilot. 331-338
- David A. Richie, James A. Ross:
I Can Has Supercomputer? A Novel Approach to Teaching Parallel and Distributed Computing Concepts Using a Meme-Based Programming Language. 339-345
Session 2: Pedagogy and Experience
- Jane Wyngaard, Heather Lynch, Jaroslaw Nabrzyski, Allen Pope, Shantenu Jha:
Hacking at the Divide Between Polar Science and HPC: Using Hackathons as Training Tools. 352-359
- Vivek Sarkar, Max Grossman, Zoran Budimlic, Shams Imam:
Preparing an Online Java Parallel Computing Course. 360-366
- Jawwad Ahmed Shamsi:
A Laboratory Based Course on GPU Programming: Methods, Practices, and Lessons. 367-374
ParLearning: The 6th International Workshop on Parallel and Distributed Computing for Large Scale Machine Learning and Big Data Analytics
Session 1
- Azalia Mirhoseini, Bita Darvish Rouhani, Ebrahim M. Songhori, Farinaz Koushanfar:
ExtDict: Extensible Dictionaries for Data- and Platform-Aware Large-Scale Learning. 379-388
- Songze Li, Sucha Supittayapornpong, Mohammad Ali Maddah-Ali, Salman Avestimehr:
Coded TeraSort. 389-398
- Nitin A. Gawande, Joshua B. Landwehr, Jeff A. Daily, Nathan R. Tallent, Abhinav Vishnu, Darren J. Kerbyson:
Scaling Deep Learning Workloads: NVIDIA DGX-1/Pascal and Intel Knights Landing. 399-408
- Jing Chen, Jianbin Fang, Weifeng Liu, Tao Tang, Xuhao Chen, Canqun Yang:
Efficient and Portable ALS Matrix Factorization for Recommender Systems. 409-418
Session 2
- Thomas P. Parnell, Celestine Dünner, Kubilay Atasu, Manolis Sifalakis, Haris Pozidis:
Large-Scale Stochastic Learning Using GPUs. 419-428
- Amaury Durand, Yanik Ngoko, Christophe Cérin:
Distributed and in-Situ Machine Learning for Smart-Homes and Buildings: Application to Alarm Sounds Detection. 429-432
- DeJiao Niu, Rui Xue, Tao Cai, Hai Li, Kingsley Effah, Hang Zhang:
The New Large-Scale RNNLM System Based on Distributed Neuron. 433-436
- Yuchen Qiao, Kazuma Hashimoto, Akiko Eriguchi, Haixia Wang, Dongsheng Wang, Yoshimasa Tsuruoka, Kenjiro Taura:
Cache Friendly Parallelization of Neural Encoder-Decoder Models Without Padding on Multi-core Architecture. 437-440
PDCO: 7th IEEE Workshop Parallel / Distributed Computing and Optimization
Session 1: Scheduling I
- Laleh Ghalami, Daniel Grosu:
A Parallel Approximation Algorithm for Scheduling Parallel Identical Machines. 442-451
- Hadrien Croubois, Eddy Caron:
Communication Aware Task Placement for Workflow Scheduling on DaaS-Based Cloud. 452-461
- Muhammad Qasim, Touseef Iqbal, Ehsan Ullah Munir, Nikos Tziritas, Samee U. Khan, Laurence T. Yang:
Dynamic Mapping of Application Workflows in Heterogeneous Computing Environments. 462-471
Session 2: Scheduling II
- Jorge M. Cortés-Mendoza, Andrei Tchernykh, Igor Bychkov, Alexander Feoktistov, Pascal Bouvry, Loic Didelot:
Load-Aware Strategies for Cloud-Based VoIP Optimization with VM Startup Prediction. 472-481
- David Pena, Andrei Tchernykh, Sergio Nesmachnow, Renzo Massobrio, Alexander Feoktistov, Igor Bychkov:
Multiobjective Vehicle-type Scheduling in Urban Public Transport. 482-491
Session 3: Parallel Metaheuristics and Machine Learning
- Emmanuel Kieffer, Grégoire Danoy, Pascal Bouvry, Anass Nagih:
A New Co-evolutionary Algorithm Based on Constraint Decomposition. 492-500
- Javier A. Cruz-Lopez, Vincent Boyer, Didier El Baz:
Training Many Neural Networks in Parallel via Back-Propagation. 501-509
- Amir Nakib, Mohamed Hilia, Frederic Heliodore, El-Ghazali Talbi:
Design of Metaheuristic Based on Machine Learning: A Unified Approach. 510-518
Session 4: Graphs, Networks and Algorithms
- Raphael Kimmig, Henning Meyerhenke, Darren Strash:
Shared Memory Parallel Subgraph Enumeration. 519-529
- Julien Collet, Tanguy Sassolas, Yves Lhuillier, Renaud Sirdey, Jacques Carlier:
Exploration of de Bruijn Graph Filtering for de novo Assembly Using GraphLab. 530-539
- He Li, Robson Eduardo De Grande, Azzedine Boukerche:
An Efficient CPP Solution for Resilience-Oriented SDN Controller Deployment. 540-549
Session 5: Parallel Algorithms
- Chris Rohlfs, Mohamed Zahran:
Optimal Bandwidth Selection for Kernel Regression Using a Fast Grid Search and a GPU. 550-556
- Numair Khan, Mohamed Zahran:
Space-Efficient Pointwise Computation of the Distance Transform on GPUs. 557-566
- Christian Herold, Olaf Krzikalla, Andreas Knüpfer:
Optimizing One-Sided Communication of Parallel Applications Using Critical Path Methods. 567-576
GABB: Graph Algorithms Building Blocks
Session 1
- George M. Slota, Sivasankaran Rajamanickam, Kamesh Madduri:
Order or Shuffle: Empirically Evaluating Vertex Order Impact on Parallel Graph Computations. 588-597
- Sayyad Nayyaroddeen, Mahak Gambhir, Kishore Kothapalli:
A Study of Graph Decomposition Algorithms for Parallel Symmetry Breaking. 598-607
Session 2
- Hayden Jananthan, Karia Dibert, Jeremy Kepner:
Constructing Adjacency Arrays from Incidence Arrays. 608-615
- Yangzihao Wang, Sean Baxter, John D. Owens:
Mini-Gunrock: A Lightweight Graph Analytics Framework on the GPU. 616-626
- Charles Colley, Junyuan Lin, Xiaozhe Hu, Shuchin Aeron:
Algebraic Multigrid for Least Squares Problems on Graphs with Applications to HodgeRank. 627-636
Session 3
- David Ediger, James P. Fairbanks:
Deriving Streaming Graph Algorithms from Static Definitions. 637-642
Session 4
- Aydin Buluç, Tim Mattson, Scott McMillan, José E. Moreira, Carl Yang:
Design of the GraphBLAS API for C. 643-652
- William P. Horn, Gabriel Tanase, Hao Yu, Pratap Pattnaik:
A Linear Algebra-Based Programming Interface for Graph Computations in Scala and Spark. 653-659
AsHES: The Seventh International Workshop on Accelerators and Hybrid Exascale Systems
Session 1: Programming Models and Runtime Systems
- Michael Wolfe, Seyong Lee, Jungwon Kim, Xiaonan Tian, Rengan Xu, Sunita Chandrasekaran, Barbara M. Chapman:
Implementing the OpenACC Data Model. 662-672
- Sergio Pino, Lori L. Pollock, Sunita Chandrasekaran:
Exploring Translation of OpenMP to OpenACC 2.5: Lessons Learned. 673-682
- Ivy Bo Peng, Roberto Gioiosa, Gokcen Kestor, Pietro Cicotti, Erwin Laure, Stefano Markidis:
Exploring the Performance Benefit of Hybrid Memory System on HPC Environments. 683-692
Session 2: Algorithms
- Mehmet Deveci, Christian Trott, Sivasankaran Rajamanickam:
Performance-Portable Sparse Matrix-Matrix Multiplication for Many-Core Architectures. 693-702
- Antonio Gómez-Iglesias, Miguel Cárdenas Montes:
Time and Energy to Solution Evaluation for the Three-Point Angular Correlation Function. 703-712
- Kaixi Hou, Wu-chun Feng, Shuai Che:
Auto-Tuning Strategies for Parallelizing Sparse Matrix-Vector (SpMV) Multiplication on Multi- and Many-Core Processors. 713-722
Session 3: Scheduling and Architectures
- Max Grossman, Vivek Kumar, Nick Vrvilo, Zoran Budimlic, Vivek Sarkar:
A Pluggable Framework for Composable HPC Scheduling Libraries. 723-732
- Sandra Catalán, Rafael Rodríguez-Sánchez, Enrique S. Quintana-Ortí, José R. Herrero:
Static Versus Dynamic Task Scheduling of the LU Factorization on ARM big.LITTLE Architectures. 733-742
HIPS: 22nd International Workshop on High Level Programming Models and Supportive Environments
Session 1
- Dana Akhmetova, Roman Iakymchuk, Örjan Ekeberg, Erwin Laure:
Performance Study of Multithreaded MPI and OpenMP Tasking in a Large Scientific Code. 756-765
- Mostafa Mehrabi, Nasser Giacaman, Oliver Sinnen:
Annotation-Based Parallelization of Java Code. 775-784
Session 2
- Alexis Engelke, Josef Weidendorfer:
Using LLVM for Optimized Lightweight Binary Re-Writing at Runtime. 785-794
- Nathan Zhang, Michael Driscoll, Charles Markley, Samuel Williams, Protonu Basu, Armando Fox:
Snowflake: A Lightweight Portable Stencil DSL. 795-804
- Pavel Shamis, M. Graham Lopez, Gilad Shainer:
Enabling One-Sided Communication Semantics on ARM. 805-813
Session 3
- Jari-Matti Mäkelä, Martti Forsell, Ville Leppänen:
Towards a Language Framework for Thick Control Flows. 814-823
APDCM: 19th Workshop on Advances in Parallel and Distributed Computational Models
Session 1: Distributed Computing
- Aisha Aljohani, Gokarna Sharma:
Complete Visibility for Mobile Agents with Lights Tolerating a Faulty Agent. 834-843
- Yonghwan Kim, Haruka Ohno, Yoshiaki Katayama, Toshimitsu Masuzawa:
A Self-Stabilizing Algorithm for Constructing (1, 1)-Maximal Directed Acyclic Graph. 844-853
- Jonas Posner, Claudia Fohry:
Fault Tolerance for Cooperative Lifeline-Based Global Load Balancing in Java with APGAS and Hazelcast. 854-863
- Debarshi Dutta, Meher Chaitanya, Kishore Kothapalli, Debajyoti Bera:
Applications of Ear Decomposition to Efficient Heterogeneous Algorithms for Shortest Path/Cycle Problems. 864-873
Session 2: Scheduling and Hardware Models
- Guillaume Aupy, Anne Benoit, Loïc Pottier, Padma Raghavan, Yves Robert, Manu Shantharam:
Co-Scheduling Algorithms for Cache-Partitioned Systems. 874-883
- Loris Marchal, Samuel McCauley, Bertrand Simon, Frédéric Vivien:
Minimizing I/Os in Out-of-Core Task Tree Scheduling. 884-893
- Max Plauth, Christoph Sterz, Felix Eberhardt, Frank Feinbube, Andreas Polze:
Assessing NUMA Performance Based on Hardware Event Counters. 904-913
Session 3: Parallel Computing
- Daniel Dauwe, Sudeep Pasricha, Anthony A. Maciejewski, Howard Jay Siegel:
An Analysis of Resilience Techniques for Exascale Computing Platforms. 914-923
- Tomoki Kawamura, Yoneda Kazunori, Takashi Yamazaki, Takashi Iwamura, Masahiro Watanabe, Yasushi Inoguchi:
A Compression Method for Storage Formats of a Sparse Matrix in Solving the Large-Scale Linear Systems. 924-931
- Takahiro Nishimura, Jacir Luiz Bordim, Yasuaki Ito, Koji Nakano:
Accelerating the Smith-Waterman Algorithm Using Bitwise Parallel Bulk Computation Technique on GPU. 932-941
- Yi Yang, Yasuaki Ito, Koji Nakano:
Photomosaic Generation by Rearranging Subimages, with GPU Acceleration. 942-951
HPPAC: 13th Workshop on High-Performance, Power-Aware Computing
Session 1
- Hayk Shoukourian, Torsten Wilde, Detlef Labrenz, Arndt Bode:
Using Machine Learning for Data Center Cooling Infrastructure Efficiency Prediction. 954-963
- Wissam Abu Ahmad, Andrea Bartolini, Francesco Beneventi, Luca Benini, Andrea Borghesi, Marco Cicala, Privato Forestieri, Cosimo Gianfreda, Daniele Gregori, Antonio Libri, Filippo Spiga, Simone Tinti:
Design of an Energy Aware Petaflops Class High Performance Cluster Based on Power Architecture. 964-973
- Aniruddha Marathe, Ghaleb Abdulla, Barry L. Rountree, Kathleen Shoga:
Towards a Unified Monitoring Framework for Power, Performance and Thermal Metrics: A Case Study on the Evaluation of HPC Cooling Systems. 974-983
Session 2
- Xinning Hui, Zhihui Du, Jason Liu, Hongyang Sun, Yuxiong He, David A. Bader:
When Good Enough Is Better: Energy-Aware Scheduling for Multicore Servers. 984-993
- Shouq Alsubaihi, Jean-Luc Gaudiot:
A Runtime Workload Distribution with Resource Allocation for CPU-GPU Heterogeneous Systems. 994-1003