
Torsten Hoefler
Torsten Höfler
Person information
- affiliation: ETH Zürich
2020 – today
- 2021
- [j42]Fabian Schuiki, Florian Zaruba, Torsten Hoefler, Luca Benini:
Stream Semantic Registers: A Lightweight RISC-V ISA Extension Achieving Full Compute Utilization in Single-Issue Cores. IEEE Trans. Computers 70(2): 212-227 (2021) - [j41]Maciej Besta, Jens Domke, Marcel Schneider, Marek Konieczny, Salvatore Di Girolamo, Timo Schneider, Ankit Singla, Torsten Hoefler:
High-Performance Routing With Multipathing and Path Diversity in Ethernet and HPC Networks. IEEE Trans. Parallel Distributed Syst. 32(4): 943-959 (2021) - [j40]Johannes de Fine Licht, Maciej Besta, Simon Meierhans, Torsten Hoefler:
Transformations of High-Level Synthesis Codes for High-Performance Computing. IEEE Trans. Parallel Distributed Syst. 32(5): 1014-1029 (2021) - 2020
- [j39]Thomas Häner, Torsten Hoefler, Matthias Troyer:
Assertion-based optimization of Quantum programs. Proc. ACM Program. Lang. 4(OOPSLA): 133:1-133:20 (2020) - [j38]Tobias Grosser, Theodoros Theodoridis, Maximilian Falkenstein, Arjun Pitchanathan, Michael Kruse, Manuel Rigger, Zhendong Su, Torsten Hoefler:
Fast linear programming through transprecision computing on small and sparse data. Proc. ACM Program. Lang. 4(OOPSLA): 195:1-195:28 (2020) - [j37]Jesper Larsson Träff, Torsten Hoefler:
Special issue: Selected papers from EuroMPI 2019. Parallel Comput. 99: 102695 (2020) - [j36]Carlos Osuna, Tobias Wicky, Fabian Thuering, Torsten Hoefler, Oliver Fuhrer:
Dawn: a High-level Domain-Specific Language Compiler Toolchain for Weather and Climate Applications. Supercomput. Front. Innov. 7(2): 79-97 (2020) - [j35]Asif Ali Khan, Hauke Mewes, Tobias Grosser, Torsten Hoefler, Jerónimo Castrillón:
Polyhedral Compilation for Racetrack Memories. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 39(11): 3968-3980 (2020) - [j34]Maciej Besta, Marc Fischer, Tal Ben-Nun, Dimitri Stanojevic, Johannes de Fine Licht, Torsten Hoefler:
Substream-Centric Maximum Matchings on FPGA. ACM Trans. Reconfigurable Technol. Syst. 13(2): 8:1-8:33 (2020) - [c175]Elad Hoffer, Tal Ben-Nun, Itay Hubara, Niv Giladi, Torsten Hoefler, Daniel Soudry:
Augment Your Batch: Improving Generalization Through Instance Repetition. CVPR 2020: 8126-8135 - [c174]Andreas Kurth, Samuel Riedel, Florian Zaruba, Torsten Hoefler, Luca Benini:
ATUNs: Modular and Scalable Support for Atomic Operations in a Shared Memory Multiprocessor. DAC 2020: 1-6 - [c173]Johannes de Fine Licht, Grzegorz Kwasniewski, Torsten Hoefler:
Flexible Communication Avoiding Matrix Multiplication on FPGA with High-Level Synthesis. FPGA 2020: 244-254 - [c172]Marcus Ritter, Alexandru Calotoiu, Sebastian Rinke, Thorsten Reimann, Torsten Hoefler, Felix Wolf:
Learning Cost-Effective Sampling Strategies for Empirical Performance Modeling. IPDPS 2020: 884-895 - [c171]Maciej Besta, Raghavendra Kanakagiri, Harun Mustafa, Mikhail Karasikov, Gunnar Rätsch, Torsten Hoefler, Edgar Solomonik:
Communication-Efficient Jaccard similarity for High-Performance Distributed Genome Comparisons. IPDPS 2020: 1122-1132 - [c170]Shigang Li, Tal Ben-Nun, Salvatore Di Girolamo, Dan Alistarh, Torsten Hoefler:
Taming unbalanced training workloads in deep learning with partial collective operations. PPoPP 2020: 45-61 - [c169]Yuyang Jin, Haojie Wang, Xiongchao Tang, Torsten Hoefler, Xu Liu, Jidong Zhai:
Identifying scalability bottlenecks for large-scale parallel programs with graph analysis. PPoPP 2020: 409-410 - [c168]Alexandr Nigay, Lukas Mosimann, Timo Schneider, Torsten Hoefler:
Communication and Timing Issues with MPI Virtualization. EuroMPI 2020: 11-20 - [c167]Lukas Gianinazzi, Torsten Hoefler:
Parallel Planar Subgraph Isomorphism and Vertex Connectivity. SPAA 2020: 269-280 - [c166]Konstantin Taranov, Benjamin Rothenberger, Adrian Perrig, Torsten Hoefler:
sRDMA - Efficient NIC-based Authentication and Encryption for Remote Direct Memory Access. USENIX Annual Technical Conference 2020: 691-704 - [p2]Alexandru Calotoiu, Marcin Copik, Torsten Hoefler, Marcus Ritter, Sergei Shudler, Felix Wolf:
ExtraPeak: Advanced Automatic Performance Modeling for HPC Applications. Software for Exascale Computing 2020: 453-482 - [i75]Tobias Gysi, Tobias Grosser, Laurin Brandner, Torsten Hoefler:
A Fast Analytical Model of Fully Associative Caches. CoRR abs/2001.01653 (2020) - [i74]Robert Gerstenberger, Maciej Besta, Torsten Hoefler:
Enabling Highly-Scalable Remote Memory Access Programming with MPI-3 One Sided. CoRR abs/2001.07747 (2020) - [i73]Florian Zaruba, Fabian Schuiki, Torsten Hoefler, Luca Benini:
Snitch: A 10 kGE Pseudo Dual-Issue Processor for Area and Energy Efficient Execution of Floating-Point Intensive Workloads. CoRR abs/2002.10143 (2020) - [i72]Chris Cummins, Zacharias V. Fisches, Tal Ben-Nun, Torsten Hoefler, Hugh Leather:
ProGraML: Graph-based Deep Learning for Program Optimization and Analysis. CoRR abs/2003.10536 (2020) - [i71]Shigang Li, Tal Ben-Nun, Dan Alistarh, Salvatore Di Girolamo, Nikoli Dryden, Torsten Hoefler:
Breaking (Global) Barriers in Parallel Stochastic Optimization with Wait-Avoiding Group Averaging. CoRR abs/2005.00124 (2020) - [i70]Peter Grönquist, Chengyuan Yao, Tal Ben-Nun, Nikoli Dryden, Peter Dueben, Shigang Li, Torsten Hoefler:
Deep Learning for Post-Processing Ensemble Weather Forecasts. CoRR abs/2005.08748 (2020) - [i69]Tobias Gysi, Christoph Müller, Oleksandr Zinenko, Stephan Herhut, Eddie Davis, Tobias Wicky, Oliver Fuhrer, Torsten Hoefler, Tobias Grosser:
Domain-Specific Multi-Level IR Rewriting for GPU. CoRR abs/2005.13014 (2020) - [i68]Bryan A. Plummer, Nikoli Dryden, Julius Frost, Torsten Hoefler, Kate Saenko:
Shapeshifter Networks: Cross-layer Parameter Sharing for Scalable and Effective Deep Learning. CoRR abs/2006.10598 (2020) - [i67]Andrei Ivanov, Nikoli Dryden, Tal Ben-Nun, Shigang Li, Torsten Hoefler:
Data Movement Is All You Need: A Case Study on Optimizing Transformers. CoRR abs/2007.00072 (2020) - [i66]Lukas Gianinazzi, Torsten Hoefler:
Parallel Planar Subgraph Isomorphism and Vertex Connectivity. CoRR abs/2007.01199 (2020) - [i65]Maciej Besta, Jens Domke, Marcel Schneider, Marek Konieczny, Salvatore Di Girolamo, Timo Schneider, Ankit Singla, Torsten Hoefler:
High-Performance Routing with Multipathing and Path Diversity in Supercomputers and Data Centers. CoRR abs/2007.03776 (2020) - [i64]Daniele De Sensi, Salvatore Di Girolamo, Kim H. McMahon, Duncan Roweth, Torsten Hoefler:
An In-Depth Analysis of the Slingshot Interconnect. CoRR abs/2008.08886 (2020) - [i63]Maciej Besta, Armon Carigiet, Zur Vonarburg-Shmaria, Kacper Janda, Lukas Gianinazzi, Torsten Hoefler:
High-Performance Parallel Graph Coloring with Strong Guarantees on Work, Depth, and Quality. CoRR abs/2008.11321 (2020) - [i62]Yuyang Jin, Haojie Wang, Teng Yu, Xiongchao Tang, Torsten Hoefler, Xu Liu, Jidong Zhai:
ScalAna: Automating Scaling Loss Detection with Graph Analysis. CoRR abs/2009.01692 (2020) - [i61]Maksym Planeta, Jan Bierbaum, Leo Sahaya Daphne Antony, Torsten Hoefler, Hermann Härtig:
TardiS: Migrating Containers with RDMA Networks. CoRR abs/2009.06988 (2020) - [i60]Salvatore Di Girolamo, Andreas Kurth, Alexandru Calotoiu, Thomas Benz, Timo Schneider, Jakub Beránek, Luca Benini, Torsten Hoefler:
PsPIN: A high-performance low-power architecture for flexible in-network compute. CoRR abs/2010.03536 (2020) - [i59]Grzegorz Kwasniewski, Tal Ben-Nun, Alexandros Nikolaos Ziogas, Timo Schneider, Maciej Besta, Torsten Hoefler:
On the Parallel I/O Optimality of Linear Algebra Kernels: Near-Optimal LU Factorization. CoRR abs/2010.05975 (2020) - [i58]Maciej Besta, Torsten Hoefler:
Fault Tolerance for Remote Memory Access Programming Models. CoRR abs/2010.09025 (2020) - [i57]Maciej Besta, Torsten Hoefler:
Accelerating Irregular Computations with Hardware Transactional Memory and Active Messages. CoRR abs/2010.09135 (2020) - [i56]Hermann Schweizer, Maciej Besta, Torsten Hoefler:
Evaluating the Cost of Atomic Operations on Modern Architectures. CoRR abs/2010.09852 (2020) - [i55]Patrick Schmid, Maciej Besta, Torsten Hoefler:
High-Performance Distributed RMA Locks. CoRR abs/2010.09854 (2020) - [i54]Maciej Besta, Florian Marending, Edgar Solomonik, Torsten Hoefler:
SlimSell: A Vectorizable Graph Representation for Breadth-First Search. CoRR abs/2010.09913 (2020) - [i53]Maciej Besta, Syed Minhaj Hassan, Sudhakar Yalamanchili, Rachata Ausavarungnirun, Onur Mutlu, Torsten Hoefler:
Slim NoC: A Low-Diameter On-Chip Network Topology for High Energy Efficiency and Scalability. CoRR abs/2010.10683 (2020) - [i52]Marcin Copik, Tobias Grosser, Torsten Hoefler, Paolo Bientinesi, Benjamin Berkels:
Work-stealing prefix scan: Addressing load imbalance in large-scale image registration. CoRR abs/2010.12478 (2020) - [i51]Maciej Besta, Marc Fischer, Tal Ben-Nun, Dimitri Stanojevic, Johannes de Fine Licht, Torsten Hoefler:
Substream-Centric Maximum Matchings on FPGA. CoRR abs/2010.14684 (2020) - [i50]Johannes de Fine Licht, Andreas Kuster, Tiziano De Matteis, Tal Ben-Nun, Dominic Hofer, Torsten Hoefler:
StencilFlow: Mapping Large Stencil Programs to Distributed Spatial Computing Systems. CoRR abs/2010.15218 (2020) - [i49]Maciej Besta, Dimitri Stanojevic, Tijana Zivic, Jagpreet Singh, Maurice Hoerold, Torsten Hoefler:
Log(Graph): A Near-Optimal High-Performance Graph Representation. CoRR abs/2010.15879 (2020) - [i48]Maciej Besta, Michal Podstawski, Linus Groner, Edgar Solomonik, Torsten Hoefler:
To Push or To Pull: On Reducing Communication and Synchronization in Graph Computations. CoRR abs/2010.16012 (2020) - [i47]Tal Ben-Nun, Lukas Gianinazzi, Torsten Hoefler, Yishai Oltchik:
Parametric Graph Templates: Properties and Algorithms. CoRR abs/2011.07001 (2020) - [i46]Paul Scheffler, Florian Zaruba, Fabian Schuiki, Torsten Hoefler, Luca Benini:
Indirection Stream Semantic Register Architecture for Efficient Sparse-Dense Linear Algebra. CoRR abs/2011.08070 (2020) - [i45]Chris Cummins, Hugh Leather, Zacharias V. Fisches, Tal Ben-Nun, Torsten Hoefler, Michael F. P. O'Boyle:
Deep Data Flow Analysis. CoRR abs/2012.01470 (2020) - [i44]Marcin Copik, Grzegorz Kwasniewski, Maciej Besta, Michal Podstawski, Torsten Hoefler:
SeBS: A Serverless Benchmark Suite for Function-as-a-Service Computing. CoRR abs/2012.14132 (2020) - [i43]Marcin Copik, Alexandru Calotoiu, Tobias Grosser, Nicolas Wicki, Felix Wolf, Torsten Hoefler:
Extracting Clean Performance Models from Tainted Programs. CoRR abs/2012.15592 (2020)
2010 – 2019
- 2019
- [j33]Pedro Yébenes, Jesús Escudero-Sahuquillo, Pedro Javier García, Francisco J. Quiles, Torsten Hoefler:
Head-of-line blocking avoidance in Slim Fly networks using deadlock-free non-minimal and adaptive routing. Concurr. Comput. Pract. Exp. 31(2) (2019) - [j32]Thomas C. Schulthess, Peter Bauer, Nils Wedi, Oliver Fuhrer, Torsten Hoefler, Christoph M. Schär:
Reflecting on the Goal and Baseline for Exascale Computing: A Roadmap Based on Weather and Climate Simulations. Comput. Sci. Eng. 21(1): 30-41 (2019) - [j31]Tal Ben-Nun, Torsten Hoefler:
Demystifying Parallel and Distributed Deep Learning: An In-depth Concurrency Analysis. ACM Comput. Surv. 52(4): 65:1-65:43 (2019) - [j30]Claude Barthels, Ingo Müller, Konstantin Taranov, Gustavo Alonso, Torsten Hoefler:
Strong consistency is not hard to get: Two-Phase Locking and Two-Phase Commit on Thousands of Cores. Proc. VLDB Endow. 12(13): 2325-2338 (2019) - [j29]Sergei Shudler, Yannick Berens, Alexandru Calotoiu, Torsten Hoefler, Alexandre Strube, Felix Wolf:
Engineering Algorithms for Scalability through Continuous Validation of Performance Expectations. IEEE Trans. Parallel Distributed Syst. 30(8): 1768-1785 (2019) - [c165]Tobias Gysi, Tobias Grosser, Torsten Hoefler:
Absinthe: Learning an Analytical Performance Model to Fuse and Tile Stencil Codes in One Shot. PACT 2019: 370-382 - [c164]Niels Gleinig, Frances Ann Hubis, Torsten Hoefler:
Embedding Functions Into Reversible Circuits: A Probabilistic Approach to the Number of Lines. DAC 2019: 72 - [c163]Maciej Besta, Marc Fischer, Tal Ben-Nun, Johannes de Fine Licht, Torsten Hoefler:
Substream-Centric Maximum Matchings on FPGA. FPGA 2019: 152-161 - [c162]Paul R. Eller, Torsten Hoefler, William Gropp:
Using performance models to understand scalable Krylov solver performance at scale for structured grid problems. ICS 2019: 138-149 - [c161]Tal Ben-Nun, Maciej Besta, Simon Huber, Alexandros Nikolaos Ziogas, Daniel Peter, Torsten Hoefler:
A Modular Benchmarking Infrastructure for High-Performance and Reproducible Deep Learning. IPDPS 2019: 66-77 - [c160]Torsten Hoefler:
Invited Talk 2. IPDPS Workshops 2019: 392 - [c159]Salvatore Di Girolamo, Pirmin Schmid, Thomas C. Schulthess, Torsten Hoefler:
SimFS: A Simulation Data Virtualizing File System Interface. IPDPS 2019: 621-630 - [c158]Felix Thaler, Stefan Moosbrugger, Carlos Osuna, Mauro Bianco, Hannes Vogt, Anton Afanasyev, Lukas Mosimann, Oliver Fuhrer, Thomas C. Schulthess, Torsten Hoefler:
Porting the COSMO Weather Model to Manycore CPUs. PASC 2019: 13:1-13:11 - [c157]Tobias Gysi, Tobias Grosser, Laurin Brandner, Torsten Hoefler:
A fast analytical model of fully associative caches. PLDI 2019: 816-829 - [c156]Martin Küttler, Maksym Planeta, Jan Bierbaum, Carsten Weinhold, Hermann Härtig, Amnon Barak, Torsten Hoefler:
Corrected trees for reliable group communication. PPoPP 2019: 287-299 - [c155]Jesper Larsson Träff, Torsten Hoefler:
Foreword EuroMPI 2019. EuroMPI 2019: 1:1-1:2 - [c154]Alexandros Nikolaos Ziogas, Tal Ben-Nun, Guillermo Indalecio Fernández, Timo Schneider, Mathieu Luisier, Torsten Hoefler:
A data-centric approach to extreme-scale ab initio dissipative quantum transport simulations. SC 2019: 1:1-1:13 - [c153]Cédric Renggli, Saleh Ashkboos, Mehdi Aghagolzadeh, Dan Alistarh, Torsten Hoefler:
SparCML: high-performance sparse communication for machine learning. SC 2019: 11:1-11:15 - [c152]Daniele De Sensi, Salvatore Di Girolamo, Torsten Hoefler:
Mitigating network noise on Dragonfly networks through application-aware routing. SC 2019: 16:1-16:32 - [c151]Grzegorz Kwasniewski, Marko Kabic, Maciej Besta, Joost VandeVondele, Raffaele Solcà, Torsten Hoefler:
Red-blue pebbling revisited: near optimal parallel matrix-matrix multiplication. SC 2019: 24:1-24:22 - [c150]Maciej Besta, Simon Weber, Lukas Gianinazzi, Robert Gerstenberger, Andrey Ivanov, Yishai Oltchik, Torsten Hoefler:
Slim graph: practical lossy graph compression for approximate graph processing, storage, and analytics. SC 2019: 35:1-35:25 - [c149]Salvatore Di Girolamo, Konstantin Taranov, Andreas Kurth, Michael Schaffner, Timo Schneider, Jakub Beránek, Maciej Besta, Luca Benini, Duncan Roweth, Torsten Hoefler:
Network-accelerated non-contiguous memory transfers. SC 2019: 56:1-56:14 - [c148]Alexandros Nikolaos Ziogas, Tal Ben-Nun, Guillermo Indalecio Fernández, Timo Schneider, Mathieu Luisier, Torsten Hoefler:
Optimizing the data movement in quantum transport simulations via data-centric parallel programming. SC 2019: 78:1-78:17 - [c147]Tal Ben-Nun, Johannes de Fine Licht, Alexandros Nikolaos Ziogas, Timo Schneider, Torsten Hoefler:
Stateful dataflow multigraphs: a data-centric model for performance portability on heterogeneous architectures. SC 2019: 81:1-81:14 - [c146]Tiziano De Matteis, Johannes de Fine Licht, Jakub Beránek, Torsten Hoefler:
Streaming message interface: high-performance distributed memory programming on reconfigurable hardware. SC 2019: 82:1-82:33 - [e7]Torsten Hoefler, Jesper Larsson Träff:
Proceedings of the 26th European MPI Users' Group Meeting, EuroMPI 2019, Zürich, Switzerland, September 11-13, 2019. ACM 2019, ISBN 978-1-4503-7175-9 [contents] - [i42]Elad Hoffer, Tal Ben-Nun, Itay Hubara, Niv Giladi, Torsten Hoefler, Daniel Soudry:
Augment your batch: better training with larger batches. CoRR abs/1901.09335 (2019) - [i41]Tal Ben-Nun, Maciej Besta, Simon Huber, Alexandros Nikolaos Ziogas, Daniel Peter, Torsten Hoefler:
A Modular Benchmarking Infrastructure for High-Performance and Reproducible Deep Learning. CoRR abs/1901.10183 (2019) - [i40]Salvatore Di Girolamo, Pirmin Schmid, Thomas C. Schulthess, Torsten Hoefler:
SimFS: A Simulation Data Virtualizing File System Interface. CoRR abs/1902.03154 (2019) - [i39]Tal Ben-Nun, Johannes de Fine Licht, Alexandros Nikolaos Ziogas, Timo Schneider, Torsten Hoefler:
Stateful Dataflow Multigraphs: A Data-Centric Model for High-Performance Parallel Programs. CoRR abs/1902.10345 (2019) - [i38]Maciej Besta, Dimitri Stanojevic, Johannes de Fine Licht, Tal Ben-Nun, Torsten Hoefler:
Graph Processing on FPGAs: Taxonomy, Survey, Challenges. CoRR abs/1903.06697 (2019) - [i37]Maciej Besta, Marcel Schneider, Karolina Cynk, Marek Konieczny, Erik Henriksson, Salvatore Di Girolamo, Ankit Singla, Torsten Hoefler:
FatPaths: Routing in Supercomputers, Data Centers, and Clouds with Low-Diameter Networks when Shortest Paths Fall Short. CoRR abs/1906.10885 (2019) - [i36]Tiziano De Matteis, Johannes de Fine Licht, Torsten Hoefler:
FBLAS: Streaming Linear Algebra on FPGA. CoRR abs/1907.07929 (2019) - [i35]Shigang Li, Tal Ben-Nun, Salvatore Di Girolamo, Dan Alistarh, Torsten Hoefler:
Taming Unbalanced Training Workloads in Deep Learning with Partial Collective Operations. CoRR abs/1908.04207 (2019) - [i34]Salvatore Di Girolamo, Konstantin Taranov, Andreas Kurth, Michael Schaffner, Timo Schneider, Jakub Beránek, Maciej Besta, Luca Benini, Duncan Roweth, Torsten Hoefler:
Network-Accelerated Non-Contiguous Memory Transfers. CoRR abs/1908.08590 (2019) - [i33]Elad Hoffer, Berry Weinstein, Itay Hubara, Tal Ben-Nun, Torsten Hoefler, Daniel Soudry:
Mix & Match: training convnets with mixed image sizes for improved accuracy, speed and scale resiliency. CoRR abs/1908.08986 (2019) - [i32]Grzegorz Kwasniewski, Marko Kabic, Maciej Besta, Joost VandeVondele, Raffaele Solcà, Torsten Hoefler:
Red-blue pebbling revisited: near optimal parallel matrix-matrix multiplication. CoRR abs/1908.09606 (2019) - [i31]Tiziano De Matteis, Johannes de Fine Licht, Jakub Beránek, Torsten Hoefler:
Streaming Message Interface: High-Performance Distributed Memory Programming on Reconfigurable Hardware. CoRR abs/1909.03231 (2019) - [i30]Daniele De Sensi, Salvatore Di Girolamo, Torsten Hoefler:
Mitigating Network Noise on Dragonfly Networks through Application-Aware Routing. CoRR abs/1909.07865 (2019) - [i29]Johannes de Fine Licht, Torsten Hoefler:
hlslib: Software Engineering for Hardware Design. CoRR abs/1910.04436 (2019) - [i28]Maciej Besta, Emanuel Peter, Robert Gerstenberger, Marc Fischer, Michal Podstawski, Claude Barthels, Gustavo Alonso, Torsten Hoefler:
Demystifying Graph Databases: Analysis and Taxonomy of Data Organization, System Designs, and Graph Queries. CoRR abs/1910.09017 (2019) - [i27]Maciej Besta, Torsten Hoefler:
Active Access: A Mechanism for High-Performance Distributed Data-Centric Computations. CoRR abs/1910.12897 (2019) - [i26]Peter Grönquist, Tal Ben-Nun, Nikoli Dryden, Peter Dueben, Luca Lavarini, Shigang Li, Torsten Hoefler:
Predicting Weather Uncertainty with Deep Convnets. CoRR abs/1911.00630 (2019) - [i25]Maciej Besta, Raghavendra Kanakagiri, Harun Mustafa, Mikhail Karasikov, Gunnar Rätsch, Torsten Hoefler, Edgar Solomonik:
Communication-Efficient Jaccard Similarity for High-Performance Distributed Genome Comparisons. CoRR abs/1911.04200 (2019) - [i24]Fabian Schuiki, Florian Zaruba, Torsten Hoefler, Luca Benini:
Stream Semantic Registers: A Lightweight RISC-V ISA Extension Achieving Full Compute Utilization in Single-Issue Cores. CoRR abs/1911.08356 (2019) - [i23]Johannes de Fine Licht, Grzegorz Kwasniewski, Torsten Hoefler:
Flexible Communication Avoiding Matrix Multiplication on FPGA with High-Level Synthesis. CoRR abs/1912.06526 (2019) - [i22]Alexandros Nikolaos Ziogas, Tal Ben-Nun, Guillermo Indalecio Fernández, Timo Schneider, Mathieu Luisier, Torsten Hoefler:
Optimizing the Data Movement in Quantum Transport Simulations via Data-Centric Parallel Programming. CoRR abs/1912.08810 (2019) - [i21]Maciej Besta, Simon Weber, Lukas Gianinazzi, Robert Gerstenberger, Andrey Ivanov, Yishai Oltchik, Torsten Hoefler:
Slim Graph: Practical Lossy Graph Compression for Approximate Graph Processing, Storage, and Analytics. CoRR abs/1912.08950 (2019) - [i20]Maciej Besta, Torsten Hoefler:
Slim Fly: A Cost Effective Low-Diameter Network Topology. CoRR abs/1912.08968 (2019) - [i19]Alexandros Nikolaos Ziogas, Tal Ben-Nun, Guillermo Indalecio Fernández, Timo Schneider, Mathieu Luisier, Torsten Hoefler:
A Data-Centric Approach to Extreme-Scale Ab initio Dissipative Quantum Transport Simulations. CoRR abs/1912.10024 (2019) - [i18]Maciej Besta, Marc Fischer, Vasiliki Kalavri, Michael Kapralov, Torsten Hoefler:
Practice of Streaming and Dynamic Graphs: Concepts, Models, Systems, and Parallelism. CoRR abs/1912.12740 (2019) - 2018
- [j28]Robert Gerstenberger, Maciej Besta, Torsten Hoefler:
Enabling highly scalable remote memory access programming with MPI-3 one sided. Commun. ACM 61(10): 106-113 (2018) - [j27]Shigang Li, Yunquan Zhang, Torsten Hoefler:
Cache-Oblivious MPI All-to-All Communications Based on Morton Order. IEEE Trans. Parallel Distributed Syst. 29(3): 542-555 (2018) - [c145]Maciej Besta, Dimitri Stanojevic, Tijana Zivic, Jagpreet Singh, Maurice Hoerold, Torsten Hoefler:
Log(graph): a near-optimal high-performance graph representation. PACT 2018: 7:1-7:13 - [c144]Maciej Besta, Syed Minhaj Hassan, Sudhakar Yalamanchili, Rachata Ausavarungnirun, Onur Mutlu, Torsten Hoefler:
Slim NoC: A Low-Diameter On-Chip Network Topology for High Energy Efficiency and Scalability. ASPLOS 2018: 43-55 - [c143]Alexandru Calotoiu, Alexander Graf, Torsten Hoefler, Daniel Lorenz, Sebastian Rinke, Felix Wolf:
Lightweight Requirements Engineering for Exascale Co-design. CLUSTER 2018: 201-211 - [c142]Yosuke Oyama, Tal Ben-Nun, Torsten Hoefler, Satoshi Matsuoka:
Accelerating Deep Learning Frameworks with Micro-Batches. CLUSTER 2018: 402-412 - [c141]Konstantin Taranov, Gustavo Alonso, Torsten Hoefler:
Fast and strongly-consistent per-item resilience in key-value stores. EuroSys 2018: 39:1-39:14 - [c140]Ingo Müller, Andrea Arteaga, Torsten Hoefler, Gustavo Alonso:
Reproducible Floating-Point Aggregation in RDBMSs. ICDE 2018: 1049-1060 - [c139]Tal Ben-Nun, Alice Shoshana Jakobovits, Torsten Hoefler:
Neural Code Comprehension: A Learnable Representation of Code Semantics. NeurIPS 2018: 3589-3601 - [c138]Dan Alistarh, Torsten Hoefler, Mikael Johansson, Nikola Konstantinov, Sarit Khirirat, Cédric Renggli:
The Convergence of Sparsified Gradient Methods. NeurIPS 2018: 5977-5987 - [c137]Lukas Gianinazzi, Pavel Kalvoda, Alessandro De Palma, Maciej Besta, Torsten Hoefler:
Communication-avoiding parallel minimum cuts and connected components. PPoPP 2018: 219-232 - [c136]Johannes de Fine Licht, Michaela Blott, Torsten Hoefler:
Designing scalable FPGA architectures using high-level synthesis. PPoPP 2018: 403-404 - [c135]Heng Lin, Xiaowei Zhu, Bowen Yu, Xiongchao Tang, Wei Xue, Wenguang Chen, Lufei Zhang, Torsten Hoefler, Xiaosong Ma, Xin Liu, Weimin Zheng, Jingfang Xu:
ShenTu: processing multi-trillion edge graphs on millions of cores in seconds. SC 2018: 56:1-56:11 - [c134]Cedric Baumann, Andrei Marian Dan, Yuri Meshman, Torsten Hoefler, Martin T. Vechev:
Automatic Verification of RMA Programs via Abstraction Extrapolation. VMCAI 2018: 47-70 - [i17]Cédric Renggli, Dan Alistarh, Torsten Hoefler:
SparCML: High-Performance Sparse Communication for Machine Learning. CoRR abs/1802.08021 (2018) - [i16]Ingo Müller, Andrea Arteaga, Torsten Hoefler, Gustavo Alonso:
Reproducible Floating-Point Aggregation in RDBMSs. CoRR abs/1802.09883 (2018) - [i15]Tal Ben-Nun, Torsten Hoefler:
Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis. CoRR abs/1802.09941 (2018) - [i14]Yosuke Oyama, Tal Ben-Nun, Torsten Hoefler, Satoshi Matsuoka:
μ-cuDNN: Accelerating Deep Learning Frameworks with Micro-Batching. CoRR abs/1804.04806 (2018) - [i13]