Torsten Hoefler
Torsten Höfler
Person information
- affiliation: ETH Zürich
2020 – today
- 2024
- [j66]Maciej Besta, Robert Gerstenberger, Emanuel Peter, Marc Fischer, Michal Podstawski, Claude Barthels, Gustavo Alonso, Torsten Hoefler:
Demystifying Graph Databases: Analysis and Taxonomy of Data Organization, System Designs, and Graph Queries. ACM Comput. Surv. 56(2): 31:1-31:40 (2024) - [j65]Daniele De Sensi, Edgar Costa Molero, Salvatore Di Girolamo, Laurent Vanbever, Torsten Hoefler:
Canary: Congestion-aware in-network allreduce using dynamic trees. Future Gener. Comput. Syst. 152: 70-82 (2024) - [j64]Peter Bauer, Torsten Hoefler, Bjorn Stevens, Wilco Hazeleger:
Digital twins of Earth and the computing challenge of human interaction. Nat. Comput. Sci. 4(3): 154-157 (2024) - [j63]Maciej Besta, Torsten Hoefler:
Parallel and Distributed Graph Neural Networks: An In-Depth Concurrency Analysis. IEEE Trans. Pattern Anal. Mach. Intell. 46(5): 2584-2606 (2024) - [j62]Thomas Benz, Michael Rogenmoser, Paul Scheffler, Samuel Riedel, Alessandro Ottaviano, Andreas Kurth, Torsten Hoefler, Luca Benini:
A High-Performance, Energy-Efficient Modular DMA Engine Architecture. IEEE Trans. Computers 73(1): 263-277 (2024) - [j61]Jinfan Chen, Shigang Li, Ran Guo, Jinhui Yuan, Torsten Hoefler:
AutoDDL: Automatic Distributed Deep Learning With Near-Optimal Bandwidth Cost. IEEE Trans. Parallel Distributed Syst. 35(8): 1331-1344 (2024) - [c275]Maciej Besta, Nils Blach, Ales Kubicek, Robert Gerstenberger, Michal Podstawski, Lukas Gianinazzi, Joanna Gajda, Tomasz Lehmann, Hubert Niewiadomski, Piotr Nyczyk, Torsten Hoefler:
Graph of Thoughts: Solving Elaborate Problems with Large Language Models. AAAI 2024: 17682-17690 - [c274]Samuel Riedel, Marc Gantenbein, Alessandro Ottaviano, Torsten Hoefler, Luca Benini:
LRSCwait: Enabling Scalable and Efficient Synchronization in Manycore Systems Through Polling-Free and Retry-Free Operation. DATE 2024: 1-6 - [c273]Marcin Copik, Alexandru Calotoiu, Pengyu Zhou, Konstantin Taranov, Torsten Hoefler:
FaaSKeeper: Learning from Building Serverless Services with ZooKeeper as an Example. HPDC 2024: 94-108 - [c272]Piotr Luczynski, Lukas Gianinazzi, Patrick Iff, Leighton Wilson, Daniele De Sensi, Torsten Hoefler:
Near-Optimal Wafer-Scale Reduce. HPDC 2024: 334-347 - [c271]Saleh Ashkboos, Maximilian L. Croci, Marcelo Gennari Do Nascimento, Torsten Hoefler, James Hensman:
SliceGPT: Compress Large Language Models by Deleting Rows and Columns. ICLR 2024 - [c270]Tim Dettmers, Ruslan Svirschevski, Vage Egiazarian, Denis Kuznedelev, Elias Frantar, Saleh Ashkboos, Alexander Borzunov, Torsten Hoefler, Dan Alistarh:
SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression. ICLR 2024 - [c269]Langwen Huang, Lukas Gianinazzi, Yuejiang Yu, Peter D. Düben, Torsten Hoefler:
DiffDA: a Diffusion model for weather-scale Data Assimilation. ICML 2024 - [c268]Marcin Copik, Marcin Chrapek, Larissa Schmid, Alexandru Calotoiu, Torsten Hoefler:
Software Resource Disaggregation for HPC with Serverless Computing. IPDPS 2024: 139-156 - [c267]Yves Baumann, Tal Ben-Nun, Maciej Besta, Lukas Gianinazzi, Torsten Hoefler, Piotr Luczynski:
Low-Depth Spatial Tree Algorithms. IPDPS 2024: 180-192 - [c266]Nils Blach, Maciej Besta, Daniele De Sensi, Jens Domke, Hussein Harake, Shigang Li, Patrick Iff, Marek Konieczny, Kartik Lakhotia, Ales Kubicek, Marcel Ferrari, Fabrizio Petrini, Torsten Hoefler:
A High-Performance Design, Implementation, Deployment, and Evaluation of The Slim Fly Network. NSDI 2024 - [c265]Daniele De Sensi, Tommaso Bonato, David Saam, Torsten Hoefler:
Swing: Short-cutting Rings for Higher Bandwidth Allreduce. NSDI 2024 - [c264]Lukas Gianinazzi, Alexandros Nikolaos Ziogas, Langwen Huang, Piotr Luczynski, Saleh Ashkboos, Florian Scheidl, Armon Carigiet, Chio Ge, Nabil Abubaker, Maciej Besta, Tal Ben-Nun, Torsten Hoefler:
Arrow Matrix Decomposition: A Novel Approach for Communication-Efficient Sparse Matrix Multiplication. PPoPP 2024: 404-416 - [c263]Kartik Lakhotia, Laura Monroe, Kelly Isham, Maciej Besta, Nils Blach, Torsten Hoefler, Fabrizio Petrini:
PolarStar: Expanding the Horizon of Diameter-3 Networks. SPAA 2024: 345-357 - [c262]Mikhail Khalilov, Marcin Chrapek, Siyuan Shen, Alessandro Vezzu, Thomas Benz, Salvatore Di Girolamo, Timo Schneider, Daniele De Sensi, Luca Benini, Torsten Hoefler:
OSMOSIS: Enabling Multi-Tenancy in Datacenter SmartNICs. USENIX ATC 2024: 247-263 - [i184]Torsten Hoefler, Marcin Copik, Pete Beckman, Andrew Jones, Ian T. Foster, Manish Parashar, Daniel A. Reed, Matthias Troyer, Thomas C. Schulthess, Dan Ernst, Jack J. Dongarra:
XaaS: Acceleration as a Service to Enable Productive High-Performance Cloud Computing. CoRR abs/2401.04552 (2024) - [i183]Langwen Huang, Lukas Gianinazzi, Yuejiang Yu, Peter D. Düben, Torsten Hoefler:
DiffDA: a diffusion model for weather-scale data assimilation. CoRR abs/2401.05932 (2024) - [i182]Daniele De Sensi, Tommaso Bonato, David Saam, Torsten Hoefler:
Swing: Short-cutting Rings for Higher Bandwidth Allreduce. CoRR abs/2401.09356 (2024) - [i181]Samuel Riedel, Marc Gantenbein, Alessandro Ottaviano, Torsten Hoefler, Luca Benini:
LRSCwait: Enabling Scalable and Efficient Synchronization in Manycore Systems through Polling-Free and Retry-Free Operation. CoRR abs/2401.09359 (2024) - [i180]Lukas Möller, Marcin Copik, Alexandru Calotoiu, Torsten Hoefler:
Cppless: Productive and Performant Serverless Programming in C++. CoRR abs/2401.10834 (2024) - [i179]Marcin Copik, Marcin Chrapek, Larissa Schmid, Alexandru Calotoiu, Torsten Hoefler:
Software Resource Disaggregation for HPC with Serverless Computing. CoRR abs/2401.10852 (2024) - [i178]Maciej Besta, Florim Memedi, Zhenyu Zhang, Robert Gerstenberger, Nils Blach, Piotr Nyczyk, Marcin Copik, Grzegorz Kwasniewski, Jürgen Müller, Lukas Gianinazzi, Ales Kubicek, Hubert Niewiadomski, Onur Mutlu, Torsten Hoefler:
Topologies of Reasoning: Demystifying Chains, Trees, and Graphs of Thoughts. CoRR abs/2401.14295 (2024) - [i177]Saleh Ashkboos, Maximilian L. Croci, Marcelo Gennari Do Nascimento, Torsten Hoefler, James Hensman:
SliceGPT: Compress Large Language Models by Deleting Rows and Columns. CoRR abs/2401.15024 (2024) - [i176]Lukas Gianinazzi, Alexandros Nikolaos Ziogas, Langwen Huang, Piotr Luczynski, Saleh Ashkboos, Florian Scheidl, Armon Carigiet, Chio Ge, Nabil Abubaker, Maciej Besta, Tal Ben-Nun, Torsten Hoefler:
Arrow Matrix Decomposition: A Novel Approach for Communication-Efficient Sparse Matrix Multiplication. CoRR abs/2402.19364 (2024) - [i175]Saleh Ashkboos, Amirkeivan Mohtashami, Maximilian L. Croci, Bo Li, Martin Jaggi, Dan Alistarh, Torsten Hoefler, James Hensman:
QuaRot: Outlier-Free 4-Bit Inference in Rotated LLMs. CoRR abs/2404.00456 (2024) - [i174]Tommaso Bonato, Abdul Kabbani, Daniele De Sensi, Rong Pan, Yanfang Le, Costin Raiciu, Mark Handley, Timo Schneider, Nils Blach, Ahmad Ghalayini, Daniel S. F. Alves, Michael Papamichael, Adrian M. Caulfield, Torsten Hoefler:
SMaRTT-REPS: Sender-based Marked Rapidly-adapting Trimmed & Timed Transport with Recycled Entropies. CoRR abs/2404.01630 (2024) - [i173]Yves Baumann, Tal Ben-Nun, Maciej Besta, Lukas Gianinazzi, Torsten Hoefler, Piotr Luczynski:
Low-Depth Spatial Tree Algorithms. CoRR abs/2404.12953 (2024) - [i172]Siyuan Shen, Langwen Huang, Marcin Chrapek, Timo Schneider, Jai Dayal, Manisha Gajbe, Robert Wisniewski, Torsten Hoefler:
LLAMP: Assessing Network Latency Tolerance of HPC Applications with Linear Programming. CoRR abs/2404.14193 (2024) - [i171]Piotr Luczynski, Lukas Gianinazzi, Patrick Iff, Leighton Wilson, Daniele De Sensi, Torsten Hoefler:
Near-Optimal Wafer-Scale Reduce. CoRR abs/2404.15888 (2024) - [i170]Nabil Abubaker, Torsten Hoefler:
SpComm3D: A Framework for Enabling Sparse Communication in 3D Sparse Kernels. CoRR abs/2404.19638 (2024) - [i169]Torsten Hoefler, Alexandru Calotoiu, Anurag Dipankar, Thomas C. Schulthess, Xavier Lapillonne, Oliver Fuhrer:
Towards Specialized Supercomputers for Climate Sciences: Computational Requirements of the Icosahedral Nonhydrostatic Weather and Climate Model. CoRR abs/2405.13043 (2024) - [i168]Timo Schneider, Pengcheng Xu, Torsten Hoefler:
FPsPIN: An FPGA-based Open-Hardware Research Platform for Processing in the Network. CoRR abs/2405.16378 (2024) - [i167]Maciej Besta, Lorenzo Paleari, Ales Kubicek, Piotr Nyczyk, Robert Gerstenberger, Patrick Iff, Tomasz Lehmann, Hubert Niewiadomski, Torsten Hoefler:
CheckEmbed: Effective Verification of LLM Solutions to Open-Ended Tasks. CoRR abs/2406.02524 (2024) - [i166]Maciej Besta, Ales Kubicek, Roman Niggli, Robert Gerstenberger, Lucas Weitzendorf, Mingyuan Chi, Patrick Iff, Joanna Gajda, Piotr Nyczyk, Jürgen Müller, Hubert Niewiadomski, Marcin Chrapek, Michal Podstawski, Torsten Hoefler:
Multi-Head RAG: Solving Multi-Aspect Problems with LLMs. CoRR abs/2406.05085 (2024) - [i165]Wenqi Jiang, Hang Hu, Torsten Hoefler, Gustavo Alonso:
Accelerating Graph-based Vector Search via Delayed-Synchronization Traversal. CoRR abs/2406.12385 (2024) - [i164]Maciej Besta, Florian Scheidl, Lukas Gianinazzi, Shachar Klaiman, Jürgen Müller, Torsten Hoefler:
Demystifying Higher-Order Graph Neural Networks. CoRR abs/2406.12841 (2024) - [i163]Tommaso Bonato, Abdul Kabbani, Ahmad Ghalayini, Mohammad Dohadwala, Michael Papamichael, Daniele De Sensi, Torsten Hoefler:
REPS: Recycling Entropies for Packet Spraying to Adaptively Explore Paths and Mitigate Failures. CoRR abs/2407.21625 (2024) - [i162]Patrik Okanovic, Grzegorz Kwasniewski, Paolo Sylos Labini, Maciej Besta, Flavio Vella, Torsten Hoefler:
High Performance Unstructured SpMM Computation Using Tensor Cores. CoRR abs/2408.11551 (2024) - [i161]Luigi Fusco, Mikhail Khalilov, Marcin Chrapek, Giridhar Chukkapalli, Thomas C. Schulthess, Torsten Hoefler:
Understanding Data Movement in Tightly Coupled Heterogeneous Systems: A Case Study with the Grace Hopper Superchip. CoRR abs/2408.11556 (2024) - [i160]Elias Frantar, Roberto L. Castro, Jiale Chen, Torsten Hoefler, Dan Alistarh:
MARLIN: Mixed-Precision Auto-Regressive Parallel Inference on Large Language Models. CoRR abs/2408.11743 (2024) - [i159]Maciej Besta, Robert Gerstenberger, Patrick Iff, Pournima Sonawane, Juan Gómez-Luna, Raghavendra Kanakagiri, Rui Min, Onur Mutlu, Torsten Hoefler, Raja Appuswamy, Aidan O'Mahony:
Hardware Acceleration for Knowledge Graph Processing: Challenges & Recent Developments. CoRR abs/2408.12173 (2024) - [i158]Mikhail Khalilov, Salvatore Di Girolamo, Marcin Chrapek, Rami Nudelman, Gil Bloch, Torsten Hoefler:
Network-Offloaded Bandwidth-Optimal Broadcast and Allgather for Distributed AI. CoRR abs/2408.13356 (2024) - [i157]Daniele De Sensi, Lorenzo Pichetti, Flavio Vella, Tiziano De Matteis, Zebin Ren, Luigi Fusco, Matteo Turisini, Daniele Cesarini, Kurt Lust, Animesh Trivedi, Duncan Roweth, Filippo Spiga, Salvatore Di Girolamo, Torsten Hoefler:
Exploring GPU-to-GPU Communication: Insights into Supercomputer Interconnects. CoRR abs/2408.14090 (2024)
- 2023
- [j60]Torsten Hoefler, Thomas Häner, Matthias Troyer:
Disentangling Hype from Practicality: On Realistically Achieving Quantum Advantage. Commun. ACM 66(5): 82-87 (2023) - [j59]Torsten Hoefler, Duncan Roweth, Keith D. Underwood, Robert Alverson, Mark Griswold, Vahid Tabatabaee, Mohan Kalkunte, Surendra Anubolu, Siyuan Shen, Moray McLaren, Abdul Kabbani, Steve Scott:
Data Center Ethernet and Remote Direct Memory Access: Issues at Hyperscale. Computer 56(7): 67-77 (2023) - [j58]Torsten Hoefler, Bjorn Stevens, Andreas F. Prein, Johanna Baehr, Thomas C. Schulthess, Thomas F. Stocker, John A. Taylor, Daniel Klocke, Pekka Manninen, Piers M. Forster, Tobias Kölling, Nicolas Gruber, Hartwig Anzt, Claudia Frauen, Florian Ziemen, Milan Klöwer, Karthik Kashinath, Christoph M. Schär, Oliver Fuhrer, Bryan N. Lawrence:
Earth Virtualization Engines: A Technical Perspective. Comput. Sci. Eng. 25(3): 50-59 (2023) - [j57]Satoshi Matsuoka, Jens Domke, Mohamed Wahib, Aleksandr Drozd, Torsten Hoefler:
Myths and legends in high-performance computing. Int. J. High Perform. Comput. Appl. 37(3-4): 245-259 (2023) - [j56]Maciej Besta, Marc Fischer, Vasiliki Kalavri, Michael Kapralov, Torsten Hoefler:
Practice of Streaming Processing of Dynamic Graphs: Concepts, Models, and Systems. IEEE Trans. Parallel Distributed Syst. 34(6): 1860-1876 (2023) - [j55]Paul Scheffler, Florian Zaruba, Fabian Schuiki, Torsten Hoefler, Luca Benini:
Sparse Stream Semantic Registers: A Lightweight ISA Extension Accelerating General Sparse Linear Algebra. IEEE Trans. Parallel Distributed Syst. 34(12): 3147-3161 (2023) - [c261]Wei Qiu, Marcin Copik, Yun Wang, Alexandru Calotoiu, Torsten Hoefler:
User-guided Page Merging for Memory Deduplication in Serverless Systems. IEEE Big Data 2023: 159-169 - [c260]Tal Ben-Nun, Berke Ates, Alexandru Calotoiu, Torsten Hoefler:
Bridging Control-Centric and Data-Centric Optimization. CGO 2023: 173-185 - [c259]Tal Ben-Nun, Lukas Gianinazzi, Torsten Hoefler, Yishai Oltchik:
Maximum Flows in Parametric Graph Templates. CIAC 2023: 97-111 - [c258]Patrick Iff, Maciej Besta, Matheus A. Cavalcante, Tim Fischer, Luca Benini, Torsten Hoefler:
Sparse Hamming Graph: A Customizable Network-on-Chip Topology. DAC 2023: 1-6 - [c257]Patrick Iff, Maciej Besta, Matheus A. Cavalcante, Tim Fischer, Luca Benini, Torsten Hoefler:
HexaMesh: Scaling to Hundreds of Chiplets with an Optimized Chiplet Arrangement. DAC 2023: 1-6 - [c256]Tiziano De Matteis, Lukas Gianinazzi, Johannes de Fine Licht, Torsten Hoefler:
Streaming Task Graph Scheduling for Dataflow Architectures. HPDC 2023: 225-237 - [c255]Yunqiang Li, Jan C. van Gemert, Torsten Hoefler, Bert Moons, Evangelos Eleftheriou, Bram-Ernst Verhoef:
Differentiable Transportation Pruning. ICCV 2023: 16911-16921 - [c254]Elias Frantar, Saleh Ashkboos, Torsten Hoefler, Dan Alistarh:
OPTQ: Accurate Quantization for Generative Pre-trained Transformers. ICLR 2023 - [c253]Langwen Huang, Torsten Hoefler:
Compressing multidimensional weather and climate data into neural networks. ICLR 2023 - [c252]Lukas Trümper, Tal Ben-Nun, Philipp Schaad, Alexandru Calotoiu, Torsten Hoefler:
Performance Embeddings: A Similarity-Based Transfer Tuning Approach to Performance Optimization. ICS 2023: 50-62 - [c251]Marcin Copik, Roman Böhringer, Alexandru Calotoiu, Torsten Hoefler:
FMI: Fast and Cheap Message Passing for Serverless Functions. ICS 2023: 373-385 - [c250]Marcin Copik, Konstantin Taranov, Alexandru Calotoiu, Torsten Hoefler:
rFaaS: Enabling High Performance Serverless with RDMA and Leases. IPDPS 2023: 897-907 - [c249]Maciej Besta, Afonso Claudino Catarino, Lukas Gianinazzi, Nils Blach, Piotr Nyczyk, Hubert Niewiadomski, Torsten Hoefler:
HOT: Higher-Order Dynamic Graph Representation Learning With Efficient Transformers. LoG 2023: 15 - [c248]Kazuki Osawa, Shigang Li, Torsten Hoefler:
PipeFisher: Efficient Training of Large Language Models Using Pipelining and Fisher Information Matrices. MLSys 2023 - [c247]Tommy Nguyen, Yue Shi, Samuel Alexander Stein, Tim Stavenger, Marvin Warner, Martin Roetteler, Torsten Hoefler, Ang Li:
A Reference Implementation for a Quantum Message Passing Interface. QCE 2023: 292-293 - [c246]Maciej Besta, Robert Gerstenberger, Marc Fischer, Michal Podstawski, Nils Blach, Berke Egeli, George Mitenkov, Wojciech Chlapek, Marek T. Michalewicz, Hubert Niewiadomski, Jürgen Müller, Torsten Hoefler:
The Graph Database Interface: Scaling Online Transactional and Analytical Graph Workloads to Hundreds of Thousands of Cores. SC 2023: 22:1-22:18 - [c245]Marcin Chrapek, Mikhail Khalilov, Torsten Hoefler:
HEAR: Homomorphically Encrypted Allreduce. SC 2023: 36:1-36:17 - [c244]Maciej Besta, Pawel Renc, Robert Gerstenberger, Paolo Sylos Labini, Alexandros Nikolaos Ziogas, Tiancheng Chen, Lukas Gianinazzi, Florian Scheidl, Kalman Szenes, Armon Carigiet, Patrick Iff, Grzegorz Kwasniewski, Raghavendra Kanakagiri, Chio Ge, Sammy Jaeger, Jaroslaw Was, Flavio Vella, Torsten Hoefler:
High-Performance and Programmable Attentional Graph Neural Networks with Global Tensor Formulations. SC 2023: 66:1-66:16 - [c243]Roberto L. Castro, Andrei Ivanov, Diego Andrade, Tal Ben-Nun, Basilio B. Fraguela, Torsten Hoefler:
VENOM: A Vectorized N: M Format for Unleashing the Power of Sparse Tensor Cores. SC 2023: 72:1-72:14 - [c242]Wenqi Jiang, Shigang Li, Yu Zhu, Johannes de Fine Licht, Zhenhao He, Runbin Shi, Cédric Renggli, Shuai Zhang, Theodoros Rekatsinas, Torsten Hoefler, Gustavo Alonso:
Co-design Hardware and Algorithm for Vector Search. SC 2023: 87:1-87:15 - [c241]Philipp Schaad, Timo Schneider, Tal Ben-Nun, Alexandru Calotoiu, Alexandros Nikolaos Ziogas, Torsten Hoefler:
FuzzyFlow: Leveraging Dataflow To Find and Squash Program Optimization Bugs. SC 2023: 88:1-88:15 - [c240]Yue Shi, Tommy Nguyen, Samuel Alexander Stein, Tim Stavenger, Marvin Warner, Martin Roetteler, Torsten Hoefler, Ang Li:
A Reference Implementation for a Quantum Message Passing Interface. SC Workshops 2023: 1420-1425 - [c239]Daniele De Sensi, Tiziano De Matteis, Konstantin Taranov, Salvatore Di Girolamo, Tobias Rahn, Torsten Hoefler:
Noise in the Clouds: Influence of Network Performance Variability on Application Scalability. SIGMETRICS (Abstracts) 2023: 17-18 - [c238]Kartik Lakhotia, Kelly Isham, Laura Monroe, Maciej Besta, Torsten Hoefler, Fabrizio Petrini:
In-network Allreduce with Multiple Spanning Trees on PolarFly. SPAA 2023: 165-176 - [c237]Andrei Ivanov, Benjamin Rothenberger, Arnaud Dethise, Marco Canini, Torsten Hoefler, Adrian Perrig:
SAGE: Software-based Attestation for GPU Execution. USENIX ATC 2023: 485-499 - [d3]Maciej Besta, Robert Gerstenberger, Marc Fischer, Michal Podstawski, Jürgen Müller, Nils Blach, Berke Egeli, George Mitenkov, Marek T. Michalewicz, Torsten Hoefler:
GDI-RMA 0.1 Software Artifact. Zenodo, 2023 - [d2]Maciej Besta, Pawel Renc, Robert Gerstenberger, Paolo Sylos Labini, Alexandros Nikolaos Ziogas, Tiancheng Chen, Lukas Gianinazzi, Florian Scheidl, Kalman Szenes, Armon Carigiet, Patrick Iff, Grzegorz Kwasniewski, Raghavendra Kanakagiri, Chio Ge, Sammy Jaeger, Jaroslaw Was, Flavio Vella, Torsten Hoefler:
GNN Scaling 0.1 Software Artifact. Zenodo, 2023 - [d1]Lukas Gianinazzi, Alexandros Nikolaos Ziogas, Piotr Luczynski, Saleh Ashkboos, Langwen Huang, Florian Scheidl, Chio Ge, Armon Carigiet, Maciej Besta, Tal Ben-Nun, Torsten Hoefler:
Arrow Matrix Decompositions. Zenodo, 2023 - [i156]Niels Gleinig, Tal Ben-Nun, Torsten Hoefler:
A Theory of I/O-Efficient Sparse Neural Network Inference. CoRR abs/2301.01048 (2023) - [i155]Satoshi Matsuoka, Jens Domke, Mohamed Wahib, Aleksandr Drozd, Torsten Hoefler:
Myths and Legends in High-Performance Computing. CoRR abs/2301.02432 (2023) - [i154]Jinfan Chen, Shigang Li, Ran Guo, Jinhui Yuan, Torsten Hoefler:
AutoDDL: Automatic Distributed Deep Learning with Asymptotically Optimal Communication. CoRR abs/2301.06813 (2023) - [i153]Niels Gleinig, Tobias Rohner, Torsten Hoefler:
Approximate Reversible Circuits for NISQ-Era Quantum Computers. CoRR abs/2302.01066 (2023) - [i152]Torsten Hoefler, Duncan Roweth, Keith D. Underwood, Bob Alverson, Mark Griswold, Vahid Tabatabaee, Mohan Kalkunte, Surendra Anubolu, Siyuan Shen, Abdul Kabbani, Moray McLaren, Steve Scott:
Datacenter Ethernet and RDMA: Issues at Hyperscale. CoRR abs/2302.03337 (2023) - [i151]Kartik Lakhotia, Laura Monroe, Kelly Isham, Maciej Besta, Nils Blach, Torsten Hoefler, Fabrizio Petrini:
PolarStar: Expanding the Scalability Horizon of Diameter-3 Networks. CoRR abs/2302.07217 (2023) - [i150]Lukas Trümper, Tal Ben-Nun, Philipp Schaad, Alexandru Calotoiu, Torsten Hoefler:
Performance Embeddings: A Similarity-based Approach to Automatic Performance Optimization. CoRR abs/2303.08142 (2023) - [i149]Andrei Ivanov, Nikoli Dryden, Tal Ben-Nun, Saleh Ashkboos, Torsten Hoefler:
STen: Productive and Efficient Sparsity in PyTorch. CoRR abs/2304.07613 (2023) - [i148]Kazuki Osawa, Satoki Ishikawa, Rio Yokota, Shigang Li, Torsten Hoefler:
ASDL: A Unified Interface for Gradient Preconditioning in PyTorch. CoRR abs/2305.04684 (2023) - [i147]Thomas Benz, Michael Rogenmoser, Paul Scheffler, Samuel Riedel, Alessandro Ottaviano, Andreas Kurth, Torsten Hoefler, Luca Benini:
A High-performance, Energy-efficient Modular DMA Engine Architecture. CoRR abs/2305.05240 (2023) - [i146]Paul Scheffler, Florian Zaruba, Fabian Schuiki, Torsten Hoefler, Luca Benini:
Sparse Stream Semantic Registers: A Lightweight ISA Extension Accelerating General Sparse Linear Algebra. CoRR abs/2305.05559 (2023) - [i145]Marcin Copik, Roman Böhringer, Alexandru Calotoiu, Torsten Hoefler:
FMI: Fast and Cheap Message Passing for Serverless Functions. CoRR abs/2305.08763 (2023) - [i144]Maciej Besta, Robert Gerstenberger, Marc Fischer, Michal Podstawski, Jürgen Müller, Nils Blach, Berke Egeli, George Mitenkov, Wojciech Chlapek, Marek T. Michalewicz, Torsten Hoefler:
High-Performance Graph Databases That Are Portable, Programmable, and Scale to Hundreds of Thousands of Cores. CoRR abs/2305.11162 (2023) - [i143]Tal Ben-Nun, Berke Ates, Alexandru Calotoiu, Torsten Hoefler:
Bridging Control-Centric and Data-Centric Optimization. CoRR abs/2306.00366 (2023) - [i142]Tiziano De Matteis, Lukas Gianinazzi, Johannes de Fine Licht, Torsten Hoefler:
Streaming Task Graph Scheduling for Dataflow Architectures. CoRR abs/2306.02730 (2023) - [i141]Tim Dettmers, Ruslan Svirschevski, Vage Egiazarian, Denis Kuznedelev, Elias Frantar, Saleh Ashkboos, Alexander Borzunov, Torsten Hoefler, Dan Alistarh:
SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression. CoRR abs/2306.03078 (2023) - [i140]Wenqi Jiang, Shigang Li, Yu Zhu, Johannes de Fine Licht, Zhenhao He, Runbin Shi, Cédric Renggli, Shuai Zhang, Theodoros Rekatsinas, Torsten Hoefler, Gustavo Alonso:
Co-design Hardware and Algorithm for Vector Search. CoRR abs/2306.11182 (2023) - [i139]Philipp Schaad, Timo Schneider, Tal Ben-Nun, Alexandru Calotoiu, Alexandros Nikolaos Ziogas, Torsten Hoefler:
FuzzyFlow: Leveraging Dataflow To Find and Squash Program Optimization Bugs. CoRR abs/2306.16178 (2023) - [i138]Torsten Hoefler, Thomas Häner, Matthias Troyer:
Disentangling Hype from Practicality: On Realistically Achieving Quantum Advantage. CoRR abs/2307.00523 (2023) - [i137]Tal Ben-Nun, Lukas Gianinazzi, Torsten Hoefler, Yishai Oltchik:
Maximum Flows in Parametric Graph Templates. CoRR abs/2307.08420 (2023) - [i136]Yunqiang Li, Jan C. van Gemert, Torsten Hoefler, Bert Moons, Evangelos Eleftheriou, Bram-Ernst Verhoef:
Differentiable Transportation Pruning. CoRR abs/2307.08483 (2023) - [i135]Maciej Besta, Nils Blach, Ales Kubicek, Robert Gerstenberger, Lukas Gianinazzi, Joanna Gajda, Tomasz Lehmann, Michal Podstawski, Hubert Niewiadomski, Piotr Nyczyk, Torsten Hoefler:
Graph of Thoughts: Solving Elaborate Problems with Large Language Models. CoRR abs/2308.09687 (2023) - [i134]Julia Bazinska, Andrei Ivanov, Tal Ben-Nun, Nikoli Dryden, Maciej Besta, Siyuan Shen, Torsten Hoefler:
Cached Operator Reordering: A Unified View for Fast GNN Training. CoRR abs/2308.12093 (2023) - [i133]