"Matrix multiplication beyond auto-tuning: rewrite-based GPU code generation."

Michel Steuwer, Toomas Remmelg, Christophe Dubach (2016)