Volume 8, Number 1, April 2011

Chen-Yong Cher, Eren Kursun: Exploring the effects of on-chip thermal variation on high-performance multicore architectures. 2
Carole-Jean Wu, Margaret Martonosi: Adaptive timekeeping replacement: Fine-grained capacity management for shared CMP caches. 3
Lucas Vespa, Ning Weng: Deterministic finite automata characterization and optimization for scalable pattern matching. 4
Abhishek Bhattacharjee, Gilberto Contreras, Margaret Martonosi: Parallelization libraries: Characterizing and reducing overheads. 5

Volume 8, Number 2, July 2011

Xiangyu Dong, Yuan Xie, Naveen Muralimanohar, Norman P. Jouppi: Hybrid checkpointing using emerging nonvolatile memories for future exascale systems. 6
Jianjun Li, Chenggang Wu, Wei-Chung Hsu: Efficient and effective misaligned data access handling in a dynamic binary translation system. 7
Guru Venkataramani, Christopher J. Hughes, Sanjeev Kumar, Milos Prvulovic: DeFT: Design space exploration for on-the-fly detection of coherence misses. 8
Jason Hiser, Daniel W. Williams, Wei Hu, Jack W. Davidson, Jason Mars, Bruce R. Childers: Evaluating indirect branch handling mechanisms in software dynamic translation systems. 9

Volume 8, Number 3, October 2011

Xi E. Chen, Tor M. Aamodt: Hybrid analytical modeling of pending cache hits, data prefetching, and MSHRs. 10
Marios Kleanthous, Yiannakis Sazeides: CATCH: A mechanism for dynamically detecting cache-content-duplication in instruction caches. 11
Hans Vandierendonck, André Seznec: Managing SMT resource usage through speculative instruction window weighting. 12
Min Feng, Chen Tian, Changhui Lin, Rajiv Gupta: Dynamic access distance driven cache replacement. 14
Ahmad Samih, Yan Solihin, Anil Krishna: Evaluating placement policies for managing capacity sharing in CMP architectures with private caches. 15
Chang-Ching Yeh, Kuei-Chung Chang, Tien-Fu Chen, Chingwei Yeh: Maintaining performance on power gating of microprocessor functional units by using a predictive pre-wakeup strategy. 16
Hyunjin Lee, Sangyeun Cho, Bruce R. Childers: DEFCAM: A design and evaluation framework for defect-tolerant cache memories. 17

Volume 8, Number 4, January 2012

Per Stenström, Koen De Bosschere: Introduction to the special issue on high-performance and embedded architectures and compilers. 18
Jorge Albericio, Ruben Gran Tejero, Pablo Ibáñez, Víctor Viñals, José María Llabería: ABS: A low-cost adaptive controller for prefetching in a banked shared last-level cache. 19
Ali Galip Bayrak, Nikola Velickovic, Paolo Ienne, Wayne Burleson: An architecture-independent instruction shuffler to protect against side-channel attacks. 20

Jeroen V. Cleemput, Bart Coppens, Bjorn De Sutter: Compiler mitigations for time attacks on modern x86 processors. 23
Jason McCandless, David Gregg: Compiler techniques to improve dynamic branch prediction for indirect jump and call instructions. 24
Antonio García-Guirado, Ricardo Fernández Pascual, Alberto Ros, José M. García: DAPSCO: Distance-aware partially shared cache organization. 25
Zhenjiang Wang, Chenggang Wu, Pen-Chung Yew, Jianjun Li, Di Xu: On-the-fly structure splitting for heap objects. 26
Dibyendu Das, Benoît Dupont de Dinechin, Ramakrishna Upadrasta: Efficient liveness computation using merge sets and DJ-graphs. 27
George Patsilaras, Niket K. Choudhary, James Tuck: Efficiently exploiting memory level parallelism on asymmetric coupled cores in the dark silicon era. 28
Roman Malits, Evgeny Bolotin, Avinoam Kolodny, Avi Mendelson: Exploring the limits of GPGPU scheduling in control flow bound applications. 29
Lois Orosa, Elisardo Antelo, Javier D. Bruguera: FlexSig: Implementing flexible hardware signatures. 30
J. Rubén Titos Gil, Manuel E. Acacio, José M. García, Tim Harris, Adrián Cristal, Osman S. Unsal, Ibrahim Hur, Mateo Valero: Hardware transactional memory with software-defined conflicts. 31
Yongjoo Kim, Jongeun Lee, Toan X. Mai, Yunheung Paek: Improving performance of nested loops on reconfigurable array processors. 32
Petar Radojkovic, Sylvain Girbal, Arnaud Grasset, Eduardo Quiñones, Sami Yehia, Francisco J. Cazorla: On the evaluation of the impact of shared resources in multithreaded COTS processors in time-critical environments. 34
Leonid Domnitser, Aamer Jaleel, Jason Loew, Nael B. Abu-Ghazaleh, Dmitry Ponomarev: Non-monopolizable caches: Low-complexity mitigation of cache side channel attacks. 35
Alejandro Rico, Felipe Cabarcas, Carlos Villavieja, Milan Pavlovic, Augusto Vega, Yoav Etsion, Alex Ramírez, Mateo Valero: On the simulation of large-scale architectures using multiple application abstraction levels. 36
Selma Saidi, Pranav Tendulkar, Thierry Lepley, Oded Maler: Optimizing explicit data transfers for data parallel applications on the cell architecture. 37

Yaozu Dong, Yu Chen, Zhenhao Pan, Jinquan Dai, Yunhong Jiang: ReNIC: Architectural extension to SR-IOV I/O virtualization for efficient replication. 40
Tom M. Bruintjes, Karel H. G. Walters, Sabih H. Gerez, Bert Molenkamp, Gerard J. M. Smit: Sabrewing: A lightweight architecture for combined floating-point and integer arithmetic. 41
Mario Kicherer, Fabian Nowak, Rainer Buchty, Wolfgang Karl: Seamlessly portable applications: Managing the diversity of modern heterogeneous systems. 42
Nathanael Premillieu, André Seznec: SYRANT: SYmmetric resource allocation on not-taken and taken paths. 43
William Hasenplaugh, Pritpal S. Ahuja, Aamer Jaleel, Simon C. Steely Jr., Joel S. Emer: The gradient-based cache partitioning algorithm. 44
Javier Lira, Timothy M. Jones, Carlos Molina, Antonio González: The migration prefetcher: Anticipating data promotion in dynamic NUCA caches. 45
Kishore Kumar Pusukuri, Rajiv Gupta, Laxmi N. Bhuyan: Thread Tranquilizer: Dynamically reducing performance variation. 46
Dong-song Zhang, Deke Guo, Fang-Yuan Chen, Fei Wu, Tong Wu, Ting Cao, Shiyao Jin: TL-plane-based multi-core energy-efficient real-time scheduling algorithm for sporadic tasks. 47
Michael J. Lyons, Mark Hempstead, Gu-Yeon Wei, David Brooks: The accelerator store: A shared memory framework for accelerator-based systems. 48
Daniel A. Orozco, Elkin Garcia, Rishi Khan, Kelly Livingston, Guang R. Gao: Toward high-throughput algorithms on many-core architectures. 49
Kevin Stock, Louis-Noël Pouchet, P. Sadayappan: Using machine learning to improve automatic vectorization. 50
Kanit Therdsteerasukdi, Gyungsu Byun, Jason Cong, M. Frank Chang, Glenn Reinman: Utilizing RF-I and intelligent scheduling for better throughput/watt in a mobile GPU memory system. 51
Frederick Ryckbosch, Stijn Polfliet, Lieven Eeckhout: VSim: Simulating multi-server setups at near native hardware speed. 52
Miao Zhou, Yu Du, Bruce R. Childers, Rami G. Melhem, Daniel Mossé: Writeback-aware partitioning and replacement for last-level caches in phase change main memory systems. 53
Qingping Wang, Sameer Kulkarni, John Cavazos, Michael F. Spear: A transactional memory with automatic performance tuning. 54
Bartosz Bogdanski, Sven-Arne Reinemo, Frank Olaf Sem-Jacobsen, Ernst Gunnar Gran: sFtree: A fully connected and deadlock-free switch-to-switch routing algorithm for fat-trees. 55