

Lluc Alvarez, Ramon Bertran, Marc Gonzalez, Xavier Martorell, Nacho Navarro, Eduard Ayguade. UPC-DAC-RR-CAP-2011-19 (Grup de Computació d'Altes Prestacions) Design Space Exploration of CMPs with Caches and Local Memories. Research Report Departament d'Arquitectura de Computadors (DAC) - UPC, July 2011.



Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.


Chip multiprocessors (CMPs) are the dominant architecture today. Current CMP designs vary widely in number of cores and memory subsystem because they are used across a wide spectrum of domains, so the best configuration depends heavily on design goals such as performance, energy consumption, scalability, area and programmability. This paper studies different chip configurations in terms of number of cores, size of the shared L3 cache and off-chip bandwidth requirements in order to find the most efficient design for High Performance Computing (HPC) applications. In addition, it analyses two types of CMPs: CMPs with a traditional cache hierarchy, or cache-based CMPs, and CMPs with both a cache hierarchy and local memories, or hybrid memory CMPs. Results show that, for HPC workloads, cache-based CMPs perform better when the shared L3 cache is reduced to make room for additional cores and the bus provides high bandwidth, reaching a speedup of 3.31x over a baseline architecture. The best chip configurations of hybrid memory CMPs are those with a moderate number of cores and not-so-small L3 caches; furthermore, they do not need a high-bandwidth bus. They achieve a maximum speedup of 3.06x over the baseline architecture. In the direct comparison between the chip configurations of the two types of CMPs, hybrid memory CMPs outperform cache-based CMPs in almost all configurations, achieving a maximum speedup of 1.74x.

BibTex Reference

   Author = {Alvarez, Lluc and Bertran, Ramon and Gonzalez, Marc and Martorell, Xavier and Navarro, Nacho and Ayguade, Eduard},
   Title = {{UPC-DAC-RR-CAP-2011-19 (Grup de Computació d'Altes Prestacions) Design Space Exploration of CMPs with Caches and Local Memories}},
   Institution = {Departament d'Arquitectura de Computadors (DAC) - UPC},
   Month = {July},
   Year = {2011}



Last update: Jan 2, 2015
Copyright © 2000-2015 Departament d'Arquitectura de Computadors

It has been automatically generated using the bib2html program.