A High-Bandwidth Memory Pipeline For Wide-Issue Processors
Sangyeun Cho, Samsung Electronics
Tuesday, April 6, 2004
10:00am - SENSQ 5317
Refreshments at 9:30am in SENSQ 5319
Abstract
Providing adequate data bandwidth is essential for a wide-issue processor to achieve its full performance potential. Adding a large number of ports to a data cache, however, becomes increasingly inefficient and adds significant hardware complexity. We take an alternative, or complementary, approach to providing more data bandwidth, called data decoupling. We further study an interesting yet less explored behavior of memory access instructions, called access region locality, which concerns each static memory instruction and the range of locations it accesses at run time. Our experimental study using a set of SPEC95 benchmark programs shows that most memory access instructions reference a single region during program execution. It also shows that the access region of a memory instruction can be predicted accurately by examining the instruction's addressing mode and its past access history. We describe and evaluate a wide-issue superscalar processor with two distinct sets of memory pipelines and caches, driven by the access region predictor. Experimental results indicate that the proposed mechanism is very effective in providing high memory bandwidth to the processor, yielding performance comparable to or better than that of a conventional memory design with a heavily multi-ported data cache, which can incur much higher hardware complexity.
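As a rough illustration of the prediction idea in the abstract, here is a minimal sketch of an access region predictor. It is not the speaker's design. The split into "stack" and "non-stack" regions, the register names "sp" and "fp", the table size, and the one-entry-per-instruction history are all assumptions made for this example. The sketch consults the addressing mode first and falls back to the region last observed for that instruction.

```python
from dataclasses import dataclass, field

# Assumed region names and base registers; the abstract does not specify them.
STACK, NON_STACK = "stack", "non_stack"
STACK_BASE_REGS = {"sp", "fp"}


def actual_region(addr, stack_low, stack_high):
    """Classify a resolved address against a hypothetical stack range."""
    return STACK if stack_low <= addr < stack_high else NON_STACK


@dataclass
class AccessRegionPredictor:
    """Hypothetical predictor: addressing-mode hint plus last-region history."""
    table_size: int = 1024                      # assumed number of entries
    table: dict = field(default_factory=dict)   # index -> last observed region

    def _index(self, pc):
        # Drop the low two bits (assumes 4-byte instructions), then wrap.
        return (pc >> 2) % self.table_size

    def predict(self, pc, base_reg):
        # Addressing-mode hint: stack/frame-pointer-based accesses go to the stack.
        if base_reg in STACK_BASE_REGS:
            return STACK
        # Otherwise use the region this instruction touched last time,
        # defaulting to non-stack when there is no history yet.
        return self.table.get(self._index(pc), NON_STACK)

    def update(self, pc, region):
        # Record the region actually accessed once the address is known.
        self.table[self._index(pc)] = region
```

In such a scheme, the predicted region would select which of the two memory pipelines an instruction is steered to before its address is computed, and update() would be called once the actual address resolves.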
Biosketch
Sangyeun Cho received the BS degree in computer engineering from Seoul National University, Seoul, Korea, in 1994. He earned his MS and PhD degrees in computer science from the University of Minnesota in Minneapolis in 1996 and 2002, respectively. Since 1999, he has been with Samsung Electronics Co., where he has designed several generations of the CalmRISC(TM) embedded processor core and their caches. In 1998, he was an intern software engineer at the Intel Microprocessor Research Lab (MRL).





