Abstract
RingScalar is a complexity-effective microarchitecture for
out-of-order superscalar processors that reduces the area,
latency, and power of all major structures in the instruction
flow. The design divides an N-way superscalar into N columns
connected in a unidirectional ring, where each column
contains a portion of the instruction window, a bank of
the register file, and an ALU. The design exploits the fact that
most decoded instructions are waiting on just one operand to
use only a single tag per issue window entry, and to restrict
instruction wakeup and value bypass to only communicate
with the neighboring column. Detailed simulations of four-issue
single-threaded machines running SPECint2000 show
that RingScalar has IPC only 13% lower than an idealized
superscalar, while providing large reductions in area, power,
and circuit latency.
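
As a rough illustration of the ring organization described above, the following Python sketch models N columns in which a result produced in one column can wake up waiting instructions only in the adjacent column. It is a toy model under stated assumptions, not the simulator used in the evaluation: the names `Column`, `make_ring`, `downstream`, and `broadcast` are hypothetical, the ring direction (column i feeds column i+1) is assumed, and each window entry is represented as a dict carrying the single tag it waits on.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    """One column: a slice of the issue window (register bank and ALU omitted)."""
    index: int
    # Each entry waits on exactly one tag, e.g. {"op": "add", "tag": 7}.
    window: list = field(default_factory=list)


def make_ring(n):
    """Build an n-column ring (hypothetical helper)."""
    return [Column(i) for i in range(n)]


def downstream(i, n):
    """Index of the only column that column i may wake up or bypass to.

    Assumes the unidirectional ring runs from column i to column i+1.
    """
    return (i + 1) % n


def broadcast(ring, src, tag):
    """Wake up entries waiting on `tag`, but only in the column next to `src`.

    Returns the entries that became ready and removes them from the window.
    """
    dst = ring[downstream(src, len(ring))]
    ready = [e for e in dst.window if e["tag"] == tag]
    dst.window = [e for e in dst.window if e["tag"] != tag]
    return ready


ring = make_ring(4)
ring[0].window.append({"op": "add", "tag": 7})
woken_from_neighbor = broadcast(ring, 3, 7)  # column 3 feeds column 0
```

In this sketch a tag broadcast from any column other than the immediate upstream neighbor leaves the entry untouched, which is the restriction that lets each window entry hold a single tag and listen to a single source.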