Abstract for Wallace, Tullsen, Calder, "Instruction Recycling on a Multiple-Path
Processor"
Processors that can execute multiple paths simultaneously
will only exacerbate the fetch bandwidth problem already plaguing conventional
processors. On a multiple-path processor, which speculatively executes
less likely paths of hard-to-predict branches, the work done along a speculative
path is normally discarded if that path is found to be incorrect. Instead,
it can be beneficial to keep these instruction traces stored in the processor
for possible future use. This paper introduces instruction recycling,
where previously decoded instructions from recently executed paths are
injected back into the rename stage. This increases the supply of instructions
to the execution pipeline and decreases fetch latency. In addition, if
the operands have not changed for a recycled instruction, the instruction
can bypass the issue and execution stages, benefiting from instruction
reuse. Instruction recycling and reuse are examined for a simultaneous
multithreading architecture with multiple-path execution. They are shown to
increase performance by 7% for single-program workloads and by 12% for
multiple-program workloads.
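
The distinction between recycling and reuse can be illustrated with a minimal software sketch. It is not the paper's hardware design: the RecycledInstr record, the recycle function, the register names, and the choice of an add operation are all hypothetical, chosen only to show the operand check that decides whether a recycled instruction can also skip issue and execution.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class RecycledInstr:
    """A decoded instruction kept from an earlier path (hypothetical layout)."""
    dest: str
    srcs: Tuple[str, str]
    # Source values and result from the last execution of this instruction;
    # None means it has not executed yet.
    last_src_vals: Optional[Tuple[int, int]] = None
    last_result: Optional[int] = None


def recycle(instr: RecycledInstr, regs: Dict[str, int]) -> str:
    """Inject one stored instruction and report which route it takes.

    Returns "reused" when the stored result is still valid, so issue and
    execution are skipped, or "recycled" when the instruction avoids fetch
    and decode but must still execute. The operation is modeled as an add
    purely for illustration.
    """
    current = (regs[instr.srcs[0]], regs[instr.srcs[1]])
    if instr.last_result is not None and instr.last_src_vals == current:
        regs[instr.dest] = instr.last_result  # reuse: no execution needed
        return "reused"
    result = current[0] + current[1]  # execute with the current operands
    instr.last_src_vals, instr.last_result = current, result
    regs[instr.dest] = result
    return "recycled"


regs = {"r1": 2, "r2": 3, "r3": 0}
add = RecycledInstr(dest="r3", srcs=("r1", "r2"))
first = recycle(add, regs)   # no stored result yet, so it must execute
second = recycle(add, regs)  # operands unchanged, so the result is reused
regs["r1"] = 10
third = recycle(add, regs)   # an operand changed, so it executes again
```

In this sketch every call models an instruction that was injected from the stored trace rather than fetched; only the second call, where the source operands match those of the previous execution, also avoids execution.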