Abstract for Collins, Tullsen, Wang, Shen, "Dynamic Speculative Precomputation"
A large number of memory accesses in memory-bound applications
are irregular, such as pointer dereferences, and can be effectively
targeted by thread-based prefetching techniques like Speculative Precomputation. These techniques
execute instructions, for example on an available SMT thread context, that have been extracted directly
from the program they are trying to accelerate. Previously proposed techniques typically
require manual user intervention
to extract and optimize instruction sequences.
This paper proposes Dynamic Speculative Precomputation, which performs all necessary instruction analysis,
extraction, and optimization through the use of back-end instruction
analysis hardware, located off the processor's critical path.
For a set of memory-limited benchmarks, an average speedup of 14% is achieved when constructing
simple p-slices, and this gain
grows to 33% when making use of aggressive optimizations.
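
The abstract does not define a p-slice. As a rough illustration of the kind of instruction analysis and extraction described above, the following Python sketch computes a backward register-dependence slice for one load in a toy instruction trace. The trace, the `Instr` record, and the `backward_slice` function are hypothetical names made up for this example. The sketch ignores memory dependences and control flow, and it is not the paper's hardware mechanism.

```python
from typing import List, NamedTuple, Optional, Set, Tuple


class Instr(NamedTuple):
    """One instruction in a toy trace (hypothetical format, not from the paper)."""
    text: str                 # human-readable form
    dest: Optional[str]       # register written, or None (e.g. a store)
    srcs: Tuple[str, ...]     # registers read


def backward_slice(trace: List[Instr], load_index: int) -> Tuple[List[Instr], Set[str]]:
    """Return the instructions the load at load_index depends on, plus live-in registers.

    Assumption: only register dependences are tracked; memory dependences
    and branches are ignored in this sketch.
    """
    live = set(trace[load_index].srcs)   # registers whose producers are still needed
    kept = [load_index]
    for i in range(load_index - 1, -1, -1):
        ins = trace[i]
        if ins.dest is not None and ins.dest in live:
            live.discard(ins.dest)       # this instruction produces the value
            live.update(ins.srcs)        # now its own inputs are needed
            kept.append(i)
    kept.reverse()                       # restore program order
    return [trace[i] for i in kept], live


# One iteration of a made-up pointer-chasing loop.
TRACE = [
    Instr("add r5 = r5, 1",   "r5", ("r5",)),
    Instr("ld  r1 = [r2 + 8]", "r1", ("r2",)),        # node = node->next
    Instr("mul r6 = r5, r7",  "r6", ("r5", "r7")),
    Instr("add r3 = r1, 16",  "r3", ("r1",)),
    Instr("st  [r8] = r6",    None, ("r8", "r6")),
    Instr("ld  r4 = [r3]",    "r4", ("r3",)),         # the load to be prefetched
]

SLICE, LIVE_INS = backward_slice(TRACE, 5)
for ins in SLICE:
    print(ins.text)
print("live-in registers:", sorted(LIVE_INS))
```

In this toy trace, only three of the six instructions feed the address of the final load, and a helper thread would need only the value of `r2` to run them ahead of the main thread.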