Parallel error detection using heterogeneous cores
Accepted version
Peer-reviewed
Repository URI
Repository DOI
Change log
Authors
Abstract
Microprocessor error detection is increasingly important, as the number of transistors in modern systems heightens their vulnerability. In addition, many modern workloads in domains such as the automotive and health industries are increasingly error intolerant, due to strict safety standards. However, current detection techniques require duplication of all hardware structures, causing a considerable increase in power consumption and chip area. Solutions in the literature involve running the code multiple times on the same hardware, which reduces performance significantly and cannot capture all errors.
We have designed a novel hardware-only solution for error detection, that exploits parallelism in checking code which may not exist in the original execution. We pair a high-performance out-of-order core with a set of small low-power cores, each of which checks a portion of the out-of-order core's execution. Our system enables the detection of both hard and soft errors, with low area, power and performance overheads.
Description
Keywords
Journal Title
Conference Name
Journal ISSN
Volume Title
Publisher
Publisher DOI
Sponsorship
Engineering and Physical Sciences Research Council (EP/K026399/1)
Engineering and Physical Sciences Research Council (EP/M506485/1)
Engineering and Physical Sciences Research Council (EP/J016284/1)