NAN propagation instead of fault trapping. Can we avoid speculative execution?
Posted: 2018-05-24, 11:07:49
Floating point calculations can generate infinity (INF) and not-a-number (NAN) in case of errors. These codes will propagate to the end result of a sequence of calculations in most cases. This is a convenient way of detecting floating point errors, and it is more efficient than using traps (software interrupts) for detecting numerical errors. Traps are particularly troublesome if vector registers are used.
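A minimal sketch in C of what this looks like in practice, assuming the default floating point environment where exceptions are masked rather than trapped. The function names sum_of_ratios and result_is_valid are hypothetical and chosen only for illustration. The loop contains no error checks; a single test on the final result is enough to discover that something went wrong along the way.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical example: sum of element-wise ratios.
   There are no checks inside the loop. */
double sum_of_ratios(const double *num, const double *den, size_t n) {
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        /* 0.0/0.0 gives NAN, nonzero/0.0 gives +/-INF.
           Either value carries through the following additions. */
        sum += num[i] / den[i];
    }
    return sum;
}

/* A single check on the end result: returns 1 if it is neither INF nor NAN. */
int result_is_valid(double r) {
    return isfinite(r) != 0;
}
```

Note that an INF does not always reach the end result. For example, 1.0/INF gives zero, so the error is lost if such a division occurs later in the calculation.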
The NAN code contains a payload of additional bits which can contain information about the kind of error that generated the NAN. The NAN payload can be very useful for error codes from mathematical function libraries. The IEEE 754 standard for floating point representation is incomplete with respect to the propagation of NAN payloads. I have discussed these problems with the working group behind the IEEE 754 floating point standard, but they do not want to make any modifications in a forthcoming revision of the standard because NAN payloads are rarely used today and it is difficult to predict future needs (http://754r.ucbtest.org/background/nan-propagation.pdf). The missing details can easily be specified for ForwardCom to make the propagation of INF and NAN reliable.
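The following C sketch shows how a math library could put an error code into a NAN payload and read it back. It assumes the IEEE 754 binary64 format with the quiet bit as the most significant mantissa bit, which is what most current hardware uses. The function names and the choice of 51 payload bits are assumptions made for this illustration, not something defined by ForwardCom or by the standard.

```c
#include <stdint.h>
#include <string.h>

/* Assumed binary64 layout: exponent all ones and quiet bit set */
#define QNAN_BITS    UINT64_C(0x7FF8000000000000)
/* The remaining 51 mantissa bits are available as payload */
#define PAYLOAD_MASK UINT64_C(0x0007FFFFFFFFFFFF)

/* Hypothetical helper: make a quiet NAN carrying an error code */
double make_nan_with_payload(uint64_t code) {
    uint64_t bits = QNAN_BITS | (code & PAYLOAD_MASK);
    double d;
    memcpy(&d, &bits, sizeof d);   /* bit copy without aliasing problems */
    return d;
}

/* Hypothetical helper: read the payload bits.
   The caller should first check that d is a NAN. */
uint64_t get_nan_payload(double d) {
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return bits & PAYLOAD_MASK;
}
```

Whether the payload survives subsequent arithmetic, and which payload wins when two NANs are combined, is exactly the part that the standard leaves open, so the sketch makes no assumption about it.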
I have written a paper describing the details of NAN propagation, including recommendations on how to use it and which compiler optimization options to use. You can find the paper here: agner.org/optimize/nan_propagation.pdf
I wonder if we need fault trapping at all in ForwardCom when NAN propagation is the preferred way of detecting floating point errors anyway. ForwardCom has options for trapping integer overflow as well, but most current microprocessor systems have no such options and it is probably better to rely on special instructions for overflow detection instead.
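As an illustration of overflow detection without traps, here is a sketch in C that uses the __builtin_add_overflow extension of GCC and Clang. This is not a ForwardCom instruction; it is used here only as a stand-in for an add instruction that reports overflow. The function name checked_sum is hypothetical. The overflow flag is accumulated through the loop and tested once at the end, in the same spirit as NAN propagation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical example: sum an array of 32-bit integers.
   Returns true if no addition overflowed, false otherwise.
   Relies on a GCC/Clang builtin, which is a compiler extension. */
bool checked_sum(const int32_t *x, int n, int32_t *out) {
    int32_t sum = 0;
    bool overflow = false;
    for (int i = 0; i < n; i++) {
        /* The builtin stores the wrapped result in sum and
           returns true if the exact result did not fit. */
        overflow |= __builtin_add_overflow(sum, x[i], &sum);
    }
    *out = sum;
    return !overflow;
}
```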
A superscalar (out-of-order) processor will have less need for speculative execution if there is no fault trapping. See the thread "possible execution pipeline" at http://www.forwardcom.info/forum/viewtopic.php?f=1&t=78
We still need traps (software interrupts) for detecting illegal instructions and for memory access violations. Illegal instructions can be detected in the in-order front end. Memory addresses can also be calculated in the in-order front end if we follow the proposal for "control flow decoupling" in chapter 8.1 of the ForwardCom manual. Branching will also be in the in-order front end.
So I am wondering, is it possible to make a superscalar processor with no speculative execution at all? We can have speculative fetch, decode, and address calculation in connection with branch prediction, but stop speculating before the execute stage in the pipeline and before the out-of-order back end.
We still need hardware interrupts for servicing external hardware and for task switching. The interrupt can wait for the pipeline to be flushed if response time is not critical, or we could have one or more CPU cores reserved for servicing external hardware.