Few instructions, many variants

ForwardCom differs from common RISC and CISC instruction sets by having few instructions but many variants of each instruction. This principle can be illustrated by the peculiar example of fused multiply-and-add instructions.

The X86 instruction set with its latest instruction set extensions (AVX512-FP16) has 198 different fused multiply-and-add instructions. This number includes one instruction for each combination of floating point precision, vector length, operand order, and sign change of each operand. In contrast, ForwardCom has only two such instructions. Yet the two ForwardCom instructions cover all the functionality of the 198 X86 instructions, and much more. Unlike X86, the ForwardCom mul_add instruction covers not only floating point but also integer multiply-and-add, as well as unlimited vector lengths, immediate operands, memory operands with different addressing modes, and non-destructive 4-operand versions.

This is due to the flexible template system that ForwardCom instructions are based on. The operand type field in the instruction code can specify different floating point precisions or different integer sizes. The length of a vector is specified in the vector register itself, or in an extra register for memory operands. The template has six extra bits for instruction-specific options. These bits are used for specifying sign change of each operand in this case. Register, memory, or immediate operands are selected by a choice of different instruction formats indicated by the mode bits. This includes short versions of the instruction code with few features and longer versions with more features, addresses, constants, or options. The consistent template system makes compiling simpler and it also makes the hardware more efficient because instruction decoding is simpler and more streamlined.
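
The role of the option bits can be sketched in a few lines of Python. This is only a model of the idea, not the ForwardCom encoding: the function name, the two bit names, and their bit positions are hypothetical and chosen for illustration. The sketch also computes the product and the sum in two steps, so it does not reproduce the single rounding of a true fused operation.

```python
# Hypothetical option bits, for illustration only.
# These are NOT the actual bit assignments used by ForwardCom.
NEG_PRODUCT = 0b01  # change the sign of the product a*b
NEG_ADDEND = 0b10   # change the sign of the addend c


def mul_add(a, b, c, options=0):
    """Model one multiply-and-add operation with optional sign changes.

    Works for both integers and floats, because the operand type is
    carried by the values themselves rather than by the operation name.
    """
    if options & NEG_PRODUCT:
        a = -a  # negating one factor negates the product
    if options & NEG_ADDEND:
        c = -c
    return a * b + c
```

With these two hypothetical bits, the four option combinations correspond to the four sign variants that X86 encodes as separate instruction families (FMADD, FMSUB, FNMADD, FNMSUB). For example, mul_add(2, 3, 4, NEG_ADDEND) computes 2*3 - 4.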

The template system adds flexibility not only to the fused multiply-and-add instruction, but to all the most common instructions. Some less common instructions only provide one or a few variants.

ARM and other RISC systems have few instructions, but also less flexibility. The limited instruction size (typically 32 bits) in RISC systems makes it impossible to combine so many features in a single instruction. This means that you need several instructions in a RISC system to do what you can do with a single instruction in ForwardCom.

The option bits are useful for implementing different variants of an instruction. They are typically used for adding features that are cheap to implement in hardware, yet make it possible to do things with a single instruction that require a sequence of multiple instructions in other systems. The following examples illustrate this:

Integer division has option bits for different rounding modes. In other systems, it is quite complicated to calculate a correctly rounded integer division result without conversion to floating point.
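
To give an impression of the extra work involved, the following Python sketch derives three other rounding modes from a truncating division, which is what integer division hardware typically provides. The function names and the selection of modes (floor, ceiling, nearest with ties to even) are assumptions made for illustration; the text above does not list the modes that ForwardCom offers. Division by zero is not handled here and raises ZeroDivisionError.

```python
def div_trunc(a, b):
    """Quotient rounded toward zero, using only integer arithmetic."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def div_floor(a, b):
    """Quotient rounded toward minus infinity."""
    q = div_trunc(a, b)
    r = a - q * b
    # A nonzero remainder with a sign opposite to the divisor means
    # the exact quotient is negative, so truncation rounded it up.
    if r != 0 and (r < 0) != (b < 0):
        q -= 1
    return q


def div_ceil(a, b):
    """Quotient rounded toward plus infinity."""
    q = div_trunc(a, b)
    r = a - q * b
    # A nonzero remainder with the same sign as the divisor means
    # the exact quotient is positive, so truncation rounded it down.
    if r != 0 and (r < 0) == (b < 0):
        q += 1
    return q


def div_nearest_even(a, b):
    """Quotient rounded to nearest, with ties going to the even value."""
    q = div_trunc(a, b)
    r = a - q * b
    twice_r = 2 * abs(r)
    if twice_r > abs(b) or (twice_r == abs(b) and q % 2 != 0):
        # Step away from zero, in the direction of the exact quotient.
        q += 1 if (a < 0) == (b < 0) else -1
    return q
```

Each non-truncating mode needs a remainder calculation, one or two comparisons, and a conditional adjustment on top of the division itself.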

Integer absolute value has option bits for how to handle overflow. Checking for integer overflow is more complicated in other systems.
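
The overflow case arises because the most negative two's complement value has no positive counterpart of the same size. The Python sketch below models 32-bit integers to show three possible policies. The policy names and the choice of wrap-around, saturation, and error reporting are assumptions for illustration; the text above does not specify which options ForwardCom provides.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def wrap32(x):
    """Reduce x to a signed 32-bit two's complement value."""
    return (x + 2**31) % 2**32 - 2**31


def abs_wrap(x):
    """Absolute value with wrap-around: INT32_MIN maps to itself."""
    return wrap32(-x if x < 0 else x)


def abs_saturate(x):
    """Absolute value that clamps the overflow case to INT32_MAX."""
    return INT32_MAX if x == INT32_MIN else abs_wrap(x)


def abs_checked(x):
    """Absolute value that reports the overflow case as an error."""
    if x == INT32_MIN:
        raise OverflowError("absolute value of INT32_MIN does not fit")
    return abs_wrap(x)
```

In software, the saturating and checked versions each cost an extra comparison and branch, which is the kind of work a single option bit can fold into the instruction.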

Floating point max and min have option bits for how to handle NaN inputs. There are different standards for how max and min functions should handle NaN inputs. Compilers for other systems may insert a sequence of extra instructions in order to ensure strict conformance with a particular standard.
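
The two most common NaN policies can be sketched in Python as follows. The function names are hypothetical. The first policy returns NaN whenever either input is NaN, and the second returns the other operand when exactly one input is NaN; these resemble the maximum and maximumNumber operations of IEEE 754-2019, respectively. The sketch does not attempt to order signed zeros, which those standards also address.

```python
import math


def max_propagate_nan(a, b):
    """Maximum that returns NaN if either input is NaN."""
    if math.isnan(a) or math.isnan(b):
        return math.nan
    return a if a > b else b


def max_ignore_nan(a, b):
    """Maximum that treats a NaN input as missing data.

    Returns NaN only when both inputs are NaN.
    """
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return a if a > b else b
```

A plain comparison gives neither behavior reliably. Python's built-in max, for instance, returns NaN for max(math.nan, 1.0) but 1.0 for max(1.0, math.nan), because every comparison with NaN is false and the result then depends on operand order. The explicit NaN tests above are the extra instructions that a compiler has to emit when the hardware offers only one behavior.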