
Default integer size 32 or 64 bits?

Posted: 2021-03-22, 9:58:14
by agner
Some ForwardCom instructions are available in a short form using format template C. Template C has one register field, 16 bits of immediate data, and no operand size field. This will fit an instruction like for example

Code:

int r1 += 1000

I am in doubt whether the integer size should be 32 bits or 64 bits for instruction formats that have no operand size field. The current version uses an integer size of 64 bits in this case, based on the logic that a large integer version will work in most cases where a smaller integer size is specified. However, this places a serious burden on the compiler or the assembly programmer to decide whether it is safe to use a larger integer size than specified in the original code. The above code will not work with a 64-bit integer size if the programmer intended to get an unsigned result modulo 2^32 in a 64-bit register.

Format C is particularly useful for combined ALU/branch instructions, for example in a loop like this:

Code:

for (int i=0; i<100; i++) {...}

This loop can be implemented very efficiently with an increment-and-compare instruction that adds 1 to a register and jumps back if the value is below the limit. This fits format C with the loop counter in the register field, an 8-bit constant limit, and an 8-bit address offset for jumping back.

This will work perfectly well regardless of the integer size in almost all cases. But what if the programmer has made a branch that sets the loop counter to -1 inside the loop in order to restart the loop in certain cases? This will not work if the instruction that sets the loop counter to -1 uses a 32-bit operand size (unused bits are zero) while the increment-and-compare instruction uses 64 bits.

The best modern compilers can do amazing things in terms of optimization, but is it realistic to require that the compiler can decide whether it is safe to replace a 32-bit instruction with a 64-bit instruction? Or is it better to set the default integer size in format C to 32 bits because this is the most common integer size?

There are obvious cases where it is safe to use 64 bits, for example when setting a register to a small positive value. The assembler actually does this optimization automatically. And there are other cases where it is obviously not safe to use a different integer size, for example in branch instructions that check for overflow. And then there are the difficult cases where it requires a lot of logic in the compiler to decide whether it can use a different operand size.

There is actually a third possibility: we could make the rule that all integer instructions with an operand size less than 64 bits must sign-extend the result to 64 bits. This would increase the number of cases where we can use a larger integer size than specified, but there would still be contrived cases that are difficult to decide. Another disadvantage of sign-extending everything is that it increases power consumption because the unused bits keep toggling.

What is your opinion? Should we use 32 bits or 64 bits in short-form instructions that have no operand size field? 64 bits will increase the number of cases where we can use short form instructions, but at the cost of considerable complexity in the compiler. 32 bits will result in slightly larger code because we need instructions of double size in certain cases with 64-bit operands. (Double-size instructions can execute at the same throughput as single-size instructions, but they take up more space in the code cache).

Re: Default integer size 32 or 64 bits?

Posted: 2021-03-22, 14:53:36
by HubertLamontagne
Arm64 has a fairly extensive solution to this, so it might make sense to look it up. One solution is to use 64 bit instructions to do 32 bit math and let the top 32 bits be junk, except for specific cases (generally operations that propagate bits rightwards):

- Right shift and arithmetic right shift
- Division
- Comparing numbers
- Operations that mix integer precision. The most common case of these is addition since it comes up in memory addressing (array[int32Index]).
- It might make sense to provide a specific 32x32->64 multiplication because sometimes those can be computed faster (even though 64x64->64 multiplication has the same low bits).

Re: Default integer size 32 or 64 bits?

Posted: 2021-06-17, 19:49:52
by wolfschaf
I think you should stick with the philosophy of ForwardCom.

Don't try to support the old ways of doing things. Just do the thing that is fast, simple, and produces less code.

And I think you should require that any language or programmer that wants to interact with ForwardCom should know about what to do in certain situations to get the most out of it.


For example (the loop example you mentioned):
If you specify that this short-form instruction should have a 64-bit integer size, then the programmer has to know that. Such a programmer can specify the loop variable to be 64 bits and wouldn't have any problems when resetting the loop, right?

When you have a language, this language should be responsible for providing ways of specifying the exact behaviour you want. If a programmer sets a variable to be 32-bits, then normally a language will use the 32-bit version of certain instructions. But if there is only a 64-bit version available, the language has to decide what should happen in this case.
This also holds for optimization.

If you specify that the instruction should be 32-bits, you have the same problem, just for different cases. Now what if someone specifies a 64-bit integer and wants to use this instruction? A language also has to decide what should happen.
Currently, 32-bit as a default integer is more common, but it's more or less just for historical reasons.



For me, it would be interesting to compare more scenarios between this 32/64-bit difference. I don't know what other drawbacks/benefits there are, so maybe you could list some more?

Another thing you could do is to do both, have a 32-bit and 64-bit mode for this instruction. You would have to encode this information somewhere, maybe add another opcode for that.


So for now, my options would be:
  • Go for 64-bits
  • Go for 32-bits
  • Somehow provide both instructions
    Encode this information in an additional opcode or somewhere else.
  • Sign-extend all operations to 64 bits
    But what if you don't want that? Then you would have to clear the upper part again.
  • Provide option for sign-extending to 64-bit
    Then you could decide if you want one or the other

Re: Default integer size 32 or 64 bits?

Posted: 2021-06-18, 17:09:31
by agner
Most integer instructions are available in 8, 16, 32, and 64-bit versions. The question is only which one to prioritize for short instructions (a single code word). The forthcoming version (1.11) will have both 32 and 64-bit short versions of some instructions. Most branch and loop instructions will have 32 bits for the short version, while other integer sizes require a longer version (double-word instructions). I made this decision because most programming languages give you 32 bits if you just write int. High-level language programmers should not have to bother about hardware details. I am not sure whether optimizing compilers are always able to decide when it is safe to use a different integer size.

Re: Default integer size 32 or 64 bits?

Posted: 2021-06-19, 0:24:32
by Kulasko
I would be in favor of 64 bits, for the reason to have a clean "native" size that is also shared with pointers.

While writing C++, I eventually switched to size_t as my default type for this very reason. I don't like to have an arbitrary type dependent on what some compiler thinks is right.

In my opinion, the main argument for 32 bits is that it's more efficient while the value range still is enough for most applications.

What could also be noteworthy, although maybe not as important: Rust defaults to i32 when no type is specified (for the very same reason, because C did that before), but it exclusively uses usize, the size_t equivalent, for indexing arrays and similar structures.

Re: Default integer size 32 or 64 bits?

Posted: 2021-06-19, 5:32:29
by agner
Power consumption is also an issue. 32 bit integers use less power. My soft core can run faster when 64-bit integers are not implemented.
> While writing C++, I eventually switched to size_t
size_t is unsigned, so it would not fit the short version loop instructions. The corresponding signed type in C is ptrdiff_t. Neither has a fixed size. I use int32_t or int64_t if I want to make sure what the size is.

The forthcoming ForwardCom version (1.11) has the following loop instructions:
  • increment_compare/ jump if above, below, above_or_equal, below_or_equal. Short versions use signed 32-bit integers.
  • subtract and jump if not zero, not negative. Short versions use signed 32-bit integers.
  • subtract maximum vector length and jump if positive. This is the vector loop instruction. Short version uses 64-bit integers because the loop counter is a pointer.

Re: Default integer size 32 or 64 bits?

Posted: 2021-06-22, 13:09:56
by Cuminies
Can you make it a flag in the cpu? Like a mode? 32/64 bit mode. When reading some instructions it can convert them before anything else happens. I don't know anything about CPUs.

Re: Default integer size 32 or 64 bits?

Posted: 2021-06-24, 15:10:00
by HubertLamontagne
> Can you make it a flag in the cpu? Like a mode? 32/64 bit mode. When reading some instructions it can convert them before anything else happens. I don't know anything about CPUs.
Normally, yes: 32/64/32-in-64 bits for memory addresses/pointers is a CPU flag, as it affects a ton of other stuff (page mapping etc.). The OS sets this flag when starting up (depending on whether it's a 32-bit or 64-bit OS) and when loading an executable file (using 32-in-64-bit mode when loading a 32-bit executable on a 64-bit OS).

For 32 vs 64 bits on data operations, this has existed on historical architectures - for instance on the SNES's 65816 CPU. Modern CPUs avoid this for some good reasons:

- C++ code tends to have an insane mix of 32/64-bit operations, depending on whether the writer likes to use int or size_t or int64_t or uint32_t. Generally, upgrading integer size shouldn't have an effect on behavior (except with insane tricks like x << 31 >> 31 to optimize x < 0 ? -1 : 0), and the insanely written C/C++ standard says that the compiler can totally "upgrade" types behind your back like this. But prepare to be met with pitchforks when you do, because sometimes people DO rely on overflow behavior and it's not totally obvious when; you end up with the nasty situation of code that changes behavior depending on whether the compiler stores the result in RAM or not. Because pointer indexing is 64-bit, the code mixes and matches 32-bit and 64-bit operations, and it's not really possible to set one mode at the start and have it work all the way along. Upgrading int to 64 bits across the board is also likely to be met with pitchforks (plus, it increases data cache usage).

- The mode switch would probably have to become part of the calling convention, so that the compiler would have to add mode switches before and after function calls if it's in the wrong mode (it can't guess if the function on the other side cares about the mode or not or what mode it wants), so you'd get mode switches all over the place.

- Mode switches make disassembly in debuggers etc. difficult, because the same code can have different interpretations in different modes (especially if immediate operand sizes change, as on the 65816). It's better to have entirely different CPU opcodes. You don't really even need 32-bit operations at all (except for memory loads/stores), since you can get the same results with 64-bit operations plus sign-extension/zero-extension operations; the sign-extending/zero-extending/32-bit versions of math operations only exist so that you can get the same results as 32-bit operations with fewer explicit sign-extension/zero-extension operations anyway.