The difference between a vector register and a scalar register is that you can operate on the entire vector register with a single instruction. If you want to add 1 to four 32-bit scalar registers, you need four instructions. If you want to add 1 to all four elements of a 128-bit vector register, you need only one instruction. If you want to add 1 to only one of the four elements in a 128-bit register, you do an addition on the entire vector register with a mask that enables only the element you want. So, yes, it is possible to subdivide a 1024-bit register if the hardware has a maximum vector length of 1024 bits. You can use it as a kind of array with 32 elements of 32 bits each.
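To make the masked addition concrete, here is a minimal sketch in C that models a 128-bit vector register as four 32-bit elements. The names vec128, vec_add_all and vec_add_masked are made up for this illustration and do not correspond to any real instruction set. Each function stands for what a single vector instruction would do. The sketch assumes that elements disabled by the mask pass through unchanged; real hardware may offer other choices, such as zeroing them.

```c
#include <stdint.h>

#define LANES 4  /* a 128-bit vector modelled as four 32-bit elements */

/* Hypothetical type for illustration only. */
typedef struct { uint32_t e[LANES]; } vec128;

/* Models one vector instruction: add k to every element. */
static vec128 vec_add_all(vec128 v, uint32_t k) {
    for (int i = 0; i < LANES; i++) {
        v.e[i] += k;
    }
    return v;
}

/* Models one masked vector instruction: add k only to the elements
   whose mask bit is set. Assumption: the other elements are unchanged. */
static vec128 vec_add_masked(vec128 v, uint32_t k, unsigned mask) {
    for (int i = 0; i < LANES; i++) {
        if (mask & (1u << i)) {
            v.e[i] += k;
        }
    }
    return v;
}
```

Calling vec_add_masked(v, 1, 1u << 2) adds 1 to element 2 only, while the operation still covers the whole register.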
I guess you want to use the 32 elements as independent registers that you can use for unrelated purposes. This is possible in principle, but it causes problems on out-of-order processors. The original Intel 8086 processor had 16-bit registers that could be divided into two 8-bit registers. For example:
MOV AX, 0102H   ; AH = 1, AL = 2
ADD AL, 4       ; AL = 6
ADD AH, 1       ; AH = 2
MOV BX, AX      ; BX = 0206H
This worked fine until they invented superscalar processors with out-of-order processing. Some superscalar processors treat AL and AH as independent registers that can be operated on simultaneously or out of order. But when they are joined together, as in the last line of the example, the processor has to wait until the in-flight temporary registers for AL and AH have retired into the physical register AX before it can read them as one register. This takes several clock cycles. Other hardware implementations keep AL and AH together so that you avoid the penalty when they are joined, but you lose the advantage of out-of-order processing because you cannot access them independently. This problem was not foreseen in the original design, because out-of-order processing had not been invented yet. It still causes problems and suboptimal solutions in today's superscalar processors. That's why I designed ForwardCom so that no instruction uses a partial register and leaves the rest of the register unchanged, except for instructions intended explicitly for this.
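The dependency problem can be seen in a small C model of a 16-bit register. This is only a sketch, and the function names are made up for the illustration. A merging write, like a write to AL or AH, needs the old register value as an input, so it cannot complete before the previous write to the same register is known. A whole-register write needs no old value. Zero-extending the narrow value is shown here as one possible way to write the whole register; it is an assumption for the example, not a description of the exact ForwardCom rule.

```c
#include <stdint.h>

/* Merging write to the low byte (like writing AL):
   the result depends on old_reg, which creates a dependency. */
static uint16_t write_low8_merge(uint16_t old_reg, uint8_t value) {
    return (uint16_t)((old_reg & 0xFF00u) | value);
}

/* Merging write to the high byte (like writing AH):
   this also depends on old_reg. */
static uint16_t write_high8_merge(uint16_t old_reg, uint8_t value) {
    return (uint16_t)((old_reg & 0x00FFu) | ((uint16_t)value << 8));
}

/* Whole-register write: the narrow value is zero-extended, so no old
   value is needed. Assumption for illustration only. */
static uint16_t write_low8_zero_extend(uint8_t value) {
    return (uint16_t)value;
}
```

Replaying the 8086 example with the two merging functions, starting from 0102H, gives 0206H. Every step takes the previous register value as input, and that is the chain an out-of-order processor has to resolve.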