split/unified register file

zchieply · Aug 27, 2012

I have noticed that CPUs use separate register files for integer and floating-point numbers, while GPUs use a unified file. My question is: what are the pros and cons of each?

rpg.314 · Aug 27, 2012

CPUs use separate files because that's how they evolved. FPUs were really expensive, and people preferred not to have them when not needed.

GPUs use a unified file because by the time they got around to having both int and fp units, there were enough transistors to go around that it wasn't worth bothering to separate them.

AFAIK, there is no good reason to have separate reg files beyond the ability to make a chip that runs a strict subset without paying for an FPU.

3dilettante · Aug 27, 2012

One reason is that FP units did not exist for as long as the integer pipes did. The FP pipes are where coprocessor logic was absorbed into an integer CPU.

Split register files can have a number of benefits.
You get more capacity for each type of data, instead of having half as much for each.
This can also be done without adding extra bits for register IDs, as space in the encoding is something you don't want to waste. Going by the opcode alone, you can pick between two register pools without spending the 2-3 more bits per instruction you would need if they shared the same file.
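
As a rough illustration of the encoding point above, here is a minimal sketch that counts register-specifier bits. The register counts (16 per split file, 32 unified) and the operand counts are assumptions chosen for the example, not figures from any particular ISA.

```python
def specifier_bits(num_registers: int) -> int:
    """Bits needed to name one register out of num_registers."""
    return max(1, (num_registers - 1).bit_length())


def operand_field_bits(num_registers: int, operands: int) -> int:
    """Total bits an instruction spends on register specifiers."""
    return specifier_bits(num_registers) * operands


# Assumed sizes: two split files of 16 registers each, or one unified file of 32.
# With split files the opcode already says which file to use, so each
# operand only has to index into 16 registers.
for operands in (2, 3):
    split = operand_field_bits(16, operands)
    unified = operand_field_bits(32, operands)
    print(f"{operands} operands: split={split} bits, "
          f"unified={unified} bits, extra={unified - split}")
```

Under these assumptions the unified file costs one extra bit per operand, which is where a figure like 2-3 bits per instruction comes from.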

With multi-issue architectures, the number of operands that can be read from or written to a register file can become a serious bottleneck.

Highly ported register files dedicate more area and burn more power in the circuitry that surrounds them, and the complexity adds cost in terms of engineering or clock penalties.
Since FP and integer work infrequently need the same data and have different needs for register size and arrangement, you can split them apart and either have two simpler register files as opposed to one heavily engineered one, or you can have two multi-issue pipelines with complex register files that would have been impractical if they were unified.
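
To put a rough number on the porting argument, the sketch below uses a common first-order approximation: each extra port adds a wordline and a bitline to every cell, so register file area grows roughly with entries times ports squared. That model, the pipeline counts, and the port counts are all assumptions for illustration; real designs differ.

```python
def relative_area(entries: int, ports: int) -> int:
    """First-order area estimate in arbitrary units: entries * ports^2."""
    return entries * ports ** 2


PORTS_PER_PIPE = 3  # assumed: 2 read ports + 1 write port per pipeline

# Assumed machine: 2 integer pipelines and 2 FP pipelines, 64 registers total.
unified = relative_area(entries=64, ports=4 * PORTS_PER_PIPE)
split = 2 * relative_area(entries=32, ports=2 * PORTS_PER_PIPE)

print(f"unified: {unified}, split: {split}, ratio: {unified / split:.1f}")
```

Under this model the two split files together take about a quarter of the area of the single heavily ported file, for the same total register count.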

The downside is that there is an additional latency penalty in the cases you do want to transfer work from one side to the other, and you now have on-chip storage that you can't freely use if you have a workload that needs more of one register type but not the other.
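
The stranded-capacity downside can be sketched with a toy spill count. The file sizes and the workload's register demand below are made-up numbers for illustration only.

```python
def spilled_values(need_int: int, need_fp: int,
                   int_regs: int, fp_regs: int, unified: bool) -> int:
    """Live values that do not fit in registers and must go to memory."""
    if unified:
        # One pool: any register can hold either kind of value.
        return max(0, need_int + need_fp - (int_regs + fp_regs))
    # Split pools: spare FP registers cannot hold integer values, and vice versa.
    return max(0, need_int - int_regs) + max(0, need_fp - fp_regs)


# Assumed workload: integer-heavy code with 24 live ints and 4 live floats,
# on a machine with 16 + 16 registers (split) or 32 registers (unified).
split_spills = spilled_values(24, 4, 16, 16, unified=False)
unified_spills = spilled_values(24, 4, 16, 16, unified=True)
print(split_spills, unified_spills)
```

Here the split design spills 8 values while 12 FP registers sit idle; the unified design of the same total size spills none.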

All that being said, there are now GPUs on the market that do have separate files.

Exophase · Aug 27, 2012

3dilettante said:
All that being said, there are now GPUs on the market that do have separate files.

Separate scalar and vector register files anyway. Not exactly the same thing, but the usage has pretty much converged similarly - CPUs typically use the same register file for integer and FP SIMD.

Voxilla · Aug 27, 2012

AFAIK all CPU SIMD instructions use the same registers for integer, float, double, ...

Novum · Aug 27, 2012

rpg.314 said:
CPUs use separate files because that's how they evolved. FPUs were really expensive, and people preferred not to have them when not needed.

GPUs use a unified file because by the time they got around to having both int and fp units, there were enough transistors to go around that it wasn't worth bothering to separate them.

AFAIK, there is no good reason to have separate reg files beyond the ability to make a chip that runs a strict subset without paying for an FPU.

Not sure about there being no advantage. Larger register files could mean longer latencies, etc., and CPUs still need to be highly tailored for low-latency, single-thread performance.

3dilettante · Aug 27, 2012

Exophase said:
Separate scalar and vector register files anyway. Not exactly the same thing, but the usage has pretty much converged similarly - CPUs typically use the same register file for integer and FP SIMD.

Which instructions for the scalar unit are FP ops?
SIMD instructions (FP or INT) for general-purpose CPUs didn't catch on until after the FPU was integrated, making them an add-on to an add-on.
They either inherited or share the same motivations for being separate from the integer path.

Exophase · Aug 27, 2012

Voxilla said:
AFAIK all CPU SIMD instructions use the same registers for integer, float, double, ...

Yeah I don't know of any other either, I guess I should have said "at least most" or "all I'm aware of", not a very good choice of words on my part. The closest I can really think of is MMX (integer SIMD only) vs SSE (integer SIMD + float SIMD), but that's compounded by the fact that the MMX registers are shared with floating point scalar registers! And of course that whole design is just a historical artifact. I don't know of anyone who willingly made both FP and integer SIMD go to different registers, and naturally that's because there's low demand for simultaneous FP + integer SIMD operation.

3dilettante said:
Which instructions for the scalar unit are FP ops?
SIMD instructions (FP or INT) for general-purpose CPUs didn't catch on until after the FPU was integrated, making them an add-on to an add-on.
They either inherited or share the same motivations for being separate from the integer path.

I don't know of any scalar FP instructions, but is the converse true - that all the usual integer operations are included in the scalar units? That could well be the case, I don't really know the instruction availability in GCN, and while there's some stuff you might not often use it might not really be enough of a win to withhold it. Or might complicate the compilers.

Maybe I'm picking at technicalities, but I see the motivations as very similar but still slightly different. I could be wrong on this, but I always figured the real motivation of scalar execution on GPUs is strictly to generate the control path, while integer on CPUs will deal with both control and data. I suppose you could argue that they'd still converge on the same thing - on the CPU they'd like you to move as much data integer stuff to the SIMDs as possible and on the GPUs they'd like you to use the scalar units for data stuff that isn't very regular/parallel.
