A recent change meant to reduce the impact of integrating Arcturus' larger CU count mentioned a term that was briefly discussed a few times in this forum in the past.
https://lists.freedesktop.org/archives/amd-gfx/2019-August/037800.html
This change is because SE/SH layout on Arcturus is 8*1, different from
4*2(or 4*1) on Vega ASICs.
Currently the cu bitmap array is 4x4 size, and besides the bitmap is used widely
across SW stack. To mostly reduce the scale of impact, we make the cu bitmap
array compatible with SE/SH layout on Arcturus. Then the store of cu bits of
each shader array for Arcturus will be like below:
SE0,SH0 --> bitmap[0][0]
SE1,SH0 --> bitmap[1][0]
SE2,SH0 --> bitmap[2][0]
SE3,SH0 --> bitmap[3][0]
SE4,SH0 --> bitmap[0][1]
SE5,SH0 --> bitmap[1][1]
SE6,SH0 --> bitmap[2][1]
SE7,SH0 --> bitmap[3][1]
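To make the folding in that list concrete, here is a minimal sketch of the index arithmetic it implies. This is my own illustration, not the driver's code: the function name and the constant are hypothetical, and it assumes the 8*1 layout exactly as listed in the commit message.

```python
# Hypothetical illustration of the SE -> bitmap index folding listed above.
MAX_SE_ROWS = 4  # the existing cu bitmap array is 4x4


def arcturus_bitmap_index(se, sh=0):
    """Return (row, col) in the 4x4 cu bitmap for an Arcturus SE/SH pair.

    Assumes the 8*1 layout from the commit message: SH is always 0, and
    the second index is reused to tell SE0-3 apart from SE4-7.
    """
    if not 0 <= se < 8 or sh != 0:
        raise ValueError("this sketch assumes 8 SEs with 1 SH each")
    return se % MAX_SE_ROWS, se // MAX_SE_ROWS


for se in range(8):
    row, col = arcturus_bitmap_index(se)
    print(f"SE{se},SH0 --> bitmap[{row}][{col}]")
```

The low two bits of the SE number pick the first index and the third bit lands in the slot that would otherwise hold the SH number.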
The SE/SH layout seems to address a sub-division that is possible within the CUs in a shader engine.
Going back to documentation in the Southern Islands ISA doc, there is a hardware register called HW_ID that references the SE number and another 1-bit identifier for the shader array within the SE.
What impact it has for the CUs within an SE to be part of one single array or split into two is unclear. Perhaps it affects how the CUs can be signaled or how they arbitrate for shared resources like the memory crossbar or export.
In theory, the combination of SE, SH, and CU identifiers could have given enough space to differentiate 128 CUs all the way back in Southern Islands, at least in terms of that element of the architecture.
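The arithmetic behind the 128 figure, as a quick sketch. The field widths are the ones given for SE_ID, SH_ID, and CU_ID in the SI ISA doc's HW_ID table; the variable names are mine. This only counts what the ID fields can encode, not what any chip actually shipped with.

```python
# Field widths taken from the SI ISA doc's HW_ID table.
SE_ID_BITS = 2  # bits 14:13
SH_ID_BITS = 1  # bit 12
CU_ID_BITS = 4  # bits 11:8

shader_engines = 1 << SE_ID_BITS  # 4
arrays_per_se = 1 << SH_ID_BITS   # 2
cus_per_array = 1 << CU_ID_BITS   # 16

addressable_cus = shader_engines * arrays_per_se * cus_per_array
print(addressable_cus)  # 4 * 2 * 16 = 128
```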
Since the SI ISA doc, AMD hasn't wanted to keep documenting this hardware register, though even in the recent RDNA ISA doc you can see the register numbering jump over the slot where it likely still sits.
This change sheds some light on how Vega might look in terms of that register, and shows that Arcturus didn't follow the previously mentioned way of getting to 128 CUs.
Vega apparently had 4 SEs with 2 SH each, or some products with 4 SEs and 1 SH.
Arcturus is apparently going for 8 SEs and 1 SH each, but in order to reduce the impact of having to rewrite a commonly-used layout table that assumed 4 SEs max, the SH count is being repurposed for Arcturus to serve as an additional bit for differentiating between the first and second halves of the set of 8 shader engines.
What it means to have 1 SH isn't clear, although if it deals with how the shader engines interface with the rest of the chip, it might prevent excess complexity in linking them to their infrastructure (or there's some other barrier to having 16 shader arrays).
Edit:
The table from the SI ISA doc, for reference:
Code:
Table 5.8 HW_ID
Field     Bits   Description
WAVE_ID   3:0    Wave buffer slot number (0-9).
SIMD_ID   5:4    SIMD to which the wave is assigned within the CU.
          7:6    reserved.
CU_ID     11:8   Compute unit to which the wave is assigned.
SH_ID     12     Shader array (within an SE) to which the wave is assigned.
SE_ID     14:13  Shader engine the wave is assigned to.
TG_ID     19:16  Thread-group ID
VM_ID     23:20  Virtual Memory ID
RING_ID   26:24  Compute Ring ID
STATE_ID  29:27  State ID (graphics only, not compute).
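
For anyone who wants to play with the layout, here is a small sketch that pulls the fields out of a raw 32-bit HW_ID value according to the table above. The bit ranges come straight from the table; the function name and the example value are made up for illustration, and none of this says anything about how later chips lay out the register.

```python
# Field layout copied from Table 5.8 (SI ISA doc): name -> (high bit, low bit).
HW_ID_FIELDS = {
    "WAVE_ID": (3, 0),
    "SIMD_ID": (5, 4),
    "CU_ID": (11, 8),
    "SH_ID": (12, 12),
    "SE_ID": (14, 13),
    "TG_ID": (19, 16),
    "VM_ID": (23, 20),
    "RING_ID": (26, 24),
    "STATE_ID": (29, 27),
}


def decode_hw_id(value):
    """Split a raw HW_ID value into its named fields (hypothetical helper)."""
    fields = {}
    for name, (hi, lo) in HW_ID_FIELDS.items():
        width = hi - lo + 1
        fields[name] = (value >> lo) & ((1 << width) - 1)
    return fields


# Made-up example value: SE 2, SH 1, CU 5, SIMD 3, wave slot 7.
example = (2 << 13) | (1 << 12) | (5 << 8) | (3 << 4) | 7
print(decode_hw_id(example))
```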