R700 Family ISA

You bet I am :D

The Shared GPR Pool seems to refer to state-defined registers and I guess corresponds with GDS in early RV770 diagrams.

It's interesting to see that, in absolute addressing mode, no barrier is required for a pair of LDS instructions performed as write-followed-by-read: the instructions within a clause execute serially, and that serialisation is effectively a barrier.

Seems like it'll be a while before I understand it all.

Jawed
 
Apart from the stuff about SIMD-global GPRs, LDS, and the double precision support, it's pretty much identical to the R600 ISA document. R700 lost integer add (why? it still supports integer sub) in the trans alu and gained (as we all already know) shift support in the xyzw alus.

Oh and btw there's at least one functional difference from rv770 to rv790 - "Burst memory reads are not supported by the RV770; however, the 710, 730, 740 and 790 do support it." (Chapter 7.3). So at least it's not the exact same chip tuned for higher frequency :).

edit: I guess the lost integer add in trans alu is a typo. The doc lists the add_64 as available in the trans unit instead, and that can't be.
 
Some questions:
- does anyone know if NVidia GPUs are already capable of doing burst memory reads?
- any idea of what kind of impact this feature could have on real world performance? I've read that memory read latencies can be pretty huge...
 
R600 and R700 Programming Guide

http://www.phoronix.com/scan.php?page=article&item=amd_r600_700_guide&num=1

2.3.2 Pixel Program Flow
1. The PA reads out position data from the SX's position buffer and together with connectivity data from the VGT assembles primitives.
2. The primitives are then sent to SC for coarse scan conversion.
3. The tiles are then sent to SPI for final pixel interpolation:
a. The SPI allocates a pixel thread and GPRs
b. The SC and SPI read per-vertex parameter data from the SX
c. It then interpolates that data along with barycentric data (I,J) from the SC to arrive at the per-pixel value of each parameter.
d. These values are then loaded into GPRs
4. The shader core is then notified that a pixel wavefront and shader program are ready for execution
5. The shader program runs on all pixels in the wavefront and at the end exports data to the SX's export buffer
6. The SPI is informed when a wavefront has completed so that it can de-allocate GPR space
So that seems to confirm that interpolated attributes are emplaced by the SPI - and interestingly it's SPI that actually performs allocation and de-allocation of registers for all types of shaders.
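
To make step 3c a bit more concrete, here's a rough sketch of interpolating one parameter from barycentric (I,J). It only illustrates the idea: the function name is made up, and the convention used (first vertex as the base, I and J weighting the deltas to the second and third vertex) is the common textbook one and an assumption on my part, not something the quoted section spells out.

```python
def interpolate_param(p0, p1, p2, i, j):
    """Interpolate one per-vertex parameter at barycentric coordinates (i, j).

    p0, p1, p2 are the parameter values at the three vertices (same-length
    tuples of floats). Assumed convention (hypothetical, not from the guide):
        value = p0 + i * (p1 - p0) + j * (p2 - p0)
    """
    return tuple(a + i * (b - a) + j * (c - a) for a, b, c in zip(p0, p1, p2))


# Per-pixel value of a two-component parameter, ready to be loaded into a GPR.
pixel_value = interpolate_param((0.0, 0.0), (4.0, 0.0), (0.0, 8.0), 0.25, 0.25)
```

With (I,J) = (0,0) this returns the first vertex's value, and (1,0) or (0,1) return the second or third, which is a quick sanity check on the convention.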

Some tasty stuff in here, including a detailed list of changes from R600 to R700 and a sprinkling of interesting bugs. The bug relating to the GPU mistakenly interpreting literal offsets to indexed registers as previous registers is particularly funny. Bet that had them scratching their heads for a while.

Also interesting to discover that a fetch clause in R700 can be 16 instructions, up from 8 instructions in R600.
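
As a rough illustration of what the larger limit means for a compiler, here's a small sketch that chops a run of fetch instructions into clauses. The helper and constant names are hypothetical; only the 8 and 16 limits come from the guide, and a real compiler has other reasons to end a clause that this ignores.

```python
R600_MAX_FETCH = 8   # fetch instructions per clause on R600 (per the guide)
R700_MAX_FETCH = 16  # fetch instructions per clause on R700 (per the guide)


def split_into_clauses(fetches, max_per_clause):
    """Split a list of fetch instructions into consecutive clauses.

    Purely illustrative: only the per-clause instruction limit is modelled.
    """
    if max_per_clause < 1:
        raise ValueError("max_per_clause must be at least 1")
    return [fetches[k:k + max_per_clause]
            for k in range(0, len(fetches), max_per_clause)]


# Twenty back-to-back fetches, named fetch0..fetch19 for the example.
fetches = [f"fetch{n}" for n in range(20)]
r600_clauses = split_into_clauses(fetches, R600_MAX_FETCH)
r700_clauses = split_into_clauses(fetches, R700_MAX_FETCH)
```

For these twenty fetches the R600 limit needs three clauses and the R700 limit two, so there are fewer clause switches for the same work.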

Jawed
 
Oh and btw there's at least one functional difference from rv770 to rv790 - "Burst memory reads are not supported by the RV770; however, the 710, 730, 740 and 790 do support it." (Chapter 7.3). So at least it's not the exact same chip tuned for higher frequency :).
This was confirmed to be an error: RV790 doesn't support burst memory reads.
 
Yes, that's an old message you quoted there... But yes, there are no logical differences between rv770 and rv790.
The new document is quite interesting though, if a bit short. Apart from the errata section, it provides quite a good overview of how things work, and a good explanation of how a driver needs to handle synchronization issues.
 