Recent content by rSkip

  1. rSkip

    AMD RDNA3 Specifications Discussion Thread

    Some good reads I found in the RDNA3 ISA document: 4.1.1. Cache Controls: SLC, GLC and DLC control how load/store/atomic instructions interact with each cache level. 5.2. Instruction Clauses talks about S_CLAUSE, followed by instructions on the same execution unit to be issued back-to-back...
  2. rSkip

    AMD RDNA3 Specifications Discussion Thread

    RDNA1 & 2: RDNA3: I would say it's still a hardware scheduler, with software hints to power off some parts of the scheduler.
  3. rSkip

    NVidia Ada Speculation, Rumours and Discussion

    H100 can still run compute workloads, which require texture filtering, so texture units are still there. I guess rasterizer and ROPs might only exist in the graphics-capable GPC after nvidia decoupled ROPs and MCs.
  4. rSkip

    AMD CDNA Discussion Thread

    From dieshot and infinity links placement, I guess that MI100 uses 3x Infinity Fabric Links + WAFL(?), with 3x more links unused. Just guessing, I can be totally wrong.
  5. rSkip

    AMD Radeon RDNA2 Navi (RX 6500, 6600, 6700, 6800, 6900 XT)

    https://www.amd.com/en/products/specifications/compare/graphics/10516%2C10521%2C10526 For 6900XT, 6800XT, 6800:
  6. rSkip

    GCN, GCN2.3.., Vega, and Navi Instruction Cache limitations

    In your example, there are only two instruction streams, one with 32 registers/thread and the other with 64 registers/thread. So the L1 instruction cache has [32KB / 2(streams) / 8B(per instruction) = 2048 instructions] per instruction stream.
  7. rSkip

    AMD: Navi Speculation, Rumours and Discussion [2019-2020]

    comparing with the fortnite model
  8. rSkip

    Nvidia Ampere Discussion [2020-05-14]

    Added lines and perf/w labels to the 1.9x perf/watt slide. (Keep in mind that TU102 doesn't scale as well to 320W, that 1.32x perf/w is comparing worst to worst at different power.)
  9. rSkip

    AMD: Navi Speculation, Rumours and Discussion [2019-2020]

    2x GFlops and 1.5x BW, you are talking about TITAN RTX and RTX 2070.
  10. rSkip

    Intel ARC GPUs, Xe Architecture for dGPUs [2018-2022]

    IMO, "SIMT and SIMD units" = variable vector width (SIMD8 / SIMD16 / SIMD32). It gives you (or the compiler) an option to trade TLP <-> DLP. Gen Graphics has had this for years. It's similar to the choice of wave32 / wave64 modes on AMD RDNA. Intel might add more modes to the existing 8/16/32 for...
  11. rSkip

    AMD Navi Product Reviews and Previews: (5500, 5600 XT, 5700, 5700 XT)

    Navi10 dieshot from AMD HotChips slide
  12. rSkip

    AMD: Navi Speculation, Rumours and Discussion [2019-2020]

    http://chipsleuth.com/tahiti.html#annotated-rams-overview This guy is great. He's got a Tahiti article focused on CU and a Fiji one about the whole chip.
  13. rSkip

    AMD: Navi Speculation, Rumours and Discussion [2019-2020]

    SIMD & Wave execution: GCN: CU has 4 x SIMD16, Wave64 executes on SIMD16 x 4 cycles. RDNA: CU has 2 x SIMD32, Wave32 executes on SIMD32 x 1 cycle. LDS: GCN: 10 Wave64 on each SIMD16, 2560 threads per CU. 2560 threads (1CU) share 64KB LDS. RDNA: 20 Wave32 on each SIMD32, 1280 threads per CU. 2560...
  14. rSkip

    AMD: Navi Speculation, Rumours and Discussion [2019-2020]

    https://www.amd.com/en/press-releases/2019-05-26-amd-announces-next-generation-leadership-products-computex-2019-keynote Footnotes:
  15. rSkip

    AMD: Pirate Islands (R* 3** series) Speculation/Rumor Thread

    https://lh4.googleusercontent.com/-5NWAqiPRGt0/VWwbN8R4BXI/AAAAAAAALJs/62rqQEaLQ-g/w2178-h1225-no/desktop.jpg 1002-67C8, 1150MHz core freq, 8GB HBM. source: http://www.chiphell.com/forum.php?mod=redirect&goto=findpost&ptid=1302682&pid=29101917
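
The arithmetic quoted in items 6 and 13 above can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and the inputs (32KB instruction cache, 8B per instruction, 2 streams, SIMD counts and wave slots) are taken as stated in those posts rather than verified against AMD documentation.

```python
def instructions_per_stream(cache_bytes, streams, bytes_per_instruction):
    """Instructions each stream gets if the cache is split evenly (assumption)."""
    return cache_bytes // streams // bytes_per_instruction


def threads_per_cu(simds, waves_per_simd, wave_width):
    """Resident threads per CU = SIMDs x wave slots per SIMD x threads per wave."""
    return simds * waves_per_simd * wave_width


# Item 6: 32KB / 2 streams / 8B per instruction
print(instructions_per_stream(32 * 1024, 2, 8))

# Item 13: GCN has 4 x SIMD16 with 10 Wave64 each; RDNA has 2 x SIMD32 with 20 Wave32 each
print(threads_per_cu(4, 10, 64))
print(threads_per_cu(2, 20, 32))
```

The three printed values should match the 2048 instructions per stream, 2560 threads per GCN CU, and 1280 threads per RDNA CU quoted in the posts.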