Recent content by fellix

  1. F

    PS5 Pro *spawn

    The RDNA3 CU can do 128 Dot2 ops/cycle (512 total ops) for both 16-bit and 8-bit types. Whatever RDNA revision ends up in the PS5 Pro could be double-pumping specifically for INT8 ops.
  2. F

    RDNA4

    I wonder if AMD was kind of cornered by its market position into "brute forcing" the RDNA3 design -- just more of everything (caches, VGPRs and constipated dual-issue) and very few specific/targeted improvements. And what's up with WMMA support? FSR should have already been forked to implement...
  3. F

    Speculation and Rumors: Nvidia Blackwell ...

    Only if Nvidia breaks from their two-level cache hierarchy. My understanding is that Nvidia really tries to avoid the chiplet/tile route for their designs and keep the cache subsystem streamlined, even at the cost of compensating for it with faster and more expensive GDDR memory. Power consumption...
  4. F

    RDNA4

    https://chipsandcheese.com/2024/01/28/examining-amds-rdna-4-changes-in-llvm/
  5. F

    Polygons, voxels, SDFs... what will our geometry be made of in the future?

    Whitepaper and supplemental materials: https://momentsingraphics.de/HPG2023.html
  6. F

    CYBERPUNK 2077 [PC Specific Patches and Settings]

    Not much different from all the VS/PS/CS kernels running concurrently on the vector ALUs.
  7. F

    CYBERPUNK 2077 [PC Specific Patches and Settings]

    In Cyberpunk Overdrive, the PT and Compute stages take the vast majority of the ALU runtime. The pixel/vertex shading share is relatively small, so even with sub-optimal SIMD occupancy the GPU will be sucking a lot of power.
  8. F

    Intel ARC GPUs, Xe Architecture for dGPUs [2022-]

    Has Nvidia really bothered with the AF pattern since G80?
  9. F

    Intel ARC GPUs, Xe Architecture for dGPUs [2022-]

    Q2 RTX is a free standalone game on Steam.
  10. F

    Intel ARC GPUs, Xe Architecture for dGPUs [2022-]

    Indeed, this CFD simulator uses a lower-precision data format for memory storage, i.e. during load/store operations. This might explain cases of underutilization on some GPU architectures where the memory subsystem is the bottleneck. Here is a 1080 Ti, gaining quite an advantage with FP16C packing:
  11. F

    Intel ARC GPUs, Xe Architecture for dGPUs [2022-]

    Could you see if Arc can run this CFD benchmark, please -- https://github.com/ProjectPhysX/FluidX3D/releases
  12. F

    Intel ARC GPUs, Xe Architecture for dGPUs [2022-]

    https://chipsandcheese.com/2022/10/20/microbenchmarking-intels-arc-a770/
  13. F

    Nvidia Geforce Drivers Release Announcement thread

    Isn't the Control Panel a UWP app with DCH drivers now?
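
A quick sanity check on the Dot2 figure from the PS5 Pro post above: assuming each Dot2 with accumulate counts as four ops (two multiplies and two adds), 128 Dot2 per cycle works out to the 512 total ops quoted there. The sketch below is only illustrative. `ops_per_cycle` and `pump_factor` are hypothetical names, and the double-pumped INT8 case is the post's speculation, not a confirmed spec.

```python
# Assumption: one Dot2 with accumulate = 2 multiplies + 2 adds = 4 ops.
OPS_PER_DOT2 = 4


def ops_per_cycle(dot2_per_cycle: int, pump_factor: int = 1) -> int:
    """Total ops per cycle per CU for a given Dot2 issue rate.

    pump_factor is a hypothetical multiplier (2 = double-pumped).
    """
    return dot2_per_cycle * OPS_PER_DOT2 * pump_factor


# Figure quoted in the post: 128 Dot2 ops/cycle per RDNA3 CU.
baseline = ops_per_cycle(128)

# Speculative case from the post: INT8 double-pumped.
int8_double_pumped = ops_per_cycle(128, pump_factor=2)

print(baseline, int8_double_pumped)
```

If the double-pumping guess holds, INT8 throughput per CU would be twice the 16-bit rate at the same Dot2 issue rate.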