Recent content by Ext3h

  1. Ext3h

    Dynamic register allocation in GPUs

    Not just that. That needs to be provided for 2 read and 1 write ports just to keep the overall lower bound for instruction latency reasonable. Add on top one more read and write port with lower latency bounds for communication with the next cache tier. That's a lot to service in a single cycle...
  2. Ext3h

    The AMD 9070 / 9070XT Reviews and Discussion Thread

    Sorry, looks like I misunderstood. I expected RR to be active by default in a lot more titles than it actually is. So those performance differences are real after all.
  3. Ext3h

    The AMD 9070 / 9070XT Reviews and Discussion Thread

    We already do get an out-of-order scoreboard for memory loads in RDNA4, that already is a huge improvement. Still not perfect, but at least it does already mean that a wave that does fully hit a cache won't be stalled behind one that at least partially misses. Doesn't change though that this...
  4. Ext3h

    Dynamic register allocation in GPUs

    Huh, curious that this approach ended up being efficient at all. I mean that this implies that the M3 is backing the register file with a cache that has multiple associativity rather than a simple adder+mux. There's definitely been a trade-off there between better occupancy and increased...
  5. Ext3h

    Dynamic register allocation in GPUs

    Most likely both require assistance by the shader compiler to flag the lower part of the stack as "preemptable", so rather than a hard register count, you now have a peak working set size, and a total register count for the deepest part of the call tree. Worst case it requires hoisting a couple...
  6. Ext3h

    DirectStorage GPU Decompression, RTX IO, Smart Access Storage

    We don't. In the linked video he simply switched texture settings to "medium" prior to testing all the other cards. All of the following tests in that video merely demonstrate the (expected) overhead of decompression, but not those hard spikes. But there's plenty of unique reports from users who...
  7. Ext3h

    DirectStorage GPU Decompression, RTX IO, Smart Access Storage

    DirectStorage does two things. What you refer to is the GDeflate decompression that's happening on the GPU. The other half is shifting the uploads from the 3D/Compute queues to the copy queue as a hidden implementation detail. That's not just semantic sugar to keep them out of the other queues...
  8. Ext3h

    DirectStorage GPU Decompression, RTX IO, Smart Access Storage

    It's just a scheduling problem, so there are plenty of solutions: NVidia could use the designated PCIe protocol features for ensuring that command streams have priority over data streams. Even though that's IMHO close to impossible to happen for the Blackwell family. NVidia could force the Copy...
  9. Ext3h

    DirectStorage GPU Decompression, RTX IO, Smart Access Storage

    Monster Hunter Wilds came, used DirectStorage aggressively for all asset streaming - and to no one's surprise it introduced frame time spikes during asset streaming on NVidia's entire GPU lineup, up to and including the 5090. Don't get hung up on the video mixing up two distinct issues though...
  10. Ext3h

    PCIe 12VHPWR and 12V-2x6 power connector issues

    Not 15% of the total dissipation, but only of the losses up to the connector... But yes, I don't even trust the measurement either. He claimed to have measured the high frequency noise by putting the current clamp as close to the GPU as possible, so he possibly picked up a lot of noise that...
  11. Ext3h

    PCIe 12VHPWR and 12V-2x6 power connector issues

    In a lengthy discussion, Igor from Igor's Lab also pointed out that the 4090 and 5090 are dumping a lot of high frequency and foremost high amplitude noise directly into the 12V-2x6 connector. There is a filter on the GPU that's eliminating noise >= 40kHz (so they do avoid Skin Effect which...
  12. Ext3h

    NVIDIA discussion [2025]

    https://www.techpowerup.com/review/nvidia-geforce-rtx-5090-founders-edition-unboxing/4.html Look at the solder points for the 12VHPWR socket on the 5090 FE, can be clearly seen on the rear. Just how do you think that's supposed to be two rails, with only 2 pin-through slots and one of...
  13. Ext3h

    PCIe 12VHPWR and 12V-2x6 power connector issues

    Nice catch indeed. However, that will probably be misunderstood as "if my terminal isn't recessed, then the cable is good". A recessed terminal is only one way it can have failed, it can also: Lose pressure because the female part is effectively supposed to be a spring that may become worn...
  14. Ext3h

    PCIe 12VHPWR and 12V-2x6 power connector issues

    You can find high-res images of the plug on reddit when people ask if they have seated it correctly. You can clearly see the square receptacles for the individual terminals, and also that it has a seamless case. It does not have a bridge.
  15. Ext3h

    PCIe 12VHPWR and 12V-2x6 power connector issues

    Context. Don't ignore the context. That image was posted as a negative example because it demonstrated exactly what Nvidia messed up in their very own 8-pin to 12VHPWR adapter when they were using a brittle, soldered connection without crimping. That one failed in a pretty unique way, where you...