Nvidia Blackwell Architecture Speculation

  • Thread starter Deleted member 2197
It's probably a CPU limit.
Very well could be, although there are games out there like Alan Wake that are relatively CPU light, so I'm not sure why they’d pick a game like Far Cry 6. Also, it’s roughly in line with the uplift on other cards despite the 5090 being the only card with a substantial SM bump. Interesting.

Edit: See below. It could also be that clock speeds on the 5090 are down (they look up on the other cards, but only modestly). So much for those 2.9 GHz base clocks.

Also, the fine print says there is an MFG (multi frame gen, I’m guessing) 4x mode.
 
Looks like the clock speeds are down a bit on the 5090: base clock is 2 GHz and boost is 2.4 GHz. Presumably this was to keep power in check, though it may also explain those 2x power connector rumors.
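For a rough sense of how much of the SM bump the lower clocks give back, here is a minimal sketch that compares SM count times boost clock. The 2.4 GHz boost comes from the post above; the SM counts (170 for the 5090, 128 for the 4090) and the 4090's 2.5 GHz boost are assumptions taken from public spec listings, and SM count times clock is only a crude proxy for raw throughput.

```python
def relative_throughput(sm_new, clock_new_ghz, sm_old, clock_old_ghz):
    """Ratio of (SM count x clock) between two cards.

    A crude proxy only: it ignores memory bandwidth, cache,
    CPU limits and any per-SM architectural changes.
    """
    return (sm_new * clock_new_ghz) / (sm_old * clock_old_ghz)


# Assumed figures (not from the thread, apart from the 2.4 GHz boost):
# 5090: 170 SMs at 2.4 GHz boost, 4090: 128 SMs at 2.5 GHz boost.
sm_ratio = 170 / 128
ratio = relative_throughput(170, 2.4, 128, 2.5)

print(f"SM ratio alone:   {sm_ratio:.2f}x")
print(f"SM x clock ratio: {ratio:.2f}x")
```

Under these assumptions the SM-times-clock ratio comes out lower than the SM ratio alone, which fits the idea that the reduced clocks are eating part of the extra silicon.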
 
Getting the transistor counts and die sizes for the small chips will be interesting.

I wonder if GB203 is going to have roughly the same transistor count and die size as AD103, or even come in smaller. TPU's spec page actually has it smaller in terms of die size at the moment, but I'm not sure what that is based on. However, if it does come in at ~half the transistor count of GB202, that would put it at 46B, which is the same as AD103.

Getting both significant perf and capability uplift without any increase in transistors, on what is likely just an iteration of the same node, is something.
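The half-of-GB202 arithmetic can be sketched as below. The 92B figure for GB202 is only what the post implies (half of it being ~46B), and the 45.9B figure for AD103 is an assumption from public spec listings, so treat both as placeholders rather than confirmed numbers.

```python
# Transistor counts in billions. Both values are assumptions:
# GB202 is implied by the "~half is 46B" estimate in the post,
# AD103 is taken from public spec listings.
GB202_TRANSISTORS_B = 92.0
AD103_TRANSISTORS_B = 45.9


def half_die_estimate(full_die_b):
    """Estimate a smaller die as exactly half of the big one."""
    return full_die_b / 2


gb203_estimate_b = half_die_estimate(GB202_TRANSISTORS_B)
delta_pct = (gb203_estimate_b / AD103_TRANSISTORS_B - 1) * 100

print(f"GB203 estimate: {gb203_estimate_b:.1f}B")
print(f"Difference vs AD103: {delta_pct:+.2f}%")
```

If the halving guess holds, the difference against AD103 is well under 1%, which is the "no increase in transistors" scenario described above.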
 
SM count is up but only by ~5%.
 
The new transformer model for ray reconstruction looks awesome, and you can use the Nvidia app to override the native ray reconstruction in the game (same with frame gen, where you can force 4x instead of 1x).
Will be really intrigued to see this in action more. I have to imagine you’ll need quite a high refresh rate to avoid noticeable latency with 4x frame gen.
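To put rough numbers on the refresh rate point, here is a minimal sketch assuming that with an Nx multiplier only one in every N displayed frames is actually rendered, and that input is sampled at the rendered rate. This is a simplified model of my own, not Nvidia's: it ignores Reflex, the extra delay from holding frames back for generation, and display latency.

```python
def rendered_frame_time_ms(displayed_fps, multiplier):
    """Time between actually rendered frames, in milliseconds.

    Assumes one rendered frame per `multiplier` displayed frames.
    """
    rendered_fps = displayed_fps / multiplier
    return 1000.0 / rendered_fps


# How long each "real" frame lasts at common refresh rates with 4x frame gen.
for displayed_fps in (120, 240, 360):
    ms = rendered_frame_time_ms(displayed_fps, 4)
    print(f"{displayed_fps} fps displayed -> {ms:.1f} ms per rendered frame")
```

In this model a 120 Hz display at 4x is fed by a 30 fps base, so the game would respond like a 30 fps game however smooth it looks, which is why a high refresh rate seems necessary.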
 
With the 5090 it will be interesting to see how much performance they may have left on the table due to power if a dual power connector AIB model comes out (Kingpin has implied they will be making one).
 

The number of triangles used to create games has exponentially increased over the past 30 years. With the introduction of the Unreal Engine 5 Nanite geometry system, developers can build open worlds filled with hundreds of millions of triangles. However, as ray traced game scenes explode in geometric complexity, the cost to build the bounding volume hierarchy (BVH) each frame for various levels of detail (LOD) grows exponentially, making it impossible to achieve real-time frame rates. RTX Mega Geometry accelerates BVH building, making it possible to ray trace up to 100x more triangles than today’s standard.

RTX Mega Geometry intelligently updates clusters of triangles in batches on the GPU, reducing CPU overhead and increasing performance and image quality in ray traced scenes. RTX Mega Geometry is coming soon to the NVIDIA RTX Branch of Unreal Engine (NvRTX), so developers can use Nanite and fully ray trace every triangle in their projects. For developers using custom engines, RTX Mega Geometry will be available at the end of the month as an SDK to RTX Kit. Sign up to be notified of availability.
 
It's around 10.5% on the 5080 over the 4080, with a very minor clock bump. It has some shader execution reordering and better tensor utilization, so that will gain something, but I would guess standard raw non-DLSS, non-DXR performance is going to be up by something like 20%.
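The ~5% and 10.5% SM figures in this thread are both plausible depending on the baseline. A quick sketch, assuming 84 SMs on the 5080, 76 on the 4080, and 80 on a full AD103 die (all assumptions from public spec listings, none stated in the thread):

```python
def sm_uplift_pct(new_sms, old_sms):
    """Percentage increase in SM count from old to new."""
    return (new_sms - old_sms) / old_sms * 100


# Assumed SM counts: 5080 = 84, 4080 = 76, full AD103 die = 80.
card_vs_card = sm_uplift_pct(84, 76)  # 5080 vs 4080 as shipped
die_vs_die = sm_uplift_pct(84, 80)    # 5080 vs full AD103 die

print(f"5080 vs 4080:       {card_vs_card:.1f}%")
print(f"5080 vs full AD103: {die_vs_die:.1f}%")
```

So the smaller figure would be a die-to-die comparison and the larger one a card-to-card comparison, if those SM counts are right.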
In a sense it’s what people should expect right now - a new gen with a modest but respectable raw uplift at similar (or better) prices and a focus on better features/AI. In that sense it has the makings of a success, especially in the mid range.

Still a bit of a surprise on the 5090 though - the raw uplift looks in line with the other cards despite all that extra silicon and bandwidth. Although Far Cry 6 is pretty CPU limited so they may be hiding the ball a bit there.
 