Current Generation Hardware Speculation with a Technical Spin [post GDC 2020] [XBSX, PS5]

Status
Not open for further replies.
Well, all related to the SSD (plus 3D audio, didn't read that part). Shifty will be disappointed without the BVH traversal block and only the intersection hardware.

There's also this:

Digital Foundry said:
A technique called Sampler Feedback Streaming - SFS - was built to more closely marry the memory demands of the GPU, intelligently loading in the texture mip data that's actually required with the guarantee of a lower quality mip available if the higher quality version isn't readily available, stopping GPU stalls and frame-time spikes. Bespoke hardware within the GPU is available to smooth the transition between mips, on the off-chance that the higher quality texture arrives a frame or two later. Microsoft considers these aspects of the Velocity Architecture to be a genuine game-changer, adding a multiplier to how physical memory is utilised.
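The fallback behaviour DF describes can be sketched roughly like this. This is a toy illustration only, not the actual DirectX Sampler Feedback API; the class and method names are made up:

```python
# Hypothetical sketch of the SFS fallback described above. If the requested
# mip isn't resident yet, the sampler falls back to the best lower-quality
# mip already in memory (a low mip is always guaranteed resident), and the
# miss is recorded as feedback so the streamer can load the higher mip a
# frame or two later.

class ResidentTexture:
    def __init__(self, mip_count):
        self.mip_count = mip_count          # mip 0 = highest quality
        self.resident = {mip_count - 1}     # lowest-quality mip always resident
        self.requested = set()              # sampler feedback for the streamer

    def request_mip(self, mip):
        """Return the best available mip at or below the requested quality."""
        if mip in self.resident:
            return mip
        self.requested.add(mip)             # record the miss as feedback
        # fall back to the nearest lower-quality (higher-index) resident mip
        return min(m for m in self.resident if m > mip)

    def stream_in(self, mip):
        """Streamer finishes loading a mip some frames later."""
        self.resident.add(mip)

tex = ResidentTexture(mip_count=10)
print(tex.request_mip(2))   # -> 9 (only the lowest mip is resident yet)
tex.stream_in(2)
print(tex.request_mip(2))   # -> 2 (now resident, no stall)
```

The "bespoke hardware to smooth the transition" would then blend between the fallback mip and the late-arriving one, which this sketch doesn't model.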
 
but we do know... assuming an industry-standard 10-deep BVH, the 2080 Ti can in theory do 13.6 gigarays/sec at 2 GHz: RT core count (68 in the 2080 Ti's case) times clock speed, divided by BVH depth, gets you your theoretical peak RT performance. Nvidia quotes 10, but they also quote a lower boost clock. For Xbox Series X, 1825 MHz times 208 RT units gets you the 380 billion/second number MS quoted; divided by the 10-deep BVH, that's ~38 (37.96 to be exact) gigarays/sec, or ~4x that of the 2080 Ti. Devs can feel free to chime in on this, but the math is correct on paper. In practice the difference is even bigger because Nvidia's design is literally insane: an RT op is warp-wide, so the entire SM is basically idle while RT is being done on a Turing RT core, plus it has no latency hiding, which I'd happily go into detail about if anyone actually cares. But no, RT performance on this is not below a 2060; it's ~4x that of a 2080 Ti minimum.
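Spelling out the arithmetic in the post above (note the 10-deep BVH is that poster's assumption, not a published spec, and real rays don't cost a fixed number of intersection tests):

```python
# Back-of-the-envelope version of the gigarays claim above.

def peak_gigarays(rt_units, clock_ghz, bvh_depth=10):
    # tests/sec (in billions) = units * clock_GHz; one ray is
    # assumed to cost bvh_depth intersection tests
    return rt_units * clock_ghz / bvh_depth

print(peak_gigarays(68, 2.0))            # RTX 2080 Ti: -> 13.6

xsx_units = 52 * 4                       # 52 CUs x 4 texture/RT units = 208
print(xsx_units * 1.825)                 # roughly 379.6 billion tests/sec
print(peak_gigarays(xsx_units, 1.825))   # roughly 37.96 "gigarays"/sec
```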
You seriously believe this kind of first-grade calculation? And no, the CUDA cores are not idle during RT ops.

This is AABB traversals and intersections, not to be confused with gigarays. Two completely different things. It takes multiple AABB intersections and a triangle intersection for a single ray.

And even then they are two different metrics anyhow; we need way more information before we compare Turing to RDNA 2.0.
 
Other hardware feature additions to the new consoles include the following:

Over/Under-estimate Conservative Rasterization
Primitive Ordered Pixel Shaders
Sparse 3D Images
Post-depth Coverage
Packed Math (already implemented with two console vendors)
Sub-Dword Addressing (SDWA instructions)
Data Parallel Processing (DPP instructions)
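On the packed-math entry above: the point is two FP16 operations per 32-bit lane. Here's a toy CPython illustration of the data layout only (not real GPU code; the function names are made up, using the `struct` module's half-precision format):

```python
import struct

# Packed math (e.g. RDNA's v_pk_* instruction family) operates on two FP16
# values held in one 32-bit register. This sketch shows the packing and a
# packed add; on the GPU both halves execute in a single lane operation.

def pack_half2(a, b):
    """Pack two floats into 4 bytes as IEEE-754 half precision."""
    return struct.pack('<ee', a, b)

def unpack_half2(word):
    return struct.unpack('<ee', word)

def pk_add(x, y):
    """Elementwise add of two FP16 pairs, like a packed FP16 add."""
    a0, a1 = unpack_half2(x)
    b0, b1 = unpack_half2(y)
    return pack_half2(a0 + b0, a1 + b1)

x = pack_half2(1.5, 2.0)
y = pack_half2(0.5, 0.25)
print(unpack_half2(pk_add(x, y)))   # -> (2.0, 2.25)
```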

Listed below is a subset of the functionality in the console implementations of mesh shaders:

Broadcasting to Multiple Viewports
Per-view Attributes
Fast Geometry Shaders
 
Can you do some calculations? I am open to all number crunching.
We can't, the quoted metrics are not the same, as explained above.

NVIDIA's GigaRays metric encompasses a lot of ray intersections, but we don't know how to convert GigaRays into intersection tests, since we lack the formula.
The Xbox's quoted intersection tests also lack context: we don't know how many intersection tests go into any given GigaRay. We lack the formula, so we can't convert between them.
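To make the non-convertibility concrete, here's a sketch: the gigarays number you get from an intersection-test rate depends entirely on an assumed tests-per-ray factor, which neither vendor publishes.

```python
# intersections/sec and gigarays/sec only convert via an unknown
# tests-per-ray factor that varies with scene content and BVH shape.

def gigarays_from_tests(tests_per_sec_billions, tests_per_ray):
    return tests_per_sec_billions / tests_per_ray

# Xbox quotes 380 billion intersection tests/sec. The implied gigarays
# figure swings wildly with the assumed tests-per-ray:
for tpr in (5, 10, 20, 40):
    print(tpr, gigarays_from_tests(380, tpr))
# 5 -> 76.0, 10 -> 38.0, 20 -> 19.0, 40 -> 9.5
```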

Microsoft states the Xbox is capable of doing the equivalent of 13 TF of ray tracing through its RT cores; NVIDIA claims 23 TF through RT cores and an additional 2 TF through concurrent INT/FP execution.

[Image: NVIDIA GeForce RTX/GTX DXR slide showing a Metro Exodus frame breakdown across RT cores and DLSS]
 
There was so much emphasis on locked clocks and no boost that maybe MS suspects Sony is going to try some kind of dynamic clocking? Dynamic clocks would allow for nice peak numbers, but what's sustainable is another matter entirely.

It's more traditional to want a very consistent target. The current gen had some opportunities for dynamic clocks (Jaguar's turbo at that time was debatable), and they opted not to do so.
Having the option to vary platform clocks with SMT is something of a change, although in this case it's at developer behest. I'm curious how this is handled, perhaps like a VM spin-up, versus the BIOS-level settings needed for PCs to change SMT mode.

Have they commented on whether or not the BVH accelerators are in the TMUs like we've seen on AMD's patents?
52 CUs x 4 TMUs x 1.825 GHz = 379.6 billion/s of something.
 
A lot of people believe the XSX is 380 mm2 or even 400 mm2, and they think the number from Aquariuszi in Taiwan is fake.

Now the official number is 360.45 mm2, and Aquariuszi again proves his credibility.

He said 350 mm2 for the XSX, and 300 mm2 for the PS5.

Heh, not everyone thought 350 was that crazy....
Okay, the res/cap arrays are almost certainly spaced based on the underlying BGA pitch to fit with vias (so quantized to either 0.40 or 0.50 mm), and it seems to be 0.40. The chips within each string are spaced exactly 2.4 mm or 2.5 mm apart.

Measuring each string of 7 or 10 as close to the chip as possible (the image has a slight vertical stretch), I averaged the strings on each side, with a tiny difference:
Vertical 22.231 pix/mm
Horizontal 21.758 pix/mm

So I get a rather precise 350 mm2.

No, that makes no sense. But that's what I have. Maybe it shifts 0.1 per chip to align with a 0.50 BGA and fan-out. So 2.5 per chip means... 380 mm2.
Well if anyone can measure the BGA pitch of modern GPU from AMD and it's 0.40mm then I would say 350 mm2 is worth a bet. Some sort of bet.

And 0.50 would mean 380 obviously.
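The measurement method a few posts up (calibrate pixels-per-mm against the known component pitch, then scale the die's measured pixel dimensions) boils down to the following. The pixel dimensions here are placeholders, not the actual measurements from the post:

```python
# Die-size estimate from a photo: calibrate pixels-per-mm from a feature of
# known physical size (the res/cap string pitch), then convert the die's
# pixel dimensions to millimetres on each axis.

def die_area_mm2(width_px, height_px, h_px_per_mm, v_px_per_mm):
    return (width_px / h_px_per_mm) * (height_px / v_px_per_mm)

# calibration values from the post (averaged chip strings)
V_PPMM = 22.231
H_PPMM = 21.758

# e.g. a hypothetical die measuring 10 mm x 10 mm at this calibration:
print(round(die_area_mm2(217.58, 222.31, H_PPMM, V_PPMM), 3))   # -> 100.0
```

Note the two calibration axes differ by ~2%, which is the "slight vertical stretch" mentioned; using a single average instead would bias the area estimate by roughly that amount.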
Funny how my method of aligning to a known substrate pitch was correct, and I was so far off compared to everyone else, including Digital Foundry, that I thought my 0.40 pitch must be wrong. :runaway:
 
Glad to see a reasonable die size and RDNA 2's big improvement just in time for next gen; it means the doom and gloom about price was an overreaction, since the SoC is practically the same size as previous launch consoles. I'm not really worried about the RAM cost, because the weird two-speed memory partitioning is a good indication they're still designing to a price point. And the power should be maybe a little above 200 W, without too-expensive cooling.

Questions remain about the BOM: the SSD size and cost, the increased cost of 7nm, and the proprietary storage expansion. It sucks that it's proprietary but we went through all the possibilities and it was pretty much expected.
 
I think you're asking about the SSD's process node size? Because I thought they already specified 1 TB of SSD capacity, maybe even last year at the VGAs.
 