Current Generation Hardware Speculation with a Technical Spin [post GDC 2020] [XBSX, PS5]

I could be wrong here, but I thought the XSX had a separate pool of memory specifically set aside for the CPU?

No. It's the same memory, just different chip densities.
The address space was divided to make it less constrained.
Otherwise most requests would hit the 2x-density (2GB) chips and drag the bandwidth down toward the lower end (336GB/s) for the whole system.
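A quick back-of-the-envelope sketch of how that split falls out of the publicly stated Series X configuration (ten 14Gbps GDDR6 chips on a 320-bit bus, six 2GB + four 1GB); treat it as an illustration, not an authoritative description of the memory map:

```python
# Rough illustration of the Series X bandwidth split.
# Assumed configuration: ten GDDR6 chips at 14 Gbps, 32 bits each (320-bit bus),
# six 2GB chips + four 1GB chips.
CHIPS_GB = [2] * 6 + [1] * 4       # per-chip capacities in GB
PIN_SPEED_GBPS = 14                # per-pin data rate
BITS_PER_CHIP = 32                 # each chip contributes 32 bits of the bus

per_chip_bw = PIN_SPEED_GBPS * BITS_PER_CHIP / 8     # 56 GB/s per chip

# The first 10GB interleaves across all ten chips -> full bandwidth.
fast_pool_bw = per_chip_bw * len(CHIPS_GB)           # 560 GB/s
# The remaining 6GB only exists on the six 2GB chips -> reduced bandwidth.
slow_pool_bw = per_chip_bw * sum(1 for c in CHIPS_GB if c == 2)  # 336 GB/s

print(f"GPU-optimal 10GB pool: {fast_pool_bw:.0f} GB/s")
print(f"Remaining 6GB pool:    {slow_pool_bw:.0f} GB/s")
```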
 
Sure it can, if you have enough of it. I don't even understand your point. Nothing we know of current Navi vs. the RTX 2000 series points to the XSX GPU, which will be Navi 2, being somewhere between a 2070 and a 2080.
The 2070 Super and the 2080 are close, maybe 10~15% apart. The Series X GPU can easily drop 10% below the 2080 if starved for bandwidth. Worse yet, the Series X has a portion of its RAM at around 336GB/s, which can degrade performance even further.

Series X: 560GB/s
2070 Super: 448GB/s
2080: 448GB/s
2080 Super: 495GB/s

In the best case, the Series X has a ~60GB/s advantage (after setting aside ~50GB/s for the Zen 2 cores) over the 2080 and the 2070 Super. Is that enough to overcome the CPU/GPU contention, especially when the CPU is fully loaded with 16 threads? My guess is a resounding NO, though I am open to correction.
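For what it's worth, a small sketch of that bandwidth budget using the peak figures quoted above; the ~50GB/s CPU reservation is an assumption, and a flat subtraction understates real contention (mixed CPU/GPU access patterns cost additional GDDR6 efficiency):

```python
# Naive bandwidth budget from the peak figures quoted above.
# The 50 GB/s CPU reservation is an assumption; real contention costs more
# than a simple subtraction.
peak_bw = {"Series X (10GB pool)": 560, "RTX 2070 Super": 448,
           "RTX 2080": 448, "RTX 2080 Super": 495}   # GB/s
cpu_reservation = 50  # GB/s assumed for the Zen 2 cores (console only)

xsx_gpu_bw = peak_bw["Series X (10GB pool)"] - cpu_reservation
for card in ("RTX 2070 Super", "RTX 2080", "RTX 2080 Super"):
    delta = xsx_gpu_bw - peak_bw[card]
    print(f"{card}: Series X GPU advantage ≈ {delta:+.0f} GB/s")
```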
 
Infinity Fabric will likely be better at handling memory contention between CPU and GPU, so it will likely be less of an impact this gen vs. last gen. AMD has improved their APU memory performance over the years due to constraints in the PC APU space. I don't think the issue will be as bad as some make it out to be, but the XSX will likely have a ~30% (edited down from ~40%) GPU bandwidth advantage vs. the PS5, which will not be easy to overcome.
 
The 2070 Super and the 2080 are close, maybe 10~15% apart. The Series X GPU can easily drop 10% below the 2080 if starved for bandwidth. Worse yet, the Series X has a portion of its RAM at around 336GB/s, which can degrade performance even further.

Series X: 560GB/s
2070 Super: 448GB/s
2080: 448GB/s
2080 Super: 495GB/s

In the best case, the Series X has a 112GB/s advantage over the 2080 and the 2070 Super. Is this enough to overcome the CPU/GPU contention, especially when the CPU is fully loaded with 16 threads? My guess is a resounding NO, though I am open to correction.
And how much of a BW advantage does the XBX have over the RX 580 (bandwidth it still needs to share with the CPU)? Or the PS4 Pro vs. the RX 470, for that matter?

We don't know what performance we will be getting down to .1%, but we do know that, spec-wise, the XSX fits a 2080S while the PS5 fits a ~2070S.
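For reference, a quick sketch answering that rhetorical question with the published peak bandwidth figures (quoted from memory, so treat them as approximate; the console figure is shared with the CPU):

```python
# Last-gen peak memory bandwidth comparisons (published figures, approximate;
# the console bandwidth is shared with the CPU).
pairs = [
    ("Xbox One X", 326, "RX 580", 256),
    ("PS4 Pro",    218, "RX 470", 211),
]
for console, c_bw, gpu, g_bw in pairs:
    adv = (c_bw / g_bw - 1) * 100
    print(f"{console} {c_bw} GB/s vs {gpu} {g_bw} GB/s -> ~{adv:.0f}% more")
```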
 
For example, if there are no other examples of a comparable power draw, we may conclude that unless you want to use AVX, you're good and your GPU will always run at 2.23GHz.

Power draw also varies based on whether the instructions can be carried out fully in cache or need to swap with RAM. In-cache is much faster, and that generates more power draw.

You can test this in synthetics like OCCT and Prime95. For power draw, in-cache SSE > RAM-swapped AVX.
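A tiny sketch of the kind of cache-resident vs. RAM-bound workload those synthetics stress (Python/NumPy purely for illustration; the array sizes are assumptions, and you would watch package power with an external tool such as HWiNFO or RAPL while it runs):

```python
import numpy as np
import time

# Cache-resident workload: a small array hammered repeatedly stays in L1/L2.
# RAM-bound workload: a large array streamed once per pass spills to DRAM.
small = np.ones(1 * 1024 * 1024 // 8)      # ~1 MB, fits in L2 (assumed)
large = np.ones(512 * 1024 * 1024 // 8)    # ~512 MB, far exceeds any cache

def hammer(arr, passes):
    t0 = time.perf_counter()
    s = 0.0
    for _ in range(passes):
        s += float(arr.sum())              # vectorized reads over the array
    return time.perf_counter() - t0

# Equalize total bytes touched so only cache residency differs.
print("cache-resident:", hammer(small, 512), "s")
print("RAM-bound:     ", hammer(large, 1), "s")
```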
 
Around 2070S I guess. Perhaps a bit lower depending on BW and clock speeds.

Probably around RTX2070 levels.

2080-level cards run well above their rated boost clocks and have a parallel INT pipeline.
If we measure theoretical flops, that places them well above 12 TF.

Aha, that explains why DF came to the 2080 conclusion then. They're different architectures, so it's not apples to apples. Between the 2080 and the 2080S, perhaps. We can most likely do a better comparison when RDNA2 GPUs are here.
 
Infinity Fabric will likely be better at handling memory contention between CPU and GPU, so it will likely be less of an impact this gen vs. last gen. AMD has improved their APU memory performance over the years due to constraints in the PC APU space. I don't think the issue will be as bad as some make it out to be, but the XSX will likely have a ~40% GPU bandwidth advantage vs. the PS5, which will not be easy to overcome.
40%?
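(For reference, a quick sketch of how different assumptions land on different percentages; the ~50GB/s CPU reservation is an assumption, not a published figure:)

```python
# How the "XSX bandwidth advantage over PS5" percentage shifts with assumptions.
xsx_fast_pool, ps5 = 560, 448          # GB/s, peak figures
cpu_share = 50                         # GB/s assumed reserved for the CPU on each

raw = xsx_fast_pool / ps5 - 1                                    # ~25%
after_cpu = (xsx_fast_pool - cpu_share) / (ps5 - cpu_share) - 1  # ~28%
print(f"raw peak:        ~{raw:.0%}")
print(f"minus CPU share: ~{after_cpu:.0%}")
```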
 
Yes, true; it's probably around a 2080 or 2080S depending on the situation. It should be: 12TF lands around those.

Where do you put PS5?
Actually, just naively scaling from the RX 5700 XT, 12 TF of RDNA2 should be somewhat faster than a 2080S even if it were only as efficient per clock as RDNA1, and AMD claims RDNA2 gets more work done per clock too (real work, not theoretical flops-work).
 
Actually, just naively scaling from the RX 5700 XT, 12 TF of RDNA2 should be somewhat faster than a 2080S even if it were only as efficient per clock as RDNA1, and AMD claims RDNA2 gets more work done per clock too (real work, not theoretical flops-work).

NV states default TF, but they actually run higher. A 2080 is actually around 11TF, for example, or that's what some have explained to me here. I did the same comparison as you before.
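A quick sketch of where that ~11TF figure comes from; the real-world sustained clock is an assumption (Turing cards typically run well above the rated boost):

```python
# FP32 TFLOPS = 2 ops per core per clock * cores * clock (GHz) / 1000.
def tflops(cores, clock_ghz):
    return 2 * cores * clock_ghz / 1000

rtx2080_cores = 2944
print("RTX 2080 @ rated boost 1.71 GHz:", round(tflops(rtx2080_cores, 1.71), 2), "TF")
print("RTX 2080 @ ~1.9 GHz sustained  :", round(tflops(rtx2080_cores, 1.90), 2), "TF")
```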
 
NV states default TF, but they actually run higher. A 2080 is actually around 11TF, for example, or that's what some have explained to me here. I did the same comparison as you before.
I'm not comparing to flops, theoretical or whatever it clocks itself to in a real situation. I'm comparing to real-world performance in a variety of games (aka TPU numbers), by assuming RDNA2/XSX is as fast per FLOP as the 5700 XT. RDNA2/XSX performance would then be 5700 XT * 1.246, which ends up faster than a 2080S in all but 4K. Add to that AMD's claimed improvements in performance per clock and it should be faster regardless of resolution.
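A minimal sketch of that scaling: the 1.246 factor is just the TFLOP ratio (12.15 TF XSX / 9.75 TF 5700 XT). The relative-performance input it gets multiplied against should come from TPU's charts; the value below is only an illustrative placeholder:

```python
# Naive scaling: assume XSX/RDNA2 does the same work per FLOP as the 5700 XT.
xsx_tf, rx5700xt_tf = 12.15, 9.75
scale = xsx_tf / rx5700xt_tf                 # ≈ 1.246

# Placeholder: 5700 XT relative performance vs a 2080S at some resolution,
# to be taken from TechPowerUp's charts (hypothetical illustrative value).
rx5700xt_vs_2080s = 0.85

projected = rx5700xt_vs_2080s * scale
print(f"TFLOP scale factor: {scale:.3f}")
print(f"Projected XSX vs 2080S: {projected:.2f}x (with the placeholder input)")
```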
 
Probably due to the different architectures; I think we'll have a better idea when we can compare to RDNA2 GPUs. DF's conclusion that it's (at least) a 2080 is a good sign for the XSX, with the PS5 being 2070 level.
 
Will be interesting how much L3 they each have.

I was trying to figure out how to add up to the 76MB of SRAM, but... I dunno, I could be missing something.

Zen 2
-------
L1 = (32kB I$ + 32kB D$) * 8 = 512kB
L2 = 512kB * 8 = 4MB
L3 = 16MB * 2 = 32MB :?:
Total CPU = 36.5MB???

Anaconda GPU
4 SIMD per WGP
-----------------------
I$ = 32kB per WGP
K$ = 16kB
Vector register file = 128kB*4 SIMD
L0 Vector cache = 2 * 16kB per WGP

Scalar write-back cache = 16kB*4 SIMD
Scalar RF = 10kB *4 SIMD

L1 = 128kB per shader array (assume 4)
LDS = 2*64kB per WGP
^very confused at trying to read the whitepaper


28 WGP = 23.5MB???

GPU L2 = 4-10MB? (4 slices * 4-5 64-bit channels * 256kB or 512kB per slice)

Render Back End caches = 128kB per RBE??? (guesswork - I think sebbbi did some tests back with GCN, but it's hard to say whether anything has changed, and I can't remember the amount he deduced)


Total 64-70MB???

Plus stuff we have no idea about, like audio, RT, redundancy
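A small sketch tallying the guesses above (same numbers, same question marks; the WGP count, scalar write-back figure, L2 size and omitted RBE caches are the assumptions already flagged in the post):

```python
KB = 1
MB = 1024 * KB

# Zen 2 (8 cores, 2 CCX): L1 + L2 + L3
cpu = 8 * (32 + 32) * KB + 8 * 512 * KB + 2 * 16 * MB

# Per-WGP GPU SRAM, from the figures above (RDNA whitepaper reading, guesswork)
per_wgp = (32 * KB            # I$
           + 16 * KB          # K$ (scalar/constant cache)
           + 4 * 128 * KB     # vector register file, 4 SIMDs
           + 2 * 16 * KB      # L0 vector caches
           + 4 * 16 * KB      # scalar write-back caches (uncertain)
           + 4 * 10 * KB      # scalar register files
           + 2 * 64 * KB)     # LDS
gpu = 28 * per_wgp + 4 * 128 * KB        # 28 WGPs + 4 shader-array L1s

l2_low, l2_high = 4 * MB, 10 * MB        # GPU L2 guess; RBE caches left out

total_low  = (cpu + gpu + l2_low) / MB
total_high = (cpu + gpu + l2_high) / MB
print(f"CPU ≈ {cpu / MB:.1f} MB, GPU (excl. L2) ≈ {gpu / MB:.1f} MB")
print(f"Total ≈ {total_low:.0f}-{total_high:.0f} MB (plus audio/RT/redundancy)")
```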
 
I happen to think he's probably wrong and the procedural work will be handled up front and put onto the SSD. I think the reason it was done in memory was because of HDD performance. With a fast SSD they can probably make big gains up front and save a good chunk of computing power. It's a good topic for discussion.

Edit: I do think there's a larger industry discussion to be had about the cost of games. Gamers are expecting a lot, but without the cost of games going up, I'm not sure how many development studios are actually financially equipped to handle it. How many flops can Ubisoft handle if the cost of game development goes up 25% or 50% to meet consumer expectations for this new hardware?
I gave this some thought, and taking in some of the discussion from Shifty, Fox, Brit, some of the other B3D members talking about ML/AI, and Alex, I actually think Alex is right here.
Hear me out; this relies on only one axiom:
The best possible texture detail you can see is dependent on your display resolution
- in this case the target is, at best, likely 4K native.

But in a world of dynamic scalers and dynamic resolution, checkerboarding, TAA, and VRS, there are likely not going to be many titles at native 4K.
So immediately, having textures at that level and trying to MIP them for 4K is a waste of resources, especially at distance; you won't see it.

And then comes the interesting point of bringing ray tracing into the equation. If we are ray tracing, the base resolution will probably be around 1080p, up to 1440p maximum. High-resolution textures won't matter there either.
And the only option is to upscale from 1080p to 4K, for which we would likely use something like DLSS. So we don't need the drive to hold textures greater than 1080p warrants, because DLSS is really the one creating the detail.
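To put a rough number on that axiom, here's a minimal sketch of the upper bound on useful texture resolution: mip selection keeps roughly one texel per pixel, so a surface can never show more texels than the screen pixels it covers. The screen-coverage fraction below is just an illustrative assumption:

```python
import math

# Upper bound on useful texture resolution: a surface can't display more
# texels than the screen pixels it covers (~1 texel per pixel at the chosen mip).
def max_useful_texture_side(screen_w, screen_h, coverage):
    """coverage = fraction of the screen the surface fills (hypothetical input)."""
    pixels_covered = screen_w * screen_h * coverage
    side = math.sqrt(pixels_covered)            # square texture, 1 texel per pixel
    return 2 ** math.ceil(math.log2(side))      # round up to a power of two

for res_name, (w, h) in {"1080p": (1920, 1080), "4K": (3840, 2160)}.items():
    # assume a surface filling a quarter of the screen (illustrative)
    print(res_name, "->", max_useful_texture_side(w, h, 0.25), "texels per side")
```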
 