Current Generation Hardware Speculation with a Technical Spin [post GDC 2020] [XBSX, PS5]

GDDR6 has two independent 16-bit channels within each 32-bit connection to a memory chip, whether that chip is 1GB or 2GB in density, so it should be possible to assign one of those channels to CPU tasks and one to the GPU, splitting the memory chip. I don't know if more complex granularity can be done. Of course, even in that scenario you can no longer run the maximum 560GB/s of bandwidth to the GPU, but you can keep all 10 chips fully or partially active for the GPU, imo.

True, you could do that, but it's just better to timeshare. The CPU just doesn't use the memory that often.

From a hardware point of view it's possible; the question is whether a memory controller would allow such a split or whether it would effectively lock the entire bus to a single host. It feels like it wouldn't, and you'd get one or the other and have to time-division multiplex between the two consumers.

You can just do it in software by allocating memory carefully. Half of the physical address space of the chip is on channel 0 and the other half is on channel 1.
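If the address mapping really is that coarse, the split could in principle be done purely by an allocator. A minimal sketch of the idea, assuming (as suggested above) that the lower half of a chip's physical range sits on channel 0 and the upper half on channel 1; the chip size and helper name here are illustrative, not anything from either console's actual memory map:

CHIP_SIZE = 2 * 1024**3        # assume a 2GB chip for illustration
CHANNEL_SIZE = CHIP_SIZE // 2  # each 16-bit channel backs half of it

def channel_of(phys_addr: int) -> int:
    # Which 16-bit channel a physical address within this chip lands on,
    # under the half-and-half mapping assumed above.
    return 0 if (phys_addr % CHIP_SIZE) < CHANNEL_SIZE else 1

# An allocator could then hand the GPU only pages where channel_of(addr) == 0
# and the CPU only pages where channel_of(addr) == 1, splitting the chip in
# software without any special memory-controller mode.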
 
Like Cerny said, just a 2-3% downclock could reduce power by 10%. That's ~50MHz and shouldn't be much greater, so it's still pretty much a 10 TF console through and through, with a sizable 350MHz clock advantage in the worst-case scenario while keeping an impressive 405MHz advantage most of the time.
Here's the thing though: why would Sony even bother using boost clocks if the variance in clock speed is only 50MHz? I can see lowering the CPU frequency to boost the GPU as a fair trade-off, but if it really is that small a clock difference, it seems pretty pointless to me.
 
Here's the thing though: why would Sony even bother using boost clocks if the variance in clock speed is only 50MHz? I can see lowering the CPU frequency to boost the GPU as a fair trade-off, but if it really is that small a clock difference, it seems pretty pointless to me.

For the same reason that somehow 9.99999 TFLOPs is some sort of “deal breaker” but 10.3 somehow is ok? The reason being: annoying forum users.
 
Here's the thing though: why would Sony even bother using boost clocks if the variance in clock speed is only 50MHz? I can see lowering the CPU frequency to boost the GPU as a fair trade-off, but if it really is that small a clock difference, it seems pretty pointless to me.
The easy answer: Sony's last minute push :p

The potential answer: being able to shift power budget between the CPU and GPU under a fixed SoC cap gives more oomph for the buck, because in practice there is no application (especially a game) that has an invariant CPU-GPU performance profile. Even within the GPU, a bottleneck in one part of the pipeline can lead to under-utilization, and in turn lower power use, in other parts, creating clock-ramping opportunities (because more power budget is available).
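As a rough illustration of why the last few percent of clock are so expensive (and why a 2-3% downclock can buy back on the order of 10% of power, per the Cerny comment quoted above): dynamic power scales roughly with frequency times voltage squared, and near the top of the voltage/frequency curve the voltage has to rise with frequency too. Assuming, purely for the sake of the sketch, that voltage tracks frequency linearly in that region:

def relative_power(clock_scale: float) -> float:
    # Crude P ~ f * V^2 model with V assumed proportional to f near the top
    # of the curve, i.e. P ~ f^3. Real silicon curves tend to be steeper still.
    return clock_scale ** 3

print(relative_power(0.97))  # ~0.91 -> a 3% downclock saves roughly 9-10% power
print(relative_power(0.98))  # ~0.94 -> a 2% downclock saves roughly 6% power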
 
That's not how the XSX memory setup works. There are 10 32-bit controllers, of which 6 attach to 2GB chips and 4 attach to 1GB chips. Every controller works at 14Gbps. The 10GB "fast pool" is just the first 1GB on all the chips, so the 6GB "slow pool" is not independent of the fast pool, having the CPU use the slow portion does still slow down the fast portion.

That's why the total bandwidth is still 560GB/s, not 560+336GB/s
Correct. I was wrong.
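For anyone following along, the numbers fall straight out of the controller layout described above (10 chips, one 32-bit controller each, 14Gbps per pin):

CONTROLLERS = 10        # one 32-bit controller per chip
BUS_WIDTH_BITS = 32
DATA_RATE_GBPS = 14     # per pin

per_chip_gbs = BUS_WIDTH_BITS * DATA_RATE_GBPS / 8  # 56 GB/s per chip
print(CONTROLLERS * per_chip_gbs)  # 560 GB/s: the 10GB "fast" pool spans all 10 chips
print(6 * per_chip_gbs)            # 336 GB/s: only the six 2GB chips back the 6GB "slow" pool
# Both pools share the same ten controllers, so 560 and 336 are two views of the
# same bus, not figures you can add together.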
 
For the same reason that somehow 9.99999 TFLOPs is some sort of “deal breaker” but 10.3 somehow is ok? The reason being: annoying forum users.
Agree, that's probably the case. If it really is only 50MHz, wouldn't that still put their TF number over 10 (10.045?)? Looks more like it's 60MHz, as that would put their TF number slightly under 10 hehe :D. I'm also curious what kind of trade-off with CPU frequency they'll need to sustain a 2.18GHz GPU clock, or whatever it typically is when not in boost mode. I do like the idea of trading CPU clocks for a higher-clocked GPU, especially considering that even at a lower clock these new CPUs blow last generation's Jaguar CPUs out of the water. It will be interesting to see whether Sony's boost clock is actually utilized or whether it really only exists as a means to inflate their TF number on paper.
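The arithmetic behind that, using the usual FLOPS formula (CUs × 64 ALUs × 2 ops per clock × clock); the downclocked figures are of course speculation at this point:

def tflops(cus: int, clock_ghz: float) -> float:
    # CUs * 64 shader ALUs * 2 ops (FMA) per clock * clock
    return cus * 64 * 2 * clock_ghz / 1000

print(tflops(36, 2.23))  # ~10.28 TF at the PS5's peak clock
print(tflops(36, 2.18))  # ~10.05 TF with a 50 MHz drop, still over 10
print(tflops(36, 2.17))  # ~10.00 TF; around a 60 MHz drop is where it dips below 10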
 
Like Cerny said, just a 2-3% downclock could reduce power by 10%. That's ~50MHz and shouldn't be much greater, so it's still pretty much a 10 TF console through and through, with a sizable 350MHz clock advantage in the worst-case scenario while keeping an impressive 405MHz advantage most of the time.

A 400MHz advantage on fewer CUs means nothing. The best-case scenario for the PS5 is that it's a 10.3 TFLOP machine, which it is not, versus a stable 12.16 TFLOP machine. There are literally no benefits to the PS5's GPU vs the Xbox's. These are objective facts. And that's not to mention RT, where the Xbox blows the PS5 out of the water, meaning more workload for the PS5's GPU to do GI. This is a moot conversation, and I'm not even going to discuss the lower RAM bandwidth.
 
A 400MHz advantage on fewer CUs means nothing. The best-case scenario for the PS5 is that it's a 10.3 TFLOP machine, which it is not, versus a stable 12.16 TFLOP machine. There are literally no benefits to the PS5's GPU vs the Xbox's. These are objective facts. And that's not to mention RT, where the Xbox blows the PS5 out of the water, meaning more workload for the PS5's GPU to do GI. This is a moot conversation, and I'm not even going to discuss the lower RAM bandwidth.
It’s like Vega 56 vs. 64 never happened.

Your use of superlatives is a big hint you’re not interested in the truth.
 
A 400MHz advantage on fewer CUs means nothing. The best-case scenario for the PS5 is that it's a 10.3 TFLOP machine, which it is not, versus a stable 12.16 TFLOP machine. There are literally no benefits to the PS5's GPU vs the Xbox's. These are objective facts. And that's not to mention RT, where the Xbox blows the PS5 out of the water, meaning more workload for the PS5's GPU to do GI. This is a moot conversation, and I'm not even going to discuss the lower RAM bandwidth.

How does the RT of the XSX blow the PS5 out of the water?
 
Not sure about blowing it out of the water, but more CUs => more intersection engines + more bandwidth.

That should result in better RT performance. Not sure what the actual delta is; there is no known information on that.

Yeah I expect there to be a delta, but I expect that delta to be the same as the TF delta as I suspect the RT performance scales with CUs and clocks.

But a ~18ish% differential isn't "blows out of the water" by any stretch of the imagination.
 
Yeah I expect there to be a delta, but I expect that delta to be the same as the TF delta as I suspect the RT performance scales with CUs and clocks.

But a ~18ish% differential isn't "blows out of the water" by any stretch of the imagination.
In theory, the XSX should be able to do 44% more ray intersections in parallel than the PS5. The PS5 will need to catch up using clock speed, but it will be burdened by the lesser bandwidth.
I wouldn't necessarily call that a 15% difference. You're looking at TFs instead of intersections. The compute difference is 15-18%.
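To put numbers on the two views (the working assumptions here are one intersection engine per CU and that the engines run at the GPU core clock, which is questioned further down):

xsx_cus, xsx_clock_ghz = 52, 1.825
ps5_cus, ps5_clock_ghz = 36, 2.23

print(xsx_cus / ps5_cus)                                      # ~1.44 -> 44% more engines working in parallel
print((xsx_cus * xsx_clock_ghz) / (ps5_cus * ps5_clock_ghz))  # ~1.18 -> ~18% more intersections per second once clocks are factored in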
 
In theory, the XSX should be able to do 44% more ray intersections in parallel than the PS5. The PS5 will need to catch up using clock speed, but it will be burdened by the lesser bandwidth.
I wouldn't necessarily call that a 15% difference. You're looking at TFs instead of intersections. The compute difference is 15-18%.

The problem is we don't know whether the intersection engines are tied to the GPU clock or not. I suspect they are, as they reuse the TMU caches, which are tied to the GPU clock.
 
The problem is we don't know whether the intersection engines are tied to the GPU clock or not. I suspect they are, as they reuse the TMU caches, which are tied to the GPU clock.
But if you'll recall, Nvidia also separates its RT performance from its compute performance.

The 2080 Ti is clocked much lower than the 2080 Super, but has a significant performance advantage over it in ray tracing.

2080 Regular:
1515 MHz base / 1710 MHz boost
2944 CUDA cores

2080 Super:
1650 MHz base / 1815 MHz boost
3072 CUDA cores

2080 Ti:
1350 MHz base / 1545 MHz boost
4352 CUDA cores
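Using only the figures above as a crude throughput proxy (CUDA cores × boost clock), the Ti still comes out well ahead despite the lower clock, which is the point being made; it also carries proportionally more RT cores and a wider memory bus:

print(4352 * 1545 / (3072 * 1815))  # ~1.21: 2080 Ti vs 2080 Super on cores * boost clock
print(4352 * 1545 / (2944 * 1710))  # ~1.34: 2080 Ti vs 2080 Regular
# The RT advantage tracks unit count (and bandwidth) rather than clock speed here.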
 
But if you'll recall, Nvidia also separates its RT performance from its compute performance.

The 2080 Ti is clocked much lower than the 2080 Super, but has a significant performance advantage over it in ray tracing.

2080 Regular:
1515 MHz base / 1710 MHz boost

2080 Super:
1650 MHz base / 1815 MHz boost

2080 Ti:
1350 MHz base / 1545 MHz boost

Yeah, but Nvidia's solution is separate from AMD's, and we aren't sure yet whether that is how AMD's solution works. I suspect it may be different.

It's part of the TMU complex, running at the same clocks as the rest of the GPU (the graphics portion; SoC/IF clocks are another matter).

So it should scale linearly with the GPU core clock and CU count then, no?
 
You're assuming that everything is better with slightly faster CUs and lower bandwidth than with more CUs and higher bandwidth. It isn't that simple. It will all depend on what you're doing.
Oh, to be clear, I understand a faster clock alone doesn't mean it's better in all areas; of course more CUs and bandwidth would excel in their own specific areas. Like you said, it's all situation-dependent.
 
Yeah, but Nvidia's solution is separate from AMD's, and we aren't sure yet whether that is how AMD's solution works. I suspect it may be different.



So it should scale linearly with the GPU core clock and CU count then, no?
This question was asked and tested, it would appear!

https://www.eurogamer.net/articles/...3-nvidia-geforce-rtx-2080-super-review?page=2

The biggest factors for pure RT performance would appear to be cores + memory bandwidth.

I don't suspect that they will be that different. Intersection must occur, and both use a BVH.
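For reference, comparing the two consoles on exactly those two axes, using the published peak figures (and treating CU count × clock as a stand-in for intersection throughput, which is the assumption running through this whole exchange):

# XSX: 52 CUs @ 1.825 GHz, 560 GB/s to the 10GB fast pool
# PS5: 36 CUs @ up to 2.23 GHz, 448 GB/s to the full 16GB
print(52 * 1.825 / (36 * 2.23))  # ~1.18 advantage in CU x clock
print(560 / 448)                 # ~1.25 advantage in peak bandwidth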
 