Digital Foundry Article Technical Discussion Archive [2013]

Status
Not open for further replies.
Do you have any evidence to back that up or is it pure assumption?



No, it's a terrible indicator, since it's blatantly 3x bigger than the GPU in the X1. Dave Baumann has already more or less said the reason the dev kits held such GPUs is that these were the first GPUs to market with the GCN architecture, and thus the only option for the dev kits within the desired timeframe.

Don't be selective in your quotes. Dave Baumann also said that if the parts in the X1 behave as expected, it could outperform the discrete cards. I will look up the quote later; I'm on my phone at the moment.
 
Said in response to a question about VGleaks' "14+4" reveal



I don't think your interpretation makes much sense given this pretty clear statement on the subject.

Having said that, I agree that this also doesn't make the PS4 architecture "unbalanced" as that will ultimately be dependent on the workloads it is going to be tasked with in next generation games. Whether those workloads will trend towards a heavy enough use of GPU compute for the PS4 architecture to achieve high utilization of its ALU resources is an open question, though.

Sure it fits. All the power is tied up in the gpu and the 14+4 was seemingly an example of how to reclaim some of that performance for the general purpose side using the various compute enhancements we know about.
 
Don't be selective in your quotes. Dave Baumann also said that if the parts in the X1 behave as expected, it could outperform the discrete cards. I will look up the quote later; I'm on my phone at the moment.

He never said that in relation to the 7970, and any implication that this would be true in relation to the 7790 could, and most likely would, be down to the simple fact that the X1 has far more bandwidth available to it than the 7790 when the ESRAM is used. This isn't some special performance advantage afforded by the specific use of ESRAM, but rather the already well-understood advantage of generally having more memory bandwidth. I'm sure Dave would be the first to admit that in non-bandwidth-limited scenarios the 7790 would have its own performance advantages.
 
You read that completely wrong, fwiw. The MS insider's interpretation of GCN capabilities matched DF's interpretation of Cerny's GPGPU claim wrt GCN.

192 is not a rumor, it's a claim. Roughly the same kind of claim as when MS said that the eDRAM can reach up to 256 GB/s of bandwidth, but only when doing 4xMSAA.

This only makes sense if they are talking about bandwidth with regard to compression. You don't magically get more from MSAA, and the compression is the same on the PS4 as well, so it's a moot point.
 
He never said that in relation to the 7970, and any implication that this would be true in relation to the 7790 could, and most likely would, be down to the simple fact that the X1 has far more bandwidth available to it than the 7790 when the ESRAM is used. This isn't some special performance advantage afforded by the specific use of ESRAM, but rather the already well-understood advantage of generally having more memory bandwidth. I'm sure Dave would be the first to admit that in non-bandwidth-limited scenarios the 7790 would have its own performance advantages.

It feels like MS is aiming for 7790 performance but is really getting 7770 performance (1.28 TF, 72 GB/s bandwidth) plus ESRAM.

I would wager that when the ESRAM is used effectively, the performance of the Xbox One's graphics subsystem will far and away outstrip any of those discrete parts you mention.


There's no mistaking what he wrote here. I remember what he wrote because it was a question I asked that both he and ERP replied to.
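
For reference, the 7770 figures quoted above (1.28 TF, 72 GB/s) fall out of simple peak-throughput arithmetic. The sketch below is only an illustration: the 7770 inputs (10 CUs, 1000 MHz core clock, 128-bit bus, 4.5 Gbps GDDR5) are assumed from public spec sheets and are not stated in this thread, and the function names are made up for the example. It relies on GCN having 64 ALU lanes per CU, each able to issue one fused multiply-add (2 flops) per clock.

```python
# Peak-throughput arithmetic for a GCN-style GPU.
# The HD 7770 inputs used below (10 CUs, 1000 MHz, 128-bit bus, 4.5 Gbps GDDR5)
# are assumptions taken from public spec sheets, not from the thread.

def peak_tflops(compute_units, clock_mhz, lanes_per_cu=64, flops_per_lane=2):
    """Peak single-precision TFLOPS: one FMA (2 flops) per lane per clock."""
    flops_per_second = compute_units * lanes_per_cu * flops_per_lane * clock_mhz * 1e6
    return flops_per_second / 1e12


def bandwidth_gb_s(bus_width_bits, data_rate_gbps):
    """Peak memory bandwidth in GB/s: bus width in bytes times per-pin data rate."""
    return bus_width_bits / 8 * data_rate_gbps


if __name__ == "__main__":
    print(f"7770 peak compute:   {peak_tflops(10, 1000):.2f} TFLOPS")
    print(f"7770 peak bandwidth: {bandwidth_gb_s(128, 4.5):.0f} GB/s")
```

These are theoretical peaks only; they say nothing about how much of either a real workload can actually use.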
 
Yes, and as I said, there's no implication there that 'effective use of the esram' isn't merely a reference to utilising the bandwidth advantage afforded by the ESRAM over the narrow 128-bit GDDR5 bus of the 7790. There's nothing in Dave's statement that says anything to me about ESRAM affording some kind of special performance advantage above the additional bandwidth it offers.

An interesting side note to this discussion though is that with that statement Dave has effectively admitted that the 7790 is a horribly unbalanced design capable of being in his words "far outstripped" in performance by a weaker gpu with more memory bandwidth.

Clearly though that's only going to be the case in bandwidth restricted scenarios. Where computational performance is the limiting factor no amount of esram will put the x1 gpu on par with a 7790.
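
The bandwidth-limited versus compute-limited distinction in this post can be sketched with a simple roofline-style model, where attainable throughput is capped by whichever of peak compute or memory traffic runs out first. Everything numeric here is hypothetical: the two parts below are invented to illustrate the argument and are not measurements or specs of the X1 or the 7790.

```python
# Roofline-style sketch: attainable throughput is the smaller of the compute
# ceiling and the memory ceiling (bandwidth * arithmetic intensity).
# Both GPU descriptions are HYPOTHETICAL, chosen only to illustrate the argument.

WIDE_MEMORY_PART = {"peak_gflops": 1200.0, "bandwidth_gb_s": 150.0}  # less ALU, more bandwidth
MORE_ALU_PART = {"peak_gflops": 1800.0, "bandwidth_gb_s": 96.0}      # more ALU, less bandwidth


def attainable_gflops(peak_gflops, bandwidth_gb_s, flops_per_byte):
    """Upper bound on throughput for a kernel doing `flops_per_byte` work per byte moved."""
    return min(peak_gflops, bandwidth_gb_s * flops_per_byte)


def compare(flops_per_byte):
    """Return (wide-memory part, more-ALU part) attainable GFLOPS at this intensity."""
    return (
        attainable_gflops(flops_per_byte=flops_per_byte, **WIDE_MEMORY_PART),
        attainable_gflops(flops_per_byte=flops_per_byte, **MORE_ALU_PART),
    )


if __name__ == "__main__":
    for intensity in (2.0, 50.0):
        wide, alu = compare(intensity)
        print(f"{intensity:5.1f} flops/byte -> wide-memory: {wide:.0f}, more-ALU: {alu:.0f} GFLOPS")
```

At low arithmetic intensity the part with more bandwidth comes out ahead; at high intensity the part with more ALU does, which is the point being made about non-bandwidth-limited scenarios.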
 
Yes, and as I said, there's no implication there that 'effective use of the esram' isn't merely a reference to utilising the bandwidth advantage afforded by the ESRAM over the narrow 128-bit GDDR5 bus of the 7790. There's nothing in Dave's statement that says anything to me about ESRAM affording some kind of special performance advantage above the additional bandwidth it offers.

An interesting side note to this discussion though is that with that statement Dave has effectively admitted that the 7790 is a horribly unbalanced design capable of being in his words "far outstripped" in performance by a weaker gpu with more memory bandwidth.

Clearly though that's only going to be the case in bandwidth restricted scenarios. Where computational performance is the limiting factor no amount of esram will put the x1 gpu on par with a 7790.

The context of the thread lends itself to Dave's pronouncement. No one said there would be an increase in the computational TFLOP count. That was not the point of the DF article, nor of my references, so I'm not sure what point you are trying to make.
 
Yes, and as I said, there's no implication there that 'effective use of the esram' isn't merely a reference to utilising the bandwidth advantage afforded by the ESRAM over the narrow 128-bit GDDR5 bus of the 7790. There's nothing in Dave's statement that says anything to me about ESRAM affording some kind of special performance advantage above the additional bandwidth it offers.

An interesting side note to this discussion though is that with that statement Dave has effectively admitted that the 7790 is a horribly unbalanced design capable of being in his words "far outstripped" in performance by a weaker gpu with more memory bandwidth.

Clearly though that's only going to be the case in bandwidth restricted scenarios. Where computational performance is the limiting factor no amount of esram will put the x1 gpu on par with a 7790.

The article also doesn't address the massive difference in ROPs (PS4 has 2x the ROPs), nor the difference in cache (PS4 has more cache per unit of work, though not per CU, if you spread a specific load over all the CUs), nor the difference in texturing (PS4 has 1.5x the texture units of the XBONE).

So all we can really conclude from the article is that with only the pure computational benefit, the least performance increase the PS4 is likely to see is 17.6%.

How much does that change when we add in the extra ROPs, texture units and cache? Who knows, but I'd hazard a guess it'd be somewhere closer to 40%.

In my opinion you'd be better off looking at the relative difference between a 7670 and a 7750 than at what the article suggests.
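
One rough way to reason about how the extra ROPs and texture units could combine is an Amdahl-style estimate: weight each unit ratio by the share of frame time that is actually bound by that unit. The 2x ROP and 1.5x texture ratios come from the post above; the frame-time fractions are entirely hypothetical placeholders, so the resulting number is an illustration of the method and not a prediction for either console.

```python
# Amdahl-style estimate of overall speedup from per-unit hardware ratios.
# UNIT_RATIOS come from the post (2x ROPs, 1.5x texture units).
# EXAMPLE_FRACTIONS are HYPOTHETICAL shares of frame time bound by each unit.

UNIT_RATIOS = {"rop": 2.0, "texture": 1.5}
EXAMPLE_FRACTIONS = {"rop": 0.2, "texture": 0.3, "other": 0.5}


def overall_speedup(time_fractions, unit_ratios):
    """Speedup when each stage's share of frame time shrinks by its unit ratio.

    Stages missing from `unit_ratios` are assumed to see no improvement.
    """
    if abs(sum(time_fractions.values()) - 1.0) > 1e-9:
        raise ValueError("time fractions must sum to 1")
    new_time = sum(
        fraction / unit_ratios.get(stage, 1.0)
        for stage, fraction in time_fractions.items()
    )
    return 1.0 / new_time


if __name__ == "__main__":
    speedup = overall_speedup(EXAMPLE_FRACTIONS, UNIT_RATIOS)
    print(f"Estimated overall speedup: {speedup:.2f}x")
```

The takeaway is only that a 2x unit count translates into far less than 2x overall unless most of the frame is bound by that unit, so any headline percentage depends heavily on the workload mix.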
 
Sure it fits. All the power is tied up in the gpu and the 14+4 was seemingly an example of how to reclaim some of that performance for the general purpose side using the various compute enhancements we know about.

Definitely, but he actually said that an incentivisation to use GPGPU is that "it has a little bit more ALU in it than it would if you were thinking strictly about graphics."
 
Definitely, but he actually said that an incentivisation to use GPGPU is that "it has a little bit more ALU in it than it would if you were thinking strictly about graphics."

Maybe he's talking about using it to do compute whilst the graphics subsystems are doing all the work (I remember reading this before). That would effectively give you more ALU than if you were using them only for graphics, and would do it for effectively nothing as well.
 
Yes, and as I said, there's no implication there that 'effective use of the esram' isn't merely a reference to utilising the bandwidth advantage afforded by the ESRAM over the narrow 128-bit GDDR5 bus of the 7790. There's nothing in Dave's statement that says anything to me about ESRAM affording some kind of special performance advantage above the additional bandwidth it offers.

An interesting side note to this discussion though is that with that statement Dave has effectively admitted that the 7790 is a horribly unbalanced design capable of being in his words "far outstripped" in performance by a weaker gpu with more memory bandwidth.

Clearly though that's only going to be the case in bandwidth restricted scenarios. Where computational performance is the limiting factor no amount of esram will put the x1 gpu on par with a 7790.
You've *successfully* mapped all of the possible performance numbers into a couple of them at most.

Reducing the theoretical performance to two factors, bandwidth and flops, is certainly wrong in my eyes since I think it's a pretty narrow sighted view of the actual numbers.

Why is Xbox One's GPU weak? :???: There is nothing weak about it, it's going to be utilized almost fully, I am sure. Do you think Gran Turismo 4 could run at 60 fps on the original Xbox like it did on the PS2? I'd say no.

Could the original Xbox run F-Zero GX at 60 fps like on the GameCube? I don't think so! :eek: Yet the original Xbox was the most powerful console of its generation.

After reading sebbbi's posts about flops and the scratchpad memory, and having watched the Xbox One games in action all I can say is that the console is a monster, performance wise.

I think of the Xbox One as a finely tuned sports car.

Another question: if bandwidth alone is so important, what's the point of having 32MB of eSRAM instead of using a 256- or 512-bit bus, even if they used DDR3 memory? ;)

As for DF article, no matter the modifications they made to the hardware in order to "emulate" consoles, I think they are impossible to emulate. It was an interesting read nonetheless, as usual, it's just that it's not ex cathedra.
 
The article also doesn't address the massive difference in ROPs (PS4 has 2x the ROPs), nor the difference in cache (PS4 has more cache per unit of work, though not per CU, if you spread a specific load over all the CUs), nor the difference in texturing (PS4 has 1.5x the texture units of the XBONE).

So all we can really conclude from the article is that with only the pure computational benefit, the least performance increase the PS4 is likely to see is 17.6%.

How much does that change when we add in the extra ROPs, texture units and cache? Who knows, but I'd hazard a guess it'd be somewhere closer to 40%.

In my opinion you'd be better off looking at the relative difference between a 7670 and a 7750 than at what the article suggests.
The article hasn't been made to solve the mysteries of these consoles nor to tell us the whole truth. It's entertaining to read and that's it.

Richard is not saying that those tests are essential to have an accurate idea of the actual performance you can get from both consoles. There is nothing set in stone for now.

If the leaked specs of the upcoming Xbox are true, it looks like Sony is going to have the edge in graphics performance this generation. The CPUs sound pretty similar though.

Even so, I wonder why the GPU of the PS4 is going to feature 32 ROPs instead of 24 ROPs, for instance. I don't see how they are going to fully utilise those 32 ROPs.
 
Maybe he's talking about using it to do compute whilst the graphics subsystems are doing all the work (I remember reading this before). That would effectively give you more ALU than if you were using them only for graphics, and would do it for effectively nothing as well.
Maybe, but the question was about the seemingly inexplicable 14+4 recommendation.
 
Even so, I wonder why the GPU of the PS4 is going to feature 32 ROPs instead of 24 ROPs, for instance. I don't see how they are going to fully utilise those 32 ROPs.
I understand very little about GCN so I could be wrong, but I remember this was discussed a few months ago. Considering the 7850 (16 CUs) and 7870 (20 CUs) both have 32 ROPs, it seems normal that the PS4 is right in line with its 256-bit bus, 18 CUs and 32 ROPs. ROPs are tied to the memory controllers, so they can't pick any number they like (which they can with CUs). It's either 4, 8, 16, or 32 ROPs; there's no in-between.

But the presence of esram in the xbox one makes it impossible to try any comparison of memory controller, it's probably heavily modified :D
 
But you only mention the 7790 and 7770, not the 7970. I don't see how you think he's saying the Xbox One is faster than a 7970 based on you talking about a 7790 or 7770 :?:

In the thread I cited, I spoke to the 7770 and 7790 in terms of performance targets from my perspective. Dave Baumann appears to have said that a properly used ESRAM will far and away outperform those discrete parts.

I raised the 7970 point as a target performance by MS as it was included in the dev kits. I'm not saying it does or doesn't match that. I have no idea, other than Dave saying it was the first part available using the GCN architecture that could be placed in a devkit.

The DF article doesn't know the actual capabilities of the Xbox One because it's the only GCN architecture anywhere with onboard ESRAM. How can you mimic that?
 
In the thread I cited, I spoke to the 7770 and 7790 in terms of performance targets from my perspective. Dave Baumann appears to have said that a properly used ESRAM will far and away outperform those discrete parts.

I raised the 7970 point as a target performance by MS as it was included in the dev kits. I'm not saying it does or doesn't match that. I have no idea, other than Dave saying it was the first part available using the GCN architecture that could be placed in a devkit.

The DF article doesn't know the actual capabilities of the Xbox One because it's the only GCN architecture anywhere with onboard ESRAM. How can you mimic that?

If simply having a large amount of eSRAM on die was the solution to more than doubling your performance, then GPUs would have a damn sight more cache and you'd probably see a GPU with it already.
 
If simply having a large amount of eSRAM on die was the solution to more than doubling your performance, then GPUs would have a damn sight more cache and you'd probably see a GPU with it already.

Like the Intel Haswell-E?
 