Digital Foundry Article Technical Discussion Archive [2013]

Sony didn't deny the actual numbers though, while taking time to deny/explain some side issues.

Essentially confirmation imo.

Yup. Sony went out of their way to correct minutiae of that article and provide context, and in doing so they had every opportunity to correct the allotment claims being made. The fact that they got into detailed corrections and still didn't correct the allotment speaks volumes.

Additionally, Jon Blow pretty much confirmed it long beforehand. So there's that.
 
Do you have something from AMD explaining it like that?

The link is posted. However, Extremetech and others state 11 CUs with 4 SIMDs per CU. I believe that is the more correct reading, as 11 SIMDs in a single CU would be odd. I also recall seeing a diagram showing how the CU was broken out into 4 SIMD modules, each further broken down into smaller sub-components.

Regardless, I had said that their new engineering most likely accounts for the better output of their card, as opposed to just adding more and more units for diminishing returns. It seems they rearchitected the core aspects. Besides, you get more performance from more ROPs, etc. when the bandwidth to the on-board GDDR5 is rated at 380+ GB/s, which neither the X1 nor the PS4 comes close to.
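
For reference, the throughput figures being thrown around here come straight out of the unit counts and clocks. Below is a minimal Python sketch of the arithmetic; the per-CU layout (4 SIMD-16 units, 64 lanes, 2 FLOPs per lane per clock via FMA) is the standard GCN arrangement, and the console clock/bus figures in the comments are the commonly cited ones, not numbers taken from this thread's article.

```python
# Back-of-the-envelope GPU throughput figures for GCN-style parts.
# Assumes the standard GCN layout: 4 SIMD-16 units per CU = 64 lanes,
# and 2 FLOPs per lane per clock (fused multiply-add).

def gcn_tflops(cus, clock_ghz, simds_per_cu=4, lanes_per_simd=16, flops_per_lane=2):
    """Peak single-precision TFLOPS for a GCN-style GPU."""
    return cus * simds_per_cu * lanes_per_simd * flops_per_lane * clock_ghz / 1000.0

def mem_bandwidth_gbs(bus_width_bits, data_rate_gbps):
    """Peak memory bandwidth in GB/s: bus width (bits) x per-pin data rate (Gbps) / 8."""
    return bus_width_bits * data_rate_gbps / 8.0

# Commonly cited console figures (assumed here, not taken from the interview):
print(gcn_tflops(18, 0.800))          # PS4 GPU:       ~1.84 TFLOPS
print(gcn_tflops(12, 0.853))          # Xbox One GPU:  ~1.31 TFLOPS
print(mem_bandwidth_gbs(256, 5.5))    # PS4 GDDR5:      176 GB/s
print(mem_bandwidth_gbs(256, 2.133))  # Xbox One DDR3: ~68 GB/s
# A 384-bit GDDR5 card running at 6 Gbps lands around 288 GB/s, which is why
# high-end discrete parts can keep far more ROPs fed than either console.
print(mem_bandwidth_gbs(384, 6.0))
```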
 
The interview, in truth, gives us a lot of info... if we read it carefully, almost everything is there.

First of all, the confirmation that the next-gen platforms will be CPU- and bandwidth-bound.

These are the two key points of the two next-gen platforms, and each manufacturer has made specific and different decisions in these two key areas.

Bandwidth is the main factor limiting the usage and efficiency of the CUs and ROPs.

The CPU seems to be the source of many concerns for MS; out of this they created SHAPE, the move engines, the up-clock, the audio block, etc...
I "assume", though it is easy to now, that Sony's answer to the same concern is GPGPU, hence the additional CUs and ACEs...

We have it all now.
 
I would say anytime they speak about the details of the PS4.

You're making accusations without any actual counter-arguments to support them. What *specifically* was PR/FUD regarding their comments? If I wanted to see drive-by X1 trolling I'd go to GAF. I expect more from posters here, so please do lay out your claims/accusations with some quotes/retorts when ya get a chance. :)
 
Those hard numbers should be interpreted based on their respective architectures and use cases. I am pretty curious about the outcome between PS3 and 4, especially if they port a title between the two.

EDIT:
I skimmed through the article quickly...

The interview, in truth, gives us a lot of info... if we read it carefully, almost everything is there.

First of all, the confirmation that the next-gen platforms will be CPU- and bandwidth-bound.

These are the two key points of the two next-gen platforms, and each manufacturer has made specific and different decisions in these two key areas.

Bandwidth is the main factor limiting the usage and efficiency of the CUs and ROPs.

The CPU seems to be the source of many concerns for MS; out of this they created SHAPE, the move engines, the up-clock, the audio block, etc...
I "assume", though it is easy to now, that Sony's answer to the same concern is GPGPU, hence the additional CUs and ACEs...

We have it all now.

Yes and no, it's probably insufficient just looking at the numbers.

MS engineers' comments should be taken assuming Xbox One's architecture. They may not apply to another different architecture and programming model like PS3 or PS4. e.g., In their assessment, CPU may be a more dominant bottleneck because of the way they design and use Xbox One h/w.

If the programming model is different, e.g., if the GPU is given more autonomy and power, and they can pipe data from one GPU subsystem to another quickly using more ACEs without involving the CPU (much, if at all), the run-time behavior may be different.

If the programmer chooses to ignore the architectural differences and programs both like a regular PC, then again the run-time behavior would be different.
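
To make the programming-model point concrete, here is a deliberately crude toy model (all costs are invented and the queue behaviour is simplified) contrasting a pipeline where the CPU submits and fences every GPU pass with one where the passes are chained on GPU compute queues and the CPU only gets involved once:

```python
# Toy model of CPU involvement in a multi-pass GPU job.
# All timings are invented illustrative numbers, not measured hardware figures.

N_PASSES = 8          # GPU passes that each consume the previous pass's output
SUBMIT_COST_US = 50   # assumed CPU cost to build and submit a command batch
SYNC_COST_US = 100    # assumed CPU cost to wait on a fence / completion

def cpu_orchestrated(passes):
    """PC-style loop: the CPU submits each pass and syncs before issuing the next."""
    return passes * (SUBMIT_COST_US + SYNC_COST_US)

def gpu_chained(passes):
    """Passes chained across GPU compute queues: the CPU submits once and syncs once."""
    return SUBMIT_COST_US + SYNC_COST_US

print("CPU time, CPU-orchestrated:", cpu_orchestrated(N_PASSES), "us")
print("CPU time, GPU-chained:     ", gpu_chained(N_PASSES), "us")
# The GPU does the same work either way; what changes is how much CPU time
# (and inter-pass latency) the programming model burns per frame.
```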
 
It's been said multiple times and some of you have ignored it every single time, so please reconsider the way you interact on the Beyond3D forums. Unless you want a mini-vacation away from the forums too. Calling out the source in technical articles is not the B3D way.
 
MS engineers' comments should be taken assuming Xbox One's architecture. They may not apply to another different architecture and programming model like PS3 or PS4.

Finally someone who isn't a complete and total idiot. Thank you patsu! The rest of you really should learn from this and stop ignoring common sense or reading way too much into comments.
 
Furthermore, we still aren't doing platform comparisons yet! With any luck, DF will land a similar interview with Sony and then we can compare these low-level features without either side having to guess what their competitor is doing.
 
I was wondering if anyone with a proper understanding of this article and the ramifications would like to join us on a podcast tonight to help discuss it? None of us are very tech-inclined and could definitely use some insight on discussing the topic.

We would be very grateful.
 
What I got from the article:
If there were even the slightest latency advantage to the ESRAM, they would have focused on it for several paragraphs.

So all in all it's a good read, although honestly there is nothing in the article that I didn't already know.
 
aaaah, so that's what happened to my post!
I thought I felt the vacation hammer flying past my head.
 
Yes and no, it's probably insufficient just looking at the numbers.

MS engineers' comments should be taken assuming Xbox One's architecture. They may not apply to another different architecture and programming model like PS3 or PS4. e.g., In their assessment, CPU may be a more dominant bottleneck because of the way they design and use Xbox One h/w.

If the programming model is different, e.g., if the GPU is given more autonomy and power, and they can pipe data from one GPU subsystem to another quickly using more ACEs without involving the CPU (much, if at all), the run-time behavior may be different.

If the programmer chooses to ignore the architectural differences and programs both like a regular PC, then again the run-time behavior would be different.

This is exactly what I said. They both have the same CPU.
Sony's CPU concerns led them to bet on GPGPU, hence the additional CUs and ACEs.
MS went with "more" CPU, which brings the audio block, the move engines, the up-clock, etc. to the table...
Two different solutions to the same problem.
It is impossible to say now which one is right, or whether one solution is really better than the other...
Only time will tell, and I suspect that in the future we will talk a lot about it.

The other issue, being bandwidth-bound, I suspect will be very similar and will have a similar impact on the CU and ROP usage of the two systems.
 
Low latency was emphasised in the leaked docs. The lower the better, no?
What leaked docs? Better doesn't necessarily mean significant, and without a figure we can't know if it's significant or not. I'm still looking for some kind of official info which claims a significant advantage from ESRAM latency. I'm not saying it's not there, but the data certainly hasn't shown up yet.

From the article, about the GPU latency:
"Nick Baker: You're right. GPUs are less latency sensitive. We've not really made any statements about latency."

About the CPU being able to access ESRAM:
"Nick Baker: We do but it's very slow"
 
What leaked docs? Better doesn't necessarily mean significant, and without a figure we can't know if it's significant or not.

VG Leaks had this blurb.

The advantages of ESRAM are lower latency and lack of contention from other memory clients—for instance the CPU, I/O, and display output. Low latency is particularly important for sustaining peak performance of the color blocks (CBs) and depth blocks (DBs).

I'm thinking since access to the ESRAM is through the cache hierarchy, even if the latency is better than DDR3, it's probably on the same order of magnitude.
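
A quick back-of-the-envelope version of that "same order of magnitude" argument, using invented cycle counts purely for illustration: if most of the round trip is spent traversing the cache/coherency path rather than in the memory array itself, a much faster array only trims a fraction off the total.

```python
# Illustrative latency breakdown with invented cycle counts.
# The point: if the cache/coherency traversal dominates the round trip,
# a faster memory array cannot change the total by an order of magnitude.

HIERARCHY_CYCLES = 150    # request travels out through caches/crossbar and back (assumed)
DDR3_ARRAY_CYCLES = 100   # time spent in the external DRAM itself (assumed)
ESRAM_ARRAY_CYCLES = 30   # assume the on-die array is considerably quicker

ddr3_total = HIERARCHY_CYCLES + DDR3_ARRAY_CYCLES
esram_total = HIERARCHY_CYCLES + ESRAM_ARRAY_CYCLES

print("DDR3 round trip: ", ddr3_total, "cycles")
print("ESRAM round trip:", esram_total, "cycles")
print("Improvement: %.2fx" % (ddr3_total / esram_total))  # ~1.4x, same order of magnitude
```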
 
I'm thinking since access to the ESRAM is through the cache hierarchy, even if the latency is better than DDR3, it's probably on the same order of magnitude.
For the CUs and the CPU that makes a lot of sense, but do the ROPs usually go through the cache hierarchy? I thought they were invariably closely coupled to the memory controller, so I'm thinking maybe the lower latency helps the ROPs reach their max bandwidth. That would explain why they said that 16 ROPs are more than enough, because they would statistically have fewer wait cycles, and it would also explain the dev rumours which said the Xbone is at its best when there are a lot of writes.

EDIT: Forget it, I was looking at the ROPs bypassing the L2, but obviously they still have their own caches (Z$ and C$). Still, maybe they tweaked that?
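
Rough numbers behind the "bandwidth feeds the ROPs" idea (the clock is the commonly cited X1 GPU clock and the 32bpp pixel format is an assumption, so treat this as illustration only):

```python
# Rough ROP fill-rate vs. bandwidth arithmetic. 32bpp colour is an assumption;
# the 853 MHz clock is the commonly cited X1 GPU clock, not a figure from this post.

def fill_rate_gpix(rops, clock_ghz):
    """Peak pixels written per second, in Gpixels/s."""
    return rops * clock_ghz

def colour_traffic_gbs(gpix_per_s, bytes_per_pixel=4, blended=False):
    """Bandwidth needed to sustain that fill rate; blending also reads the target."""
    factor = 2 if blended else 1
    return gpix_per_s * bytes_per_pixel * factor

gpix = fill_rate_gpix(16, 0.853)               # ~13.6 Gpixels/s
print(colour_traffic_gbs(gpix))                # ~55 GB/s of pure colour writes
print(colour_traffic_gbs(gpix, blended=True))  # ~109 GB/s with alpha blending
# Add depth/stencil traffic on top and 16 ROPs can already saturate the kind of
# bandwidth these consoles have, which is the sense in which more ROPs alone
# would not buy much without more bandwidth behind them.
```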
 
This is exactly what I said. They both have the same CPU.
Sony's CPU concerns led them to bet on GPGPU, hence the additional CUs and ACEs.
MS went with "more" CPU, which brings the audio block, the move engines, the up-clock, etc. to the table...
Two different solutions to the same problem.
It is impossible to say now which one is right, or whether one solution is really better than the other...
Only time will tell, and I suspect that in the future we will talk a lot about it.

The other issue, being bandwidth-bound, I suspect will be very similar and will have a similar impact on the CU and ROP usage of the two systems.

You stated:

First of all, the confirmation that the next-gen platforms will be CPU- and bandwidth-bound.

I am just saying that's not necessarily true. ^_^

I don't know if both systems will be bandwidth bound in similar ways too. They are very different.
 
Digital Foundry: And you have CPU read access to the ESRAM, right? This wasn't available on Xbox 360 eDRAM.
Nick Baker: We do but it's very slow.
Digital Foundry: There's been some discussion online about low-latency memory access on ESRAM. My understanding of graphics technology is that you forego latency and you go wide, you parallelise over however many compute units are available. Does low latency here materially affect GPU performance?
Nick Baker: You're right. GPUs are less latency sensitive. We've not really made any statements about latency.

Based on this part of the statement, I wouldn't put too much faith in ESRAM for compute/latency in the future.

Slow for the CPU, not latency sensitive for the GPU.
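
For what it's worth, the "GPUs are less latency sensitive" remark is usually explained by latency hiding through occupancy. A toy calculation with invented figures (only the 10-wavefronts-per-SIMD limit is a real GCN characteristic):

```python
# Toy latency-hiding calculation; the latency and ALU figures are invented,
# only the 10-wavefronts-per-SIMD limit is a real GCN characteristic.

MEM_LATENCY_CYCLES = 300    # assumed round trip to memory
ALU_CYCLES_PER_WAVE = 40    # assumed ALU work each wavefront has between loads

# While one wavefront waits on memory, the SIMD keeps executing the others.
# Wavefronts needed to completely cover the stall (ceiling division):
waves_needed = -(-MEM_LATENCY_CYCLES // ALU_CYCLES_PER_WAVE)
print("Wavefronts needed to hide the latency:", waves_needed)  # 8

# A GCN SIMD can hold up to 10 wavefronts in flight, so with enough occupancy
# the stall mostly disappears behind other work -- which is why "go wide"
# matters more to the GPU than shaving cycles off each individual access.
```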
 