Understanding XB1's internal memory bandwidth *spawn

No, I already addressed both of those in a previous post. And it seems to me that it's you that's applying a specific interpretation to both of those sources to suit your world view. An interpretation which you state as fact but is at best debatable.

The validity of dev docs is now debatable? But your interpretation of silence is what...more valid than the dev docs now?

I'm interpreting info directly from dev docs as support for my position that MS is NOT ruling out the eSRAM's latency as a boon for performance in some facet. Evidently they told devs it was useful for the CB/DB's. You, on the other hand, want to stretch Baker's comment and then lean entirely on how you *think* Goossen should have jumped in all while ignoring the info we do have from MS's dev docs.

You've made your view quite clear Astro. It clearly isn't shared by everyone and is certainly not as factual as you try to imply it is. So why don't we just agree to disagree and move on.
I'm happy enough making my points with evidence behind me. Nobody is forcing you to agree or even to respond if ya don't feel like it. No need to get defensive.

It's a little more than odd and while not in itself proof that there are no major latency advantages, it certainly raises serious doubts about that argument.
It wasn't nearly cut and dried enough for that. You'd need an argument to undermine the dev docs. Your interpretation of Baker's reaction, which is likely out of context and being digested in a bubble, isn't anywhere near enough to justify ignoring the dev docs. This interview doesn't alter what they had been telling devs.

Regarding your other two sources, as mentioned in earlier posts, one is a generic statement...
...those sources which inconveniently kill your argument. Detailed metrics or not, if we are taking MS as the definitive source here those dev docs carry far more weight than your speculation as to how Goossen should have responded via interruption. Again, I agree with your speculation, but it's far too flimsy to undermine the dev docs and their explicitly clear statement on the matter.

...which may or may not have been VGL's own speculation, and the other could quite easily have been a reference to the GPU's natural latency hiding qualities (your factually stated opinion to the contrary notwithstanding).
If your argument is now premised on the thought that VGLeaks made it up or misunderstood it, that's fine, but you need to make that case with some evidence. bkilian noted a while back that it seemed they were taking the dev docs verbatim and transcribing them to avoid copyright pulldowns. From your perspective I'd recommend asking him whether VGLeaks was being creative with their article on the matter. Could help your argument and give mine pause as a result.

It seems to be quite clearly talking about the advantages of an APU with shared memory and not specifically low latency esram.
So he is speaking of the DDR3 as the low latency memory? I mean, that's off-chip memory so what does he mean when he speaks of getting everything as close to memory as possible via an APU design? To me that sounds more like a reference to eSRAM, even though he notes the CPU too which would be weird. I may be wrong. I posted it for others to opine on it as everyone seems to want to focus entirely on the single line response from Baker.

Saying that latency was one of the design aspects they considered is hardly proof or even particularly useful evidence of the esram having game changing low latency.
Strawman much? "Game changing" low latency? I'm not even settled on which kind of latency those quotes refer to and you want to assert that I'm making highly specific claims based on them with that kind of hyperbole? Get real. Stop being defensive. I included those quotes to note other instances in the interview where the engineers noted latency in an unknown context. The latency in the eSRAM was always for fetch requests, which Goossen directly brought up as a boon for Exemplar.

Dev docs note color/depth blocks. Goossen notes Exemplar. These are specifics and you've yet to actually combat these points with any rigor at all. You've just whined about missing details as if that somehow invalidates MS's own expressed views on the matter, and clung to what you think someone should have said based on an unclear comment from Baker.

I'm fine with including all the evidence/info we have and trying to incorporate all of it into a speculative conclusion until we get more information...but let's not pretend that your argument on what he didn't say is somehow more valid than what MS told devs and what Goossen cited directly. Not all evidence is created equal.

Latency is a problem that must be addressed in almost every aspect of the system and it's no surprise that they would have worked to keep it to a minimum in every area. It doesn't specifically mean that they implemented a very low latency embedded memory pool.
I'm open to hearing your speculation as to the context of the diction there. That's why I posted it.
 
I'm interpreting info directly from dev docs as support for my position that MS is NOT ruling out the eSRAM's latency as a boon for performance in some facet.

They recommend reading from DRAM and writing to eSRAM, explicitly for latency benefits.
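
To illustrate what that guideline amounts to in practice, here's a minimal C sketch; the pool enum and allocator are made-up stand-ins for whatever the real SDK exposes, not actual XDK calls:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical pools and allocator, used only to show the placement rule. */
typedef enum { POOL_DRAM, POOL_ESRAM } MemPool;

static void *gpu_alloc(MemPool pool, size_t bytes)
{
    printf("alloc %9zu bytes in %s\n", bytes,
           pool == POOL_ESRAM ? "eSRAM" : "DRAM");
    return malloc(bytes);
}

int main(void)
{
    /* Read-mostly data (textures, vertex buffers) stays in the big DDR3 pool:
       the GPU hides read latency well and DRAM has the capacity. */
    void *textures = gpu_alloc(POOL_DRAM, (size_t)256 * 1024 * 1024);

    /* Write-heavy targets (the color/depth buffers the ROPs hit every frame)
       go in the 32 MB eSRAM, per the "read from DRAM, write to eSRAM" guideline. */
    void *color_rt = gpu_alloc(POOL_ESRAM, (size_t)1920 * 1080 * 4);
    void *depth_rt = gpu_alloc(POOL_ESRAM, (size_t)1920 * 1080 * 4);

    free(textures);
    free(color_rt);
    free(depth_rt);
    return 0;
}
```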
 
Maybe we can look at Apple's PowerVR Series 6 in the new iPhone 5S.

Lower triangle throughput than the previous iPhone but double the in-game performance. They do more "balancing": increased shader performance and increased rasterization performance, but half the triangle throughput.
 
I'm interpreting info directly from dev docs as support for my position that MS is NOT ruling out the eSRAM's latency as a boon for performance in some facet. Evidently they told devs it was useful for the CB/DB's.

To me that sounds more like a reference to eSRAM, even though he notes the CPU too which would be weird. I may be wrong. I posted it for others to opine on it as everyone seems to want to focus entirely on the single line response from Baker.
Boyd Multerer would agree with you. See below.
Look at the wording Boyd uses very carefully: 'Dataflow', 'right data in right cache at right time', 'makes all the difference in the world'. That sounds like not just bandwidth but low latency being extremely important for the architecture.
[Image: xbox_sram_cache2.jpg]
 
Evidently they told devs it was useful for the CB/DB's. You, on the other hand, want to stretch Baker's comment and then lean entirely on how you *think* Goossen should have jumped in all while ignoring the info we do have from MS's dev docs.
MS basically retracted that statement in their interview with DF, and for good reason. All evidence (that means every fillrate test out there) will show you that latency is almost negligible for ROP operations; bandwidth is paramount. There are better examples or scenarios where the latency may make a difference. This isn't one of them.
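
A quick back-of-the-envelope in C shows why: assuming the commonly cited 16 ROPs at 853 MHz and 32-bit colour, the ROPs can demand the entire one-way eSRAM bandwidth on their own once blending is involved, so they bottleneck on bandwidth long before latency enters the picture:

```c
#include <stdio.h>

int main(void)
{
    const double rops = 16.0;            /* commonly cited ROP count */
    const double clock_hz = 853e6;       /* GPU clock */
    const double bytes_per_pixel = 4.0;  /* 32-bit colour */

    double write_only = rops * clock_hz * bytes_per_pixel / 1e9; /* GB/s */
    double with_blend = write_only * 2.0; /* blending reads the destination too */

    printf("opaque 32bpp fill  : %.1f GB/s\n", write_only); /* ~54.6  */
    printf("alpha-blended fill : %.1f GB/s\n", with_blend); /* ~109.2 */
    return 0;
}
```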
 
Maybe we can look at Apple's PowerVR Series 6 in the new iPhone 5S.

Lower triangle throughput than the previous iPhone but double the in-game performance. They do more "balancing": increased shader performance and increased rasterization performance, but half the triangle throughput.

Probably just means iOS games were never triangle bound; the geometry in iOS games always looks quite low/hideous. Anyway, it's tricky to draw parallels to the console side. The big wildcard here is that even if the esram can "average" 150GB/s, that doesn't necessarily mean anything. Compared to gddr5 on the other platform: regardless of the game scenario, situation and circumstances, gddr5 will be able to deliver a fixed, predictable amount of bandwidth every time because it's not dependent on circumstances. On the other hand, if the stars don't align we don't know if the esram will be able to deliver anywhere near the stated average. So while they may match gddr5 bandwidth "much of the time", there can still be situations where they end up far from average depending on circumstance. We'll have to see how things pan out in actual games.
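
For reference, here are the raw peak figures being compared, using the commonly published specs (256-bit GDDR5 at 5.5 Gbps on the other platform, 256-bit DDR3-2133 here, plus the eSRAM numbers from the DF interview); treat it as a sketch of the published numbers, not a measurement:

```c
#include <stdio.h>

int main(void)
{
    /* 256-bit bus = 32 bytes per transfer. */
    double gddr5_peak = 32.0 * 5.5e9 / 1e9;    /* GDDR5 @ 5.5 Gbps -> 176 GB/s */
    double ddr3_peak  = 32.0 * 2.133e9 / 1e9;  /* DDR3-2133        -> ~68 GB/s */

    printf("GDDR5 peak      : %.0f GB/s (one pool, same figure every time)\n", gddr5_peak);
    printf("DDR3 peak       : %.0f GB/s\n", ddr3_peak);
    printf("eSRAM one-way   : 109 GB/s (guaranteed single direction)\n");
    printf("eSRAM peak r+w  : 204 GB/s (only with the right read/write mix)\n");
    return 0;
}
```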
 
and the spin continues, as if any DRAM's peak bw is sustainable w/o things aligned perfectly and not "depending on circumstances" ;-)
 
As if gddr is free from conditions necessary to get near maximum speeds, like bursting, etc. Right.

Esram will take more developer thought to ensure good utilization on a macro scale; with that, I agree. But it isn't true that on a micro, hardware level there are these mysterious technical constraints, absent in other solutions, that will make it more difficult to achieve anything like maximum speeds. Quite the contrary.
 
and the spin continues, as if any DRAM's peak bw is sustainable w/o things aligned perfectly and not "depending on circumstances" ;-)

It would seem it's far easier to achieve than the eSRAM's peak. It's not spin, it's merely a discussion of how easy it is to utilise and in what cases that happens. I don't see how someone can see that as spin.
 
Applying this logic to the PS4 leads us to abandon absolutely all known info about these machines without prejudice. Congrats, you've taken us back to the stone age of 2012.

Not so, but don't take Cerny's word as gospel either. He discounted eSRAM with 1TB/s of bandwidth; is he correct or is he just talking up his team? Just don't be naive and think they are not just telling you half the truth at any given time.
 
and the spin continues, as if any DRAM's peak bw is sustainable w/o things aligned perfectly and not "depending on circumstances" ;-)

I meant coding circumstances, of course. Given the unpredictable nature of games it's not always possible to have the optimal circumstances required to max out hardware. Think of last gen's unified shaders vs separate vertex/pixel shaders, where the latter were much more likely to be bottlenecked.
 
Not so, but don't take Cerny's word as gospel either. He discounted eSRAM with 1TB/s of bandwidth; is he correct or is he just talking up his team? Just don't be naive and think they are not just telling you half the truth at any given time.

What would a PS4 GPU do with 1TB/s bandwidth? There were other reasons Cerny et al didn't go with embedded RAM, but it wasn't due to wanting the best performance possible in terms of distant hardware potential. It was due to them wanting simplicity. He never suggested that decision was due to anything other than simplicity. He admitted that PS4 wasn't built to target a new gen of graphics tech.

These guys aren't trained PR speakers. They are engineers who are excited to dig into the nuances of a very deliberate, complex set of solutions for their platform's design. It's not PR. Nothing about the DF interview is even remotely suggestive of it being half truths. They went out of their way to very deliberately be as transparent as we could've ever hoped for. They had multiple chances to (fairly) lay out figures that weren't realistic. Instead they dove right into the detailed calculations and gave us real world figures. That's gotta be a first for a console manufacturer.

A half truth would be something like saying they can get up to 204 GB/s on the eSRAM without explaining that it's a peak figure that games won't actually get to. They did way better than simply avoiding that double speak, they outright owned it and proudly dissected real world figures. Just one example of several I could list from the article.
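
For anyone following the arithmetic, here's a rough reconstruction of where that peak figure comes from, assuming 853 MHz, a 128-byte path per direction, and a write slot landing on 7 of every 8 cycles alongside reads; the cycle accounting is my paraphrase of the interview, not a quote:

```c
#include <stdio.h>

int main(void)
{
    double one_way = 853e6 * 128.0 / 1e9;         /* ~109 GB/s guaranteed minimum   */
    double peak_rw = one_way * (1.0 + 7.0 / 8.0); /* ~205 GB/s, needs mixed traffic */

    printf("one direction   : %.1f GB/s\n", one_way);
    printf("peak read+write : %.1f GB/s\n", peak_rw);
    printf("DF 'real game code' figure: 140-150 GB/s (quoted, not derived)\n");
    return 0;
}
```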




Betanumerical said:
It would seem it's far easier to achieve than the eSRAM's peak. It's not spin, it's merely a discussion of how easy it is to utilise and in what cases that happens. I don't see how someone can see that as spin.

Can't be that challenging if MS's first parties figured out how to tap into that extra bandwidth in a couple of months near the tail end of their dev cycle for the launch portfolio. Real world game code is apparently netting them around 200 GB/s of bandwidth, so clearly it's not such a challenge as to prevent good utilization of the memory subsystem on competitive grounds.
 
Can't be that challenging if MS's first parties figured out how to tap into that extra bandwidth in a couple of months near the tail end of their dev cycle for the launch portfolio. Real world game code is apparently netting them around 200 GB/s of bandwidth, so clearly it's not such a challenge as to prevent good utilization of the memory subsystem on competitive grounds.

It'd be interesting to see what scenarios this happens in, as you would have been reading 91GB/s or more as well as writing 109GB/s, or some combination of the two. Maybe some actual developers can tell us when scenarios like this are likely to occur.
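
To make that concrete, a few hypothetical read/write mixes that land around 200 GB/s, capping each direction at the ~109 GB/s one-way limit (the splits themselves are invented for illustration):

```c
#include <stdio.h>

static void mix(double read_gbps, double write_gbps)
{
    printf("read %5.1f + write %5.1f = %5.1f GB/s\n",
           read_gbps, write_gbps, read_gbps + write_gbps);
}

int main(void)
{
    mix(109.0, 91.0); /* read side saturated, heavy blending writes        */
    mix(91.0, 109.0); /* write side saturated instead                      */
    mix(109.0, 95.6); /* ~205 GB/s if a write lands on 7 of every 8 cycles */
    return 0;
}
```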
 
It'd be interesting to see what scenarios this happens in, as you would have been reading 91GB/s or more as well as writing 109GB/s, or some combination of the two. Maybe some actual developers can tell us when scenarios like this are likely to occur.

They told you, you didn't want to listen, and I thought the subject of this malarkey had already been banned. Keep on spinning.
 
They told you, you didn't want to listen, and I thought the subject of this malarkey had already been banned.

They have given us an average bandwidth without a timestep or anything else; this subject is not banned, this thread is entirely the point of the subject. If you don't like it then don't post in it.

I personally want more information and details than they have provided, as I feel it would lead to better comparisons.
 
They have given us an average bandwidth without a timestep or anything else; this subject is not banned, this thread is entirely the point of the subject. If you don't like it then don't post in it.

I personally want more information and details than they have provided, as I feel it would lead to better comparisons.

Either you believe it or not. By definition an average already spans a period of time.
 
Either you believe it or not. By definition an average already spans a period of time.

What matters a lot is that time step. You can believe the number and still want to know how long it was measured over.
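
As a toy illustration of why the window matters (the trace below is invented, not measured): the same bursty traffic reports very different "averages" depending on how long you average over.

```c
#include <stdio.h>

int main(void)
{
    /* Invented GB/s samples over a 16 ms frame: 6 ms of heavy blending,
       10 ms of lighter traffic. */
    double trace[16] = {200, 200, 200, 200, 200, 200,
                         80,  80,  80,  80,  80,  80,  80,  80,  80,  80};
    double frame_sum = 0.0, burst_sum = 0.0;
    int i;

    for (i = 0; i < 16; ++i) frame_sum += trace[i];
    for (i = 0; i < 6; ++i)  burst_sum += trace[i];

    printf("average over the 6 ms burst : %.0f GB/s\n", burst_sum / 6.0);  /* 200 */
    printf("average over the whole frame: %.0f GB/s\n", frame_sum / 16.0); /* 125 */
    return 0;
}
```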

What's also interesting is the 50-55GB/s number. I'm just wondering how much bandwidth a CPU would like and need in next-gen games; it has a 30GB/s bus to the DRAM, probably for a good reason.
 
I'm just wondering how much bandwidth a CPU would like and need in next-gen games; it has a 30GB/s bus to the DRAM, probably for a good reason.
4 core (8 thread) Haswell at 3.5 GHz (3.9 GHz turbo) is fine with 25.6 GB/s (and it shares that bandwidth with the HD 4600 IGP). Jaguar cores have around half the clocks and lower IPC compared to Haswell. Bandwidth shouldn't be a problem for them.
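
For context, the 25.6 GB/s figure is just dual-channel DDR3-1600 arithmetic; a quick sketch (the per-core line is a naive even split, not a spec):

```c
#include <stdio.h>

int main(void)
{
    /* 2 channels x 64-bit (8 bytes) x 1600 MT/s */
    double haswell_ddr3 = 2.0 * 8.0 * 1600e6 / 1e9; /* 25.6 GB/s */
    double xb1_cpu_bus  = 30.0;                     /* coherent CPU bus, as quoted */

    printf("dual-channel DDR3-1600 : %.1f GB/s\n", haswell_ddr3);
    printf("XB1 coherent CPU bus   : %.1f GB/s\n", xb1_cpu_bus);
    printf("naive even split, 8 Jaguar cores: %.2f GB/s each\n", xb1_cpu_bus / 8.0);
    return 0;
}
```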
 