Xbox One's apparent high peak bandwidth per flop. Advantages?

Rangers

Legend
I say apparent because of the oddity of the whole ESRAM spec increase. But assuming it's a real, generally usable increase, then XBO can sport some very hefty peak bandwidth figures: 272 GB/s against only 1.3 teraflops in the GPU.

So, what can this lead to?

Even without comparing to "PS4" directly, we need some unified RAM setup of more ordinary BW as a baseline to understand this. Like it or not, that is more or less a proxy for PS4's design, or we can call it "generic alternative GDDR5 console design X" :p

First thing that strikes me is 272 GB/s would be a LOT in the PC space for a 1.3 TF GPU.

Then even more so when we look at the resolutions we deal with. Heck, Killer Instinct is just 720p. On PC, 272 GB/s would seem to be overkill even for 1080p, but we also understand that PC is not the full answer here.
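To put rough numbers on that, here is a quick back-of-envelope sketch using only the 272 GB/s and 1.3 TF figures quoted above. The 60 fps target and the function names are my own assumptions for illustration, and peak bandwidth is of course never fully achieved in practice.

```python
# Figures quoted in this thread; FPS is an assumed target, not a spec.
PEAK_BW_BYTES_PER_S = 272e9   # claimed combined peak bandwidth
GPU_FLOPS = 1.3e12            # claimed GPU throughput
FPS = 60                      # assumption: 60 frames per second


def bytes_per_flop(bandwidth, flops):
    """Peak bytes of memory traffic available per floating point operation."""
    return bandwidth / flops


def bytes_per_pixel_per_frame(bandwidth, width, height, fps):
    """Peak bytes of memory traffic available per output pixel, per frame."""
    return bandwidth / (width * height * fps)


if __name__ == "__main__":
    print(f"bytes per flop: {bytes_per_flop(PEAK_BW_BYTES_PER_S, GPU_FLOPS):.3f}")
    for name, (w, h) in {"720p": (1280, 720), "1080p": (1920, 1080)}.items():
        budget = bytes_per_pixel_per_frame(PEAK_BW_BYTES_PER_S, w, h, FPS)
        print(f"{name}: {budget:.0f} bytes per pixel per frame")
```

Under those assumptions it works out to roughly 0.21 bytes per flop, and about 4.9 KB per pixel per frame at 720p versus about 2.2 KB at 1080p.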

Could there be particular effects XBO would be strong at? Can we expect it to be very strong in particle effects? Fog/smoke?

Will the ESRAM being such a small pool "nerf" the positive effects of such large BW, or not so much?
 
In my opinion it'll mostly be used for intermediate buffer storage. eg depth buffers, etc.

For most other things, the data requires copying in/out of esram, which still is limited by dram bandwidth. So in these cases the only way you'll see significant benefit is if the data is read/written a very large number of times within the period it is located in esram. Some things come to mind - but most will be handled by the on chip caches fairly well. Transparency rendering is an obvious case - but it also can be cache optimised very well.

So it begs the question, what kind of problem requires lots of duplicated read/write bandwidth, has a high cache miss rate while also having fairly low ALU cost and isn't ROP limited.... Any takers?
 
What exactly are procedurally generated effects? What falls in that category?

Infinite texture variety with true random patterns. Also not just textures but also geometry so stuff like rock/tree formations don't have to be modeled by hand. I guess all naturally occurring formations benefit from procedural stuff. Stuff like cracks or holes can be procedurally generated too.

Also particle effects and fluids can be procedurally generated instead of prebaked fire animations/smoke etc.

http://software.intel.com/en-us/art...-accurately-model-procedurally-spreading-fire
http://software.intel.com/en-us/articles/procedural-trees-and-procedural-fire-in-a-virtual-world

http://developer.download.nvidia.com/SDK/10.5/direct3d/Source/Smoke/doc/Smoke.wmv
http://developer.download.nvidia.co...rc/SnowAccumulation/Docs/SnowAccumulation.pdf
http://developer.download.nvidia.com/SDK/10.5/direct3d/Source/PerlinFire/doc/PerlinFire.pdf
 
eSRAM and the high bandwidth will allow lots of procedurally generated effects.
I also think it's more about efficiency and taking advantage of all the resources available on the system. I learnt from devs not to believe in teraflops alone as a measure of the actual capabilities of a console. The 32 MB scratchpad/buffer can cancel out some of the memory bandwidth disadvantages of the DDR3, which is actually fast for DDR3 but not enough compared to the more modern setups you can find on the PS4 and the PC.

Aside from that, Xbox One's move engines are designed to deal with compressed textures on the fly, so whether the entire image sits in the DDR3 or the lower-latency eSRAM, the move engines can handle it compressed, which means you can move the data around faster (it looks good for overdraw effects).
 
Infinite texture variety with true random patterns. Also not just textures but also geometry so stuff like rocks formations don't have to be modeled by hand. I guess all naturally occurring formations benefit from procedural stuff. Stuff like cracks or holes can be procedurally generated.

Also particle effects and fluids can be procedurally generated instead of prebaked fire animations/smoke etc.

How is that memory speed bound? To me that sounds like something that is ALU bound instead of memory speed bound. That's assuming you want to calculate more details in real time instead of accessing large(r) textures.
 
What exactly are procedurally generated effects? What falls in that category?

A procedural effect is typically something structural (e.g. an image, 3D model, etc) that is generated by a mathematical algorithm. For example, an artist will often not have direct control over the pixels in a procedural image, just over the parameters used by the algorithm to create the image.

For a good example, take a look at the wiki page for Julia sets (this is known as a fractal, which is a type of procedural effect).
As you can see, the maths is quite complex, and it's fairly limited in what it can visually create. This is the primary problem with most procedural stuff: randomness and naturalism are hard to create with an algorithm. Terrain generation, for example, usually has to involve complex offline erosion simulations to get something that feels natural.
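To make the "parameters, not pixels" point concrete, here is a minimal sketch of a Julia set rendered as ASCII using the standard escape-time iteration z -> z*z + c. The function names, grid size, palette and the example value of c are all my own choices for illustration. The only thing an "artist" controls here is c and the iteration limit.

```python
def julia_escape_counts(c, width=60, height=24, max_iter=50, extent=1.6):
    """Return a grid of escape-time counts for the Julia set of z -> z*z + c.

    Each cell holds how many iterations the starting point survived before
    |z| exceeded 2 (or max_iter if it never escaped within the limit).
    """
    rows = []
    for j in range(height):
        y = extent * (2 * j / (height - 1) - 1)
        row = []
        for i in range(width):
            x = extent * (2 * i / (width - 1) - 1)
            z = complex(x, y)
            n = 0
            while n < max_iter and abs(z) <= 2.0:
                z = z * z + c
                n += 1
            row.append(n)
        rows.append(row)
    return rows


def render_ascii(counts, max_iter=50):
    """Map escape counts onto a small character palette, one line per row."""
    palette = " .:-=+*#%@"
    lines = []
    for row in counts:
        lines.append("".join(palette[n * (len(palette) - 1) // max_iter] for n in row))
    return "\n".join(lines)


if __name__ == "__main__":
    # Changing this one complex parameter changes the whole image.
    print(render_ascii(julia_escape_counts(complex(-0.8, 0.156))))
```

Note that this is all arithmetic per pixel with almost no memory traffic, which is why this kind of generation tends to lean on ALU rather than bandwidth.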

So, they are good starting points - but their lack of artistic control and general expense means that they have fairly few commercial uses (and those uses are usually extremely complex). They also limit your ability to compress the output, etc if you are trying to generate them in realtime.

So as should be obvious, they aren't used very often in games other than a starting point. They are only really used in fully dynamic systems such as particle behaviour (which usually just boils down to simple randomization).
 
In my opinion it'll mostly be used for intermediate buffer storage. eg depth buffers, etc.

For most other things, the data requires copying in/out of esram, which still is limited by dram bandwidth.

hunh? What does esram bandwidth have to do with dram bandwidth? Only the final tile write...
 
hunh? What does esram bandwidth have to do with dram bandwidth? Only the final tile write...

I'm saying that unless it is an intermediate buffer, you'll still have to copy data in and/or out of esram... So if the access pattern is nice and coherent (ie the data is read into cache once/twice on average per byte) then you'd be better having left it in dram.
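Here is a toy model of that trade-off. The 68 GB/s DRAM and 204 GB/s ESRAM figures are my assumptions (they happen to sum to the 272 GB/s quoted in this thread), and the model ignores caches, latency, and any overlap of copies with rendering, so treat it as a sketch of the argument rather than a measurement.

```python
DRAM_BW = 68e9    # assumed DDR3 peak, bytes/s
ESRAM_BW = 204e9  # assumed ESRAM peak (read + write), bytes/s


def dram_only_time(size_bytes, passes, dram_bw=DRAM_BW):
    """Every pass over the data streams it straight from DRAM."""
    return passes * size_bytes / dram_bw


def esram_time(size_bytes, passes, copies=2, dram_bw=DRAM_BW, esram_bw=ESRAM_BW):
    """Stage through ESRAM: each copy in/out crosses DRAM, passes run from ESRAM.

    copies=2 means copy in and copy out, copies=0 models an intermediate
    buffer that is created and consumed entirely inside ESRAM.
    """
    return copies * size_bytes / dram_bw + passes * size_bytes / esram_bw


def break_even_passes(copies=2, dram_bw=DRAM_BW, esram_bw=ESRAM_BW):
    """Number of passes above which staging in ESRAM beats leaving data in DRAM.

    Solves passes/dram_bw = copies/dram_bw + passes/esram_bw for passes.
    """
    return copies * esram_bw / (esram_bw - dram_bw)


if __name__ == "__main__":
    size = 32 * 1024 * 1024  # a buffer filling the 32 MB pool
    for passes in (1, 2, 4, 8):
        print(passes, dram_only_time(size, passes), esram_time(size, passes))
    print("break-even passes:", break_even_passes())
```

With those assumed figures, data that has to be copied both in and out only breaks even at about 3 full passes, whereas an intermediate buffer (copies=0) wins from the first pass.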
 
So, they are good starting points - but their lack of artistic control and general expense means that they have fairly few commercial uses (and those uses are usually extremely complex). They also limit your ability to compress the output, etc if you are trying to generate them in realtime.

So as should be obvious, they aren't used very often in games other than a starting point. They are only really used in fully dynamic systems such as particle behaviour (which usually just boils down to simple randomization).
For actual game stuff, the current pace-setter seems to be Allegorithmic's Substance.


Probably too expensive for games for a long time, at least for realtime changes, but you could update textures every now and then for long term environment changes.
 
It also has a high flop per ROP rating. Is that advantageous as well?
Or high transistor per Gigaflop, that must mean something as well
 
It also has a high flop per ROP rating. Is that advantageous as well?
Or high transistor per Gigaflop, that must mean something as well

The XB1 is an artisanally crafted, efficiently pipelined, balanced, Direct X processing Monster. ;)

Whether you put the FLOP in the numerator or the denominator is up to you, but those are ratios, and those ratios are there for reasons of balance, efficiency, ... balance ... ESRAM balanced efficiency working synergistically with the MS made, Direct X functionality ...

Anyhoo it's all about the games.:p
 
There was an upgrade a few weeks back to 109 GB/s, so the max theoretical read plus write BW is now 200+ GB/s.
 
Give the 7770 the 7970's bandwidth and it will still be a 7770. I don't think that having way more bandwidth than you need will help, at least not to the point where it will make the GPU work like a much stronger one.
 
I think when dealing with stuff like this it is important to note not only the overall bandwidth but also the write bandwidth and the read bandwidth separately.

So the XBONE has 272GB/s overall bandwidth

With a max of 109GB/s write bandwidth

And a max of 163GB/s read bandwidth

If we just keep talking about 272GB/s of bandwidth, people might be led to believe it can actually write that fast. That is why I think the distinction is important: it causes less confusion.
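A small sketch of why the split matters, taking the 109 GB/s write and 163 GB/s read peaks above at face value. The model is my own simplification: it assumes reads and writes run concurrently and each is capped only by its own peak, which real hardware will not hit.

```python
READ_BW = 163e9   # peak read bandwidth as quoted above, bytes/s
WRITE_BW = 109e9  # peak write bandwidth as quoted above, bytes/s


def peak_total_bandwidth(write_fraction, read_bw=READ_BW, write_bw=WRITE_BW):
    """Upper bound on total bytes/s when write_fraction of the traffic is writes.

    Whichever direction saturates first limits the total throughput.
    """
    if not 0.0 <= write_fraction <= 1.0:
        raise ValueError("write_fraction must be between 0 and 1")
    read_fraction = 1.0 - write_fraction
    limits = []
    if read_fraction > 0:
        limits.append(read_bw / read_fraction)
    if write_fraction > 0:
        limits.append(write_bw / write_fraction)
    return min(limits)


if __name__ == "__main__":
    for wf in (0.0, 0.25, 109 / 272, 0.5, 1.0):
        print(f"write fraction {wf:.2f}: {peak_total_bandwidth(wf) / 1e9:.0f} GB/s")
```

In this model the full 272 GB/s only shows up when the traffic mix is about 40% writes and 60% reads. A pure write workload tops out at 109 GB/s, and a 50/50 mix at 218 GB/s.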
 