When a card is said to be "bandwidth limited"

Bill

What exactly is meant? WHAT is limited? Sorry I'm dumb on the basics.

I'm guessing, textures and geometry/polygon/vertices? And in fact are geometry, polygons and vertices the same thing?

Basically, suppose the PS3 is bandwidth limited (or any graphics card, really, but it seemed more appropriate to ask here than in the console forum, where they would think I was just bashing the PS3).

Basically, what would the games look like? You would get more shading for "free" when you're bandwidth limited, right? So would it be a recognizable "look", and how so? Basically low polygon, possibly low texture, but extremely heavy shader effects?

What might be some examples of games with that "look", on PC or otherwise, if it is indeed a correct assumption?
 
Bill said:
What exactly is meant? WHAT is limited? Sorry I'm dumb on the basics.

It basically means the bottleneck in the system is the memory bandwidth to the GPU (well, generally it's the memory bandwidth we are talking about).

I'm guessing, textures and geometry/polygon/vertices? And in fact are geometry, polygon and vertices the same thing?
The frame buffer is a big factor too. Polys and vertices are more or less the same thing.

Basically what would the games look like? You would get more shading for "free" right, when you're bandwidth limited,
Dunno if I would call it free but anyway...

So would it be a recognizable "look" and how so? Basically low polygon, possibly low texture, but extremely high shader effects?
Definitely not; it would be the other way around. You would have a low to medium amount of polys, lots of textures, and few shader effects.
 
bloodbob said:
The frame buffer is a big factor too. Polys and vertices are more or less the same thing.


Dunno if I would call it free but anyway....

I don't really understand the part about the framebuffer. Is the framebuffer size affected by how many shaders are run on it? I thought framebuffer size was a "hard" number as opposed to a variable number, i.e. always the same at 720p?

For the second quoted part, my impression was basically this: if in a non-bandwidth-limited scenario a game runs at 80 FPS, but it is BW limited and therefore can only run at 60 FPS, then you might as well run some extra shaders in the meantime, since the bottleneck is BW, not shaders, and you could apply more shaders before you get down to being shader limited again. In a sense the shaders would then have "idle time" compared to the 80 FPS shader-limited scenario, so you might as well use it. So am I incorrect?

I also don't see how a BW-limited game would use lots of textures. Aren't textures one of the main things a GPU "fetches" from memory? Being BW limited would therefore seem to hit textures hard.
 
Bill said:
I don't really understand the part about the framebuffer. Is the framebuffer size affected by how many shaders are run on it? I thought framebuffer size was a "hard" number as opposed to a variable number, i.e. always the same at 720p?

The frame buffer size is always the same, but the traffic to and from the frame buffer changes quite a lot.

For the second quoted part, my impression was basically this: if in a non-bandwidth-limited scenario a game runs at 80 FPS, but it is BW limited and therefore can only run at 60 FPS, then you might as well run some extra shaders in the meantime, since the bottleneck is BW, not shaders, and you could apply more shaders before you get down to being shader limited again. In a sense the shaders would then have "idle time" compared to the 80 FPS shader-limited scenario, so you might as well use it. So am I incorrect?
That's correct, you could use more shaders (although your shader is going to end up adding traffic anyway, either via texture reads or via writes to the frame buffer).

I also dont see how a BW limited game would use lots of textures. Aren't textures one of the main things a GPU "fetches" from memory? Therefore to be BW limited would seem to hit textures hard.
Umm, yes, textures are the main thing the GPU fetches, so textures are the main thing using up the bandwidth.
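To make the size-vs-traffic distinction concrete, here is a rough back-of-the-envelope sketch in Python (the overdraw, blend fraction and frame rate figures are illustrative assumptions, not measured numbers):

```python
# Framebuffer *size* is fixed by resolution and pixel format,
# but framebuffer *traffic* scales with overdraw, blending and frame rate.

def framebuffer_size_bytes(width, height, bytes_per_pixel=4):
    """Static allocation for one 32-bit color buffer."""
    return width * height * bytes_per_pixel

def framebuffer_traffic_bytes_per_sec(width, height, fps,
                                      overdraw=3.0, bytes_per_pixel=4,
                                      blended_fraction=0.25):
    """Rough per-second read+write traffic.

    Every rendered pixel is written once; blended pixels must also be
    read back first. 'overdraw' counts how many times the average
    screen pixel is touched per frame (assumed value).
    """
    pixels_touched = width * height * overdraw * fps
    writes = pixels_touched * bytes_per_pixel
    reads = pixels_touched * blended_fraction * bytes_per_pixel
    return writes + reads

size = framebuffer_size_bytes(1280, 720)            # 720p, 32-bit color
traffic = framebuffer_traffic_bytes_per_sec(1280, 720, fps=60)
print(f"size:    {size / 2**20:.1f} MiB")   # constant no matter what you draw
print(f"traffic: {traffic / 2**30:.2f} GiB/s")  # varies with the workload
```

Same resolution, same buffer size, but double the overdraw (heavier scenes) and the traffic doubles with it.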
 
So I'm trying to get a feel for rough percentages.

Is most BW most easily taken by shaders, textures, or polygons?

And BW can limit all three of the above?

Would it be realistic for example, to have plenty of polygons, plenty of textures, but BW limited heavily on shaders?

As far as shaders using more textures, can't you just load in X textures and run more and more shaders on them, thereby not using any more textures? Not saying this would be ideal, but... might it be a workaround?
 
Well, isn't it the case that in a modern shader-bound title, the texture fetch load on the total GPU bandwidth is somewhat less than, say, color/z ops and multisampling (which are far more frequent)? The main constraint now, for textures, is size: all they need is a large enough address space, so that even a single fetch occurs in the shortest possible time (this maybe applies mostly to filter sampling?).
Take for example a heavy shader-profile GPU like Xenos: it has a fast-and-furious but small eDRAM buffer for the rasterization hard work, and only a tiny shared 128-bit bus to the large common address space for everything else.
 
fellix said:
...the texture fetch load on the total GPU bandwidth is somewhat less than, say, color/z ops and multisampling (which are far more frequent)

Are those just for AA?

What would happen if a GPU was "color/z op" limited?
 
Bill said:
Is the trend more toward being shader bound in modern titles, or BW bound?

Shader bound. You know, the limitations in RSX are not due to "whoops!" They are chosen because they suit the path that is being taken going forward. Don't think of the hardware as being bound in some way by itself. It is bound by limitations under certain conditions, conditions that are imposed by the code being run (the game, the 3D engine). Games want more shaders and it therefore follows that you need more relative shading capacity than vertex processing or texturing performance. You don't program to become bound, you program to avoid the bounds, getting "free" performance where it can be taken while another part is bound. At some point, however, you will be completely restricted by the hardware's limitations. At this point you are efficient.

Bandwidth limitations are a really complex subject. You are talking about external bandwidth here, with regard to the external memory. In a very simple analogy you can think about this as a car engine. The cylinders (processing pipelines) have a certain capacity to process data (fuel), and this aggregates into a total amount of fuel for all cylinders. The actual amount of fuel in the cylinder is not governed by the cylinders or valves themselves, but by the fuel delivery system, like the carburetor. Increasing the number of cylinders or their size usually warrants an increase in fuel flow, but you also know that you need air (that works nicely in this analogy... air... breathing room), so a larger cylinder using the same amount of fuel as a smaller one may be able to use it more efficiently by combusting (processing) with more air. Of course, when you are talking about memory you are talking about a bi-directional flow, whereas in an engine the fuel only flows in one direction.

PS. One more thing. Just like a car engine, these are not designed for open-ended conditions, solving for some theoretical optimum without realistic constraints. A car, for example, cannot be expected to have a 1000 gallon fuel tank so it can guzzle fuel at $0.01/gallon to achieve some explosive maximum torque. That's just a theory. Reality is different, and engines, like 3D hardware, are designed accordingly.

Yes, yes...I know...everyone groans when the car/engine analogy is used for computer hardware. Sorry! ;)
 
wireframe said:
Shader bound. You know, the limitations in RSX are not due to "whoops!" They are chosen because they suit the path that is being taken going forward. ...


Sorry, I didn't really understand the analogy. What is air? What are the bigger cylinders? Which GPU part is the carburetor? Etc. How do they all fit together, like BW and GPUs? I understand pipelines are cylinders and data (shader programs?) is fuel...
 
If your card was not bandwidth limited, your FPS drop from one resolution to the next higher one would depend solely on the card's fill rate. For example,

if your card has a fill rate of 30,720,000 pixels a second, your FPS at

640*480 = 100
800*600 = 64
1024*768 = 39
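Those figures are just the fill rate divided by the pixels per frame; a quick sketch of the arithmetic (note that the 64 fps case corresponds to 800*600, i.e. 480,000 pixels per frame):

```python
# In a purely fill-rate-limited case, fps = fill rate / pixels per frame.

FILL_RATE = 30_720_000  # pixels per second (the assumed card above)

for width, height in [(640, 480), (800, 600), (1024, 768)]:
    fps = FILL_RATE / (width * height)
    print(f"{width}x{height}: {fps:.0f} fps")
# 640x480: 100 fps
# 800x600: 64 fps
# 1024x768: 39 fps
```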
 
Modern pipelines can be thought of as a series of smaller units, each doing a specialised task (let's just ignore unified shaders for now). These individual parts are very hungry: they are exceptionally fast at doing what they're designed for.

Now, bandwidth typically describes how much data can be fed into the mouths of these hungry little units.

A simple vertex layout is 32 bytes in size, and if your vertex engine is capable of processing 30 million vertices a second, you need to provide it with 30,000,000 * 32 bytes of data every second. You need to feed ~915 MB/sec into the vertex engine. That is your bandwidth.

Now, the best case scenario is that data feeds in at MORE than 915 MB/sec, at which point the vertex engine is NEVER waiting for data. As soon as it's finished processing one vertex, the next one is ready and waiting.

The worst case scenario is that data CANNOT feed in that fast (less than 915 MB/sec) and the vertex engine ends up waiting. It finishes one vertex and has to wait a few nanoseconds before the next vertex is ready. That is a few nanoseconds it could have used to process more vertices.

So there's a nice balance to be found - how much data can you feed in per second, how much of that data can you process per second.
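The arithmetic above can be written out directly (using the same assumed 32-byte vertex and 30 Mvert/sec figures):

```python
VERTEX_SIZE_BYTES = 32          # simple vertex layout (position, normal, UV)
VERTEX_RATE = 30_000_000        # vertices the engine can process per second

# Bytes per second needed to keep the engine永 busy:
required = VERTEX_SIZE_BYTES * VERTEX_RATE
print(f"required feed rate: {required / 2**20:.1f} MB/s")   # ~915.5 MB/s

def vertex_engine_utilization(feed_rate_bytes_per_sec):
    """Fraction of peak throughput the engine achieves given its feed.

    If data arrives faster than the engine can consume it, utilization
    caps at 1.0; if it arrives slower, the engine idles proportionally.
    """
    achievable_vertices = feed_rate_bytes_per_sec / VERTEX_SIZE_BYTES
    return min(1.0, achievable_vertices / VERTEX_RATE)

print(vertex_engine_utilization(1_200 * 2**20))  # overfed -> 1.0, never waits
print(vertex_engine_utilization(600 * 2**20))    # starved -> engine idles ~35%
```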

Extrapolate that to a more complex pipeline - one where you have multiple units. If one part slows down then subsequent parts will slow down. Referring back to the previous example - if the vertex processor is idle and waiting for more data, then it isn't going to be sending anything to the rasterizer - so the rasterizer will be idle whilst it waits for more data to work with.

Thats a feed-forward bottleneck. Early delays slow down the rest of the pipeline.

However you can also get a later stage slowing down the previous ones. If the pixel processor gets jammed up working too hard then it'll put up a big STOP sign and tell the vertex processor that it doesn't want any more. So, even if the vertex processor has data being fed in (original example) it can't do anything with it because it has been told NOT to produce any more output.

So, the net result for the vertex processor is that it now has MORE time in which it can operate on a vertex. It might be able to process a vertex in 100ns but if the pixel engine can only accept a new one every 150ns then the vertex processor can spend 50% more time per-vertex (maybe on a more complex animation system). OR, if by design the pixel processor will NEVER be able to keep up with the vertex processor there is absolutely no point in having that complex vertex processor - you might as well ship a slower/cheaper one...

If you're familiar with computer algorithms, this is all analogous to the "producer-consumer" pattern. One stage can only produce as much as the next consumer can accept.
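As a toy model of the back-pressure case (using the 100 ns and 150 ns stage times from the example; real pipelines have buffering and variable per-item costs that this ignores):

```python
# Toy pipeline model: in steady state, a linear pipeline runs at the
# rate of its slowest stage, and faster stages gain slack per item.

def pipeline_rate(stage_times_ns):
    """Items per second a linear pipeline sustains in steady state."""
    bottleneck = max(stage_times_ns)
    return 1e9 / bottleneck

# Vertex stage: 100 ns/vertex. Pixel stage: 150 ns/pixel batch.
# The pixel stage back-pressures the vertex stage, which now has
# 50 ns of slack per item it could spend on more complex work.
stages = [100, 150]
rate = pipeline_rate(stages)
slack = max(stages) - stages[0]
print(f"{rate:.2e} items/s, vertex slack: {slack} ns/item")
```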
 
Bill said:
Anybody know a good basic guide on the web that explains all this in detail?
Real-Time Rendering, Second Edition is an essential text on this sort of topic. Go find a library or a second-hand copy. You might be able to find something via their extensive website.

hth
Jack
 
YeuEmMaiMai said:
If your card was not bandwidth limited, your FPS drop from one resolution to the next higher one would depend solely on the card's fill rate. For example,

if your card has a fill rate of 30,720,000 pixels a second, your FPS at

640*480 = 100
800*600 = 64
1024*768 = 39
That would happen even if you were bandwidth bound. Geometry doesn't take much bandwidth, relatively speaking, and generally the AGP bus is used (in the PS3 it'll be the FlexIO link). So rendering each pixel requires a certain amount of memory access.

So resolution does not really affect bandwidth limitation. If anything, higher resolution makes a card slightly less dependent on bandwidth, because the same texture data is spread over more pixels (during magnification), and compression schemes work a bit better.
 
wireframe said:
Shader bound. You know, the limitations in RSX are not due to "whoops!" They are chosen because they suit the path that is being taken going forward. ...
That's not quite how it works, though your perception of general trends is correct. Games will definitely be using shaders more.

Obviously they didn't say "whoops!", but there's only so much you can do when you cap the data streaming in and out. Textures can't always be massively compressed, e.g. HDR/FP textures (though some compression is available); render targets used for shadows, reflections, postprocessing; etc. Output is usually more than 32-bit with HDR, and RSX has only 40 bytes per clock of BW. Programming is rarely going to be able to find an equivalent, less bandwidth intensive technique. They'll have "free" math ops, but a chocolate bar doesn't help if you're really thirsty.

The biggest bandwidth limitation arises with alpha blending. Smoke, fire, muzzle flare, volumetric fog and lights, dust clouds, and fur rendering are just a few examples of things that won't really get much better with shader power, and often these are the things that have the biggest "wow" factor. You just need to crank out the layers as fast as you can. Unfortunately, you have to both read from and write to the framebuffer. Worst case scenario: A high resolution FP16 HDR texture blended into a FP16 framebuffer needs 24 bytes of bandwidth per pixel.
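To put a rough number on that worst case (8 bytes each for the FP16 texture read, the framebuffer read, and the framebuffer write; the layer count and resolution below are illustrative assumptions):

```python
BYTES_FP16_RGBA = 8   # 4 channels x 16-bit float per pixel

def blend_bandwidth_gb_per_sec(width, height, layers, fps):
    """Bandwidth for alpha blending: read the FP16 source texture,
    read the FP16 framebuffer, write the FP16 framebuffer -> 24 B/pixel,
    repeated for every blended layer, every frame."""
    per_pixel = 3 * BYTES_FP16_RGBA               # 24 bytes per blended pixel
    return width * height * layers * fps * per_pixel / 1e9

# e.g. 20 full-screen smoke layers at 720p / 60 fps (assumed workload)
print(f"{blend_bandwidth_gb_per_sec(1280, 720, 20, 60):.1f} GB/s")
```

Even this modest-sounding workload eats tens of GB/s, which is why heavy particle effects hit bandwidth-limited hardware so hard regardless of shader power.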
 
Bill said:
So would it be a recognizable "look" and how so? Basically low polygon, possibly low texture, but extremely high shader effects?
This has already been addressed, but I'll give my take.

Geometry doesn't use much bandwidth, (though I suppose it might if you're rendering a 50 million polygon scene), so you won't see much impact here. High geometry models (i.e. when you have only a few pixels per polygon on average) are usually the domain of character models. These generally need skinning (FYI this is non-rigid-body animation), so the vertex shader will limit the number of vertices per second that you can process. Geometry bandwidth is proportional to vertex throughput.

Traditional textures don't use too much bandwidth either, because you can get 8:1 compression, and magnification of the textures will reduce this further. Things get more complicated under other scenarios (I described them above in a previous post), but I think that may be a bit beyond your 3D comprehension.

"Shader effects" use textures, so there's not much you can say here. Sometimes they need lots of BW per pixel, sometimes not so much. I don't think you can really make any generalizations at all about how a game will look.

About the only thing you can say for sure about RSX is that really heavy alpha blending effects (again, see above) will be absent. Even this, though, is hard to quantify. You will still see nice effects, but they can look better on other systems, that's all.


In any case, with the level the current graphics hardware is at (both PC and console), remember that graphics depend much more on developer and artist capabilities than the hardware. Don't expect any distinguishable characteristic of PS3 games due to the bandwidth limitation.
 
Gee, I would have thought the answer to "does it have a recognizable look" would have been "aliased".
 
geo said:
Gee, I would have thought the answer to "does it have a recognizable look" would have been "aliased".

Wait, his last paragraph deserves a different highlight than that, one which will be valid for either of the two new consoles:

In any case, with the level the current graphics hardware is at (both PC and console), remember that graphics depend much more on developer and artist capabilities than the hardware. Don't expect any distinguishable characteristic of PS3 games due to the bandwidth limitation.
 