Quick tech questions concerning next gen...

pixelbox said:
So the "fill" in fillrate refers to how fast pixels can fill a polygon or fill the GPU?

Fill the screen, which is full of polygons and the whole thing is kept in the Frame Buffer (the memory), not the GPU.
 
Everything is starting to clear up! I've picked up pieces of CPU/GPU info over the years, and things got too confusing this time around since CPUs/GPUs have become more intricate. Thank you everyone for taking the time to explain, I won't be as lost!
 
pixelbox said:
Now i got it! After all these years!
Obviously these days things get much more complicated than that simply because some shaders and other effects eat up fillrate, so as lots of people said lots of times already, talking about fillrate is a bit useless today. Like Guden said:

This just goes to show, btw, that fillrate is a highly archaic method of measuring performance. If PS3 did have over 13Gpix/s of fillrate, it could redraw a 1080p screen 106 times per frame at 60fps. A ridiculous and uselessly high figure.
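Guden's figure can be sanity-checked with a quick calculation. This is just the arithmetic from the quote; "over 13 Gpix/s" is taken here as roughly 13.2 Gpix/s, which is an assumption, not a spec.

```python
# Rough sanity check of the quoted figure (quoted assumptions, not measurements).
fillrate = 13.2e9           # pixels per second ("over 13 Gpix/s")
pixels_1080p = 1920 * 1080  # 2,073,600 pixels per screen
fps = 60

redraws_per_frame = fillrate / (pixels_1080p * fps)
print(round(redraws_per_frame))  # ~106 full-screen redraws per frame
```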
 
london-boy said:
Fill the screen, which is full of polygons and the whole thing is kept in the Frame Buffer (the memory), not the GPU.

Well... many GPUs have framebuffer caches, so a small portion may be kept on the GPU at any one time.
 
london-boy said:
Obviously these days things get much more complicated than that simply because some shaders and other effects eat up fillrate, so as lots of people said lots of times already, talking about fillrate is a bit useless today. Like Guden said:
Yeah, I was just going to say effects are counted as extra pixels that cut into the fillrate budget. I remember I was at another forum and members there claimed that hardwired effects didn't require/eat fillrate. I see how credible they were now... I see now why PS2 could do the things it does. One thing still gets me: with all of those pipelines, how come PS2 wasn't really able to do effects like normal maps or bump maps (other than Path of Neo and Jak 3)? Is it that hard to do in software?
 
pixelbox said:
Yeah, I was just going to say effects are counted as extra pixels that cut into the fillrate budget. I remember I was at another forum and members there claimed that hardwired effects didn't require/eat fillrate. I see how credible they were now... I see now why PS2 could do the things it does. One thing still gets me: with all of those pipelines, how come PS2 wasn't really able to do effects like normal maps or bump maps (other than Path of Neo and Jak 3)? Is it that hard to do in software?

Basically the problem with PS2 and pixel-based effects is that some features were lacking in the GS, like the proper blending modes needed to produce per-pixel effects.
In the end you still saw some forms of per-pixel effects here and there, but it really depends on what kind of effects we're talking about, and they never looked as good as even "simple" shading effects produced by the Xbox.
PS2 had a whole lot of fillrate, which meant it could make up for some of the missing hardwired features with lots of other nice-looking effects. PS2 was very, very fast at alpha and particles (much like X360 is now, thanks to the EDRAM), so developers tried to overcome the lack of hardware features by using those two effects accordingly.

If you look at Shadow of the Colossus, it has some effects no one would have thought PS2 could do; they made their effects look like "the real thing" (whatever The Real Thing is, in this case hardware features). Things like HDR and fur shading.

I think PS2 also enjoyed the luxury of being by far the most dominant console, so developers could spend more time and money on the platform, taking risks they couldn't on Xbox and GC. Also, Sony's first-party devs were just REALLY good.
 
3. Are VLIW/SIMD different modes of a processor, or types of code?
SIMD and VLIW are deeply tied to the actual implementation of the hardware - for SIMD, a processor will have a number of execution units (typically 4, since that's a common vector width for a lot of things), which are tied together, and do the same operation on 4 separate pieces of data at the same time (this saves on hardware resources and makes for smaller code size compared to an implementation with 4 independent parallel execution units).
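The 4-wide SIMD idea above can be sketched in a few lines. This only models the programming model, not the hardware; `simd_add4` is a made-up name standing in for a single vector instruction.

```python
# One "instruction" applies the same operation to 4 lanes of data at once.
def simd_add4(a, b):
    """Element-wise add across 4 lanes, like a single 4-wide vector add."""
    assert len(a) == len(b) == 4
    return [a[i] + b[i] for i in range(4)]

x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
print(simd_add4(x, y))  # [11.0, 22.0, 33.0, 44.0]
```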

VLIW is a whole different kettle of fish, but is also related to how a processor handles parallelism. It's also a rather complex thing to explain (since to explain it, you have to go over some fundamental issues that it was designed to address), so it can't be summed up in a few sentences. If you want a brief summary in relation to next-gen consoles, though: completely irrelevant, since none of them (AFAIK) have anything in them that's VLIW, at least, not that's exposed to the programmer (as someone else mentioned, GPUs are often VLIW at the hardware level, but they're accessed entirely through shader APIs, which abstract the hardware details to a large extent).

If you want a full explanation, read on: In a traditional processor, while it has the hardware resources to execute multiple instructions simultaneously (think of a CPU that has separate floating point units and integer units, both can be running simultaneously), the instruction stream is a linear stream of single instructions.

Since you don't want most of your processor sitting idle every cycle, you have to extract parallelism from the linear instruction stream. The easiest way to do this is to simply look at the instruction stream, and if there are adjacent instructions which aren't dependent on each other, and which have their operands (the data they're working on) immediately available, to run them. If there aren't any, the processor just issues what it can (if one of them can run, it will run on its own; if nothing can run, it will stall until the results from whatever previous calculation is holding it up are ready).
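The in-order dual-issue rule described above can be modeled with a toy scheduler. This is my own sketch, not any real CPU: it only checks whether the next instruction depends on the previous one's result, and ignores operand latencies entirely.

```python
# Toy in-order dual-issue model: issue two adjacent instructions in one
# cycle only if the second doesn't read the first's destination register.
def in_order_cycles(program):
    """program: list of (dest, src1, src2) register-name tuples."""
    cycles, i = 0, 0
    while i < len(program):
        if i + 1 < len(program):
            dest, _, _ = program[i]
            _, s1, s2 = program[i + 1]
            if dest not in (s1, s2):  # independent: dual-issue this cycle
                i += 2
                cycles += 1
                continue
        i += 1                        # dependent (or last op): single-issue
        cycles += 1
    return cycles

# r2 = r0+r1 then r3 = r2+r1 is a dependent pair: 2 cycles instead of 1.
print(in_order_cycles([("r2", "r0", "r1"), ("r3", "r2", "r1")]))  # 2
print(in_order_cycles([("r2", "r0", "r1"), ("r5", "r3", "r4")]))  # 1
```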

Another way to handle it is to put in some very complex hardware that essentially buffers the instruction stream somewhat, and looks through it for instructions that can run on any given cycle, and runs them, even if there are instructions which are logically ahead of them in the instruction stream, which haven't been run yet (and then there's another buffer which keeps all the results, puts them back in order at the end).
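The buffer-and-reorder idea above can also be sketched as a toy scheduler. This is an illustration only, not a real reorder buffer: each cycle it runs every buffered instruction whose inputs are ready, even if an earlier instruction is still waiting.

```python
# Toy out-of-order model: issue anything whose operands are ready,
# regardless of program order; results become inputs for later cycles.
def ooo_cycles(program, ready):
    """program: (dest, src1, src2) tuples; ready: set of registers whose
    values are available at the start."""
    pending = list(program)
    cycles = 0
    while pending:
        cycles += 1
        # every instruction whose operands are ready issues this cycle
        issued = [op for op in pending if op[1] in ready and op[2] in ready]
        for op in issued:
            pending.remove(op)
            ready.add(op[0])  # its result is now available
    return cycles

# r4 = r0+r1 appears last but runs in cycle 1, ahead of the stalled r3 op.
prog = [("r2", "r0", "r1"), ("r3", "r2", "r1"), ("r4", "r0", "r1")]
print(ooo_cycles(prog, {"r0", "r1"}))  # 2
```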

The first example would be an in-order processor (which both the Cell and Xenon are), and the second example would be an out-of-order processor (you'll often see this referred to as OOE, for Out of Order Execution), which is what any modern PC CPU is (anything post-Pentium Pro for Intel, and I think the original Athlon was AMD's first OOE CPU). OOE has the advantage that the programmer and compiler don't have to think too hard about dependency chains and latencies, but all the housekeeping and scheduling hardware takes up a lot of transistors, which Microsoft and Sony apparently thought would be better spent on additional execution resources.

A VLIW processor is an in-order processor with a twist: it takes the implicit parallelism that an in-order processor requires, and makes it explicit in the instruction set - in other words, if you were designing a processor which could do an integer operation, a floating point operation, and a load/store (memory operation) in a single cycle, instead of having 3 separate instructions, and letting the hardware notice that they can run in parallel, a VLIW instruction set will have 3 "slots" per instruction, corresponding to the int unit, the fp unit, and the load/store unit.
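The 3-slot bundle described above can be sketched like this. It's purely illustrative: the slot names and the `execute_bundle` helper are made up, and the point is only that the compiler, not the hardware, decides what occupies each slot.

```python
# A VLIW "instruction" modeled as a bundle with one slot per unit.
def execute_bundle(state, bundle):
    """bundle: dict with optional 'int', 'fp', 'mem' slots; each slot is a
    callable applied to the machine state. Empty slots are no-ops."""
    for slot in ("int", "fp", "mem"):  # all three run in the same cycle
        op = bundle.get(slot)
        if op:
            op(state)
    return state

state = {"acc": 0, "f0": 0.0, "mem": [7, 8, 9]}
bundle = {
    "int": lambda s: s.update(acc=s["acc"] + 1),    # integer unit
    "fp":  lambda s: s.update(f0=s["f0"] + 0.5),    # floating-point unit
    "mem": lambda s: s.update(loaded=s["mem"][0]),  # load/store unit
}
execute_bundle(state, bundle)
print(state["acc"], state["f0"], state["loaded"])  # 1 0.5 7
```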
 
Shader

pixelbox said:
I remember I was at another forum and members there claimed that hardwired effects didn't require/eat fillrate. I see how credible they were now...

Maybe I have made a mistake in this (please correct me), but let me try to explain.

PS2 could do heavily multi-pass effects (maybe more than 20 passes at 60fps, no problem) because of its huge fillrate, but its pixel processor was very fast and very simple, so some shader-style operations needed help from the vector units. Xbox had much less fillrate and bandwidth, so only a few passes per frame, but a slower, much smarter pixel pipeline that could do many shader operations per pixel. PS3 follows the Xbox approach of many shader operations per pixel, but goes much further: Xbox had 4 very capable pixel pipelines at 233MHz, while (if RSX is like G70 in performance) PS3 has 24 far more capable shader processors at 550MHz.
 
london-boy said:
Basically the problem with PS2 and pixel-based effects is that some features were lacking in the GS, like the proper blending modes needed to produce per-pixel effects.
In the end you still saw some forms of per-pixel effects here and there, but it really depends on what kind of effects we're talking about, and they never looked as good as even "simple" shading effects produced by the Xbox.
PS2 had a whole lot of fillrate, which meant it could make up for some of the missing hardwired features with lots of other nice-looking effects. PS2 was very, very fast at alpha and particles (much like X360 is now, thanks to the EDRAM), so developers tried to overcome the lack of hardware features by using those two effects accordingly.

If you look at Shadow of the Colossus, it has some effects no one would have thought PS2 could do; they made their effects look like "the real thing" (whatever The Real Thing is, in this case hardware features). Things like HDR and fur shading.

I think PS2 also enjoyed the luxury of being by far the most dominant console, so developers could spend more time and money on the platform, taking risks they couldn't on Xbox and GC. Also, Sony's first-party devs were just REALLY good.
You know what? These statements remind me of MGS3 and how good that water looked on a system lacking so many hardwired effects. Anyway, you mentioned that fillrate is how many pixels "fill the screen"; I still don't get how that ties into draw distances or effects. By your definition, fillrate = resolution. And what about texels? Is there a texel fillrate? Once I get a clearer definition, I'll be set and can move on.
 
arhra said:
SIMD and VLIW are deeply tied to the actual implementation of the hardware - for SIMD, a processor will have a number of execution units (typically 4, since that's a common vector width for a lot of things), which are tied together, and do the same operation on 4 separate pieces of data at the same time (this saves on hardware resources and makes for smaller code size compared to an implementation with 4 independent parallel execution units).

VLIW is a whole different kettle of fish, but is also related to how a processor handles parallelism. It's also a rather complex thing to explain (since to explain it, you have to go over some fundamental issues that it was designed to address), so it can't be summed up in a few sentences. If you want a brief summary in relation to next-gen consoles, though: completely irrelevant, since none of them (AFAIK) have anything in them that's VLIW, at least, not that's exposed to the programmer (as someone else mentioned, GPUs are often VLIW at the hardware level, but they're accessed entirely through shader APIs, which abstract the hardware details to a large extent).

If you want a full explanation, read on: In a traditional processor, while it has the hardware resources to execute multiple instructions simultaneously (think of a CPU that has separate floating point units and integer units, both can be running simultaneously), the instruction stream is a linear stream of single instructions.

Since you don't want most of your processor sitting idle every cycle, you have to extract parallelism from the linear instruction stream. The easiest way to do this is to simply look at the instruction stream, and if there are adjacent instructions which aren't dependent on each other, and which have their operands (the data they're working on) immediately available, to run them. If there aren't any, the processor just issues what it can (if one of them can run, it will run on its own; if nothing can run, it will stall until the results from whatever previous calculation is holding it up are ready).

Another way to handle it is to put in some very complex hardware that essentially buffers the instruction stream somewhat, and looks through it for instructions that can run on any given cycle, and runs them, even if there are instructions which are logically ahead of them in the instruction stream, which haven't been run yet (and then there's another buffer which keeps all the results, puts them back in order at the end).

The first example would be an in-order processor (which both the Cell and Xenon are), and the second example would be an out-of-order processor (you'll often see this referred to as OOE, for Out of Order Execution), which is what any modern PC CPU is (anything post-Pentium Pro for Intel, and I think the original Athlon was AMD's first OOE CPU). OOE has the advantage that the programmer and compiler don't have to think too hard about dependency chains and latencies, but all the housekeeping and scheduling hardware takes up a lot of transistors, which Microsoft and Sony apparently thought would be better spent on additional execution resources.

A VLIW processor is an in-order processor with a twist: it takes the implicit parallelism that an in-order processor requires, and makes it explicit in the instruction set - in other words, if you were designing a processor which could do an integer operation, a floating point operation, and a load/store (memory operation) in a single cycle, instead of having 3 separate instructions, and letting the hardware notice that they can run in parallel, a VLIW instruction set will have 3 "slots" per instruction, corresponding to the int unit, the fp unit, and the load/store unit.
Got it!
 
Anyway, you mentioned that fillrate is how many pixels "fill the screen"; I still don't get how that ties into draw distances or effects. By your definition, fillrate = resolution. And what about texels? Is there a texel fillrate? Once I get a clearer definition, I'll be set and can move on.
Depending on the graphics chip, many effects can require objects to be drawn more than once. The PS2's GS, for instance, can only draw one texture layer at a time, so any time you see an object with more than one layer of textures on it, it had to be drawn multiple times, once for each texture; more modern GPUs can do multiple textures in a single pass (well, even less modern graphics chips could... the GS was kind of primitive in that respect, but it had plenty of fillrate and bandwidth to make up for it). Also, you'll often draw pixels multiple times due to objects overlapping (for example, if you draw the terrain in a game before you draw the characters, you'll be drawing the characters over the top of the terrain). In addition, whenever you have any kind of transparency, you have to draw the pixel multiple times (smoke effects, and the like).

Modern GPUs have a lot of stuff that reduces the need for multi-pass solutions (multitexturing, shaders, early-Z to reduce opaque overdraw, etc.), but even now, there are a lot of things that have to be broken down into multiple passes (Doom 3's lighting system required the screen to be drawn more or less in full once for every light, iirc), and there's nothing really you can do to reduce the overdraw needed for things like transparent particles and the like (well, except design games where they're not used much).
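The way texture passes and overdraw multiply fillrate cost can be put into a back-of-the-envelope calculation. All the numbers here (resolution, passes, overdraw factor) are made up for illustration, not taken from any real game.

```python
# Illustrative fillrate cost of multipass rendering plus overdraw.
width, height, fps = 640, 448, 60  # a typical PS2-era framebuffer (assumed)
texture_passes = 2                 # object drawn once per texture layer
overdraw = 3.0                     # avg times each display pixel is touched

pixels_per_second = width * height * fps * texture_passes * overdraw
print(f"{pixels_per_second / 1e6:.1f} Mpix/s needed")  # 103.2 Mpix/s
```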
 
pixelbox said:
You know what? These statments remind me of MGS3 and how good that water looked on a system lacking so many hardwired effects. Anyway, you mentioned that fillrate is how many pixels "fill the screen", i still don't get how that ties to draw distances or effects. By your definition, fillrate=resolution. And what about texels? Is there a texel fillrate? Once i get a more clear definition, i'll be set and could move on.
Yeah, I think the bottom line is that overdraw creates the relationship between fillrate and draw distance. Overdraw, as described just above, is when you draw (render) a distant object, then draw one a little closer that obscures it, then one even closer that obscures them both, then one really close... etc. So even with a fixed screen resolution (and a fixed number of actual display pixels), the GPU may be attempting to draw many pixels per actual display pixel, with only the closest opaque one being seen.

Modern GPUs have "smart" hardware that attempts to limit overdraw; all of it is effective to some degree, but none of it eliminates the problem entirely. So when you have a greater draw distance, you increase the chances of objects at varying distances from the camera obscuring one another. Thus fillrate and draw distance are related: a greater fillrate allows more overdraw headroom, which allows for more obscuring geometry, which in general allows for greater draw distances.

Note too that the type of game and the style/artwork have a big impact on draw distances. Lots of trees and tons of characters on screen put a much heavier overdraw load on the GPU (and thus limit draw distance) than a grassless plain with a few tanks off in the distance.
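The headroom argument above can be sketched numerically: with a fixed fillrate, the average overdraw a scene can afford is capped, and that cap is what the art style spends. The fillrate and resolution figures are illustrative assumptions only.

```python
# With a fixed fillrate budget, how much average overdraw can a scene afford?
fillrate = 1.2e9               # pixels/s the GPU can draw (assumed figure)
width, height, fps = 1280, 720, 60

pixels_per_frame = fillrate / fps
overdraw_headroom = pixels_per_frame / (width * height)
print(f"avg overdraw budget: {overdraw_headroom:.1f}x per display pixel")
```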
 
Just a thought, but a raytracing system wouldn't have any overdraw. The extra work would instead go into secondary rays, but it'd be a totally efficient process, with one surface rendered per pixel (ignoring transparent surfaces) and no layers of polygons being drawn over the same pixel obscuring each other. In that respect it sounds like a good system. Is there any way to render polygons per pixel based on a ray-tracing selection? I guess standard GPU evolution is such that it's faster to draw everything than to be carefully selective.
 
Shifty Geezer said:
Just a thought, but a raytracing system wouldn't have any overdraw. The extra work would instead go into secondary rays, but it'd be a totally efficient process, with one surface rendered per pixel (ignoring transparent surfaces) and no layers of polygons being drawn over the same pixel obscuring each other. In that respect it sounds like a good system. Is there any way to render polygons per pixel based on a ray-tracing selection? I guess standard GPU evolution is such that it's faster to draw everything than to be carefully selective.

The disadvantage of raytracing is that for every single pixel you need to iterate over every object and find out which one is closest to the screen. As you can imagine, this can be pretty slow. There are ways to speed this up, however.

With raytracing, the most efficient techniques use KD-trees to split the scene into sections, trying to isolate sections that contain no polygons and splitting the others based on some heuristic (number of polygons, size, number of splits, etc.). For any given section, you iterate over every object in it to determine which is closest to the camera. Once that's done, you send out secondary and tertiary rays for things like refractions/reflections.
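The "closest hit" loop at the heart of this can be shown in miniature. This is a brute-force primary-ray test against spheres, i.e. exactly the per-pixel cost a KD-tree is meant to avoid; the function names are made up for the sketch.

```python
import math

def hit_sphere(origin, direction, center, radius):
    """Return distance t to the nearest intersection, or None.
    direction is assumed normalized (so the quadratic's a = 1)."""
    oc = [origin[i] - center[i] for i in range(3)]
    b = 2 * sum(oc[i] * direction[i] for i in range(3))
    c = sum(x * x for x in oc) - radius * radius
    disc = b * b - 4 * c
    if disc < 0:
        return None
    t = (-b - math.sqrt(disc)) / 2
    return t if t > 0 else None

def closest_hit(origin, direction, spheres):
    """Brute force: test every object, keep the nearest hit."""
    best = None
    for center, radius in spheres:
        t = hit_sphere(origin, direction, center, radius)
        if t is not None and (best is None or t < best):
            best = t
    return best

spheres = [((0, 0, 5), 1.0), ((0, 0, 10), 2.0)]
print(closest_hit((0, 0, 0), (0, 0, 1), spheres))  # 4.0 (front sphere wins)
```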

Some people have done work on real-time raytracing hardware, and have gotten exciting results. Look here:

http://graphics.cs.uni-sb.de/SaarCOR/

and here for the pdf presented last year:

http://graphics.cs.uni-sb.de/~woop/rpu/RPU_SIGGRAPH05.pdf

Nite_Hawk
 
You can also use a z-buffer hidden-surface-removal algorithm instead of shooting the first ray to find the closest object at each pixel. From there you then shoot the "shadow ray" and the rays for refraction and reflection.
 
Bigus Dickus said:
Yeah, I think the bottom line is that overdraw creates the relationship between fillrate and draw distance. Overdraw, as described just above, is when you draw (render) a distant object, then draw one a little closer that obscures it, then one even closer that obscures them both, then one really close... etc. So even with a fixed screen resolution (and a fixed number of actual display pixels), the GPU may be attempting to draw many pixels per actual display pixel, with only the closest opaque one being seen.

Modern GPUs have "smart" hardware that attempts to limit overdraw; all of it is effective to some degree, but none of it eliminates the problem entirely. So when you have a greater draw distance, you increase the chances of objects at varying distances from the camera obscuring one another. Thus fillrate and draw distance are related: a greater fillrate allows more overdraw headroom, which allows for more obscuring geometry, which in general allows for greater draw distances.

Note too that the type of game and the style/artwork have a big impact on draw distances. Lots of trees and tons of characters on screen put a much heavier overdraw load on the GPU (and thus limit draw distance) than a grassless plain with a few tanks off in the distance.
Great! OK, now I get it. So a fixed resolution, i.e. 720p, means nothing without the fillrate to back it up. And with that (720p) resolution in conjunction with a poor fillrate, you either have a simple environment with a huge draw distance or a complex area with a short one. Silent Hill 3 was a great-looking game with the shadows and all, but the draw distance lacked. Now, I knew it had something to do with the shadows, I just didn't know how. With all of those shadows and extra layers of animated textures, I see why. Thanks for the def.
 
pixelbox said:
Great! OK, now I get it. So a fixed resolution, i.e. 720p, means nothing without the fillrate to back it up. And with that (720p) resolution in conjunction with a poor fillrate, you either have a simple environment with a huge draw distance or a complex area with a short one. Silent Hill 3 was a great-looking game with the shadows and all, but the draw distance lacked. Now, I knew it had something to do with the shadows, I just didn't know how. With all of those shadows and extra layers of animated textures, I see why. Thanks for the def.

Silent Hill games can afford to have very short draw distances because they're designed that way. Under those constraints, the developers could focus on making the character models VERY high-poly for the time and very detailed, and on adding very cool effects like soft self-shadowing, animated textures, and generally higher-res textures than most other PS2 games.
 
london-boy said:
Silent Hill games can afford to have very short draw distances because they're designed that way. Under those constraints, the developers could focus on making the character models VERY high-poly for the time and very detailed, and on adding very cool effects like soft self-shadowing, animated textures, and generally higher-res textures than most other PS2 games.
YEAH, YEAH. I'm not knocking it. I used to show it off as a means to display what the PS2 can do. But I have one more question. When people talk about bandwidth figures (22.5 GB/s), they sometimes have that number cut into two pieces (read 20, write 2.5). What does that mean?
 
Data can flow both ways between components (sometimes). For CPU <> RAM, a BW of 22.5 GB/s means you can do 22.5 GB/s of CPU reads from RAM, 22.5 GB/s of writes to RAM, or any combination of reading and writing up to a max of 22.5 GB/s.

Where a BW is divided in two, like 10.8/10.8, it means you've got two separate pipes, each one-way only. That means you can read up to 10.8 GB/s and write up to 10.8 GB/s simultaneously. However, if for one second you are only reading 5 GB/s, the 'leftover' BW can't be used for writing.
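The two bus models above can be sketched as a pair of checks, using the figures from this post. The function names are made up; the point is just that a shared pipe pools its budget while split pipes cap each direction independently.

```python
# Shared pipe: reads and writes draw from one total budget.
def fits_shared(read, write, total=22.5):
    return read + write <= total

# Split pipes: each direction is capped on its own, and leftover read
# capacity can't be lent to writes (or vice versa).
def fits_split(read, write, each_way=10.8):
    return read <= each_way and write <= each_way

print(fits_shared(20.0, 2.5))  # True  (20 + 2.5 fits in 22.5)
print(fits_split(5.0, 10.8))   # True  (each direction within its pipe)
print(fits_split(12.0, 1.0))   # False (reads exceed the 10.8 read pipe)
```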
 