PS2 vs PS3 vs PS4 fillrate

alexsok

I am a bit confused and I'd like the pros here to clarify this for me.

The creator of the GT series of games mentioned in several interviews the following:

“When the [PlayStation 2] came out, one unique characteristic of that system was that the screen fill rate was very fast. Even looking back now, it’s very fast. In some cases, it’s faster than the PS3. There, we were able to use a lot of textures. It was able to do that read-modify-write, where it reads the screen, you take the screenshot, and you modify it and send it back. It could do that very quickly.

“I don’t know if anybody remembers, but when the PS2 first came out, the first thing I did on that was a demo for the announcement. I showed a demo of GT3 that showed the Seattle course at sunset with the heat rising off the ground and shimmering. You can’t re-create that heat haze effect on the PS3 because the read-modify-write just isn’t as fast as when we were using the PS2. There are things like that. Another reason is because of the transition to full HD.”

And then in a more recent interview it was said that:

Yamauchi has a soft spot for the PS2 iterations of GT - with its exceptionally fast fill-rate, they were able to achieve feats not possible on the PS3 or even, he claims, the PS4.

Considering the fact that the PS3 was much more powerful than the PS2 and the PS4 is far more powerful than the PS3, how does this whole fillrate issue factor in here? This whole "read-modify-write" part of the equation can't be done on a PS3 or even a PS4?

Doesn't the PS3 play PS2 games, and isn't it able to display the same effects that are supposedly unachievable on it?
 
The fill rate of the PS3 for pixels and textures isn't inferior to the PS2's. It's dedicated bandwidth that the G70-based RSX was starved of. The GPU in the PS2 was intentionally designed with wide internal data paths and fast embedded memory to offset the lack of a programmable pixel pipeline, so developers could use conventional rendering methods, burning a lot of bandwidth, to achieve all the advanced effects.
This is a very different approach from how graphics technology has evolved today, and it's why Sony didn't bother with emulating PS2 titles on PS3 once they removed the dedicated hardware.

As for the PS4 -- it shouldn't be a problem, but using such an outdated approach would be a waste of resources anyway, even on a modern GPU beast.
 
He means that compared to the resolution the PS2 ran at (720x480 for GT3 I think), it had amazing bandwidth with its 50GB/s 4GB low-latency EDRAM dedicated to video, and it had no real restrictions afaik either.

PS4 on the other hand has 172GB/s for all of its memory, but also typically 1920x1080 pixels to fill. That is about 3.5x the bandwidth for 7x the pixels.
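A quick back-of-the-envelope check of that ratio, using the figures quoted in this post (just a sketch, not official specs):

```python
# Bandwidth-per-pixel comparison using the figures quoted above.
ps2_bw_gb = 50.0                 # GB/s of PS2 EDRAM bandwidth
ps2_pixels = 720 * 480           # typical PS2 output resolution (GT3)

ps4_bw_gb = 172.0                # GB/s of unified PS4 memory bandwidth (figure quoted above)
ps4_pixels = 1920 * 1080         # typical PS4 output resolution

print(ps4_bw_gb / ps2_bw_gb)     # ~3.4x the bandwidth
print(ps4_pixels / ps2_pixels)   # 6x the pixels (closer to 7x at typical sub-720x480 PS2 render resolutions)
print((ps2_bw_gb / ps2_pixels) / (ps4_bw_gb / ps4_pixels))  # PS2 had ~1.7x more bandwidth per output pixel
```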
 
First of all, this mysterious "read-modify-write" = alpha blending :)

It seems that, in general, fill rate (ROP rate) is the most misunderstood part of graphics rendering performance. Increased maximum fill rate only helps when you are not memory BW, ALU, TMU or geometry bound. In modern games, most draw calls you submit are bound by these four things instead of fill rate. This is because we have moved from simple (vertex-based) gouraud shading to sophisticated per-pixel lighting and material definition. Pixel shaders that do more than 100 operations per pixel are quite common nowadays.

In our latest Xbox 360 game we were only fill bound in three cases: shadow map rendering, particle rendering and foliage rendering. Infinite fill rate would make these steps around 10%-25% faster with our shaders, with a total impact of less than 5% on the frame rate.

Pure fill rate is no longer the bottleneck for modern (next gen) particle rendering, as particle rendering has gotten much more sophisticated. Modern games do complex per-pixel lighting on particles and output particles to a HDR render target. This means that the particle pixel shader samples multiple textures (color and normal map at least) per pixel, increasing the TMU usage and the BW usage. Lighting uses lots of ALU instructions. The more lights you have, the more expensive the shader becomes. Soft particles also fetch the depth data (uncompressed read of 32 bits per pixel) = quite a bit of extra BW cost (+ TMU cost). Blending to the 64 bit HDR back buffer eats a lot of bandwidth.

In comparison, the PS2 was designed for high fill rate. This was possible because each pixel did only a very simple ALU operation and only accessed one texture (no programmable pixel shaders were supported). Thus, by design, your "shader" was never ALU bound or TMU bound. The most common texture format was a 256 color paletted texture; everything else was slow. For each outputted pixel the GPU sampled exactly one of these (low bit depth) textures. And it didn't support any fancy filtering (anisotropic) that requires multiple TMU cycles to complete. So it was never TMU or BW bound, as long as all your textures (and your render target) fit in the 4 MB EDRAM. The most common render target format was the low precision 16 bit (565) format.

In comparison, modern 64 bit HDR particle rendering requires 4x more bandwidth per pixel, and if you also take the resolution increase into account, the back buffer BW requirement for particle rendering is over 20x in modern games. Not even modern BW monsters such as the Radeon 7970 GE can reach their full fill rate on particle rendering, because BW becomes a limit halfway there. 64 bit HDR blending with 32 ROPs at 1000 MHz requires 512 GB/s of BW (and the card "only" has 288 GB/s). So there's no performance benefit from increased GPU fill rate until the BW problem is solved.
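For what it's worth, the arithmetic behind that 512 GB/s figure is simple enough to sketch (assuming one read and one write of the 64 bit render target per blended pixel, and ignoring any compression or caching):

```python
# Bandwidth needed to sustain peak ROP rate while blending into a 64 bit
# (FP16 RGBA) render target: each blended pixel reads 8 bytes of destination
# colour and writes 8 bytes back.
rops = 32
clock_hz = 1000e6            # 1000 MHz
bytes_per_pixel = 8 + 8      # 64 bit read + 64 bit write

required_bw = rops * clock_hz * bytes_per_pixel
print(required_bw / 1e9)     # 512.0 GB/s -- against ~288 GB/s on a Radeon 7970 GE
```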

PS3 and PS4 can definitely exceed the fill rate of the PS2. You can get quite a high fill rate if you are willing to go back to gouraud-shaded rendering with a single 256 color texture (or DXT1 compressed texture) on each particle/object, and you perform no per-pixel calculations. However, I am personally much happier with the particles I see in next gen games. The particle counts (and overdraw) don't need to be that high when you have sophisticated particle lighting and soft particle rendering (making the particles look volumetric instead of like textured billboards floating in the air). For example, smoke in recent games (BF4) looks incredible when missiles pass through it (lighting the smoke in a realistic way).
 
Kaz is just talking about frame buffer effects though, I think - the heat haze effect just blurred lines and shifted them left and right, a post-process that directly moved pixels in the frame buffer.

I don't disagree though that modern effects on next-gen can look way cooler. Just saying he's not wrong and explaining where he is coming from.
 
PS4 on the other hand has 172GB/s for all of its memory, but also typically 1920x1080 pixels to fill. That is about 3.5x the bandwidth for 7x the pixels.
The difference is that the PS4 isn't constrained to doing effects with fill rate alone; it has considerable compute resources to leverage as well. The PS2 could do some simple computations with alpha blending, but that is quite limited, and the GS lacked some blending modes as well IIRC. I don't think anyone's seriously arguing the PS4 is limited compared to the PS2 in any way, considering even the early games out for PS4 right now. :)
 
I've tried to answer the question; I don't think anyone thinks the PS4 won't be able to output better graphics than the PS2 ...
 
In comparison, modern 64 bit HDR particle rendering requires 4x more bandwidth per pixel, and if you also take the resolution increase into account, the back buffer BW requirement for particle rendering is over 20x in modern games. Not even modern BW monsters such as the Radeon 7970 GE can reach their full fill rate on particle rendering, because BW becomes a limit halfway there. 64 bit HDR blending with 32 ROPs at 1000 MHz requires 512 GB/s of BW (and the card "only" has 288 GB/s). So there's no performance benefit from increased GPU fill rate until the BW problem is solved.

Isn't that where high bandwidth caches come in, though? The 290 doubles the 7970 GE's fill rate while only increasing bandwidth by 11%, so there must be some benefit.
 
I heard something (probably in a thread here on B3D) about modern GPUs writing framebuffer pixels out to their L2, essentially letting it act as an on-chip tile buffer. That should help alleviate bandwidth constraints when blending multiple layers and such... or so one might hope anyway. :)
 
A writable (i.e. coherent) L2 is a relatively new feature for GPUs, which previously had only a read-only texture cache. Traditionally, small dedicated color/depth caches were implemented to speed up frame-buffer ops and some compute tasks. AMD still retains those structures alongside the larger coherent L2; not sure about NV.
 
Kaz certainly isn't talking about overall visual quality. Clearly he knows that both the PS3 and PS4 destroy the PS2 in terms of performance. It's very easy in this discussion to get carried away and talk about the areas where the PS4 and PS3 obliterate the PS2, but that's beside the point. Kaz is saying that, relative to the performance the PS2 offered in its time, it had fill rate advantages that the newer consoles, relative to the performance they offer for their time, cannot replicate unless (probably) drastic sacrifices are made to overall visual quality.

GT5 clearly looks better than GT4, but there is a reason why GT5's visuals did not leave room for the heat haze effect. Even the water, smoke and fog effects were a pixelated mess that was not in line with the expected quality of its generation, because of the PS3's bandwidth limitations, and that despite the PS3 being more powerful than the PS2. The PS2 did not have this issue with fog/water spray effects despite its high quality visuals for its time and weaker specs. That's one area where Kaz would have preferred the PS3 to have an abundance of performance, so it could render those effects with better fidelity.
 
Yes, exactly ... But the weather effects and the issues with smoke and overdraw have been solved in a different way in GT6, very nicely ... obviously it was harder, but it will look better in the end, with lighting on smoke, depth of field, etc.
 
Considering the fact that the PS3 was much more powerful than the PS2 and the PS4 is far more powerful than the PS3, how does this whole fillrate issue factor in here? This whole "read-modify-write" part of the equation can't be done on a PS3 or even a PS4?

Read-modify-write refers to framebuffer operations; those were not only alpha blending, you could set up some more (at the time) advanced settings, if I recall correctly.

ps2 fillrate was about 2.4GPixel/s if I recall correctly. with rmw @ 32bit/pixel you need about 40GB/s (read 32bit color, write 32bit color, read 32bit zbuffer, write 32bit zbuffer) and that's exactly what the edram delivered, so you could go bananas as far as rop operations go without worrying in any way about performance. I think that's the point of his statements. (btw. you could enable sampling of a 32bit texture, which would take 10GB/s, which is what the edram has dedicated to the TMUs; that's how you end up with 50GB/s edram performance).

on ps3 you have 8 rops @ 550MHz -> 4.4GPixel/s fillrate -> 70GB/s requirement if you did it the same way the ps2 did; sadly you only have 20GB/s, so you cannot keep up with the ps2.

on ps4 you have 32 rops @ 800MHz -> 25.6GPixel/s fillrate; if you go for hdr/64bit targets, you'd actually need around 512GB/s peak.

another point:
the resolution on ps2 was 400p, that's just ~256k pixels; with 2.4GP/s you can have 156x overdraw while running @ 60Hz
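a sketch of the arithmetic in this post, with the per-pixel byte counts spelled out the way I read them (32 bit colour read+write plus 32 bit Z read+write for the ps2/ps3 cases; 64 bit colour read+write plus a 32 bit Z read for the ps4 hdr case -- treat those assumptions as mine, not gospel):

```python
def rmw_bw_gb_s(fillrate_gpixel_s, bytes_per_pixel):
    """GB/s needed to sustain a given fill rate with full read-modify-write."""
    return fillrate_gpixel_s * bytes_per_pixel

# 32 bit colour read + write and 32 bit Z read + write = 16 bytes per pixel.
print(rmw_bw_gb_s(2.4, 16))    # PS2 GS:  ~38 GB/s -- roughly the ~40 GB/s the EDRAM gave the ROPs
print(rmw_bw_gb_s(4.4, 16))    # PS3 RSX: ~70 GB/s -- against ~20 GB/s of GDDR3
# 64 bit HDR colour read + write plus a 32 bit Z read = 20 bytes per pixel.
print(rmw_bw_gb_s(25.6, 20))   # PS4:     512 GB/s -- far more than its unified memory provides

# Overdraw headroom on PS2 at its native resolution:
pixels = 640 * 400             # "400p" as above, ~256k pixels
print(2.4e9 / (pixels * 60))   # ~156x overdraw per frame at 60 Hz
```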


so, let me go into rage mode and play 'advocate' for a ps2-like 'next gen' architecture:
if you wonder what someone would do with that kind of primitive rendering (no pixel shaders), look no further than Pixar. Reyes rendering does not utilize pixel shaders; everything is calculated per vertex, as on the ps2. if Sony had gone in that direction, you could by now have Toy Story-like image quality.
while you have 'advanced' texture filtering now, reyes-style rendering would probably sample primitive bilinear textures, and to compensate for that, you'd end up rendering in super-sampled mode, just like movies do. some would call that 'wasting performance', but if you look at the motion and shader aliasing some 'next gen' games produce, texture filtering seems to be the least significant source of aliasing.
</ragemode>

in reality, the problem is that it is very risky to go that way. if you swim with the swarm, you might not be the alpha animal, but chances are high you survive and stay in good shape (a pc architecture on consoles that everybody knows how to use). if you go the lone-wolf route like sony did with the ps2, you might be lucky as they were and end up running circles around your competition (regarding sales), but if you are unlucky and publishers would rather go for ports than unique games, there will be no games for your platform -> eol.
 
it had amazing bandwidth with its 50GB/s 4GB low-latency EDRAM dedicated to video, and it had no real restrictions afaik either.

I'm sure that's a typo, but imagine a PS2 with 4 GB EDRAM... in fact, even a PS4 would be kinda interesting with that much ;)
 
if you wonder what someone would do with that kind of primitive rendering (no pixel shaders), look no further than Pixar. Reyes rendering does not utilize pixel shaders; everything is calculated per vertex, as on the ps2. if Sony had gone in that direction, you could by now have Toy Story-like image quality.

Er, that just moves the ALU intensive stage to a different part of the pipeline. Or with a unified architecture, it's just semantics really - the ALUs will do the work anyway, it's just much more efficient to do it using fragments and deferred shading.

while you have 'advanced' texture filtering now, reyes-style rendering would probably sample primitive bilinear textures, and to compensate for that, you'd end up rendering in super-sampled mode, just like movies do. some would call that 'wasting performance', but if you look at the motion and shader aliasing some 'next gen' games produce, texture filtering seems to be the least significant source of aliasing.

Er, PRMan is the master of texture filtering. Seriously, just because they shade by micropolygon grids, it doesn't mean that they don't need to filter the texture stuff.

Also, PRMan is no longer the absolute best renderer; it had a lot of trouble adapting to raytracing, and it still requires a lot of manual setup work to build various acceleration structures and caches (point clouds, brick maps, shadow maps etc).

Most VFX and CG animation studios now prefer fully raytraced renderers like Arnold. They require fewer man-hours to set up, and man-hours are much more expensive than extra CPUs.

Realtime rendering is of course a different animal altogether...
 
Er, that just moves the ALU intensive stage to a different part of the pipeline. Or with a unified architecture, it's just semantics really - the ALUs will do the work anyway, it's just much more efficient to do it using fragments and deferred shading.
'efficiency' comparisons between different architectures are a bit problematic.
you solve problems in a different way on a unified architecture.
e.g. if you were drawing 'next gen' volumetric particles on a ps2 architecture (a la panta ray), you'd render bazillions of slices over and over again. on 'unified architectures' you render one slice but sample bazillions of times in a loop.
now, which one is more efficient?
on ps2, games had completely different limitations. I think I read somewhere that GTA3 didn't even have backface culling, as fillrate was never an issue. when they ported it to PC, they had quite some fillrate issues.
let's move it to some current-gen tech. let's say you voxelize scenes for voxel cone tracing. afaik nobody does it on 'next gen', as it is too slow to voxelize. if you had a GS architecture with an embedded framebuffer, you could render those 3d gbuffers like you don't care. (sure, then you'd end up with cone tracing being the issue, but that's what I'm trying to imply: it's not 'better' or 'worse', it's just completely different).

Er, PRMan is the master of texture filtering. Seriously, just because they shade by micropolygon grids, it doesn't mean that they don't need to filter the texture stuff.
they don't do anisotropic filtering the way your gpu does. that's wrong if you want high quality renderings. you need to sample every texel, do all the math you do in a shader and then combine the results. the high filtering quality is a result of super sampling rather than texture functions. yes, you can get away with that kind of filtering on albedo textures (and that's mostly what they sampled back then, when mipmaps were also invented), but that's it. for normals/displacement mapping you get wrong results: two normals oriented 90 degrees apart might end up with completely different shading than an averaged sample would, no matter how advanced the texture filtering you set. that's one reason why reyes-style rendering tessellates objects to insanity: to get a high enough coverage of input samples.
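a tiny numerical illustration of that normal-map point (hypothetical numbers, just to show that filtering normals before lighting is not the same as filtering the lit result):

```python
import math

def normalize(v):
    length = math.sqrt(sum(x * x for x in v))
    return tuple(x / length for x in v)

def lambert(n, l):
    # Simple clamped N.L diffuse term.
    return max(0.0, sum(a * b for a, b in zip(n, l)))

light = (0.0, 0.0, 1.0)

# Two texels whose normals are 90 degrees apart (e.g. a sharp bump edge).
n1 = normalize((1.0, 0.0, 1.0))
n2 = normalize((-1.0, 0.0, 1.0))

# (a) Filter the normals first (what hardware texture filtering does), then shade once.
filtered = normalize(tuple((a + b) / 2 for a, b in zip(n1, n2)))
print(lambert(filtered, light))                        # 1.0

# (b) Shade every texel, then average the results (what supersampling converges to).
print((lambert(n1, light) + lambert(n2, light)) / 2)   # ~0.71
```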

Also, PRMan is no longer the absolutely best possible renderer, it had a lot of trouble adapting to raytracing and it still requires a lot of manual setup work to build various acceleration structures and caches (point clouds, brick maps, shadow maps etc)
you are right on this point, yet isn't it sad we cannot achieve an aliasing-free image even compared to this outdated tech?

Most VFX and CG animation studios now prefer fully raytraced renderers like Arnold. They require fewer man-hours to set up, and man-hours are much more expensive than extra CPUs.

Realtime rendering is of course a different animal altogether...
and comparing PS2-like rendering with 'unified shader pipelines' is like comparing either of those two to path tracing. you have completely different ways of approaching problems, and a completely different set of issues and non-issues. you could argue 'path tracing is not as efficient', but then try to create an unbiased image of at least the same quality... you won't be more efficient using rasterization.


and I'm not saying "that's how it would be", 'cause there is no hardware to do all the nice stuff we might do if it were here. I (alone) cannot come up with all the fancy things we might be doing if we had a crazy-fast rasterization/fillrate monster instead of the current high-compute monsters with rasterization/fillrate limits.

just saying, it's a completely different kind of thinking. and the Gran Turismo guy (like so many old-school ps2 devs) liked those nice parts of it.
 
Read-modify-write refers to framebuffer operations; those were not only alpha blending, you could set up some more (at the time) advanced settings, if I recall correctly.

According to the documentation you can do (A - B) * C + D, where A, B, and D are source/dest color or 0, and C is source/dest alpha or a constant.
I'm looking at the GS manual now and there at least isn't anything beyond what's called alpha blending. It's more flexible than the traditional LERP between source and dest using source alpha or a constant (which is all you get on say, Nintendo DS), and it at least has what's necessary for PS1 compatibility, but I wouldn't consider it particularly more advanced than just saying it has alpha blending.
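For reference, a sketch of that blend equation in plain code (a simplified model with float channels; the real GS works on fixed-point values and applies clamping, but the selector structure is as described above):

```python
# Simplified model of the GS blend equation: out = (A - B) * C + D, where
# A, B and D each select source colour, destination colour or zero, and
# C selects source alpha, destination alpha or a fixed value.
def gs_blend(src, dst, src_a, dst_a, fix,
             a_sel="src", b_sel="dst", c_sel="src_a", d_sel="dst"):
    colours = {"src": src, "dst": dst, "zero": (0.0, 0.0, 0.0)}
    coeffs = {"src_a": src_a, "dst_a": dst_a, "fix": fix}
    A, B, D = colours[a_sel], colours[b_sel], colours[d_sel]
    C = coeffs[c_sel]
    return tuple((a - b) * C + d for a, b, d in zip(A, B, D))

# The default selectors give the traditional lerp: src * alpha + dst * (1 - alpha).
print(gs_blend((1.0, 0.0, 0.0), (0.0, 0.0, 1.0), src_a=0.5, dst_a=1.0, fix=1.0))
# (0.5, 0.0, 0.5) -- a 50% blend of red over blue
```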
 
'efficiency' comparisons between different architectures are a bit problematic.

I was talking about the efficiency of shading per vertex vs. per fragment. On the current GPU architectures, it's much more efficient to keep the number of vertices relatively low (because of the quads) and use deferred shading (shading complexity, overdraw etc)

PRMan has the luxury of a very, very efficient hidden surface removal system and bucketing, so shading per grid point makes sense. It also allows high quality fast motion blur and depth of field, and also cheap displacement mapping. These are all required for movie visual effects and 20+ years ago REYES was the only way to get them.

But as I've already mentioned, we're already at a point where standard raytracers can also do high scene complexity, motion blur and still keep reasonable rendering times - with the added bonus of raytraced shadows, reflections and global illumination. So even REYES can be challenged.


In the end, my points are:
- Pixar's approach is not comparable to the PS2 at all
- Pixar's approach is no longer king anyway
- current GPU architecture has evolved to support the most commonly accepted and used approaches anyway
 
According to the documentation you can do (A - B) * C + D, where A, B, and D are source/dest color or 0, and C is source/dest alpha or a constant.
I'm looking at the GS manual now and there at least isn't anything beyond what's called alpha blending. It's more flexible than the traditional LERP between source and dest using source alpha or a constant (which is all you get on say, Nintendo DS), and it at least has what's necessary for PS1 compatibility, but I wouldn't consider it particularly more advanced than just saying it has alpha blending.
when I was helping some guys on the GS emulation side of a PS2 emulator, this alone was already enough of a headache to get properly emulated (back then). on top of that (if I recall correctly) there is a "mask" you could set that allows you to select which individual bits of a pixel are written to the framebuffer (another headache for the emulator; in GL you can only set a bool per channel, and quite some games seem to use individual bits to mask areas or flip sign bits or something... it was way too long ago when I was helping out).
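a sketch of the per-bit destination mask being described, next to GL-style per-channel masking (the bit convention here, 1 = protected, is how I remember the GS behaving, so treat it as an assumption):

```python
# Per-bit framebuffer write mask: mask bits set to 1 protect the corresponding
# destination bits, so only the bits where the mask is 0 take the source value.
def masked_write(dst_pixel, src_pixel, mask):
    return (dst_pixel & mask) | (src_pixel & ~mask & 0xFFFFFFFF)

# Example: protect the destination's alpha byte (bits 24-31) while writing RGB.
dst, src = 0x80FF00FF, 0x00123456
print(hex(masked_write(dst, src, 0xFF000000)))   # 0x80123456

# GL-style masking (glColorMask) can only protect whole channels, which is
# why arbitrary bit masks were a headache to emulate.
```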
 
I was talking about the efficiency of shading per vertex vs. per fragment. On the current GPU architectures, it's much more efficient to keep the number of vertices relatively low (because of the quads) and use deferred shading (shading complexity, overdraw etc)
that's like a self-fulfilling prophecy.
you are already so used to what you expect from gbuffer techs that you imply our last gen hardware is the best for it. it's actually the other way around: most engines/games switched to deferred/g-buffer techs for ps3/xbox360, as you could not get lighting the way you'd done it previously on PC. before that generation, there was barely any game using that tech (I think it was only STALKER).
I cannot speak for everyone in the industry, but at least in our company, we didn't switch to this glorious solution with happy faces. on the contrary: we had to drop AA, drop shading variation (anisotropic materials, backface lighting etc) and proper lighting of transparent objects, etc. (e.g. our PC shader database would not have fit into the main memory of the PS3.)

This generation, I'm quite sure, engines/games will move away from gbuffer tech again. forward rendering just makes so much more sense if you aim for consistent antialiased rendering, and you won't get punished with 6 cycles and bad divergence behaviour if you branch.
moving one step further, games will start to use tessellation more and more, and moving to higher triangle counts, it will be only logical to do shading in hull and domain shaders, passing as few interpolators to fragments as possible. as a bonus, you'll get more stable (less aliased) shading, as vertices stay quite similar across several frames, unlike fragments, which depend every frame on the exact sample position.

PRMan has the luxury of a very, very efficient hidden surface removal system and bucketing, so shading per grid point makes sense. It also allows high quality fast motion blur and depth of field, and also cheap displacement mapping. These are all required for movie visual effects and 20+ years ago REYES was the only way to get them.
you are perfectly right; that's the point I've been trying to make the whole time.
for every requirement you have a dedicated solution, but it's easy to mix up requirement and solution. just as gbuffers were/are a solution to the hardware we had last gen, it's not that the hardware was designed to make gbuffer rendering fast. nobody asked for that when last gen was designed.
and Pixar had different goals. it wasn't about subpixel-perfect lighting, it was about those effects that you mentioned. imagine we had had the same goals for games: aliasing-free rendering in motion. nobody would use gbuffers; we'd have to sacrifice lighting quality and 'realtime' and instead bake most things.
there are games that have those 'different' goals. Rage runs absolutely smooth at 60Hz on my PS3.
And look at GT5: 1080p with 2xMSAA (not full width though) @ 60Hz. I cannot imagine they ever thought 'this hardware was made with gbuffers in mind', and if they had a choice, they'd probably drop the ps3 RSX design and go for a more ps2 GS-like design.
you can see that even more in their pre-rendered intro trailer: http://www.gamersyde.com/download_gran_turismo_6_launch_trailer-31263_en.html
it has quite high quality motion blur (no game has that, not even in its in-game cutscenes), they have very good antialiasing, and... well... look at the lighting: it looks baked, light sources are frequently just sprites, and 'worst' of all, even in the pre-rendered trailer it looks LDR.

But as I've already mentioned, we're already at a point where standard raytracers can also do high scene complexity, motion blur and still keep reasonable rendering times - with the added bonus of raytraced shadows, reflections and global illumination. So even REYES can be challenged.
and yet, after using raytracing for Cars, Pixar dropped it again for years, as it increased frame times dramatically while adding only a minor difference to the frame (call me out on it and I will try to find the source for that).

In the end, my points are:
- Pixar's approach is not comparable to the PS2 at all
- Pixar's approach is no longer king anyway
- current GPU architecture has evolved to support the most commonly accepted and used approaches anyway
my points are
- Reyes-style rendering would be a good fit on a ps4 with the design of the ps2. 3 billion transistors (60x more than the ps2) for 8x the pixel count might have ended up being enough power to rasterize the scene with a Reyes approach.
- Pixar's approach still delivers image quality that far exceeds any game in terms of anti-aliasing.
- current GPU architecture evolves 5 years before it gets released; its design is frozen long before the software tech that's going to be used on it is developed. it's a bet that it will be useful, and that's why software solutions should be seen as 'made for the hardware'. if the hardware were PS2-like, we'd write software as we did for the ps2 (and maybe Reyes); we'd pick other low-hanging fruit as we did on the ps2 and avoid problematic cases, just as we do now in so many (other) cases.
 