Anti-aliasing Without EDRAM in NVidia's PS3 GPU...

MfA said:
API, developer inertia, compatibility. On the PC the direct mode renderers are what the APIs and software are designed for; when your competition sets the terms you compete on, things are not always easy.

Oh, I didn't know there was so much dependence on the API/software. I thought it was mostly just a hardware implementation of how to handle geometry processing.
 
SimonF said:
Reading back how? If you mean with software then you'll see the system grind to a halt.
I think the better question would be "reading back where". From my perspective, every Z-read that my code explicitly initiates is "software" :p

That said, I'd argue that a good two-thirds of post-processing effects manipulate the Z-buffer in some manner or another. On some architectures even more than that (like when you are forced to invert the Z-buffer manually because the Z-test only supports one-way comparators).

Anyway - to me it's just plain common sense that each tile can output a Z-buffer after it's been completed. Just make sure the output also lands in unified graphics memory, and we've got all that's needed for a layer-rendering abstraction, complete with all the post-processing fun 8)
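
Roughly the kind of Z-based post-processing I mean, as a minimal CPU-side sketch (assuming the resolved Z output has landed somewhere readable in unified memory; the names, depth range and fog maths are illustrative, not any particular hardware's):

/* Minimal sketch of a Z-dependent post-process, assuming the tile (or
 * whole frame) has already written its Z-buffer out to unified memory
 * where a later pass can read it. All names and constants are
 * illustrative placeholders. */
#include <stdint.h>

#define NEAR_PLANE 1.0f
#define FAR_PLANE  1000.0f

/* Recover eye-space distance from a normalised [0,1] depth value. */
static float linearize_depth(float z)
{
    return (NEAR_PLANE * FAR_PLANE) /
           (FAR_PLANE - z * (FAR_PLANE - NEAR_PLANE));
}

/* Blend a fog colour over the colour buffer, weighted by depth. */
void depth_fog_pass(uint32_t *color, const float *depth,
                    int width, int height, uint32_t fog_rgb)
{
    for (int i = 0; i < width * height; ++i) {
        float dist = linearize_depth(depth[i]);
        float f = dist / FAR_PLANE;          /* 0 = near, 1 = far */
        if (f > 1.0f) f = 1.0f;

        /* Lerp each 8-bit colour channel toward the fog colour. */
        uint32_t c = color[i];
        uint32_t out = c & 0xFF000000u;      /* keep the top byte as-is */
        for (int shift = 0; shift <= 16; shift += 8) {
            uint32_t src = (c >> shift) & 0xFF;
            uint32_t fog = (fog_rgb >> shift) & 0xFF;
            uint32_t mix = (uint32_t)(src + f * ((float)fog - (float)src));
            out |= (mix & 0xFF) << shift;
        }
        color[i] = out;
    }
}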
 
nAo said:
2) If the NVIDIA GPU does not have a big pool of embedded memory, then performance-wise R500 could be faster when AA is on.
That would be assuming that you're GPU bandwidth bound.
There are a lot of possible rendering scenarios, especially on an embedded platform, where you can have MSAA "for free".

MfA said:
This isn't the same situation as with the poorly conceived GS. ATI's eDRAM chip is unlikely to be lacking features ...
Why wouldn't eDRAM be included at the expense of something else, in this case? (if the XeGPU indeed has eDRAM)
For instance, ERP thinks that even ArtX's Flipper is paying a big price, transistor-wise, for its eDRAM:
http://www.beyond3d.com/forum/viewtopic.php?p=445308#445308
 
The percentage of the die it takes up has gone down ... sure, if you don't put it in you could put in more pipelines, but I doubt they would actually change the pipelines themselves much. Plus they don't need to worry about framebuffer compression, which saves a few transistors (but not a huge deal).

Without eDRAM you might be optimized for the long shaders everyone raves about, but you can't solve every problem with a longer shader. Dynamic inter-object shadows, volumetric effects, particle systems ...

Of course I'd still rather see a tiler.
 
IMO you'd still want compression for the stuff in eDRAM - the information computed for compression (specifically: which nearby samples have identical colors) can be used to optimize downsampling and blending.
 
The downsampling is so little work, why worry about it? How does knowing if a neighbouring sample has the same color help in blending?
 
Well, say a fragment (single color) covers 3 samples inside a pixel which is currently fully covered by an opaque fragment (4 samples with identical colors). Doesn't that mean you can do the color blend once (for all 3 samples) instead of 3 times?

Maybe for 4xAA this doesn't give you much, but I'm still hoping for 8x or 16x one of these days...
 
psurge said:
Well, say a fragment (single color) covers 3 samples inside a pixel which is currently fully covered by an opaque fragment (4 samples with identical colors). Doesn't that mean you can do the color blend once (for all 3 samples) instead of 3 times?
That's correct.
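
A minimal sketch of that idea, under illustrative assumptions (the structures, the all-samples-equal flag and the plain source-alpha "over" blend are placeholders, not any shipping compression scheme):

/* If the compression metadata says all existing samples in a pixel share
 * one colour, an incoming translucent fragment covering several of those
 * samples only needs one blend, with the result replicated. */
#include <stdint.h>
#include <stdbool.h>

#define SAMPLES_PER_PIXEL 4

typedef struct { float r, g, b, a; } Color;

typedef struct {
    Color samples[SAMPLES_PER_PIXEL];
    bool  all_samples_equal;   /* the kind of flag a colour-compression scheme tracks */
} Pixel;

static Color blend_over(Color src, Color dst)
{
    Color out;
    out.r = src.r * src.a + dst.r * (1.0f - src.a);
    out.g = src.g * src.a + dst.g * (1.0f - src.a);
    out.b = src.b * src.a + dst.b * (1.0f - src.a);
    out.a = src.a + dst.a * (1.0f - src.a);
    return out;
}

/* coverage_mask has one bit per sample the incoming fragment covers. */
void blend_fragment(Pixel *p, Color frag, unsigned coverage_mask)
{
    if (p->all_samples_equal) {
        /* One blend, result broadcast to every covered sample. */
        Color blended = blend_over(frag, p->samples[0]);
        for (int s = 0; s < SAMPLES_PER_PIXEL; ++s)
            if (coverage_mask & (1u << s))
                p->samples[s] = blended;
    } else {
        /* Fall back to blending each covered sample individually. */
        for (int s = 0; s < SAMPLES_PER_PIXEL; ++s)
            if (coverage_mask & (1u << s))
                p->samples[s] = blend_over(frag, p->samples[s]);
    }
    /* A real compressor would re-derive its metadata here; this sketch
     * just updates the flag conservatively. */
    p->all_samples_equal = (coverage_mask == (1u << SAMPLES_PER_PIXEL) - 1)
                           ? p->all_samples_equal : false;
}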
 
Unless you are doing it to save power, blending would actually need to be a potential bottleneck for this to matter ... now that might make sense for multisampling, but I don't think it does outside of that. As for multisampling, that could be handled in a more lowbrow way than the compression schemes being used now.
 
I agree that if you're not multisampling it's pointless. The idea also doesn't have much merit if a low number of samples is being used (2x, maybe 4x).

Were you thinking that higher sample counts (16X) should be dropped in favor of some kind of super-sampling (MSAA doesn't address in-shader aliasing)?
 
16x? Even with external DRAM that is a bit optimistic; with eDRAM you would need virtualized memory and be able to spill to external DRAM ... without tiling, that is.

I'm pretty sure no one will do 16x multisampling.
 
MfA said:
16x? Even with external DRAM that is a bit optimistic; with eDRAM you would need virtualized memory and be able to spill to external DRAM ... without tiling, that is.

I'm pretty sure no one will do 16x multisampling.

I'm hoping for 6x "temporal" from the Xenon. I use 6x2 in every game I can. The games really do show a big improvement over regular 6x. Perhaps ATI can perfect it so that 6x3 would work, giving you somewhere between 12x and 18x quality. On my X800 XT PE the flickering gets bad at 6x3, though, so I would think it would take a lot of work. Perhaps a hybrid 4x mode from NVIDIA, with 2x multisampling and 2x supersampling.
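
The mechanism behind "6x2" is just alternating the MSAA sample pattern from frame to frame so the eye averages the two patterns together; a minimal sketch (the offsets below are made-up placeholders, not ATI's actual sparse patterns):

typedef struct { float x, y; } SampleOffset;   /* offsets within a pixel, in [0,1) */

#define SAMPLES  6
#define PATTERNS 2

static const SampleOffset patterns[PATTERNS][SAMPLES] = {
    /* pattern A (even frames) - placeholder positions */
    { {0.10f, 0.30f}, {0.40f, 0.90f}, {0.70f, 0.15f},
      {0.25f, 0.60f}, {0.85f, 0.45f}, {0.55f, 0.80f} },
    /* pattern B (odd frames) - placeholder positions, offset from A */
    { {0.20f, 0.50f}, {0.60f, 0.05f}, {0.90f, 0.70f},
      {0.05f, 0.85f}, {0.45f, 0.25f}, {0.75f, 0.55f} },
};

/* Returns the sample pattern the rasteriser should use this frame. */
const SampleOffset *select_aa_pattern(unsigned frame_index)
{
    return patterns[frame_index % PATTERNS];
}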
 
Personally, I can't tell the difference between 6x MSAA and 6x temporal AA (on the R350). The difference would be even harder to distinguish on a TV, I think.
 
MfA said:
16x? Even with external DRAM that is a bit optimistic; with eDRAM you would need virtualized memory and be able to spill to external DRAM ... without tiling, that is.

I'm pretty sure no one will do 16x multisampling.

R500 seems to use its EDRAM to hold the frame buffer (colour + Z per pixel) but not to hold any AA sample data (slightly simplistic description).

R500 appears to use system RAM to store AA samples. The EDRAM/blending/filtering part of R500 has a latency-hiding buffer designed to feed a pipeline of pixel operations performed against the framebuffer.

In other words, Xbox 360's GPU will necessarily spill AA sample data over into system RAM.

The combination of:

- pipelined blending/filtering operations (minimal latency)
- blending/filtering hardware being on the same chip as the frame buffer

is what gives R500 its extremely high speeds. Well, that's my interpretation.

(Lots of simplifications in this - I just want to point out that, as far as I can tell, Xbox 360 can't do all AA within EDRAM, as a comparison point for possible NVidia architectures.)
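
Some back-of-the-envelope numbers behind that point, as a minimal sketch; the 10 MB pool size, the 720p target and 4 bytes each for colour and Z are assumptions for illustration, not confirmed specs:

/* Does a full multisampled framebuffer fit in a small eDRAM pool?
 * All sizes below are illustrative assumptions. */
#include <stdio.h>

int main(void)
{
    const long width = 1280, height = 720;      /* assumed 720p target       */
    const long bytes_per_sample = 4 + 4;        /* 32-bit colour + 32-bit Z  */
    const long edram_bytes = 10L * 1024 * 1024; /* assumed 10 MB eDRAM pool  */

    for (long samples = 1; samples <= 4; samples *= 2) {
        long fb = width * height * samples * bytes_per_sample;
        printf("%ldx: %ld.%01ld MB framebuffer -> %s\n",
               samples, fb / (1024 * 1024),
               (fb % (1024 * 1024)) * 10 / (1024 * 1024),
               fb <= edram_bytes ? "fits in eDRAM"
                                 : "spills (or needs tiling)");
    }
    return 0;
}

At these assumed sizes only the 1x case (about 7 MB) fits; 2x and 4x overflow the pool, which is the spill/tiling issue being discussed.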

Jawed
 