XBOX 2 GRAPHICS DETAILS EMERGE .. not much actually

Brimstone · Nov 10, 2004

Is it still unknown if the R500 is a tile based deferred renderer?

Megadrive1988 · Nov 10, 2004

^highly doubtful. almost certainly not.

Brimstone · Nov 10, 2004

How else can the Xbox 2 pull off 4x AA at 720p? There is only so much eDRAM they can put on the GPU and only so much overall system bandwidth available from a unified memory architecture.

Megadrive1988 · Nov 10, 2004

Good question, one that I don't really have an answer for. I'm confident ATI has things covered though, with a next-gen Hyper Z and other algorithms.

and hopefully more than 10 MB eDRAM....

Inane_Dork · Nov 10, 2004

Brimstone said:
How else can the Xbox 2 pull off 4x AA at 720p? There is only so much eDRAM they can put on the GPU and only so much overall system bandwidth available from a unified memory architecture.

Considering that current cards already do this, I don't see why it would be so unprecedented for future cards to do the same (without going to TBDR).

function · Nov 10, 2004

Current cards don't have a 10MB+ embedded frame/back buffer though ...

I suppose they could cut the screen up, drop to 32-bit colour when writing out to the buffer or ... just not use 4x AA.

Seriously, the vast majority of Xbox 2 owners, in the entire lifetime of the Xbox 2, won't have HDTV. Spending all those resources doing 1280 x 768 buffers with 4x AA and 8 or 16 times aniso seems like a potentially large misallocation of the console's resources.

Just using a standard 1280 x 768 buffer with aniso would allow them to make great use of HDTVs, and also use the same buffer to create the equivalent of a 640 x 480 supersampled image using the rumoured "high quality video scaler". Everyone's a winner, nothing's really going to waste and the game can be optimised for rendering at one resolution. Good for VGA too.

Brimstone · Nov 10, 2004

Inane_Dork said:
Brimstone said:

How else can the Xbox 2 pull off 4x AA at 720p? There is only so much eDRAM they can put on the GPU and only so much overall system bandwidth available from a unified memory architecture.

Considering that current cards already do this, I don't see why it would be so unprecedented for future cards to do the same (without going to TBDR).

Computer graphics cards have ample amounts of bandwidth because of the dedicated memory bus. A console with a UMA isn't the same thing. After the eDRAM is blown out only so much bandwidth is going to be available.

Farid · Nov 10, 2004

Brimstone said:
How else can the Xbox 2 pull off 4x AA at 720p? There is only so much eDRAM they can put on the GPU and only so much overall system bandwidth available from a unified memory architecture.

If I recall that Xenon diagram correctly, you're not forced to use that EDRAM to store your framebuffers. (We're not even sure it was embedded DRAM; IIRC, it was enhanced DRAM.)

Inane_Dork · Nov 10, 2004

Brimstone said:
Computer graphics cards have ample amounts of bandwidth because of the dedicated memory bus. A console with a UMA isn't the same thing. After the eDRAM is blown out only so much bandwidth is going to be available.

Do you realize how high the cache hit frequency would be with a 10 MB framebuffer cache? A 10 MB cache would drastically drop the memory bandwidth demands of a 44 MB framebuffer (64 bit color, 32 bit Z, 1280x720, 4x MSAA). This is especially the case if you compress the color and Z buffers (which ATi does).

pc999 · Nov 11, 2004

A bit more of the same

http://www.eurogamer.net/article.php?article_id=57141

nAo · Nov 11, 2004

Inane_Dork said:
Do you realize how high the cache hit frequency would be with a 10 MB framebuffer cache? A 10 MB cache would drastically drop the memory bandwidth demands of a 44 MB framebuffer (64 bit color, 32 bit Z, 1280x720, 4x MSAA). This is especially the case if you compress the color and Z buffers (which ATi does).

A 10 MB cache COULD, not would, drastically drop the memory bandwidth requirements. If most of your current working set doesn't fit in the eDRAM you're actually gonna trash a lot of bandwidth. I believe current texture caches are doing pretty well on modern GPUs. eDRAM makes sense as a very big L2 texture cache and/or as a back buffer if you can split the rendering process to make your buffers fit completely in the eDRAM.

ciao,
Marco

akira888 · Nov 11, 2004

nAo:

Didn't Baumann say something about R500 being able to tile the screen a while back? If so then 10MB just might be more than enough.

nAo · Nov 11, 2004

akira888 said:
nAo:

Didn't Baumann say something about R500 being able to tile the screen a while back? If so then 10MB just might be more than enough.

Dunno what Dave wrote, but tiling is a way to split your data working set.

Inane_Dork · Nov 11, 2004

nAo said:
A 10 MB cache COULD, not would, drastically drop the memory bandwidth requirements. If most of your current working set doesn't fit in the eDRAM you're actually gonna trash a lot of bandwidth.

And what application, exactly, would produce such behavior? A game? I can't think of why a game's working set of the framebuffer would be larger than 1/4 of the screen at one time. Keep in mind that 1/4 figure is completely without compression.

nAo · Nov 11, 2004

Inane_Dork said:
nAo said:

A 10 MB cache COULD, not would, drastically drop the memory bandwidth requirements. If most of your current working set doesn't fit in the eDRAM you're actually gonna trash a lot of bandwidth.

And what application, exactly, would produce such behavior? A game? I can't think of why a game's working set of the framebuffer would be larger than 1/4 of the screen at one time. Keep in mind that 1/4 figure is completely without compression.

Just factor together MSAA, FP render targets (there will probably be no compression here), Z-buffer, and so on.
Maybe tiling to 1/4 of the screen will be enough most of the time, but we're not talking about a huge cache here, 'just' a very fast/wide DRAM.

ciao,
Marco

Inane_Dork · Nov 11, 2004

nAo said:
Just factor together MSAA, FP render targets (there will probably be no compression here), Z-buffer, and so on.

I pretty much did.

1280x720 screen
64 bits of color
32 bits of Z
4x MSAA

Do the math and you come up with 44 MB, like I used. I'm not trying to skirt the issue. Factor in how much color and depth compression ATi can do with 4x MSAA and one-quarter of the screen is about the worst case scenario.

Or look at it this way. Cards already have caches for the depth buffer. That must make most to all cases faster. Why would adding eDRAM slow it down?
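
The 44 MB figure above can be checked with a short calculation. The sketch below is only illustrative: the function names are made up for this example, and the 10 MB budget is the rumoured eDRAM size discussed in the thread, taken here as 10 x 2^20 bytes with no compression. It also estimates how many equal screen tiles would be needed for each tile's buffers to fit in that budget.

```python
import math

def framebuffer_bytes(width, height, color_bits, z_bits, samples):
    """Uncompressed size of a multisampled color + Z buffer, in bytes."""
    bytes_per_sample = (color_bits + z_bits) // 8
    return width * height * samples * bytes_per_sample

def tiles_needed(total_bytes, budget_bytes):
    """Smallest number of equal-sized screen tiles whose buffers fit the budget."""
    return math.ceil(total_bytes / budget_bytes)

EDRAM_BUDGET = 10 * 1024 * 1024  # assumption: the rumoured 10 MB, uncompressed

full = framebuffer_bytes(1280, 720, color_bits=64, z_bits=32, samples=4)
print(full)                              # 44236800 bytes, i.e. about 44 MB
print(tiles_needed(full, EDRAM_BUDGET))  # 5
```

Under these assumptions a quarter of the screen comes to about 11 MB uncompressed, so it would only fit in 10 MB once some compression is applied. Dropping to 32-bit colour brings the count down to 3 tiles.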

Guden Oden · Nov 11, 2004

Brimstone said:
Computer graphics cards have ample amounts of bandwidth because of the dedicated memory bus. A console with a UMA isn't the same thing. After the eDRAM is blown out only so much bandwidth is going to be available.

You don't seem to realize a console with UMA isn't going to NEED as much bandwidth to begin with.

Just for starters, screen update speed will be 60 frames/sec absolute max; it will NEVER exceed that. Second, games will generally run in lower resolutions than on a PC. Third, it is presumed games will use many more shaders, and longer and more complicated ones, thus shifting emphasis further away from the bandwidth-burning rasterization side of 3D rendering.

MfA · Nov 11, 2004

As far as the framebuffer is concerned there is no real working set with the way most games work ... there is some spatial coherency, but no temporal coherency (i.e. the odds of a location being accessed are entirely independent of whether it is in cache or not ... unlike with general purpose processing where there tends to be a huge amount of temporal coherency). To get temporal coherency in framebuffer access you need tiling.

Now maybe with a virtualized memory system you could store the framebuffer inside eDRAM and use external memory as overflow ... and maybe fragmentation wouldn't be significant enough to take away the advantage of compression ... and maybe this could fit enough of the framebuffer in there for it to be advantageous.

Present GPUs have caches because they need to accumulate the spatially coherent writes for compression, otherwise they'd still just have FIFOs.
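
The point about temporal coherency can be illustrated with a toy cache simulation. This is only a sketch under invented assumptions: the screen is split into 64 framebuffer blocks, the cache is LRU and holds 16 of them, and pixel writes land on uniformly random blocks. None of these numbers come from real hardware. It compares processing writes in submission order against binning them by screen tile first.

```python
import random
from collections import OrderedDict

def lru_hit_rate(accesses, capacity):
    """Fraction of accesses that hit in an LRU cache holding `capacity` blocks."""
    cache = OrderedDict()
    hits = 0
    for block in accesses:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            cache[block] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used block
    return hits / len(accesses)

NUM_BLOCKS = 64      # hypothetical: framebuffer split into 64 blocks
CACHE_BLOCKS = 16    # hypothetical: cache holds a quarter of them

rng = random.Random(0)
writes = [rng.randrange(NUM_BLOCKS) for _ in range(10_000)]

# Submission order: each write can land anywhere on screen.
untiled = lru_hit_rate(writes, CACHE_BLOCKS)

# Tiled: bin writes into 4 screen tiles of 16 blocks, then finish one tile at a time.
binned = sorted(writes, key=lambda block: block // CACHE_BLOCKS)
tiled = lru_hit_rate(binned, CACHE_BLOCKS)

print(f"untiled hit rate: {untiled:.2f}")
print(f"tiled hit rate:   {tiled:.2f}")
```

In this toy model the untiled hit rate sits roughly at the cache-to-screen ratio (about a quarter), while the tiled run misses only on the first touch of each block.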

mczak · Nov 11, 2004

Couldn't you easily move off the not-often-accessed parts of the z/color buffer to normal ram from edram? Presumably z-buffer compression for msaa works by exploiting the fact that most pixels don't contain polygon edges - thus you need to store only one z value instead of 4 (in case of 4x FSAA).
So you'd just allocate basically a non-fsaa z buffer in edram, and the rest of it (3 times as large) in normal ram. You'd also need 3 bits per pixel to flag if the 3 subpixels have the same z value as the first subpixel in the edram buffer (these 3 bits might be on-die cache).
So you WILL need to access the normal RAM, but only for the pixels which have polygon edges, which should happen a lot less frequently than accesses to the eDRAM.
Color can be handled just the same, and you have just dropped the needed size of the edram by a factor of 4.
Simple, isn't it? (It is so simple I'm just running off to the patent office.)

mczak
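
The storage split proposed in this post can be sized with a few lines of arithmetic. This is a rough sketch of the proposal as described here, not of any real hardware: the function name and dictionary keys are invented for illustration, and the buffer is assumed to be an uncompressed 1280x720, 4x multisampled, 32-bit Z buffer.

```python
def split_msaa_buffer(width, height, bytes_per_sample, samples):
    """Size the three pieces of the proposed split, in bytes.

    edram:    the first sample of every pixel (a non-multisampled buffer)
    external: the remaining (samples - 1) samples per pixel, in normal RAM
    flags:    one bit per extra sample marking "same as the first sample"
    """
    pixels = width * height
    flag_bits = pixels * (samples - 1)
    return {
        "edram": pixels * bytes_per_sample,
        "external": pixels * bytes_per_sample * (samples - 1),
        "flags": (flag_bits + 7) // 8,  # round up to whole bytes
    }

z = split_msaa_buffer(1280, 720, bytes_per_sample=4, samples=4)
print(z)  # {'edram': 3686400, 'external': 11059200, 'flags': 345600}
```

The eDRAM share is a quarter of the full 14.7 MB Z buffer, matching the factor of 4 in the post. The flag bits add roughly 0.35 MB, which the post suggests might live in on-die cache.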

MfA · Nov 12, 2004

You need to store all the Z-values all the time (otherwise intersecting surfaces won't work correctly). You can get away with reading fewer values though, which is basically what the hierarchical Z-buffer does.

Storing the colour in separate full-pixel/subpixel buffers is possible, but you get a little extra latency.
