FS ATI interview on R500 + Block Diagram

DeanoC said:
DaveBaumann said:
tEd said:
Is it true that they only have 4 texture units? I was a little surprised, to say the least.

No, it's 4 groups of 4. They are grouped in fours as these are the most common sampling requirements.

Xenon has 32 memory fetch units: 16 have filtering and address logic (textures), and 16 just do a straight lookup from memory (unfiltered and no addressing modes, AKA vertex fetch).

Unification means that any shader can use either type (filtered or unfiltered) as it sees fit (no concept of dependent reads or otherwise). This means that the XeGPU has an almost CPU-like view of memory.


go on...any more information is appreciated ;)
 
wireframe said:
Now I am thoroughly confused about what the R500 is. So, is it now etched in stone that the R500 can do 192 shader ops per clock? 48 parallel processing units, each capable of 4 ops?

So, this turns everything on its head with respect to shading capability. It was first reported that the RSX had higher shading performance than R500, with 136 ops to R500's 96. Seeing the 'typo' of 196, as pointed out by Neeyik, it makes it look like someone just slipped a 1 in front of the 'original' 96.

Are we (me inc.) getting confused here? The ATI bloke in the HardOCP interview said:
Shader Performance of the Xbox 360 GPU is 48 billion shader operations per second. While that is what Microsoft told us, Mr. Feldstein of ATI let us know that the Xbox 360 GPU is capable of doing two of those shaders per cycle
which surely works out at 96 * 500M = 48G shader ops/sec? Or am I missing something? And isn't this 196 the GFLOPS metric excluding the NV rendering flops?
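For what it's worth, the arithmetic in the post above does check out; a quick sketch using only the figures quoted in this thread (96 ops/cycle, 500 MHz):

```python
# Sanity check on the shader-op arithmetic discussed above.
# Both figures are the thread's own numbers, not anything official.
ops_per_cycle = 96      # 48 units x 2 ops/cycle, per the ATI quote
clock_hz = 500e6        # 500 MHz

total = ops_per_cycle * clock_hz
print(f"{total / 1e9:.0f} billion shader ops/sec")  # prints "48 billion shader ops/sec"
```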
 
gmoran said:
wireframe said:
Now I am thoroughly confused about what the R500 is. So, is it now etched in stone that the R500 can do 192 shader ops per clock? 48 parallel processing units, each capable of 4 ops?

So, this turns everything on its head with respect to shading capability. It was first reported that the RSX had higher shading performance than R500, with 136 ops to R500's 96. Seeing the 'typo' of 196, as pointed out by Neeyik, it makes it look like someone just slipped a 1 in front of the 'original' 96.

Are we (me inc.) getting confused here? The ATI bloke in the HardOCP interview said:
Shader Performance of the Xbox 360 GPU is 48 billion shader operations per second. While that is what Microsoft told us, Mr. Feldstein of ATI let us know that the Xbox 360 GPU is capable of doing two of those shaders per cycle
which surely works out at 96 * 500M = 48G shader ops/sec? Or am I missing something? And isn't this 196 the GFLOPS metric excluding the NV rendering flops?

Thank you!

It's 96 Shader ops per cycle and NOT 196 OR 192!

There's some FUD flying around the 'net at the moment! :p
 
Shader Performance of the Xbox 360 GPU is 48 billion shader operations per second. While that is what Microsoft told us

OK, if this is true then wouldn't the RSX's 100 billion shader operations per second be better? I don't understand why people are saying that X360's GPU is better.
 
mckmas8808 said:
Shader Performance of the Xbox 360 GPU is 48 billion shader operations per second. While that is what Microsoft told us

OK, if this is true then wouldn't the RSX's 100 billion shader operations per second be better? I don't understand why people are saying that X360's GPU is better.

Mostly cause they're wrong. I think it depends on the point of view, the eDRAM will allow for some nice little things.
 
mckmas8808 said:
OK, if this is true then wouldn't the RSX's 100 billion shader operations per second be better? I don't understand why people are saying that X360's GPU is better.

Well, it looks to me (for what that's worth), and Jaws agrees, that it's 96 shader ops per cycle. There are so many numbers floating around that people get confused, and look to spin them to their own preferences.

These numbers don't really mean a lot; it's certainly too early to say RSX is superior. 95% free FSAA on R500 certainly looks pretty useful.
 
london-boy said:
mckmas8808 said:
Shader Performance of the Xbox 360 GPU is 48 billion shader operations per second. While that is what Microsoft told us

OK, if this is true then wouldn't the RSX's 100 billion shader operations per second be better? I don't understand why people are saying that X360's GPU is better.

Mostly cause they're wrong. I think it depends on the point of view, the eDRAM will allow for some nice little things.

Mostly because the RSX does not do 100 billion shader ops per second; it does 136 shader ops per cycle, or almost 75 billion shader ops per second. And based on the architecture of Xenos, it is likely that Xenos can sustain an actual shader throughput significantly closer to its peak than the RSX can.

The 100 billion number does not only include operations by the RSX.
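The "almost 75 billion" figure above can be reproduced; note that the 550 MHz RSX clock below is my assumption, chosen because it is the value that makes 136 ops/cycle land near 75 billion ops/sec. It is not stated anywhere in this thread.

```python
# Reproducing the "almost 75 billion" RSX figure quoted above.
# The 550 MHz clock is an assumption, not a number from the thread.
ops_per_cycle = 136
clock_hz = 550e6        # assumed 550 MHz

total = ops_per_cycle * clock_hz
print(f"{total / 1e9:.1f} billion shader ops/sec")  # prints "74.8 billion shader ops/sec"
```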
 
mckmas8808 said:
Shader Performance of the Xbox 360 GPU is 48 billion shader operations per second. While that is what Microsoft told us

OK, if this is true then wouldn't the RSX's 100 billion shader operations per second be better? I don't understand why people are saying that X360's GPU is better.

Unfortunately, with the way ATI and Nvidia do their math, we can only roughly compare 100 to 96 billion shader ops/sec.

Even if Nvidia's RSX can do 4 billion more shader ops/sec, Xenos' eDRAM gives the 360 free 4x FSAA, Z depth, occlusion culling, and it also does a very good job at figuring stencil shadows at 95% efficiency.

I highly doubt RSX can claim the same; the perf hit for RSX will be much greater when applying the same techniques. Overall I'd say Xenos is much more elegant, efficient, and developer friendly, which in turn will allow devs to extract much more out of the chip.
 
Mostly cause they're wrong. I think it depends on the point of view, the eDRAM will allow for some nice little things.

Mostly because the RSX does not do 100 billion shader ops per second; it does 136 shader ops per cycle, or almost 75 billion shader ops per second. And based on the architecture of Xenos, it is likely that Xenos can sustain an actual shader throughput significantly closer to its peak than the RSX can.

The 100 billion number does not only include operations by the RSX.

I'm not questioning you, but could you provide a link for those numbers, please?
 
dukmahsik said:
mckmas8808 said:
Shader Performance of the Xbox 360 GPU is 48 billion shader operations per second. While that is what Microsoft told us

OK, if this is true then wouldn't the RSX's 100 billion shader operations per second be better? I don't understand why people are saying that X360's GPU is better.

Unfortunately, with the way ATI and Nvidia do their math, we can only roughly compare 100 to 96 billion shader ops/sec.

Even if Nvidia's RSX can do 4 billion more shader ops/sec, Xenos' eDRAM gives the 360 free 4x FSAA, Z depth, occlusion culling, and it also does a very good job at figuring stencil shadows at 95% efficiency.

I highly doubt RSX can claim the same; the perf hit for RSX will be much greater when applying the same techniques. Overall I'd say Xenos is much more elegant, efficient, and developer friendly, which in turn will allow devs to extract much more out of the chip.

:rolleyes: All of that from eDRAM? Makes you wonder how PC parts have survived this long without it! I mean, it's like adding eDRAM just makes a GPU better than one that doesn't have any.
 
london-boy said:
dukmahsik said:
mckmas8808 said:
Shader Performance of the Xbox 360 GPU is 48 billion shader operations per second. While that is what Microsoft told us

OK, if this is true then wouldn't the RSX's 100 billion shader operations per second be better? I don't understand why people are saying that X360's GPU is better.

unfortunately the way ati and nvidia do their math we can only roughly compare 100 to 96 billion shader ops/sec.

even if nvidia's rsx can do 4 more bn sop/sec, xenos' edram gives the 360 free 4x fsaa, Z depths, occlusion culling, and also does a very good job at figuring stencil shadows at 95% efficiency.

I highly doubt RSX can claim the same, perf hit for RSX will be much greater when applying the same techniques. Overall I'd say xenos is much more elegant, efficient, and developer friendly which in turn will allow devs to extract much more out of the chip.

:rolleyes: All of that from eDRAM? Makes you wonder how PC parts have survived this long without it! I mean, it's like adding eDRAM just makes a GPU better than one that doesn't have any.

"Inside the Smart 3D Memory is what is referred to as a 3D Logic Unit. This is literally 192 Floating Point Unit processors inside our 10MB of RAM. This logic unit will be able to exchange data with the 10MB of RAM at an incredible rate of 2 Terabits per second. So while we do not have a lot of RAM, we have a memory unit that is extremely capable in terms of handling mass amounts of data extremely quickly. The most incredible feature that this Smart 3D Memory will deliver is "antialiasing for free" done inside the Smart 3D RAM at High Definition levels of resolution. (For more of just what HiDef specs are, you can read about it here.) Yes, the 10MB of Smart 3D Memory can do 4X Multisampling Antialiasing at or above 1280x720 resolution without impacting the GPU. Therefore, not only will all of your games on Xbox 360 be in High Definition, but they also will have 4XAA applied.

The Smart 3D Memory can also compute Z depths, occlusion culling, and also does a very good job at figuring stencil shadows. Stencil shadows are used in games that will use the DOOM 3 engine such as Quake 4 and Prey."

http://www.hardocp.com/article.html?art=NzcxLDM=
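As a quick sanity check on the interview's bandwidth claim, the "2 Terabits per second" between the 10MB eDRAM and its logic unit converts to bytes as follows (a sketch; with decimal prefixes it comes out at 250 GB/s, while the commonly quoted 256 GB/s comes from treating 2 Tbit as 2048 Gbit):

```python
# Unit conversion for the interview's "2 Terabits per second" eDRAM figure.
bits_per_sec = 2e12            # 2 terabits/sec, decimal prefixes
bytes_per_sec = bits_per_sec / 8

print(f"{bytes_per_sec / 1e9:.0f} GB/s")  # prints "250 GB/s"
```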
 
dukmahsik said:
"Inside the Smart 3D Memory is what is referred to as a 3D Logic Unit. This is literally 192 Floating Point Unit processors inside our 10MB of RAM. This logic unit will be able to exchange data with the 10MB of RAM at an incredible rate of 2 Terabits per second. So while we do not have a lot of RAM, we have a memory unit that is extremely capable in terms of handling mass amounts of data extremely quickly. The most incredible feature that this Smart 3D Memory will deliver is "antialiasing for free" done inside the Smart 3D RAM at High Definition levels of resolution. (For more of just what HiDef specs are, you can read about it here.) Yes, the 10MB of Smart 3D Memory can do 4X Multisampling Antialiasing at or above 1280x720 resolution without impacting the GPU. Therefore, not only will all of your games on Xbox 360 be in High Definition, but they also will have 4XAA applied.

The Smart 3D Memory can also compute Z depths, occlusion culling, and also does a very good job at figuring stencil shadows. Stencil shadows are used in games that will use the DOOM 3 engine such as Quake 4 and Prey."

http://www.hardocp.com/article.html?art=NzcxLDM=

I still don't see how a 10MB eDRAM cache could bring a significant boost to a GPU. It's not that Nvidia didn't consider using all these features; they in fact evaluated all of this before settling on NUMA and not including the eDRAM. Why would they throw in 300 million trannies when they could also reduce the size of their GPU and just add the eDRAM?

By the way, the eDRAM IP used by Xenos belongs to NEC, right? It wasn't designed specifically to include additional GPU features, was it? We all know what eDRAM is used for, and this applies to Sony with the PS2's GS as well. So why didn't they go with it?
 
I thought it was already stated when the MTV specs came out that "shader ops/s" was a nearly worthless benchmark.

So, can anyone explain to me why this is a meaningful benchmark, and how do we reconcile and then fairly compare the numbers side by side?
 
The 100 billion claim comes on a later slide with no reference to the graphics system and with a picture of both the Cell and GPU.

So what are you trying to say, Tim? If your slide works and is correct, why would the slide a few clicks later be completely wrong?

Oh, and at the top of the slide it says "The most powerful Graphics System ever built."

Pic here for people to see. :)


ps3_r02.jpg

 
hugo said:
dukmahsik said:
"Inside the Smart 3D Memory is what is referred to as a 3D Logic Unit. This is literally 192 Floating Point Unit processors inside our 10MB of RAM. This logic unit will be able to exchange data with the 10MB of RAM at an incredible rate of 2 Terabits per second. So while we do not have a lot of RAM, we have a memory unit that is extremely capable in terms of handling mass amounts of data extremely quickly. The most incredible feature that this Smart 3D Memory will deliver is "antialiasing for free" done inside the Smart 3D RAM at High Definition levels of resolution. (For more of just what HiDef specs are, you can read about it here.) Yes, the 10MB of Smart 3D Memory can do 4X Multisampling Antialiasing at or above 1280x720 resolution without impacting the GPU. Therefore, not only will all of your games on Xbox 360 be in High Definition, but they also will have 4XAA applied.

The Smart 3D Memory can also compute Z depths, occlusion culling, and also does a very good job at figuring stencil shadows. Stencil shadows are used in games that will use the DOOM 3 engine such as Quake 4 and Prey."

http://www.hardocp.com/article.html?art=NzcxLDM=

I still don't see how a 10MB eDRAM cache could bring a significant boost to a GPU.

Then re-read what was written above. Basically, eDRAM gives you the advantage of bandwidth (it won't chew up as much of the UMA bandwidth, because you are using the eDRAM for the framebuffer), and the logic on the eDRAM is extremely fast, allowing very fast processing for those logic tasks.

And 4x AA is not cheap, but if the R500 only takes a 1-5% total hit at most, that is insane. We have yet to see how the RSX will perform with a lot of AA. At 720p I would guess it would do OK, but with a lot of geometry, as the Sony render targets showed, that could be an unknown.

It's not that Nvidia didn't consider using all these features; they in fact evaluated all of this before settling on NUMA and not including the eDRAM. Why would they throw in 300 million trannies when they could also reduce the size of their GPU and just add the eDRAM?

Time. And the NUMA appears to be an issue of cost (in a good way). The GPU does not need XDR, so why have 512MB of XDR in a UMA when, with a NUMA, the Cell, which needs XDR, gets a large XDR pool and the GPU gets a 256MB GDDR3 pool that is more than sufficient for it?
 
Question..
How "finalized" are the specs for X360? Really locked down, or could there be some last-minute changes?

The thing I was wondering about is, whatever happened with the Fast14-tech stuff regarding GPU design? That seemed pretty sweet, being able to up the MHz without having the chip get hotter (or something along those lines)...

That would boost performance quite a bit... is this something MS "waited" to reveal, or are the numbers so locked down that it's "No way José" on this one?

Just wondering, because the Fast14 tech sounded pretty sweet and there were tons of speculation about it before...
 
mckmas8808 said:
The 100 billion claim comes on a later slide with no reference to the graphics system and with a picture of both the Cell and GPU.

So what are you trying to say, Tim? If your slide works and is correct, why would the slide a few clicks later be completely wrong?

Pic here for people to see. :)


ps3_r02.jpg

It is pretty damn clear that this slide is not about the RSX only. They add the roughly 25 billion extra shader ops per second because they are able to offload some vertex processing to the Cell chip.
 
How "finalized" are the specs for X360? Really locked down, or could there be some last-minute changes?

The thing I was wondering about is, whatever happened with the Fast14-tech stuff regarding GPU design? That seemed pretty sweet, being able to up the MHz without having the chip get hotter (or something along those lines)...

That would boost performance quite a bit... is this something MS "waited" to reveal, or are the numbers so locked down that it's "No way José" on this one?


Considering that the console is 3 or so months away from going into manufacturing and the games are 6 months away from release, the specs BETTER be locked down, for Microsoft's sake.
 
Geeforcer said:
How "finalized" are the specs for X360? Really locked down, or could there be some last-minute changes?

The thing I was wondering about is, whatever happened with the Fast14-tech stuff regarding GPU design? That seemed pretty sweet, being able to up the MHz without having the chip get hotter (or something along those lines)...

That would boost performance quite a bit... is this something MS "waited" to reveal, or are the numbers so locked down that it's "No way José" on this one?


Considering that the console is 3 or so months away from going into manufacturing and the games are 6 months away from release, the specs BETTER be locked down, for Microsoft's sake.


Yeah, I know... but the thing is that devs still have alpha kits, with beta kits coming sometime this summer, so there "might" be time... but then again, I don't know how many changes this requires...

just a thought anyways, the X360 looks real good no matter how you put it... :D
 