FS ATI interview on R500 + Block Diagram

Acert93 said:
We have yet to see how the RSX will perform with a lot of AA. At 720p I would guess it would do OK, but with a lot of geometry, like the Sony render targets shown, that could be an unknown.

Yes, we aren't sure about this without further information on the RSX, but Chatani-san has already mentioned that the RSX can work collaboratively with the Cell's SPUs in this regard.
 
And Tim, to me isn't that the point? When devs make games, aren't they going to use the Cell to help with some calculations while the RSX renders some things? To me, what they are saying in that slide is "Our PS3 will be able to produce 100 billion shader ops/sec." And isn't that what really matters? I mean, that's what the game will use, right?
 
mckmas8808 said:
And Tim, to me isn't that the point? When devs make games, aren't they going to use the Cell to help with some calculations while the RSX renders some things? To me, what they are saying in that slide is "Our PS3 will be able to produce 100 billion shader ops/sec." And isn't that what really matters? I mean, that's what the game will use, right?

The Xbox 360 patents mention offloading some of the vertex processing load to the xCPUs. I do not believe that is unheard of. Actually, with the R500's unified shaders that would be advantageous for games that require a lot of pixel shading power and minimal vertex shading, because there would be less wasted horsepower (of course PS3 may have so much you can just waste some and still get a superior result, we do not know yet).

The question I have is about the Spiderman-type demo that was done on CELL. Specifically, the flexibility of CELL interactivity in the rendering pipeline. With WGF 2.0 we are looking at GPUs passing vertex and pixel shader data back and forth, which would open up a lot of flexibility. Is CELL able to do this with RSX (i.e. what is the latency between the two, and is RSX designed to pass/receive information like that), or is the CELL's involvement mostly pre- and post-rendering effects, like vertex and lighting manipulation before rendering and frame buffer manipulation after rendering?

The pre- and post-manipulation is not necessarily new (e.g. Doom 3's shadows are processor bound); what I want to know, given the INSANE level of detail seen in demos like the Spiderman demo, is whether CELL can realistically contribute to such processes in realtime WITH the GPU.
 
XavierS said:
Has anyone read the 360 GPU article on Extremetech.com: http://www.extremetech.com/article2/0,1558,1818139,00.asp. They claim the GPU can do 120 billion shader ops per second. Could they have gotten that figure by counting the logic units in the eDram?

The war of shader ops is never going to end! o_O

Next Sony/NV will say, "Oops, we forgot to count the ops from the Bluetooth interface, that makes it 130"

Next MS/ATI will mention "We forgot the shader ops the video chip offers, that makes it 140"

Then Sony will say, "We missed the shader ops from the wifi chip, it is now 150"

Then MS will say, "We missed the shader ops from our power cord!..."

o_O
 
Josh378 said:
http://www.firingsquad.com/features/xbox_360_interview/

well confusion solved!!!!

-Josh378

I gotta say, this is getting more hilarious by the minute!

The units are now 'flops' and not 'shops'!* :p

48 Shader units ~ 196 flops per cycle

@ 500 MHz ~ 98 GFlops

*note: shops ~ shader ops!
 
How big is the chip? ATI and Microsoft won't give out numbers, but told us "it's smaller than you think." Even running at 500MHz, it draws less than 35 watts of power when running full-bore. That's including the EDRAM. ATI has built pretty aggressive power management features into the chip, and can perform clock gating at both a "macro" and "micro" level, turning off large blocks of the chip or smaller logical units. It's even passively cooled—at least, there is no fan directly on the GPU. There is only a passive heat sink attached, which the fans on the back of the Xbox 360 draw air over and out of the box.

So, how does 35W compare to, say, X850XTPE?

Jawed
 
http://www.extremetech.com/article2/0,1558,1818139,00.asp

The 48 ALUs are divided into three SIMD groups of 16. When it reaches the final shader pipe, each of the 16 ALUs has the ability to write out two samples to the 10MB of EDRAM. Thus, the chip is capable of writing out a maximum of 32 samples per clock. At 500MHz, that means a peak fill rate of 16 gigasamples. Each of the ALUs can perform 5 floating-point shader operations. Thus, the peak computational power of the shader units is 240 floating-point shader ops per cycle, or 120 billion shader ops per second at 500MHz.

???
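
At least the article's own numbers multiply out consistently. A quick sanity check, just re-doing the article's arithmetic in Python (nothing here is measured):

# Re-multiplying ExtremeTech's quoted figures.
ALUS = 48
OPS_PER_ALU = 5                 # the article's "5 floating-point shader operations" per ALU
CLOCK_HZ = 500e6                # 500 MHz
SAMPLES_PER_CLOCK = 32          # 16 ALUs writing 2 samples each in the final pipe

ops_per_cycle = ALUS * OPS_PER_ALU                         # 240
ops_per_second = ops_per_cycle * CLOCK_HZ                  # 1.2e11 -> "120 billion shader ops"
fill_rate_gsamples = SAMPLES_PER_CLOCK * CLOCK_HZ / 1e9    # 16 gigasamples
print(ops_per_cycle, ops_per_second, fill_rate_gsamples)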
 
My advice? Read the 'leaked' specs, 'coz they are more accurate than the 'real' specs coming from random sites! :LOL:
 
Jawed said:
How big is the chip? ATI and Microsoft won't give out numbers, but told us "it's smaller than you think." Even running at 500MHz, it draws less than 35 watts of power when running full-bore. That's including the EDRAM. ATI has built pretty aggressive power management features into the chip, and can perform clock gating at both a "macro" and "micro" level, turning off large blocks of the chip or smaller logical units. It's even passively cooled—at least, there is no fan directly on the GPU. There is only a passive heat sink attached, which the fans on the back of the Xbox 360 draw air over and out of the box.

So, how does 35W compare to, say, X850XTPE?

Jawed

I think it's in the 70-80 watt range for an X850XTPE, but of course that's for the whole card with memory and everything.

The 35 watts for the R500 is probably for the chip only.
 
Jaws and Jawed, could you guys give some feedback on the relevance of Shader Op/s here?



As for power draw: http://www.teenja.com/p/articles/mi_zdext/is_200405/ai_ziff125906

Power Draw: Both nVidia and ATI have been more forthcoming about power draw with this generation of GPUs, and on this particular front, ATI holds the advantage, partly due to its lower transistor count, and partly because ATI is using TSMC's low-k dielectric manufacturing process. The GeForce 6800 Ultra has a maximum power draw of around 110 watts according to nVidia. As a result, it requires two Molex power connectors, and nVidia has recommended a 480-watt power supply to make sure both the GPU and the power-hungry CPU have enough juice to get the job done. ATI's stated power draw is more around 65 watts, nearly half that of nVidia. The result is that the Radeon X800 XT is a single-slot card with a single Molex connector, and can be run with a high-end CPU using a 350-watt power supply.
 
Acert93 said:
Jaws and Jawed, could you guys give some feedback on the relevance on Shader Op/s here?
...

I've posted my interpretation earlier in this thread,

http://www.beyond3d.com/forum/viewtopic.php?p=526435#526435

1 shader op per cycle ~ 1 shader execution unit

1 shader execution unit ~ vector unit or scalar unit

e.g. ALU = 1 scalar unit + 4-way SIMD unit ~ 2 shader ops per cycle

It's simply a 'count' of the number of execution units, more specifically, shader execution units, AFAIK...
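
Put another way, here is that counting convention as a tiny Python sketch (this is just my reading of it, not an official definition, and the unit counts below are made-up examples):

# Shader op/s counted as above: every shader execution unit
# (a scalar unit or a SIMD/vector unit) contributes 1 shader op per cycle.
def shader_ops_per_sec(num_alus, exec_units_per_alu, clock_hz):
    return num_alus * exec_units_per_alu * clock_hz

# The example ALU above: 1 scalar unit + one 4-way SIMD unit = 2 shader ops/cycle.
print(shader_ops_per_sec(num_alus=1, exec_units_per_alu=2, clock_hz=500e6))  # 1e9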
 
mckmas8808 said:
And Tim, to me isn't that the point? When devs make games, aren't they going to use the Cell to help with some calculations while the RSX renders some things? To me, what they are saying in that slide is "Our PS3 will be able to produce 100 billion shader ops/sec." And isn't that what really matters? I mean, that's what the game will use, right?

The Cell is of course more suitable for vertex processing than the xCPU, but it would also be possible to offload vertex operations to the xCPU. That makes it unfair to compare the RSX + Cell against the xGPU alone. And most of the time the CPUs will be used to do other work anyway, and vertex shading will be left to the GPU.

The RSX is around 50+% faster than the xGPU in peak theoretical performance in pretty much every way. How much of this advantage it can retain in real life is an open question until we see some real hardware. I think it will lose most if not all of this advantage, but the picture is far from complete and I would not be very surprised if I was wrong.
 
Jawed said:
So, how does 35W compare to, say, X850XTPE?

35W is damn cool; the top-range desktop GPUs are using 70-90W. I have my doubts about 35W, it seems low, but the eDRAM helps power consumption as it reduces I/O power usage quite a bit.

If it really is only 35W, I don't understand why they did not try to clock it more aggressively (500MHz is quite low for a modern 90nm GPU).

Edit: forgot that the 35W does not include the GDDR3 memory and the 70-90W does - still 35W is quite low.
 
I thought this was hard to reconcile:
ATI calls them "Perfectly Efficient" shaders.
All 48 of the ALUs are able to perform operations on either pixel or vertex data. All 48 have to be doing the same thing during the same clock cycle (pixel or vertex operations), but this can alternate from clock to clock. One cycle, all 48 ALUs can be crunching vertex data, the next, they can all be doing pixel ops, but they cannot be split in the same clock cycle.
http://www.extremetech.com/article2/0,1558,1818139,00.asp
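
If I'm reading it right, the constraint looks something like this toy sketch (completely hypothetical code, not ATI's actual arbiter): every clock, the whole pool of 48 ALUs is handed one workload type, but the choice can flip on the very next clock.

# Toy model of the "all 48 ALUs do pixel OR vertex work each clock" rule.
NUM_ALUS = 48

def issue_one_clock(vertex_queue, pixel_queue):
    # Pick a single workload type for this clock (simple longest-queue policy).
    if len(vertex_queue) >= len(pixel_queue):
        queue, kind = vertex_queue, "vertex"
    else:
        queue, kind = pixel_queue, "pixel"
    # Every ALU that gets work this clock gets the SAME kind of work.
    batch = [queue.pop(0) for _ in range(min(NUM_ALUS, len(queue)))]
    return kind, batch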
 
Jaws said:
48 Shader units ~ 196 flops per cycle

I hate to nitpick, but I'm awfully confused about the 196 flops per cycle number. If there are 48 Shader units and each can do 4 floating point ops, doesn't that equal 192 flops/cycle?

I find it especially confusing because in the interview he addresses both numbers (192 [processors] and the 196), but IMO the 196 flops/cycle figure is incorrect?
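
Just to show my working (multiplying out the numbers from the posts above):

# 48 shader units at 4 flops each, vs. the quoted 196 flops/cycle.
UNITS = 48
FLOPS_PER_UNIT = 4
CLOCK_HZ = 500e6

flops_per_cycle = UNITS * FLOPS_PER_UNIT            # 192, not 196
gflops = flops_per_cycle * CLOCK_HZ / 1e9           # 96 GFlops (vs. the 98 quoted earlier)
print(flops_per_cycle, gflops)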
 