Xbox 360 GPU explained... or so

DaveBaumann said:
Why do you say the GPU is limited to two quads? As far as I can tell the shader core is actually operating on 8 quads (when dealing with pixels). However, when doing Z/stencil passes most of the shader core will be dedicated to geometry processing and will want to output many quads to the ROP/memory unit.
Because 4 quads with Z, 2 quads with 32-bit color, or 1 quad with 64-bit color is what the leaked specs state.
Besides, the bandwidth between the GPU and the eDRAM is limited.
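For a rough sense of what those rates imply, here's a back-of-the-envelope sketch. The 500MHz clock and the bytes-per-pixel figures are my assumptions (I'm guessing 32-bit Z/stencil for the Z-only mode), not anything from the specs:

```python
# Back-of-the-envelope check of the leaked per-clock quad rates.
# Assumptions (mine, not from the specs): 500MHz GPU clock, 4 pixels
# per quad, and 32-bit Z/stencil for the Z-only mode.
CLOCK_HZ = 500e6
PIXELS_PER_QUAD = 4

modes = {
    # mode name: (quads per clock, bytes per pixel)
    "Z only":       (4, 4),
    "32-bit color": (2, 4),
    "64-bit color": (1, 8),
}

for name, (quads, bpp) in modes.items():
    px_per_clk = quads * PIXELS_PER_QUAD
    gpix_per_s = px_per_clk * CLOCK_HZ / 1e9
    print(f"{name:12s}: {px_per_clk:2d} px/clk, "
          f"{gpix_per_s:3.1f} Gpix/s, {px_per_clk * bpp} B/clk")
# Z only      : 16 px/clk, 8.0 Gpix/s, 64 B/clk
# 32-bit color:  8 px/clk, 4.0 Gpix/s, 32 B/clk
# 64-bit color:  4 px/clk, 2.0 Gpix/s, 32 B/clk
```

Interesting that both color modes move the same 32 bytes per clock, which fits the idea of a fixed-width link to the eDRAM being the limiter.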
 
DaveBaumann said:
That 2Tb figure is from the interface running at 2GHz. I've yet to establish whether the entire chip is running at 2GHz (the memory and the interface) or just the interface. Naturally, if the entire thing is running at 2GHz that is going to have a considerable impact on fill-rate.


Sorry, but I think the 2Tbit figure is the eDRAM on-chip bandwidth only, something like the 48GB/s figure for the GS in the PS2. So they have 5.33 times the bandwidth of the GS. I don't think the 2Tbit is the speed of the connection between the eDRAM and the GPU... IMHO that would be insane.
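Quick sanity check on that ratio, assuming 2Tbit/s means 2048Gbit/s (a decimal 2000Gbit/s would give about 5.2x instead):

```python
# Sanity check on the 2Tbit/s figure vs. the PS2 GS (48GB/s).
# Assumes 2Tbit/s means 2048Gbit/s (binary prefix).
edram_gb_per_s = 2 * 1024 / 8        # 2Tbit/s -> 256GB/s
gs_gb_per_s = 48                     # PS2 Graphics Synthesizer eDRAM
print(edram_gb_per_s / gs_gb_per_s)  # 5.333...
```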
 
Demirug said:
If I understand the block diagram right, there is no way to get pixels from the scan converter to the pipe comm. This means that in Z/stencil passes the data needs to go to every shader pipe.

Again, that block diagram is not accurate, so I wouldn't take it too seriously. One thing I forgot to ask is whether this uses a ring bus memory or not; if that were the case, the units can pretty much talk to each other fairly easily. I'll clarify this.

mboeller said:
Sorry, but I think the 2Tbit figure is the eDRAM on-chip bandwidth only, something like the 48GB/s figure for the GS in the PS2. So they have 5.33 times the bandwidth of the GS. I don't think the 2Tbit is the speed of the connection between the eDRAM and the GPU... IMHO that would be insane.

I was talking about the on-chip interface (between the ROPs and the eDRAM).
 
One thing I forgot to ask is whether this uses a ring bus memory or not
I think that's a given, really.
They seem to be talking like it does, with the whole threading bit etc.
It's called Xbox 360, and from one of the above-linked articles the GPU is also the north bridge, and you were talking about how the ring topology was relevant to the rest of the system in the patent discussion...

Man, this chip seems so good.
I wannit! <crosses fingers for R520 to be a ring architecture; can live without eDRAM & primitive processor, I guess>
 
How about they leave the eDRAM and the primitive processor on there and sell it as the R600? I would buy it ;)

The two-package GPU is a neat idea... maybe this concept could find its way into PC GPUs. If the tiling is as efficient as it sounds, they could just make a modest bump to its size...

Really sounds like a beast. I cannot wait to learn more about the RSX!
 
Don't get too excited about the RSX. Expect it to pretty much be an NV4x, not a WGF2.0 part. It depends on what excites you. R500 is exciting from a "new and different" and "it has WGF2.0 features" standpoint. RSX may only be exciting from a performance standpoint compared to 6800 Ultras or "What if I put two G70s in my SLI box" :)
 
DemoCoder said:
Don't get too excited about the RSX. Expect it to pretty much be an NV4x, not a WGF2.0 part. It depends on what excites you. R500 is exciting from a "new and different" and "it has WGF2.0 features" standpoint. RSX may only be exciting from a performance standpoint compared to 6800 Ultras or "What if I put two G70s in my SLI box" :)

Yeah I think that sums it up. I really like what ATI has done with the Xbox360 GPU but I don't think we'll be wowed by the next gen PC parts from either IHV.
 
The G70 will definitely be faster than just a 24-pipeline NV40, however. It does have some performance improvements in the pipeline.
 
History does tell us that with cards that show up a year after a new series (like NV40 and R420) we get upgrades in speed and some new bells and whistles to complement what is already there and to shore up any underperforming areas that are easy to fix, e.g. 9700 => 9800.

But I was hoping that since the RSX is going into the PS3, and given the rumors of "NV50" being scrapped, they may have accelerated a part. I know that would be a lot of work...

Anyhow, thanks for the heads up. While a supercharged NV40 would be a great card (I got a 6800GT and like it), that would be sort of disappointing. I was expecting more out of SM 3.0 and FP blending. Maybe they will rework these in the G70. Although I must admit that with higher-order surfaces, unified shaders, almost-free AA, and fast eDRAM, the R500 sounds like an awfully nice part. I would love one in my PC :oops:

Thanks again!
 
DaveBaumann said:
I was talking about the on-chip interface (between the ROPs and the eDRAM).

OK, so it was only a misunderstanding.


[crazy math]
What if the 236Gbit interface between the GPU and the eDRAM is no mistake, but reality? Using a little bit of crazy math here, I come to the following:

236/256 × 96 bit = 88.5 bit (rounded to 89 bit)
=>
1280 × 720 × 88.5 bit = 9956.25 KB
1280 × 720 × 89 bit = 10012.5 KB

The result is really close to the reported size of the eDRAM: 10MB.

Could it really be true that the datastream from the GPU is transferred to the eDRAM and compressed down to 75% before being stored? On the one hand this simple math makes perfect sense, but on the other hand IMHO it is simply crazy math with no relationship to the real thing.
[/crazy math]
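For anyone who wants to reproduce the arithmetic, here it is as a small script. The 96 bits per pixel and the 236/256 scaling are just the assumptions from the crazy math above, nothing official:

```python
# Reproducing the crazy math: scale 96 bits per pixel by 236/256 and
# see how much eDRAM a 1280x720 buffer at that depth would need.
pixels = 1280 * 720                   # 921,600 pixels

for bits in (236 / 256 * 96, 89):     # 88.5 bit, rounded up to 89 bit
    kbytes = pixels * bits / 8 / 1024
    print(f"{bits:4.1f} bit/pixel -> {kbytes:8.2f} KB")
# 88.5 bit/pixel ->  9956.25 KB
# 89.0 bit/pixel -> 10012.50 KB   (10MB = 10240 KB)
```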


I also have some other questions about the eDRAM:

Will the use of eDRAM enhance the effective fillrate/effective pixel-shader power (TBDR-style) compared with a normal design, or is the normally used front-to-back sorting good enough that it doesn't need to (or cannot) be improved?

What about stencil shadowing? What improvement will we see due to the eDRAM?

What about Dot3? Is Dot3 performance related to the eDRAM, or completely unrelated?
 
NVidia doesn't need to ship a super-featured card right now, and neither does ATI. Since there is no WGF2.0 right now, the R500 on the desktop would be a waste, since at best its advanced features could only be exposed via OpenGL extensions. ATI is right to delay introducing R500 technology to the desktop. What they've done is use the console space as a "proof of concept", and they will take that experience to design a future desktop part. But right now in '05, it is too early to ship WGF2.0-style parts.
 
mboeller said:
Could it really be true that the datastream from the GPU is transferred to the eDRAM and compressed down to 75% before being stored? On the one hand this simple math makes perfect sense, but on the other hand IMHO it is simply crazy math with no relationship to the real thing.
[/crazy math]

Some stuff I wrote in another thread:
Overall, what is required per quad is 4 × 32 bit of color, compressed Z (3 × 24 bit at most) and 4 × 4 bit of coverage mask, meaning less than or equal to 216 bits per quad. And since there may be two quads per clock with color, that's at most 432 bits per clock required for the connection between the two chips, which equals 27GB/s at 500MHz.

What I haven't considered here is the address bus. If there's no separate address transmission (which is likely, considering there can be up to four quads in Z-only mode), you have to add that to the number of bits as well. I guess 236Gb/s does make sense.
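Here's the same budget as a small script. The 500MHz clock is the reported GPU clock; the address/control estimate at the end is purely my own guess:

```python
# Checking the per-quad bit budget above at the reported 500MHz clock.
color_bits    = 4 * 32   # 4 pixels of 32-bit color
z_bits        = 3 * 24   # compressed Z, at most 3 x 24 bit per quad
coverage_bits = 4 * 4    # 4-bit coverage mask per pixel

per_quad  = color_bits + z_bits + coverage_bits  # 216 bits
per_clock = 2 * per_quad                         # two quads: 432 bits
print(per_quad, per_clock * 500e6 / 8 / 1e9)     # 216, 27.0 (GB/s)

# My own guess: the gap to 236Gbit/s would leave this many bits per
# clock for addressing and control, which seems plausible.
print((236e9 - per_clock * 500e6) / 500e6)       # 40.0
```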
 
Pete said:
Is that how nV arrived at 136? (24 pipes * 4 components) + (10 vertex shaders * 4 components) = 136?
What about 24 pipes × (2 × 2 dual-issue ALUs + normalize) + 8 pipes × (1 vec4 + 1 scalar)? Seems about right in every way. Comparing to a 6800U: 24/16 × 550/400 ≈ 2.
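Both counts happen to land on 136; here's the arithmetic spelled out (both breakdowns are speculative, of course):

```python
# Two speculative ways of counting to 136 ops per clock for the G70.
pete = 24 * 4 + 10 * 4           # 24 pixel pipes x 4 components
                                 # + 10 vertex shaders x 4 components
alt  = 24 * (2 * 2 + 1) + 8 * 2  # 2 dual-issue ALUs + normalize per
                                 # pixel pipe; vec4 + scalar co-issue
                                 # per vertex pipe
print(pete, alt)                 # 136 136

# Rough scaling vs. a 6800 Ultra (16 pipes at 400MHz):
print(24 / 16 * 550 / 400)       # ~2.06
```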
 
http://www.extremetech.com/article2/0,1558,1818139,00.asp

Each of the ALUs can perform 5 floating-point shader operations. Thus, the peak computational power of the shader units is 240 floating-point shader ops per cycle, or 120 billion shader ops per second at 500MHz.


Here is a nasty limitation.

All 48 of the ALUs are able to perform operations on either pixel or vertex data. All 48 have to be doing the same thing during the same clock cycle (pixel or vertex operations), but this can alternate from clock to clock. One cycle, all 48 ALUs can be crunching vertex data, the next, they can all be doing pixel ops, but they cannot be split in the same clock cycle.
 
rwolf said:
Here is a nasty limitation.

All 48 of the ALUs are able to perform operations on either pixel or vertex data. All 48 have to be doing the same thing during the same clock cycle (pixel or vertex operations), but this can alternate from clock to clock. One cycle, all 48 ALUs can be crunching vertex data, the next, they can all be doing pixel ops, but they cannot be split in the same clock cycle.

Yep, it sounds shit to me. It makes me wonder if dynamic branching is ever going to bring improved performance. Seems unlikely.
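Here's a toy sketch of how that all-or-nothing arbitration could still balance load over time. The queue model and the pick-the-deeper-queue policy are invented for illustration; the real arbiter hasn't been disclosed:

```python
# Toy model of the all-or-nothing arbitration described above: every
# clock the whole 48-ALU array runs either vertex or pixel work, but
# the choice can flip from clock to clock.
ALUS = 48

def run(vertex_work, pixel_work, clocks):
    """Drain two work queues, one work type per clock."""
    done_v = done_p = 0
    for _ in range(clocks):
        if vertex_work >= pixel_work and vertex_work > 0:
            n = min(ALUS, vertex_work)   # whole array on vertices
            vertex_work -= n
            done_v += n
        elif pixel_work > 0:
            n = min(ALUS, pixel_work)    # whole array on pixels
            pixel_work -= n
            done_p += n
    return done_v, done_p

print(run(vertex_work=200, pixel_work=200, clocks=10))  # (200, 200)
```

In this toy model both queues drain over a window of clocks; whether the real arbiter is that nimble is exactly the open question.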

Jawed
 
Unknown Soldier said:
According to this... MS are saying the Xbox 2 is more powerful than the PS3

http://xbox360.ign.com/articles/617/617951p1.html

Factor 5 drop Xbox 360 for PS3

Factor 5, an independent Californian developer previously responsible for a couple of Rogue Squadron games, has switched allegiances in dramatic style this week, having previously backed the Xbox 360. Factor 5 were one of the first studios to demonstrate the potential of Microsoft's XNA development platform (a system for making PC and Xbox development easy), but switched exclusively to Sony's new beast when its power was revealed. The Xbox 360 is certainly very strong, but technically the PS3 is more so, and Factor 5 claim it is for these reasons they switched.

"I was shocked by how powerful the new consoles are," said Julian Eggebrecht, president of Factor 5. "They should really free our development." Apparently, the greater processing power of Sony's much-hyped new CPU 'the Cell', will allow Factor 5 to better simulate the real world making for more realistic games. We've no word on the nature of Factor 5's next-gen products, but the company in the past has created middleware as well as new games.

News Source: Ferrago

---------------------------------

Seems like some developers think the PS3 is more powerful.

US
 
Unknown Soldier said:
Seems like some developers think the PS3 is more powerful.

US
I don't think it has much to do with "power". I really think it is all about the money. The main reason why so many developers seem to be flocking to the PS3 is simply money: either they have made deals with Sony (and received money) or they expect the market for the PS3 to be much bigger than for the XB360. I don't think it matters much which platform is ahead of the other. To dethrone the PS3, the XB360 would need to be substantially ahead of it (say, 30-50% more powerful), and that seems not to be the case (in the devs' minds).
 