How can Nvidia be ahead of ATI but 360 GPU is on par with RSX?

Status
Not open for further replies.
We need performance figures. All this "Lookie!! Unified Shaders!! OMG!! Revolution in graphics!!" might just as well be deserved, but none of it really matters until we can see how well both parts perform, so arguing about whether one has more advanced features than the other is pointless if the performance isn't there.
 
Titanio said:
Maybe I misread that, but I assumed G70 supports FP10. If not, FP16 it is, for its "min" HDR precision, which shouldn't be a problem either (especially if MSAA is not a simultaneous possibility ;)). Either way, placing it amongst "advances" over RSX/G70 etc. seems odd.
I just didn't recall reading about FP10 anywhere but in Xenos articles. Maybe I didn't remember something in there.

As for why it's an advantage, if your game's dynamic range can fit in FP10, it's a bandwidth/cache advantage. Half the color bits means less data thrown around and better cache usage. This is especially useful when alpha blending particles and such (a read AND a write per pixel). That would be the advantage of using such a mode with the RSX, to my limited understanding. Reducing bandwidth requirements looks to be important with it.
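
To put rough numbers on the bandwidth point, here is a minimal sketch in Python. The bit widths are assumptions not stated in the post (FP10 taken as a 10:10:10:2 layout in 32 bits per pixel, FP16 as four 16-bit channels in 64 bits per pixel). The 1280x720 target, the layer count and the name `blend_traffic_bytes` are purely illustrative.

```python
# Assumed formats (not stated in the thread): FP10 packs 10:10:10:2 into
# 32 bits per pixel, FP16 stores four 16-bit channels in 64 bits per pixel.
BITS_PER_PIXEL = {"FP10": 32, "FP16": 64}


def blend_traffic_bytes(fmt, width, height, layers):
    """Bytes moved when `layers` full-screen alpha-blended layers each
    read and write every pixel once (no compression, no caching)."""
    bytes_per_pixel = BITS_PER_PIXEL[fmt] // 8
    reads_and_writes = 2  # blending reads the destination, then writes it back
    return width * height * layers * reads_and_writes * bytes_per_pixel


# Illustrative case: ten full-screen particle layers at 1280x720.
for fmt in BITS_PER_PIXEL:
    megabytes = blend_traffic_bytes(fmt, 1280, 720, layers=10) / 1e6
    print(f"{fmt}: {megabytes:.1f} MB per frame")
```

Under these assumptions the FP16 target moves exactly twice the data of the FP10 one, which is the "half the color bits" point above and nothing more; real hardware adds caching and compression that this ignores.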


zidane1strife said:
Given it's virtually 'free' from the penalties of AA, and that it should offer better performance for a similar clock/transistor budget, we should be seeing things that blow G70-based demos out of the water, yet we are not.

What gives? That is the question.
They're trade-offs, not trade-ups. If they were trade-ups, there wouldn't be any serious debate on the topic.

Also, in my opinion, I haven't seen anything on the PS3 that's just impossible on the X360. I think I've only seen one thing on the X360 that I thought would give the PS3 a hard time. Kameo's throne room has, according to the developer, over 1 million particles floating around the room. Maybe the RSX could do that, I don't know. But I do know it plays into Xenos' strength and RSX's potential weakness.


SubD said:
So far the only real thing is what developers have shown running in public. And until something better comes from the 360 developers, the PS3 is displaying a significant real-world advantage.
I don't see that either machine has left the other in its dust. Any significant advantage or disadvantage is most likely opinion based, unless I misunderstood the entire post.
 
Almasy said:
We need performance figures. All this "Lookie!! Unified Shaders!! OMG!! Revolution in graphics!!" might just as well be deserved, but none of it really matters until we can see how well both parts perform, so arguing about whether one has more advanced features than the other is pointless if the performance isn't there.

Don't bring the Nintendo Revolution in here! ;)
 
Jawed said:
Not to mention the vastly more efficient texturing that Xenos can perform, because it can texture even when a conventional GPU would have no texturing instruction to run.

Oh please...we've had this discussion before about efficiency, and nothing conclusive came from it...

Jawed said:
And the viable per pixel dynamic branching, which is a technique beyond RSX's reach because the architecture is too large-grained.

Excuse me but since when can Xenos do per pixel dynamic branching when it has 3 SIMD engines of 16 ALUs each?
 
Jaws said:
If Xenos is EXCLUSIVELY used for pixel shading with ALL its 48 ALUs, then it would fall roughly on par with 24 pixel pipes of a hypothetical RSX (48 vec4 units etc etc...) and you'll still have the vertex shaders available on the RSX...AND CELL...

Seeing as we all fancy throwing stupid things around, here's one to chew over:

48*5 = 240
(24*2*4) + (8*5) = 232

:oops:
 
dukmahsik said:
Has Nvidia stated that RSX will run with 90% efficiency?

We don't know exactly what RSX is, never mind its efficiency! And no, I'm also not taking ATI's 90% PR number into account! Efficiency of 90%...90% of what? Only real-world benchmarks will reveal...
 
SubD said:
And until something better comes from the 360 developers, the PS3 is displaying a significant real-world advantage.

Um, no, what you've been seeing is a significant real-world advantage of the 7800 GTX over the X800/X850... the latter of which, I might add, is neither SM3.0 compliant nor as fast.

What you are seeing in launch games is X800 development with Xenos optimisations. I'd wager that the improvement between what you see on the X360 to date and next fall (a full year with the actual Xenos) will be significant. Much of the Xenos panning here will go by the wayside...
 
If PS3 needs to use CELL to achieve the same graphics processing power as the Xenos, then doesn't much of the PS3 CPU advantage go out the window?

Yes.

But RSX* is more capable than Xenos in all pixel-shader availability requirements even when all 48 Xenos unified shaders perform as pixel shaders. Increasing vertex shader availability beyond RSX* capability (8 @ 550 MHz) drops Xenos' already inferior pixel shader availability too much.

Best advantage of RSX* vs Xenos is constant availability of full pixel shaders through all clock-cycles. Can really max it out in all scenes. Xenos always a balancing act and always a compromise.

CELL is bonus.

48 shader pipes to run vertex programs instead of 8, when doing any kind of shadow pre-render.

No. Clock cycles per frame are limited. More clock cycles as vertex shaders means fewer clock cycles as pixel shaders.

Regarding bandwidth fears, EA says ... 3 SPEs dedicated for graphics ... fill-rate limited.

Very impressive, no? Suggests something different about RSX architecture from 7800GTX.

*RSX = 7800GTX @ 550 MHz (assumption based on transistor count) ... actual configuration unknown
 
Dave Baumann said:
Seeing as we all fancy throwing stupid things around, here's one to chew over:

48*5 = 240
(48*4) + (8*5) = 232

:oops:

Where's the clock rate difference?

:Xenos

48 vec4 * 8 flops/cycle * 0.5 GHz + 48 scalar * 1 flop/cycle * 0.5 GHz
~ 192 + 24
~ 216 GFLOPS

:RSX 24 pixel units

48 vec4 * 8 flops/cycle * 0.55 GHz
~ 211 GFLOPS

216 and 211 are on par...
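
The arithmetic above can be reproduced with a short Python sketch. The unit counts, flops per cycle and clocks are the ones assumed in this post (RSX taken as 48 vec4 units from 24 pixel pipes at 550 MHz); they are peak paper figures, not measurements, and `peak_gflops` is just an illustrative helper name.

```python
def peak_gflops(units, flops_per_cycle, clock_ghz):
    """Theoretical peak: units * flops per unit per cycle * clock in GHz."""
    return units * flops_per_cycle * clock_ghz


# Figures as assumed in the post above, not confirmed hardware specs.
xenos = peak_gflops(48, 8, 0.5) + peak_gflops(48, 1, 0.5)  # vec4 + scalar
rsx_pixel = peak_gflops(48, 8, 0.55)                       # 24 pipes * 2 vec4

print(f"Xenos: {xenos:.1f} GFLOPS")
print(f"RSX pixel units: {rsx_pixel:.1f} GFLOPS")
print(f"Ratio: {xenos / rsx_pixel:.3f}")
```

The two totals land within a few percent of each other, which is all the "on par" claim rests on; it says nothing about how often either part reaches its peak.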
 
But RSX* is more capable than Xenos in all pixel-shader availability requirements even when all 48 Xenos unified shaders perform as pixel shaders. Increasing vertex shader availability beyond RSX* capability (8 @ 550 MHz) drops Xenos' already inferior pixel shader availability too much.
And, again, you have glossed over all the architectural differences between the two parts: the ALUs and the organisation of the ALUs are not even close to being the same.
 
Efficiency

Jaws said:
We don't know exactly what RSX is, never mind its efficiency! And no, I'm also not taking ATI's 90% PR number into account! Efficiency of 90%...90% of what? Only real-world benchmarks will reveal...

Efficiency is per scene, based on vertex and pixel shader requirement. Xenos can adapt to varying needs. In scenes where 1 vertex shader is required, Xenos is efficient, but RSX wastes 7 vertex shaders, hence RSX is less efficient. RSX compensates by having more pixel-shader capability than Xenos, so wastage is not a liability except in terms of chip size and cost.
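
A toy Python sketch of that utilisation argument, under heavy assumptions: work is measured in "units' worth" of demand, a vertex unit and a pixel unit are counted as equal, and the 8 + 24 fixed split is the assumed RSX layout from this thread. The function names are made up for illustration, and the model only counts idle units; it says nothing about per-unit throughput.

```python
def fixed_utilization(vertex_demand, pixel_demand, vertex_units=8, pixel_units=24):
    """Fraction of units busy when vertex and pixel units cannot swap roles."""
    busy = min(vertex_demand, vertex_units) + min(pixel_demand, pixel_units)
    return busy / (vertex_units + pixel_units)


def unified_utilization(vertex_demand, pixel_demand, units=48):
    """Fraction of units busy when any unit can take either kind of work."""
    return min(vertex_demand + pixel_demand, units) / units


# Scene from the post: only one vertex unit's worth of work, plenty of pixels.
print(fixed_utilization(1, 100))    # 7 of the 8 vertex units sit idle
print(unified_utilization(1, 100))  # every unit finds something to do
```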
 
Dave Baumann said:
Seeing as we all fancy throwing stupid things around, here's one to chew over:

48*5 = 240
(24*2*4) + (8*5) = 232

:oops:

I saw that too :cool:

ihamoitc, something about your pixel shading numbers comparison seems wrong... can you explain them again?
 
Jaws said:
Oh please...we've had this discussion before about efficiency, and nothing conclusive came from it...
Before we even start to count all the scheduling efficiencies that accrue from Xenos's scheduler, you've got basic stumbling blocks inside RSX's superscalar architecture: RSX's ALUs sitting idle because of register bandwidth restrictions, or because dual-issue is not possible on successively dependent instructions.

Do you seriously expect me to accept that a stall-less, zero-latency-branch pipeline is not more efficient than a conventional GPU pipeline? Have you entirely forgotten the scheduler discussions?

Excuse me but since when can Xenos do per pixel dynamic branching when it has 3 SIMD engines of 16 ALUs each?
With predication on each pixel. See the ATI patent on nested control flow:

http://v3.espacenet.com/textdoc?DB=EPODOC&IDX=US2005154864&F=0

The big deal about Xenos is the 64-pixel batches. As opposed to 1024 in RSX. Makes a vast difference in whether code with a dynamic branch is faster or not. The ideal would be 1-pixel batches. But the scheduler/register file/batch queue would be ginormous. So the compromise is an 8x8 tile. As opposed to RSX's 32x32 tile.
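
A rough Python sketch of why batch size matters here. The cost model is an assumption for illustration only: if any pixel in a tile takes the expensive branch, every pixel in that tile is charged for it. The 64x64 image and the one-pixel-wide edge are made up, and `pixels_charged` is a hypothetical helper name, not anything from either GPU.

```python
def pixels_charged(mask, tile):
    """Count pixels that pay for the expensive path when a whole tile pays
    as soon as any one of its pixels takes the branch (simplified model)."""
    height, width = len(mask), len(mask[0])
    charged = 0
    for ty in range(0, height, tile):
        for tx in range(0, width, tile):
            block = [row[tx:tx + tile] for row in mask[ty:ty + tile]]
            if any(any(row) for row in block):
                charged += sum(len(row) for row in block)
    return charged


# Hypothetical 64x64 image where only a one-pixel-wide vertical edge
# (column 10) needs the expensive branch.
mask = [[x == 10 for x in range(64)] for _ in range(64)]

for tile in (1, 8, 32):
    print(f"{tile}x{tile} tiles: {pixels_charged(mask, tile)} pixels charged")
```

In this toy case 64 pixels actually need the branch; 8x8 tiles charge 512 pixels and 32x32 tiles charge 2048, which is the direction of the argument above, not a measurement of either chip.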

Jawed
 
Jaws said:
Where's the clock rate difference?

:Xenos

48 vec4 * 8 flops/cycle * 0.5 GHz + 48 scalar * 1 flop/cycle * 0.5 GHz
~ 192 + 24
~ 216 GFLOPS

:RSX 24 pixel units

48 vec4 * 8 flops/cycle * 0.55 GHz
~ 211 GFLOPS

216 and 211 are on par...

Requoted in case it was missed, Dave, with all the simultaneous posts!
 
Jaws said:
Where's the clock rate difference?

I'm illustrating that these "if Xenos does this and RSX does that with the vertex shaders / pixel shaders, one is less powerful than the other" trails are just stupid. They completely ignore the structure of the ALUs themselves, the arrangement of the ALUs, and their efficiency. I mean, we haven't even counted the extra 16 (filtered) texture address processors for Xenos, which uses ALU cycles on RSX; do we know if RSX has dedicated shader interpolators either?
 
ihamoitc2005 said:
48 shader pipes to run vertex programs instead of 8, when doing any kind of shadow pre-render.
No. Clock cycles per frame are limited. More clock cycles as vertex shaders means fewer clock cycles as pixel shaders.
There's no pixel shading to be done. It's all vertex shader.

Jawed
 
clock cycles per frame

Jawed said:
There's no pixel shading to be done. It's all vertex shader.

Jawed

Rendering each frame takes X number of clock cycles.

Y number of clock cycles dedicated to vertex shader operations means (X-Y) number of clock cycles available for pixel shader operations, no?
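
The X and Y budget above, sketched in Python. The 500 MHz clock is the Xenos figure used earlier in the thread; the 60 fps target and the 2,000,000-cycle vertex cost are assumptions picked only to have numbers to plug in, and `pixel_cycles_left` is an illustrative name.

```python
def pixel_cycles_left(clock_hz, fps, vertex_cycles):
    """Return X - Y: cycles left for pixel work in one frame after
    `vertex_cycles` (Y) are spent on vertex work out of the frame budget (X)."""
    frame_cycles = clock_hz // fps  # X: whole clock cycles available per frame
    if vertex_cycles > frame_cycles:
        raise ValueError("vertex work alone exceeds the frame budget")
    return frame_cycles - vertex_cycles


# Assumed example: 500 MHz clock, 60 fps target, 2,000,000 vertex cycles.
print(pixel_cycles_left(500_000_000, 60, 2_000_000))
```

This treats the chip as a single shared budget per frame; it does not model a pass such as a shadow pre-render in which there is no pixel shading to trade against.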
 