RSX can use all 35 GB/s from/to CELL without hitting XDR RAM.
jvd said: The RSX will never have access to 57 GB/s of bandwidth, unless of course the Cell will be disabled in games.
nAo said: I believe most NG games will use a first pass to lay down the z-buffer to remove opaque overdraw.
Now RSX has 24 pixel pipelines clocked at 550 MHz and we have to use them to shade roughly 100 Mpixel/s (a 1080p frame buffer @ 50 FPS: 1920 * 1080 * 50 is about 104 Mpixel/s).
24 * 550 / 100 = 132 clocks per pixel.
More than 200 dot4 and 100 nrm per pixel (efficiency is not 100% of course!)... well... it seems good to me.
Even with a lot of texture layers, the available programmable flops count is still very high.
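For anyone who wants to check the clocks-per-pixel arithmetic above, here is a rough sketch in Python. The pipeline count, clock and frame rate are the figures from the post. The 1920x1080 resolution and the "every pixel shaded exactly once" assumption (i.e. the z pre-pass removes all opaque overdraw) are simplifications made for this sketch, and the function name is just illustrative.

```python
# Back-of-the-envelope shading budget, using the figures quoted in the post.
# Assumption: each visible pixel is shaded exactly once per frame.

PIXEL_PIPELINES = 24      # from the post
CORE_CLOCK_MHZ = 550      # from the post
WIDTH, HEIGHT = 1920, 1080  # assumed meaning of "1080p"
FPS = 50                  # from the post


def clocks_per_pixel(pipelines, clock_mhz, mpixels_per_s):
    """Pipeline-clocks available per shaded pixel."""
    return pipelines * clock_mhz / mpixels_per_s


# Exact fill requirement in Mpixel/s for the assumed resolution.
exact_rate = WIDTH * HEIGHT * FPS / 1e6

# The post rounds the requirement down to 100 Mpixel/s.
rounded_budget = clocks_per_pixel(PIXEL_PIPELINES, CORE_CLOCK_MHZ, 100.0)
exact_budget = clocks_per_pixel(PIXEL_PIPELINES, CORE_CLOCK_MHZ, exact_rate)

print(f"required fill rate: {exact_rate:.2f} Mpixel/s")
print(f"budget (rounded to 100 Mpixel/s): {rounded_budget:.0f} clocks/pixel")
print(f"budget (exact rate): {exact_budget:.1f} clocks/pixel")
```

With the exact pixel rate the budget comes out slightly under the rounded 132 figure, which doesn't change the argument.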
And what is it going to do with that 35 GB/s? Is it going to store its framebuffer there? Fetch textures from the Cell? What exactly?
nAo said: RSX can use all 35 GB/s from/to CELL without hitting XDR RAM.
jvd said: The RSX will never have access to 57 GB/s of bandwidth, unless of course the Cell will be disabled in games.
scooby_dooby said:hmm strange wording.
"Supports" != Requires
I think common sense tells you that if this were true Sony would've announced it in a more visible way, to make a big deal of the fact their games will all be 1080p. Not some strangely worded press release.
Not counting TMUs in the flops. I have taken out all the fixed-function flops that Nvidia is counting, like norm/scale/bias etc., because Xenos seems to have that separate functionality as well. Because each pixel shader ALU in the G70 can perform 2 ops (i.e. multiply and add) on up to 4 components, this provides you with 8 flops per ALU. Take 24 pipelines * 2 ALUs per pipe * 8 flops = 384 flops in the pixel shader array. The vertex shader ALU can do 2 ops on up to 5 components, resulting in 10 flops per ALU. 8 vertex shaders * 10 flops = 80 flops in the vertex shader array. Total is 80 + 384 = 464. I also showed "or 272 when texturing" because half of the pixel shader ALUs are also used in texturing (80 + 192 = 272). Xenos has 48 ALUs that can do 2 ops on up to 5 components, resulting in 480 flops per clock. These can all be used while texturing, so I indicated a number of texture samples possible in the same clock.
Quote: A couple of Qs - How are you figuring these numbers out, exactly? I'm guessing you're mapping the texture address functionality in Xenos to flops, but... how?
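The per-clock tally in the post above can be reproduced with a few lines of Python. This is only a sketch of the counting method as described there (units x ALUs per unit x ops x components). All unit counts come from the post, the helper name is made up for illustration, and nothing here is a measured or vendor-confirmed figure.

```python
# Reproduces the flops-per-clock counting described in the post.
# The counting rule is the poster's: units * ALUs per unit * ops * components.


def flops_per_clock(units, alus_per_unit, ops, components):
    """Programmable flops per clock under the post's counting rule."""
    return units * alus_per_unit * ops * components


# G70/RSX pixel shader array: 24 pipes, 2 ALUs each, 2 ops on up to 4 components.
g70_pixel = flops_per_clock(24, 2, 2, 4)

# G70/RSX vertex shader array: 8 units, 2 ops on up to 5 components.
g70_vertex = flops_per_clock(8, 1, 2, 5)

g70_total = g70_pixel + g70_vertex

# "When texturing": the post treats half of the pixel shader ALUs as busy.
g70_texturing = g70_vertex + g70_pixel // 2

# Xenos: 48 ALUs, 2 ops on up to 5 components, all usable while texturing.
xenos_total = flops_per_clock(48, 1, 2, 5)

print("G70 pixel array:", g70_pixel)
print("G70 vertex array:", g70_vertex)
print("G70 total:", g70_total)
print("G70 while texturing:", g70_texturing)
print("Xenos total:", xenos_total)
```

Under this counting rule the two parts end up within a few percent of each other per clock. Clock speed and real-world utilisation are deliberately left out.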
Where did you get that information/hint? I've never seen it mentioned that there's some kind of significant fixed functionality of that kind on R500, while Nvidia often talks about the free normals and SFU units (which is what I'm assuming you are referring to) as a significant part of their functionality. I understand they are not programmable, but it's something that has to be used all the time anyway, and saves a lot of cycles.
Quote: Not counting TMUs in the flops. I have taken out all the fixed-function flops that Nvidia is counting, like norm/scale/bias etc., because Xenos seems to have that separate functionality as well.
I quoted this earlier from Wavey's article: "Additional to the 48 ALU's is specific logic that performs all the pixel shader interpolation calculations which ATI suggests equates to about an extra 33% of pixels shader computational capability."
Quote: Where did you get that information/hint?
Rockster said: Not counting TMUs in the flops. I have taken out all the fixed-function flops that Nvidia is counting, like norm/scale/bias etc., because Xenos seems to have that separate functionality as well. Because each pixel shader ALU in the G70 can perform 2 ops (i.e. multiply and add) on up to 4 components, this provides you with 8 flops per ALU. Take 24 pipelines * 2 ALUs per pipe * 8 flops = 384 flops in the pixel shader array. The vertex shader ALU can do 2 ops on up to 5 components, resulting in 10 flops per ALU. 8 vertex shaders * 10 flops = 80 flops in the vertex shader array. Total is 80 + 384 = 464. I also showed "or 272 when texturing" because half of the pixel shader ALUs are also used in texturing. Xenos has 48 ALUs that can do 2 ops on up to 5 components, resulting in 480 flops per clock. These can all be used while texturing, so I indicated a number of texture samples possible in the same clock.
Quote: A couple of Qs - How are you figuring these numbers out, exactly? I'm guessing you're mapping the texture address functionality in Xenos to flops, but... how?
All these numbers are pretty meaningless in and of themselves, but there seems to be a general impression that the G70/RSX has more raw power, while Xenos is more efficient. I have seen numbers like 1.8 TFLOPS vs 1 TFLOPS, etc., when in terms of raw power they are very close.
No more meaningless than all the other figures being thrown about... which indeed does make that kind of comparison meaningless for now.
Hmm, OK. I thought he was just referring to the logic integrated in the EDRAM module.
Quote: "Additional to the 48 ALU's is specific logic that performs all the pixel shader interpolation calculations which ATI suggests equates to about an extra 33% of pixels shader computational capability."
I know, but so far their architectures have been fairly similar, so that was to be expected. Not so with R500, though.
Quote: It's a pretty standard thing. And if you look at benchmarks between Nvidia and ATI PC architectures you don't see any drastic, unexplained per-clock advantage.
Don't you think that those mini-ALUs (2 flops each) should be counted too, for a total of 20 programmable flops? Or are they the same thing as the SFU units, which can't count as programmable?
Quote: Because each pixel shader ALU in the G70 can perform 2 ops (i.e. multiply and add) on up to 4 components, this provides you with 8 flops per ALU
I don't know where you are getting this from my posts. I was saying that just because current GPUs have lots of hardwired functionality doesn't mean R500 must have it too; it just looks like a different design philosophy, so comparing them on a per-clock basis probably doesn't mean much.
Quote: So you disagree, and think there is a big disparity in raw math performance? And if so, why?
Shifty Geezer said: Just a niggling request for people to stop calling ATi's XB360 chipset 'R500' and call it either 'C1' or 'Xenos'. There is no R500 chipset, and now that we know it's a totally unrelated part, we shouldn't artificially lump it in with ATi's existing architecture numbering and confuse matters for people who are none the wiser.
Thank you for your cooperation, and we return now to normal scheduling...
xbdestroya said:
Shifty Geezer said: Just a niggling request for people to stop calling ATi's XB360 chipset 'R500' and call it either 'C1' or 'Xenos', as there is no R500 chipset and now we know it's a totally unrelated part, we shouldn't artificially lump it with ATi's existing architecture numbering and confuse matters for people that are none the wiser.
Thank you for your cooperation, and we return now to normal scheduling...
Wait a minute, where did we learn that R500 is not related to Xenos or C1? All I remember learning is that Xenos is the name for public consumption and that C1 was the internal project name. I still thought R500 was the valid chip name designator though.