How can we compare the Xenos to other unified shader PC GPUs?

Discussion in 'Console Technology' started by Mobius1aic, Oct 30, 2007.

  1. Mobius1aic

    Mobius1aic Quo vadis?
    Veteran

    Joined:
    Oct 30, 2007
    Messages:
    1,649
    Likes Received:
    244
    Just a thought, especially since it seems it wasn't designed with what I call the "R600 philosophy": use of an insane amount of unified shaders.

    Now surely, as used purely as a DX9 GPU it's on par if not more powerful than GeForce 8600s and Radeon 2600s?

    Just wondering, I do find the idea of putting Crysis on the 360 intriguing, and I think it's more than capable of doing it except for the memory issue, which I had predicted way in the beginning of the "CRYSIS COULD BE PUT ON TEH 360!" wars all over the internet.
     
  2. rekator

    Regular

    Joined:
    Dec 21, 2006
    Messages:
    779
    Likes Received:
    20
    Location:
    France
    It's the father…
     
  3. Betanumerical

    Veteran

    Joined:
    Aug 20, 2007
    Messages:
    1,544
    Likes Received:
    10
    Location:
    In the land of the drop bears
    We know that Nvidia's shader processors aren't directly comparable to ATI's, because the 2900's '320' shaders are still slower than the 8800's '128' shaders, even after you scale up the 8800's count to account for their higher clock. The 8600gts has 32 shaders clocked at 1.45 GHz, whereas the Xenos has 48 shaders clocked at 500 MHz.

    They should be pretty close in fillrate and texture filtering though.

    Xenos pixel fillrate:

    500 MHz x 8 = 4.0 gigapixels/second

    Xenos texel fillrate:

    500 MHz x 16 = 8.0 gigatexels/second

    8600gts pixel fillrate:

    675 MHz x 8 = 5.4 gigapixels/second

    8600gts texel fillrate:

    675 MHz x 16 = 10.8 gigatexels/second

    It seems the 8600gts pretty much has the Xenos licked.
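
    For anyone who wants to rerun the arithmetic, here is a minimal Python sketch. It assumes, as the sums above do, that peak fillrate is simply core clock times the number of units that retire one pixel or texel per clock. The unit counts and clocks are the ones quoted in this post, and the function and table names are made up for illustration.

```python
# Peak fillrate = core clock x units that retire one pixel/texel per clock.
# Clocks and unit counts are the figures quoted in the post, not measurements.

def peak_rate_giga(core_mhz: int, units_per_clock: int) -> float:
    """Return the theoretical peak in giga-units per second."""
    return core_mhz * units_per_clock / 1000


# Hypothetical lookup table, named for this sketch only.
GPUS = {
    "Xenos": {"core_mhz": 500, "rops": 8, "tmus": 16},
    "8600 GTS": {"core_mhz": 675, "rops": 8, "tmus": 16},
}


def fillrates(name: str) -> tuple:
    """Return (gigapixels/s, gigatexels/s) for a GPU in the table above."""
    gpu = GPUS[name]
    return (
        peak_rate_giga(gpu["core_mhz"], gpu["rops"]),
        peak_rate_giga(gpu["core_mhz"], gpu["tmus"]),
    )


if __name__ == "__main__":
    for name in GPUS:
        pixels, texels = fillrates(name)
        print(f"{name}: {pixels:.1f} Gpixels/s, {texels:.1f} Gtexels/s")
```

    With identical unit counts, the whole gap on paper comes from the 675 MHz vs 500 MHz core clocks. It says nothing about shader throughput or bandwidth.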

    PS: Someone please correct me if I got any of these numbers wrong.
     
    #3 Betanumerical, Oct 30, 2007
    Last edited by a moderator: Oct 30, 2007
  4. StefanS

    StefanS meandering Velosoph
    Veteran

    Joined:
    Apr 20, 2002
    Messages:
    3,608
    Likes Received:
    75
    Location:
    Vienna
    A good place to start is definitely Dave's extensive Xenos article if you want to learn more about Xenos. It's somewhat of a precursor to ATI's later models (introducing unified shaders).
     
  5. Rangers

    Legend

    Joined:
    Aug 4, 2006
    Messages:
    12,322
    Likes Received:
    1,120
    Completely apples and oranges. 8600GT wouldn't have Xenos licked. IIRC 8600GT is only a bit faster than a 7600GT, and Xenos should be a good deal faster than a 7600GT.
    Really man, just look at the games. It's not that hard to see that 360 and PS3 are similar. And we know RSX=7900GTX basically. Which is about 2X a 7600GT.

    8600GT's shaders only process one component at a time. They're scalar. 360's ALUs process five at a time. So in apples-to-apples terms you might say 360 has 48 x 5 = 240 shaders. Even that won't tell you a whole lot. But I'm confident Xenos has significantly more raw shading power than 8600GT.
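
    A rough Python sketch of that count, under loud assumptions: every lane is treated as equally useful (which flatters the wider unit), and the scalar-GPU figures are the 8600 GTS numbers quoted earlier in the thread (32 units at 1.45 GHz), not the 8600GT's. The function name is made up for illustration.

```python
# Component lanes per second = units x lanes per unit x shader clock.
# Assumption: a lane is a lane, regardless of how well it can be kept busy.

def lane_ops_giga(units: int, lanes_per_unit: int, clock_mhz: int) -> float:
    """Return peak component operations per second, in billions."""
    return units * lanes_per_unit * clock_mhz / 1000


# 48 ALUs x 5 components at 500 MHz (figures from this thread).
xenos_lanes = lane_ops_giga(48, 5, 500)

# 32 scalar units at 1.45 GHz (8600 GTS figures from an earlier post).
scalar_gpu_lanes = lane_ops_giga(32, 1, 1450)

if __name__ == "__main__":
    print(f"Xenos:    {xenos_lanes:.1f} G component-ops/s")
    print(f"8600 GTS: {scalar_gpu_lanes:.1f} G component-ops/s")
    print(f"ratio:    {xenos_lanes / scalar_gpu_lanes:.2f}x")
```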

    You really can't compare easily with the G80 family though. Xenos is a lot more like X1900/7800GTX class.

    Also, benchmarks were done and linked recently, and you find Xenos again scoring almost exactly in the area of a 7900GTX.
     
  6. Betanumerical

    Veteran

    Joined:
    Aug 20, 2007
    Messages:
    1,544
    Likes Received:
    10
    Location:
    In the land of the drop bears
    That's something I missed, thanks. I also had no clue about the 8600GTS performance numbers. If the 8600GTS is only scalar, then how come the G80 is absolutely slaughtering the 2900? Or does ATI count each component it can process as a shader?
     
  7. Heinrich4

    Regular

    Joined:
    Aug 11, 2005
    Messages:
    596
    Likes Received:
    9
    Location:
    Rio de Janeiro,Brazil
    Before the B3D forum crash I saw a topic here with a link to a Japanese blog (if anyone has it, please send it to me) saying that Xenos has performance very similar to the GeForce 7800GTX (they compared a GeForce with 4 shader ALUs against Xenos).

    (I also saw an old topic here on B3D where an ATI engineer compared R-500/C1/Xenos with the Radeon 1800XTX/R-520 and put them at the same overall performance at 720p.)
     
  8. Mobius1aic

    Mobius1aic Quo vadis?
    Veteran

    Joined:
    Oct 30, 2007
    Messages:
    1,649
    Likes Received:
    244
    Alright, cool. This was the kind of info I was expecting. The Xenos has proven to be quite the performer, as I thought, but I wanted to know where it lay in the evolution of unified shader GPUs. The way it had been described so far makes you want to look at it like an Nvidia 8 series, with a smaller number of more powerful shaders, rather than an ATi 2xxx series, which has a massive load of less powerful shader units.

    Any reason as to why ATi didn't use it for a PC graphics board? Frankly it would have given ATi a nice product to offer that Nvidia didn't have until the 8 series. Even if Xenos really isn't fully DX10 capable, you still can't mess with the pure efficiency of a unified shader GPU, even with DX9.
     
  9. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    8,185
    Likes Received:
    1,841
    Location:
    Finland
    I don't know about comparison to 7800, but IIRC some ATI reps mentioned it being around X1800-X1900 in terms of performance, which kinda matches the 7800 comparison too.
     
  10. pjbliverpool

    pjbliverpool B3D Scallywag
    Legend

    Joined:
    May 8, 2005
    Messages:
    7,583
    Likes Received:
    703
    Location:
    Guess...
    Any game can run on any system. The only question is how much you have to change it. Since Crysis can bring a GTX to its knees on high settings you certainly can't expect anything like that on the 360. Medium settings would probably be doable though with maybe a few enhancements.

    Xenos is the first generation unified GPU from ATI as opposed to R600 being the second generation. So R600 is an evolved version of Xenos. G80 is a completely different family so doesn't really relate.

    G80's and R600's shader units are actually pretty similar. The difference, as I understand it, is that G80's are a little more flexible in how they are utilised due to being fully scalar, and they also run at a higher clock speed. Xenos on the other hand has much bigger shader units which lack that flexibility and thus will suffer lower utilisation. They obviously run at a lower clock speed than either G80 or R600. Raw power wise on paper it stands as something like this:

    Xenos: 216 GFLOPS
    G80: 345.6 GFLOPS
    R600: 473.6 GFLOPS
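
    As a sanity check, here is a short Python sketch of one set of inputs that reproduces those three figures. The per-unit flop counts and shader clocks below are assumptions back-derived from the totals, not stated anywhere in this post: a vec4 MAD plus one scalar op per Xenos ALU (9 flops per clock) at 500 MHz, and one scalar MAD (2 flops) per stream processor at 1350 MHz for G80 and 740 MHz for R600.

```python
# Peak GFLOPS = units x flops per unit per clock x shader clock.
# All inputs are assumptions chosen to match the figures quoted in the post.

def peak_gflops(units: int, flops_per_clock: int, shader_mhz: int) -> float:
    """Return the paper peak in GFLOPS."""
    return units * flops_per_clock * shader_mhz / 1000


PAPER_SPECS = {
    # name: (units, flops per unit per clock, shader clock in MHz)
    "Xenos": (48, 9, 500),   # vec4 MAD = 8 flops, plus 1 scalar op (assumed)
    "G80": (128, 2, 1350),   # scalar MAD per unit (assumed clock)
    "R600": (320, 2, 740),   # scalar MAD per unit (assumed clock)
}

if __name__ == "__main__":
    for name, spec in PAPER_SPECS.items():
        print(f"{name}: {peak_gflops(*spec):.1f} GFLOPS")
```

    Counting the Xenos scalar unit as a full MAD (10 flops per ALU per clock) would give 240 GFLOPS instead of 216, so the headline number depends on the counting convention as much as on the hardware.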

    And in terms of theoretical utilisation of that it's something like this:

    G80>R600>>Xenos

    eDRAM doesn't suit the PC model and ATI already had the X1900 series on the horizon which has Xenos beat already in most areas.
     
  11. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    10,873
    Likes Received:
    767
    Location:
    London
    http://blogs.msdn.com/shawnhar/archive/2006/12/11/sixty-fractals-per-second.aspx

    http://texhnologix.blogzine.jp/texhnologix/2006/12/xenos_shader_pe.html
    http://texhnologix.blogzine.jp/texhnologix/2006/12/xenos_madd_perf.html
    http://texhnologix.blogzine.jp/texhnologix/2007/06/xenos_fillrate__1.html
    http://texhnologix.blogzine.jp/texhnologix/2007/06/xenos_fillrate_.html

    Xenos's ALUs are a small evolution from the baseline set by R300, towards R600. R300's MAD+ADD was chopped down effectively for Xenos, into MAD+SF (or MAD + scalar ADD, if I remember right).

    Simplistically, R300 and Xenos can issue two independent instructions each clock cycle, MAD + SF. In R300 this is vec3 MAD + SF. In Xenos it's vec4 MAD + SF. R300 can do vec4 MAD, with the SF joining in. The rationalisation for Xenos's design is that it's got to do both vertex and pixel shading, and vertex shading more commonly needs to operate on vec4 data (x,y,z,w) whereas in pixel shading vec3 (red, green, blue) is often all that's needed (hence the bias of R300's pixel shaders).

    R300 uses the ADD ALU as a pre-processor for MAD instructions (mostly for DirectX 8 "fixed functions", like scaling by 2x). At best you can get 3 instructions out of R300 (which is the same all the way up to R580), MAD for RGB, SF (e.g. reciprocal) and ADD/DX8-FF. The latter must always deliver its result to the MAD+SF ALU, though, it cannot write to a register (took me ages to realise this restriction :sad: ). As far as I can tell Xenos integrates the DX8-FFs and there's no "auxiliary ALU" like R300's ADD on the side.

    R600 is vec4 MAD+SF but the twist is that it's 5 entirely independent instructions. On a good day it is 2x faster than R300 per clock, per ALU, but it prolly averages 30-50% faster.

    Jawed
     
  12. Mobius1aic

    Mobius1aic Quo vadis?
    Veteran

    Joined:
    Oct 30, 2007
    Messages:
    1,649
    Likes Received:
    244
    @__@ Numbers.......

    LOL Thanks though guys. I'm not too good with the exacting details of GPU design and programming, but I can understand it pretty well on a basic level. Now if only we had a real idea of what the Wii's GPU really is........
     
  13. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    10,873
    Likes Received:
    767
    Location:
    London
    No, R600 is truly second generation. The organisation of the register file (all scalar) and the 5-way instruction issue are quite a departure.

    No, G80's ALU pipeline is superscalar, it issues a scalar MAD + scalar SF per pixel* per clock (later G8x, G9x variants can do scalar MAD + scalar MUL, apparently). R600 issues 4x scalar MAD + scalar SF per pixel per clock.

    Actually, G80 issues SFs at 1/4 or 1/8 rate (but this isn't a problem because they're rarely needed more frequently).

    * actually for pixels it's pairs of pixels for instruction issue, but for vertices it's singly so - a peculiarity of batching in G80

    Jawed
     
    #13 Jawed, Oct 30, 2007
    Last edited by a moderator: Oct 30, 2007
  14. Heinrich4

    Regular

    Joined:
    Aug 11, 2005
    Messages:
    596
    Likes Received:
    9
    Location:
    Rio de Janeiro,Brazil
    Thanks a lot for the links and information. But overall, what percentage of its theoretical maximum (480 flops per cycle, or a peak of 240 GFLOPS for pixel and vertex shading at the same time) can Xenos actually sustain? (Generally, GPUs with NUMA shader ALUs sustain between 53% on G70 and 60% of max flops per cycle.)

    (I have heard something like 75% of max GFLOPS sustained, and nothing like the 90% efficiency that was expected.)
     
    #14 Heinrich4, Oct 30, 2007
    Last edited by a moderator: Oct 30, 2007
  15. Jawed

    Legend

    Joined:
    Oct 2, 2004
    Messages:
    10,873
    Likes Received:
    767
    Location:
    London
    Whenever you issue an ADD instruction or a MUL instruction, you've "wasted" 50% of the available FLOPs in the MAD ALU! Not every instruction is a MAD.

    It's better to think in terms of ALU component utilisation. e.g. a vec2 instruction for one clock cycle on Xenos leaves 2 scalar MAD units and the SF idle.
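
    To put numbers on both points, a minimal Python sketch. It assumes a Xenos-style ALU with four MAD lanes plus one SF lane, counts a MAD as 2 flops and an ADD or MUL as 1, and uses function names invented for this example.

```python
# Two views of utilisation for a vec4 MAD + SF ALU (lane counts assumed).

VEC_LANES = 4  # MAD lanes per ALU
SF_LANES = 1   # scalar / special-function lane

# Flops delivered per lane per clock by each instruction type.
FLOPS_PER_LANE = {"MAD": 2, "MUL": 1, "ADD": 1}


def flop_utilisation(op: str) -> float:
    """Fraction of a MAD lane's peak flops that this instruction delivers."""
    return FLOPS_PER_LANE[op] / FLOPS_PER_LANE["MAD"]


def lane_utilisation(components_used: int, sf_used: bool) -> float:
    """Fraction of the ALU's five lanes kept busy for one clock."""
    if not 0 <= components_used <= VEC_LANES:
        raise ValueError("a vec4 unit has between 0 and 4 active components")
    busy = components_used + (SF_LANES if sf_used else 0)
    return busy / (VEC_LANES + SF_LANES)


if __name__ == "__main__":
    print(f"ADD on a MAD lane: {flop_utilisation('ADD'):.0%} of peak flops")
    print(f"vec2, SF idle:     {lane_utilisation(2, False):.0%} of lanes busy")
```

    A vec2 ADD with the SF idle is hit by both effects at once: two of five lanes busy, each delivering half of its peak flops.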

    Both G80 and R600 tackle the utilisation problem head-on. G80 goes further in two ways:
    1. all vector instructions are broken down to issue only the portions of the vector that are being used - there is no vec2 used + vec2 unused problem - this is sequential component issue
    2. it uses the ALU pipeline to do some texture-related calculations, which increases the utilisation of the SF unit (which otherwise could be idle)
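
    A toy Python model of point 1, under heavy assumptions: it only counts lane-slots, ignores the SF unit, batching and scheduling entirely, and the instruction mix is made up for illustration.

```python
# Compare lane-slots consumed by vector issue vs sequential component issue.

def vector_issue_slots(widths: list, vec_lanes: int = 4) -> int:
    """Vec4-style issue: every instruction occupies all lanes for one clock."""
    return len(widths) * vec_lanes


def scalar_issue_slots(widths: list) -> int:
    """Sequential component issue: only the components in use take a slot."""
    return sum(widths)


def utilisation(widths: list, slots: int) -> float:
    """Useful component operations divided by lane-slots consumed."""
    return sum(widths) / slots


if __name__ == "__main__":
    # Hypothetical shader fragment: a vec2, a vec3, a vec4 and a scalar op.
    mix = [2, 3, 4, 1]
    vec = vector_issue_slots(mix)
    seq = scalar_issue_slots(mix)
    print(f"vector issue: {vec} slots, {utilisation(mix, vec):.1%} used")
    print(f"scalar issue: {seq} slots, {utilisation(mix, seq):.1%} used")
```

    The scalar machine needs more clocks per instruction but wastes no lanes, which is why it can afford far fewer lanes in total.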
    Jawed
     
  16. Mintmaster

    Veteran

    Joined:
    Mar 31, 2002
    Messages:
    3,897
    Likes Received:
    87
    For R600, 320 shaders means 320 scalar processors, but for Xenos, 48 shaders mean 48 vec4 plus 48 scalar.

    Also your math is a bit off, because the 128 shaders in G80 are not fully compensated for by its higher clock. R600 can and does beat G80 in many math-limited tests. However, G80's scalar processors are more flexible than R600's for scalar operations, and there's a LOT more to performance than just shader math ability.
     
  17. LunchBox

    Regular

    Joined:
    Mar 13, 2002
    Messages:
    901
    Likes Received:
    8
    Location:
    California
    My memory is hazy but I could've sworn it was deduced (maybe confirmed????) that it was on par with the Radeon X1800 in one of the threads here, like over a year ago...
     
  18. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    8,185
    Likes Received:
    1,841
    Location:
    Finland
    I think that was about some ATI rep saying it's around that level of performance
     
  19. Blazkowicz

    Legend Veteran

    Joined:
    Dec 24, 2004
    Messages:
    5,607
    Likes Received:
    256
    8600 GTS nears the X1950 pro though, with one card beating the other depending on the game (the X1950 pro being faster overall). But I agree shaders are a weak point, 32 scalar units is a low number really.
     
  20. pjbliverpool

    pjbliverpool B3D Scallywag
    Legend

    Joined:
    May 8, 2005
    Messages:
    7,583
    Likes Received:
    703
    Location:
    Guess...
    Specifically he said the X1800 would be faster at high resolutions and Xenos would have an edge at lower resolutions due to higher shader power and frame buffer bandwidth.

    It has also been stated elsewhere by ATI that Xenos is theoretically weaker than the X1900 but should give a similar end user experience - presumably because its closed box nature allows it to be better utilised.
     