A primer on the X360 shader ALUs as I understand it

Discussion in 'Console Technology' started by superguy, Mar 21, 2006.

  1. Platon

    Veteran

    Joined:
    Jul 14, 2005
    Messages:
    1,127
    Likes Received:
    10
    Location:
    Sweden
    I think you answer your own question. Not only does the Xbox 360 have multiple cores, which will take the devs some time to get used to, but the GPU is also quite a different beast from anything else out there. It will take time. Launch and near-launch titles are hardly anything to go by when you want to see what the hardware is capable of, and even less so this gen, given the complexity of the hardware...
     
  2. Asher

    Regular

    Joined:
    Jul 1, 2005
    Messages:
    972
    Likes Received:
    8
    Location:
    Calgary, Alberta
    Factor in a couple VMX128 units with their new 3D-oriented instructions and RSX/Cell may be at a disadvantage in this area. ;)

    No one can say either way, but a lot of people are putting way too much emphasis on SPEs and what they can do for graphics workloads vs VMX128 units. There are instructions on VMX128 units that complete in one or two cycles that take upwards of 10 different instructions on SPEs, for instance.
     
  3. Asher

    Regular

    Joined:
    Jul 1, 2005
    Messages:
    972
    Likes Received:
    8
    Location:
    Calgary, Alberta
    Yep. Even in generic newspaper articles now talking about PS3, Reuters & Co tend to mention that the PS3 is "twice as powerful as the Xbox 360", probably an allusion to the 1 TFLOP vs 2 TFLOP numbers.

    The casual gamers seem to be under the impression that the PS3 will be significantly more powerful than the Xbox 360, while the "hardcore" gamers are actually more likely to know the distance isn't huge by any means.
     
  4. LunchBox

    Regular

    Joined:
    Mar 13, 2002
    Messages:
    901
    Likes Received:
    8
    Location:
    California
    People seem to forget that Xenos is 3 arrays of 16 pipes... so the minimum PER CLOCK would be 16 VS...
     
  5. LunchBox

    Regular

    Joined:
    Mar 13, 2002
    Messages:
    901
    Likes Received:
    8
    Location:
    California
    AFAIK... weren't the VMX units put there to enhance the PPE's FP output? If you use them on graphics, wouldn't that take away from the game's physics and A.I.?
     
  6. ROG27

    Regular

    Joined:
    Oct 27, 2005
    Messages:
    572
    Likes Received:
    4
    Yes. That's why a massively parallel architecture like CELL is more flexible (has more independently operating threads) and that is why SPEs can realistically help with graphics.
     
  7. ROG27

    Regular

    Joined:
    Oct 27, 2005
    Messages:
    572
    Likes Received:
    4
    Those same newspapers said the same things about the XBOX 1 when it first was introduced, as well. The casuals' perceived notion of more powerful is, again, akin to PS2/GC (X360 this round) vs. XBOX (PS3 this round)...which it will likely mirror in reality. They either won't pay attention to or will forget how much more powerful one thing is than another. The only thing that matters to a casual, as far as power is concerned, is this is technically more powerful than that (by what margin does not matter). Only misguided techies on internet forums get hell-bent on meaningless numbers, ironically.
     
  8. Asher

    Regular

    Joined:
    Jul 1, 2005
    Messages:
    972
    Likes Received:
    8
    Location:
    Calgary, Alberta
    You're opening up a whole new can o' worms.

    Yes, SPEs have more independently operating threads.

    There are things SPEs are good at, and things SPEs are, er, not-so-good at.

    Realistically, in my opinion, SPEs are not as useful for extra geometry processing compared to Xenon cores for a number of reasons:
    1. Xenon cores are individually faster than SPEs are for such processing.
    2. Xenon cores can use the L2 cache as a FIFO buffer for Xenos to pull vertex data from, without writing it to RAM. The SPEs need to communicate with RSX through the PPE.
    3. Xenon cores natively understand D3D formats

    It is not as cut and dried as many people are trying to make it out to be. There are advantages to the Cell approach, and there are advantages to the Xenon approach. In my opinion, Xenon can work more effectively with Xenos for additional geometry processing than Cell can work with RSX for additional geometry processing.
     
  9. one

    one Unruly Member
    Veteran

    Joined:
    Jul 26, 2004
    Messages:
    4,823
    Likes Received:
    153
    Location:
    Minato-ku, Tokyo
    Those 2 points are false AFAIK.
     
  10. Gholbine

    Regular

    Joined:
    Jun 19, 2005
    Messages:
    294
    Likes Received:
    1
    I was about to question them also.

    FlexIO is part of the EIB just as all the processing elements are, and the SPEs have DMA. Why would they need the PPE to communicate with the RSX?
     
  11. Asher

    Regular

    Joined:
    Jul 1, 2005
    Messages:
    972
    Likes Received:
    8
    Location:
    Calgary, Alberta
    They weren't when I was profiling on Cell and Xenon 6 months ago...

    On a high-level view (programmer's view), they can dispatch DMA commands for main memory but those are routed through the PPE.

    The SPE's individual DMA units are for moving memory between SPEs and the PPE.
     
    #31 Asher, Mar 22, 2006
    Last edited by a moderator: Mar 22, 2006
  12. Asher

    Regular

    Joined:
    Jul 1, 2005
    Messages:
    972
    Likes Received:
    8
    Location:
    Calgary, Alberta
    Depends how extensive RSX's integration is. I'm assuming they may have to fetch this data from main memory after Cell writes it, but that may not be the case. RSX details are hard to come by.

    Even so, I can see it being written out to main memory anyway, because a FIFO buffer is needed. The chances of Cell sending out the data and having RSX ready to retrieve it are low, and the same goes for the reverse case, where RSX asks for the data and waits for the SPE to send it...
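
To make the FIFO point concrete, here is a minimal Python sketch of a fixed-size ring buffer that lets a producer run ahead of a consumer. It does not model Cell, RSX, Xenon or Xenos in any way; `RingFifo`, `push` and `pop` are hypothetical names chosen for illustration only.

```python
class RingFifo:
    """Fixed-capacity FIFO; a hypothetical illustration, not a Cell/RSX API."""

    def __init__(self, capacity):
        self._slots = [None] * capacity
        self._head = 0   # index of the next slot to read
        self._count = 0  # number of filled slots

    def push(self, item):
        """Producer side: returns False when full (the producer must wait)."""
        if self._count == len(self._slots):
            return False
        tail = (self._head + self._count) % len(self._slots)
        self._slots[tail] = item
        self._count += 1
        return True

    def pop(self):
        """Consumer side: returns None when empty (the consumer must wait)."""
        if self._count == 0:
            return None
        item = self._slots[self._head]
        self._slots[self._head] = None
        self._head = (self._head + 1) % len(self._slots)
        self._count -= 1
        return item


fifo = RingFifo(capacity=4)
# The producer runs ahead and writes three batches before the consumer is ready.
for batch in ("verts0", "verts1", "verts2"):
    fifo.push(batch)
# The consumer catches up later and still sees the batches in order.
drained = [fifo.pop(), fifo.pop(), fifo.pop()]
```

The buffer is what removes the need for both sides to be ready at the same instant. Where that buffer lives (cache, local store or main memory) is exactly the open question in this thread.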
     
  13. Gholbine

    Regular

    Joined:
    Jun 19, 2005
    Messages:
    294
    Likes Received:
    1
    Assuming the RSX can access the SPEs' local stores (which I don't think is an unreasonable assumption), what's stopping developers from using the local store as a FIFO buffer?
     
  14. Asher

    Regular

    Joined:
    Jul 1, 2005
    Messages:
    972
    Likes Received:
    8
    Location:
    Calgary, Alberta
    Nothing, provided the RSX can access the SPE's LS and the SPE knows enough to lock off the section of LS for the buffer.
     
  15. superguy

    Banned

    Joined:
    Jan 27, 2006
    Messages:
    472
    Likes Received:
    9
    Scooby..the floating point junk by Sony is hyperbole..but it was taken up and run with more by the message board fanboys than anything. It doesn't bother me. Now..I don't recall MS slides pointing out the transistor and instruction count of their GPU like Sony's did..so yes, Sony has tried to play up the brute strength angle of PS3 a lot more than MS has officially..

    And we also had the Major Nelson article from MS. Although, I think it was a REACTION, to all the Sony claims they were putting out there, along with the game media, that it was 100X Xbox360 or something. But again a lot of that did NOT come from Sony..it came from their fans.

    So basically I'm saying all considered Sony is off the hook for all that. And playing up Tech specs isn't wrong anyway.

    My problem with Sony is the wholesale passing off of FAKE videos at E3 as real. To me that was a huge deal, completely unprecedented in video game history for pure dishonesty. So I say be mad at Sony for THAT.

    Also, from the MS side, it's true that ATI has been very aggressive in hyping Xenos and downplaying PS3. However, ATI is not exactly MS either; they are ATI, acting somewhat independently.
     
  16. nAo

    nAo Nutella Nutellae
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    4,325
    Likes Received:
    93
    Location:
    San Francisco
    It really does not matter if it can change its configuration on the fly every 4 cycles, you should have all the granularity you need.

    What?! Last time I checked it only can't handle blending in that space.
    WHAT?!
    ERR!?!?!?
    Are you kidding!?
    meaningless number
    meaningless number
    this is the only correct thing you quoted..but you know, it's statistical..it is bound to happen :)
    LOL, maybe you want to reconsider your statement about edram here..;)
    other meaningless number
    same old story..
     
  17. nAo

    nAo Nutella Nutellae
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    4,325
    Likes Received:
    93
    Location:
    San Francisco
    Why would you need to lock off some LS section? It's not a cache; nothing will mess with that section unless you want it to.
     
  18. Asher

    Regular

    Joined:
    Jul 1, 2005
    Messages:
    972
    Likes Received:
    8
    Location:
    Calgary, Alberta
    I'm using the term loosely -- you don't want to trash it is all I'm saying.
     
  19. Mintmaster

    Veteran

    Joined:
    Mar 31, 2002
    Messages:
    3,897
    Likes Received:
    87
    Throw in time division, and it doesn't matter.
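
A rough Python illustration of the time-division point, using the "3 arrays of 16 pipes" figure quoted earlier in the thread. The per-slot scheduling model is an assumption made for this sketch, not a description of how Xenos actually schedules work, and `average_vertex_pipes` is a hypothetical helper.

```python
from fractions import Fraction

ARRAYS = 3            # per the thread: Xenos is 3 arrays of 16 pipes
PIPES_PER_ARRAY = 16


def average_vertex_pipes(schedule):
    """schedule: one entry per time slot, each the number of arrays
    (0..ARRAYS) doing vertex work in that slot. Returns the average
    number of pipes on vertex work across all slots."""
    for arrays_on_vs in schedule:
        if not 0 <= arrays_on_vs <= ARRAYS:
            raise ValueError("each slot must use between 0 and 3 arrays")
    total = sum(schedule) * PIPES_PER_ARRAY
    return Fraction(total, len(schedule))


# In any single slot the split is coarse: 0, 16, 32 or 48 pipes.
one_slot = average_vertex_pipes([1])
# One array on vertex work in 1 slot out of 4 averages out to 4 pipes.
quarter = average_vertex_pipes([1, 0, 0, 0])
```

Under that assumption, the 16-pipe granularity only limits the split within a single slot. Averaged over many slots, the effective vertex share can be made as fine as needed.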
     
  20. Mintmaster

    Veteran

    Joined:
    Mar 31, 2002
    Messages:
    3,897
    Likes Received:
    87
    It seems we're getting a bit OT here, but I'll continue nonetheless :grin:

    When I made my comment I was talking about c0_re's comment about POWER. Just because you have more "power" doesn't mean your games will be better. I think PS3 developers will likely be more competent, on average, than XB360 devs, so it still may have better graphics overall, even though IMO RSX has less power.

    Regarding my comments about Xenos:
    In graphics, framebuffer/Z bandwidth is far and away the biggest consumer of bandwidth, especially when you move to HDR and optimize textures for consoles. If RSX is churning out a puny 2GPix/s without alpha blending, it'll use over half its bandwidth once you include Z traffic. Throw in AA, HDR, and/or alpha blending and the situation's even worse. So this affects texture bandwidth as well.

    BTW, if anyone doubts the bandwidth issue, check out the B3D 7600GT review. 22.4GB/s all consumed in an ideal simple fillrate test with colour and Z @ 2.9GPix/s. It has a core clocked 31% faster than the 6800GS, 2.6 times the MADD rate, and numerous other improvements. Unfortunately, it only checks in around 15% faster in most games because it has 30% less bandwidth. RSX will pretty much be a 7600GT times two, but with exactly the same bandwidth.

    Let me reiterate: If RSX was halved, it would still be significantly hampered by lack of bandwidth!
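
The arithmetic behind the "over half its bandwidth" claim can be sketched in Python using only the 7600GT figures quoted above. It assumes the per-pixel colour + Z cost implied by that fillrate test carries over unchanged, which is a simplification; the variable and function names are made up for the sketch.

```python
# Figures quoted in the post above (7600GT fillrate test).
BANDWIDTH_GB_S = 22.4        # memory bandwidth, fully consumed in the test
TEST_FILLRATE_GPIX_S = 2.9   # colour + Z fillrate reached in that test

# Implied framebuffer traffic per pixel (GB/s divided by GPix/s = bytes/pixel).
bytes_per_pixel = BANDWIDTH_GB_S / TEST_FILLRATE_GPIX_S


def bandwidth_used(fillrate_gpix_s):
    """GB/s of colour + Z traffic at a given fillrate, assuming the same
    bytes-per-pixel cost as the quoted test."""
    return fillrate_gpix_s * bytes_per_pixel


# The 2 GPix/s case from the post, against the same 22.4 GB/s.
used = bandwidth_used(2.0)
fraction = used / BANDWIDTH_GB_S
print(f"{bytes_per_pixel:.2f} bytes/pixel, {used:.1f} GB/s, {fraction:.0%} of bandwidth")
```

With these inputs the test implies about 7.7 bytes of traffic per pixel, so 2 GPix/s would consume roughly 15.4 GB/s, or about 69% of 22.4 GB/s, before any texture fetches, AA, HDR or blending are counted.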
     
    #40 Mintmaster, Mar 22, 2006
    Last edited by a moderator: Mar 22, 2006