PS3 vs X360: Apples to Apples high level comparison...

Discussion in 'Console Technology' started by j^aws, May 22, 2005.

  1. nAo

    nAo Nutella Nutellae
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    4,400
    Likes Received:
    440
    Location:
    San Francisco
    Another assumption one can make, if we don't want to believe nvidia extended their pixel pipeline design, is that they counted the 2 (independent) co-issued Dot2 ops the first ALU can execute and summed them with the Dot4s from the VS pipelines.
    I don't even want to consider this option :)
     
  2. Panajev2001a

    Veteran

    Joined:
    Mar 31, 2002
    Messages:
    3,187
    Likes Received:
    8
    That option would still make 2 full Dot4s/cycle, because if they counted them all as just dot products, then how could we count the dot products coming from the Broadband Engine on a comparable basis?
     
  3. nAo

    nAo Nutella Nutellae
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    4,400
    Likes Received:
    440
    Location:
    San Francisco
    We can't, it doesn't make sense, that's why I reject this hypothesis.
     
  4. Tacitblue

    Newcomer

    Joined:
    Apr 23, 2005
    Messages:
    131
    Likes Received:
    1
    http://www.extremetech.com/article2/0,1558,1817022,00.asp

    Interview with one of the hardware guys on XBox 360.

    On the memory bandwidth issue, my guess is they're using HyperTransport again: ~22GB/sec lines up well with the current stats on the HT website, and IBM is part of the HT consortium. So that seems to be one element carried over from the original box.
     
  5. blakjedi

    Veteran

    Joined:
    Nov 20, 2004
    Messages:
    2,985
    Likes Received:
    88
    Location:
    20001
    Where is everyone getting the Cell chip dot-product information from? I didn't think that the Cell had a dot product function. I'm confused.
     
  6. nAo

    nAo Nutella Nutellae
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    4,400
    Likes Received:
    440
    Location:
    San Francisco
    The SPEs and the PPE's VMX unit don't have a dot product instruction AFAIK, but four vec4 dot products can be calculated at the same time with 4 fmadd instructions, so the average throughput is one dot4 per clock cycle.
    To be fair, things are more complex than that, as on SPEs fmadd instructions have a 6-cycle latency AFAIK.
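    A minimal sketch of the "four dot4s in 4 fmadds" idea, assuming the vectors are stored in structure-of-arrays form (all x components in one register, all y in another, and so on). The 4-lane lists stand in for SIMD registers, and the helper names `vfmadd`, `to_soa` and `dot4_x4` are hypothetical, not real SPE or VMX intrinsics:

```python
# Each "register" is a 4-lane list; lane i belongs to dot product i.

def vfmadd(a, b, c):
    """Lane-wise fused multiply-add: a * b + c on 4-lane registers."""
    return [x * y + z for x, y, z in zip(a, b, c)]

def to_soa(vectors):
    """Transpose four vec4s (x, y, z, w) into four registers: xs, ys, zs, ws."""
    return [list(component) for component in zip(*vectors)]

def dot4_x4(a_vectors, b_vectors):
    """Compute four vec4 dot products at once using exactly 4 fmadd steps."""
    ax, ay, az, aw = to_soa(a_vectors)
    bx, by, bz, bw = to_soa(b_vectors)
    acc = vfmadd(ax, bx, [0.0] * 4)  # first fmadd accumulates onto zero
    acc = vfmadd(ay, by, acc)
    acc = vfmadd(az, bz, acc)
    acc = vfmadd(aw, bw, acc)
    return acc                       # lane i holds a_vectors[i] . b_vectors[i]

a = [(1, 2, 3, 4), (0, 1, 0, 1), (2, 2, 2, 2), (1, 0, 0, 0)]
b = [(1, 1, 1, 1), (5, 6, 7, 8), (1, 2, 3, 4), (9, 9, 9, 9)]
result = dot4_x4(a, b)
print(result)  # [10.0, 14.0, 20.0, 9.0]
```

    Four instructions produce four results, which is where the average of one dot4 per cycle comes from. Each fmadd here depends on the previous one, so the latency only disappears if other independent work is interleaved.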
     
  7. blakjedi

    Veteran

    Joined:
    Nov 20, 2004
    Messages:
    2,985
    Likes Received:
    88
    Location:
    20001
    So in other words it does have the equivalent of a dot product function... just with fairly high latency. OK, so then when you say average throughput is one dot4 per clock cycle, are you talking per SPE or the entire chip?
     
  8. nAo

    nAo Nutella Nutellae
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    4,400
    Likes Received:
    440
    Location:
    San Francisco
    One dot4 per cycle per SPE, and even the PPE's VMX unit should provide one dot4 per cycle.
    The PS3 CPU would peak at 8 dot4s per cycle.
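    The peak figure above can be reproduced with simple arithmetic. This is only a sketch: the count of 7 SPEs is implied by the total of 8 (7 SPEs plus the PPE's VMX unit), and the 3.2 GHz clock is an assumption used for illustration.

```python
SPE_COUNT = 7                 # implied by the total of 8: 7 SPEs + 1 VMX unit
VMX_UNITS = 1                 # the PPE's VMX unit
DOT4_PER_CYCLE_PER_UNIT = 1   # average throughput claimed in the post
CLOCK_HZ = 3_200_000_000      # assumption: 3.2 GHz clock

def peak_dot4_per_cycle(spe_count=SPE_COUNT, vmx_units=VMX_UNITS):
    """Peak dot4 results per clock across all vector units."""
    return (spe_count + vmx_units) * DOT4_PER_CYCLE_PER_UNIT

def peak_dot4_per_second(clock_hz=CLOCK_HZ):
    """Peak dot4 results per second at the assumed clock."""
    return peak_dot4_per_cycle() * clock_hz

print(peak_dot4_per_cycle())          # 8
print(peak_dot4_per_second() / 1e9)   # 25.6 (billion dot4 per second)
```

    These are peak numbers: they assume every unit issues a useful fmadd on every cycle.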
     
  9. Panajev2001a

    Veteran

    Joined:
    Mar 31, 2002
    Messages:
    3,187
    Likes Received:
    8
    He is talking about each SPE.
     
  10. pc999

    Veteran

    Joined:
    Mar 13, 2004
    Messages:
    3,628
    Likes Received:
    31
    Location:
    Portugal
  11. Carl B

    Carl B Friends call me xbd
    Legend

    Joined:
    Feb 20, 2005
    Messages:
    6,266
    Likes Received:
    63
    PC999, as PSINext staff I might as well take point here and ask what it is exactly you want looked at? That's two pages of posts, some of them pretty long, and I think it would help if your question consisted of more than just 'take a look,' since there are two different conversations going on there.
     
  12. PC-Engine

    Banned

    Joined:
    Feb 7, 2002
    Messages:
    6,799
    Likes Received:
    12
    Didn't DeanoC, ERP, or aaaaa00 say XeCPU has a special dot product instruction?
     
  13. nAo

    nAo Nutella Nutellae
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    4,400
    Likes Received:
    440
    Location:
    San Francisco
  14. PC-Engine

    Banned

    Joined:
    Feb 7, 2002
    Messages:
    6,799
    Likes Received:
    12
    Cool thanks. :)
     
  15. j^aws

    Veteran

    Joined:
    Jun 1, 2004
    Messages:
    1,992
    Likes Received:
    137
    [IMG]

    Well, some more thoughts... My earlier derivation from the above requires 52 Dot/cycle, which vec4 units can provide, and a further 84 'other' units are needed to account for 136 Shops/cycle. Looking at that diagram again, there is no distinction between pixel and vertex units, only 'vector ALU' and SFUs, and the 'shader instruction processor' seems to be issuing to *all* those units. So the following is also a possibility, i.e.

    RSX ~ 136 Shops/cycle ~ 52 vec4 + 84 SFU ~ 2*(26 vec4 + 42 SFU)

    Kinda like 'unified' shader units that can execute either vertex or pixel instructions, with 'two pools' of these...?
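    A quick check of the arithmetic in the speculation above. The numbers come straight from the post; the variable names and the even two-way split are just a way of restating the guess, not a description of the actual hardware:

```python
TOTAL_SHADER_OPS = 136   # Shops/cycle figure being accounted for
DOT_OPS = 52             # Dot/cycle attributed to vec4 units
POOLS = 2                # the hypothesised 'two pools'

other_ops = TOTAL_SHADER_OPS - DOT_OPS   # ops left for the 'other' (SFU) units
vec4_per_pool = DOT_OPS // POOLS         # vec4 units per pool
sfu_per_pool = other_ops // POOLS        # SFUs per pool

print(other_ops)                         # 84
print(vec4_per_pool, sfu_per_pool)       # 26 42
print(POOLS * (vec4_per_pool + sfu_per_pool) == TOTAL_SHADER_OPS)  # True
```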
     
  16. Fafalada

    Veteran

    Joined:
    Feb 8, 2002
    Messages:
    2,773
    Likes Received:
    49
    A 6-cycle instruction latency in a 3+ GHz chip is actually very low, not high.
    For that matter, I kinda expect PPE/XCPU instruction latencies to be higher than that.
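    To put the 6-cycle figure in perspective, here is a small sketch that converts it to wall-clock time and estimates how much independent work is needed to hide it. The 3.2 GHz clock is an assumption, and the utilisation formula is a simplified model of a fully pipelined unit that can issue one instruction per cycle, not a measured result:

```python
CLOCK_HZ = 3_200_000_000   # assumption: 3.2 GHz clock
FMADD_LATENCY_CYCLES = 6   # latency quoted in the thread

def latency_ns(cycles=FMADD_LATENCY_CYCLES, clock_hz=CLOCK_HZ):
    """Latency in nanoseconds for a given cycle count and clock."""
    return cycles * 1_000_000_000 / clock_hz

def utilisation(independent_chains, latency_cycles=FMADD_LATENCY_CYCLES):
    """Fraction of issue slots used when each chain must wait for its own
    previous result (simplified model: one issue per cycle, fully pipelined)."""
    return min(1.0, independent_chains / latency_cycles)

print(latency_ns())     # 1.875
print(utilisation(3))   # 0.5
print(utilisation(6))   # 1.0
```

    Under this model, six independent dependency chains in flight are enough to keep the unit busy every cycle, so the latency costs throughput only when the code has fewer independent operations than that.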
     

  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.