Wii U hardware discussion and investigation *rename

Discussion in 'Console Technology' started by TheAlSpark, Jul 29, 2011.

Thread Status:
Not open for further replies.
  1. Exophase

    Veteran

    Joined:
    Mar 25, 2010
    Messages:
    2,406
    Likes Received:
    430
    Location:
    Cleveland, OH
    Not only is that die much smaller but it's most likely made on an older process. So RV710 sounds way too pessimistic.

    It is, however, possible that Nintendo is spending a little of that die space on including Hollywood for backwards compatibility purposes. I'm not saying Nintendo had to do this, but I wouldn't put it past them. In that case I would at least hope that the eDRAM is shared with it, but it'd still cost a couple dozen mm^2 or so.
     
  2. I.S.T.

    Veteran

    Joined:
    Feb 21, 2004
    Messages:
    3,174
    Likes Received:
    389
    Also, isn't the Wii U GPU manufactured at ol' 40nm? The die size of a RV710 at 55nm wouldn't really be comparable...
     
  3. BobbleHead

    Newcomer

    Joined:
    Sep 24, 2002
    Messages:
    58
    Likes Received:
    2
    This is true of all the recent AMD GPUs. You all should really not get hung up on finding an exact retail GPU version of the WiiU. It doesn't exist. The specific combination of the different components in WiiU does not match any other instance of the family it is based on. Anyone who says "it's clearly an rv750!" or "no way, it's obviously an rv720!" is wrong.

    Both of these statements are equally true:
    rv710 is a variant of rv770.
    rv770 is a variant of rv710.

    If the WiiU is a 7xx, then does it really matter if it started with 710 and a few numbers were increased or if it started with 770 and a few numbers were reduced?


    Nintendo will never release the exact config or clockspeed, but both will eventually be unofficially figured out via developer leaks and die shot analysis.
     
  4. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    10,245
    Likes Received:
    4,465
    Location:
    Finland
    Indeed, with eDRAM there's in fact still room for an RV730 in that chip (given that the 730 was done at 55nm and this is 40nm, of course)
     
  5. Shifty Geezer

    Shifty Geezer uber-Troll!
    Moderator Legend

    Joined:
    Dec 7, 2004
    Messages:
    44,106
    Likes Received:
    16,898
    Location:
    Under my bridge
    Is it likely that the config can be guessed at from a general arrangement of components? e.g. the 710 is:

    80 SPs
    4 ROPs
    8 TMUs

    If the SP:ROP:TMU ratios are kept the same, assuming they are balanced like this as a minimum spec (high-end chips could add more SPs, but you'll want a minimum for texturing and rendering at 720p), and we assume 8 ROPs to match PS360, that'd be a config of

    160 SPs
    8 ROPs
    16 TMUs

    That'd be a little too large and hot, I think, so perhaps they'd then reduce it to 120 SPs. Would they touch the TMUs? If BW is an issue, perhaps so. We can be confident that this GPU + eDRAM adds up to the 156mm^2, only without knowing whether Hollywood is included. I think that's enough for the more well-informed to suggest an R7xx config that'd fit the 120-ish mm^2 at 40 nm. Lots of guesswork involved, but it should be the closest we've yet got.
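The ratio scaling described above can be written down as a quick sketch. The RV710 baseline (80 SPs : 4 ROPs : 8 TMUs) comes from the post; the scaled result is pure speculation about a hypothetical config, not a confirmed Wii U spec.

```python
# Scale the RV710 unit ratios to an 8-ROP target, keeping SP:ROP:TMU
# proportions constant, as the post suggests. Speculative, not real specs.
def scale_to_rops(base, target_rops):
    """Scale every unit count by the factor needed to reach target_rops."""
    factor = target_rops / base["ROPs"]
    return {unit: int(count * factor) for unit, count in base.items()}

rv710 = {"SPs": 80, "ROPs": 4, "TMUs": 8}
print(scale_to_rops(rv710, target_rops=8))  # {'SPs': 160, 'ROPs': 8, 'TMUs': 16}
```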
     
  6. almighty

    Banned

    Joined:
    Dec 17, 2006
    Messages:
    2,469
    Likes Received:
    5
    Would 160 SPs, or even 120 SPs for that matter, make the GPU many times faster than Xenos with its 48 SPs?

    8 ROPs are a must, as otherwise it would be seriously crippled compared to Xenos and RSX.

    On the comment of it being smaller than Wii U's GPU: do we have any confirmation that Wii U's GPU is indeed on a 40nm process, or is that just people guessing?

    Surely, as well as the eDRAM sucking up extra transistors, there would also be added logic for the tablet?
     
  7. Gipsel

    Veteran

    Joined:
    Jan 4, 2010
    Messages:
    1,620
    Likes Received:
    264
    Location:
    Hamburg, Germany
    No, it would not (compare with RV730 and think of a shrunk version, or look at the difference between Cedar and Caicos at 40nm, or start with Redwood).
    Yes, they would. The TMUs are an integral part of the SIMDs/CUs starting with the R700 generation. AMD implemented two variants: half-size SIMDs (8*5 = 40 SPs, wavefront size 32) with 4 TMUs (RV710 and RV730) and full-size SIMDs (16*5 = 80 SPs, wavefront size 64) with 4 TMUs (RV770 and later RV790 and RV740).

    Edit:
    Xenos has basically almost R600-style SIMDs, which should count as 240 SPs for a comparison (3 full-size 16*5 = 80 SP SIMDs). One just needs to keep in mind that they are less flexible, as they are not VLIW5 but vec4+1.
    I guess 160 SPs at the same clock would have a hard job being significantly faster than Xenos (it would sometimes struggle just to keep up, and one would likely need 16 TMUs like Xenos for it). But the ROPs work quite a bit differently on recent GPUs, which adds uncertainty.
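The SP-equivalent count given above for Xenos follows from simple arithmetic, shown here as a sketch (the unit organization is as described in the post):

```python
# Counting Xenos ALU lanes the way the post describes: 3 full-size SIMDs,
# each 16 lanes wide, each lane a vec4+1 unit (5 components per lane).
simds, lanes, components = 3, 16, 5
xenos_sp_equivalent = simds * lanes * components
print(xenos_sp_equivalent)  # 240
```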
     
    #3527 Gipsel, Nov 27, 2012
    Last edited by a moderator: Nov 27, 2012
  8. almighty

    Banned

    Joined:
    Dec 17, 2006
    Messages:
    2,469
    Likes Received:
    5
    Wouldn't the architectural improvement that AMD made with R700 over R600 cover some of the issues that could be faced from porting from 360 to Wii U?
     
  9. Kaotik

    Kaotik Drunk Member
    Legend

    Joined:
    Apr 16, 2003
    Messages:
    10,245
    Likes Received:
    4,465
    Location:
    Finland
    To my understanding they're closer to R5xx vertex shaders than R600 shaders, even though they're unified like R600's
     
  10. Gipsel

    Veteran

    Joined:
    Jan 4, 2010
    Messages:
    1,620
    Likes Received:
    264
    Location:
    Hamburg, Germany
    Besides the LDS, the major difference is the arrangement of the TMUs. And that was mainly done to enable easier scaling to higher SIMD counts (quite successful if you look at RV730 or RV770). It is not inherently more efficient (the TMU load balancing is probably even slightly better with the R600 design, but it doesn't scale well).
    If you come from Xenos, the largest difference from an arithmetic performance point of view is probably the VLIW5 shaders, which are more efficient for general purpose shader code than the quite rigid vec4+1 setup. But the latter is probably quite close for "traditional" shader code. Another large potential difference is the ROPs, imo.
    The R5xx vertex shader part is the vec4+1 arrangement instead of VLIW5. But for comparisons with later GPUs, it's better to think of them as slightly less flexible R600 SIMDs (and the TMU arrangement of Xenos was also close to R600's).
     
    #3530 Gipsel, Nov 27, 2012
    Last edited by a moderator: Nov 27, 2012
  11. fellix

    Veteran

    Joined:
    Dec 4, 2004
    Messages:
    3,552
    Likes Received:
    514
    Location:
    Varna, Bulgaria
    Why would a halved multiprocessor alter the wavefront size? All of AMD's VLIW architectures execute wavefronts over 8 cycles -- two wavefronts interleaving each other for simple and effective utilization.

    Full-size SIMD: 16 lanes * 8 cycles / 2 WF = 64 WF size.
    A halved SIMD setup would simply keep the 8-cycle latency, minus the interleaving: 8 lanes * 8 cycles = 64 WF size.
     
  12. Gipsel

    Veteran

    Joined:
    Jan 4, 2010
    Messages:
    1,620
    Likes Received:
    264
    Location:
    Hamburg, Germany
    No, it doesn't. GPUs with halved SIMD sizes (RV610/620 were even quarter size ;)) still interleave 2 wavefronts; the wavefronts get smaller with reduced SIMD sizes. One can actually ask some APIs (like CAL) for the wavefront size, and it returns 32 for GPUs with half-size SIMDs and 64 for full-size SIMDs (edit: and iirc it is also officially documented somewhere; edit2: AMD's Micah Villmow confirms it here).
    The reason is probably some pipelining issue with the register file accesses or branches or whatever. It's simply easier to reduce the wavefront size, as everything else stays the same that way.
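The corrected relationship can be sketched as follows, assuming the usual VLIW issue model (a wavefront executes over 4 cycles on the SIMD, and two wavefronts interleave to produce the 8-cycle cadence the previous post mentioned):

```python
# Wavefront size as a function of SIMD width, per the correction above.
# A wavefront issues over 4 cycles; two wavefronts interleave for the
# 8-cycle cadence, so halving the SIMD halves the wavefront, rather than
# keeping it at 64 as the previous post assumed.
CYCLES_PER_WAVEFRONT = 4

def wavefront_size(simd_lanes):
    return simd_lanes * CYCLES_PER_WAVEFRONT

print(wavefront_size(16))  # 64 -> full-size SIMDs (RV770/RV740)
print(wavefront_size(8))   # 32 -> half-size SIMDs (RV710/RV730)
```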
     
    #3532 Gipsel, Nov 27, 2012
    Last edited by a moderator: Nov 27, 2012
  13. fellix

    Veteran

    Joined:
    Dec 4, 2004
    Messages:
    3,552
    Likes Received:
    514
    Location:
    Varna, Bulgaria
    Thanks for the clarification.

    My puzzlement came from a statement Nvidia made to developers a few years ago, that they would keep the warp size at 32 across all their GPU implementations. I thought AMD was sort of on the same bandwagon.
     
  14. Gipsel

    Veteran

    Joined:
    Jan 4, 2010
    Messages:
    1,620
    Likes Received:
    264
    Location:
    Hamburg, Germany
    They are now: starting with the R900/Northern Islands generation, all GPUs (even the smallest, Caicos) have a wavefront size of 64 (in R800/Evergreen only Cedar had half-size SIMDs, plus Ontario/Zacate if one counts APUs), and future incarnations of GCN will probably keep it that way.
     
  15. Mobius1aic

    Mobius1aic Quo vadis?
    Veteran

    Joined:
    Oct 30, 2007
    Messages:
    1,715
    Likes Received:
    293
    I think RV710, Cedar, or even Caicos are out of the question when you consider the die size and the performance they provide. Caicos comes close and, at 750 MHz, matches the 240 GFLOPS of Xenos, but still only has 8 TMUs and 4 ROPs. RV730 or Redwood continues to make the most sense, unless Nintendo had AMD build something custom like a 240 SP, 24 TMU, 12 ROP part with a nice clock speed like 750 MHz or so to guarantee a truly substantial boost over Xenos. Though shrinking RV730 to 40 nm or using Redwood would just make more financial sense, I think. Fun to speculate, though; even considering eDRAM, the die size involved with that plus the RV710, Cedar, or Caicos die sizes doesn't add up.
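The peak-FLOPS figures behind this comparison follow from each SP doing one multiply-add (2 flops) per clock. The Caicos and Xenos numbers are from the thread; the hypothetical 240 SP part is the poster's speculation:

```python
# Peak-GFLOPS arithmetic for the comparison above: SPs * 2 flops (MAD)
# per clock * clock in GHz. Custom-part numbers are speculative.
def peak_gflops(sps, clock_ghz):
    return sps * 2 * clock_ghz

print(peak_gflops(160, 0.75))  # 240.0 -> Caicos (160 SPs) at 750 MHz
print(peak_gflops(240, 0.50))  # 240.0 -> Xenos (240 SP-equivalents at 500 MHz)
print(peak_gflops(240, 0.75))  # 360.0 -> the hypothetical custom 240 SP part
```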
     
  16. stiftl

    Newcomer

    Joined:
    Jun 24, 2006
    Messages:
    118
    Likes Received:
    11
    Even if they had to include a whole Hollywood chip in the design (which they didn't, according to the Iwata Asks), this would probably leave something like 100mm² for the GPU alone (without eDRAM). 400 or 480 shader units aren't out of the question IMO, so something like Redwood or Turks (I know these are one and two generations younger, respectively, but just to get a feeling).
     
  17. dumbo11

    Regular

    Joined:
    Apr 21, 2010
    Messages:
    440
    Likes Received:
    7
    Based on the latest DF article, anyone got thoughts on this type of crazy argument:
    - the Wii-U compresses the video and sends it to the tablet.
    - if the image sent to the tablet is torn, the stream gets artifacts and looks ugly.

    That would leave developers stuck between a rock and a hard place in terms of features if they intend to use the tablet, and might explain why the Wii-U seems to perform below its spec?
     
  18. liolio

    liolio Aquoiboniste
    Legend

    Joined:
    Jun 28, 2005
    Messages:
    5,724
    Likes Received:
    195
    Location:
    Stateless
    While I would be pleased by the idea that the WiiU has a Redwood under the hood, I find it startling that the system performs so badly. I run a Redwood on the laptop I'm currently typing on, and it easily outperforms the 360.
    Looking at Redwood, Turks, or the AMD APUs, the only conclusion I can draw from watching a game like CoD struggle while rendering at 880x720 is that the overall design sucks badly. It will get better, but such crappy results should not happen to begin with.

    Anybody who released such a hardware platform (not even as a console) would get mocked, for a reason. I guess Nintendo is something special, like Apple, with the difference that Apple ships good hardware; their latest CPU is as good as it gets within its power budget.
     
    #3538 liolio, Nov 27, 2012
    Last edited by a moderator: Nov 27, 2012
  19. Exophase

    Veteran

    Joined:
    Mar 25, 2010
    Messages:
    2,406
    Likes Received:
    430
    Location:
    Cleveland, OH
    Could you link me to the interview that referred to Wii GPU support? Or preferably if you could give me the snippet that addresses it.
     
  20. Shifty Geezer

    Shifty Geezer uber-Troll!
    Moderator Legend

    Joined:
    Dec 7, 2004
    Messages:
    44,106
    Likes Received:
    16,898
    Location:
    Under my bridge
    There may be something to that. The compression is probably MJPEG, which isn't going to work well if you suddenly change three quarters of the framebuffer while it's supposed to be being compressed. This could see Nintendo enforcing a vertical sync to ensure the FB is complete before being compressed and broadcast.

    Anyone know how remote play on PSP/Vita works by comparison on games that tear?
     