Playstation 3 RSX Graphics is NV47 Based

Discussion in 'Beyond3D News' started by Dave Baumann, Mar 30, 2006.

Thread Status:
Not open for further replies.
  1. Dave Baumann

    Dave Baumann Gamerscore Wh...
    Moderator Legend

    Joined:
    Jan 29, 2002
    Messages:
    14,079
    Likes Received:
    647
    Location:
    O Canada!
    It’s long since been known that NVIDIA will be providing the graphics processor, known as the RSX "Reality Synthesizer", for Sony’s PlayStation 3. While the expectation has been that it would be of a similar configuration to NVIDIA’s <a href="http://www.beyond3d.com/misc/chipcomp/?view=chipdetails&id=106&orderby=release_date&order=Order&cname=" target="_b3dout">G70 chip</a>, powering the <a href="http://www.beyond3d.com/previews/nvidia/g70/" target="_b3dout">GeForce 7800</a> series, a precise answer as to its composition has never been given before. However, according to an article at <a href="http://www.watch.impress.co.jp/game/docs/20060329/3dps3.htm" target="_b3dout">watch.impress.co.jp</a>, which has slides from Sony’s GDC briefings, RSX is confirmed as being “NV47” based.

    NV47 is actually the previous codename for G70; in fact, development tools still list G70’s codename as NV47 rather than G70. The graphics slide also highlights that RSX has 24 texture units, which is consistent with G70. Given the architecture of NVIDIA's G7x series, this indicates that RSX will also have 24 fragment shader pipelines with two ALUs per pipeline.

    This may also give some clues as to why NVIDIA sought to release <a href="http://www.beyond3d.com/misc/chipcomp/?view=chipdetails&id=112&orderby=release_date&order=Order&cname=" target="_b3dout">G71</a> in the configuration that it has, with the same pipeline counts as G70, rather than opting for a chip with more pipelines as the popular speculation indicated prior to G71’s release. It is likely that much of the 90nm optimisation undertaken with G71 was to cut down on duplication of the work being carried out on RSX.

    The other known primary differences between G71 and RSX are that the PC's PCI Express interface will be replaced with a FlexIO interface for communication with the Cell processor, and that RSX will only use a 128-bit memory bus, rather than G71's 256-bit interface. Given the memory bandwidth difference between RSX and the PC versions, it raises the question of whether RSX will retain all 16 of G71's ROPs. With the previously stated 700MHz memory speeds for RSX it will end up with the same local bandwidth as NVIDIA's mid-range <a href="http://www.beyond3d.com/previews/nvidia/g73/" target="_b3dout">GeForce 7600 GT</a>, but it also has to contend with twice the texture consumption (when sampling from graphics RAM) and pixel processor capability, so it may not be much of a surprise if RSX enables 8 ROPs as opposed to the full 16 of G71.
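
    As a rough check on the bandwidth comparison above, peak local bandwidth is simply bus width times effective data rate. The sketch below assumes GDDR3 moves data twice per memory clock; the function name is illustrative, and the 800MHz memory clock used for the G71 case (a GeForce 7900 GTX class board) is an assumption that does not come from the article.

```python
def peak_bandwidth_gb_s(bus_width_bits, memory_clock_mhz, transfers_per_clock=2):
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    transfers_per_second = memory_clock_mhz * 1e6 * transfers_per_clock
    return bytes_per_transfer * transfers_per_second / 1e9


# RSX as described in the article: 128-bit bus, 700MHz memory.
# The same figures describe a GeForce 7600 GT class configuration.
rsx_local = peak_bandwidth_gb_s(128, 700)

# G71 comparison point: 256-bit bus; the 800MHz clock is an assumption.
g71_local = peak_bandwidth_gb_s(256, 800)

print(f"RSX local memory:  {rsx_local:.1f} GB/s")
print(f"G71 local memory:  {g71_local:.1f} GB/s")
print(f"RSX / G71 ratio:   {rsx_local / g71_local:.0%}")
```

    These are theoretical peaks only; sustained figures depend on access patterns and compression.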
     
  2. Geo

    Geo Mostly Harmless
    Legend

    Joined:
    Apr 22, 2002
    Messages:
    9,116
    Likes Received:
    213
    Location:
    Uffda-land
    #2 Geo, Mar 30, 2006
    Last edited by a moderator: Mar 30, 2006
  3. booomups

    Newcomer

    Joined:
    Nov 2, 2005
    Messages:
    116
    Likes Received:
    1
    Am I right in thinking that this 128-bit bus for the graphics card RAM is a low/limiting number?
     
  4. Geo

    I wouldn't think so for 720p. For 1080p? Maybe. MS said something snarky recently about 1080p being a pipe dream for PS3, tho don't know if that's what they were thinking of when they said it (i.e. 128bit bus).
     
  5. superguy

    Banned

    Joined:
    Jan 27, 2006
    Messages:
    472
    Likes Received:
    9
    Very likely.

    http://www.beyond3d.com/previews/nvidia/g73/index.php?p=10

    The 7600 GT is only 13% faster than the 6800GS at 1280x960 in FEAR.

    Both have 12 pixel pipes, but the 6800GS is clocked at 425MHz vs 560MHz for the 7600GT. Plus the 7600's pipes have more capability. The 7600GT should be much faster than it is.

    FEAR shows this; many other games do not. I'm guessing it's because even with no AA/AF, FEAR is hitting bandwidth limitations. Older games, not so much.

    But remember, RSX has 2X the pixel pipes of the 7600GT!

    With AA or HDR, it will be even worse.

    We'll see..doesn't seem to me Sony would have left 24 pipes in if they thought they couldn't use them...
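
    To put numbers on the scaling argument in this post, the sketch below compares the core clock advantage with the observed FEAR result. It uses only the figures quoted in the post (425MHz, 560MHz, 13%) and assumes, as the post does, that performance would scale with core clock if nothing else were limiting; the variable names are illustrative.

```python
# Core clocks and observed speedup as quoted in the post above.
core_clock_mhz = {"6800 GS": 425, "7600 GT": 560}
observed_speedup = 0.13  # 7600 GT over 6800 GS in FEAR at 1280x960

# Speedup expected from clock alone, with equal pipe counts (12 each).
clock_advantage = core_clock_mhz["7600 GT"] / core_clock_mhz["6800 GS"] - 1

# Share of the clock advantage that actually shows up in the benchmark.
realised_fraction = observed_speedup / clock_advantage

print(f"Clock advantage:   {clock_advantage:.1%}")
print(f"Observed speedup:  {observed_speedup:.1%}")
print(f"Fraction realised: {realised_fraction:.1%}")
```

    A shortfall like this is consistent with a bandwidth limit, though a single benchmark cannot rule out other bottlenecks.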
     
  6. LunchBox

    Regular

    Joined:
    Mar 13, 2002
    Messages:
    901
    Likes Received:
    8
    Location:
    California
    the 24 pipes are pixel shaders not the ROP...

    it just means it has more shader power....
     
  7. LightHeaven

    Regular

    Joined:
    Jul 29, 2005
    Messages:
    538
    Likes Received:
    19
    I think that's his point: he thinks that the bandwidth will limit RSX just like it limits the 7600. Even though the latter has more power than the 6800GS, it doesn't perform as well as it should.
     
  8. pcchen

    pcchen Moderator
    Moderator Veteran Subscriber

    Joined:
    Feb 6, 2002
    Messages:
    2,716
    Likes Received:
    89
    Location:
    Taiwan
    If you use a lot of shaders in a game, the 7600GT can be quite a bit faster than the 6800GS.
    Of course, the question becomes: when you use this amount of shaders, will it be fast enough for 30fps/60fps?
     
  9. MulciberXP

    Regular

    Joined:
    Oct 7, 2005
    Messages:
    331
    Likes Received:
    7
    Can't the RSX also draw from the XDR memory bus?
     
  10. Guden Oden

    Guden Oden Senior Member
    Legend

    Joined:
    Dec 20, 2003
    Messages:
    6,201
    Likes Received:
    90
    You can't draw any conclusions from ONE PC game and use that as some kind of basis for predicting PS3 behavior. Doesn't work.
     
  11. Titanio

    Legend

    Joined:
    Dec 1, 2004
    Messages:
    5,670
    Likes Received:
    51
    Yes, something that seems to be lost in comparisons with PC chips. RSX can entirely saturate XDR if it wants, and from the sounds of some comments from devs, it's very viable to both texture and vertex fetch from XDR.

    Also, with the 7600GT comparisons - it has only 12 pixel shaders and 5 vertex shaders. The 7600GT isn't a GTX with less bandwidth, if that's the comparison that's trying to be made with RSX.
     
  12. Dave Baumann

    The point about the 7600 GT is that it only has half the texture and pixel processing capabilities - RSX will be able to process far more fragments hence it will be more bandwidth hungry than 7600.
     
  13. flick556

    Newcomer

    Joined:
    May 4, 2003
    Messages:
    163
    Likes Received:
    4
    Can things be streamed directly from Cell into some sort of cache on the GPU? This would only rely on bandwidth from the EIB, plus whatever Cell uses in XDR to create the data. For geometry, subdivided surfaces calculated on Cell could be a win, and they are more straightforward than other types of HOS. For textures, many of the methods used to compress textures for the disk/CD could be calculated as a stream to the GPU instead of accessed in memory. If you leave the rez at 720p and use RGB32-based HDR, bandwidth requirements may stay low.

    Some of these changes will likely have effects on the Art Creation process. This type of system would be significantly different than what is best for xbox360 and pc. Porting may have some problems. This stuff may have a large CPU overhead.

    On the plus side, I can imagine some major image quality benefits. If Cell can stream compressed textures, we may see some pretty big jumps in compression rates compared to current GPU formats. Wavelet-based compression should offer nice results if speed is good. Maybe they can even allow the GPU to render the framebuffer to Cell, where it is compressed; this way multi-frame effects consume less memory and bandwidth. For certain types of textures, vectorization may offer some really big compression rates, but at a potentially great CPU cost.

    Subdivided surfaces may be able to push the geometry to really high levels, though most games don’t seem to push the current geometry load as far as it can go. The reason is that devs are conservative with the poly budget to avoid re-tweaking models later. If subdivision is done on Cell, poly counts may be tweaked in a dynamic fashion, maybe even at runtime for LOD reasons. Artists are already starting to model using high poly counts and then rely on programs to simplify the model and generate normal maps. These programs could be taken further to generate procedural models with per-vertex subdivision info. I doubt Cell's texture access speed is good enough for displacement maps, but I don’t know for sure.

    I can easily imagine this stuff creating a significant cpu overhead. It may consume a full SPU or even two. The results will likely have a significant visual impact. I have been interested in these particular uses for cell since I first read about it and can’t wait for a more solid understanding of the link between RSX and Cell.
     
  14. LightHeaven

    Regarding the above post about Cell and RSX: I can't find anything to confirm this, but when RSX is accessing data from Cell, the dataflow goes through Flex I/O, right?

    Considering that 256MB of GDDR3 isn't enough memory for everything RSX needs, could Flex I/O be enough to handle these bandwidth requirements plus stream data dynamically created on Cell directly to RSX?

    On 360 I know that Xcpu and Xenos understand the same D3D formats, and Xcpu doesn't have to tessellate to get high poly counts (and thus uses less bandwidth), but on PS3 I just think there's no way Flex I/O could deal very well with all the bandwidth that seems to be required... Unless they trade a lot of Cell's power for less bandwidth usage, but since I don't think RSX would understand some more advanced formats, I don't know if that's even feasible.
     
  15. flick556

    Flex I/O provides 76.8GB/s of bandwidth (I believe half in each direction), a nice number. I can't think of anything other than GPU communication that would use all this bandwidth. When I said EIB I meant that it does not pass through XDR along the way. Flex I/O will probably be a needed stop along the path to the GPU. What I would hope for would be Cell->EIB->Flex I/O->GPU cache. If things are not in sync at the hardware level while RSX is rendering, you will need to store things in memory buffers, giving you something like this: Cell->EIB->XDR->EIB->Flex I/O->GPU cache. XDR is the slowest link in the second path.
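
    The two paths described above can be compared by taking the slowest link on each one. This is only a sketch: the Flex I/O figure is half of the 76.8GB/s quoted in this post, while the XDR figure (25.6GB/s) and the EIB figure (a round placeholder, not a measured number) are assumptions that do not come from the thread, and the function name is hypothetical.

```python
def path_throughput_gb_s(path, link_bandwidth_gb_s):
    """A path can move data no faster than its slowest link."""
    return min(link_bandwidth_gb_s[link] for link in path)


link_bandwidth_gb_s = {
    "EIB": 200.0,         # placeholder assumption, chosen to be non-limiting
    "FlexIO": 76.8 / 2,   # one direction, per the post's "half in each direction"
    "XDR": 25.6,          # assumption: not stated in the thread
}

# Cell -> EIB -> Flex I/O -> GPU cache
direct_path = ["EIB", "FlexIO"]
# Cell -> EIB -> XDR -> EIB -> Flex I/O -> GPU cache
buffered_path = ["EIB", "XDR", "EIB", "FlexIO"]

direct = path_throughput_gb_s(direct_path, link_bandwidth_gb_s)
buffered = path_throughput_gb_s(buffered_path, link_bandwidth_gb_s)

print(f"Direct path limit:   {direct:.1f} GB/s")
print(f"Buffered path limit: {buffered:.1f} GB/s")
```

    Under these assumed figures the buffered path is capped by XDR, and in practice the cap would be lower still because the data is both written to and read back from XDR.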

    I agree that it may tax the CPU, but the end result may well be worth it. Achieving large amounts of compression using Cell may produce better results than doubling the bandwidth to memory. Both would be nice :)

    If the compressed textures and geometry sit in XDR for Cell to decompress, then what sits in GDDR3? Maybe data is passed both to and from Cell during rendering; RSX would need to know how to pass compressed data to Cell at the hardware level and wait or do other things until Cell gives results back. The compressed data would have a smaller footprint in GDDR3, and this also keeps the process from interfering with other things running on Cell that may be accessing XDR.
     
    #15 flick556, Mar 30, 2006
    Last edited by a moderator: Mar 30, 2006
  16. LightHeaven

    That's for a 4.6 GHz Cell, I think.

    The one in PS3 provides 35 GB/s (or something like that, I don't know for sure), being 20 GB/s from Cell to RSX and 15 GB/s from RSX to Cell (someone correct me if I'm wrong).

    That's why I think it's just not enough...
     
  17. Titanio

    And it has more bandwidth available to it :???:

    You could start thinking of some more advanced things already. Incognito are doing all their cloud rendering on Cell (volumetric raytracer) in Warhawk... aside from sparing the GPU the no doubt significant computational weight of that task, it's likely also saving main memory bandwidth, as the blending for all that transparency wouldn't be done over main memory.

    It's enough to saturate XDR, though, and then some (sure, not entirely in any one direction, but some mix of both).
     
    #17 Titanio, Mar 30, 2006
    Last edited by a moderator: Mar 30, 2006
  18. Dave Baumann

    Not framebuffer bandwidth (it's very unlikely that pixel writes will be going anywhere but the local RAM), and PR specs don't always tell the entire story.

    However, simply look at the fill performance of the GS: on just colour writes or Z writes it's bandwidth constrained with nothing else going on, and if you combine Z tests and colour tests simultaneously (look at the MSAA fill tests) it drops even further below the theoretical maximums, and that's not considering blends. The simple fact is that even with nothing else going on the framebuffer bandwidth doesn't sustain 8 colour pixels or 16 Z samples (let alone full colour + Z tests / writes / blends etc.) at full rate. If the mix of ops is such that it doesn't get to that bandwidth then it means it's using even fewer than 8 ROPs.
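
    A back-of-the-envelope version of this argument is sketched below. It assumes 4 bytes per colour pixel and 4 bytes per Z sample, no framebuffer compression, and uses the 7600 GT's 560MHz core clock quoted earlier in the thread together with a 128-bit, 700MHz memory bus as a stand-in; none of these per-pixel costs come from the thread, and the function name is illustrative.

```python
def rop_demand_gb_s(units_per_clock, core_clock_mhz, bytes_per_unit):
    """Raw framebuffer traffic in GB/s if the ROPs ran at full rate."""
    return units_per_clock * core_clock_mhz * 1e6 * bytes_per_unit / 1e9


CORE_CLOCK_MHZ = 560  # 7600 GT core clock as quoted in the thread

# 128-bit bus, 700MHz memory, two transfers per clock (theoretical peak).
available_gb_s = (128 / 8) * 700e6 * 2 / 1e9

colour_only = rop_demand_gb_s(8, CORE_CLOCK_MHZ, 4)    # 8 colour writes/clock
z_only = rop_demand_gb_s(16, CORE_CLOCK_MHZ, 4)        # 16 Z samples/clock
# Colour write + Z read + Z write for 8 pixels per clock (assumed 12 bytes).
colour_plus_z = rop_demand_gb_s(8, CORE_CLOCK_MHZ, 12)

for label, demand in [("colour only", colour_only),
                      ("Z only", z_only),
                      ("colour + Z", colour_plus_z)]:
    print(f"{label:12s} {demand:6.2f} GB/s "
          f"({demand / available_gb_s:.0%} of peak)")
```

    Under these assumptions colour-only writes already need most of the theoretical peak, which real memory rarely sustains, and any workload involving Z exceeds it outright.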
     
  19. Titanio

    Well you're talking about sustaining ROPs whereas earlier you seemed to be talking about sustaining shaders, which would be as much a function of vertex/texture fetch, perhaps, as framebuffer bandwidth. I agree the framebuffer will be kept only in VRAM - or, at least, RSX's ;) RSX need not be the only chip writing pixels to the screen in PS3 games.

    As for the value of ROPs beyond 8, I guess it depends how many transistors they'd be saving, or whether it'd be worth keeping them in for the minority of cases where they would still be useful (at least I have seen comment passed by a couple of devs that for some bursts of activity that are cache-friendly, more ROPs would be better than fewer, even if typically you did not have enough main memory bandwidth to justify them).
     
    #19 Titanio, Mar 30, 2006
    Last edited by a moderator: Mar 30, 2006
  20. Dave Baumann

    The original item was talking about ROPs - the other point is that, along with the number of ROPs, the rest of the RSX is more powerful, ergo it is going to consume more bandwidth than G73 does. It's also dangerous to assume that consuming significant bandwidth from elsewhere is going to be the norm.

    The point is that the bandwidth doesn't sustain 8 in the first place - having all 16 enabled will not benefit "short bursts" any more than 8 would; drop ROP activity below the level of the bandwidth and you'll be using fewer than 8 anyway.

    If it does turn out that 8 are active then I suspect that 16 will still be present in the chip, just 8 disabled (some element of redundancy).
     