Can this SONY patent be the PixelEngine in the PS3's GPU?

Discussion in 'Console Technology' started by j^aws, Jun 1, 2004.

  1. Panajev2001a

    Veteran

    Joined:
    Mar 31, 2002
    Messages:
    3,187
    Likes Received:
    8
    No, it will not explode like that, but it is still going to increase compared to what you do now, especially because I do not see the jump in Math ops per cycle being massive enough to completely eliminate the use of cube-maps, 3D textures, etc. as look-ups/shortcuts.

    Even if we do eliminate the short-cuts, we are likely to see an increase in texture data usage: no, it will not grow like Shader ops usage, but it will be a problem that needs to be taken care of.

    If we want to support a huge number of Math ops per fragment, we should aim not only for a large number of ALUs, but also for decent efficiency.

    I know that parallelism helps: we can have more pixels in flight, and even if a texture takes a while to get to the ALU, we hide the latency by having so many pixels being processed.

    It is the idea behind the story: we can have an ALU that clocks at 100 MHz, or we can do the same work with two ALUs that are each half as efficient.

    Depending on how we do take care of the latency in the APUs, we will vary their efficiency.

    If texture fetches take lots of cycles then the APUs' IPC will go down a lot, and in order to compensate for the lost efficiency we will need more APUs dedicated to Pixel Shading work.

    If we can afford the extra APUs, it is OK, but what if the efficiency drop is so high that we cannot afford the extra APUs?

    We would have the Shading power to run those long Shaders you mention ( for complex lighting models ), but in reality we will not be able to run them at decent speed unless we keep texture fetches to a minimum ( especially dependent texture reads, which cannot be optimized by the "pack texture fetches and send them early to the Pixel Engines" kind of trick ).

    I know that a solution can be found ( to the potential problem of reasonably long latencies for texture fetches ) and we can even look at current patents for the APUs and this one for the SALC/SALP in order to see what can be done... as I said, something can be done, I am sure, but I am interested in exploring what we can do...
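    The latency-hiding trade-off described above can be put into a back-of-envelope model. Everything here is my own illustrative sketch, not anything from the patents: assume each pixel needs some number of math ops plus one texture fetch, and the ALU round-robins between pixels while a fetch is outstanding.

```python
def alu_utilization(math_ops, fetch_latency, pixels_in_flight):
    """Toy model: fraction of cycles an ALU does useful work when it
    round-robins over several pixels while each waits on one texture
    fetch. Per pixel there are `math_ops` ALU cycles and one fetch
    costing `fetch_latency` cycles; with enough pixels in flight, other
    pixels' math fills the fetch gap. All parameters are assumptions."""
    busy_cycles = pixels_in_flight * math_ops
    window = math_ops + fetch_latency
    return min(1.0, busy_cycles / window)

# One pixel in flight: a 90-cycle fetch leaves the ALU busy 10% of the time.
print(alu_utilization(10, 90, 1))   # 0.1
# Ten pixels in flight fully hide that same latency.
print(alu_utilization(10, 90, 10))  # 1.0
```

    The model makes the post's point concrete: with short Shaders (few math ops per fetch) you need many pixels in flight, or many extra APUs, just to keep one ALU fed.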
     
  2. Panajev2001a

    Veteran

    Joined:
    Mar 31, 2002
    Messages:
    3,187
    Likes Received:
    8
    Texture fetches create a dependency issue for the rest of the Shader ops, and the APU will stall for the cycles it takes the texture fetch to come back... I do not see them set up for OOOe nor for automatic CMT ( switch-on-event MT ).

    How slow would it be for the APU to DMA the current context ( Stack and PC ) back to shared DRAM ( we only have 128 KB of LS per APU ) and start processing a new pixel ?

    I guess that context switching would not be needed when shading pixels from the same primitive: the problem is that if primitives start descending into the 1-4 pixel range in terms of area, then we have to make sure each APU is processing multiple primitives in parallel.

    How would that work ? That is an area I am trying to think about.

    On the bright side, if we look at the hypothetical VS with 4 PEs and 32 nice little APUs, we could see that we potentially have quite a bit of pixels in flight at the same time, and that with very long Shaders, adding relatively few cycles of texture fetch latency will not impact the situation too much: we are using many more Math ops than Texture fetches, and we should have a quite high number of Math ops not dependent ( on Texture fetches ) that can run at full speed on the APU.

    It is just the freak in me that wants the APUs to run as close to peak performance as possible.
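    To put rough numbers on the context-switch question above: the cost is essentially two DMA transfers (spill the old context, fetch a new one) plus fixed setup overhead. Every figure here is an illustrative guess of mine; the patents give no such costs, and the helper name is my own.

```python
def context_switch_cycles(context_bytes, bus_bytes_per_cycle, setup_cycles):
    """Rough cost, in cycles, of DMAing one shader context out to shared
    DRAM and another one back in: two transfers plus fixed DMA setup.
    All parameters are hypothetical, not figures from the patents."""
    transfer_cycles = 2 * context_bytes / bus_bytes_per_cycle
    return setup_cycles + transfer_cycles

# E.g. a 4 KB context over a 16-byte/cycle bus with 50 cycles of setup:
print(context_switch_cycles(4096, 16, 50))  # 562.0
```

    Under these assumed numbers a switch costs a few hundred cycles, so it would only pay off if the texture fetch it hides is longer than that - which is exactly why keeping several primitives resident per APU looks more attractive than full context swaps.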
     
  3. Laa-Yosh

    Laa-Yosh I can has custom title?
    Legend Subscriber

    Joined:
    Feb 12, 2002
    Messages:
    9,568
    Likes Received:
    1,455
    Location:
    Budapest, Hungary
    Re Subdivs... I'd expected the valence thing to be something like this, just wanted to know for sure - thanks for the explanation!

    ERP, most of the models I've made are >95% quad polygons, and most of the vertices have a valence of 4, with a few having 3 or 5. A valence of 6 should not happen, as there are many ways to avoid it.
    I'm not sure about the exact details, but I can get you some statistics if you'd like to see them... All in all, I usually like to model 'clean' meshes, because they're generally better to work with later in the pipeline - irregular vertices can easily cause UV stretching, skinning problems, surface distortion and so on.
    So, I don't want to judge your artist without seeing his work at all, but maybe that subdiv car could be optimized a little. Then again, I'm almost exclusively working with characters, and I know that cars and other mechanical objects are a bit harder to build with subdivs...
     
  4. London Geezer

    Legend Subscriber

    Joined:
    Apr 13, 2002
    Messages:
    24,151
    Likes Received:
    10,297

    :oops: *guilty*.... :oops:
     
  5. Guden Oden

    Guden Oden Senior Member
    Legend

    Joined:
    Dec 20, 2003
    Messages:
    6,201
    Likes Received:
    91
    But all modern 3D renderers subdivide quads anyway to triangles, so the practical difference should be minimal. Even storage space should be the same...
     
  6. Laa-Yosh

    Laa-Yosh I can has custom title?
    Legend Subscriber

    Joined:
    Feb 12, 2002
    Messages:
    9,568
    Likes Received:
    1,455
    Location:
    Budapest, Hungary
    Nope, because triangulation only occurs AFTER the subdivision. I'd expect a realtime implementation to work like this, too; at least for Catmull-Clark subdivs, which work best with quad faces.
    For triangle-based subdivs there are other schemes, like butterfly and such - but these aren't really used, at least not in offline rendering.
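    A quick face-count illustration of why the ordering matters (a toy example of my own; in practice the limit surfaces differ too, not just the counts). Catmull-Clark turns every n-gon into n quads on the first level and 4 quads per quad thereafter, so subdividing a quad and then triangulating gives a different mesh than triangulating first:

```python
def subdivide_then_triangulate(levels):
    """Catmull-Clark a single quad `levels` times (x4 quads per level),
    then split each resulting quad into 2 triangles."""
    quads = 4 ** levels
    return 2 * quads

def triangulate_then_subdivide(levels):
    """Split the quad into 2 triangles first; Catmull-Clark turns each
    triangle into 3 quads on level 1, then 4 quads per quad after that."""
    assert levels >= 1
    quads = 2 * 3 * 4 ** (levels - 1)
    return 2 * quads

print(subdivide_then_triangulate(2))   # 32 triangles
print(triangulate_then_subdivide(2))   # 48 triangles
```

    So pre-triangulated input costs 50% more faces at every level, on top of the poorer surface quality Catmull-Clark gives on triangles - which is why triangulation happens after subdivision.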
     
  7. qwerty2000

    Newcomer

    Joined:
    May 24, 2003
    Messages:
    149
    Likes Received:
    0
    Location:
    New Jersey
    I think the PS3 should use 64 or 128-bit HDR rendering methods, as supported by the latest PC graphics cards. High Dynamic Range rendering is the way of the future (Unreal Engine 3 has it too). We could be seeing CGI graphics in real time.
     
  8. London Geezer

    Legend Subscriber

    Joined:
    Apr 13, 2002
    Messages:
    24,151
    Likes Received:
    10,297
    HDR on its own is not gonna show you CGI-level graphics in real time. It helps, but there's much more to it than that :wink:
     
  9. j^aws

    Veteran

    Joined:
    Jun 1, 2004
    Messages:
    1,992
    Likes Received:
    137
    I thought I'd do a quick recap of the diagrams from the various Sony patents hinting at the PS3's GPU... Strangely, there are three FIG.6's (666 :twisted: :shock: )

    I'll call these DIAGs A-C in order of patent application,

    DIAG A from Cell patent

    [image]

    DIAG B from SALC/SALP patent

    [image]

    DIAG C from Rendering patent

    [image]

    Okay....


    Where's the treasure! :D

    What I don't get is that the GPU in DIAG A has separate pools of eDRAM and image cache (VRAM?), but the GPU in DIAG C has a shared pool of image memory (VRAM?). Are these two differing GPUs? I.e. the GPU in DIAG C seems to be without APUs and built solely with SALPs? :?

    The GPUs from A and C seem to be on differing buses as well? :?
     
  10. qwerty2000

    Newcomer

    Joined:
    May 24, 2003
    Messages:
    149
    Likes Received:
    0
    Location:
    New Jersey
    I know the PS3 will have a much better rasterizer than the PS2 (when I say better, I mean like 100-200x better). This time the PS3 will use FSAA, raytracing, displacement mapping, anisotropic filtering, and more. But HDR is going to help a lot.
     
  11. nAo

    nAo Nutella Nutellae
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    4,400
    Likes Received:
    440
    Location:
    San Francisco
    How do you know? please, enlighten us! :)
     
  12. Megadrive1988

    Veteran

    Joined:
    May 30, 2002
    Messages:
    4,723
    Likes Received:
    242
    there are many elements needed to create realtime visuals that resemble pre-rendered CGI. no single element (i.e. an effect or set of effects) can give you that. framerate and lighting are critical elements, as is a decent amount of anti-aliasing. it is all the needed visual elements combined that would give the appearance of CGI-like graphics. if you're missing any one of the basic elements, or not enough of something (i.e. geometry, or framerate), you will lose the CGI look.


    one thing though, we don't need raytracing to have CGI-ish graphics. even in CGI, raytracing is somewhat sparse, i.e. only some scenes in Toy Story 2 use raytracing. you can get away without having raytracing. in my opinion, even though the upcoming consoles *might* have enough processing power to handle light or modest amounts of raytracing, this is not the generation where raytracing is going to be used very much. it's like texture mapping pre-1994 on the 3D chips used in the 16-Bit consoles: they barely had enough power to crank out a few thousand polys. while texture mapping could be done (SuperFX2 Commanche), it would not be used very much if at all.

    I would not expect heavy amounts of raytracing on complex scenes until PS4-N6-X3.

    While I am sure PS3-N5-X2 will be capable of limited raytracing in simple to modest scenes, there wouldn't be enough processing power for CGI level models let alone environment, gameplay, etc., PLUS raytracing. the upcoming consoles will have their hands full, and so will programmers, just trying to give us the complexity we all want. raytracing would be a real strain on these machines.

    I think after this upcoming generation, console makers who care about graphics (Sony, MS) will realize if they do not already, that the only way to make a large leap beyond PS3-X2, that the mainstream consumer will notice, is to have things like raytracing and all the cool global illumination stuff. Sony-MS will probably be looking into a combination of hardware raytracing units and software hacks that reduce the computational time for raytracing. just a wild theory here.

    ok enough rambling for this post.
     
  13. Panajev2001a

    Veteran

    Joined:
    Mar 31, 2002
    Messages:
    3,187
    Likes Received:
    8
    Jaws, look at Fig. 6 of the Rendering patent, that would seem to be ( the whole figure ) a Processor Element to be used in the Visualizer:

    the Image Memory would be the Image Cache and main Memory would be the Shared DRAM ( e-DRAM ).

    The Parallel Rendering Engine might be realized with the APUs + SALPs.

    That picture might not show the PE of the Visualizer directly, but it can be stretched to encompass it ( you can think about which elements of a Visualizer PE would fit in that schematic ), as the text of the patent seems to go far beyond what the GSCube proposed, also in terms of distributed rendering over a network.

    It might have started as an off-shoot of the GSCube research, with ideas from it extended/considered for use in a CELL-based platform.
     
  14. Megadrive1988

    Veteran

    Joined:
    May 30, 2002
    Messages:
    4,723
    Likes Received:
    242
    by the way, since one of this thread's main topics is fillrate, I thought I'd remind everyone of GSCube's fillrate ^__^


    16 processor GSCube

    37.7 Gpixels/s (2.36 Gpixels/s x 16)

    http://ps2.ign.com/articles/082/082490p1.html

    note that the Graphic Synthesizers are clocked just under 150 MHz, just in case anyone was wondering where this figure comes from.

    16 GSs * 16 pixel engines/pipes * 147-point-something MHz (I didn't figure the exact clocking of the GSs here, I suck at math)

    edit: it was right there, under my nose, 147.456MHz :oops:

    anyway, the point is, the 16 processor GSCube has an untextured fillrate of almost 38 gigapixels. obviously 1 texture (plus bilinear filtering) cuts that in half to 19 gigapixels (18.85 to be near-exact)

    edit: naturally, the 64 processor GSCube had an untextured fillrate of about 150 gigapixels, or about 75 gigapixels textured.
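    The arithmetic above checks out; as a quick sanity check (clock and per-GS pipe count taken straight from the post):

```python
# Sanity-checking the GSCube fillrate figures quoted above.
clock_hz = 147.456e6     # Graphics Synthesizer clock, just under 150 MHz
pipes_per_gs = 16        # pixel engines/pipes per GS

untextured_16 = 16 * pipes_per_gs * clock_hz  # 16-processor GSCube
untextured_64 = 64 * pipes_per_gs * clock_hz  # 64-processor GSCube

print(untextured_16 / 1e9)      # ~37.75 Gpixels/s untextured
print(untextured_16 / 2 / 1e9)  # ~18.87 Gpixels/s with 1 bilinear texture
print(untextured_64 / 1e9)      # ~150.99 Gpixels/s untextured
```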

    And now we have hints that PlayStation 3's GPU might be producing tens of billions of pixels/sec, so it seems to me that PS3 might at least rival the 16-processor GSCube in terms of raw fillrate, if not surpass it - something that would not be true with the previous Visualizer estimates of under 10 gigapixels (from 4 pipes, 1 per PE, * 1-2 GHz). And of course, much more work will be done on PS3 pixels than was done on GSCube pixels, thanks to the APUs in the Visualizer, which the GS did not have. but my post here was really simply about raw pixel and textured pixel fillrate.

    now that PS3's fillrate at least *seems* to be falling into place nicely, we can worry about RAM memory :shock:
     
  15. Guden Oden

    Guden Oden Senior Member
    Legend

    Joined:
    Dec 20, 2003
    Messages:
    6,201
    Likes Received:
    91
    If you'd replace "will" with "might" (or equivalent), I'd feel a lot more comfortable. :p

    Anyway, I think it's pretty out of the question to have a GPU doing tens of gpix/s. It's completely idiotic just to have ten gpix fillrate because we don't NEED that many pixels filled to begin with in a console. Much less TENS of gpixes...

    Let's just say I would be highly surprised if this was to be true. I'd be happy with a part doing 2x PS2 untextured fillrate, but with textures, providing it has powerful pixel shading capabilities as well. THAT is where the bottleneck will be, having a supadupa rasterizer that can draw hundreds of polygon layers per frame that will never be seen is just uselessly burning power for no reason.
     
  16. ERP

    ERP
    Veteran

    Joined:
    Feb 11, 2002
    Messages:
    3,669
    Likes Received:
    49
    Location:
    Redmond, WA

    It's quotes like this that really scare me when talking about PS3....

    Why would you need 10's of gigapixels?
    Does this imply a GS-like solution where you store all your intermediate calculations in the frame buffer?

    FWIW I really hope not!
     
  17. Megadrive1988

    Veteran

    Joined:
    May 30, 2002
    Messages:
    4,723
    Likes Received:
    242
    Well, I am not saying we need tens of billions of pixels, I am saying that the patent seems to hint at PS3's GPU having that level of fillrate.

    done 8)


    edit: GSCube 16 did have tens of billions of untextured pixels and just shy of twenty billion textured pixels. GSCube 64 had a hundred and fifty billion untextured and seventy five billion textured pixels 8)

    SGI's Ultimate Vision systems are meant to have up to 40 or 80 billion pixels (very high quality pixels, I might add), and UV's purpose is realtime applications.

    so there are uses for that much fillrate. otherwise GSCube and SGI Ultimate Vision would not have been made. I don't think there is such a thing as overkill, even for games.

    yes, I do realize that what is done to each pixel might matter more than just how many pixels we can get.
     
  18. j^aws

    Veteran

    Joined:
    Jun 1, 2004
    Messages:
    1,992
    Likes Received:
    137
    I suppose it could be stretched to fit the 4-VS GPU, but I still find it odd that the GPU is hooked off the BE in Diag A but off the main bus in Diag C. Something just doesn't add up there? :?

    All things equal with the BE and VS, which bus layout seems the most efficient, Diag A or Diag C?

    I wouldn't worry too much. :wink: I thought devs wouldn't need to touch the metal of PS3? This would be the nightmare task of the STI engineers developing the compiler/CELL OS, wouldn't it?

    That is one of my biggest concerns: that it wouldn't be mature at launch, i.e. inefficient/buggy... and that the final output displayed onscreen, say, wouldn't be any different from the earlier-released XBOX2 with its perhaps more mature XNA env.?

    I think the same 1998 argument applies: if PS2 was to have 2.4 Gpix/s, a WTF would be applied! :wink: Six years later in 2004, simple Moore's law should take us above 30 Gpix/s, otherwise I'd sack my R&D team! :D All those billions of Yen...
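    The extrapolation above, spelled out (the 18-month doubling period is the usual Moore's-law rule of thumb, my assumption rather than anything from the patents):

```python
base_gpix = 2.4          # PS2 GS untextured fillrate in 1998, Gpixels/s
years = 2004 - 1998
doublings = years / 1.5  # one doubling every 18 months -> 4 doublings

# 2.4 * 2^4 = 38.4 Gpixels/s, i.e. comfortably above 30
print(base_gpix * 2 ** doublings)
```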

    On the subject of what we'd do with 10's of Gpix/sec: since the entire GPU is programmable, including the pixel engine, would we not be able to implement a Reyes pipeline? Or other exotic delights. :D
     
  19. Megadrive1988

    Veteran

    Joined:
    May 30, 2002
    Messages:
    4,723
    Likes Received:
    242
    thanks Jaws - see I knew I wasn't off my rocker 8)
     
  20. ERP

    ERP
    Veteran

    Joined:
    Feb 11, 2002
    Messages:
    3,669
    Likes Received:
    49
    Location:
    Redmond, WA
    Alright, you missed my point......

    My point is: what am I trading for those 10's of billions of pixels? Increasing the die area to increase fillrate means that die area can't be used for, say, more ALU blocks. Or better texture filtering, etc., etc.....

    How useful is 10 billion pixels per second if you're entirely limited by ALU speed? The statement is more about balance than it is about fillrate.

    My concern about PS3 in general is exactly what Sony will leave off the die for cost reasons. I have to assume we'll get decent texture filtering that works this time, I have to assume we'll have a complete set of blending ops, but I do worry that they might decide on a "novel" architecture and then have someone that doesn't understand it cut significant features for cost reasons...... IMO this is what happened to the GS.....

    When it comes to system performance the devil is in the details and it's the details of what I'm not seeing or hearing that worry me. We'll know soon enough and it's not like I have any control over it so........
     