Playstation 3 RSX Graphics is NV47 Based

Discussion in 'Beyond3D News' started by Dave Baumann, Mar 30, 2006.

  1. Mintmaster

    Veteran

    Joined:
    Mar 31, 2002
    Messages:
    3,897
    Likes Received:
    87
    True, not necessarily, but it gives a (very) rough idea, especially with people like Carmack talking about unique texturing on everything. Just pointing out that even if the correlation isn't exactly proportional, one does exist.
    I wasn't jumping to conclusions, I just wanted to make sure you didn't say "all". "Some" is basically what I was looking for, and the reason I made the above point about texture storage.

    Given this, there's no way some texture BW + vertex BW can even compare to color BW + z BW + the rest of texturing BW, nor can it saturate FlexIO.

    Okay, fair enough. I always assumed G71 was the recipient of RSX work, but whatever. DeanoC said that the number one tip by NVidia is to reduce register usage, so I thought Barbarian's comment was quite meaningful. Could be wrong.

    Regarding bursting, I was simply pointing out that a high speed bus is still very useful even if you only use it 10% of the time, just like NVidia's high speed z rendering is useful even though it may only be used 10% of the time overall. Yes, it's capable of transferring continuously, but I can't think of even an atypical gaming load where this would happen.
     
  2. Mintmaster

    Veteran

    Joined:
    Mar 31, 2002
    Messages:
    3,897
    Likes Received:
    87
    Gotta take off now guys, not sure when I'll be back (could be a few days). Thanks for the discussion.
     
  3. Ghost of D3D

    Banned

    Joined:
    Mar 10, 2006
    Messages:
    33
    Likes Received:
    4
    I assume by "it", you meant the latter BW costs (color+Z+other-texturing).

    Which would mean you're not quite correct in your rather fact-like last 5 words quoted above.
     
  4. LightHeaven

    Regular

    Joined:
    Jul 29, 2005
    Messages:
    538
    Likes Received:
    19
    No, he meant that some texture BW + vertex BW won't be enough to saturate FlexIO.
     
  5. Nemo80

    Banned

    Joined:
    Sep 5, 2005
    Messages:
    128
    Likes Received:
    3

    Guess this proves you're wrong here:

     
  6. ROG27

    Regular

    Joined:
    Oct 27, 2005
    Messages:
    572
    Likes Received:
    4
    This is what happens when someone who comes from the PC domain assumes that PC >= Console, when the two very clearly do not equate.

    MintMaster, stop trying to compare apples with oranges. You put open-platform technology on a pedestal and benchmark everything against it, assuming that it is better when obviously it is not...it's just a different implementation that has been balanced appropriately to accomplish different ends. The stuff in these closed boxes in the long run may not have the raw power or even the contained capability (within the GPU), but the system's architecture as a whole allows for flexibility, hacks, and tons and tons of programmability. This alone allows it to exist competitively for 5+ years.

    Sony balanced the PS3 differently than its competition. That doesn't mean its implementation will be worse. If anything, evidence points to the contrary. PS3 games, a year or two out from release, will most definitely be edging out Xbox360 in the visuals department. The margin of difference won't be huge, but it will exist. I am willing to bet on it even. Are you?
     
  7. Mintmaster

    Veteran

    Joined:
    Mar 31, 2002
    Messages:
    3,897
    Likes Received:
    87
    Did you have to resurrect this thread for that? I've replied to that exact quote several times in the console forums.

    Just because it can stream data from the local store (e.g. dynamically generated vertices by Cell, which is the intended use of that feature, similar to XB360's XPS) doesn't mean it can texture from it. I've seen some detailed documents about PS3 and RSX, and this sort of suggestion isn't mentioned even once. Why would you want to waste your local store by filling it with a small texture that wouldn't consume much bandwidth anyway?

    Anyway, I mentioned this as an aside for Jaws' suggestion about procedural textures to save bandwidth. They can save disk space, but they will not save bandwidth. It's skirting the main issue with split buses: Framebuffer traffic is dominant.

    Texture bandwidth is only big for heavy multitexturing, in which case you're using lots of textures, or when reading from rendertargets (esp. HDR). The latter will almost always be done in GDDR3, because that's where drawing to the rendertargets is fastest. The former means you'll only significantly reduce GDDR3 traffic if most textures are in XDR. Besides the colour and z-buffers, you'll have over 200MB in GDDR3. In XDR, much of your 256MB will be needed for the OS (64MB, right?), game code, and probably vertex data. How much is left for textures? If you're using the power of Cell, it will not only use lots of bandwidth, but the data it reads and writes will use plenty of space.
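
    As a rough illustration of why I keep saying framebuffer traffic dominates, here's a back-of-envelope sketch in Python. The overdraw, AA, blending and texel-traffic numbers are assumptions picked purely for illustration, not measurements, but the framebuffer side comes out well ahead for any plausible values:

```python
# Back-of-envelope comparison of framebuffer vs. texture bandwidth at 720p.
# Overdraw, AA factor and texel traffic are assumed numbers for illustration.

WIDTH, HEIGHT, FPS = 1280, 720, 60
OVERDRAW      = 3          # assumed average shaded depth complexity
MSAA          = 4          # 4xAA multiplies colour/Z traffic (ignoring compression)
COLOR_WRITE   = 4          # 32-bit colour write per shaded sample
Z_READ_WRITE  = 8          # 32-bit Z read + write per shaded sample
TEXEL_TRAFFIC = 8          # assumed (compressed) texture bytes per shaded pixel

shaded_pixels_per_sec = WIDTH * HEIGHT * OVERDRAW * FPS

fb_gbps  = shaded_pixels_per_sec * MSAA * (COLOR_WRITE + Z_READ_WRITE) / 1e9
tex_gbps = shaded_pixels_per_sec * TEXEL_TRAFFIC / 1e9

print(f"framebuffer traffic ~ {fb_gbps:.1f} GB/s")   # ~8.0 GB/s with these numbers
print(f"texture traffic     ~ {tex_gbps:.1f} GB/s")  # ~1.3 GB/s with these numbers
```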

    I am not talking strictly from an open-platform viewpoint. I'm talking from a workload viewpoint. If anything, texture bandwidth will be reduced in a closed environment, because devs will go through greater efforts to compress textures. It's only going to increase the percentage of BW used by the FB.

    I too believe that this will happen, but it will be because of the devs, not the hardware. I honestly think the 7600GT is capable of putting out better graphics than we're currently seeing in games from a 7900GTX. Oblivion graphics @ 1024x768 w/4xAA only run at 30-60fps on top hardware? Not impressive. An edge in developer talent is much more valuable, IMO, than an edge in hardware ability. There are so many amazing graphics techniques out there from 4+ years ago that aren't being used yet. This year there was a breakthrough in a new shadowing technique (variance shadow maps), and XB360 has a huge advantage over RSX there. Will it get used? I hope so, but unfortunately I think it'll take years.

    MS is doing their best to make their platform as dev-friendly as possible, but what you need is initiative for implementing these new ideas. I believe the hardcore PlayStation devs have more of it than the XBox devs.
     
  8. Arwin

    Arwin Now Officially a Top 10 Poster
    Moderator Legend

    Joined:
    May 17, 2006
    Messages:
    17,681
    Likes Received:
    1,200
    Location:
    Maastricht, The Netherlands
    Do I understand correctly though that on the original Xbox you just didn't have the low-level access? The Xbox didn't receive much low-level coding love simply because you couldn't get beyond the DirectX interface. Much of this seems to be addressed to some extent in the 360.
     
  9. ROG27

    Regular

    Joined:
    Oct 27, 2005
    Messages:
    572
    Likes Received:
    4
    When I say visuals, I'm not just referring to graphics (models, texturing, lighting, shadows, etc.). I'm referring to the entire visual presentation, including animation and physics. You could take a game with significantly simpler graphics than a current Xbox 360 title and make more of a visual impact with good animations, transitions, physics, and simulation. The graphics are good enough for most people and have been for some time. They understand that this model x made of polygons with textures, shaders, etc. represents this entity y, but what they can't relate to is the unnatural movement and interaction with the environment and the lack of depth of immersion (even though they may not be able to explicitly express this, because they clump everything together and brand it "graphics"). How do you fix that by making models, lighting, and texturing more complex? If anything, you just make the problem worse by doing that. Visual impact is a long way off from just graphics. The way the system is balanced in the PS3, with a more powerful CPU, will help make the pretty picture become more alive, in my opinion. I personally can't wait for the day devs can make games animate naturally like cinematic sequences and have more layers of depth in terms of interactivity.

    And another thing, complexity doesn't make up for bad art. I think most would agree that bad art is what kills most games from the most superficial visual standpoint.
     
  10. nAo

    nAo Nutella Nutellae
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    4,325
    Likes Received:
    93
    Location:
    San Francisco
    Sorry if you already addressed this point but why do you believe that?
     
  11. Mintmaster

    Veteran

    Joined:
    Mar 31, 2002
    Messages:
    3,897
    Likes Received:
    87
    Which part, the breakthrough or the advantage?

    The breakthrough is not only the filtering benefits (works with trilinear, aniso, and downsampling/anti-aliasing), but also realistic soft shadow applications. I'll put out a demo soon enough.

    The advantage is 32-bit integer filtering on Xenos, and it runs at full speed for single channel textures. Free AA in 16-bit integer formats should help also (even for 32-bit depth squared values via resolution splitting and post processing), but it's a bit complex and theoretical right now.

    The highest filtered precision available on RSX is FP16, which is 40x less precise than even 16-bit integer for shadows. VSMs need twice the precision in the depth-squared term as normal shadow mapping, or it'll look even worse than bilinear PCF. After doing some calculations for FP10, I found you need over a 5% difference in depth values between the shadow caster and receiver to get just a couple of levels in the gradient along shadow edges. A nice gradient would probably need a 10-15% difference. I16 is better, but probably not enough for a game, I reckon. I32 should be perfect. Yes, you can try to manually filter FP32 on RSX, but that'll be slow, and trilinear/aniso are virtually impossible.
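
    For anyone unfamiliar with the technique, here's a minimal Python sketch of the standard VSM visibility test (Chebyshev's inequality applied to filtered depth moments); the moment values in the example are made up, but they show why the depth-squared term is so sensitive to storage and filtering precision:

```python
# Minimal sketch of the variance shadow map (VSM) visibility test.
# 'moments' is what you'd read from a filtered two-channel shadow map
# holding depth and depth-squared; the example values below are made up.

def vsm_visibility(moments, receiver_depth, min_variance=1e-5):
    mean, mean_sq = moments                     # E[d], E[d^2] after filtering
    if receiver_depth <= mean:
        return 1.0                              # receiver in front of occluders: lit
    variance = max(mean_sq - mean * mean, min_variance)
    d = receiver_depth - mean
    # Chebyshev's inequality: upper bound on the fraction of lit samples.
    return variance / (variance + d * d)

# E[d] = 0.40, E[d^2] = 0.1616 -> variance of only 0.0016; a low-precision
# texture format easily loses that difference and the soft edge collapses.
print(vsm_visibility((0.40, 0.1616), 0.45))     # ~0.39, a partial-shadow value
```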
     
    #111 Mintmaster, May 28, 2006
    Last edited by a moderator: May 28, 2006
    Geo likes this.
  12. Mintmaster

    Veteran

    Joined:
    Mar 31, 2002
    Messages:
    3,897
    Likes Received:
    87
    I see your point, but having built a physics engine myself, I'm unconvinced that a 2x increase in CPU power over XeCPU (and I doubt the practical advantage is nearly that high) will make that big of a visual impact. For truly interactive physics, the improvement will be almost unnoticeable 99% of the time. For gimmicky "effect physics", it's possible you'll notice a difference with CELL, but it won't detract from the visual presentation.

    Animation is not very CPU intensive at all. Even inverse kinematics is a piece of cake. Natural animations and transitions between them need lots of expensive data capture and lots of subtle tweaking. Some low-hanging fruit is out there too, as Ken Perlin has shown how adding low-frequency noise to animation can make beings look much more lifelike.
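
    To make that Perlin point concrete, here's a tiny Python sketch of the idea; the noise function is just a cheap sum of sines standing in for real Perlin noise, and all the constants are arbitrary:

```python
import math

# Toy illustration of layering low-frequency noise on an authored animation
# curve so an idle character doesn't look frozen. cheap_noise() is a crude
# stand-in for real Perlin noise; all constants are arbitrary.

def cheap_noise(t):
    return (math.sin(1.3 * t) + 0.5 * math.sin(2.9 * t + 1.7)) / 1.5

def joint_angle(t, base_curve, jitter_amp=0.02, jitter_rate=0.8):
    return base_curve(t) + jitter_amp * cheap_noise(jitter_rate * t)

idle_sway = lambda t: 0.3 * math.sin(0.5 * t)    # authored idle motion
print(joint_angle(2.0, idle_sway))               # base angle plus subtle drift
```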

    A lot of these things are software problems. They don't need gobs of CPU power, they just need a lot of research and money. They also need a lot of data, so RAM could easily be holding them back.

    Finally, I don't think I ever asserted the superiority of XB360 over PS3. Taking everything into account, I think they're comparable enough that the devs will make the difference. I'm talking mostly about RSX and bandwidth in this thread, and I don't think anything you mentioned makes a difference in my claims.

    Agreed, and that's a big part of what I meant when I said "developer talent". Resident Evil 4 is a great example of how art can elevate graphics to beyond the norm. Even if you're going for the realistic look, it just won't happen without a keen sense of what "feels wrong" in the look of a game.
     
  13. Arwin

    Arwin Now Officially a Top 10 Poster
    Moderator Legend

    Joined:
    May 17, 2006
    Messages:
    17,681
    Likes Received:
    1,200
    Location:
    Maastricht, The Netherlands
    Would an SPE be able to help out? It's not bad at integer ops either, after all.
     
  14. Mintmaster

    Veteran

    Joined:
    Mar 31, 2002
    Messages:
    3,897
    Likes Received:
    87
    Texture filtering is buried deep within the 3D graphics workflow. The CPU can feed the GPU with vertices (the first stage of the workflow) or read/write memory that the GPU outputs/reads. There's nothing the CPU can do here unless it takes over the entire rendering procedure.

    High precision integer filtering was not included in NV4x/G7x/RSX because it never had any real use. Even the 16-bit integer filtering in ATI's PC cards is often used as a not-so-good substitute for FP16 filtering. I have no idea why ATI decided to include this feature in Xenos, but it was probably just for the sake of completeness.
     
  15. Nemo80

    Banned

    Joined:
    Sep 5, 2005
    Messages:
    128
    Likes Received:
    3

    Sorry, but now you're starting to sound almost like a ****** :)

    If you still consider the XeCPU to have 100+ GFLOPs, then I really can't help you. The maximum theoretical figure is about 70 GFLOPs, and that's absolutely theoretical; it implies the CPU is doing nothing else (of course, the same applies to the PS3).

    But the big difference is that the XeCPU is already busy with lots of stuff: one thread on one core for audio in most games (!) (handled by an RSX "section" on PS3), plus decompression, physics, the rendering/game loop, etc., all on a tiny 1 MB cache that's even shared with the GPU.

    In this regard, the PS3 CPU is much more advanced and the SPUs are also a lot faster for this kind of stuff.

    There simply is much more "time" for physics than on the 360. The currently known games prove that (almost no physics on the 360).
     
  16. Arwin

    Arwin Now Officially a Top 10 Poster
    Moderator Legend

    Joined:
    May 17, 2006
    Messages:
    17,681
    Likes Received:
    1,200
    Location:
    Maastricht, The Netherlands
    Ok. I'm trying to think how the volumetric clouds are done in Warhawk, but I guess that's possible with read/write stuff over the GPU output, although you still need to solve how the Warhawk itself can fly through the volumetric cloud. With that working, I imagine the Cell might be able to add certain types of shadowing in a unique way. But this is over my head for now, I've only just started with OpenGL.

    Adding to Nemo, the current games also show that on PS3 (lots of physics already).
     
  17. Mintmaster

    Veteran

    Joined:
    Mar 31, 2002
    Messages:
    3,897
    Likes Received:
    87
    Looking at some freeze-frame shots of Warhawk, I'm guessing Cell just writes colour and alpha values into a buffer, and depth values into another. RSX would then just composite that onto the rest of the scene with alpha blending by reading the colour and depth values from a texture. Ever since PS 1.3 (I think) you could write your own z-value from the pixel shader.
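
    To make that guess concrete, here's a per-pixel sketch of the compositing in Python; the buffers and values are hypothetical, it's just an "over" blend plus a shader-written depth:

```python
# Per-pixel sketch of the guessed compositing pass: Cell writes a cloud
# colour+alpha buffer and a cloud depth buffer; a GPU pass alpha-blends the
# cloud over the scene and outputs the cloud's depth, so anything drawn
# afterwards can still depth-test against it. Purely illustrative.

def composite_cloud_pixel(scene_rgb, cloud_rgba, cloud_depth):
    cr, cg, cb, ca = cloud_rgba
    sr, sg, sb = scene_rgb
    blended = (cr * ca + sr * (1.0 - ca),   # standard "over" alpha blend
               cg * ca + sg * (1.0 - ca),
               cb * ca + sb * (1.0 - ca))
    return blended, cloud_depth             # shader-written depth for later Z tests

print(composite_cloud_pixel((0.2, 0.3, 0.8), (0.9, 0.9, 0.95, 0.6), 0.5))
```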

    It doesn't look like the depth of the rest of the scene is taken into account, nor do the planes cast any shadows on the clouds. Also, when a plane flew through a cloud, there was an abrupt transition in colour. This is why I think the cloud rendering is completely independent and the compositing is so simple. The jet flame of the plane was just put on top of the image.

    The interesting question is what method is used for creating the cloud buffers. There are some fast volumetric fog techniques for determining the thickness of the fog in the view direction, and you can find a nice paper on NVidia's site. High precision blending is needed here, and I think this is why Cell would be better for the job than the GPU approach in the paper. Looking at the video, they also considered the thickness in the sun direction. This could be precomputed into a texture, because I didn't see the clouds morph or rotate.

    Both of these calculations (view thickness and sun thickness) could also be done with some sort of "real" raytracing. If the clouds were cleverly made of several spheres, ray-sphere intersections might be done fast enough for 60fps @ 720p, but that's questionable. Even with 6 SPUs, you'll only have about 300 cycles per pixel per frame at most. Directly intersecting a ray with a low-poly mesh is also possible. Either way, it seems tough.
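
    The ~300 cycles figure is simple arithmetic, assuming all 6 available SPUs at 3.2 GHz do nothing but the clouds, which is already generous:

```python
# Where the ~300 cycles/pixel budget comes from. Assumes all 6 available
# SPUs at 3.2 GHz are dedicated to the clouds, which is already optimistic.

SPUS, CLOCK_HZ = 6, 3.2e9
WIDTH, HEIGHT, FPS = 1280, 720, 60

budget = (SPUS * CLOCK_HZ) / (WIDTH * HEIGHT * FPS)
print(f"{budget:.0f} SPU cycles per pixel per frame")   # ~347
```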

    Personally, I think the first method is more realistic.
     
  18. Mintmaster

    Veteran

    Joined:
    Mar 31, 2002
    Messages:
    3,897
    Likes Received:
    87
    That's just it, I don't see GFlops being the limitation on either XeCPU or Cell except for effects physics like particles where all you need is dumb brute force. There's too much other stuff going on in a physics engine.

    Where are you getting this from, that Microsoft presentation? Man, I knew people would misinterpret that info. That doesn't tell you anything about how much the cores are being used (i.e. % utilization), it just tells you how they split up the work, and they did it this way because it's easy. Audio will never use that much CPU time. The compression threads will only be working when there's something to load.

    If you want to fully use either Xenon or Cell, it's a bad idea to separate tasks this way, because one of them will always need more cycles than the others, and the rest will just be sitting around waiting for the next frame. The right way is to chop up your tasks into little segments, queue them, and have all your cores/SPUs attacking them. When one is finished, start working on the next one in the queue. The whole audio-in-one-thread, physics-in-one, rendering-in-one, compression-in-one approach is not very effective at all, especially since the relative loads of these tasks change dramatically through the frame.
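
    Here's a minimal sketch of what I mean by queuing small task segments, with Python threads standing in for cores/SPUs (names and structure are only illustrative):

```python
import queue
import threading

# Minimal sketch of the job-queue model: every worker (a stand-in for a
# core/SPU) pulls the next small task slice, instead of one core being
# permanently "the audio core" or "the physics core".

def worker(jobs):
    while True:
        job = jobs.get()
        if job is None:            # sentinel: no more work this frame
            jobs.task_done()
            return
        job()                      # run one small slice of physics/audio/etc.
        jobs.task_done()

def run_frame(task_slices, num_workers=6):
    jobs = queue.Queue()
    for t in task_slices:
        jobs.put(t)
    for _ in range(num_workers):
        jobs.put(None)             # one sentinel per worker
    threads = [threading.Thread(target=worker, args=(jobs,))
               for _ in range(num_workers)]
    for th in threads:
        th.start()
    jobs.join()                    # frame is complete once every slice has run
    for th in threads:
        th.join()

# e.g. run_frame(physics_slices + audio_slices + animation_slices)
```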

    That's a bunch of crap and you know it. Do you really think HL2 type physics wouldn't run well on XeCPU? The reason you don't see heavy physics in early games is that it's time consuming to implement.

    The bottleneck right now is software, not hardware.
     
    #118 Mintmaster, May 29, 2006
    Last edited by a moderator: May 29, 2006
  19. Geo

    Geo Mostly Harmless
    Legend

    Joined:
    Apr 22, 2002
    Messages:
    9,116
    Likes Received:
    213
    Location:
    Uffda-land
    Hrrm? Isn't MS evangelizing heavily this very thing as the "easy wins" for multi-core, and in fact pushing it on the PC side as well? That's what that linked presentation is, right?

    We can say "that isn't the best way to do it", but if that's what devs are being told to do, isn't it going to happen that way?
     
  20. Mintmaster

    Veteran

    Joined:
    Mar 31, 2002
    Messages:
    3,897
    Likes Received:
    87
    I don't think MS is suggesting that. In the presentation they have the "good multithreading" with two main threads spawning off lots of threads for things like physics, particle systems, etc. All cores work on these things, or at least they should. I think it's more of a software schematic than one which describes how it runs on hardware.

    If it does happen the bad way, Cell/Xenon will just have cores/SPUs sitting around. It's silly to increase your audio or compression load just because that particular SPU is waiting for more work in the next frame. To balance effectively, you should distribute your tasks among the processors evenly, not segregate them and increase the lighter loads for the heck of it. I certainly hope devs are smarter than that.

    The slide entitled "Another Paradigm: Cascades" certainly seems like a silly idea to me unless a scheduler in the OS switches threads a lot. It doesn't seem that way because it says "On Xbox 360 you must explicitly assign software threads to hardware threads."
     
    #120 Mintmaster, May 29, 2006
    Last edited by a moderator: May 29, 2006