ShaderX 2 Contents - Clues to PowerVR Series 5?

Discussion in 'Beyond3D News' started by Dave Baumann, May 27, 2003.

  1. Teasy

    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    4,563
    Likes Received:
    14
    Location:
    Newcastle
    Joe

    Nothing elitist about it, oh yeah I spent 1 minute registering for free... now I'm one of the elite :D

    Anonymous posts just make it easier for trolls, that's all. But if it's just the news forum that allows it then I'm not bothered in that case.
     
  2. LeStoffer

    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    1,253
    Likes Received:
    13
    Location:
    Land of the 25% VAT
    Nope, AFAIK this technique is not limited to Doom III-like rendering (stencil/shadow). The point is that you can render everything in front to back order during your first pass without the penalty of changing the render states (because you're not 'rendering' colours). After this sort, early pixel rejection can be very effective during your second - real - rendering pass.

    I don't know if everybody is going to use this, and ATI only recommends it with high overdraw and heavy shader use.
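
    A minimal sketch of the two-pass idea described above, assuming a 1-D "screen", opaque geometry, unique depths per pixel, and a shader cost that is only paid for fragments that pass the depth test. The function names and the layered test scene are hypothetical and only there for illustration.

```python
def make_layers(width, layers, front_to_back):
    """Build a fragment list of (x, depth); smaller depth is nearer."""
    order = range(layers) if front_to_back else reversed(range(layers))
    return [(x, float(layer + 1)) for layer in order for x in range(width)]


def render(fragments, width, z_prepass):
    """Return how many fragments would run the (expensive) pixel shader."""
    depth = [float("inf")] * width
    shaded = 0
    if z_prepass:
        # Pass 1: depth only, colour writes off, no shading cost.
        for x, z in fragments:
            if z < depth[x]:
                depth[x] = z
        # Pass 2: shade only fragments matching the stored nearest depth.
        for x, z in fragments:
            if z == depth[x]:
                shaded += 1
    else:
        # Single pass: every fragment that passes the depth test is shaded.
        for x, z in fragments:
            if z < depth[x]:
                depth[x] = z
                shaded += 1
    return shaded


worst_case = make_layers(width=8, layers=4, front_to_back=False)
print(render(worst_case, 8, z_prepass=False))  # shades every layer
print(render(worst_case, 8, z_prepass=True))   # shades one layer only
```

    In this toy scene the single pass shades all four layers when they arrive back to front, while the two-pass version shades each pixel once regardless of submission order.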
     
  3. Saem

    Veteran

    Joined:
    Feb 7, 2002
    Messages:
    1,532
    Likes Received:
    6
    Wouldn't the first pass allow you to remove all the hidden pixels, so that the second pass would be only the visible stuff?
     
  4. Kristof

    Regular Alpha

    Joined:
    Jan 30, 2002
    Messages:
    733
    Likes Received:
    1
    Location:
    Abbots Langley
    Sorting front to back per polygon is not trivial, and usually even impossible because the transform operation is done on the GPU. So the best you can hope for is a rough per object sort. So this first pass is going to come at some cost.

    Current Early Z is region based, meaning the Z info has to apply to a full area of 8x8 or even bigger. This can still result in lots of unnecessary Z ops simply because triangles get smaller. For example, a highly tessellated monster near the camera might be a problematic occluder because each triangle might be smaller than your Z areas, hence it cannot update the area info.

    Also don't underestimate the increase of vertex throughput with this kind of approach... remember that DoomIII looks pretty rough simply because the polygon count goes through the roof due to multipass of all the geometry.

    K-
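
    A rough sketch of the region-based limitation described above, assuming a simplified model in which a region's stored "farthest Z" is only lowered by a primitive that covers the whole region. Real hardware differs in detail; the 1-D tile, the names and the depth values are all hypothetical.

```python
TILE = 8    # pixels per Z region (1-D stand-in for an 8x8 block)
FAR = 1.0   # depth of the far plane; smaller depth is nearer


def tile_max_after(occluders):
    """occluders: list of (first_pixel, last_pixel, depth), inclusive ranges."""
    tile_max = FAR
    for first, last, z in occluders:
        covers_tile = first <= 0 and last >= TILE - 1
        # Only a primitive spanning the whole region may tighten its bound.
        if covers_tile and z < tile_max:
            tile_max = z
    return tile_max


def per_pixel_tests_needed(occluders, hidden_depth):
    """Per-pixel Z tests spent on a full-tile primitive behind the occluders."""
    if hidden_depth > tile_max_after(occluders):
        return 0      # whole region rejected by the coarse test
    return TILE       # falls through to per-pixel Z testing


one_big_triangle = [(0, 7, 0.2)]
many_small_triangles = [(i, i, 0.2) for i in range(TILE)]

print(per_pixel_tests_needed(one_big_triangle, hidden_depth=0.8))
print(per_pixel_tests_needed(many_small_triangles, hidden_depth=0.8))
```

    Both occluders hide the same pixels, but under this model only the large one lets the hardware throw away the hidden primitive at region level.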
     
  5. Teasy

    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    4,563
    Likes Received:
    14
    Location:
    Newcastle
    Will it be possible for Doom3 and other games, that use this method of two passes, to disable it for certain cards? (Series 5 for instance). Or will this method also have to be forced on cards that just don't need it?
     
  6. LeStoffer

    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    1,253
    Likes Received:
    13
    Location:
    Land of the 25% VAT
    Point well taken, but if you're gonna be shader-op limited because of hefty overdraw this might still be well worth it, given that the two major IHVs' latest architectures should take good advantage of it (after all, how often are they vertex throughput limited these days?).

    But please please prove me wrong with a PowerVR Series 5 card soon, Kristof! :wink:
     
  7. RussSchultz

    RussSchultz Professional Malcontent
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    2,855
    Likes Received:
    55
    Location:
    HTTP 404
    Why would you have to render back to front? I thought the whole point was to come up with the Z values.

    Then you can render the same dataset and (assuming you've got no transparencies) it essentially is only rendering the visible pixels (everything else fails Z).
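
    A small check of that point, assuming opaque geometry and a plain "nearest wins" depth test: the depth buffer left behind by a Z-only pass does not depend on submission order. The fragment list is made up for illustration.

```python
import itertools


def depth_only_pass(fragments, width):
    """Return the depth buffer after drawing (x, depth) fragments, Z only."""
    depth = [float("inf")] * width
    for x, z in fragments:
        depth[x] = min(depth[x], z)
    return depth


fragments = [(0, 0.5), (0, 0.2), (1, 0.9), (1, 0.4), (0, 0.7)]

# Try every possible submission order and collect the distinct results.
results = {
    tuple(depth_only_pass(order, width=2))
    for order in itertools.permutations(fragments)
}
print(results)
```

    Ordering only affects how much work the first pass itself does, not the Z values the second pass tests against.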
     
  8. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,418
    Likes Received:
    178
    Location:
    Chania
    Le Stoffer,

    Align it with next generation hardware, not current high end solutions.

    Apart from that, maybe this one from senior SA might help a bit?

    http://www.beyond3d.com/forum/viewtopic.php?p=10148&highlight=#10148

    Just as a sidenote, who says that an early-Z-like method is impossible on a TBDR too? (e.g. for applications that are actually optimised for it).

    In any case, since my crystal ball is rather empty for predictions about real time performance, all I'm saying is that I don't see a TBDR coming out with any real disadvantages if specs are up to snuff this time around. All we're gonna see for its lifespan is most likely some moderate usage of vertex/pixel shaders and stencil ops added to the mix here and there, but nothing that could lead it to some serious disadvantage; its advantages will most likely balance the whole thing out in a worst case scenario.

    Teasy,

    Do you mean whether it will be possible to exchange multipass with single pass? FableMark is using multipass AFAIK.

    Someone correct me if I'm wrong but wouldn't single pass vs multipass have usually no performance advantages if the application is fillrate limited?
     
  9. LeStoffer

    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    1,253
    Likes Received:
    13
    Location:
    Land of the 25% VAT
    You do realize that the number 3 solution in SA's post is exactly what I'm referring to? Key conclusion:

    But thanks for the link anyway, since I missed it somehow. :oops:
     
  10. Saem

    Veteran

    Joined:
    Feb 7, 2002
    Messages:
    1,532
    Likes Received:
    6
    This is exactly what I was thinking in my post above.
     
  11. LeStoffer

    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    1,253
    Likes Received:
    13
    Location:
    Land of the 25% VAT
    Russ, Saem:

    Maybe it's just me that's very bad at explaining the technique, so let's use the words of SA:

    My bold: Early Z reject (on chip, not using memory bandwidth) should still help here, but you're of course right that it may not make much of a difference when doing application driven deferred rendering. Or maybe I have just misunderstood something (it's very hot here today!) 8)
     
  12. Saem

    Veteran

    Joined:
    Feb 7, 2002
    Messages:
    1,532
    Likes Received:
    6
    Thanks a lot, LeStoffer.

    That clears up everything. Now it's just a matter of what are the dependants of this pass and whether the pass can be turned off in the case of DR.
     
  13. 3dcgi

    Veteran Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    2,435
    Likes Received:
    263
    If the app is fillrate limited because it's wasting time processing shaders for hidden pixels then the multi-pass approach will help. That's the main point of application deferred rendering. To avoid running long shaders on hidden pixels.
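
    A back-of-the-envelope cost model for that trade-off, assuming the worst case for the single pass (every overdrawn fragment gets shaded) and made-up unit costs. The function, its parameters and all the numbers are hypothetical.

```python
def frame_cost(pixels, overdraw, shader_cost, vertices, vertex_cost,
               z_cost=1, two_pass=False):
    """Return an abstract per-frame cost in arbitrary units."""
    if two_pass:
        geometry = 2 * vertices * vertex_cost   # geometry is sent twice
        depth = pixels * overdraw * z_cost      # cheap Z-only fill
        shading = pixels * shader_cost          # visible pixels only
        return geometry + depth + shading
    geometry = vertices * vertex_cost
    shading = pixels * overdraw * shader_cost   # hidden pixels shaded too
    return geometry + shading


# Long shaders: avoiding hidden-pixel shading outweighs the extra geometry.
print(frame_cost(1000, 3, 20, 500, 4), frame_cost(1000, 3, 20, 500, 4, two_pass=True))

# Trivial shaders: the extra geometry and Z fill make two passes slower.
print(frame_cost(1000, 3, 1, 500, 4), frame_cost(1000, 3, 1, 500, 4, two_pass=True))
```

    With these invented numbers the two-pass approach wins only when the per-pixel shader is expensive relative to the vertex work, which matches the "high overdraw and heavy shader use" condition mentioned earlier in the thread.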
     
  14. Teasy

    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    4,563
    Likes Received:
    14
    Location:
    Newcastle
    Ailuros

    Yeah, I mean could the multi-pass technique be changed to normal single pass for a TBDR? Because if the technique is there to save rendering hidden pixels then it can only hurt performance on a TBDR, since the card will already remove hidden pixels itself. It might not affect performance much, but it's bound to have some effect.
     
  15. RussSchultz

    RussSchultz Professional Malcontent
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    2,855
    Likes Received:
    55
    Location:
    HTTP 404
    I think the early Z reject will help tons, but the first pass doesn't need to be strictly front to back (well, if you're trying to eke out all the performance you can, it does). If all you're looking at doing is avoiding shading work, doing Z first, then color, will make quite a difference.

    Though, I think this method fails when we have pixel shaders that fiddle with the Z.
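
    One way this could go wrong, sketched under two assumptions: the first pass writes plain geometric depth without running the depth-modifying shader, and the second pass uses an "equal" depth compare. The names and the depth offset are hypothetical.

```python
def two_pass_drawn(fragments, width, shader_depth):
    """Return, per pixel, whether anything got drawn in the colour pass."""
    depth = [float("inf")] * width
    # Pass 1: Z only, using the geometric depth of each fragment.
    for x, z in fragments:
        depth[x] = min(depth[x], z)
    drawn = [False] * width
    # Pass 2: the shader may output its own depth before the equal test.
    for x, z in fragments:
        if shader_depth(z) == depth[x]:
            drawn[x] = True
    return drawn


fragments = [(0, 0.5), (0, 0.25), (1, 0.5)]

print(two_pass_drawn(fragments, 2, shader_depth=lambda z: z))         # shader leaves Z alone
print(two_pass_drawn(fragments, 2, shader_depth=lambda z: z + 0.25))  # shader pushes Z back
```

    Once the shader's output depth no longer matches what the first pass laid down, the equal test rejects the fragments that should have been visible.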
     
  16. Ailuros

    Ailuros Epsilon plus three
    Legend Subscriber

    Joined:
    Feb 7, 2002
    Messages:
    9,418
    Likes Received:
    178
    Location:
    Chania
    I'm aware of it and that's why I pointed to it in the first place; the difference being that he points out that solution 3 turns out to have a slight advantage, yet requires more vertex shader performance.

    Now let's see: assume you have a high end IMR and an equivalently specced TBDR, where the first utilizes solution 3 and the latter a hierarchical tiling scheme combined with early Z and parameter or Z compression, if you prefer. Which theoretically has the advantage then? I'd say it's hard to predict.

    More from the wise man:

    http://www.beyond3d.com/forum/viewtopic.php?p=111108&highlight=#111108

    http://www.beyond3d.com/forum/viewtopic.php?p=61943&highlight=#61943

    Not necessarily. What if the application is vertex data throughput limited?

    If you mean the good ole single buffering that KYRO was still able to use in some OGL applications, I'm afraid I don't have a clue if it would still benefit a fundamentally different card, even if it's still a TBDR. Hopefully Kristof can and will answer that one.

    I had DX9 two-sided stencil on a TBDR in mind concerning single vs multipass, not application driven deferred rendering in general. Sorry for the confusion.
     
  17. micron

    micron Diamond Viper 550
    Veteran

    Joined:
    Feb 23, 2003
    Messages:
    1,189
    Likes Received:
    12
    Location:
    U.S.
  18. CorwinB

    Regular

    Joined:
    May 27, 2003
    Messages:
    274
    Likes Received:
    0
    Damn, they are even linking to B3D... :p
     
  19. Saem

    Veteran

    Joined:
    Feb 7, 2002
    Messages:
    1,532
    Likes Received:
    6
    Finally, they gave B3D and thus especially the staff and the community.
     
  20. Teasy

    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    4,563
    Likes Received:
    14
    Location:
    Newcastle
    But surely if the app was limited by vertex data throughput then that's even worse. AFAIK this multi-pass technique doubles the geometry that needs to be sent to the chip and doubles the T&L. So if an app using this technique was limited by vertex data throughput, then changing to a normal single pass rendering technique would get rid of that vertex data throughput limitation.

    Doubling the geometry load to save on pixel shading for hidden pixels may be good for an IMR; in some games it may be the lesser of two evils (games with extremely heavy pixel shading and not so complex models and environments). But it's pointless for a TBDR. AFAICS all it's doing is limiting the card's T&L power in that app to do something that the card would do natively anyway (get rid of hidden pixels before rendering).

    Basically all I'm asking is would Carmack, or whoever (depending on the game) be able to set the engine up so that it could revert back to a standard single pass rendering technique for TBDR cards? Or is this two pass technique something that can't be turned on and off for different cards?
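
    The shape of such a switch might look like the sketch below. Everything here is hypothetical: the capability flag, the pass names and the class are invented for illustration, and whether a real engine could drop the first pass depends on what else relies on it (for example stencil shadows that need the depth laid down first).

```python
from dataclasses import dataclass


@dataclass
class Renderer:
    # Hypothetical capability flag, e.g. set when the card is a TBDR
    # that removes hidden pixels on its own.
    hidden_surface_removal_in_hardware: bool

    def passes(self):
        """Return the list of geometry passes to submit each frame."""
        if self.hidden_surface_removal_in_hardware:
            return ["colour"]
        return ["depth_only", "colour"]


print(Renderer(hidden_surface_removal_in_hardware=True).passes())
print(Renderer(hidden_surface_removal_in_hardware=False).passes())
```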
     
