What's the current status of "real-time Pixar graphics"?

Discussion in 'Architecture and Products' started by Daliden, Sep 18, 2003.

  1. Simon F

    Simon F Tea maker
    Moderator Veteran

    Joined:
    Feb 8, 2002
    Messages:
    4,563
    Likes Received:
    171
    Location:
    In the Island of Sodor, where the steam trains lie
    Feel free to nudge it back if you want!
    Hang your head in shame, Marco. Do you not remember the glory days of the mighty Dimension3D?

    But in a sense, that's what the DX/OGL assembler is - it gets converted into the hardware's instruction set. For example, take the leaked XBox specs: its instruction set seems quite different from DX's. 3DLabs' approach (a scalar architecture) would be another example.
     
  2. Pavlos

    Newcomer

    Joined:
    Jul 29, 2002
    Messages:
    38
    Likes Received:
    0
    I don’t disagree with that.

    Running RenderMan on hardware is not trivial, but it’s easier than many people think. In short, OpenGL2 can be used as an assembly language to compile the high level RenderMan shaders. Antialiasing, dof and motion blur can be implemented with the accumulation buffer (the built-in hardware antialiasing is hardly an option because of the memory requirements). Displacement mapping can be implemented by rendering directly to vertex buffers. And so on… Even an R300 can implement this.
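    The accumulation-buffer approach described above can be sketched on the CPU: render the scene several times with sub-pixel jitter and average the results. This is an illustrative sketch only; render() here is a stand-in for a full scene render, not any real renderer.

```python
import random

def render(jitter_x, jitter_y):
    # Stand-in scene render: a 2x2 "image" whose values depend on the
    # jittered sample position. A real renderer would draw the scene here.
    return [[(x + jitter_x) + (y + jitter_y) for x in range(2)] for y in range(2)]

def accumulate(n_samples=16, seed=0):
    """Average n_samples jittered renders, like glAccum over multiple passes."""
    rng = random.Random(seed)
    acc = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(n_samples):
        # Sub-pixel offsets in (-0.5, 0.5); the same idea extends to lens
        # jitter for DOF and time jitter for motion blur.
        jx, jy = rng.random() - 0.5, rng.random() - 0.5
        frame = render(jx, jy)
        for y in range(2):
            for x in range(2):
                acc[y][x] += frame[y][x]
    return [[v / n_samples for v in row] for row in acc]

image = accumulate()
```

    The jitter averages out, so each pixel converges to its unjittered value as the sample count grows.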

    But the R300 hasn’t replaced any CPU in any renderfarm. Software rendering is more practical. The algorithms used in graphics hardware are designed to run fast when a number of underlying assumptions hold, mainly that the scene has a small number of relatively big polygons. When the rendered scene has a large number of pixel- or sub-pixel-sized polygons with high levels of antialiasing and motion blur (the standard scenario in offline rendering), these algorithms are inefficient (hence the lack of robustness I mentioned in my previous post).

    In particular I see two major inefficiencies.
    - Vertices are transformed after the geometry is tessellated into polygons. Transforming the control points before tessellation is faster, since tessellation is cheaper than transformation. And geometry is constructed using high-order surfaces anyway, since explicit polygon modeling is practical only for low-poly models. To my knowledge this is true even for today’s games.
    - Scan conversion doesn’t make any sense when the polygons are a few pixels big. I think this has been explained many times on this board.
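    The first point above is easy to see with a back-of-the-envelope count: transforming a bicubic patch's 16 control points and then tessellating beats tessellating first and transforming every generated vertex. The numbers below are purely illustrative.

```python
def transforms_control_points_first(n_patches):
    # One matrix transform per control point of a bicubic patch (4x4 = 16),
    # applied before tessellation.
    return n_patches * 16

def transforms_tessellate_first(n_patches, verts_per_patch):
    # One matrix transform per tessellated vertex, the post-tessellation order.
    return n_patches * verts_per_patch

# e.g. 1000 patches, each tessellated into a 33x33 vertex grid:
a = transforms_control_points_first(1000)       # 16,000 transforms
b = transforms_tessellate_first(1000, 33 * 33)  # 1,089,000 transforms
```

    The gap only widens as tessellation density rises toward sub-pixel polygons.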

    So, I think software rendering will not be replaced any time soon, at least not until the hardware starts to support more robust algorithms.
     
  3. Dio

    Dio
    Veteran

    Joined:
    Jul 1, 2002
    Messages:
    1,758
    Likes Received:
    8
    Location:
    UK
    This may be the human factor.

    I remember JC or Brian Hook saying a rule like this applied to Quake3 map compile times. If they sped up the map compiler, the level designers made the level more complicated, so the compiles always took between fifteen minutes and half an hour - i.e. long enough to take a coffee break, but not so long you can't test your result several times per day.

    Perhaps there is an empirical rule for 'offline' work?

    I must admit, I find 'intermediate' compile times of 2-5 minutes most irritating - if it's 15 minutes+ I can go play the guitar or get lunch or something, while if it's a 'quick' compile I want results now.
     
  4. MfA

    MfA
    Legend

    Joined:
    Feb 6, 2002
    Messages:
    7,610
    Likes Received:
    825
    They are, but they aren't very good ones for more general-purpose programming where you don't worry about register pressure yourself (shaders will move there eventually too).

    Limited-register-set VMs make life hard on compilers: you have to reverse existing optimization passes. Stack-based or infinite-register-set VMs are a little better; lots of compilers use stack-based intermediate languages, and with SSA an infinite register set is better still.
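    To illustrate the stack-based option mentioned above: a stack intermediate language has no named registers at all, so a compiler retargeting it never has to undo someone else's register allocation. This is a toy sketch with made-up opcode names, not any real VM.

```python
def run_stack_vm(program, env):
    """Evaluate a tiny stack-based intermediate language.

    Operands live on an implicit stack rather than in a fixed register set,
    so the front end emits code without worrying about register pressure.
    """
    stack = []
    for op, *args in program:
        if op == "push":            # push an immediate constant
            stack.append(args[0])
        elif op == "load":          # push a named variable's value
            stack.append(env[args[0]])
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown opcode {op}")
    return stack.pop()

# (x + 1) * y  with  x=3, y=2
result = run_stack_vm(
    [("load", "x"), ("push", 1), ("add",), ("load", "y"), ("mul",)],
    {"x": 3, "y": 2},
)
```

    A backend targeting real hardware then runs register allocation once, over this form, instead of reverse-engineering a limited virtual register file.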

    RussSchultz, it is not meant for emulation ... it is meant for on the fly translation/compilation. Like what Transmeta does with x86, only without the throwing away of the results each time. It needs OS hooks for that obviously ...
     
  5. RussSchultz

    RussSchultz Professional Malcontent
    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    2,855
    Likes Received:
    55
    Location:
    HTTP 404
    Perhaps then it shouldn't be called a low level virtual machine, but a high level? Or even a translation layer?
     
  6. MfA

    MfA
    Legend

    Joined:
    Feb 6, 2002
    Messages:
    7,610
    Likes Received:
    825
    His argument for the name is this:

     
  7. VFX_Veteran

    Regular

    Joined:
    Mar 9, 2002
    Messages:
    683
    Likes Received:
    234
    I'm afraid this is accurate. I know of the guy that left PDI, but I'm not sure the information on his website should be public knowledge. Hmmm.....

    -M
     
  8. VFX_Veteran

    Regular

    Joined:
    Mar 9, 2002
    Messages:
    683
    Likes Received:
    234
    I agree here, but not for most of the processing. I think that we can look at the possibilities for some redundant calculations that the 3d hardware can do many times faster than a CPU.

    -M
     
  9. KimB

    Legend

    Joined:
    May 28, 2002
    Messages:
    12,928
    Likes Received:
    230
    Location:
    Seattle, WA
    And I think that most of the processing would be done faster on the GPU. The only major portion of the processing that wouldn't be best suited for execution on a GPU would be processing that is either of a type GPUs are poorly suited for (64-bit FP, lots of branching, or a variable number of loops of a short routine), or non-graphics processing. For example, modern movies may be made with much of the animation done using physics engines to improve realism.

    Of course, modern GPU's aren't quite advanced enough to take over most of the processing, but it won't be much longer...
     
  10. VFX_Veteran

    Regular

    Joined:
    Mar 9, 2002
    Messages:
    683
    Likes Received:
    234
    Again, I disagree here. Currently the hardware (as well as software APIs) is just too limited for anything production-quality. Perhaps in a few more years (note: more than 2), we might start to see some studios using GPUs for redundant simple tasks.

    Geometry shaders that allow on-the-fly generation of geometry procedurally will be a while yet though (maybe 5-10 years).

    Flexibility will also be a huge factor in whether studios will start using 3d hardware for development. They have a while to go on that as well...

    -M
     
  11. KimB

    Legend

    Joined:
    May 28, 2002
    Messages:
    12,928
    Likes Received:
    230
    Location:
    Seattle, WA
    I don't think there's far to go at all.

    First of all, the NV3x uses IEEE FP32. That means that the calculations done on the GPU would be essentially the same as those done on the CPU. This lends the GPU well to a "software assist" mechanism, where some shaders that would work better on the CPU are done there, with most executed on the GPU.

    What shaders would be executed on the CPU? I would tend to think only those that have a large number of possible execution branches (either through a while loop or just lots of separate if's). As far as I know, the NV3x is currently capable of executing any production-level shader. Some just may be particularly slow or require a bit of extra precision.

    And as for how far we need to go in flexibility on GPU's, I think we're much closer than you think. If the gen-4 hardware (NV4x, R4xx) unifies the pixel and vertex shaders, we're 95% there. That is, such an architecture would very easily generalize to any number of shader programs (today we have two: vertex and pixel. In the future, one could add a per-patch shader, which would be the rumored primitive processor). I don't know if that generation will offer more than just vertex and pixel shaders, but from all we've heard, it seems likely. I certainly hope so, as higher-order surfaces have been taking entirely too long to catch on.
     
  12. VFX_Veteran

    Regular

    Joined:
    Mar 9, 2002
    Messages:
    683
    Likes Received:
    234
    Let's indulge this for a moment. What shaders specifically do you think will automatically work now? How much precision can I be promised with a powf() function? Suppose I wanted to control the shape of my highlight by a variable N. How much does a simple powf() function cost in terms of the number of registers needed? How about shaders being able to call other shaders? Suppose I wanted to compute a noise gradient on-the-fly for use with bump-mapping and I wanted to create a bump-mapping shader that implements this? How much control can the user have over the variables being passed to these shaders in HLSL?
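    For readers unfamiliar with the highlight-shaping question above, here is a minimal CPU sketch of the idea: a Phong-style specular term where a user-controlled exponent N shapes the highlight. The names are illustrative, not from any studio's shader library.

```python
import math

def specular(n_dot_h: float, shininess: float) -> float:
    """Phong-style specular lobe: intensity = max(N.H, 0) ** shininess.

    Larger exponents give tighter, smaller highlights; this is the pow()
    call whose cost and precision are being questioned above.
    """
    return math.pow(max(n_dot_h, 0.0), shininess)

# Same viewing angle, different exponents: the off-peak contribution
# shrinks as the exponent grows, tightening the highlight.
broad = specular(0.9, 8.0)
tight = specular(0.9, 64.0)
```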

    Impossible. Here's a production-level shader: suppose we have a shader that casts rays through an arbitrary volume and accumulates density by evaluating a shading tree composed of a bunch of filtered noise functions, then calls an illumination shader to retrieve an intensity for that pixel in the volume (ignoring self-shadowing).
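    A much-simplified CPU sketch of that ray-marching shader, for concreteness: step along the ray, accumulate density, attenuate by transmittance. The noise and illumination functions here are stand-ins for the shading tree and illumination shader described above, nothing more.

```python
import math

def noise_density(p):
    # Stand-in for a tree of filtered noise functions evaluated at point p.
    return 0.5 + 0.5 * math.sin(p[0] * 3.1) * math.cos(p[1] * 2.7)

def illumination(p):
    # Stand-in for calling out to an illumination shader (no self-shadowing).
    return 1.0

def march(origin, direction, steps=64, step_size=0.1):
    """Cast a ray through the volume, accumulating lit density."""
    intensity, transmittance = 0.0, 1.0
    for i in range(steps):
        p = [o + d * step_size * i for o, d in zip(origin, direction)]
        density = noise_density(p) * step_size
        intensity += transmittance * density * illumination(p)
        transmittance *= math.exp(-density)  # absorption along the ray
    return intensity

value = march([0.0, 0.0, 0.0], [0.0, 0.0, 1.0])
```

    Each step is a handful of arithmetic ops repeated thousands of times per pixel, which is exactly the kind of workload the later posts argue about.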

    If it is too slow, then we are back to software rendering! Basically, if you can't get the 3d hardware to perform significantly faster than the CPUs on a renderfarm, then you are back to square one in my book.

    -M
     
  13. MfA

    MfA
    Legend

    Joined:
    Feb 6, 2002
    Messages:
    7,610
    Likes Received:
    825
    General-purpose processors aren't the best at anything; even for software rendering you can always design a better processor.
     
  14. KimB

    Legend

    Joined:
    May 28, 2002
    Messages:
    12,928
    Likes Received:
    230
    Location:
    Seattle, WA
    If you've sped up some shaders, you're ahead overall.

    Anyway, the major barrier today, I believe, is in software. It would just be too much work today to make use of an NV3x for high-end graphics processing. I think the hardware is to the point where it could be useful, but the software certainly isn't there. So it's not happening...yet.
     
  15. VFX_Veteran

    Regular

    Joined:
    Mar 9, 2002
    Messages:
    683
    Likes Received:
    234
    LOL! That may be, but that's what this world uses...;)

    -M
     
  16. Pavlos

    Newcomer

    Joined:
    Jul 29, 2002
    Messages:
    38
    Likes Received:
    0
    This is not impossible. We already have compilers which translate RenderMan shaders into bytecode for interpretation, and there’s nothing that prevents you from writing a compiler that targets the OpenGL2 API instead. I’m referring to OpenGL2 because it doesn't have any hardware limits (instruction limits, texture fetch limits, etc…), but it’s certainly feasible with DX9 too. In fact, converting the bytecode of my renderer to OpenGL2 shaders is trivial for the majority of shaders, but I’m not sure if any hardware today supports the OpenGL shading language.

    Do not expect the hardware APIs to directly support high level features, such as surface, displacement, light, atmosphere, volume shaders. OpenGL2 chose a lower level of abstraction than RenderMan and probably whatever PDI uses. We must write a renderer which compiles all these shaders in a single pixel shader.
    This is certainly feasible, but I don’t think it’s practical. Such a renderer will be unable to efficiently handle gigabytes of textures, subsurface scattering, global illumination, procedural/delayed primitives, ray tracing, etc… So, the usability of such a thing is questionable.

    Interestingly enough, I think the ray-marching shader you described will be one of the cases where the hardware will easily beat the software, because of the sheer amount of simple calculations that need to be done.
     
  17. VFX_Veteran

    Regular

    Joined:
    Mar 9, 2002
    Messages:
    683
    Likes Received:
    234
    Proof is in the pudding.

    I'll reserve judgement for when our R&D group comes down to me and asks us to start fiddling with 3d hardware. As it stands, I haven't heard of any of our films prepared to use it for at least the next 2 years...

    -M
     
  18. Daliden

    Newcomer

    Joined:
    Sep 18, 2003
    Messages:
    89
    Likes Received:
    0
    Naturally, I believe you would be among the last to see such systems taken in use. Actual production work isn't really the place for testing unproven methods :)
     
  19. cthellis42

    cthellis42 Hoopy Frood
    Legend

    Joined:
    Jun 15, 2003
    Messages:
    5,890
    Likes Received:
    33
    Location:
    Out of my gourd
    The proof of the pudding is in the eating, even. :wink:
     
  20. mrbill

    Newcomer

    Joined:
    Feb 24, 2003
    Messages:
    36
    Likes Received:
    1
    Location:
    Marlborough, MA
    Whoa whoa whoa whoa! Major correction on OpenGL Shading Language needed.

    There are virtual *and* physical limits to the OpenGL Shading Language. What we virtualized were the things that were difficult to count in a device independent way -- temporaries, instructions and texture fetch restrictions.

    But there are very real physical limits. Some of the constraints are small and harsh. Just a few:
    • Vertex attributes - 16 vec4s is the minimum maximum.
    • Varying floats (interpolators) - 32 floats is the minimum maximum.
    • Texture units - *2* is the minimum maximum, with *0* the minimum maximum texture units available to the vertex shader. (ATI's initial implementation has 16 texture units.)
    Production shaders *will* (not may, *will*) exceed these limits - by large margins. So production shaders *will* have to be broken up à la Peercy/Olano et al. (The example RenderMan shaders in the paper are *not* close to production shaders in size or scope. But the good news is some of these simple shaders from the paper can now be directly ported to OpenGL Shading Language.)
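    The splitting decision above boils down to a resource check: a renderer compiling RenderMan-style shaders down to a single GPU shader would compare its resource usage against limits like these and break the shader into passes when they're exceeded. The limit values below are the minimum maximums quoted above; the example shader's figures are hypothetical.

```python
# Spec "minimum maximums" from the post above (a conforming implementation
# may offer more, but a portable shader can only count on these).
MIN_MAXIMUMS = {
    "vertex_attribs": 16,   # vec4 vertex attributes
    "varying_floats": 32,   # interpolated floats
    "texture_units": 2,     # fragment texture image units
}

def needs_splitting(usage):
    """Return the resources a shader exceeds; empty list means it fits."""
    return [name for name, limit in MIN_MAXIMUMS.items()
            if usage.get(name, 0) > limit]

# A modest production shader blows past the minimums easily:
over = needs_splitting(
    {"vertex_attribs": 20, "varying_floats": 96, "texture_units": 8}
)
```

    Anything in the returned list forces a multipass breakup of the shader, which is the Peercy/Olano-style decomposition the post refers to.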

    So, not quite so trivial for the majority of shaders, let alone production shaders.

    On Mr. Blue's hardware pudding, I'd say we know where the ingredients are, we even probably know how to mix them, but we still have to get out the whisk, get everything into the saucepan and then chill to make the pudding. We don't yet know if we'll get pudding out of this. But if we do, since on some of the ingredients we've used some substitutions (and even left a couple of pinches out), we still aren't quite sure how it will taste yet.

    Mr. Blue already knows the software pudding tastes good. (So does anyone who saw Bunny or Ice Age.)

    Finally, if history is any guide: the short Red's Dream was rendered in software for the opening and closing sequences, and in hardware (the Pixar Image Computer) for the dream sequence. All predating RenderMan, btw. As far as I know, it's only distributed on the "Tiny Toy Stories" VHS, and in Quicktime on Pixar's web site. See for yourself.

    -mr. bill
     
