What's the current status of "real-time Pixar graphics"?

Mr Blue said:
Since when did this thread start talking about assembly language? It's getting rapidly off-topic.
Feel free to nudge it back if you want!
MfA said:
Happens in all non-branching forums; it's just a poor design for discussion (of course HTML forces us into it).
Hang your head in shame, Marco. Do you not remember the glory days of the mighty Dimension3D?

MfA said:
Chalnoth said:
This is also part of the reason why I support going for nothing but high-level language shader programming. Assembly programming is too hardware-specific. The assembly should be done away with entirely, with the compiler compiling directly to the machine code.

I wouldn't mind seeing this extended to all programming; the only problem is that programmers dislike exposing their source code ... I personally like virtual ISAs as a halfway compromise. Typed SSA virtual assembly languages can be pretty much just as useful to the compiler as the original code (it is a real shame politics is preventing LLVM from becoming the next-gen GCC, by the way).
But in a sense, that's what the DX/OGL assembler is - it gets converted into the hardware's instruction set. For example, take the XBox specs that were leaked - its instruction set seems quite different from DX's. The 3DLabs approach (a scalar architecture) would be another example.
 
Chalnoth said:
I take "convergence" to mean something entirely different.

......

I think that very soon, renderfarms for making movies will use chips from ATI and/or nVidia for most of the processing.

I don’t disagree with that.

Running RenderMan on hardware is not trivial, but it’s easier than many people think. In short, OpenGL2 can be used as an assembly language to compile the high-level RenderMan shaders. Antialiasing, depth of field and motion blur can be implemented with the accumulation buffer (the built-in hardware antialiasing is hardly an option because of the memory requirements). Displacement mapping can be implemented by rendering directly to vertex buffers. And so on… Even an R300 can implement this.
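
(For the accumulation-buffer part, here is a minimal sketch of the classic jittered-rendering loop; jitter_projection() and draw_scene() are hypothetical placeholders, and the same loop gives depth of field or motion blur if you jitter the lens or the time instead.)

  /* Minimal accumulation-buffer supersampling loop (classic OpenGL 1.x API). */
  #include <GL/gl.h>

  extern void jitter_projection(int pass);   /* apply a sub-pixel frustum offset        */
  extern void draw_scene(int pass);          /* draw, optionally at jittered time/lens  */

  void render_accumulated(int samples)
  {
      glClear(GL_ACCUM_BUFFER_BIT);
      for (int i = 0; i < samples; ++i) {
          jitter_projection(i);
          glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
          draw_scene(i);
          glAccum(GL_ACCUM, 1.0f / samples);   /* add 1/N of this pass                  */
      }
      glAccum(GL_RETURN, 1.0f);                /* write the average back to the framebuffer */
  }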

But the R300 hasn’t replaced any CPU in any renderfarm. Software rendering is more practical. The algorithms used in graphics hardware are designed to operate fast when a number of underlying assumptions are valid, mainly that the scene has a small number of relatively big polygons. When the rendered scene has a large number of pixel- or sub-pixel-sized polygons with high levels of antialiasing and motion blur (the standard scenario in offline rendering), these algorithms are inefficient (hence the lack of robustness I mentioned in my previous post).

In particular I see two major inefficiencies.
- Vertices are transformed after the tessellation of geometry into polygons. Transforming the control points before the tessellation is faster, since tessellation is cheaper than transformation (see the rough sketch after this list). And geometry is constructed using high-order surfaces anyway, since explicit polygon modeling is practical only for low-poly models. And to my knowledge this is true even for today’s games.
- Scan conversion doesn’t make any sense when the polygons are only a few pixels big. I think this has been explained many times on this board.
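
(A rough back-of-the-envelope illustration of the first point; the numbers are made up but representative, assuming a single bicubic patch diced to a 32x32 grid and counting only the transform cost.)

  /* Illustrative only: compare transforming 16 control points and then
     tessellating, versus tessellating first and transforming every
     generated vertex. Tessellation itself costs the same in both cases. */
  #include <stdio.h>

  int main(void)
  {
      const int control_points  = 16;              /* bicubic patch             */
      const int tess_grid       = 32;              /* 32x32 vertices when diced */
      const int diced_vertices  = tess_grid * tess_grid;
      const int flops_per_xform = 28;              /* 4x4 matrix * point        */

      printf("transform, then tessellate: %d flops\n",
             control_points * flops_per_xform);    /*   448 flops */
      printf("tessellate, then transform: %d flops\n",
             diced_vertices * flops_per_xform);    /* 28672 flops */
      return 0;
  }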

So, I think software rendering will not be replaced any time soon, at least not until the hardware starts to support more robust algorithms.
 
For twenty years, the overall average frame time has consistently been half an hour.
This may be the human factor.

I remember JC or Brian Hook saying a rule like this applied to Quake3 map compile times. If they sped up the map compiler, the level designers made the levels more complicated, so the compiles always took between fifteen minutes and half an hour - i.e. long enough to take a coffee break, but not so long you can't test your result several times per day.

Perhaps there is an empirical rule for 'offline' work?

I must admit, I find 'intermediate' compile times of 2-5 minutes most irritating - if it's 15 minutes+ I can go play the guitar or get lunch or something, while if it's a 'quick' compile I want results now.
 
Simon F said:
But in a sense, that's what the DX/OGL assembler is - it gets converted into the hardware's instruction set. For example, take the XBox specs that were leaked - its instruction set seems quite different from DX's. The 3DLabs approach (a scalar architecture) would be another example.

They are, but they aren't very good ones for more general-purpose programming where you don't worry about register pressure yourself (shaders will move there eventually too).

Limited-register-set VMs make life hard on compilers; you have to reverse existing optimization passes. Stack-based or infinite-register-set VMs are a little better (lots of compilers use stack-based intermediate languages), and with SSA an infinite register set is better still.
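
To make the contrast concrete, here is one trivial expression with sketches of the two IR styles in the comments (the lowerings are illustrative, not real compiler output):

  /* One expression, two illustrative intermediate forms. */
  int madd(int a, int b, int c)
  {
      /* Stack-based VM (JVM-style sketch):
           load a; load b; add; load c; mul; return
         The intermediate result only exists implicitly on the stack, so an
         optimizer first has to reconstruct the dataflow.

         Infinite-register SSA VM (LLVM-style sketch):
           %t1 = add i32 %a, %b
           %t2 = mul i32 %t1, %c
           ret i32 %t2
         Every value has exactly one explicit definition, which is the form
         most optimization passes want to start from. */
      return (a + b) * c;
  }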

RussSchultz, it is not meant for emulation ... it is meant for on-the-fly translation/compilation. Like what Transmeta does with x86, only without throwing away the results each time. It needs OS hooks for that, obviously ...
 
MfA said:
RussSchultz, it is not meant for emulation ... it is meant for on-the-fly translation/compilation. Like what Transmeta does with x86, only without throwing away the results each time. It needs OS hooks for that, obviously ...
Perhaps then it shouldn't be called a low-level virtual machine, but a high-level one? Or even a translation layer?
 
His argument for the name is this:

Unlike high-level virtual machines, the LLVM type system does not specify an object model, memory management system, or specific exception semantics that each language must use. Instead, LLVM only directly supports the lowest-level type constructors, such as pointers, structures, and arrays, relying on the source language to map the high-level type system to the low-level one. In this way, LLVM is language independent in the same way a microprocessor is: all high-level features are mapped down to simpler constructs.
 
MfA said:
Happens in all non-branching forums; it's just a poor design for discussion (of course HTML forces us into it).

The main programmer for the PDI renderer used for Antz/Shrek has a site, BTW (and works for NVIDIA now). It doesn't tell much about the rendering engine, but he has posted some tidbits on Usenet on occasion.

He used a tiled A-buffer renderer with deferred shading ... the shading parameters in the A-buffer could also be used for subsequent raytracing I believe.

I'm afraid this is accurate. I know of the guy that left PDI, but I'm not sure the information on his website should be public knowledge. Hmmm.....

-M
 
Chalnoth said:
I think that very soon, renderfarms for making movies will use chips from ATI and/or nVidia for most of the processing.

I agree here, but not for most of the processing. I think that we can look at the possibilities for some redundant calculations that the 3d hardware can do many times faster than a CPU.

-M
 
And I think that most of the processing would be done faster on the GPU. The only major portion of the processing that wouldn't be best suited for execution on a GPU would be processing that is either of a type GPUs are poorly suited for (64-bit FP, lots of branching, or a variable number of loops of a short routine), or non-graphics processing. For example, modern movies may be made with much of the animation done using physics engines to improve realism.

Of course, modern GPUs aren't quite advanced enough to take over most of the processing, but it won't be much longer...
 
Chalnoth said:
And I think that most of the processing would be done faster on the GPU. The only major portion of the processing that wouldn't be best suited for execution on a GPU would be processing that is either of a type GPUs are poorly suited for (64-bit FP, lots of branching, or a variable number of loops of a short routine), or non-graphics processing. For example, modern movies may be made with much of the animation done using physics engines to improve realism.

Of course, modern GPUs aren't quite advanced enough to take over most of the processing, but it won't be much longer...

Again, I disagree here. Currently the hardware (as well as software APIs) is just too limited for anything production-quality. Perhaps in a few more years (note: more than 2), we might start to see some studios using GPUs for redundant simple tasks.

Geometry shaders that allow procedural, on-the-fly generation of geometry will be a while yet, though (maybe 5-10 years).

Flexibility will also be a huge factor in whether studios will start using 3d hardware for development. They have a while to go on that as well...

-M
 
I don't think there's far to go at all.

First of all, the NV3x uses IEEE FP32. That means that the calculations done on the GPU would be essentially the same as those done on the CPU. This makes the GPU well-suited to a "software assist" mechanism, where some shaders that would work better on the CPU are done there, with most executed on the GPU.

What shaders would be executed on the CPU? I would tend to think only those that have a large number of possible execution branches (either through a while loop or just lots of separate if's). As far as I know, the NV3x is currently capable of executing any production-level shader. Some just may be particularly slow or require a bit of extra precision.

And as for how far we need to go in flexibility on GPU's, I think we're much closer than you think. If the gen-4 hardware (NV4x, R4xx) unifies the pixel and vertex shaders, we're 95% there. That is, such an architecture would very easily generalize to any number of shader programs (today we have two: vertex and pixel. In the future, one could add a per-patch shader, which would be the rumored primitive processor). I don't know if that generation will offer more than just vertex and pixel shaders, but from all we've heard, it seems likely. I certainly hope so, as higher-order surfaces have been taking entirely too long to catch on.
 
Chalnoth said:
I don't think there's far to go at all.

First of all, the NV3x uses IEEE FP32. That means that the calculations done on the GPU would be essentially the same as those done on the CPU. This makes the GPU well-suited to a "software assist" mechanism, where some shaders that would work better on the CPU are done there, with most executed on the GPU.

Let's indulge this for a moment. What shaders specifically do you think will automatically work now? How much precision can I be promised with a powf() function? Suppose I wanted to control the shape of my highlight by a variable N. How much does a simple powf() function cost in terms of the number of registers needed? How about shaders being able to call other shaders? Suppose I wanted to compute a noise gradient on-the-fly for use with bump mapping, and I wanted to create a bump-mapping shader that implements this? How much control can the user have over the variables being passed to these shaders in HLSL?

As far as I know, the NV3x is currently capable of executing any production-level shader.

Impossible. Here's a production-level shader: suppose we have a shader that casts rays through an arbitrary volume and accumulates density by evaluating a shading tree composed of a bunch of filtered noise functions, then calls an illumination shader to retrieve an intensity for that pixel in the volume (ignoring self-shadowing).
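
For reference, a bare-bones CPU sketch of that kind of ray marcher; density() and illuminate() stand in for the filtered-noise shading tree and the illumination shader, and both are hypothetical placeholders:

  /* Accumulate density along a ray and attenuate the illumination it picks
     up; self-shadowing is ignored, as above. density() and illuminate()
     are hypothetical stand-ins for the shading tree and light shader. */
  #include <math.h>

  typedef struct { float x, y, z; } vec3;

  extern float density(vec3 p);      /* filtered-noise shading tree     */
  extern float illuminate(vec3 p);   /* illumination shader at point p  */

  float march(vec3 origin, vec3 dir, float t0, float t1, float dt)
  {
      float intensity    = 0.0f;
      float transparency = 1.0f;
      for (float t = t0; t < t1 && transparency > 0.001f; t += dt) {
          vec3  p     = { origin.x + t * dir.x,
                          origin.y + t * dir.y,
                          origin.z + t * dir.z };
          float rho   = density(p);
          float alpha = 1.0f - expf(-rho * dt);   /* extinction over this step */
          intensity    += transparency * alpha * illuminate(p);
          transparency *= 1.0f - alpha;
      }
      return intensity;
  }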

Some just may be particularly slow or require a bit of extra precision.

If it is too slow, then we are back to software rendering! Basically, if you can't get the 3d hardware to perform significantly faster than the CPUs on a renderfarm, then you are back to square one in my book.

-M
 
General-purpose processors aren't good at anything; even for software rendering you can always design a better processor.
 
Mr. Blue said:
If it is too slow, then we are back to software rendering! Basically, if you can't get the 3d hardware to perform significantly faster than the CPUs on a renderfarm, then you are back to square one in my book.

-M
If you've sped up some shaders, you're ahead overall.

Anyway, the major barrier today, I believe, is in software. It would just be too much work today to make use of an NV3x for high-end graphics processing. I think the hardware is to the point where it could be useful, but the software certainly isn't there. So it's not happening...yet.
 
Mr. Blue said:
Impossible. Here's a production-level shader: suppose we have a shader that casts rays through an arbitrary volume and accumulates density by evaluating a shading tree composed of a bunch of filtered noise functions, then calls an illumination shader to retrieve an intensity for that pixel in the volume (ignoring self-shadowing).

This is not impossible. We already have compilers which translate RenderMan shaders into bytecode for interpretation, and there’s nothing that prevents you from writing a compiler that targets the OpenGL2 API instead. I’m referring to OpenGL2 because it doesn’t have any hardware limits (instruction limits, texture fetch limits, etc…), but it’s certainly feasible with DX9 too. In fact, converting the bytecode of my renderer to OpenGL2 shaders is trivial for the majority of shaders, but I’m not sure if any hardware today supports the OpenGL shading language.

Do not expect the hardware APIs to directly support high-level features such as surface, displacement, light, atmosphere, and volume shaders. OpenGL2 chose a lower level of abstraction than RenderMan, and probably lower than whatever PDI uses. We must write a renderer which compiles all these shaders into a single pixel shader.
This is certainly feasible, but I don’t think it’s practical. Such a renderer will be unable to efficiently handle gigabytes of textures, subsurface scattering, global illumination, procedural/delayed primitives, ray tracing, etc… So, the usability of such a thing is questionable.

Interestingly enough, I think the ray-marching shader you described will be one of the cases where the hardware will easily beat the software, because of the sheer amount of simple calculations that need to be done.
 
Pavlos said:
This is not impossible....
Interestingly enough, I think the ray-marching shader you described will be one of the cases where the hardware will easily beat the software, because of the sheer amount of simple calculations that need to be done.

Proof is in the pudding.

I'll reserve judgement for when our R&D group comes down to me and asks us to start fiddling with 3d hardware. As it stands, I haven't heard of any of our films prepared to use it for at least the next 2 years...

-M
 
Mr. Blue said:
Pavlos said:
This is not impossible....
Interestingly enough, I think the ray-marching shader you described will be one of the cases where the hardware will easily beat the software, because of the sheer amount of simple calculations that need to be done.

Proof is in the pudding.

I'll reserve judgement for when our R&D group comes down to me and asks us to start fiddling with 3d hardware. As it stands, I haven't heard of any of our films prepared to use it for at least the next 2 years...

-M

Naturally, I believe you would be among the last to see such systems put into use. Actual production work isn't really the place for testing unproven methods :)
 
Pavlos said:
This is not impossible. We already have compilers which translate RenderMan shaders into bytecode for interpretation, and there’s nothing that prevents you from writing a compiler that targets the OpenGL2 API instead. I’m referring to OpenGL2 because it doesn’t have any hardware limits (instruction limits, texture fetch limits, etc…), but it’s certainly feasible with DX9 too. In fact, converting the bytecode of my renderer to OpenGL2 shaders is trivial for the majority of shaders, but I’m not sure if any hardware today supports the OpenGL shading language.
Whoa whoa whoa whoa! Major correction on OpenGL Shading Language needed.

There are virtual *and* physical limits to the OpenGL Shading Language. What we virtualized were the things that were difficult to count in a device independent way -- temporaries, instructions and texture fetch restrictions.

But there are very real physical limits. Some of the constraints are small and harsh. Just a few:
  • Vertex attributes - 16 vec4s is the minimum maximum.
  • Varying floats (interpolators) - 32 floats is the minimum maximum.
  • Texture units - *2* is the minimum maximum, with *0* the minimum maximum number of texture units available to the vertex shader. (ATI's initial implementation has 16 texture units.)
Production shaders *will* (not may, *will*) exceed these limits - by large margins. So production shaders *will* have to be broken up à la Peercy/Olano et al. (The example RenderMan shaders in the paper are *not* close to production shaders in size or scope. But the good news is that some of these simple shaders from the paper can now be directly ported to the OpenGL Shading Language.)
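
(For what it's worth, an application can query where a given implementation actually lands on these limits - a minimal sketch assuming OpenGL 2.0 headers; with the original ARB extensions the enums carry an _ARB suffix.)

  /* Query the implementation limits corresponding to the values above. */
  #include <GL/gl.h>
  #include <stdio.h>

  void print_shader_limits(void)
  {
      GLint vattribs = 0, varyings = 0, texunits = 0, vtexunits = 0;

      glGetIntegerv(GL_MAX_VERTEX_ATTRIBS,             &vattribs);
      glGetIntegerv(GL_MAX_VARYING_FLOATS,             &varyings);
      glGetIntegerv(GL_MAX_TEXTURE_IMAGE_UNITS,        &texunits);
      glGetIntegerv(GL_MAX_VERTEX_TEXTURE_IMAGE_UNITS, &vtexunits);

      printf("vertex attributes (vec4s): %d\n", (int)vattribs);
      printf("varying floats:            %d\n", (int)varyings);
      printf("fragment texture units:    %d\n", (int)texunits);
      printf("vertex texture units:      %d\n", (int)vtexunits);
  }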

So, not quite so trivial for the majority of shaders, let alone production shaders.

On Mr. Blue's hardware pudding, I'd say we know where the ingredients are, and we probably even know how to mix them, but we still have to get out the whisk, get everything into the saucepan and then chill it to make the pudding. We don't yet know if we'll get pudding out of this. And if we do, since on some of the ingredients we've used substitutions (and even left a couple of pinches out), we still aren't quite sure how it will taste.

Mr. Blue already knows the software pudding tastes good. (So does anyone who saw Bunny or Ice Age.)

Finally, if history is any guide: the short Red's Dream was rendered in software for the opening and closing sequences, and in hardware (the Pixar Image Computer) for the dream sequence. All predating RenderMan, BTW. As far as I know, it's only distributed on the "Tiny Toy Stories" VHS, and in QuickTime on Pixar's web site. See for yourself.

-mr. bill
 