IBM has got Pixar in its crosshairs.

That is interesting actually. I was half expecting a big tie-in with NVIDIA and Pixar to be announced at SIGGRAPH, in which case that pits them right against their fab. I wonder if NVIDIA may be involved in this instead.
 
It seems like it may be challenging for nVidia to leverage their processors with distributed computing. Still, perhaps this is why nVidia recently announced these technologies for the Quadro FX 3000:

Framelock—Allows professionals to link multiple systems in a cluster to scale power to problem size for life-sized visualization

Genlock—Enables synchronization with standard signal formats and house-sync signals for video post-production, broadcast, compositing, and editing solutions

(ripped off of the nvnews.net front page)
 
Well let's hope this doesn't go the same way as IBM's previous foray into the CGI/VFX market - Digital Domain. Still around, sure, but they fell out of favour the moment James Cameron took his name off the letterhead.

I think what people seem to forget about CG animation is that the technology always comes secondary to the storytelling (in Pixar's case, anyway). Doesn't matter how quickly you can render the frames or how good the animation is if the story sucks.
 
Oh, one more thing.

If IBM is going to start charging for the use of a distributed computing network, do you think it will be long before companies start paying people to run distributed computing software on their home PCs?
 
Half the time to make a CG movie?
With advances in rendering speeds, I can see that.


But Pixar has an edge as well with the new systems they're running. They can scale just as quickly as technology allows. They knew SGI was starting to hit a wall, hence the move to Linux-based systems, IBM-based ones at that.

PR Fluff is all I see right now, I want to see a final result.
 
Interesting article - however, IBM has a long, long way to go. Pixar is just one - although probably "number one" - of many. DreamWorks is using HP equipment; they pulled off Shrek and recently Sinbad.

The common theme is the Linux render farms. Very cheap compared to the former Sun and SGI solutions... Heck, Intel-based platforms are, in short, throwaway items. You can have several in reserve to bring online in no time...

Side note - IBM said they will have a movie out in 14-18 months. Still a ways out, considering Pixar and DreamWorks will be releasing movies before then.
 
DaveBaumann said:
That is interesting actually. I was half expecting a big tie-in with NVIDIA and Pixar to be announced at SIGGRAPH, in which case that pits them right against their fab. I wonder if NVIDIA may be involved in this instead.

IBM runs all of their divisions as separate companies responsible only for their own profits. For example, IBM's microprocessor division gets no special treatment from the IBM foundry; they have to pay the same rates and wait for the same turnaround as Nvidia or anyone else using the fab. Different IBM divisions have been known to compete for the same contracts.

In other words, I don't think IBM's "on-demand computing" people care one bit about competing with an IBM foundry customer's interests.
 
Read an article in Linux Journal today saying that Industrial Light and Magic swapped out their big SGI renderfarm for a 750-node Linux cluster, each node having two Athlon XP 1600+s.

Nice.
 
I wonder just when GPUs will fully replace CPUs for those types of uses.
I guess the NV40/R500 will be a big step forward: a PPP would actually give a real advantage, because you can do "WOW!" quality when you want offline rendering, and "Nice." quality when you want it in real time - all on the exact same model. This will be, I believe, a very important win for IHVs here.

Having branching everywhere, and not only static branching, will most likely also favor such a move. Offline rendering likes complete flexibility, and having no branching at all must seem completely insane to them...

Another thing required here is obviously full speed FP32. NV35 is near, but the register usage problem stops that ( an improvement with future drivers isn't out of the question... ) - the NV40 will also be a big step in that direction, AFAIK.

And obviously, we need multichip for it. ATI has a huge advantage right now there; the Quadro FX 3000G isn't bad, but ATI is simply superior...
But keep in mind multichip was one of the NV30's original design goals. Just yet another thing which was cut due to lack of time, and which we'll likely find in the NV40 (for workstations only, of course, I bet).

The NV30 sounded great on paper for it. But now, I doubt it is.


Uttar
 
Uttar said:
I wonder just when GPUs will fully replace CPUs for those types of uses.
Uttar
This is a very good question.

Ian Stephenson will demonstrate at this year's Siggraph a RenderMan implementation on top of a PS2. It has very bad quality and it's not real time, but it demonstrates that it can be done and that a PS3 renderfarm is a possibility, as the author says. A short paper and a presentation are available here. If the rumors about the micropolygon architecture of the PS3 are true, then it is very well suited for the task. Considering the relatively low price of consoles and the scalability of the platform, this is even more interesting.

The situation with PC GPUs is, I think, a bit murkier. Apart from the shading power and flexibility, the GPUs must be able to perform real sub-pixel displacement mapping. This is essential for realistic rendering, and the only way to do it is by displacing sub-pixel micropolygons. And this is a really big problem for the traditional GPU pipeline, which has remained the same for the last 10+ years. The traditional SGI rendering pipeline separates vertex from pixel shading for efficiency reasons. But when the polygons are smaller than a pixel, it doesn't make any sense to shade the pixels and the vertices separately. A temporary solution will probably be to dynamically allocate the computational resources between pixel and vertex shading, but it's still a waste of silicon.
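To make the micropolygon point concrete, here's a rough C sketch of the "dice and displace" idea - dice a parametric surface into micropolygons smaller than a pixel, push each vertex along its normal, then shade per vertex. The sphere and the ripple displacement are made-up stand-ins, not code from any real renderer:

Code:
#include <stdio.h>
#include <math.h>

/* Rough sketch of REYES-style "dice and displace": a parametric surface is
 * diced into micropolygons smaller than a pixel, every vertex is displaced
 * along its normal, and shading happens per micropolygon vertex - there is
 * no separate vertex vs. pixel stage.  The sphere and the sine "displacement
 * map" are made-up stand-ins, not any real renderer's code. */

typedef struct { float x, y, z; } Vec3;

static const float PI = 3.14159265f;

/* a unit sphere parameterised by (u, v) in [0,1]^2 */
static Vec3 eval_patch(float u, float v)
{
    float theta = u * 2.0f * PI, phi = v * PI;
    Vec3 p = { sinf(phi) * cosf(theta), sinf(phi) * sinf(theta), cosf(phi) };
    return p;
}

/* for a unit sphere the normal is just the position */
static Vec3 eval_normal(float u, float v) { return eval_patch(u, v); }

/* toy displacement map: small ripples over the surface */
static float displacement_amount(float u, float v)
{
    return 0.02f * sinf(40.0f * u) * sinf(40.0f * v);
}

int main(void)
{
    /* the dicing rate would normally be chosen so each micropolygon
     * projects to less than a pixel; a fixed grid keeps the sketch simple */
    const int grid = 64;
    double checksum = 0.0;

    for (int j = 0; j <= grid; ++j) {
        for (int i = 0; i <= grid; ++i) {
            float u = (float)i / grid, v = (float)j / grid;

            Vec3  p = eval_patch(u, v);
            Vec3  n = eval_normal(u, v);
            float d = displacement_amount(u, v);

            /* true displacement: move the vertex along its normal */
            p.x += n.x * d;  p.y += n.y * d;  p.z += n.z * d;

            /* a real renderer would shade and sample into the image here,
             * per micropolygon vertex */
            checksum += p.z;
        }
    }
    printf("diced a %d x %d vertex grid (checksum %.3f)\n",
           grid + 1, grid + 1, checksum);
    return 0;
}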

Another requirement is the efficient implementation of antialiasing, depth of field and motion blur, but this is a different story, because Pixar has patented the use of stochastic sampling in computer graphics (I know, it's ridiculous).
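For anyone curious what that technique actually looks like, here's a toy C sketch of stochastic (jittered) sampling: split the pixel into an n x n grid, drop one sample at a random spot in each cell, and give every sample a random time inside the shutter interval for motion blur. The "scene" is just a vertical edge sweeping across the screen, purely for illustration:

Code:
#include <stdio.h>
#include <stdlib.h>

/* Toy sketch of stochastic (jittered) sampling: split the pixel into an
 * n x n grid, place one sample at a random spot in each cell, and give
 * every sample a random time inside the shutter interval so fast-moving
 * edges come out motion-blurred.  The "scene" here is just a vertical
 * edge sweeping from x=10 to x=30 during the shutter. */

static float frand(void) { return (float)rand() / (float)RAND_MAX; }

/* hypothetical scene: white to the left of a moving edge, black to the right */
static float sample_scene(float x, float y, float t)
{
    float edge = 10.0f + 20.0f * t;   /* the edge moves while the shutter is open */
    (void)y;
    return x < edge ? 1.0f : 0.0f;
}

/* the pixel's grey value, averaged over n*n jittered samples */
static float sample_pixel(int px, int py, int n)
{
    float sum = 0.0f;
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < n; ++i) {
            /* jitter inside each sub-cell: trades regular aliasing for noise */
            float x = px + (i + frand()) / n;
            float y = py + (j + frand()) / n;
            float t = frand();          /* shutter opens at t=0, closes at t=1 */
            sum += sample_scene(x, y, t);
        }
    }
    return sum / (float)(n * n);
}

int main(void)
{
    /* pixels the edge passes over blur smoothly instead of popping */
    for (int px = 8; px <= 32; px += 4)
        printf("pixel %2d -> %.2f\n", px, sample_pixel(px, 0, 4));
    return 0;
}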


The SRT Rendering Toolkit
 
Uttar said:
I wonder just when GPUs will fully replace CPUs for those types of uses.

Realistically?? Most likely never. Knowing a little about the Pixar render stack as well as others, I really don't see it happening. For instance, FP32 isn't generally used, as it has "issues".

I guess the NV40/R500 will be a big step forward: a PPP would actually give a real advantage, because you can do "WOW!" quality when you want offline rendering, and "Nice." quality when you want it in real time - all on the exact same model. This will be, I believe, a very important win for IHVs here.

They can already do this. The models that a studio like Pixar works with aren't generally polygon based; in general they are using B-splines/NURBS. They have been able to render to OpenGL for quite some time.

Another thing required here is obviously full speed FP32. NV35 is near, but the register usage problem stops that ( an improvement with future drivers isn't out of the question... ) - the NV40 will also be a big step in that direction, AFAIK.

As stated above, FP32 generally isn't useful for studio work. They generally prefer FP64 for geometry, though they might be able to use FP32 for color. FP32 geometry can have some precision issues for some scenes, and the simpler fix isn't to tell the artists what to do, but just to use FP64.
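A tiny illustration of the kind of precision problem being described - far from the origin, a float can no longer resolve a small offset, while a double still can. The numbers are arbitrary examples, not from any real scene:

Code:
#include <stdio.h>

/* Far from the scene origin, a 32-bit float cannot resolve a 0.1-unit
 * offset any more, while a 64-bit double still can.  Arbitrary example
 * values, purely for illustration. */
int main(void)
{
    float  pos_f = 10000000.0f;      /* a world-space coordinate, in floats */
    double pos_d = 10000000.0;       /* the same coordinate, in doubles     */

    float  moved_f = pos_f + 0.1f;   /* offset below the float's resolution */
    double moved_d = pos_d + 0.1;

    printf("float : moved by %g\n", moved_f - pos_f);   /* prints 0   */
    printf("double: moved by %g\n", moved_d - pos_d);   /* prints 0.1 */
    return 0;
}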

This is all neglecting the incredible amount of data they are using as well. A single frame in something like Monsters, Inc. can use over 1 GB of source data, and shaders that number in the thousands, with thousands of instructions per shader.

Might happen, but it will definitely be beyond the DX10 timeframe for GPUs.


Aaron Spink
speaking for myself inc.
 
Brimstone said:
http://www.usatoday.com/tech/techinvestor/techcorporatenews/2003-07-24-ibm_x.htm

IIRC this, along with very similar approaches from both SGI and HP, was presented at Graphics Hardware 2003 in the "Hot 3D" sessions.
It certainly seems like an interesting twist on the custom/consumer implementation cycle. How they supply the huge networking bandwidth needed was the interesting part.
 
Something interesting from Carmack: http://www.gamespy.com/quakecon2003/carmack/
...
GameSpy: Are you going to retire after DOOM 3?
John Carmack: No. I've got at least one more rendering engine to write.
...

The very latest set of cards, with the combination of those features -- floating point and dependent texture reads and the ability to use intermediaries -- you can now write really generalized things and that is appropriate. You might use 50 or 100 potential instructions in some really complex gaming shader; but if the engine is architected right, you would be able to use the exact same engine, media creation, and framework and architect the whole thing to do TV-quality or eventually even movie-quality rendering that might use thousands of instructions and render ridiculous resolutions. The ability to use the same tools for that entire spectrum is going to be a little different from what we have now.
Something is coming.
 
Of course, as many pointed out, cinematic-quality rendering would require FP64 units.

It seems to me that even if some companies believe that they can go all FP32, and that there is no need to offer any lower precisions for performance, multiple precisions will be required once FP64 comes out.

Given the number of extra transistors required for FP64 processing, it seems only natural that any FP64-capable GPU should also have the capability to process smaller chunks of data, and would likely have additional functional units at lower precisions.

Personally, I think about the best future-looking 3D architecture would include support for:
INT16
FP32
FP64

FP16 may also be an option, but it seems too tied to 8-bit DACs to be very useful in the long term. INT16 would allow for a large number of fast color operations that would be useful for higher-precision DACs. FP32 would be good for most other fragment operations, including any sort of color operation requiring a reasonable dynamic range, and for non-color data. FP64 would be best used for geometry data and very long shaders.
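For reference, a quick C snippet showing roughly what each of these formats can hold (FP16 has no standard C type, so its figures are written out by hand):

Code:
#include <stdio.h>
#include <float.h>

/* Quick look at what each format can actually hold.  FP16 has no standard
 * C type, so its numbers are written out by hand. */
int main(void)
{
    printf("INT16: integers -32768..32767 (or 0..65535 unsigned), no exponent\n");
    printf("FP16 : ~3 decimal digits, integers exact up to 2048\n");
    printf("FP32 : %2d decimal digits, epsilon %g, integers exact up to 2^24\n",
           FLT_DIG, FLT_EPSILON);
    printf("FP64 : %2d decimal digits, epsilon %g, integers exact up to 2^53\n",
           DBL_DIG, DBL_EPSILON);
    return 0;
}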

The best realtime GPU with support for these would include support, in hardware, for all three, at a ratio determined by the games.

The best offline-renderer would include native support for only FP64, with memory bandwidth/storage and possibly register usage benefits for using lower precisions.
 
Chalnoth said:
Of course, as many pointed out, cinematic-quality rendering would require FP64 units.
They are wrong. PRMan and many other renderers use almost exclusively single-precision maths. Here's a quote from Larry Gritz that confirms this:
Larry Gritz said:
The Ri routines are all single precision (so all input is parsed and
put into floats), and thus both BMRT and PRMan are almost completely
float on the inside. Of course, both use doubles occasionally as
temporaries for intermediate calculations in certain parts of the
renderers where that last little bit of precision is vital. But it's
almost correct to say that both renderers are just single precision
throughout.
But keep in mind that CPUs perform all computations at ~80-bit precision, no matter what data types are used. It's only the results stored back to 32- or 64-bit variables that get rounded. Future GPUs will probably use a similar approach.

The SRT Rendering Toolkit
 
Pavlos said:
But keep in mind that CPUs perform all computations at ~80-bit precision, no matter what data types are used.

You can change the working precision by setting the FPU control word. The FPU can work in either 32, 64 or 80 bit.
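A minimal sketch of that, for the curious - glibc/x86 specific, and it only matters if the compiler is actually emitting x87 instructions rather than SSE:

Code:
#include <stdio.h>
#include <fpu_control.h>   /* glibc / x86 specific */

/* The precision-control bits of the x87 control word select 24-, 53- or
 * 64-bit mantissas for intermediate results.  This is glibc/x86 specific
 * and only matters if the compiler is actually emitting x87 instructions
 * (not SSE, which is the default on x86-64). */
int main(void)
{
    volatile double x = 1.0, y = 3.0;
    fpu_control_t cw;

    _FPU_GETCW(cw);
    cw = (cw & ~_FPU_EXTENDED) | _FPU_SINGLE;   /* 24-bit mantissa */
    _FPU_SETCW(cw);
    printf("1/3 with 24-bit intermediates: %.17g\n", x / y);

    _FPU_GETCW(cw);
    cw |= _FPU_EXTENDED;                        /* back to 64-bit mantissa */
    _FPU_SETCW(cw);
    printf("1/3 with 64-bit intermediates: %.17g\n", x / y);
    return 0;
}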
 
aaronspink said:
As stated above, FP32 generally isn't useful for studio work. They generally prefer FP64 for geometry, though they might be able to use FP32 for color. FP32 geometry can have some precision issues for some scenes, and the simpler fix isn't to tell the artists what to do, but just to use FP64.

This is all neglecting the incredible amount of data they are using as well. A single frame in something like Monsters, Inc. can use over 1 GB of source data, and shaders that number in the thousands, with thousands of instructions per shader.

Might happen, but it will definitely be beyond the DX10 timeframe for GPUs.


Aaron Spink
speaking for myself inc.

In relation to the FP32/FP64 thing, well, if studios need it, nVidia/ATI will implement it. It's as simple as that, really. Doing FP64 in two clocks is really no problem IMO. Still way faster than CPUs. And if studios need it, I wouldn't be surprised if some insane developers asked for it eventually, making them HAVE to implement it anyway...

So, consider maybe using an F-Buffer-like system for both VS and PS eventually (that is, if ATI doesn't have a patent on it they would seriously consider using - but then again, nVidia hired the guy who did the prior work and released the initial idea, so I doubt ATI would have much of a case here, hehe) - that'd fix the instruction limit problem.

Then finally you've got the 1 GB of data problem. Well, how is that a problem? Eh! AGP 8X could do that in about a second! Maybe more, since that's a best-case, peak scenario.
Certainly faster than a few hours ;) Sure, it isn't realtime, but I asked when GPUs would replace CPUs for that stuff, not when they'll replace CPUs and do their job in realtime :)
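(Rough numbers, for what it's worth: AGP 8X peaks at about 2.1 GB/s, so 1 GB is on the order of half a second at theoretical peak - call it a second or two with real-world overhead. Either way, it's nothing next to render times measured in hours per frame.)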

I'd hope they could have all that stuff in the NV50 already. But then again, I hope not. The NV50 is a huge evolution in many aspects, and I want them to focus on the core of that architecture. Not on stupid stuff like making an F-Buffer-like system work in the Vertex Shader or on making FP64 in two clocks possible.

So, when you say beyond the DX10 timeframe, you're probably right. The NV50 would be DX10, and the NV60 would be the part being perfect to replace CPUs in studios. That is all speculation of course, but still, it'd make sense.

NV01: Revolutioning the way you fail ( TM )
NV30: Making so called "cinematic computing" possible, at amazing, sub-0 framerates!
NV60: Making it really possible.

Yeah, I know, numerology is evil, just having some fun here, hehe.


Uttar
 
All precisions have issues. I wish we had compilers smart enough to figure out tight bounds on errors, so they could calculate the necessary precision at any point in an algorithm given bounds on the input and error bounds on the output (hard in general, but shaders usually have a pretty straightforward structure... hell, Carmack thought NVIDIA might start using it). Floating-point calculation is almost always an inherently non-deterministic gamble on the part of the programmer :) A hack, if you will.
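Something like interval arithmetic is the textbook way to do that kind of bound tracking. A toy C sketch, just to show the idea - a real analysis would also add the rounding error (half a ULP per operation) at every step, while this only propagates the input ranges:

Code:
#include <stdio.h>
#include <math.h>

/* Toy sketch of bound tracking: carry a [lo, hi] interval through every
 * operation so you get a conservative bound on the result given bounds
 * on the inputs.  A real analysis would also add rounding error at each
 * step; this only propagates the input ranges. */

typedef struct { double lo, hi; } Interval;

static Interval iadd(Interval a, Interval b)
{
    Interval r = { a.lo + b.lo, a.hi + b.hi };
    return r;
}

static Interval imul(Interval a, Interval b)
{
    double p1 = a.lo * b.lo, p2 = a.lo * b.hi;
    double p3 = a.hi * b.lo, p4 = a.hi * b.hi;
    Interval r = { fmin(fmin(p1, p2), fmin(p3, p4)),
                   fmax(fmax(p1, p2), fmax(p3, p4)) };
    return r;
}

int main(void)
{
    /* e.g. x*x + x*y, where the shader inputs are known to lie in a range */
    Interval x = { 0.9, 1.1 }, y = { -0.2, 0.3 };
    Interval r = iadd(imul(x, x), imul(x, y));

    printf("result lies in [%g, %g]\n", r.lo, r.hi);
    return 0;
}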
 
Humus said:
You can change the working precision by setting the FPU control word. The FPU can work in either 32, 64 or 80 bit.

Of course you are right, you can change the internal precision on x86 with the "fldcw" opcode. But IIRC it will not change the execution speed of any instruction, apart from divides and sqrts, so the usefulness of doing this is at least questionable. Of course, feel free to correct me if I'm wrong.

But my point was that additional functional units are not required for the hardware to support additional precisions, and that comparing precision between CPUs and GPUs is meaningless, at least until the GPUs obtain IEEE conformance (for those who don't know, IEEE defines a minimum precision for every operation; it's not just a floating-point storage format).

The SRT Rendering Toolkit
 