"Turn every pixel into a polgyon"-renderer

psurge,
I don’t think there are any benchmarks on the net, as speed is not the primary concern for RenderMan users. The 0.1fps (6 seconds/frame) figure was from a scene with 28800 polygons, 9 lights, and some not-so-simple displacement and surface shaders, rendered at 1440x960 with 16x antialiasing. And this includes the parsing of the 2MB RIB file. Note that I used polygons, a very bad and unoptimized path; using NURBS or other high-level primitives will probably give faster times. Also, I forgot to mention that, although the original SIGGRAPH ‘87 paper doesn’t mention it, the architecture defers the shading calculations, so depth complexity is almost irrelevant to the rendering times.

In addition, consider that Prman doesn’t have any low-level optimizations for the x86 market, because it targets a broad number of RISC processors. So, even if the theoretical FLOPS of my machine are higher, I don’t believe that Prman is using much more than 1.5 GFLOPS.
Also note that the expensive dicing stage can be performed by fast fixed function hardware.
 
And just to clarify some things about fps and motion blur: everything that moves more than 1 pixel between two successive frames can be considered aliasing in the domain of time. Without motion blur, fast-moving objects at 30 fps seem choppy because they move more than 1 pixel per frame. At 60fps you don’t see any choppiness for a given movement speed, but objects faster than a certain limit (the Nyquist limit) will still look choppy. So you have to go to 120fps or more, and so on (so if you want to play a fast-paced action game you need at least 100fps).
This way, you are just raising the Nyquist limit. There will always be a movement speed that produces aliasing (choppiness).
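To put numbers on this (a minimal sketch; the speeds and frame rates below are illustrative assumptions): an object moving at v pixels/second displayed at f frames/second travels v/f pixels between frames, and once that exceeds roughly 1 pixel it starts to strobe.

```cpp
// Back-of-the-envelope temporal Nyquist check: motion of more than
// ~1 pixel between successive frames reads as choppiness (strobing).
#include <cstdio>

int main() {
    const float speeds[] = { 30.f, 120.f, 600.f };  // object speed, pixels/second
    const float rates[]  = { 30.f, 60.f, 120.f };   // display rate, frames/second
    for (float v : speeds)
        for (float f : rates)
            std::printf("v=%4.0f px/s @ %3.0f fps -> %5.2f px/frame%s\n",
                        v, f, v / f, (v / f > 1.f) ? "  (strobes)" : "");
    return 0;
}
```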

On the other hand, you can just stay at 30-40 fps and perform motion blur. The fast-moving objects will just start to look blurry. You can even stay at 20 fps or lower, but then even slow-moving objects will look blurry. Consider that film content on NTSC DVDs runs at only 23.976 fps and none of it goes beyond 30 fps, so it’s not as bad as many people think.

To summarize, quality motion blur (aka antialiasing in the domain of time) is much more important, and cheaper, than high frame rates for achieving realism. But the sad thing is that Pixar has patented the stochastic sampling method, so it’s unlikely we’ll see it in hardware before the patents start to expire.

As for real-time photon mapping, don’t hold your breath. Photon mapping uses ray tracing, so we must first convince the hardware people that ray tracing is the right way to go. Personally, I like ray tracing and all the related Monte Carlo global illumination algorithms much more than REYES and scanline algorithms. But since I have implemented the RenderMan interface on top of a Monte Carlo ray tracer, you can understand that my opinion is slightly biased.
 
Pavlos said:
You can even stay at 20 fps or lower, but then even slow moving objects will look blurry.

Isn't there a limit on how much you can blur before it falls apart again?

Going lower than 20 fps is really pushing it. Would 6 fps (with massive blur) work for a fast-paced action game (lots of fast camera movement and characters zipping around)?
 
Active Cooling, Active cooling ;)

Also, what about the microwave oven that a 4-CPU Itanium 1 server would make? That is 160 Watts per CPU ;)

SOI can do either of two things: lower power consumption and dissipation, or increase clock rate... well, they might be using SOI to reduce the heat issue ;)

And yes gubbi, you are right: we know the FPU has parallel ADD/SUB and MUL/DIV pipes, so yes, the max is 3 GFLOPS... however, you have to consider that the Athlon has fewer registers than the Broadband Engine (eight 64-80 bit registers [we are not counting SSE] vs. thirty-two 128-bit registers per APU, and we have 32 APUs in the Broadband Engine), and its bandwidth to memory (both local [cache] and main RAM) is much less than what the Broadband Engine has...

Cell was built with LOTS of bandwidth available thanks to its fast busses, use of e-DRAM and of SRAM based Local Storage and the architecture is register happy :)

And it's not like it lacks in Integer or FP processing power either ;)

Two APUs, clock for clock, would still be faster and more efficient than the Athlon's FPU in THIS kind of code due to the higher bandwidth, the local memory (128 KB of Local Storage), and the thirty-two 128-bit registers that would all support the 2 FP MADDs/cycle (4 FP calculations/cycle)... you can even factor in SSE and the APUs would still have the edge, and with the increase in clock speed the APU would still be more efficient, as the Athlon's FPU is bandwidth-starved...
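To put those peak numbers side by side, here is a back-of-the-envelope sketch (the APU clock is a placeholder assumption, and this ignores the bandwidth effects discussed above, which is exactly where the Athlon falls behind):

```cpp
// Peak-FLOPS arithmetic only; real throughput depends on feeding the
// pipes, i.e. the register count and bandwidth discussed above.
#include <cstdio>

int main() {
    // Athlon: parallel ADD/SUB and MUL/DIV pipes -> up to 2 FP ops/cycle,
    // hence 3 GFLOPS peak at 1.5 GHz, as noted above.
    const double athlonHz = 1.5e9;
    const double athlonOpsPerCycle = 2.0;

    // One APU: 2 FP MADDs/cycle = 4 FP ops/cycle. Running the APUs at the
    // same 1.5 GHz is an assumption made purely for a clock-for-clock view.
    const double apuHz = 1.5e9;
    const double apuOpsPerCycle = 4.0;

    std::printf("Athlon peak:     %.1f GFLOPS\n",
                athlonHz * athlonOpsPerCycle / 1e9);
    std::printf("Two APUs, peak:  %.1f GFLOPS\n",
                2.0 * apuHz * apuOpsPerCycle / 1e9);
    return 0;
}
```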
 
Pavlos,

Sorry another question... I assume that motion blur takes camera motion into account. Does it also take moving objects into account? If so, how is this done (lerp between 2 keyframes in the scene, or...?)

Regards,
Serge
 
psurge said:
Pavlos,

Sorry another question... I assume that motion blur takes camera motion into account. Does it also take moving objects into account? If so, how is this done (lerp between 2 keyframes in the scene, or...?)

Regards,
Serge

Serge, have a search for the Stochastic Sampling paper (I can't remember the authors but they're from Pixar). It's all explained in that.
 
Serge,

Yes, motion blur takes into account both the moving camera (moving coordinate systems) and moving objects (transformed by moving coordinate systems). It samples at random positions in time and then translates (lerps) the micropolygons to the correct position so it can sample them. Although the RenderMan specification allows an arbitrary number of intermediate object positions per frame, Prman implements only two: the starting and the ending position. It also makes other assumptions, but in practice this is sufficient, as you may have noticed in countless movies. Motion blur incurs only about a 50% penalty.
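If it helps, here is a minimal sketch of that sampling loop, assuming Prman-style storage of just two positions per micropolygon as described above (the names and structure are illustrative, not Prman's actual internals):

```cpp
// Stochastic motion-blur sampling: each subpixel sample gets a random
// time in the shutter interval, and the micropolygon is lerped to that
// time before the (omitted) visibility test.
#include <cstdio>
#include <random>

struct Vec3 { float x, y, z; };

// Linear interpolation between shutter-open and shutter-close positions.
Vec3 lerp(const Vec3& a, const Vec3& b, float t) {
    return { a.x + t * (b.x - a.x),
             a.y + t * (b.y - a.y),
             a.z + t * (b.z - a.z) };
}

struct Micropolygon {
    Vec3 open[4];   // vertex positions at shutter open  (t = 0)
    Vec3 close[4];  // vertex positions at shutter close (t = 1)
};

int main() {
    // One micropolygon translating 8 pixels to the right during the shutter.
    Micropolygon mp = {
        { {0,0,1}, {1,0,1}, {1,1,1}, {0,1,1} },
        { {8,0,1}, {9,0,1}, {9,1,1}, {8,1,1} },
    };

    std::mt19937 rng(42);
    std::uniform_real_distribution<float> shutter(0.f, 1.f);

    for (int s = 0; s < 4; ++s) {
        float t = shutter(rng);
        Vec3 v0 = lerp(mp.open[0], mp.close[0], t);
        std::printf("sample %d: t=%.3f, first vertex at x=%.2f\n", s, t, v0.x);
    }
    return 0;
}
```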

If you really want to know how it works, you can download the SIGGRAPH 2000 course notes on RenderMan ( http://renderman.org/RMR/Books/infbeyond.pdf.gz ) and read the first chapter, “How PhotoRealistic RenderMan Works”. It has some very nice diagrams of the REYES pipeline and a paragraph that explains how motion blur and depth of field are achieved. The stochastic sampling papers are somewhat hard to find.

Also, I’m not saying that PS3 will implement the REYES architecture, but it is the only well-known architecture in existence that works with micropolygons (actually, almost every renderer that supports true displacement mapping works with micropolygons).
 
Vince said:
Yes, I thought that current implementations of VS under DX9 lack the ability to create or destroy vertices in the shader itself.

It isn't "lacking" the ability; it's by design. Allowing a vertex shader to create and destroy vertices is a recipe for disaster on several fronts. (Assuming we are both talking about the actual vertex shader programs, and not the hardware, which is a completely separate issue.)
 
We don't know if the Sony PS3 will use a micropolygon renderer, but the original assertion was that lots of vertex power without per-pixel shading would look terrible. Quite clearly that's rubbish: the best offline renderer in the world uses that very approach.

As we have no real info (except for 1 patent application that may or may not actually be used inside the PS3), we are just speculating. If we extrapolate from the PS2 with what we know/guess about Cell, we come to the conclusion that this thing will have lots of processors. Even the aging PS2 has 4 completely separate processors.

Now, given that assumption, we can choose 2 routes with regard to graphics:
A) Processors will be mainly dedicated to pixels
B) Processors will be mainly dedicated to vertex/polygon operations with simple per-pixel operations.

Current PC/GCN/Xbox have all gone down route A, but the PS2 is unique in going down route B. Now, if the PS3 follows its heritage, then it will also use B.

Now let's design a system around B. Full Reyes would probably be too much, but we can borrow a few ideas: a simple rasteriser, not as simple as the PS2's, say a very fast single-texture unit (with full filtering capabilities) with alpha blending and depth testing, with a multi-context command stream driven by n vertex/primitive shaders (on full processors). Support for displacement mapping and subdivision surfaces is easy. For Renderman-style surface shaders you use either old-fashioned multi-pass or, Reyes-style, break everything up into micropolygons. If you add a fast feedback path to the rasteriser (which could fit quite easily into the GS-style XGKICK with embedded VRAM), you can get a simple F-Buffer style thing going quite cheaply.
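To make the "break everything up into micropolygons" step concrete, here is a minimal dicing sketch (a bilinear patch and a fixed dice rate are illustrative simplifications; a real pipeline dices higher-order surfaces at a rate derived from the patch's screen size):

```cpp
// Reyes-style dicing: evaluate the surface on a regular (u, v) grid so
// that each grid cell becomes one roughly pixel-sized micropolygon.
#include <cstdio>
#include <vector>

struct Vec3 { float x, y, z; };

// Bilinear patch evaluation; c[0..3] are corners at (0,0),(1,0),(0,1),(1,1).
Vec3 bilerp(const Vec3 c[4], float u, float v) {
    auto mix = [](const Vec3& a, const Vec3& b, float t) {
        return Vec3{ a.x + t * (b.x - a.x),
                     a.y + t * (b.y - a.y),
                     a.z + t * (b.z - a.z) };
    };
    return mix(mix(c[0], c[1], u), mix(c[2], c[3], u), v);
}

int main() {
    const Vec3 corners[4] = { {0,0,5}, {16,0,5}, {0,16,5}, {16,16,5} };
    const int rate = 16;  // grid cells per side; ~1 quad per pixel here

    // Dice: a displacement shader would perturb these grid vertices
    // before the quads are shaded and sampled.
    std::vector<Vec3> grid;
    for (int j = 0; j <= rate; ++j)
        for (int i = 0; i <= rate; ++i)
            grid.push_back(bilerp(corners, i / (float)rate, j / (float)rate));

    std::printf("diced into %d micropolygons\n", rate * rate);
    return 0;
}
```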

The advantage of route B is that you're not 'wasting' processors on pixel shading. A pixel shader is quite specialised, whereas a vertex shader is more 'normal': no weird texture units, etc. Sony seems to like using the same design in several places with only minor mods, so you never know.
 