Scene management

As it is now, each frame is drawn completely anew, over and over again. And as we see with the improved memory controllers and the efforts to batch draw calls and instance objects, I think the next step to improve frame quality and speed would be to upload the whole scene once and send only the modifications afterwards.

As it is, GPUs display single pictures in sequence as fast as they can; they have no notion of the whole scene. Draw calls and recalculating the same things for every frame are expensive. With shaders handling the actual color calculations, those might be the most expensive part in terms of speed, and thus the limit on the highest frame quality achievable.

With all the talk about doing real physics, you need to have the whole scene uploaded to and processed by some dedicated hardware anyway, which then needs to output new geometry, textures and lighting calculations to the renderer. Why not have the GPU do all that? And when it does, it can optimize everything so much better.
 
DiGuru said:
As it is now, each frame is drawn completely anew, over and over again. And as we see with the improved memory controllers and the efforts to batch draw calls and instance objects, I think the next step to improve frame quality and speed would be to upload the whole scene once and send only the modifications afterwards.
You're a little late for "the next step". Since there's hardware geometry processing, you usually upload all the static scene data only once and then submit the changes as transformation matrices, bone positions, weights, or other shader constants.
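As a rough sketch of that pattern (hypothetical C++ with made-up helper names, standing in for whatever the real D3D/OpenGL calls would be, not any particular API):

Code:
#include <cstdio>
#include <vector>

struct Vertex    { float pos[3], normal[3], uv[2]; };
struct Matrix4x4 { float m[16]; };                     // 64 bytes: one transform or bone matrix

struct GpuBuffer { std::vector<Vertex> data; };        // pretend this lives in video memory

// Made-up helpers standing in for the real API calls.
GpuBuffer uploadStaticMesh(const std::vector<Vertex>& v) { return GpuBuffer{v}; } // once, at load time
void setShaderConstant(const Matrix4x4&)                 { /* a few bytes per frame */ }
void drawMesh(const GpuBuffer& b)                        { std::printf("draw %zu verts\n", b.data.size()); }

int main()
{
    std::vector<Vertex> levelGeometry(10000);          // the heavy, static data
    GpuBuffer mesh = uploadStaticMesh(levelGeometry);  // crosses the bus exactly once

    for (int frame = 0; frame < 3; ++frame)            // stand-in for the real game loop
    {
        Matrix4x4 worldViewProj = {};                  // recomputed on the CPU every frame
        setShaderConstant(worldViewProj);              // only the small, changing state is sent
        drawMesh(mesh);                                // the geometry itself is never re-uploaded
    }
    return 0;
}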

As it is, GPUs display single pictures in sequence as fast as they can; they have no notion of the whole scene. Draw calls and recalculating the same things for every frame are expensive. With shaders handling the actual color calculations, those might be the most expensive part in terms of speed, and thus the limit on the highest frame quality achievable.
I'm not sure what you're trying to say, but generally it's the "common worst case" you need to optimize, and in the case of 3D graphics this worst case is that each and every pixel (except the HUD) changes from one frame to the next because the virtual eye moves. There's no point in optimizing for still frames.
 
DiGuru said:
I think the next step to improve frame quality and speed would be to upload the whole scene once and send only the modifications afterwards.
The OpenRT API does just that. The only problem is that I think there is no way to modify uploaded objects (yet?); you have to remove them and then add them back at their new position. Also, the camera is a bit more natural, and you don't have to move the entire world to see it from another angle :)

I don't think functionality like this will be available in D3D/OpenGL within the next five years, and since ray tracing will probably take over anyway in about ten years, I think there is no big problem :)
 
Xmas said:
You're a little late for "the next step". Since there's hardware geometry processing, you usually upload all the static scene data only once and then submit the changes as transformation matrices, bone positions, weights, or other shader constants.

Can you explain that a bit more? That sounds like the same thing Ho_ho is saying as well.

I know about vertex processing, but I didn't know you could have it calculate the new vertices from things like bone positions, unless you uploaded a calculation for the vertices to use. Which still requires a lot of overhead.

And how far does your scene extend? Does it only incorporate the objects in the current frame, or also the objects all around? You would need the latter to do things like physics calculations, which require the ability to create new geometry, textures and lighting models. Which I didn't think the current hardware could do.

But even so, if you can deform the current triangles, that's not all that makes up the scene. And when you want to deform geometry and/or create new stuff, what are you going to do with your normal maps and shaders?

Either I don't understand it well enough (likely), or it is more complex than you make it out to be.
 
Uploading a whole scene graph won't save you from having to re-shade every vertex and every pixel when the camera position changes. It might save you from having to traverse the scene graph on the CPU side, but that's not usually a performance bottleneck in the first place. There might be optimization opportunities to gain from re-ordering the draw order on data in a scene to better fit the hidden surface removal capabilities of the underlying hardware, but I wouldn't expect dramatic improvements over what games programmers are already doing.

As for physics calculations, you would usually NOT want to use the same data structures for both rendering and physics modelling. As an example, rendering two 10000-polygon meshes might run reasonably fast on modern graphics HW; collision detection between two such objects is AFAIK very slow with today's algorithms (most of the time, you can live with the collision detection using much coarser models than the rendering). Another example would be internal structure of objects; you don't need to know much about an object's internal structure to blast its polygons onto the screen; you DO need to know quite a lot to do believable physics modelling on it.
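To make that render/physics split concrete, here is a minimal sketch (hypothetical C++ types, not any engine's actual API) of keeping two representations of the same object: a detailed mesh for rendering and a much coarser proxy for collision.

Code:
#include <vector>

struct Vertex { float x, y, z; };

struct RenderMesh {                       // what the GPU sees: ~10000 polygons
    std::vector<Vertex>   vertices;
    std::vector<unsigned> indices;
};

struct CollisionSphere {                  // what the physics code sees: four floats
    float cx, cy, cz, radius;
};

struct GameObject {
    RenderMesh      mesh;                 // blasted onto the screen every frame
    CollisionSphere proxy;                // used for the (much cheaper) collision tests
};

// Sphere-vs-sphere overlap: trivial next to intersecting two 10000-polygon meshes.
bool collides(const GameObject& a, const GameObject& b)
{
    float dx = a.proxy.cx - b.proxy.cx;
    float dy = a.proxy.cy - b.proxy.cy;
    float dz = a.proxy.cz - b.proxy.cz;
    float r  = a.proxy.radius + b.proxy.radius;
    return dx * dx + dy * dy + dz * dz <= r * r;
}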
 
I think the point of hardware accelerated scene graphs might be the theoretical possibility of hardware accelerated visibility determination, as well as letting the driver determine the best way to traverse the scene.
 
DiGuru said:
Can you explain that a bit more? That sounds like the same thing Ho_ho is saying as well.

I know about vertex processing, but I didn't know you could have it calculate the new vertices from things like bone positions, unless you uploaded a calculation for the vertices to use. Which still requires a lot of overhead.
When doing skeletal animation, you upload a detailed mesh with bone weights per vertex only once, and for every frame only update the bone positions, which happen to be only a dozen or so. A vertex shader can take care of deforming the mesh according to the bone positions then.
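The per-vertex calculation the shader performs is standard linear blend skinning; here is a plain C++ sketch of the same math (hypothetical types, just to illustrate what happens per vertex, not any particular shader):

Code:
// CPU-side illustration of what a skinning vertex shader computes per vertex.
// The mesh data (positions, weights, indices) is uploaded once; only the
// bone matrices below change from frame to frame.
#include <array>
#include <vector>

struct Vec3 { float x, y, z; };

struct Matrix4x4 {
    float m[16];                                   // column-major 4x4
    Vec3 transformPoint(const Vec3& p) const {
        return { m[0]*p.x + m[4]*p.y + m[8]*p.z  + m[12],
                 m[1]*p.x + m[5]*p.y + m[9]*p.z  + m[13],
                 m[2]*p.x + m[6]*p.y + m[10]*p.z + m[14] };
    }
};

struct SkinnedVertex {                             // static, uploaded once
    Vec3 position;
    std::array<int, 4>   boneIndex;                // which bones influence this vertex
    std::array<float, 4> boneWeight;               // how strongly (weights sum to 1)
};

// "bones" is the only per-frame input -- a dozen or so matrices.
Vec3 skin(const SkinnedVertex& v, const std::vector<Matrix4x4>& bones)
{
    Vec3 out{0, 0, 0};
    for (int i = 0; i < 4; ++i) {
        Vec3 p = bones[v.boneIndex[i]].transformPoint(v.position);
        out.x += v.boneWeight[i] * p.x;
        out.y += v.boneWeight[i] * p.y;
        out.z += v.boneWeight[i] * p.z;
    }
    return out;
}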

However, there's more scene data than just vertex buffers and textures, and what kind of data this is heavily depends on the engine used. So I'm really not sure what you're referring to when you say "upload the whole scene".
 
So, we come back to the shadow (map) problem and bounding boxes/collision detection. For the first one, I can believe there are acceptable solutions that can be done by hardware directly that are not feasible to do on a CPU. That is, after all, what made GPUs such a hit in the first place.

And for the physics calculations: do we really want bounding boxes for that? For things like fluids? They would have to be calculated on the fly, which might take as much processing power as just handling the polygons directly in the first place, especially when you are generating new geometry and/or normal maps. And I think it's easier for hardware to generate lots of polygons than normal maps.

And I agree that we would need objects, and ones that specify more than just their outer surface to do that.
 
DemoCoder said:
I think the point of hardware accelerated scene graphs might be the theoretical possibility of hardware accelerated visibility determination, as well as letting the driver determine the best way to traverse the scene.

A better way to split up and place all the components than we have right now. Yes, that was my first interest when I wanted to do a graphics engine (which is still on the drawing board). Spatial, instead of first-come-first-served.
 
Xmas said:
When doing skeletal animation, you upload a detailed mesh with bone weights per vertex only once, and for every frame only update the bone positions, which happen to be only a dozen or so. A vertex shader can take care of deforming the mesh according to the bone positions then.
How does the vertex shader know what calculation to apply to each vertex? Is it the driver that knows, or the GPU?

However, there's more scene data than just vertex buffers and textures, and what kind of data this is heavily depends on the engine used. So I'm really not sure what you're referring to when you say "upload the whole scene".
Well, some description of the objects (vertices, connections and textures), and the lighting that has to be applied to them. But I'm taking a more database-like point of view than you, I guess, as you're probably thinking about how the GPU would handle it, while I'm thinking about the minimal data I would need to store the objects as whole entities, with all the attributes needed to show them when placed in a scene.
 
I don't think it is feasible to completely separate the CPU from scene management. With some smart glue code sitting between the CPU, physics card (PPU) and graphics hardware, some of the work could be either offloaded to other hardware or handled automagically without touching the CPU.


My vision of a system with CPU, PPU and GPU operating together is this:

The CPU sends the physics world, animated models, their bones and animations to the PPU. Perhaps IK rules can be sent too. Probably not all physics information is needed all the time, so some of the entities can be offloaded or their detail level decreased as needed. After all, there is little point in running a 10k particle fountain if it is only a couple of pixels on screen or completely invisible. This info probably comes from the CPU that manages the higher-level scene management (positions of the more important models, level layout, physics modifiers etc.). It could also come as feedback from the GPU, which checks from time to time how big some model is on screen or whether it really is visible.
Graphics data on the GPU is updated mostly the same way as it is now; the only difference from today's world is that some info can come from the PPU.

The PPU calculates a step and sends updated model (and bone) positions via PCIe or some other high-speed bus to the GPU, or to some buffer that the GPU will use as input for its next frame. The CPU can query object positions and states from the PPU, and object visibility from the GPU, if needed.

Every object* has a global ID (shared by CPU, PPU and GPU), and by that ID its vertices, bones, textures, materials and everything else can be found as needed. Data is sent between the different processors in little packages with headers telling what they are and which objects they affect. That way, sending an updated bone structure from the PPU to the GPU should be trivial.
*) Anything can be an object, e.g. model vertices, bones, animations, physics constants, materials, and all sorts of other info and object hierarchies.
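A rough sketch of what such a package and its ID could look like (hypothetical C++ layout, just to illustrate the idea):

Code:
// Hypothetical update packet as it might travel between CPU, PPU and GPU.
// The global object ID is the only thing all three processors need to agree on.
#include <cstdint>
#include <vector>

enum class PayloadType : uint16_t {
    BonePositions,      // updated skeleton from the PPU
    Transform,          // new object-to-world matrix
    PhysicsState,       // velocities, constraints, ...
    VisibilityQuery     // "how big is object N on screen?"
};

struct PacketHeader {
    uint32_t    objectId;       // global ID shared by CPU, PPU and GPU
    PayloadType type;           // tells the receiver what the payload is
    uint16_t    payloadBytes;   // size of the data that follows
};

struct Packet {
    PacketHeader         header;
    std::vector<uint8_t> payload;   // e.g. a dozen bone matrices
};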

The only problem is that for handling all those IDs and objects, either some special HW or heavy driver modifications are needed. For a start, modifying drivers is probably the easier, cheaper and more flexible solution.

That kind of system probably won't work very well with current APIs; they would have to be either rewritten or heavily modified. Some hacks could probably be done, but it would not be as effective as it could be.

Another problem with that kind of system might be that by the time it matures, traditional GPUs might have evolved into something totally different, or some other rendering method might have taken over.


Doing physics calculations on the GPU is not a reasonable thing in my opinion, at least not yet. GPUs are not designed for that, and they are probably not as effective as a special-purpose PPU would be. Sure, it can be done, but by doing physics on the GPU, some image quality and/or rendering speed is lost.

[edit]
typos
 
DiGuru said:
How does the vertex shader know what calculation to apply to each vertex? Is it the driver that knows, or the GPU?
Where have you been during the last five years? :-|
Hardware vertex processing became programmable starting with the GeForce 3/Radeon 8500 generation in the PC space.
Vertex shaders are programs, i.e. specified by the application/game, sent to the hardware by the driver, and executed there (inside the GPU) for each vertex that passes through. The program itself is what defines which calculation is applied to each vertex.
DiGuru said:
Well, some description of the objects (vertices, connections and textures), and the lighting that has to be applied to them. But I'm taking a more database-like point of view than you, I guess, as you're probably thinking about how the GPU would handle it, while I'm thinking about the minimal data I would need to store the objects as whole entities, with all the attributes needed to show them when placed in a scene.
Frankly, your "database" view is exactly what games have been doing all these years. Every rendering system worth talking about at least bears some resemblance to a scene graph/database. In the simplest case you have a completely static scene and can move the camera around, from which follows that you have at least one object mesh ("... whole entities") and a mathematical description of the camera, which in practice happens to be a 4x4 transformation matrix ("attributes needed to show them"). Textures, constant colors (green teapot vs orange teapot) etc. further extend the "attributes needed to show them".
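In code, that "database" view is often nothing more exotic than something like this (hypothetical C++ sketch, not any real engine's data structures):

Code:
// A minimal scene-graph node: the "object as a whole entity" plus the
// attributes needed to show it. All types here are made up for illustration.
#include <vector>

struct Matrix4x4    { float m[16]; };   // the 4x4 transform mentioned above
struct MeshHandle   { int id; };        // geometry already uploaded to the GPU
struct TextureHandle{ int id; };
struct ShaderHandle { int id; };

struct SceneNode {
    MeshHandle    mesh;                 // shared between instances (green/orange teapot)
    TextureHandle texture;
    ShaderHandle  shader;
    float         constantColor[4];
    Matrix4x4     localTransform;       // where this instance sits
    std::vector<SceneNode> children;    // hierarchy: attachments, sub-objects
};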

And real games have always done much more than that.

Meanwhile the hardware and accompanying APIs have evolved to support ever bigger and fancier databaseish features. E.g. fixed function vertex processing allows you to reuse a single mesh description for multiple instances of the same mesh in the same scene, by just loading a new transform matrix (64 bytes). This was software first, but then moved to hardware.
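For example, a sketch of that reuse (hypothetical C++ again, with made-up helper names standing in for the real API calls):

Code:
// Reusing one uploaded mesh for many instances: only 64 bytes change per instance.
#include <vector>

struct Matrix4x4  { float m[16]; };                // 16 floats * 4 bytes = 64 bytes
struct MeshHandle { int id; };                     // mesh already resident in video memory

// Stand-ins for the real API calls (made-up names).
void setWorldTransform(const Matrix4x4&) {}
void drawMesh(MeshHandle)                {}

void drawForest(MeshHandle treeMesh, const std::vector<Matrix4x4>& treePositions)
{
    for (const Matrix4x4& world : treePositions) {
        setWorldTransform(world);                  // per instance: one matrix
        drawMesh(treeMesh);                        // same mesh description every time
    }
}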

HW fixed function transformation could only get the maximum bang out of static meshes though, and likewise mesh storage in graphics memory only made sense for static meshes. Deformations had to be done on the host CPU.
With programmable vertex processing you can do deformations on the hardware side very flexibly, and you can also put all the inputs in graphics memory (because they stay static, even though the result is dynamic).

All types of resources (geometry, vertex programs, fragment programs, constants for either, render targets, what-have-you) can be mixed and matched today after being specified/uploaded once.

I hope this wasn't all too obvious but after reading your post I thought it was worth mentioning :-|
 
DiGuru's point is reasonable.

Currently the game runtime maintains the complete scene description in some data structure (often referred to as a scene graph) and then submits some subset of that scene to the hardware, generally as a set of hardware instructions with no structure.

What he's suggesting, I assume, is that you move the entire scene description and management onto the hardware. The hardware does the top-level culling and draws what's required. The application would simply submit the changes to the scene.

Unfortunately this leads to the application holding a parallel data structure so it can keep track of the changes itself; it already needs to know where everything in the world is and how it all relates, so you don't save the data structure. And in large dynamic scenes it's entirely possible that you would have to move more data to the card than you would if you just created a new display list for it.

Various bits and pieces of hardware have been able to do this at some level over the years; even the (very misnamed) Fast3D library on the N64 allowed you to submit a sphere primitive that would skip over subsequent draw calls, to allow object culling on the GPU. Having said that, it was still more efficient in general to do the test on the CPU and not submit the primitives.
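The CPU-side test in question is cheap; a generic sketch (plain C++, nothing N64- or Fast3D-specific) of a bounding-sphere-vs-frustum check looks like this:

Code:
// Generic CPU-side culling test: is a bounding sphere outside any frustum plane?
// If so, skip submitting that object's draw calls entirely.
#include <cstddef>

struct Vec3  { float x, y, z; };
struct Plane { Vec3 normal; float d; };   // dot(normal, p) + d >= 0 means "in front"; normals point inward

bool sphereVisible(const Plane frustum[6], const Vec3& center, float radius)
{
    for (std::size_t i = 0; i < 6; ++i) {
        float dist = frustum[i].normal.x * center.x +
                     frustum[i].normal.y * center.y +
                     frustum[i].normal.z * center.z + frustum[i].d;
        if (dist < -radius)
            return false;                 // completely behind one plane: cull it
    }
    return true;                          // potentially visible: submit the draw calls
}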

The only reason I can see for going this way is ray tracing, where, because of reflections, it's unclear whether an object off-screen has a significant impact on the final pixels.
 