How many batches

mikiex

Newcomer
OK, this is one of those questions that is vague because there are so many factors that could affect how many batches you can push to a GPU. I am wondering how many batches a typical scene in a game costs, in CPU ms, on both PS3 and Xbox. A kind of survey of what you would aim for.
 
Depends what your CPU budget is and if you're just aiming at console or need to do a PC version.
Most PC devs aim at <2000, or did the last time I worked on a PC title, and that might cause issues with min-spec type boxes.
You can push a lot more on a console because of the significantly smaller overhead, but the fact that you can isn't always a good reason to do it.
With fewer, larger batches you're trading off sending off-screen data to the GPU against the additional CPU cost of submitting more, smaller batches.
What the right balance is will vary with things like how CPU-heavy your game is and how much work your vertex shaders do, among other things.
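To put rough numbers on that trade-off, here is a purely illustrative sketch; every constant below is a made-up placeholder, not a measurement from any title:

Code:
// Entirely illustrative numbers for the trade-off described above.
struct BatchingOption {
    int   drawCalls;          // how many submissions the CPU makes per frame
    float cpuMsPerDraw;       // CPU cost of submitting one batch
    float offscreenFraction;  // share of submitted vertices the GPU ends up culling/clipping
};

const BatchingOption fewLargeBatches  = {  500, 0.003f, 0.40f }; // coarse culling, cheap on the CPU
const BatchingOption manySmallBatches = { 5000, 0.003f, 0.10f }; // finer culling, 10x the submit cost

float submitCpuMs(const BatchingOption& o)             { return o.drawCalls * o.cpuMsPerDraw; }
float wastedVertices(const BatchingOption& o, float v) { return v * o.offscreenFraction; }

With these placeholder numbers the 500-batch option costs ~1.5 ms of submit time but sends a lot of off-screen geometry, while the 5000-batch option costs ~15 ms but culls more tightly; where the right point sits depends on the factors above.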
 
Thanks for your reply, Rob. I agree with what you are saying. I am not an engine programmer, but I know a fair bit about console tech; I just don't have experience with enough different engines on PS3 and Xbox to know what is considered a reasonable number of batches. If sub-2000 batches is considered OK for PC, and PC has more overhead, does this mean that consoles should achieve a similar figure, even with CPUs that are weaker than a modern PC's?

I'm interested to know whether games are around 500, 1000, 1500 or 2000 batches, and within how many ms of CPU time. Very ballpark estimates of what people would consider good.
 
mikiex said:
If sub-2000 batches is considered OK for PC, and PC has more overhead, does this mean that consoles should achieve a similar figure, even with CPUs that are weaker than a modern PC's?
Like ERP said - it can go a lot higher on a console (~2000 batches per frame was something you could see in PS2 games already).

Haven't done console profiling in ages though, so don't have precise numbers.
 
Sorry, I didn't read it quite as clearly as that. So in theory we are saying 2000 @ 30 fps, and it should be possible, with rendering only, to do 1000 @ 60 fps?
 
The question, as stated, does not make sense, and you will not get answers that make sense.

The cost of a drawcall by itself is meaningless; it's how much state you set between drawcalls that makes the drawcalls themselves expensive. On 360, the drawcall itself can become almost free (comparable to the cost of a cache miss). You can certainly sustain 10k drawcalls ("batches") per frame in real-world situations.
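As a generic illustration of that point (not tied to any particular engine or API; the DrawItem fields and the bind/draw steps are placeholders), the usual approach is to sort draws by state and only rebind when something actually changes:

Code:
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical per-draw record; the fields stand in for whatever state the
// renderer really tracks (shader, material/textures, vertex format, ...).
struct DrawItem {
    uint32_t shaderId;
    uint32_t materialId;
    uint32_t meshId;
};

// Sort so draws sharing state end up adjacent, then skip redundant state sets.
// The draw itself stays cheap; the savings come from touching state less often.
void submitSorted(std::vector<DrawItem>& items) {
    std::sort(items.begin(), items.end(), [](const DrawItem& a, const DrawItem& b) {
        if (a.shaderId   != b.shaderId)   return a.shaderId   < b.shaderId;
        if (a.materialId != b.materialId) return a.materialId < b.materialId;
        return a.meshId < b.meshId;
    });

    uint32_t boundShader = ~0u, boundMaterial = ~0u;
    for (const DrawItem& d : items) {
        if (d.shaderId   != boundShader)   { /* bind shader */             boundShader   = d.shaderId; }
        if (d.materialId != boundMaterial) { /* bind textures/constants */ boundMaterial = d.materialId; }
        /* issue the draw for d.meshId */
    }
}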
 
The question, as stated, does not make sense, and you will not get answers that make sense.

The cost of a drawcall by itself is meaningless; it's how much state you set between drawcalls that makes the drawcalls themselves expensive. On 360, the drawcall itself can become almost free (comparable to the cost of a cache miss). You can certainly sustain 10k drawcalls ("batches") per frame in real-world situations.

I wouldn't say it doesn't make sense, because I'm not looking for an absolute answer; all these answers are very useful to me. If 2k to 10k is feasible @ 30 fps in the real world, then that is a good answer.
 
I wouldn't say it doesn't make sense, because I'm not looking for an absolute answer; all these answers are very useful to me. If 2k to 10k is feasible @ 30 fps in the real world, then that is a good answer.
I remember someone stating (here in the forums) that there's around 2000 draw calls per frame in the PS3 version of Crysis 2. Might be a bit old info, but it seems that 2k-10k draw calls per frame is a pretty good ballpark estimate for current generation 30 fps console games.

Draw calls themselves are not that important. You can render thousands of objects in a single call by using instancing. And you can record draw calls to command buffers and reuse them later (no need to perform new draw calls). How your engine manages GPU state changes (draw order, how many different shader/state/texture combinations are used, etc.) also affects performance a lot. For example, if the engine uses virtual texturing, you can draw the whole scene using the same set of samplers and textures. And if the engine uses deferred rendering, it doesn't have to set the lights (update an array of constants) per object. Things like these make draw calls cost less.
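A minimal sketch of the instancing idea (API-agnostic; uploadInstanceData and drawInstanced are stand-ins for the real calls, e.g. DrawIndexedInstanced on D3D):

Code:
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

struct Matrix4 { float m[16]; };  // placeholder per-instance transform

// Group visible objects by mesh, then issue one instanced draw per mesh
// instead of one draw call per object.
void drawSceneInstanced(const std::vector<std::pair<uint32_t, Matrix4>>& visible) {
    std::map<uint32_t, std::vector<Matrix4>> byMesh;  // meshId -> instance transforms
    for (const auto& obj : visible)
        byMesh[obj.first].push_back(obj.second);

    for (const auto& group : byMesh) {
        (void)group;  // real code would upload and draw here, e.g.:
        // uploadInstanceData(group.second);                 // per-instance transforms to the GPU
        // drawInstanced(group.first, group.second.size());  // one call, N instances
    }
}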
 
Most modern graphics cards support index and vertex buffers natively, so the draw call is just inserting a command into a FIFO with pointers to the indices and verts.
On the consoles you can also prepackage display lists and just submit those, that literally involves inserting a "call" into the FIFO.
I can honestly say on console titles I've never even set a budget, I usually just measure various data organizations, and pick one.

The better question is why is it so expensive on PC, I believe a lot of the issue is the stupid reorganizing of commands the PC drivers try to do to remove stupid usage patterns. There is minimally one extra copy in a PC driver, but there is no ring transition in the API call so I don't see why we should see the overhead that we do.
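A toy, platform-neutral sketch of the "record once, call later" idea (the command encoding here is invented purely for illustration, and pushToFifo is hypothetical; real console display lists are built in the hardware's native FIFO format):

Code:
#include <cstdint>
#include <vector>

// A toy command buffer: commands are recorded once into a flat array, and
// "submitting" it later is just handing the whole block to the GPU front-end
// (on console, effectively inserting a 'call' into the FIFO).
enum class Cmd : uint32_t { BindShader, BindVertexBuffer, DrawIndexed };

struct CommandBuffer {
    std::vector<uint32_t> words;

    void bindShader(uint32_t shaderId)    { words.push_back((uint32_t)Cmd::BindShader);       words.push_back(shaderId); }
    void bindVertexBuffer(uint32_t vbId)  { words.push_back((uint32_t)Cmd::BindVertexBuffer); words.push_back(vbId); }
    void drawIndexed(uint32_t indexCount) { words.push_back((uint32_t)Cmd::DrawIndexed);      words.push_back(indexCount); }
};

// Recorded once (e.g. for a static chunk of the level)...
CommandBuffer recordStaticGeometry() {
    CommandBuffer cb;
    cb.bindShader(/*shaderId=*/3);
    cb.bindVertexBuffer(/*vbId=*/17);
    cb.drawIndexed(/*indexCount=*/36);
    return cb;
}

// ...then reused every frame: the per-frame CPU cost is only the submit,
// not re-walking the scene and re-encoding each draw.
// void submit(const CommandBuffer& cb) { pushToFifo(cb.words.data(), cb.words.size()); }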
 
FWIW, there is a breakdown of a scene capture from our recent Dead Space 2 interview. There are some draw call numbers along with what they cover.
[Image: BudgetsOnly.png - frame budget and draw call breakdown from the Dead Space 2 interview]

Granted, there's not really much going on in the scene (full screenshot at the link).
 
AlStrong, that screenshot is also the kind of thing I am looking for, lots of figures. This may sound like a dumb question, but this is saying 9.9 ms of CPU for render setup; that should be easily handled by a single thread with room for other things?
 
Looks to me like it's GPU limited in the screenshot.
With lights/shadows taking the bulk of the time.

Yes, 9.9 ms should be fine on a single thread, but it's hard to know what exactly they are measuring, and the scene is hardly complicated.


APT is the UX solution FWIW
 
FWIW, there is a breakdown of a scene capture from our recent Dead Space 2 interview. There are some draw call numbers along with what they cover.
If they used a full core (33.3 ms on Xbox and PS3), that would be more than 9000 draw calls per frame (assuming the scene were fully CPU bound).
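Spelling out that back-of-the-envelope scaling (assuming the render-setup cost scales roughly linearly with draw count; the capture's exact draw count isn't reproduced here):

Code:
// 30 fps gives one core ~33.3 ms per frame; the capture spends 9.9 ms on render setup.
const float frameMs       = 1000.0f / 30.0f;          // ~33.3 ms
const float renderSetupMs = 9.9f;                     // measured CPU render setup
const float headroom      = frameMs / renderSetupMs;  // ~3.4x

// If the captured scene submits N draws in those 9.9 ms, a full core could
// submit roughly 3.4 * N of them, which is how "more than 9000 per frame"
// can follow from a scene with a few thousand draws.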

On the consoles you can also prepackage display lists and just submit those, that literally involves inserting a "call" into the FIFO.

The better question is why is it so expensive on PC, I believe a lot of the issue is the stupid reorganizing of commands the PC drivers try to do to remove stupid usage patterns. There is minimally one extra copy in a PC driver, but there is no ring transition in the API call so I don't see why we should see the overhead that we do.
You can use command buffers (command lists) on PC as well. DirectX 11 allows you to record command buffers and reuse them later to save draw calls. Draw calls aren't that costly anymore either. All the GPU render/sampler state objects are now created beforehand and validated at creation time instead of at use time. And constants are stored in GPU memory (constant buffers) instead of being transferred over the bus to the GPU every frame. With DX11 you can push 50k draw calls per frame on a recent graphics card (using a single thread). Of course this is on an i7 (a modern OOO CPU), so I agree that DX11 is still inefficient compared to consoles, but the situation isn't as bad as it was some years ago (DX9 + XP driver model).
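For reference, the DX11 record-and-reuse path looks roughly like this; a minimal sketch assuming you already have an ID3D11Device and an immediate context, with the recorded state/draw calls left as comments since they depend on your resources:

Code:
#include <d3d11.h>

// Record a command list once on a deferred context...
ID3D11CommandList* recordCommandList(ID3D11Device* device)
{
    ID3D11DeviceContext* deferred = nullptr;
    if (FAILED(device->CreateDeferredContext(0, &deferred)))
        return nullptr;

    // Record state and draws exactly as on the immediate context, e.g.:
    // deferred->VSSetShader(vs, nullptr, 0);
    // deferred->PSSetShader(ps, nullptr, 0);
    // deferred->DrawIndexed(indexCount, 0, 0);

    ID3D11CommandList* cmdList = nullptr;
    deferred->FinishCommandList(FALSE, &cmdList);  // bake the recorded calls
    deferred->Release();
    return cmdList;
}

// ...then replay it every frame with a single call instead of re-issuing
// each draw on the CPU.
void replayCommandList(ID3D11DeviceContext* immediate, ID3D11CommandList* cmdList)
{
    immediate->ExecuteCommandList(cmdList, FALSE);  // FALSE: don't restore context state afterwards
}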
 