#1 |
|
Member
Join Date: Dec 2009
Posts: 600
|
Why are draw calls on the PC more expensive than on the Xbox 360, for example? Developers can issue more of them on the Xbox than on the PC, which sounds pretty strange considering that PCs have far more GPU and CPU horsepower.
|
|
|
|
|
|
#2 |
|
Member
Join Date: Nov 2007
Posts: 990
|
On consoles you can write commands directly to the GPU ring buffer, in a format that the GPU hardware understands. It's just a few lines of code to add a single draw call.
PC has both user space and kernel space drivers that process the draw calls. More than one program can be adding GPU commands simultaneously, and the driver must synchronize and store/return the GPU state accordingly (a single mutex lock is already over 1000 cycles). The GPU commands must be translated by the driver to a format understood by the GPU (many different manufacturers and GPU families). The commands and modified data must be sent over a standardized external bus to the GPU. On Xbox, for example, both GPU and CPU share the same memory and nothing needs to be sent over a relatively slow bus. On consoles you can also edit GPU resources without locking them if you are sure that the GPU is not using them currently. On PC everything must be properly synchronized and all commands and resource references must be validated (software cannot be allowed to crash the GPU or modify/access data of other programs). PC drivers also automatically manage GPU memory allocation (moving resources in and out based on usage). Depending on the allocator/cache algorithms used, this can also be relatively expensive.
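To make the contrast above concrete, here is a minimal Python sketch of the two submission paths. Everything in it is illustrative: `RingBuffer`, `encode_draw`, `console_draw`, `PcDriver`, the `OP_DRAW` opcode and the packet layout are made-up names, not any real console SDK or driver interface, and the step counter is only a stand-in for CPU work.

```python
import struct
import threading

# Hypothetical opcode; real GPUs use vendor-specific packet formats.
OP_DRAW = 0x01


class RingBuffer:
    """Fixed-size command ring that a (pretend) GPU would consume from."""

    def __init__(self, size=64):
        self.slots = [None] * size
        self.write_pos = 0

    def push(self, packet):
        self.slots[self.write_pos % len(self.slots)] = packet
        self.write_pos += 1


def encode_draw(vertex_addr, vertex_count):
    # Pack the draw into a made-up native layout: opcode, address, count.
    return struct.pack("<BII", OP_DRAW, vertex_addr, vertex_count)


def console_draw(ring, vertex_addr, vertex_count):
    """Console-style path: the game writes the native packet itself."""
    ring.push(encode_draw(vertex_addr, vertex_count))
    return 1  # one step: write the packet


class PcDriver:
    """PC-style path: lock, look up the handle, validate, translate, submit."""

    def __init__(self, ring):
        self.ring = ring
        self.lock = threading.Lock()
        self.handles = {}  # API handle -> (address, vertex capacity)

    def create_buffer(self, handle, addr, capacity):
        self.handles[handle] = (addr, capacity)

    def draw(self, handle, vertex_count):
        steps = 0
        with self.lock:  # other programs may be submitting too
            steps += 1
            if handle not in self.handles:
                raise ValueError("unknown buffer handle")
            addr, capacity = self.handles[handle]
            steps += 1  # handle-to-address lookup
            if vertex_count > capacity:
                raise ValueError("draw reads past the end of the buffer")
            steps += 1  # validation
            packet = encode_draw(addr, vertex_count)
            steps += 1  # translation to the hardware format
            self.ring.push(packet)
            steps += 1  # submission
        return steps


console_ring, pc_ring = RingBuffer(), RingBuffer()
console_steps = console_draw(console_ring, 0x1000, 36)

driver = PcDriver(pc_ring)
driver.create_buffer(handle=7, addr=0x1000, capacity=36)
pc_steps = driver.draw(handle=7, vertex_count=36)
```

Both paths end up with the same packet in the ring; the PC path simply does more bookkeeping to get it there. The sketch leaves out the bus transfer and the memory management that the post also mentions.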
|
|
|
|
|
#3 |
|
Senior Member
Join Date: Feb 2002
Posts: 2,036
|
Part 1 of the following link expands on how the application and drivers interact under DirectX.
http://fgiesen.wordpress.com/2011/07...ne-2011-index/ |
|
|
|
|
|
#4 |
|
Entirely Suboptimal
Join Date: Mar 2003
Location: WI, USA
Posts: 6,867
|
sebbi, that sure sounds like a mess. Oh, the ugliness of a general-purpose, expandable system.
|
|
|
|
|
|
#5 |
|
Senior Member
|
At least until DX11, OpenGL had far smaller overhead per draw call than DX. Not sure how things are now.
Also, having to feed the GPU with a command stream in a specific format isn't all that fun any more once you have to deal with more than a couple of different GPUs, or even versions of the same GPU core. Consoles can allow that kind of "ugliness" as they use fixed hardware for years and have no problems with incompatibility.
|
|
|
|
|
#6 |
|
Moderator
Join Date: Feb 2002
Location: Taiwan
Posts: 2,358
|
If GPUs become more mature, it may be possible to have a somewhat fixed hardware "instruction set" for GPUs, and then it would be possible to have a much leaner driver stack on PC.
There are, of course, some problems. The obvious one is who gets to design this "instruction set." Most hardware standards developed from a single product by a single company, which became very popular and was then used as a de facto standard (and may later have become a real industry standard). It's really hard to make a new standard out of nothing. And design by committee doesn't work. Microsoft is another possible candidate, but they probably don't understand enough about the underlying hardware architecture to make a good design. Another way is to design an "intermediate code" which is translated by software into hardware commands. But then this is not very different from a command buffer, and is probably not going to bring much performance advantage. There are other problems too. For example, since the driver would not be able to do the safekeeping work, the hardware will have to. Basically you'll want the GPU to be like a CPU, with all the security modes and controls. Personally I think this is a good thing: with GPUs getting more flexible and with GPGPU, there will be more security-related problems, so it's probably better done in hardware anyway.
|
|
|
|
|
#7 |
|
Moderator
Join Date: Feb 2002
Location: Redmond, WA
Posts: 3,322
|
Modern GPU hardware is already a pretty close copy of the API.
Translating isn't really the issue. The predominant problem is having to deal with multiple processes sharing the GPU. This will not change, short of adding hardware to the GPU to enable fast context switches, which at some point may be justifiable. DX11 and the Win7 driver model remove a lot of the superfluous driver overhead that existed. Plus you finally get command buffers, and state isn't global in the same way. You still have the stupid stuff in the PC drivers, fixes/workarounds for poorly optimized or broken game code, which eats a lot of CPU since it involves analyzing everything going to the GPU. Of course devs are forced to work around these, which the driver writers then have to detect and fix..... And of course new PC GPUs are optimized to run last year's best tech.
|
|
|
|
|
#8 | |
|
Moderator
Join Date: Feb 2002
Location: Taiwan
Posts: 2,358
|
Quote:
Then we can standardize these commands so applications will be able to send commands to the GPU directly, without any extra overhead. You can handle older applications with drivers, and newer applications will be able to access the GPU more directly. Of course, on a normal desktop OS, you probably still can't let applications access the GPU directly, as it involves memory-mapped I/O and that needs to be in kernel mode. However, the overhead should be much less than what we have now.
|
|
|
|
|
|
#9 |
|
Member
Join Date: Mar 2007
Location: Wroclaw, Poland
Posts: 578
|
This is not going to happen. 1. HW from different vendors is too dissimilar for a common ISA. Not to mention that instructions are not everything GPUs process: there's some state involved, which is entirely HW-specific. 2. Part of what goes to the buffer is so tied to hardware it may be covered by patents (not that I would know anything about patents, just assuming this may very well be the case). 3. Even for the actual code there's a huge variation in what you want encoded in the command buffer and how, depending on the HW you're feeding.
__________________
Shifty Geezer: I don't think the guy really understands the subject. PARANOiA: To be honest, Shifty, what you've described is 95% of Beyond3D - armchair experts spouting fact based on the low-level knowledge of a few. This posting is provided "AS IS" with no warranties, and confers no rights. |
|
|
|
|
|
#10 | |
|
Moderator
Join Date: Feb 2002
Location: Taiwan
Posts: 2,358
|
Quote:
HW from different vendors is probably going to be a moot point, as the number of important GPU vendors in the x86 space is now only three, and they probably all have cross-licensing deals, so patents are not a serious problem. Advances in GPGPU also bring GPUs from different vendors closer together. It's probably not going to happen in the next few years, but at least it's not technically impossible, and if there's enough incentive they may want to do it. But that brings us to the main point: is there enough incentive for IHVs to do that? Right now I don't see that happening, as there is really no strong demand for very high performance desktop graphics.
|
|
|
|
|
|
#11 | |
|
Senior Member
Join Date: Feb 2002
Posts: 2,636
|
Quote:
|
|
|
|
|
|
|
#12 |
|
Senior Member
Join Date: Feb 2002
Posts: 2,036
|
I assumed pcchen wasn't referring to the shader ISA, rather a theoretical command buffer "ISA".
|
|
|
|
|
|
#13 |
|
Senior Member
Join Date: Feb 2002
Posts: 2,636
|
On a second read, I think that must be the case. My bad.
|
|
|
|
|
|
#14 | |
|
hardly a Senior Member
Join Date: Jul 2008
Location: still camping with a mauler
Posts: 3,676
|
So what can be done to reduce the cost of draw calls on PC? I know instancing is used to reduce the number of draw calls needed, but is there a way to actually reduce the amount of CPU time needed per call?
Keep in mind my understanding of these things could not even be called "beginner". More like "ignorant spectator".
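As a rough illustration of the instancing idea mentioned in this post, the Python sketch below groups objects that share a mesh so that each distinct mesh needs one draw call instead of one per object. The names `naive_draws` and `instanced_draws`, the mesh labels and the transform tuples are all hypothetical; a real renderer would upload the per-instance data to a GPU buffer and use the instanced draw entry point of its graphics API.

```python
from collections import defaultdict


def naive_draws(objects):
    """One draw call per object."""
    return [(mesh, [transform]) for mesh, transform in objects]


def instanced_draws(objects):
    """One draw call per distinct mesh, carrying all of its transforms."""
    by_mesh = defaultdict(list)
    for mesh, transform in objects:
        by_mesh[mesh].append(transform)
    return list(by_mesh.items())


# 1000 objects sharing 3 meshes (placeholder names and positions).
MESHES = ("tree", "rock", "crate")
objects = [(MESHES[i % 3], (float(i), 0.0, 0.0)) for i in range(1000)]

naive = naive_draws(objects)
instanced = instanced_draws(objects)
```

Note that this lowers the number of calls, not the CPU cost of each call, which is the second half of the question.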
|
|
|
|
|
|
|
#15 |
|
Now Officially a Top 10 Poster
Join Date: May 2006
Location: Maastricht, The Netherlands
Posts: 13,228
|
All I know is that Crytek had pointed out the main weaknesses they considered to still be in DirectX11, and that they were working with Microsoft to sort them out. I don't know what has become of that actually, would have expected to have heard something about that, but maybe I just missed it.
|
|
|
|
|
|
#16 |
|
Member
Join Date: Mar 2007
Location: Wroclaw, Poland
Posts: 578
|
Draw call cost depends on the HW. Some things have to be translated for a given card, and this imposes extra CPU cost per call. One could imagine that modern hardware may not support certain topologies (triangle fans would be something I guess most cards don't support directly; perhaps some support just plain TRIs or just lists). But it's not really the draw call that kills you, it's the (unnecessary) state changes between draw calls and the stuff that has to be translated. Pretty much all modern HW out there simulates the fixed pipeline in the driver, so that's extra CPU cost for you. Weird texture formats may require some processing. There's a lot happening beyond draw calls. And there are lots of things you can do to minimize CPU usage.
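One common way to cut the unnecessary state changes mentioned above is to sort the draw list by state before submitting it. The Python sketch below is only a toy model under that assumption: `count_state_changes` and the shader/mesh labels are made up, and a real engine would build a sort key from shader, textures, blend mode and so on.

```python
def count_state_changes(draws):
    """Count how often the state differs from the previous draw's state."""
    changes = 0
    current = None
    for state, _mesh in draws:
        if state != current:
            changes += 1
            current = state
    return changes


# (state, mesh) pairs in the order the scene happened to produce them.
draws = [
    ("shaderA", "m1"),
    ("shaderB", "m2"),
    ("shaderA", "m3"),
    ("shaderB", "m4"),
]

unsorted_changes = count_state_changes(draws)

# Sorting by state puts draws that share a state next to each other.
sorted_draws = sorted(draws, key=lambda draw: draw[0])
sorted_changes = count_state_changes(sorted_draws)
```

Sorting changes the submission order, so it is only safe where order does not affect the result, for example opaque geometry with depth testing.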
|
|
|
|
|
#17 |
|
a.k.a. Ingenu
Join Date: Feb 2002
Location: Apsley, U.K.
Posts: 2,752
|
Make GPUs standard just like CPUs, thank you.
I have not investigated the cost of a draw call in a while. A lot happens under the hood for sure, but we have been used to minimizing them since D3D9... A draw call basically gets all the states and checks their validity/consistency before filling the command stream. Can anyone working on drivers explain how that works in D3D10/11? (I know the runtime does a lot because of NV
__________________
So many things to do, and yet so little time to spend... |
|
|
|
|
|
#18 | |
|
Senior Member
Join Date: Feb 2002
Posts: 2,636
|
CPUs are not standard either ; ) Moreover, a proper level of abstraction beats standardized hw most of the time (e.g. HL programming languages vs assembly, etc)
Quote:
1. state tracking
2. shader compilation (normally depending on both client shaders and active state)
3. interfacing with the kernel mem allocators for buffer objects management and related fences/syncs.
The last one of those does not really belong in there, as it can be taken out of the driver and into a bog standard "GPU buffer API", or if you wish, a "DMA-coherent buffer API", perhaps even in flavors based on whether the device is MMU-equipped (so it can "comprehend" page tables) or not.
That said, we can optimize drawcalls all we want, but they will never be 'free' - they'll always cost CPU cycles, whether in housekeeping or in CPU/GPU rendezvous mechanisms.
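The first item, state tracking, can be sketched roughly as follows. This is a toy model and not how any particular driver is written: `StateTracker` and its fields are hypothetical, and the "hardware commands" are plain tuples. State set by the application is only recorded, then compared against what was last sent when a draw arrives, so redundant sets never reach the command stream.

```python
class StateTracker:
    """Records state lazily and emits only real changes at draw time."""

    def __init__(self):
        self.committed = {}  # state last sent to the hardware
        self.pending = {}    # state set by the app since the last draw
        self.emitted = []    # resulting command stream

    def set_state(self, name, value):
        # Cheap: just remember the request.
        self.pending[name] = value

    def draw(self, vertex_count):
        # The housekeeping cost is paid here, once per draw call.
        for name, value in self.pending.items():
            if name not in self.committed or self.committed[name] != value:
                self.emitted.append(("set", name, value))
                self.committed[name] = value
        self.pending.clear()
        self.emitted.append(("draw", vertex_count))


tracker = StateTracker()
tracker.set_state("blend", "alpha")
tracker.draw(3)
tracker.set_state("blend", "alpha")  # redundant, filtered out at draw time
tracker.draw(3)
tracker.set_state("blend", "opaque")
tracker.draw(3)
```

Even in this tiny model the comparison loop runs on every draw, which matches the point that draw calls can be made cheaper but not free.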
|
|
|
|
|
|
#19 |
|
Member
Join Date: Mar 2007
Location: Wroclaw, Poland
Posts: 578
|
Sure. Who's going to create the de facto standard the way Intel's x86 is? I vote for PowerVR to lead in this space. :>
|
|
|
|
|
#20 | |
|
Senior Member
Join Date: Feb 2002
Posts: 2,036
|
Quote:
|
|
|
|
|
|
|
#21 |
|
Senior Member
Join Date: Mar 2006
Posts: 1,713
|
When the CPU constructs a draw call, the GPU executes a different one in parallel, right? So a GPU should not be slowed down by draw call overhead. With PCs getting ever more CPU cores, will draw call overhead really be an issue within the next couple of years? It's not as if games right now are making 100% use of all the CPU power. I don't think this is going to change. Or am I missing something (which is very likely) ?
I guess the first question really is: is the GPU often put into idle mode only because of draw call overhead (so not because there is no work to be done.) If the answer to that is 'yes', then the rest doesn't need to be answered... |
|
|
|
|
|
#22 |
|
Senior Member
Join Date: Feb 2002
Posts: 2,036
|
In most situations the driver/CPU is multiple draw calls ahead of the GPU so yes, they work in parallel. The GPU is only slowed down if it's starved for work.
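A small simulation can show when the GPU ends up starved. The model below is a deliberate simplification with assumed names and numbers: `simulate`, `cpu_cost` and `gpu_cost` are hypothetical, every draw costs the same on each side, and the queue between CPU and GPU is unbounded.

```python
def simulate(n_draws, cpu_cost, gpu_cost):
    """Return (time the GPU finishes, total time the GPU sat idle)."""
    gpu_free_at = 0.0
    gpu_idle = 0.0
    for i in range(n_draws):
        # The CPU finishes building draw i at this time.
        submitted_at = (i + 1) * cpu_cost
        # The GPU starts once the draw exists and the previous one is done.
        start = max(submitted_at, gpu_free_at)
        gpu_idle += start - gpu_free_at
        gpu_free_at = start + gpu_cost
    return gpu_free_at, gpu_idle


# CPU faster than GPU: the CPU runs ahead, the GPU only waits for draw 0.
gpu_bound = simulate(n_draws=4, cpu_cost=1, gpu_cost=2)

# CPU slower than GPU: the GPU waits before every single draw.
cpu_bound = simulate(n_draws=4, cpu_cost=2, gpu_cost=1)
```

In the first case the GPU idles only while the first draw is being built; in the second it idles before every draw, which is the "starved for work" situation described above.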
|
|
|
|
|
|
#23 | |
|
a.k.a. Ingenu
Join Date: Feb 2002
Location: Apsley, U.K.
Posts: 2,752
|
Quote:
I think people want to know why/if they need to put extra effort optimising to minimise draw calls.
|
|
|
|
|
|
#24 | |
|
Member
Join Date: May 2002
Location: Slovenia
Posts: 420
|
Quote:
If you have lots of vertex/index buffers, then the CPU will have to translate API handles to actual hardware addresses all the time. This isn't even a CPU problem that you could solve by having more cores or more threads. It depends a lot on memory latency. I did some tests a while ago... Basically the test goes from one draw primitive call to 100k draw primitive calls, with a total budget of 15M triangles that stays the same throughout the entire run. Same texture, same shader, just flipping vertex and index buffers on each draw primitive call and uploading some constants. This is D3D 11: https://static.slo-tech.com/52734.jpg And this is multithreaded D3D 11 vs NV proprietary OpenGL extensions: https://static.slo-tech.com/52736.jpg
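For a sense of scale, the fixed 15M triangle budget means each call draws fewer triangles as the call count rises. The short Python calculation below works that out; only the endpoints (1 call and 100k calls) come from the post, and the intermediate call counts are assumed for illustration.

```python
TOTAL_TRIANGLES = 15_000_000  # fixed budget from the test description


def triangles_per_call(draw_calls):
    """Triangles in each call when the budget is split evenly."""
    return TOTAL_TRIANGLES // draw_calls


# 1 and 100_000 are the stated endpoints; 100 and 10_000 are assumed steps.
batch_sizes = {n: triangles_per_call(n) for n in (1, 100, 10_000, 100_000)}
```

At 100k calls each draw carries only 150 triangles, so the run is dominated by per-call work rather than by the geometry itself.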
|
|
|
|
|
|
#25 |
|
a.k.a. Ingenu
Join Date: Feb 2002
Location: Apsley, U.K.
Posts: 2,752
|
Quite interesting.
|
|
|