Infinisearch
Veteran
***Warning fairly long post***
First off I wanted to say hello... hello, this is my first post. And this might be the wrong forum for it.
Now to the nitty gritty: I have a simple idea that should be simple to implement, if it's possible at all. Its feasibility can only be determined by the likes of ATI, Nvidia, and Intel. The idea is just that a lot of chipsets have integrated video, so why not enable certain functionality of the IGP to complement a discrete graphics card when one is installed?
Case one (a good background read is a paper from Nvidia called batchbatchbatch.pdf)
The following is written from the perspective of DirectX, since I've been looking into writing a 3D engine. I'll skip the background and jump straight to the need to batch triangles: depending on processor speed and how much CPU you want left over for 'other stuff' (AI, physics, ...), you pick a target number of batches per frame (DrawPrimitive() calls) to aim for. With that chosen (according to some PDFs from both Nvidia and ATI, 200 is a good number), you then have to decide how to fit N meshes into M (again, let's say 200) draw calls.

I won't talk about texture state changes and their associated speedups, but skip to transform state changes. You have the option of pre-transforming vertices on the CPU (the 'default way'), or, as per the above-mentioned paper, using one-bone matrix palette skinning in a vertex shader for simple (non-skeletal) meshes. The latter is faster, has less CPU overhead, and allows more triangles per model, since the GPU's T&L engine is so much faster than the CPU.

Now at this point all is fine and dandy, until you decide on a lighting/shadowing method, which for the most part will have you process your geometry again: once for each light, or once for every four lights (best case). This isn't so much a problem for non-skeletal meshes, but when skeletal meshes with skinning come into play, bam, all of a sudden not only do you need more draw calls, even worse, you can become triangle/vertex-shader limited. So you're back to transforming vertices into world space on the CPU once (speed is a concern, since skinning is a lot more expensive on the CPU) and then using the GPU to apply the view/projection matrix for the camera and each light.

Well, here's where that trusty integrated GF4 MX or Radeon 9000 core could come in real handy. If it were possible to use the integrated T&L units to process the vertices into another vertex buffer (in world space), you'd free the CPU to do other things and reduce the load on your primary GPU.
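To make the batch-budget arithmetic concrete, here's a minimal sketch. The ~25k-draw-calls-per-second-per-GHz constant is the rough rule of thumb from the Nvidia batching material; treat the exact figure, and the function name, as assumptions rather than anything official:

```cpp
#include <cassert>

// Rough draw-call budget per frame, given CPU speed, the fraction of CPU
// you're willing to spend on submitting batches, and the target frame rate.
// kBatchesPerSecPerGHz is an assumed rule of thumb, not a measured figure.
int drawCallBudget(double cpuGHz, double cpuFractionForRendering, double targetFps) {
    const double kBatchesPerSecPerGHz = 25000.0;
    return static_cast<int>(kBatchesPerSecPerGHz * cpuGHz
                            * cpuFractionForRendering / targetFps);
}
```

For example, a 1 GHz CPU spending 40% of its time on batch submission at 50 fps lands right at the 200 draw calls per frame mentioned above.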
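To show what's actually being offloaded, here's a minimal CPU-side sketch of matrix-palette skinning into a world-space vertex buffer, i.e. the per-frame work I'd like to hand to the IGP's T&L units instead. All the type and function names here are made up for illustration, not from D3DX or any real API:

```cpp
#include <cstddef>
#include <vector>

// Minimal row-major 4x4 matrix and vertex types (hypothetical, standing in
// for D3DXMATRIX / your engine's vertex format).
struct Mat4 { float m[4][4]; };
struct Vec3 { float x, y, z; };

// A skinned vertex references up to 4 bones with blend weights.
struct SkinnedVertex {
    Vec3  pos;
    int   bone[4];
    float weight[4];
};

static Vec3 transformPoint(const Mat4& m, const Vec3& v) {
    return {
        m.m[0][0]*v.x + m.m[0][1]*v.y + m.m[0][2]*v.z + m.m[0][3],
        m.m[1][0]*v.x + m.m[1][1]*v.y + m.m[1][2]*v.z + m.m[1][3],
        m.m[2][0]*v.x + m.m[2][1]*v.y + m.m[2][2]*v.z + m.m[2][3],
    };
}

// CPU matrix-palette skinning: blend each vertex by its bone matrices and
// write the world-space result into a second buffer. The camera pass and
// each light pass can then reuse that buffer with only a view/projection
// transform, instead of re-skinning every time.
void skinToWorldSpace(const std::vector<SkinnedVertex>& in,
                      const std::vector<Mat4>& palette,  // world-space bone matrices
                      std::vector<Vec3>& out) {
    out.resize(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        Vec3 acc{0.0f, 0.0f, 0.0f};
        for (int b = 0; b < 4; ++b) {
            if (in[i].weight[b] == 0.0f) continue;
            Vec3 p = transformPoint(palette[in[i].bone[b]], in[i].pos);
            acc.x += p.x * in[i].weight[b];
            acc.y += p.y * in[i].weight[b];
            acc.z += p.z * in[i].weight[b];
        }
        out[i] = acc;
    }
}
```

Doing this once per frame per skinned mesh is exactly the cost that grows with each extra lighting pass if you skin in the vertex shader instead.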
This should allow for a nice frame rate boost, and for more alive worlds by letting you get away with less static geometry.
I was going to do cases 2 and 3, but this is a long post as it is, and my brain hurts. So what do you think? (Case 2: shadow maps on the integrated part, or any render-to-texture for that matter. Case 3: I forgot; I'll post when I remember.)