Hello, first post - IGP combo

Your sample implementation is completely unrealistic given current hardware. I might as well suggest we put more vertex shader units on the GPU and make them faster. That's just as theoretical as your 'implementation', and it would yield even better results!

Yes, and the first line of my last post states exactly that: it doesn't seem possible with current hardware. So guess what the rest of my post is referring to.

I didn't ignore it; I pointed out that this is not possible, and may never be possible.

OK, and I'm discussing the implementation of a system where a vertex processor is embedded in the chipset. The original idea was to use the vertex processor that was already in the IGP; since it has been established that there is no vertex processor on current IGPs, guess where my comments were directed after that. The word 'if' preceding the mention of the IGP and of writing to a vertex buffer was put in repeatedly for a reason.

That is under the assumption that the IGP can render directly into the video memory of the other GPU, which isn't realistic, as I pointed out earlier. Perhaps you should read my posts more carefully. I feel like I'm only repeating myself.

That's funny; I have essentially been repeating myself and wondering why you are misreading my posts. For example, in your edit, why did you use system_mem? I said AGP memory. And if the IGP is capable of writing out to it, there is no reason it can't access any memory the CPU can.
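To make the distinction concrete, here is a minimal sketch assuming a Direct3D 9 style API; the pool and usage flags below are only hints, and where the driver actually places the buffer is ultimately its own decision.

    // Sketch only: D3D9-style buffer creation. The flags are hints to the
    // driver; the driver decides where the memory actually lives.
    #include <d3d9.h>

    HRESULT CreateBuffers(IDirect3DDevice9* device, UINT bytes,
                          IDirect3DVertexBuffer9** agpHint,
                          IDirect3DVertexBuffer9** sysMem)
    {
        // Dynamic + write-only in the default pool is the usual hint for
        // AGP / non-local memory: the CPU writes it, the GPU reads it over AGP.
        HRESULT hr = device->CreateVertexBuffer(
            bytes,
            D3DUSAGE_DYNAMIC | D3DUSAGE_WRITEONLY,
            0,                  // no FVF; a vertex declaration is used instead
            D3DPOOL_DEFAULT,
            agpHint, NULL);
        if (FAILED(hr)) return hr;

        // System-memory pool: plain CPU-side memory, never resident on the card.
        return device->CreateVertexBuffer(
            bytes,
            0,
            0,
            D3DPOOL_SYSTEMMEM,
            sysMem, NULL);
    }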

Not at all. There are many ways to implement shadow volumes. 3DMark03 generates them entirely through vertex shaders, for example (including skinning). Another possibility is to extrude in object space: transform the light into the mesh's object space instead of transforming the mesh, so even if the CPU is generating the volumes, it will not need to do a transform pass (a sketch follows below).
And if you want transformed vertices, why would you read them back over AGP? If it's faster to use the CPU to transform them, then do that.
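To make the object-space point concrete, a rough sketch (the vector/matrix types here are just illustrative): only the light is transformed, by the inverse world matrix, and the volume is built around it in object space.

    // Why object-space extrusion spares the CPU a transform pass: instead of
    // moving every mesh vertex into world space, move only the light into the
    // mesh's object space (a single matrix-vector multiply in total).
    struct Vec3   { float x, y, z; };
    struct Mat4x4 { float m[4][4]; };   // row-major 4x4 matrix

    Vec3 TransformPoint(const Mat4x4& m, Vec3 p)
    {
        return Vec3{
            m.m[0][0]*p.x + m.m[0][1]*p.y + m.m[0][2]*p.z + m.m[0][3],
            m.m[1][0]*p.x + m.m[1][1]*p.y + m.m[1][2]*p.z + m.m[1][3],
            m.m[2][0]*p.x + m.m[2][1]*p.y + m.m[2][2]*p.z + m.m[2][3]};
    }

    // worldInverse is the inverse of the mesh's world matrix.
    // Silhouette finding and extrusion then happen entirely in object space;
    // the GPU applies the world matrix when the volume is drawn.
    Vec3 LightToObjectSpace(const Mat4x4& worldInverse, Vec3 worldLightPos)
    {
        return TransformPoint(worldInverse, worldLightPos);
    }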

Silhouette determination done on the GPU, that's interesting, care to explain?

It's nice that you can copy some info from the SDK, but trust me, the resource was created in video memory on the cards I tested with; the performance clearly indicates that.
So I don't really see your point in writing this stuff here anyway.

I wish I had a crystal ball; can you sell me yours? I'm curious how you determined exactly which kind of memory the buffer was placed in. Please enlighten me.

I think you need to do three things:

1) Get statistics on people owning an IGP, 3d card, or both.
2) Study ways to efficiently render shadows/skinned meshes.
3) Benchmark the speed of various things, like the vertex-processing power of actual IGPs (see how slow software emulation is, or how slow even real shaders can be on a low-budget part), and resources in AGP memory.

Then rethink your idea.

I think you need to reread the context: I stated where one would use said technique, which is inherently tied to why I started cultivating the idea in the first place.
I think I need to stop responding to you, since you essentially ignore anything I state.
 
Yes, and the first line of my last post states exactly that: it doesn't seem possible with current hardware. So guess what the rest of my post is referring to.

Philosophical rubbish?
"What if we could use an onboard IGP?"
Why not go further and say: "What if we had an IGP on the graphics card?"
Or better yet: "What if there was an IGP integrated in the GPU?"
In short: "What if we had more vertex shaders?"
This is likely going to happen, and it will affect anyone with a graphics card in the future. Your idea is most likely not going to happen, and even if it does, I guess most people who buy fast graphics cards don't buy a PC with onboard graphics, so it still would not affect most people.
On another note, haven't you thought of dual-CPU solutions yet? One CPU could do the skinning etc., while the other CPU does the normal tasks. Not likely to affect many people either, but at least it can be implemented already.
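A trivial sketch of that split, using a present-day C++ thread purely for illustration; every type and function name below is made up.

    // Illustration only: one CPU does the per-vertex skinning while the other
    // handles the rest of the frame.
    #include <thread>

    struct Mesh {};          // bind pose + bone weights (placeholder)
    struct Pose {};          // current animation pose (placeholder)
    struct VertexBuffer {};  // destination for skinned vertices (placeholder)
    struct World {};         // everything else the game updates (placeholder)

    void SkinMeshOnCpu(const Mesh*, const Pose*, VertexBuffer*) { /* per-vertex skinning */ }
    void RunGameLogic(World*, float /*dt*/)                     { /* AI, physics, ... */ }

    void FrameUpdate(World* world, const Mesh* mesh, const Pose* pose,
                     VertexBuffer* skinned, float dt)
    {
        // Second CPU: the heavy skinning work for this frame.
        std::thread skinner(SkinMeshOnCpu, mesh, pose, skinned);

        // First CPU: the normal per-frame tasks run in parallel.
        RunGameLogic(world, dt);

        skinner.join();  // make sure skinning has finished before drawing
    }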

Silhouette determination done on the GPU, that's interesting, care to explain?

Oldest trick in the book:
1) Store degenerate quads (or tris, when extruding to infinity with w=0) for each edge of the model.
2) Store the plane equation of the triangle in each vertex.
3) For each vertex, classify its stored plane against the light, and extrude the vertices that belong to backfacing triangles.

3DMark03 does this, including skinning; I believe it actually skins all three vertices of a triangle for every vertex and calculates the plane equation on the fly. There was a bit of noise about this method from NVIDIA at the time, because the GeForce FX had fewer vertex shader units than the Radeons and was therefore a tad slower.
There is probably a good reason why Futuremark chose this approach over CPU processing, though.
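Written out as plain C++ for clarity (in practice this runs in a vertex shader, and the vertex format carrying its triangle's plane normal is the assumption from step 2 above), the per-vertex work is roughly:

    // Per-vertex work of a shadow-volume vertex shader, shown as C++.
    // Each vertex stores the normal of the triangle it was emitted with, so the
    // shader can decide by itself whether that triangle faces away from the light.
    struct Vec3 { float x, y, z; };
    struct Vec4 { float x, y, z, w; };

    static float Dot(Vec3 a, Vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }
    static Vec3  Sub(Vec3 a, Vec3 b) { return Vec3{a.x - b.x, a.y - b.y, a.z - b.z}; }

    // position:    object-space vertex position
    // planeNormal: normal of the triangle this vertex was stored with
    // lightPos:    point light position, already in object space
    Vec4 ShadowVolumeVertex(Vec3 position, Vec3 planeNormal, Vec3 lightPos)
    {
        Vec3 fromLight = Sub(position, lightPos);

        if (Dot(planeNormal, fromLight) > 0.0f)
        {
            // Backfacing triangle: extrude this vertex to infinity along the
            // light direction (homogeneous w = 0).
            return Vec4{fromLight.x, fromLight.y, fromLight.z, 0.0f};
        }
        // Triangle faces the light: leave the vertex in place (w = 1).
        return Vec4{position.x, position.y, position.z, 1.0f};
    }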

I wish I had a crystal ball; can you sell me yours? I'm curious how you determined exactly which kind of memory the buffer was placed in. Please enlighten me.

The answer was already in the previous post. And you may want to watch that attitude, especially when you are asking simple questions like "How do I generate shadow volumes on the GPU?". It looks a bit silly.

I think I need to stop responding to you, since you essentially ignore anything I state.

I tend to ignore people when they aren't making any sense.
 
Hello Infinisearch,

As far as I can see, your argument goes something like this:

1) Many PCs have a primary video card in a slot, and a (usually disabled) IGP on the motherboard.

2) When the IGP is disabled, some transistors are going unused.

3) Maybe it would be possible to use those transistors?

Now ignoring all the technical pros and cons (and FWIW I can see more cons than pros) there's a more basic question: can you make a business case for it?

IGPs are the ultra-ultra-cheap end of video chips, maybe $10-$20. The whole point of them is that video chip companies can sell into a huge system integrator market, which makes them money. To do this, they have to beat other manufacturers on price. If they're cheap, the SIs will buy them and avoid paying for a proper video card. The margins here are very small because the volume of sales is high.

This means the video card companies are very, very unlikely to spend money developing fancy new IGP technology, and certainly not new drivers or other tools to support IGP/GPU interoperation. They want to push a cheaper, lower-power version of some primary card technology down to that price point without changing the way it works or invalidating all the testing and development they did when they built the primary card. It doesn't pay to innovate in the commodity component market, and the card companies need the income from this market to finance all the high-end R&D for their expensive flagship cards.

I had a lot more to write here, but I ended up deleting it because I think the main points have already been made by other people, and going into more technical arguments isn't going to help. I think the short answer is that the cost of making use of those transistors is high, and the return low.

If you are personally up against a performance problem with CPU skinning in something you're coding, then I'd suggest offloading it onto the GPU, care of the VP, which is really easy and gives excellent results. If you're just performing a thought experiment, then I'd probably give it up unless you can come up with a business case where your idea makes real dollars for the video card manufacturers.

If you're really interested in arguing this out with ATI or NVidia, why not contact developer relations and see what they say? I suspect you'll get a similar answer, but let us know if you hear different.

Cheers,

Will Vale
 
If you are personally up against a performance problem with CPU skinning in something you're coding, then I'd suggest offloading it onto the GPU, care of the VP, which is really easy and gives excellent results.

I think that is the basis of the problem...
Infinisearch probably thinks that skinning on the CPU is the way to go because Carmack does it like that in Doom3.
IMHO it is not the way to go at all, even if it is Carmack who is doing it.
Your advice is the right one, I believe, and I gave it before as well: offload it to the GPU, it is way faster than the CPU. I can only guess why Carmack didn't choose to take this path, but I certainly do not agree with him, and I'm not the only one. Futuremark doesn't agree with him either (and their results seem better than Doom3's), and I believe that HL2 also takes the other path, as will most other upcoming games.
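For what it's worth, the per-vertex work being offloaded is small and maps directly onto a vertex shader. Written out as plain C++ for illustration (the common four-bones-per-vertex layout is an assumption), it is roughly:

    // Matrix-palette skinning as a vertex shader would do it, shown as C++.
    // A four-influence-per-vertex layout is assumed; weights sum to 1.
    struct Vec3   { float x, y, z; };
    struct Mat4x4 { float m[4][4]; };   // row-major 4x4 matrix

    static Vec3 TransformPoint(const Mat4x4& m, Vec3 p)
    {
        return Vec3{
            m.m[0][0]*p.x + m.m[0][1]*p.y + m.m[0][2]*p.z + m.m[0][3],
            m.m[1][0]*p.x + m.m[1][1]*p.y + m.m[1][2]*p.z + m.m[1][3],
            m.m[2][0]*p.x + m.m[2][1]*p.y + m.m[2][2]*p.z + m.m[2][3]};
    }

    // bonePalette: bone matrices, uploaded to the GPU as shader constants
    // indices/weights: the bones influencing this vertex and by how much
    Vec3 SkinVertex(Vec3 position,
                    const Mat4x4 bonePalette[],
                    const int indices[4],
                    const float weights[4])
    {
        Vec3 result{0.0f, 0.0f, 0.0f};
        for (int i = 0; i < 4; ++i)
        {
            Vec3 p = TransformPoint(bonePalette[indices[i]], position);
            result.x += weights[i] * p.x;
            result.y += weights[i] * p.y;
            result.z += weights[i] * p.z;
        }
        return result;
    }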
 