DemoCoder said:
Uttar, don't take this as a hostile attack, but I've seen 3 separate threads now where you've posted essentially totally incorrect conclusions about hardware. (e.g. supersampling vs multisampling thread, etc)
GPUs don't need to write post-transformed vertices back to video RAM. There is an on-chip FIFO that stores the last few post-transformed vertices. It's between 10 and 16 vertices depending on chip architecture. DirectX even includes an Optimize Mesh routine that essentially reorders your vertices to make optimal use of the vertex FIFO. It has a tremendous effect on performance.
The NVidia patent you refer to is talking about a small on-chip buffer, not writing post-transformed vertices back to video memory (and reading them BACK! That would be a stupendously dumb design).
You know what makes me look the most stupid? I already knew the vertex cache is a FIFO holding between 10 ( NV10 ) and 16 ( NV20 ) vertices ( BTW, I wonder how many the NV30 has... ).
And that's precisely why I came up with that insane ( and ridiculous ) idea of storing thousands of vertices in memory before using the indices. And yes, it's completely dumb. But that was the only explanation I could find for that thing in the patent.
Your explanation actually makes sense: they would be referring to a cache larger than most others by calling it "memory". I don't quite understand why they'd do that ( they've always called caches "buffers" before ) - but it still sounds a lot more logical than everything I've said.
Everything is as most people ( but me ) supposed: the vertices waiting to get into Triangle Setup are in a Vertex Cache.
The *only* bandwidth cost of T&L, thus, is reading static VBs ( and IBs )
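To make the FIFO point concrete, here's a rough sketch of how a post-transform vertex FIFO saves work on an indexed triangle list. It's only a toy model: the function name and the default size of 16 entries are my own assumptions for illustration, not how any particular chip actually does it.

```python
from collections import deque

def count_transforms(indices, cache_size=16):
    """Count how many vertices must be transformed when drawing an
    indexed triangle list through a FIFO post-transform cache.
    (Toy model; cache_size=16 is an assumed figure.)"""
    fifo = deque(maxlen=cache_size)  # oldest entry falls out automatically
    transforms = 0
    for idx in indices:
        if idx not in fifo:
            # Cache miss: the vertex has to go through T&L again.
            transforms += 1
            fifo.append(idx)
    return transforms

# Two triangles sharing an edge (a quad): 6 indices, 4 unique vertices.
quad = [0, 1, 2, 2, 1, 3]
print(count_transforms(quad))                # FIFO catches the shared edge
print(count_transforms(quad, cache_size=0))  # no cache: every index is transformed
```

The better the index order keeps recently used vertices inside the FIFO, the fewer transforms you pay for, which is the whole point of reordering a mesh.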
So, let's consider a 32B FVF. And consider the GFFX limit of 350M vertices/s at 60FPS ( 5.8M vertices/frame ).
That's, err, impossible. It would require 177MB of storage if every vertex were static.
Err... And let's consider AGP 8X 2GB/s limit. At 60FPS, that's only 1M Vertices/frame...
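Here's the same back-of-the-envelope math as a small script, in case anyone wants to plug in other numbers. The constant names are mine; I'm assuming 2GB/s means 2e9 bytes/s and that the storage figure is counted in units of 2^20 bytes, which is what gets you close to 177.

```python
VERTEX_BYTES = 32             # 32B FVF
FPS = 60
GFFX_VERTS_PER_SEC = 350e6    # quoted GFFX vertex rate
AGP8X_BYTES_PER_SEC = 2e9     # assumed: 2GB/s = 2e9 bytes/s

# Vertices per frame if the chip ran at its quoted peak rate.
peak_verts_per_frame = GFFX_VERTS_PER_SEC / FPS

# Storage needed to hold one frame's worth of unique vertices.
storage_mb = peak_verts_per_frame * VERTEX_BYTES / 2**20

# Vertices per frame that AGP 8X alone could deliver.
agp_verts_per_frame = AGP8X_BYTES_PER_SEC / FPS / VERTEX_BYTES

print(f"peak vertices/frame: {peak_verts_per_frame / 1e6:.2f}M")
print(f"storage needed:      {storage_mb:.0f}MB")
print(f"AGP 8X vertices/frame: {agp_verts_per_frame / 1e6:.2f}M")
```

The storage result comes out a hair above 177 because the script uses 5.83M vertices/frame rather than the rounded 5.8M.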
Okay, so assuming we can use 75% of AGP 8X ( the remaining being used for textures ) and 64MB of memory for stored vertices ( that's really a best case scenario, you couldn't play at high res at all with that ) - how many vertices per frame and per second can we have?
We can have 2.75M Vertices/frame and 165M Vertices/s
However, reading those 64MB every frame costs 64*60MB: 3840MB/s - 24% of the GFFX memory bandwidth!
And having 64MB for static data would probably not be logical unless you've got 256MB of RAM on your GPU. And 75% of AGP 8X would only be possible if there are nearly no texture uploads, so you'll also need 256MB of RAM on your GPU there too.
Also, it's highly unlikely every single static vertex is going to be drawn in a real world game. So, in practice, it might be something like 30% of them being drawn in the same frame ( thus resulting in barely 1.35M Vertices/frame ... )
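And the "best case" scenario worked through the same way. Assumptions that are mine: MB and GB are decimal here, and the GFFX memory bandwidth is taken as 16GB/s, which is what the 24% figure implies. The function name is just for illustration.

```python
VERTEX_BYTES = 32
FPS = 60
AGP8X_BYTES_PER_SEC = 2e9          # assumed: 2GB/s = 2e9 bytes/s
GFFX_MEM_BYTES_PER_SEC = 16e9      # assumed: implied by the 24% figure
STATIC_BYTES = 64e6                # 64MB of static vertices in video memory

def verts_per_frame(agp_share=0.75, static_drawn=1.0):
    """Vertices per frame: AGP-streamed ones plus the fraction of the
    static vertices that actually gets drawn this frame."""
    agp_verts = AGP8X_BYTES_PER_SEC * agp_share / FPS / VERTEX_BYTES
    static_verts = STATIC_BYTES / VERTEX_BYTES * static_drawn
    return agp_verts + static_verts

best = verts_per_frame()
realistic = verts_per_frame(static_drawn=0.30)
static_bw_share = STATIC_BYTES * FPS / GFFX_MEM_BYTES_PER_SEC

print(f"best case: {best / 1e6:.2f}M/frame, {best * FPS / 1e6:.0f}M/s")
print(f"30% of static drawn: {realistic / 1e6:.2f}M/frame")
print(f"static reads: {static_bw_share:.0%} of memory bandwidth")
```

The results land slightly above 2.75M and 1.35M because the AGP share works out to about 0.78M vertices/frame rather than a round 0.75M.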
Sounds like it was required to make the switch to better polygons, and not more polygons!
I really, really hope I didn't make an error here again. I'd look so stupid if I did...
I did reread all of this, so I think it's less likely there's any major error.
I'm sorry for having made so many errors in past threads and messages. I won't post things unless I'm absolutely certain of them, and have checked every single spot on the planet to make sure.
Uttar