OpenGL guy said:
Chalnoth said:
I'm just stating that drivers held back the GeForces in that test
And as I said before: how do you know it was the drivers that held it back?

What else?
Where are vertex buffers kept in modern games and game engines: video memory, AGP memory, system memory, or some hybrid of these?
Can anyone answer me? Thx.
Mummy said:
Where are vertex buffers kept in modern games and game engines: video memory, AGP memory, system memory, or some hybrid of these? Can anyone answer me? Thx.

Yes.
I can tell you how 3DMark creates its VBs and which FVF is set. In the high polygon test there are many vertex buffers of various lengths, but with these characteristics:

FVF 18 (0x12): D3DFVF_XYZ | D3DFVF_NORMAL
Pool: D3DPOOL_DEFAULT
Usage: 8 (D3DUSAGE_WRITEONLY)

The rest are:
FVF 258 (0x102): D3DFVF_TEX1 | D3DFVF_XYZ
FVF 274 (0x112): D3DFVF_TEX1 | D3DFVF_XYZ | D3DFVF_NORMAL

For the high polygon test with 8 lights it's the same, except:
FVF 322 (0x142): D3DFVF_TEX1 | D3DFVF_DIFFUSE | D3DFVF_XYZ
Usage: 0x208 (D3DUSAGE_DYNAMIC | D3DUSAGE_WRITEONLY)
plus a few 0x60 (96) byte VBs (around 3 of them).

Every VB created (except those very small ones in the high poly tests) uses D3DPOOL_DEFAULT and D3DUSAGE_WRITEONLY, so they SHOULD go in VRAM (it still depends on the drivers).
I can check other apps/games too - just point me to a demo of the game you want to test.
Zephyr said:
Thanks a million. Here are some interesting games and synthetic benchmarks: Jedi Knight 2, UT2003, Comanche 4, Serious Sam 2, Dungeon Siege, CodeCreatures Benchmark Pro, and Aquanox.

I can only speak for UT2003. I think the only card UT2003 is sort of limited by triangle throughput on is the Kyro II, because it doesn't have TnL but has enough fillrate (in comparison to other cards that lack TnL) to make that the bottleneck. Almost forgot the Radeon 7000. BTW, that's on low-end machines - with a 2.8 GHz P4 you most likely won't be bound by triangle throughput even with software vertex processing.
Saem said:
I remember there was mention of "recycling geometry" or something like that used in the Unreal engine build that'll be used in UT2k3. First off, is there such a thing or am I hallucinating, and if there is, where and how does the performance improvement come in?

We are using a system of prefabs, like Chalnoth mentions, that allows level designers to construct complex levels out of a set of already existing geometry. This geometry used to be instanced to save memory: it shared position/normal/base texture coords across instances and had separate streams for diffuse to allow for independent precomputed vertex lighting. As an optimization, at level load time all geometry is transformed into world space and sorted by material to generate large chunks of geometry that can be rendered without any state changes in a single DIP call. This uses much more memory (up to 20 MByte in some levels) but is a noticeable speedup due to the increased batch sizes for DIP calls. There is a setting in the ut2003.ini file you can fiddle with in the upcoming demo that allows you to switch between the two approaches. The default is using batching.
vogel said:
As an optimization, at level load time all geometry is transformed into world space and sorted by material to generate large chunks of geometry that can be rendered without any state changes in a single DIP call. This uses much more memory (up to 20 MByte in some levels) but is a noticeable speedup due to the increased batch sizes for DIP calls.

Interesting... I guess that would mean that the geometry throughput in UT2k3 isn't AGP-limited? That is assuming, of course, that not all the geometry can be stored in video memory when the geometry is transformed at level load time.

Chalnoth said:
Interesting... I guess that would mean that the geometry throughput in UT2k3 isn't AGP-limited?

With the biggest level it's slightly above 20 MByte of vertex data, so I guess most vertex buffers will actually end up in local memory on modern cards, though I doubt it actually matters as we are quite fillrate bound.

-- Daniel
Chalnoth said:
Interesting... I guess that would mean that the geometry throughput in UT2k3 isn't AGP-limited? That is assuming, of course, that not all the geometry can be stored in video memory when the geometry is transformed at level load time.

FWIW, I was playing a bit with our OpenGL renderer, and using AGP vs local memory for static geometry only has a minor impact on average framerate but lowers the maximum framerate by quite a bit. So for games like UT2k3, AGP 4x shouldn't be a bottleneck even if all vertex data is kept in AGP memory (which is only the case in OpenGL on NVIDIA cards)... that is, unless you stare at a wall.
Reverend said:
Hey Daniel, going a little off the path here, but Tim told me that while we can record our own demos *and* benchmark said demos, it may not be best since there'll be other kinds of overhead - he said that with the included UPT, pure rendering performance is measured, not AI and/or game code, which are things you guys don't attempt to benchmark since consistency of results can't be guaranteed.

I'll be glad to answer the question in a thread where it wouldn't be so blatantly off-topic

-- Daniel

vogel said:
I'll be glad to answer the question in a thread where it wouldn't be so blatantly off-topic

<grumbling> ... asshole...