The numbers that circulate in here are a bit misleading.
The SH4 cannot perform T&L for 10 million polys at interactive frame rates. However, it CAN process significantly more geometry than what was seen in commercial games.
The highest I have seen is Dead or Alive 2, with ~30,000 vertices per frame in some scenes.
I got a DC devkit some time ago and managed to get 80,000 verts per scene @ 60 fps with dynamic diffuse lighting and one texture. This was coded purely in assembler, and the geometry was stored as a simple raw, continuous vertex buffer, with the shader being evaluated for every vertex.
Further optimizations using indexed geometry buffers and a triangle stripping library will give even better results.
My point is, the DC was a platform where you could get acceptable results without too much effort, and the developers did not really bother to max it out. I'll post my demo when I finish it (I do not have a lot of free time), and I think it will get my point across.
I will try to post my demo sometime this year, as I do not have much time right now. In its current state it uses a single dynamic light and a prebaked ambient occlusion texture. It looks good, but I want to add some extra layers of environment mapping, bump mapping and maybe cook some shadows as well.
80k verts @ 60 fps uses up the whole frame, so I would have to dedicate another 1/60th of a second to game logic. For such levels of geometry I think that 60 fps is not feasible. 30 fps is achievable though, and with a good triangle stripping library like nvTriStrip in my exporter I can bring the geometry dataset down to half the size with no quality loss... so I would be pushing the equivalent of something like 150k verts @ 30 fps, along with plenty of time for game logic to execute...
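To make the frame budget arithmetic above easier to follow, here is a rough sketch using only the figures quoted in the post. The function names are made up for illustration, and the "stripping halves the vertex count" factor is the poster's estimate, not a measured value.

```python
# Back-of-the-envelope frame budget from the figures quoted above.
VERTS_PER_PASS = 80_000   # raw vertices transformed and lit in one pass
PASS_TIME = 1 / 60        # seconds that pass takes (one full 60 fps frame)


def throughput(verts, seconds):
    """Vertices processed per second."""
    return verts / seconds


def logic_time_left(fps, tl_time=PASS_TIME):
    """Seconds left in a frame for game logic after the T&L pass."""
    return 1 / fps - tl_time


def equivalent_raw_verts(strip_verts, reduction=0.5):
    """Raw-buffer vertices represented by a stripped buffer.

    `reduction` is an assumption: the stripped data set is taken to be
    half the size of the raw one, as estimated in the post.
    """
    return strip_verts / reduction


if __name__ == "__main__":
    print(round(throughput(VERTS_PER_PASS, PASS_TIME)))  # verts per second
    print(logic_time_left(60))                           # nothing left at 60 fps
    print(logic_time_left(30))                           # about 1/60 s at 30 fps
    print(equivalent_raw_verts(VERTS_PER_PASS))          # upper bound on "equivalent" verts
```

With these assumptions the upper bound comes out at 160k equivalent verts per frame, so "something like 150k" is in the right ballpark if the stripper gets close to halving the data.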
I've been doing DC homebrew too. My best achieved poly count is 5.46 million per second... but half of the polygons are off screen in that instance. With all of the polygons visible (but not all front facing, I was drawing tori) the PVR seems to max out at 4.1 million polygons per second (and I wind up with about 30% CPU idle time then). I can process around 6 millionish vertices per second with transform, perspective divide, a dot for the light, clamping light to >0.0, ambient add, and submit with UV coords. No skinning or clipping, though, and I've only been drawing a single model, which fits entirely into cache (it's about 15 KB).
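For anyone who wants to see the per-vertex steps listed above spelled out (transform, perspective divide, light dot, clamp, ambient add), here is a minimal sketch in plain Python. It is not the poster's code: the function name, the row-major 4x4 matrix layout, and the assumption that the normal and light direction are already unit length and in the same space are all mine. On real hardware this would be hand-tuned SH4 code, not Python.

```python
def transform_and_light(pos, normal, uv, mvp, light_dir, ambient):
    """Process one vertex: transform, perspective divide, diffuse light.

    pos, normal, light_dir: 3-tuples of floats (normal and light_dir are
    assumed unit length and in the same space).
    mvp: 4x4 row-major matrix as a list of four 4-element rows (assumed layout).
    Returns (screen_xyz, uv, intensity).
    """
    x, y, z = pos
    # Transform (x, y, z, 1) by the combined matrix.
    cx, cy, cz, cw = (r[0] * x + r[1] * y + r[2] * z + r[3] for r in mvp)
    # Perspective divide (no clipping, so cw is assumed to be non-zero).
    inv_w = 1.0 / cw
    screen = (cx * inv_w, cy * inv_w, cz * inv_w)
    # One dot product for the light, clamped so back-facing gives zero.
    n_dot_l = sum(n * l for n, l in zip(normal, light_dir))
    diffuse = max(n_dot_l, 0.0)
    # Ambient add; UV coordinates pass through untouched.
    return screen, uv, diffuse + ambient


if __name__ == "__main__":
    # Toy projection where w = z, just to exercise the divide.
    mvp = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 1, 0]]
    print(transform_and_light((2.0, 4.0, 2.0), (0.0, 0.0, 1.0), (0.5, 0.5),
                              mvp, (0.0, 0.0, 1.0), 0.2))
```

Note that there is no upper clamp on the final intensity here, since the post only mentions clamping to >0.0.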
I haven't done much work on it recently, though, because I got hung up on getting a good triangle stripifier. I found a couple of open source methods online, but the only one that I could get working (GNU Triangulated Surface Library) generated terrible, terrible strips. (A spiral with a 16 edge cross section does NOT need 514 strips! The Stanford bunny was coming out at 4,000+ strips.) Sounds like I should look into nVidia's, I guess...
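To show why strip count matters so much, here is a small sketch of the vertex cost of a mesh submitted as a plain triangle list versus as strips. It assumes the usual strip rule that a strip of n triangles takes n + 2 vertices; the function names are just for illustration, and the triangle counts in the example are made up, not taken from the models mentioned above.

```python
def list_vertex_count(num_triangles):
    """Vertices submitted when every triangle is sent separately."""
    return 3 * num_triangles


def strip_vertex_count(num_triangles, num_strips):
    """Vertices submitted when the triangles are split over num_strips strips.

    Each strip pays 2 start-up vertices, then 1 vertex per triangle.
    """
    return num_triangles + 2 * num_strips


def strip_to_triangles(strip):
    """Expand one strip of vertex indices into triangles.

    Every other triangle has its first two indices swapped so that all
    triangles keep the same winding order.
    """
    tris = []
    for i in range(len(strip) - 2):
        a, b, c = strip[i], strip[i + 1], strip[i + 2]
        tris.append((a, b, c) if i % 2 == 0 else (b, a, c))
    return tris


if __name__ == "__main__":
    # Hypothetical 1000-triangle mesh.
    print(list_vertex_count(1000))        # plain triangle list
    print(strip_vertex_count(1000, 1))    # one ideal strip
    print(strip_vertex_count(1000, 500))  # badly fragmented strips
    print(strip_to_triangles([0, 1, 2, 3]))
```

So a stripifier that fragments the mesh into hundreds of short strips throws away much of the saving: in the made-up example, 500 strips cost 2000 vertices against 1002 for a single strip and 3000 for the raw list.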
Also DOA2 did closer to 50,000 polys per frame at 60 FPS, not 30,000 per frame. Sonic Adventure 2 averages 20,000 per frame at 60 FPS. Rez and Shenmue II peak at around 40,000 per frame at 30 FPS.
amk, Soul Calibur doesn't use bump mapping. The only places I've seen it are in the background for Sega Rally 2's title screen, and on some coins in Shenmue II.