"Yes, but how many polygons?" An artist blog entry with interesting numbers

Discussion in 'Console Technology' started by Farid, Sep 1, 2007.

  1. Mobius1aic

    Mobius1aic Quo vadis?
    Veteran

    Joined:
    Oct 30, 2007
    Messages:
    1,698
    Likes Received:
    276
    Utilized in terms of size of library, or utilized in terms of system usage? I think even amongst 1st, 2nd and the very best PS2 dedicated 3rd parties, VU0 was fairly underutilized. Would love to hear more examples as to how it was used though.

    The availability of DX on Xbox made it much easier to push close to the limits of that system than it was on the PS2. Xbox visuals were probably often limited by multiplatform development, which dictated workability with the PS2 and GameCube. Splinter Cell: Chaos Theory was a very special example of a third party doing it right because the series' home was the Xbox, and its users would expect no less.
     
  2. Nesh

    Nesh Double Agent
    Legend

    Joined:
    Oct 2, 2005
    Messages:
    12,906
    Likes Received:
    3,072
    What was the VU0 planned for, and what could it have done?
     
  3. Aaron Elfassy

    Newcomer

    Joined:
    Apr 17, 2016
    Messages:
    69
    Likes Received:
    87
  4. jlippo

    Veteran Regular

    Joined:
    Oct 7, 2004
    Messages:
    1,593
    Likes Received:
    838
    Location:
    Finland
    Clipping is when a polygon crosses the edge of the view frustum pyramid and is cut into new polygons that lie entirely within it, which increases the number of visible polygons.
     
  5. Nesh

    Nesh Double Agent
    Legend

    Joined:
    Oct 2, 2005
    Messages:
    12,906
    Likes Received:
    3,072
    I am not sure I understand :p
     
  6. milk

    milk Like Verified
    Veteran Regular

    Joined:
    Jun 6, 2012
    Messages:
    3,677
    Likes Received:
    3,729
    If a model is halfway out of view and some of its tris get rejected, that is still called culling, not clipping. It is a very granular kind of culling, that is for sure, but still. It is so granular that, depending on how fast or slow the culling is compared with simply pushing the geometry through anyway, it might become a non-optimisation: rejecting all those polygons ends up slower than just rendering them.

    Clipping refers to when a single triangle is halfway out of view, so it gets sliced and re-triangulated so that all its verts stay within the screen-space bounds for rasterisation. In practice, the screen coordinate bounds usually extend a bit beyond the screen edge so as to reduce the number of tris that need to be clipped (more polys end up culled entirely before they ever need clipping).
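
    A minimal sketch of that cull / draw / clip decision in 2D screen space. The function name, the screen size and the guard-band size are assumptions made up for illustration, not values from any particular console, and the near plane is ignored.

```python
from enum import Enum


class Action(Enum):
    CULL = "cull"   # no vertex can touch a visible pixel: drop the triangle
    DRAW = "draw"   # fits inside the extended bounds: rasterize as-is
    CLIP = "clip"   # pokes past the extended bounds: must be sliced first


def classify_triangle(tri, screen, guard):
    """tri is three (x, y) points; screen and guard are (xmin, ymin, xmax, ymax)."""
    sx0, sy0, sx1, sy1 = screen
    gx0, gy0, gx1, gy1 = guard
    xs = [p[0] for p in tri]
    ys = [p[1] for p in tri]
    # All three verts beyond the same screen edge: nothing can be visible.
    if max(xs) < sx0 or min(xs) > sx1 or max(ys) < sy0 or min(ys) > sy1:
        return Action.CULL
    # Entirely inside the extended (guard) bounds: no slicing needed.
    if min(xs) >= gx0 and max(xs) <= gx1 and min(ys) >= gy0 and max(ys) <= gy1:
        return Action.DRAW
    return Action.CLIP


# Hypothetical numbers: a 640x448 screen with a generous border around it.
SCREEN = (0, 0, 640, 448)
GUARD = (-1024, -1024, 1664, 1472)
```

    With these made-up bounds, a triangle hanging 50 pixels off the left edge is drawn untouched, and only one reaching thousands of pixels off-screen gets clipped.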
     
    Nesh likes this.
  7. Nesh

    Nesh Double Agent
    Legend

    Joined:
    Oct 2, 2005
    Messages:
    12,906
    Likes Received:
    3,072
    Since the clipping boundary is beyond the edges of the viewport, isn't it possible, instead of slicing polygons, to actually "vanish" them completely once all their vertices extend beyond the set boundary?
     
  8. milk

    milk Like Verified
    Veteran Regular

    Joined:
    Jun 6, 2012
    Messages:
    3,677
    Likes Received:
    3,729
    Yes. And that is done. And that is called culling. Frustum culling, to be more specific.
    But until ALL vertices are completely out of screen-space, you can't simply skip rendering the poly without leaving visible holes. That's why it then has to be clipped.
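
    To make the "sliced and re-triangulated" part concrete, here is a rough sketch that clips a triangle against a single boundary (the left edge, x >= x_min) in the Sutherland-Hodgman style and then fans the result back into triangles. The function names are made up for illustration; a real clipper repeats this for every boundary and also interpolates colours, UVs and depth.

```python
def clip_polygon_left(poly, x_min=0.0):
    """Clip a convex polygon (list of (x, y)) against the half-plane x >= x_min."""
    out = []
    for i, cur in enumerate(poly):
        prev = poly[i - 1]  # for i == 0 this wraps around to the last vertex
        cur_in = cur[0] >= x_min
        prev_in = prev[0] >= x_min
        if cur_in != prev_in:
            # The edge prev -> cur crosses the boundary: emit the crossing point.
            t = (x_min - prev[0]) / (cur[0] - prev[0])
            out.append((x_min, prev[1] + t * (cur[1] - prev[1])))
        if cur_in:
            out.append(cur)
    return out


def fan_triangulate(poly):
    """Split a convex polygon into triangles that all share poly[0]."""
    return [(poly[0], poly[i], poly[i + 1]) for i in range(1, len(poly) - 1)]


# One vertex is off-screen to the left, so the clipped shape becomes a quad.
tri = [(-10.0, 0.0), (10.0, 0.0), (10.0, 20.0)]
clipped = clip_polygon_left(tri)
pieces = fan_triangulate(clipped)
```

    Here one input triangle turns into two output triangles, which is why clipping raises the visible polygon count instead of lowering it.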
     
    Nesh likes this.
  9. corysama

    Newcomer

    Joined:
    Jul 10, 2004
    Messages:
    190
    Likes Received:
    173
    There are 5 processors involved in PS2 graphics:

    The EE (CPU), which does normal CPU things. In this case its job is to do high-level culling of whole batches of polygons. So, decide to draw the whole batch or skip it. Batches to be drawn are linked into a queue of instructions for the VIF.

    The VIF is a slightly-programmable DMA engine. Basically, its job is to copy data from the main memory to the internal memory of the VU. In the process it can do a little rearranging of offsets, strides, packing, unpacking the source and destination buffers. The VIF also sends code to the VU1. The VU1 only has 16kb of memory total for both code and data. So, you have to stream code for specific situations kind of like switching shaders.

    The VU1 receives code and a series of data chunks from the VIF. For each data chunk, the VIF can also start the code at a data-driven address. That's how you can have multiple routines in a single code chunk. The VU1's job is to prepare data for the GIF. The VU has limited (16 bit) general purpose registers and instructions. Its real power is in vector instructions that can do large volumes of math. The VU handles vertex animation, lighting, UVs (including some of the mipmap math), etc. It also handles culling individual off-screen triangles and it must manually clip triangles into sub-triangles vs. the edge of the screen. Basically, it handles everything to do with geometry. It's not a 1-in-1-out vertex setup like vertex shaders. It's "blob of bytes" in, many-triangles-out.

    The GIF is another DMA engine. It reads chunks of bytes from the VU1's memory and stuffs them into the registers of the GS. Like the VIF, you point it at a command queue containing a mixture of control bytes and data bytes for it to read through and interpret.

    The GS is the rasterizer. It has no instruction set. You control it entirely by setting values in registers (via the GIF). There are registers to define a texture to read from the embedded DRAM. Registers to define the framebuffer in the same DRAM. Registers to define the blending/Z/interpolator actions (vertex color modes). And, a register where you stuff vertex positions. Stuff that 1 register 3 times (yes, overwriting 2 values) and the GS will rasterize a triangle into the framebuffer according to the state defined in the rest of the registers. Alternatively, there is a triangle strip mode that only requires 3 positions to get the first triangle, then 1 per triangle after that. For a given triangle, the GS can only handle a single texture, a few options for how to incorporate the vertex colors and a limited blend mode. So, if you want to use two textures, you need to draw the same triangle twice with different configurations. Fortunately, changing GS configs can be done by setting a few registers at a cost of 1 cycle each. So, the VU can transform a dozen triangles once, have the GS rasterize them, switch GS configs and rasterize them again. That's twice the pixel work for the GS, but it's pretty cheap for the VU.
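
    The "stuff one register three times" behaviour and the strip mode can be modelled with a toy vertex queue. This is only a sketch: `VertexKick` and `write_position` are made-up names, not real GS registers, and the alternating winding order of strips is ignored.

```python
class VertexKick:
    """Toy model of a single position register feeding a rasterizer."""

    def __init__(self, strip=False):
        self.strip = strip
        self.queue = []       # positions written since the last triangle
        self.triangles = []   # stand-in for "rasterize into the framebuffer"

    def write_position(self, xyz):
        self.queue.append(xyz)
        if len(self.queue) == 3:
            self.triangles.append(tuple(self.queue))
            if self.strip:
                self.queue.pop(0)    # strip: reuse the last two positions
            else:
                self.queue.clear()   # list: the next triangle starts from scratch


# Hypothetical positions; only the count of writes matters here.
positions = [(i, i % 2, 0) for i in range(6)]

tri_list = VertexKick(strip=False)
tri_strip = VertexKick(strip=True)
for p in positions:
    tri_list.write_position(p)
    tri_strip.write_position(p)
```

    The same six writes produce two triangles in list mode and four in strip mode, which is why strips were worth the trouble when every register write counted.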


    So, that's a long, fun brain-dump just to say that the CPU isn't really burdened with calculating polygons. The VU1 is :) Your main intuition is correct though. The VU can and should do a whole lot of fine-grained culling before sending triangles to the rasterizer. That can explain a lot of the difference in polygon culling measured in the emulators.
     
    idsn6, milk, Cloofoofoo and 2 others like this.
  10. PSman1700

    Veteran Newcomer

    Joined:
    Mar 22, 2019
    Messages:
    4,520
    Likes Received:
    2,074
    Thanks for explaining the PS2 a bit more to us :) Do you think Transformers would have been possible on the OG Xbox at the time? The huge draw distances and the number of trees on display with zero fog were quite amazing at 60fps. The game had impressive physics too, almost like Havok.
     
  11. corysama

    Newcomer

    Joined:
    Jul 10, 2004
    Messages:
    190
    Likes Received:
    173
    The VU1 was set up to do all of the geometry and animation work to prepare triangles for the rasterizer. I'm not sure what the plan was for the VU0. But, I think the idea was basically "What if we added another VU to help out the CPU with math-heavy work?"

    The problem with the VU0 was that there wasn't an obvious good way to schedule streaming results out of it. You could stream data into it easily enough. But, it couldn't send data out on its own. It also couldn't tell the CPU when data was ready. So, the official plan was to have the CPU poll to see if the VU0 was idling before triggering a DMA out operation. It was very difficult to set that up without having the CPU waste time polling with nothing else to do (or the CPU doing lots of other stuff while the VU0 idles waiting for more work).

    Late in the PS2's lifetime, my team at High Voltage Software figured out how to make scheduling work out. I think the VU0 would somehow trigger an interrupt and the CPU's interrupt handler would move the DMA pipeline along asynchronously from the CPU's main thread. I expect PS2-specialty shops figured this out way before we did, but they weren't sharing :p

    With that in place, moving high-level culling to the VU0 would be an obvious first task. The VU1 doesn't have the room to do animation keyframe->blended matrices work alongside the vertex transforms. So, that would be a great job for the VU0 to offload from the CPU. Physics in general would fit well too. Audio processing would be possible. But, there would be multiple hops involved getting data from main RAM->VU0 RAM->main RAM->IOP RAM->SPU RAM. That's a lot of latency.
     
  12. corysama

    Newcomer

    Joined:
    Jul 10, 2004
    Messages:
    190
    Likes Received:
    173
    That would be difficult. The PS2 could handle more raw, basic, simple polygons than the Xbox, but only after a whole lot of work and constraints to appease the hardware. It also technically had more fill rate, but again with a whole lot of constraints. The Xbox had a stronger CPU and a much easier API. The vertex and pixel shaders had way more features than the GS, were much more familiar to PC devs and they were by no means weak. So, if you want to draw a bunch of single-texture, vertex lit, alpha-testing polys then add full-screen "motion blur" as a post-process, the PS2 is actually a much better fit. Throw in lightmaps and it gets a bit harder for the PS2. Normal maps? LOL, no. (Someone figured out how to technically make it work, but it took like 8 passes). Shadow maps? Hope you are OK with solid black shadows. Etc...
     
    chris1515, jlippo and PSman1700 like this.
  13. PSman1700

    Veteran Newcomer

    Joined:
    Mar 22, 2019
    Messages:
    4,520
    Likes Received:
    2,074
    Wouldn't an Xbox version be possible somehow, playing to its strengths?
     
  14. Cloofoofoo

    Newcomer

    Joined:
    Aug 26, 2018
    Messages:
    71
    Likes Received:
    117
    Yes, thank you for that explanation, it gives a clearer picture of the PS2. But I was actually asking about the Dreamcast. Even if the GPU automatically culls backfaces, does hidden surface removal and clips in x,y, the CPU still has to calculate all those polygons before sending them to the GPU, right? I was just asking whether clipping/culling on the CPU is a good idea before the polygons are sent to the GPU, or whether the CPU is already overloaded with everything else (physics, AI and so on). The SDK mentions that backface culling can be costly for the CPU.
     
  15. Cloofoofoo

    Newcomer

    Joined:
    Aug 26, 2018
    Messages:
    71
    Likes Received:
    117
    Returning to how many polygons: I extracted these from the disc files. The game is Marionette Handler 2 for the Dreamcast. It's a robot fighting sim where you don't control the robots but program their AI. It's a very low-budget title and a sequel. The robots range from 3,100 to 4,700 triangles with no weapons or effects (gun shots, rocket boost bursts). The stages range from 3,000 to 5,200 triangles. The weapons range from 300 to 500 triangles. I guess polygon-wise it performs like Dissidia / 012 on the PSP, where the characters range from 1,500 to 2,400 triangles (with weapons) and the stages go up to 8,000 triangles.

    Robot 1 - 3,923 tris
    Robot 2 - 4,791 tris
    Robot 3 - 4,110 tris
    Stage 2 - 5,252 tris
    Stage 6 - 4,151 tris
     
    xaeroxcore and jlippo like this.
  16. corysama

    Newcomer

    Joined:
    Jul 10, 2004
    Messages:
    190
    Likes Received:
    173
    With the Dreamcast (and most other systems) the CPU does not need to do poly-by-poly work to send geo to the GPU. The CPU just points the GPU at clumps of polys to draw as a batch. The CPU does do some work to make sure that batch is worth trying to draw (determine that it's probably visible vs. definitely entirely invisible), but it does not want to bother with backface culling or anything like that.

    The PS2 works the same way if you think of the VU1+GS as a single unit. It's just different because the VU1 is more programmable than GPU geometry hardware ever was up until maybe the new Mesh Shaders (which are basically a modernized take on the same idea as the VU1).
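
    The batch-level "worth trying to draw" check can be sketched as a bounding-sphere test against the frustum planes. Everything here is an assumption for illustration: the function names, the dictionary layout, and the box-shaped stand-in for a real frustum. Planes are (a, b, c, d) with a unit normal pointing inward, so a point is inside when ax + by + cz + d >= 0.

```python
def sphere_outside_plane(center, radius, plane):
    """True if the whole sphere lies on the outer side of the plane."""
    a, b, c, d = plane
    signed_dist = a * center[0] + b * center[1] + c * center[2] + d
    return signed_dist < -radius


def visible_batches(batches, planes):
    """Keep batches whose bounding sphere is not entirely outside any plane.

    This is conservative: a kept batch is only "probably visible", while a
    rejected one is definitely invisible.
    """
    return [
        b["name"]
        for b in batches
        if not any(sphere_outside_plane(b["center"], b["radius"], p) for p in planes)
    ]


# Hypothetical view volume: -10 <= x <= 10 and 1 <= z <= 100.
PLANES = [
    (1.0, 0.0, 0.0, 10.0),    # left:  x >= -10
    (-1.0, 0.0, 0.0, 10.0),   # right: x <= 10
    (0.0, 0.0, 1.0, -1.0),    # near:  z >= 1
    (0.0, 0.0, -1.0, 100.0),  # far:   z <= 100
]

BATCHES = [
    {"name": "robot", "center": (0.0, 0.0, 20.0), "radius": 2.0},
    {"name": "rock", "center": (50.0, 0.0, 20.0), "radius": 2.0},
    {"name": "wall", "center": (11.0, 0.0, 20.0), "radius": 3.0},
]

to_draw = visible_batches(BATCHES, PLANES)
```

    The cost is a handful of dot products per batch rather than any per-triangle work, which is the whole point of leaving the fine-grained culling to the geometry hardware.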
     
    idsn6 likes this.
  17. xaeroxcore

    Newcomer

    Joined:
    Aug 1, 2014
    Messages:
    66
    Likes Received:
    4

    So does that mean RE4 (PS2 version), SoulCalibur 2 on GCN and the PSP games are all within the DC's technical reach? Could they run on the DC with more modern rendering techniques? And I can't believe DC Jedi Power Battles sports more than 2 million polys; what a waste for a game that ended up looking like an upgraded 32-bit port. Happy new year to everyone!
     
  18. Nesh

    Nesh Double Agent
    Legend

    Joined:
    Oct 2, 2005
    Messages:
    12,906
    Likes Received:
    3,072
    I think, though, that the PSP versions may have fewer polygons than their PS2 counterparts. Then again, the DC does appear to push polygon counts comparable to the PS2. DOA2 is an impressive feat.
     
    xaeroxcore likes this.
  19. xaeroxcore

    Newcomer

    Joined:
    Aug 1, 2014
    Messages:
    66
    Likes Received:
    4
    I dream of the day someone ports those PS2/PSP games to the DC. Now we know those games are within the DC's tech realm, at least in the polycount department!
     
    Nesh likes this.
  20. Cloofoofoo

    Newcomer

    Joined:
    Aug 26, 2018
    Messages:
    71
    Likes Received:
    117
    Skies of Arcadia for the Dreamcast is next. The game is strange. The main characters are around 900 tris without weapons. Side characters and low-level monsters are around 500 tris. The late-game (regular) monsters and bosses range from 3k to close to 8k tris. The stages seem to range from 5k to 50k tris.

    Vyse - 938 tris without weapons
    Final boss - 6,111 tris
    Regular enemy inside shrine - 3,500 tris
    Electric giga boss - 7,670 tris
    Ruins - 24,017 tris
    Ice ruins - 24,443 tris
    Ice ruins pt2 - 52,414 tris
    Ice ruins entrance - 12,498 tris
    Inside Valuan ship - 19,761 tris
     