New dev tools? PlayStation 3 Edge

Here's a question.

When using backface-culling-type methods to reduce the number of polys drawn by the GPU, is there also a benefit in that textures for the polys that are not drawn are also not required... thus reducing the total memory usage on the GPU?

If so, I guess the PS3's large OS reservation may be a non-issue for those using these methods on multiplatform titles.
 
Wouldn't that require a much more complicated pipeline? You have to preload textures at least somewhat, so there's more 'latency' in that respect which would make it harder to save memory on that account.
 
Convenient timing. Someone recently posted a similar question about raytracing to that thread, and I was about to respond to it.

What I was referring to was that occlusion/visibility culling is one of those things for which the only really generic solution is an exhaustive brute-force search. For the case of visibility, that's basically raycasting/raytracing through the scene to every conceivable point. In practice, you downsample the infinite space to a finite number of pixels. Z-prepass culling is essentially equivalent to raycasting and storing the distance of the first hit.
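
To make that equivalence concrete, here's a minimal software sketch, assuming a per-pixel buffer of first-hit distances (the names are hypothetical, not any particular engine's API):

```cpp
// Sketch only: a software depth prepass, illustrating "raycast and store the
// distance of the first hit per pixel". All names here are hypothetical.
#include <limits>
#include <vector>

struct Fragment { int x, y; float depth; };

// Pass 1: record the nearest depth seen at each pixel (the "first hit").
void depthPrepass(const std::vector<Fragment>& frags, int w, int h,
                  std::vector<float>& zbuf)
{
    zbuf.assign(w * h, std::numeric_limits<float>::infinity());
    for (const Fragment& f : frags)
        if (f.depth < zbuf[f.y * w + f.x])
            zbuf[f.y * w + f.x] = f.depth;
}

// Pass 2: only fragments matching the stored nearest depth get shaded, so
// occluded geometry never pays for its (expensive) pixel shading.
bool survivesPrepass(const Fragment& f, const std::vector<float>& zbuf, int w)
{
    return f.depth <= zbuf[f.y * w + f.x];
}
```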

I was strictly referring to a closed solution that could totally be done with the raw numerical representation of the geometry before ever shipping anything down to the GPU.

Thanks for the information, ShootMyMonkey.

But would it be possible, in relatively little time (less than 2 years after launch, or once the final SDK is in use), for developers to insert some RT/RC content ("partial scene", or more likely characters, cars, clouds, etc.) into a game using 1 or 2 SPUs, given that back in 2004 OpenRT (www.saarcor.de) already achieved something like 8 million rays/sec (at 512x384 res.) with 1 RPU at 90 MHz, one pipe, 250nm, 4 GFlops?
 
When using backface-culling-type methods to reduce the number of polys drawn by the GPU, is there also a benefit in that textures for the polys that are not drawn are also not required... thus reducing the total memory usage on the GPU?

If so, I guess the PS3's large OS reservation may be a non-issue for those using these methods on multiplatform titles.
Mmmmm... conceivable, but not even slightly likely. Since you're talking about something that's done every frame, you can't really dump things in and out of memory every frame since polygons that aren't visible on one frame may be visible the next. Usually, when doing streaming-style things, you're going to move whole blocks of data that contain everything referenced by a certain area, since the disc/hdd is so many million times slower than your memory.
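
To put "so many million times slower" into frame terms (rough, assumed ballpark figures, not numbers from this thread): at 60 fps you have about 16.7 ms per frame, while a single optical-disc seek alone is commonly on the order of 100 ms or more, so paging textures per frame based on which polys happen to face the camera simply can't keep up.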

That too, when backface culling polys of a single object, it's quite likely the polygons on the "back" which are culled are going to have the same textures/shaders/renderstates as the polygons on the front -- you are still talking about one object, after all.

Wouldn't that require a much more complicated pipeline? You have to preload textures at least somewhat, so there's more 'latency' in that respect which would make it harder to save memory on that account.
I don't really see Butta's scenario as being all that possible in practice. I think the biggest complication is going to come from the fact that you can't really move precompiled meshes and index buffers and so on, which is something a lot of engines do because it saves time on the CPU side. On certain platforms, precompiled display lists are bread and butter. Instead, you're going to have to pre-process and construct your vertex/index buffers for stuff at runtime.
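
A rough sketch of what that runtime construction looks like, as opposed to shipping precompiled buffers (CPU-side only; the structure names are hypothetical):

```cpp
// Sketch: building the index buffer at runtime from a culled triangle list,
// instead of submitting a precompiled display list. Names are hypothetical.
#include <cstddef>
#include <cstdint>
#include <vector>

struct Triangle { uint16_t i0, i1, i2; };

// Append only the triangles that passed culling to the dynamic index buffer
// that actually gets handed to the GPU this frame.
void buildIndexBuffer(const std::vector<Triangle>& tris,
                      const std::vector<bool>& visible,
                      std::vector<uint16_t>& indicesOut)
{
    indicesOut.clear();
    for (std::size_t t = 0; t < tris.size(); ++t) {
        if (!visible[t]) continue;        // culled on the CPU/SPU side
        indicesOut.push_back(tris[t].i0);
        indicesOut.push_back(tris[t].i1);
        indicesOut.push_back(tris[t].i2);
    }
}
```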

This is vaguely similar to what people did on the PS2, where the VU would basically act as a vertex shader unit that ran consistently a few verts ahead of the GS and kept updating the stream as the GS was eating up more verts. The only difference is that you didn't try to cull anything at the VU, because the GS had such high fillrate that it was faster to just suffer the overdraw than to backface cull.

But would it be possible, in relatively little time (less than 2 years after launch, or once the final SDK is in use), for developers to insert some RT/RC content ("partial scene", or more likely characters, cars, clouds, etc.) into a game using 1 or 2 SPUs, given that back in 2004 OpenRT (www.saarcor.de) already achieved something like 8 million rays/sec (at 512x384 res.) with 1 RPU at 90 MHz, one pipe, 250nm, 4 GFlops?
Saarcor/OpenRT stuff is pretty cool indeed. And I do think that there's a lot of potential, and I'm a total raytracing nut, so I very much like the idea of raytracing hardware in lieu of rasterizers. All that said, what they need is to get the funding to scale that up to modern levels of processing (i.e. lots of pipes, higher frequency) and modern memory architecture (they're running on something like 250 MB/sec) and then we can really see how competitive it gets. That's something that I don't think they'll get anytime soon.
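
For a rough sense of scale (my own back-of-the-envelope, assuming one primary ray per pixel and no secondary rays): 512x384 is about 197k pixels, so 8 million rays/sec works out to roughly 40 frames' worth of primary rays per second, and every shadow or reflection ray divides that frame rate further.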
 
Mmmmm... conceivable, but not even slightly likely. Since you're talking about something that's done every frame, you can't really dump things in and out of memory every frame since polygons that aren't visible on one frame may be visible the next. Usually, when doing streaming-style things, you're going to move whole blocks of data that contain everything referenced by a certain area, since the disc/hdd is so many million times slower than your memory.

Do you see any solution for those having difficulty handling multiplatform titles due to the smaller amount of available memory on the PS3 (especially with respect to texture storage)? Or do culling and similar techniques constitute space savings, since less geometry data is passed to RSX?
 
Do you see any solution for those having difficulty handling multiplatform titles due to the smaller amount of available memory on the PS3 (especially with respect to texture storage)?
Finer granularity in stream blocks would be the first thing I can think of if you really have a design requirement for dense variation in materials. You may end up with level design limitations as a result (i.e. wide open spaces with large draw distances may be a problem).

Alternatively, you can find more intelligent ways to use less content, so that you can worry less about densely varying materials. E.g. blending maps at multiple scales can get you higher effective detail than a single high-res map without using much actual memory (potentially less). Three maps blended according to weights defined in vertex colors can mean less content than a single combined map at 4x the resolution (depending on how many vertices), and, done intelligently, it can look better to boot while also allowing more variation in look, so that you don't actually see things like tiling on terrain textures and so on.
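
A minimal CPU-side sketch of that kind of weighted blend (it mirrors the math a pixel shader would evaluate per pixel; the names and the normalization step are assumptions, not anyone's engine code):

```cpp
// Sketch: blending three detail maps by weights stored in vertex colors
// (interpolated per pixel). Hypothetical names, for illustration only.
struct Color { float r, g, b; };

Color blend3(Color a, Color b, Color c, float wa, float wb, float wc)
{
    float sum = wa + wb + wc;          // vertex-color weights may not sum to 1
    if (sum > 0.0f) { wa /= sum; wb /= sum; wc /= sum; }
    return { a.r * wa + b.r * wb + c.r * wc,
             a.g * wa + b.g * wb + c.g * wc,
             a.b * wa + b.b * wb + c.b * wc };
}
```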

Generically speaking, you have to take some sort of tradeoff if you're pushing content into too tight a fit. The game itself can use a fair bit of memory (mainly statically allocated and computationally-concerning content), but I really don't see it pushing anywhere into the range of 50+ MB, let alone actually being able to *trim away* 50+ MB. You can trim away the occasional megabyte, but that's about it.
 
My post in that older thread was referring to the geometry of just one particular subsystem, not the entire scene, hence the smaller number. A typical frame in our game will have 3+ million vertices in total. You need to optimize your shaders to get 60+ fps with that much geometry, but you don't need to use the CPU to do it on 360. A lot of the stuff in Edge (backface culling, skinning, zero-area triangle culling, zero-pixel triangle culling, etc.) is not necessary on 360; just feed everything to the GPU and it will happily rip through it. I tried doing CPU skinning on the 360 for the hell of it, and the gains were negligible.
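
For reference, the heart of a backface/zero-area test of the kind Edge runs on the SPUs is just a signed-area check in screen space. A hedged sketch follows (hypothetical names, not Edge's actual code):

```cpp
// Sketch of the screen-space signed-area test behind backface and
// zero-area triangle culling. Not Edge's actual implementation.
struct Vec2 { float x, y; };

// Twice the signed area of the triangle in screen space. Which sign counts
// as "front-facing" depends on winding order and whether y is up or down.
float signedArea2(const Vec2& a, const Vec2& b, const Vec2& c)
{
    return (b.x - a.x) * (c.y - a.y) - (c.x - a.x) * (b.y - a.y);
}

// Assuming front faces have positive signed area under the chosen convention:
// cull anything back-facing or with effectively zero area.
bool shouldCull(const Vec2& a, const Vec2& b, const Vec2& c, float eps = 1e-6f)
{
    return signedArea2(a, b, c) <= eps;
}
```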

Our next year's PS3 title will use Edge ideas, but not Edge itself. Being multiplatform, it's easier to just take the pieces we need from Edge and implement them in our framework ourselves. Using GCMHud and GCMReplay, we've been able to get some estimates, and it looks promising so far!

Any updates on the Edge results? Have you been able to meet your objectives (3+ million vertices in a scene)?
 
Any updates on the Edge results? Have you been able to meet your objectives (3+ million vertices in a scene)?

It's gonna be a few more months; we still have a lot of work to do on the PS3 build. Our target, though, is 60 fps (not vert count) for this year's game on PS3/360, so if we need to trim back, we will.
 
It's gonna be a few more months; we still have a lot of work to do on the PS3 build. Our target, though, is 60 fps (not vert count) for this year's game on PS3/360, so if we need to trim back, we will.

But am I mistaken, or are you not already achieving 60 fps with 3+ million polys in your 360 engine? Would it not be your target to match those numbers on PS3?
 
But am I mistaken, or are you not already achieving 60 fps with 3+ million polys in your 360 engine? Would it not be your target to match those numbers on PS3?

Yup, but that's without MSAA. With 4x MSAA it's almost at 60 fps on 360, but not quite. Our target on PS3 is not to make it match 360, but to make it go as fast as it can go. We still have lots of work left to do, though.
 
Yup, FM2 = Forza Motorsport 2

Yup, but that's without MSAA. With 4x MSAA it's almost at 60 fps on 360, but not quite. Our target on PS3 is not to make it match 360, but to make it go as fast as it can go. We still have lots of work left to do, though.

I don't know if you are able to elaborate further, but is it the relatively unpredictable nature of what the tiles cover that prevents a solid 60 fps? I suppose the question boils down to whether or not you are vertex-limited in certain scenes.
 
I read an impressive post by cpiasminc/shoot... so RSX has a 256 KB post-transform cache (8x the 32 KB of Xenos)?

(Any connection... that's exactly the size of the local store of one SPU?)
 
I read an impressive post by cpiasminc/shoot... so RSX has a 256 KB post-transform cache (8x the 32 KB of Xenos)?
I can't recall ever having heard of a post-transform cache being listed in KB. The figures are given in vertices. On looking back, though, I didn't realize I'd said 8x. That's a typo (5 is directly below 8 on the keypad :oops:). RSX's is 63 verts to Xenos' 14, IIRC.

32 KB, btw is the size of the *texture* cache on Xenos.
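
To make the "N verts" figure concrete: a post-transform cache holds recently transformed vertices keyed by their index, so indexed triangles that reuse an index skip re-running the vertex program. A rough sketch of estimating the effect with a plain FIFO cache (an assumption for intuition only; real hardware replacement policies and entry sizes differ):

```cpp
// Sketch: estimating vertex-shader invocations with a FIFO post-transform
// cache of a given size (e.g. 63 vs 14 entries). Hypothetical, for intuition.
#include <algorithm>
#include <cstdint>
#include <deque>
#include <vector>

std::size_t countVertexShaderRuns(const std::vector<uint32_t>& indices,
                                  std::size_t cacheSize)
{
    std::deque<uint32_t> cache;   // oldest entry at the front
    std::size_t runs = 0;
    for (uint32_t idx : indices) {
        if (std::find(cache.begin(), cache.end(), idx) != cache.end())
            continue;             // cache hit: transformed result reused
        ++runs;                   // miss: vertex shader runs again
        cache.push_back(idx);
        if (cache.size() > cacheSize) cache.pop_front();
    }
    return runs;
}
```

Running the same index stream through with cacheSize set to 63 versus 14 gives a feel for how much vertex re-shading the larger cache can save, though the exact ratio depends entirely on how the mesh was stripped/ordered.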
 
I can't recall ever having heard of a post-transform cache being listed in KB. The figures are given in vertices. On looking back, though, I didn't realize I'd said 8x. That's a typo (5 is directly below 8 on the keypad :oops:). RSX's is 63 verts to Xenos' 14, IIRC.

32 KB, btw is the size of the *texture* cache on Xenos.
Thanks a lot for the answer, but what does the post-transform cache mean in terms of memory space?

(63 to 14, that's ~4x?)
 