New dev tools: PlayStation 3 Edge

I just hope that Edge will help eliminate these horrible-looking low-res shadow maps used in games lately (e.g. Motorstorm, Heavenly Sword, etc.). I am not an owner of a 360; is there also a problem with low-res shadow maps on that system?
It's not a problem of low absolute resolution, but low relative resolution. For instance, when an object is really far away from the light, it doesn't cover many shadow map pixels... and if that object is really close to the camera, it ends up having a lot of pixels in view, so it's not infeasible for its shadow to also be really big in pixel real estate. Similarly, if a light is coming in at a shallow angle relative to a shadow receiver, pixels in the shadow map end up shadowing a LOT of pixels in view because the shadow is so stretched out. And it doesn't matter what system you have; that problem is inherent in shadow maps and always will be.
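To put rough numbers on that relative-resolution problem, here's a back-of-envelope sketch in C; every figure in it (map size, coverage, FOV, distances) is an assumption for illustration, not from any real game:

```c
#include <math.h>
#include <stdio.h>

/* Back-of-envelope shadow map magnification for a directional light.
 * All numbers are illustrative assumptions. */
int main(void)
{
    float sm_res      = 1024.0f;             /* shadow map resolution      */
    float sm_extent   = 200.0f;              /* world units the map covers */
    float texel_world = sm_extent / sm_res;  /* ~0.2 world units per texel */

    float cam_dist    = 5.0f;                /* receiver close to camera   */
    float fov_y       = 60.0f * 3.14159265f / 180.0f;
    float screen_h    = 720.0f;              /* pixels                     */
    float pixel_world = 2.0f * cam_dist * tanf(fov_y * 0.5f) / screen_h;

    /* A result well above 1 means one shadow texel smears across many
     * screen pixels -- the blockiness being complained about. */
    printf("screen pixels per shadow texel: %.1f\n",
           texel_world / pixel_world);
    return 0;
}
```

With those numbers one texel covers roughly 24 screen pixels, and raising the map resolution only shrinks the ratio linearly; it's the ratio that matters, not the absolute size.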

You use things like PCF or VSMs to try to cover for the resolution problems, but it will never be "eliminated" as long as you're using shadow maps. And Edge or RSX or Xenos has nothing to do with it.
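For anyone unfamiliar, PCF just averages the binary depth-compare over a small kernel so the edge gets a gradient instead of a hard staircase; a minimal CPU-side sketch (generic code, not Edge or RSX API):

```c
/* Minimal 3x3 percentage-closer filtering (PCF).  shadow_map holds the
 * stored light-space depths; u/v are lookup coords in [0,1). */
float pcf_3x3(const float *shadow_map, int sm_size,
              float u, float v, float receiver_depth, float bias)
{
    int cx = (int)(u * sm_size);
    int cy = (int)(v * sm_size);
    float lit = 0.0f;

    for (int dy = -1; dy <= 1; ++dy) {
        for (int dx = -1; dx <= 1; ++dx) {
            int x = cx + dx, y = cy + dy;
            if (x < 0) x = 0; else if (x >= sm_size) x = sm_size - 1;
            if (y < 0) y = 0; else if (y >= sm_size) y = sm_size - 1;
            /* depth-compare each tap, then average the binary results */
            if (receiver_depth - bias <= shadow_map[y * sm_size + x])
                lit += 1.0f;
        }
    }
    return lit / 9.0f;   /* 0 = fully shadowed, 1 = fully lit */
}
```

Note it only softens the stairs; the underlying texels are still as big as ever.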

How about ray-traced shadows? Per-pixel and geometry-perfect for sharp shadows, with no worry about texture fetching, which limits RT's overall suitability for rendering.
Except shadow rays mean lots of computational load... Try to imagine one to two million pixels x the number of active lights in a scene, and you've got a nasty number of ray tests. We do have to do other things besides render stuff.
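Quick back-of-envelope (resolution and light count are assumed):

```c
#include <stdio.h>

/* Illustrative shadow-ray budget arithmetic, nothing more. */
int main(void)
{
    long pixels = 1280L * 720L;     /* ~0.9M visible pixels    */
    long lights = 2;                /* shadow-casting lights   */
    long fps    = 30;
    long rays   = pixels * lights;  /* rays per frame          */

    printf("%ld shadow rays/frame, %ld/sec\n", rays, rays * fps);
    /* ~1.8M rays per frame, ~55M per second -- before shading,
     * animation, AI or anything else gets a slice of the chip. */
    return 0;
}
```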

Now the notion of SPUs extruding geometry for shadow volumes (which will effectively give you the same results anyway) is feasible, but it's nasty in terms of fillrate, which neither RSX nor Xenos has a lot of. Granted, they can put out twice the pixels in that mode, but it's really no big victory considering you can get that when rendering shadow maps as well, and when you consider how many extrusions could exist for how many lights.

And even then, pixel-perfect hard shadows aren't exactly the greatest thing since sliced bread.
 
Except shadow rays mean lots of computational load... Try to imagine one to two million pixels x the number of active lights in a scene
I was only thinking of a main sunlight, which is the largest application of shadows I see. Lots of light sources will mean local lights so you don't need to light every object - only those within range of a certain intensity. You could also apply various optimizations like RTing every other pixel to halve/quarter the number of rays, and interpolate.

And even then, pixel-perfect hard shadows aren't exactly the greatest thing since sliced bread.
No, but they're better than the pixelated shadows we're seeing a lot of. 1) If there are better techniques, why aren't they being used? 2) What other techniques map well onto SPEs, as Mr. Wibble was questioning? RT with only geometry in suitably structured data shouldn't be too shabby.

Of course, all other suggestions of SPE friendly shadowing techniques are welcome!
 
I was only thinking of a main sunlight, which is the largest application of shadows I see.

Raytraced shadows require you to fire a ray for every single pixel and trace it to the light source. If you hit anything on the way, the pixel is in shadow. The trouble is that you'll need to have every single bit of geometry in the scene at hand, because the ray might have to travel toward any of it.
This basically means that you either calculate it on the SPEs, which can do the data handling but are seriously limited on local memory; or you try to hack it onto the GPU, which is going to be very complicated as GPUs don't have arbitrary memory access at all. Either way it's gonna be very, very slow.
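To be clear, the per-triangle test itself is cheap and well known; this is the standard Moller-Trumbore any-hit query in plain C. The killer is that it has to run against every potentially occluding triangle, which is exactly the data-access problem above:

```c
#include <math.h>

typedef struct { float x, y, z; } Vec3;

static Vec3  vsub(Vec3 a, Vec3 b) { Vec3 r = {a.x-b.x, a.y-b.y, a.z-b.z}; return r; }
static Vec3  vcross(Vec3 a, Vec3 b) {
    Vec3 r = {a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x};
    return r;
}
static float vdot(Vec3 a, Vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

/* Any-hit shadow ray vs. one triangle (Moller-Trumbore).  max_t is the
 * distance to the light, so hits beyond the light are ignored. */
int occludes(Vec3 orig, Vec3 dir, float max_t, Vec3 v0, Vec3 v1, Vec3 v2)
{
    Vec3  e1 = vsub(v1, v0), e2 = vsub(v2, v0);
    Vec3  p  = vcross(dir, e2);
    float det = vdot(e1, p);
    if (fabsf(det) < 1e-8f) return 0;          /* ray parallel to triangle */
    float inv = 1.0f / det;
    Vec3  tv = vsub(orig, v0);
    float u  = vdot(tv, p) * inv;
    if (u < 0.0f || u > 1.0f) return 0;
    Vec3  q  = vcross(tv, e1);
    float v  = vdot(dir, q) * inv;
    if (v < 0.0f || u + v > 1.0f) return 0;
    float t  = vdot(e2, q) * inv;
    return t > 1e-4f && t < max_t;             /* hit between point and light */
}
```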

You could also apply various optimizations like RTing every other pixel to halve/quarter the number of rays, and interpolate.

Undersampling? That'd look very ugly, even worse than shadow maps IMHO.
Optimizations usually have other approaches like dividing the scene geometry into voxels and so on.

Thinking about it, the tricks described by Alex Evans of Media Molecule -may- help (3d textures), but certainly not for a sunlight on current-gen hardware; there's not enough precision. And he can do some kind of soft shadows already, so why the need for sharp shadows? Only something like a nuclear blast would have an effect like that...
 
Undersampling? That'd look very ugly, even worse than shadow maps IMHO.
Worse than the 10x10 pixel jaggies we see creeping across car surfaces and floors?

Thinking about it, the tricks described by Alex Evans of Media Molecule -may- help (3d textures), but certainly not for a sunlight on current-gen hardware; there's not enough precision.
Proper space partitioning should negate the need to test every object with every ray.
And he can do some kind of soft shadows already, so why the need for sharp shadows? Only something like a nuclear blast would have an effect like that...
Sharp shadows are unrealistic, but very common. No games have proper soft shadows whose penumbra widens with distance, so all solutions are fake. They're either uniformly fuzzy-edged or hard. Hard looks better in most outdoor scenes, or for shadows cast by flaming torches.

If there's a fast way to produce more realistic shadows on Cell, go for it! Failing that, if shadows are to be rendered on Cell, I can't see hard shadows being a disappointment if they're better than many of the implementations we have at the moment, with their little moving staircase effects ;)
 
Proper space partitioning should negate the need to test every object with every ray.

You still get lots of random memory access to very large datasets, which is the general problem with raytracing. GPUs can't help you there, they're designed for a flow of data - touch once, then forget about it.
SPEs can work on geometry as they see fit - but they don't have enough local storage to hold even simple in-game models in memory. Space partitioning data structures require memory as well.
As soon as you have to reach for main RAM, your speed will slow to a crawl no matter how fast your CPUs are at number crunching; especially if they're in-order and can't do anything else while they're waiting for RAM reads.
 
If you want hard shadows, something like Doom3's unified lighting system will work. I don't see raytracing being necessary at all.
 
I was only thinking of a main sunlight, which is the largest application of shadows I see. Lots of light sources will mean local lights so you don't need to light every object - only those within range of a certain intensity. You could also apply various optimizations like RTing every other pixel to halve/quarter the number of rays, and interpolate.
*Adaptive* subsampling, where you don't subsample near edges, generally works all right, but it's hard not to screw up in cases where you've got fine-grained shadow detail smaller than the space between 8 or 4 or 2 pixels... which is exactly where you'd have problems with any other technique anyway.
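Something like this is what I mean by adaptive (trace() is a hypothetical stand-in for the per-pixel shadow query, returning 0 for shadowed and 1 for lit):

```c
#define BLOCK 4

extern float trace(int x, int y);   /* hypothetical per-pixel shadow query */

/* Trace only block corners; refine the block at full rate when they
 * disagree.  Detail thinner than a block can still slip between the
 * corners, which is the failure case described above. */
void shadow_mask(float *mask, int w, int h)
{
    for (int by = 0; by + BLOCK <= h; by += BLOCK) {
        for (int bx = 0; bx + BLOCK <= w; bx += BLOCK) {
            float c00 = trace(bx,             by);
            float c10 = trace(bx + BLOCK - 1, by);
            float c01 = trace(bx,             by + BLOCK - 1);
            float c11 = trace(bx + BLOCK - 1, by + BLOCK - 1);
            int   flat = (c00 == c10 && c00 == c01 && c00 == c11);

            for (int y = 0; y < BLOCK; ++y)
                for (int x = 0; x < BLOCK; ++x)
                    mask[(by + y) * w + bx + x] =
                        flat ? c00 : trace(bx + x, by + y);
        }
    }
}
```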

No, but they're better than the pixelated shadows we're seeing a lot of. 1) If there are better techniques, why aren't they being used? 2) What other techniques map well onto SPEs, as Mr. Wibble was questioning?
1) Better techniques in what way? I mean, I think shadow volumes are still among the best techniques currently for omnidirectional point lights, but that doesn't make them better in every single way than shadow maps. Look around for other techniques, and you'll find every one of them has a fatal flaw that makes them unacceptable in some way (and some of those may not necessarily be visual). All you're doing is singling one out and assuming that there could be nothing worse.

2) Raytraced shadows, unlike what you'd like to believe, do NOT map well (or rather, well enough). And I seriously doubt that some analog like beam tracing would do any better. If you're talking shadow-aid from the Cell, shadow volume extrusion or pretransforming for shadow depth map passes is the most obvious thing. I would be curious as to what further ingenuity might yield as far as shadows in general go. For instance, using a simplified area lightsource representation might be useful in generating data for min-max shadow maps... but it's all pie-in-the-sky ideas for now.
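As a flavour of the extrusion work an SPU could take on (names illustrative, not the Edge API; silhouette detection not shown):

```c
typedef struct { float x, y, z; } Vec3f;

/* Extrude one silhouette edge away from a directional light to build a
 * shadow-volume quad.  The SPU does this math; the GPU then rasterizes
 * the quads into the stencil buffer, which is where the fillrate cost
 * mentioned earlier comes back in. */
void extrude_edge(Vec3f a, Vec3f b, Vec3f light_dir, float far_dist,
                  Vec3f quad[4])
{
    quad[0] = a;                                 /* near edge on the mesh */
    quad[1] = b;
    quad[2].x = b.x + light_dir.x * far_dist;    /* far edge pushed away  */
    quad[2].y = b.y + light_dir.y * far_dist;    /* along the light dir   */
    quad[2].z = b.z + light_dir.z * far_dist;
    quad[3].x = a.x + light_dir.x * far_dist;
    quad[3].y = a.y + light_dir.y * far_dist;
    quad[3].z = a.z + light_dir.z * far_dist;
}
```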

RT with only geometry in suitably structured data shouldn't be too shabby.
Believe what you want, then. But do remember that in 1/30th of a second, you don't have 30 ms to spare for rendering, let alone shadows. 5 ms is pushing it.

Proper space partitioning should negate the need to test every object with every ray.
As mentioned elsewhere, you're trading brute force for bad memory access patterns, which is deathly slow on all architectures and always will be (at least until we figure out ways to use that whole quantum entanglement thingy). If the culling rate is massive, then you might get a win out of it, but that's not going to scale up that well with the number of rays -- you'll eventually hit a wall.
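To see why, look at the shape of a partitioned shadow query (structure illustrative, helpers hypothetical): every iteration begins with a load whose address came out of the previous load, so each cache miss stalls the whole walk:

```c
#include <stdint.h>

typedef struct {
    float   bmin[3], bmax[3];
    int32_t left, right;        /* child indices; left < 0 means leaf */
    int32_t first_tri, ntris;
} BVHNode;

extern int ray_hits_box(const BVHNode *n);   /* hypothetical helpers */
extern int ray_hits_tris(const BVHNode *n);

int shadow_ray_blocked(const BVHNode *nodes, int root)
{
    int stack[64];              /* fixed stack: fine for a sketch */
    int sp = 0;
    stack[sp++] = root;
    while (sp > 0) {
        /* Dependent load: the address comes from the node we just
         * popped, so the CPU simply waits out the memory latency. */
        const BVHNode *n = &nodes[stack[--sp]];
        if (!ray_hits_box(n)) continue;
        if (n->left < 0) {
            if (ray_hits_tris(n)) return 1;  /* any hit = shadowed */
        } else {
            stack[sp++] = n->left;
            stack[sp++] = n->right;
        }
    }
    return 0;
}
```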
 
Laa Yosh said:
Undersampling? That'd look very ugly, even worse than shadow maps IMHO.
You don't need actual raytracing though, a subset of the functionality is enough to generate much better shadow maps than conventional renderers. And they would still be better when undersampled (hell, there's no such thing as a non-undersampled shadow map in realtime yet anyway).
 
...As mentioned elsewhere, you're trading brute force for bad memory access patterns, which is deathly slow on all architectures and always will be (at least until we figure out ways to use that whole quantum entanglement thingy). If the culling rate is massive, then you might get a win out of it, but that's not going to scale up that well with the number of rays -- you'll eventually hit a wall.
I've just realized my explanation wasn't very good. I wasn't thinking exactly of full raytracing of shadows - casting a ray into the scene, checking through objects to find what's hit, tracing a second ray towards the lightsource and again checking objects to see what's hit. I was thinking of RT (or raycasting, which other people seem to use more comfortably?) just for the shadow drawing part. For each pixel, all that's needed is position and normal, which can be derived at render time, I think. Thus for each pixel, the shadow rendering would only need a subset of objects evaluated, rather than worst-case n^2 for proper RT. And I'm sure some boffins could come up with clever solutions for the memory accessing the moment traced shadows are considered...

Also, appreciate that I'm not saying RT shadows are an ideal to replace all other shadow techniques. My use of 'better' for the technique is 'better suited for SPEs'. Butta talked about Edge solving shadow issues. Mr. Wibble said Edge is about leveraging SPEs, and probably won't help with shadows unless there's a cute way for them to help. He also said there's normally a lot of rasterization involved - well, for a shadow technique on SPEs, traced shadows won't have a rasterization problem. There'll be other issues (RAM access), but traced shadows on SPEs will be better suited to the hardware than shadow maps and volumes, I think, even if ultimately unusable because 'better' isn't 'good enough'!

So, my patented SPE-shadow system consists so far of :

1) position and normal info from GPU, rendered as a couple of texture buffers
2) a subsampled trace for each pixel to determine if shadowed
3) lots of memory accessing which needs some super clever solution
4) interpolated full-size shadows overdrawn on framebuffer
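In (entirely hypothetical) code, steps 1, 2 and 4 look something like the below; step 3 is hiding inside trace_to_light(), which is of course the hard part:

```c
typedef struct { float x, y, z; } V3;

extern V3  position_buf(int x, int y);    /* step 1: GPU-rendered buffers */
extern V3  normal_buf(int x, int y);
extern int trace_to_light(V3 pos, V3 n);  /* step 3: the hand-waved part  */

void shadow_tile(float *mask, int w, int h)
{
    /* step 2: trace every other pixel in each direction (1/4 the rays) */
    for (int y = 0; y < h; y += 2)
        for (int x = 0; x < w; x += 2)
            mask[y * w + x] =
                (float)trace_to_light(position_buf(x, y), normal_buf(x, y));

    /* step 4: fill skipped pixels from the nearest traced sample
     * (a real version would interpolate, and check normals/depth) */
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x)
            if ((x & 1) || (y & 1))
                mask[y * w + x] = mask[(y & ~1) * w + (x & ~1)];
}
```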

:yep2:

Now continue telling me how wrong I am :D
 
I've just realized my explanation wasn't very good. I wasn't thinking exactly of full raytracing of shadows - casting a ray into the scene, checking through objects to find what's hit, tracing a second ray towards the lightsource and again checking objects to see what's hit. I was thinking of RT (or raycasting, which other people seem to use more comfortably?) just for the shadow drawing part.
That's what I assumed you meant from the start. You're just greatly trivializing the nature of the problems. Not the least of which is the fact that you're talking about doing it in a game. You're being misguided by a combination of IBM's raytracing demos (which tell you "we can do this and nothing else at the same time") and games which have limited lighting arrangements so that the problem cases for shadow maps are just avoided.

And I'm sure some boffs could come up with clever solutions for the memory accessing the moment traced shadows are considered...
You're kidding yourself. The notion that every problem has a solution is patently false. Raytracing triangle meshes is slower than other types for a reason. The problems involved aren't strictly software, I hope you realize, which is why traced shadows won't even be considered on a scale that involves more than a few hundred rays per frame.

Especially when cleverness on top of OTHER already usable techniques can show a lot more promise for a lot less cost. You seem to be assuming that cannot be done and making the leap to raytraced shadows, which have never been usable.
 
Especially when cleverness on top of OTHER already usable techniques can show a lot more promise for a lot less cost. You seem to be assuming that cannot be done and making the leap to raytraced shadows, which have never been usable.
I'm not assuming anything - only throwing out (little-considered) ideas. If there are better matches for shadows on Cell, what are they? Answer Butta and Mr. Wibble directly instead of leaving me with my crazy notions!

The problem to be solved : Rough looking shadows.
The solution : To somehow use SPEs via an Edge software implementation

Is there a solution? Or will it never happen?
 
I'm lost as to why it must involve Cell or SPEs or Edge in some way. Edge, AFAICT, is about performance for the render path, not anything specific like shadows or AA or texture filtering which everybody likes to complain about.

Is there a solution now? Hell no. Someday? Maybe, and it'll have its share of problems. Will there be an absolute solution? Never.
 
I'm lost as to why it must involve Cell or SPEs or Edge in some way.
Because Edge is what the thread is about, and this is the direction the discussion took!

Butta said:
I just hope that Edge will help eliminate these horrible-looking low-res shadow maps used in games lately (e.g. Motorstorm, Heavenly Sword, etc.). I am not an owner of a 360; is there also a problem with low-res shadow maps on that system?
Mr Wibble said:
So far all the Edge tech seems to be relating to use of SPUs, or monitoring performance. These might have a knock-on effect in helping to improve things like rendering shadows, but otherwise it's not relevant
...
Alternatively perhaps there's a cute way to make SPUs generate good looking shadows instead of the GPU, but it seems like the wrong kind of task to throw at them to me - generally speaking there's going to be a whole lot of rasterising going on, and the GPU is going to be better at that.
 
The cute ideas I had pictured wouldn't really be SPE/Cell/Edge-specific in that they are serving to cover up the issues related to *existing* methods that are in use. Edge would simply mean it might perform better when putting some precomputation on the SPEs and thereby aid the GPU by doing a small share of the work -- maybe generate some data that the GPU can't because of being limited to 1 vertex at a time or what not. To have SPEs completely generate all shadows is a waste of time.
 
This is cool. I guess as I listen to it I will try to summarize for the people who don't want to listen to the whole hour-long session.



Mark gives an overview

Original goal was to create tech for 1st and 2nd party groups. But now they are working on making it available to all 3rd party developers.

There are 3 first party tech groups:
* The ICE team (World Wide Studios America) - Specializes in low-level graphics
* Advanced Technology Group (WWS EU) Support of EU 1st and 2nd Party
* Tools and Technology Group (World Wide Studios America)

Mark is a long term consultant to Sony and leads/created the ICE team.
John - ICE team lead architect
Vince - Advanced Technology Group - They created the GCM Replay profiling tool

GCM replay will be available to "licensed developers"
Captures RSX command buffers and allows you to analyze them offline.


Edge vs a full game engine: a full game engine does not work well for multi-platform devs. Edge tools are worthwhile even for MP devs optimizing for PS3.

Edge doesn't really mean anything. At one point it meant Efficiently Distributed renderinG Engine, but they decided they weren't making an engine.


Edge components:

* Animation engine - Bulk of system runs on SPUs. Very fast.
* Geometry system - SPU
* Skinning - SPU aids RSX (Warhawk NBA game use this)
* Triangle Culling - SPU (Lair and F1 game use this module.)
* Blend shapes - SPU
* Data compression - SPU - Zlib decompression on SPU for very high-speed streaming off of BR disc or HDD. Can decompress 40MB/sec using 25% of 1 SPU! (generic inflate sketch after this list)
* GCM replay
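For context, here's what a generic zlib inflate loop looks like in plain C. Edge's version runs as a SPURS job out of SPU local store, so this shows only the shape of the work, not their code:

```c
#include <stdio.h>
#include <string.h>
#include <zlib.h>

/* Streaming decompression: read compressed chunks, drain each one
 * through inflate(), write the output.  Generic zlib API usage. */
int stream_inflate(FILE *src, FILE *dst)
{
    unsigned char in[16384], out[16384];
    z_stream zs;
    memset(&zs, 0, sizeof(zs));
    if (inflateInit(&zs) != Z_OK) return -1;

    int ret = Z_OK;
    while (ret != Z_STREAM_END) {
        zs.avail_in = (uInt)fread(in, 1, sizeof(in), src);
        if (zs.avail_in == 0) break;             /* truncated stream */
        zs.next_in = in;
        do {                                     /* drain this chunk */
            zs.avail_out = sizeof(out);
            zs.next_out  = out;
            ret = inflate(&zs, Z_NO_FLUSH);
            if (ret == Z_NEED_DICT || ret == Z_DATA_ERROR ||
                ret == Z_MEM_ERROR || ret == Z_STREAM_ERROR) {
                inflateEnd(&zs);
                return -1;
            }
            fwrite(out, 1, sizeof(out) - zs.avail_out, dst);
        } while (zs.avail_out == 0);
    }
    inflateEnd(&zs);
    return ret == Z_STREAM_END ? 0 : -1;
}
```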


Aside from GCM replay, which has some IP issues, all other systems will be released with full source code.

SPU code written as SPURS jobs in C using intrinsics. PPU code and tools written in C with a touch of C++.

Animation system - Primary Goal: Offload as much work from the PPU as possible.

Comes with some tools for creating Collada-compatible assets



Vince on Geometry System:

Runtime component and Tools component.

Primary mode of usage:
* Use offline tools to prepare objects
* With offline tools, split geometry into vertex sets small enough to fit on an SPU, leaving enough room to process them. 500-1500 vertices.
* Format is indexed triangle lists - the best format for RSX
* Geometry system works on vertex sets as discrete entities using 1 or more SPUs

Secondary mode:
* Don't use the offline tools to cut up geometry
* Use streaming to move data in small pieces to the SPU
* Triangle culling can't work in this mode.

Pipeline:
* 1st stage of pipeline decompresses vertices. Accepts vertex arrays interleaved with data. Separates data into tables of floats. Supports all native RSX formats. Also preserves ability to have RSX process data directly.

* 2nd stage decompresses index data. Indexed triangle lists (because this is the best for RSX).
Optimizing for the mini cache on the RSX is often the most important factor to consider when constructing index data.
Index data is highly compressible. 6.5x more triangles in the same amount of space with index compression.
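A plausible illustration of why (this is a generic zigzag/varint scheme, not Edge's actual codec): cache-optimized index lists mostly reference recently used vertices, so the index-to-index deltas are tiny and pack into a byte or less each instead of raw 16 bits:

```c
#include <stddef.h>
#include <stdint.h>

/* Delta + zigzag + varint pack of a 16-bit index list.  Returns the
 * packed size in bytes (compare with n * 2 for the raw list). */
size_t pack_indices(const uint16_t *idx, size_t n, uint8_t *out)
{
    size_t   o = 0;
    uint16_t prev = 0;
    for (size_t i = 0; i < n; ++i) {
        int32_t  d  = (int32_t)idx[i] - (int32_t)prev;
        /* zigzag-map the signed delta to unsigned (small either way) */
        uint32_t zz = ((uint32_t)d << 1) ^ (uint32_t)(d >> 31);
        while (zz >= 0x80) {          /* varint: one byte when small */
            out[o++] = (uint8_t)(zz | 0x80);
            zz >>= 7;
        }
        out[o++] = (uint8_t)zz;
        prev = idx[i];
    }
    return o;
}
```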

* (Optional stage) Blend shapes/additive vertex blending.
* (Optional stage) 4-bone matrix palette skinning. Most teams do skinning on SPU. 2 very large benefits:
1. Reduce the length of the vertex program.
2. Save time in RSX reading the vertices and weights. 30-70% speed boost over RSX.
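For reference, the work being pulled off the vertex program is the classic weighted matrix blend; a plain-C sketch (the real thing is SIMD SPU code, and the row-major 3x4 matrix layout here is my assumption):

```c
typedef struct { float m[3][4]; } Mat34;   /* rotation + translation */

/* 4-bone matrix-palette skinning of one position. */
void skin_position(const Mat34 *palette,        /* bone matrices      */
                   const unsigned char bone[4], /* 4 bone indices     */
                   const float weight[4],       /* weights sum to 1   */
                   const float in[3], float out[3])
{
    out[0] = out[1] = out[2] = 0.0f;
    for (int b = 0; b < 4; ++b) {
        const Mat34 *M = &palette[bone[b]];
        float w = weight[b];
        for (int r = 0; r < 3; ++r)
            out[r] += w * (M->m[r][0] * in[0] + M->m[r][1] * in[1] +
                           M->m[r][2] * in[2] + M->m[r][3] /* translation */);
    }
}
```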

* (Optional stage) Triangle Culling. SPU culls so that only triangles which can actually be rendered are sent to RSX. Multi-sampling creates some complications - but we can still cull "pretty good"
Overall performance improvement (from culling) in a balanced scene is 10-20%
Reduces the pressure to create an LOD technology in your projects
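The kinds of per-triangle tests involved look roughly like this (illustrative plain C, not Edge source; assumes counter-clockwise front faces and vertices already clipped to the near plane):

```c
typedef struct { float x, y, w; } ClipVtx;  /* post-transform, pre-divide */

/* Returns 0 if RSX never needs to see this triangle. */
int keep_triangle(ClipVtx a, ClipVtx b, ClipVtx c,
                  float half_w, float half_h)  /* viewport half-size */
{
    /* project to screen space */
    float ax = a.x / a.w * half_w, ay = a.y / a.w * half_h;
    float bx = b.x / b.w * half_w, by = b.y / b.w * half_h;
    float cx = c.x / c.w * half_w, cy = c.y / c.w * half_h;

    /* signed area <= 0: backfacing or degenerate */
    float area2 = (bx - ax) * (cy - ay) - (cx - ax) * (by - ay);
    if (area2 <= 0.0f) return 0;

    /* crude zero-coverage reject: bounding box within one pixel column.
     * With multisampling the sample grid is denser -- hence the
     * "complications" mentioned above. */
    float minx = ax < bx ? (ax < cx ? ax : cx) : (bx < cx ? bx : cx);
    float maxx = ax > bx ? (ax > cx ? ax : cx) : (bx > cx ? bx : cx);
    if ((int)minx == (int)maxx) return 0;

    return 1;
}
```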

* Final stage, prepare data the RSX will use. Convert everything into RSX accepted formats


Offline tools:
*Collada-to-binary converter
*Collada parser
*Geometry partitioner
*Cache optimizer


Geometry Processing on SPUs:
* A lot of data
* Double buffering is simple but takes up a lot of space.
* Edge uses a single or ring buffer JIT strategy.
* SPU generates data in same frame RSX consumes it.
* RSX almost never waits on SPU - In the rare event it does, the correct synchronization will take place.


Intrinsics.
* Preferred over hand-tuned ASM.
* 20x faster than naive C code
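A taste of why (hedged sketch; needs an SPU toolchain like spu-gcc to build): one spu_madd retires four multiply-adds per instruction and the data stays in 128-bit registers, where naive scalar C does one float at a time:

```c
#include <spu_intrinsics.h>

/* out[i] = a[i] * s + b[i], four floats per iteration. */
void scale_add(vec_float4 *out, const vec_float4 *a,
               const vec_float4 *b, float s, int quads)
{
    vec_float4 vs = spu_splats(s);          /* broadcast scalar to lanes */
    for (int i = 0; i < quads; ++i)
        out[i] = spu_madd(a[i], vs, b[i]);  /* fused multiply-add        */
}
```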


Test case examples:
* In general 1 SPU can process 750k triangles per frame at 60FPS while hopefully culling 60% of the triangles.



John does animation system demo

We have all seen this demo already. It's the Getaway-looking one.


John talks on GCM replay

Optimization tool for RSX

2 parts:
* A runtime library
* A Windows app

(Hard to describe; it's obviously a very visual graphical analysis tool.)
(This guy seems to be allergic to something in the room; he is constantly on the verge of sneezing)




Q & A Session:

Q: Are you holding back the best stuff for 1st parties?
A: We are giving you awesome stuff, sit back down

Q: There was stuff blurred out in the presentation. Why?
A: This is an open forum and not everyone is a licensed dev.

Q: Will Sony take input from 3rd parties on further Edge improvements?
A: Yes

Q: Fragment shader debugging tool, can we click on a pixel and debug it?
A: No the tool is not at that stage

Q: The performance gained, was it total frame rate or number of draw calls?
A: It shows both

Q: The 20% performance increase from using Edge animation. What kind of gain for non-skinned characters?
A: 10% or low teens. But it depends...


Q: The last demo, how many SPUs were being used?
A: 5 SPUs, 1.5mil output triangles. Several mil input

Q: When can we get the tool? And the documentation
A: This month for both
 
Q: The last demo, how many SPUs were being used?
A: 5 SPUs, 1.5mil output triangles. Several mil input
Thanks for the write up. This bit of info is important. The scene is several million triangles per frame, with 1.5 M output to RSX. That's a lot of geometry!
 
Thanks inefficient for writing down the details.

* (Optional stage) Blend shapes/additive vertex blending.
* (Optional stage) 4-bone matrix palette skinning. Most teams do skinning on SPU. 2 very large benefits:
1. Reduce the length of the vertex program.
2. Save time in RSX reading the vertices and weights. 30-70% speed boost over RSX.

* (Optional stage) Triangle Culling. SPU culls so that only triangles which can actually be rendered are sent to RSX. Multi-sampling creates some complications - but we can still cull "pretty good"
Overall performance improvement (from culling) in a balanced scene is 10-20%
Reduces the pressure to create an LOD technology in your projects

* Final stage, prepare data the RSX will use. Convert everything into RSX accepted formats
Is there any reason to use non-trivial vertex shaders on RSX then?!? If I understood right, this is a single SPU processing vertices right down to visibility detection, handily beating RSX.
Test case examples:
* In general 1 SPU can process 750mil triangles per frame at 60FPS while hopefully culling 60% of the triangles.
Whooooha. I take it this is 750mil per second? Even then I have a hard time believing it. Maybe 750k per frame?
Q: The last demo, how many SPUs were being used?
A: 5 SPUs, 1.5mil output triangles. Several mil input
Now that's quite underachieving compared to the previous numbers?
 
Now that's quite underachieving compared to the previous numbers?

That's 1.5Mil on screen. If 60% were culled, the scene was originally 3.75 Mil triangles. That is exactly (almost too exactly) in line with the 750k triangles per SPU over 5 SPUs he mentioned earlier.
 
Thanks for the write up. This bit of info is important. The scene is several million triangles per frame, with 1.5 M output to RSX. That's a lot of geometry!

While it is very impressive for Cell, the technique eats almost all of Cell's computational power to get a 19% increase on RSX. I expect them to come up with approximate culling algorithms that will put less stress on Cell with reasonable performance improvements on RSX.

And the fact that they haven't included the input triangle count in the presentation is a little suspicious.
 