If your assumption is wrong then how can anyone be expected to understand it?
Sorry, I don't know why others understand it and can tell me that I'm wrong, while you did not understand. I'll try to put it more simply.
What is that supposed to mean? You're basically saying, "It doesn't save anything but it does".
Maybe you don't get it because you read too much into my words. I just said it doesn't save more than on other architectures, but to have it you have to do additional work. Additional work that you don't need to do on other systems is called overhead.
It's not just overhead, it's an additional load equal to 40% of the big g-buffer BW load that you're going on about per pixel per light.
Additional work, but it's not overhead, sure.
You're only duplicating setup work for geometry in objects that overlap both tiles. Everything else is identical to FR.
If you use tiled rendering, you don't set up two command buffers; you have one buffer that is executed twice. Even if an object doesn't overlap all tiles, the command buffer needs to be executed for every tile so that the states are right at all times. Only the draw calls are skipped, not their setup.
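To make that concrete, here is a toy model, not real Xbox 360 code. The `Cmd` class, the `replay` function, the object bounds and the two side-by-side tiles are all made up for illustration. It only counts how often state setup and draws happen when one command buffer is replayed per tile.

```python
from dataclasses import dataclass

@dataclass
class Cmd:
    name: str
    x0: int  # left edge of the object's screen bounds, in pixels (made up)
    x1: int  # right edge, exclusive

def replay(cmds, tile):
    """Replay the whole buffer for one tile; return (state_setups, draws)."""
    t0, t1 = tile
    setups = draws = 0
    for c in cmds:
        setups += 1                   # state setup runs for every command
        if c.x0 < t1 and c.x1 > t0:   # the draw is issued only on overlap
            draws += 1
    return setups, draws

# Three hypothetical objects and two hypothetical tiles splitting the screen.
cmds = [Cmd("left", 0, 300), Cmd("middle", 500, 800), Cmd("right", 900, 1280)]
tiles = [(0, 640), (640, 1280)]

results = [replay(cmds, t) for t in tiles]
total_setups = sum(s for s, _ in results)
total_draws = sum(d for _, d in results)
print(total_setups, total_draws)
```

In this sketch only the "middle" object is drawn twice, but the setup for all three objects runs once per tile.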
No, I said that tiling is the only disadvantage. You've been arguing about other disadvantages of DR on 360.
No? Didn't I say that tiling and dumping EDRAM to main memory is a problem? I don't know what you took from my first post in this thread, but that is what I tried to say.
This is a silly way to do things. Alpha geometry can be rendered between tiles after all the lights are accumulated, just like people always do with tiled forward rendering. You already have to sort alpha geometry, so flagging tiles is extremely low cost by comparison.
I thought you were referring to KZ2's way of doing it, where it's impossible to do object motion blur with alpha-blended stuff already on top of the shaded geometry. The only way I know is:
render g-buffer
shade
motion blur
alpha
It's only silly if you don't think about it.
Comparing DR and FR is not that simple. If you have a lot of local dynamic lights, it doesn't matter if you need more tiling, because DR can still be faster. At the same time, you can have situations where even on RSX, DR will be slower because it needs so much more BW per pixel.
Having local dynamic lights is the best case for DR; the worst case would be global lights. Handling global lights deferred is a real bandwidth waste, while local lights only touch the areas that are affected by them. But again, I still think it would run faster without DR on the X360. Assigning x lights per object and using a shader with dynamic branching, so that the number of light-loop iterations is dynamic, shouldn't be slower if your geometry is of reasonable size. Rendering the whole scene in one draw call with x lights would of course be maximum overhead.
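A rough back-of-the-envelope sketch of the local vs. global light point. All numbers are assumptions for illustration: 8 bytes of g-buffer per pixel, eight lights, and 100x100 pixels of coverage per local light. It only counts g-buffer reads during shading and ignores accumulation writes, depth and caches.

```python
def deferred_light_read_bytes(light_pixel_counts, gbuffer_bytes_per_pixel):
    """G-buffer bytes read while shading: each light reads the
    g-buffer once for every pixel it covers."""
    return sum(light_pixel_counts) * gbuffer_bytes_per_pixel

SCREEN_PIXELS = 1280 * 720
GBUFFER_BPP = 8  # assumed: one 4-channel FP16 target

local_lights = [100 * 100] * 8        # eight small lights (made-up coverage)
global_lights = [SCREEN_PIXELS] * 8   # eight lights covering the full screen

local_bytes = deferred_light_read_bytes(local_lights, GBUFFER_BPP)
global_bytes = deferred_light_read_bytes(global_lights, GBUFFER_BPP)
print(local_bytes, global_bytes)
```

With these made-up numbers the global lights read about 92 times as much g-buffer data as the local ones, which is why full-screen lights are the bad case for DR.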
You don't get it. Even with forward rendering you need two tiles for 1280x720 @ 2xAA. Barbarian's suggestion has the same tiling requirement.
So why the heck is he saying:
So, for the G-buffer you'd need just one 4-channel FP16 render target, and then for the light accumulation you'd need 2 RGBA8 render targets. The EDRAM can fit that for a slightly non-HD resolution of 1200x720.
You're right, I don't get it. I thought he was talking about "a slightly non-HD resolution" so that "the EDRAM can fit" it.
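For what it's worth, the numbers being argued about can be checked with a quick sketch. The 10 MB EDRAM size is the known Xbox 360 figure. Everything else is an assumption on my side: a 32-bit depth buffer that stays in EDRAM, the g-buffer and the light accumulation targets not being resident at the same time, and no tile alignment rules. `tiles_needed` is a hypothetical helper, not an SDK call.

```python
import math

EDRAM_BYTES = 10 * 1024 * 1024  # Xbox 360 EDRAM: 10 MB

def tiles_needed(width, height, bytes_per_sample, samples=1):
    """Return (tile count, total bytes) for a set of render targets."""
    total = width * height * bytes_per_sample * samples
    return math.ceil(total / EDRAM_BYTES), total

# Forward rendering: RGBA8 colour (4) + 32-bit depth (4), 2xAA.
fr_tiles, fr_bytes = tiles_needed(1280, 720, 8, samples=2)

# G-buffer pass: one 4-channel FP16 target (8) + depth (4), no AA.
gb_tiles, gb_bytes = tiles_needed(1200, 720, 12)

# Light accumulation pass: two RGBA8 targets (8) + depth (4), no AA.
la_tiles, la_bytes = tiles_needed(1200, 720, 12)

# The same g-buffer layout at full 1280x720 for comparison.
gb720_tiles, gb720_bytes = tiles_needed(1280, 720, 12)

print(fr_tiles, gb_tiles, la_tiles, gb720_tiles)
```

Under these assumptions forward rendering at 1280x720 with 2xAA needs two tiles, the 1200x720 passes each fit in one, and the same layout at 1280x720 would spill into a second tile.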
Basically, this method of DR has no extra tiling overhead. It does require two full passes of geometry, though, even on RSX, but many people do a z-prepass anyway in FR and DR so this isn't really a disadvantage.
Only if you use a lower resolution. But again, "if you lower the resolution, you can of course save the tiled overhead"; you still have other overhead due to the Xbox architecture.
And like I explained above, there is a limited number of lights affecting each object, and they can be handled very well by the shader units. You won't save much by doing that deferred, probably less than the overhead you'd add.
I think on RSX it's the other way around: generating all the needed shaders or using dynamic branching is a big hit, while deferred rendering simplifies the rendering by avoiding these problems. Because there is no reloading of the buffers x times, it has less overhead and should be faster in most cases.