Half-Life 2 XB360 won't be using tiling to achieve A.A.

inefficient said:
I guess the argument to that is that Valve does not have to re-write the "whole" engine. Just "part" of the renderer.

And this would not be re-writing the renderer to make an amazing next-gen game. They would merely be trying to shoehorn a fairly dated game to run on a new console.

And I don't think it's outrageous for fans to have expected that this sort of shoehorning should be possible. After all, the game was running on 9800-class hardware at ~60fps with 4x FSAA.


As I said, HL2 on 360 will sell like strawberries in summer whether it's a perfect port or not. Of course Valve will try to do as little work as possible. That's always been the case, it's not just Valve, and it's accepted these days.

Only a very, very elite group of developers spends a lot of resources on making sure that a port of a very popular game is "critically acclaimed" as a port in itself. Ports are 99% of the time "rushed jobs" just to cash in on bigger userbases.
 
nAo said:
To handle tiling, rewriting your renderer might not be enough, as you also need to take care of your data as well.
Let's say your level is, for some stupid reason, a single gigantic mesh... then you're in big trouble with tiling (and even without it :) )

I'm guessing this is because you'd have to submit the whole mesh for each tile. But can you not dynamically split your geometry based on the tile boundaries at run-time, or are you stuck working at the level of granularity provided by the artists?
 
Titanio said:
I'm guessing this is because you'd have to submit the whole mesh for each tile. But can you not dynamically split your geometry based on the tile boundaries at run-time, or are you stuck working at the level of granularity provided by the artists?
Most of the time you want to statically split your geometry with some offline tool/level compiler, according to some heuristic that is good for your game/level/platform and bla bla bla :)
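
To make the "offline tool/level compiler" part concrete: the dumbest possible version of such a splitting pass could just bucket triangles by centroid on a grid. This is a made-up sketch, not any particular tool's code; real heuristics are obviously game-specific.

Code:
#include <cmath>
#include <cstdint>
#include <map>
#include <vector>

struct Vec3 { float x, y, z; };

// Each chunk becomes its own mesh/draw batch in the compiled level data.
struct Chunk { std::vector<std::uint32_t> indices; };

std::map<std::uint64_t, Chunk> splitIntoChunks(const std::vector<Vec3>& positions,
                                               const std::vector<std::uint32_t>& indices,
                                               float chunkSize)
{
    std::map<std::uint64_t, Chunk> chunks;
    for (std::size_t i = 0; i + 2 < indices.size(); i += 3) {
        const Vec3& a = positions[indices[i]];
        const Vec3& b = positions[indices[i + 1]];
        const Vec3& c = positions[indices[i + 2]];
        // Pick the grid cell containing the triangle's centroid (2D grid on x/z here).
        float cx = (a.x + b.x + c.x) / 3.0f;
        float cz = (a.z + b.z + c.z) / 3.0f;
        int gx = static_cast<int>(std::floor(cx / chunkSize));
        int gz = static_cast<int>(std::floor(cz / chunkSize));
        std::uint64_t key = (static_cast<std::uint64_t>(static_cast<std::uint32_t>(gx)) << 32)
                          | static_cast<std::uint32_t>(gz);
        Chunk& chunk = chunks[key];
        chunk.indices.push_back(indices[i]);
        chunk.indices.push_back(indices[i + 1]);
        chunk.indices.push_back(indices[i + 2]);
    }
    return chunks;
}

Chunk size is the obvious knob: bigger chunks mean fewer batches, smaller chunks mean less redundant work per tile.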
 
nAo said:
To handle tiling, rewriting your renderer might not be enough, as you also need to take care of your data as well.
Let's say your level is, for some stupid reason, a single gigantic mesh... then you're in big trouble with tiling (and even without it :) )

Marco

And unfortunately in the PC world bigger meshes are better because of the batch overhead.
 
nAo said:
Most of the time you want to statically split your geometry with some offline tool/level compiler, according to some heuristic that is good for your game/level/platform and bla bla bla :)

Wow, I had no idea that was generally an offline process. I always assumed that you'd only be submitting redundant geometry that was explicitly straddling tile boundaries, but I guess in such a scenario you're potentially submitting much more than that redundantly.

Regardless of the granularity you can work with, though, this determination (what geometry to submit for what tiles) happens on the CPU..? Do you have to transform your geometry to screen space to work that out?
 
Titanio said:
Regardless of the granularity you can work with, though, this determination (what geometry to submit for what tiles) happens on the CPU..?
Yes, since usually every frame a rendering command list is built by the CPU(s).
Do you have to transform your geometry to screen space to work that out?
It's not necessary; you don't need to compute an exact solution as long as you make use of some conservative test (bounding spheres, bounding boxes, etc.).
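
For illustration only (made-up structures, not any particular engine or SDK), a conservative test like that might look roughly like this: project the object's view-space bounding box to a screen-space rectangle and check it against each tile's rectangle.

Code:
#include <algorithm>

struct Vec3 { float x, y, z; };
struct Rect { float x0, y0, x1, y1; };   // screen-space rectangle in pixels

// Project the 8 corners of a view-space AABB and take the screen-space min/max.
// While the box stays in front of the near plane this bound contains everything
// the object can cover; anything straddling the near plane you'd simply treat
// as "touches every tile" to stay conservative.
Rect projectAabb(const Vec3& mn, const Vec3& mx,
                 float focalX, float focalY,     // projection scale factors
                 float halfW, float halfH)       // half resolution in pixels
{
    Rect r = { 1e9f, 1e9f, -1e9f, -1e9f };
    for (int i = 0; i < 8; ++i) {
        Vec3 c = { (i & 1) ? mx.x : mn.x,
                   (i & 2) ? mx.y : mn.y,
                   (i & 4) ? mx.z : mn.z };
        float z  = std::max(c.z, 0.001f);        // crude near-plane guard
        float sx = c.x / z * focalX * halfW + halfW;
        float sy = c.y / z * focalY * halfH + halfH;
        r.x0 = std::min(r.x0, sx);  r.x1 = std::max(r.x1, sx);
        r.y0 = std::min(r.y0, sy);  r.y1 = std::max(r.y1, sy);
    }
    return r;
}

// Submit the object for every tile whose rectangle its bound overlaps.
bool touchesTile(const Rect& obj, const Rect& tile)
{
    return obj.x0 < tile.x1 && obj.x1 > tile.x0 &&
           obj.y0 < tile.y1 && obj.y1 > tile.y0;
}

Anything the test lets through redundantly only costs an extra submit, so how sloppy the bound is allowed to be is a straight trade between CPU time and redundant GPU work.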

Marco
 
nAo said:
Yes, since usually every frame a rendering command list is built by the CPU(s).

It's not necessary; you don't need to compute an exact solution as long as you make use of some conservative test (bounding spheres, bounding boxes, etc.).

Cheers, makes sense. I was thinking you'd use an approximate but conservative approach, but that would likely increase the amount of redundant data being passed. I suppose the cost of working things out exactly may outweigh that, depending on whether you have lots of CPU power to spare or not, etc.
 
Titanio said:
Cheers, makes sense. I was thinking you'd use an approximate but conservative approach, but that would likely increase the amount of redundant data being passed. I suppose the cost of working things out exactly may outweigh that, depending on whether you have lots of CPU power to spare or not, etc.
Let's say you have a lot of computational power and you want to use it to preprocess all the geometry you send to the GPU... then you might also use it to cull away back-face-culled/viewport-culled triangles... at a per-triangle level :)
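
Just to show what that per-triangle test amounts to, a rough sketch (made-up structures, assuming the vertices are already transformed into view space with the camera at the origin looking down +z):

Code:
#include <cstdint>
#include <vector>

struct Vec3 { float x, y, z; };

static Vec3 sub(const Vec3& a, const Vec3& b)  { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
static Vec3 cross(const Vec3& a, const Vec3& b)
{
    return { a.y * b.z - a.z * b.y,
             a.z * b.x - a.x * b.z,
             a.x * b.y - a.y * b.x };
}
static float dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Copy only front-facing triangles into a new index list that gets handed to
// the GPU; everything back-facing is dropped before it costs any GPU work.
std::vector<std::uint16_t> cullBackFaces(const std::vector<Vec3>& viewPos,
                                         const std::vector<std::uint16_t>& indices)
{
    std::vector<std::uint16_t> out;
    out.reserve(indices.size());
    for (std::size_t i = 0; i + 2 < indices.size(); i += 3) {
        const Vec3& a = viewPos[indices[i]];
        const Vec3& b = viewPos[indices[i + 1]];
        const Vec3& c = viewPos[indices[i + 2]];
        Vec3 n = cross(sub(b, a), sub(c, a));   // face normal (winding-dependent)
        // With the eye at the origin, the triangle faces us when the vector
        // from the eye to the triangle points against the normal.
        if (dot(n, a) < 0.0f) {
            out.push_back(indices[i]);
            out.push_back(indices[i + 1]);
            out.push_back(indices[i + 2]);
        }
    }
    return out;
}

Whether that is ever a win is exactly the question below: you spend CPU time and bandwidth to save the GPU work it does almost for free.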
 
I guess if that's an easy by-product of that approach it could further save you on the GPU side then? But I assume it wouldn't save much shading, given that culling is done pre-shading anyway? Would it be worth the additional expense (even if small) to spare the GPU the culling work (and a little bandwidth)?
 
If we talk theoretically for a moment, let's say you have a finite GPU and unlimited CPU cycles to spare (but you're not writing a software rasterizer :p). How much savings could be obtained by being as totally efficient on the triangle selection as possible? Compared to letting something like a 7800 receive a big batch job of all the triangles and doing z-sorts and culls and drawing redundant triangles, if you only fed it the triangles that were visible, you'd have a saving on things like pixel shading cycles. If for every visible triangle you also draw one that never gets shown, by being totally efficient you'd be doubling your effective usable shader length.

So what's the 'wastage' on GPU resources that could be saved? Is it something in the order of a few percent? Or as much as halving draws? If the savings are substantial, I'm envisaging something like 2xSSAA being possible on a typical scene. Or maybe selective higher-resolution supersampling for pixels on edge boundaries. Is this a way the CPU could help contribute to IQ, not by rendering pixels but by easing the workload on the GPU so it has more room to spare on its job?
 
predicate said:
Is it possible that the reply is incomplete in explanation? For example, Epic aren't using it because MSAA does not work well with the graphical techniques they're using. Is it possible that there are similar conflicts between tiling and techniques Valve are using?

I'm pretty sure Valve is already using a custom technique on the PC. Originally, AA didn't work on Halflife 2. Now it works if you force it from the drivers or do it in game, but I believe there was an interview with Valve (around the time of HL2's launch, + or - 1 year) that said the option in game only applied AA to select surfaces, and offered performance about 20% to 30% better than forcing AA globally.

Compared to letting something like a 7800 receive a big batch job of all the triangles and doing z-sorts and culls and drawing redundant triangles, if you only fed it the triangles that were visible, you'd have a saving on things like pixel shading cycles. If for every visible triangle you also draw one that never gets shown, by being totally efficient you'd be doubling your effective usable shader length.

The Halflife 2 engine already supports this option. It isn't really important though, as the computers with GPUs that can't breeze through this game's polygon counts generally don't have CPUs able to handle the load either. Maybe it would give a slight improvement on a GMA900, switching over from the CPU rendering the polys to the CPU checking if the polys need to be rendered (and then rendering them).
 
Fox5 said:
I'm pretty sure Valve is already using a custom technique on the PC. Originally, AA didn't work on Halflife 2. Now it works if you force it from the drivers or do it in game, but I believe there was an interview with Valve (around the time of HL2's launch, + or - 1 year) that said the option in game only applied AA to select surfaces, and offered performance about 20% to 30% better than forcing AA globally.

MSAA worked out of the box for HL2... err, right off Steam. I played HL2 on day 1 and never had any AA problems--just enabled it in the HL2 video options. They don't use a custom technique in the PC version of HL2.
 
Sorry to go off topic, but does anyone know if Carmack's new rendering engine will use predicated tiling (and "free" AA) since he has stated the 360 will be the primary development platform?

I guess what I'm really wondering is: How well would a game/rendering engine designed for the predicated tiling system of the 360 work on a typical PC platform? Would it even work at all?

What is nutella?
 
Acert93 said:
MSAA worked out of the box for HL2... err, right off Steam. I played HL2 on day 1 and never had any AA problems--just enabled it in the HL2 video options. They don't use a custom technique in the PC version of HL2.

Just checked, they were having problems getting AA working properly before Halflife 2 came out, but by the time it did come out (perhaps due to the one-year delay) it was working.
 
They had problems with their texture atlases and multisampling, but a bit of centroid sampling here and there saved the day :)
 
nAo said:
They had problems with their texture atlases and multisampling, but a bit of centroid sampling here and there saved the day :)

It said they used pixel shaders on nvidia cards, which don't support centroid sampling.

Actually, I'm not even sure now if it was done for performance, though it may have helped anyhow. Anyone with an old R300-series card or GeForce FX want to compare forcing AA in the control panel versus doing it in game? (And if pixel shaders could be used to bound the atlases, could they be used to selectively apply AA?)
 
Titanio said:
Wow, I had no idea that was generally an offline process. I always assumed that you'd only be submitting redundant geometry that was explicitly straddling tile boundaries, but I guess in such a scenario you're potentially submitting much more than that redundantly.

Regardless of the granularity you can work with, though, this determination (what geometry to submit for what tiles) happens on the CPU..? Do you have to transform your geometry to screen space to work that out?

No it doesn't actually work that way on Xenon.
Although the exact amount of work on CPU vs GPU is somewhat dependent on your overall approach.

The Xenon GPU has a mechanism to compute extents of an object during rendering, the results of which can be written back to the commandlist. You can also predicate parts of the commandlist based on state, so with the use of predication and the extents tests you can just keep resubmitting the same commandlist with different predication states, and it will skip over the unneeded commands for a specific pass. I can for example have several different shaders embedded in a commandlist for a particular primitive group, and use predication to enable or disable them for the different passes.

How efficient this is depends on the granularity of the submitted primitives (there is a sweet spot), and on the overall approach to tiling. It should be noted that on most modern GPUs, if a tri is outside of the clipping region, you still pay the transform cost but you don't pay any pixel cost, and discarding tens of thousands of polygons this way is surprisingly cheap.
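
To give a rough feel for the shape of that (this is just a CPU-side toy analogy of the idea, not the real 360 API): the command list is built once, each batch carries a conservative screen extent, and every tile pass walks the same list and skips whatever doesn't touch the current tile. On the real hardware the extents test and the skipping happen GPU-side via predication; the plain bool check below just stands in for that.

Code:
#include <vector>

struct Rect { int x0, y0, x1, y1; };

struct DrawCommand {
    int  batchId;     // which primitive group / shader setup this represents
    Rect extents;     // conservative screen-space bounds for the batch
};

static bool overlaps(const Rect& a, const Rect& b)
{
    return a.x0 < b.x1 && a.x1 > b.x0 && a.y0 < b.y1 && a.y1 > b.y0;
}

// Stand-in for the GPU consuming the (unchanged) command list again.
static void submit(const DrawCommand& cmd) { (void)cmd; /* issue the draw */ }

void renderTiled(const std::vector<DrawCommand>& commandList,
                 const std::vector<Rect>& tiles)
{
    for (const Rect& tile : tiles) {                 // e.g. 3 tiles for 720p + 4xMSAA
        for (const DrawCommand& cmd : commandList) { // same list every pass
            if (overlaps(cmd.extents, tile))         // "predication" for this tile
                submit(cmd);
            // else: skipped -- the cost is walking past it, not drawing it
        }
    }
}

The granularity sweet spot falls out of this directly: a few huge batches and everything touches every tile, thousands of tiny ones and walking/predicating the list starts to cost more than it saves.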
 
Maybe this is a clue into Valve's approach:

http://interviews.teamxbox.com/xbox/1190/Xbox-360-Interview-Todd-Holmdahl/p4/

In the specs, it is always mentioned 4X MSAA, but can developers choose a higher order anti-aliasing for their games?

Todd Holmdahl:
The hardware supports up to 4X MSAA. However, developers can perform supersampling and hence render to a larger frame buffer as an alternative. This would be done in software. An important thing to understand is that most games end up turning anti-aliasing off due to the performance penalties from standard architectures. With Xbox 360 we designed the GPU from the ground up so that enabling anti-aliasing would not create any performance hit for developers.

Obviously supersampling is going to be a pretty big performance penalty across the board, but it does raise the question (again): can Xenos bypass the eDRAM? Holmdahl's comment about "larger frame buffer" for supersampling as an alternative to MSAA is interesting.
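
To put rough numbers on that (assuming a 1280x720 target with 32-bit colour and 32-bit Z): 2x2 supersampling means a 2560x1440 buffer, i.e. 2560 x 1440 x 8 bytes, roughly 28 MB, which is the same sample count (and about the same footprint) as 1280x720 with 4xMSAA. Either way you're well past the 10 MB of eDRAM, so "render to a larger frame buffer" in software would still seem to imply tiling, or putting the oversized buffer somewhere other than the eDRAM.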

Any thoughts? ERP? Fran? Others?
 
Acert93 said:
Maybe this is a clue into Valve's approach:

http://interviews.teamxbox.com/xbox/1190/Xbox-360-Interview-Todd-Holmdahl/p4/



Obviously supersampling is going to be a pretty big performance penalty across the board, but it does raise the question (again): can Xenos bypass the eDRAM? Holmdahl's comment about "larger frame buffer" for supersampling as an alternative to MSAA is interesting.

Any thoughts? ERP? Fran? Others?
That does seem like it could be feasible (if there aren't any technical hurdles outside of performance)... seeing as HL2 is kind of old at this point and doesn't take a monster to run at 1600x1200 or higher res now. I'm not sure how likely it is, but it seems like it could be possible for something like HL2.
 