*spin-off* idTech Related Discussion

Since the RSX has double the number of fragment shaders of a 7600, as well as more vertex shaders, and other enhancements, I think it's a bit misleading to call it "a 7600-derivative". In fact, I'd be surprised if the 8600 was as fast as RSX. (The 8600 is certainly slower than most 7900's, which are arguably most similar to RSX.) Of course, the 8600 didn't launch until months after the PS3, and probably years after the first prototypes of the PS3 were made available to developers, so it's a moot point...

I'd say a 7800 / 7600 hybrid would be more accurate. It's clocked between the 7800 GTX and the 7800 Ultra, with the same number of shader/texture units as the 7800 but the same number of ROPs and memory bus width as the 7600.

I imagine a full 8600GT would have been faster, especially in a console environment where its more flexible architecture could have been fully leveraged.
 
Tech 5 does not have to use Megatexture and outdoor free-roaming worlds, so both dataset size and production budget are scalable.

You can certainly build a simple level-based shooter with it, even an indoor one (like Doom) and if you don't have to bake the lighting into the texture atlas then you can even have tiling to some extent (although I guess that if you hold back on stamping the hell out of your level, then the game assets can be compressed significantly).

So what do you still get?
- full multiplatform support with almost literally a single button press
- 720p + 60 fps capability with proper optimization and art assets
- data streaming engine (I guess they have it for animation, sound and other data too)
- most of the usual bells and whistles like post processing engine, vehicle physics, nice editors

It's still a good looking package that could be competitive with UE3, at least the way I see it...
 
Laa-Yosh, locally variable compression and tiling support will destroy uniformity and predictability, thus remove all the performance benefits on GPU. It will most likely reintroduce texture budget (or some equivalent) concept as well.
 
Is having a bigger game really that much of a disadvantage? What's the opportunity cost of developing a better game more easily and with nearly free multiplatform support? It may make digital distribution harder, and those sales may suffer, but how much are you saving in both time and money?

It's not a trade off for everyone (like indie games that can't afford physical distribution), but looking at most of the AAA titles, why wouldn't they take it?
 
Is having a bigger game really that much of a disadvantage? What's the opportunity cost of developing a better game more easily and with nearly free multiplatform support? It may make digital distribution harder, and those sales may suffer, but how much are you saving in both time and money?

It's not a trade off for everyone (like indie games that can't afford physical distribution), but looking at most of the AAA titles, why wouldn't they take it?

Apart from the DD problem you mentioned, the XBOX's lack of an HDD on some SKUs means some games are not practical. A game like Oblivion/GTA wouldn't work if you had to shuffle discs every time you travelled to the other part of the world; any open world game, really. Racing games where you can pick tracks to race on, and even something like Call of Duty/Gears of War, which is as linear as they come, would suffer, since they allow you to pick any mission that you've beaten to play co-op, time trial, etc.

If your game only targets PS3/PC and you don't care much for DD, it's not a problem of course.
 
A megatexture game doesn't necessarily have to be bigger though, does it? My understanding is that only unique tiles are stored, so you can still have tiles that repeat textures as well as reap the benefits of the VRAM efficiency and content creation. Rage uses a huge amount of storage because they're trying to make a game that does not repeat tiles.

Edit: I could be very wrong on this one. I'm trying to figure out where I read that. I'd thought that the virtual texture was a large grid of tiles, and that each grid corresponded to a smaller texture. If two tiles used the same texture, they both referenced the same texture in storage, rather than storing it twice.
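For what it's worth, the tile-sharing idea can be sketched as a page table. This is purely an illustrative model (the class, the hashing scheme, and the tile sizes are all made up for the example, not id's actual format), but it shows how repeated content could cost storage only once:

```python
# Illustrative sketch: a virtual-texture page table where many virtual
# tiles can reference one stored tile, deduplicating repeated content.
import hashlib

class VirtualTexture:
    def __init__(self):
        self.storage = {}      # content hash -> tile pixel data
        self.page_table = {}   # (tile_x, tile_y) -> content hash

    def add_tile(self, tile_x, tile_y, pixels: bytes):
        key = hashlib.sha1(pixels).hexdigest()
        self.storage.setdefault(key, pixels)   # store identical tiles once
        self.page_table[(tile_x, tile_y)] = key

    def stored_bytes(self):
        return sum(len(p) for p in self.storage.values())

vt = VirtualTexture()
grass = b"\x10" * 4096                  # pretend 64x64 compressed tile
for x in range(8):
    vt.add_tile(x, 0, grass)            # 8 virtual tiles, 1 stored copy
vt.add_tile(0, 1, b"\x20" * 4096)       # one unique tile
print(vt.stored_bytes())                # 8192 bytes, not 9 * 4096
```

Under a scheme like this the virtual texture can be huge while the on-disc size only grows with unique content.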
 
A megatexture game doesn't necessarily have to be bigger though, does it? My understanding is that only unique tiles are stored, so you can still have tiles that repeat textures as well as reap the benefits of the VRAM efficiency and content creation. Rage uses a huge amount of storage because they're trying to make a game that does not repeat tiles.

Edit: I could be very wrong on this one. I'm trying to figure out where I read that. I'd thought that the virtual texture was a large grid of tiles, and that each grid corresponded to a smaller texture. If two tiles used the same texture, they both referenced the same texture in storage, rather than storing it twice.

MT doesn't work like that. If you use the same MT area for two different surfaces in game, say a door, you can no longer bake the light because they are standing in two potentially different places in the world (one in full shadow, the other basking in the sun).
 
Yeah, but you don't have to bake all the lighting in, even with Rage. Although Carmack said it's considerably slower with full dynamic lighting - but that goes for Rage, so a 30fps game can work differently.

Tiling might also be an issue with the texture atlas thing, as it basically means that a large polygon may not fit into the 0-1, 0-1 UV space. Normally the single texture would just repeat itself over the surface, but with a texture atlas layout you'd have to cut the polygon up, because the atlas probably can't handle the texture coordinate conversion for out-of-range data.
Now manually cutting stuff up into smaller quad polygons is an option, but it may be a problem with large objects like city buildings in GTA-type games. You don't want to spend thousands of polygons on a single skyscraper. But you don't necessarily have to build another GTA with Tech 5, and those sandbox games have other special requirements anyway.
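To illustrate why plain hardware wrapping doesn't work with an atlas, here's a hypothetical per-fragment remap (the `atlas_uv` function and the sub-rectangle layout are invented for the example): GL_REPEAT-style wrapping past 1.0 would sample the neighbouring atlas entry, so the fractional part has to be taken before mapping into the sub-rectangle.

```python
# Hypothetical sketch: wrapping a repeating UV into an atlas sub-rectangle,
# since letting the hardware wrap would bleed into adjacent atlas entries.

def atlas_uv(u, v, sub_x, sub_y, sub_w, sub_h):
    """Wrap (u, v) into 0..1, then place it inside the atlas sub-rect."""
    u_wrapped = u % 1.0
    v_wrapped = v % 1.0
    return (sub_x + u_wrapped * sub_w, sub_y + v_wrapped * sub_h)

# A texture occupying a quarter of the atlas, starting at (0.5, 0.0):
print(atlas_uv(2.75, 0.25, 0.5, 0.0, 0.25, 0.25))  # (0.6875, 0.0625)
```

Doing this per fragment also breaks mipmapping at the wrap seam, which is one more reason an atlas-based pipeline would rather split the geometry.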
 
MT doesn't work like that. If you use the same MT area for two different surfaces in game, say a door, you can no longer bake the light because they are standing in two potentially different places in the world (one in full shadow, the other basking in the sun).

I suppose if you're baking all the lighting, almost all surfaces will end up being unique. Maybe something like a racetrack with repeated grass and roadway could get away with it.
 
Tiling might also be an issue with the texture atlas thing, as it basically means that a large polygon may not fit into the 0-1, 0-1 UV space. Normally the single texture would just repeat itself over the surface, but with a texture atlas layout you'd have to cut the polygon up, because the atlas probably can't handle the texture coordinate conversion for out-of-range data.
Now manually cutting stuff up into smaller quad polygons is an option, but it may be a problem with large objects like city buildings in GTA-type games. You don't want to spend thousands of polygons on a single skyscraper. But you don't necessarily have to build another GTA with Tech 5, and those sandbox games have other special requirements anyway.

FWIW, ETQW allowed up to 16 UV space repetitions without explicit geometry splits.
 
Yeah but that works kinda differently, Rage uses a texture atlas instead with every kind of stuff in it, not just terrain.
 
Yeah but that works kinda differently, Rage uses a texture atlas instead with every kind of stuff in it, not just terrain.

I was actually talking about the non-terrain ATLAS used by ETQW. You only needed to define whether you wanted horizontal or vertical repeating.
 
Right, except that the "vec4" can be a "vec3 + scalar" or a "vec2 + vec2", so, either way, 96 instructions per cycle for RSX.
Yeah, all mentioned in the G70 articles floating around.
Then, IIRC, there's some sort of "mini-ALU" (or special function unit?) that adds another instruction per shader per clock, so, possibly, if these extra instructions are useful, that raises RSX's fragment shading capabilities to 120 instructions per clock.

Not sure about a mini-ALU... You might be thinking of NV40. In the G70 series they made the second unit fully capable, just without texture ops.

I don't follow...
You mentioned 5 instructions per cycle... I probably misunderstood you then.


Correct, so 96 instructions per cycle for Xenos.

Also, we can't forget that Xenos also benefits from having 16 dedicated texture units, so, if they're fully utilized, then Xenos peaks at 112 instructions per cycle.

Indeed... an issue on RSX as texture ops are done through the shader units...

What? :?: Where can I find Wavey's document? This sounds interesting...

Oh just the B3D article from way back.

Agreed, but I hope it's not as bad as wasting 50% of RSX's peak performance.

Probably not... :LOL: They'd have to be doing some texture heavy ops before it started eating into processing for math/shader-only programs per cycle.
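For reference, the per-clock tallies being traded in this exchange work out as follows (counting each co-issued vec/scalar split as two instructions, per the posts above):

```python
# Back-of-the-envelope tally of the figures quoted in this exchange,
# counting each co-issued vec/scalar split as two instructions.

rsx_pipes = 24                 # G70-style fragment pipes
rsx_alus_per_pipe = 2          # two vec4 ALUs per pipe
rsx = rsx_pipes * rsx_alus_per_pipe * 2   # x2: vec3+scalar or vec2+vec2
rsx_with_mini = rsx + rsx_pipes           # +1 "mini-ALU" op per pipe/clock

xenos = 48 * 2                 # 48 unified ALUs, vec4 + scalar co-issue
xenos_with_tex = xenos + 16    # 16 dedicated texture units

print(rsx, rsx_with_mini)        # 96 120
print(xenos, xenos_with_tex)     # 96 112
```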
 
DemoCoder said:
I agree about the low polys

Alas, poly count is one of those things that takes a hit when supporting multiple platforms; they just go for the lowest common denominator, because having artists flesh out and maintain multiple sets of models is probably too time-consuming over the course of the project.


Can a person say that this game is better looking [not art wise] than the likes of other console tech juggernauts, i.e. Killzone 2 and Uncharted/Uncharted 2?

Would you say Fable 2 looks better than Uncharted and Killzone 2? I seriously doubt anyone on B3D would, but I've put them side by side many times in graphics tests and 'independent' people have frequently picked Fable 2 as the best looking. Ultimately people don't judge pdf tech docs, they judge what they see on screen, and you just never know which way that can go.

Going with that though, I'd say Rage would probably visually lose to games with day/night cycles, even if they sport more primitive tech. The terrain variety looks great, but people seem to really respond to the subtle lighting changes that occur with full time cycles. It seems like you're more likely to strike a visual chord with someone if you provide them with multiple looks, something that a moving sun provides. At least that's something I've noticed in the various 'unofficial' focus tests I've done. In particular, sunset lighting really gets oooohs and aahhhhs.
 
Not to mention polygons are going to take a hit regardless if you want large draw distances and open terrain. There's a fixed budget for polys that you can't really go over.

Too bad this isn't just X360/PC, however, as then they could possibly leverage tessellation to up the poly budget on most models.

Regards,
SB
 
Not to mention polygons are going to take a hit regardless if you want large draw distances and open terrain. There's a fixed budget for polys that you can't really go over.

Too bad this isn't just X360/PC, however, as then they could possibly leverage tessellation to up the poly budget on most models.
I'd have thought tessellation on SPUs would work just as well. The difference with PS3 being that that'd take away from other activities, whereas in XB360 if you don't use the tessellator it sits there idle. In PS3's case, you need these cycles 'going spare'.
 
Probably not... :LOL: They'd have to be doing some texture heavy ops before it started eating into processing for math/shader-only programs per cycle.

Or if they require full precision floats. I mentioned this in the Uncharted thread, but full precision floats slow down RSX somewhat. A simplified way of viewing RSX is that it has a few processing units that can perform instructions in parallel, and all the instructions done at a given time can be viewed as a "pass". You write your shader, run it through the shader performance tool and it will show you how many "passes" your shader has, where each pass is doing one or more instructions. The more passes, the longer the shader will take to complete.

The problem is that if you are using full precision floats then it has a harder time jamming multiple instructions per pass, so the compiler ends up spreading them out over more passes, resulting in idling shader units and longer run times. The fix is to use half precision floats if you can; this lets the compiler do more work per pass. The simplest test of this is to globally change all full floats to half floats in the entire shader and compare the "pass" count before and after to see how much potential gain there is, and you can see as well how it reschedules the instructions, doing more per pass.

The real problem comes if you can't use half precision floats, like if the precision simply isn't enough to do what you want to do. You can mix and match half/full floats in the shader, but the inclusion of even just a few full precision floats can still totally undermine the compiler's ability to schedule shader instructions. In other words, a shader with full floats might still run with the same number of passes as a shader with 80% half floats. It's case by case of course, but ultimately you really must remove as many full floats as possible. If Rage demands many full floats for their approach then their performance on RSX will take a definite hit.
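A toy model of that pass-packing behaviour (this greedy scheduler is an assumption for illustration only, not the actual Cg compiler's algorithm):

```python
# Toy model of the "pass" packing described above: two adjacent
# half-precision instructions dual-issue in one pass, while a
# full-precision instruction occupies its pass alone.

def count_passes(precisions):
    """precisions: 'half' or 'full' for each shader instruction, in order."""
    passes, i = 0, 0
    while i < len(precisions):
        if (precisions[i] == 'half' and i + 1 < len(precisions)
                and precisions[i + 1] == 'half'):
            i += 2          # two half ops packed into one pass
        else:
            i += 1          # full-float op takes the whole pass
        passes += 1
    return passes

print(count_passes(['half'] * 8))                 # 4 passes: fully packed
print(count_passes(['half'] * 6 + ['full'] * 2))  # 5 passes
print(count_passes(['half', 'full'] * 4))         # 8 passes: scattered
                                                  # full floats kill packing
```

Even in this crude model you can see the scheduling effect described above: it's not just how many full floats you have, but where they land in the instruction stream.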
 
I'd have thought tessellation on SPUs would work just as well. The difference with PS3 being that that'd take away from other activities, whereas in XB360 if you don't use the tessellator it sits there idle. In PS3's case, you need these cycles 'going spare'.

There are three issues there. The first, of course, as you mention, is that the SPUs are already busy doing other things. Look back on many threads here on B3D and all the things SPUs get mentioned with. Hey, let's throw vertex processing on SPUs, shadow calcs, lighting, post processing, tessellation, AI, culling, animation blending, texture decompression, etc, etc. The SPUs are fast but they are still finite, and they currently have a million other things to do!

Secondly, tessellation was meant to reduce CPU load. You do all your CPU processing on a reduced mesh, then let the tessellator expand it into something that looks nicer. But this new improved mesh still must run through the vertex shader before heading on to the pixel shader. In other words, you can still be bitten by vertex processing limitations even with SPU tessellating. For example, let's say you have magically moved 100% of your vertex processing to the SPUs and your vertex shaders do nothing. They may do no shader processing, but they still have to spend time fetching streams of data, interpolating them, and passing them on to the pixel shaders, which likely still have some work to do. CPU-side tessellating will increase this load. So if you unfortunately need, say, 3 or more vectors of graphics data on RSX (which causes performance hits), then these hits will be multiplied when SPU-side tessellating is used, even if your vertex shaders do nothing. I might be wrong here since I've never been allowed to use the 360's hardware tessellator, but I believe the limitation there is similar, in that the hardware tessellator does its thing pre-vertex shader, then that new uber mesh in its entirety goes through the normal graphics pipeline of vertex and pixel processing, but I'm sure someone will correct me if I'm wrong on that.
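A rough back-of-the-envelope version of that point, with made-up numbers (the vertex counts, expansion factor, and per-attribute cost are all assumptions for illustration): whatever per-vertex attribute-fetch cost you already pay gets multiplied by the tessellation expansion.

```python
# Made-up numbers purely for illustration: even with all vertex math on
# the SPUs, the post-tessellation mesh still pays per-vertex attribute
# fetch/interpolate cost on the GPU, scaled by the expansion factor.

def fetch_cost(verts, attrs_per_vert, cycles_per_attr=1):
    """Hypothetical cost of just feeding vertex attributes to the GPU."""
    return verts * attrs_per_vert * cycles_per_attr

base_verts = 10_000
expanded = base_verts * 4        # assumed 4x tessellation expansion

print(fetch_cost(base_verts, 3))  # 30000 units before tessellation
print(fetch_cost(expanded, 3))    # 120000 units after: 4x the fetch load
```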

Thirdly, the big issue with all this SPU graphics work is stalls. Using the SPUs in your graphics pipeline can result in large dependency chains of 'a' depends on the completion of 'b', which depends on 'c', which depends on 'd', etc. When all these dependencies are spread across GPU+CPU processing, the likelihood of stalls increases. Massive parallelism works great on GPUs because they have tons of processors at which the entirety of the graphics problem is hurled. In that methodology the "graphics processing" part can almost be seen as an atomic process which the machine will happily grind away at, scheduling, shifting and prioritizing tasks as only it can do on its myriad of processors to help reduce stalls automatically. When you have humans trying to do the same with SPUs+GPU, the odds of keeping everything 100% fed with data are low, meaning some processing will idle/stall. Throw in the randomness of a game application and this likelihood increases somewhat. Throw in a severely fractured pipeline, like where CPU/GPU processing is heavily intermixed, and good luck hitting peak performance.

EDIT: Oops, sorry Al, saw your reply after I posted :(
 
This may have been said, but I don't think the full "glory" of a texture atlas system will be exploited in Rage. I realize that this is not the first game with such technology released (though I don't know if others did the entire world or not), but I think the development process still probably isn't set up to take advantage of everything offered by such a system. I'm not the technical guru in these areas that you folks are, and I freely admit it, but when I was working on Tamriel Rebuilt I saw over and over how handy a system like this could be. Storage will be an issue, though, in reaching the potential here. Tile systems are just a different compression method (one that results in artifacts as well). Stacking textures and painting seams, or placing objects to hide problems, is just such an inferior way to create worlds.
 