Before and after MegaTexture scenes

Richard

At GDC, id Software showed off its new id Tech 5 engine, trying to win back some of the lost ground on engine licenses. Everyone stopped keeping score a while ago, so a quick educated guess would put Unreal Engine 3 at 99.9% of the entire engine-licensing market.

Krawall Gaming Network managed to interview id's Matt Hooper, Lead Designer on Rage, the debut title of the engine. The article is in German, which makes it difficult to ascertain what the discussion was about (*nudge*), but it stands to reason Rage and id Tech 5 were discussed. With the article, Krawall also posted a gallery of photographs taken of Matt's presentation slides. Go there for the full set.

The interesting ones, however, show "before" and "after" visuals of three scenes where, I guess, they're trying to demonstrate how MegaTexture can make a scene look unique, losing the barren, repeated-tile look of other games without using extra polygons, batches or texture memory. Basically, between before and after, the scenes got much better looking without any performance impact.

Scene 1 BEFORE:
krawallbrand_1P1000551.JPG


Scene 1 AFTER:
krawallbrand_1P1000552.JPG


Scene 2 BEFORE:
krawallbrand_1P1000553.JPG


Scene 2 AFTER:
krawallbrand_1P1000554.JPG


Scene 3 BEFORE:
krawallbrand_1P1000555.JPG


Scene 3 AFTER:
krawallbrand_1P1000556.JPG
 
Richard, thanks for the pics.

Anyone get a chance to look in detail enough to see if id Tech 5 was using trilinear filtering, or perhaps just bilinear?

The reason I ask is this post from Sean (who did the SVT talk at GDC) on the Molly Rocket forums:
If we omit the final texture fetch and just talk about the cost in the shader of remapping an input virtual coordinate to a final physical coordinate, the best known shader is only two instructions (this is Carmack's bilinear shader that I hand-waved about). However, that doesn't handle trilinear, and doing trilinear well adds a lot of bloat, so the stuff above could bring that way back down.
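To make the "two instructions" claim concrete, here is a rough illustration of that remap (all numbers and names below are made up for the sketch, not taken from Carmack's actual shader): the page-table fetch yields a per-page scale and bias, and the remap itself collapses to one multiply-add per axis.

```python
# Toy virtual-texture dimensions (hypothetical; real page counts differ).
VIRT_PAGES = 4   # virtual pages per axis
PHYS_PAGES = 2   # physical cache pages per axis

def make_page_table():
    # Map every virtual page to a physical page. The residency policy here
    # (simple modulo) is invented; the point is the per-page (scale, bias)
    # pair that makes the shader-side remap a single multiply-add.
    table = {}
    scale = VIRT_PAGES / PHYS_PAGES
    for vy in range(VIRT_PAGES):
        for vx in range(VIRT_PAGES):
            px, py = vx % PHYS_PAGES, vy % PHYS_PAGES
            table[(vx, vy)] = (scale, (px - vx) / PHYS_PAGES,
                               scale, (py - vy) / PHYS_PAGES)
    return table

def remap(u, v, table):
    """Virtual UV in [0,1) -> physical UV in [0,1)."""
    vx, vy = int(u * VIRT_PAGES), int(v * VIRT_PAGES)  # the page-table fetch
    su, bu, sv, bv = table[(vx, vy)]
    return (u * su + bu, v * sv + bv)                  # one mad per axis
```

With these toy numbers, a virtual u of 0.3 lands in virtual page 1, and the single multiply-add produces a physical u of 0.6. Note this only works for bilinear: nothing here guarantees the neighboring texels a trilinear or aniso kernel needs are in the same physical page, which is exactly the bloat the quote alludes to.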
 
So those after pics are a single texture?

In those shots, yes. Everything you see is using a single MegaTexture.

Hmm, am I seeing no specular lighting in the after shots (all baked lighting?)?

The engine is using baked lighting (how much I don't know, i.e. whether it's fully baked like Q3 et al. or half-baked like the MegaTextures in Enemy Territory: QUAKE Wars).

I'll try and find out that bi/trilinear question.
 
If I try to imagine that in motion, it sure looks much more realistic because of the amorphous nature of the textures. Is there any video related to those pics?
 
I wonder what effect MegaTexturing will have on the cost of production. Creating a unique texture for every surface would surely increase the effort needed to create a level, which means either more time or a bigger team. id themselves have new jobs advertised on their website at the moment, 11 out of 13 of which are art related.
 
No one is saying you *have* to paint every surface uniquely, just that you can.


As for the original topic: am I the only one who doesn't see how this is as "insanely complex" as a lot of people are making it out to be, or as a lot of the implementations I've seen so far are? I've seen implementations that unacceptably break a lot of basic graphics functionality (such as trilinear/anisotropic filtering) and require read-backs and multiple passes with massive pixel shader programs just to do the simplest texture lookup. I mean... what the hell?

People have been doing seamless streaming worlds for a long time and suddenly just because we're not allowing UV-coords to overlap and are storing all our texture data in a big array of smaller textures we can no longer do trilinear, have to do read-backs to WRITE data to the GPU and everything else?

Unless I'm missing something, I can think of about a dozen better ways to do all this.
 
As for the original topic: am I the only one who doesn't see how this is as "insanely complex" as a lot of people are making it out to be, or as a lot of the implementations I've seen so far are? I've seen implementations that unacceptably break a lot of basic graphics functionality (such as trilinear/anisotropic filtering) and require read-backs and multiple passes with massive pixel shader programs just to do the simplest texture lookup. I mean... what the hell?

People have been doing seamless streaming worlds for a long time and suddenly just because we're not allowing UV-coords to overlap and are storing all our texture data in a big array of smaller textures we can no longer do trilinear, have to do read-backs to WRITE data to the GPU and everything else?

Unless I'm missing something, I can think of about a dozen better ways to do all this.

I think trading tri+aniso filtering for the simple case of bilinear virtual texturing is not bad. Say you had 4 tri (or aniso) filtered texture lookups to shade a pixel. Now compare this to the mega texture case of one dependent texture fetch (in the virtual page table), 2 extra ALU ops (which would be free 99-100% of the time), then 4 bilinear fetches. The bilinear fetches would be 2x as fast as the trilinear (better throughput from the TEX filtering units), but you have to hide one extremely well cached dependent texture read. The virtual texturing case might actually be faster.
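As a back-of-envelope version of that comparison (the cycle costs below are assumed for illustration only, not measured on any real GPU):

```python
# Assumed relative costs: a trilinear tap takes ~2x a bilinear tap
# (half the filtering throughput), the page-table read runs at bilinear
# rate and is assumed well cached, and the 2 extra ALU remap ops are
# treated as free, per the argument above.
BILINEAR = 1
TRILINEAR = 2 * BILINEAR

def cost_conventional(lookups=4):
    # 4 trilinear-filtered texture lookups per pixel
    return lookups * TRILINEAR

def cost_virtual(lookups=4):
    # 1 well-cached page-table fetch + 4 bilinear fetches
    return BILINEAR + lookups * BILINEAR

# cost_conventional() -> 8, cost_virtual() -> 5: under these assumptions
# the virtual-texturing path does come out ahead.
```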

Now think in terms of removing nearly all your texture unit state changes. How much of a savings is that?

And the frame buffer read back is not necessary at all if you compute your virtual texture working set from non-pixel shaded and rendered information.

Seriously, mega texturing is a DAMN good idea.
 
I wasn't saying MegaTexturing was a bad idea, just that the implementations I've seen people using to do it aren't altogether impressive.

I'm highly confident I can get the same functionality working on everything back to DX7-level hardware with no dependent texture reads and working trilinear and anisotropic filtering, with no read-backs. In fact, I think I'll spend some of my spare time in the next couple of weeks getting a demo of it working.
 
I think trading tri+aniso filtering for the simple case of bilinear virtual texturing is not bad.
Are you honestly willing to give up hardware tri/aniso so easily? Aniso makes a huge difference to the visual quality of pretty much any 3D-rendered scene... I consider it an absolute minimum for modern games, and I wouldn't be surprised to see more sophisticated filtering schemes (SAT, EWA, Gaussian) being used at least in a few places down the road.

I think virtual texturing is great but you can't give up stuff like tri/aniso filtering to attain it. I'm also interested to see if it actually turns out to be faster to have this sort of thing done in "software" or if it will get hit by the hardware VM hammer at some point.
 
In fact, I think I'll spend some of my spare time in the next couple of weeks getting a demo of it working.
Try to document your findings/problems while programming or prototyping your solution. That making-of write-up would be a great post-mortem read. Even if you don't finish the project and just play around with the idea, it would still be interesting to read about the issues you stumbled upon and how you chose to address them.
 
Well, I got a quick little demo up and running, and it's working on the 16MB Rage 128 Ultra (with trilinear) in the e-mail machine I'm now typing this post on, but it's pretty much just turned into a basic streaming-world setup (almost identical to one I wrote about 5-6 years ago). I'll go over some of the details, but this was all thrown together extremely quickly (both this time and when I originally prototyped it way back when), and there have been a TON of systems designed already that do all this, especially on consoles, so I'm sure there are better systems out there. Anyway, here goes.

I'm storing the files on disk in JPEG format, loading them in with a multi-threaded file streamer, decompressing them on the fly and then re-compressing into DXT format. I have a fixed (configurable) amount of video memory allocated in 'texture tiles' (or 'pages' if you prefer) and a fixed (also configurable) amount of system memory allocated for storing the texture tiles that aren't as likely to be immediately on screen. There's some [not currently terrific] logic in there that's constantly trying to optimize the memory layout by overwriting texture tiles it doesn't believe have much chance of being needed any time soon with tiles it thinks it will need soon (either by popping a tile out of video memory and loading one in from system memory, or popping one out of system memory and loading another in from file). This same segment communicates with the loader threads to ensure, to the best of its and the loaders' ability, that all the data required is loaded in before it's needed.
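A loose sketch of that two-tier setup (the class and method names are my own, not from the demo): evicted video-memory tiles fall back to system memory, and system-memory evictions drop the tile entirely, so it would have to be re-streamed from disk.

```python
# Two-tier LRU tile cache: a small "video memory" pool backed by a larger
# "system memory" pool. Capacities are in tiles; the eviction policy here
# is plain LRU, a stand-in for the demo's smarter predictive logic.
from collections import OrderedDict

class TileCache:
    def __init__(self, vid_tiles, sys_tiles):
        self.vid = OrderedDict()   # tile_id -> tile data, LRU order
        self.sys = OrderedDict()
        self.vid_cap, self.sys_cap = vid_tiles, sys_tiles

    def request(self, tile_id, load_from_disk):
        """Return the tile's data, promoting it into 'video memory'."""
        if tile_id in self.vid:
            self.vid.move_to_end(tile_id)        # refresh LRU position
            return self.vid[tile_id]
        # Promote from system memory, or stream from disk at worst.
        data = self.sys.pop(tile_id, None)
        if data is None:
            data = load_from_disk(tile_id)
        self.vid[tile_id] = data
        if len(self.vid) > self.vid_cap:         # demote LRU vidmem tile
            old_id, old_data = self.vid.popitem(last=False)
            self.sys[old_id] = old_data
            if len(self.sys) > self.sys_cap:     # drop LRU sysmem tile
                self.sys.popitem(last=False)
        return data
```

A tile evicted from the video pool stays cheap to re-promote as long as it survives in the system pool, which matches the "hopefully from system memory at worst" behaviour described above.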

My "virtual page table" is just an index attached to each chunk of geometry telling the renderer which 'global' tile has the texture data for that chunk. If the geometry is visible then, assuming the other parts are doing what they're supposed to, this tile should already be in video memory, and hence there's one other index telling the renderer which item in the vidmem texture array currently contains that data. That texture is simply selected into the appropriate sampler stage and the geometry is rendered. Repeat until all the geometry in the scene is rendered. In practice, especially on such a low-spec machine, the data you need isn't always in video memory, so you have to force it in there on the fly, hopefully from system memory at worst. The more video memory you have and the smarter your loaders are, the less often this happens.
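That per-chunk indexing might look roughly like this (names and data layout invented for the sketch): each chunk carries its global tile id, and a residency map tells the renderer which slot in the vidmem texture array currently holds that tile.

```python
def render_scene(visible_chunks, resident, force_load):
    """Sketch of the render loop described above.

    visible_chunks: list of {"id": ..., "tile": global_tile_id} dicts
    resident: global tile id -> vidmem texture-array slot
    force_load: streams a missing tile in on the fly, returns its slot
    """
    draws = []
    for chunk in visible_chunks:
        slot = resident.get(chunk["tile"])
        if slot is None:                      # miss: force the tile in
            slot = force_load(chunk["tile"])
            resident[chunk["tile"]] = slot
        # Select the slot's texture into the sampler stage, draw chunk.
        draws.append((chunk["id"], slot))
    return draws
```

Because the lookup happens per chunk on the CPU rather than per pixel on the GPU, there's no page-table texture and no dependent read, which is why ordinary trilinear/aniso filtering survives intact.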

There's no "virtual page table texture" on the GPU that needs to be referenced inside a pixel shader, nor is any tile in memory structured differently from how it is on disk, so trilinear/anisotropic filtering works the same as it always did. You're not even using a pixel shader, so obviously the dependent texture read is out.

As for some performance figures.. well, the full specs of this little e-mail machine are:
CPU: P4 1.7GHz
Memory: 256MB PC133 SDRAM
Graphics card: 16MB Rage 128 Ultra
OS: XP

(did I mention this machine is normally only used for e-mail? ;) )
On a terrain made up of a little over 500k triangles and a single texture spanning 16k x 16k pixels I'm getting... well, about 7-12fps in the good sections. Granted, the multithreading isn't doing a whole lot of good on the single-processor machine, and 16/256 vid/sysmem is obscene these days. So I'll run it on some decent hardware later (I'm about 2 hours away from my normal machines right now, but I have a much better-equipped laptop arriving in a couple of days); point is, it works.

There's certainly a number of areas for performance improvement, and I plan on getting to that, along with implementing 'SVT' for comparison. Someone also just sent me id's texture-compression scheme, which should improve performance a bit; I just haven't had time to read the whole article yet, let alone implement any of it (I've only been working on this for just over a day so far).

Time permitting I plan to make all this publicly available so we can see how performance of the two systems scale in comparison on different hardware setups.
 
Are you honestly willing to give up hardware tri/aniso so easily? Aniso makes a huge difference to the visual quality of pretty much any 3D-rendered scene... I consider it an absolute minimum for modern games, and I wouldn't be surprised to see more sophisticated filtering schemes (SAT, EWA, Gaussian) being used at least in a few places down the road.

I think virtual texturing is great but you can't give up stuff like tri/aniso filtering to attain it. I'm also interested to see if it actually turns out to be faster to have this sort of thing done in "software" or if it will get hit by the hardware VM hammer at some point.

I am honestly willing to trade tri/aniso for more texture lookups, higher quality anti-aliasing and (in the case of virtual texturing) more detailed textures.
 
I am honestly willing to trade tri/aniso for more texture lookups, higher quality anti-aliasing and (in the case of virtual texturing) more detailed textures.
But you're getting significantly lower-quality anti-aliasing (texture filtering is just a form of anti-aliasing), and your "more detailed textures" are going to be largely negated by, at the very least, over-blurring.

I guess we're just going to have to disagree here; if anything I think we need more sophisticated filtering as there are places where the current trilinear/aniso implementations break down. However, I also don't feel that it's necessary to give up on tri/aniso for the sake of virtual texturing: there's absolutely no reason why you can't have both.
 
I'm not fully understanding this discussion... Why on earth would megatexturing stop you from anisotropic and trilinearly filtering your textures?

You are still going to need lower-resolution mip-maps of said MegaTexture (otherwise zooming out would leave you disastrously aliased and destroy your texture cache hit rate), so why can't you do the extra texture lookups needed?

So your texture lookups are dependent ones, so what? The lower-resolution mipmap of your MegaTexture will have the same dependency (or rather a translation thereof). Just do 2 reads and organise the texture storage in such a way that these dependent reads can be treated near enough as one.

I don't see the issue... and anyway, what's gonna happen when aniso and trilinear get forced on in the driver?
 
I'm not fully understanding this discussion... Why on earth would megatexturing stop you from anisotropic and trilinearly filtering your textures?
Well, with no hardware support for virtual texturing, the translation from virtual addresses to physical ones has to be done manually in the shader, which means every time you need to fetch a new set of samples to filter (from a new mip map, or just neighboring texels) you have to make sure you don't need data from a new page.
So your texture lookups are dependent ones, so what? The lower-resolution mipmap of your MegaTexture will have the same dependency (or rather a translation thereof). Just do 2 reads and organise the texture storage in such a way that these dependent reads can be treated near enough as one.
Think about it, you are implicitly asking to store the whole texture + mipmaps in a single memory page, which doesn't make sense with this approach.
I don't see the issue... and anyway, whats gonna happen when aniso and trilinear get forced on in the driver?
It would potentially break texture filtering.
 
So your texture lookups are dependent ones, so what?
As Marco explained, the problem is that your texture *taps* are dependent now, not just the shader-level texture lookup; i.e. a simple bilinear fetch might span two texture pages that are arbitrarily separated in memory. Current texture addressing/filtering units can't handle that, so unless you want to fall back on purely software filtering, you have to do some gymnastics to try to ensure that all of the data required by a given sampling operation is in the same physical page at the time you do the texture lookup. This is a bit annoying for bilinear/trilinear, and a major problem for aniso filtering... so much so that I know of no one who has even attempted an efficient implementation of aniso filtering with megatexturing. At that point, you'd really want a hardware page table/TLB that gets hit for each texture tap by the texture filtering hardware.
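The straddling problem is easy to demonstrate with toy numbers (the page size below is assumed): a bilinear tap reads a 2x2 texel neighborhood, and near a page edge that neighborhood crosses into a page that may live anywhere in the physical cache. The usual gymnastics is to store each page with a duplicated border of its neighbours' texels so every 2x2 fetch stays inside one page.

```python
PAGE_TEXELS = 128  # texels per page side (assumed for illustration)

def bilinear_footprint_pages(u_texel, v_texel):
    """Set of (page_x, page_y) touched by the 2x2 bilinear footprint
    anchored at integer texel (u_texel, v_texel)."""
    pages = set()
    for du in (0, 1):
        for dv in (0, 1):
            pages.add(((u_texel + du) // PAGE_TEXELS,
                       (v_texel + dv) // PAGE_TEXELS))
    return pages

# A footprint anchored at texel (127, 40) touches pages (0,0) and (1,0):
# without border texels, the hardware filter would need two physical
# pages at once. A page corner can touch four. Aniso footprints are
# longer still, which is why they're the hard case described above.
```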
 