Interview with Epic's Tim Sweeney

tEd

http://forums.epicgames.com/showthread.php?t=571019


Important questions, answered by Tim Sweeney, lead developer at Epic Games:

PCGH: Is it possible to make deferred shading and edge-smoothing work at the same time on DX9 graphics cards?
Epic: Unreal Engine 3 uses deferred shading to speed up the calculation of
dynamic lighting and shadows. Integrating this feature with
multi-sampling requires control of the edge-smoothing at a much deeper level than the DX9 interface can provide. So, on the PC, multi-sampling will only be
supported under DX10.

PCGH: What do the general hardware requirements look like?
Epic: Since optimization work is still ongoing, these details may change from day to
day. Generally speaking, the game runs quite smoothly on DX9 hardware released by NVidia and Ati since 2006. On high-end cards, including the DX10 models, UT3 already runs incredibly smoothly. Additionally, we also support Shader Model 2.0 graphics hardware, with only a few technical limitations.

PCGH: Will SLI and CrossFire provide significant advantages?
Epic: We're testing SLI configurations on a regular basis. Their positive
influence is felt significantly, especially at higher resolutions. So, if one
wants to have full details at very high resolutions, an SLI system would be the
ideal way to secure optimal performance. We haven't had the opportunity to test CrossFire systems yet, but we expect similar results.

PCGH: How exactly are you utilizing the functions of DirectX 10?
Epic: Unreal Tournament 3 will ship with full DX10 support, with multi-sampling
being the biggest visible benefit of the new graphics interface. Additionally,
with DX10 under Vista we have the possibility to use the video memory more
efficiently, and so to display textures at a higher level of detail than
would be possible with the DX9 path under Vista. Most effects in UT3 are bound more by fillrate than by basic features like geometry processing. That's why DX10 has a great impact on performance, while we mostly forgo the integration of new features.

PCGH: Will UT3 players be able to benefit from a 64-bit environment, and is there a 64-bit version at all?
Epic: To ensure compatibility, we tested UT3 on Vista x64 as well. Nonetheless, we're planning to wait and see until the OS and its applications have matured before taking further steps in the 64-bit direction. With UT2004 we were among the first developers to port a title to Windows XP x64. We would have liked to do the same with UT3 and Vista x64, as well as shifting all the PCs we're currently developing on to the 64-bit version of Vista. Unfortunately, full software and driver compatibility isn't there yet. The basic OS runs stably, and it's fun to work with in isolation. But as soon as you want to print something, or want to run Maya or 3DSMax together with some third-party plugins, you run into massive problems. I am sure those can be fixed via service packs and software updates, so PCs with 4 to 8 gigs of RAM can establish themselves during the next 12 months.

PCGH: What is the maximum number of threads that can be calculated separately? Will there be a performance boost if a quad-core system is used?
Epic: We're able to scale the thread structure pretty well. There is a primary
thread for the gameplay and a second one for rendering. On systems with more than 2 cores we run additional threads to speed up various computation tasks, including physics and data decompression. So the overall performance benefits greatly from a quad-core processor. Although we haven't looked into the matter yet, I expect an even further performance increase from CPUs with more than 4 cores in future UE-based games.
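The split Sweeney describes (one gameplay thread, one rendering thread, and extra workers for physics and decompression on machines with more than two cores) can be sketched in a few lines. This is a hypothetical illustration, not Epic's code; all names (`worker`, `run_frame_jobs`) are made up for the example:

```python
# Toy sketch of the thread layout described above: two threads are assumed
# reserved for gameplay + rendering, and any remaining cores run worker
# threads that drain a queue of side jobs (physics, decompression, ...).
import os
import queue
import threading

def worker(tasks, results, lock):
    """Drain the task queue; each task is a zero-argument callable."""
    while True:
        task = tasks.get()
        if task is None:          # sentinel: shut this worker down
            break
        result = task()
        with lock:
            results.append(result)

def run_frame_jobs(jobs, extra_workers=None):
    """Run a frame's side jobs on a small worker pool."""
    if extra_workers is None:
        # cores beyond the two reserved for gameplay + rendering
        extra_workers = max(1, (os.cpu_count() or 2) - 2)
    tasks, results, lock = queue.Queue(), [], threading.Lock()
    threads = [threading.Thread(target=worker, args=(tasks, results, lock))
               for _ in range(extra_workers)]
    for t in threads:
        t.start()
    for job in jobs:
        tasks.put(job)
    for _ in threads:
        tasks.put(None)           # one sentinel per worker
    for t in threads:
        t.join()
    return results

# e.g. two independent jobs standing in for a physics step and a decompress
totals = run_frame_jobs([lambda: sum(range(100)), lambda: len("payload")])
```

The sentinel-per-worker shutdown is what lets the pool scale with core count without any thread polling a shared flag.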
 
OOOGA OOOOGA WAJOOOOGA...

Ahem sorry my primal being came out there for a second...

Edge smoothing in UT3 with DX10 hardware. I don't have enough paper towels to clean up my drool.

Ouch, being more fillrate limited sounds like it's going to hurt R600.

No 64-bit native version. /sigh.

Thanks for the heads up.

Regards,
SB
 
Isn't edge-smoothing the same effect they used in GRAW (deferred as well) to make up for the lack of AA support? Or is this a case of 'same dog, different leg action'?

I wonder if MSAA would be possible under XP (DX9) with the Alky libraries? :p
 
Actually, DX10 SHOULD allow doing proper MSAA in a deferred environment. And I don't think Sweeney would've harped on the advantages of DX10 over DX9 in this regard if they were doing some crappy per-edge blur filter like GRAW and Stalker do. So yes, this is very very very drool-worthy.
 
Yeah, when I read it, I assumed it was multisampling as well. But then I thought about it: how the heck do you multisample deferred shading? I used to wonder how there could be games out there that can't be multisampled, but now I have no idea how this one can.
 
Still haven't fully got my head around deferred shading. My understanding is that you're basically doing a geometry pass and storing all geometry attributes and lighting parameters specific to each object into a buffer. Then for each light in the scene you add that light's contribution into an accumulation buffer.

What I don't understand is why this accumulation buffer can't be the regular hardware supported MSAA buffer? Also, does DX10's support for accessing MSAA samples in an MRT mean that MSAA is hardware accelerated or is the final resolve still done in a shader?
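The two-pass structure described above (geometry pass fills a buffer of per-pixel attributes, then each light adds its contribution into an accumulation buffer) can be sketched on the CPU. This is purely illustrative: pixels are dicts instead of GPU buffers, the lighting model is a toy, and all names are made up:

```python
# Toy sketch of the two deferred-shading passes: a geometry pass stores
# per-pixel surface attributes (the "G-buffer"), then a lighting pass
# loops over lights and accumulates each light's contribution per pixel.

def geometry_pass(scene_pixels):
    """Store per-pixel surface attributes into a G-buffer."""
    return [{"albedo": p["albedo"], "n_dot_l": p["n_dot_l"]}
            for p in scene_pixels]

def lighting_pass(gbuffer, lights):
    """For each light, add its contribution into an accumulation buffer."""
    accum = [0.0] * len(gbuffer)
    for light in lights:
        for i, g in enumerate(gbuffer):
            accum[i] += light["intensity"] * g["albedo"] * g["n_dot_l"]
    return accum

scene = [{"albedo": 0.5, "n_dot_l": 1.0}, {"albedo": 1.0, "n_dot_l": 0.5}]
lights = [{"intensity": 2.0}, {"intensity": 1.0}]
frame = lighting_pass(geometry_pass(scene), lights)
# pixel 0: (2.0 + 1.0) * 0.5 * 1.0 = 1.5 ; pixel 1: 3.0 * 1.0 * 0.5 = 1.5
```

The key point for the MSAA question is visible here: by the time `lighting_pass` runs, only per-pixel attributes exist, not triangles.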
 
I think it's to do with the stage that MSAA can happen in a DX9 rendering pipeline. So, MSAA happens near the end, and needs certain information that is lost in an earlier step if deferred rendering is used. I think the lost info is the geometry. I get the impression that with deferred rendering you end up with a screen full of pixels rather than a screen full of objects quite early in the process.

Looking forward to being told how way off base I am.
 
Looking forward to being told how way off base I am.

You are completely correct :).
With Deferred Shading the accumulation buffer samples from textures (the "G-Buffers") representing the geometric properties of the scene (depth, normal, material, etc.). At this stage the actual polygons representing the scene are no longer available, so an MSAA'd accumulation buffer wouldn't antialias anything, since there are no polygons to antialias.
Multisampling with Deferred Shading is done by creating each G-Buffer as multisampled, and using DX10's shader-assisted resolves at the accumulation buffer stage to effectively apply shading to each sample and then resolve the results together. Ideally this should only be done for blocks of pixels containing differing samples, in order to save on performance (i.e., avoiding applying supersampling to all pixels); however, the process of determining whether a pixel belongs to a *multisampled* edge can be quite involved.
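A CPU-side sketch of that shader-assisted resolve may help. This is hypothetical illustration only (real code would be an HLSL pixel shader reading multisampled G-buffer textures); `shade` and `resolve_pixel` and the toy lighting model are made up for the example:

```python
# Sketch of the per-pixel resolve described above: shade each G-buffer
# subsample, then average, but only do the full per-sample work when the
# subsamples actually differ (i.e. on multisampled edges).

def shade(sample, light_intensity):
    """Apply a toy lighting model to one G-buffer subsample."""
    return light_intensity * sample["albedo"] * sample["n_dot_l"]

def resolve_pixel(samples, light_intensity):
    """Shade subsamples and resolve (average) them into one pixel color.

    Interior pixels (all subsamples from one triangle) are shaded once;
    edge pixels are effectively supersampled, as the post describes."""
    first = samples[0]
    if all(s == first for s in samples):
        return shade(first, light_intensity)      # interior: one shade
    shaded = [shade(s, light_intensity) for s in samples]
    return sum(shaded) / len(shaded)              # edge: shade all, average

# interior pixel: four identical subsamples from one triangle
interior = [{"albedo": 1.0, "n_dot_l": 0.5}] * 4
# edge pixel: two triangles each contribute two subsamples
edge = ([{"albedo": 1.0, "n_dot_l": 0.5}] * 2 +
        [{"albedo": 0.0, "n_dot_l": 1.0}] * 2)
```

The `all(s == first ...)` early-out is the cheap stand-in for the "quite involved" edge-detection step mentioned above.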
 
You are completely correct :).
With Deferred Shading the accumulation buffer samples from textures (the "G-Buffers") representing the geometric properties of the scene (depth, normal, material, etc.). At this stage the actual polygons representing the scene are no longer available, so an MSAA'd accumulation buffer wouldn't antialias anything, since there are no polygons to antialias.

Got it, thanks guys.

Multisampling with Deferred Shading is done by creating each G-Buffer as multisampled, and using DX10's shader-assisted resolves at the accumulation buffer stage to effectively apply shading to each sample and then resolve the results together. Ideally this should only be done for blocks of pixels containing differing samples, in order to save on performance (i.e., avoiding applying supersampling to all pixels); however, the process of determining whether a pixel belongs to a *multisampled* edge can be quite involved.

This stage of it was confusing me as well. In traditional AA the shader only sees a single fragment and the ROPs do all the sample generation, depth testing and blending magic. But if this G-Buffer now has multiple samples for the same final on-screen pixel how does the lighting pass decide which sub-samples to shade and which ones are "copied" from other colored subsamples in the same triangle? Or is it the case that all subsamples are lit independently if at least two triangles contribute to the pixel color?

Also, just to confirm: is the depth test for generating subsamples now front-loaded in the geometry pass? The sample data in the MSAA'd G-buffer would correspond to the final colored samples in the accumulation buffer, so the lighting shader won't have to worry about the possibility of triangles overlapping - right?

I think I need pictures :oops:
 

Dunno of anyone showing an actual implementation in DX10 yet, so probably no pics are available. What SuperCow said sums it up nicely.
 
But if this G-Buffer now has multiple samples for the same final on-screen pixel how does the lighting pass decide which sub-samples to shade and which ones are "copied" from other colored subsamples in the same triangle? Or is it the case that all subsamples are lit independently if at least two triangles contribute to the pixel color?
The latter. If more than one triangle contributes to a pixel then you will have to apply lighting to each individual subsample, then average the results together (and store them in the accumulation buffer). The trick is determining how many different subsamples exist for each given pixel.

Is the depth-test for generating subsamples now front-loaded in the geometry pass? The sample data in the MSAA'd G-buffer would correspond to the final colored samples in the accumulation buffer so the lighting shader won't have to worry about the possiblity of triangles overlapping - right?
I'm not too sure what you're asking here. Once you have finished your geometry pass, your MSAA'd G-Buffer will be a correct representation of the geometric data for your multisampled scene. The lighting shader certainly won't have to "worry" about overlapping triangles as long as it correctly accesses each multisample in the G-Buffer and applies lighting to them individually.
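That "trick" of counting the different subsamples per pixel can be sketched as a classification step. This is an illustrative guess at one common approach (comparing depth and normal per subsample with tolerances), not what any particular engine does; all names and thresholds are made up:

```python
# Illustrative sketch: classify a pixel as a multisampled edge by checking
# whether its G-buffer subsamples actually differ. Subsamples are
# (depth, normal_z) pairs here; real code would compare full depth/normal
# vectors read from multisampled G-buffer textures.

def count_distinct_subsamples(samples, depth_eps=1e-3, normal_eps=1e-2):
    """Count subsamples that differ from every previously seen one."""
    distinct = []
    for depth, normal_z in samples:
        if not any(abs(depth - d) <= depth_eps and
                   abs(normal_z - n) <= normal_eps
                   for d, n in distinct):
            distinct.append((depth, normal_z))
    return len(distinct)

def is_edge_pixel(samples):
    """Only pixels with >1 distinct subsample need per-sample shading."""
    return count_distinct_subsamples(samples) > 1

interior = [(0.5, 1.0)] * 4                               # one triangle
edge = [(0.5, 1.0), (0.5, 1.0), (0.9, 0.2), (0.9, 0.2)]   # two triangles
```

The tolerances matter because depth and normals are stored at finite precision, so exact equality tests would misclassify interior pixels as edges.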
 
If I may ask something here: does anyone know where I can get the oldish PS3.0 PowerVR demos? The Cloth one and the others that were released some time ago and discussed extensively here. I can't seem to find them on their site... they also had a deferred rendering one.
 

Those three demos were published in the ShaderX2 book.
 

Thanks a lot, you answered both questions.
 
Cheers mate, thanks a lot... and sorry for mildly derailing the thread, but it's a tad bit connected... a tad. And since I started already and everyone seems to be in a helpful mood: does anybody know why nVidia decided to be anal-retentive about their Cascades demo and limit its usage to G8x cards with a certain driver version? Anybody know of any circumvention attempts? :) I've toyed with 3DAnalyze, but I'm a VendorID and DeviceID short of all that I need for it, and there still would be the driver issue, but maybe I could engineer the .inf for that.
 