Digital Foundry Article Technical Discussion Archive [2014]

The GPU's rasterizer determines coverage for a triangle in rectangular chunks of pixels. The goal for good utilization is to make sure the batch of pixels that comes out of this stage has as many pixels as possible inside the triangle, since the parts of the rectangle that lie well outside of it can wind up as SIMD lanes that are dead for the whole wavefront. The mechanics of wavefront packing are something of a mystery to me, however.
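For intuition only, a toy model (Python, with made-up packing rules rather than any specific GPU's): assume fragments are shaded in 2x2 quads and quads are packed sixteen to a 64-lane wavefront, so a quad the triangle only grazes still burns all four lanes.

Code:
# Toy model of quad/wavefront utilization; not how real hardware packs
# wavefronts, just an illustration of where dead lanes come from.

def covered(px, py, tri):
    """Sample the pixel centre against the triangle via signed edge functions."""
    (ax, ay), (bx, by), (cx, cy) = tri
    def edge(x0, y0, x1, y1):
        return (px + 0.5 - x0) * (y1 - y0) - (py + 0.5 - y0) * (x1 - x0)
    e0, e1, e2 = edge(ax, ay, bx, by), edge(bx, by, cx, cy), edge(cx, cy, ax, ay)
    return (e0 >= 0 and e1 >= 0 and e2 >= 0) or (e0 <= 0 and e1 <= 0 and e2 <= 0)

def quad_utilization(tri, width, height):
    live_pixels, occupied_lanes = 0, 0
    for qy in range(0, height, 2):          # walk the screen in 2x2 quads
        for qx in range(0, width, 2):
            hits = sum(covered(qx + dx, qy + dy, tri)
                       for dy in (0, 1) for dx in (0, 1))
            if hits:
                live_pixels += hits
                occupied_lanes += 4         # a touched quad occupies all 4 lanes
    return live_pixels, occupied_lanes

# A long, thin triangle wastes far more lanes than a compact one of similar area.
for name, tri in (("thin", [(0, 0), (64, 2), (0, 2)]),
                  ("fat",  [(0, 0), (16, 0), (0, 8)])):
    live, lanes = quad_utilization(tri, 64, 8)
    print(f"{name}: {live} live pixels over {lanes} occupied lanes "
          f"({100 * live / lanes:.0f}% utilization)")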

If the pixels are shaded expensively, the saving from only doing 50% of them probably outweighs the locality cost easily?

That should be the common case, as there's going to be a floor of arithmetic work and memory references that is proportional to the screen's size, not its content.
The heavy-load case is one where the screen is dominated by complex materials, which hopefully require more ALU and memory accesses than the motion-vector check, the motion-vector recalculation, and the multiple reads and writes.
Even if the scene is complex, it's still rendering only 50% of the pixels, so the efficiency would scale well. I'd argue it's even more bang for the buck, because you are then reusing the really expensive pixels from the past at the rather constant cost of the reprojection.
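As a back-of-envelope (all numbers hypothetical), the win approaches the full 50% as the per-pixel shading cost grows relative to the roughly constant reprojection cost:

Code:
# Hypothetical per-pixel costs in arbitrary GPU-time units; the point is the
# ratio, not the absolute values.
def frame_cost(pixels, shade, reproject, interlaced):
    if not interlaced:
        return pixels * shade
    # Shade half the columns, reconstruct the other half by reprojection.
    return (pixels // 2) * shade + (pixels // 2) * reproject

pixels = 1920 * 1080
for shade in (2.0, 8.0, 32.0):       # cheap vs. increasingly expensive materials
    full = frame_cost(pixels, shade, 1.0, interlaced=False)
    half = frame_cost(pixels, shade, 1.0, interlaced=True)
    print(f"shade cost {shade:>4}: interlaced frame costs {half / full:.0%} of full-rate")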

I also realized yesterday that the ugly chart I made was attempting to convey the high-level scene (shaded color space) with regard to the effect of resolution; it seems to have been taken as referring to the geometry, hence the disconnect over how the chart is wrong.
 
If the pixels are shaded expensively, the saving from only doing 50% of them probably outweighs the locality cost easily?
I said as much.
What this does do is impose a different scaling curve for fragment processing efficiency versus triangle size.
The impact of just reducing the horizontal resolution from the starting point of 1080p sounds like it should generally be modest, at least for the level of geometric detail in KZ:SF.

Heavy use of the technique in later titles and/or more aggressive reprojection may have a bearing on how much further geometry can be scaled this gen. The lower bound before wavefront occupancy or pixel-quad inefficiency bites is higher than the output resolution would indicate, which isn't a sign of massive headroom.

Even if the scene is complex, it's still rendering only 50% of the pixels, so the efficiency would scale well. I'd argue it's even more bang for the buck, because you are then reusing the really expensive pixels from the past at the rather constant cost of the reprojection.

It comes down to what tradeoffs come from it, like in what scenarios it breaks down.
That may constrain certain artistic choices or rendering techniques, although it may be more definitively discussed when GG gives details.
If things like fine surface detail or effects with high temporal frequency break the scheme, it would discourage something like a hypothetical Neon Cybercuttlefish in future iterations of the engine.

I'm particularly interested in finding out where the trade-offs are because if they're cheating to get around something at pretty much day 0 for this gen, I want to know what that glass jaw is.
 
I'm particularly interested in finding out where the trade-offs are because if they're cheating to get around something at pretty much day 0 for this gen, I want to know what that glass jaw is.

The tradeoff is that the graphics don't look as good as native 1080p; I thought this was pretty obvious. (no?)

And this is only a gen 1 title; things have always been like that, and we see better products when gen 2 arrives. Most launch titles end up feeling like previous-gen in HD as time goes by. Anyone remember RIIIIIDGE RACEEEERRRR?
 
I want to know what architectural bottleneck they are working around.

This scheme does compromise fragment shader utilization and cuts into the expected room for improvement for things that have to go through the geometry front end.

GCN's geometry capabilities are a known weakness versus the competition in the desktop space, even with its improvement over prior generations, and its tessellation gains are less impressive still, so is this already a pressure point in the very first software of this console generation?

That part of the rendering process isn't sped up by reprojection, so is GG eking out time savings elsewhere?
Is it shaving ms off of the time interval between post-processing steps?
Reprojection needs to run off of complete frames, which serializes things somewhat. We should see no improvement, or even a worsening, in geometry and fragment submission (edit: unless there are time savings from cutting the number of wavefront submissions at that step), and more time taken in post-processing.

That leaves what happens in the middle. Is it saving ALU and texture accesses? The ROP burden isn't clear: half the pixels means half the ROP demand during the rendering phase, but this is Orbis and its 32-ROP architecture. The final composition step requires additional work, which could perhaps be rolled into other parts of the post-process AA or other effects.
Is it the bandwidth?
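For the ROP side at least, a crude floor calculation using the commonly reported Orbis figures (32 color ROPs at 800MHz) suggests raw fill per output pixel isn't obviously the squeeze; real demand is multiplied by G-buffer target count, overdraw and blending, so this is only the floor:

Code:
# Assumed figures: 32 color ROPs at 800 MHz (the widely reported Orbis specs).
rops, clock_hz = 32, 800e6
peak_fill = rops * clock_hz                       # peak pixels written per second

full_1080p_60 = 1920 * 1080 * 60                  # pixels per second at native 1080p60
half_width_60 = 960 * 1080 * 60                   # the interlaced rendering phase

for label, rate in (("1920x1080 @ 60", full_1080p_60),
                    ("960x1080  @ 60", half_width_60)):
    print(f"{label}: {rate / 1e6:.0f} Mpix/s written per layer, "
          f"{rate / peak_fill:.1%} of the {peak_fill / 1e9:.1f} Gpix/s peak")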
 
http://www.eurogamer.net/articles/digitalfoundry-2014-titanfall-ships-at-792p

Titanfall runs at 792p at launch, according to DF. 1408x792 is a true 16:9 resolution, as indicated in this splendid article listing all the possible 16:9 resolutions.

http://pacoup.com/2011/06/12/list-of-true-169-resolutions/
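The list boils down to integer multiples of 16x9; 1408x792 is the n = 88 entry:

Code:
# "True" 16:9 resolutions are integer multiples of 16x9, so both axes stay whole.
for n in range(80, 121, 8):
    print(f"{16 * n}x{9 * n}")   # 1280x720, 1408x792, ... 1920x1080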

I read it today while tweaking my Heroes of Might and Magic V user.cfg file (I set the resolution to 1312x738 windowed, which fits the screen very well while remaining actually windowed; I completely disabled the shadows and turned vsync on so the framerate is capped and the GPU isn't overreaching, and the game runs like a charm; the temps of my Intel HD 3000 GPU don't go overboard).

Respawn is considering the idea of increasing the resolution but they need to get around the quirky aspects of the eSRAM.

Someone tweeted the issue to one of the system's designers at Sony; he said there was nothing he could think of in the system itself that would limit the use of AF, and that it must be something the individual devs are doing. But no one seems to have asked the devs themselves what the issue is.

I have a GTX-680, but even on older cards, the impact of AF was pretty much non-existent. Even cranking it to 16x resulted in maybe 1.5-2fps drop. Every other setting has a significantly more noticeable impact than AF, it's the one option I thought would be a given for every next-gen title, especially considering how jarring it can be to the visual quality when you're not used to it (from being a PC gamer).
Dunno what the big deal is. First of all, even my crappy laptop GPU (Intel HD 3000) handles high AF with no apparent decrease in performance. Secondly, the Xbox One shares the same architecture, and games are usually running at quite a bit lower resolution on X1 than on PS4, so there is a lot of room for improvement in that department on the PS4. Whatever the reason, the AF on the X1 has been really high in every game I have (15 for now, from what I can recall). Ryse is a good example, so are FIFA 14, CoD and BF4, with Forza 5 falling far behind 'cos I don't think it has AF at all.
 
If a major studio can include a bug where the final resolution is wrong in the golden bits...well, just saying.
 
I think some of you guys are getting a bit too far ahead of yourselves here with the possible implications of GG's move.

Remember that the tech was used to keep the general lighting and shading complexity of the single player game while doubling the desired frame rate. Thus they needed to cut fragment processing requirements in half. What they came up with is almost literally that - not rendering every other column, trying to generate the data from previous frames instead.
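Guerrilla hasn't published the exact reconstruction, but a minimal sketch of the general idea (the names, data layout and fallback policy below are my own assumptions) would look something like this: even frames shade even columns, odd frames shade odd columns, and the missing columns are fetched from the previous full frame along the motion vector, with a crude fallback when that sample is unusable.

Code:
# Sketch of column-interlaced reprojection; NOT Guerrilla's actual implementation.
# Motion vectors are assumed snapped to whole pixels for simplicity.

def reconstruct(frame_index, shaded, prev_full, motion, width, height):
    """shaded[x][y] holds only this frame's rendered columns; motion[x][y] is the
    (dx, dy) screen-space motion of the surface visible at the missing pixel."""
    full = [[None] * height for _ in range(width)]
    parity = frame_index & 1                     # which columns were shaded this frame
    for x in range(width):
        for y in range(height):
            if x % 2 == parity:                  # rendered this frame: copy through
                full[x][y] = shaded[x][y]
                continue
            dx, dy = motion[x][y]                # where this surface was last frame
            sx, sy = x - dx, y - dy
            if 0 <= sx < width and 0 <= sy < height and prev_full[sx][sy] is not None:
                full[x][y] = prev_full[sx][sy]   # reuse the previously shaded result
            else:                                # disocclusion / off-screen history:
                nx = x - 1 if x > 0 else x + 1   # fall back to a freshly shaded neighbour
                full[x][y] = shaded[nx][y]
    return full

In the real thing the sampling would be filtered and the disocclusion handling far more involved, but the cost per reconstructed pixel stays roughly constant, which is what the efficiency argument above leans on.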
The tech would not work this well in a 30fps game, partially because of the added latency and partially because of the larger difference between consecutive frames. The 60fps requirement will also significantly cut down the available CPU time for the gameplay itself, restricting the use of the tech to relatively simple games.
I don't see, for example, an Assassin's Creed game going 1080i / 60 fps with this tech, because the open world aspects would have to be cut back significantly to fit into only 16ms of CPU time. Even KZ's single player mode is kept at 30fps instead of trying to use the tech.

As for the efficiency, KZ is a deferred renderer, so the most significant fragment processing parts are hopefully as efficient as they'd be with a full frame, since the heavier shaders are processing stuff in screen space from the G-buffer. The only case where they have to deal with small polygons is the first pass of writing out the render targets.


Oh, and the other option to cut fragment processing requirements would have been to use 720p for the multiplayer - a perfectly valid choice in my opinion. Which is also why it's reasonable to assume that marketing played an important role in the decision.
 
Remember that the tech was used to keep the general lighting and shading complexity of the single player game while doubling the desired frame rate. Thus they needed to cut fragment processing requirements in half. What they came up with is almost literally that - not rendering every other column, trying to generate the data from previous frames instead.
That wouldn't be enough on the face of things, since fragment processing is not 100% of the GPU's frame-time budget; other portions like the geometry front end and the post-fragment stages would have been unchanged or made longer.

Granted, single player at 30 fps tends to hover above 30, and the actual MP frame rate has been measured to be notably lower than 60 fps, so there is some margin of error from both the performance headroom and the shortfall from the ideal.

The reprojection step also apparently leverages data used for the AA solution, so there may be additional savings or at least reduced costs in the later part of the process if the post-process steps can combine some of their work.


As for the efficiency, KZ is a deferred renderer, so the most significant fragment processing parts are hopefully as efficient as they'd be with a full frame, since the heavier shaders are processing stuff in screen space from the G-buffer. The only case where they have to deal with small polygons is the first pass of writing out the render targets.
Wavefront granularity is 64 pixels, so utilization should in theory tend to be better at higher resolution for somewhat similar reasons. However, the evaluation may not be as hard-edged, since pixels may not be as strictly dead or alive as they would be under fragment coverage. The GPU's capacity to coalesce fragments is sort of up in the air as well.
The savings from cutting the heavy shader work in half should be the bigger factor in the common case.
 
That wouldn't be enough on the face of things, since fragment processing is not 100% of the GPU's frame-time budget; other portions like the geometry front end and the post-fragment stages would have been unchanged or made longer.

The other types of GPU stress are probably fine-tuned through the size and complexity of the multiplayer levels and player character models: optimizing the number of texture pages, shader and light complexity, bone influences and so on. It should also be possible to constrain overall scene complexity by limiting the max player count and similar parameters.

Granted, single player at 30 fps tends to hover above 30, and the actual MP frame rate has been measured to be notably lower than 60 fps, so there is some margin of error from both the performance headroom and the shortfall from the ideal.

Yeah that's the other aspect, they didn't entirely succeed ;)

But I still expect the overall use of this tech to remain somewhat limited, seeing how most games are still 30fps and generally CPU limited. Racing titles might benefit more easily - but those tend to be able to run at 60fps without resorting to 1080i anyway. There it's a trade off between fragment quality and overall image quality; for example sebbbi has probably already looked into this seriously ;)

Multiplayer seems to be the most obvious application, where even with the extra frame of latency it's probably still more responsive than 30fps would be, and it can feel every bit as smooth as a true 1080p 60fps game does.
Stereo 3D could also be a good fit, since it likewise requires double the number of frames even if it's still 30fps, and stereo can gain additional benefits from reprojection anyway. Maybe even VR could use it, but the added latency might be a more serious problem there.
 
Er, that's what I've got from the blog post back then, but then again I'm under some stress so my perception might be wrong...
 
They're just making use of the data from previously-rendered frames, so I don't see any reason why it would add extra latency.
 
Digital Foundry Dark Souls 2 face-offs:

http://www.eurogamer.net/articles/digitalfoundry-2014-dark-souls-2-face-off

PS3: better texture filtering + no screen tearing but lower framerates
X360: better framerates & alpha effects but heavy screen tearing

Classical PS3 vs X360 face-off BUT, as expected, a strong blur filter is present:

it's now a full native 1280x720 on both PS3 and 360. On paper this should count as a massive boon to its presentation, but in practice the boosted pixel count only faintly improves image clarity over the original Dark Souls' 1024x720 frame-buffer. A reliance on a post-process edge filter is to blame: it's an effective aliasing-killer, but many highlights in texture-work and alpha effects are dulled, and on both Sony and Microsoft platforms alike the game produces a softer image than we'd hoped.
What is the point of using a better resolution if you wreck the assets with a blur filter?

Why do they keep doing this? Because they are so afraid of "jaggies" in the previews?
 
Dunno what the big deal is. First of all, even my crappy laptop GPU (Intel HD 3000) handles high AF with no apparent decrease in performance.

That's because the bottlenecks are elsewhere.

Console developers strive to use as much of everything as they can, as performance left on the table is a waste. Turning up the AF won't be free if memory BW and TMUs are already being fully utilised.
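Rough worst-case texel counts per shaded pixel illustrate why (hardware adapts the probe count to the actual anisotropy, so treat these as upper bounds rather than typical costs):

Code:
# Upper-bound texel fetches per pixel; real GPUs reduce the probe count when the
# surface isn't actually that anisotropic, which is why AF often looks "free".
bilinear_texels = 4
trilinear_texels = 2 * bilinear_texels            # two mip levels
for af in (1, 2, 4, 8, 16):
    worst_case = af * trilinear_texels            # up to 'af' trilinear probes
    print(f"{af:>2}x AF: up to {worst_case} texels per pixel")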
 
Digital Foundry Dark Souls 2 face-offs:

http://www.eurogamer.net/articles/digitalfoundry-2014-dark-souls-2-face-off

PS3: better texture filtering + no screen tearing but lower framerates
X360: better framerates & alpha effects but heavy screen tearing

Classical PS3 vs X360 face-off BUT, as expected, a strong blur filter is present:

What is the point of using a better resolution if you wreck the assets with a blur filter?

Why do they keep doing this? Because they are so afraid of "jaggies" in the previews?
They should make a feature about the massive downgrades in the final game compared to the reveal and press demos:

http://www.neogaf.com/forum/showthread.php?t=781625

:devilish:

PS3 version, January 2014
From 3:49 to 5:08 (no spoilers)
From 2:22 to 2:37, torch gameplay (no spoilers)

PS3 version, March 2014
From 6:15 to 9:40 (no spoilers)

 
I don't own the game, but I get the sense from the hours of video I've watched that From Soft had some really ambitious ideas about how light would be integral to the gameplay experience that simply didn't work out. Whether it was a performance thing, or the need to carry torches around just made it too difficult and threw the gameplay out of balance, they made a decision to cut most of that aspect out. You can still see the vestiges of it, including the torch, the braziers everywhere, and even boss fights with light-related options, but it isn't what was intended anymore.

Reminds me of Far Cry 2, which was supposed to have a really sophisticated faction system that didn't survive to the shipping version. It's too bad, but I don't think there's any malice involved. At the end of the day any cuts they made were probably in the interest of shipping the best, most functional game they felt possible. Considering Demon's Souls and Dark Souls were never sold on their graphical prowess in the first place, it feels a bit petty to make graphics the focus of outrage about Dark Souls 2.
 