Game development presentations - a useful reference

Lionhead talks about their dynamic global illumination in Fable Legends.

http://www.lionhead.com/blog/2014/april/17/dynamic-global-illumination-in-fable-legends/

Here are two shots from the blog. There are a couple more shots comparing scenes with and without dynamic GI in the blog as well.

Arch.jpg

Chest.jpg
 
Aggregate G-Buffer Anti-Aliasing

We present Aggregate G-Buffer Anti-Aliasing (AGAA), a new technique for efficient anti-aliased deferred rendering of complex geometry using modern graphics hardware.

In geometrically complex situations, where many surfaces intersect a pixel, current rendering systems shade each contributing surface at least once per pixel. As the sample density and geometric complexity increase, the shading cost becomes prohibitive for real-time rendering. Under deferred shading, so does the required framebuffer memory. AGAA uses the rasterization pipeline to generate a compact, pre-filtered geometric representation inside each pixel. We then shade this at a fixed rate, independent of geometric complexity. By decoupling shading rate from geometric sampling rate, the algorithm reduces the storage and bandwidth costs of a geometry buffer, and allows scaling to high visibility sampling rates for anti-aliasing.

AGAA with 2 aggregate surfaces per-pixel generates results comparable to 8x MSAA, but requires 30% less memory (45% savings for 16x MSAA), and is up to 1.3x faster.

Video results: http://graphics.cs.williams.edu/pape...5Aggregate.mp4
Download mirror: https://mega.co.nz/#!XwAxAawC!bdGTV5...FoGmHfzjbua2u4

http://graphics.cs.williams.edu/papers/AggregateI3D15/Crassin2015Aggregate.pdf
 
Jeff Preshing : How Ubisoft Montreal Develops Games For Multicore – Before and After C++11

At Ubisoft Montreal, we exploit multicore by building our game engines on top of three common threading patterns.
To implement those patterns, we need to write a lot of custom concurrent objects.
When a concurrent object is under heavy contention, we optimize it using atomic operations.
Game engines have their own portable atomic libraries. These libraries are similar to the C++11 atomic library’s “low level” functionality.
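To give a rough idea of what "optimizing a contended concurrent object with atomic operations" can look like in C++11, here is a minimal sketch. It is not taken from the talk: the `IndexDispenser` class and its `claim` method are made-up names for illustration, and the choice of relaxed memory ordering assumes the caller synchronizes in some other way (for example by joining the worker threads) before reading the results.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical example type (not from the talk): hands out work item
// indices to worker threads without taking a mutex.
class IndexDispenser {
public:
    explicit IndexDispenser(std::size_t count) : count_(count) {}

    // Writes a unique index in [0, count) to `out` and returns true while
    // any indices remain; returns false once all of them are handed out.
    bool claim(std::size_t& out) {
        // fetch_add is a single atomic read-modify-write, so two threads
        // can never receive the same value. Relaxed ordering is enough
        // here because only the counter itself is shared through it.
        std::size_t i = next_.fetch_add(1, std::memory_order_relaxed);
        if (i >= count_) return false;
        out = i;
        return true;
    }

private:
    std::atomic<std::size_t> next_{0};
    const std::size_t count_;
};
```

Each worker would loop on `claim()` and process the index it gets back; a mutex-protected counter would behave the same way but serialize the workers whenever they contend for the lock.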

http://preshing.com/20141024/my-multicore-talk-at-cppcon-2014/

PDF: https://github.com/CppCon/CppCon2014/blob/master/Presentations/How Ubisoft Montreal Develops Games for Multicore/How Ubisoft Montreal Develops Games for Multicore - Before and After C++11 - Jeff Preshing - CppCon 2014.pdf?raw=true
 
I heard on the Player One Podcast that the developers of the iOS hit Crossy Road were showing their graphs - which is fairly typical - but were also completely open about the money they made and how (10 million in the first 90 days), which is far less common. So it could be interesting to look at!
 
This might be more for actual industry people, but I think it's still the best thing I've seen in years. My head is still spinning and I'm probably not gonna get much sleep tonight...

 
Amazing, thanks for sharing this. Additionally, I had never heard of Quixel.
 
http://gdcvault.com/play/1022186/Parallelizing-the-Naughty-Dog-Engine


Video of the GDC 2015 talk about parallelizing the Naughty Dog engine with fibers.

Great stuff, my summary:

- All code is jobified (yes, all) except slow I/O, which runs on regular system threads. Any job can yield to other jobs in the middle of execution
- Up to 160 fibers (fiber = partial thread + stack) can be in use at the same time
- PROBLEM: 3 months before the TLOUR release they were running at ~25fps and heavily CPU bound: because of locks, the cores were far from 100% used. (GPU not a problem at all)
- SOLUTION: to use nearly 100% of the cores with jobs (as they apparently did, judging by their graphs), cut one frame into 3 parts (game logic, rendering logic, GPU exec) and work on 3 consecutive frames simultaneously during the 16.6 ms frame time: see 33:17 in the video
- Frame-centric design to simplify working on several frames simultaneously; new concept of frames with uncontended resources. Up to 16 frames tracked (only states, not data)
- PROBLEM: this eats a lot of memory! They were quickly running out of memory... :runaway:
- SOLUTION: Tagged heap using only 2 MB blocks, very technical here...
- INPUT LAG: ~66ms? (3 x 16.6 ms frame time, so about 50 ms, + scan-out; not sure here, it's at 54:25 if someone wants to help me), but still shorter input lag than TLOU on PS3
- Code your own fibers library! 5 or 6 functions max. "I am not a fan of PS4 fibers library...do your own fibers library" :LOL:
- They paid no attention at all to cache coherency ("sounds weird coming from us" ;)); they were 100% focused on keeping the cores busy. That is similar to the Infamous SS devs' complaints about the CPU: they also had trouble keeping the cores busy, but apparently Naughty Dog found better solutions.
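Since the tagged heap point is the vaguest one in the summary, here is a rough sketch of the general idea as I understand it: memory is handed out only in fixed 2 MB blocks, each block is labelled with a tag (for instance one tag per frame stage), and everything belonging to a tag is released in one call. This is an assumption-laden illustration, not Naughty Dog's code: the `TaggedHeap` class, its method names and the integer tags are invented here, and the sketch is single-threaded, whereas a real engine version would have to be thread-safe.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <unordered_map>
#include <utility>
#include <vector>

// 2 MB block size, as mentioned in the summary above.
constexpr std::size_t kBlockSize = 2 * 1024 * 1024;

// Hypothetical, single-threaded sketch of a tagged heap.
class TaggedHeap {
public:
    // Reserves `blockCount` blocks up front so nothing is allocated later.
    explicit TaggedHeap(std::size_t blockCount) {
        for (std::size_t i = 0; i < blockCount; ++i)
            free_.push_back(std::make_unique<std::uint8_t[]>(kBlockSize));
    }

    // Hands one whole block to `tag`; returns nullptr if the pool is empty.
    void* allocBlock(std::uint32_t tag) {
        if (free_.empty()) return nullptr;
        Block block = std::move(free_.back());
        free_.pop_back();
        void* memory = block.get();
        owned_[tag].push_back(std::move(block));
        return memory;
    }

    // Returns every block owned by `tag` to the pool in a single call, so
    // individual allocations never need to be tracked or freed one by one.
    void freeTag(std::uint32_t tag) {
        auto it = owned_.find(tag);
        if (it == owned_.end()) return;
        for (Block& block : it->second) free_.push_back(std::move(block));
        owned_.erase(it);
    }

    std::size_t freeBlocks() const { return free_.size(); }

private:
    using Block = std::unique_ptr<std::uint8_t[]>;
    std::vector<Block> free_;                                  // unused blocks
    std::unordered_map<std::uint32_t, std::vector<Block>> owned_; // per tag
};
```

With something like this, each in-flight frame stage could allocate under its own tag and call `freeTag` once that stage of that frame is finished, which keeps the memory cost of having several frames in flight bounded by the block pool.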
 