Optimizations on Xbox 360

It's pretty amazing what developers got both PS3 and 360 to do over time. Just look at any of the major game series.
 
I know that the X360 GPU has 24576 128-bit registers. But here is my question:
Is that good? Is it neutral? And why? What were the benefits of that?
 
EDRAM size didn't play so nicely with deferred rendering. Worth reading this thread: https://forum.beyond3d.com/threads/xenos-c1-and-deferred-rendering-g-buffer.42770/page-2
Many devs giving many opinions.
We used a lightweight g-buffer. Our g-buffer had two 32 bit color buffers (rgba8 + rgb10a2) and 24f+8 depth-stencil (reverse float depth). In total this is 12 bytes per pixel. In 10 MB you can fit either 1280x672 or 1152x720 g-buffer target. These resolutions were pretty common on Xbox 360. Both are around 10% sub-HD.
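The arithmetic above can be checked with a short sketch. It assumes "10 MB" means 10 MiB (10,485,760 bytes) and that the three targets take exactly 4 bytes per pixel each with no padding or tile alignment; the function names are made up for illustration.

```python
# Assumption: "10 MB" of EDRAM means 10 MiB.
EDRAM_BYTES = 10 * 1024 * 1024

# rgba8 (4 bytes) + rgb10a2 (4 bytes) + 24f+8 depth-stencil (4 bytes)
BYTES_PER_PIXEL = 4 + 4 + 4


def gbuffer_bytes(width, height, bytes_per_pixel=BYTES_PER_PIXEL):
    """Total size of all g-buffer targets, ignoring any alignment or padding."""
    return width * height * bytes_per_pixel


def fits_in_edram(width, height):
    """True if the whole g-buffer fits in EDRAM at once."""
    return gbuffer_bytes(width, height) <= EDRAM_BYTES


def max_height_for_width(width, bytes_per_pixel=BYTES_PER_PIXEL):
    """Largest height that still fits for a given width (no alignment applied)."""
    return EDRAM_BYTES // (width * bytes_per_pixel)


if __name__ == "__main__":
    for w, h in [(1280, 672), (1152, 720), (1280, 720)]:
        print(f"{w}x{h}: {gbuffer_bytes(w, h)} bytes, fits={fits_in_edram(w, h)}")
```

Under these assumptions the two resolutions from the post fit and full 1280x720 does not.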

With that setup you can fit: rgb8 albedo, specular (8b), xy normal (Lambert azimuthal encoded) (10b+10b), roughness (10b), lighting mode (2b). The stencil buffer (8b) can be used to encode additional per-object properties. This kind of lightweight g-buffer wasn't a direct quality trade-off (there were enough bits for everything to avoid banding), but it did limit what you could do quite a lot.
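To make the bit budget concrete, here is a rough sketch of how those fields could be packed into the two 32-bit color targets. The channel assignment and bit order are my assumption (lowest channel in the lowest bits), not necessarily the layout the game used, and the normal is assumed to be already encoded to two values in [0, 1]. All function names are hypothetical.

```python
def quantize(value, bits):
    """Map a float in [0, 1] to an unsigned integer code of the given width."""
    max_code = (1 << bits) - 1
    clamped = min(max(value, 0.0), 1.0)
    return int(round(clamped * max_code))


def pack_rgba8(albedo_rgb, specular):
    """rgba8 target: 8-bit albedo in r, g, b and 8-bit specular in alpha."""
    r, g, b = (quantize(c, 8) for c in albedo_rgb)
    a = quantize(specular, 8)
    return r | (g << 8) | (b << 16) | (a << 24)


def pack_rgb10a2(normal_xy, roughness, lighting_mode):
    """rgb10a2 target: 10b normal x, 10b normal y, 10b roughness, 2b mode."""
    if not 0 <= lighting_mode <= 3:
        raise ValueError("lighting mode must fit in 2 bits")
    nx, ny = (quantize(c, 10) for c in normal_xy)
    rough = quantize(roughness, 10)
    return nx | (ny << 10) | (rough << 20) | (lighting_mode << 30)


def unpack_rgb10a2(word):
    """Return (normal x, normal y, roughness, lighting mode) as integer codes."""
    return (word & 0x3FF, (word >> 10) & 0x3FF, (word >> 20) & 0x3FF, word >> 30)
```

Every field lands in exactly one of the two 32-bit words, which is why nothing else can be added without growing past 12 bytes per pixel.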

We used virtual texturing and that avoided some limitations, as we could blend decals directly to the virtual texture cache instead of storing data to g-buffer. My tech interview from 2012 has some tidbits: http://www.eurogamer.net/articles/digitalfoundry-trials-evolution-tech-interview
I did our shader register optimizations for Xbox 360. In general, GPUs back then didn't have big problems with register storage. However you could still get some gains by optimizing them. I remember seeing 5%+ gains after micro-optimizations in many cases. Both vertex and pixel shaders used the same registers on Xbox 360 (the first GPU with unified shader model). You had to ensure that you have enough parallelism on both shader types. So often you had to optimize vertex shaders to get more pixel shaders to run concurrently. Skinning was pretty difficult to get right. We chose to avoid skinning perf problems by calculating tangent frame at runtime from screen space derivatives. This made the vertex shader much lighter. But otherwise, I don't remember any big bottlenecks.

On modern GPUs (both AMD and Nvidia) the registers are more often a bottleneck. You need to think about the register pressure all the time.
 
Is there any information about how many registers other GPUs had, especially around the time the X360 was released?
Maybe you know how many registers were in RSX?
Thank you for the answer, too.
 
I also think the Xbox 360 was pushed to almost look like a next-gen console by the end of its life. Lately I tried Quake 4 (2005?) and compared it to Crysis 3, or to a different type of game like GTA5, and the graphical differences are impressive. Even the electrical power usage is different: those later games draw something like 80 watts on a last-revision single-chip 360, about 10 watts more than the first game during gameplay.
 

The "Xbox 360 games look like Xbox games in HD" complaint I heard a lot in 2006 was quite applicable until Gears of War came out. Quake 4 could run on some really paltry PC hardware, along with so many other early games of that generation. The blitz of 2007 also helped to dispel any notions of the 360 being a simple visual upgrade of the Xbox. I loved the System Wars discussions on Gamespot at the time.

Another thing worthy of note is how Xenos really was the bridge between the dedicated-shader vector + scalar architecture of the Radeon X1000 series and the fully superscalar unified-shader Radeon HD 2000 series. It almost seems that Xenos was really just made up of X1000 pixel shaders enabled to handle vertex shaders as well.
 
Xenos seems to have been the way to go at the time to build an efficient GPU that targeted lower resolutions and also had a good feature set. It seems to have been a better approach for this than RSX.
 
The only problem for reaching native 720p was the limited 10 MB of EDRAM. Games were around 5%-10% sub-HD (8+4 byte render target). Native 720p would have been common if we had had 1 MB more EDRAM. You would of course have seen 5%-10% worse pixel quality to compensate. Many people had their Xbox 360 output set to 720p, meaning that you got two scaling operations (one from slightly sub-HD to 720p and one from 720p to 1080p by the HDTV). This was especially true if you had the older model without HDMI (output options were 1080i and 720p). Otherwise the Xbox 360 GPU was very good. Unified shaders, fast branching, memexport, etc.
 
I don't remember the numbers anymore. The Xbox 360 GPU was a unified shader GPU. Vertex and pixel shaders shared the register files (just like they do on all current GPUs). RSX was an older design (GeForce 7 series based) with separate vertex and pixel shader hardware. On Xbox 360 you often wanted to optimize your vertex shader (especially skinning) to reduce register usage and achieve more concurrency for the pixel shader. You could also limit how many registers are allocated to VS/PS, allowing more concurrency for the other shader type. We changed register allocation based on object distance. IIRC our g-buffer pass was split into four phases with different register allocations. You need more pixel shader concurrency close to the camera (as the pixel : vertex ratio favors pixels), but far away objects have a higher vertex : pixel ratio, so a different register split performs better.
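A toy model of that trade-off, as I understand it: with a fixed register file, the number of threads a shader type can keep in flight is roughly its share of the registers divided by the registers each thread needs. The 24576 figure is the one quoted earlier in the thread; the per-thread register counts and the split values below are invented for illustration, and real hardware has other limits (scheduling granularity, interpolants) that this ignores.

```python
# Register file size quoted earlier in the thread (128-bit registers).
TOTAL_REGISTERS = 24576


def threads_in_flight(allocated_registers, registers_per_thread):
    """Simplified occupancy: how many threads fit in the allocated registers."""
    return allocated_registers // registers_per_thread


def split_concurrency(vs_share, vs_regs_per_thread, ps_regs_per_thread,
                      total=TOTAL_REGISTERS):
    """Return (VS threads, PS threads) for a given VS share of the register file."""
    vs_alloc = int(total * vs_share)
    ps_alloc = total - vs_alloc
    return (threads_in_flight(vs_alloc, vs_regs_per_thread),
            threads_in_flight(ps_alloc, ps_regs_per_thread))


if __name__ == "__main__":
    # Hypothetical shaders: VS needs 16 registers per thread, PS needs 8.
    print("near camera, 25% to VS:", split_concurrency(0.25, 16, 8))
    print("far away,    50% to VS:", split_concurrency(0.50, 16, 8))
    # Trimming the VS to 12 registers raises VS concurrency at the same split.
    print("near camera, lighter VS:", split_concurrency(0.25, 12, 8))
```

In this model a smaller VS share leaves more pixel threads in flight for near objects, and a lighter vertex shader lets you hold the same VS concurrency with fewer registers.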

Skinning was painful to get right. We used pixel shader derivative tangents to reduce register pressure on vertex shader (and to reduce VS->PS interpolants, which also often were a limiting factor to performance). I mention this + some other technical details in Digital Foundry's Trials Evolution technical interview: http://www.eurogamer.net/articles/digitalfoundry-trials-evolution-tech-interview.
 
Could you elaborate more on this, please? :oops:

I believe:
There is a finite amount of GPU power. The EDRAM size meant many games used a framebuffer ~10% smaller than full 720p so that it all fit within this fast RAM. It was not directly a GPU performance issue.

If the EDRAM had been 1 MB larger, deferred rendering targets at 720p would have fit and could have been used, but since there would then be ~10% more pixels, there would be ~10% less GPU resources available per pixel. You cannot increase the workload and still get the same level of work done over the larger area.
 
More pixels to process -> less performance. Compensate by using lower quality shaders etc.
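As a quick sanity check of the "~10%" figure, here is a sketch that assumes the total GPU work per frame is fixed and spread evenly over the pixels, which is a simplification. The two sub-HD resolutions are the ones mentioned earlier in the thread; the function name is made up.

```python
FULL_720P = (1280, 720)


def per_pixel_budget_ratio(sub_hd, full=FULL_720P):
    """Fraction of the per-pixel budget left when going from sub_hd to full res.

    Assumes a fixed total budget divided evenly across all pixels.
    """
    return (sub_hd[0] * sub_hd[1]) / (full[0] * full[1])


if __name__ == "__main__":
    for res in [(1152, 720), (1280, 672)]:
        ratio = per_pixel_budget_ratio(res)
        print(f"{res[0]}x{res[1]} -> 1280x720: "
              f"{1 - ratio:.1%} less budget per pixel")
```

So moving to native 720p costs roughly 7% to 10% of the per-pixel budget under this assumption, in line with the 5%-10% range given above.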


Well that was a pretty simple explanation on that. Sometimes I need to think a little harder lol. I was stumped for a while on what sebbbi meant right there.

Appreciate the answers gents. :yes:
 