Xbox One (Durango) Technical hardware investigation

Status
Not open for further replies.
I was thinking: it's not possible to use ESRAM with classic deferred rendering engines, but more modern tile-based deferred engines can do the magic. They use screen tiles to group lights, with tight per-tile frusta to cull non-intersecting lights (reducing the number of lights to consider).

Am I right?

This was discussed by Lauritzen at SIGGRAPH 2010.

Is it possible to tile in a fast 32 MB?
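
For anyone who wants to see the light-grouping idea in concrete form, here is a minimal sketch of per-tile light culling. It is a 2D simplification and not the method from the SIGGRAPH talk: a real tiled renderer builds a 3D frustum per tile (usually with min/max depth) in a compute shader, whereas this just tests screen-space circles against tile rectangles. The `Light` class, the `cull_lights` function and the 16-pixel tile size are all assumptions made up for illustration.

```python
from dataclasses import dataclass

TILE = 16  # tile size in pixels (assumed for illustration)


@dataclass
class Light:
    x: float       # screen-space centre, in pixels
    y: float
    radius: float  # screen-space radius of influence, in pixels


def circle_hits_rect(light, x0, y0, x1, y1):
    """True if the light's circle overlaps the rectangle [x0, x1] x [y0, y1]."""
    # Closest point on the rectangle to the circle centre.
    cx = min(max(light.x, x0), x1)
    cy = min(max(light.y, y0), y1)
    dx, dy = light.x - cx, light.y - cy
    return dx * dx + dy * dy <= light.radius * light.radius


def cull_lights(width, height, lights, tile=TILE):
    """Return {(tile_x, tile_y): [light indices]} for every screen tile."""
    tiles_x = (width + tile - 1) // tile
    tiles_y = (height + tile - 1) // tile
    result = {}
    for ty in range(tiles_y):
        for tx in range(tiles_x):
            x0, y0 = tx * tile, ty * tile
            x1, y1 = min(x0 + tile, width), min(y0 + tile, height)
            result[(tx, ty)] = [
                i for i, light in enumerate(lights)
                if circle_hits_rect(light, x0, y0, x1, y1)
            ]
    return result


lights = [Light(8, 8, 4), Light(40, 16, 10)]
tiles = cull_lights(64, 32, lights)
print(tiles[(0, 0)])  # → [0]
```

Each tile then only shades with its own short light list instead of every light in the scene, which is the "reducing the number of lights to consider" part.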

Something like Talisman?
 
Something like Talisman?

https://docs.google.com/viewer?a=v&...JFXsxn&sig=AHIEtbS_r2L5mCGqmiqrh6lhKuzh7yx5vQ

http://visual-computing.intel-resea...lauritzen_deferred_shading_siggraph_2010.pptx



and for the extra special sauce:

jKZaIvgkyoCnB.png


juDIGH39eRjoU.png


jrMfsy4G45uFT.png


I've been told by multiple sources I trust that there's still a bit of Orbis info yet to be determined, and quite a bit of Durango info that hasn't surfaced. Durango's CPU in particular is appreciably different than Orbis.
 
I was thinking: it's not possible to use ESRAM with classic deferred rendering engines, but more modern tile-based deferred engines can do the magic. They use screen tiles to group lights, with tight per-tile frusta to cull non-intersecting lights (reducing the number of lights to consider).

Am I right?

This was discussed by Lauritzen at SIGGRAPH 2010.

Is it possible to tile in a fast 32 MB?

Why use a deferred engine to begin with? We've seen that Forward+ is a viable alternative on DX11 hardware to render lots of lights without the drawbacks of deferred lighting (memory consumption, transparent materials, BRDF diversity...)
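
To put a rough number on the memory-consumption point against the 32 MB figure mentioned earlier, here is a back-of-the-envelope sketch. The G-buffer layout (four 32-bit render targets plus a 32-bit depth buffer, 20 bytes per pixel) is purely an assumption for illustration, as is treating "32 MB" as 32 MiB; real engines pack their G-buffers in many different ways.

```python
MIB = 1024 * 1024
FAST_MEMORY_BYTES = 32 * MIB  # the "fast 32 MB", assumed to mean 32 MiB

# Hypothetical layout: 4 render targets at 4 bytes each + 4 bytes of depth.
BYTES_PER_PIXEL = 4 * 4 + 4


def gbuffer_bytes(width, height, bytes_per_pixel=BYTES_PER_PIXEL, samples=1):
    """Total G-buffer footprint in bytes for the given resolution."""
    return width * height * bytes_per_pixel * samples


for name, (w, h) in {"720p": (1280, 720), "1080p": (1920, 1080)}.items():
    size = gbuffer_bytes(w, h)
    verdict = "fits" if size <= FAST_MEMORY_BYTES else "does not fit"
    print(f"{name}: {size / MIB:.1f} MiB, {verdict} in 32 MiB")
```

Under these assumptions a full 1080p G-buffer is a bit under 40 MiB, so it would not fit in one piece, while the same layout at 720p would. A thinner layout or splitting the frame changes the picture.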
 
AndyH (who has always come with true insider information) wrote that he had heard something not very clear about the FPU units in Durango's Jaguar cores. So, after reading a lot of you talk about a compute figure not counted in the leaked diagram, I will make my bet: the CPU in Durango is way beefier than in Orbis, with its FPU count increased and customized to handle physics (MS already did this with the Xenon cores in the 360; they even included dot product instructions in the VMX FPU units). This is the FLOP increase that is not in the leaked diagram, and since Orbis already counts it in its 1.84 TFLOPS figure, it would narrow their theoretical computing power gap.
 
By the way, Aegis on NeoGAF confirms no ROPs in ESRAM, that the CPUs in Orbis and Durango are very different, and that the move engines increase the bandwidth to memory by compressing data. Aegis also puts emphasis on the fact that the units in the GPU are called shaders, not CUs, in MS documents... Efficiency comes from memory access and management of the SIMDs in the GPU.

I am starting to think they went back to VLIW-4 instead of next-next-gen shaders.
 
If there is an approach similar to Xenon, where they added vectorized units to get the desired FLOP performance, then that would explain the 4 CUs dedicated to CPU work like physics and animation in Orbis. If that's the case, and there is more performance in Durango's CPU, then these CUs in Orbis are most likely dedicated to making up for that difference.
 
By the way, Aegis on NeoGAF confirms no ROPs in ESRAM, that the CPUs in Orbis and Durango are very different, and that the move engines increase the bandwidth to memory by compressing data. Aegis also puts emphasis on the fact that the units in the GPU are called shaders, not CUs, in MS documents... Efficiency comes from memory access and management of the SIMDs in the GPU.

I think those extra units are the very same as CU modules for GPGPU computation. I've heard from others that the CPU and the GPU are customized in a lot of ways, and Aegis finally confirms this. It seems to me that Microsoft is heading toward a complex/elegant solution rather than some PC-style brute force.
 
AndyH (who has always come with true insider information) wrote that he had heard something not very clear about the FPU units in Durango's Jaguar cores. So, after reading a lot of you talk about a compute figure not counted in the leaked diagram, I will make my bet: the CPU in Durango is way beefier than in Orbis, with its FPU count increased and customized to handle physics (MS already did this with the Xenon cores in the 360; they even included dot product instructions in the VMX FPU units). This is the FLOP increase that is not in the leaked diagram, and since Orbis already counts it in its 1.84 TFLOPS figure, it would narrow their theoretical computing power gap.

Hey! I remember that months ago, rumors placed Xbox 3 with a beefier CPU and PS4 with a beefier GPU...
 
After seeing Nvidia PhysX, I wouldn't put my money on these kinds of FLOPs being very capable at physics calculations.

You haven't seen anything close to what PhysX is fully capable of, and aside from that, it's not comparable in the first place. PhysX uses a discrete GPU; Orbis will likely be more of an HSA setup. I'm assuming Durango would be even more efficient if all that SIMD capability was in the CPU itself.
 
By the way, Aegis on NeoGAF confirms no ROPs in ESRAM.

Damn. There goes special sauce variety #1: the classic taste of high fillrate Xbox 360.

Having a fast way to test and update Z and transparency just makes so much sense on a BW constrained system.
 
If they had a 256 wide AVX2 unit per core then it'd go a long way. The Durango CPU would sport 410 GFLOPs.
That would also need wider load/store paths.
Additionally, some people have suggested things like 2 FMA units (hence doubling theoretical FLOPs again). For that to work, you'd need additional register read ports. And even then it would most likely be a waste with nearly all possible workloads, since you'd still only have one load (and one store) port, which seems _very_ insufficient. So that means another AGU/load unit. At that point the CPU wouldn't really look like a close relative of Jaguar anymore (more like a next-gen descendant of it).
Honestly, I don't really see the need for more FPU power in the CPU part at all. With possibly closer proximity of the CPU and GPU (hopefully with some way to access the ESRAM from both), you could instead run more workloads on the GPU part (GPU FLOPs should still be cheaper, after all).
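
The FLOP arithmetic behind these figures can be sketched as below. Treat it as a rough calculation under stated assumptions, not a description of the actual hardware: the 8 cores come from the leaks discussed in this thread, but the 1.6 GHz clock is an assumed value, and the "stock" configuration (two 128-bit pipes, one add and one multiply, no FMA) is my understanding of a standard Jaguar core rather than anything confirmed for Durango. The `peak_gflops` helper is hypothetical.

```python
def peak_gflops(cores, ghz, simd_bits, pipes, fma):
    """Theoretical single-precision peak: cores x clock x FLOPs per cycle."""
    lanes = simd_bits // 32            # 32-bit floats per SIMD register
    flops_per_lane = 2 if fma else 1   # an FMA counts as multiply + add
    return cores * ghz * lanes * flops_per_lane * pipes


CORES = 8   # from the leaks discussed in the thread
GHZ = 1.6   # assumed clock, not stated in the thread

configs = {
    "stock (2 x 128-bit, add + mul)": dict(simd_bits=128, pipes=2, fma=False),
    "one 256-bit FMA unit":           dict(simd_bits=256, pipes=1, fma=True),
    "two 256-bit FMA units":          dict(simd_bits=256, pipes=2, fma=True),
}

for name, cfg in configs.items():
    print(f"{name}: {peak_gflops(CORES, GHZ, **cfg):.1f} GFLOPS")
```

Under these assumptions the three configurations come out at roughly 102, 205 and 410 GFLOPS, so the 410 figure corresponds to 32 single-precision FLOPs per cycle per core. A different clock assumption would shift all of these numbers, and none of it says anything about whether the load/store paths could keep such units fed.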
 
Why use a deferred engine to begin with? We've seen that Forward+ is a viable alternative on DX11 hardware to render lots of lights without the drawbacks of deferred lighting (memory consumption, transparent materials, BRDF diversity...)

Forward+ has its own set of drawbacks (mandatory prepass, poorer quad utilization during lighting, very inefficient transparency lighting, etc.). There have been a few studies already showing that a tiled-deferred renderer consistently outperforms tiled forward (at any MSAA level on Nvidia and up to 2x MSAA on AMD GPUs).
Given the likely increase in vertex counts for next-gen targets, Forward+ will tend to run even slower due to the poor quad utilization.
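
The quad utilization point can be illustrated with a small sketch. GPUs shade pixels in 2x2 quads, so any quad a triangle touches costs four pixel-shader lanes even if only one pixel is actually covered. The toy rasterizer below is an assumption-laden illustration (it samples pixel centres, treats edges as inclusive and ignores the top-left fill rule that real hardware uses), and `quad_utilization` is a made-up helper name.

```python
def edge(ax, ay, bx, by, px, py):
    """Signed area test: which side of edge a->b the point p lies on."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)


def quad_utilization(tri, width, height):
    """Return (covered pixels, shaded lanes, utilization) for one triangle."""
    (ax, ay), (bx, by), (cx, cy) = tri
    area = edge(ax, ay, bx, by, cx, cy)
    if area == 0:
        return 0, 0, 0.0  # degenerate triangle, nothing rasterized
    covered = 0
    quads = set()
    for y in range(height):
        for x in range(width):
            px, py = x + 0.5, y + 0.5  # sample at the pixel centre
            w0 = edge(bx, by, cx, cy, px, py)
            w1 = edge(cx, cy, ax, ay, px, py)
            w2 = edge(ax, ay, bx, by, px, py)
            if area > 0:
                inside = w0 >= 0 and w1 >= 0 and w2 >= 0
            else:
                inside = w0 <= 0 and w1 <= 0 and w2 <= 0
            if inside:
                covered += 1
                quads.add((x // 2, y // 2))  # the 2x2 quad owning this pixel
    shaded = 4 * len(quads)  # every touched quad runs all four lanes
    return covered, shaded, (covered / shaded if shaded else 0.0)


tiny = quad_utilization(((0, 0), (1.5, 0), (0, 1.5)), 16, 16)
large = quad_utilization(((0, 0), (16, 0), (0, 16)), 16, 16)
print("tiny triangle :", tiny)
print("large triangle:", large)
```

The tiny triangle covers a single pixel but still occupies a whole quad, so only a quarter of the lanes do useful work, while the large triangle keeps most of its lanes busy. That is why a renderer that does its heavy lighting in the geometry pass suffers more as triangles shrink.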
 
You haven't seen anything close to what PhysX is fully capable of, and aside from that, it's not comparable in the first place. PhysX uses a discrete GPU; Orbis will likely be more of an HSA setup. I'm assuming Durango would be even more efficient if all that SIMD capability was in the CPU itself.

But if we give credit to VGLeaks, the parts that talk about the CPUs are very similar, with perhaps more detail in the PS4 part, and there's no comment that leads us to believe Durango uses anything but Jaguar as well.
 
But if we give credit to VGLeaks, the parts that talk about the CPUs are very similar, with perhaps more detail in the PS4 part, and there's no comment that leads us to believe Durango uses anything but Jaguar as well.

In the Durango leak it says the CPU is 8 x86 cores, and in the Orbis one it says 8 Jaguar cores...
 