PlayStation 4 (codename Orbis) technical hardware investigation (news and rumours)

Status
Not open for further replies.
I'm wondering that. It's the only choice that makes sense to me - optimise the CUs for GPGPU work rather than graphics. Don't know what sort of optimisations that'd be though.
Perhaps they will convert the triangle culling library from ps3, should help with small triangle problem. ;)
 
That's what I think, and it makes the comments from llerre and other people in the know clearer now. They have been saying that it is a wash between both consoles, which I couldn't initially reconcile with the difference in flops. Almost everything in both is similar, including the audio, video, zlib compressors and the CPU. Now it seems the GPUs are closer than we thought, given that Durango might also have dedicated compute units. The only difference is their memory architecture.

Well, it's maybe closer to a wash in terms of rendering, depending on if they're both using the same GPU architecture and the differences in the memory architectures. On the CPU side, I'd say this gives Orbis a significant advantage. I'm still not sure why using these CUs for rendering would be a "minor boost" as the leak suggests.
 
I'm wondering that. It's the only choice that makes sense to me - optimise the CUs for GPGPU work rather than graphics. Don't know what sort of optimisations that'd be though.

But what would the reason be for doing that? Why is that a better choice than just sticking with 18 identical CUs and letting the developers decide how to use them? Curious bit of info, for sure.
 
So, you'd have 1.433 Tflops for rendering and 0.410 Tflops (410 Gflops according to the update) for GPGPU.

Why separate? And if they need to be separate, does that mean x720 cannot do GPGPU?

Seems unlikely.


You would only customize for benefit, and a large one at that.
 
I'm wondering that. It's the only choice that makes sense to me - optimise the CUs for GPGPU work rather than graphics. Don't know what sort of optimisations that'd be though.

Do that, however, and you risk junking the chip if one of those special CUs has a flaw, or increasing the amount of redundant CUs the chip already has.
I suspect that at least 2 regular CUs are physically present but fused off for yield purposes.

Using the same hardware makes the reserved CUs less of a manufacturing risk, since they can draw from the same redundant units.
 
I'm wondering that. It's the only choice that makes sense to me - optimise the CUs for GPGPU work rather than graphics. Don't know what sort of optimisations that'd be though.

I am thinking maybe stripping out its graphics rendering pipeline and enhancing the compute side, making it more efficient at compute jobs. Make no mistake though, compute jobs will contribute a lot to how we perceive the next-gen upgrade. Having much better animation, physics, fluid and hair simulation will go a long way to improving our perception and reception of the improvements in graphics. I think it's a good decision, provided it will be better at compute jobs than a general CU.
 
Why separate? And if they need to be separate, does that mean x720 cannot do GPGPU?

Seems unlikely.


You would only customize for benefit, and a large one at that.


What I meant was that the GPU is listed as 1.8 Tflops and the 4 CUs for "ALU resource for compute" are listed as 410 Gflops. That 410 Gflops must be part of the 1.8 Tflops total. I doubt it's 1.843 + 0.41. It specifically says using the 4 CUs for rendering would give a "minor boost", which suggests that for some reason they are not as well suited to that task, or that there is some bottleneck that limits their efficiency in that regard.

That in no way suggests Durango cannot do gpgpu, or that the 14 CUs in Orbis for rendering could not be used for computation as well. For whatever reason, there is a distinction between those 4 CUs and the 14 "regular" CUs in Orbis.
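
For anyone wanting to check the arithmetic, here is a rough sketch of how the 1.8 Tflops total splits into the 14 + 4 grouping. The per-CU figures are assumptions, not something stated in the leak as quoted here: 64 ALU lanes per CU (standard for GCN), 2 flops per lane per clock (one fused multiply-add), and an 800 MHz clock.

```python
# Assumed figures (not confirmed in this thread): GCN-style CU with 64 lanes,
# one fused multiply-add (2 flops) per lane per clock, 800 MHz GPU clock.
LANES_PER_CU = 64
FLOPS_PER_LANE_PER_CLOCK = 2
CLOCK_HZ = 800e6


def cu_gflops(num_cus):
    """Peak single-precision Gflops for a given number of CUs."""
    return num_cus * LANES_PER_CU * FLOPS_PER_LANE_PER_CLOCK * CLOCK_HZ / 1e9


total = cu_gflops(18)      # all CUs
rendering = cu_gflops(14)  # the "regular" CUs
compute = cu_gflops(4)     # the CUs listed as "ALU resource for compute"

print(f"total:     {total:.1f} Gflops")
print(f"rendering: {rendering:.1f} Gflops")
print(f"compute:   {compute:.1f} Gflops")
```

Under those assumptions the 14 and 4 CU figures add up to the 18 CU total, which fits the reading that the 410 Gflops is a slice of the 1.8 Tflops rather than something on top of it.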
 
In all likelihood, those 4 CUs are optimized for compute above and beyond the existing GCN architecture. All 18 CUs can be used for rendering. However, 4 of the 18 CUs have special optimizations for things like animation, physics, etc. The 4 should include better access to the metal via a separate scheduler and API and perhaps other hardware improvements for general purpose computing, as well.

I'm guessing that, in moving away from the targeted Steamroller architecture to the Jaguar architecture, Sony decided to have a CU co-processor of sorts to make up for the power differential. This would also continue the Sony legacy of heterogeneous, "flexible" computing.

It seems to make for an efficient, well-balanced, and flexible system overall.
 
But what would the reason be for doing that? Why is that a better choice than just sticking with 18 identical CUs and letting the developers decide how to use them? Curious bit of info, for sure.
One would assume efficiencies. A CU designed for GPGPU will be more efficient than one designed for graphics tasks. If there's nothing to be gained, there's no reason to break up the 18 CUs, and I doubt anyone involved is stupid enough to force a limitation without a suitable benefit to be had from it. ;)
 
The CUs are already separate from the fixed function pixel pipelines, and the reserved CUs are already listed as having the same texturing capabilities.
They're also described as being usable for graphics, so they apparently still have graphics-specific capabilities.

If they are physically distinct yet can do everything they're described as doing, they end up essentially the same as regular CUs, minus some frills that aren't worth the effort of removing.
 
Do that, however, and you risk junking the chip if one of those special CUs has a flaw, or increasing the amount of redundant CUs the chip already has.
I suspect that at least 2 regular CUs are physically present but fused off for yield purposes.

Using the same hardware makes the reserved CUs less of a manufacturing risk, since they can draw from the same redundant units.

Yep, which would lead to a logical grouping vs physical. Maybe it is just scheduler magic.
 
They could be partly reserved for other functions, Move or camera related or something, so they can't be counted on but aren't fully reserved either. I don't know, it does seem odd.

Edit- I need to hit refresh more often before I reply. :(
 
Now it seems the GPUs are closer than we thought, given that Durango might also have dedicated compute units.
This isn't the thread for Durango vs Orbis, but there's value in comparing technologies from the same source. Those CUs aren't going to be sitting idle, so whatever work they do is saved from the CPU and GPU. If Durango is processing physics on CPU, Orbis will have better physics, and if Durango is processing physics on GPGPU for 3 ms, Orbis will save that time and spend 9 ms using the physics leaving 3 ms bonus for the graphics.

In simple terms, Sony has selected more compute resources and they will come into effect, so the difference will be there in some manner. The specifics of what the 4 CUs do will help us understand the system. I'm certainly curious how it relates to PC, as it seems a concept that could only be pulled off in a console and may be quite valuable. GPGPU augmentations, especially when audio and video are being taken care of by custom silicon, could mean quite a lot of versatile compute resources for non-direct-graphics work. Maybe light propagation systems, AI sorting, or whatever.
 
Yep, which would lead to a logical grouping vs physical. Maybe it is just scheduler magic.
Could be that also. As others have suggested, maybe direct control of these CUs vs. scheduled graphics work across the others. Although customised CUs are more in keeping with the DF article, and that'd explain why the VGLeaks article says they can be used for graphics but less effectively.
 
I hope they're used for physics, animation, AI or whatever else they can think of, rather than rendering, to be honest. I'd like to see a lot of improvements in that area, and this could definitely help.
 
Just to add on Shifty's post (even though this is not DvO). Even if you have "only" 14 CUs available for rendering, these 4 CUs doing their job will still have to "fit" somewhere in Durango.

I think this situation is even worse than the one before for MS since Sony would practically make devs push extra compute in any direction they want and they would still have more shading performance than MS in that case. Not to talk about ROPs, texture units etc. I doubt Durango matches it there anyway.
 
Could be that also. As others have suggested, maybe direct control of these CUs vs. scheduled graphics work across the others. Although customised CUs are more in keeping with the DF article, and that'd explain why the VGLeaks article says they can be used for graphics but less effectively.

If they're under separate (application) control, using them for graphics wouldn't be like simply having 18 CUs under hardware control. You'd have to give them different tasks than you're running on the 14 others, so from that point of view, it would be slightly trickier than simply boosting your GPU performance by 28% or whatever.

Obviously Sony will ultimately want devs to increasingly use these things for non-graphics work. If they don't there'll have been no point in carving them out from under hw balancing/scheduling. Given the caveats about not being able to transparently leverage them to speed up work on the other 14 CUs, if I was Sony I would characterise their usefulness for rendering as 'minor' too relative to their usefulness for other work. This design decision will only be justified if devs use them for other things.
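
As a back-of-envelope check on that "28% or whatever", here is a small sketch of the best-case rendering uplift from adding the 4 CUs to the other 14. The `efficiency` factor is purely hypothetical, a stand-in for however much is lost because the 4 CUs can't be balanced transparently with the rest; nothing in the leak gives a number for it.

```python
def rendering_uplift(render_cus=14, extra_cus=4, efficiency=1.0):
    """Fractional gain in shader throughput from adding extra_cus.

    efficiency is a hypothetical 0..1 factor for how well the extra CUs
    can actually be kept busy with rendering work (1.0 = perfect scaling).
    """
    return extra_cus * efficiency / render_cus


# Perfect scaling: the ceiling on what the 4 CUs could add.
print(f"ideal:     {rendering_uplift():.1%}")

# Illustrative only: if a quarter of that were usable in practice.
print(f"quartered: {rendering_uplift(efficiency=0.25):.1%}")
```

With perfect scaling the ceiling is 4/14, roughly 28.6%. Anything that makes the 4 CUs harder to feed pulls the real figure below that, which would be one way to read the "minor boost" wording.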
 
I'm a newb so forgive me... Could the 4 CUs be used for advanced lighting calculations, for instance approximating global illumination, or does that fall more under rendering than compute?
 