PlayStation 4 (codename Orbis) technical hardware investigation (news and rumours)

Actually, change that: not 16 queues for the XBONE, nor 64 queues for the PS4.

12 CUs / 24 queues (XBONE) [16 compute + 8 graphics] = 2 queues per CU
18 CUs / 72 queues (PS4) [64 compute + 8 graphics] = 4 queues per CU

That should put the queue argument to rest, and it seems to be no coincidence that both line up perfectly with the number of CUs they have.
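To make the arithmetic explicit (a quick sketch using the rumoured figures above, nothing official):

```python
# Check of the queue counts quoted above (rumoured figures, not official).
def queue_summary(cus, compute_queues, graphics_queues):
    total = compute_queues + graphics_queues
    return total, total / cus

print(queue_summary(12, 16, 8))  # XBONE: (24, 2.0) -> 2 queues per CU
print(queue_summary(18, 64, 8))  # PS4:   (72, 4.0) -> 4 queues per CU
```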

That makes more sense. How do these compare with each having the same number of 'graphics queues'?
 

Having the same number of graphics queues shouldn't really be an issue, imo; you could probably run the same commands across more than one CU fairly easily. And if it did become an issue, nothing really seems to stop people from using the compute queues for graphics work (such as post-processing), although I honestly don't see it becoming a problem.

I think post-processing might become a big thing, as it lets you do things more easily, do them better, or do things you simply can't do with the graphics pipeline.
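Loosely, the appeal is that a full-screen effect is just math over every pixel, so it doesn't need the rasterizer or ROPs at all. A toy CPU-side Python sketch of the idea (illustrative only; the Reinhard-style tonemap is just a stand-in effect, not anyone's actual pipeline):

```python
import numpy as np

# A post-process pass is per-pixel math over the framebuffer, which is why
# it maps naturally onto a compute queue instead of the graphics pipeline.
def tonemap(framebuffer: np.ndarray) -> np.ndarray:
    # Simple Reinhard-style tonemap: L / (1 + L), applied per channel.
    return framebuffer / (1.0 + framebuffer)

hdr = np.random.rand(1080, 1920, 3).astype(np.float32) * 4.0  # fake HDR frame
ldr = tonemap(hdr)
print(ldr.min(), ldr.max())  # values now fit in [0, 1)
```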
 
I have a question, kind of the opposite of what you guys are talking about: the whole 14+4 thing. The more I read, the less sense it makes. If what Beta says about queues is right, and I'm understanding what I've read about ACEs correctly, then why would Sony increase the number of ACEs from 2 to 8 for only 14 CUs? If ACEs distribute tasks universally among all the CUs, doesn't that go directly against the idea that graphics won't improve much past 14 CUs? I mean, there are 32 ROPs in the PS4, right? And the GPU has 176 GB/s of bandwidth, or 156 GB/s if you take the CPU into account; how is that not enough to fully use 18 CUs? And then the TMUs: for the 14+4 thing to make sense, wouldn't they have to purposefully make 4 of those CUs, and their TMUs, weaker than the rest somehow? But if the design is unified, that would be very hard to do, no? It would cost a lot, and it doesn't make sense to spend money to purposefully lower performance.

Or do I have it completely wrong? :?:
 

You are correct in your reasoning: it would take more effort, work, and money to make the part worse, which doesn't make any sense. Making 4 of the CUs different would require modifying the underlying architecture, which would inflate a whole number of things, including verification (you've changed something major) and cost (the time to actually make the modifications), while at the same time reducing performance.
 
Well, if that's true, I can't see why so many people believe in the 14+4 thing. Or is there some high-level stuff I'm missing that makes it a good argument for 14+4?
 

There is no high-level or good argument for the system being set at 14+4. It was an EXAMPLE.

It was only an example of workload allocation, provided to developers to give them a sense of what they CAN do with it.

As far as I'm concerned, they could have said 1 CU for graphics and 17 CUs for GPGPU if there were an occasion for such a thing (folding@home-like stuff might work well that way; I don't know, and in reality I don't care).
It would have made just as much sense, and the same people would harp on it all the same.

Let it go. There is simply no merit in the 18 CUs being fundamentally different from each other to perform the advertised functions. The 18 CUs in their current state in the PS4 seem well suited to their purpose of both GPGPU and graphics already; there is very little evidence or reason for the contrary to be true, and even LESS sense in specifically modifying 4 CUs and increasing complexity.

The PS4 has already been shown to decrease complexity and increase flexibility for developers in almost all respects, and I don't see a reason to make the CU allocations so inflexible for minimal benefit.

Most of the people with an agenda, if you will, are consistently trying to downplay the PS4 as having only 14 CUs for graphics instead of 18, since it advances their argument that the PS4 is __________ (insert their argument), and 14 is much less of a hurdle to argue against than 18.

I hate to bring the competition into this thread (and go ahead and mod-edit the section below out if the mods don't like this part), but if the PS4 is going to be balanced at 14+4 as some people insist (4 being 22% of the CUs), then it would also make sense that hardware with 12 CUs would only be balanced at around 9+3 CUs, if both sides are to run similar code and workloads.

In either case the whole argument goes up in flames as moot and doesn't really advance anything.
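The proportions behind that, spelled out (purely illustrative arithmetic):

```python
# The claimed 14+4 split leaves 14/18 of the CUs for graphics (4 is ~22%).
# Applying the same ratio to a 12 CU part gives ~9.3, i.e. roughly 9+3.
graphics_fraction = 14 / 18
print(f"{4 / 18:.1%} of PS4 CUs 'reserved'")       # 22.2%
print(f"{12 * graphics_fraction:.1f} of 12 CUs")   # 9.3 -> a '9+3' split
```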
 
The PS4 has already been shown to decrease complexity and increase flexibility for developers in almost all respects, and I don't see a reason to make the CU allocations so inflexible for minimal benefit.

I don't see a discrepancy between simple/flexible and running into diminishing returns when using those CUs for graphics. I don't think anyone is saying they designed those 4 to be handicapped in terms of providing a proportional 'visual' payoff relative to the other 14. It may just be that typical games at 1080p and 60 fps with robust visuals on screen are more likely to have bottlenecks somewhere else.

Most of the people with an agenda, if you will, are consistently trying to downplay the PS4 as having only 14 CUs for graphics instead of 18, since it advances their argument that the PS4 is __________ (insert their argument), and 14 is much less of a hurdle to argue against than 18.

That's a reactionary and awfully defensive assertion for you to make. Does Cerny have that agenda? What about VGLeaks and/or Sony's documentation?

According to VGLeaks, the dev docs say 'balanced around 14, 4 are extra for compute'. That mirrors Cerny's remarks. He was directly asked about the 14+4 breakdown, and his response was that if you are concentrating only on graphics you won't want to use all 18 for rendering, and that devs are somehow 'incentivized' to use those extra CUs for compute in some fashion.

Nobody is claiming that those 4 are hived off from the rest in a different array, or that they are fundamentally different (VGLeaks says all are the same, fyi). What we are saying is that it sounds like these 4 CUs won't contribute in a meaningful way to graphics rendering output, be it due to diminishing returns in some sense or because devs are 'incentivized' to use them for something else entirely. That can significantly alter the comparison on the 'graphics tech' front with a certain alternative platform.

I hate to bring the competition into this thread (and go ahead and mod-edit the section below out if the mods don't like this part), but if the PS4 is going to be balanced at 14+4 as some people insist (4 being 22% of the CUs), then it would also make sense that hardware with 12 CUs would only be balanced at around 9+3 CUs, if both sides are to run similar code and workloads.

Based on...? The PS4 sounds like it was more heavily designed to leverage compute on the GPU side of things, so why should anyone assume both sides would allocate resources similarly?
 

Because just throwing out the words 'diminishing returns' means nothing. You can't make a vague, sweeping statement like '14 CUs -> 18 CUs won't give you any graphical benefit'; as has been mentioned multiple times in this thread, whether or not the CUs give a benefit is largely up to the developer and what they want to do with them.

There is no indication whatsoever that anything stops developers from using the CUs in a meaningful way for graphics. To think otherwise shows a major misunderstanding of how the fundamental graphics pipeline even works.

All Cerny actually said was that you get more ALU for compute than for graphics. Why people take this to mean the CUs are deficient at graphics is a bit strange; it could simply mean that some tasks are better suited to GPGPU and get better allocation/use of resources there than they do in the traditional graphics pipeline.

Unless you can personally link us to the actual Sony documentation, stop using it as a defense; we don't know what VGLeaks based its information on.
 

But that doesn't make sense to me. I don't know much about programming or graphics, but wouldn't CU usage be on a per-game/engine basis? As an example, I play a lot of League of Legends; I'm sure the game doesn't take much to run and could probably get by on 8 CUs, and then every CU after that won't contribute much to its graphics. But a game like Killzone might be made to use 17 CUs for graphics and one for compute stuff, and a game like Tekken might use all of them for rendering. At least that's how I see it. Because if all of the CUs are the same, as are the TMUs and the rest, then there shouldn't be diminishing returns, right?

I feel like I'm so far out of my weight class in these discussions, lol.
 

This is exactly how it works, though I'm not too sure the programmer specifies what runs on a given CU; instead it works with queues of jobs, and the hardware doles them out based on a bunch of parameters.

How you use the power is up to the developer, as you said.
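As a toy model of that (hypothetical job names and a deliberately dumb scheduling rule, just to show the shape of it):

```python
from collections import deque

# Work is submitted to queues; free CUs pull jobs from them. The programmer
# fills the queues, and the hardware decides which CU runs what.
compute_queue = deque(["physics", "cloth sim", "post-process"])
graphics_queue = deque(["shadow pass", "g-buffer", "lighting"])

free_cus = list(range(18))  # 18 CUs, all idle
while (compute_queue or graphics_queue) and free_cus:
    cu = free_cus.pop()
    # Stand-in "parameter": prefer pending graphics work over compute.
    queue = graphics_queue if graphics_queue else compute_queue
    print(f"CU {cu:2d} <- {queue.popleft()}")
```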
 
I don't see a discrepancy between simple/flexible and running into diminishing returns when using those CUs for graphics. I don't think anyone is saying they designed those 4 to be handicapped in terms of providing a proportional 'visual' payoff relative to the other 14. It may just be that typical games at 1080p and 60 fps with robust visuals on screen are more likely to have bottlenecks somewhere else.

Then there is no discrepancy between your understanding and mine, but there are many who do try to imply that those 4 are handicapped.

That's a reactionary and awfully defensive assertion for you to make. Does Cerny have that agenda? What about VGLeaks and/or Sony's documentation?

According to VGLeaks, the dev docs say 'balanced around 14, 4 are extra for compute'. That mirrors Cerny's remarks. He was directly asked about the 14+4 breakdown, and his response was that if you are concentrating only on graphics you won't want to use all 18 for rendering, and that devs are somehow 'incentivized' to use those extra CUs for compute in some fashion.

Nobody is claiming that those 4 are hived off from the rest in a different array, or that they are fundamentally different (VGLeaks says all are the same, fyi). What we are saying is that it sounds like these 4 CUs won't contribute in a meaningful way to graphics rendering output, be it due to diminishing returns in some sense or because devs are 'incentivized' to use them for something else entirely. That can significantly alter the comparison on the 'graphics tech' front with a certain alternative platform.

It's not reactionary or defensive when I word it as precisely as I did.
I quote myself:

Most of the people with an agenda, if you will, are consistently trying to downplay the PS4 as having only 14 CUs for graphics instead of 18
As far as I'm concerned, no credible source ever claimed "only", but I've seen more than enough people claiming that the PS4 has only 14 CUs for graphics.

If you're not claiming that, you're not in the camp I'm describing.

Based on...? The PS4 sounds like it was more heavily designed to leverage compute on the GPU side of things, so why should anyone assume both sides would allocate resources similarly?

What makes the two consoles have different workloads when running similar code in games that use GPGPU?

If something about the PS4 hardware 'incentivizes' devs to use 6 CPU cores, 14 CUs for graphics, and 4 CUs for compute, what is different about the other hardware that would incentivize 6 CPU cores and 12 CUs for graphics over 6 CPU cores, 9 CUs for graphics, and 3 CUs for compute?

I don't see why we can't argue a 9+3 split if we're to argue a 14+4 split, unless the 9+3 CUs don't perform as well as the 14+4 at GPGPU functions; in that case one is implying that each of the 18 CUs is actually superior to each of the 12 CUs :/
 

There is some evidence of this: Sony making modifications so that running GPGPU and graphics together is easier and performs better too. How big a difference it will make is another story, although they seem to think it will be useful.

The L2 cache bit, the extra sources, and the extra cache bypass being examples.
 

Yes, I'm aware of those, but there is currently no data on how much they improve things (although I do have confidence the modifications will have some effect, as Sony went to lengths to include them beyond the vanilla design), so I'm not bringing uncertainties into the argument.

In either case, there is no evidence that these modifications handicap the 18 CUs in their usual functions, and a simple 7850 vs 7790 comparison throws the whole "anything past 14 CUs has diminished returns" into the trash can.
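For reference, the rough public specs behind that comparison (approximate figures from memory, so treat them as ballpark):

```python
# FLOPS = shader cores * clock * 2 ops (FMA). Both cards land near the same
# raw throughput, yet the 16 CU 7850 is not worse off for having >14 CUs.
cards = {
    "HD 7790 (14 CUs)": (14 * 64, 1.000),  # ~896 SPs @ ~1.00 GHz
    "HD 7850 (16 CUs)": (16 * 64, 0.860),  # ~1024 SPs @ ~0.86 GHz
}
for name, (sps, ghz) in cards.items():
    print(f"{name}: {sps * ghz * 2 / 1000:.2f} TFLOPS")
```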
 
But that doesn't make sense to me. I don't know much about programming or graphics, but wouldn't CU usage be on a per-game/engine basis? As an example, I play a lot of League of Legends; I'm sure the game doesn't take much to run and could probably get by on 8 CUs, and then every CU after that won't contribute much to its graphics. But a game like Killzone might be made to use 17 CUs for graphics and one for compute stuff, and a game like Tekken might use all of them for rendering. At least that's how I see it. Because if all of the CUs are the same, as are the TMUs and the rest, then there shouldn't be diminishing returns, right?

I feel like I'm so far out of my weight class in these discussions, lol.

Despite not pretending to fully understand it or using the tech lingo like others, I think you get it more than they do. Sony isn't telling anyone how to use the resources; they just think using some of that ALU for compute might be important later in the gen, and they made it easier by increasing the number of ACEs from 2 to 8. If devs want to use it all for rendering, so be it. It reminds me of the SPUs a bit, but reversed: they were assumed to be for compute but ended up helping rendering, while the CUs are assumed to be for rendering with possible compute in their future.
 
Given the recent 53 MHz upclock for the XB1's GPU, is the same on the cards for the PS4? If not, why not?
My source for the RAM-reservation info has not heard anything about upclocks on the PS4 side.

....

Also, on the topic of the large RAM reservation: apparently the OS shell is pretty expensive memory-wise and uses a lot of embedded dynamic HTML5 content.

The use of the word 'incentive' goes against this theory. He's not trying to convince anyone (why would he need to convince a developer to do anything?); he's saying the way the system is currently designed incentivizes not using all the CUs for graphics. It's an incentive to use the last 4 for GPGPU because the returns from using them for graphics diminish due to the balance of the system. It lines up perfectly with the 14+4 from VGLeaks, and it's not a bad thing either.
Expletive's interpretation of the 14+4 thing is the same as mine: the system is a somewhat ALU-heavy design, so on modern AAA engines (e.g. CryEngine 3) you will see diminishing returns from using the additional CUs for rendering, and get more value using them for compute.

That doesn't mean you will get no benefit from using the extra 4 for graphics, though; maybe something like the 24% improvement in FPS from using 50% more ALUs that DF is suggesting.

I see. I guess it's because I don't understand all this tech talk, but I don't see how you guys can conclude that it only does compression stuff. It still seems to me that argument has holes in it.

Your reasoning is analogous to asking someone what they do at work, hearing 'stuff like accounts and admin', and then thinking to yourself, 'Oh, but they could also be the CEO', and holding that it's reasonable to believe that because they never claimed they aren't the CEO.
 
Expletive's interpretation of the 14+4 thing is the same as mine: the system is a somewhat ALU-heavy design, so on modern AAA engines (e.g. CryEngine 3) you will see diminishing returns from using the additional CUs for rendering, and get more value using them for compute.

That doesn't mean you will get no benefit from using the extra 4 for graphics, though; maybe something like the 24% improvement in FPS from using 50% more ALUs that DF is suggesting.

My understanding is that the ALU-heavy design isn't handicapping the system in other ways; there was no real tradeoff in adding more ALU beyond silicon real estate. We haven't heard any details even remotely hinting that each of the 18 CUs will be less capable than a standard CU, so I assume this is true.

Thus, "diminishing returns" is misleading once we ask, "diminished relative to what?"

Diminishing returns implies that the more you throw at it, the less each additional unit benefits your system.

That framing works well for, let's say, dual graphics cards, where you only get about 70-80% of the second card's power under SLI. But there is no evidence to support the suggestion that any CU past the 14th will benefit the system less than any of the first 14. 18 CUs have 28.57% more computational power than 14 CUs. It's not "it's not balanced, so we evaluate it at 25% or 20%"; it's 28.57%, period.
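The raw numbers, for reference (trivial arithmetic, but it keeps the percentages honest):

```python
# The ALU gap between 18 and 14 CUs is fixed by the hardware, regardless
# of what software does with it.
extra = (18 - 14) / 14
print(f"{extra:.2%}")  # 28.57%
```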

Of course, it's up to the rest of the system to keep them fed, but devs shouldn't have a problem doing that, as we haven't heard of any real bottlenecks holding the rest of the pipeline back.

The conclusion I've come to is that the CUs are designed with compute in mind, and there will very likely be cases where some developers find that doing compute with these resources gives better results than simply using them the old-fashioned way. And that is actually a good thing.

The problem occurs when a positive design decision like this is spun to sound like a negative.

I would also like to point out that in the DF article they pitted 16 CUs against 24 CUs and then said:

The results pretty much confirm the theory that more compute cores in the GCN architecture doesn't result in a linear scaling of performance. That's why AMD tends to increase core clock and memory speed on its higher-end cards, because it's clear that the available core count on its own won't do the job.

They've only shown that 16 CUs to 24 CUs doesn't scale linearly. What about 2 => 3? 4 => 6? 6 => 9? 8 => 12?
Where does the linearity stop and the wall of diminishing returns begin? Does that line move around with different setups? Depending on the answers to these questions, "diminishing returns" may not even be relevant in current-gen consoles.
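Running DF's own numbers through that question (assuming their ~24% figure is accurate):

```python
# DF's test: 24 CUs vs 16 CUs = 50% more ALU, reportedly ~24% more FPS.
# That's roughly 48% of linear scaling at that point on the curve; it says
# nothing about where 14 -> 18 CUs sits.
alu_gain = (24 - 16) / 16   # 0.50
fps_gain = 0.24             # DF's reported improvement
print(f"{fps_gain / alu_gain:.0%} of linear scaling")  # 48%
```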
 
Diminishing returns implies that the more you throw at it, the less each additional unit benefits your system.

And that's precisely what it is: you will see less and less benefit from using the additional CUs for rendering.

I would also like to point out that in the DF article they pitted 16 CUs against 24 CUs and then said:

They've only shown that 16 CUs to 24 CUs doesn't scale linearly. What about 2 => 3? 4 => 6? 6 => 9? 8 => 12?
Where does the linearity stop and the wall of diminishing returns begin? Does that line move around with different setups? Depending on the answers to these questions, "diminishing returns" may not even be relevant in current-gen consoles.

That's just it: the point of diminishing returns for the PS4 is the 14 CU mark (at least for current AAA engines, which AMD presumably used to derive their numbers). Hence the 14+4 recommendation to devs.
 
But isn't that still just an assumption on your part? You're saying that because he didn't mention it, it must not do it, and that because he didn't say otherwise, it only does compression/decompression.
Why are you repeating the same discussion we had in the audio thread? This has all been said; the arguments have been covered on both sides.
 
Your reasoning is analogous to asking someone what they do at work, hearing 'stuff like accounts and admin', and then thinking to yourself, 'Oh, but they could also be the CEO', and holding that it's reasonable to believe that because they never claimed they aren't the CEO.

I'd say a better analogy is that you know your friend's position and the main things people in that position do, you ask him what he did at work, and he says "mostly data entry". You know people in his position normally do more, so why assume he only does data entry? Because that's all he told you when asked?

Anyway, we should stop talking about it, because it's all been covered.

Why are you repeating the same discussion we had in the audio thread? This has all been said; the arguments have been covered on both sides.

Well, for one, I was never part of the audio thread and didn't know about it. And second, I pretty much stopped talking about it after astro linked me to that thread when he edited his post, so I'm not sure what you want me to do at this point. Should I ignore everyone who keeps quoting me? I mean, if I'm assumed to know about this other thread and the discussions therein, then surely everyone quoting me should know too, and not quote me, right?
 
And that's precisely what it is: you will see less and less benefit from using the additional CUs for rendering.

And yet, compared to the 7000-series cards, it's not ALU-heavy relative to memory bandwidth at all. So from that perspective you can run another screen-space / post-process / etc. shader that you otherwise wouldn't have had the time for.

It is for this reason that what I quoted is kind of disingenuous. Sure, if you hit 30 FPS with 14 CUs you might only get 33 FPS instead of 37 FPS, because you spend less time shader-bound. But what will more likely happen is that devs will use all 18 CUs' worth of compute to hit whatever target they are aiming for in the best way possible, regardless of whether that's 18 CUs' worth of graphics, 10 CUs' worth of GPGPU, etc.
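In frame-time terms, those FPS numbers look like this (illustrative arithmetic only; the 33/37 figures are the hypothetical above):

```python
# 14 -> 18 CUs is ~28.6% more ALU; a fully shader-bound 30 FPS frame would
# scale toward ~38 FPS, while a partly shader-bound one lands lower.
for fps in (30, 33, 37):
    print(f"{fps} FPS = {1000 / fps:.1f} ms/frame")
print(f"linear ceiling: {30 * (18 / 14):.1f} FPS")  # ~38.6
```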
 