The capabilities of the 4 special CUs in Orbis

They are not going to make any reference to a nuanced design decision like 14+4 anywhere near a press release. It would only cause confusion and draw attention away from 1.84TF or 8GB.

Whether it's 18 or 14+4, that press release is still accurate in its wording.

I agree, but I am wondering about the context of the word "freely".

While they worked on the 8GB design, they may have realized that the GPU needs to be more flexible eventually as well. The 14+4 rumor came with the 4GB config.

We will probably hear more about their software techniques at GDC. Hopefully they leave enough clues to infer the new h/w capabilities.
 
I agree, but I am wondering about the context of the word "freely".

"Developers, hey you're free to use more than 14CUs for rendering, we aren't stopping you!"



"But based on the design you're using 30% of the GPU for maybe a 5% improvement, just fyi."

;)
 
Are some people still holding to this 14+4 idea? I can't believe it. That stupid rumor never made sense.

It is just a normal shader array made of 18 CUs connected to the command processor for graphics and some ACEs for compute. Developers can submit whatever tasks they want to this array. That's really all there is to it, end of story.
 
How do they purport to make the GPU easier for GPGPU tasks? What were the "difficulty" problems with 18 CUs before anyway?
 
How do they purport to make the GPU easier for GPGPU tasks?
Look at what is written in the Sea Islands ISA manual (now taken down again). They support 8 compute pipelines (ACEs?) with a maximum of 64 queues for compute.

Edit:
What were the "difficulty" problems with 18 CUs before anyway?
There probably weren't any special difficulties. More compute pipelines ease the usage of the GPU for smaller tasks, though, especially if a game has a lot of them to do.
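As a rough PC-side analogy (standard OpenCL here, not PS4/libGCM code; the kernel, queue count and job sizes are invented for illustration, and error checking is omitted), more hardware compute queues simply mean several small, independent jobs can be fed to the same CU array through separate command queues instead of all lining up behind one submission path:

```c
/* Sketch: many small independent compute jobs submitted through several
 * OpenCL command queues on one device. Hardware with more compute
 * pipelines/queues can pull work from all of them concurrently.
 * Build (Linux): gcc smalljobs.c -lOpenCL */
#include <CL/cl.h>
#include <stdio.h>

#define NUM_QUEUES 4      /* arbitrary: one queue per independent job stream */
#define JOB_SIZE   1024   /* deliberately small per-job work size */

static const char *src =
    "__kernel void scale(__global float *x) {"
    "    x[get_global_id(0)] *= 2.0f;"
    "}";

int main(void)
{
    cl_platform_id plat; cl_device_id dev;
    clGetPlatformIDs(1, &plat, NULL);
    clGetDeviceIDs(plat, CL_DEVICE_TYPE_GPU, 1, &dev, NULL);

    cl_context ctx = clCreateContext(NULL, 1, &dev, NULL, NULL, NULL);
    cl_program prog = clCreateProgramWithSource(ctx, 1, &src, NULL, NULL);
    clBuildProgram(prog, 1, &dev, NULL, NULL, NULL);
    cl_kernel k = clCreateKernel(prog, "scale", NULL);

    cl_command_queue q[NUM_QUEUES];
    cl_mem buf[NUM_QUEUES];
    size_t gsz = JOB_SIZE;

    for (int i = 0; i < NUM_QUEUES; ++i) {
        /* Each queue gets its own small buffer and its own job stream. */
        q[i] = clCreateCommandQueue(ctx, dev, 0, NULL);
        buf[i] = clCreateBuffer(ctx, CL_MEM_READ_WRITE,
                                JOB_SIZE * sizeof(float), NULL, NULL);
        clSetKernelArg(k, 0, sizeof(cl_mem), &buf[i]);
        clEnqueueNDRangeKernel(q[i], k, 1, NULL, &gsz, NULL, 0, NULL, NULL);
    }
    for (int i = 0; i < NUM_QUEUES; ++i)
        clFinish(q[i]);   /* jobs on different queues may overlap on the CUs */

    printf("submitted %d small jobs across %d queues\n", NUM_QUEUES, NUM_QUEUES);
    return 0;
}
```

Whether any two of those jobs actually overlap on the CUs is up to the hardware scheduler; the point is just that nothing forces them to serialize behind a single pipeline.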
 
Look at what is written in the Sea Islands ISA manual (now taken down again). They support up to 8x8 queues for compute.

Link it when you see the link again!

Didn't AMD suggest splitting the compute and rendering part (or something to that effect) if graphics pre-emption is not done yet?
 
Sounds like they loosened the 14+4 configuration.

Or the 14+4 configuration was invented on message boards by people overanalyzing a single-line bullet point on a leaked slide...

Just saying there is a lot of overanalysis of everything that's leaked, and almost all of it tends towards one word implying exciting new technology when perhaps Occam's razor would point to simpler meanings.
 
What were the "difficulty" problems with 18 CUs before anyway?

No problems, but when doing compute tasks on this system, it probably makes more sense to use some of the CUs for those jobs instead of the CPU cores. Having any sort of physical split of the CUs never made any sense to me. To maximize system throughput, you probably want to use the CUs flexibly instead of just for rendering. That's all the rumour ever suggested, imo.
 
That's certainly possible. The statement was phrased in an interesting way: Use all 18 for marginal rendering improvement. Config balanced for 14+4.

Need to throw the GPU into the deep end to see how the devs optimize their code around it.
 
Or the 14+4 configuration was invented on message boards by people overanalyzing a single-line bullet point on a leaked slide...

Just saying there is a lot of overanalysis of everything that's leaked, and almost all of it tends towards one word implying exciting new technology when perhaps Occam's razor would point to simpler meanings.

Of course! But I think right at the start of the discussion we already considered the possibility that 14+4 was just a way of saying 'we expect on average that 14 CUs should be able to cover the graphics rendering sufficiently, and that 4 additional CUs were added to satisfy physics calculations without taking away from the rendering or burdening the CPU too much'. It was just an option that had little room for speculation, so it didn't have any. ;)
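To put rough numbers on that split (assuming the commonly cited 800 MHz clock and 64 ALUs per CU, which is consistent with the 1.84TF figure above): 18 CU × 64 × 2 FLOP × 0.8 GHz ≈ 1.84 TFLOPS total, so a 14+4 split works out to roughly 1.43 TFLOPS for rendering and about 0.41 TFLOPS, around 22% of the ALU throughput, left over for physics and other compute.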

Now the really interesting bit will be how much better the CUs can be leveraged in this single-chip configuration versus if they'd been on a discrete PC GPU card.
 
Well, there should be some (big enough) room for improvement in running GPGPU jobs. Otherwise AMD wouldn't be suggesting possible solutions before they implement GCN2. They wouldn't even need GCN2 if there were no problem.
 
Found the slides:
http://de.slideshare.net/zlatan4177/gpgpu-algorithms-in-games

Slides 10, 13 and 15.

The presenter was advocating dedicating the APU to compute and a discrete GPU (if you have one) to graphics, for now.


... and Sony PR says:

The Graphics Processing Unit (GPU) has been enhanced in a number of ways, principally to allow for easier use of the GPU for general purpose computing (GPGPU) such as physics simulation...

They may have done something to solve or work around this latency issue, if it's there.
 
Found the slides:
http://de.slideshare.net/zlatan4177/gpgpu-algorithms-in-games

Slides 10, 13 and 15.

The presenter was advocating dedicating the APU to compute and a discrete GPU (if you have one) to graphics, for now.

... and Sony PR says:

They may have done something to solve or work around this latency issue, if it's there.
That's basically unrelated to the PS4. With everything on a single die, there is no latency issue (that's the reason why, for some tasks, the APU is faster than an external GPU in the PC scenario that presentation was about). Sony didn't have to solve anything as they went for an SoC.
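For what the latency argument looks like in practice, here is a hedged PC-side OpenCL sketch (not PS4 code; the kernel and buffer size are invented for illustration, and error checking is omitted): on a discrete card the data round-trips over the bus, while on a shared-memory APU/SoC a host-visible buffer can typically be mapped with no copy at all.

```c
/* Sketch: discrete-GPU style explicit copies vs. APU-style mapped buffer. */
#include <CL/cl.h>
#include <stdio.h>
#include <string.h>

#define N 4096

static const char *src_code =
    "__kernel void scale(__global float *x) { x[get_global_id(0)] *= 2.0f; }";

int main(void)
{
    cl_platform_id plat; cl_device_id dev;
    clGetPlatformIDs(1, &plat, NULL);
    clGetDeviceIDs(plat, CL_DEVICE_TYPE_GPU, 1, &dev, NULL);
    cl_context ctx = clCreateContext(NULL, 1, &dev, NULL, NULL, NULL);
    cl_command_queue q = clCreateCommandQueue(ctx, dev, 0, NULL);
    cl_program prog = clCreateProgramWithSource(ctx, 1, &src_code, NULL, NULL);
    clBuildProgram(prog, 1, &dev, NULL, NULL, NULL);
    cl_kernel k = clCreateKernel(prog, "scale", NULL);
    size_t gsz = N;
    float host[N];
    for (int i = 0; i < N; ++i) host[i] = 1.0f;

    /* Discrete-GPU style: explicit copies host -> VRAM and back over PCIe.
     * Each hop adds latency, which is what the slides complain about. */
    cl_mem d1 = clCreateBuffer(ctx, CL_MEM_READ_WRITE, sizeof(host), NULL, NULL);
    clEnqueueWriteBuffer(q, d1, CL_TRUE, 0, sizeof(host), host, 0, NULL, NULL);
    clSetKernelArg(k, 0, sizeof(cl_mem), &d1);
    clEnqueueNDRangeKernel(q, k, 1, NULL, &gsz, NULL, 0, NULL, NULL);
    clEnqueueReadBuffer(q, d1, CL_TRUE, 0, sizeof(host), host, 0, NULL, NULL);

    /* APU/SoC style: a host-visible allocation that shared-memory parts can
     * typically map with zero copy, so the transfer latency largely vanishes. */
    cl_mem d2 = clCreateBuffer(ctx, CL_MEM_READ_WRITE | CL_MEM_ALLOC_HOST_PTR,
                               sizeof(host), NULL, NULL);
    float *p = (float *)clEnqueueMapBuffer(q, d2, CL_TRUE, CL_MAP_WRITE, 0,
                                           sizeof(host), 0, NULL, NULL, NULL);
    memcpy(p, host, sizeof(host));
    clEnqueueUnmapMemObject(q, d2, p, 0, NULL, NULL);
    clSetKernelArg(k, 0, sizeof(cl_mem), &d2);
    clEnqueueNDRangeKernel(q, k, 1, NULL, &gsz, NULL, 0, NULL, NULL);
    clFinish(q);

    printf("done: %f\n", host[0]);
    return 0;
}
```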
 
Are some people still holding to this 14+4 idea? I can't believe it. That stupid rumor never made sense.

It is just a normal shader array made of 18 CUs connected to the command processor for graphics and some ACEs for compute. Developers can submit whatever tasks they want to this array. That's really all there is to it, end of story.
I've wanted to say this for some time: the rumours about the 4 reserved CUs sounded odd to me, out of intuition and nothing else. But I just couldn't put it into words like you did.

I didn't understand why Sony would choose to do something like that when, in a closed machine, you can probably program the CUs independently of what the other CUs are doing.

Likewise, there could be other tasks in the scene that these 4 CUs, now unified with the rest of them, could be used to run with the appropriate code.
 
I think the main reason people put support behind the split design was that the rumours originally favoured an APU + GPU combo (where one of the APU's graphics pipelines, or the dedicated GPU's pipelines, was specifically meant to be used/designed for atypical processing).

I think this confused us in the end.

It's good to know it's much simpler than that... or at least appears to be simpler and more straightforward.
 
Are some people still holding to this 14+4 idea? I can't believe it. That stupid rumor never made sense.

It is just a normal shader array made of 18 CUs connected to the command processor for graphics and some ACEs for compute. Developers can submit whatever tasks they want to this array. That's really all there is to it, end of story.

Some people probably need an X-ray come late 2013 to throw that notion away.

And that still might not work.
 
That's basically unrelated to the PS4. With everything on a single die, there is no latency issue (that's the reason why, for some tasks, the APU is faster than an external GPU in the PC scenario that presentation was about). Sony didn't have to solve anything as they went for an SoC.

Re-read the slides. Indeed, the GPU latency issue due to saturation is a PC/DirectX problem. In fact, one of the slides stated that switching to an APU virtually eliminates the latency.

The only issue they have to deal with is managing cache and memory latency.

I've wanted to say this for some time: the rumours about the 4 reserved CUs sounded odd to me, out of intuition and nothing else. But I just couldn't put it into words like you did.

I didn't understand why Sony would choose to do something like that when, in a closed machine, you can probably program the CUs independently of what the other CUs are doing.

Likewise, there could be other tasks in the scene that these 4 CUs, now unified with the rest of them, could be used to run with the appropriate code.

That's what I want to find out. Does libGCM allow you to schedule the CUs independently and freely? Or do you have to go lower?

If the developers can schedule the CUs whichever way they want, then it's indeed a very flexible setup.


Thanks!
 
Are some people still holding to this 14+4 idea? I can't believe it. That stupid rumor never made sense.

It is just a normal shader array made of 18 CUs connected to the command processor for graphics and some ACEs for compute. Developers can submit whatever tasks they want to this array. That's really all there is to it, end of story.

So people were supposed to believe every rumor out of VGLeaks (MOST of which have proven true) except the one that implies the PS4 GPU might be compromised (or optimized) in some way?
 
It's all good. The challenge is to understand how things fit together. Even if the hardware is flexible, sometimes it is obscured by software layers for assorted reasons. The key thing here is: how does libGCM work? Is it as flexible as Cyan mentioned? If so, then the 18 unified CU setup is "perfect". There is no need to structure the CUs artificially.
 