PlayStation 4 (codename Orbis) technical hardware investigation (news and rumours)

14 CUs under control of the regular hardware thread scheduler ('hardware balanced')

4 CUs on a separately scheduled regime, perhaps under explicit programmer control

Make sense, or..?

Are the 4 CUs the 'GPU-like compute module' Eurogamer referred to?

What would be the reasoning for this? A means to segregate task types on the GPU, if hardware scheduling and switching of mixed compute/graphics contexts is less than optimal under GCN? At the expense of some extra programmer burden?

The idea of splitting graphics work across two different groups of resources may not have quite died with PS3 ;) But at least these 'SPEs' will run regular shader code...
 
The CUs in AMD's GCN architecture correspond to groups of stream processors. If my math is correct, 18 CUs would contain 1,152 stream processors.

Edit - As you found yourself.

Yep, that is right, I believe.

18 compute units, each having 16 stream processors, with each stream processor doing the same instruction for 4 cycles. I could be wrong with this info, but I also come up with 1,152 stream processors.

Edit: It should probably be that each compute unit has 4 vector units (16 lanes each), but I'm not sure anymore :(
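If it helps, the arithmetic above can be sketched out, assuming the standard GCN layout of 4 SIMDs of 16 lanes per CU and the rumoured 800 MHz clock (both are assumptions, not confirmed specs):

```python
# Back-of-the-envelope GCN shader math (all figures are rumoured, not confirmed).
CUS = 18                # total compute units in the leak
SIMDS_PER_CU = 4        # GCN: four vector units per CU
LANES_PER_SIMD = 16     # each vector unit is 16 lanes wide
CLOCK_MHZ = 800         # rumoured clock

stream_processors = CUS * SIMDS_PER_CU * LANES_PER_SIMD
peak_gflops = stream_processors * 2 * CLOCK_MHZ / 1000  # FMA counted as 2 FLOPs

print(stream_processors)  # 1152
print(peak_gflops)        # 1843.2
```

Each 16-lane SIMD executes a 64-wide wavefront over 4 cycles, which is where the 'same instruction for 4 cycles' part comes from.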
 
updated info on gpu for ps4

UPDATE: some people are confused about the GPU; here is more info about it:

Each CU contains dedicated:

- ALU (32 64-bit operations per cycle)

- Texture Unit

- L1 data cache

- Local data share (LDS)

About 14 + 4 balance:

- 4 additional CUs (410 Gflops) “extra” ALU as resource for compute

- Minor boost if used for rendering

Dual Shader Engines:

- 1.6 billion triangles/s, 1.6 billion vertices/s

18 Texture units

- 56 billion bilinear texture reads/s

- Can utilize full memory bandwidth

8 Render backends:

- 32 color ops/cycle

- 128 depth ops/cycle

- Can utilize full memory bandwidth
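For what it's worth, most of those headline numbers fall out of the clock-rate arithmetic, again assuming the rumoured 800 MHz clock (the per-clock rates below are standard GCN figures, not confirmed for this chip):

```python
# Cross-checking the leaked figures (rumoured 800 MHz clock assumed throughout).
CLOCK_MHZ = 800

# Dual shader engines, one triangle per engine per clock:
triangles_per_s = 2 * CLOCK_MHZ / 1000          # billions/s
# 8 render backends, 4 color ops each per clock; depth runs at 4x the color rate:
color_ops_per_clock = 8 * 4
depth_ops_per_clock = color_ops_per_clock * 4
# The 4 "extra" CUs at 64 stream processors each, FMA counted as 2 FLOPs:
extra_cu_gflops = 4 * 64 * 2 * CLOCK_MHZ / 1000

print(triangles_per_s)                           # 1.6
print(color_ops_per_clock, depth_ops_per_clock)  # 32 128
print(extra_cu_gflops)                           # 409.6, i.e. the "410 Gflops"
```

The one figure that doesn't quite line up is texturing: 18 CUs with 4 filter units each at 800 MHz would give a theoretical 57.6 billion bilinear reads/s, a bit above the 56 quoted, so either the slide rounds differently or some other limit applies.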
 
So, if true, pretty much a confirmation of what was leaked a few weeks ago.

I find that part of the leak bothersome: bigger than any AMD APU, and along with GDDR5 I could definitely see the system being costlier than Durango.
I would put the die size somewhere between 280 mm^2 and 300 mm^2.

At least it should perform better than Durango. I think pricing and online policies are going to be the factors that decide which system gets the biggest user base; business-wise (income), the picture could be different, though.
 
Can someone explain the 922 gigaops?

When calculating FLOPs it is customary, for marketing reasons, to count a fused Multiply and Add as two FLOPs. However, that is not the rate at which the processor can process instructions in more common lingo, hence the gigaops being half the maximum FLOPs rate.

Pretty redundant information, I'd say, but it does make (semi-)explicit that the processor in question supports FMADD at full instruction rate.
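In numbers, under the assumption of a 1,152-SP part at 800 MHz:

```python
# FLOPS vs. "gigaops": an FMA is one instruction but is marketed as two FLOPs,
# so the instruction rate is half the headline FLOPS figure.
peak_gflops = 1152 * 2 * 800 / 1000  # 1843.2 GFLOPS with FMA counted twice
gigaops = peak_gflops / 2            # actual peak instruction rate
print(round(gigaops))                # 922
```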
 

thanks :smile:

Dual Shader Engines:

- 1.6 billion triangles/s, 1.6 billion vertices/s

How does it stack up compared to the PS3's triangles/s count?
 
http://m.neogaf.com/showpost.php?p=46998393

I think I have an explanation for the '18 texture unit' thing. I think it's just another semantics issue about how things are referred to in different contexts.

In some AMD documentation, they talk about each CU having one texture unit e.g. in here: http://www.siliconwolves.net/frames/...ming_Guide.pdf

Each compute unit contains 64 kB local memory, a texture unit with 16 kB of L1
cache, and four vector units.
The texture unit has 4 texture filter units (what we usually refer to as 'texture units' or TMUs) and 16 texture load/store units.

So, like the render backend/ROPs thing, I think it's talking about 18 'texture units' in this sense - the group of 4 texture filter units in each GCN CU.
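So the two counting conventions, side by side (per-CU layout per the AMD doc quoted above; the per-chip totals are the rumoured figures):

```python
# "Texture units" counted two ways, per the AMD documentation quoted above.
cus = 18
texture_units = cus * 1     # one texture unit per CU -> the "18 texture units" line
tmus = cus * 4              # four filter units per texture unit -> the usual TMU count
print(texture_units, tmus)  # 18 72
```

72 TMUs is the number you'd normally see quoted for an equivalent PC part.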
 
No mention of big Blu-ray discs for 4K. Think the camera is part of the bundle?

Looks like they've updated the GPU info with the same '14 + 4 balance' breakdown quoted further up the thread.


Edit:
I like that a 500 GB HDD will be the standard storage. I was expecting 250 GB at the low end, but this is much better.
 
Well I think it means that 14 CUs is enough for rendering in this setup and the last 4 CUs are better spent on other tasks, but can also be used for rendering.
 
Looks like they will perform worse for rendering than the other 14? A minor boost?

Some additional information would be needed to explain why that would be the case. GCN is structured such that the rest of the system really doesn't know which CU results came from. The rest of the specs show a system just as balanced for 18 CUs as regular GPUs.

That doesn't rule out some design-specific reason why they can't contribute equally by providing ~30% more shader throughput, but the data given so far provides no reason.
 
But they didn't have to split the CUs in order to do that.

Are they split in some way? I just think that past 14 CUs the CPU or something else in the system is starting to be a bottleneck and you get overall better performance/balance by dedicating a few CUs to help/boost the CPU.
 
The 14+4 CU split could be related to graphics context switching. When not gaming, only the 4 CUs would be running with the CPU.

I think the Liverpool GPU is designed as an 18-CU GPU. One exception is that the 14 CUs can be turned off when not needed, or all 18 CUs can be used for graphics if the developer chooses to.

Not exactly - it seems like that 'hardware balanced' is outlined here:
http://en.wikipedia.org/wiki/Unified_shader_model

As I understand it:
- when you 'run random shaders', they will be automatically scheduled/run across 14 CUs, the same as a normal modern graphics card.
- the other 4 CUs are independently controlled by the developer/their API.

I'm guessing that it might be better for physics or something like that, as it should be more reliable? (Or doing AA there? I don't know much about that.)

I assume you could just use them as 4 more CUs, although it wouldn't be as efficient.

EDIT: an obvious use might be something involving that dual-camera setup, or move...?
 