Consoles vs PC bandwidth

No, we do not know the details of the PS4 memory subsystem.

We have a pretty good idea though. The CPU appears to have the same 20 GB/s R/W access to memory. The diagram doesn't make it clear if it's 20 GB/s (bi-directional) for each Jaguar compute module or 20 GB/s total for both. I would assume the former, but that's just basing it off what we "know" about the XBO with its better, clearer diagrams.

[diagram attachment: lvp2.jpg]


EDIT

Almost forgot: the Onion+ bus is pretty much fully confirmed. Cerny notes it to be 20 GB/s, but that may very well be 10 GB/s in each direction as noted in the diagram. That, or it could be a genuine upgrade from the VGLeaks diagram to 20 GB/s R/W.

"First, we added another bus to the GPU that allows it to read directly from system memory or write directly to system memory, bypassing its own L1 and L2 caches. As a result, if the data that's being passed back and forth between CPU and GPU is small, you don't have issues with synchronization between them anymore. And by small, I just mean small in next-gen terms. We can pass almost 20 gigabytes a second down that bus. That's not very small in today’s terms -- it’s larger than the PCIe on most PCs!
 
What about PCI-E bandwidth in PCs? Will it be a limiting factor, seeing that consoles don't need PCI-E to access the GPU?
 
This is interesting:

Digital Foundry: You've previously talked about good performance on Haswell. Intel integrated graphics hasn't enjoyed the best reputation. What do you think of the new architecture?

Oles Shishkovstov: It is much better/faster from a Compute performance point of view but much more bandwidth-starved as a result (except for GT3e [Iris Pro with embedded RAM] maybe). Actually I don't know how Intel/AMD will solve the bandwidth problem for their APUs/SOCs/whatever in the near future. Will we see multi-channeled-DDR3 or a move to GDDR5 or adding huge caches as Intel did?
 
I think PCI-Express 3.0 is only 16.0 GB/s...

It's 16 GB/s write + 16 GB/s read. The bandwidth certainly won't be an issue since it's more than sufficient for sending data back and forth between the CPU and GPU (hence why it hasn't been increased in years).
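
For what it's worth, the raw math behind those numbers checks out. Here's a quick back-of-the-envelope calc; the lane counts and encoding factors are the standard PCIe spec values, not anything from this thread:

```cpp
#include <cstdio>

int main() {
    // Per-direction PCIe bandwidth (GB/s) =
    //   transfer rate (GT/s) * lanes * encoding efficiency / 8 bits per byte
    const double lanes = 16.0;

    // PCIe 2.0: 5 GT/s per lane with 8b/10b encoding (80% efficient)
    double pcie2 = 5.0 * lanes * (8.0 / 10.0) / 8.0;

    // PCIe 3.0: 8 GT/s per lane with 128b/130b encoding (~98.5% efficient)
    double pcie3 = 8.0 * lanes * (128.0 / 130.0) / 8.0;

    printf("PCIe 2.0 x16: %.2f GB/s per direction\n", pcie2); // ~8.00
    printf("PCIe 3.0 x16: %.2f GB/s per direction\n", pcie3); // ~15.75
    return 0;
}
```

So the commonly quoted "16 GB/s" for PCIe 3.0 x16 is really ~15.75 GB/s each way before protocol overhead, and Cerny's ~20 GB/s Onion+ figure does comfortably beat the PCIe 2.0 x16 links in "most PCs" circa 2013.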

Latency of PCI-E will be a problem for running some compute work on the GPU though. PCs will have to find another way to make up for this disadvantage.

To put your original question into context though: in bandwidth terms, the 176 GB/s of the PS4 would be properly compared to a bandwidth of 314 GB/s for high-end single-GPU PCs.
 
Latency, yes. Also, in the PS4 (don't know about the XBO), it should help a lot that the CPU cores can connect directly to GPU compute, without any cache or memory being touched, at 20 GB/s IIRC? That makes for quite a difference, I imagine...
 
It's 16 GB/s write + 16 GB/s read. The bandwidth certainly won't be an issue since it's more than sufficient for sending data back and forth between the CPU and GPU (hence why it hasn't been increased in years).
I stand corrected, thanks.
To put your original question into context though: in bandwidth terms, the 176 GB/s of the PS4 would be properly compared to a bandwidth of 314 GB/s for high-end single-GPU PCs.
Could you explain why?
 
Could you explain why?

Because the PS4 has one memory pool at 176 GB/s shared between the CPU and GPU. A high-end PC has two memory pools: one dedicated to the CPU at 25.6 GB/s, and the other dedicated to the GPU at 288 GB/s in the three highest-end PC GPUs. Adding those pools together is completely valid when comparing to a single shared pool.
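
To spell out where those figures come from: the bus widths and data rates below are my own assumptions for a typical 2013 high-end setup (dual-channel DDR3-1600 on the CPU side, a Titan/GTX 780-class card on the GPU side), not numbers taken from the post itself:

```cpp
#include <cstdio>

// Peak bandwidth (GB/s) = bus width (bits) / 8 * data rate (GT/s)
double bw(double bus_bits, double gtps) { return bus_bits / 8.0 * gtps; }

int main() {
    double cpu_ddr3  = bw(128.0, 1.6); // dual-channel DDR3-1600: 25.6 GB/s
    double gpu_gddr5 = bw(384.0, 6.0); // 384-bit GDDR5 @ 6 Gbps: 288 GB/s
    double ps4_pool  = bw(256.0, 5.5); // 256-bit GDDR5 @ 5.5 Gbps: 176 GB/s

    printf("PC total: %.1f + %.1f = %.1f GB/s\n",
           cpu_ddr3, gpu_gddr5, cpu_ddr3 + gpu_gddr5); // 313.6, i.e. ~314
    printf("PS4 pool: %.1f GB/s, shared by CPU and GPU\n", ps4_pool);
    return 0;
}
```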
 
Yes, that's what I thought at first, but the folks here did point out that the console CPUs (both PS4/Xbone) only have access to about 20 GB/s of the total bandwidth. The CPU is not that powerful, and the bus width to the northbridge is limited to just that amount.
 
Because the PS4 has one memory pool at 176 GB/s shared between the CPU and GPU. A high-end PC has two memory pools: one dedicated to the CPU at 25.6 GB/s, and the other dedicated to the GPU at 288 GB/s in the three highest-end PC GPUs. Adding those pools together is completely valid when comparing to a single shared pool.

Surely, some of that memory bandwidth ends up being spent copying data from CPU memory to GPU memory, which APUs don't have to deal with. And there's always the latency from such an operation. That's why I find the additive number a bit suspect. 1.99 + 1.99 != 4.
 
Latency of PCI-E will be a problem for running some compute work on the GPU though. PCs will have to find another way to make up for this disadvantage.
And yet cloud computing on a 40 KB/s bus with tens of milliseconds latency is perfectly feasible. :p

GPU compute makes more sense to me as part of the CPU. Stick the graphics card on the graphics bus and leave it only to render. At least for games.
 
Surely, some of that memory bandwidth ends up being spent copying data from CPU memory to GPU memory, which APUs don't have to deal with. And there's always the latency from such an operation. That's why I find the additive number a bit suspect. 1.99 + 1.99 != 4.

I don't know how much exchange of information is going on during render time between the two memory pools (provided you have a sufficiently large pool of graphics memory to manage your graphics data) but I can't see it having a significant impact.

If we're going to consider things to that level of detail though, then it may also be worth considering how contention between the CPU and GPU trying to use the same pool of memory may reduce performance compared to two dedicated pools. I'd have thought that may also have some impact on latency.
 
And yet cloud computing on a 40 KB/s bus with tens of milliseconds latency is perfectly feasible. :p

GPU compute makes more sense to me as part of the CPU. Stick the graphics card on the graphics bus and leave it only to render. At least for games.

I completely agree, as long as the CPU has the power available. Whether that's through something like AVX2 or an IGP doesn't really matter to me though; ideally I'd like to see games utilising whatever's available.
 
I don't know how much exchange of information is going on during render time between the two memory pools (provided you have a sufficiently large pool of graphics memory to manage your graphics data) but I can't see it having a significant impact.

If we're going to consider things to that level of detail though, then it may also be worth considering how contention between the CPU and GPU trying to use the same pool of memory may reduce performance compared to two dedicated pools. I'd have thought that may also have some impact on latency.

They probably solved that partly by giving the CPU a fixed (and relatively limited) budget? Far more important, I think, is the fact that the CPU and GPU don't always need memory as a go-between. I'm really curious what the gains can be there, in terms of bandwidth but also latency.
 
Latency of PCI-E will be a problem for running some compute work on the GPU though. PCs will have to find another way to make up for this disadvantage.

I think that high-end discrete GPUs will accommodate the PCI-E limitation by employing CPU cores on die. Basically going heterogeneous, with Nvidia using ARM and AMD using either ARM or x86.

In the mid/low-end range, heterogeneous cores (APUs) from both Intel and AMD will probably handle light GPGPU workloads, while discrete GPUs (minus CPU cores) can facilitate traditional rendering.
 
Well, if this hasn't been posted before: EA looks to be banking a bit on the whole HSA side of things.

http://www.theverge.com/2013/6/21/4452488/amd-sparks-x86-transition-for-next-gen-game-consoles

Electronic Arts is one company still asking that question, though, and not necessarily for the reason you'd expect. EA Sports boss Andrew Wilson says that one reason none of its next-gen sports games are coming to PC is because Microsoft and Sony's new game consoles are actually more powerful than many PCs in a very specific, subtle way: "How the CPU, GPU, and RAM work together in concert," Wilson told Polygon.

That might sound suspiciously vague, but we spoke to AMD and it's actually true. The AMD chips inside the PlayStation 4 and Xbox One take advantage of something called Heterogeneous Unified Memory Access (HUMA), which allows both the CPU and GPU to share the same memory pool instead of having to copy data from one before the other can use it. Diana likened it to driving to the corner store to pick up some milk, instead of driving from San Francisco to Los Angeles. It's one of AMD's proposed Heterogeneous System Architecture (HSA) techniques to make the many discrete processors in a system work in tandem to more efficiently share loads.

EA puts HSA over the top for AMD?
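
To make the Verge's milk-run analogy concrete, here's a toy sketch of the two data paths. No real GPU API is involved; the function names are made up, and this only simulates the idea of zero-copy sharing, it isn't how any actual driver works:

```cpp
#include <cstring>
#include <vector>

// Discrete GPU model: CPU data must first be copied into GPU-visible
// memory (the drive from San Francisco to Los Angeles).
void discrete_path(const std::vector<float>& cpu_data,
                   std::vector<float>& gpu_mem) {
    gpu_mem.resize(cpu_data.size());
    std::memcpy(gpu_mem.data(), cpu_data.data(),
                cpu_data.size() * sizeof(float));
    // ...a kernel would then read gpu_mem on the GPU side...
}

// HUMA model: CPU and GPU address the same pool, so "transferring" the
// data is just handing over a pointer (the trip to the corner store).
const float* huma_path(const std::vector<float>& shared_data) {
    return shared_data.data(); // the GPU reads the same bytes the CPU wrote
}

int main() {
    std::vector<float> data(1 << 20, 1.0f); // 4 MB of CPU-produced data
    std::vector<float> gpu_mem;
    discrete_path(data, gpu_mem);           // pays for a full 4 MB copy
    const float* view = huma_path(data);    // pays for nothing
    (void)view;
    return 0;
}
```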
 
The XBO GPU also has a 30 GB/s shared, coherent, read/write link with the CPU. They just don't call it Onion+.
 