Yeah, I see your point. However, I'm actually not abandoning the prospect of seeing a Sea Islands GPU at the time of launch. It may not be more powerful, even though there's a high chance it will be, but the advantages of a cooler and more efficient 8000-series chip sound like the more logical choice by all means. I just don't know if Sony has the timing right for this in relation to the launch, because a beefy cooling solution and custom motherboard would have to be made to support it, along with embedded GDDR5. In a console this is a given, but on PC it's not worth the hassle because the benefits are basically zero. While there are other PCs out there still using a discrete CPU and GPU, games cannot be designed to take advantage of such a system.
I doubt a fully fledged GCN GPU would ever be used, that is, Tahiti or its successor; more likely the cut-down Pitcairn, which lacks DP compute capability. But it's still a fair possibility that this is what the final product will be: an ultra-high-end APU.
I'm not really sure which is more likely, though: the above, or a custom APU similarly specced to current ones, coupled with a discrete, customised mid-to-high-end GPU.
On the other hand, the worldwide technology executive from Square Enix is preparing to showcase the Agni demo on at least one next-gen console in June 2013. This kinda puts the weak vanilla A10 APU to rest, doesn't it, since we're talking about something close to a GTX 680 in power.
Well it could be another console and not the PS4 ....
ultragpu said:
10 bucks says you're wrong.
Seriously, it could be both the 720 and the PS4. Again, given the sheer market size PlayStation has in Japan, SE wouldn't dare in a million years to leave Sony out.
They don't crossfire well perhaps, but if the GPU renders the world, and the APU renders characters, certain objects and other doodads, smoke, haze, particle effects and so on... Or the GPU renders everything, then the APU handles post-processing like bloom, tone-mapping, depth of field, FXAA, and possibly physics workloads.
In a deferred renderer they could render different passes quite comfortably.
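As a toy illustration of that kind of split (the device names, the pass list and the submit() helper below are all made up for the sake of the example, not any real API), something like:

# Hypothetical sketch of splitting a deferred frame across two devices.
# "big_gpu" and "apu_gpu" are invented stand-ins, not a real API.
PASSES = [
    ("g-buffer: world geometry",      "big_gpu"),
    ("g-buffer: characters, doodads", "apu_gpu"),
    ("shadow maps",                   "big_gpu"),
    ("deferred lighting",             "big_gpu"),
    ("particles, haze",               "apu_gpu"),
    ("post: bloom, DoF, FXAA",        "apu_gpu"),
]

def submit(device, pass_name, frame):
    # Stand-in for real command submission; just logs the assignment.
    print(f"frame {frame}: {pass_name:<30} -> {device}")

def render_frame(frame):
    for pass_name, device in PASSES:
        submit(device, pass_name, frame)

render_frame(0)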
That HT interface would need to be awfully wide. We are not speaking about the DDR2/3 interfaces of CPUs, where HT was barely able to keep pace. Look at the bandwidth numbers of GPUs in the performance region we are talking about, let alone a possible eDRAM solution (should some kind of dual-ported eDRAM sit in between the APUs?). That doesn't look like a good option to me. So, is it possible to implement? Sure! Does it make sense? In my opinion, not.

A pair of APUs with 2 full-width coherent HT links could provide each die with around the same peak bandwidth numbers from the remote memory pool as from its own, obviously with latency penalties and minus whatever share the other chip is using of its own bandwidth.
The interface between the two could be smaller, but if you want the simplest relationship for the software to deal with for non-AFR, it can't be too narrow.
The GPUs would need to be designed to allow them to readily work together in this fashion.
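For a rough sense of the scale being argued about (my own back-of-the-envelope, assuming HT 3.x at its 6.4 GT/s maximum and an HD 7850-class card as the comparison point; none of this comes from a leak):

# Back-of-the-envelope: one HyperTransport link vs. a midrange GPU's local memory.
# Assumes HT 3.x at 6.4 GT/s and an HD 7850 (256-bit GDDR5 at 4.8 Gbps per pin).
ht_gt_per_s = 6.4
ht_link_gb_s = 16 / 8 * ht_gt_per_s          # 16-bit link, one direction: 12.8 GB/s

gddr5_bus_bits = 256
gddr5_gbps_per_pin = 4.8
gpu_local_gb_s = gddr5_bus_bits * gddr5_gbps_per_pin / 8   # ~153.6 GB/s

print(f"one 16-bit HT link, one direction: {ht_link_gb_s:5.1f} GB/s")
print(f"HD 7850 local memory bandwidth:    {gpu_local_gb_s:5.1f} GB/s")
print(f"ratio: ~{gpu_local_gb_s / ht_link_gb_s:.0f}x")

So a single narrow link is an order of magnitude short of what a midrange card already pulls from its own memory, which is the whole objection.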
Also, something I've been thinking about over the last few years is using one chip to do all of the lighting / ray-tracing and other things, and leaving the other chip freed up to do as much as it can with the rendering, because it doesn't have to worry about all the other tasks that the first chip can take off its hands.

With GCN, one can basically partition a single GPU (assuming you were talking about GPU resources). That's way more flexible and should give higher performance from the same total number of CUs.
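Purely to illustrate the partitioning idea (the CU count and the split below are invented, not a leaked spec):

# Illustrative partition of one GPU's compute units (CUs) between jobs,
# instead of dedicating a whole second chip to lighting.  Numbers are invented.
TOTAL_CUS = 20

partition = {
    "geometry + shading":     14,
    "lighting / ray queries":  4,
    "post-processing":         2,
}

assert sum(partition.values()) == TOTAL_CUS
for job, cus in partition.items():
    print(f"{job:<24} {cus:2d} CUs ({cus / TOTAL_CUS:.0%})")

The split can also be rebalanced per scene, which is where the flexibility over a fixed second chip comes from.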
They don't crossfire well perhaps, but if the GPU renders the world, and the APU renders characters, certain objects and other doodads, smoke, haze, particle effects and so on... Or the GPU renders everything, then the APU handles post-processing like bloom, tone-mapping, depth of field, FXAA, and possibly physics workloads.
It sounds possible to get it working like that, and that's probably what they had in mind. My comments are based on my gaming experience using an APU+GPU, with one GPU "slightly" overpowering the other.
From what I've tested, once an additional GPU is detected and enabled, they instantly work together on rendering a single frame, or do alternate frame rendering, with no task-specific split. Even when calculating physics, the work is shared along with rendering graphics. Now of course, PC gaming is still not the best way to test hardware utilization correctly, so...
When support for APUs goes up, I'm sure developers will be able to make better use of it. Right now APUs+GPUs don't share tasks the way you would imagine for games (whether assigned manually or by the developer).
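A toy model of why plain AFR feels rough when one GPU "slightly" overpowers the other (the per-frame costs are invented for illustration):

# Toy alternate-frame-rendering model: even frames on the faster GPU,
# odd frames on the slower one.  Frame costs are invented for illustration.
fast_ms, slow_ms = 16.0, 24.0

completion = []
for frame in range(10):
    cost = fast_ms if frame % 2 == 0 else slow_ms
    nth = frame // 2 + 1                # how many frames that GPU has finished
    completion.append(nth * cost)

gaps = [round(b - a, 1) for a, b in zip(completion, completion[1:])]
print("frame-to-frame gaps (ms):", gaps)
# Output: [8.0, 8.0, 16.0, 0.0, 24.0, -8.0, 32.0, -16.0, 40.0]
# The uneven (even negative) gaps are the microstutter people complain about.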
Could an APU help with a feature like picture-in-picture? For example, playing a game while video chatting in the background, with both running smoothly?
Edit: assuming there is also a separate GPU.
I'm using the alleged A10 APU as a starting point.

That HT interface would need to be awfully wide. We are not speaking about the DDR2/3 interfaces of CPUs, where HT was barely able to keep pace. Look at the bandwidth numbers of GPUs in the performance region we are talking about, let alone a possible eDRAM solution (should some kind of dual-ported eDRAM sit in between the APUs?). That doesn't look like a good option to me. So, is it possible to implement? Sure! Does it make sense? In my opinion, not.
How do those HyperTransport links fare with regard to pin count / physical IO?

I'm using the alleged A10 APU as a starting point.
Opterons have 4 HT links with 16 bits in each direction. A single 16-bit HT link at max speed in a single direction provides about what a single DDR3-1600 channel does, with some quibbles given the overhead of HT's protocol.
Two HT links would pair with chips with dual-channel memory, which is what the A10 is.
If the RAM is faster, giving each APU an Opteron's IO would provide for 50 GB/s in chip-to-chip bandwidth in each direction and allow a setup with an aggregate memory bandwidth of over 100 GB/s.
This would be between the Radeon 7770 and 7870 in terms of bandwidth, although a chip with an Opteron's I/O is probably going to have some spare die area thanks to all the extra perimeter it needs.
How do those HyperTransport links fare with regard to pin count / physical IO?

HT uses differential pairs. In terms of data lines, a 16-bit bidirectional link will have 16 signal pairs in each direction for data.
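Putting those numbers together (assuming HT 3.x at its 3.2 GHz / 6.4 GT/s maximum; real parts may clock lower):

# Back-of-the-envelope for the HT numbers above.  Assumes HT 3.x at 6.4 GT/s.
gt_per_s = 6.4
link_gb_s = 16 / 8 * gt_per_s               # one 16-bit link, one direction: 12.8 GB/s
ddr3_1600_gb_s = 1600e6 * 8 / 1e9           # one 64-bit DDR3-1600 channel: 12.8 GB/s
links = 4                                   # an Opteron-style complement of links
chip_to_chip_gb_s = links * link_gb_s       # ~51 GB/s in each direction

# Pin cost of the data lines: 16 differential pairs per direction per 16-bit link.
data_wires_per_link = 16 * 2 * 2            # pairs * wires per pair * directions = 64

print(f"one 16-bit HT link, one direction: {link_gb_s:.1f} GB/s")
print(f"one DDR3-1600 channel:             {ddr3_1600_gb_s:.1f} GB/s")
print(f"{links} links, chip to chip:          {chip_to_chip_gb_s:.1f} GB/s each way")
print(f"data wires per 16-bit link:        {data_wires_per_link} (plus clock/control pairs)")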
I would expect devs and the API used by the system to come up with something better than AFR, but as a starting point it would not be that bad.

The question would be whether AMD would have modified the GPU and uncore of the APU to better manage something besides AFR. One thing about having a heavy interconnect is that the uncore of each chip would be expected to carry a lot more traffic than before, and there had better be features in the GPU to distribute work better.
Any idea what AMD engineers could do to alleviate a bit the bandwidth constraints such a GPU (GPUs, in fact) would face? Could oversized L2, texture cache, and local store help a tad?

The sort of numbers that get bandied about as needed to make a serious impact on things like texturing, above the caches already present, are big. It would need to be some kind of RAM on an interposer or some other expenditure of cash.
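To put a rough number on "big" (the G-buffer layout and reuse factor here are my own assumptions, just to show the scale):

# Rough scale of on-chip storage needed to matter at 1080p.
# The G-buffer layout (16 bytes/pixel) and reuse factor are assumptions.
width, height, fps = 1920, 1080, 60
bytes_per_pixel = 16                  # e.g. four 32-bit render targets
reuse = 4                             # written once, read back by several passes

gbuffer_mb = width * height * bytes_per_pixel / 2**20
traffic_gb_s = gbuffer_mb * reuse * fps / 1024

print(f"1080p G-buffer:   {gbuffer_mb:.0f} MB")      # ~32 MB, well past typical GPU caches
print(f"G-buffer traffic: {traffic_gb_s:.1f} GB/s")  # and texturing comes on top of that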
Just out of curiosity, how much memory do the PS3 and 360 devkits have? And how much did the PS2 and Xbox devkits have?

There are two versions of the 360 devkits. One version has 512 MB, and the other has 1 GB. I used both at work. I wouldn't put too much stock in the "devkits always have twice as much RAM" theory.