AMD: Sea Islands R1100 (8*** series) Speculation/ Rumour Thread

With shared memory, you don't have to worry about coherency. This is a big win for many-core systems, where the overhead of enforcing coherency adds up. You also don't take a cache miss on the first read, which an ordinary cache would incur even when it's really being used as scratch memory. Indeed, many CPUs have special instructions that treat the cache as an incoherent scratchpad, though they still have to deal with cache misses, since the scratch data could be evicted at any time. Having shared memory in the programming model lets the programmer actually take advantage of this without resorting to an ungodly mess of intrinsics.

Those who are in a position to use shared memory can use it without paying for coherence. Those who can't still get the full benefits of cache. Look at how NVIDIA is able to configure blocks of SRAM as either cache or shared memory.

I am advocating doing that for registers, shared memory and caches. Like LRB.
 
OK, so Sony confirmed what we already knew: a Jaguar CPU and a 1.84TF AMD GPU. It could be a clocked-down HD 7870 or a clocked-up HD 7850. What do you guys think it is?
 
It's an SoC, and it's huge, so I'm guessing it's 20 CUs with 2 dead ones for yield reasons. Probably a custom design based on Trinity's GPU side.
 
Trinity's GPU doesn't fit my conception of a next-generation PC GPU. It's a VLIW4 design, and I think some of the rumors (incoherently) point out features that are more consistent with GCN.
 
But even in the best case, the load ties up a register, and takes up extra bandwidth besides.
Extra bandwidth where? And of course a load takes up a register, where would you store the result?!
keldor said:
Also, effective prefetching is hard, especially when you have to cover 100s of cycles latency.
What prefetching are you referring to?
keldor said:
Why eat the cost if you don't actually need the coherency, which is often the case for inner loops? With a true L1 cache, writes eventually have to be flushed to main memory, and reads have to initially come from main memory, which is quite a waste if it's really a local scratchpad.
None of this makes a bit of sense to me.
 
Well, it should be GCN; that way it is more consistent with the PC world (something Sony was proud to brag about), and it will be state of the art too.

It has to be close to something!

Do you have something against mobile variants? Seems closest to that. Perhaps something Solar. 800MHz-ish by the looks of it. Also, I do wonder how many ALUs the highest-end Kabini will lump on its GPU...
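For what it's worth, the "800MHz-ish" figure can be sanity-checked against the 1.84TF number with some quick arithmetic. This is only a rough sketch: it assumes a GCN-style layout of 64 ALU lanes per CU and 2 FLOPs per lane per clock (one fused multiply-add), neither of which Sony has confirmed. The 18-CU case is the "20 CUs with 2 dead ones" guess from earlier in the thread; 16 and 20 are included only for comparison, and the function names are made up for illustration.

```python
# Back-of-the-envelope peak throughput for a GCN-style GPU.
# ASSUMPTIONS (not confirmed by Sony or AMD in this thread):
#   - 64 ALU lanes per compute unit (CU)
#   - 2 FLOPs per lane per clock (one fused multiply-add)
LANES_PER_CU = 64
FLOPS_PER_LANE_PER_CLOCK = 2


def peak_tflops(cus, clock_mhz):
    """Peak single-precision TFLOPS for a given CU count and clock in MHz."""
    flops_per_clock = cus * LANES_PER_CU * FLOPS_PER_LANE_PER_CLOCK
    return flops_per_clock * clock_mhz * 1e6 / 1e12


def clock_mhz_for(tflops, cus):
    """Clock in MHz needed to reach a given peak TFLOPS with a given CU count."""
    flops_per_clock = cus * LANES_PER_CU * FLOPS_PER_LANE_PER_CLOCK
    return tflops * 1e12 / flops_per_clock / 1e6


if __name__ == "__main__":
    # Hypothetical CU counts: 18 is the thread's guess, 16 and 20 are for comparison.
    for cus in (16, 18, 20):
        print(f"{cus} CUs need {clock_mhz_for(1.84, cus):.0f} MHz for 1.84 TFLOPS")
```

Under those assumptions, 18 CUs at 800MHz works out to about 1.84 TFLOPS, so the two guesses are at least consistent with each other. It doesn't prove the CU count, since other CU/clock combinations hit the same number.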
 
The name is Kaveri, APU, end of the year according to roadmap.

Did some research and I'm pretty sure you meant to say Kabini. Kaveri is still based on Steamroller cores, unless something changed.

[Attached image: AMD-2013.png]
 
The formerly Sea Islands ISA document mentioned an enhancement to the command queues for an upcoming device.
I don't know whether something like this could be found in an upcoming console, but it sounds like it would be welcomed.
 
Bearmoo was referring to the HSA features, I believe. That said, Kabini may very well have the same features, but it's not clear, at least not from this roadmap.
 
http://blogs.amd.com/fusion/2013/02/21/amd/

As the blog elaborates, the Sony solution is a Semi-custom APU (delivered by the Semi-Custom BU, in fact). Solutions coming through this route are not necessarily cookie cutter from current (or future) solutions, but obviously take from the IP sets we have available.

Is it strictly speaking an APU, or an SoC? I.e. does it include features traditionally associated with a southbridge?
 