AMD: R8xx Speculation

Jawed · Feb 14, 2010

There is no general RW cache, that's the point. The only way general purpose reads are cached is via the texture system and, as you can see from the diagrams, the texture system's cache hierarchy has no connection to any of the write caches.

Jawed

MfA · Feb 14, 2010

Your point that it isn't an L2 is well taken, but that doesn't prove read accesses from the memory controller can't be sourced from the cache to allow it to be used as a general R/W cache for UAVs. All I see from the diagram is that the R/W cache connects to the memory controllers, which allows the option that reads can bypass memory altogether (making it a R/W cache for UAVs).

trinibwoy · Feb 14, 2010

Very nice article, but I'm not sure what can be inferred from it. If Cypress was in fact cut down and stuff like Sideport was removed, was it just a simple cut (like the number of SIMDs), or did they completely change what the chip was going to be?

In any case, if Northern Islands is another full lineup on 40nm, I don't see how it can be anything other than an architectural overhaul; otherwise what would be the point? Say that back when it was decided that Cypress was going to be a bit smaller than first planned, they also decided to refresh it with a bigger chip on the same architecture. That would make some sense, but it wouldn't make any sense at all for the downmarket derivatives. Yeah, so I'm gonna bet on significant overhauls somewhere, maybe in the geometry pipeline.

DavidGraham · Feb 14, 2010

Mindfury said:
The RV870 Story: AMD Showing up to the Fight

http://anandtech.com/video/showdoc.aspx?i=3740

This is simply a wonderful read, thank you.

Jawed · Feb 14, 2010

Ah, your ninja edits are amusing at times.

MfA said:
Your point that it isn't an L2 is well taken, but that doesn't prove read accesses from the memory controller can't be sourced from the cache to allow it to be used as a general R/W cache for UAVs. All I see from the diagram is that the R/W cache connects to the memory controllers, which allows the option that reads can bypass memory altogether (making it a R/W cache for UAVs).

http://forums.amd.com/devforum/messageview.cfm?catid=390&threadid=127466&enterthread=y

Silent_Buddha · Feb 14, 2010

Mindfury said:
The RV870 Story: AMD Showing up to the Fight

http://anandtech.com/video/showdoc.aspx?i=3740

The part on RV740 was the most fascinating part of that article. It explains why RV740 was a virtual no-show but still played a key role with regard to RV8xx.

Regards,
SB

curious.trier · Feb 14, 2010

trinibwoy said:
Very nice article, but I'm not sure what can be inferred from it. If Cypress was in fact cut down and stuff like Sideport was removed, was it just a simple cut (like the number of SIMDs), or did they completely change what the chip was going to be?

In any case, if Northern Islands is another full lineup on 40nm, I don't see how it can be anything other than an architectural overhaul; otherwise what would be the point? Say that back when it was decided that Cypress was going to be a bit smaller than first planned, they also decided to refresh it with a bigger chip on the same architecture. That would make some sense, but it wouldn't make any sense at all for the downmarket derivatives. Yeah, so I'm gonna bet on significant overhauls somewhere, maybe in the geometry pipeline.

I agree, whatever the replacement for 2010 is called, it has to be either

a) an architectural overhaul,
b) a process shrink.

It would be downright BORING if it was just a port to GF 40nm without any changes.

However, if you look at GF's HKMG brochure, you'll find they say that some customers will announce products in Q1 on their HKMG process. They don't have HKMG on anything 40nm or bigger. And since CPUs and GPUs are typically the first chips to migrate to new processes, chances are high that it will be an AMD GPU, if only a pipecleaning part like RV740.

GZ007 · Feb 14, 2010

Jawed said:
I don't expect anything other than a boring refresh of Evergreen this year. I don't know what its name is.

I don't know if NI is meant to be a refresh of Evergreen (e.g. as minor as RV790 or as major as R520->R580 or RV730->RV740) or if NI is meant to be a substantial change (RV670->RV770, RV770->Evergreen) or if NI is meant to be an architectural re-boot (R580->R600).

Evergreen is less late than I thought it was (I thought it was about 1 quarter late) and logic would indicate that AMD plans a substantial change (RV670->RV770, RV770->Evergreen) for summer/autumn 2010 based on the pattern for RV770 and Evergreen. But I think process complications and the GF factor will all conspire against anything other than a boring refresh this year.

Jawed

I don't see anything boring in a refresh. If they could sell a 1+ GHz 5870 for the price of a 5770, people wouldn't care too much about architecture. If a GF 32nm/28nm refresh gave them much better yields and higher clocks, it could still beat GT400 cards on price (like GT200 vs RV770).

trinibwoy · Feb 14, 2010

Is GF 32nm going to be production ready this year? If so AMD's current process advantage could turn into a slaughter.

Bouncing Zabaglione Bros. · Feb 14, 2010

trinibwoy said:
Is GF 32nm going to be production ready this year? If so AMD's current process advantage could turn into a slaughter.

Supposedly. GF seems to have been quite bullish about 32nm, and there were 32nm wafers shown off by GF a couple of months back that were supposed to be more than just SRAM. If 32nm at GF goes fairly well without any of the delays and problems we've come to expect from TSMC, it could be looking good for CPU/GPU production towards the latter half of the year.

jaredpace · Feb 14, 2010

Are there any die photos of RV870/Cypress yet?

compres · Feb 14, 2010

jaredpace said:
Are there any die photos of RV870/Cypress yet?

No, I am also waiting for this.

MfA · Feb 14, 2010

Jawed said:
Ah, your ninja edits are amusing at times.

I always type before I think. Anyway, so indeed "presently" all UAV accesses go either through texture cache or uncached :/ I wonder what the performance is if you try to abuse atomics to use it as a R/W cache anyway (atomic OR 0 with return to read for instance).
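A host-side analogy of the "atomic OR 0 with return" trick, as a minimal sketch only: this is plain C++ `std::atomic`, not Evergreen ISA or HLSL, and the helper names are hypothetical. It shows why the trick works logically (OR with 0 returns the current value and leaves it unchanged); it says nothing about how the hardware R/W cache would actually perform.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical helper: "read" a cell by issuing an atomic OR with 0.
// fetch_or returns the value held before the operation, and OR-ing
// with 0 leaves the stored value untouched.
std::uint32_t read_via_atomic_or(std::atomic<std::uint32_t>& cell) {
    return cell.fetch_or(0u, std::memory_order_relaxed);
}

// Small self-check: the read returns the stored value and does not alter it.
void check_read_is_nondestructive() {
    std::atomic<std::uint32_t> cell{0x1234u};
    assert(read_via_atomic_or(cell) == 0x1234u);
    assert(cell.load() == 0x1234u);
}
```

On a GPU the equivalent would go through the atomic path rather than the texture path, which is the whole point of the abuse; whether that is any faster is exactly the open question.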

BTW ... the presentation did suggest something else, not so much the diagram but the actual text :
"Unordered shared consistent loads/stores/atomics via R/W Cache"

Alexko · Feb 14, 2010

Bouncing Zabaglione Bros. said:
Supposedly. GF seems to have been quite bullish about 32nm, and there were 32nm wafers showed off by GF a couple of months back that were supposed to be more than just SRAM. If 32nm at GF goes fairly well without any of the delays and problems we've come to expect from TSMC, it could be looking good for CPU/GPU production towards the latter half of the year.

Since 28nm is only 3 months behind 32nm and is a bulk process, I wouldn't expect AMD to bother with 32nm for GPUs... if it weren't for Llano.

Since they have to make a GPU on 32nm SOI anyway, why not make use of the experience they'll gain, and release more GPUs on 32nm? I wonder if it makes sense to do so...

Jawed · Feb 14, 2010

MfA said:
I always type before I think. Anyway, so indeed "presently" all UAV accesses go either through texture cache or uncached :/ I wonder what the performance is if you try to abuse atomics to use it as a R/W cache anyway (atomic OR 0 with return to read for instance).

It'd be fun and it is 128KB...

BTW ... the presentation did suggest something else, not so much the diagram but the actual text :
"Unordered shared consistent loads/stores/atomics via R/W Cache"

Unfortunately the ISA document is in "preview" condition with large chunks missing... This may be the abuse that you mentioned in the prior paragraph.

It seems you would use EXPORT_RAT_INST_XCHG_RTN to write using the atomic functionality, and then ignore the return value. The bandwidth is low, because writes are DWords with no option to write 128 bits. No idea what kind of serialisation mechanics are in play.
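To illustrate the write side in the same spirit, here is a minimal host-side sketch in C++ `std::atomic` (an analogy, not the actual EXPORT_RAT_INST_XCHG_RTN instruction; all function names are hypothetical). It writes by atomic exchange and discards the returned old value, and it counts operations to show why a DWord-only path needs four of them for 128 bits.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: store one DWord via atomic exchange and
// deliberately ignore the old value that the exchange returns.
void write_via_atomic_xchg(std::atomic<std::uint32_t>& cell, std::uint32_t value) {
    (void)cell.exchange(value, std::memory_order_relaxed);
}

// Writes 128 bits as four separate DWord exchanges and returns the
// number of atomic operations issued.
std::size_t write_128_bits(std::array<std::atomic<std::uint32_t>, 4>& dst,
                           const std::array<std::uint32_t, 4>& src) {
    std::size_t ops = 0;
    for (std::size_t i = 0; i < dst.size(); ++i) {
        write_via_atomic_xchg(dst[i], src[i]);
        ++ops;
    }
    return ops;
}

// Small self-check: four operations are needed and every DWord lands.
void check_four_ops_per_128_bits() {
    std::array<std::atomic<std::uint32_t>, 4> dst{};
    const std::array<std::uint32_t, 4> src{1u, 2u, 3u, 4u};
    assert(write_128_bits(dst, src) == 4);
    for (std::size_t i = 0; i < dst.size(); ++i) {
        assert(dst[i].load() == src[i]);
    }
}
```

The four-to-one operation count is only the logical cost; the real serialisation behaviour on the hardware is unknown here.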

Jawed

Lightman · Feb 14, 2010

Alexko said:
Since 28nm is only 3 months behind 32nm and is a bulk process, I wouldn't expect AMD to bother with 32nm for GPUs... if it weren't for Llano.

Since they have to make a GPU on 32nm SOI anyway, why not make use of the experience they'll gain, and release more GPUs on 32nm? I wonder if it makes sense to do so...

Plus the benefit of power gating on 32nm SOI from GloFo ...
Quite a feature for a high-end GPU, not so much for the low end, where there is little to shut down.

dkanter · Feb 14, 2010

Historically, AMD has been 1 year behind Intel on process technology. I see no reason why this has changed. GF was formed in early 2009, which is probably too late to have much impact on development of 32nm.

My guess is that GF has 32nm parts at the very end of the year (perhaps Llano), but not in high volume. I could very well be wrong, but that's my guess.

Of course, the real issue is not how GF compares to Intel, it's how GF compares to TSMC...and that's trickier to guess.

David

Alexko · Feb 14, 2010

dkanter said:
Historically, AMD has been 1 year behind Intel on process technology. I see no reason why this has changed. GF was formed in early 2009, which is probably too late to have much impact on development of 32nm.

My guess is that GF has 32nm parts at the very end of the year (perhaps Llano), but not in high volume. I could very well be wrong, but that's my guess.

Of course, the real issue is not how GF compares to Intel, it's how GF compares to TSMC...and that's trickier to guess.

David

GF's roadmap says that risk production for 32nm is scheduled for mid-2010, which does seem a little tight if AMD wanted to refresh the entire Evergreen line-up in 2010, but for just Cypress it seems very doable.

Also, while GF was formed in 2009, it was planned much earlier than that, and no doubt they anticipated it in some ways. For instance, there's a 40nm LP bulk process planned for risk production in mid-2010 and I doubt GF waited for its effective independence to start working on it.

That said, AMD seems more likely to play it safe and just stick to TSMC's 40nm process for 2010. By now, they know it well... I just don't think we can completely rule out GF's 32nm.

nAo · Feb 15, 2010

Jawed said:
UAV reads can either go through the texture cache hierarchy or they can be uncached reads from global memory.

Playing a bit with GSA, I noticed that UAV reads are performed via texture fetches. The interesting thing is that, since texture caches are read-only and not coherent, they seem to issue a cache line eviction for each UAV read, followed by a memory barrier to wait for the eviction to complete (just before the texture fetch). I wonder why they don't simply use uncached reads.

Global Shared Memory, additionally, provides a RW surface - but it's 64KB.

They use it to handle UAV global counters and counters for append/consume buffers (randomly noticed while playing with GSA).

Squilliam · Feb 15, 2010

If AMD are going to transition a chip onto the GF 32nm process quickly, I would suggest it's likely the Xbox 360 GPU. It makes the most sense for three reasons. It's directly applicable to getting Fusion up and running. It's secretive, in that Nvidia probably won't find out they've done it until far later than with a separate GPU SKU, as Microsoft wouldn't even start selling them until the end of the year. And it works in with their old Global Foundries agreement, in that it nets a new client for GloFo as well as giving them an assured payout with little risk to themselves.

As an aside, I wonder if in the future they will deliberately sell chips they can use for both computers and consoles for economies of scale. They can divide the R&D costs over a larger number of chips, and they can lower the overall cost per chip for themselves and the console manufacturer by taking the valuable high-bin chips and salvaging the lower-bin chips, so there would be less wastage and better margins all around.
