AMD: R9xx Speculation

Why is 256bit such a cost problem today? Radeon 4850 had 256bit, 9600 GT had 256bit.
Is AA only for the high end these days, or what? Even at 1920x1080, AA can improve picture quality more than any other graphics setting.
And even low-cost computers can have full HD monitors these days. The difference between a cheap 17-inch sub-HD monitor and a basic 22-inch full HD one is $10-20. So I don't see a reason why mainstream cards can't and shouldn't be much faster.
I'm just saying that Juniper was a really balanced and well-done mid-range chip - and I don't really see how tipping that balance in the direction of a more AA and pixel render heavy design would make any sense at this point of time.

In fact, HD 5770 kept up very well @ high resolutions and lots of AA and AF when compared to "older" 256bit brethren à la 4870 and even current-gen 256bit cards like 5850.

A new chip with a memory bandwidth and AA/pixel render performance similar to Cypress should also offer shader- and texture-performance similar to Cypress. AMD learnt to keep their memory interfaces as small as performance-wise reasonable after 2900XT. Why would they break with that approach now?

I'm not saying that Barts actually having 32 ROPs and a 256bit interface is unreasonable. I'm just saying that if Barts indeed has 32 ROPs and a 256bit memory interface, the die area used to realize that would only be worth it if the chip as a whole performed a lot better than the ~ GTX 460 1GB performance level indicated on that curiously "leaked" slide.

A 256bit memory interface and Cypress-like Z/stencil performance are nice to have - but the rest of the design should scale accordingly. Especially the specs suggested for "Barts Pro" in that slide seemed like a total waste of die space to me. AMD could easily reach GTX 460 768 performance levels with a tweaked Juniper chip and slightly faster memory ... it would be a shame if a Barts chip @ Cypress-like memory bandwidth, AA capabilities and pixel render performance couldn't do much better than that, especially looking at the suggested power draw levels, too ...
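
For a rough sense of what the 128bit vs. 256bit gap means in numbers, here is a minimal sketch of the usual peak-bandwidth arithmetic (bus width in bytes times effective per-pin data rate). The memory data rates are the commonly quoted reference-card figures, not numbers from this thread, so treat them as assumptions; the function name is just illustrative.

```python
def bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s: bytes per transfer times effective Gbps per pin."""
    return bus_width_bits / 8 * data_rate_gbps


# (bus width in bits, effective GDDR5 data rate in Gbps per pin).
# Assumed reference-card values, not taken from the thread.
cards = {
    "HD 5770 (Juniper)": (128, 4.8),
    "HD 5850 (Cypress)": (256, 4.0),
    "HD 5870 (Cypress)": (256, 4.8),
}

for name, (bits, rate) in cards.items():
    print(f"{name}: {bandwidth_gb_s(bits, rate):.1f} GB/s")
```

With these inputs a 256bit Barts at Cypress-like memory clocks would have roughly twice the peak bandwidth of Juniper, which is why the rest of the chip would need to scale to make use of it.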
 
A new chip with a memory bandwidth and AA/pixel render performance similar to Cypress should also offer shader- and texture-performance similar to Cypress.

Maybe Cypress's 80 TMUs were quite a waste at 850 MHz, but they were needed with the fast-design "2x 4870" approach. (GTX 480 performance with almost half the texel fillrate comes to mind.)
Also, those 48 TMUs could be much faster than 48 of the same 5800 TMUs if they completely changed the architecture. A single TMU is not always the same as a single TMU :LOL:.
 
I'm wondering if AMD won't release two kinds of Barts XT/6770 cards: one that uses a single six-pin connector and one that uses two.

That would be nice for people and companies like Dell that are basing their selection on the need for just one six pin connector/PSU limitations.

It might also be a straight upgrade path for 5770 users.

AMD's partners could go wild with OC'd/highly overclockable versions using two connectors.

Seems a lot of sites are making a big deal out of OC'd performance.

If the 6770 had more bandwidth than needed then overclocking becomes simpler, no memory overclock needed for straight up gains by overclocking the core and shaders.

No need for a voltage boost to the memory either.

Naturally a very unbalanced design would be undesirable. But one that is slightly so might fit today's environment.

I guess yields of good 6770 chips and the design of the 6750 would have to be considered.

If AMD speed bins their 6770's and find a significant number with nothing broken but that only meet a lower speed specification then my speculation might have merit.

But why buy a downclocked 6770 if there are single six pin connector 6750's to be had?

Just thinking out loud.
 
Maybe Cypress's 80 TMUs were quite a waste at 850 MHz, but they were needed with the fast-design "2x 4870" approach. (GTX 480 performance with almost half the texel fillrate comes to mind.)
Also, those 48 TMUs could be much faster than 48 of the same 5800 TMUs if they completely changed the architecture. A single TMU is not always the same as a single TMU :LOL:.
Do you expect a completely new architecture? NI was intended to be a 32nm refresh of the Evergreen family, not a new architecture.

The Evergreen architecture wasn't bad and there are no DX-next demands, so I see no reason to scrap it. There is some room for improvement, of course... but a new architecture is unlikely.

I think they need an improved caching structure more than new TMUs.
 
Do you expect a completely new architecture? NI was intended to be a 32nm refresh of the Evergreen family, not a new architecture.

The Evergreen architecture wasn't bad and there are no DX-next demands, so I see no reason to scrap it. There is some room for improvement, of course... but a new architecture is unlikely.

I think they need an improved caching structure more than new TMUs.

Was it just a refresh architecture? I remember quite a lot of talk about a new architecture focused on efficiency (/mm^2, /W) even more than Evergreen.
 
Also, those 48 TMUs could be much faster than 48 of the same 5800 TMUs if they completely changed the architecture. A single TMU is not always the same as a single TMU :LOL:.
That's true, but the slide still indicates "GTX 460 1GB" as the rough performance target for Barts XT - putting those 48 Barts TMUs (improved or not) well below the 72 slower-clocked Cypress TMUs in a HD 5850 card. The same holds true for the shaders. Even if Barts' shaders are heavily improved, the indicated performance target still puts their combined performance well below the overall performance offered by a HD 5850 card - and all that @ a TDP > 150W?
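
To put a number on the TMU comparison, here is a hedged back-of-the-envelope sketch. Peak texel rate is simply TMU count times core clock. The 72 TMUs for HD 5850 and the 48 TMUs for Barts XT are the figures discussed above; the 725 MHz HD 5850 reference clock is my assumption (not stated in the thread), and the Barts clock is left open since the slide's value isn't quoted here. Function names are illustrative only.

```python
def texel_rate_gtexels(tmus: int, core_mhz: float) -> float:
    """Peak bilinear texel rate in GTexel/s: one texel per TMU per clock."""
    return tmus * core_mhz / 1000.0


def break_even_clock_mhz(target_gtexels: float, tmus: int) -> float:
    """Core clock (MHz) at which `tmus` units match a target peak texel rate."""
    return target_gtexels * 1000.0 / tmus


# HD 5850: 72 TMUs at an assumed 725 MHz reference clock.
hd5850 = texel_rate_gtexels(72, 725)

# Clock that 48 architecturally unchanged TMUs would need to match that peak rate.
needed = break_even_clock_mhz(hd5850, 48)

print(f"HD 5850 peak: {hd5850:.1f} GTexel/s")
print(f"48 TMUs need about {needed:.0f} MHz to match")
```

This only compares peak rates; it says nothing about per-TMU efficiency, which is exactly the part that could differ if the TMUs were redesigned.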

Sorry, but I really HOPE for the sake of AMD that whoever "leaked" that slide deliberately changed at least ONE spec and/or the performance targets (the screenshot was taken in edit mode, after all ...).
 
Was it just a refresh architecture? I remember quite a lot of talk about a new architecture focused on efficiency (/mm^2, /W) even more than Evergreen.
Better efficiency can be achieved even without a completely new architecture...

Anyway, my speculation is based on these facts:

- NI was developed for the 32nm process (roadmap)
- TSMC's 32nm process was planned for the beginning of Q4/09 (roadmap)
- ATi always uses new manufacturing processes as early as it can
- an old ATi roadmap showed a family of products prepared for Easter 2010 (reportedly; I never saw it personally)

I think the initial plan was to release NI around Easter 2010 on TSMC's 32nm process. That would be 6 months after the launch of the process, so they could expect it to be mature enough for mass production. Assuming there is no gross mistake in this speculation, NI should be a half-year refresh of the Evergreen family, so presumably not a completely new architecture. I can imagine some improvements or testing of new functional blocks that will be used in the next generation (e.g. cache structure, ROPs, etc.)...
 
That's true, but the slide still indicates "GTX 460 1GB" as the rough performance target for Barts XT - putting those 48 Barts TMUs (improved or not) well below the 72 slower-clocked Cypress TMUs in a HD 5850 card. The same holds true for the shaders. Even if Barts' shaders are heavily improved, the indicated performance target still puts their combined performance well below the overall performance offered by a HD 5850 card - and all that @ a TDP > 150W?

I agree. Unless there's some 'magic sauce' (redesigned TMUs, improved caches, other reworked parts) in that design, I wouldn't expect it to be impressive - especially considering the power draw. However, I'm not too convinced by the slide. It just doesn't fit the design philosophy of recent ATI GPUs (HD4xxx, HD5xxx series) and the perf/watt mantra. According to rumours, though, we will find out soon enough.
 
Better efficiency can be achieved even without a completely new architecture...

Anyway, my speculation is based on these facts:

- NI was developed for the 32nm process (roadmap)
- TSMC's 32nm process was planned for the beginning of Q4/09 (roadmap)
- ATi always uses new manufacturing processes as early as it can
- an old ATi roadmap showed a family of products prepared for Easter 2010 (reportedly; I never saw it personally)

I think the initial plan was to release NI around Easter 2010 on TSMC's 32nm process. That would be 6 months after the launch of the process, so they could expect it to be mature enough for mass production. Assuming there is no gross mistake in this speculation, NI should be a half-year refresh of the Evergreen family, so presumably not a completely new architecture. I can imagine some improvements or testing of new functional blocks that will be used in the next generation (e.g. cache structure, ROPs, etc.)...

Then we of course have to (again) draw the line between a refresh and a new architecture - not everything has to be completely new from scratch for it to be called a new architecture. And wasn't NI already supposed to be 4D instead of 4+1D shader-wise, regardless of whether it was planned for Easter 2010 or not, not to mention the other changes?
 
Better efficiency can be achieved even without a completely new architecture...

Anyway, my speculation is based on these facts:

- NI was developed for the 32nm process (roadmap)
- TSMC's 32nm process was planned for the beginning of Q4/09 (roadmap)
- ATi always uses new manufacturing processes as early as it can
- an old ATi roadmap showed a family of products prepared for Easter 2010 (reportedly; I never saw it personally)

I think the initial plan was to release NI around Easter 2010 on TSMC's 32nm process. That would be 6 months after the launch of the process, so they could expect it to be mature enough for mass production. Assuming there is no gross mistake in this speculation, NI should be a half-year refresh of the Evergreen family, so presumably not a completely new architecture. I can imagine some improvements or testing of new functional blocks that will be used in the next generation (e.g. cache structure, ROPs, etc.)...

How do we know for sure NI was planned for 32nm? TSMC didn't make a sudden decision to cancel 32nm; it was announced long ago (AFAIK back in late 2008), and they surely inform their customers well in advance about any changes to their roadmap.

If you think TSMC 32nm would have been ready by Easter 2010 you're dreaming. Even Intel started production on 32nm only in Q4 2009 (with products shipping in Q1 2010) and the rest of the industry lags Intel by at least 12 months. Remember TSMC's 40nm process was "ready" in Q4 2008.

Anyway, do you think they had time to design a new architecture on a new process six months after releasing Cypress? It took them 15 months from RV770 to Cypress. And given the path they took with 40nm, I think ATI would first try out a "pipe cleaner" part à la RV740 on a new process before attempting full production.
 
How do we know for sure NI was planned for 32nm? TSMC didn't make a sudden decision to cancel 32nm; it was announced long ago (AFAIK back in late 2008), and they surely inform their customers well in advance about any changes to their roadmap.
There were several sources indicating that TSMC canned their 32nm process just a few weeks before the first NI chip @32nm was supposed to be taped out.

I don't have time to search for them right now, but a few minutes of googling should turn up at least some of them.
 
You will be pleasantly surprised! :smile:
hist! ;-)

How do we know for sure NI was planned for 32nm? TSMC didn't make a sudden decision to cancel 32nm; it was announced long ago (AFAIK back in late 2008), and they surely inform their customers well in advance about any changes to their roadmap.
We know it because that old ATi roadmap (showing NI on 32nm) is publicly available and has been posted several times around the forum.

If you think TSMC 32nm would have been ready by Easter 2010 you're dreaming.
TSMC's 2008 roadmap showed the 32nm process for Q4/09. Nobody knew at that time how complex the problems they would run into would be.

Anyway, looking at TSMC's old roadmap:
The 65nm process was released in Q4/06; ATi released its first 65nm part in Q2/07 - 6 months afterwards.
The 55nm process was released in Q2/07; ATi released its first 55nm part in Q4/07 - 6 months afterwards.
The 32nm process was planned for Q4/09. Given the previous time-frames, it's logical to expect that ATi planned to use it 6 months later - during Q2/10.
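
The extrapolation above is just quarter arithmetic; a minimal sketch, assuming the process dates quoted in this post and a fixed lag of two quarters (the helper name is made up for illustration):

```python
def add_quarters(quarter: int, year: int, n: int) -> tuple[int, int]:
    """Return (quarter, year) that lies n quarters after the given quarter."""
    index = year * 4 + (quarter - 1) + n
    return index % 4 + 1, index // 4


LAG_QUARTERS = 2  # "6 months afterwards", as observed for 65nm and 55nm

# Process availability per the roadmap dates quoted in the post.
processes = {"65nm": (4, 2006), "55nm": (2, 2007), "32nm (planned)": (4, 2009)}

for name, (q, y) in processes.items():
    pq, py = add_quarters(q, y, LAG_QUARTERS)
    print(f"{name}: process Q{q}/{y} -> first part around Q{pq}/{py}")
```

The same two-quarter lag reproduces the 65nm and 55nm product dates and lands the 32nm part in Q2/2010, i.e. around Easter 2010.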

Anyway, do you think they had time to design a new architecture on a new process six months after releasing Cypress? It took them 15 months from RV770 to Cypress. And given the path they took with 40nm, I think ATI would first try out a "pipe cleaner" part à la RV740 on a new process before attempting full production.
A roadmap (= something that is expected to happen) is one thing, and reality viewed in retrospect is another. Now we know that TSMC had more problems than anybody expected, and we know why it took 15 months to release Evergreen... but nobody knew that in 2008. So if we are talking about old roadmaps, all these things are irrelevant, because they couldn't have influenced those roadmaps.

Anyway, I used this speculation to point out how illogical it is to expect a completely new architecture in such a short time-frame. I'm implying that minor architectural changes are more likely - smaller than between RV770 and RV870, possibly focused on efficiency and - maybe - testing of some new structures developed for the real next-gen architecture.
 
F1 2010 will be released with DX9 with DX11 to follow at a later date (it's there, just not ready yet). Since Dirt2 was offered free with the 5000 series I wonder if they will do the same again with F1 2010?
 