While we are at it, I'll wager that in the BD refresh (if not @ 22nm SOI) they are planning to take a couple (somewhere between 1-4) of SIMD engines and put them into a Bulldozer module. The SIMDs will have their private L1 data and texture caches (just like BD cores have theirs), and the L2 caches of the CPU and SIMD cores will be unified.
You probably envisioned a unified L3 shared across the HT link, but a unified L2? That's something even Intel couldn't do with its Conroe cache improvements. To unify the L2 caches they would need to simulate or integrate a GPU core inside one of Bulldozer's "dual CMT core" modules, and that was never the intention for this first Fusion generation (announced for 2008), and it won't be in the foreseeable future (next 6 years). Also, CPU-like L1 datapaths are mentioned nowhere, while the L1/L2 texture caches of RV770/870 -> Southern Islands (R9xx) are doing a pretty damn good job.
Bulldozer's been work in progress for yonks, longer than the next GPU.
Yep, that's a good example of AMD's poor decision making: they ditched years of development on the first Bulldozer design, which couldn't do FMA as effectively as Intel's Sandy Bridge design (well, the shrink that actually supports FMA), and postponed the Bulldozer release for a 2nd/3rd "improved" design. And now some wise guy babbling about IDF Fall 2008 on the Intel forum claims no FMA in SB, and nothing until the tick (Sandy Bridge shrink) on 22nm, according to the wiki.
Anyway, they could release a working BD as it is today, just with the old-school coding scheme that's less effective than the one in the current version. But AMD's design will still be more register-hungry. (And all that is tied way back to the first NetBurst cache organization.) So probably no bad impact on AMD's approach after all. It would be good enough even if it were released 2 years earlier than it will be.
RV770's development began in 2005
By their own words, which I read somewhere, RV770's development never really "began"; it was done in parallel once they figured out that neither the node (90nm/80nm) nor the timeframe was on their side, when nV released G80 while they were still troubleshooting their R500/R600 design and desperately needed to get it out as the HD 2900 XT quickly enough. So RV770 development began somewhere in late 2006 as a spun-off refurbishing of the R600 design.
(But if we really acknowledge RV770 as the original R500 design, then its design began far earlier than 2005, somewhere around the release of the X800 Pro in Spring 2004 or even earlier.
... Just think of Cypress: it's a really polished, bug-free product based on those 5+ years of redesigning.)
rpg.314 said:
Evergreen's development began in 2006
IMHO, Evergreen is an R600 descendant with 4th-gen tessellation, now part of the DX API, and other small improvements (compression schemes, larger texture buffers, threading) over the previous RV770. Maybe Northern Islands began its R&D somewhere in 2007-08, because as they said we'll only see smaller improvements in SI already. So integration takes small steps every 2 years now.
But hasn't AMD junked at least two designs since K8?
They never junk, they reiterate.
They put their "failures" into a drawer for later recycling. So the original K8 idea, which couldn't be put on silicon effectively at that time, has come out now as the brand new Bulldozer idea.
And yes, they ditched the "original" K9 & K10 and redesigned the existing K8 to meet 2006 standards with full-width SSE and larger caches, which became what is known today as K10/K10.5 (Stars), according to some media.
Back in the days, AMD had also delayed the Hammer architecture for quite a while.
They could have delayed it even a year more if they had wished. When K8 arrived it was really far ahead of its time when it came to price and availability, and customers were satisfied with the old K7/P4 until Intel's Core 2 took its prime role 6-9 months after its release.
The point ... well, if they had come out with Hammer even 2 years earlier (impossible because of their 180nm process at that time: sky-high cost to the end customer), they wouldn't have penetrated the market much more than they did with the ultra-successful K7 product line. For Hammer they had many more ideas but not enough silicon to waste, and thankfully they made the right decision just to clamp the old K7 together with 64-bit registers and an IMC, which was more than good enough against the poor, underperforming P4 (D/E) but was showing its age by Core 2's arrival.