Are you talking about the maximum number of threads per thread group? This is just a maximum; you could easily use "odd" sizes (of course, performance might suffer depending on the chip). I can't see why that should be a problem, since the driver can split this as it sees fit. At worst, the driver could use wavefront size 64 and just not use the rest of the SIMD (of course, that's wasting resources, but considering we heard claims of the 4D shaders being 98.5% as fast as the 5D shaders, it would still be nearly as fast per SIMD as on Cypress).
A hardware thread size that is not a multiple of a power of 2 doesn't play ball with a power-of-2 LDS bank count.
EDIT: Well, it's not strictly true, e.g. 80 is a multiple of 16. But you also have a problem with the D3D11 requirement of 1024 work items per work group.
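To put numbers on the two points above, here is a rough Python sketch. It only uses the figures quoted in the posts (wavefront sizes 64 and 80, the multiple of 16, and the 1024 work items per work group); the function name and the idea of counting "idle lanes" are illustrative assumptions, not anything stated about the actual hardware.

```python
import math

WORK_GROUP_SIZE = 1024  # D3D11 work-group size quoted in the post
ALIGNMENT = 16          # the "multiple of 16" figure quoted in the post


def wavefront_fit(wavefront_size, work_group_size=WORK_GROUP_SIZE):
    """Return (is_multiple_of_16, wavefronts_needed, idle_lanes).

    Hypothetical helper: idle_lanes counts the slots left unused in the
    last wavefront when the work group does not divide evenly.
    """
    is_multiple = wavefront_size % ALIGNMENT == 0
    wavefronts = math.ceil(work_group_size / wavefront_size)
    idle_lanes = wavefronts * wavefront_size - work_group_size
    return is_multiple, wavefronts, idle_lanes


for size in (64, 80):
    print(size, wavefront_fit(size))
```

With these inputs, a size of 64 divides 1024 evenly, while a size of 80 passes the multiple-of-16 check but needs 13 wavefronts and leaves 16 slots unused in the last one.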
It's possible to guesstimate the die size of both configurations from the Cypress die size and the rumors.
Cypress has a die of 324 mm^2. About 1/3 of that is taken by the SIMDs (a figure that comes from RV770), so in Cypress 108 mm^2 are taken by 1600 sp. If the 25% increase in space efficiency is true, then in N.I. 1280 sp (320x4) can fit in 81 mm^2 and perform like the 1600 sp of Cypress. Doubling that gives 162 mm^2 for the SIMDs. A 20% increase in complexity of the uncore/fixed-function units (TMUs/ROPs) gives a die size of about 420-430 mm^2 for 2560sp/96TMU/48rops and 380-390 mm^2 for 1920sp/96TMU/48rops.
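The estimate above can be reproduced with a few lines of Python. This is only a sketch of the post's own arithmetic: the constant and function names are made up for illustration, and the "25%" is read as 25% less area (108 x 0.75 = 81), which is what the post's numbers imply. Reading it as 1.25x density instead would give 86.4 mm^2 per 1280 sp and slightly larger totals.

```python
CYPRESS_DIE_MM2 = 324.0   # Cypress die size from the post
AREA_SAVING = 0.25        # assumption: "25%" means 25% less SIMD area
UNCORE_GROWTH = 0.20      # post's 20% growth for uncore/fixed function
EQUIV_SP = 1280           # 320x4, said to match Cypress's 1600 sp


def estimate_die_mm2(sp_count):
    """Rough die-size guess for a given sp count, following the post."""
    simd_cypress = CYPRESS_DIE_MM2 / 3                # 108 mm^2 of SIMDs
    uncore_cypress = CYPRESS_DIE_MM2 - simd_cypress   # remaining 216 mm^2
    block_1280 = simd_cypress * (1 - AREA_SAVING)     # 81 mm^2 per 1280 sp
    simd = block_1280 * sp_count / EQUIV_SP           # scale linearly with sp
    uncore = uncore_cypress * (1 + UNCORE_GROWTH)     # 259.2 mm^2
    return simd + uncore


for sp in (2560, 1920):
    print(sp, round(estimate_die_mm2(sp), 1))
```

Under these assumptions the sketch gives roughly 421 mm^2 for 2560 sp and roughly 381 mm^2 for 1920 sp, which sits at the low end of the 420-430 and 380-390 ranges quoted above.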
By the way, Cayman can't have 32 ROPs, because 32/3 gives about 10.7 ROPs per RPE.
And the same goes for the SIMD number: 640 can't be split evenly across 3 RPEs.
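The divisibility argument is easy to check. The snippet below is a minimal sketch that assumes the three RPEs the post takes for granted; the helper name is hypothetical, and the 48 and 96 figures are the ROP and TMU counts from the die-size post, included for contrast.

```python
def split_across_rpes(units, rpe_count=3):
    """Return (units_per_rpe, leftover) for an even split attempt."""
    return divmod(units, rpe_count)


for label, units in (("ROPs", 32), ("the 640 figure", 640),
                     ("ROPs", 48), ("TMUs", 96)):
    per_rpe, leftover = split_across_rpes(units)
    verdict = "fits" if leftover == 0 else "does not fit"
    print(f"{units} ({label}): {per_rpe} per RPE, {leftover} left over, {verdict}")
```

Only the counts that leave no remainder (48 and 96 here) divide cleanly across three RPEs.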
If you conclude 420-430 mm^2 for 2560sp/96TMU/48rops, how much power will it draw on 40nm?
ATI's strategy is to make efficient, power-saving GPUs!
And in order to stay just under 200W TDP on 40nm at high clocks, as ATI has always done in the past, the GPU has to stay small. 2560sp will not work, since the GPU would be too big.
If ~330mm^2 was the sweet spot for maximum practical size when 40nm was young, then it's quite possible that 430mm^2 is about the same today in terms of production difficulty and expense as ~330mm^2 was when Cypress was released.
There's nothing which says that after 12 months on 40nm a larger GPU isn't still within the sweet spot of power, performance and cost. The same can be said for the 6-core Phenom processors. AMD released that revision on the same process as the original Phenom II, and yet the die size is much larger. They still remain within the same TDP as the original in spite of the obvious increase in both clocks and core counts.
Yeah, but AMD/GloFo invest heavily in improving the current node, while TSMC tends to use half nodes. That's why GloFo announced 28HPP -- it didn't need to do that when AMD was the sole customer.
The yield may improve on TSMC-40G and you might be rid of double vias, etc. But over 400sqmm? I'm not too optimistic about that.
Yes, it might work, it might be that AMD is still using double vias which contributes to the larger die size. But it would be one more step away from the "sweet spot" for sure.
Facing the upcoming Nvidia's Fermi-series GPUs and its entry-level GeForce GT 430, aiming for an October launch, AMD has decided to announce its latest generation Radeon HD 6000-series GPU globally on October 19. As the company has successfully spun off its foundry division and received the US$1.25 billion settlement penalty from Intel, AMD decided to increase its promotion budget and will host its AMD Technical Forum and Exhibition 2010 show along with its Radeon HD 6000-series debut conference in Taiwan. For the event, AMD will send several top executives to visit Taiwan and meet its local partners, as well as explain the company's fourth-quarter product roadmap to the Asia Pacific media.
AMD has recently postponed the launch schedule of its next-generation Radeon HD 6000 series GPUs (Southern Islands) from the original October 12 to November, according to sources from graphics card makers.
http://www.digitimes.com/news/a20100927PD228.html
I think the clue would be Ontario in this instance. They are getting excellent density on that process with Ontario so it is quite possible TSMC has done significant work to improve the process.
So perhaps the chips are a little larger, especially Barts. However if Ontario is an indication of the density they can get on a mature 40nm, looking at this from another angle I think Cayman might even be as 'little' as 350mm^2.
Well... if GT21x are any good, GF104 is good too, as what's wrong with it is just as wrong with its older cousins: performance.
That aside, a ~70sqmm part is by no means any indication of what to expect on a ~400sqmm part. Remember there was a GT218 which worked fine, and even GT216/215 are good. Now take a look at GF100/104: can you say the same of those?
AMD has decided to officially launch its Radeon HD 6000-series on October 19, a week later than the original schedule of October 12, with the Radeon HD 6870 possibly the first for launch. http://www.digitimes.com/news/a20101001PD208.html
Are you sure? 6700 is Barts, 6800 is Cayman.
Is anyone else here NOT excited about this launch?
Please all remember that this launch represents the death of the ATi brand, and show a little respect and reverence.
I'm a bit meh on it myself. Dumping the rising ATI brand on graphics cards in favor of a currently rather down/mediocre AMD brand on graphics all to try to boost the perception of the AMD brand seems horribly misguided and shortsighted, IMO.
So rather than mourning the loss of ATI, I'm more left shaking my head at inept PR.
Regards,
SB
Asking such a question in this thread is like going into the church Sunday morning and asking everyone if they aren't excited about the idea/concept of God.
Actually in my case it's more like the high priest asking the congregation that, but I get your analogy.
Still, it worries and burdens me...so I bring it up again.