DegustatoR
Legend
Quote: Based on statements round these parts lately I'd guess that's GT206.
Then this would lead to two things:
1. GT206 is not GT200b.
2. GT206 doesn't support GDDR5? Why in the hell would they need 384-bit bus if it does?
Quote: 1. GT206 is not GT200b.
The point is that it's the same chip -)

DegustatoR: And so does G92b AFAICT... what's your point? Of course, that doesn't answer the GT206 mystery...
BTW, tentative GT21x line-up possibilities:
1T|40A|1R -> 0.2TFlops+ -> GT218/???
3T|120A|3R -> 0.6TFlops+ -> GT216/Late March
6T|240A|6R -> 1.2TFlops+ -> GT214/Early May
12T|480A|12R -> 2.8TFlops+ -> GT212/Late June
In the first possibility, the ALU ratio might seem high until you add this little catch:
GT212 ALUs: 8 MADDs, 8 MULs/2SFU/2DP
GT214+ ALUs: 8 MADDs, 4 MULs/1SFU/1DP

I think you're running a bit ahead of time =)

I really, REALLY don't think that going forward with separate DP units is the right thing to do. They may do this for the top-end part to be used in Teslas, but for anything below that they'll probably go with version 1.2 CUDA compute capability without DP support.
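For context, the TFLOPS figures in a line-up like this follow from the usual G8x/GT200 issue model of one MADD (2 flops) plus one co-issued MUL (1 flop) per ALU per shader clock. A minimal sketch, where the clock speeds are illustrative assumptions rather than figures from the thread:

```python
# Rough peak-FLOPS sanity check for a hypothetical GT21x line-up.
# Assumes the G8x/GT200 dual-issue model: one MADD (2 flops) plus one
# co-issued MUL (1 flop) per ALU per shader clock.

def peak_gflops(alus, shader_clock_ghz, flops_per_alu=3):
    """Peak programmable shader throughput in GFLOPS."""
    return alus * flops_per_alu * shader_clock_ghz

# GT200 as a known reference point: 240 ALUs at a ~1.3 GHz shader clock.
print(peak_gflops(240, 1.296))  # ~933 GFLOPS

# A hypothetical 480-ALU part would need roughly this shader clock (GHz)
# to reach the 2.8 TFLOPS figure above:
print(2800 / (480 * 3))  # ~1.94
```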
You have all heard about the GT206, also known as GT200b in the press. As we predicted back in May, this would be a part of the Fall/Winter refresh and it is due any day now. Specifications are of course sparse but expect a tweaked GT200 core running cooler and faster, it's all good.
After GT206, there is a GT212 core on roadmaps. There are rumors that this will be the first NVIDIA chip with GDDR5, but I wouldn't bet on it. It should arrive in Q2 2009.

Comment: According to Elsa slides, GT212 will already be produced in 40 nm.
Sources have informed us of another chip that is in the works. NVIDIA is also working on GT216, which is said to be the first chip to reach the market using TSMC's 40nm process. It will go up against AMD's RV870 chip and should hit the market at about the same time: late Q2 2009, or early Q3. There are rumors that this chip will be DirectX 11 compatible, which would put it even with RV870 in terms of DX support.

However, Arun kind of disagrees in his post:
Nice, that cens article is the first one I've seen mention this correctly, although they got most of their other facts wrong: GT216 will be the first NVIDIA 40nm GPU. However, it won't replace GT200, and it should be out in late Q1, not late Q2 (the latter is actually for the chip replacing GT200, so they likely just got some stuff confused). To get back on topic, presumably the roadmap calls for AMD's first 40nm chip to come out before that GT200 replacement too but after GT216, however who the hell knows at this point.
Quote: The point is that it's the same chip -)

Yes, it's still the same thing fundamentally; no reason for it to have major changes. Whether GT200b and GT206 are the same thing is another debate completely. I still think it's more likely that they are not, and that GT206 is an ultra-low-end chip aimed at replacing G98/G86 in the Montevina Refresh timeframe, but we'll see.
It could be "GT200b", sure, but it's still 10 24/8 TPCs and a 512-bit bus, whatever you want to call it. Otherwise it would have another device id in the drivers.
Quote: I think you're running a bit ahead of time =)

Is that really a problem?
Quote: From my point of view, if NV wants to be competitive with their GT21x parts (presuming that's GT20x architecture on 40nm), they'll need to do some rethinking and rebalancing of the G8x architecture. Otherwise they'll end up being slower with the same complexity, or even with more transistors.

Okay, let me put it this way: NVIDIA's perf/[transistor*mhz] is quite fine. What isn't fine is their transistor density and their clock speeds; the latter is in part because of the monstrous size of the chip, which causes variability issues, but the former is very much both a failure *and* a design decision.
Quote: I don't think that we'll see the return of the 384-bit bus in GT21x chips -- 256-bit GDDR5 should be enough for them.

It seems GT212 might be the only GDDR5 chip, unfortunately.
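To put rough numbers on why 256-bit GDDR5 could stand in for a wider bus: bandwidth is bus width (in bytes) times per-pin data rate. The data rates below are typical-for-the-period assumptions, not values from the thread:

```python
# Peak memory bandwidth = (bus width / 8 bits per byte) * per-pin data rate.
# The data rates are typical 2008/2009 assumptions, not thread quotes.

def bandwidth_gbs(bus_bits, data_rate_gbps):
    """Peak memory bandwidth in GB/s."""
    return bus_bits / 8 * data_rate_gbps

print(bandwidth_gbs(512, 2.2))  # GT200-style 512-bit GDDR3: ~140.8 GB/s
print(bandwidth_gbs(256, 4.0))  # 256-bit GDDR5 at 4 Gbps:    128.0 GB/s
print(bandwidth_gbs(384, 2.0))  # 384-bit GDDR3 at 2 Gbps:     96.0 GB/s
```

With GDDR5's doubled per-pin rate, a 256-bit bus lands close to GT200's 512-bit GDDR3 while needing half the pads.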
Quote: But as I've said, they'd probably want to do something with their SMs in GT21x parts, otherwise they'll end up slower per transistor than the RV8x0 line. Plus they need to fix AA and maybe add 10.1 support?

Yes, AA & 10.1 would be a good thing on GT21x (which doesn't support DX11 AFAIK, based on my parsing of public statements from Michael Hara of Investor Relations). Regarding SM density, I think the RTL itself is fine; it's more of a density issue. I also think my proposed half-SFU solution would be a good way to improve perf/mm² slightly.
Quote: I really, REALLY don't think that going forward with separate DP units is the right thing to do. They may do this for the top-end part to be used in Teslas, but for anything below that they'll probably go with version 1.2 CUDA compute capability without DP support.

In the grand scheme of things, a single FP64 MADD unit is pretty damn cheap. And changing your 24x24 MADD units into 27x27 or 32x32 ones isn't free either, so for basic DP support along with proper denormal support etc., this isn't such an awful solution. I agree, however, that there is no good reason to keep it on low-end parts such as GT218, and I wouldn't be surprised if their approach changed in the DX11 generation anyway (which is where most of the R&D dollars are right now, obviously).
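The trade-off described above can be roughed out numerically. A first-order assumption is that an array multiplier's area grows with the square of its operand width, so widening every SP MADD unit is paid eight times per SM, while a dedicated FP64 unit is paid once. A back-of-the-envelope sketch (the quadratic-area model and unit counts are simplifying assumptions):

```python
# Back-of-the-envelope: widening every 24x24 multiplier vs adding one
# dedicated FP64 MADD per SM. Multiplier area is modelled as ~width^2,
# a common first-order approximation; all counts are simplified.

def mult_area(width_bits):
    return width_bits ** 2  # arbitrary area units

SP_UNITS_PER_SM = 8  # GT200: 8 single-precision MADD units per SM
base = SP_UNITS_PER_SM * mult_area(24)

widened_27 = SP_UNITS_PER_SM * mult_area(27)  # 27x27 everywhere
widened_32 = SP_UNITS_PER_SM * mult_area(32)  # 32x32 everywhere
dedicated = base + mult_area(53)              # one 53-bit (FP64 mantissa) unit

print(widened_27 / base)  # ~1.27x total multiplier area
print(widened_32 / base)  # ~1.78x
print(dedicated / base)   # ~1.61x, paid once, SP datapath untouched
```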
Quote: GT206 is an ultra-low-end chip aimed at replacing G98/G86 in the Montevina Refresh timeframe, but we'll see.

Why would they want to replace ultra-low-end now?
Quote: Okay, let me put it this way: NVIDIA's perf/[transistor*mhz] is quite fine. What isn't fine is their transistor density and their clock speeds; the latter is in part because of the monstrous size of the chip, which causes variability issues, but the former is very much both a failure *and* a design decision.

Even with RV770's transistor density, GT200 would still have a 530mm² die size @65nm and ~450mm² @55nm.
Quote: It seems GT212 might be the only GDDR5 chip, unfortunately.

Well, if GT206 isn't a mainstream part then the next candidate is GT212, yeah.
Quote: Yes, AA & 10.1 would be a good thing on GT21x (which doesn't support DX11 AFAIK, based on my parsing of public statements from Michael Hara of Investor Relations).

The thing is that if GT212 (GT216 is probably a low-end chip, being the first on 40nm) will be out in 2Q09, then there's no point for them to support anything less than DX11 in it. I even think that they should probably scrap any high-end part they have planned in the GT21x line and use it as a guinea pig for the 40nm process while bringing the GT30x DX11 stuff closer.
Quote: AMD's approach is to have a much denser but also more regular layout,

What does "more regular" mean? What's the advantage? What's more regular?
Quote: Even with RV770's transistor density, GT200 would still have a 530mm² die size @65nm and ~450mm² @55nm.

Those sizes sound too large, comparing 954M versus 1.4B transistors. So how did you work that out?
Quote: Those sizes sound too large, comparing 954M versus 1.4B transistors. So how did you work that out?

Yeah, my math is probably wrong there.
Quote: Why would they want to replace ultra-low-end now?

Because G98 is a POS and they need to compete against RV710?
Quote: Even with RV770's transistor density, GT200 would still have a 530mm² die size @65nm and ~450mm² @55nm.

What? If we exclude I/O & analogue, I think it's pretty clear that ~260 * (1400+ / 965) = 380mm²+ on 55nm. This could be combined with 384-bit GDDR5 and higher clocks, which would result in similar perf/mm² (or, more accurately, similar perf/mm² to a hypothetical ATI part with the same performance target!)
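The arithmetic above is simple proportional scaling: take RV770's 55nm die area (I/O and analogue excluded) and scale it by the transistor-count ratio. A sketch using the figures quoted in the post:

```python
# Proportional die-size scaling:
#   area_B ~= area_A * (transistors_B / transistors_A)
# Figures are the ones quoted in the post (I/O & analogue excluded).

rv770_area_mm2 = 260.0        # RV770 on 55nm, minus I/O & analogue
rv770_transistors_m = 965.0   # millions
gt200_transistors_m = 1400.0  # "1400+" in the post

scaled = rv770_area_mm2 * (gt200_transistors_m / rv770_transistors_m)
print(round(scaled))  # 377 -> hence "380mm2+" once the "+" transistors count
```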
Quote: Considering that RV770 is dangerously close to GT200 in performance, I'd say that they definitely have an issue with their perf/transistor ratio right now.

Notice that I said [transistor*mhz]... G92b can reach clocks very near the HD 4870's, so I feel it's fair to say that's not a bad metric to consider.
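The metric being argued about can be made concrete by dividing relative performance by (transistor count × clock). All the inputs below are illustrative assumptions, just to show how clock normalization changes the comparison relative to raw perf/transistor:

```python
# Perf per [transistor * MHz]: normalize performance by both transistor
# budget and clock speed. All inputs are illustrative assumptions,
# NOT figures from the thread.

def perf_per_transistor_mhz(rel_perf, transistors_m, core_mhz):
    return rel_perf / (transistors_m * core_mhz)

# Assumed: GT200 = 1.00 relative perf, 1400M transistors, 602 MHz core;
# RV770 = 0.90 relative perf, 956M transistors, 750 MHz core.
gt200 = perf_per_transistor_mhz(1.00, 1400, 602)
rv770 = perf_per_transistor_mhz(0.90, 956, 750)

# Compare the clock-normalized ratio with raw perf/transistor:
print(gt200 / rv770)                  # clock-normalized
print((1.00 / 1400) / (0.90 / 956))   # raw perf/transistor
```

Under these assumed inputs the gap looks much smaller once clock is factored in than under raw perf/transistor, which is the point of the metric.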
Quote: But maybe it's an issue of GT200 more than an issue of the G8x architecture.

My point is that it's an issue of synthesis, not of the actual RTL-level architecture (although the ALU-TEX ratio and the choice to stick to GDDR3 obviously don't help perf/mm² much either).
Quote: The thing is that if GT212 (GT216 is probably a low-end chip, being the first on 40nm) will be out in 2Q09, then there's no point for them to support anything less than DX11 in it. I even think that they should probably scrap any high-end part they have planned in the GT21x line and use it as a guinea pig for the 40nm process while bringing the GT30x DX11 stuff closer.

Good idea: let's quadruple the risk for a company that badly needs to improve its position and can't afford any more screw-ups! Given how a certain semi-risky decision on G96/G98 turned out, I'm sure Jen-Hsun will LOVE that idea!
Quote: They're late on almost every front and it's time to do some roadmap rearranging imho.

It's easy to forget that moving boxes around on pieces of paper doesn't allow you to change reality. If your kind of strategy were pursued, NVIDIA could have canned G71/G72/G73, since G80 was originally scheduled to come out in a very similar timeframe. But new architectures are very prone to delays, and that kind of risk is absolutely senseless IMO. I think their current roadmap is pretty much as follows:
marllt2 said: Could you remind us what Mr Hara said about that please?

He indicated 40nm would be in H1, while the new arch would be in H2.
Jawed said: What does "more regular" mean? What's the advantage? What's more regular?

I wasn't thinking specifically of this company or that approach, but this is not a bad start to see the kind of thing I mean: http://www.tela-inc.com/ - what I find particularly cool with Tela's tech, BTW, is that if you can reduce leakage, you can use transistors higher on the performance-leakage curve, which also means you can improve your perf/mm² more than the raw density impact of the approach!