Nvidia GT300 core: Speculation

So I am assuming that GT212 is basically a GT200 on steroids? It's kind of disappointing to see that there haven't been any architectural changes, especially with the core only supporting DX10.

That being said, the specs don't disappoint and sound reasonable enough. How will this card position itself in the enthusiast market, though? I'm guessing that the GTX 295 will be faster in performance (in games where SLI does scale, obviously).
 
It seems AMD won't be bothering with anything faster than RV770 until Q4

Why not? Given how close they got in performance with a big die-size advantage, if they were to simply maintain a 250mm^2 die size at 40nm they should be able to thoroughly thrash GT212.
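For a rough sense of what that die-size argument implies, here is a back-of-envelope sketch in Python. The ~256 mm^2 RV770 figure and the ideal linear-scaling assumption are mine, and real shrinks never scale this cleanly, so treat the numbers as upper bounds:

# Idealised area scaling from 55 nm to 40 nm (optimistic, not a real process figure)
rv770_area_55nm = 256.0                    # mm^2, approximate RV770 die size
shrink = (40.0 / 55.0) ** 2                # ~0.53 ideal area factor

print(rv770_area_55nm * shrink)            # straight shrink: ~135 mm^2
print(250.0 / (rv770_area_55nm * shrink))  # ~1.8x RV770's logic budget in a 250 mm^2 die

In other words, holding roughly RV770's current die area at 40nm would, in the ideal case, buy almost twice the logic.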
 
So I am assuming that GT212 is basically a GT200 on steroids? It's kind of disappointing to see that there haven't been any architectural changes, especially with the core only supporting DX10.

After 3 major DX10 "generations" and going to a fourth one like that, I think that not implementing DX10.1 is a purely political decision, hardly a technical one.
They simply don't want it.

And it happened before with the leap from PS 1.3 to PS 2.0 (bypassing ATI's PS 1.4).
 
"Because Nvidia's memory interface refers to the ROPs, these would be cut in half from 32 to 16, too. But we assume that Nvidia either installs eight instead of four grid units per ROP partition, or alternatively uses 32 instead of 64 Bit wide controllers, which would both steady the pixel performance of the chip."

Does the first solution allow "full" 8x MSAA performance, as on the R700 series?
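To make the quoted options concrete, here is a small illustrative Python sketch of how the ROP count falls out of the memory-interface layout. Only the GT200 baseline (eight 64-bit partitions with four ROPs each) is established; the GT212 configurations are the speculation from the quote:

def rop_count(bus_width_bits, controller_width_bits, rops_per_partition):
    partitions = bus_width_bits // controller_width_bits
    return partitions * rops_per_partition

print(rop_count(512, 64, 4))  # GT200: 8 partitions x 4 ROPs = 32
print(rop_count(256, 64, 4))  # naive halving of the bus: 16
print(rop_count(256, 64, 8))  # option 1: 8 ROPs per partition -> 32
print(rop_count(256, 32, 4))  # option 2: 32-bit controllers -> 32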
 
Does the first solution allow "full" 8x MSAA performance, as on the R700 series?

AMD's 8xAA performance is due to how much the ROPs can do per cycle, not the number of ROPs. Nvidia seems to have clawed its way back to parity via drivers with 8xAA enabled in a lot of titles, though, so I wouldn't be surprised if they don't change anything.
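Just to illustrate the per-cycle point with made-up placeholder figures (not measured specs): what matters for 8xAA is how many samples each ROP can resolve per clock, so a part with fewer ROPs but higher per-clock throughput can still come out ahead.

def aa_fillrate_gsamples(rops, samples_per_rop_per_clock, core_clock_mhz):
    # samples resolved per second, in Gsamples/s
    return rops * samples_per_rop_per_clock * core_clock_mhz * 1e6 / 1e9

print(aa_fillrate_gsamples(16, 4, 750))  # fewer ROPs, high per-clock rate: 48.0
print(aa_fillrate_gsamples(32, 2, 600))  # more ROPs, lower per-clock rate: 38.4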
 
After 3 major DX10 "generations" and going to a fourth one like that, I think that not implementing DX10.1 is a purely political decision, hardly a technical one.
They simply don't want it.

And it happened before with the leap from PS 1.3 to PS 2.0 (bypassing ATI's PS 1.4).

Architectures last much longer now, but I still find it similar to the PS 1.4 situation, where ATI had a version tailored to its hardware but Nvidia wasn't going to do a new architecture instead of a refresh.
ATI has a DX10.1 R6xx/R7xx (the HD 2000 series being almost 10.1, I believe), NV has a DX10.0 G8x, and that's it.
 
Architectures last much longer now, but I still find it similar to the PS 1.4 situation, where ATI had a version tailored to its hardware but Nvidia wasn't going to do a new architecture instead of a refresh.
ATI has a DX10.1 R6xx/R7xx (the HD 2000 series being almost 10.1, I believe), NV has a DX10.0 G8x, and that's it.

Then again, do you consider the ATI R4xx as an "almost-SM 3.0" architecture? Yeah, me neither. ;)
 
After 3 major DX10 "generations" and going to a fourth one like that, I think that not implementing DX10.1 is a purely political decision, hardly a technical one.
They simply don't want it.

And it happened before with the leap from PS 1.3 to PS 2.0 (bypassing ATI's PS 1.4).

Good thinking! You're probably right!

I was thinking something similar before you even posted...
 
True, but he seems pretty confident about some specifics both here and over at 3dc. As far as samples being in the hands of devs, it seems kind of early for that. Are samples usually out this far ahead of a commercial launch (assuming said launch is in late Q3/early Q4 '09)?

I'd guess that it's a wee bit too early for the latter.
 
BTW, should we expect significantly higher clocks on 40nm? Or is it all about density? This Altera document speaks pretty ominously about significant increases in leakage with the smaller process.

It seems to me that the combination of more tightly packed transistors + higher leakage current doesn't lend itself well to higher clocks assuming die sizes stay around what they are today.

Nice link. Here I saw that the first Stratix IV had shipped:
"The learning showed up as adjustments to the layout of reused blocks. Many issues descended on the layout at 40 nm, including a plethora of layout dependencies that impact transistor performance by altering the strain engineering on individual transistors, a whole new host of lithography-related restrictions, and even the fact that at 40 nm, transistor performance depends on the orientation of the transistor relative to the silicon lattice structure. "In one case, we found out that a layout technique that was preferred at 65 nm was actually bad at 40 nm, and we had to adjust our layout," Cliff said. And then of course there is the whole matter of process variations, which just keep getting worse and more local."

I presume that when they say "shipped" they mean sample quantities to customers, and that they are still sticking to their May date for production quantities.

Re: the GT212 release date quoted by VR-Zone. I am pretty sure this is wrong; they have conflated old and new rumors. Previously, 40nm supply was limited and the only Nvidia design with a low enough volume to fit was GT212. That has changed now: the other chips are preferred and GT212 has been punted into Q3 at least. Coming up, there is the 55nm GT200 in the middle of the month and, according to Chiphell, the 4970 in the last week of January. Then there will be an almost six-month wait for the 40nm high-end cards. Sorry :cry:
 
Why not? Given how close they got in performance with a big die-size advantage, if they were to simply maintain a 250mm^2 die size at 40nm they should be able to thoroughly thrash GT212.
My guess is that AMD is going to get a 128-bit version of RV770 out there; with faster GDDR5 it should be fine. The aim would be margins more than market positioning.

It'd be nice if there was a monster RV790 coming out, at roughly the same die size on 40nm. It just seems way too quiet on that front though.

As to thrashing GT212, well I think a monster RV790 would fail unless there's a big bump in RBEs or clocks - both of which seem unlikely. RBEs I don't expect to increase in count until RV870 (architectural change I reckon) and clocks don't give the impression of being good on 40nm. More ALUs aren't what's needed. More TUs, yes, but not enough to thrash GT212 in places where GT200 is already ahead of RV770.

For what it's worth, I think NVidia will retain GT200's ROP count in GT212, giving each quad a 32-bit channel.

Jawed
 
After 3 major DX10 "generations" and going to a fourth one like that, I think that not implementing DX10.1 is a purely political decision, hardly a technical one.
They simply don't want it.

You are right. Nvidia told me that they do not want to integrate DX10.1 before the first DX11 cards.
They said that the only interesting feature in DX10.1 is multisample Z readback, and that their current GeForce chips already support this feature via a hack.
 
There are sources, though, that state that NV is going DX10.1 with the GT21x series. Not saying they are right this time around, but some of them have been correct in the past.
 
So I am assuming that GT212 is basically a GT200 on steroids? It's kind of disappointing to see that there haven't been any architectural changes, especially with the core only supporting DX10.
Why? Is there any need to have DX10.1 or anything above it right now? Why spend time and money on something that will be obsolete pretty soon anyway?

That being said, the specs don't disappoint and sound reasonable enough. How will this card position itself in the enthusiast market, though? I'm guessing that the GTX 295 will be faster in performance (in games where SLI does scale, obviously).
Well of course they'll make another AFR card on two GT212s for the top end.
GT212 is like G71 or G92; it's hardly pushing the limits of what's possible at 40nm.

Why not? Given how close they got in performance with a big die-size advantage, if they were to simply maintain a 250mm^2 die size at 40nm they should be able to thoroughly thrash GT212.
Because it's too early for RV870, and anything at the same die size at 40nm would probably end up with the same performance as RV870 but without DX11 support. It's pretty pointless for AMD to launch anything like that between now and Q4 -- they'd destroy their own RV870 sales.
GT212 is essentially a 'shrink' of GT200, but since AMD has no top-end chips anymore, they have nothing to shrink.
 
That being said, the specs don't disappoint and sound reasonable enough. How will this card position itself in the enthusiast market, though? I'm guessing that the GTX 295 will be faster in performance (in games where SLI does scale, obviously).

My guess would be that they will replace the GTX285 with GT212 altogether, as the 285 is intended to replace the 280.

I imagine GT212 will be pretty close to the GTX 295 but slightly slower overall; that card will hang around a bit longer until we see a dual GT212 replace it.

I do wonder what will happen to the 260, though. I imagine we'll see a cut-down GT212 version to replace that as well. Naming it could be a problem, though, if it's faster than the standard 280 (which it probably would be).
 
Because it's too early for RV870, and anything at the same die size at 40nm would probably end up with the same performance as RV870 but without DX11 support. It's pretty pointless for AMD to launch anything like that between now and Q4 -- they'd destroy their own RV870 sales.
GT212 is essentially a 'shrink' of GT200, but since AMD has no top-end chips anymore, they have nothing to shrink.

Fair point, but they could do an intermediate update to 15 SIMDs or something and come in under 200mm^2. Similar to what Nvidia is doing with GT212.
 
Fair point, but they could do an intermediate update to 15 SIMDs or something and come in under 200mm^2. Similar to what Nvidia is doing with GT212.
They'd need to go to a 128- or 192-bit memory bus then, and that would mean less bandwidth for more SIMDs.
NV is going from 512-bit GDDR3 to 256-bit GDDR5 and effectively retaining GT200's bandwidth.
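A quick check of the numbers makes the point; the data rates below are assumed round figures for illustration, not confirmed specs:

def bandwidth_gb_s(bus_width_bits, data_rate_gt_s):
    return (bus_width_bits / 8) * data_rate_gt_s  # GB/s

print(bandwidth_gb_s(512, 2.2))  # 512-bit GDDR3 at ~2.2 GT/s: ~141 GB/s
print(bandwidth_gb_s(256, 4.4))  # 256-bit GDDR5 at ~4.4 GT/s: ~141 GB/s
print(bandwidth_gb_s(128, 4.4))  # a 128-bit GDDR5 part: ~70 GB/s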
 
They'd need to go to a 128- or 192-bit memory bus then, and that would mean less bandwidth for more SIMDs.
For the sake of argument, AMD could drop the CrossFireX Sideport and get a 256-bit bus on a ~200mm2 die (GDDR5 takes more pads, so slightly bigger than for a GDDR3 die).

Jawed
 