D9P/G94: 9600 GT on the 19th of February

The old 7600 GT "2.0" (G73-B1, 80nm) had 1.8GHz GDDR3 chips, so 2.0GHz for a mainstream part a year later is not that surprising, especially if they're sticking with those cheap Qimonda ones already found in most 8800 GTs and 8800 GTS 512MB cards.
It still doesn't make much sense that the lower-performing part has more memory bandwidth than the faster one. 2GHz memory with a 128-bit interface? No problem (the 8600 GTS already has that). With a 192-bit interface? Sure, makes sense too. But I fail to see why they'd want to use 2GHz memory with a 256-bit interface on a part that performs below the 8800 GT. (Not to forget, those 2GHz chips, which the 8800 GT also has, add quite a bit of power draw, since AFAIK you can only get them in factory overclocked/overvolted versions. The fastest GDDR3 chips you can get at the standard 1.8V are still 1.8GHz. That might not matter much for cards in the 8800 GT performance class, but it could be an issue below that - probably one of the reasons the 8600 GTS has quite a high power draw compared to an 8600 GT.)
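Just to put rough numbers on this, here's a quick back-of-the-envelope sketch (a minimal Python snippet; the clocks and bus widths are only the ones being thrown around in this thread, not confirmed specs):

```python
# Peak GDDR3 bandwidth = effective clock (GHz) * bus width (bits) / 8 -> GB/s
def bandwidth_gbps(effective_clock_ghz, bus_width_bits):
    return effective_clock_ghz * bus_width_bits / 8

configs = [
    ("8600 GTS (2.0GHz effective, 128-bit)", 2.0, 128),
    ("hypothetical 2.0GHz, 192-bit part",    2.0, 192),
    ("8800 GT (1.8GHz effective, 256-bit)",  1.8, 256),
    ("rumoured 9600 GT (2.0GHz, 256-bit)",   2.0, 256),
]

for name, clock, width in configs:
    print(f"{name}: {bandwidth_gbps(clock, width):.1f} GB/s")
```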
 
Whereas on those cards the memory runs at 1.8 and 1.94GHz effective, respectively. So NVidia is going to spend extra to get the right memory for this cheaper part?

Jawed

If they have a bunch of them sitting around due to a larger purchase order with Qimonda (big volume discounts, perhaps?), it would make sense.
How else do you explain using them in both G92 products, when they didn't even do the same with all G80s (the 320/640MB GTS already had 1.6GHz GDDR3 memory running at rated speeds)?
Surely, if Nvidia was aiming for lower production costs on the 8800 GT (512MB version), they could have further enhanced their bottom line by using 1.8GHz (1.1ns) GDDR3, which is certainly cheaper than the 1.0ns GDDR3 used today.

Besides, the G73-B1, as I said, came out at about the same time as the 8800 GTX, which in fact had the very same GDDR3 chips (albeit more of them - triple - due to the wide 384-bit bus).


mczak, there were 8600 GTS models without the auxiliary 6-pin PCIe power plug right from the get-go, like this one from Gigabyte.
I doubt the overall power consumption difference between GDDR3 at 1.8GHz and GDDR3 at 2.0GHz would be that noticeable in real-world usage scenarios.
 
Are we SURE that 1.8GHz and 2.0GHz aren't the same bin anyway, at least for some SKUs? I know you can't get that directly from the Qimonda/Samsung/Hynix websites, but for all we know, perhaps some chips are guaranteed to work at *either* 1.0ns @ 2.0V or 1.1ns @ 1.8V. Look at this PDF for example, where the voltage is 1.9V ±0.1V: http://www.samsung.com/global/syste...AM/512Mbit/K4J52324QE/ds_k4j52324qe_rev12.pdf

Anyway, as Jawed and mczak pointed out, the problem remains that this is more expensive (and/or at least more power-hungry, even if it's the same SKU) memory than the 8800GT's, for what is presumably a cheaper and slower part. Once again, I bet there is some intentional NVIDIA FUD in here, heh. There are a couple of explanations I can think of:
- Faster than the 8800GT (G92 anyone? but still unlikely)
- Needs that memory to be performance-competitive (unlikely if it indeed has that many fewer SPs and that core clock).
- Saving some die space by using less exotic memory bandwidth-saving techniques than G92 (unlikely).
- It's actually 0.91ns GDDR4 and cheaper than the alternatives (unlikely).

Does anyone have any other ideas?
 
Arun said:
- It's actually 0.91ns GDDR4 and cheaper than the alternatives (unlikely).
Perhaps it is a supply issue and GDDR4 was the only thing they could get in satisfactory volume? Dunno, it does seem odd to give a card less powerful than an 8800GT more bandwidth... I suppose it will have 8 ROPs?
 
This card is born to kill the HD3850, and then NV will have almost every segment in their own hands (the $50-100 segment can still be ATi's).
(btw, the core clock speed looks very low)

What can ATi's answer to this card be? Because apart from the 2x RV670 X2 3xxx card, there are no rumours about anything.

It's born to kill the HD3850? This card was intended to go up against RV670 in Q1 2008.... There was supposed to be only a one-month gap between releases (RV670 in January, D9P on the 14th of Feb). But then something unexpected happened... The A11 silicon of RV670 came back faultless, so it got released much sooner than anticipated... and the more expensive G92 had to be pushed down to compete against RV670.

So when the GeForce 9600GT comes out, AMD has a 3-4 month lead in the same segment.

The cheapest HD3850 is already available for €140 over here so NV will need to match the price to remain competitive.

It might also be nice to mention the performance of D9P in 3DM06. The projected score is around 10K (slightly higher), which is about the same as the HD3850 that is already available. The 9600GT will use an 8" PCB. The die size is between 190mm2 and 200mm2, making it the "natural competitor" to RV670.

Oh and D9E will be launched one month later in March. That's the 9800-Series, the follow-up to the 8800GTX/Ultra. This one will probably beat the HD3870 X2 (R680) which will be launched on the 28th of January.
 
expreview: Hello, welcome to the forum! :) Were you explicitly told 64 SPs, or did you (or your source) infer that from another number? For example, a 32 TA/TF design with a higher ALU:TEX ratio (-> 96 SPs) seems very plausible and likely more balanced.

Jawed: I'm not sure how we can know what the density is unless we are certain NVIDIA reused the same synthesis they did for G92/G98. Given the clock speed, I'd kinda expect that not to be the case. Also, we don't know for sure whether it's 65nm or 55nm, or which foundry it is produced at.

Oh, just to make sure an outdated rumour is not considered as fact in this thread: G92 is ~320mm2, not ~290mm2.
I don't expect it to be 55nm - that would imply NVidia is running on the same schedule that ATI was for this node, which doesn't have any recent historical precedent. I reckon June/July for NVidia's 55nm parts. I consider D9E late - it's looking like it'll be 4 months late - whether D9P was supposed to be concurrent with D9E or later, everything points to all D9x variants being 65nm if they release before summer 2008.

I haven't considered a higher ALU:TEX ratio, though - I was sticking with the ratio we have in G92/G86/G84. In other words, I don't think it's a 6 cluster variant of G92 with the same memory/ROP partitions. It just won't fit.

320mm2 to 200mm2 is 63%, 8 clusters to 6 clusters is 75% excluding the remainder of the die, so I reckon about 80% for the die as a whole (if the memory/ROPs are the same as G92).
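To make that arithmetic explicit, here's a rough sketch (illustrative only - the fraction of the die taken up by the clusters is an assumption on my part, not a known figure):

```python
# Die-area scaling guess: keep 6 of G92's 8 clusters, leave the shared logic
# (memory/ROP partitions etc.) untouched, and see what fraction of G92's
# ~320mm2 that implies, versus the rumoured ~200mm2.
g92_die_mm2 = 320.0
rumoured_die_mm2 = 200.0
cluster_ratio = 6 / 8  # 75% of the clusters kept

# The clusters' share of the die is unknown; try a few plausible values.
for cluster_share in (0.6, 0.7, 0.8):
    scaled = cluster_share * cluster_ratio + (1 - cluster_share)
    print(f"clusters = {cluster_share * 100:.0f}% of die -> "
          f"~{scaled * g92_die_mm2:.0f}mm2 ({scaled * 100:.1f}% of G92)")

print(f"rumoured die: ~{rumoured_die_mm2:.0f}mm2 "
      f"({rumoured_die_mm2 / g92_die_mm2 * 100:.1f}% of G92)")
```

Even with the most generous assumption you end up well above the rumoured ~200mm2, which is the "it just won't fit" point above.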

So, it's likely to be 2x G84 (both in clusters and partitions)*. Perhaps it's got some new fangled stuff in it. We're certainly expecting double-precision, so how much does that cost?

I certainly wouldn't rule out 96:24, for what it's worth (3x clusters, each 4x 8-way ALUs + 2x 4-way TMUs). Since G84 has a far too low ALU:TEX ratio this would prolly still be 2x faster per clock in games.

---

For what it's worth, a 500MHz core doesn't make sense either, unless we're talking about a 9600GS or lower - we should be expecting something like 700MHz, I reckon.

I think this will turn out to be D9M, not D9P. D9P should be 6 clusters and D9E 8 clusters, if there will be a single-chip D9E.

Jawed

* - nah, 2x the configuration doesn't look so good now, but 2x the performance still seems viable (2nd EDIT)
 
Oh and D9E will be launched one month later in March. That's the 9800-Series, the follow-up to the 8800GTX/Ultra. This one will probably beat the HD3870 X2 (R680) which will be launched on the 28th of January.

I sure hope it's a single-GPU solution from Nvidia, not a dual one.
I always prefer one powerful GPU over two midrange GPUs on one card.
 
Perhaps it is a supply issue and GDDR4 was the only thing they could get in satisfactory volume?
Hmm, that's an interesting theory, using GDDR4 due to supply/cost. There are fewer manufacturers producing GDDR4 than GDDR3 though, and I don't think it's exactly a volume part, so I have doubts about that theory too.
I suppose it will have 8 ROPs?
If the chip is still quite similar to all other G8x/G92 parts, and so has its quad-ROP units tied to 64-bit memory channels, then if it's indeed 256-bit it will have 16 ROPs, just like G92.
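As a small illustration of that relationship (assuming the G8x-style layout described above, i.e. one quad-ROP partition per 64-bit memory channel):

```python
# ROP count follows bus width if each 64-bit memory channel carries one
# partition of 4 ROPs, as in G8x/G92.
def rops_for_bus_width(bus_width_bits, rops_per_partition=4, channel_bits=64):
    return (bus_width_bits // channel_bits) * rops_per_partition

for width in (128, 192, 256, 384):
    print(f"{width}-bit bus -> {rops_for_bus_width(width)} ROPs")
# 128-bit -> 8 ROPs (G84), 256-bit -> 16 ROPs (G92), 384-bit -> 24 ROPs (G80)
```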
 
mczak said:
I don't think it's exactly a volume part, so I have doubts about that theory too.
Well if it is sub $200 I'd think it would be fairly high volume... maybe not.

mczak said:
if it's indeed 256-bit it will have 16 ROPs, just like G92.
Yeah, this is the part I don't get... if it is going to be obviously shader bound at high resolutions then why bother with 16 ROPs or worry about bandwidth. Maybe if more work is getting done per SP...
 
Well if it is sub $200 I'd think it would be fairly high volume... maybe not.

Yeah, this is the part I don't get... if it is going to be obviously shader bound at high resolutions then why bother with 16 ROPs or worry about bandwidth. Maybe if more work is getting done per SP...
CJ reckons it's targeted at 10,000 in 3DMk06 - where HDR stuff consumes lots of bandwidth. This is around 2x 8600GTS performance, if you play 3DMk06.

Jawed
 
It's born to kill the HD3850? This card was intended to go up against

Oh and D9E will be launched one month later in March. That's the 9800-Series, the follow-up to the 8800GTX/Ultra. This one will probably beat the HD3870 X2 (R680) which will be launched on the 28th of January.

Will the GF9800 be a single-core GPU (I hope....) or something like the GF7950GX2 or HD3870X2 cards? :)

There are some rumours about a D9E-20 (single core?) and a D9E-40 dual-GPU card (probably 2x D9E-20), which are supposed to be launched in March. I believe there will be some architectural improvements in the GF9800 series (especially in SP performance) compared to the GF8800 series....
 
320mm2 to 200mm2 is 63%, 4 clusters to 3 clusters is 75% excluding the remainder of the die, so I reckon about 80% for the die as a whole (if the memory/ROPs are the same as G92).
I'll admit I'm not perfectly sure what you mean here, because you aren't using the same (right?) terminology that I am...
G80: 8 clusters, 128 SPs, 32 TA/64 TF, 24 ROPs
G92: 8 clusters, 128 SPs, 64 TA/64 TF, 16 ROPs
D8P: 4 clusters, 096 SPs, 32 TA, 32 TF, 16 ROPs [what I was proposing]

As for 55nm - I think you're overestimating the challenge this poses. The design rules are identical and unlike on 80nm, analogue & I/O is also scaled down by the same amount as digital circuitry. To give you an idea: back in June 2007, Jen-Hsun claimed during their analyst day they weren't yet sure whether their upcoming Hybrid SLI chipset would be on 65nm or on 55nm.

Even if he was a couple of weeks behind the engineering team's decision, that's still an incredibly short amount of time before it had to tape out. Also, MCP7A (the Intel CPU variant of MCP78) is slated for late Q1 or early Q2 according to recent rumours, and that one is definitely on 55nm... So I'm not saying D9P or any other D9x is on 55nm, but it's not a good idea to exclude the possibility either. The real reason for not picking it for G92/G98 (and possibly D9P) is that the process wasn't proven, and migrating to it at a late stage would likely have delayed things slightly (which may or may not have been worth it). Does anyone remember the very first rumours of a 65nm RV670 which was 3/4th R600 in late Q3?
 
Does anyone remember the very first rumours of a 65nm RV670 which was 3/4th R600 in late Q3?
The original design of RV670 was indeed based on TSMC's 65nm process node. I still remember the conversation I had on this, which took place in April. However, since the shrink to 55nm proved to be no problem whatsoever (WRT the shrinking process itself) and it gave them a little more room to work with, they finally decided to use that half-node instead. Since they knew pretty well that NV wouldn't give them much breathing room, if any, this turned out to be the right and wise decision to make.

Now, I'm not 100% sure (maybe 99%+x) whether that original design was indeed only 3/4 of R600, but if you put that together with NV's stronger-than-expected 8800GT (based on G92), which was a surprise to all of us, it leads me to believe that these "rumours" are/were in fact true. [But since we can't prove any of this, of course, this still remains a rumour.]
 
I'll admit I'm not perfectly sure what you mean here, because you aren't using the same (right?) terminology that I am...
G80: 8 clusters, 128 SPs, 32 TA/64 TF, 24 ROPs
G92: 8 clusters, 128 SPs, 64 TA/64 TF, 16 ROPs
D8P: 4 clusters, 096 SPs, 32 TA, 32 TF, 16 ROPs [what I was proposing]
Sorry, total brainfart - I was thinking of 96 as 3/4 of 128, which is why I was talking in 3s and 4s :oops:

While you're proposing a 3:1 ratio, I'd love to see 4:1 (hence 96x SPs + 24x TMUs).

To be honest I suspect 3:1 - it's a conservative change from G92. But I think it'd be 72x SPs + 24x TMUs - 3 clusters. Then, 3:1 makes a 192x SPs + 64x TMUs D9E come in at around 950M transistors, I reckon.

Shame the rumour says this chip is 64 SPs :LOL: The instant you introduce the possibility of a different ALU:TEX ratio, everything goes bananas :p

As for 55nm - I think you're overestimating the challenge this poses. The design rules are identical and unlike on 80nm, analogue & I/O is also scaled down by the same amount as digital circuitry. To give you an idea: back in June 2007, Jen-Hsun claimed during their analyst day they weren't yet sure whether their upcoming Hybrid SLI chipset would be on 65nm or on 55nm.
I reckon he was just bullshitting the analysts: give them something to get excited about (55nm, wow - Hybrid, groovy) to paper over the fact that NVidia was making (and has made) a right mess of chipsets in 2007.

Plus, NVidia delivered its first 90nm chipsets a year before its first 90nm GPU - though to be fair G71 was prolly several months late.

Finally, you're missing the crux of what I'm saying: that D9x is late, and that a February 55nm D9x would have to be on the same schedule as ATI's RV670 (which was due for January but, bizarrely, was 2 months early). NVidia has deliberately lagged ATI on GPU processes for years now, even if it's only by 4-6 months.

We're still getting used to the fact that 55nm GPUs are out there when the first 65nm GPUs only hit in July (though they should have arrived in March or perhaps earlier, to be fair).

If you said that the delays to this chip (which I reckon should have been launched in November, alongside D9E) have given NVidia the opportunity to re-target to 55nm, then I might have some sympathy. 65nm has clearly given a lot of trouble at TSMC this year, but it appears that 55nm didn't get pushed back, making 55nm look like a faster transition than it really was.

Does anyone remember the very first rumours of a 65nm RV670 which was 3/4th R600 in late Q3?
Planted by ATI? It sure worked.

Calling this GPU D9P appears to be more of the same misinformation (except the other way round), when it's priced/positioned to be D9M. Hell, remember that 8600GTS was $200+ when it launched and this is destined for $150.

If this is D9P then D9E will be awful, being only about 33% faster. D9E, as configured earlier in this post, should be 2x the performance of this chip: 192 vs 96 SPs (or 72) and 64 vs 32 (or 24) TMUs.

Damn, this is so random. My main interest in posting in this thread is that calling a $150 part D9P is ridiculous.

Jawed
 
The original design of RV670 was indeed based on TSMC's 65nm process node. I still remember the conversation I had on this, which took place in April.
RV610 and RV630 were both scheduled for 2007Q1-ish - I don't see how you can get to RV670 definitely being 65nm from there.

Meanwhile, look at this gem, 26th October 2006:

I see this the topic future:
"r600 coming only in april" "r600 hot and long" "r600 yield 8%" "r600 20% slower than g80" "when r600 coming in january, nv already release a 8800 ultra" "r600 die huge, this is kill ATi margins" "r600 = nv30" "r600 need 600watt psu, r600 cf need 900watt psu"
Guess who that was before you click the link :devilish:

http://forum.beyond3d.com/showpost.php?p=860810&postcount=145

Jawed
 
While you're proposing a 3:1 ratio, I'd love to see 4:1 (hence 96x SPs + 24x TMUs).
I don't have anything against a 4:1 ratio, although I think in terms of perf/mm² it might be a higher priority to revamp the ALU organization itself, especially in light of CUDA.

Then, 3:1 makes a 192x SPs + 64x TMUs D9E come in at around 950M transistors, I reckon.
I suspect it could turn out to be a fair bit less than that, but it depends on so many factors it's probably not worth wasting too much time on. Heck, I wouldn't even exclude it being bigger than that. It really depends on what reference points you take and what you consider the reasons for these to be (synthesis, overhead, etc.).

Shame the rumour says this chip is 64 SPs :LOL: The instant you introduce the possibility of a different ALU:TEX ratio, everything goes bananas :p
While I do respect expreview's track record, pretty much every single rumour about NVIDIA in the last 12 months has turned out to be false. Why should I believe this one with absolute faith, even if it's less than two months out?

I reckon he was just bullshitting the analysts: give them something to get excited about (55nm, wow - Hybrid, groovy)
Heh, while I don't really agree that bullshitting the analysts on purpose is very likely, I'll admit Jen-Hsun isn't exactly the best source of process information out there. For example, he claimed in mid-2006 that most of NVIDIA's next chips would be 65nm... In terms of process stuff, I think he focuses more on the long-term roadmap, which makes it hard to get a reply out of him about schedules versus the competition's.

As for all your other points: I've never, ever seen a schedule for D9P (or its former codename, G94) that called for anything other than early 2008 (in fact, I'm not sure I ever saw anything before March). So saying it's behind schedule seems a bit ridiculous to me; if anything, it seems to be ahead of internal schedule to me. However, that doesn't mean NVIDIA's roadmap is ideal, or that it wouldn't have been nice to have this part earlier. But that's another debate completely, and orders of magnitude more subjective.

As a concluding note, consider the 90nm C51 IGP northbridge: its schedule was not that far behind R5xx (and it had delays of its own) but its architecture was exactly that of a GF6. NVIDIA came out less than 6 months later with GF7s, and even though they probably weren't 100% on schedule, they were always supposed to lag behind R5xx by a few months in all iterations of the roadmap. Can you see the similarities with today's situation?
 
RV610 and RV630 were both scheduled for 2007Q1-ish - I don't see how you can get to RV670 definitely being 65nm from there.
You can certainly make good cases for either side - that's what makes it very hard to find a consensus here. This is also the reason I'm not sure if we should continue it, since the evidence is rather lacking. :smile:
Re-reading my previous post, it does sound like I'm 100% sure, which I'm not - sorry for that confusion.

Even though we all know that ATI was, and is, very aggressive when it comes to process-node adoption, my view is that they had some risk management in place, since 65nm was still very much maturing when they designed RV670, which I think was about the time they finished bugfixing R600. So they went with 65nm, the full node.

TSMC only started their 55nm prototyping service in May, which just seems too early for me, let alone for a design completely dependent on it - especially when you take into consideration that you're free to shrink to 55nm even after you have a completed design on 65nm.

Like I said, that 3/4 R600 rumour seems to coincide well with what NV originally wanted to release, before they changed it according to RV670's final specifications.
 
As for all your other points: I've never, ever seen a schedule for D9P (or its former codename, G94) that called for anything other than early 2008 (in fact, I'm not sure I ever saw anything before March). So saying it's behind schedule seems a bit ridiculous to me; if anything, it seems to be ahead of internal schedule to me.
My main focus is on D9E being late, for what it's worth - NVidia has left a lot of 8800GTX owners with nothing to upgrade to at precisely the time they were planning on upgrading. It's apparently miles off (I'm trying to persuade a mate to wait patiently for it), and G92-GX2 isn't what those people want, it seems.

NVidia promised the CUDA community double-precision (D9x) before the year end, too. Looks late to me.

Oh, and the November 2007 1-billion-transistor GPU (the one PC Inlife posted, until that description on the graph got blacked out) can't be G92-GX2 - that's ~1.5 billion across two chips.

However, that doesn't mean NVIDIA's roadmap is ideal, or that it wouldn't have been nice to have this part earlier. But that's another debate completely, and orders of magnitude more subjective.

As a concluding note, consider the 90nm C51 IGP northbridge: its schedule was not that far behind R5xx (and it had delays of its own) but its architecture was exactly that of a GF6. NVIDIA came out less than 6 months later with GF7s, and even though they probably weren't 100% on schedule, they were always supposed to lag behind R5xx by a few months in all iterations of the roadmap. Can you see the similarities with today's situation?
Nope - chipsets, IGPs and laptop parts are the exception that proves the rule: the lead times are too punishing to do anything other than prioritise them. Plus, they're small and not dependent on bleeding-edge clock speeds in the same way as the halo GPUs that normally head up an architecture change/refresh, which means the budget process nodes, normally the first to leave prototyping, are the ideal match.

Jawed
 
Er, I'm thinking of something very simple.
The G92 GTS is: 128 SPs, 256-bit, 16 ROPs, 1GHz RAM.
A 9600GT could be: 64 SPs, 128-bit, 8 ROPs, 1GHz RAM - and that sort of adds up.
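If it helps, that "half a G92 GTS" idea looks like this as a sketch (purely illustrative, assuming GDDR3 at an effective data rate of 2x the 1GHz memory clock):

```python
# Halve the unit counts and bus width of the G92 GTS, keep the memory clock,
# and compare peak bandwidth.
g92_gts = {"SPs": 128, "bus_bits": 256, "ROPs": 16, "mem_clock_ghz": 1.0}
half = {k: (v if k == "mem_clock_ghz" else v // 2) for k, v in g92_gts.items()}
print("speculated 9600GT:", half)

for name, cfg in (("G92 GTS", g92_gts), ("half of it", half)):
    bandwidth = cfg["mem_clock_ghz"] * 2 * cfg["bus_bits"] / 8  # GB/s
    print(f"{name}: {bandwidth:.0f} GB/s")
```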
The 9600GT being D9P isn't confirmed, and the model number is lowish.

It would be below the HD 3850 range, but hell, if it can get cheaper, be a great mobile/low-power part, and be much better than the HD 2600 and 8600 GT/GTS, that would be a chip worthy of attention.

Also, I wouldn't advise anyone to buy an HD 3850 256MB - I find the idea of buying such a badass GPU to play the latest games and then having to compromise on texture detail amusing. (That card looks like the sweet deal.)
 