nvidia "D8E" High End solution, what can we expect in 2008?

I think the 512 bit bus was more of a test for the future.. and I think the projected clocks of R600 should have been much higher than the ones we saw in production.
 
Can you explain the difference with 2800XT and 2900XT with AA then? Remember that 2800XT has more bandwidth than the 2900XT .. yet the AA performance of 2900XT is much better. To me it seems ATI screwed up somewhere and eventually fixed it with 2900XT. Nothing to do with bandwidth imo.

So if Nvidia can sort out their chip properly, then the extra supposed bandwidth can only help them and not hinder them.

Hence

Bandwidth = (memory bus width / 8 bits per byte) * effective memory clock speed

That gives 204.8 GB/s.
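A minimal sketch of that formula, for anyone who wants to plug in their own numbers. The inputs are assumptions: the post does not state them, but a 512-bit bus with a 3200 MHz effective memory clock is one combination that reproduces the 204.8 figure. The function name is made up for illustration.

```python
def memory_bandwidth_gb_s(bus_width_bits: int, effective_clock_mhz: float) -> float:
    """Peak theoretical memory bandwidth in decimal GB/s.

    bus_width_bits / 8 gives the bytes moved per transfer; multiplying by the
    effective (data-rate) clock in MHz gives MB/s, and dividing by 1000 gives GB/s.
    """
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * effective_clock_mhz / 1000


# Assumed inputs (not stated in the post): 512-bit bus, 3200 MHz effective clock.
rumoured = memory_bandwidth_gb_s(512, 3200)
print(f"{rumoured:.1f} GB/s")
```

Note that the result is in decimal GB/s (10^9 bytes per second), which is how card bandwidth is normally quoted.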


Good imo.

Also, the Nvidia products seem to do better with more bandwidth. ;)

US
 
Compare the 2900XT to what? The 2800XT does not exist.
Are you referring to the 8800 (Nvidia) vs. 2900 (ATI) comparison?
Or the HD2900 vs. HD3870 comparison (in which case the HD3870 does not always outperform R600)?

And anyway, I said that 512-bit was of course overkill in R600 (i.e. not needed), but it was useful to ATI because they managed to pull it off technically, so they could use it in a future high-end product.
About Nvidia cards, you can easily see that bandwidth (and fill rate, of course) becomes more important at resolutions of 1920x1200 with 4xAA and above. This shows up clearly in the 8800 GTS 512 vs. 8800 GTX comparison: the former has much less bandwidth (-22%) and pixel fill rate (-32%) than the GTX, but more shader and texturing power (+22% and +14% respectively).
 
Something else to remember. If HardOCP's 9800 GX2 specs are true, then a single 9800 has a 512-bit frame buffer. That will serve the card well, especially compared with the 256-bit GT/GTS.

US
 
Last edited by a moderator:
Something else to remember. If HardOCP's G98 GX2 specs are true and a single G98 has a 512-bit frame buffer, that will serve the card well, especially compared with the 256-bit GT/GTS.

US

In the HardOCP specs there is no hint about the bus width, nor is the chip called "G98". What is written there is 256 SPs total (128x2) and a 512 MB x2 frame buffer. This closely resembles a G92x2, so that would be a 256-bit x2 bus.
 
Sorry .. you're right. I was talking about the AA performance of the R600 HD2900XT and the RV670 HD3870 XT.

The 2900XT has more bandwidth than the 3870XT. My bad.

Remember, I'm also talking about the cards running AA, as in the MSAA 4x requirement for DX10/10.1.

US

The problem with AA in R600 is the low number of samples compared to the G8X architecture: R600 can do at best 2 samples per clock, while G80, AFAIR, does 4.
Then there is the shader resolve issue, but I think the main problem is the one above.
R600 was certainly overdone in the bandwidth department. IMHO ATI was also developing something else high-end that would have taken advantage of that bandwidth later (an R650-R680), but that was probably cancelled in favor of an RV670x2 and quicker R700 development.
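As a rough back-of-the-envelope sketch of the samples-per-clock argument: the 2 and 4 samples per clock are the from-memory figures quoted above, read here as per-ROP rates (an assumption). The ROP counts and core clocks (16 ROPs at 742 MHz for the HD 2900 XT, 24 ROPs at 575 MHz for the 8800 GTX) are the published reference specs, not numbers from this thread, and the function name is made up for illustration.

```python
def aa_sample_rate_gsamples(rops: int, samples_per_clock: int, core_clock_mhz: float) -> float:
    """Peak multisample throughput in Gsamples/s.

    Assumes every ROP can process `samples_per_clock` AA samples each cycle,
    which is a theoretical ceiling and ignores bandwidth and resolve cost.
    """
    return rops * samples_per_clock * core_clock_mhz / 1000


# Assumed reference specs; per-ROP sample rates are the poster's recollection.
r600 = aa_sample_rate_gsamples(rops=16, samples_per_clock=2, core_clock_mhz=742)
g80 = aa_sample_rate_gsamples(rops=24, samples_per_clock=4, core_clock_mhz=575)

print(f"R600: {r600:.1f} Gsamples/s, G80: {g80:.1f} Gsamples/s, ratio {g80 / r600:.2f}x")
```

Under those assumptions G80 has more than twice the peak sample throughput despite its lower core clock, which is the point being made: the gap comes from the ROP design rather than from memory bandwidth.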
 
Why is everyone ignoring the fact that there are still 16 ROPs in the R6XXs? That is the same number of units they've had for about a billion years, with clock-rate improvements that weren't huge across generations (think x1800xt, x1900xtx, 2900, 3870). The AA performance delta between the 3870 and the 2900 could very well be the result of extra clocks plus some architectural tweaking. It's not some huge earth-shattering thing, not some mysterious bug of the kind that gets thrown around a lot.

The 2900, with the same number of ROPs and a (relatively) moderate clock increase over the x1900xtx, did better in AA, mostly in line with what was to be expected under such conditions (except for fluke scenarios where drivers gutted it). Unless everyone thought that ATi actually had EXXXTREMME PIPELINEStm, why is this surprising/unexpected? Ignore the G8Xs and think only within the realm of ATi.

I don't think the AA of 2900 was broken significantly performance-wise. It's simply a case of making design decisions that come around and bite you in the arse. Badly.
 
In the HardOCP specs there is no hint about the bus width, nor is the chip called "G98". What is written there is 256 SPs total (128x2) and a 512 MB x2 frame buffer. This closely resembles a G92x2, so that would be a 256-bit x2 bus.

Yes, talking frame buffer, or memory buffer. It'll most probably be half GX2 which is indeed 512Bit.

The 9800 GPU core is rumoured to be 256Bit.

US
 
Why is everyone ignoring the fact that there are still 16 ROPs in the R6XXs? That is the same number of units they've had for about a billion years, with clock-rate improvements that weren't huge across generations (think x1800xt, x1900xtx, 2900, 3870). The AA performance delta between the 3870 and the 2900 could very well be the result of extra clocks plus some architectural tweaking. It's not some huge earth-shattering thing, not some mysterious bug of the kind that gets thrown around a lot.

The 2900, with the same number of ROPs and a (relatively) moderate clock increase over the x1900xtx, did better in AA, mostly in line with what was to be expected under such conditions (except for fluke scenarios where drivers gutted it). Unless everyone thought that ATi actually had EXXXTREMME PIPELINEStm, why is this surprising/unexpected? Ignore the G8Xs and think only within the realm of ATi.

I don't think the AA of 2900 was broken significantly performance-wise. It's simply a case of making design decisions that come around and bite you in the arse. Badly.

*sigh*
Who said that? I said that AA on the 2900/3870 performs worse than on G80, and I pointed out that this has much more to do with architecture (2 samples vs. 4 samples) than with bandwidth or AA resolve issues. This can be seen easily in the HD 3870 vs. 8800 GT comparison, where the latter has much lower raw pixel fill rate (same number of ROPs, lower clocks and less bandwidth available) but performs better with AA across the board, barring rare cases. Then, again, IMHO R600's 512-bit bus IS overkill (especially if you have 16 non-extreme ROPs), and the HD3870's improvements in AA ARE the result of increased clocks (OK, only 35 MHz) plus tweaks. If G8X chips were not the FPS-crunching devices they are, these ATI chips would not have been so underestimated, but when two products are on the market, you should compare them.
 
Sure they should be compared, but that comparison is moot if the point being made is that there's some inherent AA bug with ATi's stuff, because IMHO there's none. It simply doesn't have muscle in that department (due to a number of design choices, including the ones you mentioned).

I don't think I've seen tests with AA only and no AF (those would be interesting) on the 3870 and the 8800GT. Bear in mind that AF is slower as well, so it may be a confounding factor in the comparison; taking it out should show how close or far apart the cards are in terms of AA capability. My guess is that they'd end up fairly close.

I wasn't arguing that the 512-bit bus was adequate. It was mostly useless, given what the chip itself could do with it. I'm simply against the elusive R6xx AA bug that always shows up in arguments and is always something that'll be fixed and will bring huge performance increases in the next architecture/whatever, mostly because I can't see the evidence pointing to this bug and thus am envious of those who can :D.
 
Ok, so then in all my other posts .. read G98 as Nvidia 98xx cards.

Hell lemme fix it.

Thx.

US
 
Sure they should be compared, but that comparison is moot if the point being made is that there's some inherent AA bug with ATi's stuff, because IMHO there's none. It simply doesn't have muscle in that department (due to a number of design choices, including the ones you mentioned).

I don't think I've seen tests with AA only and no AF (those would be interesting) on the 3870 and the 8800GT. Bear in mind that AF is slower as well, so it may be a confounding factor in the comparison; taking it out should show how close or far apart the cards are in terms of AA capability. My guess is that they'd end up fairly close.

I wasn't arguing that the 512-bit bus was adequate. It was mostly useless, given what the chip itself could do with it. I'm simply against the elusive R6xx AA bug that always shows up in arguments and is always something that'll be fixed and will bring huge performance increases in the next architecture/whatever, mostly because I can't see the evidence pointing to this bug and thus am envious of those who can :D.

I did not say anywhere that there was a bug. I said something about the "AA shader resolve issue" (without pointing to it as the main problem), but this is not a bug, it is a design decision (otherwise it would have been fixed in all the other R6XX SKUs), of course.
But it is an issue, as it takes away shader power (not too much, OK), could limit the maximum FPS in certain conditions (even Sir Eric said so in the R600 talk thread) and IMHO complicates the scheduling of shader operations by adding another "level" even in the case of a simple weighted MSAA resolve. I remember that Computerbase.de ran some tests just after the R600 release with only AA enabled, and even in that case R600 lost more performance than G80, and at that time in certain cases even more than R580. Now with newer drivers the situation should be better, but I think there's still a structural deficit relative to the G80 architecture for AA.
 
Compared to G8x of course there is. That wasn't being disputed. You don't seem to be getting that we're on the same wavelength, mostly. I simply think that the prime reason for the R6XXs' AA performance is the number of RBEs (16), with the other aspects, like the lack of dedicated HW resolve and the samples per clock, adding their contribution to it.

The fact that the RV670 performs similarly to the R600, in spite of the reduced bandwidth and in line with the clock-speed delta and the possible tweaks that were introduced, points to that as well. :)
 
(That is what I was saying from the beginning.)
I think that the more relevant problem relative to G80 is indeed the number of samples per clock cycle and the relative efficiency of the ROP itself. In fact, if we compare the HD3870 and the GeForce 8800 GT, we see that the former has more bandwidth, the same number of ROPs and higher clocks. Its theoretical pixel fill rate is even higher than the new 8800 GTS's (+20% over the GTS, +30% over the GT). But the 8800 GT still loses less with AA applied. AF has to be taken into account, but IMHO it cannot be considered the main cause of the higher loss.
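For reference, those fill-rate percentages can be sanity-checked with a quick sketch. Theoretical pixel fill rate is just ROPs times core clock; the clocks used here (775 MHz for the HD3870, 650 MHz for the 8800 GTS 512, 600 MHz for the 8800 GT, 16 ROPs each) are the published reference specs and are assumptions as far as this thread goes, since nobody quoted them. The function name is made up for illustration.

```python
def pixel_fill_rate_gpixels(rops: int, core_clock_mhz: float) -> float:
    """Theoretical pixel fill rate in Gpixels/s: one pixel per ROP per clock."""
    return rops * core_clock_mhz / 1000


# Assumed reference clocks; partner cards often ship with different ones.
hd3870 = pixel_fill_rate_gpixels(16, 775)
gts512 = pixel_fill_rate_gpixels(16, 650)
gt8800 = pixel_fill_rate_gpixels(16, 600)

advantage_over_gts = (hd3870 / gts512 - 1) * 100
advantage_over_gt = (hd3870 / gt8800 - 1) * 100

print(f"HD3870 vs 8800 GTS 512: +{advantage_over_gts:.0f}%")
print(f"HD3870 vs 8800 GT:      +{advantage_over_gt:.0f}%")
```

With equal ROP counts the advantage reduces to the ratio of core clocks, which lands close to the rounded figures above.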
 
I think the first post of these super dooper rumor fest threads should maintain a nice compilation of the latest info. :)
 
The latest info is depressing, in a way :D. It seems that everyone was expecting something new(ish) from nV, not a GX2 part. Oh, well.
 
Yeah it is depressing if it's just G92 x 2. I'm not really interested in the inefficient, dev-attention-req'd SLI or Crossfire techs....
 