So.. What will Nvidia Bring to Counter the R520?

Xmas · Jan 27, 2005

Chalnoth said:
1. I would expect both vertex and pixel pipeline counts to be doubled. That's most of the transistors in the chip. The only things that would remain the same are the video processor and I/O.
2. I wouldn't expect the chip to remain the same, but have some additional improvements....so maybe I misspoke in my previous post.

So, for my lower range I doubled the transistors of the NV40 and rounded it to 450M. Any extra would be related to other advancements, hence the upper range. So yes, I should have left out the "without improved tech," as I was obviously including it.

I think you're overestimating the number of transistors that have to be doubled. The B3D chips table says 190M for NV41 and 143M for NV43.
PS and VS pipelines are certainly a huge area, but VPU, memory interface, and early Z aren't insignificant either. Also, I think if they had had more time they could have made NV40 with fewer transistors. E.g. NV30 to NV35 is a surprisingly large change for the small increase in transistors. I think both NV30 and NV40 have a lot of "wasted" transistors, being the first incarnation of a new architecture.

Xmas · Jan 27, 2005

Chalnoth said:
Well, for the purposes of non-graphics computing, I personally think it would be highly useful to allow accessing of a FP32 framebuffer within the shader.

Agreed, but I don't expect that to happen before WGF2.0.

psurge · Jan 27, 2005

I also recall some comments to the effect that NV40 was designed with a fair amount of redundancy to increase yield.

KimB · Jan 27, 2005

Xmas said:
I think you're overestimating the number of transistors that have to be doubled. The B3D chips table says 190M for NV41 and 143M for NV43.
PS and VS pipelines are certainly a huge area, but VPU, memory interface, and early Z aren't insignificant either. Also, I think if they had had more time they could have made NV40 with fewer transistors. E.g. NV30 to NV35 is a surprisingly large change for the small increase in transistors. I think both NV30 and NV40 have a lot of "wasted" transistors, being the first incarnation of a new architecture.

First of all, the NV30 had some blatant issues, and I think the low transistor count increase to the NV35 is more of a highlight of those issues than any sort of indication of how much you can do with few transistors.

Secondly, let's take those numbers. If you consider the NV41 and NV43 to be of the same basic technology (i.e. more similar to one another than to the NV40), then one might assume that the difference between the two can be accounted for pretty much entirely by the pixel pipeline difference, assuming that the vertex shader increase is proportional. Doing this, you can extrapolate that a 32-pipeline part with the same basic technology would be around 425 million transistors. Add a few tweaks and advancements and you're right back to 450 million.

KimB · Jan 27, 2005

psurge said:
I also recall some comments to the effect that NV40 was designed with a fair amount of redundancy to increase yield.

I would expect this would be why nVidia sold the GeForce 6800.

DemoCoder · Jan 27, 2005

DaveBaumann said:
FP32 NRM can be achieved in the ALUs, in fact it is - the FP16 NRM is an extra, specific function such that it can be achieved "for free" (or, at least, interleaved with other instructions operating on the ALUs).

There is no hardware FP32 NRM. NRM_PP executes in hardware. NRM (FP32) is handled by the driver and devolves to a macro which issues DP4/RSQ/MUL. Anything can be "achieved" in software; the question is how fast it runs. If you build a specialized unit to do NRM, or SINCOS, it has the potential to run faster than a "software emulated" version.

NRM is such a primitive, frequently used operation, it deserves specialized hardware or consideration since it is frequently part of the workload.
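To make the macro expansion described above concrete, here is a minimal sketch in Python of a normalize built from a dot product, a reciprocal square root, and a multiply. The function names `dp3`, `rsq`, and `nrm` are illustrative stand-ins for the shader instructions, not driver code, and the sketch assumes a three-component dot product for a 3D vector where the post mentions DP4.

```python
import math

def dp3(a, b):
    """Three-component dot product (stand-in for the shader dot-product step)."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def rsq(x):
    """Reciprocal square root (stand-in for the RSQ instruction)."""
    return 1.0 / math.sqrt(x)

def nrm(v):
    """Normalize v as three separate steps: dot product, RSQ, multiply."""
    inv_len = rsq(dp3(v, v))              # 1 / |v|
    return tuple(c * inv_len for c in v)  # scale each component

n = nrm((3.0, 0.0, 4.0))
print(n)
```

Each of the three steps would occupy an instruction slot when issued as a macro, which is the cost a dedicated NRM unit would avoid.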

Xmas · Jan 27, 2005

Chalnoth said:
Secondly, let's take those numbers. If you consider the NV41 and NV43 to be of the same basic technology (i.e. more similar to one another than to the NV40), then one might assume that the difference between the two can be accounted for pretty much entirely by the pixel pipeline difference, assuming that the vertex shader increase is proportional. Doing this, you can extrapolate that a 32-pipeline part with the same basic technology would be around 425 million transistors. Add a few tweaks and advancements and you're right back to 450 million.

a = NV41 - NV43 = 4 PP, 8 ROP, 2 VS, 47M transistors
b = NV40 - NV41 = 4 PP, 4 ROP, 1 VS, 32M transistors
c = NV40 - NV43 = 8 PP, 12 ROP, 3 VS, 79M transistors

NV43 + 6 * a = NV41 + 5 * a = 32 PP, 52 ROP, 15 VS, 425M
NV40 + 4 * a = 32 PP, 48 ROP, 14 VS, 410M
NV43 + 3 * c = NV40 + 2 * c = 32 PP, 40 ROP, 12 VS, 380M
NV41 + 5 * b = NV40 + 4 * b = 32 PP, 32 ROP, 10 VS, 350M
NV43 + 6 * b = 32 PP, 28 ROP, 9 VS, 335M

350M to 380M transistors makes more sense.
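The extrapolation above can be reproduced with a short Python sketch. The per-chip unit counts and the 222M figure for NV40 are not stated in the post; they are back-derived here from the differences a, b, and c together with the 190M and 143M figures, so treat the `CHIPS` table as an assumption.

```python
# (pixel pipes, ROPs, vertex shaders, transistors in millions)
# Assumed values, back-derived from the differences given in the post.
CHIPS = {
    "NV40": (16, 16, 6, 222),
    "NV41": (12, 12, 5, 190),
    "NV43": (8, 4, 3, 143),
}

def delta(big, small):
    """Per-unit difference between two chips."""
    return tuple(x - y for x, y in zip(CHIPS[big], CHIPS[small]))

def extrapolate(base, step, target_pp=32):
    """Add whole multiples of `step` to `base` until it has target_pp pixel pipes."""
    steps, remainder = divmod(target_pp - CHIPS[base][0], step[0])
    if remainder:
        raise ValueError("target is not a whole number of steps away")
    return tuple(x + steps * d for x, d in zip(CHIPS[base], step))

a = delta("NV41", "NV43")
b = delta("NV40", "NV41")
c = delta("NV40", "NV43")

for base, name, step in [("NV43", "a", a), ("NV40", "a", a), ("NV40", "c", c),
                         ("NV40", "b", b), ("NV43", "b", b)]:
    pp, rop, vs, mt = extrapolate(base, step)
    print(f"{base} + n*{name}: {pp} PP, {rop} ROP, {vs} VS, {mt}M")
```

The spread between the rows comes entirely from which pair of chips is taken as the "cost of four more pipes": step a carries eight ROPs per four pipes, while step b carries only four.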

DegustatoR · Jan 27, 2005

Xmas said:
350M to 380M transistors makes more sense.

While this may be possible on 90nm, I doubt that they'll try to do it as their first 90nm chip. So a 24/8 NV47 (or is it NV48 already?) still sounds more plausible, with a possible 32/(10-12) part later and a WGF-ready NV50 sometime in 2006.

Ailuros · Jan 27, 2005

DegustatoR said:
Xmas said:

350M to 380M transistors makes more sense.

While this may be possible on 90nm, I doubt that they'll try to do it as their first 90nm chip. So a 24/8 NV47 (or is it NV48 already?) still sounds more plausible, with a possible 32/(10-12) part later and a WGF-ready NV50 sometime in 2006.

While I can understand a hypothetical 90nm/6 quad/NV4x future project, I'm somewhat lost when thinking of more quads than 6. Unless such a solution uses anything higher specced than GDDR3, it'll most likely end up bandwidth limited. The next reasonable step beyond any possible first introduction would be to increase clock speed and use higher-specced RAM at the same time.

Device IDs are AFAIK 0046, 0047, 0048, 0049; if they really represent future accelerators, I'd figure that half of them might be AGP and the rest PCI-E.

DegustatoR · Jan 27, 2005

Ailuros said:
While I can understand a hypothetical 90nm/6 quad/NV4x future project, I'm somewhat lost when thinking of more quads than 6. Unless such a solution uses anything higher specced than GDDR3, it'll most likely end up bandwidth limited. The next reasonable step beyond any possible first introduction would be to increase clock speed and use higher-specced RAM at the same time.

B/w is not that important for complex shaders, which have been the main area of progress in the last two years and will be for a couple of years to come. The chip's internal shader fillrate, on the other hand, is very important. So it is very possible that this hypothetical NV4x with 6+ quads will be able to write fewer pixels to memory than it processes per clock (à la NV43/44).

Ailuros said:
Device IDs are AFAIK 0046, 0047, 0048, 0049; if they really represent future accelerators, I'd figure that half of them might be AGP and the rest PCI-E.

I doubt we'll see new AGP chips from NVIDIA; they're most likely to use a PCIE-AGP bridge for this bus. This bridge has a universal BR2 ID in current drivers, so I think most if not all of these IDs are chip IDs.

Ailuros · Jan 27, 2005

DegustatoR said:
B/w is not that important for complex shaders, which have been the main area of progress in the last two years and will be for a couple of years to come. The chip's internal shader fillrate, on the other hand, is very important. So it is very possible that this hypothetical NV4x with 6+ quads will be able to write fewer pixels to memory than it processes per clock (à la NV43/44).

While it's basically true, I doubt that there isn't an upper threshold even in that case.

Besides, however hard I stretch that speculation, I don't see much chance of a 6-quad part, an 8-quad part, and NV's next-generation chip all arriving at such short notice (presupposing that we'll see WGF2.0 chips around mid to late 2006).

DegustatoR · Feb 1, 2005

NVIDIA_NV48.DEV_0211.1 = "NVIDIA GeForce 6800 "
NVIDIA_NV48.DEV_0212.1 = "NVIDIA GeForce 6800 LE "
NVIDIA_NV48.DEV_0215.1 = "NVIDIA GeForce 6800 GT "

That's from the FW71.80 inf file.

I think now we can safely assume that NV48 is NOT NV47.

Then, it looks like NV48 is an NV40 on 110nm TSMC (no Ultra version in inf btw). Now, what bus is it on? I'd say it's still AGP because there really is no reason to make 6800LE out of NV48 when you have NV41 on PCIE...
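For anyone who wants to pull the same information out of an inf file themselves, here is a minimal Python sketch that extracts the codename, device ID, and marketing name from lines shaped like the three quoted above. The regular expression is an assumption based only on those three lines, and `parse_inf_entries` is a hypothetical helper, not part of any NVIDIA tool.

```python
import re

# Pattern assumed from the three quoted lines: NVIDIA_<codename>.DEV_<hex id>.<n> = "<name>"
INF_LINE = re.compile(r'^NVIDIA_(NV\w+)\.DEV_([0-9A-Fa-f]{4})\.\d+\s*=\s*"([^"]*)"')

SAMPLE = '''
NVIDIA_NV48.DEV_0211.1 = "NVIDIA GeForce 6800 "
NVIDIA_NV48.DEV_0212.1 = "NVIDIA GeForce 6800 LE "
NVIDIA_NV48.DEV_0215.1 = "NVIDIA GeForce 6800 GT "
'''

def parse_inf_entries(text):
    """Return (codename, device_id, product_name) for each matching line."""
    entries = []
    for line in text.splitlines():
        match = INF_LINE.match(line.strip())
        if match:
            codename, dev_id, name = match.groups()
            entries.append((codename, int(dev_id, 16), name.strip()))
    return entries

for codename, dev_id, name in parse_inf_entries(SAMPLE):
    print(f"{codename}  0x{dev_id:04X}  {name}")
```

Grouping the results by codename makes it easy to see which product names share a core label in a given driver release.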

KimB · Feb 1, 2005

DegustatoR said:
Then, it looks like NV48 is an NV40 on 110nm TSMC (no Ultra version in inf btw). Now, what bus is it on? I'd say it's still AGP because there really is no reason to make 6800LE out of NV48 when you have NV41 on PCIE...

Um, it would have to be PCIE. Such a move would be a cost-reduction move, and would therefore be aimed primarily at OEM markets. That means PCI Express.

Unwinder · Feb 2, 2005

Just some interesting facts:

1) There is _no_ NV48 core family ID in the driver, which means that the NV48 codename is used for marketing only and in reality this chip is a "clone" of one of the existing cores (presumably NV40). Just like NV38, which the driver internally identifies as NV35.
2) There _is_ an NV47 core family ID in the driver, and a person with basic reverse engineering skills can easily get the specs of this core from the 61.xx driver binary. There is the "FSOverride" registry key in the miniport, allowing the driver to disable some pixel/vertex units via the registry (unfortunately it cannot be used for pipeline unlocking, NVIDIA took care of that). The trick is that the driver checks the desired pixel/vertex unit index before disabling it, so by examining this check code you can easily see how many pixel / vertex units are in any NVIDIA chip (including NV47). Unfortunately I was stupid enough to inform NVIDIA about this hole, so you cannot use this trick with newer NVIDIA drivers. And I can confirm that one of the postings in this thread contains the correct pixel / vertex pipeline spec for this chip. Can't say more, sorry. But you have the way to go.

digitalwanderer · Feb 2, 2005

Unwinder said:
Just some interesting facts:

1) There is _no_ NV48 core family ID in the driver, which means that the NV48 codename is used for marketing only and in reality this chip is a "clone" of one of the existing cores (presumably NV40). Just like NV38, which the driver internally identifies as NV35.
2) There _is_ an NV47 core family ID in the driver, and a person with basic reverse engineering skills can easily get the specs of this core from the 61.xx driver binary. There is the "FSOverride" registry key in the miniport, allowing the driver to disable some pixel/vertex units via the registry (unfortunately it cannot be used for pipeline unlocking, NVIDIA took care of that). The trick is that the driver checks the desired pixel/vertex unit index before disabling it, so by examining this check code you can easily see how many pixel / vertex units are in any NVIDIA chip (including NV47). Unfortunately I was stupid enough to inform NVIDIA about this hole, so you cannot use this trick with newer NVIDIA drivers. And I can confirm that one of the postings in this thread contains the correct pixel / vertex pipeline spec for this chip. Can't say more, sorry. But you have the way to go.

Thanks Alexey, great food for thought!

digitalwanderer · Feb 2, 2005

Correct me if I'm wrong please, but I do believe this is the only other post you made in this thread:

Unwinder said:
thomase said:

Several people including myself have recently received new Dell systems with the GeForce 6800 GTO (12 pipe, 5 vs) and we are having no luck attempting to unlock the disabled hardware with RivaTuner. The softmod does not seem to enable the masked hardware. There is a sticker on the back of my card with "REV A00". Is it possible that newer versions of these cards are shipping with NV41? If the NV41 is pin compatible with the NV45, it may be that NVIDIA has been quietly shipping the former to manufacturers of the GTO and other 6800 NU products. If not, what other reason could there be for the RivaTuner softmod not working? I'm reluctant to take the hs/fan off of the GPU....

I'm pretty certain that these GTOs are _not_ NV41 based.

You think the GTOs are nV48s which are actually nV40s? :|

Unwinder · Feb 2, 2005

digitalwanderer said:
Correct me if I'm wrong please, but I do believe this is the only other post you made in this thread:

Unwinder said:

thomase said:

Several people including myself have recently received new Dell systems with the GeForce 6800 GTO (12 pipe, 5 vs) and we are having no luck attempting to unlock the disabled hardware with RivaTuner. The softmod does not seem to enable the masked hardware. There is a sticker on the back of my card with "REV A00". Is it possible that newer versions of these cards are shipping with NV41? If the NV41 is pin compatible with the NV45, it may be that NVIDIA has been quietly shipping the former to manufacturers of the GTO and other 6800 NU products. If not, what other reason could there be for the RivaTuner softmod not working? I'm reluctant to take the hs/fan off of the GPU....

I'm pretty certain that these GTOs are _not_ NV41 based.

You think the GTOs are nV48s which are actually nV40s? :|

Nope. That GTO was a regular NV40 with a slightly revised pipeline masking technique. Currently both NV40-based (with one locked quad and one locked vertex pipe) and NV41-based (having 12 pipelines and 5 vertex processors in silicon) Dell 6800 GTOs are on sale.

digitalwanderer · Feb 2, 2005

Thanks!

caboosemoose · Feb 2, 2005

As Dave has pointed out, I think people are getting a little over optimistic about what the next round of top-end refreshes will bring. My view is that R520 will be 16 fragment. Not so sure about NV. If anyone tops 16 fragment it'll be NV but I doubt even that.

They're having a tough enough time getting 16 pipe chips out of the door at the moment, god only knows what yields would be like on 24 pipe or whatever GPUs. And anyway, it's not like ATI and NV are under pressure to produce faster cards - where's the content?

Actually that's another issue. I love 3D tech and all, but does anyone else sometimes think it's all a bit crazy with so little decent content around that even comes close to making the most of this glorious hardware?

digitalwanderer · Feb 2, 2005

caboosemoose said:
Actually that's another issue. I love 3D tech and all, but does anyone else sometimes think it's all a bit crazy with so little decent content around that even comes close to making the most of this glorious hardware?

Yup, that was the final bit that convinced me to waste such a bundle on an X800 when my 9700 Pro died right when the X800 was released. I figured there wouldn't be too huge a refresh part, since the original R420 is still pushing the envelope on overkill for any game out there or on the horizon.

I think there will be a "catching up" so to speak of games soon, in about a year.
