NV40: 6x2/12x1/8x2/16x1? Meh. Summary of what I believe

OpenGL guy said:
I don't understand how you think that NV30 is great at stencil. 8 stencil ops per clock? R300 has that too, and it can do 8 colored pixels per clock as well.
Per clock...
 
I think the point is, Nvidia parts generally run at higher clock speeds than equivalent ATI parts. Of course you wouldn't know it from the benchmarks ;)
 
OpenGL guy said:
Xmas said:
OpenGL guy said:
I don't understand how you think that NV30 is great at stencil. 8 stencil ops per clock? R300 has that too, and it can do 8 colored pixels per clock as well.
Per clock...
And your point is?

500 MHz core clock on NV30 versus 325 MHz core clock on R300. That's a clock over 50% higher.
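The clock-rate point can be put into numbers. A quick back-of-the-envelope sketch, using only the figures quoted in this thread (8 stencil ops per clock on both chips, 500 MHz NV30 Ultra vs. 325 MHz R300); illustrative, not vendor specs:

```python
def stencil_rate_mops(ops_per_clock: int, core_mhz: int) -> int:
    """Peak stencil ops per second, in millions (ops/clock x MHz)."""
    return ops_per_clock * core_mhz

# NV30: 4 pipes x 2 stencil ops/clock; R300: 8 pipes x 1 stencil op/clock
nv30 = stencil_rate_mops(8, 500)
r300 = stencil_rate_mops(8, 325)

print(nv30, r300)                # 4000 vs 2600 Mops/s
print(f"{nv30 / r300 - 1:.0%}")  # ~54% advantage, all of it from clock
```

Per clock the two parts are identical; the entire peak-rate gap comes from the clock speed, which is the point being argued either way here.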
 
SsP45 said:
OpenGL guy said:
Xmas said:
OpenGL guy said:
I don't understand how you think that NV30 is great at stencil. 8 stencil ops per clock? R300 has that too, and it can do 8 colored pixels per clock as well.
Per clock...
And your point is?
500 MHz core clock on NV30 versus 325 MHz core clock on R300. That's a clock over 50% higher.
You can't say the NV30 is really any better at stencil than R300; you can only say that it runs at a higher clock.

Also, how many NV30s really shipped at 500/500? Not too many I bet.
 
OpenGL guy said:
SsP45 said:
OpenGL guy said:
Xmas said:
OpenGL guy said:
I don't understand how you think that NV30 is great at stencil. 8 stencil ops per clock? R300 has that too, and it can do 8 colored pixels per clock as well.
Per clock...
And your point is?
500 MHz core clock on NV30 versus 325 MHz core clock on R300. That's a clock over 50% higher.
You can't say the NV30 is really any better at stencil than R300; you can only say that it runs at a higher clock.

Also, how many NV30s really shipped at 500/500? Not too many I bet.

All 5800 Ultras shipped at 500/500, albeit overvolted.
 
Waltar said:
OpenGL guy said:
SsP45 said:
OpenGL guy said:
Xmas said:
OpenGL guy said:
I don't understand how you think that NV30 is great at stencil. 8 stencil ops per clock? R300 has that too, and it can do 8 colored pixels per clock as well.
Per clock...
And your point is?
500 MHz core clock on NV30 versus 325 MHz core clock on R300. That's a clock over 50% higher.
You can't say the NV30 is really any better at stencil than R300; you can only say that it runs at a higher clock.
Also, how many NV30s really shipped at 500/500? Not too many I bet.
All 5800 Ultras shipped at 500/500, albeit overvolted.
I know that. My question was: How many 5800 Ultras were shipped?
 
OpenGL guy said:
I know that. My question was: How many 5800 Ultras were shipped?

I'm curious, what does something like the number of units shipped have to do with the discussion of the differing ideologies which led to each IHV's respective architecture?

The number of units shipped can be a function of many factors, and as you very well know, a great number of them aren't technical in basis. And outside of gross extrapolations, the knowledge isn't public which would provide us with a true answer as to how many shipped and why that number did. With all due respect, your comment seems more of a feeble attempt at defending every last facet of the R300, which is unnecessary as we all respect it immensely for what it does do (and very well at that) and don't need to see you defend what it doesn't with such tangential arguments.

As a consumer, I can buy the nVidia product in question. I physically couldn't buy a product from ATI which is comparable in said respect. To me, as the consumer, it doesn't matter the reasons why or the number shipped or the IHV's profit margin - I can only use what you [figuratively] provide.
 
Vince said:
I'm curious, what does something like the number of units shipped have to do with the discussion of the differing ideologies which led to each IHV's respective architecture?

Isn't it obvious?

The number of units sold (or in this case, producible at that clock speed in any appreciable quantity) is an indication of which of the differing ideologies is actually legitimate.

The only thing that made NV30 at all reasonable performance wise, was the clock speed, and the point is, that clock speed is not particularly legitimate, given how

1) They canned it with few units produced
2) Leaf Blower
3) They replaced the product with one that had a lower clock rate.

And, I think you can more or less expect an employee of ATI to be "defensive" of their own products. All he said was he didn't see what was so impressive about NV30's stencil.
 
Vince said:
OpenGL guy said:
I know that. My question was: How many 5800 Ultras were shipped?
I'm curious, what does something like the number of units shipped have to do with the discussion of the differing ideologies which led to each IHV's respective architecture?
You choose to miss the point as well.
The number of units shipped can be a function of many factors, and as you very well know, a great number of them aren't technical in basis. And outside of gross extrapolations, the knowledge isn't public which would provide us with a true answer as to how many shipped and why that number did. With all due respect, your comment seems more of a feeble attempt at defending every last facet of the R300, which is unnecessary as we all respect it immensely for what it does do (and very well at that) and don't need to see you defend what it doesn't with such tangential arguments.
What am I defending here? The R300 does as many stencil ops per clock as the NV30. What I am questioning is how you can conclude that nvidia put any more work into stencil op performance than ATI did.

If you want to say that the 5800 Ultra was 50% faster because of its higher clock speed, I won't dispute that. What I dispute is calling the 5800 Ultra a real product. I never saw it on store shelves, for example. If you instead look at the 5800 non-Ultra, which was clocked at 400/400 I believe, then you see a more realistic picture as this product was actually available for purchase.
As a consumer, I can buy the nVidia product in question.
As far as I know, this is only true if you were one of the few (10000?) that had preordered the board.
I physically couldn't buy a product from ATI which is comparable in said respect. To me, as the consumer, it doesn't matter the reasons why or the number shipped or the IHV's profit margin - I can only use what you [figuratively] provide.
All of this is irrelevant to what I am referring to.

I'll state it clearly: Clock per clock, NV30 and R300 do the same number of stencil ops. How can you say that NV30 is more optimized for stencil ops than R300?

Pretty simple question.
 
Vince said:
OpenGL guy said:
I know that. My question was: How many 5800 Ultras were shipped?

I'm curious, what does something like the number of units shipped have to do with the discussion of the differing ideologies which led to each IHV's respective architecture?

The number of units shipped can be a function of many factors, and as you very well know, a great number of them aren't technical in basis. And outside of gross extrapolations, the knowledge isn't public which would provide us with a true answer as to how many shipped and why that number did. With all due respect, your comment seems more of a feeble attempt at defending every last facet of the R300, which is unnecessary as we all respect it immensely for what it does do (and very well at that) and don't need to see you defend what it doesn't with such tangential arguments.

As a consumer, I can buy the nVidia product in question. I physically couldn't buy a product from ATI which is comparable in said respect. To me, as the consumer, it doesn't matter the reasons why or the number shipped or the IHV's profit margin - I can only use what you [figuratively] provide.

Oh please.

It's not hard to figure out why the 5800s didn't ship in volume.
 
OGLGuy, I never said that NVidia beat ATI at stencil ops, I merely said they decided to "design for" stencil ops. They have half the pipelines of the R300, but are able to match it in stencil performance. The NV40 presumably will take the same approach: that if you need to write Z/stencil only, you can effectively double your performance per pipeline.

ATI's stencil performance simply comes from having 8 pipelines. There is no extra design effort to increase stencil ops on the R300; it just falls out as a result of slapping a glob of pipelines on the chip. An equivalent NV3x architecture with 8 pipelines would have 2x the stencil performance. Of course, other decisions they made, with respect to .13um, low-k, etc., torpedoed more pipes.

NVidia tried to design a chip that is optimized for stencil. Their architecture works, it just needs more pipes to beat ATI. :) I would not say that they were *wrong* to include a Z/stencil-only mode. We'll see how the R420 matches up against the NV40 if they keep the "zixel" idea around.

Imagine if you designed a car engine that had a special "high efficiency" mode that could double horsepower, but you ended up building such an engine with only 2 cylinders instead of four. It is right to say that the engine was "designed for" this efficiency, but it will still be, at best, only equivalent to a regular 4-cylinder.
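The "zixel" trade-off described above can be sketched numerically. The pipe counts and rates below are the figures as quoted in this thread (a 4-pipe design with a double-rate Z/stencil-only path versus an 8-pipe design with a uniform rate), not vendor specifications:

```python
def samples_per_clock(pipes: int, z_only: bool, double_z_rate: bool) -> int:
    """Samples retired per clock; Z/stencil-only work runs at 2x on
    hardware with a dedicated fast path (the 'zixel' idea)."""
    return pipes * 2 if (z_only and double_z_rate) else pipes

# NV3x-style: 4 pipes with a double-rate Z/stencil-only mode
nv3x_colour = samples_per_clock(4, False, True)   # 4 colour pixels/clock
nv3x_z_only = samples_per_clock(4, True, True)    # 8 zixels/clock

# R300-style: 8 pipes, same rate with or without colour writes
r300_colour = samples_per_clock(8, False, False)  # 8 colour pixels/clock
r300_z_only = samples_per_clock(8, True, False)   # 8 zixels/clock

print(nv3x_colour, nv3x_z_only, r300_colour, r300_z_only)
```

The narrow design matches the wide one only in the Z/stencil-only case, which mirrors the 2-cylinder engine analogy: the special mode recovers parity, not superiority.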
 
DemoCoder said:
OGLGuy, I never said that NVidia beat ATI at stencil ops, I merely said they decided to "design for" stencil ops. They have half the pipelines of the R300, but are able to match it in stencil performance. The NV40 presumably will take the same approach: that if you need to write Z/stencil only, you can effectively double your performance per pipeline.

ATI's stencil performance simply comes from having 8 pipelines. There is no extra design effort to increase stencil ops on the R300; it just falls out as a result of slapping a glob of pipelines on the chip. An equivalent NV3x architecture with 8 pipelines would have 2x the stencil performance. Of course, other decisions they made, with respect to .13um, low-k, etc., torpedoed more pipes.
But you could also say that ATI designed for color and stencil performance since it gives you 8 either way.
NVidia tried to design a chip that is optimized for stencil. Their architecture works, it just needs more pipes to beat ATI. :) I would not say that they were *wrong* to include a Z/stencil-only mode. We'll see how the R420 matches up against the NV40 if they keep the "zixel" idea around.
Here's a couple of questions: How many shipping apps are seeing benefits from NV3x's extra stencil/Z ops? How many shipping apps are suffering due to NV3x's lack of color processing power?
Imagine if you designed a car engine that had a special "high efficiency" mode that could double horsepower, but you ended up building such an engine with only 2 cylinders instead of four. It is right to say that the engine was "designed for" this efficiency, but it will still be, at best, only equivalent to a regular 4-cylinder.
Poor example. The "high efficiency" mode is only available under certain conditions.
 
Well, once you start playing the "how many shipping apps" game, you might as well torpedo most of ATI's DX9 advantages. In another thread, people are touting ATI's PCIE framebuffer read back performance for HDTV video editing, an even more vaporous situation.

I've owned an R300 since they were first shipped, and I am still not enjoying any killer DX9 titles on it. Half-Life 2 never shipped, which is about the only killer worthwhile DX9 game to wait for.

We all know that Doom3 is the major use case for stencil/z performance. DX9 pixel shader performance is only 1/2 of the equation. Future games will need both shader performance, and hi-speed Z/Stencil ops.
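For context on why Doom3-style rendering leans so hard on stencil throughput: each shadow-volume face covering a pixel costs one stencil op (increment for front faces, decrement for back faces), and pixels left with a non-zero count are in shadow. A toy per-pixel tally, assuming simple depth-pass counting and ignoring the depth-test details:

```python
def in_shadow(volume_faces):
    """volume_faces: list of 'front'/'back' shadow-volume faces covering
    the pixel, all lying in front of the visible surface."""
    stencil = 0
    for face in volume_faces:
        stencil += 1 if face == "front" else -1  # one stencil op per face
    return stencil != 0

print(in_shadow(["front", "back"]))  # entered and exited the volume: lit -> False
print(in_shadow(["front"]))          # still inside a volume: shadowed -> True
```

With several lights and overlapping volumes, a pixel can be touched many times per frame, so raw stencil-op rate (not shader power) becomes the bottleneck.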


The whole point of this thread is that NVidia tried something different with the NV30. Their engineers aren't idiots and dumbasses because of it. They shot for low-k, .13um, and a bunch of other weird things, and some worked out, and some didn't.

ATI has a design that works, but it is not the only 3D architecture possible, nor is it the most optimal possible. Hell, a design that had 2 pipelines but ran them at 4GHz would be just as valid.

As for the K6, the design was poor compared to what Intel was selling at the same time. Itanium likewise is a mess. Not a bad idea in theory, but at this point in time, it is simply not cost effective compared to a 64-bit x86 extension.

If you really think that NVidia engineers are dumbasses, I will calmly wait until ATI makes a misstep to remind you to apply the same standards to yourself.
 
I believe nvidia's engineers were dumbasses.

They made a video card that would need to run at a 500MHz clock to compete with a 320MHz-clocked card. What made it even worse was that they could not produce that card in any sane quantity.

Now perhaps they did try a different way of making things. But as we can see, it was the wrong way to do things.
 
They made a video card that would need to run at a 500MHz clock to compete with a 320MHz-clocked card. What made it even worse was that they could not produce that card in any sane quantity.
Given a different perspective, Nvidia (and Intel for that matter) could claim that their architecture was designed with deeper pipelines and longer latencies in mind for the very sake of achieving the high clock speeds necessary to best the competition's instructions-per-clock advantage.
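The clock-versus-width trade being argued here reduces to (work per clock) x (clock rate). Plugging in this thread's colour-pixel figures (4 pipes at 500MHz versus 8 pipes at 325MHz; illustrative, not measured numbers) shows why the narrow design needed every bit of its clock headroom:

```python
def fill_mpix(pixels_per_clock: int, core_mhz: int) -> int:
    """Peak colour fillrate in megapixels per second."""
    return pixels_per_clock * core_mhz

nv30 = fill_mpix(4, 500)  # narrow but fast
r300 = fill_mpix(8, 325)  # wide but slower

print(nv30, r300)  # 2000 vs 2600 Mpix/s: the wider, lower-clocked design leads
```

The deep-pipeline bet only pays off if the process can actually deliver the clock scaling, which is exactly the objection raised in the reply below.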
 
DemoCoder said:
Well, once you start playing the "how many shipping apps" game, you might as well torpedo most of ATI's DX9 advantages. In another thread, people are touting ATI's PCIE framebuffer read back performance for HDTV video editing, an even more vaporous situation.

I've owned an R300 since they were first shipped, and I am still not enjoying any killer DX9 titles on it. Half-Life 2 never shipped, which is about the only killer worthwhile DX9 game to wait for.
TRAOD takes good advantage of the R300 architecture. Whether the game is good or not is another matter.

But in any event, the R300 does well with any shaders as it's always doing 8 colored pixels per cycle. Even single textured polygons run at full speed. Can you say that about NV3x?
We all know that Doom3 is the major use case for stencil/z performance. DX9 pixel shader performance is only 1/2 of the equation. Future games will need both shader performance, and hi-speed Z/Stencil ops.
My question was legitimate. 3DMark 2003 takes advantage of stencil ops in several tests. Looking at GT3 performance, I see that the higher-clocked 5900 Ultra lags behind the R360. Where's the alleged architectural advantage the NV3x has? I'm even giving nvidia the benefit of looking at non-Futuremark-approved drivers!
The whole point of this thread is that NVidia tried something different with the NV30. Their engineers aren't idiots and dumbasses because of it. They shot for low-k, .13um, and a bunch of other weird things, and some worked out, and some didn't.
My point wasn't to put down the NV30 or nvidia, but simply to question your conclusions.
ATI has a design that works, but it is not the only 3D architecture possible, nor is it the most optimal possible. Hell, a design that had 2 pipelines but ran them at 4GHz would be just as valid.
Wow, I never knew that! :LOL:
As for the K6, the design was poor compared to what Intel was selling at the same time. Itanium likewise is a mess. Not a bad idea in theory, but at this point in time, it is simply not cost effective compared to a 64-bit x86 extension.
The K6 ran integer operations quite well compared to Intel offerings.
If you really think that NVidia engineers are dumbasses, I will calmly wait until ATI makes a misstep to remind you to apply the same standards to yourself.
Now you're putting words into my mouth.
 
DemoCoder said:
If you really think that NVidia engineers are dumbasses, I will calmly wait until ATI makes a misstep to remind you to apply the same standards to yourself.

I don't mean to speak for OpenGL guy... but I just want to say: where did OpenGL guy say or imply that nVidia's engineers are dumbasses?! It's crap like that which you put into people's mouths that puts me off. You like to toss accusations of "nVidia hatred" around here, and yet I just see you reading into a lot of crap that people don't say.

From where I sit....again....all he said was "what's so great about nVidia's stencil?"

What's so advanced about an architecture that has half the pipes w/2 stencil ops per pipe, compared to an architecture that has double the pipes with 1 stencil op per pipe?
 
Luminescent said:
They made a video card that would need to run at a 500MHz clock to compete with a 320MHz-clocked card. What made it even worse was that they could not produce that card in any sane quantity.
Given a different perspective, Nvidia (and Intel for that matter) could claim that their architecture was designed with deeper pipelines and longer latencies in mind for the very sake of achieving the high clock speeds necessary to best the competition's instructions-per-clock advantage.
Yet it only makes sense to do so when you can create tech that can scale that high. The NV30 could not do it.
 