nv30 vs r300 article

sancheuz · Oct 28, 2002

I must praise the makers of this article for a well constructed and unbiased comparison and contrasting of the nv30 and r300.
http://www.beyond3d.com/articles/nv30r300/

Mephisto · Oct 28, 2002

sancheuz said:
I must praise the makers of this article for a well constructed and unbiased comparison and contrasting of the nv30 and r300.
http://www.beyond3d.com/articles/nv30r300/

Normally I'm the one missing the news threads, but this time ... =)

http://www.beyond3d.com/forum/viewtopic.php?t=2923

Hellbinder · Oct 28, 2002

Information from NV30 presentations undoubtedly indicates that NV30 provides 16 texture units, and 8 pipelines also get indirect confirmation. So combining with the information from the diagram, it is obvious that NV30 has 8 pipelines, each similar to that of NV2x, plus a fragment program processing unit. So NV30 has a same pixel fillrate and much higher texel fillrate with R300 for clock to clock.

You conclude that the NV30 has 8 pipelines based on what? A 4x4 layout still has the required 16 texture units, and also fits better with your NV30 OpenGL diagram imo.

EDIT:

In fact i am now convinced.. look at the OpenGL diagram, the block section for conventional Texture processing. It says....

Texture unit 0
******
Texture Unit 3

which clearly indicates to me 4 texture units per pipeline. Thus 4 pipes with 4 Texture units each.
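For reference, the two layouts being argued over can be compared with some quick arithmetic. This is only a sketch: the `fill_rates` helper and the 100 MHz clock are made up for illustration, and the two NV30 rows are the guesses from this thread, not confirmed specs. The R300 row uses its 8 pipelines with one texture unit each.

```python
def fill_rates(pipelines, tmus_per_pipe, clock_mhz):
    """Return (pixel_fillrate, texel_fillrate) in millions per second."""
    pixel = pipelines * clock_mhz                   # one pixel per pipe per clock
    texel = pipelines * tmus_per_pipe * clock_mhz   # one texel per TMU per clock
    return pixel, texel


CLOCK_MHZ = 100  # hypothetical clock, the same for every row ("clock for clock")

layouts = [
    ("NV30 guess 8x2", 8, 2),
    ("NV30 guess 4x4", 4, 4),
    ("R300 8x1", 8, 1),
]

for name, pipes, tmus in layouts:
    pixel, texel = fill_rates(pipes, tmus, CLOCK_MHZ)
    print(f"{name}: {pixel} Mpixels/s, {texel} Mtexels/s")
```

Both NV30 guesses come out with the same texel rate, since each has 16 texture units in total. Only the 8x2 guess matches R300's pixel rate at the same clock, which is what the quoted passage from the article claims.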

Xmas · Oct 28, 2002

Do not confuse logical with physical TMUs. GeForce3/4 have four logical TMUs while having only 2 physical TMUs per pipe. NVidia may be exposing only 4 logical TMUs in the "conventional" texture processing stage because older software (esp. if it uses NV_texture_shader) doesn't use more and newer software is supposed to use the new fragment shading extensions.

Just a note, Kyro also exposes only 4 logical TMUs in OpenGL, but 8 in DX.
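To make the logical vs. physical distinction concrete, here is a rough sketch. It assumes a pipe simply loops back until every logical texture stage in use has been sampled, which is a simplification; `clocks_per_pixel` is a made-up helper, not anything from a driver or API.

```python
import math


def clocks_per_pixel(textures_used, physical_tmus):
    """Clocks one pipe needs for one pixel, assuming it loops back
    through its physical TMUs until all textures are sampled."""
    return max(1, math.ceil(textures_used / physical_tmus))


# GeForce3/4 style: 4 logical TMUs exposed, 2 physical TMUs per pipe
for used in range(1, 5):
    print(used, "textures ->", clocks_per_pixel(used, 2), "clock(s)")
```

So the number of texture units an API exposes says how many textures a pass may use, not how many are sampled per clock.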

Hellbinder · Oct 29, 2002

Can I get more people's input on this, please? Does everyone else here agree with Xmas?

KimB · Oct 29, 2002

Hellbinder[CE] said:
In fact i am now convinced.. look at the OpenGL diagram, the block section for conventional Texture processing. It says....

Texture unit 0
******
Texture Unit 3

which clearly indicates to me 4 texture units per pipeline. Thus 4 pipes with 4 Texture units each.

Look again. That's for legacy support. Basically, that's the GeForce4 pipeline you're seeing.

Regardless, I do disagree with Zephyr on the points about the specifics of the NV30's pipelines. There has been no concrete information anywhere, as everything that's been released has been about the hardware side. Since there's no necessary correlation between the number of textures supported and the number of textures per pixel pipeline, we have no information to go on.

Zephyr · Oct 29, 2002

Chalnoth said:
Regardless, I do disagree with Zephyr on the points about the specifics of the NV30's pipelines. There has been no concrete information anywhere, as everything that's been released has been about the hardware side. Since there's no necessary correlation between the number of textures supported and the number of textures per pixel pipeline, we have no information to go on.

I only can say that it got an indirect confirmation.

MDolenc · Oct 29, 2002

The article is buggy...

Evildeus · Oct 29, 2002

MDolenc said:
The article is buggy...

Hmm

Dave Baumann · Oct 29, 2002

The article is buggy...

When doing an article of such a size and nature it's bound to have a few errors in there - we have had this checked and verified by 'certain people' who would know about the respective architectures, and even after that we've found out more things. However, that does not mean it's set in stone, and should elements require it, we probably will revise it.

So, please, tell us...

KimB · Oct 29, 2002

Zephyr said:
I only can say that it got an indirect confirmation.

Interesting.

Anyway, the one thing that you noted that seems significantly less-likely than an 8x2 architecture is the 3 vertex processing pipelines. Given what we've seen in the past (that non-power-of-two things in the computing world are very rare), it makes more sense that the "1.5x vertex processing power" quote was a result of something other than three vertex processing pipelines.

It could be that the architecture uses four less-efficient pipelines (possibly just due to bus bandwidth or other similar aspects), or two more-efficient pipelines. I think three pipelines is less likely.

Zephyr · Oct 29, 2002

Chalnoth said:
Anyway, the one thing that you noted that seems significantly less-likely than an 8x2 architecture is the 3 vertex processing pipelines. Given what we've seen in the past (that non-power-of-two things in the computing world are very rare), it makes more sense that the "1.5x vertex processing power" quote was a result of something other than three vertex processing pipelines.

It could be that the architecture uses four less-efficient pipelines (possibly just due to bus bandwidth or other similar aspects), or two more-efficient pipelines. I think three pipelines is less likely.

My speculation, three vertex processing units, also got a confirmation. What I can say is the vertex processor in NV30 does have a special architecture. We will see the true answer soon.

MDolenc said:
The article is buggy...

Any indication of the bug is welcome...

Luminescent · Oct 29, 2002

Do you guys really think that the vertex processors and pixel processors on the NV30 will share some logic? I find that unlikely, since it would not fully exploit the parallelism possible in a streaming data processor with 3 vertex processors and, possibly, 8 pixel processors. Each pipeline should have individual units, even for the higher precision functions (e.g. Radeon 9700 - scalar and vector units in parallel). Does anyone have any more information about processor sharing within the NV30?

Some speculate that logic is shared because of the small difference in transistor count between the NV30 and Radeon 9700, but we should remember that the 9700 allocates logic to the encoding and decoding of memory formats, such as 3D float textures. The 9700 also has 4 vertex pipelines. These extra convenience features may not be found on the NV30.

Reverend · Oct 29, 2002

Any indication of the bug is welcome...

As I said in the news post thread, there is some misinformation. Most of the "bugs" have to do with your comments re DX9b2.1 (and, less so, how it relates to NV30/R300)... since Wavey said this is "verified" and hence "accurate", you may want to re-check or re-verify with whoever your sources are.

I'm sure Kristof didn't go through this article (or didn't go through in-depth), otherwise this would not have happened.

John Reynolds · Oct 29, 2002

Reverend said:
As I said in the news post thread, there is some misinformation. Most of the "bugs" have to do with your comments re DX9b2.1 (and, less so, how it relates to NV30/R300)... since Wavey said this is "verified" and hence "accurate", you may want to re-check or re-verify with whoever your sources are.

I'm sure Kristof didn't go through this article (or didn't go through in-depth), otherwise this would not have happened.

You might want to double-check your sources on what makes you so "sure" that Kristof didn't go through the article.

Regardless, these petty, condescending insults need to stop, Rev.

Kristof · Oct 29, 2002

If something is wrong or incorrect please provide a correction and I am sure that Wavey will take the time to go over it and update the text if found to be necessary.

K-

MDolenc · Oct 29, 2002

(The instruction slots listed in the above table are minimum counts required to meet specification, higher instruction counts can be exposed through the DX Caps mechanism. Current indications from DX9 Beta3 are that the minimum number of instruction slots for VS3.0 is 512 - Ed.)

Really? Are you sure you didn't mix this up a little? I also always thought that vertex shaders 1.x are already floating point, aren't they? And what kind of "per channel masking" do NV30 and VS3.0 have that others can't match? VS1.x can do any kind of swizzle and can mask destination registers, so what more can NV30 and VS3.0 do? And you also got a bit messy with call nesting and dynamic and static flow control, didn't you?

Next page...
Any hardware that wants to support VS2.0 will have to provide at least 12 temp registers, 16 integer registers and 16 boolean registers (that's 44 registers at any time), so NV30 will not even be a VS2.0 part? And that "sampler" row is quite a laugh. It's not a bug it's a feature! Or was it the other way around?

When both _abs and negate (-) are present, the _abs happen first in NV30 and PS3.0.

Aren't we still talking about VS here?
Pixel shaders...
When will we see drivers from ATI that will expose 160 instruction slots?
And how many constants and instruction slots will NV30 expose? They said somewhere that each constant costs one instruction slot, so the instruction count can vary greatly depending on how many constants we use, right?

It seems that NV30 does NOT support MRT

Then why does Cg expose this directly?
Next page...
Didn't we come to the conclusion some time ago that Radeon 9700 doesn't support arbitrary swizzles? Abs in PS3.0 would also be cool...

He said the big improvements won't necessarily be in the number of pixels/sec (though they will increase of course), but rather in the quality or "intelligence" of those pixels.

For those that haven't yet figured out what to make out of that statement:
Radeon 9700 can plot 2600 million pixels per second. A 1600x1200 display mode has 1.92 million pixels. How many times can you redraw the whole screen? One of the major advantages NV30 will have over Radeon 9700 will be its computational power and its ability to execute two half precision instructions instead of one at full precision. It's "how fast can you do a 32 instruction shader" now and not "how many single texture pixels can you plot" anymore.
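The redraw question above is simple to work out. A quick sketch using the two figures from the post; the helper name and the 60 Hz refresh rate are assumptions added for illustration.

```python
def full_screen_redraws_per_second(pixels_per_second, width, height):
    """How many times per second the whole screen could be filled."""
    return pixels_per_second / (width * height)


FILLRATE = 2600e6          # pixels per second, figure from the post
WIDTH, HEIGHT = 1600, 1200

redraws = full_screen_redraws_per_second(FILLRATE, WIDTH, HEIGHT)
print(f"{WIDTH * HEIGHT / 1e6:.2f} million pixels per screen")
print(f"about {redraws:.0f} full-screen redraws per second")

REFRESH_HZ = 60            # assumed target frame rate, not from the post
print(f"about {redraws / REFRESH_HZ:.1f} screen fills per frame at {REFRESH_HZ} Hz")
```

That works out to roughly 1350 full-screen fills per second, far more than any plain single-texture scene needs, which is why per-pixel shader throughput becomes the more interesting number.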

Dave Baumann · Oct 29, 2002

I'll let Zephyr go over most of these, but...

When will we see drivers from ATI that will expose 160 instruction slots?

I don't understand the point of this - whether or not the drivers expose it, that doesn't mean the hardware isn't capable.

Then why does Cg expose this directly?

The article states that MRT could be supported under NV30 by pack/unpack - Cg just exposes the functionality; it's up to the compiler to understand how to do it on the hardware.

MDolenc · Oct 29, 2002

DaveBaumann said:
I don't understand the point of this - whether or not the drivers expose it, that doesn't mean the hardware isn't capable.

And what would that be good for?

DaveBaumann said:
Then why does Cg expose this directly?

The article states that MRT could be supported under NV30 by pack/unpack - Cg just exposes the functionality; it's up to the compiler to understand how to do it on the hardware.

My bad... Cg does not expose this...

Dave Baumann · Oct 29, 2002

MDolenc said:
And what would that be good for?

Eh? If you want to look at it like that then why 1024 for NV30?

However, as I said, the spec only states the minimum number of instructions etc. needed to meet it and provides no upper bound, so through the caps mechanism the PS instruction cap for R300 can state 160 while NV30's can state 1024.

Regardless, this is not an inaccuracy since the hardware does support this number of instructions.
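The "minimum with no upper bound" idea can be pictured with a small sketch. Nothing here is a real Direct3D call: `caps_report` is a made-up helper, and the `SPEC_MINIMUM` value is a placeholder assumption that should be checked against the DX9 documentation. The 160 and 1024 figures are the ones quoted in this thread.

```python
def caps_report(exposed_slots, spec_minimum):
    """Describe how an exposed instruction-slot cap relates to the spec minimum."""
    if exposed_slots < spec_minimum:
        return "below spec"
    if exposed_slots == spec_minimum:
        return "meets spec exactly"
    return f"exceeds spec by {exposed_slots - spec_minimum}"


SPEC_MINIMUM = 96  # placeholder for the PS2.0 minimum, an assumption here

# Slot counts as quoted in the thread, not values read from any driver
for part, slots in [("R300", 160), ("NV30", 1024)]:
    print(part, "->", caps_report(slots, SPEC_MINIMUM))
```

Either part would pass a check against the minimum; how much more a driver actually reports is a separate question from what the hardware can do.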

My bad... Cg does not expose this...

NP. Seems that MRT support on NV30 is slightly contentious as I've just had someone saying that NV30 does not support it at all.
