3:8 vertex to pixel shader ratio in X800 and NV40 - coincidence?

They will be equally busy if the resolution is also halved.

Keeping the same number of vertex pipelines allows you to run the same software at the same speed, just at a smaller resolution.
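
To put some rough numbers on that (the resolutions, pipeline counts and per-frame triangle count below are purely illustrative assumptions, not measurements):

# Back-of-the-envelope sketch: halving the pixel pipelines stays balanced
# if the resolution is also roughly halved, because the vertex load depends
# on the scene, not on the resolution. All figures are made up for illustration.

tris_per_frame = 500_000            # set by the game, identical on both cards
high_end_pixels = 1600 * 1200       # 16-pipe card at a high resolution
mainstream_pixels = 1152 * 864      # 8-pipe card at roughly half the pixels

# Pixel work per pipe per frame ends up about the same on both chips...
print(high_end_pixels / 16)         # 120000 pixels per pipe
print(mainstream_pixels / 8)        # 124416 pixels per pipe

# ...while the vertex work per unit is identical, so cutting the vertex
# units instead would leave the mainstream chip geometry-limited at any resolution.
print(tris_per_frame / 6)           # same per-unit load on both chips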

This seems very reasonable: your game can run acceptably, even well, on a mainstream card, but if you want to use a high resolution or a high AA mode, you have to pay for the high-end model.

True, but up to now the mainstream cards have always had fewer vertex units as well, both for ATi and NVIDIA. And 3 would still be sufficient for most stuff, since the R3x0's 4 units are generally overkill. The RV3x0's 2 units generally manage quite well.
 
Scali said:
Well, if it is indeed a FireGL, that's a different story then. I thought we were talking about a mid-range consumer card.
I thought we were talking about a chip that targets both markets.

Scali said:
True, but up to now the mainstream cards have always had fewer vertex units as well, both for ATi and NVIDIA.
NV36
 
I thought we were talking about a chip that targets both markets.

I wasn't. I was talking about X800SE/X700/whatever the new midrange 8-pipeline chip will be called (the one to replace the current RV3x0 series).
Which obviously would target the mainstream market.
Now if ATi is actually going to target a new area for FireGL (less fillrate, same vertex-processing power), and also sell it as a mainstream part, that's a completely different story. Under no circumstances would I say that it was AIMED at this mainstream market, though.
It would just be marketed as such because it is cheaper than designing yet another chip.
And who knows, perhaps the mainstream parts will be faulty FireGLs with some vertex units disabled :)


Erm...? How many vertex shader units do the NV3x series have anyway? I've only been able to find some vague info about some kind of 'array' configuration.
Regardless, NV36 does NOT have 6 units, that's for sure. It doesn't even have 4, I suppose (I recall that some claimed the 5800/5900 were equal to 3 units or so).
 
Scali said:
I wasn't. I was talking about X800SE/X700/whatever the new midrange 8-pipeline chip will be called (the one to replace the current RV3x0 series).
Which obviously would target the mainstream market.
Now if ATi is actually going to target a new area for FireGL (less fillrate, same vertex-processing power), and also sell it as a mainstream part, that's a completely different story. Under no circumstances would I say that it was AIMED at this mainstream market, though.
I would.
Obviously they don't design chips and then decide afterwards in which market the chip might sell.
Just like NVidia always made compromises in the designs to suit both GeForce and Quadro markets, I'm sure ATI does the same now since they acquired the FireGL brand.


Erm...? How many vertex shader units do the NV3x series have anyway? I've only been able to find some vague info about some kind of 'array' configuration.
Regardless, NV36 does NOT have 6 units, that's for sure. It doesn't even have 4, I suppose (I recall that some claimed the 5800/5900 were equal to 3 units or so).
This was an example of a mainstream chip not having fewer vertex units than the high-end part. Yes, it has fewer vertex units than the RV410, but it also has less fillrate.
 
From B3D's NV36 preview:

Although the pixel pipeline configuration remains the same as NV31, albeit with the replacement of the integer units for float ALU's, the entire vertex engine from NV30 / NV35 has been lifted and included in the NV36 chip for GeForce FX 5700.

I assume NV did this for the same reason that ATI are keeping the 6 Vertex units in the RV410 i.e. the professional market. :)
 
I would.
Obviously they don't design chips and then decide afterwards in which market the chip might sell.
Just like NVidia always made compromises in the designs to suit both GeForce and Quadro markets, I'm sure ATI does the same now since they acquired the FireGL brand.

In general this is true. But in this case there obviously was no compromise made for the mainstream market, and the chip seems to be solely aimed at the professional market. It just happens to be marketable as a mainstream chip as well (which of course is not a coincidence, but was planned).
But it's the same as saying that e.g. HTT or EM64T was designed for the mainstream market. It's not; it is mainly for the Xeon, but since it's already in the chip design anyway, it's a nice bonus for mainstream users. Most of them won't get that much use out of it though.
 
I assume NV did this for the same reason that ATI are keeping the 6 Vertex units in the RV410 i.e. the professional market.

I assume they did this because their NV30/35 was already underpowered in the vertex department anyway.
 
Scali said:
Now if ATi is actually going to target a new area for FireGL (less fillrate, same vertex-processing power), and also sell it as a mainstream part, that's a completely different story. Under no circumstances would I say that it was AIMED at this mainstream market, though.

The majority of sales will always be through the consumer channel, so that is the primary design target. You can make opportune judgments, dependent on die and other considerations, to make choices that will benefit other areas of your business if it doesn't impact overall costs for the consumer market significantly, and it will still give some benefits there as well.
 
You can make opportune judgments, dependent on die and other considerations, to make choices that will benefit other areas of your business

Exactly, so you AIM at that. The other was already a given (you are working on a design based on a mainstream part, neither NVIDIA nor ATi make true professional parts).
 
Scali said:
I assume NV did this for the same reason that ATI are keeping the 6 Vertex units in the RV410 i.e. the professional market.

I assume they did this because their NV30/35 was already underpowered in the vertex department anyway.

That doesn't sound right...

NV30/35 and NV36 (GFFX 5700) all had a 3-way SIMD VS, unlike the GFFX 5600, which had fewer VS units.

The 5600 was obviously deemed too weak compared to the competition, and thus its successor got the full VS configuration of the high-end models.

Underpowered? Why?

http://www.beyond3d.com/previews/nvidia/nv40/index.php?p=18

http://www.beyond3d.com/previews/nvidia/nv40/index.php?p=19

http://www.beyond3d.com/reviews/ati/r420_x800/index.php?p=14

http://www.beyond3d.com/reviews/ati/r420_x800/index.php?p=15

With the NV4x line it's obviously an entirely different story, since it's no longer about VS2.0+ SIMDs but rather VS3.0 MIMDs, and the hardware cost of the latter is obviously a lot higher than that of the former.

I don't see why it's so irrational for the RV410 to have all 6 VS units of its big brothers.

Exactly, so you AIM at that. The other was already a given (you are working on a design based on a mainstream part, neither NVIDIA nor ATi make true professional parts).

Their parts for the professional market do not, however, seem to have a bad price/performance ratio at all; rather the contrary. What's a "true" professional part for you anyway? Both IHVs have been aiming for quite some time now to stay very competitive in the professional market, and they seem to be succeeding too. One point to pursue might be what IHVs such as 3DLabs have planned for their mainstream professional parts, considering what a "VS-monster" the P20 turned out to be.
 
Ailuros said:
NV30/35 and NV36 (GFFX 5700) all had a 3-way SIMD VS, unlike the GFFX 5600, which had fewer VS units.

The 5600 was obviously deemed too weak compared to the competition, and thus its successor got the full VS configuration of the high-end models.

There was a post by Uttar mentioning that the 5600 has all 3 VS units on die; they just don't work.

OT: are there any advantages in designing VS units in 3x arrays, as NV did for the last two generations and ATI does now?
 
Underpowered? Why?

We're talking about 5800/5900 Ultra right?
Well, look at your own links.
We see 180/200 mtris/sec for these cards.
Compare that to 9700/9800 Pro, their direct competitors.
We see 275/325 mtris/sec.
Now why on earth would one think they are underpowered?!

What's a "true" professional part for you anyway?

In short, 3dlabs stuff. A card that is designed solely for professional applications, where no considerations for games were made whatsoever.
That's not to say that 3dlabs is better, but they have the true professional mindset. The rest are just trying to get the best of both worlds, at a decent price (another thing that 3dlabs doesn't consider at all).
 
We're talking about 5800/5900 Ultra right?
Well, look at your own links.
We see 180/200 mtris/sec for these cards.
Compare that to 9700/9800 Pro, their direct competitors.
We see 275/325 mtris/sec.
Now why on earth would one think they are underpowered?!

Go back and check the results from synthetic applications.

I meant to point at those and not theoretical throughput numbers. Good luck looking for those hundreds of millions of tris for both anyway. Maybe with wireframe or something damn near wireframe.


In short, 3dlabs stuff. A card that is designed solely for professional applications, where no considerations for games were made whatsoever.
That's not to say that 3dlabs is better, but they have the true professional mindset. The rest are just trying to get the best of both worlds, at a decent price (another thing that 3dlabs doesn't consider at all).

Great. Then the same question again: what if 3DLabs has planned a very high number of VS units for future P20-based mainstream parts? Would you still think that ATI's or any other IHV's decision to keep as many VS units on board as possible is unreasonable?
 
Ailuros said:
I meant to point at those and not theoretical throughput numbers. Good luck looking for those hundreds of millions of tris for both anyway. Maybe with wireframe or something damn near wireframe.

Wireframe is on the order of 3-5 times SLOWER on modern gaming-oriented GPUs; they basically emulate it by drawing two skinny triangles for each triangle edge. Depending on the GPU, one or two textures and vertex colors are basically "free" (of course, until you hit memory bandwidth limits).
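
Roughly what that expansion looks like, as a toy 2D sketch (the quad width and the exact vertex layout here are my own illustrative assumptions, not how any particular driver actually does it):

# Toy sketch of the wireframe emulation described above: each triangle edge
# becomes a thin quad (two skinny triangles), so one filled triangle turns
# into 3 edges * 2 triangles = 6 triangles, plus the extra vertex work.

def edge_quad(a, b, width=0.01):
    """Expand edge (a, b) into two thin triangles forming a quad."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length = (dx * dx + dy * dy) ** 0.5 or 1.0   # guard against zero-length edges
    ox, oy = -dy / length * width, dx / length * width
    a0, a1 = (a[0] + ox, a[1] + oy), (a[0] - ox, a[1] - oy)
    b0, b1 = (b[0] + ox, b[1] + oy), (b[0] - ox, b[1] - oy)
    return [(a0, a1, b0), (a1, b1, b0)]

def wireframe(tri):
    """One filled triangle -> six wireframe triangles."""
    a, b, c = tri
    return edge_quad(a, b) + edge_quad(b, c) + edge_quad(c, a)

print(len(wireframe(((0.0, 0.0), (1.0, 0.0), (0.0, 1.0)))))  # 6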

Theoretical throughput is far from what you get in a full-blown game, but it's entirely possible to get within 50-85% of it in a simple game-like scenario, e.g. a simplistic terrain renderer with a couple of meshes replicated thousands of times on top for good measure. With instancing, if possible.
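
For a sense of scale, using the 325 MTris/sec peak quoted earlier in the thread (the frame rate and scene composition below are hypothetical numbers picked just for illustration):

# Rough arithmetic on what reaching 50-85% of theoretical throughput means per frame.
# Only the 325 Mtris/s peak comes from the figures quoted earlier (9800 Pro class);
# everything else is an illustrative assumption.

peak_tris_per_sec = 325e6
target_fps = 60.0
budget = peak_tris_per_sec / target_fps       # ~5.4 million triangles per frame

# Hypothetical simple scene: a dense terrain patch plus one small mesh
# instanced a few hundred times (cheap to submit with instancing).
terrain_tris = 1_000_000
instanced_tris = 5_000 * 500                  # 2.5 million
scene_tris = terrain_tris + instanced_tris    # 3.5 million

print(f"{scene_tris / budget:.0%} of the per-frame budget")   # roughly 65%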
 
Go back and check the results from synthetic applications.

I meant to point at those and not theoretical throughput numbers. Good luck looking for those hundreds of millions of tris for both anyway.

Here you go:

http://graphics.tomshardware.com/graphic/20030512/geforce_fx_5900-27.html

Just because you don't notice a difference in most software doesn't mean there isn't a difference.
That's the entire point. As said before, the Radeons actually had too much power. The GeForce had less, but it was still enough for actual software. But their low-end models had too little, so they opted to give it as much power as the high-end models (probably cheaper to just copy-paste the unit than to try and design an in-between thing). Simple, no?

Great. Then the same question again: what if 3DLabs has planned a very high number of VS units for future P20-based mainstream parts? Would you still think that ATI's or any other IHV's decision to keep as many VS units on board as possible is unreasonable?

Yes. The point is that mainstream software doesn't use these units, so for mainstream apps this is a waste at this time. And I have yet to see a true mainstream card from 3dlabs. Last time I looked, they didn't even have driver support for Direct3D at all. And let's not get started on the price range of their cards.
 

So VS1.1 performance is all that matters on VS2.0 or VS2.0+ VS units? Never mind that 3DMark01 VS1.1 tests are already present in the B3D links above; there's more than that one test in there.

How about trying to get comparative percentages from all the tests and throwing them through an average?

Just because you don't notice a difference in most software doesn't mean there isn't a difference.

I'm absolutely certain that last year's games were entirely overloaded with VS1.1 calls and there wasn't a single pure T&L call in them.

That's the entire point.

Is it? Of course if one merely concentrates on singled out aspects, probably yes.

As said before, the Radeons actually had too much power. The GeForce had less, but it was still enough for actual software.

In the grander scheme of things yes, yet the most important differences were elsewhere IMHO.

But their low-end models had too little, so they opted to give it as much power as the high-end models (probably cheaper to just copy-paste the unit than to try and design an in-between thing). Simple, no?

I still don't know how many VS units NV31 had, or if they were simply not all functional (as noted above). If the former theory is true and it contained a somewhat broken (for whatever reason) full 3-way SIMD, then I don't even know whether 1 or 2 were operational after all, and to what degree. The fact still remains that those types of units (VS2.0/VS2.0+) are quite cheap in comparative terms to scale, whether you'd want to acknowledge it or not.

Yes. The point is that mainstream software doesn't use these units, so for mainstream apps this is a waste at this time. And I have yet to see a true mainstream card from 3dlabs. Last time I looked, they didn't even have driver support for Direct3D at all. And let's not get started on the price range of their cards.

Then I'm making things up and I dreamed about the P9. Granted, I don't have a clue what 3DLabs is planning to do for cheaper models of their new generation of products, but it's not an entirely unreasonable thought or consideration.

Oh yes, the obligatory link:

http://www.digit-life.com/articles2/profcards/1review-3dlabs-wildcats.html

As for the price range of the P9, I think the link covers that one too.
 
So VS1.1 performance is all that matters on VS2.0 or VS2.0+ VS units? Never mind that 3DMark01 VS1.1 tests are already present in the B3D links above; there's more than that one test in there.

You said that they wouldn't get near their theoretical triangle limit. Yet both do; in fact, the GeForce is even closer. It's just that the Radeon's limit is that much higher, so it's still far ahead.

I'm absolutely certain that last year's games were entirely overloaded with VS1.1 calls and there wasn't a single pure T&L call in them.

What does that have to do with anything? I merely said that the Radeon has a much higher triangle throughput, or actually that the GeForce was underpowered in comparison. Then you say they don't get near their theoretical limit, so I show some figures that indicate that they do. Who cares if they use VS1.1, 2.0 or fixed T&L? Especially on the Radeon it doesn't matter, since everything uses the same unit. I'm not too sure how the GeForce does it; I believe it still had a separate fixed-function unit.

Is it? Of course if one merely concentrates on singled out aspects, probably yes.

We WERE concentrating solely on the vertex-processing aspect, yes.

In the grander scheme of things yes, yet the most important differences were elsewhere IMHO.

You mean the completely horrible PS2.0 implementation? Yes, but that's not what we are discussing now.

The fact still remains that those types of units (VS2.0/VS2.0+) are quite cheap in comparative terms to scale, whether you'd want to acknowledge it or not.

If it is a fact, there must be proof. I don't have to acknowledge anything that isn't proven. So come on, present the transistor counts of all the parts then. Else it's your word against mine, and I will continue to say that since pixel shader units are less complex than vertex shader units, they require fewer transistors, which of course makes perfect sense.

The card ships in a proprietary package from which you can tell that it's not a gaming solution. The austere, stylish box in 3Dlabs' colors shows a photo of the card inside and a brief description of it.

Gee, interesting. Also the fact that they don't test against any gaming cards from ATi or NVIDIA, and don't bother to test any games.
Apparently that card is nothing more than a low-budget professional card. I wouldn't be surprised if it didn't actually have Direct3D drivers.
In short, this is NOT a mainstream card, but aimed solely at professional applications, quite unlike the ATi and NVIDIA cards.
 
In short, this is NOT a mainstream card, but aimed solely at professional applications, quite unlike the ATi and NVIDIA cards.

No, it will be a mainstream card, with opportune additions that will benefit other markets as well as assist its primary market in some cases. The vast majority of its sales will go through the desktop channel.
 
No, it will be a mainstream card, with opportune additions that will benefit other markets as well as assist its primary market in some cases. The vast majority of its sales will go through the desktop channel.

How many OEMs will install such a card, especially when it is more expensive than ATi/NVIDIA alternatives, and doesn't support Direct3D?
There's no mainstream market for it, whether 3dlabs wants it or not.
 