3:8 vertex and pixel shader in X800 and NV40 - coincidence?

A quick read of just about any X800 material would tell you it's got 6 vertex shaders.

Erm, I meant the yet unreleased X800 (SE?) card? Or is that one not going to come?
X800 Pro/XT have 6 vertexshaders, I know that much.
 
Scali said:
A quick read of just about any X800 material would tell you it's got 6 vertex shaders.

Erm, I meant the yet unreleased X800 (SE?) card? Or is that one not going to come?
X800 Pro/XT have 6 vertexshaders, I know that much.

All indications are that X800SE is just an R420 core that keeps its 6 vertex shaders but has 2 of its quads disabled. It's a Dell-only OEM part that has nothing to do with RV410, ATI's upcoming mid-range offering.

Latest rumours have RV410 (X700/Pro/XT) designed with the full complement of 6 vertex shaders also, but without the 256-bit memory bus. :(
 
If all new Radeons will then have 6 vertexshader units, that would be the opposite of what was said before?
Namely, effectively the throughput will have improved over R3x0 after all, then.

I also find that quite unlikely... The vertexshaders are the most complex processing unit in the entire chip, they would be the first to go in a budget model. ATi has always cut vertexshaders in the past.
 
The vertexshaders are the most complex processing unit in the entire chip, they would be the first to go in a budget model.

Vertex shaders are cheap in comparison to the pixel shader pipeline: each has only one vector and one scalar ALU, and with no requirement for texture processing (in VS2.0) or the associated caches, their silicon budget is much smaller.
 
Vertex shaders are cheap in comparison to the pixel shader pipeline: each has only one vector and one scalar ALU, and with no requirement for texture processing (in VS2.0) or the associated caches, their silicon budget is much smaller.

But these units are more complex (more instructions, more registers, more precision), so no doubt a vertexshader takes more silicon than a pixelshader (and I do not count texture units as parts of pixelshader units).
Cutting an entire pixel pipeline may save more silicon, but why stop there if you can also cut a vertex pipeline? Especially since the vertex pipelines now have fewer pixel pipelines to feed.
I would find it much more logical if they just cut both parts in half, not only the pixelpipelines.
 
Scali said:
(and I do not count texture units as parts of pixelshader units).

They are an integral part of the fragment pipeline!

Cutting an entire pixel pipeline may save more silicon, but why stop there if you can also cut a vertex pipeline?

Because you might be targeting other, specific, markets with it.
 
They are an integral part of the fragment pipeline!

Yes and no.
You can have as many texture units per pipeline as you want, although nobody would want less than 1 today :)
That's why they're called units. A pipeline is not a unit, it is a set of units.
So let's say we have the following units:

vertexshader unit
pixelshader unit
texture unit

Then the vertexshader unit is either the most complex one, or second to the texture unit. That probably depends mostly on the amount of cache in the texture unit. At any rate, it's a good way to cut transistors, after you've already cut down the texture units to a minimum (one per pipeline, and only one quad).

Because you might be targeting other, specific, markets with it.

If I understood correctly, it was aimed at the budget OEM market. Wouldn't make a lot of sense to offer that much vertexprocessing power that they cannot even use in most software. Since it's supposed to be budget, they should just cut the units, since they are only adding to the cost of the card, and not to the performance in most software.
And ATi is known to cut vertexshaders on budget chips. They also cut part of HyperZ, for the same reason: the cost in transistors doesn't justify the performance gain in a budget chip.
 
Scali said:
If I understood correctly, it was aimed at the budget OEM market. Wouldn't make a lot of sense to offer that much vertexprocessing power that they cannot even use in most software. Since it's supposed to be budget, they should just cut the units, since they are only adding to the cost of the card, and not to the performance in most software.
And ATi is known to cut vertexshaders on budget chips. They also cut part of HyperZ, for the same reason: the cost in transistors doesn't justify the performance gain in a budget chip.

I am not tech-savvy enough to discuss the complexities of vertex shader units, but an 8-pipe part in the $200 segment can hardly be classified as a "budget" solution. Budget would be the 9100 IGP or X300, and even the X600 series would fall into the mainstream category as X700 pushes it down. RV410 will belong to the mid-range segment, and without knowing the details of the chip's overall design, it would seem premature to conclude that 6 vertex shaders would be a poor choice from a cost-effectiveness standpoint. For what it's worth, the original source for the rumoured specs is mentioned in this xbit story, though I'm sure Dave is privy to more details but is simply bound by an NDA. :devilish:
 
You can have as many texture units per pipeline as you want, although nobody would want less than 1 today
That's why they're called units. A pipeline is not a unit, it is a set of units.

Sigh.

Yes, but at the moment they all come as a package. A fragment shader in the R300 pipeline consists of the texture samplers, the texture address processor, two ALUs, various bits of cache and the ROP, and that is all packaged up (usually grouped in a quad of processors). Removing a single element requires much more reworking, since that is how the pipeline is designed. In total there are 32 FP24 ALUs (of differing capabilities), 16 texture samplers, 16 texture address processors, etc. in R420 (and half that in a theoretical 8-pipeline part). Conversely, the VS of R420 is 6 FP32 vector ALUs and 6 scalar ALUs; on balance the VS is much smaller, even when you remove half the pixel processing capabilities.
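
To put the counts above side by side, here is a rough tally in Python. It only adds up the units named in this post; the dictionary keys and helper names are made up for illustration, and a raw unit count says nothing definite about transistor budget, since the units differ in precision and capability.

```python
# Unit counts for R420 (16 pixel pipelines) as quoted in the post above.
# Key names are illustrative labels, not official ATI terminology.
R420_PIXEL_SIDE = {
    "fp24_alu": 32,
    "texture_sampler": 16,
    "tex_address_processor": 16,
}
R420_VERTEX_SIDE = {
    "fp32_vector_alu": 6,
    "fp32_scalar_alu": 6,
}


def halve(units):
    """Model a theoretical 8-pipeline part by halving every pixel-side count."""
    return {name: count // 2 for name, count in units.items()}


def total(units):
    """Sum the unit counts; a crude proxy only, not a transistor estimate."""
    return sum(units.values())


eight_pipe_pixel_side = halve(R420_PIXEL_SIDE)

print("16-pipe pixel side:", total(R420_PIXEL_SIDE))
print(" 8-pipe pixel side:", total(eight_pipe_pixel_side))
print("6-unit vertex side:", total(R420_VERTEX_SIDE))
```

Even with the pixel side halved, the listed pixel-side units still outnumber the vertex-side ALUs, which is the point being made; caches and ROPs are left out because no counts were given for them.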
 
I am not tech-savvy enough to discuss the complexities of vertex shader units, but an 8-pipe part in the $200 segment can hardly be classified as a "budget" solution.

Whatever it is, it only has half the pixelpipelines of a high-end part.
It would seem logical that the vertexpipes would also be halved, since they would just sit idle most of the time, if half the pixelpipes are missing anyway.
I see it as the replacement of the R9600 series, which is also a halved R9800, so to speak.
 
Yes, but at the moment they all come as a package. A fragment shader in the R300 pipeline consists of the texture samplers, the texture address processor, two ALUs, various bits of cache and the ROP, and that is all packaged up (usually grouped in a quad of processors). Removing a single element requires much more reworking, since that is how the pipeline is designed. In total there are 32 FP24 ALUs (of differing capabilities), 16 texture samplers, 16 texture address processors, etc. in R420 (and half that in a theoretical 8-pipeline part). Conversely, the VS of R420 is 6 FP32 vector ALUs and 6 scalar ALUs; on balance the VS is much smaller, even when you remove half the pixel processing capabilities.

Whatever the case, less is cheaper, and 6 vertexshader units are not going to be very busy if half the pixelpipelines are missing.
And as mentioned above, ATi has also halved the vertexshader units in the previous models, it would seem logical that they do this again. It improves their competitive quality.
 
Unless there are other specific requirements.

What requirements could those possibly be? Perhaps you should read this thread from the beginning.
It starts with the perception that current hardware has a 3:8 vertex:pixel unit ratio, while the previous generation had 4:8.
This change results from the fact that ATi realized that the vertexunits were not being used to their full potential in current software, and so 8 pipelines weren't enough to keep 4 vertex units busy.

Now why exactly would this no longer hold when you have 8 pipelines instead of 12 or 16, like in the high-end?

Let me put it this way: I can't imagine ATi being stupid enough to put as many vertex units in GPUs with half the pixel pipelines as in their high-end GPUs.

We'll see when the hardware arrives. $5 says it's not going to have 6 vertex units if it doesn't have more than 8 pipelines :)
 
Scali said:
Unless there are other specific requirements.

What requirements could those possibly be? Perhaps you should read this thread from the beginning.

Scali, unless I am sorely mistaken, I believe this is Dave's way of telling you that he is very much aware of whatever specific requirements he is alluding to but isn't free to elaborate just yet. Have you never had the pleasure of such exchanges with Dave on a thread before? :LOL: ;)
 
It's easier to scale resolution than geometry complexity.
And the FireGL market requires more vertex performance.
 
We'll see when the hardware arrives. $5 says it's not going to have 6 vertex units if it doesn't have more than 8 pipelines

$10 says Dave knows exactly what he's talking about, while you're just talking out your a**e (as usual).
 
$10 says Dave knows exactly what he's talking about, while you're just talking out your a**e (as usual).

Dave may be right, but I am most certainly not talking out of my arse.
Everything I said makes perfect sense. And I severely resent personal attacks such as this one!
 
Dave has already indicated that the RV410 is targeted at the professional market where Vertex performance is more important. Obviously the "mainstream performance" games market is the main thing but with the 6 Vertex units they should be able to create a competitive FireGL product with it also.

This was discussed in another thread somewhere a week or so ago - don't think it was in this particular forum but it was in one of the two "3D Graphics..." forums.
 
Scali said:
Whatever the case, less is cheaper, and 6 vertexshader units are not going to be very busy if half the pixelpipelines are missing.

They will be equally busy if the resolution is also halved.

Keeping the same number of vertex pipelines lets you run the same software at the same speed, just at a lower resolution.

This seems very reasonable: your game can run acceptably even on a mainstream card, but if you want a high resolution or a high AA mode, you have to pay for the high-end model.
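
The argument can be sketched with a toy model in Python. It assumes vertex work per frame does not depend on resolution while pixel work scales with the number of pixels drawn, and it ignores overdraw, AA and bandwidth. The function name, vertex count and resolution are hypothetical numbers picked for illustration, not measurements.

```python
def relative_load(vertex_units, pixel_pipes, vertices_per_frame, pixels_per_frame):
    """Return (work per vertex unit, work per pixel pipeline) for one frame."""
    return vertices_per_frame / vertex_units, pixels_per_frame / pixel_pipes


VERTICES = 600_000        # hypothetical scene complexity, same on every card
HIGH_RES = 1600 * 1200    # pixels per frame on the high-end card
HALF_RES = HIGH_RES // 2  # half as many pixels on the mid-range card

high_end = relative_load(6, 16, VERTICES, HIGH_RES)   # 6 VS, 16 pipes
mid_range = relative_load(6, 8, VERTICES, HALF_RES)   # 6 VS, 8 pipes
halved_vs = relative_load(3, 8, VERTICES, HALF_RES)   # 3 VS, 8 pipes

print("high end :", high_end)
print("mid range:", mid_range)
print("halved VS:", halved_vs)
```

Under these assumptions the 6:8 part at half the pixel count has the same per-unit load on both sides as the 6:16 part at full resolution, while a 3:8 part would double the work per vertex unit for the same scene.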
 
Dave has already indicated that the RV410 is targeted at the professional market where Vertex performance is more important. Obviously the "mainstream performance" games market is the main thing but with the 6 Vertex units they should be able to create a competitive FireGL product with it also.

Well, if it is indeed a FireGL that's a different story then. I thought we were talking about a mid-end consumer card.
Even for a professional card it would be a unique feature though. But at least one that is not pointless... for example, wireframe drawing doesn't require much fillrate at all, compared to the amount of vertex processing.
 