NV30,35 & R300/R350 Pixel Shader Pipes Compared (New inf

Do these compiler optimizations tie in at all with DirectX 9.1, which is rumored to help nVidia hardware run better (I'm guessing in much the same way: by making effective use of the mini ALUs present in both NV3x and R3xx)?
 
welcome to global village where everybody is another idiot.

Btw a lot of things posted here are also taken for the truth someplace else.
 
I seem to remember a number of rumors first posted at this forum later reported by The Inquirer...
 
jimbob0i0 said:
DB does it ever fascinate you how something posted at one site on the net rapidly spreads as truth and gets quoted all over the net so quickly?

No, because it even gets requoted by PR personnel. I had one talking to me about DX9.2 the other day!

Chalnoth said:
I seem to remember a number of rumors first posted at this forum later reported by The Inquirer...

They were facts reported by this forum.
 
DaveBaumann said:
Chalnoth said:
I seem to remember a number of rumors first posted at this forum later reported by The Inquirer...
They were facts reported by this forum.
I'm not so sure. It's happened many times, Dave. I doubt they were all true.
 
Knowing what you do about NV35 pipelines, transistor counts, and the work that has gone into the driver technology: playing the role of an NVidia engineer, if it was up to you, how would you have redesigned the NV40's pipelines? Also, do you think R420 will make the move to FP32 or not?

Thanks,
-The Rockster
 
Rockster said:
Knowing what you do about NV35 pipelines, transistor counts, and the work that has gone into the driver technology; playing the role of a NVidia engineer and if it was up to you, how would you have redesigned the NV40's pipelines?
I doubt that the NV40 will have a significant change in design philosophy. I do, however, expect that FP32 performance will be much more highly-emphasized, and the hardware should perform quite a bit better at it. But I also expect FP16 support to be there, and be higher-performing.

Also, do you think R420 will make the move to FP32 or not?

Thanks,
-The Rockster
I don't know. It depends on whether or not ATI makes the move to a unified vertex shader/pixel shader format.

If ATI goes unified, then they'll have to support FP32. If they don't, I'd give it no more than a 50/50 chance they'll move to FP32.
 
Chalnoth said:
I don't know. It depends on whether or not ATI makes the move to a unified vertex shader/pixel shader format.

Does PS/VS 3.0 require a unified format?

If not, then I suspect R420 won't be FP32 in the pixel shaders.
 
Joe DeFuria said:
Does PS/VS 3.0 require a unified format?

If not, then I suspect R420 won't be FP32 in the pixel shaders.
The unified architecture would be, as far as I know, an entirely hardware optimization thing. VS 3.0 adds much of the functionality of the pixel shader, however, so the two units will be much closer in functionality with version 3.0 than they were in version 2.0, so it may make more sense to use the same unit for both.
 
Chalnoth said:
The unified architecture would be, as far as I know, an entirely hardware optimization thing. VS 3.0 adds much of the functionality of the pixel shader, however, so the two units will be much closer in functionality with version 3.0 than they were in version 2.0, so it may make more sense to use the same unit for both.

What I'm asking is, does the additional functionality require that PS now be 32 bit? (Or are they still "separate enough" in the spec that the precision can be different.)

If the specs allow 24 bit FP Pixel shaders in 3.0, then I'd bet that we won't see a "unified pipe" in the R420.
 
I'm not sure. I would suspect the accuracy requirements are the same for PS 3.0 as they are for PS 2.0. If you want to check yourself, you can always look it up. I'm sure it's somewhere at http://msdn.microsoft.com but I don't have the time to check just now.
 
I agree with Joe. FP24 seems to give ATI a real advantage in this generation allowing them to have more functional units at similar or smaller transistor counts. Seems to me that is how the R420 can hit 12 pipelines with the benefit of making the process change to .13u

Do you think at 175M transistors, the NV40 will simply double the number of pipes as they are now, creating an 8x2 followed by the 2 mini FPUs? I doubt they would do things like de-couple the FP32 ALU and TMU. What advantage might be gained by replacing the 2 mini FPUs with a single full one? Or should the minis be dumped altogether, in favor of additional control logic to move away from working on quads with SIMD?

My guess is it will be the 8x2 with both minis, handling 2 different quads in flight with some logic to handle the loops and branching.
 
Rockster said:
I agree with Joe. FP24 seems to give ATI a real advantage in this generation allowing them to have more functional units at similar or smaller transistor counts. Seems to me that is how the R420 can hit 12 pipelines with the benefit of making the process change to .13u
The reason to go for FP32 would be so that the pipelines can be unified. Most of the shader time spent will be in the pixel shader, so sharing the pixel shader hardware with the vertex shader hardware will effectively allow for more pixel shader units.
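A rough sketch of that argument, with made-up numbers. The work figures, unit counts and function names below are all hypothetical, and the model is idealized: it assumes any unified unit can take any work at no extra cost, so it ignores scheduling and control overhead entirely.

```python
def frame_time_split(vertex_work, pixel_work, vs_units, ps_units):
    """Dedicated pools run side by side; the slower pool sets the pace."""
    return max(vertex_work / vs_units, pixel_work / ps_units)


def frame_time_unified(vertex_work, pixel_work, total_units):
    """Every unit can run either kind of work, so the load spreads evenly."""
    return (vertex_work + pixel_work) / total_units


# Hypothetical pixel-heavy frame: 10 units of vertex work, 90 of pixel work,
# on a chip with 4 vertex units and 8 pixel units (12 in total).
split = frame_time_split(10, 90, vs_units=4, ps_units=8)
unified = frame_time_unified(10, 90, total_units=12)
print(f"split: {split:.2f}  unified: {unified:.2f}")
```

In this toy case the split design is bound by its 8 pixel units while the vertex units sit mostly idle; the unified one spreads the same work over all 12.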

And I would be highly disappointed if ATI stayed with FP24. I think that, for one, FP24 has problems with anisotropic filtering. If you look at some anisotropic filtering test images, you will notice artifacts on the R3xx:
(from this article)
http://www.tomshardware.com/graphic/20031023/images/ati-cp-stage0.png

Notice that there appear to be a number of artifacts in the MIP/aniso detection algorithm, with spikes at triangle edges. There are no such artifacts on the NV3x shots. This issue should manifest itself as texture aliasing with anisotropic filtering enabled.
 
I have to admit that it's not very rigorous. My basic train of thought is this:

The R3xx's FP24 has some subpixel accuracy, and is made to work properly with bilinear filtering on large textures. Anisotropic filtering requires more samples per pixel, and so, I believe, requires more subpixel accuracy.

I could be wrong, but the use of FP24 for texturing seems the most obvious reason for those artifacts.
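To put rough numbers on the precision side of that, here is a minimal sketch, assuming the R3xx's FP24 is laid out as 1 sign, 7 exponent and 16 mantissa bits (the layout usually quoted, not something stated in this thread). `MANTISSA_BITS` and `subtexel_bits` are names made up for the illustration. It only looks at how finely a texture coordinate just below 1.0 can be represented; it says nothing about how the texture units actually pick MIP levels or aniso footprints.

```python
import math

# Mantissa (fraction) bits per format. FP16 and FP32 use the common s10e5 and
# s23e8 layouts; the FP24 entry is the assumed s16e7 layout for R3xx.
MANTISSA_BITS = {"FP16": 10, "FP24": 16, "FP32": 23}


def subtexel_bits(mantissa_bits, texture_size):
    """Binary digits left within one texel for a coordinate in [0.5, 1.0)."""
    # In [0.5, 1.0) the exponent is -1, so neighbouring representable values
    # are 2^-(mantissa_bits + 1) apart.
    step = 2.0 ** -(mantissa_bits + 1)
    # Convert that spacing to texel units and count the bits it resolves.
    return -math.log2(step * texture_size)


for name, bits in MANTISSA_BITS.items():
    print(name, subtexel_bits(bits, 2048))
```

For a 2048-texel texture this gives 6 sub-texel bits for FP24 and 13 for FP32 under those assumptions, with FP16 left with none.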
 
Chalnoth said:
Rockster said:
I agree with Joe. FP24 seems to give ATI a real advantage in this generation allowing them to have more functional units at similar or smaller transistor counts. Seems to me that is how the R420 can hit 12 pipelines with the benefit of making the process change to .13u
The reason to go for FP32 would be so that the pipelines can be unified. Most of the shader time spent will be in the pixel shader, so sharing the pixel shader hardware with the vertex shader hardware will effectively allow for more pixel shader units.
At the expense of lots of additional control logic, non-optimal interconnect paths, lost parallelism ...

Chalnoth said:
And I would be highly disappointed if ATI stayed with FP24. I think that, for one, FP24 has problems with anisotropic filtering. If you look at some anisotropic filtering test images, you will notice artifacts on the R3xx:
(from this article)
http://www.tomshardware.com/graphic/20031023/images/ati-cp-stage0.png

Notice that there appear to be a number of artifacts in the MIP/aniso detection algorithm, with spikes at triangle edges. There are no such artifacts on the NV3x shots. This issue should manifest itself as texture aliasing with anisotropic filtering enabled.
I didn't click on the link (Tom's on my embargo list), but ... what the heck? Texture filtering isn't performed by the shading units, so how would it be affected by their precision? IANAHWD but something tells me that Tom's just repeating some uninformed babble he picked up in the laundry.

If this is about the 'angular' problem (uneven multiples of 22.5°) ... :rolleyes:
 
Chalnoth said:
The reason to go for FP32 would be so that the pipelines can be unified.

Agreed. However, I'd wager it will take considerably more time and effort to physically unify the pipelines than to just extend the current pipeline architectures to support PS/VS 3.0.

Since the rumor is that R420 is "based" on R300, I'm doubting that ATI has unified the pipelines for R420.

When ATI does unify the pixel and vertex pipelines, then I of course expect full FP32 everywhere. I don't expect this until the R5x0 generation though, as it will be a significant departure from the R3x0 generation.

And I would be highly disappointed if ATI stayed with FP24.

I wouldn't. It gets the job done. I'd be more disappointed if ATI went with FP32 and unified the pipelines, and that meant a higher cost, or a later product, or a slower product...or all of the above.

Since the R3x0 core is a decent DX9 shader, there's no need for ATI to move to FP32 or unified pipelines until the API requirements all but dictate it.

I think that, for one, FP24 has problems with anisotropic filtering.

As with others' responses: wtf? :oops:
 