Firing Squad NVIDIA Editors Day Article

DaveBaumann said:
Ahhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh, god dammit, does nobody read the bloody front page!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Blame it on Humus :!:
Ask him, Dave, if you want to know why. ;)
 
DaveBaumann said:
Jakub, Dio (who works for ATI, BTW) wasn't talking about making pro-ATI or pro-NVIDIA statements (although, in truth, ATI tend not to debate what you may say about the competition, as far as my dealings with them have been concerned); he was actually just talking about technical statements being factual or not - there were a number of allusions NVIDIA made to ATI hardware that were not correct (ones that I've mentioned in other threads about this Editors Day).
Care to summarize them for me? I'd love to make corrections.
 
One thing I am impressed with is writers starting to come here after publishing their best efforts, to get suggestions for improvement and to develop better sources of informed analysis. This is the second case this week.

The professionalism here is as apparent as the expertise. I'd love to see this trend continue, and the bar for writing clever pieces would undoubtedly rise as more expert points of view are considered and consulted.
 
Jakub said:
Care to summarize them for me? I'd love to make corrections.
My main factual problem was the first post I made in this thread: The R300 series supports 160 instructions, and they are somewhat more capable than DX9 level instructions. Therefore, continually referring to a '64 instruction limit' is incorrect.
 
Dio said:
Jakub said:
Care to summarize them for me? I'd love to make corrections.
My main factual problem was the first post I made in this thread: The R300 series supports 160 instructions, and they are somewhat more capable than DX9 level instructions. Therefore, continually referring to a '64 instruction limit' is incorrect.
If you're correct about 160 instructions being the limit, I'm puzzled that none of the devs I polled (including several who took advantage of my offer of anonymity) disputed the limit when I asked about it in email.

Over the phone one of the anon devs mentioned going over 64 at the cost of another pass (IIRC), so I don't think that's what you're referring to.
 
64 has to stay regardless for a generic PS 2.0, or are devs going to start doing one path for R350, one for R300, one for NV3x, and one for generic PS 2.0?
 
Jakub said:
Dio said:
Jakub said:
Care to summarize them for me? I'd love to make corrections.
My main factual problem was the first post I made in this thread: The R300 series supports 160 instructions, and they are somewhat more capable than DX9 level instructions. Therefore, continually referring to a '64 instruction limit' is incorrect.
If you're correct about 160 instructions being the limit, I'm puzzled that none of the devs I polled (including several who took advantage of my offer of anonymity) disputed the limit when I asked about it in email.

Over the phone one of the anon devs mentioned going over 64 at the cost of another pass (IIRC), so I don't think that's what you're referring to.
The API can be a limiting factor here.
 
I stand corrected. In D3D R300 is limited to 96 instructions. (I thought there was a minimum/maximum limit that a program could fall between, but I was reading the wrong bit of the spec).

As to why developers did not point this out: well, these guys are all very busy, and they have a lot of things running around in their heads.

This is probably emphasised because they don't have to think too much about our hardware ("It Just Works" again :) )
 
Dio said:
I stand corrected. In D3D R300 is limited to 96 instructions. (I thought there was a minimum/maximum limit that a program could fall between, but I was reading the wrong bit of the spec).

As to why developers did not point this out: well, these guys are all very busy, and they have a lot of things running around in their heads.

This is probably emphasised because they don't have to think too much about our hardware ("It Just Works" again :) )
Devs aren't too busy to correct me. Believe me, my initial set of questions came back not with answers, but with corrections to my assumptions. :)

Anyway, I am horribly confused by the instruction limit issue. I found sources for 64, 96 and 160. I just can't get a straight answer out of anyone on that. I eliminated 96 as a possibility b/c that simply happened to be the difference between 160 and 64, so I assumed people got confused. What ultimately swayed me to accept 64 was that no one at the conference (including many who attended ATI Shader Day) disputed 64.
 
Jakub said:
Anyway, I am horribly confused by the instruction limit issue. I found sources for 64, 96 and 160. I just can't get a straight answer out of anyone on that. I eliminated 96 as a possibility b/c that simply happened to be the difference between 160 and 64, so I assumed people got confused.
That's an incredibly naive way of looking at things. Under PS 2.0, you can have 64 ALU instructions and 32 texture instructions for a total of 96 instructions. Internally, the R300 based parts can do more than this (160 instructions), and you can take advantage of these extra instructions under OpenGL. How? Well, in OpenGL, the application gives you a shader and asks you to compile it. If it doesn't fit, then you can return an error. This allows you to do a lot more because your compiler/optimizer can take full advantage of HW features that aren't fully exposed by the API.
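(As a rough illustration of where those numbers surface in code - not anything from this thread - here's a minimal C sketch of querying pixel shader limits under D3D9, assuming an already-created IDirect3DDevice9 called device. Plain ps_2_0 budgets are fixed by the spec at 64 ALU + 32 texture instructions; ps_2_x-class hardware reports its actual slot count through the caps structure.)

/* Minimal sketch: querying pixel shader limits under D3D9 in C.
   Assumes an already-created IDirect3DDevice9 named device. */
#include <d3d9.h>
#include <stdio.h>

void print_ps_limits(IDirect3DDevice9 *device)
{
    D3DCAPS9 caps;
    if (FAILED(IDirect3DDevice9_GetDeviceCaps(device, &caps)))
        return;

    printf("Pixel shader version: %lu.%lu\n",
           (unsigned long)D3DSHADER_VERSION_MAJOR(caps.PixelShaderVersion),
           (unsigned long)D3DSHADER_VERSION_MINOR(caps.PixelShaderVersion));

    /* Meaningful for ps_2_x parts: the instruction slot count the
       driver advertises (96 minimum, if I read the caps docs right).
       Plain ps_2_0 parts are pinned at the spec limits regardless. */
    printf("ps_2_x instruction slots: %lu\n",
           (unsigned long)caps.PS20Caps.NumInstructionSlots);
}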
What ultimately swayed me to accept 64 was that no one at the conference (including many who attended ATI Shader Day) disputed 64.
I guess you didn't ask the right people.
 
OpenGL guy said:
Jakub said:
Anyway, I am horribly confused by the instruction limit issue. I found sources for 64, 96 and 160. I just can't get a straight answer out of anyone on that. I eliminated 96 as a possibility b/c that simply happened to be the difference between 160 and 64, so I assumed people got confused.
Under PS 2.0, you can have 64 ALU instructions and 32 texture instructions for a total of 96 instructions. Internally, the R300 based parts can do more than this (160 instructions), and you can take advantage of these extra instructions under OpenGL. How? Well, in OpenGL, the application gives you a shader and asks you to compile it. If it doesn't fit, then you can return an error. This allows you to do a lot more because your compiler/optimizer can take full advantage of HW features that aren't fully exposed by the API.
OK, so:

1. Radeon R300/350/360 hardware is capable of doing 160 instructions.
2. PS 2.0 limits the Radeon (and I assume, the GF FX?) to 64 math and 32 texture instructions.
3. OpenGL has no such limitations, thus it's possible to get the maximum instruction count out of the Radeon and the GF FX.

I'm not sure what you're saying with the error message. It almost sounds like the error message enables you to "take full advantage of HW features that aren't fully exposed by the API.", which doesn't make much sense to me.
 
I don't get it.

Doesn't PS2a fully expose NV3x hardware? If so, why doesn't ATi press for a PS2b that fully exposes R3x0 hardware, particularly since quite a few devs are intimating that they're bumping into shader-length limits with R300?

OGL guy, by "you" do you mean the card's drivers/hardware returning an error to the 3D app? So do you mean the app can have multiple shader versions and keep feeding them in until they find one that best fits the given hardware, by trial and error?
 
Jakub said:
I'm not sure what you're saying with the error message. It almost sounds like the error message enables you to "take full advantage of HW features that aren't fully exposed by the API.", which doesn't make much sense to me.
What Pete wrote ;)

The GL driver returns an error when the shader was too complex to fit the hardware. This error has to be caught in the application, which can then scale back to a simpler shader, until the driver reports success.

Ie, you can try submitting a 2000 instruction shader to the API and if the driver somehow manages to make it work, fine. You can't do that in DX Graphics because the MS runtime will block off shaders exceeding its own 'standard' limits.
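A minimal C sketch of that try-and-fall-back loop, using ARB_fragment_program. This assumes the extension entry points have already been resolved (e.g. via wglGetProcAddress), and the program strings are placeholders rather than real shaders:

/* Compile-and-fall-back pattern: try each candidate program, from most
   to least complex, until the driver accepts one that fits natively. */
#include <GL/gl.h>
#include <GL/glext.h>
#include <string.h>

static const char *candidates[] = {
    "!!ARBfp1.0 ... 2000-instruction version ... END",
    "!!ARBfp1.0 ... cut-down version ... END",
    "!!ARBfp1.0 ... trivial fallback ... END",
};

/* Returns the index of the first program the driver both accepts and
   can run within its native limits, or -1 if none fit. */
int pick_fragment_program(GLuint prog)
{
    int i;
    glBindProgramARB(GL_FRAGMENT_PROGRAM_ARB, prog);
    while (glGetError() != GL_NO_ERROR)
        ; /* clear any stale error state first */
    for (i = 0; i < 3; ++i) {
        glProgramStringARB(GL_FRAGMENT_PROGRAM_ARB,
                           GL_PROGRAM_FORMAT_ASCII_ARB,
                           (GLsizei)strlen(candidates[i]), candidates[i]);
        if (glGetError() != GL_NO_ERROR)
            continue; /* parse/compile failed: scale back */

        /* Even an accepted program may exceed the native limits and
           drop to software; check before settling on this one. */
        {
            GLint native = 0;
            glGetProgramivARB(GL_FRAGMENT_PROGRAM_ARB,
                              GL_PROGRAM_UNDER_NATIVE_LIMITS_ARB,
                              &native);
            if (native)
                return i;
        }
    }
    return -1;
}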
 
zeckensack said:
What Pete wrote ;)

The GL driver returns an error when the shader was too complex to fit the hardware. This error has to be caught in the application, which can then scale back to a simpler shader, until the driver reports success.

Ie, you can try submitting a 2000 instruction shader to the API and if the driver somehow manages to make it work, fine. You can't do that in DX Graphics because the MS runtime will block off shaders exceeding its own 'standard' limits.
OK, that makes complete sense.

Interesting that PS2.0a would enable extra instructions on an NV3x board but not an R3x0...

Anyway, thanks for clearing that up.
 
Jakub said:
Interesting that PS2.0a would enable extra instructions on an NV3x board but not an R3x0...
Not really.
It makes total sense.
It's something nVidia needs, something nVidia pushed for, something that helps nVidia hardware.
ATI doesn't need the help; their hardware "Just Works".
 
Jakub said:
Interesting that PS2.0a would enable extra instructions on an NV3x board but not an R3x0...
It's a little more complicated than that. The problem is simply that we can't expose all of the instructions in D3D because of limitations of the API. In other words, it's possible that a particular 65 ALU instruction shader won't fit in our hardware. On the other hand, it's possible that a particular 256 (or more!) ALU instruction shader could fit in our hardware after compilation and optimization. In D3D, we're limited to the worst-case scenario by the API. In OpenGL, we have more flexibility because the driver gets to provide feedback to the application.
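One hedged illustration of why API instruction counts and hardware slots needn't match one-for-one: R300-class parts can reportedly co-issue an rgb (vector) op and an alpha (scalar) op in a single ALU slot, so two API instructions may compile down to one slot. The encoding below is invented purely for the example:

/* Toy slot counter for hardware that can co-issue one rgb op and one
   alpha op per ALU slot. Not real driver code. */
#include <stddef.h>

typedef enum { WRITES_RGB, WRITES_ALPHA, WRITES_RGBA } WriteMask;

size_t count_hw_slots(const WriteMask *ins, size_t n)
{
    size_t slots = 0, i;
    for (i = 0; i < n; ++i) {
        slots++;
        /* An alpha-only op immediately following an rgb-only op can
           share its slot (co-issue), so it costs nothing extra. */
        if (i + 1 < n && ins[i] == WRITES_RGB && ins[i + 1] == WRITES_ALPHA)
            i++;
    }
    return slots;
}

So, for instance, a long D3D shader whose instructions pair up well could occupy far fewer hardware slots than its raw instruction count suggests - which is why the driver's compiler, not the API's count, decides what actually fits.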
 
Exactly what he said.

WHQL etc. strictly enforces that any program that passes D3D's size checks MUST compile - so we have to return conservative limits to the application.

This is a reinforcement of what I said earlier about contacting ATI to check details. The principle of 'ask the game developers' is valid, but in this case it has led to the wrong answer, and I can foresee the same thing happening very frequently, because the majority of developers do not have the time to really get to know anyone's hardware; they just make it work. (Which, in my opinion, implies they actually spend LESS time on our hardware because it should be easier to get working.)

I understand your reasons for not wanting to do this, but I think you would get better results.
 
This is another valid aspect. Jakub's initial article stated that, for instance, Gearbox were there because they were very happy with the support they got from NVIDIA in making their shaders work well on the FX series. But the question should also be asked: why is that type of back-end support required at all? Also, were they not happy that their shaders "Just Worked" in the first place on ATI hardware (assuming they did)? Which is better for the developer - having hardware that does what it says, or having lots of support on the software end in order to make the hardware run at the rate they are targeting?
 
DaveBaumann said:
Which is better for the developer - having hardware that does what it says, or having lots of support on the software end in order to make the hardware run at the rate they are targeting?
Oh! OH!!!

I know the answer to this one! :D
 