Firing Squad NVIDIA Editors Day Article

DaveBaumann said:
Ahhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh, god dammit, does nobody read the bloody front page!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Blame it on Humus :!:
Ask him, Dave, if you want to know why. ;)
 
DaveBaumann said:
Jakub, Dio (who works for ATI, BTW) wasn't talking about making pro-ATI or pro-NVIDIA statements (although, in truth, ATI tend not to debate what you may say about the competition, as far as my dealings with them have been concerned); he was actually just talking about technical statements being factual or not - there were a number of allusions NVIDIA made to ATI hardware that were not correct (ones that I've mentioned in other threads about this Editors Day).
Care to summarize them for me? I'd love to make corrections.
 
One thing I am impressed with is writers starting to come here after publishing their best efforts, to get suggestions for improvement and to develop better sources of informed analysis. This is the second case this week.

The professionalism here is as apparent as the expertise. I'd love to see this trend continue, and the bar for writing clever pieces would undoubtedly rise as more expert points of view are considered and consulted.
 
Jakub said:
Care to summarize them for me? I'd love to make corrections.
My main factual problem was the first post I made in this thread: The R300 series supports 160 instructions, and they are somewhat more capable than DX9 level instructions. Therefore, continually referring to a '64 instruction limit' is incorrect.
 
Dio said:
Jakub said:
Care to summarize them for me? I'd love to make corrections.
My main factual problem was the first post I made in this thread: The R300 series supports 160 instructions, and they are somewhat more capable than DX9 level instructions. Therefore, continually referring to a '64 instruction limit' is incorrect.
If you're correct about 160 instructions being the limit, I'm puzzled that none of the devs I polled (including several who took advantage of my offer of anonymity) disputed the limit when I asked about it in email.

Over the phone one of the anon devs mentioned going over 64 at the cost of another pass (IIRC), so I don't think that's what you're referring to.
 
64 has to stay regardless for a generic PS 2.0, or are devs going to start doing one path for R350, one for R300, one for NV3x, and one for generic PS 2.0?
 
Jakub said:
Dio said:
Jakub said:
Care to summarize them for me? I'd love to make corrections.
My main factual problem was the first post I made in this thread: The R300 series supports 160 instructions, and they are somewhat more capable than DX9 level instructions. Therefore, continually referring to a '64 instruction limit' is incorrect.
If you're correct about 160 instructions being the limit, I'm puzzled that none of the devs I polled (including several who took advantage of my offer of anonymity) disputed the limit when I asked about it in email.

Over the phone one of the anon devs mentioned going over 64 at the cost of another pass (IIRC), so I don't think that's what you're referring to.
The API can be a limiting factor here.
 
I stand corrected. In D3D R300 is limited to 96 instructions. (I thought there was a minimum/maximum limit that a program could fall between, but I was reading the wrong bit of the spec).

As to why developers did not point this out: well, these guys are all very busy, and they have a lot of things running around in their heads.

This is probably emphasised because they don't have to think too much about our hardware ("It Just Works" again :) )
 
Dio said:
I stand corrected. In D3D R300 is limited to 96 instructions. (I thought there was a minimum/maximum limit that a program could fall between, but I was reading the wrong bit of the spec).

As to why developers did not point this out: well, these guys are all very busy, and they have a lot of things running around in their heads.

This is probably emphasised because they don't have to think too much about our hardware ("It Just Works" again :) )
Devs aren't too busy to correct me. Believe me, my initial set of questions came back not with answers, but with corrections to my assumptions. :)

Anyway, I am horribly confused by the instruction limit issue. I found sources for 64, 96 and 160. I just can't get a straight answer out of anyone on that. I eliminated 96 as a possibility b/c that simply happened to be the difference between 160 and 64, so I assumed people got confused. What ultimately swayed me to accept 64 was that no one at the conference (including many who attended ATI Shader Day) disputed 64.
 
Jakub said:
Anyway, I am horribly confused by the instruction limit issue. I found sources for 64, 96 and 160. I just can't get a straight answer out of anyone on that. I eliminated 96 as a possibility b/c that simply happened to be the difference between 160 and 64, so I assumed people got confused.
That's an incredibly naive way of looking at things. Under PS 2.0, you can have 64 ALU instructions and 32 texture instructions for a total of 96 instructions. Internally, the R300 based parts can do more than this (160 instructions), and you can take advantage of these extra instructions under OpenGL. How? Well, in OpenGL, the application gives you a shader and asks you to compile it. If it doesn't fit, then you can return an error. This allows you to do a lot more because your compiler/optimizer can take full advantage of HW features that aren't fully exposed by the API.
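(As a rough illustration of where those numbers surface in code - not anything from this thread - here's a minimal C sketch of querying pixel shader limits under D3D9, assuming an already-created IDirect3DDevice9 called device. Plain ps_2_0 budgets are fixed by the spec at 64 ALU + 32 texture instructions; ps_2_x-class hardware reports its actual slot count through the caps structure.)

/* Minimal sketch: querying pixel shader limits under D3D9 in C.
   Assumes an already-created IDirect3DDevice9 named device. */
#include <d3d9.h>
#include <stdio.h>

void print_ps_limits(IDirect3DDevice9 *device)
{
    D3DCAPS9 caps;
    if (FAILED(IDirect3DDevice9_GetDeviceCaps(device, &caps)))
        return;

    printf("Pixel shader version: %lu.%lu\n",
           (unsigned long)D3DSHADER_VERSION_MAJOR(caps.PixelShaderVersion),
           (unsigned long)D3DSHADER_VERSION_MINOR(caps.PixelShaderVersion));

    /* Meaningful for ps_2_x parts: the instruction slot count the
       driver advertises (96 minimum, if I read the caps docs right).
       Plain ps_2_0 parts are pinned at the spec limits regardless. */
    printf("ps_2_x instruction slots: %lu\n",
           (unsigned long)caps.PS20Caps.NumInstructionSlots);
}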
What ultimately swayed me to accept 64 was that no one at the conference (including many who attended ATI Shader Day) disputed 64.
I guess you didn't ask the right people.
 
OpenGL guy said:
Jakub said:
Anyway, I am horribly confused by the instruction limit issue. I found sources for 64, 96 and 160. I just can't get a straight answer out of anyone on that. I eliminated 96 as a possibility b/c that simply happened to be the difference between 160 and 64, so I assumed people got confused.
Under PS 2.0, you can have 64 ALU instructions and 32 texture instructions for a total of 96 instructions. Internally, the R300 based parts can do more than this (160 instructions), and you can take advantage of these extra instructions under OpenGL. How? Well, in OpenGL, the application gives you a shader and asks you to compile it. If it doesn't fit, then you can return an error. This allows you to do a lot more because your compiler/optimizer can take full advantage of HW features that aren't fully exposed by the API.
OK, so:

1. Radeon R300/350/360 hardware is capable of doing 160 instructions.
2. PS 2.0 limits the Radeon (and I assume, the GF FX?) to 64 math and 32 texture instructions.
3. OpenGL has no such limitations, thus it's possible to get the maximum instruction count out of the Radeon and the GF FX.

I'm not sure what you're saying with the error message. It almost sounds like the error message enables you to "take full advantage of HW features that aren't fully exposed by the API.", which doesn't make much sense to me.
 
I don't get it.

Doesn't PS2a fully expose NV3x hardware? If so, why doesn't ATi press for a PS2b that fully exposes R3x0 hardware, particularly since quite a few devs are intimating that they're bumping into shader-length limits with R300?

OGL guy, by "you" do you mean the card's drivers/hardware returning an error to the 3D app? So do you mean the app can have multiple shader versions and keep feeding them in until they find one that best fits the given hardware, by trial and error?
 
Jakub said:
I'm not sure what you're saying with the error message. It almost sounds like the error message enables you to "take full advantage of HW features that aren't fully exposed by the API.", which doesn't make much sense to me.
What Pete wrote ;)

The GL driver returns an error when the shader was too complex to fit the hardware. This error has to be caught in the application, which can then scale back to a simpler shader, until the driver reports success.

Ie, you can try submitting a 2000 instruction shader to the API and if the driver somehow manages to make it work, fine. You can't do that in DX Graphics because the MS runtime will block off shaders exceeding its own 'standard' limits.
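A minimal C sketch of that try-and-fall-back loop, using ARB_fragment_program. This assumes the extension entry points have already been resolved (e.g. via wglGetProcAddress), and the program strings are placeholders rather than real shaders:

/* Compile-and-fall-back pattern: try each candidate program, from most
   to least complex, until the driver accepts one that fits natively. */
#include <GL/gl.h>
#include <GL/glext.h>
#include <string.h>

static const char *candidates[] = {
    "!!ARBfp1.0 ... 2000-instruction version ... END",
    "!!ARBfp1.0 ... cut-down version ... END",
    "!!ARBfp1.0 ... trivial fallback ... END",
};

/* Returns the index of the first program the driver both accepts and
   can run within its native limits, or -1 if none fit. */
int pick_fragment_program(GLuint prog)
{
    int i;
    glBindProgramARB(GL_FRAGMENT_PROGRAM_ARB, prog);
    while (glGetError() != GL_NO_ERROR)
        ; /* clear any stale error state first */
    for (i = 0; i < 3; ++i) {
        glProgramStringARB(GL_FRAGMENT_PROGRAM_ARB,
                           GL_PROGRAM_FORMAT_ASCII_ARB,
                           (GLsizei)strlen(candidates[i]), candidates[i]);
        if (glGetError() != GL_NO_ERROR)
            continue; /* parse/compile failed: scale back */

        /* Even an accepted program may exceed the native limits and
           drop to software; check before settling on this one. */
        {
            GLint native = 0;
            glGetProgramivARB(GL_FRAGMENT_PROGRAM_ARB,
                              GL_PROGRAM_UNDER_NATIVE_LIMITS_ARB,
                              &native);
            if (native)
                return i;
        }
    }
    return -1;
}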
 
zeckensack said:
What Pete wrote ;)

The GL driver returns an error when the shader was too complex to fit the hardware. This error has to be caught in the application, which can then scale back to a simpler shader, until the driver reports success.

Ie, you can try submitting a 2000 instruction shader to the API and if the driver somehow manages to make it work, fine. You can't do that in DX Graphics because the MS runtime will block off shaders exceeding its own 'standard' limits.
OK, that makes complete sense.

Interesting that PS2.0a would enable extra instructions on an NV3x board but not an R3x0...

Anyway, thanks for clearing that up.
 
Jakub said:
Interesting that PS2.0a would enable extra instructions on an NV3x board but not an R3x0...
Not really.
It makes total sense.
It's something nVidia needs, something nVidia pushed for, something that helps nVidia hardware.
ATI doesn't need the help; their hardware "Just Works".
 
Jakub said:
Interesting that PS2.0a would enable extra instructions on an NV3x board but not an R3x0...
It's a little more complicated than that. The problem is simply that we can't expose all of the instructions in D3D because of limitations of the API. In other words, it's possible that a particular 65 ALU instruction shader won't fit in our hardware. On the other hand, it's possible that a particular 256 (or more!) ALU instruction shader could fit in our hardware after compilation and optimization. In D3D, we're limited to the worst-case scenario by the API. In OpenGL, we have more flexibility because the driver gets to provide feedback to the application.
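One hedged illustration of why API instruction counts and hardware slots needn't match one-for-one: R300-class parts can reportedly co-issue an rgb (vector) op and an alpha (scalar) op in a single ALU slot, so two API instructions may compile down to one slot. The encoding below is invented purely for the example:

/* Toy slot counter for hardware that can co-issue one rgb op and one
   alpha op per ALU slot. Not real driver code. */
#include <stddef.h>

typedef enum { WRITES_RGB, WRITES_ALPHA, WRITES_RGBA } WriteMask;

size_t count_hw_slots(const WriteMask *ins, size_t n)
{
    size_t slots = 0, i;
    for (i = 0; i < n; ++i) {
        slots++;
        /* An alpha-only op immediately following an rgb-only op can
           share its slot (co-issue), so it costs nothing extra. */
        if (i + 1 < n && ins[i] == WRITES_RGB && ins[i + 1] == WRITES_ALPHA)
            i++;
    }
    return slots;
}

So, for instance, a long D3D shader whose instructions pair up well could occupy far fewer hardware slots than its raw instruction count suggests - which is why the driver's compiler, not the API's count, decides what actually fits.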
 
Exactly what he said.

WHQL etc. strictly enforces that any program that passes D3D's size checks MUST compile - so we have to return conservative limits to the application.

This is a reinforcement of what I said earlier about contacting ATI to check details. The principle of 'ask the game developers' is valid, but in this case it has led to the wrong answer, and I can foresee the same thing happening very frequently, because the majority of developers do not have the time to really get to know anyone's hardware; they just make it work. (Which, in my opinion, implies they actually spend LESS time on our hardware because it should be easier to get working.)

I understand your reasons for not wanting to do this, but I think you would get better results.
 
This is another valid aspect. Jakub's initial article stated that, for instance, Gearbox were there because they were very happy with the support they got from NVIDIA in making their shaders work well on the FX series. But the question should also be asked: why is that type of back-end support required at all? Also, were they not happy that their shaders "Just Worked" in the first place on ATI hardware (assuming they did)? Which is better for the developer - having hardware that does what it says, or having lots of support on the software end in order to make the hardware run at the rate they are targeting?
 
DaveBaumann said:
Which is better for the developer - having hardware that does what it says, or having lots of support on the software end in order to make the hardware run at the rate they are targeting?
Oh! OH!!!

I know the answer to this one! :D
 