Inquirer spreading R420 info

OpenGL guy said:
Why don't you just admit your own errors? "If 32 were actually available, they'd be exposed by now." In any event, sireric wasn't really wrong. Have the cap bits changed for R300? No, hence there was no change to the temp registers. There's no practical need to expose four more temps for R300. "Because we can" is not compelling enough to me, and, apparently, to others at ATI.

2) The R300 has 32 temporary registers in the vertex and pixel shaders (64 "total"). We currently "reveal" 12 in the pixel shader (not sure about vertex shader), following DX9 recommendations. We will raise that as caps bits allow or DX9 specs change.
I read the quote above as, "if Microsoft would let us, we'd support 32 temporary registers."

The caps bits have been available for supporting 32 temporary registers the entire time DX9 has been released.
 
Umm, how about ps2.0b for finally getting around to enabling 32 temps, Chalnoth?

OpenGL guy, can you give us some indication of whether or not ps2.0b is actually an ATI profile or whether it was another company that had it prepared for them (SGI/PowerVI/NV???)
 
Chalnoth said:
OpenGL guy said:
Why don't you just admit your own errors? "If 32 were actually available, they'd be exposed by now." In any event, sireric wasn't really wrong. Have the cap bits changed for R300? No, hence there was no change to the temp registers. There's no practical need to expose four more temps for R300. "Because we can" is not compelling enough to me, and, apparently, to others at ATI.
2) The R300 has 32 temporary registers in the vertex and pixel shaders (64 "total"). We currently "reveal" 12 in the pixel shader (not sure about vertex shader), following DX9 recommendations. We will raise that as caps bits allow or DX9 specs change.
I read the quote above as, "if Microsoft would let us, we'd support 32 temporary registers."
It says "we will raise that". Does it say "we will raise that to 32"? No. Does it give a timeframe? No. Get over it. You are the one in error.
The caps bits have been available for supporting 32 temporary registers the entire time DX9 has been released.
As I stated, we could expose 16, but it's pointless.

Do you have some sort of need to always get the last word in even when you are wrong? I showed that you were wrong earlier. You're still wrong now. This is getting boring.
 
Chalnoth said:
OpenGL guy said:
Why don't you just admit your own errors? "If 32 were actually available, they'd be exposed by now." In any event, sireric wasn't really wrong. Have the cap bits changed for R300? No, hence there was no change to the temp registers. There's no practical need to expose four more temps for R300. "Because we can" is not compelling enough to me, and, apparently, to others at ATI.

2) The R300 has 32 temporary registers in the vertex and pixel shaders (64 "total"). We currently "reveal" 12 in the pixel shader (not sure about vertex shader), following DX9 recommendations. We will raise that as caps bits allow or DX9 specs change.
I read the quote above as, "if Microsoft would let us, we'd support 32 temporary registers."

The caps bits have been available for supporting 32 temporary registers the entire time DX9 has been released.
Huh, I guess the part that says "recommendation" has you confused.
I also wonder why you think it's 32 registers when OpenGL guy says 16 for DX9 SM2.0.
I also have to comment on your comment about "treating developers as stupid".
Make up your mind. When ATI treats them as smart, you bitch and say that they shouldn't leave them room to screw up. When ATI does that, you bitch. Which is it?
 
Chalnoth said:
Either way, Sireric's post was just plain wrong.

Not quite.

1) Regret to inform you that R3xx still has 32 temp registers, as it's always had.

2) The number of GPRs exposed has stayed the same, mainly because there's no need to expose more. It's not simply a matter of changing a dword; there's a whole support aspect, and then there's the optimizer to change, as well as other parts of the driver that I won't get into. I did ask the SW guys, and they said they saw no need to increase it at this time.

So, I guess that my full quote was not quite on the ball, but it's mainly true. I should have added to the second comment that we would expose it as the caps are available and as we feel the need for it.
 
Usually in any basic block of code, there are few live registers anyway. AMD put out a paper a while back in which they analyzed thousands of lines of code in a C library, and the average number of live variables per function was 6.5. There is reason to believe this will be lower for GPUs, but there are also reasons to believe it could be higher. GPUs generally accumulate results in shaders and don't deal with data structures. On the other hand, GPUs don't have memory to spill to, so shaders may use lots of registers as a sort of "poor man's heap" or as a way of caching some results. I personally think the figure for most shaders is probably 6 or lower.

If someone's using 16 or 32 registers (that are all live), I think they're probably using them to store a big data structure.
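
To make the "live registers" point concrete, here is a minimal Python sketch of how one might count the peak number of simultaneously live temps in a straight-line block. It is only an illustration: the instruction format (a destination plus a tuple of sources), the function name `max_live_registers`, and the sample block are all made up for this example, and it assumes temps are the registers named `r#`, as in the PS 2.0 naming used elsewhere in this thread.

```python
def max_live_registers(instructions, live_out=()):
    """Peak number of live r# temps in a straight-line block.

    instructions: list of (dest, sources) tuples in program order
                  (a hypothetical format used only for this sketch).
    live_out:     registers still needed after the block ends.
    """
    live = set(live_out)
    peak = sum(1 for r in live if r.startswith("r"))
    # Walk backwards: a value is live from its definition to its last use.
    for dest, sources in reversed(instructions):
        live.discard(dest)    # defined here, so not live before this point
        live.update(sources)  # operands must be live on entry
        peak = max(peak, sum(1 for r in live if r.startswith("r")))
    return peak


# A made-up shader fragment that names four temps (r0..r3).
block = [
    ("r0", ("t0",)),        # r0 = texture sample using t0
    ("r1", ("t1",)),        # r1 = texture sample using t1
    ("r2", ("r0", "r1")),   # r2 = r0 * r1
    ("r3", ("r2", "c0")),   # r3 = r2 + c0
    ("oC0", ("r3",)),       # write the colour output
]

print(max_live_registers(block))  # 2
```

The block names four temps, yet no more than two of them hold a needed value at the same time, which is the distinction between registers named and registers live.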
 
Last time I checked, sireric was a hardware engineer who, that's right, actually works for ATI. Not only that, he actually works on and aids in the design of actual ATI hardware...

Now imagine that. He might actually know more about R300 hardware than someone who is NOT a hardware engineer working for ATI.

... I... am actually... having trouble with this... I can't quite grasp it...

I need an aspirin...
 
wow sir eric and Open GL guy i cant belive your even paying attention to chanloths replies damn

alright for those who dont know i dont know why you dont know but whatever Sire Eric and Open GL guy WORK for ATI if their is any question about the architecture THEIR the ones that you will get informed answers from same with drivers

edit: damn u hellbinder ya beat me to it
 
Would it be safe to assume that ATI's decision to have 32 temp registers was made before the PS2.0 spec was finalized? If so, what considerations led ATI to design the R300 with roughly three times as many registers as were eventually exposed? What led to the original decision to have 32 as opposed to 12?
 
DemoCoder said:
I'm curious if there is a performance drop if more than 12-16 are used. Curious that 12 was settled on as the API limit.
I'm not sure why 12 was settled upon when we can easily do 16, but I guess Microsoft did their own research on PS 2.0 shaders and concluded that 12 was enough. I mean, I suppose it could make things easier for other vendors *shrug*

Under OpenGL, there's an opportunity to use more temps because the driver gets to attempt compilation and optimization before telling the app the shader is not suitable. If the driver can rearrange things to fit into the hardware, then it will do so. Under D3D we don't have this option so we have to be more conservative, although 16 would still be fine.
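
As a rough illustration of the difference described above, the following Python sketch contrasts an up-front check on how many temps a shader names with a compile-first approach that renames temps and only rejects the shader if too many values are live at once. This is a toy model, not how any real driver or runtime is implemented: the instruction format, the function names `fits_named_limit` and `rename_to_fit`, and the sample block are all hypothetical, and it only handles straight-line code.

```python
def fits_named_limit(instructions, limit):
    """Up-front check: count the distinct r# temps the shader mentions."""
    named = {r for dest, srcs in instructions
             for r in (dest, *srcs) if r.startswith("r")}
    return len(named) <= limit


def rename_to_fit(instructions, hardware_temps):
    """Greedy temp renaming for a straight-line block.

    Returns the rewritten instructions, or None if more than
    hardware_temps values are live at the same time.
    """
    # Index of the last instruction that reads each temp.
    last_use = {}
    for i, (_, srcs) in enumerate(instructions):
        for r in srcs:
            if r.startswith("r"):
                last_use[r] = i

    free = [f"r{n}" for n in range(hardware_temps - 1, -1, -1)]
    mapping = {}    # temp name in the source -> hardware register
    rewritten = []
    for i, (dest, srcs) in enumerate(instructions):
        new_srcs = tuple(mapping.get(r, r) for r in srcs)
        # A temp read for the last time here gives its register back.
        for r in dict.fromkeys(srcs):
            if r in mapping and last_use[r] == i:
                free.append(mapping.pop(r))
        new_dest = dest
        if dest.startswith("r"):
            if dest not in mapping:
                if not free:
                    return None          # genuinely does not fit
                mapping[dest] = free.pop()
            new_dest = mapping[dest]
            if last_use.get(dest, -1) <= i:
                # Written but never read afterwards: release immediately.
                free.append(mapping.pop(dest))
        rewritten.append((new_dest, new_srcs))
    return rewritten


# A made-up shader fragment that names four temps (r0..r3).
block = [
    ("r0", ("t0",)),
    ("r1", ("t1",)),
    ("r2", ("r0", "r1")),
    ("r3", ("r2", "c0")),
    ("oC0", ("r3",)),
]

rewritten = rename_to_fit(block, 2)
print(fits_named_limit(block, 2))      # False: four names exceed a limit of 2
print(fits_named_limit(rewritten, 2))  # True: after renaming, two registers suffice
```

In this model a fixed limit on named temps rejects the original block, while the compile-first path accepts it because the renamed version needs only two registers.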
 
What a world to live in where a beautiful theory is never slain by a cold hard fact. Bumble-bees walk, curve balls are optical illusions, and R300 has only 12 temps.
 
Damn. I'm late to the party. :( Anyway, I'll just chip in that ATI has exposed all 32 fragment/pixel shader registers. It's just that this is only under OpenGL. I just checked, too... :) There's certainly no reason why they would be unable to do the same for DirectX. The hardware supports it, so it's obviously for some other reason.

Anyway, who needs all 32 registers? Heck, NVidia has proven that you can get by quite well on just a handful.
 
Just out of curiosity ...
OpenGL guy said:
DemoCoder said:
I'm curious if there is a performance drop if more than 12-16 are used. Curious that 12 was settled on as the API limit.
I'm not sure why 12 was settled upon when we can easily do 16, but I guess Microsoft did their own research on PS 2.0 shaders and concluded that 12 was enough. I mean, I suppose it could make things easier for other vendors *shrug*
Maybe they thought IHVs would do 16 registers, but might need a few registers as scratch for macro expansion? I.e., set aside four registers that aren't visible to the runtime and are only for the driver's internal use?

Under OpenGL, there's an opportunity to use more temps because the driver gets to attempt compilation and optimization before telling the app the shader is not suitable. If the driver can rearrange things to fit into the hardware, then it will do so. Under D3D we don't have this option so we have to be more conservative, although 16 would still be fine.
Where would the other sixteen go? Reserved for texture coord registers, or sampler results (yes, I'm confused about PS2)?
 
gandalfthewhite said:
wow sir eric and Open GL guy i cant belive your even paying attention to chanloths replies damn

alright for those who dont know i dont know why you dont know but whatever Sire Eric and Open GL guy WORK for ATI if their is any question about the architecture THEIR the ones that you will get informed answers from same with drivers

edit: damn u hellbinder ya beat me to it

Wow, can't believe I actually managed to read that :oops:
Well, you seldom see posts like that here.
 
Althornin said:
I also wonder why you think it's 32 registers when OpenGL guy says 16 for DX9 SM2.0.
Here:
12 min/32 max: The number of r# registers is determined by PS20Caps.NumTemps (which ranges from 12 to 32).

Make up your mind. When ATI treats them as smart, you bitch and say that they shouldn't leave them room to screw up. When ATI does that, you bitch. Which is it?
When did I say that?

ATI shouldn't leave room for end users to screw up (to an extent...FSAA/AF settings are a good thing).
 