Hey! It's the 13th. I want my juicy info!!

dan2097 said:
and there's something about dual-issue vs "co-issue" and how the r3xx can't do dual-issue but the nv40 can, no idea how important that is

Not that important. Most code consists of 1-, 3-, 1+3- or 4-component operations. Two-component vectors aren't very common.
 
True, most of the article is already common knowledge, but there is some interesting information:

Quote:
NVIDIA UltraShadow II for 4 times the performance in highly shadowed games (e.g. Doom III) comparing to older GPUs


Confirmation of a 32x0 mode

128 pixel shader operations/clock

They mean 4 times the performance of their own older GPUs, not ATI's :LOL:
 
Stryyder said:
True, most of the article is already common knowledge, but there is some interesting information:

Quote:
NVIDIA UltraShadow II for 4 times the performance in highly shadowed games (e.g. Doom III) comparing to older GPUs


Confirmation of a 32x0 mode

128 pixel shader operations/clock

They mean 4 times the performance of their own older GPUs, not ATI's :LOL:

AFAIK, the NV3x was actually ahead of ATi in terms of shadow performance, as the line simply dominated in all early Doom III benchmarks, and NV35 contained all sorts of upgrades "recommended" by JC himself.
 
Bouncing Zabaglione Bros. said:
ATI's face looks nice (from what little we can see), but I really want to see the Nvidia mermaid and her hair billowing underwater.

A Mermaid ? What are Nvidia thinking ? I mean, how lame will the nude patches for the demo be ? :)
 
Humus said:
dan2097 said:
and there's something about dual-issue vs "co-issue" and how the r3xx can't do dual-issue but the nv40 can, no idea how important that is

Not that important. Most code consists of 1-, 3-, 1+3- or 4-component operations. Two-component vectors aren't very common.

Do you have any idea whether their figure of 128 operations/clock is comparable to ATI's 9800XT value of 40 pixel shader operations/clock from here:

http://www.hardocp.com/image.html?image=MTA2NDg1OTI2NEFrQm9pUVd0ZU5fMV83X2wuanBn
 
Humus said:
Not that important. Most code consists of 1-, 3-, 1+3- or 4-component operations. Two-component vectors aren't very common.

Thanks, I was already wondering why you would want to do some math on the alpha with only one of the colours (e.g. R+G and B+alpha as they show)
 
Humus said:
dan2097 said:
and there's something about dual-issue vs "co-issue" and how the r3xx can't do dual-issue but the nv40 can, no idea how important that is

Not that important. Most code consists of 1-, 3-, 1+3- or 4-component operations. Two-component vectors aren't very common.

You're only talking about co-issue 3/1 VS 2/2 here but GF 6800 has two units each capable of co-issue.

Wasn't it already the case with the NV35 ? I believed the first NV30 had one FP Unit (also responsible for texturing) and two FX units, and NV35 had two FP units.
 
Humus said:
dan2097 said:
and there's something about dual-issue vs "co-issue" and how the r3xx can't do dual-issue but the nv40 can, no idea how important that is

Not that important. Most code consists of 1-, 3-, 1+3- or 4-component operations. Two-component vectors aren't very common.

The way I read it, dan2097 isn't talking about the vector configuration.

NV4x is able to process dual instructions at the same time. It's not clear from their diagram, but it sounds like each PS unit can process either 3/1 or 2/2 operations. So, NV4x could execute dual 3/1 operations per pipeline per clock.

Though it's possible that nvidia's second shader unit is limited to 2/2 operation only...it's not clear.

(Edit..yeah...and what Zeross said!)
 
LeStoffer said:
Humus said:
Not that important. Most code consists of 1-, 3-, 1+3- or 4-component operations. Two-component vectors aren't very common.

Thanks, I was already wondering why you would want to do some math on the alpha with only one of the colours (e.g. R+G and B+alpha as they show)

Don't think of them as RGBA but just 4 values.
 
The dual-issue slide isn't exactly very technical, so it's hard to say what restrictions it may have.

Input into SU2 most likely must come from SU1. Secondly, there is no indication whether the output from SU1 can be written to a register before going through SU2. So you may be limited to something like this (ignoring swizzling):

o.r = (i1.r op1 i2.r) op3 i3.r
o.g = (i1.g op1 i2.g) op3 i3.g
o.b = (i1.b op1 i2.b) op4 i3.b
o.a = (i1.a op2 i2.a) op4 i3.a

Which is only 'so' useful. It does give you a single-clock MAD, though.
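
The guessed data flow above can be sketched in Python. This is only an illustration of the speculation in this post, not documented NV40 behaviour: the names `su1`, `su2`, `dual_issue` and `mad` are hypothetical, SU1 is assumed to run a 3/1 split (op1 on r,g,b and op2 on a), SU2 a 2/2 split (op3 on r,g and op4 on b,a), and SU2 is assumed to take SU1's result directly.

```python
import operator

def su1(i1, i2, op1, op2):
    """Hypothetical first shader unit: 3/1 co-issue (op1 on r,g,b; op2 on a)."""
    return (op1(i1[0], i2[0]), op1(i1[1], i2[1]),
            op1(i1[2], i2[2]), op2(i1[3], i2[3]))

def su2(t, i3, op3, op4):
    """Hypothetical second shader unit: 2/2 co-issue (op3 on r,g; op4 on b,a)."""
    return (op3(t[0], i3[0]), op3(t[1], i3[1]),
            op4(t[2], i3[2]), op4(t[3], i3[3]))

def dual_issue(i1, i2, i3, op1, op2, op3, op4):
    """Assumed restriction: SU2's first operand is always SU1's output."""
    return su2(su1(i1, i2, op1, op2), i3, op3, op4)

def mad(a, b, c):
    """Multiply-add (a * b + c) as one pass: multiply in SU1, add in SU2."""
    return dual_issue(a, b, c,
                      operator.mul, operator.mul,
                      operator.add, operator.add)

result = mad((1.0, 2.0, 3.0, 4.0), (2.0, 2.0, 2.0, 2.0), (0.5, 0.5, 0.5, 0.5))
print(result)
```

Setting all four ops to the same multiply/add pair is what gives the MAD; mixing different ops per slot is where the "only so useful" restriction would bite.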
 
surfhurleydude said:
Stryyder said:
True, most of the article is already common knowledge, but there is some interesting information:

Quote:
NVIDIA UltraShadow II for 4 times the performance in highly shadowed games (e.g. Doom III) comparing to older GPUs


Confirmation of a 32x0 mode

128 pixel shader operations/clock

They mean 4 times the performance of their own older GPUs, not ATI's :LOL:

AFAIK, the NV3x was actually ahead of ATi in terms of shadow performance, as the line simply dominated in all early Doom III benchmarks, and NV35 contained all sorts of upgrades "recommended" by JC himself.

Drinking the Kool-Aid?? Doom III was the only game with shaders that the NV3x didn't choke and die on. Since the NV35 was built to play Doom and Doom was coded to run on the NV35, this shouldn't be surprising. Unfortunately, most people will play more than just Doom 3, and JC will have to release a product that is coded to the DX9x spec.
 
dan2097 said:
Do you have any idea whether their figure of 128 operations/clock is comparable to ATI's 9800XT value of 40 pixel shader operations/clock from here:

They get to the 128 operations/clock like this:

a) 16 pipelines with...
b) 2 Shader Units each...
c) that can each do 4 instructions (on RGB+Alpha) per cycle (per clock I assume)

Thus: 16 x 2 x 4 = 128
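
As a quick check of that multiplication, here is a minimal Python sketch. The function name `peak_ops_per_clock` and its parameter names are made up for illustration; the three factors are the ones listed in a) to c), and the result is a theoretical peak, not a measured figure.

```python
def peak_ops_per_clock(pipelines, units_per_pipeline, components_per_unit):
    """Theoretical peak of per-component operations issued in one clock."""
    return pipelines * units_per_pipeline * components_per_unit

# Factors from the post: 16 pipelines, 2 shader units each, 4 components (RGB + alpha).
nv40_peak = peak_ops_per_clock(pipelines=16, units_per_pipeline=2, components_per_unit=4)
print(nv40_peak)
```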
 
Joe DeFuria said:
Humus said:
dan2097 said:
and there's something about dual-issue vs "co-issue" and how the r3xx can't do dual-issue but the nv40 can, no idea how important that is

Not that important. Most code consists of 1-, 3-, 1+3- or 4-component operations. Two-component vectors aren't very common.

The way I read it, dan2097 isn't talking about the vector configuration.

NV4x is able to process dual instructions at the same time. It's not clear from their diagram, but it sounds like each PS unit can process either 3/1 or 2/2 operations. So, NV4x could execute dual 3/1 operations per pipeline per clock.

Though it's possible that nvidia's second shader unit is limited to 2/2 operation only...it's not clear.

(Edit..yeah...and what Zeross said!)

Does that mean Maximum PC was actually right about them doubling the number of shader units to 32, i.e. 2 shader units per pipe?

i.e. 8x the number of shader units the nv35 has :oops:

I see what you're saying: the 2/2 mode (EDIT: in most cases) isn't useful, but the ability to do 2x 3/1 is.
 
Stryyder said:
surfhurleydude said:
Stryyder said:
True, most of the article is already common knowledge, but there is some interesting information:

Quote:
NVIDIA UltraShadow II for 4 times the performance in highly shadowed games (e.g. Doom III) comparing to older GPUs


Confirmation of a 32x0 mode

128 pixel shader operations/clock

They mean 4 times the performance of their own older GPUs, not ATI's :LOL:

AFAIK, the NV3x was actually ahead of ATi in terms of shadow performance, as the line simply dominated in all early Doom III benchmarks, and NV35 contained all sorts of upgrades "recommended" by JC himself.

Drinking the Kool-Aid?? Doom III was the only game with shaders that the NV3x didn't choke and die on. Since the NV35 was built to play Doom and Doom was coded to run on the NV35, this shouldn't be surprising. Unfortunately, most people will play more than just Doom 3, and JC will have to release a product that is coded to the DX9x spec.

Can I ask what the hell you are rambling on about? Fact of the matter is, UltraShadow was put in the NV35 line up because of JC's request, and it enhances STENCIL op performance, not shader performance.
 
pocketmoon_ said:
LeStoffer said:
Thanks, I was already wondering why you would want to do some math on the alpha with only one of the colours (e.g. R+G and B+alpha as they show)

Don't think of them as RGBA but just 4 values.

Yes, I stand corrected...! :oops: Welcome to the wonderful world of advanced shaders for me then... ;)
 
LeStoffer said:
dan2097 said:
Do you have any idea whether their figure of 128 operations/clock is comparable to ATI's 9800XT value of 40 pixel shader operations/clock from here:

They get to the 128 operations/clock like this:

a) 16 pipelines with...
b) 2 Shader Units each...
c) that can each do 4 instructions (on RGB+Alpha) per cycle (per clock I assume)

Thus: 16 x 2 x 4 = 128

This isn't QUITE true.
16 pipes, each with 2 shaders.
Each of those shaders can execute 2 instructions.
Those 2 instructions can have a total of 4 ops,
i.e. vec3/scalar, or 2 vec2s.
So there are 128 operations, but only 64 instructions, the way I read it.
The difference is that if you have 4 scalar adds in a row, they can't all execute in 1 clock on one shader.
It would be a stupid shader to hit this case, though.
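
To make the operations-versus-instructions distinction concrete, here is a rough Python sketch of that reading. Everything in it is an assumption for illustration: the function `clocks_on_one_unit` is hypothetical, it packs instructions greedily and in order, it treats them as independent (no data dependencies), and it uses the limits stated above of at most 2 instructions and 4 components per shader unit per clock.

```python
PIPES = 16
UNITS_PER_PIPE = 2
MAX_INSTR_PER_UNIT = 2   # co-issue: up to two instructions per unit per clock
MAX_COMPONENTS = 4       # those instructions share four component slots

peak_instructions = PIPES * UNITS_PER_PIPE * MAX_INSTR_PER_UNIT
peak_operations = PIPES * UNITS_PER_PIPE * MAX_COMPONENTS

def clocks_on_one_unit(widths, max_instr=MAX_INSTR_PER_UNIT,
                       max_components=MAX_COMPONENTS):
    """Clocks needed by one unit for a list of instruction widths (1..4 components),
    packing greedily in program order under the assumed co-issue limits."""
    clocks = 0
    instr_in_clock = 0
    comps_in_clock = 0
    for w in widths:
        if not 1 <= w <= max_components:
            raise ValueError("instruction width must be between 1 and max_components")
        fits = (clocks > 0
                and instr_in_clock < max_instr
                and comps_in_clock + w <= max_components)
        if not fits:
            # Start a new clock for this instruction.
            clocks += 1
            instr_in_clock = 0
            comps_in_clock = 0
        instr_in_clock += 1
        comps_in_clock += w
    return clocks

print(peak_instructions, peak_operations)
print(clocks_on_one_unit([3, 1]))        # vec3 + scalar pair
print(clocks_on_one_unit([2, 2]))        # two vec2s
print(clocks_on_one_unit([1, 1, 1, 1]))  # four scalar adds in a row
```

Under these assumptions the four scalar adds need two clocks even though they only touch four components, which is the case described above.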
 