Put Your GeForce FX Questions here...

Also:

- how independent is each pixel pipeline?

- can different pipelines execute different shaders simultaneously?

- can different pipelines execute different instructions of the same
shader simultaneously?

- can different pipelines process pixels from different triangles (with
the same shader) simultaneously?

- can different pipelines process arbitrary pixels from the same
triangle simultaneously, or are pixels processed in 2x2 or 2x4
or 4x2 blocks?

- if pixels are processed in 2x2 blocks (which would seem likely given
the presence of DDX and DDY instructions in the pixel shader),
what happens if a triangle is only one or two pixels large? Do
6-7 pipelines remain idle, or is some kind of internal supersampling
performed?
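
To illustrate why DDX/DDY suggest 2x2 blocks, here is a minimal sketch (plain C; the names and pixel layout are mine, nothing NVIDIA has published) of how a quad-based pipeline gets screen-space derivatives almost for free:

Code:
/* Sketch: why 2x2 "quads" make DDX/DDY cheap. If the four pixels
   of a 2x2 block run in lockstep, a derivative is just the
   difference between neighbouring pixels' values.
   Layout assumed here: index 0 1
                              2 3   (x varies fastest). */
#include <stdio.h>

typedef struct { float v[4]; } Quad; /* one shader value per pixel */

/* d/dx: right pixel minus left pixel in the pixel's row */
float ddx(const Quad *q, int pixel) {
    int row = pixel / 2;
    return q->v[row * 2 + 1] - q->v[row * 2 + 0];
}

/* d/dy: bottom pixel minus top pixel in the pixel's column */
float ddy(const Quad *q, int pixel) {
    int col = pixel % 2;
    return q->v[2 + col] - q->v[col];
}

int main(void) {
    /* e.g. a texture coordinate sampled at the four pixel centers */
    Quad u = { { 0.10f, 0.15f, 0.30f, 0.35f } };
    printf("ddx = %f, ddy = %f\n", ddx(&u, 0), ddy(&u, 0));
    /* A 1-pixel triangle still occupies a whole quad: the other
       three pixels exist only to feed these differences, which is
       exactly the idle-pipe concern above. */
    return 0;
}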
 
How powerful is the pixel-culling engine? How much better is it than the GF4's and the R300's?
 
Dave, please ask about the pixel processing format the NV30 uses. How does it arrange its ALU ops? Can it issue a scalar, a vector, and a texture address op per cycle (like the R300)?

These questions are integral to predicting how feasible long shaders really are. Only an architecture that can execute many instructions per cycle will remain competitive. Psurge's questions are also very interesting and should be put forward as well. I second the motion.
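
To make the co-issue question concrete, here is a toy cycle-count model (plain C, purely illustrative of the concept; it describes neither chip) of what pairing a vector op with an independent scalar op buys over strict one-op-per-clock issue:

Code:
/* Toy model: cycles to run a shader on (a) a pipe that issues one
   ALU op per clock vs (b) a pipe that can pair a vector (rgb) op
   with an independent scalar (alpha) op in the same clock. */
#include <stdio.h>

typedef enum { OP_VEC, OP_SCALAR } OpType;

int cycles_no_coissue(const OpType *ops, int n) {
    (void)ops;
    return n; /* one op per clock, full stop */
}

int cycles_coissue(const OpType *ops, int n) {
    int cycles = 0;
    for (int i = 0; i < n; ) {
        /* a vector op next to a scalar op dual-issues in one cycle */
        if (i + 1 < n && ops[i] != ops[i + 1]) i += 2;
        else i += 1;
        cycles++;
    }
    return cycles;
}

int main(void) {
    OpType shader[] = { OP_VEC, OP_SCALAR, OP_VEC, OP_VEC,
                        OP_SCALAR, OP_VEC, OP_SCALAR, OP_SCALAR };
    int n = sizeof shader / sizeof shader[0];
    printf("no co-issue: %d cycles, co-issue: %d cycles\n",
           cycles_no_coissue(shader, n), cycles_coissue(shader, n));
    return 0;
}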
 
Dave,

another FSAA question:

- what about 4xMSAA--rotated grid, ordered grid, something "special"?

ta,
-Sascha.rb
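
PS: for reference, here are the two textbook patterns written as sample offsets inside a pixel (the actual NV30 positions are exactly what is being asked; these values are just common illustrations):

Code:
/* Textbook 4x sample patterns, as offsets inside a 1x1 pixel.
   Ordered grid: samples on a regular 2x2 lattice, so near-vertical
   and near-horizontal edges only get 2 distinct coverage steps.
   Rotated grid: the lattice is tilted, so both axes see 4 steps. */
typedef struct { float x, y; } Sample;

static const Sample ordered_grid_4x[4] = {
    { 0.25f, 0.25f }, { 0.75f, 0.25f },
    { 0.25f, 0.75f }, { 0.75f, 0.75f },
};

static const Sample rotated_grid_4x[4] = {   /* one common tilt */
    { 0.375f, 0.125f }, { 0.875f, 0.375f },
    { 0.625f, 0.875f }, { 0.125f, 0.625f },
};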
 
Is the occlusion detection system semi-deferring the rendering somehow (at least in polygon batches), or is it pure IMR? If it's an IMR, should application strategies to reduce overdraw remain the same as on the previous-generation GF3/GF4?
Edit: :oops: It was already answered; I've read the article before. It's an IMR, end of story.
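
So presumably the usual IMR recipe still applies: submit opaque batches roughly front-to-back so early Z rejection kills occluded pixels before they are shaded. A minimal sketch of that (plain C, all names mine):

Code:
/* Sketch: the standard overdraw strategy on an IMR with early Z -
   sort opaque batches roughly front-to-back by view-space depth
   so later (occluded) pixels fail the Z test before shading. */
#include <stdlib.h>

typedef struct {
    float view_depth;   /* e.g. distance of batch center to camera */
    int   batch_id;     /* hypothetical handle into the app's data */
} DrawBatch;

static int nearest_first(const void *a, const void *b) {
    float da = ((const DrawBatch *)a)->view_depth;
    float db = ((const DrawBatch *)b)->view_depth;
    return (da > db) - (da < db);
}

void submit_opaque(DrawBatch *batches, size_t n) {
    qsort(batches, n, sizeof *batches, nearest_first);
    /* ...then issue the draw calls in this order; transparent
       geometry still goes back-to-front afterwards as usual. */
}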
 
no_way said:
Is the occlusion detection system semi-deferring the rendering somehow (at least in polygon batches), or is it pure IMR? If it's an IMR, should application strategies to reduce overdraw remain the same as on the previous-generation GF3/GF4?

Errrr, perhaps one should read the site a little; IMO that's already answered.
 
What are the details of the subdivision surface implementation they've used for the Ogre and for Dawn? Is it Catmull-Clark subdivision, Doo-Sabin, or something different? Do they tessellate on the CPU or the GPU? How does the adaptive subdivision work; is it the same level for the whole mesh?
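
For context, the classical Catmull-Clark rules are simple to state; here is a sketch of the three point rules for one refinement step (plain C; the mesh adjacency plumbing is omitted, and this is just the textbook scheme, not a claim about what NVIDIA implemented):

Code:
/* Sketch: the three Catmull-Clark point rules for one subdivision
   step. Gathering the neighbours (face points, edge midpoints,
   valence n) is the mesh-data-structure part omitted here. */
typedef struct { float x, y, z; } Vec3;

static Vec3 vavg(const Vec3 *p, int n) {
    Vec3 s = { 0, 0, 0 };
    for (int i = 0; i < n; i++) {
        s.x += p[i].x; s.y += p[i].y; s.z += p[i].z;
    }
    s.x /= n; s.y /= n; s.z /= n;
    return s;
}

/* Face point: average of the face's vertices. */
Vec3 face_point(const Vec3 *verts, int n) { return vavg(verts, n); }

/* Edge point: average of the edge's two endpoints and the two
   adjacent face points. */
Vec3 edge_point(Vec3 a, Vec3 b, Vec3 f1, Vec3 f2) {
    Vec3 p[4] = { a, b, f1, f2 };
    return vavg(p, 4);
}

/* Vertex point: (F + 2R + (n-3)S) / n, where F = average of the n
   adjacent face points, R = average of the n adjacent edge
   midpoints, S = the old vertex, n = its valence. */
Vec3 vertex_point(Vec3 F, Vec3 R, Vec3 S, int n) {
    Vec3 v;
    v.x = (F.x + 2*R.x + (n - 3)*S.x) / n;
    v.y = (F.y + 2*R.y + (n - 3)*S.y) / n;
    v.z = (F.z + 2*R.z + (n - 3)*S.z) / n;
    return v;
}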
 
Dave,

We're going to hear an awful lot about their long pixel shader programs and how essential they're gonna be and yada-yada-yada...

We also know that ATI disputes this and basically says that a few more passes won't hurt performance that much, since we're already talking slow speeds with 160+ shader-op programs.

Okay, then let's settle this once and for all: ask nVidia to show you a shader program that in practice will run that much faster on the NV30 because the R300 has to multipass. (And then later show the program to ATI and let them have a rebuttal.)
 
Okay, then let's settle this once and for all: ask nVidia to show you a shader program that in practice will run that much faster on the NV30 because the R300 has to multipass. (And then later show the program to ATI and let them have a rebuttal.)

You and I both know that is not gonna happen...

First of all, NVIDIA doesn't have any demos that actually use more than 160 instructions (the largest demo they have uses 100+ instructions, and the R300 can run that shader as well, without multipassing).

More than 160 instructions is not practical in real-time applications (offline rendering is a different matter entirely), which is probably why ATI has limited itself to that number.

Second of all, long shaders, even without any type of flow control, can bring us previously unachievable results and raise the level of graphics in a significant way! I bet that the hottest games based around the DX9 specs won't be using more than 100 instructions in the best case...
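
And just to spell out mechanically what "has to multipass" means: when a shader exceeds the per-pass instruction limit, it gets chopped into passes, and the live intermediate values are carried between passes through a render target. A toy sketch (plain C; the 160 limit is used purely as an illustration):

Code:
/* Toy sketch of multipassing a too-long shader: chop the
   instruction stream into chunks of at most MAX_OPS; between
   chunks, write the live intermediate result to a texture that
   the next pass reads back. The costs are the extra framebuffer
   traffic and re-rasterisation per pass, which is the whole
   NV30-vs-R300 argument here. */
#include <stdio.h>

#define MAX_OPS 160   /* per-pass limit a la R300 (illustrative) */

void run_shader_multipass(int total_ops) {
    int pass = 0;
    for (int done = 0; done < total_ops; done += MAX_OPS) {
        int ops = total_ops - done;
        if (ops > MAX_OPS) ops = MAX_OPS;
        pass++;
        printf("pass %d: %d ops%s\n", pass, ops,
               done + ops < total_ops
                   ? " -> spill intermediates to texture" : "");
    }
    printf("%d ops => %d passes\n", total_ops, pass);
}

int main(void) {
    run_shader_multipass(350); /* e.g. a 350+ instruction shader */
    return 0;
}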
 
alexsok said:
First of all, NVIDIA doesn't have any demos that actually use more than 160 instructions (the largest demo they have uses 100+ instructions, and the R300 can run that shader as well, without multipassing).

Ummmmm....
 
Well... I recalled the 100+ figure, but it seems to actually be 350+... my mistake.

However, my original point still stands:
Yes, it's true that if every pixel used thousands of shader instructions the performance wouldn't be playable; however, what we are trying to achieve here is not to give the developer any limitations.
And besides, do you really think 350+ instructions is practical in anything other than tech demos?

I'd like to see the R300 running that Time Machine demo; the difference in performance...
 
Was the 51 GFLOPS in the PS a marketing goof with numbers from a time when the core clock was meant to be 400 MHz? (Consequently it should be 64 GFLOPS at 500 MHz.)

I just can't make any sense of the combination of 500 MHz and 51 GFLOPS. At 400 MHz it makes perfect sense, though.

OK, this is yet another "how fast are the pixel shaders" question. But at least I had a different twist on it. :D
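
To spell out the arithmetic: 51.2 GFLOPS / 400 MHz = 128 FLOPs per clock, which factors nicely into 8 pipelines x 16 FLOPs each (for example, two 4-wide FP units per pipe each doing a multiply-add: 2 x 4 x 2 = 16, though that breakdown is my guess). The same 128 FLOPs per clock at 500 MHz would give 64 GFLOPS, whereas 51.2 GFLOPS at 500 MHz implies an awkward 102.4 FLOPs per clock.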
 
alexsok said:
Well... I recalled the 100+ figure, but it seems to actually be 350+... my mistake.

They have a shader which does volumetric rendering in a single pass using a very long shader (don't know how long!).

Check out the Cg shaders over at cgshaders.org - pick a big one and compile it to see how big it is in native shader code :)
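
For example, with NVIDIA's command-line Cg compiler (fp30 being the NV30 fragment profile; the filename here is just a stand-in for whatever you downloaded):

Code:
cgc -profile fp30 bigshader.cg

The assembly it prints gives you the native instruction count.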
 