NV40 poll

Mariner · Mar 23, 2004

Joe DeFuria said:
Sure...but I think the key sticky point there is "sometimes". In other words, under what conditions can said pipeline output more than one pixel per clock?

Every second Tuesday when there is a letter "P" in the month.

KimB · Mar 23, 2004

Hyp-X said:
duncan36 said:

Do people even realize that Nvidia failed at 6x2 with the Nv30?

No, and I don't buy that.

Just think about 6x2 - it can't execute 1.5 quads.

The NV34 and NV31 don't appear to always work on a full quad at once. Not that it happened, but it's not impossible.

Joe DeFuria · Mar 23, 2004

Demirug said:
Joe DeFuria said:

Demirug said:

PS: Sometimes a pipeline can output more than one pixel per clock.

Sure...but I think the key sticky point there is "sometimes". In other words, under what conditions can said pipeline output more than one pixel per clock?

The conditions are part of the pipeline design.

Well, of course.

My point was, most of us here who are "skeptical" of the "16 pipeline" claims, have no trouble believing that under certain conditions, it can act as a 16 pipeline product....that is generating 16 pixels per clock.

Basically, given the fact that nVidia claimed 8 pipelines for NV3x, and given that it was only rendering 8 pixels under certain limited conditions (no color), we're left to wonder if there are similar, different, (or no) restrictions with the NV40's 16 pipelines.

Demirug · Mar 23, 2004

Joe DeFuria said:
Well, of course.

My point was, most of us here who are "skeptical" of the "16 pipeline" claims, have no trouble believing that under certain conditions, it can act as a 16 pipeline product....that is generating 16 pixels per clock.

Basically, given the fact that nVidia claimed 8 pipelines for NV3x, and given that it was only rendering 8 pixels under certain limited conditions (no color), we're left to wonder if there are similar, different, (or no) restrictions with the NV40's 16 pipelines.

Big misunderstanding.

I was talking about this in general, and you were talking in the NV40 context.

We all know that nVidia is able to make an 8x2 chip look like a 16x1 in the case of single texturing. Doing 16 Z/stencil ops with an 8-pipeline chip is known art, too. But if they want to do 32 Z/stencil ops per clock, they need 16 double-pumped pipelines or 8 quad-pumped pipes. They have built double-pumped pipes before, but never a quad-pumped beast.
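
The arithmetic here can be written down as a rough sketch. The function name and the "pump factor" parameter are only illustrative labels for the counts mentioned in this post; nothing in it comes from nVidia documentation.

```python
def z_ops_per_clock(pipes: int, pump_factor: int) -> int:
    """Z/stencil ops per clock if every pipe emits `pump_factor` Z-only results.

    Assumption: every pipe is pumped equally and nothing else (bandwidth,
    rasterizer output) limits the rate.
    """
    return pipes * pump_factor


# Known art: an 8-pipe chip that is double pumped for Z/stencil.
print(z_ops_per_clock(8, 2))

# Two ways to reach 32 Z/stencil ops per clock.
print(z_ops_per_clock(16, 2))  # 16 double-pumped pipes
print(z_ops_per_clock(8, 4))   # 8 quad-pumped pipes
```

Both 32-op configurations look the same in this count, so only tests with color and texturing could tell them apart.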

At least the primary reason for 2 TMUs per pipe no longer exists with NV40.

vb · Mar 23, 2004

Demirug said:
At least the primary reason for 2 TMUs per pipe no longer exists with NV40.

Is that confirmed info?

Colourless · Mar 23, 2004

I voted for 8x2/16x0; however, my feeling is that it will be 8x2 with 4 Z units per pipe, giving 8x2/32x0. That said, bandwidth limitations will probably mean that the effective difference between 8x2 and 16x1 in this generation will be small.

Demirug · Mar 23, 2004

vb said:
Demirug said:

At least the primary reason for 2 TMUs per pipe no longer exists with NV40.

Is that confirmed info?

It is a logical reason. How can somebody confirm logic?

KimB · Mar 23, 2004

Yeah, I thought 8x2/16x0 would be the most likely still, given that that's a natural evolution of the NV30. Not that that would necessarily be a bad thing. But it would be more promising if nVidia left that architecture behind more, and went with a 16x1, so I'm cautiously optimistic that that's the case.

stevem · Mar 23, 2004

There is a BIG push by partners to espouse the 16x1 mantra. Although I hate to be a me-too, from some conditional comments, I also thought the underlying organization to be 8x2 & 4 Z check/pipe for 8x2/32x0 ("quad-pumped")... Comments from Hyp-X & Demirug also make sense. Now we need some extended multitexturing/filtering tests once boards arrive... I also need confirmation for the 2/4 quads/cycle theory...

WaltC · Mar 23, 2004

Demirug said:
First you need a rasterizer that is able to output enough quads per clock to fill the pixel processors. On the other side of the pixel processor you need enough ports to transfer the information to the ROPs. And of course you need enough ROPs to finish.

Let's look at the NV30. The pixel processor is able to output 4 pixels with color and Z information, or 8 pixels with only Z information. I am not sure if there are two Z-calculation units in the pixel processor or if nVidia uses some parts of the normal color calculation for the 4 additional Z values. After the pixel processor they need some additional ROPs. But these ROPs don't need alpha test or alpha blending. Only Z and stencil ops are needed.

A more complete "multipixel per pipe" solution is the NV31. In this case we have two pipes with two TMUs but 4 ROPs after the pixel processor. If the pipeline is used in double-pumped mode, the rasterization unit combines two pixels into one. The texture coordinate of the first pixel is the first coordinate; the second pixel uses the second coordinate. The TMUs work as normal, but at the end of the pixel processor pipe both texture values are transferred to different ROPs. nVidia used something like this before in the NV10.

This all looks like BS to me, that is, if you are talking about final pixels rendered to screen per clock, which is what we are generally talking about when we discuss pixel pipelines as a gpu specification.

Q: What is the component a gpu renders to screen?

A: A final, color pixel

Q: What is the *only* screen element a gpu renders to screen?

A: A final, color pixel (monochrome pixels--not to be confused with black & white z-pixels--rendered to screen are still "color," unless your position is that black, white, and grey shades are not colors, which would be incorrect)

Q: Through what physical mechanism in the gpu are final, color pixels rendered to screen?

A: Pixel pipelines.

Q: What are some internal gpu processes that occur in the formation of the final, color pixels which are rendered to screen?

A: Z, stencil, ROPs, OPs, etc.

Q: Are Z, stencil, ROPs, OPs, etc., *ever* rendered to screen as screen elements?

A: No

Q: What are they, then?

A: Descriptions of operations that occur in the gpu *prior* to the final, color pixel being rendered to screen via the gpu's pixel pipes

Q: If I "combine two pixels into one," how many pixels does my pixel pipe render per clock?

A: One, at maximum.

Q: Never two pixels per clock, per pipeline?

A: Never.

Q: Why?

A: Because when you combine two of anything into one, the result is always one. The "combination" occurs prior to pixel render, and what is rendered is one, final, color pixel.

Q: Is there a maximum number of color pixels per clock which a gpu can render to screen?

A: Yes.

Q: What determines that maximum?

A: The number of pixel pipelines in the gpu.

Q: Does a gpu ever render fewer pixels per clock than the number of pixel pipelines it has?

A: Yes

Q: When does this happen?

A: When operations internal to the pixel creation process in the gpu, such as Z, stencil, ROPs, OPs, etc. in the pre-render stages require more than a single clock to complete, per pixel. This causes the number of final, color pixels rendered to screen *per clock* to be LESS than the total number of pixel pipelines in the gpu. How many fewer per clock depends on the operations performed on each pixel, and on the architecture of the gpu.

Q: Do all gpus have a fixed number of pixel pipelines?

A: Yes, all of them do, regardless of architecture.

Q: Do all gpus handle pixel pre-render processing involving Z, stencil, ROPS, OPs, etc. the same way?

A: No, all gpus of differing architectures handle such pre-rendering operations differently, according to the capabilities of their respective architectures.

Q: What is the biggest current misunderstanding today with regard to what pixel pipes produce?

A: That they produce anything apart from a final, color pixel which is rendered to screen. Hence they are called "pixel pipelines," which eloquently describes their function and purpose.

Q: I wasn't sure that I understood all of this. Tell me, again, what the final product of pixel pipelines in all 3d gpus is, regardless of architecture.

A: A color pixel

Demirug · Mar 24, 2004

WaltC, I can accept that you say a pixel is only a pixel if it contains color information. What should we call pixels with only Z and/or stencil information that do not change the current pixel color?

Let us count only pixels with color information and take a look at NV31.

Q: How many pixels (with texture information) can it render per clock?

A: 4 pixels per clock.

Q: How many pipes does NV31 have?

A: 2 pipes.

Q: How can we prove that there are only two pipes?

A: Render pixels with different numbers of texture layers. 3 and 4 layers take the same time; 5 and 6 layers do, too. From the results you can calculate that the chip is working as 2x2.

Q: If we have only 2 pipes but can output up to 4 pixels per clock, can each pipe output 2 pixels per clock?

A: Yes

Q: Is it not possible that the chip is a 4-pipeline chip that needs to combine two pipes into one in the case of multitexturing?

A: No, it is not. The reason is that one part of the shading process (the shader core) has only two paths (pipes). Each one can do an FP calculation or start 2 texture samples.

Q: We see that it is possible to output more than one pixel per pipe. How is this solved technically?

A: For each work unit, the pipe can store more than one piece of color information. This information is stored in registers. If a pipe works in double-pixel mode, it uses one register for pixel one and another register for pixel two. Each register is filled with the result of a texture sample. At the end of the shader pipe it outputs both registers to the ROPs.
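
The texture-layer test above can be illustrated with a toy model. This is only a sketch under stated assumptions: each pipe needs ceil(layers / TMUs per pipe) clocks per pixel, and the double-pixel mode applies only to single texturing. The function name and parameters are hypothetical and describe no measured hardware.

```python
import math


def pixels_per_clock(layers: int, pipes: int = 2, tmus_per_pipe: int = 2) -> float:
    """Color pixels per clock in a toy pipes x TMUs model (default 2x2)."""
    if layers == 1:
        # Double-pixel mode: each TMU samples for a different pixel,
        # so every pipe hands `tmus_per_pipe` pixels to the ROPs.
        return float(pipes * tmus_per_pipe)
    # Multitexturing: a pipe loops until all layers have been sampled.
    clocks = math.ceil(layers / tmus_per_pipe)
    return pipes / clocks


for n in range(1, 7):
    print(n, "layers: 2x2 =", pixels_per_clock(n),
          " 4x1 =", pixels_per_clock(n, pipes=4, tmus_per_pipe=1))
```

In this model the 2x2 layout gives the same rate for 3 and 4 layers and again for 5 and 6, while a 4x1 layout would slow down with every added layer. Equal times for 3 and 4 layers are therefore what points to 2 TMUs per pipe.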

Joe DeFuria · Mar 24, 2004

Demirug said:
WaltC, I can accept that you say a pixel is only a pixel if it contains color information. What should we call pixels with only Z and/or stencil information that do not change the current pixel color?

I thought that was settled...it's a "zixel" (TM).

Demirug · Mar 24, 2004

Joe DeFuria said:
Demirug said:

WaltC, I can accept that you say a pixel is only a pixel if it contains color information. What should we call pixels with only Z and/or stencil information that do not change the current pixel color?

I thought that was settled...it's a "zixel" (TM).

Yes, I use this word sometimes too.

If we use it here I can say that a pipe is able to output more than one zixel per clock.

Zixels are very important if you use the DOOM III rendering technique.

Joe DeFuria · Mar 24, 2004

Demirug said:
If we use it here I can say that a pipe is able to output more than one zixel per clock.

Zixels are very important if you use the DOOM III rendering technique.

Agreed, on both counts.

Luminescent · Mar 24, 2004

A: For each work unit, the pipe can store more than one piece of color information. This information is stored in registers. If a pipe works in double-pixel mode, it uses one register for pixel one and another register for pixel two. Each register is filled with the result of a texture sample. At the end of the shader pipe it outputs both registers to the ROPs.

What conditions must be met for the above scenario to occur in NV31? Is it only plausible for shader ops or cases of single texturing?

Sorry if these questions were answered previously; I did read through the thread, though.

Demirug · Mar 24, 2004

Luminescent said:
A: For each work unit, the pipe can store more than one piece of color information. This information is stored in registers. If a pipe works in double-pixel mode, it uses one register for pixel one and another register for pixel two. Each register is filled with the result of a texture sample. At the end of the shader pipe it outputs both registers to the ROPs.

What conditions must be met for the above scenario to occur in NV31? Is it only plausible for shader ops or cases of single texturing?

Sorry if these questions were answered previously; I did read through the thread, though.

Single texturing is the only condition. Maybe it can work with a single shader op per pixel too. But more than one texture or shader op is a no-go for this "feature".

KimB · Mar 24, 2004

Demirug said:
Zixels are very important if you use the DOOM III rendering technique.

More like: being able to render to the z-buffer faster than to the color buffer can improve performance if the game does a z-only pass first. I don't see why this has to be tied to the DOOM3 rendering technique, as it'll help out any hardware that has early pixel kill on z-fail, provided the rendering algorithm requires multiple passes.
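
A toy model of why a z-only pass helps, looking at a single screen pixel. It assumes that a smaller depth value means nearer, that the depth test is "less or equal", and that early z-kill rejects a fragment before any color shading. The function name is made up for illustration.

```python
def shaded_fragments(depths_in_draw_order: list[float], z_prepass: bool) -> int:
    """Count fragments that reach color shading for one screen pixel."""
    if not depths_in_draw_order:
        return 0
    if z_prepass:
        # The z-only pass has already left the nearest depth in the z-buffer,
        # so early z-kill rejects every fragment behind it.
        nearest = min(depths_in_draw_order)
        return sum(1 for d in depths_in_draw_order if d <= nearest)
    # No prepass: every fragment nearer than what is stored so far gets shaded.
    shaded = 0
    z_buffer = float("inf")
    for d in depths_in_draw_order:
        if d <= z_buffer:
            z_buffer = d
            shaded += 1
    return shaded


back_to_front = [0.9, 0.7, 0.5, 0.3]  # worst-case draw order
print(shaded_fragments(back_to_front, z_prepass=False))
print(shaded_fragments(back_to_front, z_prepass=True))
```

With the worst-case order all four fragments are shaded without the prepass and only one with it. The faster the chip can lay down Z, the cheaper that extra pass becomes.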

NV40 poll

What is IYO the most likely NV40 pipeline configuration?

4x2/8x0

4x2/8x1

8x1

8x2/16x0

16x1
