textures per pass

muppy

Newcomer
I have a bit of confusion in mind and I need some explanation ;)

1) What does "textures per pass" mean?

2) Here
http://www.xbitlabs.com/articles/video/display/tyan-tachyon-g9000.html
it is written that R200 has 4 pixel pipelines and 2 texturing units per pipeline, and RV250 has 4 pixel pipelines and 1 texturing unit per pipeline. It is also written that RV250 as well as R200 can process 6 textures per pass. Why? Shouldn't R200 be able to process 8 textures per pass (4*2=8)?

Sorry for my bad English
Thanks
Muppy
 
DaveBaumann said:
http://www.beyond3d.com/reviews/ati/radeon9700pro/index.php?page=page17.inc#multipass

1 pass=1 clock cycle, right?
It doesn't seem this link explains why RV250 as well as R200 can process 6 textures per pass... :rolleyes:
Thanks
Muppy
 
Hmm, I didn't know that Quake3 forced another pass on surfaces where more than two layers are required. Hmmm. I made some Kyro-specific Q3 shaders that took no less than 5 layers... and when I tried them in maps the Kyro choked on them. No wonder! :devilish:
 
muppy said:
1 pass=1 clock cycle, right?

No - when we talk about 1 pass (in relation to multipass) we are talking about a geometry pass.

When the geometry data is sent to the graphics board it doesn't just contain the positions of the geometry, but also the associated texture information, such as which textures are applied to it and where each texture is positioned relative to that piece of geometry. If you have a piece of geometry that requires 4 layers, but your board can only handle two per pass, then the geometry data has to be sent (and processed) twice: the first time with the information for the first two layers and the second time with the information for the third and fourth texture layers. On the first pass the results are sent to the frame buffer. On the second pass the results are blended with the first pass by reading the first-pass results from frame buffer memory, combining the results from the first and second passes, and then passing the final pixel (with the information from all four texture layers) back out to the frame buffer.

Multipassing is costly since it wastes processing on passing the same geometry multiple times, and the intermediate read/combine/write step also wastes frame buffer bandwidth. By allowing more texture layers per pass (without needing more texture units per pipeline, by storing the intermediate results on chip) this costly step is reduced and hence there is less wasted processing.

(BTW - this principle applies to a lot more than just texturing these days, which is why things such as shader instruction counts and the F-Buffer are so widely talked about)
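
To put rough numbers on the 4-layer example, here is a minimal sketch of the per-pixel bookkeeping. The function name and the dictionary keys are made up for illustration, and the model assumes every pass after the first does exactly one frame buffer read to blend with the earlier result.

```python
import math

def multipass_cost(layers, textures_per_pass):
    """Per-pixel cost of drawing `layers` texture layers when the board can
    apply at most `textures_per_pass` layers in one geometry pass.
    Hypothetical helper, not taken from any driver or API."""
    passes = math.ceil(layers / textures_per_pass)
    return {
        "geometry_passes": passes,         # times the geometry is sent and processed
        "framebuffer_writes": passes,      # every pass writes its result out
        "framebuffer_reads": passes - 1,   # every later pass reads the earlier result back
    }

# 4 layers on a board limited to 2 per pass, then on one that handles 4 per pass
print(multipass_cost(4, 2))
print(multipass_cost(4, 4))
```

With 2 layers per pass the geometry goes through twice and one extra read/combine/write round trip appears; with 4 layers per pass both overheads disappear.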
 
Ozymandis said:
Hmm, I didn't know that Quake3 forced another pass on surfaces where more than two layers are required. Hmmm. I made some Kyro-specific Q3 shaders that took no less than 5 layers... and when I tried them in maps the Kyro choked on them. No wonder! :devilish:

I don't understand what this means
Muppy
 
DaveBaumann said:
muppy said:
1 pass=1 clock cycle, right?

No - when we talk about 1 pass (in relation to multipass) we are talking about a geometry pass.

When the geometry data is sent to the graphics board it doesn't just contain the positions of the geometry, but also the associated texture information, such as which textures are applied to it and where each texture is positioned relative to that piece of geometry. If you have a piece of geometry that requires 4 layers, but your board can only handle two per pass, then the geometry data has to be sent (and processed) twice: the first time with the information for the first two layers and the second time with the information for the third and fourth texture layers. On the first pass the results are sent to the frame buffer. On the second pass the results are blended with the first pass by reading the first-pass results from frame buffer memory, combining the results from the first and second passes, and then passing the final pixel (with the information from all four texture layers) back out to the frame buffer.

Multipassing is costly since it wastes processing on passing the same geometry multiple times, and the intermediate read/combine/write step also wastes frame buffer bandwidth. By allowing more texture layers per pass (without needing more texture units per pipeline, by storing the intermediate results on chip) this costly step is reduced and hence there is less wasted processing.

(BTW - this principle applies to a lot more than just texturing these days, which is why things such as shader instruction counts and the F-Buffer are so widely talked about)

Many thanks for the useful explanation, but I don't understand why RV250 as well as R200 can process 6 textures per pass
And then, 1 pass=1 clock cycle, right?
Many thanks ;)
Muppy
 
muppy said:
1 pass=1 clock cycle, right?
It doesn't seem this link explains why RV250 as well as R200 can process 6 textures per pass... :rolleyes:
Thanks
Muppy
No. A pass is the process of calculating a color value and writing it to the frame buffer, possibly blending it with the value already in the frame buffer. The time it takes is not relevant for the definition of a pass.

If a chip only supports 4 textures per pass, but you want 5 textures to contribute to the final color of a pixel, you have to do multiple passes. First combine four textures and write the result to the framebuffer, then do a second pass to read from the fifth texture and combine the two values via blending.

The number of textures per pass is not limited by the number of TMUs per pipe, but rather by the number of "texture stage register sets" that hold such information as where the texture is located in memory, its size, pixel format, number of mip levels, kind of filtering, etc.
And the pipeline has to support loop-back capabilities, i.e. take the calculated result as input for another calculation.
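
A minimal sketch of the 5-texture case above, under loud assumptions: each texel is a single brightness value, every layer is combined by plain multiplication (modulate), and the frame buffer keeps full precision. The function names are hypothetical and not part of any real API.

```python
import math

def split_into_passes(texels, textures_per_pass):
    """Group the layers into chunks that fit the per-pass limit."""
    return [texels[i:i + textures_per_pass]
            for i in range(0, len(texels), textures_per_pass)]

def shade_pass(texels):
    """Loop-back inside one pass: each result feeds the next stage."""
    color = 1.0
    for texel in texels:
        color *= texel
    return color

def render(texels, textures_per_pass):
    """Run as many passes as needed, blending each into the frame buffer."""
    framebuffer = None
    for group in split_into_passes(texels, textures_per_pass):
        color = shade_pass(group)
        # The first pass just writes; later passes blend with the stored value.
        framebuffer = color if framebuffer is None else framebuffer * color
    return framebuffer

texels = [0.9, 0.8, 0.5, 0.75, 0.6]        # five made-up layer values
print(split_into_passes(texels, 4))        # four layers in pass 1, one in pass 2
print(math.isclose(render(texels, 4), render(texels, 6)))
```

In this idealized model the final color is the same either way; what differs is the extra pass and the frame buffer round trip it requires.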
 
muppy said:
Many thanks for the useful explanation, but I don't understand why RV250 as well as R200 can process 6 textures per pass
And then, 1 pass=1 clock cycle, right?
Many thanks ;)
Muppy

A whole pass takes a "very long time".

Even one (pass * pixel) may take many clock cycles,
depending on antialiasing mode, texture filtering mode,
pixel shader code (including number of textures used),
texture cache hits/misses etc...

The "simplest possible pixel" usually takes one clock cycle to render, but when there is too much to do for one pixel in one clock cycle, modern gfx chips just take another (and maybe another after that) clock cycle immediately after the first one for the pixel.

Some (older) chips could not do this, but they could combine 2 pipelines into one "more capable pipeline". Each (pixel*pass) was then rendered in 1 clock cycle, but there were fewer pipelines available, and the performance effect was about the same.

For a (simplified) example, if we are doing multitexturing with 5 textures:
GF3 can only use 4 textures/pass, so it has to split the work into at least 2 passes and do lots of "extra work".

R200 can do all the textures in one pass and it can access two different textures/clock cycle/pipeline ("2 TMUs/pipeline"),
so R200 will crunch each pixel for 3 clock cycles.

RV250 can do all the textures in one pass, and it can access only one different texture/clock cycle/pipeline ("1 TMU/pipeline"),
so RV250 will crunch each pixel for 5 clock cycles.
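
The same arithmetic as a small sketch. The function is hypothetical and only counts texture fetch clocks; filtering, cache misses, and the cost of resending geometry are ignored. The 6 textures/pass figures come from the article linked in the first post, while 2 TMUs per pipeline for GF3 is an assumption not stated in this thread.

```python
import math

def clocks_per_pixel(num_textures, textures_per_pass, tmus_per_pipe):
    """Return (passes, total clocks one pipeline spends on one pixel)."""
    clocks_in_each_pass = []
    remaining = num_textures
    while remaining > 0:
        in_this_pass = min(remaining, textures_per_pass)
        # A pipe with N TMUs fetches N textures per clock and loops back for more.
        clocks_in_each_pass.append(math.ceil(in_this_pass / tmus_per_pipe))
        remaining -= in_this_pass
    return len(clocks_in_each_pass), sum(clocks_in_each_pass)

chips = {
    "GF3":   (4, 2),   # 4 textures/pass; 2 TMUs/pipe is an assumption
    "R200":  (6, 2),   # 6 textures/pass, 2 TMUs/pipe
    "RV250": (6, 1),   # 6 textures/pass, 1 TMU/pipe
}

for name, (textures_per_pass, tmus) in chips.items():
    passes, clocks = clocks_per_pixel(5, textures_per_pass, tmus)
    print(f"{name}: {passes} pass(es), {clocks} clocks per pixel")
```

So "6 textures per pass" is a limit on how many layers fit in one pass, not pipelines times TMUs. The TMU count only decides how many clocks that pass takes per pixel.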
 
Ozymandis said:
Hmm, I didn't know that Quake3 forced another pass on surfaces where more than two layers are required. Hmmm. I made some Kyro-specific Q3 shaders that took no less than 5 layers... and when I tried them in maps the Kyro choked on them. No wonder! :devilish:
Well, at least Serious Sam was written correctly :)

muppy said:
Ozymandis said:
Hmm, I didn't know that Quake3 forced another pass on surfaces where more than two layers are required. Hmmm. I made some Kyro-specific Q3 shaders that took no less than 5 layers... and when I tried them in maps the Kyro choked on them. No wonder! :devilish:

I don't understand what this means
Muppy
Kyro supports 8 texture layers per polygon (in DX) and so games that use multiple texture layers should be able to do this many layers in a single pass.
Unfortunately, it appears Quake3 can only count to 2 before spawning a new geometry pass. IIRC this matched the maximum capability of chips from one manufacturer at the time, but I suspect this limitation was probably very annoying for the other IHVs.
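
As a rough illustration of why the 5-layer shaders hurt, assume the engine's cap simply overrides the hardware limit whenever it is lower. The function and its parameter names are invented for this sketch and are not Quake3 code.

```python
import math

def geometry_passes(layers, hardware_limit, engine_limit=None):
    """Passes needed when the engine may cap layers per pass below the hardware limit."""
    per_pass = hardware_limit if engine_limit is None else min(hardware_limit, engine_limit)
    return math.ceil(layers / per_pass)

# A 5-layer shader on a chip that supports 8 layers per pass
print(geometry_passes(5, hardware_limit=8, engine_limit=2))  # engine counts to 2
print(geometry_passes(5, hardware_limit=8))                  # engine uses the full limit
```

With the cap at 2 the 5-layer surface needs three geometry passes, where the hardware alone could have done it in one.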
 