#1 | |||
|
Member
Join Date: Jul 2003
Location: Beijing
Posts: 640
|
I've just read a whitepaper named "Xbox Pixel Shader Performance", which you can find in the latest Xbox SDK.
Here are some quotes: Quote:
Quote:
Quote:
|
|||
|
|
|
|
|
#2 |
|
Member
Join Date: Jul 2003
Location: Beijing
Posts: 640
|
And, if such restrictions do exist, where do they come from?
I think the single-triangle restriction exists because the PS pipelines need data that is interpolated across the triangle, so shading pixels from two triangles at once doesn't make sense. Please correct me if I'm wrong. So what about the single-quad restriction? According to the "NV30 inside" article published on 3dcenter.org, NV30 does have such a restriction. What about R300/350? They have 8 pixel shader pipelines, so do they process 2 quads at a time? |
|
|
|
|
|
#3 | |
|
Tea maker
Join Date: Feb 2002
Location: In the Island of Sodor, where the steam trains lie
Posts: 4,379
|
Quote:
__________________
"Your work is both good and original. Unfortunately the part that is good is not original and the part that is original is not good." -(attributed to) Samuel Johnson "I invented the term Object-Oriented, and I can tell you I did not have C++ in mind." Alan Kay |
|
|
|
|
|
|
#4 | |
|
Irregular
Join Date: Feb 2002
Posts: 1,170
|
Quote:
The 2 quads are processed independently, can take different processing times, and can belong to different triangles. The screen is split into a checkerboard pattern of 16x16 tiles: one of the units processes the "black" tiles, the other the "white" tiles. |
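To make the checkerboard idea concrete, here is a minimal sketch of how a pixel could be mapped to one of the two quad units. It assumes that "16x16" means tiles of 16x16 pixels and that ownership simply alternates with tile parity; the function name and the 0/1 unit labels are hypothetical, and real hardware may do this differently.

```python
TILE_SIZE = 16  # assumed tile edge in pixels, taken from the "16x16" in the post


def quad_unit_for_pixel(x: int, y: int, tile_size: int = TILE_SIZE) -> int:
    """Return 0 (the "black" unit) or 1 (the "white" unit) for pixel (x, y).

    Assumption: tiles alternate like a checkerboard, so the owner is the
    parity of the tile's column index plus its row index.
    """
    tile_x = x // tile_size
    tile_y = y // tile_size
    return (tile_x + tile_y) % 2


if __name__ == "__main__":
    # Print the owning unit for the first pixel of a few neighbouring tiles.
    for y in (0, 16, 32):
        print([quad_unit_for_pixel(x, y) for x in (0, 16, 32, 48)])
```

With this kind of split, a large triangle covers roughly as many "black" tiles as "white" ones, so both units stay busy without having to coordinate with each other.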
|
|
|
|
|
|
#5 |
|
Member
Join Date: Jul 2003
Location: Beijing
Posts: 640
|
Thanks Simon F and Hyp-x, those comments are helpful, though I don't quite understand the "16x16 pattern" thing.
Another question: with the advent of DFC, it's quite possible that different pixels within a single quad need different processing times. Is it safe to say that the processing time a quad needs is that of its slowest pixel? If so, I think parallelism is reduced as more pixel shader pipelines reside in a single GPU, because the probability that all the pixels processed at one time need equal processing time drops very quickly as pipelines are added. Is it possible for IHVs to design their hardware to assign each pixel an independent pipeline rather than assigning 4 pipelines to a quad? How efficient would that approach be? |
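The "slowest pixel" question can be explored with a toy model. The sketch below assumes that every pixel in a lockstep group waits for the slowest one, and that each pixel independently takes either a short or a long branch; the cycle counts (10 and 40) and the 50/50 branch odds are made up for illustration and do not describe any real chip.

```python
import random


def lockstep_utilization(costs):
    """Useful work divided by total lane-cycles when all lanes wait for the slowest."""
    return sum(costs) / (len(costs) * max(costs))


def average_utilization(group_size, trials=10_000, seed=0):
    """Average utilization over random groups of `group_size` pixels.

    Hypothetical cost model: each pixel takes a 10-cycle or a 40-cycle
    branch with equal probability, independently of its neighbours.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        costs = [rng.choice((10, 40)) for _ in range(group_size)]
        total += lockstep_utilization(costs)
    return total / trials


if __name__ == "__main__":
    for size in (1, 4, 16, 64):
        print(size, round(average_utilization(size), 3))
```

In this model a group of one pixel is always fully utilized, and utilization falls as the lockstep group grows. The model deliberately ignores the reasons pixels are grouped in the first place, and neighbouring pixels in a real scene usually take the same branch, which the independent coin flip here does not capture.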
|
|
|
|
|
#6 |
|
Senior Member
Join Date: Jul 2002
Location: UK
Posts: 1,758
|
This kind of thing is known as 'granularity loss' (the smallest chunk that can be worked on is larger than the smallest chunk of useful data that could be desired).
It's rarely significant on very small triangles - these tend to be vertex limited. In theory it could be somewhat of a problem on long (100s of pixels) and skinny (~1 pixel) triangles, but I've never actually seen a problem case. I don't like the terminology used below. 'The four pixel pipelines...' is a misnomer because it implies independence. It is one quad pipeline. (Regular readers will have heard this rant before). |
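A rough way to put a number on granularity loss is to compare the pixels a triangle actually covers with the pixel slots in every 2x2 quad it touches. This is only a sketch: it assumes quads are aligned to even pixel coordinates, and the helper name is hypothetical.

```python
def quad_efficiency(covered_pixels):
    """Covered pixels divided by the pixel slots in all 2x2 quads they touch.

    Assumption: quads are aligned to even x and y coordinates.
    """
    covered = set(covered_pixels)
    quads = {(x // 2, y // 2) for x, y in covered}
    return len(covered) / (4 * len(quads))


if __name__ == "__main__":
    skinny = [(x, 0) for x in range(100)]                    # 100 long, 1 pixel wide
    block = [(x, y) for x in range(10) for y in range(10)]   # 10x10 solid block
    print("skinny:", quad_efficiency(skinny))
    print("block: ", quad_efficiency(block))
```

Under these assumptions the long, skinny strip uses only half of the slots it occupies, while the solid block of the same pixel count uses all of them.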
|
|
|
|
|
#7 | |||
|
Tea maker
Join Date: Feb 2002
Location: In the Island of Sodor, where the steam trains lie
Posts: 4,379
|
Quote:
Quote:
Quote:
|||
|
|
|
|
|
#8 |
|
Senior Member
Join Date: Jan 2003
Location: Ottawa, Ontario
Posts: 1,783
|
The restriction is most probably needed to calculate texture gradients, like the way dsx/dsy work. These gradients are then used to compute the mipmap level.
This way, whatever transformation you apply to the texture coordinates, the mipmap level will still be calculated correctly. This avoids the overblur and aliasing that would otherwise occur with 'linear' mipmap level interpolation. That latter method also requires more operations, especially when using adaptive anisotropic filtering. Should save some silicon... |
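As an illustration of why a 2x2 quad is convenient here, the sketch below takes the texture coordinates of the four pixels in one quad, forms the gradients by differencing neighbours, and derives a mipmap level from them. It uses the common textbook rule of taking log2 of the longer gradient, measured in texels; that rule, the function name, and the data layout are assumptions, not a description of any specific hardware.

```python
import math


def quad_mip_level(uv_quad, texture_size):
    """Estimate one mipmap level for a 2x2 quad.

    uv_quad is [[top_left, top_right], [bottom_left, bottom_right]], each a
    (u, v) pair in normalized [0, 1] coordinates. texture_size is the edge
    length of a square texture in texels.
    """
    (u00, v00), (u10, v10) = uv_quad[0]
    (u01, v01), _unused = uv_quad[1]

    # Finite differences across the quad, scaled to texel units.
    dudx = (u10 - u00) * texture_size
    dvdx = (v10 - v00) * texture_size
    dudy = (u01 - u00) * texture_size
    dvdy = (v01 - v00) * texture_size

    # Assumed isotropic rule: use the longer of the two gradient vectors.
    rho = max(math.hypot(dudx, dvdx), math.hypot(dudy, dvdy))
    if rho <= 1.0:
        return 0.0  # magnification or 1:1 mapping, use the base level
    return math.log2(rho)


if __name__ == "__main__":
    step = 4 / 256  # each screen pixel advances 4 texels on a 256-texel texture
    quad = [[(0.0, 0.0), (step, 0.0)], [(0.0, step), (step, step)]]
    print(quad_mip_level(quad, 256))
```

Because the differences are taken on the final coordinates, it does not matter how the shader arrived at them, which is the point made above about arbitrary transformations.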
|
|
|
|
|
#9 | |
|
Senior Member
Join Date: Jul 2002
Location: UK
Posts: 1,758
|
Quote:
If we can't get away from TMUs and pixel pipelines, anything you or I choose to do is not likely to make much difference |
|
|
|
|
|
|
#10 |
|
Member
Join Date: Jul 2003
Location: Beijing
Posts: 640
|
Thanks for all the replies, I'm beefed up by visiting here :P
|
|
|
|
|
|
#11 | ||
|
Member
Join Date: Mar 2002
Location: UK
Posts: 570
|
Quote:
John. |
||
|
|
|
|
|
#12 | |
|
Senior Member
Join Date: Jan 2003
Location: Ottawa, Ontario
Posts: 1,783
|
Quote:
The only method I know that really makes gradient calculations independent is to fully shade 3 texture coordinates per pixel (arranged in a 1x1 pixel triangle). The cost of this is of course comparable to 3x supersampling but with shader analysis it could be reduced a lot? |
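The three-coordinates-per-pixel idea can be sketched as follows. The code assumes the shader's texture coordinate computation can be treated as a function of screen position, here called uv_func (a hypothetical stand-in), and evaluates it at the three corners of a one-pixel right triangle. The call counter shows where the "comparable to 3x supersampling" cost comes from.

```python
def per_pixel_gradients(uv_func, x, y):
    """Return ((du/dx, dv/dx), (du/dy, dv/dy)) for a single pixel.

    uv_func(x, y) -> (u, v) is a hypothetical stand-in for whatever the
    shader does to produce texture coordinates. It is evaluated three times:
    at the pixel, one pixel to the right, and one pixel down.
    """
    u0, v0 = uv_func(x, y)
    ux, vx = uv_func(x + 1, y)
    uy, vy = uv_func(x, y + 1)
    return (ux - u0, vx - v0), (uy - u0, vy - v0)


class CountingUV:
    """Wraps a coordinate function and counts how often it is evaluated."""

    def __init__(self, func):
        self.func = func
        self.calls = 0

    def __call__(self, x, y):
        self.calls += 1
        return self.func(x, y)


if __name__ == "__main__":
    uv = CountingUV(lambda x, y: (0.5 * x, 0.25 * y))
    print(per_pixel_gradients(uv, 3, 5), "evaluations:", uv.calls)
```

Each pixel needs no neighbour this way, at the price of three coordinate evaluations instead of one.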
|
|
|
|
|
|
#13 | ||
|
Member
Join Date: Mar 2002
Location: UK
Posts: 570
|
Quote:
This is of course the price you pays... John. |
||
|
|
|
|
|
#14 | ||
|
Unknown.
Join Date: Aug 2002
Location: UK
Posts: 4,877
|
Quote:
Does that mean R500 is less advanced than NV50? Or am I just reading too much into that sentence? Or maybe I'm overestimating the NV50? I doubt that though. Uttar |
||
|
|
|
|
|
#15 |
|
Senior Member
Join Date: Jan 2003
Location: Toronto
Posts: 1,557
|
Uttar, I am going to ask that Dave bans you from B3D until you produce that editorial
__________________
on my way to becoming dark matter.......... |
|
|
|
|
|
#16 | |||
|
Senior Member
Join Date: Jul 2002
Location: UK
Posts: 1,758
|
Quote:
Does that clarify things? |
|||
|
|
|
|
|
#17 | |
|
Unknown.
Join Date: Aug 2002
Location: UK
Posts: 4,877
|
Quote:
Hey, if I'm being slower than with most of my other writing, it's because I want this to be high quality and very informative. It's not that I'm not working on it. It's just that there is a lot more "overhead" ( NV30 anyone? ). That includes a professional technical writer correcting it in his free time and sources commenting on it, to make sure I ain't making any major mistakes and that the overall message is indeed correct. Now, if one of the sources just told me "Err, no, all this just doesn't seem to be true at all", I'd just scrap the whole thing and start again with other goals. No kidding. 7 full A4 pages in Times New Roman, size 12, written already. Current goal is around 10 pages. ETA is 20-25 October. Before releasing it, I also need to make sure it's released before or after an official launch, so its existence isn't forgotten due to discussions about I don't know what awfully boring fall refresh :P Dio: Lol, okay. I did read too much into that sentence, I guess. Uttar |
|
|
|
|