#1 |
|
Senior Member
Join Date: Aug 2002
Location: Miami, Fl
Posts: 1,036
|
Just wondering whether titles like Halo, DOA 3, Splinter Cell, and Shenmue 2 used trilinear texture filtering. I know that the NV2A incurs a performance hit with trilinear (half the bilinear fillrate, unless developers figured out how to use both of a pipeline's TMUs in conjunction for filtering?). Do many titles use this superior method, and is it feasible on the Xbox with acceptable performance?
Can someone who develops on the Xbox expound on the reasons behind the trilinear performance penalties on the NV2A? Thank you |
|
|
|
|
|
#2 | ||
|
Moderator
Join Date: Feb 2002
Location: Redmond, WA
Posts: 3,322
|
Quote:
Quote:
You do take a penalty when you enable aniso; the actual penalty is very dependent on the max aniso level and the scene. There is also a similar penalty for forcing a negative mipmap bias. |
||
|
|
|
|
|
#3 |
|
Senior Member
Join Date: Aug 2002
Location: Miami, Fl
Posts: 1,036
|
Thank you for your reply, ERP. I guess I got aniso and trilinear performance mixed up.
|
|
|
|
|
|
#4 |
|
Senior Member
|
ERP, are you sure there is no performance hit with tri-linear? ( BTW, Shenmue II is not using tri-linear; it really looks like anisotropic filtering + bi-linear. )
Are you saying it can pump out ( 250 MHz clock ) 2 GTexels/s with tri-linear? AFAIK, it does 2 GTexels/s with bi-linear and 1 GTexels/s with tri-linear. I'm not telling you anything you do not know, but taking ZERO hit when you are requesting 2x the number of texels per pixel ( for tri-linear ) with only two TMUs per pipe makes it tough to maintain the same speed... Unless each TMU in the 4 NV2A pipes can produce a tri-linear filtered texel in a single cycle, I cannot see how the performance of tri-linear is THE same as bi-linear... 2 textures per cycle AND tri-linear filtering...? Uhm... |
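The arithmetic behind the 2 GTexels/s versus 1 GTexels/s figures in the post above can be sketched as follows. This is only a back-of-the-envelope model: the 250 MHz clock, 4 pipelines and 2 TMUs per pipe are the numbers quoted in the post (the retail NV2A clock is often cited as 233 MHz, so substitute as needed), and the idea that a trilinear sample costs two TMU cycles is the poster's hypothesis, not a confirmed hardware property. The function name and the `cycles_per_sample` parameter are made up for illustration.

```python
def peak_sample_rate_m(clock_mhz, pipelines, tmus_per_pipe, cycles_per_sample=1):
    """Theoretical peak of filtered texture samples, in millions per second.

    cycles_per_sample is an assumption: 1 models a TMU that delivers one
    filtered sample per clock, 2 models the "trilinear is half rate" theory.
    """
    return clock_mhz * pipelines * tmus_per_pipe / cycles_per_sample


# Figures quoted in the post: 250 MHz, 4 pipes, 2 TMUs per pipe.
bilinear_peak = peak_sample_rate_m(250, 4, 2)
# Hypothetical half-rate trilinear, as the poster assumes.
trilinear_half_rate = peak_sample_rate_m(250, 4, 2, cycles_per_sample=2)

print(f"bilinear peak:        {bilinear_peak / 1000:.1f} GTexels/s")
print(f"half-rate trilinear:  {trilinear_half_rate / 1000:.1f} GTexels/s")
```

These are paper peaks only; whether real hardware reaches them depends on things like texture cache behaviour, which this sketch does not model.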
|
|
|
|
|
#5 | |||
|
Moderator
Join Date: Feb 2002
Location: Redmond, WA
Posts: 3,322
|
Quote:
Quote:
Quote:
The limitation isn't the ability to compute the filtering; it's the bandwidth required to read the source texels from the cache. And as I mentioned above, this limit is such that in the majority of cases trilinear takes 1 cycle. The exploitation of coherency is probably why setting a negative mipmap bias is so expensive: it increases the size of the sample mask for the pipeline, exceeding the maximum read rate from the cache in a larger percentage of cases. |
|||
|
|
|
|
|
#6 | |
|
Senior Member
|
Quote:
I remember Korval ( Tryarch ) talking about how it was the cache bandwidth that limited you from doing tri-linear at the same speed as bi-linear ( 2 textures/pixel in one cycle ), now that I think about it... Even reading your post, you "seem" to make it clear that those benchmarks were stressing the GPU a bit with bi-linear filtering on, and the switch to tri-linear might not have dropped performance a lot "maybe" because the limitation was elsewhere... I'm sorry to note a bit of an angered tone ( understandable after the 100th time you repeat things you have coded yourself and seen in the XDK to people who do not have that info )... |
|
|
|
|
|
|
#7 |
|
Member
Join Date: Mar 2002
Location: Montana
Posts: 154
|
Korval was talking about the GeForce3's tri-linear performance. It was assumed at the time that the cache architectures of the GF3 and NV2A were nearly identical.
|
|
|
|
|
|
#8 |
|
Senior Member
|
So they are not?
Strange... that would put the NV2A's texture performance 2x as high as Flipper's even with tri-linear... In the real world, having e-DRAM on Flipper will sustain the fill-rate a bit better, and the NV2A's advantage will be much smaller ( and in some cases the two will be identical ) |
|
|
|
|
|
#9 |
|
Moderator
Join Date: Feb 2002
Location: Redmond, WA
Posts: 3,322
|
Panajev2001a,
Firstly, I want to say that a lot of what people believe about NV2X is based largely on benchmarks run on PCs with GF3/4s. The problem is that on a PC it is difficult to understand what's happening. Between the machinations of the driver and the DirectX runtime it's pretty much impossible to know what you're measuring; it's even difficult to know if you're CPU or GPU bound. Xbox DX is a very thin interface: I can see all the interrupts, and I can set up the buffers so I know there won't be stalls. The net result is that on Xbox I can measure CPU and GPU usage separately and accurately. So in general, if I'm talking about graphics performance, I will not give CPU-limited results.
OK, I don't have anywhere I can currently rerun the tests, so these results are from memory. The test involves trivially filling the screen repeatedly with large polygons, Z set to LE, source textures compressed.
1 Texture, bi or tri: just over 700 MPixels
2 Textures, bi or tri: somewhere between 600-700 MPixels (1.2->1.4 GTexels)
My in-game tests show no noticeable difference (read <2%) in GPU performance between trilinear and point sampling, and performance in this case is largely limited by fillrate. If you search in the main forum, I posted some in-game aniso costs along with mip bias costs as well; they're not all-encompassing, but they give a general indication of cost.
Yes, for all intents and purposes NV2A will out-fill Flipper if you just fill the screen over and over. Flipper seems to be somewhat less sensitive to small tris (i.e. it doesn't seem to slow down as much), but it's hard to do direct comparisons because of the large discrepancy in T&L performance. FWIW, I have never seen a reasonable situation where Flipper outperforms NV2A. Filling the screen over and over with transparent textures might tip the balance, but I've never done the test to verify it. I'm not trying to slam Flipper here; GC is a really nice platform to work on, but Flipper just isn't going to win many benchmarks against NV2A.
Sorry, this is one of those misconceptions that annoys me for some reason :/ The other one is people assuming that NV2A is limited by framebuffer memory bandwidth. IME this is hardly ever the case. Once we were no longer CPU bound, we found that we actually hit fill rate limits before we ran out of memory bandwidth. We could relatively easily test this by comparing performance with various AA modes and looking at the performance difference. |
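For readers who want to check the unit conversion in the fill test above, here is a minimal sketch. It only restates the numbers ERP quotes from memory (600 to 700 MPixels with two textures, just over 700 with one); the function name is invented for the example and nothing here is measured data of its own.

```python
def texel_rate_m(pixel_rate_mpixels, textures_per_pixel):
    """Texture samples per second (in millions) implied by a pixel fill rate."""
    return pixel_rate_mpixels * textures_per_pixel


# Two-texture case from the post: somewhere between 600 and 700 MPixels.
two_tex_low = texel_rate_m(600, 2)
two_tex_high = texel_rate_m(700, 2)

# One-texture case from the post: "just over 700 MPixels", taken as 700 here.
one_tex = texel_rate_m(700, 1)

print(f"1 texture : about {one_tex / 1000:.1f} GTexels/s")
print(f"2 textures: {two_tex_low / 1000:.1f} to {two_tex_high / 1000:.1f} GTexels/s")
```

The point of the comparison is that adding a second texture nearly doubles the sample rate while the pixel rate barely drops, and that the post reports the same figures for bilinear and trilinear.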
|
|
|
|
|
#10 |
|
Senior Member
|
Thanks a lot, ERP... your post was very interesting.
It doesn't surprise me that you found Flipper to have that kind of performance... even reaching or remaining close to NV2A is a success for Flipper, given the clock speed disadvantage ( what would you have thought of a 202 MHz Flipper? ) and the fact that Flipper had a somewhat smaller budget than NV2A... Something that surprised me about Flipper was the T&L unit... assuming we mostly stick to static meshes and DX7-class features, it performs quite well... If you add one or two textures and then one or two lights ( local ones, not only infinite ones ), the performance is still quite high... that did impress me |
|
|
|
|
|
#11 |
|
Member
Join Date: Jul 2002
Posts: 700
|
I sometimes wonder how close Flipper and XGPU would have been if they had shipped at their full original specifications. XGPU - 300 MHz NV25 and
Flipper - 202.5 MHz with more on-die 1T-SRAM. Perhaps any eventual "Flipper2" and "XGPU2" will be a lot closer in performance in 2006. Actually, I do believe the differences will be almost insignificant by then, except in useless benchmarks. I know this is slightly off-topic, but I'll say it anyway: all I am really hoping for in the next generation of consoles is television-show-quality prerendered graphics in real-time games. Do any of you believe that is possible in the timeframe we are talking about, 4-5 years? |
|
|
|
|
|
#12 |
|
Member
Join Date: Feb 2002
Posts: 394
|
Anyone know whatever happened to Korval? He really knew his stuff. After he spanked that Deadmeat guy, that was the last I heard of him... well, around when PGC Forums went freebie, ads and all.
|
|
|
|
|
|
#13 |
|
Member
Join Date: Jul 2002
Posts: 481
|
If NV2A is mostly fillrate limited, then why don't we see Antialiasing in more Xbox games?
Also, does the NV2A take the same humongous fillrate hit from anisotropic filtering that the GeForce4 suffers? |
|
|
|
|
|
#14 | |
|
SNAKES... ON A PLANE
|
Quote:
(EDIT = Abject Stupidity Removed To Save Face)
__________________
For Great Justice Move Every 'Zig' |
|
|
|
|
|
|
#15 | ||
|
Harmlessly Evil
Join Date: Feb 2002
Posts: 2,027
|
Quote:
|
||
|
|
|
|
|
#16 | ||
|
Moderator
Join Date: Feb 2002
Location: Redmond, WA
Posts: 3,322
|
Quote:
Given 2x multisampling, there is some additional cost incurred because the Z compression doesn't function as effectively. In addition, the filter/copy forwards operation is expensive. The cost isn't necessarily prohibitive, but it does need to be planned for.
Quote:
Times are in ms to complete rendering of the background with various combinations of forced aniso and mipmap bias. Mipmap bias across the top, aniso setting down the side. So the bottom right cell is a mipmap bias of -3 and Aniso 4 (8x). If you're just interested in aniso, then just look at the left-hand column.
Code:
Bias        0     -1     -2     -3
Linear     5.2    5.4    6.0    6.3
Aniso 2    5.7    6.2    7.1    7.4
Aniso 3    6.1    6.8    8.1    8.3
Aniso 4    6.4    7.4    8.7    8.8
In the actual game we allowed artists to specify aniso level and mip LOD on a material-by-material basis, using it where it made a difference; in general, the performance cost of using it in this fashion was extremely low. |
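To make the timing table above easier to read as relative costs, the sketch below expresses every cell as a percentage over the Linear, bias 0 baseline. The timings are copied from ERP's table; the dictionary layout and the `overhead_percent` helper are just one hypothetical way to organise them.

```python
# Mipmap bias values across the top of the table.
BIASES = [0, -1, -2, -3]

# Background render times in ms, copied from the table in the post.
TIMES_MS = {
    "Linear":  [5.2, 5.4, 6.0, 6.3],
    "Aniso 2": [5.7, 6.2, 7.1, 7.4],
    "Aniso 3": [6.1, 6.8, 8.1, 8.3],
    "Aniso 4": [6.4, 7.4, 8.7, 8.8],
}


def overhead_percent(filter_name, bias):
    """Extra render time relative to Linear filtering with no bias, in percent."""
    baseline = TIMES_MS["Linear"][0]
    measured = TIMES_MS[filter_name][BIASES.index(bias)]
    return (measured - baseline) / baseline * 100.0


# Print the whole table as percentage overheads.
print("Bias     " + "  ".join(f"{b:>6d}" for b in BIASES))
for name in TIMES_MS:
    cells = "  ".join(f"{overhead_percent(name, b):5.1f}%" for b in BIASES)
    print(f"{name:<8s} {cells}")
```

Read this way, forcing Aniso 4 with no bias costs roughly 23% over the baseline for this scene, and the worst cell (Aniso 4 with a bias of -3) roughly 69%. These are forced, scene-wide settings, which is why the per-material usage described in the post can come out much cheaper.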
||
|
|
|
|
|
#17 |
|
Senior Member
Join Date: Feb 2002
Posts: 598
|
Didn't Deadmeat say that Flipper was an overclocked version of some card called "alladin7"?
|
|
|
|
|
|
#18 | |
|
ea_spouse is H4WT!
Join Date: Feb 2002
Location: 53:4F:4E:59
Posts: 1,588
|
Quote:
As in the motherboard embedded graphics core of the ALi ALaddin?
__________________
"The sooner someone gets sued by Intel for violation, the sooner the patent can be revoked from orbit for gratuitous and wanton disregard for prior art and obviousness." ~TomF |
|
|
|
|