Anand's retail Radeon 9500 Pro review - much faster!!!....

Thinking back to what Carmack said several months ago...

Should anybody ever doubt what he has to say? I mean, is there any need to go through his .plan updates with a fine-tooth comb?

When the dude says that nVidia is half a generation behind ATI...just take it at face value. He clearly knows full well what's happening behind the scenes, and it should be left "as is" without any debate.

ATI has done what nVidia has never been able to do. That is, bring out new tech ahead of the next-generation API, budget product line included. By the time DX9 has been officially released, ATI will have had their high-end cards out for darn near 3-5 months (depending on exactly when DX9 is released), and the mid-range for probably 1-2. nVidia has never been that far ahead of the curve, so to speak.

It shall be interesting to see how the FX turns out, in terms of performance. If it's unable to significantly position itself over the 9700, you could darn near say that this release turned out to be a very large failure.
 
I'm quite impressed with the performance of the 9500 PRO. With so much less bandwidth than the 9700, I thought performance would really suffer more (as it seemed in the initial 9500 PRO previews).

I'm also surprised at the cost ATI is able to produce them at. It's still a ~100M transistor chip, after all, and rightly so due to the DX9 support. I guess their board design is much cheaper than the Ti4600, not to mention the slower and cheaper Hynix memory.
 
Do you think that the 9700's 256-bit bus is not working as efficiently as it could be? I would have definitely expected the 9500 PRO antialiasing scores to scale worse than the 9700's as you increase the number of samples, but that is not the case:

http://www.tech-report.com/reviews/2002q4/radeon-9500pro/index.x?pg=12

The same thing is found in Wavey's reviews (similar performance hit when neither card is CPU limited). Obviously ATI's colour compression is quite effective, as the 9500 PRO really slaughters the Ti4600 in AA most of the time, but I figured at higher sampling levels there would still be a marked difference between the 9500 PRO and the 9700.

Hmm...
 
It's also possible that the 256-bit bus is "overkill" for the 9700 Pro in that it's not being saturated, which may be why the results look the way they do.
 
Mintmaster said:
I'm quite impressed with the performance of the 9500 PRO. With so much less bandwidth than the 9700, I thought performance would really suffer more (as it seemed in the initial 9500 PRO previews).

This reminds me of a comparison that I believe Dave Wavey made a while back (please correct me if I'm wrong, Dave). Shouldn't the NV30 be roughly twice the speed of the 9500 Pro? After all, it has approximately twice the clock speed, twice the memory speed, the same number of pixel pipes and the same 128-bit wide memory bus.

Of course it's assumed we're talking non-CPU limiting conditions.

Conversely, the NV30 and the 9500-pro would be two very good cards to compare clock-for-clock to see how the actual efficiencies of the two architectures match up.
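
For anyone who wants to put numbers on that "roughly twice" estimate, here is a minimal sketch of the paper math. The clock speeds are assumptions that are not stated anywhere in this thread (275MHz core / 540MHz effective memory for the 9500 Pro, 500MHz core / 1000MHz effective memory for the NV30), and the Card class is purely illustrative. It only computes theoretical peaks, which say nothing about real-world efficiency.

```python
from dataclasses import dataclass


@dataclass
class Card:
    name: str
    pipes: int                 # pixel pipelines
    core_mhz: float            # core clock in MHz
    bus_bits: int              # memory bus width in bits
    mem_mhz_effective: float   # effective (DDR) memory data rate in MHz

    def fillrate_mpixels(self) -> float:
        """Theoretical peak pixel fillrate in Mpixels/s."""
        return self.pipes * self.core_mhz

    def bandwidth_gb_s(self) -> float:
        """Theoretical peak memory bandwidth in GB/s."""
        return self.bus_bits / 8 * self.mem_mhz_effective * 1e6 / 1e9


# ASSUMPTION: the clocks below are illustrative and not taken from this thread.
r9500_pro = Card("Radeon 9500 Pro", pipes=8, core_mhz=275,
                 bus_bits=128, mem_mhz_effective=540)
nv30 = Card("NV30", pipes=8, core_mhz=500,
            bus_bits=128, mem_mhz_effective=1000)

fill_ratio = nv30.fillrate_mpixels() / r9500_pro.fillrate_mpixels()
bw_ratio = nv30.bandwidth_gb_s() / r9500_pro.bandwidth_gb_s()

print(f"fillrate ratio:  {fill_ratio:.2f}x")
print(f"bandwidth ratio: {bw_ratio:.2f}x")
```

Under those assumed clocks both ratios land a little under 2x, so "twice the speed" is the ceiling you would expect on paper if the two architectures were equally efficient clock-for-clock.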
 
Contrast the 9500 PRO's results with both the plain 9500 and the 9700 PRO and I think it's clear that an 8x1 with a 256-bit bus, or a 4x1 with a 128-bit bus, isn't getting the best out of the bandwidth.

Obviously this isn't the case where FSAA is concerned, and that's where the real gains are to be had on current titles for the high end boards.
 
Still, the good results of the 9500 Pro with only a 128-bit bus do point to the GFFX being at least competitive with the 9700 Pro (down to its high clock speed, as both the R300 and NV30 appear to have similar bandwidth saving tech).

What would be even more interesting is to find out the performance gains that the 9500 Pro gets when overclocked. This should show how much 'room' there is left in a 128-bit bus design, with 8 pipes.
 
High resolution gaming, i.e. 1600 x 1200 with 4X AA and 16X anisotropic filtering, would show the benefit of a 256-bit bus big time.

Intel Pentium4 2.53GHz used in both tests by Wavey..

Radeon 9700 Pro: [benchmark chart: 010.gif]

Radeon 9500 Pro: [benchmark chart: image013.gif]

The 9700 Pro is getting 100 fps in Serious Sam @ 1600 x 1200 with 4X FSAA and AF, while the 9500 Pro is getting 60 (still damn good for a low cost DX9 card :p)...yet it shows what the memory bandwidth is able to do.
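
As a rough sanity check on those two scores, the sketch below compares the frame rate ratio against the theoretical bandwidth and core clock ratios of the two cards. The fps figures are the ones quoted above; the bus widths and clocks are assumptions not given in this thread (9700 Pro: 256-bit, 620MHz effective memory, 325MHz core; 9500 Pro: 128-bit, 540MHz effective memory, 275MHz core).

```python
# Frame rates quoted in the post above (Serious Sam, 1600x1200, 4X FSAA + AF).
fps_9700_pro = 100.0
fps_9500_pro = 60.0

# ASSUMPTION: reference specs, not stated in this thread.
specs = {
    "9700 Pro": {"bus_bits": 256, "mem_mhz_effective": 620, "core_mhz": 325},
    "9500 Pro": {"bus_bits": 128, "mem_mhz_effective": 540, "core_mhz": 275},
}


def bandwidth_gb_s(bus_bits, mem_mhz_effective):
    """Theoretical peak memory bandwidth in GB/s."""
    return bus_bits / 8 * mem_mhz_effective * 1e6 / 1e9


bw = {name: bandwidth_gb_s(s["bus_bits"], s["mem_mhz_effective"])
      for name, s in specs.items()}

fps_ratio = fps_9700_pro / fps_9500_pro
bw_ratio = bw["9700 Pro"] / bw["9500 Pro"]
core_ratio = specs["9700 Pro"]["core_mhz"] / specs["9500 Pro"]["core_mhz"]

print(f"fps ratio:        {fps_ratio:.2f}x")
print(f"bandwidth ratio:  {bw_ratio:.2f}x")
print(f"core clock ratio: {core_ratio:.2f}x")
```

With these assumed specs the frame rate gap falls between the core clock gap and the bandwidth gap, which suggests the extra bandwidth helps a lot at this setting but is not being converted into frames one-for-one.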
 
The way R300's performance scales with bandwidth is interesting; I wonder what's causing it. Is it due to inefficiency of the 256-bit bus? Or is the 9700 Pro becoming fillrate limited, and thus unable to take advantage of some of the bandwidth?

BTW, is the 128-bit bus on the 9500 Pro 2x64 or 4x32?
 
Geeforcer said:
The way R300's performance scales with bandwidth is interesting; I wonder what's causing it. Is it due to inefficiency of the 256-bit bus? Or is the 9700 Pro becoming fillrate limited, and thus unable to take advantage of some of the bandwidth?

It's likely the second one.
(That's one of the reasons everyone expected the NV30 to be 8x2.)

BTW, is the 128-bit bus on the 9500 Pro 2x64 or 4x32?

2x64
 
It should be relatively easy to test by overclocking the chip alone and plotting the performance against frequency.
 
alexsok said:
R9500 is an excellent product, there is simply no doubt about that!
I mean, think of it for a sec, DX9 hasn't even been released yet, and mainstream (<$200) cards are already on the market! I really hope NVIDIA will catch up to ATI here soon enough and bring a similarly performing card to the market, so the developers could focus on DX9 class hardware immediately! Add to that the significance of HLSL and I can only hope developers would follow the path laid down for them!

Hijacking the thread a little bit;

I would like to know how the NV31 is going. Does the chip still have the massive 125 million transistors (like the R9500 Pro) and a reduced clock speed, or will it run at 500MHz too, but with only 4 pipelines? What design route has Nvidia chosen? It will be nice to see the differences and similarities between ATi and Nvidia for the mainstream cards.
 
Hyp-X said:
(That's one of the reasons everyone expected the NV30 to be 8x2.)

Even presuming that the Radeon 9700 Pro is indeed fillrate limited, I don't think that makes a case for an 8x2 NV30, considering that it already has substantially more fillrate and less bandwidth. Even in its current state, NV30 is much more in danger of being bandwidth limited rather than fillrate limited most of the time.
 
You've got to be one or the other (or CPU limited).

All that really matters is who's faster in the end. I guess we won't know that until the damn thing hits the reviewers.
 
Board costs are negligible, and the item that drives the cost is the number of layers and size of the board.

Board technologies can also be a considerable cost factor. Please note CAN be ;)
 
Geeforcer said:
Hyp-X said:
(That's one of the reasons everyone expected the NV30 to be 8x2.)

Even presuming that the Radeon 9700 Pro is indeed fillrate limited, I don't think that makes a case for an 8x2 NV30, considering that it already has substantially more fillrate and less bandwidth. Even in its current state, NV30 is much more in danger of being bandwidth limited rather than fillrate limited most of the time.

Yes, of course they chose high clocks instead of more TMUs.
It has more than a 50% advantage over R300, and given that R300 is rarely bandwidth limited, it can turn out to be a big advantage.

Especially in Doom3. I expect that game to be fillrate limited (not bandwidth limited) almost everywhere. (Except where it's CPU limited.)
 
Geeforcer said:
Hyp-X said:
(That's one of the reasons everyone expected the NV30 to be 8x2.)

Even presuming that the Radeon 9700 Pro is indeed fillrate limited, I don't think that makes a case for an 8x2 NV30, considering that it already has substantially more fillrate and less bandwidth. Even in its current state, NV30 is much more in danger of being bandwidth limited rather than fillrate limited most of the time.

So I now expect the R350 to be an 8x2 chip. That would make sense if the R9700 Pro is fillrate-limited most of the time. Maybe then the chip will be clocked like the R300 @ 325-350MHz, but will have >400MHz DDR2 to augment the 8x2 pipelines. This will not speed up newer games like Doom3 so much, but older games would get a nice boost.
 
mboeller said:
This will not speed up newer games like Doom3 so much, but older games would get a nice boost.

Yep. To speed up Doom3, doing two shader ops per cycle would be more of an advantage.

Actually, having 2 TMUs without being able to do 2 shader ops would lead to a very unbalanced card.

But now we are probably talking about a 160 Million transistor chip - on a 0.15 process ...
So I have my doubts...
 
The question is, is the 9700 Pro indeed fillrate-limited? Could someone do a quick test to determine that?

All you need to do is take something like UT2k3 Inferno, run the benchmark at high resolution at default clockspeed, then overclock the core by say, 5-10% and run the test again. If the card is indeed fillrate-limited, there should be a close correlation between percentage increases of core clock and performance. On the other hand, leaving core at the default frequency and overclocking memory alone should have less of an impact.
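
To make the interpretation of such a test concrete, here is a minimal sketch of how the results could be scored. Every number in it is hypothetical (nobody in this thread has posted such results), the function names are made up for illustration, and the 0.7 threshold is an arbitrary cut-off rather than any established rule.

```python
def scaling_efficiency(fps_base, fps_oc, clock_base, clock_oc):
    """Fraction of a clock increase that shows up as a frame rate increase.

    1.0 means fps rose by the same percentage as the clock,
    0.0 means the overclock made no difference.
    """
    clock_gain = clock_oc / clock_base - 1.0
    fps_gain = fps_oc / fps_base - 1.0
    return fps_gain / clock_gain


def likely_bottleneck(core_eff, mem_eff, threshold=0.7):
    """Very rough classification; the threshold is an arbitrary assumption."""
    if core_eff >= threshold and core_eff > mem_eff:
        return "fillrate (core) limited"
    if mem_eff >= threshold and mem_eff > core_eff:
        return "bandwidth (memory) limited"
    return "mixed or CPU limited"


# HYPOTHETICAL results, invented purely to show the arithmetic.
base_fps = 80.0

# Run 1: core overclocked 325 -> 350MHz, memory left at default.
core_eff = scaling_efficiency(base_fps, 85.5, clock_base=325, clock_oc=350)

# Run 2: memory overclocked 310 -> 335MHz, core left at default.
mem_eff = scaling_efficiency(base_fps, 81.0, clock_base=310, clock_oc=335)

print(f"core scaling efficiency:   {core_eff:.2f}")
print(f"memory scaling efficiency: {mem_eff:.2f}")
print(likely_bottleneck(core_eff, mem_eff))
```

In this made-up case most of the core overclock turns into extra frames while the memory overclock barely registers, which is the pattern you would expect from a fillrate-limited card. Real runs would need several repeats per setting, since a few percent of run-to-run noise is enough to swamp a 5-10% overclock.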
 