X1800 XT makes the case...


scificube
04-Oct-2005, 01:30
In an effort to support only FP32 and no partial precision (well, almost), it seems ATI's efforts in restructuring the HW to make this a reality have reaped some nice rewards.

http://www.techpowerup.com/reviews/ATI/R5XX/3

With only 16 pipes...arrays or 4 quads or however things are best described nowadays, it seems ATI has made a strong argument for its ultra-threaded approach to things, considering it outperforms the 7800GTX, which has 24 pipes.

Equally interesting to me is that ATI mentions that the architecture is "ideal" for physics. This would appear to be a shot over Ageia's bow, if ATI is to be believed. I don't believe this architecture could handle rendering and physics at the same time, but this would still seem an intentional warning shot or at least an invitation to play around and see what one could do...at least with offline rendering perhaps.

I have to say I think either something is seriously wrong or something seriously interesting is going on with the Fear single player demo. Safe bet...something is borked.

edit:
better wild theory - Fear utilizes parallax mapping. ATI mentions the architecture is well suited to handling parallax occlusion mapping with SM3.0...seems ATI may be quite right :)
end edit:

What say you...

Is the performance advantage due to ultra-threading? (With fewer pipes, a higher clockspeed and more bandwidth would bring one back to even, but would that be enough to seriously outperform the 7800GTX?)
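To put rough numbers on the "back to even" part, here is a minimal sketch of the pipes-times-clock arithmetic. The core clocks (625 MHz for the X1800 XT, 430 MHz for the 7800 GTX) are not stated in this thread and are assumed here for illustration; the one-pixel-per-pipe-per-clock model is also a deliberate oversimplification that ignores bandwidth, ALU throughput per pipe and everything else that makes a real GPU fast or slow.

```python
def peak_pixel_rate(pipes: int, core_mhz: int) -> int:
    """Idealized peak in megapixels/s: one pixel per pipe per clock (an assumption)."""
    return pipes * core_mhz


# Assumed core clocks, not taken from the thread.
x1800xt = peak_pixel_rate(pipes=16, core_mhz=625)
gtx7800 = peak_pixel_rate(pipes=24, core_mhz=430)
ratio = x1800xt / gtx7800

print(f"X1800 XT: {x1800xt} Mpix/s, 7800 GTX: {gtx7800} Mpix/s, ratio {ratio:.2f}")
```

Under those assumptions the two cards land within a few percent of each other on paper, so clockspeed alone would roughly cancel the pipe deficit but would not explain a large lead.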

What do you make of the physics quip in the chart?

ATI makes some bold claims in being able to use various methods to do HDR with MSAA while real men use FP32 to get manly results with SM3.0...ATI isn't just blowing smoke up our collective...well you get my drift. I think I'd like to wait for some corroboration of these numbers, but the physics quip was something I thought was interesting. Any substance to this...or no?

With both the X1800 and Xenos being ultra-threaded, so to speak, and if the numbers are to be trusted, does this not, if nothing else, bode well for Xenos?

Please discuss.

Skrying
04-Oct-2005, 01:33
You bring up an interesting point. But one question: if the R520 can be used to offload the CPU for physics, then why not a bit more improvement in a physics-heavy game such as HL2? Does it need to be implemented in the engine, by your guess?

scificube
04-Oct-2005, 01:38
You bring up an interesting point. But one question: if the R520 can be used to offload the CPU for physics, then why not a bit more improvement in a physics-heavy game such as HL2? Does it need to be implemented in the engine, by your guess?

Actually I just tossed out the idea that physics was being offloaded to the GPU with Fear. That would have required ATI and the Fear crew to trust each other a good deal, to say the least. It was wild speculation on my part.

What seems more plausible is that the architecture is somehow better tuned to handle parallax mapping, and Fear uses this technique to represent the damage your gunfire can do to the environment in the single player demo.

Chalnoth
04-Oct-2005, 01:41
Let's wait until we see some independent reviews before making absolute claims on performance levels. Two days isn't that long.

Skrying
04-Oct-2005, 01:41
Yet another interesting idea, and one that seems much more plausible and likely to me. But do you think it could make up that much of a difference, because the slides show a huge one to say the least.

scificube
04-Oct-2005, 01:45
Yet another interesting idea, and one that seems much more plausible and likely to me. But do you think it could make up that much of a difference, because the slides show a huge one to say the least.

I'm betting it makes a good difference but THAT big a difference is something...well special.

Chalnoth is probably right. It may be best to wait and see with this stuff. I was seeing if anyone had noticed what I had noticed. 2 days in truth really isn't that long to wait...it's a good thing I know that now. Does everything happen while I'm asleep? Sheesh.

Chalnoth
04-Oct-2005, 01:55
Yet another interesting idea, and one that seems much more plausible and likely to me. But do you think it could make up that much of a difference, because the slides show a huge one to say the least.
Yes, but there have been a number of situations in the past where a huge difference between two cards has been measured, but it turned out to be an error on the part of the person running the benchmarks. So independent results are really necessary to be certain.

Skrying
04-Oct-2005, 02:07
Yes, but there have been a number of situations in the past where a huge difference between two cards has been measured, but it turned out to be an error on the part of the person running the benchmarks. So independent results are really necessary to be certain.

You're taking all the fun out of this!

trinibwoy
04-Oct-2005, 02:09
You're taking all the fun out of this!

Damage control?

digitalwanderer
04-Oct-2005, 02:24
You're taking all the fun out of this!
Nah, Chal is actually right. Wait until we have some numbers from a number of places on a number of things before you start projecting too much. ;)

Chalnoth
04-Oct-2005, 02:33
I'm just taking this from the perspective of a scientist. You really have to have independent verification to be sure of the results of any experiment.

AlStrong
04-Oct-2005, 02:53
Should the Fear performance be representative of Doom 3's performance :?:

digitalwanderer
04-Oct-2005, 02:57
Should the Fear performance be representative of Doom 3's performance :?:
No.

Skrying
04-Oct-2005, 03:02
I'm just taking this from the perspective of a scientist. You really have to have independent verification to be sure of the results of any experiment.

I know. And in all seriousness this is how I approach things like this, though I really enjoy speculation sometimes. I was joking with you. :p

Geo
04-Oct-2005, 03:12
Have we not got tired of "pipes matter" yet? :sad:

Maybe we ought to be comparing performance per transistor? Except apparently we can't really do that either.

So we have to compare die size, adjust for known process differences, and compare performance per normalized mm²?

But if there was any shred of "pipes matter" left, it should have gone out the door last night --even if R520 doesn't have a decisive performance advantage, I can't imagine at this point it will have a decisive performance disadvantage. . .and clearly the MHz difference will not make it up if "pipes matter".

Chalnoth
04-Oct-2005, 03:19
Well, there's a lot of things to compare.

Performance/cost (end-user cost)
Performance/watt (important for overclocking)
Performance/transistors (not really directly useful, but interesting in investigating the efficiency of an architecture)

And, of course, absolute performance is always a fun thing to measure for the top of the line products.
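As a minimal sketch of how the three ratios above might be tabulated once independent numbers exist: the `Card` type, its field names and every value below are hypothetical placeholders, not measurements of any real card.

```python
from dataclasses import dataclass


@dataclass
class Card:
    name: str
    avg_fps: float        # measured frame rate in some benchmark
    price_usd: float      # end-user cost
    board_watts: float    # power draw under load
    transistors_m: float  # transistor count, in millions


def ratios(card: Card) -> dict:
    """Return performance per dollar, per watt and per million transistors."""
    return {
        "fps_per_dollar": card.avg_fps / card.price_usd,
        "fps_per_watt": card.avg_fps / card.board_watts,
        "fps_per_mtransistor": card.avg_fps / card.transistors_m,
    }


# Placeholder numbers only; not measurements of any real card.
card_a = Card("Card A", avg_fps=60.0, price_usd=500.0, board_watts=100.0, transistors_m=300.0)
card_b = Card("Card B", avg_fps=54.0, price_usd=450.0, board_watts=90.0, transistors_m=300.0)

for card in (card_a, card_b):
    print(card.name, ratios(card))
```

With these made-up inputs the slower card ties the faster one per dollar and per watt, which is the kind of thing absolute performance alone would hide.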

Geo
04-Oct-2005, 03:22
At this point you put any trust at all in a comparison that relies on transistor count as a divisor? Not I; thrown in the towel on that one for now.

Chalnoth
04-Oct-2005, 03:39
At this point you put any trust at all in a comparison that relies on transistor count as a divisor? Not I; thrown in the towel on that one for now.
Well, it can be important as an indirect measure of cost. Better is performance per die area, of course.

Anyway, I'm willing to bet that the majority of any efficiency improvements that we see in the R5xx core are due to the disassociation of the texture units from the ALU units. I think that once we have the cores and can do decent measurements of performance per die area, we can get a better idea of whether or not this was a good thing to do.

For example, if we assume that the R520, at a clockspeed where the power consumption is roughly on par with the G70, is 15% faster, but is also 15% larger than a 90nm G70 (which doesn't exist, so we might use transistor count for a placeholder), then we might assume that the choice to go with disassociated units really was a wash, and its only real benefit is better branching performance.

If, on the other hand, the R520 is released, and at a clockspeed where the power consumption is roughly on par with the G70, is 15% faster, but has about the same number of transistors, then we can roughly conclude that the choice to go with disassociated units was, in general, a good decision.
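The two hypotheticals above reduce to one division. A minimal sketch, where the function name is made up for illustration and the 15% figures are the hypothetical ones from this post, not measurements:

```python
def relative_perf_per_area(speedup: float, area_growth: float) -> float:
    """Performance per unit die area relative to the baseline chip.

    speedup and area_growth are fractions, e.g. 0.15 for "15% faster"
    or "15% larger". A result of 1.0 means a wash.
    """
    return (1.0 + speedup) / (1.0 + area_growth)


# Case 1: 15% faster but also 15% larger -> no gain per unit area.
wash = relative_perf_per_area(speedup=0.15, area_growth=0.15)

# Case 2: 15% faster at about the same size -> a real efficiency gain.
win = relative_perf_per_area(speedup=0.15, area_growth=0.0)

print(f"case 1: {wash:.2f}x, case 2: {win:.2f}x")
```

The comparison only holds at matched power consumption, as stated above, and with transistor count standing in for die area until real die sizes are known.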

As a side comment, I don't feel the ring bus has much of anything to do with efficiency for current games. I feel that it is most likely an efficiency improvement in terms of sharing the texture units with the vertex units, and I doubt it helps at all with the efficiency of any current game implementations of SM3.

Acert93
04-Oct-2005, 03:54
I am hoping reviewers spend as much time judging IQ as they do performance. IQ can be hard to quantify, but I am interested in how MSAA+HDR performs and looks, and how the new angle-independent AF, Adaptive AA, etc... all work out in practice. On a $400+ card you expect Good IQ and Good Performance.

If you only wanted good performance you could get a 6600GT and turn off all the features (i.e. low everything) and run at a steady 60fps in every game I know of. I know last summer when I bought my NV40 I was looking for that killer combination of Performance+IQ. If I had not, I would have stayed with my Radeon 9700, which performs well to this day if performance is the only thing you are after.

overclocked
04-Oct-2005, 04:52
Well, there's a lot of things to compare.

Performance/cost (end-user cost)
Performance/watt (important for overclocking)
Performance/transistors (not really directly useful, but interesting in investigating the efficiency of an architecture)

And, of course, absolute performance is always a fun thing to measure for the top of the line products.

I would add time to market ala instant launch also in that formula, not in general but comparing just the case with G70/R520.

_xxx_
04-Oct-2005, 07:16
Two days isn't that long.

BUT IT IS!!! :evil:





:oops:

Ailuros
04-Oct-2005, 07:22
I would add time to market ala instant launch also in that formula, not in general but comparing just the case with G70/R520.

Definitely; IMO NVIDIA wouldn't have had such an easy stroll if R520 had been released at about the same time and in high enough quantities for the highest model.

Chalnoth
04-Oct-2005, 07:54
I would add time to market ala instant launch also in that formula, not in general but comparing just the case with G70/R520.
Nah, not really. I'm of the opinion that that will work itself out. If you can't buy it, you can't buy it. If ATI can't release the XT in volume, and if the GTX is considered to be better than the best ATI can release in volume, then more people will choose the GTX.

So I'd rather just focus on looking at the technology and performance of the product. Leave whether or not it's available to the people who are actually planning to upgrade (I'll probably have my 6600 GT until sometime late next year, as I just don't have much cash at the moment).

Mariner
04-Oct-2005, 10:11
Should the Fear performance be representative of Doom 3's performance :?:

If you notice, the X1800 sell sheets don't mention Doom 3 performance whatsoever. This leads me to believe that G70 will have better performance in this game than R520. Not entirely surprising given that the current NV architecture is almost built around support for this rendering technique.

The only problem is, I wonder how many on-line reviews will do the old "Doom 3 and 3DMark comparison" thing with little or no reference to any of the other possible advantages/benefits of the R520 architecture?

My guess is: lots.

Jawed
04-Oct-2005, 10:49
The type of scheduler in R520 makes a whole new world of GPUs possible. In much the same way as NVidia's 6xxx series splits the ROPs from the fragment shader pipelines, it brings an entirely new degree of flexibility to an architecture. Only on a far more significant scale.

We know that there's little point in having "more ROPs" because they're constrained by memory bandwidth.

Similarly we can infer that there's little point in having more texture pipes because they're also constrained by memory bandwidth.

But the constraints on shader operations per second (non-texturing) are rather more nebulous. In other words, the more the merrier - Moore's law will provide the on-die memory and arrays of ALUs that provide 2x speed-ups every generation.

With this new type of scheduler we're not only seeing the texture pipes fully decoupled from the shader pipes, but we're also seeing the relative capacities of the two being decoupled.

The first generation of this new scheduler, in RV530, already shows that a 3:1 ratio between shader pipes and texture pipes is viable, with 12 fragments being shaded while 4 un-related fragments are being textured.
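One way to picture why the ratio matters is a toy throughput model, sketched below. It assumes perfectly decoupled units and ignores latency, memory bandwidth and batch size entirely, so it is an illustration of the balance point only, not a description of how the RV530 scheduler actually works. The 12 shader pipes and 4 texture pipes come from the figures above; the function name and the example shader mixes are made up.

```python
def bottleneck(alu_ops: int, tex_ops: int, alu_units: int = 12, tex_units: int = 4) -> str:
    """Say which unit type limits throughput for a given per-fragment op mix.

    With fully decoupled units, time per fragment is proportional to
    max(alu_ops / alu_units, tex_ops / tex_units). Cross-multiplying keeps
    the comparison in integers and avoids floating-point equality checks.
    """
    alu_load = alu_ops * tex_units
    tex_load = tex_ops * alu_units
    if alu_load > tex_load:
        return "alu"
    if tex_load > alu_load:
        return "tex"
    return "balanced"


# Hypothetical shader mixes (ALU ops : texture fetches per fragment).
print(bottleneck(3, 1))  # matches the 3:1 hardware ratio
print(bottleneck(1, 1))  # texture-heavy shader
print(bottleneck(6, 1))  # math-heavy shader
```

In this model a shader with three ALU ops per texture fetch keeps both unit types equally busy, while anything more math-heavy than that leaves texture pipes idle, which is the case for adding ALUs rather than texture pipes.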

A similar scheduler is operating in Xenos and goes one step further in providing that GPU with the means to effectively load-balance vertex and fragment shader work.

Finally, of course, this new type of scheduler allows much smaller batches of fragments to be processed. This will make dynamic branching a viable performance enhancement in fragment shader programs - something that we've not seen so far in SM3 architectures.

So, in summary, the new scheduler is a major inflection point - I'd say as important as the first shader-capable GPU (but hey, I wasn't around then, so what do I know?).

Jawed

overclocked
04-Oct-2005, 11:47
Definitely; IMO NVIDIA wouldn't have had such an easy stroll if R520 had been released at about the same time and in high enough quantities for the highest model.

Sure, I agree with that. But at the same time you could get SLI also, and then there's brand loyalty and many other factors. On the other hand, we really don't know how fast it is yet; feature-wise it's great.

Nah, not really. I'm of the opinion that that will work itself out. If you can't buy it, you can't buy it. If ATI can't release the XT in volume, and if the GTX is considered to be better than the best ATI can release in volume, then more people will choose the GTX.

So I'd rather just focus on looking at the technology and performance of the product. Leave whether or not it's available to the people who are actually planning to upgrade (I'll probably have my 6600 GT until sometime late next year, as I just don't have much cash at the moment).

Well, I agree with this also, as I myself am not going to buy any new computer until Vista.
I took what you wrote in your list more in general, as I said, and it's always easy to look back and see what the "right" choice would have been.

I think the pricing of the X1800XT, especially the 512 and also the 256 versions, IS very strange, because if you have the performance advantage ATI is suggesting, plus the fact of expensive FAST memory, I take this as an indication of low-volume parts, but I could be wrong. That bit really struck me when I first read the ATI papers.

Ailuros
04-Oct-2005, 15:55
Sure, I agree with that. But at the same time you could get SLI also, and then there's brand loyalty and many other factors. On the other hand, we really don't know how fast it is yet; feature-wise it's great.

Multi-GPU configs are an entire story of their own; even a high end gamer will consider the extra cost for such a system. While there are definitely customers for such systems, the market share is also dramatically smaller than the remaining high end segment of the market.

As for the feature-set, I count so far the ability to combine 64bpp HDR + MSAA and the new less angle-dependent AF mode. Of course there is also adaptive AA, but since it's there on competitive products I won't count it as an advantage. All the remaining aspects - possible advantages and disadvantages - when it comes to features/functionalities are mainly of developer interest.

overclocked
05-Oct-2005, 03:59
Multi-GPU configs are an entire story of their own; even a high end gamer will consider the extra cost for such a system. While there are definitely customers for such systems, the market share is also dramatically smaller than the remaining high end segment of the market.

As for the feature-set, I count so far the ability to combine 64bpp HDR + MSAA and the new less angle-dependent AF mode. Of course there is also adaptive AA, but since it's there on competitive products I won't count it as an advantage. All the remaining aspects - possible advantages and disadvantages - when it comes to features/functionalities are mainly of developer interest.

Well, I think you're right, but only from the perspective that we know the performance and/or the hit with AA+FP16 HDR. For now we don't know, but soon we will.
There are also a couple more things you could add, such as noise, power and heat, and some may find a dual-slot heatsink impractical; the list goes on.

Skrying
05-Oct-2005, 04:06
We don't know noise, heat or power. Dual slot is a given, though I think most of the people who will be buying these cards really don't care about dual slot unless it's a Shuttle or similar system. Also, I personally would much rather have a dual slot if it moves the hot air out of my case. Something I really dislike about the 7800GTX cooler is that it just recirculates the hot air around in the case; that's also why I like NV/ATI Silencers over, say, a Zalman VGA cooler.

trinibwoy
05-Oct-2005, 04:13
We don't know noise, heat or power. Dual slot is a given, though I think most of the people who will be buying these cards really don't care about dual slot unless it's a Shuttle or similar system. Also, I personally would much rather have a dual slot if it moves the hot air out of my case. Something I really dislike about the 7800GTX cooler is that it just recirculates the hot air around in the case; that's also why I like NV/ATI Silencers over, say, a Zalman VGA cooler.

Yep, don't care about dual-slot either since I'll only have a single card anyway. And I agree that exhausting out the case is best - I miss my Silencer.

rwolf
05-Oct-2005, 05:44
I'm just taking this from the perspective of a scientist. You really have to have independent verification to be sure of the results of any experiment.

David Kirk? :wink: