Beyond3D Forum

The NEXT LAST R600 Rumours & Speculation Thread (http://forum.beyond3d.com/showthread.php?t=39173)

Jawed 12-May-2007 17:20

Quote:

Originally Posted by nAo (Post 985502)
Well, I wouldn't be surprised if drivers can tweak or even modify load balancing policies; it was possible on more primitive architectures, so I don't see why things should change now.

I agree. What Rys says implies he considers it a driver fault.

Quote:

Regarding DX10 performance, I'd like to see MANY MORE features tested than just geometry shaders,

I'd just like to see them tested; so far, nought. There's no conception of the theoretical performance of G80's GS architecture and no idea how G80 is dealing with the variety of possibilities there.

Quote:

it's not going to be the end of the world if the GS implementation is not uber fast now; there's time for that, as long as devs can start to use it.

I guess we'll just have to see...

Jawed

trinibwoy 12-May-2007 17:22

Quote:

Originally Posted by PSU-failure (Post 985505)
I'd like to know where the mistakes are, since you seem to have a great knowledge. :?: It's easier to say someone is wrong than to explain why.

It's also easier for someone to ask for an explanation than to go research the topic themselves and try to better their own understanding so that they can avoid making such silly statements in the future.

First of all, instruction re-ordering is handled by the shader compiler, so your statement there has no merit; at least, I've never heard of out-of-order execution (OOE) in a GPU. And even if it existed, I fail to see how that would be relevant to an R600 to G80 comparison. Let's assume for kicks that GPUs did OOE in hardware. How is R600's ALU configuration more amenable to this than G80's?

And that bit about storing temporary data for in-flight threads: that's the theme that all modern GPUs are built around. I'm not even sure how to respond to your comments about the number of instructions or branching, since they really make no sense to me. Maybe somebody is willing to take a shot at it for ya.

Rys 12-May-2007 17:25

Quote:

Originally Posted by trinibwoy (Post 985507)
Forget the delay. R600's reported performance is far too low given the leaked specs. I'm looking forward to Rys' analysis; hopefully he can shed some light on what's going on with the texturing hardware and identify some potential bottlenecks. If R600 really has only 16 bilinear int8 filtering units then it has no more texturing ability than the GTS. Of course, there could be some really malicious software issues at play, but there's so much raw power that I just don't see that being a major factor.

We can push peak filter rates (and for more than INT8 surfaces) out of R600 at this point (large and small textures too) with a new tester (w00t!), so it seems the driver and hardware are running freely in that respect. It would depend on what the app is doing to poop on that somehow.

Don't look forward to it too much! I get nervous around this time that I'm not going to have the time to put everything in there. I've already cut some stuff (which we'll talk about at some point though)......

Which seems to say I should spend less time drinking tea and reading this thread, and more time hacking it up :lol:

trinibwoy 12-May-2007 17:33

Quote:

Originally Posted by Rys (Post 985512)
We can push peak filter rates (and for more than INT8 surfaces) out of R600 at this point (large and small textures too) with a new tester (w00t!), so it seems the driver and hardware are running freely in that respect. It would depend on what the app is doing to poop on that somehow.

Are those peaks high peaks or low peaks? :smile:

Quote:

Don't look forward to it too much!

Awwwww :cry:

Razor1 12-May-2007 17:41

Quote:

Originally Posted by Rys (Post 985508)
The driver can pass down what amounts to a scheduling hint based on what it's about to send the hardware to chew on.


interesting

Quote:

Which seems to say I should spend less time drinking tea and reading this thread, and more time hacking it up :lol:

LOL

Jawed 12-May-2007 17:44

Quote:

Originally Posted by PSU-failure (Post 985505)
It's easier to say someone is wrong than to explain why.

Unfortunately explaining is not simple.

You need to think of time-sliced batch scheduling as the primary mechanism, with the GPU operating on a set of tens or hundreds of batches (per cluster). Instruction parallelism is something that's pretty much hidden and not (in my opinion) relevant to a discussion of load-balancing and overall batch throughput.

This is a good starting point:

http://www.beyond3d.com/content/articles/4/8

Jawed
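
As a rough illustration of the time-sliced batch scheduling described above, here is a toy Python model of how many resident batches a cluster needs in order to hide fetch latency. The function name, the cycle counts and the idealised "switch to any ready batch" assumption are all hypothetical; this is not a description of how either G80 or R600 actually schedules.

```python
def alu_utilisation(batches: int, alu_cycles: int, latency_cycles: int) -> float:
    """Toy model: each batch does alu_cycles of work, then waits
    latency_cycles on a fetch. While one batch waits, the scheduler
    switches to another resident batch that is ready to run."""
    if batches <= 0:
        return 0.0
    # A single batch keeps the ALUs busy for alu_cycles out of every
    # (alu_cycles + latency_cycles); extra batches fill the idle slots.
    busy = batches * alu_cycles
    period = alu_cycles + latency_cycles
    return min(1.0, busy / period)


# Hypothetical figures: 8 cycles of ALU work per 200-cycle fetch.
for n in (1, 4, 16, 64):
    print(n, round(alu_utilisation(n, alu_cycles=8, latency_cycles=200), 2))
```

In this model, throughput depends on the number of batches in flight rather than on instruction-level parallelism within any one batch, which is the point being made above.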

SugarCoat 12-May-2007 18:01

Quote:

Originally Posted by Bouncing Zabaglione Bros. (Post 985397)
I think it's quite telling that there is no real XTX until R650 arrives at 65nm in a couple more months. It seems that for whatever reason, AMD doesn't have a top-of-the-range product to go head-to-head with the competition, so they are having to try and push their second-string product into top place. This explains the low price, and the less than screaming performance compared to G80.

I think that just like we didn't see the full realisation of R5x0 until R580, so we won't see the full realisation of R6x0 until R650, and I suspect it's much higher clocks. I also think that R650 will come sooner than we expect, because of that obvious gap at the XTX level of the AMD product range that is begging to be filled.

The R580 wasn't a massive leap in real-world performance over the X1800XT in any respect. I believe the biggest advantages came at ultra-high resolutions, but for the most part the R520 was a good, strong core and the R580 was simply a refresh of it that gave a speed boost. In this case, with the R600, it's looking like the problem is with the core itself, so if what we're waiting on is substantial clock increases at 65nm, then that's not very hopeful considering NVIDIA can do the same thing.

The R520 also had a really good advantage, to me anyway, in improved high-IQ performance, and substantially so over GeForce 7 parts. So if the IQ hasn't been improved yet again to something beyond that of the G80, even if it's slower, then this card is a pass for me.

Still got the in-house tech demos to look forward to!

tEd 12-May-2007 18:20

All the bench leaks, I'm surprised no driver has been leaked yet

aeryon 12-May-2007 19:12

Quote:

Originally Posted by tEd (Post 985527)
All the bench leaks, I'm surprised no driver has been leaked yet

For what? Actually they last no more than 2 or 3 days before a new release comes out :lol:

Razor1 12-May-2007 19:26

Rys, just had a question: if the threads in flight are reduced when GS is being used, that will affect everything else, right? It will become a "systemic" problem?

3dcgi 12-May-2007 19:34

Quote:

Originally Posted by trinibwoy (Post 985464)
Hold on, you created a geometry shader that did no work and it still halved performance? :shock:

Unless the driver/compiler can detect that the shader will do no work, GS threads must still be run and memory must be allocated. This will be true of both G80 and R600.

Galduta 12-May-2007 19:59

Very interesting... R6 Las Vegas,

R6: Las Vegas, maximum settings

2900XT/8800GTS:
1024x768
min:38/27
med:74/60
max:111/95

1280x960
min:26/18
med:53/42
max:83/68

1600x1200
min:19/13
med:37/30
max:70/48
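
For reference, the lead those figures imply can be worked out with a few lines of Python. The dictionary layout and names are only for illustration, and "med" is assumed to mean the average frame rate.

```python
# (2900XT fps, 8800GTS fps) as posted above: min, med, max per resolution.
results = {
    "1024x768": {"min": (38, 27), "med": (74, 60), "max": (111, 95)},
    "1280x960": {"min": (26, 18), "med": (53, 42), "max": (83, 68)},
    "1600x1200": {"min": (19, 13), "med": (37, 30), "max": (70, 48)},
}


def lead_percent(xt: int, gts: int) -> float:
    """Percentage by which the first figure exceeds the second."""
    return (xt / gts - 1.0) * 100.0


for res, rows in results.items():
    for label, (xt, gts) in rows.items():
        print(f"{res} {label}: 2900XT ahead by {lead_percent(xt, gts):.1f}%")
```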

Rys 12-May-2007 20:02

Quote:

Originally Posted by Razor1 (Post 985540)
Rys, just had a question: if the threads in flight are reduced when GS is being used, that will affect everything else, right? It will become a "systemic" problem?

In that it'll affect the performance of the entire chip? It has the potential to, of course, if there aren't enough available threads to keep throughput up.

I think there are also cases (currently, and on G80) where a GS shader can have the thread count increasingly reduced to the point where only a relatively small number of them are running and possibly only on one cluster, because it's doing increased amounts of amplification.

I think worst cases like that are possible on any dynamically load-balanced architecture with fixed resources, though. I haven't tested a shader like that heavily, nor on R600 yet.
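
A back-of-envelope sketch of that worst case, in Python. It assumes a fixed per-cluster output buffer in which space is reserved for each GS thread's declared maximum output; the buffer size, vertex size and thread cap are made-up numbers, not G80 or R600 figures.

```python
def gs_threads_in_flight(buffer_bytes: int, max_out_vertices: int,
                         bytes_per_vertex: int, thread_cap: int) -> int:
    """Threads that fit when each one reserves space for its maximum output."""
    per_thread = max_out_vertices * bytes_per_vertex
    return min(thread_cap, buffer_bytes // per_thread)


# Hypothetical cluster: 16 KiB of output space, 32-byte vertices, 768-thread cap.
for max_verts in (3, 16, 64, 256):
    print(max_verts, gs_threads_in_flight(16 * 1024, max_verts, 32, 768))
```

Under these assumptions the number of threads in flight falls roughly in proportion to the declared amplification, which is why heavy amplification could leave too few threads to keep throughput up.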

mboeller 12-May-2007 20:05

From what I have read so far I conclude that the AA of the HD 2900XT is still "broken" because without AA the HD 2900XT is significantly faster than the GTS but with AA the card is slower.

Can someone confirm this?

IbaneZ 12-May-2007 20:13

Quote:

Originally Posted by Galduta (Post 985548)
Very interesting... R6 Las Vegas,

Interesting indeed.

Why the hell doesn't R600 kick the living crap out of the 8800 GTS?

Please, someone has to remind the 320 stream processors that it's showtime. Wakey wakey. :lol:

R300King! 12-May-2007 20:29

I think this is a HD2900XT here.

http://i9.photobucket.com/albums/a52...r/archmark.jpg

FEAR
http://i9.photobucket.com/albums/a52...easer/fear.jpg

CoH
http://i9.photobucket.com/albums/a52...Teaser/CoH.jpg


Are these good scores? :D

Galduta 12-May-2007 20:32

Quote:

Originally Posted by R300King! (Post 985559)
I think this is a HD2900XT here.



FEAR
http://i9.photobucket.com/albums/a52...easer/fear.jpg



Are these good scores? :D


Fake :D ? This test is CPU limited

http://www.firingsquad.com/hardware/...ance/page5.asp


http://www.firingsquad.com/hardware/...es/fear800.gif

Andrew Lauritzen 12-May-2007 20:39

Quote:

Originally Posted by Jawed (Post 985492)
D3D10 requires the developer to specify an upper bound on the number of vertices generated per input vertex. It might be that the driver is playing safe and always assuming the maximum...

I specified a maximum of 3 vertices....

Honestly, on one hand I understand that there are a lot of really hard things to do with respect to the geometry shader. It really does break the parallelism and can easily be coded to bring *any* card to a halt. That said, they must have thought that some of the easy cases could be accelerated efficiently or else they wouldn't have added it to the spec (assuming MS isn't just totally off in left field). Still, the current performance of the G80 GS leaves something to be desired, but I'm willing to concede that it could be partially or entirely driver related at this point.

Still I won't be surprised if geometry shading is another dynamic branching wrt. NV40 vs R520...

R300King! 12-May-2007 20:53

Quote:

Originally Posted by Galduta (Post 985560)

It doesn't seem like a joke or anything. He says it's on a Core 2 Duo @ 2.4GHz default... so, I dunno. We'll see, maybe Monday.

w0mbat 12-May-2007 21:01

single card WR!
http://img46.imageshack.us/img46/4594/30kni1.jpg

tEd 12-May-2007 21:03

Jesus, what's with the 3DMark scores? There's a lot of other stuff to show which would be mucho more interesting.

Skinner 12-May-2007 21:15

Quote:

Originally Posted by R300King! (Post 985559)
I think this is a HD2900XT here.

Are these good scores? :D

Well not for 128x76,8 ;)

edit btw computer settings are on medium.

aeryon 12-May-2007 21:43

Quote:

Originally Posted by tEd (Post 985573)
Jesus, what's with the 3DMark scores? There's a lot of other stuff to show which would be mucho more interesting.

through all time, humans like to compare their penis and see who gets the biggest...











... sorry, me run away :oops:

spidy 12-May-2007 21:48

Quote:

Originally Posted by aeryon (Post 985579)
sorry, me run away :oops:

What? You run away when it comes to a penis comparison? Ah wait, you're French? That'd be a great excuse :lol:

R300King! 12-May-2007 21:51

W0mbat, what was the previous single card 05 WR? 8800GTX, yes? What was its score?


All times are GMT +1.