AMD: R7xx Speculation

Well, if R700 is actually a UMA architecture this should help AFR rendering performance tremendously and basically eliminate synchronization issues. Of course, it's only a rumor that R700 is more than just Crossfire on a card.

The frame sync issue doesn't seem to have anything to do with performance. Having two chips render different frames faster will still have the potential to demonstrate (or even exacerbate) the micro-stuttering.
 
I'm not really convinced by this, but for completeness:

http://www.fudzilla.com/index.php?option=com_content&task=view&id=7769&Itemid=1

ATI will launch its Radeon 4850 and Radeon HD 4870 cards on the 25th of June. It is the last week of June, and the date was the original that we've implied.

There might be some ATI activities before that date, but the official launch with retail availability will indeed be the 25th of this month. Nvidia should be launching on Tuesday, the 17th of June, assuming the retail availability of both Geforce GTX 280 and Geforce GTX 260.

The cards will end up quite fast, much faster than previous RV670 cards, especially with FSAA and Aniso turned on, as RV770 does Anti-aliasing the way it should and not via Shaders (as was the case with RV670).

We heard 1.5 times faster is the number that should appear, at least in the most optimistic case.
(emphasis added)
 
The frame sync issue doesn't seem to have anything to do with performance. Having two chips render different frames faster will still have the potential to demonstrate (or even exacerbate) the micro-stuttering.

I beg to differ. In a typical multi-GPU master/slave setup the transmission of data from the slave card to the master takes time. At least in SLI this is always the case. CF, however, appears to be more flexible in that a single-card multi-GPU solution does not necessarily have to operate in this manner.
 
I'm not really convinced by this, but for completeness:

http://www.fudzilla.com/index.php?option=com_content&task=view&id=7769&Itemid=1

(emphasis added)

Dude, it's fudzilla. Fuad Abazovic doesn't possess the technical knowledge to discuss these sorts of things, and to exacerbate the issue he speaks horribly-broken English. Chances are he took a "we've fixed AA performance" comment out-of-context and interpreted it to mean they no longer implement AA via the SPs.
 
I beg to differ. In a typical multi-GPU master/slave setup the transmission of data from the slave card to the master takes time. At least in SLI this is always the case. CF, however, appears to be more flexible in that a single-card multi-GPU solution does not necessarily have to operate in this manner.

That transfer is not what causes stutter.....

Sharing the memory pool is a performance improvement and performance or lack thereof isn't the underlying cause of inconsistent frame times. Assuming we're still talking about AFR of course.
 
Jason from ET, correct? Thanks.

Correct.

nicolasb said:
I'm not really convinced by this, but for completeness:

http://www.fudzilla.com/index.php?op...=7769&Itemid=1

ATI tells me their AA problems are "fixed," but I think it's probably jumping the gun to say they're not done via shaders (or not programmable sample patterns, or whatever). I honestly don't know the technical details of how they're done, only that ATI assures me that their AA troubles of late are going to be gone.
 
That transfer is not what causes stutter.....

Whether it's on the back-end or the front-end (i.e. setup vs. scanout), there is something bottlenecking that slave GPU, which often exhibits longer average frame rendering times than the master.

Sharing the memory pool is a performance improvement and performance or lack thereof isn't the underlying cause of inconsistent frame times. Assuming we're still talking about AFR of course.

I disagree. The fewer inter-processor communications there are, the fewer dependencies and potential bottlenecks as well.
 
ATI tells me their AA problems are "fixed," but I think it's probably jumping the gun to say they're not done via shaders (or not programmable sample patterns, or whatever). I honestly don't know the technical details of how they're done, only that ATI assures me that their AA troubles of late are going to be gone.
Well, that's certainly good news.

Does this mean that everyone who has been endlessly repeating "ATI can't 'fix' AA because it isn't broken" will now have to shut up? :)
 
Whether it's on the back-end or the front-end (i.e. setup vs. scanout), there is something bottlenecking that slave GPU, which often exhibits longer average frame rendering times than the master.
That really isn't the case. Let's say that a single GPU can render a single frame in 50ms. With two GPUs, each can still render a single frame in 50 ms, but, since they work on two frames simultaneously, they complete two frames much faster than a single GPU can.

Now, in a perfect world, a GPU would finish rendering a frame every 25ms; however, that isn't usually the case. The main reason is that if you are GPU-limited, then the CPU can send the rendering commands for a single frame in less than 50ms. Let's say the CPU can send the commands in 10ms. Then you end up with something like this:

Code:
Time
0ms      game starts
10ms     first frame sent
20ms     second frame sent
30ms     third frame sent
40ms     fourth frame sent (eventually the driver will stop sending frames when it gets too far ahead of the GPU)
60ms     first frame displayed
70ms     second frame displayed
110ms    third frame displayed
120ms    fourth frame displayed
...
Now what does the application see? It sees that frame times oscillate between 10 and 40ms. That can cause problems for the app's animation timer and cause "micro-stuttering".
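
If you want to see those numbers fall out, here's a rough sketch of the same timeline in Python (purely an illustration using the 50ms render and 10ms submit figures above; the "driver eventually stops sending frames" part is left out to keep it short, and none of this is how a real driver actually schedules work):

Code:
# Round-robin AFR with the numbers above: 50ms per frame on a GPU,
# 10ms for the CPU to submit a frame, 2 GPUs.
RENDER_MS = 50   # one GPU needs 50ms per frame
SUBMIT_MS = 10   # the CPU hands over a new frame every 10ms
NUM_GPUS = 2

gpu_free_at = [0, 0]   # when each GPU finishes its current backlog
display_times = []

for frame in range(8):
    submit_time = (frame + 1) * SUBMIT_MS        # 10, 20, 30, 40ms ...
    gpu = frame % NUM_GPUS                       # alternate-frame rendering
    start = max(submit_time, gpu_free_at[gpu])   # the GPU may still be busy
    finish = start + RENDER_MS
    gpu_free_at[gpu] = finish
    display_times.append(finish)

gaps = [b - a for a, b in zip(display_times, display_times[1:])]
print(display_times)   # [60, 70, 110, 120, 160, 170, 210, 220]
print(gaps)            # [10, 40, 10, 40, 10, 40, 10]  <- the oscillation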

-FUDie
 
That really isn't the case. Let's say that a single GPU can render a single frame in 50ms. With two GPUs, each can still render a single frame in 50 ms, but, since they work on two frames simultaneously, they complete two frames much faster than a single GPU can.

Now, in a perfect world, a GPU would finish rendering a frame every 25ms; however, that isn't usually the case. The main reason is that if you are GPU-limited, then the CPU can send the rendering commands for a single frame in less than 50ms. Let's say the CPU can send the commands in 10ms. Then you end up with something like this:

Code:
Time
0ms      game starts
10ms     first frame sent
20ms     second frame sent
30ms     third frame sent
40ms     fourth frame sent (eventually the driver will stop sending frames when it gets too far ahead of the GPU)
60ms     first frame displayed
70ms     second frame displayed
110ms    third frame displayed
120ms    fourth frame displayed
...
Now what does the application see? It sees that frame times oscillate between 10 and 40ms. That can cause problems for the app's animation timer and cause "micro-stuttering".

-FUDie

Would seem to be trivial to solve, and ATI is supposedly working on it, according to an article I read.
 
Well, that's certainly good news.

Does this mean that everyone who has been endlessly repeating "ATI can't 'fix' AA because it isn't broken" will now have to shut up? :)

Well, it's hardly news to anyone that their AA in the R600-670 generation of stuff took a bigger performance hit than in the 5xx series, and it's not news to ATI.

To call AA "broken" now is maybe a bit melodramatic. It works, and it may even work as intended. It could just as well be that ATI made the decision to enable the programmable shader resolve necessary for DX 10.1 even though it meant less performance, opting for features over perf (and maybe naively assuming more devs would use 10.1 for AA and get a perf boost from that). I really don't know.

And it's not at all unreasonable to expect their next architecture to have taken learnings from all that and made it go a lot faster, especially since it's been a sticking point with reviewers and has hurt them in benchmark contests.

I mean, I think everyone would be surprised and disappointed if they didn't improve AA in this coming generation, right?
 
Hope AA resolves via pixel shaders will soon be first-class citizens, as they are absolutely necessary.
 
Whether it's on the back-end or the front-end (i.e. setup vs. scanout), there is something bottlenecking that slave GPU, which often exhibits longer average frame rendering times than the master.

I disagree. The fewer inter-processor communications there are, the fewer dependencies and potential bottlenecks as well.

The issue has nothing to do with bottlenecks or dependencies, though. It's about the CPU sending multiple frames to X number of GPUs in a short period in terms of real and game time, and then waiting a longer period until the first card wraps up. Improving inter-processor communication or reducing latencies will not change anything.

So not only do we get a similar frame repeated in quick succession, but we also miss a bunch of game time while the CPU waits. This oscillating between quick and slow updates is what's perceived as stuttering.

We get
_ . _ . _ . _ . _ . _

We want
_ _ _ _ _ _ _ _ _ _
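
One naive way to picture the fix (purely a sketch of the "delay the early frame" idea, not what ATI or NVIDIA actually do in their drivers) is to hold back presentation of a finished frame until roughly one average interval has passed since the previous one was shown:

Code:
# Sketch of evening out the presentation times from the AFR example above.
# "raw" is FUDie's list of completion times; the pacing rule is made up
# purely for illustration.
raw = [60, 70, 110, 120, 160, 170, 210, 220]   # when AFR finishes each frame
target = (raw[-1] - raw[0]) / (len(raw) - 1)   # average interval, ~22.9ms

paced = [raw[0]]
for t in raw[1:]:
    # never show a frame before it's done, but also never sooner than one
    # average interval after the previous frame was shown
    paced.append(max(t, paced[-1] + target))

print([b - a for a, b in zip(raw, raw[1:])])                # 10, 40, 10, 40 ...
print([round(b - a, 1) for a, b in zip(paced, paced[1:])])  # ~22.9 / ~27.1, far more even

The frames come out slightly later, but the spacing looks like the "_ _ _ _" pattern instead of the "_ . _ ." one.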
 
Isn't the stuttering just a matter of queuing up frames at uneven intervals? Frame 2 starts rendering right after frame 1 starts, and they both finish at roughly the same time. The spacing gets off and it appears to stutter. For example, if it takes 30s to render a frame and you start both within 1s of each other, it's not going to be smooth video. Disregard the time scale I'm using here.
 
So all SLI and CF have been a sham to this point? Should probably tell that to all the dudes who spent 1k+ on two cards for beefy SLI rigs (since SLI is popular and CF isn't, which is why SLI is my example).
 
I would not say it's that simple. There's a huge thread on this "issue" just a few pages down if you're really interested in it. Kinda strange how it's all melded into the GT200/R7xx threads.
 