AMD: Southern Islands (7*** series) Speculation/Rumour Thread

Really, you think the "majority" of GTX 580, 6970, or future 7970 users are going to be at 1920x1200 or below?

WTH? All the graphs I've linked to have been 2560x1600 at least. I haven't played at a resolution below 2560x1600 in 3-4 years, so I wouldn't even contemplate looking at graphs below that res.
 
I'd say anyone spending money for a 580 or 6970 for playing games at 1920x1200 or lower is wasting their money.

I dunno about that; I PC game on a 65" TV, so it's at 1920x1080 with an Nvidia 580. I'm considering a 7970 because it's not peak frame rates that concern me; I'm more interested in minimum and average frame rates. If a video card gives me 60fps all the time in all games with all features enabled, then it's worth it to me. Of course your mileage may vary...
 
I'd say anyone spending money for a 580 or 6970 for playing games at 1920x1200 or lower is wasting their money.

Regards,
SB


Huh? There are already games that can't be maxed at that res with those cards... or at least that will give them all they can handle. BF3, Crysis 2, regular Crysis (LOL), etc. Plus, I'm sure, the Metro games, which I'm less familiar with.
 
I game at 1920x1200 and 1920x1080, and a 580 is far from overkill at either resolution.
 
GCN's presentations indicated there were debug and interrupt instructions. Could this, coupled with the separate ACEs, allow compute kernels that can last for an indeterminate amount of time without fear of triggering a driver timeout? Can the CPU interrupt or switch out tasks, or query if a kernel is progressing?
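
For comparison, about all you can do from the host today is poll the kernel's event status. A rough OpenCL sketch of what "query if a kernel is progressing" currently amounts to (assuming `queue` and `kernel` already exist, with error handling omitted):
Code:
#include <CL/cl.h>

// Rough host-side sketch: poll a running kernel's status from the CPU.
void poll_kernel(cl_command_queue queue, cl_kernel kernel)
{
    cl_event evt;
    size_t global_size = 4096;  // arbitrary example size
    clEnqueueNDRangeKernel(queue, kernel, 1, NULL, &global_size, NULL, 0, NULL, &evt);
    clFlush(queue);  // ensure the kernel is actually submitted to the device

    cl_int status = CL_QUEUED;
    while (status != CL_COMPLETE && status >= 0)  // negative status = error
        clGetEventInfo(evt, CL_EVENT_COMMAND_EXECUTION_STATUS,
                       sizeof(status), &status, NULL);
    // status walks CL_QUEUED -> CL_SUBMITTED -> CL_RUNNING -> CL_COMPLETE;
    // nothing here lets the CPU interrupt or preempt the kernel mid-run.

    clReleaseEvent(evt);
}
Nothing in that loop can stop or switch out the kernel once it's running, which is presumably where GCN's interrupt support would come in.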
 
Although all the graphs in that say F1 2010, the top does actually say F1 2011, and the results do not appear to line up with the F1 2010 test in their 53-card roundup, which may suggest it is indeed F1 2011. In which case: we did notice late in the game that F1 2011 wasn't performing as expected (i.e. in line with F1 2010), but that was discovered after the initial driver.
Oh interesting. I couldn't see anything wrong with the scores other than being terribly CPU limited (the HD7970 is virtually as fast at 2560x1440 with 8xAA as at 1680x1050 with no AA). But if you can fix this CPU limitation to catch up with Nvidia, all the better :).
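As a rough sanity check on that: 2560x1440 is about 3.7 million pixels versus about 1.8 million at 1680x1050, i.e. over twice the pixel load before the 8xAA cost is even counted. If the frame rate barely moves across that gap, the GPU clearly isn't the limiter, which is what points at the CPU (or driver overhead) as the wall.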
 
I took a look at AESEncryptDecrypt as used in Anandtech's HD7970 review. I suspect that the test was run for just a single iteration and that one-time costs and/or power management hindered HD7970 performance. With an 8192x8192 image, AESEncryptDecrypt on the HD7970 took 315ms. Yet when I ran the test for 10 loops (the "-i 10" parameter), the average time was 171ms per iteration. For comparison, the GTX580 took 239ms and 227ms per iteration for 1 and 10 iterations, respectively.
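
In case anyone wants to reproduce this outside the SDK harness, here's the shape of the warm-up-and-average measurement I mean; runKernelOnce() is just a hypothetical placeholder for one full dispatch of the AES kernel:
Code:
#include <chrono>
#include <cstdio>

// Hypothetical placeholder for one full enqueue + clFinish of the AES kernel.
void runKernelOnce()
{
    // ... dispatch the kernel and wait for completion here ...
}

int main()
{
    using clock = std::chrono::steady_clock;

    // Warm-up: absorbs JIT compilation, first-touch allocations, and gives
    // power management time to ramp the clocks before we start timing.
    runKernelOnce();

    const int iterations = 10;
    auto start = clock::now();
    for (int i = 0; i < iterations; ++i)
        runKernelOnce();
    auto stop = clock::now();

    double ms = std::chrono::duration<double, std::milli>(stop - start).count();
    std::printf("avg: %.2f ms/iteration\n", ms / iterations);
    return 0;
}
The point of the warm-up call is to keep one-time costs out of the measured loop entirely, so the average reflects steady-state performance.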
 
Interesting. If power management is to blame, there are some delicate responsiveness/power-efficiency trade-offs involved here. Perhaps the overhead would be smaller on an APU like Llano.

If it's not related to power management, why would Radeons have higher one-time costs than GeForces?
 
They are essentially all about the drivers. It could well be that the AMD driver team just hasn't ever worked on them, while the Nvidia one has. It's not like they are really relevant to any real use.
 
I took a look at AESEncryptDecrypt as used in Anandtech's HD7970 review. I suspect that the test was run for just a single iteration and that one-time costs and/or power management hindered HD7970 performance. With an 8192x8192 image, AESEncryptDecrypt on the HD7970 took 315ms. Yet when I ran the test for 10 loops (the "-i 10" parameter), the average time was 171ms per iteration. For comparison, the GTX580 took 239ms and 227ms per iteration for 1 and 10 iterations, respectively.
We used 150 iterations.

Code:
AESEncryptDecrypt.exe -t -i 150 -x <input file>
 
Arg, apparently I can't edit messages.

Anyhow, while our test is for 150 iterations, you're right in that the report from the program is, as far as we can tell, the average execution time for a single iteration.
 
Thanks for the info. It seems odd that your performance is so different; I'll rerun with the press driver tomorrow.
 
Even with the press driver installed I am getting much better results: 146ms per iteration. What version of the SDK did you use? I want to be sure I'm testing the same version of the sample. Also, are you looking at the reported "Time" or "[Transfer+Kernel]Time"?
 
I'm using the x86_64 sample out of SDK 2.5. I see that AMD has released 2.6 now, but that wasn't out at the time we put the test together.

As for the value, I'm using "Time".
 
Ok, I don't have the 2.5 code in front of me, but if it's like 2.6, then "Time" is probably not the value you want to use. From the 2.6 version of AESEncryptDecrypt.cpp:
Code:
    // Average (transfer + kernel) time per iteration
    totalKernelTime = (double)(sampleCommon->readTimer(timer)) / iterations;
...
void AESEncryptDecrypt::printStats()
{
    std::string strArray[4] = {"Width", "Height", "Time(sec)", "[Transfer+Kernel]Time(sec)"};
    std::string stats[4];

    // "Time" = the full one-time setup cost plus the per-iteration average
    totalTime = setupTime + totalKernelTime;

    stats[0] = sampleCommon->toString(width, std::dec);
    stats[1] = sampleCommon->toString(height, std::dec);
    stats[2] = sampleCommon->toString(totalTime, std::dec);        // reported as "Time(sec)"
    stats[3] = sampleCommon->toString(totalKernelTime, std::dec);  // "[Transfer+Kernel]Time(sec)"
So "Time" is a poor measure of performance since it takes the total set-up time and adds in the average time per iteration. That means if you have a higher set-up time but much faster performance per iteration, you might still "lose". setupTime should be divided by the number of iterations as well if you care about the average performance per iteration as the set-up time is a one time cost so that should spread out over all the iterations equally, or you can just look at the "[Transfer+Kernel]Time" if you don't care about set-up costs.

Yet another bug to file against the samples :p
 