AMD: Southern Islands (7*** series) Speculation/ Rumour Thread

If Nvidia can beat the performance of Tahiti on a 365mm die I'll be impressed with that too.

I suspect nVidia will impress more with pricing than with die size. Their Tahiti sized part will probably fall short of Tahiti performance due to lower transistor density and lower bandwidth.
It will be a very interesting comparison though now that GCN has stepped up big time on the compute side of things and improved geometry processing considerably too. It'll be much easier now to ascertain whether there's a lot of unnecessary bloat in nVidia's scheduler and distributed geometry designs.
 
So you're saying Nvidia's design won't be as efficient?

Based only on die-size? No, probably not but that's a pretty irrelevant comparison without knowing anything about features and/or the bandwidth that die has access to. They should finally be in the same ballpark with power efficiency though unless nVidia is just outright lying about Kepler's strength in that area.

If you look at other metrics (bandwidth, flops, texturing), even Fermi looks pretty good compared to Southern Islands so far.
 
Based only on die-size? No, probably not but that's a pretty irrelevant comparison without knowing anything about features and/or the bandwidth that die has access to. They should finally be in the same ballpark with power efficiency though unless nVidia is just outright lying about Kepler's strength in that area.

If you look at other metrics (bandwidth, flops, texturing), even Fermi looks pretty good compared to Southern Islands so far.

But those metrics (apart maybe from bandwidth) are irrelevant. You don't pay for FLOPS or filtering rate, you pay for die size, PCBs and memory chips. And of course, power. Being efficient in gaming_performance/flops gets you nothing.
 
OK, noob question time:

AMD used to have groups of 5 shaders, 4 basic and 1 complex, then they changed to having groups of 4 shaders, all complex.
What have they got now?
 
, then they changed to having groups of 4 shaders all complex.
Not exactly. The omitted T-unit was a 40-bit device, and that permitted some ops to be carried out much faster. In VLIW4 those functions were distributed, with very little additional hardware, so the remaining ALUs were still considered "simple".
 
Sorry, but Eyefinity etc is a niche feature. Talk about absolute levels of cherry picking. My original statement was to the gaming community as a whole. My point still stands.



So wait... I post loads of graphs that prove my point, and I get shouted down because there's ONE game that shows otherwise. Someone ELSE posts graphs that PROVE MY POINT, and I still get shouted down.

Am I being trolled? There are way too many fanbois in this thread.

You got "shouted down"?
Really?
Is that what :) means now?

The only point I was trying to make is that multimonitor gaming is a large portion of the target demographic for top of the line GPUs. If you're gaming only at 19x12 or even 26x16 the demands on GPU RAM are significantly lower than what many enthusiasts with CFX or SLI are using today and that will only increase. I cannot imagine going back to non-Eyefinity/non-Surround gaming. Play a racer in wide or a strategy in 3888x1920 and you'll be hooked.

So I'm not shouting you down. Hell, for single monitors I'm actually agreeing with you. But for multiple monitor gaming you are wrong, full stop. As that segment grows, 3GB is quite important to the high end.
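To put rough numbers on the resolutions in this post, here is a minimal sketch that compares pixel counts. It assumes "19x12" and "26x16" are shorthand for 1920x1200 and 2560x1600, and it sizes only a single 32-bit color buffer with no AA. That is a deliberately small slice of real VRAM use (textures, depth buffers and MSAA samples come on top), so treat it as a scaling illustration rather than a VRAM estimate.

```python
# Assumption: "19x12" and "26x16" mean 1920x1200 and 2560x1600.
RESOLUTIONS = {
    "1920x1200": (1920, 1200),
    "2560x1600": (2560, 1600),
    "3888x1920": (3888, 1920),
}

BYTES_PER_PIXEL = 4  # assumption: one 32-bit color buffer, no AA


def pixels(width, height):
    """Total pixel count of a display surface."""
    return width * height


def color_buffer_mib(width, height, bytes_per_pixel=BYTES_PER_PIXEL):
    """Size in MiB of one color buffer at the given resolution."""
    return pixels(width, height) * bytes_per_pixel / 2**20


base = pixels(*RESOLUTIONS["1920x1200"])
for name, (w, h) in RESOLUTIONS.items():
    ratio = pixels(w, h) / base
    print(f"{name}: {pixels(w, h):,} px, {ratio:.2f}x the 1920x1200 count, "
          f"{color_buffer_mib(w, h):.1f} MiB per color buffer")
```

Under these assumptions the 3888x1920 setup pushes a bit over three times the pixels of a single 1920x1200 screen, and every per-pixel buffer grows by the same factor.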
 
But those metrics (apart maybe from bandwidth) are irrelevant. You don't pay for FLOPS or filtering rate, you pay for die size, PCBs and memory chips. And of course, power. Being efficient in gaming_performance/flops gets you nothing.

Agree but they're still way more relevant than die-size if you're interested in academic discussion about architectural efficiency. Also, consumers don't pay for those things - the manufacturer does.

nVidia doesn't make its money by being efficient in gaming_perf/flop. They make money by being efficient in selling_price/mm^2 :D
 
The only point I was trying to make is that multimonitor gaming is a large portion of the target demographic for top of the line GPUs.

Define large :) I would be shocked and amazed if more than 5% of CFX/SLI users game on more than one monitor. You're talking about a tiny niche of a tiny niche here.
 
Here's a really good reason why the 7970 is at the right price point for now.
I game at either 3888x1920 or 6028x1200. To do this I have 2x 6970 (no longer in use but handy) or 2x GTX 580 with 3GB.

At the moment I have one game - granted only one - that has texture/shader aliasing in SLI, but not with only one card. The trouble is, I can't drive three monitors without SLI (and I'm too lazy to switch back to the 6970s). So here's a card that, if I had two, I could just run with a single card for that game if I had a CFX issue.

My two cents.
 
Define large :) I would be shocked and amazed if more than 5% of CFX/SLI users game on more than one monitor. You're talking about a tiny niche of a tiny niche here.

I think Dave should comment on that, but I'll wager the fastest-growing segment of top-line GPU sales is multi-monitor. Making it quite fast with 1 GPU is only going to accelerate that growth.

Dave?
 
Sorry to add even more to the FB size branch, but today I played some BF3 Co-op and Multi while monitoring GPU memory usage. Single 1920x1200 monitor, no AA, only FXAA, on ULTRA. Most of the Co-op maps were easy and hovered between 600MB and 1200MB of VRAM, but the MP map I played started at 1300MB and soon breached 1600MB of VRAM usage.
I have 2 screens now but plan to do a proper 3-screen setup this year. It's useful to play EVE on the secondary screen and Civ5 on the main one, for instance. I know very few people are multi-playing games, but you can run several EVE clients at once and each takes around 400MB-600MB of VRAM. 1.5GB is OK for now, but I won't go lower than 2GB from where I am now, even if the 7950 is available in a 1.5GB version for £50 less than the 3GB one.

One more thing to add, GPGPU! Here FB can be VERY important for some algorithms.
 
Almost every response when someone comments on the price is: "but it's 20% faster than a 580 for only 10% more". But the problem is it's only 40% faster than a 6970 for 90% more. "Only 10%" when you're already at $500 is a LOT of money.

You know that you can go all the way to the very bottom (IGP), not only 6970 ;)?

500 is a lot of money, but ultimately it's the market that will determine whether it's too much or not. Perhaps AMD is setting the price so high because of limited supply? Or perhaps they're just testing to see what the reaction to the MSRP is? Or perhaps they just don't feel any pressure to set it any lower at the moment?

As you can see, there are at least a few people here that would be willing to spend that much for this particular card. It's too much for you (probably too much for me too right now), that much is clear, but there's no reason to argue about it for 4 pages, as I'm sure no one is gonna convince the other, as it's very subjective...
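For what it's worth, the percentages quoted at the top of this post can be turned into a relative performance-per-dollar figure. This is only a sketch using the thread's own round numbers (20% faster for 10% more, 40% faster for 90% more). The function name `relative_value` is made up for illustration, and no actual prices or benchmark results are assumed.

```python
def relative_value(perf_gain, price_increase):
    """Performance per dollar of the new card relative to the old one.

    Both arguments are fractions, e.g. 0.20 for "20% faster".
    A result above 1.0 means more performance per dollar than the old card.
    """
    return (1 + perf_gain) / (1 + price_increase)


# Figures as quoted in the thread, not measured data.
vs_580 = relative_value(perf_gain=0.20, price_increase=0.10)
vs_6970 = relative_value(perf_gain=0.40, price_increase=0.90)

print(f"vs GTX 580: {vs_580:.2f}x performance per dollar")
print(f"vs HD 6970: {vs_6970:.2f}x performance per dollar")
```

With those inputs the 7970 comes out at roughly 1.09x the 580's performance per dollar but only about 0.74x the 6970's, which is both sides of the argument in two numbers.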
 
Not exactly. The omitted T-unit was a 40-bit device, and that permitted some ops to be carried out much faster. In VLIW4 those functions were distributed, with very little additional hardware, so the remaining ALUs were still considered "simple".

OK, so do we know what the 7 series has?
 
Agree but they're still way more relevant than die-size if you're interested in academic discussion about architectural efficiency.

Also, consumers don't pay for those things - the manufacturer does.

Consumers pay the ask price, which doesn't have to be related to any technical consideration, depending on the market situation.

I think looking at gaming_perf/FLOPS is taking "academicness" to the point of silliness. For a given process and similar performance level, the best design is the one that gets the best performance/(size×power).

The point is that when discussing efficiency and trying to compute ratios, you have to choose things that actually cost something as denominators. Usually, that's power—something your customers do pay for—and the manufacturing cost, which includes the die size, the cost of the PCB and memory, and power-related things (including cooling). All things that actually cost money to you or your customers. Why should FLOPS factor into that? It's an interesting point of micro-architectural detail, but not an efficiency issue.

I mean, if your design has a huge gaming_perf/FLOPS ratio, but doesn't actually outpace the competition, is more expensive to make and draws more power, how does that help you?
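A quick sketch of the two ratios being argued over, performance/(size×power) versus gaming_perf/FLOPS. Every number below is invented purely to illustrate the point and does not describe any real GPU. The `Design` class and its field names are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Design:
    name: str
    perf: float      # relative gaming performance (invented)
    die_mm2: float   # die area in mm^2 (invented)
    power_w: float   # board power in watts (invented)
    gflops: float    # peak arithmetic rate (invented)

    def perf_per_cost(self):
        """The metric proposed above: performance / (size x power)."""
        return self.perf / (self.die_mm2 * self.power_w)

    def perf_per_flop(self):
        """The 'academic' metric: gaming performance per peak FLOP."""
        return self.perf / self.gflops


# Hypothetical designs: A extracts more from each FLOP,
# B is smaller, cooler and slightly faster overall.
a = Design("A", perf=100, die_mm2=500, power_w=250, gflops=1500)
b = Design("B", perf=110, die_mm2=350, power_w=200, gflops=3500)

for d in (a, b):
    print(f"{d.name}: perf/FLOP = {d.perf_per_flop():.4f}, "
          f"perf/(mm^2*W) = {d.perf_per_cost():.6f}")
```

In this made-up case A wins on perf/FLOP by about 2x yet loses on perf/(size×power) by about the same margin, which is the situation the last paragraph describes.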
 
Probably by looping (at least) 3 times. So basically the same as in Cayman, just not in parallel.

So in Cayman, 64 transcendentals would be executed in groups of 16, taking four cycles total (full pipelining?), while in Tahiti, they would be executed all together, but would have to go through at least three loops? Is that right?
 
If the hardware is distributed amongst the lanes of an individual SIMD, full transcendental throughput would require 4 separate batches of 64, one per SIMD. Each individual SIMD would have 1/4 the throughput.
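The throughput arithmetic in this last reply can be sketched as a toy model. It takes the thread's figures at face value (batches of 64, Cayman working in groups of 16 over four cycles, each Tahiti SIMD at 1/4 rate, four SIMDs). The 16-lane SIMD width for Tahiti is my assumption, and none of this is confirmed hardware behavior.

```python
import math

BATCH = 64  # work-items per batch, as used in the thread


def cycles_per_batch(lanes, rate=1.0):
    """Cycles to push one batch through `lanes` units.

    `rate` is ops per lane per cycle; 0.25 models quarter-rate execution.
    """
    return math.ceil(BATCH / lanes) / rate


# Cayman as described above: groups of 16, four cycles in total.
cayman_cycles = cycles_per_batch(lanes=16)
cayman_throughput = BATCH / cayman_cycles  # transcendentals per cycle

# Tahiti under the quarter-rate reading.
# Assumptions: 16 lanes per SIMD, 4 SIMDs working on separate batches.
SIMDS = 4
tahiti_simd_cycles = cycles_per_batch(lanes=16, rate=0.25)
tahiti_simd_throughput = BATCH / tahiti_simd_cycles
tahiti_total_throughput = SIMDS * tahiti_simd_throughput

print(f"Cayman: {cayman_cycles:.0f} cycles per batch, "
      f"{cayman_throughput:.0f} per cycle")
print(f"Tahiti, one SIMD: {tahiti_simd_cycles:.0f} cycles per batch, "
      f"{tahiti_simd_throughput:.0f} per cycle")
print(f"Tahiti, {SIMDS} SIMDs busy: {tahiti_total_throughput:.0f} per cycle")
```

Under these assumptions one SIMD alone gets a quarter of the Cayman rate, and the aggregate only matches it when all four SIMDs each have their own batch of 64 in flight.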
 