AMD: Southern Islands (7*** series) Speculation/Rumour Thread

Eric: You can get CPU bound; it's also a new architecture for us, so there are other bottlenecks we're going to address over the coming months, and it'll keep on getting better. The heuristics it's using are fundamentally different from our previous architecture; there are easier parts and there are parts we're still working on. There's a lot, in fact we've got some games that have gone up 5-fold in performance since we started, from August to now, from driver changes to take advantage of the architecture. Probably another 2-3 months will really allow us to showcase the card. In one of the presentations you saw the typical performance changes, 15-40%; in this generation you'll see bigger numbers.


He expects a lot from driver improvements for this new architecture!

Yes, it's quite astonishing, really, to read the comments of some who have evidently forgotten that driver revisions, especially for new architectures, routinely deliver dramatic improvements in performance and even image quality over the first few months. It seems like common knowledge to me, so I'm surprised anyone finds this information shocking. I always enjoy reading Eric's comments and can only hope the pessimists among us will take the time to read them, too...
 
Sorry for the OT; I use Chrome and release drivers. There are too many places where something could go wrong in a modern computer, but the reason I blame AMD is that the video driver crashes before the BSOD, and there's a documented flaw with multiple accelerated contexts in the first Catalyst release to support Flash hardware acceleration that was never addressed in any of the release notes of subsequent Cats. Now I just reboot the machine right after the first driver crash, which happens haphazardly.

It seems like some kind of poorly handled buffer overflow, and maybe when the GPU is finally treated as a first-class citizen by the OS, which is where the trend seems to be going, these issues will get resolved more gracefully. I never got these issues with software-only Flash, after all, so support for the unified memory space offered by GCN will be a great boon (once we get past the requisite teething issues).

Hopefully you've got more than one browser installed. I had three installed before I uninstalled Chrome the other day. Btw, I believe Chrome is still in perpetual beta (?), so you shouldn't be surprised by quirky/buggy behavior. I took Chrome off because I really, really dislike the Google update program the browser installs by default: you cannot disable it from running at startup. (I like to check for updates manually with IE and Firefox and wanted to do that with Chrome, too.) When you set googleupdater.exe (or something close to that) not to run at startup, it ignores the command and runs another copy of itself on the next boot. That's malware behavior; I don't like it, and I don't have the patience to fiddle with Chrome to see if I can turn the program off... :/

Anyway... my suggestion would be to try the same thing with another browser or two, just to eliminate the possibility that it's a problem with Chrome and the Flash player. I've had no problems with hardware-accelerated Flash with my AMD GPU and drivers in Firefox or IE. Take it for what it's worth and...

MERRY CHRISTMAS! Ho-Ho-Ho!
 
Thanks for the tip; you've definitely got legitimate concerns about Google's Chrome. For me, though, its upsides outweigh its automated update policy, and they seem to handle upgrades intelligently and cautiously, which I like:

http://www.tomshardware.com/news/google-chrome-update-download-browser,14249.html

As for the problem, I've definitely reproduced the error with both FF and IE on 3 different platforms using AMD GPUs, so there's something fishy going on with either Flash, AMD's drivers, Windows, or possibly one of a hundred thousand other things. I'm not so obsessive about miscellaneous errors on my computer anymore, especially since others experience the same thing and I only hit it under the most extreme loads (like 20+ YouTube vids across multiple browsers, or multiple old tabs with Flash vids open for several days).

BACK ON TOPIC:

When will AMD expose things like unified memory addressing and compiler support? The last slide on this page seems to indicate a gradual rollout:

http://www.rage3d.com/articles/amd_fusion_summit_2011/index.php?p=7

but I wonder whether that means new driver versions, new Windows versions, or hardware revisions like future APUs.
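
For a feel of what exposing it could look like in practice: the nearest thing available today is zero-copy buffer sharing in OpenCL, where the driver may map host memory for the GPU instead of copying it. Below is a minimal sketch, assuming an OpenCL 1.1 runtime such as the AMD APP SDK; the names and sizes are purely illustrative, not anything AMD has announced:

/* Minimal sketch: zero-copy buffer sharing, today's closest analogue
 * to a unified address space. Assumes an OpenCL 1.1 runtime (e.g. the
 * AMD APP SDK) and a GPU device; sizes/names are illustrative only. */
#include <CL/cl.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    cl_platform_id platform;
    cl_device_id   device;
    cl_int         err = CL_SUCCESS;

    if (clGetPlatformIDs(1, &platform, NULL) != CL_SUCCESS ||
        clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL)
            != CL_SUCCESS) {
        fprintf(stderr, "no OpenCL GPU found\n");
        return 1;
    }
    cl_context ctx = clCreateContext(NULL, 1, &device, NULL, NULL, &err);

    /* Hand an ordinary host allocation to the runtime. With
     * CL_MEM_USE_HOST_PTR the driver *may* map it for the GPU directly;
     * without a shared address space it silently copies instead. */
    size_t n = 1 << 20;
    float *host = malloc(n * sizeof(float));
    cl_mem buf = clCreateBuffer(ctx, CL_MEM_READ_WRITE | CL_MEM_USE_HOST_PTR,
                                n * sizeof(float), host, &err);

    printf("zero-copy buffer: %s\n", err == CL_SUCCESS ? "ok" : "failed");

    clReleaseMemObject(buf);
    clReleaseContext(ctx);
    free(host);
    return 0;
}

With a true unified address space, the "may" disappears: the same pointer would simply be valid on both CPU and GPU, whether that arrives via new drivers, a new Windows display model, or future APU silicon.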
 
Anand said that at least one low-end chip must be VLIW4 to CrossFire with Trinity
The only VLIW4 member of the HD 7000 series will be Trinity's integrated graphics. Everything above will be GCN (Tahiti, Pitcairn, Cape Verde); every discrete GPU below that will be VLIW5. You can CrossFire VLIW5 with VLIW4 GPUs, and there is even the potential to do the same with GCN (the Rage3D interview mentions that one would need to deactivate the AF improvements for that).
 
R3D: ...that ECC support will be available on compute products; does that mean it's only going to be turned on for the FirePro/FireStream products?
Eric: At this point, yes. We thought about 'could it be used to improve yield on consumer products' and things like that, and we may decide to do that kind of thing; well, we reserve the right to do anything we want, I guess! Right now those kinds of features would actually hurt performance for consumers, because they do take away from memory storage (maybe not the internal, but the external DRAM), and it would certainly make the drivers more complex. The FirePro driver team are doing that because some of their customers desire it; I wouldn't say it's a requirement, but Oil & Gas, Medical, they need it for liability reasons, and the whole server play, these guys need it. For now that's our plan, not to enable it for consumer. It's not necessary to destroy it or burn a fuse or anything, but just not to enable it for consumer.

If I'm reading it correctly, Eric is hinting that enabling ECC should be possible on consumer products with the proper VBIOS and driver.
I wonder if this means the possibility of modding a standard HD 7970 into a FirePro at home, as was possible in older generations?

The usefulness of that for gamers is zero, but for a student/enthusiast gamer with an interest in professional graphics, the ability to install FireGL drivers is a huge win!
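
To put a rough number on Eric's "take away from memory storage" point, here is a back-of-the-envelope sketch. AMD hasn't published its ECC scheme, so a textbook (72,64) SECDED code with the check bits stored in the external GDDR5 is assumed purely for illustration:

/* Back-of-the-envelope sketch of the ECC storage cost Eric mentions.
 * Assumes the check bits live in ordinary external GDDR5, protected by
 * a textbook (72,64) SECDED Hamming code; AMD has not published its
 * scheme, so these numbers are illustrative, not specifications. */
#include <stdio.h>

int main(void)
{
    const double total_gb  = 3.0;   /* e.g. a 3 GB Tahiti board        */
    const double data_bits = 64.0;  /* payload bits per protected word */
    const double code_bits = 72.0;  /* payload + SECDED check bits     */

    double usable  = total_gb * (data_bits / code_bits);
    double lost_pc = 100.0 * (1.0 - data_bits / code_bits);

    printf("usable memory with ECC: %.2f GB of %.1f GB (%.1f%% reserved)\n",
           usable, total_gb, lost_pc);
    /* -> usable memory with ECC: 2.67 GB of 3.0 GB (11.1% reserved) */
    return 0;
}

So roughly a ninth of the DRAM would disappear, on top of the added driver complexity, which is consistent with Eric's reluctance to pay that cost on consumer boards.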
 
Anand said that at least one low-end chip must be VLIW4 to CrossFire with Trinity
[attached slide: Trinity's "next-generation DX11" graphics with VCE, compared to Llano]

I think "next-generation DX11" with VCE = GCN.
 
VCE is an independent block; it isn't related to any specific GPU architecture. Anyway, this slide compares Trinity to Llano. Llano is VLIW5, so "next-gen DX11" means VLIW4. (GCN is DX11.1.)

fehu: Dave (I believe) already confirmed that the drivers are prepared for CrossFire between different architectures, so they can mix VLIW4 and VLIW5 (I guess they will mix VLIW4 with HD 6xxx VLIW5 GPUs only, because HD 5xxx VLIW5 and GCN offer different AF, which would create a disturbing effect in both AFR and scissor modes).
 
VCE is an independent block. It isn't related to any specific GPU architecture.
Hybrid mode VCE uses the CUs' compute power to encode.
[attached slide103.jpg: VCE hybrid encode mode]


Anyway, this slide compares Trinity to Llano. Llano is VLIW5, so "next-gen DX11" means VLIW4. (GCN is DX11.1.)
Why would AMD use VLIW4 in Trinity? If AMD still has any sense, they will put an 8-CU GCN GPU in Trinity instead of making the driver team more miserable.
 
Hybrid mode VCE uses the CUs' compute power to encode.
Hybrid mode is high-end related. AMD stated that low-end products will use the standard mode.

Why would AMD use VLIW4 in Trinity? If AMD still has any sense, they will put an 8-CU GCN GPU in Trinity instead of making the driver team more miserable.
VLIW offers better 3D performance per transistor than GCN (at least at this moment; drivers may change that in the future). Die size is critical for APUs. VLIW4 drivers are working fine. The VLIW architecture is performance/transistor effective, so it's quite optimal for such a product.
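
A crude way to see the performance-per-transistor claim is raw ALUs per transistor on the two discrete flagships. Using the published counts (Cayman ~2.64B transistors with 1536 ALUs, Tahiti ~4.31B with 2048), and fully admitting this ignores everything else Tahiti's budget buys (DX11.1, the 384-bit bus, the compute plumbing):

/* Crude proxy for the performance/transistor argument: raw ALU count
 * per billion transistors. Published figures: Cayman (VLIW4) ~2.64B
 * transistors / 1536 ALUs; Tahiti (GCN) ~4.31B / 2048. This ignores
 * what else Tahiti's transistors buy, so treat it as a sketch only. */
#include <stdio.h>

int main(void)
{
    const double cayman_alus = 1536.0, cayman_bxtors = 2.64;
    const double tahiti_alus = 2048.0, tahiti_bxtors = 4.31;

    printf("Cayman: %.0f ALUs per billion transistors\n",
           cayman_alus / cayman_bxtors);
    printf("Tahiti: %.0f ALUs per billion transistors\n",
           tahiti_alus / tahiti_bxtors);
    /* -> ~582 vs ~475: the VLIW part packs more raw ALUs per transistor,
     *    which matters when the whole GPU must fit in an APU's corner. */
    return 0;
}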
 
Hybrid mode VCE uses the CUs' compute power to encode.
[attached slide103.jpg: VCE hybrid encode mode]

Why would AMD use VLIW4 in Trinity? If AMD still has any sense, they will put an 8-CU GCN GPU in Trinity instead of making the driver team more miserable.

It's been confirmed to be VLIW4. It's a scheduling thing: APUs take a while to design, so you can't always get the latest graphics tech into them. Likewise, Llano shipped long after Cayman but was still VLIW5.

The APU after Trinity, probably in early 2013, will most likely feature GCN.
 
VLIW offers better 3D performance per transistor than GCN (at least at this moment; drivers may change that in the future).
If so, they would have kept VLIW5 instead of going to VLIW4.

Die size is critical for APUs.
Two Bulldozer cores without L3 will be smaller than four 32nm K10 cores, so AMD can put a larger GPU in a similar die size.

VLIW4 drivers are working fine.
What about future games?

The VLIW architecture is performance/transistor effective, so it's quite optimal for such a product.
It is not effective for GPGPU compute, which AMD advertises as a major selling point for the APU.
 
If so, they would have kept VLIW5 instead of going to VLIW4.
Actually, when they released Cayman they said that from a pure gaming point of view VLIW5 is better; VLIW4 just suits compute a bit better and is almost as good in gaming.
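
The arithmetic behind "almost as good": AMD's Cayman press material cited roughly 3.4 of 5 VLIW slots filled by the average game shader, so narrowing the bundle to 4 mostly trims slots that sat idle anyway. Taking that 3.4 figure at face value (it's a marketing average, so treat this as illustrative):

/* Sketch of why dropping the fifth slot costs little in games. AMD's
 * Cayman launch slides cited ~3.4 of 5 VLIW slots filled on average by
 * game shaders; the figure is a press-material average, so the output
 * is illustrative rather than a measurement. */
#include <stdio.h>

int main(void)
{
    const double avg_ops = 3.4; /* avg independent ops packed per bundle */

    printf("VLIW5 slot utilization: %.0f%%\n", 100.0 * avg_ops / 5.0);
    printf("VLIW4 slot utilization: %.0f%%\n", 100.0 * avg_ops / 4.0);
    /* -> ~68% vs ~85%: the same game code keeps more of the narrower
     *    machine busy, so the SIMD shrinks without losing much speed. */
    return 0;
}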
 
Not quite sure how that will work out for Cape Verde, since CU groups of 4 would be too coarse-grained for disabling units on lower-end parts (8 or 12 CUs is quite a large gap).
I'd like to mention that both "Chelsea" and "Heathrow" mobile parts appear to be based on Cape Verde, and the large performance gap between 12 and 8 CUs might be the reason why AMD is using two codenames for the same chip.
 
From Eric Demers: http://www.rage3d.com/interviews/amdchats/eric_demers_dec_2011/index.php?p=4

Eric: The CUs themselves, the whole thing is very scalable, you can disable CUs one after another and you can improve yield that way. Let's say you have one bad CU, well it's not a 7970 anymore, but maybe it's a 7950? We will scale CUs, don't know if we'll scale ROPs.

My reading of this is that CU counts other than multiples of 4 are possible.
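
In stream processor terms, each GCN CU is 64 SPs (four 16-wide vector SIMDs), so per-CU salvage would walk down from a full Tahiti in steps of 64, with nothing forcing a multiple of four. A quick enumeration (the 7950's final CU count is still unannounced, so that label is speculative):

/* What per-CU salvage implies for shader counts: each GCN CU holds 64
 * stream processors (4 x 16-wide vector SIMDs), so disabling CUs one
 * at a time from a full Tahiti walks down in steps of 64 ALUs, and no
 * step has to land on a multiple of four CUs. */
#include <stdio.h>

int main(void)
{
    const int sp_per_cu = 64;
    for (int cus = 32; cus >= 28; --cus)
        printf("%2d CUs -> %4d stream processors%s\n",
               cus, cus * sp_per_cu,
               cus == 32 ? "  (HD 7970)" :
               cus == 28 ? "  (a 7950, perhaps?)" : "");
    return 0;
}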
 
If so, they would have kept VLIW5 instead of going to VLIW4.

VLIW4 is a good compromise between 3D performance, compute performance, and die size.

Since GPU compute is a big part of making Fusion successful, any solution here is going to be weighted more heavily towards compute. Hence, VLIW5 isn't suitable. GCN would be preferable, but it's too early for it to be integrated into a CPU (APUs have longer design windows), and it likely costs more die space for similar 3D performance (which cannot be ignored even if compute is preferred).

Regards,
SB
 