AMD: Southern Islands (7*** series) Speculation/ Rumour Thread

If AMD had launched GCN at a more competitive SRP and focused more on helping game developers update their engines to support GCN, they would have been building the brand... slowly. But I will forgive AMD this round if they start releasing unlockable 7950s... :oops:

Recent / future titles already partnering with AMD Gaming Evolved include: Sleeping Dogs, Hitman Absolution, Medal of Honor Warfighter, Tomb Raider, Bioshock Infinite, with more to come... :)
 
ROPs are quite big units. Would an additional 16 ROPs boost perf/mm²? I don't think so. Tahiti doesn't suffer in high-resolution / high-MSAA scenarios; in fact, these are its strongest points.

At low resolution with no AA, the CPU and the geometry side of the chip become the bottleneck and hold performance back.

High resolution / high MSAA is ALWAYS the strongest point of the "fastest/most expensive/biggest" chip.

In order to get
1) the huge number of shaders utilized, lots of pixels running shaders (a high resolution) is needed
2) the huge texturing power utilized, lots of textured pixels (a high resolution) is needed
3) the huge memory bandwidth utilized, lots of pixel samples (a high resolution and a big AA mode) are needed
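The scaling in that list can be sketched with a little arithmetic (the resolutions and MSAA factors below are illustrative choices, and counting one stored sample per MSAA sample is a deliberate simplification):

```python
# Rough illustration of why high resolution + MSAA feeds a big chip:
# shader/texture work scales with pixel count, and bandwidth with
# sample count (pixels x MSAA factor, simplified).

def pixels(width, height):
    return width * height

def samples(width, height, msaa):
    # Each MSAA sample adds one stored color/Z sample per pixel (simplified).
    return pixels(width, height) * msaa

low  = samples(1280, 720, 1)    # low resolution, no AA
high = samples(2560, 1600, 4)   # high resolution, 4x MSAA

print(round(high / low, 1))  # ~17.8x more samples to shade, filter and write
```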
 
You want to use a $400-500 card to play without AA and at low resolution? Buy an $80 card for that. I don't think Tahiti is so bad on the raster and geometry side. Yes, GCN needs to be fed to reach its maximum, but isn't that exactly what you ask of this type of card?

Look at the Tom's Hardware review: some will complain they pushed the settings with max AA etc., and that this makes the 7870 look equal to, or faster than, the 660 Ti. But even with this type of card, the games used are totally playable at those settings, so why use 2-4x MSAA, as we see in some reviews, just to get 108 fps instead of 78?
 
Even sadder is that the difference between the 7900 series and the 670 series is tiny.

This is a 7970 vs a GTX 670 at the same settings. I wouldn't normally bench with vsync enabled, but it was the only bench I could find with a similar system to my own.

http://i.imgur.com/v0vGs.png

http://i.imgur.com/GfgZZ.png


Mine is the bench with the 670.
 
Those graphs for Sleeping Dogs are hilarious; way to make the differences seem much bigger than they really are.

The optimized-shadowing off/on picture with the street is also hilarious: the "off" shot has the shadows completely gone.
 
Mmm, tasty.

Quick note for any AMD people paying attention: content on page 11-139 is a duplicate of page 11-138. Instruction 52 (0x34) is missing, too.

I strongly suspect the following paragraph in section 9.1 needs to be re-written:

There are 32 kB memory per compute unit, segmented into 32 or 16 banks (depending on the GPU type) of 1 k dwords (for 32 banks) or 2 k dwords (for 16 banks). Each bank is a 256x32 two-portRAM (1R/1W per clock cycle). Dwords are placed in the banks serially, but all banks can execute a store or load simultaneously. One work-group can request up to 32 kB memory.
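For what it's worth, the banking scheme that paragraph seems to describe can be sketched like this (the constants come from the quoted text; the function and the conflict example are my own illustration):

```python
# Sketch of the LDS banking described above: dwords are striped across
# the banks serially, so bank index = dword address mod number of banks.
NUM_BANKS = 32  # 16 on some GPU types, per the quoted paragraph

def bank_of(byte_address):
    dword_address = byte_address // 4
    return dword_address % NUM_BANKS

# Consecutive dwords land in consecutive banks -> conflict-free access:
print([bank_of(4 * i) for i in range(4)])    # [0, 1, 2, 3]
# A stride of 32 dwords (128 bytes) maps every lane to bank 0 -> serialized:
print([bank_of(128 * i) for i in range(4)])  # [0, 0, 0, 0]
```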

There are plenty of typos throughout, e.g. "newton rhapson" (cue: Davros).
 
I would be more interested in knowing how the game uses VRAM on the two cards, and then seeing video of both (from a camera phone or other capture device) just to see whether the gameplay looks the same or not. But that would also mean bumping the resolution up to include 1920x1200.
 
The SI ISA document has 4 outstanding-instruction counters, although the S_WAITCNT instruction currently only tracks three.
The immediate for S_WAITCNT doesn't look like it allocates all of its bits; perhaps there's room to grow for the other three, or for a possible future expansion to track VALU_CNT?
That would be rather Kepler-like.
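A sketch of how the S_WAITCNT immediate appears to be laid out, going by the SI ISA document (field positions as I read them there; treat this as illustrative rather than authoritative):

```python
# Decoding the 16-bit S_WAITCNT immediate (field positions as I read the
# SI ISA doc -- an illustration, not an authoritative reference):
#   vmcnt   = bits [3:0]   vector memory ops outstanding
#   expcnt  = bits [6:4]   exports outstanding
#   lgkmcnt = bits [12:8]  LDS/GDS/constant/message ops outstanding
# Bit 7 and bits [15:13] look unallocated -- the "room to grow".

def decode_waitcnt(simm16):
    return {
        "vmcnt":   simm16 & 0xF,
        "expcnt":  (simm16 >> 4) & 0x7,
        "lgkmcnt": (simm16 >> 8) & 0x1F,
        "unused":  ((simm16 >> 7) & 0x1) | (((simm16 >> 13) & 0x7) << 1),
    }

print(decode_waitcnt(0))  # all fields zero: wait until every counter drains
```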
 
That would only have any benefit for scenarios 5, 6 and 7 in table 4.2, and only when the shader compiler has no non-dependent instructions available to schedule.

On the other hand, the 3 bits allocated to VALU_CNT obviously imply something very "short term", but the document provides no explanation of what that could be. Perhaps it's used inside a FORK … JOIN block? e.g. if clauses within it are shorter than 9 instructions the hardware uses this counter to handle swapping for the wavefront.

Strangely, there is a single reference to a "clause", in section 9.2. It made me wonder if there's an underlying 8-instruction cadence for clauses. i.e. AAAAAAAABBBBBBBBAAAAAAAA... represents two wavefronts, alternating in their issue of VALU instructions. But that seems extremely unlikely.
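The speculated cadence is easy to write out as a toy issue trace (purely an illustration of the guess above, not documented hardware behaviour):

```python
# Toy trace of the speculated 8-instruction clause cadence: two wavefronts
# (A and B) alternate ownership of the VALU in fixed 8-slot clauses.
CLAUSE_LEN = 8

def issue_trace(num_clauses):
    trace = ""
    for clause in range(num_clauses):
        owner = "AB"[clause % 2]  # wavefronts swap every clause
        trace += owner * CLAUSE_LEN
    return trace

print(issue_trace(3))  # AAAAAAAABBBBBBBBAAAAAAAA
```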

Still no APP Kernel Analyzer for SI :cry:
 
And how far overclocked is your 670 vs that stock 7970?

Also the difference is 10%, which is not bad. The GHz Edition (which just does the same thing all 670s already do) will up that considerably.

The GHz Edition in the AMD graph is over 50% faster than the 670, and ~40% faster at 1080p. I'm guessing your 670 is overclocked massively and you "forgot" to mention it or something. Or the AMD bench was CPU-limited and you "forgot" to mention that, or something. What with those amazing Nvidia drivers and all.
 
What clock speeds were you running your GTX 670 at? This link shows a GTX 680 getting 34.7 fps at the same settings:
http://www.youtube.com/watch?v=PHtnQLvodJ0

So you must definitely be running a very high overclock. If you take an HD 7970 with a voltage overclock to 1250 MHz, it will obviously open up a huge gap between the GTX 670 and the HD 7970.
 
I thought he used an older 7970 at stock (non-GHz Edition) vs a 670 at its highest boost/OC rate. I noticed that his max FPS is 62.1 for the 670 with vsync? Something is not right with that.
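For context on why a 62.1 fps maximum with vsync looks wrong: with vsync on, a frame can only be presented on a refresh boundary, so instantaneous fps should never exceed the refresh rate. A minimal sketch, assuming a 60 Hz display (the refresh rate is my assumption, not stated in the thread):

```python
# With vsync on a 60 Hz display, the effective frame time is rounded UP
# to a whole number of 1/60 s refresh intervals, so instantaneous fps
# should never exceed 60 -- a 62.1 fps reading suggests vsync was off,
# or the display refreshes faster than 60 Hz.
import math

REFRESH_HZ = 60.0           # assumed refresh rate
interval = 1.0 / REFRESH_HZ

def vsynced_fps(render_time_s):
    displayed = math.ceil(render_time_s / interval) * interval
    return 1.0 / displayed

print(round(vsynced_fps(0.010), 1))  # renders in 10 ms, still capped: 60.0
print(round(vsynced_fps(0.020), 1))  # misses one refresh: 30.0
```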
 
Why even discuss benchmarks taken with vsync on...

(and with unknown, different CPUs, control panel settings, etc.)
 