AMD: Volcanic Islands R1100/1200 (8***/9*** series) Speculation/ Rumour Thread

Discussion in 'Architecture and Products' started by Nemo, May 7, 2013.

  1. fellix

    fellix Hey, You!
    Veteran

    Joined:
    Dec 4, 2004
    Messages:
    3,505
    Likes Received:
    424
    Location:
    Varna, Bulgaria
  2. lanek

    Veteran

    Joined:
    Mar 7, 2012
    Messages:
    2,469
    Likes Received:
    315
    Location:
    Switzerland
    Nice find Fellix...
     
  3. Grall

    Grall Invisible Member
    Legend

    Joined:
    Apr 14, 2002
    Messages:
    10,801
    Likes Received:
    2,174
    Location:
    La-la land
Transferring a five-screen Eyefinity framebuffer at 1440p/60 Hz requires over 4 GB/s. It's unlikely this will come at absolutely no performance hit whatsoever. Unless this bridgeless CrossFire supports only two cards, you could pretty much swamp the entire PCIe interface with framebuffer data alone, without even reaching 4K resolutions.

    Frankly, I think this is a mistake.
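As a sanity check on the figure above, a quick back-of-the-envelope calculation (assuming 32-bit pixels and uncompressed transfers, which the post doesn't state explicitly):

```python
# Five 2560x1440 displays at 60 Hz, assumed 4 bytes per pixel (32-bit color),
# transferred uncompressed every frame.
screens = 5
width, height = 2560, 1440
bytes_per_pixel = 4
refresh_hz = 60

bytes_per_second = screens * width * height * bytes_per_pixel * refresh_hz
print(f"{bytes_per_second / 1e9:.2f} GB/s")  # ~4.42 GB/s
```

This lands just above the "over 4 GB/s" figure quoted, before any protocol overhead.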
     
  4. fellix

    fellix Hey, You!
    Veteran

    Joined:
    Dec 4, 2004
    Messages:
    3,505
    Likes Received:
    424
    Location:
    Varna, Bulgaria
    That's true for AFR modes. But what if they use some more advanced methods, like what Lucid Hydra is doing with multi-GPU load balancing?
     
  5. Wynix

    Veteran Regular

    Joined:
    Feb 23, 2013
    Messages:
    1,052
    Likes Received:
    57
    It's worth pointing out that Intel will not introduce PCIe 4.0 until 2015 at the earliest.

     
  6. Dave Baumann

    Dave Baumann Gamerscore Wh...
    Moderator Legend

    Joined:
    Jan 29, 2002
    Messages:
    14,081
    Likes Received:
    651
    Location:
    O Canada!
Those scenarios are already using the PCIe bus, but in a relatively inefficient and unmanaged manner.

Secondly, take a look at PCIe version / lane-width scaling tests. Standard performance is not greatly affected by PCIe bandwidth (i.e. the traffic is fairly low); when it is, it's because the texture / buffer sizes have overflowed the framebuffer, you start addressing from system RAM, and the disparity between local bandwidth and PCIe bandwidth becomes rather catastrophic to game performance.
     
  7. AnarchX

    Veteran

    Joined:
    Apr 19, 2007
    Messages:
    1,559
    Likes Received:
    34
  8. flopper

    Newcomer

    Joined:
    Nov 10, 2006
    Messages:
    150
    Likes Received:
    6
What's the difference between PCIe 2.0 and 3.0?
Since that'll be the main question a user asks when running these cards in CrossFire.
Does 2.0 work as well as 3.0?
     
  9. BRiT

    BRiT Verified (╯°□°)╯
    Moderator Legend Alpha

    Joined:
    Feb 7, 2002
    Messages:
    14,894
    Likes Received:
    13,028
    Location:
    Cleveland
Already posted above, look at the tables. It's 32 GB/s vs 16 GB/s. Well beyond what's needed for this.

*EDIT: Seems like the chart is giving total bidirectional bandwidth. Single-direction bandwidth would be half of that, so 16 GB/s vs 8 GB/s. And those are theoretical limits; the practical/actual limits might be lower -- perhaps too close for my liking to tell whether there's a limitation or not.
     
    #909 BRiT, Sep 29, 2013
    Last edited by a moderator: Sep 29, 2013
  10. OpenGL guy

    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    2,357
    Likes Received:
    28
    I've never seen anything beyond about 12.8 GB/s for a single directional transfer on PCIe gen 3.0 and 16 lanes.
     
  11. flopper

    Newcomer

    Joined:
    Nov 10, 2006
    Messages:
    150
    Likes Received:
    6
    ok so no performance hit with pci-e 2.0 then thx.
     
  12. BRiT

    BRiT Verified (╯°□°)╯
    Moderator Legend Alpha

    Joined:
    Feb 7, 2002
    Messages:
    14,894
    Likes Received:
    13,028
    Location:
    Cleveland
Ah, that chart is a bit misleading(?). In the total bandwidth column I think they're counting bidirectional bandwidth. So for PCIe 3 vs PCIe 2 it would be 16 GB/s vs 8 GB/s for a 16-lane setup in one direction.

When you take into account the theoretical vs. actual metrics, that makes this much closer than I'd like for this usage. Thanks for providing that data point, OpenGL guy.

     
  13. OpenGL guy

    Veteran

    Joined:
    Feb 6, 2002
    Messages:
    2,357
    Likes Received:
    28
    What was the source for the chart? I ask because the PCIe gen 2 had a lot of command overhead (like 20% of the bandwidth or something) and that was reduced with PCIe gen 3. In fact, I believe a significant part of the performance gain for PCIe gen 3 was the improvements in reducing command overhead. (PCIe gen 2 command rate is 5 GT/s vs. 8 GT/s for PCIe gen 3, so the rate did not double on gen 3 vs. gen 2.)

    To put this in perspective, the peak bandwidth I recall seeing on PCIe gen 2 was around 6.2 GB/s for a single directional transfer, compared to 12.8 GB/s on PCIe gen 3. We have done some testing with bidirectional transfers, too, achieving around 20 GB/s. That result wasn't confirmed on other platforms, however, as it was mainly proof-of-concept.

    According to the table, PCIe gen 2 peaked at 8 GB/s for single direction (which matches my recollection) and if you take away what I recall the command overhead to be, you end up with around 6.4 GB/s, close to what I was seeing. Gen 3 peaks at 16 GB/s for single direction, yet we (I tested AMD and Nvidia GPUs) only achieve around 12.8 GB/s, so it's possible some gen 3 tuning is needed.
     
  14. Thowllly

    Regular

    Joined:
    Feb 6, 2002
    Messages:
    551
    Likes Received:
    4
    Location:
    Norway
PCIe 1 & 2 used 8b/10b encoding, so 5 GT/s gave 4 Gb/s per lane; I think that's where the 20% overhead number you've heard comes from, so the 8 GB/s bandwidth figure already takes that overhead into account. PCIe 3 uses 128b/130b encoding, with 8 GT/s giving ~7.9 Gb/s per lane, a little short of a doubling over gen 2. AFAIK there was no reduction in packet overhead from gen 2 to gen 3, so if you see more than a doubling in bandwidth from gen 2 to gen 3, it is not from anything inherent in the gen 3 spec.
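The encoding arithmetic here can be worked through explicitly (headline transfer rates only, ignoring packet and protocol overhead):

```python
# Per-lane payload bandwidth under the two line encodings.
def lane_bandwidth_gbps(transfer_rate_gt, payload_bits, total_bits):
    """Payload Gb/s per lane for a given GT/s rate and encoding ratio."""
    return transfer_rate_gt * payload_bits / total_bits

gen2 = lane_bandwidth_gbps(5, 8, 10)     # 8b/10b   -> 4.0 Gb/s per lane
gen3 = lane_bandwidth_gbps(8, 128, 130)  # 128b/130b -> ~7.88 Gb/s per lane

# x16 link, converted to GB/s (8 bits per byte)
print(f"gen2 x16: {gen2 * 16 / 8:.2f} GB/s")  # 8.00 GB/s
print(f"gen3 x16: {gen3 * 16 / 8:.2f} GB/s")  # ~15.75 GB/s
print(f"ratio: {gen3 / gen2:.3f}")            # ~1.969, a little short of 2x
```

So the 20% figure is exactly the 8b/10b encoding cost, and gen 3's gain comes from the denser encoding rather than a faster-than-doubled symbol rate.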
     
  15. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,380
The slide says "direct access to GPU display pipes."

Could this mean that PCIe writes don't go to DRAM and then get scanned out by the memory controller to be sent to the display unit, but that those writes bypass memory altogether and go straight from PCIe to the display unit?
     
  16. Sinistar

    Sinistar I LIVE
    Regular Subscriber

    Joined:
    Aug 11, 2004
    Messages:
    660
    Likes Received:
    74
    Location:
    Indiana
    And what exactly is the compositing block, separate memory for the frame buffer?
     
  17. Albuquerque

    Albuquerque Red-headed step child
    Veteran

    Joined:
    Jun 17, 2004
    Messages:
    3,856
    Likes Received:
    345
    Location:
    35.1415,-90.056
A few of you need to remember that PCIe is a point-to-point topology; there isn't any issue with "flooding the bus".
     
  18. silent_guy

    Veteran Subscriber

    Joined:
    Mar 7, 2006
    Messages:
    3,754
    Likes Received:
    1,380
    Maybe something that merges different video streams together?

    E.g. If you do video decoding on just 1 GPU but render images on both, then you need to merge them somewhere?
     
  19. redstrat

    Newcomer

    Joined:
    Sep 30, 2013
    Messages:
    21
    Likes Received:
    0
  20. aaronspink

    Veteran

    Joined:
    Jun 20, 2003
    Messages:
    2,641
    Likes Received:
    64
PCIe 3.0 runs at 8 GT/s, and on a x16 link that provides 16 GB/s of bandwidth. Repeated testing shows little to no difference between PCIe 3.0 and 2.0 in GPU performance. We have measured performance data across a wide range of PCIe speeds and widths, with all the data pointing to no measurable difference between PCIe 3.0 at x4 and PCIe 3.0 at x16. These same measurements have also been run in CF comparing PCIe 3.0 at x4/x8/x16, with no measurable effect.

It's also worth pointing out that the bridge interfaces have generally horrible bandwidth compared to the full-width PCIe interfaces. The bridge interfaces tend to be x1 links and will run out of bandwidth well before the full PCIe interfaces will.

5x 4K displays will require ~10 GB/s, and PCIe 3.0 x16 can certainly support that level of bandwidth along with command-stream data from the CPU (about 3+ GB/s worth of it).

In other words, this is actually a way to provide more bandwidth to CF so it won't be bottlenecked by the bridge link in the future.
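A quick check of the ~10 GB/s figure against the theoretical PCIe 3.0 x16 limit (assuming 3840x2160 per display at 60 Hz, 4 bytes per pixel, uncompressed scan-out, and no protocol overhead):

```python
# Theoretical one-direction bandwidth of a PCIe 3.0 x16 link:
# 8 GT/s * 128/130 encoding * 16 lanes / 8 bits per byte.
pcie3_x16 = 8 * (128 / 130) * 16 / 8       # ~15.75 GB/s

# Five 4K displays scanned out at 60 Hz, 32-bit pixels, uncompressed.
scanout = 5 * 3840 * 2160 * 4 * 60 / 1e9   # ~9.95 GB/s

headroom = pcie3_x16 - scanout             # ~5.8 GB/s left over
print(f"scanout: {scanout:.2f} GB/s, headroom: {headroom:.2f} GB/s")
```

That headroom is comfortably more than the ~3 GB/s of command-stream traffic mentioned above, consistent with the claim that a x16 link could carry both.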
     

  • About Us

    Beyond3D has been around for over a decade and prides itself on being the best place on the web for in-depth, technically-driven discussion and analysis of 3D graphics hardware. If you love pixels and transistors, you've come to the right place!

    Beyond3D is proudly published by GPU Tools Ltd.