Effects of PCI Express 2.0

In the coming months, Intel, NVIDIA and AMD are expected to release new chipsets with PCI-E 2.0 support, and by the end of the year we can expect new GPUs that support it. Version 2.0 will double the bandwidth between CPU and GPU. All of this has been covered in numerous articles on hardware and tech sites over the past half year.

What I haven't been able to find anywhere is how much of a difference this is going to make for games and gamers. Will it have an effect on the number of draw calls per frame? On the speed at which, for example, textures can be loaded into GPU memory? On the speed at which a game runs in general (more frames per second)?
Is this a development that programmers should all be excited about, or is it just evolution for the sake of progress and not really interesting otherwise?

I'd really like to hear your thoughts on this.

Mart
 
The knock-on effect of a larger available bandwidth over the GPU-to-host bus is potentially all of those things you mention, to varying degrees.

Current consumer graphics platforms (from the host to the graphics card to the driver) are all optimised to use the available bandwidth as well as possible in one direction: host to GPU. Traditionally, peak readback bandwidth has been hard to realise, for whatever reason.

So while 2.0 might offer performance improvements in the host-to-GPU case because of the extra bandwidth, the biggest thing for games programmers to get excited about is probably more usable levels of bandwidth for GPU-to-host transfers, for expressing problems that require GPU-processed data to be pushed back to the host before being sent to the GPU again.
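To make that concrete, here's a minimal sketch of an asynchronous readback through an OpenGL pixel buffer object, which is roughly how you'd try to use that GPU-to-host bandwidth today. It assumes an OpenGL 2.1 context with GLEW; the dimensions and the process() hook are placeholders of mine, not anything specific:

// Sketch: asynchronous GPU-to-host readback via a pixel buffer object.
// WIDTH/HEIGHT and process() are placeholders; pbo is created once
// elsewhere with glGenBuffers(1, &pbo).
#include <GL/glew.h>

const int WIDTH = 1024, HEIGHT = 768;
GLuint pbo;

void read_frame()
{
    glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo);
    // Allocate (or orphan) storage sized for one RGBA8 frame; STREAM_READ
    // hints that the data is headed back to the host.
    glBufferData(GL_PIXEL_PACK_BUFFER, WIDTH * HEIGHT * 4, 0, GL_STREAM_READ);
    // With a pack PBO bound, glReadPixels returns immediately and the copy
    // can run over the bus asynchronously (driver permitting).
    glReadPixels(0, 0, WIDTH, HEIGHT, GL_BGRA, GL_UNSIGNED_BYTE, 0);

    // ... do other CPU work here to hide the transfer ...

    // Mapping blocks until the transfer has actually finished.
    if (void* data = glMapBuffer(GL_PIXEL_PACK_BUFFER, GL_READ_ONLY)) {
        // process(data);  // hand the pixels to host-side code
        glUnmapBuffer(GL_PIXEL_PACK_BUFFER);
    }
    glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);
}

The interesting window is the CPU work between glReadPixels and glMapBuffer; how much of the bus's readback capacity you actually see there is exactly what has been hard to realise so far.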

As for hugely noticeable improvements in games performance, I wouldn't hold your breath. The transparent benefits should be slim.
 
Rys gave a much shorter and simpler reply than I'm doing here, but he hadn't posted his when I started writing, so meh! :) At least they're complementary...
---
I'd describe it as necessary, but not interesting. The 1.0 spec was designed with the performance of GPUs such as NV40 and R420 in mind, which are easily several times slower than G80 and R600. As such, relative to the performance of the chips at the time of release, PCI Express 2.0 will actually be slower, not faster... and that's arguably what matters in terms of what you can do with it. Video memory has also at least doubled in the same timeframe.

That's mostly important for the high end. However, PCI Express 2.0 also matters at the entry level, because of technologies such as TurboCache and HyperMemory. The idea is this: the GPU's available bandwidth becomes the sum of the video and system memory bandwidths, assuming the hardware and the driver are smart enough.

Back in 2004, CPUs were still using DDR1 on a 128-bit bus, so the PCI Express 1.0 link wasn't too much of a bottleneck. Now we're using DDR2, so there could be a fair bit more bandwidth to harvest there. In fact, these entry-level GPUs are using DDR2 too, but on 64-bit buses. So the combined CPU+GPU bandwidth could be as high as 3 times the GPU-only bandwidth.

In practice, however, the CPU still needs some bandwidth of its own (although not that much, really, for an entry-level system that plays games at 25FPS!) and PCI Express 2.0 is still only 5GT/s per lane in each direction. At the very least, though, this should roughly double the available bandwidth compared to not using system memory at all, and increase it by 30-40% compared to doing the same over PCI Express 1.0.
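To make the arithmetic concrete, here's a back-of-envelope sketch. The figures are my assumptions for a typical 2007 entry-level setup (DDR2-800 on a 64-bit bus locally, dual-channel DDR2-800 on the host, a PCIe 2.0 x16 link), not numbers from the posts above:

// Back-of-envelope numbers for the combined-bandwidth idea.
// All figures are assumptions: DDR2-800 on a 64-bit bus locally,
// dual-channel DDR2-800 on the host, a PCIe 2.0 x16 link.
#include <cstdio>

int main()
{
    // GB/s = transfer rate (MT/s) * bus width (bytes) / 1000
    double gpu_local = 800.0 * (64 / 8) / 1000.0;    // 6.4 GB/s
    double system    = 800.0 * (128 / 8) / 1000.0;   // 12.8 GB/s

    // PCIe 2.0: 5 GT/s per lane with 8b/10b encoding = 0.5 GB/s per
    // lane per direction, so 8 GB/s across sixteen lanes.
    double link = 5.0 * (8.0 / 10.0) / 8.0 * 16.0;   // 8.0 GB/s

    // The raw sum gives the "3x" ceiling; the link caps what the GPU
    // can actually borrow from system memory.
    double borrowed = link < system ? link : system;
    printf("ceiling: %.2fx local, reachable: %.2fx local\n",
           (gpu_local + system) / gpu_local,
           (gpu_local + borrowed) / gpu_local);
    return 0;
}

With those numbers the raw sum is 3x the local bandwidth, but the link caps the reachable figure at about 2.25x, which is why the doubling above comes with caveats.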

HT3 will benefit AM2+ IGPs for the same reasons, as the HT1 link can already be a bottleneck (especially on RS690, iirc, since it's higher performance than MCP68).
 
Crossfire performs substantially better on a 975X motherboard than it does on most P35 boards, but initial indications are that X38 shows very little Crossfire performance boost over 975X.

That suggests to me that moving from x4 to x8 PCIe makes a significant difference, but that moving from x8 to x16 doesn't. In other words, at the moment, x8 bandwidth is enough (just); beyond that, something else is the bottleneck. I therefore regard it as unlikely that moving from x16 PCIe 1.0 to x16 PCIe 2.0 (another doubling of speed) will make much difference - yet.
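For reference, assuming the nominal signalling rates and 8b/10b encoding: PCIe 1.0 carries 2.5 GT/s per lane, i.e. roughly 250 MB/s per lane per direction, so x4 is about 1 GB/s, x8 about 2 GB/s and x16 about 4 GB/s; PCIe 2.0 doubles each of those.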

However, it is unquestionable that eventually x16 PCIe 1.0 bandwidth will become insufficient, and it is absolutely right that PCIe 2.0 should be introduced comfortably before the bus becomes a bottleneck rather than after; otherwise, once we reached that point, there would be no incentive for ATI and Nvidia to make faster chips any more.
 
Crossfire performs substantially better on a 975X motherboard than it does on most P35 boards, but initial indications are that X38 shows very little Crossfire performance boost over 975X.

That suggests to me that moving from x4 to x8 PCIe makes a significant difference, but that moving from x8 to x16 doesn't....

I believe this depends on the card and the application used for the testing.
I know Tom's Hardware is probably the second-to-last site you should trust, but at least according to their PCI Express scaling test, the 8800GTS is extremely sensitive to bandwidth in some applications, while the X1900XTX they tested wasn't anywhere near as sensitive.

As an example, Call of Duty 2 @ 1600x1200, one of the tests where the 8800 is really sensitive:
GF8800GTS:
x16 62.6 FPS
x8 52.1 FPS (~ -17%)
x4 36.5 FPS (~ -42%)
x1 15.9 FPS (~ -75%)

X1900XTX:
x16 56.1 FPS
x8 55.8 FPS (~ -0.5%)
x4 54.7 FPS (~ -2.5%)
x1 43.4 FPS (~ -23%)
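(Those percentages are each card's result relative to its own x16 score, e.g. 52.1 / 62.6 - 1 ≈ -17%.)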

edit:

What I'm wondering, though, is what's up with 32-lane PCI Express? It's in the specs of both 1.x and 2.0, yet I've never seen a slot or a card for it anywhere.
 
Crossfire performs substantially better on a 975X motherboard than it does on most P35 boards, but initial indications are that X38 shows very little Crossfire performance boost over 975X.

That suggests to me that moving from x4 to x8 PCIe makes a significant difference, but that moving from x8 to x16 doesn't....
Unfortunately, comparing performance like that across differing chipsets doesn't necessarily tell the full story. There are other differences that can affect Crossfire performance beyond pure bandwidth, such as whether the chipset supports peer-to-peer transfers.
 
What I'm wondering, though, is what's up with 32-lane PCI Express? It's in the specs of both 1.x and 2.0, yet I've never seen a slot or a card for it anywhere.

ISTR that it's defined electrically in the spec, but there's no 32-lane connector defined (it would be far too large). The point of x32 would be connecting chips on-board, for example using x32 PCIe between the north and south bridge instead of HyperTransport or whatever Intel uses nowadays. I don't think it's ever been used, but it's fairly straightforward and there if anyone wants it.
 
Readback performance is very good even with PCI-E 1.0.

I don't believe modern games are using it to its full extent; there are lots of different interesting new techniques (like texture virtualisation) that will use PCI-E bandwidth intensively. Still, as cards become even faster, a faster standard is indeed more than welcome. But I don't believe it will greatly affect the speed of games when it's introduced.
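As a rough illustration of the host-to-GPU side of that, here's a minimal sketch of streaming texture tiles through an OpenGL pixel unpack buffer, the kind of traffic texture virtualisation would generate. The tile size and handles are made up, and it assumes an OpenGL 2.1 context with GLEW:

// Sketch: streaming a texture tile to the GPU through an unpack PBO.
// TILE is made up; pbo and tex are created once elsewhere with
// glGenBuffers / glGenTextures.
#include <GL/glew.h>
#include <cstring>

const int TILE = 256;  // tile edge in texels, RGBA8
GLuint pbo, tex;

void upload_tile(const unsigned char* pixels, int x, int y)
{
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo);
    // Orphan the previous storage so we never stall on a buffer the GPU
    // is still reading from.
    glBufferData(GL_PIXEL_UNPACK_BUFFER, TILE * TILE * 4, 0, GL_STREAM_DRAW);
    if (void* dst = glMapBuffer(GL_PIXEL_UNPACK_BUFFER, GL_WRITE_ONLY)) {
        memcpy(dst, pixels, TILE * TILE * 4);
        glUnmapBuffer(GL_PIXEL_UNPACK_BUFFER);
        glBindTexture(GL_TEXTURE_2D, tex);
        // With an unpack PBO bound, the last argument is an offset into
        // the buffer, and the host-to-GPU copy can overlap other work.
        glTexSubImage2D(GL_TEXTURE_2D, 0, x, y, TILE, TILE,
                        GL_BGRA, GL_UNSIGNED_BYTE, 0);
    }
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);
}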
 