Predict: The Next Generation Console Tech

Just imagine Gran Turismo with a 32-SPU Cell and XDR2. *drools*

Yeah, the graphics next gen are just going to be ridiculous, especially in certain genres that seem kind of tapped out already. I mean, I look at Street Fighter 4 and it's kind of hard to imagine how they're going to improve that style of game, it already looks so good, but I'm sure they'll figure out a way :p Another example is car games, where I guess they'll go even more photorealistic.

For FPS and other genres it's easy to see where they'll improve, as there's lots of weakness there.

All I know is I can't wait to see the mind-blowing amazingness even a modestly specced next-gen console will bring (even 2GB/800 AMD SP would be a huge leap).

I guess that's kind of OT.

But I don't think Cell is going to be in PS4 if the "Vita-like, easy to program" source was true.
 
Do you have any estimate of how much memory bandwidth is spent reading textures vs. reading/writing framebuffers vs. reading/writing geometry?

I have absolutely no hard numbers on that, but the very crude estimate I've seen thrown around is that half your B/W goes to textures, and most of the rest goes to the frame buffer. How big a portion geometry takes depends a lot on your data set -- with lots of very small triangles, it's entirely possible that your geometry consumes more bandwidth than your textures, even if this is not typical. So with intelligent streaming of your geometry and enough eDRAM to fit your frame buffer, you should be able to double your effective B/W.
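To put rough numbers on that claim (the traffic split below is just the crude estimate from this post, and the bus figure is a made-up example, not a prediction), a quick Python sketch:

[code]
# Back-of-the-envelope: what moving the framebuffer into eDRAM does to
# effective external bandwidth. Numbers are illustrative, not measured.
external_bw_gbs = 25.6            # hypothetical 128-bit GDDR3-class bus

traffic_share = {
    "textures":    0.50,          # ~half of the traffic
    "framebuffer": 0.40,          # most of the rest
    "geometry":    0.10,          # varies a lot with triangle size
}

# With enough eDRAM to hold the whole framebuffer, that share of traffic
# never touches the external bus, so everything else effectively sees a
# proportionally wider pipe.
effective_with_edram = external_bw_gbs / (1.0 - traffic_share["framebuffer"])

print(f"without eDRAM: {external_bw_gbs:.1f} GB/s of external B/W for everything")
print(f"with eDRAM:    {effective_with_edram:.1f} GB/s effective")  # ~42.7 GB/s
# If framebuffer traffic is closer to half (and geometry is streamed
# intelligently), the effective B/W does indeed roughly double.
[/code]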


Batching w/ texture-packing is also employed to increase the potential for this. Also MegaTexture is a perfect example of an entire system built around the idea.

If in your game you have a single texture sampled on a single draw call once per frame (that's not rendering your skybox) then you're probably doing it wrong...
I was talking about individual texels inside your textures. MegaTexture is a good example of this -- as everything gets its own unique texture, each and every pixel on the screen typically samples a different texel from the data set. So no matter how you cache, you cannot avoid needing to read in a whole lot of data every frame.

Note that while there is no temporal locality of data in textures, there is plenty of spatial locality -- that is, if you read in a texel from some certain mipmap level, it's almost a certainty that the adjacent texel will also be used (by a neighbouring pixel). This is why there are texture caches, and why MegaTexture is interesting -- it lays out data so that it can be streamed straight to the screen as efficiently as possible.
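As a generic illustration of how that spatial locality gets exploited (Z-order/Morton tiling is just one common scheme -- this is not a claim about how MegaTexture or any particular GPU actually lays out its data):

[code]
# Minimal sketch of Z-order (Morton) texel addressing: texels that are close
# in (x, y) end up close in memory, so a small texture cache line covers a
# compact 2D block rather than a long strip of a single row.

def part1by1(n: int) -> int:
    """Spread the low 16 bits of n so there is a zero bit between each."""
    n &= 0x0000FFFF
    n = (n | (n << 8)) & 0x00FF00FF
    n = (n | (n << 4)) & 0x0F0F0F0F
    n = (n | (n << 2)) & 0x33333333
    n = (n | (n << 1)) & 0x55555555
    return n

def morton(x: int, y: int) -> int:
    """Interleave the bits of x and y into a single Z-order index."""
    return part1by1(x) | (part1by1(y) << 1)

for y in range(2):
    for x in range(4):
        print((x, y), "->", morton(x, y))
# (0,0)->0 (1,0)->1 (2,0)->4 (3,0)->5
# (0,1)->2 (1,1)->3 (2,1)->6 (3,1)->7
[/code]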
Smaller textures make matters worse & can in many ways hurt performance drastically as you reduce batch-efficiency.
Lots of small individual textures do make things worse. But if you use just a couple of small enough (tiled) textures, they can fit into caches and you suddenly have lots of temporal locality. On that kind of data, you can survive with almost no B/W, and this is how many old 3D engines survived in the age of no bandwidth. This has more or less been abandoned in the modern world, mostly because it looks like crap.
 
I think the cost (and inability to reduce cost over the console's life) of a 192-bit/256-bit bus will be worth the expense.

I mean, we discuss all the time that current top-range GPUs will be midrange by the time the next gen arrives, and current top-end GPUs feature 256-bit+ buses.

I don't know if I've ever seen a cost analysis of bus size. It'd be interesting to compare to some of the eDRAM predictions in this thread.

I think providing a SKU without an HDD next gen would be smart. You shift cost off your books and give the consumer a choice, which are generally good things. I have replaced one of my PS3 HDDs with a 320GB 7200 RPM drive because I'm a techie and notice the difference. I doubt many of my peers even know they have the option. I would spring for an SSD next gen as I am currently pricing one for my PC.
 
I think the cost (and inability to reduce cost over the console's life) of a 192-bit/256-bit bus will be worth the expense.

I'm not sure at all about that. Remember that putting the framebuffer in eDRAM gets rid of half of your B/W needs -- this worked so well for the Xbox 360 that I would be genuinely surprised if there was a console next gen that *didn't* do that. So a 128-bit bus on a console ~= 256-bit on a discrete GPU.

I mean, we discuss all the time that current top-range GPUs will be midrange by the time the next gen arrives, and current top-end GPUs feature 256-bit+ buses.
Top-end buses were 256-bit+ before last gen came out. In fact, top-end GPU buses have been 256-bit+ since 2002. Bus widths don't tend to increase -- the costs caused by a wide bus are as high today as they were back then. You can get a reasonable approximation of the likely bus width at a price point simply by looking at last year's products at that price point. A 256-bit GDDR5 bus would need 8 chips, for the lifetime of the device. There is no #%&ing way I can see any of the manufacturers accepting that.
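To make the chip-count point concrete (the per-pin data rate here is a hypothetical example, not a prediction):

[code]
# GDDR5 devices have a 32-bit interface, so the bus width fixes the minimum
# chip count for the life of the console. 4 Gbps/pin is an example figure.
def gddr5_config(bus_bits: int, data_rate_gbps: float = 4.0):
    chips = bus_bits // 32                       # one x32 device per channel
    bandwidth_gbs = bus_bits * data_rate_gbps / 8.0
    return chips, bandwidth_gbs

for bus in (128, 192, 256):
    chips, bw = gddr5_config(bus)
    print(f"{bus:3d}-bit bus: at least {chips} chips, {bw:.0f} GB/s at 4 Gbps/pin")
# 128-bit: 4 chips,  64 GB/s
# 192-bit: 6 chips,  96 GB/s
# 256-bit: 8 chips, 128 GB/s
[/code]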

I don't know if I've ever seen a cost analysis of bus size. It'd be interesting to compare to some of the eDRAM predictions in this thread.

It's hard to give any, because the costs are split across at least 3 places. The memory controller on the GPU needs to be larger (and since you need to drive a signal off-chip, die shrinks don't make memory controllers any smaller). You need more signal lines on the PCB, and thanks to how finicky high-speed memory interconnects are, doubling the lines way more than doubles the cost. And finally, you need more actual memory chips, which sucks, because while the price per GB of memory is probably the fastest-dropping cost you have, the minimum price per chip is more or less constant.

All these costs are significant, and none of them scale down with time. All consoles will be bandwidth-starved, simply because adding more processing power to better utilize the bandwidth you have is just cheaper than adding more bandwidth.

My best bet for a new console on the "before new tech" timeframe would have a unified 128-bit GDDR5 bus, with enough eDRAM on the chip to always fit the entire framebuffer. XDR2 is the dark horse -- on paper it would be very nice, but as I understand it, it doesn't have much manufacturer interest.
 
I doubt XDR2 will be used. Rambus has been working on the Terabyte Initiative for many years now, and its signaling rate is 32x versus XDR2's 16x.


Xbox 720
PowerPC cores and the latest ATI GPU design on a single chip (SoC).
Rambus Terabyte Initiative memory interface.
Optical drive that reads General Electric hologram-based discs.
 
I think providing a SKU without an HDD next gen would be smart. You shift cost off your books and give the consumer a choice, which are generally good things. I have replaced one of my PS3 HDDs with a 320GB 7200 RPM drive because I'm a techie and notice the difference. I doubt many of my peers even know they have the option. I would spring for an SSD next gen as I am currently pricing one for my PC.

I think it's rather the other way round? The optical drive should be the optional one, unless we find a new medium that is fast, re-writeable and cheap, so that the two could become a hybrid (kind of like the media for Vita, but that isn't that fast).
 
I doubt XDR2 will be used. Rambus has been working on the Terabyte Initiative for many years now, and its signaling rate is 32x versus XDR2's 16x.


Xbox 720
PowerPC cores and the latest ATI GPU design on a single chip (SoC).
Rambus Terabyte Initiative memory interface.
Optical drive that reads General Electric hologram-based discs.
I tried to find information about this Terabyte Initiative; I found that it's a memory controller that supports today's industry standards (DDR3 and GDDR5), BUT I don't get the benefits :LOL:
Does that mean that you could hook more memory chips up to a memory controller of the same width (so increasing bandwidth)?
 
I tried to find information about this Terabyte Initiative; I found that it's a memory controller that supports today's industry standards (DDR3 and GDDR5), BUT I don't get the benefits :LOL:
The Terabyte Initiative was a pursuit of a range of technologies to enable up to 1 TB/s memory bandwidth. You require DRAM chips built around their architecture (XDR2) to enable it, and can't just plug standard DDR into a new memory system. If it works as per their claims, 500 GB/s unified RAM should be usable/affordable in a console. I don't know of any real-world implementation of XDR2 to prove it though! This is what I'm hoping for, and there's a reasonable chance of Sony using XDR2 in PS4. 4 GB of 500 GB/s RAM would be excellent for developers. At 1080p, bandwidth shouldn't be a huge bottleneck, while the system architecture would be very flexible (no constraints imposed via eDRAM or split memory). It'd need a customised GPU with FlexIO though.

Rambus's DDR3 improvements are something new to me. I guess Rambus feel the need to branch out into technologies that are being used rather than just offering radical new technologies that don't get adopted. Their DDR3 tech is offering a doubling of DDR3 performance, which is well below their XDR2 performance claims. It'd be a nice bonus to any system using DDR3, but isn't a tech in itself that'll support a console.
 
The Terabyte Initiative was a pursuit of a range of technologies to enable up to 1 TB/s memory bandwidth. You require DRAM chips built around their architecture (XDR2) to enable it, and can't just plug standard DDR into a new memory system. If it works as per their claims, 500 GB/s unified RAM should be usable/affordable in a console. I don't know of any real-world implementation of XDR2 to prove it though! This is what I'm hoping for, and there's a reasonable chance of Sony using XDR2 in PS4. 4 GB of 500 GB/s RAM would be excellent for developers. At 1080p, bandwidth shouldn't be a huge bottleneck, while the system architecture would be very flexible (no constraints imposed via eDRAM or split memory). It'd need a customised GPU with FlexIO though.

Rambus's DDR3 improvements are something new to me. I guess Rambus feel the need to branch out into technologies that are being used rather than just offering radical new technologies that don't get adopted. Their DDR3 tech is offering a doubling of DDR3 performance, which is well below their XDR2 performance claims. It'd be a nice bonus to any system using DDR3, but isn't a tech in itself that'll support a console.
From what I've read on the web (as Rambus's own site is a mess, or my work connection is problematic), the "Terabyte Initiative" is not only about the memory but also the memory controller, and how to clean up the "electrical signal". This part of the tech should work with various existing RAM types. Here's a video:
http://www.youtube.com/watch?v=5y7loaBA2t4

They give some figures at some point, but in a format I don't know how to deal with, so I don't know how this compares to DDR3 or GDDR5 clocked at the same speed.

If your response is that they can "double" the bandwidth using the same memory chips as everybody else, then that's great news even if it's not XDR2 figures, but I'm not sure I'm getting either the tech or your post correctly :|
 
The Terabyte Initiative was a pursuit of a range of technologies to enable up to 1 TB/s memory bandwidth. You require DRAM chips built around their architecture (XDR2) to enable it

Does this mention bus width?

XDR2 uses a 16x data rate as well as a higher base clock than XDR. But I can't see it delivering more than 4x XDR per pin.

With a 64-bit differential data bus (same number of routable lanes as 128-bit GDDR5) you end up with 800 MHz x 16 bits/cycle/pin x 64 lanes = 102 GB/s. Adequate, but not revolutionary.
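Quick sanity check of that figure (all numbers are from the post above):

[code]
# 800 MHz base clock, 16 transfers per clock per pin, 64 differential lanes.
base_clock_hz = 800e6
bits_per_clock_per_pin = 16
lanes = 64

bw_gbs = base_clock_hz * bits_per_clock_per_pin * lanes / 8 / 1e9
print(f"{bw_gbs:.1f} GB/s")   # -> 102.4 GB/s
[/code]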

I'm predicting every console vendor is going to have a chunk of eDRAM or DRAM die stacked on the GPU next gen to alleviate bandwidth constraints.

Cheers
 
From what I've read on the web (as Rambus's own site is a mess, or my work connection is problematic), the "Terabyte Initiative" is not only about the memory but also the memory controller, and how to clean up the "electrical signal".
Yes. It's a range of techs.

This part of the tech should work with various existing RAM types. Here's a video:
http://www.youtube.com/watch?v=5y7loaBA2t4

If your response is that they can "double" the bandwidth using the same memory chips as everybody else, then that's great news even if it's not XDR2 figures, but I'm not sure I'm getting either the tech or your post correctly :|
That's the idea. Clean up the signal and you can then resolve more information, enabling twice the data per clock as I understand it. This is working on slowish RAM though, so it's not going to be much benefit to next-gen consoles. The base RAM used next gen will have to be faster stuff, like GDDR5, which is already loaded with signalling tech, meaning less room for improvement.
 
Does this mention bus width?

XDR2 uses a 16x data rate as well as a higher base clock than XDR. But I can't see it delivering more than 4x XDR per pin.
32x data rate.

With a 64-bit differential data bus (same number of routable lanes as 128-bit GDDR5) you end up with 800 MHz x 16 bits/cycle/pin x 64 lanes = 102 GB/s. Adequate, but not revolutionary.

From Rambus's marketing gumph:
Initial devices are capable of 9.6Gbps data rates providing up to 38.4GB/s of bandwidth from a single 4-byte-wide device. The roadmap for XDR 2 extends to 12.8Gbps data rates providing 51.2GB/s of bandwidth per device.
Their graph shows 12 Gbps data rates in 2011. Coupled with a supposedly simpler system architecture, I'm guessing they see more devices as possible, for higher BW. Still, 128 bits would be four 4-byte devices, which at ~51 GB/s each is ~200 GB/s. I don't know how the simplified communications platform would help support a 256-bit bus at 400 GB/s. And I don't know what their licensing fees are, which might make all this prohibitively expensive!
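Working the quoted figures through (device width and data rates are straight from the marketing material above; the rest is arithmetic):

[code]
# A "4-byte-wide" XDR2 device has 32 data pins, so
# GB/s per device = (Gbps per pin) * 32 pins / 8 bits.
pins_per_device = 32
for data_rate_gbps in (9.6, 12.8):        # initial and roadmap rates quoted above
    per_device = data_rate_gbps * pins_per_device / 8
    system_128bit = per_device * 4        # four 4-byte devices = 128-bit interface
    print(f"{data_rate_gbps:4.1f} Gbps/pin: {per_device:.1f} GB/s per device, "
          f"{system_128bit:.1f} GB/s from four devices")
#  9.6 Gbps -> 38.4 GB/s per device, 153.6 GB/s on 128-bit
# 12.8 Gbps -> 51.2 GB/s per device, 204.8 GB/s on 128-bit
[/code]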

I'm predicting every console vendor is going to have a chunk of eDRAM or DRAM die stacked on the GPU next gen to alleviate bandwidth constraints.
What we really need are some actual costs for these options! How much would a 128-bit bus plus eDRAM cost, versus a 256-bit bus? What about factoring in XDR2's fewer pins and thus a simpler board?
 
What we really need are some actual costs for these options! How much would a 128-bit bus plus eDRAM cost, versus a 256-bit bus? What about factoring in XDR2's fewer pins and thus a simpler board?

This is what I was trying to say.

I'd also point out that it seems that even for midrange cards, Nvidia and AMD have settled on a 256-bit bus. And even some low-end cards have 192-bit (GeForce 460/560 have 256; GeForce 440/450/545 DDR3/550 have 192).
 
Definitions of "high end" and "low end" vary, but any of those "low end" cards with a 192-bit bus is much faster than the fastest integrated graphics processors - even the fastest Llano stuff - and faster than many standalone graphics cards. And consoles are stuck with their bus long after a PC graphics card has been replaced by a faster device on a 128- or 64-bit bus (or by a small area of silicon on the side of a CPU).
 
What we really need are some actual costs for these options! How much would a 128-bit bus plus eDRAM cost, versus a 256-bit bus? What about factoring in XDR2's fewer pins and thus a simpler board?

I requested something similar over in the Nintendo GPU speculation thread.
What I can contribute is that the same ATI card with 1GB GDDR5 commands a $10 premium vs. its GDDR3 counterpart. And that includes any extra layers or tighter manufacturing tolerances that may be required to run at the higher data rate. Given that the internal structure of the RAM is the same, I can't see GDDR3 making sense, as the price differential of the parts themselves is going to approach zero over time.

As far as a 256-bit interface is concerned, look at what the AMD HD6850 cards sell for these days. I've seen cheaper 256-bit cards back in the day, but it's probably best to look at current offerings, even though it is difficult to say what role market positioning plays. The price differential vs the HD6770 (which has a smaller, rebranded GPU that should yield very well by now, less substantial power circuitry and cooling, and of course a 128-bit memory bus) is a modest but still significant $40 or so. But again, given the difference in GPUs, that comparison is clearly exaggerating the price differential.
 
Isn't the bigger cost of GDDR5 not the build price, but the power use?
 
Isn't the bigger cost of GDDR5 not the build price, but the power use?
The power use seems modest enough. This PDF is material from Samsung, and a year old. It won't get worse from 2010 on out, and 4.3 W is hardly cause for concern. I've seen similar material from Elpida.

AMD seemed to have early issues adjusting voltages for GDDR5, but it would appear to be a thing of the past for some time now.
 
I tried to find information about this Terabyte Initiative; I found that it's a memory controller that supports today's industry standards (DDR3 and GDDR5), BUT I don't get the benefits :LOL:
Does that mean that you could hook more memory chips up to a memory controller of the same width (so increasing bandwidth)?

The Rambus signaling technology operates at 16gbps, and it is envisioned that a single memory controller could connect to 16 DRAMs, with each DRAM providing 4 bytes of data per cycle (1TB/s = 16gbps * 4B * 16 DRAMs). To reach the 1TB/s target, Rambus is relying on three key techniques to increase bandwidth: 32X data rates, full speed command and addressing, and a differential memory architecture.

http://www.realworldtech.com/page.cfm?ArticleID=RWT120307033606
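Checking the quoted 1 TB/s target with those numbers (16 Gbps per pin, 4-byte-wide devices, 16 DRAMs per controller):

[code]
# 4-byte-wide = 32 data pins per DRAM.
gbps_per_pin = 16
pins_per_dram = 32
num_drams = 16

per_dram_gbs = gbps_per_pin * pins_per_dram / 8   # 64 GB/s per DRAM
total_gbs = per_dram_gbs * num_drams              # 1024 GB/s ~= 1 TB/s
print(per_dram_gbs, total_gbs)
[/code]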




XB720
PowerPC cores and ATI on a single die, with a Rambus memory controller integrated.
16 Rambus DRAMs on the main board clocked at 500 MHz.


Later in the console's life cycle, reduce the number of Rambus DRAMs from 16 to 8 by clocking them at 1000 MHz.
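Taking that 500 MHz -> 1000 MHz speculation at face value (and the 32x data rate and 4-byte-wide devices from the article above), the arithmetic works out to the same total bandwidth with half the chips:

[code]
# Hypothetical numbers from the post above, not a confirmed spec.
def total_bw_gbs(num_drams, clock_mhz, data_rate_mult=32, pins=32):
    gbps_per_pin = clock_mhz * 1e6 * data_rate_mult / 1e9
    return num_drams * gbps_per_pin * pins / 8

print(total_bw_gbs(16, 500))    # 1024.0 GB/s at launch
print(total_bw_gbs(8, 1000))    # 1024.0 GB/s later in the life cycle
[/code]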
 
Unfortunately, I don't have any real numbers either.

As far as a 256-bit interface is concerned, look at what the AMD HD6850 cards sell for these days. I've seen cheaper 256-bit cards back in the day, but it's probably best to look at current offerings, even though it is difficult to say what role market positioning plays. The price differential vs the HD6770 (which has a smaller, rebranded GPU that should yield very well by now, less substantial power circuitry and cooling, and of course a 128-bit memory bus) is a modest but still significant $40 or so.

If I'm not wrong, the 6770 also has 8 memory chips, just two for each 32-bit channel. The problem with a 256-bit bus isn't so much the cost now -- assuming a unified 4GB memory pool, it would give really nice performance for its cost. The problem is that you can expect the cost of the memory subsystem to remain static for the whole life of the device, while with a 128-bit system you can expect the cost to halve in just a few short years.
 