AMD/ATI for Xbox Next?

XDR2? ...

Memory configuration is going to be a major feature next gen. With many-core CPUs, high-bandwidth GPUs, and slow optical media, *something* will need to give. If past CPUs are an indication, CPU cache will be limited, further increasing the need for fast memory to feed the CPUs.

I know devs loved the split configuration on the PS3, but maybe we will see a small pool of very, very fast memory and a large pool of slower memory. Maybe in this context a very large scratchpad makes sense (though it would also be expensive).

I guess that is the benefit of waiting until 2012. But then again if, say, the GPU continues on the path of Xenos (i.e. the memory controller is there) and eDRAM is dropped (along with that package+wiring) and that budget is incorporated into the GPU, maybe the GPU will be of significant enough size to justify a 256-bit bus. There are affordable retail GPUs out there with 256-bit buses, and if the GPU is significantly large (Xenos+eDRAM is about 330mm²) they may budget it so a full process shrink can still fit a 256-bit bus?

If they plan to use DirectCompute/OpenCL to put the GPU to work on post processing (the stuff SPEs are doing on the PS3), physics, and other math tasks, this may be a feasible approach?
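For what it's worth, a minimal sketch of the idea (purely illustrative -- PyOpenCL on a PC with a made-up tonemap kernel, nothing from any console SDK) of running a post-process pass as a plain data-parallel kernel:

```python
import numpy as np
import pyopencl as cl

# Hypothetical post-process: a crude exposure + Reinhard-style tonemap over an HDR buffer.
KERNEL_SRC = """
__kernel void tonemap(__global const float4 *src, __global float4 *dst, const float exposure)
{
    int i = get_global_id(0);
    float4 c = src[i] * exposure;
    dst[i] = c / (c + (float4)(1.0f));   // also scales alpha; fine for a sketch
}
"""

ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)
prg = cl.Program(ctx, KERNEL_SRC).build()

# Fake 720p RGBA float "framebuffer".
pixels = np.random.rand(1280 * 720, 4).astype(np.float32)
mf = cl.mem_flags
src_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=pixels)
dst_buf = cl.Buffer(ctx, mf.WRITE_ONLY, pixels.nbytes)

prg.tonemap(queue, (pixels.shape[0],), None, src_buf, dst_buf, np.float32(1.5))

out = np.empty_like(pixels)
cl.enqueue_copy(queue, out, dst_buf)   # on a console you'd keep the result on the GPU
print(out[0])
```

The kernel itself is trivial; the point is that this kind of embarrassingly parallel work is exactly what the SPEs are soaking up on the PS3 today.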
 
I know devs loved the split configuration on the PS3, but maybe we will see a small pool of very, very fast memory and a large pool of slower memory. Maybe in this context a very large scratchpad makes sense (though it would also be expensive).
If XDR hits the promises it has made, a large, single pool of fast RAM is very possible. Personally I love the idea of a few GBs of 512 GB/s RAM (1 TB/s is being touted), eliminating all the faff of managing multiple RAM pools. Couple that with a single-processor solution like Larrabee and you'd have the most straightforward system possible. Big lump of RAM, large collection of uniform cores to do what you want with. That'd be a very straightforward system to develop for, I think.
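Just to put those figures in context, some back-of-the-envelope arithmetic (my numbers, not anything Rambus has committed to) on the per-pin signalling rate a given bus width would need:

```python
# Required per-pin data rate for a target bandwidth on a given bus width.
# Targets are the 512 GB/s and 1 TB/s figures mentioned above; the bus widths are my guesses.
def required_gbps_per_pin(target_gb_per_s, bus_width_bits):
    return target_gb_per_s * 8 / bus_width_bits

for target in (512, 1024):
    for width in (128, 256, 512):
        rate = required_gbps_per_pin(target, width)
        print(f"{target:4d} GB/s on a {width:3d}-bit bus -> {rate:5.1f} Gbit/s per pin")
```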
 
XDR2? ...

Not sure how they'd feel about the licensing, considering GDDR5. But there is the experience with the memory controller to consider as well.

Memory configuration is going to be a major feature next gen. With many-core CPUs, high-bandwidth GPUs, and slow optical media, *something* will need to give. If past CPUs are an indication, CPU cache will be limited, further increasing the need for fast memory to feed the CPUs.
I'm still wondering if MS really needs to go beyond even 8/12 cores considering the PC space, cross-platform, and even the ability to utilize more, but maybe I'm being conservative. :)

I know devs loved the split configuration on the PS3, but maybe we will see a small pool of very, very fast memory and a large pool of slower memory. Maybe in this context a very large scratchpad makes sense (though it would also be expensive).
What do you mean by the small pool?

I guess that is the benefit of waiting until 2012. But then again if, say, the GPU continues on the path of Xenos (i.e. the memory controller is there) and eDRAM is dropped (along with that package+wiring) and that budget is incorporated into the GPU, maybe the GPU will be of significant enough size to justify a 256-bit bus.
FWIW, I believe the smallest PC GPU with a 256-bit bus was the RV670 at around 190mm², but they would need a definite redesign if they somehow have another bandwidth-doubling memory technology in the pipe. Again, I'm not so sure about the effort they'd be willing to put in there in the grand scheme. Maybe it's feasible, but maybe they want an easy solution too. *shrug* :)

There are affordable retail GPUs out there with 256-bit buses, and if the GPU is significantly large (Xenos+eDRAM is about 330mm²) they may budget it so a full process shrink can still fit a 256-bit bus?
Comparing the sum isn't fair, as the yields of two individual dies will differ from those of one conglomerate die. ;) It's important because yield is a big factor in chip pricing.

Things may also be a bit different on the die-size-versus-memory-pad-limit front given the MCM nature of Xenos, I would think, particularly because of the interface between the two dies.

As for cost... depends on what sort of losses you're banking on as acceptable for the next round considering the issues this generation. :p
 
So, eDRAM = good engineering, therefore Microsoft will absolutely NOT use it, because their main characteristic is being bad at engineering?

Yeah, I don't really get the logic either.

On top of that, a lot of developers praise the eDRAM for various reasons; just read about Joker's opinions on rendering alpha-blended particles, or check the Digital Foundry interview with the graphics programmer of that bike-tricking XBLA game... so what exactly are these claims based on?
 
If XDR hits the promises it has made, a large, single pool of fast RAM is very possible. Personally I love the idea of a few GBs of 512 GB/s RAM (1 TB/s is being touted), eliminating all the faff of managing multiple RAM pools. Couple that with a single-processor solution like Larrabee and you'd have the most straightforward system possible. Big lump of RAM, large collection of uniform cores to do what you want with. That'd be a very straightforward system to develop for, I think.

Until Larrabee can show, in the real world, it is more than a HUGE, but slowwwwww, GPU I am not sure we should be pining for such designs yet.

Fermi appears fairly close to the 5870 in terms of FLOPs density, so what advantages does Larrabee have over Fermi that make us want to ditch a CPU+GPU combo? Given a fixed budget, is a Larrabee system going to make it "easier to develop" the same kind of *end product on the screen* as a Fermi-style chip?

Larrabee may be easier and allow new techniques, but I am not sold it will be "easier" to get the same kind of visuals and performance of a standard GPU. And I don't think anyone wants to bank on totally unproven technology at the core of their console, especially when you can get a proven design that is quite fast out of the box and cheap.
 
I want to believe that we'll have reached a point with GDDR5 or XDR2 where embedded memory won't make a lot of sense. One of the reasons the 360 needed the eDRAM was that it was otherwise a unified memory system with a 128-bit bus. But if we're targeting 1080p, will you really need more than you can get with a unified 256-bit GDDR5 setup?
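For a rough sense of scale (the GDDR5 data rates below are my assumptions, not a roadmap), here is the plain bus arithmetic next to the 360's current 128-bit GDDR3 setup:

```python
# Bandwidth (GB/s) = bus width (bits) x per-pin data rate (Gbit/s) / 8
def bandwidth_gb_s(bus_width_bits, gbps_per_pin):
    return bus_width_bits * gbps_per_pin / 8

print(f"Xbox 360 UMA, 128-bit GDDR3 @ 1.4 Gbps : {bandwidth_gb_s(128, 1.4):6.1f} GB/s")
for rate in (4.0, 5.0, 6.0):   # assumed GDDR5 per-pin rates
    print(f"Unified 256-bit GDDR5 @ {rate} Gbps     : {bandwidth_gb_s(256, rate):6.1f} GB/s")
```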
 
Yeah, I don't really get the logic either.

On top of that, a lot of developers praise the eDRAM for various reasons; just read about Joker's opinions on rendering alpha-blended particles, or check the Digital Foundry interview with the graphics programmer of that bike-tricking XBLA game... so what exactly are these claims based on?
Yeah, I can see them making the eDRAM more flexible, but getting rid of it doesn't make sense to me unless a dev is willing to give us some reasons for MS to drop it other than saving money.

Though I will ask: considering that something like LRB supposedly will be able to do realtime raytracing (though they have only shown a demo with a current-generation game), what will ATI need to put on the chip in order to do the same thing?
 
Why are some of you thinking MS is anxious to dump the EDRAM? Were they burned by that design decision in some way? I thought it was considered one of MS's smarter decisions for this gen. Especially if Sony is likely to use it, what would make MS reluctant to use it again?
 
Until Larrabee can show, in the real world, it is more than a HUGE, but slowwwwww, GPU I am not sure we should be pining for such designs yet.
Sure. I'm not specifically advocating Larrabee. I just like the elegance of a one core-type, one RAM pool design, where it's all about programmability and software flexibility. I think a PS4 with large RAM pool, Cell derivative, would be a good enough option. Sadly the GPU side is still a weak-point, and bang-for-transistor-buck, we'll probably be stuck with a separate GPU specifically designed to churn through graphics. There's going to be a software interface between CPU and GPU still. I guess the simplest future may be dumping more onto the GPU, in essence a RAM+GPU system with a CPU just there to pick up specific functions (kinda like PPU in Cell, in a support role to the heavy lifters).
 
Though I will ask: considering that something like LRB supposedly will be able to do realtime raytracing (though they have only shown a demo with a current-generation game), what will ATI need to put on the chip in order to do the same thing?

wrt raytracing, it is a tech demo at best until it can provide some sort of benefit to developers/the end product. There are examples of older GPUs running RT, but until it is performant enough to give good IQ -- where the IQ is either better than rasterization OR the advantages (e.g. a simplified renderer) offer other benefits that justify the speed reduction -- I don't even see it being relevant next gen. RT is an implementation detail, and unless the hardware is fast enough for it to be a "game changer", the implementation will remain on the sidelines imo.
 
@Joshua Luna
It's interesting that you bring up Fermi, considering there are 2-3 years until 2012. This gen Sony basically chose a two-year-old top-of-the-line GPU; looking at the computing power of the 360's GPU, they are pretty close.
Could that mean we could expect a console released in 2012 to have a GPU comparable to Fermi?
Even though I would love to see that happen, I don't see it as a realistic expectation, as it would imply a pretty expensive console.

Despite the Fudzilla rumour I don't expect a true next generation from MS or Sony within just a few years, because I don't think interest in the current generation will fade that fast. I expect some console refresh, but not a new generational leap; I think there are cheaper ways of breathing new life into the current gen. The new motion controllers and 3D gaming are two such ways, but I expect there are more that don't require all the investment a brand new console would.
 
I just did a little search on the web about GDDR6 and it looks like it won't be available by the time the systems we're speaking of launch. What speed/bandwidth can we expect GDDR5 to reach by 2012 on, say, a 128-bit bus?
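Nobody can promise 2012 data rates, but assuming GDDR5 keeps creeping up to somewhere in the 5-7 Gbit/s per pin range (an assumption on my part), a 128-bit bus works out to:

```python
# Assumed per-pin rates only; the point is the ceiling of a narrow bus.
for gbps in (5.0, 6.0, 7.0):
    print(f"128-bit GDDR5 @ {gbps} Gbps per pin -> {128 * gbps / 8:5.1f} GB/s")
```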
 
I think a PS4 with large RAM pool, Cell derivative, would be a good enough option.

"Good enough" for what? An x86 CPU + ATI GPU is going to mop the floor with a Cell derivative in terms of end product in almost every case in a 2 year launch window.

Sadly the GPU side is still a weak-point, and bang-for-transistor-buck, we'll probably be stuck with a separate GPU specifically designed to churn through graphics.

I don't understand why this is sad--how is the GPU a "weak point" when, on "bang-for-buck", you appear to concede it is better? Isn't that what it is all about? Getting the most from the least amount of hardware.

Hence the Larrabee comment: a HUGE chip, yet slow for graphics.

I see it as: GPUs are the strong point. Proven scalability, increasing programmability, good bang-for-buck in terms of budget investment. The same cannot be said for CPUs--as important as a good CPU is. And while there are diminishing returns, having a huge vector unit (GPU) to toss tasks at that were traditionally given to lower-return CPUs seems like a valid investment, especially if you may need to render 2x the frames (3D).

There's going to be a software interface between CPU and GPU still. I guess the simplest future may be dumping more onto the GPU, in essence a RAM+GPU system with a CPU just there to pick up specific functions (kinda like PPU in Cell, in a support role to the heavy lifters).

Now you are talking ;)
 
@Joshua Luna
It's interesting that you bring up Fermi, considering there are 2-3 years until 2012. This gen Sony basically chose a two-year-old top-of-the-line GPU; looking at the computing power of the 360's GPU, they are pretty close.
Could that mean we could expect a console released in 2012 to have a GPU comparable to Fermi?

On the one hand, Fermi on 28nm may be in the general area range a console could be looking at in 2012. But I don't follow the point about Sony picking a 2-year-old GPU. RSX was a G70 derivative which, until the launch window, was still a top-end GPU.

iirc the 5870 is ~340mm² on 40nm. If the Xbox 3 is GPU-centric and launches on 32-28nm in 2012, and we are looking at a single-GPU solution (not dual...) with similar budgets to the 360, I would think we could see some bump from more hardware but a reduction in frequency. Something in the 2.5-4 TFLOPs range (about what I predicted in 2005/2006) would be a reasonable guess? Who knows though ... the question is how useful the chip will be.
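That range is just shader width times clock; for example (the 5870 numbers are real, the console configuration below is purely a hypothetical):

```python
# Peak programmable FLOPs = ALU lanes x 2 ops (multiply-add) x clock.
def peak_tflops(alu_lanes, clock_ghz):
    return alu_lanes * 2 * clock_ghz / 1000.0

print(f"HD 5870 (1600 lanes @ 0.85 GHz)       : {peak_tflops(1600, 0.85):.2f} TFLOPs")
# Hypothetical 2012 console part: wider array, lower clock for power/yield reasons.
print(f"Guess: console (2400 lanes @ 0.70 GHz): {peak_tflops(2400, 0.70):.2f} TFLOPs")
```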

Despite the Fudzilla rumour I don't expect a true next generation from MS or Sony within just a few years, because I don't think interest in the current generation will fade that fast. I expect some console refresh, but not a new generational leap; I think there are cheaper ways of breathing new life into the current gen. The new motion controllers and 3D gaming are two such ways, but I expect there are more that don't require all the investment a brand new console would.

3D controls in 2010. A limited console launch in 2012 with mass market release/price in 2013 is 3-4 years away for most *early adopters*.

I dunno, but I would be ready for a new console in 2013 (4 years). Lack of AF is killing me :p
 
Sure. I'm not specifically advocating Larrabee. I just like the elegance of a one core-type, one RAM pool design, where it's all about programmability and software flexibility. I think a PS4 with large RAM pool, Cell derivative, would be a good enough option. Sadly the GPU side is still a weak-point, and bang-for-transistor-buck, we'll probably be stuck with a separate GPU specifically designed to churn through graphics. There's going to be a software interface between CPU and GPU still. I guess the simplest future may be dumping more onto the GPU, in essence a RAM+GPU system with a CPU just there to pick up specific functions (kinda like PPU in Cell, in a support role to the heavy lifters).
That supports AMD/ATI's point: the future is Fusion :)
Having the CPU and the GPU on the same chip should ease communication a lot.
 
Why are some of you thinking MS is anxious to dump the EDRAM? Were they burned by that design decision in some way? I thought it was considered one of MS's smarter decisions for this gen. Especially if Sony is likely to use it, what would make MS reluctant to use it again?

I think the concern will be benefit/cost/detriment. Every new generation has different bottlenecks. Will eDRAM alleviate cost/performance bottlenecks without causing more issues? The 5870 seems to spit out AA/AF at a reasonable cost on a 256-bit bus. Is eDRAM like Xenos's flexible/fast enough to hold multiple buffers without tiling? Is there a REAL tiling solution? Is the cost of ~100mm² of eDRAM better invested in more DRAM? More shaders? More CPUs? More CPU cache?

If fillrate and such are being tamed by system memory then eDRAM may no longer be a big deal. Choke points change, and 100mm² is a lot of silicon! That is ~2 more PPE-class cores on the 360, or almost a doubling of the Xenos shader array!!
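As a rough illustration of why a fat unified bus changes the eDRAM math, here is a naive, uncompressed framebuffer-traffic estimate (the overdraw factor and per-sample byte counts are my assumptions, and real GPUs compress colour/Z, so treat it as a loose upper bound):

```python
# Per rendered sample: 4B colour read + 4B colour write + 4B Z read + 4B Z write = 16 bytes.
def framebuffer_gb_s(width, height, samples, fps, overdraw, bytes_per_sample=16):
    return width * height * samples * overdraw * bytes_per_sample * fps / 1e9

print(f"1080p60, 4x overdraw, no AA : {framebuffer_gb_s(1920, 1080, 1, 60, 4):5.1f} GB/s")
print(f"1080p60, 4x overdraw, 4x AA : {framebuffer_gb_s(1920, 1080, 4, 60, 4):5.1f} GB/s")
```

Heavy alpha-blended particles push the overdraw factor way up, of course, which is exactly where joker's pro-eDRAM argument comes from.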
 
That supports AMD/ATI's point: the future is Fusion :)
Having the CPU and the GPU on the same chip should ease communication a lot.

But it increases heat and power. And for production, a 165mm² CPU and a 335mm² GPU have a lot better yields than a 500mm² GPU/CPU combo.

Fusion appears to be targeting the low end anyhow. Unless we see something unique (e.g. dual-chip, with each chip having a couple of CPUs and mainly GPU) I don't see it competing well. Just look at the PC market with high-end quad-core CPUs at nearly 3GHz and craptastic GPUs--it is a pretty worthless combo for games.
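The yield point is easy to put rough numbers on with a toy Poisson yield model (the defect density is an arbitrary assumption and this ignores redundancy/harvesting, so it's only illustrative):

```python
import math

# Poisson yield model: yield = exp(-defect_density * die_area).
def poisson_yield(area_mm2, d0_per_cm2=0.4):
    return math.exp(-d0_per_cm2 * area_mm2 / 100.0)

def area_per_good_die(area_mm2):
    # Wafer area you have to fabricate, on average, per working die.
    return area_mm2 / poisson_yield(area_mm2)

cpu, gpu = 165, 335
separate = area_per_good_die(cpu) + area_per_good_die(gpu)
fused = area_per_good_die(cpu + gpu)
print(f"Separate dies : {separate:6.0f} mm^2 of wafer per good CPU+GPU pair")
print(f"500 mm^2 fused: {fused:6.0f} mm^2 of wafer per good chip")
```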
 
On the one hand, Fermi on 28nm may be in the general area range a console could be looking at in 2012. But I don't follow the point about Sony picking a 2-year-old GPU. RSX was a G70 derivative which, until the launch window, was still a top-end GPU.

Sort of. They shared the architecture, but Nvidia's top-of-the-line G71 GPUs ran at considerably higher speeds by the time the PS3 was released, and the architecture was also replaced almost immediately with a totally different one. But I see your point.

2013 may be a good year to start capturing the hardcore gamers for the next generation; I don't rule it out, but it will come down to nickels and dimes: is it worth it? The investment is huge. When will it start making a return? Isn't it better to ride the current generation a few more years if you can keep the interest up?
 
But it increases heat and power. And for production, a 165mm² CPU and a 335mm² GPU have a lot better yields than a 500mm² GPU/CPU combo.

Fusion appears to be targeting the low end anyhow. Unless we see something unique (e.g. dual-chip, with each chip having a couple of CPUs and mainly GPU) I don't see it competing well. Just look at the PC market with high-end quad-core CPUs at nearly 3GHz and craptastic GPUs--it is a pretty worthless combo for games.
Well, I disagree, as I stated above (and putting aside process problems, which are not impossible to overcome; Intel planned to produce Atom on TSMC's process, for example). Right now the smallest quad-core money can buy is the Athlon II X4 6xx, codename Propus, at ~170mm²; you can also have a Juniper for ~170mm².
Say you can put them together and you have what I would not call a craptastic chip. It's smaller than what you seem to plan for, but it could have some nice advantages, as you save money (acceptable yields, easier design, fewer buses, only one cooling solution) that can be spent elsewhere. You can go with a bit more RAM, or use faster RAM than your competitors; the chip is big enough to support a wider bus if needed; you may end up with a smaller/sexier system; or you can spend more on the HDD, bigger or faster. Overall you may catch up somewhat in performance, even if the public will think you're lagging behind, or you may choose to be a bit cheaper.
Something a bit more iffy: when the first real dual-core CPUs were introduced, they did a much better job than two single-core CPUs on separate sockets would have, so maybe there are some hidden benefits to having the CPU and the GPU on the same chip. Sebbbi hinted at this: if you do computation on the GPU you must deal with synchronisation issues:
sebbbi said:
Transferring vector floating-point intensive algorithms to the GPU is the next step. The GPU is a harder thing to use properly compared to multi-core CPUs, as the latency to get the results back from it is much higher, and you need to process considerably larger data-sets at once to get best performance out of it. GPUs also historically have really bad random branching performance compared to CPUs. It's really important to design the algorithms properly for the GPU in mind. Sadly this often means large structural changes to the game program, as GPU latencies can be more than a frame long. Multi-threaded game programming has already increased the input lag a bit (often by a frame at least). This is one of the main reasons why many recent games suffer from noticeable input lag.
I also remember reading an MS presentation about multithreading and the restrictions they have to set because it's complicated to know "what's happening on the GPU side" (it's obviously more complex than that).

The chip I describe above is not perfect, as I find the split between CPU and GPU a bit too even, but by 2012, with a mature 28/32nm process available, it's a different matter: the CPU part would not grow (four cores should be enough), so that leaves more room for the GPU part of the chip.
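To make sebbbi's point concrete, a toy sketch (plain Python, no real GPU API) of the pattern he describes: GPU compute jobs get kicked off every frame, but their results are only safe to consume a frame later, which is where the extra frame of latency comes from:

```python
from collections import deque

def submit_gpu_job(frame):
    # Stand-in for dispatching a compute job; in reality this would be an async GPU dispatch.
    return {"frame": frame, "result": frame * 2}  # pretend computation

def game_loop(num_frames=5):
    in_flight = deque()                      # jobs submitted but not yet read back
    for frame in range(num_frames):
        in_flight.append(submit_gpu_job(frame))
        if len(in_flight) > 1:               # results are only ready a frame later
            done = in_flight.popleft()
            print(f"frame {frame}: gameplay uses GPU results from frame {done['frame']}")
        else:
            print(f"frame {frame}: no GPU results available yet")

game_loop()
```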
 