Predict: The Next Generation Console Tech

Wasn't it the case that IBM sold a ton of fully functioning Cells for servers/supercomputers, while Sony got the ones with only 7 SPEs working? I don't think the yield on Cell was that bad; it's just that IBM wanted the best for itself :)

AFAIK the PS3 Cells only ran 32-bit floats at full speed (double precision was heavily penalised), making them practically useless for most scientific computing tasks. (And much saner for gaming.)
 
Quick info: there was a rumor that MS would stick with the two-model setup, even going so far that the "set-top box" model would be a Kinect-enabled, Netflix-capable, lower-end gaming machine, while the "hardcore" model would have the optical drive, HDD, and backwards compatibility.

The new rumor is from our very own GrandMaster...

http://www.gamesindustry.biz/articles/digitalfoundry-next-gen-xbox-in-2012-analysis?page=1

It's also believed that Microsoft will continue its successful two SKU strategy, and indeed take it much further with its new platform: a pared down machine is to be released as cheaply as possible, and positioned more along the lines of a set-top box (the use of 360 as a Netflix viewing platform in the US is colossal) and perhaps as a Kinect-themed gaming portal, while a more fully-featured machine with optical drive, hard disk and backward compatibility aimed at the hardcore would be released at a higher price-point.

He also delves into the idea of dual GPUs...

The yield situation does lend some credence to another oft-repeated Nextbox rumour, however: the notion that it will feature dual graphics chips in its design. While the idea of packing two GPUs into its next console might sound crazy, it would allow Microsoft to use more chips from the production line in the same way that Cell (and indeed numerous CPUs and GPUs) had cores turned off to make chips with minor defects viable, improving percentages on chips that could be utilised.

There are also advantages from a development perspective too. One contact told us that two GPUs makes a lot of sense - short of adopting the fully programmable graphics chip (like Intel's abandoned Larrabee), it's almost a developer's dream feature.

Even used inefficiently, developers could tile - cutting the scene in half and sending each piece to a different GPU. Efficiency would be lost on border-overlapping geometry - just as it is with tiling on the Xbox 360 right now - but the rendering of geometry is less and less of the work with the slow shift from fully forward to fully deferred rendering. A deferred renderer would lose nothing in efficiency in all the lighting and shading passes.

More streamlined applications could see independent rendering operations parallelised - for example, rendering the main scene in parallel with the shadow map, rendering the different cascades of the increasingly popular cascaded shadow maps, rendering light buffers, or rendering different portions of a complex post-processing chain - but post-processing is a good candidate for pure tiling as well.

There would also be production advantages for Microsoft too. Two slower, narrower graphics chips should be easier and cheaper to make than one big one, and it would be less expensive to route two 128-bit memory buses instead of one 256-bit bus. It could also be cheaper to cool them separately too.

The dual-GPU idea sounds plausible, I'm just not sure it's likely. However, I am coming around to the idea of the two-SKU approach. It makes sense that they would use an updated and streamlined 360 for the DVD-less set-top-box SKU, and then have the high-end SKU be the 720 with hard drive, DVD and/or Blu-ray.
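For what it's worth, here's a purely illustrative Python sketch of the tiling idea from the article - no real engine code, made-up triangle sizes - just splitting the screen at a seam, binning each triangle's screen-space bounding box to whichever GPU's half it touches, and counting how much geometry would get processed twice:

```python
import random

WIDTH, HEIGHT = 1920, 1080
SEAM_X = WIDTH // 2  # vertical split: left half -> GPU 0, right half -> GPU 1

def random_triangle_bbox():
    """Hypothetical scene: triangles represented by screen-space bounding boxes."""
    x = random.uniform(0, WIDTH)
    y = random.uniform(0, HEIGHT)
    w = random.uniform(5, 120)  # made-up triangle sizes in pixels
    h = random.uniform(5, 120)
    return (x, min(x + w, WIDTH), y, min(y + h, HEIGHT))

def assign_to_gpus(triangles):
    """Bin each triangle to the GPU(s) whose half of the screen it overlaps."""
    gpu0, gpu1, duplicated = [], [], 0
    for (x0, x1, y0, y1) in triangles:
        left = x0 < SEAM_X
        right = x1 >= SEAM_X
        if left:
            gpu0.append((x0, x1, y0, y1))
        if right:
            gpu1.append((x0, x1, y0, y1))
        if left and right:
            duplicated += 1  # straddles the seam: geometry work is done twice
    return gpu0, gpu1, duplicated

random.seed(1)
scene = [random_triangle_bbox() for _ in range(100_000)]
gpu0, gpu1, dup = assign_to_gpus(scene)
print(f"GPU0: {len(gpu0)} tris, GPU1: {len(gpu1)} tris, "
      f"duplicated at the seam: {dup} ({100 * dup / len(scene):.1f}%)")
```

Which is exactly the article's point: the only redundant work is the sliver of geometry overlapping the border, and the lighting/shading passes of a deferred renderer would split cleanly.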

Tommy McClain
 
Dual GPU in current designs = duplication of memory, with all the added drawbacks of board complexity, power, cost, etc.

I don't think it's a horrible idea, for the reasons given (I echoed them long ago), but it does pose hurdles. If they do go with an SI part, maybe they could invest in a cross-chip traffic/memory controller for shared memory; then going with 2x 130mm^2 chips keeps the manufacturing benefits. That said, there could be structural losses within the GPUs - schedulers and whatnot that would be duplicated - plus dedicated logic needed to make the cross-GPU link work. I wonder whether, with a dedicated chip design, a memory controller / side-port link could be worked out efficiently enough to minimize such issues.

Part of me says another option, if it could be pulled off, would be to use the PC market for binning of usable parts:

10% Top Bin = $500 tier PC parts / no defects, best speed/TDP bin
11-50% Bin = Xbox 3 Bin / Mid-PC bin, 80% frequency, 10% block disabled
51-100% Bin = Mid-Range & Low-End PC Binning, various disabled blocks, reduced frequencies

This would only be helpful for the 12-18 months until new DX models come out. But if MS could coordinate this with a chip maker, it could be a boost to the chip maker: "game GPU as Xbox 3", or even better, "faster Xbox chip". And while it would only be helpful the first year, it would allow more usable parts - so even if the off-bin chips are sold at a loss by MS (let's say ATI doesn't need 1M extra parts at $40 a chip, so they buy them at $30, BUT MS loses $10 instead of $40 on the chips unusable for the Xbox), they could come out ahead until the process matures to the 80% usable rate and/or the next process shrink. Of course, a PC-compatible part is going to be larger than a console-specific one.
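A toy cost model of that idea, using only the hypothetical figures from this post (the $40/$30 numbers and the 40% Xbox bin from the list above) - nothing here is real pricing:

```python
# Toy cost model: every fabbed die costs MS CHIP_COST; dies that miss the Xbox
# bin are sold on to the PC partner at RESALE_PRICE, so MS only eats the
# difference on them. All numbers are the hypothetical ones from the post.
CHIP_COST = 40.0          # assumed manufacturing cost per die ($)
RESALE_PRICE = 30.0       # assumed price the PC partner pays for off-bin dies ($)
XBOX_BIN_FRACTION = 0.40  # the "11-50%" bin from the list above

def cost_per_console_chip(chips_fabbed, resell_off_bin=True):
    """Average net cost MS pays per die that actually ends up in a console."""
    usable = chips_fabbed * XBOX_BIN_FRACTION
    off_bin = chips_fabbed - usable
    loss_per_off_bin = (CHIP_COST - RESALE_PRICE) if resell_off_bin else CHIP_COST
    net_cost = usable * CHIP_COST + off_bin * loss_per_off_bin
    return net_cost / usable

print("without reselling off-bin dies: $%.2f per console chip"
      % cost_per_console_chip(1_000_000, resell_off_bin=False))
print("with reselling off-bin dies:    $%.2f per console chip"
      % cost_per_console_chip(1_000_000, resell_off_bin=True))
```

With those made-up numbers the effective cost per usable chip drops from $100 to $55, which is the whole point: the PC channel absorbs the early-process fallout until yields mature.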

I bet there is a lot on the table...
 
Was Grandmaster's source TheChefO? :p

I can't take credit for the dual GPU solution as the concept was brought about much earlier in this very thread!

I believe Acert93 was the first to propose the possibility.

All I've been doing is proposing reasons for which they might consider a multi-GPU solution.

edit: And voilà! There's Acert! Happy Thanksgiving!
 
Dual GPU in current designs = duplication of memory, with all the added drawbacks of board complexity, power, cost, etc.


Very important aspect: size matters a lot!

Your post makes me think about complexity, size, etc. Providing double the memory access/bandwidth for 2 GPUs really would be big trouble, but what if they could put both GPUs behind some kind of shared memory controller/crossbar that's more efficient than what the current Radeon HD 6990 does? Would that be possible?

About GPU size: anything much bigger than what we found in the Xbox 360 and PS3 - specifically beyond the ~250mm2 RSX at 90nm - would be very difficult to fit in a closed console box. So, forgive my dreaming mode ON: looking at what we have today, perhaps the best option under these constraints (the power/wattage/size trade-off) is two Juniper-like chips (Radeon HD 6770/6870M class, 800 stream processors each), currently 170mm2 at 40nm, or roughly 250mm2 total at 28nm (2x ~120-125mm2 Juniper-like dies); depending on the (under)clock, that becomes something like a 70-watt "double GPU".
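Just as a back-of-envelope check on those die sizes (ideal square-law scaling only; real shrinks are usually worse, which is why I pad the figure to ~120-125mm2 above):

```python
# Ideal die-area scaling for the dual-Juniper idea: area shrinks with the
# square of the feature size. Real-world shrinks fall short of this, hence
# the padded ~120-125 mm^2 per-chip guess above.
JUNIPER_AREA_40NM = 170.0  # mm^2 at 40nm (Radeon HD 5770/6770 class)

def ideal_shrink(area_mm2, from_nm, to_nm):
    return area_mm2 * (to_nm / from_nm) ** 2

per_chip_28nm = ideal_shrink(JUNIPER_AREA_40NM, 40, 28)
print(f"ideal 28nm Juniper-class die: ~{per_chip_28nm:.0f} mm^2 each, "
      f"~{2 * per_chip_28nm:.0f} mm^2 for the pair")
```

So even with imperfect scaling, two such dies should fit within roughly an RSX-sized silicon budget.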

(I was dreaming of something as powerful as 2x 6990M on the same package, but however much we debate and dream, what actually reaches us will probably be about 40% of what we imagined.)
 

Hmm ... Multiple Radeon 6770's ... you don't say! :p


Assuming a 32/28nm launch in 2012 yields an 8x transistor count, this would amount to a budget of roughly 4 billion transistors (497m x 8 = 3,976m), if we assume an equivalent silicon budget on the new process node.

This leads to some pretty interesting potential hardware:

With that budget, MS could extend the xb360 architecture to the following:

10MB eDRAM (100m) => 60MB eDRAM (600m) - enough for a full 1080p frame buffer with 4xAA

3-core XCPU (165m) => 9-core XCPU (495m) - or an upgraded 6-core PPE design with OoOE and larger caches, along with an ARM core (13m trans)

This leaves a hefty ~2.9b transistors available for the XGPU, which could accommodate 3x AMD ~6770-class chips (1,040m each) for ~3 teraflops, or 4x AMD ~6670-class chips (716m each) for ~2.8 teraflops.
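Quick sanity check of that arithmetic in Python (all figures are the ones above - nothing official):

```python
# Transistor budget tally, in millions. The 8x factor is the assumed scaling
# from the ~90nm launch parts to a 32/28nm part; all other figures are the
# estimates quoted above.
XBOX360_ERA_BUDGET = 497
NEXT_GEN_BUDGET = XBOX360_ERA_BUDGET * 8   # 3,976 M

committed = {
    "60MB eDRAM": 600,
    "9-core CPU (or 6-core OoO PPE + ARM)": 495,
}
gpu_budget = NEXT_GEN_BUDGET - sum(committed.values())

print(f"total budget:     {NEXT_GEN_BUDGET} M")
print(f"left for the GPU: {gpu_budget} M")
print(f"3 x Juniper-class (1,040 M each): {3 * 1040} M")
print(f"4 x Turks-class   (716 M each):   {4 * 716} M")
```

By this count the 3x Juniper option runs a touch over the remainder at full per-die transistor counts (so assume some trimming), while the 4x Turks option just squeezes in.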

Such small, modular chips would enable good yields on new(er) processes until they were mature enough to combine and eventually integrate into an APU.
 
Hmm ... Multiple Radeon 6770's ... you don't say! :p

Sorry, I didn't see your post, forgive me. But 2 GPUs is already very difficult (more complex memory access, caches to hide latencies, "perfect" efficiency syncing 2 processors, etc.), and 3 is too much in both size and wattage even at 28nm (3 x 125 = 375mm2 and 100-110+ watts).
 
You don't need multiple GPU dice to guard against manufacturing defects. You can just add redundancy onto the chip and disable portions that carry a defect. Or disable a working portion to have performance parity.

ATI does this exact thing with the Radeon 6870/6850 (which are the same physical chip, where the lesser SKUs have portions disabled).

NVIDIA does this exact thing with GTX580/GTX570 (which are the same physical chip, where the lesser SKUs have portions disabled).

IBM/Toshiba/Sony have done this exact thing with Cell BE for the PS3 (8 SPUs built, one permanently disabled for redundancy).

The idea that now, somehow, splitting a relatively small GPU into multiple pieces has become a better guard against manufacturing defects is complete and utter hogwash. If that were the case, where are ATI's reference designs coupling two lower-end chips on one board? 120mm² per die has been quoted. These boards do not exist because they make no sense whatsoever. No one benefits from them.

Multi-GPU is an inefficient enthusiast-only crutch to produce more performance than you can manufacture on a single die. It's a waste of transistors for anything below that. Never mind the significant software overhead to get any performance scaling out of it.

It's also certainly not a developers' dream feature as quoted above. It's sort of acceptable in a "doesn't necessarily suck as much as you might think" way, but there is no benefit over single GPU with the same aggregate specs. Only drawbacks. Less performance. Higher manufacturing cost. Complete nonsense.
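To put rough numbers on the disable-a-block approach, here's a toy Poisson defect model - the defect density, die size and block count are made up, and it ignores parametric yield and edge effects entirely:

```python
import math

# Toy Poisson defect model: expected defects per die = defect_density * area.
# All figures below are illustrative assumptions, not foundry data.
DEFECT_DENSITY = 0.4   # defects per cm^2 (assumed)

def perfect_die_yield(area_cm2):
    """Probability that a die of the given area has zero defects."""
    return math.exp(-DEFECT_DENSITY * area_cm2)

def yield_with_one_spare_block(area_cm2, num_blocks):
    """Sellable if at most one of num_blocks equal-sized blocks is defective."""
    p_clean = math.exp(-DEFECT_DENSITY * area_cm2 / num_blocks)
    p_all_clean = p_clean ** num_blocks
    p_one_bad = num_blocks * (1 - p_clean) * p_clean ** (num_blocks - 1)
    return p_all_clean + p_one_bad

BIG_DIE = 2.4  # cm^2, i.e. ~240 mm^2 of hypothetical "one big GPU"
print(f"no redundancy:              {perfect_die_yield(BIG_DIE):.1%}")
print(f"1 of 10 blocks disableable: {yield_with_one_spare_block(BIG_DIE, 10):.1%}")
```

Which is the Radeon 6850 / GTX 570 / Cell story above in one number: a single fusable block turns a mediocre perfect-die yield (~38% here) into a perfectly workable one (~77%).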
 

So the argument here is that a 2.8b transistor chip built on a 28nm node will have better yields than a pair of 1.4b transistor chips on that same 28nm node.

I strongly disagree.

I see what you're saying in building in redundancy, but with a chip that big, that is a lot of redundancy!



I don't think anyone would argue with you about one 2.8b-transistor GPU being better than two 1.4b-transistor GPUs, but the issue at hand is whether you can get high enough yields out of a chip that big on 28nm, running cool enough to fit in a console TDP, without it costing an arm and a leg.

I'm not saying it's impossible, I just don't see how.
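For concreteness, here's the same toy Poisson defect model as the sketch a couple of posts up, comparing good "console sets" of GPU silicon per wafer for the two plans. The defect density and die sizes are still made up, and it ignores the duplicated logic and inter-chip link that two dies would actually need:

```python
import math

# Same toy model: expected defects per die = defect_density * area.
DEFECT_DENSITY = 0.4      # defects per cm^2 (assumed, as before)
WAFER_AREA = 706.9        # cm^2 on a 300 mm wafer, ignoring edge losses
BIG_DIE = 2.4             # cm^2: one hypothetical monolithic GPU per console
SMALL_DIE = BIG_DIE / 2   # cm^2: two of these per console

def poisson_yield(area_cm2):
    """Probability a die of the given area has zero defects."""
    return math.exp(-DEFECT_DENSITY * area_cm2)

# Good console GPU sets per wafer, assuming no redundancy on either plan.
big_sets = (WAFER_AREA / BIG_DIE) * poisson_yield(BIG_DIE)
small_sets = (WAFER_AREA / SMALL_DIE) * poisson_yield(SMALL_DIE) / 2  # need 2 good dies
print(f"one {BIG_DIE:.1f} cm^2 die per console:   {big_sets:.0f} sets/wafer")
print(f"two {SMALL_DIE:.1f} cm^2 dies per console: {small_sets:.0f} sets/wafer")
```

Under those assumptions the split does recover noticeably more usable silicon per wafer than a monolithic die with no redundancy; the counter-argument is the disabled-block harvesting modeled in the post above, which goes a long way toward closing that gap for the single die.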
 
2.8B is enthusiast-level high end. Never mind that we don't know NVIDIA's/ATI's yields on the current chips in that class, you're looking at an estimated power draw of 180W for the GPU alone, which is so far beyond the budget that you can drop the whole idea entirely.
 
2.8B is enthusiast-level high end. Never mind that we don't know NVIDIA's/ATI's yields on the current chips in that class, you're looking at an estimated power draw of 180W for the GPU alone, which is so far beyond the budget that you can drop the whole idea entirely.

It's 8x the current generation transistor budget:

500m x 8 = 4b

I'm not expecting anything less than this for next gen.


How this budget is broken up is debatable, but with the current trend of GPUs taking more of the workload off the CPU, I'd bet on the GPU getting three quarters of it, or roughly 3b transistors.



As for wattage, I'm not sure how they get around this, but there are possibilities outlined in this very thread. Binning is a possibility. Even more so if the GPU is broken up into multiple pieces instead of one monster chip.

Another possibility is don't confine the console to a micro machine. Let it breathe in a standard 17" wide av case.

Maybe both, maybe none, I don't know, but anything less than that transistor budget will be a waste of everyone's time.
 
As for wattage, I'm not sure how they get around this, but there are possibilities outlined in this very thread. Binning is a possibility. Even more so if the GPU is broken up into multiple pieces instead of one monster chip.
Splitting does not help with wattage nor with cooling. Not logically and not in practice. Look at the EVGA dual 560Ti card. Its cooler is even bigger and more obnoxious than the highest-end single-GPU cards.
Binning for "mobile" style power consumption reduces yields, just as binning for higher clock frequencies reduces yields. Only a sliver of any given run is good enough to cut it as a mobile part. Which is fine in PCs, since high-end mobile GPUs are super niche. Console GPUs aren't.
TheChefO said:
Another possibility is don't confine the console to a micro machine. Let it breathe in a standard 17" wide av case.

Maybe both, maybe none, I don't know, but anything less than that transistor budget will be a waste of everyone's time.
Prepare to have your time wasted then.
 
Splitting does not help with wattage nor with cooling. Not logically and not in practice. Look at the EVGA dual 560Ti card. Its cooler is even bigger and more obnoxious than the highest-end single-GPU cards.

Cooling is easier over a larger surface than a smaller one.

I never said wattage would be absolutely less, but it could be with binning.

Binning for "mobile" style power consumption reduces yields...

Indeed, but binning two low power 1.4b trans GPUs would produce higher yields than a single low power 2.8b trans GPU.

Of course, an engineer could just take your approach and say "can't do it", slap whatever fits in a single GPU package into a small box, and call it a day.



But then we'd have xb720 competing with Wuu instead of ps4. Which I'm sure some folks around here would be thrilled with.
 
I don't see where the current Xbox360 would find its target in the two-model line-up rumor. The Xbox360 can still generate a lot of profit. It would make sense for MS to launch a revised Xbox360, a truly slim design with a new SoC at 32nm, maybe as early as CES 2012, and then launch the next generation next fall. The Xbox360 would target mainstream/casual gamers in the $149-249 range, while the next Xbox would be sold at a much higher $399-499+ price range, targeting hard-core gamers.
I think people are willing to pay more for entertainment and electronics today than a few years ago.
 
It would make sense for MS to launch a revised Xbox360, a truly slim design with a new SoC at 32nm, maybe as early as CES 2012, and then launch the next generation next fall. The Xbox360 would target mainstream/casual gamers in the $149-249 range, while the next Xbox would be sold at a much higher $399-499+ price range, targeting hard-core gamers.

Indeed.

It would help them sell consoles.

A more targeted approach on the marketing front with xb360+Kinect helps them zero in on the casual market, while top-end hardware in the xb720 helps re-establish them with the hardcore gamer.

For this reason, I expect much out of xb720.

Perhaps more than others on this board.
 