Predict: The Next Generation Console Tech

bkilian · Aug 24, 2011

erick said:
2. If John Carmack is already doubting if MS will add an optical drive to their nextBox, what gives? If the nextBox does not launch with a completely new and awesome optical disc format, and unless they want to pay Sony for Blu-Ray, what other choice do they have?

Besides, the necessary infrastructure is already in place: Xbox Live!, big enough HDDs. Live! is already used to distribute games that weigh in at several gigabytes. After that its just a matter of scaling the infrastructure.

Sure, if you're willing to drop your addressable market by half or less. Just over half our customers sign up for live.

pjbliverpool said:
I doubt there are many serious developers out there that wouldn't have been happier to see a Core 2 in the 360 rather than Xenon. PS3's different given how much RSX needed the extra graphics umph cell could deliver.

That would have been something to see indeed, since the Core 2 launched 9 months after the XBox 360. Even the original Core launched 2 months after the 360. There's a rule of thumb we try to use in software engineering, never rely on a component that launches just before you do, since any slips they make will kill you. (This is one of the things that happened to Kin, but that's a long and boring story). Relying on a component that lanches after you is just plain insane.

Squilliam · Aug 25, 2011

bkilian said:
There's a rule of thumb we try to use in software engineering, never rely on a component that launches just before you do, since any slips they make will kill you. Relying on a component that lanches after you is just plain insane.

So that means that Microsoft is going to be starting to send out early development kits with sample GCN graphics hardware from AMD in a few months?

Rangers · Aug 25, 2011

bkilian said:
Sure, if you're willing to drop your addressable market by half or less. Just over half our customers sign up for live.

Isn't it half that pay for Live Gold, though? Not quite the same thing.

erick · Aug 25, 2011

function said:
He was talking about roar powah, and he was wrong. That's really all there was to it. He was chest beating, and seemed unaware that Cell does actually have some pretty huge strengths

I'm actually well aware of the architectures of both PS3 and X360.

The issue with PS3 is that Cell is a double-edged sword. While it does have good single-precision GFLOPS performance (204 GFLOPS is the latest number agreed upon, but you'd have to deduct 25,6GFLOPS for the disabled 8th SPE and then another 25,6GFLOPS for the 7th SPE which is reserved for OS usage, giving a final theoretical maximum of about 152 GFLOPS), but its PPE has completely negligible integer performance, even compared to Xenon:

PS3 Cell PPE @ 3,2GHz = 10 240 MIPS
X360 Xenon @ 3,2GHz = 19 200 MIPS
Core 2 Duo @ 2,93GHz = 27 079 MIPS

and therefore

Core 2 Duo OC @ 3,2GHz = 29 574 MIPS

It's almost 3x as fast as Cell when executing general-purpose code.

Now, you might wonder - but when it comes to physics, your system is still screwed without Cell? Well, welcome to the wonderful world of nVidia PhysX where a 8800 GTX can put out a total of 345 GFLOPS. Of course, when you enable PhysX in an actual game, only a some of this will be used. However, the results tend to be quite spectacular compared to the consoles when implemented properly.

On my 2006 PC, Mirror's Edge ran @ 1920x1080, 4xAA, PhysX enabled and all high @ 45-50 frames per second - compare that to 1280x720 2xAA @ often sub 30 fps with rudimentary physics and the differences in computational power become more clear-cut.

jonabbey · Aug 25, 2011

erick said:
I'm actually well aware of the architectures of both PS3 and X360.

The issue with PS3 is that Cell is a double-edged sword. While it does have good single-precision GFLOPS performance (204 GFLOPS is the latest number agreed upon, but you'd have to deduct 25,6GFLOPS for the disabled 8th SPE and then another 25,6GFLOPS for the 7th SPE which is reserved for OS usage, giving a final theoretical maximum of about 152 GFLOPS), but its PPE has completely negligible integer performance, even compared to Xenon:

PS3 Cell PPE @ 3,2GHz = 10 240 MIPS
X360 Xenon @ 3,2GHz = 19 200 MIPS
Core 2 Duo @ 2,93GHz = 27 079 MIPS

and therefore

Core 2 Duo OC @ 3,2GHz = 29 574 MIPS

It's almost 3x as fast as Cell when executing general-purpose code.

Those wikipedia numbers don't make any sense. They have the Cell's single PPE rated at nearly half the integer MIPS of Xenon's 3 PPE-like cores, which is surely wrong. They also list the Cell's PPE as being able to execute 3.2 D IPS, which is nonsensical.

function · Aug 25, 2011

erick said:
I'm actually well aware of the architectures of both PS3 and X360.

The issue with PS3 is that Cell is a double-edged sword. While it does have good single-precision GFLOPS performance (204 GFLOPS is the latest number agreed upon, but you'd have to deduct 25,6GFLOPS for the disabled 8th SPE and then another 25,6GFLOPS for the 7th SPE which is reserved for OS usage, giving a final theoretical maximum of about 152 GFLOPS), but its PPE has completely negligible integer performance, even compared to Xenon:

PS3 Cell PPE @ 3,2GHz = 10 240 MIPS
X360 Xenon @ 3,2GHz = 19 200 MIPS
Core 2 Duo @ 2,93GHz = 27 079 MIPS

and therefore

Core 2 Duo OC @ 3,2GHz = 29 574 MIPS

It's almost 3x as fast as Cell when executing general-purpose code.

It's a good thing that developers are allowed to use the SPUs as well as the PPE isn't it then!

Isn't it a bit odd that you're comparing what's available to developers in the PS3 specific implementation of Cell (PPE + 6 SPUs), cutting out 2 SPUs, but using the raw Core 2 Duo figures (and later overclocked figures)?

Now, you might wonder - but when it comes to physics, your system is still screwed without Cell? Well, welcome to the wonderful world of nVidia PhysX where a 8800 GTX can put out a total of 345 GFLOPS. Of course, when you enable PhysX in an actual game, only a some of this will be used. However, the results tend to be quite spectacular compared to the consoles when implemented properly.

Was anyone making the point that you can't do physics without Cell? I don't think that the world of PhysX is all that wonderful, but that's OT.

On my 2006 PC, Mirror's Edge ran @ 1920x1080, 4xAA, PhysX enabled and all high @ 45-50 frames per second - compare that to 1280x720 2xAA @ often sub 30 fps with rudimentary physics and the differences in computational power become more clear-cut.

Comparing an "average" non-vsync or triple buffered frame rate from a small section of a game with a capped frame rate isn't very fair now is it? And it doesn't change anything that people have been saying so far about about "raw power" (and what that means or doesn't) or some of the advantages that consoles have due to being closed boxes etc.

erick · Aug 25, 2011

function said:
Comparing an "average" non-vsync or triple buffered frame rate from a small section of a game with a capped frame rate isn't very fair now is it? And it doesn't change anything that people have been saying so far about about "raw power" (and what that means or doesn't) or some of the advantages that consoles have due to being closed boxes etc.

If you read the reviews from the time, it is clear that both PS3 and X360 versions of Mirror's Edge are suffering from occasional frame rate lag, in which case the v-sync is dropped and torn frames ensue.

On the PC there is a thing called triple-buffering which should be enabled by default in all video card drivers since the beginning of the 21st century. Let me quote Anandtech:

In other words, with triple buffering we get the same high actual performance and similar decreased input lag of a vsync disabled setup while achieving the visual quality and smoothness of leaving vsync enabled.

As for the general-purpose performance of Cell, Beyond3D itself offers up some corraborating data:

・Dhrystone v2.1
PS3 Cell 3.2GHz: 1879.630
PowerPC G4 1.25GHz: 2202.600
PentiumIII 866MHz: 1124.311
Pentium4 2.0AGHz: 1694.717
Pentium4 3.2GHz: 3258.068

function said:
It's a good thing that developers are allowed to use the SPUs as well as the PPE isn't it then!

Isn't it a bit odd that you're comparing what's available to developers in the PS3 specific implementation of Cell (PPE + 6 SPUs), cutting out 2 SPUs, but using the raw Core 2 Duo figures (and later overclocked figures)?

SPUs can only be used for FLOPS as far as I know, at least I have never seen integer performance numbers for them. FLOPS are good for AI, physics, post processing - but they are not capable of running "normal" game code which is integer(MIPS)-heavy.

Overclocking simply applied to the condition of my 2006 PC (given that it was what my comparison built upon from the start), but it's also a platform-specific advantage usable by anyone who is willing to do some research.

------------------------------------------------

However, I think we're getting sidetracked here. This thread is supposed to be about what's to come, not what has been

Shifty Geezer · Aug 25, 2011

erick said:
SPUs can only be used for FLOPS as far as I know, at least I have never seen integer performance numbers for them. FLOPS are good for AI, physics, post processing - but they are not capable of running "normal" game code which is integer(MIPS)-heavy.

You need to read up. For one thing specifically about how 'integer' and 'float' are misnomers to describe workloads. A more accurate definition would be 'vector float' work, and 'general purpose' which involves lots of memory access, branching, and any array of calculations and processes. SPEs can work perfectly on 'integer' performance, as fast as they can perform float work (only not SIMDed, obviously, but in terms of instructions issued). They can also process the integer pipe simultaneously with the float pipe. Where SPE's struggle is memory access as there isn't an intrinsic cache for them, so devs have to be aware of how they access memory or they can be constantly stalling, and they're not strong on branching although, once again, devs can take on some of the responsibility of that. However, efficient game code can be run full speed on SPEs.

However, I think we're getting sidetracked here. This thread is supposed to be about what's to come, not what has been

I was just thinking that!

sebbbi · Aug 25, 2011

wco81 said:
Are there $300 PCs which would outperform consoles?
How about $500 or $600 PCs or laptops?

Yes. Quoted from other forum:
AMD A3650 (Llano) APU 109$, motherboard (69$), hard drive (34$ for 250GB), some memory (1GB is 9$), cheap case (22$). Prices from Newegg (new parts). The total is 243$. It's cheaper than the Xbox 360 with (same sized) 250GB hard drive at 299$. Of course you need to get a cheap mouse/keyboard and a HDMI cable to connect it to your TV.

If the resolution is set to 720p (1280x720) and AA & AF is turned off / minimized (most console titles do not have them or have 2xAA only), this Llano system will run majority of console ports at 50-60 fps (according to reviews around the net). Consoles run the same games usually at 30 fps (and many games tend to dip to 20s when there's lots of things happening at once). So this system definitely packs a bit more punch than the current generation (2005) consoles, and for a lower price. Of course you cannot play games at 1080p/8xAA/8xAF like hardcore PC gamers prefer, but neither can any of the current generation consoles.

There is a site collecting Brazos E-350 netbook gaming benchmarks (the AMD Atom replacement with integrated GPU). Even the Brazos seems to run many properly optimized AAA console ports at 30 fps (when the game resolution is lowered to match the console 720p, and detail settings are lowered to match the console settings). Of course there are some games that run very poorly on Brazos, but it seems that those are just bad ports. There are so many well done graphics intensive console ports (like Mass Effect 2, Crysis 2 and Dirt) that it's clear that properly optimized console ports run pretty well on newest breed of netbooks. I am pretty confident that next year we have netbooks that pack more power than current generation consoles. But hopefully we also have next generation consoles before the Christmas.

liolio · Aug 25, 2011

I learn about COMIC thanks to one of MfA's post. Too bad the pdf is no longer freely available or at least blocked at my job.
I don't know the benchmark they are using so I'm not sure about how relevant the results are. Still at least for these benchmark results looked pretty good.
Since I read this there have always been in my head an idea that I don't have the knowledge to discard. I remember reading "the end of the GPU roadmap" from Sweeney and what it would deemed as convenient as a development platform. To make it short and to summup how I understood it is that he wanted to make a lot of performance sacrifices for the sake of programmer time:
* he wanted auto vectorizing compiler, which most likely would not do the job as well as a human.
* flat memory space, actually more than that he wanted transactional memory and was ok to sacrifice some extra 30% performance.
* he wanted a homogeneous system.
* he wanted a lot of bandwidth (internal and external)
* he wanted a hell lot of processing power per pixel.

That's quiet a lot for a single man to ask for

and it may not happen anytime soon.
Still it let me with an interrogation, I've always read that at some point it's better to invest on hardware (for coherency, extract ILP. vs letting the compiler do it as in VLIW design, etc.) but

I can be stubborn...

after reading that about COMIC it made me wonder (especially as see above I'm not really able to say something about the relevance of the benchmark they used and more importantly for us how relevant it would be overall to a game engine).
A SPU is 3/5 the side of a xenon/ppu core (and most likely a power A2 core) and that's without L2, even less for a more standard CPU, and power consumption is an good order of magnitude lower.
My interrogation is mostly like this, taking in account Sweeney consideration: productivity first, would it be possible to implement hand in hand a "super-COMIC" and a Cell successor to provide something that would fit the bill.
You can pack a lot on SPU for cheap (silicon and power consumption / thermal dissipation), you may use lower power CPU and more adapt to the job he would take care in such a model, you may be add even more EDRAM (or for cheaper in power and silicon) as scratchpad memory on board. So you may create quiet a monster chip but if you only communicate on its peak throughput it may not fool the professional that consider the hardware but also human work time as a great limiter.
Now you have one person that want to give away 50% of the peak performance for productivity (actually as chip getting involves more and mores cores you leave more and more peak performance, as it get impossible to go and make sure everything is busy doing well, etc. ) my question is how would fare a chip that packs 3 times or more the power consumes an order of magnitude less and say throw away 60% or even 70% of the peak performances but still has a convenient software model and where when you don't such an costly software model peaks way way higher than your average SMP model?
I don't know, may be IBM knows and hence came with POWER A2 instead of a reworked cell and the matching programming model but we still can discard legacy support as a part of their final decision on top of technical merit.

OK I don't expect that to happen, it's just for the sake of discussion which were diverging to "average SPU usefullness".

EDIT:

And no reaction to the pa add-on I linked yesterday? I though more about it and I can see some really lean and convinient implementation if implemented from scratch. Basically you could clear the face of the controller from action button, focus on the analogue sticks placement add the "start / pause /ps360 where it fits. I feel like this will get forget but whereas it may take sometime to get use it could have been the next step in pad ergonomic.

function · Aug 25, 2011

erick said:
If you read the reviews from the time, it is clear that both PS3 and X360 versions of Mirror's Edge are suffering from occasional frame rate lag, in which case the v-sync is dropped and torn frames ensue.

On the PC there is a thing called triple-buffering which should be enabled by default in all video card drivers since the beginning of the 21st century. Let me quote Anandtech:

For goodness sake old boy, I mentioned triple buffering in the very quote you used to base this response on, it's in black and white mere millimetres above what you were typing! "Comparing an "average" non-vsync or triple buffered frame rate from a small section of a game with a capped frame rate isn't very fair now is it?"

In a non-vsynced or triple buffered (normally PC) game the system will display new frames as fast (or almost as fast) as it can. The actual frame rate may fluctuate between less than 1 and several hundred in the case of no vsync, or between less than 1 and the refresh rate in the case of triple buffering. In a game that is *capped* at 30 fps the frame rate may drop lower, taking the average below 30, but unlike with a non-vsynced or triple buffered game, it cannot go higher even if the machine were capable of doing this.

Comparing a non-vsynced or triple buffered average frame rate with a capped frame rate (which is what you did) will not give you an accurate comparison of the relative hardware's (edit - I actually mean "relative platform's") ability to run a game. Capping the frame rate *reduces* the average. A game that is capped at 30 fps, with dips below 30, may run with an average of 30+ or 40+ or 50+ or whatever fps with the cap removed.

Overclocking simply applied to the condition of my 2006 PC (given that it was what my comparison built upon from the start), but it's also a platform-specific advantage usable by anyone who is willing to do some research.

You can overclock consoles too sometimes! Overclocking changes power consumption, cooling required, your board and PSU need to be able to handle it, not all chips overclock the same etc. It adds a lot of variables, and makes the already tricky job of comparing apples and oranges even more difficult.

erick · Aug 25, 2011

function said:
For goodness sake old boy, I mentioned triple buffering in the very quote you used to base this response on, it's in black and white mere millimetres above what you were typing! "Comparing an "average" non-vsync or triple buffered frame rate from a small section of a game with a capped frame rate isn't very fair now is it?"

Sorry I missed that >_<

function said:
In a non-vsynced or triple buffered (normally PC) game the system will display new frames as fast (or almost as fast) as it can. The actual frame rate may fluctuate between less than 1 and several hundred in the case of no vsync, or between less than 1 and the refresh rate in the case of triple buffering. In a game that is *capped* at 30 fps the frame rate may drop lower, taking the average below 30, but unlike with a non-vsynced or triple buffered game, it cannot go higher even if the machine were capable of doing this.

Well, that's an architectural difference between the platforms. Triple-buffering is hardly an unfair advantage - after all you can expect every scene in a console game to be tested against fixed hardware and optimized accordingly to hit a specific performance target, while obviously nothing like this can happen on the PC. And it does provide v-sync quality, just without the setbacks.

I'd say +1 for raw power vs fixed hardware optimization on this one

function said:
You can overclock consoles too sometimes! Overclocking changes power consumption, cooling required, your board and PSU need to be able to handle it, not all chips overclock the same etc. It adds a lot of variables, and makes the already tricky job of comparing apples and oranges even more difficult.

I know, I own a PSP and with custom firmware nearly all of those can be made to run at 333MHz (instead of the 222MHz default)

Naturally, battery life suffers, but nifty nevertheless.

To take this discussion into a new direction - what's your take on the possibility of modular and upgradeable design for the next gen? I'm pretty sure MS is quite happy with their idea of their X360 HDD add-on considering the prices it commanded.

In the past there have been modular upgrades like the SEGA 32x which actually added processing power. While a commercial flop, it did demonstrate that it is possible to upgrade an aging system.

So ... how would you feel about the opportunity to actually strap a "RAM-Pack" as an upgrade on the nextBox, and as a way to negate the initial high price? Considering the margins that other console hardware add-ons seem to have that might also be a viable way of bolstering the manufacturer's income?

Rangers · Aug 25, 2011

erick said:
So ... how would you feel about the opportunity to actually strap a "RAM-Pack" as an upgrade on the nextBox, and as a way to negate the initial high price? Considering the margins that other console hardware add-ons seem to have that might also be a viable way of bolstering the manufacturer's income?

I have a post somewhere about how a RAM expansion pack would be a great idea, but more as a sneak attack strategy on the competition, IE keep the expansion port secretive, a year or two in, bust it out, competition is helpless

You would need to take steps to ensure ram pack equipped is the default configuration after introduction (free to prior owners, mandated on all new titles, ships with all new hardware, etc)

I'm not sure how technically feasible it is anymore though.

erick · Aug 25, 2011

Actually, it's quite easy.

You really only have to compare it to the X360 HDD. While high-end configurations got this, game devs had to initially ensure the Arcade units ran their games as well, without the HDD.

4 years later, a lot X360 of games have semi-mandatory HDD installs (what I mean by that is that without the HDD the game experience retains its functionality, but sux cox, pardon my French).

And gamers are content to buy these HDDs as an upgrade. Or maybe even not just the HDD, but a new model of the same console altogether.

In any case, you don't have to worry about providing anything for free as long as a game can just barely function without the upgrade

The devs who feel the need (and 4-5 years into the new cycle boy will they feel the need) can take advantage of it, while as prices come down for the hardware, the manufacturer can start to include the upgrade in newer models as deafult.

Sound doable?

sebbbi · Aug 25, 2011

erick said:
The devs who feel the need (and 4-5 years into the new cycle boy will they feel the need) can take advantage of it, while as prices come down for the hardware, the manufacturer can start to include the upgrade in newer models as deafult.

Sound doable?

Doable yes, but not many devs are going to support a feature that only a fraction of the hardware base supports. Games are expensive to create, and the developers want to target as broad audience as possible. The problem with optional hardware accessories is that the developer cannot be sure that it exists. If you want to support it, you have to code two separate paths, one with it and one without. Most developers choose to support the most common denominator, instead of making separate versions for separate accessories. Xbox games cannot assume a hard drive exists, and this makes some ideas hard to implement, same with PS Move for example. If you design your game to be controlled with PS Move, you still have to design another control scheme for the standard Dualshock controller, if you want to sell the game to everyone. Often this results in compromises in both ways. With Wii you do not have to compromise, since everyone is guaranteed to have a Wiimote. Same with PS3 and it's hard drive. Everyone has one, so all games can be designed to take full advantage of it.

bkilian · Aug 25, 2011

Rangers said:
Isn't it half that pay for Live Gold, though? Not quite the same thing.

No. 55 million consoles, 35 million live accounts (That includes silver and gold) from the last released numbers. It means that some 20 million users have never connected a console to live. It's not exact, since some folks have multiple live accounts for a single console (4 in our household) and others have multiple consoles for a single Live ID, but I'd say the error is in favour of more users not having live than the other way around. (A single console easily supports multiple users, but doing multiple consoles with a single ID is a pain)

Tahir2 · Aug 25, 2011

2 Gigabytes of super fast GDDR5 memory would be pretty cool.
Would GDDR5 be suitable in a UMA setup or would it have to be standard DDR type memory? In which case we are only now beginning to see the development and ratification of DDR4.

qb2k5 · Aug 26, 2011

Could something like this be feasible for next gen? A GPU with 4GB of slower DDR3 ram but with extremely fast edram (20MB/350+GB/s) for AA, transparencies, and etc for 1080p/30fps?

I know this GPU is obviously slow. But my point is they could use slower ram on a midrange gpu from 2012 but make up the lack of bandwidth with fast edram.

Rangers · Aug 26, 2011

bkilian said:
No. 55 million consoles, 35 million live accounts (That includes silver and gold) from the last released numbers. It means that some 20 million users have never connected a console to live. It's not exact, since some folks have multiple live accounts for a single console (4 in our household) and others have multiple consoles for a single Live ID, but I'd say the error is in favour of more users not having live than the other way around. (A single console easily supports multiple users, but doing multiple consoles with a single ID is a pain)

I'd rather wonder what the PS3 connection stats are since that's free...

Yes I understand silver is free but I'd guess many dont bother when they know the good stuff is pay.

AlphaWolf · Aug 26, 2011

Rangers said:
I'd rather wonder what the PS3 connection stats are since that's free...

Yes I understand silver is free but I'd guess many dont bother when they know the good stuff is pay.

The PS3 number is massively inflated because of regional stuff. People often have several to get regionally released demos.

There's no downside to signing up for a silver account.

Predict: The Next Generation Console Tech

bkilian

Squilliam

Beyond3d isn't defined yet

Rangers

erick

jonabbey

function

None functional

erick

Shifty Geezer

uber-Troll!

sebbbi

liolio

Aquoiboniste

function

None functional

erick

Rangers

erick

sebbbi

bkilian

Tahir2

qb2k5

Rangers

AlphaWolf

Specious Misanthrope

Similar threads