Predict: The Next Generation Console Tech

Top performing 8 core Xeon server CPUs are a long way from that kind of peak performance (even when running pure AVX vectorized code). Not even Haswell is expected to reach it. So I wouldn't hold out much hope for that rumor (no matter how the flops are calculated) :)

Also, IBM's 16 core (64 thread) supercomputer CPU (PPC A2) has a peak of 204.8 GFLOP/s. Having six times more CPU power in a console would surely be fun... but it's highly unlikely.

What if it's an APU and therefore includes the capabilities of the GPU elements in the FLOP number?
 
Top performing 8 core Xeon server CPUs are a long way from that kind of peak performance (even when running pure AVX vectorized code). Not even Haswell is expected to reach it. So I wouldn't hold out much hope for that rumor (no matter how the flops are calculated) :)

Also, IBM's 16 core (64 thread) supercomputer CPU (PPC A2) has a peak of 204.8 GFLOP/s. Having six times more CPU power in a console would surely be fun... but it's highly unlikely.

Aren't you confusing DP and SP GFLOP/s here though? At single precision, the Cell processor is already at 204.8 GFLOP/s, so six times that isn't that far-fetched. Double precision is significantly harder, though IBM's double precision version of the Cell did around 100 GFLOP/s with 8 cores in 2008, and if I remember correctly, it wasn't actually that big of a change.

Not saying that's what we'll get, but single precision should be doable.
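(For reference, the 204.8 GFLOP/s figure falls straight out of the peak arithmetic: 8 SPEs × 8 single precision flops per cycle (4-wide fused multiply-add) × 3.2 GHz = 204.8 GFLOP/s.)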
 
Aren't you confusing DP and SP GFLOP/s here though?
No. I am talking about single precision flops.

Some GFLOP/s listings of Intel CPUs (peak synthetic) on an overclocking site:
http://www.overclock.net/t/947312/how-many-gflops-does-your-processor-have

White paper on the Sandy Bridge E based E5-2680 Xeon (benchmark):
http://research.colfaxinternational.com/file.axd?file=2012/4/Colfax_FLOPS.pdf

Peak rates (8 core / 16 thread Xeon):
193 GFLOP/s 32 bit float addition (AVX)
183 GFLOP/s 32 bit float multiply (AVX)

Sandy Bridge can do an AVX add and multiply simultaneously (different ports). Basically that should get you close to 400 GFLOP/s peak. But that's still a long way from 1.2 TFLOP/s.
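As a back-of-envelope check (assuming the E5-2680's 2.7 GHz base clock, 256-bit AVX, and both the add and multiply ports kept busy every cycle):

Code:
/* Hypothetical peak SP FLOP/s estimate for an 8 core Sandy Bridge E Xeon. */
#include <stdio.h>

int main(void)
{
    double cores     = 8.0;
    double lanes     = 8.0;  /* 256-bit AVX / 32-bit floats         */
    double ops_cycle = 2.0;  /* one add + one mul issued per cycle  */
    double clock_ghz = 2.7;  /* E5-2680 base clock; turbo is higher */

    /* 8 * 8 * 2 * 2.7 = 345.6 GFLOP/s; at ~3.1 GHz all-core turbo it's ~397 */
    printf("peak: %.1f GFLOP/s\n", cores * lanes * ops_cycle * clock_ghz);
    return 0;
}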
 
Basically, the guy claimed the devs were not too happy, as the Xbox 3 was going to be considerably more powerful than the PS4, with an 8 core 64 bit 1.2 TFLOP CPU and 4 GB of RAM. He also said, well before vgleaks, that the PS4 was going to have a 32 bit 4 core processor and 2 GB of RAM.

Is there a reason given for why they think PS4 would have a 32-bit CPU? This is not only going out of their way to hobble the architecture, but a regression from the PS3.
 
Intel's forthcoming 50+ core Xeon Phi (codenamed Knights Corner, the successor to Larrabee) pushes around 700-800 GFLOP/s (double precision) in LINPACK, and consumes around 300 watts. This figure makes the 1.2 TFLOP/s console CPU seem even more absurd...

Quote:
"But we won't have to wait for November to hear about Linpack running on MIC machines. According to Intel's Rajeeb Hazra, Intel's GM of the Technical Computing group, they've been running the High Performance Linpack (HPL) benchmark on pre-production parts and have been able to achieve one teraflop on a single node equipped with a Knights Corner chip. That teraflop, by the way, is provided by the Knights Corner card plus the two Xeon E5 host CPUs, so the MIC chip itself is likely delivering something in the neighborhood of 700 to 800 gigaflops."
 
Am I the only one around here who likes the idea of the Cell??

Am I right to think that CPU SIMD is good because it offers instant performance? I'm sure in 2013, on 28nm, it would be very possible to build something like a dual core Power7-class PPE with 4x SMT and something like 12-16 SPEs at around 3.2 GHz... I bet that could crack 1 teraflop quite nicely and be just awesome for AI and physics calculations...
Link that to something like 4 MB of L2 and some eDRAM and we have a winner.
Seems like a genius idea wasted, IMO.
 
Intel's forthcoming 50+ core Xeon Phi (codenamed Knights Corner, the successor to Larrabee) pushes around 700-800 GFLOP/s (double precision) in LINPACK, and consumes around 300 watts. This figure makes the 1.2 TFLOP/s console CPU seem even more absurd...

Quote:
"But we won't have to wait for November to hear about Linpack running on MIC machines. According to Intel's Rajeeb Hazra, Intel's GM of the Technical Computing group, they've been running the High Performance Linpack (HPL) benchmark on pre-production parts and have been able to achieve one teraflop on a single node equipped with a Knights Corner chip. That teraflop, by the way, is provided by the Knights Corner card plus the two Xeon E5 host CPUs, so the MIC chip itself is likely delivering something in the neighborhood of 700 to 800 gigaflops."

Yeah, but again, that's double precision. I'm not sure we'll see next-gen consoles focus on DP; even on the CPU, I don't think they really need it.

@french toast: there are a lot of important ideas in the Cell, but it was too hard to program for, that much is clear. Multiplatform performance will be a stronger focus for Sony next-gen, I'm sure, as will (partly to achieve that) ease of development and flexibility.

It's hard to guess what we'll end up getting, but it's clear that being closer to PC architecture is a big advantage up-front.
 
Is there a reason given for why they think PS4 would have a 32-bit CPU? This is not only going out of their way to hobble the architecture, but a regression from the PS3.

If a 4 GB address space is enough, then a 32-bit CPU is pretty harmless, unless you want to crunch arbitrary precision integers or do unaccelerated crypto for some reason. 32-bit can even be faster than 64-bit because of smaller pointers in the L1 caches. (On PC, 64-bit can be faster overall because of an unrelated increase in the number of registers.)
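A quick illustration of the cache footprint point, using a made-up pointer-heavy node type:

Code:
/* On a 32-bit build each pointer is 4 bytes, on a 64-bit build it is 8, so
   linked structures roughly double in size and fewer nodes fit per cache line. */
#include <stdio.h>

struct node {
    struct node *next;
    struct node *prev;
    int          value;
};

int main(void)
{
    printf("sizeof(void *)      = %zu\n", sizeof(void *));      /* 4 vs 8   */
    printf("sizeof(struct node) = %zu\n", sizeof(struct node)); /* 12 vs 24 */
    return 0;
}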

The Nintendo 64 had a 64-bit CPU... but it was possibly only ever exploited in 32-bit mode :)

That said, I remember that you would be able to map flash or other storage into the memory address space, if you have a 64-bit CPU and run it in 64-bit mode. So accessing a flash cache, flash mass storage, the HDD etc. could be somewhat easier and lower latency.
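Something in this spirit (a plain POSIX mmap sketch; the file name is made up, and whether a console OS would expose storage this way is pure assumption):

Code:
/* Map a large asset file straight into the address space and read it through
   pointers; pages are faulted in on demand instead of via explicit I/O calls.
   Multi-GB mappings like this are only comfortable with a 64-bit address space. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    int fd = open("level.pak", O_RDONLY);   /* hypothetical asset file */
    if (fd < 0) return 1;

    struct stat st;
    fstat(fd, &st);

    const unsigned char *data =
        mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (data == MAP_FAILED) { close(fd); return 1; }

    printf("first byte: %u\n", (unsigned)data[0]);

    munmap((void *)data, st.st_size);
    close(fd);
    return 0;
}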
 
Am I the only one around here who likes the idea of the Cell??

Am I right to think that CPU SIMD is good because it offers instant performance? I'm sure in 2013, on 28nm, it would be very possible to build something like a dual core Power7-class PPE with 4x SMT and something like 12-16 SPEs at around 3.2 GHz... I bet that could crack 1 teraflop quite nicely and be just awesome for AI and physics calculations...
Link that to something like 4 MB of L2 and some eDRAM and we have a winner.
Seems like a genius idea wasted, IMO.

I do too, but I think GPGPU eats its lunch, and having an APU with a powerful CPU and GPU plus fast memory will reach the same goals that Cell was trying to reach.
 
be just horrible for AI calculations...

Fixed that for you. The Cell is awesome at physics, but for what passes as AI in the present games industry, it's really just awful. AI presently means mountains of script that like to look at half the contents of memory, a lot of branching (as in, a branch every 4 instructions, tops), and no SIMD. That's the absolute worst possible load I can think of for the SPE. I'd be willing to bet that a single modern Intel core would beat those 12-16 SPEs at running complex, jumpy script with a huge number of loads. Probably by a large margin.

Some game companies have jerry-rigged the SPEs for AI use, because that's what the PS3 has, but they really are bad at the task. Most devs steer clear.

For actual real AI (neural networks and the like), SPEs would be pretty reasonable. However, that is not in any way relevant to game "AI" -- the field is still young, and if someone were able to make real AIs that were as good as pre-defined scripts at the kinds of tasks games do, they would be able to make really absurd amounts of money outside the gaming industry.
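To make the "branchy script" point concrete, here is a purely hypothetical sketch of the kind of interpreter-style dispatch loop game scripting tends to boil down to; it is all unpredictable branches and scattered loads, with nothing for a SIMD unit to chew on:

Code:
/* Hypothetical script interpreter loop: data-dependent branches every few
   instructions and pointer-chasing loads, no vectorizable work. A good fit for
   a big out-of-order core with branch prediction and caches; a poor fit for an
   SPE's in-order pipeline and local store + DMA model. */
struct agent { int state; float dist; };

enum op { OP_IF_NEAR, OP_SET_STATE, OP_GOTO, OP_END };

int run_script(const int *code, struct agent *a)
{
    int pc = 0;
    for (;;) {
        switch (code[pc]) {            /* unpredictable, data-dependent branch */
        case OP_IF_NEAR:               /* conditional jump on agent distance   */
            pc = (a->dist < (float)code[pc + 1]) ? code[pc + 2] : pc + 3;
            break;
        case OP_SET_STATE:
            a->state = code[pc + 1];
            pc += 2;
            break;
        case OP_GOTO:
            pc = code[pc + 1];
            break;
        default:                       /* OP_END or unknown opcode */
            return a->state;
        }
    }
}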
 
If a 4 GB address space is enough, then a 32-bit CPU is pretty harmless,
There are some advantages to having the extra breathing room in the virtual address space, even if the physical memory does not match, as you've indicated.
One thing to consider is that a 4 GB console may have a 4+ GB development machine, and the continuity between the two could be helpful.

My primary point is that most cores Sony has to choose from, be they x86 or PPC, have 64 bit capability by default. This was the case for Cell, so some extra clarification is needed to explain the claim that Sony's chip won't have what was common even last gen.
 
Am I the only one around here who likes the idea of the Cell??

Am I right to think that CPU SIMD is good because it offers instant performance? I'm sure in 2013, on 28nm, it would be very possible to build something like a dual core Power7-class PPE with 4x SMT and something like 12-16 SPEs at around 3.2 GHz... I bet that could crack 1 teraflop quite nicely and be just awesome for AI and physics calculations...
Link that to something like 4 MB of L2 and some eDRAM and we have a winner.
Seems like a genius idea wasted, IMO.

Cell was an interesting way to "cheat" and get more processing power than you could otherwise afford. But with so much transistor budget available nowadays, maybe you could do "heterogeneous computing, but not quite" instead.

Have a few "fat" cores and some more "slim" ones, all running the same arch (be it x86, PPC or other), with maybe some extra vector units for the slim cores.
The fat cores are great for game logic, AI, and anything which needs high single-threaded performance and robustness; the slim cores are throughput oriented and play the gigaflops game, with many, many paper flops, like e.g. the Xenon cores.

This would be like pairing an Intel CPU with a Xeon Phi, somehow. Xeon Phi uses a ring bus, btw. A putative Intel chip with a CPU + Phi on the same die would look like a huge Cell, but easier to program for.
 
What if someone were to take the ideas of Cell, APU and GPGPU and just make one chip with about 12 CPU cores and the GPU cores all clocked at about 1.2 GHz?
 
What if someone were to take the ideas of Cell, APU and GPGPU and just make one chip with about 12 CPU cores and the GPU cores all clocked at about 1.2 GHz?

Polyphony would be implementing raytracing, Naughty Dog would be creating FF: Spirits Within-quality graphics, and Skyrim and Rockstar would be complaining they can't get their ports running at 30 fps, let alone maintain resolution parity with other consoles.

Amazing hardware will not always lead to amazing games, or amazing ports in the case of the PS3.
 
Polyphony would be implementing raytracing, Naughty Dog would be creating FF: Spirits Within-quality graphics, and Skyrim and Rockstar would be complaining they can't get their ports running at 30 fps, let alone maintain resolution parity with other consoles.

Amazing hardware will not always lead to amazing games, or amazing ports in the case of the PS3.

Oh, right. I forgot 3rd parties are incompetents that can't deal with hardware. (Sarcasm)
 
Oh, right. I forgot 3rd parties are incompetents that can't deal with hardware.

Incompetent is a strong word. Games cost serious money to make. Every developer hour spent fighting against the quirks of the HW is a dev hour not spent on gameplay. Some companies have real rockstar devs who get the last flop out of even the most bizarre of architectures, but not all are so blessed. And it makes you wonder, if the rockstar devs weren't doing long hours planning migrations of memory between the different pools, what else could they be working on?
 
And it makes you wonder, if the rockstar devs weren't doing long hours planning migrations of memory between the different pools, what else could they be working on?

Is that even the case, though?

Sure, it seemed like it for a company that was required to develop a platform exclusive, such as Haze's developer, who had difficulty with their tools and getting builds up and running, but do we have any other examples of platform difficulty actually affecting designers?

It just seems to me that the complexity of one platform isn't going to stop designers who aren't working specifically on that platform.

If it doesn't run well, then it doesn't run well.
 
Is that even the case, though?

Sure, it seemed like it for a company that was required to develop a platform exclusive, such as Haze's developer, who had difficulty with their tools and getting builds up and running, but do we have any other examples of platform difficulty actually affecting designers?

It just seems to me that the complexity of one platform isn't going to stop designers who aren't working specifically on that platform.

If it doesn't run well, then it doesn't run well.

Doesn't really work that way.
Engineering will spend time optimizing things; that time could be spent elsewhere.
This is the same argument I've always used for GC for the bulk of game code, though I'd relax that position if you have a team of stars, and very few of those exist.

FWIW, engineering rarely affects design anymore, other than what you can build given the time constraints.
 
Well, in the case of Skyrim and Rockstar: they can't even get their games running flawlessly on their main platform; I am talking about game-breaking bugs, save game corruption, and so on.
I think it's a shame if next generation hardware is created with the aim of making those devs' work easier, instead of giving really skilled developers hardware that they can use to create graphics and games even "beyond the specs".

Next generation consoles need to last for at least half a decade. It would be a big waste if they catered to 'lesser' developers, IMO.
Even in the case of the 360: if Bethesda had its way, the 360 would have shipped with an x86 dual core Pentium D, 1 GB of RAM, no eDRAM and a built-in hard drive, and Skyrim would still run like shit.
Plus Halo 4 or Gears of War 3 would probably never have been possible.

So in my opinion, MS and Sony should not hold back when designing their hardware.
Real devs will step up and fill the gap, leaving incapable, complaining devs behind.
 