Mind adding power usage increases to each of those three PS'es as well?
200 watts is what I'm expecting, which was the power budget for the PS3/XB360.
Running a 2048-ALU Tahiti GPU at 600MHz should come in under a 160 watt budget, leaving plenty for a CPU.
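A back-of-envelope check on that claim. The reference clock, board power, and scaling model here are my assumptions, not figures from this thread or any confirmed console part:

```python
# Down-clocked Tahiti sketch. Assumed (not from the thread): the desktop
# HD 7970 runs Tahiti at ~925 MHz and ~250 W board power, and dynamic
# power scales roughly as f * V^2, with voltage dropping in proportion
# to frequency at these clocks.

alus = 2048
clock_ghz = 0.6
flops_per_cycle = 2                      # one fused multiply-add per ALU

peak_gflops = alus * flops_per_cycle * clock_ghz
print(f"peak: {peak_gflops:.0f} SP GFLOPS")              # ~2458

ref_clock_ghz, ref_power_w = 0.925, 250.0
cubic = ref_power_w * (clock_ghz / ref_clock_ghz) ** 3   # f*V^2 with V ~ f
linear = ref_power_w * (clock_ghz / ref_clock_ghz)       # pessimistic: f only
print(f"rough power: {cubic:.0f}-{linear:.0f} W")        # ~68-162 W
```

Even the pessimistic linear estimate lands right around 160 W, and any voltage reduction from the lower clock pulls it well below.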
I hear Cell consumes less than 20W, iirc, and each SPE can be made to consume ~1W. As for floating point performance, Intel's latest, overclocked, appears to provide:
On my 3960X at 4.7GHz I'm getting about 160-163 GFLOPS with HT enabled. -mdzcpa, EVGA forums
Likely ~100W consumption for the Intel i7 part.
Cell is above ~200 GFLOPS.
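For reference, the "above 200 GFLOPS" figure falls out of the standard Cell arithmetic (theoretical single-precision peak, not a measured number):

```python
# Cell's theoretical SP peak: each SPE issues a 4-wide fused
# multiply-add per cycle, i.e. 8 FLOPs/cycle, at 3.2 GHz.
clock_ghz = 3.2
flops_per_cycle = 4 * 2
per_spe = clock_ghz * flops_per_cycle   # 25.6 GFLOPS per SPE
print(8 * per_spe)                      # 204.8 GFLOPS for 8 SPEs
# (PS3 games only get 6 SPEs, so the usable peak is lower.)
```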
Yes, but without the power usage figures for the initial release versions of the PS1/PS2 you can't really extrapolate linearly to what can be put inside the PS4. I'm fairly certain power usage went up considerably with each new generation compared to the previous one. 5B transistors at 200W won't be easy to manage, especially considering there will be other components needing power as well.
You're comparing measured with theoretical performance. I'm also uncertain whether you're comparing DP flops with SP flops. The 3960X has a theoretical throughput of 317 SP GFLOPS at its stock speed of 3.3GHz.
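The 317 figure is the usual Sandy Bridge peak arithmetic; a quick check, which also shows how far the measured number above sits from peak:

```python
# i7-3960X theoretical SP peak: 6 cores, AVX issuing one 8-wide SP add
# and one 8-wide SP multiply per cycle (16 FLOPs/cycle/core).
cores, flops_per_cycle = 6, 16
print(cores * flops_per_cycle * 3.3)   # 316.8 GFLOPS at stock 3.3 GHz
print(cores * flops_per_cycle * 4.7)   # 451.2 GFLOPS at the 4.7 GHz OC
# If the 160-163 GFLOPS quoted above was an SP result, that's only
# ~35% of the overclocked peak.
```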
That doesn't make it particularly better on the flops/watt measure, but then that's only comparing flops, which Cell's specialised design focuses on. In other areas the 3960X would utterly slaughter it. There's certainly no doubt that the PS3 would have been far more potent with one of those inside (albeit impossible).
I'm sure you wouldn't find many developers complaining about a 3960X being in a next-gen console if it didn't affect everything else (which it obviously would). It'd be interesting to hear thoughts on that compared with a scaled-up Cell of, say, 4 PPEs, 32 SPUs and 4x the memory. That'd make a pretty potent console CPU and still come in at a lower power and smaller footprint.
The lack of coherence between the Local Stores is probably seen as a disadvantage but once you start to scale Cell it'll turn out to be a big advantage.
Once you start adding in piles of cores coherent caches will become a major source of latency and power consumption. -ADEX, beyond3d
One of the larger design decisions in creating Cell was that there is a limit to the amount of cache you can use before you hit diminishing returns. Whereas not only is the [local store] SRAM predictable, it's infinitely scalable. -Terarrim, beyond3d
The memory wall: the processor frequency has now surpassed the speed of the DRAM and the current workaround of using multilevel caching leads to increased memory latency....
The slow main memory access on traditional x86 architectures creates a data flow bottleneck causing processor idle times. This results in much lower sustained performance than the theoretical peak of the CPU. To combat the bottleneck, state of the art processors have significant cache (L1, L2, L3), typically several megabytes on the processor chip. This uses up space that would otherwise be available to allow more transistors (and more processing power, as well as more heat). This “wasted” cache memory area is one explanation for why Moore’s law no longer translates into equivalent performance increases.-link
IIRC, the PS2 used a .25 micron (250nm) process on some of its components, similar to the Dreamcast. The components were pretty large and caused shortages, but were quickly migrated to a .18 micron process.
In the real world, the SPEs can on some tasks get 98% of theoretical performance. The stream processor goes around the memory wall that can't be overcome with ever more cache, which may explain why even the overclocked i7's benchmark results seem so low.
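The "going around the memory wall" part is software-managed streaming: double-buffer the data so the next block is in flight while the current one is being crunched. A toy sketch of the pattern, with Python threads standing in for DMA (real SPE code uses explicit mfc_get transfers into the 256 KB local store):

```python
# Double-buffered streaming: overlap "transfers" with compute so the
# ALUs never wait on memory. Illustrative only; block size and workload
# are arbitrary stand-ins.
from concurrent.futures import ThreadPoolExecutor

def fetch(block_id):
    # Stands in for a DMA transfer into local store.
    return [float(i) for i in range(block_id * 1024, (block_id + 1) * 1024)]

def compute(block):
    # Stands in for SIMD work on data already in local store.
    return sum(x * x for x in block)

total = 0.0
with ThreadPoolExecutor(max_workers=1) as dma:
    pending = dma.submit(fetch, 0)          # prime the first buffer
    for nxt in range(1, 8):
        block = pending.result()            # wait for the buffer to land
        pending = dma.submit(fetch, nxt)    # start the next transfer...
        total += compute(block)             # ...while crunching this one
    total += compute(pending.result())      # drain the last block
print(total)
```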
For general purpose computation 2-3 nextgen general cores should more than suffice for most tasks. We want additional performance for things like physics. Or is there any use for 6-8 general cores in a console?
13 SPEs would exceed the theoretical i7 performance (13 × 25.6 = 332.8 GFLOPS vs. 316.8).
Latency: more bandwidth, but more latency. Though if coded in a similar fashion it probably would reach closer to peak; we'd need someone with expertise in the area to say how close.

Or much more likely the benchmarks the i7 is running aren't hand-crafted specifically for its architecture like the ones that extract "98%" of the theoretical performance out of Cell. Hand-crafted code for Sandy Bridge would achieve a far higher percentage of the theoretical maximum than the examples you are using. On the other hand, non-hand-crafted code on Cell would likely fare far worse than on an i7 in terms of percentage of maximum.
They can handle it, but if you have to code to fit your data [into local store memory] as if it were an SPE for it to reach near theoretical, and the hardware is going to take more space and consume more energy for less performance... what's the point?

The word general should give a clue to that. What is it about those cores that you think can't handle physics? Or AI, or general game code?
Latency is probably worse compared to XDR, though given the predictability, with a similar coding style you could probably get around it (but if latency is part of the reason XDR is used even with Cell, then it may affect how close one can come to peak even with similar approaches... someone with expertise in the area can probably clarify the issue).

I'm not quite sure why an IBM explanation of why their processor is so much better than x86 should be taken as a reliable source, but regardless, the 3960X has double the bandwidth to main memory that Cell does, so if bandwidth is the basis for your reasoning that it can achieve a higher percentage of its theoretical throughput then... well...
Theoretical floating point performance only. It's not the only measure of performance, believe it or not.
Generations are typically defined by a performance step clearly above the previous generation.
Generally speaking, this is roughly 8-10x the transistor count to get roughly 8-10x performance.
Applying this to the PS4 over the PS3 we get the following (a quick ratio check follows the list):
Transistor Count:
2000 PS2 ~56m
2006 PS3 ~534m
2012 PS4 ~5,000m
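Checking those ratios (bearing in mind the 5,000m PS4 figure is this thread's projection, not a known spec):

```python
# Per-generation transistor scaling implied by the list above.
counts = [("PS2 (2000)", 56e6),
          ("PS3 (2006)", 534e6),
          ("PS4 (2012, projected)", 5000e6)]
for (prev, n0), (cur, n1) in zip(counts, counts[1:]):
    print(f"{cur} / {prev}: {n1 / n0:.1f}x")
# PS3/PS2: 9.5x; projected PS4/PS3: 9.4x -- both inside the 8-10x band.
```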
The other measure is die size: roughly 500mm², fitting as many transistors as possible into that budget on the latest process (at this moment 28nm).
FYI-
Tahiti (352mm², 4,300 million transistors) has 2048 ALUs.
This still leaves ~150mm², or ~1,000m transistors, for the CPU.
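The ~1,000m figure implies CPU logic packing noticeably less densely than Tahiti's GPU logic. A sketch of that arithmetic, where the 0.55 density factor is my assumption to make the numbers meet, not anything from the thread:

```python
# Die-area budget for a CPU alongside a Tahiti-class GPU.
die_budget_mm2 = 500.0
gpu_area_mm2, gpu_mtrans = 352.0, 4300.0

cpu_area = die_budget_mm2 - gpu_area_mm2      # ~148 mm^2
gpu_density = gpu_mtrans / gpu_area_mm2       # ~12.2 Mtransistors/mm^2
cpu_density_factor = 0.55                     # assumed: CPU logic is sparser
print(cpu_area * gpu_density * cpu_density_factor)   # ~994, i.e. ~1,000m
```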
If I'm disappointed, it will be due to Sony/MS following Nintendo's gimped hardware routine with a gimmick accessory. Not because my expectations are out of line.
I assume for PSN software compatibility's sake (necessary to transfer purchases) Cell's performance has to be matched or exceeded, which likely calls for more than 3 cores. In all likelihood we can expect at least 6 SPEs, the easiest solution, as well as 3 general cores to allow easy code portability with the other consoles (I assume the 7th, separated SPE's functions can be transferred to a regular core).
If Cell had been paired with a substantially powerful GPU like it was back at E3 2005 (dev kit), rather than hamstrung by a limited GPU as it is, realtime cloth as seen in the FFVII demo would be possible. You also have to recall that many a programmer has commented that in many cases algorithmic advances have matched or exceeded the performance gained from hardware advances under Moore's law. Given that realtime cloth simulation was in its infancy last decade, the performance available from algorithm advances has likely not been exhausted.
Algorithm design advances are likely to continue after the introduction of new consoles, too. If sufficient performance is there, an algorithmic advance can bring things from the realm of impractical non-realtime into practical realtime use.
The vibe I get on this board these days is that every dev and person associated with the industry is expecting less potent hardware and better software platforms; the future isn't seen in hardware. Whereas many others are still expecting the console business to carry on as it has done the last four generations without change, and they want the console companies to lose massive amounts of money on hardware. The business side of picking next-gen tech does not point to awesome hardware, IMO.
I'm confused as to how that relates to my post. -bgassassin
I'm under the impression the Wii U will be underpowered because the box they showed off was quite small and it has a tablet as a peripheral.
We can assume a next-gen PS4 to have at least 3 cores similar to the Wii U's, plus 6 SPEs. If those cores each have less performance than 2 SPEs, then we can assume at least a 2X difference in CPU performance at some tasks, given such assumptions.
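Spelling that arithmetic out under the same assumptions (hypothetical configs; the per-core figure is just "a bit under 2 SPEs" and the 25.6 GFLOPS SPE peak is theoretical):

```python
# Rough CPU comparison under the post's assumptions.
spe = 25.6                      # SP GFLOPS per SPE (theoretical peak)
core = 2 * spe * 0.95           # assume each general core tops out below 2 SPEs
wiiu_like = 3 * core            # 3 general cores only
ps4_guess = 3 * core + 6 * spe  # the same 3 cores plus 6 SPEs
print(ps4_guess / wiiu_like)    # ~2.05x on SPE-friendly workloads
```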
The CPU/GPU are more than likely going to be at the back of the case near the rear fan, like with the Wii, meaning most of the heat would be pulled out almost immediately. The added length should give the console more breathing space, so whatever internal heat remains would dissipate rather easily, since there won't be an internal HDD and Nintendo is more than likely using a slim drive like before.
None of what you said means much without a base for comparison. So what if the processors are towards the back? The fan is going to be tiny. That doesn't mean you're suddenly capable of dissipating 100W of heat. I'm not sure what sort of cooling solution you're thinking of, but it's certainly not going to be lightweight if you're expecting a lot, and definitely not quiet.
I don't think a base for comparison is necessary. Expecting a lot is subjective because I don't think what I'm expecting would be considered a lot, however we already have an indication that it's not lightweight. Also I believe that the fan would be adequate as long as the proximity is close enough to pull the heat away.
I think another question would be the power supply. When the 360 launched, despite being a relatively large console, it had what I assume was an external power supply, no? And it was very, very big, attached by a thick cable.
Heat dissipation and power requirements have only gotten worse since then, implying the need for lots of power to attain modern performance, which is why many fear next-gen consoles might not be able to provide a jump like the one given by the 360/PS3.
Depending on Nintendo's willingness to incur costs the power supply might be built in, and if that's the case, depending on unit size, it could pose a performance limitation.
The launch 360 also had a power draw of 160-180W. That's way more than I'm expecting from Wii U. Also, both the launch 360 and the Slim are designed to accommodate a bulky optical drive and an internal HDD. So including the PSU, that's three things that Wii U most likely will not have to accommodate. I can't remember the last time Nintendo used an internal power supply. That won't be changing. You can look at the console and see that.
Huh? 360 doesn't have an internal PSU.
Also, the pre-slim 360 didn't have an internal hard drive; it was kind of slapped in a caddy on the side. And Wii U will have an optical drive too, if anything one with more size constraints, seeing as it's Blu-ray based (how many tiny Blu-ray players do you see on store shelves?).
As long as the HDD is inside the console it's internal and the case has to be designed accordingly.