Gabe Newell on NextGen systems

I doubt he'd skip next-gen since multi-core is the general direction that PCs are headed as well. And it's not like Valve would dissolve over a matter of "it's too hard." More than anything else, the geek in him would be far too eager to face the challenge.
 
Fox5 said:
In that case neither can the Athlon 64. If we just look at the FLOPS, you'd need something like a 16-core Opteron system before you surpass the XeCPU in FLOPS (which I believe achieves around 116 GFLOPS in measured performance, while wasn't there just a post about Cell achieving around 40 GFLOPS in real-world scenarios?)

Wait a minute, what do you mean by measured performance on the XeCPU? There have been no measurements or benchmarks that I'm aware of, and that figure is basically the processor's theoretical max, which is a total non-starter. So we really can't make any claims as to the XeCPU's real-world performance yet, only the expected performance.

As for Cell, it did indeed reach ~40 GFLOPS of performance - that being in a large FFT scenario in which an Intel Xeon normalized to 3.2 GHz scored ~0.5 GFLOPS. Now, in another test (from the same Barcelona presentation) the SPEs were able to reach 19 GFLOPS of performance all on their own.

So I think that quoting performance numbers is all well and good, but the context of the situation must be provided as well.
 
Oh for God's sake. Who gives a DAMN about FLOPS! What the hell does it have to do with actual real-world performance? OMG. Why do you console nuts fall for the same friggin hype each gen!?!?!?

FLOPS. Floating Point Operations Per Second. What operations? What situations? How optimized? How complex? What else is going on? I can count theoretical performance too. Oh, those wonderful FLOPS-spouting fools don't need to worry about writing actual code to quote what the chip can theoretically do. Let's see anyone in a million years get the max theoretical performance out of ANYTHING.
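Here's roughly how that theoretical "counting" goes - a purely illustrative sketch using the commonly quoted Cell SPE numbers, not a measurement of anything:

```python
# Back-of-the-envelope peak-FLOPS "counting" (illustration only, not a measurement).
# Peak = clock x number of FP units x SIMD width x FLOPs per fused multiply-add.

clock_ghz     = 3.2   # commonly quoted Cell clock
spes          = 8     # SPEs on the die (fewer are actually usable in PS3)
simd_lanes    = 4     # single-precision lanes per SPE
flops_per_fma = 2     # a multiply-add counts as two floating-point ops

peak_gflops = clock_ghz * spes * simd_lanes * flops_per_fma
print(f"SPE-only theoretical peak: {peak_gflops:.1f} GFLOPS")   # ~204.8

# Against the ~40 GFLOPS large-FFT result quoted earlier in the thread,
# even a workload the chip is good at lands at roughly a fifth of that peak.
print(f"Fraction of peak at 40 GFLOPS: {40.0 / peak_gflops:.0%}")
```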

Let's think about the N64 for a moment. It's a good example of how nonsensical FLOPS can be. We have an R4300i CPU with little cache hooked up to a unified RDRAM architecture. Oh gosh, the RDRAM has latencies from the depths of hell. So the CPU can't even remotely reach optimal efficiency because it can't get data in a reasonable amount of time. That's one situation where a CPU will NEVER, EVER reach its theoretical performance. There are more factors involved than just adding up numbers from each separate part; they all have to work together to achieve an overall result. FLOPS on its own is useless. It's the same logic people use when they say 3DMark is useless: they're synthetic benchmarks, and they don't necessarily have anything to do with a real application.
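To put a toy number on why latency kills the paper figures - a basic stall model with made-up (but plausible) figures for a small-cache CPU on slow memory:

```python
# Toy stall model with assumed numbers, just to show how memory latency swamps
# peak throughput: effective CPI = base CPI + mem refs/instr x miss rate x miss penalty.

base_cpi      = 1.0    # assume the core could retire one instruction per cycle if fed
mem_per_instr = 0.3    # assumed fraction of instructions that touch memory
miss_rate     = 0.10   # assumed miss rate for a small cache
miss_penalty  = 60     # assumed cycles to high-latency main memory

effective_cpi = base_cpi + mem_per_instr * miss_rate * miss_penalty
print(f"Effective CPI: {effective_cpi:.1f}")                           # 2.8
print(f"Fraction of peak throughput: {base_cpi / effective_cpi:.0%}")  # ~36%
```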

And then we get extra-bullshitty FLOPS ratings for GPUs too.

The gist of the next generation is this: they developed absolutely awesome GPUs with few corners cut. They couldn't do the same for their CPUs, so they went a risky new route with untested technology. I originally thought the Xbox 360 was guaranteed to dust PS3's Cell, until I heard that the PowerPC in there is in-order. So I'd put them both on the same level. Both are going to be a MAJOR bitch to program for. Their IPC (instructions per clock), or processing efficiency, is going to be extraordinarily hard to get anywhere near its potential. Think P4's NetBurst but FAR worse. The Athlon 64 is a dream of modern processing efficiency; these console chips are not in the same league.

Whether the CPU will be fast enough is actually a bit moot in the end, I guess. Since console platforms are closed, we will not be running benchmarks to compare these CPUs to anything. If a developer targets 60fps, he'll hit 60fps. So the games will feel fast, while the graphics will be reduced. How will you know they've been reduced? You won't, because they'll still look damn good compared to what's out now. But you can be DAMN sure they will be reduced in complexity, especially in the 1st and 2nd gen of games. Like I said before, I STRONGLY feel these consoles will be like top-notch Athlon XPs or low-end A64s hooked up to an amazing GPU.

It makes sense economically since even console hardware developers don't live in a fantasy world. And unless they wanted to design $1000 hardware, they had to cut corners somewhere. Too bad they boast such utter bullshit that is damn near impossible to hit.
 
xbdestroya said:
As for Cell, it did indeed reach ~40 GFLOPS of performance - that being in a large FFT scenario in which an Intel Xeon normalized to 3.2 GHz scored ~0.5 GFLOPS. Now, in another test (from the same Barcelona presentation) the SPEs were able to reach 19 GFLOPS of performance all on their own.

It should be further clarified that the latter figure was for one SPE on its own.
 
@Swaaye: The whole FLOPS thing aside, I think your logic towards the end of your post becomes a little muddled. What do '$1000 hardware' and 'top-end XP or low-end A64' have to do with anything? A chip on the order of a 2.4 GHz A64 would certainly have been cheaper to implement in silicon than Cell (speaking in terms of manufacturing costs), so I just don't see how something like Cell was taking the easy/inexpensive way out. No - rather, this chip is more expensive and required a significant R&D effort to boot. So if you're basing your logic for real-world performance relative to the x86 processors we know and love on cost structures, I think you'll need to look elsewhere.

Remember that just because you're paying hundreds of dollars for that 200 MHz speed increase between A64 processors doesn't mean it's costing a penny more to manufacture. Only towards the top-top end does speed become a serious fabbing consideration (for the A64s above 2.4 GHz).
 
Titanio said:
xbdestroya said:
As for Cell, it did indeed reach ~40 GFLOPS of performance - that being in a large FFT scenario in which an Intel Xeon normalized to 3.2 GHz scored ~0.5 GFLOPS. Now, in another test (from the same Barcelona presentation) the SPEs were able to reach 19 GFLOPS of performance all on their own.

It should be further clarified that the latter figure was for one SPE on its own.

Hey now, I did clarify! :p

But I guess I can see how my sentence could have been interpreted as referring to the SPEs as a collective group. But indeed, that was just for a single SPE. 8)
 
I find it kinda ironic that the CELL benchmarks clearly show huge GFLOPS differences between different types of problems, yet people here put so much faith in GFLOPS without even knowing how it rates on game workloads.
 
xbdestroya said:
@Swaaye: The whole FLOPS thing aside, I think your logic towards the end of your post becomes a little muddled. What do '$1000 hardware' and 'top-end XP or low-end A64' have to do with anything?

If they'd used an x86 chip they'd have had to buy tech from another company. I get the vibe that that's not a good thing when you're dealing with Intel... Nobody's ever given AMD a chance, though. Companies like to own and control IP, and they can't do that if they license from an x86 manufacturer.

But primarily, they wouldn't be able to boast about 1) "shocking" next-gen tech and 2) ridiculous performance claims. If this tech hits PC CPUs it will radically change, and it will surely be at the normal pricing level for PC CPUs. Heh, well, it already has. P4-D and A64X2 probably annihilate these console chips bar none. But obviously those aren't viable for a console. I'm not even slightly convinced that, say, an Athlon XP, tailored a bit differently than the PC version, wouldn't have made a fantastic console CPU. And yes a Sempron K7 surely is very cheap. A K7 is a hell of a lot faster than the Celeron in the Xbox... and if they could code for it on a closed platform, I have no doubt it would be adequate to pair with these GPUs.

I hope you see just what hype will do for them come launch, and what it historically has done for the sales of EVERY major console. Over-the-top performance claims are what make or break consoles. I think multi-core in-order CPUs were a business decision; surely no sane developer would have asked for them.
 
swaaye said:
And yes a Sempron K7 surely is very cheap.

Yes it is - but my point is that an A64 3500+ is not that much more expensive to manufacture than that very same Socket A Sempron. The Thoroughbred core has a die area of 80mm^2 to the Venice core's 84mm^2. Produced on 300mm wafers, the A64 probably actually costs less.

And here's Cell with something like 232mm^2...
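Just to put rough numbers on it, here's a ballpark sketch using the die sizes above; it ignores edge loss and defect yield, both of which punish the bigger die even harder:

```python
# Ballpark dies-per-wafer comparison using the die areas quoted above
# (no edge-loss or yield modelling - illustration only).
import math

wafer_area = math.pi * (300 / 2) ** 2   # 300 mm wafer, ~70,700 mm^2

for name, die_mm2 in [("Thoroughbred K7", 80), ("Venice A64", 84), ("Cell", 232)]:
    print(f"{name:>15}: ~{int(wafer_area // die_mm2)} candidate dies per wafer")
```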

While we're on the topic though, I wholly disagree with the notion that performance claims are what 'makes or breaks' a console.
 
On Sony's part it was also an investment in their home electronics division.

The R&D for CELL was not all for the sake of PS3, but an attempt to create a piece of hardware that they could sell to competitors, earn royalties from, and also use in their own electronics with dirt-cheap manufacturing costs.

Developing Cell for PS3 is sort of killing a few birds with one stone. They are doing the EXACT same thing with Blu-ray: a proprietary technology they hope to make royalties from and sell to other competitors.

In both cases they are using their biggest asset, PS3, as the delivery mechanism for both CELL and Blu-ray.

I don't know how anyone can argue the CELL in the PS3 wasn't a business decision; it had much more to do with making Sony money in the long run than it did with creating a good console CPU.

MS, on the other hand, just went cheap, probably because they got somewhat screwed by Intel last time around and were determined to make a profit this time. They probably knew full well that the CELL was a very limited console CPU, so they went with a similarly "cost-effective" solution.
 
swaaye, what the heck is this rant about? did anybody ever claim in-order cpus were a hands-down technological advancement over oooe desktop parts? it's always been a matter of transistor budgets/viability. proprietary ip has nothing to do with that (which of the next-gen cpus exactly is a console-vendor solely-owned design?). at least try to get your arguments straight. theoretical performance figures are just that, and nobody in their right mind claimed any of the next-gen cpus' performance would average at those (of course vendors will quote their maximum performance figures, and that's ok as long as it's made clear what those figures signify). basically what you've been trying to say is that you can't see a reason why the console vendors didn't go with a desktop part like amd64 - and that opinion is ok, yet none of your arguments manages to address it consistently.
 
Scooby, I don't disagree with you, and I'll be the first to say that Cell was a business decision. BUT it was also a processor designed to kick ass in a console - not just to replace the various chips in Sony's consumer electronics.

Anyone who thinks performance within the PS3 was not first and foremost on Kutaragi's mind every time he received a status report on the chip's progress is, I think, overly eager to believe a scenario in which Cell was shoe-horned into the system rather than designed with that system in mind.

To sum up my thoughts on Cell:

An expensive gamble of a chip with the purpose of:

1) replacing a number of chips within Sony's various consumer electronics.

2) forming the core of Sony's new Playstation console

3) possibly 'shooting the moon' and becoming an eventual widespread architecture

4) using PS3 as a means to guarantee minimum volume demands and thus build up home-base industrial fab capacity

I think it's a good strategy; each aspect in a way supports the others as far as will to commit goes. If it fails, it can't fail that badly because the volume will still be high and fabs can be sold off if it comes to that.

But I just can't agree that Cell was built with PlayStation second in mind (even if it does end up die-area inefficient in that role) - I have to think it was surely tied for first when being developed.
 
But doesn't CELL really seem like it's best suited for decoding video streams and other stuff you would find in DVD players and TVs?

It just doesn't seem like the best design for video games... I mean, it might kick ass - new consoles always do - but it still doesn't seem optimized/designed for a console so much as for an A/V processor.
 
scooby_dooby said:
But doesn't CELL really seem like it's best suited for decoding video streams and other stuff you would find in DVD players and TVs?

It just doesn't seem like the best design for video games... I mean, it might kick ass - new consoles always do - but it still doesn't seem optimized/designed for a console so much as for an A/V processor.

I don't know what to tell you. It certainly is well-suited for decoding. I'm a business guy; tech is just a hobby for me, like it is for so many of us.

I've already commented on what I think of the business play of it all; DeanoC or another dev would probably be best suited to comment on its worth as a console chip. I'm sure some opinions on that change on a daily basis within the dev community as well, as new support comes out, new tricks are learned, and new 'walls', so to speak, are hit.
 
xbdestroya said:
swaaye said:
And yes a Sempron K7 surely is very cheap.

Yes it is - but my point is that an A64 3500+ is not that much more expensive to manufacture than that very same Socket A Sempron. The Thoroughbred core has a die area of 80mm^2 to the Venice core's 84mm^2. Produced on 300mm wafers, the A64 probably actually costs less.

And here's Cell with something like 232mm^2...

While we're on the topic though, I wholly disagree with the notion that performance claims are what 'makes or breaks' a console.

I don't think AMD had a chance of making it into a console.
Console makers seem to like to give the GPU control over the memory (maybe because the GPU is the focus of the system?), and the IMC of the Athlon 64 would have conflicted with that. And, as evidenced by the GameCube, it's possible to have extremely low memory latency on a console without an IMC, whereas a PC has too many legacy restrictions.
Athlon XP may have been a possibility, except AMD is really scaling back production on those, and its memory bandwidth maxes out at 3.2GB/s, which is rather low by now. Plus, both Athlons would be tied to DDR, which the industry is moving away from, unless they'd like to make a DDR2 memory controller for the Athlon XP.
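For reference, that 3.2GB/s ceiling is just the single-channel DDR400 arithmetic:

```python
# Where the Athlon XP's 3.2 GB/s ceiling comes from: one 64-bit DDR400 channel.
transfers_per_sec = 400e6   # DDR400 = 200 MHz bus, double data rate
bus_width_bytes   = 8       # 64-bit memory/FSB interface

print(f"Peak bandwidth: {transfers_per_sec * bus_width_bytes / 1e9:.1f} GB/s")  # 3.2
```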

P4 isn't an option because it runs too hot. If either console manufacturer had gone with AMD or Intel, they would almost certainly have needed at least a semi-custom design. Pentium M would perhaps have worked, but Dothan is weak in FLOPS compared even to other x86 designs, though Yonah would have worked better.
 
swaaye said:
P4-D and A64X2 probably annihilate these console chips bar none.
What exactly is your POV? That PS3's Cell is inferior in performance to an A64 and the numbers are imaginary, or that Cell is theoretically powerful but no one will ever be able to attain that, whereas with an A64 all the power's there?

I find the concept that these chips are 'placebos' quite ridiculous. Every chip has limits, and of course the peak FLOPS figure, calculated from the number of FLOP operators in the chip, isn't the whole picture. But if they just wanted a stupid peak figure without caring what was attainable, why not reduce functionality, reduce LS size, and just dress a conventional G5 in a shroud of redundant, unusable FLOP increasers? Lose a load of those registers, eliminate the dual-issue pipeline, and just have tiny useless FMADD units. If they really believe, as you suggest, that everyone buys a console based on peak figures, they could easily have increased the theoretical FLOPS figures. Do you REALLY believe that several years and lots of research later, STI came up with a system purposefully crippled as a marketing gimmick? As pointed out already, 40+ GFLOPS of real, usable performance has been obtained. How quick can an A64 manage an FFT?
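And for comparison's sake, here's the same crude peak-counting applied to an A64 - a rough sketch with assumed numbers (one SSE add plus one SSE multiply per cycle, each split into two single-precision ops), not a benchmark:

```python
# Crude single-precision peak "counting" for a K8 Athlon 64 (assumed numbers:
# one SSE add plus one SSE multiply issued per cycle, each executed as two
# 32-bit ops on K8's 64-bit-wide FP datapath).
clock_ghz       = 2.4   # e.g. a mid-range A64
flops_per_cycle = 4     # 2 SP adds + 2 SP muls per cycle (assumption)

print(f"A64 theoretical peak: {clock_ghz * flops_per_cycle:.1f} GFLOPS")  # ~9.6
# The same counting gives Cell's SPEs ~200 GFLOPS, so even a heavily derated
# 40 GFLOPS FFT result is already past an A64's paper peak.
```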
 
Fox5 said:
I thought the actual benched performance of Cell was around 40 GFLOPS. Which would be pretty impressive, since I don't think any x86 chips come anywhere close. What's the current top x86 single-core performer? I'd imagine somewhere between 10 and 20 GFLOPS.

I see your numbers have already been corrected, but...

How would it have been pretty impressive? A CPU that has been designed from the ground up to offer optimum floating-point performance, with 8 separate "cores", would have reached only 2x - 4x single-core x86 performance?
 