Gabe Newell on NextGen systems

jvd said:
Think about what? Nintendogs sold 300k in a week.

Look at Darkstalkers, 4 months old and only 80k; FIFA only 47k; NBA Street 50k; Street Showdown not even 70k.

These are not strong sales despite what you want to believe

I'd say 3.2 million software units from 28 games (you're aware the PSP has more than 28 games available, right?) in a 4-month period on a new system that hasn't launched in Europe yet is pretty good :p
 
jvd said:
Think about what? Nintendogs sold 300k in a week.

Look at Darkstalkers, 4 months old and only 80k; FIFA only 47k; NBA Street 50k; Street Showdown not even 70k.

These are not strong sales despite what you want to believe

And what about the rest of the NDS games? They sure as hell aren't 300k sellers.

The tie-in ratios for games on the PSP and NDS are very similar (something like 1.86 vs 1.9). One other thing of note: if you compare PSPs sold to games sold, you'll see the April/May/June tie-in ratios were above 2.0, and they'll keep going up over time (1,077/620 earlier vs 3,215/1,515 now). Since you brought up the NDS, I'll mention this: does that mean the NDS has weak software sales too, since the tie-in ratios are similar? I imagine GTA will do for the PSP what Nintendogs did for the NDS.

The sales might not be console-level, but when only 5 million PSPs have been sold in total, you can't expect many titles to sell over 200k (and that chart is for the US only, or maybe Japan only -- I'm not sure). Add in the fact that in a month and a half the PSP will be out in Europe to help sales (with a nice selection of software to launch with), and the picture doesn't look as grim as you paint it.
 
And what about the rest of the NDS games? They sure as hell aren't 300k sellers.

When did I say the DS is doing better? I used Nintendogs as an example of time not equaling sales. If a game is only selling 20k a month, it's not suddenly going to jump to 180k a month 5 months down the line; its sales will only keep dropping.
 
Isn't it intriguing how few absolutes there are in this world? One fella looks at Cell and sees a chip for CE devices that's no good in a console, and another looks at exactly the same chip and sees it as good for a console but too big and expensive to merit inclusion in CE goods!
 
Shifty Geezer said:
Isn't it intriguing how few absolutes there are in this world? One fella looks at Cell and sees a chip for CE devices that's no good in a console, and another looks at exactly the same chip and sees it as good for a console but too big and expensive to merit inclusion in CE goods!

Yeah that is amazing. :oops: I personally think that they are both wrong.
 
According to SiSoft Sandra, a hyperthreaded 3.8GHz P4 with 1MB cache can do 4380 MFLOPS, or 7909 using SSE2. I'd imagine that with SSE3, 2MB of cache, and the 1066MHz FSB it might be pushing towards 8500 MFLOPS. Hmm, according to Sandra the P4s destroy the Opterons in FLOPS once SSE is used. Strange -- I always thought the Opterons were supposed to be the FLOPS masters, or is that just in semi-real-world situations?
That's interesting... I never actually looked at SSE3 myself, but are there some ops that have single-cycle latencies? I'm sure there must be a few single-cycle SSE2 ops for you to get 7900, but I can't say I've ever had a use for them. Sandra, being a benchmark, will probably just find an otherwise meaningless use for everything.

And yeah, the SIMD units of the P4, unlike those of the Athlon 64, can probably do the job better just because they're "genuine" SSE pipelines. The Athlons probably still use something extended from the old 3DNow! pipelines, which were 64-bit SIMD rather than 128-bit (at least, that's what the K7s did, I think, so I'd expect the K8s to be the same). Whatever else, that means 128-bit packed vector ops are pathed in such a way that they're done in two passes at the execution stage, so there's no way to get the instruction latency below 2 cycles.
 
You're just proving that FLOPs are useless as metrics. A64 has proven to be far more powerful than a P4 for gaming. Why? Because it can actually make use of more of its processing resources than P4, due to a more efficient overall design.

SSE3 will not give you a near-twofold boost either. SSE2 brought the big, powerful, useful instructions. From Tech Report: "..SSE3 instructions, which accelerate a number of different types of computation, including video encoding, scientific computing, and software graphics vertex shaders". SSE3 sounds like it's designed to enhance the performance of cheap, integrated video without full hardware acceleration -- among other things, as the quote says.
http://techreport.com/onearticle.x/6363
http://en.wikipedia.org/wiki/SSE3
http://en.wikipedia.org/wiki/SSE2
http://en.wikipedia.org/wiki/Streaming_SIMD_Extensions
http://en.wikipedia.org/wiki/3DNow! (AMD, Cyrix, Winchip, C3, others)

Athlons aren't as fast as the P4 for SIMD because of the lower clockspeed. It's the same with the Pentium-M. The P3 had some cache issues that affected SIMD performance: Katmai had external cache, Coppermine had L1/L2 cache limitations, and Tualatin was still a lot like Coppermine. The Athlon XP did have some hardware limitations in its SSE units. The thing is, on the Athlon SSE was usually nearly redundant, as the FPU is so powerful that plain FPU code was usually nearly as fast. The P4 can NOT say the same -- if the P4 did not have SSE2, heh, it would be a LOT slower for gaming. BTW, the Athlon 64's SSE2 units ARE more efficient than the P4's. http://www.aceshardware.com/read.jsp?id=60000258

Athlons prior to the 64 did not have SSE2. They were also a lot less efficient than the 64 in general. The big picture is important when looking at why a CPU performs as it does: the Athlon XP had a 64-bit L2 cache interface, while the A64's is 128-bit. Obviously the A64's memory controller gives it most of its performance advantage; its execution core is remarkably similar to the AXP's.
http://www.cpuid.org/K8/index.php
 
passby said:
Doesn't AMD use something different- 3DNow?...or if they do have SSE2 onboard now
OT, since this is not the point of your post: 3DNow! was really long ago. I forget the details and reasons, but SSE was favoured over 3DNow!. Today SSE2 is offered by both CPU makers, and optimization guides recommend it.


AMD has also made SSE an official part of x86-64; I don't think 3DNow! was carried along.
I think the main differences are that SSE is more developed (it came out later and has been worked on longer) and is just more powerful than even 3DNow+. On the other hand, 3DNow! and MMX can be used concurrently, while SSE doesn't allow MMX to be used concurrently.

Athlons probably still use something extended off of the old 3dNow! pipelines, which were 64-bit SIMD, rather than 128-bit (at least, that's what the K7s did, I think, so I'd expect the K8s to be the same).

I thought that internally the Athlon had 80-bit pipelines (so 80-bit precision and below would be single-cycle, wouldn't it?). The original 3DNow! only had 64-bit precision support, though; 3DNow+ had 128-bit.
 
Fox5 said:
AMD has also made SSE an official part of x86-64; I don't think 3DNow! was carried along.
I think the main differences are that SSE is more developed (it came out later and has been worked on longer) and is just more powerful than even 3DNow+. On the other hand, 3DNow! and MMX can be used concurrently, while SSE doesn't allow MMX to be used concurrently.

To clear all of this confusion up real quick, here is what the Athlon 64 supports as far as extended instructions.

This explains why SSE isn't listed by itself under the AMD section. This is the quote to look for: "Simply put, SSE2 is basically an extension of the original SSE that improves upon performance. While both SSE and SSE2 work on Pentium 4 chips, SSE2 only works on the newest generation of Intel chips. That's why I make builds that use SSE, because then Pentium III and AMD Athlon XP chips can use my builds as well. Basically, if you're on a Pentium 4, the SSE2 builds will give you the best performance."


Here is more info about SSE, MMX, and 3dnow!
http://www.google.com/search?hl=en&...:unofficial&oi=defmore&q=define:3DNow
http://www.psychology.nottingham.ac.uk/staff/cr1/simd.html
http://www.amd.com/us-en/Processors/TechnicalResources/0,,30_182_861_1028,00.html
 
You're just proving that FLOPs are useless as metrics.
You're forgetting that prior to all the talk of FLOPS, the metric people were so used to resting on was clock speed. That's why you'd see nonsensical garbage from places like IGN saying "3 cores at 3 GHz! Yes, 9 GHz of raw power!"

At least FLOPS takes into account some measure of how much work the device can actually do per cycle. While it's TLP-derived performance as opposed to ILP-derived, it's at least a metric that contains more information than "3.2 GHz." You can't blame a CPU for the platform it's placed in, so theoretical numbers are the only kind you can really estimate without an actual app to speak of.

You can't complain about a company delivering information to the masses as if they're idiots. 99% of the time, that assumption is correct!

SSE3 will not give you a near-twofold boost either
Who said that? I only asked if there were SSE3 ops with single-cycle latencies. There are way too few instructions in SSE3 to actually make that big a difference... and I think a lot of them are horizontal ops. The main advantage of those in gaming is just not having to transpose your matrices to suit both what the CPU likes and what the GPU likes.

The 8500-on-Sandra comment is pointless to argue about anyway, because it's a totally artificial benchmark. It's more about measuring the capacity of the whole system than about what will see practical usage.
 
scooby_dooby said:
and also that the strength of CELL truly seem to be as a massive A/V decoder/encoder and not as a console CPU.

How can you possibly come to that conclusion when the first game is nearly a year from even going gold, let alone without knowing how 2nd and 3rd generation software from high end development houses will start to truly take advantage of the hardware?

Is it so crazy to think the design could actually have uses over x86 that will be applied in gaming? Actual strengths that will be taken advantage of beyond being merely a subversive marketing tool for mass production? Or is that just not a possible outcome in your world?
 
I guess my overall feeling is that the new console CPUs are extremes of bang-for-the-buck on the part of the console makers. They are as cheap as possible but offer the POTENTIAL for extreme performance. The reality, though, is that it will be so very hard to reach those peaks that it's probably not realistic to believe it's going to happen. :)

Hell, it's going to take hand-written assembly, I'd think, to get multi-threaded in-order code going well, isn't it? Compilers can hardly get SIMD into code today, after 5 years or so of Intel working on it for SSE and SSE2! This is surely far more complicated: developers have to deal with SIMD, multi-threading, and in-order optimizations, because the CPU won't do it for them. If they have to use massive amounts of assembly, that's a bitch right there, and porting to other consoles like we see happening today will become a lot harder, I'd think.

I just don't really understand why they were forced to such extremes, and how both the Xbox 360 and PS3 ended up on basically the same path. We'll have to see if Nintendo keeps their ease-of-development philosophy going; the Gamecube was just amazing on that front.
 
a688 said:
Fox5 said:
AMD has also made SSE an official part of x86-64; I don't think 3DNow! was carried along.
I think the main differences are that SSE is more developed (it came out later and has been worked on longer) and is just more powerful than even 3DNow+. On the other hand, 3DNow! and MMX can be used concurrently, while SSE doesn't allow MMX to be used concurrently.

To clear all of this confusion up real quick, here is what the Athlon 64 supports as far as extended instructions.

This explains why SSE isn't listed by itself under the AMD section. This is the quote to look for: "Simply put, SSE2 is basically an extension of the original SSE that improves upon performance. While both SSE and SSE2 work on Pentium 4 chips, SSE2 only works on the newest generation of Intel chips. That's why I make builds that use SSE, because then Pentium III and AMD Athlon XP chips can use my builds as well. Basically, if you're on a Pentium 4, the SSE2 builds will give you the best performance."


Here is more info about SSE, MMX, and 3dnow!
http://www.google.com/search?hl=en&...:unofficial&oi=defmore&q=define:3DNow
http://www.psychology.nottingham.ac.uk/staff/cr1/simd.html
http://www.amd.com/us-en/Processors/TechnicalResources/0,,30_182_861_1028,00.html

Wait, how does that explain why SSE isn't listed in the AMD section, yet is in the Intel one?

That's why you'd see nonsensical garbage from places like IGN that say "3 cores at 3 GHz! Yes, 9 GHz of raw power!"

I think they said 18GHz, because of hyperthreading.

Is it so crazy to think the design could actually have uses over x86 that will be applied in gaming? Actual strengths that will be taken advantage of beyond being merely a subversive marketing tool for mass production? Or is that just not a possible outcome in your world?

Generally it seems that unless one company makes a major misstep, they're never more than 6 months behind their competitors. Cell isn't released yet, though, and I wonder how it will hold up against the x86 processors available when it does finally launch. If Cell is merely superior to technology that was available to consumers a year before it launched, that's not such a big deal. Now then, the X360 launch is much closer -- I wonder how its CPU will compare to the x86 CPUs available in the fall? Would an X2 have been a better choice for it than the current tri-core?
 
Why do people have this bizarre notion that concurrent programming (multithreaded, multicore) requires some kind of assembly or compiler magic?

Concurrency has nothing to do with the compiler and everything to do with algorithm design.
 
If Cell is superior to technology that was available to consumers a year before Cell is launched, that's not such a big deal. Now then, the X360 launch is much closer, wonder how its cpu will compare to x86 cpus available in the fall? Would an X2 have been a better choice for it than the current tricore?

I'm admittedly a layman in much of this, but the gist I'm getting is that these are two very different designs (Cell/XCPU vs. Intel/AMD) with possibly very different strengths and weaknesses. That makes direct comparisons at this juncture rather difficult, yes? It doesn't help matters that Cell/XCPU will require something of a paradigm shift from developers. That learning curve, combined with the closed-box nature of consoles, means it will take a couple of years before we have an idea whether the strengths of these different designs will be applied properly and what (if any) fruit they will bear. But to assert there likely won't be much reward (i.e. direct benefits in gaming over the more general x86) seems rather preconceived to me.

But like I said, I'm a layman; thrash me accordingly if need be. I can take it. :)
 
No I think you're on the right track there liverkick. :)

Listen, it's true that in the tech world one company is rarely able to pull away from the pack as far as performance goes, but that's not because there is some natural law against it. There's no need or reason to limit ourselves to an Intel/AMD mindset with Cell (or the XeCPU), because Intel and AMD have an entirely different set of goals to achieve, and Cell will never threaten their 'empires' as long as x86 stands as the predominant code architecture.

Cell was designed for the absolute highest floating-point performance on what could still be considered 'full' cores, and clearly the result is a reflection of that. AMD and Intel's multi-core chips will NOT reach this level for quite a while. Quite a long while. But that's not what they're trying to do -- they're trying to get next year's processor to run last year's code faster than this year's core.

Well, until now. With x86 multi-core chips arriving in force, we'll see a move to more serious threading attempts. Still, an architecture such as Cell's (or the XeCPU's) will always hold the FLOPS advantage over them, because the x86 cores will forever be slaved to supporting legacy code. But again, for Intel and AMD's x86 chips, their FLOPS competition is only with each other.
 