Starbreeze's take on the PS3 vs Xbox 360 (The Darkness interview)

pc999 said:
That happens with rules too, but at least we could ban them right at the beginning if they didn't show previous knowledge and/or a will to learn (a post like "please explain the rules links to me" would be valid), since this forum wouldn't appreciate their posts.

Yes... as happens around every E3, the blame game goes into effect, where it's only the "new" members who derail threads or add nothing to the discussion; none of the regulars or senior members ever cause any problems. :rolleyes:
 
NucNavST3 said:
Yes... as happens around every E3, the blame game goes into effect, where it's only the "new" members who derail threads or add nothing to the discussion; none of the regulars or senior members ever cause any problems. :rolleyes:

:LOL:
 
Vysez said:
That hardly has anything to do with the comment made by the Starbreeze developer.

Not that the actual discussion occurring in this thread is of great quality or anything, though.

Oh, IMHO it has. There was some argument as to how a typical x86 processor compares to the likes of Cell/Xenon under a typical games workload. Now, when you look at the performance improvement that the PPU, with its 50-something GFLOPS, adds to a ~6 GFLOPS processor on vendor-specific demo code (+16% in the AGEIA demo, FX-57 minimum), it points out quite well how meaningless simply adding up GFLOPS is.
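
To put rough numbers on why, here's a toy Amdahl's-law sketch (entirely my own made-up figures, not measurements from the AGEIA demo): if only a modest slice of each frame is FLOP-bound work the PPU can absorb, even a ~9x jump in peak GFLOPS barely moves the frame rate.

```cpp
#include <cstdio>

// Toy Amdahl's-law model: frame_time = serial_part + flop_bound_part / speedup.
// Every number here is an illustrative assumption, not a measurement.
int main() {
    const double frame_ms   = 16.7;   // one 60 fps frame
    const double flop_frac  = 0.15;   // assume 15% of the frame is FLOP-bound physics
    const double serial_ms  = frame_ms * (1.0 - flop_frac);
    const double physics_ms = frame_ms * flop_frac;

    // Naive peak-GFLOPS ratio: (6 + 50) / 6 ~= 9.3x more FLOPS with the PPU.
    const double flops_speedup = (6.0 + 50.0) / 6.0;

    const double new_frame_ms = serial_ms + physics_ms / flops_speedup;
    std::printf("fps gain: +%.0f%%\n", (frame_ms / new_frame_ms - 1.0) * 100.0);
    // Prints about +15%: right around the +16% demo figure, despite a
    // 9.3x increase in peak GFLOPS. Peak FLOPS simply don't add up.
    return 0;
}
```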
 
PiNkY said:
Oh, IMHO it has. There was some argument as to how a typical x86 processor compares to the likes of Cell/Xenon under a typical games workload. Now, when you look at the performance improvement that the PPU, with its 50-something GFLOPS, adds to a ~6 GFLOPS processor on vendor-specific demo code (+16% in the AGEIA demo, FX-57 minimum), it points out quite well how meaningless simply adding up GFLOPS is.
I agree with previous posters that fps is not really adequate for judging the capability of a physics card; anyhow, let's try a hypothetical experiment.

Assume you have a machine with 56 GFLOPS. Hand it out to developers, let them get used to it and develop some nice games, and give them two to three years to complete them. Now change the hardware, remove 50 GFLOPS, and run the same games.

Do you think the performance drop would be about 16%? I think not; they would probably be hardly playable.

I think this is what a closed console environment means.
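
A quick back-of-envelope version of that experiment (my own invented numbers again): if the game was tuned to saturate the 56 GFLOPS machine, most of the frame is FLOP-bound, and removing the 50 GFLOPS multiplies frame time rather than shaving a few percent off.

```cpp
#include <cstdio>

// Reverse of the PPU-demo case: assume the game was *built* to saturate
// the 56 GFLOPS machine, so most of each frame is FLOP-bound.
int main() {
    const double frame_ms  = 33.3;       // a 30 fps title
    const double flop_frac = 0.80;       // assume 80% of the frame is FLOP-bound
    const double slowdown  = 56.0 / 6.0; // ~9.3x fewer peak GFLOPS

    const double new_ms = frame_ms * (1.0 - flop_frac)
                        + frame_ms * flop_frac * slowdown;
    std::printf("%.0f ms/frame -> %.0f ms/frame (~%.0f fps)\n",
                frame_ms, new_ms, 1000.0 / new_ms);
    // ~33 ms becomes ~255 ms, i.e. roughly 4 fps: hardly playable,
    // nothing like a 16% drop. It all depends what the code was tuned for.
    return 0;
}
```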
 
Trying to stay away from the rampant speculation and meaningless numbers comparisons, but I do think it's worth pointing out that there have been some discussions among some quite knowledgeable people on this forum on the topic of CELL and physics, and what I have gathered from those discussions is that the super-multithreaded approach of CELL might not be best suited to physics calculations. The reason being that some of those calculations need to run fast serially, and aren't as well suited to multithreading.

I'd suggest searching the forum for those posts if you are interested in a discussion a bit above the civility level of this thread. In fact, I'll probably do a search now to make sure it was physics that was indeed being discussed, and not AI.

My point being that there has been a lot of throwing around of the assumption that CELL will be a physics monster based on numbers alone, without considering the actual architecture of the chip and what it is well suited for.

IMO (and I'm not a developer by any means, so this is a very crude, pulled-out-of-my-hiney opinion), I would think the multiprocessing nature of CELL might not have as big an impact on general physics or AI as many assume, but rather allow more things to be done simultaneously. One possibility might be sideline physics, like cloth simulation or the destruction of something that doesn't need to go through the normal collision detection and interaction of the whole physics engine, but only a localized area (lots of speculation here).

And I'm sure developers will find things to do with the SPEs that they simply didn't have a processor for before, which could be cool because it might actually give games something new that we haven't seen before, instead of an imperceptible boost in physics or AI. For example, have one SPE run a voice recognition routine (if it can handle it - no idea of the processing requirements for modern voice recognition) to lend a new interactivity/control method to the game. I bet developers will think of dozens of cool things to do... but I wouldn't hang my hat on expectations of jaw-dropping physics simulations, at least not more so than what we will see elsewhere.
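
To illustrate the "sideline physics" idea (purely my own hypothetical sketch, not anyone's actual engine code): a particle patch that only collides with a known ground plane can be stepped in isolation, with no round-trip through the main physics engine's collision detection.

```cpp
#include <vector>

// Hypothetical "sideline physics": a particle patch (cloth, debris) that
// only interacts with a fixed ground plane, so it can be simulated on a
// spare core or SPE without touching the main engine's collision system.
struct Particle { float x, y, z, px, py, pz; };  // position + previous position

void step_patch(std::vector<Particle>& patch, float dt) {
    const float g = -9.81f;  // gravity
    for (Particle& p : patch) {
        // Verlet integration: next = 2*pos - prev + accel*dt^2.
        const float nx = 2*p.x - p.px;
        const float ny = 2*p.y - p.py + g * dt * dt;
        const float nz = 2*p.z - p.pz;
        p.px = p.x; p.py = p.y; p.pz = p.z;
        p.x = nx; p.y = ny; p.z = nz;
        if (p.y < 0.0f) p.y = 0.0f;  // localized collision: ground plane only
    }
}
// Each patch is independent of every other patch, so they can be farmed
// out one-per-SPE with no synchronization against the main simulation.
```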
 
Well, perhaps I came off wrong. That is not the point I was trying to make. It is more that deducing that performance on real-world game code scales even near-linearly with FLOPS count across different architectures (from the, in this context, brainiac Athlon 64, over the PPE, to the extremely simple SPEs/PPUs) is really not a good idea.
 
Bigus Dickus said:
Trying to stay away from the rampant speculation and meaningless numbers comparisons, but I do think it's worth pointing out that there have been some discussions among some quite knowledgeable people on this forum on the topic of CELL and physics, and what I have gathered from those discussions is that the super-multithreaded approach of CELL might not be best suited to physics calculations. The reason being that some of those calculations need to run fast serially, and aren't as well suited to multithreading.

Yes, I think this is one aspect of the Cell architecture (the number of parallel units).
There are also other advantages, specifically the NUMA memory layout (fast local store with explicit control), the SIMD units in each SPU, and the large register files. These also contribute to fast execution if the problem maps well.
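
For anyone wondering what "explicit control" of the local store buys you, the classic pattern is double buffering. The dma_get/dma_wait calls below are stand-ins I made up (stubbed with memcpy so it compiles), not the real Cell SDK intrinsics; the structure of the loop is the point.

```cpp
#include <cstddef>
#include <cstring>

// Stand-in DMA primitives (hypothetical; NOT the real Cell SDK calls).
// On real hardware dma_get() would start an *asynchronous* transfer into
// local store and dma_wait() would block on its tag; here they are stubbed
// with a synchronous memcpy so the sketch compiles and runs.
static void dma_get(void* local, const void* mem, std::size_t bytes, int /*tag*/) {
    std::memcpy(local, mem, bytes);
}
static void dma_wait(int /*tag*/) {}

static void process(float* chunk, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) chunk[i] *= 2.0f;  // placeholder work
}

// Double buffering: while we crunch one buffer, the DMA engine is already
// streaming the next chunk into the other, hiding memory latency behind
// computation instead of stalling on it. Assumes n <= 4096 floats per chunk.
void stream_chunks(const float* input, std::size_t chunks, std::size_t n) {
    static float buf[2][4096];
    dma_get(buf[0], input, n * sizeof(float), /*tag=*/0);
    for (std::size_t i = 0; i < chunks; ++i) {
        const int cur = static_cast<int>(i & 1);
        const int nxt = cur ^ 1;
        if (i + 1 < chunks)  // prefetch the next chunk while we work on this one
            dma_get(buf[nxt], input + (i + 1) * n, n * sizeof(float), /*tag=*/nxt);
        dma_wait(cur);
        process(buf[cur], n);
    }
}
```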

(A) What I cannot ascertain without actually coding it is whether game problems map well to the Cell architecture (not just spreading the workload across the different units, but also organizing the memory layout to maximize Cell's performance, for instance).

(B) ...and whether there are limitations that undermine the performance (e.g., how good or bad the branch hints are realistically, and whether the PPE affects global memory access in any way).

Perhaps the thread you mentioned has covered all these. I will look it up.

I programmed on parallel clusters and supercomputers in the late '80s and early '90s. I can appreciate what those irons did, and what Cell is trying to do (and of course XB360's XCPU/VMX too). This is just an exercise for me to find out how far the consumer companies are taking the concept. Both XB360 and PS3 seem well poised to succeed into the next next gen (speculating), so I'll be even happier further down the road.

However, it annoys me to no end when someone infers from a specific hardware trait (or even just paper specs) that all XB360 and PS3 games will look indistinguishable. For one, as a gamer I don't just care about looks and static screenshots; I care about the total experience. And two, we are cutting out the largest contribution -- from the game designers, the artists and the programmers. For the sake of the game industry, I hope people are striving hard to deliver the best experiences possible.

All films are shot using pretty much the same technologies, but each and every one of them (the good ones, anyway) is unique. Games can be the same. It's OK if it takes time. I can wait. Perhaps the problem is more acute for me: all my games this gen are exclusive franchises, but that's just me. :(
 
Perhaps GRAW will be an exception, or maybe people just haven't found out how to use these massively parallel units properly yet. I mean, Nvidia, ATI, Sony, Epic (I believe they thought CELL would be good for physics, and their games support Ageia) and Ageia can't all be idiots for thinking parallel computing is good for physics...
 
Bigus Dickus said:
Trying to stay away from the rampant speculation and meaningless numbers comparisons, but I do think it's worth pointing out that there have been some discussions among some quite knowledgeable people on this forum on the topic of CELL and physics, and what I have gathered from those discussions is that the super-multithreaded approach of CELL might not be best suited to physics calculations. The reason being that some of those calculations need to run fast serially, and aren't as well suited to multithreading.

If your concern is the rate at which things are done, you'll want to minimise the serial workload and maximise the parallel workload, to see a speed-up that scales with the number of processor units.

However, that says nothing about potential throughput. Frames per second and work per frame are not tied to each other.

I think the potential is there to do more per frame rather than necessarily to do more frames per second.

Also, re. Epic and their comments on the AGEIA port to Cell, they seemed to be referencing stuff they were actually seeing rather than something they were "thinking". The AGEIA port isn't even optimal, though, on a number of levels.
 
Good lord, will you give it up?

Your one-man war fought with theoretical FLOPS figures got old long ago; now it's really pathetic to keep seeing your silly posts in every thread.

In my three years visiting here I've never seen you add anything productive to these threads; instead you post jerkoff posts like the one above.

I just saw the PPU vs. no-PPU video of GRAW on GameSpot. Pathetic. It looks barely better and runs worse.

As the article may have said, the framerate is slightly lower because with the extra physics on screen there also come extra graphics for the GPU to chew on, and that slows framerates down a bit.
The physics, however, are much, much better with PhysX turned on, and if you were able to enable them on a PC without a PPU, framerates would turn into "dia-rates" (slideshow rates). :LOL:

Graphics don't increase with the PPU, and neither does the stress on the GPU.
The PPU's only duty is to offload stress from the CPU and increase physics and animation complexity.
 
patsu said:
However, it annoys me to no end when someone infers from a specific hardware trait (or even just paper specs) that all XB360 and PS3 games will look indistinguishable. For one, as a gamer I don't just care about looks and static screenshots; I care about the total experience. And two, we are cutting out the largest contribution -- from the game designers, the artists and the programmers. For the sake of the game industry, I hope people are striving hard to deliver the best experiences possible.
Not that I can speak for everyone, but I think the gist of the "argument", if you want to call it one, is that the performance should be close enough (even an order of magnitude difference is still fairly close in the grand scheme of things) that similar results should be achievable on both systems. Not that we will actually see identical results, of course, because as you stated the developers are the biggest factor in how things look and play.

The biggest application of that line of thinking is that multiplatform games should, in all likelihood, look and run very similarly on both systems (as has usually been the case in the past, even with rather drastically different hardware). First-party titles and exclusives will look different for artistic reasons, but IMO we will still see the tit-for-tat arguing back and forth as each new title is released regarding which system has the best graphics, which games look the best, which side has the most "good-looking" games, etc. All meaningless, of course, but that's what we'll get. :) I reserve the right to say "meh, they all look pretty good, different, good, hard to tell what is really the 'best'...."
 
Titanio said:
If your concern is the rate at which things are done, you'll want to minimise the serial workload and maximise the parallel workload, to see a speed-up that scales with the number of processor units.
Yeah... the issue is just what kinds of computational tasks are well suited to parallelization. Not all are, and I recall some threads discussing why physics (or perhaps it was AI; still haven't had time to do a good search... damn B3D's search functionality) may not be one of them.

I do think you're right that what CELL will offer is more stuff per frame; I'm just not convinced that the "stuff" will be noticeably more advanced physics simulations. I think the "more stuff" will be great physics + great AI + neat routines, like voice recognition, or something else I couldn't possibly think up right now.
 
Bigus Dickus said:
(even an order of magnitude difference is still fairly close in the grand scheme of things)
:oops: Well, I guess in the grand scheme of things, thinking in terms of, say, the difference in scale between subatomic spaces and the distances between galaxies, an order of magnitude can be considered fairly close. But if a console with 10x the power of another doesn't show a clear advantage, the developers on that console ought to give up!
 
Bigus Dickus said:
Yeah... the issue is just what kinds of computational tasks are well suited to parallelization. Not all are, and I recall some threads discussing why physics (or perhaps it was AI; still haven't had time to do a good search... damn B3D's search functionality) may not be one of them.

I do think you're right that what CELL will offer is more stuff per frame; I'm just not convinced that the "stuff" will be noticeably more advanced physics simulations. I think the "more stuff" will be great physics + great AI + neat routines, like voice recognition, or something else I couldn't possibly think up right now.

You should look up Havok's HavokFX presentation re. Physics and parallelisation.

Framerate is one way to measure performance, but it can be misleading up to a point. With a parallel task bound by a slower serial part, your framerate may not differ significantly between two very different scales of workload. For example (an arbitrary one, plucked out of the air): three processors may not be much faster than one for a given workload, but may be only slightly slower with a much bigger one (where the single processor is buckling). That's what I mean about doing more per frame versus more frames per second.
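
Putting arbitrary numbers on that (as made-up as the example above): with a fixed serial stage gating each frame, three processors look barely faster than one on a small workload, yet hold nearly the same frame time when the workload quadruples and a single processor buckles.

```cpp
#include <cstdio>

// Toy model of a frame: a fixed serial stage gating a parallel stage.
// All numbers are plucked out of the air, as in the example above.
double frame_ms(double serial_ms, double parallel_ms, int procs) {
    return serial_ms + parallel_ms / procs;
}

int main() {
    const double serial = 12.0;
    // Small workload: 3 processors are "not much faster than one"...
    std::printf("small load: 1p = %.0f ms, 3p = %.0f ms\n",
                frame_ms(serial, 3.0, 1), frame_ms(serial, 3.0, 3));
    // ...but with 4x the parallel work, 3 processors (16 ms) are only
    // slightly slower than the original small-load frame (15 ms), while
    // one processor buckles (24 ms): similar fps, far more done per frame.
    std::printf("4x load:    1p = %.0f ms, 3p = %.0f ms\n",
                frame_ms(serial, 12.0, 1), frame_ms(serial, 12.0, 3));
    return 0;
}
```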

I'd also say that, more generally, while performance on one simulation may not scale well beyond a certain number of processors for a given implementation, you could run multiple simulations, with limited interaction between one another, across more processors.

PhysX cards currently have specific issues beyond this, though - probably chiefly the bandwidth in and out of the system.
 
Since we're talking about parallelization, can someone explain the limitations of Xenon's '2 threads per core'? Do both threads run in parallel, or on alternating cycles?

Also, if a core has only 2 threads, how can something like the OS use only 5% of the core? I would think that every task needs to use 50%? Yeah, obviously I'm really lost on this concept... :)
 
expletive said:
Also, if a core has only 2 threads, how can something like the OS use only 5% of the core? I would think that every task needs to use 50%? Yeah, obviously I'm really lost on this concept... :)
Just because the CPU supports 2 hardware threads doesn't mean that you can't schedule N software threads on top of that. How do you think your OS manages that "multitasking" thingamajig? :D
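
A concrete sketch of that (modern, illustrative code, nothing console-specific): the snippet below schedules eight software threads regardless of how many hardware threads exist; the OS time-slices them, which is also why a mostly idle OS thread can use 5% of a core rather than a fixed 50%.

```cpp
#include <cstdio>
#include <thread>
#include <vector>

int main() {
    // The core may expose only 2 hardware threads...
    std::printf("hardware threads: %u\n", std::thread::hardware_concurrency());

    // ...but we can still schedule 8 software threads on top of them. The
    // OS time-slices them onto the hardware threads, so a mostly idle
    // thread only consumes the slices it actually uses: 5%, not 50%.
    std::vector<std::thread> pool;
    for (int i = 0; i < 8; ++i)
        pool.emplace_back([i] { std::printf("software thread %d ran\n", i); });
    for (std::thread& t : pool) t.join();
    return 0;
}
```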
 
expletive said:
Since we're talking about parallelization, can someone explain the limitations of Xenon's '2 threads per core'? Do both threads run in parallel, or on alternating cycles?

Also, if a core has only 2 threads, how can something like the OS use only 5% of the core? I would think that every task needs to use 50%? Yeah, obviously I'm really lost on this concept... :)

The short answer is that the threads have their own register space but share the execution units. Theoretically, if the two threads only ever use different execution units (i.e., one is always using the integer ALU while the other uses the FPU), both of them execute at full speed. However, in practice they sometimes have to share the same units (like load/store or branch), and at those times one thread has to wait while the other finishes execution. So the answer is somewhere between 50% and 100% speed, depending on the instruction mix of the threads.
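
As a rough illustration of "instruction mix" (my own toy example, not Xenon-specific): the two loops below lean on different execution units. On an SMT core, running one of each on the two hardware threads lets them overlap and approach the full-speed case, while two copies of the same FPU-heavy loop would queue for the shared unit.

```cpp
#include <cstdint>

// Integer-ALU-heavy loop: a linear congruential scramble, no FP at all.
std::uint64_t int_heavy(std::uint64_t n) {
    std::uint64_t acc = 1;
    for (std::uint64_t i = 0; i < n; ++i)
        acc = acc * 6364136223846793005ULL + 1442695040888963407ULL;
    return acc;
}

// FPU-heavy loop: a dependent chain of floating-point multiply-adds.
double fpu_heavy(std::uint64_t n) {
    double acc = 1.0;
    for (std::uint64_t i = 0; i < n; ++i)
        acc = acc * 1.0000001 + 0.5;
    return acc;
}
// Pairing int_heavy with fpu_heavy on the two SMT threads is the friendly
// mix; pairing fpu_heavy with itself is the 50%-ish contended case.
```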
 
PeterT said:
Just because the CPU supports 2 hardware threads doesn't mean that you can't schedule N software threads on top of that. How do you think your OS manages that "multitasking" thingamajig? :D

Hey, I said I was lost on the concept, OK!?!? ;)

Seriously though, I understand the concept on Windows (and Mac), but I wasn't sure if that was a function of the OS, and whether these console-specific designs are any different from the CPUs we see in PCs.
 