John Carmack bothered with next-gen multiprocessor consoles

Why does it seem that the big PC guys have nothing but shit to say about consoles and the hardware makers? You can see it from the big dogs such as Sweeney and Carmack (though not limited to them) down to the PC elitists in the forums.
 
...

Wow, I can't believe Carmack said those things.
He is influential enough to make such valid criticism. The rest of the developer community remains silent for fear of retaliation....

Just shows you that the brightest among us sometimes say the dumbest things.
Or John knows exactly what he's talking about...

I think the CELL software design makes parallelism easier than before, as it seems to be semi-transparent,
You heard wrong....

if the developer concentrates on writing modular code and lets the OS handle the process distribution. Is that not the whole point of the CELL software modules?
In theory, yes.
 
Sure, John's criticism is valid in that it is harder to program for a multi-core CPU than for a single core, but just because it's harder does not mean you should not do it.

There are very strong and valid reasons why the console makers have decided to go multi-core, and when you have the two biggest doing it, do you think they are going to fail on that account? Of course not.

The Saturn was not a failure for being multi-processor; there were other factors involved. Also, even though the Saturn was a multi-CPU system, it just was not very well designed in that regard, and it did not have very good tools for taking advantage of that parallelism. I remember one of SEGA's programmers saying how difficult it was to write assembler for the two SH-2s.

Anyway, one could argue that CELL is CPUs (the PUs) with many co-processors (the APUs), and like a GPU, which is also a co-processor, it's never a bad idea to have those, or is it?

The CELL design has localized memory (128 KB) that's not cache, so cache coherency is not an issue. Well, the Saturn certainly never had that advantage.
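
For what it's worth, here's a minimal sketch of the copy-in / compute / copy-out style that a local-store design implies, as opposed to transparently shared, coherent caches. The copy_in/copy_out helpers and the 128 KB buffer are just illustrative stand-ins (plain memcpy here), not any real CELL API:

Code:
#include <cstddef>
#include <cstring>

// Hypothetical 128 KB local store; in a local-store design this is on-chip
// memory private to one APU, not a coherent cache shared with anyone else.
static unsigned char local_store[128 * 1024];

// Stand-ins for explicit transfers between main memory and the local store.
// Real hardware would use DMA; plain memcpy keeps the sketch self-contained.
void copy_in(void* local, const void* main_mem, std::size_t bytes)  { std::memcpy(local, main_mem, bytes); }
void copy_out(void* main_mem, const void* local, std::size_t bytes) { std::memcpy(main_mem, local, bytes); }

// Copy in, compute, copy out. Nothing else touches the local buffer while we
// work on it, so there is no coherency traffic to reason about at all.
void process_chunk(float* main_mem_data, std::size_t count)   // count must fit within the 128 KB store
{
    float* work = reinterpret_cast<float*>(local_store);
    copy_in(work, main_mem_data, count * sizeof(float));
    for (std::size_t i = 0; i < count; ++i)
        work[i] *= 2.0f;                                       // the actual work, done on local data
    copy_out(main_mem_data, work, count * sizeof(float));
}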

It's a world of difference comparing past parallel systems like the Saturn to CELL.
 
Sure, John's criticism is valid in that it is harder to program for a multi-core CPU than for a single core, but just because it's harder does not mean you should not do it.

I agree (not that it's harder); I agree that it should be tried... on a larger scale, like it is going with the PS3, Xbox 2, and perhaps the N5 as well.

It has been done in the arcade industry, the workstation industry, the high-end visualization industry, and even the videogame console industry (Sega CD, 32X, Saturn, M2, PS2).

There are very strong and valid reasons why the console makers have decided to go multi-core, and when you have the two biggest doing it, do you think they are going to fail on that account? Of course not.

well put.
 
Edge said:
when you have the two biggest doing it, do you think they are going to fail on that account? Of course not.

Yes, you are right, I almost forgot Nintendo decided to go multi-core with the DS...
 
cthellis42 said:
Dio said:
The problem is that multiprocessor performance isn't a solved issue.
Hence, the earlier and the more effort developers expend on this the better, eh?
I think it might be the kind of research suited to pure R&D departments, not to commercial companies working to ultra-tight budgets and deadlines. The majority of the development side of the games industry is barely profitable at best, except for the occasional big hit and a few heavily capitalised companies.

In fact, this research up until now has been the prerogative of university R&D, and they haven't had much success. I'm expecting that any console company shipping a console relying heavily on multi-core performance will have done this basic research for the games companies (who don't have the slack to take up more than nominal effort in this area) and made some advances.

If not, it feels awfully like they are betting the console business on unproven technology. But if everyone goes the same way, does it really matter?
 
...

but just because it's harder does not mean you should not do it.
If you have a $50 million budget and an unspecified deadline, sure. But parallelism does increase development time, which translates into more expense.

There are very strong and valid reasons why the console makers have decided to go multi-core
When the hardware side decides to go multiprocessing, the console vendor should invest at least double that amount in the software layer to abstract away the ugly hardware.

We all heard about SCEI's $400 million investment in hardware R&D, but we haven't heard much about its CELL OS R&D. This is where the problem lies.

and when you have the two biggest doing it, do you think they are going to fail on that account? Of course not.
SCEI has a poor history with parallelism. VU0 still has an average 8% utilization rate, by SCEE's own admission, after the PS2 has been on the market for five years. Now SCEI is telling developers to code for 8 APUs in CELL without any kind of automatic parallelization assistance. I say history is bound to repeat itself.

I remember one of SEGA's programmers saying how difficult it was to write assembler for the two SH-2s.
There is no such thing as an assembler for two SH-2s.

Anyway, one could argue that CELL is CPUs (the PUs) with many co-processors (the APUs), and like a GPU, which is also a co-processor, it's never a bad idea to have those, or is it?
GPU-side parallelism works because of the following:

1. Each display list carries no data inter-dependency.
2. DirectX makes sure the developer is abstracted from the actual shader counts. It doesn't matter if a DX-compatible GPU carries one shader or eight.

CPU-side parallelism doesn't work well because of the following (a rough sketch contrasting the two cases is below):

1. There is heavy data-interdependency between threads.
2. CELL OS may or may not abstract the actual APU layout. I have no info on this.
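
A rough sketch of the contrast (my illustration, not anyone's actual engine code): independent items split across workers trivially, while updates that all touch shared state end up serialized on a lock no matter how many processors you throw at them.

Code:
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// GPU-style work: every item is independent, so splitting it across N workers is trivial.
void transform_independent(std::vector<float>& verts, unsigned workers)
{
    std::vector<std::thread> pool;
    const std::size_t chunk = verts.size() / workers;
    for (unsigned w = 0; w < workers; ++w) {
        const std::size_t begin = w * chunk;
        const std::size_t end   = (w + 1 == workers) ? verts.size() : begin + chunk;
        pool.emplace_back([&verts, begin, end] {
            for (std::size_t i = begin; i < end; ++i)
                verts[i] *= 0.5f;                      // no item depends on any other item
        });
    }
    for (auto& t : pool) t.join();
}

// CPU-style work with interdependency: every update reads the result of the
// previous one, so threads mostly queue up on the lock instead of running.
struct World { std::mutex lock; float shared_energy = 0.0f; };

void add_contribution(World& world, float c)
{
    std::lock_guard<std::mutex> guard(world.lock);     // the contention point
    world.shared_energy += c;                          // depends on every earlier update
}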

The CELL design has localized memory (128 KB) that's not cache, so cache coherency is not an issue. Well, the Saturn certainly never had that advantage.
The Saturn's slave SH-2's cache was configured as 2 KB cache + 2 KB local RAM.

It's a world of difference comparing past parallel systems like the Saturn to CELL.
Physically, yes. Conceptually, not so much.

To Dio

If not, it feels awfully like they are betting the console business on unproven technology. But if everyone goes the same way, does it really matter?
The burden laid on developers is different.

In the case of Xbox Next, out of 4 logical processors, I expect the OS-side subsystems (networking, I/O, and DirectX) to claim at least 2 full-time. An Xbox Next title should be able to get decent performance out of single/dual threading.

In the case of the PSX3, a developer faces 8 APUs, all of which are allocated to him...
 
...

PS. CELL OS is not an SCEI in-house project; it has been outsourced to a small start-up firm founded by Okamoto Nobuichi, so you can guess that SCEI is not investing hundreds of millions into CELL OS like it did into the CELL hardware...

This is how Japanese firms often think of software: as an afterthought...
 
Re: ...

Deadmeat said:
In the case of Xbox Next, out of 4 logical processors, I expect the OS-side subsystems (networking, I/O, and DirectX) to claim at least 2 full-time. An Xbox Next title should be able to get decent performance out of single/dual threading.
What exactly are you defining as 'DirectX' and 'I/O' here (in terms of jobs the CPU will actually have to do)?

Assuming you mean 'Direct3D/DirectSound/DirectPlay/DirectMusic general overheads, and any disk access', I'm surprised that you think such things - which take low single-figure percentages of CPU time on a PC - should take up so much time on a next-generation console. I would have thought 'overheads' would be far lower on a console due to more direct access to the hardware (the lack of any DDI).

Also, as noted above, there is no reason to believe that any next generation system will employ an architectural model to which 'threading' (which implies shared memory, MESI, etc.) will be applicable.
 
...

What exactly are you defining as 'DirectX' and 'I/O' here (in terms of jobs the CPU will actually have to do)?
Those run on separate threads. Hence the "user" threads are free to do their own thing.
 
Parallelism does increase development costs (more so in the first generations), but that also depends on the level of parallelism, which could vary greatly, and thus so could the costs.

R&D expenditures are almost always 95 percent development (the expensive factories) and 5 percent research. You don't need expensive factories for software, so citing the $400 million as a hardware figure is misleading.

Sony does have a poor history with parallelism, but that does not mean it will continue, especially when they were smart and partnered with one of the brightest companies in the world when it comes to parallelism: IBM.

It could be as high as 32 APUs in the PS3's Broadband Engine, as the BE could be spread among four chips, each with 1 PU + 8 APUs + 8 MB eDRAM (250 million transistors), simply because the design lends itself to that. I agree with your previous posts on the subject that a 1-billion-transistor chip is unrealistic for 2005. Over time, those four 250-million-transistor chips could become two chips, and then one chip by 2010. CELL lends itself to having multiple chips, as that is the foundation of its design. I believe 2 GHz, and 500 giga integer ops / 500 GFLOPS. Not their goal, but certainly powerful enough.
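
For what it's worth, the arithmetic behind that 500 GFLOPS guess is simple; the 8 single-precision FLOPs per APU per cycle (a 4-wide multiply-add) is my own assumption from reading the patent, not a confirmed figure:

Code:
#include <cstdio>

int main()
{
    const double apus            = 32;       // 4 chips x 8 APUs, as guessed above
    const double clock_hz        = 2.0e9;    // 2 GHz
    const double flops_per_cycle = 8;        // assumption: one 4-wide multiply-add per APU per cycle
    std::printf("peak: %.0f GFLOPS\n", apus * clock_hz * flops_per_cycle / 1e9);   // ~512 GFLOPS
    return 0;
}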

If you study the patent on CELL, you will also see that a great deal of thought has gone into the software side of the design. It is because of this that I am very impressed by it.

If they can do it for supercomputers, then they can do it for a console. It's not impossible, and it looks like Sony/IBM/Toshiba are striving to make it easier than before. It's certainly not PS2-style parallelism, and it's vastly more impressive.

Note, I was not implying an assembler for the two SH-2s with some kind of automatic parallelism, but that he coded assembly for both. Thanks for the info on the second SH-2 having 2 KB of local RAM.
 
I'm pretty sure the BE isn't going to be four chips, or three, or even just two. It will be one chip, if for no other reason than that more than one vastly complicates things regarding external, off-chip memory.

Also, four chips would have four eDRAM memory pools, so the largest dataset you could ever process would be 1/4 of total eDRAM capacity. You'd also need a fast, complicated crossbar/switch connecting all four chips through a high-speed bus interface. This would introduce latency in inter-processor communications and make arbitration more complex too since the distance between processors increases.

Even integrating them onto one die further on into the future would not change this, as the architecture is fixed; no net speed gains can be made. Only savings from die shrinks, fewer components etc.
 
Y'know, I just got to thinking that all this concern over how to make multithreaded game software may be clouding the real matter here. It makes sense that a transparent parallel software solution would be the obvious counterpart to a parallel hardware solution. That's the holy grail of the computing future, and all concerned are watching Sony/IBM/Toshiba to see if they can actually pull this off under all the hubbub.

However, was it implicitly claimed that this is exactly what they would do? Is it implicitly known that this is what will be required to extract any kind of performance out of a BE computer? All of us have assumed that it is, but it is still an assumption.

Let's step back a bit. How much software parallelism is really required to make a game work on a BE computer? Consider that this is a game. What do games do? A game is essentially a serial logic thread that prescribes challenging/entertaining stimuli and responses to user input. It could be all text interaction to tell the player what is going on, but that would get boring and tedious. So we have graphics, sound, and in a different sense physics and AI to make the two former entities behave convincingly. Those are the truly processing-hungry operations in modern game code, not the serial logic of the game itself. Freed from that requirement, I wouldn't be surprised if the serial logic thread itself could run so fast as to update millions of times a second on even a single, most mundane modern CPU. If it needs to run on a single processing unit and is easiest to implement on a single processing unit, so be it. It's well taken care of. It's the calls to the graphics, sound, physics, and AI threads that will be done in parallel (to each other, not necessarily in parallel to the game logic; obviously, the game logic has to be the sequential driving component for the other stuff).

That said, isn't it fairly easy to imagine that the graphics, sound, physics, and AI code can be expressed to exploit parallelism rather intuitively? Each one is a conceptually simple operation: it just requires the quick processing of a huge amount of data, which will result in a raster output and an audio signal (I guess you can add in motion/rumble feedback as well). We're talking about the manipulation of brute amounts of vertices, procedural texture engine data, and per-pixel processing. All of this is born to be parallel, just as you would expect GPU stuff to happen on a GPU.

So the work of a game developer can essentially remain unchanged (though tinkering around closer to the metal is still a possibility, if so desired). You write your serial game thread just as always, and then when it comes to ordering around legions of polys and pixels, you are only talking to a software driver abstraction. The driver abstraction just happens to orchestrate processing units in the "main CPU" (if that still has any meaning anymore) instead of a graphics card.
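
To put the idea in code form, a rough sketch (the subsystem names and the run_in_parallel helper are purely illustrative, nothing Sony/IBM/Toshiba have described): the game logic stays a plain serial loop, and the heavy subsystems get handed to an abstraction that maps them onto however many processing units happen to exist underneath.

Code:
#include <functional>
#include <thread>
#include <vector>

// Hypothetical driver-style abstraction: the game never asks how many
// processing units sit underneath, it just hands over this frame's jobs.
void run_in_parallel(const std::vector<std::function<void()>>& jobs)
{
    std::vector<std::thread> pool;
    for (const auto& job : jobs) pool.emplace_back(job);
    for (auto& t : pool) t.join();
}

// Illustrative subsystem stubs.
void update_physics() { /* integrate bodies for this frame */ }
void update_ai()      { /* run agent decisions */ }
void mix_audio()      { /* fill the next audio buffer */ }
void render_frame()   { /* push geometry through the "software graphics driver" */ }

void game_loop(bool& running)
{
    while (running) {
        // The serial game logic: cheap, and perfectly happy on one processing unit.
        // ... read input, advance game state, decide what to draw and play ...

        // The data-hungry subsystems run in parallel with each other, driven by
        // the state the serial logic just produced.
        run_in_parallel({ update_physics, update_ai, mix_audio, render_frame });
    }
}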

Please do note that by saying all of this, I do not wish to undermine the continued striving for the holy grail of pervasive parallel software code. It's still a holy grail that could indeed open new possibilities on something like the BE. I'm just saying that in the big scope of things, it is actually not a pivotal component in enabling things (namely, games) to be done on a BE.

The only question that need be asked is whether Sony/IBM/Toshiba can come up with what is essentially a software graphics driver. To that, I can only say (from my layman's observation of CG technology), "Why is that even a question?" :) MS has had one for years now. It's called DirectX.

...2005/2006 can't come soon enough! BRING IT, PS3! BRING IT! :oops:
 
No doubt Carmack's influence and abilities in the PC world are highly spoken of, but I find it offbeat to say such a thing when he hasn't done anything on the console front for generations, so why would he care? Others have made their names on consoles; John Carmack didn't.
 
So here's a question: what do you think existing graphics drivers do?

They just move bits of memory around, generally from main memory into some hardware queue. FWIW, this operation alone can end up using all of the available processing resources. This is why PC IHVs preach batching of geometry and minimising API calls.
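
To make the batching point concrete, a toy sketch (submit() is a hypothetical stand-in for a driver entry point, not any real API): the cost is dominated by per-call overhead, so fewer, fatter calls win.

Code:
#include <cstddef>
#include <vector>

struct Triangle { float v[9]; };

// Hypothetical driver entry point: each call pays a fixed overhead
// (validation, state setup, copying into a hardware queue) plus a small per-triangle cost.
void submit(const Triangle* tris, std::size_t count) { (void)tris; (void)count; /* pretend this feeds the hardware queue */ }

// Anti-pattern: one call per object pays the fixed overhead thousands of times a frame.
void draw_unbatched(const std::vector<std::vector<Triangle>>& objects)
{
    for (const auto& obj : objects)
        submit(obj.data(), obj.size());
}

// Preferred: concatenate objects that share the same state and pay the overhead once.
void draw_batched(const std::vector<std::vector<Triangle>>& objects)
{
    std::vector<Triangle> batch;
    for (const auto& obj : objects)
        batch.insert(batch.end(), obj.begin(), obj.end());
    submit(batch.data(), batch.size());
}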

I've said before that I don't think the average programmer will even know they are working on a multiprocessor system; it'll all be in the libraries. I think the counterpoint to this is that I don't see traditional game architectures using what the Cell patent describes for much more than a powerful vertex shader and possibly the occasional physics calculation.

Doing multiprocessing right requires a substantial rethink of how we code games. The secret, if indeed there is one, is to limit the types of parallelism that we employ, so that the inter-thread issues can be limited and understood. I just don't see efficient usage of these architectures happening in the short term.
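
One concrete way of "limiting the types of parallelism", sketched roughly (my illustration, not something from a shipped engine): every system reads only last frame's snapshot of the world and writes only its own output, so the single synchronization point left to understand is the buffer swap at the end of the frame.

Code:
#include <thread>
#include <utility>
#include <vector>

struct WorldState { std::vector<float> positions, velocities; };

// Each system reads only "previous" and writes only its own slice of "next".
// Nothing reads what another thread is writing this frame, so the usual
// inter-thread hazards are confined to the swap in run() below.
void simulate_frame(const WorldState& previous, WorldState& next)
{
    std::thread physics([&] {
        next.velocities = previous.velocities;   // writes only the velocity output
        // ... integrate forces using previous.positions ...
    });
    std::thread movement([&] {
        next.positions = previous.positions;     // writes only the position output
        // ... advance positions using previous.velocities ...
    });
    physics.join();
    movement.join();
}

void run(WorldState& a, WorldState& b, bool& running)
{
    WorldState* prev = &a;
    WorldState* next = &b;
    while (running) {
        simulate_frame(*prev, *next);
        std::swap(prev, next);                   // the one synchronization point per frame
    }
}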

I think what these complex architectures do is make it harder for small, underfunded developers to compete with the big boys, and to me that's a bad thing.

Personally I like the challenge, but I can see why many developers wouldn't.
 
ERP said:
I think what these complex architectures do is make it harder for small, underfunded developers to compete with the big boys, and to me that's a bad thing.

Personally I like the challenge, but I can see why many developers wouldn't.


But Microsoft has an ace up its sleeve with the PC game mod community. Tim Sweeney and the Unreal Engine are a great example. Epic toils away making a robust engine that they can sell to other developers and use to make a game themselves. Then hopeful future game developers utilize the engine by way of the PC mod community. The engine can be used for free, and the internet is used to distribute their creations. Those with a knack for modding games have a great way to get their foot in the door, and the game industry gets a much-needed talent pool to draw upon.
 
...

randycat99

That's the holy grail of the computing future, and all concerned are watching Sony/IBM/Toshiba to see if they can actually pull this off under all the hubbub.
You are looking in the wrong places. Sun has made more progress in auto-parallelizing compilers than SCEI ever will.

It's the calls to the graphics, sound, physics, and AI threads that will be done in parallel (to each other, not necessarily in parallel to the game logic; obviously, the game logic has to be the sequential driving component for the other stuff).
Graphics and sound are already highly parallel. AI and physics stuff introduce interdependency, and you have to schedule threads accordingly.

The only question that need be asked is whether Sony/IBM/Toshiba can come up with what is essentially a software graphics driver.
You still have messy "user" code to break down.

ERP

Doing multiprocessing right requires a substantial rethink of how we code games. The secret, if indeed there is one, is to limit the types of parallelism that we employ, so that the inter-thread issues can be limited and understood.
This is why the "art" of programming no longer works and the "science" of software engineering kicks in...
Take the whole serial code, break it down into several independent modules along the pipeline, and then interconnect them via well-defined memory buffers. In other words, the pipelining of the software architecture.
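
A bare-bones sketch of that pipelining idea (the hand-rolled queue is for illustration only): each stage is an independent module, and the only contact between the modules is a well-defined buffer.

Code:
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

// A well-defined buffer between two pipeline stages: the only shared state.
template <typename T>
class StageBuffer {
public:
    void push(T item) {
        { std::lock_guard<std::mutex> lock(m_); q_.push(std::move(item)); }
        cv_.notify_one();
    }
    T pop() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !q_.empty(); });
        T item = std::move(q_.front());
        q_.pop();
        return item;
    }
private:
    std::queue<T> q_;
    std::mutex m_;
    std::condition_variable cv_;
};

struct Frame { int id; };

// Two independent modules along the pipeline; each touches only its own
// input/output buffer, never the other stage's internals.
void simulate_stage(StageBuffer<Frame>& out) { for (int i = 0; i < 100; ++i) out.push({i}); }
void render_stage(StageBuffer<Frame>& in)    { for (int i = 0; i < 100; ++i) { Frame f = in.pop(); (void)f; /* draw it */ } }

int main()
{
    StageBuffer<Frame> buffer;
    std::thread sim(simulate_stage, std::ref(buffer));
    std::thread ren(render_stage, std::ref(buffer));
    sim.join();
    ren.join();
    return 0;
}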
 
I don't think MS wants mods in its console games; it would open the door to piracy on the platform.

New player models and such, sure, why not. Maybe the occasional level, but not full-blown mods, because that requires code changes in the executables themselves.

Other manufacturers (i.e., Sony, Nintendo) could just as well allow player-created content too; no particular advantage for MS here.
 
Re: ...

Deadmeat said:
randycat99

That's the holy grail of the computing future, and all concerned are watching Sony/IBM/Toshiba to see if they can actually pull this off under all the hubbub.
You are looking in the wrong places. Sun has made more progress in auto-parallelizing compilers than SCEI ever will.

Auto-parallelizing and auto-vectorizing compilers are interesting from a software engineering standpoint, and they have some use, but they aren't all that critical really. The only code that benefits from them is legacy code that is both performance-critical AND amenable to these techniques. Those are very narrow conditions. If you have legacy code that isn't performance-critical, it's OK if it only utilizes a single CPU. If you have legacy code that is performance-critical, you'd typically be tweaking the critical parts manually anyway when transitioning to a new architecture.

And of course, you may not be running legacy code at all but producing new code, something I gather will be quite common on a platform such as the PS3.
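
To make the distinction concrete, a trivial sketch of the kind of loop such compilers handle well versus one they typically cannot; actual compiler behaviour varies, so treat it as illustrative only.

Code:
#include <cstddef>

// A loop like this is a textbook candidate for auto-vectorization/parallelization:
// every iteration is independent and the access pattern is plainly visible.
void scale(float* out, const float* in, std::size_t n, float k)
{
    for (std::size_t i = 0; i < n; ++i)
        out[i] = in[i] * k;
}

// A loop like this usually is not: each iteration depends on the previous
// one, so the compiler cannot simply split it across lanes or processors.
float prefix_dependent(float* data, std::size_t n)
{
    float acc = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        acc = acc * 0.5f + data[i];   // loop-carried dependency on acc
        data[i] = acc;
    }
    return acc;
}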
 