Toshiba, Sony close to 65nm sample production

Yes, and the philosophy behind Cell and the associated funding has to take into account that PS3 is not its sole application.

Does this really make a big difference though?

CELL was designed to be modular, scalable, fast at multimedia workloads and network ready.

These are all areas in which the PlayStation 3 R&D team is interested: modularity and scalability are important because, with advances in Semiconductor technology, they can afford to pack in more processing constructs ( more APUs, etc... ).

It is like saying that PlayStation 2 lost 3D capabilities due to the multi-media functions they added, but then you think about the IPU and how it is used in games, and then you realize that the SIMD instructions used in the EE's RISC core for Motion Compensation are also used in games' engines...

Sometimes extra features can come almost for free due to the features you already put inside: re-use of the logic you inserted in the Hardware.
 
Panajev2001a said:
While I know MS has money to spend, I also know that Xbox 2 did not start before PlayStation 3 in the respective R&D centers: we have good bits of information suggesting that by late 1999/early 2000 the R&D for PlayStation 3 was already in full force, and at that time it is not really conceivable that Xbox 2 was in a similar state.

I'm not actually making the suggestion that development on the Xbox 2 did start before PS3; I'm just stating that we do not know exactly when it did start, nor the levels of funding that have gone into it (internally or externally) already. We certainly have no concept of how much is yet to go into either.

And this is one of the points I'm making – we can't make assumptions without knowing what is going on. You may have lots of information about Cell, but we frankly do not have any kind of information as to what MS are or have been doing.

The other point I'm making is that pointing to the level of R&D as a measure of success or not doesn't necessarily paint the entire picture. I have no doubt that at this point in time Sony's focus is probably primarily on Cell for PS3, but that doesn't necessarily equate to their partners' focus. When the development of a silicon process is pointed to, I don't really think this is something that solely pertains to R&D for PS3 – a company like IBM is going to be looking at this anyway (and as Vince points out, they are with AMD), they are likely to use it for more than just that one application, and for all we know it may be the case that this is what MS are paying IBM for access to, i.e. it equally applies to other areas, not just PS3.

A longer R&D cycle helps control the spending better, but they also spent quite a lot of money on CELL and CELL-related investments. IMHO this is because I do not feel CELL will be thrown away for PlayStation 4: if they can get a good and scalable development environment, they might leave enough mileage in the scalability of CELL ( with a couple of fixes and new ideas along the way ) to produce a next-next-generation Hardware, but I am looking too far into the future this way.

I’d assume that as well, if it’s a success.

I believe the launch date theory: if Xbox 2 and PlayStation 3 come out close to each other, I expect them to have comparable performance overall, with each console having its advantages and flaws.

If PlayStation 3 comes out at a later date, I expect its features and/or performance to be improved over what the Xbox 2 will be offering.

Fundamentally, as I’ve said a number of times, so do I (and again, 90% of the software will probably end up looking the same across them). There’s some leeway in what you may call comparable though – within 6 months I don’t think there would be much in the way of a difference.
 
Dave, I offer you the same advice Simon gave me a few months back in a similar position: give it up mate. There's no point.
 
Panajev2001a said:
Yes, and the philosophy behind Cell and the associated funding has to take into account that PS3 is not its sole application.

Does this really make a big difference though?

CELL was designed to be modular, scalable, fast at multimedia workloads and network ready.

These are all areas in which the PlayStation 3 R&D team is interested: modularity and scalability are important because, with advances in Semiconductor technology, they can afford to pack in more processing constructs ( more APUs, etc... ).

Well, when designing a standalone console, what is the importance of having a CPU whose fundamental structure is network ready? Is multimedia functionality as important as its 3D processing abilities?

This point here is basically going back to the more dedicated vs generalised units discussion.
 
Dio said:
Dave, I offer you the same advice Simon gave me a few months back in a similar position: give it up mate. There's no point.

Well, thanks Dio... I am here trying to have a decent conversation ( sometimes people disagree ) and I do not think we should be that negative.

There is a point: discussing helps us all to re-think our positions as we have to explain our views to others that might not share our assumptions.

Unless you hold the TRUTH (tm) there is always a point, and if you do hold the TRUTH (tm) and you tell us, I am sure I will be quiet like a little German Shepherd Dog puppy :D

I have no problem re-thinking my theories, but being as stubborn as I am I refuse to do it if someone just stops at "it is this way because I said so"... well, unless NDAs are involved. Is there something I do not understand? Help me learn the tools or show me how to research my way to them, I will not mind.
 
Dio said:
Dave, I offer you the same advice Simon gave me a few months back in a similar position: give it up mate. There's no point.

hehe - I'm toying with using Simon's quote in that post as my sig! ;)

However, I fear that if everyone vaguely constructive gives up then the forum will end up with the likes of Chap and Deadmeat doing it in a much worse fashion, meaning this place goes to the dogs again and I really will have to close it down then.
 
DaveBaumann said:
Panajev2001a said:
Yes, and the philosophy behind Cell and the associated funding has to take into account that PS3 is not its sole application.

Does this really make a big difference though?

CELL was designed to be modular, scalable, fast at multimedia workloads and network ready.

These are all areas in which the PlayStation 3 R&D team is interested: modularity and scalability are important because, with advances in Semiconductor technology, they can afford to pack in more processing constructs ( more APUs, etc... ).

Well, when designing a standalone console, what is the importance of having a CPU whose fundamental structure is network ready? Is multimedia functionality as important as its 3D processing abilities?

This point here is basically going back to the more dedicated vs generalised units discussion.

I see your point Dave and that is an interesting one.

I would not think of CELL as a traditional general-purpose approach: if you ran SPEC on it, it would do decently, but it would likely show low single-thread efficiency, and the processor would not be the best thing per cycle at running Office suites, etc...

Well, of course if the benchmark were to run tons of instances of the same legacy application... well, things would be different: you always play to the strengths of parallel architectures when you move into the multi-tasking realm.

When I mention multi-media as being the focus, I do include 3D processing in it ( games applications and what not ).

3D graphics on an architecture like CELL would fly, as what is provided to you is geared towards the same needs that 3D Graphics chip designers face.

Its multi-media functionality could be said to be a by-product of the strong 3D Processing capabilities the architecture shows ( in the patent's implementations ), or we could make the inverse case: the good 3D Processing capabilities are a by-product of the strong multi-media functionality the architecture shows.

I see that the flexibility comes at a cost, in the sense that I suspect the ATI VPUs will come with more of their silicon budget spent on HW tricks, very optimized for the task at hand.

The EE's VUs peak at less than what NV2A's Vertex Shaders do, but still some developers do appreciate the flexibility the EE's VUs provided.

It is a trade-off: CELL and all the other parallel architectures understood that focusing on the needs of multi-media/highly vectorizable applications was the best choice for obtaining more bang for your transistor buck, and for getting in those applications ( which are the ones that really need power ) the best performance compared to a traditional General Purpose CPU approach.

In the same way VPUs focus on an even smaller sub-set of applications ( basically implementing the classical SGI pipeline faster and faster and faster yet ) and might very well peak at a higher performance in them.

I said... if they all release at approximately the same time they will be comparable, with each architecture doing one thing better and another doing something else better.

I am sure that the extra flexibility ( the APUs in themselves are still more like generalized DSPs than General Purpose CPUs ) of CELL will be put to good use... physics, A.I., different Rendering algorithms ( not only the SGI approach ), etc...


Like people are seeing on the PC front, the applications that require performance are mostly games and other multi-media applications ( video encoding and decoding, music encoding and decoding, Image Processing, etc... ), and if you had to think about an architecture that would be best at running them, you would go towards parallel processors, which is what GPUs are also becoming.

Call it CELL, call it IPF, call it POWER5, call it NV3X, call it R3XX, call it R4XX or NV4X... the idea is similar.

High bandwidth, vast Single Precision Floating Point processing power, very high exposed parallelism... the same concepts are being used now in VPUs as they are in these highly parallel micro-processor architectures.

The efficiency of these parallel architectures at what we can call legacy applications ( low ILP, low TLP [mostly single-threaded applications]... full of conditional branches... applications that require strong scalar performance ) is LOW... some CPU designers might call it embarrassingly low, but the point is that these days even that LOW is good enough, especially with the tendency of users to run several applications at the same time if they can.

These are flexible Vector Processors with strong multi-media related sources of inspiration, not general purpose processors as they have been intended so far.

They are more like DSPs that can also be used to do tasks that common CPUs do, but they are not oriented towards optimizing for the same workload.
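
To make the "exposed parallelism" point a bit more concrete, here is a minimal sketch ( plain C++, with purely illustrative names, nothing taken from the patents ) of the kind of workload these vector-oriented designs are built for: big batches of independent vector math, where every iteration could run on a different SIMD lane, VU or APU-like unit.

Code:
#include <array>
#include <vector>

struct Vec4 { float x, y, z, w; };
using Mat4 = std::array<std::array<float, 4>, 4>;

// Every iteration is independent of every other one: ideal for a wide SIMD
// unit, a VU-style coprocessor or an array of DSP-like cores. Branch-heavy
// scalar "legacy" code has no equivalent structure to exploit.
void transform_vertices(const Mat4& m, std::vector<Vec4>& verts) {
    for (Vec4& v : verts) {
        Vec4 r;
        r.x = m[0][0]*v.x + m[0][1]*v.y + m[0][2]*v.z + m[0][3]*v.w;
        r.y = m[1][0]*v.x + m[1][1]*v.y + m[1][2]*v.z + m[1][3]*v.w;
        r.z = m[2][0]*v.x + m[2][1]*v.y + m[2][2]*v.z + m[2][3]*v.w;
        r.w = m[3][0]*v.x + m[3][1]*v.y + m[3][2]*v.z + m[3][3]*v.w;
        v = r;
    }
}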
 
DaveBaumann said:
Dio said:
Dave, I offer you the same advice Simon gave me a few months back in a similar position: give it up mate. There's no point.

hehe - I'm toying with using Simon's quote in that post as my sig! ;)

However, I fear that if everyone vaguely constructive gives up then the forum will end up with the likes of Chap and Deadmeat doing it in a much worse fashion, meaning this place goes to the dogs again and I really will have to close it down then.

Plus, we are not having THAT bad of a chat... I hope...

I can say that I learned new things talking about Shaders and real applications running on what the patent presented as the Broadband Engine and the Visualizer: previously I had been too focused on basic 3D Vector processing, data flow and general architectural ideas, and less on "OK, so what will real-world 3D applications face in terms of problems?"... a mistake, but I cannot always look at all the aspects of the picture.

Still, I am thankful if people can help ( if their intentions are to just have a polite discussion with maybe polite disagreements ).
 
DaveBaumann said:
Well, when designing a standalone console, what is the importance of having a CPU whose fundamental structure is network ready? Is multimedia functionality as important as its 3D processing abilities?

Let me put that a little clearer.

Let's say, for instance, that you want to put a CELL-based CPU into a PDA – maybe you can get away with just one APU. Now, that APU would need to be designed with enough multimedia processing functionality/abilities to meet the requirements of a PDA; the issue you then have is that when you scale the number of APUs up to PS3 levels, you have the potential to have a disproportionate amount of multimedia processing power relative to PS3's requirements.

In the networking instance, it may be great for a grid-based computing system that is passing vast quantities of data between each processing array to have networking capabilities at each APU level, but is this really a necessity for a console which is likely only going to be dealing with simple packets of information that relate to locations in a map for multiplayer games, or possibly some kind of multimedia serving (should this be a requirement for PS3)?

With a processor array system you have to design enough functionality/power in for the lowest level that you can conceive you are going to use it at, meaning that when you scale it up into other applications some of the functionality may be disproportionate for that particular application's primary task/goal.

(Note, it may be the case that these particular instances have processing requirements that are just as applicable to 3D processing requirements, which is fine, as you've not lost anything, but I think you get my point – if there is something in there that's specifically required for certain functionality, then as you scale upwards you run the risk of having potential waste in some applications; or, vice versa, you have to design in such a fashion that each of the smaller devices will have enough processing ability for these tasks to be done in a number of APUs that is not going to go beyond the power requirements of that device – "one size fits all" is quite a difficult issue, otherwise it would have been done by now.)
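
To illustrate the scaling concern with some toy numbers (all made up, purely for the sake of argument): the per-APU mix of capabilities is fixed at design time for the smallest device, so a bigger device is simply "more of the same" and every capability scales together, whether the larger application needs it or not.

Code:
#include <cstdio>

struct ApuProfile {          // hypothetical per-APU capability budget
    double gflops;           // vector/3D throughput
    double video_streams;    // media decode capacity
    double net_links;        // packet-routing/IO capacity
};

int main() {
    const ApuProfile apu{6.0, 0.5, 1.0};    // sized for a PDA-class device
    const int counts[] = {1, 4, 32};        // PDA, set-top and console-class counts
    for (int n : counts) {
        std::printf("%2d APUs: %6.1f GFLOPS, %5.1f streams, %5.1f links\n",
                    n, n * apu.gflops, n * apu.video_streams, n * apu.net_links);
    }
    return 0;
}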

Edit - hadn't seen the previous two posts before submitting this.
 
DaveBaumann said:
DaveBaumann said:
Well, when designing a standalone console, what is the importance of having a CPU whose fundamental structure is network ready? Is multimedia functionality as important as its 3D processing abilities?

Let me put that a little clearer.

Let's say, for instance, that you want to put a CELL-based CPU into a PDA – maybe you can get away with just one APU. Now, that APU would need to be designed with enough multimedia processing functionality/abilities to meet the requirements of a PDA; the issue you then have is that when you scale the number of APUs up to PS3 levels, you have the potential to have a disproportionate amount of multimedia processing power relative to PS3's requirements.

In the networking instance, it may be great for a grid-based computing system that is passing vast quantities of data between each processing array to have networking capabilities at each APU level, but is this really a necessity for a console which is likely only going to be dealing with simple packets of information that relate to locations in a map for multiplayer games, or possibly some kind of multimedia serving (should this be a requirement for PS3)?

With a processor array system you have to design enough functionality/power in for the lowest level that you can conceive you are going to use it at, meaning that when you scale it up into other applications some of the functionality may be disproportionate for that particular application's primary task/goal.

(Note, it may be the case that these particular instances have processing requirements that are just as applicable to 3D processing requirements, which is fine, as you've not lost anything, but I think you get my point – if there is something in there that's specifically required for certain functionality, then as you scale upwards you run the risk of having potential waste in some applications; or, vice versa, you have to design in such a fashion that each of the smaller devices will have enough processing ability for these tasks to be done in a number of APUs that is not going to go beyond the power requirements of that device – "one size fits all" is quite a difficult issue, otherwise it would have been done by now.)

It was not done before now because Semiconductor technologies did not allow scaling up as much and did not allow things like e-DRAM or tons of little SRAM memories to keep everything local.

SoC design is blossoming these days not because of many new ideas, but because the technology has finally caught up to researchers' theories.

You make the comment about the PDA.

The theory behind CELL is that, Instruction-wise, the APU stays the same on any CELL-based system.

APUs naturally target multi-media ( 3D included ) applications and workloads with generally a high degree of parallelism.

If you put CELL in a PDA, it is because you want the PDA to display cool videos, play nice 3D graphics, decode nice music, etc...

If you put it inside a server, you would do it because you know that it can scale very well ( it is easier to build cheaper clusters ) and can handle a lot of tasks at once given its nature.

You would not put CELL in a PDA expecting it to run Word Processing, E-mail checking, Web Browsing, etc... at peak efficiency.

the issue you then have is that when you scale the number of APUs up to PS3 levels, you have the potential to have a disproportionate amount of multimedia processing power relative to PS3's requirements.

I feel that this would not be a problem, as the processing requirements for the kind of multi-media work CELL would be expected to do on a PDA would be the same ( just scaled much higher, as PlayStation 3 is after all a game machine too :) ) as the ones you would need for PlayStation 3.

In the networking instance, it may be great for a grid-based computing system that is passing vast quantities of data between each processing array to have networking capabilities at each APU level, but is this really a necessity for a console which is likely only going to be dealing with simple packets of information that relate to locations in a map for multiplayer games, or possibly some kind of multimedia serving (should this be a requirement for PS3)?

Networking capabilities are not really handled at the Hardware level: we cannot say they are implemented in Hardware.

The idea of the fundamental communication packets containing Routing Information is more of a forward-looking idea than something that could not be done any other way: to make the CELL picture work, we need a way for all CELL devices to easily inter-operate and share data and processing power ( if needed, and if networks and application requirements allow it ).

This is something that can come in handy in MMORPGs though, as it might make it easier to handle the client-server communication when you have a CELL-based server and a CELL-based client.
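
Just to sketch what I mean by packets carrying Routing Information ( a purely hypothetical layout loosely inspired by the patent's description of "software cells"; none of these field names or sizes are real ):

Code:
#include <cstdint>
#include <vector>

// Illustrative only: a self-describing unit of work that any CELL device
// could receive, route onwards or execute locally.
struct SoftwareCell {
    uint64_t source_id;         // which device/APU issued the cell
    uint64_t destination_id;    // which device/APU should execute it
    uint32_t program_id;        // identifies the program ("apulet") to run
    std::vector<uint8_t> code;  // program image, if not already resident
    std::vector<uint8_t> data;  // operands / working set shipped with it
};

A CELL-based client and a CELL-based server built around the same kind of packet could exchange work without translating between two different formats, which is all I meant by the MMORPG example.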

CELL fits the bill for several Sony and Toshiba devices and this will allow them to save some money ( better ROI ), but if we add the fact that there is an incentive for customers too in having their devices CELL enabled, then we put the CELL R&D capital to even better use.
 
To maybe make a bit of a summary...

Sony and Toshiba saw that in their products ( PDAs, TVs, game consoles, DVD players, etc... ) there were some processing requirements that were common, some workloads that were very processing heavy, and that maybe there was a common answer that was now practical.

If you really look at it, you will see that it is not that one shoe is fitting all workloads/activities, or trying to; rather, we have several people who now happen to want to do a very similar activity, and they basically all need the same shoe model, just in different sizes, as that model in particular is the best for that activity, and they do not care much about other activities since that model will do fine enough for those.
 
OK so I have a question.

Just how much code in a game is actually parallel in nature?
Outside of the rendering problem, how much of the game logic in your average game is going to be able to exploit multiple processors effectively?

Theorizing a system with N processors, my guess would be that by and large you'd end up with N-1 dedicated to graphics, and the game running on a single processor (possibly 2).

You would probably end up constrained by the speed of the one processor running the logic and building the display lists.

I just wonder how much developers will be willing to change the way they build games especially when they are on tight schedules?

The industry isn't the same place it was 10 years ago, when everyone on a game team wanted to write graphics engines and bang on the hardware.
 
Panajev2001a said:
I see that the flexibility comes at a cost, in the sense that I suspect the ATI VPUs will come with more of their silicon budget spent on HW tricks, very optimized for the task at hand.

Well, this is true. The details of the Pixel Engines are sketchy, and we don't know the texture sampling abilities or the texture address processing either, something which ATI will be very efficient at.

Also, ATI has been refining the Hierarchical Z-buffer over the years – something that can significantly cut down on shader processing and reject many pixels in a single cycle. This is a relatively large bit of silicon and it's something that sits between Vertex and Pixel shader ops (across all pipelines) – I'm not sure how something like this would apply to the PS3's architecture.

In the same way VPUs focus on an even smaller sub-set of applications ( basically implementing the classical SGI pipeline faster and faster and faster yet ) and might very well peak at a higher performance in them.

SGI's pipeline will pretty much be a tiny subset of the pipeline that will emerge in a few years' time. SGI's pipeline was fundamentally fixed function - some elements of the SGI pipeline have already been replaced entirely by programmable units (T&L --> VS), and other programmable units have been inserted in-between fixed function units.

Panajev2001a said:
The theory behind CELL is that, Instruction-wise, the APU stays the same on any CELL-based system.

Yes, precisely. And you have to make sure you have the right instructions for use at the lowest level of granularity, which might not necessarily be applicable when scaled up. This can be a tricky balance.

I feel that this would not be a problem, as the processing requirements for the kind of multi-media work CELL would be expected to do on a PDA would be the same ( just scaled much higher, as PlayStation 3 is after all a game machine too ) as the ones you would need for PlayStation 3.

But not necessarily to the same level of performance – i.e. if you have one APU that needs the multimedia capabilities to display at 640x480, when you scale that up to PS3 that could be an overshoot for TV displays.

The idea of the fundamental communication packets containing Routing Information is more of a forward-looking idea than something that could not be done any other way: to make the CELL picture work, we need a way for all CELL devices to easily inter-operate and share data and processing power ( if needed, and if networks and application requirements allow it ).

This is something that can come in handy in MMORPGs though, as it might make it easier to handle the client-server communication when you have a CELL-based server and a CELL-based client.

In this particular instance, what can't be done with a piece of dedicated hardware?

Panajev2001a said:
Plus, we are not having THAT bad of a chat... I hope...

No, you're not. But that's not what I'm getting at – it's the fact that there is no counterpoint to some of the speculation here that has attracted the likes of Chap and Deadmeat and their own particular brand of "counterpoints".
 
Panajev2001a said:
If you really look at it, you will see that it is not that one shoe is fitting all workloads/activities, or trying to; rather, we have several people who now happen to want to do a very similar activity, and they basically all need the same shoe model, just in different sizes, as that model in particular is the best for that activity, and they do not care much about other activities since that model will do fine enough for those.

When I said one size fits all, this is what I meant - I should have said "one size scales to fit all". You still have a tricky balance to meet the right processing requirements at the lowest level and not end up with redundancy in certain applications when you scale that up. I'm not saying you can't meet your requirements with success, but it seems inevitable that you are going to end up with some redundancy somewhere in some of the applications you put it to, which goes back to the more focused units discussion earlier.
 
I am happy to report that I have not been stabbed to death by a flock of crazed woodpeckers.

A lot of the (EDIT: previous) argument so far has been about what "counts" as PS3 investment. Dave, if I hear you correctly, I think you're arguing that investment in CELL != investment in PS3, nor does investment in 65/45nm. Agreed. But what's the difference? PS3 needs CELL, and while CELL is not designed specifically for PS3 with total disregard for anything else, PS3 is still its primary application. They need 65nm/45nm lines to make PS3. So while the money spent is not for PS3 specifically (in that the research and capabilities gained can be used for other things), it has been spent on technologies that PS3 will need, and in that sense it has been spent on PS3. (Of course, those same technologies could be used to advance Xbox2 - so perhaps you can count it as Xbox2 investment as well.)
DaveBaumann said:
However, I fear that if everyone vaguely constructive gives up then the forum will end up with the likes of Chap and Deadmeat doing it in a much worse fashion, meaning this place goes to the dogs again and I really will have to close it down then.
In that case, I hope you don't give up.

EDIT: Missed the new discussion entirely.
 
I don't see the conflict between multimedia, networking, and 3D graphics. Since the APUs are really more like general-purpose DSPs, I think the 3D graphics capability is easily transferable to multimedia. With things like MPEG encoding/decoding, all you really need are fast 2D FFTs and 2D IFFTs. If you have a DSP-like architecture with MAC units, implementing the decimation algorithm will be fast. You're going to be passing substantial data around, so you need the inter-APU bandwidth. Bandwidth you would need for passing physics, texture, vertex, and pixel data anyway.
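
For instance, here's roughly why MAC hardware maps so well to the block transforms codecs use - each output coefficient of an 8-point row transform is just a chain of multiply-accumulates, and every coefficient (and every row of a block) is independent of the others. A hedged sketch in plain C++, assuming a precomputed basis table, not a real codec kernel:

Code:
#include <array>

using Row = std::array<float, 8>;

// out[k] = sum over n of in[n] * basis[k][n] -- one MAC per element.
Row transform_row(const Row& in, const std::array<Row, 8>& basis) {
    Row out{};
    for (int k = 0; k < 8; ++k) {
        float acc = 0.0f;
        for (int n = 0; n < 8; ++n)
            acc += in[n] * basis[k][n];   // multiply-accumulate
        out[k] = acc;
    }
    return out;
}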

I guess the key is that multimedia, networking and 3D graphics all share the key trait of data and logic locality. They are data-heavy and loosely logically coupled. Since the vast majority of the computations do not depend on other computations, they can be carried out in a parallel fashion. And even if they are dependent on each other, they are mostly dependent on local data. Physics calculations don't need to know what's happening on the other side of the game world until their effects propagate over. Because multimedia, networking and 3D graphics share the same kind of processing requirements, they can be met by the same type of processor.

Obviously, there are some optimizations that are application-specific, like hardwired shaders, that PS3 will not have, and that will affect its performance. But I don't think there will be the multimedia or network overkill Dave implies.
 
ERP said:
Just how much code in a game is actually parallel in nature?
Outside of the rendering problem, how much of the game logic in your average game is going to be able to exploit multiple processors effectively?
Avoiding paradigm shifts - you can always pipeline the MPU unfriendly parts :p

You would probably end up constrained by the speed of the one processor running the logic and building the display lists.
Which is old news for those of us that worked with PS2 :LOL:
Anyway, I realize the specific details don't relate to the 'average application' scenario, but I would argue that the main bottlenecks of display-list building I was faced with could be distributed across MPUs (if I had them) quite easily and efficiently.
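
Something along these lines is what I have in mind - split the visible objects into slices, let each unit build its own chunk of command data independently, then chain the chunks in order afterwards. A rough sketch only: std::thread stands in for whatever MPUs/APUs you actually have, BuildCommands() is a made-up per-object packet builder, and communication/DMA costs are ignored here.

Code:
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

struct Object { /* mesh, matrices, state... */ };
using CommandChunk = std::vector<uint32_t>;

// Stand-in for a real per-object packet builder.
CommandChunk BuildCommands(const Object&) { return CommandChunk{0u}; }

std::vector<CommandChunk> BuildDisplayListParallel(const std::vector<Object>& objs,
                                                   unsigned workers) {
    std::vector<CommandChunk> chunks(workers);
    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) {
        pool.emplace_back([&, w] {
            // Each worker only touches its own slice of objects and its own
            // output chunk, so no locking is needed until the chunks are chained.
            for (std::size_t i = w; i < objs.size(); i += workers) {
                CommandChunk c = BuildCommands(objs[i]);
                chunks[w].insert(chunks[w].end(), c.begin(), c.end());
            }
        });
    }
    for (auto& t : pool) t.join();
    return chunks;   // chained / kicked off to the GPU in order afterwards
}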
 
nondescript said:
I don't see the conflict between multimedia, networking, and 3D graphics. Since the APUs are really more like general-purpose DSPs, I think the 3D graphics capability is easily transferable to multimedia. With things like MPEG encoding/decoding, all you really need are fast 2D FFTs and 2D IFFTs. If you have a DSP-like architecture with MAC units, implementing the decimation algorithm will be fast. You're going to be passing substantial data around, so you need the inter-APU bandwidth. Bandwidth you would need for passing physics, texture, vertex, and pixel data anyway.

I guess the key is that multimedia, networking and 3D graphics all share the key trait of data and logic locality. They are data-heavy and loosely logically coupled. Since the vast majority of the computations do not depend on other computations, they can be carried out in a parallel fashion. And even if they are dependent on each other, they are mostly dependent on local data. Physics calculations don't need to know what's happening on the other side of the game world until their effects propagate over. Because multimedia, networking and 3D graphics share the same kind of processing requirements, they can be met by the same type of processor.

Obviously, there are some optimizations that are application-specific, like hardwired shaders, that PS3 will not have, and that will affect its performance. But I don't think there will be the multimedia or network overkill Dave implies.


Micron probably agrees with you.

[Attached images: 14kl.jpg, 14il.jpg, 14hl.jpg]
 
Fafalada said:
ERP said:
Just how much code in a game is actually parallel in nature?
Outside of the rendering problem, how much of the game logic in your average game is going to be able to exploit multiple processors effectively?
Avoiding paradigm shifts - you can always pipeline the MPU unfriendly parts :p

There just isn't that much stuff in the average game, outside of rendering, that works on a lot of sequential, non-dependent data. Besides, pipelining is only a win when communication overheads don't kill you.

Fafalada said:
You would probably end up constrained by the speed of the one processor running the logic and building the display lists.
Which is old news for those of us that worked with PS2 :LOL:

This is actually what I was trying to say.

Fafalada said:
Anyway, I realize the specific details don't relate to the 'average application' scenario, but I would argue that the main bottlenecks of display-list building I was faced with could be distributed across MPUs (if I had them) quite easily and efficiently.

I would argue that the main bottlenecks for display-list building are reading and copying data. Doing that over a communication link between processors is probably going to cost more than doing it to memory.

I'm not really trying to say that games can't make use of multiple processors for their core logic, just that the way most are currently constructed, it's difficult and time consuming, not to mention debugging hell. And in a deadline-oriented industry that usually equates to "won't happen often".
 
IMHO more powerful hardware is (at least as things stand) really only useful when it comes to graphics. Doing more advanced physics/AI is more of a software problem. Even with unlimited hardware resources, simulating something like human personality or thought would be very, very difficult.
 