DaveBaumann said:
The instructions nAo envisioned do not seem to be part of what CELL is about: if they were introduced, they would have to be part of every APU in some form, introducing the redundancy problem you described.
Yes.
But can you also see the flipside of that - making sure that the native instructions within the APUs hit the right level of functionality for the computational power of the range of devices you are looking for it to be placed in, at the right die size / power requirements?
I think that is why the instructions that are part of the APU ISA would be tied not to the power requirements of any particular CELL based device, but to the workloads CELL as an architecture is mostly geared towards.
I see the APU's ISA as quite small, and I am talking 50-60 Instructions kind of small.
A better solution would be to embed in a CELL CPU or CELL GPU some dedicated Silicon which is not part of the APU specification.
Ahhh, so, fixed functionality isn't such a bad boy then?
Doesn't this also defeat the entire object a little? Surely the point is to have a simple mechanism that is scalable up and down devices with the minimum of change to the basic construct - what's the point of doing that if you have to lob a bunch of extra instructions into a fixed unit somewhere for each different device?
Hehe, no, fixed functionality is not that bad of a boy: even when talking about 3D Rendering we always assumed that PlayStation 3 was not going to go 100% Software even with CELL, as there are some tasks that are simply easier to implement in fast Silicon Logic than to run through software solutions.
Those tasks also do not benefit from being run in Software: they are solved problems, and they can find successful Hardware implementations.
The idea of CELL was not to replace Dedicated Silicon 100%: we want CELL devices to be flexible, modular and still able to communicate easily with each other, and for that we need a flexible, but common building-block.
We also have to face practical problems and that is why for each device we look at what it has to do and how we can complement CELL to be at its best in that device without ruining the ideals behind CELL.
A CELL based PlayStation 3 could even have a custom non CELL based GPU and PlayStation 3 would still fit in the big picture of a CELL based Home Network although a CELL based Visualizer would be better IMHO.
CELL is flexible enough to handle most of the tasks we want it to do, but sometimes there are a few which are small in themselves yet used so often that it is not worth doing them in software, and that is why even CELL leaves some space for Dedicated Silicon.
If PlayStation 3 developers want to do software texture filtering because it fits their needs, they can still do it: they have the flexibility and power to attempt that.
Generally with CELL we try to do with our trusty APUs all those tasks for which we would like to have programmable solutions, but as I keep saying there are some tasks that nobody would really want to code as they are fully solved problems.
[Kinda OT I guess] Do we have any clue what type of instruction set would be applicable to an APU as well? It strikes me that the "look at the number of FLOPS it can do" willy waving exercises some people like to go into seem to be missing the issue - dependent on your instruction set, some ops are going to take a ton of APUs/cycles in comparison to more focused hardware. The texld instruction is one such example - obviously that's useful for shader purposes and an NV30, for instance, will be able to carry out 4 in a single cycle - how many cycles would it take non-dedicated hardware? What other instructions commonly used in shader ops are there that specific shader hardware already has, and for which the PS3 will be reliant on the number of APUs & clock cycles to achieve?
FP/FX:
ADD, SUB, MUL, DIV, MADD ( more than one kind... with broadcast and without, etc... ), LOAD/STORE, etc...
You would have SIMD and Scalar versions of several of the Arithmetic Instructions of course.
Of course they will have some more specialized Instructions, but those will be related to efficient message passing, general DMA and I/O related operations, etc...
I do not see the real need for a Dot Product Instruction for example: they should have in their Libraries a function that does that work and that maps to a certain sequence of simple instructions, but I am saying something very obvious here.
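To make that a bit more concrete, here is a minimal sketch of how a library dot product could map onto a short MUL/MADD sequence. Everything here is an assumption for illustration: the 4-wide vector width, the helper names mul_bc and madd_bc, and the structure-of-arrays layout are mine, not anything from a published APU ISA. The Python functions just stand in for the broadcast flavours of MUL and MADD mentioned above.

```python
# Hypothetical 4-wide SIMD primitives. The names and the vector width are
# illustrative assumptions, not taken from any real APU instruction set.

def mul_bc(a, s):
    """Models a SIMD MUL with broadcast: every lane of a times scalar s."""
    return [x * s for x in a]


def madd_bc(acc, a, s):
    """Models a SIMD MADD with broadcast: acc + a * s, lane by lane."""
    return [c + x * s for c, x in zip(acc, a)]


def dot4_soa(xs, ys, zs, ws, v):
    """Dot four vectors against one vector v using 1 MUL + 3 MADDs.

    The four input vectors are stored structure-of-arrays style:
    xs holds their x components, ys their y components, and so on.
    Lane i of the result is the dot product of input vector i with v.
    """
    acc = mul_bc(xs, v[0])
    acc = madd_bc(acc, ys, v[1])
    acc = madd_bc(acc, zs, v[2])
    acc = madd_bc(acc, ws, v[3])
    return acc
```

With this layout no horizontal add across lanes is needed, which is one reason a dedicated Dot Product Instruction may not be worth the opcode space.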
If you look at EE's VUs ISA, minus some instructions like CLIP that are related to 3D Graphics in particular, we might find a good deal of the kind of ISA we are expecting the APUs to have.
I expect them to extensively profile 3D Engines, Image Processing applications, Networking Stacks, etc... ( all applications in which CELL has the advantage and for which its power and architecture are best suited ) and to include in the ISA the most used Instructions ( plus of course some set of useful general and basic operations ) they find.
I do not see complex 3D Operations being part of the ISA, but I do see the ISA including useful operations which would be needed to implement those Operations and make them relatively fast.
Some of those instructions might be used for several tasks... to give an example, we might see the usefulness of some comparison Instructions that work on absolute values.
Those would be useful for Physics, Image Processing and general Signal Processing, 3D Graphics, etc...
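As a rough sketch of what I mean by a comparison on absolute values, assume a hypothetical 4-wide instruction that compares magnitudes lane by lane and produces a mask, plus a select that picks lanes based on that mask. The names cmp_abs_gt, select and max_abs4 are made up for the example and are not real APU opcodes.

```python
# Hypothetical lane-wise primitives. Names and semantics are assumptions
# made for illustration only.

def cmp_abs_gt(a, b):
    """Per-lane mask: True where |a[i]| > |b[i]|."""
    return [abs(x) > abs(y) for x, y in zip(a, b)]


def select(mask, a, b):
    """Per-lane select: take a[i] where the mask is set, otherwise b[i]."""
    return [x if m else y for m, x, y in zip(mask, a, b)]


def max_abs4(a, b):
    """Keep, in each lane, whichever input has the larger magnitude."""
    return select(cmp_abs_gt(a, b), a, b)
```

The same two primitives could serve peak detection in audio, thresholding in image filters, or picking the dominant axis of a vector in physics code, which is the kind of reuse across workloads I have in mind.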
One thing about what ERP and Fafalada said about "Pipelines", again quoting Suzuoki's CELL patent:
[0131] The ability of APUs to perform tasks independently under the direction of a PU enables a PU to dedicate a group of APUs, and the memory resources associated with a group of APUs, to performing extended tasks.
The PUs can create "Pipelines" dedicating APUs to certain tasks: it seems to me that the STI guys thought about cases in which this might be useful to programmers.
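Purely as a thought experiment, here is how I picture a PU reserving groups of APUs for the stages of such a "Pipeline". The PU class, its dedicate and run methods, and the stage names are all hypothetical; the patent does not describe an API like this, and the sketch runs the stages one after another instead of in parallel on real hardware.

```python
class PU:
    """Toy model of a PU that hands out APUs to pipeline stages."""

    def __init__(self, num_apus):
        self.free = list(range(num_apus))  # ids of APUs not yet reserved
        self.stages = []                   # (name, apu_ids, function) per stage

    def dedicate(self, name, count, fn):
        """Reserve `count` free APUs for a stage that applies `fn` to each item."""
        if count > len(self.free):
            raise ValueError("not enough free APUs")
        apus, self.free = self.free[:count], self.free[count:]
        self.stages.append((name, apus, fn))
        return apus

    def run(self, items):
        """Push every item through the stages in the order they were dedicated."""
        for _name, _apus, fn in self.stages:
            items = [fn(x) for x in items]
        return items
```

For example, a PU with 8 APUs could dedicate 4 to a "transform" stage and 2 to a "lighting" stage, leaving 2 free for other work.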