Jawed said: Also, see this thread:
http://www.beyond3d.com/forum/showthread.php?t=23361
I can't believe the continued, idiotic resistance to the idea that Cell PPE and Xenon are 1-VMX designs when we have all this evidence.
There's no doubt that Cell PPE has "something up its sleeve", but it isn't two VMXs.
Jawed
3N1gM4 said: Both the PPE and XCPU will probably have 2 VMX units. Having just one unit would be too much of a backward step in terms of performance.
Step back as compared to what? The norm is for a CPU to either have NO vector unit, or one. xCPU will have THREE in total.
Guden Oden said: Besides, MS has already published flops figures for the thing, and they correlate well with one vector processing element per core.
But seemingly ignore the DP instruction for some reason.
Jawed said: But seemingly ignore the DP instruction for some reason.
Um, MS did state the number of DPs the chip can calculate/sec afair. What makes you say they ignored it, and assuming they did, what effect would that have, in your opinion?
Jawed said: But seemingly ignore the DP instruction for some reason.
4 component DP is 7 flops/cycle, MADD is 8. Why would anyone count with DPs when that would reduce the FP rating?
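The 7-versus-8 figure can be checked with a quick count. This is only a sketch under stated assumptions: every multiply and every add counts as one flop, the MADD is a 4-wide multiply-add (one multiply and one add per lane), and the function names are made up for illustration.

```python
# Flop-counting sketch. Assumptions (not from the thread): one flop per
# multiply, one flop per add, and a vector width of 4 lanes.

def dot_flops(width=4):
    """Flops in one dot product of two `width`-component vectors."""
    multiplies = width       # one multiply per component pair
    adds = width - 1         # horizontal sum of the products
    return multiplies + adds

def madd_flops(width=4):
    """Flops in one `width`-wide multiply-add (d = a * b + c per lane)."""
    multiplies = width       # one multiply per lane
    adds = width             # one add per lane
    return multiplies + adds

if __name__ == "__main__":
    print("4-component DP:", dot_flops(), "flops")
    print("4-wide MADD:   ", madd_flops(), "flops")
```

Under those assumptions a DP issued every cycle rates lower than a MADD issued every cycle, which is the point about peak FP ratings being quoted in MADDs.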
Fafalada said: 4 component DP is 7 flops/cycle, MADD is 8. Why would anyone count with DPs when that would reduce the FP rating?
I'm guessing that it's a four-vector DP, not a four-component DP.
Shifty Geezer said: Does the DP function run concurrently with other float ops?
In all these architectures (Cell PPE, Cell SPE, Xenon) the FP pipeline appears to be able to dual-issue a math op with a load/store/permute.
Quote: I was of the opinion it was just an instruction, not an extra processing unit, and so consumed FP capacity of the VMX units with only a little increase in efficiency, but then I'm not at all well read on the XeCPU!
DP is just a pipeline. When doing a math op, the Xenon VMX runs down one of the available math pipelines: DP, Vector, Vector Simple (dunno what that means! - add?), scalar.
Jawed said: I'm guessing that it's a four-vector DP, not a four-component DP.
Matrix SIMDs are a rather inefficient use of die-space, if you're not targeting some extremely specialized application field. And the XeCPU's field isn't nearly that specialized.
Quote: SPEs don't seem to have such a unit - so it makes comparing the FP pipeline of SPE and PPE somewhat difficult
True, but we do have other CPUs out there with a DOT instruction, and their DP pipelines are longer than MADD too. It's a horizontal operation and all that - we had quite a bit of discussion on issues with DP before (when the first SPE info was unveiled), especially in regards to issues with high-clocked processors having such operations.
Fafalada said: Matrix SIMDs are a rather inefficient use of die-space, if you're not targeting some extremely specialized application field. And the XeCPU's field isn't nearly that specialized.
darkblu said: jawed,
dp is strictly horizontal. if you had a vertical dp (i.e. multi-vector dp) that would have been a madd/macc, not a dp.
i actually find the sh4 matrix-vector multiplication op quite clever and universal. it proved to be of use to ; )
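The horizontal/vertical distinction can be sketched with a 4x4 matrix-vector multiply done both ways. This is a plain-Python illustration, not any real instruction set: `dot4`, `madd4` and both `mat_vec_*` helpers are hypothetical names, and lists stand in for 4-wide vector registers.

```python
# Sketch: the same 4x4 matrix-vector product computed two ways.
# All names are illustrative; Python lists stand in for vector registers.

def dot4(a, b):
    """Horizontal op: multiply lane by lane, then sum ACROSS the lanes."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]

def madd4(a, b, c):
    """Vertical op: each lane computes a*b + c independently of the others."""
    return [a[i] * b[i] + c[i] for i in range(4)]

def mat_vec_rows(m, v):
    """Four horizontal dot products, one per matrix row."""
    return [dot4(row, v) for row in m]

def mat_vec_columns(m, v):
    """Four vertical madds, accumulating one scaled column at a time."""
    acc = [0.0] * 4
    for j in range(4):
        column = [m[i][j] for i in range(4)]   # j-th column of the matrix
        splat = [v[j]] * 4                     # v[j] broadcast to all lanes
        acc = madd4(column, splat, acc)
    return acc

if __name__ == "__main__":
    m = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
    v = [1, 0, 2, 1]
    print(mat_vec_rows(m, v))
    print(mat_vec_columns(m, v))
```

Both routes give the same result; the row version needs a cross-lane sum in every op, while the column version keeps every lane independent, which is why a "multi-vector dp" ends up being a chain of madds.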
ERP said: And it was still sometimes faster to do the individual dot products because it gave you more latitude in scheduling.
darkblu said: i actually find the sh4 matrix-vector multiplication op quite clever and universal. it proved to be of use to ; )
You know damn well I was referring to execution resources, not damn repeat instructions
Fafalada said: You know damn well I was referring to execution resources, not damn repeat instructions
you know, one of these days we'll corner you and won't let you go until you spill everything you know about that bloody vfpu ; )
There are recent CPUs with ISAs offering full matrix support (not just an odd instruction here and there) but they still stick with one vector's worth of execution resources.
darkblu said: you know, one of these days we'll corner you and won't let you go until you spill everything you know about that bloody vfpu ; )