http://www.tomshardware.com/hardnews/20050825_190713.html
You will want to go to their site to read the entire article.
But here is where Cell's architecture becomes truly unique: No SPE has a view into system memory. In Intel's multicore technology, for instance, all processing cores operate as fully-capable CPUs unto themselves, with equivalent access to system memory whose arbiter is a memory controller looking over the front-side bus. In Cell architecture, only the PowerPC element (PPE) has a view of system memory, and there can be as few as one of these elements within a processor. The PPE is the only conventional processing element, with complete access to system functions (or sharing that access with another PPE, when present). Those system functions include directing another processing element -- which hasn't been discussed in detail until today -- with the more familiar-sounding name of Memory Interface Controller. The MIC fetches swatches of memory for the SPEs, providing them with a shared, collective "sandbox."
Here is where cache organization plays a critical role. Each PPE has its own L1 cache, as you might expect, which is not shared with other PPEs. Performance is boosted -- as with the Power processors we've seen to date -- by an L2 cache, the size of which appears not to be limited by the spec. For the SPEs, there is a separate and new type of cache called the SL1. All SPEs within a group share a single SL1 cache. This cache is the only world they know. In conventional caching, the processor addresses data in memory by its absolute address, but caches provide that memory as though it were being provided directly from system RAM. But SPEs are little computers, and the SL1 cache is their system RAM. The memory controller acquires the products of their work like a teacher picking up after her students at the end of class.
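To make the memory arrangement in the two paragraphs above a bit more concrete, here is a rough toy model in Python. It is only a sketch of how I read the article: the class and function names (`MemoryInterfaceController`, `spe_task`, `run_on_spes`) are made up for illustration and are not from the Cell specification, and real hardware would move data with DMA rather than list copies.

```python
# Toy model only: every name here is invented for illustration and
# does not come from the Cell specification.

class MemoryInterfaceController:
    """Moves blocks between system memory and an SPE group's shared local memory."""

    def __init__(self, system_memory):
        # In this model only the PPE side and the MIC ever touch system memory.
        self.system_memory = system_memory

    def fetch(self, start, length):
        # Copy a "swatch" of system memory; the SPEs get a copy, not a view.
        return list(self.system_memory[start:start + length])

    def write_back(self, start, block):
        # Pick up the SPEs' results and store them back into system memory.
        self.system_memory[start:start + len(block)] = block


def spe_task(local, offset, count):
    # An SPE sees only the shared local block and addresses it from 0.
    for i in range(offset, offset + count):
        local[i] = local[i] * 2


def run_on_spes(system_memory, start, length, num_spes):
    # Assumes length divides evenly by num_spes; any remainder is left untouched.
    mic = MemoryInterfaceController(system_memory)
    local = mic.fetch(start, length)         # the shared "sandbox" for the group
    chunk = length // num_spes
    for spe in range(num_spes):
        spe_task(local, spe * chunk, chunk)  # each SPE works on its own slice
    mic.write_back(start, local)
    return system_memory
```

The point of the sketch is that `spe_task` never receives `system_memory` at all. It only ever sees the local copy, which matches the "this cache is the only world they know" description.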
Another unique revelation of the Cell 1.0 specification is an apparent second order of element grouping -- or, translated into an Intel context, a "multi-multicore" possibility. A CBEA-compliant processor package can contain groups of PPE elements and separate groups of SPE elements. Judging from the algebra IBM uses to describe the interaction between elements, there need not necessarily be as many PPE groups as SPE groups. This is important because it indicates that grouping isn't necessarily the product of simply sandwiching multiple Cell processors together, although the specification deals entirely with logic and not packaging. It's therefore conceivable that a Cell processor vendor could create multiple performance tiers by integrating any number of SPEs (probably in multiples of two) with one, two, or three PPEs.
I found this part interesting for CELL's future. I had been thinking that the PS4 could end up with 4 CELL processors... which is 32 cores/elements. That is NUTS to program for.
But it seems that the SPEs are interchangeable and upgradable, so the PS4 may contain, say, 2 CELLs, each having, let's say, 2 beefed-up PPEs with more cache, with the SPEs likewise beefed up in performance and workflow and given more memory (let's say 512K or even 1MB).
That way you do not have to end up with an insane number of cores. You would still have more than today, but could scale the chip to whatever works best for the designated task.
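Just to put numbers on that, here is a quick back-of-the-envelope sketch in Python. The per-chip counts are my own assumptions for illustration (8 elements per chip to match the 32 figure above, and 6 SPEs per chip in the scaled-down case); none of them come from the article or from anything announced.

```python
# Illustrative arithmetic only; the per-chip counts below are assumptions,
# not figures from the article or from any announced console.

def total_elements(chips, ppes_per_chip, spes_per_chip):
    """Total processing elements a programmer would have to keep busy."""
    return chips * (ppes_per_chip + spes_per_chip)

# Four chips with 8 elements each (assumed here as 1 PPE + 7 SPEs).
brute_force = total_elements(chips=4, ppes_per_chip=1, spes_per_chip=7)

# Two chips, each with 2 beefed-up PPEs and (assumed) 6 beefed-up SPEs.
scaled = total_elements(chips=2, ppes_per_chip=2, spes_per_chip=6)

print(brute_force, scaled)
```

Under those assumptions the second layout has half as many elements to schedule, with each one doing more work.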
Out of the entire article I found this bit most interesting, if only because there has been a lot of discussion about the independence and abilities of the SPEs. The "master-slave" arrangement was an issue of much debate 8 months ago.
Like the co-processor of ancient days, an SPE is subordinate to the PowerPC element, and performs no system management functions whatsoever. Instead, it can be delegated user-specific tasks, especially graphics processing, which can take advantage of the SPE's Single Instruction/Multiple Data (SIMD) architecture.
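For anyone unfamiliar with SIMD, here is a minimal sketch of the idea in Python. It only imitates the concept: the function names are hypothetical, and the lane width of four is an assumption chosen to mirror a 128-bit register holding four 32-bit values. Real SPE code would use vector intrinsics, not Python tuples.

```python
# Conceptual imitation of SIMD; names and the lane width are assumptions.

def simd_add(a, b):
    # One "instruction" applied to every lane of the two vectors at once.
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of lanes")
    return tuple(x + y for x, y in zip(a, b))


def add_arrays(xs, ys, lanes=4):
    # Walk the arrays one vector at a time instead of one element at a time.
    out = []
    for i in range(0, len(xs), lanes):
        out.extend(simd_add(tuple(xs[i:i + lanes]), tuple(ys[i:i + lanes])))
    return out
```

With eight elements and four lanes, `add_arrays` issues two vector adds rather than eight scalar ones, which is why this style suits graphics work where the same operation is applied to many values.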
Again, an interesting read.