"PS3.5"? – New Sony Patent

This week Sony filed a new patent outlining a potential upgrade for the PlayStation 3 console, namely adding an external processor to increase overall bandwidth for the current Cell CPU. The patent can be found at http://www.freepatentsonline.com/y2010/0312969.html. First, here are some important points as stated in the filing:

In recent years, there has been an insatiable desire for faster computer processing data throughputs because cutting-edge computer applications involve real-time, multimedia functionality. Graphics applications are among those that place the highest demands on a processing system because they require such vast numbers of data accesses, data computations, and data manipulations in relatively short periods of time to achieve desirable visual results. These applications require extremely fast processing speeds, such as many thousands of megabits of data per second. While some processing systems employ a single processor to achieve fast processing speeds, others are implemented utilizing multi-processor architectures. In multi-processor systems, a plurality of sub-processors can operate in parallel (or at least in concert) to achieve desired processing results.

Some multiprocessing systems contemplate interconnections via interfaces in a matrix configuration to improve processing throughput and versatility.

Accordingly, there are needs in the art for new methods and apparatus for interconnecting one or more multiprocessor systems with one or more external devices to achieve higher processing capabilities.

The diagram below illustrates the new extended processing architecture (as a dotted outline), and how it integrates with Shared Memory and the console:

[Image: Shared-Memory.jpg]


Full article:
http://www.ps4sony.net/ps3-5-new-sony-patent-this-could-go-either-way/
 
This actually looks like something for the PSP2, which was rumored to be based on a 4 SPU design. It could be an interesting idea to have the PSP2 and PS3 work together on a much more integrated level than before. Of course, there are various other devices it could interact with, like recording and compressing video from a Sony Handycam on the fly or whatever, but the PS3 / PSP2 connection seems to me the most obvious one.
 
This week Sony filed a new patent outlining a potential upgrade for the PlayStation 3 console
They most certainly did not! That idea isn't even the right stick, let alone the wrong end of it!

The patent was first linked to in this forum here; DF's report is here. The patent itself does not look applicable to the PS3. It covers a processor with a memory interface unit that can be changed between working as an SMP extension to a processor, or as a conventional node within a distributed processing network. It's the former that is something special: the idea that you could have two discrete processing packages, one in a console and one in a TV say, that work as one processor with coherency across memory accesses and data sharing. To the system, there is one processor, a Cell with 2 PPUs and 12 SPUs, or whatever it is. With something like a Naughty Dog SPU scheduling codebase, Uncharted 4 would spread out to available cores transparently.
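To make that "transparent scheduling" idea concrete, here's a minimal sketch of a job dispatcher that doesn't care whether an SPE is on-die or reached over a coherent link. The types and the local/remote flag are purely illustrative, not anything from the patent or from any real scheduling codebase:

```c
/* Hypothetical sketch of scheduler-level transparency: jobs go to whichever
 * SPE is free, and with a coherent SMP link the dispatcher never needs to
 * know whether that SPE is on-die or in another package. */
#include <stddef.h>

typedef struct {
    void (*kernel)(void *args);   /* SPU program to run                  */
    void *args;                   /* effective address of the job's data */
} job_t;

typedef struct {
    int id;
    int busy;
    int is_remote;   /* 0 = local SPE, 1 = SPE reached over the coherent link */
} spe_slot_t;

/* Because memory is coherent and shared, the job's effective address is
 * valid on both packages, so local and remote slots take the same path. */
static spe_slot_t *pick_free_spe(spe_slot_t *slots, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        if (!slots[i].busy)
            return &slots[i];
    return NULL;
}
```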

The diagrams are only ever for illustration. The choice of 4 SPUs on the example chip could be as much a choice of presentation, as 8 SPUs would be a lot of redundant blocks that don't help explain the design.

Currently Cell does not have a BIC capable of switching between SMP and distributed-computing modes. I don't think such a thing can be emulated by an SPE. Even if it could, there's no connection that allows two discrete devices' CPUs to communicate at full bus speeds. The patent uses Cell's current FlexIO bandwidth as an example, noting that the BIC unit can't exceed 35 GB/s out + 25 GB/s in, but the fastest port on a PS3 is HDMI at 10 gigabits, one way AFAIK. The gigabit network port would only allow for distributed computing models, which would require custom software written to support it. I know of nothing in development that allows 50+ GB/s communication between devices. You'd need something like a RAMBUS port!
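To put rough numbers on that gap (using the figures quoted above, nothing measured):

```c
/* Back-of-envelope check: FlexIO-class coherent link vs. the ports a PS3
 * actually exposes.  All figures as quoted in the thread. */
#include <stdio.h>

int main(void)
{
    double flexio_out_gbs = 35.0;        /* GB/s, per the patent example */
    double flexio_in_gbs  = 25.0;        /* GB/s                         */
    double hdmi_gbps      = 10.0;        /* Gbit/s, roughly, one way     */
    double gige_gbps      = 1.0;         /* Gbit/s                       */

    double hdmi_gbs = hdmi_gbps / 8.0;   /* ~1.25 GB/s  */
    double gige_gbs = gige_gbps / 8.0;   /* ~0.125 GB/s */

    printf("Coherent link needed : %.0f GB/s out + %.0f GB/s in\n",
           flexio_out_gbs, flexio_in_gbs);
    printf("HDMI (one way)       : ~%.2f GB/s  (%.0fx short)\n",
           hdmi_gbs, flexio_out_gbs / hdmi_gbs);
    printf("Gigabit Ethernet     : ~%.3f GB/s (%.0fx short)\n",
           gige_gbs, flexio_out_gbs / gige_gbs);
    return 0;
}
```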

So in summary, this patent is 99.9% certain to have nothing to do with PS3, but it may point towards an improved networked Cell offering shared workloads across Cell-enabled devices. Possibly even hedging their bets for a future that's 10 years out.
 

Considering the relatively large latencies involved in sharing memory and resources across a bus in a multi-GPU design (hence the use of AFR), I'm not seeing where this would be particularly useful, say a media device <-> TV sharing computing resources across a foot or more of cable. Even more so when you consider that GPUs are far better at hiding high data-access latency, yet still face significant latency obstacles with regards to operating as a homogeneous unit.

The bandwidth may be fine, but the latency for computing is going to be quite high. As such, I wouldn't see it as, say, two 6 SPU devices being seen as one 12 SPU device. More likely it would still be two distinct devices operating separately on the same data set; say, for example, Device 1 does whatever it does and outputs data to Device 2 for further processing.

It would probably be fine for latency-tolerant activities, like watching video, but wouldn't be applicable to latency-intolerant activities such as gaming.

Regards,
SB
 
The bandwidth may be fine, but the latency for computing is going to be quite high. As such, I wouldn't see it as, say, two 6 SPU devices being seen as one 12 SPU device. More likely it would still be two distinct devices operating separately on the same data set; say, for example, Device 1 does whatever it does and outputs data to Device 2 for further processing.
Except the actual network topology will be one processing centre and one memory pool as I understand it, that being what SMP is all about. If you have a processor processing its data and passing that to another processor with its own localised memory, you're using distributed computing. It's having direct access to all memories at all times that makes this patent something different.

Yes, the latencies could be horrific, but Cell is designed to work in LS and prefetch data, so a lot of that could, perhaps, be intrinsically hidden by existing good SPE coding practice. I might be being quite optimistic here!
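For reference, this is the sort of double-buffered DMA loop meant by "good SPE coding practice": the next chunk is in flight while the current one is processed. It's a generic sketch (process_chunk and the chunk layout are placeholders; only the MFC intrinsics are real), and whether the same trick would hide an off-package latency is exactly the open question:

```c
/* SPE-side sketch: fetch chunk i+1 while processing chunk i. */
#include <spu_mfcio.h>
#include <stdint.h>

#define CHUNK 16384   /* 16 KB per DMA, the usual sweet spot */

static char buf[2][CHUNK] __attribute__((aligned(128)));

static void process_chunk(char *data, unsigned size) { (void)data; (void)size; }

void stream(uint64_t ea, unsigned nchunks)
{
    unsigned cur = 0;

    /* prime the pipeline: fetch chunk 0 */
    mfc_get(buf[cur], ea, CHUNK, cur, 0, 0);

    for (unsigned i = 0; i < nchunks; ++i) {
        unsigned next = cur ^ 1;

        /* kick off the next transfer before touching the current buffer */
        if (i + 1 < nchunks)
            mfc_get(buf[next], ea + (uint64_t)(i + 1) * CHUNK, CHUNK, next, 0, 0);

        /* wait only for the buffer we are about to use */
        mfc_write_tag_mask(1 << cur);
        mfc_read_tag_status_all();

        process_chunk(buf[cur], CHUNK);   /* overlaps with the in-flight DMA */
        cur = next;
    }
}
```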

It would probably be fine for latency-tolerant activities, like watching video, but wouldn't be applicable to latency-intolerant activities such as gaming.
If we had the BW to share GBs of data per frame, I don't think latency would be a big problem. For starters, you could divide the workload into steps and farm a particular step out, such as having the postprocessing and deferred-renderer compositing done on the TV. That should add no more latency to the current setup. You could also have background AI tasks running on off-console SPEs where the results aren't immediately necessary, like general city emulation.

I'm not sure any of this is really targeted at gaming though. I think it's more likely the idea is using PS4 to help drive media functions, e.g. a TV with a simple, low-power Cell that provides lower-quality video upscaling, and then if you connect your PS4 up, that does the heavy lifting and provides a higher-quality experience. In essence, rather than put a processor in every device, just have one processor that's used across devices. Like the cloud, we're going back to the old mainframe + dumb terminal structure.
 
But in that case wouldn't it make much more sense to just enable a Sony TV to send the TV signal to the PS3/4 and have the PS3/4 do all the work? A PS3/4 would be fast enough for that anyway, so to me it seems like a lot of hassle to try and get two devices to work on the same thing if you could just as well let one device handle all the work.

I'm sure there is some good use for this kind of system, but I just don't think it would be all that useful in the case you describe. Though I don't have any better ideas myself.
 
This actually looks like something for the PSP2, which was rumored to be based on a 4 SPU design. It could be an interesting idea to have the PSP2 and PS3 work together on a much more integrated level than before. Of course, there are various other devices it could interact with, like recording and compressing video from a Sony Handycam on the fly or whatever, but the PS3 / PSP2 connection seems to me the most obvious one.

I'm sure I can confirm this design has nothing to do with PSP2...
 

So what is it? Because the comment in the patent seems to be talking about the necessity of adding power to a platform...
 
But in that case wouldn't it make much more sense to just enable a Sony TV to send the TV signal to the PS3/4 and have the PS3/4 do all the work?
But what if you don't have a PS3/4? Your TV needs a processor, so you have to put one in, but if you put a big processor in, you have to engineer that in with added cost. The idea would be scalability, the hardware scaling to the user's investment in Cell devices. Buy a cheap 2 SPE TV now rather than the better-quality but more expensive 6 SPE TV, and buy a 4 SPE BRD player when you can afford it 6 months later, and the two offer a synergy that improves the experience of both. At Sony's end the differentiating factor is just the number of SPEs. It's the same systems and software running on all devices. GoogleTV and PSN/Qriocity content would be the exact same code base, portable and scalable, without having to rewrite for different processors and without having to worry about maintaining different virtual environments for different processors. This would then mean Sony could roll out service improvements, like FW updates for PS3, across all devices.

It'd be elegant, if it ever worked. This is very hypothetical though. I'm not even sure if the TV is the best example.

So what is it? Because the comment in the patent seems to be talking about the necessity of adding power to a platform...
It's a network processing arrangement. The actual intro reads:

Accordingly, there are needs in the art for new methods and apparatus for interconnecting one or more multiprocessor systems with one or more external devices to achieve higher processing capabilities.
It's not talking about adding processing performance to one device, but to one or more. It's about distributing processing capabilities between devices. This was Kutaragi's grand vision; does no one remember the Cell-toaster and Cell-fridge jokes?!
 
Shifty, your analysis is similar to mine except for a few things:

* Your scenario assumes that the processors need to share a lot of info. It depends on the application design. Polyphony has demonstrated multi-screen render for GT5 @ 60fps, so there should be a way for PS3s to render the same scene to the screen by duplicating data and code on the systems. They use the gigabit port for the interconnect; see the lockstep sketch at the end of this post. [EDIT: If they have enough computing power and can finish jobs much earlier, there may be leeway for a slightly slower syncing mechanism, but it's risky and case-by-case. The other way is to add an output component to sync the output independently. This output component can be one of the chosen/elected PS3s.]

* The context for this application may also be applied to a server environment. Warhawk servers are all PS3s, right? IBM is very familiar with server infrastructure too.

* The applications can be async operations (e.g., heavyweight image analysis for PSEye, DVR running in parallel, home server apps, blah)

In short, it can still be relevant for PS3. The question is price. Anyway it's just a patent now.
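The lockstep sketch referred to above: each console runs the same simulation from the same inputs, so only a tiny sync packet has to cross the gigabit port each frame. Every function name here is a placeholder for whatever transport and game code would actually be used; this is just the shape of the idea:

```c
/* Lockstep sketch: identical simulation everywhere, tiny sync packet per frame. */
#include <stdint.h>

typedef struct {
    uint32_t frame;    /* frame counter, used to detect drift       */
    uint32_t inputs;   /* packed controller state from this console */
} sync_packet_t;

/* placeholder transport and game hooks, not any real PS3 API */
void     net_broadcast(const sync_packet_t *pkt);
uint32_t net_collect_peer_inputs(uint32_t frame, int npeers); /* blocks = barrier */
void     simulate(uint32_t combined_inputs);
void     render_own_viewport(void);
void     present(void);

void lockstep_frame(uint32_t frame, uint32_t my_inputs, int npeers)
{
    sync_packet_t pkt = { frame, my_inputs };

    net_broadcast(&pkt);                              /* a few bytes per frame        */
    uint32_t peer_inputs = net_collect_peer_inputs(frame, npeers); /* wait for everyone */

    simulate(my_inputs | peer_inputs);   /* identical state evolves on every console */
    render_own_viewport();               /* each box draws only its own screen       */
    present();                           /* flips land roughly in sync               */
}
```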
 
* Your scenario assumes that the processors need to share a lot of info.
Because that's what the patent says! If they're not sharing a lot of info, there's not much point in wanting an SMP model and you could just go distributed computing.

Polyphony has demonstrated multi-screen render for GT5.
Which was not an SMP system, but a distributed computing system: 4 systems all working independently, the results being accumulated at the end. This is not a patent for using multiple processors to share workload, which is old news. It's a patent for a processor with an IO system that can change its nature.

In short, it can still be relevant for PS3.
No it can't, because the patent describes a hardware feature on the processors that PS3's Cell hasn't got! Again, this is not a patent for distributed computing, or a distributed application, or a means by which an application can be shared across processors. It is a patent for a processor design featuring a novel IO system with 2 channels, which can independently be set to run in synchronous or non-synchronous modes, allowing for a mix of topologies with the same processor architecture.
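One way to picture that IO arrangement: two channels, each independently switchable between a plain I/O interface and a coherent SMP interface (the patent excerpt quoted further down the thread spells this out). The type names and example configs here are purely illustrative, not the patent's:

```c
/* Illustrative only: two channels, each settable to IOIF or BIF mode. */
typedef enum {
    IF_MODE_IOIF,   /* non-coherent I/O link                */
    IF_MODE_BIF     /* coherent SMP link between processors */
} if_mode_t;

typedef struct {
    if_mode_t channel[2];    /* mode of each of the two physical channels      */
    unsigned  bw_split_pct;  /* share of physical-layer bandwidth on channel 0 */
} bic_config_t;

/* PS3 today: both channels are plain I/O, so there is nothing to extend.   */
static const bic_config_t ps3_like   = { { IF_MODE_IOIF, IF_MODE_IOIF }, 100 };

/* The patent's interesting case: one channel stays I/O, the other becomes a
 * coherent link that joins two packages into a single SMP machine.         */
static const bic_config_t smp_extend = { { IF_MODE_IOIF, IF_MODE_BIF  },  50 };
```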

The vision of networked devices sharing workloads is possible, and probably what Kutaragi had in mind, but that hasn't amounted to anything yet, while Sony are patenting this system to enable much closer hardware networking which they may or may not do anything with. But this patent cannot apply to PS3, unless there is a means by which an SPE can emulate the BIC unit of the new design and provide a synchronised connection between processors. I don't think that's possible because the SPE sits on the wrong side of the memory controller.
 
Yes, it's for an SMP system...

The BIC provides two flexible interfaces with varying protocols and bandwidth capabilities to address differing system requirements. The interfaces can be configured as either two I/O interfaces (IOIF 0/1) or as an I/O and a coherent SMP interface (IOIF & BIF). When the BIC is configured to operate as a coherent SMP interface, the BIC provides the PE with a high-performance, coherent interconnection. When the BIC is configured to operate as an I/O interface, the BIC provides the PE with a high-performance (non-coherent) interconnection.

The BIC includes a logical layer, a transport layer, a data link layer, and a physical link layer. The logical layer (and in some embodiments the transport layer) may be adapted to change the operation of the BIC between a coherent SMP interface (BIF) and a non-coherent interface (IOIF). The logical layer defines the basic operation of the BIF or IOIF, including the ordering and coherency rules. The transport layer defines how command and data packets are transferred between devices. Command and data packets are preferably separated into smaller units referred to as physical layer groups (PLGs) for presentation to the data link layer. The data link layer defines the facilities that ensure (substantially) error free transmission of information between the sender and the receiver. The physical layer defines the electrical characteristics and timing of the I/O drivers and describes how data link envelopes are transmitted across physical links. The physical link layer preferably supports the concurrent operation of up to two sets of logical/transport/data link layers and a configurable method of apportioning the available bandwidth of the physical layer between the two.

... but the BIC is not necessarily a hardware element, it's a logical entity and the Cell supports memory-mapped I/O. The async mode should be doable (a la distributed system). Sync mode may be less practical (depends on apps).

It definitely sounds like a Kutaragi legacy. He threw $$$ at optical interconnect R&D before the boardroom shuffle. The Cell is a distributed-system-like CPU anyway. The SPUs can only use DMA to access shared memory. All the memory-mapped DMA I/Os are managed by the PPU. One SPU should be able to access the local store of another SPU in a different Cell system via this mechanism.
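A rough SPE-side sketch of what that would look like: a DMA from an effective address that the PPU has mapped to another SPU's local store. On a single Cell this is standard practice; the speculative part is whether such an EA could ever point at an SPU in a different Cell package.

```c
/* SPE-side sketch: pull data from an effective address that the PPU has set
 * up to point at another SPU's local store.  'remote_ls_ea' is just handed
 * in as a parameter; no claim is made about how it would be built. */
#include <spu_mfcio.h>
#include <stdint.h>

static char scratch[4096] __attribute__((aligned(128)));

void pull_from_other_ls(uint64_t remote_ls_ea, uint32_t offset, uint32_t bytes)
{
    /* bytes must follow the usual DMA rules (multiple of 16, <= 16 KB) */
    mfc_get(scratch, remote_ls_ea + offset, bytes, 2 /* tag */, 0, 0);
    mfc_write_tag_mask(1 << 2);
    mfc_read_tag_status_all();   /* block until the data has landed */
    /* ... use scratch ... */
}
```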

Then again, if the coupling is not tight, the developers can implement a distributed system via other means. In other words, *if* Sony wants to do this thing, the key (breakthrough) technologies may not be described in this patent. ^_^

EDIT:
The key technologies need to address latency, throughput and cost issues.
 
http://www.ps4sony.net/

That website is a joke, right? It implies being official by using a company's intellectual property, in this case Sony, and the probable trademark name of a future PlayStation console, yet all you see are stories and links that post a lot of fear, uncertainty and doubt.
 
Who cares. If they release something, great; if not, we wait. Hey, that rhymes.
Sony probably has a lot of patents that never got used.
 
Sounds like a smart idea to extend the PS3 instead of selling a completely new device.

There are quite a few clusters that work with a unified address space. A page miss is of course damn slow (even with InfiniBand), but if you program software for multi-CPU systems, you're already aware of the bad performance of memory accesses to non-local memory (it happens even with simple 12-core Opterons). The solution is usually to mirror all data; on clusters this is done transparently, every unit has a copy of the memory page, and as long as nobody tries to write to that page, everything is fine.
But this is quite some software overhead, and it would probably cost you more than an external Cell would give you if you have to deal with page misses in software.
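A rough sketch of that mirroring trick, and of where the software overhead comes from: mirrored pages are mapped read-only, and the first write traps so the other nodes can be told to invalidate their copies. The invalidate_remote_copies() call is a placeholder for the actual network message; the mprotect/SIGSEGV part is the standard software-DSM mechanism:

```c
/* Software page mirroring sketch: trap the first write to a mirrored page. */
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <stdint.h>
#include <unistd.h>

/* placeholder for the network message telling other nodes to drop their copy */
extern void invalidate_remote_copies(void *page);

static long page_size;

static void on_write_fault(int sig, siginfo_t *si, void *ctx)
{
    (void)sig; (void)ctx;
    void *page = (void *)((uintptr_t)si->si_addr & ~(uintptr_t)(page_size - 1));

    invalidate_remote_copies(page);                     /* tell the peers first     */
    mprotect(page, page_size, PROT_READ | PROT_WRITE);  /* then let the write go on */
}

void install_dsm_write_trap(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    page_size = sysconf(_SC_PAGESIZE);
    sa.sa_flags = SA_SIGINFO;
    sa.sa_sigaction = on_write_fault;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, NULL);
    /* mirrored regions would be mapped PROT_READ, so the first write to each
     * page lands in on_write_fault() above */
}
```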

But on the other side, the OtherOS interface to the HDD/network is done via the lowest-level driver; there is no direct network/HDD hardware access, and I read somewhere that this is due to the custom integrated I/O hardware on the PS3's controller.

Let's imagine for a second that the integrated HDD/network controller can get the same memory area mapped as the user-space 'game'. In that case, you could plug in an extension via the LAN or SATA and it would just work, with no change to software nor any negative impact like interrupts.
For a game it wouldn't matter where those extra SPUs are placed or what latency they have. Sure, a latency of 1 ms+ might create issues, but usually you have <1 ms latency within a company network, across switches on shared routes, and that ping is two-way. I don't know what latency you can get if you plug a device directly into the PS3's network/HDD port, but I could imagine Sony's custom protocols handling those 0.1 metres of wire quite well. SSDs offer up to 100k IOPS, which is 0.01 ms. Although that would be an insane 32,000 cycles, compared to ~1,500 cycles for 16 KB on an SPU, it makes me confident that a nearby device is not completely out of the question.
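Checking that arithmetic with the figures quoted above (3.2 GHz clock, 100k IOPS, ~1,500 cycles for a 16 KB SPU DMA):

```c
/* Cycle-count check of the latency comparison in the post above. */
#include <stdio.h>

int main(void)
{
    double clock_hz        = 3.2e9;           /* Cell clock                */
    double ssd_latency_s   = 1.0 / 100000.0;  /* 100k IOPS -> 10 us per op */
    double dma_cycles_16kb = 1500.0;          /* figure quoted in the post */

    double ssd_cycles = ssd_latency_s * clock_hz;   /* 32,000 cycles */

    printf("SSD-class access : %.0f cycles (%.2f us)\n", ssd_cycles, ssd_latency_s * 1e6);
    printf("16 KB SPU DMA    : ~%.0f cycles\n", dma_cycles_16kb);
    printf("Ratio            : ~%.0fx\n", ssd_cycles / dma_cycles_16kb);
    return 0;
}
```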

Shifty Geezer said:
I know of nothing in development that allows 50+ GB/s communication between devices.
Neither do I (if you exclude HyperTransport, which happens to be 50 GB/s+, but is completely off topic here), but most tech I know fights with latency rather than bandwidth. LAN would allow 1 Gbit/s (maybe more if Sony has some hidden tweaks) and SATA would allow 6 Gbit/s, two-way. Not perfect, but quite OK for some high-latency zip decompression, mp3 decoding, texture transcoding, async ray-world intersections, etc.
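In per-frame terms, and counting raw link rates only (no protocol overhead), those links would buy you roughly this much data per 30 fps frame:

```c
/* Per-frame data budget over GigE and a 6 Gbit/s SATA-style link. */
#include <stdio.h>

int main(void)
{
    double frame_s  = 1.0 / 30.0;          /* 33 ms frame            */
    double gige_Bps = 1.0e9 / 8.0;         /* 1 Gbit/s  -> 125 MB/s  */
    double sata_Bps = 6.0e9 / 8.0;         /* 6 Gbit/s  -> 750 MB/s  */

    printf("GigE : %.1f MB per frame\n", gige_Bps * frame_s / 1e6);   /* ~4.2 MB */
    printf("SATA : %.1f MB per frame\n", sata_Bps * frame_s / 1e6);   /* ~25 MB  */
    return 0;
}
```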
 
rapso said:
Sounds like a smart idea to extend the PS3 instead of selling a completely new device.

If workable, maybe. The first problem Sony will consider is cost. Then we talk about bandwidth and latency. ^_^

How much would a 4 SPU external module + 512 MB RAM cost? Say… if I can hook up the PSEye via the module (à la Kinect), run game components, and run a media app/server in parallel, assuming the gigabit port alone suffices for these apps.

Based on Angelcurio's hint, it won't do PSP2 (Ha ha).

EDIT: According to this article: http://www.pcworld.com/businesscent...ns_sampling_cellchip_derived_spursengine.html
Toshiba's 4 SPU SpursEngine alone cost $50 in bulk (in 2008).
 