IBM Cell Patent: Broadband Engine/PU Cache/Bus.

j^aws

A pic from patent,

IBM-BE-8PU.jpg


Link to patent: Bus interface controller...

Those 'skilled in the art' can infer things from this patent...me,...I have a headache! ;)

From patent,

[0016] The local IC 110 comprises a plurality of processor units (PU) 121 to 128. Each PU 121 to 128 is coupled to its own cache 131 to 138 by a PU cache bus 130. In one embodiment, the PU cache bus 130 is part of a broadband engine (BE) bus (BEB) 130. For ease of illustration, in FIG. 1, the BEB 130 is generally depicted as a cutaway element of a lower layer of a 3-dimensional rendition of the IC chip 110. However, those of skill in the art will understand that other placements and configurations of the BEB 130 are within the scope of the present invention.

Mmm...But I do like that BE with 8PUs and a multi-layered chip design! 8)
 
The bus design is becoming clearer. Quite interesting and original, though I'm not sure whether some server-class multiway CPUs already use something similar. Devs probably don't need to worry about the details presented here; that should be abstracted. Whether the caches of the PUs are synced or not is still unclear, though. The paper seems to be keeping that decision open. It seems as if whether pieces of cache data are synced or not is optional and can be decided by software at runtime! But I'm not sure yet.

In very general terms, a distributed TLB/cache system. Should be interesting to see how well this works to relieve bus bottleneck.
 
Jaws finds another.

Soon the patent will arise that will detail the entire "next generation entertainment" system.
 
Mfa said:
Were the local memories for the PUs called caches in all the patents BTW?
It's L2 cache in the patent just before this one.
Prior to that there was scarcely any detail at all on PUs - and there wasn't really any mention of them having local memory, or what configuration it might be.
 
I found this patent 2 weeks ago and I thought it wasn't related to CELL..well, shame on me :D
Maybe I should double check some new patents by Mr. Kahle that recently appeared..
 
PC-Engine said:
Mmm...But I do like that BE with 8PUs and a multi-layered chip design!

Don't all cpus use multiple (7+) layers nowadays?

Yeah...but I'm not sure how many layers are used these days. I wonder if there'd be a jump in layers, especially considering the number of transistors and buses required for the BE, and would an increase in layers save on die space?

Also these PU caches got me thinking...in the above BE, the local memory seems to be off chip and I know the Sony BE patent, with 4 PUs, suggests 64MB of eDRAM, but what if those transistors for eDRAM got sacrificed by larger on chip L2 caches for these PUs? Or a combination of larger caches, more PUs and APUs? There must be an optimal combination for maximum BE performance. Would more L2 cache for the PUs be better than eDRAM for a given die space?

These patents from Sony and IBM got submitted years ago, circa 2002. Patents are only a guideline and there would be plenty of time for them to tweak the design of the BE for PS3. Any ideas on whether BE for PS3 would be 'taped' out by now or what kind of lead time is necessary for a Spring 2006 release?
 
what if those transistors for eDRAM got sacrificed by larger on chip L2 caches for these PUs?
Well, in the present PS2 EE, the VUs are starved by the MIPS core cache, and our resident devs express their hate for that vocally. From what I can gather so far from a paper nAo posted earlier, you want a really good performing cache system for the PUs - they are responsible for feeding PE data requests from main memory(RAM).

Granted, "good" does not imply "big" for cache design. Multimedia applications such as games tend to thrash the cache(right?), so we may see diminishing returns past a certain point quickly. I don't expect L2 to be too large.
 
Paul said:
Soon the patent will arise that will detail the entire "next generation entertainment" system.

E3 2005, destined to be an all time classic, PS3, XB2, N5 revelations...mmm.

Tokyo show 2004 for Cell workstation revelations???

In the meantime we'll have to scavenge scraps to build the jigsaw... :cry:

passerby said:
Quote:
what if those transistors for eDRAM got sacrificed by larger on chip L2 caches for these PUs?
Well, in the present PS2 EE, the VUs are starved by the MIPS core cache, and our resident devs express their hate for that vocally. From what I can gather so far from a paper nAo posted earlier, you want a really good performing cache system for the PUs - they are responsible for feeding PE data requests from main memory(RAM).

Granted, "good" does not imply "big" for cache design. Multimedia applications such as games tend to thrash the cache(right?), so we may see diminishing returns past a certain point quickly. I don't expect L2 to be too large.

Do you think that the PUs in the PS3 BE will use a MIPS core/cache for backwards compatibility or an IBM Power or a PowerPC core? Could the PUs be any type of cores?
 
E3 2005, destined to be an all time classic, PS3, XB2, N5 revelations...mmm.

Tokyo show 2004 for Cell workstation revelations???

In the meantime we'll have to scavenge scraps to build the jigsaw... :cry:

Dude, June 29th, IBM and Sony talk Cell at IEEE :) We'll get more info then.
 
Paul said:
E3 2005, destined to be an all time classic, PS3, XB2, N5 revelations...mmm.

Tokyo show 2004 for Cell workstation revelations???

In the meantime we'll have to scavenge scraps to build the jigsaw... :cry:

Dude, June 29th, IBM and Sony talk Cell at IEEE :) We'll get more info then.

Thanks...It's in my diary now! :D
 
You guys are likely setting yourselves up for another disappointment. The presentation at the Vail Workshop is about Consumer Electronics, with presentations also by Panasonic and Hitachi. Hofstee is scheduled to talk about the trade-offs in MPU design.

When the IC which was to become the Emotion Engine was unveiled at IEEE, Kutaragi was there with the leaders of the composite Sony/Toshiba team. There is no indication they'll have any part of the [micro]architecture shown during a conference with Panasonic and Hitachi. I'd expect this to be like the E3 presentation with James Kahle, where he makes some telling remarks about Cell and pervasive computing in gaming and nobody cares because it's overshadowed in the media by the marketing catchphrases from the ATI and Microsoft reps. :rolleyes:

PS. Faf and Pana, remember what I bet you at GAForum about 8PUs? The numbers keep popping up.
 
Do you think that the PUs in the PS3 BE will use a MIPS core/cache for backwards compatibility or an IBM Power or a PowerPC core? Could the PUs be any type of cores?
Wondering myself what a PU is going to be. :p Most popular opinion is that it is some PPC derivative. But you never know, it could turn out to be some specialized solution just for coordinating and moving data around with other PUs, among PEs, in-out of BE, etc.
 
Vince said:
You guys are likely setting yourselves up for another disappointment. The presentation at the Vail Workshop is about Consumer Electronics, with presentations also by Panasonic and Hitachi. Hofstee is scheduled to talk about the trade-offs in MPU design.

When the IC which was to become the Emotion Engine was unveiled at IEEE, Kutaragi was there with the leaders of the composite Sony/Toshiba team. There is no indication they'll have any part of the [micro]architecture shown during a conference with Panasonic and Hitachi. I'd expect this to be like the E3 presentation with James Kahle, where he makes some telling remarks about Cell and pervasive computing in gaming and nobody cares because it's overshadowed in the media by the marketing catchphrases from the ATI and Microsoft reps. :rolleyes:

Catchphrases?...I must've slept through that one! ;) The only one I've heard is XNA...

Vince said:
PS. Faf and Pana, remember what I bet you at GAForum about 8PUs? The numbers keep popping up.

Now you're teasing...:?: :D C'mon...no secrets on these forums! :D

passerby said:
Do you think that the PUs in the PS3 BE will use a MIPS core/cache for backwards compatibility or an IBM Power or a PowerPC core? Could the PUs be any type of cores?
Wondering myself what a PU is going to be. Most popular opinion is that it is some PPC derivative. But you never know, it could turn out to be some specialized solution just for coordinating and moving data around with other PUs, among PEs, in-out of BE, etc.

Either way, they must be pretty small/compact cores for the PUs if Vince and the above BE suggest 8 PUs...
 
8x8? I don't think so.
8x4 Would be very nice..
4x8 was the original figure..
4x4 is my bet for the final configuration :)
 
Megadrive1988 said:
If Broadband Engine has 8 PUs, and still 8 APUs per PU, would that not be the chip with 72 processors in it, reported by Mercury News?

http://www.mercurynews.com/mld/mercurynews/5311288.htm?1c

8 PowerPC or POWER (or even MIPS) cores (the PUs) with 64 APUs
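For what it's worth, the 72 figure does fall out of simple counting if every PU and every APU is tallied as a processor. A quick sketch, assuming the "NxM" shorthand used earlier in the thread means N PUs with M APUs each (the function name is just made up for illustration):

```python
def total_processors(pus: int, apus_per_pu: int) -> int:
    """Count every core on the chip: the PUs plus the APUs each PU controls."""
    return pus + pus * apus_per_pu


# PU x APUs-per-PU configurations floated earlier in the thread
for pus, apus in [(8, 8), (8, 4), (4, 8), (4, 4)]:
    print(f"{pus}x{apus}: {total_processors(pus, apus)} processors")
```

Only the 8x8 case gives 72; the 4x8 layout from the original Sony patent would count as 36 on the same basis.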

Posted on Tue, Mar. 04, 2003

Sony chip to transform video-game industry

TECHNOLOGY ENVISIONS ALL-IN-ONE BOX FOR HOME
By Dean Takahashi
Mercury News


Sony's next-generation video-game console, due in just two years, will feature a revolutionary architecture that will allow it to pack the processing power of a hundred of today's personal computers on a single chip and tap the resources of additional computers using high-speed network connections.

If key technical hurdles are overcome, the ``cell microprocessor'' technology, described in a patent Sony quietly secured in September, could help the Japanese electronics giant achieve the industry's holy grail: a cheap, all-in-one box for the home that can record television shows, surf the Net in 3-D, play music and run movie-like video games.

Besides the PlayStation 3 game console, Sony and its partners, IBM and Toshiba, hope to use the same basic chip design -- which organizes small groups of microprocessors to work together like bees in a hive -- for a range of computing devices, from tiny handheld personal digital assistants to the largest corporate servers.

If the partners succeed in crafting such a modular, all-purpose chip, it would challenge the dominance of Intel and other chip makers that make specialized chips for each kind of electronic device.

``This is a new class of beast,'' said Richard Doherty, an analyst at the Envisioneering Group in Seaford, N.Y. ``There is nothing like this project when it comes to how far-reaching it will be.''

Game industry insiders became aware of Sony's patent in the past few weeks, and the technology is expected to be a hot topic at the Game Developers Conference in San Jose this week. Since it can take a couple of years to write a game for a new system, developers will be pressing Sony and its rivals for technical details of their upcoming boxes, which are scheduled to debut in 2005.

Ken Kutaragi, head of Sony's game division and mastermind of the company's last two game boxes, is betting that in an era of networked devices, many distributed processors working together will be able to outperform a single processor, such as the Pentium chip at the heart of most PCs.

With the PS 3, Sony will apparently put 72 processors on a single chip: eight PowerPC microprocessors, each of which controls eight auxiliary processors.

Using sophisticated software to manage the workload, the PowerPC processors will divide complicated problems into smaller tasks and tap as many of the auxiliary processors as necessary to tackle them.

``The cell processors won't work alone,'' Doherty said. ``They will work in teams to handle the tasks at hand, no matter whether it is processing a video game or communications.''

As soon as each processor or team finishes its job, it will be immediately redeployed to do something else.

Such complex, on-the-fly coordination is a technical challenge, and not just for Sony. Game developers warn that the cell chips do so many things at once that it could be a nightmare writing programs for them -- the same complaint they originally had about the PlayStation 2, Sony's current game console.

Tim Sweeney, chief executive of Epic Games in Raleigh, N.C., said that programming games for the PS 3 will be far more complicated than for the PS 2 because the programmer will have to keep track of all the tasks being performed by dozens of processors.

``I can't imagine how you will actually program it,'' he said. ``You do all these tasks in parallel, but the results of one task may affect the results of another task.''

But Sony and its partners believe that if they can coordinate those processors at maximum efficiency, the PS 3 will be able to process a trillion math operations per second -- the equivalent of 100 Intel Pentium 4 chips and 1,000 times faster than processing power of the PS 2.

That kind of power would likely enable the PS 3 to simultaneously handle a wide range of electronic tasks in the home. For example, the kids might be able to race each other in a Grand Prix video game while Dad records an episode of ``The Simpsons.''

``The home server and the PS 3 may be the same thing,'' said Kunitake Ando, president and chief operating officer of Sony, at a recent dinner in Las Vegas.

Sony officials said that one key feature of the cell design is that if a device doesn't have enough processing power itself to handle everything, it can reach out to unused processors across the Internet and tap them for help.

Peter Glaskowsky, editor of the Microprocessor Report, said Sony is ``being too ambitious'' with the networked aspect of the cell design because even the fastest Internet connections are usually way too slow to coordinate tasks efficiently.

The cell chips are due to begin production in 2004, and the PS 3 console is expected to be ready at the same time that Nintendo and Microsoft launch their next-generation-game consoles in 2005.

Nintendo will likely focus on making a pure game box, but Microsoft, like Sony, envisions its next game console as a universal digital box.

A big risk for Sony and its allies is that in their quest to create a universal cell-based chip, they might compromise the PS 3's core video-game functionality. Chips suitable for a handheld, for example, might not be powerful enough to handle gaming tasks.

Sony has tried to address this problem by making the cell design modular; it can add more processors for a server, or use fewer of them in a handheld device.

``We plan to use the cell chips in other things besides the PlayStation 3,'' Ando said. ``IBM will use it in servers, and Toshiba will use it in consumer devices. You'd be surprised how much we are working on it now.''

But observers remain skeptical. ``It's very hard to use a special-purpose design across a lot of products, and this sounds like a very special-purpose chip,'' Glaskowsky said.

The processors will be primed for operation in a broadband, Net-connected environment and will be connected by a next-generation high-speed technology developed by Rambus of Los Altos.

Nintendo and Microsoft say they won't lag behind Sony on technology, nor will they be late in deploying their own next-generation systems.

While the outcome is murky now, analyst Doherty said that a few things are clear: ``Games are the engine of the next big wave of computing. Kutaragi is the dance master, and Sony is calling the shots.''
------------------------------------------------------------------------
Contact Dean Takahashi at dtakahashi@sjmercury.com or (408) 920-5739.

Source

Fafalada said:
So they're estimating 4 x 3.2 GHz 512 Mbit chips -> 256 MB @ 25.6 GB/s. Still not bad assuming 64MB eDRAM on the CPU and 32MB+ eDRAM on the GPU.
Personally, I am a bit worried about the notion of a supposedly large eDRAM buffer on the CPU.
For one, that would pretty much guarantee no L2 cache on PEs - and I don't need to go explaining why that could be a problem (I'll let ERP do it instead).
And secondly you insert another layer of DMA juggling on top of the already non-trivial juggling between main memory and the APU local memories.

Now, granted - at 64MB the above is not that much of an issue since you could be really lazy with reloading it - the thing is I sincerely doubt it could be anywhere that big.
What I'm worried about is that we'll end up with something like 2-4MB per PE, and still no actual cache for the CPU.

Fafalada and others seem to have concerns about eDRAM on another thread.

Are you saying that we can't have both L2 cache on PUs and eDRAM because of inefficiency? Which would you consider more efficient?
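As a sanity check on the main memory estimate quoted at the top of that post (four 512 Mbit chips at 3.2 GHz giving 256 MB at 25.6 GB/s), here is a rough sketch of the arithmetic. The capacity follows directly from the quoted numbers; the bandwidth only works out if "3.2 GHz" is read as 3.2 Gbit/s per data pin and each chip is 16 bits wide, and that width is my assumption, not something stated in the thread:

```python
def capacity_mb(chips: int, density_mbit: int) -> float:
    """Total capacity in megabytes (8 bits per byte)."""
    return chips * density_mbit / 8


def bandwidth_gb_s(chips: int, pins_per_chip: int, gbit_per_pin: float) -> float:
    """Aggregate peak bandwidth in GB/s across all chips."""
    return chips * pins_per_chip * gbit_per_pin / 8


PINS_PER_CHIP = 16  # ASSUMPTION: x16 devices, not given in the thread

print(capacity_mb(4, 512), "MB")
print(bandwidth_gb_s(4, PINS_PER_CHIP, 3.2), "GB/s")
```

With those assumptions both quoted figures come out consistent; a different device width would change the bandwidth but not the capacity.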

nAo said:
8x8? I don't think so.

How much die space would 64 MB of eDRAM take at 65nm? If eDRAM was discarded, could they pack in 8 PUs and 64 APUs instead of 4 PUs, 32 APUs and 64 MB eDRAM? They'd still attain 1 TFLOP at 2 GHz if they're having trouble attaining 4 GHz. :?:
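The trade-off can be put in numbers with a rough peak-throughput sketch. It assumes each APU retires 8 floating-point operations per cycle (four FP units each doing a multiply-add), which is my reading of the Sony patent and not something confirmed in this thread, and it ignores any contribution from the PUs themselves:

```python
FLOPS_PER_CYCLE_PER_APU = 8  # ASSUMPTION: 4 FP units x multiply-add per cycle


def peak_gflops(apus: int, clock_ghz: float,
                flops_per_cycle: int = FLOPS_PER_CYCLE_PER_APU) -> float:
    """Theoretical peak in GFLOPS; real workloads would land well below this."""
    return apus * clock_ghz * flops_per_cycle


print(peak_gflops(32, 4.0))  # 4 PUs x 8 APUs at 4 GHz
print(peak_gflops(64, 2.0))  # 8 PUs x 8 APUs at 2 GHz
```

Under that assumption both configurations land on the same peak of roughly 1 TFLOP, so doubling the APUs would exactly offset halving the clock, on paper at least.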
 
DRAM as a whole is rather dense, especially the capacitor-less cell Toshiba's developed. It wouldn't allow another four PEs + 8 APUs. Maybe if it had been 64MB SRAM instead of eDRAM... ;)
 