Cell and RSX : their relationship.

Status
Not open for further replies.
( you have seen the patent and you know we have 64 Integer Units and 64 FP Units in the Visualizer in addition to the 4 Pixel Engines and the Image Cache )

That's not opinion, that's fact. It's up to you whether you want to believe it or not.

I guess this patent that Panajev and Vince are describing must be something somebody made up. :LOL:
 
PC-Engine said:
( you have seen the patent and you know we have 64 Integer Units and 64 FP Units in the Visualizer in addition to the 4 Pixel Engines and the Image Cache )

That's not opinion, that's fact. It's up to you whether you want to believe it or not.

I guess this patent that Panajev and Vince are describing must be something somebody made up. :LOL:

Not to nitpick, but yeah, it could be. Some people on here (a poster called version, first of all) keep posting made-up graphs and "insider info" to wow the fanbois.

But yeah, there could have been some patents from Sony. The only way to see if a Cell GPU (or a so-called All-Cell system where there is no CPU or GPU, just a group of Cells, or whatever people were talking about on here) would work is to have patents and test silicon based on those patents, I guess.
 
Didn't even Dave Orton from ATI state something to the effect that Sony was probably targeting a different approach for graphics processing? The Orton statement was well before the nVidia deal was announced.
 
Yeah, too bad there is no word about a CELL based GPU there, there are just some APUs connected to a pixel engine (? a standard GPU? LOL :) )
 
nAo said:
Yeah, too bad there is no word about a CELL based GPU there, there are just some APUs connected to a pixel engine (? a standard GPU? LOL :) )

You are really in serious denial.. ;)

It's pretty obvious the BE and VS are based on CELL from that patent application. ;)

Heck, it even uses the infamous APU with its so-called apulets.

[0073] FIGS. 5-10 further illustrate the modular structure of the processors of the members of network 104. For example, as shown in FIG. 5, a processor may comprise a single PE 502. As discussed above, this PE typically comprises a PU, DMAC and eight APUs. Each APU includes local storage (LS). On the other hand, a processor may comprise the structure of visualizer (VS) 505. As shown in FIG. 5, VS 505 comprises PU 512, DMAC 514 and four APUs, namely, APU 516, APU 518, APU 520 and APU 522. The space within the chip package normally occupied by the other four APUs of a PE is occupied in this case by pixel engine 508, image cache 510 and cathode ray tube controller (CRTC) 504. Depending upon the speed of communications required for PE 502 or VS 505, optical interface 506 also may be included on the chip package.

[0074] Using this standardized, modular structure, numerous other variations of processors can be constructed easily and efficiently. For example, the processor shown in FIG. 6 comprises two chip packages, namely, chip package 602 comprising a BE and chip package 604 comprising four VSs. Input/output (I/O) 606 provides an interface between the BE of chip package 602 and network 104. Bus 608 provides communications between chip package 602 and chip package 604. Input output processor (IOP) 610 controls the flow of data into and out of I/O 606. I/O 606 may be fabricated as an application specific integrated circuit (ASIC). The output from the VSs is video signal 612.
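For anyone wondering where the "64 Integer Units and 64 FP Units" and "4 Pixel Engines" figures come from, here is a rough tally for the FIG. 6 package of four VSs. The APU and VS counts are from paragraphs [0073] and [0074]. The four FP units and four integer units per APU are an assumption that is not stated in the paragraphs quoted in this post, and the function name is made up for illustration.

```python
# Back-of-the-envelope unit count for the FIG. 6 chip package of four VSs.
APUS_PER_VS = 4           # paragraph [0073]: a VS has four APUs
VS_PER_PACKAGE = 4        # paragraph [0074]: chip package 604 has four VSs
PIXEL_ENGINES_PER_VS = 1  # paragraph [0073]: one pixel engine per VS
FP_UNITS_PER_APU = 4      # ASSUMPTION: not in the quoted paragraphs
INT_UNITS_PER_APU = 4     # ASSUMPTION: not in the quoted paragraphs


def count_units(vs_count=VS_PER_PACKAGE):
    """Return the total unit counts for a package holding vs_count VSs."""
    apus = vs_count * APUS_PER_VS
    return {
        "apus": apus,
        "fp_units": apus * FP_UNITS_PER_APU,
        "int_units": apus * INT_UNITS_PER_APU,
        "pixel_engines": vs_count * PIXEL_ENGINES_PER_VS,
    }
```

Under those assumptions the totals line up with the numbers being argued about in this thread, but the tally says nothing about what the APUs are actually used for.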


[0014] The basic processing module is a processor element (PE). A PE preferably comprises a processing unit (PU), a direct memory access controller (DMAC) and a plurality of attached processing units (APUs). In a preferred embodiment, a PE comprises eight APUs. The PU and the APUs interact with a shared dynamic random access memory (DRAM) preferably having a cross-bar architecture. The PU schedules and orchestrates the processing of data and applications by the APUs. The APUs perform this processing in a parallel and independent manner. The DMAC controls accesses by the PU and the APUs to the data and applications stored in the shared DRAM.

[0015] In accordance with this modular structure, the number of PEs employed by a member of the network is based upon the processing power required by that member. For example, a server may employ four PEs, a workstation may employ two PEs and a PDA may employ one PE. The number of APUs of a PE assigned to processing a particular software cell depends upon the complexity and magnitude of the programs and data within the cell.

[0016] In a preferred embodiment, a plurality of PEs are associated with a shared DRAM. The DRAM preferably is segregated into a plurality of sections, and each of these sections is segregated into a plurality of memory banks. In a particularly preferred embodiment, the DRAM comprises sixty-four memory banks, and each bank has one megabyte of storage capacity. Each section of the DRAM preferably is controlled by a bank controller, and each DMAC of a PE preferably accesses each bank controller. The DMAC of each PE in this embodiment, therefore, can access any portion of the shared DRAM.
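To put numbers on paragraph [0016]: sixty-four banks of one megabyte each come to 64 MB of shared DRAM. The sketch below works that out and shows one possible way an address could map to a section and bank. The linear mapping, the eight banks per section and the name `locate` are all assumptions for illustration, since the quoted text does not say how addresses are laid out.

```python
BANK_COUNT = 64              # paragraph [0016]
BANK_SIZE = 1 << 20          # one megabyte per bank, paragraph [0016]
TOTAL_BYTES = BANK_COUNT * BANK_SIZE  # 64 MB in total


def locate(address, banks_per_section=8):
    """Map a byte address to (section, bank, offset).

    ASSUMPTION: banks are laid out one after another and grouped into
    sections of banks_per_section; the patent excerpt does not specify this.
    """
    if not 0 <= address < TOTAL_BYTES:
        raise ValueError("address outside the shared DRAM")
    bank, offset = divmod(address, BANK_SIZE)
    section = bank // banks_per_section
    return section, bank, offset
```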

[0017] In another aspect, the present invention provides a synchronized system and method for an APU's reading of data from, and the writing of data to, the shared DRAM. This system avoids conflicts among the multiple APUs and multiple PEs sharing the DRAM. In accordance with this system and method, an area of the DRAM is designated for storing a plurality of full-empty bits. Each of these full-empty bits corresponds to a designated area of the DRAM. The synchronized system is integrated into the hardware of the DRAM and, therefore, avoids the computational overhead of a data synchronization scheme implemented in software.
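The full-empty bit idea in paragraph [0017] can be modelled in a few lines. This is a heavily simplified software sketch of a scheme the patent says is built into the DRAM hardware; the class and method names are hypothetical, and it covers a single memory location with a single bit.

```python
class FullEmptyCell:
    """One memory location guarded by a full-empty bit (software model only)."""

    def __init__(self):
        self.full = False   # the full-empty bit: False means "empty"
        self.value = None

    def try_write(self, value):
        """Store value only if the location is empty; report success."""
        if self.full:
            return False    # earlier data has not been read yet
        self.value = value
        self.full = True
        return True

    def try_read(self):
        """Return (True, value) and mark the location empty, or (False, None)."""
        if not self.full:
            return (False, None)  # nothing valid to read yet
        value = self.value
        self.value = None
        self.full = False
        return (True, value)
```

The point is that a reader can never pick up stale data and a writer can never overwrite unread data, without any software lock around the location.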

[0018] The present invention also implements sandboxes within the DRAM to provide security against the corruption of data for a program being processed by one APU from data for a program being processed by another APU. Each sandbox defines an area of the shared DRAM beyond which a particular APU, or set of APUs, cannot read or write data.
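A sandbox as described in paragraph [0018] boils down to a bounds check on every access. Here is a minimal sketch, assuming a sandbox is a single contiguous range given by a base address and a size; the quoted text does not say how the range is encoded, and the names below are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sandbox:
    base: int   # first byte the APU may touch (assumed encoding)
    size: int   # number of bytes in the sandbox (assumed encoding)

    def allows(self, address, length=1):
        """True if [address, address + length) lies entirely inside the sandbox."""
        return (
            length > 0
            and self.base <= address
            and address + length <= self.base + self.size
        )


def checked_access(sandbox, address, length=1):
    """Reject any read or write that falls outside the APU's sandbox."""
    if not sandbox.allows(address, length):
        raise PermissionError("access outside sandbox")
    return (address, length)
```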

[0019] In another aspect, the present invention provides a system and method for the PUs' issuance of commands to the APUs to initiate the APUs' processing of applications and data. These commands, called APU remote procedure calls (ARPCs), enable the PUs to orchestrate and coordinate the APUs' parallel processing of applications and data without the APUs performing the role of co-processors.

[0020] In another aspect, the present invention provides a system and method for establishing a dedicated pipeline structure for the processing of streaming data. In accordance with this system and method, a coordinated group of APUs, and a coordinated group of memory sandboxes associated with these APUs, are established by a PU for the processing of these data. The pipeline's dedicated APUs and memory sandboxes remain dedicated to the pipeline during periods that the processing of data does not occur. In other words, the dedicated APUs and their associated sandboxes are placed in a reserved state during these periods.

[0021] In another aspect, the present invention provides an absolute timer for the processing of tasks. This absolute timer is independent of the frequency of the clocks employed by the APUs for the processing of applications and data. Applications are written based upon the time period for tasks defined by the absolute timer. If the frequency of the clocks employed by the APUs increases because of, e.g., enhancements to the APUs, the time period for a given task as defined by the absolute timer remains the same. This scheme enables the implementation of enhanced processing times by newer versions of the APUs without disabling these newer APUs from processing older applications written for the slower processing times of older APUs.
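The absolute timer in paragraph [0021] is easiest to see with numbers. The sketch below assumes a task is given a fixed time budget and that an APU which finishes early simply waits out the rest of the budget; the function name and the example figures are illustrative, not taken from the patent.

```python
def idle_time(cycles_needed, clock_hz, budget_seconds):
    """Seconds an APU waits after finishing a task inside a fixed time budget.

    ASSUMPTION: results are only consumed when the budget expires, so a
    faster APU idles longer instead of delivering its results earlier.
    """
    busy = cycles_needed / clock_hz
    if busy > budget_seconds:
        raise ValueError("task does not fit in its time budget")
    return budget_seconds - busy


# Same task, same 2-second budget, on an older and a newer APU.
old_idle = idle_time(1000, 1000.0, 2.0)   # 1.0 s busy, 1.0 s idle
new_idle = idle_time(1000, 2000.0, 2.0)   # 0.5 s busy, 1.5 s idle
```

Either way the task completes at the end of the same budget, which is why an older application would not notice the faster clock.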

[0022] The present invention also provides an alternative scheme to permit newer APUs having faster processing speeds to process older applications written for the slower processing speeds of older APUs. In this alternative scheme, the particular instructions or microcode employed by the APUs in processing these older applications are analyzed during processing for problems in the coordination of the APUs' parallel processing created by the enhanced speeds. "No operation" ("NOOP") instructions are inserted into the instructions executed by some of these APUs to maintain the sequential completion of processing by the APUs expected by the program. By inserting these NOOPs into these instructions, the correct timing for the APUs' execution of all instructions are maintained.
 
PC-Engine said:
It's pretty obvious the BBE and Visualizer are based on CELL from that patent application. ;)
Good! Then quote me where it says primitives are rasterized/shaded with SPEs :)
All I can see is an SPE connected to something that is probably a GPU without a vertex engine. Maybe you see more than me, lol :)
 
Why would you put 64 Integer Units and 64 FP Units in addition to the 4 Pixel Engines and the Image Cache and cathode ray tube controller in a VISUALIZER if not to rasterize? Simply put it's based on software rasterization aka CELL based GPU. ;)
 
PC-Engine said:
Why would you put 64 Integer Units and 64 FP Units in addition to the 4 Pixel Engines and the Image Cache and cathode ray tube controller in a VISUALIZER if not to rasterize?
To shade vertices. SW rasterizers suck big time.
 
nAo said:
PC-Engine said:
Why would you put 64 Integer Units and 64 FP Units in addition to the 4 Pixel Engines and the Image Cache and cathode ray tube controller in a VISUALIZER if not to rasterize?
To shade vertices. SW rasterizers suck big time.

That's why SONY went with a G70 instead of a CELL based Visualizer.
 
PC-Engine said:
That's why SONY went with a G70 instead of a CELL based Visualizer.
Do you believe some of the best engineers in the world didn't know in advance CELL architecture is not good at everything?! Yeah..they're just plain dumb.
They went with G70 cause Toshiba didn't succeed with their GPU...
 
nAo said:
PC-Engine said:
That's why SONY went with a G70 instead of a CELL based Visualizer.
Do you believe some of the best engineers in the world didn't know in advance CELL architecture is not good at everything?! Yeah..they're just plain dumb.
They went with G70 cause Toshiba didn't succeed with their GPU...

And it looks like Toshiba's GPU was based on the Visualizer patent which is based on CELL's architecture using APU/SPE applets...
 
PC-Engine, I can't believe that you think that Sony went with a GPU from Nvidia because they saw how powerful the X360 was. That's just crazy.
 
mckmas8808 said:
PC-Engine, I can't believe that you think that Sony went with a GPU from Nvidia because they saw how powerful the X360 was. That's just crazy.

Why is that crazy? If Xbox 360's GPU wasn't powerful then SONY could've just stayed with the CELL based Visualizer instead of going with Nvidia. I mean look at PS2. SONY went with the GS instead of Nvidia/ATI/PowerVR because they didn't feel threatened by the DC's capabilities.
 
I'm a product manager. I would like to offer a slightly different perspective from my own experiences.

Powderkeg said:
They had to go to Nvidia to be competitive with the ATI equipped 360 in all areas, instead of just blowing them away with geometry alone.

Bottom line: I highly doubt Sony made major design decisions according to ATI's (projected) R500 specs.

"Designing specifically to crush a competitor's product" is a viable approach if that product/competitor is a market leader, and has already been released. There is little need for Sony to do that given the circumstances.

First and foremost, successful companies design products *for their market/intended use*. In this case, Sony is on a mission to expand gaming to the next tier, and also to converge different industries. Their performance targets are pegged according to their own vision and projected users' needs rather than based on ATI's specs. As others have pointed out, Sony can simplify PS 3 development by working with nVidia. So there are good reasons to go to nVidia instead of doing everything in-house.

Perhaps J. Allard quoted the "Replace Cell with RSX" line because MS wants to counter the Cell hype. And the tech sites are the people who hoped for a 4-Cell PS 3. These do not necessarily reflect the truth or the PS 3 Cell performance. Sony may have considered using Cell for hardcore GPU work (the same way they wanted to have router functionality in PS 3 earlier on). But that's just the normal brainstorming, what-ifs, cost-cutting options and long-term investment ideas that come up during _any_ product planning.

Now back to Sony's vision...

PC-Engine said:
Simply put, SONY was too ambitious with their original CELL CPU goal of 1TFLOPS + CELL GPU for a 2006 release so instead of 1TFLOPS they got 215GFLOPS for the CPU and ditched the CELL GPU which together wasn't good enough to compete with Xbox 360.

Bottom line: Ambitious, yes. But I think the comparison with 360 is MS-speak. Sony does not need to look to Microsoft to realize that a Cell CPU is insufficient to do the job. Ken Kutaragi is there to remind them.

Sony had a grand vision even before PS 3 was designed. The Cell-based *system* (not just the CPU) goal was set according to that vision. One of the articles mentioned Kutaragi wanted to create an early version of the Matrix virtual environment. To do so, Sony has to roll out a system that is beyond the capability of Xbox 360 or any consumer device.

Early in the development phase, IBM talked Ken out of the ridiculous target for the Cell CPU. Sony would then have had to look for alternatives. nVidia may well have come into the picture at that point.

It is not uncommon for leaders to paint a grand vision (See "7 Habits of Highly Effective People") and then drive the whole company/groups of companies to achieve it.

For those who are offended by Ken's lying/hyping of PS 3, here's my interpretation...

I used to work for a boss who's like Ken (in another country). He would go to the press to talk about his high vision, partly to capture the users' imagination, and mainly to align multiple parties inside and outside our organization. Many times, instead of consensus-based meetings that deadlock forever, we achieved much better mileage by announcing our goals publicly. It sends the message: "There is no turning back", "We are s.e.r.i.o.u.s.", "I'm putting my reputation on the line", ... to your colleagues, subordinates and other parties. We managed to hijack much of the group's talents and resources to do what we wanted to achieve.

We also forced ourselves to excel (you aim high, you shoot higher even if you miss). I woke up in a cold sweat a couple of times when the deadlines were near. Many quit, 2 broke down, but more talented people joined too.

So behind the hype, there are real people sweating (swearing) to make that little game box of ours. I get a little pissed off when posters belittle Sony's or for that matter, MS's work. If you don't like the product, just don't buy it. :)

Sorry for the rant.
 
I can imagine the Sony engineers dancing in the streets because they exceeded KK's goal of 200 GFlops. I can't imagine where science and technology would be today if everybody aimed low. It takes people with great vision and determination to push the boundaries .. thank goodness we do not have to rely on the vision of the meek and conservative.
 