Redwood & Yellowstone new (?) specs

The Cell CPU would still benefit from very fast interconnects ( like Redwood ) with the other chips in the system and from fast external RAM... it is true that we have a good amount of e-DRAM on the CPU, but we do not want to be bottlenecked too much by all the other system peripherals... and those peripherals need to receive the results of Cell's calculations fast...

Just because Cell has e-DRAM on the CPU doesn't mean that the rest of the system is not going to need it... and we also need a fast bus to initially fill that chunk of e-DRAM and keep it full...

Cell's internal bandwidth and processing capabilities are well over 50 GB/s, so we do not have to force the external bus system to follow Cell's high internal speed ( 1,024 bits bus )...

Latency considerations for the external RAM should also be relaxed ( as is the bandwidth concern... )... we are not feeding a 1-2 MB cache on the CPU, we are feeding 64 MB of e-DRAM ( whose use can be optimized at compile time, it is part of the ISA :) ) and most importantly the LS's of the APUs from which execution takes place...

Higher external latency allows us to ramp up the clock speed of the memory bus, and since the bus from memory to the Cell CPU would not be THAT long and PS3 is a custom design ( motherboard included ) there is space for a moderately wide bus :)

Yellowstone can transmit 1 byte per clock on each data pin ( ODR ): the real clock speed is 400 MHz...

You route only 400 MHz on the PCB, the PLL multiplication occurs on the memory chip... the chips communicate with each other with the 400 MHz signal...

Yellowstone can be used to connect a processor ( in the next diagram it is used to connect the DRAM to the GPU... ) to external memory...

redwood_block_diagrams.gif


As you can also read on RAMBUS' website, Redwood is a parallel bus technology that connects different chips together ( like CPU with Northbridge, Northbridge with Southbridge, etc... ) while, again, Yellowstone is a memory interface and connects the memory with the CPU/Graphics processor...

Cell could pack in a Yellowstone memory controller to talk with external RAM and could be connected with Redwood to other chips...

We could also have a Northbridge that connects ( in a PS3 design, tell me what you think ) to external RAM using Yellowstone and then has two busses: one to the Broadband Engine and one to the Visualizer.

I have a better idea ( hopefully ):

We could connect the Broadband Engine and the Visualizer together with a Redwood bus ( Cell patent: bus 608 ) and have the I/O ASIC pack the Yellowstone memory interface ( makes sense... both Visualizer and Broadband Engine have e-DRAM to work on locally ) which would connect the I/O ASIC to the external memory...

This sort of makes sense ( also Sony licensed both Yellowstone and Redwood ) :)


The 400 MHz clock gets multiplied by a PLL reaching a total of 1.6 GHz and we operate at DDR on that clock ( both edges of the clock see a transfer )...

we basically have 3.2 Gbits/(s*pin)... or 400 MB/(s*pin)
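To double check the per-pin numbers, here is a quick sketch of the arithmetic ( the function name and its parameters are just mine for illustration, and I am assuming a 4x PLL multiplier and two transfers per multiplied clock as described above ):

```python
def per_pin_rate(base_clock_mhz, pll_multiplier=4, transfers_per_cycle=2):
    """Return the per-pin data rate as (Gbit/s, MB/s).

    Assumes the base clock is multiplied on-chip by the PLL and that
    data is transferred on both edges of the multiplied clock.
    """
    mbit_per_s = base_clock_mhz * pll_multiplier * transfers_per_cycle
    return mbit_per_s / 1000, mbit_per_s / 8


gbit_per_s, mbyte_per_s = per_pin_rate(400)
print(gbit_per_s, mbyte_per_s)  # 3.2 400.0
```

With a 400 MHz base clock that is 8 transfers per base clock on every pin, which is where the "Octal" in ODR comes from.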

50 GB/s = ~50,000 MB/s

hence, we need 50,000 / 400 = 125 data pins, which in practice means a 128 bits data bus... if we could raise the base frequency to 800 MHz ( could be more expensive than just using more traces for the data bus ) then we would reach 800 MB/(s*pin) thus we would need only a 64 bits bus...

If 25 GB/s is good enough for us we can deal with 400 MHz base clock and 64 bits data bus...
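The bus width options can be sketched like this ( a rough calculation, not anything from RAMBUS: the function name is made up, I take 1 GB = 1,000 MB as above, and rounding the pin count up to a power of two is my own assumption about what a practical bus width would be ):

```python
import math


def bus_width_bits(target_gb_s, base_clock_mhz, pll_multiplier=4, transfers_per_cycle=2):
    """Return (minimum pins, rounded bus width, peak GB/s at that width)."""
    # Per-pin rate in MB/s: base clock x PLL multiplier x DDR, 8 bits per byte.
    mb_per_pin = base_clock_mhz * pll_multiplier * transfers_per_cycle / 8
    min_pins = math.ceil(target_gb_s * 1000 / mb_per_pin)
    # Assumption: round up to the next power of two for the actual bus width.
    width = 1
    while width < min_pins:
        width *= 2
    return min_pins, width, width * mb_per_pin / 1000


print(bus_width_bits(50, 400))  # (125, 128, 51.2)
print(bus_width_bits(50, 800))  # (63, 64, 51.2)
print(bus_width_bits(25, 400))  # (63, 64, 25.6)
```

So both the 128 bits bus at 400 MHz and the 64 bits bus at 800 MHz land at about 51.2 GB/s peak, and the 64 bits bus at 400 MHz at about 25.6 GB/s.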

We will not have many memory modules and I do not expect the distance from memory module to memory controller to be really long, it is not a PC motherboard: a 128 bits bus running at 400 MHz base clock ( x4 thanks to the PLL and then add in DDR transfers ) should be doable...

We can use the 400 MHz base clock and transfer 4 phase shifted clock signals that on the chip itself thanks to the PLL would be sort of "packed/fused" in a single 4x faster clock...

We encode more data in a 400 MHz signal and the way we retrieve it is to use the PLL to multiply the clock by 4x and sample on both edges of that clock, achieving an effective frequency of 3.2 GHz ( I hope I am not too tired and that I am not talking gibberish ;) )... I think we can encode in the 400 MHz signal the data we need...



y_odr.jpg


Yellowstone operates at Octal Data Rates (ODR), transferring 8 bits of data per clock. ODR enables 3.2GHz data rates with a 400MHz clock and provides a scalable path to over 6.4GHz as bandwidth needs increase.

The lower speed 400MHz system clock is routed on the PCB between chips. On-chip, the 400MHz clock is multiplied -- up to 1.6GHz with a PLL. This effective 1.6GHz clock is subsequently used to transmit and receive data on both clock edges, resulting in 3.2GHz data rates. The 1:8 relationship between clock and data rates results in Octal Data Rate (ODR) operation.


It says it clearly: the PLL "clock multiplication" happens only on the chip, while the clock signal routed chip to chip is 400 MHz...

redwood_technology.gif
 
another broadband engine ??? :D hehe, that would sky-rocket the prices of the console... 128 MB of e-DRAM between the two... ;)

I'd rather follow the patent and think about external RAM :)
 
I'd rather have Blu-Ray and a system that over-all works efficiently than a second Broadband Engine...

How would you keep BOTH processors even FED ?

From the disc ?

It would not be too hot for the RAM.... 64 + 64 = 128 MB of system RAM...

Even assuming 32-64 MB for the Visualizer alone we're at 160/192 MB of RAM... system + VRAM + Sound RAM + etc...
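Just to spell out the totals ( a trivial sketch: the names are mine, and the 64 MB of e-DRAM per Broadband Engine and 32-64 MB for the Visualizer are the guesses from this thread, not confirmed specs ):

```python
BE_EDRAM_MB = 64  # assumed e-DRAM per Broadband Engine, as discussed in this thread


def total_ram_mb(num_broadband_engines, visualizer_mb):
    """Total on-chip RAM when e-DRAM is the only memory in the system."""
    return num_broadband_engines * BE_EDRAM_MB + visualizer_mb


print([total_ram_mb(2, v) for v in (32, 64)])  # [160, 192]
```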

I do not think that would be the best scenario...
 
sorry but...


(excerpt from Austin Powers 3... ~):

*Panajev slowly rolls the chair next to V3, after that last comment, and coming close to him says

"How about NO, V3 ?"

:LOL:

Before feeding each other ( :eeew ) they need to take data from mass storage ( the disc )... and you would need a very expensive bus and lots of Blu-Ray cache to achieve what you are thinking...

If we had more e-DRAM ( like a total of 256-512 MB ) I would be more inclined...

Long before this I was already thinking about using Cell processors AS THE MAIN RAM...

When you have enough e-DRAM to avoid needing main RAM maybe we could try your idea... still I do not like not having any way of buffering-caching data between the disc and other I/O devices and the BEs ( except I/O RAM )...

I think that considering the fact that the initial costs on the HW would be way too high, they would go with a still very fast solution ( Yellowstone and External RAM ) rather than increasing e-DRAM...
 
Vince,

You nailed it :) ( Hybrid UMA was the approach PS2 also followed and if realized well it can pay off :) )



V3,

I have a bag full of NO's with your name on it :LOL:

( I am just kidding V3 ;) )
 