Part of the success of the platform depends on successful backward compatibility: the PS2 had quite excellent backward compatibility and many consumers ( I did ) appreciated it, and they will expect Sony to deliver again with the PS3... as Sony has mentioned, btw ( backward compatibility is one of their concerns )...
Let's take the AGP example... if we have an AGP 4x motherboard, we can, from the BIOS, change the AGP mode to fall back to AGP 1x or AGP 2x...
I was thinking about the PS3 BIOS being able to change the Yellowstone base clock speed to 100 MHz: this would give us effective 800 MHz signaling, which is what we have on the current PS2 ( 400 MHz x 2 [DDR] )...
300 MHz for the EE embedded in the I/O ASIC and 800 MHz Yellowstone when operating in PS2 backward compatibility mode; otherwise the clock sent would be 400 MHz to achieve 3.2 GHz signaling ( the GFX is able to downclock itself when it detects the fan is not at full operation, we are not doing that here [maybe they are thinking about that too... if the fan stops working and the temperature rises, the PS3 could enter a low-power state telling you the system is overheating, so you could get it repaired instead of buying a new one] )...
If we had a 3.2 GHz signaling rate for Yellowstone, the memory would be much faster than what the EE expects when running PS2 games... we want to keep the effective latency close to what the EE in the PS2 used to expect...
And with two states, PS2 mode or PS3 mode ( PSX could be emulated entirely in software... yeah, the BE and Visualizer are powerful enough to do so, and Sony has bought Connectix, so they might as well use their technology )...
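Just to make the switch concrete, here is a minimal C sketch of what that BIOS-level compatibility setting could boil down to, assuming Yellowstone signals at 8x its base clock as the numbers above imply; the type and function names are invented for illustration, and the EE clock in PS3 mode is not something I know, so it is only a placeholder.

[code]
/* Hypothetical sketch of the compatibility/clock switch discussed above.
   Clock numbers come from the post (Yellowstone signaling = base clock x 8);
   all names are invented for illustration. */
typedef enum { MODE_PS3, MODE_PS2, MODE_PSX } CompatMode;

typedef struct {
    unsigned yellowstone_base_mhz;   /* signaling rate = base clock x 8 */
    unsigned ee_clock_mhz;           /* EE embedded in the I/O ASIC */
    int      psx_in_software;        /* PSX emulated entirely in software on BE/Visualizer */
} ClockConfig;

static ClockConfig select_clocks(CompatMode mode)
{
    switch (mode) {
    case MODE_PS2:
        /* 100 MHz x 8 = 800 MHz signaling, matching the PS2's 400 MHz x 2 (DDR) */
        return (ClockConfig){ .yellowstone_base_mhz = 100, .ee_clock_mhz = 300,
                              .psx_in_software = 0 };
    case MODE_PSX:   /* PSX titles run as software on top of full-speed PS3 mode */
    case MODE_PS3:
    default:
        /* 400 MHz x 8 = 3.2 GHz signaling; the EE clock in PS3 mode is not
           specified in the post, 300 MHz is only a placeholder here */
        return (ClockConfig){ .yellowstone_base_mhz = 400, .ee_clock_mhz = 300,
                              .psx_in_software = (mode == MODE_PSX) };
    }
}
[/code]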
The graphics enhancement would be on the GPU side... FSAA, anisotropic filtering, etc...
I think this would not cause as many problems as adding texture filtering to early PSX games does ( try it on Wipeout, alpha-blended textures get a bit messed up )... Sony did release some libraries for the PSX that were intended to make games work correctly even with texture smoothing and fast CD loading enabled: they could be doing the same for PS2 titles...
The rasterization side can be done on the Visualizer ( like PSX rasterization was done on the GS )...
How about dealing with all the transfers from I/O devices ( we still have that 10-channel DMAC ) and implementing software sandboxes ?
How would it work ? It is an interesting idea, though.
It would work like protected memory: the EE would run a secure program ( authored by the Cell OS [it would be a kernel-space command only, user programs would not have access to this... a procedure similar to the command issued by a Cell OS program that can change the mask ID of an APU, allowing it to access multiple sandboxes] ) that would regulate access to the data present in External Memory...
We could divide the external RAM into as many sandboxes as we have set up on the e-DRAM memory bank controllers of the BE and Visualizer, and regulate access to these bigger sandboxes in much the same way as it is done in HW by the memory bank controllers...
Let me quote the patent for a moment...
[0113] The PU of a PE controls the sandboxes assigned to the APUs. Since the PU normally operates only trusted programs, such as an operating system, this scheme does not jeopardize security. In accordance with this scheme, the PU builds and maintains a key control table. This key control table is illustrated in FIG. 19. As shown in this figure, each entry in key control table 1902 contains an identification (ID) 1904 for an APU, an APU key 1906 for that APU and a key mask 1908.
[...]
When an APU requests the writing of data to, or the reading of data from, a particular storage location of the DRAM, the DMAC evaluates the APU key 1906 assigned to that APU in key control table 1902 against a memory access key associated with that storage location.
[0114] As shown in FIG. 20, a dedicated memory segment 2010 is assigned to each addressable storage location 2006 of a DRAM 2002. A memory access key 2012 for the storage location is stored in this dedicated memory segment. As discussed above, a further additional dedicated memory segment 2008, also associated with each addressable storage location 2006, stores synchronization information for writing data to, and reading data from, the storage-location.
[0115] In operation, an APU issues a DMA command to the DMAC. This command includes the address of a storage location 2006 of DRAM 2002. Before executing this command, the DMAC looks up the requesting APU's key 1906 in key control table 1902 using the APU's ID 1904. The DMAC then compares the APU key 1906 of the requesting APU to the memory access key 2012 stored in the dedicated memory segment 2010 associated with the storage location of the DRAM to which the APU seeks access. If the two keys do not match, the DMA command is not executed. On the other hand, if the two keys match, the DMA command proceeds and the requested memory access is executed.
This is how we make sure that an APU will not read into other memory sandboxes ( unless we play with the mask, but only a trusted program like the OS can initiate such a change in the mask )...
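To make the quoted paragraphs a bit more concrete, here is a minimal C model of the key control table and the DMAC key check from [0113]-[0115]; the field widths, the stand-in lookup for the memory access keys and the exact semantics of the key mask ( treated here as "don't care" bits ) are my assumptions, the patent text above only says that the two keys are compared and must match.

[code]
#include <stdint.h>
#include <stdbool.h>

/* One entry of the key control table of FIG. 19 ([0113]):
   an APU ID, an APU key and a key mask. Field widths are assumptions. */
typedef struct {
    uint32_t apu_id;     /* identification (ID) 1904 */
    uint32_t apu_key;    /* APU key 1906 */
    uint32_t key_mask;   /* key mask 1908; assumed: set bits are "don't care" */
} KeyControlEntry;

#define NUM_APUS 8
static KeyControlEntry key_control_table[NUM_APUS];  /* built and maintained by the PU / trusted OS */

/* Stand-in for the memory access key 2012 stored in the dedicated segment 2010
   of each addressable storage location 2006 ([0114]). */
#define SANDBOX_SIZE   0x100000ULL
#define NUM_SANDBOXES  16
static uint32_t memory_access_key_of[NUM_SANDBOXES];

static uint32_t memory_access_key(uint64_t dram_address)
{
    return memory_access_key_of[(dram_address / SANDBOX_SIZE) % NUM_SANDBOXES];
}

/* [0115]: before executing a DMA command, the DMAC looks up the requesting
   APU's key by its ID and compares it (under the mask) with the memory access
   key of the target location; if they do not match, the command is refused. */
bool dmac_allows_access(uint32_t apu_id, uint64_t dram_address)
{
    const KeyControlEntry *e = &key_control_table[apu_id % NUM_APUS];
    uint32_t care = ~e->key_mask;   /* bits that must actually match */
    return (e->apu_key & care) == (memory_access_key(dram_address) & care);
}
[/code]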
With External memory we can store, in a certain small area of memory, the information regarding the rest of the memory, which gets divided into sandboxes ( which all have a fixed size, larger than each sandbox in the e-DRAM ): we would keep in this section of memory ( which only the OS could access ) the key control tables, the keys and other information ( start and end of each sandbox in External memory )... we would mimic with the EE what the memory bank controllers do in HW on the BE and Visualizer... slower, but we are dealing with the 4th memory hierarchy level ( 1st == registers, 2nd == Local Storage and 3rd == e-DRAM )... and the speed will be MUCH faster than an HDD or Blu-Ray...
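A rough sketch of what that OS-only table and the EE-side check could look like, again in C; the descriptor layout, the sandbox count and the function names are all invented for illustration, the point is just that the EE mimics in software the key check the memory bank controllers do in HW.

[code]
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical software mirror, kept in an OS-only region of external memory,
   of what the e-DRAM memory bank controllers do in hardware: one descriptor
   per external-memory sandbox with its bounds and its access key. */
typedef struct {
    uint64_t start;   /* first byte of the sandbox in external RAM */
    uint64_t end;     /* one past the last byte */
    uint32_t key;     /* access key, mirroring the e-DRAM memory access keys */
} ExtSandbox;

#define NUM_EXT_SANDBOXES 8   /* assumed: same count as the e-DRAM sandboxes */
static ExtSandbox ext_sandbox[NUM_EXT_SANDBOXES];  /* only the Cell OS may touch this table */

/* The secure, kernel-space program running on the EE: validate an I/O DMA
   request against the sandbox table before the 10-channel DMAC is programmed. */
bool ee_validate_dma(uint32_t requester_key, uint64_t addr, uint64_t len)
{
    for (int i = 0; i < NUM_EXT_SANDBOXES; i++) {
        const ExtSandbox *s = &ext_sandbox[i];
        if (addr >= s->start && addr + len <= s->end)
            return requester_key == s->key;   /* contained in this sandbox: keys must match */
    }
    return false;   /* not fully inside any sandbox: refuse the transfer */
}
[/code]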
But it's possible. If the BE is the size of the EE or GS on the 0.25 um process, I think they can afford to spend 0.5 cm^2 on e-DRAM.
It's better to put the money there instead of into external memory. In the long run this will come down in cost. External memory is less predictable.
The BE and Visualizer contain a lot of transistors used for logic ( we do not have exact numbers, but it is not a small number [also add the space used by the Local Storage, which is SRAM, and all the APUs' registers] ), not only DRAM... and I do not want the logic portion of the chip to be crippled because of a bit more e-DRAM...
Still, we do not know the size of the BE... it might even be a bit bigger than either the GS or the EE at 0.25 um, and they might have wafers which would make it impossible to include the logic and 256 MB of e-DRAM, as you would have too few chips per wafer ( you have to think about that... you do not want shortages, you want smooth mass production )...
Who knows, maybe even with the new wafers they will use in 2005, having a chip the same size as or bigger than the GS at 0.25 um might not be optimal...
We have to think of ways to keep performance high, but also yields high... tough, but that is at least what should be aimed for...
Remember, at a later stage we could even embed the Yellowstone RDRAM into the I/O ASIC as they did for the PSX GPU ( it basically went from off-chip VRAM to embedded VRAM ) as manufacturing processes allow it... maybe when they move to 45 nm... the day they reach 30 nm ( not near ) they could maybe try to put the BE and Visualizer onto a single chip... like they are doing with the 130 nm EEs and GSs
( probably 32 MB... the Visualizer should be able to hold its own with good 3D and 2D texture compression )
Do you think it will have 3D and 2D TC hardware support ? I think Sony might skip this again.
I think the Pixel Engine in the Visualizer PEs should support TC ( both 2D and 3D/volumetric )...
If they do not, well at 1-2 GHz I see 4 APUs per PE and 4 PEs...
4 APUs/PE * 4 PEs * 8 FP ops/APU ( 4 parallel FP MADD ) * 1-2 GHz = 128-256 GFLOPS
Plus I expect Dependent Texture reads to be supported as well as loopback... The Visualizer should be quite programmable ( the APU could also manage DOT3 products )...
Even if the Pixel Processor were relatively simple ( supporting, though, advanced texture filtering [well, tri-linear and anisotropic filtering], dependent texture reads and single-pass multi-texturing [we might move away from polygons and textures towards procedural textures, but we cannot FORCE it from the start, it has to be given as an option you can follow thanks to the power and flexibility of the system] ), we still have all the APUs and PUs to work with Fragment Shaders and Triangle Set-up ( so we can make it flexible and stop calling it triangle set-up but something-else-set-up )...
The BE should worry about Physics, T&L and Vertex programs ( and could help the Visualizer with pixel programs too if needed, after all we use the same kind of APUs to do the calculations... ) while the Visualizer worries about Triangle set-up, Fragment programs, texturing ( that should be handled by the Pixel Engine and the Image Cache ) and particle effects...
The 3D pipeline on such an architecture has a kind of flexibility that is mind-blowing ( this is a double-edged sword, you HAVE to give developers HLSL and nice high-level libraries [at least at the beginning force them to learn the PS3 HW using them, like they did with the PSX, leaving them free to explore as time progresses] else they will really go nuts ): the BE and Visualizer can help each other with computing loads, as the Cell architecture was designed around software Cells being able to migrate from PE to PE to be executed... We could execute Fragment and Vertex Programs on either the BE or the Visualizer, and we could have a PE in the Visualizer run some physics calculations if we wish to ( maybe to vary at the pixel level how the light affects the object that was hit )...
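To picture that load sharing, here is a toy sketch in C of the idea that a software cell can execute wherever there is spare capacity: a shared queue of cells that any idle APU, on the BE or on the Visualizer, pulls from. Everything in it ( names, fields, the queue itself ) is invented for illustration, and real code would of course need proper APU-visible synchronization.

[code]
#include <stddef.h>

/* Toy sketch of "a software cell can run on either chip": a shared queue of
   cells that any idle APU, whether it sits on the BE or on the Visualizer,
   pulls work from. All names and fields are invented for illustration. */
typedef enum { CELL_VERTEX_PROGRAM, CELL_FRAGMENT_PROGRAM, CELL_PHYSICS } CellKind;

typedef struct {
    CellKind kind;
    void    *program;   /* the APU program the cell carries */
    void    *data;      /* the data it operates on */
} SoftwareCell;

#define QUEUE_LEN 256
static SoftwareCell queue[QUEUE_LEN];
static size_t head, tail;   /* real code needs APU-visible synchronization here */

static int pop_cell(SoftwareCell *out)
{
    if (head == tail)
        return 0;                    /* nothing pending */
    *out = queue[head];
    head = (head + 1) % QUEUE_LEN;
    return 1;
}

/* Called on an idle APU on either chip: because the BE and Visualizer use the
   same kind of APU, a vertex, fragment or physics cell can execute wherever
   there is spare capacity. */
void apu_idle_loop(void (*run_on_this_apu)(const SoftwareCell *))
{
    SoftwareCell cell;
    while (pop_cell(&cell))
        run_on_this_apu(&cell);
}
[/code]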
Thanks for the Rambus press release. Is Yellowstone tech capable of being embedded as well ?
I do not know yet, but I hope so, as it could be a good way to bring PS3 manufacturing costs down...