as if they really have a clue apart from what they read in that EE Times article on the costs of semiconductors or whatever, but I digress...
Sorry, Mr. Senior MPU Engineer with DEC experience... but if you are telling me that 512 pins would not make the motherboard more expensive than 256 pins, I do not know what to tell you...
You have logical points, which I might or might not agree with; there is no need to try to flame people like that...
But, back to the topic at hand...
True, which WASN'T as much of a problem in the past, because if console X had only half as much RAM as the average PC at the time of launch, it was a difference of maybe a quarter gig at most. If PS3 launches with 256MB in 2005 the difference will be THREE QUARTERS of a gig or maybe more, and when devs already whine and bitch about too little RAM in today's consoles you wouldn't expect Sony to go and make the same mistake again.
I expect no less than half a gig of main RAM, just like I said in my post. If PS3 has 2x64MB of eDRAM, 256MB is just 2x its on-chip memory pools. 512MB is a much more comfortable 4:1 ratio.
Well, who tells you that they will both have 64 MB of e-DRAM ? I hope they can pack that much, but seeing the optimistic numbers they put in their CMOS5 ( 65 nm ) related PRs, I think 32 MB might be more probable ( the CPU would also have 4 MB of SRAM, counting all the Local Storages, though ).
32 + 32 = 64 MB... which against 256 MB is, again, your nicer 4:1 ratio...
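Quick sanity check on those ratios ( just back-of-the-envelope Python over the hypothetical e-DRAM figures we have been throwing around, nothing official ):

[code]
# Hypothetical on-chip pools from the posts above ( all sizes in MB )
for edram_total in (128, 64):            # 2 x 64 MB vs 2 x 32 MB of e-DRAM
    for main_ram in (256, 512):
        print(f"{main_ram} MB main RAM / {edram_total} MB e-DRAM = {main_ram // edram_total}:1")
[/code]

Either 512 MB against 2x64 MB or 256 MB against 2x32 MB lands on that same 4:1...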
There is one concept you have to catch, and I know you are more than smart enough to have already thought about it... who says that the increase in bandwidth needed by the growing polygon and texture data cannot be contained ? Who says that the e-DRAM and the local SRAM must necessarily store merely a sub-set of the external RAM ?
You are treating it too... uhm... what do we ignorant people call it... "inclusively" perhaps ?
I see, with the processing power and on-chip bandwidth Cell has, that we might see for the first time micro-polygon ( REYES-like ) based renderers gracing real-time 3D on game consoles... I expect more emphasis on Shader Programs and procedural texture creation on the fly rather than just imagining the current paradigm multiplied by 10x... and then you nag me about lacking vision; with all due respect, "hello pot, meet kettle".
The increased processing power might let us finally do things without TONS of pre-calculated tricks... we could at least reduce their usage in real-time 3D applications.
How would data be processed in this kind of architecture, looking at the big picture ?
We would have streams of data dynamically refilling the CPU and GPU ( as long as, each frame, we do not consume more non-reusable data from e-DRAM than main RAM can provide, we should be fine )... let's think about vertex data...
What would happen to our polygon data ( let's think about using subdivision surfaces... and yes, we could do deferred processing [depth-sort the surfaces and subdivide only the visible ones] to reduce the amount of "slicing and dicing" needed in the next stage ) ?
Well, our stream would be loaded by the CPU, and the result of processing would be a stream probably decidedly bigger than the incoming one ( in the end that processed stream would be sent to the GPU through a separate chip-to-chip connection, probably Redwood based )... but the transformed stream of micro-polygons cannot be sent to the GPU just yet; there is still quite a processor-intensive part left... Shading.
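To make that flow concrete, here is a toy sketch in Python of the deferred micro-polygon stream I am describing ( every name, threshold and data layout here is mine for illustration; it is not anything out of the patent ):

[code]
# Toy sketch of the deferred micro-polygon stream described above.
# All names/limits are made up for illustration; this is not Cell code.

def depth_sort(surfaces):
    """Front-to-back sort so hidden surfaces can be rejected early."""
    return sorted(surfaces, key=lambda s: s["depth"])

def visible(surface, z_far=100.0):
    """Stand-in visibility test; a real renderer would occlusion-cull here."""
    return surface["depth"] < z_far

def subdivide(surface, edge_limit=1.0):
    """Slice a higher-order surface into ~pixel-sized micro-polygons."""
    count = max(1, int(surface["area"] / edge_limit))
    return [{"id": surface["id"], "micro": i} for i in range(count)]

def shade(micro_poly):
    """Procedural shading on-chip instead of fetching pre-made texture layers."""
    micro_poly["color"] = (micro_poly["micro"] * 37) % 256  # dummy procedural pattern
    return micro_poly

def frame(surfaces):
    # CPU side: load the stream, defer, subdivide only what survives, shade
    stream_out = []
    for s in depth_sort(surfaces):
        if visible(s):
            stream_out.extend(shade(mp) for mp in subdivide(s))
    return stream_out  # this ( much bigger ) stream goes over the CPU->GPU link

scene = [{"id": 0, "depth": 5.0, "area": 4.0}, {"id": 1, "depth": 500.0, "area": 9.0}]
print(len(frame(scene)), "shaded micro-polygons sent to the GPU")
[/code]

Note how the compact surface stream only blows up into micro-polygons on-chip, after the deferred visibility pass... which is exactly why the local e-DRAM matters more than the external pipe.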
With this kind of rendering approach ( micro-polygons + emphasis on Shading programs rather than pre-made textures ) we would actually mitigate the increased demand on external rendering bandwidth, provided that we have enough e-DRAM and SRAM on the chips to allow a good amount of space for processing data locally...
Think about what people said about normal maps and lighting... why do we need those pre-calculated maps if we can do those calculations in real-time ?
Why do we need tons of texture layers if Shaders can be used to procedurally create several of them ?
Why do we waste space with ultra-detailed polygonal meshes in main RAM if we can send "space"-optimized ones ( NURBS, subdivision surfaces ) and "expand" them on chip ?
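As a trivial illustration of the procedural idea ( my own toy example; a real Shader Program would obviously be far more involved ):

[code]
import math

# Toy procedural "wood grain" texture evaluated on the fly instead of
# being stored as a pre-made bitmap in main RAM ( illustration only ).
def wood(u, v, rings=12):
    d = math.sqrt(u * u + v * v)          # distance from texture centre
    return int(255 * abs(math.sin(d * rings * math.pi)))

# Evaluate only the texels actually needed this frame: zero bytes of
# texture storage in main RAM, just a handful of instructions per sample.
size = 8
for y in range(size):
    print(" ".join(f"{wood(x / size - 0.5, y / size - 0.5):3d}" for x in range(size)))
[/code]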
With 32 MB of e-DRAM on CPU and 32 MB of e-DRAM on GPU ( plus GPRs and LS ) and 1 TFLOPS of processing power you should be able to move to such a rendering approach...
BTW, you point out that the CPU, for example, will hold data it buffers from memory thanks to the e-DRAM ( which the CPU can freely address ), but then you fail to connect that to the fact that it would also mean reduced main RAM contention with the GPU...
PlayStation 3 will probably once again use a ~hybrid UMA approach like PlayStation 2 did, but if you notice, the supposed local memory on each processor has grown a lot ( and the main RAM, at 25.6 GB/s, would still be 8x faster than PlayStation 2's )... the point of Cell and the use of e-DRAM was to minimize the bottleneck caused by external memory speed and the cost involved in pushing it to acceptable speeds...
Also, regarding the complaints of current developers... does Texture Compression say anything to you ?
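Just to put a number on that ( assuming an S3TC-style 4 bits-per-texel format against raw 32-bit RGBA; pick your own favourite scheme, the point stands ):

[code]
# Rough texture-compression saving: S3TC-style 4 bpp vs raw 32 bpp RGBA.
texels = 1024 * 1024                      # one 1024x1024 texture
raw_mb = texels * 32 / 8 / 2**20          # 4.0 MB uncompressed
s3tc_mb = texels * 4 / 8 / 2**20          # 0.5 MB at 4 bits per texel
print(f"{raw_mb:.1f} MB -> {s3tc_mb:.1f} MB ( {raw_mb / s3tc_mb:.0f}:1 )")
[/code]

An 8:1 saving on textures makes those 256 MB stretch a lot further...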
The CPU ( 1 TFLOPS class beast ) would have 1,024 GPRs ( 16 KB of space ), 4 MB of SRAM based Local Storage and 32 MB of e-DRAM...
The GPU ( if based on Cell like the patent theorized ) would have 32-64 MB of e-DRAM ( IMHO they could get away with 32 MB ) and 512 GPRs ( 8 KB of space ) and 2 MB of SRAM based Local Storage...
The total would be... 70+ MB; if we think about 64 MB on the GPU this climbs to 102+ MB, and if we think the CPU too has 64 MB of e-DRAM this total grows to 134+ MB...
134.024 MB transferring data internally at speeds like 100+ GB/s ( for 1,024-bit e-DRAM that would require a clock-speed of at least 780 MHz, which is NOT low for e-DRAM )... I do not think that would be a small amount...
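The bus math behind that figure, for the curious ( straight arithmetic; the 780 MHz is just the clock implied above, not anything announced ):

[code]
# e-DRAM bandwidth = bus width ( in bytes ) x clock
bus_bits = 1024
clock_hz = 780e6
gb_per_s = (bus_bits / 8) * clock_hz / 1e9
print(f"{gb_per_s:.1f} GB/s")             # ~99.8 GB/s, i.e. the ~100 GB/s quoted
[/code]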
134 MB + 256 MB = 390 MB
I would not call that amount bad at all...
Let's think that they can push e-DRAM to 100 GB/s... 25.6 GB/s would be roughly 1/4th of that...
Even thinking about VRAM bandwidth for the GPU... 100 GB/s would still be more than 2x the bandwidth the current GS has and the amount of memory would be 16x over the current GS.
The current GS gets its data ( its only connection to the rest of the machine ) through a 64-bit pipe yielding 1.2 GB/s.
That is 40x less than the 48 GB/s of total bandwidth available inside the GS.
1/4th or 1/40th... I think I prefer 1/4th
But, let's be fair... 25.6 GB/s is split between two chips, just as the 3.2 GB/s on PlayStation 2 was ( and, to make things worse, the EE did not have e-DRAM or as much Local Storage to make it less dependent on that bandwidth )...
100 GB/s / 12.8 GB/s ≈ 8
So we have 1/8th now, but 1/8th is still better than 1/40th
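Putting the whole comparison in one place ( the GS figures are the real PlayStation 2 ones; the PS3 side is the speculation above ):

[code]
# Internal vs external bandwidth ratios: real PS2/GS vs the speculated PS3 setup.
gs_internal, gs_in = 48.0, 1.2            # GB/s: GS e-DRAM vs its 64-bit GIF pipe
ps3_internal, ps3_main = 100.0, 25.6      # GB/s: speculated e-DRAM vs main RAM
per_chip = ps3_main / 2                   # main RAM shared between CPU and GPU

print(f"GS          : 1/{gs_internal / gs_in:.0f}")        # 1/40
print(f"PS3         : 1/{ps3_internal / ps3_main:.1f}")    # ~1/3.9, the 'roughly 1/4th'
print(f"PS3 per chip: 1/{ps3_internal / per_chip:.1f}")    # ~1/7.8, the '1/8th'
[/code]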