Second Gen Cell info

Fafalada said:
That's compared to the imaginary 4 Cells on a chip :D I think that's what people are comparing it to.
Well, recent debates about the PhysX PPU usually included the assumption that MS can "easily afford" to add another large chip in there since their CPU will be relatively small and inexpensive compared to 8/1 Cell.

Ahh I see, I wasn't really following the discussion on PPU that closely. I expect the full blown PPU solution to be quite a big chip.

What I want to know is how the wattage of Cell and the Xenon CPU will compare.
Yeah it's a good question - also would be nice knowing what PS3 Cell will really be like in the first place.

PS3 Cell is most likely this thing that we are looking at, give or take SPEs :) Though if there's only one Cell in PS3, that FlexIO seems overkill.
 
V3 said:
...
Though if there's only one Cell in PS3, that FlexIO seems overkill.

Why's the FlexIO bandwidth overkill?

Eyeballing the Xenon 'leak', the R500 has ~33 GB/s 'read' bandwidth from the CPU's L2 cache + UMA system RAM...

And the FlexIO has ~ 77 GB/s aggregate bandwidth with ~ 45 GB/s outbound and ~ 32 GB/s inbound...so the PS3 GPU would be able to read ~ 45 GB/s.

The 33 GB/s and 45 GB/s figures seem to be in the same 'ballpark'...
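
For anyone who wants the arithmetic spelled out, here is a minimal Python sketch using only the approximate figures quoted in this post. The variable and function names are made up for illustration, and the numbers are the rumoured ones, not confirmed specs.

```python
# Approximate figures quoted in the post above, in GB/s (rumoured, not confirmed).
FLEXIO_OUTBOUND = 45.0  # Cell -> GPU, i.e. what the PS3 GPU could read
FLEXIO_INBOUND = 32.0   # GPU -> Cell
R500_READ = 33.0        # R500 'read' bandwidth from the Xenon 'leak'


def aggregate(outbound, inbound):
    """Aggregate link bandwidth is the sum of both directions."""
    return outbound + inbound


def ratio(a, b):
    """How many times larger a is than b."""
    return a / b


flexio_total = aggregate(FLEXIO_OUTBOUND, FLEXIO_INBOUND)
read_ratio = ratio(FLEXIO_OUTBOUND, R500_READ)

print(f"FlexIO aggregate: ~{flexio_total:.0f} GB/s")
print(f"PS3 GPU read vs R500 read: ~{read_ratio:.2f}x")
```

With these inputs the two read figures differ by well under a factor of two, which is all that 'same ballpark' is claiming.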
 
Tacitblue said:
No doubt the technical information will be released before too long in some form, seems to have more trickling out before the May E3 frenzy.

About Cell or the PS3 incarnation? I think this "version 2" is what we saw already at the ISSCC, so I wouldn't expect very many new details on it. But yes, hopefully we'll hear about the PS3 configuration soon..(though I don't think there'll be any conference from Sony now before E3..still a possibility I guess).

By the way, a little straw poll - where do people see the PS3 clockspeed ending up?
 
Given one of the power curve graphs I saw, which was for the first prototype chip, 3.2 GHz was run at 0.9 V and maybe 3 watts per SPE, so maybe 50 watts total if you include the PPC? How much of a performance hit are they willing to take to keep it cool yet still be competitive with the Xenon processor? That's what I'd like to know. I'm not exactly a technical type here.
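
A quick sketch of what that guess implies, assuming the 8-SPE configuration discussed in this thread. The 3 W and 50 W values are the poster's rough readings, not measured data, and the names below are only for illustration.

```python
SPE_COUNT = 8         # assumption: the "8/1" Cell configuration from this thread
WATTS_PER_SPE = 3.0   # poster's rough reading at 3.2 GHz, 0.9 V
TOTAL_GUESS = 50.0    # poster's guess for the whole chip, in watts


def spe_power(count, watts_each):
    """Total power drawn by all SPEs together."""
    return count * watts_each


def remainder(total, spes):
    """What is left for the PPC core, L2, buses and I/O."""
    return total - spes


spe_total = spe_power(SPE_COUNT, WATTS_PER_SPE)
rest = remainder(TOTAL_GUESS, spe_total)

print(f"SPEs: {spe_total:.0f} W, everything else: {rest:.0f} W")
```

So a 50 W total would leave roughly half the budget for the PPC and the rest of the chip, if the per-SPE figure is anywhere near right.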
 
I'm sticking with my previous estimate of 4.096 GHz - no idea if it's at all likely but I like pretty numbers :p

How much of a performance hit are they willing to take to keep it cool yet still be competitive with the Xenon processor?
That would also depend on what kind of clock Xenon processor will end up at, won't it?
 
I noticed they moved the permute/test section (part of the AltiVec?) on the die away from the main PPC core and placed it next to the L2 this time. Less heat soak there, I guess, away from the main portion of the PPE. Latency or stability reasons?
 
Tacitblue said:
I noticed they moved the permute/test section (part of the AltiVec?) on the die away from the main PPC core and placed it next to the L2 this time. Less heat soak there, I guess, away from the main portion of the PPE. Latency or stability reasons?

Excuse me, but where do you see that?
 
Check the labelled variants of each die shot of DD1 and DD2: the "permute" section of the PPC has been moved for some reason or another. In the prototypes it was right by the main execution units; in the newer variant it's up by the L2 cache. I might be wrong though: in the original they shared the same space as the test and debug circuits, so maybe that was the only thing that was moved.
 
From what I read in the original shots, that section is named "Test/Pervasive". That block seems to have moved next to the L2 cache. I don't have a clue what a pervasive unit does, but I don't think it's a permutation unit for AltiVec.

What I find interesting comparing the shots is that the PPE seems quite a bit larger. Comparing its previous size relative to the SPEs and L2 cache with its current size relative to those blocks, I get an area increase of between 1.62x and 1.84x.
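
The way a range like that falls out can be sketched as follows. This assumes the reference blocks (SPE, L2) kept the same physical size between revisions, and the pixel areas below are made-up toy numbers, NOT measurements from the actual die shots.

```python
def relative_growth(block_old, ref_old, block_new, ref_new):
    """Growth of a block, normalised against a reference block that is
    assumed to have the same physical size in both die shots."""
    return (block_new / ref_new) / (block_old / ref_old)


# Hypothetical pixel areas for illustration only (not from the real shots).
ppe_old, spe_old, l2_old = 100.0, 50.0, 80.0
ppe_new, spe_new, l2_new = 150.0, 50.0, 60.0

vs_spe = relative_growth(ppe_old, spe_old, ppe_new, spe_new)
vs_l2 = relative_growth(ppe_old, l2_old, ppe_new, l2_new)

print(f"growth vs SPE: {vs_spe:.2f}x, growth vs L2: {vs_l2:.2f}x")
```

Two different reference blocks give two different estimates, which is why the result comes out as a range rather than a single number.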
 
Jaws said:
And the FlexIO has ~ 77 GB/s aggregate bandwidth with ~ 45 GB/s outbound and ~ 32 GB/s inbound...so the PS3 GPU would be able to read ~ 45 GB/s.

Yes, that's assuming PS3 is a UMA. But aren't we expecting PS3 to be PC-like? The NV GPU will need to be really beefy compared to its PC counterpart to be able to use up all that bandwidth.
 
V3 said:
Jaws said:
And the FlexIO has ~ 77 GB/s aggregate bandwidth with ~ 45 GB/s outbound and ~ 32 GB/s inbound...so the PS3 GPU would be able to read ~ 45 GB/s.

Yes, that's assuming PS3 is a UMA.

It's still the FlexIO aggregate bandwidth whether it's UMA or NUMA. Just to clarify, the R500 read bandwidth of 33 GB/s was the minimum from the 'leak', i.e. it's 33 GB/s plus... if the rumoured 512 MB of system RAM is true, then I'd expect that UMA bandwidth to increase also, and therefore the R500 'read' bandwidth to increase too...

Below are the potential GPU bandwidths from what we know so far,

Code:
+ 22 GB/s ---<---              ---<--- 16 GB/s
                  \          /
                     [R500]
                  /          \
+ 33 GB/s --->---              --->--- 32 GB/s

+/- 32 GB/s ---<---              ---<--- xx GB/s
                    \          /
                       [NV/G]
                    /          \
+/- 45 GB/s --->---              --->--- yy GB/s


Whether the NV5x|G7x custom GPU for PS3 is on a UMA or NUMA setup (with or without eDRAM), FlexIO will still provide 'data flow' bandwidth in the same ballpark as what the R500 gets to/from the Xenon UMA setup and the off-chip eDRAM module (16 + 32 GB/s read/write from the 'leak').

So FlexIO is not really 'overkill' IMHO.

V3 said:
But aren't we expecting PS3 to be PC-like?

Well, in the couple of polls I did earlier, the most popular result was a NUMA with eDRAM and the GPU having VS+PS units... so yes, maybe you could call it 'PC-like' (apart from the eDRAM)... but highly integrated nonetheless, in a console. Whether we see anything like that is another story. But hopefully we shall know soon...

V3 said:
The NV GPU will need to be really beefy compared to its PC counterpart to be able to use up all that bandwidth.

That's a good thing (TM) ! :p
 
For us non-subscribers or otherwise language-lacking, what's the summary of that page? Decoding 48 MPEG2 streams concurrently using Cell?
 
OK, to complement Mikage...

Toshiba's demo is
1. Load 48 SDTV-resolution MPEG2 streams from HDD simultaneously then decode them with 6 SPEs
2. Another SPE resizes them to thumbnails, then displays them tiled on a 1920x1080 screen. (The remaining 1 SPE is idle throughout the demo)

It's done on Toshiba's software platform where threads are automatically assigned to SPEs so programmers can write programs without doing thread scheduling by themselves.
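
To put the numbers in perspective, 48 streams across 6 decoding SPEs works out to 8 streams per SPE. The sketch below is a plain static round-robin split in Python, just to show that arithmetic. It is NOT Toshiba's scheduler (which reportedly assigns threads automatically), and the `SPE0`..`SPE5` labels and function name are made up for illustration.

```python
from collections import defaultdict


def assign_round_robin(stream_ids, workers):
    """Deal streams out to workers one at a time, like cards."""
    plan = defaultdict(list)
    for i, stream in enumerate(stream_ids):
        plan[workers[i % len(workers)]].append(stream)
    return dict(plan)


streams = list(range(48))                  # 48 SDTV MPEG2 streams
decoders = [f"SPE{n}" for n in range(6)]   # 6 SPEs doing the decoding

plan = assign_round_robin(streams, decoders)
per_spe = {worker: len(ids) for worker, ids in plan.items()}

for worker, count in per_spe.items():
    print(f"{worker}: {count} streams")
```

A real scheduler would presumably balance by actual decode cost rather than stream count, but the even split shows the load each SPE has to carry in the demo.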

 