Details trickle out on CELL processor...

Panajev2001a said:
And of course in Prescott Intel abandoned the double-pumped ALUs altogether.

Uhm... what did they do? How did they obtain the same performance for data-dependent instructions (two instructions with a data dependency executed in one external/slow clock cycle)? What solution did they use?

Sorry, I've searched high and low now to back my claim and can't find anything :(

I remember seeing a micro-benchmark when Prescott came out that showed that the back-to-back add substitution for shifts had one-cycle latency for each add. Maybe it was for 64-bit mode.
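
Something along these lines is what I have in mind. It's only a rough sketch, not the benchmark I actually saw; the iteration count, the four-adds-per-loop structure and the RDTSC timing are all just made up for illustration (x86 + GCC only):

Code:
#include <stdio.h>
#include <stdint.h>
#include <x86intrin.h>

#define ITERS 100000000ULL

int main(void)
{
    uint64_t x = 1;

    uint64_t start = __rdtsc();
    for (uint64_t i = 0; i < ITERS; i++) {
        /* four data-dependent adds; the empty asm keeps the compiler
           from folding them into a single x += 4*i */
        x += i; __asm__ volatile("" : "+r"(x));
        x += i; __asm__ volatile("" : "+r"(x));
        x += i; __asm__ volatile("" : "+r"(x));
        x += i; __asm__ volatile("" : "+r"(x));
    }
    uint64_t end = __rdtsc();

    /* the add chain is the critical path, so cycles / (ITERS*4)
       approximates the latency of one dependent add */
    printf("result %llu, ~%.2f cycles per dependent add\n",
           (unsigned long long)x,
           (double)(end - start) / (double)(ITERS * 4));
    return 0;
}

On a chip with double-pumped ALUs you'd expect the number to come out near 0.5; the claim is that Prescott measures closer to 1.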

Cheers
Gubbi
 
"We think it's going to be a much more seamless and speedy process for developers using these workstations," he said

Heh.


"The learning curve for this platform should be significantly better than previous ones."

Heh, "should" and "think" are very different from "will", Mythos.

Let's wait and hear from developers.
 
The comment "we should wait" pretty much counts for everything, though: good and bad. Since when have you known folks to be that patient on here? ;)

Nah, it's just "take this line and run with it!" :p
 
cthellis42 said:
The comment "we should wait" pretty much counts for everything, though: good and bad. Since when have you known folks to be that patient on here? ;)

Nah, it's just "take this line and run with it!" :p

Should I add this to the 1 TFLOPS comments?

;)

I should send them inside info saying the Xenon will be a 1.2 TFLOPS machine. Wait for them to post it and then make it into common knowledge on these boards. :)
 
Of course you should. Like I said: good and bad. And everything in-between.

Until we get performance comparisons, real-world testing, and see the architecture stretch out through different devices (and on some points "across a few years"), it's basically people filling in the gaps with "what they want to see" instead.
 
Mythos said:
Press conference in which the talk is that PS3 Cell development will go easy on developers...

PlayStation 3 chip goes easy on developers

http://news.com.com/PlayStation+3+chip+goes+easy+on+developers/2100-1043_3-5476933.html

This is promising news... IMO, if this is true then it's just as impressive, if not more so, than the 4.6GHz press release.

"We're very much aware of the need to balance between innovation in architecture and the ability to leverage that innovation," H. Peter Hofstee, a researcher in IBM's Systems and Technology division

Obviously wait for real-world figures, but this implies good real-world performance with their tools.

And I like this statement the best,

"We've created something that is very flexible," he said. "Having a more generic architecture will allow people to do new things."

A software renderer?... Generic architecture to try new things... If I were a games developer, I would be licking my lips at the thought of that prospect... REYES, ray tracing et al! ...Yummy! :p

Now this should give PS3 devs more coffee breaks and time to post at B3D! :LOL:
 
jaws said:
A software renderer?... Generic architecture to try new things...
The most recent patent that nAo uncovered certainly opens up possibilities for things I didn't really think S|APUs would be suited for. And we don't even need to go as far as talking about writing new renderers here...
 
Jaws said:
Now this should give PS3 devs more coffee breaks and time to post at B3D! :LOL:

Or they will get completely sucked into the new possibilities and code effect after effect till they look like zombies. ;)

Fredi
 
Fafalada said:
jaws said:
A software renderer?... Generic architecture to try new things...
The most recent patent that nAo uncovered certainly opens up possibilities for things I didn't really think S|APUs would be suited for. And we don't even need to go as far as talking about writing new renderers here...

Please elaborate...and which patent was this? :)

McFly said:
Jaws said:
Now this should give PS3 devs more coffee breaks and time to post at B3D! :LOL:

Or they will get completely sucked into the new possibilities and code effect after effect till they look like zombies. ;)

Fredi

How do we know they don't already look like that!... Have you met any of them?! :LOL: ;)
 
Is this the patent Fafalada?

http://www.beyond3d.com/forum/viewtopic.php?p=422002#422002

Ahhh... I see. I always thought you could use S|APUs as TMUs, as those SRAM LSs were gagging for it!... Though I also thought that if the PU's L2 cache was readable by the LS, then they could also be used as TMUs shared by the PE's S|APUs... Yep, one shading monster CELL will be! :)

EDIT: This little blog would be a fair assessment then? :p
 
I do not see how the PU can be used as TMUs in the general case... even the APUs for that matter: at least when you need texture filtering.

Using micro-polygons does not eliminate the need for texture filtering (it eliminates the need for perspective correction, though), although it would certainly reduce it, as we would perform some sort of filtering while blending micro-polygons together.

Having a L1 cache shared by the APUs would help the APUs to work with textures (fast and low latency random access to texture memory is something that really helps).
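
To illustrate why, here is a minimal software bilinear filter (the row-major RGBA8 texture layout and all the names are just assumptions for the sketch, not anyone's actual renderer): every filtered sample needs four texel reads at texture-coordinate-dependent addresses plus the weighting math, which is exactly the work a TMU hides behind its dedicated texture cache.

Code:
#include <math.h>
#include <stdint.h>
#include <stdio.h>

typedef struct { float r, g, b, a; } Color;

/* fetch one texel from a simple row-major RGBA8 texture (assumed layout),
   with clamp addressing */
static Color fetch(const uint8_t *tex, int w, int h, int x, int y)
{
    x = x < 0 ? 0 : (x >= w ? w - 1 : x);
    y = y < 0 ? 0 : (y >= h ? h - 1 : y);
    const uint8_t *p = tex + 4 * (y * w + x);
    Color c = { p[0] / 255.0f, p[1] / 255.0f, p[2] / 255.0f, p[3] / 255.0f };
    return c;
}

Color sample_bilinear(const uint8_t *tex, int w, int h, float u, float v)
{
    /* map normalized coordinates to texel space */
    float x = u * w - 0.5f, y = v * h - 0.5f;
    int x0 = (int)floorf(x), y0 = (int)floorf(y);
    float fx = x - x0, fy = y - y0;

    /* four scattered reads per sample -- the latency-sensitive part */
    Color c00 = fetch(tex, w, h, x0,     y0);
    Color c10 = fetch(tex, w, h, x0 + 1, y0);
    Color c01 = fetch(tex, w, h, x0,     y0 + 1);
    Color c11 = fetch(tex, w, h, x0 + 1, y0 + 1);

    /* weight horizontally, then vertically */
    Color out;
    out.r = (c00.r * (1 - fx) + c10.r * fx) * (1 - fy) + (c01.r * (1 - fx) + c11.r * fx) * fy;
    out.g = (c00.g * (1 - fx) + c10.g * fx) * (1 - fy) + (c01.g * (1 - fx) + c11.g * fx) * fy;
    out.b = (c00.b * (1 - fx) + c10.b * fx) * (1 - fy) + (c01.b * (1 - fx) + c11.b * fx) * fy;
    out.a = (c00.a * (1 - fx) + c10.a * fx) * (1 - fy) + (c01.a * (1 - fx) + c11.a * fx) * fy;
    return out;
}

int main(void)
{
    /* tiny 2x2 test texture: black, red, green, blue */
    const uint8_t tex[16] = {   0,   0,   0, 255,  255,   0,   0, 255,
                                0, 255,   0, 255,    0,   0, 255, 255 };
    Color c = sample_bilinear(tex, 2, 2, 0.5f, 0.5f);
    printf("%.2f %.2f %.2f %.2f\n", c.r, c.g, c.b, c.a);
    return 0;
}

If those four fetches go all the way out to main memory instead of a nearby low-latency cache, the APU stalls far more than it computes.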
 
Panajev2001a said:
I do not see how the PU can be used as TMUs in the general case... even the APUs for that matter: at least when you need texture filtering.

Using micro-polygons does not eliminate the need for texture filtering (it eliminates the need for perspective correction, though), although it would certainly reduce it, as we would perform some sort of filtering while blending micro-polygons together.

Having a L1 cache shared by the APUs would help the APUs to work with textures (fast and low latency random access to texture memory is something that really helps).

AFAIK, there are two patents flying around that are similar in concept.

One has the PU's L2 cache readable by the S|APUs' LS SRAM. The other has the S|APUs with their own L1 cache. In concept, are they not the same thing? I.e. from the S|APU's POV it's L1 cache, and from the PU's POV it's L2 cache?

EDIT:
If using micro-polygons, I see them being filtered by the TMUs of the PixelEngines... all these combinations and permutations highlight one thing: the generic nature of the architecture to try new things, as mentioned by Hofstee! :)
 
Jaws said:
Panajev2001a said:
I do not see how the PU can be used as TMUs in the general case... even the APUs for that matter: at least when you need texture filtering.

Using micro-polygons does not eliminate the need for texture filtering (it eliminates the need for perspective correction, though), although it would certainly reduce it, as we would perform some sort of filtering while blending micro-polygons together.

Having a L1 cache shared by the APUs would help the APUs to work with textures (fast and low latency random access to texture memory is something that really helps).

AFAIK, there are two patents flying around that are similar in concept.

One has the PU's L2 cache readable by the S|APUs' LS SRAM. The other has the S|APUs with their own L1 cache. In concept, are they not the same thing? I.e. from the S|APU's POV it's L1 cache, and from the PU's POV it's L2 cache?

EDIT: If using micro-polygons, I see them being filtered by the TMUs of the PixelEngines... all these combinations and permutations highlight one thing: the generic nature of the architecture to try new things, as mentioned by Hofstee! :)

I do not think the GPU will be CELL-based, for starters, but I might be wrong.

http://makeashorterlink.com/?H303261F9
http://makeashorterlink.com/?G313531F9

These two pictures come from the two patents filed by Kahle ( http://makeashorterlink.com/?M333131F9 and http://makeashorterlink.com/?W20B42688 ) and they both suggest L1+L2 caches for the PU and a separate L1 cache shared by the SPUs/APUs (part of the "atomic facility" as it was called in one of the patents).
 
Panajev2001a said:
Jaws said:
Panajev2001a said:
I do not see how the PU can be used as TMUs in the general case... even the APUs for that matter: at least when you need texture filtering.

Using micro-polygons does not eliminate the need for texture filtering (it eliminates the need for perspective correction, though), although it would certainly reduce it, as we would perform some sort of filtering while blending micro-polygons together.

Having a L1 cache shared by the APUs would help the APUs to work with textures (fast and low latency random access to texture memory is something that really helps).

AFAIK, there are two patents flying around that are similar in concept.

One has the PU's L2 cache readable by the S|APUs' LS SRAM. The other has the S|APUs with their own L1 cache. In concept, are they not the same thing? I.e. from the S|APU's POV it's L1 cache, and from the PU's POV it's L2 cache?

EDIT: If using micro-polygons, I see them being filtered by the TMUs of the PixelEngines... all these combinations and permutations highlight one thing: the generic nature of the architecture to try new things, as mentioned by Hofstee! :)

I do not think the GPU will be CELL-based, for starters, but I might be wrong.

http://makeashorterlink.com/?H303261F9
http://makeashorterlink.com/?G313531F9

These two pictures come from the two patents filed by Kahle ( http://makeashorterlink.com/?M333131F9 and http://makeashorterlink.com/?W20B42688 ) and they both suggest L1+L2 caches for the PU and a separate L1 cache shared by the SPUs/APUs (part of the "atomic facility" as it was called in one of the patents).

Yes... those are the patents I'm referring to, the ones with the S|APUs having L1 cache and their own DMAC, and the other set of patents I'm referring to are these, from J. Kahle,

http://www.beyond3d.com/forum/viewtopic.php?t=15304

They are similar in concept...one concentrates on DMA and the other on the cache. The cache one has L1/L2/L3 cache for the PUs.

Basically, there are two types of PUs: ones with up to L3 and ones with L2. Can I just clarify that every S|APU has a DMAC, and there's a DMAC for every PU? Is this a correct assumption, as it's different from Suzuoki's CELL patents' fig. 6, where there is one per PE? Is there a shared DMA pool at the S|APU level and at the PU level for dynamic resourcing via software?
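
Just to make concrete what a per-S|APU DMAC would mean for software, here is a rough sketch of the "DMA into local store, then compute" pattern. Every name and function in it is made up for illustration (the transfers are stubbed with memcpy); nothing here comes from the patents or any real toolchain:

Code:
#include <stdio.h>
#include <string.h>
#include <stddef.h>

#define CHUNK 1024   /* hypothetical local-store working buffer, in floats */

/* Hypothetical per-APU DMA primitives.  Stubbed with memcpy so the sketch
   runs anywhere; on real hardware these would queue asynchronous transfers
   on that APU's own DMAC and dma_wait() would block on a completion tag. */
static void dma_get(void *ls_dst, const void *mem_src, size_t bytes, int tag)
{ (void)tag; memcpy(ls_dst, mem_src, bytes); }

static void dma_put(void *mem_dst, const void *ls_src, size_t bytes, int tag)
{ (void)tag; memcpy(mem_dst, ls_src, bytes); }

static void dma_wait(int tag) { (void)tag; /* transfers are synchronous here */ }

/* buffers standing in for the APU's private local store */
static float ls_in[CHUNK], ls_out[CHUNK];

/* process one chunk of a large array that lives in "main memory" */
void process_chunk(const float *src, float *dst, size_t n)
{
    dma_get(ls_in, src, n * sizeof(float), 0);
    dma_wait(0);                        /* double-buffering would overlap this */

    for (size_t i = 0; i < n; i++)      /* compute entirely out of local store */
        ls_out[i] = ls_in[i] * 2.0f;

    dma_put(dst, ls_out, n * sizeof(float), 1);
    dma_wait(1);
}

int main(void)
{
    float src[CHUNK], dst[CHUNK];
    for (size_t i = 0; i < CHUNK; i++) src[i] = (float)i;
    process_chunk(src, dst, CHUNK);
    printf("dst[10] = %.1f\n", dst[10]);   /* expect 20.0 */
    return 0;
}

With one DMAC per S|APU each of them can keep its own queue of transfers in flight; with one DMAC per PE they would all be contending for the same engine.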

I think the CPU will have different PUs to the GPU: up to L3 cache to help distributed processing off-chip, while the PUs on the GPU will have L2 cache for more local data processing. If you look at the CELL patents, the PUs are different sizes on the CPU and the GPU (visualizer). Though I could be completely wrong on all this! :p

EDIT: Btw, is there a particular reason that leans you towards a GPU that isn't CELL based?
 
I like how the second anyone mentions generality in processing wrt the PS3, the thread immediately becomes the 589th thread we've had about REYES, ray tracing, radiosity, etc.
 