Read the original post, with its links, here:
http://gametomorrow.com/blog/index.php/2005/11/30/gpus-vs-cell/
*-----------------------------------------------------------8OO8---------------
GPUs vs Cell
Blogged under Cell by Barry Minor on Wednesday 30 November 2005 at 7:39 pm
Recently I came across a link on www.gpgpu.org that I found interesting. It described a method of ray-tracing quaternion Julia fractals using the floating-point power of graphics processing units (GPUs). The author of the GPU code, Keenan Crane, stated that “This kind of algorithm is pretty much ideal for the GPU - extremely high arithmetic intensity and almost zero bandwidth usage”. I thought it would be interesting to port this Nvidia CG code to the Cell processor, using the public SDK, and see how it performs, given that it was ideal for a GPU.

First we directly translated the CG code, line for line, to C + SPE intrinsics. All the CG code structures and data types were maintained. Then we wrote a CG framework to execute this shader on Cell, including a backend image compression and network delivery layer for the finished images. To our surprise (well, not really), we found that a 3.2 GHz Cell chip using only 7 SPEs for rendering could outrun an Nvidia 7800 GT OC card at this task by about 30%. We reserved one SPE for the image compression and delivery task.

Furthermore, the way CG structures its SIMD computation is inefficient, as it causes a large percentage of the code to execute in scalar mode. This is due to the way it structures vector data: array of structures (AOS) rather than structure of arrays (SOA). Converting the CG shader from AOS to SOA form made SIMD utilization much higher, which resulted in Cell outperforming the Nvidia 7800 by a factor of 5 - 6x, still using only 7 SPEs for rendering. Given that the Nvidia 7800 GT is listed as having 313 GFLOPs of computational power and seven 3.2 GHz SPEs only have 179.2 GFLOPs, this seems impossible, but then again maybe we should start reading more white papers and less marketing hype.
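
To make the AOS vs SOA point concrete, here is a minimal sketch in plain, portable C. It is not the code from the port: it uses no SPE intrinsics, it models a 4-wide SIMD register as an array of four floats, and every name in it (quat_aos, quat_soa4, norm2_aos, norm2_soa4, aos_to_soa4) is made up for illustration. The squared norm of a quaternion is used as the example because an escape test of that kind is typical of Julia set iteration; that choice is an assumption, not something taken from the shader.

```c
#define LANES 4  /* floats per SIMD register (assumed 4-wide, as on an SPE) */

/* AOS: one quaternion occupies one 4-float vector, c = {x, y, z, w}. */
typedef struct { float c[LANES]; } quat_aos;

/* SOA: each component gets its own 4-float vector, so one quat_soa4
   holds four quaternions, one per lane. */
typedef struct {
    float x[LANES], y[LANES], z[LANES], w[LANES];
} quat_soa4;

/* AOS squared norm: one lane-wise multiply, then a horizontal sum
   across the lanes. The horizontal sum is effectively scalar work,
   and the whole call yields a single result. */
float norm2_aos(const quat_aos *q)
{
    float prod[LANES];
    for (int i = 0; i < LANES; i++)
        prod[i] = q->c[i] * q->c[i];            /* maps to one SIMD multiply */
    return prod[0] + prod[1] + prod[2] + prod[3]; /* cross-lane reduction */
}

/* SOA squared norm: every operation is lane-wise, so the same number
   of vector instructions yields four results, one per lane. */
void norm2_soa4(const quat_soa4 *q, float out[LANES])
{
    for (int i = 0; i < LANES; i++)
        out[i] = q->x[i] * q->x[i] + q->y[i] * q->y[i]
               + q->z[i] * q->z[i] + q->w[i] * q->w[i];
}

/* Transpose four AOS quaternions into one SOA bundle. */
void aos_to_soa4(const quat_aos in[LANES], quat_soa4 *out)
{
    for (int i = 0; i < LANES; i++) {
        out->x[i] = in[i].c[0];
        out->y[i] = in[i].c[1];
        out->z[i] = in[i].c[2];
        out->w[i] = in[i].c[3];
    }
}
```

In the AOS version the lanes hold different components of one value, so any operation that combines components has to move data across lanes. In the SOA version each lane works on a separate quaternion (for a ray tracer, typically a separate pixel or ray), so the lanes never need to talk to each other and all four stay busy.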