You don't get streaming... nor the fact that the bandwidth needed to move 7MB is not the limiting factor, especially over a period longer than 10ms.
You need to consider what the requirements of streaming are.
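For what it's worth, the 7MB / 10ms figure is easy to put into numbers. A quick back-of-the-envelope sketch, assuming 1MB = 10^6 bytes and a perfectly even transfer over the window (the function name is made up for illustration):

```python
def required_bandwidth_mb_per_s(size_mb: float, window_ms: float) -> float:
    """Average rate in MB/s needed to move size_mb within window_ms."""
    # Multiply before dividing so the 7MB / 10ms case stays exact in floating point.
    return size_mb * 1000.0 / window_ms


# The case from the post above: 7MB moved within 10ms.
rate = required_bandwidth_mb_per_s(7, 10)
print(f"{rate:.0f} MB/s")
```

Since the window is "longer than 10ms", this is an upper bound on the sustained rate actually needed; a longer window only lowers it.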
Think about HD video playback (a perfect streaming example) on your computer. If your CPU has a small cache, even a fast CPU can have poor performance. If your CPU has a large cache then playback will be smooth.
Actually, a larger cache did not equal better performance in games or decoding, though it could. Remember the P4 / A64 era?
I even compared performance between two CPUs with the same architecture clocked at the same speed, one of which had twice the L2 cache, and got at best a 5-10% performance improvement across a wide range of applications.
Actually, Celeron and P4 are a perfect example for this kind of application (streaming video processing) while doing other tasks.
Actually, how useful is a large cache for decoding video, say... H.264? Is it done on the GPU these days or on the CPU?
Indeed, but things have to be considered ceteris paribus (on top of that, decoding can be branchy and complex, and AMD gave Intel a run for their money back then). Maybe it helped out a bit, but they still underperformed greatly clock for clock vs the A64 in everything from games to decoding to multitasking, with the A64 having much less L2 cache.
Dunno, but with a well optimised software decoder a low-end CPU like a Core2Duo at around 1.6-1.8GHz would have no trouble with 1080p 25fps 40Mbps H.264 decoding. The problem is that a lot of software decoders are really badly optimised while also delivering worse IQ.
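To tie this back to the cache question, here is a rough sketch of how much data is involved per frame at those numbers. It assumes 1Mbps = 10^6 bits/s, a constant bitrate, and 8-bit YUV 4:2:0 output at exactly 1920x1080; those are my assumptions, not something measured, and the function names are just for illustration:

```python
def compressed_bytes_per_frame(bitrate_mbps: float, fps: float) -> float:
    """Average compressed bytes per frame at a constant bitrate."""
    return bitrate_mbps * 1_000_000 / 8 / fps


def decoded_frame_bytes(width: int, height: int) -> int:
    """Size of one decoded 8-bit YUV 4:2:0 frame in bytes.

    One full-resolution luma plane plus two quarter-resolution chroma
    planes gives 1.5 bytes per pixel.
    """
    return width * height * 3 // 2


compressed = compressed_bytes_per_frame(40, 25)
decoded = decoded_frame_bytes(1920, 1080)
print(f"compressed: {compressed / 1000:.0f} KB per frame")
print(f"decoded:    {decoded / 1_000_000:.2f} MB per frame")
```

Under those assumptions a single decoded frame is already several times larger than a 256KB L2, and the decoder also needs reference frames, so the working set is streamed through the cache rather than held in it.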
But yes, the GPU can help out quite a lot. For example, the latest Rambo movie on Blu-ray at 1080p 24Hz, avg. 20Mbps (read from disc, not ripped), plus image enhancing features, uses about 25-30% (30Hz playback rate) of my dual-core E8400's time. With GPU acceleration it is 3-5%. The image quality is top notch, on par with the best I was shown at the Hi-Fi center where I live.
Yep, CPU decoding may turn off some intensive/advanced features. It usually can't do composition either, and the % utilization may be too high to be comfortable.
That reminds me of how, several years ago, I was surprised at how easily a single-core A64 3200+ handled a 1080p 25Hz non-variable 40Mbps MPEG-2 test video clip (or rather a short CGI movie). Somewhere around 50-70% CPU utilisation with no GPU acceleration.
Ha ha... yeah. MPEG2 is a piece of cake these days compared to MPEG4.
Open source projects should have pretty good H.264 decoding these days.
Getting off topic, but interesting nonetheless.
There... some informal H.264 decoder benchmarks on Core2Duo and A64 (in the second one):
Dec 2006: http://www.anandtech.com/show/2132/4 (Need GPU to play High Profile H.264)
This is one of those cases where the software is the limitation. Multiplatform games spring to mind?
It'd be case by case, e.g. a cross-platform physics library runs well on Cell.
Multiplatform games involve the entire console, skills and workflow, so they'd be affected by other parts and by the techniques used. You'll be able to find titles that run well on either or both platforms for assorted reasons.
Same with F@H, which has improved performance several times over on the same CPUs over the years, just because they started to utilise the CPUs and their extensions better.
See this quote, streaming has nothing to do with DMA vs coherent memory space.
But Cell does indeed have a lot more bandwidth and more execution resources. The size of the L2 is a bit irrelevant in that context.
Back to the L2 cache: it's really not a problem. Modern CPUs are going with smaller L1 caches with better characteristics (Bulldozer won't come with 64KB I$ and D$ if the rumours are to be trusted), and manufacturers have chosen to go with smaller L2 caches too (256KB is becoming a standard).
It's more complicated than size alone: there is the hierarchy, the latency and bandwidth offered by those caches, associativity, etc. For example, Intel's caches are better than AMD's even though they are usually smaller.
I've been rereading my posts from Friday and realized that they were pretty aggressive; I must have been in a bad mood.
Anyway, no matter the reasons, Patsu and Ihamoict, forgive me.
I thought CoreAVC was the fastest CPU-based decoder? At least for H.264/AVC/MPEG-4.
Using ffmpeg-mt, which is the most efficient CPU-based decoder out there, a C2D at 1.6GHz decodes a very high bitrate 1080p MKV rip at around 85-90% CPU time on both cores. This task is also easily handled by the CPU in a WDTV Live, which is a lot less powerful than a C2D.