Mordecaii said:This also has its own thread and is named almost the exact same as the question you just asked... It should be in the top 5 or so forum posts.
Yeah, I see. But at the time I clicked to post here, that thread was much further down the page. Besides, they weren't really getting anywhere with their discussion. Nobody really knows at the present time, I guess. So I retract the question until further notice.
Jawed said:Well you've just "proven" that caching and data compression don't work.
You can't normalise memory bandwidths like that.
Shifty Geezer said:Why would that be? Surely a listed figure such as 25 GB/s is 25 GB/s for the amount of RAM it connects to. Neither party has provided bandwidth-per-pin or bandwidth-per-megabyte figures.
Vaan said:...
So I can't find any meaningful way to measure or compare memory bandwidth between the two systems. Let's see in a couple of months, with the full final specs.
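For what it's worth, here is a tiny worked example of why per-megabyte normalisation doesn't tell you anything. The pool sizes and bandwidths are assumed purely for illustration, not taken from either machine's spec sheet:

Code:
#include <stdio.h>

int main(void)
{
    /* Illustrative figures only -- assumptions for the example, not
       taken from either console's spec sheet.                        */
    double pool_a_gbs = 25.6, pool_a_mb = 256.0;
    double pool_b_gbs = 22.4, pool_b_mb = 512.0;

    /* Naive "bandwidth per megabyte" normalisation. */
    printf("pool A: %.4f GB/s per MB\n", pool_a_gbs / pool_a_mb);
    printf("pool B: %.4f GB/s per MB\n", pool_b_gbs / pool_b_mb);

    /* The ratio says nothing about real throughput: caches, Z/colour
       compression and access patterns decide how much of the raw
       figure ever gets used, which is Jawed's point above.           */
    return 0;
}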
Laa-Yosh said:...
or is it like Nvidia's architecture where the pixel pipes' shader ALUs have to do it as well? And just how many ALUs are there per pixel pipe in the RSX?
Kutaragi said:...
For example, RSX is not a variant of nVIDIA's PC chip. CELL and RSX have a close relationship, and both can access the main memory and the VRAM transparently. CELL can access the VRAM just like the main memory, and RSX can use the main memory as a frame buffer. They are only separated by their main usage and do not really have a distinction.
This architecture was designed to kill wasteful data copying and calculation between CELL and RSX. RSX can directly refer to a result simulated by CELL, and CELL can directly refer to the shape of an object RSX has shaded (note: CELL and RSX have independent bidirectional bandwidths, so there is no contention). That is impossible with shared memory, no matter how beautiful the rendering or how complicated the shading a shared-memory design can manage.
...
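As a purely conceptual illustration of the "no wasteful copies" point in the quote above, here is a minimal C sketch assuming a single address space visible to both chips. The types and function names are invented for illustration and have nothing to do with Sony's actual APIs:

Code:
#include <string.h>
#include <stddef.h>

/* Hypothetical buffer descriptor -- invented for illustration. */
struct buffer {
    void  *addr;
    size_t size;
};

/* Split pools, no common addressing: the CPU's simulation result has
   to be staged into GPU-visible memory before the GPU can touch it.  */
void hand_off_with_copy(const struct buffer *sim_result, struct buffer *gpu_staging)
{
    memcpy(gpu_staging->addr, sim_result->addr, sim_result->size);
    /* ...then point the GPU at gpu_staging->addr */
}

/* Transparent access as described in the quote: both chips see the
   same address, so the GPU is simply handed the pointer -- no copy.  */
const void *hand_off_zero_copy(const struct buffer *sim_result)
{
    return sim_result->addr;
}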
blakjedi said:...
Question: the CPU read bandwidth is only half that of the northbridge... wouldn't it have been better to have the same bandwidth for reads and writes, as the GPU does?
Qroach said:So this isn't really an apples to apples comparison?
Johnny Awesome said:Not at all, but it was a nice attempt.
Riddlewire said:Ok, so speaking of apples to apples...
Are the CPU cores in the X360 (3) and the Cell (1) exactly identical?
If this has already been addressed on this forum, I missed it.
Jaws said:2) Dot products
-PS3
claimed PS3 ~ 51 billion dot products per second
Cell ~ 8 per cycle (7 SPU + VMX)
8 * 3.2GHz ~ 25.6 billion dot products per second
RSX ~ 51 - 25.6 ~ 25.4* billion dot products per second
* deduced from claim
PS3 ~ 51 billion dot products per second
Xmas said:I think this is wrong, and despite panajev's explanations, I guess they counted half of the max ops per cycle for both, i.e. 12.8 billion for Cell and 37.4 billion for RSX. It's the same way NVidia balanced NV40, half the shader ops can be dot products.
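Sanity-checking the arithmetic in the exchange above: the per-cycle counts and the 550MHz RSX clock are the figures the posters are assuming, and deriving Xmas' 37.4 number from 68 ops per cycle is my guess, not anything official:

Code:
#include <stdio.h>

int main(void)
{
    double cell_clock = 3.2e9;    /* Hz */
    double rsx_clock  = 550e6;    /* Hz, as announced at E3 2005 */
    double claimed_total = 51e9;  /* claimed PS3 dot products per second */

    /* Jaws' reading: 8 dot products per Cell cycle (7 SPUs + PPE VMX),
       the remainder deduced as RSX's share.                           */
    double cell_full   = 8.0 * cell_clock;            /* 25.6e9 */
    double rsx_deduced = claimed_total - cell_full;   /* ~25.4e9 */

    /* Xmas' reading: half of the max ops per cycle counted on each
       side (assumption: his 37.4 comes from 68 ops/cycle * 550 MHz).  */
    double cell_half = cell_full / 2.0;               /* 12.8e9 */
    double rsx_half  = 68.0 * rsx_clock;              /* ~37.4e9 */

    printf("Jaws: Cell %.1f + RSX %.1f = %.1f billion dp/s\n",
           cell_full / 1e9, rsx_deduced / 1e9, (cell_full + rsx_deduced) / 1e9);
    printf("Xmas: Cell %.1f + RSX %.1f = %.1f billion dp/s\n",
           cell_half / 1e9, rsx_half / 1e9, (cell_half + rsx_half) / 1e9);
    return 0;
}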
Jaws said:They're not exactly identical. However, the CELL PPE and the XeCPU cores are both Power-based, 12-flops-per-cycle, 2-way SMT, in-order cores...
Jaws said:Xenos is capable of 24 billion dot products per second. If you allocate 37.4 billion to RSX, that's a helluva increase considering they're both on 90nm, no?
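The 24 billion figure for Xenos presumably comes from one dot product per ALU per clock across 48 ALUs at 500MHz; that derivation is an assumption, but the arithmetic checks out:

Code:
#include <stdio.h>

int main(void)
{
    /* Assumed derivation: 48 ALUs, one dot product each per clock, 500 MHz. */
    double xenos_dp = 48.0 * 1.0 * 500e6;
    printf("Xenos: %.0f billion dot products/s\n", xenos_dp / 1e9);  /* 24 */
    return 0;
}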
The 48 ALUs are divided into three SIMD groups of 16. When it reaches the final shader pipe, each of the 16 ALUs has the ability to write out two samples to the 10MB of EDRAM. Thus, the chip is capable of writing out a maximum of 32 samples per clock. At 500MHz, that means a peak fill rate of 16 gigasamples per second. Each of the ALUs can perform 5 floating-point shader operations. Thus, the peak computational power of the shader units is 240 floating-point shader ops per cycle, or 120 billion shader ops per second at 500MHz.
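All of those numbers follow from a couple of multiplications:

Code:
#include <stdio.h>

int main(void)
{
    int alus            = 48;      /* total shader ALUs                 */
    int simd_groups     = 3;       /* so 16 ALUs per group              */
    int samples_per_alu = 2;       /* writes per ALU in the final pipe  */
    int ops_per_alu     = 5;       /* FP shader ops per ALU per cycle   */
    double clock_hz     = 500e6;   /* 500 MHz                           */

    int alus_per_group    = alus / simd_groups;                 /* 16  */
    int samples_per_clock = alus_per_group * samples_per_alu;   /* 32  */
    int ops_per_clock     = alus * ops_per_alu;                 /* 240 */

    printf("fill rate:   %.0f gigasamples/s\n", samples_per_clock * clock_hz / 1e9); /* 16  */
    printf("shader rate: %.0f billion ops/s\n", ops_per_clock * clock_hz / 1e9);     /* 120 */
    return 0;
}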
The 10MB of EDRAM is actually on a separate die, at least initially. As future process technologies become available, it is possible that it could be on the same piece of silicon as the GPU. Still, the EDRAM resides on the same package, and has a wide bus running at 2GHz to deliver 256GB/sec of bandwidth. That's a true 256GB/sec, not one of those fuzzy counting methods where the 256GB is "effective" bandwidth that accounts for all kinds of compression. The GPU writes the back buffer, Z buffer, and stencil buffer to the EDRAM. When the frame is finally ready to be drawn to the screen, the EDRAM transfers the back buffer to the 512MB of GDDR3 for scan-out. The EDRAM does not store any textures—the full 10MB gets pretty much filled up with 1280x720 HD resolution, including Z, stencil, and anti-aliasing sub-pixel samples.
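Taking the quoted figures at face value, the implied interface width works out as below; the bytes-per-sample used for the rough 720p footprint are an assumption for illustration only:

Code:
#include <stdio.h>

int main(void)
{
    /* Implied width of the EDRAM interface from the quoted numbers
       (256 GB/s at 2 GHz).                                           */
    double bandwidth = 256e9;   /* bytes per second */
    double clock_hz  = 2e9;     /* 2 GHz            */
    printf("bytes per clock: %.0f (a %.0f-bit interface)\n",
           bandwidth / clock_hz, 8.0 * bandwidth / clock_hz);   /* 128 B, 1024 bit */

    /* Rough 720p footprint, assuming 4 bytes of colour and 4 bytes of
       Z/stencil per sample and no multisampling -- just to show how
       quickly 10 MB gets eaten.                                      */
    double pixels = 1280.0 * 720.0;
    double bytes_per_sample = 4.0 + 4.0;
    printf("720p, 1 sample/pixel: %.1f MB\n",
           pixels * bytes_per_sample / (1024.0 * 1024.0));       /* ~7.0 MB */
    return 0;
}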
There's even a little magic that happens at that phase. The EDRAM has built-in logic to perform Z compare, alpha blending, and resolving anti-aliasing samples into pixels. Normally those operations happen on the GPU, and not only require valuable silicon real estate and on-chip caches, but also eat into memory bandwidth as data has to go back and forth between the GPU and the main graphics RAM. ATI's solution of building that logic into the EDRAM, where the back, Z, and stencil buffers live, eliminates a lot of data transfer and saves both time and silicon space on the GPU die itself. Because of the bandwidth savings and the absolutely massive bandwidth to EDRAM, the Xbox 360 should be able to perform frame buffer effects like motion blur, depth of field, or lens flare with incredible speed.
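Purely as a conceptual sketch of the kind of per-sample work being described (Z compare, alpha blend, AA resolve), with invented formats that have nothing to do with ATI's actual logic:

Code:
#include <stdint.h>

/* Invented sample format, for illustration only. */
struct sample {
    float   z;
    uint8_t r, g, b, a;
};

/* Z test plus a simple source-alpha blend against the stored sample,
   done "at the memory" rather than back on the GPU.                  */
void rop_blend(struct sample *stored, const struct sample *incoming)
{
    if (incoming->z >= stored->z)          /* fails the Z compare */
        return;
    float a = incoming->a / 255.0f;
    stored->r = (uint8_t)(incoming->r * a + stored->r * (1.0f - a));
    stored->g = (uint8_t)(incoming->g * a + stored->g * (1.0f - a));
    stored->b = (uint8_t)(incoming->b * a + stored->b * (1.0f - a));
    stored->z = incoming->z;
}

/* Resolve: average the sub-pixel samples down to one displayable pixel. */
struct sample rop_resolve(const struct sample *samples, int count)
{
    unsigned r = 0, g = 0, b = 0;
    for (int i = 0; i < count; i++) {
        r += samples[i].r;
        g += samples[i].g;
        b += samples[i].b;
    }
    struct sample out = { 0 };
    out.r = (uint8_t)(r / count);
    out.g = (uint8_t)(g / count);
    out.b = (uint8_t)(b / count);
    return out;
}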
TexT said:Sony is at it again...
As an example of the advances being made, Pearson noted that Sony's new PlayStation 3 computer games console is 35 times as powerful as the model it replaced, and in terms of processing is "one percent as powerful as a human brain".
http://uk.news.yahoo.com/050522/323/fjiiv.html
Jawed said:I don't believe there's 256GB/s between GPU and EDRAM, either.
I do believe there's 256GB/s between the ROPs and the back-buffer.
Jawed