Will L2 cache on X360's CPU be a major hurdle?

Korrupt

Newcomer
X360's CPU has to share 1mb of L2 cache between 3 cores and 6 threads while PS3 has 256kb L2 cache per SPE. Will that 1mb L2 cache be a limiting factor in comparison to the Cell's L2 rich CPU?
 
The 256KB of memory on each SPE is essentially a cache; however, it has characteristics that separate it from a conventional cache.
 
SPEs have 256KB of local store and no cache. CELL PPE has 512KB L2 cache and (I assume) 64KB L1.

Xenon has 1MB of shared L2 cache between 3 cores which each have 64KB L1 cache.

At least that is how I remember each design. (I am sure Shifty can correct me if I am wrong).

Will cache be an issue next gen? Of course, every limitation is an issue.

But as noted by others in the past, 3D games tend to use less cache (partly because there is a lot of streaming). A shared cache has some advantages, like easy sharing between processors. Another nice thing about 1 big cache is that a large chunk of code can fit into it. Numerous small caches could run into issues where you have 1 large program and 4 or 5 small ones. The small ones are no problem in either design, but if you have a large chunk of code, let's say 600KB, it won't fit into a smaller segmented cache. So in that case a large 1MB shared cache is better than 3 smaller caches.

In general cache will be an issue on the 360. It will be something developers keep in mind when designing their engines, and in some cases something they will just need to design around.

Ditto the SPEs. Design to its strengths while minimizing any limitations.

Another benefit mentioned in regard to separate caches is that they could minimize the effects of one core's thrashing on the other cores.
 
cobragt said:
The 256KB of memory on each SPE is essentially a cache; however, it has characteristics that separate it from a conventional cache.

If it's a cache then why not call it a cache? It's not a cache according to the engineer who designed it...
 
cobragt said:
The 256KB of memory on each SPE is essentially a cache; however, it has characteristics that separate it from a conventional cache.

It's not a cache, but, like a cache, it is 0-wait state memory. Cell has 256*7 + 64 = 1856 kB of 0-wait state memory, and 512kB of small-wait state memory (PPE's L2). Xenon has 64*3 = 192kB of 0-wait state memory, and 1MB of small-wait state memory. It's largely anyone's guess as to how these numbers actually affect real-world performance at this point.
 
Wait a minute, each SPE doesn't have 256kb? So it's 256kb cache for all the SPE's? That's pretty low then...
 
mckmas8808 said:
It's not a cache, but, like a cache, it is 0-wait state memory.

What's so good about 0 wait-state memory?

It allows the execution core to do a load or store or instruction fetch at its full clock speed. Otherwise, it would have to wait for memory with each instruction fetch or data access, reducing its performance to a mere fraction of its full clock speed.
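As a rough back-of-the-envelope illustration of that "mere fraction" claim, the usual effective-CPI estimate shows how quickly wait states eat into throughput. Every number in this sketch is an assumption picked for the example, not a measurement of Xenon or Cell:

```c
/* Illustrative only: the standard effective-CPI estimate.
 * All of the numbers below are assumptions for the example. */
#include <stdio.h>

int main(void)
{
    double base_cpi     = 1.0;   /* cycles per instruction with no memory stalls */
    double access_rate  = 0.3;   /* fraction of instructions touching memory (assumed) */
    double miss_rate    = 0.05;  /* fraction of those accesses that miss (assumed) */
    double miss_penalty = 500.0; /* wait-state cycles paid per miss (assumed) */

    double effective_cpi = base_cpi + access_rate * miss_rate * miss_penalty;
    printf("effective CPI: %.2f (about %.1fx slower than the no-stall case)\n",
           effective_cpi, effective_cpi / base_cpi);
    return 0;
}
```

With those assumed numbers the core spends most of its time waiting on memory, which is exactly why 0-wait-state storage (or a cache that hits most of the time) matters so much.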
 
Korrupt said:
Wait a minute, each SPE doesn't have 256kb? So it's 256kb cache for all the SPE's? That's pretty low then...

Cell has 256kB of L1 memory for each SPE. (L1 does not mean L1 cache, in case anybody wants to nitpick.)
 
Correct me if I'm wrong but doesn't the 512kb of cache in PPE in CELL have direct access to main memory while the LS of the SPEs have to go through the cache to get to main RAM?
 
PC-Engine said:
Correct me if I'm wrong but doesn't the 512kb of cache in PPE in CELL have direct access to main memory while the LS of the SPEs have to go through the cache to get to main RAM?

"through the cache" you say that as if that were a bad thing :)

And you can just lock the L2 cache if you want to actually just bypass it.
 
Korrupt said:
Wait a minute, each SPE doesn't have 256kb? So it's 256kb cache for all the SPE's? That's pretty low then...

No, there is 256kB per SPE, so 256kBx7.

What's the 64 in phat's Cell calculation?
 
creon100 said:
What's the 64 in phat's Cell calculation?

64kB of L1 cache on the PPE + 256kB x 7 for the L1-type SRAM in the SPEs. Then there is the 512kB of L2 cache on the PPE.
 
L1 and L2 cache are highly automated memory systems, which makes them simple to program for.

The main difference between L1 and L2 cache and the memory found on each SPE is that developers can dictate how they want the memory on each SPE to be used, increasing efficiency. It still operates in much the same manner as a cache would, but that explicit control is why the local memories aren't traditional caches.
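To make the "developers dictate the usage" point concrete, here is a minimal sketch of how an SPE program might pull a block of main memory into its local store by hand, assuming the IBM Cell SDK's spu_mfcio.h DMA interface. The chunk size, buffer, and function are illustrative, not from this thread:

```c
/* Hypothetical sketch, assuming the IBM Cell SDK's spu_mfcio.h interface.
 * A conventional cache would fetch this data automatically on a miss;
 * here the SPE program issues and waits for the DMA itself. */
#include <spu_mfcio.h>
#include <stdint.h>

#define CHUNK 4096  /* bytes per transfer (illustrative, well under one SPE's 256KB) */

static char buffer[CHUNK] __attribute__((aligned(128)));  /* lives in local store */

void fetch_and_process(uint64_t ea_src)   /* ea_src: effective address in main RAM */
{
    const unsigned int tag = 1;

    /* Kick off the DMA: main memory -> local store. */
    mfc_get(buffer, ea_src, CHUNK, tag, 0, 0);

    /* Wait until every transfer with this tag has completed; after this,
     * buffer is sitting in 0-wait-state local store. */
    mfc_write_tag_mask(1 << tag);
    mfc_read_tag_status_all();

    /* ... compute on buffer at full clock speed, then DMA the results back ... */
}
```

A cache does that fetch for you on a miss; on an SPE the programmer schedules it, which is both the extra work and the extra control being discussed above.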
 
inefficient said:
PC-Engine said:
Correct me if I'm wrong but doesn't the 512kb of cache in PPE in CELL have direct access to main memory while the LS of the SPEs have to go through the cache to get to main RAM?

"through the cache" you say that as if that were a bad thing :)

And you can just lock the L2 cache if you want to actually just bypass it.

The point I'm making is this: how can LS function as a cache if it doesn't have a direct connection to main RAM? If it can function as a cache, then why does it need to go through another pool of cache to get to main RAM? AFAIR LS can't bypass the 512kB and go straight to RAM. Isn't the point of a cache to cache data from main RAM so that you don't have to go off-chip, since that's a lot slower? If LS doesn't have a direct path to main RAM, then how can it fill itself and operate as a cache? In essence the 7 SPEs will be fighting with the PPE over the 512kB of actual cache.
 
Can we get back to the topic at hand? I think the thread starter is asking whether or not it was a design mistake to have a shared cache for the three processors. In particular, I would like to know if there is a way to partition the cache so that each processor gets a specified chunk of cache space. If there is no way to partition the cache, then is there a way to prevent one processor from filling the cache with its pages of memory, which then push out the pages belonging to the other two processors?
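I don't know of a documented way to hard-partition Xenon's L2 per core offhand, but the "design around it" approach mentioned earlier can at least be sketched in software: keep each thread's working set to roughly its share of the 1MB, so no single core streams enough data to evict the others. A purely illustrative sketch, where the tile size, dummy workload, and data set are all assumptions:

```c
/* Purely illustrative: "design around" a shared 1MB L2 by keeping each
 * thread's tiles to roughly a third of it. */
#include <stddef.h>
#include <stdio.h>

#define SHARED_L2_BYTES (1024u * 1024u)
#define CORES           3u
#define TILE_BYTES      (SHARED_L2_BYTES / CORES)   /* ~341KB share per core */

/* Stand-in for real per-tile work so the sketch compiles on its own. */
static float process_tile(const float *data, size_t count)
{
    float sum = 0.0f;
    for (size_t i = 0; i < count; ++i)
        sum += data[i];
    return sum;
}

/* One core's worker: walk the data in tiles no larger than this core's
 * L2 share, so it never streams enough to evict the other cores' data. */
static float worker(const float *data, size_t total)
{
    const size_t tile = TILE_BYTES / sizeof(float);
    float acc = 0.0f;
    for (size_t i = 0; i < total; i += tile) {
        size_t n = total - i;
        if (n > tile)
            n = tile;
        acc += process_tile(data + i, n);
    }
    return acc;
}

int main(void)
{
    static float data[1u << 20];  /* 4MB of zero-initialized dummy input */
    printf("%f\n", worker(data, sizeof data / sizeof data[0]));
    return 0;
}
```

That doesn't stop a badly behaved thread from thrashing the L2, it just reduces the odds that a well-behaved one does, which is why an actual partitioning or locking mechanism would still be the more interesting answer to the question above.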
 