My question is: what are cache scrubbers, what do they do, and where have they been used (if they have been used at all)?
I don’t know if the actual math can be done, but it comes across as an efficiency or cost-saving technique. Basically, say there are two levels of memory (in reality there are more): the cache, which is very small but very fast, and main memory, which is very large but very slow.
Basically, to get as much as you can out of your processors, you do as much of your math as possible in cache, and when you’re done you write the results out to slow memory. But you didn’t necessarily clear the cache, so when you request the next batch of work from slow memory (which in turn populates the cache), the slow memory waits for the CPU to dump the entire contents of the cache before it can fill it with new data. I believe this is related to what’s called cache thrashing.
@sebbbi talks about it a lot.
The cache scrubbers act as fine-grained eviction. They mark data in the cache that is no longer needed, and they evict only what is marked. This means you don’t dump the entire cache, so you save some cycles and overall get a smoother flow. With a large cache some of this is nullified, I suspect; it’s not clear if the cache is always thrashed or if you can just put more into it when there is space. If not, then cache scrubbers make sense. But it’s hard to tell how much benefit is had here.
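To make the full-flush vs. marked-eviction comparison concrete, here’s a toy model (my own sketch, not Sony’s actual design; all names like `scrub` and `dirty` are made up) counting how many lines get written back and evicted under each scheme:

```python
# Toy cache: each line tracks whether it's dirty (must be written back to
# slow memory) and whether it's been marked by a "scrubber" as no longer needed.
CACHE_LINES = 8

def full_flush(cache):
    """Dump everything: all lines evicted, every dirty line written back."""
    writebacks = sum(1 for line in cache if line["dirty"])
    evictions = len(cache)
    return writebacks, evictions

def fine_grained_evict(cache):
    """Scrubber-style: only lines marked 'no longer needed' are touched."""
    marked = [line for line in cache if line["scrub"]]
    writebacks = sum(1 for line in marked if line["dirty"])
    evictions = len(marked)
    return writebacks, evictions

# Half the lines are dirty; half are stale results marked for scrubbing.
cache = [{"dirty": i % 2 == 0, "scrub": i < 4} for i in range(CACHE_LINES)]

print(full_flush(cache))          # (4, 8): 8 lines evicted, 4 written back
print(fine_grained_evict(cache))  # (2, 4): only the 4 marked lines go
```

The point of the toy model is just the ratio: the scrubber path evicts (and refetches) fewer lines, which is where the saved cycles would come from.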
On paper it sounds like an obvious evolution. But CPU IHVs still have yet to do this (or they have other methods of managing cache thrashing that aren’t a cache scrubber), so the benefit may be for a very specific application rather than widespread across the board. I’m also not sure what the silicon cost is.
Neat innovation though; memory management is often overlooked, and it plays a big part in performance.
We are talking about dumping everything and refilling the whole cache vs. fine-grained eviction.
I think if nothing in the cache is still needed, then there is zero benefit. If something is still needed, then after eviction the CPU will have to go up to the next-higher cache level and retrieve it again.
But I do believe a larger cache would reduce the need for fine-grained eviction. I could be wrong, though; there’s no reason to endure cache thrashing if you don’t have to. I would agree with the general statement that hardware-based cache management is a good thing, but if no one else is including it, that may signal it can also perform worse than what programmers can accomplish on their own.
TL;DR: To maximize the performance of any processor, and especially GPUs, cache management is uber important. But when you work on very divergent workloads, your cache will be thrashed and performance will drop heavily. Typically the way to work around this is to code with cache thrashing in mind, but Sony has hardware to address it and make this less painful overall.
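For what “coding with cache thrashing in mind” usually means in practice: you process data in chunks sized to the cache so the working set stays resident while you use it. A minimal sketch (`CACHE_BUDGET` is a made-up stand-in for cache capacity, not a real number):

```python
CACHE_BUDGET = 1024  # pretend the cache holds this many elements (assumption)

def process_in_chunks(data, work):
    """Walk a large array in cache-sized chunks instead of sweeping it whole."""
    results = []
    for start in range(0, len(data), CACHE_BUDGET):
        # This slice is the working set; do all the work on it while it's
        # "hot" instead of making repeated full passes that thrash the cache.
        chunk = data[start:start + CACHE_BUDGET]
        results.extend(work(x) for x in chunk)
    return results

out = process_in_chunks(list(range(3000)), lambda x: x * 2)
print(out[:3], len(out))  # [0, 2, 4] 3000
```

This is the software-side workaround the post describes; a hardware scrubber would instead let stale lines be evicted selectively without the programmer restructuring their loops.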