Looking at Sony's SSD patents, why do they need a "large" SRAM?
https://patents.google.com/patent/J...TO+HIDEYUKI&oq=SAITO+HIDEYUKI&sort=new&page=3
LBA size can be shrunk up to 32KiB for game installs for a 1TB SSD
https://patents.google.com/patent/J...TO+HIDEYUKI&oq=SAITO+HIDEYUKI&sort=new&page=3
This patent talks about using the SSD and SRAM cache as the working memory instead of DRAM to reduce power consumption in the standby state.
Is this how they are achieving 0.5W in rest mode? How much SRAM is needed?
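As a rough way to put numbers on the SRAM question, here is a back-of-the-envelope sketch of how a logical-to-physical map table scales with mapping granularity. It assumes the SRAM holds that table, that 1TB means 2^40 bytes, and that each entry is 4 bytes; none of those figures come from the patents, and the 4KiB baseline is only there for comparison.

```python
# Rough sizing of an address map table for a 1TB SSD.
# Assumptions (not from the patents): 1TB = 2**40 bytes, 4-byte entries,
# and a 4 KiB mapping unit as the comparison baseline.

KIB = 2**10
MIB = 2**20
TIB = 2**40

ENTRY_BYTES = 4  # hypothetical size of one map entry


def map_table_bytes(capacity_bytes, unit_bytes, entry_bytes):
    """Bytes needed for a table with one entry per mapping unit."""
    entries = capacity_bytes // unit_bytes
    return entries * entry_bytes


fine = map_table_bytes(TIB, 4 * KIB, ENTRY_BYTES)
coarse = map_table_bytes(TIB, 32 * KIB, ENTRY_BYTES)

print(f"4 KiB units:  {fine // MIB} MiB")
print(f"32 KiB units: {coarse // MIB} MiB")
```

Under these assumptions the table drops from 1024 MiB to 128 MiB, an 8x reduction. Whether that is what the "large" SRAM is sized for depends on the real entry size and granularity, which aren't given here.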
One possible use case would be to serve as a scratchpad for the decryption and decompression hardware on-die.
Another claim in one of the patents discussed some time ago had the disk system writing to a secured buffer in the OS address space first, then reading from that buffer for the unpack process and writing to the final destination requested by the game. It may be more efficient if the architecture can keep that intermediate buffer on-die rather than duplicating traffic to RAM. Depending on the granularity of the accesses and the number of them in-flight, that alone could take a fair amount of storage. NAND pages can be in the hundreds of KiB, for example. I'm not sure if the storage needs to account for the uncompressed or compressed size of the payload before writing to the final destination. Compression would work best if it happened before encryption, so the reverse could allow for the decrypted data to then be streamed to RAM as part of the decompression process. However, if there's any other action that needs to be done on the data before final commit, it might need to happen while on-die and decompressed. That would be a sizeable inflation factor.
The infinity cache is labelled as the LLC in various code changes.
So looking at RDNA, we have:
L0 (CU) fastest
L1 (Shader Array) slower
L2 (All Shader engines) slowest
and now we are adding Infinity Cache?
I have seen many people try to call Infinity Cache L3, but I don't think that's accurate; if it were, they would have just called it L3.
From AMD's footnotes on RDNA2:
"Measurement calculated by AMD engineering, on a Radeon RX 6000 series card with 128 MB AMD Infinity Cache and 256-bit GDDR6. Measuring 4k gaming average AMD Infinity Cache hit rates of 58% across top gaming titles, multiplied by theoretical peak bandwidth from the 16 64B AMD Infinity Fabric channels connecting the Cache to the Graphics Engine at boost frequency of up to 1.94 GHz. RX-535"
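The arithmetic in that footnote can be reproduced in a few lines. This is only a sketch of how the quoted figures combine, assuming each of the 16 channels moves one 64B transfer per clock at the 1.94 GHz boost frequency; the variable and function names are made up for illustration.

```python
# Reproduce the bandwidth arithmetic from AMD's RX-535 footnote.
# Assumption: each channel transfers 64 bytes per clock at boost frequency.

CHANNELS = 16
BYTES_PER_CLOCK = 64      # per channel
BOOST_CLOCK_GHZ = 1.94
HIT_RATE = 0.58           # average 4K hit rate quoted in the footnote


def peak_bandwidth_gbs(channels, bytes_per_clock, clock_ghz):
    """Theoretical peak fabric bandwidth in GB/s."""
    return channels * bytes_per_clock * clock_ghz


def hit_weighted_bandwidth_gbs(peak_gbs, hit_rate):
    """Peak bandwidth scaled by the cache hit rate, as the footnote does."""
    return peak_gbs * hit_rate


peak = peak_bandwidth_gbs(CHANNELS, BYTES_PER_CLOCK, BOOST_CLOCK_GHZ)
weighted = hit_weighted_bandwidth_gbs(peak, HIT_RATE)

print(f"peak fabric bandwidth: {peak:.1f} GB/s")
print(f"hit-rate weighted:     {weighted:.1f} GB/s")
```

With those inputs the peak works out to about 1986.6 GB/s and the hit-rate-weighted figure to about 1152.2 GB/s.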
The cache is not tied directly to the same hierarchy as the GPU caches, and it's possible that some kind of extended storage linked to the memory-side portion of the fabric could be applied to other designs using the fabric. The number of cache layers they have before getting to the LLC would be variable. It's also possible that they didn't want to be caught up in the random cache naming games played by the GPU side.
The GPU's handling of its caches is also architecturally distinct, since there's a fair amount of explicit cache handling at the instruction level that this new cache isn't visible to. There is handling for the LLC, but being done at the virtual memory page level means the GPU ISA cannot control it in the same way as the others, and this may also make it more extendable to other types of chip.
There are code changes related to Navi 21 that show ongoing use of it.
Are we sure L2 is still there?
This could go back to how it connects to everything. Unlike other caches, it's not directly linked to the internal hierarchy but is accessed through an Infinity Fabric link. The numbers for Navi 21 may indicate that these caches are tied to the memory controllers or the Infinity Fabric stop used by each controller node.
Why bother calling it Infinity Cache then? There's like 40MB of L3 on A100 cards or something like that.