The primary distinction for an L3 is that it sits behind the L1 and L2, and it is considered part of the coherent memory hierarchy.
How it is banked, how it is pipelined, and even whether it uses SRAM are implementation choices; the eDRAM L3 in IBM's latest POWER chips is one example of the latter. There can be multiple ports and heavy banking, and there can be conflicts and penalties for certain access patterns.
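As a rough illustration of the banking point, here is a minimal sketch of bank selection and conflict detection. The 8-bank, 128-byte-line organization and the low-bit bank hash are assumptions for illustration only; real designs use different bank counts and hash functions.

/* Minimal sketch: which bank a line address maps to, and when two
 * same-cycle accesses collide on a bank.  Parameters are hypothetical. */
#include <stdint.h>
#include <stdio.h>
#include <stdbool.h>

#define LINE_BYTES 128u
#define NUM_BANKS    8u

/* Low line-address bits select the bank. */
static unsigned bank_of(uint64_t addr) {
    return (unsigned)((addr / LINE_BYTES) % NUM_BANKS);
}

/* Two accesses issued in the same cycle conflict if they hit the same bank,
 * so a single-ported bank can serve only one of them that cycle. */
static bool bank_conflict(uint64_t a, uint64_t b) {
    return bank_of(a) == bank_of(b);
}

int main(void) {
    /* A stride of LINE_BYTES * NUM_BANKS lands every access in the same bank. */
    uint64_t a = 0x10000, b = a + LINE_BYTES * NUM_BANKS;
    printf("bank(a)=%u bank(b)=%u conflict=%d\n",
           bank_of(a), bank_of(b), bank_conflict(a, b));
    return 0;
}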
An on-die L3 doesn't have the same capacity, low-cost, and PCB-interface requirements as external DRAM, so things like the more complex address and command decoding, and the single read/write bus with all its turnaround penalties, are normally dispensed with. The wires are much finer, and there's no big bus of PCB traces that needs to be driven.
The need to manage allocations and evictions, often at the same time, leads to frequent simultaneous reads and writes, which encourages at least two ports.
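A toy model of that pressure: a fill write and a victim read-out frequently arrive in the same cycle, and the number of array ports caps how much of that work drains per cycle. The port counts and the simple stall arithmetic below are illustrative assumptions, not any particular design.

/* Toy model of why allocation plus eviction favors at least two ports. */
#include <stdio.h>

typedef struct {
    int ports;           /* array ports usable this cycle                 */
    int pending_reads;   /* e.g. victim read-out for an eviction          */
    int pending_writes;  /* e.g. fill write on an allocation              */
} bank_cycle_t;

/* Extra cycles needed to drain this cycle's work beyond the first cycle. */
static int stall_cycles(const bank_cycle_t *c) {
    int ops = c->pending_reads + c->pending_writes;
    if (ops <= c->ports) return 0;
    return (ops + c->ports - 1) / c->ports - 1; /* ceiling divide, minus the free cycle */
}

int main(void) {
    bank_cycle_t single = { .ports = 1, .pending_reads = 1, .pending_writes = 1 };
    bank_cycle_t dual   = { .ports = 2, .pending_reads = 1, .pending_writes = 1 };
    printf("single-ported stall: %d cycle(s)\n", stall_cycles(&single));
    printf("dual-ported stall:   %d cycle(s)\n", stall_cycles(&dual));
    return 0;
}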
The latencies involved and the number of other caches that feed into it are figured into a pipeline that also has stages devoted to routing accesses and waking up the arrays.
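To make the shape of that pipeline concrete, here is a toy tabulation of how stages like request routing and array wakeup stack up into an overall hit latency. The stage names and cycle counts are made-up illustrative numbers, not figures from any real chip.

/* Toy breakdown of an L3 hit pipeline; all numbers are illustrative. */
#include <stdio.h>

typedef struct { const char *name; int cycles; } stage_t;

int main(void) {
    stage_t pipe[] = {
        { "route request to target bank", 3 },
        { "wake up / enable the array",   2 },
        { "tag read and compare",         2 },
        { "data array read",              4 },
        { "route data back to requester", 3 },
    };
    int total = 0;
    for (size_t i = 0; i < sizeof pipe / sizeof pipe[0]; i++) {
        total += pipe[i].cycles;
        printf("%-32s %2d cycles\n", pipe[i].name, pipe[i].cycles);
    }
    printf("%-32s %2d cycles\n", "total L3 hit latency (toy)", total);
    return 0;
}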