arjan de lumens
Veteran
OpenGL guy said:
    arjan de lumens said:
        You can hide latency as long as the buffers you use to keep track of outstanding pixels/memory accesses aren't full. These buffers get rather expensive after a while, but, say, 100-200 ns of latency isn't that hard to mask this way.
    Ok. How big is your cache line? How many cache lines do you have? These factors will determine how much latency you can hide. Caches on GPUs are generally smaller than CPU caches. Also, caches on GPUs tend to be divided among different units (Z, texture, color).

If you, for every memory access, must allocate a cache line prior to performing the access, then the amount of latency that can be masked will be on the order of cache size divided by memory bandwidth, which should amount to several hundred to a few thousand ns for typical GPU cache sizes and memory bandwidths. There is nothing preventing us from, e.g., filling multiple cache lines at the same time.
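To put rough numbers on that cache-size-over-bandwidth bound, here is a quick back-of-the-envelope sketch. The 20 GB/s bandwidth and the 8-64 KB cache sizes are illustrative assumptions of my own, not figures anyone in the thread has quoted:

```python
# Back-of-the-envelope bound on maskable latency when every access must first
# allocate a cache line: roughly the time it takes to refill the entire cache.
# The cache sizes and the 20 GB/s bandwidth are illustrative assumptions only.

def maskable_latency_ns(cache_size_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Latency (in ns) hideable before the cache runs out of lines to allocate."""
    return cache_size_bytes / bandwidth_bytes_per_s * 1e9

KB = 1024
GB = 1e9

for cache_kb in (8, 16, 64):
    ns = maskable_latency_ns(cache_kb * KB, 20 * GB)
    print(f"{cache_kb:>3} KB cache @ 20 GB/s -> ~{ns:.0f} ns of latency maskable")

# Prints roughly 410 ns, 819 ns and 3277 ns: several hundred to a few
# thousand ns, as claimed above.
```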
OpenGL guy said:
    Plus, I think you are missing my whole point: If you aren't getting good cache line utilization, then you are wasting a lot of bandwidth, thus latency becomes important.

In that situation, effective latency (as seen from the unit that accesses the memory controller) will go up sharply due to bandwidth saturation effects, so I would say that in this situation memory bandwidth is still more important to performance than the raw memory latency (the latency from when the memory module receives the request until it returns the data).
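As a toy illustration of that saturation effect (a crude single-queue model I'm using for the sake of argument, not anything measured), the latency the requesting unit observes inflates roughly like 1/(1 - utilization) once the bus is nearly full:

```python
# Toy single-queue model of a saturated memory controller (my own illustration,
# not something from the thread): queuing delay inflates the raw module latency
# by roughly 1 / (1 - utilization) as the bus approaches full bandwidth.

def effective_latency_ns(raw_latency_ns: float, utilization: float) -> float:
    """Rough effective latency seen by the requesting unit at a given bus load."""
    assert 0.0 <= utilization < 1.0
    return raw_latency_ns / (1.0 - utilization)

RAW_NS = 40  # assumed unloaded DRAM latency, purely illustrative

for u in (0.5, 0.8, 0.9, 0.95, 0.99):
    print(f"bus at {u:.0%}: effective latency ~{effective_latency_ns(RAW_NS, u):.0f} ns")

# ~80 ns at 50% load vs ~4000 ns at 99% load: once bandwidth saturates, the
# latency the unit sees is dominated by queuing, not by the memory module itself.
```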