The pros and cons of eDRAM/ESRAM in next-gen

What I meant to ask is whether they saw any benefit in using an allegedly low-latency RAM, compared to the main RAM, for compute shader work. (Which, I think, shouldn't be a problem at all for the DDR3, at least compared to GDDR5.)
I can see bandwidth being an advantage, and the ESRAM's concurrent read/write too... but has there been any leak about the ESRAM latency? So far it looks like it doesn't provide much of an advantage.

I believe DDR3 and GDDR5 have very similar latencies.
 
SRAM always has a latency advantage over DRAM; that's just how it is.

Whether that translates into any actual performance gains in the gaming space is hard to say, since graphics workloads are highly tolerant of latency.
 
Yeah, but are shaders usually bandwidth-bound? (Honest question, which is why I thought it would be nice for them to talk about it.)
Compute shaders are often memory (bandwidth and/or latency) bound. Most CUDA optimization guides talk extensively about memory optimizations, while ALU optimizations are not discussed as much (since ALU throughput isn't usually the main bottleneck for most algorithms on modern GPUs).
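To illustrate that point, here is a minimal CUDA sketch of my own (not from any poster in this thread) of a typical bandwidth-bound kernel: the arithmetic done per byte moved is so low that memory traffic, not ALU throughput, sets the speed limit.

```cuda
__global__ void saxpy(int n, float a, const float* x, float* y)
{
    // Per element: read x[i] and y[i] (8 bytes), write y[i] (4 bytes),
    // but only 2 floating-point operations. That is roughly 0.17 FLOP
    // per byte, far too little work to hide memory traffic behind ALUs.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        y[i] = a * x[i] + y[i];
}
```

Kernels like this run at whatever rate the memory system can feed them, which is why the optimization guides spend so many pages on coalescing, shared memory, and access patterns rather than on instruction selection.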
 
Hmmm, I hope they'll talk soon (GDC Europe?) about the benefits of that, if any.
Judging from sebbbi's words, the benefits are many. Compute shaders are the future, ROPs aren't as necessary as they once were, and they aren't the key to performance.
 
This discussion has come up before. But my worry is that the Xbox just won't have the CUs to spare for nice compute work.
 
SRAM always has a latency advantage over DRAM; that's just how it is.
But we don't know how it plays out in this particular implementation. It's odd that even months after launch, while we have many devs praising the ESRAM for many reasons, none of them has ever said it was because of latency. I think the chances are slim.
 
SRAM always has a latency advantage over DRAM; that's just how it is.

Whether that translates into any actual performance gains in the gaming space is hard to say, since graphics workloads are highly tolerant of latency.

For on-die SRAM versus external DRAM, this is virtually certain.
It's not so certain when both are on-die, particularly at large capacities, where latency starts to be dominated by the time it takes to cross the arrays; at that point density starts to win out. IBM's eDRAM analysis put the crossover point somewhere around the ESRAM's capacity.

Perhaps later disclosures can give a better handle on how much of the memory access process is shared between the DRAM and ESRAM paths.
The DDR interface and the devices themselves make a sizeable but fixed latency contribution, and AMD's CPU cache-miss latencies are such that we know the DRAM is no longer the biggest contributor (the latencies in nanoseconds are over twice what Intel can manage, and Intel's latency must already include the DRAM interface and device latencies), much less what the GPU does in its own pipeline.
 
Perhaps later disclosures can give a better handle on how much of the memory access process is shared between the DRAM and ESRAM paths.
The DDR interface and the devices themselves make a sizeable but fixed latency contribution, and AMD's CPU cache-miss latencies are such that we know the DRAM is no longer the biggest contributor (the latencies in nanoseconds are over twice what Intel can manage, and Intel's latency must already include the DRAM interface and device latencies), much less what the GPU does in its own pipeline.

Huh? Isn't that exactly the point of having a cache to begin with?
And what does the latency of the DRAM have to do with when a cache miss happens?
 
Huh? Isn't that exactly the point of having a cache to begin with?
And what does the latency of the DRAM have to do with when a cache miss happens?

If an access hits in cache, it doesn't incur main memory latency.
Main memory accesses aren't initiated until it has been determined that the data is not in cache.
Memory latency is the sum of all the misses incurred on-die, and then the cost of the memory controller, interface, and DRAM.

The remote cache access latencies are documented, and they are more than half of the documented memory latency. AMD spends more time on-chip trying to figure out whether it needs to hit main memory than it takes for said memory to actually be accessed.
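A back-of-envelope sketch of that decomposition, with made-up numbers purely for illustration (none of these are measured figures for any real chip): if the on-die portion dominates, swapping in a lower-latency memory array only shrinks the smaller term.

```cuda
#include <cstdio>

int main()
{
    // Hypothetical split of a CPU cache-miss-to-DRAM latency.
    float on_die_ns  = 60.0f;  // assumed: cache lookups, probes, memory controller queuing
    float off_die_ns = 45.0f;  // assumed: DDR PHY, command/address timing, the DRAM devices

    float total_ns = on_die_ns + off_die_ns;
    printf("total miss latency: %.1f ns, %.0f%% of it spent on-die\n",
           total_ns, 100.0f * on_die_ns / total_ns);

    // Even if the off-die term were halved by a faster memory array,
    // the total would only drop from 105 ns to 82.5 ns.
    return 0;
}
```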
 
If an access hits in cache, it doesn't incur main memory latency.
Main memory accesses aren't initiated until it has been determined that the data is not in cache.
Memory latency is the sum of all the misses incurred on-die, and then the cost of the memory controller, interface, and DRAM.

The remote cache access latencies are documented, and they are more than half of the documented memory latency. AMD spends more time on-chip trying to figure out whether it needs to hit main memory than it takes for said memory to actually be accessed.

Then why bother with caching? Just go to the RAM every time, no?
 
Then why bother with caching? Just go to the RAM every time, no?
3dilettante is referring to the case where a cache miss happens.

When you get a cache hit, it's much faster than an external DRAM access would be, even if that access were made without checking the cache first.
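The trade-off can be put in numbers with the usual average-memory-access-time formula, AMAT = hit time + miss rate × miss penalty. A quick sketch with assumed values (not figures for any particular chip):

```cuda
#include <cstdio>

int main()
{
    float hit_ns    = 4.0f;    // assumed on-die cache hit latency
    float miss_ns   = 100.0f;  // assumed full miss-to-DRAM latency
    float miss_rate = 0.05f;   // assumed: 5% of accesses miss

    float amat         = hit_ns + miss_rate * miss_ns;  // average with the cache
    float dram_only_ns = 90.0f;  // assumed DRAM latency with no cache lookup at all

    printf("with cache: %.1f ns average, DRAM-only: %.1f ns\n", amat, dram_only_ns);
    // 9.0 ns versus 90.0 ns: the occasional extra miss overhead is
    // easily paid for by the vast majority of accesses that hit.
    return 0;
}
```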
 