#1 |
|
yes, i'm drunk
|
I know GPUs are built to hide a LOT of latency (they do something else if the memory access for something you just did isn't complete yet, etc.), and that the GTX 480 has a memory access latency of 400-800 cycles.
I also know that, compared to the 7970 BIOS, the 7970 GHz Edition BIOS has slightly(?) higher memory latency settings (which also explains why 7970s with the GHz Edition BIOS achieve higher memory clocks than with the normal BIOS; the VMEM voltage should be the same AFAIK). The big question is: how much would an added, say, 2-4 "clocks"* of memory latency affect GPU performance?
*"clocks" being the best word I know to describe it, as CPU/RAM memory latency settings are "measured" in clocks.
__________________
I'm nothing but a shattered soul... Been ravaged by the chaotic beauty... Ruined by the unreal temptations... I was betrayed by my own beliefs... |
#2 |
|
Member
Join Date: Mar 2012
Location: Switzerland
Posts: 654
|
The answer you're looking for should lie in the wavefront numbers (the wavefront access on the 7970 is extremely high).
#3 |
|
Senior Member
|
800+2 = 802 = 0.25% increase.
#4 |
|
yes, i'm drunk
|
So what CPU-Z reports as "clocks" is the same as cycles?
#5 |
|
Senior Member
|
No idea what GPU-Z reports. Clocks = cycles.
#6 |
|
Member
Join Date: Aug 2011
Posts: 366
|
Both "clocks" and "cycles" are short for "clock cycles".
As for which cycles, that's another issue. The latency clocks reported by GPU-Z are memory bus cycles (divide the GDDR5 PR number by 4), while the 400-800 cycle memory latency typically refers to the time you have to wait as measured by the GPU clock (because that's what's interesting to a programmer). Of course, right now the memory clocks and the GPU clocks are close enough to each other that the distinction is irrelevant.
It's also important to note that, since the latency is measured in memory bus cycles, a latency of 10 on a 500MHz bus is exactly the same as a latency of 20 on a 1GHz bus (both are 20 ns).
As for the original question, the whole point of latency hiding in GPUs is that when your problem is embarrassingly parallel and you get hit by a memory access, you can always just go find something else to do while you wait (whereas on typical CPU loads you are pretty much hosed for the duration, so you want really good caches and low-latency memory). So the maximum latency you can have without horribly ruining your performance depends on how many instructions (or wavefronts) you can juggle, and on how long, on average, you can work on a single wavefront without stalling.
As that depends greatly on the workload you are executing, there is no simple answer. Quite probably, the added latency doesn't hurt you on most game loads. However, some GPGPU loads are much more latency-sensitive.
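To put rough numbers on the "how many wavefronts can you juggle" argument, here is a toy model in Python. It is only a sketch under loud assumptions: every wavefront alternates a fixed stretch of compute with one memory wait, scheduling is ideal, and the function name and the cycle counts are made up for illustration. It is not a description of how any real GPU scheduler works.

```python
import math


def wavefronts_to_hide(latency_cycles: int, busy_cycles: int) -> int:
    """Wavefronts needed so that one is always ready to run (toy model).

    Assumption: each wavefront computes for `busy_cycles`, then waits
    `latency_cycles` for memory. While it waits, the other wavefronts
    must supply at least `latency_cycles` of work between them.
    """
    if busy_cycles <= 0:
        raise ValueError("busy_cycles must be positive")
    return 1 + math.ceil(latency_cycles / busy_cycles)


# Hypothetical workloads: 48 or 8 cycles of work between memory accesses.
for busy in (48, 8):
    for latency in (800, 804):
        needed = wavefronts_to_hide(latency, busy)
        print(f"busy={busy:2d} latency={latency}: {needed} wavefronts")
```

In this model, a few extra cycles of latency only matter when you are already right at the edge of the number of wavefronts you can keep in flight, and workloads that do little work between memory accesses need far more wavefronts to begin with.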
#7 |
|
Senior Member
Join Date: Feb 2002
Posts: 2,019
|
2-4 clocks of memory latency is noise for a GPU. It would take a freak situation for the difference to be noticeable.
#8 |
|
yes, i'm drunk
|
Just one more thing: how much would, say, 4 clocks be in nanoseconds (or how many cycles would 1 ns be)?
#9 |
|
Senior Member
|
nsec = (cycles/clockrate) * 10^3
The "clockrate" is in MHz. (With the clockrate in Hz it would be nsec = (cycles/clockrate) * 10^9.)
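A quick Python sketch of the conversion in both directions, assuming the clock rate is given in MHz. The function names are made up for illustration, and the clock values are just round example numbers, not the settings of any particular card.

```python
def cycles_to_ns(cycles: float, clock_mhz: float) -> float:
    """Duration of `cycles` clock cycles in nanoseconds (clock in MHz)."""
    # One cycle lasts 1 / (clock_mhz * 1e6) seconds = 1000 / clock_mhz ns.
    return cycles * 1000.0 / clock_mhz


def ns_to_cycles(ns: float, clock_mhz: float) -> float:
    """Number of clock cycles that fit into `ns` nanoseconds (clock in MHz)."""
    return ns * clock_mhz / 1000.0


# 4 clocks on a 1000 MHz (1 GHz) clock, and 1 ns on the same clock.
print(cycles_to_ns(4, 1000))   # 4.0 ns
print(ns_to_cycles(1, 1000))   # 1.0 cycle

# Same wall-clock latency on two different buses.
print(cycles_to_ns(10, 500), cycles_to_ns(20, 1000))  # 20.0 20.0
```

The last line is the same point made earlier in the thread: a latency of 10 on a 500MHz bus and a latency of 20 on a 1GHz bus are the same amount of time.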
__________________
Apple: China -- Brutal leadership done right.
Google: United States -- Somewhat democratic. Microsoft: Russia -- Big and bloated. Linux: EU -- Diverse and broke. |
#10 |
|
Junior Member
Join Date: Mar 2012
Location: cracks
Posts: 53
|
The hertz measures the number of cycles (think rotations) per second (aka frequency).
The second hand of a classic watch makes one full rotation per minute, so it turns at 1/60 hertz. A nanosecond is 1/10^9 second, so 10^9 nsec make 1 second. A gigahertz is 1'000'000'000 hertz, or 10^9 rotations per second. So...
#11 |
|
Senior Member
|
I don't think it's as easy as that (and I don't think your intention was to depict it that way).
The total latency until a memory access is completed (the write done, or the results available in the registers) does not seem too closely related to what one traditionally understands by memory timings, as in the latencies of, say, normal DDR3 DRAM. At 800 MHz (DDR3-1600), a CAS latency of 7 to 9 cycles is normal; add two to four cycles on top of that... you do the math.
In the end it's a matter of alignment and of how well your memory controllers are able to coalesce accesses. If there's wiggle room left, you can tolerate data sitting a little longer in the buffers; if not, well, you'll have a performance penalty. But I'm sure product managers and their teams ensured beforehand that perf does not fall off a cliff here.
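For anyone who wants the math done, a small Python sketch. It compares the same 2-4 extra cycles against a DRAM-style CAS latency of 7-9 cycles and against the 400-800 cycle total latency quoted at the start of the thread. The function name is made up, and treating both figures as if they were counted in the same clock domain is a simplifying assumption.

```python
def added_latency_pct(base_cycles: float, extra_cycles: float) -> float:
    """Extra cycles expressed as a percentage of the baseline latency."""
    return 100.0 * extra_cycles / base_cycles


# Baselines taken from the thread: CAS 7-9 cycles vs. 400-800 cycles total.
cases = {
    "CAS latency (7-9 cycles)": (7, 9),
    "total access latency (400-800 cycles)": (400, 800),
}

for label, (low, high) in cases.items():
    best = added_latency_pct(high, 2)   # smallest increase: +2 on the larger base
    worst = added_latency_pct(low, 4)   # largest increase: +4 on the smaller base
    print(f"{label}: +{best:.2f}% to +{worst:.2f}%")
```

Seen as a raw memory timing, the increase is large; seen against the total latency the GPU has to hide anyway, it is around one percent or less.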
__________________
English is not my native tongue. Before flaming please consider the possibility that I did not mean to say what you might have read from my posts.
Work | Recreation
Warning! This posting may contain unhealthy doses of gross humor, sarcastic remarks and exaggeration!