Old 13-Jul-2012, 21:42   #1
Kaotik
yes, i'm drunk
 
Join Date: Apr 2003
Posts: 4,804
Memory latency and GPUs

I know GPUs are built to hide a LOT of latency (doing something else while a memory access they just issued has not completed yet, etc.), and that the GTX 480 has a memory access latency of 400-800 cycles.

I also know that, compared to the 7970 BIOS, the 7970 GHz Edition BIOS has slightly(?) higher memory latency settings (which also explains why 7970s with the GHz Edition BIOS achieve higher memory clocks than with the normal BIOS; the VMEM voltage should be the same AFAIK).

The big question is: how much does an added, say, 2-4 "clocks"* of memory latency affect GPU performance?

*"clocks" being the best word I know to describe it, as CPU/RAM memory latency settings are "measured" in clocks
__________________
I'm nothing but a shattered soul...
Been ravaged by the chaotic beauty...
Ruined by the unreal temptations...
I was betrayed by my own beliefs...
Old 13-Jul-2012, 23:33   #2
lanek
Member
 
Join Date: Mar 2012
Location: Switzerland
Posts: 655

The answer you are looking for probably lies in the wavefront numbers (the wavefront access on the 7970 is extremely high).
Old 14-Jul-2012, 01:13   #3
rpg.314
Senior Member
 
Join Date: Jul 2008
Location: /
Posts: 4,070

800+2 = 802 = 0.25% increase.
__________________
The views presented here are my own and not my employer's.
Quote:
Originally Posted by Alexko View Post
So in a nutshell, model [BLANK] will have [BLANK], up to [BLANK], and even [BLANK] for a power consumption of just [BLANK]. Impressive.
Old 14-Jul-2012, 02:24   #4
Kaotik
yes, i'm drunk
 
Join Date: Apr 2003
Posts: 4,804

Quote:
Originally Posted by rpg.314 View Post
800+2 = 802 = 0.25% increase.
So what CPU-Z reports as "clocks" is same as cycles?
Old 14-Jul-2012, 03:54   #5
rpg.314
Senior Member
 
Join Date: Jul 2008
Location: /
Posts: 4,070

No idea what GPU-Z reports. Clocks = cycles.
Old 14-Jul-2012, 12:59   #6
tunafish
Member
 
Join Date: Aug 2011
Posts: 366

Quote:
Originally Posted by Kaotik View Post
So what CPU-Z reports as "clocks" is same as cycles?
Both "clocks" and "cycles" are contractions for "clock cycles".

As for which cycles, that's another issue. The latency clocks reported by GPU-Z are memory bus cycles (to get the bus clock, divide the GDDR5 "PR number", i.e. the advertised effective data rate, by 4), while the 400-800 cycle memory latency typically refers to the time you have to wait as measured by the GPU clock (because that's what's interesting to a programmer). Of course, right now the memory clocks and the GPU clocks are close enough to each other that the distinction hardly matters.

It's also important to note that, as the latency is measured in memory bus cycles, a latency of 10 on a 500 MHz bus is exactly the same as a latency of 20 on a 1 GHz bus.
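To make the two clock domains concrete, here is a rough Python sketch. The helper names are made up for illustration, and the 5500 MT/s data rate and 925 MHz GPU clock are example figures I am assuming, not values read out of any BIOS.

```python
def gddr5_bus_clock_mhz(pr_number_mtps):
    """Memory bus clock from the advertised GDDR5 data rate (divide by 4)."""
    return pr_number_mtps / 4.0


def mem_cycles_to_gpu_cycles(mem_cycles, mem_clock_mhz, gpu_clock_mhz):
    """Express a latency counted in memory bus cycles in GPU clock cycles."""
    return mem_cycles * gpu_clock_mhz / mem_clock_mhz


# Assumed example figures, for illustration only.
GPU_CLOCK_MHZ = 925.0
mem_clock = gddr5_bus_clock_mhz(5500.0)

for extra in (2, 4):
    gpu_cycles = mem_cycles_to_gpu_cycles(extra, mem_clock, GPU_CLOCK_MHZ)
    print(f"{extra} extra memory cycles at {mem_clock:.0f} MHz "
          f"~ {gpu_cycles:.2f} GPU cycles at {GPU_CLOCK_MHZ:.0f} MHz")

# Latency 10 on a 500 MHz bus and latency 20 on a 1 GHz bus are the same wait
# once both are expressed in the same (GPU) clock domain.
same_wait = (mem_cycles_to_gpu_cycles(10, 500.0, GPU_CLOCK_MHZ)
             == mem_cycles_to_gpu_cycles(20, 1000.0, GPU_CLOCK_MHZ))
print("10 @ 500 MHz equals 20 @ 1 GHz:", same_wait)
```

The point of converting everything into GPU cycles is that this is the unit the 400-800 figure is quoted in, so the added memory cycles can be compared against it directly.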

As for the original question, the whole point of latency hiding in GPUs rests on the idea that when your problem is embarrassingly parallel and you get hit by a memory access, you can always just go find something else to do while you wait (whereas on typical CPU loads you are pretty much hosed for the duration, so you want really good caches and low-latency memory). So the maximum latency you can tolerate without horribly ruining your performance depends on how many instructions (or wavefronts) you can juggle, and on how long you can work on a single wavefront, on average, before it stalls. As that depends greatly on the workload you are executing, there is no simple answer.
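A toy model of that trade-off, as a minimal Python sketch. It assumes every wavefront alternates a fixed stretch of ALU work with one memory access of fixed latency, which real workloads do not do; the function names and all the numbers (40 cycles of work, 10 wavefronts) are hypothetical.

```python
import math


def wavefronts_to_hide(latency_cycles, work_cycles_per_stall):
    """Wavefronts needed so that, while one waits on memory, the others
    have enough ALU work to cover the whole wait."""
    return 1 + math.ceil(latency_cycles / work_cycles_per_stall)


def alu_utilization(num_wavefronts, latency_cycles, work_cycles_per_stall):
    """Fraction of time the ALUs are busy in this simplified model."""
    period = work_cycles_per_stall + latency_cycles
    return min(1.0, num_wavefronts * work_cycles_per_stall / period)


WORK = 40        # hypothetical ALU cycles between memory accesses
WAVEFRONTS = 10  # hypothetical number of wavefronts in flight

for latency in (800, 802, 804):
    needed = wavefronts_to_hide(latency, WORK)
    util = alu_utilization(WAVEFRONTS, latency, WORK)
    print(f"latency {latency}: need {needed} wavefronts to hide it fully, "
          f"{WAVEFRONTS} wavefronts give {util:.1%} utilization")
```

In this model a couple of extra cycles only shifts utilization by a fraction of a percent when you are already latency-bound, and costs nothing at all when there are wavefronts to spare; how much "work per stall" you really have is the workload-dependent part.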

Quite probably, the added latency doesn't hurt you on most game loads. However, some GPGPU loads are much more latency-sensitive.
Old 16-Jul-2012, 02:56   #7
3dcgi
Senior Member
 
Join Date: Feb 2002
Posts: 2,019

2-4 clocks of memory latency is noise for a GPU. It would take a freak situation for the difference to be noticeable.
Old 18-Jul-2012, 19:14   #8
Kaotik
yes, i'm drunk
 
Join Date: Apr 2003
Posts: 4,804

Just one more thing: how much would, say, 4 clocks be in nanoseconds (or how many cycles would 1 ns be)?
Old 18-Jul-2012, 19:29   #9
fellix
Senior Member
 
Join Date: Dec 2004
Location: Varna, Bulgaria
Posts: 2,817

nsec = (cycles / clockrate) * 10^3

The "clockrate" is in MHz.
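As a sanity check on the units: with the clock rate in MHz, one cycle lasts 1000 / clockrate nanoseconds. A small Python sketch of the conversion in both directions (the function names and the 1375 MHz example clock are my own picks for illustration):

```python
def cycles_to_ns(cycles, clock_mhz):
    """Duration of a number of clock cycles in nanoseconds."""
    return cycles * 1000.0 / clock_mhz


def ns_to_cycles(ns, clock_mhz):
    """Number of clock cycles that fit into a duration in nanoseconds."""
    return ns * clock_mhz / 1000.0


for clock_mhz in (1000.0, 1375.0):  # example clock rates
    print(f"at {clock_mhz:.0f} MHz: 4 cycles = "
          f"{cycles_to_ns(4, clock_mhz):.3f} ns, "
          f"1 ns = {ns_to_cycles(1, clock_mhz):.3f} cycles")
```

At exactly 1 GHz the two units coincide: one cycle is one nanosecond.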
__________________
Apple: China -- Brutal leadership done right.
Google: United States -- Somewhat democratic.
Microsoft: Russia -- Big and bloated.
Linux: EU -- Diverse and broke.
Old 18-Jul-2012, 21:29   #10
imaxx
Junior Member
 
Join Date: Mar 2012
Location: cracks
Posts: 53

The hertz measures the number of rotations (cycles) per second (aka frequency).
The second hand of a classic watch rotates at 1/60 hertz (one full rotation per minute).
A nanosecond is 1/10^9 second, so 10^9 nsec make 1 second.
A gigahertz is 1'000'000'000 hertz, or 10^9 rotations per second.


so...
Old 19-Jul-2012, 06:53   #11
CarstenS
Senior Member
 
Join Date: May 2002
Location: Germany
Posts: 2,842

Quote:
Originally Posted by rpg.314 View Post
800+2 = 802 = 0.25% increase.
I don't think it's as easy as that (and I don't think your intention was to depict it that way).

The total latency until a memory access is completed (with the write done or the results available in the registers) does not seem to be closely related to what one traditionally means by memory timings, as in the latencies of, say, normal DDR3 DRAM. At 800 MHz (DDR3-1600), a CAS latency of 7 to 9 cycles is normal; adding two to four cycles on top of that... you do the math.
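Doing that math as a quick Python sketch, using only the DDR3-1600 figures above plus the 800-cycle total from the quote for contrast. The helper names are made up, and whether GPU memory timings behave like DDR3 CAS settings is exactly the open assumption here.

```python
def percent_increase(base_cycles, extra_cycles):
    """Relative growth of a latency when extra cycles are added."""
    return 100.0 * extra_cycles / base_cycles


def cas_to_ns(cas_cycles, clock_mhz):
    """CAS latency in nanoseconds for a memory clock given in MHz."""
    return cas_cycles * 1000.0 / clock_mhz


DDR3_CLOCK_MHZ = 800.0  # DDR3-1600

for cas in (7, 9):
    for extra in (2, 4):
        print(f"CL{cas} + {extra}: "
              f"{cas_to_ns(cas, DDR3_CLOCK_MHZ):.2f} ns -> "
              f"{cas_to_ns(cas + extra, DDR3_CLOCK_MHZ):.2f} ns "
              f"(+{percent_increase(cas, extra):.1f}%)")

# The same 2 cycles measured against an 800-cycle end-to-end latency.
print(f"800 + 2: +{percent_increase(800, 2):.2f}%")
```

Relative to the DRAM timing itself the change is large; relative to the end-to-end access latency it is tiny, which is why the two ways of looking at it give such different impressions.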

In the end it's a matter of alignment and how well your memory controllers are able to coalesce accesses. If there's wiggle room left, you can tolerate data sitting a little longer in the buffers; if not, well, you'll have a performance penalty. But I'm sure product managers and their teams made sure beforehand that perf does not fall off a cliff here.
__________________
English is not my native tongue. Before flaming please consider the possibility that I did not mean to say what you might have read from my posts.
Work| Recreation
Warning! This posting may contain unhealthy doses of gross humor, sarcastic remarks and exaggeration!
