#4076 |
|
Member
Join Date: Jan 2010
Location: Hamburg, Germany
Posts: 987
|
You mean Eastern Europe, Middle East and North Africa?
__________________
x: RCP_sat R2.x, R1.y y: RCP_sat ____, R1.y z: RCP_sat ____, R1.y |
|
|
|
|
|
#4077 |
|
Senior Member
Join Date: Sep 2010
Posts: 1,029
|
Not only those, but Central European countries too, including Germany (especially German engineers who go to the USA to work there).
OK, some of these countries might be rich, but there is something special about the USA which makes people want to go there and not somewhere else. |
|
|
|
|
|
#4078 | |
|
Member
Join Date: Mar 2004
Posts: 752
|
Quote:
This is getting off-topic, but anyone in Europe would really dispute the claim of a poorer standard of living, when we have public healthcare systems and safety nets for the weakest in society. Regardless, graphics cards are the same price before value-added taxes are applied.
__________________
Never Argue With An Idiot. They'll Lower You To Their Level And Then Beat You With Experience! |
|
|
|
|
|
|
#4079 | |
|
Darlek ******
Join Date: Jun 2004
Posts: 9,498
|
Quote:
__________________
Guardian of the Most holy Two Terabytes of Gaming Goodness™ |
|
|
|
|
|
|
#4080 | |
|
Senior Member
|
*SCNR*
Quote:
__________________
English is not my native tongue. Before flaming please consider the possibility that I did not mean to say what you might have read from my posts. Work | Recreation
Warning! This posting may contain unhealthy doses of gross humor, sarcastic remarks and exaggeration! |
|
|
|
|
|
|
#4081 | ||
|
Member
Join Date: Jan 2010
Location: Hamburg, Germany
Posts: 987
|
Quote:
Quote:
__________________
x: RCP_sat R2.x, R1.y y: RCP_sat ____, R1.y z: RCP_sat ____, R1.y |
||
|
|
|
|
|
#4082 |
|
Registered
Join Date: Apr 2012
Posts: 4
|
|
|
|
|
|
|
#4083 |
|
Meh
Join Date: Mar 2004
Location: New York
Posts: 9,809
|
This is way OT but....
I'm from neither the US nor Europe, but in my travels people from Europe talk way more about wanting to go to the US than the other way around.
__________________
What the deuce!? |
|
|
|
|
|
#4084 | |
|
Member
Join Date: Jan 2010
Location: Hamburg, Germany
Posts: 987
|
Quote:
By the way, the most attractive target for German emigrants is Switzerland!
And now back to topic! Does anybody have more insight into the question I asked here? I mean, if nV takes the data locality issue seriously, they should pin warps to a certain vALU (or actually a set of vALUs, SFUs and L/S units) in roughly the same way as GCN pins its wavefronts to a certain vALU.
__________________
x: RCP_sat R2.x, R1.y y: RCP_sat ____, R1.y z: RCP_sat ____, R1.y |
|
|
|
|
|
|
#4085 | |
|
Member
Join Date: Jun 2008
Location: Torquay, UK
Posts: 913
|
Quote:
Continuing way OT ... I call it the Hollywood effect.
Back on topic: I was wondering why GK104 is slower in Bitcoin mining than GF110. I know this workload is purely integer, yet it still seems odd that the new GPU is 20-30% slower in both OpenCL and CUDA miners (including a CUDA miner compiled with the 4.2 toolkit). Average numbers: 110 MH/s (GTX 680) vs 140 MH/s (GTX 580) |
|
|
|
|
|
|
#4086 | |
|
Senior Member
|
Quote:
__________________
Apple: China -- Brutal leadership done right.
Google: United States -- Somewhat democratic. Microsoft: Russia -- Big and bloated. Linux: EU -- Diverse and broke. |
|
|
|
|
|
|
#4087 | |
|
Member
Join Date: Jul 2010
Location: Istanbul
Posts: 727
|
Quote:
it's slower on GPC OpenCL benchmark too
Code:
             GTX 580   GTX 680
SHA-1 Hash   571.0     471.9
__________________
SiS 6326 > Ti 4200 > 9800XT > 9800GT > GTX 460 Celeron 366 > Celeron 1700 > Athlon XP 2500+ > E6300 > Q9650 |
|
|
|
|
|
|
#4088 |
|
Member
Join Date: Aug 2011
Posts: 370
|
Bitcoin is basically all shifts. Perhaps the shift hardware is not as good in Kepler? (It certainly has no reason to be as good for gaming loads.)
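To make the "all shifts" point concrete: Bitcoin mining is double SHA-256, and the SHA-256 round and message-schedule functions are built from 32-bit rotates and shifts. Below is a minimal Python sketch of one of them, the small sigma0 function from the SHA-256 standard, with each rotate spelled out as two shifts and an OR, which is what hardware without a native rotate has to execute. The helper names `rotr32`, `small_sigma0` and `shift_ops_without_native_rotate` are made up for this illustration and are not taken from any miner.

```python
MASK32 = 0xFFFFFFFF

def rotr32(x, n):
    """Rotate a 32-bit word right by n bits (0 < n < 32): two shifts and an OR."""
    return ((x >> n) | (x << (32 - n))) & MASK32

def small_sigma0(x):
    """SHA-256 message-schedule function sigma0: ROTR 7 xor ROTR 18 xor SHR 3."""
    return rotr32(x, 7) ^ rotr32(x, 18) ^ (x >> 3)

def shift_ops_without_native_rotate():
    """Shift instructions needed for sigma0 when every rotate costs two shifts."""
    rotates, plain_shifts = 2, 1
    return rotates * 2 + plain_shifts

print(hex(small_sigma0(1)))
print(shift_ops_without_native_rotate())
```

So a single sigma0 already needs five shift instructions on a GPU without rotate, and SHA-256 evaluates several functions of this shape in every round.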
|
|
|
|
|
|
#4089 | |
|
Member
Join Date: Jun 2008
Location: Torquay, UK
Posts: 913
|
Quote:
It will be interesting to see if big Kepler will bring any improvements in these tasks or not.
Last edited by Lightman; 12-Apr-2012 at 23:39. |
|
|
|
|
|
|
#4090 |
|
Eric the Half-a-bee
Join Date: Oct 2003
Location: The cat detector van from the Ministry of Housinge
Posts: 2,050
|
|
|
|
|
|
|
#4091 |
|
Junior Member
Join Date: Jul 2008
Posts: 51
|
A new driver won't help much here, as the integer performance is severely handicapped compared to GTX 580 and even to GTX 560. According to CUDA C Programming Guide version 4.2, 32-bit integer shifts and compares have only 1/24 of the throughput of the 32-bit FMA. That would put the GTX 680 at around 1/6 of the GTX 580 throughput in those operations. The other integer operations aren't quite that slow though.
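The "around 1/6" figure can be checked with a back-of-the-envelope calculation. The sketch below is only that: the per-multiprocessor rates are my reading of the throughput table in the guide (16 shifts and 32 FMAs per clock on a GF110 SM, 8 shifts and 192 FMAs per clock on a GK104 SMX), and the multiprocessor counts and clocks (16 SMs at a 1544 MHz hot clock for the GTX 580, 8 SMXs at a 1006 MHz base clock for the GTX 680) are the commonly quoted reference specs. Treat all of them as assumptions, not as numbers from this post.

```python
from fractions import Fraction

# Assumed figures, see the text above: per-multiprocessor rates per clock,
# multiprocessor counts, and the clock the ALUs run at (MHz).
GPUS = {
    "GTX 580": {"sms": 16, "shift_per_clk": 16, "fma_per_clk": 32, "mhz": 1544},
    "GTX 680": {"sms": 8, "shift_per_clk": 8, "fma_per_clk": 192, "mhz": 1006},
}

def shift_gops(name):
    """Peak 32-bit shift throughput of the whole GPU in Gops/s."""
    g = GPUS[name]
    return g["sms"] * g["shift_per_clk"] * g["mhz"] / 1000.0

def shift_to_fma_ratio(name):
    """Shift throughput as a fraction of FMA throughput on one multiprocessor."""
    g = GPUS[name]
    return Fraction(g["shift_per_clk"], g["fma_per_clk"])

ratio_680_vs_580 = shift_gops("GTX 680") / shift_gops("GTX 580")
print(shift_to_fma_ratio("GTX 680"), round(ratio_680_vs_580, 3))
```

Under these assumptions the GTX 680 lands at roughly 64 Gops/s of shifts against roughly 395 Gops/s for the GTX 580, which is where the 1/6 comes from.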
|
|
|
|
|
|
#4092 | |
|
Senior Member
Join Date: Mar 2010
Location: Cleveland, OH
Posts: 1,579
|
Quote:
|
|
|
|
|
|
|
#4093 |
|
Senior Member
Join Date: Dec 2002
Location: Under a Crushing Burden
Posts: 4,290
|
I always assumed you were, guess it's true what they say.
__________________
You bought horse armor didn't you? |
|
|
|
|
|
#4094 | |
|
Member
Join Date: Aug 2011
Posts: 370
|
Quote:
The fastest implementation you could build is probably splitting the 32-bit words into 16-bit ones, but then you are going to at least quadruple the number of ops and double the amount of state per thread. The state is probably the bigger hit.
I have actually always been a bit puzzled as to exactly why AMD GPUs are as good at 32-bit shifts as they are. There really isn't any use that justifies the expenditure outside crypto. Is AMD the main supplier to the NSA or something? |
|
|
|
|
|
|
#4095 | |
|
Member
Join Date: Jan 2010
Location: Hamburg, Germany
Posts: 987
|
Quote:
As to the reason, I always thought that bit-manipulation instructions are quite cheap, maybe save for the shifts. But AMD obviously thought it was little enough effort to put them in at full speed. Maybe someone can enlighten us as to how much a 32-bit shift unit costs compared to an FMA?
__________________
x: RCP_sat R2.x, R1.y y: RCP_sat ____, R1.y z: RCP_sat ____, R1.y
Last edited by Gipsel; 17-Apr-2012 at 15:13. |
|
|
|
|
|
|
#4096 |
|
Member
Join Date: Jul 2010
Location: Istanbul
Posts: 727
|
There is some talk that Nvidia's lack of BFI_INT and int rotate instructions makes their GPUs significantly slower than AMD's.
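For anyone wondering why a bit-select style instruction matters here: the SHA-256 Ch function is exactly a per-bit select, so hardware with such an instruction can do in one op what otherwise takes several logic ops. A minimal Python sketch follows. `bitselect` models the OpenCL built-in of the same name; that miners map it onto BFI_INT is the usual claim and is taken here as an assumption, and `ch_plain` / `ch_select` are names made up for this example.

```python
MASK32 = 0xFFFFFFFF

def bitselect(a, b, c):
    """Model of OpenCL bitselect: take each bit from b where c is 1, else from a."""
    return ((a & ~c) | (b & c)) & MASK32

def ch_plain(x, y, z):
    """SHA-256 Ch as written in the standard: AND, NOT, AND, XOR (four logic ops)."""
    return ((x & y) ^ (~x & z)) & MASK32

def ch_select(x, y, z):
    """The same function as a single bit-select: x picks between y and z."""
    return bitselect(z, y, x)

samples = [0x00000000, 0xFFFFFFFF, 0x6A09E667, 0xBB67AE85, 0x12345678]
print(all(ch_plain(x, y, z) == ch_select(x, y, z)
          for x in samples for y in samples for z in samples))
```

Together with a rotate that costs one instruction instead of three (two shifts plus an OR), this adds up over the 64 rounds of each SHA-256 compression.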
__________________
SiS 6326 > Ti 4200 > 9800XT > 9800GT > GTX 460 Celeron 366 > Celeron 1700 > Athlon XP 2500+ > E6300 > Q9650 |
|
|
|
|
|
#4097 | |
|
Member
Join Date: Jan 2010
Location: Hamburg, Germany
Posts: 987
|
Quote:
Comparing GF100/110 : GF104/114 : GK104 per clock cycle (for the whole GPU, and taking the hot clock for Fermi), the instruction issue rates relate as 4:2:1 for 32-bit integer shifts and 2:1:2 for 32-bit integer multiplies. That makes this stuff really slow on GK104, and you still have to consider the lower clock speed of Kepler: shifts run at only about 1/3 of the speed of a GF114! Only integer adds are significantly faster.
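The ratios above can be reproduced from per-multiprocessor issue rates. The sketch below assumes 16 shifts and 16 integer multiplies per clock per SM on both Fermi variants, 8 shifts and 32 multiplies per clock per SMX on GK104, and SM counts and clocks of the GTX 580 (16 SMs, 1544 MHz hot clock), GTX 560 Ti (8 SMs, 1645 MHz hot clock) and GTX 680 (8 SMXs, 1006 MHz). None of these figures appear in the post, so treat them as assumptions chosen for the illustration.

```python
from functools import reduce
from math import gcd

# Assumed per-multiprocessor issue rates (ops per clock), SM counts and clocks (MHz).
CHIPS = {
    "GF110": {"sms": 16, "shift": 16, "imul": 16, "mhz": 1544},  # GTX 580
    "GF114": {"sms": 8, "shift": 16, "imul": 16, "mhz": 1645},   # GTX 560 Ti
    "GK104": {"sms": 8, "shift": 8, "imul": 32, "mhz": 1006},    # GTX 680
}
ORDER = ("GF110", "GF114", "GK104")

def per_clock_ratio(op):
    """Whole-GPU issue rate per clock for op, reduced to the smallest integer ratio."""
    rates = [CHIPS[c]["sms"] * CHIPS[c][op] for c in ORDER]
    g = reduce(gcd, rates)
    return tuple(r // g for r in rates)

def ops_per_second(chip, op):
    """Peak rate for op once the clock speed is taken into account."""
    c = CHIPS[chip]
    return c["sms"] * c[op] * c["mhz"] * 1e6

gk104_vs_gf114 = ops_per_second("GK104", "shift") / ops_per_second("GF114", "shift")
print(per_clock_ratio("shift"), per_clock_ratio("imul"), round(gk104_vs_gf114, 2))
```

With these inputs the per-clock ratios come out as 4:2:1 and 2:1:2, and the clock-adjusted shift rate of GK104 is a bit under a third of GF114's.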
__________________
x: RCP_sat R2.x, R1.y y: RCP_sat ____, R1.y z: RCP_sat ____, R1.y |
|
|
|
|
|
|
#4098 | |
|
Member
Join Date: Aug 2011
Posts: 370
|
Quote:
The thing is, throughput loads that use shifts don't really exist outside crypto. Which makes the existence of the instruction strange. I find it entirely believable that AMD added it for a single client. |
|
|
|
|
|
|
#4099 |
|
Member
Join Date: Jan 2010
Location: Hamburg, Germany
Posts: 987
|
As said above, AMD already made the shifts fast with the R700 generation (RV770 was doing shifts ~12 times as fast as RV670); Cypress only added to it by also enabling full-speed rotates and simplifying shifts of wider data with the bitalign instruction. If there had been a specific customer, I guess they would have done it only for RV770, not for the entire line (DP was also RV770 only). I think they did it mainly because it was cheap and even simplifies some things, because everything (the ALUs) gets more symmetric.
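To illustrate what a bitalign-style instruction buys, here is a small software model in Python. It assumes the semantics usually documented for it: concatenate two 32-bit words into a 64-bit value, shift right by the low 5 bits of the shift amount, and return the low 32 bits. That definition and the helper names `rotr32` and `shr64` are assumptions for this sketch, not something taken from the thread.

```python
MASK32 = 0xFFFFFFFF

def bitalign(hi, lo, shift):
    """Low 32 bits of the 64-bit value hi:lo shifted right by (shift & 31)."""
    return (((hi << 32) | lo) >> (shift & 31)) & MASK32

def rotr32(x, n):
    """A 32-bit rotate right is a bitalign of the word with itself."""
    return bitalign(x, x, n)

def shr64(hi, lo, n):
    """Right-shift a 64-bit value held in two 32-bit words, for 0 <= n < 32."""
    return hi >> n, bitalign(hi, lo, n)

print(hex(rotr32(1, 7)))
print([hex(w) for w in shr64(0x00000001, 0x00000000, 4)])
```

Both the rotate and the low word of a wide shift come out of a single instruction, instead of two shifts and an OR each.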
__________________
x: RCP_sat R2.x, R1.y y: RCP_sat ____, R1.y z: RCP_sat ____, R1.y |
|
|
|
|
|
#4100 | |
|
Member
Join Date: Jul 2010
Location: Land of Mu
Posts: 350
|
Quote:
http://muropaketti.com/artikkelit/na...md-vs-nvidia,2 (2nd benchmark)
It isn't at "absolute stink" level, but it's still disappointing. If GK110 adds the missing instruction at full speed, it should beat a 7970. |
|
|
|
|
| Tags |
| kepler, wait for it |