
Closed Thread
Old 25-Sep-2009, 17:12   #2426
trinibwoy
Meh
 
Join Date: Mar 2004
Location: New York
Posts: 9,809
Default

He's referring to his own speculation here.

Quote:
Nvidia was always aiming for a launch close to Black Friday, the first Friday after Thanksgiving and this year it is on November 27th. It is important to launch the product before this date as most of the shopping is usually done around this date.
__________________
What the deuce!?
trinibwoy is offline  
Old 25-Sep-2009, 17:12   #2427
MfA
Regular
 
Join Date: Feb 2002
Posts: 5,221
Send a message via ICQ to MfA
Default

Trinibwoy ... it doesn't, but when you are looking into the future by 4 months and a lot can still go wrong it's a rather small margin of error.
MfA is offline  
Old 25-Sep-2009, 17:41   #2428
trinibwoy
Meh
 
Join Date: Mar 2004
Location: New York
Posts: 9,809
Default

Sure, but Rys isn't necessarily referring to GT300 showing up at Newegg. There are lots of other tidbits that could leak or be released before that.
trinibwoy is offline  
Old 25-Sep-2009, 17:44   #2429
nAo
Nutella Nutellae
 
Join Date: Feb 2002
Location: San Francisco
Posts: 4,297
Default

Regarding the MIMD rumour, it's quite unlikely that GT300 has an instruction decoder and scheduler (+ I$?) per SP. Perhaps it refers to being able to run some sort of task parallelism in compute mode.
__________________
[twitter]
More samples, we need more samples! [Dean Calver]
The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way
nAo is offline  
Old 25-Sep-2009, 17:58   #2430
trinibwoy
Meh
 
Join Date: Mar 2004
Location: New York
Posts: 9,809
Default

Ever since I read that dynamic warp formation paper, whenever I hear about MIMD on GPUs I always assume it's still SIMD hardware but MIMD from the view of the running program. I've never really understood why the data sent to a SIMD has to be from the same warp. The SIMD doesn't care does it?

Could they practically extend GT200's scoreboarding mechanism to simply collect bundles of "ready" threads and their associated operands from any/all running warps? It would probably require some sort of operand buffering mechanism and trickier prioritization but it doesn't sound much different from what they're doing now anyway.
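
To make the idea concrete, here is a minimal sketch of that regrouping in Python. It's a toy model, not how GT200 or GT300 works: `form_warps`, the `(thread_id, pc)` tuples and the 4-wide warp are all made up for illustration, and it ignores operand buffering and any restriction on which lane a thread can occupy.

```python
from collections import defaultdict

WARP_SIZE = 4  # deliberately tiny and hypothetical; real widths differ


def form_warps(ready_threads, warp_size=WARP_SIZE):
    """Pack ready threads that share a PC into warps, whatever warp they came from.

    ready_threads: list of (thread_id, pc) tuples.
    Returns a list of (pc, [thread_ids]) issue groups.
    """
    by_pc = defaultdict(list)
    for tid, pc in ready_threads:
        by_pc[pc].append(tid)

    warps = []
    for pc in sorted(by_pc):
        tids = by_pc[pc]
        # Split each same-PC pool into chunks of at most warp_size threads.
        for i in range(0, len(tids), warp_size):
            warps.append((pc, tids[i:i + warp_size]))
    return warps


# Threads 0-3 started as one warp and 4-7 as another; both diverged at a branch.
ready = [(0, 10), (1, 20), (2, 10), (3, 20),
         (4, 10), (5, 20), (6, 10), (7, 20)]
packed = form_warps(ready)
```

In this example the eight threads from two diverged warps are issued as two full warps instead of four half-empty ones.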
trinibwoy is offline  
Old 25-Sep-2009, 18:06   #2431
nAo
Nutella Nutellae
 
Join Date: Feb 2002
Location: San Francisco
Posts: 4,297
Default

Two different arguments are getting mixed up here. One is being able to re-converge your threads and pack into a warp only (or mostly..) threads that share the same IP. The other is being able to schedule instructions from diverged control flow, or even from different programs, into the same warp. The latter is far more complex (and requires more instruction decoders).
nAo is offline  
Old 25-Sep-2009, 18:07   #2432
MfA
Regular
 
Join Date: Feb 2002
Posts: 5,221
Send a message via ICQ to MfA
Default

Quote:
Originally Posted by trinibwoy View Post
The SIMD doesn't care does it?
As long as there is enough local shared memory to accommodate all the warps I don't see why it should ... control flow gets less coherent, but how much would that matter with shaders used at the moment?
MfA is offline  
Old 25-Sep-2009, 18:14   #2433
MfA
Regular
 
Join Date: Feb 2002
Posts: 5,221
Send a message via ICQ to MfA
Default

BTW, what is already accumulated in warps at the moment? Are only vertices/fragments of a single drawcall combined? Or will it try to see what changes in between drawcalls?
MfA is offline  
Old 25-Sep-2009, 18:26   #2434
trinibwoy
Meh
 
Join Date: Mar 2004
Location: New York
Posts: 9,809
Default

Quote:
Originally Posted by nAo View Post
Two different arguments are getting mixed up here. One is being able to re-converge your threads and pack into a warp only (or mostly..) threads that share the same IP. The other is being able to schedule instructions from diverged control flow, or even from different programs, into the same warp. The latter is far more complex (and requires more instruction decoders).
Could you expand on that a bit? Why is packing by PC easier than packing by instruction?
trinibwoy is offline  
Old 25-Sep-2009, 18:30   #2435
nAo
Nutella Nutellae
 
Join Date: Feb 2002
Location: San Francisco
Posts: 4,297
Default

Quote:
Originally Posted by trinibwoy View Post
Could you expand on that a bit? Why is packing by PC easier than packing by instruction?
Packing by PC *is* packing by instruction (not the other way around, though).
What's harder is packing different instructions into the same warp, as you need to improve, among other things, your instruction decode rate.
nAo is offline  
Old 25-Sep-2009, 18:42   #2436
dnavas
Member
 
Join Date: Apr 2004
Posts: 325
Default

Quote:
Originally Posted by nAo View Post
What's harder is packing different instructions into the same warp, as you need to improve, among other things, your instruction decode rate.
I'd be concerned about operand fetching as well.

Does this get easier if your instruction set is simplified? I don't recall the exact details, but it seems like the instruction set is already pretty sparse, with MADD being the odd outlier and operand types (int16 vs. int32 vs. float vs. double?) contributing. Do we gain much by splitting the MADD? At some cost to operand bandwidth, one could gain greater use of the two math units, and it is a simplification (no more triple operand fetches, fewer instructions to support).
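
As a rough way to count that trade-off, here is a small Python sketch. The `OPS` table and `traffic` function are hypothetical, and the operand counts are just the textbook ones (d = a*b + c reads three registers, a plain MUL or ADD reads two); they are not taken from any NVIDIA ISA document.

```python
# (register reads, register writes) per instruction; illustrative counts only
OPS = {
    "MADD": (3, 1),  # d = a * b + c
    "MUL": (2, 1),   # t = a * b
    "ADD": (2, 1),   # d = t + c
}


def traffic(sequence):
    """Total register reads and writes for a sequence of opcodes."""
    reads = sum(OPS[op][0] for op in sequence)
    writes = sum(OPS[op][1] for op in sequence)
    return reads, writes


def widest_fetch(sequence):
    """Largest number of operands any single instruction has to fetch."""
    return max(OPS[op][0] for op in sequence)


fused = traffic(["MADD"])        # one instruction
split = traffic(["MUL", "ADD"])  # same math as two instructions
```

So splitting removes the triple fetch (the widest instruction drops from 3 reads to 2) but costs one extra read and one extra write for the same result, on top of the extra instruction to issue.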
dnavas is online now  
Old 25-Sep-2009, 18:43   #2437
trinibwoy
Meh
 
Join Date: Mar 2004
Location: New York
Posts: 9,809
Default

Quote:
Originally Posted by nAo View Post
Packing by PC *is* packing by instruction (not the other way around, though).
What's harder is packing different instructions into the same warp, as you need to improve, among other things, your instruction decode rate.
No, I'm saying to pack the same instruction but not necessarily the same PC. A PC specifies not only an instruction, but an instruction at a specific point in the program. Not sure why you would need a faster decoding rate if you're packing by decoded instruction. There'll be more latency between decoding and execution but that shouldn't matter.

Edit: I'm assuming here that instructions and operand addresses are kept separate. If that's not the case then ignore me
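
A toy comparison of the two groupings, as a Python sketch. `PROGRAM`, `group` and the thread list are hypothetical, and it relies on the same assumption as above: that the opcode can be handled separately from the operand addresses.

```python
from collections import defaultdict

# Hypothetical program: index is the PC, value is the opcode at that PC.
PROGRAM = ["MUL", "ADD", "MUL", "ADD"]


def group(threads, key):
    """Group thread ids by key(pc). threads is a list of (thread_id, pc)."""
    groups = defaultdict(list)
    for tid, pc in threads:
        groups[key(pc)].append(tid)
    return dict(groups)


threads = [(0, 0), (1, 2), (2, 1), (3, 3)]  # four threads, four different PCs

by_pc = group(threads, key=lambda pc: pc)                # same point in the program
by_opcode = group(threads, key=lambda pc: PROGRAM[pc])   # same operation, any PC

# Every distinct PC still has to be fetched and decoded to learn its opcode.
decodes_needed = len({pc for _, pc in threads})
```

Grouping by opcode halves the number of issue groups here (two instead of four), but the instruction at each distinct PC still has to be decoded first, so four decodes feed two issues.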
trinibwoy is offline  
Old 25-Sep-2009, 18:49   #2438
nAo
Nutella Nutellae
 
Join Date: Feb 2002
Location: San Francisco
Posts: 4,297
Default

Quote:
Originally Posted by trinibwoy View Post
No, I'm saying to pack the same instruction but not necessarily the same PC. A PC specifies not only an instruction, but an instruction at a specific point in the program. Not sure why you would need a faster decoding rate if you're packing by decoded instruction. There'll be more latency between decoding and execution but that shouldn't matter.
You're assuming that you can decode more instructions without stalling execution, but that works only if you increase the instruction decode rate (unless you also want to assume that current parts are unbalanced..)
nAo is offline  
Old 25-Sep-2009, 18:53   #2439
trinibwoy
Meh
 
Join Date: Mar 2004
Location: New York
Posts: 9,809
Default

Ok, now I understand what you're saying. But that investment could be amortized over wider SIMDs or something (which they probably need to do regardless to avoid AMD completely running away with flops/mm2).
trinibwoy is offline  
Old 25-Sep-2009, 18:56   #2440
nAo
Nutella Nutellae
 
Join Date: Feb 2002
Location: San Francisco
Posts: 4,297
Default

Quote:
Originally Posted by trinibwoy View Post
Ok, now I understand what you're saying. But that investment could be amortized over wider SIMDs or something (which they probably need to do regardless to avoid AMD completely running away with flops/mm2).
Well, if you make it wider, then you need even more decoders to be able to fill a bigger warp with useful work to do
nAo is offline  
Old 25-Sep-2009, 19:03   #2441
trinibwoy
Meh
 
Join Date: Mar 2004
Location: New York
Posts: 9,809
Default

Quote:
Originally Posted by nAo View Post
Well, if you make it wider, then you need even more decoders to be able to fill a bigger warp with useful work to do
Who said anything about bigger warps?
trinibwoy is offline  
Old 25-Sep-2009, 19:42   #2442
nAo
Nutella Nutellae
 
Join Date: Feb 2002
Location: San Francisco
Posts: 4,297
Default

Quote:
Originally Posted by trinibwoy View Post
Who said anything about bigger warps?
Not that simple, there's a reason (actually more than one..) why NVIDIA hw logical SIMD width doesn't match physical SIMD width.
nAo is offline  
Old 25-Sep-2009, 19:54   #2443
MfA
Regular
 
Join Date: Feb 2002
Posts: 5,221
Send a message via ICQ to MfA
Default

You could simply run more warps of the same program ... trade flexibility and branch granularity for less control circuitry. Are there any annotated die graphs to show how much area they could save in this way? It seems to me that even for NVIDIA it's not really an issue.
MfA is offline  
Old 25-Sep-2009, 20:00   #2444
nAo
Nutella Nutellae
 
Join Date: Feb 2002
Location: San Francisco
Posts: 4,297
Default

Quote:
Originally Posted by MfA View Post
You could simply run more warps of the same program ... trade flexibility and branch granularity for less control circuitry. Are there any annotated die graphs to show how much area they could save in this way? It seems to me that even for NVIDIA it's not really an issue.
Umh, something doesn't compute here. If it didn't pose problems, why aren't they doing it already instead of 'artificially' increasing their SIMD width? For instance, don't they need 2 clock cycles to schedule an instruction?
nAo is offline  
Old 25-Sep-2009, 20:09   #2445
MfA
Regular
 
Join Date: Feb 2002
Posts: 5,221
Send a message via ICQ to MfA
Default

They double pump the SIMD array so they can more easily deal with the latency of the instructions ... but that's a completely orthogonal issue (you'd still be doing that even if you ran multiple warps with the same instruction unit). As for why they aren't doing it already, I personally don't think it makes sense for them to do it ... as I said, you lose flexibility (if you don't have enough warps to run, part of the SIMD lies idle) and the branch granularity increases (branch paths get shared between warps).

I don't think the control circuitry necessary for 16 wide scalar SIMD vs. a wider SIMD is really what is keeping their flops/mm2 down, that's why I asked for the annotated die micrograph.
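
For the arithmetic behind those two costs, a back-of-the-envelope Python sketch. The function names are made up and the numbers (a 32-thread warp on 16 lanes, two warps per instruction unit) are only illustrative, not a statement about any shipping part.

```python
def passes_per_instruction(warp_size, lanes):
    """Clocks the SIMD array spends executing one warp-wide instruction."""
    if warp_size % lanes != 0:
        raise ValueError("warp_size must be a multiple of lanes")
    return warp_size // lanes


def branch_granularity(warp_size, warps_per_instruction_unit):
    """Threads that must all take the same path to avoid divergence."""
    return warp_size * warps_per_instruction_unit


def idle_fraction(warps_available, warps_per_instruction_unit):
    """Share of the widened SIMD left idle when too few warps are ready."""
    used = min(warps_available, warps_per_instruction_unit)
    return 1 - used / warps_per_instruction_unit


passes = passes_per_instruction(32, 16)  # logical width vs. physical width
granularity = branch_granularity(32, 2)  # two warps sharing one instruction unit
idle = idle_fraction(1, 2)               # only one of the two warps has work
```

With these made-up numbers the array takes 2 passes per instruction, branch granularity doubles from 32 to 64 threads, and half the SIMD sits idle whenever only one warp is ready.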
MfA is offline  
Old 26-Sep-2009, 00:01   #2446
Sxotty
Senior Member
 
Join Date: Dec 2002
Location: Under a Crushing Burden
Posts: 4,290
Default

Quote:
Originally Posted by MfA View Post
Why would I? You stopped addressing the arguments and just resorted to stuff like this ... I'll just sit here and gloat now
It was a joke about misunderstanding what people are saying. A situation that arises quite often in forums.

I still think you meant that PhysX becoming popular would be bad because it would become entrenched in a dominant position and lead to less innovation by pressuring others out of the GPU market. If that is not what you meant, feel free to correct my understanding.
__________________
You bought horse armor didn't you?
Sxotty is offline  
Old 26-Sep-2009, 02:07   #2447
Richard
Mord's imaginary friend
 
Join Date: Jan 2004
Location: PT, EU
Posts: 3,506
Default

Quote:
Originally Posted by DegustatoR View Post
Saying that something's better is the same as saying that something's worse. So you're essentially saying the same thing.
You need to re-read my post: saying something is worse is different from saying something sucks.

Chalnoth: wrt the wrappers. I tried a couple, unfortunately EF2000 was one of the earliest titles using GLIDE and the game ran in a weird DOS/win32 mode which isn't compatible with today's Windows. You can run the game in DOS or Windows or this weird mode. GLIDE is only supported in this last one.

Btw, I don't believe a wrapper will really be necessary, as PhysX is more abandonment-proof (that's a mouthful) since it runs, however slowly, on x86.
__________________
The optimist proclaims that we live in the best of all possible worlds, and the pessimist fears this is true. - James Branch Cabell
Richard is offline  
Old 26-Sep-2009, 02:27   #2448
trinibwoy
Meh
 
Join Date: Mar 2004
Location: New York
Posts: 9,809
Default

Quote:
Originally Posted by nAo View Post
Not that simple, there's a reason (actually more than one..) why NVIDIA hw logical SIMD width doesn't match physical SIMD width.
I'm not discounting any possibility. Who knows, maybe instruction issue now runs at the shader clock.
trinibwoy is offline  
Old 26-Sep-2009, 08:59   #2449
Davros
Darlek ******
 
Join Date: Jun 2004
Posts: 9,489
Default

@Richard EF2000 didn't work with a wrapper because the game talked directly to the card and didn't use a driver.
__________________
Guardian of the Most holy Two Terabytes of Gaming Goodness™
Davros is offline  
Old 26-Sep-2009, 14:12   #2450
-The_Mask-
Junior Member
 
Join Date: Sep 2009
Location: The Nederlands
Posts: 51
Default

According to CJ, the GT300 is faster than the HD5870, but not faster than the HD5870X2.
Link: http://translate.google.nl/translate...age%2F32639927
-The_Mask- is offline  


Tags
nvidia, speculation

All times are GMT +1.

