Nvidia GTS250M Series - DirectX 10.1 Supported?

Since Shader Model 2.0, there has been no specification for "dual issue" at all. The token stream comes in and it's up to the driver's compiler to pack things optimally for the HW on which the shader will be run.

The main improvements of DX10.1 over DX10 come down to increased flexibility of the pipeline. For example, you can specify that some interpolants be computed per sample instead of per pixel, allowing for better results when using multisampling. These improvements increase the application's (and HW's) efficiency because they remove the need to multipass certain effects.
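To make the per-sample interpolation point concrete, here's a rough sketch (Python, not real shader code; the 4x MSAA sample offsets below are made up for illustration) of how the shading frequency changes:

```python
# Toy illustration of per-pixel vs per-sample attribute interpolation under 4x MSAA.
# The sample offsets are invented for this example; real MSAA patterns are HW-defined.

SAMPLE_OFFSETS = [(-0.25, -0.25), (0.25, -0.25), (-0.25, 0.25), (0.25, 0.25)]

def shade_per_pixel(attr, px, py):
    # DX10-style: interpolate once at the pixel centre and reuse the value
    # for every coverage sample.
    value = attr(px + 0.5, py + 0.5)
    return [value] * len(SAMPLE_OFFSETS)

def shade_per_sample(attr, px, py):
    # DX10.1 'sample' frequency: interpolate (and potentially shade) at each
    # sample position, so the attribute varies correctly inside the pixel.
    return [attr(px + 0.5 + dx, py + 0.5 + dy) for dx, dy in SAMPLE_OFFSETS]

if __name__ == "__main__":
    gradient = lambda x, y: x + y             # some attribute that varies across the screen
    print(shade_per_pixel(gradient, 10, 10))  # four identical values
    print(shade_per_sample(gradient, 10, 10)) # four slightly different values
```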

I'm not CGP, but isn't that basically what I said (or not)? Now they pack more things to be executed simultaneously. And after all, DX11 (multithreaded) is what DX10 was expected to be (as they claim in the other thread), if nV hadn't crapped out some DX10-grafted versions of their earlier DX9.0c parts. Well, they had their own way of packing those samples inside their core.
Meanwhile, ATi can (theoretically) shuffle all data easily between their SIMDs, and the main hit to R600 performance came from its large but insufficient global (inter-pipeline) cache. They did a pretty good job patching those cache issues with the DX10.1 superset, since all those samples no longer have to be flushed into the global cache and fragment it even further just to store some final results, which is what it was originally meant for. Did I get that right?

But in terms of GFLOPS, they use it when trying to describe GPGPU computing power, and depending on how tightly smaller fragments can be packed and processed for full-width data execution, they get more real performance, closer to that peak (nominal) figure. And they use the same amount of power (the same number of transistors), with all of them doing useful work instead of just leaking current between drain and source.
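Just to illustrate the packing idea I mean, a rough sketch (Python, with made-up SIMD width and utilization numbers, nothing vendor-specific):

```python
# Rough sketch: effective throughput vs. peak when work doesn't fill the SIMD width.
# All numbers below are invented for illustration only.

def effective_gflops(peak_gflops, simd_width, avg_active_lanes):
    # Peak throughput scaled by how many SIMD lanes are doing useful work on average.
    return peak_gflops * (avg_active_lanes / simd_width)

peak = 360.0   # nominal GFLOPS of a hypothetical part
width = 16     # lanes per SIMD issue

for active in (8, 12, 16):
    gflops = effective_gflops(peak, width, active)
    print(f"{active}/{width} lanes busy -> {gflops:.0f} GFLOPS of the {peak:.0f} peak")
```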
 
if nV hadn't crapped out some DX10-grafted versions of their earlier DX9.0c parts.
:oops: :rolleyes: There was a rather large :!: architectural & capability difference between G71 & G80...

The surprise is that they didn't add DX10.1 back with the GT200 generation, and only did so now (the 3rd refresh).
 
I'm not CGP, but isn't that basically what I said (or not)? Now they pack more things to be executed simultaneously.
I don't see how what you said makes much sense, sorry. The HW shader pipe is probably largely unchanged. See last paragraph below.
Meanwhile, ATi can (theoretically) shuffle all data easily between their SIMDs, and the main hit to R600 performance came from its large but insufficient global (inter-pipeline) cache. They did a pretty good job patching those cache issues with the DX10.1 superset, since all those samples no longer have to be flushed into the global cache and fragment it even further just to store some final results, which is what it was originally meant for. Did I get that right?
I struggle to make sense of this. DX10.1 doesn't increase compute density, which seems to be what you were implying in earlier messages. It increases efficiency, but not the way you imply.
But in terms of GFLOPS, they use it when trying to describe GPGPU computing power, and depending on how tightly smaller fragments can be packed and processed for full-width data execution, they get more real performance, closer to that peak (nominal) figure. And they use the same amount of power (the same number of transistors), with all of them doing useful work instead of just leaking current between drain and source.
What are you trying to say? I can only guess that you're trying to say that there's increased parallelism going on just because the new chips support DX10.1. I'm afraid I don't see the connection.

If these new products are achieving twice the GFLOPS/watt, then I'm sure people are referring to the same workload. In other words, if it were a DX10 workload, then the new HW would be twice as power efficient on that workload. Same with a DX9, OpenGL or CUDA workload. DX10.1 doesn't really come into the picture here.
 
I'm not CGP, but isn't that basically what I said (or not)? Now they pack more things to be executed simultaneously. And after all, DX11 (multithreaded) is what DX10 was expected to be (as they claim in the other thread), if nV hadn't crapped out some DX10-grafted versions of their earlier DX9.0c parts.
How does DX10.1 fit into this picture? DX10.1 doesn't provide any more (CPU-side) multithreading support than DX10 does. That's coming with DX11 and will be supported on down-level hardware.
 
? And how did they do that, just with the LP (low-power) process?

Well... yeah.

It's all right there in the table.

GTS250M - 360 GFLOPS, 28W TDP - 360 / 28 = 12.86 GFLOPS/W
GTX260M - 462 GFLOPS, 75W TDP - 462 / 75 = 6.16 GFLOPS/W

So the main reason for the difference is the massively lower TDP, which comes from the smaller, lower-power process and possibly some design tweaks to save power.

In terms of GFLOPS the two chips are perfectly comparable given their different numbers of SPs and shader clock speeds. DX10.1 makes no difference to this.
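For reference, the same arithmetic in one place (a quick Python snippet using the figures quoted above):

```python
# GFLOPS per watt of TDP, using the figures quoted above.
parts = {
    "GTS250M": (360, 28),   # (GFLOPS, TDP in watts)
    "GTX260M": (462, 75),
}

for name, (gflops, tdp) in parts.items():
    print(f"{name}: {gflops} / {tdp} = {gflops / tdp:.2f} GFLOPS/W")
```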
 
The GTS250M is also a lower-voltage variant, meant to find a home in real laptops rather than desktop replacements, it seems.

GTS260M - 396 GFLOPS, 38W TDP - 396 / 38 = 10.42 GFLOPS/W

I speculate Nvidia is using the highest-quality GPUs for the GTS250M, with the chips that don't qualify going into the GTS260M
(though there's probably not much difference).
 
If we look at the chart here

http://www.theinquirer.net/inquirer/news/1271809/nvidia-mobile-gpu-kimono

The new GTS250M roughly DOUBLES the GFLOPS/Watt of the previous GTX260M!

If TSMC 40nm is still leaky, then maybe some even better bins of the chip will come out later?
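A quick sanity check on the "roughly doubles" claim, using the GFLOPS-per-watt figures worked out earlier in the thread:

```python
# Ratio of GFLOPS/W between the GTS250M and the GTX260M figures quoted earlier.
gts250m = 360 / 28   # ~12.86 GFLOPS/W
gtx260m = 462 / 75   # ~6.16 GFLOPS/W
print(f"Improvement: {gts250m / gtx260m:.2f}x")   # ~2.09x, i.e. slightly better than double
```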

Dropping the voltages helps a lot...
The faster clocked GTS260M (+10%) ends up having a TDP that is 35% higher. Desktop chips have been pushing as far up the curve of diminishing returns as it has been possible to cool inexpensively, with little or no regard for power draw in and of itself. That is not an acceptable approach for devices meant for mobile use. (Some of us don't think it's a terribly good modus operandi for the desktop either.)
 