ELSA hints GT206 and GT212

Nvidia doesn't have a single competitive part. ATI is in control top to bottom for at least the next 3 months, likely 6. They have a cheaper part, so they can run NV margins to whatever point they feel like, and NV can't do squat. NV screwed their partners over for PR reasons and, in general, can't do anything dumber. Just wait, I will probably eat those words in a few days.
The thing is that ATI is not going to run NV's margins into the ground. Instead, they're going to make their own margins high. NVidia did the same thing when it raped ATI in perf/mm2 back in the R5xx vs. G7x days (and, to a lesser extent, G84/G86 vs RV630/RV610).
 
I doubt that's the right metric to use. 3DMark Vantage isn't solely SP limited - not by a long shot.

I certainly agree that 3DMark Vantage isn't solely SP limited, and that this is probably not the right metric to use.
I used this method (which is kinda simplistic) because you said that:

Something about Vantage has always given GT2xx a marked advantage over G9x

and

It may be that the shaders need more registers, thus reducing the latency hiding ability and efficiency of G9x.

I was trying to show that the SPs alone probably cannot produce this advantage.
I was also trying to show that the advantage is not the same for GT200 and GT240.

But yes, the method is kinda simplistic (but with so little data available, what can anyone do?)

Another example:
with the GT220, Nvidia managed, relative to an overclocked 9500GT (let's say at the same MHz as the GT220) with the same pixel rate, the same texel rate, the same bandwidth, and +50% shaders, to achieve only a +30% performance advantage in Vantage per MHz.

http://www.tomshardware.co.uk/geforce-gt-220,review-31703-7.html

So why can the much less powerful 550MHz GT240 GDDR5 (relative to the GTS250 per MHz: half the texel rate, 2/3 the shaders, etc.) achieve +20% speed per MHz?

I am not arguing, I just find it strange.


What I was trying to point out is that the GTX260 is 1.05x the speed of the GTS250 in 3DMark06 (using your link), but 1.37x the speed in Vantage. This jibes perfectly with the GT240 being 0.69-0.74x the speed of the GTS250 in 3DMark06 but 0.94x in Vantage.
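
To make that normalization explicit, here's a quick back-of-the-envelope sketch (the ratios are the ones quoted above; the 0.715 used for the GT240 in 3DMark06 is just the midpoint of the 0.69-0.74 range):

Code:
#include <cstdio>

int main() {
    // Ratios vs. the GTS250 (= 1.0), as quoted above.
    double gtx260_06 = 1.05, gtx260_vtg = 1.37;    // GTX260 / GTS250
    double gt240_06  = 0.715, gt240_vtg = 0.94;    // GT240 / GTS250 (0.69-0.74 midpoint)

    // Normalize the GT240 to the GTX260 in each test.
    printf("3DMark06: GT240 is %.0f%% slower than GTX260\n",
           (1.0 - gt240_06 / gtx260_06) * 100.0);    // ~32%
    printf("Vantage:  GT240 is %.0f%% slower than GTX260\n",
           (1.0 - gt240_vtg / gtx260_vtg) * 100.0);  // ~31%
    return 0;
}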

I agree about 3DMark06, I just don't agree about Vantage.

There's nothing really fishy about it (or conversely it's been fishy ever since GT200 was released).

For me, the Vantage advantage GT200 had could be explained much more easily.
This time the Vantage advantage is much bigger, so it is not so easy to explain.
 
XMAN26, can you keep your personal love notes to Charlie out of all public threads?
 
If he doesn't, a much longer vacation is in store for him. We're getting a bit tired of cleaning up all sorts of crap, irrespective of which side of the fence generates it, so perhaps more expeditious solutions are necessary. If you want to go at one another (not you Brit, people in general), do it in RPSC or somewhere else (read: another forum).
 
For me, the Vantage advantage GT200 had could be explained much more easily.
This time the Vantage advantage is much bigger, so it is not so easy to explain.
But I just proved to you that it's not, by normalizing scores to the GTX260. Basically, the GT240 is giving scores that are 32% lower than those of the GTX260 in both tests.
 
But I just proved to you that it's not, by normalizing scores to the GTX260. Basically, the GT240 is giving scores that are 32% lower than those of the GTX260 in both tests.

Yes, but this IMO doesn't show that everything is fine.
On the contrary, it shows that something is fishy.
Maybe I am missing something?

You are saying that since the performance difference between the GT240 and GTX260 is the same (around -32%) in both tests, then everything is fine.
Sorry, I just don't get it.
The two designs have different ROP/TMU/SP/bandwidth ratios, so why should the performance difference be the same?
Also, on paper the GTX260 is twice as fast as the GT240 or more (1.9x in Gpixels, 2.35x in Gtexels).

Let's see the specs:

GTX260 .................. GT240
16.1 Gpixel/s ............ 8.8 Gpixel/s
41.4 Gtexel/s ............ 17.6 Gtexel/s
112 GB/s ................. 57.6 GB/s
216 SP (1242 MHz) ........ 96 SP (1340 MHz)

The difference in these tests between the GT240 and GTX260 is only -32%.
Why?
Let's suppose we underclock the GTX260 by 32%.

uc GTX260 ................ GT240
11 Gpixel/s .............. 8.8 Gpixel/s
28.2 Gtexel/s ............ 17.6 Gtexel/s
76.2 GB/s ................ 57.6 GB/s
216 SP (845 MHz) ......... 96 SP (1340 MHz)

Why would the inferior GT240 produce the same results as the GTX260?

Sorry, I am trying to understand what you are saying, but I interpret the results differently.

I suppose the GT240 (550MHz GDDR5) will launch soon.
I am trying to say that I will be very surprised if it is only 32% slower than the GTX260 in Vantage, because I find that strange with these specs.
 
Is the GT 220 considered a Compute 1.2 or Compute 1.1 GPU? None of the reviews even touched on this subject.

Update: Never mind, I found the answer: it is Compute 1.2, probably one of the first to be widely available.

Device 0: "GeForce GT 220"
CUDA Driver Version: 2.30
CUDA Runtime Version: 2.30
CUDA Capability Major revision number: 1
CUDA Capability Minor revision number: 2
Total amount of global memory: 1073414144 bytes
Number of multiprocessors: 6
Number of cores: 48
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes
Total number of registers available per block: 16384
Warp size: 32
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 262144 bytes
Texture alignment: 256 bytes
Clock rate: 1.36 GHz
Concurrent copy and execution: Yes
Run time limit on kernels: Yes
Integrated: No
Support host page-locked memory mapping: Yes
Compute mode: Default (multiple host threads can use this device simultaneously)

Specifications for Compute Capability 1.2
Support for atomic functions operating in shared memory and atomic functions operating on 64-bit words in global memory (see Section B.10);
Support for warp vote functions (see Section B.11);
The number of registers per multiprocessor is 16384;
The maximum number of active warps per multiprocessor is 32;
The maximum number of active threads per multiprocessor is 1024.

So this proves a certain ATI troll was dead wrong in claiming the GT 220 was a shrink of G9x. It is derived from the GT200 architecture, just without the FP64 ALUs. For comparison, here's a G9x Compute 1.1 GPU; it has half the number of registers.

Device 0: "GeForce 9600 GT"
CUDA Driver Version: 2.30
CUDA Runtime Version: 2.30
CUDA Capability Major revision number: 1
CUDA Capability Minor revision number: 1
Total amount of global memory: 536543232 bytes
Number of multiprocessors: 8
Number of cores: 64
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes
Total number of registers available per block: 8192
Warp size: 32
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 262144 bytes
Texture alignment: 256 bytes
Clock rate: 1.50 GHz
Concurrent copy and execution: Yes
Run time limit on kernels: Yes
Integrated: No
Support host page-locked memory mapping: No
Compute mode: Default (multiple host threads can use this device simultaneously)
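
For anyone who wants to check their own card the same way without hunting down the deviceQuery sample, here's a minimal sketch using the CUDA runtime API; it just reprints a few of the cudaDeviceProp fields shown above:

Code:
// Build with: nvcc devprops.cu -o devprops
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaGetDeviceCount(&count);
    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, dev);
        // The GT 220 reports compute 1.2 here; G9x parts like the 9600 GT
        // report 1.1, with half the registers per block.
        printf("Device %d: \"%s\"  compute %d.%d, %d multiprocessors, "
               "%d regs/block, %.2f GHz\n",
               dev, prop.name, prop.major, prop.minor,
               prop.multiProcessorCount, prop.regsPerBlock,
               prop.clockRate / 1.0e6);
    }
    return 0;
}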
 
The two designs have different ROP/TMU/SP/bandwidth ratios, so why should the performance difference be the same?
I don't see your logic. The similarity implies that either the game workload is similar, or Vantage has parts that favour certain ratios while 3DMark06 favours others, cancelling each other out. Either way, different ratios mean nothing.

For example, suppose I take 3DMark06 and create a new benchmark called 3DMark Vintage that doubles the pixels, doubles the vertices, and doubles the CPU load. Every video card suffers a 50% drop in framerate. Does that contradict the fact that all these video cards have different design ratios? Of course not!
Let's suppose we underclock the GTX260 by 32%.
You forgot about setup speed and CPU limitations. We already have evidence in this thread that some game tests are partially CPU limited, but more importantly, 3DMark scores do not scale down as expected with pixel count due to the high vertex workload, so all your numbers have limited applicability (since all of them only affect pixel rate).

I can't find a site with the same 3DMark06 scores for the GT240 or 9800GT, but look at resolution scaling here:
http://www.tweaktown.com/reviews/1544/gigabyte_geforce_gtx_260_graphics_card/index12.html

The scores are proportional to framerate and inversely proportional to render time, so I did some linear regression on the inverse of the scores. For example, the BFG 9800 GTX OCX has a score equal to 1/(5.16e-5 + 1.20e-11 * pixels). Try it: you'll find that it's accurate to within five 3DMarks. The constant term (in this case 5.16e-5) is pretty similar for all cards, and indicates that 70-80% of the render time is independent of pixel count. Only the 1.20e-11 figure will go down with more shaders/BW/ROPs/TMUs, so they can only slightly increase your 3DMark score.
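
If anyone wants to redo that, here's a rough sketch of the idea: the helper is plain least squares on 1/score vs. pixel count (an illustration, not necessarily the exact procedure I used), and the coefficients in main() are the ones quoted above for the BFG 9800 GTX OCX:

Code:
#include <cstdio>
#include <vector>

// Least-squares fit of 1/score = a + b * pixels.
void fit_inverse(const std::vector<double>& pixels,
                 const std::vector<double>& scores,
                 double& a, double& b) {
    double n = pixels.size(), sx = 0, sy = 0, sxx = 0, sxy = 0;
    for (size_t i = 0; i < pixels.size(); ++i) {
        double x = pixels[i], y = 1.0 / scores[i];
        sx += x; sy += y; sxx += x * x; sxy += x * y;
    }
    b = (n * sxy - sx * sy) / (n * sxx - sx * sx);
    a = (sy - b * sx) / n;
}

int main() {
    // Coefficients quoted above for the BFG 9800 GTX OCX.
    const double a = 5.16e-5, b = 1.20e-11;
    const double res[][2] = { {1280, 1024}, {1680, 1050}, {1920, 1200} };
    for (const auto& r : res) {
        double px = r[0] * r[1];
        // Share of render time that does not depend on pixel count.
        double fixed_share = a / (a + b * px);
        printf("%4.0fx%4.0f: model score %5.0f, %2.0f%% pixel-independent\n",
               r[0], r[1], 1.0 / (a + b * px), fixed_share * 100.0);
    }
    return 0;
}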

Now, the vga.it168.com review seems to have more strenuous settings, and the weaker GT240 seems to spend about 45% of its 3DMark06 render time limited by anything pixel related at 1920x1200 (again, that's BW, ROPs, TMU, or ALUs), and I'll wildly guess that maybe 15% more is vertex shader limited. If you double the speed at which pixels and shaders are processed/output, then you'll only take off 30% of the render time, and you'll only get a 42% higher score.

I can't find any Vantage data at different resolutions without changing the settings (e.g. AA, AF), but a similar situation holds. If a GTX260 spends 55% of the render time limited by CPU or setup, then a GT240 (which has roughly half the BW/ROPs/ALUs/TMUs) will take 1.45x as long to render the scene, not 2x as long. Or, put another way, the GT240 is 31% slower.
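
To spell out the arithmetic behind those two estimates (the 45%/15% and 55% splits are the guesses above, not measured numbers):

Code:
#include <cstdio>

// Amdahl-style estimate: only the "scalable" share of the render time
// changes with pixel/shader throughput; the rest (CPU, setup) is fixed.
double relative_time(double scalable_share, double speedup) {
    return (1.0 - scalable_share) + scalable_share / speedup;
}

int main() {
    // 3DMark06, GT240 at 1920x1200: ~45% pixel-limited plus a guessed ~15%
    // vertex-limited = 60% scalable. Doubling pixel/shader speed:
    double t06 = relative_time(0.60, 2.0);             // 0.70x the render time
    printf("3DMark06: %.2fx the time -> %.0f%% higher score\n",
           t06, (1.0 / t06 - 1.0) * 100.0);            // ~43%

    // Vantage: GTX260 assumed ~55% CPU/setup limited; a GT240 with roughly
    // half the BW/ROPs/ALUs/TMUs runs the remaining 45% at half speed:
    double tvtg = relative_time(0.45, 0.5);            // 1.45x the render time
    printf("Vantage: %.2fx the time -> %.0f%% slower\n",
           tvtg, (1.0 - 1.0 / tvtg) * 100.0);          // ~31%
    return 0;
}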

Let's see the specs:

GTX260 .................. GT240
16.1 Gpixel/s ............ 8.8 Gpixel/s
41.4 Gtexel/s ............ 17.6 Gtexel/s
112 GB/s ................. 57.6 GB/s
216 SP (1242 MHz) ........ 96 SP (1340 MHz)
(Hence my "roughly half" estimate during pixel limited parts of the workload.)
 
The naming is a shame (a GDDR3 version should be called "GT230"), and that cooler looks small.

Give me a GDDR5 model with a dual-slot cooler and I won't complain.
I could actually make use of that card if it's cheap enough. It doesn't use much power, apparently less than a Radeon 4770 (which compares roughly to a 9600GT).

It would make a great energy-efficient card (I don't mind gaming at 1024x768 with high IQ anyway).
 
It doesn't use much power, apparently less than a Radeon 4770 (which compares roughly to a 9600GT).

Ehh... I see no 4770 numbers there, but from other reviews (where system power in FurMark is 30-40W below a 4850) it looks like it's only 5-10W less in both load and idle, while the 4770 would be around 40% faster (the 4850 is generally 40-50% faster here, and the 4770 is only a few percent behind it).
 
The naming is a shame (a GDDR3 version should be called "GT230"), and that cooler looks small.
If those benchmarks we saw are any indication, the performance difference between GDDR5 and "SDDR3" (I wonder if this actually is GDDR3 and not just DDR3) isn't too big. But I won't disagree that there should still be a different name.

Give me a GDDR5 model with a dual-slot cooler and I won't complain.
They tested the GDDR5 version. I think you won't see dual-slot coolers with this class of chips. Generally unnecessary and just driving costs up. Doesn't mean you couldn't fit one yourself, of course.

I could actually make use of that card if it's cheap enough. It doesn't use much power, apparently less than a Radeon 4770 (which compares roughly to a 9600GT).

It would make a great energy-efficient card (I don't mind gaming at 1024x768 with high IQ anyway).
Well, it only seems to use a little less power than the HD4770, while losing quite badly in performance. Still, I think it's an OK card. No external power connector, and a PCB that looks very simple (and hence cheap). So as long as it's actually sold cheaper than the HD4770 it should find its place. At least it beats the HD4670 (it would have been a disaster if not), though I suspect the DDR3 version will have to fight pretty hard to beat Redwood-based cards (and those should be cheaper, use even less power, have more features, etc.).
 
Gigabyte makes a dual-slot GT220, which looks good except it has half the specs at maybe 80% of the price. Shader clocks are pretty high; the GT240 probably has headroom.

Right now I'm running an overclocked 8400GS and was quite surprised; it's faster than one might think.
 
Hmmm, that's impressive efficiency-wise given the lower clocks and unit counts. Too bad the price is too high and absolute performance is too low.
What lower unit counts? It's got 96 shader units (granted, clocked slightly slower) instead of 64, and the same number of texture units. It has (maybe) only half the ROPs and a lower core clock too, but I'd say performance is right where you'd expect it compared to the 9600GT. And certainly on a perf-per-area metric it's less than stellar (a shrunk G94 should be smaller in theory).
 
What lower unit counts? It's got 96 shader units (granted, clocked slightly slower) instead of 64, and the same number of texture units. It has (maybe) only half the ROPs and a lower core clock too, but I'd say performance is right where you'd expect it compared to the 9600GT. And certainly on a perf-per-area metric it's less than stellar (a shrunk G94 should be smaller in theory).

Yeah, you're right, brain fart. Don't know why I thought it only had 16 TMUs and not 32. Never mind then, not so impressive at all. Not sure it makes sense to compare its die size to a best-case G94 shrink though...
 
Actually, thinking about this, the (G)DDR3 versions might already have to fight pretty hard with the HD4670 if they lose 20% or so performance with the slower RAM. That wouldn't be so good; RV730 is very similar in size but uses 55nm...
 
Halloween launch party for a "franken card" with a mix of GT21x GPUs?
http://nvnews.net/vbulletin/showthread.php?t=140298
Maybe GT215 SLI/GT200 with GT216 for PhysX on one PCB?

edit:
EVGA teaser: http://www.evga.com/articles/00512/


The shape of the mysterious card matches a single-slot GTX 295: http://www.evga.com/PRODUCTS/IMAGES/GALLERY/017-P3-1295-AR_XL_4.jpg
 