Nvidia GT300 core: Speculation

Because of the delay, and having a much bigger and more expensive GPU to produce... NVIDIA won't be able to sell its GPUs at the same price as RV870, so who will buy the more expensive card if it isn't significantly faster than the competitor's?
That's not as important, as the R&D cost is pretty high compared to the actual chip/board cost. Not having a much faster card would hurt their margins, but it wouldn't prevent it from selling well if priced adequately to the performance, and especially not if nVidia manages to differentiate its products from ATI's in meaningful ways.
 
This number is wrong.

How about... there's a less-than-50% chance they'll do a hard, meaningful launch within 3 months?

There hasn't been a notable quantity of any GT21x part; do you think they've finally fixed the leakage and will launch a top-to-bottom DX10.1 lineup before the Win7 launch?
It's been about 7 months now since GT218 and GT216-A2 were approved for production. Where are they?
 
Ah ok. That would be a LOT of bandwidth.

I wouldn't really call it a LOT if you compare the other hypothetical increases of the chip against the bandwidth increase.

Well, it's still using shader clusters with a couple of SP units, some DP units, and TMUs grouped together, with the ALUs running at a higher clock, and, unlike ATI, still with scalar ALUs. Though if it's really using MIMD, won't that actually decrease performance per die area further (at least for graphics)?

While I don't think they'll be using MIMD units, I wouldn't say that having MIMD units decreases PowerVR's perf/mm2 in the embedded market. They still have the highest performance/efficiency per square millimeter in that market.

I'm still puzzled why MIMD units are supposed to be more "expensive" than SIMD units. As a layman I must be missing something rather important here.
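To make the question a bit more concrete, here's a toy sketch in Python of the usual argument (nothing below models a real GPU; the Lane class and both step functions are invented purely for illustration): in a SIMD unit, one instruction fetch/decode feeds every lane, so that control hardware is amortized over the width, while a MIMD unit has to replicate the front end per lane.

[code]
# Toy illustration only; "Lane" and both step functions are hypothetical, not any real GPU.
# The point is where the fetch/decode (control) cost sits relative to the execution lanes.

class Lane:
    def __init__(self, program):
        self.program = program      # per-lane instruction stream (only needed for MIMD)
        self.pc = 0

    def fetch_and_decode(self):
        instr = self.program[self.pc]
        self.pc += 1
        return instr                # in hardware: its own fetch/decode/sequencing logic

    def execute(self, instr):
        pass                        # ALU work; roughly the same cost either way


def simd_step(shared_instruction, lanes):
    # One fetch + decode drives every lane, so the control logic is amortized
    # over the lane count; only the ALUs scale with the width.
    for lane in lanes:
        lane.execute(shared_instruction)


def mimd_step(lanes):
    # Every lane fetches and decodes its own instruction: the control logic
    # is replicated per lane, costing area a SIMD design spends on more ALUs.
    for lane in lanes:
        lane.execute(lane.fetch_and_decode())
[/code]

Whether that per-lane control overhead outweighs MIMD's better utilization on divergent code is exactly what's being debated here.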
 
Thanks Jawed for pointing that out again :)

Did you link that article as support for rpg's statement, or the opposite? According to the conclusion, RV770 has a much higher theoretical increase in MADD flops while GT200 has a higher increase in actual performance, indicating lower utilization on the former.

Edit: Ah, re-reading that thread, I see you disqualified a bunch of pcgh's data and came up with your own conclusion. I don't think there's anything concrete there. This comparison should get easier with OpenCL (unless people write different versions of their apps for Nvidia and AMD hardware).
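For reference, a rough back-of-the-envelope of the "theoretical increase in MADD flops" being discussed, using the commonly quoted ALU counts and shader clocks for one representative SKU per chip (treat these as approximations; GT200's co-issued MUL is ignored):

[code]
# Theoretical MADD throughput in GFLOPS: ALUs * shader clock * 2 flops per MADD.
# Clocks and ALU counts are the commonly quoted reference values and vary by SKU.

def madd_gflops(alus, shader_clock_ghz):
    return alus * shader_clock_ghz * 2

rv670 = madd_gflops(320, 0.775)   # ~496  (HD 3870)
rv770 = madd_gflops(800, 0.750)   # ~1200 (HD 4870)
g92   = madd_gflops(128, 1.625)   # ~416  (8800 GTS 512)
gt200 = madd_gflops(240, 1.296)   # ~622  (GTX 280)

print(f"AMD theoretical increase:    {rv770 / rv670:.2f}x")   # ~2.4x
print(f"Nvidia theoretical increase: {gt200 / g92:.2f}x")     # ~1.5x
[/code]

Which baseline SKU you pick shifts the Nvidia figure a bit, but the gap to the ~2.4x on the AMD side stays large, which is the asymmetry the utilization argument rests on.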

Actually, what we've done was take all the shaders from a game's scene and run them (untextured, unweighted!) through the GPU without the game's context.

What Jawed has now done with these results is take every instance of a shader being fillrate limited, or not in line with the expected performance compared to other instances of the same architecture (read: inexplicably bottlenecked), out of the equation.

Basically it's a different approach than ours, more closely resembling purely synthetic shader-tests doing mostly maths or otherwise ALU-heavy stuff, while we were focusing more (not closely or exactly by any means!) on real world situations where you tend to end up being pix-fillrate limited once in a while and so on.
auntie edit says: Might be fun to repeat this, if the rumored 32 ROPs/RBEs on Cypress turn out to be true.. :)

Now, it's surely highly debatable which approach cherry-picks more, but I'm glad Jawed put all his Excel and database skills to work to point out the other interpretation of our results!
 
I wouldn't really call it a LOT if you compare the other hypothetical increases of the chip against the bandwidth increase.
And what if you compared it against the rumored increases in bw and other parts of the chip in what appears to be the first DX11-arch to market? ;)
 
Basically it's a different approach than ours, more closely resembling purely synthetic shader-tests doing mostly maths or otherwise ALU-heavy stuff, while we were focusing more (not closely or exactly by any means!) on real world situations where you tend to end up being pix-fillrate limited once in a while and so on.

Well, I wouldn't say either approach is cherry-picking, but given the intent of the analysis I think Jawed is on the right track. If you wanted more real-world numbers, you could've left texturing etc. in as well. But if you could do a follow-up article in a few months (maybe with a few DX11 games in the mix), that would be great. :)

And what if you compared it against the rumored increases in bw and other parts of the chip in what appears to be the first DX11-arch to market? ;)

Technically it's still not a lot if you consider its competition is 2x Cypress ;)
 
Well, I wouldn't say either approach is cherry-picking, but given the intent of the analysis I think Jawed is on the right track. If you wanted more real-world numbers, you could've left texturing etc. in as well. But if you could do a follow-up article in a few months (maybe with a few DX11 games in the mix), that would be great. :)
It's way more difficult (and legally questionable) to also grab the textures in the right resolution, which was not possible for us at that time. :) But thanks to Jawed you now have both interpretations to compare and draw your own conclusions.

Technically it's still not a lot if you consider its competition is 2x Cypress ;)
If both use GDDR5 and the highest binned memory parts available, theoretical peak should be about the same. :)
 
I could imagine GT300 is a single-chip design and thus saves a bit on memory transfers, e.g. texture fetches or inter-chip communication.
auntie edit says: Plus you might want to consider that Hemlock probably has to feed 160 TUs out of its VRAM, while GT300 can probably use the same amount to keep 128 TUs happy.

I'd put it this way: if GT300 is "just enough", Hemlock's is "a bit too little" - unless of course AMD comes up with some clever new tricks for avoiding multi-GPU pitfalls.
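To put placeholder numbers on the "same bandwidth, more TUs to feed" point (everything below is hypothetical except the rumored 128 vs 160 TU counts from this thread):

[code]
# Placeholder figures for illustration only; the bus width and data rate are assumptions,
# the TU counts are the rumored numbers discussed above.

def bandwidth_gbs(bus_width_bits, data_rate_gtps):
    # GDDR5 effective data rate (GT/s) times the bus width in bytes
    return bus_width_bits / 8 * data_rate_gtps

total_bw = bandwidth_gbs(512, 4.0)   # 256 GB/s, assumed roughly equal for both cards

gt300_tus   = 128   # single chip, one pool of VRAM
hemlock_tus = 160   # two chips sharing the same aggregate figure

print(f"GB/s per TU, GT300:   {total_bw / gt300_tus:.2f}")
print(f"GB/s per TU, Hemlock: {total_bw / hemlock_tus:.2f}")
[/code]

And a dual-GPU card also duplicates texture data in each chip's VRAM, which is roughly the single-chip saving alluded to above, so the naive per-TU division arguably still flatters Hemlock.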
 
And what if you compared it against the rumored increases in bw and other parts of the chip in what appears to be the first DX11-arch to market? ;)

Do I have to tell you that bandwidth needs between different architectures are hardly comparable? :devilish:
 
The question being, who cares about PhysX when it is vendor specific?

I have to admit I'm a bit puzzled about this. Does Nvidia realize PhysX is doomed to fail? Do they actually hope it can succeed against Havok when neither AMD nor Intel supports it? Will they keep investing in it?
 
I have to admit I'm a bit puzzled about this. Does Nvidia realize PhysX is doomed to fail? Do they actually hope it can succeed against Havok when neither AMD nor Intel supports it? Will they keep investing in it?
Doesn't matter, as far as they're concerned, as long as it sells video cards for them now.
 
I also don't see how PhysX is vendor-specific: it's multi-architecture/multi-platform and seems to be proven middleware when it comes to integrating it into games. That's why it won't be doomed as soon as some people might think.

GPU acceleration of PhysX is vendor-specific, yes, and Nvidia trying to make it impossible to accelerate PhysX with anything but pure CPU power seems kinda ridiculous too.
 
Do I have to tell you that bandwidth needs between different architectures are hardly comparable? :devilish:
No more than I have to tell you that it's equally pointless to try to qualify bandwidth from raw numbers in an otherwise almost unknown architecture. :)
 
That's true, but few games do...
Well, it's not really about the number of games either, but whether or not those games that do are ones that people like.

In any case, yes, I understand the advantage of vendor-agnostic APIs. I'm just trying to put it out there that nVidia isn't completely crazy for pushing PhysX, from a business perspective. Not that that's necessarily good for the consumer.
 