Nvidia GT300 core: Speculation

A while back I googled some numbers comparing Tesla and GeForce TDPs.
For a GT200 part, it was something like 160W versus 236W.

I don't know how accurate that is, but it would put the contribution of the graphics-specific portions at roughly a third, of which the texture units would be a part.
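A quick back-of-the-envelope version of that estimate, taking the quoted figures at face value (they may well be off):

```python
# Taking the quoted board TDPs at face value (Tesla ~160 W, GeForce ~236 W),
# estimate what share of board power the graphics-specific parts might be.
tesla_tdp = 160.0    # W, figure quoted above
geforce_tdp = 236.0  # W, figure quoted above

graphics_specific = geforce_tdp - tesla_tdp
fraction = graphics_specific / geforce_tdp
print(f"Graphics-specific share: {graphics_specific:.0f} W "
      f"(~{fraction:.0%} of the GeForce TDP)")
# ~32%, i.e. roughly the "third" estimated above.
```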
 
For a Tesla, why would they disable the texturing? The rasterizer, triangle setup, z-test hardware and other fixed-function stuff would be disabled.
 
My bad, fuzzy thinking.
I was lumping the entire range of texture functionality into that, not just some of the more esoteric graphics-related functions.
 
I never said that.
Increased utilisation of a production line increases margin. No magic required. A high-yielding line intrinsically costs less to run, simply in terms of reduced effort in fixing yield problems. Again, no magic required. Further, a high-yielding line is one that will attract customers. Again, increasing margin. Again, no magic required.
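To make the utilisation point concrete, here's a toy model; every number in it is invented purely for illustration and reflects no real foundry's costs:

```python
# Toy model: how line utilisation feeds straight into gross margin.
# All figures below are invented for illustration only.
def gross_margin(utilisation, capacity=10_000, fixed_cost=30_000_000,
                 variable_cost_per_wafer=2_000, price_per_wafer=8_000):
    """Gross margin of a production line at a given utilisation (0..1)."""
    wafers = capacity * utilisation
    revenue = wafers * price_per_wafer
    cost = fixed_cost + wafers * variable_cost_per_wafer
    return (revenue - cost) / revenue

for u in (0.6, 0.8, 0.95):
    print(f"utilisation {u:.0%}: gross margin {gross_margin(u):.1%}")
# Fixed costs get spread over more wafer starts as utilisation rises,
# so margin improves with nothing else changing -- no magic required.
```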

Jawed
 
Yet those are far more indirect and incidental than what Arun is proposing. If you can make those leaps then why can't you accept the simpler, more direct scenario as well?
 
Teslas use a lower voltage, I believe (as on the GTX 295). They use good chips at a great perf/watt, whereas gaming cards focus either on performance (high power use) or value (conservative voltage).

As for the old fixed-function stuff, maybe it's not disabled but just sits there unused, I don't know.
At least NVIO is not there on the card.

Now that I think of it, a Tesla card has lower-frequency, lower-voltage memory too.
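Rough sketch of why the lower voltage matters so much, using the usual dynamic-power relation; the voltage and clock numbers are made up for illustration, not actual GT200 board specs:

```python
# Dynamic power scales roughly as P ~ C * V^2 * f, so a modest voltage drop
# pays off quadratically. Voltages/clocks below are illustrative guesses only.
def relative_dynamic_power(v, f, v_ref, f_ref):
    """Dynamic power relative to a reference operating point."""
    return (v / v_ref) ** 2 * (f / f_ref)

geforce = dict(v=1.18, f=602e6)   # assumed gaming-card operating point
tesla   = dict(v=1.05, f=602e6)   # assumed lower-voltage Tesla point, same clock

ratio = relative_dynamic_power(tesla["v"], tesla["f"], geforce["v"], geforce["f"])
print(f"Tesla-style point draws ~{ratio:.0%} of the GeForce-style dynamic power")
# ~79% from the core-voltage drop alone; lower-clocked, lower-voltage memory
# widens the board-level gap further.
```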
 
Maybe it just sits there unused and is not disabled. But disabling/shutting it down in the driver would be low-hanging fruit from a perf/W perspective.
 
What's your reasoning about 1GHz+ TMUs on GT300? A shared clock domain with the shader cores? What else?

I can't know exactly what he means, but on first look it doesn't sound (let's call it a theory for now; we'll bitchslap or congratulate him later) like he means the TMUs would run at the ALU frequency. I wouldn't be very surprised if the entire frequency management of D12U were quite a step ahead of what we saw in the G8x/9x/2xx generation. My first question would be whether any of the frequencies are even "fixed" in a strict sense, or get set dynamically according to demand.
 
I can't know exactly what he means, but on first look it doesn't sound (let's call it a theory for now; we'll bitchslap or congratulate him later) like he means the TMUs would run at the ALU frequency. I wouldn't be very surprised if the entire frequency management of D12U were quite a step ahead of what we saw in the G8x/9x/2xx generation. My first question would be whether any of the frequencies are even "fixed" in a strict sense, or get set dynamically according to demand.

The reason for NV to have multiple frequencies is that certain parts of the chip cannot be clocked high enough.

I'm not even sure what sort of frequency management was in prior chips, but there's definitely a lot of interesting stuff they could do (but probably won't).

If you look at what Intel's Fort Collins team was working on, or some of the stuff done by Blaauw's group at Michigan, it's all pretty damn cool.

Dynamic frequency and voltage is the way to go, but it sure can create some massive, massive validation and verification headaches.
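Purely as an illustration of what demand-driven DVFS looks like conceptually (the operating points and thresholds below are invented, not anything NVIDIA is known to use):

```python
# Illustrative sketch of a demand-driven DVFS policy; invented numbers only.
OPERATING_POINTS = [   # (core voltage in V, clock in MHz), assumed table
    (0.90, 300),
    (1.00, 600),
    (1.15, 1000),
]

def pick_operating_point(utilisation):
    """Step up to a higher V/f point as measured utilisation rises."""
    if utilisation < 0.3:
        return OPERATING_POINTS[0]
    if utilisation < 0.8:
        return OPERATING_POINTS[1]
    return OPERATING_POINTS[2]

for load in (0.1, 0.5, 0.9):
    volts, mhz = pick_operating_point(load)
    print(f"load {load:.0%} -> {mhz} MHz @ {volts:.2f} V")
# The validation pain: every (V, f) pair, and every transition between them,
# has to be verified across process and temperature corners.
```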

David
 
Jawed,
Are you talking about the company's gross margins, or about individual product margins, with a product being a single wafer (or some amount of wafers)? Maybe that's where your disagreement with Arun comes from?


dkanter,
Dynamic frequency and voltage is the way to go, but it sure can create some massive, massive validation and verification headaches.
That's definitely worrying given the ridiculously short life cycles of today's GPUs (I'd expect you'd have to re-validate even for a pure optical shrink if you want to utilize lower voltages and so on).
 
The reason for NV to have multiple frequencies is that certain parts of the chip cannot be clocked high enough.

His suggestion of 1.0GHz+ doesn't necessarily mean that the TMUs would run anywhere close to ALU frequencies; the 8800 GTX had its ALUs clocked at 1.35GHz (see the quick ratio check at the end of this post).

I'm not even sure what sort of frequency management was in prior chips, but there's definitely a lot of interesting stuff they could do (but probably won't).

Why probably won't they? If the rumors end up being true, D12U has the complexity the so-far-hypothetical details suggest, and it ends up consuming more or less what a 285 does today, then chances are high that its frequency management is far more sophisticated than today's.

Dynamic frequency and voltage is the way to go, but it sure can create some massive, massive validation and verification headaches.

Rumors also point to NVIO for the high-end SKU.
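The quick ratio check on the G80 clocks mentioned above (the GT300 extrapolation is pure speculation):

```python
# On G80 (8800 GTX) the TMUs ran at the core clock while the ALUs ran much
# faster, so a 1GHz+ TMU clock does not imply TMUs sharing the ALU domain.
core_clock_mhz = 575    # 8800 GTX core/TMU clock
alu_clock_mhz = 1350    # 8800 GTX shader (ALU) clock

ratio = alu_clock_mhz / core_clock_mhz
print(f"G80 ALU:TMU clock ratio ~{ratio:.2f}:1")

# Pure speculation: a GT300-class part keeping a similar ratio with 1.0GHz
# TMUs would put the ALUs somewhere around this clock:
print(f"hypothetical ALU clock at the same ratio: ~{1000 * ratio:.0f} MHz")
```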
 
So, how long will it take NVidia to get usable drivers for GT300? If NVidia tries to launch the card around the W7 launch, it would have had working chips for a few weeks at most.

AMD's seemingly had working chips since May, or maybe earlier:

http://forum.beyond3d.com/showpost.php?p=1292190&postcount=613

The more radical the changes in GT300, the worse the launch drivers will be, I guess, if NVidia tries to do a rapid launch.

Jawed
 
I know next to nothing about DRAM and memory controller implementations, but... would it be possible to connect NVIO using pins that the Tesla version of the card uses for ECC? Alternatively, could pins used for ECC by Tesla instead be used to increase bandwidth on consumer cards?
 
ECC or parity for in-die structures might appear for critical parts of the chip.

I wonder about the granularity of the memory buses used by GPUs.
A 64-bit DRAM bus for CPUs is 72 bits with ECC.

What bog-standard chip can be put on there that is narrow enough, or will ECC mode force multiple channels to be ganged together?
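For reference, the 64-to-72-bit figure is just standard SECDED arithmetic, nothing GPU-specific:

```python
# SECDED (single-error-correct, double-error-detect) needs r check bits such
# that 2**r >= data_bits + r + 1, plus one extra bit for double-error detection.
def secded_check_bits(data_bits):
    r = 0
    while 2 ** r < data_bits + r + 1:
        r += 1
    return r + 1  # +1 overall parity bit for the DED part

data = 64
check = secded_check_bits(data)
print(f"{data} data bits need {check} check bits -> {data + check}-bit bus")
# 64 data bits -> 8 check bits -> a 72-bit bus (12.5% width overhead), easy
# to build from x4/x8 DRAMs but awkward with x32 GDDR devices.
```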
 
ECC or parity for in-die structures might appear for critical parts of the chip.

I wonder about the granularity of the memory buses used by GPUs.
A 64-bit DRAM bus for CPUs is 72 bits with ECC.

What bog-standard chip can be put on there that is narrow enough, or will ECC mode force multiple channels to be ganged together?

Right now GDDR is available in x32 and x16 interfaces, which don't really work for ECC.

One option, depending on bandwidth requirements, is x8 DDR3 at boosted performance levels.

Ideally they would want x36 devices, which probably won't happen because they would be extremely niche, so the most likely path is x8 DDR3 and taking the ~50% bandwidth hit.
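A rough sketch of that trade-off; the per-pin data rates are assumptions for illustration, not product specs:

```python
# Rough bandwidth comparison for a 512-bit memory subsystem.
# Per-pin rates below are assumed values, not actual device specs.
BUS_WIDTH_BITS = 512

def bandwidth_gbs(per_pin_gbps, width_bits=BUS_WIDTH_BITS):
    """Aggregate bandwidth in GB/s for a given per-pin data rate."""
    return per_pin_gbps * width_bits / 8

gddr5 = bandwidth_gbs(4.0)   # x32 GDDR5 at an assumed 4.0 Gbps/pin
ddr3  = bandwidth_gbs(2.0)   # "boosted" x8 DDR3 at an assumed 2.0 Gbps/pin

print(f"GDDR5: ~{gddr5:.0f} GB/s, DDR3: ~{ddr3:.0f} GB/s "
      f"(~{1 - ddr3 / gddr5:.0%} bandwidth hit)")
# With x8 devices, the 8 ECC bits per 64-bit channel are just one extra chip
# per channel; with x32 GDDR there is no clean way to bolt on the extra 1/8th.
```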
 