rpg.314 said: Corporate schizophrenia?

How about 'irrelevant'? :wink:
I mean: only a very narrow audience is going to be pedantic enough to care and they'll account for 0.1% of the actual customers. The broad trend lines are more important.
The first GPU with ECC (a requirement for serious HPC) came to market less than 18 months ago and already has a strong presence in the fastest computers in the world. That's a very fast uptake by any measure.
Well whatever it is, they're currently making less from their half of the market than nVidia. At first blush it would seem they're trading margin for marketshare.
Anyhow, my point is that Tegra and Tesla successfully invaded two new markets previously unavailable to nVidia. They seem to know what they're doing and if they claim Kepler is 3x as power efficient as Fermi I'll take their word on it unless proven otherwise.
Still, 'strong presence' sounds like an overstatement, to say the least.
Besides, they're pretty much alone at this point: AMD doesn't really have a credible GPU computing offering, at least until Southern Islands comes out, at which point the competitive situation should make market penetration measurements more meaningful. Ditto for Intel's Knights products.
Again, Quadros! I doubt NVIDIA is making significantly (if at all) more than AMD on consumer products, but AMD's FirePro sales are so small that they're practically irrelevant. Quadros, not so much.
That claim about Kepler was originally made in September 2010, most likely long before they had any silicon in hand, so that was an estimate based on simulations. From Fermi, we all know how reliable those can be, right?
Still, I'm not saying they won't meet this 3× improvement target for performance/W, but I'm not convinced it will be met for the same 300W budget as Fermi. In other words I don't think Kepler will be three times as fast as the GTX 580 or even 480.
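For what it's worth, here's the back-of-envelope version of that point: a 3x performance/W gain only turns into 3x absolute performance if the power budget stays the same. The numbers below (board power, normalized performance) are illustrative assumptions on my part, not anything NVIDIA has stated.

```python
# Sketch: 3x perf/W only becomes 3x absolute performance at an unchanged power budget.
# All figures are illustrative assumptions.

fermi_power_w = 250.0          # assumed Fermi-class board power
fermi_perf = 1.0               # normalized GTX 580-class performance
fermi_perf_per_watt = fermi_perf / fermi_power_w

kepler_perf_per_watt = 3.0 * fermi_perf_per_watt   # the claimed 3x efficiency gain

for kepler_power_w in (300.0, 250.0, 200.0):       # possible Kepler power budgets
    kepler_perf = kepler_perf_per_watt * kepler_power_w
    print(f"{kepler_power_w:.0f} W budget -> {kepler_perf / fermi_perf:.1f}x GTX 580-class performance")
```

At an assumed 250 W on both sides you get the full 3x; cut the budget to 200 W and the same efficiency gain only buys about 2.4x.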
What I find amusing is that Fermi is supposedly the most broken and inefficient architecture ever, yet there's so much skepticism that nVidia can improve it significantly. The only conclusion is that their engineers are simply incompetent, no?

GF100 was broken, but that had nothing to do with the high-level architecture. Is Fermi on the whole broken or inefficient? Well, I don't think it's quite the most efficient design; some things are most likely a bit overkill (like the 4 raster engines on GF100/GF110), and I'm not quite sure the fully distributed vertex processing was worth it either (all things told, 4 of these PolyMorph Engines don't really appear to be faster than one of AMD's, but since we have no idea (well, I don't) how these solutions compare in transistor count or die area, it's hard to tell whether it's really inefficient).
Edit: Looks like I looked at the numbers a bit wrong. 5.6% is the percentage of total systems that are DX11-capable. There are many more systems that use DX11 GPUs but an older OS (roughly 3-4 times as many as those who have both DX11 and Windows Vista/7!). However, DX11 isn't everything, and nVidia still has massive overall marketshare.
Nvidia said distributed geometry and raster grew the chip by 10% compared to a traditional solution.

Yes, but it's not exactly clear what the comparison point here really is; it could be something very slow, at least for tessellation.
WRT the advantages (or the lack thereof) compared to Cayman, keep in mind that every Fermi variant except the Quadros has an artificial limit of 1 triangle per clock if the triangle is actually drawn. Only triangle culling runs at full speed.

IIRC that's only really true for non-tessellated triangles.
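Rough arithmetic for what that cap means, assuming a GTX 580-class core clock of about 772 MHz and 4 setup/raster engines (both figures approximate, and the 1 triangle/clock drawn limit taken as described above):

```python
# Peak triangle rates under the assumptions stated in the lead-in.
core_clock_hz = 772e6        # GTX 580-class core clock (approximate)
setup_tris_per_clock = 4     # four raster/setup engines, best case
drawn_tris_per_clock = 1     # assumed artificial cap for non-Quadro parts

print(f"setup/cull peak: {setup_tris_per_clock * core_clock_hz / 1e9:.2f} Gtri/s")
print(f"drawn peak:      {drawn_tris_per_clock * core_clock_hz / 1e9:.2f} Gtri/s")
```

In other words, roughly 3 Gtri/s for culling versus under 0.8 Gtri/s for triangles that actually get drawn, if those assumptions hold.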
Look at the number of recent entrants to the top 10 and count how many are GPU-powered. Looking at the list of established (and old) members of the list isn't very useful, unless you're going to great lengths to downplay the emergence of GPUs.
I'm not sure what you mean. If AMD GPUs and Knights Ferry are successful at HPC, it would only further validate nVidia's claims. I'm actually very interested in AMD's strategy, to see how well they can leverage OpenCL in any attempt to gain a foothold here. OpenCL is still lagging behind in key areas.
We can guess till the cows come home. Or we can accept the fact that nVidia made $140M more than AMD's graphics division last quarter. It's hard to believe the consumer segment lost money with that large a gap. I'd need to see numbers to change my mind.
Right. There was also a more recent quote about Kepler hitting 5 GFLOPS/W. Not impressive at all, considering the M2090 is already at roughly 3 GFLOPS/W.
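Quick sanity check on that comparison, using the commonly quoted M2090 peak DP rate and board power (treat both as approximate):

```python
# M2090 efficiency vs. the quoted 5 GFLOPS/W target; M2090 figures are approximate.
m2090_gflops_dp = 665.0      # commonly quoted peak double-precision rate
m2090_power_w = 225.0        # commonly quoted board power
m2090_eff = m2090_gflops_dp / m2090_power_w      # ~2.96 GFLOPS/W

kepler_target_eff = 5.0                          # the quoted 5 GFLOPS/W figure
print(f"M2090: {m2090_eff:.2f} GFLOPS/W; 5 GFLOPS/W target = {kepler_target_eff / m2090_eff:.2f}x that")
```

So the 5 GFLOPS/W figure works out to only about 1.7x the M2090, well short of 3x.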
Agreed that nobody should expect 3x absolute performance from Kepler. They need to throw in lower power consumption as part of the deal.
Since area efficiency matters a lot less for Tesla than it does for GeForce margins (unless they make a dedicated chip for GPGPU), a very good solution would be to power gate (rather than simply clock gate) as much graphics-centric hardware as possible. There is a small area cost to power gating, but if done properly it could even help graphics power efficiency (e.g. workloads with a very low amount of texturing, or very long shaders that let the ROPs idle a lot). You'd probably want to be able to shut these blocks down for hundreds of cycles for it to really be worth it, which is very easy for compute but hard for graphics given the size of GPU buffers; still, there should definitely be cases where this can be done at no risk (see the rough break-even sketch below). Removing fixed-function hardware (or at least decreasing the FF/GP ratio) would also help.
I'm pretty darn sure Fermi doesn't do this, and I have no idea at all about Kepler, but it would be interesting.
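Here's the rough break-even sketch I mean. Power gating only saves leakage while the block is idle, but entering and leaving the gated state costs some energy, so it pays off only past a certain idle length. Every constant below is a placeholder assumption of mine, purely to show why "hundreds of cycles" is the interesting regime.

```python
# Break-even model for power gating an idle block (e.g. TMUs or ROPs).
# All constants are placeholder assumptions, not measured values.

leakage_w = 2.0              # assumed leakage power of the gated block
clock_hz = 700e6             # assumed core clock
gate_overhead_j = 1e-6       # assumed energy cost to power the block down and back up

leakage_per_cycle_j = leakage_w / clock_hz
break_even_cycles = gate_overhead_j / leakage_per_cycle_j
print(f"gating pays off after roughly {break_even_cycles:.0f} idle cycles")
```

With these made-up numbers the break-even point lands around a few hundred cycles, which is exactly why long compute kernels are the easy case and bursty graphics work is the hard one.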
TMUs are exposed in CUDA.

Yes, but very few programs use them (and even fewer will if there's a clear power penalty to doing so), and this is known at compile time, so it's easy to power gate the TMUs only if all running shaders are known not to be using them. Of course, as I said, it'd be even better to be able to dynamically power gate them off even for graphics if the workload isn't using them much, if at all.
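To make the compile-time idea concrete, here's a sketch of the decision a driver or scheduler could take; the kernel names and the uses_textures flag are made up for illustration, not any real CUDA interface.

```python
# Sketch: gate the TMUs only while no resident kernel contains texture instructions.
from dataclasses import dataclass

@dataclass
class Kernel:
    name: str
    uses_textures: bool  # in a real driver this flag would come from the compiler

def tmus_can_be_gated(resident_kernels):
    # Safe to gate only if every currently resident kernel avoids texture instructions.
    return all(not k.uses_textures for k in resident_kernels)

resident = [Kernel("nbody_step", False), Kernel("reduce_sum", False)]
print(tmus_can_be_gated(resident))                                      # True  -> gate TMUs
print(tmus_can_be_gated(resident + [Kernel("sample_volume", True)]))    # False -> keep them powered
```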
I don't think power gating the ROPs would be a problem even if they're tightly coupled - they're essentially still a separate block likely designed by a separate engineering team and I very much doubt CUDA workloads use them at all.
Yes but very few programs use them (and even fewer will if there's a clear power penalty to doing so)

If your application matches texture filtering, there's no way in hell you wouldn't use it. The power cost of doing it on the ALUs is way more. And nobody except handheld devs looks at the power consumption of their code, so that doesn't apply to Kepler anyway.
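As a rough illustration of why ALU-side filtering costs so much more: a single bilinear sample done "by hand" needs four loads plus address math and three lerps, all of which a TMU performs with one texture instruction. Plain Python, just to make the per-sample work explicit:

```python
# Manual bilinear filtering: the work a shader would redo per sample without a TMU.
def bilinear(img, u, v):
    # img: 2D list of floats; u, v: texel-space coordinates
    x0, y0 = int(u), int(v)
    fx, fy = u - x0, v - y0
    t00, t10 = img[y0][x0], img[y0][x0 + 1]        # four neighbouring texel loads
    t01, t11 = img[y0 + 1][x0], img[y0 + 1][x0 + 1]
    top = t00 + (t10 - t00) * fx                   # three lerps
    bottom = t01 + (t11 - t01) * fx
    return top + (bottom - top) * fy

tex = [[0.0, 1.0],
       [1.0, 0.0]]
print(bilinear(tex, 0.5, 0.5))  # 0.5
```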
J Random Dev said: Gee, using a debugger and a profiler is hard enough. You want us to use a power meter too?
They are just crazy: NVidia's still lying about the introduction date of Fermi (beginning of 2009 on that cute graph, well over a year earlier than reality). Why the hell would anyone take anything else NVidia says seriously?
Alexko said: You can't deny the fact that Quadros make a lot of money either. I don't know whether the consumer segment made a profit, but if so it can't have been much.

What leads you to that conclusion?
How do you define profit in the first place?
The gross profit is easy: at least equal to, and probably higher than, Quadro's. (Use your 75% GM figure and run the numbers; see the rough illustration below.)
After that, it's all bookkeeping on how you divide NRE costs between GF and Quadro, but those costs are likely to be equal or higher for Quadro than for GF.
If you think there is a way to make GF 'hardly profitable', I'd like to see it, because I don't.
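A worked version of "run the numbers", with purely hypothetical revenue figures (none of these are actual NVIDIA results): even at a much lower gross margin, the larger GeForce revenue base yields gross profit at least in Quadro's league.

```python
# Hypothetical quarterly figures, only to illustrate the gross-profit argument above.
quadro_revenue_m = 200.0   # assumed Quadro revenue, $M (hypothetical)
quadro_gm = 0.75           # the 75% gross margin figure quoted above

geforce_revenue_m = 700.0  # assumed GeForce revenue, $M (hypothetical)
geforce_gm = 0.35          # assumed, much lower, consumer gross margin

print(f"Quadro gross profit:  ${quadro_revenue_m * quadro_gm:.0f}M")
print(f"GeForce gross profit: ${geforce_revenue_m * geforce_gm:.0f}M")
```

With these placeholder numbers GeForce still shows the larger gross profit; how the NRE gets split between the two lines is then the only real argument left.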