David Kirk of NVIDIA talks about Unified-Shader (Goto's article @ PC Watch)

D. Kirk: It's true that a Unified Shader is flexible, but it's more flexible than actually needed. It's like a 200-inch belt. A 200-inch belt fits you no matter how overweight you are, but if you're not overweight it's useless.

This comes across as a bit near-sighted - he'd better hope games don't start packing on those pounds. And I hope they are doing something with the extra transistor budget they saved by not doing a USA. Performance/watt isn't going to cut it if absolute performance isn't comparable.
 
Richthofen said:
Of course the end user cares, because this is what makes great products like the GF7600GT possible for the end user. Where is ATI's competing product to this one? There is none. Don't tell me that stupid X1800GTO. It's only leftover R520 inventory and won't come anywhere near selling in the quantities the GF7600GT does. It's not even in the same price spot.

Performance per sq mm is very important when it comes to a balanced, good and complete product lineup. That is something ATI has been lagging in since the R300 days.

Your POV would be right for a project manager in the respective company, but not for the end users. The average end user does NOT care about die size or whether the company sells parts at a loss. All the end users care about is performance/$ and, to some extent, power requirements/noise. I'm sure no one has an issue with the X1800GTO being an inventory leftover if they get good performance and nice features for little money. IMHO, of course.

And of course, regardless of all that, the company-loyal bunch wants to see their vendor's name on the card and argue with the opposing camp about the usefulness of features etc. It would be too boring otherwise, I think... :LOL:
 
_xxx_ said:
EDIT: my point being: is that die space better invested in additional logic for the US, or in a few extra "classic" pipes?
And the other point is that if there is, indeed, a requirement for significant extra logic for US, then the ratio of that overhead decreases with DX10 relative to DX9 (which is where Kirk is making the comparisons).
 
Ailuros said:
If those should be single issue ALUs, I'm not so sure I'm that excited yet.

True, but I'd assume they're what R580 calls "pixel shaders", aka what used to be called a pipeline.

48 single ALUs does not seem like enough of a power increase to me. No way.

They've probably got a ton of die size to blow, too.
 
Gateway2 said:
True, but I'd assume they're what R580 calls "pixel shaders", aka what used to be called a pipeline.

48 single ALUs does not seem like enough of a power increase to me. No way.

They've probably got a ton of die size to blow, too.

Considering how often R580 is referred to as having 48 "pixel processors", that's exactly in line with what I just said.

If those are supposed to be ALUs after all, it'll only start getting interesting if they're dual issue ALUs.
 
Dave Baumann said:
And the other point is that if there is, indeed, a requirement for significant extra logic for US, then the ratio of that overhead decreases with DX10 relative to DX9 (which is where Kirk is making the comparisons).

Sure, but it's still hard to say how big the difference will be for a particular architectural approach. Maybe it would require nV to add lots of stuff for DX10 as well as (independently of that) lots of stuff to go unified, and thus end up needing even more space in order to avoid developing everything from scratch (which I doubt they would do anyway). I just don't know how much overlap there is between the changes for DX10 _and_ the changes for unified; the overhead might well end up entirely different (for better or worse) with nV's approach.

EDIT: I get you, I just don't know how it fits in with efficiency/mm² and all the economics at nV. They too, like any sane company, try to re-use as much as possible before they start something from scratch. And then there's also the question of whether they're going all out for performance leadership or settling for being "second best", but with parts that are cheaper to manufacture and carry higher margins.
 
Richthofen said:
Performance per sq mm is very important when it comes to a balanced, good and complete product lineup. That is something ATI has been lagging in since the R300 days.
Umm, R420/R480 certainly wasn't lacking in performance per sq mm. Performance per sq mm per clock, sure, but that's a useless metric.

One thing you have to take into account is that Xenos could outperform G71/RSX by a factor of 10 in a dynamic branching shader or vertex texturing shader. These are the two hallmarks of SM3.0, I might add. ;)
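
On the per-clock point, here is a quick hypothetical illustration (a minimal sketch in Python; the frame rates, die sizes and clocks below are invented for illustration, not real figures for any part) of why dividing by clock can flip the ranking even though the clocked part is what actually ships:

# Hypothetical parts; all numbers are made up for illustration only.
parts = {
    "part_a": {"fps": 100.0, "die_mm2": 280.0, "clock_mhz": 520.0},
    "part_b": {"fps": 100.0, "die_mm2": 330.0, "clock_mhz": 430.0},
}

for name, p in parts.items():
    perf_per_mm2 = p["fps"] / p["die_mm2"]
    perf_per_mm2_per_clock = perf_per_mm2 / p["clock_mhz"]
    print(name, round(perf_per_mm2, 3), round(perf_per_mm2_per_clock, 6))

# part_a wins on perf/mm2 (0.357 vs 0.303), but part_b wins once you also divide
# by clock (0.000705 vs 0.000687). Since clock speed is part of what the design
# and process buy you, normalising it away hides a real advantage.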
 
trinibwoy said:
This comes across as a bit near-sighted - he'd better hope games don't start packing on those pounds. And I hope they are doing something with the extra transistor budget they saved by not doing a USA. Performance/watt isn't going to cut it if absolute performance isn't comparable.

Yup. Certainly NV is to be congratulated for having a performance-competitive part in G71, when it is so much smaller. Sure, they don't have all the features, but it doesn't feel like the extra features are what is taking up all (or even most) of that extra die space anyway (but then I'm counting the mostly-unused dynamic branching stuff as 'performance' rather than 'feature').

But if ATI opens a bit of a performance lead on them, I suspect the community reaction will start to shift from the current "Jaysus, why is ATI's part so big?" to the older paradigm: "Having a bigger part shows your commitment to winning the performance wars".
 
Ahh, the return of the good old "bigger is betterer"... :LOL:

"...but MY die is bigger!11!!"

EDIT: seriously, I think the increase in R520's die size, thanks to the memory controller and all the fancy stuff, was indeed an investment in the future and will help keep the die size from getting ridiculously bigger later, since that stuff will probably stay as a basis to build upon. A long-term investment, kind of. I think Orton said something along these lines in one of the conference calls as well.
 
Mariner said:
The problem with these kinds of pieces is that you just know Kirk would be saying exactly the opposite if G80 had a full US.

Just another PR-led piece spreading some FUD around which sounds good to investors.

I await the ATI rebuttal which will contain plenty of PR-led FUD about non-unified architectures! ;)

Well, he was lambasted for his assertion that a 256-bit bus was overkill for an 8 "pipeline" part. Although NV30 was a disaster, the performance of the 6600 and 7600 has clearly vindicated him.

As for it being FUD for investors, I agree that such remarks may be directed at least partially at investors. Senior executives, including Jen-Hsun, have been issuing some optimistic statements about G80 during investor presentations. However, I highly doubt they want to see a repeat of the NV30 situation, where that FUD was likely issued to keep the competition off guard (ATI had been through several cycles of bloated high-end inventory, so I think they may have been a little shy about really ramping up R300 production at first when NVIDIA had such a large share of that market historically). NVIDIA stock has had a healthy run-up, and I think they want to reassure people who are concerned about another potential NV30 fiasco with these statements. Another such fiasco (i.e. Kirk being wrong when they have a Xenos part in the lab to compare against) would destroy all long-term credibility. And I don't even think a terribly short-sighted company would make that mistake.
 
Voltron said:
Well, he was lambasted for his assertion that a 256-bit bus was overkill for an 8 "pipeline" part. Although NV30 was a disaster, the performance of the 6600 and 7600 has clearly vindicated him.
Not really. Neither of these parts is high end, nor do they have the same usage scenarios NV30/R300 had (i.e. the operations they are tasked with are much more shader bound), and they can dedicate more transistors to bandwidth saving than those older parts could. It's not like they don't use 256-bit at the high end, and they have done so since NV35.

Another such fiasco (i.e. Kirk being wrong when they have a Xenos part in the lab to compare against) would destroy all long-term credibility. And I don't even think a terribly short-sighted company would make that mistake.
Another part of the job is to make comparisons that are essentially meaningless, because they are fundamentally comparing different things, but to attempt to paint them as meaningful.
 
Dave Baumann said:
Not really. Neither of these parts is high end, nor do they have the same usage scenarios NV30/R300 had (i.e. the operations they are tasked with are much more shader bound), and they can dedicate more transistors to bandwidth saving than those older parts could. It's not like they don't use 256-bit at the high end, and they have done so since NV35.


Another part of the job is to make comparisons that are essentially meaningless, because they are fundamentally comparing different things, but to attempt to paint them as meaningful.


The general problem with NV30 was that it sucked. NV35 with 256-bit was better, but it still sucked.

Meaningless or not, NVIDIA has a very efficient architecture right now. And Xenos gives them some insight into what ATI's future products will be. It would be naive to think that NVIDIA does not have tools in the lab to make potential comparisons meaningful, at least to them, even if they have imperfect information.
 
Not really. Neither of these parts is high end, nor do they have the same usage scenarios NV30/R300 had (i.e. the operations they are tasked with are much more shader bound), and they can dedicate more transistors to bandwidth saving than those older parts could. It's not like they don't use 256-bit at the high end, and they have done so since NV35.

I don't see the relevance. Really, I don't. The only difference is that Nvidia released NV3x then and not NV4x (which we all know is not even an 8 pipeline part). I think it's a completely fair point to compare old Nvidia high end to modern Nvidia midrange, especially when you compare the NV35 to the GeForce 6600 line. In most cases the extra bandwidth doesn't provide much benefit (5900 Ultra versus a 6600GT), even in software that is not bound by shader throughput. Load up a game like UT2004 and compare it directly between a 5900 Ultra and a GeForce 6600GT, and the 6600GT will provide a better gaming experience in any circumstance where it's not limited by framebuffer capacity.
 
The general problem with NV30 was that it sucked. NV35 with 256-bit was better, but it still sucked.
Well, the reference doesn't work with NV30 because it just wasn't an 8 pipeline part in the first place. Also, bear in mind that something such as the 7600 GT has more bandwidth than a 9700 PRO does!

Meaningless or not, NVIDIA has a very efficient architecture right now. And Xenos gives them some insight into what ATI's future products will be. It would be naive to think that NVIDIA does not have tools in the lab to make potential comparisons meaningful, at least to them, even if they have imperfect information.
What's meaningful to NVIDIA isn't necessarily what's being talked about here; this is just them painting their path as the best one in light of ATI painting a different one (take a look at ATI's PR today - that message will continue to ratchet up through the press and investment communities up to and beyond DX10/R600). It's also not necessarily the case that all the details are known there either - i.e. who really knows whether we are looking at the full performance of Xenos's die or 3/4 of it?
 
Dave Baumann said:
What's meaningful to NVIDIA isn't necessarily what's being talked about here; this is just them painting their path as the best one in light of ATI painting a different one (take a look at ATI's PR today - that message will continue to ratchet up through the press and investment communities up to and beyond DX10/R600). It's also not necessarily the case that all the details are known there either - i.e. who really knows whether we are looking at the full performance of Xenos's die or 3/4 of it?

We don't know, which is why this discussion started in the first place. I was simply providing a rational argument for why these statements might turn out to have some credence to them. Of course, such statements by Kirk and other NVIDIA execs are not directed completely at investors by any stretch. I am sure they are laying the foundation for a PR counter by ATI in the event that NVIDIA launches first.
 
ChrisRay said:
I don't see the relevance. Really, I don't. The only difference is that Nvidia released NV3x then and not NV4x (which we all know is not even an 8 pipeline part). I think it's a completely fair point to compare old Nvidia high end to modern Nvidia midrange, especially when you compare the NV35 to the GeForce 6600 line. In most cases the extra bandwidth doesn't provide much benefit (5900 Ultra versus a 6600GT), even in software that is not bound by shader throughput. Load up a game like UT2004 and compare it directly between a 5900 Ultra and a GeForce 6600GT, and the 6600GT will provide a better gaming experience in any circumstance where it's not limited by framebuffer capacity.
Chris, they are different parts built for different utilisations, on different processes and with different memories - you target for all of these factors. Both NV43 and G73 use smaller processes than NV30 did, ergo they can make different design decisions based on the different costs. G73, now, uses a 128-bit bus but with 700MHz memory that's fairly common now, providing ~3GB/s more bandwidth than 9700 PRO's 256-bit bus with the 325MHz memory that was common for the time, and that's what Kirk was commenting against when he made the quote.
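
As a rough sanity check on those bus/clock figures (a minimal sketch; the effective DDR data rates below are my assumptions based on the clocks quoted above, not confirmed board specs), peak memory bandwidth is just bus width in bytes times the effective data rate:

# Peak memory bandwidth = (bus width in bytes) * (effective data rate).
# The effective rates assume DDR signalling (2x the quoted memory clock)
# and are illustrative, not confirmed specs.
def peak_bandwidth_gb_s(bus_width_bits, effective_mhz):
    return (bus_width_bits / 8) * effective_mhz * 1e6 / 1e9

g73 = peak_bandwidth_gb_s(128, 1400)   # 128-bit bus, 700MHz DDR memory
r300 = peak_bandwidth_gb_s(256, 650)   # 256-bit bus, 325MHz DDR memory
print(g73, r300, g73 - r300)
# ~22.4 vs ~20.8 GB/s with these assumed clocks; with the 9700 PRO's often-quoted
# 310MHz memory the gap widens towards the ~3GB/s mentioned above.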
 
Is this the same genius who said that HDR+AA is useless for now and that he doesn't see any reason to implement it for a while?
 