ATI's Technology Marketing Manager @ TechReport

bloodbob said:
Brilinear removes the "boundary between mip map levels" artifact.

So therefore trilinear serves no greater purpose than brilinear.
ATI never said that there's no need for anything higher than brilinear. That's just your interpretation of what was said. If your interpretation were correct, ATI would never do full trilinear, but they do (at times).
 
Considering that R420 derivatives will trickle down into the lower parts of the market, doing trilinear at times must be a requirement. Don't know if people enable AF on the lowest-end parts :?
 
WaltC said:
Xmas said:
This part is clearly misleading. GeForce 6800 can also output 32 samples per clock with color data. But only 32 pixels without color data.

I still don't think it can do anything other than 32 Z or stencil *ops* per clock, with still a maximum of 16 *pixels* per clock rendered to screen. As far as pixels themselves go--pixels rendered to screen--black & white pixels are as much "color" pixels as any shades of red, green, or blue pixels. To get a maximum of 32 pixels per clock rendered to screen you'd need 32 pixel pipelines operating in parallel. But you've only got 16. Ops are not pixels--huge difference--you can't see an "op" on screen, as pixels are the smallest rendered screen elements there are.
I thought the phrase "pixels without color data" made it sufficiently clear what I meant. Call them zixels, or whatever. I only used the term "pixels" to differentiate from "samples".
ATI suggests R420 has a stencil/Z-op advantage over NV40 when multisampling is enabled. But apart from the higher clock speed, it has none.
 
This is what I found most interesting:

TR: We've heard that ATI started work on a chip code-named R400, and then decided to change direction to develop the R420. Why the mid-course correction, and what will become of the remains of the R400 project?

Nalasco: When we generate our roadmaps, we're always looking multiple years ahead, and, you know, circumstances obviously are going to change over that course of time. If you look at the development cycle for a new architecture, you're talking in the vicinity of a couple of years. One of the things that happened in our case is that we had these additional design wins or partnerships that we've developed with Nintendo and Microsoft, and that obviously requires some re-thinking of how the resources in the company are allocated to address that. So I think that's what you're really kind of seeing is that we had to make sure that we were able to continue with the roadmap that we had promised to keep producing for our desktop chips while also meeting these new demands, and we're confident that we're going to be able to do that.

Clearly, we've been able to execute with the X800, and we continue to be able to expect to execute on the same kind of schedule where we produce a new architecture every year, approximately, and a product refresh every six months. Really, the codenames are something that's used a little bit loosely early on in the design stages, but in this case, I would attribute it mostly to a rearrangement of priorities to meet the needs of our business.

Orton said something similar in his interview with Dave. This begins to look, without quite coming out and saying it, similar to Carmack's observations on the impact on NV that the original Xbox contract had. What did he say? Something like it caused NV to be "1/2 generation behind" on NV30, wasn't it? From a features pov, one could argue that R420 is "1/2 generation behind" NV40, tho clearly ATI did much better on hitting their timelines and performance goals than NV did with NV3x. I guess we'd need the alternate timeline gizmo to know definitively where ATI's flagship would be in the PC space today without those added projects.
 
geo said:
I guess we'd need the alternate timeline gizmo to know definitively where ATI's flagship would be in the PC space today without those added projects.
R400 strikes me as something that would have turned out a LOT like NV40--lots of pipelines, very low core speed, needs faster RAM, needs a better process for higher clocks, etc. Now, if R400 had been developed using 0.09 low-K, that might be different. But, right now, with .13u? Nah, it would have been an NV40.
 
The Baron said:
geo said:
I guess we'd need the alternate timeline gizmo to know definitively where ATI's flagship would be in the PC space today without those added projects.
R400 strikes me as something that would have turned out a LOT like NV40--lots of pipelines, very low core speed, needs faster RAM, needs a better process for higher clocks, etc. Now, if R400 had been developed using 0.09 low-K, that might be different. But, right now, with .13u? Nah, it would have been an NV40.

Could be. Orton is a very bright guy, and his position requires him to be fairly low-key in public. I think ATI until relatively late in the day thought they would have a clean kill on performance and were willing to live with that and give up the features war this generation. I think they were just short of shocked that NV produced a part at 220m transistors on .13u that was performance competitive at 16 pipes (versus 4 in NV's previous generation) *and* SM3.0.

I think the "wafers per die" dig and comments on not understanding yet how NV stuffed that many transistors in on .13 tend to show that. ATI was feeling pretty proud of themselves as the company that could push a process to the max (see all the love they received for R3xx on .15) and that had to hurt their pride just a bit.

Otoh, I think NV got a bit of a nasty surprise as well when ATI pulled out a 16 pipe card. I think NV4x was meant to be "JHH's Revenge" to make good on that "hallucinogenic" crack of his, and they are a bit disappointed that they didn't get a clean kill on performance either.

Actually, from innocent bystander pov, I'm loving it.
 
Very well said. It seems clear to me that ATI was surprised by NV to a certain degree, just as NV was surprised by ATI to a certain degree. I am *really* looking forward to seeing what improvements via drivers that NV and ATI have in store for us for this upcoming generation of graphics cards. Also, the low and midrange market battle will be very interesting to say the least. :)
 
Xmas said:
ATI suggests R420 has a stencil/Z-op advantage over NV40 when multisampling is enabled. But apart from the higher clock speed, it has none.
You don't think the 25% advantage in clock speed the X800 XT has over the NV40 Ultra warrants the claim?

-FUDie
 
jimmyjames123 said:
Very well said. It seems clear to me that ATI was surprised by NV to a certain degree, just as NV was surprised by ATI to a certain degree. I am *really* looking forward to seeing what improvements via drivers that NV and ATI have in store for us for this upcoming generation of graphics cards. Also, the low and midrange market battle will be very interesting to say the least. :)

Yeah, remember the rumbles that ATI was willing to cut into margins this generation by accepting lower yields in order to maintain clear performance leadership? It had to be an unpleasant moment when they found out that NV had handed them their ass on that front by 60m transistors.

I wish I knew where the "NV and ATI don't count transistors the same way" rumbles originated. I don't know it for sure, but it sniffs like the ATI camp in early denial about what happened.

I think we still have a couple of months before we find out where we really are, however. NV has richly earned the suspicion of the enthusiast community on driver hacks and cheats. Having said that, everything we think we know about the history of new graphics card generations (and, shit, it can't *all* be wrong, can it?) strongly suggests NV should have more legitimate headroom to find in NV40 than ATI will in R420, given how dissimilar NV40 is to its predecessor and how similar R420 is to its own (with a special exception for OGL if ATI is really rewriting it from scratch).

And, frankly, I don't totally trust any of these numbers until there are more cards from both IHV's in the hands of the enthusiast community to bang on, poke, prod, etc for extended periods of time (more so than that given for even a relatively thorough review of a loaner). For all the smoke and thunder, y'all really do a pretty impressive job of finding out what the IHV's would prefer you not find out.
 
*sigh* So much crap spewed in that interview, I almost don't know where to begin. Well, first of all, I guess I'll skip over the crap that was already talked about.

When we think we can produce a product with adequate performance that allows you to actually take advantage of some of these new features, then that's when we'll add the features into that product. But if you try to introduce them too early, you're basically adding additional cost to the product that's not providing a benefit to the end user, and that's just something you want to avoid.
With an attitude like this, we'd have much slower advancement of gaming technology. It's not until low-end hardware is saturated with a certain set of features that those features can be used to their fullest.

One thing to consider when you're looking at these things is, you know, it might not sound like a huge difference between 24 bits or 32 bits, but what you're basically talking about is 33% more hardware that's required. You need 33% more transistors. Your data paths have to be 33% wider. Your registers have to be 33% wider.
This is all blatantly false. FP32 math units aren't 33% bigger than FP24 math units, and the parts that you would change wouldn't take up all of the core.
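To put rough numbers on that second point (the die split below is entirely made up, just to show the shape of the argument): even if the precision-dependent datapaths did grow by the full 33%, the chip as a whole grows by far less, because texture units, ROPs, caches, video logic and so on don't care what precision the ALUs run at.

```python
# Back-of-the-envelope sketch: how much a whole GPU grows when only the
# shader datapaths widen from FP24 to FP32. The die-area split below is a
# made-up assumption for illustration, not a real R420/NV40 figure.

precision_dependent_fraction = 0.35    # hypothetical: ALU datapaths, shader registers
precision_independent_fraction = 0.65  # texture filtering, ROPs, caches, video, I/O

datapath_growth = 32 / 24              # registers/buses: exactly 33% wider

whole_chip_growth = (precision_dependent_fraction * datapath_growth
                     + precision_independent_fraction * 1.0)

print(f"datapath growth:   {datapath_growth - 1:.0%}")    # ~33%
print(f"whole-chip growth: {whole_chip_growth - 1:.0%}")  # ~12% with these assumptions
```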

So if you were instead to devote those extra transistors to increasing performance, now you're able to run those 150-instruction shader programs at a much higher speed, so that a lot of techniques that were previously not feasible to run in real time now become feasible.
There's absolutely nothing that the X800 can do that the GeForce 6800 cannot do. The difference in performance is quite small, and there's simply no algorithm in the world that you could run on the X800 that would be that much faster. So, based on this logic, nVidia definitely took the better route, as they added both performance and features, allowing for more developer freedom.

[about branching]So performance will potentially be somewhat lower, because now you have to add in the extra pass, but certainly there's that you couldn't do—that you could do only with hardware that supported it.
That's the understatement of the century. With the wrong kind of loop, performance could be absolutely abysmal without actual dynamic branching support.
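A toy cost model of why (the instruction counts and the 10% figure are arbitrary assumptions, purely to make the shape of the problem visible): without real dynamic branching, every pixel effectively pays the worst-case cost, whether it needs it or not.

```python
# Toy cost model: per-pixel shader cost with and without dynamic branching.
# The instruction counts and the fraction of pixels taking the expensive
# path are arbitrary assumptions for illustration.

pixels = 1_000_000
cheap_path = 10           # instructions for the common case
expensive_path = 150      # instructions for the rare case (e.g. a long loop)
expensive_fraction = 0.1  # only 10% of pixels actually need the long path

# Without dynamic branching: both paths are evaluated (or the loop runs its
# worst-case count) for every pixel, and the unwanted result is thrown away.
cost_no_branching = pixels * (cheap_path + expensive_path)

# With dynamic branching: each pixel only pays for the path it takes.
# (Real hardware branches per group of pixels, so an incoherent branch
# wins less than this, but the basic shape holds.)
cost_branching = pixels * (expensive_fraction * expensive_path
                           + (1 - expensive_fraction) * cheap_path)

print(f"no branching:   {cost_no_branching:,} instructions")
print(f"with branching: {cost_branching:,.0f} instructions")
print(f"speedup: {cost_no_branching / cost_branching:.1f}x")
```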

[about geometry instancing]So again, it's one of those things where there's potential performance benefits from using it, but certainly there's no sort of image quality benefit that it provides, because anything that you can do with it, you can do using Shader Model 2 shaders, and the performance won't necessarily be that much better. It just depends on the characteristics of your scene.
Geometry instancing enables entirely new types of gaming environments that have been impractical ever since 3D graphics hardware took root, because issuing a separate draw call per object swamps the CPU. This can provide huge performance improvements for situations where you have a whole lot of similar models, such as in an RTS or a game like Serious Sam.
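A conceptual sketch of where the win comes from (the draw_* functions below are hypothetical placeholders, not any real API): the savings are in per-draw-call CPU and driver overhead, so a scene full of near-identical objects goes from thousands of calls to one.

```python
# Conceptual sketch of geometry instancing. The draw functions here are
# hypothetical placeholders standing in for whatever the real API exposes;
# they are not real calls.

def draw_mesh(mesh, transform):
    """One draw call: CPU builds state, driver validates, GPU gets a tiny batch."""
    pass

def draw_mesh_instanced(mesh, transforms):
    """One draw call for the whole crowd; per-instance transforms come from a
    second stream, and the GPU repeats the mesh once per instance."""
    pass

tank_mesh = object()  # placeholder for a shared model
tank_positions = [(x, 0, z) for x in range(100) for z in range(100)]  # 10,000 units

# Without instancing: 10,000 draw calls, and the CPU/driver overhead of
# issuing them usually becomes the bottleneck long before the GPU does.
for pos in tank_positions:
    draw_mesh(tank_mesh, pos)

# With instancing: one draw call submits all 10,000 copies.
draw_mesh_instanced(tank_mesh, tank_positions)
```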

TR: Won't Shader Model 3 provide an easier-to-use programming model?

Nalasco: Well, not necessarily.
Yeah, right!!!

Come on! SM3 adds flexibility. This makes programming easier. Without that flexibility, you have to use hacks and workarounds. That's never easier than straight programming. This is just a silly argument, as he's basically arguing that since you can do more advanced things with SM3, it's going to be harder to develop for.

No, the point is that when you want to do the exact same thing on SM2 and SM3, it's either going to be identical or easier with SM3.
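A loose analogy, in Python rather than shader code (the light counts and function names are made up), of what that looks like in practice: the SM2-style route is a pile of fixed-count shader permutations, the SM3-style route is one shader with a loop.

```python
# Loose analogy (not actual shader code) for the programming-model point.

def shade(pixel, light):
    """Stand-in for the per-light lighting math."""
    return 1.0

MAX_LIGHTS = 8

# SM2-style: no dynamic loops in the pixel shader, so a common workaround is
# to bake one shader variant per light count and pick the right one at runtime.
def make_fixed_light_shader(n):
    def shader(pixel, lights):
        color = 0.0
        for i in range(n):          # effectively unrolled to exactly n lights
            color += shade(pixel, lights[i])
        return color
    return shader

sm2_variants = {n: make_fixed_light_shader(n) for n in range(1, MAX_LIGHTS + 1)}

def sm2_shade(pixel, lights):
    return sm2_variants[len(lights)](pixel, lights)  # juggle the permutations

# SM3-style: one shader, one loop, however many lights show up.
def sm3_shade(pixel, lights):
    color = 0.0
    for light in lights:
        color += shade(pixel, light)
    return color
```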

You do a post-processing effect where you effectively blur the image very slightly.... In a lot of ways it's like supersampling, where you blur the whole image just slightly to get rid of these hard edges.
Lovely. A blur effect compared to supersampling. If ATI supported gradient instructions, maybe they could have gotten rid of the aliasing the right way, by using anti-aliasing! Nobody in their right mind is going to apply a blur filter to a game.
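To make the distinction concrete, here's a minimal sketch (plain Python, a made-up render() function, no real image I/O, and certainly not ATI's actual technique): a post-process blur averages neighbouring *final* pixels, so texture detail smears along with the edges, while supersampling averages extra samples taken *inside* each pixel, so only the aliasing goes away.

```python
# Minimal sketch of the difference. render(x, y) stands in for evaluating
# the scene at an exact sample position; everything here is illustrative.

def render(x, y):
    return 1.0 if (x + y) % 7 < 3 else 0.0  # hypothetical scene function

W, H = 64, 64

# Aliased image: one sample per pixel.
image = [[render(x, y) for x in range(W)] for y in range(H)]

# Post-process blur: average each *final* pixel with its neighbours.
# Edges soften, but so does every texture and every bit of detail.
def blur(img):
    out = [[0.0] * W for _ in range(H)]
    for y in range(H):
        for x in range(W):
            acc, n = 0.0, 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if 0 <= x + dx < W and 0 <= y + dy < H:
                        acc += img[y + dy][x + dx]
                        n += 1
            out[y][x] = acc / n
    return out

# Supersampling: take extra samples *inside* each pixel and average those.
# Edges get resolved properly; interior detail is preserved.
def supersample(factor=2):
    out = [[0.0] * W for _ in range(H)]
    for y in range(H):
        for x in range(W):
            acc = 0.0
            for sy in range(factor):
                for sx in range(factor):
                    acc += render(x + (sx + 0.5) / factor,
                                  y + (sy + 0.5) / factor)
            out[y][x] = acc / factor ** 2
    return out
```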
 
Chalnoth said:
When we think we can produce a product with adequate performance that allows you to actually take advantage of some of these new features, then that's when we'll add the features into that product. But if you try to introduce them too early, you're basically adding additional cost to the product that's not providing a benefit to the end user, and that's just something you want to avoid.
With an attitude like this, we'd have much slower advancement of gaming technology. It's not until low-end hardware is saturated with a certain set of features that those features can be used to their fullest

I don't get the point of your argument, to be honest. Sure, tomorrow we could sit down and make an SM4 with features that would make movies crawl back to their cave in envy, but if such features ran at an unusable speed, they would serve little purpose.

We are just scratching the surface of what SM1 and SM2 can do, and for shaders it's like the start of the 3D days. What's holding them back the most is the performance at which they can be run, not how many arbitrary features are supported.
 
Has the jury returned its verdict on Nvidia's SM3.0 performance? Has anyone reached a conclusion that it is viable to use on the NV40?

I somewhat agree with you, Chalnoth, that developers need the hardware to progress, but it is too simplistic to ignore performance in the equation.
 
Xmas said:
I thought the phrase "pixels without color data" made it sufficiently clear what I meant. Call them zixels, or whatever. I only used the term "pixels" to differentiate from "samples".
ATI suggests R420 has a stencil/Z-op advantage over NV40 when multisampling is enabled. But apart from the higher clock speed, it has none.

Yes, but why not help eliminate the confusion? Web sites like TR conducting interviews with ATI and asking about "two pixels per clock per pipe," etc.--without apparently understanding that it's "ops" per clock instead of "pixels per clock"--is reason enough to be precise with the terminology, imo...;) I mean, it gets pretty bad when the major hardware review sites expose their ignorance of simple but fundamental topics like this.

Besides, "ops" is much simpler (not to mention much more accurate) than saying "32 black & white pixels per clock," and the term "ops" is already sufficiently distinct from "samples," so we don't need to call ops "pixels," right?...;)

Clarity and precision seem woefully absent around this topic of late. The axiom that "pixels do not equal ops" and vice-versa seems very easy to understand--which is why I find it so puzzling that the two terms are so often used interchangeably as if they were the same thing. Might as well make statements like, "R420 does 520MHz per clock of pixel pipes."...:D
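For what it's worth, the arithmetic is easy to spell out (assuming the commonly quoted stock clocks of 520MHz for the X800 XT PE and 400MHz for the 6800 Ultra, plus the pipe and op figures discussed above); an "op" rate and a "pixel" rate are simply different numbers answering different questions.

```python
# Throughput arithmetic using the figures discussed in this thread.
# The clocks are the commonly quoted stock clocks; treat them as assumptions.

X800XT_PE_CLOCK = 520e6   # Hz
GF6800U_CLOCK   = 400e6   # Hz
PIPES           = 16      # colour pixels per clock on both parts
NV40_Z_ONLY_OPS = 32      # Z/stencil-only ops per clock on NV40

print(f"X800 XT PE colour fill: {X800XT_PE_CLOCK * PIPES / 1e9:.1f} Gpixels/s")
print(f"6800 Ultra colour fill: {GF6800U_CLOCK * PIPES / 1e9:.1f} Gpixels/s")
print(f"6800 Ultra Z-only rate: {GF6800U_CLOCK * NV40_Z_ONLY_OPS / 1e9:.1f} Gops/s")
# ~8.3 vs ~6.4 Gpixels/s, and ~12.8 G Z-ops/s -- an 'op' rate is not a 'pixel' rate.
```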
 
ATI said:
When we think we can produce a product with adequate performance that allows you to actually take advantage of some of these new features, then that's when we'll add the features into that product. But if you try to introduce them too early, you're basically adding additional cost to the product that's not providing a benefit to the end user, and that's just something you want to avoid.
Chalnoth said:
With an attitude like this, we'd have much slower advancement of gaming technology. It's not until low-end hardware is saturated with a certain set of features that those features can be used to their fullest.
That's capitalism, man. It's all about "how can we make the most moolah with the smallest investment of time, effort and money" not "how can we make rabid technophiles like that Chalnoth-guy happy even though it's not economically viable."
 
shanehudson said:
That's capitalism, man. It's all about "how can we make the most moolah with the smallest investment of time, effort and money" not "how can we make rabid technophiles like that Chalnoth-guy happy even though it's not economically viable."

Remember the R300 launch and its impact.
 
shanehudson said:
That's capitalism, man. It's all about "how can we make the most moolah with the smallest investment of time, effort and money"
This is the problem. It's short-sighted. This is a major problem with western business in general, but it's really disappointing to see it play itself out here.
 
christoph said:
remember the r300 launch and its impact
You think the R300 was so much of a leap for ATI simply because of the enthusiast segment? I don't. It's always about the bottom line. Having the performance crown certainly helps, but isn't their overriding concern, as much as it is ours. They want it if it helps them sell cards, but not just a couple of hundred...
 
Chalnoth said:
This is the problem. It's short-sighted. This is a major problem with western business in general, but it's really disappointing to see it play itself out here.
Yes and no. If ATI can have larger profit by turning out a marginally better performer (as I see it) without breaking the bank, then they'll do that instead of turning out a scorcher that gives them another 2% market share but costs them 2x as much to build. When the next model is being considered, repeat the process... do what you have to do, but not so much more that it costs you a disproportionate amount.
 
geo said:
I wish I knew where the "NV and ATI don't count transistors the same way" rumbles generated from. I don't know it for sure, but it sniffs like the ATI camp in early denial on what happened.

DaveB.
 
Chalnoth, I agree with some of your points, but I think you're misunderstanding some of his.
Chalnoth said:
So if you were instead to devote those extra transistors to increasing performance, now you're able to run those 150-instruction shader programs at a much higher speed, so that a lot of techniques that were previously not feasible to run in real time now become feasible.
There's absolutely nothing that the X800 can do that the GeForce 6800 cannot do. The difference in performance is quite small, and there's simply no algorithm in the world that you could run on the X800 that would be that much faster. So, based on this logic, nVidia definitely took the better route, as they added both performance and features, allowing for more developer freedom.
I'm pretty sure he's talking about an "all else being equal" sort of statement, and with NV40, it isn't. It took ATI 60M transistors to go from 8 pipes to 16 pipes. If ATI went to 220M transistors, it's possible they could have made R420 a 24-pipe beast. Then there's power, though I don't know how much power a non-low-k, 24-pipe R420 would consume. If it's less than NV40, then you could probably increase the voltage and clock in an "all else being equal" situation.

In any case, if ATI had the same transistor budget as NV40, I'm positive you'd see situations where R420 would trounce NV40, maybe even by a factor of 2. Nalasco's argument is perfectly sound here. He's basically saying ATI would not have made a 16-pipe chip if it was to be SM3.0.
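Taking the thread's own numbers at face value (real designs don't scale this linearly, so it's a sanity check at best), the transistor budget does line up with a 24-pipe part:

```python
# Rough arithmetic using the figures tossed around in this thread; treat the
# linear scaling of pipes-to-transistors as an assumption, not a fact.

R420_TRANSISTORS = 160e6   # ~160M for 16 pipes (220M minus the ~60M gap above)
NV40_TRANSISTORS = 220e6   # ~220M for 16 pipes plus SM3.0
COST_OF_8_PIPES  = 60e6    # the "60M transistors from 8 to 16 pipes" figure above

hypothetical_24_pipe_r420 = R420_TRANSISTORS + COST_OF_8_PIPES
print(f"{hypothetical_24_pipe_r420 / 1e6:.0f}M")  # ~220M: roughly NV40's budget
```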

Chalnoth said:
[about branching]So performance will potentially be somewhat lower, because now you have to add in the extra pass, but certainly there's that you couldn't do—that you could do only with hardware that supported it.
That's the understatement of the century. With the wrong kind of loop, performance could be absolutely abysmal without actual dynamic branching support.
I'm not sure if he described that correctly. Multipassing the branch condition into another texture really doesn't do anything for you. He probably meant to say something like this:
http://www.beyond3d.com/forum/viewtopic.php?t=12889&start=8

For looping, I can't think of anything right now. I'm rather curious as to how NV40 handles dynamic loop size, actually.

Besides, he says "certainly there's that you couldn't do—that you could do only with hardware that supported it". I don't know how you interpreted this, but I see it as him saying dynamic branching support in hardware is necessary sometimes. He started this paragraph with "Dynamic branching is trickier", so it makes sense.

Other than that, your points are relatively justified, IMO. I wouldn't call it a "crap spewed" interview, though. Given that he does work for ATI, it's not bad at all, especially when you compare it with interviews and PR from NVidia during the FX era. There were mountains of shit produced back then.
 