Short interview with Eric Demers (sireric, ATI) on 3DCenter

AA sure does take a lot of bandwidth.

"At 1600x1200 with 6xAA, our buffer consumption is over 100 MBs of local memory space."

Is there a way that you could cache the data that you need for AA on chip and not write it out to local memory?

You could go back to video cards with 64MB memory.
 
Whoops, dig, I was referring to geo's post, but I guess I was looking at your quote in his post when I posted without a quote.

Er, you can see how I got confused amongst all the quotes and posts, what with all the posting of the quoting (nice lady!). :D
 
rwolf said:
AA sure does take a lot of bandwidth.

"At 1600x1200 with 6xAA, our buffer consumption is over 100 MBs of local memory space."

That's memory consumption, not bandwidth.
Normally most of that memory won't be touched during rendering due to color/Z buffer compression, but it has to be reserved for the worst case.
This is the price for lossless compression.
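
Just to put a rough number on that figure, here's a quick back-of-the-envelope sketch (the 32-bit color and 32-bit Z/stencil per sample and the extra resolved/front buffers are my own assumptions, not something from the interview):

```python
# Back-of-the-envelope estimate of the buffers 6xAA at 1600x1200 has to
# reserve; formats and buffer layout are assumed, not ATI's documented ones.
width, height, samples = 1600, 1200, 6
bytes_color, bytes_z = 4, 4  # assumed 32-bit color + 32-bit Z/stencil per sample

multisampled = width * height * samples * (bytes_color + bytes_z)  # MSAA color + Z
resolved_back = width * height * bytes_color                       # resolved back buffer
front_buffer = width * height * bytes_color                        # front buffer for scanout

total_mb = (multisampled + resolved_back + front_buffer) / (1024 * 1024)
print(f"{total_mb:.0f} MB")  # ~103 MB, i.e. "over 100 MB" before textures and geometry
```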

Is there a way that you could cache the data that you need for AA on chip and not write it out to local memory?

Sure.
It's called tile based rendering.
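
To make the on-chip idea concrete: a tiler only needs to hold one tile's worth of samples at a time, and that working set is tiny. A rough sketch (the 32x16 tile size and the buffer formats are illustrative assumptions):

```python
# Compare the per-tile AA working set of a tiler with a full-screen
# multisampled buffer (tile size and formats are illustrative assumptions).
tile_w, tile_h, samples = 32, 16, 6
bytes_per_sample = 4 + 4  # assumed 32-bit color + 32-bit Z/stencil

on_chip_tile_buffer = tile_w * tile_h * samples * bytes_per_sample
full_screen_buffer = 1600 * 1200 * samples * bytes_per_sample

print(f"on-chip tile buffer:   {on_chip_tile_buffer / 1024:.0f} KB")          # ~24 KB
print(f"full-screen AA buffer: {full_screen_buffer / (1024 * 1024):.0f} MB")  # ~88 MB
```

The catch is that the scene has to be binned per tile before any of this can happen, which is where the extra hardware and driver complexity comes from.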
 
rwolf said:
AA sure does take a lot of bandwidth.

"At 1600x1200 with 6xAA, our buffer consumption is over 100 MBs of local memory space."

Is there a way that you could cache the data that you need for AA on chip and not write it out to local memory?

You could go back to video cards with 64MB memory.
Sure there is. It's called tile-based deferred rendering (TBDR) ;)

Eric, thank you for the answers. I wish I had taken more time to assemble the questions :D
 
So why haven't Nvidia and ATI adopted TBDR rendering techniques? You wouldn't need a 256-bit bus and you would only need 64MB local memory. The 9800 would be half the price. What is the downside to TBDR?
 
rwolf said:
So why haven't Nvidia and ATI adopted TBDR rendering techniques? You wouldn't need a 256-bit bus and you would only need 64MB local memory. The 9800 would be half the price. What is the downside to TBDR?

Because ATI would have to start from scratch and there are some major hurdles going that route... perhaps in the future.
 
geo said:
If I was them I'd take Richard Huddy and make him top PR dog and send him out full-time to trumpet the gospel...

He is good, but they need to trot him out more often...

DaveBaumann said:
It seems that Richard may well be wheeled out to a few UK press...

Poor guy. Labelled an evangelical, trumpet-playing, crippled, celebrity horse-dog.

Good interview. Short but interesting! Has anybody actually had hands-on experience with the Mac SSAA implementation? How does it look/perform?

MuFu.
 
rwolf said:
So why haven't Nvidia and ATI adopted TBDR rendering techniques? You wouldn't need a 256-bit bus and you would only need 64MB local memory. The 9800 would be half the price. What is the downside to TBDR?

I guess what you gain for AA you lose in the design of a lot of the other parts of the chip and/or the drivers.
It is never a full win/win situation...

LeGreg
 
I could see from the software side it must take ages to come out with drivers that work. It took Nvidia 10 months after the release of the NV30 to come out with decent drivers. I bet they were in development for two years before the card was released too.
 
rwolf said:
I could see from the software side it must take ages to come out with drivers that work. It took Nvidia 10 months after the release of the NV30 to come out with decent drivers. I bet they were in development for two years before the card was released too.

Reading this mini-thread-within-a-thread made me wonder if either of the big two have considered releasing a TBDR as a *value* part. I can see, based on the investment in top-notch drivers, why this kind of dislocative switch in architecture might seem like too big a risk to take for a top-of-the-market card. But a value card would still give them the opportunity to learn by experience on the driver end and work up to that software investment instead of all-at-once. Then, as they gain confidence with it, eventually it could move to top-of-the-line.
 
geo said:
Reading this mini-thread-within-a-thread made me wonder if either of the big two have considered releasing a TBDR as a *value* part. I can see, based on the investment in top-notch drivers, why this kind of dislocative switch in architecture might seem like too big a risk to take for a top-of-the-market card. But a value card would still give them the opportunity to learn by experience on the driver end and work up to that software investment instead of all-at-once. Then, as they gain confidence with it, eventually it could move to top-of-the-line.

Ugh, bad idea.

It might make some sense if IHV costs in making a GPU were entirely just the marginal cost of manufacturing the chips and boards. But a huge part of their cost structure is development cost--designing, implementing and verifying a hundred-million-transistor ASIC on a bleeding-edge process takes a large team several years and hundreds of millions of dollars. ATI, Nvidia, and now XGI amortize those costs by reusing essentially the same design across their entire product line. For example, the RV350 is almost exactly half an R300 (albeit on a different process); the NV31 is almost exactly half an NV30, the NV34 is an NV31 with a couple of parts missing, and the NV36 is almost exactly half an NV35, although IIRC with all the vertex shaders still around.

The point is, development costs are shared across the entire line; beyond having to separately undergo layout, board design and verification, development of the mainstream part is essentially "free" after having developed the high-end part. This goes double for driver development: once the drivers support one part in the lineup, they're only a couple of tiny tweaks away from supporting them all.

Trying to split a product line across two radically different chips (and a TBDR is a fairly radical split from an IMR) would be a recipe for disaster.

(edits: late-night typos)
 
Thanks for the explanation, Dave H. Yeah, I have heard about the power of leveraging the investment down the line. But isn't this a recipe for gridlock? Who takes the chance then? A new player, usually without the resources to do it right, and not-ready-for-prime-time drivers?

Maybe this should be a rhetorical question, as I start to imagine the beady eyes of the OT police focusing on me. . .
 
geo said:
Thanks for the explanation, Dave H. Yeah, I have heard about the power of leveraging the investment down the line. But isn't this a recipe for gridlock? Who takes the chance then? A new player, usually without the resources to do it right, and not-ready-for-prime-time drivers?

Maybe this should be a rhetorical question, as I start to imagine the beady eyes of the OT police focusing on me. . .

I'll try to add to DaveH's post (with which I agree, by the way): either you have a complete product line/architecture based on IMR or one based on TBDR. Mixing both approaches in the same product generation is a bad idea, since it inherently increases development time and resources. The ultimate goal - irrespective of the approach used - should be to maximize effectiveness while at the same time minimizing resource consumption.

Oh, and by the way: although TBDRs require a much smaller memory footprint for anti-aliasing (especially as the number of samples increases), there isn't a chance in hell that a competitive TBDR today, for whatever segment, could get away with just 64 MB of onboard RAM.
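
A rough sketch of the two footprints (the buffer formats and especially the 20 MB scene/binning buffer for the TBDR are guesses on my part; the real figure depends entirely on scene complexity):

```python
# Rough AA-related footprint comparison at 1600x1200, IMR vs. TBDR.
# Formats (32-bit color, 32-bit Z) and the 20 MB scene buffer are assumptions.
pixels = 1600 * 1200
MB = 1024 * 1024

def imr_footprint_mb(samples):
    # full multisampled color + Z kept in local memory, plus a resolved
    # back buffer and a front buffer
    return (pixels * samples * 8 + pixels * 4 * 2) / MB

def tbdr_footprint_mb(samples, scene_buffer_mb=20):
    # per-sample data lives in on-chip tile buffers; local memory only holds
    # the resolved back/front buffers and the binned scene data, so the cost
    # barely grows with the sample count
    return (pixels * 4 * 2) / MB + scene_buffer_mb

for s in (2, 4, 6):
    print(f"{s}xAA  IMR: {imr_footprint_mb(s):5.1f} MB   TBDR: {tbdr_footprint_mb(s):5.1f} MB")
```

Textures, vertex buffers and the rest of the working set come on top of that for both architectures, which is why 64 MB still wouldn't cut it for a competitive part.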

Back to the interview:

I found the comment about random sampling patterns (as 3DCenter noted on its front page) highly interesting. I'd personally love to see some sort of stochastic sampling pattern in future products, but judging from various comments in the interview as well as on these boards, stochastic sampling not only requires a fairly high number of samples to have a worthwhile advantage over current AA patterns, but also still seems too costly to implement in hardware.

The real question now is what ATI could theoretically do to build/expand on the existing excellent MSAA algorithm for future products. Framebuffer requirements shouldn't be a consideration for next year either, since I doubt we'll see a high-end product with less than 256 MB of RAM onboard, and thus 8x sparse grid lies within the range of possibilities.
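
For what it's worth, the difference between the two kinds of patterns is easy to see with a toy example (the sample positions below are made up for illustration; they are not ATI's actual hardware patterns):

```python
import random

# Toy 8x patterns on an 8x8 subpixel grid (positions are illustrative only).

# Sparse grid: each sample occupies its own row and its own column, so a
# near-horizontal or near-vertical edge sees 8 distinct coverage steps.
sparse_8x = [(0, 3), (1, 7), (2, 1), (3, 5), (4, 0), (5, 4), (6, 2), (7, 6)]

# Stochastic: jitter every sample independently; good noise behaviour, but at
# low sample counts the samples can cluster, which is why it tends to need a
# high sample count before it pays off.
random.seed(0)
stochastic_8x = [(random.randrange(8), random.randrange(8)) for _ in range(8)]

for name, pattern in (("sparse", sparse_8x), ("stochastic", stochastic_8x)):
    xs = len({x for x, _ in pattern})
    ys = len({y for _, y in pattern})
    print(f"{name:10s} distinct x offsets: {xs}/8, distinct y offsets: {ys}/8")
```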

Finally, congratulations to both 3DCenter for the interesting questions and Eric for the well-thought-out answers. ;)
 