I find their "diminishing returns" argument full of holes; it's BS marketing speak. There's a reason CPUs hit a diminishing-returns bottleneck, and it's that their workloads are primarily single-threaded. Increasing ILP and clock speed for single-threaded workloads does show diminishing returns.
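Rough back-of-envelope to show why the comparison doesn't transfer (my numbers, assuming Pollack's rule of thumb that single-thread performance grows only about as the square root of the transistor budget, while throughput work scales nearly linearly):

```python
# Back-of-envelope sketch, assuming Pollack's rule for single-thread perf
# (~sqrt of transistors spent on a core) versus near-linear scaling for
# embarrassingly parallel graphics work. All numbers are illustrative.
def single_thread_speedup(transistor_ratio):
    return transistor_ratio ** 0.5

def parallel_speedup(transistor_ratio, efficiency=0.9):
    return transistor_ratio * efficiency

for r in (2, 4, 8):
    print(f"{r}x transistors: ~{single_thread_speedup(r):.1f}x single-thread, "
          f"~{parallel_speedup(r):.1f}x parallel")
```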
There are some diminishing returns, particularly with no chip yet from any manufacturer doing more than one tri/clock (BTW, do you have any insight as to the reason?). In polygon throughput a 3870x2 slaughters a GTX 280, for example:
http://www.ixbt.com/video3/gt200-part2.shtml
Setup is definitely an issue, as I discussed here:
http://forum.beyond3d.com/showpost.php?p=1177998&postcount=112
Eventually, monolithic chips will have problems feeding their shader units, too, because even if you're pixel-shader limited, there is a minimum number of fragments that must be submitted between pipeline flushes to maintain ALU saturation. To a certain degree you can handle multiple renderstates simultaneously, but at some point it's just more efficient to have multiple GPU entities. As an example, GT200 can hold 30,720 fragments in flight, and running with much less than that reduces efficiency. We know it can handle VS, GS, compute, and PS batches simultaneously, but how many different renderstates can it juggle?
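To put numbers on that figure (my arithmetic, using the commonly cited GT200 layout; the 5,000-fragment draw is a made-up example):

```python
# Sanity-checking the 30,720 figure with the commonly cited GT200
# layout: 10 TPCs x 3 SMs, 1024 threads in flight per SM.
tpcs = 10
sms_per_tpc = 3
threads_per_sm = 1024

fragments_in_flight = tpcs * sms_per_tpc * threads_per_sm
print(fragments_in_flight)  # 30720

# If a renderstate change forces a flush, a small batch leaves ALUs idle.
# A hypothetical 5,000-fragment draw fills only a fraction of the machine:
print(f"{5000 / fragments_in_flight:.0%}")  # ~16%
```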
There are issues with the way crossbars scale, too. If a load can be run efficiently on two chips, that's much better. Looking at the die shots, I think this is one of the reasons for GT200's non-ideal scaling compared to G92.
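Quick sketch of why splitting helps: a full crossbar needs a path from every client to every memory channel, so crosspoints grow with the product of the two (the port counts below are hypothetical, just to show the shape, not actual G92/GT200 figures):

```python
# Crosspoint count for a full crossbar is clients x channels, so it grows
# with the product of the two. Port counts are made up for illustration.
def crosspoints(clients, channels):
    return clients * channels

one_big_chip = crosspoints(16, 8)        # one chip: 16 clients, 8 channels
two_small_chips = 2 * crosspoints(8, 4)  # same total units split across two
print(one_big_chip, two_small_chips)     # 128 vs 64: half the wiring
```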
I agree that the analogy to CPUs is terrible, but it's not entirely BS.
The problem for ATI is that, because they don't have the uber-big chips, their marketing message will only be successful in the short term, while the 4870x2 goes up against the GT200. As soon as NV shrinks it and tweaks it, and rolls out the inevitable smaller cut-down versions, NV will have a story at the high end and the middle of the market.
Obviously ATI already took this into account when determining their sweet spot. GT200 won't shrink that much during its lifetime. ATI can always scale up if it makes sense, like R300->R420.
The question is whether NVidia will ever be able to cool and power a pair of GT200s in a dual-slot case, and whether people will care.
As for cut down versions, we already have a pretty good idea of what to expect with G92 and G94. In fact, I think they've already stated that they're going to continue G92. There aren't any features or perf/mm2 improvements in GT200 that warrant a cut-down derivative in addition to G92.
Scarier for NVidia is how fast a 128-bit RV730 could be...