AMD: R9xx Speculation

Jawed: Maybe I didn't describe my idea properly. Imagine two systems:

1. a mainstream GPU at, e.g., 1440x900
2. a high-end GPU at, e.g., 1920x1200

Both systems achieve 60 FPS at these settings. Isn't the triangle-setup load identical in both situations? If so, what's the benefit of giving the high-end GPU setup that's twice as fast?
The high-end GPU (it should be 2560x1600 for an enthusiast-class system, or 3x the width of that with an Eyefinity resolution if you really want to be pedantic) should be rendering more triangles.

Further, because ALU capability is cut by an even larger margin on the lower GPUs (e.g. 20:1 or more comparing Cypress with Cedar), it's likely that tessellation will be "turned down" on them if HS/DS is a substantial cost, i.e. the density of triangles per pixel will be lower on the slower GPU than on the faster one.

(I'm still unclear on whether HS or DS will be mostly ALU limited or TEX limited, by the way... Also, it's hard to determine what proportion of frame rendering time will be spent on VS/HS/DS/GS versus PS.)

The problem we have is we can't tell where the gaming bottleneck will lie for a highly tessellated game on an enthusiast rig. (And AMD will always make the excuse that Hemlock is the enthusiast card, which scales setup/rasterisation by almost 2x - "if the engine is coded properly for AFR").

But we can be sure the enthusiast rig will tend towards rendering more triangles, simply because in games with resolution-adaptive tessellation the higher resolution pushes the tessellation factors up.
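
A minimal sketch of what "adaptive tessellation based upon resolution" usually amounts to: pick a factor so that tessellated edges cover a roughly fixed number of pixels. The projection helper, the 8-pixel target and the patch/camera numbers below are all invented assumptions, not anything from a shipping engine; the point is only that the same scene at a higher vertical resolution asks for higher factors, hence more triangles.

```c
#include <math.h>
#include <stdio.h>

/* crude pinhole projection: on-screen size of a patch edge, in pixels */
static float projected_edge_pixels(float edge_world, float distance,
                                   float fov_scale, float screen_height)
{
    return edge_world / distance * fov_scale * screen_height * 0.5f;
}

static float tess_factor(float edge_pixels, float target_pixels_per_edge)
{
    float f = edge_pixels / target_pixels_per_edge;
    if (f < 1.0f)  f = 1.0f;
    if (f > 64.0f) f = 64.0f;   /* D3D11 clamps tessellation factors to [1, 64] */
    return f;
}

int main(void)
{
    const float fov_scale = 1.0f / tanf(0.5f * 1.0472f); /* ~60 degree vertical FOV */
    const float target = 8.0f;                           /* aim for ~8 px per tessellated edge */
    const float heights[] = { 900.0f, 1200.0f, 1600.0f };
    for (int i = 0; i < 3; ++i) {
        float px = projected_edge_pixels(2.0f, 10.0f, fov_scale, heights[i]);
        printf("screen height %4.0f px: edge covers %3.0f px -> tess factor %4.1f\n",
               heights[i], px, tess_factor(px, target));
    }
    return 0;
}
```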

Also, we can't tell to what extent developer noobness with tessellation will lead to unintended consequences (let alone driver noobness).

Jawed
 
rpg.314: I understand. I think increasing the clock speed has been sufficient until now, but because of tessellation the triangle rate now has to rise significantly. However, developing a distributed geometry system would only be worthwhile if the demand for geometry performance kept growing significantly every year or half-year. As I understand it, the increase in geometry performance is a one-shot job driven by tessellation; further n-fold increases aren't needed, since additional polygons wouldn't improve image quality. That's why I see no reason to develop a scalable system.

Over time, you can bet on tessellation factors increasing (due to better hardware, more developer experience, etc.), which will create more triangles from the same patches and increase your geometry load.
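
To put a rough number on that: with uniform integer partitioning, a triangle patch at factor N yields on the order of N*N triangles, so geometry load grows roughly quadratically as factors creep up. A back-of-envelope sketch (the per-frame patch count is an arbitrary assumption):

```c
#include <stdio.h>

int main(void)
{
    const long patches = 100000;  /* assumed patches submitted per frame */
    for (int n = 4; n <= 64; n *= 2) {
        /* uniform integer partitioning: ~n*n triangles per triangle patch */
        double tris = (double)patches * n * n;
        printf("tess factor %2d: ~%6.1f M triangles per frame\n", n, tris / 1e6);
    }
    return 0;
}
```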
 
Okay. I thought I'd seen a picture of it with a triple-slot cooler (can't find it now though), so I was probably wrong.

There may be triple-slot coolers; it all depends on the stepping. The first A0s probably had one. I think the one I saw was an A1; the A2s were the 'fixed' versions that would have gone to production, or possibly an A3 that entailed minor bug fixes. The hardware was basically done. :(

I have a pic of it, but this forum won't let you attach files, so, well, no post.

-Charlie
 
It might have been a 300W+ part, I dunno.

LRB1, the cancelled part made on 45 nm, was comparable to a GTX 285. GF100, its closest competitor, should be ~80% faster; since GF100 is on 40 nm, let's call it 65% ahead. Clearly, if LRB has to get within, say, 10-15% of GF100, it needs major tweaks or a redesign. Merely throwing more cores at it won't help. Not to mention that as GPUs shed their fixed-function hardware they have room to grow beyond what Moore's law permits, while LRB is pretty much stuck at that ceiling.

It was around 300W, depending on a few things, the stepping being one of them.

As for the processes, are you really comparing Intel's 45nm HKMG process to TSMC's bulk? Want to bet that in reality, not on paper, Intel 45 beats TSMC 40 on every single metric? What I mean by that is TSMC _CAN_ draw lines at 40nm, smaller than Intel, but I am willing to bet the real achievable parts have Intel WAY ahead in every respect.

The problem wasn't that it was slow; it more or less hit its targets. It was just 1.5 years late, almost two generations. It was aimed at late 2008, at which point it would have been up against the G200a, and LRB2 would have been out before GF100, potentially even before Cypress.

Given that Intel could easily have doubled the shader/core count between LRB1 and LRB2, if not gone much further, it would have been competitive.

Then again, I am only looking at hardware. Intel software is... umm... let's be VERY polite and say lacking that special 'zip' that makes things great. By zip I kind of mean basic functionality, though.

-Charlie
 
What would have changed between LRB1 and LRB2 to make the power consumption low enough for that? (Other than dropping the clock.)
 
Isn't the triangle-setup load identical in both situations? If so, what's the benefit of giving the high-end GPU setup that's twice as fast?
In many games, geometry detail is scalable, at least as "low, med, high". I imagine this will be the case with DX11 titles using heavy tessellation: at the highest settings you will only be able to run the game smoothly on high-end GPUs with enough geometry performance. So in practice, with a mainstream board you'll want to use a medium setting, which will still look pretty good because of tessellation, and because the high setting will be utter geometry overkill. This could, however, pose problems in performance tests, where reviewers will use high settings and lower models will be capped by triangle rate. Of course, this is an extreme scenario. I hope nVidia won't try to capitalise on their superior triangle rate by pushing devs to use tessellation just to hog the triangle setup.
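
A back-of-envelope illustration of being "capped by triangle rate" (the clock and triangles-per-clock figures are assumptions, not vendor specs): whatever the shaders can do, setup alone puts a hard ceiling on triangles per frame at a given frame rate, which is where a doubled setup rate would show up at very high geometry settings.

```c
#include <stdio.h>

int main(void)
{
    const double clock_hz = 850e6;        /* assumed core clock */
    const double fps = 60.0;              /* target frame rate */
    const double rates[] = { 1.0, 2.0 };  /* one vs. two triangles set up per clock */

    for (int i = 0; i < 2; ++i) {
        double ceiling = clock_hz * rates[i] / fps;
        printf("%.0f tri/clk: setup ceiling ~%.1f M triangles per frame at %.0f FPS\n",
               rates[i], ceiling / 1e6, fps);
    }
    return 0;
}
```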
 
I hope nVidia won't try to capitalise on their superior triangle rate by pushing devs to use tessellation just to hog the triangle setup.

For all we know, Nvidia's advantage in tessellation may only last a few months or weeks. So I dunno.
 
This prolly hasn't been posted here in this thread.

http://www.theinquirer.net/inquirer/news/1432013/amd-svp-weighs-graphics-competition

Note the date, June 2009. Before Evergreen's launch. ;)

That was after Evergreen was shown off at Computex, almost a month later in fact. The 'year' means Evergreen, FWIW; it will take ~9 months from that post for it all to be out.

As for competing on the low end, keep an eye on NV's ASPs in Q1. Q4 was/is going to be odd because of artificial shortages, but that won't be the case in Q1.

-Charlie
 
Surely, LRB2 can't be more programmable/generalized than LRB1. After all, LRB1 has x86. :smile:

Or are they somehow gonna make it more specialized/exotic/restricted? :oops:

No, I just meant that the first generation of a chip is just that. When they get silicon back, there are always things that they realize are overdone, could be done more efficiently or, in general, laid out better. Given the architecture, if they can save a few mm² off the core, multiplied by 32 cores that adds up quickly.
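
Rough arithmetic behind "multiplied by 32, that adds up quickly" (the per-core saving and die size are invented figures for illustration only):

```c
#include <stdio.h>

int main(void)
{
    const double saved_per_core_mm2 = 3.0;  /* assumed saving per core */
    const int cores = 32;
    const double die_mm2 = 600.0;           /* assumed LRB-class die area */

    double saved = saved_per_core_mm2 * cores;
    printf("%.0f mm^2 reclaimed, about %.0f%% of a %.0f mm^2 die\n",
           saved, 100.0 * saved / die_mm2, die_mm2);
    return 0;
}
```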

-Charlie
 
http://www.theinquirer.net/inquirer/news/1432013/amd-svp-weighs-graphics-competition

Note the date, June 2009. Before Evergreen's launch. ;)

So the implication is that AMD has something "new and exciting" planned around the June 2010 timeframe. Or at least that was their timeline last year.

That could match up with Huddy's comment that he expects AMD to have the performance edge for "most of the year", i.e. they'll lose it to Nvidia between the GF100 launch and whenever this "new and exciting" thing is launched, and they expect to regain the performance crown with it.

Now if only AMD would pull an Nvidia and start paper launching that product now so we could start speculating. :D

Regards,
SB
 
Now if only AMD would pull an Nvidia and start paper launching that product now so we could start speculating. :D

Think of the trees! They wouldn't chop down evergreens to satisfy our need for paper specifications, would they?
 
Interesting tidbits from Anand's Evergreen piece.

The move from TSMC to Global Foundries will surely challenge them once more.

The Northern Islands GPUs, due out later this year, were surely designed before anyone knew how RV870 would play out.

Considering the way GF is pimping its HKMG and promising customer announcements in Q1 2010, I am betting that NI will be on 28nm @ GF.

NI might have been designed before RV870 came to market, but my guess is that its design process began after RV770's launch.

Seen in the light of the TSMC->GF transition, it's possible that the rumoured 5860, 5880 and 5990 are on GF's 40nm process. That would help them get set up at the new foundry and let them use the experience for NI, much like RV740 was a pipe-cleaner for Evergreen.
 
If Carrell is this enthusiastic about sideport then I take that as an indication that he thinks AFR sucks.

Of course it also means sideport/multi-chip discussions, silly as they've often been, have life in them.

Jawed
 
If Carrell is this enthusiastic about sideport then I take that as an indication that he thinks AFR sucks.

I hope so. I hate Crossfire/SLI because of AFR, not because it's multi-GPU. I still remain hopeful that multi-GPU will eventually be attractive to me, but as long as it remains AFR it's a complete and total non-starter for me.

Regards,
SB
 
If Carrell is this enthusiastic about sideport then I take that as an indication that he thinks AFR sucks.

Of course it also means sideport/multi-chip discussions, silly as they've often been, have life in them.

It's not that AFR "sucks" per se, just that it's rather limited. It works well only when you have two symmetric GPUs. Consider that forthcoming AMD CPUs are going to have integrated graphics, and they'd love to use that for a little extra boost to graphics performance when you're using a discrete GPU. You certainly can't do AFR when you have a 4x or 8x or greater performance difference between the two GPUs.
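
A toy model of why that breaks down (the frame times are invented and queue-depth limits are ignored): with AFR the chips alternate whole frames and frames must be presented in order, so a much slower IGP in the rotation drags the delivered rate down to roughly twice its own pace, far below what the discrete GPU manages alone.

```c
#include <stdio.h>

int main(void)
{
    const double fast_ms = 5.0;   /* assumed discrete-GPU frame time */
    const double slow_ms = 40.0;  /* assumed IGP frame time, 8x slower */
    const int frames = 120;
    double done_fast = 0.0, done_slow = 0.0, last_present = 0.0;

    for (int i = 0; i < frames; ++i) {
        /* AFR: even frames on the fast GPU, odd frames on the slow one */
        double finish;
        if (i % 2 == 0) { done_fast += fast_ms; finish = done_fast; }
        else            { done_slow += slow_ms; finish = done_slow; }
        /* frames are presented in submission order */
        if (finish > last_present) last_present = finish;
    }

    printf("AFR pair      : %.1f ms/frame (~%.0f FPS)\n",
           last_present / frames, 1000.0 * frames / last_present);
    printf("fast GPU alone: %.1f ms/frame (~%.0f FPS)\n",
           fast_ms, 1000.0 / fast_ms);
    return 0;
}
```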

It wouldn't surprise me to see ATI invest quite a bit in new forms of multi-GPU rendering over the coming year.
 