The ATI R600 Rumours & Speculation Centrum


Sobek
08-Nov-2006, 12:06
Ok so...how about we stop trying to piss on each other via the internet (obviously impossible as you can't fax urine), and get down to facts?

It's like reading a post on the WoW forums after accidentally clicking a link to there.

chavvdarrr
08-Nov-2006, 12:18
Was there any discussion of whether the supposed 512-bit bus is 512 bits to on-board RAM or, say, 256 bits to RAM and 256 bits to embedded memory à la Xenos?

Jawed
08-Nov-2006, 12:30
Or does fully unified really just mean VS/PS and not other functions like triangle setup?
Triangle setup is prolly a late candidate for unification. Fixed function hardware does this very well as far as I can tell, and it would be a disproportionate cost right now.

A bit like ROPs are still fixed function, as well as texture filtering. And curiously, NVidia has gone "backwards" making texture addressing "fixed function" after being programmable for a fair while.

Jawed

satein
08-Nov-2006, 13:22
Triangle setup is prolly a late candidate for unification. Fixed function hardware does this very well as far as I can tell, and it would be a disproportionate cost right now.

A bit like ROPs are still fixed function, as well as texture filtering. And curiously, NVidia has gone "backwards" making texture addressing "fixed function" after being programmable for a fair while.

Jawed

Is it possible that NV made it fixed-function to reduce the transistor count on G80 compared to a programmable design? Also, speed-wise, fixed-function hardware should be faster than programmable hardware too.... <== my guess...

Demirug
08-Nov-2006, 13:29
A bit like ROPs are still fixed function, as well as texture filtering. And curiously, NVidia has gone "backwards" making texture addressing "fixed function" after being programmable for a fair while.

Jawed

Texture addressing was always a part of the fixed function texture unit. The only “programmable” part was the moving of the coordinates from the shader to the texture unit, because the only path goes through the first ALU/FPU in the pipeline. This design goes back to the first “shader” hardware, NV20, which had only one pixel shader FPU in total.

nAo
08-Nov-2006, 13:42
Demirug is right, and this has already been repeated so many times before.. :)
While sampling cube maps the hw is just moving the texture coordinates from a reg to the TMU;
otherwise it's also perspective correcting them.
I would not be surprised at all if G80 still does that.

dnavas
08-Nov-2006, 14:35
And as I've contributed my own misunderstanding to that sub-thread, let me add a bit of color. Texture addressing passes through two MULs in the first ALU in G7x where addresses *can* be modified using the MUL units (thus perspective correction). The generic part of address calculation is elsewhere. So, for a fairly high cost (keeping half the Vec4 in ALU1 busy) G7x isn't getting much in return....

Xmas
08-Nov-2006, 14:53
While sampling cube maps the hw is just moving the texture coordinates from a reg to the TMU;
otherwise it's also perspective correcting them.
Surely you don't mean to say an interpolated value used for cube map sampling will not be perspective correct?

And as I've contributed my own misunderstanding to that sub-thread, let me add a bit of color. Texture addressing passes through two MULs in the first ALU in G7x where addresses *can* be modified using the MUL units (thus perspective correction). The generic part of address calculation is elsewhere. So, for a fairly high cost (keeping half the Vec4 in ALU1 busy) G7x isn't getting much in return....
For interpolant-based reads, perspective correction is pretty much always required. For dependent reads, there is at least some chance that the coordinates being used need to be multiplied by some value, so the MUL really isn't wasted that often.

nAo
08-Nov-2006, 15:19
Surely you don't mean to say an interpolated value used for cube map sampling will not be perspective correct?
This is exactly what I meant: if you use the interpolated value as it is, you don't care about doing any perspective correction, as it would just change the vector's magnitude and not its direction, and we know cube maps just don't care about the sampling vector's length..
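
To make the point concrete, here is a small Python sketch of a cube map lookup using the common OpenGL-style major-axis rule. It is only an illustration, not a model of any particular TMU; the function name `cube_face_uv` and the sample vector are made up for the example. It shows that scaling the whole direction vector by a positive factor, which is all the per-pixel perspective correction would do here, selects the same face and the same (u, v).

```python
def cube_face_uv(x, y, z):
    """Map a non-zero direction vector to (face, u, v) with u, v in [0, 1].

    Uses the major-axis selection rule in the OpenGL-style convention;
    this is an illustrative sketch, not a model of any specific TMU.
    """
    ax, ay, az = abs(x), abs(y), abs(z)
    if ax >= ay and ax >= az:          # X is the major axis
        face, ma = ("+x" if x > 0 else "-x"), ax
        sc, tc = (-z if x > 0 else z), -y
    elif ay >= az:                     # Y is the major axis
        face, ma = ("+y" if y > 0 else "-y"), ay
        sc, tc = x, (z if y > 0 else -z)
    else:                              # Z is the major axis
        face, ma = ("+z" if z > 0 else "-z"), az
        sc, tc = (x if z > 0 else -x), -y
    # Dividing by the major-axis magnitude cancels any uniform scale.
    return face, 0.5 * (sc / ma + 1.0), 0.5 * (tc / ma + 1.0)


direction = (0.3, -0.8, 0.5)           # hypothetical interpolated vector
scale = 4.0                            # stands in for 1 / lerp(1/W)
scaled = tuple(scale * c for c in direction)

unscaled_lookup = cube_face_uv(*direction)
scaled_lookup = cube_face_uv(*scaled)
print(unscaled_lookup, scaled_lookup)  # same face, same (u, v)
```

Because every component is divided by the major-axis magnitude, the uniform scale drops out before the texel address is formed.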

nAo
08-Nov-2006, 15:34
And as I've contributed my own misunderstanding to that sub-thread, let me add a bit of color. Texture addressing passes through two MULs in the first ALU in G7x where addresses *can* be modified using the MUL units (thus perspective correction).
no.. the TMU is between the 2 ALUs, so your sampling coords just go through one ALU.
AFAIK they perform divs, not muls.. (don't ask me why!)

Megadrive1988
08-Nov-2006, 15:45
uhhh with news that G80 is more powerful and different than once believed, is it a stretch to think that ATI is going to have to implement a 96-shader, 512-bit bus R600 instead of a 64-shader, 256-bit bus R600 ?

maybe that, or something to boost the performance is the reason for the delay.

Bouncing Zabaglione Bros.
08-Nov-2006, 16:35
uhhh with news that G80 is more powerful and different than once believed, is it a stretch to think that ATI is going to have to implement a 96-shader, 512-bit bus R600 instead of a 64-shader, 256-bit bus R600 ?

maybe that, or something to boost the performance is the reason for the delay.

It's too late to change anything of major significance like that. ATI would have to scrap the latter if they decided at this point they needed to build the former. Either they've already got an external 512 bus, or it's too late to put one in.

vertex_shader
08-Nov-2006, 16:52
uhhh with news that G80 is more powerful and different than once believed, is it a stretch to think that ATI is going to have to implement a 96-shader, 512-bit bus R600 instead of a 64-shader, 256-bit bus R600 ?

maybe that, or something to boost the performance is the reason for the delay.
A 2x speed bump over the last generation is nothing special.
DX9 performance says nothing; NV30 was fast in DX8 games too, and then everyone knows what happened.

R600 doesn't have any delay; it's coming when needed, some days before the Vista public launch :wink:
I remember the first G80 launch date rumor said early summer, and it's already early winter in Europe.

I don't really understand why we're talking about a delay. ATi's schedule changed because of Xenos. NV only did some refreshes after NV40, and RSX is mostly G71; Xenos and R520 were two new and different GPUs.



01.27 - 01.30.

Razor1
08-Nov-2006, 17:08
It's too late to change anything of major significance like that. ATI would have to scrap the latter if they decided at this point they needed to build the former. Either they've already got an external 512 bus, or it's too late to put one in.


True, can't just bolt another 256 bit bus on there :grin:

Razor1
08-Nov-2006, 17:09
A 2x speed bump over the last generation is nothing special.
DX9 performance says nothing; NV30 was fast in DX8 games too, and then everyone knows what happened.

R600 doesn't have any delay; it's coming when needed, some days before the Vista public launch :wink:
I remember the first G80 launch date rumor said early summer, and it's already early winter in Europe.

I don't really understand why we're talking about a delay. ATi's schedule changed because of Xenos. NV only did some refreshes after NV40, and RSX is mostly G71; Xenos and R520 were two new and different GPUs.



01.27 - 01.30.


It might not be delayed because there is no Vista yet, but it's late, and that is a problem: nV will have 3-4 months to work with developers tuning games and engines to their new cards........

vertex_shader
08-Nov-2006, 17:28
It might not be delayed because there is no Vista yet, but it's late, and that is a problem: nV will have 3-4 months to work with developers tuning games and engines to their new cards........

R600 samples are out for devs already.
85% of games are optimized for nv cards because of nv's TWIMTBP program, so I don't see the point here. ATi always has this disadvantage, and we can see what happens in reality when we check this generation's in-game performance numbers. (Oblivion is the best example: it's part of the TWIMTBPaid :wink: program, and runs much better on ATi cards.)

The only thing ATi needs to do better is PR. The PR department sucks big time. Check the last 3 weeks: G80 leaks/info everywhere, the AEG guys doing their job very well, and ATi can't hard launch a long delayed (yes, this is delayed) mainstream card (X1650XT).

INKster
08-Nov-2006, 17:41
(Oblivion is the best example: it's part of the TWIMTBPaid :wink: program, and runs much better on ATi cards.)


Correction: it used to run much better on ATI cards, but not anymore. ;)

Xmas
08-Nov-2006, 17:56
This is exactly what I meant: if you use the interpolated value as it is, you don't care about doing any perspective correction, as it would just change the vector's magnitude and not its direction, and we know cube maps just don't care about the sampling vector's length..
Duh! :embarrased:

no.. the TMU is between the 2 ALUs, so your sampling coords just go through one ALU.
AFAIK they perform divs, not muls.. (don't ask me why!)
Because the interpolators do the screen space linear interpolation of the vertex values (U/W), (V/W) and (1/W). To get the perspective correct U and V, one needs to divide lerp(U/W) and lerp(V/W) by lerp(1/W).
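
A minimal Python sketch of that arithmetic, assuming a single edge between two vertices and a screen-space parameter `t` (the vertex values and helper names are made up for illustration):

```python
def lerp(a, b, t):
    """Linear interpolation between a and b at parameter t in [0, 1]."""
    return a + (b - a) * t


def perspective_correct(u0, w0, u1, w1, t):
    """Interpolate U/W and 1/W linearly in screen space, then divide."""
    u_over_w = lerp(u0 / w0, u1 / w1, t)      # lerp(U/W)
    one_over_w = lerp(1.0 / w0, 1.0 / w1, t)  # lerp(1/W)
    return u_over_w / one_over_w              # perspective-correct U


# Hypothetical edge: near vertex (U=0, W=1), far vertex (U=1, W=3).
u0, w0 = 0.0, 1.0
u1, w1 = 1.0, 3.0
t = 0.5                                        # halfway across in screen space

naive_u = lerp(u0, u1, t)                      # 0.5: ignores perspective
correct_u = perspective_correct(u0, w0, u1, w1, t)  # about 0.25
print(naive_u, correct_u)
```

Halfway across the edge on screen corresponds to only a quarter of the way along it in texture space, because the far end is foreshortened.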

nAo
08-Nov-2006, 18:03
Because the interpolators do the screen space linear interpolation of the vertex values (U/W), (V/W) and (1/W). To get the perspective correct U and V, one needs to divide lerp(U/W) and lerp(V/W) by lerp(1/W).
I know this, but at the same time I wonder why they don't simply compute 1 RCP (to be reused for all the evaluated interpolants within a pixel) and N muls..
I thought dividers were expensive.. at the same time, if your hw has a divider then you can save an RCP..

Chalnoth
08-Nov-2006, 18:17
the AEG guys doing their job very well
The AEG and nVidia split ways a long time ago. Seriously, whose fault is it that nVidia pays more attention to their customers than ATI?

Bouncing Zabaglione Bros.
08-Nov-2006, 18:22
There's no doubt that ATI's PR always seems to be missing in action at times when (if the situation were reversed) Nvidia's PR would be out in force and working very hard. Where are you ATI marketing?

Chalnoth
08-Nov-2006, 18:32
When I visited nVidia earlier this year, I talked to a couple of guys who used to work for ATI. They both felt that nVidia had a vastly more consumer-centric company culture than ATI.

overclocked_enthusiasm
08-Nov-2006, 18:50
The ATI PR department, while inept, is not the problem. The problem is that since R3xx, ATI has been late with products and has been reacting to NVDA. NVDA has beaten ATI in launching the next generation products and refreshes since R3xx. As a LONG term investor in ATI I followed these launches and these missteps by ATI with a fine tooth comb. The proof of this failure is reflected in the market share data each quarter and here we are going into ANOTHER Xmas season with no competing products. Yes ATI does have some good products...but once again they don't have the best.

The real problem at ATI can be traced to the change in CEO which happened at the same time that ATI was on top of the world with R3xx. Everyone needs to face the simple fact that Jen at NVDA is hardcore...a winner...and thumped ATI so bad they had to take a buyout from AMD to survive. Jen did it by creating products on time, in the right segments and establishing new markets for his products like SLI. Orton instead takes fragile (in that they are cancelled now) low margin business deals like supplying chipsets for Intel. Jen is a dreamer, a visionary and a savvy CEO where Orton is an engineer who is more excited about the technology than the nuts and bolts of running a business. The contrasts cannot be more clear nor can the results.

NVDA is going to rule the discrete market in a more and more dominant manner until there finally is a Fusion of CPU and GPU. Once that Fusion takes place...we will see the true value of ATI to AMD. Until then, expect ATI's discrete business to wither on the vine and for R700 and later generations to either be scrapped or pulled into the Fusion initiative.

overclocked_enthusiasm
08-Nov-2006, 18:54
When I visited nVidia earlier this year, I talked to a couple of guys who used to work for ATI. They both felt that nVidia had a vastly more consumer-centric company culture than ATI.

There is no question that is true. NVDA has always made sure that the channel and AIB partners got their products as well as the OEMs. This has built NVDA a great deal of mindshare with consumers who shop at Best Buy or online for their goods. NVDA sees the value of this as establishing their brand and the end result is customer loyalty. Being on time to market with new products is a perfect example of this fundamental difference in corporate cultures between the 2 companies.

ERP
08-Nov-2006, 18:56
Just want to correct one misconception here
It's not uncommon for reviewers to see cards before a wide variety of devs see them.

Razor1
08-Nov-2006, 19:05
R600 samples are out for devs already.
85% of games are optimized for nv cards because of nv's TWIMTBP program, so I don't see the point here. ATi always has this disadvantage, and we can see what happens in reality when we check this generation's in-game performance numbers. (Oblivion is the best example: it's part of the TWIMTBPaid :wink: program, and runs much better on ATi cards.)

The only thing ATi needs to do better is PR. The PR department sucks big time. Check the last 3 weeks: G80 leaks/info everywhere, the AEG guys doing their job very well, and ATi can't hard launch a long delayed (yes, this is delayed) mainstream card (X1650XT).


And where do you get your info that devs have the R600 in hand? Many devs, that is.

ATi's disadvantage is PR for their products. That stems from management; they should have corrected this problem long ago, which they haven't. Hopefully AMD is more capable. Oblivion's performance was due to a better design decision on ATi's part, well, not really better, since the R520 and R580 were a new gen and the GF7s were older tech.

overclocked_enthusiasm
08-Nov-2006, 19:11
http://seekingalpha.com/article/12758

Read this if you want to see what is wrong with ATI.

Bouncing Zabaglione Bros.
08-Nov-2006, 19:24
I think ATI has some great tech and has brought out some great products. Personally, I like ATI's products better than Nvidia's for a number of reasons, but ATI just don't do anywhere as well as Nvidia when it comes to getting into the faces of the public and blowing their own horns. And it's been the case for years now.

overclocked_enthusiasm
08-Nov-2006, 19:36
I think ATI has some great tech and has brought out some great products. Personally, I like ATI's products better than Nvidia's for a number of reasons, but ATI just don't do anywhere as well as Nvidia when it comes to getting into the faces of the public and blowing their own horns. And it's been the case for years now.

I agree as well. The bigger problem is these windows of unopposed opportunity given to Nvidia first with 7800 and now with 8800. They both look to be about the same duration of 3 or 4 months as ATI scrambles (or is it saunters?) their way to getting out a competing next-gen architecture. During this 3-4 month window of unopposed opportunity given to them, Nvidia is able to pad their already high gross margins (no price competition), pad their high market share and enjoy the PR buzz and mindshare gains afforded those with the best current product (ala Core Duo).

These missed opportunities have mired ATI's gross margins in the toilet, caused them to lose market share and relegated them to the also-ran position: second fiddle and late to market. Once ATI's products do come to market they have no pricing power because Nvidia is 3-4 months more mature in the process and can push prices down due to better yields and force ATI to respond accordingly. Also, Nvidia can respond with a refresh or push clocks to stay on par with ATI, and in the end ATI has done nothing more than catch up and has missed the golden "first to market" opportunity that nets high gross margins, increased market share and the positive PR buzz.

When you think about it, what does ATI PR really have to crow about right now? RV560? They are simply on their heels and are responding to Nvidia months and months after the fact. So sad... NVDA stock hit a presplit high of $70 today and ATI would still be in the $15-18 range if not for the AMD buyout at $21.

Razor1
08-Nov-2006, 19:40
Just want to correct one misconception here
It's not uncommon for reviewers to see cards before a wide variety of devs see them.


Very true.

Bouncing Zabaglione Bros.
08-Nov-2006, 19:49
I agree as well. The bigger problem is these windows of unopposed opportunity given to Nvidia first with 7800 and now with 8800. They both look to be about the same duration of 3 or 4 months as ATI scrambles (or is it saunters?) their way to getting out a competing next-gen architecture. During this 3-4 month window of unopposed opportunity given to them, Nvidia is able to pad their already high gross margins (no price competition), pad their high market share and enjoy the PR buzz and mindshare gains afforded those with the best current product (ala Core Duo).


People always talk about the "halo effect" of having the top technology, but there's also a halo effect of getting there first, of being perceived to be a market leader, of having your logo associated with the big games, etc.

That 3-4 month lead is worth more to Nvidia than just the monetary advantage of being able to sell its latest tech at top prices with no competition; it's all about getting that halo effect in all those different aspects.

Xmas
08-Nov-2006, 20:47
I know this, but at the same time I wonder why they don't simply compute 1 RCP (to be reused for all the evaluated interpolants within a pixel) and N muls..
I thought dividers were expensive.. at the same time, if your hw has a divider then you can save an RCP..
But they do (although I don't think they reuse rcp(lerp(1/W)), register pressure and things...). Their DIV is just a macro instruction for RCP+MUL.
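
For readers following along, the two options being compared can be sketched in Python as below. This is purely illustrative: the function names and sample values are invented, and whether any real hardware actually reuses the reciprocal is exactly what is in doubt above.

```python
def correct_with_divides(attrs_over_w, one_over_w):
    """N interpolants, N divides: one DIV per attribute."""
    return [a / one_over_w for a in attrs_over_w]


def correct_with_shared_rcp(attrs_over_w, one_over_w):
    """One RCP shared by all interpolants in the pixel, then N MULs."""
    rcp = 1.0 / one_over_w                  # the single RCP
    return [a * rcp for a in attrs_over_w]  # one MUL per attribute


# Hypothetical interpolated values at one pixel: lerp(U/W), lerp(V/W), ...
attrs_over_w = [1.0 / 6.0, 0.2, 0.4]
one_over_w = 2.0 / 3.0                      # lerp(1/W)

by_div = correct_with_divides(attrs_over_w, one_over_w)
by_rcp = correct_with_shared_rcp(attrs_over_w, one_over_w)

# The two agree up to floating-point rounding in the last bit or so.
max_diff = max(abs(a - b) for a, b in zip(by_div, by_rcp))
print(by_div, by_rcp, max_diff)
```

If DIV is expanded as RCP+MUL per attribute and the reciprocal is not kept around, the cost is N RCPs plus N MULs; keeping it saves N-1 RCPs at the price of holding one extra register live.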

satein
08-Nov-2006, 20:53
Yes, it may be not that rigid... but it may help heating up a bit.

Now comes Fuad saying that...
ATI has R600 silicon out [Game developers and special customers have it] (http://www.theinquirer.net/default.aspx?article=35614)

JasonCross
08-Nov-2006, 21:20
This is almost the direct opposite of what I see happening. ATI is a low-margin, low-ROI business compared to AMD's core. AMD is far more likely to halve their resource allocation to it than double it, IMO.


I personally think AMD/ATI getting out of the $500+ graphics card business is about as likely as AMD getting out of the $500+ CPU business.

I think the company recognizes that there is a place for high-end, high-cost, high-profit products that sell to a very small minority, so long as the engineering effort that goes into that architecture can be leveraged into more affordable, mass-market products. With GPUs it's not quite the same as just speed-binning CPUs and maybe halving the cache, but the architectures are designed to be quite scalable - and unified shaders only makes that more the case.

Pete
08-Nov-2006, 23:39
Besides the marketing and overall product-line advantage of having a prestige product, surely some (if not all) of the engineers would prefer to be working on such a marvel rather than always plugging away on economic, efficient, affordable GPUs? It's a payoff that seems to work for everyone (designer, manufacturer, seller, buyer).

gostriker
09-Nov-2006, 02:04
When I visited nVidia earlier this year, I talked to a couple of guys who used to work for ATI. They both felt that nVidia had a vastly more consumer-centric company culture than ATI.

b3d forums are certainly looking more like a fan site these days. I miss the discussion on the tech. I know, am leaving. :oops:

Chalnoth
09-Nov-2006, 03:39
Besides the marketing and overall product-line advantage of having a prestige product, surely some (if not all) of the engineers would prefer to be working on such a marvel rather than always plugging away on economic, efficient, affordable GPUs? It's a payoff that seems to work for everyone (designer, manufacturer, seller, buyer).
Well, from an engineering standpoint, budget GPU's can be every bit as challenging to build. Sure, they're not quite as cool, but you also don't really see the big picture much when you're working on a little piece of a large project. So I don't really buy that argument.

However, having a competitive company culture can certainly be a great driving force for all employees.

dnavas
09-Nov-2006, 05:19
For interpolant-based reads, perspective correction is pretty much always required. For dependent reads, there is at least some chance that the coordinates being used need to be multiplied by some value, so the MUL really isn't wasted that often.

I think what I was trying to say was that the first Vec4 was underutilized because two of its units are kept busy -- whether the other two are able to be used is a bit iffy. Not that the MUL (or DIV, or what have you :) ) is wasted, only that the Vec4 unit is -- that was the cost I was referring to. In the G80 they seem to have fixed that and invented a different problem (where else is that MUL used?), but, this is the R600 thread.... :)

I am encouraged to hear that things still appear to be on track for early next year. While post-Xenos and G80, the architecture of R600 might be anticlimactic, I do hope to see a significant rise in ALU power, which should prove to be very exciting for not just current games, but the whole gpgpu movement. The accessibility of that power will also be interesting to track.

I eagerly await leaks :D

Jakob
10-Nov-2006, 00:32
R600 can't be THAT bad if they are seeding developers and researchers. I take that news as a strong positive.

trinibwoy
10-Nov-2006, 02:46
Heh, I found this post quite amusing....

b3d forums are certainly looking more like a fan site these days. I miss the discussion on the tech. I know, am leaving. :oops:

considering it closely followed this one....

But they do (although I don't think they reuse rcp(lerp(1/W)), register pressure and things...). Their DIV is just a macro instruction for RCP+MUL.

That enough tech for ya? :razz:

trinibwoy
10-Nov-2006, 02:51
Now that we know what G80 is capable of both in terms of performance and IQ what are some possibilities for an R600 upset? I would think bandwidth is the most obvious one. Do the benchmarks that have R580 putting up a good fight against G80 tell us anything about a possible kink in Nvidia's armor come R600 time?

INKster
10-Nov-2006, 03:15
Now that we know what G80 is capable of both in terms of performance and IQ what are some possibilities for an R600 upset? I would think bandwidth is the most obvious one. Do the benchmarks that have R580 putting up a good fight against G80 tell us anything about a possible kink in Nvidia's armor come R600 time?

Unless power issues prevent that. One of the main barriers for power parity may be the Ring Bus.
I, for one, find it impressive that Nvidia implemented so much innovation by using only a relatively simple crossbar.
The bar is very high with G80. We can only hope that ATI didn't get caught with their pants down during the 2 years of deliberate disinformation by Nvidia, because it is now too late for last minute major redesigns to the architecture.
Either way, i'm expecting it, at the very least, to have competitive performance.

Rangers
10-Nov-2006, 03:28
Originally posted by gostriker:
b3d forums are certainly looking more like a fan site these days. I miss the discussion on the tech. I know, am leaving.

Really. Every thread has turned into ATI is d00med crap. I move it be banned or set aside to one thread. Call it the ATI is d00med thread and leave it at that, set off from the rest.

Geo
10-Nov-2006, 05:46
Really. Every thread has turned into ATI is d00med crap. I move it be banned or set aside to one thread. Call it the ATI is d00med thread and leave it at that, set off from the rest.

Yes, that's the plan right there. Fight wild exaggeration with wild exaggeration. It's the internet after all. . .

Look, NV is having a very good week. They get to have them too. It's unfortunate that some horse race people can't enjoy one side having a good week without succumbing to the temptation to bury the other side . . .

For enthusiasts it's all about the top end, sure, but in case nobody noticed it's roughly 5-10% of the market, and where ATI has really been getting killed until just recently isn't there (which you'd never know from the Mercury numbers, btw, because they consider an X1600 to be a high end part). . . it's been in the mid-range which is much much larger. And now ATI has some nice parts in those price ranges, and G80 --wonderful part that it is-- ain't gonna help at all there for some months until the rest of the family shows up.

So, y'know, celebrate NV this week because they deserve it, but let's put the funeral rites for ATI on hold because they are way premature.

SugarCoat
10-Nov-2006, 05:59
Now that we know what G80 is capable of both in terms of performance and IQ what are some possibilities for an R600 upset? I would think bandwidth is the most obvious one. Do the benchmarks that have R580 putting up a good fight against G80 tell us anything about a possible kink in Nvidia's armor come R600 time?

If they can do a pretty consistent 100% improvement, from day 1 launch, over the current X1950XT, they'll have something worth waiting for.

As far as X1950XT vs 8800GTX goes, I've noticed the X1950XT currently scales better with AA as resolution increases, at least in percentage terms. But that's not something I would count on automatically being there for the X2800, considering the architecture break. Plus that's also something I think we can count on getting improved upon as time progresses and drivers mature.

They also could improve their filtering (specifically mipmap levels) for sure. Visually, right now I'd say the 8800 and ATI HQ are on par, but there's still room for improvement ....plus totally free 4x FSAA thanks to the huge bus width. (buhahahahah)

The problem won't be the 8800GTX; the problem is going to be keeping a decent percentage lead above not only the GTX but also nVidia's refresh part, which is inevitable.

Doing all that, and perhaps throwing it on a smaller, more case-friendly PCB with a sexy demo for launch better than Toyshop, and they'd have something ;).

Bjorn
10-Nov-2006, 06:37
They also could improve their filtering (specifically mipmap levels) for sure. Visually, right now I'd say the 8800 and ATI HQ are on par, but there's still room for improvement ....plus totally free 4x FSAA thanks to the huge bus width. (buhahahahah)

From what I've seen in the reviews so far, the 8800 definitely has better IQ (AF, for example) than what ATI (AMD) has right now. And they also support 8X MSAA plus the different CSAA modes. On the other hand, ATI has had their 6X MSAA for ages, so I'd be rather surprised if they didn't do something better for this generation.

^eMpTy^
10-Nov-2006, 06:56
In regards to nVidia's better marketing. I agree ATi is handing them another 3 or 4 months of domination and high margins with R600. But I think the biggest problem for ATi has been their complete inability to build a cost-effective high performance midrange part. nVidia hit the bulls eye two gens in a row with the 6600GT and 7600GT.

Overall I think ATi makes fantastic hardware, R580 was a thing of beauty...but they need to learn how to effectively target the midrange and stop jumping on every new manufacturing process that comes around (90nm with R520 and god knows what with R600)...

I do have a question I wanted to pose to you guys...what are your thoughts on the 384bit memory bus on G80? Aggressive? Expected?

Do you guys think R600 will go for 512bit? Stick with 256? Or go to 384?

Ailuros
10-Nov-2006, 07:06
In regards to nVidia's better marketing. I agree ATi is handing them another 3 or 4 months of domination and high margins with R600. But I think the biggest problem for ATi has been their complete inability to build a cost-effective high performance midrange part. nVidia hit the bulls eye two gens in a row with the 6600GT and 7600GT.

Overall I think ATi makes fantastic hardware, R580 was a thing of beauty...but they need to learn how to effectively target the midrange and stop jumping on every new manufacturing process that comes around (90nm with R520 and god knows what with R600)...

I do have a question I wanted to pose to you guys...what are your thoughts on the 384bit memory bus on G80? Aggressive? Expected?

Do you guys think R600 will go for 512bit? Stick with 256? Or go to 384?

Agreed on most of the points; as for R600's bus width, I most certainly won't judge an architecture or its possible performance on bus width alone, especially if I don't know how each architecture handles its bandwidth. Purely theoretically, GDDR4@256 can deliver as much bandwidth as GDDR3@384; one could think of a couple of points regarding memory costs/availability and things like upwards or downwards scalability.

Even then I'd first prefer to have a clear insight into how competing GPU B handles its bandwidth; granted, I don't expect R600 to be a TBDR, yet thinking about that corner case shows that it could get along with way less bandwidth than an "IMR" of today.

Careful with sterile numbers; they provide only one side of the story but can be misleading at times.
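
The "GDDR4@256 vs GDDR3@384" point is plain arithmetic: peak bandwidth is bus width in bytes times the per-pin data rate. A rough Python sketch follows; the 1.8 Gbps GDDR3 figure is only an illustrative assumption, not a number given in this thread, and the helper names are made up.

```python
def bandwidth_gb_per_s(bus_width_bits, data_rate_gbps_per_pin):
    """Peak theoretical bandwidth in GB/s for a given bus width and pin rate."""
    return bus_width_bits / 8 * data_rate_gbps_per_pin


def required_data_rate(target_gb_per_s, bus_width_bits):
    """Per-pin data rate (Gbps) a bus of this width needs to hit the target."""
    return target_gb_per_s * 8 / bus_width_bits


# Assumed, illustrative GDDR3 data rate on a 384-bit bus.
gddr3_384 = bandwidth_gb_per_s(384, 1.8)           # roughly 86.4 GB/s

# Data rate a 256-bit bus would need to match it (1.5x the pin rate).
needed_rate = required_data_rate(gddr3_384, 256)   # roughly 2.7 Gbps
gddr4_256 = bandwidth_gb_per_s(256, needed_rate)

print(gddr3_384, needed_rate, gddr4_256)
```

So the narrower bus needs memory that runs 1.5 times faster per pin to draw level on paper; how much of that peak either architecture actually uses is a separate question.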

Shtal
10-Nov-2006, 07:12
If the rumors are true, ATI will have a few advantages over G80.
R600 512-bit bus GDDR4 vs. G80 384-bit bus GDDR3
R600 32 ROPs vs. G80 24 ROPs
R600 64 pipelines vs. G80, which you can't exactly call 32 pipelines
R600 80nm tech vs. G80 90nm
R600 600-650MHz GPU vs. G80 570-580MHz GPU

If this turns out to be true 3 months from now, it might be worth the wait for the mighty R600 to flex its muscle.
Also, if this information about R600 is true, I'm not sure Nvidia will be able to raise the GPU clock enough to match R600.

But it is also possible that ATI will stay with a 256-bit memory bus.

The only sad part is if this is ATI's last fight, just like 3dfx with the Voodoo 5....

^eMpTy^
10-Nov-2006, 07:16
Agreed on most of the points; as for R600's bus width, I most certainly won't judge an architecture or its possible performance on bus width alone, especially if I don't know how each architecture handles its bandwidth. Purely theoretically, GDDR4@256 can deliver as much bandwidth as GDDR3@384; one could think of a couple of points regarding memory costs/availability and things like upwards or downwards scalability.

Even then I'd first prefer to have a clear insight into how competing GPU B handles its bandwidth; granted, I don't expect R600 to be a TBDR, yet thinking about that corner case shows that it could get along with way less bandwidth than an "IMR" of today.

Careful with sterile numbers; they provide only one side of the story but can be misleading at times.

I completely understand what you mean. In fact, given that G80 does have support for GDDR4, it's obvious that nVidia went with GDDR3 simply because they didn't need any more bandwidth, and wanted to boost their margins by choosing cheaper RAM.

My question is a purely theoretical one. I was caught a little off guard by the whole 384bit bus thing. I thought we'd be stuck at 256 for a while due to the complexity of going wider. But now that I see G80, I'm not sure.

So my question is, is 384bit an ambitious design or is it simply the next logical step whose time had come? And do you think ATi will follow suit, stay at 256, or go even more ambitious and skip straight to 512bit?

Even if ATi stays at 256, they can likely get more than enough bandwidth out of GDDR4, so in the end it doesn't really matter. I'm just curious from an architectural standpoint.

Ailuros
10-Nov-2006, 07:40
How can anyone be certain nowadays with all the misguidance that occurs prior to launches? Of all the guesses I'd make, I'd say that the 512bit one would be the least likely, but don't put your money on it either cause I really don't know.

R300King!
10-Nov-2006, 07:46
Is the G80 maxing out its bandwidth right now or no? If so, in which tests or games?

Also, the G80 is capable of GDDR4 too, so when it does come out, it will increase its bandwidth further.

I personally think ATI-AMD (DAAMIT) should be releasing something right now in terms of performance figures for their upcoming R600. All this silence does is let Nvidia sell boatloads of G80s to people who are "looking out" for something on the horizon. If the waters are dead calm then of course the decision to get the G80 now becomes clearer. If we saw that R600 beats the G80 by even 10~20%, I think many people would wait to purchase.

Of course, this could mean DAAMIT doesn't have the goods to beat the mighty G80 and so it's better left quiet. If they are having clockspeed problems (as I've heard on the boards) then it's quite possible this is the cause of the delay and of the lack of any info on performance whatsoever. If that's the case then.... :cry:

Shtal
10-Nov-2006, 08:11
The question has been raised about 384bit mem vs. 512bit mem.

For example, comparing ATI X1900XTX bandwidth vs. X1950XTX bandwidth, you only see a small margin between the two of them. There could be two reasons.
A. Games right now don't take advantage of the X1950XTX's bandwidth.
-OR-
B. You need a more powerful, hungrier GPU to take advantage of more memory bandwidth.

I just don't see what R600 will do with 512bit memory running 2GHz+ GDDR4.

nAo
10-Nov-2006, 08:17
It seems to me that making an efficient use of GDDR4 memory is harder wrt GDDR3.
Things probably get more interesting when the amount of data moved per pixel increases (FP64, FP12, AA, etc..)

Ailuros
10-Nov-2006, 09:10
The question has been raised about 384bit mem vs. 512bit mem.

For example, comparing ATI X1900XTX bandwidth vs. X1950XTX bandwidth, you only see a small margin between the two of them. There could be two reasons.
A. Games right now don't take advantage of the X1950XTX's bandwidth.
-OR-
B. You need a more powerful, hungrier GPU to take advantage of more memory bandwidth.

I just don't see what R600 will do with 512bit memory running 2GHz+ GDDR4.

I wouldn't judge things that much by bandwidth and its efficiency on the R580+ in comparison to the possibilities on R600.

ChrisRay
10-Nov-2006, 09:18
Is the G80 maxing out its bandwidth right now or no? If so, in which tests or games?

Also, the G80 is capable of GDDR4 too, so when it does come out, it will increase its bandwidth further.

I personally think ATI-AMD (DAAMIT) should be releasing something right now in terms of performance figures for their upcoming R600. All this silence does is let Nvidia sell boatloads of G80s to people who are "looking out" for something on the horizon. If the waters are dead calm then of course the decision to get the G80 now becomes clearer. If we saw that R600 beats the G80 by even 10~20%, I think many people would wait to purchase.

Of course, this could mean DAAMIT doesn't have the goods to beat the mighty G80 and so it's better left quiet. If they are having clockspeed problems (as I've heard on the boards) then it's quite possible this is the cause of the delay and of the lack of any info on performance whatsoever. If that's the case then.... :cry:


From what I have seen from people overclocking their 8800GTS, they don't seem to get much from overclocking the memory bandwidth (marginal performance gains compared to core overclocking). Even with less bandwidth than an 8800GTX, a 620 MHz core GTS can give a stock 8800GTX a run for its money and often outperform it. So it's really hard to say just how much the extra bandwidth is benefiting the GeForce 8800GTX.

Ailuros
10-Nov-2006, 09:23
Is the G80 maxing out its bandwidth right now or no? If so, in which tests or games?

Try 8x multisampling at its highest resolution.

I personally think ATI-AMD (DAAMIT) should be releasing something right now in terms of performance figures for their upcoming R600. All this silence does is let Nvidia sell boatloads of G80s to people who are "looking out" for something on the horizon. If the waters are dead calm then of course the decision to get the G80 now becomes clearer. If we saw that R600 beats the G80 by even 10~20%, I think many people would wait to purchase.

And a paper launch would be a good idea at this point, when its projected release, according to my understanding, will take place in Q1 07? Needless to say, without anything final yet it's way too risky to leak any performance numbers. Last but not least, why warn the competition about what's about to come? To give them even more ammunition for their upcoming G8x refresh?

Of course, this could mean DAAMIT doesn't have the goods to beat the mighty G80 and so it's better left quiet. If they are having clockspeed problems (as I've heard on the boards) then it's quite possible this is the cause of the delay and of the lack of any info on performance whatsoever. If that's the case then.... :cry:

Some say that silence is golden, and it applies in more than one case; considering the above, there might be a different perspective.

ChrisRay
10-Nov-2006, 09:29
Ailuros does bring up a good point: with 4xAA being single cycle through the ROPs now, bandwidth will be less of a concern. But 8xQ will definitely hit much bigger bandwidth bottlenecks.

Jawed
10-Nov-2006, 13:31
It seems to me that making an efficient use of GDDR4 memory is harder wrt GDDR3.
Things probably get more interesting when the amount of data moved per pixel increases (FP64, FP12, AA, etc..)
R5xx can access GDDR3 as 8x32 - as opposed to traditional 4x64. With GDDR4 the burst length doubles, doesn't it? It seems to me that the combination of R5xx and GDDR4 results in access patterns that are similar to 4x64 GDDR3 GPUs.

So, what happens to a GPU that accesses GDDR3 as Nx64 when it's faced with GDDR4's double-length burst?

Jawed
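
To put numbers on the 8x32 vs. 4x64 point above, here is a minimal Python sketch of the minimum access size per channel, taken as channel width times burst length. The burst lengths used (4 for GDDR3, 8 for GDDR4) are assumptions for illustration, as are the names; the point is only that doubling the burst on a 32-bit channel lands on the same granularity as a 64-bit channel with the shorter burst.

```python
# Assumed burst lengths, for illustration only.
BURST_LENGTH = {"GDDR3": 4, "GDDR4": 8}


def access_granularity_bytes(channel_bits, memory_type):
    """Smallest chunk one channel fetches per burst: width (bytes) x burst length."""
    return channel_bits // 8 * BURST_LENGTH[memory_type]


for memory_type in ("GDDR3", "GDDR4"):
    for channel_bits in (32, 64):
        size = access_granularity_bytes(channel_bits, memory_type)
        print(f"{memory_type}, {channel_bits}-bit channel: {size} bytes per burst")
```

With these assumed burst lengths, a 64-bit channel on GDDR4 would fetch twice as much per access as it does on GDDR3, which is the case the question is really about.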

nAo
10-Nov-2006, 14:13
So, what happens to a GPU that accesses GDDR3 as Nx64 when it's faced with GDDR4's double-length burst?

Dunno, but we'll see that soon. btw..what happens when you go from a 256 to a 512 data bus..do you have 16 32bit wide channels or 8 64bit wide channels? :)

Jawed
10-Nov-2006, 14:21
Dunno, but we'll see that soon. btw..what happens when you go from a 256 to a 512 data bus..do you have 16 32bit wide channels or 8 64bit wide channels? :)
Presuming that the "narrowness", 32-bit, was a key goal of ATI's MC design, and with ATI being the "lead architect" for GDDR4, I think it's reasonable to pin my hopes on it being 16x32...

Jawed

dnavas
10-Nov-2006, 14:49
How do we lay out this board? If you take something like http://www.beyond3d.com/reviews/nvidia/g80-arch/images/board-naked-big.jpg where are the other four chips going to go?

Jawed
10-Nov-2006, 14:59
On the back. They run cool so won't need substantial (any?) heatsinking.

Jawed

no-X
10-Nov-2006, 14:59
e.g. like this?

http://img204.imageshack.us/img204/6591/firestreamfrontzr9.th.jpg (http://img204.imageshack.us/img204/6591/firestreamfrontzr9.jpg) http://img204.imageshack.us/img204/821/firestreambacksn3.th.jpg (http://img204.imageshack.us/img204/821/firestreambacksn3.jpg)

Jawed
10-Nov-2006, 15:02
Precisely, 1GB.

The difference being that's implemented as 2 chips per "32-bit channel".

Jawed

dnavas
10-Nov-2006, 15:04
Good point -- they do seem to have done that before.

flopper
10-Nov-2006, 15:39
I'm starting to wonder about ATI/AMD.
Now we've seen the 8800GTX crush anything on the market, thanks to the new generation.

It's silent from ATI, either because they know they will beat the 8800GTX by a good margin and are getting ready for the G81 refresh,
or simply because
the numbers are the same or similar and the deciding factor will be how well the cards make DX10 shine.

However interesting and good the DX9 numbers are, DX10 will be what matters for gameplay if it is much more efficient and looks better. The cards then have to run games in DX10, and no numbers are available yet to show what the cards do there.
I play mostly BF2142 and BF2 and some MMOGs online, which is my main play arena.
As of now, my X1950XTX runs those smooth and great at 1600x1200.

However, if I want to get Vista and have DX10, then I would like to know which card to get, since that is more interesting than today's massive boost in DX9.

3DMark is all well and good, but frames and eye candy are what is useful to me. Even though I like the tech behind it and read forums such as this one, which by the way has a great, high level of technical discussion, in the end the question is: what does the card produce in real gameplay?

That is why I would like to see ATI/AMD produce such numbers when the time is due.
DX10 is the name of the game next year. However, DX9 will still be the main arena.
Vista will set a new standard for gamers or, my guess, it will flop big time.

BrynS
10-Nov-2006, 17:51
Uttar's concise transcript (http://www.beyond3d.com/forum/showthread.php?t=35498) of NV's Q3 CC suggests that NV know that R600 will be a wee bit portly. :razz: EDIT: As per Uttar's clarification (http://www.beyond3d.com/forum/showpost.php?p=870630&postcount=6), Jen-Hsun's comment doesn't necessarily imply specific knowledge of R600 die size.

[...]
Q: Question on die size. G80 die size looks much bigger, losing the small die size advantage? Gross margins?
A: Second question first. This time around, the competition has an infinitely large die size. The die size is higher, but the ASPs are also higher.
[...]

Bouncing Zabaglione Bros.
10-Nov-2006, 17:56
Uttar's concise transcript (http://www.beyond3d.com/forum/showthread.php?t=35498) of NV's Q3 CC suggests that NV know that R600 will be a wee bit portly. :razz:

Apparently that's just a weak joke because R600 is MIA, hence "infinitely large". Don't read too much into it. Probably just a way for Jen-Hsun to divert the question on G80 margins given its large die size.

Geeforcer
10-Nov-2006, 17:59
Uttar's concise transcript (http://www.beyond3d.com/forum/showthread.php?t=35498) of NV's Q3 CC suggests that NV know that R600 will be a wee bit portly. :razz: EDIT: As per Uttar's clarification (http://www.beyond3d.com/forum/showpost.php?p=870630&postcount=6), Jen-Hsun's comment doesn't necessarily imply specific knowledge of R600 die size.

No, the exchange went like this:

Analyst: "Before you had a die-size advantage, but no longer with next generation... what gives?"
Jen: "Well, competitor's next-gen product is not even out - so you can't really say who has die size advantage. Right now, their die size is infinitely large"

Translation: Let’s see how big R600 is before we start talking about die size advantage.

Ailuros
10-Nov-2006, 18:11
...and I thought that size matters after all :blush:

SugarCoat
10-Nov-2006, 21:27
From what i've seen in the reviews so far, the 8800 definitely has better IQ (AF f.e) than what ATI (AMD) has right now. And they also support 8X MSAA + the different CSAA modes. On the other hand, Ati has had their 6X MSAA for ages so i'd be rather surprised if they didn't do something better for this generation.

I agree nVidia has better AA again, which isn't abnormal, their AA is usually quite good anyway. But from most of the screenshots i've seen, especially in Oblivion levels where there are a lot of stones on a walkway, IQ is pretty much identical short of whipping out the magnifying glass, and both even have mipmap issues in the same sections where you can see it change from one level to the next. Perhaps it's a title thing.

Mark
10-Nov-2006, 22:01
Assuming GDDR4 @ 1.4GHz on a 256bit bus, that gives you 89.6GB/s bandwidth... on a 512bit bus it gives you 179.2GB/s... the 8800 GTX has 86.4GB/s.

I'm thinking ATI will go with the design that makes the R600 competitive, not the one that makes it competitive+uber-expensive to produce.

Arun
10-Nov-2006, 22:31
Assuming GDDR4 @ 1.4GHz on a 256bit bus, that gives you 89.6GB/s bandwidth... on a 512bit bus it gives you 179.2GB/s... the 8800 GTX has 86.4GB/s.

I'm thinking ATI will go with the design that makes the R600 competitive, not the one that makes it competitive+uber-expensive to produce.
Words of wisdom. Someone give this man a cookie...! :)


Uttar

Moloch
10-Nov-2006, 22:51
I'm starting to wonder about ATI/AMD.
Now we've seen the 8800GTX crush anything on the market, thanks to the new generation.

It's silent from ATI, either because they know they will beat the 8800GTX by a good margin and are getting ready for the G81 refresh,
or simply because
the numbers are the same or similar and the deciding factor will be how well the cards make DX10 shine.

However interesting and good the DX9 numbers are, DX10 will be what matters for gameplay if it is much more efficient and looks better. The cards then have to run games in DX10, and no numbers are available yet to show what the cards do there.
I play mostly BF2142 and BF2 and some MMOGs online, which is my main play arena.
As of now, my X1950XTX runs those smooth and great at 1600x1200.

However, if I want to get Vista and have DX10, then I would like to know which card to get, since that is more interesting than today's massive boost in DX9.

3DMark is all well and good, but frames and eye candy are what is useful to me. Even though I like the tech behind it and read forums such as this one, which by the way has a great, high level of technical discussion, in the end the question is: what does the card produce in real gameplay?

That is why I would like to see ATI/AMD produce such numbers when the time is due.
DX10 is the name of the game next year. However, DX9 will still be the main arena.
Vista will set a new standard for gamers or, my guess, it will flop big time.
Perhaps the silence is the calm before the storm :wink:

trinibwoy
10-Nov-2006, 23:01
Words of wisdom. Someone give this man a cookie...! :)

Assuming that's the case, wouldn't AMD (forcing myself to accept it!! :)) be terribly concerned with Nvidia's possible advantage if they should decide to move to high speed GDDR4 on a 384-bit bus? That same 1.4GHz memory is gonna give them 134GB/s of bandwidth. 1GHz gives them 96GB/s.

What's going to happen next summer when Nvidia's $350-$450 card is rocking a 320-bit bus with some el-cheapo 1Ghz GDDR4 and 80GB/s bandwidth?
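
The bandwidth figures being thrown around in the last few posts all come from the same arithmetic, sketched here in Python as a rough check. It assumes double-data-rate memory, so peak GB/s = (bus width / 8) x 2 x memory clock in GHz; the configuration labels are only descriptions of the cases mentioned in the thread, not confirmed products, and the 384-bit / 0.9 GHz row is the assumed 8800 GTX setup.

```python
def peak_bandwidth_gbs(bus_bits, mem_clock_ghz):
    """Theoretical peak bandwidth in GB/s for double-data-rate memory."""
    return bus_bits / 8 * 2 * mem_clock_ghz


# (description, bus width in bits, memory clock in GHz)
configs = [
    ("256-bit GDDR4 @ 1.4 GHz", 256, 1.4),
    ("512-bit GDDR4 @ 1.4 GHz", 512, 1.4),
    ("384-bit GDDR4 @ 1.4 GHz", 384, 1.4),
    ("384-bit GDDR4 @ 1.0 GHz", 384, 1.0),
    ("320-bit GDDR4 @ 1.0 GHz", 320, 1.0),
    ("384-bit GDDR3 @ 0.9 GHz (assumed 8800 GTX setup)", 384, 0.9),
]

for label, bus_bits, clock_ghz in configs:
    print(f"{label}: {peak_bandwidth_gbs(bus_bits, clock_ghz):.1f} GB/s")
```

These are theoretical peaks only; how much of that a given memory controller actually sustains is a separate question.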

Apple740
10-Nov-2006, 23:02
I want to see some "leaked" Ati slides where the R600 is ~40% faster than the 8800GTX in Crysis@DX10. :)

trinibwoy
10-Nov-2006, 23:04
Why not just make them yourself? They'll be just as relevant / believable ;)

Rangers
10-Nov-2006, 23:31
So I was thinking, what kind of specs does R600 need to beat G80, now that we know what G80 is?

(take with grain of salt that I know very little about GPU's)

Well, breaking G80 down by ALU's, it has 128 1D, which is "like" 32 of the old 4D kind. But they are double pumped so it is more like 64.

So let's say ATI is going to do a Xenos style chip as they have said. 96 4D ALU's would seem to provide them with a nice performance margin. Nvidia though would presumably have more efficiency to close the gap, and the direct comparison is only valid if R600 is clocked at 675 or more (since the Nvidia ALU's are 675 double pumped).

However Xenos is a small chip (~230m) with 48 alu's, so it seems like they would have lots of room for this.

They'd need at LEAST 32 TMU's for the throughput to compete. Hell, it seems to me the throughput is much more important than the shader power in determining how fast these chips are. It seems unlikely G80 is really stressing its massive shader power with today's games. Yet it is much faster, which must be due to the throughput. I'd like to see 64 full blown TMU's.

Then I'd like to see an 800 MHz clock. This might be unlikely though, it seems these big chips are taking a clock step backwards judging by G80. But who knows, ATI's design may have different capabilities.

Chalnoth
10-Nov-2006, 23:56
Well, I'd say the question will come down to nVidia's decision to go with fewer higher-clocked ALU's instead of more lower-clocked ALU's. I don't think that memory bandwidth will be a huge issue. It may rob or add a few percent here and there, but won't make or break the architecture. I don't think there's going to be a huge difference in capabilities, though it would surprise me if ATI also went with a scalar architecture.

hoom
11-Nov-2006, 00:23
Uhm, Rangers, you start at G80 being equivalent to 64 4D ALUs at 675MHz, which is fine.

But then you start talking about 96 4D ALUs at 800MHz as being competition?! That's NV going 'that's not a knife, this is a knife' & ATI whipping out an SSBN.

A 96 ALU R600 should only need to hit something like 450MHz to be competitive.
If it's 64 ALUs it needs to be 675MHz.

*all assuming same IPC per 4D ALU which we know isn't actually true
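
The equivalence above can be sketched in a few lines of Python. This is only the raw lane-count-times-clock comparison used in the post, under its own stated assumption of equal work per lane per clock; it takes G80 as 128 scalar ALUs at 1350 MHz (the double-pumped 675), and the function names are made up for illustration.

```python
def component_ops_per_us(alus, lanes_per_alu, clock_mhz):
    """Raw component operations per microsecond: ALUs x lanes x clock (MHz)."""
    return alus * lanes_per_alu * clock_mhz


def clock_to_match_mhz(target_ops, alus, lanes_per_alu):
    """Clock (MHz) at which the given ALU array reaches target_ops."""
    return target_ops / (alus * lanes_per_alu)


# Assumed G80 figures: 128 scalar ALUs at 1350 MHz.
g80 = component_ops_per_us(128, 1, 1350)

for alus in (64, 96):
    print(f"{alus} 4D ALUs need {clock_to_match_mhz(g80, alus, 4):.0f} MHz to match")

ratio = component_ops_per_us(96, 4, 800) / g80
print(f"96 4D ALUs at 800 MHz would be {ratio:.2f}x the raw G80 rate")
```

As the footnote says, this ignores per-ALU efficiency entirely, so it is a ceiling comparison, not a performance prediction.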

INKster
11-Nov-2006, 00:30
Another R600 rumour, and it's not from the Inq (i think...:wink:):

http://www.vr-zone.com/?i=4293

R300King!
11-Nov-2006, 00:54
Sounds like a power beast to me. :D I just hope its performance is equal to its power consumption. ;)

Rangers
11-Nov-2006, 01:06
Uhm, Rangers, you start at G80 being equivalent to 64 4D ALUs at 675MHz, which is fine.

But then you start talking about 96 4D ALUs at 800MHz as being competition?! That's NV going 'that's not a knife, this is a knife' & ATI whipping out an SSBN.

A 96 ALU R600 should only need to hit something like 450MHz to be competitive.
If it's 64 ALUs it needs to be 675MHz.

*all assuming same IPC per 4D ALU which we know isn't actually true

You're just judging by shader power though, which is far from the only factor here. Texture capabilities will also be key, and historically that is where ATI might be deficient.

Not to mention, the scalar ALU's are supposed to be more efficient, right? So with R600 having 50% more raw shader capability, I'm presuming Nvidia might make a chunk of that deficit back on efficiency. And if ATI's chip isn't at least 675MHz, a hefty target, they'll make some more back there (all this pretending that the architectures are ALU-comparable, which of course they aren't). And by the time R600 comes out, it'll probably be dealing with a G80 refresh to boot.

Xenos seems to give good indication they ought to be able to get at least 96 ALU's in there to me, though. Or even R580.

Razor1
11-Nov-2006, 01:14
You're just judging by shader power though, which is far from the only factor here. Texture capabilities will also be key, and historically that is where ATI might be deficient.

Not to mention, the scalar ALU's are supposed to be more efficient, right? So with R600 having 50% more raw shader capability, I'm presuming Nvidia might make a chunk of that deficit back on efficiency. And if ATI's chip isn't at least 675MHz, a hefty target, they'll make some more back there (all this pretending that the architectures are ALU-comparable, which of course they aren't). And by the time R600 comes out, it'll probably be dealing with a G80 refresh to boot.

Xenos seems to give good indication they ought to be able to get at least 96 ALU's in there to me, though. Or even R580.


Sounds reasonable, but ATi will have to improve on the branching performance of Xenos (batch size of 64), which would mean cutting down the size of their SIMD arrays, which will increase control silicon. By what amount I don't know, but in any case the chip will end up somewhat larger than Xenos even with the same 48 ALU's, and then you have to factor in all the extras: DX10 functionality, AVIVO, etc. I don't think they will have an issue reaching the desired clock speed; they should be able to hit 600+. Power might be a different story. Even with a 96-shader array they shouldn't have any problems going to 600, even if it was on 90nm, and we are pretty much certain it's on 80nm, though the 80nm process won't yield much more in clocks than 90nm. IMHO it will come down to how much power is going to be needed to drive it.

Pete
11-Nov-2006, 03:24
Do the rumors about the longer-than-G80GTX-PCB, high # of PCB layers, and high power draw support a 512-bit external memory bus (and so lots of RAM chips), or just lots of power draw (necessitating lots of power-massaging circuitry) from a power-hungry die?

B/c it seems to me that the external bus width would be key to determining the rest of the chip. It would seem to lead to 32 ROPs, at least 32 TMUs, and probably way over 64 shader "cores," considering both R580 and Xenos.

I'm skeptical ATI will have separate clock domains, and also skeptical that they'll put out a 64-shader core flying at well over 650MHz (I'm thinking RAM clocks, so close to 1GHz). So it seems to me that, barring G80-style double-clocked ALUs, R600'll just be brutally wide.

/clueless guessing

Jawed
11-Nov-2006, 03:34
Razor, I don't think anyone round here knows much about clocks on 80nm or how similar the 80nm node is to 90nm (especially as there are different grades at each node optimised for power or raw speed).

It's interesting that G80's shader-ALU clock, ~1350MHz, running on 90nm is the sort of thing that would never have been predicted before the rumours started. It isn't a minor thing NVidia has done with the 90nm process.

I'm not saying that R600 will run at similar rates - I'm merely using it as an example of something that's radically different from what we're used to.

Put another way, it seems fairly fruitless trying to argue configuration (branching granularity, etc.) based on clocks. If we'd known G80 was ~1.2GHz, back at the beginning of the year, would that have gotten us any closer to knowing the rest of the architecture?

Jawed

Razor1
11-Nov-2006, 03:44
That is true, I wouldn't be surprised if ATi has been looking into this too, something like Fast14. But here is the problem: ATi would have to take a large step away from their current gens to do something like what nV did with their shader processors and start custom building libraries. I don't think ATi is taking that big of a step, at least not yet anyways. This is the only way to go in the future though; nV might have opened a can of worms, and this is where AMD can really give ATi an advantage.

Well, if we need an indication about 80nm, why did ATi go with strained silicon for their latest notebook GPU? Strained silicon is expensive, last I heard 2 times the cost of regular silicon, and SOI is also around 2 times the cost of regular silicon. Granted we are talking about a next gen product, but still, it all depends on whether ATi has taken the approach of hand building their libraries and also whether they went with simpler ALU's. If either one is missing they won't be going for extremely high clocks and conservative power usage.

Pete, I'm skeptical about the different clocks on ATi products as well. They might be doing it though; after seeing what nV has done, I'm sure it's crossed their mind, but they probably won't be able to get great results with it yet, and starting off slow would be their best bet. But then again, look at the R520 and R580: those were some amazing chips and a big leap in tech from ATi's side, while nV took smaller steps once the NV40 was out. Well, the G80 is a different monster altogether lol.

Jawed
11-Nov-2006, 04:18
That is true, I wouldn't be surprised if ATi has been looking into this too, something like Fast14. But here is the problem: ATi would have to take a large step away from their current gens to do something like what nV did with their shader processors and start custom building libraries. I don't think ATi is taking that big of a step, at least not yet anyways. This is the only way to go in the future though; nV might have opened a can of worms, and this is where AMD can really give ATi an advantage.
Well, we've been wondering about fast14 for years now I think, which sorta tells us exactly nothing :sad:

What's weird is this assumption that "ATI can't possibly do anything to match NVidia". There's a wodge of stuff, conceptually, inside G80 that ATI did first, some of it years ago. NVidia could only compete this year by strapping two G71s together, G71 is that far behind on IQ and raw performance.

Well, if we need an indication about 80nm, why did ATi go with strained silicon for their latest notebook GPU? Strained silicon is expensive, last I heard 2 times the cost of regular silicon, and SOI is also around 2 times the cost of regular silicon. Granted we are talking about a next gen product, but still, it all depends on whether ATi has taken the approach of hand building their libraries and also whether they went with simpler ALU's. If either one is missing they won't be going for extremely high clocks and conservative power usage.
I thought SS was for power consumption, which is obviously important to the mobile sector. ATI might have trialled SS on 90nm first? At some point ATI's 80nm GPUs will use SS? etc.

Jawed

Chalnoth
11-Nov-2006, 05:22
Well, the thing that really amazes me is just how big of a rabbit nVidia pulled out of its hat this time around. I really don't think that anybody was expecting the G80 to be this, at least not before around a month ago.

Now, if ATI was also in the dark, I don't know whether this will help or hurt them. For example, ATI could have deduced higher performance in today's games by projecting G7x-style logic density, but at a sacrifice in DX10 performance. But since it appears that nVidia went the other way entirely, well, we'll have to see. It'll be interesting, to say the least.

Personally I think that ATI's problems are more long-term in nature, with respect to priorities that are sure to change over time due to the merger.

Shtal
11-Nov-2006, 05:25
So I was thinking, what kind of specs does R600 need to beat G80.

I would say it might be this:

80nm
64 Shader pipelines (Vec4+Scalar)
32 TMU's
32 ROPs
128 Shader Operations per Cycle
650MHz Core
512GFLOPs for the shaders
512-bit 1024MB 2.0GHz GDDR4 Memory
128.0 GB/sec Bandwidth (at 1000MHz x2)
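
As a rough sanity check on a spec list like this one, peak shader throughput can be estimated in Python as pipelines x lanes x flops per lane x clock. The sketch below assumes "Vec4+Scalar" means 5 lanes per pipeline and counts a multiply-add as 2 flops per lane per clock; both are assumptions, the list itself may be counting differently, and the function names are only illustrative.

```python
def peak_gflops(pipelines, lanes_per_pipeline, flops_per_lane, clock_mhz):
    """Theoretical peak GFLOPS: pipelines x lanes x flops per lane x clock."""
    return pipelines * lanes_per_pipeline * flops_per_lane * clock_mhz / 1000


def clock_for_gflops_mhz(target_gflops, pipelines, lanes_per_pipeline, flops_per_lane):
    """Clock (MHz) needed to reach target_gflops with the given shader array."""
    return target_gflops * 1000 / (pipelines * lanes_per_pipeline * flops_per_lane)


# Assumed counting: Vec4+Scalar = 5 lanes, multiply-add = 2 flops per lane.
at_650 = peak_gflops(64, 5, 2, 650)
needed = clock_for_gflops_mhz(512, 64, 5, 2)

print(f"64 pipelines at 650 MHz: {at_650:.0f} GFLOPS")
print(f"Clock needed for 512 GFLOPS: {needed:.0f} MHz")
```

Counted this way, 650 MHz and 512 GFLOPS don't line up on 64 Vec4+Scalar pipelines, so one of the listed figures presumably rests on a different clock or a different way of counting operations. The 128 GB/s bandwidth figure, by contrast, does follow directly from a 512-bit bus at 2.0 GHz effective.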

Shtal
11-Nov-2006, 05:46
What ATI actually needs is for R600 to make a clear showing in comparison with the R580 X1950XTX.
R580 currently has 16 pipelines at a 3:1 ratio, which makes 48 pixel shaders.
R600 should be 64 pipelines at a 2:1 ratio, which makes 128 pixel shaders.
R580 currently has 16 ROP's
R600 should be 32 ROP's
R580 currently has 16 TMU's
R600 should be 32 TMU's
R580 currently has a 32 x 8 crossbar on 256bit wide memory
R600 should be a 64 x 8 crossbar on 512bit wide memory
R580 currently has a 650MHz core clock
R600 should be at least 700MHz core clock (but that might not happen because the chip is too hot)

All of this is just a comparison:
with unified shaders, any combination of pixel/vertex/geometry shaders is possible on request. But I want to see, in comparison with G80, which is better.

If this truly happened, it might put R600 ahead of G80 by about 20-25% MAX, unless ATI has something better.
By the time this happens, I'm convinced Nvidia will have a backup plan.

INKster
11-Nov-2006, 05:55
I would say it might be this:

80nm
64 Shader pipelines (Vec4+Scalar)
32 TMU's
32 ROPs
128 Shader Operations per Cycle
650MHz Core
512GFLOPs for the shaders
512-bit 1024MB 2.0GHz GDDR4 Memory
128.0 GB/sec Bandwidth (at 1000MHz x2)

There are three things in there that i don't find credible for R600:

- 32 ROP's.
Is it really necessary to have more than 24 at the moment ?

- 512 bit external memory bus + 2.0 GHz effective GDDR4.
256bit + 2.8 GHz GDDR4 sounds like a more reasonable proposition to me.

- 650 MHz core.
Why does R600 need to have the same clockspeed as R580 ? G80 is at 575 MHz vs 650 MHz for the G71 and it didn't hurt them a bit. Personally, i find the raw core "MHz" spec as little more than a dying marketing trend.

Dave Baumann
11-Nov-2006, 05:58
- 32 ROP's.
Is it really necessary to have more than 24 at the moment ?
Somewhat arbitrary number, isn't it? I assume, though, that the number isn't wholly arbitrary as it's from the 8800, but for what reason is that the ideal? I can certainly see why it would make sense for G80, but does that have to be the case for every graphics card?

Shtal
11-Nov-2006, 06:09
There are three things in there that i don't find credible for R600:
- 32 ROP's.
Is it really necessary to have more than 24 at the moment ?

It's all about "I have something that you don't". It may not be necessary at the moment, but it shows that although we are late with the chip, we are not catching up with G80 but are ahead instead.

- 512 bit external memory bus + 2.0 GHz effective GDDR4.
256bit + 2.8 GHz GDDR4 sounds like a more reasonable proposition to me.

It's all about "I have something that you don't". You, Nvidia, are at 384bit but we are at 512bit. It may be stupid but it could happen.

There are three things in there that i don't find credible for R600:
- 650 MHz core.
Why does R600 need to have the same clockspeed as R580 ? G80 is at 575 MHz vs 650 MHz for the G71 and it didn't hurt them a bit. Personally, i find the raw core "MHz" spec as little more than a dying marketing trend.

It's all about marketing, to show consumers we have a higher clock speed.
650MHz may not be necessary.

INKster
11-Nov-2006, 06:15
Somewhat arbitrary number, isn't it? I assume, though, that the number isn't wholly arbitrary as it's from the 8800, but for what reason is that the ideal? I can certainly see why it would make sense for G80, but does that have to be the case for every graphics card?

I've based my skepticism on the somewhat simplistic "doubling" of features relative to R580 that was stated.

R580 was faster than R520, but the same 16 ROP's didn't seem to hurt them, just like on Nvidia's high-end GF6/7.

As for new methods of AA, etc, they can in fact raise the stakes, but what if R600 has indeed 32 ROP's, but with each of them carrying no major changes from R5xx ?
Brute force from ATI ? It happened with R4xx, and the result was a lost opportunity to keep the momentum and gain further technology leadership compared to NV4x (seen as having better shading hardware, but also some downgrading on texture filtering quality, for instance).


Bah, what do i know ? You are the one who should have been giving us something to keep this thread alive and kicking. ;)

silent_guy
11-Nov-2006, 06:35
/clueless guessing

Speculating about what they are doing is pretty much useless, but it's clear what they should be doing:

create something that's either so much faster than G80 that every alpha-gamer has to have it
... or create something that has the same or better !/$.

Preferably the latter... or both. :wink:

From a pure technical point of view, R580+ was better than G71 for pretty much all parameters in absolute terms.
But financially, it always had to play catch up because of its 350/192 die size disadvantage, which allowed Nvidia to either undercut ATI for the same performance or overpower them at the same price.

R580 was a sign of either an incompetent marketing department too insecure to cut features and demanding everything under the sun OR a weak marketing department unable to stand up against the desire of engineering to implement all the cool tricks, even if those add little or no pricing power. It's also an indication of engineering paying too much attention to ! instead of !/$ (and !/W.)

For the R600 and (especially!) its lower derivatives to be a success for ATI, they must have had the singular vision to drop the focus on pure performance alone and start caring about !/$ two or more years ago. Given their existing inefficient architecture, that means rethinking most if not all the blocks from scratch. Possible but IMHO unlikely.
If they didn't, they may well win the absolute performance crown (for now), but they'll keep losing ground in the mid-end segments, which is where the real money is made.

Edit: I guess what I'm saying is: none of the lists above mention silicon efficiency, even though that (and crappy execution) has been the biggest liability of ATI during the last few years.
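
To illustrate why a 350 vs. 192 die size gap matters so much for pricing, here is a minimal Python sketch using a common textbook approximation for gross dies per wafer. The 300 mm wafer diameter is an assumption, the die areas are simply the two figures quoted above read as mm², and the estimate ignores yield, scribe lines and edge exclusion, so it only shows the trend.

```python
import math


def gross_dies_per_wafer(die_area_mm2, wafer_diameter_mm=300):
    """Common approximation: wafer area / die area, minus an edge-loss term."""
    wafer_area = math.pi * (wafer_diameter_mm / 2) ** 2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return int(wafer_area / die_area_mm2 - edge_loss)


big_die = gross_dies_per_wafer(350)    # larger die, per the figures quoted above
small_die = gross_dies_per_wafer(192)  # smaller die, per the figures quoted above

print(f"~{big_die} vs ~{small_die} candidate dies per wafer")
print(f"The smaller die yields ~{small_die / big_die:.1f}x as many candidates")
```

Because defect-limited yield also falls as die area grows, the real cost gap per good die would be wider than this raw count suggests.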

Dave Baumann
11-Nov-2006, 06:47
As for new methods of AA, etc, they can in fact raise the stakes, but what if R600 has indeed 32 ROP's, but with each of them carrying no major changes from R5xx ?
What if?

Here's an analogy - given the performances, it strikes me that a single G80 texture filter unit has some fairly similar properties to a G71 filter unit in relation to the samples per cycle. However, given the types of workloads now being seen, especially with HDR becoming more widely used and the bandwidth available to make it even more so, it appears to have made sense to NVIDIA to double them up per pixel in order to effectively get single cycle FP16 bilinear filtering (among other sampling traits).

Generally speaking I would say the "ideal" is more, more, more!!! But more to the point, the ideal is what makes sense for the design of the individual unit and the properties of the rest of the ecosystem of the processor.

Shtal
11-Nov-2006, 06:48
I agree with silent_guy.

But we all have to see if ATI will do better with R600 vs. R580

Since R600 is based on the R500 Xenos processor, it should be easier for ATI to make R600 more efficient than R580.

But I also agree with what Dave Baumann is saying. There is also the time frame in which this occurred.

Dave Baumann
11-Nov-2006, 06:51
R580 was a sign of either an incompetent marketing department too insecure to cut features and demanding everything under the sun OR a weak marketing department unable to stand up against the desire of engineering to implement all the cool tricks, even if those add little or no pricing power.
You appear to say that relative to G71; however, G71 was both released after R580 and wouldn't have been known of in terms of properties when design started. Given the die size differences between R520 and R580 I don't think anyone would have refused a 3x math power increase.

dnavas
11-Nov-2006, 07:10
But more to the point, the ideal is what makes sense for the design of the individual unit and the properties of the rest of the ecosystem of the processor.

So if the bus size doubles, then so do the filtering units and the ROPs, and if the speed of memory access increases, then so follows the speed of processing data. Or not, as each chooses to believe.

But, if I'm forced to guess now, I'd say the aim is for doubling everything, and raising clocks 30% or so, but, we shall see in the fullness of time.

silent_guy
11-Nov-2006, 07:20
You appear to say that relative to G71; however, G71 was both released after R580 and wouldn't have been known of in terms of properties when design started.

Fair enough.
In that case, it's more a matter of marketing's inability to correctly assess (or shape!) the desires of their customers (who seemed to be willing to overlook lower image quality).

Given the die size differences between R520 and R580 I don't think anyone would have refused a 3x math power increase.

Agreed. Of course, this underlines even more the inefficiency of R520.:wink:

swaaye
11-Nov-2006, 07:27
I know I remember reading in one of the better reviews out there that ATI and NV count transistors way differently. And so, going by that in a comparison is really pointless. After all, the actual measured die area between G70 and R520 was quite similar if I recall.

Here:
http://techreport.com/reviews/2005q4/radeon-x1000/index.x?pg=3

Dave Baumann
11-Nov-2006, 07:27
Agreed. Of course, this underlines even more the inefficiency of R520.:wink:

How so? R520's performance wasn't bad - it would certainly have reflected better had it been released earlier. But, still, if the architecture was scalable enough to afford a 3x math increase for a relatively small cost, then why not? It's proven that it could open up new markets as well.

silent_guy
11-Nov-2006, 07:47
How so? R520's performance wasn't bad - it would certainly have reflected better had it been released earlier. But, still, if the architecture was scalable enough to afford a 3x math increase for a relatively small cost, then why not? It's proven that it could open up new markets as well.
It wasn't bad in absolute terms, but it was 65% larger for lower performance. A small area increase to get a 3x math performance points to an unbalanced and costly non-math fraction for the R520. Maybe scalable for future generations, but also eating up margins of the current one.
In a lot of fields, you can often justify a larger die and higher cost by offering a range of different features that customers hopefully will care about. When your customers only really care about absolute performance, you don't have the luxury of playing it safe or overdesigning with the future in mind if your competitor does not.

silent_guy
11-Nov-2006, 07:51
I know I remember reading in one of the better reviews out there that ATI and NV count transistors way differently. And so, going by that in a comparison is really pointless. After all, the actual measured die area between G70 and R520 was quite similar if I recall.

Yes, G70 was only slightly smaller than R520. But also produced in 110nm instead of 90nm. Cheaper and more mature so higher yields. The number of transistors is largely irrelevant. Area is what counts.

Dave Baumann
11-Nov-2006, 07:57
I think your initial premise on the die sizes of the two is incorrect - R520 was in the order of 288mm², G70 around 334mm².

silent_guy
11-Nov-2006, 08:11
I think your initial premise on the die sizes of the two is incorrect - R520 was in the order of 288mm², G70 around 334mm².

Oops. :oops:
Though that doesn't change the argument about architectural efficiency where it makes more sense to compare R520 with G71, it could be that the production cost of R520 was indeed similar to G70 then.

Ailuros
11-Nov-2006, 08:12
As others have already warned with sterile numbers, speculating isn't necessarily a bad thing; the trick is usually to find something that at least makes sense, or at least to start thinking about what each unit might or should be capable of for a D3D10 GPU.

As one of the examples think of the G7x ROPs vs. G8x ROPs; it might be quite easy to say 16 vs. 24, but you've already faced the first pitfall since they're not equal at all in capabilities.

Now as for TMUs, I don't see a single reason why ATI would move from a 3:1 ALU:TMU relation to something like 2:1, just because G80 does seem to come closer to that rate (which can be arguable, but that's beside the point). In my mind ATI has at worst kept the 3:1 ratio and at best moved to 4:1, and no, I frankly wouldn't suggest fewer than 24 either.

To recycle back to the beginning of this post, now someone let me know what each TMU is exactly capable of, before I could theoretically say what is or isn't good enough.

My 2 cents.

Razor1
11-Nov-2006, 11:28
Well, we've been wondering about fast14 for years now I think, which sorta tells us exactly nothing :sad:

What's weird is this assumption that "ATI can't possibly do anything to match NVidia". There's a wodge of stuff, conceptually, inside G80 that ATI did first, some of it years ago. NVidia could only compete this year by strapping two G71s together, G71 is that far behind on IQ and raw performance.


I thought SS was for power consumption, which is obviously important to the mobile sector. ATI might have trialled SS on 90nm first? At some point ATI's 80nm GPUs will use SS? etc.

Jawed

Yes, SS is for power consumption, and that's why it concerns me. I can see why they used it for their notebook GPUs, but nV has no power issues with their 7600s in notebooks, and the RV560 (I guess it is; I don't remember the exact notebook GPU using SS) doesn't really have a performance advantage in its class. Not to mention nV has made a GPU that is more than 2 times the size of their previous GPU and hasn't doubled power usage.

I'm not sure when they started with SS, this is the first time I heard about it :)

PeterAce
11-Nov-2006, 15:36
As a die-hard card owner/fan of ATI (through R300 and R420), I feel the last few generations of high-end releases from ATI have not been compelling (for me at least).

Last December I had a choice between G70 (7800 GTX 512) and R520 (much better IQ but weak-ish shader performance and an annoyingly noisy cooler).

So I chose the 7800 GTX 512.

Unfortunately the R580 came after Christmas, which is too late (as I always buy my graphics card so I can have the maximum amount of game-playing time over the holiday season), and it still came with the annoying cooler!

What disappoints me is that this year ATI have missed the buying window again! G80 is obviously the only contender for a high-end purchase for the Holiday/Christmas season.

A R600 (in Jan/Feb) is too late!

INKster
11-Nov-2006, 16:11
I think your initial premise on the die sizes of the two is incorrect - R520 was in the order of 288mm², G70 around 334mm².

Yes, but Nvidia used a relatively cheap 110nm process for G70 (and without low-k), while R520 had the (then) surely more expensive 90nm one.

If you compare G71 and R580 (both on 90nm), the difference is huge, there's no way around that fact.

satein
13-Nov-2006, 12:34
This is from VR-Zone a few days ago... The R600 board needs to be redesigned...
ATi R600 Card Re-Design In Progress (http://sg.vr-zone.com/?i=4293)
A re-design for R600 card and cooling is currently underway to make it shorter and better cooled. The original R600 card design is 12 inches long and ATi is probably trying to shorten it to at least 8800GTX length. The Inquirer has recently reported ATI has already produced some first R600 cards that are clocked lower to send out to game developers for debugging and optimizing their games for R600. The R600 card we seen will conform to the new PCI-SIG graphics spec of delivering 225/300W power for high-end graphics cards. Therefore it will have a new 2x4 pin connector for additional power on top of the current 6-pin connector.

I don't know if this may be rigid... but it would be very long if it truly was a 12" card :???:

flopper
13-Nov-2006, 12:53
As a die-hard card owner/fan of ATI (through R300 and R420), I feel the last few generations of high-end releases from ATI have not been compelling (for me at least).
What disappoints me is that this year ATI have missed the buying window again! G80 is obviously the only contender for a high-end purchase for the Holiday/Christmas season.

A R600 (in Jan/Feb) is too late!

Depends. If R600 is much more powerful than the 8800 and the G81 refresh, then it's worth the wait, and the price will also be better because Nvidia's card will have been out longer.

I got a x1950xtx and I am happy about the card.
Could have waited until 8800 but I never liked nvidia.

I wish game developers had used the texture compression algorithms ATI has.
That would have made a mind-blowing game, and Nvidia would have been forced to implement them.

DX10 is the key though.
How they perform there is more important than DX9 games.

Rys
13-Nov-2006, 13:50
I wish game developers had used the texture compression algorithms ATI has. That would have made a mind-blowing game, and Nvidia would have been forced to implement them.
If you're talking about 3Dc or 3Dc+, (IIRC of course) NVIDIA has supported that technology for some time in their drivers, on their modern hardware.

Bouncing Zabaglione Bros.
13-Nov-2006, 13:58
Any one close to ATI got an impression of what they think of G80? Are they grumpy because it's much better than R600 and stole all their thunder by getting out first, or are ATI quietly happy because they know R600 will kick the (very impressive looking) G80's ass anyway?

Still, it must be really painful for ATI for R600 to be missing Christmas while their competitor gets their next-gen card out for Christmas and the Vista business release.

Jawed
13-Nov-2006, 13:58
I think D3D10 requires support for 3Dc.

Jawed

Bouncing Zabaglione Bros.
13-Nov-2006, 13:59
If you're talking about 3Dc or 3Dc+, (IIRC of course) NVIDIA has supported that technology for some time in their drivers, on their modern hardware.

That's software support only though, isn't it? Or does Nvidia now support 3Dc in hardware too, at least with the G80?

nAo
13-Nov-2006, 14:01
That's software support only though, isn't it? Or does Nvidia now support 3Dc in hardware too?
G80 should support it AFAIK

Rys
13-Nov-2006, 14:11
That's software support only though, isn't it? Or does Nvidia now support 3Dc in hardware too, at least with the G80?
Decompression happens in the driver for G7x, then a map to an equivalent surface format. Not sure about native support for decompression in G80, but if Jawed is right then you'd definitely expect it.

Dave Baumann
13-Nov-2006, 14:53
If you compare G71 and R580 (both on 90nm), the difference is huge, there's no way around that fact.
That's entirely immaterial. We're comparing what was actually released.

vertex_shader
13-Nov-2006, 15:07
This is from VR-Zone a few days ago... The R600 board needs to be redesigned...
ATi R600 Card Re-Design In Progress (http://sg.vr-zone.com/?i=4293)


I don't know if this may be rigid... but it would be very long if it truly was a 12" card :???:

VR-Zone is always writing crap about ATi cards; the R600 PCB is 10.4 inches long.

Rys
13-Nov-2006, 15:22
VR-Zone is always writing crap about ATi cards; the R600 PCB is 10.4 inches long.
They weren't talking about the PCB as such, methinks. You might be right about that element, but dozens of shipping boards show that it's not just PCB size you need to take into account, and that might be the case here.

Rys
13-Nov-2006, 15:26
That's entirely immaterial. We're comparing what was actually released.
Fact remains that high-end G71 and R580 traditionally have somewhat equivalent performance and are pitched into the same price bands, so the comparison is entirely valid if one wants to make it (and even more valid if you want to keep process static when coming to a judgement about performance/area/watt/price/whatever).

Dave Baumann
13-Nov-2006, 16:00
The characteristics of perf/watt or whatever will waver process to process, so the process is part of that type of consideration. The process, and hence the resultant size of the chip, will also give different yields, which affects the pricing.

You're correct, they were released at similar price points/performances, which is why comparing what they were actually released on was valid for the discussion.

Bob
13-Nov-2006, 16:08
I think D3D10 requires support for 3Dc.
Have a look at EXT_texture_compression_latc (http://developer.download.nvidia.com/opengl/specs/GL_EXT_texture_compression_latc.txt).

Shtal
13-Nov-2006, 17:17
Could have waited until 8800 but I never liked nvidia.


I wish Beyond3D would run a poll of the people/fans who visit these forums.

Something like "which do you like most, Nvidia or ATI?", to see whether we have more ATI fans or Nvidia fans.

Jawed
13-Nov-2006, 17:26
http://ati.amd.com/developer/brighton/DirectX%2010%20for%20techies.pdf

•DXGI_FORMAT_BC4_TYPELESS / _UNORM / _SNORM
  •Single channel textures (signed or unsigned)
  •4 bits per pixel compression (ATI1N)

•DXGI_FORMAT_BC5_TYPELESS / _UNORM / _SNORM
  •Dual channel textures (signed or unsigned)
  •8 bits per pixel compression (3DC/ATI2N)

Jawed

flopper
13-Nov-2006, 21:02
I wish Beyond3D would run a poll of the people/fans who visit these forums.

Something like "which do you like most, Nvidia or ATI?", to see whether we have more ATI fans or Nvidia fans.

The 8800 is a great card; I just don't like the company.
Too many issues with the games I played on Nvidia's cards.
I play MMOG´s.

I prefer ATI, never had any issues with the cards last 3 gens.

R600, bring it on AMD.

flopper
13-Nov-2006, 21:05
I think D3D10 requires support for 3Dc.

Jawed

Finally then ;)

I am all for faster and better looking textures.
After all, tech in all glory but the frames is what counts...
;)

Shtal
13-Nov-2006, 22:40
R600, bring it on AMD.

I hope the extra power requirement of R600 over G80 also brings extra performance over G80, rather than consuming more power than G80 for the same performance, just like R580 vs. G71.

LeStoffer
13-Nov-2006, 23:01
I can't help but take it as a bad omen that absolutely nothing has surfaced about the R600 in order to steal at least some of the 8800GTX thunder. Not even the slightest rumour along the lines of "G80? Ha! You have seen nothing yet!" or whatever... If only ATI hadn't pushed themselves to the edge with the 80nm process and GDDR4, I bet they would have made it for Christmas.

DemoCoder
13-Nov-2006, 23:20
Let's assume the 12" card length rumor is true, what would be the implications of this?

1. Larger than 256-bit memory bus? Perhaps 384-512 bit and layout of memory chips forced layout issues?

2. Xenos-like ROP daughter-chip? Perhaps ATI's version of NVIO is a frikkin huge eDRAM ROP chip. However, it still seems untenable given that a 1920x1440 HDR+4xMSAA frame would need 100+ MB.

3. Voodoo1 architecture :) Same as #2, but no eDRAM. ROP daughter-chip, but no embedded RAM other than cache. Instead, the ROP chip is connected to its own pool of memory on a 128-256bit bus. The R600 core would have 512MB RAM on a 256-bit bus, and the R600 ROP chip would have its own 256-512MB of framebuffer GDDR memory on a private 128-256bit bus. Of course, they would also need to be connected by a high speed inter-GPU bus.

4. Same as above, but replace the ROP chip with a second R600. Yep, it's ATI's version of the GX2/FuryMaxx. Maybe ATI ran simulations and figured out that the R600 single GPU will lose the performance crown to a G80/G81, and is planning to have an ultra-high-end "single PCB crossfire" solution at launch.
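The "100+ MB" figure in point 2 can be sanity-checked with a back-of-the-envelope sketch. It assumes an FP16 RGBA colour buffer (8 bytes per sample), a 32-bit depth/stencil value per sample, and no compression; those formats are my assumptions, not something stated in the post.

```python
def framebuffer_bytes(width: int, height: int, samples: int,
                      color_bytes: int, depth_bytes: int) -> int:
    """Uncompressed multisampled framebuffer size: every sample stores colour and depth."""
    return width * height * samples * (color_bytes + depth_bytes)

# Assumed formats: FP16 RGBA colour (8 bytes) + 32-bit depth/stencil (4 bytes), 4xMSAA.
size = framebuffer_bytes(1920, 1440, samples=4, color_bytes=8, depth_bytes=4)
size_mib = size / 2**20

print(f"{size} bytes = {size_mib:.1f} MiB")
```

With those assumptions the frame comes to roughly 127 MiB, which is consistent with the "100+" estimate and far beyond what an eDRAM daughter die of that era could plausibly hold without tiling.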

Shtal
13-Nov-2006, 23:26
I can't help but take it as a bad omen that absolutely nothing has surfaced about the R600 in order to steal at least some of the 8800GTX thunder. Not even the slightest rumour along the lines of "G80? Ha! You have seen nothing yet!" or whatever... If only ATI hadn't pushed themselves to the edge with the 80nm process and GDDR4, I bet they would have made it for Christmas.

Besides the internal 512-bit memory controller, R600 is also capable of supporting an external 512-bit memory bus, if ATI chooses to use it; and that's not a rumour, my friend, but fact.

So far, this is what we know about R600 as fact, not rumour:

a. 80nm tech
b. GDDR4
c. internal and external 512-bit memory controller.

With good enough yields on the chip it may edge out G80.

overclocked_enthusiasm
14-Nov-2006, 00:09
If only ATI hadn't pushed themselves at the edge with 80nm process and GDDR4, I bet they would have made it for Christmas.


You mean like they did with 130 nm low-k and 90 nm also? I am still baffled by what changed at ATI to make them get away from using smaller die sizes, more mature manufacturing processes, lower clocks and better thermal characteristics. R3xx seems so long ago.....

DemoCoder
14-Nov-2006, 00:23
What is the source of those facts? 512-bits seems an awful expense if ATI is intent on increasing ALU vs TEX. If they come out with ~800m-1billion trannies on 80nm, maybe the chip will be large enough to sport enough pins to do it, but I would expect something slightly more radical from ATI than such a brute-force approach. If the R600 is simply a unified+doubled R580 (e.g. 96 SIMD ALUs), they'll have a bigger chip, with perhaps better blending/overdraw/tex performance, but at the expense of an expensive memory architecture, less efficient shaders, and once again, poorer margins.

Jakob
14-Nov-2006, 00:29
Heh, and yet another thread degenerates into beating the old "ATI has twice the die size, yet same performance" horse.

Hopefully DAAMIT will rectify this disparity with R600. G80 ain't small.

LeStoffer
14-Nov-2006, 00:30
I am still baffled by what changed at ATI to make them get away from using smaller die sizes, more mature manufacturing processes, lower clocks and better thermal characteristics. R3xx seems so long ago.....

Indeed. Well, apart from die sizes: G80 could well turn out to be as successful as R300, despite both of them being pretty huge for their time.

DemoCoder
14-Nov-2006, 00:44
Heh, and yet another thread degenerates into beating the old "ATI has twice the die size, yet same performance" horse.

Hopefully DAAMIT will rectify this disparity with R600. G80 ain't small.

Yeah, it ain't small, but the G71 was. G80 is 2.45x the trannies of the G71, but packs in way more functionality and performance. It's got 1.5x ROPs, each ROP is 100% orthogonal, does all HDR+MSAA modes, single cycle 4xMSAA, 8Z, coverage-sample AA, sparse MSAA grid, etc. It's got 1.33x the TMUs, each TMU is more orthogonal and functional as well, each TMU does 2x the bilinear rate of G71 TMUs, and does AF mostly angle independent at full speed. Then there's ~2x the raw FLOPs, much more functional ALUs, branching, GPGPU stuff. That's a huge upgrade in performance and features for roughly 2.45x the transistors, and that's without the inevitable optimization that will come from the G81, since the G80 is a "version 1.0" of a new architecture.

Arty
14-Nov-2006, 01:13
r600 PCB 10.4 inch long.
First you said R600 was ~720 million transistors and now this. I dont mean to offend you as long as you are throwing us bread crumbs along the way. :wink4:

Shtal
14-Nov-2006, 01:21
What is the source of those facts? 512-bits seems an awful expense if ATI is intent on increasing ALU vs TEX. If they come out with ~800m-1billion trannies on 80nm, maybe the chip will be large enough to sport enough pins to do it, but I would expect something slightly more radical from ATI than such a brute-force approach. If the R600 is simply a unified+doubled R580 (e.g. 96 SIMD ALUs), they'll have a bigger chip, with perhaps better blending/overdraw/tex performance, but at the expense of an expensive memory architecture, less efficient shaders, and once again, poorer margins.

THE UP AND COMING R600 will have a real 512 bit memory controller. Unlike its predecessors which had an internal 512 ring memory bus, the R600 will have it externally as well.
This means that the packaging of the chip will be extremely expensive. The wider memory bus you use the more pins you need in your chip package.

If the 512 memory ring turns to be the real thing, we are talking about 128 GB/s of memory bandwidth with GDDR4 clocked at 2000MHz. We also learned that the R600 may use memory faster than 2000MHz as it will be available by Q1. If ATI keeps pushing the chip we might get even faster GDDR4 chips at production time.

Even the PCB of the R600 will be super complicated, as you need a lot of wires to make 512 bit memory to work. Overall it has the potential to beat Nvidia's G80, but yet again it will come at least three months after Nvidia. The G80's memory works at 384 bit as Nvidia pretty much dis-unified everything in G80 from shaders to memory controllers. Nvidia likes to make rules and probably could not get more than 384 bit wide controller in the chip, as the G80 is still a 90 nanometre chip.

It’s a shame that we will need to wait at least until February to see it in action. µ
http://www.theinquirer.net/default.aspx?article=35062
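The 128 GB/s figure in the quoted article follows directly from bus width times effective data rate. A minimal sketch of that arithmetic is below; the function name is mine, and the G80 line assumes an 1800MHz effective GDDR3 data rate on its 384-bit bus, which the article does not state.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, effective_rate_mhz: float) -> float:
    """Theoretical peak: bytes per transfer times millions of transfers per second."""
    return bus_width_bits / 8 * effective_rate_mhz / 1000

# Rumoured R600 configuration from the article: 512-bit bus, GDDR4 at 2000MHz effective.
r600_rumoured = peak_bandwidth_gb_s(512, 2000)

# G80 for comparison: 384-bit bus; 1800MHz effective is an assumption, not from the article.
g80_assumed = peak_bandwidth_gb_s(384, 1800)

print(f"R600 (rumoured): {r600_rumoured:.1f} GB/s")
print(f"G80 (assumed):   {g80_assumed:.1f} GB/s")
```

These are theoretical peaks only; how much of that is usable depends on the memory controller and access patterns, which nobody in the thread has data on.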

Jawed
14-Nov-2006, 01:28
You're wildly overstating the texturing, by the way. Per TMU the capability of G80 is practically the same as G71 - it's the doubled-up nature of them (literally 2.35x) that makes them outperform G71.

Jawed
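One way to land near that "2.35x" figure is to compare peak bilinear filter throughput (filter units times core clock). This is only a sketch of one plausible reading: the unit counts and clocks below (64 filter units at 575MHz for an 8800 GTX, 24 at 650MHz for a 7900 GTX) are my assumptions and are not given in the post.

```python
def bilinear_rate_mtexels(filter_units: int, core_clock_mhz: float) -> float:
    """Peak bilinear samples per second, in millions, at one sample per unit per clock."""
    return filter_units * core_clock_mhz

# Assumed configurations (not stated in the thread).
g80_rate = bilinear_rate_mtexels(64, 575)   # 8800 GTX: 64 filter units, 575MHz core
g71_rate = bilinear_rate_mtexels(24, 650)   # 7900 GTX: 24 texture units, 650MHz core

ratio = g80_rate / g71_rate
print(f"G80/G71 peak bilinear rate: {ratio:.2f}x")
```

Under those assumptions the ratio comes out a little under 2.4x, close to the quoted number, and it comes entirely from unit count and clock rather than from any per-unit change.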

Geeforcer
14-Nov-2006, 01:43
THE UP AND COMING R600 will have a real 512 bit memory controller. Unlike its predecessors which had an internal 512 ring memory bus, the R600 will have it externally as well.
This means that the packaging of the chip will be extremely expensive. The wider memory bus you use the more pins you need in your chip package.

If the 512 memory ring turns to be the real thing, we are talking about 128 GB/s of memory bandwidth with GDDR4 clocked at 2000MHz. We also learned that the R600 may use memory faster than 2000MHz as it will be available by Q1. If ATI keeps pushing the chip we might get even faster GDDR4 chips at production time.

Even the PCB of the R600 will be super complicated, as you need a lot of wires to make 512 bit memory to work. Overall it has the potential to beat Nvidia's G80, but yet again it will come at least three months after Nvidia. The G80's memory works at 384 bit as Nvidia pretty much dis-unified everything in G80 from shaders to memory controllers. Nvidia likes to make rules and probably could not get more than 384 bit wide controller in the chip, as the G80 is still a 90 nanometre chip.

It’s a shame that we will need to wait at least until February to see it in action. µ
http://www.theinquirer.net/default.aspx?article=35062

Wait, did I see it right? Inquirer report = Fact?

LeStoffer
14-Nov-2006, 01:47
Wait, did I see it right? Inquirer report = Fact? :lol:

Shtal
14-Nov-2006, 01:48
Wait, did I see it right? Inquirer report = Fact?

Show me your fact anything on R600; example 80nm tech or GDDR4.

Arty
14-Nov-2006, 01:48
Wait, did I see it right? Inquirer report = Fact?
"Last week of January" is the most reliable info re R600 launch.

Geeforcer
14-Nov-2006, 01:54
Show me your fact anything on R600; example 80nm tech or GDDR4.

I can't and hence will not pronounce them as such, although considering that both are used by ATI right now would make them increasingly likely.

MulciberXP
14-Nov-2006, 01:59
THE UP AND COMING R600 will have a real 512 bit memory controller. Unlike its predecessors which had an internal 512 ring memory bus, the R600 will have it externally as well.
This means that the packaging of the chip will be extremely expensive. The wider memory bus you use the more pins you need in your chip package.

If the 512 memory ring turns to be the real thing, we are talking about 128 GB/s of memory bandwidth with GDDR4 clocked at 2000MHz. We also learned that the R600 may use memory faster than 2000MHz as it will be available by Q1. If ATI keeps pushing the chip we might get even faster GDDR4 chips at production time.

Even the PCB of the R600 will be super complicated, as you need a lot of wires to make 512 bit memory to work. Overall it has the potential to beat Nvidia's G80, but yet again it will come at least three months after Nvidia. The G80's memory works at 384 bit as Nvidia pretty much dis-unified everything in G80 from shaders to memory controllers. Nvidia likes to make rules and probably could not get more than 384 bit wide controller in the chip, as the G80 is still a 90 nanometre chip.

It’s a shame that we will need to wait at least until February to see it in action. µ
http://www.theinquirer.net/default.aspx?article=35062

where do they get these wonderful toys

DemoCoder
14-Nov-2006, 02:00
Well, that depends on how you define TMU. If you define a TMU as a bilerp unit, they've been "doubled up", but that's semantics. I could equally well draw a box around 2 bilerps and say the TMUs have 2x the filtering rate.

But you missed the gist of the argument, given that the thrust was mostly about functionality. G80 TMUs are completely orthogonal to format, and support all DX10 formats and compression. They have more functionality than G7x samplers.

G80 TMUs have additional functionality that G7x apparently doesn't have (see NV_gpu_program4, e.g the new TEX instructions) They also have some capability that isn't exposed by either DX10 nor NV's OGL extensions yet.

Your claim "practically the same as the G71" would be more confusing to people since it seems to suggest that NVidia just took the same TMU cells on the G71 and put twice as many on the G80. Yes, they did double up on bilerp hardware, but the lerps are more capable from a functionality perspective, and the surface formats handled are way different. I think merely focusing on the increase in filtering HW overlooks the fact that practically everything else changed.

Shtal
14-Nov-2006, 02:10
Ok, I'm sorry about saying fact. I should be more careful next time about what I call a fact.

Dave Baumann
14-Nov-2006, 02:11
If only ATI hadn't pushed themselves at the edge with 80nm process and GDDR4, I bet they would have made it for Christmas.
Curious - can you explain how a memory choice would affect the release timings?

DemoCoder
14-Nov-2006, 02:18
Did you care to look up Inq's history of G80 predictions? I would take anything they say with an enormous grain of salt.

INKster
14-Nov-2006, 02:36
Curiously or not, the ATI R600 "external 512bit memory bus" rumour started spreading right after the 384bit bus of the G80 was nearly confirmed.

A reaction from some jealous pro-ATI poster around the web can't be excluded from the list of possible origins, until there's more (and credible) evidence of the fact.

Shtal
14-Nov-2006, 04:20
Curiously or not, the ATI R600 "external 512bit memory bus" rumour started spreading right after the 384bit bus of the G80 was nearly confirmed.

Some jealous pro-ATI post around the web reaction can't be excluded from the list of possible origins, until there's more (and credible) evidence of the fact.

Predictions, Predictions, Predictions, Predictions....
We know squat about ATI's R600: only assumptions.

But my personal guess, which sounds realistic at the moment, is an external 256-bit bus with higher-speed 2.6-2.8GHz GDDR4. (Maybe a lower speed on a 384-bit bus.)

As for ROPs, ATI could follow Nvidia and use only 24 ROPs. It could also be 32 ROPs.

96 or 128 pixel shaders makes logical sense, but as for 64 real physical pipelines using a 2:1 ratio to reach 128 shaders, I'm not sure about that. I was thinking more like 32 pipelines using the same 3:1 ratio to get to 96 pixel shaders, just as R580's 16 pipelines at a 3:1 ratio give 48 pixel shaders.

As for texture units, R580 has only 16, so it is possible they could double that for R600, since that is how generations of video cards usually go.

SugarCoat
14-Nov-2006, 04:29
Curiously or not, the ATI R600 "external 512bit memory bus" rumour started spreading right after the 384bit bus of the G80 was nearly confirmed.

Some jealous pro-ATI post around the web reaction can't be excluded from the list of possible origins, until there's more (and credible) evidence of the fact.

Actually it was about the same time as the first G80 specs leak if i remember correctly.

Predictions, Predictions, Predictions, Predictions....
We know squat about ATI's R600: only assumptions.

But my personal guess, which sounds realistic at the moment, is an external 256-bit bus with higher-speed 2.6-2.8GHz GDDR4. (Maybe a lower speed on a 384-bit bus.)

As for ROPs, ATI could follow Nvidia and use only 24 ROPs. It could also be 32 ROPs.

96 or 128 pixel shaders makes logical sense, but as for 64 real physical pipelines using a 2:1 ratio to reach 128 shaders, I'm not sure about that. I was thinking more like 32 pipelines using the same 3:1 ratio to get to 96 pixel shaders, just as R580's 16 pipelines at a 3:1 ratio give 48 pixel shaders.

As for texture units, R580 has only 16, so it is possible they could double that for R600, since that is how generations of video cards usually go.

I dont see ATI touching that type of ram. For one thing its still rare meaning low production and high costs.

Shtal
14-Nov-2006, 04:36
I dont see ATI touching that type of ram. For one thing its still rare meaning low production and high costs.
So what is your personal guess, then? Where does that assumption leave you?

turtle
14-Nov-2006, 05:02
I dont see ATI touching that type of ram. For one thing its still rare meaning low production and high costs.

Haven't we heard GDDR4 yields are extremely good? 1400MHz (2.8) is readily available (http://www.samsung.com/Products/Semiconductor/GraphicsMemory/GDDR4SDRAM/512Mbit/K4U52324QE/K4U52324QE.htm) from Samsung, with 1600MHz (3.2) looming very soon. 2.2-2.8 (1.2-1.4GHz bin) sounds like a logical possibility to me, as we've seen stranger things happen (like Nvidia using 1.1ns GDDR3 before it was readily available), and as we know, cards are not usually clocked to the max of the binned chip used.

silent_guy
14-Nov-2006, 05:27
c. internal and external 512-bit memory controller.
With good enough yields on the chip it may edge out G80.
But my personal guess, which sounds realistic at the moment, is an external 256-bit bus with higher-speed 2.6-2.8GHz GDDR4 (or maybe lower-speed memory on a 384-bit bus).
This means that the packaging of the chip will be extremely expensive. The wider memory bus you use the more pins you need in your chip package.
Hi Shtal,
you seem to be very knowledgeable about these things (so much so, in fact, that you remind me of Ludwig Wittgenstein) and I have a few questions:

What's an internal memory controller?
Why is it useful to have an internal 512-bit bus when the external bus is only 256 bits?
Why would R600 need a 512-bit bus to outperform the 384-bit bus of G80? Are the memory controllers of R600 less efficient?
How is yield of R600 correlated with performance of G80?
Why would the speed for 384 bits be lower than for 256 bits? (And wouldn't that require a 768-bit internal memory controller?)
How many additional pins do you need for a 512-bit bus? And how much more expensive would the package be compared to the cost of the die?

Thanks!

Shtal
14-Nov-2006, 05:48
Hi Shtal,
you seem to be very knowledgeable about these things (so much so, in fact, that you remind me of Ludwig Wittgenstein) and I have a few questions:

What's an internal memory controller?
Why is it useful to have an internal 512-bit bus when the external bus is only 256 bits?
Why would R600 need a 512-bit bus to outperform the 384-bit bus of G80? Are the memory controllers of R600 less efficient?
How is yield of R600 correlated with performance of G80?
Why would the speed for 384 bits be lower than for 256 bits? (And wouldn't that require a 768-bit internal memory controller?)
How many additional pins do you need for a 512-bit bus? And how much more expensive would the package be compared to the cost of the die?

Thanks!

What's the catch?

For now I will answer just one question.
Why doesn't ATI need 512-bit memory?
a. Too expensive.
b. I'm not sure what they would need all that extra bandwidth for; it could be completely wasted.
c. If you look at the GeForce 8800 GTS, you simply overclock the core and get GTX-level FPS without needing extra bandwidth.
d. My personal guess is that it is too early for 512-bit for now.
One more thing I want to add on ROPs: the 8800 GTX vs. GTS difference, 24 vs. 20, makes very little impact for now as far as I can tell.
For the most part, extra bandwidth starts playing an important role when you crank up the resolution and enable anti-aliasing (AA), or add high dynamic range (HDR) on top.

Ailuros
14-Nov-2006, 07:10
Predictions, Predictions, Predictions, Predictions....
We know squat about ATI R600: only assumptions.

But my personal guess, which sounds realistic at the moment, is an external 256-bit bus with higher-speed 2.6-2.8GHz GDDR4 (or maybe lower-speed memory on a 384-bit bus).

As for ROPs, ATI could follow Nvidia and use only 24 ROPs. It could be 32 ROPs.

96 or 128 pixel shaders makes logical sense, but I'm not sure about 64 real physical pipelines using a 2:1 ratio to reach 128 shaders. I was thinking more like 32 pipelines using the same 3:1 ratio to get to 96 pixel shaders, just like R580's 16 pipelines at a 3:1 ratio give 48 pixel shaders.

Texture units: R580 has only 16, so it is possible they could double that on R600, since that is how generations of video cards usually go.

http://www.beyond3d.com/forum/showpost.php?p=871045&postcount=359

Shtal
14-Nov-2006, 08:02
http://www.beyond3d.com/forum/showpost.php?p=871045&postcount=359 To recycle back to the beginning of this post, now someone let me know what each TMU is exactly capable of, before I could theoretically say what is or isn't good enough.

What do you think?

rwolf
14-Nov-2006, 09:15
THE UP AND COMING R600 will have a real 512 bit memory controller. Unlike its predecessors which had an internal 512 ring memory bus, the R600 will have it externally as well.
This means that the packaging of the chip will be extremely expensive. The wider memory bus you use the more pins you need in your chip package.

If the 512 memory ring turns to be the real thing, we are talking about 128 GB/s of memory bandwidth with GDDR4 clocked at 2000MHz. We also learned that the R600 may use memory faster than 2000MHz as it will be available by Q1. If ATI keeps pushing the chip we might get even faster GDDR4 chips at production time.

Even the PCB of the R600 will be super complicated, as you need a lot of wires to make 512 bit memory to work. Overall it has the potential to beat Nvidia's G80, but yet again it will come at least three months after Nvidia. The G80's memory works at 384 bit as Nvidia pretty much dis-unified everything in G80 from shaders to memory controllers. Nvidia likes to make rules and probably could not get more than 384 bit wide controller in the chip, as the G80 is still a 90 nanometre chip.

It’s a shame that we will need to wait at least until February to see it in action. µ
http://www.theinquirer.net/default.aspx?article=35062

Doesn't it become harder to put more pins on a chip when it is shrunk?
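
The 128 GB/s figure in the quoted article follows from bus width times data rate. A rough sketch of that arithmetic, assuming the article's "2000MHz" is the effective per-pin transfer rate and that 1 GB means 10^9 bytes (the function name is purely illustrative):

```python
def peak_bandwidth_gb_s(bus_width_bits, effective_rate_mhz):
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1e9 bytes).

    effective_rate_mhz is the per-pin transfer rate in millions of
    transfers per second; this is an assumption about what the quoted
    article means by "2000MHz".
    """
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * effective_rate_mhz / 1000


# Compare the bus widths being speculated about at the same data rate.
for width in (256, 384, 512):
    print(f"{width}-bit @ 2000MHz: {peak_bandwidth_gb_s(width, 2000):.1f} GB/s")
```

On these assumptions a 512-bit bus lands on the article's 128 GB/s, and a 256-bit bus would need twice the data rate to match it.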

rwolf
14-Nov-2006, 09:42
http://ati.amd.com/developer/brighton/05%20Graphics%20Performance.pdf


What we had to change about our hardware.

• Where state checking happened
• Mostly this was in the driver
• The runtime used to do buffering and filtering too…
• What state checking happened
• Scary amounts of cross validation between state changes
• Why state checking happened
• Invalid state combinations are a “not good” kind of a thing
• (Some unrelated state lay in the same control words…)
• Now our hardware does most of the validation…

Seems like R600 will see more driver functionality moved into the chip.

stevem
14-Nov-2006, 11:00
What can we glean from p20? R600 or R580? :smile:
How fast is video memory?
• We have colossal bandwidth to local vid mem, but it’s never enough…
• So as far as possible we rely on cleverness, not wide pipes
• For example, we prefer to perform the Z test before the pixel shader starts execution

LeStoffer
14-Nov-2006, 12:20
Curious - can you explain how a memory choice would affect the release timings?

I was thinking about availability of the specific memory, of course. But since you know that very well, I'll take the liberty of assuming that the memory choice isn't an issue regarding the delay of the R600. :wink:

Rys
14-Nov-2006, 12:30
GDDR4-based graphics boards have been shipping in quantity since late August. That would indicate sampling some time before that (a quarter at least, one presumes) and MP of all variants is in full swing (at Samsung at least). R600 definitely isn't around just now because of the memory choice.

LeStoffer
14-Nov-2006, 12:50
GDDR4-based graphics boards have been shipping in quantity since late August. That would indicate sampling some time before that (a quarter at least, one presumes) and MP of all variants is in full swing (at Samsung at least). R600 definitely isn't around just now because of the memory choice.

Yes, I am aware that the X1950XTX is using GDDR4 and has been shipping for some time already. :wink: But that doesn't mean that there couldn't have been supply issues. nVidia after all decided to bypass GDDR4 for the (short) time being with the G80, and who knows if Samsung could ramp production enough for both X1950XTX and R600 for Christmas? I didn't know, hence my supply speculation, but that is now laid to rest. :smile:

Ailuros
14-Nov-2006, 13:02
What do you think?

Take a peek at the G80 and tell me if you can compare its TMUs with its ROPs. In other words G7x TMU or ROP != G8x TMU or ROP and last but not least Vec16 ALU != Vec5 ALU. It was merely an observation for your collection of random numbers.

It could easily be a 6* 16-way SIMD with 24 TMUs and 16 ROPs. But that still doesn't say zip, zilch, nada since I cannot know what each unit is capable of. What's even worse is any kind of weird comparison with what seems to be a fundamentally different design (G80).

Rys
14-Nov-2006, 13:29
...R600 for Christmas...
I'm absolutely convinced at this point that we'll see R600 released to the public in 2007. However your supply considerations are still valid, should the first R600-based products be in volume production ready for Vista's consumer release (the likely release date IMO), but I think Samsung have plenty. If you ask them they say that GDDR4 supply is 'very healthy'.

It could easily be a 6* 16-way SIMD with 24 TMUs and 16 ROPs.
I bet good money that it definitely isn't :runaway: but I agree with you that unit counts don't mean jack this time around.

Oushi
14-Nov-2006, 15:31
It seems weird, but I have this strong feeling inside me!!
We will see another NV30, but this time from ATI with R600. It's the same
situation!!! New architecture, new bus, new DirectX, several delays by ATI.
What do you think, guys?!

neliz
14-Nov-2006, 16:17
Ati CAN'T make the same mistake twice...

Cuthalu
14-Nov-2006, 16:29
It seems weird, but I have this strong feeling inside me!!
We will see another NV30, but this time from ATI with R600. It's the same
situation!!! New architecture, new bus, new DirectX, several delays by ATI.
What do you think, guys?!

They've already made Xenos, so it's not that new architecture.

Btw, since when has any product been released without so-called delays? Never?

Oushi
14-Nov-2006, 17:27
They've already made Xenos, so it's not that new architecture.

Btw, since when has any product been released without so-called delays? Never?

what about G80 :wink:

Rys
14-Nov-2006, 17:42
what about G80 :wink:
NVIDIA have publicly stated that it should have been released before November. As a matter of fact, NVIDIA have publicly stated that they tried to get it out last year, never mind this. So yes, very late depending on how drunk you think Jen-Hsun was :lol:
It seems weird, but I have this strong feeling inside me!!
:runaway:

Kaotik
14-Nov-2006, 17:48
what about G80 :wink:

G80 which was first told to be released.. humm.. was it summer "at the latest"?

Razor1
14-Nov-2006, 17:51
:runaway:


Ya really like that symbol don't ya lol, kinda grows on ya ;)

But anyways, no, the R600 will not be another GFFX, unless ATI's engineers purposefully sabotage their own product!

Geo
14-Nov-2006, 17:56
Part of what has been missed a touch in the boggling over the 1350 MHz shaders is that base clocks took a step backwards in G80 to 575 MHz. This is covered by increasing ROPs to 24, for instance. A hypothetical R600 in the ~850 MHz range could be roughly equivalent with the same number of ROPs as R580.

ATI has not shown, that I can recall, much ankle towards going towards the clock domains concept and making significant differences in chip clocking for different parts of the chip. NV at least had tried that out a bit with the G7x line before throwing down with the major clock differences with G80. So, at least today, I'd be surprised if that shows up in R600. . .and thus I'd lean towards a higher base clock that applies to everything, with somewhat fewer units. That seems to be ATI's MO the last few years.

Shtal
14-Nov-2006, 18:16
Doesn't it become harder to put more pins on a chip when it is shrunk?

If they keep the die the same size they could do it, but it takes a lot more effort and transistors.

Sorry, can't talk right now, I gotta go to work :)

Chalnoth
14-Nov-2006, 18:24
So, at least today, I'd be surprised if that shows up in R600. . .and thus I'd lean towards a higher base clock that applies to everything, with somewhat fewer units.
They'd better have more units, or they'll get stomped.

Geo
14-Nov-2006, 18:29
They'd better have more units, or they'll get stomped.

Heh, depends on which part of the chip we're talking about. Shaders, yes. I was thinking more of ROPs there. But yeah, they're going to need more than 128 shaders, however arranged, if I'm right about "one clock speed" for the entire chip.

dnavas
14-Nov-2006, 18:30
Depends on what those units are capable of, unless we think R600 is going the scalar route as well....

96 < 128, but, 96 Vec4 > 128 scalar, even when running at 800Mhz vs. 1350Mhz....

ed. Geo beat me :) But, even with ROPs, you'll likely need to factor in differences in capability.
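
The "96 Vec4 > 128 scalar" comparison above is just lanes times clock. A quick sketch of it, assuming every vec4 lane is kept busy each cycle (which real shaders won't manage); the 96-unit and 800MHz figures are this thread's speculation, not confirmed specs:

```python
def scalar_ops_per_second(units, lanes_per_unit, clock_mhz):
    """Peak scalar operations issued per second, ignoring utilisation."""
    return units * lanes_per_unit * clock_mhz * 1_000_000


vec4_peak = scalar_ops_per_second(96, 4, 800)      # speculated R600-style layout
scalar_peak = scalar_ops_per_second(128, 1, 1350)  # G80 as discussed in the thread

# Fraction of vec4 lanes that must do useful work for the two to tie.
break_even_utilisation = scalar_peak / vec4_peak

print(f"vec4 / scalar peak ratio: {vec4_peak / scalar_peak:.2f}")
print(f"break-even lane utilisation: {break_even_utilisation:.1%}")
```

So on paper the vec4 layout only falls behind if, on average, fewer than roughly 56% of its lanes are doing useful work.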

Razor1
14-Nov-2006, 18:39
I don't think it will exceed 750 MHz if it's all one clock speed, unless ATI goes with some exotic silicon, at least for GPUs; so far only one GPU has ever used strained silicon, which was their latest 80 nm notebook GPU. But in any case, yeah, 96 vec4 ALUs should, even at 600, be able to do a number against 128 scalar at 1300 MHz.

Acert93
14-Nov-2006, 18:42
Heh, depends on which part of the chip we're talking about. Shaders, yes. I was thinking more of ROPs there. But yeah, they're going to need more than 128 shaders, however arranged, if I'm right about "one clock speed" for the entire chip.

Would not 128 Xenos style shaders at 650MHz be about 750GFLOPs? And 96 shaders would be about 560GFLOPs I believe. Obviously utilization would be a valid point, but I am not sure ATI would need more than 128 shaders as 128-Xenos-like-Shaders (Vec4 + SF) @ 650MHz is nearly 50% bump over G80 in raw flops (not that flops mean everything). I have not been able to follow all the threads as of late so I could be missing something.
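
Those GFLOPs estimates can be reproduced with a one-liner. A sketch, assuming 9 FLOPs per Xenos-style ALU per clock (the figure Ailuros gives later in this thread) and the 650MHz clock from the post; the function name is made up for illustration:

```python
XENOS_FLOPS_PER_ALU_PER_CLOCK = 9  # assumption: figure quoted later in the thread


def xenos_style_gflops(alus, clock_mhz,
                       flops_per_alu_per_clock=XENOS_FLOPS_PER_ALU_PER_CLOCK):
    """Theoretical peak GFLOPs for a given ALU count and core clock."""
    return alus * flops_per_alu_per_clock * clock_mhz / 1000


for alus in (96, 128):
    print(f"{alus} ALUs @ 650MHz: {xenos_style_gflops(alus, 650):.1f} GFLOPs")
```

That gives about 562 and 749 GFLOPs, in line with the "about 560" and "about 750" above.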

Geo
14-Nov-2006, 18:44
But in any case, yeah, 96 vec4 ALUs should, even at 600, be able to do a number against 128 scalar at 1300 MHz.

There's been a remarkable lack of bandwagon-jumping on the "96 ALUs" theory since Tech Report reported Orton's remark, and some uncomfortable body language and facial tics. Dunno if this is expectations management or what, but at this point I've personally moved 64/80/96 ALUs back into the ??? category.

Razor1
14-Nov-2006, 18:56
Yeah, I'm leaning towards 96 ALUs; you don't have the CEO of a company "hinting" at something like this without some truth behind it. It was a fairly strong hint too lol, something to the effect of "it might even have 96....". It could be more, but definitely 96 or higher I think. But the more ALUs this thing has, the less surprised I'd be if the clock speed drops as well. The rumors of R600's high power usage have been there from the beginning and I've heard about it from numerous sources, so most likely that is true too.

Chalnoth
14-Nov-2006, 19:10
96 might be a bit much. I personally think 64 vec-4 ALU's is a bit more likely, as it'd be somewhat challenging to both double the theoretical shader power of the chip while also increasing its capabilities. Possible, I suppose, but I personally think it's unlikely.

Razor1
14-Nov-2006, 19:13
Actually, I just reread that article; it sounds like it could be something like what they did for the X800s: 12 pipes, with 4 pipes held back for safety, but for a refresh they could have enough chips with all 16 pipes.......

Acert93
14-Nov-2006, 19:14
96 might be a bit much. I personally think 64 vec-4 ALU's is a bit more likely, as it'd be somewhat challenging to both double the theoretical shader power of the chip while also increasing its capabilities. Possible, I suppose, but I personally think it's unlikely.

Isn't that exactly what NV did though? And using Xenos as a baseline, Xenos is fairly power efficient and small and already has a fair bit of functionality at the baseline. Obviously that's your opinion. I think the bigger picture is when ATI got serious about R600 (continuation of R400?) and how much Xenos development hurt/helped R600 in terms of resources.

Chalnoth
14-Nov-2006, 19:28
Isn't that exactly what NV did though? And using Xenos as a baseline, Xenos is fairly power efficient and small and already has a fair bit of functionality at the baseline. Obviously that's your opinion. I think the bigger picture is when ATI got serious about R600 (continuation of R400?) and how much Xenos development hurt/helped R600 in terms of resources.
The theoretical shader power is less than twice that of the G70. It does so much better mostly due to added efficiency.

LeStoffer
14-Nov-2006, 20:17
Obviously utilization would be a valid point, but I am not sure ATI would need more than 128 shaders as 128-Xenos-like-Shaders (Vec4 + SF) @ 650MHz is nearly 50% bump over G80 in raw flops.

128? :shock: That would be one insanely huge monster of a chip. I think they would have trouble enough getting a 96 (Vec4 + SF) part with full DX10 features into smooth mass production. But then again, there has to be a reason for the delay. :wink:

jamis
14-Nov-2006, 21:05
96 might be a bit much. I personally think 64 vec-4 ALU's is a bit more likely, as it'd be somewhat challenging to both double the theoretical shader power of the chip while also increasing its capabilities. Possible, I suppose, but I personally think it's unlikely.
Isn't 96 pretty much confirmed:
"Orton pegged the floating-point power of today's top Radeon GPUs with 48 pixel shader processors at about 375 gigaflops, with 64 GB/s of memory bandwidth. The next generation, he said, could potentially have 96 shader processors and will exceed half a teraflop of computing power."
http://techreport.com/etc/2006q4/stream-computing/index.x?pg=1
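
Orton's "about 375 gigaflops" for 48 shader processors can be sanity-checked. A sketch, assuming a 650MHz core clock and 12 FLOPs per R5x0 ALU per clock (the per-ALU figure is the one Ailuros quotes later in this thread; both are assumptions here, and the helper names are hypothetical):

```python
def r5x0_style_gflops(shader_processors, clock_mhz, flops_per_clock=12):
    """Theoretical peak GFLOPs; 12 FLOPs/clock per ALU is an assumption."""
    return shader_processors * flops_per_clock * clock_mhz / 1000


def min_flops_per_clock(target_gflops, shader_processors, clock_mhz):
    """Per-ALU FLOPs/clock needed to reach target_gflops at a given clock."""
    return target_gflops * 1000 / (shader_processors * clock_mhz)


print(f"48 processors @ 650MHz: {r5x0_style_gflops(48, 650):.1f} GFLOPs")
print(f"96 processors need {min_flops_per_clock(500, 96, 650):.2f} "
      "FLOPs/clock each to reach 500 GFLOPs at 650MHz")
```

If the clock stayed around 650MHz, 96 processors would clear half a teraflop with roughly 8 FLOPs per processor per clock, so the quote doesn't pin down what each unit can do.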

Ailuros
14-Nov-2006, 21:53
I bet good money that it definitely isn't :runaway: but I agree with you that unit counts don't mean jack this time around.

I've lost already one bet; I'm not really up to lose another one heh. One public apology for complete ownage is enough at a time :D

SugarCoat
14-Nov-2006, 21:54
So; your personal guess, where does it leaves you with your assumption?

I think they'll be using the same 2.0GHz GDDR4 or a max of 2.2. Anything higher and they'll have severely limited production. We're talking about boards ranging from 512MB to a gig of the stuff each, so that's a heavy request, more so if you take into consideration the worsening yields as you request higher speeds.

Haven't we heard GDDR4 yields are extremely good? 1400MHz (2.8) is readily available (http://www.samsung.com/Products/Semiconductor/GraphicsMemory/GDDR4SDRAM/512Mbit/K4U52324QE/K4U52324QE.htm) from Samsung, with 1600MHz (3.2) looming extremely soon. 2.2-2.8 (1.2-1.4GHz bin) sounds like a logical possibility to me, as we've seen stranger things happen (like Nvidia using 1.1ns GDDR3 before it was readily available), and as we know cards are not usually clocked to the max of the binned chip used.

We've heard GDDR4 yields are good from the same people who then turned around and said they were absolutely terrible: the Inq. So what does that tell you? The fact that Nvidia totally bypassed GDDR4, even the lower-spec stuff, tells me they did it to make absolutely sure they had the supply to back a hard launch and keep shipments coming.

SirPauly
14-Nov-2006, 21:59
Isn't 96 pretty much confirmed:

http://techreport.com/etc/2006q4/stream-computing/index.x?pg=1

That Stream computing presentation is actually offered at ATI/AMD's web-site:


http://ati.amd.com/companyinfo/events/StreamComputing/index.html

vertex_shader
14-Nov-2006, 22:11
First you said R600 was ~720 million transistors and now this. I dont mean to offend you as long as you are throwing us bread crumbs along the way. :wink4:

I never write anything about r600 transistor count.

pjbliverpool
14-Nov-2006, 22:14
96 might be a bit much. I personally think 64 vec-4 ALU's is a bit more likely, as it'd be somewhat challenging to both double the theoretical shader power of the chip while also increasing its capabilities. Possible, I suppose, but I personally think it's unlikely.

With only 64 Xenos style ALU's R600 would have to run at 740Mhz merely to match R580's theoretical shader power. It would have to run at 870Mhz to get over 500 GFLOPs and to match G80 it would need to run at dead on 900Mhz!

And bearing in mind G80's shaders are likely more efficient than the Xenos style as well, I can't see R600 being limited to 64.

96 would mean that at the more reasonable speed of 650MHz R600 could have a little more theoretical power than G80. I suppose we also need to know how much more efficient G80's shaders are, though, before a real comparison can be made. And that's assuming R600 uses Xenos-style shaders; they might try something completely different.
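
The clock targets in this post can be back-solved from a GFLOPs goal. A sketch, assuming 64 Xenos-style ALUs at 9 FLOPs per clock each and a G80 peak of 518.4 GFLOPs (128 x 3 x 1.35GHz, i.e. counting the extra MUL); all of these are thread speculation rather than known specs:

```python
def required_clock_mhz(target_gflops, alus=64, flops_per_alu_per_clock=9):
    """Core clock in MHz needed to hit target_gflops at theoretical peak."""
    return target_gflops * 1000 / (alus * flops_per_alu_per_clock)


def gflops_at(clock_mhz, alus=64, flops_per_alu_per_clock=9):
    """Theoretical peak GFLOPs at a given core clock."""
    return alus * flops_per_alu_per_clock * clock_mhz / 1000


print(f"500 GFLOPs needs about {required_clock_mhz(500):.0f} MHz")
print(f"518.4 GFLOPs (assumed G80 peak) needs about {required_clock_mhz(518.4):.0f} MHz")
print(f"740 MHz corresponds to about {gflops_at(740):.0f} GFLOPs")
```

The 870MHz and 900MHz figures fall out directly. The 740MHz figure implies an R580 number of roughly 426 GFLOPs, higher than the 375 quoted from Orton, so it presumably counts more than the 48 pixel shader ALUs; that part is a guess.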

vertex_shader
14-Nov-2006, 22:23
THE UP AND COMING R600 will have a real 512 bit memory controller. Unlike its predecessors which had an internal 512 ring memory bus, the R600 will have it externally as well.
This means that the packaging of the chip will be extremely expensive. The wider memory bus you use the more pins you need in your chip package.

If the 512 memory ring turns to be the real thing, we are talking about 128 GB/s of memory bandwidth with GDDR4 clocked at 2000MHz. We also learned that the R600 may use memory faster than 2000MHz as it will be available by Q1. If ATI keeps pushing the chip we might get even faster GDDR4 chips at production time.

Even the PCB of the R600 will be super complicated, as you need a lot of wires to make 512 bit memory to work. Overall it has the potential to beat Nvidia's G80, but yet again it will come at least three months after Nvidia. The G80's memory works at 384 bit as Nvidia pretty much dis-unified everything in G80 from shaders to memory controllers. Nvidia likes to make rules and probably could not get more than 384 bit wide controller in the chip, as the G80 is still a 90 nanometre chip.

It’s a shame that we will need to wait at least until February to see it in action. µ
http://www.theinquirer.net/default.aspx?article=35062

The 512-bit rumor first came from an Asian site, not from the Inq.

Arty
14-Nov-2006, 22:24
I never write anything about r600 transistor count.
Sorry I mistook you for some one else, my apologies.:oops:

LeStoffer
14-Nov-2006, 22:26
That Stream computing presentation is actually offered at ATI/AMD's web-site:


http://ati.amd.com/companyinfo/events/StreamComputing/index.html

Thanks. At around the 20:10 mark we get the "today it is 48 - in the future maybe it is 96". After having listened to it myself, I'm not totally convinced that was a solid hint at the R600. The "maybe" sounded a bit vague... :razz:

trinibwoy
14-Nov-2006, 22:30
With only 64 Xenos style ALU's R600 would have to run at 740Mhz merely to match R580's theoretical shader power. It would have to run at 870Mhz to get over 500 GFLOPs and to match G80 it would need to run at dead on 900Mhz!.

Only if you count the missing MUL! Bwahahaha!!

Arun
14-Nov-2006, 22:34
You arguably shouldn't count it, since ATI has a dedicated MUL for perspective correction, basically... But if it got partially exposed in the future, it would be interesting, and could arguably count as a (small?) part of an ALU or some such... 1/4th would be my first guesstimate.

Uttar

jamis
14-Nov-2006, 22:46
Thanks. At around the 20:10 mark we get the "today it is 48 - in the future maybe it is 96". After having listened to it myself, I'm not totally convinced that was a solid hint at the R600. The "maybe" sounded a bit vague... :razz:
Listen a bit further and he says:"Our next generation is going to exceed half a gigaflop", no maybies there. He also repeats the magical number of 96...

Kaotik
14-Nov-2006, 23:07
Listen a bit further and he says:"Our next generation is going to exceed half a gigaflop", no maybies there. He also repeats the magical number of 96...
I assume you mean "half a teraflop" :wink:
Anyway, what's the theoretical max of G80 on those fancy floppies:?:

trinibwoy
14-Nov-2006, 23:08
A middling 346 gflops without the MUL.
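
For reference, the 346 figure is the MADD-only peak. A small sketch, assuming 128 stream processors at the 1350MHz shader clock mentioned earlier in the thread, with a MADD counted as 2 FLOPs and the disputed extra MUL as 1 more:

```python
def g80_peak_gflops(flops_per_sp_per_clock, stream_processors=128,
                    shader_clock_mhz=1350):
    """Theoretical peak GFLOPs for the G80 shader array."""
    return stream_processors * flops_per_sp_per_clock * shader_clock_mhz / 1000


print(f"MADD only:  {g80_peak_gflops(2):.1f} GFLOPs")
print(f"MADD + MUL: {g80_peak_gflops(3):.1f} GFLOPs")
```

Whether the higher number is fair depends on how often the extra MUL is actually available to shader code, which nobody in the thread knows yet.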

jamis
14-Nov-2006, 23:32
I assume you mean "half a teraflop" :wink:
Anyway, what's the theoretical max of G80 on those fancy floppies:?:
LOL, yes. I think it's >500 GFlops too. At least that's what they say.

rwolf
14-Nov-2006, 23:48
Part of what has been missed a touch in the boggling over the 1350 MHz shaders is that base clocks took a step backwards in G80 to 575 MHz. This is covered by increasing ROPs to 24, for instance. A hypothetical R600 in the ~850 MHz range could be roughly equivalent with the same number of ROPs as R580.

ATI has not shown, that I can recall, much ankle towards going towards the clock domains concept and making significant differences in chip clocking for different parts of the chip. NV at least had tried that out a bit with the G7x line before throwing down with the major clock differences with G80. So, at least today, I'd be surprised if that shows up in R600. . .and thus I'd lean towards a higher base clock that applies to everything, with somewhat fewer units. That seems to be ATI's MO the last few years.

Not true, I believe that the ring bus operates at the same speed as the memory interface and it connects internal components together.

rwolf
15-Nov-2006, 00:02
Depends on what those units are capable of, unless we think R600 is going the scalar route as well....

96 < 128, but, 96 Vec4 > 128 scalar, even when running at 800Mhz vs. 1350Mhz....

ed. Geo beat me :) But, even with ROPs, you'll likely need to factor in differences in capability.


So what if those units run at 3GHz.

LeStoffer
15-Nov-2006, 00:14
So what if those units run at 3GHz.

Damn rwolf, you just cant get that fast14 tech out of your head, can you? :wink:

Rangers
15-Nov-2006, 01:56
So what if those units run at 3GHz.

ATI isn't that aggressive.

Anarchist4000
15-Nov-2006, 04:09
So what if those units run at 3GHz.

If that happened ATI likely wouldn't have any competition for the next couple years. Besides the fact they would likely be pushing around 3TFLOPS at that point if all the other rumored speculation plays out.

Domell
15-Nov-2006, 08:34
http://66.249.93.104/translate_c?hl=en&ie=UTF-8&oe=UTF-8&langpair=zh-CN%7Cen&u=http://we.pcinlife.com/thread-654405-1-1.html&prev=/language_tools

Does it mean that r600 will use 256-bit memory bus or what??

rwolf
15-Nov-2006, 08:54
ATI isn't that aggressive.

Why not? Ring bus is running at GDDR4 speeds.

rwolf
15-Nov-2006, 09:22
http://www.theinquirer.net/default.aspx?article=35707

R600
16 ROPS
700MHz-800MHz
64 shaders each with 4-way simd

DAAMIT kept the RingBus configuration for the R600 as well, but now the number has doubled. The External memory controller is a clear 512-bit variant, while internally you will be treated with a bi-directional bus double the width. The 1024-bit Ringbus is approaching

Since R600 SIMD Shader can calculate the result of four scalar units, it yields with scalar performance of 256 units - while Nvidia comes with 128 "real" scalar units.

Delay is caused by another weird bug.

Ailuros
15-Nov-2006, 09:29
I wish the folks at the INQ knew only half the time what they're rambling about :roll:

rwolf
15-Nov-2006, 09:35
We are heading for very interesting results in DX10 performance, since game developers expect that NV stuff will be faster in simple instructions and R600 will excel in complex shader arena. In a way, you could compare R600 and G80 as Athlon XP versus Pentium 4 - one was doing more work in a single clock, while the other was using higher clock speed to achieve equal performance.

Interesting.

rwolf
15-Nov-2006, 09:37
First of all, the GPU is a logical development that started with the R500Xenos, or Xbox GPU, but without the 10MB eDRAM part. Unlike the Xbox GPU, the R600 has to be able to support a large number of resolutions and, if we take a look at today's massive 5Mpix resolutions, it is quite obvious that R600 should feature at least five times more eDRAM than Xbox 360 has

I doubt that the R600 has eDRAM and only has a modest cache. It wouldn't make sense to have a 512-bit bus if you have eDRAM.

neliz
15-Nov-2006, 09:48
ATI isn't that aggressive.

Not aggressive on the products, but aggressive within their limited design?

ChrisRay
15-Nov-2006, 10:02
http://www.theinquirer.net/default.aspx?article=35707

Oy. I dont even know WHAT theo is trying to say here.

nicolasb
15-Nov-2006, 10:09
First of all, the GPU is a logical development that started with the R500Xenos, or Xbox GPU, but without the 10MB eDRAM part. Unlike the Xbox GPU, the R600 has to be able to support a large number of resolutions and, if we take a look at today's massive 5Mpix resolutions, it is quite obvious that R600 should feature at least five times more eDRAM than Xbox 360 has

I doubt that the R600 has eDRAM and only has a modest cache. It wouldn't make sense to have a 512-bit bus if you have eDRAM.

Maybe you read that wrong...? They're saying that if R600 had eDRAM it would have to have 5 times as much as Xenos, which is clearly impossible and is precisely the reason why R600 quite definitely doesn't have eDRAM.

I know one shouldn't necessarily believe The Inquirer, but, if this does turn out to be accurate, does anyone else feel a little disappointed? 16 ROPs and 64 shaders? Good news about the clock speed and the memory bandwidth, I suppose, but.... :| In my heart of hearts I was hoping for something that would convincingly beat G80 all round. Oh well.

Ailuros
15-Nov-2006, 10:12
*sigh* Why don't you guys first wait what each unit is capable of before you even come to conclusions about unit amounts? It must be the third time I remind that, but as more food for thought:

R5x0 ALU = 12 FLOPs
Xenos ALU = 9 FLOPs
G80 SP = 3 FLOPs

Uhmmm hello....?

And yes before anyone says it sterile flop numbers aren't an indicator either, yet it's even more senseless to judge something purely on unit amount. Patience until we know what each is capable of.

Kocur
15-Nov-2006, 10:56
Well, if the Inq is right (on the specs, not on the analysis :smile:), then R600 clocked at 800MHz will be, theoretically, up to 20% faster in shaders than the current 8800GTX. Of course, in practice the performance difference will vary, depending on the particular scenario. Moreover, the "efficiency" of the chip could be an important factor here. And it looks like, again, ATI will be faster in "heavy duty" AA modes in very high resolutions.
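
The post doesn't show its working, but one way to land near "up to 20%" is to count only MADD throughput on both chips. A sketch under that assumption, taking the Inquirer's 64 shaders x 4-way SIMD at 700-800MHz and G80 as 128 scalar units at 1350MHz without the extra MUL:

```python
def madd_gflops(scalar_lanes, clock_mhz):
    """Peak GFLOPs counting one MADD (2 FLOPs) per lane per clock."""
    return scalar_lanes * 2 * clock_mhz / 1000


def advantage_percent(r600_clock_mhz):
    """Rumoured R600 peak relative to G80 peak, as a percentage lead."""
    r600 = madd_gflops(64 * 4, r600_clock_mhz)  # rumoured layout, unconfirmed
    g80 = madd_gflops(128, 1350)                # MUL deliberately not counted
    return (r600 / g80 - 1) * 100


for clock in (700, 800):
    print(f"{clock}MHz: {advantage_percent(clock):+.1f}% vs G80")
```

At 800MHz that comes out at about +18.5%, and at 700MHz the lead shrinks to under 4%, so the conclusion hinges on the top of the rumoured clock range and on leaving the MUL out.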

fehu
15-Nov-2006, 10:58
Hi,
I'm new to this forum, although I've been reading it fairly regularly.

I have a question, or rather a curiosity.
I understand that G80 has many simpler units and R600 may have more complex Vec4 units.
From a developer's standpoint, does it make a difference?
Will we again see, in this generation, games running badly on a card only because every game needs a different path with different optimizations for every card? :evil:
Or does it simply mean that a developer sends the GPU some instructions and G80 splits them while R600 groups them?

In the first case, what are the real-world instructions used by developers: simple or vec4?

Rys
15-Nov-2006, 11:11
Hi,
I'm new to this forum, although I've been reading it fairly regularly.

I have a question, or rather a curiosity.
I understand that G80 has many simpler units and R600 may have more complex Vec4 units.
From a developer's standpoint, does it make a difference?
Will we again see, in this generation, games running badly on a card only because every game needs a different path with different optimizations for every card? :evil:
Or does it simply mean that a developer sends the GPU some instructions and G80 splits them while R600 groups them?

In the first case, what are the real-world instructions used by developers: simple or vec4?
The developer really shouldn't care, unless they're fine tuning. The driver and its instruction assembler is designed to get the best use out of the shading hardware, so you just write your shader and assume that part of it is just running as best it can (for the most part). Per-chip optimisations are generally not something a developer wants to spend much time on, but that's not to say it doesn't happen (especially in the console space).

As for real-world instruction mixes, they're so varied that choosing the most common is a redundant task, and it certainly wouldn't come down to just scalar or vec4.

Regardless, the first D3D10 parts should have some commonality in regards to how they tackle the instruction mix of real-world shaders.

fehu
15-Nov-2006, 11:26
...

So the R600 is only theoretically faster in the shader field?:?:

Rys
15-Nov-2006, 11:35
so the R600 is only theoretically faster in the shader field?:?:
It's impossible for me to say at this point. Regardless, don't believe (even some small fraction) of what The Inquirer deemed fit to report as fact this morning, since they're hilariously wrong in places.

NocturnDragon
15-Nov-2006, 11:38
so the R600 is only theoretically faster in the shader field?:?:

What is a R600? :wink:

Razor1
15-Nov-2006, 11:42
I'll put more money on that Chinese forum than the INQ :wink:



I know R600 MC512, but the version can not help but remind us of the limits R520XT, Mars Card



This is an interesting quote from that forum. Guy says he used to work as internal staff at AMD.

CJ
15-Nov-2006, 12:19
Latest rumors say 07 January 2007 for an NDA to expire... seems like ATI is going to send something 'new' to reviewers at or before that time. Could it be R600?

LeStoffer
15-Nov-2006, 12:47
Latest rumors say 07 January 2007 for an NDA to expire... seems like ATI is going to send something 'new' to reviewers at or before that time. Could it be R600?

Are we talking about a hard launch, because then January 7 is sooner than expected.

Geo
15-Nov-2006, 13:37
Are we talking about a hard launch, because then January 7 is sooner than expected.

That does feel a bit early, if for no other reason than that it would mean AMD requiring the reviewing community to spend their Christmas/New Year's holidays cranking away at R600. There were three weeks to work on G80, and it was all needed, as it would be for R600, or any other new gen architecture. Maybe that date would work for a refresh, but it would be tough for a new gen NDA lift.

Bouncing Zabaglione Bros.
15-Nov-2006, 13:45
That does feel a bit early, if for no other reason than that it would mean AMD requiring the reviewing community to spend their Christmas/New Year's holidays cranking away at R600. There were three weeks to work on G80, and it was all needed, as it would be for R600, or any other new gen architecture. Maybe that date would work for a refresh, but it would be tough for a new gen NDA lift.

ATI may not have much choice. G80 has been (rightfully) very well received, and is picking up all the publicity and good PR. If ATI doesn't get out there as soon as possible, even a very good R600 will simply have lost a lot of mindshare and word-of-mouth advertising.

People don't want to wait too long when they are getting their Vista PCs and seeing all the goodies G80 can give them in the next few months.

You reviewers might end up between a rock and a hard place over Christmas and New Year's because I don't think ATI has the luxury of delaying R600's release to fit it in neatly with what makes everyone happy.

neliz
15-Nov-2006, 13:45
Inq just said a 12 layer pcb which is smaller than the g80tx

http://uk.theinquirer.net/?article=35708

I don't think that all they said will make it to all variants though, the monstrous cooling they talk about can't really be used on a Pro variant.. now can it?

It also kills off most of the rumours (including theirs) as "nv marketing fud", especially with regard to it being huge and power hungry; they did a 180 and now present it as small and energy efficient.. at least, less hungry than g80.

IbaneZ
15-Nov-2006, 14:09
Still linking to the inq are we? :lol:

http://www.beyond3d.com/forum/showpost.php?p=865881&postcount=2432

fehu
15-Nov-2006, 14:13
Inq isn't the bible, but simply a place where you can find tons of rumors
sometimes the rumors are true, sometimes not

why blame them?

Razor1
15-Nov-2006, 14:14
because they write them as if they are the truth :wink:

neliz
15-Nov-2006, 14:19
Well.. Part 2 said they actually saw a card and they talked about revisions. Or are they just translating all the chinese/taiwanese sites on this?

IbaneZ
15-Nov-2006, 14:26
Inq isn't the bible, but simply a place where you can find tons of rumors
sometimes the rumors are true, sometimes not

why blame them?

I agree. What would we do without the inq? Most rumor threads would be pretty boring. :wink:

But "Geeforcers" post is funny.

I can't wait to see what ATI has planned for us. I'm tired of reading how damn good G80 is, I want something fresh. :lol:

neliz
15-Nov-2006, 15:03
I can't wait to see what ATI has planned for us. I'm tired of reading how damn good G80 is, I want something fresh. :lol:

I can't help thinking about R400 when I read that. If R400's design was too weak for DX9, how far has that design evolved since then to make it succeed today?

INKster
15-Nov-2006, 15:16
All this "news" about R600 makes me a little upset.
With the G80 GTX and the GX2 I was already seeing excessive power consumption and board length, but if this turns out to be true, I don't know if R600 will be as successful, especially with the holiday season over (meaning lower budgets for ultra-high-insane graphics cards).

But, most of all, it spells bad news for mainstream cards. I certainly wouldn't want a X1650 XT successor with X1950 XTX-level power consumption and heat output, even if it meant a similar performance for the money and DX10 support.

Geo
15-Nov-2006, 15:31
You reviewers might end up between a rock and a hard place over Christmas and New Year's because I don't think ATI has the luxury of delaying R600's release to fit it in neatly with what makes everyone happy.

Always possible. On the other hand, Inq has seemed very convinced that R600 will launch with Vista retail. So maybe the 1/7 date will be the editors day. That would fit. (remembering if I knew that to be true, I couldn't talk about it --so I'm speculating like everyone else)

pjbliverpool
15-Nov-2006, 15:55
Well, if the Inq is right (on the specs, not on the analysis :smile:), then R600 clocked at 800MHz will be, theoretically, up to 20% faster in shaders than the current 8800GTX. Of course, in practice the performance difference will vary, depending on the particular scenario. Moreover, the "efficiency" of the chip could be an important factor here. And it looks like, again, ATI will be faster in "heavy duty" AA modes in very high resolutions.

Are you discounting the MUL from G80? Because with it, it's actually still got a little more theoretical shader power than an 800MHz R600 with 64 shader units.

I was also under the impression that G80's scalar design makes it more efficient than the larger Xenos style ALU's.
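
The peak figures being compared here can be reproduced with simple arithmetic. This is only a rough sketch: the G80 side uses the 8800 GTX's 128 scalar ALUs at a 1350MHz shader clock, while the R600 side is purely an assumption built from the rumours (64 units at 800MHz, counted either as Vec4 or as Xenos-style Vec4 plus a scalar). The function name `peak_gflops` is made up for illustration.

```python
def peak_gflops(units, lanes_per_unit, flops_per_lane, clock_ghz):
    """Theoretical peak, assuming every lane issues every clock."""
    return units * lanes_per_unit * flops_per_lane * clock_ghz

# 8800 GTX: 128 scalar ALUs at a 1.35 GHz shader clock.
g80_madd = peak_gflops(128, 1, 2, 1.35)       # MADD only (2 flops per clock)
g80_madd_mul = peak_gflops(128, 1, 3, 1.35)   # MADD + co-issued MUL (3 flops)

# Rumoured R600 (ASSUMED layout, not a confirmed spec): 64 units at 0.8 GHz.
r600_vec4 = peak_gflops(64, 4, 2, 0.8)         # Vec4 MADD only
r600_vec4_plus_1 = peak_gflops(64, 5, 2, 0.8)  # Vec4 + scalar MADD

for name, value in [
    ("G80 MADD", g80_madd),
    ("G80 MADD+MUL", g80_madd_mul),
    ("R600 Vec4 (assumed)", r600_vec4),
    ("R600 Vec4+1 (assumed)", r600_vec4_plus_1),
]:
    print(f"{name}: {value:.1f} GFLOPS")
```

On these assumptions a Vec4-only R600 lands roughly 19% ahead of a MADD-only G80, while counting the MUL puts G80 marginally ahead of even the Vec4+scalar case. None of this says anything about achieved utilisation.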

Jawed
15-Nov-2006, 16:20
I was also under the impression that G80's scalar design makes it more efficient than the larger Xenos style ALU's.
Yeah a scalar arrangement of ALUs is the best configuration for utilisation.

The side effect, though, is that the scheduling, arbitration and register-file logic required to support a scalar pipeline is considerably more complex than that required for the competing vector architecture. This is down to a combination of the quantity of data being moved around and the very fine-grained access patterns. In G80, every clock cycle can schedule a different access pattern against the register file and every clock cycle can issue a different instruction - while ATI's architectures take a more leisurely 4-clock timing. G71 takes a glacial 220-clock timing, by comparison.

So, in the past we've talked about the "unified scheduling overhead" that Xenos or R600 suffers (R580 has a subset). Well, G80 suffers an additional scalar-instruction overhead. I don't know what the magnitude of it is but it seems to me to be non-trivial.

Jawed
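
The utilisation point above can be put into numbers with a toy model. This is a minimal sketch, assuming a Vec4 unit spends one clock per instruction whatever its width (no co-issue), while a scalar ALU spends one clock per component. The instruction mix is hypothetical, and the model deliberately ignores the scheduling and register-file cost discussed above.

```python
def vec4_utilisation(widths):
    """Fraction of Vec4 lanes doing useful work when every instruction
    occupies the whole unit for one clock (no co-issue assumed)."""
    return sum(widths) / (4 * len(widths))

def clocks_needed(widths, scalar):
    """Clocks for one ALU to run the stream: a scalar ALU takes one clock
    per component, a Vec4 ALU takes one clock per instruction."""
    return sum(widths) if scalar else len(widths)

# Hypothetical instruction stream: component count of each instruction.
mix = [4, 3, 1, 1, 2, 4, 1, 3]

util = vec4_utilisation(mix)
# Four scalar ALUs (one pixel each) versus one Vec4 ALU running the
# same four pixels one after another.
scalar_clocks = clocks_needed(mix, scalar=True)
vec4_clocks = 4 * clocks_needed(mix, scalar=False)

print(f"Vec4 lane utilisation: {util:.1%}")
print(f"4 pixels: {scalar_clocks} clocks scalar vs {vec4_clocks} clocks Vec4")
```

With this made-up mix the Vec4 unit keeps about 59% of its lanes busy, which is the gap a scalar arrangement closes, at the price of the extra scheduling logic.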

nelg
15-Nov-2006, 17:31
Yeah a scalar arrangement of ALUs is the best configuration for utilisation.

The side effect, though, is that the scheduling, arbitration and register-file logic required to support a scalar pipeline is considerably more complex than that required for the competing vector architecture. This is down to a combination of the quantity of data being moved around and the very fine-grained access patterns. In G80, every clock cycle can schedule a different access pattern against the register file and every clock cycle can issue a different instruction - while ATI's architectures take a more leisurely 4-clock timing. G71 takes a glacial 220-clock timing, by comparison.

So, in the past we've talked about the "unified scheduling overhead" that Xenos or R600 suffers (R580 has a subset). Well, G80 suffers an additional scalar-instruction overhead. I don't know what the magnitude of it is but it seems to me to be non-trivial.

Jawed

What about compiler optimizations? Would a scalar design have less room/need for optimizations vs. a vector design?

Jawed
15-Nov-2006, 19:28
I dare say it should be simpler to compile for. The fewer units you have to co-issue or dual-issue, the simpler in general, simply because you have fewer combinations of instruction ordering that you need to evaluate for dependency.

What I'm not clear about is the co-issued MUL, the one that's currently "missing". And how that relates to the SF/interpolator unit.

Rys and Uttar should be reporting back with more info on all this - presumably the 100 series drivers are going to "activate" more G80 functionality. And it may change again with D3D10 drivers.

Jawed

Razor1
15-Nov-2006, 20:23
Well, you pretty much hit the nail on the head there Jawed, but shader compilers are pretty good as it is for vector architectures, so there won't be much of a difference IMO. Possibly in some special cases the G80's scalar units will have an advantage, well, other than being utilized more of the time.

Arun
15-Nov-2006, 20:52
Well, you pretty much hit the nail on the head there Jawed, but shader compilers are pretty good as it is for vector architectures, so there won't be much of a difference IMO. Possibly in some special cases the G80's scalar units will have an advantage, well, other than being utilized more of the time.

This has nothing to do with the compiler. Modern shader compilers will handle auto-vectorization for pre-G8x architectures just fine anyway. But your compiler can be as godly as you want it to be, it still won't be able to generate vector code out of a set of dependent scalar instructions!
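
To illustrate the dependency point, here is a toy packer. It is only a sketch under simplifying assumptions: ops are packed greedily in program order into groups of up to four, and an op cannot share a group with an op that writes one of its sources. The function name `pack_into_vec4` and the register names are made up, and a real shader compiler does far more than this.

```python
def pack_into_vec4(ops):
    """Greedily pack scalar ops into groups of up to four lanes.

    ops: list of (dest, sources) tuples in program order.
    An op may join the current group only if none of its sources
    is written inside that same group.
    """
    groups = []
    current, written = [], set()
    for dest, sources in ops:
        depends = any(src in written for src in sources)
        if depends or len(current) == 4:
            # Close the current group and start a new one.
            groups.append(current)
            current, written = [], set()
        current.append(dest)
        written.add(dest)
    if current:
        groups.append(current)
    return groups

# Four independent scalar ops: they fit into one Vec4 issue.
independent = [("r0", ["a"]), ("r1", ["b"]), ("r2", ["c"]), ("r3", ["d"])]
# A dependent chain: each op needs the previous result.
dependent = [("r0", ["a"]), ("r1", ["r0"]), ("r2", ["r1"]), ("r3", ["r2"])]

print(pack_into_vec4(independent))
print(pack_into_vec4(dependent))
```

The independent ops fill one group of four, while the dependent chain ends up as four groups of one, so three of the four lanes sit idle on every issue no matter how clever the packing is.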

As for the MUL, it's obviously working on perspective correction stuff right now and happy doing that, but we'll see whether it begins being able to do other stuff in the future. There isn't much to test about it without new drivers sadly - although that doesn't mean there's nothing left to test in current ones, quite on the contrary ;) (although since I don't have my board yet and Rys will be busy in the short-term, I guess it'll have to wait a short while anyhow)


Uttar

Rys
15-Nov-2006, 21:12
although since I don't have my board yet and Rys will be busy in the short-term, I guess it'll have to wait a short while anyhow
Well, I've not been too busy that I didn't run a few tests with 97.02. MUL rate** is ~92% of peak now, not using the thing for anything other than basic shading, up about 5% from 96.94. Others are up a wee bit too, in terms of theoretical throughputs, but nothing massive (and mostly near peak as before).

Generally, I think the guts of their upcoming compiler/assembler work will be to optimise certain mixes in terms of throughput, remove some bottlenecks, stop bubbling, etc, since the basics seem to work fine. Profiling some shipping game shaders is next on my list anyway, even if NVIDIA aren't bothered :twisted:

But yes, Uttar is hinting that I'm off on holiday next week. Hope you all enjoy the madness while I lie on a beach in Portugal :lol:

** This is with a short shader, and the rate went up over one driver revision; the hardware generally gets more efficient as shader program lengths go up, too.

Geeforcer
15-Nov-2006, 21:14
Theo seems almost stunned by the fact that R600 uses Vec4 ALUs. Has anyone, and I mean, anyone, expected differently?

Razor1
15-Nov-2006, 21:26
This has nothing to do with the compiler. Modern shader compilers will handle auto-vectorization for pre-G8x architectures just fine anyway. But your compiler can be as godly as you want it to be, it still won't be able to generate vector code out of a set of dependent scalar instructions!



Ah I see where ya guys are coming from!

3dilettante
15-Nov-2006, 21:40
As for the MUL, it's obviously working on perspective correction stuff right now and happy doing that, but we'll see whether it begins being able to do other stuff in the future. There isn't much to test about it without new drivers sadly - although that doesn't mean there's nothing left to test in current ones, quite on the contrary ;) (although since I don't have my board yet and Rys will be busy in the short-term, I guess it'll have to wait a short while anyhow)


I don't get why something as low-level as dual-issue of a pixel unit can be disabled at as high a level as the driver, or why they would.

How can a buggy driver louse that up?

In CPUs, that kind of switching off is done in microcode, and not without good reason.

What if we have to wait until G85 for a fully capable second MUL, if ever?