AMD: R7xx Speculation

Status
Not open for further replies.
:LOL:

But it's just amazing what they've done with only 256mm². I would like to see what they could do with 400-500mm², but if R700 is a real multi-GPU architecture, that won't be necessary. :smile:
 
Yeah, the number of changes tells me that R700/RV770 really was in the works for a long time. But still, amazing work, and that hub sounds interesting...
 
By the way, it's funny how diametrically opposed the design approaches of G200 and RV770 are: the former has its SIMD arrays distributed along the edges of the ASIC, in contrast to RV770, where all the units are neatly squared away in the very center of the chip. :smile:
 
I remember the Dave Orton interview from the R420 days and the "who will step down first?" question about ever-increasing die sizes. Well, the answer has obviously been ATI/AMD.

But I have a suspicion that GT200 will keep the "largest GPU ever fabbed" crown for all time, and that even NV will "step down" a bit from here.

You could also look at "stepping down" as taking the inevitable route to multi-core GPUs - and here it seems that ATI is ahead, and Nvidia will have to follow. G200 is the last hurrah for the giant monolithic GPU.
 
So it's finally clear: 800 SPs, 40 TMUs, 16 ROPs, a new MC - but how does it all work on R700?

The obvious assumption is that there is some form of high-speed interconnect between the hubs on both chips. What I'd like to know is just how much bandwidth that connection has, because it would seem to be the major bottleneck.

The odd thing is that the setup looks almost identical to the memory controllers used by AMD's quad cores.
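To put a rough number on that bottleneck, here's a back-of-envelope sketch; the resolution, frame rate and per-pixel size are all my assumptions for illustration, not anything from the article:

```python
# Back-of-envelope estimate (hypothetical numbers): how much bandwidth
# would a chip-to-chip link need just to move one finished frame per
# displayed frame between the two GPUs?

width, height = 2560, 1600      # assumed display resolution
bytes_per_pixel = 4             # 32-bit colour only, no Z
fps = 60                        # assumed frame rate

frame_bytes = width * height * bytes_per_pixel
link_gbps = frame_bytes * fps * 8 / 1e9   # sustained gigabits per second

print(f"one frame: {frame_bytes / 2**20:.1f} MiB")
print(f"sustained link rate: {link_gbps:.2f} Gbit/s")
```

Even this simplest case lands in the multi-Gbit/s range, and any finer-grained sharing (textures, Z, intermediate buffers) would need far more.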
 
Yeah it seems increasingly likely that G200 is going to be the largest monolithic GPU for a few years at least.

In a way I feel like ATI's new strategy was bold. This was the first time we actually had both teams release their cards on time (shocking for ATI, I know), and the first time ATI had this strategy in place right from the beginning of a new generation (RV670 vs G92 doesn't count, since those were just die-shrink refreshes of existing monolithic GPUs). In a lot of ways it makes sense - ATI can easily convert the RV770 architecture to mobile GPUs, while G200 probably never will be until it's cut down and shrunk considerably. Cutting RV770 down to RV710, RV730, etc. shouldn't be that hard either, if the rumors that the architecture is very modular and scalable are true.

Nvidia's approach of releasing a monolithic high-end GPU and only later bringing in smaller derivatives (G80 came first, then G84 and G86) may be in trouble, because it seems Nvidia has different teams focused on each set, and G84 and G86 were terrible in comparison. I'm sure there are G200 derivatives for the midrange on the way, but with word that the G92 design will be used for another year or so, it remains to be seen whether those cards can still face off against RV770.

Anyway, as Jawed has pointed out a few times, the next shrink available from TSMC is likely to be 40nm (if they skip 45nm) sometime early next year, so we could see an RV770 derivative (maybe RV870) with some ridiculous numbers in the same die space. The R600 architecture was supposed to last at least three generations, so with the R800 series we might see some gigantic numbers if they're on the 40nm process.
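For a sense of scale, here's the idealized shrink arithmetic; real density gains are always lower than the geometric limit, and the node numbers are nominal:

```python
# Idealized process-shrink arithmetic: transistor density scales roughly
# with the square of the linear feature-size ratio (best case).

old_node, new_node = 55.0, 40.0           # nm
density_gain = (old_node / new_node) ** 2

rv770_sps = 800                           # SPs in RV770's ~256mm^2 at 55nm
sps_same_area = rv770_sps * density_gain  # ideal SP count in the same area at 40nm

print(f"ideal density gain: {density_gain:.2f}x")
print(f"~{sps_same_area:.0f} SPs in the same die area")
```

So even allowing for real-world overhead well below the ideal ~1.9x, "gigantic numbers" at 40nm isn't a stretch.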
 
We first heard rumbles of the strategy we're seeing play out now over a year ago. I tend to think this is also partly down to the AMD acquisition; I doubt an independent ATI would have gone this route. You need "the halo" more if GPUs are your main thing than if they're just another part of what you do.
http://eetimes.com/news/latest/show...XITDCIQSNDLRSKH0CJUNN2JVN?articleID=208404063

The decision to use a two-chip strategy for the high end was made more than two years ago, based on an analysis of yields and scalability. It was not related to AMD's recent financial woes, said Rick Bergman, general manager of AMD's graphics division.
Seems to me this was a decision made before AMD came on the scene.

AMD hasn't chosen to stop making halo graphics cards - it's decided to make them with 2 GPU chips.

Jawed
 
I'd give anything for there to be another architect interview at B3d like the R600 one, though that may have been an anomaly. What happened with the ring bus? Was it too overloaded, too unnecessary, too much of a pain to optimize?

Extremetech needs to fix its diagram too, otherwise RV770 has 1600 ALUs.

The Data Request Bus is poorly defined. It must link all the texture caches to each SIMD array's TMU, but it also has to link the vertex cache and global data share to everything else.
It would have been called a crossbar if it were one, wouldn't it?
So what is it then, if not a crossbar - a switch fabric? A ring bus?
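One reason the distinction matters is wiring cost. Here's a toy link-count comparison between the candidate topologies (an illustrative model, not RV770's actual wiring; the client/target counts are assumed):

```python
# Toy comparison of point-to-point link counts for interconnect
# topologies joining n clients to m cache/memory stops.

def crossbar_links(n_clients, m_targets):
    # full crossbar: every client has a direct path to every target
    return n_clients * m_targets

def ring_links(n_stops):
    # ring bus: each stop connects only to its two neighbours,
    # so the link count equals the stop count
    return n_stops

clients, targets = 10, 4    # e.g. 10 SIMDs, 4 L2/MC stops (assumed)
print("crossbar:", crossbar_links(clients, targets))
print("ring:", ring_links(clients + targets))
```

A crossbar's link count grows multiplicatively while a ring's grows linearly, which is exactly why rings get attractive as client counts climb - and why it's interesting that AMD walked away from one.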
 
Well, the GTX+ had better fix the issues at higher AA levels and resolutions, or otherwise the 4850 - which seems to take far smaller hits than the 9800GTX - will still be the better solution in that price bracket if you love AA, AF and high resolutions.

Now I really want to see what the 4870 can do, though.
 
Some interesting stuff - bye bye, ring bus.

I don't know if the NDA has lifted yet for the architectural changes in RV770 (compared to RV670), but the guys at ExtremeTech have already started dissecting the body...

here:
http://www.extremetech.com/article2/0,2845,2320865,00.asp
LOL, so the memory system is still fully distributed but no longer has the form of a ring :LOL: :LOL: :LOL:

Moving the L2s closer to the MCs is, of course, very similar to what NVidia has (identical?). The hub in RV770 is much like the crossbar in G8x and later, twixt texture clusters and ROP partitions.

The difference between the two designs appears to be that the hub is not involved in completed fragments arriving in the RBEs - there is seemingly a separate path, via Shader Export, directly to the RBEs.

Though with NVidia's GPUs we don't know if there's a single crossbar between clusters and partitions or whether there's one for texture data and a separate one for fragments.

So AMD seems to have chosen to match RBE count and MC count (RV560/570, RV630/635 being the obvious exceptions) - which means that colour/z/stencil performance looks like it'll only scale with GPU-count and clock.
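That scaling claim is easy to put numbers on. The RBE count is from the thread; the engine clock here is my assumption for illustration:

```python
# Rough fill-rate arithmetic under the "RBE count matches MC count"
# observation: colour fill scales only with RBE count, clock, and
# (for R700) the number of GPUs.

rbes = 16            # RV770 render back-ends, one colour pixel/clock each
clock_mhz = 750      # assumed engine clock
gpus = 2             # R700 = two RV770s

fill_gpix = rbes * clock_mhz * 1e6 / 1e9
print(f"single chip: {fill_gpix:.1f} Gpixels/s colour fill")
print(f"dual chip:   {fill_gpix * gpus:.1f} Gpixels/s")
```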

---

ATI managed to cram two-thirds more stream processors
Huh? 150% more! For some reason the article is written as if RV670 had 480 ALU lanes, when it actually has 320.
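Checking the fractions makes the article's mistake obvious - "two-thirds more" only works if you start from a 480-lane RV670:

```python
# The article's "two-thirds more stream processors" implies a 480-lane
# baseline; against RV670's real 320 lanes the increase is 150%.

rv770_lanes = 800
rv670_lanes = 320
assumed_base = 480   # the baseline the article seems to have used

increase_vs_real = (rv770_lanes - rv670_lanes) / rv670_lanes
increase_vs_assumed = (rv770_lanes - assumed_base) / assumed_base

print(f"vs 320 lanes: {increase_vs_real:.0%} more")
print(f"vs 480 lanes: {increase_vs_assumed:.0%} more (the article's two-thirds)")
```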

---

Can anyone work out what specifically was done to make the ALUs smaller?

Jawed
 
AMD hasn't chosen to stop making halo graphics cards - it's decided to make them with 2 GPU chips.

It's quite possible that ATI can make a 4870 X2 to take the performance crown (and do it cheaper, cooler and with less power than its monolithic competition), while Nvidia simply can't make a G280 X2. Right there you can see the advantages of going multi-core, demonstrated in the form of a product that's viable versus a product that isn't.

ATI decided to get out of that dead end early (and paid for it over the last couple of chip generations), while Nvidia tried to stick with monolithic designs for as long as possible - but that's put them behind on their next steps towards multi-GPU.
 
The extremetech diagram kind of conflicts with the other block diagram from pcinlife.

The data request bus is listed as a crossbar there, but the pcinlife diagram has no direct mention of a crossbar between the L2 caches and the L1s.

I'm also seeing no mention of that memory read/write cache in the extremetech diagram.
 