AMD: R7xx Speculation

Huh? Four-thirds! For some reason the article is written on the basis that RV670 has 480 ALU lanes.

---

Can anyone work out what specifically was done to make the ALUs smaller?

Jawed

Gah! Thanks for catching the typo. Don't know how that slipped past. It should be fixed to say "more than double."

Review of the 4850 going up tonight, probably. The surprise pulling-in of the embargo lift made it slightly more rushed than I would have liked, but we'll have more interesting stuff next week. :)
 
Hm, simple repacking? But we really have no way to know for sure, being blind about the RV670 ASIC structure. :rolleyes:
 
It's confusing; the first diagram here:

http://www.extremetech.com/article2/0,2845,2320866,00.asp

has a lower section to it implying that the ALUs take less space.

Jawed

Supposedly, they do. You'd have to ask ATI how exactly they managed to do pretty much the same work in less space or increase transistor density, but a single RV770 ALU takes up less die area than a single RV670 ALU.

As for the accuracy of the diagrams - none of those were made by me, they're directly from ATI.
 
The Data Request Bus is poorly defined. It must link all the texture caches to each SIMD array's TMU, but it also has to link the vertex cache and global data share to everything else.
It would have been called a crossbar if it were one, wouldn't it?
So what is it then, if not a crossbar: a switch fabric, a ring bus?
It looks like there's a 1:1 relationship between TUs and L1s.

Alternatively, the TUs can fetch from either the vertex cache or the global data share. Is it reasonable to presume that at any one time only one TU can fetch from either GDS or VC? If all the TUs could fetch from the VC concurrently, say, you'd need a completely stupid crossbar.

So, one fetching TU at a time would imply some fancy thread scheduling (e.g. the per-SIMD sequencers being given timeslots by the command processor, or perhaps round-robin scheduling?).
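
One way to picture that: a full crossbar letting all the TUs read the VC/GDS concurrently would need a port per TU on every cache, so a timesliced single port with round-robin arbitration is the cheap alternative. Here's a toy Python sketch of that idea; it's purely my illustration of the speculation above (the class and all its names are hypothetical), not anything confirmed about RV770:

```python
from collections import deque

class SharedPortArbiter:
    """Toy round-robin arbiter: at most one TU is granted the shared
    VC/GDS read port per timeslot (hypothetical, illustration only)."""

    def __init__(self, num_tus):
        self.order = deque(range(num_tus))  # current round-robin priority order

    def grant(self, requests):
        """requests: set of TU ids requesting the port this timeslot.
        Returns the single TU id granted access, or None if nobody asked."""
        for _ in range(len(self.order)):
            tu = self.order[0]
            self.order.rotate(-1)  # whoever was checked moves to the back
            if tu in requests:
                return tu          # exactly one winner per timeslot
        return None

arb = SharedPortArbiter(num_tus=10)    # assuming RV770's 10 SIMD/TU pairs
print(arb.grant({3, 7}))  # -> 3; TU 7 has to wait for a later timeslot
print(arb.grant({3, 7}))  # -> 7 (TU 3 just went, so it's now behind 7)
```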

So, is local data share an analogue of parallel data cache in G80?

Jawed
 
Thanks for your article, nice to get so much detail in one place.

Jawed
 
The 24x CFAA mode (and the 12x CFAA mode for 4xAA) doesn't cause any extra blurriness at all. Great pains are taken to make sure that only the edges are filtered.

Performance of the CFAA modes on 48xx parts should be surprising ;)
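
For what it's worth, here's a minimal sketch of that "filter only the edges" idea; this is my own illustration of an edge-detect resolve, assuming a simple per-pixel sample-disagreement test, not ATI's actual CFAA shader:

```python
# Toy edge-detect resolve (illustrative assumption, not ATI's algorithm):
# the wide CFAA filter is applied only where a pixel's MSAA samples
# disagree, so interior pixels keep the plain, sharp box resolve.

def resolve_pixel(samples, neighbor_samples):
    """samples: this pixel's own MSAA color samples; neighbor_samples:
    samples gathered from adjacent pixels for the wide filter footprint."""
    avg = lambda xs: sum(xs) / len(xs)
    if max(samples) - min(samples) < 1e-6:  # all samples agree: no edge here
        return avg(samples)                 # standard resolve, zero blur added
    wide = samples + neighbor_samples       # edge found: widen the kernel
    return avg(wide)                        # e.g. 8 own + 16 neighbor = 24x

print(resolve_pixel([0.5] * 8, [0.2] * 16))         # interior -> 0.5, untouched
print(resolve_pixel([0.0] * 4 + [1.0] * 4, [0.3] * 16))  # edge -> wide-filtered
```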

/drool...

/drools some more...

I hope someone tests this out. Pretty sure Techreport will, as they tested it before on R(v)6xx... Can't wait.

If Edge Detect is usable on a single card at 24 samples, that will put it immediately on my must-have list.

Regards,
SB
 
Yeah, if ED works well at 1680x1050 I'm going to pull the trigger.
 
http://www.techpowerup.com/reviews/MSI/HD_4850/images/front_full.jpg

The GPU still seems big for 255 mm². On this nice, straight-on picture I measure 220×216 pixels for the die and 1114 pixels for the PCIe connector; with a PCIe connector length of 85 mm, that works out to 16.8×16.5 mm, or 276 mm², to within 1-2%. Maybe they don't count the plastic cover; however, the same calculation fits the 192 mm² figure for RV670.
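
For reference, here is the same estimate in a few lines of Python (my own sanity check: the pixel counts are read off the photo, and the 85 mm PCIe edge connector is the scale reference):

```python
# Estimate die size from a straight-on board photo by scaling pixel
# measurements against a feature of known physical length (PCIe connector).

def die_size_mm(die_px_w, die_px_h, ref_px, ref_mm=85.0):
    mm_per_px = ref_mm / ref_px         # physical length per image pixel
    w_mm = die_px_w * mm_per_px
    h_mm = die_px_h * mm_per_px
    return w_mm, h_mm, w_mm * h_mm

w, h, area = die_size_mm(220, 216, 1114)
print(f"{w:.1f} x {h:.1f} mm = {area:.1f} mm^2")  # 16.8 x 16.5 mm = 276.7 mm^2
```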

But maybe ATI likes the good vibes around a small chip ;)

276 mm² is the correct size; 255 mm² was just the rumoured size earlier (I think?)
 
It's quite possible that ATI can make a 4870x2 to take the performance crown (and do it cheaper, cooler and with less power than its monolithic competition), while Nvidia simply can't make a G280x2. Right there you can see the advantages of going multi-GPU demonstrated in the form of a product that's viable versus a product that isn't.

ATI decided to get out of that dead-end early (and paid for it over the last couple of chip generations) while Nvidia tried to stick it out with monolithics for as long as possible - but it's put them behind on their next steps towards multi-GPU.

Not necessarily. It's quite possible that Nvidia has had a separate R&D team working on a multi-GPU concept separate from the monolithic GPU concept. We all know they have enough cash to do this. What we don't know is whether they have or have not been doing it.

However, considering ATI is closer to being first to market with a viable (non-traditional AFR multi-GPU) product, it's probably safe to consider them ahead in that. And certainly (other than user-configured profiles) ahead of Nvidia with multi-GPU software. All IMHO, of course.

Regards,
SB
 
If there won't be GT200-related midrange/low-end products in the near future, but just 8x00/9x00 cards filling those price segments, it might indicate that they have assigned other design team(s) to an "alternative" (in this case, multi-GPU instead of monolithic) route.
But it remains to be seen, of course.
 
GT2xx parts are still in development, not canned. I think it was Arun who dug up info on them.
 
Not necessarily. It's quite possible that Nvidia has had a separate R&D team working on a multi-GPU concept separate from the monolithic GPU concept.
What concept? Take two GT204s (something like 8 TCPs @ 55nm), slap them together on one board and sell it as GT290X2 or something? Where's the concept in that?
GT200 is a great chip. It dissipates less heat and draws less power than even 4850 CF while being on 65nm (according to Anand, anyway). The only "problem" with it is that it doesn't provide the kind of speed we were anticipating from a "G8x"-architecture GPU of such complexity. But I believe we'll see some significant improvements here with future driver releases. And I believe it'll be unbeatable in GPGPU/CUDA applications (with the possible exception of DP math), which was one of the key points of its design, as far as I can tell.
So I really see no reason for any desperate moves on NV's side. Some mid-term strategy correction maybe, but nothing radical.
 