AMD RV770 refresh -> RV790

Something is wrong in this preview: they say the chip includes 8 render back ends, but their comparison table states 16.
I think it was a typo, as it now appears to have been corrected.

I'm pretty interested in one of these. With an overclock it should be a lovely little card.
 
Theoreticals in comparison with HD4670:
  • GFLOPs - 87%
  • Texture - 87%
  • Bandwidth - 160%
Average performance is 152%.

Shouldn't those theoreticals be 187% at least for GFLOPs? Texture is lower due to the higher clocks on 4670.

Also, Guru3D seems to have received this specific sample through outside channels, as most of us probably know. I think that is why they were not going to put a real label on the card. I expect the top model, the 4770, to be 750-800MHz with the 6-pin, and a 650-700MHz model, the 4750, hopefully without the 6-pin.
 
Got to remember that HD4850 is short on bandwidth, particularly for 8xMSAA (apparently about 25% short) though only 5%+ short with 4xMSAA.

So, 8 RBEs. That would imply a chip that's very much in the vicinity of 100mm2...

To me, the 4850 also seems to be short on RBE capacity, so I find these results quite unbelievable if these 2 RBE-quads are the same as in the RV770 - especially as the chip has less bandwidth than the 4830. What if they bumped some of the RBE rates (like z-pixel), though?

edit: seems like Guru3D is talking about 16 ROPs all the way, now...
 
Could this card be the 4930 instead?

1. They may not want to release too many lower end variants of the mainline chip when a lower end chip may suffice. They want to keep their margins higher.

2. It includes a 6 pin PCI-E power adapter whilst the 4670 does not and fitting it in at the 47xx level could confuse consumers since it seems to perform above the level of the 4830 so could do as a cheaper replacement.

3. There is a large distinction between OEM PC ready cards like the 4670 and the ones which require a PCI-E power supply and therefore must be built in with a much beefier PSU.

Lastly, could this also be a base line chip for the 4690?

Say HD 4930 for the high end variant and HD 4690 for a lower end GDDR3 version with no external PCI-E connection?
 
Shouldn't those theoreticals be 187% at least for GFLOPs? Texture is lower due to the higher clocks on 4670.
D'oh, yeah, you're right, I got that completely wrong. Should be 832GFLOPs for "HD4750" versus 480GFLOPs, i.e. 173%.
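For anyone wanting to check the corrected figure, here is a rough sketch of the arithmetic. The ALU counts and clocks are assumptions that are not stated in the thread (320 ALUs at 750MHz for HD4670, 640 ALUs at 650MHz for the "HD4750" sample), chosen because they reproduce the 480 and 832 GFLOPs quoted above. The helper name is made up for illustration.

```python
def peak_gflops(alus: int, core_mhz: float) -> float:
    """Peak single-precision GFLOPs, counting one multiply-add (2 FLOPs) per ALU per clock."""
    return alus * 2 * core_mhz / 1000.0


# Assumed specs, not taken from the thread itself.
hd4670 = peak_gflops(320, 750)   # 480.0
hd4750 = peak_gflops(640, 650)   # 832.0

# "HD4750" as a percentage of HD4670.
ratio_percent = round(100 * hd4750 / hd4670)
print(f"HD4670: {hd4670:.0f} GFLOPs, HD4750: {hd4750:.0f} GFLOPs, ratio: {ratio_percent}%")
```

If the retail part ships at a different core clock, only the second argument changes and the ratio scales linearly with it.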

Jawed
 
Could this card be the 4930 instead?

1. They may not want to release too many lower end variants of the mainline chip when a lower end chip may suffice. They want to keep their margins higher.

2. It includes a 6 pin PCI-E power adapter whilst the 4670 does not and fitting it in at the 47xx level could confuse consumers since it seems to perform above the level of the 4830 so could do as a cheaper replacement.

3. There is a large distinction between OEM PC ready cards like the 4670 and the ones which require a PCI-E power supply and therefore must be built in with a much beefier PSU.

Lastly, could this also be a base line chip for the 4690?

Say HD 4930 for the high end variant and HD 4690 for a lower end GDDR3 version with no external PCI-E connection?
4730/4750/4770/4790 fits RV740 perfectly, though I doubt there will be 4 variants of the chip (3 maybe). I would imagine RV770 is being EOL'ed if it hasn't been already. RV730 is still relatively young, but I would expect its higher parts to be replaced with the low-end RV740.

I have a feeling only the highest model will have a 6pin.
 
With the HD4750 coming so close to HD4850 performance (and possibly a higher-clocked HD4770 could be on par with it), could there be a possibility that the HD4850/4870 might be replaced with the HD4900 series (RV790)?
 
With the HD4750 coming so close to HD4850 performance (and possibly a higher-clocked HD4770 could be on par with it), could there be a possibility that the HD4850/4870 might be replaced with the HD4900 series (RV790)?

That's what I think too. They're getting mature HD49XX parts out and need to replace the 4830, which was made of drop-off parts anyway, so that's where 740 comes in.
 
So if 47xx replaces 46xx in the lineup, and 49xx replaces 48xx, does that mean we'll probably also see a 44xx replace the existing 43xx? Especially since this generation ATI seems to be on the ball about a top to bottom lineup.

I'm wondering if they did anything to rework idle power consumption. That would be icing on the cake if the 47xx performance increase over 46xx is even remotely indicative of the performance increase the 49xx series will have over the 48xx series (doubtful but you never know).

[edit] - NM, just saw in the Guru3d preview that they don't know what the official name of RV740 will be yet. So it could be everything from a 4750 to a 4835 to a 4930. :p

Regards,
SB
 
To me, the 4850 also seems to be short on RBE capacity,
Puzzled why you say that - it appears to be short on bandwidth. HD4870 is usually at least 25% faster despite being only clocked 20% higher - HD4870's extra bandwidth is helping a lot. 8xMSAA ranges from 35-50% faster on HD4870.

so I find these results quite unbelievable if these 2 RBE-quads are the same as in the RV770 - especially as the chip has less bandwidth than the 4830. What if they bumped some of the RBE rates (like z-pixel), though?

edit: seems like Guru3D is talking about 16 ROPs all the way, now...
The more I think about what I said earlier about 8 RBEs being enough the more uneasy I feel about it :???:

What's bugging me is that in something like CoD:WaW with 4xMSAA HD4870 should be about 26% faster than HD4850. With "HD4750" at 90% of HD4850, that makes HD4750 apparently 71% of HD4870.

Texturing is 69% of HD4870... Are the TUs having more of a pronounced effect in these benchmarks - allowing only 8 RBEs to hold the fort?
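A quick sketch of how the 71% and 69% figures above line up. The 26% and 90% inputs come from the post. The texture unit counts and clocks (32 TUs at 650MHz for "HD4750", 40 TUs at 750MHz for HD4870) are assumptions that happen to reproduce the 69%.

```python
def texel_rate_gtexels(texture_units: int, core_mhz: float) -> float:
    """Peak bilinear texel rate in GTexels/s, one texel per TU per clock."""
    return texture_units * core_mhz / 1000.0


# Relative game performance, chained through HD4850 (figures from the post).
hd4870_over_hd4850 = 1.26   # HD4870 about 26% faster than HD4850
hd4750_over_hd4850 = 0.90   # "HD4750" at 90% of HD4850
hd4750_over_hd4870 = hd4750_over_hd4850 / hd4870_over_hd4850

# Theoretical texturing ratio (unit counts and clocks are assumed).
tex_ratio = texel_rate_gtexels(32, 650) / texel_rate_gtexels(40, 750)

print(f"game performance vs HD4870: {hd4750_over_hd4870:.0%}")
print(f"texel rate vs HD4870:       {tex_ratio:.0%}")
```

The two percentages landing within a couple of points of each other is what prompts the question about texture units dominating these benchmarks.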


I didn't evaluate performance in comparison with HD4830:
  • GFLOPs - 113%
  • Texture - 113%
  • Bandwidth - 89%
Averaging 108% in games. Taken at face value, if an 8-RBE RV740 is able to do this then it means that TUs are dominating the performance of these games, and that the 32 TUs in HD4670 just didn't have enough bandwidth.
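As a rough check on the 113% and 89% figures, the sketch below derives them from clocks. The 575/900 clocks for HD4830 and the 650MHz core / 800MHz GDDR5 for the RV740 sample appear elsewhere in the thread. The bus widths (256-bit and 128-bit) and the transfers-per-clock values are assumptions, and the function name is hypothetical.

```python
def bandwidth_gbs(bus_bits: int, mem_mhz: float, transfers_per_clock: int) -> float:
    """Peak memory bandwidth in GB/s: bytes per transfer x clock x transfers per clock."""
    return bus_bits / 8 * mem_mhz * transfers_per_clock / 1000.0


# HD4830: assumed 256-bit GDDR3 (2 transfers per clock) at 900MHz, core 575MHz.
hd4830_bw = bandwidth_gbs(256, 900, 2)    # about 57.6 GB/s
# "HD4750": assumed 128-bit GDDR5 (4 transfers per clock) at 800MHz, core 650MHz.
hd4750_bw = bandwidth_gbs(128, 800, 4)    # about 51.2 GB/s

bandwidth_ratio = hd4750_bw / hd4830_bw   # about 0.89
# With equal ALU and TU counts (an assumption), GFLOPs and texture scale with core clock.
core_ratio = 650 / 575                    # about 1.13

print(f"bandwidth: {bandwidth_ratio:.0%}, GFLOPs/texture: {core_ratio:.0%}")
```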


Comparing bandwidth per RBE clock:
  • HD4670 - 5.3GB/s/RBE/clock
  • HD4830 - 6.3GB/s/RBE/clock
  • HD4750 8 RBEs - 9.8GB/s/RBE/clock
  • HD4750 16 RBEs - 4.9GB/s/RBE/clock
  • HD4850 - 6.4GB/s/RBE/clock
  • HD4870 - 9.6GB/s/RBE/clock
The two GDDR5 GPUs having 9.8 and 9.6GB/s/RBE/clock looks like quite a coincidence, doesn't it?
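The list above can be reproduced with a few lines, assuming the metric is peak bandwidth divided by (RBE count x core clock in GHz), which works out to bytes per RBE per clock. The bandwidth and clock figures in the table below are assumptions filled in from commonly quoted specs, not numbers given in the thread, and HD4850's bandwidth is rounded to 63.6GB/s.

```python
def bandwidth_per_rbe_clock(bw_gbs: float, rbes: int, core_mhz: float) -> float:
    """Bandwidth available to each RBE on each core clock (GB/s per RBE per GHz)."""
    return bw_gbs / (rbes * core_mhz / 1000.0)


# name: (bandwidth in GB/s, RBE count, core clock in MHz), all assumed values.
cards = {
    "HD4670": (32.0, 8, 750),
    "HD4830": (57.6, 16, 575),
    "HD4750 8 RBEs": (51.2, 8, 650),
    "HD4750 16 RBEs": (51.2, 16, 650),
    "HD4850": (63.6, 16, 625),
    "HD4870": (115.2, 16, 750),
}

results = {name: round(bandwidth_per_rbe_clock(*spec), 1) for name, spec in cards.items()}

for name, value in results.items():
    print(f"{name:15s} {value:.1f}")
```

Swapping the RBE count for the RV740 entry between 8 and 16 is the only change needed to move between the two scenarios being argued about.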

Gah, I dunno :???: I should be playing Quake Live but the queue is really really really taking the piss right now.

Jawed
 
If that's just the "pro" model and they can deliver in large quantities, I think Nvidia's G92 inventory will be around a lot longer than the green boys would like.
 
Puzzled why you say that - it appears to be short on bandwidth. HD4870 is usually at least 25% faster despite being only clocked 20% higher - HD4870's extra bandwidth is helping a lot. 8xMSAA ranges from 35-50% faster on HD4870.

I did a comprehensive test with a 4830 and a 4850 clocked to 575/900, and the average difference was 5% without AA and 4% with 4xMSAA. Yes, the 4850 is bw limited, but not this much - if only the bandwidth was responsible for the complete lack of showing of the +25% ALUs and TEX units with 4xAA, I think at least the 0xAA case would be a bit more ahead. Besides, the 4870 would have more than 25-30% advantage, I believe.

What's bugging me is that in something like CoD:WaW with 4xMSAA HD4870 should be about 26% faster than HD4850. With "HD4750" at 90% of HD4850, that makes HD4750 apparently 71% of HD4870.

Texturing is 69% of HD4870... Are the TUs having more of a pronounced effect in these benchmarks - allowing only 8 RBEs to hold the fort?

I didn't evaluate performance in comparison with HD4830:
  • GFLOPs - 113%
  • Texture - 113%
  • Bandwidth - 89%
Averaging 108% in games. Taken at face value, if an 8-RBE RV740 is able to do this then it means that TUs are dominating the performance of these games, and that the 32 TUs in HD4670 just didn't have enough bandwidth.

Taken from a different angle, if the RV740 has 8% average on the 4830 with +13% core clock and -11% memory clock (which will end up with even lower memory performance due to the higher latency of the GDDR5), it would mean that adding 8 ROPs to the RV740 would add absolutely no performance. I honestly can't believe that, not nearly.

Comparing bandwidth per RBE clock:
  • HD4670 - 5.3GB/s/RBE/clock
  • HD4830 - 6.3GB/s/RBE/clock
  • HD4750 8 RBEs - 9.8GB/s/RBE/clock
  • HD4750 16 RBEs - 4.9GB/s/RBE/clock
  • HD4850 - 6.4GB/s/RBE/clock
  • HD4870 - 9.6GB/s/RBE/clock
The two GDDR5 GPUs having 9.8 and 9.6GB/s/RBE/clock looks like quite a coincidence, doesn't it?

Well, yes, but I don't see where it leads us - I'm not good at knowing the significance of such ratios, I admit.
 
Considering the setup and that it's guru3d I guess it has some kind of acceptance from AMD. Maybe that's also why we don't get the exact core frequency, power consumption tests and die shot.
Regarding the power connector, it could be that it's just present on this engineering sample. 4830 is actually quite low power too, so this should be close to not needing it.
I would just think this is the 4770 (maybe except core clock) while the 4750 would be DDR3.
The consistency in performance compared to the 4830 points to 16 RBEs, otherwise it should lose in some cases (although it's not the broadest test suite).
 
:oops: That's incredible that such a board has "leaked", theoretically quite a coup.


Theoreticals in comparison with HD4850:
  • GFLOPs - 90%
  • Texture - 90%
  • Bandwidth - 80%
It's averaging 91% of the performance of HD4850. That's pretty amazing.

Jawed

Fixed. ;) I'm pretty sure the boards were running @ 700.
 
I did a comprehensive test with a 4830 and a 4850 clocked to 575/900, and the average difference was 5% without AA and 4% with 4xMSAA. Yes, the 4850 is bw limited, but not this much - if only the bandwidth was responsible for the complete lack of showing of the +25% ALUs and TEX units with 4xAA, I think at least the 0xAA case would be a bit more ahead. Besides, the 4870 would have more than 25-30% advantage, I believe.
Apart from bandwidth there's CPU, setup rate and fillrate that could be restricting the difference between HD4850/HD4830 to 4-5%.

I think with MSAA off HD4870 tends to be merely 20% faster than HD4850.

Taken from a different angle, if the RV740 has 8% average on the 4830 with +13% core clock and -11% memory clock (which will end up with even lower memory performance due to the higher latency of the GDDR5), it would mean that adding 8 ROPs to the RV740 would add absolutely no performance. I honestly can't believe that, not nearly.
Yeah, I'm doubtful too.

Well, yes, but I don't see where it leads us - I'm not good at knowing the significance of such ratios, I admit.
Well, with 16 RBEs and 800MHz GDDR5 "HD4750" will be really bandwidth limited in comparison with HD4850.

Truth be told, a mainstream card being bandwidth limited is really what you'd expect. I always thought it strange how things like HD2600XT had way way too much bandwidth, 11GB/s/RBE/clock.

Apart from anything else, the idea that RV740 has 16 RBEs marks a significant change in back end architecture: doubling the capability per 32-bit memory controller channel.

As it happens, there's a sort of precedent, as RV530 benefited from doubled Z rate whereas R520 and R580, both sorely in need of the same, didn't :cry:

http://forum.beyond3d.com/showthread.php?t=27037

Of course, I'm still clinging to the idea that RV790 is 40nm with vast numbers of ALUs and TUs - if RV740 can double the RBE count per MC, then why not RV790 too :p With super fast GDDR5 it would be monstrous, but very nicely balanced. Where's that :drool: smiley?

Jawed
 
Hmm, I stand corrected. G3D's 740 ES is 650MHz, so my corrections were essentially wrong. Probably another 5% to the wall.

(That doesn't make sense with the 900GFLOPs numbers, but gah, I'll take it as is.)

@Jawed
Any coarse predictions on the 1GB DDR3 part? I'm curious on how much fillrates contributed to 730's loss of grace wrt 3870.
 