AMD RV770 refresh -> RV790

Interesting. Wonder if it was reworked mostly for incremental speed increases, size, power consumption, or all of the above?
What's a refresh, after all? :LOL:

Since the sideport wasn't being used, I wonder if they either removed it entirely to save space, or fixed it to work better?
It would be funny if deleting it was the entire "architectural change" :LOL:

I wonder if they worked any other miracles with density and transistor packing in other areas of the chip similar to what they achieved with the ALUs previously?
Seems pretty unlikely to me - since that was a primary motivation for RV7xx.

Then again... Just how much were they willing to rejig things just for a (so far rumored) relatively modest speed increase? Especially with DX11 chips presumably coming a few months after it?
Whereas I'm thinking, just how far from D3D11-capable is RV770? Seriously, it gives the appearance of being very close.


To transform RV770 into D3D11-capable seems to require:
  • control processor and sequencer changes to cater for the domain shader and hull shader, specifically in buffering and scoreboarding
  • some texture-filtering quality improvements?
  • 16Kx16K textures
  • some tweaks to LDS/GDS (currently don't think these are required - but that's a guess based on them being enough for OpenCL and Compute Shader seeming to require no extra functionality over OpenCL in this regard)?
Per my signature, I like to think that RV790 to RV870 is much like R580 to R600. R600 was a functional change more than it was a performance change (though obviously certain things did get faster, e.g. ALU throughput, fp16 texture filtering and some RBE operations).

So I would like to think that any architectural tweaks in RV790 will be retained for RV870 and that otherwise they'll have much the same performance but RV870 will just have a few tweaks to enable D3D11-specific features to perform reasonably.

That's not to say that RV870 won't be more efficient than RV770 in the way it does certain things they can both do. Perhaps some of those improvements come in RV790.

Maybe I should add "happy to be horribly wrong" to my signature :p

Jawed
 
Maybe AMD is preparing to tackle the high-end market (>$299) with this chip, since everyone knows that RV770 gets uncomfortably close to full-fledged GT200 performance in various situations, especially with 8xMSAA.
 
One thing it would make sense to rejig is the stuff where there are 8* function blocks but 10* SIMDs in RV770.
Also the cache bandwidth is scaled for 8* SIMDs.

That was discussed back around the launch, the bulk of the chip having been designed for 8* SIMDs but they wound up having spare room on the chip so dumped in the extra 2* SIMDs quite late.

So I'd be expecting a multiple of 4 number of SIMDs (12/16) & with the various function blocks & bandwidths scaled appropriately.

Or heck, why not a 2 chip MCM RV740? :devilish:
 
Whereas I'm thinking, just how far from D3D11-capable is RV770? Seriously, it gives the appearance of being very close.


To transform RV770 into D3D11-capable seems to require:
  • control processor and sequencer changes to cater for the domain shader and hull shader, specifically in buffering and scoreboarding
  • some texture-filtering quality improvements?
  • 16Kx16K textures
  • some tweaks to LDS/GDS (currently don't think these are required - but that's a guess based on them being enough for OpenCL and Compute Shader seeming to require no extra functionality over OpenCL in this regard)?
So no upgrade for Tessellation needed and the already-in-there-unit is up to speed for DX11?
 
So a reworked chip and card with a new cooler. Piecing together the various rumors.

Interesting. Wonder if it was reworked mostly for incremental speed increases, size, power consumption, or all of the above?

Since the sideport wasn't being used, I wonder if they either removed it entirely to save space, or fixed it to work better?

I wonder if they worked any other miracles with density and transistor packing in other areas of the chip similar to what they achieved with the ALUs previously?

Then again... Just how much were they willing to rejig things just for a (so far rumored) relatively modest speed increase? Especially with DX11 chips presumably coming a few months after it?

Regards,
SB

And what if AMD finally used Fast14? It could explain a substantial clock increase without modifying the basic structure of RV770, while decreasing consumption. But then rumors about higher voltages would be wrong... :?:
 
Yeah, also wanted to say that. Anand said that ATI's tessellation unit implementation is not DX11 compliant.
 
So no upgrade for Tessellation needed and the already-in-there-unit is up to speed for DX11?
The tessellator is a "fixed-function" unit that I guess doesn't need any significant functional changes (though it may need an increase in per-clock capability?). Its job is to generate output vertices from a set of input vertices + tessellation factors. That functionality doesn't change in D3D11 as far as I can tell.

Tessellation, the all-encompassing concept as opposed to the pipeline stage, in RV7xx and earlier GPUs is not D3D11 compliant because there's no hull and domain shader support in hardware.

RV7xx and earlier GPUs "kludge" vertex shaders to perform hull and domain shading.

In the most complex case the output from the "hull shader" (implemented as a VS) needs to be fed into video memory (using stream out or render to vertex buffer, I think - the latter being a kludge for DX9) to then be consumed by the tessellator on a second pass which can then feed into "domain shading" (implemented as a VS).

http://developer.amd.com/gpu_assets/Real-Time_Tessellation_on_GPU.pdf

Generally speaking the topics surrounding tessellation make my head hurt, so other opinions are useful.

As far as I can tell, the task of amplification performed by the fixed-function tessellator in RV7xx and earlier GPUs is forward compatible. It's the kludging associated with DX9, 10 and 10.1 that causes the grief, centred on getting data into the tessellator and using its output.

---

The generic task of tessellation clearly requires support for hull and domain shaders, which means extra data formats and extra buffering. Since VS->HS->TS->DS is meant to be single-pass this also means that the shading hardware needs to support HS and DS types of shaders which currently aren't in there.

Jawed
 
So a 10% improvement on the memory. If the clocks go up by 100 MHz (750->850 instead of 950) you'd be looking at an overall improvement of ~20% over RV770 without any other optimizations. This is indeed very much into GTX285 territory.

Wrong.

In order to get exactly 10% performance increase in "all situations", you'll have to increase performance of BOTH memory and core by 10%.

When the memory performance is increased by 10% and core performance by 11.5%, the overall performance increase will be something between 10% and 11.5%, depending on which was the bottleneck.
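
To put rough numbers on that, here is a minimal sketch of a toy bottleneck model. It assumes frame time splits cleanly into a core-bound part and a memory-bound part that don't overlap, which is a simplification and not how a real GPU behaves. The function name `overall_speedup` and the `core_fraction` parameter are made up for illustration.

```python
def overall_speedup(core_fraction, core_speedup, mem_speedup):
    """Toy model: frame time = core-bound part + memory-bound part.

    core_fraction is the (assumed) share of the original frame time
    that is limited by the core; the rest is limited by memory.
    """
    if not 0.0 <= core_fraction <= 1.0:
        raise ValueError("core_fraction must be between 0 and 1")
    mem_fraction = 1.0 - core_fraction
    # Each part shrinks by its own speedup factor.
    new_time = core_fraction / core_speedup + mem_fraction / mem_speedup
    return 1.0 / new_time


if __name__ == "__main__":
    # Memory +10%, core +11.5%, as in the post above.
    for frac in (0.0, 0.25, 0.5, 0.75, 1.0):
        gain = overall_speedup(frac, core_speedup=1.115, mem_speedup=1.10)
        print(f"core-bound share {frac:.2f}: overall x{gain:.4f}")
```

Under this model a fully memory-bound workload gains exactly the memory increase, a fully core-bound one gains exactly the core increase, and everything else lands in between. It never exceeds the larger of the two increases.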
 
Wrong.

In order to get exactly 10% performance increase in "all situations", you'll have to increase performance of BOTH memory and core by 10%.

When the memory performance is increased by 10% and core performance by 11.5%, the overall performance increase will be something between 10% and 11.5%, depending on which was the bottleneck.

Well, it's not so easy... Increasing clocks or SPs, and bandwidth, rarely brings linear performance increases... It usually depends on the architecture and on bottlenecks... But I would not be so confident in a real 10% performance increase from only increasing clocks by 10%. It simply doesn't work like that. ;)
 
Hmmm, so it's sounding almost like this is AMD's version of a 9800 GTX -> 9800 GTX+ move. Just increased clock speed and nothing else?

It seems weird if they bump the model number from RV770 to RV790 with no functional chip change. If it was just 4870 -> 4890 while keeping the chip name RV770, that would seem to make more sense for just a clock increase with no functional changes to the chip.

It makes me think we're still not getting the whole story.

Regards,
SB
 
The generally accepted rumour is that ATI will use a better version of the 55nm process for RV790, hence the name change. It's like an optical shrink, a minor revision of the chip is all that's needed, except now there is no shrink. But still better characteristics.

I'm wondering, years ago I heard about ATI using UMC as a second source for their chips. Supposedly an "RV516" along with other similarly "+1" codenamed chips were manufactured there. But were there ever cards on the market where the core markings would really read "RV516"?
 
That picture is from here:

http://www.bit-tech.net/news/hardware/2008/06/05/amd-hd-4850-rv770-gpu-die-shot/1

EDIT: the full version of the picture is captioned as coming from Bit-tech.

Jawed
Actually, that picture is quite meaningless and only illustrates an RV770 chip. In fact, the chip is quite old, being from the 8th week of 2008. ;)

Sorry for the confusion - the text is more important.


Hmmm, so it's sounding almost like this is AMD's version of a 9800 GTX -> 9800 GTX+ move. Just increased clock speed and nothing else?

It seems weird if they bump the model number from RV770 to RV790 with no functional chip change. If it was just 4870 -> 4890 while keeping the chip name RV770, that would seem to make more sense for just a clock increase with no functional changes to the chip.

It makes me think we're still not getting the whole story.
Already forgot about R420 -> R481? Same chip, same functions, same process node, except for low-k which was hip right then and brought about … 20 MHz increase for core frequency (and a loud dual-slot cooler).
 
Don't forget...

R300 -> R350 -> R360
R420 -> R423 (PCIe version) -> R480 -> R481 (AGP Version)
RV350 -> RV360 (with low-k) -> RV370 (110nm version) -> RV380 (PCIe version)
 
Don't forget...

R300 -> R350 -> R360
R420 -> R423 (PCIe version) -> R480 -> R481 (AGP Version)
RV350 -> RV360 (with low-k) -> RV370 (110nm version) -> RV380 (PCIe version)

Me wants!

I'm thinking Fuad's source told him "no it's not -just- RV770 with higher clocks it's a new chip". Fuad didn't take into account this new chip could be -just- a variation of RV770 capable of higher clocks.
 
Don't forget...

R300 -> R350 -> R360
R420 -> R423 (PCIe version) -> R480 -> R481 (AGP Version)
RV350 -> RV360 (with low-k) -> RV370 (110nm version) -> RV380 (PCIe version)

I know - but what's the difference? Two chips, only a very minor production detail apart, which has no impact on the perceived value in the eyes of the customer.

Again, Nvidia is no better than ATI is no better than Nvidia.
 
On the Nvidia side:
NV25 -> NV28
NV17 -> NV18
NV35 -> NV38
NV40 -> NV45 -> NV48 (all NV40 as far as I know).

We also have the ATI RV280 and R580+.
 