AMD: R8xx Speculation

How soon will Nvidia respond with GT300 to the upcoming ATI RV870 lineup of GPUs?

  • Within 1 or 2 weeks

    Votes: 1 0.6%
  • Within a month

    Votes: 5 3.2%
  • Within a couple of months

    Votes: 28 18.1%
  • Very late this year

    Votes: 52 33.5%
  • Not until next year

    Votes: 69 44.5%

  • Total voters
    155
  • Poll closed.
Has anyone seen a die shot of RV870 yet?

I only found this: http://i608.photobucket.com/albums/tt167/misterbluntman/diapo00618fa9f21903558ac4f27fb650e7.jpg

http://www.hardforum.com/showpost.php?p=1034669251&postcount=45

Now the question is, is it fake? Because I can count a hell of a lot more than 1600 shader processors. On the other hand, when I compare the layout with the press release material from AMD, it kind of looks like it could be real ...

http://www.anandtech.com/video/showdoc.aspx?i=3643&p=5

Anyone?
 
Oh yeah, if by "press material" you mean the slide with Cypress and arrows pointing to Hemlock with 2 chips and down to Juniper, Redwood and Cedar: the "chips" in that slide aren't Evergreen chips. They look mostly like RV770, and someone said there are some indications that it's RV770 and some AMD CPU "blended" together.
 
What's puzzling me is that the 3DMark06 Perlin Noise has these inputs:

dcl_color0 v0.y
dcl_color1 v1.xy

So how is the Vantage version so different that it could be interpolation bottlenecked? Is the Vantage version doing a load of calculation in the vertex shaders (or geometry shaders) and then relying upon attribute interpolation as part of the generation of the noise? It sounds plausible and sounds like a good use for attribute interpolation, but it'd be nice to get some kind of confirmation...

Jawed

The 3DMark Vantage tests (GPU Particles, Perlin Noise, Parallax Occlusion Mapping) don't seem to be interpolation limited.

The 750MHz 4770 scores the same (within +3% to -3%) as the 625MHz 4850.

The only test where the 4850 is clearly worse is GPU Cloth (a vertex/geometry shading test), where the 4770 is 20% faster than the 4850.

I suspect this is either because the geometry set-up engine runs at a higher clock speed or because it is interpolation related.
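A quick sanity check on the numbers above, as a minimal sketch: it only compares the two core clocks quoted in the post and says nothing about which unit is actually the bottleneck. The function name is made up for illustration.

```python
# Core clocks quoted in the post above, in MHz.
HD4770_CLOCK_MHZ = 750
HD4850_CLOCK_MHZ = 625


def clock_advantage_percent(fast_mhz, slow_mhz):
    """Return how much faster fast_mhz is than slow_mhz, in percent."""
    return (fast_mhz / slow_mhz - 1.0) * 100.0


advantage = clock_advantage_percent(HD4770_CLOCK_MHZ, HD4850_CLOCK_MHZ)
print(f"4770 clock advantage over 4850: {advantage:.1f}%")
```

The clock ratio works out to 20%, the same as the GPU Cloth gap, which is at least consistent with that test being bound by something that scales with core clock rather than with the number of shader units.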

RV770 had 32 interpolators and 40 TUs.
How many interpolators does the 5870 have?
Is it per SIMD? (10+10)
Or is it per TU? (40+40)
 
If these are the specs (850MHz with 800SPs and 700MHz with 720SPs), I guess the 5750 1GB will be around 5%-15% faster than a 4850 1GB, and the 5770 will be around 4770 1GB speed (-10% up to +10%).
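For a rough feel of those specs, here is a minimal sketch of peak shader throughput, assuming each SP issues one multiply-add (2 FLOPs) per clock. The SP counts for the 4850 (800) and 4770 (640) are not given in this thread and are filled in as assumptions, and peak ALU rate ignores memory bandwidth, so treat this as a ceiling and not a benchmark prediction.

```python
def peak_gflops(stream_processors, clock_mhz):
    """Peak single-precision GFLOPS, assuming 1 multiply-add (2 FLOPs) per SP per clock."""
    return stream_processors * 2 * clock_mhz / 1000.0


# (stream processors, core clock in MHz)
parts = {
    "850MHz / 800SP part (from the post)": (800, 850),
    "700MHz / 720SP part (from the post)": (720, 700),
    "HD 4850 (800 SPs assumed, 625MHz)": (800, 625),
    "HD 4770 (640 SPs assumed, 750MHz)": (640, 750),
}

for name, (sps, mhz) in parts.items():
    print(f"{name}: {peak_gflops(sps, mhz):.0f} GFLOPS peak")
```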

For DX11 parts the prices are not bad...
 
RV770 had 8 interpolators, not 32.
RV870 isn't limited by interpolation until the shader core becomes 100% loaded(?)
 
Sorry, I thought there were 32 interpolators.

The rasteriser can generate 8 2x2 pixel quads and send up as many as 32 pixels per cycle down into the dispatcher.

Are you sure there are 8 interpolators?

EDIT:
Sorry, I just saw Dave's answer.
Dave, can you please tell us how many interpolators Cypress has?

Thanks
 
I think the answer is 0 or 320, depending on what you mean by "interpolators" :D

Not the answer I was looking for.

Yes, we read in the various reviews that the interpolation job has moved into the SIMD core part of the design.

Let me rephrase.

How many interpolations per clock can the RV790 do?

and

How many interpolations per clock can the 5870 do?

Unless you're suggesting that the 5870 can do up to 10X more interpolations per clock.

Is this the case?
 
As far as I understand, RV790 has 32 interpolators that are dedicated units, so it could perform up to 32 interpolations per clock.
Cypress moved interpolation into the shader core, so interpolations are now executed on the shader core ALUs. Knowing the exact, real number of interpolations per clock is impossible, because interpolation is now just another task to be scheduled on the SPs, so in practice it depends on the workload. I don't know the maximum per clock either, because I'd need to know exactly how it is executed. For example, for a linear interpolation you have to multiply the two values by their weights and add the results, then divide by the sum of the weights. The multiply-add and the add can probably be done by a 5-way SP in one or two cycles, but then there is the divide operation taking additional cycles (I don't know how many).
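To make the operation count concrete, here is a minimal sketch of the weighted interpolation as described in this post. It only illustrates the arithmetic (multiply-adds followed by one divide); it is not a claim about the instructions Cypress actually runs, and the function name is hypothetical.

```python
def weighted_interpolate(values, weights):
    """Weighted average of values: sum(v * w) / sum(w)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")

    accumulated = 0.0
    for value, weight in zip(values, weights):
        accumulated += value * weight  # one multiply-add per input value

    return accumulated / total_weight  # the single divide at the end


# Two endpoint values, weighted 3:1 toward the first one.
print(weighted_interpolate([0.0, 10.0], [3.0, 1.0]))
```

If the weights are already normalised to sum to 1, the final divide drops out and only the multiply-adds remain.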
 
Wouldn't they just interpolate as part of the pixel shader now, and not as a separately scheduled task? That would also raise VLIW utilization and therefore be almost free in some cases.
 