AMD: R8xx Speculation

How soon will Nvidia respond with GT300 to the upcoming ATI RV870 lineup of GPUs?

  • Within 1 or 2 weeks

    Votes: 1 0.6%
  • Within a month

    Votes: 5 3.2%
  • Within a couple of months

    Votes: 28 18.1%
  • Very late this year

    Votes: 52 33.5%
  • Not until next year

    Votes: 69 44.5%

  • Total voters
    155
  • Poll closed.
no.. this is legit:

Code:
Cypress ~P16xxx - P17xxx - P18xxx
Juniper XT ~P95xx
Redwood ~P46xx

No idea on real Vantage scores, but according to the cheat sheet on ORB, Juniper is between a 4870 (~7.9K) and a 4890 (~10.7K), and Redwood is just shy of a 9800 GT (~5K).

So it seems Redwood is quite a bit faster than a 3870, so either Redwood is clocked high w/ 320 SPs or it has decent clocks and maybe 480 SPs?
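
To make the placement above easier to eyeball, here is a rough sketch that slots the leaked scores between the reference cards quoted in the post. The "xx" digits are unknown, so each leaked score is taken at the bottom of its range, which is purely an assumption, and the `bracket` helper is a made-up name for illustration.

```python
# Approximate 3DMark Vantage P-scores quoted in the post above.
reference_cards = {"9800 GT": 5000, "4870": 7900, "4890": 10700}

# Leaked figures such as "P95xx" are taken at the bottom of their range (assumption).
leaked = {"Redwood": 4600, "Juniper XT": 9500, "Cypress": 16000}


def bracket(score, refs):
    """Return (nearest card at or below, nearest card above); None if out of range."""
    below = [(s, name) for name, s in refs.items() if s <= score]
    above = [(s, name) for name, s in refs.items() if s > score]
    lower = max(below)[1] if below else None
    upper = min(above)[1] if above else None
    return lower, upper


for name, score in leaked.items():
    lower, upper = bracket(score, reference_cards)
    print(f"{name} (~P{score}): above {lower}, below {upper}")
```

With these numbers Juniper XT lands between the 4870 and the 4890, and Redwood sits just under the 9800 GT, matching the reading in the post.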
 
Re Redwood.

I posted a comparison of the power usage of the 3 ATI 128-bit 40nm mobile chips here. Roughly summarising that previous post from July:

RV740 mobile: 30-44W
Broadway (= Juniper mobile): 30-60W
Madison (= Redwood mobile): 15-30W

As the node is the same, and assuming they all use similarly clocked GDDR5 on a 128-bit interface, it looks like the max power of Redwood is around half that of Juniper.

This implies roughly half of Juniper's shaders, i.e. 800 / 2, so somewhere in the 320-480 SP range. Since there are common fixed components that use power in every design, it will most probably come in at 320 SPs.
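
A back-of-the-envelope version of that reasoning, as a sketch only: it assumes shader power scales linearly with SP count at equal clocks, that SPs come in blocks of 80 as in the RV770 SIMD layout, and it plugs in hypothetical guesses for the fixed (non-shader) power. None of those inputs are confirmed for these chips.

```python
# Max power figures quoted above for the mobile parts (watts).
JUNIPER_MAX_W = 60.0
REDWOOD_MAX_W = 30.0
JUNIPER_SPS = 800     # shader count used in the post
SPS_PER_SIMD = 80     # RV770-style granularity (16 lanes x 5 ALUs), an assumption here


def estimate_sps(fixed_w):
    """Estimate Redwood SPs if `fixed_w` watts go to shared fixed logic.

    Assumes the remaining power scales linearly with SP count,
    which is a simplification, not a measured property.
    """
    raw = JUNIPER_SPS * (REDWOOD_MAX_W - fixed_w) / (JUNIPER_MAX_W - fixed_w)
    # Round down to a whole number of SIMDs.
    return int(raw // SPS_PER_SIMD) * SPS_PER_SIMD


for fixed_w in (0.0, 5.0, 10.0):  # hypothetical fixed-power guesses
    print(f"fixed {fixed_w:4.1f} W -> ~{estimate_sps(fixed_w)} SPs")
```

With zero fixed power the estimate is a straight 400 SPs; any non-trivial fixed overhead pushes it down to the next SIMD step, which is why 320 looks more likely than 480 under this model.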
 
Thanks for the info.
I remember reading/glancing over the original post, I guess it just didn't sink in.

I still think it could go either way. Redwood seems about 30% faster than a 3870, which matches the IPC gain from the architectural changes implied in the most recent "specification leak", though that figure could also be interpreted as power savings due to a smaller node.
So whether the 30% increase in performance comes from more efficient units or from more units, we will see a small, efficient design. Two years ago this was a high-end GPU, and now it is a notch above IGPs.
 
Yeah sorry - had a lot of info to try and fit in one post and didn't want to make it too long, so it ended up not being as obvious as it might otherwise have been.

Nobody followed up either, so I figured people weren't interested :cry:

Still, compare it to the RV740m, which has a min power of 30W: on the same node, and probably with the same memory, Redwood/Madison is half that at 15W. I.e. there is no way that Redwood/Madison has 640 shaders like the RV740 does.

It's likely to get separation via memory config. Compared with Juniper, which looks like it needs GDDR5, this part looks more tuned to maximise performance per watt when matched with DDR3.
 
Picture is here.
It's just a shot of the Cypress cooler,
but there were 8 memory pads (1 dropped).
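Measured it!

Using the cooler's mounting-hole distance as the reference:

RV870 die area = 17.8 * 17.2 = 306.2mm^2 (the units in the picture are wrong)

Allowing for measurement error, the estimate is between 305 and 320mm^2.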
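
For anyone who wants to redo the arithmetic, a small sketch follows. The two side lengths are the ones from the post; the +/-0.2 mm per-side error is a number picked purely for illustration, so the resulting band is not expected to reproduce the poster's 305-320 range.

```python
# Side lengths in mm, as measured from the photo in the post above.
width_mm, height_mm = 17.8, 17.2

area = width_mm * height_mm
print(f"nominal area: {area:.1f} mm^2")


def area_range(w, h, err_mm):
    """Area bounds if each side could be off by +/- err_mm (hypothetical error)."""
    return (w - err_mm) * (h - err_mm), (w + err_mm) * (h + err_mm)


low, high = area_range(width_mm, height_mm, 0.2)  # 0.2 mm is an assumed error
print(f"with +/-0.2 mm per side: {low:.1f} to {high:.1f} mm^2")
```

Even a small error on each side moves the area by several mm^2, so estimates from photos that differ by 10 mm^2 or so are not really in conflict.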
 

Truth be told, no, I haven't. But the link you posted here is from like... yesterday, and yes, I have read the past 20 pages or so, which are the more juicy ones! :D

What I said, and allow me to say again, is that there is an obvious performance gap between the 350mm2 Cypress and the 180mm2 Juniper.

That will result in a big performance difference between them as well. I do believe that ATI should release a bomb product in the $200 price range, and I don't believe that Juniper can supply that. What is the point of replacing, more than one year later, the $200 4850 (talking about launch prices) with a $200 product that will perform roughly 25% better?
 
So.. what will RV840 do? There are no holes, just cheese and Cheddar.
 
RV870 is an original dual-core card.

I think the proper marketing term doesn't translate well from English to Chinese, hence the confusion about "dual-core." In the RV770 design one "core" consists of just 80 stream processors.

Theo Valich already mentioned the proper marketing word for this "Dual-Core Card", but since it was his plagiarism article, he deleted the most relevant part and thus the only piece of true value.

Theo's original article can be found here.
 
"Dual-shader" might mean that one cluster contains two 5-way SIMDs, both of which share a quad-TU. This provides two options:
  1. each 5-way SIMD is 8 strands wide, i.e. a thread is 32 wide - this reduces branch incoherence penalties substantially
  2. each 5-way SIMD is 16 strands wide (just like now) but the ALU:TEX is doubled to 8:1 - this leads to a massive increase in compute density as the TUs currently cost ~29% of a cluster's area, and this would reduce that penalty to a mere ~17%. Put another way that would double per cluster compute for 71% more area.
Jawed
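
The area figures in option 2 can be checked with a few lines. This is only a sketch of the arithmetic: it assumes everything in a cluster that is not texture unit scales with the ALUs, and it uses the 4-clocks-per-instruction issue pattern of current ATI hardware (16 strands x 4 = 64-wide thread) to get the thread widths for both options.

```python
TU_SHARE = 0.29  # texture units' share of current cluster area, from the post


def doubled_alu_cluster(tu_share):
    """Cluster with twice the ALU area sharing the same texture unit.

    Areas are relative to the current cluster (= 1.0). Assumes all
    non-TU area scales with the ALUs, which is a simplification.
    """
    alu_share = 1.0 - tu_share
    new_area = 2 * alu_share + tu_share
    return new_area, tu_share / new_area


new_area, new_tu_share = doubled_alu_cluster(TU_SHARE)
print(f"area: +{(new_area - 1) * 100:.0f}%, TU share: {new_tu_share * 100:.0f}%")

# Thread width = strands per SIMD x clocks per instruction.
CLOCKS_PER_INSTRUCTION = 4  # as on current ATI hardware
for strands in (16, 8):
    print(f"{strands} strands -> thread width {strands * CLOCKS_PER_INSTRUCTION}")
```

That gives 1.71x the area for 2x the ALUs, with the TU share falling from 29% to about 17%, in line with the figures in the post.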
 
Wow, you got me all confused with your terminology here. Using this terminology, how would you describe NV hardware? A 1-way SIMD that is 8 strands wide?
 
For what it's worth I'm getting 334mm² including packaging, leading to an estimated die size of 316mm².

Jawed
 