AMD: R8xx Speculation

How soon will Nvidia respond with GT300 to the upcoming ATI RV870 lineup of GPUs?

  • Within 1 or 2 weeks: 1 vote (0.6%)
  • Within a month: 5 votes (3.2%)
  • Within a couple of months: 28 votes (18.1%)
  • Very late this year: 52 votes (33.5%)
  • Not until next year: 69 votes (44.5%)
  • Total voters: 155
  • Poll closed.

Maybe they should have called it 52XX series? :LOL:

:runaway:
Agree.

An HD5450 that is just a tad faster than the HD4350 (and a tad slower than the HD4550) just doesn't make sense; it should have been branded HD5350 at most.

Correct me if I'm wrong but...

- 5 denotes the product generation
- 4 denotes performance level (so, this should be higher than HD4350 since it's the case for any other EG chips with respect to their predecessors)
- 50 denotes a "mid range" product (in its segment)

BTW, I think these 64-bit parts don't make much sense with (quite strong) IGPs integrated in the value CPUs for both AMD and Intel.
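
The naming breakdown above can be written down as a small decoder. This is only a sketch of the poster's reading of the scheme (generation digit, performance digit, two-digit variant), not an official AMD definition; `RadeonModel` and `decode_model` are hypothetical names made up for illustration.

```python
import re
from typing import NamedTuple


class RadeonModel(NamedTuple):
    generation: int  # first digit, e.g. 5 for the HD 5000 series
    tier: int        # second digit, performance level within the generation
    variant: int     # last two digits, position within the segment


def decode_model(name: str) -> RadeonModel:
    """Split a name like 'HD5450' into (generation, tier, variant)."""
    match = re.fullmatch(r"HD\s?(\d)(\d)(\d{2})", name.strip().upper())
    if match is None:
        raise ValueError(f"unrecognised model name: {name!r}")
    gen, tier, variant = match.groups()
    return RadeonModel(int(gen), int(tier), int(variant))


for name in ("HD4350", "HD4550", "HD5450"):
    print(name, decode_model(name))
```

Under this reading the HD5450 carries tier 4, one step above the HD4350's tier 3, which is exactly the mismatch with measured performance that the post complains about.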
 
hmm, 256 MB memory on 64-bit memory bus means two 32-bit 1 Gbit memory chips.
And 512 MB memory on 64-bit memory bus means two 32-bit 2-Gbit memory chips.
These cards use ordinary DDR3 or DDR2 RAM, which comes as either 16-bit or 8-bit devices. A 512 MB card would therefore use four 16-bit 1 Gbit chips, and a 1 GB version either eight 8-bit or eight 16-bit devices (the latter 2 in parallel).
2 Gbit chips are out anyway for these cards; they are too expensive for now.
Likewise, a 256 MB card would need 512 Mbit chips. That's not going to happen, I guess: 512 Mbit DDR3 chips don't even exist, and nobody really makes 512 Mbit DDR2 chips any more, so they are not going to be cheaper (per chip) than the 1 Gbit ones afaik. Unless the chip can run in a 32-bit configuration.
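
The chip counts and densities in this exchange all follow from the same arithmetic: the bus width divided by the device width gives the number of chips per rank, and the capacity is split evenly across them. A minimal sketch, where `memory_config` and its `ranks` parameter are hypothetical names for illustration:

```python
def memory_config(capacity_mb: int, bus_bits: int, chip_bits: int, ranks: int = 1):
    """Return (chip_count, density_mbit) for a given board memory layout.

    ranks > 1 models several chips sharing the same data lines
    (the "2 in parallel" case mentioned in the post).
    """
    if bus_bits % chip_bits != 0:
        raise ValueError("bus width must be a multiple of the chip width")
    chip_count = (bus_bits // chip_bits) * ranks
    total_mbit = capacity_mb * 8  # megabytes -> megabits
    if total_mbit % chip_count != 0:
        raise ValueError("capacity does not divide evenly across chips")
    return chip_count, total_mbit // chip_count


# 64-bit bus throughout, as on the cards discussed here
print(memory_config(512, 64, 16))            # 16-bit devices, 512 MB
print(memory_config(256, 64, 16))            # 16-bit devices, 256 MB
print(memory_config(256, 64, 32))            # 32-bit devices, 256 MB
print(memory_config(1024, 64, 16, ranks=2))  # 16-bit devices, 2 in parallel
```

With 16-bit devices a 256 MB card comes out at four 512 Mbit chips, which is the part the post expects nobody to build; with 32-bit devices the same capacity needs only two 1 Gbit chips.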
 
Somewhat OT: Considering that LIano is supposed to have 240 ALUs, this chip is probably the second-to-last of its breed.

I doubt that, they would still need/want something low-end for Intel-based laptops/netbooks. Also a 28 nm chip with a 64bit interface might very well be a 400 alu part and be a nice step up from Liano.
 
I was under the impression that Llano (2 Ls, not LI) was a 400-SP part. Did I miss something?
We don't have any full die shot; it could very well be an 80/160 SP part (2 or 4 SIMDs with 8 or 16 Vec5 units each).

The die shots seen on AMD's papers don't tell us anything as they could be photoshopped and the only high res "full die shot" I've seen (from a "reputable" fakes source) showed some inconsistencies (missing blocks).
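
For reference, the SP counts in the post above come from multiplying SIMDs by Vec5 units per SIMD by the 5 lanes of each Vec5 unit. A quick sketch (`stream_processors` is a hypothetical helper name):

```python
LANES_PER_VEC5 = 5  # each Vec5 unit is counted as 5 stream processors


def stream_processors(simds: int, vec5_per_simd: int) -> int:
    """Total SP count for a given SIMD layout."""
    return simds * vec5_per_simd * LANES_PER_VEC5


for simds in (2, 4):
    for vec5 in (8, 16):
        print(f"{simds} SIMDs x {vec5} Vec5 -> {stream_processors(simds, vec5)} SPs")
```

Note that the 4 x 16 combination gives 320 SPs, above the 80/160 range the post names, so the 160 SP case is either 4 SIMDs of 8 units or 2 SIMDs of 16.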
 
I doubt that, they would still need/want something low-end for Intel-based laptops/netbooks. Also a 28 nm chip with a 64bit interface might very well be a 400 alu part and be a nice step up from Liano.

Intel netbooks all use IMG IP. Intel based laptops are definitely a market, but the desktop market definitely appears toast.

80 ALUs -> 400 ALUs => a 5x jump in one shrink? I don't think so, especially when you look at past trends. So for Cedar, no.

Redwood is more interesting. Maybe not immediately, but in 2-3 generations I expect AMD Fusion chips to cannibalize this market as well.

The <$100 market probably accounts for ~80% (anyone got better numbers?) of unit sales. AMD has a real opportunity here to gobble up this market. Let's just hope Llano is not delayed any further.
 
There will obviously be some movement from discrete budget cards to integrated on die/package. But it's not going away. Budget buyers are even less likely to upgrade their entire computer every 2 years.

Budget GPUs still make sense in the case of upgrading video capabilities there.

Regards,
SB
 
Intel netbooks all use IMG IP. Intel based laptops are definitely a market, but the desktop market definitely appears toast.

ION2 is a discrete part, and there are upcoming netbooks with ION2.

80 ALUs -> 400 ALUs => a 5x jump in one shrink? I don't think so, especially when you look at past trends. So for Cedar, no.

A 28 nm 64-bit chip should have around 500-600 million transistors. That is pretty close to Redwood, and as it would still be a DX11 chip, no (or very few) transistors are needed for new features. 320 might be more realistic than 400.
 
ION2 is a discrete part, and there are upcoming netbooks with ION2.
ION or ION2 isn't exactly an Intel platform, though. In the case of ION it's just an Intel CPU; ION2 gets closer by using an Intel CPU and NB, but it's still an nVidia part for separate gfx.
 
Oh, thanks for confirming. I wouldn't have expected that to make a ~15% difference, since it can do one 2-component interpolation per clock using all 4 simple ALUs in the Vec5 unit, if I got that right.
The problem is ALU-based interpolation creates a serialisation that doesn't exist on the older GPUs. So, for example, a trivial shader that paints a pixel with a texel is:

Code:
TEX
ALU

on older GPUs - but on Evergreen it's:

Code:
ALU
TEX
ALU

So, right out of nowhere the shader has gained latency (each ALU clause costs a minimum of 40 cycles that needs to be hidden by other hardware threads). Obviously this trivial shader is essentially fillrate limited, but you can see this is another variable beyond the mere question of extra ALU workload.
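
To put a number on the extra serialisation, here is a rough sketch that only counts the minimum ALU clause latency quoted above (40 cycles per clause). TEX latency and any overlap between hardware threads are deliberately left out, so this is a lower bound on the added latency and not a performance model; `min_alu_latency` is a hypothetical helper name.

```python
MIN_ALU_CLAUSE_CYCLES = 40  # minimum per-ALU-clause latency quoted in the post


def min_alu_latency(clauses):
    """Sum the minimum latency contributed by ALU clauses only."""
    return sum(MIN_ALU_CLAUSE_CYCLES for clause in clauses if clause == "ALU")


older = ["TEX", "ALU"]             # interpolation done by fixed-function hardware
evergreen = ["ALU", "TEX", "ALU"]  # interpolation done in an extra ALU clause

print("older GPUs:", min_alu_latency(older), "cycles")
print("Evergreen: ", min_alu_latency(evergreen), "cycles")
```

The extra leading ALU clause doubles the minimum ALU latency of this trivial shader from 40 to 80 cycles, all of which has to be hidden by other hardware threads.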

Another interesting dimension here is the scheduling of interpolation in the shader. Older GPUs reserved registers at the same time as interpolations were performed, so that the interpolated attributes (texture coordinates, etc.) were all computed and stuffed into registers ready to be used as soon as the shader instance launched. Now there is no need to reserve so many registers: instead interpolate as the values are required - potentially saving a significant count of allocated registers. The problem, though, is how good's the compiler? And with all the other Evergreen GPUs being 4:1 ALU:TEX but Cedar being 2:1, there's potential for a different compilation to provide a better trade-off.

Jawed
 
ION or ION2 isn't exactly an Intel platform, though. In the case of ION it's just an Intel CPU; ION2 gets closer by using an Intel CPU and NB, but it's still an nVidia part for separate gfx.

I think the point was that any of the discrete budget GPUs still absolutely toasts any integrated solution.

The GM45 on Clarkdale is now roughly equal to ATI's integrated, but both are far slower than the 5450, for example.

I'm not expecting future Fusion GPUs to be faster either, especially if they are having to use main memory.

Regards,
SB
 
Intel netbooks all use IMG IP. Intel based laptops are definitely a market, but the desktop market definitely appears toast.

80 ALUs -> 400 ALUs => a 5x jump in one shrink? I don't think so, especially when you look at past trends. So for Cedar, no.

Redwood is more interesting. Maybe not immediately, but in 2-3 generations I expect AMD Fusion chips to cannibalize this market as well.

The <$100 market probably accounts for ~80% (anyone got better numbers?) of unit sales. AMD has a real opportunity here to gobble up this market. Let's just hope Llano is not delayed any further.

With the same manufacturing tech and optimizations as the CPU, the on-die GPU could run at much higher clocks, maybe with separate clocks for the ALUs too. With a frequency over 1 GHz, even with fewer ALUs, Fusion could be fast enough.
 
Maybe they should have called it 52XX series? :LOL:

:runaway:

Found the quote I was looking for but couldn't find
http://www.tomshardware.co.uk/benchmarking,review-8027-13.html
[Image: dirty-marketing-ati.jpg]
 
We don't have any full die shot; it could very well be an 80/160 SP part (2 or 4 SIMDs with 8 or 16 Vec5 units each).

The die shots seen on AMD's papers don't tell us anything as they could be photoshopped and the only high res "full die shot" I've seen (from a "reputable" fakes source) showed some inconsistencies (missing blocks).


I don't think that Llano will have only 80 or 160 shaders. With so few shaders they would have to run them at ~3 GHz to reach the ~1 teraflop mentioned in the presentations:

http://www.anandtech.com/showdoc.aspx?i=3673&p=3

~1 teraflop could mean 400 shaders @ 1.25 GHz or 480 shaders @ 1.04 GHz or something else. But I doubt that they will use fewer than 320 shaders. Otherwise they would need ridiculously high frequencies to reach ~1 teraflop, and with them a lot of heat issues and high power consumption too.

Also, they mention ~1 billion transistors somewhere in the presentation. Therefore the GPU part of the APU should have roughly 600-700 million transistors.
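
The shader/clock pairs above can be checked with a few lines. This sketch assumes each shader retires one multiply-add (2 FLOPs) per clock, which is the usual way peak figures for these parts are quoted; the function names are hypothetical.

```python
FLOPS_PER_SP_PER_CLOCK = 2  # assumption: one multiply-add per shader per clock


def peak_gflops(shaders: int, clock_ghz: float) -> float:
    """Theoretical peak throughput in GFLOPS."""
    return shaders * FLOPS_PER_SP_PER_CLOCK * clock_ghz


def clock_for_target(shaders: int, target_gflops: float = 1000.0) -> float:
    """Clock in GHz needed to hit a target peak throughput."""
    return target_gflops / (shaders * FLOPS_PER_SP_PER_CLOCK)


for shaders, clock in ((400, 1.25), (480, 1.04)):
    print(f"{shaders} shaders @ {clock} GHz -> {peak_gflops(shaders, clock):.1f} GFLOPS")

for shaders in (80, 160, 320):
    print(f"{shaders} shaders need {clock_for_target(shaders):.3f} GHz for 1 TFLOP")
```

Under that assumption 160 shaders need a bit over 3 GHz and 80 shaders twice that, which is where the "~3 GHz" figure comes from, while 320 shaders would still need more than 1.5 GHz.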
 
That's a funny slide, albeit very true. I had to replace a dead GF4 ti 4200 with a POS FX 5200 and hated the damn thing. I eventually ended up replacing that with a Radeon AIW 9600 Pro. Fantastic card. Played everything out with high settings + AA & AF, and overclocked very nicely as well.
 
I don't think that Llano will have only 80 or 160 shaders. With so few shaders they would have to run them at ~3 GHz to reach the ~1 teraflop mentioned in the presentations:

http://www.anandtech.com/showdoc.aspx?i=3673&p=3

~1 teraflop could mean 400 shaders @ 1.25 GHz or 480 shaders @ 1.04 GHz or something else. But I doubt that they will use fewer than 320 shaders. Otherwise they would need ridiculously high frequencies to reach ~1 teraflop, and with them a lot of heat issues and high power consumption too.

Also, they mention ~1 billion transistors somewhere in the presentation. Therefore the GPU part of the APU should have roughly 600-700 million transistors.

I read this as a "2012 target", maybe for when graphics get integrated with the Bulldozer CPU core.

I see no reason to expect more than 80 or even 40 cores in Llano - there is no memory bandwidth available, so why bother?!
They'll need an IGP that's good enough for HTPC - Eyefinity, video acceleration, that's all.
 
There's 25.6 GB/s, which is four times the bandwidth of low-end cards.
It will be bandwidth starved, no doubt, but two channels of DDR3 are better than one channel of DDR2.

It was said to be both 480 SP and 240 SP; 240 SP would maybe be the realistic, bandwidth-bound choice.
 
I read this as a "2012 target", maybe for when graphics get integrated with the Bulldozer CPU core.

I see no reason to expect more than 80 or even 40 cores in Llano - there is no memory bandwidth available, so why bother?!
They'll need an IGP that's good enough for HTPC - Eyefinity, video acceleration, that's all.

Dual-channel DDR3-1600 is 25.6 GB/s theoretically. The Fusion chip will need to have the memory controller for both on the same die, unlike IGPs, so there will be much more room to optimize them. They could also increase the amount of GPU cache. Who knows, maybe the GPU will be able to use the L3 cache too in the future.
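
The 25.6 GB/s figure is just channels times bus width times transfer rate. A small sketch of that arithmetic follows; the low-end card used for comparison (a single 64-bit DDR2-800 interface) is an assumption picked to illustrate the "four times" remark earlier in the thread, not a figure given there, and `peak_bandwidth_gbs` is a hypothetical helper name.

```python
def peak_bandwidth_gbs(channels: int, bits_per_channel: int, mega_transfers: float) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = channels * bits_per_channel / 8
    return bytes_per_transfer * mega_transfers / 1000


dual_ddr3_1600 = peak_bandwidth_gbs(2, 64, 1600)
# Assumed low-end discrete card: one 64-bit channel of DDR2-800
low_end_ddr2_800 = peak_bandwidth_gbs(1, 64, 800)

print(f"dual-channel DDR3-1600: {dual_ddr3_1600:.1f} GB/s")
print(f"64-bit DDR2-800:        {low_end_ddr2_800:.1f} GB/s")
print(f"ratio:                  {dual_ddr3_1600 / low_end_ddr2_800:.1f}x")
```

Keep in mind that on a Fusion part this bandwidth is shared with the CPU cores, so the GPU would not see all of it.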
 