256-bit in the Midrange

Jawed said:
There are other dimensions to this. We have RV540 and RV560 - one of which is meant to be 8-1-3-1/2 (can't remember) - meaning that ~22GB/s (as per RV530) prolly won't be enough. Since they're imminent (3 months?), I wonder how ATI is going to equip the 8-1-3-1/2 part with fast-enough memory. It might have to be with 1.1ns GDDR3.

So, is X1600XT running at the limits of 1400MHz memory or is there another wodge of performance to come? Being cynical, I think the former. Which leads me to expect that an 8-1-3-1/2 part requires mondo-expensive GDDR3 or GDDR4...

Jawed

The RV560 will have a 256 bit memory bus.
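
For reference, the bandwidth arithmetic behind those figures is easy to sanity-check. A quick sketch, taking the 1400MHz effective clock from the quote above; the 256-bit case is purely hypothetical:

```python
# Peak theoretical memory bandwidth: bus width (bits) / 8 * effective data rate.

def bandwidth_gb_s(bus_width_bits, effective_clock_mhz):
    """Peak bandwidth in GB/s."""
    return bus_width_bits / 8 * effective_clock_mhz / 1000

# X1600 XT-class part: 128-bit bus, ~1400MHz effective GDDR3 (clock from the quote above)
print(bandwidth_gb_s(128, 1400))   # ~22.4 GB/s, i.e. the "~22GB/s" figure

# A hypothetical 256-bit RV560 at the same memory clock would simply double that
print(bandwidth_gb_s(256, 1400))   # ~44.8 GB/s
```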
 
ANova said:
The RV560 will have a 256 bit memory bus.

A bold statement! I'd quote you in my sig, but it's kind of busy right now.

The good news: someone agrees with you!
The bad news: it's VR-Zone, and they haven't been right on anything ATI in about a year. http://www.vr-zone.com/?i=2932
 
Sxotty said:
I kind of think that nvidia actually has the better solutions in the low- to low/mid...
...and have done so for a while. They understand this micro-segment, one up from the bottom, very well. ATI has done well in the R9550 space, but margins there are very thin.
 
stevem said:
...and have done so for a while. They understand this micro-segment, one up from the bottom, very well. ATI has done well in the R9550 space, but margins there are very thin.

And clearly thin margins aren't something NV is interested in. :smile: They've been pretty clear that going after the margin-rich opportunities in any market segment is their thing, and you can't argue with their results of late.
 
The fundamental problem with 256-bit is that it carries a fixed cost, AND it requires a minimum die (and hence package) size. IMO, "$200-250 at introduction" parts could get a 256-bit bus if, and only if, they are built with good redundancy and sold as the "low" bins of the chip.
That would increase the die size, yet keep it appropriate for a $200-250 part (after all, those ARE failed chips). So the idea really is to add one or two extra quads/VS for the $300 part, then disable them for the $200 part, in order to keep high margins and a sufficient die size at the same time.

The only real problem with this strategy - I'll admit it - is that if the yields are good for the $300 board with this chip, you've got a problem for the $200 chip, since *chip* (and not per-dollar) demand for that market segment would be higher. So if 70% of your chips qualified for the $300 category and 25% for the $200 category (with the rest being totally borked), you'd be forced to use nearly half of that 70% to fill in the $200 demand, lowering your potential margins. That's a situation you really want to avoid.

Uttar
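
To make the allocation problem concrete, here is a tiny sketch using the hypothetical 70/25/5 yield split above and an assumed demand mix (55% of chips at the $200 tier, matching the figures used further down); all numbers are illustrative:

```python
# Illustrative chip allocation when yield and demand don't line up.
# The 70/25/5 split is the hypothetical above; the demand mix is an assumption.

chips = 10_000                          # dies from one production run (assumed)
yield_hi, yield_lo = 0.70, 0.25         # fraction fit for the $300 / $200 board
demand_lo = 0.55 * chips                # $200-tier demand in chips (assumed)

good_hi = yield_hi * chips              # 7,000 dies qualify for the $300 board
good_lo = yield_lo * chips              # 2,500 dies only qualify for the $200 board

# The shortfall in the $200 bin has to be filled by downbinning fully working dies
downbinned = max(0.0, demand_lo - good_lo)
print(downbinned, downbinned / good_hi) # 3000 dies, ~43% of the good ones ("nearly half")
```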
 
Uttar said:
The only real problem with this strategy - I'll admit it - is that if the yields are good for the $300 board with this chip, you've got a problem for the $200 chip, since *chip* (and not per-dollar) demand for that market segment would be higher. So if 70% of your chips qualified for the $300 category and 25% for the $200 category (with the rest being totally borked), you'd be forced to use nearly half of that 70% to fill in the $200 demand, lowering your potential margins. That's a situation you really want to avoid.

Uttar

Right. I've always assumed that's why we typically see those boards later in the life-cycle: the vendor has to build up a healthy inventory of them first, given the higher demand that's coming and the limited (you hope!) "yield" of such chips.
 
Uttar, I don't think that is necessarily accurate. There is nothing wrong with having better than expected yields and using the chips in a lower end product: it builds enthusiast support if they can be unlocked, and there is lots of hype around it; B3D's review of the GTO2 was even discussing it. Basically, it is not something a CEO would moan about. They might think "man, we could have gotten better margins", but they won't really be upset, as it is a good indicator, not a bad one.

Second, they can always make another SKU if the yields are too good, like NVIDIA did with the 6200 AGP: that was originally the NV43 GPU, but yields were too good, so they simply made another chip later to fill in the low-end 6200. Incidentally, they did the same for the 6800 as well.
 
Sxotty said:
there is nothing wrong with having better than expected yields and using the chips in a lower end product
Margins matter. ATI's margins right now are crap, so I doubt they'd care much. On the other hand, NVIDIA GPU margins are around 45-50% right now, and they want to keep them at that level or even increase them slightly. Their last 3-4 conference calls were basically screaming "MARGINS! MARGINS! MARGINS!".
If they sell a chip that cost them $20 for $40, when they could have sold it for $50, that means they could have had margins of 60%, yet only got margins of 50%. To reuse the numbers from my post above, assume a 70/25/5 distribution of yields and a 40/55/5 demand distribution (the 5 is there so I don't have to normalize everything in my calculations).
Then, for a $1M production run, you could have had gross revenue of $2.25M. Instead, to supply the midrange demand, they had to settle for gross revenue of $2.1M. In the first case, they'd have gotten 55.56% margins, while in reality they only managed 52.4%, so roughly 3 points less. This IS significant in this business, and real-world numbers would most likely be closer to 40% versus 50%.
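
For anyone who wants to check the arithmetic, a minimal sketch reproducing those figures; the $20 cost and $40/$50 prices are the hypotheticals above, not real numbers:

```python
# Reproducing the margin arithmetic above (all figures hypothetical).
cost_per_chip = 20                   # $ to build one chip
price_hi, price_lo = 50, 40          # selling price into the $300 / $200 boards

run_cost = 1_000_000
chips = run_cost / cost_per_chip     # 50,000 chips for the $1M run

def margin(mix_hi, mix_lo):
    """Revenue and gross margin for a given sales mix across the two tiers."""
    revenue = chips * (mix_hi * price_hi + mix_lo * price_lo)
    return revenue, (revenue - run_cost) / revenue

print(margin(0.70, 0.25))   # ($2.25M, ~55.6%) -- selling along the yield split
print(margin(0.40, 0.55))   # ($2.10M, ~52.4%) -- constrained by actual demand
```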
Second, they can always make another SKU if the yields are too good, like NVIDIA did with the 6200 AGP: that was originally the NV43 GPU, but yields were too good, so they simply made another chip later to fill in the low-end 6200. Incidentally, they did the same for the 6800 as well.
But then that new SKU would be too small to host a 256-bit bus. Try rereading my earlier post.

Uttar
 
With the direction GPUs are going (i.e. complex shaders, and of course non-traditional designs such as the X1600), is bandwidth really going to be such a deciding factor for the midrange? Obviously we'll always be bandwidth-bound to some degree, but I am wondering if it'll be the deciding factor in performance in the coming years, especially once we move to unified architectures.
 
Jawed said:
I think the escape route is pretty clear: GDDR4 - 128-bits will be able to provide 35GB/s+.
I suppose that's why they crammed in a cut-down ring bus MC: to scale with newer RAM while keeping the core and card as cheap as possible?

It'd seem that texture units are the limiting factor compared to the 6800/GS/GT, but I'm not sure how scaling the core with GDDR4 clocks helps, as you're left with the same ratio.

The real solution seems to be to drop the prices, which will probably only happen en masse once ATI releases its true midrange solution. I'm guessing that won't happen until R580; then they can have an X-1-3x architecture in three market segments (with RV515 bringing up the rear). Does that make it RV515 <$150, RV530 $150-250, RV560 $250-400, R580 $400+? MSRPs, of course.
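
As a sanity check on the 128-bit GDDR4 figure quoted above: 35GB/s on a 128-bit bus implies an effective data rate of roughly 2.2GHz, which is an assumption about where GDDR4 clocks would land. A quick way to back that out:

```python
# What effective data rate a 128-bit bus needs to reach ~35 GB/s (GDDR4 claim above)
target_gb_s = 35
bus_bytes = 128 / 8                       # bytes transferred per effective clock
print(target_gb_s / bus_bytes * 1000)     # ~2188 MHz effective, i.e. roughly 2.2GHz GDDR4
```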
 
ChrisRay said:
With the direction GPUs are going (i.e. complex shaders, and of course non-traditional designs such as the X1600), is bandwidth really going to be such a deciding factor for the midrange? Obviously we'll always be bandwidth-bound to some degree, but I am wondering if it'll be the deciding factor in performance in the coming years, especially once we move to unified architectures.
Depends a lot on what trends pick up, post-SM4.0 introduction.

Imagine deferred shading taking over the development market (that is, a la STALKER or Heavenly Sword, afaik). Performance characteristics would suddenly become quite different (192 bits or more per pixel, and that's just for the G-buffers). At the same time, it would imply a lot of consecutive arithmetic ops in the post-processing pass with few texture accesses (up to a certain extent, at least). The consequences would be quite drastic.
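
To put that 192-bit figure in perspective, here's a rough sketch of what merely writing the G-buffers costs per frame; the particular layout, resolution and framerate are assumptions for illustration, only the 192 bits/pixel total comes from the post above:

```python
# Rough cost of writing a 192 bits/pixel G-buffer once per frame.
# The split below is just one plausible layout; resolution and framerate are assumed.
gbuffer_bits = 32 + 64 + 64 + 32          # e.g. depth + two FP16-heavy targets + albedo = 192
width, height, fps = 1280, 1024, 60

bytes_per_frame = width * height * gbuffer_bits / 8
print(bytes_per_frame * fps / 1e9)        # ~1.9 GB/s, before overdraw or the lighting-pass reads
```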
Personally, I'm not a big fan of deferred shading at all. Easy realtime edge detection, so you can just find the edges and blur them instead of having to AA them, is cool, and a few of the bump-related tricks are cool, but besides that... heh

TBH, a lot depends on just how aggressive NVIDIA & ATI are when it comes to AA & AF in the coming years, especially so G80/R600-wise. If they try making 6-8x AA the standard, even with tons of FP16 rendertargets (lossy compression perhaps?), then that'd change everything too. Things like opening up programmability to a number of other units would be very interesting, too. I've got a few interesting ideas on what'd be possible with a TBDR and a very specific API with part of the deferring process programmable, but I'm a bit too lazy to go into that here :)


Uttar
 