AMD: R8xx Speculation

How soon will Nvidia respond with GT300 to the upcoming ATI RV870 lineup of GPUs?

  • Within 1 or 2 weeks
    Votes: 1 (0.6%)
  • Within a month
    Votes: 5 (3.2%)
  • Within a couple of months
    Votes: 28 (18.1%)
  • Very late this year
    Votes: 52 (33.5%)
  • Not until next year
    Votes: 69 (44.5%)

  • Total voters: 155
  • Poll closed.
Both of them are salvage parts. Think of it this way: if the GTX285 gets pushed down in price, how far will their prices have to fall?

They're still 450 mm^2 parts, and sold in much higher volumes than the GTX285, I suppose.
Since the price difference of the GTX285 was not THAT large, I wonder how much the GTX285 compensated for the low prices on the rest.

Clearly the HD5850 is the star of this generation though, not the HD5870. The whole GTX200-series has just been sent to retirement.
 
The pipeline is only virtual in this sense on Larrabee (anything else?). It's a real, single-pixel-shader-at-a-time pipeline on current ATI GPUs, excepting the virtualisation required to share the unified shaders amongst VS/GS/PS (and others).
I now know that up to 8 states are supported concurrently, e.g. SQ:SQ_PGM_RESOURCES_PS defines register, stack etc. with SQ:SQ_PGM_CF_OFFSET_PS defining program address. The full set of shader types is supported across these 8 states.

It's not clear how the hardware protects against out-of-order Z updates from two or more states. Perhaps they delay issue of the final export instruction(s) from states that are newer than the oldest state.

I'm unclear if these states can be mapped to separate "contexts", in the sense that, for example, 2 games could run in parallel with absolutely distinct resources.

Jawed
 
At a hardware level, Larrabee is not as latency tolerant as a GPU.
At a software level, a long-enough strand appears able to cover texture latency in a single-chip case.
A texture read or remote fetch would be even longer than that, though what that costs I am not sure.
The fiber would have to be compiled to be longer, which may have some similar penalties to increasing GPU batch size.
I'm not sure what practical limits there are to fiber length.
Agreed (threads need more fibres to hide longer latencies induced by multiple chips).
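
A rough way to picture that trade-off, as a sketch only: a thread needs enough fibres in flight that the ALU work of the others covers one fibre's fetch latency. The function name and all the cycle counts below are illustrative assumptions, not Larrabee figures.

```python
import math

def fibres_needed(fetch_latency_cycles: int, alu_cycles_per_fetch: int) -> int:
    """Fibres a thread must rotate through so that one fibre's fetch is hidden.

    While one fibre waits on its fetch, each of the other fibres runs
    alu_cycles_per_fetch cycles of ALU work before issuing its own fetch.
    """
    if alu_cycles_per_fetch <= 0:
        raise ValueError("alu_cycles_per_fetch must be positive")
    return 1 + math.ceil(fetch_latency_cycles / alu_cycles_per_fetch)

# Hypothetical numbers: a local texture fetch versus a fetch that crosses chips.
single_chip = fibres_needed(200, 25)   # 1 + 8 fibres
multi_chip = fibres_needed(600, 25)    # 1 + 24 fibres
print(single_chip, multi_chip)
```

Tripling the latency roughly triples the fibre count, and with it the register and cache footprint per thread, which is where the comparison with a larger GPU batch size comes from.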

I know modern chips have a ton of monitors, though I'd be curious how many a core based on the P55 will sport.
Larrabee should run shader compilation on-chip, so maybe it can make adjustments.
I was thinking of software-monitors, e.g. percentage of time that back-end is idle because too many triangles are overlapping each other and the scheduler is scoreboarding to avoid incorrect write ordering.

I suppose you could argue there's years of work in developing performance-based re-JITting for the execution of a standard D3D pipeline.

The next question is when Larrabee is small enough and cool enough for such a setup.
The chip in the die shot certainly doesn't look like a promising candidate, but perhaps at a later node.
The two large ICs on a card scheme GPU makers have been using may be a fluke.
Even if Larrabee does sport better scaling, the market that would benefit from this is already very niche.
Yeah, I get a sense that Intel's fighting off GPUs with Larrabee as discrete until the market for discrete GPUs is so small that it can't support discrete GPUs - at which point CPU rendering is left standing - with some degree of Larrabee-cation integrated. Though in theory by then all consumer CPUs will only cost about $20.

Yeah, that was pretty underwhelming.
Does this help much for tessellation?
The peak triangle throughput numbers don't change.
We need an evaluation of just how setup-limited game graphics are now, or are with tessellation turned on.

Jawed
 
How does any of that have any bearing on the marketing?
I don't think a technical description of a new chip is 100% marketing. I know you like to scowl around as the most pragmatic cynic of them all, but I think that's reaching.

That's not to say that AMD hasn't indulged in cynical marketing by drawing two rasterisers, merely that I don't think that's a reasonable starting point. Pity none of the reviewers seems bothered enough to ask, oh well.

Jawed
 
It's not just marketing slides at fault.
Anandtech stated AMD added a rasterizer, though the fact that it didn't elaborate on this might mean they just took the slide at face value.

Techreport's article included a description of an exchange of information with someone at AMD that reinforced the impression of there being two rasterizers: that the front end had been modified and that load-balancing was being done. If it's just one rasterizer, well duh, it's hard to require balancing with only one thing to balance.
That's something more damning than a really misleading slide (why two Hierarchical Z blocks?).
 
I don't think a technical description of a new chip is 100% marketing. I know you like to scowl around as the most pragmatic cynic of them all, but I think that's reaching.

That's not to say that AMD hasn't indulged in cynical marketing by drawing two rasterisers, merely that I don't think that's a reasonable starting point. Pity none of the reviewers seems bothered enough to ask, oh well.

Jawed

The starting point is measured throughput which puts it at 1 Tri/clock. Not sure why you consider it reaching to ask why they are promoting doubled rasterizers when it's something that hasn't been marketed in that way before. It's a valid question, no cynicism involved.

People have tested this and asked questions. No answers forthcoming so far.
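
For reference, the per-clock figure is just the measured triangle rate divided by the core clock. A minimal sketch, assuming the HD 5870's reference core clock of 850 MHz; the measured value below is a placeholder, not a number taken from any review.

```python
def tris_per_clock(measured_tris_per_sec: float, core_clock_hz: float) -> float:
    """Convert a measured triangle rate into triangles per core clock."""
    if core_clock_hz <= 0:
        raise ValueError("core_clock_hz must be positive")
    return measured_tris_per_sec / core_clock_hz

HD5870_CLOCK_HZ = 850e6            # reference core clock
placeholder_measurement = 830e6    # hypothetical result of a triangle-rate test

estimate = tris_per_clock(placeholder_measurement, HD5870_CLOCK_HZ)
print(f"{estimate:.2f} tris/clock")
```

A result that sits just under 1.0 no matter how the test is arranged is what points to a 1 tri/clock limit somewhere in the front end.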
 
Clearly the HD5850 is the star of this generation though, not the HD5870. The whole GTX200-series has just been sent to retirement.
Now imagine if AMD decided not to skip the single GPU high end last gen. Everyone thought it was a brilliant move on AMD's part, but they were successful because the architecture was amazing, not because they limited themselves to the <$300 market.

Same architecture as 4770, same RAM and bus width, 32 ROPs, 1280 SPs would have initiated this retirement a year ago.
 
I wonder though... Would it have been feasible to have such a large chip on 55 nm? The 4870 was quite a hothead. I'm not sure if you could make that architecture much larger on a 55 nm process (while maintaining the same clockspeeds/performance).
40 nm wasn't an option back when the 4870 came out. They could have done a high-end part when 4770 came out, but that would have taken away resources from the 5800... so no, I don't think AMD could have delivered this level of performance at an earlier stage, realistically.
 
Yeah, they didn't have the TDP headroom on 55nm to go much higher. People forget that because everything else was so good (die size, price etc) but power consumption was right up there with the big boys.
 
The starting point is measured throughput which puts it at 1 Tri/clock.
That doesn't tell us how many rasterisers there are. Unlike you I think that's an interesting subject. You seem to think it's just marketing.

Not sure why you consider it reaching to ask why they are promoting doubled rasterizers when it's something that hasn't been marketed in that way before. It's a valid question, no cynicism involved.
AMD's press pack has multiple slides that mention two rasterisers, for what it's worth. Cynicism is describing the entire press pack as nothing but marketing.

People have tested this and asked questions. No answers forthcoming so far.
No tests of rasteriser configuration have been performed.

Jawed
 
Same architecture as 4770, same RAM and bus width, 32 ROPs, 1280 SPs would have initiated this retirement a year ago.
When would it have arrived? When would the rest of RV7xx have arrived?

Though I have to ask where's that 181mm² chip. Is AMD stretching out SKU launches for maximum marketing effect over 4-6 weeks? Or is that chip, whatever it's called, having trouble?

Jawed
 
Yeah, they didn't have the TDP headroom on 55nm to go much higher. People forget that because everything else was so good (die size, price etc) but power consumption was right up there with the big boys.
Also, it seems RV770 was lucky and worked well enough on first spin to launch. Power/heat for later RV770s was notably better - not sure if that's down to additional engineering by AMD or solely process maturity, though.

Jawed
 
That doesn't tell us how many rasterisers there are. Unlike you I think that's an interesting subject. You seem to think it's just marketing.

I think you misunderstand. Like you, I'm very interested to know whether the marketing slides are of significance or just marketing. As it stands they are marketing a typical increase in scan conversion throughput as "multiple rasterizers" whereas this has never been done in the past.

No tests of rasteriser configuration have been performed.

Au contraire :)

http://www.hardware.fr/articles/770-6/dossier-amd-radeon-hd-5870.html

tris.png
 
When would it have arrived? When would the rest of RV7xx have arrived?

Though I have to ask where's that 181mm² chip. Is AMD stretching out SKU launches for maximum marketing effect over 4-6 weeks? Or is that chip, whatever it's called, having trouble?

Jawed

Juniper is still coming, before GT215 (its competitor) and most likely GT300. Looks like Mobile has been pulled forward one whole quarter (got to make sacrifices somewhere).

On that note, it will be really interesting to see what nvidia will do with regard to pricing.
If they launch at the same price as Hemlock, they will get beaten.
If they launch a bit below Hemlock, they would not compete on price/performance versus a more mature RV870.
 
I think you misunderstand. Like you, I'm very interested to know whether the marketing slides are of significance or just marketing. As it stands they are marketing a typical increase in scan conversion throughput as "multiple rasterizers" whereas this has never been done in the past.
Perhaps, fan-out-, latency- and work-wise, it was simplest to just instantiate the old rasterizer twice, whereas in the past they redesigned it. That would be an artifact of implementation, irrelevant to performance, but if so the truth nonetheless.
 
I saw that too and it made me sigh a bit. I really want to see RV870 process multiple tris per clock, as it's going to give a major boost in many games. At the very least I was hoping it could cull backfacing and offscreen triangles at a much faster rate. If vertex coords are held in their own stream, there should be enough BW to cull over 25B tris per second for a perfect mesh, and the vertex shader power is definitely there. Even one tenth of that figure would be a vast improvement.
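
For what it's worth, here is one way a figure like 25B can fall out. It is a back-of-envelope sketch, not necessarily the derivation behind the number above. The assumptions are mine: the HD 5870's reference memory bandwidth of 153.6 GB/s, positions stored as three 32-bit floats in their own stream, and a perfect mesh averaging about half a vertex per triangle. Index traffic is ignored.

```python
def peak_cull_rate(bandwidth_bytes_per_sec: float,
                   bytes_per_position: float,
                   verts_per_tri: float) -> float:
    """Upper bound on triangles per second if position fetch is the only limit."""
    bytes_per_tri = bytes_per_position * verts_per_tri
    return bandwidth_bytes_per_sec / bytes_per_tri

BANDWIDTH = 153.6e9         # HD 5870 reference memory bandwidth, bytes/s
POSITION_BYTES = 3 * 4      # assumed: x, y, z as 32-bit floats
VERTS_PER_TRI = 0.5         # assumed: large closed mesh, about 2 tris per vertex

rate = peak_cull_rate(BANDWIDTH, POSITION_BYTES, VERTS_PER_TRI)
setup_rate = 850e6          # 1 tri/clock at the 850 MHz reference clock

print(f"bandwidth-bound cull rate: {rate / 1e9:.1f}B tris/s")
print(f"one tenth of that vs 1 tri/clock: {0.1 * rate / setup_rate:.1f}x")
```

Under these assumptions, even a tenth of the bandwidth-bound rate comes out at roughly three times a 1 tri/clock front end.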

If NVidia is able to crack this nut with GT300, it's going to have a big leg up on AMD. Intel seems to think it can do it with deferred rendering, but who knows if they can nail all the corner cases required for a real DX11 driver.
 
When would it have arrived? When would the rest of RV7xx have arrived?
After RV770. Keep that the main goal, but this hypothetical chip would negate the need for RV790. Winter 2008 would be a nice time for it.

It would have taken some sparkle off RV870, though. Maybe that's one of the reasons they avoided it.
 
I think you misunderstand. Like you, I'm very interested to know whether the marketing slides are of significance or just marketing.
Funny way of expressing it when you said "How does any of that have any bearing on the marketing?"

http://forum.beyond3d.com/showpost.php?p=1339132&postcount=3797

As it stands they are marketing a typical increase in scan conversion throughput as "multiple rasterizers" whereas this has never been done in the past.
But if it's true that there's two of them then that is technically interesting. See, you keep saying it's nothing but a marketing detail.

What does triangle rate have to do with rasterisation rate or the count of rasterisation units?

Jawed
 
Perhaps the count of ROP/RBEs isn't so important but the number of rasterization engines could increase triangle setup rate, if anyone could ever figure out a method for parallelization.
 