AMD: R8xx Speculation

How soon will Nvidia respond with GT300 to the upcoming ATI RV870 lineup of GPUs?

  • Within 1 or 2 weeks: 1 vote (0.6%)
  • Within a month: 5 votes (3.2%)
  • Within a couple of months: 28 votes (18.1%)
  • Very late this year: 52 votes (33.5%)
  • Not until next year: 69 votes (44.5%)

  Total voters: 155 (poll closed)
Something's going on at the Chinese forum Chiphell again.


1. GT300 is big, so yields are predictably unreasonable as of now (meh).
2. RV870 runs to 10+ USD in "testing fees"; the packaging/testing department isn't too keen on accepting the order list due to "complexity". Hmm?
3. RV830 is smooth sailing.
4. Nvidia and GloFo hanky-panky.

Item 2 is the main object of interest. I haven't heard of packaging complexities before; maybe they're underrated in the ever-ongoing discussion of die size and yields?

Or maybe... the 180mm^2 chip is RV830, and doing an MCM of two Shanghai-sized GPUs with a proper interconnect is quite a daunting task.

Anyone?


2. RV870 runs to 10+ USD in "testing fees"; the packaging/testing department isn't too keen on accepting the order list due to "complexity". ===> First dual-core GPU (RV830 X2)???
 
I am betting this time that RV870XT will be on par with GTS 360 in terms of cost/performance. :p

Don't get me wrong, but if I had taken your past bets I'd be a rich guy right now :LOL:

Seriously, the performance part of your estimate would be something to write home about; the cost part, not in the least. Especially with the current financial crisis (yeah, yeah, I know folks don't like hearing about it...).
 
Well, the first thing that comes to mind would be fast communication between cores (similar to dual- and quad-core CPUs, where the actual CPU is two physical cores in one package).
Whether this is better or worse than stuffing everything into one big core on the GPU side, I don't know; probably worse performance-wise, better yield-wise.
 
I wouldn't bet on them having pulled off the holy grail of multi-GPU computing. Having said that, if any company can do it, AMD can. (NV, at the moment, isn't interested in X2 solutions unless it's playing catch-up.) For example, what happens to the serialization choke points (input assembly, rasterization, and now tessellation)?

And yes, if the RV870 is a dual core, then what happens to the R800 (the 4870 X2's replacement)? A quad-core GPU? Or a dual-socket, dual-core GPU?
 

I would think that even for the performance market, a single GPU would be way more efficient. Hell, we have a hard time with dual GPUs even at large resolutions.
Either way, it will at least be an interesting generation.
 
No, I get my information from other sources. But of course I am in contact with Charlie, too.
Well, there was a lot of trouble around the RV870 launch. In the spring I got word that RV870 could launch this summer. Then TSMC's problems grew, and RV870 was pushed to September. Now it looks like it will really arrive in October.

Really?

-Charlie
 
AMD just released their DX11 white paper.
http://www.legitreviews.com/article/1001/1/

After reading that, is there still anyone who insists that Evergreen is just RV7x0 + DX11?

What would be the advantage of a native dual core over a non-native one?

They won't have an immigrant problem :LOL:

On a more serious note, if they've truly gone that route, I wouldn't think they've built something with serious drawbacks against a typical single core. I.e., I can't imagine a dual core on the same die employing something like AFR; the next best question would be how they've handled the interconnects and bandwidth considerations, but as a layman it's way out of my league to come up with something that makes even halfway sense.

How should I understand that: all cores below "RV870" are single core, "RV870" consists of two dies on the same package, and the X2 is twice the former? That doesn't make any sense to me, and in such a case each core would have to be around 120mm^2; otherwise, in a theoretical 2*"RV870" case (two cores on the same die, or else 2*2), the die-area advantage over the competition would be gone.

I figure that, if the above is true, the primary goal must have been to at least partially overcome the memory redundancy problem of multi-chip setups so far. Under ideal circumstances they wouldn't need AFR to address two cores, and the two cores would share the same memory. Whether that's even possible without hitting physical limitations I don't know, but it sounds like a very tall order to me.
 
I didn't get what was in that whitepaper that makes such a scenario improbable. I have no opinion either way; Evergreen may or may not be just RV770 + DX11.
 

The hardware changes sound way too extensive to me to suggest anything less than a serious revamp of important aspects of the architecture. Besides, do you think G80 would ever have won the first DX10 round if it hadn't been a USC?
 
I dunno, a bunch of those changes were written up as 'DX10 vs DX11' without mentioning the DX10.1 equivalent.
Others are things that were already in RV770, even if not supported in DX10.1.
 
The focus seems to be on Compute Shader and it seems to me an effort to deflect attention away from the noises NVidia's making about CS. NVidia seems to be ready to market CS4/CS4.1 as all that a developer needs (because all its GPUs since G80 work this way), so developers should focus on them, not CS5. And therefore D3D11 isn't relevant until NVidia says so. I wouldn't be surprised if NVidia's already well under way with this campaign.

Some of the restrictions in CS4.1 come from existing ATI cards, though, e.g. the private-write/shared-read model of shared memory (a thread can only write to a 16-256B region private to itself) and the lack of atomics; both of these seem like fairly serious restrictions. Private-write/shared-read isn't even a hardware restriction on ATI (R7xx, at least), yet Brook+ has the same restriction, and I can't work out the underlying logic of that (I can only think it's super-slow on ATI).
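To make that model concrete, here's a minimal CUDA sketch of private-write/shared-read (the kernel and its names are my own illustration, nothing from the CS4.1 spec): each thread writes only its own 16-byte slice of shared memory, then after a barrier any thread may read any slice.

```cuda
#include <cuda_runtime.h>

#define THREADS 64
#define SLOTS   4   // 4 floats = 16 bytes, the lower bound of the 16-256B region

// Launch with THREADS threads per block.
__global__ void privateWriteSharedRead(const float *in, float *out)
{
    __shared__ float smem[THREADS * SLOTS];
    int t = threadIdx.x;

    // Write phase: thread t touches only its private region [t*SLOTS, (t+1)*SLOTS).
    for (int i = 0; i < SLOTS; ++i)
        smem[t * SLOTS + i] = in[(blockIdx.x * THREADS + t) * SLOTS + i];

    __syncthreads();  // all private writes become visible here

    // Read phase: reads are unrestricted, e.g. sum the neighbouring thread's region.
    float sum = 0.0f;
    for (int i = 0; i < SLOTS; ++i)
        sum += smem[((t + 1) % THREADS) * SLOTS + i];

    out[blockIdx.x * THREADS + t] = sum;
}
```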

768 threads in a thread group is the basic limit of G80 - but, something I didn't fully realise till recently, the CUDA block limit is 512 threads. I wonder if this limit of 512 also influenced the shared memory model.
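As launch arithmetic, the two limits interact like this (a hedged sketch; the noop kernel and launch shape are just illustration):

```cuda
#include <cuda_runtime.h>

__global__ void noop(float *p)
{
    p[blockIdx.x * blockDim.x + threadIdx.x] = 0.0f;
}

int main()
{
    float *d;
    cudaMalloc(&d, 1536 * sizeof(float));

    // The CUDA API caps a single block at 512 threads, while G80 hardware
    // keeps up to 768 threads resident per multiprocessor. So one 512-thread
    // block leaves 256 thread slots idle unless a smaller block fills them,
    // whereas two 384-thread blocks pack an SM exactly (2 * 384 = 768).
    noop<<<3, 512>>>(d);       // 512 is the maximum block size here
    cudaThreadSynchronize();   // the API of the era; cudaDeviceSynchronize() today
    cudaFree(d);
    return 0;
}
```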

Actually, is there a difference between CS4 and CS4.1?

So it seems that CS4 is significantly less functional than CUDA on current GPUs. The slight lack of functionality in G80 (no atomics) is a bit of a hindrance, but it does seem like shared memory functionality has been knobbled by ATI. And I still expect ATI hardware earlier than R7xx to be incapable of CS4, unless shared memory is emulated through video memory.
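To show what's missing, here's a hedged CUDA sketch of the sort of kernel CS4 can't express: a shared-memory histogram. Shared-memory atomicAdd needs compute capability 1.2 and global atomics need 1.1, so G80 (CC 1.0) can't run it at all, which matches CS4 exposing no atomics. The kernel itself is my own illustration.

```cuda
#include <cuda_runtime.h>

// Per-block 256-bin histogram using the atomics CS4 doesn't expose.
__global__ void histogram256(const unsigned char *data, unsigned int *bins, int n)
{
    __shared__ unsigned int local[256];

    // Zero the block-local bins cooperatively.
    for (int i = threadIdx.x; i < 256; i += blockDim.x)
        local[i] = 0;
    __syncthreads();

    // Grid-stride loop: many threads bump the same bin concurrently.
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n;
         i += gridDim.x * blockDim.x)
        atomicAdd(&local[data[i]], 1u);   // shared-memory atomic: CC >= 1.2

    __syncthreads();

    // Fold the block's bins into the global result.
    for (int i = threadIdx.x; i < 256; i += blockDim.x)
        atomicAdd(&bins[i], local[i]);    // global atomic: CC >= 1.1
}
```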

I don't remember seeing this before:

Indirect Compute Dispatch: This feature enables the generation of new workloads created by previous rendering or compute shading without CPU intervention. This further reduces CPU overhead and frees up more processing time to be used on other tasks.
This seems to imply that kernel domains can be sized, created and despatched by the GPU.

Or maybe it simply means that the GPU can auto-run a kernel based on a domain that's defined by a buffer that was created on a prior rendering pass. So the input buffer effectively defines the domain size, and completion of writes to that buffer is required before the new kernel can start. It might not even be a new kernel, but a repeated instance of the kernel that's just completed. Some kind of "successive refinement"?
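One way to picture what either reading saves: in CUDA terms today, sizing a second pass from GPU-produced data means a readback and a CPU-issued launch. A hypothetical sketch (produceCount and refine are made-up names) of exactly the round trip that indirect dispatch would remove:

```cuda
#include <cuda_runtime.h>

__global__ void produceCount(unsigned int *count)
{
    *count = 1024;   // stand-in for a pass that decides how much work remains
}

__global__ void refine(unsigned int n)
{
    unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) { /* process work item i */ }
}

void cpuDrivenDispatch(unsigned int *d_count)
{
    unsigned int h_count = 0;

    // GPU pass 1 decides how much work pass 2 has...
    produceCount<<<1, 1>>>(d_count);

    // ...but the CPU must read the count back just to size the next launch.
    cudaMemcpy(&h_count, d_count, sizeof(h_count), cudaMemcpyDeviceToHost);
    refine<<<(h_count + 255) / 256, 256>>>(h_count);

    // Indirect dispatch would take the launch dimensions straight from the
    // GPU-resident buffer instead, with no CPU intervention in between.
}
```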

I can't find this whitepaper on AMD's site.

Jawed
 