AMD: R7xx Speculation

Status
Not open for further replies.
Is anyone thinking about "FX" branding for the R700 high-end part, to complete AMD's FX line-up?

Something like Radeon 4890FX...
 
There is no AMD FX line-up anymore.

What's Phenom FX then, if not an FX line-up?
amd_k10_logo-170907.jpg
 
I'm not sure we need to share the frame buffer and z-buffer, which are far more bandwidth-dependent than texture memory. I think a 16-32GB/s interface (just for textures) could suffice. The other possibility is a fast-clocked but narrow interface (because of die dimensions and limited space for additional pads).
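To put rough numbers on the "fast-clocked but narrow" idea, here is a minimal sketch of the usual peak-bandwidth arithmetic (bus width in bytes times transfers per second). The 64-bit and 128-bit widths and the 2 GT/s rate are assumptions picked for illustration, not rumored R700 specs.

```python
def bandwidth_gb_per_s(bus_width_bits, transfers_per_second):
    """Peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    return bus_width_bits / 8 * transfers_per_second / 1e9


def required_transfer_rate(bus_width_bits, target_gb_per_s):
    """Transfers per second a bus of this width needs to hit the target."""
    return target_gb_per_s * 1e9 * 8 / bus_width_bits


# Hypothetical link widths, both running at an assumed 2 GT/s per pin.
for width in (64, 128):
    print(width, "bits:", bandwidth_gb_per_s(width, 2e9), "GB/s")
```

Under those assumptions a 64-bit link gives 16 GB/s and a 128-bit link gives 32 GB/s, so the 16-32GB/s range would not need a very wide interface as long as it can be clocked that fast.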

Yeah, since each GPU is already breaking down the scene into tiles, each one would only need to keep the frame and z buffers for its own area. That's assuming they are working on the same frame. Or would they work on alternating frames?

If they work on the same frame, wouldn't the interconnect have to share transformed vertex data too? If the transformed data belongs to the other GPU's area (which would only be known post-transform), it would need to be moved there (?).
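A rough sketch of the ownership problem, assuming a checkerboard assignment of screen tiles to GPUs and a conservative bounding-box test. The tile size, the assignment rule and the function names are all made up for illustration; nothing here is known about how R700 would actually do it.

```python
import math

TILE = 32  # assumed tile size in pixels


def tile_owner(tx, ty, num_gpus):
    """Hypothetical checkerboard-style mapping of a tile to a GPU index."""
    return (tx + ty) % num_gpus


def gpus_for_triangle(verts, num_gpus, tile=TILE):
    """GPUs whose tiles overlap the triangle's screen-space bounding box.

    verts holds three (x, y) positions AFTER the vertex transform, which is
    why ownership cannot be decided before the transform has run.
    The bounding box is conservative: it may name a GPU the triangle
    does not actually touch.
    """
    xs = [x for x, _ in verts]
    ys = [y for _, y in verts]
    tx0, tx1 = math.floor(min(xs) / tile), math.floor(max(xs) / tile)
    ty0, ty1 = math.floor(min(ys) / tile), math.floor(max(ys) / tile)
    return {
        tile_owner(tx, ty, num_gpus)
        for tx in range(tx0, tx1 + 1)
        for ty in range(ty0, ty1 + 1)
    }


# A triangle inside one tile, and one that straddles a tile boundary.
print(gpus_for_triangle([(1, 1), (10, 1), (1, 10)], 2))
print(gpus_for_triangle([(30, 1), (40, 1), (30, 10)], 2))
```

The second triangle lands on tiles owned by both GPUs, so whichever GPU transformed it would have to forward the result to the other one (or both would have to transform everything), which is the interconnect traffic being asked about.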
 
And what about offscreen rendered buffers? E.g. how is shadow map rendering done? I doubt it is doable just by sharing the backbuffer.
 
I haven't been keeping up with R700 news, rumors, and speculation lately.

Is each R700 GPU likely to have a 512-bit external bus, or go back down to a 256-bit bus?
 
My guess: 4 dies, each with a path to a memory controller that is externally 512 bits wide. Internally as high as 4096-bit.
 
Would something along these lines work?
r700nw2.png


4 cores + 1 "hub" with "central memory controller" and some other functions like i/o etc?
 
With each of the four cores being what? 55nm, 400-450M transistors?

If they are going to be any smaller than that, then what would be the point in breaking them down into separate dies?

E.g. if they are just going to be 300 million each, you may as well just have a single 1.2 billion chip. It would be a lot less hassle.

Thoughts?

EDIT: I love the fact that drivers over the next 2 years are going to become so good at scaling with multi-chip. I think it's going to be a brilliant situation.
 
With each of the four cores being what? 55nm, 400-450M transistors?

If they are going to be any smaller than that, then what would be the point in breaking them down into separate dies?

E.g. if they are just going to be 300 million each, you may as well just have a single 1.2 billion chip. It would be a lot less hassle.

Thoughts?

I'm not completely sure a single 1.2 billion chip is as easy to make as most people seem to believe. Or, more accurately, as easy to make yield in an even remotely satisfactory fashion. Scaling may also be a pain. With the smaller 300 million chips your yields would be awesome (per small chip), and scaling would be far easier to achieve (from butt-end 1 chip configs to the fizzle my shnizzle 4 chip ones).

OTOH, there are issues that have to be tackled in a multi-chip approach (many of which have already been discussed in this thread), so it's not a walk in the park either.
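For a feel of the yield argument, here is a minimal sketch using the simple Poisson yield model, Y = exp(-area * defect density). The die areas and the defect density below are invented round numbers, not figures for any real process or for R700.

```python
import math


def poisson_yield(area_cm2, defects_per_cm2):
    """Fraction of dies expected to be defect-free under the Poisson model."""
    return math.exp(-area_cm2 * defects_per_cm2)


# Assumed inputs for illustration only.
DEFECTS_PER_CM2 = 0.5
SMALL_DIE_CM2 = 1.0                 # hypothetical ~300M transistor die
BIG_DIE_CM2 = 4 * SMALL_DIE_CM2     # hypothetical ~1.2B transistor die

small = poisson_yield(SMALL_DIE_CM2, DEFECTS_PER_CM2)
big = poisson_yield(BIG_DIE_CM2, DEFECTS_PER_CM2)

print("small die yield:", round(small, 3))
print("big die yield:  ", round(big, 3))
```

With these made-up inputs the small die yields about 61% and the big die about 14%. Note that in this model the big die's yield equals the small die's yield to the fourth power, so the gain only shows up if the small dies are tested individually and only the bad ones are thrown away before four good ones are put on a board.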
 
My guess: 4 dies, each with a path to a memory controller that is externally 512 bits wide. Internally as high as 4096-bit.

drool.

Would something along these lines work?
r700nw2.png


4 cores + 1 "hub" with "central memory controller" and some other functions like i/o etc?


droooooool.


One can easily imagine a consumer rig with 4 cards, each with 4 GPUs, thus 16 small but powerful GPUs.

Larger, higher-end, non-consumer workstations could easily have 64-128 GPUs
(older SGI UltimateVision systems had up to 16 ATI GPUs)

and large-scale supercomputer visualization systems could have over 1000 R700 GPUs.
It's not hard to imagine systems, consumer and non-consumer, with that many AMD/ATI chips deployed during the 2008-2010 timeframe, with either R7xx- or R8xx-based tech.
 
I'd have thought a fairly small central die performing CF/ringbus arbitration/load balancing, UVD, PCIe & some other I/O stuff, but with the links to GDDR & ROPs being on the other dies.
Maybe they would have Budget & High-End versions of the central core, with a more complex arbitrator/bigger caches etc. in the High-End one.
 