Freak'n Big Panda
Regular
There is currently a trend in the industry of addressing the high-end market with multi-GPU (MGPU) designs (R680, R700), but as many of us are aware, this approach has numerous problems.
For AFR-based (alternate frame rendering) schemes, persistent data is the number one killer of performance scaling. The need to duplicate memory for each chip is also a major cost concern.
Clearly the IHVs will try to tackle these problems; the question is how.
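To put rough numbers on the duplication cost, here is a quick sketch. The 512 MB per GPU figure and the function names are purely illustrative assumptions, not specs of any real board.

```python
def usable_memory_mb(per_gpu_mb: int, num_gpus: int, shared: bool) -> int:
    """Memory available for unique data across the whole board.

    With duplication (shared=False) every GPU holds its own copy of the
    same data, so adding GPUs adds no usable capacity. With a shared
    pool (shared=True) the capacities add up.
    """
    return per_gpu_mb * num_gpus if shared else per_gpu_mb


def duplicated_mb(per_gpu_mb: int, num_gpus: int) -> int:
    """Memory spent holding redundant copies when data is duplicated."""
    return per_gpu_mb * (num_gpus - 1)


if __name__ == "__main__":
    per_gpu = 512  # hypothetical: 512 MB fitted per GPU
    print("duplicated, usable:", usable_memory_mb(per_gpu, 2, shared=False), "MB")
    print("shared, usable:    ", usable_memory_mb(per_gpu, 2, shared=True), "MB")
    print("spent on copies:   ", duplicated_mb(per_gpu, 2), "MB")
```

So under these assumptions a two-chip board pays for 1024 MB of DRAM but only gets 512 MB of usable space.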
I have an idea on how to address the issues and I'd like feedback from some of the more knowledgeable members here.
Say a GPU had four x-bit DRAM channels. Would it be possible to tie two of the channels on GPU0 to two of the channels on GPU1, giving GPU0 access to GPU1's local memory and vice versa? This would kill two birds with one stone: memory would no longer need to be duplicated, and persistent data would no longer be an issue thanks to the unified memory architecture (UMA).
I was thinking something like this:
GPU0 DRAM CH0 <-----------> GPU1 DRAM CH0
GPU0 DRAM CH1 <-----------> GPU1 DRAM CH1
GPU0 DRAM CH2 <-----------> DRAM
GPU0 DRAM CH3 <-----------> DRAM
GPU1 DRAM CH2 <-----------> DRAM
GPU1 DRAM CH3 <-----------> DRAM
So the bus between GPU0 and GPU1 would use the GDDR5 protocol in order to provide direct access to the other GPU's local memory store.
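As a back-of-the-envelope check on the bandwidth side of this layout, here is a toy model. It assumes each GPU can forward requests arriving on its linked channels to its own DRAM, and the 16 GB/s per-channel figure and all names are hypothetical placeholders, not real GDDR5 numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Board:
    channels_per_gpu: int = 4     # from the layout above
    linked_channels: int = 2      # channels tied to the other GPU
    channel_bw_gbs: float = 16.0  # hypothetical per-channel bandwidth
    num_gpus: int = 2

    def local_bw(self) -> float:
        """Bandwidth from one GPU to its own directly attached DRAM."""
        return (self.channels_per_gpu - self.linked_channels) * self.channel_bw_gbs

    def link_bw(self) -> float:
        """Bandwidth of the GPU-to-GPU link."""
        return self.linked_channels * self.channel_bw_gbs

    def remote_bw(self) -> float:
        """Remote access is capped by the link and by the peer's own DRAM."""
        return min(self.link_bw(), self.local_bw())

    def peak_bw_one_gpu(self) -> float:
        """Best case for one GPU: the peer is idle and its DRAM is free."""
        return self.local_bw() + self.remote_bw()

    def aggregate_dram_bw(self) -> float:
        """Total DRAM bandwidth on the board, shared by all GPUs."""
        return self.num_gpus * self.local_bw()

    def conventional_aggregate_bw(self) -> float:
        """Same chips with every channel wired straight to DRAM."""
        return self.num_gpus * self.channels_per_gpu * self.channel_bw_gbs


if __name__ == "__main__":
    b = Board()
    print("local:", b.local_bw(), "remote:", b.remote_bw())
    print("aggregate (linked):      ", b.aggregate_dram_bw())
    print("aggregate (conventional):", b.conventional_aggregate_bw())
```

In this simplified model the capacity is no longer duplicated, but with both GPUs busy the board has half the aggregate DRAM bandwidth of a conventional design, since half of each GPU's channels now go to the other GPU instead of to memory.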
What do you think about this idea, and do you have any other thoughts on how the industry will address the MGPU issue from a hardware standpoint?