AMD: R9xx Speculation

Err... differential signaling on GDDR5 coming?
I could never understand the argument for differential signalling. You need twice as many data wires to implement it, so why not just use twice as many single-ended signals? I have a hard time believing that differential signalling can get the next GDDR to clock over twice as high.
 
I could never understand the argument for differential signalling. You need twice as many data wires to implement it, so why not just use twice as many single-ended signals? I have a hard time believing that differential signalling can get the next GDDR to clock over twice as high.
In fact, it's more by a factor of 10.

BTW, GDDR5 couldn't use it, as it's single-ended by design.

Perhaps GDDR6 will be 16-bit differential, but GDDR5 still has some headroom.
 
I could never understand the argument for differential signalling. You need twice as many data wires to implement it, so why not just use twice as many single-ended signals? I have a hard time believing that differential signalling can get the next GDDR to clock over twice as high.

because you aren't accounting correctly. There is a lot more to an SE interface than the data signals.
 
Perhaps GDDR6 will be 16-bit differential, but GDDR5 still has some headroom.

With a 512-bit bus and 6 Gbps GDDR5 they could reach 384 GB/s, but the cards would cost more than the 5800 Radeons.
But I think that chasing bandwidth across the whole video memory is quite a waste of power now that 2 GB cards are just around the corner.
It would be much more effective to have a smaller, much faster memory close to the GPU and a second multi-GB memory for data storage and reads. Something like L1 and L2 video memory :rolleyes:
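
For reference, a quick Python sketch of the arithmetic behind that 384 GB/s figure, using the 512-bit / 6 Gbps numbers quoted above; the HD 5870 line is only there for scale:

# Peak GDDR5 bandwidth: bus width (bits) x per-pin data rate (Gbps) / 8 bits per byte.
def peak_bandwidth_gb_s(bus_width_bits, pin_rate_gbps):
    """Peak memory bandwidth in GB/s."""
    return bus_width_bits * pin_rate_gbps / 8.0

print(peak_bandwidth_gb_s(512, 6.0))  # 384.0 GB/s -- the hypothetical 512-bit, 6 Gbps card
print(peak_bandwidth_gb_s(256, 4.8))  # 153.6 GB/s -- HD 5870 (256-bit, 4.8 Gbps) for comparison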
 
because you aren't accounting correctly. There is a lot more to an SE interface than the data signals.
Can you give me an example? If all the signals in a differentially signalled memory interface were converted to twice as many single ended signals to be used in whatever way you please, what's the ratio in the amount of data that can be transferred per clock?
 
Is your experience applicable to understanding the merits of differential signaling first hand?

I certainly wouldn't want to design a modern high-speed SerDes, but I understand all the fundamentals, try to keep up with the research, and understand the digital/logical side pretty well. It's effectively the sister field of what I've been working on for the past 11+ years, which is the link/routing side of the world. I can't put together a good PLL/DLL but understand everything else.

The main hidden tradeoff between differential and SE signaling is the additional strobe and P/G isolation required by SE versus the differential pair required per diff signal.

SE gets harder and harder to push past 6-8 GT/s, while there are already mass-produced 6-10 GT/s differential systems in production that deal with significantly nastier electrical environments.
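
To make that pin accounting concrete, here's a rough Python sketch; the strobe ratio and P/G isolation factors are illustrative assumptions on my part, not figures from any real GDDR spec:

# Rough pin-budget comparison: single-ended (SE) vs differential signaling.
# Overhead factors are illustrative assumptions, not real GDDR5/GDDR6 numbers.
def se_pin_count(data_bits, bits_per_strobe=8, pg_isolation_per_bit=0.5):
    """SE: one pin per data bit, plus strobes, plus extra power/ground isolation pins."""
    return data_bits + data_bits / bits_per_strobe + data_bits * pg_isolation_per_bit

def diff_pin_count(data_bits, pg_isolation_per_bit=0.1):
    """Differential: a true/complement pair per data bit, but far less P/G isolation and no strobes."""
    return data_bits * 2 + data_bits * pg_isolation_per_bit

for width in (32, 256):
    print(f"{width}-bit: SE ~{se_pin_count(width):.0f} pins, diff ~{diff_pin_count(width):.0f} pins")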
 
Can you give me an example? If all the signals in a differentially signalled memory interface were converted to twice as many single ended signals to be used in whatever way you please, what's the ratio in the amount of data that can be transferred per clock?

you have to account for the strobes and P/G isolation required by SE. DE requires less P/G isolation because the signaling itself is more resilient to common mode noise injection.

Over the short interconnect distances that, say, GDDR5 is pushed across today, you should be able to hit ~13-16 GT/s with a differential solution.

As an example, it might be educational to look at the FBD pins vs DDR2/3 pins.
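
Combining the GT/s ranges quoted in this exchange with a rough pins-per-bit budget gives a back-of-the-envelope throughput-per-pin comparison; the overhead factors are again illustrative assumptions, not measured values:

# Effective throughput per package pin at the per-bit rates quoted above.
SE_PINS_PER_BIT = 1.6    # data pin + share of strobes and P/G isolation (assumed)
DIFF_PINS_PER_BIT = 2.1  # true/complement pair + a little P/G isolation (assumed)

def gbit_per_pin(rate_gt_s, pins_per_bit):
    """Gbit/s delivered per pin at a given per-bit transfer rate."""
    return rate_gt_s / pins_per_bit

print(f"SE   @  7 GT/s: {gbit_per_pin(7, SE_PINS_PER_BIT):.1f} Gbit/s per pin")    # ~4.4
print(f"diff @ 14 GT/s: {gbit_per_pin(14, DIFF_PINS_PER_BIT):.1f} Gbit/s per pin")  # ~6.7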
 
What happened with FBD anyway? Was the clock simply pushed too high? If Intel had gone with, say, twice the pins and 3 times the DDR2 data rate (instead of 6 times), wouldn't we now be past single-ended multidrop buses in an electrical environment which has far better justification for differential signalling (and daisy chaining) than GPUs do?
 
you have to account for the strobes and P/G isolation required by SE. DE requires less P/G isolation because the signaling itself is more resilient to common mode noise injection.
Okay, that makes sense, though it's a bit hard to quantify how much expense this adds to the design.

Over the short interconnect distances that, say, GDDR5 is pushed across today, you should be able to hit ~13-16 GT/s with a differential solution.
That doesn't seem like it's much better than the 6-8 you mentioned with SE, isolation issues aside. Also, do the memory makers have silicon that can handle that kind of switching speed for the prefetch buffers? And why didn't Rambus push for higher speeds before?
 
I've been following this thread for some time, and the question that springs to my mind is: what comes after NI if it is indeed on 28nm? 22nm is far out and, according to the current TSMC schedule, will be introduced in Q3 2011. 40nm was officially available at TSMC in Q408; that's a gap of almost two years to 28nm.

Basically, ever since the cancellation of 45nm at TSMC the schedule has gone awry. We had 55nm within six months of 65nm. 45nm was supposed to come in late '08, followed by 40nm in Q1/Q2 '09, I suppose. In their haste the process had too many problems and we didn't see real volume till Q309. In the same way, I don't expect any quantities of 28nm till Q1 2011, which means 40G will be the leading process for two whole years! After that it's another wait of a year and a half for 22nm.

Normally we were seeing a 12-month product cycle with a new process available each year. Now it seems like GPU makers have to do two product cycles on the same process. This leads me to believe we'll see NI at 40nm. If 32nm hadn't been cancelled we'd have seen NI on 32nm this year. And the possibility of it being done on 32nm SOI at GF has all but been ruled out, I suppose.
 
http://translate.google.cn/translate?js=y&prev=_t&hl=zh-CN&ie=UTF-8&layout=1&eotf=1&u=http%3A%2F%2Fbbs.chiphell.com%2Fviewthread.php%3Ftid%3D78057%26page%3D5%26authorid%3D2&sl=auto&tl=en

nApoleon said the R9xx chip should be taping out about now on the 40nm node, with fewer stream processors than Cypress and a performance target 10%-20% higher than the GTX480. There are some changes in the architecture, but it's not a totally new architecture.

So that's "Hecatonchires" then? The refresh coming in the second half of 2010? (Aug/Sept timeframe, maybe)

And "Northern Island" coming in April 2011? (28nm @GLoFo?)
 
The only thing I'm concerned about at the present time is whether one of the chips in the Northern Islands group is the North Island. It would be a slap in the face for all New Zealanders if one of the Northern Islands chips isn't called North Island, after our epically named northern island.
 
http://translate.google.cn/translate?js=y&prev=_t&hl=zh-CN&ie=UTF-8&layout=1&eotf=1&u=http%3A%2F%2Fbbs.chiphell.com%2Fviewthread.php%3Ftid%3D78057%26page%3D5%26authorid%3D2&sl=auto&tl=en

nApoleon said the R9xx chip should be taping out about now on the 40nm node, with fewer stream processors than Cypress and a performance target 10%-20% higher than the GTX480. There are some changes in the architecture, but it's not a totally new architecture.

Where do you get the 10-20% performance advantage over the GTX480? I don't see that in your link.
 