When will a 512-bit memory bus come?

Just seeing which surface is visible for a given pixel is such a small part of the rendering problem ... the amount of bandwidth you can burn on more accurate shading before you hit diminishing returns won't be within reach for a while.

In addition to eDRAM and traditional MCMs (with or without area I/O), there are a couple of alternatives on the horizon to putting memory on the PCB, for instance Sun's edge communication or chip-on-chip (with area I/O).
 
I find it a little curious that no one has looked a few millimeters out of the box.

IMHO a 512-bit bus won't be needed for a very long time because we already have other technologies to improve bandwidth. Here are two examples:

Rambus XDR, able to transmit 4 times as much data per pin per clock as DDR SDRAM,

and Huffman-compressed framebuffers and Z-buffers.

These are only two small examples of what can be done to ease the bandwidth problem, so IMHO even more can be done to improve the situation before going to 512-bit buses.
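
For what it's worth, the 4x figure falls straight out of the transfer rates: DDR moves 2 bits per data pin per clock, while XDR moves 8 (octal data rate). A trivial sanity check:

Code:
#include <stdio.h>

int main(void) {
    const int ddr_transfers_per_clock = 2;  /* double data rate */
    const int xdr_transfers_per_clock = 8;  /* octal data rate  */
    printf("XDR vs DDR, per pin per clock: %dx\n",
           xdr_transfers_per_clock / ddr_transfers_per_clock);  /* 4x */
    return 0;
}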
 
XDR DRAM will probably remain too expensive for anyone but Sony to consider using for a while. The ASICs used on FB-DIMMs would be interesting for use with graphics cards, though.

AFAICS adaptive Huffman is a bad idea; DPCM with adaptive Golomb-Rice coding is so much easier to implement in hardware (see the sketch below). My guess would be that the guys who did that project weren't very familiar with low-complexity lossless image coders.
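
To show what I mean by low complexity, here's a minimal sketch of that kind of coder: DPCM with a left-neighbour predictor feeding Golomb-Rice coded residuals, in the spirit of LOCO-I / JPEG-LS. The predictor, the fixed Rice parameter and the toy bit writer are my own simplifications, not anyone's actual hardware:

Code:
#include <stdio.h>
#include <stdint.h>

/* append one bit to a byte buffer (toy bit writer) */
static void put_bit(uint8_t *buf, int *nbits, int bit) {
    if (bit) buf[*nbits >> 3] |= 0x80 >> (*nbits & 7);
    (*nbits)++;
}

/* Rice-code a non-negative value with parameter k:
   unary quotient (q ones, then a zero), then k remainder bits */
static void rice_encode(uint8_t *buf, int *nbits, uint32_t u, int k) {
    uint32_t q = u >> k;
    while (q--) put_bit(buf, nbits, 1);
    put_bit(buf, nbits, 0);
    for (int i = k - 1; i >= 0; i--)
        put_bit(buf, nbits, (u >> i) & 1);
}

int main(void) {
    const uint8_t pix[8] = {100, 101, 103, 103, 102, 104, 108, 107};
    uint8_t buf[32] = {0};
    int nbits = 0, k = 2;              /* fixed k here; adapted in practice */
    for (int i = 7; i >= 0; i--)       /* first pixel stored raw */
        put_bit(buf, &nbits, (pix[0] >> i) & 1);
    for (int i = 1; i < 8; i++) {
        int r = pix[i] - pix[i - 1];   /* DPCM: left-neighbour predictor */
        /* map signed residual to unsigned: 0,-1,1,-2,... -> 0,1,2,3,... */
        uint32_t u = (r >= 0) ? (uint32_t)(2 * r) : (uint32_t)(-2 * r - 1);
        rice_encode(buf, &nbits, u, k);
    }
    printf("8 pixels (64 bits raw) coded in %d bits\n", nbits);  /* 33 */
    return 0;
}

On this toy scanline the 8 pixels (64 bits raw) come out at 33 bits, and everything is shifts, adds and compares - no code tables to maintain, which is the whole point versus adaptive Huffman.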
 
XDR uses differential pairs at 3.2Gbps per pair, which gives the same per-pin bandwidth as (non-differential) 1.6 Gbps GDDR3 memory. No gain there.

That huffman encoding/decoding paper has been discussed here before; even if the stated compression ratios were actually achieved, it still doesn't solve the problem that Huffman encoding and decoding (especially the adaptive variants) are slow, serial tasks that do not parallelize well at all, limiting their usefulness in GPUs.
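
To make the serial-dependency point concrete: the decoder can't even locate where symbol N+1 starts in the bit stream until it has fully decoded symbol N, since code lengths vary. A toy decoder for a made-up static 4-symbol code (nothing to do with the paper's actual tables):

Code:
#include <stdio.h>

typedef struct { int left, right, symbol; } Node;

/* hypothetical static code: A=0, B=10, C=110, D=111 */
static const Node tree[] = {
    {1, 2, -1},     /* 0: root           */
    {-1, -1, 'A'},  /* 1: leaf 'A' (0)   */
    {3, 4, -1},     /* 2: internal (1)   */
    {-1, -1, 'B'},  /* 3: leaf 'B' (10)  */
    {5, 6, -1},     /* 4: internal (11)  */
    {-1, -1, 'C'},  /* 5: leaf 'C' (110) */
    {-1, -1, 'D'},  /* 6: leaf 'D' (111) */
};

int main(void) {
    /* bit stream for "ABCD": 0 10 110 111 -> 010110111, MSB first */
    const unsigned bits = 0x0B7;
    int pos = 8, node = 0;
    while (pos >= 0) {
        int bit = (bits >> pos--) & 1;   /* one bit at a time, in order */
        node = bit ? tree[node].right : tree[node].left;
        if (tree[node].symbol >= 0) {    /* hit a leaf */
            putchar(tree[node].symbol);
            node = 0;  /* next symbol's start depends on this one's end */
        }
    }
    putchar('\n');                       /* prints: ABCD */
    return 0;
}

Adaptive variants are even worse, since the code itself also changes after every decoded symbol.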

EDIT: also, block compression of color buffers will force you to do full read-modify-write cycles (instead of just write-only cycles) on every block that a polygon edge passes through, so you get a large amount of additional traffic when you are rendering small opaque polygons. Rough numbers below.
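
With an assumed 8x8 block, 32-bit color and a 2:1 compression ratio - all illustrative, not any particular chip:

Code:
#include <stdio.h>

int main(void) {
    const int block_pixels = 8 * 8;  /* assumed 8x8 tile            */
    const int bpp          = 4;      /* 32-bit color                */
    const int covered      = 10;     /* pixels a small tri touches  */
    const int block_bytes  = block_pixels * bpp;           /* 256 B */

    /* uncompressed: write only the covered pixels */
    int uncompressed = covered * bpp;                      /*  40 B */

    /* compressed (2:1): read the whole block, write it back */
    int compressed = block_bytes / 2 + block_bytes / 2;    /* 256 B */

    printf("uncompressed write-only: %d bytes\n", uncompressed);
    printf("compressed RMW:          %d bytes\n", compressed);
    return 0;
}

So for a small triangle touching only a handful of pixels in a block, the "compressed" path can easily move several times more data than the plain write-only path.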
 
MfA - do you have a link for that? I found some stuff on DPCM and Golomb-Rice codes (LOCO-I) but not both...

Cheers,
Serge
 
arjan de lumens said:
it still doesn't solve the problem that Huffman encoding and decoding (especially the adaptive variants) are slow, serial tasks that do not parallelize well at all,
I'm glad someone's pointed this out.
 
Ah, that comment ignites my curiosity regarding recollections of geometry compression discussions. Will we have cause to discuss such a topic again in the near future, I wonder? :p
 
arjan de lumens said:
XDR uses differential pairs at 3.2Gbps per pair, which gives the same per-pin bandwidth as (non-differential) 1.6 Gbps GDDR3 memory. No gain there.

Uhh, please correct me if I'm wrong, but it was my understanding that with normal single-ended signaling, you'll need a ground pin for every data pin.

Thus, for every pair of pins on the package, you'll yield 1.6Gbps with GDDR3, but with first-generation XDR, you'll yield 3.2Gbps. And XDR is slated to scale from its current 400MHz to 800MHz (6.4Gbps per pair) within 2 years.
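
Spelling out the arithmetic, under my one-ground-per-data-pin assumption (which is exactly the part I'm asking to be corrected on), plus XDR's 8 transfers per clock, which is where the 3.2 and 6.4 numbers come from:

Code:
#include <stdio.h>

int main(void) {
    /* per two package pins, under the ground-per-data-pin assumption */
    printf("GDDR3 (data + ground pin): %.1f Gbps per 2 pins\n", 1.6);
    printf("XDR (differential pair):   %.1f Gbps per 2 pins\n", 3.2);

    /* XDR data rate = clock * 8 transfers per clock */
    printf("XDR @ 400 MHz: %.1f Gbps per pair\n", 400 * 8 / 1000.0);
    printf("XDR @ 800 MHz: %.1f Gbps per pair\n", 800 * 8 / 1000.0);
    return 0;
}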
 
Vince said:
Uhh, please correct me if I'm wrong, ...

You are.

There are no special "signal ground" lines in GDDR3.
I/O signals are referenced to the power ground, and I don't think there are any extra power/ground signals compared to XDR.
 
Basic said:
You are.

There are no special "signal ground" lines in GDDR3.
I/O signals are referenced to the power ground, and I don't think there are any extra power/ground signals compared to XDR.

Interesting. Then how exactly does GDDR3 get around the noise problems associated with unbalanced transmission? Where's a good place for more specific information? I remember hearing that in single-ended signaling, your ratio of active pins to power and ground pins will converge on 1:1. Obviously, you can get around this with XDR's DRSL, which should give you 2x the mean per-pin transmission rate as the ratio nears 1:1, but I've never read up on any of the DDR2/GDDR3 specs.
 
You can isolate signals by putting grounded traces in between without having extra grounding pins ... power consumption and signal integrity are separate issues.

Samsung's GDDR3 chips have 14 power and 18 ground pins designated for output, for 32 I/O pins; they seem to have separate power grids for the memory core and the I/O.

BTW it is interesting that NVIDIA is saying they are using compression for the color buffer (I assume they mean the framebuffer).
 
MfA said:
BTW it is interesting that NVIDIA is saying they are using compression for the color buffer (I assume they mean the framebuffer).

Why? Don't both NV3x and R3xx already do color compression? (Am I missing something, other than that this might be improved compression over the NV3x implementation?)
 
A 512-bit bus?

Not until at least R500 / NV50,

but more likely with R600 / NV60.


It's said that a 512-bit bus will be very hard to implement, but they said the same thing about the 256-bit bus.
 