NVidia DDR-II - what speed and bandwidth will it have?

g__day

Regular
So far I haven't understood my logic error, guys - help me out.

Samsung say their 128 Mbit chip can do 4 GB/sec (regardless of whether the RAM frequency is, say, 32 GHz on a 1-bit bus or 32 MHz on a 1,024-bit bus).

http://www.samsungelectronics.com/news/device_solution/com_news_1026373517250_001600.html

Seoul, Korea – July 10, 2002: Samsung Electronics Co, Ltd. introduces the industry’s first graphics memory chip using the next-generation DDR II specifications. Samsung’s new graphics chip is available in 128Mb density and boasts a data transmission rate of 1-GigaHertz and higher. The device is capable of transmitting 4-Gigabytes of data per second and is ideally suited for systems that require high memory speed and performance such as 3D graphics, gaming and network applications.


Well, I guess you'd need 8 separate 128 Mbit chips to give you 128 MBytes of memory on a video card.

So doesn't that give you 8 * 4 GBytes/sec = 32 GBytes/sec raw speed?

By my way of thinking at 4 GByte/sec throughput per chip, does that mean you either have:

1) 128 bit bus at 250MHz (doesn't seem that fast) or
2) 256 bit bus at 125 MHz
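Both options multiply out to the same 4 GB/sec per chip; a quick sketch of the arithmetic (nothing chip-specific, just bus width times transfer rate):

```python
def bandwidth_gb_per_s(bus_bits: int, clock_mhz: float) -> float:
    """Raw bandwidth = bus width in bytes * transfers per second."""
    return bus_bits / 8 * clock_mhz / 1000  # bytes * GT/s = GB/s

print(bandwidth_gb_per_s(128, 250))  # option 1: 128-bit at 250 MHz -> 4.0
print(bandwidth_gb_per_s(256, 125))  # option 2: 256-bit at 125 MHz -> 4.0
```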

Does this sort of memory need to be accessed one chip at a time or several banks at once?

Many thanks!!!
 
g__day said:
By my way of thinking at 4 GByte/sec throughput per chip, does that mean you either have:

1) 128 bit bus at 250MHz (doesn't seem that fast) or
2) 256 bit bus at 125 MHz

Well, they already say that the memory runs at 1 GHz (effective clockrate), so the actual configuration must be 32bit at 1000 MHz (effective clockrate) per chip. 8 chips could be configured as 256bit bus at 1 GHz, for 32GB/s, or the chips could be arranged in pairs sharing the same data lines to make a 128 bit bus at 1 GHz, giving 16GB/s bandwidth.
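The two arrangements described above can be checked numerically (a back-of-envelope sketch, assuming 32-bit chips at a 1 GHz effective data rate as stated):

```python
CHIP_BUS_BITS = 32         # data width of one DDR-II chip
EFFECTIVE_CLOCK_GHZ = 1.0  # 500 MHz clock, data on both edges

def card_bandwidth_gb_per_s(num_chips: int, chips_per_bus: int) -> float:
    """Aggregate bandwidth when `chips_per_bus` chips share one set of data lines."""
    total_bus_bits = CHIP_BUS_BITS * num_chips // chips_per_bus
    return total_bus_bits / 8 * EFFECTIVE_CLOCK_GHZ

print(card_bandwidth_gb_per_s(8, 1))  # independent chips: 256-bit bus -> 32.0 GB/s
print(card_bandwidth_gb_per_s(8, 2))  # paired chips: 128-bit bus -> 16.0 GB/s
```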
 
Thanks - I understood most of that. It leads me to ask why you would want the chips to share buses, as this halves graphics bandwidth, and bandwidth is exactly what graphics cards are starved for. Is it just too complex and/or costly to wire separate 32-bit buses from each memory chip into an intelligent controller?

DDR II is only clock doubled, I thought, not quadrupled? So 64 bit at 2 * 250 MHz seems achievable - but are you thinking 32 bit at 4 * 250 MHz, or 32 bit at 2 * 500 MHz?
 
256-bit buses are more expensive to implement than 128-bit buses.

Yep, DDRII is still clock doubled. These chips are 32 bit at 2 * 500MHz.
 
Because it says so, twice, on the page you linked to…

"With data transfer rates of 1GHz, the device can transmit 4 gigabytes of data per second."

It says the transfer rate is 1 GHz, and to get 4GB/s bandwidth on a 1 GHz bus, it has to be 32bits wide.


And once more:

"The company is scheduled to start mass production of its 128Mb, 1GHz graphics DDR II DRAM in the third quarter of 2002"
 
D'oh - brilliant!!! Thanks :)

So it's 1 GHz or better per pin * 32 data pins / 8 bits per byte to give 4 GB/sec throughput. Meaning the bus must be at least 64 bits for address and data (assuming 32 pins for address).
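That per-pin arithmetic written out directly (a sketch of the calculation above, not a datasheet figure):

```python
data_rate_per_pin_hz = 1e9  # 1 GHz transfer rate on each data pin
data_pins = 32              # 32-bit wide data interface
bits_per_byte = 8

# transfers/sec * pins = bits/sec; divide down to GB/sec
throughput_gb_per_s = data_rate_per_pin_hz * data_pins / bits_per_byte / 1e9
print(throughput_gb_per_s)  # 4.0
```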
 
I'll be very disappointed if Nvidia only has a 128-bit bus on NV30. With the other three GPUs (or VPUs) 3DLabs P10, Matrox Parhelia and ATI R300 all having 256-bit memory bus, I would expect/hope that NV30 has one too.

That said, it will be interesting to see how well an NV30 w/ a 128-bit bus and DDR-II plus LMAIII plus any new bandwidth saving technology NV30 has (GigaPixel?) will compete with R300, with its raw bandwidth plus modest bandwidth saving (HZ3) features.
 
I think you forgot to mention the R300 also has a cross-bar style memory controller ~ like LMAIII, AFAIK
 
megadrive0088 said:
That said, it will be interesting to see how well an NV30 w/ a 128-bit bus and DDR-II plus LMAIII plus any new bandwidth saving technology NV30 has (GigaPixel?) will compete with R300, with its raw bandwidth plus modest bandwidth saving (HZ3) features.
R300 has several other bandwidth saving features. You haven't seen how well it performs with MSAA? :)
 
OpenGL guy said:
R300 has several other bandwidth saving features. You haven't seen how well it performs with MSAA? :)

I am pretty much stunned that ATi are only now touting early Z rejection with R300, implying that R200 didn't have early Z rejection - this means the hierarchical Z buffer in R200 was only a bandwidth-saving device, as any occluded pixel with pixel shader ops would have to be fully rendered before the chip realised it didn't need to calculate it!
 
I think the early z-rejection is separate from what Hyper-Z does, and is a side-effect of being able to do multisampling.

What I wonder, however, is whether or not this early z-rejection (on GeForce3/4 or R300 hardware) is usable at all when the highest-available MSAA is put to use.
 
DaveBaumann said:
I am pretty much stunned that ATi are only now touting early Z rejection with R300, implying that R200 didn't have early Z rejection - this means the hierarchical Z buffer in R200 was only a bandwidth-saving device, as any occluded pixel with pixel shader ops would have to be fully rendered before the chip realised it didn't need to calculate it!
I thought HyperZ II hierarchical Z tested and discarded blocks of occluded pixels (not sure what the block size was). My interpretation from what I've read on HyperZ III is that it can now do the Z test on individual pixels before the pixel shaders, rather than just on blocks. That should allow it to catch a larger percentage (if not all?) of the occluded pixels.
 
g__day said:
Donald Webster - from 3dGPU - had this sad observation on a similar thread:

All current indications taken in, there will be no 256-bit bus on the nV30. An employee from nV is quoted as stating "256-bit bus is overkill". The only reason I can think of someone saying anything that dumb is that they don't have one.

http://www.3dgpu.com/yabb_se/index.php?board=2;action=display;threadid=703

Whether a 256-bit bus is overkill depends on the chip design, obviously, and where you focus your performance goals. I would assume that nVidia engineers know their business, but would still be surprised if they could outperform the R300 across the board with a lower bandwidth memory subsystem. Caching and various optimisation techniques could make a 128-bit 500MHz DDR(II) bus perform similarly to a 256-bit 350MHz bus. Lower bandwidth, but also lower latency, after all. Memory part costs should be substantially higher, however, and I'm not convinced that they gain much in terms of chip packaging and PCB costs. Overall, going 256-bit seems a better choice.
And if the NV30 uses a 128-bit bus, it seems reasonable to assume that their next part will double that to 256bits, and will be designed around the possibilities such a memory subsystem will offer. It's a technological knee, and I suspect many will prefer to shop after nVidia is past it, rather than before. A factor of two in bandwidth is nothing to sneeze at, and should allow for a corresponding increase in performance, if the rest of the design is balanced to take advantage of it.

Entropy
 
Sigh, this is getting greedy - but imagine a DDR II 256 bit solution with 256MB of RAM - by my calculations that would give a raw speed platform with 64 GB/sec bandwidth - sigh :)
 
g__day said:
Sigh, this is getting greedy - but imagine a DDR II 256 bit solution with 256MB of RAM - by my calculations that would give a raw speed platform with 64 GB/sec bandwidth - sigh :)
32 bytes * 1 billion cycles/sec will give you only 32 GB/sec. Increasing total memory will not increase the available bandwidth.
 
As Russ says, 256MB - configured as 16 * 128 Mbit chips, each with its own 32-bit bus to a memory controller (or several controllers, tied to a 256/512-bit bus), each chip giving you 4 GB/sec = 64 GB/sec - within the realms of possibility, certainly.

But I'd hate to do the wiring :)
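That 16-chip layout checks out arithmetically (a sketch assuming each 32-bit chip drives its own dedicated bus at a 1 GHz effective data rate):

```python
chips = 16               # 16 x 128 Mbit = 256 MB total
bits_per_chip_bus = 32   # one dedicated 32-bit bus per chip
effective_ghz = 1.0      # 500 MHz DDR, data on both clock edges

aggregate_bus_bits = chips * bits_per_chip_bus          # 512-bit aggregate
bandwidth_gb_per_s = aggregate_bus_bits / 8 * effective_ghz
print(aggregate_bus_bits, bandwidth_gb_per_s)  # 512 64.0
```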
 