256 vs 128 bit bus

malcolm

Newcomer
I'm wondering how big the difference is, since nVidia said it was overkill.
Does this mean it goes unused a lot of the time?
Where would 256-bit be an advantage?
Antialiasing, anisotropic filtering, or maybe big textures?
And how much difference does it make?
 
Well, it's just like Bill said: "640K ought to be enough for anybody."

Guess that kinda counts for the 256-bit bus too :LOL:

Well, this should kinda explain it, partly.
"Graphics memory seems to have much higher peak bandwidth than system memory when running at the same speed"... does it? Mmm, well, so it seems.
A GF2 GTS uses DDR333 (PC2700), which, if I plugged it into the mobo as system memory, would give a bandwidth of 2.7 GB/s. So why does the GeForce get 5.3 GB/s? WTF, is nVidia cheating? Shame on them! Actually, this is why:

Graphics cards use 128-bit memory interfaces, while PC system memory uses a 64-bit interface, so at the same clock speed that means twice the bandwidth: more to spare for textures, antialiasing, yada yada yada. Of course, all that bandwidth in those new cards isn't used by the games you can buy this year.
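A quick back-of-the-envelope sketch of that arithmetic (a rough sketch; the DDR333 rate and the bus widths are the ones quoted above):

```python
def peak_bandwidth_gb_s(bus_bits, effective_mt_s):
    """Peak bandwidth in GB/s: bytes per transfer times transfers per second."""
    return bus_bits / 8 * effective_mt_s * 1e6 / 1e9

# DDR333 = 333 million transfers/s effective (166 MHz clock, double data rate)
print(peak_bandwidth_gb_s(64, 333))   # ~2.7 GB/s: PC2700 on a 64-bit motherboard bus
print(peak_bandwidth_gb_s(128, 333))  # ~5.3 GB/s: the same chips on a GF2 GTS's 128-bit bus
```

Same chips, same clock; only the width of the interface differs.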

I am tired, so no more technical blablabla; I need to get up in 2 hrs, yawn...
Any misinformation or wrong statements are partly due to my lack of sleep... zzzzz
 
Basically, I have said for a long time that a 256-bit bus was not a good idea, primarily due to cost concerns. Quite simply, it's very expensive to produce video cards on a 256-bit bus.

That said, it is most certainly possible for modern video cards to make full use of a 256-bit bus. That is, any video card that supports multisampling AA can benefit very significantly from increased memory bandwidth (talking specifically about the GeForce3/4 and Radeon 9700 here).
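To get a feel for why multisampling is such a bandwidth hog, here is a rough back-of-the-envelope sketch; the resolution, overdraw factor, and frame rate are illustrative assumptions, not measurements of any particular card:

```python
# Rough estimate of raw framebuffer traffic with 4x multisampling,
# ignoring texture traffic and any color/Z compression.
width, height = 1024, 768
samples       = 4      # 4x MSAA: 4 color + 4 depth samples per pixel
bytes_color   = 4      # 32-bit color per sample
bytes_depth   = 4      # 32-bit Z per sample
overdraw      = 2.5    # assumed average writes per pixel per frame
fps           = 60

# Each covered sample costs roughly one Z read, one Z write, and one color write.
bytes_per_sample = bytes_depth * 2 + bytes_color

per_frame = width * height * samples * overdraw * bytes_per_sample
print(f"~{per_frame * fps / 1e9:.1f} GB/s of framebuffer traffic alone")  # ~5.7 GB/s
```

Even with these modest assumptions, the framebuffer alone eats a sizable chunk of a 128-bit card's bandwidth, which is why compression and smarter rejection matter so much.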

At the same time, however, there are two other issues. Firstly, it is possible to significantly reduce the total bandwidth usage of today's video cards through better optimizations. Possible ways of optimizing include partial deferred rendering (I don't support full deferred rendering, btw...), frame/z-buffer compression, occlusion detection, early z-rejection, hierarchical Z, and so on. In this way, it may be possible to drastically reduce the memory bandwidth hit required for MSAA, reducing the need for insanely high memory bandwidth.
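As a toy illustration of one of those techniques, here is a minimal sketch of coarse hierarchical-Z rejection; the 8x8 tile size and all names are made up for the example and don't model any particular chip:

```python
# Toy model of coarse hierarchical-Z rejection. Convention: smaller z = closer.
TILE = 8  # assumed 8x8-pixel tiles

class HierZ:
    def __init__(self, width, height):
        tiles_x = (width + TILE - 1) // TILE
        tiles_y = (height + TILE - 1) // TILE
        # Farthest depth currently stored anywhere in each tile (cleared to the far plane).
        self.tile_max_z = [[1.0] * tiles_x for _ in range(tiles_y)]

    def reject(self, tx, ty, block_min_z):
        """True if an incoming block of fragments lies entirely behind the tile:
        it can be discarded without any per-pixel Z-buffer reads or writes."""
        return block_min_z > self.tile_max_z[ty][tx]

    def update(self, tx, ty, block_max_z):
        """After a block that fully covers the tile is drawn, the tile's
        farthest stored depth can be tightened to the block's farthest depth."""
        self.tile_max_z[ty][tx] = min(self.tile_max_z[ty][tx], block_max_z)

hz = HierZ(1024, 768)
hz.update(0, 0, 0.3)          # something at z <= 0.3 now covers tile (0, 0)
print(hz.reject(0, 0, 0.5))   # True: a block entirely at z >= 0.5 is hidden, skip it
print(hz.reject(0, 0, 0.1))   # False: it might be visible, rasterize it normally
```

The point is that the coarse test touches one value per tile instead of 64 per-pixel depths, so occluded geometry never costs the full Z-buffer bandwidth.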

The second primary issue is simply that there are two ways to increase the memory bandwidth. One is increasing the number of pins (i.e. 128-bit bus->256-bit bus). The other is increasing the bandwidth per pin. This can be done through increased frequencies, or increased number of signals sent per clock.

Increasing the frequencies is inherently cheaper than increasing the number of pins because the primary costs involved are inside the chips themselves. Still, at the same time, it does require collaboration between graphics chip companies and memory manufacturers, meaning that this is the harder of the two improvements in memory bandwidth to make happen.
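For concreteness, a tiny sketch of the two routes; the 300 MHz figure is just a round example number:

```python
def peak_gb_s(bus_bits, effective_mt_s):
    return bus_bits / 8 * effective_mt_s * 1e6 / 1e9  # bytes/transfer x transfers/s

print(peak_gb_s(128, 600))   # ~9.6 GB/s: 128-bit bus, 300 MHz DDR
print(peak_gb_s(256, 600))   # ~19.2 GB/s: same clock, twice the pins
print(peak_gb_s(128, 1200))  # ~19.2 GB/s: same pins, twice the per-pin data rate
```

Both routes land on the same number; the difference is entirely in where the cost and the engineering risk sit.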

Another way to look at it might be that ATI felt that memory manufacturers didn't have faster memory technologies out quickly enough for their next-generation design.

It might be very interesting to see what nVidia is going to put out. I think the most likely scenario is a similar memory interface, though there is always the chance of a more efficient one in lieu of more memory bandwidth. Still, I doubt that nVidia could convince so many people that their NV30 would be absolutely superior to the R300 if they were using a 128-bit bus.
 
Samsung has claimed 1 GHz DDR-II memory. If nVidia uses that for the NV30, they won't be too far behind the 9700 (~30%) in raw bandwidth even with a 128-bit bus. If they envisioned a future where, with that much bandwidth, you would run out of shader performance before you ran out of memory bandwidth, it is not unreasonable to assume that they would design the NV30 with a 128-bit bus.
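A rough sanity check of that figure, assuming "1 GHz DDR-II" means an effective 1 GHz data rate and that the 9700's memory runs somewhere around 300-350 MHz DDR (both are assumptions, not specs):

```python
nv30_guess = 128 / 8 * 1000e6 / 1e9                             # ~16.0 GB/s on a 128-bit bus
r300_range = [256 / 8 * mts * 1e6 / 1e9 for mts in (600, 700)]  # ~19.2 to ~22.4 GB/s
print([round(100 * (1 - nv30_guess / r)) for r in r300_range])  # roughly 17 to 29 percent behind
```

So the "not too far behind" claim holds up, give or take the exact clocks.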

It will be interesting to see if the theoretical cost savings of keeping to a 128-bit bus will show up in the price of the NV30 compared to the price of the 9700.

nVidia has several times in the past bet on advancing memory technology to meet the bandwidth requirements of their video chips. Perhaps in this instance the advance of packaging technology (with BGA memory in the Parhelia and 9700) outstripped the advance of memory technology.
 
Even if nVidia could get 500+ MHz DDR memory running with the NV30, we have to remember that ATI can plug that into their near-future releases as well... With a 256-bit bus already implemented, I don't see any reason why they would go back to a 128-bit bus (in the high end, that is). So memory-wise ATI has things pretty well covered. Easily over 30 GB/s of bandwidth isn't that bad :)
 
The other possibility is that they're treating the 128-bit physical bus as two logical buses with two independent clock phases, using the memory chips' output enables to get the data onto the bus at the correct time. This would obviously require you to be able to toggle the output enable at twice the DDR's burst data rate (so for 300 MHz DDR that means 1200 MHz); I have no idea if this is physically possible, and it would probably require some rather special memory pads in the graphics chip itself.
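Purely as a conceptual sketch of that idea (it doesn't model real DDR signalling, and all names are made up): two logical buses time-share the one physical bus, each bank's output enable asserted on its own half of the cycle, so the shared pins carry twice as many transfers per clock.

```python
# Toy model: two logical buses sharing one physical bus via output enables
# driven on opposite phases. Not a model of real memory signalling.
def share_bus(bank_a, bank_b):
    """Interleave the per-slot words from two banks onto one physical bus."""
    shared = []
    for word_a, word_b in zip(bank_a, bank_b):
        shared.append(word_a)  # phase 0: bank A's output enable asserted
        shared.append(word_b)  # phase 1: bank B's output enable asserted
    return shared

# Each bank is already DDR (two words per clock), so the shared bus carries
# four words per clock -- hence output enables toggling at twice the burst
# data rate (1200 MHz for 300 MHz DDR), as noted above.
print(share_bus(["A0", "A1"], ["B0", "B1"]))  # ['A0', 'B0', 'A1', 'B1']
```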

Note that if they do this, it isn't a new idea, even in graphics chips, as the Weitek 9100 did it back in 1993 (admittedly only with EDO memory running at something like 60 MHz).

John.
 
Chalnoth said:
Basically, I have said for a long time that a 256-bit bus was not a good idea, primarily due to cost concerns. Quite simply, it's very expensive to produce video cards on a 256-bit bus.

I cut the rest. It made good sense.
I would argue the point above though - "very expensive" is always relative. Is there a cheaper way to get the same results? Is the cost high relative to a 128-bit PCB (probably true at the moment) or relative to the total component cost (false at the moment)?

I'd say that going 256-bit wide offers a good combination of simplicity in implementation, low price, and lack of IP-complications and is arguably the best way to get that kind of bandwidth on a gfx-card at this point in time.

While you are correct that hiking the clock is generally the cheaper route, it is dependent on parts availability and prices, and is not without engineering obstacles of its own.

Entropy
 
JohnH said:
The other possibility is that they're treating the 128-bit physical bus as two logical buses with two independent clock phases, using the memory chips' output enables to get the data onto the bus at the correct time. This would obviously require you to be able to toggle the output enable at twice the DDR's burst data rate (so for 300 MHz DDR that means 1200 MHz); I have no idea if this is physically possible, and it would probably require some rather special memory pads in the graphics chip itself.

Note that if they do this, it isn't a new idea, even in graphics chips, as the Weitek 9100 did it back in 1993 (admittedly only with EDO memory running at something like 60 MHz).

John.

Sounds like QBM, with the combiner chip embedded into the graphics chip instead of being a discrete device, so this should be possible. But QBM (on motherboards) only works up to 133 MHz, with 166 MHz coming in the near future. Then again, graphics RAM has always been way faster than main RAM.
 
I think the constraint on mobos for this is that you want to be able to upgrade your memory, which makes it tough to go for a higher clock; graphics cards, on the other hand, work with a very fixed load...

John.
 
Chalnoth said:
Basically, I have said for a long time that a 256-bit bus was not a good idea, primarily due to cost concerns. Quite simply, it's very expensive to produce video cards on a 256-bit bus.


At the same time, however, there are two other issues. Firstly, it is possible to significantly reduce the total bandwidth usage of today's video cards through better optimizations. Possible ways of optimizing include partial deferred rendering (I don't support full deferred rendering, btw...), frame/z-buffer compression, occlusion detection, early z-rejection, hierarchical Z, and so on. In this way, it may be possible to drastically reduce the memory bandwidth hit required for MSAA, reducing the need for insanely high memory bandwidth.

Well, it's true that if you have more pins, then you have to change the packaging of the die, and more pins = more money. Having to route more signals on a PCB, which probably requires more layers, also adds cost. But what about the other alternatives you mentioned? Higher-speed memory is costly as well, and its price and availability are out of the IHV's control. And if you do have bandwidth-saving features inside the core, those have a cost too (using up die space and gate count that could be used for another feature). So which is more costly? I don't know. Keep in mind that the longer you build something and the more of it you build, the cheaper it gets to build.


Increasing the frequencies is inherently cheaper than increasing the number of pins because the primary costs involved are inside the chips themselves. Still, at the same time, it does require collaboration between graphics chip companies and memory manufacturers, meaning that this is the harder of the two improvements in memory bandwidth to make happen.

Don't forget that a high clock rate can cause signal interference. However, you can also generate EMI by doubling your buses, so both of these have to be weighed in the design. Maybe for graphics this is not an issue, but when Rambus mobos were under development, I know a couple of my RF engineering buddies were asked to work for them to help sort out some signaling issues...
 