More SLI

psurge said:
I thought that ATI had already implemented some kind of chip-to-chip interface... couldn't you just supply board connectors, hook up a "master" board to the PCI-E slot and stick the other ATI board(s) wherever?
They haven't done anything in this area for some time (not since the ATI Rage Fury MAXX have I heard of any product like this...). There were some professional boards that used multiple ATI chips, but these used 3rd-party technology for operation, if I remember correctly.
 
SlmDnk said:
Why not just make dual core GPUs??? 8)

Because it's cheaper and more effective to just make "one" GPU with twice the number of pipelines.

With CPUs there are lots of dependencies between instructions.
You cannot just double the number of pipelines and get double the performance like you can with GPUs, but you can put two processors executing completely different instruction streams on one chip.
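To make the difference concrete, here's a toy Python sketch (my own illustration, nothing to do with real hardware): the dependent chain can't be split across pipelines because each step needs the previous result, while per-pixel work has no such dependencies and divides cleanly.

```python
# Toy illustration (not real hardware behaviour): a dependent chain of
# CPU-style instructions vs. independent per-pixel GPU-style work.

def dependent_chain(x, steps=8):
    # Each step needs the previous result, so adding a second pipeline
    # cannot make this any faster.
    for _ in range(steps):
        x = x * 3 + 1
    return x

def shade_pixel(px):
    # Each pixel is shaded independently of every other pixel...
    return (px * 3 + 1) % 256

def shade_frame(pixels, num_pipelines=2):
    # ...so the work splits cleanly across any number of pipelines
    # (round-robin slices here standing in for extra quads/pipelines).
    slices = [pixels[i::num_pipelines] for i in range(num_pipelines)]
    return [shade_pixel(px) for s in slices for px in s]

print(dependent_chain(1))
print(shade_frame(list(range(16))))
```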
 
AFAIK ATI has been SLI-capable since the first Radeon. Does anyone remember the Radeon MAXX?

The interface also shouldn't be an issue, be it PCIe, some connector between the cards, or a two-chip card...
 
_xxx_ said:
Anyone remembers Radeon MAXX?

There was no such product. The MAXX was based on the Rage 128 Pro chip, it didn't work under Windows 2000 as I recall, and performance was far from 2x anyway. Doing alternate frame rendering should also add an extra frame of latency; ungood.
 
Right, now I remember - it was called the Rage Fury MAXX. They had plans to do a Radeon version of that AFAICR, though they cancelled it.

I don't think SLI is much of a problem to implement with any GFX chip. And it doesn't give 2x performance with any.
 
IMO Bitboys had the right idea as far as parallelizing direct mode rendering is concerned: they wanted to use sort-middle, although it needs some serious bandwidth (basically the entire output of vertex shading has to be communicated in the worst case). It doesn't really scale all that well, or at least you will need a switching network to distribute transformed vertices if you go beyond a couple of chips, but parallelizing direct mode rendering is problematic whatever you do (sort-last needs a lot of bandwidth too, and sort-first has ridiculous storage requirements).

Better just to give up the idea of throwing an unsorted polygon soup at the 3D card and start parallelizing at a much higher level IMO.
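For illustration, here's a minimal sort-middle sketch in Python. The tile size, the ownership mapping, and the function names are all my own assumptions, not anything Bitboys actually did; the point is just that a triangle touching tiles owned by several chips has to be sent to all of them, which is where the bandwidth cost comes from.

```python
# Minimal sort-middle sketch (a toy model, not any vendor's actual design):
# transformed triangles are routed to whichever chips own the screen tiles
# their bounding box touches.

TILE = 64               # assumed tile size in pixels
GRID_W, GRID_H = 8, 8   # assumed 512x512 screen split into 8x8 tiles

def owner_chip(tile_x, tile_y, num_chips):
    # Simple interleaved ownership; a real design could use any mapping.
    return (tile_x + tile_y * GRID_W) % num_chips

def route_triangle(tri, num_chips):
    # tri = three (x, y) post-transform screen-space vertices.
    xs = [v[0] for v in tri]
    ys = [v[1] for v in tri]
    x0, x1 = int(min(xs)) // TILE, int(max(xs)) // TILE
    y0, y1 = int(min(ys)) // TILE, int(max(ys)) // TILE
    # Every chip owning a touched tile must receive the whole triangle,
    # which is where the heavy vertex-broadcast bandwidth comes from.
    return {owner_chip(tx, ty, num_chips)
            for ty in range(y0, y1 + 1)
            for tx in range(x0, x1 + 1)}

# A large triangle ends up being sent to both chips:
print(route_triangle([(10, 10), (400, 30), (200, 300)], num_chips=2))
```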
 
MfA, is 3Dlabs' Wildcat Realizm similar to what you're thinking? In the complete implementation there is a single vertex processing chip that feeds multiple rasterization chips.
 
I don't think this has much direct commercial (i.e. money-making) application, but it is totally awesome and that is bound to have an indirect effect. If I had the cash I would totally grab a couple of NV boards and SLI them. Props to Nvidia for providing this option :LOL: :D :p
 
I have reason to believe that basically all ATI chips since the Rage Pro have supported multi-chip configurations. They just haven't used that option, because in marketing terms there hasn't been any economical use for it.

And when support for dual (or even more) chip configurations is in the core, I don't see any reason why it wouldn't be possible to connect two cards to each other via a ribbon cable or similar. Of course, the cards need the connectors, but after that, if the multi-chip interface in the core has been designed right, adding a connector to the PCB should not be a real problem (most likely the additional balls are already in the chip packages; you just need to wire them to the connector).

So, what do I expect? Well, IF ATI needs multi-GPU rendering, we will see one. Most likely ATI does not see SLI as a viable option right now, because it would push the total price of the graphics subsystem of a computer beyond 900 Euros. That's a price range that attracts way too few customers in the gaming market to keep it profitable while battling nVidia on prices at the same time.
 
Well, splitting up rendering tasks between multiple chips on the same board and splitting them between different cards are fundamentally different problems.
 
Hi, I'm new here but I've been reading for quite some time now...

Anyway, I thought I'd add something that I saw a little while ago:

http://www.hwupgrade.it/articoli/905/index.html

It was built by Sapphire as a proof of concept. It's apparently not "working", but they say they could get it to work - it would just cost WAY too much...

Interesting nonetheless that two R360s can run together "natively", i.e. with no external bridge... so is the "SLI" capability built into the R360? If so, maybe it's been there all along with all the Radeons, just not used?

So if this is taken into account, maybe SiS is building a system to tap into the SLI capabilities that are already there in the ATI chips? Just a thought.
 
synth said:
Interesting nonetheless that two R360s can run together "natively", i.e. with no external bridge... so is the "SLI" capability built into the R360? If so, maybe it's been there all along with all the Radeons, just not used?
R3x0 and R4x0 support up to 256 (I think?) GPUs in parallel, and there are a bunch of solutions exploiting this in the professional market.
 
The Sapphire board was actually a joke. However, there are systems with multiple R300/R350 chips - SGI's Onyx4 is one, and the E&S RenderBeast uses about 34, IIRC.

ATI's parallelism is the same internally, for multiple quads, as it is externally over multiple chips. Rather than allocating triangle quads from the setup engine on a round-robin or first-free basis, the screen space is divided into tiles and each quad within a chip is allocated certain tiles (if it's a 2-quad chip then they would alternate, AFAIK). This screen tiling and division will work across multiple chips as well, as some tiles will be assigned to the quads on one chip and others to the quads on the other chip(s). You can probably play around with how the tiles are distributed across the chips in order to make load balancing as effective as possible.
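A rough sketch of that interleaved assignment in Python (the tile size and the exact mapping are my guesses, not ATI's actual scheme), just to show how the same interleave can cover quads within one chip and quads spread across several chips:

```python
# Rough sketch of an interleaved tile assignment as described above
# (the exact mapping is an assumption, not ATI's documented scheme).

def assign_tile(tile_x, tile_y, quads_per_chip, num_chips):
    # Checkerboard-style interleave over all quads in the system, so the
    # same scheme covers one multi-quad chip or several chips.
    total_quads = quads_per_chip * num_chips
    slot = (tile_x + tile_y) % total_quads
    chip = slot // quads_per_chip
    quad = slot % quads_per_chip
    return chip, quad

# Single 2-quad chip: tiles just alternate between quad 0 and quad 1.
print([assign_tile(x, 0, quads_per_chip=2, num_chips=1) for x in range(4)])
# Two 2-quad chips: the same interleave now spreads tiles over 4 quads.
print([assign_tile(x, 0, quads_per_chip=2, num_chips=2) for x in range(4)])
```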

There is also another mode that ATI operates in for multisampling - the screen is divided into tiles again, but quads on different chips can be assigned the same tiles to render, each with a different multisampling pattern. The final image can then be composited back together and you'll end up with multiples of MSAA greater than the native capabilities of a single chip (so with two chips you could achieve true 8x and 12x FSAA).
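A toy sketch of that "same tile, different sample pattern" mode (the sample positions and the averaging step are purely illustrative, not ATI's real resolve):

```python
# Toy sketch of the "same tile, different sample pattern" mode described
# above. The sample offsets and the compositing step are illustrative only.

CHIP_A_SAMPLES = [(0.25, 0.25), (0.75, 0.75)]   # assumed 2x pattern, chip A
CHIP_B_SAMPLES = [(0.75, 0.25), (0.25, 0.75)]   # assumed 2x pattern, chip B

def resolve(coverage):
    # coverage(x, y) -> 1.0 if a sub-sample position is covered, else 0.0
    def pixel(samples):
        return sum(coverage(sx, sy) for sx, sy in samples) / len(samples)
    a = pixel(CHIP_A_SAMPLES)   # what chip A would output for this pixel
    b = pixel(CHIP_B_SAMPLES)   # what chip B would output for this pixel
    return (a + b) / 2          # composited: effectively 4 distinct samples

# An edge covering the left half of the pixel:
print(resolve(lambda x, y: 1.0 if x < 0.5 else 0.0))
```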
 
Is assigning tiles to chips handled by the driver or extra hardware? And I guess in a single chip, is assigning tiles to quads handled by the GPU or the driver?

Edit: Err, tiles not quads.
 
Given that the chip still needs to operate when there are fewer quads enabled than are natively designed into the chip (9500, X800 PRO), it's easy to guess that these things are configurable outside of the actual chip, as this affects how the tiling is distributed even on a single chip.
 
Not disputing what you're saying, but couldn't a simple BIOS setting tell the chip how many quads to alternate tiles between?

... and I don't think you really answered my Q's. Am I missing something obvious? (wouldn't be the first time :LOL: )
 
I'm saying that it's obviously something fairly trivial that controls the number of quads the tiles are distributed over, since we have boards that use a different number of quads than are in the chip natively, and we have also had people alter these with software/BIOS updates.
 
http://www.gamespot.com/news/2004/10/08/news_6110146.html

Nvidia SLI benchmark results

...
The Opteron 250 GeForce 6800 GT SLI system scored an impressive 8,081 points in the recently released 3DMark05 benchmark. In comparison, a single GeForce 6800 GT scored 4,452 points on a slower 2.8GHz Intel Pentium 4 processor in a recent GameSpot 3DMark05 test. The SLI system also scored 18,176 points in 3DMark03, roughly 7,000 points higher than the average score for a single GeForce 6800 GT card.
...
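Quick back-of-the-envelope on those quoted numbers (bearing in mind the two 3DMark05 scores come from different CPUs, so the ratio is only indicative):

```python
# Scaling estimate from the quoted figures; the 3DMark05 comparison mixes an
# Opteron 250 and a 2.8GHz Pentium 4, so it is not a clean apples-to-apples test.

sli_3dmark05 = 8081        # Opteron 250 + two 6800 GTs in SLI
single_3dmark05 = 4452     # 2.8GHz Pentium 4 + one 6800 GT

print(f"3DMark05 ratio: {sli_3dmark05 / single_3dmark05:.2f}x")

sli_3dmark03 = 18176
single_3dmark03 = sli_3dmark03 - 7000   # "roughly 7,000 points higher"
print(f"3DMark03 ratio: {sli_3dmark03 / single_3dmark03:.2f}x")
```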
 