"..combines four Radeon 8500...based on the 9700..this

T2k

Veteran
ExtremeTech has a good report from SIGGRAPH, but the most interesting thing is this one:

[Image: photo of the board]


"This board, made by CAE combines four Radeon 8500 GPUs onto a single PCI board. It's designed for workstation markets to accelerate complex rendering tasks. A version based on the 9700 will be available this fall."

I still don't understand something: how can this card be much faster when it uses one single stupid slow PCI slot?

Could somebody help me out with a viable idea... thx. :)
 
DaveBaumann said:
For starters, that's a 64-bit PCI slot...

Of course. But it's still slow...

I don't understand how you can use this thing efficiently (I mean real-time) if you have to deal with the PCI's bandwidth... :-?
 
With AGP 4X the bandwidth is up at 1066 MB per second...

With 64-bit PCI the bandwidth is essentially the same: the 64-bit PCI-X architecture runs at speeds up to 133 MHz, with transfer rates above 1 gigabyte per second.

So it's essentially capable of 4X AGP transfer speeds, which isn't so bad... AGP 8X would be nice but the spec just got released :)
 
Of course. But it's still slow...

I don't understand how you can use this thing efficiently (I mean real-time) if you have to deal with the PCI's bandwidth

Your lack of understanding stems from your lack of knowledge about the subject.

PCI is an ambiguous term in this instance. PCI can come in many flavours. Beyond simply going 64-bit, it can have a higher clock. Like I previously stated, this could easily be PCI-X, which IIRC is 64-bit with a 133 MHz bus. That is comparable to, if not better than, AGP 4X.
 
So it's essentially capable of 4X AGP transfer speeds, which isn't so bad... AGP 8X would be nice but the spec just got released

AGP would prevent running multiple boards simultaneously as well - you'll likely find that they can run these with more than just one board installed.
 
Saem said:
Your lack of understanding stems from your lack of knowledge about the subject.

PCI is an ambiguous term in this instance. PCI can come in many flavours. Beyond simply going 64-bit, it can have a higher clock. Like I previously stated, this could easily be PCI-X, which IIRC is 64-bit with a 133 MHz bus. That is comparable to, if not better than, AGP 4X.

In fact AGP could be seen as an 'extended' PCI bus with multiple data transfers per clock and improved control mechanisms.

About the bandwidth if PCI is 32 bits at 33 MHz and AGP is 32 bits at 66 MHz it would be like this:

PCI 32 bits 33 MHz ~ 100 MB/s
AGP 32 bits 66 MHz ~ 250 MB/s (single data per clock)
AGP 4X 32 bits 66 MHz ~ 1 GB/s (quadruple data per clock)
AGP 8X 32 bits 66 MHz ~ 2 GB/s (octuple data per clock)
PCI 64 bits at 133 MHz ~ 8 GB/s (whoww!!)

It would be the same peak bandwidth as SDRAM 133 (data bus with 64 lines at 133 MHz, single data per clock).
 
Saem said:
Of course. But it's still slow...

I don't understand how you can use this thing efficiently (I mean real-time) if you have to deal with the PCI's bandwidth

Your lack of understanding stems from your lack of knowledge about the subject.

PCI is an ambiguous term in this instance. PCI can come in many flavours. Beyond simply going 64-bit, it can have a higher clock. Like I previously stated, this could easily be PCI-X, which IIRC is 64-bit with a 133 MHz bus. That is comparable to, if not better than, AGP 4X.

You are so sententious, sure. :rolleyes: Why? Eh, who cares... :rolleyes:



PCI-X 2.0 (266/533) will double/quadruple that, but it's still in the draft phase.

Well, maybe you didn't understand me 'cause of my broken English... :) so, my question is the same:

Do you think that is enough for this card?

EDIT: I copied something in here instead of the other comment... :)
 
RoOoBo said:
PCI 32 bits 33 MHz ~ 100 MB/s
AGP 32 bits 66 MHz ~ 250 MB/s (single data per clock)
AGP 4X 32 bits 66 MHz ~ 1 GB/s (quadruple data per clock)
AGP 8X 32 bits 66 MHz ~ 2 GB/s (octuple data per clock)
PCI 64 bits at 133 MHz ~ 8 GB/s (whoww!!)

It would be the same peak bandwidth as SDRAM 133 (data bus with 64 lines at 133 MHz, single data per clock).

PCI 32 bits 33 MHz ~ 133 MB/s
AGP 32 bits 66 MHz ~ 266 MB/s
AGP 4X 32 bits 66 MHz ~ 1 GB/s
AGP 8X 32 bits 66 MHz ~ 2 GB/s
PCI 64 bits at 133 MHz ~ 1 GB/s
Look:
64bits = 8bytes.
8 * 133 MHz = 1064 million bytes/s ~ 1 GB/s
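
The same arithmetic for all the buses in the list can be checked with a short Python sketch. The function name and the nominal clock values (33.33 and 66.66 MHz for PCI and AGP) are my own assumptions for illustration; these are peak figures only, and sustained throughput on a real bus is lower.

```python
def peak_bandwidth_mb_s(bus_width_bits, clock_mhz, transfers_per_clock=1):
    """Peak bandwidth in millions of bytes per second.

    bytes per transfer * clock (MHz) * transfers per clock
    """
    return bus_width_bits / 8 * clock_mhz * transfers_per_clock


# (name, bus width in bits, clock in MHz, transfers per clock)
buses = [
    ("PCI 32-bit 33 MHz", 32, 33.33, 1),
    ("AGP 1X 32-bit 66 MHz", 32, 66.66, 1),
    ("AGP 4X 32-bit 66 MHz", 32, 66.66, 4),
    ("AGP 8X 32-bit 66 MHz", 32, 66.66, 8),
    ("PCI-X 64-bit 133 MHz", 64, 133, 1),
]

for name, width, clock, rate in buses:
    print(f"{name}: ~{peak_bandwidth_mb_s(width, clock, rate):.0f} MB/s")
```

The 64-bit, 133 MHz case works out to 8 bytes * 133 MHz = 1064 MB/s, in the same range as AGP 4X.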
 
PCI 64 bits at 133 MHz ~ 1 GB/s
Look:
64bits = 8bytes.
8 * 133 MHz = 1064 million bytes/s ~ 1 GB/s

Lol, I put 64 rather than 8 in the calc.

Edit: In fact, taking into account that RDRAM bandwidth is around 3.2 GB/s or 4.2 GB/s (for 800 and 1066), I should have seen that that 133 SDRAM was really fast ;)
 
Don't most people say that AGP 4x hardly gives anything over AGP 2x, and that even the difference from AGP 1x is small? So there's at least a good chance that 64-bit PCI at 133 MHz would be enough.

But what interests me the most is the two other chips with HSFs. They're hardly AGP... ehrm, PCI bridges. And are the chips around them RAM? Is this supposed to be much more of a standalone card than traditional gfx cards? Are the two extra chips CPUs whose only purpose is to feed the GPUs with data, dependent on just the parameters sent over the PCI bus?
If that's the case, and the card has enough memory to avoid texture/geometry swapping, then a PCI bus should be enough.
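
To put a rough number on that: dividing the peak bus bandwidth by the frame rate gives the amount of data that can cross the bus per frame. A minimal Python sketch, assuming a 60 fps target (my assumption, not something from the article) and peak rather than sustained bandwidth; the function name is made up for illustration:

```python
def per_frame_budget_mb(bus_mb_per_s, frames_per_s):
    """Millions of bytes that can cross the bus during one frame, at peak rate."""
    return bus_mb_per_s / frames_per_s


PCI_32_33 = 4 * 33.33    # 32-bit PCI at 33 MHz, ~133 MB/s
PCIX_64_133 = 8 * 133    # 64-bit PCI-X at 133 MHz, 1064 MB/s
TARGET_FPS = 60          # assumed real-time target

for name, bandwidth in [("PCI 32/33", PCI_32_33), ("PCI-X 64/133", PCIX_64_133)]:
    budget = per_frame_budget_mb(bandwidth, TARGET_FPS)
    print(f"{name}: ~{budget:.1f} MB per frame at {TARGET_FPS} fps")
```

If textures and static geometry already live in the card's own memory, only per-frame parameters and dynamic data have to fit inside that budget.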

Another thing, am I blind, or is there no video out on that card? I wouldn't be surprised if they were meant for multi-card setups. (Well, of course they would need at least one more card, but I meant more than that. :D )

But you'd need a rather large case to fit that beast and still get any airflow.

Btw:
Is this it? http://www.cae.com/en/visuals/tropos.shtml
 
Remember I'm not a pro....not even a layman I'd say....but...is rendering really all that dependent on bus speeds....?
I thought it was more a matter of the CPU sending data to the graph. card for it to process.(in a case like this I mean,workstation made for rendering and such...)
 
RiotSquad said:
Remember I'm not a pro....not even a layman I'd say....but...is rendering really all that dependent on bus speeds....?
I thought it was more a matter of the CPU sending data to the graph. card for it to process.(in a case like this I mean,workstation made for rendering and such...)

all of that has to travel via the bus ;)
 
My only gripe would be with the relatively inaccurate rendering of the Radeon 8500 (in comparison to the GeForce3/4 cards, and apparently its younger sibling, the R300). Hopefully this is mostly just a "prototype," and the boards they really expect to sell are ones with R300s in 'em.
 
Chalnoth said:
My only gripe would be with the relatively inaccurate rendering of the Radeon 8500 (in comparison to the GeForce3/4 cards, and apparently its younger sibling, the R300). Hopefully this is mostly just a "prototype," and the boards they really expect to sell are ones with R300s in 'em.

If you check out the article that this came from at ExtremeTech, you'll see they're actually working on a board based off the 9700 for this fall.
I can't even begin to imagine the rendering power behind an array of 4 or even 8 9700s in tandem.
 
Edit:
Ooops...never underestimate the speed of this forum's posters.... :)
Was a reply to Mulciber...

Yeah...I'm not *that* much of a n00b.... :p
What I meant is....in such a setup, isn't it more that the CPU (most likely very specialized and/or multiple) more or less just sends the pure raw data and the rendering is done by the graph. card....? (hence not needing a super fast bus compared to what a "normal system" would want/need when rendering)
Of course it'd still need a fast bus due to the config being able to send more data...but the rendering part (which I'd imagine goes to the GPUs in such a system) would still be the heavier work....? (hence the system waiting for the graph. card rather than the other way around...?)
I'm not a pro in any way, like I wrote....but that's how I thought these more powerful graphics stations, like the ones for CGI, work...

Edit #2:
Feel free to enlighten me btw...even though I'm very unlikely to ever use the info there's no such thing as too much knowledge imho. :)
 
Update:

I just had a thought about what those chips at the front of the board could be.

They could be designed to combine the framebuffers.

Here is how it might work:

1. Split up all the triangles in the scene, so that each video card accepts one quarter of the triangles. These triangles are rendered to a single frame and z-buffer on each chip.

2. Once rendering for one frame is done, all of the z-buffer and frame buffer information is combined into a single output frame by one or both of the chips in the front of the board.

I think this might be an optimal usage of fillrate and geometry rate. Otherwise you might have a very hard time leveraging the full geometry rate of all four processors.

You might also have to store alpha blending modes and similar things for later rendering, as you would get some very strange results otherwise.
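
The two steps above can be sketched in plain Python. This is only an illustration of the idea under my own assumptions, not how the board actually works: each "GPU" is simulated by a function that z-tests a list of already-rasterized, opaque fragments (x, y, z, color) instead of real triangles, and all function names are hypothetical.

```python
import math


def empty_buffers(width, height):
    """Create a cleared color buffer and a z-buffer set to 'infinitely far'."""
    color = [[None] * width for _ in range(height)]
    depth = [[math.inf] * width for _ in range(height)]
    return color, depth


def render_fragments(fragments, width, height):
    """Stand-in for one GPU: z-test its share of (x, y, z, color) fragments."""
    color, depth = empty_buffers(width, height)
    for x, y, z, c in fragments:
        if z < depth[y][x]:  # smaller z means closer to the viewer
            depth[y][x] = z
            color[y][x] = c
    return color, depth


def composite(buffers, width, height):
    """Step 2: merge per-GPU buffers, keeping the nearest fragment per pixel."""
    out_color, out_depth = empty_buffers(width, height)
    for color, depth in buffers:
        for y in range(height):
            for x in range(width):
                if depth[y][x] < out_depth[y][x]:
                    out_depth[y][x] = depth[y][x]
                    out_color[y][x] = color[y][x]
    return out_color, out_depth


# Step 1: split the scene round-robin so each of four GPUs gets a quarter.
NUM_GPUS = 4
WIDTH, HEIGHT = 2, 1
fragments = [
    (0, 0, 0.5, "red"),
    (0, 0, 0.2, "green"),
    (1, 0, 0.9, "blue"),
    (1, 0, 0.7, "white"),
    (0, 0, 0.8, "black"),
]
shares = [fragments[i::NUM_GPUS] for i in range(NUM_GPUS)]
buffers = [render_fragments(share, WIDTH, HEIGHT) for share in shares]

final_color, final_depth = composite(buffers, WIDTH, HEIGHT)
print(final_color)
```

Because the merge keeps only the nearest opaque fragment per pixel, it gives the same picture as one chip rendering everything, but it has no way to blend transparent surfaces in the right order, which is the alpha blending problem mentioned above.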
 