The Open Graphics Project

Nick

Hi all,

I would like to hear some opinions about the Open Graphics project.

In short, the idea is to improve 3D graphics support on Linux and other open-source platforms by developing a new graphics card with nearly completely open specifications. To achieve this, a Xilinx Spartan-3 FPGA would be used for the first generation. The company making the cards, TechSource, has lots of experience with 2D graphics cards. Drivers would be developed by the open-source community, which is also involved in defining the specifications and design. They don't intend to compete with NVIDIA or ATI, and the card won't be targeted at games.

Some questions that are still (partially) unresolved:

- What is the vision of the project, the final goals?
- What other ways exist to reach the final goals?
- What target market(s) can be addressed?
- What performance is to be expected, desired?
- What version of OpenGL should be supported?
- What architecture would be preferred?
- What would be the development time?
- What is achievable or not within given restrictions? (the forbidden question)

Any comments from the experts will be highly appreciated.

Thanks,

Nicolas Capens
 
It strikes me as a bit optimistic, but perhaps not impossible, to try to fit an entire OpenGL 1.x-compliant rasterizer into a Spartan-3 FPGA. While the largest Spartan-3 is specified to have the equivalent of 5M logic gates, a commonly used rule of thumb is that you have to divide that number by about 4 to get a realistic estimate. Given such an FPGA and a competent 3D hardware design team, they might be able to achieve performance and image quality rivalling that of a Voodoo1.
 
We were hoping it would be more like TNT2 performance, with specifications as close as possible to OpenGL 2.0. The FPGA would most likely be an XC3S1500 or XC3S2000. That's 2 million gates instead of 5 million for the first generation, but isn't one FPGA gate the equivalent of several transistors on an ASIC? It's not a gamer card, but 3D performance should still be sufficient for popular 'desktop' 3D applications.

Anyone able to give some input to solve the unanswered questions?
 
Nick said:
We were hoping it would be more like TNT2 performance, with specifications as close as possible to OpenGL 2.0. The FPGA would most likely be an XC3S1500 or XC3S2000. That's 2 million gates instead of 5 million for the first generation, but isn't one FPGA gate the equivalent of several transistors on an ASIC? It's not a gamer card, but 3D performance should still be sufficient for popular 'desktop' 3D applications.

Anyone able to give some input to solve the unanswered questions?
Those 'gate' numbers are still a best-case estimate that is usually far from the truth unless you are really, really fond of shift registers. If you take the FPGA 'gate' count, divide by 5 to get the ASIC-equivalent gate count (a rule of thumb for most non-shift-register circuits, which roughly matches my own experiences with FPGAs and ASICs), and multiply by 4 transistors per gate, you end up at a complexity roughly akin to 1.6 million transistors (not including SRAMs, of which there is usually plenty in an FPGA). Also keep in mind that FPGAs don't easily run at high clock speeds; if you take e.g. a processor design that will do 1 GHz in an ASIC, it may only be able to do about 100-150 MHz in an FPGA.
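To make the arithmetic concrete, here is the same back-of-envelope estimate as a small Python sketch; the factors are the rules of thumb above, not datasheet figures:

```python
# Rough complexity estimate for an XC3S1500/XC3S2000-class FPGA design.
# All factors are rules of thumb from the discussion, not measured values.

fpga_marketing_gates = 2_000_000  # vendor 'system gate' count
fpga_to_asic_factor = 5           # FPGA gates -> ASIC-equivalent gates
transistors_per_gate = 4          # transistors per ASIC gate equivalent

asic_equiv_gates = fpga_marketing_gates / fpga_to_asic_factor
transistor_budget = asic_equiv_gates * transistors_per_gate

print(f"ASIC-equivalent gates: {asic_equiv_gates:,.0f}")   # ~400,000
print(f"Transistor budget:     {transistor_budget:,.0f}")  # ~1,600,000 (excluding SRAM)
```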
 
Hmm, I'd be more pessimistic on clock, maybe 10-20 MHz. Utilization may also be lower than that rule of thumb suggests for a complex design...

John.
 
One actual data point wrt 3D accelerators in FPGAs: the PowerVR MBX was shown running in a Xilinx FPGA about two and a half years ago - back then, it ran at 14 MHz.
 
TechSource claimed the FPGA could run at around 200 MHz. Of course, mapping an ASIC design directly to an FPGA requires a significant frequency reduction, but wouldn't it be possible to compensate for that by introducing new pipeline stages or other techniques?
 
3dcgi said:
FPGAs don't tend to be cheap. How much would one of these cards cost?
There was a poll on the mailing list and I believe most people were willing to pay $200. This isn't about high performance, of course, but about having adequate 3D support on open-source platforms. Later versions could use an ASIC and have more competitive prices and performance. Still, they intend to keep targeting a niche market.

Anyway, wouldn't it be cool to have a graphics card you can program all by yourself? An FPGA even lets you reprogram the hardware in a matter of minutes, if you have the tools. Aren't there many enthusiasts here who would just love to have one to play with?

Any suggestions for the architecture? How would you combine the best flexibility and performance within the restrictions of the FPGA? Thanks for all suggestions.
 
Nick said:
TechSource claimed the FPGA could run at around 200 MHz. Of course, mapping an ASIC design directly to an FPGA requires a significant frequency reduction, but wouldn't it be possible to compensate for that by introducing new pipeline stages or other techniques?
The FPGA can run at 200 MHz in the sense that its registers and clock generators are theoretically able to sustain such a clock speed if you have practically no logic between pipeline steps. But getting anything real to work at such speeds in an FPGA generally requires that you write your design more or less directly at the netlist level and carefully check that every addition/change you make doesn't blow up the tiny timing budget you have (200 MHz -> 1 register + about 4 logic cells of delay in the FPGA; in ASIC terms, this would roughly match the timing budget of a Pentium 4, and likely require roughly the same design effort as well).
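To put rough numbers on that budget, a quick sketch (the per-element delays are illustrative assumptions for a Spartan-3-class part, not datasheet values):

```python
# Per-stage timing budget at a target clock frequency.
# Delay figures are illustrative assumptions, not Spartan-3 datasheet numbers.

target_mhz = 200
clock_period_ns = 1000 / target_mhz              # 5.0 ns per pipeline stage

reg_overhead_ns = 1.0   # flip-flop clock-to-out + setup (assumed)
lc_delay_ns = 1.0       # one logic cell plus local routing (assumed)

logic_levels = (clock_period_ns - reg_overhead_ns) / lc_delay_ns
print(f"Clock period:           {clock_period_ns:.1f} ns")
print(f"Logic levels per stage: {logic_levels:.0f}")  # ~4, as estimated above
```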
 
I think the project is rather strange. The idea is to have a card with completely open documentation. The price they pay is that it will be A LOT slower than any other card you'd put in your computer. They diss cards that are "relatively well supported by open-source drivers", like the Matrox G550, since it's "technologically stagnant, expensive to buy, and will never support PCI-express", but I doubt they will be able to make a card that can touch it.

Now if it were possible for the end user to exploit the nature of an FPGA and do some experimenting with the architecture, it would have a cool geek factor. But that doesn't seem to be a goal.

It seems like they added 3D mostly to be compatible with 3D desktops. So it's a rather rudimentary support. Like this:
Texture mapping
• Simple linear interpolation
• Second order differential for approximation of perspective correction (possible feature)
• Bilinear interpolation

I don't want to spoil the fun, but I doubt we'll ever see this one. (It could still be a fun project for anyone involved though.)
 
arjan de lumens said:
The FPGA can run at 200 MHz in the sense that its registers and clock generators are theoretically able to sustain such a clock speed if you have practically no logic between pipeline steps. But getting anything real to work at such speeds in an FPGA generally requires that you write your design more or less directly at the netlist level and carefully check that every addition/change you make doesn't blow up the tiny timing budget you have (200 MHz -> 1 register + about 4 logic cells of delay in the FPGA; in ASIC terms, this would roughly match the timing budget of a Pentium 4, and likely require roughly the same design effort as well).
Thanks for the information!

It makes sense that a GPU performs more work per clock cycle than a Pentium 4. So to implement a GPU on an FPGA, it would either have to run slower or perform less work per clock cycle than usual. Would it be possible to increase the clock frequency by adding extra stages, so that simple, pipelined operations run faster, or are there hard technical limitations?

I guess that question applies to existing GPU architectures as well... Supposing things like a dot3 operation are time critical, would it help to pipeline them and 'double' the frequency? Disregarding heat issues of course.

Considering the space limitations of the FPGA, would it make sense to split shader operations into more elementary operations? Then it could use a more generic floating-point SIMD unit to do everything. I guess such an architecture is easier to pipeline.
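As a minimal sketch of what that decomposition might look like: a dp3 broken into one multiply and two multiply-adds on a generic MAD unit (the decomposition is hypothetical, not anything from the OGP specifications):

```python
# Hypothetical decomposition of dp3 (3-component dot product) into the
# elementary operations a generic MAD unit could execute, one per cycle.

def dp3(a, b):
    t = a[0] * b[0]           # cycle 1: MUL
    t = a[1] * b[1] + t       # cycle 2: MAD (depends on previous result)
    t = a[2] * b[2] + t       # cycle 3: MAD (depends on previous result)
    return t

print(dp3((1.0, 2.0, 3.0), (4.0, 5.0, 6.0)))  # 32.0
```

Note the dependent chain: each step needs the previous result, so a pipelined unit would need other work in flight to stay busy.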
 
Basic said:
I think the project is rather strange. The idea is to have a card with completely open documentation. The price they pay is that it will be A LOT slower than any other card you'd put in your computer. They diss cards that are "relatively well supported by open-source drivers", like the Matrox G550, since it's "technologically stagnant, expensive to buy, and will never support PCI-express", but I doubt they will be able to make a card that can touch it.
The problem is more that the older Matrox cards with good open-source drivers are becoming hard to get, so there's no future in continuing the driver development. It's never going to support the newest OpenGL versions anyway. They hope the success of the FPGA-based board will be enough to attract investors, so the second generation can have an ASIC based on a fairly modern process.
Now if it were possible for the end user to exploit the nature of an FPGA and do some experimenting with the architecture, it would have a cool geek factor. But that doesn't seem to be a goal.
There's no promise from TechSource that the FPGA design will be open-source. All specifications and documentation will be available though, so you can experiment with the drivers any way you like.
It seems like they added 3D mostly to be compatible with 3D desktops. So it's a rather rudimentary support. Like this:
Texture mapping
• Simple linear interpolation
• Second order differential for approximation of perspective correction (possible feature)
• Bilinear interpolation
Those are the old specifications. A few others and I have corrected most of that. OpenGL 2.0 support is now under consideration. We're looking for an architecture that offers the best flexibility within the space restrictions and still provides adequate performance for such low-end applications.
I don't want to spoil the fun, but I doubt we'll ever see this one. (It could still be a fun project for anyone involved though.)
These open-source guys can achieve anything. ;) Anyway, I don't dare draw any conclusions about practical feasibility without first knowing the theoretical feasibility. The overall architecture is the next step now...

Thanks!
 
Would it be possible to increase the clock frequency by adding extra stages, so that simple, pipelined operations run faster, or are there hard technical limitations?
As arjan mentioned, there are hard limitations due to the design of FPGAs. It takes X amount of time to read from a flip-flop, run the signal through one LC's worth of logic, and arrive at the next nearest flip-flop. Usually, this determines the maximum advertised frequency (300-500 MHz). If you want to do anything fancier than an inverter or a 2:1 mux, you'll have to cut your frequency.

Arguably, you can pipeline down to the basic blocks, but then you have a problem: you need to find work to fill the pipeline.

So you end up either leaving the pipeline mostly empty (which is bad), or you need enormous storage space to carry the state for each and every thread you'll run.

I guess that question applies to existing GPU architectures as well... Supposing things like a dot3 operation are time critical, would it help to pipeline them and 'double' the frequency? Disregarding heat issues of course.
Sure! You just run into the same issues as above. How deep do you want to pipeline your dp3? Can you find enough work to fill the pipeline? Is your register file large enough and fast enough to feed your pipeline?
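To make the trade-off concrete, a small sketch of how much on-chip state a deeply pipelined unit needs just to stay busy (all numbers are illustrative assumptions, not OGP figures):

```python
# State needed to keep a deep pipeline full with dependent instruction chains.
# All parameters are illustrative assumptions.

pipeline_depth = 12      # stages in a hypothetical pipelined MAD unit
regs_per_thread = 8      # temporaries per thread/fragment (assumed)
bits_per_reg = 128       # one 4-component vector of 32-bit floats

# With back-to-back dependent ops, you need at least one independent
# thread per pipeline stage to avoid bubbles.
threads_in_flight = pipeline_depth
state_bits = threads_in_flight * regs_per_thread * bits_per_reg

print(f"Threads in flight: {threads_in_flight}")
print(f"Register storage:  {state_bits} bits ({state_bits // 8} bytes)")
```

Double the pipeline depth to double the clock, and the register storage (and the bandwidth needed to feed it) doubles with it.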
 
One problem with having a deep pipeline in an FPGA is that the number of flip-flops is limited. Long pipelines need lots of flip-flops. Plus, control logic can get very difficult in some cases. Splitting dataflow logic like ALUs into stages is easier.
 
Flip-flops for e.g. pipeline stages aren't particularly expensive in FPGAs; for the Xilinx FPGAs that have been suggested, there is a flip-flop available in every logic cell, so adding extra pipeline stages will have a relatively smaller impact on circuit size than in ASICs. On the other hand, register files with more than 2 ports tend to be really painful to work with in FPGAs.
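One standard workaround for the multi-port problem, as a sketch: keep one copy of the register file per read port and broadcast every write to all copies. The parameters below are illustrative; in hardware, each copy would map to one dual-port (1 write + 1 read) block RAM.

```python
# Model of a common FPGA trick: build an N-read-port register file from N
# identical copies of 1-write/1-read storage, broadcasting writes to all.
# Parameters are illustrative, not from any actual design.

class ReplicatedRegFile:
    def __init__(self, num_regs, num_read_ports):
        self.copies = [[0] * num_regs for _ in range(num_read_ports)]

    def write(self, addr, value):
        for copy in self.copies:        # broadcast the write to every copy
            copy[addr] = value

    def read(self, port, addr):
        return self.copies[port][addr]  # each read port has a private copy

rf = ReplicatedRegFile(num_regs=16, num_read_ports=3)  # 3 reads, e.g. a*b+c
rf.write(5, 42)
print(rf.read(0, 5), rf.read(1, 5), rf.read(2, 5))     # 42 42 42
```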

As for hard technical limitations, if you want to make a GPU in an FPGA, you will want to use the FPGA's on-chip multipliers as much as possible and roll your own as infrequently as possible; IIRC these multipliers are limited to about 80-100 MHz.

As for control logic, you will need to take care that you don't spend too much circuit delay in complicated state machines; if you don't develop your hardware algorithms carefully, this is likely to blow up in your face, and you'll get stuck with a 15 MHz GPU that resists all attempts at optimization.
 