Ailuros said:
A TBDR won't shade fragments that end up being occluded. An IMR often will, unless a "software deferred" rendering style is being used, ala Doom3. So in fragment shader-heavy scenes, a TBDR can achieve the same realized fillrate as an IMR that has greater fragment shader execution resources.
Let's make it more specific then: how large would you predict the difference in transistor count to be between a PS/VS 3.0 TBDR and an equivalent IMR?
Err. No idea.
I mean, first there's the fact that I don't think there's any fair definition of "equivalence" between a TBDR and an IMR, because the efficiency advantages of a TBDR vary depending on the scene data, rendering techniques, and settings in question. What's the level of pure overdraw inherent in the scene? Are the polys sent to the card in back-to-front order, front-to-back, or somewhere in between? Is a Doom3 style z-only first pass being used? Is the IMR bandwidth-limited or fillrate-limited?
Note that a TBDR's efficiency benefits come only on the rasterize/render side of the rendering process, not on the geometry side. So a TBDR's benefits should tend to grow at higher resolutions. (I think. I haven't thought that through completely.) Also note that, merely by devoting enough on-chip hardware, a TBDR has the option of providing multisampling AA entirely on the chip, and thus at essentially no performance cost. (The performance hit of MSAA on an IMR comes from the extra off-chip bandwidth that is used.) So on the one hand, a well-designed TBDR is going to have a higher transistor count than otherwise, in order to take advantage of this feature, which will give it a huge benefit at high levels of MSAA, but no benefit at all if AA isn't turned on. On the other hand, a well-designed IMR is going to dedicate tons of logic to various workload-reducing algorithms which are unnecessary on a TBDR. And so forth.
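To make the MSAA-bandwidth point concrete, here's a back-of-envelope sketch. Every number in it (resolution, framerate, bytes per sample, overdraw factor) is an assumed placeholder, not a measured figure for any real chip; the point is only the shape of the comparison--an IMR's external traffic scales with sample count, while a TBDR that resolves in tile memory writes each final pixel once.

```python
# Illustrative sketch, not real hardware data: extra off-chip traffic
# for MSAA on an IMR vs. a TBDR that resolves entirely on-chip.

def imr_msaa_bandwidth_gb_s(width, height, samples, fps,
                            bytes_per_sample=8, overdraw=2.0):
    """Off-chip color+depth traffic for an IMR that reads and writes
    every sample to external memory (assumes 4B color + 4B depth,
    each touched sample written once and read once)."""
    samples_touched = width * height * samples * overdraw
    bytes_total = samples_touched * bytes_per_sample * 2 * fps
    return bytes_total / 1e9

def tbdr_msaa_bandwidth_gb_s(width, height, fps, bytes_per_pixel=4):
    """A TBDR keeps all samples in on-chip tile memory and writes only
    the resolved pixel, so traffic doesn't scale with sample count."""
    return width * height * bytes_per_pixel * fps / 1e9

# 1024x768 @ 60 fps with 4x MSAA -- assumed values throughout
print(round(imr_msaa_bandwidth_gb_s(1024, 768, 4, 60), 1))   # ~6.0 GB/s
print(round(tbdr_msaa_bandwidth_gb_s(1024, 768, 60), 2))     # ~0.19 GB/s
```

Under these toy assumptions the IMR's MSAA framebuffer traffic is over an order of magnitude larger, which is exactly where its MSAA performance hit comes from.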
Second, I really couldn't say as I don't have a good notion of what fraction of overall transistor budgets are currently dedicated to which functions on a GPU. For some ridiculous reason I can never understand, the consumer 3d hardware industry refuses to release any technical information whatsoever about their products (even though the competition could easily reverse-engineer most info and the industry moves so quickly that any "trade secret" embodied in a current product would be useless by the time it could be incorporated into a competitor's future chip). So, AFAIK, no such information is publicly available.
One could make reasonable guesses based on overall transistor counts for various GPUs of various configurations (assuming the configuration details are publicly available, which has proved famously untrue of a certain IHV's chips of late). But that wouldn't take you so far, and I'm not going to pretend I know how the upgrade to PS/VS 3.0 is going to affect transistor counts in the vertex and fragment shader pipelines. (I mean, yes, it primarily means the addition of texture address calculation and sampling units in the vertex shaders, and of the logic to implement dynamic flow control in the fragment shaders. But how a balanced design will be crafted from the combination of the existing shader pipelines and those new requirements is anybody's guess. There's more than one way to skin a cat. And, among other things, the IHVs (but not me) have run hundreds of thousands of cat-skinning simulations to help them decide on the best method.)
Essentially there are two cop-out approaches to answering your question. One is to note that a TBDR requires the same hardware resources on the geometry side, and fewer on the fragment side by the factor of the IMR's actual overdraw, to achieve the same performance. Factor in the cost of sorting/tiling logic, z-caches, etc. on a TBDR, and the cost of framebuffer compression, overdraw-reducing algorithms, and a hierarchical z-cache on an IMR, and you've got an answer rough enough to be pretty meaningless.
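The first cop-out, in arithmetic form (with made-up inputs; the function name and numbers are illustrative assumptions, not anyone's real design data):

```python
# Sketch of the "same performance, fewer fragment resources" argument,
# assuming the TBDR shades each visible fragment exactly once.

def equivalent_tbdr_pipes(imr_pipes, imr_overdraw):
    """Fragment pipes a TBDR would need to match an IMR's realized
    fillrate, under the idealized zero-wasted-shading assumption."""
    return imr_pipes / imr_overdraw

# e.g. an 8-pipe IMR whose scenes average 2.5x actual overdraw
print(equivalent_tbdr_pipes(8, 2.5))   # 3.2
```

That 3.2 is exactly the kind of number that looks precise but isn't: the real answer moves with the caveats listed above (the scene's overdraw, submission order, z-only passes, and the extra sorting/tiling hardware the TBDR itself needs).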
The other approach is to note that transistor count tracks closely (albeit not linearly) with IC cost. Unfortunately, the other determinants of chip cost are yields--about which we can't presume anything useful--and volume, partially because of volume fab discounts, but primarily in terms of how many chips you have to amortize design costs over. If we presume PowerVR is providing the TBDR, and ATI or Nvidia the IMR, then it's clear that PVR is going to sell fewer units and thus have higher costs for a chip with the same transistor count. Still, it's reasonable to assume that GPUs selling into the same market segment have broadly similar transistor counts. At which point your question devolves--with a great deal of fudging and hand-waving--into one about price/performance.
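The volume/amortization point can likewise be reduced to a one-line cost model. All inputs here are invented placeholders--no real die costs, yields, design budgets, or unit volumes are implied--the sketch only shows why the lower-volume vendor pays more per chip for the same silicon.

```python
# Toy per-chip cost model: silicon cost per good die, plus design cost
# amortized over unit volume. Every number below is a made-up example.

def per_chip_cost(die_cost, yield_rate, design_cost, volume):
    """Cost per sellable chip = die cost adjusted for yield, plus the
    vendor's design cost spread across every unit shipped."""
    return die_cost / yield_rate + design_cost / volume

# identical silicon and yield, but a 10x difference in volume
high_volume = per_chip_cost(30.0, 0.7, 50e6, 5_000_000)
low_volume  = per_chip_cost(30.0, 0.7, 50e6, 500_000)
print(round(high_volume, 2), round(low_volume, 2))
```

With these placeholder numbers the low-volume vendor's amortized design cost dominates, which is the squeeze a smaller IHV like PowerVR faces even at an identical transistor count.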