GraphixViolence said:
Merging the pixel and vertex shaders or increasing programmability doesn't seem like it will provide any significant benefit for at least another few years.

Merging pixel and vertex shaders would be more of a hardware-side efficiency improvement.
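As a rough illustration of why a unified shader pool can be a hardware-side efficiency win, here is a minimal sketch. All unit counts and workload figures are invented for the example; it only models the load-balancing argument, not any real chip:

```c
/* Toy model of split vs. unified shader units.  All numbers here are
 * hypothetical and only illustrate the load-balancing argument. */
#include <stdio.h>

/* Time to finish a frame given how many units work on each stage. */
static double frame_time(double vertex_work, double pixel_work,
                         int vertex_units, int pixel_units)
{
    double vt = vertex_work / vertex_units;
    double pt = pixel_work / pixel_units;
    return vt > pt ? vt : pt;   /* the slower stage limits the frame */
}

int main(void)
{
    /* A vertex-heavy frame and a pixel-heavy frame (arbitrary work units). */
    double frames[2][2] = { { 900.0, 300.0 }, { 200.0, 1000.0 } };

    for (int i = 0; i < 2; i++) {
        double v = frames[i][0], p = frames[i][1];
        /* Split design: 4 vertex units + 8 pixel units, fixed. */
        double split = frame_time(v, p, 4, 8);
        /* Unified design: the same 12 units shared across both stages
         * (idealized -- assumes perfect scheduling). */
        double unified = (v + p) / 12.0;
        printf("frame %d: split %.1f, unified %.1f time units\n",
               i, split, unified);
    }
    return 0;
}
```

Whichever pool a given frame doesn't stress sits partly idle in the split design, while the unified design can redistribute the same silicon.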
GraphixViolence said:
It could be argued that supporting formats with less than FP32 precision is just as much of a hardware efficiency improvement as merging pixel & vertex shaders. Every additional bit of precision you have to support takes up silicon that could instead be used for increasing performance. As the NV30 vs. R300 match-up showed, an 8-pipe 24-bit architecture can be a lot more attractive than a 4-pipe 32-bit architecture.

I really don't think that's an accurate analysis. As far as functional units are concerned, the R300 certainly doesn't have twice as many FP24 units as the NV35+ has FP32 units. The NV3x also has deeper pipelines, and there may be extra transistors involved in supporting the integer formats (which I feel should be nixed... though I think FP16 isn't such a bad thing until we move to higher-precision DACs). In other words, there are other things that make the NV3x more transistor-hungry than the R3xx, such that one cannot draw a direct comparison between the choice to use FP32 and FP24. There are too many other differences between the chips to single out that one as the cause.
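For a sense of scale on the precision-versus-silicon point, here is a very rough back-of-the-envelope model. It assumes an array multiplier whose area grows with the square of the mantissa width, and it ignores adders, registers, and control logic, so it supports the "bits cost silicon" argument for the multiplier array itself without settling the whole-chip comparison made in the reply above:

```c
/* Crude estimate of relative multiplier area for FP24 vs. FP32.
 * Assumption: an array multiplier's area scales roughly with the
 * square of the mantissa width (hidden bit included).  Real shader
 * ALUs contain far more than a multiplier, so treat this as a
 * rough illustration only. */
#include <stdio.h>

int main(void)
{
    const int fp24_mantissa = 17;  /* 16 stored bits + hidden bit */
    const int fp32_mantissa = 24;  /* 23 stored bits + hidden bit */

    double area24 = (double)fp24_mantissa * fp24_mantissa;
    double area32 = (double)fp32_mantissa * fp32_mantissa;

    printf("FP32 multiplier is roughly %.2fx the area of an FP24 one\n",
           area32 / area24);
    return 0;
}
```

Under this crude model the FP32 multiplier comes out at roughly twice the area of the FP24 one, which is the kind of budget the "more bits or more pipes" trade-off is about.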
GraphixViolence said:
As for increasing programmability, I wasn't implying that progress should be stopped in this area. I just think the next generation of graphics hardware would provide a good opportunity to let developers catch up to the current level and get more out of it, and the best way to do that would be to concentrate on adding more performance instead of more features that won't get used in the near term.

I really don't think so. If programmability progress stalls, it will only make it harder to get it started up again.
Chalnoth said:
GraphixViolence said:
It could be argued that supporting formats with less than FP32 precision is just as much of a hardware efficiency improvement as merging pixel & vertex shaders. Every additional bit of precision you have to support takes up silicon that could instead be used for increasing performance. As the NV30 vs. R300 match-up showed, an 8-pipe 24-bit architecture can be a lot more attractive than a 4-pipe 32-bit architecture.

I really don't think that's an accurate analysis. As far as functional units are concerned, the R300 certainly doesn't have twice as many FP24 units as the NV35+ has FP32 units.

GV did say NV30, which did appear to only have four FP units.
Chalnoth said:
The NV3x also has deeper pipelines,

How did you come to that conclusion?
OpenGL guy said:
Chalnoth said:
I really don't think that's an accurate analysis. As far as functional units are concerned, the R300 certainly doesn't have twice as many FP24 units as the NV35+ has FP32 units.

GV did say NV30, which did appear to only have four FP units.

Some support: http://www.beyond3d.com/forum/viewtopic.php?t=8005
OpenGL guy said:
GV did say NV30, which did appear to only have four FP units.

But the NV35 doesn't have many more transistors, and apparently has a similar number of functional FP32 units as the R300 has FP24 units. That essentially means that any argument about FP24 vs. FP32 that depends upon looking at the NV30 vs. R300 is meaningless because of the existence of the NV35.
Chalnoth said:
The NV3x also has deeper pipelines,

How did you come to that conclusion?

From this interview:

"Another example is if you're doing dependent texture reads where you use the result of one texture lookup to lookup another one. There's a much longer title time on the pipeline than there is in ours."

The typical way to improve performance with lots of dependent texture reads is to have a deeper pipeline.
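For readers unfamiliar with the term, a dependent texture read is simply a lookup whose coordinates come from the result of a previous lookup. The sketch below models it with plain array indexing (the names and sizes are invented for illustration); the point is that the second fetch cannot be issued until the first returns, which is exactly the latency a deeper pipeline, with more pixels in flight, tries to hide:

```c
/* Minimal sketch of a dependent texture read: the second lookup's
 * address is computed from the first lookup's result, so the second
 * fetch cannot start until the first one returns.  Array indexing
 * stands in for texture sampling here. */
#include <stdio.h>

#define TEX_SIZE 256

static unsigned char indirection[TEX_SIZE]; /* first texture  */
static unsigned char color_map[TEX_SIZE];   /* second texture */

/* One "pixel" of a dependent-read shader. */
static unsigned char shade_pixel(int uv)
{
    unsigned char t0 = indirection[uv % TEX_SIZE]; /* lookup #1     */
    unsigned char t1 = color_map[t0];              /* depends on t0 */
    return t1;
}

int main(void)
{
    /* Fill the tables with something deterministic. */
    for (int i = 0; i < TEX_SIZE; i++) {
        indirection[i] = (unsigned char)(255 - i);
        color_map[i]   = (unsigned char)(i / 2);
    }
    printf("pixel(10) = %u\n", shade_pixel(10));
    return 0;
}
```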
Chalnoth said:
From this interview:

"Another example is if you're doing dependent texture reads where you use the result of one texture lookup to lookup another one. There's a much longer title time on the pipeline than there is in ours."

The typical way to improve performance with lots of dependent texture reads is to have a deeper pipeline.

My goodness - Dr. Kirk said it - it must be true.
GraphixViolence said:
I voted for more pipelines. This provides exactly the same benefits as higher core clock frequency.
andypski said:
My goodness - Dr. Kirk said it - it must be true.

Considering he's the Chief Scientist at nVidia, I would tend to think he has a rather authoritative position on the inner workings of the NV3x architecture.
Chalnoth said:
andypski said:
My goodness - Dr. Kirk said it - it must be true.

Considering he's the Chief Scientist at nVidia, I would tend to think he has a rather authoritative position on the inner workings of the NV3x architecture.

The fact that he's the Chief Scientist at nVidia should imply that you should take his comments on other IHVs' architectures with just a little bit of skepticism. Did he provide performance numbers to back up his claim? Did he provide examples of how the NV3x is better at dependent reads? Didn't think so.
GraphixViolence said:
I voted for more pipelines. This provides exactly the same benefits as higher core clock frequency.

Well, that's really a question of which is more cost-effective: a larger, slower core, or a smaller, faster one? Both can realize the same performance, but each may not be realized at the same cost.
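As a concrete data point for the "wider and slower versus narrower and faster" question, here is the peak fill-rate arithmetic for the two parts discussed earlier, using their commonly quoted shipping clocks (325 MHz for the 8-pipe R300, 500 MHz for the 4-pipe NV30). Peak pixel throughput is just pipes times clock, so on paper the wider part wins even at a lower frequency:

```c
/* Peak pixel fill rate = pipelines x core clock.
 * Clock figures are the commonly quoted shipping speeds for the
 * Radeon 9700 Pro (R300) and GeForce FX 5800 Ultra (NV30). */
#include <stdio.h>

int main(void)
{
    struct { const char *name; int pipes; int mhz; } chips[] = {
        { "R300 (8 pipes @ 325 MHz)", 8, 325 },
        { "NV30 (4 pipes @ 500 MHz)", 4, 500 },
    };

    for (int i = 0; i < 2; i++) {
        long mpixels = (long)chips[i].pipes * chips[i].mhz;
        printf("%s: %ld Mpixels/s peak\n", chips[i].name, mpixels);
    }
    return 0;
}
```

That works out to 2600 vs. 2000 Mpixels/s peak, which says nothing about the cost side of the trade-off raised above: die area, yield, and power are what decide which approach is cheaper to build.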
Chalnoth said:
andypski said:
My goodness - Dr. Kirk said it - it must be true.

Considering he's the Chief Scientist at nVidia, I would tend to think he has a rather authoritative position on the inner workings of the NV3x architecture.

He's also a public spokesman for nVidia, which should put you on alert as to his desire to tell the truth vs. sell his company. I'm mainly thinking of his (IIRC) interviews that initially misled most sites to proclaim the 5800 as eight-pipeline. I also remember his latest interview with FS where he said ATi couldn't claim to know anything about nV's pipeline because they didn't create it, then turned around and detailed the exact number of cycles it takes for ATi to do certain ops. I fail to see how one couldn't come to a decent approximation of a pipeline by examining cycle times for certain operations, much like nV obviously did for ATi's hardware.
Pete said:
Well, that's really a question of which is more cost-effective: a larger, slower core, or a smaller, faster one? Both can realize the same performance, but each may not be realized at the same cost.

True. I guess I based this statement on the assumption that increasing the number of pipes would be more cost-effective than increasing the clock speed by an equivalent amount. Over the past few years, transistor counts have been roughly doubling each year, while clock speeds haven't quite been keeping pace (doubling around every 1.5-2 years). However, new process technologies don't seem to be rolling out as fast and furious as they once were, so it may not be possible for this trend to continue.
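To put those growth rates side by side, here is the compounded difference over a few years using only the doubling periods stated in the post above (one year for transistor count, 1.5 to 2 years for clock speed); the widening gap is what makes "more pipes" look cheaper than "more MHz" over time, assuming the trend holds:

```c
/* Compounded growth from the doubling periods quoted in the post:
 * transistor budget doubles every year, clock speed every 1.5-2 years. */
#include <math.h>
#include <stdio.h>

static double growth(double years, double doubling_period)
{
    return pow(2.0, years / doubling_period);
}

int main(void)
{
    for (int years = 1; years <= 4; years++) {
        printf("%d yr: transistors x%.1f, clock x%.1f-%.1f\n",
               years,
               growth(years, 1.0),   /* doubles every year      */
               growth(years, 2.0),   /* doubles every 2 years   */
               growth(years, 1.5));  /* doubles every 1.5 years */
    }
    return 0;
}
```

After four years that gives roughly a 16x transistor budget against only a 4x-6x clock increase, which is the asymmetry the argument rests on.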