Are there any real surprises left?

Randell

What could an IHV do now in terms of usable features/architecture changes that would surprise people and enhance 3d IQ and/or performance?

Or is it a matter of just complying with DX9/OGL2.0 plus a bit more, so there are no 'rabbits' left to pull out of the hat that would benefit us consumers?
 
1) deferred renderer/tiler + 256-bit bus = bandwidth irrelevant, and fragment shaders more efficient because you don't execute them unless you need to (really good if your shaders are long). Enables really high AA (16-64x)

2) higher level unit to control multipass, do raytracing, and other algorithms. Even less CPU/AGP involvement

3) even more performance. dispatch way more vertex/fragment ops per cycle

4) scene graph acceleration
 
Performance will still be a big problem.

What do you think, would it be easy to parallelize pixel shader operations within the pipeline? IIRC, an NV20 can do two PS ops per clock per pipeline (?) - how many ops will an NV30 be capable of doing per clock/pipe?

Assuming two ops per clock, a 1024-instruction program, 400 MHz and 8 pipelines, the fillrate will drop to about 6 MPixel/s! Sure, that's still cool for accelerating CGI stuff, but game developers will be limited to tens or maybe about 100 instructions per pixel to maintain reasonable frame rates. Maybe the NV30 can do a lot more PS ops per clock; otherwise this will have to be addressed in future generations by adding more pipelines or increasing parallelism within the pipes.
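For reference, here is that arithmetic made explicit (all figures assumed for the sake of argument, not confirmed specs) as a quick back-of-the-envelope in Python:

# Back-of-the-envelope fill rate for a long pixel shader program (assumed figures).
clock_hz      = 400e6   # 400 MHz core clock
pipelines     = 8       # pixel pipelines
ops_per_clock = 2       # PS ops per clock per pipeline
instructions  = 1024    # shader program length

ops_per_second = clock_hz * pipelines * ops_per_clock   # 6.4 billion ops/s
fillrate = ops_per_second / instructions                 # pixels shaded per second
print(fillrate / 1e6, "MPixel/s")                        # ~6.25 MPixel/s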
 
Democoder,

Option 1 is what I would like to see, to find out if it's viable; in a way I've been expecting/hoping for that kind of chip at some point anyway. I guess if ATI/Matrox/nVidia did that before Autumn '03 it would be a rabbit out of the hat.

Your other options are a bit too technical for me, but I get your drift :)
 
Big speculation here.
nVidia/AMD merge and decide to enter a new era with a new SPU (Super Processing Unit):
- Improved Hammer-core CPU with GPU in a single chip
- .09 micron process
- Single fp pixel/DSP pipeline at 3 GHz
- Single fp VS/DSP pipeline at 3 GHz
- Integrated 4-way UMA controller with 128-bit DDR-II
- Only 105 million transistors
- Unlimited programmability
- mATX form factor
- Single-chip NB and SB with DD sound, LAN, USB 2.0, PCI-X, etc.
- 3 PCI-X slots
- For sub-$1000 personal supercomputers

Power to the masses and a big, big installed base :)
 
I myself want to finally see Real Time Radiosity Lighting implemented, but some people here noted that it might not be possible on silicon chips...

By the time we reach RT Radiosity Lighting, there will be no more advances to make! :D
 
Hmm...

Features...

Better compression for textures (probably a subset of the JPEG 2000 standard?) and introduction of geometry compression.

What about being able to generate vertices inside the 3D chip (like Matrox does with its displacement mapping feature, to a limited degree)? It could be used to create parametric or algorithmically generated objects (such as L-system trees, or whatever they're called) inside the GPU.
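(For anyone unfamiliar with L-systems: they are just iterated string rewriting, so the hardware would essentially have to expand something like the toy Python sketch below - the rule is made up purely for illustration - and then turn the result into vertices.)

# Toy L-system expansion (illustrative rule for a simple branching structure).
# 'F' = draw a segment, '+'/'-' = turn, '[' / ']' = push/pop the drawing state.
rules = {"F": "F[+F]F[-F]F"}

def expand(axiom, iterations):
    s = axiom
    for _ in range(iterations):
        s = "".join(rules.get(ch, ch) for ch in s)
    return s

print(expand("F", 2))   # each character would map to GPU-generated geometry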

Chris
 
Keep coming up with new features that are closer to Renderman, keep coming up with faster performance at the same time... and pray that Intel or AMD or anyone can create a monster CPU that regular folks can afford at the same time as the IHVs deliver the previous two. The last is probably the most important... I don't foresee anything these IHVs can do that won't depend on relatively affordable and extremely huge CPU power to really matter, not in the foreseeable future at least.
 
Adding TruForm to all your porn movies.... :LOL:

Seriously though, I'd like to see more realistic terrains in games, perhaps an advanced form of bump mapping? I'm not a technical person, but more realistic games are always something I look forward to. I'd like to see more breakthroughs in A.I. these days though, something more human-like in responses, tactical feedback, vision, and approaches. If A.I. had as many breakthroughs as we've had in visual quality, games would be a lot more fun these days. I like Reverend's response, more Renderman-like features, things that bring games closer to movie-like quality. Doom3 should really help propel that notion to game developers.
 
DemoCoder said:
1) deferred renderer/tiler + 256-bit bus = bandwidth irrelevant, and fragment shaders more efficient because you don't execute them unless you need to (really good if your shaders are long). Enables really high AA (16-64x)

Many of the problems with long fragment shaders can be overcome by using an initial Z-pass to prime the depth buffer. I am not saying that a deferred renderer doesn't improve this situation, just that the limitations can be reduced in traditional architectures, particularly with hierarchical-Z techniques, where rejecting pixels in subsequent passes becomes extremely fast.

We can also consider the case where the fragment depth value is generated inside the shader program itself rather than from direct rasterization - the problem now becomes the same for both immediate and deferred architectures - you cannot fill the depth value without executing the shader for all fragments. In the future we may see more of these kinds of situations developing.
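(For illustration, the depth pre-pass described above looks roughly like this in GL terms - a minimal sketch using PyOpenGL, assuming a current GL context and a draw_scene() callback; it is not any particular vendor's implementation.)

# Depth-only pre-pass followed by a full shading pass.
from OpenGL.GL import (glEnable, glDepthFunc, glDepthMask, glColorMask,
                       GL_DEPTH_TEST, GL_LEQUAL, GL_EQUAL, GL_TRUE, GL_FALSE)

def render(draw_scene):
    glEnable(GL_DEPTH_TEST)

    # Pass 1: lay down depth only - no colour writes, so no expensive shading.
    glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE)
    glDepthMask(GL_TRUE)
    glDepthFunc(GL_LEQUAL)
    draw_scene()

    # Pass 2: shade only the fragments that survived - hidden fragments fail the
    # EQUAL depth test and never run the long pixel shader.
    glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE)
    glDepthMask(GL_FALSE)
    glDepthFunc(GL_EQUAL)
    draw_scene()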
 
pascal said:
Big speculation here.
nVidia/AMD merge and decide to enter a new era with a new SPU (Super Processing Unit):
- Improved Hammer-core CPU with GPU in a single chip
- .09 micron process
- Single fp pixel/DSP pipeline at 3 GHz
- Single fp VS/DSP pipeline at 3 GHz
- Integrated 4-way UMA controller with 128-bit DDR-II
- Only 105 million transistors
- Unlimited programmability
- mATX form factor
- Single-chip NB and SB with DD sound, LAN, USB 2.0, PCI-X, etc.
- 3 PCI-X slots
- For sub-$1000 personal supercomputers

Power to the masses and a big, big installed base :)

As an AMD shareholder I have been arguing for just such a pairing for some time now.
 
Well I, for one, would be most pleasantly surprised by a Star Trek-style "Holodeck". But I suppose I won't see anything like that in my lifetime.
 
Radiosity the end of the line? Hardly. For one, when people say radiosity they usually mean a method which only takes diffuse effects into account. Specular radiosity/forward raytracing/bidirectional raytracing/photon tracing etc. can be pretty good ... but of course some phenomena rely on the wave-like properties of light, so even that is no cure-all. <A HREF=http://www.lems.brown.edu/~leymarie/WaveRender/NotesDec98.html>Modelling light propagation with cellular automata</A> could capture all effects, but that will take a bit longer than even realtime radiosity.
 
andypski said:
Many of the problems with long fragment shaders can be overcome by using an initial Z-pass to prime the depth buffer. I am not saying that a deferred renderer doesn't improve this situation, just that the limitations can be reduced in traditional architectures, particularly with hierarchical-Z techniques, where rejecting pixels in subsequent passes becomes extremely fast.

The only realistic solution for transparency sorting (apart from raytracing) is using per-pixel fragment buffers, which will remain a huge pain in the ass for the foreseeable future without tiling.

We can also consider the case where the fragment depth value is generated inside the shader program itself rather than from direct rasterization - the problem now becomes the same for both immediate and deferred architectures - you cannot fill the depth value without executing the shader for all fragments. In the future we may see more of these kinds of situations developing.

A deferred renderer can just look at the pixel shader and, when it modifies the fragment Z, handle it as a transparent surface - there is really no good reason to use this ... so it won't affect too many surfaces.
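(To make "per-pixel fragment buffers" concrete: the idea is to keep a list of transparent fragments per pixel and sort/blend them at resolve time. A toy CPU-side sketch in Python, illustrating only the data structure, not how real hardware would do it:)

# Resolve one pixel from its fragment list ("A-buffer" style order-independent transparency).
def resolve_pixel(fragments, background=(0.0, 0.0, 0.0)):
    """fragments: list of (depth, (r, g, b), alpha) collected for one pixel."""
    color = background
    # Sort far-to-near, then blend back-to-front.
    for depth, (r, g, b), a in sorted(fragments, key=lambda f: f[0], reverse=True):
        color = (r * a + color[0] * (1 - a),
                 g * a + color[1] * (1 - a),
                 b * a + color[2] * (1 - a))
    return color

# Example: an opaque red surface at depth 0.5 behind a half-transparent blue one at 0.3.
print(resolve_pixel([(0.3, (0.0, 0.0, 1.0), 0.5), (0.5, (1.0, 0.0, 0.0), 1.0)]))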

Marco
 
Fuz said:
pascal, I think that will happen after ATI and Intel merge!
I was thinking about what you said, and the P3-S core could be a good candidate for the job:
- Large 512 KB 8-way set-associative cache
- Best performance/watt on the market
- Highly stable, with heat generation that scales only modestly with frequency
- Almost half the P4 die size

Now take it and add:
- improved P3-S core
- 128-bit DDR-II 4-way UMA controller
- single fp pixel/DSP pipeline
- single fp VS/DSP pipeline

Maybe it could be done with ~70 million transistors on the .13 micron process 8)

Call it an MPU (Multimedia Processor Unit) and sell it to corporate, SOHO, gamers, kids, schools, grandmothers, etc... at a low price point.

A very large installed base could make 3D applications, audio and image processing, pro apps, etc. explode (boom!). We will be happy when many more people become happy. :)
 
I'd like to see more support for LARGE terrain generation; I want Tribes maps to look puny.

Also, I'd like to see a lot more texture and geometry compression, and greater autonomy of the graphics processor from the CPU.
 
sumdumyunguy said:
Well I, for one, would be most pleasantly surprised by a Star Trek-style "Holodeck". But I suppose I won't see anything like that in my lifetime.

I somewhat dread the day things get that advanced.. it's already getting pretty insane in terms of the addiction to roleplaying (EverQuest-style).. I personally have a hard time believing that the Enterprise only has a handful of those rooms, yet seemingly none of them are ever in use :p, one would think there'd be lineups of hundreds waiting to use them, lol..

I guess at least people will be more physically active in that case, not just being forced to sit in front of a screen to be involved with the medium.. it sure won't be mentally or socially healthy though..
 
Real-world lighting a la POV-Ray, or a reasonable kludge, would be something.

I'd settle for real texture filtering and no visual polygon (edge) aliasing.
 