PowerVR Series 5 is a DX9 chip?

IMG is not in dire straits; I doubt it would have problems with funding ... the problem is the willingness to take the risk that goes with it.

Deferred rendering has no fundamental problems with high polygon counts ... claiming otherwise, given how it is used in offline rendering, defies logic.
 
Chalnoth said:
I hope not. As I've said many times in the past, I feel that fully-deferred rendering is the wrong way to go. Deferred rendering will have problems as polycounts increase in the next few years.

Are we still trotting these same tired, baseless accusations out about deferred rendering?

Chalnoth - given manufacturers' willingness to increase memory for larger frame buffers to cater for higher levels of FSAA (which, under current compression algorithms, still require the same space as uncompressed FSAA buffers), the memory used by IMRs is going to far exceed what a true tile-based deferred renderer requires, even at high polycounts; after all, a TBR only needs frame buffer space at the target resolution to achieve similar (or even infinitely greater) numbers of FSAA samples.
 
It’s a slight exaggeration, but the point is that the MSAA limitations for a TBR are different from those for an IMR (under current methods of FSAA implementation). For IMRs, 6X and 8X AA are limited by frame buffer space and external bandwidth concerns (alleviated to some degree by colour compression), but because a TBR can calculate the FSAA samples on chip and downsample them before sending them to the external frame buffer, neither of these two factors is an issue for it.

There will obviously be other limitations and trade-offs that a TBR designer needs to consider when choosing the FSAA depth and sampling patterns for their architecture (parameter space concerns, on-chip tile size and die space requirements, number of Z operations per cycle, and so on).
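
To put some rough numbers on the frame buffer point (a back-of-envelope sketch only; the resolution, formats and sample count below are my own illustrative assumptions, not any particular card's figures):

```cpp
#include <cstdio>

int main() {
    // Illustrative assumptions: 1600x1200, RGBA8 colour, 24/8 depth/stencil,
    // 6x multisampling, no compression (worst-case space still reserved).
    const long long width = 1600, height = 1200;
    const long long bytesColour = 4;
    const long long bytesDepth  = 4;
    const long long samples     = 6;

    // IMR: multisampled colour + depth live in video memory, plus the resolved buffer.
    long long imrBytes  = width * height * samples * (bytesColour + bytesDepth)
                        + width * height * bytesColour;

    // TBDR: samples are held in the on-chip tile buffer and downsampled before
    // write-out, so external memory only needs the target-resolution colour buffer.
    long long tbdrBytes = width * height * bytesColour;

    std::printf("IMR  6x FSAA: ~%lld MB\n", imrBytes  / (1024 * 1024));
    std::printf("TBDR 6x FSAA: ~%lld MB\n", tbdrBytes / (1024 * 1024));
    return 0;
}
```

The exact numbers don't matter; the point is that the IMR's multisampled colour and Z surfaces have to sit in video memory, while the TBDR resolves them on chip.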
 
Heh! Regarding deferred rendering (and IMRs) and long pixel shaders, I just remembered this clever idea from ATI's DX9 optimization paper about the use of multiple render passes:

Ordering all geometry by distance and rendering it front to back might not be such a good idea since it might affect sorting by effect, shader or render state.

The solution is to use multi-pass rendering since vertex processing is rarely a bottleneck. On the first render pass just initialize the depth buffer with proper depth values for your scene by rendering all geometry without any pixel shaders and outputting only depth and no color information. Since no shaders are used, it is possible to render everything in front-to-back order without causing any major render state changing overhead. Then render everything once again with proper shaders. Because the depth buffer is already initialized with proper depth values, early pixel rejection can happen due to HYPER Z optimizations, thus creating effective overdraw of one on the shader pass.
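
Roughly, in D3D9 terms the two passes look something like this (just a sketch of the idea; DrawAllOpaqueGeometry is a made-up stand-in for whatever the engine's submission code is):

```cpp
#include <d3d9.h>

// Hypothetical helper: submits all opaque geometry, roughly sorted front to back
// on the first call and sorted by shader/render state on the second.
void DrawAllOpaqueGeometry(IDirect3DDevice9* dev, bool sortFrontToBack);

// Minimal sketch of the two-pass scheme described above, assuming a D3D9 device
// that already has its render target and depth buffer bound.
void RenderWithDepthPrepass(IDirect3DDevice9* dev)
{
    // Pass 1: lay down depth only. No pixel shader, no colour writes,
    // geometry submitted front to back since there is no state thrashing.
    dev->SetPixelShader(NULL);
    dev->SetRenderState(D3DRS_COLORWRITEENABLE, 0);
    dev->SetRenderState(D3DRS_ZWRITEENABLE, TRUE);
    dev->SetRenderState(D3DRS_ZFUNC, D3DCMP_LESSEQUAL);
    DrawAllOpaqueGeometry(dev, true);

    // Pass 2: real shaders, sorted by shader/render state. The depth buffer is
    // already complete, so hidden pixels are rejected early (Hyper-Z et al.)
    // before their shaders ever run.
    dev->SetRenderState(D3DRS_COLORWRITEENABLE, 0x0000000F);
    dev->SetRenderState(D3DRS_ZWRITEENABLE, FALSE);
    dev->SetRenderState(D3DRS_ZFUNC, D3DCMP_LESSEQUAL);
    DrawAllOpaqueGeometry(dev, false);
}
```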

But maybe we should discuss it in another thread, Dave?
 
I need to see a high end product from PowerVR in the PC space... badly.

I've been waiting for it since the Japanese launch of the Dreamcast...

In the meantime I'll buy ATI products, but with my Radeon 8500 I'm OK for a year.
 
LeStoffer said:
Heh! Regarding deferred rendering (and IMRs) and long pixel shaders, I just remembered this clever idea from ATI's DX9 optimization paper about the use of multiple render passes:

Ordering all geometry by distance and rendering it front to back might not be such a good idea since it might affect sorting by effect, shader or render state.

The solution is to use multi-pass rendering since vertex processing is rarely a bottleneck. On the first render pass just initialize the depth buffer with proper depth values for your scene by rendering all geometry without any pixel shaders and outputting only depth and no color information. Since no shaders are used, it is possible to render everything in front-to-back order without causing any major render state changing overhead. Then render everything once again with proper shaders. Because the depth buffer is already initialized with proper depth values, early pixel rejection can happen due to HYPER Z optimizations, thus creating effective overdraw of one on the shader pass.

But maybe we should discuss it in another thread, Dave?

Nothing new about it. Doom3 does this (at least similarly), and NVidia has been actively promoting this technique for some time now, too.
S3 DeltaChrome drivers may even do this automatically.
 
Xmas said:
... and NVidia has been actively promoting this technique for some time now, too.

I didn't know. :oops: But it's an interesting idea nonetheless, since we were talking about complex pixel shaders on deferred renderers vs IMRs. Am I missing something, Xmas?
 
Chalnoth said:
I hope not. As I've said many times in the past, I feel that fully-deferred rendering is the wrong way to go. Deferred rendering will have problems as polycounts increase in the next few years.
Where have I heard that before? Of course, it was here....a few years ago.

What I can say for certain is that deferred rendering will have problems as entropy increases during the heat death of the universe.
 
DaveBaumann said:
Are we still trotting these same tired, baseless accusations out about deferred rendering?

Chalnoth - given manufacturers' willingness to increase memory for larger frame buffers to cater for higher levels of FSAA (which, under current compression algorithms, still require the same space as uncompressed FSAA buffers), the memory used by IMRs is going to far exceed what a true tile-based deferred renderer requires, even at high polycounts; after all, a TBR only needs frame buffer space at the target resolution to achieve similar (or even infinitely greater) numbers of FSAA samples.

Well, I've stated my problems with fully-deferred rendering previously, but I suppose I'll state them again.

I want an architecture that is designed to have the best-possible performance in a worst-case scenario. The problem with a deferred renderer is that if you go ahead and, for example, implement 16x MSAA, there will be very little performance hit, until a game comes along that overruns the scene buffer, forcing all of that extra frame buffer and z-buffer information to be written to video memory, drastically reducing performance.
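
To make concrete what I mean by the scene buffer scaling with geometry, here's a toy sketch (the structure layout and the budget figure are purely illustrative assumptions, nothing to do with any real chip):

```cpp
#include <cstddef>
#include <cstdio>

// Toy illustration only: real hardware packs scene/parameter data far more
// cleverly than this, and the sizes below are made up.
struct BinnedTriangle {
    float x[3], y[3], z[3];  // post-transform vertex positions
    unsigned stateId;        // which shader/texture state to apply at shading time
};

int main() {
    const std::size_t sceneBufferBudget  = 8u * 1024u * 1024u;  // assumed on-card bin space
    const std::size_t trianglesThisFrame = 500000;              // assumed scene complexity

    // A deferred renderer has to capture every triangle before it can shade a tile,
    // so the space needed grows linearly with polycount.
    std::size_t bytesNeeded = trianglesThisFrame * sizeof(BinnedTriangle);

    if (bytesNeeded > sceneBufferBudget) {
        // Overflow: partial colour/Z results must be flushed to video memory and
        // merged later -- the slow path described above.
        std::printf("Scene buffer overflow: need %zu KB, budget %zu KB\n",
                    bytesNeeded / 1024, sceneBufferBudget / 1024);
    } else {
        std::printf("Scene fits: %zu KB of %zu KB\n",
                    bytesNeeded / 1024, sceneBufferBudget / 1024);
    }
    return 0;
}
```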

In other words, I'd rather see an architecture that can do 6x-8x FSAA at good framerates than one that can do 16x FSAA at good framerates, but whose performance drops drastically when a set geometry limit is reached. As a side note, there may also be other FSAA techniques, such as a version of the FAA technique seen on the Parhelia (though it may be impossible to completely detect all edges...).

And, the last thing is I still feel that more fillrate is not what is needed most to advance PC 3D graphics. We need more programmability, more pure processing power, and particularly more vertex processing power. After using a Radeon 9700 Pro for a while, I feel that its 6x FSAA is quite easily "good enough" for most any realtime use. More FSAA is just not necessary.

Once you take that away, and add in rendering algorithms where there is an initial z-pass, essentially all of a fully-deferred renderer's benefits go out the window, leaving only its primary drawback of high polycounts causing problems.
 
The problem with a deferred renderer is that if you go ahead and, for example, implement 16x MSAA, there will be very little performance hit, until a game comes along that overruns the scene buffer, forcing all of that extra frame buffer and z-buffer information to be written to video memory, drastically reducing performance.

But when does that happen when you've got 128MB frame buffers? Show us an example...

Once you take that away, and add in rendering algorithms where there is an initial z-pass, essentially all of a fully-deferred renderer's benefits go out the window, leaving only its primary drawback of high polycounts causing problems.

No it doesn't. There are all kinds of things coming along, such as MRTs, where TBRs will have much greater efficiency than IMRs. Plus, with this approach you are still relying on early-Z routines, which can still stall the pipeline a little, an area that's not an issue for TBRs.
 
After using a Radeon 9700 Pro for a while, I feel that its 6x FSAA is quite easily "good enough" for most any realtime use. More FSAA is just not necessary.

No objection, as long as the sampling pattern is at least similar to ATI's. In a comparison like that, though, give me one good reason why a TBDR wouldn't be faster at high resolutions (same sampling pattern, same number of samples)?
 
We seem to be on a strange discussion loop here:

Chalnoth says: My main problem with Deferred Rendering is X argument.

DaveB responds: X is not a problem/can be easily solved by...

Chalnoth says: DR's also have problem Y.

Dave B responds: Y is not a problem/can be easily solved by...

Chalnoth says: I still believe DR's have a problem with X and Y compared to IMR's.

It seems to me that most of Chalnoth's doubts about deferred renderers have been fully refuted in the past by both DaveB and SimonF (who should know a thing or two about it!), yet we still hear the same arguments again and again.

I expect these discussions will keep popping up until we have a DX9 DR on the market, but the arguments about the limitations of DR's as compared to IMR's seem pretty thin to me.
 
I may as well throw in my "standard" argument against deferred rendering:

If the benefits did in fact outweigh the drawbacks for the PC, we would have seen more than just a few low-to-mid-range parts from one vendor.

IHVs aren't "dumb", as much as we like to point the finger at them for being exactly that. This isn't to say they don't make mistakes, but when the most successful IHVs are still using IMR, there is only one reason for it:

The advantages of DR don't outweigh the disadvantages...at least not at this time.

What are all the disadvantages? Beats me. The proponents of deferred rendering would "shoot down" any apparent disadvantage, saying "nah...that's not a problem."

What I would like to hear from the proponents is what exactly they feel is the reason for the lack of PC implementations. If the practical advantages are clear and unambiguous, with few to no drawbacks, why isn't everyone jumping on the deferred rendering bandwagon? My thoughts:

1) There are some real disadvantages, ones that tend to be show-stoppers, that we haven't discussed or thought about here. I remember reading something from an nVidia employee along the lines of "we haven't solved all the 'fringe' cases where DR breaks down...we may in the future, but not yet."

and/or

2) The advantages may be there, but they are not as great as typically hyped, particularly in the PC space. This makes it more risky for management to invest the R&D to make the "switch" to deferred rendering, because the return on investment is not a sure thing.
 
I may as well throw in my "standard" argument against deferred rendering:
If the benefits did in fact outweigh the drawbacks for the PC, we would have seen more than just a few low-to-mid-range parts from one vendor.

As I understand it, deferred renderers are considerably more complex in their operation than standard IMRs. Only in the last couple of generations of IMRs have we started to see more complex techniques to improve efficiency. Perhaps this is one reason why so few companies (Videologic and Gigapixel are the only ones that come to mind) have gone with deferred renderers.

Another possibility is that Videologic must now hold considerable amounts of IP relating to deferred rendering - could other companies easily create a deferred renderer which didn't infringe on this IP?

A third argument is that deferred rendering technology must be worth something - Microsoft was originally in talks with Gigapixel about supplying the Xbox chip. 3Dfx then paid a lot of money for Gigapixel and NVidia happily bought all their IP when 3Dfx collapsed.

The forthcoming MBX chips from ImgTec show that the technology is excellent for small devices. As improvements in 3D hardware from one generation to the next are fundamentally little more than a matter of scaling up smaller units, why should it not be possible for a high-end deferred renderer to be produced using similar technology?
 
There's also a third scenario Joe:

TBDRs have advantages in X departments and disadvantages in Y departments, while IMRs have advantages in Y departments and disadvantages in X departments; which architecture comes out ahead on balance remains to be seen (if ever), since we haven't really seen a fully specced TBDR yet.

Why haven't other IHVs chosen a DR route yet? Can you safely exclude the possibility that immediate mode rendering is simply the safer route for them to walk, and that they don't have enough experience to produce an at least equally well performing TBDR?

I'll flip the coin: why hasn't PowerVR changed its approach to immediate mode through all these years instead? "Hey, we like to differentiate ourselves (or defer, if you like...)" isn't exactly a good reason to stick with a hypothetically "inferior" rendering approach.
 
The advantages may be there, but they are not as great as typically hyped, particularly in the PC space.

I don't see why that should be the case in the first place. Each IHV will hype its products and, by extension, its rendering approach (when it comes to TBR vs IMR). Do you really expect NV or ATI to come out and say that TBR is superior to what they do, or PVR to admit that IMRs are in fact better than TBRs?

I believe that in certain departments TBDRs are in fact more efficient, but IMRs, with the recent refinements they employ, don't seem to be that far behind anymore either. I never believed for a second, though, that a similarly specced TBDR would kill an IMR in performance across the board.

edit:

Dave Baumann:

Did you not think about combining some of the work that 3dfx and GigaPixel have done with NVIDIA architecture? Did you consider going deferred rendering or even to 'semi-defer' and cache some geometry data, sort it and then render it?

Geoff Ballew:

The challenge with some of those other architectures is with today's level of performance, today's capabilities, and today's requirements for software compatibility no one has really solved the compatibility issues, the corner cases if you will, of those different techniques. Without talking about future products, because someday we may solve those corner cases, we still use a more traditional pipeline in this GPU because what we want is absolute, solid compatibility and stability.

We did acquire technology down the paths of those other architectures and we've got a lot of smart people looking at it. When the time is right to include some of those other ideas, we'll be there, but it's not the time for that yet.

Interpret it as you wish.
 
As I understand it, deferred renderers are considerably more complex in their operation than standard IMRs.

IIRC, that's not what the proponents like to say...they usually talk of much simpler operation and lower transistor counts.

Another possibility is that Videologic must now hold considerable amounts of IP relating to deferred rendering - could other companies easily create a deferred renderer which didn't infringe on this IP?

Possible, but I would imagine VideoLogic's "IP" on deferred rendering isn't much different from every other company's "IP" related to IMRs. In short, I doubt that's a show-stopper.

A third argument is that deferred rendering technology must be worth something -

Or worth nothing ;), depending on how you look at it.....

Microsoft was originally in talks with Gigapixel about supplying the Xbox chip.

And yet, they went with an IMR....

3Dfx then paid a lot of money for Gigapixel

And yet....nothing came of it....

and NVidia happily bought all their IP when 3Dfx collapsed...

And yet, still no deferred renderer, or any known plans to make one in the foreseeable future.

So, what does it say when people appear to be "interested" in deferred rendering, and then, presented with the choice....end up going with an IMR, or prove incapable of producing a deferred renderer? That's pretty much exactly my point.

The forthcoming MBX chips from ImgTec show that the technology is excellent for small devices.

Not proven until the product is on the market and successful. ;) Still, I have in fact always been optimistic about deferred renderers in closed devices, ones that don't have legacy apps designed with the "limitations" of IMRs in mind.

I want to make it clear that my "pessimism" for deferred rendering, is pretty much limited to the PC space.

why should it not be possible for a high-end deferred renderer to be produced using similar technology?

Well, that's what we're debating, isn't it? ;) I suggest you ask IMG, ATI, and nVidia those questions.
 