New "Are You Ready?" video

There was nothing mentioned in my previous statement about not deferring anything, just no sorting, tiling, or triangle binning. Although an IMR renders immediately, it doesn't necessarily need to render everything immediately.
 
On the video-

No, I don't think it's a spoof, but instead it's meant to appeal to hardcore NV fans (as the intro warning/rating signifies).

I DO think, however, that NVidia is in dire need of a spoof, something humorous or at least a tad more informal versus the hard-edged/serious ploys of years past. I think some sort of humor would be better received than the status-quo overhype snowjob that we've all come to know and "love." :)

There were a lot of us not very "awed" by the Geforce4, and also not particularly happy with how long the NV30 is taking to hit the shelves, so if they try to elevate expectations by their normal means (overhype with zero specifics, which creates speculation), they are just setting themselves up for under-delivery.

On the 9700 Pro vs NV30 arguments: I would think it rather obviously unfounded to speculate boldly either for or against the NV30's performance vs the 9700 Pro. We have no clue REALLY whether the NV30 is 128-bit or 256-bit (just some dodging of the 256-bit question in a couple of interviews), nor do we know what technology it uses, or even whether it's scalable/multi-chip. For all we know, the next NVidia video card could be a serious under-delivery of a 128-bit, 400MHz IMR board, some snazzy Gigapixel/deferred renderer, or some multi-chip/scalable SLI monster that dims the lights in your home when powered up. There is nothing known about this thing other than that NVIDIA is hyping the crap out of it, which they have done with every single chip they've ever released. So it will either be another Geforce4 type of delivery (small, incremental, nothing really new, just faster), or a Geforce256 type of delivery (larger step up, new features, and a growable line of technology).
 
SA said:
There was nothing mentioned about deferring anything. Although an IMR renders immediately it doesn't necessarily need to render everything immediately.
de·fer (dĭ-fûr′)
v. de·ferred, de·fer·ring, de·fers
v. tr.
1. To put off; postpone.

Not rendering immediately = deferring the rendering
Am I wrong?
 
Semantics. There is a difference between a full scene capture and a streaming processor that tries to batch some things together. I think what SA is saying is that there is a middle ground between a full-blown deferred tiler and a straight IMR.

When people in this board say "deferred rendering" they are usually talking about a Kyro/Gigapixel architecture.

However, there are also data structures you can create based on a stream model (i.e. one where you can't look at previous or future values) that require "capturing" or "deferring" only a few primitives at a time.


I'd call a model that defers only a few vertices a "lazy renderer".
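
Something like this toy sketch is what I have in mind (C, with made-up Triangle/rasterize names, nothing to do with any actual hardware): a stream-style renderer that only ever holds a handful of primitives before flushing them, rather than capturing a whole frame.

[code]
/* Toy sketch of a "lazy renderer": primitives are accumulated into a small
 * fixed-size batch and flushed as soon as it fills up, so only a few
 * primitives are ever "deferred" at a time. All names are made up. */
#include <stddef.h>

#define BATCH_SIZE 16              /* tiny capture window, not a whole frame */

typedef struct { float x[3], y[3], z[3]; } Triangle;

static Triangle batch[BATCH_SIZE];
static size_t   batch_count = 0;

/* Stand-in for the actual rasterizer. */
static void rasterize(const Triangle *t) { (void)t; }

static void flush_batch(void)
{
    /* Here the hardware could sort/merge the handful of captured triangles
     * (by screen region, state, whatever) before drawing them. */
    for (size_t i = 0; i < batch_count; ++i)
        rasterize(&batch[i]);
    batch_count = 0;
}

/* Called once per submitted triangle, in stream order. */
void submit_triangle(const Triangle *t)
{
    batch[batch_count++] = *t;
    if (batch_count == BATCH_SIZE) /* flush long before a full scene capture */
        flush_batch();
}

/* At the end of the frame, flush whatever is left over. */
void end_frame(void)
{
    flush_batch();
}
[/code]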
 
arjan de lumens said:
I'm afraid that I don't understand :cry: - I don't see how this can become 'immediate-mode tiling' - sounds more like a partially-deferred scheme to me, and I don't see the connection to tiling. Care to explain further?

I'm just talking about going partway to a fully-deferred renderer, but not all the way. As DemoCoder said, a "lazy renderer" :LOL:

Which doesn't preclude the current ATI algorithm from being within, say, 0.1% of the "best possible" algorithm (although I do believe there is a bit more room than that left)....

I'm absolutely certain there's a hell of a lot more room left. We are really in the infancy of all computer technology, even more so in the 3D graphics field, both in hardware and in software. There are huge advancements to be made everywhere in computers. I see it as an absolute impossibility that a video card using some tech in a first-generation design, and other tech that's a mere one to two years old (by the parent company, anyway), can be anywhere close to optimal.
 
arjan de lumens said:
Got me thinking - where in the basic IMR architecture is there any potential for any improvement over R300? (that is, other than adding brute force: bandwidth, pipelines, texture and vertex units etc)
  • Z-buffering/Early Z-test/Z-compression/hierarchical Z: R300 is ATI's third pass at hierarchical Z - I doubt there is much left to gain here other than in conjunction with bounding volumes.
  • Bounding volumes rejection - may be useful combined with Hierarchical Z - requires developer support.
  • Anisotropic mapping - very little left to gain. Given the kind of performance hit R300 takes when doing aniso, it looks like ATI has actually superseded the Feline algorithm (!).
  • Texture compression - Some room for improvement over S3TC - VQTC looks like a better method in general. Requires some developer support.
  • Immediate mode tiling - requires extensive developer support, as long as OpenGL/Direct3D don't get scene graph support. You can do this on an R300 today, using OpenGL's scissor test to define a 'tile', if you feel so inclined (rough sketch below) :-?
  • Geometry data compression - R300 supports N-patches and displacement mapping, which are, after all, just compact ways to represent complex geometry - other than that, there may be a little room for compressing vertex arrays.
  • Antialiasing - with any given number of samples per pixel, R300's compressed multisampling should be about comparable to Z3 wrt bandwidth usage - Z3 may offer slightly better quality. There are faster AA methods as well, but they tend to require substantial developer effort in order not to break down all the time.
  • Stencil buffer compression - here, there seems to be room for substantial improvements (I guess; Nvidia and ATI have been silent on this issue so far)
  • Framebuffer compression (other than collapsing same-color samples for multisampling) - potential for moderate improvements for skies, featureless walls and other surfaces with sufficiently gradual color changes. Possibly difficult to do efficiently enough to be useful.
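
Just to put the "immediate mode tiling" bullet in concrete terms: a rough sketch using the OpenGL scissor test, assuming a hypothetical draw_scene() callback that re-issues the frame's geometry, so all of the developer support sits in the application rather than the driver.

[code]
/* Rough sketch of "immediate-mode tiling" on today's hardware via the
 * OpenGL scissor test. The application resubmits the scene once per tile;
 * draw_scene() is a hypothetical callback that issues the frame's geometry. */
#include <GL/gl.h>

#define TILE_W 64
#define TILE_H 64

void render_frame_tiled(int fb_width, int fb_height, void (*draw_scene)(void))
{
    glEnable(GL_SCISSOR_TEST);
    for (int y = 0; y < fb_height; y += TILE_H) {
        for (int x = 0; x < fb_width; x += TILE_W) {
            /* Clip all rasterization to the current tile... */
            glScissor(x, y, TILE_W, TILE_H);
            /* ...and resubmit the whole scene for it. A smarter application
             * would cull geometry per tile instead of resending everything. */
            draw_scene();
        }
    }
    glDisable(GL_SCISSOR_TEST);
}
[/code]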

Good points, as I really don't think there is much more low-hanging fruit left to grab easily. It is a matter of silicon cost vs. performance benefit, and the likes of ATI have probably gotten far enough with HyperZ III that you really need to do some sort of sorting for any substantial improvement in performance (relative to the silicon cost).

A substantially more complex algorithm may well be more efficient but also cost way too much silicon for this process generation.

I'm no engineer so I dunno of course, but simple logic dictates that the folks at ATI and nVidia know full well about any semi-deferred rendering tricks out there. If they haven't tried them out by now (even just in test sample designs), they either cost too much in silicon to implement or don't offer enough benefit over LMA II and HyperZ III.

No, let's have a bit of sorting in a nice chunk of cache, what do you say? ;)
 
arjan de lumens said:
Got me thinking - where in the basic IMR architecture is there any potential for any improvement over R300? (that is, other than adding brute force: bandwidth, pipelines, texture and vertex units etc)
  • ...
  • Texture compression - Some room for improvement over S3TC - VQTC looks like a better method in general. Requires some developer support.
  • ...

I don't see VQTC (2bpp) (at least as implemented in Dreamcast) being adopted over S3TC (4bpp or 8bpp) despite the higher compression rate because the hardware engineers don't really like it. The large LUT/Secondary cache needed to hide latency is a bit unpopular. I suppose the PVR-TC might be a replacement candidate but for the moment it's only in PowerVR's MBX and, furthermore, the last time I spoke to MS's DX team they weren't keen on newer texture comp modes.

[Edit] Fixed a spelling error! [/Edit]
 
As far as texture compression goes, I think we need a texture compression format that works better with normal maps.
 
SA, this method you are referring to... does it result in no overdraw and little z-traffic even if geometry is submitted back to front? Are you assuming any prior knowledge of visibility at all?
 
Chalnoth said:
arjan de lumens said:
I'm afraid that I don't understand :cry: - I don't see how this can become 'immediate-mode tiling' - sounds more like a partially-deferred scheme to me, and I don't see the connection to tiling. Care to explain further?

I'm just talking about going partway to a fully-deferred renderer, but not all the way. As DemoCoder said, a "lazy renderer" :LOL:
The way I understand it from your posts: A renderer that gathers a bunch of polygon data, less than a full frame, but still a substantial amount, then renders the data in a more or less traditional tile-based manner, then goes on to collect more data, on and on until the frame is fully rendered. Is this correct?

Which doesn't preclude the current ATI algorithm from being within, say, 0.1% of the "best possible" algorithm (although I do believe there is a bit more room than that left)....

I'm absolutely certain there's a hell of a lot more room left. We are really in the infancy of all computer technology, even more so in the 3D graphics field, both in hardware and in software. There are huge advancements to be made everywhere in computers. I see it as an absolute impossibility that a video card using some tech in a first-generation design, and other tech that's a mere one to two years old (by the parent company, anyway), can be anywhere close to optimal.

Nah. I am a bit sceptical. The fast Z algorithms that ATI use have remained essentially the same over 3 generations of hardware, which to me suggests that any substantial improvement left is either slow (trading compression/decompression speed for bandwidth usage), expensive as hell or totally non-obvious. (I consider this loosely similar to how x86 processors have been stuck at 3 instructions per clock for the last 7 years)
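
(For anyone not familiar with what those fast Z algorithms actually do, here is a rough, software-style sketch of a one-level hierarchical-Z test. The real hardware details aren't public, so the block size and bookkeeping here are made up; it just shows the principle.)

[code]
/* Rough sketch of one-level hierarchical Z: the framebuffer is split into
 * 8x8 blocks and for each block we keep the farthest (maximum) depth stored
 * in it. A triangle whose nearest depth is no nearer than that value cannot
 * pass a GL_LESS depth test anywhere in the block, so all of its pixels
 * there can be rejected without touching the per-pixel Z buffer. */
#include <stdbool.h>

#define BLOCK    8
#define BLOCKS_X (1024 / BLOCK)
#define BLOCKS_Y (768 / BLOCK)

static float block_max_z[BLOCKS_Y][BLOCKS_X];   /* coarse Z buffer */

/* Returns true if the triangle can be skipped entirely for this block.
 * Smaller z = closer (conventional GL_LESS depth test). */
bool hier_z_reject(int bx, int by, float tri_min_z)
{
    return tri_min_z >= block_max_z[by][bx];
}

/* After pixels in a block are actually rendered, the coarse value is
 * updated to the new farthest depth stored in that block. */
void hier_z_update(int bx, int by, float new_block_max_z)
{
    block_max_z[by][bx] = new_block_max_z;
}
[/code]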

As for immediate-mode renderers: If a renderer needs to defer/delay/put off the rendering of even one polygon for any reason whatsoever other than that the renderer is just busy, then the renderer does NOT qualify as an immediate-mode renderer - the 'immediate-mode' aspect is lost when that polygon is deferred. (If you disagree with this restriction on what constitutes an 'immediate-mode renderer', then say so!) For renderers that do defer some, but not all, polygons, 'partially-deferred' is a more appropriate term.
 
arjan de lumens said:
As for immediate-mode renderers: If a renderer needs to defer/delay/put off the rendering of even one polygon for any reason whatsoever other than that the renderer is just busy, then the renderer does NOT qualify as an immediate-mode renderer - the 'immediate-mode' aspect is lost when that polygon is deferred. (If you disagree with this restriction on what constitutes an 'immediate-mode renderer', then say so!) For renderers that do defer some, but not all, polygons, 'partially-deferred' is a more appropriate term.

This is crossing boundaries here because, as we know, if a deferred renderer doesn't have sufficient binning space it will render what's binned prior to an entire frame being captured - how does this differ from an immediate mode renderer deferring some data but not necessarily all?
 
DaveBaumann said:
arjan de lumens said:
As for immediate-mode renderers: If a renderer needs to defer/delay/put off the rendering of even one polygon for any reason whatsoever other than that the renderer is just busy, then the renderer does NOT qualify as an immediate-mode renderer - the 'immediate-mode' aspect is lost when that polygon is deferred. (If you disagree with this restriction on what constitutes an 'immediate-mode renderer', then say so!) For renderers that do defer some, but not all, polygons, 'partially-deferred' is a more appropriate term.

This is crossing boundaries here because, as we know, if a deferred renderer doesn't have sufficient binning space it will render what's binned prior to an entire frame being captured - how does this differ from an immediate mode renderer deferring some data but not necessarily all?
The point here is that when the "immediate-mode" renderer starts deferring any polygon at all, it is no longer doing immediate-mode rendering in any meaningful sense of the term "immediate-mode".
 
arjan de lumens said:
As for immediate-mode renderers: If a renderer needs to defer/delay/put off the rendering of even one polygon for any reason whatsoever other than that the renderer is just busy, then the renderer does NOT qualify as an immediate-mode renderer - the 'immediate-mode' aspect is lost when that polygon is deferred. (If you disagree with this restriction on what constitutes an 'immediate-mode renderer', then say so!) For renderers that do defer some, but not all, polygons, 'partially-deferred' is a more appropriate term.

I don't care. It doesn't matter in the least what you call it. All I'm saying is that I would support a sort of partially-deferred rendering mode, not one that attempts to capture the entire scene.
 
Chalnoth said:
I don't care. It doesn't matter in the least what you call it. All I'm saying is that I would support a sort of partially-deferred rendering mode, not one that attempts to capture the entire scene.

I'd say that it matters to the discussion when a given term is used to mean different things by different people, with the confusion that results. I have a tendency to try to force out clarifications when I feel (rightly or not) that that happens. Sorry if I came across as too confrontational.
 
Simon F said:
I don't see VQTC (2bpp) (at least as implemented in Dreamcast) being adopted over S3TC (4bpp or 8bpp) despite the higher compression rate because the hardware engineers don't really like it. The large LUT/Secondary cache needed to hide latency is a bit unpopular.

Agreed - it becomes a bit of a problem if you want to orthogonally support all texture formats since your LUT is harder to manage when using many texture samplers, and fast access to data in the LUT is at a premium since multi-port memory is expensive. If the format represented a massive improvement over S3TC then I expect it would get more consideration, but I believe that it lies in that hazy area where the quality/performance tradeoffs just don't make enough sense.
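
To make the LUT issue a bit more concrete, here is a very simplified sketch of VQ texture decoding (not the exact Dreamcast layout): every texel fetch turns into an index fetch plus a codebook lookup, and that codebook is the LUT that has to sit in fast, effectively multi-ported on-chip memory if you want several samplers hitting it per clock.

[code]
/* Very simplified sketch of VQ texture decompression (not the exact
 * Dreamcast format). Each 2x2 block of texels is replaced by an 8-bit index
 * into a 256-entry codebook of 2x2 RGB565 blocks, giving roughly 2bpp for
 * large textures. Decoding one texel = one index fetch + one codebook
 * lookup, which is why the codebook wants to live in a fast on-chip LUT. */
#include <stdint.h>

typedef struct { uint16_t texel[2][2]; } CodeEntry;    /* 2x2 block, RGB565 */

typedef struct {
    int        width, height;   /* in texels, multiples of 2            */
    CodeEntry  codebook[256];   /* the LUT the hardware has to cache    */
    uint8_t   *indices;         /* one byte per 2x2 block               */
} VqTexture;

uint16_t vq_fetch_texel(const VqTexture *t, int x, int y)
{
    int     blocks_per_row = t->width / 2;
    uint8_t idx = t->indices[(y / 2) * blocks_per_row + (x / 2)];
    return t->codebook[idx].texel[y & 1][x & 1];
}
[/code]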
 
Is there any significance in the fact that the model shown at about 32 sec uses squares instead of triangles? ;)
 
mboeller said:
A description of PVR-TC would be nice!

Is it based on this? : http://www.acm.org/sigs/sigmm/MM2000/ep/levkovich/

No, they use a form of TC called vector quantization. Basically, the main drawback of the technology is the very long compression times (essentially requiring pre-compression by game developers... something they've been rather hesitant to do). I'm still kind of foggy on how exactly the technique works... at first glance, you'd think it would also be rather poor for realtime graphics (i.e. low decompression time for the entire image, but seemingly more decompression time if you just want a few texels at a time, which is the norm in 3D graphics), but it seems to work.
 
S3TC more or less requires precompression already if you want good quality and decently high-res textures. On-the-fly compression generally doesn't give you optimal quality, but you don't want Q3 to take minutes to load either. Compressing a single 512x512 texture with nVidia's DXTC Photoshop plugin generally takes 3-4 seconds.
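
For comparison, here is roughly what decoding a single S3TC/DXT1 block looks like (the standard 64-bit, 4x4-texel case, ignoring the 3-colour/transparent mode used when color0 <= color1). Decompression is a couple of table builds and lookups per block; it's the compressor's search for good endpoint colours that eats the time, which is where those 3-4 seconds per texture go.

[code]
/* Sketch of DXT1 (S3TC) decompression for one 4x4 block, 4-colour mode
 * only. Decompression is trivial; the expensive part is the compressor's
 * search for the best endpoint pair, hence the long offline compression
 * times when you want good quality. */
#include <stdint.h>

typedef struct { uint8_t r, g, b; } RGB;

static RGB expand565(uint16_t c)
{
    RGB out = { (uint8_t)(((c >> 11) & 31) * 255 / 31),
                (uint8_t)(((c >>  5) & 63) * 255 / 63),
                (uint8_t)(( c        & 31) * 255 / 31) };
    return out;
}

/* block is 8 bytes: color0 and color1 (RGB565, little-endian), then 16
 * 2-bit palette indices. out receives the 16 texels in row-major order. */
void dxt1_decode_block(const uint8_t block[8], RGB out[16])
{
    uint16_t c0 = (uint16_t)(block[0] | (block[1] << 8));
    uint16_t c1 = (uint16_t)(block[2] | (block[3] << 8));
    RGB pal[4];
    pal[0] = expand565(c0);
    pal[1] = expand565(c1);
    /* Two interpolated colours at 1/3 and 2/3 between the endpoints. */
    pal[2].r = (uint8_t)((2 * pal[0].r + pal[1].r) / 3);
    pal[2].g = (uint8_t)((2 * pal[0].g + pal[1].g) / 3);
    pal[2].b = (uint8_t)((2 * pal[0].b + pal[1].b) / 3);
    pal[3].r = (uint8_t)((pal[0].r + 2 * pal[1].r) / 3);
    pal[3].g = (uint8_t)((pal[0].g + 2 * pal[1].g) / 3);
    pal[3].b = (uint8_t)((pal[0].b + 2 * pal[1].b) / 3);

    uint32_t bits = (uint32_t)block[4]         | ((uint32_t)block[5] << 8) |
                    ((uint32_t)block[6] << 16) | ((uint32_t)block[7] << 24);
    for (int i = 0; i < 16; ++i)
        out[i] = pal[(bits >> (2 * i)) & 3];
}
[/code]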
 