The corruption of realtime graphics principles and how things should be *spawn

Kovalevsky

Newcomer
Numeric instability isn't so overwhelming a problem: avoid floats, notice the 4th dimension of space, i.e., scale (hence the quadtree, octree etc.), and when you bisect the half-open [i, i + j), let left = [i, i + (j SAR 1)) & right = [i + (j SAR 1), i + (j SAR 1) + ((j + 1) SAR 1)). Ad hoc C is for real men, C++ is a bad effeminate joke.

Let floats dissolve. Let the immoral combination of object-order & image-order be no more (object-order: you persp. project vertices, image-order: you persp. tex. map).

Something should be made explicit: the current perspective texture mappers are deliberately broken. The texture formats are not fit for image-order algorithms (to which the traditional texture mappers belong), the access isn't neighborly, and the higher the texture resolution the worse it gets (big memory-hierarchy-fighting jumps). Then you have point sampling, a consequence of which is aliasing. (A rectangular picture's element is rectangular. A pixel is, generally, a half-open rectangle [x, x + X) * [y, y + Y); in particular, when X = Y = 1, there is but a point in the pixel.) There are also those proliferous never-acceptably-slow divisions, at least 1 per pixel, otherwise it is not perspective. Even on current machines, a single viewport-filling rectangle at present-day resolutions would slow CPU rendering to a crawl. How ridiculous.

There are quite general solutions.
Did you know that quaternions were found before vectors? (Just like complex numbers were used before Gauss.) Quaternions are a pretext for more algebra, i.e., transistors.

P.S.: The brokenness was called deliberate because the GPU industry needs floats, multiplication, division, Z-buffering, image-order (whether perspective texture mapping or raytracing) & mediocre programmers, so that it may justify parallelism and insinuate that costly, megawattic, pollutant piece of ephemeral junk into your machine. Carmack promoted GPUs, his worshipers followed him & made NVIDIA rich (and Carmack too).
 
That's some cryptic stuff. Hardware texture units actually exploit data locality pretty well (texture data is rarely stored in scanline order).
And relative to the calculations that you absolutely need to perform to get a high-resolution image with high-quality lighting, one reciprocal per drawn pixel to perform perspective correct interpolation is fairly insignificant.
 
Something should be made explicit: the current perspective texture mappers are deliberately broken. The texture formats are not fit for image-order algorithms (to which the traditional texture mappers belong), the access isn't neighborly, and the higher the texture resolution the worse it gets (big memory-hierarchy-fighting jumps). Then you have point sampling, a consequence of which is aliasing. (A rectangular picture's element is rectangular. A pixel is, generally, a half-open rectangle [x, x + X) * [y, y + Y); in particular, when X = Y = 1, there is but a point in the pixel.) There are also those proliferous never-acceptably-slow divisions, at least 1 per pixel, otherwise it is not perspective. Even on current machines, a single viewport-filling rectangle at present-day resolutions would slow CPU rendering to a crawl. How ridiculous.
Higher resolution textures are supported by mipmaps. Mipmaps allow you to "prefilter" the higher resolution data so you can get acceptable quality with good performance. The tradeoff is increased memory footprint (~33% for a square texture), but it beats the alternative (aliasing and bad performance).
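(For the curious, the ~33% figure falls out of the geometric series: each level has a quarter of the texels of the one below it, and 1 + 1/4 + 1/16 + ... = 4/3. A throwaway C sketch, names mine:)

#include <stdio.h>

/* Total texels in a full mip chain for a side x side texture
   (side assumed to be a power of two >= 1). Each level has 1/4
   the texels of the previous one, so the chain tends to 4/3 of
   the base level. */
static unsigned long long mip_chain_texels(unsigned side)
{
    unsigned long long total = 0;
    for (;;) {
        total += (unsigned long long)side * side;
        if (side == 1) break;
        side >>= 1;
    }
    return total;
}

int main(void)
{
    unsigned long long base = 1024ULL * 1024ULL;
    printf("overhead: %.1f%%\n",
           100.0 * (mip_chain_texels(1024) - base) / base);  /* ~33.3% */
    return 0;
}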

Also, textures are "tiled" in memory to allow for efficient memory accesses when filtering, the idea being to get the samples for a bilinear fetch into the same cacheline as much as possible. Applications are free to do their own custom filters with point sampling, but you will rarely be satisfied with the results of a single point sample, and you will lose out on a lot of h/w optimizations that make h/w filtering efficient if you implement a full custom filter in your shader.
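To make "tiled" concrete, here is one common such layout, Morton/Z-order, as an illustrative C sketch (a generic textbook version, not any particular vendor's layout; coordinates assumed to fit in 16 bits):

#include <stdint.h>
#include <stdio.h>

/* Spread the low 16 bits of x so there is a zero bit between each. */
static uint32_t part1by1(uint32_t x)
{
    x &= 0x0000FFFF;
    x = (x | (x << 8)) & 0x00FF00FF;
    x = (x | (x << 4)) & 0x0F0F0F0F;
    x = (x | (x << 2)) & 0x33333333;
    x = (x | (x << 1)) & 0x55555555;
    return x;
}

/* Morton (Z-order) index: bits of y interleaved above bits of x. */
static uint32_t morton2d(uint32_t x, uint32_t y)
{
    return part1by1(x) | (part1by1(y) << 1);
}

int main(void)
{
    printf("%u\n", morton2d(2, 3));  /* prints 14: bits y1 x1 y0 x0 = 1110 */
    return 0;
}

With such a layout, a 2x2 bilinear footprint maps to a handful of nearby indices instead of two texels on distant scanlines.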

Now, one can argue that some textures are prone to aliasing (say, a normal map) because you can't filter them easily. But, hopefully, your accesses aren't too chaotic, or else you can experience aliasing even if you implement your own custom filtering, and performance is likely to suffer as well.
 
Whatever the fixed storage format, it can be defeated by image-based reads. It won't be as fast as traversing headlong (and going to slow RAM only once per n > 1 accesses) & splatting a (say) Morton quadtree. In this respect the MIP pyramid is, again, maliciously ill-formatted: the levels aren't in Morton order (or any such Peano-style sequence). It is just unbelievable.
 
and when you bisect the half-open [i, i + j), let left = [i, i + (j SAR 1)) & right = [i + (j SAR 1), i + (j SAR 1) + ((j + 1) SAR 1))
My experience in math was purely utilitarian until it was too late for me to learn the more advanced things... I have a greater appreciation for it now... So I've got to ask you: what does this mean? I am at a loss.
 
My experience in math was purely utilitarian until it was too late for me to learn the more advanced things... I have a greater appreciation for it now... So I've got to ask you: what does this mean? I am at a loss.
That particular part is pretty straightforward: all it says is that, if you want to bisect an integer range [i, i + j), you do an integer divide of j by 2 (the shift arithmetic right, SAR) and then create 2 intervals with that.
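In C it would look something like this (my sketch, not the poster's code; j assumed non-negative):

#include <stdio.h>

/* Bisect the half-open interval [i, i + j) into left = [i, m) and
   right = [m, i + j), where m = i + (j >> 1). The lengths are
   j >> 1 and (j + 1) >> 1, which always sum to j, so the children
   tile the parent exactly even when j is odd. */
static int bisect(int i, int j)
{
    return i + (j >> 1);
}

int main(void)
{
    int m = bisect(10, 5);  /* [10, 15) -> left [10, 12), right [12, 15) */
    printf("left = [10, %d) len 2, right = [%d, 15) len 3\n", m, m);
    return 0;
}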

What I don't get is the deeper meaning of it all...

"Ad hoc C is for real men, C++ is a bad effeminate joke." Relevancy?
Object-order and image-order immoral? What did they ever do to deserve this?

Then he switches to a rant about, what I believe is, memory locality when accessing textures? Where textures should be stored in Morton order instead of whatever they are now? How are they stored right now anyway? Given that most textures are compressed in rectangular blocks, I doubt that it's a linear scan order. But that's apparently still malicious one way or the other?

What's the alternative to point sampling?
What's the alternative to the perspective correcting division?
There seems to be a rich new world order of GPU masters who are pushing broken solutions?

Whatever the fixed storage format, it can be defeated by image-based reads. It won't be as fast as traversing headlong (and going to slow RAM only once per n > 1 accesses) & splatting a (say) Morton quadtree. In this respect the MIP pyramid is, again, maliciously ill-formatted: the levels aren't in Morton order (or any such Peano-style sequence). It is just unbelievable.
Is this a suggestion to go through the pixels in texture space and then project them to view space? That would still require a division per pixel, and instead of a slightly irregular access pattern for the texture, you'd get an irregular access pattern in your frame buffer. Given the way rasterization is done, that'd probably be worse? You'd get multiple writes to the same destination pixel?

It's all very weird...
 
It is slightly subtler. The lengths aren't necessarily the same: the length of L is j SAR 1 while the length of R is (j + 1) SAR 1, j being a signed integer. Using the same lengths for L & R causes loss of precision over iterations. With big fat floats you always lose precision. What an abomination those floats: they each carry their own exponent, what a waste of space (and of time).

C++ is so falsely abstract it's not even funny. Your mind should be dealing with abstract beings, not your hands: abstract things aren't in space-time, their size here is 0. With C++ one pretends to be abstract while writing endless (not of size 0 in space-time) empty (of size 0 there) words. C is ad hoc, concrete. A C user thinks much and writes little.

An alternative to per-pixel division is scanning the view plane along lines parallel to its intersection with the plane of the polygon under consideration (see e.g. http://advsys.net/ken/download.htm#cubes5). Divisions & point sampling can be dealt with by subdividing, in view space, the object to be projected while narrowing view pyramids. (This has been done with octrees.)
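If I read that right, the trick works because u/w, v/w and 1/w are affine functions of screen (x, y) for a planar polygon, so along screen lines where 1/w is constant only one reciprocal per line is needed. A conceptual C sketch (all names, the Affine struct and the plot callback are mine; it assumes a non-degenerate plane, i.e., a nonzero gradient of 1/w):

#include <math.h>

typedef struct { float A, B, C; } Affine;  /* q(x, y) = A*x + B*y + C */

static float eval(Affine q, float x, float y) { return q.A*x + q.B*y + q.C; }

/* Walk n unit steps from (x0, y0) along the direction where 1/w is
   constant, so w, and hence u and v, step linearly: ONE divide per line. */
void scan_constant_z_line(Affine uow, Affine vow, Affine oow,
                          float x0, float y0, int n,
                          void (*plot)(float x, float y, float u, float v))
{
    float len = sqrtf(oow.A*oow.A + oow.B*oow.B);
    float dx = -oow.B / len, dy = oow.A / len;  /* perpendicular to grad(1/w) */
    float w  = 1.0f / eval(oow, x0, y0);        /* the line's single divide */
    float u  = eval(uow, x0, y0) * w;
    float v  = eval(vow, x0, y0) * w;
    float du = (uow.A*dx + uow.B*dy) * w;       /* linear steps, w constant */
    float dv = (vow.A*dx + vow.B*dy) * w;
    for (int i = 0; i < n; i++) {
        plot(x0 + i*dx, y0 + i*dy, u, v);
        u += du; v += dv;
    }
}

The hard part this sketch leaves out is covering the framebuffer with such slanted lines and clipping them to the polygon, which is where the irregular destination writes discussed above come from.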

Donald Meagher was sabotaged, it is obvious that his renderers should have been preferred.
 
Again, Z order or any fixed order for that matter won't help. Splatting will always be faster than ray tracing (texture mapping, as of now, is but reboundless ray tracing against planar polygons). But perhaps we should be content with WOLF3D's walls? After all, the columns can always be accessed sequentially. (DOOM's floors & ceilings can't, in general.) It's funny how raising the resolution of the input or of the output (textures, viewport) conveniently worsens the situation, requiring bigger and bigger GPUs.
 
They didn't teach us quaternions (http://en.wikipedia.org/wiki/Quaternion) at all (at Helsinki University), not in the first or the second linear algebra course. For most game developers it seems to be enough to understand that a normalized quaternion can represent a rotation (basically a compressed form of a 3x3 orthonormal matrix), and that's it. However, once you really start to work with quaternions on the GPU side, you need to understand how the math actually works (for example to preprocess the mesh properly to make the VS->PS interpolation work) and what kind of values (distribution of numbers) you have in all four channels (in order to pack the quaternions into your g-buffer and/or vertices as tightly as possible while still retaining good quality). Also, quaternions don't work like matrices in all cases (they are not linear to interpolate), so you need to understand more complex math such as dual quaternions (http://en.wikipedia.org/wiki/Dual_quaternion) to perform skinning (or complex higher-order interpolation). GPU Pro 5 had quite a nice article about a quaternion-based GPU pipeline (a nice tutorial to get started).
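As a minimal taste of "how the math actually works", rotating a vector by a normalized quaternion looks like this in C (a textbook sketch with my own names, using the expanded form v' = v + 2w(r x v) + 2 r x (r x v)):

#include <stdio.h>

typedef struct { float x, y, z; } Vec3;
typedef struct { float w, x, y, z; } Quat;  /* assumed unit length */

static Vec3 cross(Vec3 a, Vec3 b)
{
    Vec3 c = { a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x };
    return c;
}

/* v' = v + 2*w*(r x v) + 2*(r x (r x v)), with r = (q.x, q.y, q.z). */
static Vec3 quat_rotate(Quat q, Vec3 v)
{
    Vec3 r = { q.x, q.y, q.z };
    Vec3 t = cross(r, v);
    Vec3 u = cross(r, t);
    Vec3 o = { v.x + 2.0f*(q.w*t.x + u.x),
               v.y + 2.0f*(q.w*t.y + u.y),
               v.z + 2.0f*(q.w*t.z + u.z) };
    return o;
}

int main(void)
{
    Quat q = { 0.7071068f, 0.0f, 0.0f, 0.7071068f };  /* 90 deg about z */
    Vec3 v = { 1.0f, 0.0f, 0.0f };
    Vec3 o = quat_rotate(q, v);
    printf("%f %f %f\n", o.x, o.y, o.z);  /* ~ (0, 1, 0) */
    return 0;
}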

I think most of the discussion here has been about the math needed in graphics rendering (geometry and vector math). Physics simulation also needs quite a bit of math: you need to understand differential calculus (often performed with numerical methods in real-time simulations), among other things. Modern microfacet material models (http://simonstechblog.blogspot.fi/2011/12/microfacet-brdf.html) also have quite a complex mathematical background (differential calculus + physical light models) that you need to understand in order to improve the formulas (or to split them into lookup + calculation to optimize them for real-time usage). And you of course need to understand the basics of statistics: importance sampling is currently used everywhere (http://en.wikipedia.org/wiki/Importance_sampling).

One important topic that is not well covered in university studies is numeric instability. It is very important when you are implementing any computational geometry algorithm. Often the most straightforward algorithm is not practically viable, because it does not work well in all corner cases (because of numeric instability). A pathfinding mesh, for example, is nowadays often produced by voxelizing the level and then converting the voxel representation to the pathfinding mesh, instead of using complex (and unstable) computational geometry algorithms to create the union from the triangle soup. Voxelization can also be used instead of convex decomposition to generate collision meshes for objects (http://kmamou.blogspot.fi/2011/10/hacd-hierarchical-approximate-convex.html, https://www.graphics.rwth-aachen.de/media/teaching_files/mueller_siggraph12.pdf).
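A two-line illustration of the kind of corner case meant here (example mine): floating-point addition is not associative, so two mathematically identical expressions can disagree, which is exactly what breaks naive geometric predicates.

#include <stdio.h>

int main(void)
{
    float a = 1e20f, b = -1e20f, c = 1.0f;
    /* Mathematically both are 1, but the right-hand grouping absorbs
       c into b before a can cancel it: prints "1 vs 0". */
    printf("%g vs %g\n", (a + b) + c, a + (b + c));
    return 0;
}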

It is true that a lot of math is hidden in physics simulation, and I might plan a second seminar on these aspects, especially since numerics is the name of my professorship :)
However, again, it is difficult to find good books about physics simulation in games (I have one, but it is too simple).

The book I ordered

http://www.mathfor3dgameprogramming.com/

also has a chapter about fluid dynamics and cloth simulation at least...
 
What an abomination those floats: they each carry their own exponent, what a waste of space (and of time).
More often than not, HW uses non-float representations internally, exactly for this reason. Depending on where you are in the pipeline, things like shared exponents and whatnot can be used, or completely different representations (fixed point, residual) can be employed. Different types of HW need to work on an agreed-upon, shared representation of the input. Floats are pretty good for that.
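As an illustration of one such non-float representation, here is a minimal 16.16 fixed-point sketch (mine, purely illustrative): the exponent is implicit and global instead of being stored per value.

#include <stdint.h>
#include <stdio.h>

typedef int32_t fx;                 /* 16.16 fixed point */
#define FX_ONE (1 << 16)

static fx     fx_from_double(double d) { return (fx)(d * FX_ONE); }
static double fx_to_double(fx a)       { return a / (double)FX_ONE; }
/* Widen to 64 bits so the intermediate product/quotient doesn't overflow. */
static fx     fx_mul(fx a, fx b)       { return (fx)(((int64_t)a * b) >> 16); }
static fx     fx_div(fx a, fx b)       { return (fx)((((int64_t)a) << 16) / b); } /* b != 0 */

int main(void)
{
    fx a = fx_from_double(3.25), b = fx_from_double(0.5);
    printf("%f %f\n", fx_to_double(fx_mul(a, b)),   /* 1.625 */
                      fx_to_double(fx_div(a, b)));  /* 6.5 */
    return 0;
}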
 
After some googling, I'm confident that he's been reading this paper:
http://fab.cba.mit.edu/classes/S62.12/docs/Meagher_octree.pdf

It touches all the major points: division, floats, ...

It doesn't touch on C vs. C++, which is understandable because the paper was written in 1981. And I still don't see the relevance of that aspect in a discussion about GPU architecture anyway...

So, basically, he wants to store everything in an octree and take everything from there.

From the days when I was interested in ray tracing, I remember that one of the problems with octrees is that they work very well with static worlds, but not so much with dynamic ones. And they consume more memory. This is exactly what this paper mentions as well: memory consumption is roughly linear in the surface area of the object, and random transformations are complicated. (If I'm not mistaken, they require a full rebuild of the octree, which would be a huge cost in terms of memory BW.)

I wonder how, in an octree-only world, you'd deal with thousands of trees in a landscape, all of them with slightly different orientations? And what would you do about vertex shaders that can instantly transform an object into a different position?

As somebody else said: the fact that the outside representation is in float doesn't mean that floats are used at the lower level. It's a given that they switch to fixed point.

And you didn't address the fact that scanning in view space instead of screen space makes efficient parallelization much harder and results in less ideal memory accesses in the frame buffer.

If things are done in a certain way today, after decades of innovation, it's probably because it converged to a near optimal solution. Maybe not the best, maybe far from ideal for a specific class of problems, but near optimal in general.

As for texturing: you should buy a contemporary game sometime: the games of the last 15 years don't look at all like Wolf3D. And the quality did not improve because of higher resolution; it improved because bilinear texturing became trilinear, because point sampling became mipmapping, and because anisotropic filtering was added. It doesn't matter in what order you scan your texture: that filtering is always needed.
 
Whatever the fixed storage format, it can be defeated by image-based reads. It won't be as fast as traversing headlong (and going to slow RAM only once per n > 1 accesses) & splatting a (say) Morton quadtree. In this respect the MIP pyramid is, again, maliciously ill-formatted: the levels aren't in Morton order (or any such Peano-style sequence). It is just unbelievable.

What on earth are you talking about? (Admittedly I'm speaking of the graphics hardware I've worked on) The levels are in Morton or at least tiled order.
 
No matter the format, accessing it by way of the pixels' detour won't be neighborly in general (observing a wall at a sufficiently grazing angle will make the next pixel's texture sample land rather far from the current one).

Splatting, especially the hierarchical front-to-back variety, is much better. Unlimited Detail is Meagher's early-'80s algorithm, more precisely a combination of his perspective & orthographic splatters. (Like perspective-incorrect texture mapping, where you linearly interpolate between two "correct" UV samples while, in parallel, the FPU is doing the perspective division for a future "correct" UV sample.)
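For reference, that parenthetical sounds like the classic subdivided-span mapper: one true perspective divide every SPAN pixels, linear interpolation in between. A sketch under my own assumptions (the names, the SPAN constant and the texel callback are illustrative, and the span is assumed to stay in front of the viewer):

/* u/w, v/w and 1/w step linearly across a horizontal span; a real
   perspective divide happens only at run boundaries, and texture
   coordinates are interpolated linearly in between. */
enum { SPAN = 16 };

void draw_span(int n, float uow, float vow, float oow,
               float duow, float dvow, float doow,
               void (*texel)(float u, float v))
{
    float w = 1.0f / oow;               /* divide at the span start */
    float u = uow * w, v = vow * w;
    while (n > 0) {
        int run = n < SPAN ? n : SPAN;
        float oow2 = oow + doow * run;
        float w2   = 1.0f / oow2;       /* ONE divide for the whole run */
        float u2 = (uow + duow * run) * w2;
        float v2 = (vow + dvow * run) * w2;
        /* run is usually the constant SPAN, so this is a multiply
           by a precomputed reciprocal in real implementations. */
        float du = (u2 - u) / run, dv = (v2 - v) / run;
        for (int i = 0; i < run; i++) {
            texel(u, v);
            u += du; v += dv;
        }
        u = u2; v = v2;
        uow += duow * run; vow += dvow * run; oow = oow2;
        n -= run;
    }
}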
 
No matter the format, accessing it by way of the pixels' detour won't be neighborly in general (observing a wall at a sufficiently grazing angle will make the next pixel's texture sample land rather far from the current one).
That's one of the reasons (though not the main one) they invented mipmaps.

Splatting, especially the hierarchical front-to-back variety, is much better. Unlimited Detail is Meagher's early-'80s algorithm, more precisely a combination of his perspective & orthographic splatters. (Like perspective-incorrect texture mapping, where you linearly interpolate between two "correct" UV samples while, in parallel, the FPU is doing the perspective division for a future "correct" UV sample.)
For those of us who are not familiar with the path of enlightenment and who want to avoid the immoral ways of contemporary GPUs: can you provide links to your bibl^H^Hsome papers?
 
That particular part is pretty straightforward: all it says is that, if you want to bisect an integer range [i, i + j), you do an integer divide of j by 2 (the shift arithmetic right, SAR) and then create 2 intervals with that.
Sorry to bother you again, silent_guy, but I wasn't clear with my question, so if you or anyone else can spare a moment: thank you.
I guess what I'm asking is does this:
Numeric instability isn't so overwhelming a problem: avoid floats, notice the 4th dimension of space, i.e., scale (hence the quadtree, octree etc.), and when you bisect the half-open [i, i + j), let left = [i, i + (j SAR 1)) & right = [i + (j SAR 1), i + (j SAR 1) + ((j + 1) SAR 1))
actually mean anything???
Moreover, what is the relation of this: "when you bisect the half-open [i, i + j), let left = [i, i + (j SAR 1)) & right = [i + (j SAR 1), i + (j SAR 1) + ((j + 1) SAR 1))"
with this: "notice the 4th dimension of space, i.e., scale", and in particular why bring up this: "(hence the quadtree, octree etc.)"??
Is the integer interval a single dimension of a bounded volume in a quadtree/octree? And by scale, is this some roundabout way of saying numerical precision (i.e., the smallest increment possible)?
 
actually mean anything???
Moreover, what is the relation of this: "when you bisect the half-open [i, i + j), let left = [i, i + (j SAR 1)) & right = [i + (j SAR 1), i + (j SAR 1) + ((j + 1) SAR 1))"
with this: "notice the 4th dimension of space, i.e., scale", and in particular why bring up this: "(hence the quadtree, octree etc.)"??
Is the integer interval a single dimension of a bounded volume in a quadtree/octree? And by scale, is this some roundabout way of saying numerical precision (i.e., the smallest increment possible)?
In this case, I think(!) scale means the hierarchical level of the tree. And when you descend into the tree, you don't split a node exactly in half; you do it integer-wise, with those 2 half-open intervals for each dimension, instead of in floating point (because that'd be immoral).
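To connect the dots, here is what that reading implies in code (my sketch, not the poster's): descending a quadtree node over [x, x + w) x [y, y + h) applies that integer bisection to each axis, so the four children tile the parent exactly even when w or h is odd, with no floats anywhere.

/* Split [x, x + w) x [y, y + h) into 4 half-open children; sizes are
   assumed non-negative. Left/top sizes are w >> 1 and h >> 1, and the
   right/bottom sizes w - (w >> 1) = (w + 1) >> 1 pick up the remainder. */
typedef struct { int x, y, w, h; } Rect;

static void split4(Rect p, Rect out[4])
{
    int wl = p.w >> 1, hl = p.h >> 1;
    Rect a = { p.x,      p.y,      wl,       hl       };
    Rect b = { p.x + wl, p.y,      p.w - wl, hl       };
    Rect c = { p.x,      p.y + hl, wl,       p.h - hl };
    Rect d = { p.x + wl, p.y + hl, p.w - wl, p.h - hl };
    out[0] = a; out[1] = b; out[2] = c; out[3] = d;
}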

But, yeah, after his last reply (which got deleted by the moderators), I think there's no point in continuing the conversation.
 
From the fifteenth century on, the methods of projection make a further advance. Jan van Eyck in the great altar painting in Ghent makes use of the laws of perspective, e.g., in the application of the vanishing point, but without a mathematical grasp of these laws. This is first accomplished by Albrecht Dürer, who in his Underweysung der messung mit dem zirckel und richtscheyt makes use of the point of sight and distance-point and shows how to construct the perspective picture from the ground plan and elevation. In Italy perspective was developed by the architect Brunelleschi and the sculptor Donatello.
The first work upon this new theory is due to the architect Leo Battista Alberti. In this he explains the perspective image as the intersection of the pyramid of visual rays with the picture-plane. He also mentions an instrument for constructing it, which consists of a frame with a quadratic net-work of threads and a similar net-work of lines upon the drawing surface. He also gives the method of the distance-point as invented by him, by means of which he then pictures the ground divided into quadratic figures. This process received a further extension at the hands of Piero della Francesca, who employed the vanishing points of arbitrary horizontal lines. (A Brief History of Mathematics, an authorized translation of Dr. Karl Fink's Geschichte der Elementar-Mathematik, 1900)

Decades of innovation? Mod rewrite: the principles of current realtime graphics were conceived in the fifteenth century, as per my quote above. The current approach is based on archaic thinking and is vastly inferior to an optimal approach.
 
Often things are the way they are more because of inertia than for any other reason.
LCD screens were feasible in the '70s, but they only showed up 30 years later, because there was no incentive to invest in that technology when the manufacturers could still make a lot of profit without it.
It's a bit like rasterization: a lot of time and effort has been put into it, but that doesn't mean it's nearly optimal in any way; it's just a massive bag of tricks. Ray tracing is a lot more elegant and simple, but doesn't have the performance yet. (What if as much money had been spent on it instead? Would we only have ray casters today?)
Or you can compare that to C++: people use it because of inertia. Today not even game devs should use it; it's really no good and a very bad match for current multicore machines...

That said, treating people as weaklings or imbeciles isn't exactly helpful and is therefore rather discouraged around here.

Also we are completely off-topic now...
 
Should've known from the start that it was going to turn out to be about nothing but the same unf***ing limited detail s***.
 