The corruption of realtime graphics principles and the how things should be *spawn

Discussion in 'Rendering Technology and APIs' started by Kovalevsky, Dec 12, 2014.

1. Kovalevsky Newcomer

Joined:
May 28, 2014
Messages:
9
0
Numeric instability isn't so overwhelming a problem: avoid floats, notice the 4th dimension of space i.e., scale (hence the quadtree, octree etc.) and when you bisect the half-open [i, i + j), let left = [i, i + (j SAR 1)) & right = [i + (j SAR 1), i + (j SAR 1) + ((j + 1) SAR 1)). Ad hoc C is for real men, C++ is a bad effeminate joke.

Let floats dissolve. Let the immoral combination of object-order & image-order be no more (object-order: you persp. project vertices, image-order: you persp. tex. map).

Something should be made explicit: the current persp. texture mappers are deliberately broken. The tex. formats are not fit for image-order algorithms (to which the traditional tex. mappers belong), the access isn't neighborly and the higher the tex. resolution the worse it gets (big mem. hierarchy-fighting jumps). Then you have point-sampling, a consequence of which is aliasing. (A rectangular picture's element is rectangular. A pixel is, generally, a half-open rectangle [x, x + X) × [y, y + Y). In particular when X = Y = 1, there is but a point in the pixel.) There are also those proliferous never-acceptably-slow divisions, at least 1 per pixel, otherwise it is no perspective. Even on the current machines, a single viewport-filling rectangle at the present-day resolutions would slow down CPU rendering to a crawl. How ridiculous.

There are quite general solutions.
Did you know that quaternions were found before vectors? (Just like the complex num. were used before Gauss). Quaternions are a pretext for more algebra i.e., transistors.

P.S.: The brokenness was called deliberate because the GPU industry needs floats, multiplication, division, Z-buffering, image-order (whether persp. tex. mapping or raytracing) & mediocre programmers so that it may justify parallelism and insinuate that costly megawattic pollutant piece of ephemeral junk into your machine. Carmack promoted GPUs, his worshipers followed him & made rich NVIDIA (and Carmack).

#1
2. Xmas Porous Veteran Subscriber

Joined:
Feb 6, 2002
Messages:
3,314
140
Location:
On the path to wisdom
That's some cryptic stuff. Hardware texture units actually exploit data locality pretty well (texture data is rarely stored in scanline order).
And relative to the calculations that you absolutely need to perform to get a high-resolution image with high-quality lighting, one reciprocal per drawn pixel to perform perspective correct interpolation is fairly insignificant.
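To make the "one reciprocal per drawn pixel" point concrete, here is a minimal Python sketch of perspective-correct interpolation along a single scanline span. The function name and its parameters are hypothetical and not taken from any API; it assumes each span endpoint carries a texture coordinate u and a clip-space w.

```python
def perspective_correct_span(u0, w0, u1, w1, n):
    """Return n perspective-correct u values across a span (n >= 2).

    u/w and 1/w vary linearly in screen space, so both are interpolated
    linearly and u is recovered with a single reciprocal per pixel.
    """
    a0, a1 = u0 / w0, u1 / w1      # u/w at the two endpoints
    b0, b1 = 1.0 / w0, 1.0 / w1    # 1/w at the two endpoints
    out = []
    for k in range(n):
        t = k / (n - 1)
        u_over_w = a0 + (a1 - a0) * t
        one_over_w = b0 + (b1 - b0) * t
        recip = 1.0 / one_over_w   # the one reciprocal for this pixel
        out.append(u_over_w * recip)
    return out


# Endpoints at w = 1 and w = 3: the screen-space midpoint maps to
# u = 0.25, not the 0.5 that plain linear interpolation would give.
span = perspective_correct_span(0.0, 1.0, 1.0, 3.0, 3)
```

The per-pixel cost is two linear steps, one reciprocal and one multiply, which is the quantity being compared against the rest of the shading work.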

#2
Simon F likes this.
3. OpenGL guy Veteran

Joined:
Feb 6, 2002
Messages:
2,357
28
Higher resolution textures are supported by mipmaps. Mipmaps allow you to "prefilter" the higher resolution data so you can get acceptable quality with good performance. The tradeoff is increased memory footprint (~33% for a square texture), but it beats the alternative (aliasing and bad performance).
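The ~33% figure follows from each mip level holding a quarter of the texels of the level above it, so the chain sums to roughly 1 + 1/4 + 1/16 + ... = 4/3 of the base level. A small Python sketch of that arithmetic (the helper name is hypothetical, and it assumes a full chain down to 1x1 with no padding or alignment overhead):

```python
def mip_chain_texels(width, height):
    """Total texel count of a full mip chain, halving each level down to 1x1."""
    total = 0
    while True:
        total += width * height
        if width == 1 and height == 1:
            break
        width = max(1, width // 2)
        height = max(1, height // 2)
    return total


base = 1024 * 1024
overhead = mip_chain_texels(1024, 1024) / base - 1.0  # close to 1/3
```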

Also, textures are "tiled" in memory to allow for efficient memory accesses when filtering. The idea being to get samples for a bilinear fetch into the same cacheline as much as possible. Applications are free to do their own custom filters with point sampling, but you will rarely be satisfied with the results of a single point sample and you will lose out on a lot of h/w optimizations that make h/w filtering efficient if you implement a full custom filter in your shader.
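As an illustration of why a tiled layout helps a bilinear fetch, here is a sketch comparing plain scanline addressing with a Morton (Z-order) layout. This is only one possible tiling; actual hardware layouts are vendor-specific, so treating Morton order as "the" layout is an assumption, and both function names are hypothetical.

```python
def morton_encode(x, y, bits=16):
    """Interleave the low `bits` bits of x and y (x goes to even bit positions)."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


def linear_index(x, y, width):
    """Scanline-order texel index."""
    return y * width + x


# A 2x2 bilinear footprint whose top-left texel is (2, 2).
footprint = [(2, 2), (3, 2), (2, 3), (3, 3)]
morton = [morton_encode(x, y) for x, y in footprint]
linear = [linear_index(x, y, 1024) for x, y in footprint]
```

For this footprint the four Morton indices are consecutive, while in scanline order the two rows sit a full texture width apart. Footprints that straddle a tile boundary do not get this benefit, which is why the goal is stated as "as much as possible".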

Now, one can argue that some textures are prone to aliasing (say a normal map) because you can't filter them easily. But, hopefully, your accesses aren't too chaotic or else you can experience aliasing even if you implement your own custom filtering and performance is likely to suffer as well.

#3
4. Kovalevsky Newcomer

Joined:
May 28, 2014
Messages:
9
0
Whatever the fixed storage format, it can be defeated by image-based reads. It won't be as fast as traversing headlong (and going to slow RAM only once per n > 1 accesses) & splatting a (say) Morton quadtree. In this respect the MIP pyramid is, again, maliciously ill-formatted: the levels aren't in Morton order (or any such Peano style sequence). It just is unbelievable.

#4
5. Infinisearch Veteran Regular

Joined:
Jul 22, 2004
Messages:
739
139
Location:
USA
My experience in math was purely utilitarian until it was too late for me to learn the more advanced things... I have a greater appreciation for it now... So I've got to ask you - what does this mean? I am at a loss.

#5
6. silent_guy Veteran Subscriber

Joined:
Mar 7, 2006
Messages:
3,754
1,380
That particular part is pretty straightforward: all it says is that, if you want to bisect an integer range [i, i+j), you do an integer divide by 2 (shift arithmetic right) of j and then create 2 intervals with that.

What I don't get is the deeper meaning of it all...

"Ad hoc C is for real men, C++ is a bad effeminate joke." Relevancy?
Object-order and image-order immoral? What did they ever do to deserve this?

Then he switches to a rant about, what I believe is, memory locality when accessing textures? Where textures should be stored in Morton order instead of what it is now? How are they stored right now anyway? Given that most textures are compressed in rectangular blocks, I doubt that it's a linear scan order. But that's apparently still malicious one way or the other?

What's the alternative to point sampling?
What's the alternative to the perspective correcting division?
There seems to be a rich new world order of GPU masters who are pushing broken solutions?

Is this a suggestion to go through the pixels in texture space and then project them to view space? That would still require a division per pixel, and instead of a slightly irregular access pattern for the texture, you'd get an irregular access pattern in your frame buffer. Given the way rasterization is done, that'd probably be worse? You'd get multiple writes to the same destination pixel?

It's all very weird...

#6
7. Rodéric a.k.a. Ingenu Moderator Veteran

Joined:
Feb 6, 2002
Messages:
4,031
898
Location:
Planet Earth.
#7
8. Kovalevsky Newcomer

Joined:
May 28, 2014
Messages:
9
0
It is slightly subtler. The lengths aren't necessarily the same, the length of L is j SAR 1 while the length of R is (j + 1) SAR 1, j being a signed integer. Using the same lengths for L & R causes loss of precision in iterations. With big fat floats you always lose prec. What an abomination those floats: they each carry their exponent, what a waste of space (and of time).
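The split described here can be written down in a few lines. This is a minimal sketch with a hypothetical function name; it relies on Python's `>>` being an arithmetic (flooring) shift on integers, which matches SAR, so that `(j >> 1) + ((j + 1) >> 1) == j` for every integer j.

```python
def bisect_half_open(i, j):
    """Split the half-open integer interval [i, i + j) into left and right parts.

    The left part has length j SAR 1 and the right part (j + 1) SAR 1,
    so the two lengths always add up to j and no element is lost or duplicated.
    """
    left_len = j >> 1
    right_len = (j + 1) >> 1
    mid = i + left_len
    return (i, mid), (mid, mid + right_len)


left, right = bisect_half_open(0, 5)   # odd length: parts of length 2 and 3
```

With an odd j the right part is one longer than the left; using `j >> 1` for both would drop the last element at every odd split, which is the loss of precision over iterations that the post refers to.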

C++ is so falsely abstract it's not even funny. Your mind should be dealing with abstract beings, not your hands: abstract things aren't in space-time, their size here is 0. With C++ one pretends to be abstract while writing endless (not of size 0 in space-time) empty (of size 0 there) words. C is ad hoc, concrete. A C user thinks much and writes little.

An alternative to per-pixel division is scanning the view plane along lines parallel to its intersection with the plane of the polygon under consideration. (See e.g., http://advsys.net/ken/download.htm#cubes5). Divisions & point sampling can be dealt with by subdividing, in view space, the object to be projected while narrowing view pyramids. (Has been done with octrees).

Donald Meagher was sabotaged, it is obvious that his renderers should have been preferred.

#8
9. Kovalevsky Newcomer

Joined:
May 28, 2014
Messages:
9
0
Again, Z order or any fixed order for that matter won't help. Splatting will always be faster than ray tracing (texture mapping as of now is but reboundless ray tracing against planar polygons). But perhaps we should be content with WOLF3D's walls? After all the columns can always be accessed sequentially. (DOOM's floors & ceilings can't in general). It's funny how augmenting the resolution of the input or of the output (textures, viewport) conveniently embitters the situation, requiring bigger and bigger GPUs.

#9
10. Billy Idol Legend Veteran

Joined:
Mar 17, 2009
Messages:
5,980
819
Location:
Europe
It is true that a lot of math is hidden in the physics simulation, and I might plan a second seminar on these aspects, especially since numerics is the name of my professorship.
However, again, it is difficult to find good books about physics simulation in games (I have one, but it is too simple).

The book I ordered

http://www.mathfor3dgameprogramming.com/

also has a chapter about fluid dynamics and cloth simulation at least...

#10
11. Dominik D Regular

Joined:
Mar 23, 2007
Messages:
782
22
Location:
Wroclaw, Poland
More often than not, HW uses non-float representations internally, exactly for this reason. Depending on where in the pipeline, things like shared exponent and what not can be used, or completely different representations (fixed point, residual) can be employed. Different types of HW need to work on agreed-upon, shared representations of the input. Floats are pretty good for that.

#11
12. silent_guy Veteran Subscriber

Joined:
Mar 7, 2006
Messages:
3,754
1,380
After some googling, I'm confident that he's been reading this paper:
http://fab.cba.mit.edu/classes/S62.12/docs/Meagher_octree.pdf

It touches all the major points: division, floats, ...

It doesn't touch C vs. C++ which is understandable because the paper was written in 1981. And I still don't see the relevance of that aspect in a discussion about GPU architecture anyway...

So, basically, he wants to store everything in an octree and take everything from there.

From the days when I was interested in ray tracing, I remember that one of the problems with octrees is that they work very well with static worlds, but not so much with dynamic ones. And they consume more memory. This is exactly what this paper mentions as well: memory consumption is roughly linear with the surface area of the object, and random transformations are complicated. (If I'm not mistaken, they require a full rebuild of the octree, which would be a huge cost in terms of memory BW.)

I wonder how, in an octree-only world, you'd deal with thousands of trees in a landscape, all of them with slightly different orientation? How would you deal with vertex shaders that can instantly transform an object into different positions?

As somebody else said: the fact that the outside representation is in float, doesn't mean that floats are used at the lower level. It's a given that they switch to fixed point.

And you didn't address the fact that scanning in view space instead of screen space makes efficient parallelization much harder and results in less ideal memory accesses in the frame buffer.

If things are done in a certain way today, after decades of innovation, it's probably because it converged to a near optimal solution. Maybe not the best, maybe far from ideal for a specific class of problems, but near optimal in general.

As for texturing: you should buy a contemporary game sometime: the games of the last 15 years don't look at all like Wolf3D. And the quality did not improve because of higher resolution, it improved because bilinear texturing became trilinear. Because point sampling became mip-mapping, and because anisotropic filtering was added. It doesn't matter in what order you scan your texture: that filtering is always needed.

#12
13. Simon F Tea maker Moderator Veteran

Joined:
Feb 8, 2002
Messages:
4,560
157
Location:
In the Island of Sodor, where the steam trains lie
What on earth are you talking about? (Admittedly I'm speaking of the graphics hardware I've worked on) The levels are in Morton or at least tiled order.

#13
OpenGL guy likes this.
14. Kovalevsky Newcomer

Joined:
May 28, 2014
Messages:
9
0
No matter the format, accessing it by way of the pixels' detour won't be neighborly in general (observing a wall at a sufficient angle will result in the next pixel's tex. sample being rather far from the current one).

Splatting, especially the hierarchical front-to-back variety, is much better. Unlimited Detail is Meagher's early '80s algorithm. More precisely a combination of his perspective & orthographic splatters. (Like perspective incorrect texture mapping where you linearly interpolate between two "correct" UV samples while, i.e. in parallel, the FPU is making the perspective division for a future "correct" UV sample).
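The texture-mapping analogy in the parenthesis can be sketched as follows: an exact, divided u is computed only every `step` pixels and the pixels in between are filled by linear interpolation. This is a sequential Python illustration with hypothetical names; it does not model the overlap of the division with the inner loop, only the arithmetic.

```python
def subdivided_affine_span(u0, w0, u1, w1, n, step=8):
    """Approximate perspective-correct u over n pixels (n >= 2).

    An exact u (one division) is computed only every `step` pixels;
    the pixels in between are linearly interpolated.
    """
    def exact_u(k):
        t = k / (n - 1)
        u_over_w = u0 / w0 + (u1 / w1 - u0 / w0) * t
        one_over_w = 1.0 / w0 + (1.0 / w1 - 1.0 / w0) * t
        return u_over_w / one_over_w   # the perspective division

    out = []
    k0, ua = 0, exact_u(0)
    while k0 < n - 1:
        k1 = min(k0 + step, n - 1)
        ub = exact_u(k1)               # next "correct" sample
        for k in range(k0, k1):
            out.append(ua + (ub - ua) * (k - k0) / (k1 - k0))
        k0, ua = k1, ub
    out.append(ua)
    return out


exact = subdivided_affine_span(0.0, 1.0, 1.0, 3.0, 3, step=1)
coarse = subdivided_affine_span(0.0, 1.0, 1.0, 3.0, 3, step=8)
```

With `step=1` every pixel is exact; with a step at least as long as the span the result degenerates to plain affine interpolation, so the step size trades divisions against visible distortion.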

#14
15. silent_guy Veteran Subscriber

Joined:
Mar 7, 2006
Messages:
3,754
1,380
That's one of the reasons (though not the main one) they invented mipmaps.

For those of us who are not familiar with the path of enlightenment and who want to avoid the immoral ways of contemporary GPUs: can you provide links to your bibl^H^Hsome papers?

#15
Simon F likes this.
16. Infinisearch Veteran Regular

Joined:
Jul 22, 2004
Messages:
739
139
Location:
USA
Sorry to bother you again silent_guy but I wasn't clear with my question, so if you or anyone else can spare a moment thank you.
I guess what I'm asking is does this:
actually mean anything???
Moreover what is the relation of this "when you bisect the half-open [i, i + j), let left = [i, i + (j SAR 1)) & right = [i + (j SAR 1), i + (j SAR 1) + ((j + 1) SAR 1))"
with this "notice the 4th dimension of space i.e., scale" and in particular why bring up this: "(hence the quadtree, octree etc.)"??
Is the integer interval a single dimension of a bounded volume in a quadtree/octree? and by scale is this some roundabout way of saying numerical precision? (ie smallest increment possible)

#16
17. silent_guy Veteran Subscriber

Joined:
Mar 7, 2006
Messages:
3,754
1,380
In this case, I think(!) scale means hierarchical level of the tree. And when you descend into the tree, you don't split the node exactly in half, but you do it integer-wise with those 2 half-open intervals for each dimension, instead of floating point (because that'd be immoral.)

But, yeah, after his last reply (which got deleted by the moderators), I think there's no point in continuing the conversation.

#17
18. Kovalevsky Newcomer

Joined:
May 28, 2014
Messages:
9
0
Decades of innovation? Mod rewrite: The principles of current realtime graphics were conceived in the fifteenth century as per my above quote. The current approach is based on archaic thinking and vastly inferior to an optimal approach.

#18
Last edited by a moderator: Dec 20, 2014
19. Rodéric a.k.a. Ingenu Moderator Veteran

Joined:
Feb 6, 2002
Messages:
4,031
898
Location:
Planet Earth.
Often things are like they are more because of inertia than any other reason.
LCD screens were feasible in the 70's but they only showed up 30 years later, because there was no incentive to invest in that technology when the manufacturers still could make a lot of profits w/o.
It's a bit like rasterization: a lot of time and effort has been put into it, but that doesn't mean it's nearly optimal in any way. It's just a massive bag of tricks, and ray tracing is a lot more elegant and simple but doesn't have the performance yet. [What if as much money had been spent on it instead? Would we only have ray casters today?]
Or you can compare that to C++, people use it because of inertia, today not even game devs should use it, it's really no good and a very bad match for current multicore machines...

That said, treating people as weaklings or imbeciles isn't exactly helpful and therefore rather discouraged around here.

Also we are completely off-topic now...

#19
Lightman likes this.
20. Laa-Yosh I can has custom title? Legend Subscriber

Joined:
Feb 12, 2002
Messages:
9,568
1,455
Location:
Budapest, Hungary
Should've known from the start that it was going to turn out to be about nothing but the same unf***ing limited detail s***.

#20