Fast perspective warping

krychek

I want to warp a 2D image to give it a perspective effect, so an overhead view would be transformed to look as if it's viewed at an angle. No rotations in the transform. And the image will be shrunk everywhere, so interpolation (magnification) is not required. So the transformation converts an axis-aligned rectangle to a trapezium with its parallel sides parallel to the X axis (horizontal). The image layout in memory is the common one - consecutive memory addresses run across the image horizontally.

I wrote a simple routine that downsamples a line from M pixels to N pixels (M >= N).
By varying N along Y (the vertical axis) I can perform the perspective warping effect, but only along the X axis. This works fine except for the sub-pixel/texel precision issue that I still have to fix.
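A simplified sketch of that kind of routine (one 8-bit channel, plain integer box filter, no sub-texel correction yet - just to show the idea):
Code:
/* Box-filter one source line of M pixels down to N pixels (M >= N).
   One 8-bit channel, integer arithmetic, no sub-texel correction. */
static void downsample_line(const unsigned char *src, int M,
                            unsigned char *dst, int N)
{
    int j, i;
    for (j = 0; j < N; j++) {
        int start = (j * M) / N;        /* first source texel for dst[j]   */
        int end   = ((j + 1) * M) / N;  /* one past the last source texel  */
        int sum = 0;
        for (i = start; i < end; i++)
            sum += src[i];
        dst[j] = (unsigned char)(sum / (end - start));
    }
}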

Now I am not sure what would be a good way to implement the perspective warp in the Y (vertical) direction too. My concern is that it would thrash the cache really badly.

So, should I just downsample lines from the source image one line at a time, and then accumulate them into a single line in the destination?

Or should I swizzle the image into small blocks and for each destination pixel simply fetch the corresponding pixels from the swizzled source image?

Or any other quick hacks?

Since I am playing with this on an embedded system with really bad tools, I cannot easily test various methods, so I wanted some suggestions/hints.

Thanks!
 
It sounds to me like you might be attempting to handle this via forward mapping instead of reverse mapping. You really need to iterate across the pixels in the destination rather than iterating across the pixels in the source to handle general mappings like this well.

As far as your processing concerns go, how fast do you need this to be? Almost all PCs today have hardware that will certainly outperform whatever you write in software.

A little detail about what you are trying to accomplish might make it easier to suggest a solution within the bounds of your requirements.

-Evan
 
Well, this is on an embedded system with no 3D acceleration (not even good enough 2D acceleration). I had implemented a simple texture mapper on the PC some time ago, and there I had used reverse mapping.

I have a program that continually generates 2D images, at around 10 frames per second. Now instead of displaying the 2D overhead view, I want to warp it and make it look like it's being viewed from an angle (and not lose much fps in the process ;)). The key thing is that the mapping is fixed, so the warping is exactly the same every time - it is never gonna change - just a new image has to be warped each frame. And I wanted to take advantage of this.

Because of the hardware limitations - CPU speed, memory speed - and because the warping is fixed and there will be no rotation in the mapping, I wanted to go for forward mapping. I didn't want the overhead of calculating/interpolating texture coordinates, but more significantly I wanted to eliminate non-linear memory access as much as possible so I could be cache friendly. I want it to be fast by making it as cache friendly as possible.
 
which platform is that? if you give us a hint we may know better what ops you can/cannot use. otherwise, w/o any considerations, i'd say use full-blown hyperbolic interpolation for the vertical axis ; )
 
The platform has a RISC processor running at around 600 MHz. Is this info helpful? :) It's not a standard platform. Hyperbolic interpolation of the texture coords (I assume that's what you meant) should not be a problem, as I could precompute a set of texture coords since the mapping is fixed.
 
After thinking about it a bit more, I think your success in doing it with forward mapping is going to be related to how harsh the projection is and how much you care about quality. The problem as I see it is that the kernel will have a trapezoidal shape. This means that if you have two pixels in different rows, trying to reference the same part of a given source row, each will need a different width on the horizontal filter.

Does terrible explanation this make?

-Evan
 
krychek: There are straightforward ways to do forward mapping for rotations/shears (using a two-pass approach), which were used in real-time video effects boxes, but I've not seen that adapted to handle perspective.

Frankly, I think the safest way is to simply use a reverse mapping scheme with a perspective (i.e. a hyperbolic) mapping, i.e. if X and Y are the destination pixel locations and U & V are the source pixel locations, then
Code:
U = (a*X + b*Y + c)/(p*X + q*Y + r)
V = (e*X + f*Y + g)/(p*X + q*Y + r)
where a, b, ... g, p, q & r are constants that depend on the transformation. If you do things on a scanline basis, the b*Y + c etc. parts can be replaced by constants for the scanline.
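A rough sketch of that structure (point sampling only, made-up buffer names, still one divide per pixel):
Code:
/* Reverse-mapped warp, one scanline at a time; the Y-dependent parts of the
   numerators and of the denominator are hoisted out of the inner loop.
   a..g, p, q, r come from the (fixed) transformation; point sampling only. */
static void warp_reverse(const unsigned char *src, int src_w,
                         unsigned char *dst, int dst_w, int dst_h,
                         float a, float b, float c, float e, float f, float g,
                         float p, float q, float r)
{
    int X, Y;
    for (Y = 0; Y < dst_h; Y++) {
        float nu = b * Y + c;   /* Y part of the U numerator   */
        float nv = f * Y + g;   /* Y part of the V numerator   */
        float d  = q * Y + r;   /* Y part of the denominator   */
        for (X = 0; X < dst_w; X++) {
            float w = 1.0f / (p * X + d);           /* perspective divide */
            int u = (int)((a * X + nu) * w);
            int v = (int)((e * X + nv) * w);
            dst[Y * dst_w + X] = src[v * src_w + u];
        }
    }
}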
 
Simon F, I know how to do ordinary reverse mapping with bilinear filtering; that itself is not a problem. But to make it look correct / have good quality, the filter sizes have to vary and could get pretty big, otherwise the image quality would be pretty bad in the distance / there might be the common swimming artifacts (though it won't exactly be swimming if I am getting very low fps :)). A few pixels in the destination image would span a lot of pixels in the source image because of the perspective. If I use correct kernel sizes with reverse mapping, for each pixel in the destination I could be sampling a big group of pixels in the source that lie across multiple horizontal lines! Or, I should compute the required mip levels for every new frame and then do ordinary mipmap filtering! Both of these sound a bit weird to me. Maybe I misled a bit with the "fast" in the title, but I meant fast with decent quality.

But I am also not very sure that the forward mapping method I am leaning towards will actually give me the best quality either.

This makes me feel like there is no good way to do this to get good quality at good speed.

ehart, your explanation has confused me a bit. For a moment, if we forget about performance and all we care about is quality, then shouldn't the kernel be approximately trapezoidal in shape? The set of source pixels which should influence a destination pixel would lie in an approximate trapezoid. But most filtering schemes use elliptical or similarly shaped filters - why?
 
krychek said:
Or, I should compute the required mip levels for every new frame and then do ordinary mipmap filtering!
IMHO, I think that is what you'll have to do.
 
ahem, you did not mention anything about pursuing high quality. now, for surfaces exhibiting high anisotropy, mipmapping is not exactly a high-quality solution. my advice here would be exactly the opposite of Simon's* - either go with reverse w/ bilinear, or better, some forward mapping.


* i guess it's about that time of the year again when we b3d'ers have our traditional argument on the use of mipmapping on slanted surfaces.
 
darkblu said:
ahem, you did not mention anything about pursuing high quality. now, for surfaces exhibiting high anisotropy, mipmapping is not exactly a high-quality solution. my advice here would be exactly the opposite of Simon's* - either go with reverse w/ bilinear, or better, some forward mapping.


* i guess it's about that time of the year again when we b3d'ers have our traditional argument on the use of mipmapping on slanted surfaces.
Indeed.

However, since the post was titled "Fast perspective warping", one would assume some lower bound on the "pixels per second" stands.

Anyway, MIP mapping can still be employed when doing anisotropic filtering - you can still use the MIP maps as a fast way to assemble approximations to large filter kernels. IIRC, Heckbert suggests doing this.
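A sketch of that idea (untested; fetch_mip() is just a stand-in for whatever bilinear MIP fetch is available, and the tap count is up to you):
Code:
/* Approximating a long, narrow footprint with the MIP pyramid: take several
   taps spread along the footprint's long axis, each fetched at the level
   chosen from the short axis, and average them.  fetch_mip() is a placeholder
   for whatever (bi)linear MIP fetch is available. */
extern unsigned char fetch_mip(float u, float v, int level);

static unsigned char sample_aniso(float u0, float v0,  /* footprint centre  */
                                  float du, float dv,  /* long-axis extent  */
                                  int level,           /* from short axis   */
                                  int taps)
{
    int i, sum = 0;
    for (i = 0; i < taps; i++) {
        float t = (i + 0.5f) / taps - 0.5f;   /* -0.5 .. +0.5 along the axis */
        sum += fetch_mip(u0 + t * du, v0 + t * dv, level);
    }
    return (unsigned char)(sum / taps);
}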
 
Baoquan Chen published a paper on forward image warping with a method to do single pass forward mapping without a per pixel divide (memory access pattern is not so great though).
 
MfA said:
Baoquan Chen published a paper on forward image warping with a method to do single pass forward mapping without a per pixel divide (memory access pattern is not so great though).

Presumably that involves running through lines of pixels where
Code:
(p*X + q*Y + r)
is constant? Sounds nasty.
Of course, Doom got away with this sort of approach because the game guaranteed either p or q was zero.

You probably don't need to do a per-pixel divide if you do a single Newton iteration step per pixel to compute the reciprocal, using the previous pixel's answer as the starting point.
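Something along these lines (just a sketch of the scanline loop, names made up):
Code:
/* Avoiding the per-pixel divide: along a scanline the denominator
   d = p*X + q*Y + r only changes by p per pixel, so seed each pixel's
   reciprocal with the previous pixel's answer and refine it with one
   Newton step, w <- w * (2 - d*w). */
static void recip_scanline(float *w_out, int dst_w,
                           float p, float q, float r, int Y)
{
    float d = q * Y + r;        /* denominator at X = 0           */
    float w = 1.0f / d;         /* one real divide per scanline   */
    int X;
    for (X = 0; X < dst_w; X++) {
        w = w * (2.0f - d * w); /* Newton refinement towards 1/d  */
        w_out[X] = w;
        d += p;                 /* denominator of the next pixel  */
    }
}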
 
Simon F said:
Sounds nasty.
Well, parallel lines remain parallel lines. So at least anti-aliasing is relatively easy - you only ever need to worry about 2 lines in the target image ... as long as you are always minifying (which I assume is the case here since it hasn't been mentioned).
 
Okay, I should have made it clearer that what I'm after is fast perspective with good quality, possibly taking advantage of this specific mapping. My mistake.

Further, I should also say that since my mapping is fixed throughout the lifetime of the program, I can assume that finding the texture coordinates is not going to be a problem. I could just have one divide per line, or even that could be eliminated through precalculation. The bad memory access patterns as in Chen's paper don't happen, because horizontal lines in the source will be mapped to horizontal lines in the destination. But for achieving the minification in Y I would access the memory in a somewhat bad pattern.

And the image is always minified in X for sure. While I initially said it's always minification everywhere, I am thinking of relaxing that for Y.

So finally all that remains is processing in a cache friendly way - assuming the accumulation of the samples doesn't become the bottleneck. Which is why I was leaning toward forward mapping: processing the source line by line such that more than one source line can be accumulated with some weight into a single destination line. (By line I mean a horizontal line.)
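A rough sketch of what I have in mind (untested; the per-source-line tables would be precomputed from the fixed mapping, and downsample_line is the kind of horizontal downsampler from the start of the thread):
Code:
/* Forward pass over the source, one horizontal line at a time.  Because the
   mapping is fixed, every source line y has a precomputed destination row,
   left edge, width and blend weight.  dst_acc/dst_wsum hold running weighted
   sums per destination pixel and get normalised at the end of the frame. */
#define MAX_DST_W 1024   /* real display width goes here */

static void warp_forward(const unsigned char *src, int src_w, int src_h,
                         float *dst_acc, float *dst_wsum, int dst_w,
                         const int *dst_row,    /* dest row per source line       */
                         const int *dst_col,    /* dest left edge per source line */
                         const int *dst_width,  /* dest width per source line     */
                         const float *weight)   /* blend weight per source line   */
{
    unsigned char line[MAX_DST_W];   /* horizontally downsampled source line */
    int y, x;
    for (y = 0; y < src_h; y++) {
        int row = dst_row[y];
        int n   = dst_width[y];
        downsample_line(src + y * src_w, src_w, line, n);
        for (x = 0; x < n; x++) {
            dst_acc[row * dst_w + dst_col[y] + x]  += weight[y] * line[x];
            dst_wsum[row * dst_w + dst_col[y] + x] += weight[y];
        }
    }
}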

With mipmapping, obviously not all of the computed mipmap levels will be used. But the usage of the mipmaps can be cache friendly as far as I can see.

Btw, I didn't understand ehart's comment about the filter kernel being trapezoidal. I thought that is how it should be. Can someone point me to a good resource for understanding how filter kernels should be shaped and why?

Thanks again.
 
krychek said:
Btw, I didn't understand ehart's comment about the filter kernel being trapezoidal. I thought that is how it should be. Can someone point me to a good resource for understanding how filter kernels should be shaped and why?
i believe he was referring to forward mapping, in which case, as you note, it's natural that your filter be trapezoidal, i.e. a nice trapezoid along the middle vertical of the projection plane and a skewed trapezoid as your surface goes both z-deeper and away from the middle vertical.


Simon,

mipmaps at high anisotropy are ok as long as mipmap selection is based on the smaller rho, not on the greater - which is not the case with trilinear/bilinear with mipmaps.
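e.g. something like this sketch (rho_x/rho_y being the footprint extents along the two screen axes):
Code:
/* lod selection sketch: rho_x / rho_y are the footprint extents (in source
   texels) along the two screen axes.  plain trilinear would take the larger
   of the two; here the level comes from the smaller one, and the long axis
   is covered with extra taps instead. */
#include <math.h>

static int lod_from_minor_axis(float rho_x, float rho_y)
{
    float rho_min = (rho_x < rho_y) ? rho_x : rho_y;
    if (rho_min < 1.0f)
        rho_min = 1.0f;               /* magnification: stay at level 0 */
    return (int)floorf(log2f(rho_min));
}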
 
DarkBlu got what I was talking about right, but it does not just apply to forward mapping. The correct reverse mapping also has this shape, it is just that we typically use approximations in reverse mapping that are not correct.

As for the trapezoid versus ellipse issue, I have seen both referenced. If you treat the pixel as a square, the mapping is going to be roughly a trapezoid under the perspective projection; if you consider it to be a circular dot, it will be egg-shaped. (Ellipses only hold up under non-perspective transforms, if I remember right.)

My comment about the trapezoid and forward mapping causing some possible difficulties was simply that if you have two adjacent scanlines, they both will likely need to filter from the same scanline in the source image, but the filter width will be different for each. As a result, to do this as forward mapping, you could not simply perform one mapping on each scanline of the input image and then map across the horizontal lines, because you would only have one filter width for each scanline, when you will likely need at least two to do a good job.

The interesting thing is that this is essentially what you are doing with mip-mapping.

-Evan
 
Ah, now it's clear. Now it's obvious why ellipses/egg shapes are considered and why I was thinking of trapezoids.

I was aware of the fact that a line in the source image would map partially to two lines (or more). So I'd process some source lines more than once. I guess this is turning out to be more like reverse mapping.
 
krychek said:
With mipmapping, obviously not all of the computed mipmap levels will be used. But the usage of the mipmaps can be cache friendly as far as I can see.
You can take that a few steps further. If your transformation is constant, you will know beforehand which parts of the screen will require which mipmap level. Say the lower half of the screen is pure magnification. You can skip mipmap generation for that part, and only compute mipmaps for the rest (the top "half", which probably isn't exactly half, but you get the idea).

Further -- and I'll say up front that you probably won't want to go there, as it will have to be undone for trilinear and/or anisotropic -- you can eliminate any need for temporary buffers during mipmap generation if you're content with pure bilinear. You can just filter down "in place", overwriting the full res image wherever a smaller mipmap level will be required. And if you do software rasterizing, you can cut down on spatial precision/render lower res in these low lod regions from the get-go. Resolution will be reduced by the downfilter anyway, so there'd be no need to generate the whole image in full res.
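A sketch of the in-place downfilter part (untested; 2x2 box filter on one 8-bit channel, applied only to the rows that will actually be fetched at the lower level):
Code:
/* In-place 2x2 box downfilter of the first 'rows' rows of a full-res image:
   the result overwrites the top-left quarter of that region, so no separate
   mip buffer is needed.  Only the region that will actually be fetched at
   the lower level has to be processed. */
static void downfilter_region_inplace(unsigned char *img, int w, int rows)
{
    int x, y;
    for (y = 0; y < rows / 2; y++) {
        const unsigned char *r0 = img + (2 * y) * w;
        const unsigned char *r1 = img + (2 * y + 1) * w;
        unsigned char *out = img + y * w;   /* reuse the upper rows as storage */
        for (x = 0; x < w / 2; x++)
            out[x] = (unsigned char)((r0[2 * x] + r0[2 * x + 1] +
                                      r1[2 * x] + r1[2 * x + 1] + 2) / 4);
    }
}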
 