Scaler - complexity, structure & cost

Wouldn't a "really smart" scaler give quite a nice result in games? If they only blow the image up to fit, wouldn't it always be better to look at media in its original form?
I know a DVD on a 4K screen would look like ****, but with a really good scaler I thought it would look "quite" nice, or am I wrong about what a good scaler can really do?

If scalers could do miracles, we could output only one white pixel and the scaler would do the rest.

Unfortunately you cannot correctly create pixels that don't exist. You can interpolate, but that in fact just blurs the image.

Zvekan
 
Then why should one want to scale 720p to 1080p? If the source is in 720p, 720p resolution would result in better image quality, yes?
 
If scalers could do miracles, we could output only one white pixel and the scaler would do the rest.

Unfortunately you cannot correctly create pixels that don't exist. You can interpolate, but that in fact just blurs the image.
There are ways to scale an image that actually use the information within the low-resolution content to guide the interpolation operators so that details are maintained. That's what I'd assume he was talking about when referring to "smart scalers." It won't create new details, but the idea is not to destroy the existing ones; destroying them is what makes things look a lot worse, and trying to post-process a fix only makes it more apparent.

Unfortunately, a lot of the good methods for this are either too heavy-handed for hardware implementations or eat up numerical precision for breakfast (or both).
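To give a rough idea of what "letting the content guide the interpolation" means, here's a deliberately simplified toy example of my own (not any shipping scaler's algorithm): a 2x upscale that interpolates each new diagonal sample along whichever image diagonal is smoother, instead of blending straight across an edge.

```python
import numpy as np

def edge_directed_upscale_2x(img):
    """Very simplified edge-directed 2x upscale of a grayscale float array.

    New diagonal samples are interpolated along whichever image diagonal
    is smoother, so we avoid averaging across an edge. The remaining
    in-between samples just use plain averaging here, purely to keep the
    sketch short.
    """
    h, w = img.shape
    out = np.zeros((2 * h, 2 * w), dtype=img.dtype)
    out[::2, ::2] = img  # keep the original samples

    # Diagonal positions: pick the diagonal with the smaller difference.
    for y in range(h - 1):
        for x in range(w - 1):
            d1 = abs(img[y, x] - img[y + 1, x + 1])   # "\" diagonal
            d2 = abs(img[y + 1, x] - img[y, x + 1])   # "/" diagonal
            if d1 <= d2:
                out[2 * y + 1, 2 * x + 1] = (img[y, x] + img[y + 1, x + 1]) / 2
            else:
                out[2 * y + 1, 2 * x + 1] = (img[y + 1, x] + img[y, x + 1]) / 2

    # Remaining positions: plain averaging of the two nearest originals
    # (a real scheme would direction-filter these as well).
    out[::2, 1:-1:2] = (img[:, :-1] + img[:, 1:]) / 2
    out[1:-1:2, ::2] = (img[:-1, :] + img[1:, :]) / 2
    # Border row/column: simple replication.
    out[:, -1] = out[:, -2]
    out[-1, :] = out[-2, :]
    return out
```

Real edge-directed schemes (NEDI and friends) are considerably more elaborate than this, which is exactly why they get heavy for hardware.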
 
I have no idea what sort of quantities they ship Realta in.
But with the numbers from:
http://forum.beyond3d.com/showthread.php?p=1015700#post1015700

TSMC 300 mm wafer at 90 nm: ~$3,000.
65 nm: ~+15% ($3,450)
100 million transistors of logic = 20 mm² on TSMC 65 nm.
3,305 dies per wafer.
~90.6% yield according to ICKnowledge's die calculator.
65 nm = $1.15/functional die.
Packaging & testing? No idea.
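For what it's worth, the per-die figure follows directly from those inputs; a quick back-of-the-envelope (purely illustrative, all numbers are the ones quoted above, not official pricing):

```python
# Back-of-the-envelope die cost using the figures quoted above.
wafer_cost_65nm = 3000 * 1.15          # ~$3,450 for a 300 mm wafer at 65 nm
dies_per_wafer  = 3305                 # for a ~20 mm^2 die
yield_fraction  = 0.906                # from ICKnowledge's die calculator

good_dies = dies_per_wafer * yield_fraction
cost_per_functional_die = wafer_cost_65nm / good_dies
print(f"~${cost_per_functional_die:.2f} per functional die")  # ~$1.15
```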

From the PDF it seems the Realta was introduced in 2004. With the initial costs of moving to a new process, it seems they might still be on the same one as before given their volume? If so, the cost will most likely be quite a bit higher. What could they be using, something like 130 nm? I have no references for production costs on that node, so I can't speculate.

On their site they mention an integration cost of $10,000-70,000.
http://www.hqv.com/technology/index1/video_processor.cfm?CFID=&CFTOKEN=12383601
 
They seem to adaptively choose between 4- and 32-tap interpolation (in each dimension), probably based on edges. So at most that is 64 MACs per pixel, which at 1080p60 makes for roughly 23 GMAC/s. A respectable amount, but to put it in perspective: at 10 bits of precision, an integer multiply-accumulate takes less than a quarter of the resources of a single-precision floating-point FMAD, and Xenos can manage over ten times that many FMADs per second just counting the programmable part of its shaders.
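A quick sanity check of that figure, assuming the worst-case 32+32-tap separable filter is applied to all three colour components (which is how you land in the low twenties; a single component works out to roughly 8 GMAC/s):

```python
# Rough sanity check of the ~23 GMAC/s figure.
# Assumption: the worst-case 32+32-tap separable filter runs on all
# three colour components of every output pixel.
width, height, fps = 1920, 1080, 60
taps_per_pixel = 32 + 32        # horizontal + vertical, worst case
components = 3

macs_per_second = width * height * fps * taps_per_pixel * components
print(f"{macs_per_second / 1e9:.1f} GMAC/s")  # ~23.9
```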
 
Then why should one want to scale 720p to 1080p? If the source is in 720p, 720p resolution would result in better image quality, yes?
Absolutely, yes. But most people don't care about video image quality at all. :oops: All they care about is whether the picture fills the screen or not. :rolleyes: So, if you have (say) a 1920x1080 screen, people prefer to watch a 720p video stream up-scaled to cover the whole of the screen rather than a native 720p image letterboxed in the middle of the screen with black borders all round.

The situation is even more extreme if you're playing back standard-definition video on a high-def screen. There has to be at least a minimal amount of scaling to get the aspect ratio right - an anamorphic NTSC DVD, for example, would have to be scaled from 720x480 to about 853x480 to get the correct ratio - but, again, most people with high-def displays want their DVD playback to fill the whole of the screen rather than be a small image letterboxed in the middle.
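For the curious, the 853 comes straight from the 16:9 display aspect ratio; a trivial illustration:

```python
# Aspect-ratio correction for an anamorphic (16:9) NTSC DVD frame:
# keep the 480-line height and widen to the display aspect ratio.
storage_w, storage_h = 720, 480
display_aspect = 16 / 9

corrected_w = round(storage_h * display_aspect)
print(f"{storage_w}x{storage_h} -> {corrected_w}x{storage_h}")  # 720x480 -> 853x480
```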

(I actually prefer native-res video myself, but I seem to be the only person who does :???: ).

There is, incidentally, an important distinction here between video filmed "from life" and generated 3D graphics. A real-life scene has infinite resolution and consequently has perfect anti-aliasing when photographed. A generated scene does not have perfect anti-aliasing - it's a discrete-pixel digital emulation of an analogue world. This has some implications for how good an image looks when it is upscaled. Upscaling a photo tends to work a lot better than upscaling a generated 3D image.
 
A tiny bit sometimes. The geometric representation of a 3D scene also has "infinite resolution", yet you can get aliasing.

The human eye, a camera, or a GPU all sample a scene at a finite resolution without properly band-limiting the incoming signal first (though the imperfections of the optical system help a bit). This means you can get aliasing. Then they apply filters to the sampled signal to avoid even more aliasing (as well as other reasons, like edge enhancement or data reduction).

The human eye is relatively immune to aliasing due to the irregular receptor distribution, optical imperfections, permanent eye movement and persistence of vision. Those points do not apply to cameras to the same extent, especially digital ones.
 
The hardest bit is definitely the de-interlacing. If you want that done "correctly", a fast GPU with some large shader programs would work. You need stuff like multiple frame buffers / render targets and such.

The actual upsizing can be done in multiple ways (as shown), from simple stretching to transform-based methods (along the lines of the frequency transforms JPEG uses) and lots of things in between. What most scalers in TVs do is detect the edges and blur them. That is cheap to do, but gives a very visible shimmering "shadow" outline around most objects. If you want to do better than that, you might again be better off with some serious shader programs on a fast GPU.

If you use it for video, it gives your GPU something to do. Then again, if you use it to display a 3D scene, you would want to simply render it at a higher resolution.

So, why does MS put a hardware scaler in their console? To improve the framerates. A hardware scaler is much cheaper than a double-sized GPU that could push more pixels. And it's a lot simpler than writing those shader programs.
 
The next big leap was the Rage 128 PRO, which supports even some HDTV resolutions (720x480 is still officially the highest input resolution, but the chip actually manages much higher resolutions - 14xx*1xxx works; I haven't tested any higher, but if you give me a link to some HD MPEG-2 videos, I could test it for you). The scaler uses a 4-tap/4-tap filter, and I think this was the first consumer graphics chip supporting 4-tap/4-tap filtering. The Rage 128 PRO consists of 8 million transistors (don't forget it's a dual-pipeline 3D core, AGP 4x interface, 128-bit memory controller, integrated TMDS with ratiometric expander, RAMDAC, etc.)
If you're talking Rage 128...
I don't know too much about that one, but it is supposedly pretty similar to the Radeon ones. The back-end scaler of the Radeons (pre-Avivo; I don't think Avivo-based cards have a BES) can handle source video resolutions up to 1536 or 1920 pixels horizontally (depends on the exact model, and no one seems to know exactly - for instance R200 is 1920, RV250 is 1536...), and it will only be able to do 2-tap filtering at that resolution (the overlay line buffer is too small). For 4-tap the video must not be wider than half that. Vertically it's virtually unlimited. If you want to play a 1920x1080 video with an RV250, that wouldn't work directly. What the chip can do, however, is pre-downscale, so you essentially get 960x1080 video (still with 2-tap filtering; it's actually hard to notice the quality is degraded...).
Of course, nowadays everyone uses front-end scalers - just use textured video; the 3D engine is a pretty powerful scaler these days... The BES of the Radeon was also a bit nasty to program, as it depends not only on the source resolution but of course on the output resolution (timing) as well.
And of course back-end scalers have all sorts of limitations: there is typically only one per card, so not only can you output only one video that way, it's limited to one output too (unless you have a true clone mode where you drive two displays with the same display controller, i.e. the same timings). You can't use it with a rotated display (unless the BES supports rotation directly, which it probably doesn't on any graphics card), you can't use it really well in composited environments, etc.
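To make that width limit a bit more concrete, here's a small sketch of the constraint as described (the function name and the exact behaviour when the source is too wide are my own simplification):

```python
# Rough sketch of the back-end-scaler constraint described above:
# the overlay line buffer caps the source width, and 4-tap filtering
# halves the usable width. The default uses the RV250-class figure;
# an R200-class part would be 1920.
def bes_filter_mode(src_width, line_buffer_width=1536):
    if src_width <= line_buffer_width // 2:
        return "4-tap"
    if src_width <= line_buffer_width:
        return "2-tap"
    # Too wide: the chip pre-downscales horizontally first.
    return "2-tap after horizontal pre-downscale"

print(bes_filter_mode(720))    # 4-tap
print(bes_filter_mode(1280))   # 2-tap
print(bes_filter_mode(1920))   # 2-tap after horizontal pre-downscale
```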

I think that a scaler for HDTV resolutions can consist of less than 1 million transistors if you don't have any special requirements.
A scaler alone should be quite a bit simpler, I think, even if you include things like color-space conversion. Of course, for deinterlacing the required complexity can vary from basically zero (bob or weave) to something very complex... (even the first Radeons had some sort of adaptive algorithm, though I don't think you'd call it high-quality these days)
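Since bob and weave come up as the "basically zero complexity" end of the scale, here's what they amount to in a few lines (a minimal sketch of my own, ignoring chroma and field timing):

```python
import numpy as np

def weave(field_top, field_bottom):
    """Weave: interleave two fields into one frame.
    Fine for static content, combs on motion."""
    h, w = field_top.shape
    frame = np.empty((2 * h, w), dtype=field_top.dtype)
    frame[0::2] = field_top
    frame[1::2] = field_bottom
    return frame

def bob(field, is_top):
    """Bob: line-double a single field.
    No combing, but halves vertical resolution and bobs between fields."""
    h, w = field.shape
    frame = np.empty((2 * h, w), dtype=field.dtype)
    frame[0::2] = field
    frame[1::2] = field   # simple line repeat; a nicer bob interpolates here
    return frame
```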
 
Thanks for this interesting post! I can talk about the R128's scaler only from a user's standpoint, but as I remember, its quality was a bit different from the R100's. The R100 was visibly sharper (very good for DVDs and files with similar resolutions), but the R128 was better for low-resolution videos (VCD resolution and similar). It was slightly softer (but still way sharper than the competition - Voodoo 3, G400, etc.), reduced jaggies very well (the G400 was significantly worse) and didn't amplify block noise (like the R100 did). It's a pity that reviewers didn't mention it and most people bought Matrox's cards as the better solution for video playback. I like the G400, I still use it from time to time, but its scaler was pretty poor compared to the Rage 128 PRO...
 
OT: Is Matrox still alive? Pardon my ignorance.
Yes, but it's been a long time since they developed any new graphics chips/cards. These days they make stuff for the professional market, particularly multi-monitor solutions; you may have heard of some of their products like "DualHead2Go", etc.

It's a shame, really. The old Matrox Millennium was the pinnacle of (effectively) 2D-only cards, well ahead of the previous front-runner (which was the Diamond Stealth card based on the S3 968 chip). It was (AFAIK) the only product ever to feature WRAM.

But none of Matrox's 3D solutions ever really cut it.
 