Panajev2001a
Veteran
No, scaling to an entire buffer is wasteful and unnecessary.
You can do horizontal scaling entirely as an inline process. You just interpolate between the pixels as they come in and insert new ones into the output stream. For a (reasonably good quality) cubic filter you need just four pixels as inputs. Chump change.
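To make the "four pixels as inputs" point concrete, here is a rough sketch of an inline horizontal scaler. It assumes a Catmull-Rom spline as the cubic filter (the post does not say which cubic is meant), a non-empty line, and edge pixels replicated at both ends. The names `catmull_rom` and `scale_line` are made up for illustration and do not describe any real hardware or driver.

```python
from collections import deque


def catmull_rom(p0, p1, p2, p3, t):
    """Cubic (Catmull-Rom) interpolation between p1 and p2, with t in [0, 1)."""
    return 0.5 * (
        2.0 * p1
        + (p2 - p0) * t
        + (2.0 * p0 - 5.0 * p1 + 4.0 * p2 - p3) * t * t
        + (3.0 * p1 - p0 - 3.0 * p2 + p3) * t * t * t
    )


def scale_line(pixels, src_width, dst_width):
    """Yield dst_width samples from a stream of src_width pixel values.

    Only four source pixels are held at any time (assumption: the stream
    is non-empty and edges are handled by replicating the end pixels).
    """
    it = iter(pixels)
    first = next(it)
    last = first

    def pull():
        # Fetch the next source pixel, repeating the last one past the end.
        nonlocal last
        last = next(it, last)
        return last

    # Window holds source indices base-1, base, base+1, base+2.
    window = deque([first, first], maxlen=4)
    window.append(pull())
    window.append(pull())
    base = 0

    for x in range(dst_width):
        pos = x * src_width / dst_width  # position in source coordinates
        i = int(pos)
        while base < i:  # slide the four-pixel window forward
            window.append(pull())
            base += 1
        yield catmull_rom(*window, pos - i)
```

Something like `list(scale_line(line, 8, 16))` doubles the width of an eight-pixel line while only ever storing the four-entry window, which is the whole argument for why this is cheap to do inline.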
Vertical inline scaling would need much more storage, because it requires you to keep multiple lines of pixels available as inputs.
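A back-of-the-envelope comparison of the two storage costs, assuming the same four-tap filter is used vertically. The line width and bytes per pixel below are purely illustrative values, not figures for any particular chip, and `filter_storage_bytes` is a hypothetical helper.

```python
def filter_storage_bytes(taps, samples_per_tap, bytes_per_pixel):
    """Bytes of on-chip storage needed to hold every filter input at once."""
    return taps * samples_per_tap * bytes_per_pixel


# Illustrative assumptions: 4-tap cubic, 1280-pixel lines, 3 bytes per pixel.
TAPS = 4
LINE_WIDTH = 1280
BYTES_PER_PIXEL = 3

# Horizontal: each tap is a single pixel.
horizontal = filter_storage_bytes(TAPS, 1, BYTES_PER_PIXEL)

# Vertical: each tap is a full line, because the inputs sit in different rows.
vertical = filter_storage_bytes(TAPS, LINE_WIDTH, BYTES_PER_PIXEL)

print(horizontal, vertical)  # 12 15360
```

Under those assumptions the vertical case needs a line-width multiple of the horizontal storage, which is the gap the post is pointing at.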
Now that's still something I'd most definitely call a hardware scaler. It's not just an accidental capability of a bog-standard RAMDAC; you need to include those transistors in the design somewhere. NVIDIA's PC chips could do it for a while though.
I think you hit the nail on the head. It sounds like a perfectly reasonable explanation as to why they enabled horizontal scaling only (four pixel inputs for a reasonably nice cubic filter does not seem too bad in terms of storage needed). The question is why it took them so long to feel safe about developers using this functionality in games... was the driver that allowed access to the HW scaler slow, or were there some software-related bugs?