atpkinesin
Newcomer
The objective mathematical error metric has been known and used forever: MSE.
Human perception isn't objective, though, so people have been looking for an objective subjective metric for quite some time. That this isn't possible should be evident. If the game's pre-processing (tonemapping), the environment's mid-processing (viewing conditions), and the brain's post-processing of an image all happened to be "standardized", you could move MSE into post-processing space (basically brain space, before meaning is attributed). Unfortunately, that's a folly.
If you can accept an objective perceptor (e.g. a computer), and accept that it is a valid representation of the averaged subjective perceptions (i.e. it sits at the center of the normal distribution of divergent perceptions), MSE is as good as it gets.
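For concreteness, this is all that baseline amounts to: a minimal sketch of frame-to-frame MSE, assuming the native and reconstructed frames are already available as same-sized numpy arrays (the function name and value range are purely illustrative):

```python
import numpy as np

def mse(native, reconstructed):
    """Mean squared error between two equally sized images.

    Assumes both frames are float arrays in the same value range
    (e.g. 0..1); names and range are illustrative, not tied to any engine.
    """
    native = np.asarray(native, dtype=np.float64)
    reconstructed = np.asarray(reconstructed, dtype=np.float64)
    return float(np.mean((native - reconstructed) ** 2))
```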
I want to push back on the latter part of your argument because it doesn't consider the results of decades of research that have gone into image and video compression. Lossy compression techniques optimize image quality using objective metrics chosen based on an understanding of human perception. JPEG, for instance, exploits the fact that people are much more sensitive to changes in brightness than to changes in color. JPEG would produce worse results if it only considered MSE! The same idea could be applied to image quality measurement today: measure the dynamic range of the luminance and chrominance channels and compare the reconstructed and native images. That result would be an objective measure of image quality grounded in human perception.
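A rough sketch of what that could look like, assuming 8-bit-style RGB inputs and the full-range BT.601 YCbCr conversion used by JPEG; the per-channel error and the percentile-based "dynamic range" below are illustrative choices of mine, not a standard metric:

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """(H, W, 3) RGB in [0, 255] -> YCbCr, full-range BT.601 as used by JPEG."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y  =  0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b + 128.0
    cr =  0.5 * r - 0.418688 * g - 0.081312 * b + 128.0
    return np.stack([y, cb, cr], axis=-1)

def channel_report(native_rgb, recon_rgb):
    """Per-channel comparison instead of one pooled RGB number.

    Keeping Y, Cb and Cr separate is the point: a reconstruction can be
    fine in luma and poor in chroma (or vice versa), and a single MSE
    over RGB hides that.
    """
    native = rgb_to_ycbcr(native_rgb.astype(np.float64))
    recon = rgb_to_ycbcr(recon_rgb.astype(np.float64))
    report = {}
    for i, name in enumerate(("Y", "Cb", "Cr")):
        n, r = native[..., i], recon[..., i]
        report[name] = {
            "mse": float(np.mean((n - r) ** 2)),
            # Robust per-channel dynamic range: 99th minus 1st percentile.
            "native_range": float(np.percentile(n, 99) - np.percentile(n, 1)),
            "recon_range": float(np.percentile(r, 99) - np.percentile(r, 1)),
        }
    return report
```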
The "problem" with image quality measurement is not that overall image quality is subjective, it is that we haven't figured out what aspects of image quality are important for understanding reconstruction results. "What GPU is best?" is also a subjective question, but tech enthusiasts have developed a set of objective measures that can help answer that question for many different people. Each individual measure helps any given person weigh what is most important to them. The same approach can (and should!) be used for image quality assessment. Image quality can be objectively measured with contrast, sharpness, blur, noise, dynamic range, total luminance, color accuracy, and ghosting and aliasing as mentioned above. Maybe some of these measures are useless, and maybe some are only as useful as the "AVG FPS only" era of GPU assessment, but eventually we can find the equivalent of the "1% lows" and the "frametime graphs" as we start to better understand what is important to answer our specific question. We don't need a single number (just like we don't use a single number to evaluate GPUs) of "Image Quality" to be able to make meaningful and quantifiable comparisons.
(Hi all, first comment. I've been a sporadic lurker for more than a decade, but I really enjoy reconstruction tech, so I figured today was as good a day as any for my first comment.)