Video Motion Recognition

That kind of granularity would be OK for what I was thinking of.

I'm not sure if this has been discussed before, but I was pondering the relationship between the human visual system and graphics rendering. I was thinking:

- can we determine an area of focus of a user on a screen?

- can we thus, for example, vary the level of graphical detail in a frame to skew high towards the area of focus, and lower elsewhere?

- which 'features' would lend themselves to this kind of scaling without the user noticing? Your peripheral vision may detect changes in some features more than others.
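To make the second question a bit more concrete, here is a rough Python sketch of picking a detail level per screen tile from its distance to the gaze point. Everything in it is an assumption for illustration: the names detail_level and tile_levels, the pixel radii in BAND_RADII, and the idea of using simple circular bands at all. Real thresholds would have to come from perceptual experiments, and the gaze point would come from whatever eye tracker is available.

```python
import math

# Hypothetical radii (in pixels) separating detail bands around the gaze point.
# These numbers are made up for illustration, not measured.
BAND_RADII = (200.0, 500.0)


def detail_level(pixel, gaze, band_radii=BAND_RADII):
    """Return 0 (full detail) near the gaze point, larger numbers further out."""
    distance = math.hypot(pixel[0] - gaze[0], pixel[1] - gaze[1])
    for level, radius in enumerate(band_radii):
        if distance <= radius:
            return level
    # Beyond the last radius: lowest detail band.
    return len(band_radii)


def tile_levels(width, height, tile, gaze):
    """Assign a detail level to each screen tile, keyed by (col, row)."""
    levels = {}
    for row in range(math.ceil(height / tile)):
        for col in range(math.ceil(width / tile)):
            # Use the tile's center as its representative point.
            center = (col * tile + tile / 2, row * tile + tile / 2)
            levels[(col, row)] = detail_level(center, gaze)
    return levels


if __name__ == "__main__":
    # Gaze at the middle of a 1920x1080 screen split into 120-pixel tiles.
    levels = tile_levels(1920, 1080, 120, gaze=(960, 540))
    print(levels[(8, 4)], levels[(0, 0)])
```

A renderer could then use the per-tile level to choose mesh LOD, texture resolution or shading rate. Which of those the eye is least sensitive to in the periphery is exactly the open question in the third bullet.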

It was another application that got me thinking about this. I saw a paper on video coding where they ran experiments to find which features viewers are likely to focus on when watching video (e.g. faces), and then encoded the video at better quality around those features and lower quality elsewhere. You could do the same with games: run 'offline' experiments to see what kinds of things the human eye is most likely to look at, and increase the quality of the corresponding assets relative to others. But I was wondering about the feasibility of going one step further: determining which region of pixels the eye is focussed on while the game is actually being played, and adjusting rendering quality on the fly.

I did a little reading, and one of my main concerns doesn't seem to be a huge stumbling block. I was wondering whether the speed at which the eye can move to another region would make this difficult to do. But apparently a typical eye movement takes on the order of 30ms. So if we're, say, sampling eye location every ~8ms (120Hz) and our render latency is ~17ms or less (one frame at 60fps), then you might be able to keep up with eye movement and readjust your level of detail per region without the user noticing. I'm sure there are many other potential roadblocks though...
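The back-of-the-envelope arithmetic above can be written out as a small Python sketch. It assumes a very simplified model: a gaze sample can be up to one sampling interval old when rendering starts, and the frame then takes the full render latency to reach the screen. Tracker processing time and display scan-out are ignored, and the function names and the 30ms default are just placeholders taken from the figures in this post.

```python
def worst_case_latency_ms(sample_rate_hz, render_latency_ms):
    """Age of the oldest possible gaze sample plus the time to render a frame."""
    sample_interval_ms = 1000.0 / sample_rate_hz
    return sample_interval_ms + render_latency_ms


def keeps_up(sample_rate_hz, render_latency_ms, eye_movement_ms=30.0):
    """True if the simplified latency fits inside one typical eye movement."""
    return worst_case_latency_ms(sample_rate_hz, render_latency_ms) <= eye_movement_ms


if __name__ == "__main__":
    frame_ms = 1000.0 / 60.0  # one frame at 60fps, roughly 17ms
    print(worst_case_latency_ms(120, frame_ms))  # roughly 25ms
    print(keeps_up(120, frame_ms))
    print(keeps_up(60, 2 * frame_ms))
```

With the numbers from the post this comes to roughly 25ms, which fits inside the ~30ms figure with little to spare. Halving the sampling rate and adding a second frame of latency pushes it to about 50ms, so the margin is thin under this model.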

The main motivation here is that if you are focussed on a given screen region - as typically, it seems, one would be - it might be wasteful to treat all pixels equally.
 