Kinect technology thread

Well I don't know about you, but I don't personally have spontaneous arm thrust towards the screen. ;) There's a tiddly chance of a false positive if the player inadvertently puts their hand into the button zone, but there's likewise a chance of accidentally activating the trigger buttons on a dual-stick controller if you knock it against your lap/leg. I don't see such a low-risk problem as a sound basis for this interface choice.

Nope. 2D position is absolute based on the camera, and depth is 'absolute' and doesn't drift over time.

Then again, the camera isn't stationary, is it, but keeps roaming around to keep the skeleton in view, which makes the positioning relative. Still, if they know where your hand is to power up a button as they do, then depth isn't a problem. Keep exactly their current code except instead of charging a timer by 2D presence, activate it on a minimum depth. If it's not as easy as that, there's something wrong with Kinect's libraries!
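
Something like this rough sketch is all I mean - none of these types are real Kinect API, just stand-ins, and the 0.25m reach threshold is a guess:

```python
from dataclasses import dataclass

# Made-up stand-ins for whatever the tracking layer hands you.
@dataclass
class Hand:
    x: float
    y: float
    z: float  # depth in metres from the camera

@dataclass
class Button:
    x0: float; y0: float; x1: float; y1: float  # 2D hot zone
    rest_z: float     # depth the hand hovered at when it entered the zone
    charge: float = 0.0

PRESS_DEPTH = 0.25  # metres of forward reach to count as a press (a guess)

def in_zone(b: Button, h: Hand) -> bool:
    return b.x0 <= h.x <= b.x1 and b.y0 <= h.y <= b.y1

def update_hover(b: Button, h: Hand, dt: float) -> bool:
    """Their current scheme: 2D presence charges a timer."""
    b.charge = b.charge + dt if in_zone(b, h) else 0.0
    return b.charge >= 1.5

def update_push(b: Button, h: Hand) -> bool:
    """The proposed swap: same 2D test, fire on minimum depth instead."""
    return in_zone(b, h) and (b.rest_z - h.z) >= PRESS_DEPTH
```

The only new state the push version needs is rest_z, the depth where the hand entered the zone.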

Yeah, it should be possible to do a 'quick' button press motion to trigger an action. It'd be suitable for party games.

Or think of a controller + Kinect combo action.
 
Nope. 2D position is absolute based on the camera, and depth is 'absolute' and doesn't drift over time.

Then again, the camera isn't stationary, is it, but keeps roaming around to keep the skeleton in view, which makes the positioning relative. Still, if they know where your hand is to power up a button as they do, then depth isn't a problem. Keep exactly their current code except instead of charging a timer by 2D presence, activate it on a minimum depth. If it's not as easy as that, there's something wrong with Kinect's libraries!

Well, there's the 2D camera and there's the 3D projector. The 3D projector can focus from something like 1.6 to 3.5m, and then there's the 27 degree tilt from the camera. Sure you can probably combine information from the 2D camera and 3D camera to work something out, but it doesn't sound trivial to me at all.

However, we've seen gestures as well as the NXE's version of button pressing used, and the gesture thing (I think it was moving down, similar to a swipe right/left) does seem to work. Absolute button pressing though, I'm not sure. I still think you need some kind of command to refocus it in a certain location. Maybe clap at the location in space where you want the buttons, to bring the buttons to that location, and then press them while your avatar animation shows your movement in relation to the buttons in 3D space, so you get feedback on how close you are to the buttons, lag, etc.
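
For what it's worth, the clap-to-anchor idea is mechanically simple - a toy sketch, with invented joint data and a 10 cm 'clap' threshold pulled out of the air:

```python
import math
from dataclasses import dataclass

@dataclass
class Joint:          # invented joint record, not real Kinect data
    x: float
    y: float
    z: float

def dist(a: Joint, b: Joint) -> float:
    return math.dist((a.x, a.y, a.z), (b.x, b.y, b.z))

CLAP_DIST = 0.10      # hands within 10 cm counts as a clap (a guess)

button_anchor = None  # where the button panel lives once you've clapped

def update(left: Joint, right: Joint):
    """Clap to place the panel; afterwards, return the hand-to-panel
    distance so the avatar can show how close you are to the buttons."""
    global button_anchor
    if button_anchor is None:
        if dist(left, right) < CLAP_DIST:
            button_anchor = Joint((left.x + right.x) / 2,
                                  (left.y + right.y) / 2,
                                  (left.z + right.z) / 2)
        return None
    return dist(right, button_anchor)
```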
 
Well I don't know about you, but I don't personally have spontaneous arm thrust towards the screen. ;)

lol, I dunno - I know what you're saying, but often in games I'll point at the screen, gesturing to my son as to who the through ball was SUPPOSED to go to (the unmarked guy running through, not the guy standing around being marked by 3 defenders)...oh...did I go on a bit of an OT rant there - sorry! ;)
 
I would have thought the issue here is when people might accidentally press the button by pointing at the screen or stretching arms - or maybe a sudden move forward being misread?

The problem with Joyride specifically is that the steering wheel is 'the virtual circle centred between your hands'. If you move your right hand somewhere else, then the steering wheel stretches with it.

There is no 'good' solution to it without introducing lag/stuttering/inability to press the button whilst cornering etc.
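
To see why, here's the whole 'wheel' reduced to a toy calculation (my own reconstruction of how it behaves, not Joyride's actual code):

```python
import math

def wheel_angle(lx, ly, rx, ry):
    """Steering angle of the virtual circle centred between the hands:
    effectively just the angle of the left-to-right hand vector.  There
    is no fixed centre - it's recomputed every frame, which is why
    reaching one hand out to press a button also steers."""
    return math.degrees(math.atan2(ry - ly, rx - lx))

print(wheel_angle(-0.3, 1.2, 0.3, 1.2))  # hands level: 0.0 degrees
print(wheel_angle(-0.3, 1.2, 0.3, 1.5))  # right hand raised to reach a
                                         # button: ~26.6 degree phantom turn
```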

In general the problem with buttons could be:
- depth can be relative to your shoulder, your foot, the console, your chest, your head, your hand/elbow etc. None of them are a particularly great answer as to what depth you consider the button to be.
- kinect "may/will" consider the end of your arm as a 'blob', with your fingers being "noise". Your hand may be the one part of your body that kinect has the most trouble locating a precise position/depth.
- pressing a button with your hand produces a rather weird motion, such that it may not be particularly obvious where you are intending to push.
- logically a hand cursor would be a good idea, although that may not make designers very happy.

I think combining sound/word with gesture/button pressing would be ideal.

Sound recognition has always seemed rather impractical to me.
 
lol, I dunno - I know what you're saying, but often in games I'll point at the screen, gesturing to my son as to who the through ball was SUPPOSED to go to (the unmarked guy running through, not the guy standing around being marked by 3 defenders)...oh...did I go on a bit of an OT rant there - sorry! ;)
But in those cases you'll be mid-game and shouldn't have UI buttons around to press. Although I see what you're saying. With sensible interface design it shouldn't be an issue, which means not having UI pop up when you pause and want to lambast a co-player for not moving their player here instead of there.
 
What puzzles me more is why they don't have you reach towards the screen to press the virtual button! Seems an ideal implementation of their depth detection, taking EyeToy's onscreen buttons and having you actually reach forwards to press them. It wouldn't be at all hard in theory - for each hotzone, test if the hand is in that location, and then if so test if it's closer than a threshold. If so, the button is pressed. Hovering to select ignores their potential completely!
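
Spelled out, the whole proposal fits in a dozen lines - a sketch with made-up hotzone coordinates and an arbitrary 1.8m press plane, not anything from the actual SDK:

```python
PRESS_PLANE_Z = 1.8  # metres from the camera; nearer than this = pressed

hotzones = {
    "play":    (0.1, 0.1, 0.3, 0.3),  # (x0, y0, x1, y1) in screen space
    "options": (0.7, 0.1, 0.9, 0.3),
}

def pressed_buttons(hand_x, hand_y, hand_z):
    """For each hotzone: is the hand over it, and if so, is it closer
    than the press plane?  That's the entire algorithm as proposed."""
    return [name for name, (x0, y0, x1, y1) in hotzones.items()
            if x0 <= hand_x <= x1 and y0 <= hand_y <= y1
            and hand_z < PRESS_PLANE_Z]
```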

I'd expect reaching forward 'charges' the button faster. Certainly they did this in the Forza demo.
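
That would just be scaling the charge rate by how far forward of the hover point the hand is - a sketch with guessed-at tuning values, nothing official:

```python
def charge_rate(rest_z, hand_z, base=1.0, gain=4.0):
    """Hovering charges at the base rate; reaching forward multiplies it.
    rest_z/hand_z are in metres; base and gain are invented tuning values."""
    reach = max(0.0, rest_z - hand_z)  # how far forward of the hover point
    return base + gain * reach         # e.g. a 25 cm reach doubles the speed

# Per frame: charge += charge_rate(rest_z, hand_z) * dt; fire at charge >= 1.0
```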
 
- pressing a button with your hand produces a rather weird motion, such that it may not be particularly obvious where you are intending to push.

What do you mean? Can they detect the reversal in direction/depth at the "tip" of the hands (and ignore the other hand parameters within the button area)?


I'd expect reaching forward 'charges' the button faster. Certainly they did this in the Forza demo.

In EyeToy, I think you can wave your hand/palm over the target button to charge it faster (the quicker you wave, the faster it charges). These kinds of interfaces are good for general UI navigation, but may not be suitable for a "point in time" trigger -- like pausing a movie at the right frame, killing something quickly, or hitting the "I know the answer" button.

If Kinect can detect the reversal in depth of the "button press" motion, it may be faster. There will still be a delay (because the system needs to detect motion before and after the press to confirm the reversal in depth), but it may be a quicker way to detect/confirm a press.
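
The delay falls straight out of the detection itself. A toy version, with the window length and dip size invented:

```python
from collections import deque

WINDOW = 5      # frames needed after the turning point to confirm it
MIN_DIP = 0.10  # metres of in-and-back-out travel to count (a guess)

depths = deque(maxlen=2 * WINDOW + 1)

def press_confirmed(hand_z: float) -> bool:
    """Feed one depth sample per frame.  Returns True on the frame we can
    confirm the hand went in and came back out, i.e. the middle sample of
    the window is the nearest point.  The WINDOW frames of motion needed
    after the reversal are exactly the unavoidable delay."""
    depths.append(hand_z)
    if len(depths) < depths.maxlen:
        return False
    samples = list(depths)
    tip = samples[WINDOW]
    return (tip == min(samples)
            and samples[0] - tip >= MIN_DIP     # travelled inwards...
            and samples[-1] - tip >= MIN_DIP)   # ...and back out again
```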
 
In general the problem with buttons could be:
- depth can be relative to your shoulder, your foot, the console, your chest, your head, your hand/elbow etc. None of them are a particularly great answer as to what depth you consider the button to be.

Within the context of the original supposition of being able to physically push a virtual onscreen button, many of the detractions shouldn't be much of a barrier.

For instance, using the Kinect X360 UI demo as a point of reference.

Absolute positioning of the hand (endpoint of the arm) isn't required. It only needs to know whether that point has moved forward or not from the place it was located at. Whether it moved from 7 feet to 6 feet or 5 feet to 4 feet (distance to camera) shouldn't matter.

The only problem that may come up is whether a person knows to push towards the screen as if pushing the button on the screen, or if they are pushing "down" towards the floor as if pushing a button in front of them. However, I'd think it'd be pretty obvious to push towards the TV.
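
In code terms that's nothing more than a delta test against where the hand started, which is naturally indifferent to absolute distance - threshold invented as usual:

```python
PUSH_DELTA = 0.25  # metres forward relative to where the hand started (a guess)

def is_push(start_z: float, hand_z: float) -> bool:
    """Only the *change* in depth matters: we compare against where the
    hand was, never against any absolute distance to the camera."""
    return (start_z - hand_z) >= PUSH_DELTA

print(is_push(2.13, 1.83))  # ~7 feet -> ~6 feet (0.3 m closer): True
print(is_push(1.52, 1.22))  # ~5 feet -> ~4 feet (0.3 m closer): also True
```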

Sound recognition has always seemed rather impractical to me.

I still think it's quite elegant, practical, and fast if they were to also include voice recognition for trigger words. BING not only makes for a short word appropriate for signaling that you have reached your target, but also plays a part in marketing Microsoft's Bing. :D Likewise it's not a word that is likely to be just randomly spoken at the start of a sentence, for example. And it should be a relatively unambiguous word that would be easy for the system to recognize.

Other ways they could have done button clicking is say a quick side-to-side flick of the arm, with the selection item being whatever was under the hand at the time you did the flick motion. This would be analogous to a double click in Windows, for example. Although one problem there is that people might at first just try flicking their hand instead of their forearm.
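
A flick detector is easy enough to sketch - the speed and timing thresholds below are pure guesswork:

```python
FLICK_SPEED = 1.0  # m/s horizontal hand speed to count as a flick (a guess)
FLICK_GAP = 0.5    # both flicks must land within this many seconds (a guess)

last_x = None
last_dir = 0
flick_times = []

def double_flick(x: float, t: float, dt: float) -> bool:
    """Per-frame update.  A flick is a fast horizontal move; two flicks in
    opposite directions within FLICK_GAP seconds is the 'double click'.
    The selection target would be whatever sat under the hand when the
    first flick started."""
    global last_x, last_dir
    if last_x is None:
        last_x = x
        return False
    vx = (x - last_x) / dt
    last_x = x
    direction = 1 if vx > FLICK_SPEED else -1 if vx < -FLICK_SPEED else 0
    if direction != 0 and direction != last_dir:
        flick_times.append(t)
        last_dir = direction
    if len(flick_times) >= 2 and flick_times[-1] - flick_times[-2] <= FLICK_GAP:
        flick_times.clear()
        return True
    return False
```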

Regards,
SB
 
The user may have to linger over the targeted item (wave or push deeper) to activate it.

If it's just a quick flip/flick, there may be a lot of false positives as the user interacts with the rest of the game (or as a random act). It happened in some PSEye games.

e.g., Scrolling is usually done via a quick flick of the hand, but you'll need a final confirm (wave) to make sure the scroll wheel didn't just turn because of some unintentional hand action.
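
That two-step scheme is really a tiny state machine - the event names here are invented, fed by whatever the recogniser reports:

```python
state = "idle"
pending = 0  # scroll amount proposed by the flick, not yet applied

def on_gesture(gesture: str, amount: int = 0) -> int:
    """gesture is one of 'flick', 'wave', 'timeout'.  A flick only
    *proposes* the scroll; a deliberate wave commits it; a timeout
    quietly discards it.  Returns the committed scroll amount, or 0."""
    global state, pending
    if state == "idle" and gesture == "flick":
        state, pending = "pending", amount
    elif state == "pending":
        if gesture == "wave":
            committed, state, pending = pending, "idle", 0
            return committed
        if gesture == "timeout":
            state, pending = "idle", 0
    return 0
```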
 
Pressing a button with your hand produces a rather weird motion, such that it may not be particularly obvious where you are intending to push
What do you mean?

Pushing forwards is not a case of propelling your hand forwards in a straight line. Depending on how quickly you do it/where your arm is, the motion can appear to be 'swinging your forearm left/right/down/up'. It's just not a very natural motion (punching is more natural - but probably not what you want for a generic interface).

Can they detect the reversal in direction/depth at the "tip" of the hands (and ignore the other hand parameters)?

Almost certainly. Can they do it reliably? Probably not. (The Kinect developers would not have used hover in place of push unless they had a very good reason to do that.)
 
If it's just a quick flip/flick, there may be a lot of false positives as the user interacts with the rest of the game (or as a random act). It happened in some PSEye games.

Everything I was saying with regards to button pressing has been in relation to pushing/activating a button/choice in some form of UI (game or OS).

Actual game actions would all obviously have to be relevant to the actions in game. Again, either the controller input is suited to the game, or, thinking the other way around, the game is suited to the controller input. Trying to force the issue (pulling the trigger of a gun in an FPS, for example) is where you're going to end up with unsatisfactory experiences. It's the whole situation of a control type wholly unsuited to the game.

Those types of games, with regards to Kinect, absolutely have to be a hybrid control type, i.e. Kinect + controller. Except perhaps in the most casual of situations. Some kind of on-rails shooter where you say Bang to shoot a gun or Pew Pew to shoot a laser, for example, while controlling aim with your arm.

Heh, thinking of that, it would make for a great party game. But I'd imagine multiplayer would be a mess unless one player has the gun (bang) while the other has the laser (pew). :D :D

Regards,
SB
 
Everything I was saying with regards to button pressing has been in relation to pushing/activating a button/choice in some form of UI (game or OS).

It'd still have false positives if the activation action is too trivial/common.



Pushing forwards is not a case of propelling your hand forwards in a straight line. Depending on how quickly you do it/where your arm is, the motion can appear to be 'swinging your forearm left/right/down/up'. It's just not a very natural motion (punching is more natural - but probably not what you want for a generic interface).

Almost certainly. Can they do it reliably? Probably not. (The Kinect developers would not have used hover in place of push unless they had a very good reason to do that.)

Yeah, that's why I mentioned "ignore other hand parameters" at the same time. Since Wii Sports Resort tracks the gyro data and ignores the accelerometer readings to filter out the noise, is it possible to ignore the other parameters, except the depth, as long as the palm is inside the button?

Naturally, the system will need to know whether the palm is inside the button in the first place.
 
Within the context of the original supposition of being able to physically push a virtual onscreen button, many of the detractions shouldn't be much of a barrier.

For instance, using the Kinect X360 UI demo as a point of reference.

Absolute positioning of the hand (endpoint of the arm) isn't required. It only needs to know whether that point has moved forward or not from the place it was located at. Whether it moved from 7 feet to 6 feet or 5 feet to 4 feet (distance to camera) shouldn't matter.

If you stood at 6 feet, and stepped forwards, should that activate the button?
(or leaned/twisted/slouched/crouched/sat down etc)

Likewise if you moved a foot backwards, should the button now be unreachable?

If not, then maybe the button is either somehow relative to an anchor on the player, or maybe just based on body shape?
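
One plausible anchor is the shoulder - a sketch of my own guess at such a scheme, with an invented threshold, where stepping, leaning, or slouching mostly cancels out because both joints move together:

```python
PUSH_DELTA = 0.35  # hand this far in front of the shoulder = press (a guess)

def is_press(shoulder_z: float, hand_z: float) -> bool:
    """Measure the hand against the shoulder, not the camera.  Step
    forward and both joints move together, so nothing fires; only
    actually extending the arm changes shoulder_z - hand_z."""
    return (shoulder_z - hand_z) >= PUSH_DELTA

print(is_press(1.83, 1.43))  # arm extended: True at any standing distance
print(is_press(1.52, 1.42))  # merely standing a foot closer: False
```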

I still think it's quite elegant, practical, and fast if they were to also include voice recognition for trigger words. BING not only makes for a short word appropriate for signaling that you have reached your target, but also plays a part in marketing Microsoft's Bing. :D Likewise it's not a word that is likely to be just randomly spoken at the start of a sentence, for example. And it should be a relatively unambiguous word that would be easy for the system to recognize.

It should work for a system menu, although I suspect living in the room next to someone who keeps saying 'bing' would drive most people insane.

Other ways they could have done button clicking is say a quick side-to-side flick of the arm, with the selection item being whatever was under the hand at the time you did the flick motion. This would be analogous to a double click in Windows, for example. Although one problem there is that people might at first just try flicking their hand instead of their forearm.

Sounds interesting, but Kinect is supposed to be 'simpler than a controlpad' - I suspect the developers probably wouldn't want to do anything that required 'teaching a customer' :(.

(I think Kinect/PC will see more interesting UI ideas like that one)
 
Evidently there has been a disconnect with users between their movements and their onscreen avatar. So much so, in fact, that Microsoft is going to redesign the avatars to give them more realistic proportions. Bummer. I kinda like the big head avatars. :(

http://www.oxm.co.uk/article.php?id=21744

Tommy McClain
 
What would be nice is if they gave you an option of new or old style avatars, with the caveat that new costumes and accessories for avatars may or may not work properly with old style ones. Not sure the older Arcades would have enough flash memory to hold both styles, however.

Regards,
SB
 
That's a good demo, explaining a lot more stuff. The AI is described as a marvellous piece of work buried away in MS's dusty vaults. Speech recognition uses "TellMe" from a company MS acquired - that's one to look up. The skimming stones event has me rubbing my chin. What if the player doesn't try to skim? What if they don't even know how? The actual 'dialogue' was decidedly shonky, without any indication that anything said is recognised. Molyneux repeats that Milo exists 'in the cloud' and will learn vocabulary etc. over time, but so far, short of a keyword here or there, I don't see any evidence of progression in language recognition.

Edit: Really what's needed is someone to 'play' it who isn't trying to demo it but is just exploring Milo's responses. I imagine that no matter what you do in the skimming stones episode, Milo will come out of that skimming stones, and later he'll still say, "we had fun skimming stones." In which case it's not terribly interactive.
 