Just wanted to share this article, which explains why voice command is, in my view, totally useless:
http://robertfortner.posterous.com/the-unrecognized-death-of-speech-recognition
All the buzz about voice control of the Kinect/Xbox as a media player or anything else will come to nothing. As explained, you will have fun with it a few times and then forget about it entirely because of too many false-positive commands.
With this article in mind, I can't help thinking about Milo... Do you really imagine a 'game' will be more advanced and intelligent than any speech recognition currently on the market?
And one thing for people in the US to keep in mind: the rest of the world does not all speak English. So if English recognition is already very difficult and needs a great deal of artificial intelligence, imagine what happens with a language as difficult as Japanese or French.