The advantage Natal has is that the array microphone can isolate sounds from individual players. Combined with player facial recognition, etc., there shouldn't in theory be many issues with disruption, noise, etc.
Microsoft have a pretty decent track record in speech, all things considered.
But it won't know who to give priority to (e.g., if the kid talks first, then I may be ignored; although sometimes he may be the legit user while daddy is the idiot who's trying to tease him). Plus not all noises come from humans, visible or not. All the surrounding sounds will stack on top of each other even if the mic tries to pick up from one area.
One of the best uses for voice-driven UI is when I don't have to walk all the way to the TV/console to issue a command (e.g., play music). So the camera may not even see me. If I had to go to the TV and face the camera, I might as well use gestures or the remote for a more accurate hit. [size=-2]I use RemotePlay for this right now though.[/size]
I'm sure it'd be better than cellphone speech recognition, but the success rate needs to be very high and consistent. Not sure how accurate the mic array is though. May need to filter out background noise/music.
Which wouldn't be a problem for, say, adventure games. Or educational games. Or perhaps the team management screen of a sports simulation.
Taking the sports simulation as the example: even with lag, it may be much faster to just call out the name of the player you want than to scroll to him and select him.
Then again, if I had to try pronouncing some of the names of players in the NHL...
Yep, picking an item out of a known library is one of the most successful voice-input use cases. Something like Scribblenauts or Heavy Rain should work too.
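To show why a known library helps so much, here is a rough sketch. It assumes the recognizer hands back a text guess, and the game then snaps that guess to the closest entry in a fixed list. The roster, the `pick_player` name and the 0.6 cutoff are all made up for illustration, and this has nothing to do with how Natal actually works; a real recognizer would more likely constrain its grammar before decoding rather than fix up the text afterwards.

```python
import difflib

# Illustrative roster only; a real game would load this from its team data.
ROSTER = ["Ovechkin", "Datsyuk", "Zetterberg", "Kovalchuk", "Malkin"]


def pick_player(heard, roster=ROSTER, cutoff=0.6):
    """Return the roster name closest to what was heard, or None if nothing is close.

    `heard` is assumed to be the recognizer's raw text guess.
    `cutoff` is an arbitrary similarity threshold between 0 and 1.
    """
    # Compare case-insensitively, but return the name as spelled in the roster.
    by_lower = {name.lower(): name for name in roster}
    matches = difflib.get_close_matches(
        heard.lower(), list(by_lower), n=1, cutoff=cutoff
    )
    return by_lower[matches[0]] if matches else None


if __name__ == "__main__":
    print(pick_player("zeterberg"))   # a slightly mangled guess
    print(pick_player("xyzq"))        # nothing close enough
```

The smaller the list, the more a mangled guess can still land on the right entry, which is why a roster or inventory screen is a friendlier target than free-form dictation.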
SingStar works amazingly well, but then again, they kinda cheated because the mic is right beside the player's mouth. So the voice input characteristics can be tuned rather accurately (more consistent acoustic parameters).
EDIT: I remember MS signed the Scribblenauts developer for a game? If so, we will probably see some sort of Scribblenauts for Natal.
A good way to make general voice input more worthwhile is to allow titles to export their top-level macros to the Natal hardware unit (at the Dashboard level). So the user can utter a voice command even when the 360 is off, and have the console boot up, start the right game and, for instance, go all the way to joining an MP session waiting for a game. I suspect this is why Natal has a separate power supply.
If they have this kind of structure laid out, then the user can also use gestures or the controller to choose these high-level macros (like AppleScript!).
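A minimal sketch of what I mean by exported macros, purely hypothetical: the `MacroRegistry` class, the step names and the "power_on"/"launch" actions are invented for illustration and are not any real Natal or 360 API. The point is only that a title registers a phrase plus a list of steps, and the same entry can be triggered by voice, gesture or controller.

```python
class MacroRegistry:
    """Hypothetical dashboard-level table of macros exported by titles."""

    def __init__(self):
        self._macros = {}  # normalized phrase -> (title, list of steps)

    def export(self, title, phrase, steps):
        """Called by a title to publish one top-level macro."""
        self._macros[phrase.strip().lower()] = (title, list(steps))

    def resolve(self, phrase):
        """Look up a macro by its trigger phrase; returns None if unknown."""
        return self._macros.get(phrase.strip().lower())


def plan(registry, phrase):
    """Build the full action list for a trigger, starting from a powered-off console."""
    entry = registry.resolve(phrase)
    if entry is None:
        return []  # unknown phrase: do nothing rather than guess
    title, steps = entry
    return ["power_on", "launch:" + title] + steps


if __name__ == "__main__":
    registry = MacroRegistry()
    registry.export("SomeShooter", "join my friends", ["open_multiplayer", "join_session"])
    print(plan(registry, "Join my friends"))
```

Since the trigger is just a key into the table, a gesture or a controller menu could call `plan` with the same phrase and get the same result.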