Poll: How accurate is XB1 voice control?

How accurate is XB1 voice control for you?


  • Total voters
    32
I must admit I've never needed to control my TV so urgently that it couldn't wait a couple of seconds while I get to the side table next to my sofa. That's where the remote usually is.

Usually? I thought a 5% failure rate was unforgivable? If it isn't there 100% of the time you should probably throw your TV and/or console away.
 
Users tend to be more forgiving with these types of devices than with something like keyboards, where we are conditioned to expect 100% all the time.
Interesting example you gave. Even though a keyboard always registers precisely the key I press with reliability nearing perfection, I can't imagine having to type during my work instead of using speech recognition.

A technology doesn't have to be perfect to be a vast improvement over the alternative or status quo. Some people just can't seem to grasp that.
 
And what solution did you have in place just in case that wait was ever too long?
Use the controls on the TV?

Usually? I thought a 5% failure rate was unforgivable? If it isn't there 100% of the time you should probably throw your TV and/or console away.
It's a different issue. One is an irritation with a lack of response to an action and it's something that goes beyond remote controls. Click an icon on a computer and nothing seemingly happens, or there's no feedback that something is happening? Irritating.

But I've never needed to control my TV so urgently that it bothered me to find the remote, or use the TV controls.
 
How is it badly designed? And how is it slow? Maneuvering through the XB1 UI with a controller is fast, which isn't hard because it's a pretty straightforward and simple design. Outside of the apps and store there is an app list and a settings page. But it's just a game console, not a PC, which offers far greater utility, requires a more complex design and is a more difficult environment in which to pull off a highly usable and fast UI.

By nature voice control is a more efficient way to control a UI. Going from point A to B is just a matter of a command. And for humans, language is the most efficient form of communication that exists.

Good UI is a matter of efficiency and usability. You could get anywhere quickly in a UI if you just threw everything right on the screen, but you'd end up with a convoluted mess of an eyesore. But voice control isn't a quest through a visual design with a simple pointer mechanic.

And exactly what's faster, not based on a mouse or touch UI, but just as complex? I don't think most people like the idea of the console or remote looking like the cockpit of a space shuttle, with a bunch of knobs and buttons just to get to things quickly.

Good UI is based on input type; a UI designed for mouse or touch input is not the best fit for gamepad input. If the X1 UI were well designed it would be gamepad-centric (note, this would not take anything away from voice control).
For example, the current way to get an app snapped is not optimal.
My version would be like this: double-press the Xbox symbol to get a list of apps that can be snapped. (That app icon list could appear in the right-side "snap" area, or as a horizontal list that could be browsed quickly with RB/LB or one by one with left/right.) The list begins with the most recently or most often used apps.

The same goes for app switching: click the Xbox symbol once and you get a horizontal app list, the same way as the snappable app list. (The home screen would be the first "app".) This app list could be organized and multi-row (up/down navigation), for example pins, games, other apps.

For specific tasks like "record that", something like pressing the menu and view buttons at the same time could be quite a nice way to do it.

That way _most_ snapping and app switching could happen in 1-2s max. Sure, voice control could be faster in some specific tasks, but a good gamepad UI would be faster in most tasks.

The current X1 UI... well, it could be a lot better. It is not a "gamepad" UI, and the home button and (B) back button logic is just horrible.
 
The big difference between a remote and voice is predictability. Yes, the remote could get lost, the buttons may wear over time, the battery may run low, etc., but you know the problem and you can probably fix it. But it's frustrating when you say something to Siri or GNow and they interpret it differently. Kinect should work for the majority of people, but for those people who can't get Kinect to work reliably and in a predictable manner, it would indeed be very frustrating.
 
Seems like most owners put it in the 90% category which is encouraging and a bit different than the media narrative.

Still have not opened mine so unable to vote :LOL: Probably will around Christmas.

Judging by Google voice recognition on my Galaxy S 3, it will never be 100% accurate, but it may be surprisingly accurate. I still have to wonder if it's really any better than using the controller, though, in a well streamlined system.
 
How do you streamline controller input for entering a search string?

Device input is shifting away from hierarchical menus and linearly navigated lists. It's a shift we started seeing with the internet and hyper-linking.

Xbone is a remarkable and ballsy device (it's just not one that I'm personally sold on at the moment).
 
I suppose for many, including myself, the reference is past UI models that generally work well. I've never had trouble finding what I want in XMB except maybe for system settings. For games, a list is fine, but then I don't have 50+ games installed. If I did, perhaps I'd be more inclined to find one by name rather than perusing the list? User interfaces themselves are forgoing structure and replacing it with searches. From iPad to Windows, standard practice now is to type in what you're looking for rather than look for it yourself. It sounds like that's the way XB1's UI has been designed too. The most used items are on the front screen, and then the rest gets lost, but you don't need to care where it is as you can call it by name. For an internet game streaming service, which will be the future, search by name will be invaluable (like finding content on NetFlix et al) and natural voice input will be better than clumsy controller-based keyboards. You also don't need to worry about spelling. I wouldn't mind a finger-tracking keyboard though. If I could push virtual buttons and type on a screen the beginnings of a title and see it come up, that'd do me fine. I'd prefer gesture input to voice.
 
How do you streamline controller input for entering a search string?

Device input is shifting away from hierarchical menus and linearly navigated lists. It's a shift we started seeing with the internet and hyper-linking.

Xbone is a remarkable and ballsy device (it's just not one that I'm personally sold on at the moment).

Again I haven't used XBO, but I've gotten the idea from some posts that MS streamlined the UI for voice, while perhaps even intentionally making it difficult to navigate with a controller, or at least not focusing on ease of use with controller, therefore having the effect of making Kinect mandatory in essence, perhaps a desired outcome.
 
Good UI is based on input type; a UI designed for mouse or touch input is not the best fit for gamepad input. If the X1 UI were well designed it would be gamepad-centric (note, this would not take anything away from voice control).
For example, the current way to get an app snapped is not optimal.
My version would be like this: double-press the Xbox symbol to get a list of apps that can be snapped. (That app icon list could appear in the right-side "snap" area, or as a horizontal list that could be browsed quickly with RB/LB or one by one with left/right.) The list begins with the most recently or most often used apps.

The same goes for app switching: click the Xbox symbol once and you get a horizontal app list, the same way as the snappable app list. (The home screen would be the first "app".) This app list could be organized and multi-row (up/down navigation), for example pins, games, other apps.

For specific tasks like "record that", something like pressing the menu and view buttons at the same time could be quite a nice way to do it.

That way _most_ snapping and app switching could happen in 1-2s max. Sure, voice control could be faster in some specific tasks, but a good gamepad UI would be faster in most tasks.

The current X1 UI... well, it could be a lot better. It is not a "gamepad" UI, and the home button and (B) back button logic is just horrible.

It is gamepad centric. It follows standard gamepad navigation conventions. By your account all consoles have had horrible gamepad UIs.

Good UI design requires more than optimizing one isolated feature, and it is impossible if you ignore your target audience.

Snapping doesn't warrant use of the icon button. While it's a nice feature, it shouldn't be prioritized that highly. And there is a reason why most modern consoles, smartphones and tablets sport a "Home" button. Most general users want a simple and efficient way to get back to the "Home" or "Main" screen of their device, and navigating back to the Home screen should happen far more often than snapping apps. Home buttons were added for exactly that purpose. If that function were so unimportant, it could have been added by commandeering an existing button on the controller, which has plenty of unused buttons when navigating the UI.

You could map a function to every button on a gamepad, and all you would be doing is providing a bunch of functionality ignored by the vast majority of your users. It's the reason your standard mouse or standard mobile device doesn't come with 18 buttons or keys on it, and why most consoles' UIs can be navigated with just the joystick and three other buttons (launch button, back button and home button). Adding more does nothing for the typical user's experience, as most favor solutions that involve a minimal number of keys or buttons for navigating the UI, not more.

If you can criticize MS for anything, it is that its UI design doesn't accommodate the small fraction of the userbase that favors what are basically complicated keystroke shortcuts. Instead it favors using voice to unlock functionality for mainstream users that's usually obscured behind mechanisms most don't bother to learn.
 
DrJay should probably watch some videos on how the UI actually works. Up, down, select, left, right? No. You say, "Xbox Select" and then all menu options on the screen are highlighted in text, and then you say the one you want. You don't have to move a cursor or highlight around on the screen with voice control. If you are navigated to a new screen, you don't have to say, "Xbox Select" again. The text keeps popping up until it stops listening. If you happen to be on a screen that can be scrolled, there are voice commands to see the content in pages, but you can also scroll using a gesture by hand.

Thank you for explaining this. For some folks, arguing with their imagination is more enjoyable than arguing from a factual and/or experiential basis.
 
Again I haven't used XBO, but I've gotten the idea from some posts that MS streamlined the UI for voice, while perhaps even intentionally making it difficult to navigate with a controller, or at least not focusing on ease of use with controller, therefore having the effect of making Kinect mandatory in essence, perhaps a desired outcome.

No, the OS UI is easily navigable by controller. However, no matter how easy the on-screen UI is to operate, it's easier to say "Xbox, go to <app>". No menus, scrolling, XMB, up, down or anything; it's just there. When you want to do anything else, say it: "watch TV", no remote necessary. Say "go to Netflix" and you're already there.

There are very few non-gaming cases where you can argue for using a controller or remote over voice at this point.
 
Again I haven't used XBO, but I've gotten the idea from some posts that MS streamlined the UI for voice, while perhaps even intentionally making it difficult to navigate with a controller, or at least not focusing on ease of use with controller, therefore having the effect of making Kinect mandatory in essence, perhaps a desired outcome.

That's untrue. There is nothing difficult about navigating the XB1 UI. It's like having an XP main screen with an app folder, a store app, a snap app, a profile app, a recently used app bar and a Windows start menu that contains pinned apps. There isn't much to navigate to and from in the UI. It's simple and basic in nature, which any UI needs to be when it has to accommodate both a gamepad and a general user.

With the XB1, MS took Kinect, which is basically a voice recognition based command line interface, and paired it with one of the most inefficient tools (a gamepad) for navigating a UI. There is no way to streamline a UI around a gamepad to make it as easy to navigate as it would be with a robust command line interface.

Using a CLI is like giving a taxi driver your destination and letting him do all the navigating. A gamepad is like navigating for a driver and giving street by street directions or in other words a human GPS.

And no console makes heavy use of gamepad button based shortcuts. There is nothing to encourage platform providers to basically design their UI in a way that purposely creates a small subset of power users when it comes to UI navigation.
 
Good UI is based on input type; a UI designed for mouse or touch input is not the best fit for gamepad input. If the X1 UI were well designed it would be gamepad-centric (note, this would not take anything away from voice control).

For the rest we'll have to agree to disagree that the Xbox One UI isn't well designed.

What I do think is inane is that people think the Xbox One voice controls only seem more efficient because the UI isn't designed well for a controller.

I've already shown an example where voice controls can be and are more efficient than the most efficient controller-based input possible: the single button push to accomplish a task.

The example is the ability to push the Xbox button to return to the home page from wherever you happen to be. There you have the following two situations.

The button press is as fast or slightly faster than issuing the voice command of "Xbox, Go Home" if the controller is already powered on and in your hands.

The button press is slower, and sometimes significantly slower than issuing the voice command if the controller isn't already in your hands or is powered off to conserve battery life.

You cannot get more efficient or faster than a single button press to accomplish a task, and yet even then, at best it is similar in speed to issuing the voice command to accomplish the same task. And at its worst, with the controller off and not immediately at hand (like when watching a movie), it is significantly slower and less efficient than issuing the voice command.

That means that no matter how well designed the UI is with regards to a controller control scheme it will always be inferior or at best similar in speed and efficiency to issuing voice commands.

The lone caveat here being that it is true only as long as the system can process and execute your voice commands reliably and quickly. But that has nothing to do with how well the UI is designed with regards to controlling it with a console controller.

Of course, there are things where controller input is more appropriate or accurate. Sustained movement and the ability to stop at an arbitrary point, for example. Like continuous scrolling on a web page. But for things like that you also have gesture based controls.

And, of course, an argument could be made that some of the voice commands aren't as well designed as they could be. Volume up and volume down could arguably not be as good as say, volume [x] percent if you need to adjust volume by a large degree, but again the UI design has no bearing on that anyway.

And this is true not only for the Xbox One, but for the PS4 as well for the limited number of voice commands available (assuming they didn't bork their voice command implementation) and minus the gesture controls.

Regards,
SB
 
I suppose for many, including myself, the reference is past UI models that generally work well. I've never had trouble finding what I want in XMB except maybe for system settings. For games, a list is fine, but then I don't have 50+ games installed. If I did, perhaps I'd be more inclined to find one by name rather than perusing the list? User interfaces themselves are forgoing structure and replacing it with searches. From iPad to Windows, standard practice now is to type in what you're looking for rather than look for it yourself. It sounds like that's the way XB1's UI has been designed too. The most used items are on the front screen, and then the rest gets lost, but you don't need to care where it is as you can call it by name. For an internet game streaming service, which will be the future, search by name will be invaluable (like finding content on NetFlix et al) and natural voice input will be better than clumsy controller-based keyboards. You also don't need to worry about spelling. I wouldn't mind a finger-tracking keyboard though. If I could push virtual buttons and type on a screen the beginnings of a title and see it come up, that'd do me fine. I'd prefer gesture input to voice.

X1 has pins on your home screen and a "games and apps" button on the main screen. Pretty easy and quick to get to all of your games and apps, as well as view your favourites. Bing search covers the rest.
 
Yesterday I was playing the Dead Rising 3 demo and wanted to switch over to Battlefield 4 to play with a friend. One voice command, "Xbox, go to Battlefield 4" and you get an instant transition to your new game or app. Not sure how that could get any faster. No need to go to the home screen or anything. The UI works fine and most of the things you'd want to do can be done very quickly with the controller. So if you want to use the controller, for any reason, it's going to work fine. There are cases where the voice control will actually be better and easier, assuming it works well in your environment and does not have any problems with your accent/speech. I'm probably the ideal case, and it works great for me.
 
2) The 5% failure is around fringe tasks and not the major functions which most are calling pretty much 100%. So in general use it works, and only in some aspects where the names get complicated does it fail, which is still better in some cases than trying to type in on a virtual keyboard. Texting on a phone can be 80% or worse at times, but we all still put up with it. For finding games and content, using the name is very intuitive and likely faster than various button interfaces even if you have to try a couple of times.

5%? If my IR remote had a regular failure rate of 5%, I'd be declaring it as 'broken'. 5% would be 1 failed hit out of 20 button presses. Of course, if I attempt to press multiple buttons at the same time or increase the speed at which I press the buttons, I would increase the failure rate dramatically, so instinctively, like most intelligent humans, you learn and adapt how you use these devices accordingly and within their boundaries. Pressing too many buttons at once IMO does not constitute a failure of the remote, but a failure of usage. A working remote control, within proper usage, should yield a close to 99.9% success rate.

Also, a failure on a remote (or on a phone by mis-hitting) is a lot less of an inconvenience. If the button didn't register, you just press it again. If it doesn't work the second time, then it starts to become an annoyance. The difference however is that the 'cause & effect' are instant. It takes around 1-2 second(s) to realize that the button you pressed didn't yield the result you wanted.

If you want to compare this to voice controls, you're already losing a lot more time, given that the voice command itself isn't as instant as pressing a button. Voice control in this context is more like talking to a stranger, explaining something and him not understanding. It's forcing you to repeat a longer sentence.

The tolerance might be higher for voice-controls because it is more complex and it gives you the ability to do more complex things.

What I would be interested in from current Xbox One owners: when a failure does register, how do you go about solving it?

- Do you retry the voice command the same way? Do you change the way you state your voice commands? How many times do you need to repeat the command until it works?

On a remote, you simply re-press the button if it doesn't register. With voice controls, I wouldn't imagine it to be as simple; some commands might register at the 2nd attempt, but some commands that are inherently difficult to understand probably fail a 2nd and a 3rd time as well? How much tolerance do you apply as a user in this case until you just give up and use the controller?
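As a rough way to frame the retry question, here is a minimal sketch that assumes every attempt fails independently with the same probability. That is a simplifying assumption, and probably a generous one for voice, since a command that is inherently hard to recognize is likely to fail again. The function names are made up for illustration; the 5% and 99.9% figures are the ones quoted in this thread.

```python
def expected_attempts(failure_rate: float) -> float:
    """Mean number of tries until the first success (geometric model)."""
    if not 0.0 <= failure_rate < 1.0:
        raise ValueError("failure_rate must be in [0, 1)")
    return 1.0 / (1.0 - failure_rate)


def prob_all_fail(failure_rate: float, tries: int) -> float:
    """Probability that the first `tries` attempts all fail (independent tries)."""
    if tries < 0:
        raise ValueError("tries must be non-negative")
    return failure_rate ** tries


if __name__ == "__main__":
    # Rates taken from the discussion: 5% for fringe voice commands,
    # 0.1% for a working remote (the "99.9% success" figure).
    for label, rate in [("voice, fringe commands", 0.05), ("IR remote", 0.001)]:
        print(label, "mean attempts:", round(expected_attempts(rate), 4))
        print(label, "chance of 2 misses in a row:", prob_all_fail(rate, 2))
```

Under that model a 5% failure rate averages only slightly more than one attempt per command, so the real question is how often failures cluster on the same command rather than the raw rate.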
 
5%? If my IR remote had a regular failure rate of 5%, I'd be declaring it as 'broken'.
Are you really aware of it though? E.g. I press volume up. Nothing happens. I press again. It's so small that you won't notice much of the time. Or you go to type in a channel number and it only accepts part of it before changing channel, and then it changes to the second half, and then you start again. Failure is quite apparent on touch interfaces that sometimes don't register, or where you press the wrong thing. And, again, the 5% is in the fringe commands. The core commands are much nearer 100% accuracy by reports of users here.

Pressing too many buttons at once IMO does not constitute a failure of the remote, but a failure of usage.
Same with voice controls some of the time.
 
Are you really aware of it though? E.g. I press volume up. Nothing happens. I press again. It's so small that you won't notice much of the time. Or you go to type in a channel number and it only accepts part of it before changing channel, and then it changes to the second half, and then you start again. Failure is quite apparent on touch interfaces that sometimes don't register, or where you press the wrong thing. And, again, the 5% is in the fringe commands. The core commands are much nearer 100% accuracy by reports of users here.

Actually, now that I think of it, the Samsung remote that my mum uses has quite a bit higher failure rate than what I am used to. Although, that is pretty much down to a low battery, and I think the buttons, which are rather big (and cheap), don't always register nicely when you press them. The transmitter also seems to lag a bit more compared to mine, which could be down to the battery or the quality (or the fact that it most likely has been thrown around and dropped a few times).

Bear in mind, my own remote isn't anything high-tech, and the IR remotes I've been using all work flawlessly in that, yes, I can state, they are pretty close to 99%, and if there is a failure, it's usually down to user error (e.g. me pressing a button too quickly after another one or being too far away).

I think I would notice, though I accept your point as well that you tend to overlook these things when they do occur. On the other hand, I suspect this is also down to it being a rather small annoyance, namely because the feedback is instant. Using voice controls isn't, because the command 'xbox. one. do whatever' already requires around 4-5+ seconds (I'm guessing here, as Xbox. One. already takes around 2-3 seconds), so it not registering means a higher loss of time, ergo a more noticeable annoyance. Also, I suspect there is a small gap between when you complete the command and when it's executed on the system (after all, the Xbox also needs to figure out that the command is completed?).

Fair enough if people are happy to tolerate this. I'd still be quite interested to know, when these failures do occur, how many retries are required (on average) until the device does the command as requested. Surely, this has to be a lot more complex (and therefore error-prone) than simply re-pressing a button on a remote?

I'm happy to say that if I was forced to use voice controls for a very specific task, like on a device to play games and only to play games, I'd be happy to give it a much higher tolerance. However, within the context of my living room and controlling my AV systems, I wouldn't really put up with it, as I'm used to a flawlessly working remote. The benefits would need to significantly outweigh the cons, but even then, for some things, I prefer simplicity and reliability over complex and less reliable methods, especially for inherently simple commands (start, stop, play, pause, volume controls, channel numbers, menu navigation etc).
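To put the time-loss argument in numbers, here is a small sketch, not a measurement. It takes the guesses made in the post above (roughly 4-5 seconds to speak a voice command, 1-2 seconds to notice and redo a button press) and assumes each attempt costs the same time and fails independently. The midpoints 4.5 s and 1.5 s, the function name, and the independence assumption are illustrative choices, not data from any Xbox One.

```python
def expected_seconds(seconds_per_attempt: float, failure_rate: float) -> float:
    """Mean time until a command succeeds, if every try costs the same time.

    Equals seconds_per_attempt times the mean number of tries, 1 / (1 - p).
    """
    if seconds_per_attempt < 0:
        raise ValueError("seconds_per_attempt must be non-negative")
    if not 0.0 <= failure_rate < 1.0:
        raise ValueError("failure_rate must be in [0, 1)")
    return seconds_per_attempt / (1.0 - failure_rate)


if __name__ == "__main__":
    # Assumed inputs: 4.5 s per voice attempt at 5% failure,
    # 1.5 s per remote attempt at 1% failure ("pretty close to 99%").
    voice = expected_seconds(4.5, 0.05)
    remote = expected_seconds(1.5, 0.01)
    print(f"voice:  {voice:.2f} s on average")
    print(f"remote: {remote:.2f} s on average")
```

With these inputs the gap comes almost entirely from the per-attempt time rather than the failure rate, and the sketch leaves out the cost of finding or powering on a remote or controller that isn't already in hand.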
 