Kinect technology thread

Option 2 is not really a possibility. The PrimeSense specs have the depth data being 1/4 of the CCD's resolution. If MS have a 640x480 depth map available, that means they have a 1280x960x30fps CCD, which is a lot more high-end than typical webcams, and it wouldn't make financial sense to include hardware they're not using. As I've mentioned before, the cost factors suggest two versions of the same CCD used for optical and depth, meaning the depth is 1/4 of the optical resolution.

Ok I follow you there. If Microsoft were going to evolve the CCD in Kinect for their next generation system would a 1280x720x32 @ 30fps CCD be the next most likely version? That would almost double the resolution (640x360x16 @ 30fps)?

Tommy McClain
 
His fingers flicker in and out as they lie in between sampling points, and he's standing twice as close to the camera as typical playing distance. There's enough there to recognise large hand gestures, like an open hand, closed fist, and spread fingers, but not enough detail to do individual finger tracking (at normal play distance, though a foot from the camera would work if the setup can focus that closely).
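A rough estimate makes the point about sampling points concrete. This is only a sketch: the ~57° horizontal field of view, the 320x240 depth map, and the 9 cm hand width are assumed figures, not confirmed specs.

```python
import math

# Rough estimate of how many depth pixels land on a hand at a given distance,
# illustrating why fingers "flicker" between sampling points.
# Assumptions: ~57 degree horizontal FOV, 320x240 depth map, ~9 cm hand.
def pixels_across(object_width_m, distance_m, fov_deg=57.0, h_res=320):
    """Approximate pixel width an object covers in the depth image."""
    view_width_m = 2 * distance_m * math.tan(math.radians(fov_deg / 2))
    return object_width_m / view_width_m * h_res

hand = 0.09  # metres
print(round(pixels_across(hand, 1.0)))  # standing close: ~27 px across the hand
print(round(pixels_across(hand, 2.5)))  # typical play distance: ~11 px
```

At ~11 pixels across the whole hand, individual fingers land on one or two pixels each, which matches the flickering described above.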
The other guy, who's almost side by side with Kudo, is visible in the full frame. Kudo seems to be a little closer...

But I'm thinking purely of hand-gesture recognition at playing distance; it would be a huge downside if even for some gestures you'd have to be closer to the camera.

Wow... that's like... painfully low XD

I thought the limitation was the USB 2.0, which would allow two 30fps 640x480 streams, and that devs could choose which sensor they would want to prioritise... But at that res it doesn't even make sense that they are not using 60fps XD I wonder if that's down to performance issues (they are working with only 1/4 of the image after all).

For the people who could test Kinect last year and now: does the depth view look like it's been reduced, or is there no difference now?
 
Ok I follow you there. If Microsoft were going to evolve the CCD in Kinect for their next generation system would a 1280x720x32 @ 30fps CCD be the next most likely version? That would almost double the resolution (640x360x16 @ 30fps)?
Quadruple the resolution! ;) I imagine so. You can get FullHD capture in mobiles now. 720p CCDs should be common as muck in a couple of years. There's also the possibility of a different system entirely. Maybe the cost of time-of-flight cameras will come down, or somesuch, so we can use that tech instead, which should improve in every area.

I thought the limitation was the usb 2.0, which would allowed 2 30fps 640*480 streams, and that devs could choose between each sensor they would want to prioritize
They'll get both streams simultaneously. USB is easily capable of supporting that. With 24 bit depth, 720p colour + 720p depth at 30 fps would be ~160 MB/s which is well within USB2's spec.
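The arithmetic behind that ~160 MB/s figure can be checked in a few lines. This is a sketch of the poster's own assumptions (24-bit colour and 24-bit depth, both at 1280x720, 30 fps):

```python
# Back-of-envelope rate for two uncompressed 720p streams at 30 fps,
# assuming 3 bytes per pixel for both colour and depth (poster's figures).
def stream_mb_per_s(width, height, bytes_per_pixel, fps):
    """Raw (uncompressed) stream rate in megabytes per second."""
    return width * height * bytes_per_pixel * fps / 1e6

colour = stream_mb_per_s(1280, 720, 3, 30)  # ~82.9 MB/s
depth = stream_mb_per_s(1280, 720, 3, 30)   # ~82.9 MB/s
total = colour + depth                      # ~165.9 MB/s

# Note: USB 2.0 signals at 480 megaBITS/s, i.e. 60 MB/s before overhead,
# so ~160 MB/s actually exceeds the bus -- the bit/byte mix-up is
# acknowledged further down the thread.
print(round(total, 1))
```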
But at that res doesn't even make sense that they are not using 60fps XD I wonder if that's for performance issues (they are working with only 1/4 of the image afterall).
What's been said before is that at the higher resolution, they were using 4x the processing and not gaining any real advantage. If 640x480 doesn't benefit you more than using 320x240, it doesn't make sense to use it if the lower spec option is cheaper. Also the PrimeSense tech quarters CCD resolution, so to get a 1280x720 res depth image would require a 2560x1440 res CCD!
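The "PrimeSense quarters the CCD resolution" claim above reduces to a 2x downscale per axis. A minimal sketch of that relationship, assuming exactly a half-resolution output in each dimension:

```python
# Depth output is assumed to be half the sensor resolution per axis
# (1/4 of the pixels overall), per the PrimeSense claim in the post above.
def required_ccd(depth_w, depth_h):
    """CCD resolution needed to produce a given depth-map resolution."""
    return depth_w * 2, depth_h * 2

print(required_ccd(320, 240))   # (640, 480)   -- the shipped Kinect numbers
print(required_ccd(1280, 720))  # (2560, 1440) -- the hypothetical 720p depth map
```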

It's important not to focus solely on numbers. The tech exists to serve a purpose. If having the higher res depth doesn't aid the purpose, we're not actually missing out on anything so can't begrudge the tech. As a parallel, if Move came with a 2560x1440 camera, it wouldn't change the experience which is already as accurate as a player can perceive.
 
Talking about numbers, USB2 in practice doesn't get anywhere near 160 MB/s though, does it?
 
Wouldn't an increase in fps be better for gamers than an increase in resolution?
Because that's the main complaint right now: input lag.
 
Maybe Shifty's getting confused between megabits and megabytes!

In practice, a USB 2.0 connection tops out at around 35 megabytes per second. Uncompressed 720p video can't be done. The PlayStation Eye uses MJPEG compression on the video to stream 640x480 at 60fps.
 
Maybe Shifty's getting confused between megabits and megabytes!
Yeah, I was just going to correct myself! I had the 480 Mb/s figure, but failed to make the byte/bit distinction.

In practice, a USB 2.0 connection tops out at around 35 megabytes per second. Uncompressed 720p video can't be done. The PlayStation Eye uses MJPEG compression on the video to stream 640x480 at 60fps.
What level of compression artefacts does that introduce, as that was an issue with EyeToy? In fact, irrespective of that, noise means you need a level of filtering of the source image anyway.

This also means LightHeaven was right about the bandwidth limiting image data to one uncompressed 640x480 30fps feed, but compression provides the two streams simultaneously. We know the devs get access to both feeds plus audio, because there's a system called registration that matches the skeleton data to the visual data.
 
grandmaster said:
Maybe Shifty's getting confused between megabits and megabytes!

In practice, a USB 2.0 connection tops out at around 35 megabytes per second. Uncompressed 720p video can't be done. The PlayStation Eye uses MJPEG compression on the video to stream 640x480 at 60fps.

I think at 16 bits per pixel the PS Eye actually does this resolution uncompressed, no MJPEG. The numbers fit pretty much exactly.
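That claim is easy to sanity-check. A quick sketch, taking the ~35 MB/s practical USB 2.0 ceiling quoted earlier in the thread as given:

```python
# Raw rate of a 16-bit 640x480 @ 60 fps stream, to test whether it fits
# USB 2.0 uncompressed (practical ceiling ~35 MB/s per the thread).
def raw_rate_mb(width, height, bytes_per_pixel, fps):
    """Uncompressed stream rate in megabytes per second."""
    return width * height * bytes_per_pixel * fps / 1e6

ps_eye = raw_rate_mb(640, 480, 2, 60)  # 2 bytes per pixel = 16-bit
print(round(ps_eye, 1))  # ~36.9 MB/s -- right at the edge of the practical limit
```

So the numbers do fit "pretty much exactly": the raw stream is within a couple of MB/s of the practical ceiling.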
 
In that case maybe the next PS Eye and Kinect will use a proprietary connection, which isn't a good thing if you want maximum compatibility with the PC and so on.

I mean, if the USB port doesn't improve over time.

Alucardx23, I've been toying with some options at the page you provided http://www.ivona.com/# and the British voice is really nice. Slow, very clear and with a nice accent. I think that's exactly how most people should speak to AIs that are trying to recognise speech.
 
Not really as the camera has to work with USB for older models. They can't give a technical advantage to Slims, and devs couldn't develop for it even if MS did because they'd alienate owners of older models. We can be confident it's USB with extra power in a proprietary connector to prevent wrong device attachment that could be damaging to equipment.
 
They'll get both streams simultaneously. USB is easily capable of supporting that. With 24 bit depth, 720p colour + 720p depth at 30 fps would be ~160 MB/s which is well within USB2's spec.
Isn't USB 2.0's theoretical limit something like 60 MB/s, and that's without counting any overhead?

Edit: sorry, didn't see this was already addressed XD

What's been said before is that at the higher resolution, they were using 4x the processing and not gaining any real advantage. If 640x480 doesn't benefit you more than using 320x240, it doesn't make sense to use it if the lower spec option is cheaper. Also the PrimeSense tech quarters CCD resolution, so to get a 1280x720 res depth image would require a 2560x1440 res CCD!

It's important not to focus solely on numbers. The tech exists to serve a purpose. If having the higher res depth doesn't aid the purpose, we're not actually missing out on anything so can't begrudge the tech. As a parallel, if Move came with a 2560x1440 camera, it wouldn't change the experience which is already as accurate as a player can perceive.

I agree perfectly with this point, but giving hand-gesture commands to the camera at the distance you would usually play at might be affected by this resolution, limiting the very purpose of the tech. Though I have no idea whether the current res is a limitation on that front or not.
 
Quadruple the resoluion! ;) I imagine so. You can get FullHD capture in mobiles now. 720p CCDs should be common as muck in a couple of years. There's also the possibility of a different system entirely. Maybe the cost of time-of-flight cameras will come down, or somesuch, so we can use that tech instead which should improve in every area.
From conversations I've had, it wasn't the cost of the ToF, it was the accuracy. You're measuring differences in picoseconds, and even tiny clock jitter can cause large changes in perceived distance. As the tech improves (and it will, if only to be a contender for the next generation of these interfaces) it may become accurate enough to be considered.
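The timing demands follow directly from the speed of light: a ToF camera times a round trip, so small distance changes correspond to tiny time differences. A minimal illustration:

```python
# Round-trip time for light over a given distance change, showing the
# timing precision a time-of-flight camera needs.
C = 299_792_458.0  # speed of light in m/s

def round_trip_time_ps(distance_m):
    """Round-trip travel time in picoseconds for a given distance."""
    return 2 * distance_m / C * 1e12

# A 1 cm change in subject distance shifts the echo by only ~67 ps,
# so even small clock jitter reads back as centimetres of depth error.
print(round(round_trip_time_ps(0.01), 1))
```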
 
Surely they would want to double or quadruple the frame-rate first? People notice the latency, so why not try to get that ~150-200ms down to ~80-100 with more sampling points for greater movement accuracy?

The way I see it, they'll need to increase both the resolution and the frame-rate for the next generation. I would suspect they would go with some half step resolution for the standard RGB such as ~900 by 600 or whatever the correct ratio is. To capture finger movements etc they would need both a higher resolution image to resolve them and a higher frame-rate to capture enough sample points.
 
Surely they would want to double or quadruple the frame-rate first? People notice the latency, so why not try to get that ~150-200ms down to ~80-100 with more sampling points for greater movement accuracy?

Wait, you want it to get down below a standard controller (~100-133 ms, not counting display lag)? Is that even possible? And Kinect is currently ~150 ms, not counting display lag.

Regards,
SB
 
Surely they would want to double or quadruple the frame-rate first? People notice the latency, so why not try to get that ~150-200ms down to ~80-100 with more sampling points for greater movement accuracy?
Input framerate is going to have little impact on latency. Going from 30fps to 60fps would mean 17ms less time to get a frame, which is all of 10%. The latency is in the video encoding, decoding, image processing, etc. Smoothing input movements means several frames of lag as you interpolate. Gesture-based gaming means lots and lots of lag as there's no other way to determine what someone's gesture is until they've completed it (although you could cheat and have wildly different gestures that can be identified in their opening movement, so as the player continues the rest of the gesture, the game already executes the move. This would probably require fairly unnatural input though).
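The 10% figure above is easy to see as a sum of pipeline terms. The millisecond values here are illustrative assumptions, not measured numbers:

```python
# Rough model: total latency = frame acquisition time + everything else
# (encoding, image processing, smoothing, game logic). Pipeline cost
# of 120 ms is an assumed, illustrative figure.
def total_latency_ms(capture_fps, pipeline_ms):
    frame_ms = 1000.0 / capture_fps  # time to acquire one frame
    return frame_ms + pipeline_ms

PIPELINE_MS = 120.0
at_30 = total_latency_ms(30, PIPELINE_MS)  # ~153 ms
at_60 = total_latency_ms(60, PIPELINE_MS)  # ~137 ms
print(round(at_30 - at_60, 1))  # ~16.7 ms saved -- roughly 10% of the total
```

Doubling the capture rate only shaves the single frame-interval term, which is why the rest of the pipeline dominates.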
 
Multiple language issues, and they won't roll it out until all regional languages are supported?

It could be languages, but I immediately remembered this bit:

In more than a few demos, Kinect didn't always understand what people were saying. And these were folks without accents and in rooms without a lot of echo or noise. Kinect has a setting for audio that measures the amount of ambient noise in your room and filters it out, but until we can field test it, it's impossible to know how well this works. Accents are a bigger concern. We've been told the "hope" is that accents are not an issue. Maybe Microsoft can enlist Ubisoft for help, because EndWar used voice commands and did an excellent job accounting for accents.

http://uk.xbox360.ign.com/articles/109/1099085p1.html
 