Kinect technology thread

It's interesting that he mentions "additional software algorithms". The discussion is about processing postures Kinect may not recognize out of the box, and how much work and processing it takes to recognize those new postures. It sounds like it can be quite intensive, but they're using the GPU more than the CPU. So my question is: is the basic skeletal tracking that Kinect provides free in terms of console resources? I'm guessing no, since it doesn't look like there is any significant memory, or a good-sized ASIC/FPGA or processor, inside Kinect itself.

Sounds like they built a pretty flexible API that covers a lot of basic generic cases, but that is also extensible. I'm glad it's not a rigid system. Sounds like devs will really be able to tune their results.

Another interesting thing about hearing that the GPU is being used to process the data is how that might affect the design of the next 360. Do they go DX11 and shader (compute) heavy, or do they go with a heterogeneous CPU with vector units?
 
Another interview, this one with Blitz Games Studios' chief technical officer, Andrew Oliver. He talks about using the GPU for Kinect image processing.

What kinds of resources are these additional software algorithms taking up on the Xbox 360 hardware?

Well, that's interesting, because obviously if you're trying to run your game and look at these huge depth buffers and colour buffers, that's a lot of processing. And it's actually processing that a general CPU is not very good at. So you can seriously lose half your processing if you were to do it that way. We've found that it's all down to shaders, but turning a depth buffer into a skeleton is pretty hardcore shader programming. What you tend to do is write all your algorithms, get it all working in C++ code, and then work out how to rewrite that in shaders.

By shaders you mean that it's running on the GPU?

Exactly. The GPU on the Xbox is very powerful but we've all only been using it for glossy special effects. A really good example of this is Kinectimals, as the most intensive thing that you can do on a GPU is fur rendering. So that GPU is doing all the fur rendering, and I can guarantee that it's also doing a lot of image processing too. It's brilliant that the Xbox has a really good GPU and can handle both these things, but actually writing that shader code to do image analysis is hardcore coding at its extreme!

http://www.next-gen.biz/features/interview-andrew-oliver
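To make the per-pixel nature of that shader work concrete, here is a toy sketch in Python (all names, depth values and thresholds are invented for illustration, not from the actual pipeline). Classifying each depth sample against a play volume is independent per pixel, which is exactly the kind of loop that maps well onto one-GPU-thread-per-pixel shader code:

```python
# Toy sketch: the kind of per-pixel work described above, written first
# as plain CPU code, the way Oliver says they prototype in C++.
def segment_player(depth, near=500, far=2500):
    """Mark pixels whose depth (in mm) falls inside the play volume."""
    return [[1 if near <= d <= far else 0 for d in row] for row in depth]

# A tiny pretend depth buffer (real ones are 640x480).
depth_buffer = [
    [3000, 1200, 1100, 3000],
    [3000, 1150, 1050, 2900],
    [3000, 3000, 3000, 3000],
]
mask = segment_player(depth_buffer)
# Each pixel is computed independently, with no cross-pixel dependency,
# which is why the same logic can be rewritten as a shader pass.
```

Turning that mask into an actual skeleton is, as he says, the genuinely hard part; this only illustrates why the workload is GPU-friendly.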

Interesting, so by implication this could also indicate that the 10-15% system resource usage could be construed as 1% CPU and 9-14% GPU for Kinect. Or some other variable split - 1% CPU, 5% GPU and 5% memory, for instance.

I hadn't actually considered using the GPU to work out some of the Kinect functions but thinking about it now it would make a lot of sense.

Regards,
SB
 
Another interesting thing about hearing that the GPU is being used to process the data is how that might affect the design of the next 360. Do they go DX11 and shader (compute) heavy, or do they go with a heterogeneous CPU with vector units?

Personally I'm betting on DX12! What better system to process an image than a specialist image processor? I'm thinking that perhaps in the next-generation system, instead of just creating a skeleton, they can also create a vertex mesh to define all of the space taken up by the body.

I am starting to think that the Kinect latency is caused by their tapping into the GPU for the processing. I remember Sebbi, I think, talking about the latency hit when calling on the GPU for compute, especially when it is already doing rendering.
 
Interesting, so by implication this could also indicate that the 10-15% system resource usage could be construed as 1% CPU and 9-14% GPU for Kinect. Or some other variable split - 1% CPU, 5% GPU and 5% memory, for instance.

I hadn't actually considered using the GPU to work out some of the Kinect functions but thinking about it now it would make a lot of sense.

Regards,
SB
In this US patent published today - they are almost certainly talking about Kinect - they explain and cover many aspects of the matter. Besides that, they go into detail about many other features in the rest of the patent's vast text.

[0046]When the multimedia console 100 is powered ON, a set amount of hardware resources are reserved for system use by the multimedia console operating system. These resources may include a reservation of memory (e.g., 16 MB), CPU and GPU cycles (e.g., 5%), networking bandwidth (e.g., 8 kbs), etc. Because these resources are reserved at system boot time, the reserved resources do not exist from the application's view.

[0047]In particular, the memory reservation preferably is large enough to contain the launch kernel, concurrent system applications and drivers. The CPU reservation is preferably constant such that if the reserved CPU usage is not used by the system applications, an idle thread will consume any unused cycles.

[0048]With regard to the GPU reservation, lightweight messages generated by the system applications (e.g., popups) are displayed by using a GPU interrupt to schedule code to render popup into an overlay. The amount of memory required for an overlay depends on the overlay area size and the overlay preferably scales with screen resolution. Where a full user interface is used by the concurrent system application, it is preferable to use a resolution independent of application resolution. A scaler may be used to set this resolution such that the need to change frequency and cause a TV resynch is eliminated.

Full Patent:

http://appft.uspto.gov/netacgi/nph-...228.PGNR.&OS=dn/20100199228&RS=DN/20100199228
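Taking the patent's example figures at face value, the reservations in paragraph [0046] are easy to work out. This is a rough sketch only: the 16 MB and 5% figures come straight from the patent, while the 512 MB total is an assumption based on the 360's unified memory pool:

```python
# Example reservations from patent paragraph [0046].
RESERVED_MEMORY_MB = 16     # memory reserved at boot
RESERVED_CYCLES_PCT = 5     # CPU and GPU cycles reserved
TOTAL_MEMORY_MB = 512       # assumption: Xbox 360 unified memory

# "the reserved resources do not exist from the application's view":
app_visible_memory_mb = TOTAL_MEMORY_MB - RESERVED_MEMORY_MB
app_visible_cycles_pct = 100 - RESERVED_CYCLES_PCT
```

So under these example numbers a game would see 496 MB and 95% of CPU/GPU cycles, which lines up with the 10-15% "system resource" figures being debated earlier in the thread only if Kinect's cost comes on top of this baseline reservation.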
 
Interesting articles. Hopefully this will stop all the silly comparisons with the Eyetoy or PSEye, because this is so much more.

I must confess to being guilty - clearly there is a lot more tech (although I still think the price is way too high) - the problem is, until it grabs me and says 'see, I'm not a slightly better Eyetoy!', I'm still not 'getting it'.

It's not unreasonable. Most people will look at the games to see the differences. PS Move has a similar problem with Wiimote+.

EDIT:

I am starting to think that the Kinect latency is caused by their tapping into the GPU for the processing. I remember Sebbi, I think, talking about the latency hit when calling on the GPU for compute, especially when it is already doing rendering.

There was an earlier article that mentioned Kinect uses the GPU for processing (but didn't go into the details). So the Kinect shader now stays permanently on the GPU after boot-up?
 
There was an earlier article that mentioned Kinect uses the GPU for processing (but didn't go into the details). So the Kinect shader now stays permanently on the GPU after boot-up?

That's what seems to have been implied. Perhaps it is required, as you have the ability to signal Kinect at any point, including during a game. However, we'd have to really understand how Kinect wakes the Xbox 360 from sleep.
 
Kinect uses a relatively small amount of GPU time plus a percentage of CPU time from one core. Fairly sure it's accurate to say that more CPU than GPU is used.

Kinect latency comes from many different parts of the processing pipeline. Acquiring the depth map and beaming it over the USB port, for starters. There is a baseline lag that you can't really avoid.
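A rough sketch of that unavoidable baseline. The stream parameters here are assumptions for illustration (a 640x480 depth stream at 30Hz with 16-bit samples), not Kinect's confirmed internal format; the point is that capture time alone puts a floor under the lag before any skeleton processing starts:

```python
# Back-of-the-envelope numbers for the acquisition stage of the pipeline.
WIDTH, HEIGHT, FPS = 640, 480, 30   # assumed depth stream parameters
BYTES_PER_PIXEL = 2                 # assume 16-bit depth samples

frame_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL      # one depth frame
stream_mb_per_s = frame_bytes * FPS / 1e6           # sustained USB load
frame_time_ms = 1000 / FPS                          # minimum wait to even
                                                    # see a movement
```

That works out to roughly 18 MB/s of sustained depth data over the USB link, and a hard 33 ms floor from the 30Hz capture rate, before USB transfer, processing and display add their own frames of delay.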
 
Have you compared Kinect's lag to EyeToy's? EyeToy had some pretty hefty delay. It would be a worthwhile comparison for those who have already experienced EyeToy, to see how Kinect's lag may or may not be felt in the experience. EyeToy was clearly laggy, with the video feed trailing several frames, but in game it never felt like I missed punching a ninja due to delay.
 
It's the sort of testing we can do when I have a Kinect sensor - and we're a long way away from that! Also, cross-platform motion-control games will prove to be quite enlightening, I would say.

Of course, what would be interesting would be pad controls for Joy Ride. I suspect we won't be seeing that in a hurry!

There is clear latency in Kinect Adventures. In the Rallyball section, the balls fly out at you very quickly, making it easy to tell. You react in time but the Avatar doesn't. You have to adjust your own "personal" timing to compensate - same with the jumping in Kinect Sports, etc.

For the intended audience I doubt it will affect the quality of the experience. You quickly find out what works and automatically compensate.
 
It'll be interesting to see how they can mitigate lag in the next gen console. Obviously there is a significant amount of processing to be done, on top of the camera latency.

Everyone is concerned about the camera only being 30Hz, but maybe they can't process the data fast enough to take advantage of a 60Hz camera without completely crippling GPU performance for graphics or CPU performance for general-purpose work. Hopefully next gen they can use at least a 60Hz camera and process within one or two frames.
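As a quick illustration of the frame budgets being discussed (the frame counts here are assumptions for the sake of arithmetic, not measured figures for Kinect):

```python
# End-to-end lag if the pipeline spends N camera frames from capture
# to on-screen response.
def pipeline_lag_ms(fps, frames_of_processing):
    return 1000 / fps * frames_of_processing

# Assumed current situation: 30Hz camera, ~3 frames of pipeline.
lag_30hz = pipeline_lag_ms(30, 3)
# Hoped-for next gen: 60Hz camera, processing within 2 frames.
lag_60hz = pipeline_lag_ms(60, 2)
```

Under those assumptions the lag drops from about 100 ms to about 33 ms, which is why both the faster camera and the shorter pipeline matter.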
 
One of the patents is for the detection of sign languages.

Along with every other way you could possibly wave your hands at a camera.

They just need to make it so I can set 'middle finger' shuts down game and returns to dashboard and I'm sold.
 
Along with every other way you could possibly wave your hands at a camera.

They just need to make it so I can set 'middle finger' shuts down game and returns to dashboard and I'm sold.
Dunno about that, I imagine if the game exited every time I got annoyed at it, I'd be even more annoyed... :)
 
Heard of the Amazon Mechanical Turk program? Neither have I, but evidently Kotaku believes Microsoft is using that service as a way to help solve the problem of using Kinect while on the couch...

http://kotaku.com/5605936/is-this-how-microsoft-will-fix-kinects-couch-problem

Kotaku said:
Kotaku reader Charonchan pointed us to a series of Mechanical Turk jobs - HITs, or Human Intelligence Tasks - that appear to be Kinect related. Users are tasked with looking at an image, seeing if there is an identifiable human head in the shot, then tagging the head, shoulders, elbows and hands with a simple skeletal frame.

Many of the images have users seated on couches or near tables, chairs and Guitar Hero drum controllers. Those images are animated - helpful for picking out details in these low-quality, monochromatic shots - and they look like this.


The images are full of variety, filled with sofas, lamps, ottomans, coffee tables, big people, little people, dogs and all kinds of distractions that might confuse Kinect's infrared projector and depth sensor. They're available on Amazon Mechanical Turk for the studying and tagging until next week.

While the HIT listing doesn't specifically mention that this is related to Kinect or Xbox 360, the job requestor links back to the "Upper Body Image Tagger" on Microsoft's Windows.Net site.

500x_mturk_couch.jpg


Here's the Microsoft site that explains to workers how they are to accomplish their HITs.

http://imagetagger.blob.core.window...images_upper_body/examples_upperbody_v01.html

Weird! LOL

Tommy McClain
 
So how does MS take this data and generate generic solutions? They can't do a complete image search of what they have versus the database of samples every frame! Also, I notice the back of that couch is glowing, placing its depth completely wrong. This is something TOF cameras shouldn't have. Next gen should be a lot better with its depth detection!
 
@SG its range is within a certain range, hence the whitish samples.
@graham I assume you mean the black stuff; well, for me that says a null reading.

Good to see MS try to integrate learning into the algorithm (but a bit late in the game for a product that's going to launch in 4 months - Zed's prediction: we're going to see a lot of complaints about it not working as promised, 5:1 odds).

Also very good to see them doing what I have been proposing for years (though in completely different fields of AI - build off human abilities). Wow, I'm actually very surprised to see them do this; brilliant stuff.
 
So how does MS take this data and generate generic solutions? They can't do a complete image search of what they have versus the database of samples every frame! Also, I notice the back of that couch is glowing, placing its depth completely wrong. This is something TOF cameras shouldn't have. Next gen should be a lot better with its depth detection!

It doesn't need to be a complete image search, but I guess the structure that holds all that data in a search-friendly way is the software magic they did. Also, I believe they rely on the fact that a person's pose doesn't dramatically change in a few ms, so once you know the current pose, you can be sure that the next one will be similar to it, and so on.
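That temporal-coherence idea can be sketched as a pruned nearest-neighbour search. Everything below is invented for illustration - poses reduced to 2-tuples of joint angles, an arbitrary pruning radius - and is not Microsoft's actual algorithm, just the principle of searching near the previous frame's answer:

```python
# Find the database pose closest to the observed one (squared distance).
def nearest_pose(observation, candidates):
    return min(candidates,
               key=lambda p: sum((a - b) ** 2 for a, b in zip(p, observation)))

# Only consider poses whose joints stay within `radius` of the previous
# frame's pose; fall back to the full database if nothing qualifies.
def track(observation, database, previous, radius=20):
    nearby = [p for p in database
              if all(abs(a - b) <= radius for a, b in zip(p, previous))]
    return nearest_pose(observation, nearby or database)

database = [(0, 0), (10, 5), (90, 80), (12, 7)]  # toy pose database
previous = (11, 6)                               # last frame's result
matched = track((13, 6), database, previous)     # -> (12, 7)
```

The pruning step is what keeps the per-frame cost from scaling with the whole sample database, which was the objection raised above.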
 
@SG its range is within a certain range, hence the whitish samples.
@graham I assume you mean the black stuff; well, for me that says a null reading.

Good to see MS try to integrate learning into the algorithm (but a bit late in the game for a product that's going to launch in 4 months - Zed's prediction: we're going to see a lot of complaints about it not working as promised, 5:1 odds).

Also very good to see them doing what I have been proposing for years (though in completely different fields of AI - build off human abilities). Wow, I'm actually very surprised to see them do this; brilliant stuff.

Microsoft has been using machine learning to teach Kinect how to see from the very beginning... Early this year at CES they had a video showing how they spent pretty much the whole of last year teaching this device to recognize humans XD
 
http://www.eurogamer.net/articles/digitalfoundry-the-case-for-kinect-article

I've read this article on Eurogamer, which clearly states that Kinect isn't able to track finger movements - let's say, in a Minority Report style. I think that's true, given the low resolution of the camera, obviously.

However, the patent implies that Kinect will be able to read and detect sign languages, and that additional features can be identified, such as reading your lips, tracking finger or even toe movements, and individual features of the face - nose and eyes, for instance.

Does that mean that either the article or the patent is wrong? I wonder if Kinect can recognize all those things using the 640x480-pixel colour camera, which does exist, although it isn't the depth sensor. :?:
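Some rough numbers on why finger tracking is hard at that resolution. The 57-degree figure is Kinect's published horizontal field of view; the viewing distance and finger width are assumptions chosen for illustration:

```python
import math

FOV_DEG, PIXELS = 57, 640   # Kinect's horizontal FOV, colour camera width
distance_m = 2.0            # assumed couch distance
finger_width_m = 0.015      # assumed finger width, ~15 mm

# Width of the scene the camera sees at that distance.
view_width_m = 2 * distance_m * math.tan(math.radians(FOV_DEG / 2))
pixels_per_m = PIXELS / view_width_m
finger_pixels = finger_width_m * pixels_per_m
```

Under these assumptions a finger spans only about four or five pixels, which would explain how a patent can claim finger and sign detection in principle while the Eurogamer article is right that reliable Minority Report-style tracking isn't practical at this resolution and range.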

The patent features this photo which shows how it works:

20100806cat.jpg


It's the sign for the word cat:

http://www.signingsavvy.com/sign/cat
 