Project Natal: MS Full Body 3D Motion Detection

We were right: the thing processes data before sending information to the 360, so lag may vary according to the load it is handling. Being too jerky may put extra stress on the system; as people become more confident with it, that stress, while still there, may come down a bit. This would explain why Ricochet showed some lag (not deal-breaking, by every report) while the paint guy and Burnout don't seem to suffer consistently from the issue; maybe too many random movements.
Anyway, even given extra resources, there will be limits to the system: tracking four people's voices and their movement while doing some 2D shape recognition will induce some lag, in the same way that too much load induces a frame-rate penalty.
 
Worth pointing out that I've spoken to non-MS affiliated developers who've been able to test Natal at length in their own offices and they're as breathlessly enthusiastic about it as the E3 reporters.

Now to get one of them on the record...

You do know that there were devs excited about the Wiimote, right? But yes, please get one on the record.
 
It's a fancy, cool, awesome device, but essentially you can treat it as free from the platform's perspective, because all of the magic - all of the processing - happens sensor-side.

That would only be true if all it handed back were gestures, which would be very limiting, so I don't think they're doing that at all. Presumably it hands back a continual physical position and velocity representation of the player, which would still need some pretty heavy processing per game to determine what exactly the player was doing and to map that onto the specific game requirements (kicking a ball, waving a bat, dancing, who knows what).

Anyone who's done recognition of an analogue stick or touchscreen (e.g. casting spells based on a particular shape moved/drawn) will know how much work it is to recognise these while eliminating false positives and lag. Doing so for an entire 3D representation of a human body would be an order of magnitude harder, and there's no way the sensor can do everything that every game would want to do.
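To give a feel for how much work that is, here's a toy sketch of 2D stroke matching (everything here is invented for illustration, and it assumes the strokes were already resampled to the same number of points):

```python
import math

def normalise(points):
    # Translate to the centroid and scale to unit size so position/size don't matter.
    cx = sum(p[0] for p in points) / len(points)
    cy = sum(p[1] for p in points) / len(points)
    pts = [(x - cx, y - cy) for x, y in points]
    scale = max(math.hypot(x, y) for x, y in pts) or 1.0
    return [(x / scale, y / scale) for x, y in pts]

def match_score(stroke, template):
    # Mean point-to-point distance between two equal-length normalised paths.
    a, b = normalise(stroke), normalise(template)
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)

def recognise(stroke, templates, threshold=0.25):
    # Return the best-matching template name, or None. The threshold is what
    # rejects near-misses; tuning it against false positives is most of the work.
    best = min(templates, key=lambda name: match_score(stroke, templates[name]))
    return best if match_score(stroke, templates[best]) < threshold else None
```

Now scale that idea up to 48 joints moving in 3D, per game, per frame.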

So I guess they're right, they process all the signals into a form games can use, but the games will have to process that into what is useful for each and every game.
 
By the way, the thing sold more than 50 million units in three years, and is still selling despite its weaknesses.

Sorry, I don't play sales numbers. (Are we assuming then that developers are enthusiastic just because the Natal may print money?)
 
...
So I guess they're right, they process all the signals into a form games can use, but the games will have to process that into what is useful for each and every game.
I concur. If all they hand back is skeleton information, the applications are limited by the device's hardware/firmware. How would you add prop support if it's not handled on the Natal box? Then again, perhaps the only way to get the speed is to sacrifice flexibility, and they made a conscious decision to focus on human interfacing when designing the processor?

Is there any chance we'll ever know? NDAs are going to lock us out of the facts :(
 
I don't think they are focusing only on the human skeleton; if the ads are any indication, the thing can scan objects. As for flexibility, I think the hardware has to be flexible enough to handle 3D data and sound data, which is why I think it can deal with 2D data as well.
I would put my bet on a really cheap ARM/MIPS CPU driving a multi-purpose DSP, kind of a PPU/SPU interaction.
 
I don't think they are focusing only on the human skeleton; if the ads are any indication, the thing can scan objects. As for flexibility, I think the hardware has to be flexible enough to handle 3D data and sound data, which is why I think it can deal with 2D data as well.
I would put my bet on a really cheap ARM/MIPS CPU driving a multi-purpose DSP, kind of a PPU/SPU interaction.
What does it send back to the XB360 then?
 
What does it send back to the XB360 then?
Well, from my really limited knowledge, I would say that the "thing" has to be able to send various types of data to the 360: vertex/3D information, texture/2D images. But we can't say for sure, from what we know, whether it's able to send the same kind of commands that the standard controller would. The Burnout demo could lead us to think that it can, but there are too many unknown factors:
To what extent has the game been modified?
What was the laptop connected to the prototype doing?
Was the demoed 360 running custom firmware?
etc.
The point is that, to come remotely close to the expectations the thing has raised, MS must have the device able to send various kinds of information (talk about a conclusive sentence :LOL:).
I would bet that the device sends mostly pretty "high-level" information, as USB2 doesn't allow for that much data per second: at 30 FPS, and assuming peak figures, the USB2 transfer rate gives up to 2 MB per frame.
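A quick back-of-the-envelope check on that figure (using the peak signalling rate; real-world bulk throughput is lower):

```python
USB2_PEAK_BITS_PER_S = 480_000_000   # USB 2.0 high-speed peak signalling rate
FPS = 30

bytes_per_frame = USB2_PEAK_BITS_PER_S / 8 / FPS
print(f"{bytes_per_frame / 1e6:.0f} MB per frame")   # -> 2 MB (less in practice)
```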
 
I would guess that the Natal chips merely construct the 3D scene (the 48 skeletal points per body in a 3D space) and bundle that, along with the RGB imagery, off to the 360. The 360 CPU from there would be tasked with detecting and playing with the motion of those 48 skeletal points.
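If that guess is right, the skeleton stream itself would be tiny next to the imagery. A hypothetical per-frame payload, with all sizes assumed rather than taken from any spec:

```python
import struct

NUM_JOINTS = 48
JOINT_FMT = "<3f"                                 # x, y, z as 32-bit floats
skeleton_bytes = NUM_JOINTS * struct.calcsize(JOINT_FMT)
rgb_bytes = 640 * 480 * 3                         # assumed VGA colour image
print(skeleton_bytes, rgb_bytes)                  # 576 vs 921600: imagery dominates
```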
 
What does it send back to the XB360 then?

Maybe there's a certain degree of programmability. The device may have a laundry list of variables you can configure before use. How many and which data points you want to track on each skeletal system, for example. So, if the camera is tracking a person, maybe you only care about the upper torso and the legs/head are not important. Or maybe there are x number of data points in the spine, but you only want to track a limited number of those points. There could be things like that they've done to reduce the workload. I'm sure it has its own working memory and there will be a certain budget you have to adhere to. There may be routines/modes to identify non-skeletal shapes.

I'm sure there are defined objects for skeletal bodies that are returned from the system, as well as voice, video etc.
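Purely speculative, but the kind of configuration imagined above might look something like this (none of these names come from a real API):

```python
from dataclasses import dataclass

@dataclass
class TrackingConfig:
    # Which joints to track; fewer joints = a smaller working-memory budget.
    joints: frozenset = frozenset({"head", "shoulders", "elbows", "hands",
                                   "spine", "hips", "knees", "feet"})
    max_players: int = 2
    track_voice: bool = True
    non_skeletal_shapes: bool = False   # e.g. props or scanned objects

# A game that only cares about the upper body could trim its budget like this:
upper_body = TrackingConfig(
    joints=frozenset({"head", "shoulders", "elbows", "hands"}),
    max_players=1,
    track_voice=False,
)
```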
 
I would guess that the Natal chips merely construct the 3D scene (the 48 skeletal points per body in a 3D space) and bundle that, along with the RGB imagery, off to the 360. The 360 CPU from there would be tasked with detecting and playing with the motion of those 48 skeletal points.

But unless there are modes, you're restricting Natal to detecting whole bodies -- any hope of finger detection is gone if that 48-points thing is real.
 
Eurogamer interview with project director:

Hmmm...something that's bugging me...

Alex Kipman: Essentially we do a 3D body scan of you. We graph 48 joints in your body and then those 48 joints are tracked in real-time, at 30 frames per second. So several for your head, shoulders, elbows, hands, feet...

48 Joints. Not just 48 points of interest, but 48 points where there would be the potential for the skeleton to bend or move independently of other points.

I can't come up with 48 points without resorting to finger joints (14 each hand for 28 total), shoulders-elbows-knees-hips-ankles-neck-torso (12), maybe balls of the feet for (2). So that's 42. And another (6) for facial features? Corners of lips, eyes, etc?
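Tallying those guesses up:

```python
fingers = 14 * 2     # 14 finger joints per hand
limbs   = 12         # shoulders, elbows, knees, hips, ankles, neck, torso
feet    = 2          # balls of the feet
face    = 6          # corners of lips, eyes, etc.
print(fingers + limbs + feet + face)   # 48
```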

Can't be right, though; I can't imagine it tracking all 3 joints in your index finger, for example.

I dunno, is there anyone around who has experience with more traditional mo-cap who could perhaps better understand what "joints" they would be tracking in order to form a skeleton to track in 3 dimensions?

You obviously wouldn't need more than 3 points for an arm, which would be shoulder-elbow-wrist, for example.

Regards,
SB
 
That would only be true if all it handed back were gestures, which would be very limiting, so I don't think they're doing that at all. Presumably it hands back a continual physical position and velocity representation of the player, which would still need some pretty heavy processing per game to determine what exactly the player was doing and to map that onto the specific game requirements (kicking a ball, waving a bat, dancing, who knows what).

Anyone who's done recognition of an analogue stick or touchscreen (e.g. casting spells based on a particular shape moved/drawn) will know how much work it is to recognise these while eliminating false positives and lag. Doing so for an entire 3D representation of a human body would be an order of magnitude harder, and there's no way the sensor can do everything that every game would want to do.

So I guess they're right, they process all the signals into a form games can use, but the games will have to process that into what is useful for each and every game.

Yes, but it's already removing much of the processing that would have to be done system-side to achieve the same effect with, say, EyeToy + PS3 wand.

So, as with any system, the software still has to do work to actually use the data, but the controller does all the processing of the 3D space and then just presents the data that the software would be interested in.

And yes, it'll be a larger stream of data than, say, a controller, which has at most 12 buttons that could be pressed simultaneously (not sure why you'd want to), 2 analogue sticks used simultaneously, and a D-pad that could be used alongside any of those, all being processed at 60-120 fps.

Versus whatever (48 joints + text from voice recognition) at 30 fps.

Granted, a button is only an on/off switch, and the analogue sticks only have X/Y coordinates, while each of the 48 joints would potentially have X/Y/Z position plus velocity/acceleration data.
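Roughing out the raw numbers, with both layouts assumed purely for scale:

```python
# A pad report: 12 buttons packed into 2 bytes, 2 sticks as 16-bit X/Y pairs,
# 1 byte for the D-pad; polled at 120 Hz.
pad_bytes_per_report = 2 + 2 * 2 * 2 + 1
pad_rate = pad_bytes_per_report * 120              # ~1.3 KB/s

# 48 joints, each with X/Y/Z position and velocity as 32-bit floats, at 30 fps.
natal_bytes_per_frame = 48 * (3 + 3) * 4
natal_rate = natal_bytes_per_frame * 30            # ~34.6 KB/s
print(pad_rate, natal_rate)
```

So a bigger stream, but both are trivial next to the camera imagery itself.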

Regards,
SB
 
Hmmm...something that's bugging me...



48 Joints. Not just 48 points of interest, but 48 points where there would be the potential for the skeleton to bend or move independently of other points.

I can't come up with 48 points without resorting to finger joints (14 each hand for 28 total), shoulders-elbows-knees-hips-ankles-neck-torso (12), maybe balls of the feet for (2). So that's 42. And another (6) for facial features? Corners of lips, eyes, etc?

Can't be right, though; I can't imagine it tracking all 3 joints in your index finger, for example.

I dunno, is there anyone around who has experience with more traditional mo-cap who could perhaps better understand what "joints" they would be tracking in order to form a skeleton to track in 3 dimensions?

You obviously wouldn't need more than 3 points for an arm, which would be shoulder-elbow-wrist, for example.

Regards,
SB


The back/neck probably have several joints.

From Kotaku:

Theoretically, I got that wrong too, Tsunoda told me, though he didn't have a way to prove it to me there. The stick-figure skeletons that Natal recognize us as did not have fingers. Each one had a short stick for each hand. I saw no fingers, so I assumed it could not see my fingers. There seemed to be no way for Natal to know, say, how many fingers I was holding up. If it could, then it could maybe read hand signs issued to squadmates in military first-person shooters. I questioned Natal's ability to detect those finer movements. Tsunoda said that such detection was possible, though the sensitivity would be different at different distances. He thought my fingers idea was do-able.

So currently no finger-tracking, but '[Tsunoda] thought my fingers idea was do-able'.
 
The back/neck probably have several joints.

From Kotaku:



So currently no finger-tracking, but '[Tsunoda] thought my fingers idea was do-able'.


The depth accuracy of the 3D camera is centimetric, not millimetric. You can't really isolate and track a finger or parts of the face with that accuracy, I think.
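A rough feasibility estimate backs that up (field of view and depth-map resolution are assumed figures, not published specs):

```python
import math

h_fov_deg = 57        # assumed horizontal field of view
depth_px  = 320       # assumed depth-map width in pixels
dist_m    = 2.5       # typical living-room play distance

# Width of the scene the camera sees at that distance, then metres per pixel.
view_width_m = 2 * dist_m * math.tan(math.radians(h_fov_deg / 2))
cm_per_px = 100 * view_width_m / depth_px
print(f"{cm_per_px:.2f} cm per pixel")   # ~0.85 cm: a finger is only ~2 pixels wide
```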
 
OK, I've watched the whole video of the presentation.
Natal does break down/bug out numerous times during it (*).
Now, there's a part where the guy (who's wearing a bright orange shirt with sunglasses, for some reason) uses his right hand to scroll through the dashboard, a part where you don't want it to crap out as it's a definite action, unlike, say, the breakout part.

Looks like there's something attached to his shirt at that right wrist (perhaps it's the insides of a Wii waggle wand :) ). Busted!
[image: vlcsnap-30495.jpg]


(*) Which is actually good, as it shows it wasn't pre-recorded; but OTOH it shows it's nowhere close to production.

Edit: a sleeve? Only on one arm :)
[image: vlcsnap-217401.jpg]
 
That picture is so low-res I can't make out anything. I tried to find an HD source on GameTrailers, but didn't see one.
 