Stereo Vision and Time of Flight setups for Face Scanning

Something I have noticed while looking at all the NBA 2K15 face scans & reading the comments is that most of the scans that came out right are from the PS4, & I've yet to see a good scan from the Xbox One. Maybe it's a software issue, but it seems that the bad scans from the PS4 camera are user error & after getting better instructions people are able to get good scans, but with Kinect no one has figured out how to make it work right.

If I had to guess, I would say it's because Kinect 2.0 isn't able to do 3D scanning at close range, so it causes problems when people try to hold the camera close to their face. There is also the fact that the PS4 camera's 3D is 1280×800, while Kinect 2.0's 3D is 512×424 and its RGB camera is 1920×1080, which might be harder to calculate than using 2 of the same cameras for depth and color.
 
Without seeing the results you're talking about, I can only provide some general insight. The 3D depth image resolution isn't a problem, because the face is vacuum-formed over the point cloud. Per-sample depth resolution is limited, but accumulating over multiple samples can be very accurate. We need only look at the incredible results achieved with the crusty methods of Kinect 1 as regards realtime scanning. Disparity between depth and video images is irrelevant. You'll crop the images and scale, mapping based on face-recognition tech.
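
As a rough sketch of that accumulation idea (a toy model with made-up noise figures, not Kinect's actual numbers):

```python
import numpy as np

# Toy model: each depth sample of one point on the face is noisy, but a
# running average over N static frames shrinks the error roughly by
# sqrt(N). The 5 mm per-frame noise figure is invented for illustration.
rng = np.random.default_rng(0)
true_depth = 850.0                                   # mm
frames = true_depth + rng.normal(0, 5.0, size=100)   # 100 noisy samples
running_mean = frames.cumsum() / np.arange(1, 101)

print(abs(frames[0] - true_depth))         # single-sample error, a few mm
print(abs(running_mean[-1] - true_depth))  # accumulated error, much smaller
```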

Creating a 3D depth map from stereo is a lot harder and prone to errors. It'll no doubt work in the same way, creating a volume and shrink-wrapping the head onto it. So I'd be inclined to believe that it's the libraries giving poor results on XB1, if they are worse, or possibly not a best-case use of the tech. Does the user have to move forwards and backwards from the camera, or move it around their head?
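
To illustrate why stereo is the harder problem, here's a toy 1D block matcher (sum of absolute differences, all data synthetic). On a textured strip it recovers the true shift; on a flat, evenly lit strip, like a front-lit face, every candidate shift matches about equally well:

```python
import numpy as np

rng = np.random.default_rng(1)
textured = rng.integers(0, 255, 64).astype(float)  # busy surface
flat = np.full(64, 128.0)                          # featureless surface

def best_disparity(left, right, x=32, win=3, max_d=10):
    # Pick the shift whose patch in the right image best matches
    # the patch around x in the left image (minimum SAD cost).
    patch = left[x - win:x + win + 1]
    costs = [np.abs(patch - right[x - d - win:x - d + win + 1]).sum()
             for d in range(max_d)]
    return int(np.argmin(costs))

true_d = 4
for name, strip in (("textured", textured), ("flat", flat)):
    # Second eye's view: same scene shifted by true_d, plus sensor noise.
    right = np.roll(strip, -true_d) + rng.normal(0, 5.0, 64)
    print(name, "->", best_disparity(strip, right))
# 'textured' reliably finds 4; 'flat' is a coin toss among candidates.
```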
 

So if it's not the camera hardware, could it be the more flexible compute setup, where they are able to break the job down into 64 tasks vs 16?
 
Nope. The difference in task count is for scheduling and improves GPU use. Worst case, it takes longer to build up the model on one machine than the other. That's true of the input as well. A machine lacking accuracy in a single sample can spend longer accumulating samples to get the same results. Have you got a link to the results you're talking about?
 

It's spread out all over the internet on lots of different forums. I've just been picking through them the last few days and noticed the trend.


http://www.operationsports.com/foru...765734-2k15-post-your-facescan-thread-70.html



But I have just found someone who has a solution to the Xbox One problem, and he says that the problem is that the Kinect adjusts to the lighting, so when people try to add more light it only makes things worse.


 
Someone has now done a test with the same setup for the Xbox One Kinect & the PS4 camera:


The results are in. Xbox vs PS4 face scan test. (pics inside)

The results are in from my experiment with scans on Xbox and PS4, using the same environment and lighting. The results are as follows:

Final results: Xbox One

This first pic is one of my 50 attempts on the Xbox One, all with the same results no matter the lighting, background or distance my face was from the camera. As you can see, I'm getting double eyes, nose and mouth.
20141009_162903.jpg

Final results: PS4

This pic is from my PS4, with the same lighting and the same room as my attempts on the Xbox One. As you can see, the picture is flawless with respect to face alignment. I could have tried a lot harder on lighting, but that wasn't my main purpose in this experiment.
20141009_185225.jpg

My conclusions and final verdict:

Well, it's simple to tell from the two pictures, and sad to admit, but the PS4's camera and/or software is far more reliable and superior to whatever is going on with the Xbox One and Kinect. I cannot begin to tell you how many times I've scanned my face on the Kinect with the result coming out as shown, every time, no matter the lighting situation and/or placement of the camera. The PS4 scan was accomplished on only my second scan.

I have concluded that the problem with the Kinect lies in the green square in which you must place your face. Upon booting up the PS4 scan, I immediately noticed that the green box is reasonably larger than that of the Xbox One. Therein lies the major problem for most. With the Kinect I was unable to get any closer than about 18-20 inches away before the Kinect would lose track of my face. With the PS4 camera I was able to get about 8-10 inches away from the camera without it losing track of my face. The first time I scanned on the PS4 I had the camera about 18 inches away, as I did with the Kinect, and the results came out similar to my Xbox results, with the double orifices. I then scanned a second time with the PS4 camera about 8 inches away, and the results are as you see... near perfect.

So to conclude, the problem lies within the Kinect's software for most people, but not all, which is strange. In my opinion, Xbox needs to issue some sort of update to enlarge the size of the green box and/or somehow allow for better face detection at closer range.

Spread the word, fellas, and hopefully enough important people will hear or see this post and make the necessary changes for Xbox. Because I for one can't stomach having to play on the PS4 for much longer. lol, just kidding, PS fanboys
 
As you can see, I'm getting double eyes, nose and mouth.
Sounds like a simple bug to me. The image scaling and alignment isn't matching the 3D, which is probably a hard-coded factor someone forgot to update in the final release. If the underlying 3D geometry is good, the 3D camera is doing okay.
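
A sketch of the kind of bug that could do it (hypothetical numbers; this isn't the game's actual code): registering the 512×424 depth image to the 1920×1080 colour image with one scale factor per axis. A stale hard-coded factor offsets every sampled texel, so blended samples put features in two places:

```python
DEPTH_W, DEPTH_H = 512, 424     # Kinect 2 depth resolution
COLOR_W, COLOR_H = 1920, 1080   # Kinect 2 RGB resolution

def depth_to_color(u, v, sx, sy):
    # Map a depth-pixel coordinate to a colour-pixel coordinate.
    # Real registration also needs the physical offset between lenses.
    return (u * sx, v * sy)

good = depth_to_color(256, 212, COLOR_W / DEPTH_W, COLOR_H / DEPTH_H)
bad = depth_to_color(256, 212, 3.0, 3.0)   # stale hard-coded factor
print(good)  # (960.0, 540.0) -- centre of face lands on centre of photo
print(bad)   # (768.0, 636.0) -- every feature sampled ~200 px off
```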
 

That's the thing: for most people the 3D geometry isn't good. They are getting all types of crazy faces & I think it's being caused by the depth camera going out of whack when people move their head. If you're too close, I guess it causes a problem with the IR reflecting back into the camera, & if you're too far, I guess it's harder for the RGB camera to get a good color scan of your face to overlay on the depth map. Some people are getting the hang of it with Kinect now, but others are still getting really bad results.

Stuff like this is happening.

gvupzmh3rjnugomajriz.jpg
 
Oops...

Integration of multiple samples is out of whack. Basically 'motion blur' on the 3D data. Just a bug and not a limitation of the system.
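
A toy illustration of that failure mode (an assumed mechanism, not the game's confirmed code): average two depth profiles of the same nose after the head has moved, without re-aligning them, and you get two ridges, the 'double nose':

```python
import numpy as np

x = np.arange(40)
nose = np.exp(-((x - 20) ** 2) / 8.0)        # frame 1: nose ridge at x=20
nose_moved = np.exp(-((x - 26) ** 2) / 8.0)  # frame 2: head moved 6 px
ghosted = (nose + nose_moved) / 2.0          # naive accumulation

print(x[ghosted > 0.4])  # [19 20 21 25 26 27] -- two bumps, not one
```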
 

I think it's just due to the way that Kinect 2.0 gets its depth map from the reflections of the IR array, so scanning moving objects sometimes causes errors in the way the IR reflects.


I guess you can say it boils down to user error for both Kinect & PS4 camera scans, but with Kinect there are more chances of error, because not only do you have to worry about getting a good RGB color scan, you also have to worry about getting a good depth scan without causing problems with the IR reflections.
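
For reference, the continuous-wave ToF principle in a few lines (the generic textbook version, with an assumed 80 MHz modulation frequency, not Kinect 2's exact pipeline). Indirect IR bounces, i.e. multipath, add phase, so the affected pixel reads deeper than it really is:

```python
import math

C = 299_792_458.0   # speed of light, m/s
F_MOD = 80e6        # assumed modulation frequency, Hz

def tof_depth(phase_rad):
    # Depth from the phase shift of the returned modulated IR signal.
    return (phase_rad / (2 * math.pi)) * C / (2 * F_MOD)

direct = tof_depth(math.pi / 2)            # clean return, ~0.47 m
multipath = tof_depth(math.pi / 2 + 0.3)   # extra bounce, reads ~0.56 m
print(direct, multipath)
```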
 
The videos showing Kinect's depth map suggest otherwise. Note the clean gradients on the head in the depth feed. It only suffers with dark surfaces absorbing IR, which isn't a problem for face scanning. PS4's (or any) stereo camera depth is very noisy and error-prone by comparison.

TBH it's pretty remarkable that they can get good results from stereo cameras! I'm curious how much the shape of the head is actually affected, or if on PS4 it's mostly just matching the face to the prebuilt model. Via background removal, stereo cameras will help extract profile and general shape, but I'd expect it to miss a lot of precision in fleshy tissue and bone structure.
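
Some back-of-envelope numbers on why stereo depth is noisy (standard pinhole relation Z = f·B/d; the focal length and baseline below are assumptions, not the PS4 camera's published specs):

```python
f_px = 700.0      # assumed focal length in pixels
baseline = 0.08   # assumed ~8 cm between the two lenses, metres

def depth_from_disparity(d_px):
    return f_px * baseline / d_px   # Z = f * B / d

# At ~1 m the disparity is ~56 px; one pixel of matching error (easy to
# hit on a flatly lit face) moves the estimated depth by nearly 2 cm.
print(depth_from_disparity(56.0))   # 1.000 m
print(depth_from_disparity(57.0))   # ~0.982 m
```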
 

I think you need to watch that video again.

Edit: the holes forming in the depth map & in the IR video are exactly what's causing the problems. You're only looking at the outline, but faces need detail.
 
My watching it again won't help with your comprehension of it.
I'm talking about the holes that form when he moves around, in both the depth map feed & the active IR feed.

29mpp1w.jpg


2hqfqyv.jpg



I'm sure Microsoft has a better way of getting depth & color feeds to use in a face scan, but chances are this is what 3rd party devs have to work with.
 
Edit: the holes forming in the depth map & in the IR video are exactly what's causing the problems. You're only looking at the outline, but faces need detail.
I'm not looking at the outline but the surface detail, which is a smooth depth map. There also are no 'holes'. I'm guessing what you're calling a hole is the repeated greyscale. A transition from white to black isn't a hole in the feed but a limitation of the visual rasterisation of the 13-bit depth data. Instead of scaling it to a uniform gradient from front to back, where the whole image would be a foggy grey, a greyscale gradient is repeated in bands to make the depth more obvious.
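
A sketch of that banded visualisation (assuming the feed wraps millimetre depth into 8-bit grey, which is a guess at this particular tool, not a documented fact):

```python
import numpy as np

# Fake, smooth depth ramp in millimetres, like a head-to-wall gradient.
depth_mm = np.linspace(500, 2000, 1500).astype(np.uint16)
banded = (depth_mm % 256).astype(np.uint8)   # repeating 256 mm bands

# The grey value climbs smoothly, then snaps from 255 back to 0: that
# white-to-black seam looks like a 'hole' but the depth data is intact.
print(banded[264:272])   # climbs to 255, then wraps to 0 mid-gradient
```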

In the IR image, the 'holes' appear to be overexposure, easily adjusted for when imaging close. It was just a quick test grab by this guy and not an attempt at an ideal setup.

There's no problem with the source data or IR imaging. It is in fact far more accurate and less noisy than PS4's camera should be able to achieve. Trying to get a stereo disparity map from a front-lit, flatly shaded (ideal for albedo face texture) human face is not going to be easy and I'd anticipate a heck of a lot of noise. Look at this as an example of what stereo depth data looks like, derived from this source image. Or this one. Now that's holes! And that's with far stronger visual cues than a human face provides.

Kinect 2 is much better suited to this sort of job. NBA 2k15 is just buggy.
 
Stuff like this is happening.
Yes... from this...

1500824_7968409736919vxoy7.jpg


To this... there is a difference. Fugly would be a compliment in this case:

udx2w9M.jpg


BzW5Fp0CcAEd_b1.jpg


BzUgmkyIIAAKPv8.jpg


BzUmeW4IYAA6QTL.jpg


t7F6Het.jpg


Still, an awesome game. I have NBA 2k14 and it's one of the games I've played the most, if not the most for now, on the Xbox One.
 
Is there any real info about depth sensitivity on Kinect 2? Like 1 cm? Less? More? The thing is that ToF cams are usually expensive and Kinect 2 is really cheap compared to those. Is there an example of a close-up of the depth image on Kinect 2?
 
Quickly do lots of face captures before they fix it :D

It's like when I had a purple-haired Xbox Avatar. Then I stupidly changed the hair after MS fixed it :(
 
Is there any real info about depth sensitivity on Kinect 2? Like 1 cm? Less? More? The thing is that ToF cams are usually expensive and Kinect 2 is really cheap compared to those. Is there an example of a close-up of the depth image on Kinect 2?

I have these screen caps from the Kinect Sports Rivals tech video.

Point Cloud:
Kinect%2B2%2Bdepth%2B5.png


Depth with RGB overlay:

Kinect%2B2%2Bdepth%2B7.png




Depth with IR overlay:

Kinect%2B2%2Bdepth%2B4.png
 