> Why would the GPU work on the Kinect depth buffer? AFAICT they have offloaded all the Kinect chores to dedicated silicon.

I suspect they haven't, exactly:
1) Take the very high-resolution IR image from the camera sensor and convert it into a lower-resolution depth image. This previously occurred on the PrimeSense chip in Kinect 1.
2) Take that depth image, locate 'points of interest' (shapes, edges, whatever), and turn those into skeletal models.
3) From that, work out the motion since the previous frame and attempt to match that motion to 'commands' (rough sketch below).
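Roughly, in Python-flavoured pseudocode (every function name here is made up for illustration, not the actual Kinect SDK):

    import numpy as np

    def ir_to_depth(ir_frame):
        # Step 1: high-res IR image -> lower-res depth map. The real
        # hardware decodes time-of-flight data; naive downsampling is
        # just a stand-in here.
        return ir_frame[::2, ::2].astype(np.float32)

    def extract_skeleton(depth):
        # Step 2: find points of interest in the depth map and fit a
        # skeletal model. Stub: returns an empty list of joints.
        return []

    def match_gesture(prev_skeleton, skeleton):
        # Step 3: diff against the previous frame's skeleton and map
        # the motion onto a command, or None if nothing matches.
        return None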
Step 1, along with aligning the video feed to the depth buffer, any compression, and generation of an IR stream, occurs inside the Kinect 2.
The rest of it, given the resources required (and the flexibility of changing code), seems more likely to run inside the Xbox One.
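Per frame, the split I'm imagining looks like this (reusing the stubs above; the placement comments are my guess, not anything documented):

    import numpy as np

    def process_frame(ir_frame, prev_skeleton):
        depth = ir_to_depth(ir_frame)       # step 1: Kinect 2 silicon
        skeleton = extract_skeleton(depth)  # step 2: Xbox One
        command = match_gesture(prev_skeleton, skeleton)  # step 3: Xbox One
        return skeleton, command

    # e.g. one synthetic 16-bit IR frame:
    skeleton, command = process_frame(np.zeros((1024, 1024), np.uint16), [])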