They must have used motion capture to make the facial expressions and body movement so natural. Usually it falls apart when they start talking, but this looks so real.
Unity recently acquired Ziva Dynamics, and I suspect their tech is what they used here.
Unity Acquires Ziva Dynamics for Realistic Humans (gamerant.com)