Transporting yourself into a video game, body and all, just got easier. Artificial intelligence has been used to create 3D models of people’s bodies for virtual reality avatars, surveillance, visualizing fashion, or movies. But it typically requires special camera equipment to detect depth or to view someone from multiple angles. A new algorithm creates 3D models using standard video footage from one angle.
|What once cost millions in Hollywood special effects. YouTube screenshot: The Daily Wire|
The system has three stages. First, it analyzes a few seconds of video of someone moving—preferably turning 360° to show all sides—and for each frame extracts a silhouette separating the person from the background. Using machine learning techniques—in which computers learn a task from many examples—it then roughly estimates the person's 3D body shape and joint locations.
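The per-frame silhouette step can be pictured as foreground masking. Here is a minimal sketch assuming a known static background and simple pixel differencing—a toy stand-in, since the actual system uses a learned segmentation rather than thresholding:

```python
import numpy as np

def silhouette(frame, background, thresh=30):
    """Toy foreground mask: flag pixels that differ from a known
    background by more than `thresh` in any color channel.
    (Hypothetical simplification of the paper's learned segmentation.)"""
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    return diff.max(axis=-1) > thresh  # boolean H x W mask

# toy 4x4 "video frame": the person occupies the center 2x2 block
background = np.zeros((4, 4, 3), dtype=np.uint8)
frame = background.copy()
frame[1:3, 1:3] = 200                  # bright foreground pixels
mask = silhouette(frame, background)
print(mask.sum())                      # -> 4 foreground pixels
```

In practice each frame's mask constrains where the 3D body surface may project, so even a rough mask per frame adds up to a strong shape constraint over a few seconds of video.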
In the second stage, it “unposes” the virtual human created from each frame, making them all stand with arms out in a T shape, and combines information from all the T-posed copies into a single, more accurate model. Finally, in the third stage, it applies color and texture to the model based on recorded hair, clothing, and skin.
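The “unposing” idea can be illustrated with a one-joint toy: invert the rotation that posed each frame's vertices, then average the unposed copies so per-frame noise cancels out. This is a heavily simplified sketch—the real method inverts a full blend-skinning chain over the whole body, not a single rotation:

```python
import numpy as np

def rot_z(theta):
    """Rotation about the z axis by `theta` radians."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def unpose(posed, R):
    """Undo a pose: row vectors posed as v @ R.T are restored by
    multiplying with R (rotations are orthogonal, so R == (R.T)^-1)."""
    return posed @ R

canonical = np.array([[1.0, 0.0, 0.0]])   # arm pointing along +x (T pose)
unposed_frames = []
for theta in (0.3, 0.6):                  # two frames, arm rotated differently
    R = rot_z(theta)
    posed = canonical @ R.T               # what the camera "sees" each frame
    unposed_frames.append(unpose(posed, R))
fused = np.mean(unposed_frames, axis=0)   # average the T-posed copies
print(np.allclose(fused, canonical))      # -> True
```

Averaging in the canonical T pose is what lets evidence from every frame—front, side, and back views—contribute to one consistent body model.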
The researchers tested the method on a variety of body shapes, clothing, and backgrounds and found reconstructions accurate to within 5 millimeters on average, they will report in June at the Computer Vision and Pattern Recognition conference in Salt Lake City.
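A millimeter-scale accuracy figure like this is typically an average distance between corresponding points on the reconstructed and ground-truth surfaces. A simplified per-vertex version (the paper's exact evaluation protocol may differ):

```python
import numpy as np

def mean_vertex_error_mm(pred, gt):
    """Average Euclidean distance between corresponding vertices,
    converted from meters to millimeters. (Simplified stand-in for
    a full surface-to-surface evaluation.)"""
    return float(np.linalg.norm(pred - gt, axis=1).mean() * 1000.0)

# two ground-truth vertices (meters) and a slightly offset reconstruction
gt   = np.array([[0.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
pred = gt + np.array([[0.003, 0.0, 0.0], [0.0, 0.004, 0.0]])
print(mean_vertex_error_mm(pred, gt))   # -> 3.5
```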
The system can also reproduce the folding and wrinkles of fabric, but it struggles with skirts and long hair. With a model of you, the researchers can change your weight, clothing, and pose—and even make you perform a perfect pirouette. No practice necessary.
Video Based Reconstruction of 3D People Models. Thiemo Alldieck, Marcus Magnor, Weipeng Xu, Christian Theobalt, Gerard Pons-Moll. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2018, spotlight. arXiv:1803.04758
MonoPerfCap: Human Performance Capture from Monocular Video. Weipeng Xu, Avishek Chatterjee, Michael Zollhoefer, Helge Rhodin, Dushyant Mehta, Hans-Peter Seidel, Christian Theobalt. ACM Transactions on Graphics, presented at SIGGRAPH 2018, Vancouver, Canada
Optical Flow-based 3D Human Motion Estimation from Monocular Video. Thiemo Alldieck, Marc Kassubeck, Bastian Wandt, Bodo Rosenhahn, Marcus Magnor. Proc. German Conference on Pattern Recognition (GCPR) 2017, Springer
|The technology could put you in a video game without fancy equipment. Science Magazine. Published on YouTube Apr 13, 2018|
|Matthew Hutson is a freelance science journalist in New York City.|
Source: Science Magazine, American Association for the Advancement of Science (AAAS)
|MonoPerfCap: Human Performance Capture from Monocular Video – ACM TOG (presented at SIGGRAPH 2018). We present the first marker-less approach for temporally coherent 3D performance capture of a human with general clothing from monocular video. Our approach reconstructs articulated human skeleton motion as well as medium-scale non-rigid surface deformations in general scenes. Christian Theobalt. Published on YouTube May 16, 2018|
Human Performance Capture from Monocular Video, at the Max Planck Institute
Comprehensive Human Performance Capture from Monocular Video Footage, at TU Braunschweig
AI Can Now Create 3D Models From Just Seconds of Video, in The Daily Wire
What you see in a 3D scan of yourself could be upsetting, in The Conversation