Microsoft's artificial intelligence research lab has developed a technique for training 3D models from 2D data: a system that learns from real-world images to infer the missing information when looking at a flat picture.
Microsoft believes the system consistently learns to generate better shapes than existing models when trained exclusively on 2D images. That could be useful to video game developers, e-commerce companies, animation studios, architectural firms, and many other businesses that lack the budget to create 3D models from scratch.
Some of the technical details discussed in the paper:
To do this, they used software that renders images from 3D data, training a generative model for 3D shapes that takes a random input vector (values representing the characteristics of the dataset) and generates a continuous voxel representation (values on a grid in 3D space) of the 3D object. The voxels are then fed into a non-differentiable rendering process, which limits them to discrete values before they are rendered with an off-the-shelf renderer.
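As a rough illustration of that pipeline, here is a minimal sketch in plain NumPy. The function names are my own, the "generator" is a hand-made stand-in for the neural network, and a toy silhouette projection stands in for the off-the-shelf renderer:

```python
import numpy as np

def generate_voxels(z, grid=16):
    # Hypothetical generator: maps a random latent vector to a
    # continuous occupancy grid in [0, 1] (a stand-in for the
    # network the paper actually trains).
    center = grid / 2 + z[:3] * 2          # latent vector shifts a blob
    xs, ys, zs = np.indices((grid, grid, grid))
    dist = np.sqrt((xs - center[0])**2 + (ys - center[1])**2
                   + (zs - center[2])**2)
    return 1.0 / (1.0 + np.exp(dist - grid / 4))   # soft sphere occupancy

def threshold(voxels, tau=0.5):
    # The non-differentiable step: continuous occupancies are
    # limited to discrete {0, 1} values before rendering.
    return (voxels > tau).astype(np.float32)

def render_silhouette(binary_voxels):
    # Toy stand-in for an off-the-shelf renderer: orthographic
    # projection of the occupied voxels onto a 2D image plane.
    return binary_voxels.max(axis=2)

z = np.random.default_rng(0).normal(size=8)   # random input vector
image = render_silhouette(threshold(generate_voxels(z)))
print(image.shape)                            # a flat 2D image
```

The interesting design problem, which this sketch sidesteps, is that the thresholding step has no gradient, so the real system needs a way to train the generator through it.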
The idea, put very simply, is to have a model capable of generating millions of three-dimensional objects and then checking which of them best matches the 2D image it received initially.
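That generate-and-compare idea can be caricatured as a brute-force search: produce many candidate shapes, render each one, and keep the shape whose projection is closest to the target image. The real system learns this through adversarial training rather than exhaustive search; everything below is an illustrative assumption of my own:

```python
import numpy as np

def project(voxels, tau=0.5):
    # Depth-summed occupancy: a crude 2D rendering of a voxel grid.
    return (voxels > tau).sum(axis=2).astype(float)

def best_match(target_image, candidates):
    # Score every candidate's rendering against the target 2D image
    # (mean squared error here, purely for illustration) and keep
    # the candidate that matches best.
    scores = [np.mean((project(c) - target_image) ** 2) for c in candidates]
    return int(np.argmin(scores))

rng = np.random.default_rng(1)
candidates = rng.random((100, 8, 8, 8))   # 100 random 3D "shapes"
target = project(candidates[42])          # pretend this is the photo
print(best_match(target, candidates))     # the candidate whose
                                          # projection matches the target
```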
Its approach takes advantage of the lighting and shading cues provided by the images, allowing it to extract more meaningful information per training sample and produce better results in those settings. It can also produce realistic samples when trained on datasets of natural images.
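To see why shading is such a rich cue, consider Lambert's cosine law: a pixel's brightness depends on the angle between the surface normal and the light, so the shading of a matte object directly encodes its 3D orientation. A small sketch of that idea (my own illustration, not code from the paper):

```python
import numpy as np

# Lambert's cosine law: brightness = max(0, n . l), so shading alone
# reveals the 3D slope of a matte surface at every pixel.
light = np.array([0.0, 0.0, 1.0])          # light toward the camera
ys, xs = np.mgrid[-1:1:64j, -1:1:64j]
inside = xs**2 + ys**2 < 1.0               # pixels covered by a sphere
zs = np.sqrt(np.clip(1.0 - xs**2 - ys**2, 0.0, None))
normals = np.stack([xs, ys, zs], axis=-1)  # unit normals (inside the disc)
shading = np.clip(normals @ light, 0.0, None) * inside
print(shading.shape)   # a 2D image whose values encode surface orientation
```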
Of course, it must be borne in mind that magic does not exist. If a 2D image lacks the necessary information (shadows that give clues about an object's height, curves revealed by illuminated textures, and so on), the AI system will invent the rest; even so, it is an important starting point.
Incidentally, the company threedy.ai has been using a similar technique for several months. I invite you to watch their videos to see their models.