TensorFlow Graphics for Deep Learning of Computer Vision Models

Computer vision and computer graphics go hand in hand, forming a single ML system.

Recently, Google announced TensorFlow Graphics, a library for building deep neural networks for unsupervised learning tasks in computer vision. The library contains 3D-rendering functions written in TensorFlow, as well as tools for learning with non-rectangular mesh-based input data.
 
According to the company, a computer graphics pipeline at a high level requires a representation of 3D objects and their absolute positioning in the scene, a description of the materials they are made of, lights, and a camera. A renderer then interprets this scene description to generate a synthetic rendering.
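To make the ingredients of such a pipeline concrete, here is a minimal sketch in plain Python. It is not TensorFlow Graphics code: the names `SurfacePoint`, `Scene`, and `render` are hypothetical, and the choice of a pinhole camera with Lambertian (diffuse) shading is an assumption made for illustration only.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def dot(a: Vec3, b: Vec3) -> float:
    return sum(x * y for x, y in zip(a, b))


def normalize(v: Vec3) -> Vec3:
    length = math.sqrt(dot(v, v))
    return tuple(x / length for x in v)


@dataclass
class SurfacePoint:
    """Hypothetical stand-in for a 3D object: one shaded surface sample."""
    position: Vec3  # camera coordinates; the camera looks along +z
    normal: Vec3    # surface orientation
    albedo: float   # material: fraction of incoming light reflected (0..1)


@dataclass
class Scene:
    points: List[SurfacePoint]
    light_direction: Vec3  # points from the surface toward the light
    focal_length: float    # pinhole camera parameter


def render(scene: Scene) -> List[Tuple[float, float, float]]:
    """Return one (u, v, intensity) sample per surface point."""
    light = normalize(scene.light_direction)
    samples = []
    for p in scene.points:
        x, y, z = p.position
        # Camera: perspective projection onto the image plane.
        u = scene.focal_length * x / z
        v = scene.focal_length * y / z
        # Material and light: diffuse shading, clamped so that surfaces
        # facing away from the light receive no illumination.
        intensity = p.albedo * max(0.0, dot(normalize(p.normal), light))
        samples.append((u, v, intensity))
    return samples


scene = Scene(
    points=[
        SurfacePoint(position=(1.0, 2.0, 4.0), normal=(0.0, 0.0, -1.0), albedo=0.5),
        SurfacePoint(position=(0.0, 0.0, 2.0), normal=(0.0, 0.0, 1.0), albedo=0.5),
    ],
    light_direction=(0.0, 0.0, -1.0),  # light sits on the camera side
    focal_length=2.0,
)
image_samples = render(scene)
```

In this toy scene the first point projects to image coordinates (0.5, 1.0) with intensity 0.5, while the second faces away from the light and is rendered black. A real renderer adds rasterization, visibility, and far richer material models, but the inputs are the same kinds of scene attributes.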
 
A computer vision system, on the other hand, starts from an image and tries to infer the parameters of the scene. This enables the prediction of which objects are in the scene, what materials they are made of, and their three-dimensional position and orientation.
 
Training models capable of solving these complex 3D vision tasks most often requires large quantities of data. Since labelling data is a difficult and expensive process, it becomes important to design machine learning models that can comprehend the 3D world while being trained without much supervision. A combination of computer vision and computer graphics techniques offers an opportunity to leverage the vast amounts of readily available unlabelled data.
 
For example, you can use analysis by synthesis, where the vision system extracts the scene parameters and the graphics system renders back an image based on them. If the rendering matches the original image, the vision system has accurately extracted the scene parameters. In this setup, computer vision and computer graphics go hand in hand, forming a single ML system similar to an autoencoder, which can be trained in a self-supervised manner.
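The render-and-compare idea can be sketched in a few lines of plain Python. This is a toy under stated assumptions, not the TensorFlow Graphics API: the scene has a single unknown parameter (a light vector), the "renderer" is unclamped Lambertian shading over five hypothetical surface patches with known normals, and plain gradient descent stands in for the vision system, where a real setup would train a neural network through a differentiable renderer.

```python
import math


def normalize(v):
    length = math.sqrt(sum(x * x for x in v))
    return tuple(x / length for x in v)


# Hypothetical scene geometry: five surface patches with known unit normals.
NORMALS = [
    normalize(n)
    for n in [(0, 0, 1), (1, 0, 1), (-1, 0, 1), (0, 1, 1), (0, -1, 1)]
]


def render(light):
    """Graphics side: one pixel per patch, intensity = normal . light.

    The usual clamp at zero is omitted (a simplification) so the
    reconstruction loss stays a simple quadratic in the light vector.
    """
    return [sum(n_k * l_k for n_k, l_k in zip(n, light)) for n in NORMALS]


def loss_and_gradient(light, observed):
    """Squared pixel error between rendering and observation, with gradient."""
    residuals = [r - o for r, o in zip(render(light), observed)]
    loss = sum(e * e for e in residuals)
    grad = [
        2.0 * sum(e * n[k] for e, n in zip(residuals, NORMALS))
        for k in range(3)
    ]
    return loss, grad


def infer_light(observed, steps=200, learning_rate=0.1):
    """Vision side: adjust the light estimate until the rendering matches."""
    light = [0.0, 0.0, 1.0]  # initial guess: light straight overhead
    for _ in range(steps):
        _, grad = loss_and_gradient(light, observed)
        light = [l - learning_rate * g for l, g in zip(light, grad)]
    return light


# The "photo" is synthesized here from a light the estimator never sees.
TRUE_LIGHT = (0.3, 0.5, 0.8)
observed_image = render(TRUE_LIGHT)

estimated_light = infer_light(observed_image)
final_loss, _ = loss_and_gradient(estimated_light, observed_image)
```

No label for the light vector is ever supplied; the only training signal is the mismatch between the rendered and observed pixels, which is what makes the setup self-supervised. Because the renderer is differentiable, that mismatch can be pushed back to the scene parameters by gradient descent.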
 
Source: Google 
 
The company said that TensorFlow Graphics is being developed to help tackle these types of challenges. It includes a set of differentiable graphics and geometry layers, along with 3D viewer functionality, that can be used to train and debug your machine learning models of choice.