Sign In Start Free Trial
Account

Add to playlist

Create a Playlist

Modal Close icon
You need to login to use this feature.
  • Book Overview & Buying 3D Deep Learning with Python
  • Table Of Contents Toc
3D Deep Learning with Python

3D Deep Learning with Python

By : Xudong Ma, David Farrugia, Vishakh Hegde, Lilit Yolyan
4.4 (5)
close
close
3D Deep Learning with Python

3D Deep Learning with Python

4.4 (5)
By: Xudong Ma, David Farrugia, Vishakh Hegde, Lilit Yolyan

Overview of this book

With this hands-on guide to 3D deep learning, developers working with 3D computer vision will be able to put their knowledge to work and get up and running in no time. Complete with step-by-step explanations of essential concepts and practical examples, this book lets you explore and gain a thorough understanding of state-of-the-art 3D deep learning. You’ll see how to use PyTorch3D for basic 3D mesh and point cloud data processing, including loading and saving ply and obj files, projecting 3D points into camera coordination using perspective camera models or orthographic camera models, rendering point clouds and meshes to images, and much more. As you implement some of the latest 3D deep learning algorithms, such as differential rendering, Nerf, synsin, and mesh RCNN, you’ll realize how coding for these deep learning models becomes easier using the PyTorch3D library. By the end of this deep learning book, you’ll be ready to implement your own 3D deep learning models confidently.
Table of Contents (16 chapters)
close
close
1
PART 1: 3D Data Processing Basics
4
PART 2: 3D Deep Learning Using PyTorch3D
9
PART 3: State-of-the-art 3D Deep Learning Using PyTorch3D

3D data representation

In this section, we will learn the most frequently used data representation of 3D data. Choosing data representation is a particularly important design decision for many 3D deep learning systems. For example, point clouds do not have grid-like structures, thus convolutions cannot be usually used directly for them. Voxel representations have grid-like structures; however, they tend to consume a high amount of computer memory. We will discuss the pros and cons of these 3D representations in more detail in this section. Widely used 3D data representations usually include point clouds, meshes, and voxels.

Understanding point cloud representation

A 3D point cloud is a very straightforward representation of 3D objects, where each point cloud is just a collection of 3D points, and each 3D point is represented by one three-dimensional tuple (x, y, or z). The raw measurements of many depth cameras are usually 3D point clouds.

From a deep learning point of view, 3D point clouds are one of the unordered and irregular data types. Unlike regular images, where we can define neighboring pixels for each individual pixel, there are no clear and regular definitions for neighboring points for each point in a point cloud – that is, convolutions usually cannot be applied to point clouds. Thus, special types of deep learning models need to be used for processing point clouds, such as PointNet: https://arxiv.org/abs/1612.00593.

Another issue for point clouds as training data for 3D deep learning is the heterogeneous data issue – that is, for one training dataset, different point clouds may contain different numbers of 3D points. One approach for avoiding such a heterogeneous data issue is forcing all the point clouds to have the same number of points. However, this may not be always possible – for example, the number of points returned by depth cameras may be different from frame to frame.

The heterogeneous data may create some difficulties for mini-batch gradient descent in training deep learning models. Most deep learning frameworks assume that each mini-batch contains training examples of the same size and dimensions. Such homogeneous data is preferred because it can be most efficiently processed by modern parallel processing hardware, such as GPUs. Handling heterogeneous mini-batches in an efficient way needs some additional work. Luckily, PyTorch3D provides many ways of handling heterogeneous mini-batches efficiently, which are important for 3D deep learning.

Understanding mesh representation

Meshes are another widely used 3D data representation. Like points in point clouds, each mesh contains a set of 3D points called vertices. In addition, each mesh also contains a set of polygons called faces, which are defined on vertices.

In most data-driven applications, meshes are a result of post-processing from raw measurements of depth cameras. Often, they are manually created during the process of 3D asset design. Compared to point clouds, meshes contain additional geometric information, encode topology, and have surface-normal information. This additional information becomes especially useful in training learning models. For example, graph convolutional neural networks usually treat meshes as graphs and define convolutional operations using the vertex neighboring information.

Just like point clouds, meshes also have similar heterogeneous data issues. Again, PyTorch3D provides efficient ways for handling heterogeneous mini-batches for mesh data, which makes 3D deep learning efficient.

Understanding voxel representation

Another important 3D data representation is voxel representation. A voxel is the counterpart of a pixel in 3D computer vision. A pixel is defined by dividing a rectangle in 2D into smaller rectangles and each small rectangle is one pixel. Similarly, a voxel is defined by dividing a 3D cube into smaller-sized cubes and each cube is called one voxel. The processes are shown in the following figure:

Figure 1.1 – Voxel representation is the 3D counterpart of 2D pixel representation, where a cubic space is divided into small volume elements

Figure 1.1 – Voxel representation is the 3D counterpart of 2D pixel representation, where a cubic space is divided into small volume elements

Voxel representations usually use Truncated Signed Distance Functions (TSDFs) to represent 3D surfaces. A Signed Distance Function (SDF) can be defined at each voxel as the (signed) distance between the center of the voxel to the closest point on the surface. A positive sign in an SDF indicates that the voxel center is outside an object. The only difference between a TSDF and an SDF is that the values of a TSDF are truncated, such that the values of a TSDF always range from -1 to +1.

Unlike point clouds and meshes, voxel representation is ordered and regular. This property is like pixels in images and enables the use of convolutional filters in deep learning models. One potential disadvantage of voxel representation is that it usually requires more computer memory, but this can be reduced by using techniques such as hashing. Nevertheless, voxel representation is an important 3D data representation.

There are 3D data representations other than the ones mentioned here. For example, multi-view representations use multiple images taken from different viewpoints to represent a 3D scene. RGB-D representations use an additional depth channel to represent a 3D scene. However, in this book, we will not be diving too deep into these 3D representations. Now that we have learned the basics of 3D data representations, we will dive into a few commonly used file formats for point clouds and meshes.

CONTINUE READING
83
Tech Concepts
36
Programming languages
73
Tech Tools
Icon Unlimited access to the largest independent learning library in tech of over 8,000 expert-authored tech books and videos.
Icon Innovative learning tools, including AI book assistants, code context explainers, and text-to-speech.
Icon 50+ new titles added per month and exclusive early access to books as they are being written.
3D Deep Learning with Python
notes
bookmark Notes and Bookmarks search Search in title playlist Add to playlist download Download options font-size Font size

Change the font size

margin-width Margin width

Change margin width

day-mode Day/Sepia/Night Modes

Change background colour

Close icon Search
Country selected

Close icon Your notes and bookmarks

Confirmation

Modal Close icon
claim successful

Buy this book with your credits?

Modal Close icon
Are you sure you want to buy this book with one of your credits?
Close
YES, BUY

Submit Your Feedback

Modal Close icon
Modal Close icon
Modal Close icon