**Thinh Pham**, **Dat Quoc Nguyen**

VinAI Research, Vietnam


November 24, 2021

An autoencoder is a neural network that is trained to attempt to reconstruct its input. Autoencoders have been successfully applied to dimensionality reduction and information retrieval tasks. An autoencoder has two components: an encoder mapping an input to a latent code, and a decoder mapping a latent code to an output called a reconstruction. The learning process of an autoencoder can be described as minimizing a loss function that penalizes the dissimilarity between inputs and reconstructions. A point cloud is a set of d-dimensional vectors representing coordinates, colors, normals, etc.
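The encoder/decoder structure and the reconstruction loss can be sketched in a few lines of NumPy. This is a toy linear autoencoder with made-up shapes and a mean-squared-error loss, just to make the components concrete; real point-cloud autoencoders use learned deep networks and set-based losses such as the ones discussed in this post.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 128, 3, 16                          # points per cloud, point dims, latent size (hypothetical)
W_enc = rng.normal(size=(n * d, k)) * 0.01    # toy linear encoder weights
W_dec = rng.normal(size=(k, n * d)) * 0.01    # toy linear decoder weights

def encode(x):
    """Map a flattened point cloud to a latent code."""
    return x @ W_enc

def decode(z):
    """Map a latent code back to a flattened reconstruction."""
    return z @ W_dec

x = rng.normal(size=n * d)                    # a random "point cloud", flattened
reconstruction = decode(encode(x))
loss = np.mean((x - reconstruction) ** 2)     # dissimilarity between input and reconstruction
```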

In this work, we try to answer the following question: *when learning autoencoders for point clouds, how do different types of loss functions affect the learning process and the quality of latent codes?*

We briefly review popular loss functions used to train autoencoders on point clouds. The first one is the Chamfer discrepancy: given two point clouds $P$ and $Q$, the Chamfer discrepancy between $P$ and $Q$ is defined by

$$d_{CD}(P, Q) = \frac{1}{|P|}\sum_{p \in P} \min_{q \in Q} \|p - q\|_2^2 \;+\; \frac{1}{|Q|}\sum_{q \in Q} \min_{p \in P} \|p - q\|_2^2.$$
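The Chamfer discrepancy is straightforward to compute; a brute-force NumPy sketch (O(|P||Q|) pairwise distances; normalizing by cloud size is one common convention) looks like this:

```python
import numpy as np

def chamfer_discrepancy(P, Q):
    """Chamfer discrepancy between point clouds P (n, d) and Q (m, d)."""
    # Pairwise squared Euclidean distances, shape (n, m).
    diff = P[:, None, :] - Q[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)
    # For each point, the squared distance to its nearest neighbour in
    # the other cloud, averaged within each cloud and summed.
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()
```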

Another distance used to compare two point clouds is the Earth Mover's distance (EMD). When $P$ and $Q$ have the same number of points, the EMD between $P$ and $Q$ is defined as

$$d_{EMD}(P, Q) = \min_{\sigma : P \to Q} \sum_{p \in P} \|p - \sigma(p)\|_2,$$

where the minimum is taken over all bijections $\sigma$ from $P$ to $Q$.
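For small clouds, the optimal matching in the EMD can be found exactly with the Hungarian algorithm, e.g. via SciPy. This is a sketch following the summed-cost convention above (some implementations average over points instead):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def earth_movers_distance(P, Q):
    """Exact EMD between equal-size point clouds via optimal matching."""
    assert len(P) == len(Q), "EMD as defined here requires |P| == |Q|"
    # Pairwise Euclidean cost matrix, shape (n, n).
    cost = np.linalg.norm(P[:, None, :] - Q[None, :, :], axis=-1)
    # Hungarian algorithm finds the minimum-cost bijection in O(n^3).
    rows, cols = linear_sum_assignment(cost)
    return cost[rows, cols].sum()
```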

In this work, we also propose to use the Sliced Wasserstein distance (SWD). The idea of the sliced Wasserstein distance is to first project both target probability measures $\mu$ and $\nu$ onto a direction, say $\theta$, on the unit sphere, obtaining two projected measures denoted by $\pi_\theta \# \mu$ and $\pi_\theta \# \nu$; we then compute the one-dimensional Wasserstein distance between the two projected measures and average over all directions:

$$SW_p(\mu, \nu) = \left( \int_{\mathbb{S}^{d-1}} W_p^p\!\left(\pi_\theta \# \mu, \pi_\theta \# \nu\right) d\theta \right)^{1/p}.$$

The $SW_p$ is considered a low-cost approximation of the Wasserstein distance, as its computational complexity is of the order $O(n \log n)$ per projection: in one dimension, the Wasserstein distance between the projected measures can be computed by simply sorting the projected points.
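A Monte Carlo sketch of the sliced Wasserstein distance: the integral over the sphere is approximated with random directions, and each slice only requires sorting the projections, which is where the $O(n \log n)$ per-projection cost comes from. This minimal version assumes equal-size clouds with uniform weights; the number of projections is a free parameter.

```python
import numpy as np

def sliced_wasserstein(P, Q, num_projections=100, p=2, seed=None):
    """Monte Carlo estimate of SW_p between equal-size point clouds."""
    assert P.shape == Q.shape
    rng = np.random.default_rng(seed)
    d = P.shape[1]
    # Random directions uniformly on the unit sphere S^{d-1}.
    theta = rng.normal(size=(num_projections, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    # Project each cloud onto every direction, then sort along each slice:
    # in 1D, sorting realizes the optimal transport plan.
    proj_P = np.sort(P @ theta.T, axis=0)   # shape (n, num_projections)
    proj_Q = np.sort(Q @ theta.T, axis=0)
    return np.mean(np.abs(proj_P - proj_Q) ** p) ** (1.0 / p)
```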

Lemma 1. *Assume* $|P| = |Q|$ *and that the supports of* $P$ *and* $Q$ *are bounded in a convex hull of diameter* $K$*; then the Chamfer discrepancy between* $P$ *and* $Q$ *is upper-bounded by the Wasserstein distance between* $P$ *and* $Q$*, up to a constant depending on* $K$.

The inequality in Lemma 1 implies that minimizing the Wasserstein distance also drives the Chamfer discrepancy down, while the reverse inequality does not hold in general.

**Experiment Results.** In general, good latent codes are expected to perform well across a wide range of downstream tasks. Here we compare the performance of autoencoders trained with the Chamfer discrepancy, the Earth Mover's distance, and the sliced Wasserstein distance. We consider the following tasks in our evaluation: point cloud reconstruction (Table 1), classification (Figure 2), registration (Table 3), and generation (Table 4).

In conclusion, SWD possesses both the statistical benefits of EMD and the computational benefits of the Chamfer discrepancy. Moreover, latent codes learned with SWD appear to yield better performance on many downstream tasks than those learned with Chamfer or EMD.


Authors: Trung Nguyen, Quang-Hieu Pham, Tam Le, Tung Pham, Nhat Ho, Binh-Son Hua


