ML
ICML

Optimal Transport for Structure Learning Under Missing Data

May 16, 2024

Causal discovery in the presence of missing data introduces a chicken-and-egg dilemma. While the goal is to recover the true causal structure, robust imputation requires considering the dependencies or preferably causal relations among variables. Merely filling in missing values with existing imputation methods and subsequently applying structure learning on the complete data is empirical shown to be sub-optimal. To this end, we propose in this paper a score-based algorithm, based on optimal transport, for learning causal structure from missing data. This optimal transport viewpoint diverges from existing score-based approaches that are dominantly based on EM. We project structure learning as a density fitting problem, where the goal is to find the causal model that induces a distribution of minimum Wasserstein distance with the distribution over the observed data. Through extensive simulations and real-data experiments, our framework is shown to recover the true causal graphs more effectively than the baselines in various simulations and real-data experiments. Empirical evidences also demonstrate the superior scalability of our approach, along with the flexibility to incorporate any off-the-shelf causal discovery methods for complete data.

Overall

< 1 minute

Vy Vo, He Zhao, Trung Le, Edwin V. Bonilla, Dinh Phung

ICML 2024

Share Article

Related publications

GenAI
ML
NeurIPS
November 28, 2024
Long Tung Vuong, Anh Tuan Bui,
Khanh Doan, Trung Le, Paul Montague, Tamas Abraham, Dinh Phung
ML
NeurIPS
November 28, 2024

Hoang Phan*, Lam Tran*, Quyen Tran*, Trung Le

ML
NeurIPS
November 28, 2024

Haocheng Luo, Tuan Truong, Tung Pham, Mehrtash Harandi, Dinh Phung, Trung Le

GenAI
ML
NeurIPS
November 28, 2024

Minh Le, An Nguyen, Huy Nguyen, Trang Nguyen, Trang Pham, Linh Van Ngo, Nhat Ho