Parameter Estimation & Interpretability in Bayesian Mixture Models
Long Nguyen
Long Nguyen is an associate professor in the Department of Statistics and, by courtesy, the Department of Electrical Engineering and Computer Science at the University of Michigan, Ann Arbor. He received his PhD from the University of California, Berkeley in 2007. Nguyen’s interests include nonparametric Bayesian statistics, machine learning and optimization, as well as applications in signal processing and environmental sciences. He is a recipient of the Leon O. Chua Award from UC Berkeley for his PhD research, the IEEE Signal Processing Society’s Young Author best paper award, the CAREER award from the NSF’s Division of Mathematical Sciences, and best paper awards from the International Conference on Machine Learning (ICML) in 2004 and 2014. Nguyen currently serves as associate editor of Bayesian Analysis, Journal of Machine Learning Research, the Annals of Statistics and SIAM Journal on Mathematics of Data Science.
We study the posterior contraction behavior of parameters of interest in Bayesian mixture modeling, where the number of mixing components is unknown and the model itself may or may not be correctly specified. Two representative types of prior specification are considered: one explicitly requires a prior distribution on the number of mixture components, while the other places a nonparametric prior on the space of mixing distributions. The former is shown to yield an optimal rate of posterior contraction on the model parameters under minimal conditions, while the latter can be used to consistently recover the unknown number of mixture components with the help of a fast probabilistic post-processing procedure. We then study these Bayesian procedures in the realistic setting of model misspecification. It will be shown that the choice of kernel density function plays perhaps the most important role in determining the posterior contraction rates in misspecified situations. Drawing on the concrete posterior contraction rates established in this paper, we highlight some of the interesting tradeoffs between model expressiveness and interpretability that a statistical modeler must negotiate in the rich world of mixture modeling.