Improving Pareto Front Learning via Multi-Head HyperNetwork
Le Duy Dung
Le Duy Dung is an Assistant Professor at the College of Engineering and Computer Science, VinUniversity. Previously, he was a senior data scientist on the Ads and Personalization team at Grab Holdings Inc. and a research scientist at the School of Information Systems, Singapore Management University (SMU). He earned his PhD in Data Science and Engineering from SMU under the supervision of Associate Professor Hady W. Lauw. During his candidature, he received the SMU Presidential Doctoral Fellowship Awards and the SMU PhD Student Life Awards. Before that, he earned his Degree of Engineer in Mathematics and Informatics from Hanoi University of Science and Technology, Vietnam, in 2014. His research interests are in recommender systems, information retrieval, and visual analytics. More information about him can be found at: https://andrew-dungle.github.io/.
Multi-objective optimization (MOO) plays a critical role in many machine learning applications. The set of optimal solutions to an MOO problem is referred to as the Pareto front, in which each point represents a different trade-off between conflicting objectives. Pareto front learning was recently introduced as an effective approach that lets one flexibly derive the optimal solution for a desired trade-off after training. This setup is practical because, in many scenarios, decision-makers cannot specify a preference for one Pareto solution over another and must switch between them depending on the situation.
Although Pareto front learning offers many advantages in approximating the true Pareto front, existing methods produce solutions that are either suboptimal or not widely dispersed. To overcome this issue, we propose a novel Multi-head HyperNetwork (MHN) architecture to improve the quality of the obtained Pareto front. Specifically, we generate multiple Pareto solutions from a set of diverse trade-off preferences via the MHN and improve the quality of the Pareto front by maximizing the hypervolume defined by these solutions. Experimental results on a wide variety of machine learning tasks show that the proposed method significantly outperforms the baselines in producing a high-quality Pareto front.
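To make the hypervolume criterion concrete, the following is a minimal sketch of how the hypervolume of a set of solutions could be computed in the two-objective minimization case. It is an illustration only and is not taken from the talk: the function names `pareto_front` and `hypervolume_2d`, the choice of reference point, and the restriction to two objectives are all assumptions made here.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (objective 1, objective 2), both minimized


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, sorted by the first objective."""
    front: List[Point] = []
    best_f2 = float("inf")
    # After sorting by f1 (then f2), a point is non-dominated only if it
    # strictly improves on the best f2 seen so far.
    for f1, f2 in sorted(set(points)):
        if f2 < best_f2:
            front.append((f1, f2))
            best_f2 = f2
    return front


def hypervolume_2d(points: List[Point], ref: Point) -> float:
    """Area dominated by `points` and bounded by the reference point `ref`.

    `ref` is an assumed worst-case point; solutions that do not strictly
    improve on it in both objectives contribute nothing.
    """
    front = [p for p in pareto_front(points) if p[0] < ref[0] and p[1] < ref[1]]
    hv = 0.0
    prev_f2 = ref[1]
    # Sum horizontal strips: each front point adds a rectangle reaching
    # from its f1 to ref[0] and from its f2 up to the previous f2.
    for f1, f2 in front:
        hv += (ref[0] - f1) * (prev_f2 - f2)
        prev_f2 = f2
    return hv


if __name__ == "__main__":
    solutions = [(1, 3), (2, 2), (3, 1), (3, 3)]  # hypothetical objective values
    print(hypervolume_2d(solutions, ref=(4, 4)))
```

In this sketch, a larger hypervolume means the solutions are both closer to the ideal point and more widely spread, which is why it serves as a single quality measure for a whole set of Pareto solutions.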