Safe, open, locally-aligned language models
Tim Baldwin
Tim Baldwin is Provost and Professor of Natural Language Processing at Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), in addition to being a Melbourne Laureate Professor in the School of Computing and Information Systems at The University of Melbourne, and Chief Scientist of LibrAI, a start-up focused on AI safety. Tim completed a BSc (CS/Maths) and BA (Linguistics/Japanese) at The University of Melbourne in 1995, and an MEng (CS) and PhD (CS) at the Tokyo Institute of Technology in 1998 and 2001, respectively. He joined MBZUAI at the start of 2022, prior to which he was based at The University of Melbourne for 17 years. His research has been funded by organisations including the Australian Research Council, Google, Microsoft, Xerox, ByteDance, SEEK, NTT, and Fujitsu. He is the author of over 500 peer-reviewed publications across diverse topics in natural language processing and AI, an ARC Future Fellow, and the recipient of a number of awards at top conferences.
In this talk, I will present recent work at MBZUAI targeted at the development and release of open-weight and open-source language models (LMs) for a range of different languages, with a particular focus on AI safety, open-sourcing, and localisation. I will first motivate the need for (genuinely) open-source LMs, and then describe the process we have developed for the localisation and safety alignment of public-release LMs, including auto red-teaming, evaluation dataset creation, and safety alignment. I will further present details of a new AI safety leaderboard we are in the process of releasing.