SEMINAR

SeaLLMs – Large Language Models for Southeast Asia

Speaker

Xuan Phi Nguyen

Working
Alibaba Group
Timeline
Mon, Dec 18 2023 - 02:30 pm (GMT + 7)
About Speaker

Phi is a senior research engineer at DAMO Academy, Alibaba Group, in Singapore, where he works on multilinguality in large language models and translation technologies, with the goal of democratizing AI for under-represented communities. Before that, he completed his PhD in Artificial Intelligence at Nanyang Technological University (NTU) in Singapore. During his PhD, Phi also completed research internships at Salesforce AI Research and Facebook AI Research (FAIR) in 2019, 2021 and 2022. He has published several research papers at machine learning and natural language processing conferences, namely ICLR-20/22, NeurIPS-20/22, ICML-21, ACL-20/21, ICASSP-23 and EMNLP-23. Phi is also a recipient of the 2021 Singapore Data Science Consortium (SDSC) Dissertation Research Award, a NeurIPS Scholar Award and the A*STAR Computing and Information Science (ACIS) Scholarship.

Abstract

Despite the remarkable achievements of large language models (LLMs) in various tasks, there remains a linguistic bias that favors high-resource languages, such as English, often at the expense of low-resource and regional languages. To address this imbalance, we introduce SeaLLMs, a series of language models that focus specifically on Southeast Asian (SEA) languages. SeaLLMs are built upon the Llama-2 model and further advanced through continued pre-training, specialized instruction and alignment tuning. Our comprehensive evaluation demonstrates that SeaLLM-13b models outperform comparable open-source models across a wide spectrum of linguistic tasks and in assistant-style instruction following. Moreover, they outperform ChatGPT-3.5 by large margins in non-Latin languages, such as Thai, Khmer, Lao, and Burmese, while remaining lightweight and cost-effective to operate.
