SEMINAR

SeaLLMs – Large Language Models for Southeast Asia

Speaker

Xuan Phi Nguyen

Working
Alibaba Group
Timeline
Mon, Dec 18 2023 - 02:30 pm (GMT + 7)
About Speaker

Phi is a senior research engineer at DAMO Academy, Alibaba Group in Singapore, where he works on multilinguality in large language models and translation technologies, with the goal of democratizing AI for under-represented communities. Prior to that, he completed his PhD in Artificial Intelligence at Nanyang Technological University (NTU) in Singapore. During his PhD, Phi also took part in research internship programs at Salesforce AI Research and Facebook AI Research (FAIR) in 2019, 2021 and 2022. He has published several research papers at machine learning and natural language processing conferences, namely ICLR-20/22, NeurIPS-20/22, ICML-21, ACL-20/21, ICASSP-23 and EMNLP-23. Phi is also a recipient of the 2021 Singapore Data Science Consortium (SDSC) Dissertation Research Award, a NeurIPS Scholar Award and the A*STAR Computing and Information Science (ACIS) Scholarship.

Abstract

Despite the remarkable achievements of large language models (LLMs) in various tasks, there remains a linguistic bias that favors high-resource languages, such as English, often at the expense of low-resource and regional languages. To address this imbalance, we introduce SeaLLMs, a series of language models that focuses specifically on Southeast Asian (SEA) languages. SeaLLMs are built upon the Llama-2 model and further advanced through continued pre-training, specialized instruction tuning and alignment tuning. Our comprehensive evaluation demonstrates that SeaLLM-13b models outperform comparable open-source models across a wide spectrum of linguistic tasks and in assistant-style instruction following. Moreover, they outperform ChatGPT-3.5 by large margins in languages written in non-Latin scripts, such as Thai, Khmer, Lao, and Burmese, while remaining lightweight and cost-effective to operate.
