GenAI
NLP
InterSpeech

A High-Quality and Large-Scale Dataset for English-Vietnamese Speech Translation

June 16, 2022

In this paper, we introduce a high-quality and large-scale benchmark dataset for English-Vietnamese speech translation with 508 audio hours, consisting of 331K triplets of (sentence-length audio, English source transcript sentence, Vietnamese target subtitle sentence). We also conduct empirical experiments using strong baselines and find that the traditional “Cascaded” approach still outperforms the modern “End-to-End” approach. To the best of our knowledge, this is the first large-scale English-Vietnamese speech translation study. We hope both our publicly available dataset and study can serve as a starting point for future research and applications on English-Vietnamese speech translation.

Index Terms: Benchmark dataset, English-Vietnamese, Speech translation, Automatic speech recognition, Machine translation, Cascaded, End-to-End.

Linh The Nguyen*, Nguyen Luong Tran*, Long Doan*, Manh Luong, Dat Quoc Nguyen

InterSpeech 2022
