InterSpeech

A High-Quality and Large-Scale Dataset for English-Vietnamese Speech Translation

June 16, 2022

In this paper, we introduce a high-quality and large-scale benchmark dataset for English-Vietnamese speech translation with 508 audio hours, consisting of 331K triplets of (sentence-length audio, English source transcript sentence, Vietnamese target subtitle sentence). We also conduct empirical experiments using strong baselines and find that the traditional “Cascaded” approach still outperforms the modern “End-to-End” approach. To the best of our knowledge, this is the first large-scale English-Vietnamese speech translation study. We hope both our publicly available dataset and study can serve as a starting point for future research and applications on English-Vietnamese speech translation.
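As a rough illustration of the triplet structure and the two approaches compared above, the following minimal Python sketch may help. It is not the authors' code: the class name, field names, and function names are hypothetical, and the ASR, MT, and speech translation models are represented only as caller-supplied functions.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class SpeechTranslationTriplet:
    """One dataset example. Field names are assumptions, not the released schema."""
    audio_path: str            # sentence-length audio clip
    english_transcript: str    # English source transcript sentence
    vietnamese_subtitle: str   # Vietnamese target subtitle sentence


def cascaded_translate(
    audio_path: str,
    asr: Callable[[str], str],
    mt: Callable[[str], str],
) -> str:
    """Cascaded approach: speech recognition first, then text translation."""
    transcript = asr(audio_path)   # audio -> English text
    return mt(transcript)          # English text -> Vietnamese text


def end_to_end_translate(
    audio_path: str,
    st: Callable[[str], str],
) -> str:
    """End-to-End approach: a single model maps audio directly to Vietnamese text."""
    return st(audio_path)
```

In the cascaded sketch the intermediate English transcript is explicit, so the ASR and MT components can be trained and evaluated separately; the end-to-end sketch has no such intermediate text.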

Index Terms: Benchmark dataset, English-Vietnamese, Speech translation, Automatic speech recognition, Machine translation, Cascaded, End-to-End.

Linh The Nguyen*, Nguyen Luong Tran*, Long Doan*, Manh Luong, Dat Quoc Nguyen

InterSpeech 2022

