GenAI
NLP
NAACL

Extractive Summarization with Text Generator

April 4, 2024

Standard extractive systems suffer from a lack of gold training signals: existing corpora provide only document and human-written summary pairs, without extractive labels. As a result, existing methods resort to imperfect pseudo-labels that are both biased and error-prone, hindering the learning of extractive models. In contrast, the text generators commonly employed in abstractive summarization can effortlessly overcome this predicament thanks to their flexible sequence-to-sequence architectures. Motivated to bypass this inherent limitation, we investigate the possibility of conducting extractive summarization with text generators. Through extensive experiments covering six summarization benchmarks, we show that high-quality extractive summaries can be assembled by approximating the outputs (abstractive summaries) of these generators. Moreover, we find that the quality of the approximate summaries correlates positively with that of the auxiliary abstractive summaries (i.e., a better generation model enables the production of better extractive summaries). Our results signify a new paradigm for training extractive summarizers, i.e., learning with generation (abstractive) objectives rather than extractive schemes.
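
The central idea, assembling an extractive summary that approximates a generator's abstractive output, can be illustrated with a simple greedy sentence-selection sketch. The snippet below is a hypothetical illustration rather than the paper's actual procedure: the helper names (`token_f1`, `greedy_extract`) and the unigram-overlap objective are assumptions standing in for whatever approximation criterion the method uses.

```python
from typing import List


def token_f1(candidate: str, reference: str) -> float:
    """Unigram F1 overlap between a candidate text and a reference text."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    if not cand or not ref:
        return 0.0
    ref_counts = {}
    for tok in ref:
        ref_counts[tok] = ref_counts.get(tok, 0) + 1
    common = 0
    for tok in cand:
        if ref_counts.get(tok, 0) > 0:
            common += 1
            ref_counts[tok] -= 1
    if common == 0:
        return 0.0
    precision, recall = common / len(cand), common / len(ref)
    return 2 * precision * recall / (precision + recall)


def greedy_extract(doc_sentences: List[str],
                   generated_summary: str,
                   max_sentences: int = 3) -> List[str]:
    """Greedily pick document sentences whose concatenation best approximates
    the generator's abstractive summary (illustrative selection only)."""
    selected: List[str] = []
    best_score = 0.0
    while len(selected) < max_sentences:
        best_sent, best_gain = None, best_score
        for sent in doc_sentences:
            if sent in selected:
                continue
            score = token_f1(" ".join(selected + [sent]), generated_summary)
            if score > best_gain:
                best_sent, best_gain = sent, score
        if best_sent is None:  # no remaining sentence improves the approximation
            break
        selected.append(best_sent)
        best_score = best_gain
    return selected


if __name__ == "__main__":
    document = [
        "The model is trained on document and summary pairs.",
        "Extractive labels are not provided by existing corpora.",
        "We approximate the generator's output with selected sentences.",
    ]
    abstractive = "selected sentences approximate the generator output"
    print(greedy_extract(document, abstractive, max_sentences=2))
```

In practice the selection criterion could be ROUGE, embedding similarity, or any other measure of how well the chosen sentences reproduce the generator's summary; the greedy loop above is only one plausible realization.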


Thang Le, Tuan Luu

