Review of neural approaches for conditional text generation

Authors

O. H. Skurzhanskyi, A. A. Marchenko

DOI:

https://doi.org/10.17721/1812-5409.2021/1.13

Keywords:

natural language processing, neural networks, machine learning, conditional text generation, paraphrase generation, grammatical error correction, text simplification

Abstract

The article reviews conditional text generation, one of the most promising fields of natural language processing and artificial intelligence. Specifically, we explore monolingual local sequence transduction tasks: paraphrase generation, grammatical and spelling error correction, and text simplification. To give a better understanding of the considered tasks, we show examples of good rewrites. We then take a close look at key aspects such as publicly available datasets with their splits (training, validation, and testing), quality metrics for proper evaluation, and modern solutions based primarily on neural networks. For each task, we analyze its main characteristics and how they influence the state-of-the-art models. Finally, we investigate the most significant features shared by this group of tasks as a whole and by the approaches that solve them.

Pages of the article in the issue: 102–107

Language of the article: Ukrainian

References

BROWN, T.B. et al. (2020) Language Models are Few-Shot Learners. In NeurIPS 2020

KAGGLE. (2017) Quora Duplicate Questions [Online] – Available from: https://www.kaggle.com/aymenmouelhi/quora-duplicate-questions [Accessed: 19th June 2012].

WIETING, J. and GIMPEL, K. (2018) ParaNMT-50M: Pushing the Limits of Paraphrastic Sentence Embeddings with Millions of Machine Translations. In ACL 2018

LAN, W. et al. (2017) A Continuously Growing Dataset of Sentential Paraphrases. In EMNLP 2017

LIN, T. et al. (2014) Microsoft COCO: Common Objects in Context. In ECCV 2014

PAPINENI, K. et al. (2002) Bleu: a Method for Automatic Evaluation of Machine Translation. In ACL 2002

LIN, C. (2004) ROUGE: A Package for Automatic Evaluation of Summaries. In ACL 2004

FU, Y. and FENG, Y. (2019) Paraphrase Generation with Latent Bag of Words. In NeurIPS 2019

DAHLMEIER, D. et al. (2013) Building a Large Annotated Corpus of Learner English: The NUS Corpus of Learner English. In BEA 2013

TAJIRI, T. et al. (2012) Tense and Aspect Error Correction for ESL Learners Using Global Context. In ACL 2012

YANNAKOUDAKIS, H. et al. (2011) A New Dataset and Method for Automatically Grading ESOL Texts. In ACL 2011

BRYANT, C. et al. (2019) The BEA-2019 Shared Task on Grammatical Error Correction. In BEA 2019

DAHLMEIER, D. and NG, H. T. (2012) Better Evaluation for Grammatical Error Correction. In NAACL 2012

BRYANT, C. et al. (2017) Automatic Annotation and Evaluation of Error Types for Grammatical Error Correction. In ACL 2017

OMELIANCHUK, K. et al. (2020) GECToR – Grammatical Error Correction: Tag, Not Rewrite. In BEA 2020

YANG, Z. et al. (2019) XLNet: Generalized Autoregressive Pretraining for Language Understanding. In NeurIPS 2019

ZHANG, X. and LAPATA, M. (2017) Sentence Simplification with Deep Reinforcement Learning. In EMNLP 2017

XU, W. et al. (2015) Problems in Current Text Simplification Research: New Data Can Help. In TACL 2015

XU, W. et al. (2016) Optimizing Statistical Machine Translation for Text Simplification. In TACL 2016

KINCAID, J. P. et al. (1975) Derivation of New Readability Formulas. Institute for Simulation and Training, 56

MARTIN, L. et al. (2020) Multilingual Unsupervised Sentence Simplification

LEWIS, M. et al. (2020) BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension. In ACL 2020

VASWANI, A. et al. (2017) Attention Is All You Need. In NIPS 2017

Published

2021-06-16

How to Cite

Skurzhanskyi, O. H., & Marchenko, A. A. (2021). Review of neural approaches for conditional text generation. Bulletin of Taras Shevchenko National University of Kyiv. Physical and Mathematical Sciences, (1), 102–107. https://doi.org/10.17721/1812-5409.2021/1.13

Issue

No. 1 (2021)

Section

Computer Science and Informatics