Guided Beam Search to Improve Generalization in Low-Resource Data-to-Text Generation

Nicolas Garneau, Luc Lamontagne


In Sessions:

INLG Oral Session 2: NLG for low-resourced settings (Wednesday, 13:30 CEST, Sun II)


Abstract: In this paper, we introduce a new beam search algorithm that improves the generalization of neural generators to unseen examples, especially in low-resource data-to-text settings. Our algorithm aims to reduce the number of omissions and hallucinations during the decoding process. For this purpose, it relies on two regression models that explicitly characterize factual errors. We explain how to create a new dataset to train these models given an original training set of fewer than a thousand data points. We apply our approach in the low-resource, legal setting using the French Plum2Text dataset, as well as in English using WebNLG. We observe in our experiments that this approach improves the faithfulness of pre-trained neural text generators under both human and automatic evaluation. Moreover, our approach offers a level of interpretability by predicting the number of omissions and hallucinations present in a given generation with respect to the input data. Finally, we visualize our algorithm's exploration of the hypothesis space at different steps of the decoding process.
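To make the idea concrete, the following is a minimal, hypothetical sketch of a guided beam search in which each hypothesis's score combines the generator's log-probability with penalties from two regressors estimating omission and hallucination counts. All function names, the scoring formula, and the penalty weight are illustrative assumptions and are not taken from the paper.

```python
# Hypothetical guided beam search sketch. Assumptions (not from the paper):
# - step_logprobs(seq) returns (token_id, log_prob) pairs for the next step;
# - predict_omissions / predict_hallucinations are the two regressors,
#   here modeled as callables that score a partial hypothesis;
# - the guided score is log-prob minus a weighted sum of predicted errors.
from typing import Callable, List, Tuple

def guided_beam_search(
    step_logprobs: Callable[[Tuple[int, ...]], List[Tuple[int, float]]],
    predict_omissions: Callable[[Tuple[int, ...]], float],
    predict_hallucinations: Callable[[Tuple[int, ...]], float],
    eos_id: int,
    beam_width: int = 4,
    max_len: int = 20,
    penalty_weight: float = 1.0,
) -> Tuple[int, ...]:
    """Return the best finished hypothesis under the guided score."""
    # Each live beam stores (cumulative log-prob, token sequence).
    beams: List[Tuple[float, Tuple[int, ...]]] = [(0.0, ())]
    # Finished hypotheses store (guided score, token sequence).
    finished: List[Tuple[float, Tuple[int, ...]]] = []
    for _ in range(max_len):
        candidates = []
        for logp, seq in beams:
            for tok, tok_logp in step_logprobs(seq):
                new_seq = seq + (tok,)
                new_logp = logp + tok_logp
                # Guidance: penalize predicted factual errors.
                guide = penalty_weight * (
                    predict_omissions(new_seq) + predict_hallucinations(new_seq)
                )
                score = new_logp - guide
                if tok == eos_id:
                    finished.append((score, new_seq))
                else:
                    candidates.append((score, new_logp, new_seq))
        # Keep the top-k partial hypotheses by guided score.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = [(lp, s) for _, lp, s in candidates[:beam_width]]
        if not beams:
            break
    pool = finished if finished else beams
    return max(pool, key=lambda p: p[0])[1]
```

In this sketch the regressors re-rank hypotheses at every decoding step, so a fluent continuation that introduces a hallucination can be pruned in favor of a slightly less probable but more faithful one.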