Abstract
Lexically-constrained NMT (LNMT) aims to incorporate user-provided
terminology into translations. Despite its practical advantages, existing work
has not evaluated LNMT models under challenging real-world conditions. In this
paper, we focus on two important but under-studied issues that lie in the
current evaluation process of LNMT studies. The model needs to cope with
challenging lexical constraints that are "homographs" or "unseen" during
training. To this end, we first design a homograph disambiguation module to
differentiate the meanings of homographs. Moreover, we propose PLUMCOT, which
integrates contextually rich information about unseen lexical constraints from
pre-trained language models and strengthens a copy mechanism of the pointer
network via direct supervision of a copying score. We also release HOLLY, an
evaluation benchmark for assessing the ability of a model to cope with
"homographic" and "unseen" lexical constraints. Experiments on HOLLY and the
previous test setup show the effectiveness of our method. The effects of
PLUMCOT are shown to be remarkable in "unseen" constraints. Our dataset is
available at https://github.com/papago-lab/HOLLY-benchmark