Towards Accurate Translation via Semantically Appropriate Application of Lexical Constraints

Yujin Baek; Koanho Lee; Dayeon Ki; Hyoung-Gyu Lee; Cheonbok Park; Jaegul Choo

arXiv 2023

Abstract

Lexically-constrained NMT (LNMT) aims to incorporate user-provided terminology into translations. Despite its practical advantages, existing work has not evaluated LNMT models under challenging real-world conditions. In this paper, we focus on two important but under-studied issues in the current evaluation process of LNMT studies: the model needs to cope with challenging lexical constraints that are "homographs" or "unseen" during training. To this end, we first design a homograph disambiguation module to differentiate the meanings of homographs. Moreover, we propose PLUMCOT, which integrates contextually rich information about unseen lexical constraints from pre-trained language models and strengthens the copy mechanism of the pointer network via direct supervision of a copying score. We also release HOLLY, an evaluation benchmark for assessing the ability of a model to cope with "homographic" and "unseen" lexical constraints. Experiments on HOLLY and the previous test setup show the effectiveness of our method. The gains from PLUMCOT are particularly notable on "unseen" constraints. Our dataset is available at https://github.com/papago-lab/HOLLY-benchmark
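
As an illustration of the copy-mechanism idea mentioned in the abstract, here is a minimal Python sketch. It is not the authors' implementation. It assumes a pointer-style mixture in which a scalar copying score (`gate`) interpolates between a vocabulary distribution and an attention distribution over constraint tokens, and it assumes that "direct supervision" means a binary cross-entropy loss on that gate. The function names, the gate label rule, and the toy numbers are all hypothetical; the abstract does not specify any of them.

```python
import math


def mix_copy_distribution(p_vocab, attention, constraint_tokens, gate):
    """Pointer-style mixture: (1 - gate) * generate + gate * copy.

    p_vocab: dict mapping token -> generation probability (sums to 1).
    attention: list of weights over constraint positions (sums to 1).
    constraint_tokens: the token at each constraint position.
    gate: copying score in [0, 1] (assumed to be a scalar here).
    """
    mixed = {tok: (1.0 - gate) * p for tok, p in p_vocab.items()}
    for weight, tok in zip(attention, constraint_tokens):
        # Copy probability mass lands on the constraint token itself,
        # even if that token is absent from the generation vocabulary.
        mixed[tok] = mixed.get(tok, 0.0) + gate * weight
    return mixed


def gate_supervision_loss(gate, gold_token, constraint_tokens):
    """Binary cross-entropy on the gate (assumed supervision signal).

    The label is 1 when the gold target token is one of the constraint
    tokens, i.e. when copying would have been the right action.
    """
    label = 1.0 if gold_token in constraint_tokens else 0.0
    eps = 1e-12  # avoids log(0)
    return -(label * math.log(gate + eps)
             + (1.0 - label) * math.log(1.0 - gate + eps))


# Toy usage with made-up numbers.
dist = mix_copy_distribution(
    p_vocab={"bank": 0.6, "river": 0.4},
    attention=[1.0],
    constraint_tokens=["Ufer"],
    gate=0.5,
)
loss = gate_supervision_loss(0.5, gold_token="Ufer", constraint_tokens=["Ufer"])
```

In this toy setup the constraint token receives probability mass only through the copy path, which is why a gate that is pushed toward 1 at constraint positions would matter most for constraints never seen during training.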

Keywords

cs.CL

Access

URL: http://arxiv.org/abs/2306.12089v1

Citation

ID: 283110

Ref Key: choo2023towards
