Gibbs Sampling Segmentation of Parallel Dependency Trees for Tree-Based Machine Translation
David Mareček; Zdeněk Žabokrtský
Prague Bulletin of Mathematical Linguistics, 2016, Vol. 105, pp. 101-110
132
david2016praguegibbs
Abstract
We present work in progress aimed at extracting translation pairs of source and target dependency treelets for use in a dependency-based machine translation system. We introduce a novel unsupervised method for parallel tree segmentation based on Gibbs sampling. Using data from a Czech-English parallel treebank, we show that the procedure converges to a dictionary containing reasonably sized treelets; in some cases, the segmentation appears to have an interesting linguistic interpretation.