Mining a comparable text corpus for a Vietnamese - French statistical machine translation system

DO Thi Ngoc Diep, LE Viet Bac, BIGI Brigitte, BESACIER Laurent, CASTELLI Eric
The 4th Workshop on statistical machine translation - EACL 2009 - March 2009
This paper presents our first attempt at constructing a Vietnamese-French statistical machine translation system. Since Vietnamese is an under-resourced language, we concentrate on building a large Vietnamese-French parallel corpus. We propose a document alignment method based on publication dates, special words, and sentence alignment results. The paper also presents an application of the resulting parallel corpus to the construction of a Vietnamese-French statistical machine translation system, and discusses the use of different units for Vietnamese (syllables, words, or their combinations).

BibTeX reference

@InternationalConference{DLBBC09,
  author       = {DO, T. and LE, V. and BIGI, B. and BESACIER, L. and CASTELLI, E.},
  title        = {Mining a comparable text corpus for a Vietnamese - French statistical machine translation system},
  booktitle    = {The 4th Workshop on statistical machine translation - EACL 2009},
  month        = {March},
  year         = {2009},
  address      = {Athens, Greece},
  url          = {/2009/DLBBC09},
}
