Details
Research on Uyghur-Chinese Neural Machine Translation Based on the Transformer at Multistrategy Segmentation Granularity (Indexed in SCI-EXPANDED, EI) Cited by: 6
Document type: Journal article
English title: Research on Uyghur-Chinese Neural Machine Translation Based on the Transformer at Multistrategy Segmentation Granularity
Authors: Xu, Zhiwang[1,2]; Qin, Huibin[1]; Hua, Yongzhu[1]
Affiliations: [1] Hangzhou Dianzi Univ, Inst Electron Device & Applicat, Hangzhou 310018, Zhejiang, Peoples R China; [2] Shaoxing Univ, Yuanpei Coll, Shaoxing 312000, Zhejiang, Peoples R China
Year: 2021
Volume: 2021
Journal: MOBILE INFORMATION SYSTEMS
Indexed in: SCI-EXPANDED (accession no. WOS:000674595200006), EI (accession no. 20212910644298), Scopus (accession no. 2-s2.0-85110033915)
Language: English
Keywords: Computational linguistics; Computer-aided language translation; Convolutional neural networks; Semantics
Abstract: In recent years, machine translation based on neural networks has become the mainstream method in the field of machine translation, but low-resource translation still faces the challenges of insufficient parallel corpora and data sparseness. Existing machine translation models are usually trained on word-granularity segmentation datasets. However, different segmentation granularities carry different grammatical and semantic features, and considering word granularity alone restricts the efficient training of neural machine translation systems. To address the data sparseness caused by the scarcity of Uyghur-Chinese parallel corpora and the morphological complexity of Uyghur, this paper proposes a multistrategy segmentation-granularity training method covering syllables, marked syllables, words, and syllable-word fusion. To overcome the shortcomings of traditional recurrent neural networks and convolutional neural networks, it builds a Uyghur-Chinese neural machine translation model based entirely on the Transformer's multihead self-attention mechanism. Experimental results on the CCMT2019 Uyghur-Chinese bilingual dataset show that the multigranularity training method significantly outperforms the single-granularity segmentation translation systems, and that the Transformer model obtains a higher BLEU score than a Uyghur-Chinese translation model based on Self-Attention-RNN.
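The abstract's core architectural claim is a model built entirely on multihead self-attention. As a minimal sketch of that mechanism (not the authors' implementation; all weight matrices and dimensions here are illustrative assumptions), scaled dot-product attention split across heads can be written in plain NumPy:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multihead_self_attention(X, Wq, Wk, Wv, Wo, num_heads):
    """Scaled dot-product multihead self-attention (illustrative sketch).

    X: (seq_len, d_model) token representations.
    Wq, Wk, Wv, Wo: (d_model, d_model) projection matrices (hypothetical).
    """
    seq_len, d_model = X.shape
    d_head = d_model // num_heads
    Q, K, V = X @ Wq, X @ Wk, X @ Wv

    # Split each projection into heads: (num_heads, seq_len, d_head).
    def split(M):
        return M.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

    Qh, Kh, Vh = split(Q), split(K), split(V)
    # Attention scores per head, scaled by sqrt(d_head).
    scores = Qh @ Kh.transpose(0, 2, 1) / np.sqrt(d_head)
    attn = softmax(scores, axis=-1)
    # Weighted sum of values, then concatenate heads and project out.
    out = (attn @ Vh).transpose(1, 0, 2).reshape(seq_len, d_model)
    return out @ Wo

rng = np.random.default_rng(0)
seq_len, d_model, heads = 5, 8, 2
X = rng.standard_normal((seq_len, d_model))
Wq, Wk, Wv, Wo = (rng.standard_normal((d_model, d_model)) * 0.1 for _ in range(4))
Y = multihead_self_attention(X, Wq, Wk, Wv, Wo, num_heads=heads)
print(Y.shape)  # → (5, 8)
```

Each head attends over the full sequence independently, which is what lets the Transformer model long-range dependencies without the sequential bottleneck of the Self-Attention-RNN baseline the paper compares against.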