Copyright Statement: This is an open access article licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, even commercially, as long as the original work is properly cited.
Digital Object Identifier (DOI): 10.14569/IJARAI.2014.030904
Article published in International Journal of Advanced Research in Artificial Intelligence (IJARAI), Volume 3, Issue 9, 2014.
Abstract: Vietnamese word segmentation is an important step in Vietnamese natural language processing tasks such as text categorization, text summarization, and machine translation. The problem is complicated because Vietnamese words are not always separated by a space: a word can consist of one or more syllables, depending on the context. This paper proposes a method for Vietnamese word segmentation based on the mutual information among syllables, combined with dynamic programming. With this method, we achieve an accuracy of about 90% using a raw text corpus.
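The abstract does not state the exact scoring function or the word lengths the method allows, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes pointwise mutual information (PMI) estimated from adjacent syllable pairs in a raw, whitespace-separated corpus, words of at most two syllables, and a hypothetical `threshold` parameter that a pair's PMI must exceed before joining the two syllables pays off. The names `train_pmi` and `segment` are illustrative.

```python
import math
from collections import Counter


def train_pmi(sentences):
    """Estimate PMI for adjacent syllable pairs from raw text.

    Assumption: syllables are separated by whitespace in the corpus.
    """
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        syllables = sentence.split()
        unigrams.update(syllables)
        bigrams.update(zip(syllables, syllables[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    pmi = {}
    for (a, b), count in bigrams.items():
        p_ab = count / n_bi
        p_a = unigrams[a] / n_uni
        p_b = unigrams[b] / n_uni
        pmi[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return pmi


def segment(syllables, pmi, threshold=1.0):
    """Dynamic programming over syllable positions.

    best[i] is the best total score for the first i syllables.
    A one-syllable word scores 0; a two-syllable word scores
    PMI - threshold (assumed scoring, chosen for illustration).
    """
    n = len(syllables)
    best = [0.0] + [-math.inf] * n
    back = [0] * (n + 1)
    for i in range(1, n + 1):
        # Option 1: the last word is the single syllable at position i-1.
        best[i], back[i] = best[i - 1], i - 1
        # Option 2: the last word joins syllables i-2 and i-1.
        if i >= 2:
            pair = (syllables[i - 2], syllables[i - 1])
            gain = pmi.get(pair, -math.inf) - threshold
            if best[i - 2] + gain > best[i]:
                best[i], back[i] = best[i - 2] + gain, i - 2
    # Follow the back-pointers to recover the chosen words.
    words, i = [], n
    while i > 0:
        words.append("_".join(syllables[back[i]:i]))
        i = back[i]
    return words[::-1]
```

For example, with hand-set scores `{("học", "sinh"): 3.0, ("sinh", "đi"): 0.2}` and the default threshold, `segment("học sinh đi học".split(), pmi)` joins only the first pair, because it is the only one whose PMI exceeds the threshold. A fuller version would also consider words of three or more syllables, which this sketch leaves out.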
Nguyen Thi Uyen and Tran Xuan Sang, “Dynamic Programming Method Applied in Vietnamese Word Segmentation Based on Mutual Information among Syllables,” International Journal of Advanced Research in Artificial Intelligence (IJARAI), 3(9), 2014. http://dx.doi.org/10.14569/IJARAI.2014.030904