Copyright Statement: This is an open access article licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, even commercially, as long as the original work is properly cited.
Digital Object Identifier (DOI): 10.14569/IJACSA.2014.050518
Article published in International Journal of Advanced Computer Science and Applications (IJACSA), Volume 5, Issue 5, 2014.
Abstract: Splitting is a conventional process in most Indian languages, governed by their grammar rules. It is called ‘pada vicchEdanam’ (a Sanskrit term for word splitting) and is widely used in most Indian languages. Splitting plays a key role in Machine Translation (MT), particularly when the source language (SL) is an Indian language. Although splitting may not completely succeed in extracting the root words from which a compound is formed, it has considerable impact as an important phase of Natural Language Processing (NLP). Of the many types of splitting, this paper considers only consonant-based and phrase-based splitting.
T. Kameswara Rao and Dr. T. V. Prasad, “Telugu Bigram Splitting using Consonant-based and Phrase-based Splitting,” International Journal of Advanced Computer Science and Applications (IJACSA), 5(5), 2014. http://dx.doi.org/10.14569/IJACSA.2014.050518