Telugu Bigram Splitting using Consonant-based and Phrase-based Splitting

T. Kameswara Rao; Dr. T. V. Prasad

doi:10.14569/IJACSA.2014.050518

International Journal of Advanced Computer Science and Applications (IJACSA), Volume 5 Issue 5, 2014.

Abstract: Splitting is a conventional process in most Indian languages, governed by their grammar rules. It is called ‘pada vicchEdanam’ (a Sanskrit term for word splitting) and is widely used across these languages. Splitting plays a key role in Machine Translation (MT), particularly when the source language (SL) is an Indian language. Although splitting may not always succeed in extracting the root words from which a compound is formed, it has considerable impact as an important phase of Natural Language Processing (NLP). There are many types of splitting; this paper considers only consonant-based and phrase-based splitting.

Keywords: Bigram; n-gram; consonant-based splitting; phrase-based splitting

T. Kameswara Rao and Dr. T. V. Prasad, “Telugu Bigram Splitting using Consonant-based and Phrase-based Splitting,” International Journal of Advanced Computer Science and Applications (IJACSA), 5(5), 2014. http://dx.doi.org/10.14569/IJACSA.2014.050518

@article{Rao2014,
title = {Telugu Bigram Splitting using Consonant-based and Phrase-based Splitting},
journal = {International Journal of Advanced Computer Science and Applications},
doi = {10.14569/IJACSA.2014.050518},
url = {http://dx.doi.org/10.14569/IJACSA.2014.050518},
year = {2014},
publisher = {The Science and Information Organization},
volume = {5},
number = {5},
author = {T. Kameswara Rao and Dr. T. V. Prasad}
}

Copyright Statement: This is an open access article licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, even commercially, as long as the original work is properly cited.
