International Journal of Advanced Computer Science and Applications (IJACSA), Volume 4 Issue 2, 2013.
Abstract: Tokenization is the task of chopping a character sequence into pieces, called tokens, perhaps at the same time throwing away certain characters, such as punctuation. A token is an instance of a sequence of characters in some particular document that are grouped together as a useful semantic unit for processing. This paper presents a new software tool and algorithm to support the information retrieval system (IRS) during the tokenization process. The proposed tool filters out four kinds of computer character sequences: IP addresses, Web URLs, dates, and email addresses. It relies on pattern-matching algorithms and filtration methods. After this step, the IRS can run tokenization on the filtered text, which is free of these sequences.
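The filtration step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's actual patterns or algorithm, so everything here is an assumption made for illustration: the regular expressions, the names PATTERNS, COMBINED, and filter_sequences, and the choice to handle only IPv4 addresses and numeric dates.

```python
import re

# Hypothetical, simplified patterns (not taken from the paper).
# Order matters: URLs are tried first because a URL may contain an IP address.
PATTERNS = {
    "url": r"(?:https?://|www\.)[^\s<>]*[^\s<>.,;:!?)]",
    "email": r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}",
    "ip": r"\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b",
    "date": r"\b(?:\d{4}-\d{1,2}-\d{1,2}|\d{1,2}[/-]\d{1,2}[/-]\d{2,4})\b",
}

# One alternation with a named group per kind of sequence.
COMBINED = re.compile("|".join(f"(?P<{kind}>{pat})" for kind, pat in PATTERNS.items()))


def filter_sequences(text):
    """Return (cleaned_text, found); found maps each kind to its matches."""
    found = {kind: [] for kind in PATTERNS}

    def _collect(match):
        # lastgroup is the name of the alternative that matched.
        found[match.lastgroup].append(match.group())
        return " "  # replace the sequence with a space

    cleaned = COMBINED.sub(_collect, text)
    # Collapse the whitespace runs left behind by the removals.
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    return cleaned, found
```

Calling filter_sequences on a document returns the text with the matched sequences removed, ready for an ordinary tokenizer, together with the extracted sequences grouped by kind so that they can be indexed separately if desired.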
Ahmad Al Badawi and Qasem Abu Al-Haija. “IRS for Computer Character Sequences Filtration: a new software tool and algorithm to support the IRS at tokenization process”. International Journal of Advanced Computer Science and Applications (IJACSA) 4.2 (2013). http://dx.doi.org/10.14569/IJACSA.2013.040212
@article{Badawi2013,
title = {IRS for Computer Character Sequences Filtration: a new software tool and algorithm to support the IRS at tokenization process},
journal = {International Journal of Advanced Computer Science and Applications},
doi = {10.14569/IJACSA.2013.040212},
url = {http://dx.doi.org/10.14569/IJACSA.2013.040212},
year = {2013},
publisher = {The Science and Information Organization},
volume = {4},
number = {2},
author = {Ahmad Al Badawi and Qasem Abu Al-Haija}
}
Copyright Statement: This is an open access article licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, even commercially, as long as the original work is properly cited.