Copyright Statement: This is an open access article licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, even commercially as long as the original work is properly cited.
Digital Object Identifier (DOI): 10.14569/IJACSA.2013.041122
Article Published in International Journal of Advanced Computer Science and Applications (IJACSA), Volume 4 Issue 11, 2013.
Abstract: Inverted indexing is an efficient, standard data structure well suited to search operations over a very large set of data. Such data is mostly unstructured and does not fit into traditional database categories. Large-scale processing of this data needs a distributed framework such as Hadoop, where computational resources can easily be shared and accessed. A search engine is implemented in Hadoop over millions of Wikipedia documents, using an inverted index data structure to make search operations more efficient. An inverted index maps each word in a file or set of files to its corresponding locations. The data structure uses a hash table that stores each word as a key and its locations as the associated values, which provides easy lookup and retrieval of data and makes it suitable for search operations.
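The mapping described in the abstract can be illustrated with a minimal single-machine sketch. It is not the Hadoop implementation from the article: the function names `build_inverted_index` and `search`, the whitespace tokenization, and the sample documents are all assumptions made for illustration.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each word to a list of (document id, word position) pairs.

    `documents` is a dict of {doc_id: text}. Tokenization here is a plain
    lowercase whitespace split, which is an assumption of this sketch.
    """
    index = defaultdict(list)  # hash table: word -> list of locations
    for doc_id, text in documents.items():
        for position, word in enumerate(text.lower().split()):
            index[word].append((doc_id, position))
    return index


def search(index, word):
    """Return every recorded location of `word`, or an empty list."""
    return index.get(word.lower(), [])


# Hypothetical sample data, not taken from the article.
docs = {
    "d1": "Hadoop stores big data",
    "d2": "big data needs Hadoop",
}
index = build_inverted_index(docs)
print(search(index, "Hadoop"))
```

Because the index is a hash table keyed by word, a lookup costs roughly constant time on average regardless of how many documents were indexed, which is the property the abstract relies on for search.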
Kaushik Velusamy, Deepthi Venkitaramanan, Nivetha Vijayaraju, Greeshma Suresh and Divya Madhu, “Inverted Indexing In Big Data Using Hadoop Multiple Node Cluster”, International Journal of Advanced Computer Science and Applications (IJACSA), 4(11), 2013. http://dx.doi.org/10.14569/IJACSA.2013.041122