Inverted Indexing In Big Data Using Hadoop Multiple Node Cluster

Kaushik Velusamy; Deepthi Venkitaramanan; Nivetha Vijayaraju; Greeshma Suresh; Divya Madhu

doi:10.14569/IJACSA.2013.041122

International Journal of Advanced Computer Science and Applications(IJACSA), Volume 4 Issue 11, 2013.

Abstract: Inverted indexing is an efficient, standard data structure well suited to search operations over very large data sets. Such data is mostly unstructured and does not fit into traditional database categories. Processing it at scale requires a distributed framework such as Hadoop, in which computational resources can easily be shared and accessed. A search engine is implemented in Hadoop over millions of Wikipedia documents, using an inverted index data structure to make search operations more efficient. An inverted index maps each word in a file or set of files to the locations where it occurs. The structure uses a hash table that stores each word as a key and its corresponding locations as the values, which provides easy lookup and retrieval of data and makes it suitable for search operations.
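The word-to-locations mapping described in the abstract can be illustrated with a minimal single-process sketch. This is not the authors' Hadoop implementation: the function names (`map_phase`, `reduce_phase`, `build_index`, `search`), the tokenization rule, and the sample documents are all assumptions made for illustration, and the map and reduce steps are only simulated in memory rather than run on a cluster.

```python
import re
from collections import defaultdict


def map_phase(doc_id, text):
    """Emit (word, (doc_id, position)) pairs for one document.

    Tokenization (lowercase alphanumeric runs) is an assumption,
    not something specified by the paper.
    """
    for position, word in enumerate(re.findall(r"[a-z0-9]+", text.lower())):
        yield word, (doc_id, position)


def reduce_phase(pairs):
    """Group locations by word into a hash table (a Python dict)."""
    index = defaultdict(list)
    for word, location in pairs:
        index[word].append(location)
    return dict(index)


def build_index(documents):
    """Run the simulated map step over every document, then reduce."""
    pairs = []
    for doc_id, text in documents.items():
        pairs.extend(map_phase(doc_id, text))
    return reduce_phase(pairs)


def search(index, word):
    """Look up a word; returns its (doc_id, position) locations."""
    return index.get(word.lower(), [])


# Hypothetical sample documents, for illustration only.
docs = {
    "doc1": "Hadoop stores big data",
    "doc2": "Big data needs Hadoop",
}
index = build_index(docs)
print(search(index, "Hadoop"))
```

Each lookup is a single hash-table access keyed by the word, which is the property the abstract points to when it describes the structure as suitable for search. In a real Hadoop job the grouping by word would be done by the framework's shuffle between the map and reduce stages rather than by an in-memory list.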

Keywords: Hadoop; Big data; inverted indexing; data structure

Kaushik Velusamy, Deepthi Venkitaramanan, Nivetha Vijayaraju, Greeshma Suresh and Divya Madhu, “Inverted Indexing In Big Data Using Hadoop Multiple Node Cluster” International Journal of Advanced Computer Science and Applications(IJACSA), 4(11), 2013. http://dx.doi.org/10.14569/IJACSA.2013.041122

@article{Velusamy2013,
title = {Inverted Indexing In Big Data Using Hadoop Multiple Node Cluster},
journal = {International Journal of Advanced Computer Science and Applications},
doi = {10.14569/IJACSA.2013.041122},
url = {http://dx.doi.org/10.14569/IJACSA.2013.041122},
year = {2013},
publisher = {The Science and Information Organization},
volume = {4},
number = {11},
author = {Kaushik Velusamy and Deepthi Venkitaramanan and Nivetha Vijayaraju and Greeshma Suresh and Divya Madhu}
}

Copyright Statement: This is an open access article licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, even commercially, as long as the original work is properly cited.
