International Journal of Advanced Computer Science and Applications (IJACSA), Volume 12 Issue 8, 2021.
Abstract: In recent years, the emergence of the World Wide Web (WWW) has led to the accumulation of a huge amount of information and data. The web consists of both unstructured and structured information that affects the day-to-day life of society. Because so much information is available, finding and using the required information becomes more challenging. This paper provides a comprehensive survey of the current situation and recent trends in web content mining (WCM) and its applications, thereby contributing to upcoming research in WCM. The paper focuses mainly on mining and retrieval techniques, various WCM approaches, challenges, and the processes of information retrieval and information extraction. It describes in detail the four major tasks of web content mining: information retrieval, information extraction, generalization, and validation. WCM concentrates on orchestrating, sorting, classifying, collecting, and congregating web data, and provides improved data that users can access easily. Web content mining tools are needed to scan text, images, and HTML documents and to provide results to the search engine. These tools guide the search engine to provide more productive results for every search, based on their importance. The paper also analyses different web content mining tools for extracting relevant information from the corresponding web page.
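As a rough illustration of the information extraction and validation tasks named in the abstract, the following minimal Python sketch pulls visible text and link targets out of an HTML document using only the standard library. It is not taken from the paper or from any tool the paper reviews; the names `TextAndLinkExtractor`, `extract`, and `validate`, and the "non-empty text" validation rule, are assumptions made for this example.

```python
from html.parser import HTMLParser


class TextAndLinkExtractor(HTMLParser):
    """Collects visible text and hyperlink targets from one HTML document."""

    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.links = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-blank text that is not script or style content.
        if self._skip_depth == 0 and data.strip():
            self.text_parts.append(data.strip())


def extract(html):
    """Information extraction step: return the page's text and links."""
    parser = TextAndLinkExtractor()
    parser.feed(html)
    parser.close()
    return {"text": " ".join(parser.text_parts), "links": parser.links}


def validate(record):
    """Hypothetical validation rule: accept records that yielded some text."""
    return bool(record["text"])
```

Calling `extract` on a fetched page returns a dictionary with the keys `text` and `links`, and `validate` can then filter out pages that produced no usable text. A real WCM tool would add retrieval (fetching pages) and generalization (for example, classification or clustering) around this step.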
Manjunath Pujar and Monica R Mundada, “A Systematic Review Web Content Mining Tools and its Applications,” International Journal of Advanced Computer Science and Applications (IJACSA), 12(8), 2021. http://dx.doi.org/10.14569/IJACSA.2021.0120886
@article{Pujar2021,
title = {A Systematic Review Web Content Mining Tools and its Applications},
journal = {International Journal of Advanced Computer Science and Applications},
doi = {10.14569/IJACSA.2021.0120886},
url = {http://dx.doi.org/10.14569/IJACSA.2021.0120886},
year = {2021},
publisher = {The Science and Information Organization},
volume = {12},
number = {8},
author = {Manjunath Pujar and Monica R Mundada}
}
Copyright Statement: This is an open access article licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, even commercially, as long as the original work is properly cited.