Important Features Detection in Continuous Data

Piotr Fulmanski; Alicja Miniak-Górecka

doi:10.14569/IJACSA.2012.031239

International Journal of Advanced Computer Science and Applications (IJACSA), Volume 3, Issue 12, 2012.

Abstract: This paper presents a method for calculating the importance factor of continuous features from a given set of patterns. A real problem in many practical cases, such as medical data, is finding which parts of the patterns are crucial for correct classification. This creates the need to preprocess all the data, which affects both the running time and the accuracy of the applied methods (for instance, when unimportant data hide the important ones). Some existing methods select important features for binary and sometimes discrete data or, after some preprocessing, for continuous data. Very often, however, such a conversion carries the risk of losing important data, because the optimal discretization is not known. The proposed method avoids that problem because it works on the original, non-transformed continuous data. Two factors, concentration and diversity, are defined and used to calculate the importance factor for each feature and pattern. Based on those factors, unimportant features can, for example, be identified to reduce the dimension of the input data, or "bad" patterns can be detected to improve classification. An example of how the proposed method can be used to improve a decision tree is also given.

Keywords: important features extraction; continuous data analysis; decision tree.

Piotr Fulmanski and Alicja Miniak-Górecka, “Important Features Detection in Continuous Data,” International Journal of Advanced Computer Science and Applications (IJACSA), 3(12), 2012. http://dx.doi.org/10.14569/IJACSA.2012.031239

@article{Fulmanski2012,
title = {Important Features Detection in Continuous Data},
journal = {International Journal of Advanced Computer Science and Applications},
doi = {10.14569/IJACSA.2012.031239},
url = {http://dx.doi.org/10.14569/IJACSA.2012.031239},
year = {2012},
publisher = {The Science and Information Organization},
volume = {3},
number = {12},
author = {Piotr Fulmanski and Alicja Miniak-Górecka}
}

Copyright Statement: This is an open access article licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, even commercially as long as the original work is properly cited.
