Exploreing K-Means with Internal Validity Indexes for Data Clustering in Traffic Management System

Sadia Nawrin; Md Rahatur Rahman; Shamim Akhter

doi:10.14569/IJACSA.2017.080337

International Journal of Advanced Computer Science and Applications (IJACSA), Volume 8 Issue 3, 2017.

Abstract: A Traffic Management System (TMS) improves traffic flow by integrating information from different data repositories and online sensors, detecting incidents, and taking actions on traffic routing. In general, two decision-making systems, weight updating and forecasting, are integrated inside the TMS. These models need numerous data sets to make appropriate decisions. To determine the dynamic road weights in the TMS, four environmental attributes that directly or indirectly contribute to traffic jams are considered: rainfall, temperature, wind, and humidity. In addition, peak hour is taken as a fifth attribute. Usually, the data sets are classified by instinct. However, optimum classification of the data sets is vital to improving the decision accuracy of the TMS. The collected data sets have no class labels, so cluster-based unsupervised classification (partitioning, hierarchical, grid-based, density-based) can be used to find the optimum number of classes in each attribute, and is expected to improve the performance of the TMS. The two most popular and frequently used approaches are hierarchical clustering and partition clustering. K-means is simple, easy to implement, and produces clustering results that are easy to interpret. It is also faster, because its time complexity is linear in the number of data points. Thus, in this paper we demonstrate the performance of partition k-means and hierarchical k-means, implemented with the Davies-Bouldin Index (DBI), Dunn Index (DI), and Silhouette Coefficient (SC) methods, to determine the optimal number of classes (features) inside each attribute of the TMS data sets. Subsequently, the optimal classes are validated using within-cluster sum of squares (WSS) errors and correlation methods. The validation results show that k-means with DI performs better in all attributes of the TMS data sets and provides more accurate optimum class numbers. Thereafter, the dynamic road weights for the TMS are generated and classified using the combined k-means and DI method.
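
The idea of choosing the number of classes for a single attribute with k-means and the Dunn Index can be illustrated with a minimal sketch. It is not the authors' implementation: the rainfall readings are invented, the function names (`kmeans_1d`, `dunn_index`, `wss`) are hypothetical, the candidate range k = 2..5 is arbitrary, and the evenly spaced initialisation of the centers is an assumption made only to keep the run deterministic.

```python
def kmeans_1d(values, k, max_iter=100):
    """Lloyd's k-means on one numeric attribute; returns (centers, labels)."""
    ordered = sorted(values)
    n = len(ordered)
    # Assumed deterministic initialisation: evenly spaced order statistics.
    centers = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each value goes to its nearest center.
        labels = [min(range(k), key=lambda j: abs(v - centers[j])) for v in values]
        # Update step: each center moves to the mean of its members.
        updated = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            updated.append(sum(members) / len(members) if members else centers[j])
        if updated == centers:
            break
        centers = updated
    return centers, labels


def dunn_index(values, labels):
    """Smallest between-cluster gap divided by the largest cluster diameter."""
    clusters = {}
    for v, lab in zip(values, labels):
        clusters.setdefault(lab, []).append(v)
    groups = list(clusters.values())
    if len(groups) < 2:
        return 0.0
    max_diameter = max(max(g) - min(g) for g in groups)
    min_separation = min(
        abs(a - b)
        for i, g in enumerate(groups)
        for h in groups[i + 1:]
        for a in g
        for b in h
    )
    return min_separation / max_diameter if max_diameter > 0 else float("inf")


def wss(values, labels, centers):
    """Within-cluster sum of squared distances to the assigned center."""
    return sum((v - centers[lab]) ** 2 for v, lab in zip(values, labels))


# Invented rainfall readings (mm) with three visually obvious groups.
rainfall = [0.0, 0.2, 0.5, 0.3, 10.1, 10.4, 9.8, 10.0, 25.2, 24.9, 25.5, 25.0]

scores = {}
wss_by_k = {}
for k in range(2, 6):
    centers, labels = kmeans_1d(rainfall, k)
    scores[k] = dunn_index(rainfall, labels)
    wss_by_k[k] = wss(rainfall, labels, centers)

# A higher Dunn Index means compact, well-separated clusters.
best_k = max(scores, key=scores.get)
print("Dunn index per k:", scores)
print("WSS per k:", wss_by_k)
print("Selected number of classes:", best_k)
```

On this toy data the Dunn Index peaks at three classes. The Davies-Bouldin Index or the Silhouette Coefficient could be substituted for `dunn_index` in the same loop, keeping in mind that a lower DBI is better while a higher DI or SC is better.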

Keywords: Traffic Management System (TMS); Data Clustering; K-means; Hierarchical Clustering; Cluster Validation

Sadia Nawrin, Md Rahatur Rahman and Shamim Akhter, “Exploreing K-Means with Internal Validity Indexes for Data Clustering in Traffic Management System,” International Journal of Advanced Computer Science and Applications (IJACSA), 8(3), 2017. http://dx.doi.org/10.14569/IJACSA.2017.080337

@article{Nawrin2017,
title = {Exploreing K-Means with Internal Validity Indexes for Data Clustering in Traffic Management System},
journal = {International Journal of Advanced Computer Science and Applications},
doi = {10.14569/IJACSA.2017.080337},
url = {http://dx.doi.org/10.14569/IJACSA.2017.080337},
year = {2017},
publisher = {The Science and Information Organization},
volume = {8},
number = {3},
author = {Sadia Nawrin and Md Rahatur Rahman and Shamim Akhter}
}

Copyright Statement: This is an open access article licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, even commercially, as long as the original work is properly cited.
