Copyright Statement: This is an open access article licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, even commercially, as long as the original work is properly cited.
Digital Object Identifier (DOI): 10.14569/IJACSA.2014.051106
Article Published in International Journal of Advanced Computer Science and Applications (IJACSA), Volume 5, Issue 11, 2014.
Abstract: We propose an efficient Frequent Sequence Stream algorithm for identifying the top k most frequent subsequences over big data streams. The Sequence Stream algorithm gains its efficiency from its linear time complexity and very limited space complexity. Given a pre-specified subsequence window size S and a value k, the Sequence Stream algorithm retrieves, with very high probability, the top k most frequent subsequences of size S. The Sequence Stream algorithm also estimates the number of occurrences of each promoted subsequence with high accuracy. Our experiments indicate several factors that influence the accuracy of the algorithm's results: stream size, subsequence size S, and the frequency of the subsequence.
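To make the problem setting concrete, the following is a minimal sketch of counting length-S subsequences with a sliding window under bounded memory. It is not the Sequence Stream algorithm itself, whose details the abstract does not give; it swaps in the well-known Space-Saving counter summary as a stand-in. The function name `top_k_subsequences` and the `capacity` parameter are assumptions made for illustration only.

```python
from collections import deque
from typing import Hashable, Iterable, List, Tuple


def top_k_subsequences(
    stream: Iterable[Hashable], s: int, k: int, capacity: int = 1000
) -> List[Tuple[tuple, int]]:
    """Estimate the k most frequent contiguous subsequences of length s.

    Illustrative stand-in (Space-Saving summary), not the paper's method.
    At most `capacity` counters are kept, so memory does not grow with
    the stream. Reported counts may overestimate the true frequencies.
    """
    if s <= 0 or k <= 0 or capacity < k:
        raise ValueError("need s > 0, k > 0 and capacity >= k")

    window = deque(maxlen=s)  # sliding window over the last s items
    counts = {}               # subsequence (as a tuple) -> estimated count

    for item in stream:
        window.append(item)
        if len(window) < s:
            continue  # window not yet full
        key = tuple(window)
        if key in counts:
            counts[key] += 1
        elif len(counts) < capacity:
            counts[key] = 1
        else:
            # Space-Saving eviction: replace the smallest counter and
            # inherit its count plus one. The linear scan keeps the
            # sketch short; a heap or bucket list would be faster.
            victim = min(counts, key=counts.get)
            counts[key] = counts.pop(victim) + 1

    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:k]
```

For example, `top_k_subsequences("ababab", s=2, k=1)` slides a window of size 2 over the characters and reports `('a', 'b')` with a count of 3. With a small `capacity`, the reported counts become upper bounds rather than exact values, which is the usual trade-off of this kind of summary.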
Adi Alhudhaif, “Efficient Identification of Common Subsequences from Big Data Streams Using Sliding Window Technique,” International Journal of Advanced Computer Science and Applications (IJACSA), 5(11), 2014. http://dx.doi.org/10.14569/IJACSA.2014.051106