Paw Search - A Searching Approach for Unsorted Data Combining with Binary Search and Merge Sort Algorithm

Abstract: Searching is one of the oldest core mechanisms of nature, and as nature changes gradually, so do the approaches to searching. Data mining is one of the most important industrial topics nowadays. In this area, social networks, governmental and non-governmental institutions, and e-commerce industries produce huge amounts of unsorted data that they need to utilize. Utilizing this data requires tools designed specifically for unsorted data, such as searching algorithms. At present there are several searching algorithms, such as Binary Search, Linear Search, Jump Search and Interpolation Search, most of which require sorted data. This paper on the Paw Search Algorithm focuses on developing a new searching approach that works on unsorted data by merging several searching and sorting techniques. The algorithm starts by breaking the given unsorted array into several blocks, the number of which is obtained from the square root of the length of the array. These blocks are then visited according to a specific formula until the target is found or all blocks are exhausted, and within each block Merge Sort and then Binary Search are performed. The time and space complexity of the Paw Search Algorithm are comparatively optimal.


I. INTRODUCTION
In the present world, technology is at the heart of nearly all activities and operations, and the large unsorted data sets generated by different sites and institutions around the world are among its largest and most valuable resources. Managing this large amount of data with proper data structure techniques is essential in today's IT world. In this paper, we present a new technique for searching for data in a given unsorted array. Several techniques have been proposed in recent years; here we explore a new direction that merges several established techniques with some new approaches to produce an optimal output.
Today, large volumes of unsorted data are produced randomly within a very short time. There are several searching algorithms, such as Linear Search [1,2,3], Binary Search [1,2], Jump Search [1,4], Hybrid Search [1,5] and Interpolation Search [1,6], most of which work only on sorted data. So far, few approaches work on randomly generated unsorted data. Optimal data structure tools are badly needed to handle the very large amount of unsorted data produced every day. Data mining is one of the most important industrial topics nowadays. In this area, social networks, governmental and non-governmental institutions, and e-commerce industries produce large volumes of unsorted data that they need to utilize. Utilizing this data requires feature-specific tools for unsorted data, such as a searching algorithm [6] that operates on an unsorted array. A data structure tool [7] that works directly on unsorted data is therefore the prime motivation for developing another searching approach.
Data scientists around the globe are continuously trying to develop data structure tools to utilize this unsorted data. With a view to helping them, I am trying to develop a tool for finding any item in any given array of unsorted data. Sorting data consumes a large amount of time; from this standpoint, a specific approach is needed that can search unsorted data while minimizing the time used. As technology and technology-related industries value approaches that reduce time consumption, there is a timely demand for a high-performance approach that takes less time and does not sort the whole data set at once.
Throughout this paper we describe the development of a feature-specific searching algorithm, called the Paw Search Algorithm, that is capable of searching unsorted data. In the inner phase of the search we also draw on existing searching and sorting approaches to ensure high search performance.
The main principle of the Paw Search Algorithm is to work on (i) unsorted data by segmenting the given array into several (ii) blocks.
Initially it starts working with x blocks of unsorted data, where x is obtained by taking the square root of the length n of the given array, i.e., x = ⌈√n⌉, where n is the length of the given array of unsorted data. The algorithm never checks all of the blocks linearly; instead, it goes through the blocks in a manner similar to the binary approach, without fully following it. For the operation inside each block, we also call the merge sort approach to improve the performance of the Paw Search Algorithm.
www.ijacsa.thesai.org

II. LITERATURE REVIEW
In this section we review several existing searching algorithms, most of which work only on sorted data:

A. Classification of Searching Techniques
There are several searching techniques in use today. Based on whether the search is external or internal, searching techniques are of two types: (i) external search and (ii) internal search. Based on whether the search is sequential or interval-based, they are of two types: (i) sequential search and (ii) interval search.

B. Present Searching Algorithms
There are several existing searching algorithms, most of which are based on sorted data. Some of them are listed below.
1) Linear search algorithm: Linear search [3] is a simple search algorithm. It is a sequential search performed on sequences of numbers that are ascending, descending or unordered. It checks every element of the list to find a specific item. If a comparison is equal, the search stops and is declared successful. For a list with n items, the best case is when the value to be searched equals the first element of the list, in which case only one comparison is required. The worst case is when the value is not in the list or occurs only once at the end of the list, in which case n comparisons are required.
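The linear search described above can be illustrated with a minimal Python sketch. The function name `linear_search` and the sample data are illustrative assumptions, not taken from the paper.

```python
from typing import Optional, Sequence


def linear_search(items: Sequence[int], target: int) -> Optional[int]:
    """Return the index of the first element equal to target, or None."""
    for index, value in enumerate(items):
        if value == target:
            return index  # the search stops at the first match
    return None  # every element was compared without success


data = [7, 3, 9, 1, 5]                # hypothetical unsorted input
first_hit = linear_search(data, 7)    # best case: one comparison
last_hit = linear_search(data, 5)     # match at the end: n comparisons
missing = linear_search(data, 4)      # value absent: n comparisons
```

The input does not need to be sorted, which is why linear search is the natural baseline for any unsorted-data approach.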
2) Binary search algorithm: Binary search is a fast search algorithm with a run-time complexity of O(log n). It uses the divide-and-conquer principle and requires a sorted data collection. In binary search [8], we first compare the key with the item in the middle position of the data collection. If there is a match, we can return immediately. If the key is less than the middle item, then the item must lie in the lower half of the data collection; if it is greater, then the item must lie in the upper half.
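As a minimal sketch of the halving step described above, the following Python function searches a sorted sequence iteratively. The name `binary_search` and the sample values are assumptions made for illustration.

```python
from typing import Optional, Sequence


def binary_search(sorted_items: Sequence[int], key: int) -> Optional[int]:
    """Return an index of key in an ascending sequence, or None if absent."""
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_items[mid] == key:
            return mid          # match at the middle position
        if key < sorted_items[mid]:
            high = mid - 1      # key can only lie in the lower half
        else:
            low = mid + 1       # key can only lie in the upper half
    return None


position = binary_search([1, 3, 5, 7, 9, 11], 7)
```

Each iteration discards half of the remaining range, which gives the O(log n) running time, but only if the input is already sorted.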
3) Hybrid search algorithm: The Hybrid Search algorithm [3,9] combines properties of both linear search and binary search to provide a better and more efficient algorithm. It can be used to search an unsorted array while taking less time than the linear search algorithm. In Hybrid Search, the array is divided into two sections, and each section is then searched. The algorithm starts by comparing the key element with the two extreme elements of the array, the first and the last, as well as the middle element. If a match is found, the index is returned. If not, the array is divided into two sections at the middle index. The search is then carried out in the left section in a similar way: the extreme elements and the middle element of the left section are compared with the key for a match, which, if found, returns the index. If not, the left section is again divided into two parts, and this process goes on until a match is found in the left section. If no match is found in the left section, the algorithm moves on to the right section and carries out the same procedure to find a match for the key. If no matching value is found even after going through all sections, the sections are divided further and the process repeats iteratively until it reaches the atomic state. If the value is not present in the array, the algorithm returns -1.
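One possible reading of this procedure is sketched below in Python. The recursive structure and the name `hybrid_search` are assumptions based on the description above, not the reference implementation from [3,9].

```python
from typing import Optional, Sequence


def hybrid_search(items: Sequence[int], key: int,
                  low: int = 0, high: Optional[int] = None) -> int:
    """Search an unsorted sequence; return a matching index or -1."""
    if high is None:
        high = len(items) - 1
    if low > high:
        return -1  # empty section: the atomic state has been passed
    mid = (low + high) // 2
    # Compare the key with the two extreme elements and the middle element.
    for index in (low, mid, high):
        if items[index] == key:
            return index
    # The checked positions are excluded; search the left section first.
    found = hybrid_search(items, key, low + 1, mid - 1)
    if found != -1:
        return found
    # No match on the left, so repeat the procedure on the right section.
    return hybrid_search(items, key, mid + 1, high - 1)


data = [8, 2, 6, 4, 9, 1, 7]       # hypothetical unsorted input
index_of_one = hybrid_search(data, 1)
```

Because the data is unsorted, no section can be discarded, so in the worst case every element is still examined once.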

4) Interpolation search algorithm: Interpolation search [2,10] is an improvement over binary search. Binary search always checks the element at the middle index, whereas interpolation search may probe different locations depending on the value of the search key. The elements must be in sorted order to implement interpolation search. It improves on binary search in cases where the values in a sorted array are uniformly distributed. For example, if the value of the key is closer to the last element, interpolation search is likely to start searching toward the end of the array.
5) Jump search algorithm: The jump search algorithm [11,12] is also called the block search algorithm. Only a sorted list, array or table can use jump search. In jump search, it is not necessary to scan every element in the list as we do in linear search. We check only the m-th element, and if it is less than the key element, we move to the (m + m)-th element, skipping all the elements between the m-th and the (m + m)-th. This process continues until the checked element becomes equal to or greater than the key element, at which point it is called the boundary value. The value of m is given by m = √n, where n is the total number of elements in the array. Once the boundary value is reached, a linear search is done to find the key value and its position in the array. The number of comparisons is equal to (n/m + m - 1).
It should be noted that in jump search, the linear search is done in reverse, that is, from the boundary value back to the previous value of m.
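The two sorted-data techniques above can be sketched in Python as follows. The function names, the integer rounding of m = √n and the sample arrays are assumptions made for illustration.

```python
import math
from typing import Sequence


def interpolation_search(sorted_items: Sequence[int], key: int) -> int:
    """Return an index of key in an ascending sequence, or -1 if absent."""
    low, high = 0, len(sorted_items) - 1
    while low <= high and sorted_items[low] <= key <= sorted_items[high]:
        span = sorted_items[high] - sorted_items[low]
        if span == 0:
            return low  # all remaining values equal the key
        # The probe position is estimated from the key's value.
        pos = low + (key - sorted_items[low]) * (high - low) // span
        if sorted_items[pos] == key:
            return pos
        if sorted_items[pos] < key:
            low = pos + 1
        else:
            high = pos - 1
    return -1


def jump_search(sorted_items: Sequence[int], key: int) -> int:
    """Jump in steps of m, then scan backwards from the boundary value."""
    n = len(sorted_items)
    if n == 0:
        return -1
    step = max(1, math.isqrt(n))  # m = floor(sqrt(n)), an assumed rounding
    boundary = step - 1
    # Jump ahead while the checked element is still smaller than the key.
    while boundary < n - 1 and sorted_items[boundary] < key:
        boundary = min(boundary + step, n - 1)
    # Reverse linear search from the boundary back over one jump.
    for index in range(boundary, max(boundary - step, -1), -1):
        if sorted_items[index] == key:
            return index
        if sorted_items[index] < key:
            break  # smaller values only from here on
    return -1


uniform = [10, 20, 30, 40, 50, 60, 70, 80]
odds = [1, 3, 5, 7, 9, 11, 13, 15, 17]
probe_hit = interpolation_search(uniform, 70)
jump_hit = jump_search(odds, 13)
```

Both functions rely on the input being sorted; on unsorted input they can miss values that are present.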
Though there is a large number of searching approaches [13] covering different aspects such as strings [14], numeric values and so on, optimizing these approaches [15, 16] remains a concern. Nowadays industry requires feature-specific searching tools, such as audio, video and/or image-based searching [17,18], and as the industry changes day by day with the help of upgraded technology, searching approaches are also gradually changing as needed [19,20].
Still, there is a need for powerful, high-performance, fast searching approaches that work on unsorted data; with this in mind, the Paw Search approach for unsorted data is timely.

III. METHODOLOGY
In this section, we go through the planning, the design and the Paw Search Algorithm itself, in order to describe the development of the proposed search approach precisely and clearly:

A. Planning
To develop this algorithm I have planned around several data structure concepts: arrays, sub-arrays or blocking, sorting approaches and so on.
• The first plan is to manage several unsorted data sets that may be generated from different environments, such as weather data, space data and so on.
• The second plan is to fill an array with this unsorted raw data and find its length.
• The third plan is to divide the unsorted array into several sub-arrays, which are termed data blocks in the later sections of this paper.
• The fourth plan is to find an optimal way to operate on these blocks by traversing them.
• The fifth plan is to apply a searching approach to the blocks to obtain the optimal output.
• The sixth plan is to calculate the time and space complexity of this algorithm.
• The seventh plan is to compare this time and space complexity with those of existing searching algorithms.
The design process of this algorithm is briefly described in part B of this section.

B. Design
To design this algorithm we first consider an unsorted data set, since the main principle of the Paw Search Algorithm is to work on (i) unsorted data by segmenting the given array into several (ii) blocks.
Initially it starts working with x blocks of unsorted data, where x is obtained by taking the square root of the length n of the given array, i.e., x = ⌈√n⌉. When n is not a perfect square, √n is a fraction, but the number of blocks cannot be a fraction, so the ceiling operator is applied to obtain an integer number of blocks. In this situation some dummy data, such as zero, is needed to make the block sizes uniform, i.e., to give each block the same length.
For example, assume an unsorted array arr1[ ] of length 16, which is a perfect square, as shown in Table I. Here Block1, Block2, Block3 and Block4 are the four blocks of the segmented arr1[ ], shown in Table II. Now assume another unsorted array arr2[ ] of length 8, which is not a perfect square, as shown in Table III. Here Block1, Block2 and Block3 are the three blocks of the segmented arr2[ ], shown in Table IV. In Block3, an extra zero is added as dummy data so that all blocks remain the same size.
However, the Paw Search Algorithm never visits all of the blocks of unsorted data linearly; instead, it goes through the blocks in a manner similar to the binary approach, without fully following it.
The design resources and working procedures of this algorithm are listed here:
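The segmentation into x = ⌈√n⌉ blocks of length x, with dummy zeros filling the remainder, can be sketched in Python as below. The function name `split_into_blocks` and the sample values of `arr2` are hypothetical; the actual contents of Tables I to IV are not reproduced here.

```python
import math
from typing import List


def split_into_blocks(data: List[int], dummy: int = 0) -> List[List[int]]:
    """Split data into x blocks of length x, where x = ceil(sqrt(n))."""
    n = len(data)
    if n == 0:
        return []
    x = math.isqrt(n - 1) + 1  # ceil(sqrt(n)) using integer arithmetic only
    # Pad with dummy values so that every block has exactly x elements.
    padded = list(data) + [dummy] * (x * x - n)
    return [padded[start:start + x] for start in range(0, x * x, x)]


arr2 = [5, 2, 9, 1, 7, 3, 8, 6]        # hypothetical array of length 8
blocks = split_into_blocks(arr2)       # three blocks of three elements
```

Note that padding has a side effect: a search for the dummy value itself would report a hit even when the original data does not contain it, and for some lengths (for example n = 5) a whole block consists only of dummy values.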

C. Paw Search Algorithm
Assume an array of length n that holds the node values of a given graph or other randomly generated unsorted data. We now demonstrate the Paw Search Algorithm for finding the target data in this array of unsorted data. Here, we go through the procedural steps.

D. Explanation and Implementation
Let us now look at the block visiting procedure. Fig. 1 illustrates the working flow over the x blocks generated by taking the square root of the length of the array. The length of each block is also x, i.e., the block size and the number of blocks are the same.
Here x is the number of blocks, and l, k, m, p and y are sub-numbers of x: the right mid, right-right mid, ..., left mid, left-right mid, ..., and so on.
To implement this algorithm, assume an array A[] of length n as shown in Table V. The operational steps for these x blocks are shown in Fig. 2. The algorithm follows a left-to-right approach. According to Fig. 2, we first find the number of blocks; the resulting blocks are shown in Table VII.
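The overall flow (split into blocks, visit blocks in a binary-like order, merge sort each visited block, then binary search inside it) can be sketched in Python as follows. This is only a sketch under stated assumptions: the exact visiting formula of Fig. 1 and Fig. 2 is not reproduced in the text, so the midpoint-first order in `block_order` is an assumption, as are all function names. The last block is also left short rather than padded, so that a dummy zero cannot produce a false match.

```python
import math
from typing import Iterator, List, Optional, Tuple

Pair = Tuple[int, int]  # (value, index in the original array)


def merge_sort(pairs: List[Pair]) -> List[Pair]:
    """Top-down merge sort of (value, index) pairs by value."""
    if len(pairs) <= 1:
        return pairs
    mid = len(pairs) // 2
    left, right = merge_sort(pairs[:mid]), merge_sort(pairs[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i][0] <= right[j][0]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]


def binary_search(pairs: List[Pair], target: int) -> Optional[int]:
    """Return the original index of target within a sorted block, or None."""
    low, high = 0, len(pairs) - 1
    while low <= high:
        mid = (low + high) // 2
        if pairs[mid][0] == target:
            return pairs[mid][1]
        if pairs[mid][0] < target:
            low = mid + 1
        else:
            high = mid - 1
    return None


def block_order(low: int, high: int) -> Iterator[int]:
    """Assumed order: middle block first, then left half, then right half."""
    if low > high:
        return
    mid = (low + high) // 2
    yield mid
    yield from block_order(low, mid - 1)
    yield from block_order(mid + 1, high)


def paw_search(data: List[int], target: int) -> Optional[int]:
    """Return an index of target in unsorted data, or None if absent."""
    n = len(data)
    if n == 0:
        return None
    x = math.isqrt(n - 1) + 1          # block length, ceil(sqrt(n))
    block_count = (n + x - 1) // x     # number of non-empty blocks
    for b in block_order(0, block_count - 1):
        start = b * x
        block = [(data[i], i) for i in range(start, min(start + x, n))]
        hit = binary_search(merge_sort(block), target)
        if hit is not None:
            return hit                 # stop as soon as a block contains it
    return None


A = [5, 2, 9, 1, 7, 3, 8, 6]           # hypothetical unsorted array
index_of_seven = paw_search(A, 7)
```

Only the blocks visited before the hit are sorted, so a target in an early-visited block avoids sorting the rest of the array; a missing target still forces every block to be sorted and searched.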

IV. PERFORMANCE ANALYSIS
This section briefly discusses key terms related to measuring the performance of the proposed Paw Search Algorithm. We cover its time and space complexity and give a brief comparison between existing searching approaches and the proposed approach:

A. Space Complexity
Let us now go through the space complexity of the Paw Search Algorithm. To calculate it, we follow the divide-and-conquer, recursive approach of traversing the x blocks generated by taking the square root of the length n of the given array.
Let Block1 be a block of x elements, where x is obtained by taking the square root of the length n of the given array, i.e., x = ⌈√n⌉. First, we calculate the space complexity of the merge sort used to sort this sub-array, i.e., Block1 of length x. The space complexity of merge sort is O(x), meaning that the space needed to sort the sub-array equals its length, here x. As every block has the same size and only one block is sorted at a time, the space complexity here is O(x). Next, we calculate the space complexity of the Paw Search Algorithm for finding the target value in this sub-array, i.e., Block1. The space needed to perform this operation is the same as the length x of the array. A graphical view (push and pop operations on a stack) of the recursive method for Block1 is given in Fig. 3. Fig. 3(A) shows the pushing of Block1's data onto the stack, Fig. 3(B) shows Block1 fully pushed onto the stack, Fig. 3(C) shows the popping of Block1's data from the stack, and Fig. 3(D) shows Block1 fully popped from the stack.
So, Block1 needs space equal to its length, i.e., x. Similarly, each of the remaining blocks needs space equal to its block size. So, here the space complexity is log(x). Now, to find the target element within each block, we apply the Binary Search approach, whose space complexity is log(x). Hence the space complexity of the Paw Search Algorithm is log n.

B. Time Complexity
Let us now go through the time complexity of the Paw Search Algorithm. To calculate it, we follow the divide-and-conquer, recursive approach of traversing the x blocks generated by taking the square root of the length n of the given array.
Let Block1 be a block of x elements, where x is obtained by taking the square root of the length n of the given array, i.e., x = ⌈√n⌉. Firstly, we account for the extra time needed to generate these x blocks from the given data array of length n.
Secondly, we consider the time complexity of the merge sort used to sort this sub-array, i.e., Block1 of length x, which is known to be O(x log(x)). Next, we calculate the time complexity of the Paw Search Algorithm for finding the target value in this sub-array, i.e., Block1.
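As a hedged worked bound, not a derivation given in the text, the per-block costs named above can be combined under two assumptions: each visited block of length x = ⌈√n⌉ is merge-sorted in O(x log x) and then binary-searched in O(log x), and in the worst case (target absent or in the last visited block) all x blocks are visited.

```latex
\begin{align*}
T_{\text{block}}(x) &= O(x \log x) + O(\log x) = O(x \log x), \\
T_{\text{best}}(n)  &= T_{\text{block}}\bigl(\lceil \sqrt{n} \rceil\bigr) = O(\sqrt{n}\,\log n), \\
T_{\text{worst}}(n) &= x \cdot T_{\text{block}}(x) = O(x^{2} \log x) = O(n \log n),
  \qquad x = \lceil \sqrt{n} \rceil .
\end{align*}
```

Here the best case corresponds to the target lying in the first block visited, and the last step uses x² = O(n) and log x = O(log n).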

C. Difference between Binary Search and Paw Search
The Paw Search Algorithm and the Binary Search Algorithm are not the same. There are several distinct differences between the two approaches, as shown in Table VIII:

| Paw Search | Binary Search |
| --- | --- |
| It begins its operation with the unsorted array of data. | It begins its operation with the sorted array of data. |
| It divides the given unsorted array of length n into x blocks by taking the square root of the length, i.e., x = ⌈√n⌉. | It doesn't divide the array into blocks. |
| The input data may be sorted or unsorted; it does not matter here. | The input data must be sorted here. |

So, the Paw Search and Binary Search techniques are not similar; they are quite different, and for unsorted input Paw Search is comparatively more efficient than Binary Search. It should also be mentioned that the Paw Search Algorithm removes the Binary Search Algorithm's limitation of requiring a fully sorted array as input.

D. Difference between Jump Search and Paw Search
The Paw Search Algorithm and the Jump Search Algorithm are not the same. There are several distinct differences between the two approaches, as shown in Table IX:

| Paw Search | Jump Search |
| --- | --- |
| It begins its operation with the unsorted array of data. | It begins its operation with the sorted array of data. |
| It divides the given unsorted array of length n into x blocks by taking the square root of the length, i.e., x = ⌈√n⌉. | It also divides the given sorted array of length n into x blocks by taking the square root of the length. |
| It doesn't follow the linear approach for traversing its blocks. | It follows the linear approach for traversing its blocks. |
| It is faster. | It is comparatively slower. |
| It doesn't travel the blocks sequentially. | It travels the blocks sequentially. |
| Within each block it uses the binary search approach as the inner approach. | Within each block it uses the linear search approach as the inner approach. |
| The input data may be sorted or unsorted; it does not matter here. | The input data must be sorted here. |
| It is a combined searching system. | It is a standalone searching system. |

So, the Paw Search and Jump Search techniques are not similar; they are quite different, and Paw Search is comparatively more efficient than Jump Search. It should also be mentioned that the Paw Search Algorithm removes the Jump Search Algorithm's limitation of requiring a fully sorted array as input.

E. Time Complexity Comparisons with others Algorithms
A comparison of the time complexity of different search algorithms, namely linear search, binary search, hybrid search, interpolation search and paw search, in the worst, average and best cases is shown in Table X.

V. CONCLUSION WITH FUTURE WORK
Through the discussion in this paper, I have come to understand that research is the fundamental tool of this globalizing world, i.e., the IT world, and that large volumes of unsorted data lie at the heart of nearly all research nowadays. Managing this unsorted data properly with a suitable searching technique is the core point of the Paw Search Algorithm. The prime aim of this research work is to develop a specific and more optimal formula for searching an unsorted list or array of data with the help of other searching and sorting techniques, namely merge sort and binary search. The Paw Search Algorithm shows an optimal way to produce a proper search result from an unsorted list or array of data with optimal time and space complexity. Several comparisons of different searching approaches with the Paw Search Algorithm are shown in Tables VIII, IX, X and XI.
However, research is a continuous process, and this work will be upgraded as the demands of the time change. There is also a lot of scope for future work, some of which is listed below:
• Developing a more optimal logic/formula to optimize this algorithm
• Developing a Machine Learning model to predict the block containing the desired data

ACKNOWLEDGMENT
It is a great pleasure for me to present this thesis paper titled "Paw Search - A Searching Approach for Unsorted Data Combining with Binary Search and Merge Sort Algorithm". I express my heartiest thanks to my friends and well-wishers for their continuous inspiration and support, which led me to complete this research work. Finally, I express my appreciation to my parents and other family members for their unconditional support; without their support and inspiration, it would have been impossible for me to complete this research successfully.