An Automated Framework for Detecting Change in the Source Code and Test Case Change Recommendation

Improvements and acceleration in software development have contributed to high-quality services in all domains and fields of industry, which in turn has increased the demand for high-quality software. To meet this demand, the software development industry is adopting highly skilled human resources, advanced methodologies and technologies that accelerate the development life cycle. One of the biggest challenges in the software development life cycle is change management between versions of the source code. Versioning of the source code can be caused by various factors, such as changes in the requirements, functional updates or technological upgrades. Change management affects not only the correctness of a software release but also the number of test cases. It is often observed that the development life cycle is delayed by a lack of proper version control and by the repetitive testing iterations that improper version control causes. Hence, the demand for better version-control-driven test case reduction methods cannot be ignored. A number of version control mechanisms have been proposed in parallel research attempts. Nevertheless, most of them are criticized for not contributing to test case generation or reduction. Therefore, this work proposes a novel probabilistic refactoring detection and rule-based test case reduction method in order to simplify testing and version control in software development. Refactoring is widely adopted by software developers for making efficient changes to code structure or functionality, or for applying changes in the requirements. This work demonstrates a very high accuracy for change detection and management, which results in a higher accuracy for test case reduction.
The final outcome of this work is a reduction in software development time, making the software development industry more efficient.

Keywords: Change detection; pre-requisite detection; feature detection; functionality detection; test case change recommendation


I. INTRODUCTION
Improving the code is a mandatory task in every software development cycle because client requirements change continuously. Improvements or changes to the source code can be carried out in various ways, such as version control, requirement tracking or third-party tools. Nonetheless, the most frequently adopted method is refactoring, as suggested by M. Fowler et al. [1]. The effect of refactoring on the source code is highly compatible with the change management process and with the other phases of the software development life cycle. E. R. Murphy-Hill et al. [2] have listed the standard phases of source code refactoring, which strongly influence the adoption of the process. A detailed comparative analysis of refactoring against other versioning methods was performed by N. Tsantalis et al. [3], highlighting the benefits of refactoring over other methods. The challenges of refactoring cannot be ignored: improper management can cause higher complexity during versioning, as demonstrated by M. Kim et al. [4]. Another study, documented by Miryung Kim et al. [5], focuses on software development improvement at Microsoft and suggests similar measures. A similar study was conducted on GitHub by D. Silva et al. [6], and its result agrees with the previous studies, recommending similar measures for safe refactoring of the source code (Fig. 1).
Therefore, because refactoring (Fig. 2) is highly helpful for changing source code, most development practices use this method.
Although refactoring helps in making controlled changes to the code, these changes lead to further changes in the testing process and in test case management. Hence, change detection and test case verification that avoid repeating the test cases for features that did not change during refactoring are highly prioritized by industry practitioners and researchers. Thus, this work attempts to provide a solution for change detection and test case reduction.
The rest of the work is organized as follows. Section II analyzes the outcomes of parallel research. Section III lists the problem definition and the scope for improvements. Section IV discusses the proposed change detection algorithm. Section V elaborates the proposed test case detection and reduction algorithm. Section VI presents the complete proposed automated framework. Section VII discusses the results. Section VIII presents the comparative analysis for understanding the improvements, and Section IX presents the final conclusion.

II. PARALLEL RESEARCH OUTCOMES
Versioning of the source code is performed in order to include changes in the source code. Often the changes are recommended by the customer or are made to fulfil technical requirements. Thus, refactoring results in changes to the pre-requisites, the features or the functionality of the source code. Hence, detecting the correct changes after the source code is refactored is the prime task.
A number of parallel research efforts have addressed this task. In this section, their outcomes are analysed.
The first case study, by E. R. Murphy-Hill et al. [2], reports a framework that collects historical data from source code version control and integrates the changes into the popular Eclipse IDE. This work was advanced by S. Negara et al. [7], who use the metadata generated by the version history. Nevertheless, this process is completely dependent on the refactoring trails, that is, the information auto-generated during the refactoring process.
To remove the dependency on information auto-generated by refactoring tools, J. Ratzinger et al. [8] propose a framework that generates commit messages during the refactoring process. This feature enables the framework to detect all changes, including minor updates. Needless to say, this framework is expected to be deployed from the beginning of the code development life cycle, for which it has been criticized in the practitioner community. Other popular strategies supporting this method have also been proposed. Miryung Kim et al. [5] have fine-tuned the framework to detect further changes.
Other popular methods for detecting change analyse the patterns and behaviour of the source code, as demonstrated by G. Soares et al. [9], or analyse software code metrics, as presented by S. Demeyer et al. [10].
On the other hand, detecting refactoring using static code analysis is also a widely accepted method. The work by D. Dig et al. [11] on component-based detection of changes made the detection process automated and specific. Also, K. Prete et al. [12] have proposed an alternative method for detecting source code changes using templates. The major bottleneck of this process is separating the workable templates from the templates that do not change any functionality. In order to improve this process, M. Kim et al. [13] proposed a logical separation of the templates by querying the structure of the code.
Furthermore, the bottlenecks of the existing works are summarized and analysed by P. Weissgerber et al. [14]. This work takes up those recommendations and frames the general scope for improvements in the next section.

III. IDENTIFICATION OF SCOPE FOR IMPROVEMENTS
With this understanding of the outcomes of the various refactoring research attempts, and of the strong connection between change detection and test case management, this section identifies the research problems.
Based on the outcomes of the parallel research, the following shortcomings are identified:
• Firstly, general-purpose regression testing is carried out on the complete source code, which is produced and modified from time to time during the software development life cycle. In most instances, it has been observed that pre-configured test cases are deployed on the new version of the source code. Needless to say, most of these test cases are configured to test areas where no changes were made. Hence, optimization of the test cases is completely ignored.
• Secondly, during manual generation of the test cases, the high-priority test cases are identified. Most of the parallel research depends on the pre-defined functional requirements given by the customer to decide the priority of the functional requirements, and the priority of the test cases is decided from this information. As a natural consequence, hidden but critical functionalities are often ignored, along with the test cases that would validate them.
• Thirdly, automation of test case generation is a demanding area of research for regression testing. Nonetheless, the processes are far from perfect and are not yet fully accepted.
• Finally, defining the priority of test cases depends on various factors. None of the parallel research efforts has demonstrated all possible combinations for optimizing the test cases.
This work addresses the first of the problems listed above.
The next section discusses the proposed change detection algorithm.

IV. PROPOSED CHANGE DETECTION
The changes made to the source code through refactoring must be identified in order to reduce the test cases or to generate an outline of test cases.
The proposed change detection algorithm is developed in four parts.

Algorithm -1: Source Code Pre-Processor (SCPP)
Step -1. Access the repository for source code files
Step -2. Mark the previous version of the file as V(n)
Step -3. Mark the recent version of the file as V(n+1)
Step -4. Identify the number of lines in V(n) and V(n+1)
Step -5.
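Steps 1 to 4 of SCPP can be illustrated with a minimal Python sketch. The function names `read_version` and `mark_versions` are hypothetical and are not taken from the framework; the repository is assumed to be an ordinary directory of files, and only the steps that appear in the listing above are covered.

```python
from pathlib import Path


def read_version(path):
    """Step 1 (assumed form): read one source file from the repository."""
    return Path(path).read_text(encoding="utf-8")


def mark_versions(previous_text, recent_text):
    """Steps 2-4: label the two versions and count their lines."""
    v_n = previous_text.splitlines()    # previous version, V(n)
    v_n1 = recent_text.splitlines()     # recent version, V(n+1)
    return {
        "V(n)": v_n,
        "V(n+1)": v_n1,
        "line_counts": (len(v_n), len(v_n1)),
    }


if __name__ == "__main__":
    report = mark_versions("int a;\nint b;\n", "int a;\nint b;\nint c;\n")
    print(report["line_counts"])
```

Keeping the file access separate from the labelling step means the labelling can be exercised on in-memory strings, independent of how a given version control tool stores its files.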

Algorithm -2: Prerequisite Requirement Change Detection (PRCD)
Step -1. Load the files as V(n) and V(n+1)
Step -2. Accept the tokenizer report
Step -3. Build the list of "package" and "import" statements
Step -4. For each line
a. Detect the changes in "package" and "import" statements
Step -5. List the inclusion of prerequisite statements
Step -6. List the exclusion of prerequisite statements
The algorithm is visualized graphically in Fig. 4.

Algorithm -3: Code Feature Change Detection (CFCD)
Step -1. Load the files as V(n) and V(n+1)
Step -2. Accept the tokenizer report
Step -3. Build the list of variable identifiers
Step -4. For each line
a. Detect the changes in variable identifier statements
Step -5. List the inclusion of variable identifier statements
Step -6. List the exclusion of variable identifier statements
The algorithm is visualized graphically in Fig. 5.
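The prerequisite and variable identifier detection steps both amount to building a set of statements for each version and taking the two set differences. The following Python sketch illustrates that idea under stated assumptions: the regular expressions are simplified and hypothetical (they cover only single-line "package"/"import" statements and a handful of Java types), and the function names are not taken from the framework.

```python
import re

# Assumed, simplified patterns; a real Java parser would be more robust.
PREREQ = re.compile(r"^\s*(package|import)\s+[^;]+;")
DECL = re.compile(
    r"\b(?:int|long|double|float|boolean|char|String)\s+([A-Za-z_]\w*)\s*(?:=|;)"
)


def prerequisite_statements(lines):
    """Collect normalized 'package' and 'import' statements."""
    found = set()
    for line in lines:
        match = PREREQ.match(line)
        if match:
            found.add(" ".join(match.group(0).split()))
    return found


def variable_identifiers(lines):
    """Collect identifiers from simple variable declarations."""
    found = set()
    for line in lines:
        found.update(DECL.findall(line))
    return found


def inclusions_and_exclusions(old_items, new_items):
    """Items only in V(n+1) are inclusions; items only in V(n) are exclusions."""
    return sorted(new_items - old_items), sorted(old_items - new_items)


if __name__ == "__main__":
    v_n = ["package demo;", "import java.util.List;", "int count = 0;"]
    v_n1 = ["package demo;", "import java.util.List;",
            "import java.util.Map;", "int total = 0;"]
    print(inclusions_and_exclusions(prerequisite_statements(v_n),
                                    prerequisite_statements(v_n1)))
    print(inclusions_and_exclusions(variable_identifiers(v_n),
                                    variable_identifiers(v_n1)))
```

Using sets makes the comparison insensitive to the order of statements, so moving an import within a file is not reported as a change in this sketch.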

Algorithm -4: Source Functionality Change Detection (SFCD)
Step -1. Load the files as V(n) and V(n+1)
Step -2. Accept the tokenizer report
Step -3. Apply programming parser on the tokens
The algorithm is visualized graphically in Fig. 6.
With this understanding of the proposed change detection algorithms, the next section presents the test case change identification method.
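As a purely illustrative sketch, and not a reconstruction of the SFCD steps, functionality changes could be reported by comparing the methods found in each version. The Python code below assumes that a parser has already produced a mapping from method signature to method body for V(n) and V(n+1); the mapping format and the function names are hypothetical.

```python
def normalize(body):
    """Collapse whitespace so that pure formatting changes are ignored."""
    return " ".join(body.split())


def functionality_changes(old_methods, new_methods):
    """Compare two {signature: body} mappings (an assumed parser output)."""
    old_names, new_names = set(old_methods), set(new_methods)
    modified = sorted(
        name for name in old_names & new_names
        if normalize(old_methods[name]) != normalize(new_methods[name])
    )
    return {
        "added": sorted(new_names - old_names),
        "removed": sorted(old_names - new_names),
        "modified": modified,
    }


if __name__ == "__main__":
    v_n = {"int sum(int a, int b)": "return a + b;",
           "void log(String m)": "System.out.println(m);"}
    v_n1 = {"int sum(int a, int b)": "return  a + b;",
            "int diff(int a, int b)": "return a - b;"}
    print(functionality_changes(v_n, v_n1))
```

In this sketch, a method whose body differs only in whitespace is treated as unchanged, which is one possible way to avoid flagging reformatting as a functional change.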

V. PROPOSED TEST CASE CHANGE RECOMMENDATION
Testing is one of the most important phases in the software development life cycle. With recent developments in software, automation of test cases has grown in popularity. Refactoring of the source code often affects the test cases as well, which can lead to the following situations:
• Inclusion of new test cases,
• Exclusion of existing test cases, and
• Removal of duplicated test cases.
Thus, considering these factors, this section presents the proposed test case change recommendation algorithm.
The algorithm is visualized graphically here in Fig. 7.
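The three situations listed above can be sketched as simple rules in Python. This is an assumption-laden illustration rather than the TCCR algorithm itself: the repository format (test name mapped to a pair of covered element and check description), the notion of a duplicate (two tests with an identical pair) and the function name are all hypothetical.

```python
def recommend_test_changes(test_repository, added, removed):
    """Rule-based sketch: include, exclude, and de-duplicate test cases.

    test_repository: {test_name: (covered_element, check_description)}
    added / removed: elements reported by the change detection step
    """
    removed = set(removed)
    covered = {element for element, _ in test_repository.values()}

    # Rule 1: new elements without any test need a new test case.
    include = sorted(e for e in set(added) if e not in covered)

    # Rule 2: tests for removed elements can be excluded.
    exclude = sorted(name for name, (element, _) in test_repository.items()
                     if element in removed)

    # Rule 3: a later test with the same element and check is a duplicate.
    seen, duplicates = set(), []
    for name in sorted(test_repository):
        key = test_repository[name]
        if key[0] in removed:
            continue
        if key in seen:
            duplicates.append(name)
        else:
            seen.add(key)

    return {"include": include, "exclude": exclude, "duplicates": duplicates}


if __name__ == "__main__":
    repo = {"t1": ("parse", "returns list"), "t2": ("parse", "returns list"),
            "t3": ("oldHelper", "not null")}
    print(recommend_test_changes(repo, added=["render"], removed=["oldHelper"]))
```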

VI. PROPOSED AUTOMATED FRAMEWORK
This section elaborates the proposed automated test case change recommendation framework. The framework shows how the different components collaborate and are coupled together to automate the complete process (Fig. 8). It is designed to reduce the time needed for verifying, reducing or introducing test cases in the existing test case repositories.
Firstly, the source code version files are accessed from the location where all source code is stored, usually called the source code repository. The repository is maintained by the version control tools used by the organization. The proposed framework does not place any constraints on the version control features; it only expects versioning to be done on separable source code files. After the source code files are loaded, the pre-processing algorithm is applied to remove the comments and to tokenize the source code files. Once tokenization is complete, the same source code files are passed to the proposed PRCD, CFCD and SFCD algorithms. These algorithms identify pre-requisite changes, feature or variable changes, and functionality changes, respectively. Finally, the recommendation algorithm, TCCR, generates the final recommendations based on the existing test case repository.
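The comment removal and tokenization stage of this workflow can be sketched in Python as follows. The regular expressions and function names are assumptions made for illustration; in particular, the comment pattern does not handle comment markers that appear inside string literals, which a real Java lexer would.

```python
import re

# Assumed patterns: Java line comments (// ...) and block comments (/* ... */).
COMMENT = re.compile(r"//[^\n]*|/\*.*?\*/", re.DOTALL)
# Identifiers or keywords, integer literals, or any single other symbol.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def remove_comments(source):
    """Delete line and block comments from the source text."""
    return COMMENT.sub("", source)


def tokenize(source):
    """Return the token list of the comment-free source."""
    return TOKEN.findall(remove_comments(source))


if __name__ == "__main__":
    code = "int x = 1; // counter\n/* block\n comment */ x++;"
    print(tokenize(code))
```

The resulting token list is the kind of input that the later detection stages could consume, although the actual format of the tokenizer report is not specified in the text.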
With this understanding of the complete framework workflow, the next section discusses the results.

VII. RESULTS AND DISCUSSION
The results obtained from the proposed automated framework are highly satisfactory and are discussed in this section. Because the framework is highly integrated, the results are discussed under separate headings: Experimental Setup, Pre-processor Output, Change Detection Output, Pre-Requisite Test Case Availability, Recommendation Output, Variable Test Case Recommendation Output and Functionality Test Case Recommendation Output.

A. Experimental Setup
Firstly, the experimental setup is discussed here. The primary component of the experiment relies on Java's "diff" utility. The Diff Utilities library is an open-source library for performing comparison (diff) operations between texts or other kinds of data: computing diffs, applying patches, generating unified diffs or parsing them, generating diff output for easy later display (such as a side-by-side view), and so on. The other details are given in Table I.
(IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 11, No. 8, 2020, www.ijacsa.thesai.org
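The kind of diff operation described in the experimental setup can be illustrated with Python's standard `difflib` module, which stands in here for the Java diff library used in the experiment. The function names and the default file name are hypothetical; the sketch computes a unified diff and the added and removed lines between V(n) and V(n+1).

```python
import difflib


def unified_diff_lines(old_text, new_text, name="Example.java"):
    """Unified diff between two versions of a file (name is illustrative)."""
    return list(difflib.unified_diff(
        old_text.splitlines(), new_text.splitlines(),
        fromfile=name + " V(n)", tofile=name + " V(n+1)", lineterm=""))


def changed_lines(old_text, new_text):
    """Return (added, removed) lines using SequenceMatcher opcodes."""
    old, new = old_text.splitlines(), new_text.splitlines()
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    added, removed = [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            removed.extend(old[i1:i2])
        if tag in ("insert", "replace"):
            added.extend(new[j1:j2])
    return added, removed


if __name__ == "__main__":
    v_n = "int a;\nint b;\nint c;"
    v_n1 = "int a;\nint c;\nint d;"
    print("\n".join(unified_diff_lines(v_n, v_n1)))
    print(changed_lines(v_n, v_n1))
```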

B. Pre-Processor Output (SCPP Algorithm)
Secondly, the pre-processing outputs are listed in Table II.
TABLE II. SCPP ALGORITHM
The result is visualized graphically in Fig. 9. Further, the tokenizer output is discussed in Table III.

Number of Lines Detected
The result is visualized graphically here in Fig. 10.
Furthermore, the comment removal phase output is discussed in Table IV. The result is visualized graphically in Fig. 11.

C. Change Detection Process Output
Thirdly, the change detection process outputs are listed in Table V.

E. Code Feature Change Detection Output
Fifthly, the Code Feature Change Detection outputs are listed in Table IX. The result is visualized graphically in Fig. 14.

F. Source Functionality Change Detection Output
Sixthly, the Source Functionality Change Detection summary is presented here in Table XI.

G. Test Case Change Recommendation Output
Finally, the Test Case Change Recommendation outputs are presented here in Table XII and Table XIII.