Re-tooling Code Structure Based Analysis with Model-Driven Program Slicing for Software Maintenance

Static code analysis is a methodology of detecting errors in program code based on the programmer's reviewing the code in areas within the program text where errors are likely to be found and since the process considers all syntactic program paths; there is the need for a model-based approach with slicing. This paper presented a model of high-level abstraction of code structure analysis for a large component based software system. The work leveraged on the most important advantage of static code structure analysis in re-tooling software maintenance for a developing economy. A program slicing technique was defined and deployed to partition the source text to manageable fragments to aid in the analysis and statecharts were deployed as visual formalism for viewing the dynamic slices. The resulting model was a high-tech static analysis process aimed at determining and confirming the expected behaviour of a software system using slices of the source text presented in the statecharts. Keywordssoftware maintenance; static analysis; syntactic program behavior; program slicing.


INTRODUCTION
There had been a growing use of static analysis both in commercial and academic areas in the verification of properties of software used and locating potentially vulnerable code in critical sensitive computer systems [1].Static code analysis is the analysis of computer software that is performed without actually executing programs built from that software.This is in contrast to the analysis performed on executing programs which is known as dynamic analysis [2].Empirical evidence from [3] and [4] showed that software maintenance consumed between 50% and 80% of the resources in the total software budget.In addition, according to [5] and [6]; an estimated 50% to 80% of the time and material involved in software development is devoted to maintenance of existing code, hence the main justification for this work was to leverage on the most important advantage of static code structure analysis in re-tooling software maintenance for a developing economy.
The most important advantage of static code analysis lied in the possibility of considerable cost saving by defects elimination in a program; this is expected to bring about some economic benefits to a developing country as technological innovations were known to do and they can be expected to save considerable costs in software maintenance.The earlier an error is determined; the lower the cost of its correction.According to [7], correction of an error at the testing stage is ten times more expensive than its correction at the construction (coding) stage.
This paper presented a model for code structure analysis for a large component based software system.A program slicing technique was deployed to partition the code to manageable fragments to aid in the analysis and statecharts were deployed as visual formalism to view the dynamism of the static slices.The model aimed at determining and confirming the expected behavior of a software system using the source text because research had shown that during maintenance, the most reliable and accurate description of the actual behavior of a software system is its source code [8].The rest of this paper is organized as follows: A background to the concepts of static codes structure analysis and program slicing was presented in section II; section 3 described the materials and methods adopted in the research, the resultant models were discussed in section 4 and the section 5 concluded the paper.

II. RESEARCH BACKGROUND
Identifying errors in software during development is very important so that the end product can be error free and perform to its specification.
Reference [9] believed that early identification of bugs in a developing program can be achieved through the concept of program slicing.Generally, program slice has a wider spectrum of applications that include debugging, testing, maintenance, code understanding, complexity measurement, security etc. Static analysis is any form of analysis that does not require a system to be operated.The process complements dynamic analysis, where system operation is central.When applied to code, static analysis is typically referred to as white-box, glassbox, structural or implementation based techniques [10].In most cases the analysis is performed on some version of the source code and in the other cases some form of the object code.The term is sometimes applied to the analysis performed by an automated tool, with human analysis being called program understanding, program comprehension or code www.ijacsa.thesai.orgreview.Static Analysis is performed without program execution and the process includes almost everything except conventional testing.This is because dynamic testing requires running code, some program properties such as race conditions are too hard to test for and though it may be impossible to program correctness, one can easily prove simple properties of simplified models.Static analysis can be applied earlier in development because some kinds of defects are hard to find by testing (e.g., timing-dependent errors) and because testing and analysis are complementary; each is best at finding different faults.The sophistication of the analysis performed by tools varies from those that only consider the behavior of individual statements and declarations, to those that include the complete source code of a program in their analysis.Uses of the information obtained from the analysis vary from highlighting possible coding errors to formal methods that mathematically prove properties about a given program [11].
One implementation technique of formal static analysis is model checking; it includes the consideration of systems that have finite state or may be reduced to finite state by abstraction.Data-flow analysis is a lattice-based static analysis technique for gathering information about the possible set of values and the abstract interpretation models the effect that every statement has on the state of an abstract machine.In this case, the model 'executes' the software based on the mathematical properties of each statement and declaration.This abstract machine over-approximates the behaviours of the system and the abstract system is thus made simpler to analyze, at the expense of incompleteness since not every property true of the original system is true of the abstract system.If properly done, though, abstract interpretation is sound as every property true of the abstract system can be mapped to a true property of the original system [12].
The sophistication of the analysis performed by the model varies from those that only consider the behavior of individual statements and declarations, to those that include the complete source code of a program in their analysis.Uses of the information obtained from the analysis vary from highlighting possible coding errors to formal methods that mathematically prove properties about a given program [13].Static analysis can find weaknesses in the code at the exact location, it can be conducted by trained software assurance developers who fully understand the code and it allows a quicker turn around for fixes.In addition to all these, it permits weaknesses to be found earlier in the development life cycle, reducing the cost to fix.
Program slicing was initially proposed to guide programmers during program debugging, but had been found to be useful for the process of understanding programs.Dynamic slicing was used to identify those parts of the program that contributed to the computation of the selected function for a given program execution.This can be used to understand program execution by adopting a commonly used high level abstraction.Program slicing is the computation of the set of programs statements, the program slice that may affect the values at some point of interest, referred to as a slicing criterion.Program slicing can be used in debugging to locate source of errors more easily.Other applications of slicing include software maintenance, optimization, program analysis, and information flow control.[13].Slicing techniques have been seeing a rapid development since the original definition by Mark Weiser.At first, slicing was only static, i.e., applied on the source code with no other information than the source code.Reference [14] introduced dynamic slicing, which worked on a specific execution of the program; for a given execution trace.Based on the original definition of [15], informally, a static program slice S consists of all statements in program P that may affect the value of variable v at some point p.The slice is defined for a slicing criterion C=(x,V), where x is a statement in program P and V is a subset of variables in P. A static slice includes all the statements that affect variable v for a set of all possible inputs at the point of interest (i.e., at the statement x).Static slices are computed by finding consecutive sets of indirectly relevant statements, according to data and control dependencies.Dynamic slicing techniques provided a means to prune unrelated computation, and it may help to narrow down this part of a program that contributed to the computation of a function of interest for a particular program input.

III. MATERIALS AND METHODS
The methodology adopted in the work was a modification of the Jakstab framework [16].Jakstab is an Abstract interpretation-based, integrated disassembly and static analysis framework for designing analyses on executables and recovering reliable control flow graphs.In order to make the framework suitable for the research in this paper, the author added an extra layer of abstraction to the original framework to obtain a modified methodology suitable for the task at hand.While the starting point for the Jakstab framework is binary source, the approach in this paper performs the analysis on the program source code.
In addition to the modified Jakstab framework; the following materials were deployed in building the model-based high-tech source analysis system.
 A bottom-up dynamic slicing technique was defined in this work and deployed to obtain a hybrid-tech static analysis model (Fig. 1).A slice was constituted by an executable portion of the original program whose behavior is, under the same input, indistinguishable from that of the original program on a given variable 'V' at point 'P' in the program.Reference [13] had showed that bottom-up program slicing techniques could be successfully deployed to transform a large component-based program into a smaller one that contains only statements relevant to the computation of a given function.
 Statecharts notations used in [17] were deployed as a concise visual formalism that captured the dynamic behaviour of a system in representing program slices in this work.Illustrated below is a visualization of a login module Passwd.pas(Fig. 2).
 Goal models-a graph structure representing stakeholder goals and their inter-dependencies was deployed to decompose goals into sub-goals through AND/OR refinements (Fig. 3).www.ijacsa.thesai.orgIV.RESULTS AND DISCUSSIONS This work described a model-based high-tech methodology for static source analysis which consisted of systematically using slices of code source models as primary source artifacts throughout the analysis process (Fig. 1).This was because, though proving program correctness may be wrong at static analysis phase, it is possible to prove simple properties of simplified models.The process took as input, the program source text and generated slices based on slicing criterion.
The knowledge schema at this point comprised of the knowledge of the syntax and semantics of the programming language (Fig. 4).Low-tech static analysis that involved simple software inspection and manual checking of simple syntactic standards were first carried out.This was followed by the hightech static analysis that included enforced syntactic checks, check for conformance with respect to specifications, designs, and the code data flow analysis.The model was bottom-up and generated slices were analyzed, refined if necessary and evaluated.The sum of all analyzed parts (slices) gave the whole (entire source) at the end of the day which may eventually become a candidate for analysis at a later time.Generated slices were represented by statecharts and resolved into goal models.
The Static analysis model was applied to a real-life legacy application [13].A procedural application developed for a commercial bank in Nigeria prior to the consolidation of the banking sector in the country was chosen due to its high availability and statecharts were deployed to show the abstract description of the behaviour of the source system.Fig. 5 showed the statechart that implemented the login page of the application, passwd.pas.
Visualizations like the one above (Fig. 5) were built for different slices of the application and the different visualizations were combined to obtain a high-level metamodel that contained the entire description of the original system (Fig. 6).
The top-level statechart above was converted to an annotated goal model using the conversion process described earlier (Fig. 3), all the transitions were converted into goals using the AND/OR decomposition rules.Some tasks in the goal model contributed to quality concerns modeled by the softgoals; for example, "correctPassword" contributed to the security concern while "ErrorMessage" contributed to the usability concern (Fig. 7).Previous research had showed that the maintenance of existing software source code consumed up to ¾ of the resources in the total software budget (time and material), and that the earlier an error is determined, the lower is the cost of its correction.In addition, [18] opined that code comprehension require 47% and 62% of the total time for enhancement and correction tasks, respectively.The main justification for this work therefore is to leverage on the most important advantage of static code structure analysis in re-tooling software maintenance for a developing economy-cost saving.In this work, a framework for code structure analysis for a large component based software system with program slicing was developed as a re-tooling technique for developing economies in order to enable an early detection of bugs during a software development process thereby saving significant costs in software maintenance.www.ijacsa.thesai.org[13]

Figure 4 .
Figure 4.The model-based high-tech source analysis system

Figure 5 .
Figure 5. Sample visualization of the login module using Statechart V. CONCLUSION

Figure 6 .Figure 7 .
Figure 6.Sample top-level Statechart of the Procedural application