A Survey on Detection and Prevention of Web Vulnerabilities

The Internet provides a vast range of benefits to society and empowers users to use web applications in a variety of ways. It has become one of the most transformative and fastest-growing technologies ever built, but its distributed and open nature also brings new security challenges to web services. A single vulnerability in program code can allow an attacker to obtain unauthorized access and perform malicious actions. Hence, securing web applications against attack is of paramount importance. This paper presents a literature survey that summarizes major vulnerabilities and security solutions, systematizing the existing methods into a broader picture to promote further research. The data are collected from a total of 86 primary studies taken from well-known digital libraries. The articles are classified by method: secure programming, static analysis, dynamic analysis, hybrid analysis, and machine learning. Articles were selected according to their number of citations or the significance of the emerging technique they describe. Overall, our survey suggests that no single approach mitigates all web vulnerabilities; therefore, more research is desirable in the area of web information security. The complexity of each method is addressed, and recommendations on when to apply each method are provided. Finally, we summarize the experience gained and examine future research opportunities in web application security.
Keywords: Web security survey; web vulnerabilities; detection and prevention techniques


I. INTRODUCTION
Web-based applications are among the most effective network-based solutions for delivering standard services, and they have revolutionized the way such services are offered. Modern web applications combine client-side and server-side development. The server-side portion is written in languages and frameworks such as .NET, PHP, Python, and Ruby. The front end is the client-side portion, which runs in the user's web browser and is written in JavaScript, HTML, and CSS. The two portions communicate over HTTP or HTTPS, frequently through Asynchronous JavaScript and XML (AJAX) [1]. Fig. 1 shows the server-side and client-side architecture of a website.
Their availability has made web applications an integral part of daily life. They are mostly free, accessible over the Internet, and able to handle sensitive data such as banking and e-commerce payment details. Because of this popularity, web applications such as blogs, social media, banking, and e-commerce sites are also a primary target of hackers [2]. A weakness, bug, or loophole in a web application that a hacker can exploit is called a vulnerability [3]. Among the most critical vulnerabilities are cross-site scripting (XSS), SQL injection (SQLI), and cross-site request forgery (CSRF), which are listed in the OWASP Top 10 web vulnerabilities. Hackers can use knowledge of these vulnerabilities to compromise a website, so websites require security countermeasures. A variety of techniques are used around the globe to identify and overcome these vulnerabilities. Frequent testing is needed to prevent and minimize web vulnerabilities; however, it requires that the tester have adequate experience. The general approach to security issues is to find bugs before hackers discover and exploit them. One approach is white-box testing, which analyzes the source code of the website. However, it suffers from large numbers of false positives, and the source code of the web application may not be available. Black-box testing is an alternative that overcomes these limitations of white-box testing: it examines the application for vulnerabilities by supplying inputs and checking the output for signs of a specific vulnerability. Many researchers have evaluated the effectiveness of black-box scanners in vulnerability detection.
Furthermore, they have identified the limitations of black-box scanners by repeatedly testing numerous scanners against a wide range of vulnerable applications. Much of the work in this direction focuses on fuzzing, which tests an application with (semi-)random values [5]. Another important family of methods for preventing web vulnerabilities is data mining and machine learning. These learning methods have been applied to a variety of web applications, and they can also be applied to source code to identify vulnerabilities [6].
This study surveys the last ten years of work on web vulnerabilities. The goal is to systematize the existing methods into a broad picture that supports future research. We categorize web vulnerability detection methods into static analysis, dynamic analysis, hybrid analysis, and data mining and machine learning techniques. We first outline the difficulty of web vulnerability discovery and analysis with traditional approaches and briefly explain web vulnerabilities and their types. We then discuss each method in detail, covering its definition, prevention capability, advantages, and challenges.
The paper is structured as follows. We first explained how a web application works and its distinctive characteristics. Section II describes the classification of web vulnerabilities along with the methods to secure web applications. Section III discusses and categorizes the existing countermeasures. Section IV presents the analysis methods for detecting vulnerabilities, with a table and discussion. Section V discusses related work, and Section VI concludes the survey.
II. BACKGROUND: WEB VULNERABILITIES ANALYSIS AND METHODS
We describe the classification of web vulnerabilities and the methods to secure web applications. A vulnerability is a defect, such as an error or bug, that arises from flaws in the coding of a web application and can result in severe damage to the application when exploited [4,7]. Table I presents five types of web vulnerabilities. We group these vulnerabilities into three main classes: improper input validation, improper authentication, and improper session management. They are further divided into four web vulnerability categories: query manipulation, client-side injection, file and path injection, and session management.
The main security issue for web applications is often improper validation of user input. Input enters a web application through entry points (such as $_GET in PHP), and hackers can exploit a vulnerability through, for example, a MySQL query. Most attacks combine simple input with metacharacters or keywords such as 'AND' and 'OR'. Websites should therefore consistently validate user input along the paths from these entry points [8].

A. Improper Input Validation
A web application must validate or sanitize user input properly before it is used on the web server. Web developers usually apply sanitizing routines (i.e., sanitizers) that filter user input and transform it into trusted data. For example, an HTML page may include JavaScript code, and a PHP document may contain static HTML tags as well as PHP statements [2,9].

1) Query Manipulation
Query manipulation is a class of vulnerability affecting components that store data, such as databases, in which malicious input manipulates and changes queries. Through these vulnerabilities, a hacker can manipulate a user-input parameter and thereby change the syntax of the query. When such parameters are not properly validated, malicious values enter the trusted website, bringing unsafe and unreliable data into the web application and damaging its security. Hence, missing or improper validation of user-controllable data is the leading cause of injection vulnerabilities. Vulnerabilities of this type include SQL injection (SQLi), LDAP injection (LDAPI), and NoSQL injection. They all concern the construction of filters and queries that are executed by some kind of engine, for example a DBMS. SQL injection is the best known and most widely exploited of them. The other vulnerabilities work in the same way as SQLi: if a query includes unsanitized user input containing malicious characters, the behavior of the executed query can be altered [2,9].
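The effect of unsanitized input on query syntax can be illustrated with a minimal sketch. It assumes an in-memory SQLite database; the table, column, and function names (`users`, `lookup_unsafe`, `lookup_safe`) are hypothetical and chosen only for illustration.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with two hypothetical users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn


def lookup_unsafe(conn: sqlite3.Connection, name: str) -> list:
    # Vulnerable: the input is concatenated into the query text,
    # so quote characters in it can change the query's syntax.
    query = "SELECT secret FROM users WHERE name = '" + name + "'"
    return [row[0] for row in conn.execute(query)]


def lookup_safe(conn: sqlite3.Connection, name: str) -> list:
    # The input is bound to a placeholder and is treated as data only.
    rows = conn.execute("SELECT secret FROM users WHERE name = ?", (name,))
    return [row[0] for row in rows]


conn = build_demo_db()
payload = "nobody' OR '1'='1"          # classic tautology payload
leaked = lookup_unsafe(conn, payload)   # WHERE clause becomes always true
blocked = lookup_safe(conn, payload)    # no user is literally named like this
```

With the concatenated query, the payload turns the WHERE clause into a tautology and every row is returned; with the bound parameter, the same string matches no user.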

B. Client-Side Injections
Client-side injection enables an attacker to execute malicious code, such as JavaScript payloads, in the victim's browser without a server request. Vulnerabilities in this category include XSS, remote code execution (RCE), and email injection (EI) [6].
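As a rough illustration of how an XSS payload reaches the browser, the sketch below contrasts a page fragment built from raw input with one that encodes the input first. The function names are hypothetical, and output encoding with Python's standard `html.escape` is only one possible countermeasure, not a method prescribed by the surveyed studies.

```python
import html


def render_comment_unsafe(comment: str) -> str:
    # Vulnerable: markup in the comment is emitted verbatim.
    return "<p>" + comment + "</p>"


def render_comment_safe(comment: str) -> str:
    # Special characters (<, >, &, quotes) are replaced by HTML entities,
    # so the browser displays the text instead of executing it.
    return "<p>" + html.escape(comment, quote=True) + "</p>"


payload = "<script>alert(1)</script>"
unsafe_page = render_comment_unsafe(payload)
safe_page = render_comment_safe(payload)
```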

1) File and Path Injection Vulnerability
In this class of vulnerability, a hacker gains access to files of the web application or the file system, or to URL locations outside the web application. The vulnerabilities in this group are remote file inclusion (RFI), local file inclusion (LFI), and directory traversal (DT), also called path traversal (PT) [6]. In this category, we consider only local file inclusion and remote file inclusion in this study.
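A minimal sketch of one common guard against local file inclusion and path traversal is shown below: the requested name is resolved and rejected if it leaves an allowed directory. The directory `/srv/app/pages` and the function name `resolve_include` are assumptions made for this example only.

```python
from pathlib import Path


def resolve_include(base_dir: str, requested: str) -> Path:
    """Return the resolved path if it stays inside base_dir, else raise."""
    base = Path(base_dir).resolve()
    # Joining and resolving collapses "../" segments; an absolute
    # `requested` value replaces the base entirely and is rejected below.
    candidate = (base / requested).resolve()
    if candidate != base and base not in candidate.parents:
        raise ValueError("requested path escapes the include directory")
    return candidate


allowed = resolve_include("/srv/app/pages", "help.html")
try:
    resolve_include("/srv/app/pages", "../../etc/passwd")
    traversal_rejected = False
except ValueError:
    traversal_rejected = True
```

This check addresses only local paths; remote file inclusion additionally depends on whether the platform allows including URLs at all.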

C. Improper Authentication and Authorization (Logic Flaw)
Improper authentication and authorization refers to the incorrect implementation of authentication functions and of policies such as access control policies (ACPs). The logic of a web application is generally implemented by enforcing the application's control flow and by protecting sensitive information. This can be done directly, by placing security checks in the source code, or indirectly, by restricting the navigation paths offered to users, for example by hiding interface elements. Improper implementation of business logic produces logic errors, which cause the application to behave differently from what is expected; this lowers quality of service (QoS) and causes both financial loss and information leakage. Three of the top ten web application security risks [OWASP Top 10], namely Insecure Direct Object References, Missing Function Level Access Control, and Unvalidated Redirects and Forwards, are application logic vulnerabilities [2,9].

D. Improper Session Management
Web applications use web sessions to recognize and associate multiple web requests from a single user within a specific period. A sequence of such related requests is referred to as a web session; the website can use it to keep track of state from past requests, which may change how subsequent requests are handled. In web application development, session management is achieved through cooperation between the client and the server. The usual approach is for the server to send the client a unique identifier (such as a session ID) after the user has been successfully authenticated. Securing the session ID alone is not enough for secure session management. Hackers perform session hijacking by sending malicious requests associated with an authentic session ID. CSRF is a well-known attack in this category and is listed in the OWASP Top 10 web vulnerabilities. A vulnerable web application cannot tell whether a request is malicious as long as it is associated with valid session information [2,9].

1) Session Management
Websites use web sessions to recognize and associate multiple web requests from a single user within a specific period [2,9]. The vulnerabilities in this group are clickjacking, CSRF, session fixation, and session hijacking [10]. In CSRF, a hacker submits a malicious request to the web application on behalf of a legitimate user. Clickjacking is an attack that lures a person into clicking on objects placed in malicious pages, triggering undesired actions without the consent of the legitimate user. Session fixation and session hijacking target the user's session ID, whereas CSRF and clickjacking rely on issuing illegitimate requests on behalf of the user [2,11].
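Because a valid session ID alone does not show that the user intended a request, one widely used countermeasure is a per-session anti-CSRF token that a forged cross-site request cannot supply. The following is a minimal sketch under stated assumptions: `SERVER_KEY`, `issue_csrf_token`, and `is_request_allowed` are hypothetical names, and deriving the token as an HMAC of the session ID is one possible design among several.

```python
import hashlib
import hmac
import secrets

# Hypothetical per-deployment secret; a real system would persist it.
SERVER_KEY = secrets.token_bytes(32)


def issue_csrf_token(session_id: str) -> str:
    """Derive a token bound to the session; embed it in forms the site serves."""
    return hmac.new(SERVER_KEY, session_id.encode(), hashlib.sha256).hexdigest()


def is_request_allowed(session_id: str, submitted_token: str) -> bool:
    """Accept a state-changing request only if it carries the right token."""
    expected = issue_csrf_token(session_id)
    # Constant-time comparison avoids leaking the token through timing.
    return hmac.compare_digest(expected.encode(), submitted_token.encode())


session_id = secrets.token_hex(16)
token = issue_csrf_token(session_id)
legitimate = is_request_allowed(session_id, token)  # form served by the site
forged = is_request_allowed(session_id, "")         # cross-site request, no token
```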

III. RELATED WORK
Existing secondary studies (survey papers and review articles) on securing web applications are discussed in this section. Fig. 2 presents relevant literature reviews published over the past 13 years. Much work has been published on taxonomies of software vulnerabilities. Delgado et al. [12] developed a taxonomy for classifying runtime software-fault monitoring approaches, categorizing them by three elements, including the mechanism used to monitor program execution and the language. Tsipenyuk et al. [7] grouped common coding flaws that cause vulnerabilities into eight categories, seven of which relate to source code and one to environment and configuration issues. Igure and Williams [13] comprehensively surveyed the various taxonomies developed for classifying attacks and vulnerabilities. Krsul [14] proposed a classification of software vulnerabilities. The studies by Halfond et al. [15], Chandrashekhar et al. [16], and Garcia-Alfaro and Navarro-Arribas [17] review techniques for mitigating the most dangerous vulnerabilities, such as SQLI and XSS. The study by Cova et al. [18] highlights the advantages and disadvantages of the vulnerability analysis tools available for securing websites. Fonseca et al. [19] outline the coding flaws that should be avoided in C#, Java, and PHP. Shahriar and Zulkernine [20] presented the state-of-the-art approaches for detecting and preventing attacks on applications in operation, and discussed approaches for mitigating web vulnerabilities at the program level. Hydara et al. [21] conducted a systematic literature review of cross-site scripting that highlights various techniques for detecting and preventing XSS attacks.
(IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 11, No. 6, 2020, www.ijacsa.thesai.org
Wedman et al. [10] presented a detailed survey of the vulnerabilities exploited to launch session hijacking attacks and of the mechanisms available to protect users from such attacks. Li and Xue [9] describe the various methods used to protect web applications from vulnerabilities. All of the aforementioned reviews concentrate on one of the following aspects: (i) developing a taxonomy for classifying attacks and vulnerabilities, (ii) detecting the coding flaws that are exploited to launch attacks, and (iii) categorizing fault-monitoring approaches. The survey on SQLI published in 2012 does not follow a systematic methodology, which limits the scope of its investigation.
Deepa and Santhi [2] provided up-to-date approaches to web vulnerability prevention. Their survey is organized by the phases of the software development life cycle and covers 86 primary studies. Of the research papers it covers, thirty-five address XSS, seventeen address SQL injection, and thirty-five address logic bugs. Buczak, Anna, and Erhan [22] present a literature survey on data mining and machine learning for intrusion detection. The recent review by Ghaffarian, Seyed, and Hamid [23] gives a detailed account of the many machine-learning-based methods for the analysis and discovery of software vulnerabilities.

IV. CATEGORIZE EXISTING COUNTERMEASURES
Numerous researchers around the globe are working on different ways to detect web vulnerabilities. The following sections present the techniques used to find web vulnerabilities: secure programming, static analysis, fuzzing and dynamic analysis, hybrid analysis, and machine learning. Because web vulnerabilities are such a significant problem, many approaches have been researched and proposed. None of them is complete: each either lacks soundness or lacks completeness. Consequently, each new study aims to improve on previous work in a particular aspect of web vulnerability analysis and discovery, such as vulnerability coverage, detection accuracy, or runtime efficiency. Shahriar and Zulkernine [24] present an extensive review of work on preventing web vulnerabilities reported from 1994 to 2010.

A. Secure Programming
Secure programming means that programmers follow secure practices while developing the web application. Such practices include coding defensively, validating input data and its type, correctly encoding user input, using parameterized queries, and using stored procedures. Parameterized query statements are queries whose parameters are represented by placeholders such as "?" that stand for user data. Because the SQL code is fixed and the placeholders are bound separately, input containing an attack string is handled as data rather than as SQL code. Parameterized queries and stored procedures achieve the same result; however, both require care when they are programmed. Despite these measures, SQL injection attacks (SQLIAs) remain a problem in website development [2,9].
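The practice of checking input data and its type can be sketched as allowlist validation at the entry point. The rules below (a username of 3 to 20 word characters, a page number between 1 and 10000) and the function names are assumptions for illustration, not requirements taken from the surveyed studies.

```python
import re

# Hypothetical allowlist: letters, digits, and underscore only.
USERNAME_PATTERN = re.compile(r"[A-Za-z0-9_]{3,20}")
PAGE_PATTERN = re.compile(r"[0-9]{1,5}")


def validate_username(raw: str) -> str:
    """Accept the value only if every character is on the allowlist."""
    if USERNAME_PATTERN.fullmatch(raw) is None:
        raise ValueError("invalid username")
    return raw


def parse_page_number(raw: str, max_page: int = 10000) -> int:
    """Check the type (decimal integer) and the range before use."""
    if PAGE_PATTERN.fullmatch(raw) is None:
        raise ValueError("page must be a decimal number")
    value = int(raw)
    if not 1 <= value <= max_page:
        raise ValueError("page out of range")
    return value
```

For instance, `parse_page_number("42")` returns the integer 42, while a value such as `"42; DROP TABLE users"` is rejected before it can reach a query.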
To protect web applications from attackers, it is important to pay close attention to security at every stage of the software development life cycle (SDLC). After a web application has been set up, it should also be given a secondary security layer [2,3]. Today's operating systems are more secure than those of years past, because automatic protection mechanisms have been built into compilers and core libraries, such as DEP and .NET. On Linux and Windows, stack canaries (cookies) are also frequently used [25]. These mechanisms stop a wide range of attacks regardless of whether the programmer follows secure programming practices.
Writing secure code is ultimately the responsibility of developers when the website is built. For Java-based applications, this burden can be reduced through the Stones library, a technique applied by Juillerat [26]. The library lets programmers access databases through object-oriented constructs instead of SQL strings. User-supplied data cannot be substituted directly into a query as a string, because it only passes through suitable procedures. Programmers therefore have little additional work to do, since the security features are handled by the library. Unsafe string-based code can also be eliminated when queries are constructed by making a visible separation between data and code, as accomplished by Johns et al. [27], who represent query syntax by introducing the Embedded Language Encapsulation Type (ELET). To prevent XSS attacks, Grabowski et al. [28] created a type system for Java that enforces secure programming guidelines.
The study in [29] enables secure web development with Swift, a programming model based on the Jif language. Swift enforces the confidentiality and integrity of information through annotations declared in the program code, which determine whether code and data can be securely placed on the server or the client. Vikram et al. [30] proposed Ripley, an alternative to Swift that prevents inconsistencies in business logic across the two ends by keeping a replica of the client-side computational logic on the server side. Ripley ensures the integrity of rich Internet applications (RIAs) and avoids the extra work of adding annotations to the code. However, it cannot guarantee information confidentiality, and it imposes memory and network overhead because it transfers every event from the client to the server. Resin, a language runtime for PHP- and Python-based applications, lets developers reuse existing application code to write assertions that specify security policies. Yip [31] conducted a comprehensive study of this approach for preventing issues such as missing access control, XSS, and SQLI.
To develop new and secure web applications, a large number of programming frameworks have been created to preserve the integrity of the data and important information held in web applications. Jia et al. [32] created AURA, an intermediate programming language with a type checker that acts as an authority for interpreting the authorization rules of web applications. Swamy et al. [33] implemented a type system called Fable, which specifies and verifies that security policies are applied properly by integrating information flow and access control for web applications.
Corcoran et al. [34] created SELinks, a programming language for building secure multi-tier web applications, by combining the Links language with the Fable type system. SELinks compiles policy enforcement code into user-defined functions within the database, while Fable detects missing authorization checks. It does not, however, guarantee security policies that depend on the state of the web application; this limitation is tackled by Swamy et al. [35], who provide stateful authorization policies for the application. Krishnamurthy et al. [36] proposed Capsules, a method for securing web applications through secure practices.
Another method, proposed in [37], is interactive static analysis, which integrates static analysis into the Integrated Development Environment (IDE). It provides in-situ secure programming support that helps developers prevent vulnerabilities while they write code. It requires no additional training and makes no assumptions about how programs are developed. The work is motivated in part by the observation that many vulnerabilities are introduced because even knowledgeable developers fail to practice secure programming. The authors implemented a prototype interactive static analysis tool as a plugin for Java in Eclipse. Kang and Park [38] suggested a smart fuzzing system that combines black-box and white-box testing to detect software vulnerabilities effectively. The study by Na Meng [39] observed a wide adoption of the authentication and authorization features provided by Spring Security, a third-party framework designed to secure enterprise applications. It found that programming challenges are usually related to APIs or libraries, including the complicated cross-language data handling of cryptography APIs. Moreover, the findings reveal the insufficiency of secure coding assistance and documentation, as well as the huge gap between security theory and coding practices.
A recent study by Bangani et al. [40] proposes teaching secure programming through a step-by-step approach. Their methodology identifies application risks and secure coding practices as they relate to one another and to fundamental programming concepts, and it specifically aims to guide instructors on how to teach secure programming in the .NET environment. Another recent study, by Agrawal et al. [41], proposes an integrated and prescriptive framework intended to identify and mitigate vulnerabilities and to provide suggestions for writing more secure code. A detailed review of research on secure programming methods for finding XSS, SQLi, CSRF, LFI/RFI, and other vulnerabilities is presented in Table II.
TABLE II. SECURE PROGRAMMING EXISTING STUDIES

1) Discussion
Numerous existing studies address the prevention and detection of web application vulnerabilities through secure programming. Developers and firms need to test every piece of software and every application in their portfolios. Doing so early in the website development process reduces the costs related to security for both. Application firewalls can be used as a countermeasure against those attempting to steal information from an IP address. Encryption, antivirus, antispyware, and authentication software can also be used. To protect web applications from attackers, it is important to pay close attention to security at every stage of the software development life cycle (SDLC); once the website has been set up, it must also be given a secondary security layer against vulnerabilities such as XSS, LFI/RFI, RCE, SQLI, and CSRF [2]. Approaches to safe coding include distrusting user input, input validation, and magic switches, and tools such as RATS, Flawfinder, and ITS4 perform automatic source code analysis. From the existing studies, we conclude that Agrawal et al. [41], Kang and Park [38], and Zhu et al. [37] offer useful approaches to secure programming. Fig. 3 summarizes the secure programming methods by year.

B. Analysis Method to Detect Web Vulnerabilities
Different methods are used to detect web vulnerabilities, such as white-box testing, black-box testing, and fuzzing.

1) White-box Testing
In white-box testing, the tester has access to the software code and knows the internal workings of the web application's source code. It is therefore possible to check how an input value to the software produces the result value. This test gives access to source code and code errors that might otherwise remain hidden. The benefit of this process is that input values can be easily predicted and test scenarios can be constructed. However, white-box testing requires an experienced tester, and it cannot guarantee that the test specifications are met [38]. An example is the Pixy tool proposed by Jovanovic et al. [42].

2) Black-box Testing
In black-box testing, the tester treats the software as opaque and is unaware of how it operates internally. The software is tested only by checking the result values produced for input values chosen according to its functions. This method is advantageous for the tester because it requires neither source code information nor technical skills. However, because input values are tested for only a short time, it cannot uncover logical errors, and it is difficult to build test cases without clear functional specifications [38].
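The input/output style of black-box testing can be sketched as a small probe loop: attack strings are submitted, and only the response is inspected. Everything here is hypothetical: the two page functions stand in for a deployed application, and the probe list is far smaller than what a real scanner would use.

```python
import html
from typing import Callable, List

# Hypothetical probe strings; real scanners use much larger payload sets.
PROBES = [
    "<script>alert('x')</script>",
    "\"><img src=x onerror=alert(1)>",
]


def find_reflected_xss(handler: Callable[[str], str]) -> List[str]:
    """Return the probes that come back unmodified in the response body."""
    return [probe for probe in PROBES if probe in handler(probe)]


def vulnerable_search_page(q: str) -> str:
    # Stand-in application that echoes the query without encoding.
    return f"<h1>Results for {q}</h1>"


def hardened_search_page(q: str) -> str:
    # Stand-in application that encodes the query before echoing it.
    return f"<h1>Results for {html.escape(q)}</h1>"


vulnerable_hits = find_reflected_xss(vulnerable_search_page)
hardened_hits = find_reflected_xss(hardened_search_page)
```

The probe never looks at the implementation of either page, which is why this style of testing needs no source code but can miss logic errors that no probe happens to trigger.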

C. Static Analysis
Static analysis tools inspect source code, intermediate code, or binary code. Static analysis looks for potential vulnerabilities by inspecting the code of a web application without executing it [43]. The earliest papers in this area focus on classic vulnerabilities such as race conditions and buffer overflows. Later, this kind of analysis was extended to executable software without source code [44].
Programmers typically use static analysis tools during software development to check that the code is free of vulnerabilities. However, these tools only search for and detect vulnerabilities. They are programmed to look for patterns, using rules specific to the kind of analysis they perform. As a consequence, they do not detect newly discovered classes of vulnerabilities in source code, possibly leaving applications with bugs, that is, producing false negatives: vulnerabilities that exist but are not reported. False positives are also a concern, though in the sense of wasting time, since programmers must review the code looking for non-existent bugs. Static analysis techniques fall into two main classes, namely lexical analysis and semantic analysis [43]. These techniques are presented next, with more emphasis on taint analysis, a form of semantic analysis. Lexical analysis discovers web vulnerabilities by scanning the source code for library functions or system calls that are not considered trustworthy, known as sensitive sinks. Semantic analysis comprises three main techniques: control-flow analysis, type checking, and data-flow analysis. In the study in [45], the authors built WebSSARI, a static analysis scanner for finding vulnerabilities in web applications. This scanner provides intra-procedural and flow-sensitive analysis based on a lattice model. It extends the PHP language with two type states, tainted and untainted, and tracks the type state of every variable. Runtime sanitization routines are inserted at the points where tainted data reaches the sinks.
However, numerous language features, such as recursive functions and array elements, are not supported. The adequacy of sanitization routines can be checked using string-taint analysis. Wassermann and Su [46] use this approach, enhancing Minamide's [47] string analysis with taint support. It analyzes the input string to identify tainted substring values in order to prevent any untrusted script from being run by the JavaScript interpreter. Because detecting DOM-based XSS requires an understanding of the semantics of the web page, the technique cannot find it.
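The tainted/untainted type states described above can be illustrated with a toy flow-sensitive analysis over a straight-line program. This is a simplified sketch, not the algorithm of any surveyed tool: the three-operation instruction format (`assign`, `sanitize`, `sink`) and the function name `find_tainted_sinks` are assumptions made for the example, and branches, loops, and procedures are ignored.

```python
from typing import Dict, List, Set, Tuple

# (operation, target, sources); a hypothetical mini intermediate form.
Statement = Tuple[str, str, Tuple[str, ...]]


def find_tainted_sinks(program: List[Statement],
                       entry_points: Set[str]) -> List[Tuple[int, str]]:
    """Report (statement index, sink name) where tainted data reaches a sink."""
    tainted: Dict[str, bool] = {name: True for name in entry_points}
    findings: List[Tuple[int, str]] = []
    for index, (op, target, sources) in enumerate(program):
        if op == "assign":
            # The target is tainted if any source is tainted.
            tainted[target] = any(tainted.get(s, False) for s in sources)
        elif op == "sanitize":
            # The output of a sanitizer is considered untainted.
            tainted[target] = False
        elif op == "sink":
            if any(tainted.get(s, False) for s in sources):
                findings.append((index, target))
    return findings


program: List[Statement] = [
    ("assign", "name", ("$_GET",)),        # 0: input from an entry point
    ("assign", "query", ("name",)),        # 1: taint propagates
    ("sink", "mysql_query", ("query",)),   # 2: tainted data reaches a sink
    ("sanitize", "clean", ("name",)),      # 3: sanitizer output is untainted
    ("assign", "query", ("clean",)),       # 4: query is rebuilt from clean data
    ("sink", "mysql_query", ("query",)),   # 5: safe
]
findings = find_tainted_sinks(program, {"$_GET"})
```

Because the state of `query` is updated statement by statement, only the first sink call is reported, which is what flow sensitivity means in this setting.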
In another study, Xie and Aiken [48] perform a bottom-up analysis of basic blocks, procedures, and the whole program to detect SQL injection vulnerabilities. Using symbolic execution, their method automatically derives the set of variables that must be sanitized before a function is invoked. Nevertheless, their static analysis technique is bound to a specific set of language features. Halfond and Orso [49] suggested AMNESIA, a method that combines static analysis and runtime monitoring to prevent SQL injection. In another study, Shahriar and Zulkernine [26] put forward an information-theoretic method to prevent SQL injection, in which the entropy of each SQL statement is computed from the probabilities of its tokens.
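The idea of comparing statement entropies can be sketched as follows. This is an illustration under stated assumptions rather than the cited authors' formulation: tokens are obtained by naive whitespace splitting, and each token's probability is taken to be its relative frequency within the statement, giving the Shannon entropy H = -Σ p·log2(p).

```python
import math
from collections import Counter


def statement_entropy(sql: str) -> float:
    """Shannon entropy (in bits) of the token distribution of one statement."""
    tokens = sql.lower().split()          # naive tokenizer, assumption only
    if not tokens:
        return 0.0
    total = len(tokens)
    counts = Counter(tokens)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


expected = "SELECT * FROM users WHERE name = 'x'"
injected = "SELECT * FROM users WHERE name = 'x' OR '1' = '1'"

# A change in entropy between the statement the developer intended and the
# statement actually executed signals that the input altered its structure.
entropy_changed = not math.isclose(statement_entropy(expected),
                                   statement_entropy(injected))
```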
Other work includes the study by Thomas and Williams [50]. SAFERPHP uses static analysis to find specific semantic vulnerabilities in PHP code: denial of service due to infinite loops, and unauthorized operations in databases [51]. For denial of service, the tool uses taint analysis to find loops and then uses symbolic execution to determine whether attackers can prevent the loops from terminating. PHPSAFE (Fonseca and Vieira [52]; Nunes et al. [63]) uses taint analysis to search for vulnerabilities in PHP code. Shahriar and Zulkernine [53] proposed a method to detect vulnerabilities based on static analysis. Shar and Tan [54] proposed a method called XSSsafer to prevent cross-site scripting (XSS), and Scholte et al. [55] proposed IPAAS, a method to detect web vulnerabilities.
Yunhui and Zhang [56] describe another use of static analysis: finding remote code execution (RCE) vulnerabilities using path- and context-sensitive inter-procedural analysis. RCE attacks usually require manipulating both the string and non-string parts of the client-side inputs; hence, the authors propose an analysis that handles these parts in a coordinated and efficient manner across many PHP scripts and requests. They developed an algorithm that solves the resulting constraints in an iterative and alternating fashion, so that exploits can be constructed from the solution. In another study, Doupé et al. [57] created deDacota, an automated tool that cleanly separates code and data in web pages. Amira et al. [58] propose another static analysis of web applications that certifies a program's protection against session fixation attacks. Their tool, SAWFIX, is a PHP static analyzer that scans web applications for session fixation vulnerabilities. To the best of our knowledge, SAWFIX is the first analyzer that checks exhaustively for this kind of vulnerability, whereas alternative techniques offer only partial assurance limited to a small number of possible executions.
Khalid et al. [59] proposed and developed a vulnerability scanning tool, WUM (Web Unique Method), to detect and prevent major vulnerabilities, and showed how to identify unauthorized access by recognizing vulnerabilities. Developers can find potentially vulnerable web applications with the help of the WUM tool. WUM is reported to achieve a high level of accuracy. The test results show that the proposed WUM vulnerability scanner produces fewer false positives and detects more vulnerabilities. Another study, conducted by Viega et al. [60], presents a static vulnerability scanner for C and C++ code.
Deepa et al. [61] developed DetLogic for detecting various kinds of logic vulnerabilities, for example, parameter manipulation, access control, and workflow bypass vulnerabilities. DetLogic uses a black-box approach and models the intended behavior of the application as an annotated finite state machine, which is then used to derive constraints related to input parameters, access control, and workflows. The recent study of Nunes and Medeiros [62] addresses the problem of combining several automated static analysis tools (ASATs) to improve the overall detection of vulnerabilities in web applications, considering four development scenarios with different criticality goals and constraints. These scenarios range from low-budget to high-end (e.g., business-critical) web applications. The study of Long et al. [64] describes some of the most widespread web-based vulnerabilities, including SQLI, XSS, FI, and SI. It proposes an algorithm and improvements aimed at increasing the efficiency of detecting these vulnerabilities. The algorithm was used to develop a scanning tool that relies on several pieces of software, including UTLWebScanner, and can be compared with software providing similar functionality, such as Nessus. A 2020 study by Aliero et al. [65] focuses on enhancing the effectiveness of SQLIVS in order to detect and minimize the occurrence of false positives and false negatives. They propose an object-based approach for developing SQLIVS. Three web applications, each with different types of vulnerabilities, were used to test the accuracy of this approach. The validity of the proposed scanner was established experimentally, and an analytical evaluation compared it with other scanners developed in academia. The experimental results showed a significant improvement, as evidenced by a high level of accuracy.
The detailed research review of secure programming methods to find XSS, SQLi, CSRF, LFI/RFI, and some other vulnerabilities is presented in Table III.

1) Discussion
Static analysis tools mechanize code inspection, whether they operate on source, binary, or intermediate code. The objective of static analysis is to look for vulnerabilities in the code without running it [43]. Static application security testing (SAST) tools are used early in the development process, so they can identify vulnerabilities before the software is deployed. These tools examine the source code line by line, flag flaws, and provide the opportunity to fix them before they become true vulnerabilities on the web. However, SAST requires access to source code or binaries, which some organizations or individuals may not want to hand over to application testers. In addition, to detect vulnerabilities before deployment into the live environment, SAST usually needs to be integrated into the system development lifecycle, which can make implementation difficult.
Each SAST tool tends to focus on a subset of possible weaknesses. The advantages are the ability to detect vulnerabilities that are not visible without access to the source code, and the ability to report the exact location of any weakness in the source code, including the line number. Probably the greatest challenge in choosing the right tool when using SAST is the number of false positives produced. Among current work, the useful approaches in static analysis are Nunes et al. [63], Khalid et al. [59], and Nunes et al. [62]. The methods of secure programming are summarized year-wise in Fig. 4.
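To make the discussion concrete, the following is a minimal sketch of a static check, written in Python rather than PHP and not taken from any of the surveyed tools. It parses code without running it and reports the line number of every call to a sink whose first argument is not a plain string literal. The sink name `execute` and the sample code are assumptions chosen for illustration. The check is deliberately crude: it would also flag a harmless variable, which illustrates why false positives are a central concern for SAST tools.

```python
import ast

SINK_NAMES = {"execute"}  # hypothetical sink names, chosen for illustration


def find_suspicious_sinks(source: str) -> list[tuple[int, str]]:
    """Return (line number, sink name) for sink calls whose first argument
    is not a plain string literal. The code is parsed, never executed."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call) or not node.args:
            continue
        func = node.func
        # Method call (cursor.execute) or plain function call (execute).
        name = func.attr if isinstance(func, ast.Attribute) else getattr(func, "id", None)
        if name not in SINK_NAMES:
            continue
        first = node.args[0]
        is_literal = isinstance(first, ast.Constant) and isinstance(first.value, str)
        if not is_literal:
            findings.append((node.lineno, name))
    return sorted(findings)


# Sample input to the checker (hypothetical application code).
SAMPLE = '''
def lookup(cursor, user_id):
    cursor.execute("SELECT * FROM users WHERE id = " + user_id)
    cursor.execute("SELECT * FROM users WHERE id = %s", (user_id,))
'''

if __name__ == "__main__":
    for line, sink in find_suspicious_sinks(SAMPLE):
        print(f"line {line}: non-literal query passed to {sink}()")
```

Only the concatenated query is reported; the parameterized query on the following line passes the check because its first argument is a constant string.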

D. Fuzzing and Dynamic Analysis
Fuzzing and dynamic analysis is another method to identify web vulnerabilities. Unlike static analysis, this method does not analyze the web application code; instead, it verifies at runtime whether injected data triggers some vulnerability with respect to the application specification [38]. Fuzzing is thus viewed as a testing technique that finds bugs in software by feeding a program with unexpected inputs (Evron and Rathaus [66]; Sutton et al. [67]). In their study, Nguyen-Tuong et al. [68] modified the PHP interpreter to precisely taint user data at the granularity of characters and to track tainted user data at runtime. In another study, Haldar et al. [69] instrumented Java bytecode to extend the Java system with taint tracking support. Fuzzing techniques tend to be easy to apply because they do not require knowledge about the program under test; their interaction with the program is limited to the program's entry points (Jimenez et al. [70]). Miller et al. [71] described how they fed UNIX utilities with random inputs. Later tools, such as SPIKE [72], improve on this idea by giving applications malformed inputs, using a generic data structure to represent different data types [67]. In 2006, Kals et al. [74] built a black-box vulnerability scanner, SecuBat, to detect vulnerabilities. The tool uses a crawler to identify the web pages of the application, fills the form fields on those pages with attack vectors, and then analyzes the results to identify vulnerabilities. Fuzzers can be classified into two categories: black-box and white-box [67]. A black-box fuzzer implements the technique described so far. The black-box approach is largely independent of the application and does not require setting up the application.
Although black-box fuzzers are useful, they tend to find only shallow bugs (bugs that are easy to find) and usually have low code coverage (they do not exercise every possible value of a given variable), missing many relevant code paths and therefore many bugs. KameleonFuzz is a black-box fuzzer that searches for cross-site scripting vulnerabilities. It generates malicious inputs to exploit XSS [75].
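The black-box idea described above can be illustrated with a small sketch. It is not the implementation of any surveyed fuzzer: the target `fragile_parser`, the alphabet, and the trial count are all assumptions made for the example. The fuzzer knows nothing about the target except its entry point; it feeds random strings and records one triggering input per exception type.

```python
import random
import string


def fragile_parser(text: str) -> int:
    """Hypothetical target: parses 'key=value' and returns len(value)."""
    key, value = text.split("=", 1)  # raises ValueError when '=' is missing
    if key == "":
        raise KeyError("empty key")
    return len(value)


def random_input(rng: random.Random, max_len: int = 8) -> str:
    """Build a random string that mixes letters with a few special characters."""
    alphabet = string.ascii_letters + "=&%<>'\""
    return "".join(rng.choice(alphabet) for _ in range(rng.randint(0, max_len)))


def fuzz(target, trials: int = 500, seed: int = 0) -> dict[str, str]:
    """Feed random strings to target; keep one example input per exception type."""
    rng = random.Random(seed)
    crashes: dict[str, str] = {}
    for _ in range(trials):
        data = random_input(rng)
        try:
            target(data)
        except Exception as exc:  # any exception counts as a finding here
            crashes.setdefault(type(exc).__name__, data)
    return crashes


if __name__ == "__main__":
    for kind, example in fuzz(fragile_parser).items():
        print(f"{kind}: triggered by {example!r}")
```

Because inputs are purely random, failures that depend on a specific structure are reached only by chance, which is the shallow-bug limitation noted above.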
Another dynamic analysis technique, proposed by Antunes et al. [76], is attack injection, which is related to black-box fuzzing. A tool that implements this technique mimics the behavior of an attacker, continuously injecting malformed inputs while monitoring the application. The procedure can be repeated to gather all possible execution paths while checking some properties at runtime [66,67]. This is a form of white-box fuzzing, implemented in SAGE, which uses symbolic execution to exercise all possible execution paths of the program. However, because symbolic execution is slow, it does not scale to large programs, which makes it difficult to find deep and complex bugs [84].
In a different study, Ciampa et al. [77] analyzed the output of preliminary tests by pattern matching on the valid and error messages returned by the application. Information about the data stored in tables and fields is gathered and evaluated with a heuristic approach. The compiled information is then used to craft attack inputs that help identify vulnerabilities. In another study, Lekies et al. [78] used a taint-aware JavaScript engine to detect DOM-based XSS, whereas detecting these vulnerabilities is beyond the reach of the other available methods. White-box fuzzers apply symbolic execution and constraint solving to the source code (Duchène et al. [75]). The working principle of some white-box fuzzers is to perform dynamic symbolic execution on a given input. The fuzzer gathers data-flow paths and the constraints on inputs from the conditional branches encountered during the execution. The collected constraints are then negated and solved (constraint solving), and new inputs are injected to explore new execution paths. To deal with the difficulty of reaching deep bugs, Dowser (Haller et al. [79]) combines symbolic execution with dynamic taint analysis to identify buffer overflow vulnerabilities buried deep within programs.
In one more study, Duchène et al. [5] used a genetic algorithm to automatically generate malicious inputs that expose XSS vulnerabilities. Most of the available techniques, by contrast, are not able to reach the cause of DOM-based XSS vulnerabilities. In their study, Dahse and Holz [80] proposed the first known automated method that uses static code analysis to detect second-order vulnerabilities and multi-step attacks in web applications. Unsanitized data flows are identified by connecting the input and output points of data in persistent stores such as databases, including data entering and leaving the web server. Another dynamic analysis study, by Weissbacher et al. [81], presented ZigZag, a system that hardens JavaScript-based web applications against client-side attacks. ZigZag instruments client-side code and generates models that describe how, and with whom, client-side components interact. It performs dynamic invariant detection on security-sensitive code based on these models, and it can handle templated JavaScript, avoiding full re-instrumentation in cases where the JavaScript programs are structurally identical.
RanWang et al. [82] propose a detection framework (TT-XSS) for DOM-XSS by means of taint tracking at the client side. They rewrite all JavaScript features and DOM APIs to taint the rendering process of browsers, and attack vectors are derived to verify the vulnerabilities automatically. In a recent study, Park et al. [83] propose a vulnerability detection technique for developing and managing safe applications that can analyze and resolve these problems. They developed a prototype analysis tool based on their technique to test its vulnerability-detection ability, and report that the proposed technique is superior to existing ones. A 2020 study by Falana [85] combined dynamic analysis and fuzzy inference into a hybrid mechanism for detecting XSS attacks. The approach scans a website for possible SQL injections and then launches an attack vector using an HTTP request. It was tested on some active web applications, and the results showed that a large number of vulnerabilities were detected successfully. The detailed research review of dynamic analysis to prevent web vulnerabilities is shown in Table IV.

1) Discussion
Dynamic analysis is a useful technique to prevent web vulnerabilities. It does not analyze the source code of the website; instead, it verifies at runtime whether injected data triggers some vulnerability in the application. Although dynamic application security testing (DAST) tools offer risk analysis and assist remediation efforts, developers do not always know exactly where the vulnerabilities are located, nor which countermeasures to implement. DAST reporting is less than satisfactory in many cases. Among the existing studies, RanWang et al. [82] and Park et al. [83] are useful approaches in dynamic analysis. The methods of dynamic analysis are summarized year-wise in Fig. 5.

E. Hybrid Method (Static + Dynamic Analysis)
Hybrid analysis combines elements of static and dynamic analysis and provides a path toward more precise analysis. Di Lucca et al. [86] identified vulnerable web pages by studying the application's source code by means of control flow graphs. XSS attacks are chiefly caused by incorrect input sanitization functions, and some web applications also fail to filter suspicious entries in their inputs. In another study, Balzarotti et al. [87] argued that various subtle flaws can be introduced into a web application by faulty sanitization, and that these flaws cannot be detected by static or dynamic techniques alone. Their tool, Saner, uses hybrid analysis to check the validity of built-in and custom sanitization procedures. Saner first applies conventional static string analysis to model how user input is sanitized. Then, in order to flag weak or incorrect sanitization, a large set of malicious inputs is fed into the sanitization procedure under test.
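The dynamic step described above, feeding malicious inputs to a sanitization routine and checking what survives, can be sketched as follows. This is not Saner's implementation: the three attack strings, the `weak_sanitize` function, and the pass criterion (no `<` left in the output) are assumptions made for illustration.

```python
import html

# Small illustrative list of attack strings (an assumption, not Saner's corpus).
ATTACK_INPUTS = [
    "<script>alert(1)</script>",
    "<ScRiPt>alert(1)</ScRiPt>",
    "<img src=x onerror=alert(1)>",
]


def weak_sanitize(value: str) -> str:
    """Hypothetical custom sanitizer: strips only the lowercase script tag."""
    return value.replace("<script>", "").replace("</script>", "")


def failing_inputs(sanitize) -> list[str]:
    """Return the attack inputs whose sanitized form still contains '<'."""
    return [payload for payload in ATTACK_INPUTS if "<" in sanitize(payload)]


if __name__ == "__main__":
    print("weak_sanitize lets through:", failing_inputs(weak_sanitize))
    print("html.escape lets through:", failing_inputs(html.escape))
```

The custom routine blocks the textbook payload but misses the mixed-case and attribute-based variants, while the standard library's `html.escape` neutralizes all three under this criterion.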
Livshits et al. [88] and Lam et al. [89] combined model checking, static and dynamic analysis, and runtime detection to propose a holistic method, in which model checking is used specifically to enhance the precision of static analysis. Model checking can systematically search the state space of a finite-state system and verify the correctness of the system with respect to given conditions or properties. In addition, this method can automatically produce concrete attack vectors and exploit paths, with no false positives. In another hybrid analysis study, Van Acker et al. [90] found XSS vulnerabilities in Flash applications using a tool called FlashOver. Whereas previous work focused on discovering conventional XSS vulnerabilities in web applications, FlashOver identifies vulnerabilities in Rich Internet Applications (RIAs).
In 2012, Lee et al. [91] proposed a method for detecting SQL injection attacks (SQLIAs) that adopts both static and dynamic analysis. The source code is examined, and a model of each query is derived from it by removing the attribute values from the SQL query. After the values have been identified and removed, the resulting syntactic structure is stored. At runtime, the syntactic structure of each query is analyzed and compared with the stored structure, and this is how attacks are detected. The advantage of this scheme is that it can identify attacks while the application is running. In another study, Lee et al. [92] also applied both static and dynamic analysis to find vulnerabilities in web applications. In addition to this combination, the approach includes dynamic black-box testing based on fuzzing. Vogt et al. [93] and Stock et al. [94] deploy methods to protect the client-side browser from cross-site scripting (XSS).
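The value-removal idea attributed to Lee et al. [91] can be illustrated with a short sketch. It is a simplified reading of the description above, not the authors' implementation: the regular expression handles only single-quoted strings and bare integers, and the function names are assumptions. The structure of a query recorded from the source code is compared with the structure of the query observed at runtime.

```python
import re

# Matches single-quoted string literals and bare numbers (simplified SQL lexing).
_VALUE = re.compile(r"'[^']*'|\b\d+\b")


def skeleton(query: str) -> str:
    """Replace attribute values with '?' so that only the query structure remains."""
    return re.sub(r"\s+", " ", _VALUE.sub("?", query)).strip()


def is_injection(fixed_query: str, runtime_query: str) -> bool:
    """Flag the runtime query if its structure differs from the stored one."""
    return skeleton(fixed_query) != skeleton(runtime_query)


TEMPLATE = "SELECT * FROM users WHERE name = 'x' AND pin = 0"
BENIGN = "SELECT * FROM users WHERE name = 'alice' AND pin = 1234"
ATTACK = "SELECT * FROM users WHERE name = '' OR '1'='1' AND pin = 0"

if __name__ == "__main__":
    print(skeleton(TEMPLATE))
    print("benign flagged:", is_injection(TEMPLATE, BENIGN))
    print("attack flagged:", is_injection(TEMPLATE, ATTACK))
```

The benign query differs from the template only in its values, so both reduce to the same skeleton. The injected `OR '1'='1'` clause changes the structure and is therefore flagged.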
He et al. [95] propose a hybrid analysis method, MJDetector, that combines static and dynamic analysis to detect malicious JavaScript code. It first performs syntax analysis and dynamic instrumentation to extract internal features related to malicious code, and then performs classification-based detection to identify attacks. In particular, MJDetector can detect JavaScript attacks in current web pages with a high accuracy of 94.76% and deobfuscate obfuscated code of specific types with 100% accuracy, whereas the baseline method detects with an accuracy of only 81.16% and has no deobfuscation capability. A recent study by Le et al. [96] proposes E-THAPS, which implements a novel detection mechanism and improved SQL injection, cross-site scripting, and vulnerability detection capabilities. For malicious web shell detection, taint analysis and pattern matching methods were chosen for implementation in GuruWS. The detailed research review of hybrid analysis to find XSS, SQLi, CSRF, LFI/RFI, and some other vulnerabilities is shown in Table V.
TABLE V. HYBRID (STATIC AND DYNAMIC) ANALYSIS EXISTING STUDIES

1) Discussion
Hybrid analysis combines elements of static and dynamic analysis and provides a path toward more precise analysis. Hybrid analysis (also called correlation) combines DAST and SAST to correlate and verify their results. Issues identified by dynamic analysis can be traced to the offending line of code, and SAST issues can be automatically prioritized using DAST information. The challenge with hybrid analysis is that DAST relies on data being reflected in the browser, so a SAST data flow that is not reflected in the browser cannot be correlated with a DAST issue. Among the existing studies, Le et al. [96] and Stock et al. [94] are useful approaches in hybrid analysis. The methods of hybrid analysis are summarized year-wise in Fig. 6.

F. Machine Learning Technique
Machine learning is used in several application areas (e.g., computer games and robotics). Application security on the web is based on a diverse set of techniques, as presented in Fig. 7. Machine learning enables computers to acquire knowledge without it being explicitly programmed, and then to use the acquired knowledge to take actions or decisions. Computers must be guided to learn before taking actions: they need a data set of examples, the training data set, from which to extract knowledge. A task is called classification if it aims to assign input objects to classes. A classifier is an automatic method that performs classification. It processes the data set to collect features, classifies the instances, and provides a result based on what it has learned. Email spam filtering is a basic example [97]. Machine learning is thus a different method to prevent web vulnerabilities.

1) Vulnerability Prediction Models Based on Software Metrics
Classification is a form of data analysis in which a model predicts an outcome. Here the model is used to predict the class labels of input data; because the class label of each training instance is known, this is referred to as supervised learning [2].

2) Anomaly Detection Approaches
This class of work uses unsupervised learning to extract a model of normal patterns from program source code and to recognize vulnerabilities as deviations from the usual dominant patterns and rules. The model is built without using class labels in the data set [2].

3) Vulnerable Code Pattern Recognition
This is another machine learning method: it extracts specific patterns of vulnerable code from the data set and uses pattern matching to prevent vulnerabilities in web applications [2].

4) Miscellaneous Approaches
This category covers work in AI and data science on software vulnerability analysis and discovery that does not fit any of the previously mentioned classes.
A data set is described by attributes, and the set of all instances forms the training data set. Attributes fall into two categories: numerical and categorical. Numerical attributes are either discrete or continuous, whereas categorical attributes take non-numerical, distinct values. Binary attributes are a special kind of categorical attribute with two possible values, true or false [98]. Classification is a form of data analysis that extracts models, known as classifiers, which predict categorical (discrete, unordered) class labels. Classification involves two phases: (1) learning, where the classification model is built; and (2) classification, where the model predicts class labels for given input data. In supervised learning, each training instance is labeled; the learning is supervised in the sense that the classifier is told to which class each training instance belongs. In unsupervised learning, by contrast, the class label of each training instance is unknown, and the set of classes to be learned is not known in advance [99]. Each classifier uses a machine learning algorithm that depends on the learning type (supervised or unsupervised). The training data set is used so that the classifier can correctly classify input data, and the selection of classifiers depends on the characteristics of the data set [100].
Many researchers have focused on enhancing the efficiency and precision of different techniques to detect web vulnerabilities. Support Vector Machines (SVM), J48, Artificial Neural Networks, and many other classification techniques, such as C5.0, Naïve Bayes, and linear regression, have been trained on different data sets in order to detect web vulnerabilities. They are mainly grouped into two categories: probabilistic and machine learning. These techniques provide algorithms that have proved fruitful in coping with web vulnerability issues. The selected algorithms are evaluated with four metrics: precision, recall, F1-score, and accuracy.
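For reference, the four evaluation metrics named above can be computed from the counts of a binary confusion matrix, as in the following sketch. The counts in the usage example are invented for illustration and do not come from any surveyed study; "positive" here means "classified as vulnerable".

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Compute precision, recall, F1-score, and accuracy from confusion counts.

    tp: vulnerable items correctly flagged    fp: safe items wrongly flagged
    fn: vulnerable items missed               tn: safe items correctly passed
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if precision + recall else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


if __name__ == "__main__":
    # Hypothetical counts: 40 true positives, 10 false positives,
    # 20 false negatives, 30 true negatives.
    for name, value in classification_metrics(40, 10, 20, 30).items():
        print(f"{name}: {value:.3f}")
```

Precision penalizes false positives and recall penalizes missed vulnerabilities, so the two are usually reported together; the F1-score is their harmonic mean.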
Neuhaus et al. [101] proposed the Vulture tool. This tool automatically mines existing vulnerability databases and version archives, and uses the mined information to identify the past vulnerabilities of components. The identified components are then ranked by how vulnerable they are. This ranking serves as a basis for investigating the factors that make a component vulnerable. For instance, a study of the Mozilla vulnerability history revealed the unexpected result that components with one past vulnerability are mostly not affected by further vulnerabilities. In addition, components that share the same function calls are prone to vulnerabilities.
Machine learning has been used in several works to measure software quality by collecting attributes that reveal the presence of software defects. Code type, lines of code, code complexity metrics, and object-oriented features are used as attributes in various studies. Some studies go further and use the same kinds of metrics to predict the presence of vulnerabilities in source code. Other factors, such as past vulnerabilities, function calls, and code complexity, are also used in various studies. These studies do not aim to identify bugs and their exact location; instead, they assess software code according to the prevalence of defects and vulnerabilities [102].
Wang Yanya et al. [103] studied a rapid density clustering method, called DSVRDC, for identifying vulnerabilities in software. Vulnerability sequences are clustered by density, and the resulting classes are examined using the s-order difference. In the first stage, density clustering based on Rd-entropy is used to cluster the vulnerability sequences. In the second stage, the vulnerabilities of the software under investigation are compared using the s-order difference: the s-order difference between each sequence and every cluster is calculated, and the clusters that contain the vulnerabilities of the software under investigation are determined.
In their study, Yamaguchi, Lindner, and Rieck [104] proposed a method for examining source code to detect vulnerabilities. The method systematically extracts the API symbols of each function using lexical analysis. The API symbols are then embedded in a vector space, and principal component analysis is applied to identify dominant API usage patterns. Each function is then mapped to these API usage patterns, which guides code auditing toward likely vulnerabilities by using known vulnerable functions as a starting point. In 2012, Scholte et al. [55] combined static analysis and machine learning in IPAAS in order to protect against SQLI and XSS vulnerabilities. The collected information is used to infer validation policies for input fields, which help block attacks at runtime.
In another study, Wijayasekara et al. [105] used a text mining technique to extract potential vulnerabilities from public bug databases. The method creates a term-document matrix from the bug reports, and the resulting feature vectors are then classified as either a normal bug or a vulnerability. The authors also report that the proportion of concealed vulnerabilities increased during the preceding two years, by 53% for Linux and 10% for MySQL.
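A term-document matrix of the kind mentioned above can be built in a few lines. The sketch below is a generic illustration, not the pipeline of Wijayasekara et al. [105]: the tokenizer (lowercase alphabetic words) and the three sample bug descriptions are assumptions. Each row corresponds to one document and each column to one vocabulary term, so each row can serve as a feature vector for a classifier.

```python
import re
from collections import Counter


def term_document_matrix(documents: list[str]) -> tuple[list[str], list[list[int]]]:
    """Return (vocabulary, matrix), where matrix[i][j] counts how often
    vocabulary[j] occurs in documents[i]."""
    tokenized = [re.findall(r"[a-z]+", doc.lower()) for doc in documents]
    vocabulary = sorted({term for tokens in tokenized for term in tokens})
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        matrix.append([counts[term] for term in vocabulary])
    return vocabulary, matrix


# Hypothetical bug report summaries, invented for this example.
REPORTS = [
    "Buffer overflow in parser",
    "Parser crash on null pointer",
    "Overflow overflow",
]

if __name__ == "__main__":
    vocab, rows = term_document_matrix(REPORTS)
    print(vocab)
    for row in rows:
        print(row)
```

In practice the raw counts are often reweighted (for example with TF-IDF) and the vocabulary is pruned before classification; those steps are omitted here.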
In 2012, Nunan et al. [106], [107] likewise extracted web document and URL-based features from a large collection of XSS attacks to study how to characterize the attacks and classify new potential XSS attack vectors as malicious. From this large collection, they identified a set of features (obfuscation-based, suspicious patterns, and HTML/JavaScript schemes) that permit the accurate classification of XSS in web pages. They then automatically analyze web pages to detect XSS attacks in a three-stage process: first, detection and extraction of obfuscated features; second, decoding of the web pages and features; and third, classification of the pages by means of a machine learning algorithm.
Standard classifiers and other common data-mining methods only look for the presence of attributes, without relating them to each other or considering their order. This can lead to wrong classification and prediction. In recent years, this aspect has been studied in order to improve accuracy. Khosronejad [108] also aims to reduce the training time during the construction of a hidden Markov model (HMM). The proposal is to build a model based on frequent common patterns extracted from trace events rather than taking each trace on its own. The traces are system calls, since these can indicate the likelihood of defects, through abnormal function calls or illegal use of resources as a result of attacks. Bhole et al. [109] compare the results of an HMM and standard classifiers for the detection of anomalies performed by an IDS. They conclude that the HMM performs better than the others. In another significant study, Shar and Tan [110] set out to detect web vulnerabilities and to classify various input sanitization methods into different classes as a set of static code attributes, using a tool called PHPMINER-1. Shar, Tan, and Briand [111] evaluated dynamic attributes as a complement to static attributes for vulnerability prediction. They used supervised learning and built prediction models based on both classification and clustering to predict vulnerabilities. These can work in the presence or absence of labeled training data, respectively. The authors concluded that the approach is applicable even without labeled training data.
In another study, Soska and Christin [112] aim to predict whether a website will become malicious in the future, before it is actually compromised. Machine learning is well suited to this task because features about the site's server and hosting details are easily retrieved. The features extracted about a website include the structure of its file system (e.g., directory names showing that the site was built with a CMS), the structure of the page (e.g., whether the page was built from a CMS template), and keywords (e.g., the presence of certain HTML tags). Based on the occurrence of these features, they predict whether a site will be compromised. In another study using machine learning, Howard et al. [25] proposed the Psigene system, which extracts features from a large collection of SQL injection attacks to study how to characterize them.
In a 2014 study [113], Fabian et al. proposed a technique for efficiently analyzing large amounts of source code to find vulnerabilities. The authors presented the code property graph, a new representation of source code. This graph merges concepts of classic program analysis, namely abstract syntax trees, control flow graphs, and program dependence graphs, into a joint data structure. With it, one can characterize integer overflows, buffer overflows, format string vulnerabilities, and memory disclosures. The proposed joint structure, together with its graph representation, makes it possible to model all of the vulnerability types mentioned above. The credibility of the technique was demonstrated by implementing it on a well-known graph database, where it identified eighteen previously unknown vulnerabilities in the source code of the Linux kernel. A technique for detecting RCE, XSS, LFI/RFI, and SQLi was developed by Singh et al. [114]; this study aimed to improve the accuracy of current vulnerability detection schemes. A recent study by Grieco et al. [115] suggested a method for predicting vulnerabilities in connection with fuzzing. The approach targets memory corruption and works by analyzing binary programs. The features extracted by this analysis are then used to train machine learning classifiers. The resulting tool, VDISCOVER, is used to check whether a test case is likely to contain a vulnerability. A total of 1039 programs were examined with a bug hunter, extracting 138308 execution sets in order to statistically analyze 76083 different function calls. The method proved effective in the reported tests, and certain memory errors were confirmed.
In another study, Medeiros et al. [116] proposed a new approach that uses an extraction algorithm to infer the basic and contextual structure of source code in order to identify vulnerabilities in web applications. The authors also discuss context-sensitive security flaws in the most prominent existing cross-site scripting (XSS) detection techniques, which were found unable to handle user input included in output statements. Walden, Stuckman, and Scandariato [117] compared two important kinds of features, software metrics and text mining, for predicting web vulnerabilities. The authors built prediction models for PHP applications, and both techniques were cross-validated. Three applications, Drupal 6.0, PHPMyAdmin 3.3, and Moodle 2.0, were selected for the cross-validation test. The validation was performed twice, using software metrics and term frequencies, respectively, to predict vulnerabilities. The results were then compared, and the quality of the two kinds of predictors was analyzed.
In their study, Yun et al. [118] presented VULPREDICTOR, a technique that analyzes software metrics and text mining features to predict vulnerable files, and that builds a composite prediction model. VULPREDICTOR first builds six base classifiers on the files under observation, which are characterized by their text features and software metrics, and then constructs a meta classifier on top of them. The method runs in two stages: model construction and prediction. In the model construction stage, VULPREDICTOR builds a composite model from training source code files with known labels (vulnerable or not). In the prediction stage, this model is used to predict whether a new source code file is vulnerable. In another study, Abunadi, Ibrahim, and Mamdouh [119] conducted an empirical study that examines the effectiveness of cross-project prediction for predicting vulnerabilities in software. Open-source data sets were used, and five well-known classifiers were tested. The results of these classifiers were compared in cross-project vulnerability prediction scenarios.
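The idea of a meta classifier that combines several base classifiers can be sketched with simple majority voting. This is only one possible combination rule, and the source does not state which rule VULPREDICTOR uses. The three rule-based "classifiers", their thresholds, and the feature names below are all hypothetical, chosen so that the example runs without a training step.

```python
from typing import Callable, Sequence

# A base classifier maps a file's features to True (vulnerable) or False.
Classifier = Callable[[dict], bool]


def size_rule(features: dict) -> bool:
    """Hypothetical metric-based rule: very long files are suspicious."""
    return features["loc"] > 300


def token_rule(features: dict) -> bool:
    """Hypothetical text-based rule: the file uses a raw query function."""
    return "mysql_query" in features["tokens"]


def complexity_rule(features: dict) -> bool:
    """Hypothetical metric-based rule: high cyclomatic complexity."""
    return features["cyclomatic"] > 15


def meta_predict(classifiers: Sequence[Classifier], features: dict) -> bool:
    """Majority vote: vulnerable if more than half of the base classifiers agree."""
    votes = sum(1 for classify in classifiers if classify(features))
    return votes * 2 > len(classifiers)


BASE = [size_rule, token_rule, complexity_rule]

if __name__ == "__main__":
    risky = {"loc": 500, "tokens": {"mysql_query", "echo"}, "cyclomatic": 4}
    plain = {"loc": 100, "tokens": {"echo"}, "cyclomatic": 20}
    print("risky file flagged:", meta_predict(BASE, risky))
    print("plain file flagged:", meta_predict(BASE, plain))
```

A learned meta classifier would instead be trained on the outputs of the base classifiers; voting is shown here only because it makes the combination step explicit.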
Anbiya et al. [120] focused on using PHP native tokens and the Abstract Syntax Tree (AST) as features and then manipulating them to obtain the best feature set. They pruned the AST to discard unusable nodes or subtrees and then extracted the node-type tokens with the Breadth-First Search (BFS) algorithm. They achieved the highest recall, 92%, with PHP tokens as features and Gaussian Naïve Bayes as the machine learning classifier. In 2018, Kronjee et al. [121] built a tool called WIRECAML and compared it with other tools for vulnerability detection in PHP code. The tool performed best for web vulnerabilities. They also applied their approach to various open-source software applications.
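The pruning and BFS extraction step can be illustrated with a short sketch. Because no PHP parser is assumed to be available, the example uses Python's built-in ast module as a stand-in for a PHP AST, and the set of pruned node types is a hypothetical choice; it is not the pruning rule used in [120].

```python
import ast
from collections import deque
from typing import List

# Hypothetical choice of "unusable" nodes: expression-context markers.
PRUNED_TYPES = (ast.Load, ast.Store, ast.Del)


def bfs_node_type_tokens(source: str) -> List[str]:
    """Return AST node-type names in breadth-first order, skipping pruned nodes."""
    tree = ast.parse(source)
    queue = deque([tree])
    tokens = []
    while queue:
        node = queue.popleft()
        if isinstance(node, PRUNED_TYPES):
            continue  # prune: drop this node together with its subtree
        tokens.append(type(node).__name__)
        queue.extend(ast.iter_child_nodes(node))
    return tokens


# Python stand-in for a statement that concatenates user input into a query.
tokens = bfs_node_type_tokens("query = 'SELECT ' + user_input")
print(tokens)
```

The resulting token sequence can be turned into a count vector and passed to a classifier such as Gaussian Naïve Bayes.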
Smitha et al. [122] investigated the performance of algorithms such as decision forest, neural networks, support vector machine (SVM), and logistic regression (LR). Their performance was evaluated using standard performance metrics on HTTP CSIC 2010, a web intrusion detection dataset. The experimental results show that SVM and LR outperformed their counterparts. The predictive workflows were created using Microsoft Azure Machine Learning Studio (MAMLS), a scalable machine learning platform that provides an integrated development environment for data scientists.
The study conducted by Noman et al. [123] builds six classifiers on a training set of labeled files represented by their software metrics and text features. The authors then build a meta classifier that combines the six underlying classifiers. NMPREDICTOR is evaluated on datasets of three web applications, which together contain 223 high-quality vulnerabilities found in PHPMyAdmin, Moodle, and Drupal. Kudjo et al. [124] conducted an empirical study on three open-source vulnerability datasets, namely Drupal, Moodle, and PHPMyAdmin, using five machine learning algorithms. Surprisingly, they found that for all three datasets the models gave a significant increase in accuracy and precision over the benchmark study. Zhou et al. [125] presented an improved algorithm that generates test cases. This algorithm uses a new mutation method that divides test cases into functional units to preserve their semantic structure. The results showed that their algorithm not only generated better test cases than the standard genetic algorithm and the adaptive genetic algorithm but also detected web vulnerabilities with high accuracy. In another machine learning study, Tang et al. [126] used statistical analysis of normal and SQL injection data to design eight feature types and train a machine learning model. The accuracy of this model was 99%. Williams et al. [127] proposed an integrated data mining framework capable of detecting the evolution of web vulnerabilities. The framework combines three techniques: a Topically Supervised Evolution Model, a Diffusion-Based Storytelling technique, and prediction models. A series of experiments showed that the proposed framework not only discovered the evolution of web vulnerabilities but also predicted them with a high level of accuracy. The methodology proposed by Calzavara et al. [128] used machine learning to detect web application vulnerabilities. They implemented this methodology in Mitch, the first machine-learning-based solution for detecting cross-site request forgery vulnerabilities. The detailed research review on machine learning to prevent web vulnerabilities is shown in Table VI.
1) Discussion
Machine learning is a quite different approach with a wide range of applications, and it can also be used to find web vulnerabilities in source code. Detecting zero-day web vulnerabilities is very important in today's collaborative work environment, and new approaches are always desirable alongside the existing ones. Many researchers have focused on improving the efficiency and precision of different techniques for detecting web vulnerabilities. Support Vector Machines (SVM), J48, Artificial Neural Networks, and many other classification techniques such as K-Nearest Neighbor, C5.0, Naïve Bayes, and linear regression have been trained on different datasets in order to detect web vulnerabilities. Furthermore, most researchers evaluate their results with machine learning measures such as precision, recall, F1-score, and accuracy. Among the existing studies, Noman et al. [127] and Medeiros et al. [12] are useful approaches in machine learning.
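The evaluation measures named in this discussion follow directly from the confusion matrix of a binary classifier. The sketch below shows the standard definitions; the label vectors are invented for illustration and do not come from any of the surveyed studies.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute precision, recall, F1-score, and accuracy for binary labels
    (1 = vulnerable, 0 = not vulnerable)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(y_true) if len(y_true) else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Hypothetical ground truth and predictions for eight files.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 0]

metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because vulnerable files are usually a small minority, accuracy alone can be misleading, which is one reason precision, recall, and F1-score are reported alongside it.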

V. CONCLUSION
This study provides a comprehensive survey of existing methods in the research area of web application vulnerabilities. We highlighted several open issues that still need to be addressed. In this paper, we reviewed the classification and detection of web vulnerabilities with different approaches such as static analysis, dynamic analysis, hybrid analysis, scanners that combine the three analyses, and machine learning. We also reviewed various types of web vulnerabilities under different classifications, including input validation vulnerabilities and improper session management, together with the methods to detect them. A lot of work has been done to address such issues. The best approach we identified to secure a web application for secure programming is Agrawal