Enhancement of KaPoW Plugin to Defend Against DDoS Attacks

DDoS attacks are among the hardest attacks to detect and mitigate. This paper introduces two quantitative models that use client puzzling to detect and thwart application-layer DDoS attacks. We simulated the models, which use probabilistic metrics to penalize malicious users and prevent them from launching a DDoS attack, while offering a stable environment to normal users and decreasing the number of false positives and false negatives.
Keywords: Application Security; Client Puzzling; DDoS; Metrics; PHP; Puzzle


I. INTRODUCTION
Distributed Denial of Service (DDoS) attacks are one of the most rapidly growing threats to the Internet ecosystem. Their volume has been increasing almost exponentially, leaving servers always in need of more bandwidth. Nowadays, a DDoS attack may exceed 100 Gbps, which is 10 times the size of most Internet backbone pipes.
DDoS is DoS taken to a whole new level through diversification, obfuscation and distribution of the attack origin. A DDoS attack is launched from many computers against one or more victims to prevent legitimate users from accessing network resources [1,2].
In this paper, we apply the Client Puzzling approach to defend against DDoS attacks. It is a Proof of Work (PoW) [7] technique in which the client proves that it has done some work by solving medium to hard puzzles, in return for the resources it needs from the server and as evidence of its legitimacy [8,9].

A. Client Puzzling
Client Puzzling is a protection technique that can be integrated into any web application with minimal alterations to the infrastructure and software components. Dwork and Naor were the first to suggest using client puzzles to limit junk email [10,11]. Unfortunately, client puzzling has shortcomings when facing adversaries with parallelization capabilities or legitimate flash crowds [8].

B. Puzzles Difficulty Calculations
The puzzle difficulty can be determined based on the server load, based on the client behavior, or fixed at a single value [8]. When the difficulty is based on the server load, it increases for all clients as the server runs out of resources, regardless of whether they are malicious, which makes this option the worst for legitimate clients. It is better to determine the puzzle difficulty based on the client's behavior, so that attackers are penalized with harder puzzles than normal clients receive. However, this requires the server to track the client's behavior using client-identifying information, such as the client's IP address or the assigned nonce tokens. With fixed difficulty, clients are not required to solve a puzzle under normal conditions; when the server's resource usage rises above a certain threshold, all clients receive a puzzle with a predefined fixed difficulty.

C. KaPoW
KaPoW is a PoW-based technique implemented as libraries that web applications can use to enhance the performance of anti-spam techniques such as CAPTCHA and spam filters [13].
There are two implementations of KaPoW for protecting web content:
• The KaPoW Apache module, known as Mod_KaPoW. It is an Apache2 module that is almost transparent to the application. It embeds the puzzle and the solver mechanism in a way that changes the application on the fly [14,15].
• The KaPoW plugin, a PHP library that allows puzzles to be embedded in HTML tags, solved by JavaScript and verified by a server-side component [16]. Two existing applications of the KaPoW plugin are the KaPoW webmail filter and the KaPoW anti-spam filter.
KaPoW calculates the puzzle difficulty based on multiple metrics. The total score is calculated by summing the scores of all the metrics:
$$\text{Score} = S_1 + S_2 + \dots + S_n \qquad (1)$$
where n is the number of metrics used. The user receives a puzzle with difficulty (Dc) based on this score. The difficulty is calculated using equation (2), where m is an arbitrary empirical constant [16].

D. KaPoW Modules
Like any client puzzling system, the KaPoW plugin consists of three components: the puzzle issuer and verifier on the server side and the puzzle solver on the client side. The issuer generates the puzzle and delivers it to the client. After the client receives the puzzle, the solver generates random candidate solutions until a correct one is found and sent to the server. Finally, the verifier accepts or rejects the solutions sent to the server based on their correctness, legitimacy and freshness [16]. Fig. 1 describes the system architecture of KaPoW Guestbook. The same architecture is followed in the proposed models.
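The interplay of issuer, solver and verifier can be illustrated with a minimal Python sketch. It is not KaPoW's code: KaPoW Guestbook uses modified time-lock puzzles, whereas this sketch swaps in a simple hash-based puzzle (find an answer whose SHA-256 digest, combined with the nonce, is divisible by the difficulty). The function names and the puzzle form are assumptions made for illustration, and only the correctness check is shown, not the legitimacy or freshness checks.

```python
import hashlib
import os
import random


def issue_puzzle(difficulty: int) -> dict:
    """Issuer (server side): pair a fresh random nonce with a difficulty."""
    return {"nonce": os.urandom(16).hex(), "difficulty": difficulty}


def _digest(nonce: str, answer: int) -> int:
    """Hash the nonce together with a candidate answer into an integer."""
    data = f"{nonce}:{answer}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def is_valid(puzzle: dict, answer: int) -> bool:
    """Verifier (server side): a single hash evaluation per submission."""
    difficulty = puzzle["difficulty"]
    # A difficulty of 0 (or 1) means no work is required.
    return difficulty <= 1 or _digest(puzzle["nonce"], answer) % difficulty == 0


def solve(puzzle: dict) -> int:
    """Solver (client side): try random candidates until one is accepted.

    On average this takes about `difficulty` attempts.
    """
    while True:
        answer = random.getrandbits(64)
        if is_valid(puzzle, answer):
            return answer
```

The asymmetry is the point of the design: the client performs many hash evaluations on average, while the server performs one per submitted answer.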

E. KaPoW Guestbook
KaPoW Guestbook [13] is an open source project released under the GPLv2 license and implemented in PHP. It solves modified time-lock puzzles using a JavaScript solver that is called via AJAX, which allows the puzzle to be solved in the background.
KaPoW Guestbook can be integrated into any application because of its modularity: it is easy to add after the application has been developed, and no changes to the application's core modules are needed.
KaPoW Guestbook's browser side displays the comment form, asking the user to enter a name, e-mail, comment and IP address. The user then submits the form, and a new puzzle is requested.
When the server receives a request for a new puzzle, it invokes SpamAssassin to detect whether the content contains any spam. The server then checks the blacklists and returns the threat score. The difficulty (Dc) is calculated based on (2). After calculating Dc, the server returns the server nonce and the difficulty as a response to the JavaScript. The client tries to solve the hash-based puzzle using the information sent by the server. When the answer is found, the client submits it to the server. If the server verifies that the answer is correct, it accepts the new message and displays all the messages; otherwise, it rejects the submission and displays only the old messages.

III. PROPOSED MODEL "DDOS_KAPOW"
Client Puzzling has proven its capability and efficiency in defending against DoS and spam attacks, which is why we decided to apply it to defend against DDoS attacks.
After examining the KaPoW Guestbook open source code, we found some discrepancies that prevented it from running. We fixed the problem with SpamAssassin and, to save bandwidth, applied caching to the message content. To make the code run, we also replaced the blacklists used by the authors, which no longer exist, with the "DShield Blacklist".
KaPoW Guestbook's original implementation used two metrics, a spam filter and an IP blacklist, to detect forum spam attacks. "DDoS_KaPoW" is an implementation of KaPoW Guestbook that uses the same architecture, with enhancements to the individual modules and the metrics used, to adapt them to the setup environment and to defend against DDoS attacks instead of spam attacks.
We modified the user interface and the core engine to include a Resource Intensive Operation, "RIO" (background calculations that make the post-message action perform some processing), to simulate a CPU-intensive operation. We replaced the spam filter metric with the processor load, since it is known as one of the most important indicators of a DDoS attack. Since DDoS_KaPoW focuses on the freshness of the client puzzles, we made the nonce Nc random for each request, and it is submitted with the answer for verification instead of being constant as in KaPoW Guestbook. A constant Nc, as in KaPoW Guestbook, allows attackers to generate multiple requests using the same answer. We added the capability to enable and disable Client Puzzling, which helps in evaluating the models. We also added an Internet connectivity check, because regularly updated services such as the blacklist require an Internet connection.
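The freshness property described above, a random nonce per request that cannot be reused with the same answer, can be sketched as follows. This is an illustrative Python sketch rather than the DDoS_KaPoW implementation; the class name `NonceStore` and its methods are hypothetical.

```python
import secrets
from typing import Dict, Optional


class NonceStore:
    """Tracks nonces the server has issued and that are not yet redeemed."""

    def __init__(self) -> None:
        # Maps each outstanding nonce to the difficulty it was issued with.
        self._outstanding: Dict[str, int] = {}

    def issue(self, difficulty: int) -> str:
        """Create a fresh random nonce for one request."""
        nonce = secrets.token_hex(16)
        self._outstanding[nonce] = difficulty
        return nonce

    def redeem(self, nonce: str) -> Optional[int]:
        """Return the bound difficulty once, then forget the nonce.

        A replayed or unknown nonce yields None, so a previously
        accepted answer cannot be submitted a second time.
        """
        return self._outstanding.pop(nonce, None)
```

With a constant nonce, the second call to a check like `redeem` would succeed again, which is exactly the replay that a per-request nonce is meant to rule out.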
Finally, DDoS_KaPoW checks whether the user is an attacker by checking the processor load and the IP blacklist. If the processor load is higher than a predefined threshold, the score is increased by 8. If the user's IP address exists in the blacklist, the score is increased by 5. At the end, the puzzle difficulty is calculated using (2).
Z-PoW is a variant of the client puzzling implementation in Mod_KaPoW and the KaPoW plugin. It takes the concept of the client's maliciousness score, and the equation for calculating a combined score from the metrics, from the DoS protection in Mod_KaPoW, and places them in a framework similar to that used in the KaPoW plugin. In addition, Z-PoW proposes multiple new metrics to detect DDoS.
A. Architecture
Fig. 2 displays the flowchart of Z-PoW's browser side. First, the browser reads the puzzle from the server. It then generates a possible solution. If this solution is incorrect, it tries another one. If the solution is correct, the browser reads the operation argument and sends it to the server along with the difficulty and the answer. If the operation is not completed, the error message received from the server is displayed.
A second flowchart displays Z-PoW's server side. The server reads the operation, which can have three values: "null", "preview" and "submit". If it is "null", the server displays all the old operations.
If it is "preview", the server first checks whether client puzzling is switched on or off. When client puzzling is off, it simply sets the difficulty to zero. When client puzzling is on, the score is initialized to zero, and the server performs several checks to calculate the score based on different metrics. These metrics check whether the request comes from The Onion Router (ToR), lacks a referrer, comes from a blacklisted IP address, comes from a country that is not permitted, comes through a proxy, carries a bad user agent, or arrives while the processor load is high. Each metric that is true increases the score by 1, 6, 5, 4, 1, 1 and 8 respectively. When the calculated score is less than the threshold, the difficulty is zero. When the calculated score is higher than the threshold, the server returns a puzzle with a calculated difficulty.
Finally, when the operation is "submit" and client puzzling is on, the server reads the IP address, the answer, the difficulty and the operation. It also generates a puzzle based on the IP address and the given difficulty, and then validates the answer. If the answer is wrong, an error message is displayed. If the answer is correct, the operation is executed and the argument is saved.
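The score calculation in the "preview" branch can be sketched in Python as below. The weights are the ones listed above; the metric names, the `flags` dictionary and the example threshold value are hypothetical, and treating a score equal to the threshold as "no puzzle" is an assumption, since the text only covers the cases below and above the threshold.

```python
# Weight added to the score when the corresponding check is true.
# Values come from the text; the key names are illustrative.
WEIGHTS = {
    "tor": 1,
    "missing_referrer": 6,
    "blacklisted": 5,
    "country_not_permitted": 4,
    "proxy": 1,
    "bad_user_agent": 1,
    "high_processor_load": 8,
}


def maliciousness_score(flags: dict) -> int:
    """Sum the weights of every metric that fired, as in equation (1)."""
    return sum(weight for name, weight in WEIGHTS.items() if flags.get(name, False))


def needs_puzzle(score: int, threshold: int) -> bool:
    """A puzzle is issued only when the score exceeds the threshold.

    Assumption: a score exactly equal to the threshold gets no puzzle.
    """
    return score > threshold


# Example: a blacklisted client arriving while the processor load is high.
example = {"blacklisted": True, "high_processor_load": True}
example_score = maliciousness_score(example)  # 5 + 8
```

For a request that trips no metric the score stays at zero, so the difficulty is zero and no work is requested from the client.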

B. Attack Identification Metrics
Based on [4,13,16,19], many factors were identified that help indicate the presence of a DDoS attack or that the user is potentially an attacker. The following factors are used as the puzzle metrics, chosen based on their disadvantages and their difficulty of implementation. Two of them are already used in DDoS_KaPoW: the presence of the client's IP address in a blacklist and an increase in the server's processing load. Another factor is the client's geographic location, which identifies clients who would not normally access the server. The absence of the referring URL indicates that the client is most likely a bot or a malicious user. If the request originates from a bot, the user agent will probably carry a signature known to belong to a bad user agent. Finally, although normal clients can use anonymizing networks such as ToR and proxies for privacy, malicious users can exploit them to launch an attack.

C. Maliciousness Score Calculation
After selecting the metrics used to identify the presence of a DDoS attack, we established some factors to measure the effectiveness of each metric and assign its score. These factors are the processing overhead, the false positives, the false negatives, how difficult the metric is to bypass, the probability of its occurrence, the accuracy of the metric and, finally, the negative impact on normal clients, as shown in the accompanying table. We gave each factor a score based on its variability. At the end, the score and the difficulty are calculated based on equations (1) and (2).

D. New Modules
In order to apply the new metrics, we integrated the proposed models with third-party services and libraries, such as Windows Management Instrumentation (WMI) objects to calculate the processor load. We also integrated Z-PoW with the DB-IP database to determine the client's geographic location, a referrer anomaly detector, ToR and proxy detection libraries and, finally, a user agent anomaly detector.

E. Attack Simulation
During the implementation of the DDoS attack simulation, browser automation was a problem because browsers limit the maximum number of simultaneous requests to the same domain. We tried many solutions, such as different browser profiles, different browser instances using Selenium, different webdrivers using the Python Selenium library, a JavaScript-to-Python engine and a virtual machine with BeEF. None of these solutions was satisfactory or feasible. In the end, we used "PhantomJS", a standalone command-line JavaScript engine, to conduct the attack simulation.
We simulated malicious user agents and proxy headers by injecting a custom user agent and proxy at random from the code. We added a module that selects randomly from a list of source IP addresses and feeds the selection to both the simulated source IP header and the proxy header, to simulate clients behind a proxy. We also handled the case of an unsolved puzzle: the operation is discarded and the user's browser has to request a new puzzle to solve (Retrying Request).

A. Network Setup
To build the network, we used 5 machines: one acting as the server and 4 serving as clients (good and malicious). The server machine has 4 GB RAM and runs Windows 8.1. The client machines are: one with 1 GB RAM and Windows 7 Ultimate, one with 3 GB RAM and Windows 7 Professional, one with 2 GB RAM and Windows 7 Starter, and one with 4 GB RAM and Windows 7 Ultimate. We built the network using an 8-port 100 Mbps desktop switch and straight-through Ethernet cables.

B. Software Setup
On the server machine, we used XAMPP v3.2.1 with PHP v5.4.19 and Apache v2.4.4. We used NetBeans IDE and Xdebug to run all the models. To execute the simulation consoles remotely, we mounted network drives. The server is designed to give priority to malicious users over normal ones. As suggested in [14], we limited DDoS_KaPoW and Z-PoW to accepting 4 clients simultaneously using the Multi-Processing Module (MPM) parameters. We also changed the maximum execution time parameter in "php.ini" from its default value of 30 seconds to 80 seconds.

C. Simulation Assumptions
When both good and malicious requests are sent, we send the good requests from one client machine using 2 consoles, except during experiments 1 and 2, where we use only 1 console since the number of good requests is very small. We send the malicious requests from the other client machines through 5 consoles. When we send only good requests, they are distributed among all the client machines using 5 consoles on each. No requests are sent from the machine acting as the server. We used 900 seconds (15 minutes) as a threshold after which any request is ignored, because it is not feasible for an attacker to wait that long for a single request; it is easier to launch a new attack.

VI. RESULTS AND ANALYSIS
We ran various tests to measure the efficiency of the proposed models, altering many variables to evaluate them in different environments and capture their performance. These tests aim to reveal the benefits and overheads of using client puzzling to defend against DDoS attacks. We conducted 10 experiments; each experiment consists of 6 tests (see TABLE II and TABLE III).
While running the experiments, we noticed that a considerable amount of time was spent processing the good requests when client puzzling was on. This was undesirable and undermined the aim of the models. The time was caused by the lookup for the ToR network. We removed this metric, which saved a lot of time: the average response time with the ToR metric present was triple the average response time without it.

A. Client Puzzling on vs off
We can conclude from Fig. 4 and Fig. 5 that the average response time of the good requests during tests ON/V/G and ON/F/G is higher than during test OFF/G. This is expected, because the gap represents the time taken to check the user's maliciousness; it is the cost of security. We can also observe that the average response time of the good clients during test ON/V/G is almost the same as during test ON/F/G, with very small differences.
From Fig. 6 and Fig. 7, we can conclude that both models behave almost identically when only good requests are sent, whether the puzzle difficulty calculation is varied or fixed. Furthermore, the average response time in both models is directly proportional to the total number of requests. In Test OFF/G, the puzzle difficulty remains zero throughout all the experiments in both models, as client puzzling is switched off. During Test ON/F/G, the difficulty is also zero, since the good clients' requests never exceeded the predefined threshold. Based on TABLE IV, in Test ON/V/G the client receives a puzzle difficulty of either zero or 131072 in Z-PoW, and either zero or 32 in DDoS_KaPoW. The values 131072 and 32 are the difficulties calculated from equation (2) when the server is under high processing load and the score equals the processor load score, which is 8 as mentioned in Section IV. There are some exceptions in Test ON/V/G where the difficulty is zero, namely Z-PoW experiments 1 and 2 and DDoS_KaPoW experiments 1, 2, 3 and 4. These exceptions are due to the small number of requests sent, which did not load the server processor. In both Z-PoW and DDoS_KaPoW, all the requests coming from good clients, with or without client puzzling, received a response. No requests were dropped even when the total number of requests was increased four times.

B. Varied Puzzle Difficulty Calculation
With reference to Fig. 8 and Fig. 9, in both models during Test OFF/GM, the average response times of the malicious and the good requests are close to each other.
On the other hand, in Test ON/V/GM, the average response time of the malicious requests is far greater than that of the good ones. Sometimes it is 25 times the average response time of the good requests. This shows that client puzzling enhanced the good users' experience and punished the malicious clients by giving them complex puzzles, and hence delaying the responses to their requests. Fig. 10 shows that during Test ON/V/GM, Z-PoW performs better than DDoS_KaPoW, because the average response time of the malicious clients is very high in Z-PoW, while in DDoS_KaPoW it is only slightly higher than that of the good ones.
... DDoS_KaPoW. This shows that increasing the number of metrics did not affect the processor load; on the contrary, it enhanced the good users' experience. Finally, the puzzle difficulty of the malicious requests in Z-PoW is far higher than in DDoS_KaPoW, because Z-PoW uses 6 metrics instead of 2.
In Z-PoW and DDoS_KaPoW, neither good nor malicious requests were dropped during any experiment in Test OFF/GM, since client puzzling is switched off. In both models, no good requests were dropped during Test ON/V/GM. TABLE VI shows the total number of malicious requests sent and dropped during Test ON/V/GM for each experiment in both Z-PoW and DDoS_KaPoW.
In Z-PoW, when client puzzling is on, a considerable number of the malicious requests were dropped, sometimes as many as half. The number of dropped requests is directly proportional to the total number of requests sent. In DDoS_KaPoW, on the other hand, almost no malicious requests were dropped when client puzzling is on, even when the number of malicious requests sent was increased. The attackers would therefore still be able to access the server and eventually dominate it.

C. Fixed Puzzle Difficulty Calculation
Based on Fig. 11 and Fig. 12, in both models the average response time of the good clients when the puzzle difficulty calculation is fixed (Test ON/F/GM) is higher than their average response time when client puzzling is off (Test OFF/GM). This is the time cost of calculating the maliciousness score. On the other hand, the average response time of the malicious requests in Test ON/F/GM is far higher than in Test OFF/GM, so both models succeeded at fulfilling their ... almost the same with very few changes. Furthermore, the average response time of the malicious requests in Z-PoW is far higher than in DDoS_KaPoW. Using more metrics therefore helped delay the malicious users and increase their average response time.
During test ON/F/GM the minimum puzzle difficulty a user can get is 0 and the maximum puzzle difficulty, based on equation (2), is 500000 in Z-PoW and 50 in DDoS_KaPoW since the score used, after exceeding the predefined threshold, is equal to 10.
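Equation (2) itself is not reproduced in this text, but the four difficulty values reported in this section can be cross-checked against each other. The Python sketch below is only a sanity check under a loudly stated assumption: the form score^m / 2 and the exponents 6 and 2 were inferred by fitting the reported numbers, and are not taken from [16] or from the KaPoW source code.

```python
def candidate_difficulty(score: int, m: int) -> int:
    """Hypothetical form score**m / 2, inferred from the reported values.

    This is NOT claimed to be equation (2); it merely reproduces the
    numbers quoted in the text.
    """
    return score ** m // 2


# (model, score) -> difficulty reported in the text.
REPORTED = {
    ("Z-PoW", 8): 131072,
    ("Z-PoW", 10): 500000,
    ("DDoS_KaPoW", 8): 32,
    ("DDoS_KaPoW", 10): 50,
}

# Exponents found by fitting; assumptions, not values from the source.
FITTED_M = {"Z-PoW": 6, "DDoS_KaPoW": 2}

for (model, score), difficulty in REPORTED.items():
    assert candidate_difficulty(score, FITTED_M[model]) == difficulty
```

Whatever the exact form of equation (2), the check makes the scale of the gap concrete: for the same score, the Z-PoW difficulty is several thousand times larger than the DDoS_KaPoW difficulty.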
Based on TABLE VII, both models succeeded at preventing the attackers from accessing the server. Thanks to using more metrics, Z-PoW stopped more malicious users and dropped more of their requests.

D. DDoS_KaPoW vs Z-PoW vs KaPoW Guestbook
We simulated KaPoW Guestbook's model in the same way as Z-PoW's, except that one machine acted as the server and only two client machines were used (one acting as the good clients and the other as the malicious ones), since it is a forum spam attack. This attack was launched 3 times; each time, the number of consoles used by the attacker and the number of requests sent were changed, as shown in TABLE VIII.
TABLE VIII shows the percentage of dropped malicious requests in each experiment when client puzzling is on during the simulation of Z-PoW, DDoS_KaPoW and KaPoW Guestbook. As listed, client puzzling dropped more malicious requests and defended against the DDoS attack better when the number of metrics used was increased. Z-PoW and KaPoW Guestbook behave almost identically, with slight differences, which indicates that the client puzzling algorithm has comparable performance in defending against both DoS and DDoS, but needs more metrics to defend against DDoS attacks.

VII. CONCLUSION
DDoS attacks are still considered a big threat for big companies. Although there is no such thing as 100% security, client puzzling has proven its capability and efficiency in thwarting DDoS attacks by punishing malicious clients without affecting normal clients.
Z-PoW, like KaPoW Guestbook, can be integrated into any application because of its modularity. It also examines many metrics to prevent DDoS attackers from accessing the server. No good requests were dropped when client puzzling was applied, which satisfies Z-PoW's goal.
Although the results of the tests with fixed difficulty are better than those of the tests with varied difficulty, some good clients may accidentally be misinterpreted as malicious ones and hence suffer by receiving very hard puzzles.
Unfortunately, Z-PoW has some deficiencies. One is that normal users who use an automated tool or a plugin to block the referrer in the browser will be considered attackers, because the request will not carry a referrer. Another flaw is the overhead added by the IP-to-country library because of duplicate cache entries. Finally, when a client has to retry a solution for the puzzle, the time taken to get a reply is calculated from the second request sent, not from the first one.

VIII. FUTURE WORK
In order to make the malicious clients suffer more, the difficulty of their puzzle can be scaled up exponentially while the difficulty for well-behaved clients scales down linearly as suggested in [8]. Or the bad clients can be blocked after multiple spikes.
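The first idea above, scaling difficulty up exponentially for misbehaving clients and down linearly for well-behaved ones, could look like the following Python sketch. The function name, the growth factor, the step size and the bounds are all hypothetical parameters chosen for illustration; neither this paper nor [8] is the source of these specific values.

```python
def next_difficulty(current: int, misbehaved: bool,
                    growth: int = 2, step: int = 1,
                    floor: int = 0, ceiling: int = 1 << 20) -> int:
    """Update a client's puzzle difficulty after observing one request.

    Misbehaving clients have their difficulty multiplied (exponential
    growth over repeated offences); well-behaved clients have it reduced
    by a fixed step (linear decay). All defaults are illustrative.
    """
    if misbehaved:
        # max(1, ...) lets a client at difficulty 0 start climbing.
        return min(ceiling, max(1, current) * growth)
    return max(floor, current - step)
```

With these defaults, five consecutive offences take a client from difficulty 0 to 32, while returning to 0 afterwards takes 32 well-behaved requests, so penalties are earned quickly and shed slowly.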
The lookup of the nonce can be enhanced by using counting Bloom filters. The detection of users coming from the ToR network or from behind a proxy could also be enhanced, especially the ToR detection, because it consumes a lot of time, which makes it ineffective.
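For readers unfamiliar with counting Bloom filters, the following Python sketch shows why they suit nonce lookup: membership tests are fast and memory is bounded, and, unlike a plain Bloom filter, entries can be removed once a nonce has been redeemed. The class, its parameters and the use of SHA-256 to derive indexes are illustrative assumptions, not part of Z-PoW.

```python
import hashlib
from typing import Iterator


class CountingBloomFilter:
    """Bloom filter with per-slot counters so that items can be removed."""

    def __init__(self, size: int = 1024, hashes: int = 3) -> None:
        self.size = size
        self.hashes = hashes
        self.counters = [0] * size

    def _indexes(self, item: str) -> Iterator[int]:
        """Derive `hashes` slot indexes by salting SHA-256 with a counter."""
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: str) -> None:
        for idx in self._indexes(item):
            self.counters[idx] += 1

    def remove(self, item: str) -> None:
        """Remove an item that was previously added.

        Removing an item that was never added would corrupt the filter,
        so callers should check might_contain() first.
        """
        for idx in self._indexes(item):
            if self.counters[idx] > 0:
                self.counters[idx] -= 1

    def might_contain(self, item: str) -> bool:
        """False means definitely absent; True may be a false positive."""
        return all(self.counters[idx] > 0 for idx in self._indexes(item))
```

Because false positives are possible, a filter like this would most naturally act as a fast pre-check in front of an exact store rather than as a replacement for it.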
In Z-PoW, we investigated the processor load using a Yes/No check. But in the future, a variable score can be used based on the load which will help detect a DDoS attack earlier.
The attackers can be simulated as more sophisticated adversaries that use Graphics Processing Unit (GPU) cracking to speed up puzzle solving, and the results can be compared with those of normal clients.
Finally, Z-PoW can be enhanced by combining a Trust Model with client puzzling. To keep up with everyday changes, Z-PoW needs to be compatible with HTML5 and IPv6. DShield API changes also need to be applied once they are made.