Performance Analysis of Fermat Factorization Algorithms

The Rivest-Shamir-Adleman (RSA) cryptosystem is one of the strong encryption approaches currently being used for secure data transmission over an insecure channel. The difficulty encountered in breaking RSA derives from the difficulty in finding a polynomial time for integer factorization. In integer factorization for RSA, given an odd composite number n, the goal is to find two prime numbers p and q such that n = p q. In this paper, we study several integer factorization algorithms that are based on Fermat’s strategy, and do the following: First, we classify these algorithms into three groups: Fermat, Fermat with sieving, and Fermat without perfect square. Second, we conduct extensive experimental studies on nine different integer factorization algorithms and measure the performance of each algorithm based on two parameters: the number of bits for the odd composite number n, and the number of bits for the difference between two prime factors, p and q. The results obtained by the algorithms when applied to five different data sets for each factor reveal that the algorithm that showed the best performance is the algorithms based on (1) the sieving of odd and even numbers strategy, and (2) Euler’s theorem with percentage of improvement of 44% and 36%, respectively compared to the original Fermat factorization algorithm. Finally, the future directions of research and development are presented. Keywords—Integer factorization; Fermat’s algorithm; RSA; factorization with sieving; perfect square


I. INTRODUCTION
The Rivest-Shamir-Adleman (RSA) cryptosystem is one of the most famous and secure cryptosystems currently available. It was designed to encrypt plain text into cipher text in as strong a manner as possible. The RSA system is a type of public-key cryptosystem that is based on two different keys: a public key that is used for encryption and a private key that is used for decryption.
The main steps in the RSA cryptosystem are as follows [1,2]: 1) Generate two random distinct prime numbers of large and equal size, p and q, and then construct an odd composite number n = p q.
3) For the encryption procedure, choose the exponent number e that is greater than 1 and less than Φ( ) such that gcd� , Φ( )� = 1. Then apply the modular exponentiation formula on the message m to generate a secret message c as follows: = mod 4) For the decryption procedure, find the integer d that is greater than 1 and less than Φ( ) such that mod Φ( ) = 1. Then apply the modular exponentiation formula on the secret message c to generate a message m as follows:

= mod
The RSA cryptosystem includes two mathematical operations that are opposite to each other. The first operation is multiplication, which is easy to compute. The running time to compute the product of two numbers is ( 2 ) in the worst case, where b is the size of each number. This type of operation is important for computing the modular exponentiation [3,4] to reduce the computation time of the exponentiation. The second operation is a process that involves finding two prime factors p and q from an odd composite number n. This process is called integer factorization [5]. If we can factor n to p and q, then we can compute Φ( ) and then d. Consequently, the encrypted message c can be decrypted. Hence, the integer factorization problem is important in cryptography. Therefore, solving this problem in an efficient timeframe leads to breaking the RSA. In other side, the difficulty in finding a polynomial time for the factorization leads to difficulty in breaking the RSA cryptosystem [6,7,8,9].
Moreover, the integer factorization problem is important from the point of view of complexity theory. Until now, the integer factorization problem has not been considered to belong to the class of P problems. Also, there is no proof that the integer factorization problem belongs or does not belong to the class of NP-complete problems. From a review of the literature, it seems that the best time complexity for factoring an odd composite number is � �64 9 ⁄ 3 + (1)(ln ) 1/3 (ln ln ) 2/3 � using the general number field sieve (GNFS) algorithm [1,10].
A large number of algorithms have been proposed in order to attempt to factor an odd composite number. These algorithms can be categorized as either general or specialpurpose algorithms. The general-purpose group contains *Corresponding Author 340 | P a g e www.ijacsa.thesai.org integer factorization algorithms that have a running time that depends on the size of an odd composite number only. This group includes integer factorization algorithms that are based on various strategies, such as continued fraction factorization, Shanks's square forms factorization, Dixon's algorithm, the quadratic sieve algorithm and the GNFS algorithm [1,2,10].
On the other hand, the special-purpose group contains integer factorization algorithms that have a running time that depends on the size of an odd composite number and its properties. For example, the trial division method is an efficient algorithm for factorization when an odd composite contains a small prime factor. Besides trial division, the special-purpose group contains various other techniques, such as Fermat factorization, wheel factorization, Pollard's p-1, Euler factorization, and the Lenstra elliptic curve [1,2,10,11].
In this paper, we are interested in the special-purpose group because the aim is to study the performance of algorithms that are based on Fermat's factorization concept. Fermat proposed a factorization algorithm that is based on representing the odd composite number as the difference between two squares. The main advantage of Fermat's factorization technique is that it is able to factor an odd composite number, n, in a very fast time, i.e., almost instantaneously, when the difference between two factors is ∆= √ 4 [6,12]. This means that if the size of n is b bits and the difference between two factors is ∆= √ 4 , then the following are true: (1) The two prime factors have the same size, i.e., each prime factor has size 2 ⁄ ; (2) The number of common bits between the two prime factors is 4 ⁄ , and these bits should be the most significant bits. These two conditions are known as the domain of the efficiency of Fermat's algorithm, (DEF).
Many algorithms have been proposed that are based on Fermat's factorization concept [13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31]. The goal of these algorithms is to improve the running time of the original Fermat algorithm in finding prime factors. Two categories of factor have an effect on the efficiency of these algorithms. The first category is related to the properties of the input, which includes the size of the odd composite number, b, and the difference between two factors, ∆ . The second category is related to the natural of the algorithm itself such as the search strategy it uses to find the solution and the number of high-cost operations included in the algorithm.
In general, the improved Fermat algorithms can be classified into two classes. The first class contains algorithms based on the concept of an estimated prime factor and uses different techniques such as continued fraction method [28] or considering n as a special form 6 ± 1, where k is any integer [23]. However, the techniques in this class cannot factor some odd composite numbers, so they cannot be considered as general methods for Fermat factorization. The second class contains algorithms [11,14,15,17,18,19,20,21,22,24,25,26,27,29] that can be applied to any odd composite number and are based on (1) replacing the high-cost operation, i.e., the perfect square in Fermat's method, with a low-cost operation or on (2) reducing the space searched to find the solution. It should also be noted that there is another strategy [13,30] that falls outside the scope of our research, which involves speeding up the running time of Fermat's algorithm that is based on a different platform such as high-performance computing [13,33].
In this paper, we are interested in the integer factorization algorithms that are belong to the second class. From our analysis of these techniques, we made the following observations: 1) The experimental studies for most of these algorithms were implemented when the size of an odd composite was less than 64 bits [15,20,21,26,32], for example, n = 84449 × 21121 =1783647329. This number of bits is small compared to that required in cryptography. Also, the time consumed for an operation increases with increase the size of data, especially for high-cost operations.
2) Many of the experimental studies for some of these algorithms were implemented when the difference between the two factors did not belong to the DEF. This means that any comparison between Fermat's algorithm and these algorithms is not realistic because it has been proved that the Fermat factorization method is not efficient outside the DEF. For example,n=19710741 × 531349691=1047329636821139813 [20].
3) The efficiency of some of these algorithms was measured based on a few data or on some examples, rather than on different values for b and ∆, see for example, [22,25]. This means that there is no exhaustive study that compares two or more integer factorization algorithms that are based on Fermat factorization concept by applying them to different data distributions in the DEF.
4) A few steps in some of these algorithms required some optimization due to the cost of the operation to manipulate a large data size.
Consequently, we are interested in undertaking an experimental study on most of the integer factorization algorithms that are based on Fermat's concept in order to answer the following: Q1) Which one of the Fermat factorization algorithms is efficient for a large size of n and a small value of ∆? Q2) What is the effect of increasing the value of ∆ with a fixed size of n for each of the studied algorithms? Q3) Many integer factorization algorithms have the same number of iterations, theoretically, but which one is the fastest over different data distributions? To the best of our knowledge, there is no sufficient comparative study for Fermat factorization algorithms, especially with regard to the effect of the use of factors b and ∆ on performance. Also, our study compares the performance of nine different integer factorization algorithms in order to determine which has the fastest running time.
The remainder of the paper is structured as follows: In Section II, we provide the methodology used to verify the objectives of this study. In Section III, we provide an overview of the different integer factorization algorithms that are based on Fermat factorization, including the pseudocode of each 341 | P a g e www.ijacsa.thesai.org algorithm. Additionally, we classify these algorithms based on the techniques used into three groups. In Section IV, we describe the experimental study undertaken to compare and measure the performance of different integer factorization algorithms. Also, we present an analysis of the results produced by the experimental study. Finally, in Section V, we draw some conclusions from this work and highlight open questions that remain to be answered in future studies.

II. METHODOLOGY
To measure the performance of the integer factorization algorithms that are based on Fermat's strategy, we followed a methodology that consisted of five stages: 1) Determine the different strategies that need to be used to factor an odd composite integer into two prime factors according to Fermat's concept; 2) Determine the language and packages to use to verify the goal of the study; 3) Optimize the implementation of the selected integer factorization algorithms based on the platform used in the experimental studies; 4) Generate a dataset to use to measure the performance of the selected algorithms, especially when applied to large data sizes; and 5) Measure and analyze the performance of the selected algorithms.
Here we discuss, briefly, each of the above stages. In the first stage, we studied the different algorithms that use Fermat's strategy to find the two prime factors for an odd composite number. We classified these algorithms into groups based on the concept used in the algorithms. For each algorithm, we identified the main idea, the pseudocode and the expected number of iterations to find the solution. The details of this stage are covered in Section III.
In the second stage, we selected the language and package to use in our study, namely, C++ language and the GNU Multiple Precision (GMP) arithmetic library [34]. We decided to use C++ language because the performance of this language is fast compared to other languages such as Java. In other side, to execute any operation in the RSA system requires a number of size 1024 bits. However, the size of the integer type in C++ language does not support this objective because it is limited to 64 bits. Therefore, we decided to use the GMP library because it is designed to support applications such as cryptography and computational algebra that involve large-sized numbers. Furthermore, the library has the following advantages: (1) It contains a large number of functions to manipulate integers and other types; (2) the functions in the library are fast compared to those of other tools; and (3) there is no limitation to the size of number, so we can manipulate a number that is greater than 1024 bits in size.
It should be noted that, in our implementation, we used only the data type mpz_t that is used to manipulate GMP integers. The library contains a group of functions to manipulate GMP integers, such as (1) initializing and assigning GMP integers, (2) integer arithmetic and division, (3) integer roots, and (4) integer comparisons.
In the third stage, we focused on optimizing each algorithm, if required, in order to run the algorithm in a fast way. The reasons for doing this were as follows: (1) The cost of the operation for a large integer size is significantly different than that for small integers of less than 64 bits for the same operation [35,36]; and (2) for some algorithms, we needed to rewrite a few of the statements to increase the performance of an algorithm. The details of this stage are provided in Section III.
In the fourth stage, we employed a method to generate an odd composite number consisting of a product of two prime numbers. In this method, the generation of the two prime factors is based on the two factors of the DEF [6,12]. The first factor is the size of the odd composite number n. Suppose that the size of n is b, where b is the number of bits. Therefore, when we generate an odd composite number n of size b bits, we first generate two prime numbers p and q, each of size b/2, and then we multiply both of them, i.e., n = p q. The second factor is the difference between the prime factors p and q, ∆. This factor is important because the running time of Fermat's strategy increases with an increase in the value of ∆.
The generation of one data (an odd composite number), GD, consists of the following steps: The first step is to determine the number of bits for n, b, and the number of bits for the difference between two factors, ∆. The second step is to generate a prime number of size b/2, say p. The third step is to generate a random number of size ∆ and then add it to the first prime number, say x. The fourth and final step involves generating the second prime number, say q, greater than x such that the size of the difference between p and q is ∆.
For more accuracy in measuring the performance of the different algorithms, we repeated the previous steps for GD by adopting the following procedure: First, we fixed the value of b and ∆. Second, we applied the steps for generating two prime factors as in GD, i.e., from the second step to the fourth step in GD, t times, where t represents the number of different instances that have the same value of b and ∆ . Third, we repeated the execution of the first and second steps of the procedure with increasing values of ∆ in increments of 5 bits until ∆ + 20. The reason for setting a maximum value of ∆ is that the running time of all the integer factorization algorithms based on Fermat's strategy increases rapidly with an increase in ∆. Fourth, we increased the value of b and then repeated all the previous steps.
The last important task in the data generation stage was to verify that the generated data were correct as follows: 1) Each prime factor, p, should be verified such that 2 ( /2)−1 < < 2 ( /2) .
2) The difference between the two factors should be verified such that 3) The prime factors used for the fixed values of b and ∆ should be as different as possible.
The fifth and final step in the methodology involved measuring the performance of the integer factorization algorithms that are based on Fermat's strategy. The performance of these algorithms was mainly measured by 342 | P a g e www.ijacsa.thesai.org computing the execution time. Hence, for fixed values of b and ∆, the running time for the algorithm A, ( , ∆), computed using the following formula: where ( , ∆) is the running time of the algorithm A for the instance number i using input data b and ∆. Note that two instances, ( , ∆) and ( , ∆) , are different if the odd composite number, , of size b for the instance i is different than the odd composite number, , of size b for the instance j. Additionally, with respect to the issue of memory consumption, all the algorithms required a constant number of auxiliary variables, so there was no need to measure this factor experimentally.

III. CLASSIFICATION OF FERMAT ALGORITHMS
In this section, we provide an overview of the different algorithms for integer factorization that are based on Fermat's concept. For each algorithm, we discuss the main idea and steps of the algorithm, and then we give the pseudocode of the algorithm.
Without loss of generality, for all algorithms, we assume that the integer number n is odd and is a product of two prime numbers p and q, where p > q. The main idea of Fermat's algorithm is that the integer number n can be expressed as a difference between two square numbers, x and y. Formally, the odd integer number n can be written as follows: Also, the relation between the two prime factors and the two square numbers is as follows: Different strategies have been proposed to factorize an odd composite number into two prime numbers based on Fermat's concept. All the algorithms start the search with an initial value of x and try to find the value of y such that 2 = 2 − . So, the main issues in Fermat's strategy are (1) how to reduce the search space of x and (2) how to reduce the cost of the perfect squaring operation, where a simple test for a perfect square for x includes two operations: calculating the square root for x, say r, and testing whether the value of r is an integer or not.
We can classify the algorithms of integer factorization that are based of Fermat's strategy into three main groups. The first group employs a direct approach which starts from the minimum value of x and uses a perfect square operation. We named this group the Fermat factorization group because it is based on directly applying the concept proposed by Fermat. The second group is based on sieving or pruning some elements in the search space, so the algorithm does not apply the squaring operation or/and perfect squaring operation on those elements. We named this group the Fermat with sieving group. The third group is based on removing the main operation of the Fermat factorization algorithm which is the perfect squaring operation, so we named this group the Fermat without perfect squaring group. The main ideas and steps of these three groups of algorithms are outlined in the following subsections.

A. The Fermat Factorization Group
The Fermat factorization group contains many algorithms which are based mainly on the perfect squaring operation [14,15,17]. The first and main algorithm in this group is based on rewriting Eq. (1) as 2 = 2 − and starts by assuming that the value of x is �√ � + 1. Then the algorithm tests whether the value of 2 − is a perfect square. If the value of y is an integer and equal to √ 2 − , the search is terminated. When 2 − is not a perfect square, the algorithm increases the value of x by 1 and follows the same procedure until it finds the value of y.
All the steps of the algorithm are shown in Algorithm FF. The algorithm is very straightforward and contains simple operations, except for the perfect square operation. The running time of the algorithm is based on two factors. The first factor is the cost of the perfect square operation and the second is the number of iterations for the While-loop which is equal to − ��√ � + 1� in the worst case, where x satisfies that the term 2 − is a perfect square and equals ( + ) 2 ⁄ .

Algorithm FF (Fermat's Factorization)
Input: n is a positive odd number. Output: and are two prime numbers such that = .

Remarks:
1) Many modified algorithms [14,15,17] have been proposed to improve the FF algorithm while retaining the two operations, squaring and perfect squaring. The modifications are based on rewriting the Fermat factorization formula and then searching for the solution. For example, in [17], the formula is rewritten as 2 = � + �√ � � 2 − 2 , with a small x and y, and the goal is to find the solution ( , , ). In [15], the algorithm is modified in order to achieve the goal of finding a solution for 2 = ��√ �� 2 − , where starts with a value of 1. The modified algorithm, FF1, contains four operations before testing the perfect square. These operations are: multiplying n with i, squaring the root of (n i), say s, squaring s, and calculating the modulus of n. This means that, in general, the modified algorithm contains more operations than the FF algorithm. Therefore, we neglected these modifications in our general comparison of the different types of integer factorization algorithms (see Section IV). 343 | P a g e www.ijacsa.thesai.org 2) In order to optimize the code of the FF algorithm we decided to do the following: (i) To compute 2 , we multiplied with itself to get better performance instead of using the predefined function power for the exponent 2; (ii) in the case of the perfect square operation, we used the predefined function in GMP because we considered that this would be better than computing the square root of the number and then testing whether the results are integers or not.

Algorithm FF1 (Modified Fermat's Factorization)
Input: n is a positive odd number. Output: and are two prime numbers such that = .

B. The Fermat with Sieving Group
The sieving strategy is a method that is used to remove the impossible solutions so that the algorithm does not consider them during the search process. The algorithms that apply this strategy for Fermat factorization can be classified into two classes of techniques.
The first class of techniques ignores the perfect squaring operation in some cases. This means that before testing whether integer y is a perfect square or not, the technique tests whether y satisfies a certain condition. If integer y meets this condition, the technique does not test whether y is a perfect square and goes instead to the next value of x. Otherwise, the technique tests whether y is a perfect square or not.
The second class of techniques ignores the squaring operation and consequently the perfect square operation. This means that before calculating 2 , the strategy tests whether x satisfies a certain condition. If integer x meets this condition, the technique ignores all the subsequent steps, i.e., squaring, subtraction, and perfect squaring, and goes to the next value of x. Otherwise, the technique tests whether y is a perfect square or not.

1) Class 1: Ignoring the perfect square a) Sieving with modulus:
One of the techniques used in sieving is the modulus operation, mod. The idea behind using the modulus arithmetic operation is to exclude all integers, x, that are definitely not perfect squares before applying the perfect squaring operation [22,24].
We can apply this technique as follows: For any integer x, we can represent x in decimal form as −1 −1 … 1 0 , where l represents the number of decimal digits in x. It is clear that the last, right-most, digit of x is either 0, 1, 2, 3, 4, 5, 6, 7, 8, or 9. The last digit for squaring x can be calculated by taking the modulus of 10 and is equal to 0, 1, 4, 5, 6, or 9, see the bold digit in Table I. On the other hand, no squaring number has a last digit of 2, 3, 7 or 8. Therefore, we compute = mod 10 and if the value of is 2, 3, 7, or 8, then there is no need to test whether is a perfect square or not, and so we can go to the next value of x directly.
The complete steps of the algorithm is shown in Algorithm FM10 [22,24]. The algorithm is similar to the FF algorithm but contains two extra statements. The first statement computes the modulus of 10 for the term 2 − , and the second tests the result of using the modulus, see line 7. The running time of the algorithm is similar to that of the FF algorithm, except (1) two extra statements, see lines 6 and 7, and (2) the algorithm uses fewer perfect square operations based on the truth value of the condition.
Algorithm FM10 (Fermat Sieving by Modulus 10) Input: n is a positive odd number. Output: and are two prime such that = . Begin 1.
While (Not found) do 4.
If ( is a perfect square) then 9. found=True 10.

Remarks:
1) The statement in line 7 can be rewritten as follows: If ( = 0 or = 1 or = 4 or = 5 or = 6 or = 9) then However, this statement contains six comparisons at most, whereas the statement in line 7 contains four comparisons at most, but the running time for two versions is almost similar.

2)
We can use a modulus of 15, 20 or 30 instead of 10. To study the effect of changing the value of the modulus on the performance of the algorithm, we changed the modulus of 10 to a modulus of 20 and named this method FM20. Using this approach, the accepted cases to ignore the test for the perfect square occur when the remainder of (r=y mod 20) are 0, 1, 4, 5, 9, or 16. The steps of the FM20 algorithm are similar to those of FM10, except that line 7 is replaced with: If Not ( = 0 or = 1 or = 4 or = 5 or = 9 or = 16) then We studied the effect of this change experimentally, see Section IV.

b) Sieving with odd & even:
In the FF algorithm, the values of x are �√ � + 1, �√ � + 2, �√ � + 3, �√ � + 4,…. This means that the values of x are odd and even numbers. Another sieving technique that can be applied in integer factorization algorithms is based on ignoring all the even (or all the odd) numbers of x if the integer n satisfies a certain condition. The idea behind using the even and odd property is based on the following rules [19,29]: 1) Any odd integer n can be expressed as = 4 ± 1, ≥ 3.
2) For = 4 ± 1 , we have two cases: (i) when = 4 + 1, then x is odd and y is even, and (ii) when = 4 − 1, then x is even and y is odd.
The algorithm consists of four main steps. The first step determines the form of n as either 4 + 1 or 4 − 1. The second step determines the type (even or odd) of x and y. The third step determines the start value of x in the case of whether x is odd or even. The fourth step applies the steps of FF algorithm with updating the value of x with 2.
The complete steps of the algorithm are shown in Algorithm FOE. To determine the formula of n, the algorithm computes the remainder of dividing n with 4 and then tests if the remainder is equal to 1 or not, see lines 2-3. If the remainder is equal to 1, this means that = 4 + 1 , otherwise, = 4 − 1. Lines 4-5 and 8-9 determine the type, odd or even, and the start value of the integer x. The remainder of the algorithm is similar to the FF algorithm except when the value of 2 − is not a perfect square, the algorithm updates the value of x by 2 instead of 1 because when x is even (or odd), the next even (or odd) number of x can be found by adding 2 to x. Hence, the number of iterations of the FOE algorithm is half that of the FF algorithm because the FOE algorithm updates the value of x by 2, whereas the FF algorithm updates the value of x by 1.

Algorithm FOE (Fermat Sieving using Odd & Even)
Input: n is a positive odd number. Output: and are two prime numbers such that = . Begin 1.
If ( is even) then 5.

3) Class 2:
Ignoring the Squaring: Another important improvement that has been made to Fermat's algorithm is the ignoring of some elements in the search space before squaring the value of x. Two algorithms have been proposed to achieve this goal. The two algorithms are based on analyzing the relation between the value of ( mod 10) and the value of ( mod m), where m may be 10, 15, 20, or 30.
The complete steps of this technique are shown in Algorithm FM1010 [24]. Initially, we determine the initial value of x and ( mod 10). Then the algorithm choose one of the four cases, as described previously. The number of iterations of the FM1010 algorithm is similar to that of the FF algorithm, but some of these iterations ignore the squaring and perfect squaring operations.

Algorithm FM1010 (Fermat Sieving by two Modulus 10)
Input: n is a positive odd number. Output: and are two primes such that = . Begin 1.
If ( is a perfect square) then 11. found=True 12.
Using these cases, we constructed another sieving method named FM1020 algorithm. The pseudocode of the FM1020 algorithm is similar to that of FM1010, except that the Whileloop contains eight If-statement, where each If-statement represents a case from the eight cases of the relation between ( mod 10) and ( mod 20) [25].
Remark: Another modified algorithm was proposed by Somsuk, and Tientanopajai [27] and based on analyzing the last m digits of the modulus, where ≥ 2 . The main drawback of the modified algorithm is the number of different subroutines used is very large when m is large.

C. The Fermat without Perfect Squaring Group
The algorithms in this group do not use the perfect squaring operation during the search for the two prime factors. Two main techniques have been proposed for use in integer factorization algorithms that do not employ perfect squaring.
The first algorithm is based on finding two integers x and y such that the difference between two square numbers is 4 , i.e., 4 = 2 − 2 . This formula can be rewritten as follows [27,28]: The algorithm starts the search with = 2�√ � and = 0, and then computes the value of r. Based on the value of r, the algorithm executes one of the following cases [27,28]: 1) Case 1: = 0 . This means that 4n is equal to the difference between two squares and the search process is terminated. Therefore, the two prime factors are = ( + )/2 and = ( − )/2.
2) Case 2: < 0. This means that the value of the term 2 is less than the term ( 2 + 4 ) . Therefore, the algorithm increases the value of x by 2 because the value of the term ( + ) is an even number. Also, the algorithm increases the value of r by (4 + 4) because this value represents the difference between the squares of the next and the current value of x.
3) Case 3: > 0. This means that the value of the term 2 is greater than the term ( 2 + 4). Therefore, the algorithm increases the value of y by 2 because the value of the term ( − ) is an even number. Also, the algorithm decreases the value of r by (4 + 4) because this value represents the difference between the squares of the next and current value of y.
The complete steps of the algorithm are shown in Algorithm FnPS. All the operations of the algorithm are simple and the running time of the algorithm is based only on the 346 | P a g e www.ijacsa.thesai.org number of iterations of While-loop that is dependent on two conditions: < 0 and > 0. The number of iterations for the first inner while-loop that is based on the condition ( < 0) is + 2 − ��√ � + 1� , while the number of iterations for the second inner while-loop that is based on the condition ( > 0) is − 2 , because the start value of y is 0. Therefore, the total number of iterations of the FnPS algorithm is − ��√ � + 1�.
Algorithm FnPS (Fermat with no Perfect Squares) Input: n is a positive odd number. Output: and are two prime numbers such that = .
The second algorithm in this group is based on removing the perfect squaring operation by using modular multiplication. The method of modification is based on Euler's theorem which is given by the following formula [20]: where n is a positive integer number, a is an integer such that a is relatively prime with n, and Φ( ) is the Euler's totient function that is equal to ( − 1) × ( − 1) = − ( + ) + 1.
The complete steps of the algorithm are shown in Algorithm FE [20]. The algorithm starts the computation by initializing = 2�√ � and selecting a positive integer c which is relatively prime with n. Based on the value of c, the algorithm computes the inverse, a, and the square, s, of c in modulo n. Then the algorithm applies Euler's theorem, where the value of Φ( ) = − + 1. If the result, t, equals 1 then the values of x and y are determined and the solution is found. Otherwise, the algorithm sets a flag with false. In which case the algorithm repeats the following steps until Euler's theorem is verified, i.e., when t = 1. The first step in the repetition is to compute a new value t by multiplying it with s and taking a module if it is required. The second step is to update the value of x by increasing it by a value of 2. Finally, the value of the two prime factors are + and − . The total number of iterations of this algorithm is similar to that of the FF algorithm, but the perfect squaring operation and only one square root operation is required.

Remarks:
1) The original statements of Algorithm FnPS in lines 4 and 8 are if ( < 0) and if ( > 0), respectively, see [27]. To optimize the original algorithm, we replaced the two Ifstatements with two While-loops as in Algorithm FnPS. The performance of Algorithm FnPS with two inner While-loops is better than that with two If-statements, see Section IV.
2) Algorithm EF is slightly different than the algorithm in [20]. In [20], the Boolean condition found was not used. Instead, the statement at line 11 is y = 0.1, i.e., any initial value of y is considered to be a float number, and the statement at line 13 is "y is not an integer". The performance achieved by this modification is slightly better than that of the FE algorithm, but not significantly so.

IV. EXPERIMENTAL EVALUATIONS
In this section, we discuss our experimental study of the nine different integer factorization algorithms that utilize Fermat's strategy. The study was conducted according to the methodology described in Section II and involved the evaluation of the performance of these algorithms over different numbers of bits and different differences between the two factors. The first subsection presents the specification of the hardware and software used in the implementation, and the 347 | P a g e www.ijacsa.thesai.org different values of the factors b and ∆ that have an effect on the input data. The second subsection presents the results of applying the ideas mentioned in the remarks in Section III. The third subsection presents the measurement and analysis of the running times of the different integer factorization algorithms.

A. Experimental Specification
The experimental study of different factorization algorithms required the use of a high configuration of hardware. For this purpose, we used a Microsoft Azure cloud system. The system is able to run 32 threads in parallel with a processing speed of 2.6 GHz. The reasons for selecting this hardware platform are as follows: (1) The execution times of the factorization algorithms increase rapidly with increases in b and ∆; (2) a large number of instances are used to measure the running time for each algorithm for fixed values of b and ∆; and (3) experimentally, each algorithm is run sequentially, but we use different threads to execute different instances. However, in order to unload the system we used only a maximum of 16 threads.
As regards the implementation, we ran all the algorithms under Windows 2019 and used Microsoft Visual Studio 2019 to implement the algorithms using C++ language.
The specification of the data used in the evaluation of the different factorizing algorithms was as follows: 1) The values of b were 100, 200, 300, 400, and 500.
2) The value of ∆ started with (b/4) and then we increased it incrementally by 5 bits until (b/4)+20. The reason that ∆= ( /4) + 20 was set as the last value is that the running time for all the algorithms is very high and the time increases with an increase in b.
3) The value of t, i.e., the number of instances for the fixed values of b and ∆, was set as 100, except when ∆= ( /4) + 20 where we considered t = 25, because the running time for each instance is greater than 1.5 hours for the best algorithm.

B. Experimental Comments
In this subsection, we illustrate the results that were achieved by applying the ideas mentioned in some of the remarks in Section III to optimize some of the statements in the previous algorithms.
However, first, the results in respect of the two versions of the FnPS algorithm, the If-statement and the While-loop, are presented and discussed. Table III shows the results of running two versions of the FnPS algorithm, on b = 100 and ∆ =25, 30, 35, and 40. It is clear that the results indicate that the performance of the FnPS algorithm when using the While-loop is better than that when using the If-statement for all values of ∆ . On average, the percentage of improvement in the performance of the FnPS algorithm with While-loop was 16.5% compared to its performance with the If-statement.
However, in general, the performance of the FnPS algorithm, even with While-loop, is very weak compared to that of the FF algorithm for two reasons. The first reason, from the theoretical point of view, is that the search space of the FnPS algorithm is very large compared to the search space of the FF algorithm. The second reason, from the practical perspective, is that the running time of the FF algorithm on ∆= 25 and 30 is zero (see subsection IV.C), whereas the running time of the FnPS with the best case is 0.5 seconds. Additionally, the running time of the FF algorithm on ∆= ∆ 0 + 15 = 40 is less than 1 minute, whereas the running time of the FnPS algorithm is greater than 5 hours. Finally, the results for the FnPS algorithm showed that increases in the value of b led to an increase in the search space, so we excluded this algorithm from the full comparison of all the integer factorization algorithms. Second, the results in respect of the FF1 algorithm are as follows. First, the results of running FF1 algorithm, on b = 100 and ∆ =25 and 30, 35 are 0.001 and 0.002 seconds, respectively. Second, the results of running FF1 algorithm, on b = 100 and ∆ =35 is greater than one hour in many cases without find the solution. As we see in the Subsection C, the running time for FF algorithm is faster than FF1 algorithm when ≥ 100, so we excluded this algorithm from the full comparison of all the integer factorization algorithms.

C. Results of the Comparison
Based on the results reported in Subsection IV.B, we compared seven of the nine algorithms, i.e., FF, FM10, FM20, FM1010, FM1020, FOE, and FE. The other two algorithms, FF1 and FnPS, were excluded from the final comparison because their performance was very poor.
The results of implementing the seven integer factorization algorithms using the data sets and platform described above are shown in Fig. 1, 2, 3, 4, and 5 for n = 100, 200, 300, 400, and 500, respectively. From the results shown in the figures, we can make several observations. 348 | P a g e www.ijacsa.thesai.org    Second, for a fixed value of b with increasing values of ∆, the running time for all seven algorithms rapidly increases. We can estimate this factor by calculating the percentage of the increase in the running time between two consecutive ∆, ∆ 1 and ∆ 2 , and a fixed value of b, where ∆ 2 = ∆ 1 + 5. By way of an example, Table IV shows this ratio for the FE algorithm when b = 300. The running times for the FE algorithm when b = 300 and ∆ =85, 90, and 95 are 0.0161 seconds, 16.18 seconds, and 4.62 hours, respectively. The percentage increase in the running time when ∆ 2 =90 is approximately equal to 1000 times the running time when ∆ 1 = 85, and the percentage of increase in the running time when ∆ 2 = 95 is approximately equal to 1000 times the running time when ∆ 1 = 90. This phenomenon is true for all values of b and for all seven of the integer factorization algorithms studied. Third, the difference between the FM10 and FM20 algorithms is not significant and the two algorithms are near to equal for all values of b and ∆ . On the other hand, the difference between the FM1010 and FM1020 algorithms is significant in respect of the average case. The FM1020 algorithm has better performance than the FM1010 algorithm by a percentage of 15.17%, 17.31%, 22.33%, 21.10%, and 22.17% for n = 100, 200, 300, 400, and 500, respectively. This means that the average percentage of improvement achieved by FM1020 is 19.62% when compared to FM1010.
Fourth, the performance of the FM10 and FM20 algorithms is weak compared to the other five algorithms for every b and ∆. The original Fermat algorithm, FF, has better performance than the FM10 and FM20 algorithms by almost 36% in respect of the average case.
Fifth, Table V shows the percentage of improvement achieved by each algorithm A compared to the FF algorithm for all values of b, where algorithm A is one of the following algorithms: FM10, FM20, FM1010, FM1020, FOE, and FE. Two types of value are presented in the table. The first value is a positive value, say ∝, which means that algorithm A has ∝ improvement compared to the FF algorithm, and the second value is a negative value, say -∝, which means that the FF algorithm has ∝ improvement compared to algorithm A.
Sixth, from Table V, it is clear that the percentage of disimprovement of algorithm A decreases with an increase in the number of bits. For example, the FM1010 algorithm has a disimprovement of 32.06%, 20.61%, 15.21%, and 1.04% for n = 100, 200, 300, and 400, respectively. Also, the percentage of improvement of algorithm A increases with an increase in the number of bits, except for the FOE algorithm. For example, the percentage of improvement achieved by the FE algorithm when compared to the FF algorithm is 4.89%, 11.83%, 31.78%, and 35.97% for n = 200, 300, 400, and 500, respectively.
Seventh, the FOE algorithm has two properties compared to the FF algorithm. The first is that the FOE algorithm has better performance compared to the FF algorithm for all values of b. The second is the percentage of improvement of the FOE algorithm compared to the FF algorithm is almost fixed and equal to 44% for all values of b. The reason for this property is that the number of iterations of the FOE algorithm is half that of the FF algorithm.  350 | P a g e www.ijacsa.thesai.org Eighth, Fig. 6 displays the behavior of all the algorithms for the average case for all different values of b. The same behavior occurs if we fix the value of ∆ and change the value of b. It is clear that the best integer factorization algorithm based on Fermat's strategy for all data sets is the algorithm that is based on sieving odd and even numbers, i.e., the FOE algorithm. The second and third best algorithms are FE and FM1020, respectively. Note that the two curves for the FM10 and FM20 algorithms coincide.
Ninth, Fig. 7 displays the running times for t = 100 instances for each algorithm A when b = 500 and ∆ = 140, where algorithm A is one of the following algorithms: FM1010, FM1020, FOE, and FE. From the figure, we can observe the following: (1) The running time of each algorithm A is varied and based on the value of n, except for the FE algorithm whose behavior is almost fixed; and (2) the running time of the FM1010 algorithm is slightly faster than that of the FF algorithm on average, as in Fig. 5(b), but in many instances the running time of the FM1010 algorithm is longer than the FF algorithm.

V. CONCLUSION AND OPEN PROBLEMS
In this work, we addressed the integer factorization problem that involves finding the two prime factors for an odd composite number. The inherent challenge of this problem is that no polynomial time has yet been found. The problem is used in cryptography to break the RSA cryptosystem. We classified the different Fermat factorization algorithms into three groups according to the integer factorization methods that were used. The algorithms were studied experimentally to measure their performance according to two factors: (1) the size of the odd composite factor, b bits, and (2) the difference between two factors, ∆ bits. The experimental study was conducted on data sets consisting of b =100, 200, 300, 400, and 500, and ∆ = /4, ( /4) + 5, ( /4) + 10, ( /4) + 15, and ( /4) + 20. The results of the experimental study showed that the algorithm based on sieving using the odd and even property performed the fastest factorization when applied to data sets of different sizes of odd composite numbers and different differences between the two factors with a percentage improvement of 44% compared to the original Fermat factorization algorithm. Also, the algorithm based on Euler's theorem exhibited a good level of performance compared with the Fermat factorization algorithm.
There are open questions remain related to this study.
(1) What is the behavior of FM1020, FOE and FE algorithms when > 500 bits? (2) How to use high-performance computing such as graphics processing unit (GPU) to speedup the running time for the best modified Fermat algorithm.
(3) What is the effect of using n mod m on FM1020 algorithm, when m>20.