A Robust Scheme to Improving Security of Data using Graph Theory

With the rapid growth of the internet and other telecommunication technologies, cryptography has become a necessity for securing communications between two or more entities, particularly when confidential data is transferred. Many encryption systems have been proposed in the literature to counter attack threats. These schemes are expected to ensure the confidentiality, integrity and authenticity of transmitted data. However, several of them have shown weaknesses in terms of security and complexity. Hence the need for a robust, non-standard encryption algorithm that prevents conventional attempts to sniff data. In this work, we propose a new encryption system designed to meet these security requirements. The scheme is based essentially on the principles of graph theory, which are well suited to representing plain text. Our approach proposes another use of the concepts of Hamiltonian circuit and adjacency matrix, combined with a shared key and a pseudo-random generator. Analysis of the experimental results indicates that the technique is both efficient and robust.

Keywords: Cryptography; encryption; security; graph theory; Hamiltonian circuit; adjacency matrix


I. INTRODUCTION
Cryptography is a branch of cryptology that relies on a set of techniques and methods that transform a clear or readable message into a completely unintelligible one. This discipline deals with several security issues such as the confidentiality of transmissions through unsecured channels, the privacy of individuals, the archiving of data on unsecured media, and so on. Cryptography thus allows the study and analysis of data encryption systems aimed at minimizing the reach of hackers and limiting, as much as possible, their unauthorized access to such data, while preserving the key concepts of information security that are confidentiality, integrity, authentication and finally non-repudiation [1]. The purpose of cryptography is therefore the construction of protection schemas that provide ironclad assurances of confidentiality or authenticity of transmitted messages when dealing with malicious attempts to access them.
Confidentiality is an essential aspect of security. It can be guaranteed via an encryption mechanism, through which data becomes unintelligible to any unauthorized party attempting to access it. The role of encryption algorithms is to transform a clear message into an encrypted one so that only authorized people can retrieve the message in its original, clear form by performing the reverse-process to encryption, namely decryption. By design, decryption should be made as difficult as possible to any unqualified, unauthorized party attempting to carry it out.
Over the years, cryptography has steadily evolved and gradually become indispensable to modern society, and contributions to this field remain of great interest. According to the literature, there are three types of cryptography. The first is symmetric-key cryptography, in which a single secret key is used for encryption, as in DES [2], IDES [3], AES [4], and others. The second is asymmetric cryptography, which is based on the use of two different keys (one private and the other public) [1], as in RSA [5], ElGamal [6], Diffie-Hellman [7], and so on. The third type is hybrid cryptography, which combines the previous two encryption methods. Today, modern cryptology is able to make use of a considerable set of mathematical tools, which has led to gains in performance and efficiency. Graph theory in particular is seen as promising in this respect, as it introduces concepts that might help solve problems in all areas related to networks.
Graph theory, in mathematics and computer science, is the study of graphs. Generally, a graph represents the structures and connections that form a complex set, giving a clear representation of the elements and expressing the relationships between them in a tangible way, as in communication networks, road networks, electrical circuits, and so on. Graphs therefore offer a way of thinking that allows a wide variety of problems to be modeled using vertices and edges. The Seven Bridges of Königsberg problem (1736) [8] is known for having laid the foundations of graph theory. Since the beginning of the 20th century, graph theory has developed into a full-fledged branch of mathematics, thanks in part to the work of König, Menger, Cayley, Berge, and Erdős. It has become a key element in many applications within computer science. It is a relatively recent concept in cryptography, where it has been successfully integrated and has allowed for the development of more powerful encryption algorithms that have proven difficult to crack even for modern software solutions. This is essentially a matter of modeling encryption problems in graph form so that they become problems in graph theory, for which solutions are generally known or more accessible. Some graph problems can be solved relatively easily and efficiently (the processing time is reasonable because it grows polynomially with the number of vertices in the graph). Others are quite difficult (the processing time grows exponentially), in which case a heuristic, that is, a practical problem-solving approach, is used to find an acceptable solution.

(IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 11, No. 5, 2020
As a relatively new yet quite powerful tool, graph theory is recognized by government agencies and organizations that have a vested interest in security as having made considerable contributions. It is used in the development of various encryption techniques as well as in sophisticated data communications. This has led to the application of the concepts introduced in graph theory in cryptography on a broad scale, seeing as many NP-hard problems stem from this theory.
Given all of the above, there seemed to be a great need for a new encryption system based on graph theory to be developed, that would ensure a high degree of security while requiring relatively simple resource processing. In this paper, an application of the principles related to this theory in the field of cryptography is presented; its aim being the development of a communications scheme that is both efficient and secure. This proposal makes use of disjoint Hamiltonian circuits for the presentation of data, and of the divide-and-conquer paradigm to simplify processing and facilitate encryption.
The remainder of the paper is arranged as follows. Section 2 presents preliminary knowledge. A literature review of related works is explained in Section 3. Section 4 describes the proposed scheme. Experimental results and analyses are detailed in Section 5, and finally, the conclusion is given in Section 6.

II. PRELIMINARY KNOWLEDGE
• Graph : Graph theory is a branch of applied mathematics. Fundamentally, a graph consists of a set of vertices, and a set of edges, where an edge is something that connects two vertices in the graph. A graph is a pair (V, E), where V is a finite set and E is a binary relation on V. V is called a vertex set whose elements are called vertices. E is a collection of edges, where an edge is a pair (u,v) with u,v in V. Graph G = (V, E) is a collection of V nodes connected by E links [1].
• Simple graph : Undirected graph that has no loops (edges connected at both ends to the same vertex) and no more than one edge between any two different vertices.
• Path : A path is a simple graph whose vertices can be ordered so that two vertices are adjacent if and only if they are consecutive in the list [1].
• Undirected Graph : A graph in which each edge represents an unordered, symmetric relationship between two nodes. Such edges are rendered as plain lines or arcs [1].
• Cycle : A chain whose initial and terminal nodes are the same and that does not use the same link more than once.
• Hamiltonian Path : A path that visits each vertex exactly once in an undirected graph.
• Hamiltonian Circuit : A Hamiltonian cycle (or Hamiltonian circuit) is a Hamiltonian Path such that there is an edge (in graph) from the last vertex to the first vertex of the Hamiltonian Path.
• Adjacency Matrix : Given a graph G with n vertices (ordered from $v_1$ to $v_n$), the Adjacency Matrix M of size n × n related to G can be defined by: $M_{ij} = 1$ if $(v_i, v_j) \in E$, and $M_{ij} = 0$ otherwise. For a weighted graph, the entry 1 is replaced by the weight of the edge.
• Divide and Conquer : An algorithmic strategy based on dividing an initial problem into several roughly equal sub-problems, solving the sub-problems separately, and then combining their results. This strategy can considerably reduce the complexity of mathematical problems that require a lot of processing.
• Blum Blum Shub (BBS) : [9] A simple unpredictable pseudo-random number generator that was proposed in 1986, and whose mathematical equation is described as follows: $x_{n+1} = x_n^2 \bmod M$, where M = p.q is the product of two prime numbers p and q, each congruent to 3 modulo 4. The security of this generator depends on the difficulty of factoring M, which means that the two primes must be chosen carefully to ensure robustness. BBS is a good choice for many applications, especially those related to cryptography, as it can generate unpredictable sequences.
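The BBS recurrence can be illustrated with a minimal Python sketch. The function name `bbs_states` and the toy parameters (p = 11, q = 23, seed 3) are assumptions chosen for readability; they are far too small for real use and are not the values used in the proposed scheme.

```python
import math


def bbs_states(seed, p, q, count):
    """Return `count` successive Blum Blum Shub states x_1..x_count.

    Illustrative sketch only: p and q must be primes congruent to
    3 mod 4, and the seed must be coprime to M = p * q.
    """
    if p % 4 != 3 or q % 4 != 3:
        raise ValueError("p and q must both be congruent to 3 mod 4")
    m = p * q
    if math.gcd(seed, m) != 1:
        raise ValueError("seed must be coprime to M = p * q")
    states = []
    x = seed
    for _ in range(count):
        x = (x * x) % m          # x_{n+1} = x_n^2 mod M
        states.append(x)
    return states


def bbs_bits(seed, p, q, count):
    """Extract one pseudo-random bit (the least significant) per state."""
    return [x & 1 for x in bbs_states(seed, p, q, count)]


if __name__ == "__main__":
    print(bbs_states(3, 11, 23, 6))   # toy parameters, M = 253
    print(bbs_bits(3, 11, 23, 6))
```

In practice p and q would be large secret primes, since the unpredictability of the stream rests on the difficulty of factoring M.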

III. RELATED WORKS
Nowadays, graph theory has contributed greatly to the development of various encryption techniques. A review of the relevant literature reveals several methods that have been put forward for such purposes.
Yamuna [10] proposed an encryption mechanism using Hamiltonian paths. The data is represented using a Hamiltonian path, and the complete graph is constructed by weighting the remaining vertices to increase the level of security.
Al Etaiwi [11] put forward a new encryption algorithm based on graph theory. His paper presents a new symmetric encryption algorithm that uses the concepts of complete graph and minimum spanning tree to strengthen security.
Yamuna [12] showed that Hamiltonian circuits could be used to represent multiple messages through a single graph and that encryption could be done using time-dependent functions.
Yamuna et al. [13] used musical notation to represent the secret key (musical note) and graph theory properties to generate keys. This approach is based on the Propagating Cipher Block Chaining (PCBC) mode for encrypting binary messages. In 2014, the same authors proposed a PIN-code encryption method in the form of a digraph [14].
In [15], the authors proposed a graph-based modified DES (GMDES) algorithm which is more secure than the classical DES algorithm [2]. The proposed graph does not depend entirely on the secret key, and for the same plain text it produces different cipher text using the same secret key, which reduces the probability of various attacks.
Agarwal and Uniyal [16] proposed an encryption scheme based on transforming ASCII values into prime numbers using an encryption key of similar size to the clear message. The authors then randomly generated a prime weighted graph by taking into account the prime number weights assigned to each of the edges.
The system that Amounas [17] put forward handles the original data using graph theory and some of its properties. Its main concept is the generation of a complete weighted graph. More specifically, this approach offers a new way of labeling the edges of a graph. It subsequently applies a matrix approach based on ECC operations to generate strong encrypted text.
Recent work in the literature includes the technique proposed in [18], where each character of the data is encrypted into an Euler graph. The authors used the Hamiltonian circuit as a key to secure the data.
Selim G. Akl designed in [19] an encryption process to transmit a secure message. The author employed three distinct graphs constructed successively, and based on an unconventional mapping, conjectured to be a trapdoor one-way function, and which is conceived especially for graph structures.
In [20], two graph based public key cryptosystems have been proposed for protecting valuable information. The first method is purely based on matrix properties, and the second is based on graphical codes.

IV. THE PROPOSED APPROACH
The system put forward in this paper uses the fundamental concepts of graph theory to facilitate the handling of raw data. The basic idea is to generate weighted graphs by using Hamiltonian circuits in a novel way.
The results that have been achieved in this work appear to be promising, especially in terms of complexity and speed when it comes to processing the clear message. The proposed mode of operation is essentially based on the divide-and-conquer design paradigm, which involves breaking down an initial problem into smaller sub-problems and then dealing with each sub-problem separately.
We conceived this approach primarily to address complexity concerns, as most existing works represent a clear message using a graph of similar size, which in turn becomes an adjacency matrix used to process and handle data. It follows that in such cases, the larger the matrix, the more complex the linear operations will be.
With that in mind, using the divide-and-conquer design paradigm has allowed us to reduce complexity by dividing the message into smaller blocks instead of processing it in its entirety. Each block is represented by disjoint Hamiltonian circuits to reduce the size of the graph associated with the block.
To describe fully how the new encryption technique actually works, we will illustrate it in two main algorithms (Encryption in Algorithm 1 and Decryption in Algorithm 2).
Each algorithm includes several functions, which we define and explain below.
The input of our Graph Encryption algorithm is a clear message with n characters. The function ASCII Transformation converts each character of the message into its ASCII value. It returns an array of integers belonging to the interval [0, 255].
The second function, Decomposition Block, decomposes the array into several blocks BlockSet_k using the following formula: $n = 25k + r$, where n is the size of the message, r (belonging to the interval [0, 24]) is the remainder of the division of n by 25, and k is its quotient. If the division is exact (r = 0), the number of blocks is k; otherwise it is k + 1.
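The first two functions can be sketched in a few lines of Python. The names `ascii_transformation` and `decomposition_block` mirror the paper's function names but are otherwise hypothetical, and the sketch leaves the last block unpadded, which is an assumption.

```python
BLOCK_SIZE = 25  # block size stated in the text


def ascii_transformation(message):
    """Convert each character to its code in [0, 255]."""
    values = [ord(ch) for ch in message]
    if any(v > 255 for v in values):
        raise ValueError("message contains characters outside [0, 255]")
    return values


def decomposition_block(values, block_size=BLOCK_SIZE):
    """Split the array into k blocks (k + 1 when n is not a multiple of 25)."""
    return [values[i:i + block_size]
            for i in range(0, len(values), block_size)]


if __name__ == "__main__":
    blocks = decomposition_block(ascii_transformation("A" * 60))
    # n = 60 = 25 * 2 + 10, so three blocks of sizes 25, 25 and 10
    print([len(b) for b in blocks])
```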
The third function, Decomposition Key, derives from the KEK (Key Encryption Key) master key of size m = 256 the sub-keys $K_i$ (where i = 0, ..., k − 1), square matrices of order 13 that are used to encrypt each block (Fig. 1 depicts this process).
The generation of sub-keys is carried out in two steps. First, a Key of size 13k is generated from the KEK master key of size m = 256. Then, k sub-keys are generated from that Key as square matrices of order 13. For each Block_i, the ASCII value of the first character, which lies in the range [0, 255], is used as a position that selects a digit of the KEK master key. This digit is used to generate the seed $S_i$ of size 13. The vector $S_i$ is then used to generate the sub-key $K_i$ in the form of a square matrix of order 13.
Additionally, each size-25 block (minimum block size) is partitioned into two sub-blocks of sizes 13 and 12 respectively. This choice relies on the fact that a complete graph with n vertices contains (n − 1)/2 edge-disjoint Hamiltonian circuits if n is an odd number strictly greater than 3 [21].
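The count of (n − 1)/2 edge-disjoint Hamiltonian circuits can be checked constructively. The sketch below uses the classical Walecki zigzag construction, which is an assumption made here for illustration; the text does not say which construction the scheme relies on. The function names are hypothetical.

```python
def walecki_circuits(n):
    """Return (n - 1) / 2 edge-disjoint Hamiltonian circuits of K_n, n odd.

    Vertices 0..n-2 sit on a circle and vertex n-1 acts as a hub.
    Circuit i is: hub, i, i+1, i-1, i+2, i-2, ..., i+m, back to hub.
    """
    if n < 3 or n % 2 == 0:
        raise ValueError("n must be an odd number >= 3")
    m = (n - 1) // 2
    hub = n - 1
    circuits = []
    for i in range(m):
        path = [i]
        for step in range(1, m):
            path.append((i + step) % (2 * m))
            path.append((i - step) % (2 * m))
        path.append((i + m) % (2 * m))
        circuits.append([hub] + path)
    return circuits


def circuit_edges(circuit):
    """Undirected edge set of a closed circuit given as a vertex list."""
    size = len(circuit)
    return {frozenset((circuit[j], circuit[(j + 1) % size]))
            for j in range(size)}


if __name__ == "__main__":
    circuits = walecki_circuits(13)
    covered = set().union(*(circuit_edges(c) for c in circuits))
    # 6 circuits of 13 edges each cover all 13 * 12 / 2 = 78 edges of K_13
    print(len(circuits), len(covered))
```

Because the six circuits together use every edge of the complete graph exactly once, any two of them can carry the two sub-blocks of a block without interfering with each other.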
A complete graph with 13 vertices therefore contains 6 disjoint Hamiltonian circuits. It follows that we can represent each block by a graph of size 13 containing two disjoint Hamiltonian circuits. We convert the odd-size sub-block into an Eulerian cycle in which the ASCII values of its characters serve as the weights of the edges. Thereafter, we represent the second sub-block using one of the other five Hamiltonian circuits, filling the missing values with the ASCII code of the null character (Block Graph). Then we use the adjacency matrix to represent the resulting graph (Graph Matrix). This representation is advantageous in terms of processing complexity when compared with the traditional representation of the message, which would normally take place within a single Eulerian cycle.
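A block-to-matrix mapping along these lines can be sketched as follows. Several choices are assumptions made for illustration and are not taken from the paper: the two circuits are constant-step walks on the vertices 0..12 (steps 1 and 2, which are edge-disjoint because 13 is prime), the matrix is stored symmetrically, and the helper names are hypothetical.

```python
N = 13  # order of the graph used for each block


def step_circuit(step, n=N):
    """Hamiltonian circuit 0, step, 2*step, ... (mod n); valid for prime n."""
    return [(j * step) % n for j in range(n)]


def block_to_matrix(block, n=N):
    """Place up to 2n values as edge weights on two disjoint circuits."""
    padded = list(block) + [0] * (2 * n - len(block))   # null-character padding
    matrix = [[0] * n for _ in range(n)]
    for step, weights in ((1, padded[:n]), (2, padded[n:])):
        circuit = step_circuit(step, n)
        for j, weight in enumerate(weights):
            u, v = circuit[j], circuit[(j + 1) % n]
            matrix[u][v] = matrix[v][u] = weight        # undirected edge
    return matrix


def matrix_to_block(matrix, n=N):
    """Read the weights back by walking the same two circuits."""
    values = []
    for step in (1, 2):
        circuit = step_circuit(step, n)
        values += [matrix[circuit[j]][circuit[(j + 1) % n]] for j in range(n)]
    return values


if __name__ == "__main__":
    block = [ord(c) for c in "GRAPH THEORY IS PROMISING"]   # 25 characters
    matrix = block_to_matrix(block)
    print(matrix_to_block(matrix)[:len(block)] == block)
```

One limitation of this sketch is that a zero weight is indistinguishable from a missing edge, so recovery relies on knowing the circuits in advance rather than on inspecting the matrix.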

A. Encryption Mode
We encrypt each block using CBC (Cipher Block Chaining). In this mode, an 'exclusive OR' (XOR) operation is applied to each Block_i using the encryption of the preceding block before the current block itself is encrypted. The result is then used in another XOR operation with the sub-key $K_i$ produced by the BBS pseudo-random generator to obtain the cipher $C_i$ of the current block. The encryption of the first block ($M_0$) is performed after an XOR operation with a randomly generated initialization matrix (IM). Each encrypted block is represented by a vector $V_i$ of size $13^2$, obtained by concatenating the lines of the adjacency matrix using the function ConcatenateLines. Finally, the resulting vectors are concatenated by the function ConcatenateVec into a single vector of size $13^2 k$, which represents the encrypted message EM that is sent, together with the vector FCB containing the positions used to generate the Key. Fig. 2 illustrates the encryption mode.
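The chaining described above can be sketched with plain XOR operations on flattened blocks. This is a simplified illustration under stated assumptions: it models only the two XOR steps (chaining value, then sub-key), the sub-keys and the initialization matrix are drawn from Python's `secrets` module as stand-ins for the BBS-derived values, and all function names are hypothetical.

```python
import secrets

VECTOR_SIZE = 13 * 13  # one flattened 13 x 13 matrix


def xor_vec(a, b):
    """Element-wise XOR of two equal-length integer vectors."""
    return [x ^ y for x, y in zip(a, b)]


def cbc_encrypt(blocks, subkeys, init_vector):
    """C_i = (M_i XOR C_{i-1}) XOR K_i, with the IM used for the first block."""
    previous, ciphers = init_vector, []
    for block, key in zip(blocks, subkeys):
        cipher = xor_vec(xor_vec(block, previous), key)
        ciphers.append(cipher)
        previous = cipher
    return ciphers


def cbc_decrypt(ciphers, subkeys, init_vector):
    """M_i = (C_i XOR K_i) XOR C_{i-1}, the inverse of cbc_encrypt."""
    previous, blocks = init_vector, []
    for cipher, key in zip(ciphers, subkeys):
        blocks.append(xor_vec(xor_vec(cipher, key), previous))
        previous = cipher
    return blocks


def random_vector(size=VECTOR_SIZE):
    """Stand-in for a BBS-derived vector of byte values (assumption)."""
    return [secrets.randbelow(256) for _ in range(size)]


if __name__ == "__main__":
    messages = [random_vector() for _ in range(3)]
    keys = [random_vector() for _ in range(3)]
    im = random_vector()
    encrypted = cbc_encrypt(messages, keys, im)
    print(cbc_decrypt(encrypted, keys, im) == messages)
```

Because each cipher block feeds into the next, two identical plain blocks produce different cipher blocks, which is the usual motivation for chaining.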

B. Decryption Mode
The input of our Graph Decryption algorithm is an encrypted message EM of size $m = 13^2 k$. The function Decomposition Vector decomposes the vector V of size $13^2 k$ into k encrypted blocks EMatrixSet_k. To decrypt a message of size m, the number of blocks is calculated as follows: $k = m / 13^2$. We also generate one key of size 13k from the master key using the vector FCB. Sub-keys are then generated to decrypt each block $C_i$ (i = 1, ..., k) by reversing the XOR operations applied during encryption. After this stage, each block is decrypted, the disjoint circuits are extracted, and the blocks Block_i (Graph Block) are reconstructed. We then convert the ASCII values into characters for each block. Finally, the character blocks are concatenated (Concatenate Block) to form the clear message, as shown in Fig. 3.
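The vector handling on both sides can be sketched as a pair of inverse helpers. The names follow the paper's ConcatenateLines, ConcatenateVec and Decomposition Vector, but the signatures are assumptions.

```python
ORDER = 13


def concatenate_lines(matrix):
    """Flatten a square matrix row by row into a vector of size ORDER**2."""
    return [value for row in matrix for value in row]


def concatenate_vec(vectors):
    """Join the per-block vectors into the single encrypted message vector."""
    return [value for vector in vectors for value in vector]


def decomposition_vector(vector, order=ORDER):
    """Split a vector of size order**2 * k back into k square matrices."""
    size = order * order
    if len(vector) % size != 0:
        raise ValueError("vector length must be a multiple of order squared")
    k = len(vector) // size                      # k = m / 13^2
    return [[vector[b * size + r * order: b * size + (r + 1) * order]
             for r in range(order)]
            for b in range(k)]


if __name__ == "__main__":
    matrices = [[[b * 1000 + r * ORDER + c for c in range(ORDER)]
                 for r in range(ORDER)] for b in range(2)]
    message = concatenate_vec([concatenate_lines(m) for m in matrices])
    print(len(message), decomposition_vector(message) == matrices)
```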

V. EXPERIMENTAL RESULTS AND ANALYSES
The evaluation of the encryption technique covers, on the one hand, the performance and efficiency of the algorithm and, on the other hand, the robustness of the scheme against certain attacks such as the brute-force attack.

A. Statistical Tests
In this part, the DIEHARD test suite [22] is used to analyze the randomness of the output of the proposed block cipher. The main purpose of this test is to establish that our algorithm is able to withstand statistical attacks. In other words, the output of a secure block cipher should be statistically indistinguishable from random output. For this test to be carried out, a sequence of randomly generated ciphers is first converted into binary to produce a bit-stream larger than 10 MB. Then, this bit-stream is statistically analyzed by subjecting it to the DIEHARD tests. Each DIEHARD test yields a p-value, and a test is considered passed when the p-value lies in the interval [0.025, 0.975]. The mean values that were obtained are summarized in Table I. Results show that the bit-stream generated using our proposed method has passed all DIEHARD tests. Our encryption system therefore displays satisfactory random and statistically indistinguishable behavior.

B. Brute-Force Attack
A brute-force attack tries every possible key until the correct one is found. Assume that a high-performance machine takes $10^{-10}$ seconds to test the validity of each key, and that the numbers used in the master key are between 1 and 100. Our algorithm then has $100^{256}$ possible keys, and a brute-force attack would take about $10^{-10} \times 100^{256} = 10^{502}$ seconds to obtain the correct key. To reveal a 25-character message, that is, in the case where only one block is used, there are 100 possibilities for the master-key number that serves as the seed of the BBS generator used to produce the vector $S_0$. That said, the prime numbers used as input parameters for the generator are difficult to determine because of the hardness of factorization. Therefore, it is almost impossible to find the sub-key if the product pq is sufficiently large.
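The key-space estimate can be reproduced with exact integer arithmetic. The constant names below are hypothetical; the figures (256 positions, 100 values per position, $10^{-10}$ seconds per test) are those stated in the text, and a 365-day year is assumed for the conversion.

```python
KEY_LENGTH = 256                  # size of the KEK master key
VALUES_PER_POSITION = 100         # each number lies between 1 and 100
TESTS_PER_SECOND = 10 ** 10       # one key tested every 10^-10 seconds
SECONDS_PER_YEAR = 365 * 24 * 3600

key_space = VALUES_PER_POSITION ** KEY_LENGTH
seconds = key_space // TESTS_PER_SECOND
years = seconds // SECONDS_PER_YEAR

if __name__ == "__main__":
    # Report orders of magnitude rather than the full 500-digit numbers.
    print(f"key space : 10^{len(str(key_space)) - 1}")
    print(f"seconds   : 10^{len(str(seconds)) - 1}")
    print(f"years     : about 10^{len(str(years)) - 1}")
```

This bound concerns exhaustive search over the master key only; it says nothing about attacks that exploit structure in the scheme.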

VI. CONCLUSION
This paper puts forward a new block-based encryption scheme that utilizes the divide-and-conquer paradigm as well as the fundamental concepts of graph theory to simplify and facilitate processing. Various statistical tests have been carried out to assess the security of this new algorithm, and all of them indicate that it resists statistical attacks. Moreover, the BBS generator has been used to generate encryption keys for our algorithm, which has further improved key strength. As future work, we aim to design our own pseudo-random generator in order to provide pseudo-keys. We also aim to exploit other graph theory properties for a more robust representation of data.