AES Inspired Hex Symbols Steganography for Anti-Forensic Artifacts on Android Devices

Mobile phone technology has become one of the most common and important technologies. It started as a communication tool and then evolved into a key reservoir of personal information and smart applications. With this increased complexity, greater dangers have emerged, along with countermeasures and opposing countermeasures such as mobile forensics and anti-forensics. One of these anti-forensics tools is steganography, which introduces higher levels of complexity and security against hackers’ attacks but simultaneously creates obstacles to forensic investigations. In this paper we propose a new data hiding approach, the AES Inspired Steganography (AIS), which utilizes some AES data encryption concepts while hiding the data using the concept of hex symbols steganography. As the approach is based on multiple encryption steps, the resulting carrier files would be unfathomable without the cipher key agreed upon by the communicating parties. These carrier files can be exchanged amongst Android devices and/or computers. Assessments of the proposed approach have shown it to be advantageous over the currently existing steganography approaches in terms of character frequency, security, robustness, key length, and compatibility.

Keywords: Mobile Forensics; Anti-Forensics; Artifact Wiping; Data Hiding; Steganography; AES


I. INTRODUCTION
As mobile phones rapidly evolved from a means of communication into reservoirs of personal information and smart applications [1], they exposed their users to increasing dangers and complexities. Consequently, many fields and technologies have been developed as countermeasures to such dangers. One of these fields is mobile forensics, which aims at collecting and analyzing digital evidence to resolve mobile issues. On the other side, opposing measures such as anti-forensics technologies have been developed to hinder the use of mobile forensics [2]. One of these anti-forensics tools is steganography.
Steganography systems are utilized to embed secret messages in hex symbols, image, audio and video files that can only be discovered by the parties informed of the secret key of the chosen steganography algorithm. Thus, steganography introduces a higher level of complexity that protects against attacks but at the same time creates an obstacle for forensic investigations [3]. This paper proposes a new steganography approach inspired by the Advanced Encryption Standard (AES), a formal encryption method adopted by the National Institute of Standards and Technology of the US Government and accepted worldwide. This encryption method was developed and adopted as a replacement for the Data Encryption Standard (DES) because of the disadvantages DES presented.
The AES encryption method is a block cipher with a block size of 128 bits (16 bytes) that processes a single block of data at a time and encrypts data through several rounds with the aid of an encryption key. During these ten to fourteen rounds, the data is continuously mixed up and re-encrypted, which increases the security of the hidden data. A single encryption key is used in the AES method, with a length of 128 bits (16 bytes), 192 bits (24 bytes), or 256 bits (32 bytes). The same key is used for both the encryption and decryption processes, which is known as symmetric encryption. This is the opposite of the asymmetric encryption used in other methods, which utilize two different keys, a public and a private key [4]. This paper introduces some of the currently existing anti-forensics approaches and techniques. It then presents the new AES Inspired Steganography (AIS) approach. This method utilizes hex symbols for the embedding of the secret message, similarly to our previously proposed HAS approach [5]. The approach is applied to purposefully created hex symbol carrier files (created using HxD, for example) that are viewed and edited using the WinHex software.
The AIS approach is proposed to have advantages over the currently existing steganography approaches in its capacity, security and robustness. Capacity refers to the maximum amount of data that the stego-medium can contain. Security refers to the ability of the approach and stego-medium to maintain the secrecy of the data by eliminating chances of discovery by third parties. Robustness signifies the ability of the stego-medium to withstand modifications without the loss or compromise of its secretly hidden content [6].
The paper presents background information and related work on anti-forensic techniques, artifact wiping, data hiding, and steganography tools and approaches in sections 2 and 3. It then elaborates further on anti-forensics steganography in section 4. The newly proposed AES Inspired Steganography (AIS) is described in section 5, and the implementation process is explained in section 6. Finally, the new approach is analyzed and discussed in section 7.

II. RELATED WORK
Data hiding embeds information in carrier files without changing the general content and format of the file. Encryption, however, leads to general changes to the carrier files that are observable by eye, reducing the security of the file. Therefore, although encryption increases the difficulty of deciphering the secret messages, the evidence of its existence leaves it prone to attacks. Combining it with steganography could reduce this vulnerability. AES inspired approaches have been developed previously and incorporated into the currently used multimedia steganography methods such as image steganography.
In their paper [7] "A Novel Steganographic Scheme Based on Hash Function Coupled with AES Encryption" (2014), Rinu et al. presented an AES inspired steganography approach in which the textual data to be hidden is encrypted using AES and embedded in a coloured image using a hash based algorithm. Singh and Attri (2015) proposed another AES inspired steganography approach in their paper [8] "Dual Layer Security of data using LSB Image Steganography Method and AES Encryption Algorithm". In their approach, data is embedded in carrier files using LSB image steganography and encrypted using 128-bit AES encryption, resulting in two layers of protection for the hidden data. However, this approach has been found to invalidate the stego image used as the carrier file.
Another approach utilizing the concepts of AES and steganography was presented by Goyal and Sharma in their paper [9] "Proposed AES for Image Steganography in Different Medias" (2014). Their approach utilizes the process of the modified AES (consisting of key expansion, sub bytes modification, shift of rows and mix up of columns) in image-audio steganography. However, the same issue of image invalidation is observed upon the application of this method.

Ramaiya et al. (2013) proposed an image steganography technique based on the AES method in their paper [10] "Secured Steganography Approach Using AES". In their approach, the text to be hidden is converted into a binary representation and then embedded into the cover image. The method allows for the use of a 128-bit block size of text and a 128-bit secret key.

III. ANTI-FORENSICS
Anti-forensics (AF) techniques are used to avoid and eliminate the possibility of evidence detection by the mobile forensics tools [1]. AF techniques and tools are continuously and rapidly evolving. Two major types of anti-forensic techniques, artifact wiping and data hiding, will be briefly presented next.

A. Artifact Wiping
Artifact wiping, also known as sanitization, overwrites data files on digital devices, permanently erasing them. Some artifact wiping tools, including Binary Code (BC) wipe, Eraser, and Pretty Good Privacy (PGP) wipe, target empty and unallocated spaces [11].

B. Data Hiding
Data hiding tools have been developed to secretly embed and hide data so that it remains undiscoverable. The approaches they use include transferring data to other portable storage devices and then wiping the data from the phone; making data "invisible" and concealing its existence; embedding data in multimedia (hex symbols, image, audio and video) files; and altering file extensions.

IV. THE ANTI-FORENSIC STEGANOGRAPHY
According to [3], "Steganography is the art and science of hiding information in plain sight". Thus, through steganography, a stego-system unknown to uninvolved third parties can be created to allow for data exchange under extremely secure conditions. Digitally, data hiding techniques are important tools for the utilization of steganography. Through these tools, hex symbols, image, audio and video steganography can be applied. Steganography techniques are generally categorized into spatial domain and frequency domain techniques.
A spatial domain technique embeds the information to be concealed in the pixel intensities of the carrier multimedia file. The advantage of this category of techniques is their use of Least Significant Bit (LSB) algorithms to embed the data payload. However, the drawback is that the majority of the LSB techniques are susceptible to attacks. In frequency domain techniques, on the other hand, images are transformed to frequency components by using techniques such as the Fast Fourier Transform (FFT), Discrete Cosine Transform (DCT) or Discrete Wavelet Transform (DWT). The messages are then planted and hidden in some or all of the transformed coefficients [12].
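To make the spatial domain idea concrete, the following is a minimal sketch of LSB embedding on a flat sequence of intensity bytes. It is not taken from any of the cited schemes; the function names and the sample cover values are assumptions chosen for illustration.

```python
def embed_lsb(cover: bytes, payload_bits: list[int]) -> bytes:
    """Overwrite the least significant bit of each cover byte with one payload bit."""
    if len(payload_bits) > len(cover):
        raise ValueError("payload does not fit in the cover")
    stego = bytearray(cover)
    for i, bit in enumerate(payload_bits):
        stego[i] = (stego[i] & 0xFE) | (bit & 1)  # clear the LSB, then set it
    return bytes(stego)


def extract_lsb(stego: bytes, n_bits: int) -> list[int]:
    """Read back the first n_bits least significant bits."""
    return [byte & 1 for byte in stego[:n_bits]]


cover = bytes([200, 201, 202, 203, 204, 205, 206, 207])  # hypothetical intensities
bits = [0, 1, 0, 0, 0, 0, 0, 1]                          # bits of "A" (0x41)
stego = embed_lsb(cover, bits)
print(list(stego))            # [200, 201, 202, 202, 204, 204, 206, 207]
print(extract_lsb(stego, 8))  # [0, 1, 0, 0, 0, 0, 0, 1]
```

Each cover byte changes by at most one intensity level, which is why the change is hard to see but can still be exposed by statistical attacks.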
In brief, the process of steganography commences with an agreement between two parties on a stego-system and a secret key for the embedding algorithm. The chosen embedding algorithm is then responsible for allocating the carrier files according to their hexadecimal content. The hexadecimals are modified and replaced with the hexadecimals of the secret message to be exchanged by the parties involved. This process prevents any third party lacking the knowledge of the secret key and the chosen embedding algorithm from discovering the embedded data or breaching the carrier file contents [3].
In cryptography, sensitive and secret messages are stored and transmitted across insecure networks while protected from access by intruding parties. Created with a secret key, the encrypted data can only be accessed by the intended parties possessing this key, which aids in deciphering the data [13].
In this paper, we develop an approach that combines concepts from steganography and encryption. The secret message is encrypted using an AES-like process and embedded using the hex symbols steganography algorithm proposed in our previous paper [5]. Subsequent encryption steps are applied to the carrier file as well to further increase the security of the hidden data. We call this approach the AES Inspired Steganography (AIS) (figure 1).

A. AES Inspired Steganography (AIS) Design
In this paper, we introduce a new data hiding and encryption method that we call AES Inspired Steganography (AIS). Through this method we aim to overcome the problem of changing and invalidating the carrier file observed in traditional encryption methods.
In general, this method consists of multiple steps of encryption applied to the secretly hidden message. The message, in turn, is embedded into a hex symbols carrier file, which is divided into embedding matrices and cipher key matrices according to varied patterns chosen by the communicating parties. Furthermore, encryptions and rearrangements are applied to the hidden data before and after it is embedded into the carrier file. The use of such variations in hiding and encrypting the data increases the security of the approach. More specifically, the encryption process includes inverting the hexadecimal representation of the secret message characters before embedding the message in the carrier file. The embedding process is likewise designed to have varying patterns, such that different combinations of embedding matrices are identified as containing the secret message. Furthermore, more specific patterns are used to identify which characters of the segments are replaced by the characters of the secret message.
Besides the embedding matrices, cipher key matrices and black segments are included in the matrix divisions, making the deciphering process even more difficult without a secret key.
After the embedding process, rearrangement of the segments and application of the XOR operation between the cipher key and the embedding matrices further mask the hidden message. These steps are followed by random rearrangements and switching of the row and column locations of the matrices. The output version of the carrier file in this case maintains its validity and integrity. Furthermore, the original content of the carrier file is unfathomable, as it mainly consists of random hex symbols. Therefore, the steganography process will not alarm or attract the attention of intruding parties. Additionally, as the file is specifically created for the steganography process, unlike multimedia files, the possibility of comparison with an original copy of the file to identify changes is eliminated.

B. AES Inspired Steganography Algorithm (AIS)
First, the communicating parties (i.e. sender and receiver) agree upon certain patterns that will be used as keys for embedding and extracting the secret message content. These patterns are created as follows. A carrier file in the form of symbols is created, for example using the HxD software, and converted into hexadecimal using the WinHex program. The resulting hex symbols file is segmented into 16×16 matrices, which are numbered as shown in fig. 2. The matrices are then sorted according to the chosen pattern into matrices for embedding and matrices for the cipher key. The matrices are coupled so that each matrix used for embedding is accompanied by a matrix for a cipher key. Fig. 3 provides an example of the arrangement of the hex symbols matrices. From the embedding matrices, specific segments are chosen to conceal the secret message. Cipher key segments are likewise allocated to each of the chosen embedding segments. The allocation pattern of these chosen segments is specified in the hiding keys shared between the communicating parties. This paper applies pattern 1 (P1) = C1E2C3C4E5E6 as an example. Each 16×16 matrix of the hex symbol representation of the carrier file is further divided into 16 segments (each forming a 4×4 matrix). These segments are then numbered from 1 to 16.
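The division described above can be sketched as follows. This is only an illustration: the function names are hypothetical, and the row-by-row numbering of the 4×4 segments is an assumption, since the actual numbering is defined by the figures.

```python
import re


def parse_pattern(pattern: str) -> list[tuple[str, int]]:
    """Split a pattern such as 'C1E2C3C4E5E6' into (role, matrix number) pairs."""
    return [(role, int(num)) for role, num in re.findall(r"([CE])(\d+)", pattern)]


def split_into_segments(matrix: list[list[int]]) -> dict[int, list[list[int]]]:
    """Cut a 16x16 matrix into sixteen 4x4 segments numbered 1..16.

    Assumption: segments are numbered row by row, left to right.
    """
    if len(matrix) != 16 or any(len(row) != 16 for row in matrix):
        raise ValueError("expected a 16x16 matrix")
    segments = {}
    for seg_row in range(4):
        for seg_col in range(4):
            number = seg_row * 4 + seg_col + 1
            segments[number] = [
                matrix[seg_row * 4 + r][seg_col * 4: seg_col * 4 + 4]
                for r in range(4)
            ]
    return segments


roles = parse_pattern("C1E2C3C4E5E6")
embedding = [n for role, n in roles if role == "E"]   # [2, 5, 6]
cipher_key = [n for role, n in roles if role == "C"]  # [1, 3, 4]

# A placeholder matrix whose cells hold their own index, to make slices visible.
matrix = [[16 * r + c for c in range(16)] for r in range(16)]
segments = split_into_segments(matrix)
print(segments[1][0])  # [0, 1, 2, 3]
```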
The segments to be excluded from the embedding process are specified as black segments. In our example, matrix E2 includes 4 black segments: 1, 7, 10, and 16. Each of the remaining segments is then given a pattern, which is indicated using a letter of the alphabet as shown in figure 4. The input secret message is then converted into a hexadecimal representation, so each character of its content takes the form of a two-digit hex symbol. The two digits representing each character of the secret message are then inverted, i.e. swapped. For example, if the letter "c" were to be hidden, it would first be converted to the hex code 63 and then inverted to become 36. The resulting inverted hex representations are then embedded into segments of the carrier file matrices that have a unique pattern, as shown in Fig. 4.
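The conversion and inversion step can be sketched in a few lines. The sketch assumes ASCII encoding and uppercase hex digits; the function names are hypothetical.

```python
def to_inverted_hex(message: str) -> list[str]:
    """Return each character as two hex digits with the two digits swapped."""
    inverted = []
    for byte in message.encode("ascii"):
        hex_pair = f"{byte:02X}"         # e.g. 's' -> '73'
        inverted.append(hex_pair[::-1])  # swap the digits -> '37'
    return inverted


def from_inverted_hex(symbols: list[str]) -> str:
    """Undo the swap and decode the bytes back to text."""
    return bytes(int(symbol[::-1], 16) for symbol in symbols).decode("ascii")


print(to_inverted_hex("Steg"))                     # ['35', '47', '56', '76']
print(from_inverted_hex(["35", "47", "56", "76"]))  # Steg
```

Swapping the digits is its own inverse, so the receiver applies the same swap to recover the original hex code.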
Similarly, the cipher key matrices are divided into 16 segments, each forming a 4×4 matrix, which are numbered from 1 to 16 (figures 9 and 10). Black segments are chosen as well, in this case 3, 8, 10, and 15.
The black segments in the cipher key matrix are relocated to match the locations of those in the embedding matrix to which the cipher key matrix is coupled. For example, segment number 3 is moved forward and relocated prior to segment number 1. Segments number 1 and 2 are then each shifted one block forward, and so on (fig. 5). The hexadecimals of both the cipher key and the embedding matrices are then converted to binary representation, specifically the hexadecimals representing the secret message content in the embedding segments and those opposing them in the cipher key segments. After the conversion, the binary values are combined using the XOR operation, and the resulting binary representation is converted back to hexadecimal representation. The process is illustrated in table I. The 16 segments are then randomly rearranged; the order is recorded in a table and allocated a certain code, such as Random(S) E2 = 3,4,1,2,7,8,5,6,11,12,9,10,15,16,13,14.
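The XOR step and the segment rearrangement can be illustrated with the short sketch below. The key symbol A5 is a made-up value, the function names are hypothetical, and reading the Random(S) list as "the segment placed at each position" is an assumption about how the code is meant to be interpreted.

```python
def xor_hex(embedded: str, key: str) -> str:
    """XOR two two-digit hex symbols by way of their 8-bit binary forms."""
    a = format(int(embedded, 16), "08b")
    b = format(int(key, 16), "08b")
    result_bits = "".join("1" if x != y else "0" for x, y in zip(a, b))
    return format(int(result_bits, 2), "02X")


def rearrange(segments: list, order: list[int]) -> list:
    """Build a new list whose position i holds segment number order[i] (1-based)."""
    if sorted(order) != list(range(1, len(segments) + 1)):
        raise ValueError("order must list every segment number exactly once")
    return [segments[number - 1] for number in order]


RANDOM_S_E2 = [3, 4, 1, 2, 7, 8, 5, 6, 11, 12, 9, 10, 15, 16, 13, 14]

print(xor_hex("37", "A5"))  # 92  (0011 0111 XOR 1010 0101 = 1001 0010)

labels = [f"S{n}" for n in range(1, 17)]  # stand-ins for the sixteen 4x4 segments
print(rearrange(labels, RANDOM_S_E2)[:4])  # ['S3', 'S4', 'S1', 'S2']
```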
The contents of these matrices are relocated by exchanging rows with columns in order to increase the difficulty for hackers.
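Exchanging rows with columns corresponds to a matrix transpose. The sketch below shows it on a single 4×4 segment with made-up hex symbols; whether the exchange is applied per segment or to the whole 16×16 matrix is not fixed by this illustration, and the function works for either size.

```python
def transpose(matrix: list[list[str]]) -> list[list[str]]:
    """Flip a square matrix around its main diagonal (rows become columns)."""
    return [list(column) for column in zip(*matrix)]


segment = [  # hypothetical hex symbols
    ["A1", "B2", "C3", "D4"],
    ["E5", "F6", "07", "18"],
    ["29", "3A", "4B", "5C"],
    ["6D", "7E", "8F", "90"],
]

flipped = transpose(segment)
print(flipped[0])  # ['A1', 'E5', '29', '6D']
```

Applying the transpose a second time restores the original arrangement, which is what the receiving party relies on.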
The hex symbols book is a table-like reference shared between the communicating parties, containing the key information needed to decipher the encrypted message. This book includes the arrangement of the embedding and cipher key matrices (E# / C#) for a collection of chosen patterns. Furthermore, each pattern is accompanied by information about the black segments, the original allocations of each of the segments before the randomization processes, and the patterns (indicated by letters of the alphabet) used in each of the segments to indicate the characters containing the concealed hexadecimals, as shown in table II.
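One possible in-memory layout for such a codebook is sketched below. All class and field names are hypothetical and do not reproduce the layout of table II; the values for E2 are the ones quoted in the text, except the letter assignment, which is a placeholder.

```python
from dataclasses import dataclass, field


@dataclass
class MatrixEntry:
    """Codebook details for one 16x16 matrix (field names are hypothetical)."""
    name: str                   # e.g. "E2" or "C1"
    black_segments: list[int]   # segments excluded from embedding
    segment_order: list[int]    # arrangement recorded for the rearrangement step
    segment_letters: dict[int, str] = field(default_factory=dict)


@dataclass
class PatternEntry:
    """One agreed pattern and the matrices it covers."""
    pattern: str
    matrices: dict[str, MatrixEntry] = field(default_factory=dict)


e2 = MatrixEntry(
    name="E2",
    black_segments=[1, 7, 10, 16],
    segment_order=[3, 4, 1, 2, 7, 8, 5, 6, 11, 12, 9, 10, 15, 16, 13, 14],
    segment_letters={2: "S"},  # placeholder: which letter goes with which segment
)
p1 = PatternEntry(pattern="C1E2C3C4E5E6", matrices={"E2": e2})
print(p1.matrices["E2"].black_segments)  # [1, 7, 10, 16]
```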

VI. IMPLEMENTATION
An example of the proposed AIS (AES Inspired Steganography) approach is presented in this section, in which the secret message "Steganography is the art and science of hiding information in plain sight.EAS is a symmetric encryption" is embedded in a hex symbols carrier file. To begin with, the characters of the message are converted into hexadecimal representation, so that each character of the message is represented by two hexadecimal digits. The two hexadecimal digits forming each character are then inverted as shown in fig. 6; for example, 73 becomes 37.

Fig. 6. Secret message hex symbols after inversion
The resulting inverted hexadecimal representation of the secret message's characters is embedded into embedding matrix 2 (E2) according to our choice of "S", which represents pattern-1 (Fig. 7). As explained before, the secret message is embedded after choosing the black segments and the embedding pattern of the secret message in each of the segments of the embedding matrix (shown in green).

Fig. 7. Illustration of the embedding matrix (E2) after the embedding of the secret message according to "S" (pattern-1). The embedded secret message is represented by bold green characters, while the black segments are shaded in light grey

Similarly, the cipher key matrices are prepared by choosing the black segments and rearranging the segments of the matrix (figure 8). Accordingly, the locations of the black segments in the embedding and cipher key matrices are superimposed, as explained earlier.
Fig. 8. Rearrangement of the cipher key matrix so that its black segments (shaded in light grey) are superimposed on those of the embedding matrix

Subsequently, the XOR operation is applied to the matrices as shown in figure 9 to produce the newly encrypted form of the embedding matrices (figure 10). The segments of the new embedding matrix are rearranged randomly (figure 11). Rearrangements are applied to the cipher key matrices as well. Initially, a random arrangement is applied (figure 13), followed by interchanging the locations of the rows and columns. Finally, the hexadecimal representation of each of the characters in the cipher key matrix is inverted (figure 14). With these final rearrangements, the carrier files are ready for safe exchange between the communicating parties. With the aid of the secret keys and codebooks shared by the communicating parties, the steps can be retraced to recover the secret message.
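Retracing the steps works because each operation has a simple inverse. The sketch below is a deliberately simplified round trip, assuming one byte per position instead of whole 4×4 segments, a made-up 16-byte key, and hypothetical function names; it leaves out the row and column exchange and the cipher key matrix transformations.

```python
def swap_nibbles(byte: int) -> int:
    """Swap the two hex digits of a byte, e.g. 0x73 -> 0x37."""
    return ((byte & 0x0F) << 4) | (byte >> 4)


def rearrange(items: list, order: list[int]) -> list:
    """Position i of the result holds item number order[i] (1-based)."""
    return [items[number - 1] for number in order]


def invert_order(order: list[int]) -> list[int]:
    """Return the 1-based order that undoes `order`."""
    inverse = [0] * len(order)
    for position, number in enumerate(order, start=1):
        inverse[number - 1] = position
    return inverse


ORDER = [3, 4, 1, 2, 7, 8, 5, 6, 11, 12, 9, 10, 15, 16, 13, 14]
message = b"Steganography is"    # 16 bytes, one per position
key = bytes(range(0xA0, 0xB0))   # hypothetical 16-byte cipher key

# Sender: invert the digits, XOR with the key, then rearrange.
masked = [swap_nibbles(m) ^ k for m, k in zip(message, key)]
sent = rearrange(masked, ORDER)

# Receiver: undo the rearrangement, XOR again, invert the digits again.
restored = rearrange(sent, invert_order(ORDER))
recovered = bytes(swap_nibbles(c ^ k) for c, k in zip(restored, key))
print(recovered)  # b'Steganography is'
```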

VII. ANALYSIS AND DISCUSSION
The frequency, degree of safety, robustness, key length, compatibility and capacity of the proposed AIS scheme were analyzed as follows.

A. Frequency
As the two hexadecimal digits of each hex symbol are inverted and processed using the XOR operation with the cipher key, the calculated frequency of occurrence before and after the application of these changes differs (fig. 15-a and 15-b). From the compiled frequency analysis in table III and fig. 16, clear differences were observed in the frequencies of the characters before and after the application of the changes. This observation indicates a high level of security against attacking third parties, which is expected to increase even further with the length of the secret message.
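A frequency comparison of this kind can be computed as in the sketch below. It assumes that "character" means an individual hex digit (0 to F); the processed values are made-up examples rather than figures from table III, and the function names are hypothetical.

```python
from collections import Counter


def hex_char_frequency(symbols: list[str]) -> Counter:
    """Count how often each hex digit occurs across a list of hex symbols."""
    return Counter("".join(symbols).upper())


def changed_characters(before: Counter, after: Counter) -> dict[str, tuple[int, int]]:
    """Return {digit: (count before, count after)} for digits whose count changed."""
    digits = sorted(set(before) | set(after))
    return {d: (before[d], after[d]) for d in digits if before[d] != after[d]}


original = ["53", "74", "65", "67"]   # hex codes of "Steg"
processed = ["90", "7B", "A6", "79"]  # hypothetical values after inversion and XOR

f_before = hex_char_frequency(original)
f_after = hex_char_frequency(processed)
print(changed_characters(f_before, f_after))
```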

B. Using WinHex
The use of WinHex to formulate the hex symbols during the hiding process is advantageous, as the content is difficult to trace and compare with previous versions. This advantage becomes more critical as frequent rearrangements of the hex symbols are applied throughout the steganography procedure.
A comparison was conducted between the hex symbols in the carrier file (viewed using WinHex) before and after the encryption of the secret message. As shown in fig. 17, a complete change occurred in the hex symbols content of the carrier file, making it impossible for a third party to detect any traces of the message. The carrier file content was additionally compared without being viewed using WinHex (Figure 18). The comparison showed that the integrity of the file is maintained before and after the encryption process. That is, the encryption does not invalidate the file, as seen in other approaches [8][9]. Moreover, the alteration of the hex symbols by the inversion of each character element doesn't increase the original file size, leaving the number of elements unchanged. Furthermore, the use of random numbers to select the segments provides an extra complication against deciphering the hidden text. A comparison between available steganalysis tools and hex symbols is presented in Table IV.

C. The safety and security against encryption traces
The proposed approach does not require the use of any external encryption and data hiding tools, but rather utilizes tools embedded in commonly used software such as Microsoft Excel. Therefore, traces of dedicated steganography and encryption tools would be hard to detect on the communicating parties' personal devices. Eliminating such a potent source of traces reduces the risk of alarming the attacking parties. Moreover, the hex symbol file extension can be changed to mislead hackers and investigators.

D. Compression
When tested under compression options such as the WinRAR and ZIP file formats, the carrier file was found to resist changes in size and content. This resistance indicates the robustness of the proposed approach against modifications that could be applied to the file, while steadily maintaining the file's integrity and content safety.
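A check of this kind can be approximated with Python's standard `zlib` module, which implements DEFLATE, the method commonly used inside ZIP archives. This is a stand-in for the WinRAR and ZIP tests, not a reproduction of them, and the random carrier below is a placeholder for a real hex symbols file.

```python
import os
import zlib

carrier = os.urandom(1536)             # placeholder carrier of random bytes
compressed = zlib.compress(carrier, 9)  # maximum compression level
restored = zlib.decompress(compressed)

# Lossless compression must return the carrier byte for byte.
print(restored == carrier)
# Random-looking content leaves little redundancy to remove, so the
# compressed size is typically close to the original size.
print(len(carrier), len(compressed))
```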

E. Key length
As shown in table V, in our example, the size of the hex symbols was 1536 bytes and the size of the secret message used was 336 bytes, while the size of the cipher key was equally 336 bytes, as suggested earlier with regard to the pattern size required for embedding the secret message. Any increase in the required pattern size for embedding the secret message results in a simultaneous and equal increase in the size of the secret message and cipher key. Therefore, our approach has been able to include all the methods of ciphering texts: transposition (permutation), substitution and the one-time pad. Achieving the one-time pad is a very significant strength of our new approach, as it was developed to have an equal size for both the cipher key and the secret message regardless of the size of the secret message. Furthermore, the longer the cipher key and the message are, the harder it is for intruding parties to identify the cipher key. Moreover, the use of the hex symbols carrier file and the hexadecimal representation of its content allows for a higher embedding capacity in comparison with the use of binary representation.
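The carrier size quoted above is consistent with pattern P1 under two assumptions that are not stated explicitly in the text: the carrier consists of exactly the six matrices named in P1, and each hex symbol (two hex digits) occupies one byte. The sketch below checks that arithmetic, along with the length condition of a one-time pad; the condition on key randomness and single use is outside its scope.

```python
MATRIX_SIDE = 16        # each matrix is 16x16 hex symbols
BYTES_PER_SYMBOL = 1    # assumption: one two-digit hex symbol per byte

pattern_p1 = ["C1", "E2", "C3", "C4", "E5", "E6"]
carrier_size = len(pattern_p1) * MATRIX_SIDE * MATRIX_SIDE * BYTES_PER_SYMBOL
print(carrier_size)  # 1536


def key_covers_message(message_len: int, key_len: int) -> bool:
    """Length condition of a one-time pad: the key is at least as long as the message."""
    return key_len >= message_len


print(key_covers_message(336, 336))  # True
```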

F. Compatibility & Capacity
The use of hex symbols was very compatible with the use of the AES concept, as both approaches are based on the hexadecimal representation of the content or target text. Such compatibility allows for coherence and flexibility as well as high storage capacity due to the use of the hexadecimal representation, in comparison with methods utilizing binary representations. The hexadecimal representation also has the advantage of higher robustness in comparison to the binary representation, as writing and modifying the hexadecimal representation is relatively easier. Furthermore, basing the approach on the hexadecimal representation reduced the length of the code needed in comparison to the binary representation.
In this proposed approach we have added several steps to increase the complexity of the hidden message. First, the approach chooses certain segments and hexadecimal characters to embed in while leaving some without embedded characters. Therefore, upon the continuous rearrangement of the matrices, the location of the secret message characters is hard to identify. Second, we added the idea of using random black segments to increase the complexity of the encryption. Finally, we inverted the hexadecimal representations of the characters on several occasions. The combination of these modifications with the multiple steps of encryption and the steganography process resulted in a very complex system that can only be deciphered through the use of the secret key book.

VIII. CONCLUSION & FUTURE WORK
The AES Inspired Steganography (AIS) approach that we propose in this paper represents a modified, improved version of both the AES and the steganography approaches, as it overcomes the weaknesses of each technique through the strength of the other. The approach utilizes the multi-step encryption idea of AES in combination with the safe data hiding concept of steganography to conceal secret messages in hex symbols carrier files. This approach has been shown to have advantages over the currently existing steganography approaches in terms of capacity, safety and robustness. In the future, this approach can be developed further to increase its complexity. The length of the secret message as well as the cipher key could also be further modified and increased to increase the capacity of the approach. Furthermore, the approach could be developed to be incorporated into other applications and techniques.

Fig. 3. The division and pattern specification of the embedding and cipher key matrices in the carrier file

Fig. 4. A demonstration of a chosen pattern (pattern 1_E2) applied to an embedding matrix of the carrier file. The chosen black segments are indicated with numbers only, while the other segments are indicated by a combination of numbers and letters

Fig. 5. Illustration of the rearrangement of the cipher key matrix so that its black segments are superimposed on those of the embedding matrix

Fig. 9. The application of the XOR operation to the characters of the embedding matrix that contain the secret message and their opposing characters in the cipher key matrix

Fig. 11. The resulting embedding matrix from the XOR operation

Fig. 12. Random rearrangement of the embedding matrix

Finally, one more rearrangement is applied to the embedding matrix by interchanging the positions of its rows and columns, as shown in Fig. 12. This is achieved by flipping all elements of the segment around the diagonal, as shown in equation 1.

Fig. 13. The final form of the embedding segment after the exchange of the locations of the rows and columns

Fig. 14. Random rearrangement of the cipher key matrix

Fig. 15. The inversion of the hexadecimal representations of the cipher key matrix characters

Fig. 16. Character frequency assessment of the embedded message: (a) before and (b) after the inversion of the characters and application of the XOR operation with the superimposing characters of the cipher key matrix

Fig. 17. Character frequency assessment of the embedded message: (a) before (F.B) and (b) after (F.A) the inversion of the characters and application of the XOR operation with the superimposing characters of the cipher key matrix

Fig. 18. Comparison between the hex symbols before and after the encryption of the secret message

Fig. 19. Hex symbols content of the carrier file (a) before and (b) after the inversion of the characters and application of the XOR operation with the superimposing characters of the cipher key matrix

TABLE II. EXAMPLE OF THE SHARED KEY HEX SYMBOL CODEBOOK

TABLE III. CHARACTER FREQUENCY BEFORE (F.B) AND AFTER (F.A) INVERSION AND APPLICATION OF THE XOR WITH THEIR COUPLED CIPHER KEY SEGMENTS

TABLE IV. COMPARISON BETWEEN STEGANALYSIS AND HEX SYMBOLS

TABLE V. LENGTHS OF THE EMBEDDING AND CIPHER KEY SEGMENTS, THE HEX SYMBOLS IN THE CARRIER FILE, THE SECRET MESSAGE AND THE CIPHER KEY