HORAM: Hybrid Oblivious Random Access Memory Scheme for Secure Path Hiding in Distributed Environment

Digitization has now taken place in most sectors so that data can be stored and processed easily with enhanced techniques. Online transactions produce huge volumes of data daily in sectors such as health care, the military, and government offices. To store this data, many firms rely on third-party organizations and keep their data on machines provided by them, which creates new security issues. While operations are performed on the data or the data is accessed, metadata may leak because the systems are untrustworthy. This paper proposes a hybrid oblivious random-access memory (HORAM) that lets users access their data on untrusted storage devices without revealing any information about their access patterns or techniques. A random data block shuffling approach is used, which hides the storage policies governing the placement of user data blocks and preserves the privacy of the data. HORAM performs pull-push operations on data in parallel, which minimizes network overhead and reduces the execution time of operations. An extensive experimental analysis shows that the proposed system produces better results than both weak and strong Federated oblivious random access memory (FEDORAM). The method is faster than weak FEDORAM and strong FEDORAM: it takes 0.96 seconds for communication with 5 servers, whereas weak and strong FEDORAM take 1.5 and 2 seconds respectively for reading and writing data.
Keywords—HORAM; metadata; data blocks; privacy; block shuffling


I. INTRODUCTION
Big data analytics is becoming much easier with evolving techniques in big data management and cloud storage. Nonetheless, placing information on untrusted servers raises security concerns. Privacy is now required in every sector as everything becomes digital [1]. Systems are being moved from offline to online form so that they can be accessed from anywhere, and many people now prefer online systems to offline ones. During the coronavirus pandemic, most systems, such as government offices, educational systems, healthcare systems, and private-sector services, moved online to make them easier for people to use. Online platforms, however, give security threats an easy entrance into database systems, where information can be obtained easily [1] [4]. If the information is extremely sensitive, its theft by a third party is very harmful and may be costly for the owner.
Consider, for example, a health care system with a database of patient records, in which each record holds one patient's information and the columns reflect various characteristics. By executing queries on a particular attribute, the health care system can obtain details of a patient. For example, a point query returns results for people aged 30, while a range query returns data for those aged between 19 and 30. Outsourcing such information to an external party is a common practice that allows for efficient querying [2]. Encryption is employed while executing queries because such a party cannot always be trusted with critical information. To query the outsourced database effectively, the system creates a key and encrypts it together with a data encryption structure, balancing system security and reliability. Existing techniques such as encryption protect data, but they are not very efficient as data grows exponentially. Existing cryptographic techniques can secure the data itself, but not the metadata about the data, which is created daily by the various web activities hosted on web servers in many organizations [2] [4]. Encryption alone cannot secure a dataset completely.
Applying an encryption algorithm to user messages may provide information privacy; however, it is not adequate to address metadata concerns. Specifically, if information about the access pattern is disclosed, it can compromise the whole record. By guessing the access pattern, an attacker can gain direct access to private data. The cloud provider as well as other attackers may capture a leaked access pattern and misuse it [3]. Every day, internet thieves find new tricks to access data for financial advantage. Thus, there is a need for a dynamic strategy to deal with these security risks to sensitive information [4]. Oblivious Random Access Machine (ORAM) is a strategy that allows a customer to retrieve encrypted data from cloud servers secretly, without disclosing the path of data retrieval. In ORAM, the physical location accessed differs from the data the user actually accesses. Many researchers have contributed toward this goal and have updated the basic ORAM model to improve its performance. Because ORAM is limited by its complexity, researchers have tried to reduce that complexity to make it more practical [5]. Path ORAM techniques, initially proposed by Stefanov et al. [5], have been used for security in recent years. Path ORAM stores the data blocks in a binary tree structure whose nodes, including the leaf nodes, are buckets. Every bucket of the tree holds a constant number of blocks, denoted z. When the tree is initialized, the leaf buckets are labeled 0 to N-1, while each block has a random tag or position from the range 0 to N [5]. Moreover, there is a single small stash region that holds several blocks temporarily. If a block has tag p, it resides either in the stash or in some bucket along the path from the root of the tree to the p-th leaf node, according to the tree's invariant.
*Corresponding Author. www.ijacsa.thesai.org
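As a rough illustration of the Path ORAM invariant described above, the following Python sketch lists the buckets on the path from the root to the p-th leaf. It assumes the tree is stored in the usual heap layout (root at index 0, children of node i at 2i+1 and 2i+2); the names `path_to_leaf` and `may_hold` and the heap layout are assumptions made for illustration and are not taken from [5].

```python
from typing import List


def path_to_leaf(leaf: int, height: int) -> List[int]:
    """Return bucket indices from the root down to the given leaf.

    Assumes a complete binary tree of the given height (root at level 0,
    leaves at level `height`) stored in heap order, with leaves numbered
    0 .. 2**height - 1 from left to right.
    """
    num_leaves = 1 << height
    if not 0 <= leaf < num_leaves:
        raise ValueError("leaf out of range")
    # Internal nodes occupy indices 0 .. num_leaves - 2, so leaves start here.
    node = (num_leaves - 1) + leaf
    path = [node]
    while node > 0:
        node = (node - 1) // 2  # move up to the parent bucket
        path.append(node)
    path.reverse()
    return path


def may_hold(block_tag: int, bucket: int, height: int) -> bool:
    """A block tagged p may only sit in a bucket on the path to leaf p
    (or in the stash, which is not modeled here)."""
    return bucket in path_to_leaf(block_tag, height)
```

For a tree of height 2, for instance, `path_to_leaf(3, 2)` walks from the root through the right child to the rightmost leaf, and a block tagged 3 may not be placed in the left child of the root.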
An Oblivious RAM [6], abbreviated as ORAM, is a primitive that obscures a device's access pattern to a repository such as DRAM. As a result, an attacker can learn very little about the data transmitted by watching main memory access trends. The ORAM interface converts the user's program access sequence into a series of ORAM accesses to seemingly random addresses. Because the adversary can observe the physical locations being accessed, the ORAM construction guarantees that the physical access sequence is independent of the logical access sequence, ensuring that the user's access sequences are not disclosed. Moreover, the information stored inside the database is secured using probabilistic encryption to hide the stored data. Security problems arise at the hardware, software, and application levels. Several recent studies have made use of the growing availability of trusted hardware for database systems.
Bajaj et al. [7] introduced TrustedDB, which implements tamper-proof data aggregation using the IBM 4758 PCI [8]. CryptSQLite encases the SQLite engine in an Intel SGX enclave to provide secrecy at a small efficiency cost [9]. ObliDB, a more recent study, improves point query speed to 722x faster than current encrypted oblivious databases [10]. Access pattern threats in untrustworthy storage are identified by StealthDB and EnclaveDB, which offer cryptographic solutions based on protected hardware [11] [12]. They are distinct from ProDB with regard to the security boundary, access pattern protection, and the tight coupling of hardware enclave, ORAM, and disk storage. Hardware enclaves are also used to solve database problems or to build data structures for particular uses [13] [14].
ZeroTrace uses a new library in its proposed ORAM controller to offer extra protection against software attacks launched on the SGX enclave, i.e., oblivious position map access [15]. Even with processor enclaves, Oblix and ObliDB recognize that the access pattern of the database table used in index searches can leak, and they provide more effective performance than the naive worst-case buffer [16] [17]. Pro-ORAM increases system performance by utilizing the number of co Shuffle with SGX enclaves. Even though many researchers have worked on providing security for metadata, it remains very difficult to close the gap between security and the practical usability of a system [18].
Although many ORAM techniques exist, as mentioned above, to secure metadata in online transactions, they have the following limitations.
• Traditional ORAM techniques suffer from high complexity in model construction.
• Many existing techniques provide obliviousness for metadata but fail to improve performance as data grows.
• As the number of users increases, response time increases.
• Existing systems are unable to maintain a balance between security and performance.
Considering previous work, the main motivation of this paper is to find a solution that both maintains privacy and improves the performance of existing ORAM techniques.
Our Contribution: We design our system to achieve three main goals: 1) to reduce complexity and response time; 2) to secure metadata against active adversary attacks; 3) to improve the performance of the overall system as the number of users increases.
The proposed system focuses on providing security for metadata in online transactions. Because the earlier ORAM technique strong FEDORAM [28] suffers from high response time for communication between client and server, our proposed system reduces the execution time of pull-push operations by making the system work in parallel. Parallelizing tasks optimizes the ORAM system, which in turn reduces response time and improves the performance of the overall system. Because weak FEDORAM suffers from leakage of sensitive data under active adversary attacks, our proposed system focuses on protecting data from attacks such as collusion, session hijacking, authentication bypass, sinkhole, and wormhole attacks by designing an XOR-based lightweight cryptographic technique for data encryption and decryption during communication.
The remaining sections of the paper are organized as follows: Section II describes related work by previous researchers. Section III describes the algorithm for the proposed implementation. Section IV describes the experimental setup for evaluating the proposed work, the results achieved with our methodology, and a comparative analysis with various state-of-the-art methods. Section V concludes the proposed work and provides guidelines for future work.

II. RELATED WORK
Yanyu Huang et al. [19] proposed real-time oblivious data exchange in fog computing. This approach eliminates the complex execution process on the client side and requires low communication cost and minimal response time; it reduces computation by up to 2x compared with state-of-the-art methods. An edge computing environment is deployed, and all transactions are performed on the edge node. An extensive experimental analysis shows that the system achieves low network bandwidth utilization, fixed data storage on the client machine, and minimal network overhead. A newer Path ORAM variant, called R-Path ORAM, uses a large root bucket while keeping the remaining buckets in the tree at a small constant size [20]. A thorough examination of the root bucket capacity is carried out in order to arrive at a bounded solution for the necessary root bucket size with a minimal error probability. Using a common platform, the effectiveness of R-Path ORAM is compared with that of conventional Path ORAM. The test results indicate that R-Path ORAM requires much less server bandwidth and time than the original Path ORAM. It also includes a hidden eviction method for reducing the size of the bottom bucket and preventing system failure.
Cao et al. [21] proposed String ORAM, an approach that improves ORAM access using temporal and spatial optimization techniques. It first identifies dummy data blocks that significantly waste storage space and then defines an optimized ORAM scheme with lower computation time and more effective scheduling. This approach reduces execution time by 30% and memory utilization during execution by 40%. A similar approach of fast and secure ring data retrieval techniques has been proposed by Yeuzhi che et al. [22]. According to Fletcher et al. [23], secure processors have a quality and speed inefficiency of more than 50%. Fletcher et al. [23] suggested a dynamic scheme in which a limited amount of leakage is allowed.
The first Path ORAM implementation on hardware, Phantom, was presented by Maas et al. [24]. Parallel computing techniques are used in the implementation during read and write execution. In Phantom, two methods are employed to improve Path ORAM's effectiveness. The first is treetop caching, which keeps the top levels of the tree in the cache so that only the lower layers are touched while reading from and writing to the ORAM, decreasing latency and complexity. The second is min-heap eviction, which stores the stash as a min-heap and evicts first the blocks that have been used least in the past.
In addition, Fork Path ORAM was introduced by Zhang et al. [25]. Fork Path ORAM combines two successive ORAM accesses. The researchers observed that two consecutive requests can have overlapping buckets in their paths. They therefore recommended that, when a request is made and the entire path of the requested data block has been received from the server and put into the stash, the write-back of the whole path be postponed until the subsequent request arrives. The buckets where the two paths intersect are then not written back; only the non-overlapping buckets on the first request's path are written back. Furthermore, only the buckets on the second path that do not overlap with the first request's path are read into the stash to execute the new request. The procedure is then repeated with the second and third requests, and so on. The researchers also recommended postponing any outstanding ORAM requests. Although all of their studies were done in the secure processor setting, Sanchez [26] found that the advantage of combining requests is negligible. Sanchez demonstrated that combining two requests may save one bucket. Fletcher et al. [27] propose an optimization that uses one large group counter and many small individual counters per Position Map block to compress these counters to a manageable size.
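The path-merging idea described above can be sketched as follows. This is a minimal illustration, assuming a heap-ordered complete binary tree (root at index 0); the function names `path_to_leaf` and `fork_path_split` are hypothetical and do not come from [25]. Given two consecutive requests, the sketch separates the buckets shared by both paths from those that must be written back for the first request and those that must be newly read for the second.

```python
from typing import List, Tuple


def path_to_leaf(leaf: int, height: int) -> List[int]:
    """Bucket indices from root to leaf in a heap-ordered complete tree."""
    node = ((1 << height) - 1) + leaf
    path = [node]
    while node > 0:
        node = (node - 1) // 2
        path.append(node)
    return path[::-1]


def fork_path_split(first_leaf: int, second_leaf: int,
                    height: int) -> Tuple[List[int], List[int], List[int]]:
    """Split two consecutive paths into (shared, write_back, read_new).

    shared     : buckets on both paths; kept in the stash, not rewritten yet
    write_back : buckets only on the first path; written back now
    read_new   : buckets only on the second path; the only ones fetched
    """
    first = path_to_leaf(first_leaf, height)
    second = path_to_leaf(second_leaf, height)
    second_set = set(second)
    shared = [b for b in first if b in second_set]
    shared_set = set(shared)
    write_back = [b for b in first if b not in shared_set]
    read_new = [b for b in second if b not in shared_set]
    return shared, write_back, read_new
```

With height 2, two requests for neighboring leaves 0 and 1 share the root and its left child, so only one bucket is written back and one new bucket is read.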
Pujol et al. [29] presented FEDORAM. Weak and strong FEDORAM trade off security against performance in instant messaging. Weak FEDORAM focuses on the performance of the system, while strong FEDORAM focuses on its security. Weak FEDORAM suffers from leakage of sensitive data, while strong FEDORAM suffers from response time that grows with the number of users.
Apart from the access cost imposed by ORAM procedures, contemporary ORAM architectures ignore the extensive memory and processing hierarchy of current computer systems. According to [28], if the data size is larger than the actual memory capacity, the leaf levels of the tree extend directly into backend storage, as illustrated in Fig. 1(a). Tree-top caching has a simple design, and most layers remain in the high-speed memory area. On the other hand, each path access is converted into a series of fast memory accesses and slow I/O accesses. Because of the generally poor locality, alternative caching methods are difficult to adapt to the tree-type structure. Such a design is wasteful in terms of I/O frequency cost because of the design difference between storage and I/O access, as well as the disparity between storage and I/O use.
In FEDORAM and Multi-User Oblivious Storage via Secure Enclaves (MOSE), both techniques emphasize reducing input-output overhead because they extract a single data block from backend storage during a transaction, as demonstrated in Fig. 1(b) and Fig. 1(c). Furthermore, the flat memory structure enables effective top-layer buffering. On the other hand, the shuffling procedure must be performed often, and the whole storage must wait for the shuffle to finish before proceeding to another ORAM operation. This adds extra waiting time to the process, resulting in delayed output [29] [30].

III. PROPOSED SYSTEM DESIGN
A. System Architecture
The proposed system uses a client-server architecture. Fig. 2 shows the architecture of the proposed HORAM. Initially, if a client wants to send a message to a client from the server destination list, the client generates the message according to Equation (1).
Equation (1) includes the random dummy leaf of the destination node list. The client establishes a connection with the entry server, and the entry server establishes a connection to the root server to forward the message. The root server establishes parallel connections with the candidate servers in the distributed environment. All candidate servers perform the decrypt operation on what they receive, and the server for which the result matches its server id is the destination server selected by the client. At the same time, the data is stored in an internal tree structure by that particular server. The root server S_R stores positionMap[id] and OTMap[id], where positionMap[id] describes each leaf node's information and OTMap[id] gives the message identifier of the encrypted text.
Four entities are central to the system architecture.

1) Client 2) Entry Server 3) Root Server 4) Destination Server
In this architecture, a direct connection between the client and the destination server is avoided. Instead, two entities are added between the client and the destination server for secure communication.
In the data storage algorithm, the client first generates a message for the entry server. The entry server then establishes a connection with the root server, which keeps the virtual ids of all real and dummy messages passed from the entry server to the destination server. The encrypted message is then sent in parallel to all servers in the federation to get a reply from the actual destination server. The server that has the authority to decrypt the message decrypts it with its keys and replies to the root server. Sending the message in parallel rather than sequentially saves communication time. After that, the root server records the current transaction id, user id, and server id in its position map for further reference.
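The parallel dispatch step of the data storage algorithm can be sketched as below. This is only an illustrative outline under stated assumptions: the decryption check is replaced by a simple token comparison (`try_decrypt` is a hypothetical stand-in, not the paper's cryptographic routine), servers are modeled as in-process entries in a dictionary, and the position map is a plain dictionary keyed by transaction id.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Optional


def try_decrypt(server_id: str, server_token: str,
                envelope: Dict[str, str]) -> Optional[str]:
    """Stand-in for the real decryption check (assumption): only the server
    whose token matches the envelope can 'open' it and reply."""
    if envelope["token"] == server_token:
        return server_id
    return None


def push(envelope: Dict[str, str], server_tokens: Dict[str, str],
         position_map: Dict[str, Dict[str, str]],
         txn_id: str, user_id: str) -> str:
    """Send the envelope to every server in parallel and record the one
    that replies in the root server's position map."""
    with ThreadPoolExecutor(max_workers=len(server_tokens)) as pool:
        futures = [pool.submit(try_decrypt, sid, tok, envelope)
                   for sid, tok in server_tokens.items()]
        replies = [f.result() for f in futures]
    owners = [r for r in replies if r is not None]
    if len(owners) != 1:
        raise RuntimeError("expected exactly one server to decrypt the message")
    # Root server bookkeeping: transaction id -> (user id, server id).
    position_map[txn_id] = {"user_id": user_id, "server_id": owners[0]}
    return owners[0]
```

Because every server is contacted for each push, an observer of the network cannot tell from the traffic alone which server is the actual destination; the thread pool simply models the parallel connections.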
In the data access algorithm, steps 1 to 3 establish the connection from the user to the root server through the entry server. After the connection to the root server is established, the current server id for the transaction is fetched and the data is extracted from that specific server. Then, to secure the metadata and hide the path of the current access to the destination server, the current destination id is replaced with a new destination server id. The entry for the current server id is deleted, and the root server is updated with the new server id. In this way, metadata-privacy-preserving access is performed with the HORAM data access algorithm, and the parallelism in the architecture reduces the overall response time of the system. The data access algorithm describes the pull activity performed by the client. Once data has been extracted, it is decrypted outside of the ORAM, and the eviction function is performed to update the repository of access patterns.
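The relocation step of the data access algorithm can be illustrated with a short sketch. The data layout here is an assumption made for the example: `position_map` maps a transaction id to its user id and server id, `stores` maps each server id to the blocks it holds, and the new destination is drawn at random from the servers other than the current one. Re-encryption for the new server is omitted.

```python
import random
from typing import Dict, List, Optional


def pull_and_relocate(txn_id: str,
                      position_map: Dict[str, Dict[str, str]],
                      stores: Dict[str, Dict[str, bytes]],
                      servers: List[str],
                      rng: Optional[random.Random] = None) -> bytes:
    """Fetch a block, erase its old record, and re-store it elsewhere."""
    rng = rng or random.Random()
    entry = position_map.pop(txn_id)       # erase the current record
    old_server = entry["server_id"]
    data = stores[old_server].pop(txn_id)  # extract from the current server
    # Assumption: choose a different server when one is available.
    candidates = [s for s in servers if s != old_server]
    new_server = rng.choice(candidates) if candidates else old_server
    stores[new_server][txn_id] = data      # a real system would re-encrypt
    position_map[txn_id] = {"user_id": entry["user_id"],
                            "server_id": new_server}
    return data
```

After each pull, the position map points at a fresh server, so repeated requests for the same block do not repeatedly touch the same storage location.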

B. Algorithm Design
Some basic data structures are used in the implementation of the proposed system. The basic ORAM tree is a binary tree of encrypted text, and every node of the tree contains a certain number of data blocks. The following section describes the data structures used by the proposed hybrid ORAM during execution.
Data storage algorithm:
4: Once the server connection is established, only the one server that can decrypt the data returns it. The data is forwarded to the push function in the form Push (M, S(id), ).
5: The root server then generates a transaction id for further transactions and adds an entry with the encrypted data to the positional map as below.

Data access algorithm:
1: Client sends message to Entry server as below:
2: Entry server requests the allocated server from the root server with get_Server_info ( ) and detects the server allocated to the specific user.
3: Establish connection with ( ) from entry server and extract data from
4: Now, the entry server selects a server from the set of servers.
5: Once server selection is done, forward the data to the alter function, given as Alter(M, ( ), ).
6: Entry server generates a transaction id for further transactions and updates the positional map on the root server as in step 7 and step 8.
Table I lists the symbols used in the algorithms of the proposed HORAM technique, and Table II shows the position map entries for a particular transaction along with the user id and server id.

A. Environmental Setup
The proposed implementation runs in an open-source Java environment with 10 data servers computing in parallel for HORAM. In the configuration, all servers are homogeneous and there is a single client. Among the servers there is a single entry server and a single root server, and one destination server among the remaining servers.
B. HORAM Performance
1) Response time: Fig. 6 depicts the average response time of the system when 100 messages are sent over the network. HORAM performs better than the existing strong FEDORAM, taking less time than strong FEDORAM as the number of users increases, and its response time is close to that of weak FEDORAM. For 300 users, the average response time is 5 seconds for strong FEDORAM, 2.6 seconds for weak FEDORAM, and 3.2 seconds for HORAM. The experiment also shows that our technique takes less response time than both as the number of servers increases: weak FEDORAM takes 1.4 seconds, strong FEDORAM takes 2 seconds, and HORAM takes 0.9 seconds.
2) Complexity: Table III summarizes our contributions and compares our system with some state-of-the-art ORAM structures. Here N denotes the total number of messages stored in the whole oblivious system. Because our HORAM's client-to-server connection is based on the RAM, they have similar client-to-server bandwidth and device storage complexity. The federation's communication channels and server computation are both linear.

C. Security Analysis
The following points describe how the proposed approach achieves higher security and eliminates the metadata leakage problem during communication.
• Data Generation: The client generates a random message and encrypts it with the proposed XOR operation technique using the receiver's token id. The encryption works like a one-way hash function, because neither the encryption key nor the decryption key is present during message generation and transmission. The encrypted data can be transferred to the entry server and the root server respectively. Moreover, even if the entry server or the root server is compromised by an attacker, the attacker cannot extract the decrypted text, because decryption depends on the receiver's token.
• Data Forwarding: The entry server and the root server can forward data to the next hops or servers. Initially the entry server receives M; it knows the client as well as the root server. The entry server forwards the same data to the root server, which generates positionMap and OTMap. positionMap[id] describes each leaf node's information, while OTMap[id] gives the message identifier of the encrypted text; this information is stored on the root server. The root server securely keeps both records in ciphertext format, which eliminates the possibility of internal or external attacks. The ciphertext works like a one-way hash function, which requires negligible cost to operate and does not require significant dependencies for encryption and decryption. Moreover, even in the worst case, where the root server is compromised by an attacker, the attacker cannot extract the actual plain text, owing to this lightweight cryptographic policy.
• Data Extraction: When a client wants to extract data, it sends a request to the entry server, which forwards it to the root server. The message's server information is retrieved from positionMap, a corresponding request is forwarded from the root server to S_D, and the plain text is downloaded. Once the user has extracted the data, the proposed algorithm provides additional security for the stored information. It first erases the current record from positionMap and selects a random server from the available server set. While the user extracts data from the destination server, the decrypted plain text is held in cache memory. The plain text is then encrypted again by the cryptographic function for the newly selected server, and a new entry is generated in positionMap. Once the new transaction is successfully committed, the previous entry of the duplicate data is erased.
In the proposed architecture, the last transaction's entry in the positional map on the root server changes. This happens every time the client generates a repeated access request. The stash memory is released automatically when the time complexity reaches 2N for N data blocks. This functionality eliminates dummy blocks and reduces both time and space complexity. The algorithm automatically erases the previous entry of a particular transaction, with its location details, from the position map when the user performs a data pull operation, and it generates and stores a new entry in the position map. The significant advantage of this functionality is that a traitor can never identify the source of the extracted data or the location of that source from background knowledge.
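To make the token-dependent XOR encryption from the Data Generation step more concrete, here is a minimal sketch. The paper does not spell out the construction, so everything below is an assumption for illustration: the keystream is derived by hashing the receiver's token together with a counter using SHA-256, and the function names `keystream` and `xor_cipher` are hypothetical. It is not a vetted cipher; in particular, reusing one token for several messages reuses the keystream.

```python
import hashlib


def keystream(token: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the receiver's token
    (assumed construction: SHA-256 over token || counter)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        block = hashlib.sha256(token + counter.to_bytes(8, "big")).digest()
        out.extend(block)
        counter += 1
    return bytes(out[:length])


def xor_cipher(data: bytes, token: bytes) -> bytes:
    """XOR the data with the token-derived keystream.

    The same call encrypts and decrypts, because XOR is its own inverse.
    """
    ks = keystream(token, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))
```

Only a party holding the receiver's token can regenerate the keystream, which mirrors the argument above that a compromised entry or root server cannot recover the plain text on its own.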
V. RESULTS
Fig. 3 shows the time required, in seconds, for data encryption and decryption. Based on this experiment, decryption can take more time than encryption.
Two-way encryption techniques are also used to secure data during transmission and to allow dynamic decryption at the selected destination server. The table below shows the complexity of the proposed and existing systems.
Fig. 4 shows the data uploading and downloading time required between client and server in the proposed HORAM. The time required under the proposed configuration may vary when the operating environment changes. Fig. 5 shows network utilization in communication against the number of servers.
The performance evaluation of the proposed system is based on the communication cost required for data push and pull events. When a data push event is generated, all n receiver data nodes are used for communication. Furthermore, the network capacity is limited to 10 kb of data in a single message M. The message size can change when the client updates the information or generates a new message. Fig. 5 shows network utilization in MB during data transmission.
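A back-of-the-envelope calculation may help in reading the push cost described above. The sketch assumes that every one of the n servers receives one full copy of a message of the stated size, and that per-server latencies are known; both the function names and the latency values in the usage note are hypothetical and are not measurements from this paper.

```python
from typing import Sequence


def push_traffic(num_servers: int, message_size: float = 10.0) -> float:
    """Total data sent for one push, in the same unit as message_size,
    assuming each of the n servers receives one copy of the message."""
    return num_servers * message_size


def dispatch_time(latencies: Sequence[float], parallel: bool) -> float:
    """Time to reach all servers: the slowest server when sending in
    parallel, the sum of all latencies when sending one after another."""
    return max(latencies) if parallel else sum(latencies)
```

For example, with 5 servers a single push carries 5 copies of the message, while the parallel dispatch time is bounded by the slowest server rather than by the sum over all servers.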
In another experiment, we evaluated the communication cost required for data push and pop event from to .
FEDORAM describes one-to-one communication between all servers, which may produce high communication costs [29]. The proposed module creates parallel connections between the root server and all available servers in the set S. Fig. 6 and Fig. 7 depict the response time against the number of servers and the number of users, and show how the proposed approach reduces the computation cost compared with state-of-the-art methods. The proposed approach has been evaluated for communication cost against both the number of users and the number of servers; in both experiments, our system is more efficient than [29].

VI. CONCLUSION
The proposed HORAM, an innovative ORAM approach, achieves a high level of data privacy and low computation time in a distributed environment with untrusted memory. The proposed parallel, distributed HORAM provides low computation cost for database transactions such as push and pull. Experimental analysis shows that HORAM gives better results in terms of computation time. The method is faster than weak FEDORAM and strong FEDORAM: it takes 0.96 seconds for communication with 5 servers, whereas weak and strong FEDORAM take 1.5 and 2 seconds respectively for reading and writing operations. It improves security in comparison with weak FEDORAM by avoiding direct contact between the user and the destination server, and it provides more privacy for metadata through data shuffling and an XOR-based lightweight cryptographic technique. Extending this system to large data processing environments while maintaining the security and privacy of data will be addressed in future work, with emphasis on reducing the complexity of encrypting and decrypting extensive data.