A Novel Approach to Network Forensic Analysis: Combining Packet Capture Data and Social Network Analysis

Abstract: Log data from computers is often ineffective for network forensic analysis because it makes communication patterns between nodes difficult to reconstruct and cannot reveal more advanced security threats. Combining traditional log data analysis with complementary approaches gives a more comprehensive view of communication patterns, which in turn helps identify potential security threats more effectively. However, the specific benefits of combining Packet Capture (PCAP) analysis and Social Network Analysis (SNA) in forensics have been difficult to determine. This article proposes a new approach to forensic analysis that combines PCAP and SNA to overcome some of the limitations of traditional methods. The purpose of this study is to improve the accuracy of network forensic analysis by providing a more comprehensive view of network communication patterns. Network forensics that combines PCAP analysis and SNA provides more comprehensive results. PCAP analysis is used to analyze network traffic, conversation statistics, protocol distribution, packet content, and round-trip times. SNA maps communication patterns between nodes and identifies the most influential key players within the network. PCAP analysis efficiently captures and analyzes network packets, and SNA provides insight into relationships and communication patterns between devices on the network.

The goal of this study is to combine PCAP analysis and social network analysis to improve the accuracy of network forensic analysis by providing a more comprehensive view of network communication patterns. Combining the two gives deeper knowledge of network activity and communication patterns, which investigators can use to find unusual or suspicious activity on the network, such as covert or encrypted communication.

A. Research Related to Packet Capture (PCAP) Analysis
Sikos [1] made a comprehensive comparison of packet analysis tools for PCAP data, including Carnivore, Snort, WinDump, Wireshark, dsniff, tcpdump, Omnipeek, and SolarWinds. The method used is AI-based deep learning inspection combined with semi-supervised machine learning. The research aims to compare the pattern recognition ability of different packet analyzer applications in order to find the most suitable tool for network forensic analysis. The results obtained with Deep Packet Inspection (DPI), a packet analysis technique with machine learning capabilities, are reported as valid [8][9][10].
Cappers et al. [5] focused on data reduction and visualization techniques using the EventPad tool. The purpose of this study was to conduct a security analysis of behavioral patterns using PCAP data. It presents a case study in which the EventPad visual analysis tool is used to obtain attack profiles and analyze traffic using rules and aggregations. The study did not describe any communication patterns at the application level.
Shrivastava et al. [11] focused on capturing attacks on IoT devices using Cowrie honeypots and using machine learning to classify attack types. They applied several machine learning algorithms, namely Naive Bayes, J48 Decision Trees, Random Forest, and Support Vector Machines (SVM), to classify traffic into categories such as malicious payload, SSH attack, XOR DDoS, espionage, suspicious, and clean. Feature selection was performed using subset evaluation and best-first search. The training results achieved accuracy rates ranging from 67.7% to 97.39%.

B. Network Analysis Research using Social Network Analysis (SNA)
Chakraborty et al. [12] conducted a study that advances the understanding of 5G-COVID-19 conspiracy theories. The paper applies social network analysis and content analysis to Twitter data from a seven-day period in which the #5GCoronavirus hashtag was trending on Twitter in the UK. The content analysis revealed that, of a sample of 233 tweets, 34.8% (n=81) contained opinions linking 5G and COVID-19, 32.2% (n=75) were critical of the conspiracy theory, and 33.0% (n=77) were general tweets that did not disclose views or personal opinions. Most tweets therefore came from users who did not support the conspiracy theory, indicating that although interest in the topic was high, only a small percentage of users actually believed it.
Liu et al. [13] provided a large-scale group decision-making model based on three processes: propagating beliefs, detecting and resolving conflicts, and selection using social network analysis methods. In the first procedure, they propose a relation strength-based belief propagation operator, which allows building a complete social network while considering the effect of relation strength on propagation efficiency. In the second procedure, they define the notion of degree of conflict and measure the degree of collective conflict in conjunction with assessments of information and belief relationships among large groups of decision makers.
SNA models users as nodes and the interactions between users as lines (edges). This analysis is needed because it brings new opportunities to understand the social interaction patterns of individuals or communities [20] [21]. SNA can be used to study network patterns of organizations, ideas, and people who are connected in various ways in an environment [22] [23]. Degree centrality counts the number of connections or interactions a node has. To calculate the value of degree centrality (CD), we use Eq. (1) [24].
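The equation referenced as Eq. (1) does not appear in this text. A standard form of degree centrality is sketched below as an assumption, not as the exact formula of [24]. Here $a_{vu}$ is 1 when nodes $v$ and $u$ are connected and 0 otherwise, $n$ is the number of nodes, and $C'_D$ is the optional normalized variant.

```latex
C_D(v) = \sum_{u \neq v} a_{vu}, \qquad C'_D(v) = \frac{C_D(v)}{n - 1}
```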
Closeness centrality (CC) calculates the average distance between a node and all other nodes on the network. This measure describes the proximity of this node to other nodes [24] as in Eq. (2).
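Eq. (2) is likewise not reproduced in this text. A common formulation, assumed here and not necessarily the one in [24], is the reciprocal of the average shortest-path distance $d(v, u)$ from node $v$ to every other node in a connected network of $n$ nodes:

```latex
C_C(v) = \frac{n - 1}{\sum_{u \neq v} d(v, u)}
```

Under this form, a node that is one step from every other node has $C_C(v) = 1$, and larger average distances give smaller values.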
Betweenness centrality (BC) calculates how often a node lies on the path that other nodes use to reach each other in the network. This value serves to determine which actor acts as the bridge that connects interactions in the network. To calculate the value of betweenness centrality we use Eq. (3) [24].
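Eq. (3) is not reproduced in this text. The usual shortest-path definition of betweenness centrality is given below as an assumption about the intended formula. Here $\sigma_{st}$ is the number of shortest paths between nodes $s$ and $t$, and $\sigma_{st}(v)$ is the number of those paths that pass through $v$.

```latex
C_B(v) = \sum_{s \neq v \neq t} \frac{\sigma_{st}(v)}{\sigma_{st}}
```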
One of the most important requirements at the preservation stage of digital forensics is data integrity. The message-digest algorithms MD5 and SHA-1, as one-way cryptographic hash functions, are used in integrity validation [23] [24]. The MD5 algorithm processes the message in 512-bit blocks using four non-linear functions, as in Eq. (4).
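Eq. (4) is not reproduced in this text. For reference, the four non-linear auxiliary functions defined for MD5 in RFC 1321 are shown below. Each takes three 32-bit words $X$, $Y$, $Z$, and $\wedge$, $\vee$, $\oplus$, and $\neg$ denote bitwise AND, OR, XOR, and NOT. Whether the paper's Eq. (4) used exactly this notation is an assumption.

```latex
\begin{aligned}
F(X, Y, Z) &= (X \wedge Y) \vee (\neg X \wedge Z) \\
G(X, Y, Z) &= (X \wedge Z) \vee (Y \wedge \neg Z) \\
H(X, Y, Z) &= X \oplus Y \oplus Z \\
I(X, Y, Z) &= Y \oplus (X \vee \neg Z)
\end{aligned}
```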

C. PCAP Data and Network Forensics Analysis
PCAP analysis in cyber forensics can be performed using a variety of methods. The first is software-based analysis, in which PCAP analysis software is used to view and analyze packet content, including headers and payloads, and to look for signs of malicious activity [14]. The next approach is statistical analysis, which extracts statistics from the PCAP data, such as packet counts, network traffic volume, and protocol statistics [15]. Another important approach is packet and payload analysis, which extracts information from packet headers, such as source and destination IP addresses and the protocol and port used, as well as the payload of the packet [10] [11] [17] [18]. Other findings indicate that presenting computer network traffic results in graphical form provides valuable information that helps identify banking transaction schemes (pooled accounts, straw men, smurfing) used to hide or obfuscate the movement of prohibited resources, thereby enhancing the visualization aspect of financial analysis. Packet analysis of Internet traffic is an important backtracking technique in network forensics: if the captured packets are detailed enough, it can even show all network traffic at a specific point in time. This can be used to detect traces of malicious online behavior, data breaches, unauthorized website access, malware infections, and infiltration attempts, and to reconstruct image files, documents, email attachments, and other content sent over the network [1] [15] [16] [19].
We recommend combining PCAP analysis and social network analysis. This combination advances network forensic analysis by revealing communication patterns between specific nodes in social media interactions. Combining PCAP and SNA can be a powerful method for identifying and analyzing malicious activity on the network, because it takes advantage of both the detailed information provided by PCAP data and the broader network-level view provided by SNA. For example, PCAP data can be used to identify specific patterns or anomalies in network traffic that are consistent with known malicious activity, such as botnet command-and-control traffic or data exfiltration. Once these patterns are identified, SNA can be used to identify the nodes and edges in the network that match them, which helps identify the source and destination of the malicious activity and the relationships between these entities.

III. COMBINING PCAP DATA AND SOCIAL NETWORK ANALYSIS FOR NETWORK FORENSIC ACTIVITY
This research uses an experimental method in a laboratory in which the independent variable is the number of nodes in the social media capture and the dependent variable is the result of the network forensic analysis, as shown in Fig. 1.

A. Data Collection
The first step is to collect PCAP data from the network using an application such as Wireshark or tcpdump. This data describes the network traffic that occurred during the collection period, including details such as source and destination IP addresses, port numbers, and packet payloads. During the collection process, the keyword "Manchester United" was used to crawl data from Twitter, and 1000 nodes were obtained. The Twitter scraping was performed on January 4, 2023 using the Netlytic application.
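As an illustration of what a collected capture contains, the following minimal sketch reads a file in the classic pcap format written by tools such as tcpdump and fingerprints it with MD5 and SHA-1. It uses only the Python standard library. The function names and the tiny synthetic capture are assumptions made for this example, not part of the study's tooling, and the newer pcapng format is not handled.

```python
import hashlib
import struct


def evidence_digests(data: bytes) -> dict:
    """Return MD5 and SHA-1 hex digests of the raw capture bytes."""
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
    }


def read_pcap_records(data: bytes) -> list:
    """Parse classic pcap bytes into (timestamp_seconds, packet_bytes) tuples."""
    if len(data) < 24:
        raise ValueError("too short to be a pcap file")
    magic = struct.unpack("<I", data[:4])[0]
    if magic == 0xA1B2C3D4:      # file written in little-endian byte order
        endian = "<"
    elif magic == 0xD4C3B2A1:    # file written in big-endian byte order
        endian = ">"
    else:
        raise ValueError("not a classic microsecond-resolution pcap file")
    records = []
    offset = 24                  # skip the 24-byte global header
    while offset + 16 <= len(data):
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack(
            endian + "IIII", data[offset:offset + 16])
        offset += 16
        packet = data[offset:offset + incl_len]
        if len(packet) < incl_len:
            break                # truncated final record
        records.append((ts_sec + ts_usec / 1_000_000, packet))
        offset += incl_len
    return records


def build_demo_pcap() -> bytes:
    """Build a synthetic two-packet capture (hypothetical data for the demo)."""
    # magic, version 2.4, timezone offset, sigfigs, snaplen, link type 1 (Ethernet)
    header = struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 65535, 1)
    first = struct.pack("<IIII", 100, 500000, 4, 4) + b"\x01\x02\x03\x04"
    second = struct.pack("<IIII", 101, 250000, 2, 2) + b"\xaa\xbb"
    return header + first + second


if __name__ == "__main__":
    capture = build_demo_pcap()
    print(evidence_digests(capture))
    for timestamp, packet in read_pcap_records(capture):
        print(timestamp, packet.hex())
```

Computing the digests on the raw bytes before any parsing means the same values can be recomputed later to show that the evidence file has not changed.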

B. Data Preprocessing
Filtering: this step is performed to filter out packets that are not relevant to the current investigation, for example by keeping only packets from suspicious IP addresses or packets that use protocols considered important.
Parsing: this step is performed to extract relevant information from the PCAP packets. Information that can be extracted includes source and destination IP addresses, protocols used, timestamps, and payload data.
Anonymization: this step is performed to remove information that could be used to identify individuals participating in the communication. Information that can be removed includes IP addresses, MAC addresses, and personal information contained in payload data.
Normalization: this step is performed to convert the data extracted from the PCAP packets into a format that is easier to use for analysis, for example by converting timestamps to a more readable format or converting the protocols used to a simpler format.
Aggregation: this step is performed to combine data from multiple PCAP packets into a larger unit, for example by combining multiple packets from the same IP address.
Enrichment: this step is performed to add information to the data extracted from PCAP packets, such as IP geolocation information, WHOIS information, or IP reputation information.
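The preprocessing steps above can be sketched as a small pipeline over already-parsed packet records. The record layout (a dict with `ts`, `src`, `dst`, `proto`, and `length` keys), the function names, and the salted-hash pseudonymization are assumptions chosen for illustration. Enrichment is omitted because it depends on external lookup services.

```python
import hashlib
from collections import defaultdict
from datetime import datetime, timezone


def keep_relevant(records, protocols):
    """Filtering: keep only records whose protocol is in the set of interest."""
    return [r for r in records if r["proto"].upper() in protocols]


def pseudonymize(address, salt):
    """Anonymization: replace an address with a salted, truncated SHA-256 tag."""
    return hashlib.sha256((salt + address).encode("utf-8")).hexdigest()[:12]


def normalize(record, salt):
    """Normalization: ISO 8601 UTC time, upper-case protocol, pseudonymous endpoints."""
    return {
        "time": datetime.fromtimestamp(record["ts"], tz=timezone.utc).isoformat(),
        "src": pseudonymize(record["src"], salt),
        "dst": pseudonymize(record["dst"], salt),
        "proto": record["proto"].upper(),
        "length": record["length"],
    }


def aggregate(records):
    """Aggregation: total packets and bytes for each (src, dst) pair."""
    totals = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for r in records:
        key = (r["src"], r["dst"])
        totals[key]["packets"] += 1
        totals[key]["bytes"] += r["length"]
    return dict(totals)


def preprocess(records, protocols, salt):
    """Run filtering, normalization (with anonymization), then aggregation."""
    cleaned = [normalize(r, salt) for r in keep_relevant(records, protocols)]
    return cleaned, aggregate(cleaned)


if __name__ == "__main__":
    # Hypothetical records using documentation-only IP address ranges.
    raw = [
        {"ts": 0, "src": "192.0.2.1", "dst": "198.51.100.7", "proto": "tcp", "length": 60},
        {"ts": 1, "src": "192.0.2.1", "dst": "198.51.100.7", "proto": "tcp", "length": 40},
        {"ts": 2, "src": "192.0.2.9", "dst": "198.51.100.7", "proto": "arp", "length": 28},
    ]
    cleaned, totals = preprocess(raw, {"TCP"}, salt="case-salt")
    print(cleaned)
    print(totals)
```

A salted hash keeps the same endpoint mapped to the same tag, so communication patterns survive anonymization. An investigator holding the salt could still re-derive tags for known addresses, so this is pseudonymization rather than full anonymization.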

C. PCAP Analysis
Traffic analysis: this step involves analyzing data to identify patterns and anomalies in network traffic. This may include identifying unusual traffic destinations, unusual traffic patterns, or specific protocols used.
Packet-level analysis: this step involves examining individual packets in the PCAP data, such as their source and destination IP addresses, source and destination ports, and payload. This can be used to identify specific keywords, extract files, or extract other information from the payload.
Statistical analysis: this is the process of analyzing and interpreting the data contained in a PCAP file using statistical methods. This can include identifying patterns, trends, and anomalies in network traffic, as well as estimating metrics such as traffic volume, packet size, and packet arrival time.
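The kinds of statistics described above can be computed directly from parsed packet records. The sketch below is a minimal example under the assumption that each record is a dict with `ts`, `src`, `dst`, `proto`, and `length` keys. The function names and sample records are hypothetical, and the output only approximates the corresponding views in a tool such as Wireshark.

```python
from collections import Counter, defaultdict


def conversation_stats(records):
    """Count packets and bytes exchanged by each unordered pair of endpoints."""
    stats = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for r in records:
        pair = tuple(sorted((r["src"], r["dst"])))  # A->B and B->A share one entry
        stats[pair]["packets"] += 1
        stats[pair]["bytes"] += r["length"]
    return dict(stats)


def protocol_distribution(records):
    """Return each protocol's share of all packets as a fraction of the total."""
    counts = Counter(r["proto"] for r in records)
    total = sum(counts.values())
    return {proto: n / total for proto, n in counts.items()} if total else {}


def interarrival_times(records):
    """Return the gaps, in seconds, between consecutive packet timestamps."""
    stamps = sorted(r["ts"] for r in records)
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]


if __name__ == "__main__":
    # Hypothetical records using documentation-only IP address ranges.
    demo = [
        {"ts": 0.0, "src": "192.0.2.1", "dst": "198.51.100.7", "proto": "TCP", "length": 60},
        {"ts": 1.0, "src": "198.51.100.7", "dst": "192.0.2.1", "proto": "TCP", "length": 40},
        {"ts": 3.0, "src": "192.0.2.1", "dst": "203.0.113.5", "proto": "UDP", "length": 100},
        {"ts": 4.0, "src": "192.0.2.1", "dst": "198.51.100.7", "proto": "TCP", "length": 20},
    ]
    print(conversation_stats(demo))
    print(protocol_distribution(demo))
    print(interarrival_times(demo))
```

Treating each conversation as an unordered pair groups both directions of an exchange together, which is also the form needed later when the same pairs become edges of a communication graph.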

D. Social Network Construction
Social network construction creates representations of relationships between individuals or entities in a social network. This may involve identifying relationships between individuals, such as friendships, family ties, or professional relationships, and the strength of these relationships, such as the frequency of personal interaction or communication. Centrality measures can be applied to social networks created from PCAP data and help identify key IP addresses or ports that may be important for understanding communication patterns and how information or malware spreads in the network.
Degree centrality: this measure is based on the number of edges (connections) a node has. Nodes with high degree centrality are those that have many connections in the network.
Betweenness centrality: this measure is based on the number of shortest paths that pass through a node. Nodes with high betweenness centrality lie on many shortest paths and are considered bridges or gatekeepers between different communities.
Closeness centrality: this measure is based on the average distance of a node from all other nodes in the network. Nodes with high closeness centrality are those that are close to many other nodes in the network.
Community discovery is an important task in social network analysis that aims to identify groups of nodes (communities) that are more connected to each other than to the rest of the network. There are various methods for discovering communities in a network.
Network pattern recognition is the process of identifying structural patterns in a network, which can provide insight into network organization and function. There are several methods for identifying patterns in networks; some of the most popular are subgraph counts and clustering coefficients.
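To make the centrality measures above concrete, the following sketch builds an undirected communication graph from (source, destination) records and computes the three measures with the Python standard library only. The function names and the record layout are assumptions for this example. Betweenness is computed with Brandes' algorithm for unweighted graphs, a standard technique and not necessarily the one used by the tools in this study. Degree and betweenness are left unnormalized.

```python
from collections import deque


def build_graph(records):
    """Build an undirected adjacency map {node: set(neighbors)} from records."""
    adj = {}
    for r in records:
        src, dst = r["src"], r["dst"]
        adj.setdefault(src, set())
        adj.setdefault(dst, set())
        if src != dst:               # ignore self-loops
            adj[src].add(dst)
            adj[dst].add(src)
    return adj


def degree_centrality(adj):
    """Raw degree: the number of distinct neighbors of each node."""
    return {v: len(neighbors) for v, neighbors in adj.items()}


def closeness_centrality(adj):
    """(reachable nodes - 1) / sum of shortest-path distances; 0.0 if isolated."""
    result = {}
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:                 # breadth-first search from s
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[s] = (len(dist) - 1) / total if total else 0.0
    return result


def betweenness_centrality(adj):
    """Unnormalized shortest-path betweenness (Brandes, unweighted graph)."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        order = []                   # nodes in order of non-decreasing distance
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):    # accumulate dependencies back toward s
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints in an undirected graph.
    return {v: c / 2 for v, c in score.items()}


if __name__ == "__main__":
    # Hypothetical star-shaped traffic: one hub talking to three hosts.
    demo = [{"src": "hub", "dst": host} for host in ("a", "b", "c")]
    graph = build_graph(demo)
    print(degree_centrality(graph))
    print(closeness_centrality(graph))
    print(betweenness_centrality(graph))
```

In the star-shaped demo the hub has the highest value on all three measures, which is the pattern an investigator would look for when identifying a key player.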

IV. EXPERIMENT
Conversation statistics: the Wireshark conversation statistics, here from source IP 192.168.1.14 to 104.211.42.0, show which devices are communicating with each other. They also count the traffic exchanged between these devices, which helps to understand the communication patterns in the network.
Protocol distribution: Wireshark's protocol hierarchy statistics show which protocols are used on the network and how much traffic each generates, that is, the distribution of protocols and the number of packets present. Fig. 2 shows the number of packets, namely 5840, distributed among the different protocols and conversation statistics.
Fig. 3 shows the distribution of packet content in Wireshark's packet details panel, which is used to examine the contents of individual packets and see their structure. It can help understand the data exchanged in the network and identify potential problems.
(IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 14, No. 3, 2023, www.ijacsa.thesai.org
Fig. 4 shows the round-trip time (RTT), a measure of how long it takes to send a packet of data from one device to another and receive a response. This value can be used to measure network latency and identify potential delays. Here, RTT = 55 milliseconds.
Fig. 5 shows data collection verification and file integrity verification. In network forensics, this refers to the process of collecting digital evidence from the network and verifying its authenticity and integrity. This process is critical to ensuring that the evidence collected is accurate and usable for investigation or trial. Data collection involves capturing and copying data from the network, while file integrity checking involves checking hashes or digital signatures of the captured data to ensure it has not been tampered with. These steps are important to maintain the chain of custody and the authenticity of the evidence.
Digital forensic analysis using SNA includes identifying primary and secondary actors.
This assessment makes the investigative process more focused on specific actors. Fig. 6 shows degree centrality, the number of connections or edges a node has to other nodes in the network. Nodes with high degree centrality have many connections to other nodes. In this study, 10,000 nodes were obtained using the "Manchester United" keyword. The main actor, with a degree of 951, is the node labeled utdfaithfuls. Key player identification is performed to determine the degree of influence a participant has in the network; it is important to identify which actors play the most important role in the communication pattern. Betweenness centrality takes into account the number of shortest paths between other nodes that pass through a given node. Nodes with high betweenness centrality are "bridges" between other nodes in the network.
The degree, closeness, and betweenness centrality metrics tables show the following information for each node (or vertex) in the network:
Degree centrality: a measure of how many direct connections a node has to other nodes in the network. This is usually expressed as the number of edges incident to the node.
Closeness centrality: a measure of how close a node is to all other nodes in the network. This is usually derived from the sum of the shortest distances between the node and all other nodes, where a smaller sum indicates a more central node.
Betweenness centrality: a measure of how often a node acts as a bridge between other nodes in the network. This is usually expressed as the number of shortest paths between pairs of nodes that pass through the given node.

For this experiment, the main actor by degree centrality is the node labeled utdfaithfuls, while the betweenness centrality result is 1718 for the node labeled utdplug. A graph visualization of the top three communication patterns is shown in Fig. 7. Graph visualization is a commonly used technique in social network analysis to show the relationships between individuals or entities in the network. It presents the network as a graph, with nodes representing individuals or entities and edges representing the relationships between them. The size and color of nodes and the thickness and direction of edges can be used to represent attributes or measures, such as relationship strength or centrality. This helps visualize patterns, relationships, and communities in the network and makes it easier to understand the network structure and its properties.
Combining PCAP analysis with social network analysis offers the following benefits:
1) Better suspect identification. Prospective suspects in a network can be found through social network analysis, based on their connections to and interactions with other network members. This can give a more complete picture of the network and increase the precision with which suspects are identified.
2) Improved understanding of network behavior. Combining social network analysis and PCAP analysis gives a more in-depth understanding of communication patterns and network behavior. Investigators can use this to spot unusual or suspicious activity on the network, such as covert or encrypted communication.
3) Enhanced incident response. By integrating PCAP analysis with social network analysis, investigators can more precisely pinpoint the origin and consequences of an incident.
The outcomes of benchmarking the integration of PCAP analysis with social network analysis in network forensics can give important insights into the most efficient ways to carry out investigations, uncover potential security concerns, and improve network security generally.

V. CONCLUSION
Network forensics using PCAP analysis combined with social network analysis shows more comprehensive results. PCAP analysis is used to analyze network traffic, conversation statistics, protocol distribution, packet content, and round-trip time. Social network analysis maps communication patterns between nodes and identifies the most influential key players in the network. PCAP analysis efficiently captures and analyzes network packets, while SNA provides insight into the relationships and communication patterns between devices on the network. The availability of data sources is also a factor to consider when deciding whether to combine PCAP and SNA: the combination can provide a more complete view of the network when only limited log data is available. When the nature of the threat includes both technical and social aspects, a combination of the two approaches may be required. Together, PCAP and SNA can provide a more comprehensive view of the network and improve the accuracy of forensic analysis, because SNA provides context for the captured packets and helps identify relationships and patterns that might not be apparent from packet analysis alone. A potential direction for future work on cyber forensic analysis is the use of artificial intelligence and machine learning algorithms to help automate the data analysis process, making it more efficient and accurate.