Mobile Malware Classification via System Calls and Permission for GPS Exploitation

Smartphones are now used worldwide for effective communication, which makes our lives easier. Unfortunately, most current cyber threats, such as identity theft and mobile malware, target smartphone users and are motivated by profit. They spread quickly among users, especially via Android smartphones, and exploit the devices in many ways, such as through the Global Positioning System (GPS), SMS, call logs, audio or images. To detect such mobile malware, this paper presents 32 patterns of permissions and system calls for GPS exploitation, produced by using a covering algorithm. The experiment was conducted in a controlled lab environment using static and dynamic analyses, with 5560 samples from the Drebin malware dataset used for training and 500 mobile apps from Google Play Store used for testing. As a result, 21 of the 500 apps matched these 32 patterns. These new patterns can guide researchers in the same field in identifying mobile malware and can serve as input for the formation of a new mobile malware detection model.

Keywords: Mobile malware; Global Positioning System (GPS) exploitation; system call; permission; covering algorithm; static and dynamic analyses


I. INTRODUCTION
Currently, the Android smartphone is the most widely used smartphone worldwide, and many new mobile malwares are designed to attack it. Mobile malware is defined as malicious software built to attack a mobile phone or smartphone system without the owner's consent. Examples are Slembunk and Santa Claus, which are able to collect sensitive and confidential information and to control the smartphone through root exploitation. They tarnish the victim's reputation and have caused loss of money, productivity and confidential information. Furthermore, McAfee reported that 37 million pieces of malware were detected in app stores in 2016.¹ Apart from SMS, call log, audio and picture exploitation, the Global Positioning System (GPS) has been used by many attackers to exploit smartphones. Through GPS, attackers can learn details about the victim, such as satellite information, and can monitor the victim's every movement. In early 2017, Google released a patch (CVE-2016-8467) to address security vulnerabilities related to GPS exploitation in Nexus 6 and 6P phones.²
Currently, not much work has been done to detect GPS exploitation in smartphones. Therefore, the objective of this paper is to detect mobile malware attacks for GPS exploitation based on system call and permission patterns. A covering algorithm is used as the basis for the proposed patterns, which are then evaluated to demonstrate their effectiveness. This paper is organized as follows: Section 2 presents related work on mobile malware architecture, features and detection techniques. Section 3 describes the methodology used in this research. Section 4 presents the results of the experiment carried out in this research. Section 5 includes the summary and potential future work of this paper.
¹ B. Snell, "Mobile threat report what's on the horizon for 2016", 2016. [Online]. Available: https://www.mcafee.com/us/resources/reports/rp-mobilethreat-report-2016.pdf. [Accessed: 30-May-2017]

II. RELATED WORK
There are many ways in which mobile malware can be categorized. Altaher classified Android malware based on a weighted bipartite graph [1]. He used APIs and permissions for the classification, but the dataset used for the experiment was limited to 500 samples. A bigger and more recent dataset would be a good improvement for this work. Feizollah and colleagues used feature selection for mobile malware feature extraction [2], based on four main feature types: static, dynamic, hybrid and application metadata features. Their paper provides a comprehensive review of feature selection for mobile malware and is used as guidance for the experiment in this paper. The hybrid feature, which combines static and dynamic analyses, has been applied here because it is comprehensive and systematic. System calls and permissions related to GPS exploitation have been extracted and categorized into different patterns, with the details explained in Section 4 of this paper. Manuel and colleagues also used the hybrid feature in their experiment [3]. In contrast, the works in [4]-[6] used static analysis only; their performance could improve if dynamic analysis were integrated in future (a hybrid technique).
Apart from that, a few research papers [7]-[9] discussed Location Based Services (LBS) or GPS usage on Android smartphones. Singhal and Sungkla discussed the implementation of LBS to provide multiple services to users based on their location through Google Web Services and Walk Score Transit APIs on Android. Ma and colleagues developed a tool called Brox to detect location information leakage in Android by integrating static analysis, and Vanjire and colleagues developed an Android application to locate the nearest friends and family members. There are also many works related to Android malware analysis, such as [2], [6], [10]-[13]. However, none of the existing works discusses in detail how to detect and overcome GPS exploitation on smartphones. This is among the challenges for future work.
www.ijacsa.thesai.org

III. METHODOLOGY
The dynamic and static analyses and the classification of system calls and permissions for GPS exploitation are summarized in Fig. 1. The experiment was conducted in a controlled lab environment, as illustrated in Fig. 2. No outgoing network connection was allowed, to avoid any spread of the mobile malware. The training samples came from the Drebin dataset [4], while for testing, 500 mobile applications (apps) were randomly selected from Google Play Store. The Drebin dataset includes all samples from the Android Malware Genome Project. It is among the largest malware datasets, is free, and is widely used by many researchers, such as [2], [6], [10]-[13].
Dynamic analysis was used to capture the system calls, while static analysis was used to capture the permissions. All the extracted system calls and permissions were then classified by using a covering algorithm. For the dynamic analysis, the apk was installed in Genymotion and controlled through Android Debug Bridge (ADB). The running processes and system calls of the apk were then identified and extracted. Fig. 3 displays an example screenshot of the system calls captured, and Fig. 4 displays an example screenshot of the permissions captured. For the static analysis, the permissions were extracted using Dexplorer installed inside Genymotion. Fig. 5 displays an example of the system call frequency. Once all the permissions and system calls had been extracted, the percentage of occurrence and the covering algorithm were applied. These are crucial for verifying the extracted dataset and for producing the patterns. The percentage of occurrence is used to compare the similarity between the extracted system calls and permissions, which is useful for avoiding redundancy. Each system call is written as 1 to indicate its presence and 0 to indicate its absence. The totals of present and absent system calls and permissions were then calculated and compared with the existing dataset.
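As an illustration of the encoding just described, the following Python sketch counts system calls in strace-style log lines, writes their presence as 1 or 0, and computes a percentage of occurrence per feature. The paper does not give the exact formula, so the sketch assumes that the percentage of occurrence is the share of samples in which a feature is present. The function names, the sample log lines and the feature vocabulary are all hypothetical.

```python
import re
from collections import Counter

# Assumption: log lines follow strace's default "name(args) = result" layout,
# optionally prefixed by "[pid N]". Lines that do not start with a call name
# (for example exit notes) are skipped.
SYSCALL_RE = re.compile(r"^(?:\[pid\s+\d+\]\s+)?(\w+)\(")


def syscall_counts(log_lines):
    """Count how often each system call name appears in the log."""
    counts = Counter()
    for line in log_lines:
        match = SYSCALL_RE.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


def presence_vector(observed, vocabulary):
    """Write 1 if a feature was observed for the app and 0 otherwise."""
    return [1 if feature in observed else 0 for feature in vocabulary]


def percentage_of_occurrence(samples, vocabulary):
    """Share of samples (in percent) in which each feature is present."""
    total = len(samples)
    return {
        feature: 100.0 * sum(feature in sample for sample in samples) / total
        for feature in vocabulary
    }


# Hypothetical log excerpt for one app (not taken from the paper's data).
log = [
    'openat(AT_FDCWD, "/data/app/base.apk", O_RDONLY) = 3',
    "ioctl(3, 0xc0186201, 0x7ffd0) = 0",
    "ioctl(3, 0xc0186201, 0x7ffd0) = 0",
    'read(3, "", 4096) = 0',
    "+++ exited with 0 +++",
]
vocabulary = ["ioctl", "openat", "read", "sendto"]

counts = syscall_counts(log)
vector = presence_vector(set(counts), vocabulary)

# Hypothetical feature sets for four apps.
samples = [{"ioctl", "read"}, {"ioctl", "sendto"}, {"ioctl"}, {"openat", "read"}]
percentages = percentage_of_occurrence(samples, vocabulary)
```

Here `counts` plays the role of a per-app frequency table, `vector` is the 1/0 row for that app, and `percentages` summarizes how common each feature is across the sample set. Permissions could be added to the same vocabulary so that both feature types share one encoding.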
Once the above steps were completed, their output became the input for the covering algorithm. The covering algorithm is used to generate a system call and permission pattern for each apk. It identifies rules that have been set by the researchers. In this experiment, specific-to-general rule induction for the covering algorithm was applied as follows:
1) The extracted system calls and permissions are picked up and generalized by repeatedly dropping conditions.
2) The system calls and permissions covered by the resulting rule are removed, and the process continues until all the system calls and permissions are covered.
3) When dropping a condition, the one that maximizes rule coverage is chosen.
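The three steps above can be sketched in Python as follows. This is a minimal illustration of specific-to-general covering, not the authors' implementation. It assumes that each app is a set of features, that a rule is a set of conditions covering any app containing all of them, and that a set of benign feature sets is available to stop over-generalization. The paper does not state a stopping criterion, so that part is an assumption, and all feature sets shown are hypothetical.

```python
def generalize(seed, positives, negatives):
    """Start from the most specific rule (all features of the seed) and
    repeatedly drop the condition whose removal maximizes coverage of the
    positives, never accepting a rule that covers a negative example."""
    rule = set(seed)
    while len(rule) > 1:
        best, best_coverage = None, -1
        for condition in sorted(rule):  # sorted for deterministic tie-breaking
            candidate = rule - {condition}
            if any(candidate <= negative for negative in negatives):
                continue  # too general: it would also match a benign app
            coverage = sum(candidate <= positive for positive in positives)
            if coverage > best_coverage:
                best, best_coverage = candidate, coverage
        if best is None:
            break  # no condition can be dropped safely
        rule = best
    return frozenset(rule)


def covering(positives, negatives):
    """Induce rules until every positive example is covered by some rule."""
    remaining = list(positives)
    rules = []
    while remaining:
        rule = generalize(remaining[0], remaining, negatives)
        rules.append(rule)
        # Remove the examples covered by the new rule and continue.
        remaining = [example for example in remaining if not rule <= example]
    return rules


# Hypothetical feature sets; permission names are shortened Android
# permission identifiers and the lower-case names are system calls.
malicious = [
    {"ACCESS_FINE_LOCATION", "INTERNET", "ioctl", "sendto"},
    {"ACCESS_FINE_LOCATION", "INTERNET", "ioctl", "read"},
    {"ACCESS_COARSE_LOCATION", "SEND_SMS", "write"},
]
benign = [
    {"INTERNET", "ioctl", "read"},
    {"ACCESS_FINE_LOCATION", "read"},
    {"SEND_SMS", "write"},
]

rules = covering(malicious, benign)
```

With these inputs, the first two malicious examples end up under one rule and the third under another. Each induced rule is a candidate pattern of permissions and system calls in the sense used in this paper.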

IV. FINDINGS
Thousands of system calls and permissions were extracted, but the focus of this paper is on GPS exploitation. From the 5560 samples, 58 system calls and 41 permissions were found that could be used together with genuine system calls for GPS exploitation. The system call representations are shown in Table 2 and the permission representations in Table 3. Table 4 shows the classification of the permissions most often used together with system calls to exploit GPS, as extracted from the Drebin dataset. Through dynamic analysis, numerous system calls per application were encountered until execution was stopped. Logs were recorded based on the system calls present during dynamic analysis.
Table 5 shows the top 10 system calls most widely used with permissions and other system calls to exploit GPS, as extracted from the Drebin dataset.
Table 6 lists the patterns produced from the system calls and permissions most often used for GPS exploitation. Of the 500 apps downloaded from Google Play Store, only 21 matched the 32 proposed patterns for potential GPS exploitation, as summarized in Table 7. The categories of these 21 apps are summarized in Table 8.
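The matching step can be illustrated with a short Python sketch. The paper does not define precisely when an app matches a pattern, so the sketch assumes that an app matches when it exhibits every permission and system call in the pattern. The patterns and apps below are hypothetical and are not taken from Tables 6 to 8; only the figures 21 and 500 come from the reported result.

```python
def matching_patterns(app_features, patterns):
    """Return the indices of all patterns fully contained in the app's features."""
    return [i for i, pattern in enumerate(patterns) if pattern <= app_features]


def match_rate(matched, total):
    """Percentage of tested apps that matched at least one pattern."""
    return 100.0 * matched / total


# Hypothetical patterns (permission names shortened, system calls lower case).
patterns = [
    frozenset({"ACCESS_FINE_LOCATION", "ioctl"}),
    frozenset({"ACCESS_COARSE_LOCATION", "INTERNET", "sendto"}),
]

# Hypothetical test apps with their extracted permissions and system calls.
apps = {
    "app_a": {"ACCESS_FINE_LOCATION", "INTERNET", "ioctl", "read"},
    "app_b": {"INTERNET", "read", "write"},
    "app_c": {"ACCESS_COARSE_LOCATION", "INTERNET", "sendto", "ioctl"},
}

# Keep only the apps that match at least one pattern.
flagged = {}
for name, features in apps.items():
    hits = matching_patterns(features, patterns)
    if hits:
        flagged[name] = hits

# Rate corresponding to the reported result: 21 matching apps out of 500 tested.
reported_rate = match_rate(21, 500)
```

Under this containment criterion, `app_a` and `app_c` are flagged while `app_b` is not, and the reported result of 21 matches among 500 apps corresponds to a rate of 4.2%.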

V. CONCLUSION
Based on the analysis results in this paper, it can be concluded that each executed mobile application has its own system calls and permissions. GPS has been identified as one of the features and has been used for different purposes. Thirty-two possible patterns of system call and permission combinations for GPS exploitation are presented in this paper. Without users' consent, their confidential information, especially information related to their location or GPS, can easily be exploited by attackers. Thus, based on the 21 mobile apps that matched our patterns, it is shown that the GPS feature in Android smartphones can be exploited by Android mobile malware through permissions and system calls. For future work, this research can guide other researchers with the same interest and domain in extending their work. These 32 patterns can be used as a database and as input for the formation of a new model to detect mobile attacks that exploit GPS. Furthermore, automating system call and permission extraction is another challenge to be tackled in future.

1. The dataset from Drebin was downloaded.
2. The laboratory environment was set up.
3. Tools were installed.
4. Data analysis was conducted using static and dynamic analysis techniques (strace module).
5. The emulator device was controlled with root access by using Android Debug Bridge (ADB).
6. The parent processes of the Android application were identified and retrieved (ps).
7. The system call behaviour of the application was monitored and documented.
8. The static and dynamic analyses were completed.
9. The system call classification was obtained.
10. The result was tested with applications from Google Play Store.
11. Documentation, report writing and publication.

Table 1 displays the software used for the experiment. For this research, the training dataset consists of 179 different types of mobile malware from the 5560 samples of the Drebin dataset.

TABLE VII. PERCENTAGE OF APPLICATIONS THAT MATCH WITH SYSTEM CALLS AND PERMISSION BASED ON GPS EXPLOITATION

TABLE VIII. CATEGORIES OF THE MATCHED MALICIOUS APPLICATIONS