Comparing the Usability of M-Business and M-Government Software in Saudi Arabia: A Revised Nielsen's Heuristics Method

This study presents a usability assessment of mobile presence in the Kingdom of Saudi Arabia (KSA), with a particular focus on the variance between M-business and M-government presence. A general hypothesis was developed that M-business software is more usable than M-government software, with eleven sub-hypotheses derived from Nielsen's heuristics method. To examine the hypotheses, a representative sample of thirty-six (n=36) mobile software applications in Saudi Arabia was identified from prior research, representing two main categories: M-business and M-government. Within each category, eighteen (n=18) mobile software applications were chosen for further evaluation, representing a wide variety of sectors. A questionnaire was devised based on Nielsen's heuristics method and tailored to the context at hand (mobile computing) to establish a usability checklist consisting of eleven constructs. A group of thirty-six (n=36) participants was recruited to assess each software application against the usability checklist by rating each item on a Likert scale. The results reveal that mobile interactions in KSA were, in general, of an acceptable design quality with respect to usability. The average percentage score for all heuristics met by the evaluated mobile software applications was 68.6%, which reflects how well usability practices were implemented in mobile presence. The scores for all usability components exceeded 60%, with five components below the average score (68.6%) and six components above it. The variance between M-business and M-government software usability was significant, in favor of M-business. The general hypothesis was accepted, as were seven of the sub-hypotheses; the remaining four sub-hypotheses were rejected.
Keywords: Usability; interaction; heuristics; interface; mobile; Saudi Arabia


I. INTRODUCTION
The usability of software is an important aspect of systems development; however, it is particularly challenging to study the usability of mobile systems, as several limitations are posed by Mobile Computing (M-computing) environments [1]. In the current literature, scholarly research considers the proposition of design patterns, software quality and testing models for different mobile software applications [2]. Other streams of research have focused on investigating the feasibility of usability testing environments, frameworks, contextual effects and other aspects of mobile interaction. In addition, several studies have investigated the usability of experimental mobile software applications developed to solve context-specific problems. Few studies have examined the status of mobile presence in the Kingdom of Saudi Arabia (KSA) with respect to the usability of mobile interfaces; thus, further research is needed to examine the usability of mobile software in KSA using heuristics evaluation. This paper not only reveals this gap in the current literature but also attempts to address part of it by presenting an empirical study of the usability of mobile software applications in KSA. It is an extended version of an earlier conference paper [3], from which it differs by examining the difference between M-business and M-government software usability in KSA.
The paper aims to assess the usability of mobile presence, while also ranking each evaluated mobile software application. Specifically, it identifies several aspects of mobile software usability, based on Nielsen's heuristics method, in order to enrich the understanding of mobile interaction design. Another aim of this study is to demonstrate the empirical application of Nielsen's heuristics method in the context of mobile computing. It also sheds more light on the variance in usability between M-business and M-government software in KSA. In order to achieve these aims, eleven dimensions (n=11) were adopted from instruments proposed by prior research and used to design a questionnaire for assessing mobile software usability. The eleven usability components were based on Nielsen's heuristics instrument, published in 1994 [4], which was tailored to fit the context of touch-based mobile interaction in accordance with the arguments presented by [5]. In order to compare usability practice between M-business and M-government, the sample of mobile software applications was divided equally between the two conditions, each of which was represented by eighteen applications (n=18).
The remainder of this paper is organized as follows: the literature review is provided in Section 2, and the empirical study is detailed in Section 3. The research results and discussion are presented in Sections 4 and 5, respectively, and a conclusion is provided in Section 6.

II. LITERATURE REVIEW
Usability assessment has been a central topic of several empirical studies in the field of human computer interaction (HCI) [6]. In particular, testing the usability of mobile software is an emerging area of research, as several challenges may affect these software applications, including bandwidth, screen size and the nature of mobile environments [7]. For example, Zhang and Adipat (2005) proposed a framework for the design and implementation of mobile interface usability investigations, with regard to the selection of tools, methods, measurements and data collection approaches [7]. Another study, by Kallio and Kaikkonen (2005), investigated mobile usability testing in two different environments, the laboratory and the field; the findings indicated that laboratory-based usability testing was sufficient, except when contextual elements of the actual use environment needed investigation [1]. In addition, Ryan and Gonsalves (2005) empirically examined the effect of contextual elements on mobile interaction design. They implemented four conditions, crossing mobile or personal computer interaction design with web-based or traditional interfaces, all of which were tested in a within-participants experimental design [8]. The usability results for mobile interaction outweighed those for personal computer interaction, due to the unique characteristics of mobile computing, such as location-based services. Yet some limitations were encountered, such as the limited input capabilities of the actual mobile devices [8]. Other studies considered a holistic view of mobile usability; to illustrate, Coursaris and Kim (2006) proposed a framework for usability evaluation that offered a holistic view of mobile usability dimensions and could form a foundation to guide further usability research [9]. Another study, by Tsai et al. (2007), developed a mobile software application for real-time caloric self-monitoring and conducted a usability study to examine the feasibility, compliance and satisfaction of the software application; the findings revealed positive feasibility and usability results [10]. Furthermore, Balagtas-Fernandez and Hussmann (2009) proposed a four-phase framework for mobile software usability evaluation, aimed at simplifying technical usability analysis [11]. The four phases are: preparing the system for analysis, collecting usability data, extracting information and analyzing usability practices.
Within the usability research endeavor, heuristics evaluation has been subject to several improvement attempts, resulting in the proposition of different variations of context-specific usability heuristics. For instance, to represent the contextual effect of mobile environments, Po et al. (2004) proposed two variations of heuristics evaluation, namely heuristic walkthrough and contextual walkthrough [12]. The former incorporates scenarios of use into the usability assessment by heuristics evaluation, while the latter involves the application of the former in the field [12]. This study demonstrated that usability assessment could be improved by incorporating contextual details, particularly in favor of the heuristic walkthrough method [12]. In addition, Korhonen and Koivisto (2006) proposed context-specific usability heuristics for mobile games, incorporating game usability, mobility and game play dimensions. The findings indicated that playability flaws were easy to discover, whereas their counterparts within game play were harder to uncover [13].
In KSA, several usability studies have utilized the heuristics evaluation method to assess different information systems. It is noteworthy that KSA is considered to be one of the biggest information and communications technology (ICT) markets in the region. According to the annual report of the Communications and Information Technology Commission (CITC), ICT spending in KSA is estimated to have reached SAR 94 billion in 2012, with an annual growth of 16% [14]. Considering the E-government web presence in KSA, Eidaroos et al. (2009) adapted the heuristics evaluation method to produce a usability checklist for E-government websites, in order to discover several usability flaws in the current practice of E-government interaction design [15]. Furthermore, Al-Khalifa (2010) used the heuristics evaluation method to assess the usability of E-government web presence in KSA and found that the method was useful as an initial phase to uncover usability issues [16]. Specifically, a sample of fourteen (n=14) governmental websites was selected from different sectors and evaluated using six usability components; the websites demonstrated good usability implementation, as all components achieved a score above 50%. More recently, Alotaibi (2013) investigated the usability of university web presence in KSA, using heuristic evaluation to demonstrate its empirical application in KSA [17]. The study evaluated twelve (n=12) university websites from different categories; thirty (n=30) evaluators found that the usability practices of Saudi universities were of an acceptable quality with regard to seven usability components.

III. EMPIRICAL STUDY
Heuristics evaluation is considered to be among the most appropriate usability evaluation methods for initial assessments of an interface, because it is low cost, quick to set up and achieves fast outcomes [16]. Heuristics evaluation was first proposed in 1990 by Nielsen and Molich [18], and has since been refined by many authors. To illustrate, Inostroza et al. [5] adapted the Nielsen heuristics approach to fit the context of mobile computing. Specifically, they adopted Nielsen's approach, which consisted of ten (n=10) dimensions, and modified the heuristics to enable their use within the context of touch-based mobile devices. In addition, a new dimension, concerned with physical interaction and ergonomics, was incorporated to describe the alignment of the shape, position and dimensions of the interface elements with the natural posture/position of the hand. This research proposes an assessment method that is based on Nielsen's heuristics [4], extended and tailored to fit the context of touch-based mobile interaction [5]. The proposed method incorporates eleven usability components (UC), each of which consists of several heuristics. In order to guide the evaluation of the software applications, a questionnaire was devised to represent all usability components and heuristics. Each heuristic was evaluated by rating, on a Likert scale ranging from 1 to 5 (1 = not met and 5 = fully met), the extent to which it was met by the evaluated mobile software application. Table 1 shows the usability components and the number of heuristics identified.
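The scoring step can be illustrated with a minimal sketch. The paper does not state how the Likert ratings were converted into percentage scores, so the rule used here (mean rating divided by the scale maximum) is an assumption, and the function name `percent_met` and the sample ratings are hypothetical.

```python
from statistics import mean

LIKERT_MIN, LIKERT_MAX = 1, 5  # scale endpoints stated in the paper


def percent_met(ratings):
    """Mean Likert rating as a percentage of the scale maximum (assumed rule)."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    if any(r < LIKERT_MIN or r > LIKERT_MAX for r in ratings):
        raise ValueError("ratings must lie between 1 and 5")
    return 100.0 * mean(ratings) / LIKERT_MAX


# Hypothetical ratings for the heuristics of one usability component.
example_ratings = [4, 3, 5, 3, 4]
print(round(percent_met(example_ratings), 1))
```

Other conversions are equally plausible, for example rescaling so that a rating of 1 maps to 0%; the choice affects the absolute percentages but not the ranking of components.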
The sampling of mobile software applications for further assessment involves selecting a few software applications that truly represent the mobile software population within an area, in this case KSA. Sampling should reflect a balanced diversity of mobile software applications; thus, the sampled software applications should represent different sectors, activities and organization sizes. In 2014, Alotaibi surveyed all mobile software applications developed by organizations in KSA and identified a sample of thirty-six (n=36) mobile software applications, which can be regarded as a representative sample of the population [19]. Accordingly, the thirty-six mobile software applications were considered for further analysis as a sample of the population; this sample can be viewed in Appendix A.
Scholarly research has evaluated the adoption of emerging technologies in both the Saudi private and public sectors. On the one hand, a study by Alsenaidy and Ahmad (2012) assessed the status of M-government in KSA by reviewing several mobile services offered by government organizations [20]. On the other hand, a recent study [21] investigated the adoption of mobile technology in the Saudi private sector, particularly the banking industry, and provided solid evidence for the maturity of M-business practices. Other researchers reviewed different aspects of E-service adoption in the Saudi private and public sectors, such as E-shopping [22], E-government [23][24][25] and online retailing [26]. Another stream of research considered the adoption of mobile technologies in education, which can be offered by both government-owned and private universities [27][28][29][30]. However, few of these studies examined the difference between M-business and M-government practices. Therefore, the current study separated M-business and M-government software in order to facilitate a comparison between the two categories.
In particular, the sampled mobile software applications were selected to represent two main categories, M-business and M-government, with eighteen software applications in each category. Based on the finding of prior research that M-business was more progressive than M-government in KSA [19], a general hypothesis was put forward with eleven sub-hypotheses (derived from Nielsen's heuristics method [4]) to examine the different aspects of mobile software usability; to illustrate:
H1: The usability of M-business software will outweigh that for M-government software, particularly in terms of the following usability components:
H1(a) Visibility of system status.
H1(b) Match between system and the real world.
Using the questionnaire, the sampled mobile software applications were evaluated by thirty-six (n=36) participants, all of whom were regular mobile users. A balanced recruitment procedure was followed in order to control for the effects of gender, knowledge and education of the participants. First, the participants were split by gender, with 50% being female and 50% being male. Within the eighteen participants of each gender group, three further sub-groups were created to represent educational background; consequently, there were six bachelor students, six master students and six practitioners. This resulted in six sub-groups (each representing 16.67% of the sample) across which the participants were equally balanced in gender, knowledge and education.
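The balanced recruitment described above can be checked with a short sketch. The group labels and the helper `build_design` are illustrative only; the group sizes are those reported in the text.

```python
from itertools import product

GENDERS = ("female", "male")
BACKGROUNDS = ("bachelor student", "master student", "practitioner")
PER_CELL = 6  # participants per gender-by-background sub-group, as reported


def build_design():
    """Return a mapping of (gender, background) -> number of participants."""
    return {cell: PER_CELL for cell in product(GENDERS, BACKGROUNDS)}


design = build_design()
total = sum(design.values())          # overall sample size
share = 100.0 * PER_CELL / total      # percentage of the sample per sub-group
print(len(design), total, round(share, 2))
```

The six cells of six participants each account for the full sample of thirty-six, and each cell's share rounds to the 16.67% quoted in the text.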
A within-participants design [31] was adopted for this research; all participants were instructed to interact with and then evaluate all of the mobile software applications. The order in which the mobile software applications were evaluated was counterbalanced among the participants in order to alleviate any participant learning effects. An experimental procedure was deployed to facilitate the evaluation. Fifteen-minute lectures and short (thirty-minute) training sessions were also provided to describe the basics of the heuristic method and to demonstrate how the method could be used to evaluate mobile software applications.
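The paper does not specify which counterbalancing scheme was used. One simple possibility, shown here purely as an assumption, is a cyclic rotation: because there are as many participants (36) as applications (36), each participant can start at a different application. The function name `rotated_orders` and the application labels are hypothetical.

```python
def rotated_orders(items):
    """Give participant i the item list rotated left by i positions."""
    n = len(items)
    return [items[i:] + items[:i] for i in range(n)]


# Placeholder labels; the actual applications are listed in Appendix A.
apps = [f"app_{k:02d}" for k in range(1, 37)]
orders = rotated_orders(apps)

# Each application occupies each serial position exactly once across participants.
for position in range(len(apps)):
    assert {order[position] for order in orders} == set(apps)
```

A rotation balances serial position but not which application precedes which; a balanced Latin square or randomized orders would be alternatives if carry-over effects were a concern.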

IV. RESULTS
Fig. 1 shows the percentage values of the usability assessment for the mobile software applications in accordance with the overall usability score and the eleven usability components (UC). At a glance, it is clear that more than 60% of the usability heuristics for mobile interaction design were met by the mobile presence in KSA. Although not all of the mobile usability heuristics were met, the figures clearly indicate that the usability design of the mobile software applications was of an acceptable quality. In particular, the score for visibility of the system status (UC1) was 70%, with all heuristics being rated with average scores; that is, there were no extreme values in the evaluation. As for UC2 (match between the system and real world), 72.2% of the heuristics were met, with two heuristics being rated consistently above average (displaying information in a logical and natural order and following real-world conventions). Furthermore, UC3 (user control and freedom) and UC4 (error prevention) were lower than the average score of 68.6%, scoring 68.0% and 63.4%, respectively. These results could be attributed to two main reasons: the lack of undo and redo options and the fact that users were not warned when an error was likely to occur. In terms of the other usability components, UC5 (minimize the user's memory load) had the same value as UC1, with a 70% score. Further analysis of the raw data revealed that all heuristics in these components were rated with average scores (no extreme scores). UC6 (consistency and standards) achieved the highest score, with 73.5%. This could be attributed to the way in which the standards were set within the mobile interaction design, whereby concepts were consistently expressed and established conventions were followed. In contrast, UC7 (customization and shortcuts) achieved the lowest score, at 60.3%, due to a lack of simple configuration options and other features, such as setting shortcuts and customizing and grouping the interface elements. As for UC8 (aesthetic and minimalist design), it scored higher than the average score for all usability components, at 69.4%. This could be attributed to the richness and appropriateness of the displayed contents. Moreover, UC9 (help users to handle errors) and UC10 (help and documentation) were lower than the average, scoring 63.8% and 62.4%, respectively. This could be attributed to two main reasons: first, the occurrence of errors was not precisely indicated, and second, no solutions for the errors were suggested in the error messages. In addition, some mobile software applications provide help and documentation features that are not easy to find, whereas others lack these features altogether. Finally, UC11 (physical interaction and ergonomics) scored higher than the average score for all usability components, at 71.4%. This could be attributed to the appropriateness of the buttons' shapes, positions and functions. In summary, the usability of mobile software applications in KSA was found to be above average (with an average score of 68.6%), which reflects how well usability practices were implemented in mobile presence. Furthermore, Fig. 1 shows that five of the usability components scored lower than the average score of all of the heuristics (total score = 68.6%), while six components scored above it.
Fig. 2 shows the percentage values of the usability scores for the M-government and M-business software in accordance with the overall usability score and the eleven usability components (UC). In general, it can be seen from the figure that the total usability score for M-business was approximately 7% higher than that for M-government. Similarly, the M-business scores were found to be greater than those for M-government in all individual usability components, except UC10 (help and documentation) and UC6 (consistency and standards), with a difference ranging from 0.4% to 4.2%. In particular, the score in UC1 (visibility of the system status) for M-business was 4.5% higher than for M-government, while the score in UC2 (match between the system and real world) for M-business was 3.6% greater than that for M-government. Likewise, the score in UC3 (user control and freedom) for M-business was slightly higher than that for M-government. However, the score in UC4 (error prevention) showed a different picture, where the difference between M-business and M-government was found to be minimal. Similarly, the difference in UC5 (minimize the user's memory load) was found to be slight. The score in UC6 (consistency and standards) for M-business was 3.6% higher than for M-government. Furthermore, the score in UC7 (customization and shortcuts) for M-business was 4.2% higher than that for M-government. Moreover, the difference between M-business and M-government was found to be minimal with regard to UC8 (aesthetic and minimalist design) and UC10 (help and documentation). Finally, the score in UC11 (physical interaction and ergonomics) for M-business was 2.9% greater than that for M-government. In summary, the M-business usability scores were shown to be consistently higher than those for M-government, except for three components where minor differences were found. The variation in usability scores between M-business and M-government software was statistically examined using a t-test.
Since the experimental design was within-participants [31], with all participants exposed to all conditions, the appropriate test for examining the difference between groups was the paired t-test [31]. The degrees of freedom were six hundred and forty-seven (df=647), which is one less than the number of pairs. At the significance levels (α: alpha) of α = 0.05 and α = 0.01, the critical values (cv) of the paired t-test were cv = 2.021 and cv = 2.704, respectively. Overall, the difference in usability between M-business and M-government software was statistically significant (t(647) = 3.323, cv = 2.021, p<0.01). Therefore, the general hypothesis (H1) was accepted at the 99% confidence level.
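The mechanics of the paired t-test can be sketched as follows. This is an illustration rather than the authors' analysis script: the function `paired_t` and the sample scores are hypothetical, and the pair count of 36 × 18 = 648 is an assumption (one pair per participant per M-business/M-government application pairing) that happens to be consistent with the reported df = 647. The t statistic and critical values are taken from the text.

```python
import math
from statistics import mean, stdev


def paired_t(first, second):
    """Return (t, df) for a paired t-test on two equal-length samples."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length samples with at least two pairs")
    diffs = [a - b for a, b in zip(first, second)]
    sd = stdev(diffs)  # sample standard deviation of the paired differences
    if sd == 0:
        raise ValueError("differences have zero variance")
    t = mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Assumed pair count: 36 participants x 18 application pairings.
pairs = 36 * 18
df = pairs - 1

# Decision rule using the statistic and critical values reported in the text.
T_REPORTED, CV_05, CV_01 = 3.323, 2.021, 2.704
significant_at_95 = abs(T_REPORTED) > CV_05
significant_at_99 = abs(T_REPORTED) > CV_01

# Hypothetical scores, used only to exercise the function.
business = [72.0, 68.0, 75.0, 70.0, 66.0]
government = [70.0, 65.0, 71.0, 69.0, 66.0]
t_value, t_df = paired_t(business, government)
```

Because the reported statistic exceeds the stricter critical value, the same decision follows at both significance levels.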
Table 2 reviews the paired t-test results in accordance with the eleven usability components. At a glance, it can be seen from the table that seven tests out of eleven achieved significant results at different confidence levels. The difference in the visibility of the system status (UC1) between M-business and M-government software was found to be statistically significant at the 99% confidence level. The same was true of the match between the system and the real world (UC2). In addition, the difference in user control and freedom (UC3) between M-business and M-government software achieved statistical significance at the 95% confidence level. However, the variance in error prevention (UC4) between M-business and M-government software did not reach any significance level, and neither did the fifth usability component (UC5, minimize the user's memory load). In contrast, the variance in consistency and standards (UC6) between the two software categories was found to be statistically significant at the 99% confidence level. The same was true of customization and shortcuts (UC7). Moreover, the difference in aesthetic and minimalist design (UC8) between the two conditions achieved statistical significance at the 95% confidence level. By contrast, the variance in helping users to handle errors (UC9) between the two categories did not reach statistical significance. Similarly, the tenth usability component (UC10, help and documentation) revealed insignificant results. Finally, the variance in physical interaction and ergonomics (UC11) between the two conditions reached a significant result at the 99% confidence level. In summary, all aspects of the usability assessment achieved statistical significance, except for UC4, UC5, UC9 and UC10. This supports the rejection of sub-hypotheses H1(d), H1(e), H1(i) and H1(j), and the acceptance of sub-hypotheses H1(a), H1(b), H1(c), H1(f), H1(g), H1(h) and H1(k). Overall, the sub-hypotheses were partially accepted, as most of the associated tests (seven out of eleven) showed statistically significant results.
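The mapping from per-component test outcomes to sub-hypothesis decisions can be written out explicitly. The confidence levels below are those described in the text; the dictionary and the helper `sub_hypothesis` are illustrative names, not part of the study.

```python
# Confidence level (percent) at which each component reached significance,
# as described in the text; None means no significance was reached.
CONFIDENCE = {
    "UC1": 99, "UC2": 99, "UC3": 95, "UC4": None, "UC5": None, "UC6": 99,
    "UC7": 99, "UC8": 95, "UC9": None, "UC10": None, "UC11": 99,
}


def sub_hypothesis(component):
    """Map a component label such as 'UC3' to its sub-hypothesis 'H1(c)'."""
    index = int(component[2:])
    return f"H1({chr(ord('a') + index - 1)})"


accepted = [sub_hypothesis(uc) for uc, cl in CONFIDENCE.items() if cl is not None]
rejected = [sub_hypothesis(uc) for uc, cl in CONFIDENCE.items() if cl is None]
print(len(accepted), "accepted;", len(rejected), "rejected")
```

This reproduces the seven accepted and four rejected sub-hypotheses listed above.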

V. DISCUSSION
The implications of this study are of importance for both the theory and practice of M-computing in KSA. As for the implications for academia, it is important for researchers in the M-computing field to consider the gap between M-business and M-government usability practices. The study provides an insight into the status of mobile interaction design in KSA, with a particular focus on the difference between M-business and M-government software usability. Further research in this field remains important to highlight the critical success factors of mobile interaction design. Another contribution of this study is the validation of Nielsen's heuristics method in the context of the M-computing field. During the course of the experiment, it was noticed that the usability checklist, although developed for Windows-based software, fitted the M-computing context. However, the usability checklist needs further improvement to match mobile HCI requirements. It is important to develop a usability checklist for touch-based mobile software, particularly with regard to physical interaction and ergonomics. In terms of practical implications, managers in ICT departments should redouble their efforts to enhance mobile software, with a particular focus on the usability of the user interface. Experience gained from this study suggests that mobile software developers and interface designers should provide the necessary features to help users handle errors. Alternatively, they should design their interfaces and develop their software to prevent the occurrence of errors in the first place. This can be achieved by avoiding misplacement of control buttons and by implementing customization and shortcut features. It is also recommended to adhere to interface design standards in the mobile computing field and to follow mobile usability guidelines.
Although this study is useful for academics and professionals in the M-computing field, it encountered several limitations. First, the study examined mobile software usability in a domestic context, representing only one country. The idea could be extended beyond the boundary of KSA to cover the usability of mobile software in neighboring countries or further overseas. It is important to link the status of mobile software usability in KSA with usability practices at the regional and international levels. For example, a set of mobile software applications could be selected from regional or international key players in the M-computing field and evaluated, in order to compare the status of Saudi mobile interaction design against such benchmarks. Second, the M-computing field is rapidly evolving and its software applications change regularly. Therefore, replicating this study on a regular basis would improve the understanding of software usability and of its improvement and evolution. Finally, this study adopted a revised version of Nielsen's heuristics instrument [4], which was originally proposed to examine the usability of Windows-based software systems. This instrument is of a generic nature, whereas assessing the usability of M-computing software calls for a usability model developed specifically for M-computing research. It was nevertheless essential to rely on a well-known instrument to investigate and establish an initial understanding of the issue at hand, as recommended by Al-Khalifa (2010): heuristics evaluation is useful as an initial phase to uncover usability issues [16]. However, developing a context-specific model for M-computing software usability merits further investigation.

VI. CONCLUSION
This paper has provided an empirical study of the usability of mobile software applications in KSA, using the heuristics evaluation method. Based on prior research, eleven dimensions (n=11) were considered and a questionnaire was designed specifically for this study. The questionnaire items and constructs were derived from Nielsen's heuristics instrument and then adapted to fit the context of mobile environments. A sample of thirty-six (n=36) mobile software applications was chosen based on prior research, representing two main categories: M-business and M-government. A general hypothesis was developed that the usability of M-business software would outweigh that of M-government software, with eleven sub-hypotheses reflecting the usability components of Nielsen's method. Using the questionnaire, the sampled mobile software applications were evaluated by thirty-six (n=36) evaluators in a within-participants experimental design, whereby each participant evaluated all software applications. The order in which the mobile software applications were evaluated was counterbalanced among the evaluators in order to alleviate any possible bias. The results indicated that the usability designs of mobile software applications in KSA were of an acceptable quality. The heuristics evaluation yielded an average percentage score of 68.6% across all mobile software applications, indicating the extent to which usability practices were implemented in mobile presence. The scores for all usability components exceeded 60%, with five components below the average score and six components above it. The usability of M-business software was shown to be greater than that of M-government software and, therefore, the general hypothesis was accepted. On the other hand, four sub-tests related to the usability components (UC4, UC5, UC9 and UC10) failed to achieve statistical significance, and therefore the four corresponding sub-hypotheses were rejected.
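The component-level summary can be reproduced from the scores reported in the results section. The sketch below is illustrative: the variable names are hypothetical and the values are copied from the text. Note that the reported overall average of 68.6% is taken over all heuristics, so it need not equal the simple mean of the eleven component scores.

```python
# Component scores (percent) as reported in the results section.
COMPONENT_SCORES = {
    "UC1": 70.0, "UC2": 72.2, "UC3": 68.0, "UC4": 63.4, "UC5": 70.0, "UC6": 73.5,
    "UC7": 60.3, "UC8": 69.4, "UC9": 63.8, "UC10": 62.4, "UC11": 71.4,
}
OVERALL_AVERAGE = 68.6  # reported average over all heuristics

below = [uc for uc, s in COMPONENT_SCORES.items() if s < OVERALL_AVERAGE]
above = [uc for uc, s in COMPONENT_SCORES.items() if s > OVERALL_AVERAGE]
highest = max(COMPONENT_SCORES, key=COMPONENT_SCORES.get)
lowest = min(COMPONENT_SCORES, key=COMPONENT_SCORES.get)

print("below average:", below)
print("above average:", above)
print("highest:", highest, "lowest:", lowest)
```

The tally gives five components below the overall average and six above it, with every component exceeding 60%.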

Fig. 1. Percentage values of total score for the eleven usability components (UC) for mobile software applications
www.ijacsa.thesai.org

Fig. 2. Percentage values of usability scores for M-government and M-business software in accordance with the overall usability score and the eleven usability components

TABLE II. REVIEW OF PAIRED T-TEST RESULTS IN ACCORDANCE WITH THE 11 USABILITY COMPONENTS ALONG WITH THE CONFIDENCE LEVEL (CL)

TABLE III. PERCENTAGE VALUES OF THE OVERALL USABILITY SCORE AND THE RANK FOR EACH EVALUATED MOBILE APPLICATION