A Pilot Study of an Instrument to Assess Undergraduates' Computational Thinking Proficiency

The potential of computational thinking (CT) in problem solving has gained much attention in academic communities. This study aimed at developing and validating an instrument, called the Hi-ACT, to assess the CT ability of university undergraduates. The Hi-ACT evaluates both technical and soft skills applicable to CT-based problem solving. This paper reports a pilot study conducted to test and refine the initial Hi-ACT. A survey method was employed, through which a questionnaire comprising 155 items was piloted among 548 university undergraduates. Partial least squares structural equation modeling was applied to examine the Hi-ACT's reliability and validity. Composite reliability was used to assess internal consistency reliability, while convergent validity was evaluated based on items' outer loadings and constructs' average variance extracted. As a result, 41 items were excluded, and an instrument to assess CT ability comprising 114 items and ten constructs (abstraction, algorithmic thinking, decomposition, debugging, generalization, evaluation, problem solving, teamwork, communication, and spiritual intelligence) was developed. The reliability and validity of the Hi-ACT in its pilot form were verified.

Keywords—Computational thinking; assessment; skills; attitudes; undergraduates; self-assessment


I. INTRODUCTION
The ability to solve complex problems is in demand regardless of the field in which we work. Wing [1] introduced computational thinking (CT) as a problem-solving approach that uses the way computer scientists think. Her vision is that this set of CT skills and attitudes will benefit everyone, not only computer science majors. Further studies have reinforced that CT enables one to become a technology builder rather than a mere technology consumer [2] and develops logic, creativity, innovative thinking [3], and analytical skills [4]. The World Economic Forum [5] found these attributes to be increasingly in demand in digital-world workplaces.
Further, recognition of CT as an essential skill for all students is expanding rapidly. Accordingly, initiatives are underway to bring CT into educational institutions around the world. Among them, a number of recent studies focus on incorporating CT into classrooms and curricula [6], [7], some on creating artifacts with which to teach CT principles [8], [9], and others on assessment [10]-[12]. Other studies [13]-[16] highlighted teachers' conceptions of CT.
In this work, we focus on CT assessment at the undergraduate level. Some studies have initiated CT assessments for undergraduates. An instrument that tests the correlation between CT and critical thinking has been developed [17]. Specifically, this instrument assesses simple algorithms, sorting methods, file structure, and digital information storage. The author used multiple-choice and short-answer questions; however, the instrument has not been validated.
In another study [18], a test to identify the CT skills of first-year computer science students was developed. It was based on six classes of CT skills and practices defined in the 'Computational Thinking Framework,' i.e., models and abstractions, patterns and algorithms, processes and transformation, tools and resources, inference and logic, and evaluations and improvements.
A paper-based test, called 'The Testing Algorithmic and Application Skills,' is presented in [19]. This test measures algorithmic skills, the use of computer science terminology, and problem-solving abilities. In particular, it comprises questions related to students' computer usage habits, self-assessment of their knowledge of informatics, and tasks on traditional programming, number system calculation, file handling, word processing, and spreadsheet programming.
The aforementioned CT assessment studies mostly highlighted skills, and little has been done to include attitudes. In contrast, according to Wing [1], CT comprises both the skills and the attitudes necessary for solving problems. There is thus a need for an instrument that includes items and constructs to measure students' CT competency in terms of both skills and attitudes. Therefore, in light of Wing's original conception of CT as a set of skills and attitudes for solving problems, this work proposes an instrument, the "Holistic Assessment of Computational Thinking" (Hi-ACT), to test undergraduates' perceptions of their CT competency. We use the term 'holistic' to describe the inclusion of both skills and attitudes in the CT assessment framework. This paper reports a pilot study that was conducted to assess and refine the initial Hi-ACT by examining its reliability and validity. This is an extended version of the work published in [20].

II. COMPUTATIONAL THINKING
CT has been recognized as a major research field since the publication of Wing's remarkable article in 2006. However, several researchers have noted the long history of CT [21]-[24], as presented in Fig. 1. As early as 1945, George Polya emphasized the application of a disciplined manner, decomposition, and generalization (reusing common techniques) to solve everyday problems [24]. In 1962, Perlis proposed his vision that programming concepts would foster the ability to understand various topics outside computer science and become a vital part of education [21]. As noted by Denning [22], from the field of science, L.K. Wilson introduced 'computational science,' a computation-based approach to exploit existing knowledge and discover new knowledge. Thereafter, in 1980, Seymour Papert found that 'thinking like a computer' was a useful component of the thinking skills needed to teach mathematics to children [23]; programming symbols and representations were used in solving mathematical problems.
Further, in 2006, Wing introduced CT as a way of thinking to solve problems, design systems, and understand human behavior using computer science concepts. Several researchers [2], [25]-[27] revisited Wing's definition of CT to provide a more definite understanding of it and to identify its core principles. Denning [26] defined CT as a mental orientation to formulate problems through so-called 'conversion': algorithms are applied to convert some input into an output. Other studies described CT as a problem-solving process [2], whose essence is 'thinking like a computer scientist' [27]. Wing refined her early delineation of CT to be "the thought processes involved in formulating problems and their solutions that can be effectively carried out by an information-processing agent; a human or machine, or combinations of humans and machines" [28]. Aho [25] then simplified Wing's refinement by defining CT as the thought processes used to formulate solutions to problems, where the solutions are represented as computational steps and algorithms. Put simply, the core of CT is to approach a problem using computer scientists' way of thinking.
CT adopts some fundamental concepts of computer science as its skills [1]. There are varying views regarding CT skills (Table I). Along with the skills, attitudes are also required in CT-based problem solving [1]. Barr et al. [2] used the term 'dispositions' to describe the values, motivations, feelings, stereotypes, and attitudes appropriate to CT. It can therefore be said that, in CT, attitudes are indeed necessary for solving problems. Nevertheless, as shown in Table II, only a few works in the literature have considered attitudes.

III. RELATED WORKS
Korkmaz et al. [10] developed the 'Computational Thinking Scale' (CTS), an instrument comprising 29 five-point Likert-type items. The CTS, which was tested on undergraduate students, assesses algorithmic thinking, critical thinking, creativity, problem solving, and cooperation. In a more recent study, a scale was developed to assess high school students' computational thinking skills [11]. In the same way as [10], this study also developed the scale based on the ISTE (2015) definition of computational thinking skills. The scale takes five skills, i.e., problem solving, algorithmic thinking, critical thinking, cooperative learning, and creative thinking, as the initial factors. Following validity and reliability examinations, the resulting scale consists of 42 five-point Likert items categorized into four factors, i.e., problem solving, cooperative learning and critical thinking, creative thinking, and algorithmic thinking.
The aforementioned instruments evaluate some soft skills relevant to CT (creativity, problem solving, and cooperation); however, both fail to consider abstraction, decomposition, pattern recognition, and generalization. Yet abstraction is a basic tool of reasoning in CT [38]: it allows one to simplify large and complex problems [39]. Likewise, the ability to recognize patterns and generalize solutions is an invaluable skill for computer scientists [18], and decomposition is needed to break down a problem into smaller, simpler, and more manageable sub-problems [39]. In short, these skills are essential in CT.
This work proposes the Hi-ACT, a CT proficiency assessment instrument that takes both skills and attitudes into consideration. The constructs measured are elaborated in detail in the next section.

IV. DEFINING HI-ACT
There is still little unanimity on the definition of CT, as [18], [40] observed, and its underlying skills are still being debated and redefined. Nevertheless, to develop the Hi-ACT, we define CT as the thought process of formulating solutions to a problem that entails certain skills and attitudes. The term 'skills' refers to the computer science-based concepts used in CT; this work draws on [30], [32] to define the CT core skills, including abstraction, algorithmic thinking, decomposition, debugging, evaluation, and generalization. The term 'attitudes' refers to soft skills, i.e., personal skills that include attitudes, character traits, and behaviors [41]. In this work, the attitudes were drawn from the Computer Science Curricula 2013, i.e., problem solving, ambiguity tolerance, teamwork, communication, and personal attributes [42], and from the operational definition of CT attitudes previously stated by [2]. This collection of work on attitudes was synthesized into three categories: problem solving, teamwork, and communication.
Furthermore, this work suggests one additional element, spiritual intelligence, to be included among the attitudes (soft skills) of CT. The justification for this inclusion is that spiritual intelligence comprises a set of abilities that strengthen people's capacity to solve problems, achieve goals, and make decisions [43]. In this way, spiritual intelligence might benefit CT as a way of thinking about solving problems. Moreover, some attitudes that demonstrate spiritual intelligence, i.e., self-awareness, creative reasoning, integrity, and asking 'why' questions, have been found helpful when confronting challenging problems, including artificial intelligence problems [44]. Hence, including spiritual intelligence should be beneficial to the CT-based problem-solving process. The Hi-ACT constructs are thus defined as follows:
• Abstraction: the ability to simplify a problem by removing unnecessary details or information and then creating a representation of the solution.
• Algorithmic Thinking: the ability to think algorithmically, formulating the instructions (procedure) to solve a problem through logical thinking.
• Decomposition: the ability to simplify a problem by dividing it into smaller, simpler, and easier to manage sub-problems.
• Debugging: the ability to identify and remove errors in the designed solutions (the algorithm).
• Evaluation: the ability to assess a solution's correctness, performance, and resource usage, and the act of refining the solution to improve its quality.
• Generalization: the ability to identify similar patterns among problems and to generalize solutions from previous problems to similar ones.
• Problem solving: the character traits applicable to the problem-solving process, including self-confidence, persistence, ambiguity handling, and willingness to solve the problem.
• Teamwork: the ability to work in a team.
• Communication: the ability to exchange information and knowledge, by verbal and non-verbal means, among team members.
• Spiritual intelligence: the ability to use spiritual abilities, including self-awareness, integrity, and creative reasoning, to enhance an individual's personal character and facilitate the problem-solving process.

A. Hi-ACT Initial Instrument
The Hi-ACT was first designed with 172 seven-point Likert-scale candidate items. Each item addresses one of the sub-constructs presented in Table III; sub-constructs are categories defined among the candidate items to help ensure the items' convergent validity. The first version then underwent content validation through experts' judgment in a three-round Fuzzy Delphi study, as reported in [49]. As a result, the initial Hi-ACT, comprising 155 items, was ready for validity and reliability assessment.

B. Participants
The initial Hi-ACT was administered to a total sample of 713 undergraduate students from STEM and non-STEM majors of specialization. The participants were recruited from different departments (Computer Science, Economics, Social Sciences and Humanities, Design, Linguistics, Natural Sciences, Health, Engineering, Law, Medicine, and Education) at two universities located in two different cities in Indonesia and one university located in Malaysia. After removing surveys that had not been completely filled in, the final usable sample size was 548. Prior to data collection, the universities' approval was obtained. All participants were notified that their participation was voluntary, and anonymity and confidentiality were assured. The gender distribution was equal: 274 (50%) participants were male and 274 (50%) were female. Regarding major of specialization, 363 (66%) of the sample were registered in STEM-based majors.

C. Data Analysis
This work aimed to refine the initial Hi-ACT by examining its validity and reliability. To do so, partial least squares structural equation modeling (PLS-SEM) was chosen, for two reasons. First, factor analysis is a common statistical method for conceptualizing constructs when refining a new instrument [50], and exploratory factor analysis is specifically intended to refine a set of items in a new instrument. In that regard, as argued by Hair, Hult, Ringle, and Sarstedt [51], PLS-SEM is mainly used to develop theories in exploratory studies.
Second, PLS-SEM is suitable for complex models [52]. Based on the literature analyzed in this work and the results of the content validation, the Hi-ACT emerged as a multi-dimensional construct, i.e., the constructs and sub-constructs described in Table III. Thus, the Hi-ACT was modeled as a reflective-formative higher-order construct, as shown in Fig. 2. This model comprises 29 first-order constructs, i.e., the sub-constructs (AR, AC, ATPr, and so forth), and ten second-order constructs, i.e., the constructs (Abstraction, Algorithmic Thinking, Decomposition, and so forth). Finally, the second-order constructs are formative with respect to the overall Hi-ACT construct. Each of the 155 items in the initial Hi-ACT was modeled as a reflective indicator of one of the 29 first-order constructs.

To evaluate the instrument's validity and reliability, the first-order constructs were examined. Evaluating reflective first-order constructs involves the examination of reliability and construct (convergent) validity, which we deemed sufficient for analyzing the results of a pilot study of a new instrument. Following Hair, Hult, Ringle, and Sarstedt [51], the following analyses were conducted:
• Internal consistency reliability, as an evaluation of reliability, was assessed using composite reliability (CR). CR values are desirable within the range of 0.7 to 0.9 and should not exceed 0.9.
• Convergent validity was assessed based on two criteria: items' outer loadings and constructs' average variance extracted (AVE). An item's outer loading should be ≥ 0.7 and a construct's AVE should be ≥ 0.5. When the AVE does not meet the required threshold, the item with the smallest loading should be removed.

Table IV presents the internal consistency and convergent validity results. In the first run, the constructs' CR values ranged from 0.7 to 0.93, meeting the threshold of 0.7. However, the CR values of the TCp, SII, and SIC constructs were higher than 0.9. A CR value above 0.9 indicates that all the indicators are measuring the same phenomenon, which is not a valid measure of the construct and is therefore undesirable [51]. The second test was convergent validity. The loadings of all items ranged from 0.55 to 0.85, while the AVE values ranged from 0.43 to 0.66. The AVEs of the AR, AC, ATS, ATR, DD, COM, SIS, SII, and SIC constructs fell short of the threshold value of 0.5, indicating that the conditions of convergent validity were not met. Accordingly, items with low loadings were eliminated.
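To make these criteria concrete, the following minimal sketch illustrates how CR and AVE are typically computed from a construct's standardized outer loadings; the loading values shown are hypothetical and are not taken from the Hi-ACT data.

```python
# Minimal illustration (hypothetical loadings, not Hi-ACT data) of how
# composite reliability (CR) and average variance extracted (AVE) are
# typically computed for a reflective construct, following the
# thresholds described above.

def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    sum_l = sum(loadings)
    error_var = sum(1 - l ** 2 for l in loadings)  # error variance per item: 1 - lambda^2
    return sum_l ** 2 / (sum_l ** 2 + error_var)

def average_variance_extracted(loadings):
    """AVE = mean of squared loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)

# Hypothetical standardized outer loadings for one sub-construct.
loadings = [0.82, 0.78, 0.71, 0.55]

cr = composite_reliability(loadings)
ave = average_variance_extracted(loadings)
print(f"CR  = {cr:.2f} (desirable range: 0.70-0.90)")
print(f"AVE = {ave:.2f} (threshold: >= 0.50)")
```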

VI. RESULT
In summary, the initial instrument was refined by removing 38 items to improve particular constructs' AVE values and three items to reduce the CR value of the TCp construct to 0.9; hence, a total of 41 items were eliminated. This increased the AVEs while keeping the CRs within the threshold (Table V), thereby satisfying the internal consistency and convergent validity conditions.
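The item-elimination rule underlying this refinement can be summarized as in the sketch below. The item labels and loadings are hypothetical (not the actual Hi-ACT estimates), and the sketch omits the re-estimation of loadings that PLS-SEM performs after each removal, so it is only an illustration of the decision rule.

```python
# Sketch of the item-elimination rule used during refinement: while a
# construct's AVE is below 0.50, drop the item with the lowest outer
# loading. Loadings are hypothetical; in practice the PLS model is
# re-estimated after every removal, so loadings change between runs.

def refine_construct(items, ave_threshold=0.50):
    """items: dict mapping item name -> standardized outer loading."""
    items = dict(items)
    removed = []
    while items:
        ave = sum(l ** 2 for l in items.values()) / len(items)
        if ave >= ave_threshold:
            break
        worst = min(items, key=items.get)  # item with the smallest loading
        removed.append(worst)
        del items[worst]
    return items, removed

# Hypothetical sub-construct with one weak item.
retained, removed = refine_construct(
    {"AR1": 0.81, "AR2": 0.76, "AR3": 0.62, "AR4": 0.55}
)
print("Retained:", sorted(retained))
print("Removed:", removed)
```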

VII. DISCUSSION
This pilot study yielded preliminary evidence of the Hi-ACT's psychometric properties, a scale aimed at assessing undergraduates' CT ability more comprehensively by incorporating both skills and attitudes. A total of ten constructs and 114 items were retained for the Hi-ACT. Within this frame, the factor loadings for all items ranged from 0.68 to 0.86. These values indicate that the items of each specific construct have much in common and contribute to measuring their associated sub-construct. Convergent validity at the construct level (AVE) was confirmed, with values ranging from 0.5 to 0.66, satisfying the required threshold of 0.5. Internal consistency reliability was maintained in an acceptable range: within the range from 0.67 to 0.9, CR values exceeding 0.7 were obtained for most of the sub-constructs, indicating that high internal consistency was achieved.
Six sub-constructs, i.e., ATS, ATR, TCM, COM, SII, and SIC, have items with factor loadings below the threshold value of 0.7. Low factor loadings can contribute to low CR and AVE. In particular, for the ATS and ATR sub-constructs, the CR values (0.67) were slightly lower than those of the other sub-constructs. A CR value of 0.67 indicates that the items share only about 45% common variance (0.67² ≈ 0.45), which implies that the items in each construct are somewhat weak in measuring it. One possible reason is that these two sub-constructs have very few items compared to the others: each has two items, one of which has a factor loading below 0.7, i.e., ATC3 (0.68) and ATR4 (0.69), leading to slightly low item reliability. Nevertheless, a CR value above 0.6 is considered acceptable in an exploratory study [51]. Moreover, the AVE of both sub-constructs reached the value of 0.5, indicating that, on average, each sub-construct accounts for at least 50% of the variance of its items. Thus, the validity of the items and the sub-constructs is supported. The COM sub-construct also holds two items with factor loadings lower than 0.7, i.e., COM-3 (0.68) and COM-5 (0.67). However, removing either of these items lowered the sub-construct's convergent validity (decreasing its AVE), and COM has otherwise strong statistics, i.e., a CR value of 0.83. Accordingly, the items were retained. For the same reason, the items with factor loadings below 0.7 in the TCM, SII, and SIC sub-constructs were retained.

VIII. CONCLUSION
The Hi-ACT, which evaluates undergraduates' perceptions of their CT competency, was developed. A pilot study was carried out to refine the initial instrument. Based on the responses of 548 university undergraduates to 155 items, an instrument comprising 114 items was established. The findings of the statistical tests of internal consistency and convergent validity reveal that the Hi-ACT in its pilot form is valid and reliable for measuring university undergraduates' CT competency. In future studies, we plan to proceed with further instrument evaluation to provide additional evidence of construct validity and discriminant validity.
Furthermore, the Hi-ACT makes a notable contribution to the CT literature. It extends CT assessment research by verifying ten primary constructs and 29 sub-constructs that delineate the skills and attitudes applicable to the CT-based problem-solving process. These CT concepts were not comprehensively addressed in most previous CT assessment studies. Accordingly, the findings of this work bring comprehensiveness to CT theoretical work, specifically in the undergraduate context. The work also yields a set of indicators useful for measuring CT competency holistically.