Analysis of Web Content Quality Factors for Massive Open Online Course using the Rasch Model

The lack of understanding among content providers of MOOC quality has motivated the development of several MOOC quality models. However, none has focused on web content from the perspective of content providers or experts, despite the fact that their views are important, particularly in the development phase. MOOC learners and instructors certainly understand the functional external quality, but content providers have a better understanding of the internal qualities required during the development phase. An initial quality model for MOOC web content, based on the 7C's of Learning Design and the PDCA model for continuity, has been proposed, consisting of nine categories and 54 factors. This research focuses on the validation of the proposed model by content providers and experts to provide systematic evidence of construct validity. This involved two main processes: a content validity test and a survey on acceptability. The content validity test was conducted to confirm the agreeability of the proposed categories and factors among respondents. The dichotomous Rasch model was utilized to explain the conditional probability of a binary outcome, given the person's agreeability level and the item's endorsability level. Subsequently, the survey on acceptability was conducted to obtain confirmation and verification from the expert group pertaining to the MOOC web content quality factors. The Rasch Rating Scale model was used since it specifies the set of items that share the same rating scale structure. The usage of the Rasch Model in instrument development generally eases variable measurement by converting nonlinear raw data to a linear scale, while assisting researchers in tackling fitness validation and other instrumentation issues such as person reliability and unidimensionality.
This paper demonstrates the strengths of applying the Rasch Model in construct validation and instrument building, which provides a strong foundation for the model's adaptation as a methodological tool.

Keywords—Web content; quality model; hierarchical model; Rasch Model; rating scale; survey reliability


I. INTRODUCTION
Widespread acceptance among instructors and learners since its introduction in 2008 has not prevented the Massive Open Online Course (MOOC) from receiving a number of criticisms of its implementation. Some of the major issues pertain to its web content weaknesses, despite the importance of web content in maintaining learner engagement, as supported by [1]. To overcome this, previous research by [2] proposed a web content quality model for MOOC from the perspective of content providers, which takes into consideration the aspects of external and internal quality. The model intends to direct the understanding of content providers towards the right facets of producing quality web content for MOOC.
The quality factors form an instrument that needs to be empirically verified to ensure its reliability and usefulness in a real-world environment. This leads to the main objective of this research, which is to validate the proposed web content quality model for MOOC by [2], and its definitions, from the perspective of content providers and MOOC experts. To achieve this objective, two tests were conducted: a content validity test and a survey on acceptability. These tests are meant to fulfill the six criteria of construct validity proposed by [3], which are content, substantive, structural, generalizability, external and consequential.
The Rasch Model was implemented due to its capability to assess construct validity by transforming ordinal data into a linear score, before it is evaluated through the use of parametric statistical tests, as proven by [4]. This analysis method also enables researchers to make critical corrections to raw test scores through fitness validation. It has been utilized in a number of instrument validations, such as the blog quality model by [5] and customer satisfaction for service quality by [6]. Moreover, the web content quality model is developed within the hierarchical factor-criteria-metrics (FCM) framework, similar to several hierarchical models such as McCall, Boehm and ISO/IEC 9126. Therefore, it is important to ensure that all quality items have a single dimension towards the model objective, known as unidimensionality. The Rasch Model was used to ensure unidimensionality compliance through the Category Probability Curves and Principal Component Analysis functions [7]. Its adaptation, along with data fitness validation and the probability of an item being accepted, is explained in this research.
The rest of this paper is organized as follows: Section II briefly describes the development of the Web Content Quality Model for MOOC and the Rasch Model method; Section III explains how the content validity test and survey on acceptability were conducted; Section IV presents the results and discussion; and finally, Section V presents the conclusions.

A. The Development of Web Content Quality Model for MOOC
The initial web content quality model for MOOC, intended as a reference for content providers, was proposed by [2] as depicted in Fig. 1. Its development began with the determination of quality factors through content analysis, which involved three activities: (i) reviewing existing and possible quality factors from online libraries for the period 2010 to 2018; (ii) combining the sets of factors to cross-check for redundancies, as applied by [8]; and (iii) assigning the factors to their respective categories. The content analysis yielded 54 quality factors, which were assigned to nine categories modified and customized from the 7C's model. The author in [9] points out that quality evaluation by untrained or end users is questionable and not comprehensive. Therefore, instrument validation from the perspective of content providers and MOOC experts was applied in this research to secure the validity of the proposed quality factors, as acknowledged by a number of researchers such as [10] and [11]. Content validity is an important procedure in scale development, referring to the degree to which an instrument has an appropriate sample of items for the constructs being measured [12]. This test is also a non-statistical type of validity that involves systematic examination of the survey content to determine whether it covers a representative sample of the behavior domain to be measured. Its main objective is to ensure that the instruments represent all facets of the given constructs, as well as providing a solid basis for rigorous validation evaluation [13].
The survey on acceptability measures the level of acceptability among respondents of the proposed categories and factors of a model, based on the steps proposed by [13]: survey planning, resource availability, survey design, data collection planning and selection of participants. The survey can be executed through a structured standardized interview that follows a predetermined, specific questionnaire. This data collection methodology has been applied in a number of studies, such as [5] and [14], to validate newly developed models.

B. The Rasch Model
The Rasch measurement model, introduced by the Danish mathematician Georg Rasch in 1960, is a psychometric technique that helps researchers improve the precision of constructed instruments, monitor instrument quality and compute respondents' performances [15]. It creates measurements from categorical data, such as questionnaire responses, as a function of the trade-off between the respondent's ability and the item's difficulty [16]. The Rasch model also enables researchers to make critical corrections when using raw test scores or survey data. The mathematical theory underlying the Rasch models is a special case of item response theory and the generalized linear model. The Rasch Dichotomous Measurement Model is a probabilistic model which considers two aspects: (i) the difficulty of the item and (ii) the ability of the respondent to verify the item. The model explains the conditional probability of a binary outcome (in this research, agree or disagree), given the person's agreeability and the item's endorsability level. It is based on the logic that all respondents have a higher probability of agreeing with easier items and a lower probability of agreeing with difficult items. This is expressed mathematically as:

P_ni(x = 1) = e^(Bn − Di) / (1 + e^(Bn − Di))    (1)

where P_ni refers to the probability of agreement of person n towards item i, Bn is the ability of person n, and Di is the difficulty of item i. Thus, in the case of a dichotomous attainment item, the log odds, or logit, of a correct response by a person to an item is equal to Bn − Di. The transformation to logits is needed in order to obtain a linear interval scale [4]. Given two examinees with different ability parameters B1 and B2 and an arbitrary item with difficulty Di, the difference in logits for these two examinees is (B1 − Di) − (B2 − Di), which reduces to B1 − B2. The logistic function in the equation thus allows estimates of Bn and Di to be made independently of each other: the estimates of Bn are independent of the effect of Di, and the estimates of Di are independent of the effect of Bn.
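As an illustration only (the study itself used the Bond&FoxSteps application), the dichotomous model above can be sketched in a few lines of Python. The function names are hypothetical; the formula is the standard Rasch logistic function from (1).

```python
import math

def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability that a person with ability Bn (in logits) agrees with
    an item of difficulty Di (in logits), per the dichotomous Rasch model."""
    logit = ability - difficulty
    return math.exp(logit) / (1.0 + math.exp(logit))

# When ability equals difficulty, agreement is a coin flip.
p_equal = rasch_probability(1.0, 1.0)   # 0.5

# The logit difference between two persons on the same item cancels the
# item difficulty, illustrating parameter separation.
b1, b2, d = 2.0, 0.5, 1.0
diff = (b1 - d) - (b2 - d)              # reduces to b1 - b2 = 1.5
```

A person one logit more able than an item is difficult has roughly a 73% chance of agreement, which is the intuition behind reading agreement levels off the logit scale.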
The separation between these two parameters provides a simple yet powerful model for assessing survey responses, making it possible to obtain a linear scale and generalized measurement [7]. The constant e refers to the base of the natural logarithm (approximately 2.7183), raised to the difference between the person's ability and the item's difficulty. A direct comparison between a person's ability and an item's difficulty can be expressed mathematically as:

ln(P_ni / (1 − P_ni)) = Bn − Di

That is, the probability of success on any item, given the person's ability and the item's difficulty, is divided by the probability of failure, and the natural log of the resulting expression provides the comparison [7]. This implies that persons and items can be compared directly, as the characteristics of both have been separated. This unique property is called parameter separation [16].
The Rasch Rating Scale Model is an extension of the Rasch Dichotomous Model. It derives from the concept of thresholds: an item is modelled as having three thresholds if it contains four response choices. Every threshold k has its own difficulty estimate Fk, modelled as the point at which a person has an equal probability of choosing one category over the adjacent one. For example, the first threshold is modelled through the probability of choosing a response of "2" (disagree) over a response of "1" (strongly disagree), estimated using the following formula:

ln(P_ni1 / P_ni0) = Bn − Di − F1

where P_ni1 is the probability of person n choosing "Disagree" (Category 2) over "Strongly Disagree" (Category 1) on any item i. In this equation, F1 is the difficulty of the first threshold, and this difficulty calibration is estimated only once for this threshold across the entire set of items in the rating scale.
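The threshold formula above can be sketched in the same way as the dichotomous case. This is a minimal illustration, not the study's software; the function name is hypothetical and the formula is the standard rating scale threshold form Bn − Di − Fk.

```python
import math

def threshold_probability(ability: float, difficulty: float,
                          threshold: float) -> float:
    """Probability of choosing category k over category k-1, given person
    ability Bn, item difficulty Di and threshold difficulty Fk."""
    logit = ability - difficulty - threshold
    return math.exp(logit) / (1.0 + math.exp(logit))

# A person whose ability equals the item difficulty plus the first
# threshold is indifferent between "disagree" and "strongly disagree".
p_indifferent = threshold_probability(1.2, 0.7, 0.5)   # 0.5
```

Note that the same Fk is shared by every item in the scale, which is exactly what distinguishes the Rating Scale Model from the Partial Credit Model, where each item has its own thresholds.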

A. Content Validity Test
The content validity test was conducted to confirm whether the content is agreeable to the respondents, hence providing empirical evidence for the content aspect of construct validity. Reference [3] explained that besides content, the consequential, substantive, structural, external and generalizability aspects also contribute to construct validity. In this research, the content validity test was conducted through a web-based online survey in order to ease data gathering, increase the response rate, minimize cost and automate data input, as supported by [17]. Google Forms was utilized as the survey instrument based on its advantages, such as high reachability, free availability, ease of use and automatic response data input [18].
Participants were invited via communication tools such as email, Facebook, Twitter and MOOC platforms to complete the online survey. They were selected openly through profiling processes with the assistance of MOOC communities such as the Malaysia E-Learning Higher Learning Institution Coordinator (MEIPTA) and The Australasian Council on Open, Distance and e-Learning (ACODE). Fit respondents were also selected from professional sites such as LinkedIn, from among the authors of papers used in the literature review, and from experts at related conferences or workshops. The respondents' resumes and experience were examined through the profiles available on their websites to gauge their knowledge of and expertise in MOOC.
The Rasch Dichotomous Measurement Model was adopted as the analytical method to explain the conditional probability of the binary outcome, which is agree or disagree. The questionnaire data was set up in a free Rasch analysis application called Bond&FoxSteps, which is a customized version of the proprietary Winsteps®. Through this application, analyses of reliability, person separation and principal components were carried out. Measurement of the acceptance level for items and persons was made through the Wright Map, while measurement of the scale was executed through the Rating (Partial Credit) Scale.

B. Survey on Acceptability
The survey on acceptability was conducted after the content validity test to obtain confirmation and verification from the content providers and experts concerning the web content quality factors for MOOC. The survey was executed through structured standardised interviews in order to obtain optimum results. The content providers and experts were selected mostly from higher learning institutions and MOOC platform developers. The survey consists of close-ended and open-ended questions to gain a variety of recommendations and comments.
Before the implementation of the survey on acceptability, pretesting was conducted on the redesigned questionnaire to assess its clarity, readability and understandability to participants. This process involved four field experts: a statistician, a MOOC expert, a language expert and a web design expert. Once all of them were satisfied, the reviewed questionnaire was distributed to 49 MOOC experts and content providers. The questionnaire comprised two parts: (i) Part I: the respondent states their gender, age and occupation; (ii) Part II: the respondent indicates the extent to which they agree or disagree with the proposed MOOC quality content on a scale of 1 to 5 (1 = strongly disagree, 5 = strongly agree). An open question was also included to draw further recommendations and comments.
Similar to the content validity test, the survey analysis was executed through the Bond&FoxSteps application. Data was tabulated and analysed using the Rasch Rating Scale Model, given that the survey deals with multiple response category items. The Rasch Rating Scale Model can deal with a sample size as small as 50 and still provide useful and reliable estimates for item calibrations, at a 99% confidence interval or within ±1 logit [16]. Fig. 2 depicts the summary statistics for the sample of 59 persons on the 60 dichotomous scale items, comprising 9 categories and 51 quality factors. The mean of the person measures is 2.94 (SE .63), which is higher than the 0 calibration of the item scale, indicating that the majority of respondents found this questionnaire relatively understandable. The summary statistics for items and persons imply a satisfactory fit to the model. The person reliability, which is higher than .67 (at the 95% confidence level), means the test discriminates the sample into sufficient levels, indicating that the instrument for measuring content validity is reliable for measurement purposes. The item reliability of .52 (at the 95% confidence level) has no traditional equivalent and can be ignored for this purpose.

A. Content Validity Test
The Wright Map in Fig. 3 shows the distribution of persons on the left and item agreement on the right, represented by category ID and factor ID. The agreeability levels of persons are clearly shown on the map: the most agreeable items, namely C1 (Conceptual), C1F01 (Relevance), C9F01 (Consumable) and C9F02 (Continuous Improvement), are located at −2.90 logits (SE 1.84). In contrast, the least agreeable item, C4F02 (Instructor-Centred), is located at the top of the item distribution at +2.46 logits (SE 0.35). The mean of the person distribution, µ_person = +2.94 logits, is higher than the mean of the item distribution, µ_item = 0.00 logits, indicating that most of the respondents involved in the content validity test tended to agree with the proposed categories and assigned factor definitions. The probability of a person's agreement with the identified categories and factors was calculated using (1). With a mean of 2.94, respondents generally indicate their level of agreement at 94.97%, which is above the 70% threshold limit of Cronbach's Alpha, as shown in the following calculation:

P = e^2.94 / (1 + e^2.94) ≈ 94.97%

Fig. 4 shows the item statistics detailing the location of all items in the Wright Map; the top-most and bottom-most items in both are equivalent. The fit statistics indicate that persons fully agreed with four estimated items, namely C1, C1F01, C9F01 and C9F02. These items were retained in this analysis as they did not influence the measurement. In the context of Rasch analysis, infit and outfit determine the fitness of the model and indicate whether an item needs to be deleted, rescored or reworded. An item whose infit/outfit mean square (MNSQ) value falls outside the range of 0.6 to 1.4, or whose infit/outfit ZSTD value falls outside −2.0 to +2.0, behaved more erratically than expected. The analysis performed on the Outfit MNSQ and Outfit ZSTD columns reveals that all items adequately fit the model except C4F02 (Instructor-Centred) and C9F03 (Traceable). The ZSTD of C4F02 is 2.8 and that of C9F03 is 2.5, so both are considered misfits.
Crosschecking against the Guttman Scalogram shown in Fig. 5 indicates that both misfit items, Instructor-Centred and Traceable, were underrated by several persons. For example, the person with ID F02 disagrees with Instructor-Centred, while most of the persons at the top agree. The case is similar for the persons with IDs F05 and F06, who disagree with Traceable while the response patterns of the other persons agree with it. This may be due to carelessness by these persons in attempting their responses. However, after verifying that their infit MNSQ values of 1.48 and 1.39 are within the productive range of 0.4 to 1.6, the two misfits were validated. This criterion-referenced interpretation of measures supports the technical quality of the content aspect of construct validity.
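The fit-screening rule described above is mechanical enough to sketch directly. The helper below is illustrative only (not from the study's toolchain); it encodes the content validity test's ranges of 0.6–1.4 for MNSQ and ±2.0 for ZSTD, both of which are assumptions taken from the text.

```python
def is_misfit(mnsq: float, zstd: float,
              mnsq_range=(0.6, 1.4), zstd_range=(-2.0, 2.0)) -> bool:
    """Flag an item whose fit statistics fall outside the given ranges."""
    lo, hi = mnsq_range
    zlo, zhi = zstd_range
    return not (lo <= mnsq <= hi) or not (zlo <= zstd <= zhi)

# The two items flagged in the content validity test (reported values):
flag_c4f02 = is_misfit(1.48, 2.8)   # Instructor-Centred -> True
flag_c9f03 = is_misfit(1.39, 2.5)   # Traceable -> True
```

An item can be flagged by either statistic alone; here C9F03's MNSQ of 1.39 is inside the range, so it is the ZSTD of 2.5 that triggers the flag.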
As stated in the content validity test objectives, two different aspects were analyzed: (i) the definitions of categories and factors, and (ii) the assignment of factors to their respective categories. The probability for both aspects was calculated based on the logit measures. This also determines whether revision is required based on respondents' views from the open-ended question. Formula (1) was used to measure the probability for each category. A threshold of 70% was set, in line with the standard threshold limit of Cronbach's Alpha [4]. It was then interpreted as follows: a) Definitions of categories and factors with a probability of agreement greater than or equal to 70% will be accepted without any revision.
b) Definitions of categories and factors with a probability of agreement below 70% will be reviewed if related comments are provided by the respondents. The categories will subsequently be redefined, whereas the factors will be discarded or amended as applicable.
For example, for the category C01 Conceptual, where the person measure is 2.94 and the item measure is −2.90, the probability is calculated as follows:

P = e^(2.94 − (−2.90)) / (1 + e^(2.94 − (−2.90))) = e^5.84 / (1 + e^5.84)

The probability of agreement for item C01 Conceptual is 99.6%, which is higher than the set threshold of 70%. Therefore, Conceptual is accepted as one of the categories that form the web content quality model for MOOC. The results for the rest of the categories, along with Conceptual, are presented in Table I. It can be concluded that all nine proposed categories and their definitions were agreed upon by the respondents, with probabilities between 91.8% and 99.6%.
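The worked calculation above is easy to reproduce. The sketch below (illustrative, with a hypothetical function name) plugs the reported logit measures into (1); the results match the paper's figures up to rounding of the reported measures.

```python
import math

def agreement_probability(person_logit: float, item_logit: float = 0.0) -> float:
    """Rasch probability of agreement given person and item measures in logits."""
    logit = person_logit - item_logit
    return math.exp(logit) / (1.0 + math.exp(logit))

# Person mean of +2.94 against the 0-calibrated item scale: ~0.9498,
# consistent with the reported 94.97% up to rounding.
overall = agreement_probability(2.94)

# Category C01 Conceptual at -2.90 logits: ~0.997, consistent with the
# reported 99.6% given rounding of the logit measures.
c01 = agreement_probability(2.94, -2.90)
```

The same function applied per item reproduces the acceptance probabilities in Tables I and II, since each table entry is just (1) evaluated at the person mean and the item's logit measure.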
The findings on the assignment of factors to their respective categories are shown in Table II. It can be seen that 50 factors (with a probability of agreement above 70%) remain in their respective categories. Based on these findings, only one factor, Instructor-Centred from the Video Quality category, has a probability of acceptance lower than 70%, namely 61.77%. Therefore, its definition needed to be reviewed. Table III shows the comments from respondents related to this factor. Based on these comments, the Instructor-Centred factor was removed from the Video Quality category. The survey also included an open-ended question for every category through which respondents could propose other factors contributing to the quality of MOOC web content. The proposed factors were accepted and justified based on their relevance and suitability, as shown in Table IV. After rigorous study, only one proposed factor was justified as relevant enough to be considered a web content quality factor for MOOC. The revised initial quality model now consists of 9 categories and 52 factors, with Instructor-Centred removed and Sound Clarity and Light added.

B. Survey on Acceptability
The survey on acceptability was conducted to measure the level of acceptability among content providers and experts towards the content-validated quality model. Fig. 6 depicts the summary statistics of 47 responses to the 52 web content quality factors. The person mean of +2.79 (SE .27) indicates that the majority of respondents found the questionnaire relatively understandable and that their selections were made correctly. This also means that they tend to accept all the proposed factors. The valid response rate of 99.9% indicates that almost all of the selected respondents are reliable, understand the field, and produced no extreme values. The person reliability (the Rasch equivalent of Cronbach's Alpha) is 0.96, indicating high internal consistency of responses, such that the same result can be expected if the same test is repeated.
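Person reliability and the separation index reported in the summary statistics are algebraically linked, which makes the numbers easy to cross-check. A minimal sketch (illustrative function names; the relation R = G²/(1+G²) is standard in Rasch measurement):

```python
import math

def reliability_from_separation(sep: float) -> float:
    """Rasch reliability implied by a separation index G: R = G^2 / (1 + G^2)."""
    return sep ** 2 / (1.0 + sep ** 2)

def separation_from_reliability(rel: float) -> float:
    """Inverse relation: G = sqrt(R / (1 - R))."""
    return math.sqrt(rel / (1.0 - rel))

# The person separation of 4.69 reported for this survey implies a
# reliability of ~0.96, matching the reported person reliability.
implied_reliability = round(reliability_from_separation(4.69), 2)
```

This cross-check is useful when a report quotes only one of the two statistics, since either can be recovered from the other.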
Item reliability of 0.82 indicates the adequacy of the items to measure what needs to be measured, as shown in Fig. 7. The high quality of the items results in a large person separation value (4.69), as evidenced by this summary. This means the instrument is able to separate the persons who choose "Strongly Agree" from those who choose "Strongly Disagree", which provides evidence for the external aspect of construct validity as explained by [21]. The mean square fit (Infit and Outfit MNSQ) and the z statistics (Infit ZSTD and Outfit ZSTD) for items and persons are close to their expected values, +1 and 0 respectively. This shows a satisfactory fit to the model.

Fragments of Table IV, which lists the factors proposed by respondents together with their justifications, are reproduced below:

- … cited by [19] as one of the success factors of MOOC web content; hence, this proposed factor is accepted.
- Report Analysis (proposed under Usability): able to recall and report certain segments, etc. A quality factor named Analyzable already exists in the Engagement category; hence, this proposed factor is rejected.
- Light: does not consume a lot of resources on mobile devices and computers. There is a quality factor named Segmented in the Video Quality category; however, since lightweight features have been much highlighted, such as in [20], this factor is accepted and added to Maintainability.

The Wright Map shows the distribution of respondents on the left and item agreement on the right. The item mean is significantly below the person mean; in fact, almost all items are located below all persons. This substantially indicates that the majority of respondents understand and tend to agree with the proposed items and factors.
The Wright Map item positioning is simplified by the Item Measure Table shown in Fig. 9. The table lists all logit measurement information for each item, including the mean square (MNSQ), ZSTD value and Point Measure Correlation (PMC). In line with the Wright Map, the easiest item to accept, C5F03 (Understandable), is at the bottom, while the most difficult item to accept, C3F03 (Translatable), is at the top. The two items are located at −1.05 and +1.13 logits respectively. The fit of each item is evaluated by MNSQ, which theoretically indicates the accuracy and predictability of the data. The expected value for MNSQ is 1.0; values less than 1.0 indicate that the observations are too predictable, while values greater than 1.0 indicate unpredictability.
According to [16], the acceptable range of Infit and Outfit MNSQ considered productive for measurement is between 0.4 and 1.6, while the acceptable range for Infit and Outfit ZSTD is between −2.0 and 2.0. According to this scale, three items were identified as misfits, namely C8F04 (Backup Ready) for Infit, along with C2F01 (Multi-Platform) and C4F01 (Segmented) for Outfit. All the misfits also caused the ZSTD values to fall outside the reasonable predictability range. The point measure correlation is acceptable, as every item's PMC value is greater than zero, indicating that all response-level scoring makes sense.
The reevaluation of the three misfit items started with C8F04 (Backup Ready). The Infit MNSQ rating for this item is 1.61, a value just over the upper bound of the productive range, 1.6. Therefore, there is a high probability that some agreeable person was careless in responding to the item. This prediction is strengthened by its high ZSTD value of +2.4. The other two misfit items, C2F01 (Multi-Platform) and C4F01 (Segmented), indicated by excessive outfit values, may be affected by imputed responses, lucky guesses or careless mistakes. The Guttman Scalogram was consulted to detail the misfits. Reference [16] suggests that any suspected responses can be replaced with missing or blank values before examining the impact of the changed result on the measures. The crosschecking process on the Guttman Scalogram showed that C8F04 (Item 47) was overrated by person A16, while C4F01 (Item 21) and C2F01 (Item 9) were overrated by persons A37 and A28 respectively. Therefore, all suspected responses in the dataset were replaced with missing values as suggested.
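The replacement step described above amounts to blanking specific person–item cells before re-running the analysis. A minimal sketch with hypothetical data (only the person and item IDs are taken from the text; the response values are invented for illustration):

```python
# Hypothetical response matrix: (person ID, item ID) -> rating.
# None marks a missing response, which Rasch software treats as absent data.
responses = {
    ("A16", "C8F04"): 5,   # flagged as overrated in the Guttman Scalogram
    ("A37", "C4F01"): 5,   # flagged
    ("A28", "C2F01"): 5,   # flagged
    ("A01", "C8F04"): 4,   # ordinary response, kept as-is
}

# Person/item pairs identified as suspect responses.
suspected = {("A16", "C8F04"), ("A37", "C4F01"), ("A28", "C2F01")}

# Replace every suspect response with a missing value, per [16].
cleaned = {key: (None if key in suspected else value)
           for key, value in responses.items()}
```

After this pass the cleaned matrix is re-estimated; only the flagged cells change, which is why the other items' measures are not distorted in the retest.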
After performing the replacement of suspected responses, the dataset was retested and the result is illustrated in Fig. 10. Items that were classified as misfits in the first test became fit to the model without distorting the results of other items. For instance, the Infit MNSQ value of C8F04 (Backup Ready) was adjusted from 1.61 to 1.48, resulting in a decrease of the ZSTD value from 2.4 to 2.0, putting it within the reasonable predictability range. The MNSQ values of C4F01 (Segmented) and C2F01 (Multi-Platform) moved into the acceptable range due to the replacement process. In contrast, the ZSTD value of item C4F03 is still over the acceptable range (−2.0 to 2.0), at 2.2. However, as its Infit MNSQ value is within range, the item was validated. The survey also contained open-ended questions about other factors needed to determine the quality of MOOC web content, but no significant comments were provided by the respondents. The probability of the factors being accepted by the respondents was calculated based on the logit values in the Item Measure Table. The results shown in Table V indicate that the probability of all factors being accepted by respondents exceeds, on average, the 70% threshold adopted from Cronbach's Alpha. This means that all factors are significantly acceptable for determining the web content quality for MOOC.
Rasch analysis was also utilized to determine the validity of the rating scale by zero-setting and calibrating the scale, as presented in Fig. 11. It also determines whether the probability of the response distribution is equal between the specified scales (equal interval). The increase in the observed average values indicates a normal response pattern, as depicted in the category probability curves of Fig. 12. Structure Calibration in turn addresses the problem of elasticity of the gaps between the Likert scale thresholds; in this analysis, all threshold advance values were shown to be within the range 1.4 < s < 5.0. Fig. 11 also shows that the person and item data fitness is manageable, as the Infit and Outfit MNSQ values are all in the productive range, except for scale 1 (Outfit MNSQ 1.87). However, this is also validated since the value is not degrading, as agreed by [16]. Rasch sets the minimum value of raw variance explained by measure at 40% as a benchmark to ensure unidimensionality in this model [22]. As shown in Fig. 13, the model's raw variance explained by measure is 60.7%, indicating that it has a good unidimensionality feature. The value of unexplained variance in the 1st contrast indicates that there is a slight disruption to the items, known as noise. However, the percentage is very low at 4.9%, compared to the maximum controlled value of 15% as pointed out by [23]. This is confirmed by the table of largest standardized residual correlations shown in Fig. 14, which indicates that there are no locally dependent pairs of items with a residual correlation > .7, as the largest residual correlation is only .53.
In summary, the proposed model validation was executed through the content validity test and the survey on acceptability. The Rasch Model was utilized to prove two things: (i) data fitness and (ii) the probability of the quality factors being accepted. Data fitness is proven by statistical analysis of the infit/outfit MNSQ and ZSTD values, which are all within the productive range for measurement. The Wright Map and Item Measure Table assist not only the data fitness analysis but also the determination of the level of agreement for every item, by placing the most agreed item at the bottom and the least agreed item at the top. This enables the factors in each category of the quality model to be rearranged according to the level of agreement indicated by the survey on acceptability. Every factor's definition was also revised based on the results of the model validation processes. The final web content quality model for MOOC was devised as depicted in Fig. 15.
Moreover, several Rasch Model features, such as the Category Probability Curves and Principal Component Analysis, assist in determining item unidimensionality, meaning that all items in the questionnaire measure only a single construct. This feature is critical especially in forming a newly developed hierarchical model, like the one developed and validated in this research. The result of the Principal Component Analysis proves that the model developed in this research is completely hierarchical, with each criterion related to only one family, similar to other hierarchical models such as ISO/IEC 9126.

C. Threats to Validity
There are several issues that may threaten the validity of the results and the model. Thus, four types of threats to the validity of the survey were analysed based on the framework proposed by [24]: internal, external, conclusion and construct. The narrowly focused purposive sampling utilized for this study strengthens trustworthy inference, which increases internal validity. The selection of respondents was also carefully undertaken and reviewed by the field experts before the content validity test and survey on acceptability were carried out.
Threats to external validity are manageable, as the item and person reliability values in both the content validity test and the survey on acceptability are beyond the Cronbach's Alpha standard of 0.7 [25]. The person reliability score of 0.95 in the survey on acceptability indicates that the results are consistent and generalizable beyond the respondent setting. In terms of conclusion validity, the measurement used to analyze the data is considered reliable through the application of the Rasch Model. Moreover, the high item reliability score of 0.82 proves data sufficiency to measure what should be measured, thus supporting conclusion validity. Threats to construct validity are addressed by the utilization of the Rasch Measurement Model to prove the unidimensionality of the survey results as well as the proposed model, as evidenced by the raw variance explained by measure of 80.2%, which is beyond the Rasch benchmark of 60%. The items that fit are likely to be measuring the single dimension intended by the construct theory.

V. CONCLUSION
This research demonstrates the effectiveness of two validation techniques, namely (1) the content validity test and (2) the survey on acceptability, in verifying the data fitness and the probability of acceptance of the web content quality model. The content validity test was used to confirm whether the content of the survey is acceptable to the reviewers, which provides empirical evidence for construct validity. One proposed factor, Instructor-Centred, was excluded, while two new factors, Sound Quality and Light, were proposed by the respondents. The survey on acceptability was then conducted to measure the probability of acceptance of every category and factor in the quality model from the perspective of content providers and experts.
To provide evidence for construct validity, the Rasch Model was applied to establish a hypothetical unidimensional line along which items and persons are placed according to their difficulty and ability. The Rasch application's built-in tools, such as the Wright Map and the Guttman Scalogram, facilitate the determination of data fitness and the probability of acceptance for every item, measured in intervals via logits. While this approach is claimed to be revolutionary in statistical application, this research proves its suitability for construct validation and instrument development in building a quality model. Moreover, features such as the Category Probability Curves and Principal Component Analysis assist in determining item unidimensionality, meaning that all items measure only a single construct, a feature which is very pertinent in developing a new hierarchical model.