LOQES: Model for Evaluation of Learning Object

Learning Object Technology is a diverse and contentious area that is constantly evolving and will inevitably play a major role in shaping the future of both teaching and learning. Learning Objects are small chunks of material that act as the basic building blocks of technology-enhanced learning and education. They are hosted by various online repositories so that different users can use them in multiple contexts as per their requirements. The major bottleneck for end users is finding an appropriate learning object in terms of content quality and usage. Theorists and researchers have advocated various approaches for evaluating learning objects in the form of evaluation tools and metrics, but these approaches are either qualitative and based on human review or not supported by empirical evidence. The main objective of this paper is to study the impact of current evaluation tools and metrics on the quality of learning objects and to propose a new quantitative system, LOQES, that automatically evaluates a learning object in terms of defined parameters so as to give assurance regarding its quality and value.


I. INTRODUCTION
Learning Objects (LOs) are the basic building blocks of technology-enhanced education. A learning object is a collection of content items, practice items, and assessment items that are combined based on a single objective. LOs are very popular these days because they support reusability in different contexts, which minimizes production cost. Different practitioners have defined learning objects in different ways based on their intrinsic characteristics, such as interoperability, reusability, self-containedness, accessibility, durability and adaptability, yet there is still no consensus regarding a correct definition. According to the IEEE Learning Technology Standards Committee, a Learning Object "is any entity, digital or non digital that can be used, reused or referenced during technology supported learning" [1]. This definition references both digital and non-digital resources. Therefore, to narrow down its scope, David Wiley suggests "any digital resource that can be reused to support learning" [2]. This includes digital images, video feeds, animations, text, web pages etc., irrespective of size. Rehak and Mason propose that a learning object should be reusable, accessible, interoperable and durable [3]. Similarly, Downes considers that only resources that are shareable, digital, modular, interoperable and discoverable can be considered learning objects [Downes, 2004]. Kay & Knaack define learning objects as "interactive web-based tools that support the learning of specific concepts by enhancing, amplifying, and/or guiding the cognitive processes of learners" [4].

Learning objects are beneficial for learners as well as for developers and instructors, as they provide a customized environment for knowledge sharing and for the development of e-learning course modules. Learning objects can be developed by a programmer as per the requirements using various authoring tools, such as office suites, hypertext editors and vector graphic editors, or can be extracted from Learning Object Repositories on the basis of the metadata stored there. Examples of such repositories are MERLOT (Multimedia Educational Resource for Learning and Online Teaching), WORC (University of Wisconsin Online Resource Center) and ALE (Apple Learning Exchange). Metadata is the description of a learning resource, such as the name of the author, the most suitable keywords, the language and other characteristics, which makes it possible to search, find and deliver the desired learning resource to the learner. The major bottleneck for end users is finding, among the learning objects published in various repositories, one that is appropriate in terms of parameters such as quality, reusability, granularity and context of use.

II. LITERATURE REVIEW
The growth in the number of LOs, the multiplicity of authors, the increasing diversity of design and the availability of trained and untrained educators have generated interest in how to evaluate LOs and which criteria to use to make judgments about their quality and usefulness [5]. According to Williams (2000), evaluation is essential for every aspect of designing learning objects, including identifying learners and their needs. The main goal of summative evaluation has been to determine whether participants valued the use of learning objects and whether their learning performance was altered. Formative assessment works during the development phase of learning objects, where feedback is solicited from small groups at regular intervals [4]. Nesbit et al. (2002) outline a convergent evaluation model that involves multiple participants: learners, instructors, instructional designers and media developers. Each group offers feedback throughout the development of an LO. Finally, a report is produced that represents multiple values and needs. The major drawbacks of the convergent evaluation model are the limited number of participants and the differences in their opinions and beliefs [6]. Nesbit and Belfer (2004) designed an evaluation tool, the Learning Object Review Instrument (LORI), which includes nine items: content quality, learning goal alignment, feedback and adaptations, motivation, presentation design, interaction, accessibility, reusability and standards. This instrument has been tested on a limited basis (Krauss & Ally, 2005; Vargo et al., 2003) for a higher education population, but the impact of specific criteria on learning has not been examined [7].
MERLOT developed another evaluation model, which focuses on quality of content, potential effectiveness as a teaching-learning tool, and ease of use. Howard-Rose & Harrigan (2003) tested the MERLOT model with 197 students from 10 different universities. The results were descriptive and did not distinguish the relative impact of individual model components. Cochrane (2005) tested a modified version of the MERLOT evaluation tool that looked at reusability, quality of interactivity, and potential for teaching, but only final scores were tallied, so the impact of separate components could not be determined. Finally, the reliability and validity of the MERLOT assessment tool have yet to be established [4]. Kay & Knaack (2005, 2007a) developed an evaluation tool based on a detailed review of research on instructional design. Specific assessment categories included organisation/layout, learner control over interface, animation, graphics, audio, clear instructions, help features, interactivity, incorrect content/errors, difficulty/challenge, useful/informative, assessment and theme/motivation. The evaluation criteria were tested on a large secondary school population [9,10].
Based on the above theories, Kay & Knaack (2008) proposed a multi-component model for assessing learning objects, the Learning Object Evaluation Metric (LOEM), which focuses on five main criteria: interactivity, design, engagement, usability, and content. The model was tested on a large sample, and the results revealed that four constructs (interactivity, design, engagement and usability) demonstrated good internal and inter-rater reliability and correlated significantly with student and teacher perceptions of learning, quality, engagement and performance. However, there are few findings on how each of these constructs contributes to the learning process [4].
Munoz & Conde (2009) designed and developed a model, HEODAR, that automatically evaluates LOs and produces a set of information that can be used to improve them. The tool is implemented in the University of Salamanca framework and was initially integrated with the Moodle LMS, but the results have not yet been tested [11].
Eguigure & Zapata (2011) proposed a model for Quality Evaluation of Learning Objects (MECOA), which defines six indicators (content, performance, competition, self-management, meaning and creativity) to evaluate the quality of LOs from a pedagogical perspective. These indicators are evaluated by four actors: teachers, students, experts and pedagogues. The instrument was designed and incorporated into the AGORA platform, but the results have not been empirically tested [12].

III. LEARNING OBJECT EVALUATION METRICS
Metrics are measurements of particular characteristics, a notion grounded in the field of Software Engineering. Researchers have proposed various metrics for the quantitative analysis of different dimensions of learning objects, such as quality of stored metadata, reusability, learning style, ranking and cost function, but the empirical evaluation has been performed only at small scale and on an individual basis.

A. Quality Metrics
The quality of the content as well as of the metadata of learning objects stored in learning object repositories (LORs) is an important issue in LOR operation and interoperability. The quality of the metadata record directly affects the chances of a learning object being found, reviewed or reused. The traditional approach to evaluating the quality of learning object metadata is to compare the metadata values with the values provided by metadata experts. This approach is useful for small and slowly growing repositories but becomes impractical for large or federated repositories. Thus, there is a need for automation of quality assessment of learning object metadata.

In the formulas of Table 1, tf(word_i) is the term frequency of the i-th word, df(word_i) is the document frequency of the i-th word, and Readability_Flesch(description_text) is the value of the Flesch index for the text present in the title and description of the record.

1) Simple Completeness:
This metric tries to measure the completeness of the metadata record. It counts the number of fields that do not contain null values. In the case of multi-valued fields, the field is considered complete if at least one instance exists. The score can be calculated as the percentage of possible fields that are filled, divided by 10 so that it lies on a scale from 0 to 10. For example, according to this metric, a record with 70% of its fields filled has a higher score (q = 7) than one in which only 50% have been filled (q = 5).
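The counting rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names and the example record are hypothetical, and a field is treated as empty when it is missing, `None`, a blank string, or a collection with no non-empty instance.

```python
def is_filled(value):
    """Return True if a metadata value counts as non-null."""
    if value is None:
        return False
    if isinstance(value, str):
        return value.strip() != ""
    if isinstance(value, (list, tuple, set)):
        # Multi-valued field: complete if at least one instance exists.
        return any(is_filled(v) for v in value)
    return True


def simple_completeness(record, fields):
    """Share of filled fields, scaled to the 0-10 range."""
    filled = sum(1 for f in fields if is_filled(record.get(f)))
    return 10.0 * filled / len(fields)


# Hypothetical field list and record, for illustration only.
FIELDS = ["title", "description", "keywords", "language", "author",
          "format", "size", "duration", "difficulty", "rights"]
record = {
    "title": "Introduction to Recursion",
    "description": "A short interactive lesson.",
    "keywords": ["recursion"],
    "language": "en",
    "author": "A. Author",
    "format": "text/html",
    "size": 1024,
    "duration": None,   # null value
    "difficulty": "",   # blank value
    # "rights" is missing altogether
}

print(simple_completeness(record, FIELDS))  # 7.0
```

Seven of the ten fields are filled, so the record scores 7, matching the 70% example in the text.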

2) Weighted Completeness:
This metric not only counts the number of filled fields but also assigns a weight to each field. The weight reflects the importance of that field for the application. The sum of the weights of the filled fields is divided by the sum of all the weights and multiplied by 10. The more important fields could have a weight of 1, while the unimportant fields could have a weight of 0.2. For example, if the main application of the metadata is to provide information about the object to a human user, the title, description and annotation fields are more important than the identifier or metadata author fields. If a record contains information for title, description and annotation, its score will be 3/3.2 × 10 ≈ 9, which is higher than that of a record with information only for title and metadata author (1.2/3.2 × 10 ≈ 4).
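The worked example above can be reproduced with a minimal sketch. The weight table is an assumption chosen so that the weights sum to 3.2 as in the text (three fields at 1.0 and the metadata author at 0.2); the field names are hypothetical.

```python
def weighted_completeness(record, weights):
    """Sum of weights of filled fields over the sum of all weights, scaled to 0-10."""
    total = sum(weights.values())
    filled = sum(w for field, w in weights.items() if record.get(field))
    return 10.0 * filled / total


# Assumed weights: important fields at 1.0, unimportant ones at 0.2.
WEIGHTS = {
    "title": 1.0,
    "description": 1.0,
    "annotation": 1.0,
    "metadata_author": 0.2,
}

rec_a = {"title": "Sorting", "description": "Intro to sorting", "annotation": "Used in CS1"}
rec_b = {"title": "Sorting", "metadata_author": "A. Author"}

print(weighted_completeness(rec_a, WEIGHTS))  # 3 / 3.2 * 10, roughly 9.4
print(weighted_completeness(rec_b, WEIGHTS))  # 1.2 / 3.2 * 10, roughly 3.75
```

Note that `record.get(field)` treats empty strings and empty lists as unfilled, which is a simplification of the null-value rule.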

3) Nominal Information Content:
This metric tries to measure the amount of information that the metadata possesses in its nominal fields, i.e. the fields that can be filled with values taken from a fixed vocabulary. For nominal fields, the information content can be calculated as 1 minus the entropy of the value.
Entropy is the negative log of the probability of the value in a given repository. This metric sums the information content of each categorical field of the metadata record. A metadata record whose level of difficulty is set to "high" provides more unique information about the object than a record whose difficulty level is "medium" or "low", and thus possesses a higher score. If a record's nominal fields contain only default values, they provide less unique information about the object and the record possesses a lower score.

4) Textual Information Content
This metric tries to measure the relevance and uniqueness of the words contained in the record's text fields, i.e. the fields that can be filled with free text. The relevance and uniqueness of a word is directly proportional to how often the word appears in the record and inversely proportional to how many records contain that word. This relation is handled by TF-IDF (Term Frequency-Inverse Document Frequency).
The number of times that the word appears in the document is multiplied by the negative log of the fraction of documents that contain that word. The log of the sum of the TF-IDF values of all the words in the textual fields is the result of the metric. For example, a record whose title is "Lectures of C++" will, given that "lecture" and "C++" are common words in the repository, have a lower score than a record whose title is "Introduction to Object Oriented Programming in C++", because the latter contains more words and "object", "oriented" and "programming" are less frequent in the repository.
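As a rough illustration of the calculation described above, the following sketch computes the metric over a tiny, made-up repository. Several details are assumptions rather than part of the source: the tokenizer, the absence of stemming, the IDF form `-log(df / N)`, and the choice to skip words that never occur in the repository.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; '+' and '#' are kept so that 'C++' survives."""
    return re.findall(r"[a-z0-9+#]+", text.lower())


def textual_information_content(text, corpus):
    """Log of the summed TF-IDF values of the words in `text`."""
    n_docs = len(corpus)
    doc_freq = Counter()
    for doc in corpus:
        doc_freq.update(set(tokenize(doc)))

    total = 0.0
    for word, tf in Counter(tokenize(text)).items():
        df = doc_freq.get(word, 0)
        if df == 0:
            continue  # assumption: words unseen in the repository are skipped
        total += tf * -math.log(df / n_docs)
    return math.log(total) if total > 0 else float("-inf")


# Hypothetical repository of record titles.
corpus = [
    "Lectures of C++",
    "Lectures of Java",
    "C++ lectures for beginners",
    "Introduction to Object Oriented Programming in C++",
    "Lectures of Databases",
    "Introduction to C++ lectures",
]

common = textual_information_content("Lectures of C++", corpus)
specific = textual_information_content(
    "Introduction to Object Oriented Programming in C++", corpus)
print(common < specific)  # True
```

The longer title scores higher because it contains more words and several of them appear in only one record.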

5) Readability
This metric tries to measure how accessible the text in the metadata is, i.e. how easy it is to read the description of the learning object.
Readability indexes count the number of words per sentence and the length of the words to provide a value that suggests how easy a text is to read. For example, a description in which only acronyms or complex sentences are used will be rated harder to read, and hence lower in quality, than a description in which normal words and simple sentences are used. Ochoa & Duval (2006) designed an experiment to evaluate how the quality metrics correlate with quality assessment by human reviewers, and the results showed that the Textual Information Content metric seems to be a good predictor [8].
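A readability index of this kind can be sketched as follows. The sketch assumes that the Flesch index referred to in this paper is the Flesch Reading Ease score, 206.835 − 1.015 × (words per sentence) − 84.6 × (syllables per word), on which higher values mean easier text. The syllable counter is a crude vowel-group heuristic introduced here for illustration, not part of the metric's definition.

```python
import re


def count_syllables(word):
    """Rough syllable estimate: number of vowel groups, at least one."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_reading_ease(text):
    """Flesch Reading Ease; higher values indicate easier text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        return 0.0  # assumption: empty text gets a neutral score of zero
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * (len(words) / len(sentences))
            - 84.6 * (syllables / len(words)))


simple_text = "This lesson shows how to add two numbers. It has short steps."
complex_text = ("Comprehensive multidisciplinary instructional methodology "
                "facilitating interoperability considerations.")

print(flesch_reading_ease(simple_text) > flesch_reading_ease(complex_text))  # True
```

Under this assumption, the description written in short words and short sentences receives the higher (easier) readability value.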

B. Reusability Metric
Reusability is the degree to which a learning object can work efficiently for different users, in different digital environments and in different educational contexts over time. Reusability of learning objects is a major issue these days, as developing quality educational material is costly in terms of time and resources. A lot of research has gone into improving the reusability of learning objects by defining standards such as SCORM and IMS, so as to resolve the issue of interoperability among different platforms. The factors that determine the reusability of a learning object can be classified as structural or contextual. From the structural point of view, a learning object should be self-contained, modular, traceable, modifiable, usable, standardized and properly grained. From the contextual point of view, a learning object must be generic and platform independent, so that it can be used in various contexts irrespective of subject or discipline. To measure the reusability of learning objects, various metrics grounded in software engineering theory have been proposed, each measuring a different reusability factor.

1) Cohesion:
Cohesion analyzes the relationships between the elements within a module. Greater cohesion implies greater reusability.

• A learning object involves a number of concepts. The fewer the concepts, the greater the cohesion of the module.
• A learning object should have a single, clear learning objective. The more learning objectives it has, the less cohesive it will be.
• The semantic density of a learning object shows its conciseness. Greater conciseness indicates greater cohesion.
• A learning object must be self-contained and exhibit few relationships and instances so as to be highly cohesive.

Thus, the cohesion of a learning object depends on its semantic density, its aggregation level, and its number of relationships, concepts and learning objectives.

2) Coupling:
Coupling measures the interdependencies between modules. A module should communicate with a minimum number of other modules and exchange little information, so as to minimize the impact of changes in other modules. Lower coupling implies greater reusability. Coupling is directly proportional to the number of relationships present, so a learning object should be self-contained and reference few other objects to increase its reusability.

3) Size and Complexity:
Granularity is a major factor in the reusability of an object, as finely grained learning objects are more easily reusable. Granularity is directly proportional to the following LOM elements:

• Size of the learning object.
• Duration, i.e. the time needed to run the learning object.
• Typical Learning Time, i.e. the estimated time required to complete the learning object.

4) Portability
Portability is the ability of a learning object to be used in multiple contexts across different platforms.

• Technical portability depends on the delivery format of the learning object as well as on the hardware and software requirements to run that particular learning object. It is rated on the following scale:

Very high: The object is based on a technology available on all systems and platforms (e.g. HTML).
High: The object is based on a technology available on many systems and platforms.
Medium: The object is based on a technology that is not available on many systems (e.g. a common platform-specific file format).
Low: The object is based on a technology that is hardly available on different systems (e.g. an uncommon proprietary file format).
Very low: The object is based on a proprietary technology that is not available on many systems (e.g. a specific server technology).

• Educational portability deals with vertical and horizontal portability. Vertical portability is the possibility of being used or reused at different educational levels, such as primary, secondary or higher education, whereas horizontal portability determines the usage across various disciplines.

5) Difficulty of Comprehension
The difficulty of understanding a learning object directly influences the reusability of that object in an aggregated manner.

C. Ranking Metrics
LORs use various strategies to search for learning objects according to end-user requirements, such as metadata-based search and simple text-based search. In both cases, the retrieval of an appropriate learning object depends on the quality of the metadata and the content-matching capacity of the LOR. Ochoa and Duval proposed various ranking metrics inspired by methods currently used to rank other types of objects, such as books, scientific journals and TV programs. They are adapted to be calculable from the information available on the usage and context of learning objects.

1) Topical Relevance Ranking Metrics
These metrics estimate which objects are most related to what a user wants to learn. The first step is to estimate the topic that interests the user, and the second step is to establish the topic to which each learning object in the result list belongs. The sources of information for the first step are the query terms used, the course from which the search was generated and the previous interactions of the user with the system. The sources for the second step are the classifications in the learning object metadata, the topical preferences of previous learners who have used the object, and the topics of the courses that the object belongs to.

a) Basic Topical Relevance Metric
This metric makes two naïve assumptions: 1) the topic needed by the user is fully expressed in the query, and 2) each object is relevant to just one topic. The relevance is calculated by counting the number of times the object has been previously selected from the result list when the same query terms have been used. The BT Relevance Metric is the sum of the times that the object has been selected in any of those queries.

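$$
selected(o, q_i) =
\begin{cases}
1 & \text{if } o \text{ was selected in } q_i \\
0 & \text{otherwise}
\end{cases}
\qquad
BT(o, q) = \sum_{i=1}^{NQ} sim(q, q_i) \cdot selected(o, q_i)
$$

where
sim(q, q_i): similarity between two queries, in the range 0 to 1
o: learning object to be ranked
q: query performed by the user
q_i: representation of the i-th previous query
NQ: total number of queries

The similarity between two queries can be calculated either as the semantic difference between the query terms or as the number of objects that both queries have returned in common.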
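The following sketch illustrates the idea of summing, over previous queries, a query similarity in the range 0 to 1 whenever the object was selected. The Jaccard overlap of query terms is used here only as an assumed stand-in for the similarity measure, and the query log format is hypothetical.

```python
def query_similarity(q1, q2):
    """Jaccard overlap of query terms, in [0, 1] (an assumed similarity measure)."""
    a, b = set(q1.lower().split()), set(q2.lower().split())
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def basic_topical_relevance(obj, query, query_log):
    """Sum of similarities to previous queries in which `obj` was selected.

    `query_log` is a list of (previous_query, selected_object_ids) pairs.
    """
    return sum(query_similarity(query, prev)
               for prev, selected in query_log
               if obj in selected)


# Hypothetical usage log.
log = [
    ("c++ pointers", {"lo1", "lo2"}),
    ("c++ classes", {"lo2"}),
    ("java streams", {"lo3"}),
]

for lo in ("lo1", "lo2", "lo3"):
    print(lo, basic_topical_relevance(lo, "c++ pointers", log))
```

Here `lo2` ranks above `lo1` because it was also selected for a partially similar query, while `lo3` receives no score because its only selection came from an unrelated query.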

b) Course-Similarity Topical Relevance Metric
The course in which the object will be reused can be directly used as the topic of the query. Objects that are used in similar courses should be ranked higher in the list. The main problem in calculating this metric is establishing which courses are similar. For this, the SimRank algorithm is used, which analyzes object-to-object relationships to measure the similarity between those objects.
The CST Relevance Metric is calculated by counting the number of times that a learning object in the list has been used in the universe of courses.

$$
present(o, c) =
\begin{cases}
1 & \text{if } o \text{ is used in } c \\
0 & \text{otherwise}
\end{cases}
$$
$$
SimRank(c_1, c_2) = \sum_{i=1}^{NO} present(o_i, c_1) \cdot present(o_i, c_2)
$$
$$
CST(o, c) = \sum_{i=1}^{NC} SimRank(c, c_i) \cdot present(o, c_i)
$$

where
o: learning object to be ranked
c: course where it will be inserted or used
c_i: i-th course present in the system
o_i: i-th object present in the system
NC: total number of courses
NO: total number of objects

c) Internal Topical Relevance Metric
This algorithm is based on the HITS algorithm, which posits the existence of hubs and authorities.

Hubs: pages that mostly point to other useful pages.
Authorities: pages with comprehensive information about a subject.

Hubs correspond to courses and authorities correspond to learning objects. The hub value of each course is calculated as the number of inbound links that it has. The rank of each object is calculated as the sum of the hub values of the courses where it has been used.
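A minimal sketch of this hub-based ranking is given below. It assumes that the links of a course are simply the objects it uses, so that the hub value of a course equals the number of objects it contains; the course and object identifiers are hypothetical.

```python
def internal_topical_relevance(course_objects):
    """Rank each object by the summed hub values of the courses that use it.

    `course_objects` maps a course id to the set of object ids it uses.
    Assumption: the hub value of a course is its number of course-object links.
    """
    hub = {course: len(objs) for course, objs in course_objects.items()}
    rank = {}
    for course, objs in course_objects.items():
        for obj in objs:
            rank[obj] = rank.get(obj, 0) + hub[course]
    return rank


# Hypothetical course-to-object usage data.
courses = {
    "cs101": {"lo1", "lo2", "lo3"},
    "cs102": {"lo2"},
    "math1": {"lo3", "lo4"},
}

ranks = internal_topical_relevance(courses)
print(sorted(ranks.items()))
# [('lo1', 3), ('lo2', 4), ('lo3', 5), ('lo4', 2)]
```

Objects used in several courses, or in courses that draw on many objects, end up with the highest rank.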

2) Personal Relevance Ranking Metrics
These metrics try to establish the learning preferences of the user and compare them with the characteristics of the learning objects in the result list. The most difficult part is obtaining an accurate representation of the personal preferences. The richest source of information is the attention metadata that can be collected from the user. The second step is to obtain the characteristics of the object, which are collected from metadata or from contextual and usage information.

a) Basic Personal Relevance Ranking Metric
For a given user, a set of relative frequencies of the different metadata field values present in the objects that the user has previously used is obtained. Here val(o, f) denotes the value of field f in object o. The frequencies for each metadata field are calculated by counting the number of times that a given value is present in the given field of the metadata. Once the frequencies are obtained, they can be compared with the metadata values of the objects in the result list. If a value present in the user preference set is also present in the object, the object receives a boost in its rank equal to the relative frequency of that value.
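The procedure described above can be sketched as two small functions: one that builds the relative-frequency profile from the objects a user has already used, and one that adds up the boosts for a candidate object. The metadata fields and values are hypothetical, and single-valued string fields are assumed.

```python
from collections import Counter


def user_profile(used_objects, fields):
    """Relative frequency of each value, per field, over the user's past objects."""
    n = len(used_objects)
    profile = {}
    for field in fields:
        counts = Counter(o[field] for o in used_objects if o.get(field) is not None)
        profile[field] = {value: count / n for value, count in counts.items()}
    return profile


def basic_personal_relevance(obj, profile):
    """Sum, over fields, of the relative frequency of the object's value."""
    return sum(freqs.get(obj.get(field), 0.0) for field, freqs in profile.items())


# Hypothetical history of objects previously used by one user.
history = [
    {"language": "en", "format": "video"},
    {"language": "en", "format": "video"},
    {"language": "en", "format": "text"},
    {"language": "es", "format": "text"},
]
profile = user_profile(history, ["language", "format"])

print(basic_personal_relevance({"language": "en", "format": "video"}, profile))  # 1.25
print(basic_personal_relevance({"language": "fr", "format": "text"}, profile))   # 0.5
```

The first candidate matches the user's most frequent language (0.75) and format (0.5), so it receives the larger boost.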

3) Situational Relevance Ranking Metrics
These metrics try to estimate the relevance of the objects in the result list to the specific task that caused the search. This relevance is related to the learning environment in which the object will be used, as well as to the time, space and technological constraints imposed by the context where learning will take place. Contextual information is needed in order to establish the nature of the task and its environment.

a) Basic Situational Relevance Ranking Metric
In formal learning contexts, the description of the course, lesson or activity in which the object will be inserted is a source of contextual information, and it is usually written by the instructor. Keywords can be extracted from these texts and used to calculate a ranking metric based on the similarity between the keyword list and the content of the textual fields of the metadata record. Similarity is defined as the cosine of the angle between the TF-IDF vector of the contextual keywords and the TF-IDF vector of the words in the text fields of the metadata of each object in the result list. TF-IDF is a measure of the importance of a word in a document that belongs to a collection.

$$
BS(o, c) = \frac{\sum_{i=1}^{M} tv_i \cdot ov_i}{\sqrt{\sum_{i=1}^{M} tv_i^2 \cdot \sum_{i=1}^{M} ov_i^2}}
$$

where
TF: Term Frequency, the number of times that the word appears in the current text
IDF: Inverse Document Frequency, based on the number of documents in the collection where the word is present
o: learning object to be ranked
c: course where the object will be used
tv_i: i-th component of the TF-IDF vector representing the keywords extracted from the course description
ov_i: i-th component of the TF-IDF vector representing the text in the object description
M: dimensionality of the vector space (number of different words)
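The cosine comparison of the two TF-IDF vectors can be sketched as follows. The document frequencies, collection size and word lists are made-up illustration data, the IDF form `log(N / df)` is an assumption, and vectors are stored sparsely as dictionaries from word to weight.

```python
import math
from collections import Counter


def tfidf_vector(words, doc_freq, n_docs):
    """Sparse TF-IDF vector; words without a known document frequency are skipped."""
    tf = Counter(words)
    return {w: tf[w] * math.log(n_docs / doc_freq[w])
            for w in tf if doc_freq.get(w)}


def cosine_similarity(tv, ov):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * ov.get(word, 0.0) for word, weight in tv.items())
    norm = math.sqrt(sum(x * x for x in tv.values())
                     * sum(x * x for x in ov.values()))
    return dot / norm if norm else 0.0


# Hypothetical collection statistics.
N_DOCS = 10
DOC_FREQ = {"recursion": 2, "python": 5, "functions": 4,
            "examples": 8, "databases": 3, "sql": 3}

course_vec = tfidf_vector(["recursion", "python", "functions"], DOC_FREQ, N_DOCS)
object_a = tfidf_vector(["recursion", "functions", "examples"], DOC_FREQ, N_DOCS)
object_b = tfidf_vector(["databases", "sql", "examples"], DOC_FREQ, N_DOCS)

print(cosine_similarity(course_vec, object_a) > cosine_similarity(course_vec, object_b))  # True
```

Object A shares two keywords with the course description and therefore ranks above object B, which shares none.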

b) Context-Similarity Situational Relevance Ranking Metric
A fair representation of the kind of objects that are relevant in a given context can be obtained from the objects that have already been used under similar conditions.

Researchers have proposed various metrics, but the main drawback is that these metrics are not implemented in any quality evaluation tool. The results of these metrics have been analyzed only through small-scale empirical studies conducted on an individual basis. The main objective of this paper is to propose a model, LOQES, that automatically assesses the quality of a learning object by applying the metrics discussed above to the defined parameters and gives a quantitative rating that acts as a quality indicator, which is beneficial for the learning object community. The system will be applied to newly developed learning objects and will act as a certification mechanism. The tool first extracts the metadata fields of the learning object on the basis of information supplied by the contributor. It then applies various quality metrics to the metadata to estimate the correctness and accuracy of the metadata records. Afterwards, this information is used to estimate the values of the other defined parameters, such as reusability, granularity, linkage and complexity, by employing the corresponding metrics. The aggregate of the scores of the above parameters is used to calculate the overall rating of that particular learning object. The main benefit of this model is that it is quantitative: it evaluates the learning object automatically and is not based on peer review.

V. CONCLUSIONS
In this paper, the main emphasis is on proposing a quantitative model that automatically assesses the quality of learning objects on the basis of the various metrics proposed. Presently, all the evaluation tools are qualitative and based on expert review. Researchers have also specified the need for an automatic assessment tool owing to the wide dissemination of learning objects. The main task left for future work is to develop the system and to carry out an empirical study with a full implementation of these metrics in the real world, comparing their performance.

Fig. 1. Conceptual Framework of Learning Object Repositories

Fig. 3. Types of Relevance Ranking Metrics

For the Basic Personal Relevance Ranking Metric:

$$
cont(o, f, v) =
\begin{cases}
1 & \text{if } val(o, f) = v \\
0 & \text{otherwise}
\end{cases}
$$
$$
freq(u, f, v) = \frac{1}{N} \sum_{i=1}^{N} cont(o_i, f, v)
$$
$$
BP(o, u) = \sum_{i=1}^{NF} freq(u, f_i, val(o, f_i))
$$

where
o: learning object to be ranked
f: field in the metadata standard
v: value that the f field could take
o_i: i-th object previously used by u
N: total number of objects
f_i: i-th field considered for the calculation of the metric
NF: total number of those fields

b) User-Similarity Personal Relevance Ranking Metric
The USP Relevance Metric is calculated by finding the number of times similar users have reused the objects in the result list. The SimRank algorithm is used to find similar users.

$$
hasReused(o, u) =
\begin{cases}
1 & \text{if } o \text{ has been used by } u \\
0 & \text{otherwise}
\end{cases}
\qquad
USP(o, u) = \sum_{i=1}^{NU} SimRank(u, u_i) \cdot hasReused(o, u_i)
$$

where
o: learning object to be ranked
u: user that performed the query
u_i: representation of the i-th user
NU: total number of users

For the Context-Similarity Situational Relevance Ranking Metric, the symbols are:
o: learning object to be ranked
c: course where the object will be used
o_i: i-th object contained in the course c
f: field in the metadata standard
v: value of the f field in the object o
NF: total number of fields

IV. PROPOSED MODEL LOQES
Theorists and researchers have proposed different studies for evaluating LOs in terms of reusability, standardization, design, use and learning outcomes. The major problems with these studies are that they are not supported by empirical evidence, cover a limited number of objects and evaluate only qualitative phenomena.

MacDonald et al., 2005) based on informal interviews or surveys, frequency of use and learning outcome.

Quality metrics are small calculations performed over the values of the different fields of the metadata record in order to gain insight into various quality characteristics, such as completeness, accuracy, provenance, conformance to expectations, coherence and consistency, timeliness and accessibility. The values obtained from the metrics are contrasted with evaluations by human reviewers on a sample of learning object metadata from a real repository, and the results are evaluated. These metrics address the need for automation of quality assessment of learning object metadata. The quality metrics proposed by Ochoa and Duval are Simple Completeness, Weighted Completeness, Nominal Information Content, Textual Information Content and Readability.

TABLE 1: TYPES OF QUALITY METRICS

TABLE 2: CORRELATION BETWEEN HUMAN EVALUATION SCORE AND THE

TABLE 3: COHESION VALUES TO MEASURE LEARNING OBJECT REUSABILITY

TABLE 4: VALUES TO MEASURE LEARNING OBJECT SIZE

TABLE 5: VALUES TO MEASURE LEARNING OBJECT TECHNICAL PORTABILITY

TABLE 5: VALUES TO MEASURE LEARNING OBJECT EDUCATIONAL PORTABILITY