Annotation and research of pedagogical documents in a platform of e-learning based on Semantic Web

E-learning is considered one of the areas in which the Semantic Web can bring real improvement, whether in finding information, reusing educational resources, or personalizing learning paths. This paper aims to develop an educational ontology that will be used to annotate learning materials and pedagogical documents.
Keywords—Ontology; Semantic Web; XML; RDF; RDFS; OWL; metadata; distance education.


INTRODUCTION
When Tim Berners-Lee created the Web in the early 1990s, its goal was to allow any user to access and share large amounts of information on the net, and the Web achieved this objective very quickly. As a result, we now have a large volume of information but no control over its content. The Semantic Web is the solution proposed by the W3C (World Wide Web Consortium): this new vision of the Web aims to make Web resources understandable not only by humans but also by machines. To achieve this goal, the W3C has developed new languages: XML (Extensible Markup Language), RDF (Resource Description Framework), OWL (Web Ontology Language), etc. The Semantic Web has drawn on knowledge engineering to provide a tool for knowledge representation, and ontologies have proved the most suitable for such an environment. An ontology is a conceptual system that enables knowledge to be shared between humans and computers and between computers. The work presented in this paper lies at the intersection of the Semantic Web and the field of distance education. It has a double goal: the first is to design an application ontology that describes the educational materials used in university education, and the second is to develop an application for annotating educational documents and retrieving them by exploiting the metadata that describe them.

A. Semantic Web Hierarchy Model
The Semantic Web is a vision of the Web in which information is given explicit meaning, making it easier for machines to process and integrate information on the Web.
The Semantic Web is built on the ability of XML to define customized tagging schemas and on the flexibility of RDF to represent data. If machines are expected to perform useful reasoning tasks on these documents, the language must go beyond the basic semantics of RDF Schema. OWL has been designed to meet that need. OWL is part of a scalable stack of W3C recommendations for the Semantic Web [1].
Fig. 1 shows the standardized technologies of the Semantic Web, which are described below:
1) XML: provides a surface syntax for structured documents [2] and achieves interoperability of data across platforms or systems that use different languages. XML was designed for document exchange and defines the data structure [3].
2) XML Schema: the successor to the DTD (Document Type Definition). It uses XML syntax itself but is more flexible than the DTD, provides more data types, and serves XML documents more effectively. If XML is a standardized data format, then XML Schema defines the data types of an XML document [3].
3) RDF: a standard for describing Web resources proposed by the W3C. As its name implies, RDF (Resource Description Framework) is a framework, or metalanguage, for describing resources, intended to make information more "structured" for search engines and, more generally, for any tool that analyzes Web pages automatically [6]. RDF uses a particular terminology for the parts of a statement: the part that identifies the thing the statement is about is called the subject, the part that identifies a property or characteristic of the subject is called the predicate, and the part that identifies the value of this property is called the object. RDF statements therefore take the form of a triple <Subject, Predicate, Object>. A graph model is used to query and process RDF [3]. In this model, a statement is represented by a node for the subject, a node for the object, and an arc for the predicate, directed from the subject node to the object node, so any RDF statement can be represented by the graph shown in Fig. 2 [1].
4) RDF Schema: uses a machine-understandable system to define the vocabularies of the described resources [3].
Fig. 2. RDF statement in graph model.

5) Ontology: RDF provides only basic descriptive relations, so its reasoning ability is limited. The Semantic Web therefore builds an ontology layer and a logical reasoning layer on top of RDF to support knowledge representation and semantics-based reasoning. Ontology is a philosophical term introduced in the nineteenth century that characterizes the study of beings in our universe. In computer science, an ontology is a structured representation of domain knowledge in the form of a network of concepts connected by semantic links. The ultimate goal of an ontology is to make implicit information explicit and accurate. The W3C has defined OWL (Web Ontology Language) as the recommended standard language for describing ontologies.
6) OWL: like RDF, OWL is an XML-based language that benefits from the universal syntax of XML. It adds the ability to make comparisons between properties and classes: identity, equivalence, inverse, symmetry, cardinality, transitivity, disjointness, etc. The W3C provides OWL in three sublanguages of increasing expressive power, so the appropriate sublanguage must be chosen for each use.
a) OWL Lite is the simplest sublanguage of OWL; it is intended to represent hierarchies of simple concepts.
b) OWL DL is more complex than OWL Lite; it is based on description logic, hence its name (OWL Description Logics). It is suited to reasoning and guarantees the completeness and decidability of that reasoning.
c) OWL Full is the most complex version of OWL, designed for situations where a high level of descriptive power is important, even though the completeness and decidability of computations related to the ontology cannot be guaranteed [7].
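
As an illustration of the RDF and ontology layers described above, the following Python sketch stores statements as (subject, predicate, object) triples, builds the graph view of those statements, and derives the implicit types of a resource by following subclass links transitively. The namespace `http://example.org/elearning#`, the resource names, and the helper functions are assumptions made for this example; they are not taken from the ontology developed in this paper, and the inference shown is only a very small subset of what an RDFS or OWL reasoner provides.

```python
from collections import namedtuple

# An RDF statement is a (subject, predicate, object) triple.
Triple = namedtuple("Triple", ["subject", "predicate", "object"])

# Hypothetical example data; URIs and class names are illustrative only.
EX = "http://example.org/elearning#"
triples = {
    Triple(EX + "Exercise", "rdfs:subClassOf", EX + "LearningResource"),
    Triple(EX + "LearningResource", "rdfs:subClassOf", EX + "Document"),
    Triple(EX + "doc42", "rdf:type", EX + "Exercise"),
}


def as_graph(statements):
    """Graph view: each subject node maps to its outgoing (predicate, object) arcs."""
    graph = {}
    for t in statements:
        graph.setdefault(t.subject, []).append((t.predicate, t.object))
    return graph


def superclasses(cls, statements):
    """All classes reachable from cls by following rdfs:subClassOf arcs."""
    found, stack = set(), [cls]
    while stack:
        current = stack.pop()
        for t in statements:
            if (t.subject == current and t.predicate == "rdfs:subClassOf"
                    and t.object not in found):
                found.add(t.object)
                stack.append(t.object)
    return found


def inferred_types(resource, statements):
    """Asserted types of a resource plus the types implied by the class hierarchy."""
    direct = {t.object for t in statements
              if t.subject == resource and t.predicate == "rdf:type"}
    result = set(direct)
    for cls in direct:
        result |= superclasses(cls, statements)
    return result
```

With plain RDF only the asserted type of `doc42` is available; the subclass statements are what allow a program to conclude that it is also a learning resource and a document.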

A. Metadata
Metadata can be defined as machine-processable "data about data" [8]. In the case of pedagogical documents, the document content is the data, while information about the authors, their interests, and their pedagogical goals is metadata.

B. Metadata based on ontologies
In the context of the Semantic Web, ontologies provide particularly rich semantics, richer than any other known method of knowledge representation. When searching for educational content on an education platform, relying on the conceptual vocabulary defined in an ontology can improve the accuracy of the search by avoiding terminological ambiguities and by allowing inferences that reduce noise and increase relevance.
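
The effect described above can be sketched in a few lines of Python. In this minimal example, user terms are first mapped to ontology concepts (which removes terminological ambiguity), and the concept hierarchy is then used to infer which annotated documents are relevant. The concept names, labels, and document identifiers are hypothetical and serve only to illustrate the idea.

```python
# Hypothetical concept hierarchy: child concept -> parent concept.
PARENT = {
    "Exercise": "LearningResource",
    "LectureNotes": "LearningResource",
    "LearningResource": "Document",
}

# Hypothetical labels mapping the users' terms to ontology concepts.
LABELS = {
    "exercise": "Exercise",
    "tutorial sheet": "Exercise",
    "lecture notes": "LectureNotes",
    "handout": "LectureNotes",
}

# Hypothetical annotations: document identifier -> annotating concept.
ANNOTATIONS = {
    "doc1.pdf": "Exercise",
    "doc2.pdf": "LectureNotes",
    "doc3.pdf": "Document",
}


def is_a(concept, ancestor):
    """True if concept equals ancestor or is one of its descendants."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = PARENT.get(concept)
    return False


def search(term):
    """Resolve a user term to a concept, then return the matching documents."""
    concept = LABELS.get(term.lower(), term)
    return sorted(doc for doc, c in ANNOTATIONS.items() if is_a(c, concept))
```

Here `search("tutorial sheet")` and `search("exercise")` resolve to the same concept, and `search("LearningResource")` also returns documents annotated with the more specific concepts, which a plain keyword match would miss.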

C. Ontologies for e-learning
In June 2000 the European Commission defined e-learning as "the use of new multimedia technologies and the Internet to improve the quality of learning by facilitating access to resources and services, as well as exchanges and remote collaboration". E-learning, like other Web services, can benefit from the new vision of the Semantic Web, relying in particular on the potential of ontologies.

1) Needs of e-learning systems
The needs of an e-learning system that ontologies can help fulfill can be summarized as follows:
a) Need for archiving and information retrieval: an e-learning application is put online through the Web. Given the diversity and exponential growth of the pedagogical resources used in e-learning, it is increasingly difficult to find relevant pedagogical materials.
b) Need to share: choosing which keywords to use when searching for learning materials is a problem.
c) Need for reuse of pedagogical objects: despite the ever-growing volume of pedagogical materials available on the net, only a small number of pedagogical objects are reusable. Searching for and selecting relevant text fragments, figures, or exercises in a document in order to reuse them in a new document has become almost impossible.
d) Need for customization and adaptation: an e-learning system serves a community of users who do not share the same expectations, knowledge, skills, interests, etc. They can understand or accept only documents whose organization, content, and presentation are adapted to their needs.

A. Construction process
The process of building an ontology that can be exploited within a computer system is based on two steps: ontologization and operationalization. Ontologization consists of building a conceptual ontology, that is, providing a description of the target world. For this task we take into account various sources of knowledge: glossaries of terms, other ontologies, texts, interviews with experts, etc. Operationalization consists of encoding the resulting conceptual ontology in an operational knowledge representation language (one provided with inference mechanisms). It should be noted that this process is not linear and that many iterations between the two steps are generally necessary to develop an ontology adapted to operational needs.

B. Design of the application ontology
1) Choice of a construction methodology
Several methods exist for building an application ontology, and the choice among them depends on our needs. The method developed by [Bernaras et al., 1996] was used in this work. It is based on three steps:
• Specify the application based on the ontology, in particular the terms to collect and the tasks to execute using this ontology.
• Refine and organize the ontology according to the principles of modularization and hierarchical organization.

This choice can be justified by two reasons:
• This method is suitable for application ontologies rather than domain ontologies.
• It is structured around a set of terms that must be transformed into an ontology.
2) Construction principles
a) Clarity and objectivity [9]: all terms used in this ontology have been associated with definitions.
b) Completeness [9]: to satisfy this principle, some definitions of the concepts and relations of our ontology have been associated with conditions, and others with necessary and sufficient conditions, depending on whether such conditions could be defined.
c) Maximum ontological extensibility [9]: the definition of a term describes only the term itself; it cannot also serve as the definition of a more general or a more specialized term.
d) Principle of ontological distinction [10]: the concepts in the ontology are sufficiently disjoint.
e) Minimum semantic distance [11]: there is a minimum distance between sibling concepts, i.e. children of the same parent.
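
Some of these principles lend themselves to simple automated checks. The Python sketch below is one possible reading, not the procedure used in this work: it checks principle a) by listing concepts that lack a definition, and approximates principle d) by listing sibling concepts that have not been declared disjoint. The concept names, definitions, and the data layout are hypothetical.

```python
from itertools import combinations

# Hypothetical ontology fragment: concept -> definition and parent concept.
CONCEPTS = {
    "Document": {"definition": "Any pedagogical resource.", "parent": None},
    "Course": {"definition": "Material presenting a lesson.", "parent": "Document"},
    "Exercise": {"definition": "Material used to practise a lesson.", "parent": "Document"},
    "Exam": {"definition": "", "parent": "Document"},
}

# Hypothetical disjointness declarations between pairs of concepts.
DISJOINT = {frozenset({"Course", "Exercise"})}


def missing_definitions(concepts):
    """Clarity check: concepts whose definition is empty."""
    return sorted(name for name, c in concepts.items()
                  if not c["definition"].strip())


def undeclared_disjoint_siblings(concepts, disjoint):
    """Distinction check: sibling pairs not yet declared disjoint."""
    siblings = {}
    for name, c in concepts.items():
        if c["parent"] is not None:
            siblings.setdefault(c["parent"], []).append(name)
    return sorted(
        tuple(sorted(pair))
        for group in siblings.values()
        for pair in combinations(group, 2)
        if frozenset(pair) not in disjoint
    )
```

Running both checks on the sample data reports `Exam` as lacking a definition and flags the pairs involving `Exam` as siblings whose disjointness still has to be decided by the ontology designer.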
3) Representation of concepts
Fig. 3 shows a hierarchical representation of the concepts used to model the pedagogical universe in our ontology.

4) UML class diagram
The class diagram shown in Fig. 4 illustrates concepts, attributes and relations linking concepts together:

C. Implementation and use of the ontology:
The ontology editor Protégé (version 3.1.1) was used to edit our ontology, to automatically generate the corresponding OWL code, and to generate the HTML documentation. A fragment of the generated OWL code is illustrated in Fig. 5.
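
To give an idea of what such OWL code looks like in its RDF/XML form, the following Python sketch builds a small fragment by hand with the standard library. It is not the output of Protégé and does not reproduce Fig. 5: the class names `Document` and `Course` and the helper `owl_class` are assumptions chosen for illustration; only the three namespace URIs are the standard RDF, RDFS, and OWL ones.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for RDF, RDF Schema and OWL.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
for prefix, uri in (("rdf", RDF), ("rdfs", RDFS), ("owl", OWL)):
    ET.register_namespace(prefix, uri)


def owl_class(root, name, parent=None):
    """Append an owl:Class element, optionally declared as a subclass of parent."""
    cls = ET.SubElement(root, f"{{{OWL}}}Class", {f"{{{RDF}}}ID": name})
    if parent is not None:
        ET.SubElement(cls, f"{{{RDFS}}}subClassOf",
                      {f"{{{RDF}}}resource": "#" + parent})
    return cls


# Hypothetical two-class hierarchy (names are illustrative only).
root = ET.Element(f"{{{RDF}}}RDF")
owl_class(root, "Document")
owl_class(root, "Course", parent="Document")

owl_xml = ET.tostring(root, encoding="unicode")
print(owl_xml)
```

An ontology editor produces the same kind of markup automatically, together with property declarations, restrictions, and individuals that this sketch leaves out.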
The process of ontology building can be integrated into the life cycle of an ontology, as shown in Figure 6.
It should be noted that the approach used to annotate the documents consisted of:
• Adding the metadata that describe a document to the OWL file that encodes the application ontology.
• Storing the documents in a specific location on the server.
• Managing access to the documents through the "URI" metadata.
Regarding the search for a document, three options are available: search by document metadata, by author of the document, or by taught module (Figure 9).
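
The annotation steps and the three search options described above can be sketched as follows. This is a simplified stand-in, assuming an in-memory set of triples in place of the OWL file; the property names `hasAuthor`, `taughtIn`, and `hasKeyword`, the document URIs, and the author names are hypothetical and are not taken from the application described in this paper.

```python
# In-memory stand-in for the OWL file that holds the annotations:
# a set of (subject, predicate, object) triples.
store = set()


def annotate(uri, author, module, keywords=()):
    """Attach metadata to the document identified by its URI."""
    store.add((uri, "hasAuthor", author))
    store.add((uri, "taughtIn", module))
    for keyword in keywords:
        store.add((uri, "hasKeyword", keyword))


def match(subject=None, predicate=None, obj=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in store
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    ]


def search(predicate, value):
    """URIs of the documents whose given property has the given value."""
    return sorted({s for s, _, _ in match(predicate=predicate, obj=value)})


# Hypothetical documents and metadata.
annotate("http://example.org/docs/algo-course.pdf", "A. Author", "Algorithms",
         ["sorting", "graphs"])
annotate("http://example.org/docs/algo-exercises.pdf", "B. Author", "Algorithms",
         ["sorting"])
annotate("http://example.org/docs/db-course.pdf", "A. Author", "Databases", ["SQL"])

by_metadata = search("hasKeyword", "sorting")   # search by document metadata
by_author = search("hasAuthor", "A. Author")    # search by author
by_module = search("taughtIn", "Databases")     # search by taught module
```

Because each document is identified by its URI, the result of a search is directly usable to locate the file stored on the server.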
To exploit the ontology in the application, a dedicated query language is essential: SPARQL (its name is a recursive acronym for SPARQL Protocol And RDF Query Language). SPARQL performs searches on RDF graphs [4]. SPARQL is thus also a programming interface between applications, an API in the form of a Web services standard, and it opens the way to a universal API for querying structured data, in which the semantics of the query are no longer fixed in the API, which limits the possibilities, but expressed in the query itself.

V. CONCLUSION AND PERSPECTIVES
This paper has shown what the Semantic Web is and which standards and languages it is based on; it has also covered the notion of ontology and the contribution of ontologies in the Semantic Web context. The objective of this paper was the design of an ontology providing a vocabulary for the annotation and retrieval of documents in a distance education platform. However, this work is not perfect and can be improved in several areas, such as:
• Developing other ontologies and combining them with the one built here to enrich the vocabulary used for annotation and retrieval.
• Reusing this ontology in other platforms based on Semantic Web techniques.
• Adding intelligent agent technologies to the application to make it reactive to users.

Fig. 1. Semantic Web layers.
In the RDF graph model, a statement is represented by: a) a node for the subject; b) a node for the object; c) an arc, directed from the subject node to the object node, for the predicate.
Fig. 8 shows an example of annotation (annotation of a module).
Fig. 9. Search for a document.