A Survey on Models and Query Languages for Temporally Annotated RDF

In this paper, we provide a survey on the models and query languages for temporally annotated RDF. In most of the works, a temporally annotated RDF ontology is essentially a set of RDF triples associated with temporal constraints, where, in the simplest case, a temporal constraint is a validity temporal interval. However, a temporally annotated RDF ontology may also be a set of triples connecting resources with a specific lifespan, where each of these triples is also associated with a validity temporal interval. Further, a temporal RDF ontology may be a set of triples connecting resources as they stand at specific time points. Several query languages for temporally annotated RDF have been proposed, most of which extend SPARQL or translate to SPARQL. Some of the works provide experimental results, while the rest are purely theoretical.


I. INTRODUCTION
RDF ("Resource Description Framework") [1], [2] is a growing Semantic Web standard for the specification of ontologies. An RDF ontology contains a set of triples (s,p,o), denoting that subject s is associated with object o by property p. However, this information is static, meaning that it either does not change over time or the whole RDF ontology corresponds to a particular time point. In practice, the truth of statements often changes with time, and Semantic Web applications often need to represent such changes and reason about them. For example, statements regarding airline flights are valid only in certain time intervals. Validity time should also be integrated in the query language, making it possible to retrieve "flights from London to Paris during Mary's summer vacation". Some additional example temporal queries are the following:
1. Who are the foaf:Persons whose lifespan overlaps with Einstein's?
2. What is the temperature in Chicago at sunrise of July 20th, 2008?
3. What are the names of the engineers who committed code to a particular software in the first half of 2008?
4. What is the salary of Tom during the interval [2007-01-01, 2009-12-31]?
5. Who was the head of the German government before and after the unification of 1990?
6. Who are the service providers that provide web services for more than 4 consecutive years?
7. Who are the house members who sponsored a bill after April 2, 2008?
In this paper, we provide a survey on the models and query languages for temporally annotated RDF. In most of the works, a temporally annotated RDF ontology is essentially a set of RDF triples associated with temporal constraints, where, in the simplest case, a temporal constraint is a validity temporal interval. However, a temporally annotated RDF ontology may also be a set of triples connecting resources with a specific lifespan, where each of these triples is also associated with a validity temporal interval. Further, a temporal RDF ontology may be a set of triples connecting resources as they stand at specific time points. Several query languages for temporally annotated RDF have been proposed, most of which extend SPARQL [3], the most widely accepted query language for RDF, or translate to it. Some of the works provide experimental results, while the rest are purely theoretical.
We divide the reviewed works into three main categories: (a) works that have their own model theory (Section 2), (b) works that extend RDF simple entailment [2] (Section 3), and (c) works that extend RDFS entailment [2] (Section 4). Works that extend RDF simple entailment are further divided into works that directly translate into RDF and those that do not. Section 5 concludes the paper and provides a comparison of the presented approaches.

II. WORKS WITH THEIR OWN MODEL THEORY
In [4], a temporal RDF (tRDF for short) database is a set of triples of the form (s, p:{T}, o), (s, p:<n:T>, o), (s, p:[n:T], o), and (p rdfs:subPropertyOf p'), where s is a URI reference from a set U, p and p' are URI references from a set P, o is an entity from R = U ∪ L, where L is a set of literals, n is a natural number, and T is a temporal interval. Intuitively, the triple (s, p:{T}, o) indicates that the association (s, p, o) holds at every time point in T, the triple (s, p:<n:T>, o) indicates that the association (s, p, o) holds for at least n time points within T, and the triple (s, p:[n:T], o) indicates that the association (s, p, o) holds for at most n time points within T. An interpretation I of a tRDF database is a function from the set of time points to U × P × R. Satisfaction of a tRDF triple is defined in such a way that this intuitive meaning is preserved. Obviously, a tRDF database may be inconsistent due to the temporal constraints imposed on RDF triples.
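The intuitive reading of the three annotation forms can be illustrated with a minimal Python sketch. It is not the satisfaction relation of [4] itself: it assumes discrete integer time points and encodes an interpretation as a dictionary from time points to the set of plain triples holding at that point. The names `count_holding` and `satisfies`, the kind labels, and the example flight triple are all hypothetical.

```python
from typing import Dict, Set, Tuple

Triple = Tuple[str, str, str]
# Assumed encoding: each time point maps to the set of triples holding at it.
Interpretation = Dict[int, Set[Triple]]


def count_holding(interp: Interpretation, triple: Triple, start: int, end: int) -> int:
    """Number of time points in [start, end] at which `triple` holds."""
    return sum(1 for t in range(start, end + 1) if triple in interp.get(t, set()))


def satisfies(interp: Interpretation, triple: Triple, kind: str,
              start: int, end: int, n: int = 0) -> bool:
    """kind: 'always' for p:{T}, 'at_least' for p:<n:T>, 'at_most' for p:[n:T]."""
    hits = count_holding(interp, triple, start, end)
    if kind == "always":
        return hits == end - start + 1
    if kind == "at_least":
        return hits >= n
    if kind == "at_most":
        return hits <= n
    raise ValueError(f"unknown annotation kind: {kind}")


# Hypothetical data: the flight holds at time points 1, 2 and 4, but not at 3.
flight = ("LHR", "flightTo", "CDG")
interp = {1: {flight}, 2: {flight}, 3: set(), 4: {flight}}
```

With this data, the flight satisfies the "always" form over [1,2] but not over [1,4], and it satisfies both "at least 3" and "at most 3" over [1,4].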
A tRDF query over a tRDF database D is a set of triples of the form (s, p:{T}, o), (s, p:<n:T>, o), (s, p:[n:T], o), where s, p, o, T are possibly variables, with the constraint that each temporal variable appears only once. An answer to a tRDF query q is the set of all possible substitutions to the variables in q such that all triples in q, after proper substitutions, are entailed by D.
www.ijacsa.thesai.org
To efficiently answer tRDF queries, a tGRIN index structure is proposed such that temporally close resources and resources close in the tRDF graph are stored in the same index node. Query answering using the tGRIN index is shown to outperform query answering using R+-trees, SR-trees, and the ST-index, the most promising representatives of valid-time indexing methods, according to [5].
In [6], the authors extend RDF triples with an annotation from a set A, which is a partially ordered set. We consider the case that A is the set of all temporal intervals [t,t'], where t, t' are natural numbers. The inclusion ordering ⊆ is the partial ordering in this set. An annotated RDF theory (aRDF theory for short) is a finite set of triples (s, p:a, o), where s is a resource from a set R, p is a property from a set P, o is a resource in R, and a ∈ A. In addition, an aRDF theory contains statements (p, rdfs:subPropertyOf, p'), where p, p' ∈ P, and statements indicating which properties are transitive.
Let O be an aRDF theory, let p be a transitive property in O, and let r, r' ∈ R. Then, there is a p-path between r and r' if there exists a set of triples t_1 = (r, p_1:a_1, r_1), …, t_k = (r_{k-1}, p_k:a_k, r') such that for all i ∈ [1,k], (p_i rdfs:subPropertyOf* p). A p-path Q is indicated by the set of triples {t_1, …, t_k} that form the path.
An interpretation I is a mapping from the set of triples (s,p,o), where s, o ∈ R and p ∈ P, to A. An interpretation I satisfies (s, p:a, o) iff a ⊆ I(s,p,o). I satisfies an aRDF theory O iff (i) I satisfies every (s, p:a, o) ∈ O and (ii) for all transitive properties p ∈ P, for all p-paths Q = {t_1, …, t_k} in O, where t_i = (r_i, p_i:a_i, r_{i+1}), and for all a ∈ A such that a ⊆ a_i for all i ∈ [1,k], it is the case that a ⊆ I(r_1, p, r_{k+1}).
A simple aRDF query q has the form (s, p:a, o), where s, p, a, o can be variables. A_O(q) consists of all ground instances of q that are entailed by O. However, A_O(q) may contain redundant triples. For example, if (a, p:[1,100], o) ∈ A_O(q), then there is no point in including redundant triples such as (a, p:[1,10], o) in it. Answer_O(q) eliminates all redundant triples from A_O(q). A conjunctive query Q is a set of simple aRDF queries such that for any simple query q ∈ Q, there is a variable in q that appears in another simple query q' ∈ Q.
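The elimination of redundant answers described above can be sketched in a few lines of Python. This is only an illustration under the assumption that annotations are closed integer intervals ordered by inclusion; it is not the algorithm of [6], and the names `included` and `eliminate_redundant` are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]
Answer = Tuple[str, str, Interval, str]  # (s, p, [start, end], o)


def included(a: Interval, b: Interval) -> bool:
    """True if interval a is included in interval b."""
    return b[0] <= a[0] and a[1] <= b[1]


def eliminate_redundant(answers: List[Answer]) -> List[Answer]:
    """Keep only answers whose interval is not strictly included in the
    interval of another answer for the same (s, p, o)."""
    unique = list(dict.fromkeys(answers))  # drop exact duplicates, keep order
    result = []
    for s, p, a, o in unique:
        dominated = any(
            (s2, p2, o2) == (s, p, o) and a2 != a and included(a, a2)
            for s2, p2, a2, o2 in unique
        )
        if not dominated:
            result.append((s, p, a, o))
    return result


# Hypothetical answer set: the second and third entries are redundant.
answers = [
    ("a", "p", (1, 100), "o"),
    ("a", "p", (1, 10), "o"),
    ("a", "p", (50, 60), "o"),
    ("b", "p", (1, 10), "o"),
]
```

Here `eliminate_redundant(answers)` keeps (a, p:[1,100], o) and (b, p:[1,10], o) only. The quadratic scan is chosen for clarity, not efficiency.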
The authors present efficient algorithms for simple and conjunctive query answering, showing that the time complexity for answering a conjunctive query is in O((|R|^2 · |P|)^|Q|), where |Q| is the number of simple queries in Q.
The authors also provide experimental results showing the efficiency of their approach.

III. WORKS THAT EXTEND RDF SIMPLE ENTAILMENT

A. Approaches that translate to RDF
In [7], instead of having RDF triples associated with their validity temporal interval, named graphs [8] are used both for saving space and for querying the temporal RDF database using standard SPARQL. In particular, each created named graph g is associated with a temporal interval i, and all RDF triples whose validity interval is i become members of g (in this process, blank nodes are replaced by URIs). The authors introduce, through examples, a query language named τ-SPARQL, which extends the SPARQL query language for RDF graphs. Each τ-SPARQL query can be translated into a SPARQL query.
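The grouping of triples into interval-labelled named graphs can be sketched as follows. This is a minimal Python illustration, not the storage scheme of [7]: graphs are simply keyed by their interval, and the function name `to_named_graphs` and the example triples and years are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]
Interval = Tuple[int, int]


def to_named_graphs(annotated: Iterable[Tuple[Triple, Interval]]) -> Dict[Interval, Set[Triple]]:
    """Group triples by validity interval: one named graph per distinct interval."""
    graphs: Dict[Interval, Set[Triple]] = defaultdict(set)
    for triple, interval in annotated:
        graphs[interval].add(triple)
    return dict(graphs)


# Hypothetical data: two triples share one interval, a third has its own.
annotated = [
    (("ex:r1", "rdf:type", "foaf:Person"), (1900, 1950)),
    (("ex:r1", "foaf:name", "R One"), (1900, 1950)),
    (("ex:r2", "rdf:type", "foaf:Person"), (1920, 1980)),
]
graphs = to_named_graphs(annotated)
```

The interval [1900,1950] is stored once for the graph rather than once per triple, which is where the space saving comes from.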
A τ-SPARQL query that retrieves all foaf:Persons whose lifespan overlaps with Einstein's is:

Temporal relationships between named graphs, such as time:intervalOverlaps, are derived from a temporal reasoning system. Additionally, the authors propose an index structure for time intervals, called the keyTree index, assuming that triples within named graphs have indices by themselves. The proposed index improves the performance of time point queries over an in-memory ordered list that contains the intervals' start and end times.
Experimental results are provided.
In [9], the time-annotated RDF framework is proposed for the representation and management of time-series streaming data. In particular, a TA-RDF graph is a set of triples <s[t_s], p[t_p], o[t_o]>, where <s,p,o> is an RDF triple and t_s, t_p, and t_o are time points. In other words, a TA-RDF graph relates streams at certain points in time. To translate a TA-RDF graph into a regular RDF graph, a data stream vocabulary is used, where (i) dvs:belongsTo is a property indicating that a resource is a frame in a stream, (ii) dvs:hasTimestamp is a property indicating the timestamp of a frame, and (iii) dvs:Nil is a resource corresponding to the Nil timestamp.
An RDF graph G is the translation of a TA-RDF graph G_TA iff (B is the set of blank nodes):

A query language for time-annotated RDF, called TA-SPARQL, is proposed, which has a formal translation into normal SPARQL. For example, a TA-SPARQL query requesting the temperature in Chicago at sunrise of July 20th, 2008 is:

The system has been implemented on top of the Tupelo semantic middleware. However, no experimental results are provided.
In [10], the authors consider a temporal RDF graph, which is a set of triples of the form (s, p:[start,end], o), where (s,p,o) is an RDF triple and p:[start,end] is a shorthand for a URI that identifies a temporal property with base property p, beginning start, and ending end.
The authors define a simple temporal interpretation by extending an RDF simple interpretation as follows:
1. T is a subset of the set of resources.

As temporal RDF graphs are ordinary RDF graphs, they can be queried using normal SPARQL. However, it is helpful for the writer of temporal queries to have some extra syntax that enables queries to be written more compactly and hides the details of the underlying representation.
For example, a query asking for the names of the engineers who committed code to a particular software in the first half of 2008 is the following:

B. Other approaches
In [11], an N-dimensional time domain has the form 𝒯 = T_1 × … × T_N, where each T_i is a set of intervals. A multi-temporal RDF triple is defined as (s,p,o | T), where <s,p,o> is an RDF triple and T ⊆ 𝒯. Note that since T is a set, some compression is achieved in the storage of multi-temporal RDF triples.
As a query language, the authors propose T-SPARQL, an extension of SPARQL that has many features of TSQL2 [12] (a query language designed for temporal relational databases). As in TSQL2, if T is a multi-dimensional time element, the expressions VALID(T) and TRANSACTION(T) can be used to express conditions on the valid and transaction components of T.
In [13], an uncertain temporal knowledge base is a pair KB = <F, C>, where F is a set of weighted temporal RDF triples and C is a set of first-order temporal consistency constraints. In particular, a fact in F has the form p(s,o,i)_d, where p(s,o) is an RDF triple, i is a temporal interval, and d ∈ [0,1] is a confidence degree that p(s,o) is true during interval i. Additionally, a temporal consistency constraint in C has the form:
p_1(?s,?o_1,?i_1) ∧ p_2(?s,?o_2,?i_2) ∧ rel_A(?o_1,?o_2) → rel_T(?i_1,?i_2)
or the form:
p_1(?s,?o_1,?i_1) ∧ p_2(?s,?o_2,?i_2) ∧ rel_A(?o_1,?o_2) → false
where ?i_1 and ?i_2 are temporal interval variables, rel_A is an (optional) arithmetic relation, such as = and ≠, and rel_T is a temporal predicate such as overlap and before (see Allen's temporal relations among intervals [14]). For example, the fact that a player can only play for one club at a time is expressed by the constraint:
playsForClub(?s,?o_1,?i_1) ∧ playsForClub(?s,?o_2,?i_2) ∧ ?o_1 ≠ ?o_2 → disjoint(?i_1,?i_2)
A query Q is a conjunction of triples p(s,o), where s and o can be variables. To answer a query Q, all matches from the KB are collected into a set F_Q. Then, all facts possibly conflicting with them are also added to F_Q. To resolve the conflicts, a consistent subset F_{Q,C} of F_Q is selected such that the sum of the weights of the facts in F_{Q,C} is maximized. Then, the matches to Q within F_{Q,C} are returned as the answer to the query. The query answering problem is shown to be NP-hard. A scheduling algorithm for query answering is provided, as well as an efficient approximation algorithm with polynomial performance. Experimental results show the efficiency of the proposed approach.
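To make the conflict-resolution step concrete, the following Python sketch selects a maximum-weight consistent subset by exhaustive search. It is deliberately naive (exponential in the number of facts) and is not the scheduling or approximation algorithm of [13]; it only hard-codes the playsForClub disjointness constraint, assumes closed integer intervals, and uses hypothetical player and club names.

```python
from itertools import combinations
from typing import List, Sequence, Tuple

Interval = Tuple[int, int]
Fact = Tuple[str, str, str, Interval, float]  # (p, s, o, interval, weight)


def disjoint(i: Interval, j: Interval) -> bool:
    """Closed intervals are disjoint if one ends before the other starts."""
    return i[1] < j[0] or j[1] < i[0]


def consistent(facts: Sequence[Fact]) -> bool:
    """A player may play for different clubs only during disjoint intervals."""
    for f, g in combinations(facts, 2):
        same_player = f[0] == g[0] == "playsForClub" and f[1] == g[1]
        if same_player and f[2] != g[2] and not disjoint(f[3], g[3]):
            return False
    return True


def best_consistent_subset(facts: Sequence[Fact]) -> List[Fact]:
    """Brute-force search for a consistent subset of maximum total weight."""
    best: List[Fact] = []
    best_weight = 0.0
    for size in range(len(facts), 0, -1):
        for subset in combinations(facts, size):
            weight = sum(f[4] for f in subset)
            if weight > best_weight and consistent(subset):
                best, best_weight = list(subset), weight
    return best


# Hypothetical facts: clubA and clubB overlap at time point 2003.
facts = [
    ("playsForClub", "player1", "clubA", (1993, 2003), 0.9),
    ("playsForClub", "player1", "clubB", (2003, 2007), 0.6),
    ("playsForClub", "player1", "clubC", (2008, 2009), 0.5),
]
chosen = best_consistent_subset(facts)
```

In this example the clubA and clubB facts conflict, so the selected subset keeps the clubA and clubC facts, which has the higher total weight.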
In [15], the authors extend RDF with temporal features and evolution operators. In addition, in contrast to the rest of the reviewed works, they associate concepts with their lifespan. In particular, an evolution base Σ is a set of RDF triples and a mapping τ from the set of considered RDF triples and considered resources to the set of temporal intervals. In addition, Σ may contain statements of the form (c, term, c'), where term is one of the special evolution properties becomes, join, split, merge, and detach. An evolution base Σ is consistent if, for all (s,p,o) ∈ Σ, it holds that τ(s,p,o) ⊆ τ(s) and τ(s,p,o) ⊆ τ(o). Additionally, if p ∈ {type, subClassOf, subPropertyOf}, then it should hold that τ(s) ⊆ τ(o).
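The consistency condition on an evolution base can be checked mechanically. The Python sketch below reads the conditions as interval inclusion, which is an assumption of this sketch, and represents τ as two dictionaries over closed integer intervals. The function names and all example resources and years are hypothetical.

```python
from typing import Dict, Tuple

Triple = Tuple[str, str, str]
Interval = Tuple[int, int]

SCHEMA_PROPERTIES = {"type", "subClassOf", "subPropertyOf"}


def within(a: Interval, b: Interval) -> bool:
    """True if interval a is included in interval b."""
    return b[0] <= a[0] and a[1] <= b[1]


def is_consistent(tau_triple: Dict[Triple, Interval],
                  tau_resource: Dict[str, Interval]) -> bool:
    """Each triple must hold within the lifespans of its subject and object;
    for schema properties, the subject's lifespan must lie within the object's."""
    for (s, p, o), interval in tau_triple.items():
        if not (within(interval, tau_resource[s]) and within(interval, tau_resource[o])):
            return False
        if p in SCHEMA_PROPERTIES and not within(tau_resource[s], tau_resource[o]):
            return False
    return True


# Hypothetical lifespans and triples.
tau_resource = {"alice": (1980, 2020), "acme": (1990, 2010), "Company": (1900, 2100)}
good = {
    ("alice", "worksFor", "acme"): (1995, 2005),
    ("acme", "type", "Company"): (1990, 2010),
}
bad = {("alice", "worksFor", "acme"): (1985, 2005)}  # starts before acme exists
```

Here `is_consistent(good, tau_resource)` holds, while `bad` fails because the triple's interval begins before the lifespan of its object.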
To support evolution-aware querying, the authors define a navigational query language to traverse temporal and evolution edges in an evolution graph. This language is analogous to nSPARQL [16], a language that extends SPARQL with navigational capabilities based on nested regular expressions. nSPARQL uses four different axes, namely self, next, edge, and node, for navigation on an RDF graph and node label testing. The authors extend the nested regular expression constructs of nSPARQL with temporal semantics and a set of five evolution axes, namely join, split, merge, detach, and becomes, that extend the traversing capabilities of nSPARQL to the evolution edges. The extended query language is formally defined.
An example query is "who was the head of the German government before and after the unification of 1990". The query is expressed as follows:
SELECT ?Y, ?W
(?X, self::Reunified Germany/join^-1[1990]/next::head[1990], ?Y) AND
(?Z, self::Reunified Germany/next::head[1990], ?W)
The first triple finds all the heads of state of the Reunified Germany before the unification by following join^-1[1990] and then following next::head[1990]. The second triple finds the heads of state of the Reunified Germany after the unification.
No implementation results of this theory are provided.

IV. WORKS THAT EXTEND RDFS ENTAILMENT
In [17], a temporal graph is a set of temporal triples of the form (s,p,o)[t], where (s,p,o) is an RDF triple and t is a time point. Given a temporal graph G, G(t) denotes the set of RDF triples in G corresponding to time point t.
The authors define temporal entailment between two temporal graphs G, G' as follows:

It is shown that temporal entailment is NP-complete. To test temporal entailment, the authors define the slice closure of G as scl(G) = ∪_t (cl(G(t)))^t, where cl(H) is the RDFS closure [18] of an RDF graph H. The authors also extend their theory to support anonymous timestamps.
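The idea of computing a closure separately for each time slice can be sketched in Python as follows. The closure used here is a small illustrative fragment (subClassOf transitivity and type propagation only), not the full RDFS closure of [18], and the sketch assumes that each closed slice is re-stamped with its time point. The function names and example triples are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple

Triple = Tuple[str, str, str]
TemporalTriple = Tuple[Triple, int]


def closure(graph: Set[Triple]) -> Set[Triple]:
    """Simplified closure: (x, p, c), (c, subClassOf, d) => (x, p, d)
    for p in {type, subClassOf}. Only a fragment of the RDFS rules."""
    closed = set(graph)
    while True:
        derived = {
            (s, p, o2)
            for (s, p, o) in closed
            if p in ("type", "subClassOf")
            for (s2, p2, o2) in closed
            if p2 == "subClassOf" and s2 == o
        }
        if derived <= closed:
            return closed
        closed |= derived


def slice_closure(temporal_graph: Set[TemporalTriple]) -> Set[TemporalTriple]:
    """Close the triples of each time point independently, then re-stamp them."""
    slices: Dict[int, Set[Triple]] = defaultdict(set)
    for triple, t in temporal_graph:
        slices[t].add(triple)
    return {(triple, t) for t, g in slices.items() for triple in closure(g)}


# Hypothetical temporal graph: the type triple exists only at time point 1.
G = {
    (("o", "type", "c"), 1),
    (("c", "subClassOf", "d"), 1),
    (("c", "subClassOf", "d"), 2),
}
scl = slice_closure(G)
```

The derived triple (o, type, d) appears at time point 1 only, since the premise (o, type, c) is absent from the slice at time point 2.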
A query is defined as a pair (H, B ∪ A), where H and B are temporal RDF graphs without blank nodes and with some elements replaced by variables, and A is a set of usual arithmetic built-in predicates over time point variables and time points. All variables appearing in H should also appear in B. For deriving maximal validity intervals, a special structure is used. For example, a query that asks for the service providers that have web services for more than 4 consecutive years is:
(?X, interval, ?t_e − ?t_s) ← (?Y, provided by, ?X) ||?t_s, ?t_e||, ?t_e − ?t_s > 4.
No implementation of this theory is provided.
In [19], the authors extend the work in [17] and define a temporal graph as a set of temporal triples of the form (s,p,o):i, where (s,p,o) is an RDF triple and i is a temporal interval variable or a temporal interval. A temporal constraint is an expression of the form i ω i', where i, i' are temporal intervals or temporal interval variables and ω is one of the relationships of Allen's temporal interval algebra [14]. A temporal graph with temporal constraints (called a c-temporal graph) is a pair C = (G, Σ), where G is a temporal graph and Σ is a set of temporal constraints over the intervals of G.
The authors define entailment between two c-temporal graphs C, C' as follows: C |=τ(const) C' iff for each time ground instance v(C) of C, there is a time ground instance v'(C') of C' such that v(C) |=τ v'(C'). The authors define the c-slice closure of C, denoted by cscl(C), extending the definition of slice closure of [17]. It is proved that C |=τ(const) C' iff there is an interval map γ from C' to C and a mapping v s.t. v(γ(C')) is a subgraph of cscl(C). Entailment between two c-temporal graphs is shown to be NP-complete. No query language or implementation is provided.
In [20], [21], [22], the authors consider an extension of RDFS with spatial and temporal information. Here, we consider only the extension with temporal information. Assume a set D of RDF triples associated with their validity temporal interval i. Starting from D, the authors apply the inference rules A:?i, B:?i' → C:?i ∩ ?i', where A, B → C is an RDFS entailment rule [2] and ?i, ?i' are temporal interval variables, until a fixpoint is reached. Then, the temporal intervals of the same RDF triple are combined, creating maximal temporal intervals.
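The two interval operations involved, intersecting the intervals of the premises and then combining the intervals of the same triple into maximal ones, can be sketched in Python as follows. This is an illustration only, assuming closed intervals over discrete integer time points (so that adjacent intervals such as [1,4] and [5,8] are merged); the function names are hypothetical and the sketch is not the implementation of [20], [21], [22].

```python
from typing import Iterable, List, Optional, Tuple

Interval = Tuple[int, int]


def intersect(i: Interval, j: Interval) -> Optional[Interval]:
    """Intersection of two closed intervals, or None if it is empty."""
    start, end = max(i[0], j[0]), min(i[1], j[1])
    return (start, end) if start <= end else None


def coalesce(intervals: Iterable[Interval]) -> List[Interval]:
    """Merge overlapping or adjacent intervals into maximal intervals."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

For instance, if an instance-of triple holds during [1998,2009] and the relevant subclass triple during [2008,2012], the derived triple is annotated with `intersect((1998, 2009), (2008, 2012))`, that is, [2008,2009]. Coalescing the intervals [5,8], [1,3], [2,4], and [10,12] of one triple yields the maximal intervals [1,8] and [10,12].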
Based on these maximal temporal intervals, a formal extension of the SPARQL language is proposed, called SPARQL-ST, supporting however only the AND and FILTER operations.The TEMPORAL FILTER condition is precisely defined supporting all interesting conditions between temporal intervals including Allen's temporal interval relations.
An example SPARQL-ST query that returns all house members who sponsored a bill after April 2, 2008, along with the temporal interval during which the bill was sponsored, is:

SPARQL-ST has been implemented by extending a commercial relational database system, and experimental results are provided.
The proposed extension has been implemented using the forward chaining engine HFC [27], which supports arbitrary tuples, user-defined tests, and actions. Some experimental results are provided. However, no query language is provided.
In [28], a general framework for representing, reasoning, and querying annotated RDFS data is presented. The authors show how their unified reasoning framework can be instantiated for the temporal, fuzzy, and provenance domains.
The authors present a set of sound and complete inference rules of the general form: (s_1, p

An example query asking for the employees of eBay during some time period that optionally owned a car at some point during their stay is:
SELECT ?p, ?t, ?c
WHERE { (?p type ebayEmp): ?t
OPTIONAL { (?p hasCar ?c): ?t' FILTER (?t' ⊆ ?t) } }
Note that the definition of [P]_G is not based on maximal temporal intervals and, thus, all temporal intervals that satisfy the query are returned. Therefore, the authors define an ordering between substitutions: θ' ≺ θ iff (i) θ ≠ θ', (ii) domain(θ) = domain(θ'), (iii) θ(x) = θ'(x), for any non-temporal variable x, and (iv) θ'(t) ⊆ θ(t), for any temporal variable t. Then, for any θ ∈ [P]_G, remove any θ' ∈ [P]_G such that θ' ≺ θ. No implementation is provided for this theory.
The evaluation of a TGP query w.r.t. a temporal graph G and an entailment relation X is formally defined using multi-sorted first-order logic. Yet, evaluation of a TGP using this definition can be inefficient. Therefore, the authors describe an optimization.
Assume that the entailment relation X is characterized by a set of definite rules of the form A_1, …, A_n → B.
Then, the rules:

are applied until a fixpoint is reached, where x_i and y_i are time point variables. Then, based on the result, derived RDF triples are associated with their maximal validity intervals. Based on these maximal intervals, the evaluation of a TGP query is then efficiently defined.
Though the authors state that they have implemented their framework using the PostgreSQL database system, no implementation results are provided.

V. CONCLUSION-DISCUSSION
In this paper, we have reviewed models and query languages of temporally annotated RDF. Below, we compare these models and query languages on various aspects. First, we would like to state that approaches that have their own model theory or extend RDF simple entailment miss important inferences made by the works that extend RDFS entailment. For example, an object o may be an instance of class c during a temporal interval i, and the class c may be a subclass of a class c' during an interval i'. Only works that extend RDFS entailment are able to derive that o is an instance of class c' during the intersection of the intervals i and i'.
Among the works that extend RDFS entailment, the approach in [17] seems less efficient, since it computes the RDFS closure of RDF triples at each time point. Additionally, [28] considers all temporal intervals that satisfy the query and then selects the maximal ones. In contrast, [22] and [30] answer queries directly over maximal temporal intervals, achieving higher performance.
In [30], the query:
SELECT ?t, ?t' WHERE { (s,p,o) during ?t. (s',p',o') during ?t', FILTER (before(?t,?t')) }
will return the answers of [28], as well as the intervals t, t' such that t ⊆ [1998,2009], t' ⊆ [2008,2012], and t.end < t'.start. In contrast, in [30], the query:
SELECT ?t, ?t' WHERE { (s,p,o) maxinterval ?t. (s',p',o') maxinterval ?t', FILTER (before(?t,?t')) }
will return no answer. As a criticism, [30] is not able to return maximal intervals within a temporal interval of interest.
Approaches [7], [28], and [11] save some space, since they either use named graphs associated with temporal intervals or associate each RDF triple with its set of validity temporal intervals.
Specialized indices for query answering are used only in [4] and [7], while the rest of the approaches use common indices. As a final remark, we would like to state that [4] can handle some temporal constraints over RDF triples, [17] can handle anonymous timestamps, and [19] can handle anonymous temporal intervals satisfying Allen's temporal interval algebra relations.
Temporal consistency constraints are considered only in [13], which however does not answer temporal queries but only normal queries.
As a criticism of the work in [6], each RDF triple is associated with a single maximal temporal interval, while an RDF triple is normally associated with multiple maximal temporal intervals. Some of the proposed models and query languages have been implemented, as stated in the main text of the paper, and for some of them experimental results are provided.
In the future, extensions of the proposed temporal RDF query languages with features of SPARQL 1.1 [32], such as subqueries and negation, will be of great importance. For example, it will be interesting to ask for events that have not occurred simultaneously before a date and whose maximal temporal intervals always overlap after that date. Additionally, it will be interesting to ask for companies located in Crete that have exactly one manager at each point in time within a particular temporal interval of interest.
Future work also concerns a survey on spatial, fuzzy, provenance, and contextual RDF. Of course, aspects of contextual RDF can be time, space, trust, and authority.
The expression (c, becomes, c') expresses that the concept c' originates from the concept c, and it should hold that τ(c).end < τ(c').start. The expression (c, join, c') expresses that a part of concept c' born at time t comes from a part of concept c. The expression (c, split, c') expresses that a part of concept c ending at time t becomes a part of a new concept c'. The expression (c, merge, c') indicates that a part of concept c ending at time t becomes part of an existing concept c'. The expression (c, detach, c') indicates that the new concept c' is formed at time t with at least one part from c.
2. NT is a value representing no time.
3. The set of properties contains tb:property, tb:begin, and tb:end.
4. BP is a subset of resources, called the set of base properties.
(s_1, p_1, o_1): ?t_1, …, (s_n, p_n, o_n): ?t_n, {(s_1, p_1, o_1), …, (s_n, p_n, o_n)} ⊢_ρRDF (s, p, o)
An extension of SPARQL is presented for querying an annotated RDF graph. A basic annotated pattern is an expression (s,p,o):t, where s, p, o, t can be variables. Let P be a basic annotated pattern and G be a temporal graph. The authors define the evaluation [P]_G as the list of substitutions that are solutions of P, i.e., [P]_G = {θ | G entails θ(P)}. Based on [P]_G, the evaluations [P AND P']_G, [P UNION P']_G, [P FILTER R]_G, and [P OPTIONAL P'[R]]_G are formally defined, where R is a filter expression.
In [31], a temporal graph G is a set of temporal triples (s,p,o)[t,t'], where (s,p,o) is an RDF triple and [t,t'] is its corresponding validity temporal interval. The semantics of a temporal graph G, assuming an entailment relation X (such as RDF, RDFS, and OWL2 RL/RDF [31] entailment), are formally defined using multi-sorted first-order logic. A basic graph pattern (BGP) is a set of triples (s,p,o), where s, p, o can be variables. A temporal group pattern (TGP) is an expression defined inductively, as follows: BGP, P_1 and P_2 are TGPs, R is a built-in expression, and t_1, t_2, and t_3 are either time points or variables. Note that a TGP query is an extension of a SPARQL query. For example, a TGP query that retrieves all events z in London having at least one time point in common with Oktoberfest is:

t' ⊆ [2008,2012], and t.end < t'.start. In contrast, in [30], the query:
SELECT ?t, ?t' WHERE { (s,p,o) maxinterval ?t. (s',p',o') maxinterval ?t', FILTER (before(?t,?t')) }