IJCAI 2018 Tutorial

Ontology-based Data Access: Theory and Practice

I. Short abstract

Ontology-Based Data Access (OBDA) is a semantic technology for accessing various data sources through the mediation of an ontology, which provides a conceptual view of the domain, and a declarative mapping between the data and the ontology. The tutorial will cover the foundations of OBDA and provide an overview of more recent developments in the area: mapping engineering, ontology approximation, ontology-based integration and SPARQL beyond CQs.

II. Longer abstract

Ontology-Based Data Access (OBDA) has been developed since the mid-2000s as a semantic technology for accessing various data sources through the mediation of an ontology and a declarative mapping between the data and the ontology. OBDA users do not have to know the detailed organisation of the data sources. Instead, they can express their information needs as queries over the conceptual model, which is provided by the ontology. Using knowledge representation and automated reasoning techniques, an OBDA system will then reason about the ontology and mappings and reformulate these information needs in terms of appropriate calls to services provided by the data sources.
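The reformulation step described above can be illustrated with a minimal sketch. It is not how Ontop or any other OBDA system is implemented; the class names, the two tables, the IRI prefix 'person/' and the helper functions are all hypothetical, and the ontology is restricted to simple subclass axioms. The sketch rewrites a query for the instances of a class into a union over its subclasses and then unfolds that union into SQL by means of the mapping.

```python
import sqlite3

# Hypothetical ontology: (subclass, superclass) axioms.
SUBCLASS_OF = [("Professor", "Teacher"), ("Lecturer", "Teacher")]

# Hypothetical mapping: each mapped class is populated by an SQL query
# that builds individual identifiers from the rows of a table.
MAPPING = {
    "Professor": "SELECT 'person/' || id AS x FROM professor",
    "Lecturer": "SELECT 'person/' || id AS x FROM lecturer",
}


def subclasses(cls):
    """Return cls together with all its direct and indirect subclasses."""
    found = {cls}
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for sub, sup in SUBCLASS_OF:
            if sup == current and sub not in found:
                found.add(sub)
                frontier.append(sub)
    return found


def rewrite_to_sql(cls):
    """Rewrite the query cls(x) into SQL over the data source alone."""
    parts = [MAPPING[c] for c in sorted(subclasses(cls)) if c in MAPPING]
    return " UNION ".join(parts)


def answer(conn, cls):
    """Answer cls(x) by running the rewritten SQL query."""
    sql = rewrite_to_sql(cls)
    if not sql:  # no mapped class contributes any answers
        return []
    return sorted(row[0] for row in conn.execute(sql))


def demo_connection():
    """Create a small in-memory database used only for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE professor (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE lecturer (id INTEGER PRIMARY KEY, name TEXT);
        INSERT INTO professor VALUES (1, 'Ada');
        INSERT INTO lecturer VALUES (2, 'Bob');
        """
    )
    return conn


if __name__ == "__main__":
    print(answer(demo_connection(), "Teacher"))
```

In this sketch the user asks for Teacher, a class that no table stores directly; the ontology supplies the subclasses and the mapping supplies the SQL, so the data source never has to be aware of the ontology.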

The tutorial will cover the basic ingredients of the traditional OBDA setup for relational databases: the ontology language OWL 2 QL and the mapping language R2RML. These two languages are designed so that answering conjunctive queries (CQs) over an OBDA specification, which consists of an ontology, a mapping and a data source, can be reduced to answering queries over the relational data source alone. It became clear, however, that the exponentially large reformulations of CQs, which exist in theory, cannot be used directly in practice. So, in the first part of the tutorial, we describe how the structure of mappings and database integrity constraints can be exploited to make OBDA work in practice.

In the second part of the tutorial, we provide an overview of more recent developments in OBDA that address the shortcomings of the traditional setup. First, the expressive power of OWL 2 QL is limited to ensure that all CQs are first-order rewritable. More expressive ontologies, however, can also be dealt with by using mappings and approximation. Second, the SameAs construct, which is crucial for ontology-based data integration, can be handled by the rewriting approach if the identity assertions are structured appropriately. Third, the SPARQL query language provides means of dealing with incomplete data and other features beyond simple CQs. We discuss how these features of SPARQL can be implemented in the context of OBDA over relational databases. Then, we return to the question of whether the exponential blowup of rewritings is unavoidable and consider non-recursive datalog rewritings (equivalently, SQL queries with views) as an alternative target language for rewritings. Finally, we briefly mention other promising directions for future research (in particular, access to spatial and temporal data).
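To give a feel for the blowup mentioned above, here is a toy sketch under strong simplifying assumptions: the ontology contains only axioms of the form "Bi is a subclass of Ai", the query is a conjunction of atoms A0, ..., A(n-1), and variables are omitted for brevity. The function names and the rule encoding are hypothetical and chosen for illustration only. A rewriting into a union of CQs has to pick one alternative per atom, whereas a non-recursive datalog rewriting can define an auxiliary predicate per atom once and reuse it.

```python
from itertools import product


def ucq_rewriting(atoms, alternatives):
    """Union of CQs: one CQ for every way of choosing an alternative per atom."""
    choices = [alternatives[atom] for atom in atoms]
    return [tuple(cq) for cq in product(*choices)]


def ndl_rewriting(atoms, alternatives):
    """Non-recursive datalog: one auxiliary predicate per atom plus a goal rule.

    Each rule is encoded as a pair (head, body), with body a tuple of predicates.
    """
    rules = []
    for atom in atoms:
        for alt in alternatives[atom]:
            rules.append((f"ext_{atom}", (alt,)))
    rules.append(("goal", tuple(f"ext_{atom}" for atom in atoms)))
    return rules


def toy_query(n):
    """Build n atoms, each with two alternatives: Ai itself and its subclass Bi."""
    atoms = [f"A{i}" for i in range(n)]
    alternatives = {atom: [atom, f"B{i}"] for i, atom in enumerate(atoms)}
    return atoms, alternatives


if __name__ == "__main__":
    atoms, alternatives = toy_query(10)
    print(len(ucq_rewriting(atoms, alternatives)))  # number of CQs in the union
    print(len(ndl_rewriting(atoms, alternatives)))  # number of datalog rules
```

For n atoms with two alternatives each, the union contains 2^n CQs while the datalog program has 2n + 1 rules. This only shows that the target language matters in a simple case; whether succinct rewritings exist in general is the subject of the complexity results covered in the tutorial.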

The tutorial will also have a practical session, so a laptop may be required. We will use the Protégé ontology editor with the Ontop plugin for the hands-on session.

III. Bio

Dr Guohui Xiao (http://www.inf.unibz.it/~gxiao) is Assistant Professor at the KRDB Research Centre for Knowledge and Data, Free University of Bozen-Bolzano, Italy. He obtained his PhD in 2014 at Vienna University of Technology. His research focuses on the theory, optimization, and application of the OBDA technology. He is currently leading the development of the Ontop OBDA system.

Dr Roman Kontchakov (http://www.dcs.bbk.ac.uk/~roman) is Senior Lecturer in the Department of Computer Science and Information Systems at Birkbeck, University of London. He received his PhD in 2004 and has since worked in various areas of Knowledge Representation and Reasoning. Dr Kontchakov has been developing Ontop since 2012.

IV. Detailed outline

First Session

1. Semantic Web standards for OBDA [slides]
– RDF (Data Model)
– OWL 2 QL (Ontology Language)
– SPARQL (Query Language)
– R2RML (Mapping Language)

2. Hands-on with Ontop [slides]

Second Session

3. Use Cases [slides]

4. Query answering in OBDA [slides]

5. Extensions
5.1. Approximating Expressive Ontologies [slides]
5.2. Dealing with Identity in Ontology-based Data Integration [slides]
5.3. Ontology-Mediated Query Answering and Circuit Complexity [slides]

6. Latest Advances and Other Important Directions [slides]

V. References

1. Guohui Xiao, Diego Calvanese, Roman Kontchakov, Domenico Lembo, Antonella Poggi, Riccardo Rosati, and Michael Zakharyaschev. Ontology-Based Data Access: A Survey. In: IJCAI-18 – July 13-19 2018, Stockholm, Sweden. 2018.

2. D. Calvanese, B. Cogrel, S. Komla-Ebri, R. Kontchakov, D. Lanti, M. Rezk, M. Rodriguez-Muro and G. Xiao. Ontop: Answering SPARQL queries over relational databases. Semantic Web 8(3): 471-487, 2017.

3. E. Botoeva, D. Calvanese, V. Santarelli, D. Fabio Savo, A. Solimando and G. Xiao. Beyond OWL 2 QL in OBDA: Rewritings and Approximations. AAAI 2016: 921-928.

4. G. Xiao, D. Hovland, D. Bilidas, M. Rezk, M. Giese and D. Calvanese. Efficient ontology-based data integration with canonical IRIs. In ESWC, 2018.

5. R. Kontchakov, M. Rezk, M. Rodriguez-Muro, G. Xiao and M. Zakharyaschev. Answering SPARQL Queries over Databases and under OWL 2 QL Entailment Regime. International Semantic Web Conference (ISWC), 2014.

6. M. Rodriguez-Muro, R. Kontchakov and M. Zakharyaschev. Ontology-Based Data Access: Ontop of Databases. International Semantic Web Conference (ISWC), 2013.

7. M. Bienvenu, S. Kikot, R. Kontchakov, V. Podolskii and M. Zakharyaschev. Ontology-Mediated Queries: Combined Complexity and Succinctness of Rewritings via Circuit Complexity. JACM, 2018.

8. A. Poggi, D. Lembo, D. Calvanese, G. De Giacomo, M. Lenzerini and R. Rosati. Linking data to ontologies. J. Data Semantics, 2008.