Current Research Project

Title / Titel
Perm
Original title / Originaltitel
Provenance extension of the relational model
Summary / Zusammenfassung
Data provenance is information that describes how a given data item was produced. The provenance of a data item includes the source and intermediate data as well as the transformations involved in producing it. In the context of a relational database, the source and intermediate data items are relations, tuples and attribute values. The transformations are SQL queries and/or functions on the relational data items.

Existing approaches capture provenance information by extending the underlying data model. This has the intrinsic disadvantage that the provenance must be stored and accessed using a different model than the actual data. In the Perm project we try to overcome this disadvantage by developing a novel provenance management system called Perm (Provenance Extension of the Relational Model) that is capable of computing, storing and querying provenance for relational databases. Perm generates provenance by rewriting transformations (queries). For a given query q, Perm generates a single query that produces the same result as q, extended with additional attributes that store provenance data. An important advantage of this approach is that the rewritten query is itself a regular relational algebra statement. Thus, we can use the full expressive power of SQL to, e.g., query the provenance of data items from the result of the original query, store the rewritten query as a materialized view, and apply standard query optimization techniques to its execution. Perm can be used both to compute provenance on the fly (i.e., at query time) and to store provenance persistently for future access. Perm also supports external provenance and incremental provenance computation that reuses stored provenance information.
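The rewriting idea can be illustrated with a small sketch. It is not Perm's implementation or syntax: the rewrite is written by hand, the shop/sale schema, the prov_* attribute names and the sales_prov view are hypothetical, and SQLite (through Python's sqlite3 module) stands in for the database system. The sketch only shows that a rewritten query can return the original result together with the contributing source tuples, and that its output can be queried with ordinary SQL.

```python
import sqlite3

# Hypothetical demo schema and data, not taken from the Perm project.
SCHEMA = """
CREATE TABLE shop(name TEXT, city TEXT);
CREATE TABLE sale(shop TEXT, item TEXT, price INTEGER);
INSERT INTO shop VALUES ('A', 'Zurich'), ('B', 'Basel');
INSERT INTO sale VALUES ('A', 'pen', 3), ('A', 'ink', 7), ('B', 'pen', 4);
"""

# Original query q: total sales per shop.
Q = """
SELECT s.name AS name, SUM(l.price) AS total
FROM shop s JOIN sale l ON s.name = l.shop
GROUP BY s.name
"""

# Hand-written rewrite of q. Each result row of q is repeated once per
# combination of source tuples that contributed to it; the prov_* columns
# (an assumed naming scheme) carry those source tuples.
Q_PROV = f"""
SELECT q.name AS name, q.total AS total,
       s.name AS prov_shop_name, s.city AS prov_shop_city,
       l.shop AS prov_sale_shop, l.item AS prov_sale_item,
       l.price AS prov_sale_price
FROM ({Q}) AS q
JOIN shop s ON s.name = q.name
JOIN sale l ON l.shop = s.name
"""


def run(conn, sql, params=()):
    """Execute a query and return its rows in sorted order."""
    return sorted(conn.execute(sql, params).fetchall())


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

original = run(conn, Q)
with_provenance = run(conn, Q_PROV)

# The rewritten query is ordinary SQL, so it can be stored and queried
# like any other relation. SQLite has no materialized views, so a plain
# view stands in for one here.
conn.execute(f"CREATE VIEW sales_prov AS {Q_PROV}")
contributors = run(
    conn,
    "SELECT prov_sale_item, prov_sale_price FROM sales_prov WHERE name = ?",
    ("A",),
)

print(original)
print(with_provenance)
print(contributors)
```

Projecting the rewritten result onto the original attributes (name, total) and removing duplicates gives back the result of q, while the last query retrieves the sale tuples that contributed to the total of shop A.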
Keywords / Suchbegriffe
Databases, Provenance, Query Rewriting
Project Leadership and Contacts / Projektleitung und Kontakte
Boris Glavic (Project Leader), glavic@ifi.uzh.ch
Funding Source(s) / Unterstützt durch
Universität Zürich (position pursuing an academic career)
 
In Collaboration with / In Zusammenarbeit mit
Gustavo Alonso, Systems Group, Institute for Computer Science, ETH Zürich, Switzerland
Duration of Project / Projektdauer
Feb 2008 to Dec 2010