Date: Thu 30 May 2002
Time: 15.00
Place: Aula 218

Speaker: Klaus Dittrich

Abstract. Data available on-line today is spread across heterogeneous data sources such as traditional databases and repositories of various forms containing unstructured and semistructured data. The "technical" availability of this data alone is not sufficient for making meaningful use of existing information, so the problem of effectively and efficiently accessing and querying heterogeneous data is an important research issue. One popular approach is to integrate the data sources and offer users an a priori defined global schema. Alternatively, some approaches provide tools that let users define the query schema themselves. We propose a new approach in which heterogeneous sources can be queried through a unified interface and the underlying sources are integrated by means of a query language only. We present extensions to OQL that allow querying structurally heterogeneous data, i.e. structured, semistructured and unstructured data alike, and integrating data on the fly. We also present some details of query preprocessing and show how techniques from database and information retrieval systems can be combined.