Data integration provides a unified view of data by combining different data sources. In today’s multi-disciplinary and collaborative research environments, data is often produced and consumed by various means, multiple researchers operate on the data in different divisions to satisfy various research requirements, and using different query processors and analysis tools. This makes data integration a crucial component of any successful data intensive research activity. The fundamental difficulty is that data is heterogeneous not only in syntax, structure, and semantics, but also in the way it is accessed and queried. We introduce QUIS (QUery In-Situ), an agile query system equipped with a unified query language and a federated execution engine. It is capable of running queries on heterogeneous data sources in an in-situ manner. Its language provides advanced features such as virtual schemas, heterogeneous joins, and polymorphic result set representation. QUIS utilizes a federation of agents to transform a given input query written in its language to a (set of) computation models that are executable on the designated data sources. Federative query virtualization has the disadvantage that some aspects of a query may not be supported by the designated data sources. QUIS ensures that input queries are always fully satisfied. Therefore, if the target data sources do not fulfill all of the query requirements, QUIS detects the features that are lacking and complements them in a transparent manner. QUIS provides union and join capabilities over an unbound list of heterogeneous data sources; in addition, it offers solutions for heterogeneous query planning and optimization. In brief, QUIS is intended to mitigate data access heterogeneity through query virtualization, on-the-fly transformation, and federated execution. It offers in-Situ querying, agile querying, heterogeneous data source querying, unifeied execution, late-bound virtual schemas, and Remote execution.
Use and reproduction:
All rights reserved