A framework for querying heterogeneous data sources driven by context
The following image provides an overview of the entire system, which aims to let the user retrieve the information s/he is interested in from a set of heterogeneous data sources. The user is not necessarily aware of the richness and heterogeneity of the data sources, and the information request s/he performs is driven by her/his current context, which keeps the entire operation as simple as possible.

Overview
The scenario of the framework consists of several data sources, not necessarily implemented
with the same technology (relational databases, XML documents, ontologies, etc.), that are used
as the sources of information to be queried. To obtain a unified view of all the available
information, a global view of the schemas of the data sources must be derived (using a Global-as-View,
or GAV, approach). This global view is expressed in an internal representation format
suited to the entire methodology and able to offer the necessary support and features for the
context-aware management of the information.
The methodology defines a context-aware association between (a) the possible contexts a user may be in and
(b) the portion of the information s/he would be interested in, with respect to the global schema. This
association takes the form of a "context--data portion dictionary" that is consulted when the user executes a
"context-aware" query.
The user query is processed and a set of queries is reformulated against the original schemas and formats
of the data sources, exploiting the meta-information derived by the wrappers and the schema integration
phase (intensional integration).
The data provided by the sources, in their (heterogeneous) formats, are then integrated and sent to the user.
Again, the meta-information from the wrapping and schema integration phases is used to perform the fusion
of the data (extensional integration).
Elements
Internal Representation (IR) Format
We are investigating different solutions for the pivot format used to represent the global schema, to extract and integrate the data source schemata, and to perform the association between contexts and data portions. At present, the formalisms under consideration are: (a) relational databases, (b) Extended SDR Networks, and (c) ontologies.
Domain Model
To manage the information of a given application scenario, a domain model is used; it contains, in the same formalism adopted for the Internal Representation, the knowledge about the "world".
Wrappers
These modules extract the schema of a data source, translating it from its native format into the Internal Representation (IR) format adopted within the framework.
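As an illustration only, a wrapper for a relational source might look like the sketch below. It assumes an SQLite database and a deliberately simple internal representation (a dictionary from table names to column names). The function name `extract_schema` and the sample table are hypothetical, and the actual IR format of the framework is still under investigation.

```python
import sqlite3
from typing import Dict, List


def extract_schema(conn: sqlite3.Connection) -> Dict[str, List[str]]:
    """Return {table name: [column names]} for every user table in the source."""
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    schema: Dict[str, List[str]] = {}
    for (table,) in tables:
        # PRAGMA table_info yields rows of (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = [column[1] for column in columns]
    return schema


# Hypothetical source, created in memory only for this example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE restaurant (id INTEGER PRIMARY KEY, name TEXT, address TEXT)")
source_schema = extract_schema(conn)
```

Wrappers for XML documents or ontologies would expose the same kind of output while reading a different native format.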
Data Sources Schema Integrator
The internal representations of the schemas of the available data sources need to be combined to provide an integrated global view of the available information. This operation is performed with the support of a domain model that expresses the reality of the working scenario.
Context-Aware Query
The user identifies her/his current context, either explicitly by selecting options or implicitly when the parameters can be sensed autonomously by the device (e.g., the location by means of a GPS), and requests the data deemed important in that situation.
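The lookup from a context to the data portion of interest, through the "context--data portion dictionary" described in the Overview, could be sketched as follows. This is a minimal sketch under several assumptions: a context is modelled as a set of (dimension, value) pairs, a data portion as a list of global-schema attributes, and the most specific matching context wins. All names and sample entries are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

# Assumption: a context is a set of (dimension, value) pairs.
Context = FrozenSet[Tuple[str, str]]


def make_context(**dimensions: str) -> Context:
    return frozenset(dimensions.items())


# Hypothetical dictionary: context -> portion of the global schema.
CONTEXT_DICTIONARY: Dict[Context, List[str]] = {
    make_context(role="tourist", location="museum"): ["exhibit.title", "exhibit.room"],
    make_context(role="tourist"): ["restaurant.name", "restaurant.address"],
}


def data_portion(current: Context) -> List[str]:
    """Return the portion tied to the most specific stored context that
    the current context satisfies (a matching policy assumed here)."""
    matches = [stored for stored in CONTEXT_DICTIONARY if stored <= current]
    if not matches:
        return []
    return CONTEXT_DICTIONARY[max(matches, key=len)]
```

Whether the context values come from explicit user choices or from device sensors does not matter to the lookup: both end up as entries of the same context.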
Query Conversion and Distribution
A single context-aware view is associated with each context and is expressed with respect to the derived global schema. To retrieve the data from the real data sources, this view must be converted into queries formulated on the specific sources, in their native languages.
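Since this module is still to be done, the following is only a sketch of one possible approach, not the framework's implementation. It assumes that schema integration has produced, for each global relation, a mapping to the native relation and attribute names of every source, and it rewrites a simple projection into SQL or XPath. The mapping contents, the source names and all function names are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# Pairs of (global attribute name, native attribute name).
AttrPairs = List[Tuple[str, str]]


def to_sql(relation: str, attrs: AttrPairs) -> str:
    columns = ", ".join(f"{native} AS {global_name}" for global_name, native in attrs)
    return f"SELECT {columns} FROM {relation}"


def to_xpath(relation: str, attrs: AttrPairs) -> str:
    return " | ".join(f"{relation}/{native}" for _, native in attrs)


TRANSLATORS: Dict[str, Callable[[str, AttrPairs], str]] = {"sql": to_sql, "xpath": to_xpath}

# Hypothetical meta-information produced by the wrappers and schema integration.
MAPPINGS = {
    "restaurant": {
        "sql_source": {"language": "sql", "relation": "ristoranti",
                       "attributes": {"name": "nome", "address": "indirizzo"}},
        "xml_source": {"language": "xpath", "relation": "/guide/place",
                       "attributes": {"name": "title", "address": "addr"}},
    }
}


def convert(global_relation: str, global_attrs: List[str]) -> Dict[str, str]:
    """Rewrite a projection on the global schema into one native query per source."""
    queries: Dict[str, str] = {}
    for source, mapping in MAPPINGS[global_relation].items():
        pairs = [(a, mapping["attributes"][a]) for a in global_attrs
                 if a in mapping["attributes"]]
        if pairs:  # skip sources that hold none of the requested attributes
            queries[source] = TRANSLATORS[mapping["language"]](mapping["relation"], pairs)
    return queries
```

A real view would also carry selections and joins, which this projection-only sketch leaves out.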
Data Sources Information Integrator
Once the data sources have been individually queried, each in its own native format, the retrieved information needs to be merged and integrated in order to provide the user with a single block of information. This phase requires both data conversion and data integration.
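As a rough illustration of the integration step (this module is also still to be done), the sketch below assumes that each source's result has already been converted into records that use global attribute names, and it merges records sharing the same key. The key choice, the "first non-missing value wins" policy and the sample data are assumptions made only for this example.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]


def fuse(results: List[List[Record]], key: str) -> List[Record]:
    """Merge records from several sources that share the same key value."""
    merged: Dict[Any, Record] = {}
    for records in results:
        for record in records:
            target = merged.setdefault(record[key], {})
            for attribute, value in record.items():
                # Assumed policy: keep the first non-missing value seen.
                if target.get(attribute) is None:
                    target[attribute] = value
    return list(merged.values())


# Hypothetical results, already expressed with global attribute names.
sql_rows = [{"name": "Da Mario", "address": "Via Roma 1", "phone": None}]
xml_rows = [{"name": "Da Mario", "phone": "555-0100"},
            {"name": "Il Faro", "address": "Lungomare 3"}]
fused = fuse([sql_rows, xml_rows], key="name")
```

Conflicting non-missing values would need an explicit resolution rule, for instance one derived from the domain model. This sketch does not attempt that.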
Open projects
Part of the framework has already been implemented, while other portions still need to be developed. The elements that still need to be analyzed, designed and implemented include:
- Data Sources Schema Integrator: preliminary prototype versions are available; they need to be tested and re-engineered.
- Query Conversion and Distribution: To be done
- Data Sources Information Integrator: To be done
For more information on the available projects and theses, send an email.