Besides integrating large-scale data sources, SemaGrow also tackles the integration of heterogeneous data sources. This involves efficiently applying resource mappings within the query execution engine of the SemaGrow Stack, as well as a system of tools that complement the SemaGrow Stack and provide the mappings to be applied. While the first period of the project focused on developing technologies that support robust alignment over heterogeneous knowledge models and languages (MAPLE, Lime) and on refining the SYNTHESIS system for automatic alignment, the second period focused on robustness and integration. Specifically, SYNTHESIS has been provided with an appropriate data access layer, moved to a thread-based architecture, and equipped with ontology modularization methods to improve scalability. Furthermore, the architecture of the VocBench environment for manual alignment and of its backing framework, Semantic Turkey, has been reworked to allow for multi-project and multi-graph management, dynamic context injection inside the services, and seamless navigation of local and Web data, laying common ground for a user-centric ontology alignment experience. Semantic Turkey also features a new Linked Data explorer that will be integrated into VocBench during the final period of the project.
Finally, during its second period the project experimented with the CODA system for knowledge extraction and transformation, applying it to a datasheet import scenario. The objective is to strike a balance between enforcing conventions (which affords ease of use) and the capacity to deal with complexity. This effort resulted in Sheet2RDF, a datasheet-to-RDF import and transformation system. By following a “convention over configuration” approach, Sheet2RDF ensures that “evident” and trivial imports can be handled almost automatically, whereas more complex transformations can still be managed through the more powerful capabilities of CODA.
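The “convention over configuration” idea can be illustrated with a minimal sketch. It does not reproduce the actual Sheet2RDF or CODA interfaces: the namespace `BASE`, the function `sheet_to_ntriples`, the rule that the first column identifies the subject, and the `overrides` mechanism are all assumptions introduced for illustration. By convention, every column header is read as a property and every cell as a plain literal; a column needs an explicit rule only when that default is not adequate.

```python
from urllib.parse import quote

# Hypothetical namespace, used only for this illustration.
BASE = "http://example.org/"
SKOS_BROADER = "http://www.w3.org/2004/02/skos/core#broader"


def escape_literal(value):
    """Escape backslashes and double quotes for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def sheet_to_ntriples(rows, overrides=None):
    """Convert sheet rows (the first row is the header) to N-Triples lines.

    Assumed convention: the first column identifies the subject, and every
    other header names a property in BASE whose cell value is a plain literal.
    Assumed configuration: `overrides` maps a header to a function
    (subject, cell) -> list of triples, for columns needing special handling.
    """
    overrides = overrides or {}
    header, *data = rows
    lines = []
    for row in data:
        subject = f"<{BASE}{quote(row[0])}>"
        for name, cell in zip(header[1:], row[1:]):
            if cell == "":
                continue  # empty cells produce no triple
            if name in overrides:
                lines.extend(overrides[name](subject, cell))
            else:
                prop = f"<{BASE}{quote(name)}>"
                lines.append(f'{subject} {prop} "{escape_literal(cell)}" .')
    return lines


def broader_as_link(subject, cell):
    """Explicit rule: treat the cell as a resource reference, not a literal."""
    return [f"{subject} <{SKOS_BROADER}> <{BASE}{quote(cell)}> ."]


rows = [
    ["id", "label", "broader"],
    ["c1", "Wheat", "c0"],
]
triples = sheet_to_ntriples(rows, overrides={"broader": broader_as_link})
for line in triples:
    print(line)
```

In this sketch the `label` column is imported without any configuration, while the `broader` column requires one explicit rule; this mirrors the division of labour described above between trivial imports and transformations that need a more expressive mechanism.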
Both the alignment and the knowledge acquisition systems developed represent an “offline” contribution to the stack, i.e., integration is guaranteed at the level of standards compliance. We detail here the flow of data and the standards adopted: