Results

Research and development of the TYPHON technologies was completed in December 2020, and the results have been published as open source at the project GitHub Repository. All of the new components, together with the integrated platform for developing and deploying Big Data applications that utilise hybrid data, are available, including an open source introductory Tutorial.

In addition, the following technical reports are available for download. They provide details on each of the TYPHON components and on the fully integrated platform.

You can also download technical papers presented at major conferences and published journal articles by visiting the Papers tab at the top of the page.

TyphonML: Hybrid Polystore Modelling Language

This document presents the final version of the TyphonML language, a new language for modelling in a homogeneous manner (and by abstracting the specificities of the underlying technologies) the data to be stored in polystores consisting of both relational and NoSQL databases.

Text Modelling Extension

Text modelling is used to identify data types and their relations in text. It is a basis for text processing components and is an essential step in text platforms such as UIMA. We present a type system required to support TyphonML, i.e. a new modelling language to support the design of hybrid polystores. We also evaluate our type system by developing text mining pipelines that build upon it in order to extract various types of information from large-scale collections, such as the base forms of words (i.e. stems) and the polarity of text.

TyphonML Modelling Tools

This document presents the tools that support the specification of TyphonML models together with their evolution. The tools permit developers to specify conceptual entities by abstracting the specificities of the underlying technologies. Mappings to specific database systems make it possible to manage polystores consisting of both relational and NoSQL databases. The document contains concrete examples that users can follow to use the proposed TyphonML modelling tools.

TyphonML Model Analysis and Reasoning Tools

In enterprise applications, the use of databases can affect important quality parameters such as performance and maintainability. A smell in a database system indicates a violation of recommended best practices and can negatively affect the quality of the software system concerned. For this reason, detecting smells early, while a database schema is still being developed, can lead to higher-quality software systems. This document presents tools that have been developed in the context of WP2 to support the specification and detection of TyphonML smells.
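
As an illustration only, the following Python sketch shows what rule-based smell detection over a schema can look like. It does not use the TyphonML tools or their smell catalogue: the schema representation (a dictionary of entity names to attribute lists), the two rules, and the `MAX_ATTRIBUTES` threshold are all assumptions made for this example.

```python
# Hypothetical threshold; not taken from the TyphonML smell definitions.
MAX_ATTRIBUTES = 3


def detect_smells(schema):
    """Return (entity, smell) pairs for a schema given as {entity: [attribute, ...]}."""
    smells = []
    for entity, attributes in schema.items():
        # Assumed rule 1: an entity with many attributes may be doing too much.
        if len(attributes) > MAX_ATTRIBUTES:
            smells.append((entity, "too many attributes"))
        # Assumed rule 2: every entity should expose an "id" attribute.
        if "id" not in attributes:
            smells.append((entity, "no identifier attribute"))
    return smells


example_schema = {
    "User": ["id", "name"],
    "Order": ["date", "total", "status", "notes"],
}
smells_found = detect_smells(example_schema)
```

Running such checks on every schema change is one way to surface smells before the schema is deployed.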

TyphonDL: Hybrid Polystore Deployment Language

This report presents the final version of the TyphonDL language. In particular it presents the language concepts and architecture, its meta-model as well as its implementation. TyphonDL – Hybrid Polystore Deployment Language – is a modular language aiming to bridge the conceptual gap between high-level polystore design models (expressed in TyphonML and developed in Work Package 2), and low-level virtual image configuration and assembly tools such as Docker and Kubernetes.

TyphonML to TyphonDL Model Transformation Tools

This document presents the techniques and tools that support the specification of TyphonML models in a way that is consistent with the requirements the developer wants the modelled system to meet. The TyphonML language and its supporting tools were enhanced to enable the generation of TyphonDL-based deployment configurations that satisfy the functional and non-functional requirements defined at the TyphonML level.

Optimized Hybrid Polystore VM Assembly Tools

This report presents the tools that generate configuration scripts to assemble hybrid polystore VMs from source TyphonDL models. The modelling tools supporting the creation of TyphonDL models are also presented. VM assembly is optimized by taking into account both the characteristics of the modelled polystores and the deployment context, e.g., hardware configuration, costs, workloads, performance, and storage size. The produced virtual machines are directly deployable on cloud infrastructure.

TyphonQL: Hybrid Polystore Query Language Compilers and Interpreters

TyphonQL is a new query language that operates strictly at the level of the conceptual data model. Queries expressed in this language are partitioned over different database back-ends, and the results of those partial native queries are recombined to obtain the end result of the query. This document presents the final version of the TyphonQL compilers and interpreters that translate TyphonQL queries to native queries on top of SQL databases, MongoDB document stores, Cassandra key-value stores, and Neo4j graph databases. Furthermore, this deliverable details the TyphonQL aggregation framework, the TyphonQL type checker, and the IDE.
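
The partition-and-recombine idea can be illustrated with a small Python sketch. It is not TyphonQL and uses no Typhon API: the entity names, the `BACKEND_OF` mapping, and the in-memory `STORES` are hypothetical stand-ins for a conceptual model and its native back-ends.

```python
from collections import defaultdict

# Hypothetical mapping from conceptual entities to back-ends.
BACKEND_OF = {"User": "relational", "Review": "document"}

# In-memory stand-ins for the native stores (assumed data).
STORES = {
    "relational": {"User": [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Bob"}]},
    "document": {
        "Review": [
            {"user": 1, "text": "great"},
            {"user": 1, "text": "ok"},
            {"user": 2, "text": "meh"},
        ]
    },
}


def partition(entities):
    """Group the entities of a conceptual query by the back-end that stores them."""
    plan = defaultdict(list)
    for entity in entities:
        plan[BACKEND_OF[entity]].append(entity)
    return dict(plan)


def run_partial(backend, entity):
    """Stand-in for executing one native query on one back-end."""
    return STORES[backend][entity]


def users_with_reviews():
    """Recombine partial results by joining users and reviews on the user id."""
    plan = partition(["User", "Review"])
    partial = {
        entity: run_partial(backend, entity)
        for backend, entities in plan.items()
        for entity in entities
    }
    reviews_by_user = defaultdict(list)
    for review in partial["Review"]:
        reviews_by_user[review["user"]].append(review["text"])
    return [
        {"name": user["name"], "reviews": reviews_by_user[user["id"]]}
        for user in partial["User"]
    ]
```

The caller only names conceptual entities; which store answers each part of the query is decided by the mapping, which is the property the paragraph above describes.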

Data Access Layer Generator

The Typhon project provides diverse tools and languages that abstract over heterogeneous data backend technologies. Abstract data models can be defined using the TyphonML modelling language, independently of the specific data stores in which the data will end up. Such data can be queried using TyphonQL, an SQL-like language for querying TyphonML-defined data. Typhon programmers nevertheless face a representation gap between the TyphonML conceptual model and the programming model of their application code. To bridge this gap, a Typhon Data Access Layer (DAL) has been developed. The DAL is a convenient, optional tool for programming systems that use a Typhon polystore as a backend. This document discusses the DAL’s design, technological stack, internals and usage.

Data Event Organisation and Representation

This document investigates in more depth the structure of the messages for events that are triggered by stores and delivered through TyphonQL for use by Typhon analytics. The document also provides an overview of the code generator that turns the TyphonML and TyphonDL models into concrete implementations of the events used by analytics. Finally, a discussion of the proposed analytics architecture is presented.

Event Publishing and Monitoring Architecture

This document presents the final version of the high-performance architecture developed for data analysis and monitoring in Typhon polystores. The architecture offers facilities for the authorisation of data access and update events and for the extraction of analytics of interest. It builds on scalable and fault-tolerant technologies such as Apache Kafka and Apache Flink.

Text Processing Pipelines

In this document we present the results of experimental testing of Natural Language Processing (NLP) pipelines within parallel and distributed frameworks. The pipelines are evaluated from three performance perspectives, namely speed, data size and throughput. We present the methodology used for experimental testing by detailing the datasets used, along with the technical setting used to perform NLP tasks. We then outline how the pipelines are integrated within a Natural Language Analysis Engine (NLAE), which exposes NLP functionality within the TYPHON ecosystem through a RESTful API.
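
To make the three measurements concrete, here is a minimal Python sketch of how the speed, data size and throughput of a pipeline run might be recorded. It is an assumption-laden illustration, not the benchmark harness used in the report: the two toy stages (`tokenize`, `count_tokens`) and the `run_pipeline` helper are hypothetical.

```python
import time


def tokenize(text):
    """Toy stage: lower-case the text and split it on whitespace."""
    return text.lower().split()


def count_tokens(tokens):
    """Toy stage: reduce a token list to its length."""
    return len(tokens)


def run_pipeline(documents, stages):
    """Apply each stage in order to every document and time the whole run."""
    start = time.perf_counter()
    results = []
    for document in documents:
        value = document
        for stage in stages:
            value = stage(value)
        results.append(value)
    elapsed = time.perf_counter() - start
    total_bytes = sum(len(d.encode("utf-8")) for d in documents)
    return {
        "results": results,
        "seconds": elapsed,        # speed
        "bytes": total_bytes,      # data size
        "docs_per_second": (       # throughput
            len(documents) / elapsed if elapsed > 0 else float("inf")
        ),
    }


report = run_pipeline(["Hello big data world", "Typhon"], [tokenize, count_tokens])
```

Timings from a toy run like this are not meaningful in themselves; the point is only which quantities are captured per run.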

Hybrid Polystore Data Migration Tools

The TYPHON project aims to develop a methodology and technical infrastructure to support the graceful evolution of hybrid polystores, where multiple, possibly overlapping NoSQL and SQL databases may co-evolve in a consistent manner. The proposed methodology should cover four main aspects. This report covers the design and implementation of tool support for two key aspects, namely the evolution of the TyphonML polystore schema and the associated migration of data from one polystore schema version to another.
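
A schema change and its accompanying data migration can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the TYPHON evolution tooling: the schema is modelled as a dictionary of entity names to attribute lists, and `rename_attribute` and `migrate_records` are hypothetical helpers covering a single kind of change (renaming an attribute).

```python
def rename_attribute(schema, entity, old, new):
    """Return a new schema version in which entity.old is renamed to entity.new."""
    if old not in schema[entity]:
        raise KeyError(f"{entity}.{old} does not exist")
    # Copy every attribute list so the previous schema version stays untouched.
    evolved = {name: list(attributes) for name, attributes in schema.items()}
    evolved[entity] = [new if a == old else a for a in schema[entity]]
    return evolved


def migrate_records(records, old, new):
    """Rewrite stored records so that they match the evolved schema."""
    return [
        {(new if key == old else key): value for key, value in record.items()}
        for record in records
    ]


schema_v1 = {"User": ["id", "name"]}
users_v1 = [{"id": 1, "name": "Ada"}]

schema_v2 = rename_attribute(schema_v1, "User", "name", "full_name")
users_v2 = migrate_records(users_v1, "name", "full_name")
```

Keeping the schema change and the data rewrite as a pair is what allows the stored data to stay consistent with each schema version.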

Hybrid Polystore Query Evolution Tools

The TYPHON project aims to develop a methodology and technical infrastructure to support the graceful evolution of hybrid polystores, where multiple NoSQL and SQL databases may jointly evolve in a consistent manner. This report focuses on the automatically-supported adaptation of TyphonQL queries to an evolving TyphonML polystore schema. We present the general method and the tool we developed in order to support this query migration process.

Hybrid Polystore Continuous Evolution Tools

The TYPHON project aims to develop a methodology and technical infrastructure to support the graceful evolution of hybrid polystores, where multiple NoSQL and SQL databases may jointly evolve in a consistent manner. This report focuses on the monitoring of polystore query events in order to provide users with polystore evolution recommendations, when relevant. It also presents an additional WP6 tool, allowing one to ingest data from pre-existing relational databases to a new Typhon polystore.

TYPHON Integrated Platform

This document describes the final implementation of the integrated TYPHON platform and provides a detailed description of its architecture and life-cycle management. Furthermore, it explains how users can design, deploy, query and evolve hybrid polystores via the platform's Application Programming Interface and its Graphical User Interface. Additionally, it provides a clear representation of the workflows that occur during the design, deployment and usage of a polystore, along with the interactions between the components responsible. An installation and usage guide, accompanied by screenshots, is also included.

Open Source Repository

The project partners have published as open source, at the project GitHub Repository, all of the new TYPHON components that can be used to develop and deploy Big Data applications that utilise hybrid data. The repository also includes a software Tutorial to help you get started and become familiar with the TYPHON innovations.