The aim is to reduce human pressure on marine and freshwater environments, restore degraded ecosystems, and make sustainable use of the services they provide. This requires up-to-date, high-quality data available under the FAIR principles, covering the whole water continuum from inland waters through coastal waters to the oceans, and addressing all elements of DPSIR (Drivers – Pressures – States – Impacts – Response), the causal framework for describing the interactions between society and the environment adopted by the European Environment Agency (Dahl et al., 2015).
Thematically, the data required for restoring healthy oceans and waters range from hydrology through biodiversity to socio-economic data. Combining these data requires new ways of searching for and acquiring data, as well as innovative approaches to analysis and modelling.
The overall objective of the project is to develop a virtual environment equipped with FAIR multi-disciplinary data and services to support marine and freshwater scientists and stakeholders in restoring healthy oceans, seas, coastal and inland waters. The AquaINFRA virtual environment will enable the target stakeholders to store, share, access, analyse and process research data and other research digital objects from their own discipline, across research infrastructures, disciplines and national borders, leveraging EOSC and other existing operational data spaces (e.g. EMODnet, Copernicus Marine Service, Digital Twins). Existing solutions for data access in the marine domain start at Technology Readiness Level (TRL) 7-8. Many services, such as the Blue-Cloud data discovery and access service and NextGEOSS, have reached a level of maturity at which they can already be regarded as finished end-user products in terms of usability.
The AquaINFRA Interaction Platform (AIP) will be the central gateway for scientific communities in the aquatic realm to interact with EOSC and access the AquaINFRA resources (harmonisation and processing services, research products, visualisation, virtual research environment and training platform).
The AquaINFRA architecture will build on four major modules that allow seamless data discovery and processing across the ocean and inland water data domains: (i) the AquaINFRA Data Space with the Data Discovery and Access Service (DDAS), connecting an extensive selection of marine and freshwater data resources, (ii) the AquaINFRA Virtual Research Environment (VRE), providing a number of notebook services and models, (iii) a training platform, and (iv) a powerful interactive visualisation interface. The DDAS allows the seamless search, access, and harmonisation of FAIR data through the APIs of external data providers and Digital Twins. The data will be made available for further processing in the AquaINFRA Data Space with its bronze, silver and gold zones, following data lake terminology (Simon, 2021). The bronze zone will contain selected raw data extracts (e.g. FerryBox data). Data curated and harmonised through the AquaINFRA DDAS services will be stored in the silver zone and will feed into the AquaINFRA data transformation and integration services. The data will then be attached to a seamless, high-resolution pan-European hydrography database, resulting in a set of high-quality, analysis-ready data in the gold zone of the Data Space that accounts for spatio-temporal connectivity and temporal lag effects across the freshwater and marine realms. The AIP will tap into available services within the EOSC portal and connect them with newly developed AquaINFRA processing and notebook services within the AquaINFRA VRE. Available services can be chosen and, if necessary, modified in the VRE through R or Python, and can be brought to a workflow canvas to further process and analyse the raw or harmonised data and to create custom user workflows. This is intended to give researchers transparency, letting them study the process as a supplement to the otherwise seamless flows.
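The promotion of data through the bronze, silver and gold zones described above can be illustrated with a minimal Python sketch. It is not the AquaINFRA implementation: the record classes, the payload field names (`timestamp`, `lon`, `lat`, `temp`), the `BASINS` lookup and its bounding boxes are all hypothetical, and a simple bounding-box test stands in for attaching data to the pan-European hydrography database.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class RawRecord:
    """Bronze zone: a raw extract kept as delivered by the provider."""
    source: str
    payload: dict


@dataclass(frozen=True)
class HarmonisedRecord:
    """Silver zone: curated record with common field names and types."""
    source: str
    time: datetime
    lon: float
    lat: float
    variable: str
    value: float


@dataclass(frozen=True)
class AnalysisReadyRecord:
    """Gold zone: harmonised record attached to a hydrography unit."""
    record: HarmonisedRecord
    basin_id: str


# Hypothetical hydrography units as (min_lon, min_lat, max_lon, max_lat).
BASINS = {
    "basin-A": (9.0, 53.0, 15.0, 56.0),
    "basin-B": (15.0, 56.0, 31.0, 66.0),
}


def harmonise(raw: RawRecord) -> Optional[HarmonisedRecord]:
    """Promote bronze to silver; return None if the record fails curation."""
    p = raw.payload
    try:
        return HarmonisedRecord(
            source=raw.source,
            time=datetime.fromisoformat(p["timestamp"]),
            lon=float(p["lon"]),
            lat=float(p["lat"]),
            variable="water_temperature",
            value=float(p["temp"]),
        )
    except (KeyError, TypeError, ValueError):
        return None


def attach_to_hydrography(rec: HarmonisedRecord) -> Optional[AnalysisReadyRecord]:
    """Promote silver to gold by assigning the record to a basin, if any."""
    for basin_id, (x0, y0, x1, y1) in BASINS.items():
        if x0 <= rec.lon < x1 and y0 <= rec.lat < y1:
            return AnalysisReadyRecord(rec, basin_id)
    return None


def run_pipeline(bronze):
    """Return the silver and gold zones derived from a list of raw records."""
    silver = [h for h in map(harmonise, bronze) if h is not None]
    gold = [g for g in map(attach_to_hydrography, silver) if g is not None]
    return silver, gold


bronze = [
    RawRecord("ferrybox", {"timestamp": "2023-06-01T12:00:00+00:00",
                           "lon": "10.5", "lat": "54.2", "temp": "14.3"}),
    RawRecord("ferrybox", {"timestamp": "2023-06-01T12:05:00+00:00",
                           "lat": "54.3", "temp": "14.1"}),  # no longitude
]
silver, gold = run_pipeline(bronze)
print(len(bronze), len(silver), len(gold))
```

In this sketch the bronze zone keeps every raw record, including the incomplete one, so curation decisions made on the way to the silver zone remain traceable and can be revisited.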
Custom user-specific data can be integrated into the AIP either by importing it directly into the VRE or by first passing it through the DDAS harmonisation service. AquaINFRA research products can be visualised through the AIP visualisation interface. This will be accompanied by a training platform providing training resources for all AquaINFRA components. All data sources connected through the DDAS will be made accessible and discoverable through interdisciplinary, cross-domain discovery services in EOSC (EUDAT B2FIND, OpenAIRE, EXPLORE, etc.). Newly developed services will be integrated into the EOSC portal and EOSC Exchange to further support the uptake of Open Science practices in the aquatic research communities. In AquaINFRA, the AIP and the developed services will be demonstrated to scientific communities through four use cases: the Baltic, North Sea and Mediterranean regions, and one spanning the pan-European extent.
The AquaINFRA Interaction Platform and Data Space will be made interoperable with the Digital Twin of the Ocean (DTO) projects ILIAD and EDITO, to ensure both that underlying data and models from the Digital Twins are available from AquaINFRA and that AquaINFRA data and models are provided as input to enhance the Digital Twins of the Ocean. AquaINFRA will also build on the experience and developments of the related projects Blue-Cloud and NextGEOSS. Moreover, AquaINFRA will work in close collaboration with FAIRCORE4EOSC, the project funded under the “HORIZON-INFRA-2021-EOSC-01-05 Enabling an operational, open and FAIR EOSC ecosystem” call, in particular regarding integration with the new EOSC-Core components developed by that project, e.g. the EOSC PID metaresolver and the EOSC research graph. This collaboration will complement the other EOSC Core and Exchange integration activities that the project will put in place with the ongoing EOSC Future project. Most data related to water research have a location component, and accordingly AquaINFRA will make use of existing standards on spatial data from ISO/TC 211 and the Open Geospatial Consortium (OGC), as well as the 2007 EU INSPIRE Directive on a spatial data infrastructure in Europe. Emerging Geo/OGC standards such as the Environmental Data Retrieval API and Cloud-Native Geospatial formats (COG, ZARR, COPC and STAC) will be used, and AquaINFRA will contribute to their further development.
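As an illustration of the kind of request the Environmental Data Retrieval (EDR) API standard defines, the following Python sketch builds the URL of an EDR position query for a single location and time interval. The base URL, the collection identifier and the parameter names are hypothetical and do not refer to any actual AquaINFRA service; only the path pattern and the `coords`, `datetime` and `parameter-name` query parameters are taken from the EDR standard.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def edr_position_url(base_url, collection_id, lon, lat, start, end, parameters):
    """Build an OGC API - EDR position query URL (no request is sent)."""
    query = {
        # EDR expects the location as Well-Known Text, longitude first.
        "coords": f"POINT({lon} {lat})",
        # A closed time interval is written as start/end.
        "datetime": f"{start}/{end}",
        # Several parameters are passed as a comma-separated list.
        "parameter-name": ",".join(parameters),
    }
    path = f"{base_url.rstrip('/')}/collections/{collection_id}/position"
    return f"{path}?{urlencode(query)}"


# Hypothetical endpoint, collection and parameter names.
url = edr_position_url(
    "https://example.org/edr/",
    "water_quality",
    10.5,
    54.2,
    "2023-06-01T00:00:00Z",
    "2023-06-30T00:00:00Z",
    ["temperature", "salinity"],
)
print(url)
print(parse_qs(urlparse(url).query))
```

Because the query is expressed purely in terms of a location, a time interval and parameter names, the same request shape could in principle be issued against any EDR-compliant provider, which is what makes the standard attractive for cross-domain data access.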