AquaINFRA News

Virtual Research Environments: What They Are and Why Aquatic Science Needs Them

April 14th, 2026

If you are an aquatic scientist, there is a good chance your daily workflow involves downloading datasets from three or four different portals, cleaning them in R or Python on your laptop, running analyses that take hours, and then emailing scripts to a collaborator who cannot reproduce your results because they are running a different software version. This is not an exaggeration. It is the norm.

Virtual research environments (VREs) offer a way out of this cycle. They are not a new concept — the term has been in circulation since at least the early 2010s — but adoption in the environmental sciences has been slow, and many researchers remain unclear on what exactly a VRE provides that a well-organised local setup does not. This article aims to answer that question, with particular attention to why aquatic science stands to benefit.

What is a virtual research environment?

A VRE is a web-based platform that provides researchers with integrated access to data, analytical tools, and computing resources through a browser. Rather than installing software locally, users work within a shared infrastructure where tools, libraries, and datasets are pre-configured and accessible on demand.

The concept is straightforward: move the analysis to where the data lives, rather than moving data to where the analyst sits.

In practice, a VRE typically combines several components. There is a computational backend (often cloud-based) that provides processing power beyond what a laptop can offer. There is a workspace where users can write and execute code, build workflows, and store intermediate results. And there is a data layer that connects to external repositories, catalogues, or APIs, allowing users to query and access datasets without downloading them in full.

Platforms like JupyterHub and Galaxy are well-known examples of tools that underpin VREs, though neither is a VRE in itself. JupyterHub provides collaborative notebook environments where multiple users can write and run code in Python, R, or Julia through a shared server. Galaxy, originally developed for genomics, offers a graphical workflow system where users chain together analytical tools without writing code. Both are open-source and widely used in research.

The European Open Science Cloud (EOSC) has adopted VREs as a core part of its strategy for making research infrastructure accessible across disciplines and borders. Several Horizon Europe projects, including AquaINFRA, are building domain-specific VREs that sit within this broader ecosystem.

What does a VRE offer that local analysis does not?

The advantages fall into three main categories: reproducibility, scale, and collaboration.

Reproducibility. One of the persistent problems in computational research is that analyses are difficult to reproduce. Software versions change, dependencies break, and the precise sequence of steps a researcher followed is rarely documented in enough detail for someone else to repeat them. In a VRE, the computational environment itself is defined and versioned. When a researcher builds a workflow in Galaxy, for instance, every tool version and parameter setting is recorded automatically. Another user can re-run the same workflow months or years later with confidence that they are using the same setup. This matters enormously for regulatory science, where environmental assessments need to be auditable.
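The idea of recording the setup alongside the result can be sketched in a few lines of Python. This is only an illustration of the principle, not how Galaxy or any particular VRE implements it. The function name snapshot_run, the parameter names, and the example station identifier are all assumptions made for this sketch.

```python
import hashlib
import json
import platform
import sys
from importlib import metadata


def snapshot_run(parameters, packages=()):
    """Record the interpreter, package versions, and parameters for one run.

    Hypothetical helper for illustration; `parameters` must be JSON-serialisable.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed in this environment

    record = {
        "python": platform.python_version(),
        "platform": sys.platform,
        "packages": versions,
        "parameters": parameters,
    }
    # Hash the key-sorted JSON so that identical setups give identical fingerprints.
    canonical = json.dumps(record, sort_keys=True)
    record["fingerprint"] = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return record


# Example with invented parameter values.
run = snapshot_run(
    {"station": "hypothetical-01", "threshold_mg_per_l": 0.5},
    packages=["pip"],
)
print(json.dumps(run, indent=2))
```

Two runs with the same fingerprint used the same interpreter version, package versions, and parameters. A different fingerprint signals that something in the setup changed, which is the kind of check an auditor would want to make.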

Scale. Many aquatic datasets are large — satellite-derived ocean colour products, high-resolution hydrological model outputs, long-term monitoring records aggregated across countries. Downloading these datasets to a local machine is slow, storage-intensive, and often impractical. A VRE that connects directly to data repositories allows researchers to run analyses on remote servers, close to the data, without transferring terabytes to their own hardware. This is particularly relevant for pan-European analyses that draw on data from multiple national and international sources.

Collaboration. Research is increasingly collaborative, and aquatic science is no exception. A study on nutrient transport in a transboundary river basin might involve hydrologists, marine biologists, and policy analysts across several countries. A VRE provides a shared workspace where all collaborators can access the same tools and data, see each other's work, and build on it. This is a step beyond sharing scripts by email or through a Git repository — it means working in the same environment, with the same computational resources, in real time.

Why aquatic science is a natural fit

Not every discipline benefits equally from VREs. Fields with well-established, centralised databases and standardised methods may find their existing infrastructure sufficient. Aquatic science is not one of those fields.

Water research is, by its nature, distributed. Rivers cross national borders. Marine regions are monitored by multiple countries under different institutional arrangements. Freshwater and marine data are typically managed by entirely separate communities, with different metadata standards, different vocabularies, and different data formats. A researcher studying the land-to-sea continuum — how nutrient run-off from agriculture affects coastal water quality, for example — must navigate this fragmented landscape every time they begin a new analysis.

EU directives such as the Water Framework Directive and the Marine Strategy Framework Directive require member states to monitor and report on water quality, but the resulting data are stored in national databases with varying levels of accessibility. European-level aggregations exist (WISE for freshwater, EMODnet for marine data, ICES for fisheries and oceanography), but they cover different parameters, use different interfaces, and are not designed to be queried together.

A VRE tailored to aquatic science can address this by providing a single entry point to multiple data sources, with tools that handle the necessary harmonisation behind the scenes. Instead of spending days reformatting datasets to make them compatible, a researcher can focus on the science.
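To make "harmonisation" concrete, here is a minimal Python sketch of what such a tool might do behind the scenes: map provider-specific field names and units onto one common schema. The two source names, their column names, and their units are invented for illustration and do not describe the actual formats used by WISE, EMODnet, or ICES.

```python
# Hypothetical provider schemas: field names and units are assumptions for this sketch.
SOURCE_SCHEMAS = {
    "freshwater_portal": {
        "station": "site_code",
        "value": "NO3_mg_l",      # nitrate already reported in mg/L
        "to_mg_per_l": 1.0,
    },
    "marine_portal": {
        "station": "StationID",
        "value": "nitrate_ug_l",  # nitrate reported in micrograms per litre
        "to_mg_per_l": 0.001,
    },
}


def harmonise(record, source):
    """Convert one provider-specific record into the common schema."""
    schema = SOURCE_SCHEMAS[source]
    return {
        "station": str(record[schema["station"]]),
        "nitrate_mg_per_l": float(record[schema["value"]]) * schema["to_mg_per_l"],
        "source": source,
    }


# Example records with invented values.
raw = [
    ({"site_code": "R1", "NO3_mg_l": "2.5"}, "freshwater_portal"),
    ({"StationID": 7, "nitrate_ug_l": 1500}, "marine_portal"),
]
combined = [harmonise(record, source) for record, source in raw]
for row in combined:
    print(row)
```

Once every record shares the same field names and units, freshwater and marine observations can be analysed together. A real harmonisation layer would also need to reconcile vocabularies, coordinate systems, and quality flags, which this sketch leaves out.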

There is also a workforce argument. Early-career researchers entering aquatic science today are expected to handle increasingly complex computational tasks — species distribution modelling, machine learning for remote sensing classification, statistical analysis of long-term trends — but many have limited training in software engineering or data management. A VRE lowers the barrier by providing pre-configured tools and documented workflows that researchers can use and adapt without building everything from scratch.

The practical reality

VREs are not a panacea. They require reliable internet access, which is not universal. They depend on sustained funding for infrastructure maintenance. And they demand a cultural shift: researchers must be willing to move their work from familiar local environments to shared platforms, which involves a degree of trust in the underlying infrastructure.

There are also questions of governance: who controls the VRE, who decides which tools are available, and how is user data handled? These are not trivial concerns, particularly when working with sensitive environmental data or proprietary models.

But the direction of travel is clear. The volume and complexity of aquatic data are growing faster than individual researchers can manage on their own. Cross-border environmental challenges demand cross-border analytical infrastructure. And the scientific community's expectations around reproducibility and open science are only becoming more stringent.

Virtual research environments will not solve all of these problems. But they address a real and growing need, and for aquatic science, with its inherent complexity and fragmentation, the case is particularly strong.