Data Slipstream: bringing together Earth-observation data, science, industry, and next-gen compute

6 February 2023

The Data Slipstream project is building a system whereby the large, diverse and complex datasets vital to Earth Observation (EO) research at the University of Edinburgh and beyond can be brought together in one place.

Data Slipstream has been running as a collaboration between EPCC and the School of Geosciences since 2020, initially as a proof-of-concept project funded by the UK Space Agency (UKSA).

It was an early adopter of the Edinburgh International Data Facility and will provide satellite and Earth Observation data ingest, processing, hosting and dissemination services.

Earth Observation data underpins research into geoscience and engineering challenges including climate change, land use change, agriculture, planning and infrastructure, landslip, forensic archaeology and anthropology.

A major difficulty is the sheer number of data sources, APIs and protocols, licences, software, conventions and data formats involved in EO research and, of course, the volume of the data itself. The scientists who rely on this data are often forced to access it in a piecemeal fashion, limiting the scope of their research.

Another challenge is the merging of data. Fused EO data can be more powerful than the sum of its parts: for example, a flood warning system based on optical images of a flood plain would become far more predictive if those images were linked with weather and climate data, soil saturation data, and digital elevation models.

Satellite view of Edinburgh and West Fife

Above: multitemporal SAR view of the eastern coast of Scotland. The colours of the sea correspond to different wind and current conditions during the various acquisitions. The tidal plains appear in magenta or in light green-blue. Land masses appear in grey where no noticeable change occurred between the acquisitions. The different colours of the fields can be explained by changes in vegetation and moisture. Image credit: ESA

Building data infrastructure

But Data Slipstream is not just a collection of data. The project will design and build the infrastructure that brings researchers to the data, and provide the tools, environments, and computing power required to develop algorithms and machine learning models, perform analysis, and produce results. With the expertise of the Data Slipstream team, it will also develop services and products around that core science.

The project began in 2020 with £215K from the UK Space Agency (UKSA) National Space Innovation Programme to start the development of Data Slipstream on the Edinburgh International Data Facility (EIDF), with the overarching goal of using EO data to support climate change mitigation and adaptation.

Following on from the UKSA funding, further funds were sought from the UKRI Industrial Strategy Challenge Fund to deliver the PASTORAL (Pasture Optimisation for Resilience and Livelihoods) agri-tech service. The service produces timely crop yield forecasts to aid agricultural decision-making by combining satellite data, a novel modelling framework and weather forecast data, and real-time pasture productivity and carbon cycling information.

The role of Data Slipstream was to host the required data and provide the infrastructure for modelling to be conducted on ARCHER2. A new spin-out company, Mercury Environmental Systems Ltd, was formed and continues to work towards increasing productivity and sustainability whilst delivering net zero.

Shortly after this, UKSA SPRINT funding was won by a partnership of EPCC, the School of Engineering, and Astrosat Ltd to develop a deep-learning platform for predicting soil moisture from data provided by the European Space Agency’s Sentinel-1 satellite, with the COSMOS-UK network providing ground-truth data. The model, built using TensorFlow on EPCC’s Cirrus supercomputer, was trained using data from 51 soil moisture measurement sites and thousands of images taken over the UK between 2014 and 2019.

Satellite view of Scotland

Above: multispectral cloud-free composite image (showing combined RGB channels) made by combining Sentinel-2 images from summer 2019. It was produced using Data Slipstream on the EIDF by School of Geosciences and EPCC staff.
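For readers curious about how a cloud-free composite can be built in principle, here is a minimal sketch in Python using NumPy. It is not the method used by the Data Slipstream team, which is not described here. It assumes one common approach: take several co-registered images of the same area, discard pixels flagged as cloudy, and keep the per-pixel median of what remains. The function name `median_composite`, the array layout, and the tiny made-up reflectance values are all illustrative assumptions.

```python
import numpy as np


def median_composite(stack, cloud_mask):
    """Per-pixel median of the cloud-free observations (illustrative only).

    stack:      array of shape (time, height, width, bands)
    cloud_mask: boolean array of shape (time, height, width), True = cloudy
    """
    stack = np.asarray(stack, dtype=float)
    mask = np.asarray(cloud_mask, dtype=bool)
    # Replace every band of a cloudy pixel with NaN so it is ignored below.
    masked = np.where(mask[..., np.newaxis], np.nan, stack)
    # Median over the time axis, skipping the NaN (cloudy) observations.
    return np.nanmedian(masked, axis=0)


# Made-up example: three acquisitions of a 1 x 2 pixel scene with RGB bands.
stack = np.array([
    [[[0.1, 0.2, 0.3], [0.9, 0.9, 0.9]]],  # second pixel is cloudy
    [[[0.9, 0.9, 0.9], [0.4, 0.5, 0.6]]],  # first pixel is cloudy
    [[[0.3, 0.4, 0.5], [0.6, 0.7, 0.8]]],  # both pixels are clear
])
cloud_mask = np.array([
    [[False, True]],
    [[True, False]],
    [[False, False]],
])

composite = median_composite(stack, cloud_mask)
print(composite.shape)
print(composite)
```

In this toy example each pixel has two clear observations, so the composite value is simply their average. A pixel that is cloudy in every acquisition would come out as NaN and would need more images or a fallback rule.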

Evolving with EIDF

Data Slipstream was an early adopter of the EIDF. As the EIDF vision becomes fully realised, Data Slipstream is being improved and redesigned to exploit the infrastructure and systems the EIDF offers. Work is currently underway to migrate Data Slipstream to an OpenStack platform, where virtual machines can be spun up on demand to provide a full EO data analysis environment through JupyterLab, with CPU and GPU compute available through Kubernetes. Together with increased, faster storage and improved automation and cataloguing, this will place EO data at researchers’ fingertips.

Data Slipstream now hosts UK-wide Sentinel-1 data, Sentinel-2 data covering all of Scotland and parts of Ghana, and many other datasets required to process that data, such as digital elevation models of the Earth’s surface. Data sources include the Alaska Satellite Facility, the Centre for Environmental Data Analysis, and the European Space Agency. Data acquisition is project-led, meaning any new data acquired is directly linked to a use case.

Future developments

The latest Data Slipstream project, funded by the STFC Impact Accelerator Account, involves an international industrial collaboration with Orbital Micro Systems to receive low-latency passive microwave data recorded by their Global Environmental Monitoring System (GEMS) CubeSat constellation.

Data from the demonstration instrument, launched in 2019, will be used to prepare the Data Slipstream infrastructure to receive raw satellite data. That raw data will be used to produce up-to-the-minute 3D precipitation, temperature, and moisture profiles of the Earth’s atmosphere from instruments due for launch, beginning in early 2023, from Shetland’s SaxaVord UK Spaceport.

EIDF is at the heart of a plan to make Edinburgh the data capital of Europe. Through Data Slipstream, we also aim to make it the space data capital of Europe.

The EIDF offers a portfolio of services designed to support projects from across the Data-Driven Innovation (DDI) initiative.

Author

Dr Dave McKay