Edinburgh International Data Facility

The EIDF offers a portfolio of services designed to support projects from across the Data Driven Innovation (DDI) Programme.

The Edinburgh International Data Facility (EIDF) supports learners, researchers and innovators across the spectrum. Services range from basic data download, through simple learn-as-you-play-with-data notebooks, to GPU-enabled machine-learning platforms for driving AI application development.

Most EIDF users work in the Data Service Cloud, which offers a rich set of data science and analytics tools from browser-based notebooks to full desktop environments.

The Data Service Cloud sits on top of an Analytics-Ready Data Layer (ARD Layer), where EIDF data can be shared and re-used for science and innovation. This ARD Layer will grow over time as more and more data are collected in the EIDF.

Innovators and researchers looking for data can search and browse the Data Catalogue to discover what analytics-ready data the EIDF holds and how to get access to it.

EIDF data managers work with data depositors at the Data Ingest Gateway, ensuring that incoming data are safely stored in the Data Lake Archive Layer, and well-described in the Data Catalogue. Data in the Data Lake are stored for the long term using best practices in digital preservation.

EIDF data wranglers work in the Data Preparation Layer, often in collaboration with data depositors and others, to turn archived data from the Data Lake into analytics-ready data products in the ARD Layer. These data products are then ready for data innovators to use in creating new datasets, which can themselves be stored and shared all over again.
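
As an illustrative sketch only, the flow described above can be written as an ordered list of layers. The layer names come from the text; the `Layer` enum, the `FLOW` list and the `next_layer` helper are hypothetical names introduced here and do not correspond to any EIDF software interface. The wrap-around from the Data Service Cloud back to the Data Ingest Gateway is an assumption, drawn from the remark that new datasets can be stored and shared again.

```python
from enum import Enum


class Layer(Enum):
    """EIDF layers as named in the text (the enum itself is hypothetical)."""

    INGEST_GATEWAY = "Data Ingest Gateway"
    DATA_LAKE = "Data Lake Archive Layer"
    PREPARATION = "Data Preparation Layer"
    ARD = "Analytics-Ready Data Layer"
    SERVICE_CLOUD = "Data Service Cloud"


# Order in which data moves through the facility, following the description.
FLOW = [
    Layer.INGEST_GATEWAY,
    Layer.DATA_LAKE,
    Layer.PREPARATION,
    Layer.ARD,
    Layer.SERVICE_CLOUD,
]


def next_layer(layer: Layer) -> Layer:
    """Return the layer after `layer`.

    Assumption: datasets created in the Data Service Cloud re-enter
    through the Data Ingest Gateway, so the flow wraps around.
    """
    index = FLOW.index(layer)
    return FLOW[(index + 1) % len(FLOW)]


if __name__ == "__main__":
    print(" -> ".join(layer.value for layer in FLOW))
```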

Technology

The EIDF is not a single system but a portfolio of services built on an underlying infrastructure base. These services continuously grow and develop in response to data-driven challenges.

Hardware for the EIDF project is provided by our hardware partner HPE. Hardware platforms within EIDF include:

  • HPE ProLiant servers providing the primary hardware
  • HPE Apollo new-generation GPU servers (V100 and A100)
  • Two Cerebras CS-2 units
  • A number of HPE Superdome Flex 18TB shared-memory systems

The EIDF incorporates a number of storage technologies and systems including:

  • A 20PB HPE E-1000 Lustre “hot” storage service
  • A 5PB HPE Ceph “warm” storage service
  • A 20PB DMF-managed Spectra Tape Library “cold” storage service
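
For a rough sense of scale, the capacities listed above can be added up. The sketch below is simple arithmetic on the quoted figures; the dictionary and function names are hypothetical, and the figures are treated as nominal capacities (whether they are raw or usable is not stated here).

```python
# Capacities in petabytes, copied from the storage list above.
STORAGE_TIERS_PB = {
    "hot (Lustre)": 20,
    "warm (Ceph)": 5,
    "cold (tape)": 20,
}


def total_capacity_pb(tiers: dict[str, int]) -> int:
    """Sum the nominal capacity of all tiers, in petabytes."""
    return sum(tiers.values())


def tier_shares(tiers: dict[str, int]) -> dict[str, float]:
    """Return each tier's fraction of the total nominal capacity."""
    total = total_capacity_pb(tiers)
    return {name: capacity / total for name, capacity in tiers.items()}


if __name__ == "__main__":
    print(f"Total: {total_capacity_pb(STORAGE_TIERS_PB)}PB")
    for name, share in tier_shares(STORAGE_TIERS_PB).items():
        print(f"{name}: {share:.0%}")
```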

The interconnect for the EIDF is provided by a combination of 100 Gbit/s and 200 Gbit/s Ethernet networking using HPE and Mellanox switches.

The EIDF is primarily housed in HPE “Adaptive Rack Cooling System” (ARCS) racks in the purpose-built Computer Room 4 of EPCC's Advanced Computing Facility. These racks are deployed in batches of four with a central Cooling Distribution Unit, which uses water to efficiently cool the hot exhaust air from the servers in the racks.

There are currently 16 racks. This is expected to rise to 40 in the near future, with further increases to follow as the EIDF expands.
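
The rack figures can be related to the batch layout with a short calculation. This is a sketch under one assumption: that each batch of four racks has exactly one Cooling Distribution Unit, as the description of the ARCS racks suggests. The function name and constant are hypothetical.

```python
import math

RACKS_PER_BATCH = 4  # racks are deployed in batches of four, per the text


def batches_needed(racks: int) -> int:
    """Number of four-rack batches (and, by assumption, Cooling
    Distribution Units) needed to house `racks` racks."""
    if racks < 0:
        raise ValueError("racks must be non-negative")
    return math.ceil(racks / RACKS_PER_BATCH)


if __name__ == "__main__":
    for racks in (16, 40):
        print(f"{racks} racks -> {batches_needed(racks)} batches")
```

On these assumptions, the current 16 racks correspond to 4 batches and the planned 40 racks to 10.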

Science and applications

EIDF supports projects from across the full range of the Data Driven Innovation (DDI) Programme, which covers:

  • Health and social care
  • Public sector
  • Festivals and tourism
  • Financial services
  • Fintech
  • Creative industries
  • Robotics
  • Space and satellites
  • Agri-tech
  • Digital technologies

Early adopter services on EIDF include: 

  • Scottish Covid-19 Research Database: secure data hosting, linkage and analysis environment supporting COVID-19 research across Scotland and the UK.
  • ISARIC4C research service: secure data hosting, linkage and HPC environment supporting COVID-19 genetic research by the ISARIC4C consortium.
  • Smart Data Foundry (formerly Global Open Finance Centre of Excellence): secure data hosting and analysis environment for guided research in finance.
  • ScotGov SPACe: analytics workbench and confidential data workbench environments for Scottish Government.
  • iCAIRD research service: secure data hosting and dissemination service for digital pathology research data.
  • Data SlipStream: satellite and Earth Observation data ingest, processing, hosting and dissemination services.

Access

Details of academic and commercial access to EIDF can be found on the EIDF website.

People

The EIDF Service Director is Jano van Hemert. The system is maintained by the Data Science and Engineering, Service Desk and HPC Systems Teams.

Support

Support is available Monday to Friday from 08:00 until 18:00 UK time, excluding UK public holidays.

Further information

Edinburgh International Data Facility website