Many areas of scientific computing either produce or consume large amounts of data: storing and accessing that data is a major task in itself, especially as it may involve distributed storage. EPCC has been involved in many projects developing or using data standards, and has deployed large-scale data management infrastructures for a number of research groups.

Example Projects

     

  • OGSA-DAI

    Since 2002 the OGSA-DAI Project has been working to develop an effective solution to the challenge of Internet-scale data integration. The product - OGSA-DAI™ - is a middleware bundle that allows data resources, such as file systems and relational or XML databases, to be accessed, federated and integrated across the network. More details can be found here.

  • LaQuAT - Linking and Querying Ancient Texts

    LaQuAT was a collaboration between EPCC and the Centre for e-Research at King's College London (KCL), funded by JISC via the ENGAGE project. The project focused on linking epigraphic databases using the OGSA-DAI data management product. Epigraphy is the study of ancient texts, e.g. books, papyri, or inscriptions on stone tablets. Information about these texts (what they are, where they were found, when they date from, etc.) is stored in databases.

    EPCC customised OGSA-DAI to run queries over distributed epigraphic databases as if they were a single virtual database. We also demonstrated how OGSA-DAI allows researchers to augment third-party read-only databases with their own data sets. The project highlighted the strengths and weaknesses of OGSA-DAI's handling of real-life data, and these findings are guiding current development. KCL developed an appreciation of the issues involved in sharing data. As a result of this project, EPCC and KCL enjoy a close ongoing relationship.

  • DiGS - Distributed Grid Storage

    DiGS is a distributed data management system that combines commodity storage resources, such as RAID systems and Storage Area Networks, into a large-scale, unified file repository, which is presented to the end-user through an easy-to-use, lightweight client toolkit. The DiGS application is built on top of the Globus Toolkit and the EGEE application stack. More information can be found here.
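
The "single virtual database" idea described for LaQuAT can be illustrated with a minimal sketch. This is not OGSA-DAI and does not use its API: it assumes two local SQLite files standing in for independent epigraphic databases, and uses SQLite's ATTACH DATABASE to query them together and join the result with a researcher's own notes. All table names, column names, function names, and sample rows are hypothetical.

```python
import os
import sqlite3
import tempfile


def make_source(path, table, rows):
    """Create a small SQLite file standing in for one independent source database."""
    conn = sqlite3.connect(path)
    try:
        conn.execute(f"CREATE TABLE {table} (title TEXT, found_at TEXT)")
        conn.executemany(f"INSERT INTO {table} VALUES (?, ?)", rows)
        conn.commit()
    finally:
        conn.close()


def query_virtual_db(path_a, path_b, notes, place):
    """Query both sources as one virtual table, joined with the researcher's notes."""
    # The notes live in a local in-memory database; the sources are only read.
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("ATTACH DATABASE ? AS src_a", (path_a,))
        conn.execute("ATTACH DATABASE ? AS src_b", (path_b,))
        conn.execute("CREATE TABLE notes (title TEXT, note TEXT)")
        conn.executemany("INSERT INTO notes VALUES (?, ?)", notes)
        sql = """
            SELECT t.source, t.title, n.note
            FROM (
                SELECT 'inscriptions' AS source, title, found_at
                FROM src_a.inscriptions
                UNION ALL
                SELECT 'papyri' AS source, title, found_at
                FROM src_b.papyri
            ) AS t
            LEFT JOIN notes AS n ON n.title = t.title
            WHERE t.found_at = ?
            ORDER BY t.title
        """
        return conn.execute(sql, (place,)).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    # Placeholder data only; none of these records are real.
    with tempfile.TemporaryDirectory() as tmp:
        a = os.path.join(tmp, "inscriptions.db")
        b = os.path.join(tmp, "papyri.db")
        make_source(a, "inscriptions", [("Stele 1", "Site A"), ("Stele 2", "Site B")])
        make_source(b, "papyri", [("Papyrus 7", "Site A")])
        for row in query_virtual_db(a, b, [("Papyrus 7", "check dating")], "Site A"):
            print(row)
```

The UNION ALL subquery plays the role of the virtual database, and the LEFT JOIN against the local notes table shows how a researcher's own data can sit alongside sources that are never modified. A real deployment across remote, heterogeneous databases would need middleware such as OGSA-DAI rather than a single SQLite connection.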