Data research

Using OpenRefine to create new datasets

Author: Mario Antonioletti
Posted: 28 Apr 2019 | 16:07

One of the benefits of teaching a Carpentry course is that it can increase or deepen your understanding of a subject. A recent instance for me was in using OpenRefine, a tool that runs locally on your machine (you do not have to export your data to a third-party service).

OpenRefine can help you:
• Explore and clean/transform your data. You can reconcile your data with other external data sources, i.e. enrich your data using external data.

• Create a new dataset. It does not modify your original data and keeps provenance of all the steps. Depending on the capabilities of your local machine it can deal with data sets that are up to about 100k rows.

Watch the videos on the OpenRefine website for a good overview. If you want to know more, follow the Carpentry OpenRefine for Ecologists lesson. In this example, I am going to show how easy it is to generate a new dataset from the EPCC website. Follow along after you have installed OpenRefine on your system.

Proof-driven queries to preserve patient privacy

Author: Mike Jackson
Posted: 4 Mar 2019 | 09:42

In our role as members of the Research Engineering Group of the Alan Turing Institute, Anna Roubickova and I worked with Efi Tsamoura and Benjamin Spencer (Department of Computer Science at the University of Oxford) on PDQ, a proof-driven query planner that has great potential within the realm of data science for medical research.

Scottish Administrative Data Research Partnership

Author: Mark Sawyer
Posted: 13 Dec 2018 | 16:34

EPCC has received funding via the Economic and Social Research Council (ESRC) to continue its work with the Scottish Administrative Data Research Partnership (S-ADRP).

The aim of the partnership is to enable research that leads to policy decisions that will in turn help Scotland progress towards the vision outlined in the National Performance Framework. This framework helps to shape high-level research priorities for Scottish Government, including tackling poverty, providing quality jobs and fair work for all, and ensuring that we live in inclusive, empowered, resilient and safe communities. S-ADRP consists of a number of Strategic Impact Programmes (SIPs), each dealing with a research priority.

Applications to Software Sustainability Institute Fellowship Programme 2019 are now open

Author: Guest blogger
Posted: 10 Dec 2018 | 10:59

By Raniere Silva, Community Officer at the Software Sustainability Institute.

Apply to the Software Sustainability Institute Fellowship Programme 2019.

The Software Sustainability Institute is pleased to announce applications to our Fellowship Programme 2019 are now open. Below we detail the application process and what to expect from us during the recruitment and post-recruitment stages.

Data-driven innovation for business

Author: Thomas Blyth
Posted: 6 Dec 2018 | 16:12

In February EPCC will host an event to explain why data-driven innovation is important for industry. We will also showcase how companies are already using data technologies to enhance commercial performance.

There is a lot of hype around big data and big computing for business, but it is undeniable that the influence of data-driven innovation will be profound.

The expertise and support available in Scotland have created a massive opportunity for our engineering and manufacturing sectors and, with the launch of the £500m Data-Driven Innovation (DDI) strand of the Edinburgh and South-East Scotland City Region Deal, this is an exciting time for exploring how technology can benefit business.

VESTEC: saving the world one byte at a time

Author: Nick Brown
Posted: 15 Nov 2018 | 16:21

With jobs submitted to a batch system, supercomputing has traditionally been centred around an offline, non-interactive approach to running codes such as simulations. However, it is our belief that there is great potential in fusing HPC with real-time data for use as part of urgent decision-making processes in response to natural disasters and crises.

Analysing humanities data using Cray Urika-GX

Author: Rosa Filgueira
Posted: 11 Oct 2018 | 14:52

During the last six months, in our role as members of the Research Engineering Group of the Alan Turing Institute, we have been working with Melissa Terras, University of Edinburgh's College of Arts, Humanities and Social Sciences (CAHSS), and Raquel Alegre, Research IT Services, University College London (UCL), to explore text analysis of humanities data. This work was funded by Scottish Enterprise as part of the Alan Turing Institute-Scottish Enterprise Data Engineering Programme.

RSE18 conference, Birmingham

Author: Fiona Reid
Posted: 6 Sep 2018 | 14:47

I recently attended the Third Research Software Engineers (RSE) conference in Birmingham, UK. RSE conferences bring together people who work in an RSE-type role from across the UK and the world.

For anyone who doesn’t know, an RSE is typically someone who has expertise in both coding and research but is not necessarily a pure computer programmer or pure researcher. Often an RSE can be the only such person in their department, and thus the conference gives them a chance to meet other people in similar roles, share their experiences and feel part of a much larger community.

Health Data Research UK

Author: Michele Weiland
Posted: 24 Aug 2018 | 15:00

The application of cutting-edge data science to health and medical data in order to address population health challenges is an exciting and fast-moving new field of research. Health Data Research (HDR) UK, a pioneering national institute, was formed in April 2018 to support world-leading research in this area.

The World-Class Data Infrastructure: a fundamental enabler for data-driven innovation

Author: Mark Parsons
Posted: 13 Jul 2018 | 15:28

The University of Edinburgh is set to play a key role in the Edinburgh and South East Scotland City Region Deal, delivering the deal’s Data-Driven Innovation programme. Underpinning new data innovation hubs across the University will be an exciting new facility for the secure and trustworthy hosting and analysis of huge and varied datasets.
