Can one of our MSc students help you?

Author: Mario Antonioletti
Posted: 31 Aug 2016 | 15:29

We're looking for collaborative projects with industry and academia.

A new batch of students will soon be joining our MSc in High Performance Computing (HPC) and MSc in HPC with Data Science.

As ever, we are on the lookout for interesting collaborative projects for the students to undertake towards the end of their course (roughly from April/May to August).

These projects give you, your company or your organisation the opportunity to have a student, under EPCC supervision, advance your work, while giving the student an interesting project to work on. We are interested in projects from both commercial and academic sources and, for academic projects, you don't even have to be based in the UK.

What constitutes an HPC project?

For an HPC project we are looking for anything that involves code optimisation, the implementation and evaluation of novel algorithms and code, the parallelisation of code, and so on. EPCC has been doing this for a long time (26 years and counting), so we have a lot of in-house experience in this field.

What constitutes a data science project?

For the Masters in HPC with Data Science a project can lie anywhere on the spectrum from data-intensive HPC to completely non-HPC data science. We welcome projects that lie in the areas of:

  • Data management
  • Databases
  • Data mining
  • Data analysis/analytics
  • Machine learning

Projects are not restricted to these areas, however. If you have something in mind but are not sure whether it fits, please get in touch. Possible projects could be:

  • Take data sets from more than one source: clean, transform, combine, analyse
  • Take a data-mining algorithm. Implement in Spark. (Deploy if necessary.) Measure performance.
  • Take an HPC program with an I/O bottleneck. Measure. Try to improve it with MPI-IO or by changing the program structure.
  • Write a program (or web service?) that integrates data from multiple online sources and presents this to a user.
  • Take data sets from a partner (academic or commercial) and data mine/analyse according to partner's interest.
  • Take a data-intensive program kernel. Write an MPI implementation and run it on ARCHER. Write a Hadoop version and run it elsewhere. Compare and contrast (performance, scaling, flexibility).
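
To give a flavour of the "compare and contrast" step in the last idea, performance and scaling are commonly summarised as speedup (baseline time divided by the time on p workers) and parallel efficiency (speedup divided by the relative increase in workers). The following is only a minimal sketch: the function name scaling_table and the timings are hypothetical, made up for illustration, and are not results from any real run.

```python
def scaling_table(timings):
    """Return (workers, speedup, efficiency) rows from a {workers: seconds} dict.

    Speedup and efficiency are measured relative to the smallest
    worker count present in the timings.
    """
    base_workers = min(timings)
    base_time = timings[base_workers]
    rows = []
    for workers in sorted(timings):
        speedup = base_time / timings[workers]
        # Efficiency is the speedup divided by the relative increase in workers.
        efficiency = speedup * base_workers / workers
        rows.append((workers, speedup, efficiency))
    return rows


# Hypothetical wall-clock times in seconds (illustrative only).
example_timings = {1: 120.0, 2: 64.0, 4: 36.0, 8: 24.0}

for workers, speedup, efficiency in scaling_table(example_timings):
    print(f"{workers:>3} workers: speedup {speedup:.2f}, efficiency {efficiency:.2f}")
```

A student would feed in timings measured for each implementation and compare how quickly efficiency falls away as the worker count grows.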

Data visualisation would also be interesting but more challenging for the student, as it isn't covered in the taught courses, so it would only suit someone with a real interest and possibly some past experience.

Project proposals

At a minimum we are looking for:

  • A project title
  • One or more supervisors (an assigned EPCC supervisor will do the bulk of the student supervision/guidance)
  • A description of what the project entails
  • Any special skills required by the student to undertake the project.

If a student is interested in the project we will find an EPCC member of staff to supervise it with you. For project goals, it is better to start from a small kernel of an idea and then grow it than to set an over-ambitious goal that might cause the student to fail. This is an MSc project, not a PhD project, and our main goal is for the student to successfully complete their MSc.

Process

Project proposals are usually prepared by the end of October. They will be made available to students in early November, with students expected to make a ranked choice of their top 4 projects by early December. Students may ask to meet the proposing supervisors for further information. You can discuss the suitability of the student with EPCC if you wish.

If a student is allocated to your project they will usually scope out the project from January to February/March (roughly a 10-week period). The deliverable from this phase will be a report and presentation, which are marked. The main phase of the project work will begin in late April or early May and will go on until mid-August.

The expected deliverables from each student project are a dissertation and any pertinent code and/or data. These will be assessed.

What we expect from you

If you propose a project, we will expect you to supply any code or data that will be used for it. It would be good if you could be involved in some of the supervision, but if the goals are clear and the code and/or data are comprehensible, this is not necessary. You will not do – and are not expected to do – any evaluation or marking of the student's work.

Benefits to you

Clearly, a successful student could help your research or product and, if you establish a good relationship with them, you could recruit them later as a PhD student or employee. We have plenty of examples where students have had a major impact on codes, with impressive boosts in performance.

You will also establish a relationship with EPCC, which could be of use to you in the future.

Commercial companies

We have previous experience of embedding students in businesses and of working with their code/data. We have an established process that deals with NDAs, licensing and copyright.

If you want to know more, please do get in touch with me at mario@epcc.ed.ac.uk.

Author

Mario Antonioletti, EPCC