MSc in HPC with Data Science industrial placements: Mallzee
Posted: 16 Nov 2016 | 00:00
All our MSc students are given the opportunity to work on a project with a company or academic group.
Among this year’s students were Adarsh Janakiraman and Killian Tattan, who both worked at Mallzee, an Edinburgh-based SME that produces the UK’s top non-retailer shopping app. Below we hear both sides of the experience.
Adarsh Janakiraman
The opportunity to work on my dissertation project with one of Edinburgh’s most exciting e-commerce startups, Mallzee, was one I did not want to pass up.
Mallzee is a fashion app for the smartphone, which has cornered the youth market in the UK. With over 500,000 registered users, it has been able to collect user preference information on over a million different products.
The company is currently in the process of making this potential mine of information available to retailers, to help them better understand their customers’ behaviour. The goal of my project was to build a model to predict the sales of products showcased on the app. The project involved extracting the data, cleaning it, selecting the right features, engineering new features not already in the database, and finally applying a machine learning model and evaluating its fit on the data.
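As a rough illustration of those steps (clean, engineer a feature, fit a model, evaluate the fit), here is a minimal sketch in plain Python. Everything in it is an assumption made for the example: the records, the field names, the `like_ratio` feature, and the choice of a simple least-squares line as a stand-in model. The project's actual data, features and model are not described in this post.

```python
# Hypothetical product records; field names and values are illustrative only.
records = [
    {"product": "A", "likes": 80, "dislikes": 20, "sales": 40},
    {"product": "B", "likes": 50, "dislikes": 50, "sales": 22},
    {"product": "C", "likes": 20, "dislikes": 80, "sales": 9},
    {"product": "D", "likes": None, "dislikes": 10, "sales": 5},  # missing value
    {"product": "E", "likes": 60, "dislikes": 40, "sales": 31},
    {"product": "F", "likes": 0, "dislikes": 0, "sales": 3},      # no votes yet
]


def clean(rows):
    """Drop rows with missing values or with no votes at all."""
    return [r for r in rows
            if None not in r.values() and r["likes"] + r["dislikes"] > 0]


def add_like_ratio(rows):
    """Engineer a new feature: the share of votes that were likes."""
    return [{**r, "like_ratio": r["likes"] / (r["likes"] + r["dislikes"])}
            for r in rows]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


prepared = add_like_ratio(clean(records))
xs = [r["like_ratio"] for r in prepared]
ys = [r["sales"] for r in prepared]
slope, intercept = fit_line(xs, ys)
fit_quality = r_squared(xs, ys, slope, intercept)
print(f"kept {len(prepared)} of {len(records)} records, R^2 = {fit_quality:.3f}")
```

In practice the same steps would more likely be written with the Python data stack, and the fit would be checked on data held back from training rather than on the training data itself.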
Overall, the experience of working on an actual industrial problem was great as it not only allowed us to build on our hard skills in model building and evaluation but also allowed us to build on the softer skills, such as business model evaluation, commercial strategy and teamwork, required for working in a fast-paced startup environment.
I would like to thank the department, and all involved, for making such an exciting opportunity available to students. I would also like to thank the Mallzee team for accommodating me in their office with a desk and a piping hot mug of tea.
Killian Tattan
The industry dissertation was both intellectually and logistically challenging. The project involved improving the recommender engine currently in development by Mallzee.
A recommender engine (recommender) makes suggestions to the user based on their previous preferences and buying history. The recommender being developed for Mallzee would present clothing items to users based on what they liked and disliked in the past. The concept is simple, but the science, algorithms, and mathematics behind it are quite complex.
The final recommender developed in the project employed a random forest classifier, a machine learning algorithm (a type of algorithm that learns its decision rules from data rather than having them predefined), to classify (predict) whether a new clothing item would be liked by the user or not. Based on this prediction the algorithm would either recommend the item or not. The project was successful in that the accuracy of the recommender improved slightly on that of the original model!
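To make the idea concrete, here is a small from-scratch sketch of a random forest in plain Python: several decision trees, each grown on a random resample of the data and restricted to a random subset of features at each split, vote on whether an item would be liked. It is a toy under stated assumptions, not Mallzee's recommender. The three features (brand liked before, colour liked before, price band), the training items and all function names are hypothetical, and a real project would normally use an established library implementation instead.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of 0/1 labels (0 means perfectly pure)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def build_tree(rows, labels, rng, n_features, depth=0, max_depth=3):
    """Grow one tree, considering a random subset of features at each split."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth == max_depth or len(set(labels)) == 1:
        return majority  # leaf: predict the most common label
    best = None
    for f in rng.sample(range(len(rows[0])), n_features):
        for t in sorted({r[f] for r in rows}):
            left = [i for i, r in enumerate(rows) if r[f] <= t]
            right = [i for i, r in enumerate(rows) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini([labels[i] for i in left])
                     + len(right) * gini([labels[i] for i in right])) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:
        return majority  # no usable split among the sampled features
    _, f, t, left, right = best
    return (f, t,
            build_tree([rows[i] for i in left], [labels[i] for i in left],
                       rng, n_features, depth + 1, max_depth),
            build_tree([rows[i] for i in right], [labels[i] for i in right],
                       rng, n_features, depth + 1, max_depth))


def predict_tree(node, row):
    """Follow the splits down to a leaf and return its label."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node


def train_forest(rows, labels, n_trees=25, seed=0):
    """Train each tree on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n_features = max(1, round(len(rows[0]) ** 0.5))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx],
                                 [labels[i] for i in idx], rng, n_features))
    return forest


def predict_forest(forest, row):
    """Majority vote across trees: 1 = recommend, 0 = do not recommend."""
    votes = sum(predict_tree(tree, row) for tree in forest)
    return 1 if votes * 2 > len(forest) else 0


# Hypothetical items: [brand liked before, colour liked before, price band 0-2]
items = [[1, 1, 0], [1, 1, 1], [1, 0, 0], [1, 1, 2], [1, 0, 1], [1, 0, 2],
         [0, 1, 0], [0, 0, 2], [0, 0, 1], [0, 1, 2], [0, 0, 0], [0, 1, 1]]
liked = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

forest = train_forest(items, liked)
print(predict_forest(forest, [1, 1, 0]), predict_forest(forest, [0, 0, 2]))
```

Averaging many trees trained on different resamples is what makes the forest steadier than any single tree: one tree can be thrown off by an unlucky sample, but the majority vote rarely is.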
I feel industry projects are an excellent way for businesses to make use of the knowledge and talent available at universities, while also giving students insight into how the theory of what they’ve been learning is applied in the real world.
The relationship between industry and academia is mostly mutually beneficial, but not without its differences. What was interesting for me was to mediate the relationship between industry and academia while also achieving my project’s goals.
Overall the project was thoroughly enjoyable: an excellent way to apply the theory I had learned in class to a real-world problem while also learning about the exciting new field of machine learning.
Martina Pugliese, Data Scientist, Mallzee
Through the EPCC students’ summer placements, we at Mallzee have been able to investigate two of the main data-related research problems we face, both of which we approach with data science methods.
Killian worked in the area of user personalisation, contributing improvements to the recommender system we designed, while Adarsh tackled the question of how users respond to and interact with fashion, generating new knowledge about which clothing and accessories are more likely to sell in the market.
Both projects presented challenges not only at the conceptual but also at the technical level, as data had to be polished and prepared for analysis and bespoke code created.
The students utilised a mixture of statistical and machine learning techniques, and were both able to conclude the work in the time allowed by presenting us with working prototypes. They mainly used the Python data stack and contributed code which is smoothly integrated into Mallzee’s codebase.
Can you offer a placement?
We are always looking for interesting collaborative projects for our students to undertake towards the end of their course (from April/May to August).
If you want to know more, please contact Maureen Simpson, EPCC.