ExTASY: smarter simulations for chemists

Author: Mario Antonioletti
Posted: 23 May 2016 | 14:43

Last week I attended an ExTASY tutorial here in Edinburgh. The project aims to build a set of Extensible Tools for Advanced Sampling and Analysis (hence the name) to allow chemists who use computational methods and off-the-shelf molecular dynamics (MD) packages (such as GROMACS, AMBER and NAMD) to be cleverer and more efficient with their simulations.

The ExTASY-based tools are well worth considering if you are doing MD calculations. If you want to be smarter about how you do your simulations, take a look at ExTASY.

For instance, getting new and better results for the conformation of a molecule, or for how the molecule changes its shape in its local environment, should not simply be a case of applying a brute-force approach of longer and bigger simulations, in effect throwing CPU cycles or money at the problem. Biological processes, e.g. protein folding, typically take from milliseconds to seconds to occur, while computational simulations use femtosecond (10^-15 s) time steps and will typically only be able to model 10-100 nanoseconds (10^-9 s) of simulation time per real day. You can certainly do tens of nanoseconds of simulation per day for realistic-sized systems with GROMACS.

So, observing a process of interest can require vast amounts of computational time, depending on the initial configuration of the system being modelled, which adds a stochastic element to the simulation process. An event that happens once per millisecond, simulated at 10 nanoseconds per day, will take 100,000 days or around 270 years, so it's not just a case of tools to do faster sampling, but tools to see stuff you could never realistically see before.
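
The arithmetic behind that estimate can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the function names are made up for illustration, and the 1 fs time step is an assumed round number rather than a recommendation for any particular MD package.

```python
# Back-of-the-envelope estimate of the cost of brute-force MD sampling.
# All names here are illustrative; the 1 fs time step is an assumption.
FEMTOSECOND = 1e-15  # seconds
NANOSECOND = 1e-9    # seconds
MILLISECOND = 1e-3   # seconds


def wall_clock_days(event_time_s, throughput_ns_per_day):
    """Real days needed to simulate `event_time_s` seconds of molecular time."""
    return event_time_s / (throughput_ns_per_day * NANOSECOND)


def steps_needed(event_time_s, timestep_fs=1.0):
    """Number of integration steps needed to cover `event_time_s` seconds."""
    return event_time_s / (timestep_fs * FEMTOSECOND)


# One millisecond of molecular time at 10 ns of simulation per real day.
days = wall_clock_days(MILLISECOND, throughput_ns_per_day=10)
years = days / 365.25

print(f"{days:,.0f} days, or about {years:.0f} years")
print(f"{steps_needed(MILLISECOND):.0e} time steps at 1 fs per step")
```

Even at 100 ns per day, the upper end of the range quoted above, the same event would still need 10,000 days of wall-clock time.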

Instead of taking this quasi-random approach, one can explore the conformational space using tools that ExTASY has developed, such as LSDMap (Locally Scaled Diffusion Maps) and CoCo (Complementary Coordinates), which analyse the phase space sampled so far and suggest new starting configurations to explore. ExTASY has also developed a number of tools that allow all of these new systems to run in parallel using the Ensemble MD Toolkit and ExTASY Workflows, thus maximising the throughput of jobs exploring different (and potentially interesting) initial molecular configurations that may yield the result you are looking for a lot quicker. And even if they don't, you will at least understand the phase space of your molecules.
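
To give a flavour of the simulate, analyse, restart cycle described above, here is a deliberately tiny sketch. It is not LSDMap, CoCo or any ExTASY API: the "molecule" is a single coordinate in an assumed double-well potential, the "MD" is a Metropolis random walk, and the analysis step is a crude histogram that restarts walkers from the least-visited bins. Every function name and parameter is hypothetical.

```python
import math
import random


def potential(x):
    """Assumed toy energy landscape: a double well with minima at x = -1 and x = +1."""
    return 5.0 * (x * x - 1.0) ** 2


def run_short_trajectory(start, n_steps, rng):
    """Stand-in for a short MD run: a Metropolis random walk (kT = 1)."""
    x = start
    visited = [x]
    for _ in range(n_steps):
        trial = x + rng.uniform(-0.1, 0.1)
        delta = potential(trial) - potential(x)
        if delta <= 0 or rng.random() < math.exp(-delta):
            x = trial
        visited.append(x)
    return visited


def pick_new_starts(samples, n_starts, bin_width=0.1):
    """Stand-in for the analysis step: restart from the least-visited bins."""
    bins = {}
    for x in samples:
        bins.setdefault(math.floor(x / bin_width), []).append(x)
    rarest = sorted(bins.values(), key=len)
    # Cycle through the rarest bins if there are fewer bins than walkers.
    return [rarest[i % len(rarest)][0] for i in range(n_starts)]


def adaptive_sampling(n_rounds=20, n_walkers=8, n_steps=200, seed=0):
    """Alternate an ensemble of short runs with selection of new starting points."""
    rng = random.Random(seed)
    starts = [-1.0] * n_walkers  # every walker begins in the left-hand well
    samples = []
    for _ in range(n_rounds):
        # The walkers are independent, so a real workflow would run them in parallel.
        for start in starts:
            samples.extend(run_short_trajectory(start, n_steps, rng))
        starts = pick_new_starts(samples, n_walkers)
    return samples


samples = adaptive_sampling()
print(f"sampled x from {min(samples):.2f} to {max(samples):.2f}")
```

The point of the sketch is the shape of the loop rather than the numbers it prints: many short, independent runs, followed by an analysis step that decides where the next batch should start.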

To this end, the two-day tutorial covered the use of the LSDMap and CoCo tools on day one, with the second day spent on applying the workflow tools developed by ExTASY to model some of the LSDMap and CoCo configurations. The tooling is quite sophisticated and can apply several patterns to model the resulting ensembles in different ways. We got to run jobs on ARCHER here in the UK and on Stampede, an XSEDE machine based at the Texas Advanced Computing Center in the States.

For more context, see Iain Bethune's previous posts: Agony and ExTASY and Getting hands-on with ExTASY.

Erik Lindahl, the project lead for the GROMACS MD code and also from the BioExcel project, gave a most excellent guest presentation that inspired a lot of the content in this blog piece, which I hereby grudgingly acknowledge.

The course's attendees and tutors were from all over the world as shown by the red dots on the map below:


ExTASY is a transatlantic collaborative project funded by EPSRC for the UK side and by the NSF for the USA partners.