Sunshine, snow and persistent data infrastructure

Author: Rob Baxter
Posted: 1 May 2013 | 11:00

The high desert in New Mexico

I've recently returned from a very interesting week-long tour of the southwestern USA. Work-related, of course. A handful of European colleagues from the EUDAT project and I were graciously hosted by three groups, all engaged in data infrastructure work on the other side of the Atlantic.

After flying into what must be one of the world's smallest and cutest airports in Santa Fe, our first stop was Los Alamos National Lab and the Web science group led by Herbert Van de Sompel.

The Web has developed enormously since HTTP/1.0, of course, and not just in terms of lolcatz and online shopping. True to its original roots, the Web continues to be a powerful tool for scientific communication, and is evolving to meet the exponential growth of digital research data. Two very interesting projects developing at Los Alamos are the ResourceSync framework, designed to allow systems to remain synchronized with evolving resources on a remote server, and Web Memento, which seeks to add a time dimension to the Web, allowing for the retrieval of Web resources not as they are now, but as they were at a particular point in the past.
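For the curious, here is a rough sketch of what asking for a past version of a page might look like from a client's point of view. It assumes the Accept-Datetime request header used by the Memento protocol (later published as RFC 7089); the TimeGate URL and the helper name memento_request are made up for illustration, and the request is only built, never sent.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from urllib.request import Request


def memento_request(timegate_url: str, when: datetime) -> Request:
    """Build (but do not send) a request asking for a resource as it was at `when`.

    Hypothetical helper: `timegate_url` stands in for whatever TimeGate
    a Memento-aware archive exposes.
    """
    # Accept-Datetime carries the desired moment as an HTTP date in GMT.
    stamp = format_datetime(when.astimezone(timezone.utc), usegmt=True)
    return Request(timegate_url, headers={"Accept-Datetime": stamp})


req = memento_request(
    "http://timegate.example.org/http://example.org/",  # made-up URL
    datetime(2010, 5, 1, 11, 0, tzinfo=timezone.utc),
)
# urllib stores header names with only the first letter capitalised.
print(req.get_header("Accept-datetime"))
```

A Memento-aware server would be expected to answer such a request by pointing the client at the archived copy closest to the requested time; that part is left out here because it depends on the archive in question.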

Our second visit was to the DataONE project - Data Observation Network for Earth - at the University of New Mexico, Albuquerque. DataONE are in many ways a sister project to EUDAT, an older sister (now into their fifth year), and with a particular remit to provide a persistent, registered home for data in the Earth sciences. Members of the DataONE network can safely replicate their research data to other member sites, pool their metadata and share observations and results all under the one umbrella. The underlying infrastructure is complemented by a rich set of client tools (the Investigator Toolkit) and a truly impressive community outreach and education programme. The EUDAT team spent a day and a half in very interesting workshops, with wonderful hospitality from our hosts.

The shores of La Jolla, San Diego

The final stop on our journey led us from the snowy high deserts of New Mexico to the balmy shores of the Pacific and San Diego Supercomputer Center. We were hosted here by the EarthCube project and our day's visit covered the EarthCube research network itself, and SDSC's exciting data-intensive supercomputer Gordon (it has a lot of Flash, you see...). SDSC specialise in data-intensive computing within the US XSEDE infrastructure, with systems dedicated to hosting globally important resources such as the Protein Data Bank.

All in all it was a very interesting and valuable trip, both for me and, I think, for EUDAT. It was good to compare notes and ideas with US colleagues - always easier to do in person and away from the bustle of a conference - and identify concrete points of collaboration. And the pleasure of finding HMS Surprise in the harbour of the San Diego Maritime Museum was, for me, the icing on the cake :-).


Rob Baxter, EPCC