ARCHER gains parallel Knights Landing capability

25 October 2016

The ARCHER national service is being enhanced by the addition of a parallel Knights Landing (KNL) system that will be available to all ARCHER users. 

 

EPSRC and Cray have recently signed an agreement to add a Cray XC40 development system with Intel Xeon Phi processors to the ARCHER service. 

This will be a separate 12-node platform with KNL processors, but otherwise will have a very similar environment to the main ARCHER system, including Cray’s Aries interconnect, operating system and Cray tools. This should make it as straightforward as possible for ARCHER users to try out their codes on multiple KNLs.

The KNL nodes arrived in the UK at the end of September and became available to users on 20 October. To encourage users to take advantage of this opportunity, there will be no limits on maximum usage of the KNL service for the first month. There will still be limits on, for example, the number of jobs any user can submit at once, so that all users have a fair chance to use the new system. After the first month, any ARCHER user can get a default allocation.

Non-ARCHER users can also get a default allocation by completing an online KNL Driving Test. Additional allocations will require only a lightweight application process (probably similar to Instant Access).

User support

Support for users of the ARCHER KNL system will be provided in a similar way to the main system:

• Helpdesk

• In-depth support

• Cray Centre of Excellence

• eCSE programme. 

There will also be various hands-on training courses and virtual tutorials. See: www.archer.ac.uk/training/.

The ARCHER KNL system provides a timely opportunity for users to investigate how their application codes will run on multiple KNLs, in preparation for possible future systems of this kind.

About the system

The KNL, Intel’s latest Xeon Phi many-core processor, can have up to 72 physical cores, each of which can run 4 threads efficiently. 

The cores can undertake large vector operations, meaning they can perform calculations on 512 bits of data at a time, giving the processor a potential peak of over 3 TFlop/s of double-precision floating-point arithmetic. This is three times the peak performance of the previous Xeon Phi processor, the Knights Corner (or KNC), meaning that porting codes to this latest many-core processor has the potential to deliver significant performance improvements for applications.
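As a rough illustration of where a figure of this size comes from, the sketch below estimates theoretical peak double-precision performance from core count, clock speed and vector width. The 512-bit vector width is taken from the text above. The two vector units per core and the counting of a fused multiply-add as two floating-point operations are assumptions about the KNL architecture, and the 1.5 GHz clock used for a 72-core part is likewise an assumed value, not one stated in this article. Real codes will typically achieve well below these peaks.

```python
def peak_dp_tflops(cores, clock_ghz, vector_bits=512,
                   vpus_per_core=2, flops_per_fma=2):
    """Estimate theoretical peak double-precision TFlop/s.

    vpus_per_core and flops_per_fma are assumptions about the
    architecture, not figures taken from the article.
    """
    lanes = vector_bits // 64          # 64-bit doubles per vector: 8
    flops_per_cycle = lanes * vpus_per_core * flops_per_fma
    # cores * GHz * flops/cycle gives GFlop/s; divide by 1000 for TFlop/s
    return cores * clock_ghz * flops_per_cycle / 1000.0


if __name__ == "__main__":
    # 72 cores at an ASSUMED 1.5 GHz clock (clock not given in the article)
    print(f"72-core part: {peak_dp_tflops(72, 1.5):.2f} TFlop/s")
    # 64 cores at 1.30 GHz, the configuration quoted for the ARCHER system
    print(f"64-core part: {peak_dp_tflops(64, 1.3):.2f} TFlop/s")
```

Under these assumptions the 72-core estimate lands a little above 3 TFlop/s, consistent with the figure quoted above, while a 64-core part at 1.30 GHz comes out somewhat lower.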

Direct access to main memory

The KNL also comes with direct access to main memory, and the ability to be self-hosting, meaning that the Xeon Phi is no longer a co-processor, and systems no longer need normal processors to host the Xeon Phi. 

The access to main memory also removes previous restrictions on the size of data set Xeon Phi processors could operate upon. Having access to the full memory of a node puts them on the same footing as nodes based on standard multi-core processors. Furthermore, KNL processors also come with 16 GB of high-bandwidth memory stacked directly on to the chip.

This memory, also known as MCDRAM, provides much higher bandwidth data transfers than standard main memory, albeit with a slightly higher latency in the accesses. MCDRAM has the potential to significantly improve the performance of applications that stream data.

The Cray XC40 ARCHER system uses the 7210 version of the Intel Xeon Phi processor, running at 1.30 GHz, with 64 physical cores, 16 GB of on-chip MCDRAM, and access to 96 GB of DDR4 main memory running at 2133 MT/s.
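To put these figures in context, the short sketch below works out the memory available per core and a theoretical peak bandwidth for the DDR4 main memory. The core count, memory sizes and 2133 MT/s transfer rate come from the paragraph above. The six memory channels and 8 bytes per transfer are assumptions about the processor's memory configuration and are not stated in this article, so the bandwidth figure should be treated as an estimate only.

```python
def memory_per_core_gb(total_gb, cores):
    """Memory available per physical core, in GB."""
    return total_gb / cores


def ddr4_peak_gb_per_s(mega_transfers_per_s, channels=6, bytes_per_transfer=8):
    """Theoretical peak DDR4 bandwidth in GB/s.

    channels and bytes_per_transfer are ASSUMED values, not taken
    from the article.
    """
    bytes_per_s = mega_transfers_per_s * 1e6 * bytes_per_transfer * channels
    return bytes_per_s / 1e9


if __name__ == "__main__":
    cores = 64
    print(f"DDR4 per core:   {memory_per_core_gb(96, cores):.2f} GB")
    print(f"MCDRAM per core: {memory_per_core_gb(16, cores):.2f} GB")
    print(f"DDR4 peak bandwidth (assumed 6 channels): "
          f"{ddr4_peak_gb_per_s(2133):.1f} GB/s")
```

With all 64 cores in use, each core has only a fraction of a gigabyte of MCDRAM to itself, which is one reason data placement between MCDRAM and DDR4 is worth investigating when porting a code.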