IBM x3850 M2 - Julier

A Short Description of the Machine

Julier is a cluster composed of 14 nodes:

- 2x IBM x3850 M2 system (24 virtual cores, 256 GB RAM)
- 12x Transtec Lynx CALLEO High-Performance Server 2840 (24 virtual cores, 48 GB RAM)

Please note that each node of Julier has 12 physical cores with Hyper-Threading enabled, i.e. 24 virtual cores. You can therefore request up to 24 MPI tasks per node for pure MPI jobs, but only up to 12 MPI tasks per node for hybrid MPI/OpenMP jobs.

The operating system is SUSE Linux Enterprise Server (SLES) 11.1.

How to Access Julier

All users with an active project at CSCS can access the system.

Julier is accessible via SSH from the frontend ela.cscs.ch as julier.cscs.ch.
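
As a sketch, the two-step login described above (first the frontend, then Julier) can be combined into one command. Here user_name is a placeholder for your CSCS account name, and the snippet assumes the same account name is valid on both machines. The -t option is the standard OpenSSH flag that forces a terminal for the second hop. The snippet only builds and prints the command, so nothing is contacted until you run it yourself.

```shell
#!/bin/bash
# Placeholder account name: replace with your own CSCS user name.
cscs_user="user_name"
frontend="ela.cscs.ch"      # frontend named in the text above
target="julier.cscs.ch"     # Julier, reachable from the frontend

# First hop to the frontend, second hop from there to Julier.
login_cmd="ssh -t ${cscs_user}@${frontend} ssh ${target}"

# Print the command; copy and run it to log in.
echo "$login_cmd"
```

Typing the two ssh commands one after the other (first to ela.cscs.ch, then to julier.cscs.ch) has the same effect.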

In order to run on the compute nodes, you need to use the batch queuing system.

Programming Environment and Supported Software

The software environment on Julier is managed with the modules framework, which provides an easy and flexible mechanism to access all the compilers, tools and applications provided by CSCS.

The programming environments available for Julier are the following:

- PrgEnv-gnu (loads gcc/4.3.4)
- PrgEnv-intel (loads icc and ifort/11.1)
- PrgEnv-pgi (loads pgi/11.10)

Each programming environment loads the mvapich2/1.7 MPI library.
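
A minimal sketch of selecting one of these environments from a shell follows. It uses only the standard module load and module list subcommands. The choice of PrgEnv-gnu is arbitrary, and the guard around the calls is an addition so that the snippet does nothing harmful on a machine without the modules framework.

```shell
#!/bin/bash
# Programming environment to load; one of the names listed above.
prgenv="PrgEnv-gnu"

if type module >/dev/null 2>&1; then
    module load "$prgenv"   # per the text above, this also loads mvapich2/1.7
    module list             # show what is now loaded
else
    # Outside Julier (or in a shell without modules) only report the intent.
    echo "module command not available; would run: module load $prgenv" >&2
fi
```

If another programming environment is already loaded in your session, it may need to be unloaded first (module unload); module list shows the current state.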

Submission of Batch Jobs

Julier uses the SLURM batch system; direct access to the compute nodes is not allowed. As already mentioned above, this is the hardware configuration:

- 2 nodes provide login facilities for user access and compilation
- 2 nodes, each with 24 virtual cores (12 physical cores with Hyper-Threading) and 256 GB of RAM
- 10 nodes, each with 24 virtual cores (12 physical cores with Hyper-Threading) and 48 GB of RAM

Here follows a list of the available queues and partitions:

Name of the queue   Max time   Max running jobs per user   Max number of virtual cores
express             2 h        10                          1
normal              24 h       10                          1
parExpress          2 h        5                           24 (1 node)
parallel            24 h       3                           96 (4 nodes)
largeMem            48 h       3                           48 (2 nodes)

You can list the queues and partitions with the commands sinfo and scontrol show partition.

The first two queues (express and normal) are meant for serial jobs and therefore have a higher limit of running jobs per user. The remaining queues are for parallel runs. You can choose the queue for your job with the "--partition" directive in your batch script:

#SBATCH --partition=<partition_name>

Please refer to the man pages and the official SLURM documentation for further details on SLURM directives.
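
To illustrate how the partition directive fits together with the per-node task limit, here is a minimal sketch that writes a pure MPI job script for the parExpress queue (1 node, 24 virtual cores, within the 2 h limit) and submits it if sbatch is available. The file name julier_mpi.sbatch, the job name and the executable my_mpi_program are hypothetical, and launching with srun is an assumption; the launcher recommended on Julier may differ, so check the "How to Run a Batch Job" page.

```shell
#!/bin/bash
# Write a minimal pure MPI job script.
# my_mpi_program is a hypothetical executable; replace it with your own.
cat > julier_mpi.sbatch <<'EOF'
#!/bin/bash
#SBATCH --job-name=mpi_test
#SBATCH --partition=parExpress
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=24
#SBATCH --time=00:30:00

# Assumed launcher; the site may recommend a different one.
srun ./my_mpi_program
EOF

# Submit only where SLURM is installed (e.g. on a Julier login node).
if command -v sbatch >/dev/null 2>&1; then
    sbatch julier_mpi.sbatch
else
    echo "sbatch not found; submit julier_mpi.sbatch from a Julier login node" >&2
fi
```

For a hybrid MPI/OpenMP job, the limit stated earlier means at most 12 tasks per node. One plausible layout is --ntasks-per-node=12 together with --cpus-per-task=2 and OMP_NUM_THREADS=2, but this layout is an assumption, not a site recommendation.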

Instructions on batch submission and how to set up a batch job are available on the following page:

How to Run a Batch Job

For a list of the most useful SLURM commands, please have a look at the corresponding FAQ section under the User Forum.

Data Storage

/scratch

Julier has a scratch space (/scratch/julier/user_name) connected via the InfiniBand (IB) interconnect. Note that this storage is not backed up and is cleared at regular intervals, so please do not use it for long-term storage.

/project

Access to the shared storage (/project) is also available through the high-speed interconnect.

For further information, please have a look at Data Management or contact help(at)cscs.ch.