PathScale Compiler

The PathScale compiler suite is a high-performance compiler suite with a proprietary back-end based on the SGI compilers and a GNU front-end. The suite includes Fortran 77/90/95 (with Cray/SGI Fortran 95 extensions and character pointers), C and C++ compilers. To access the PathScale compiler suite, load the PrgEnv-pathscale module file:

  > module load PrgEnv-pathscale

Please note that the PathScale compiler is no longer fully supported on the Cray XE6. Supporting libraries are not provided and we cannot guarantee the availability of future releases of the compiler. We strongly suggest that you use a different compiler unless you have a very specific requirement for PathScale.

Versions

The current default version of the compiler (pathscale) is loaded automatically when you load the programming environment. Older and/or newer versions of the compiler may be available: to see which versions are available issue module avail pathscale. To use a different version of the PathScale compiler issue module switch pathscale pathscale/<new_version>.

Invocation

To compile a Fortran 90 MPI code on the system invoke the Cray compiler wrapper:

  > ftn [compiler options] example.f90 -o example.x

Likewise for C and C++ codes:

  > cc [compiler options] example.c -o example.x
  > CC [compiler options] example.C -o example.x

Options

The PathScale compiler organizes options into twelve groups by compiler phase or class of feature. The general syntax is 

  • -GROUPNAME:option[=value]{:option[=value]}

The group names are as follows:

  Group      Meaning
  -LIST      Options relating to the writing of a listing file
  -OPT       Optimizations
  -TARG      Target machine
  -TENV      Target environment
  -INLINE    Back-end inlining
  -IPA       Interprocedural analysis
  -LANG      Language features
  -CG        Code generation
  -WOPT      Global scalar optimization
  -LNO       Loop nest optimization

Optimization

Using the appropriate compiler optimization flags is essential for reasonable performance of your application on the system. The default general optimization level is -O2, which enables conservative optimizations, i.e. ones that are virtually always beneficial. As a first step, we recommend adding the following optimization flag for increased performance:

  • -OPT:Ofast

This adds a subset of -OPT suboptions, equivalent to -OPT:ro=2:Olimit=0:div_split=ON:alias=typed.

More aggressive optimization can be obtained with:

  • -Ofast

which is equivalent to -O3 -ipa -OPT:Ofast -fno-math-errno -ffast-math.

  • -O3 turns on aggressive optimizations that may or may not improve code performance
  • -ipa turns on interprocedural analysis (see below)
  • -fno-math-errno prevents errno being set when calling math functions that are executed in a single instruction, e.g. sqrt.
  • -ffast-math improves floating point performance by relaxing ANSI and IEEE rules

Note that optimizations introduced with the -Ofast flag may affect floating point accuracy due to the rearrangement of computations (see "Floating point accuracy" below).

Floating point accuracy

Relaxing the precision requirements for floating-point numbers might allow for faster code. There are three relevant options here: -OPT:roundoff, -OPT:IEEE_arithmetic, and -OPT:IEEE_NaN_Inf.

The -OPT:roundoff option defines the extent to which the compiler can introduce roundoff error:

  • -OPT:roundoff=0 no roundoff error permitted (default at -O0, -O1, and -O2)
  • -OPT:roundoff=1 permits limited roundoff error (default at -O3)
  • -OPT:roundoff=2 permits roundoff error due to reassociating expressions (default at -Ofast)
  • -OPT:roundoff=3 permits any roundoff error
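
To illustrate why reassociating expressions (permitted from -OPT:roundoff=2 upwards) can change a result, here is a minimal C sketch. The function names and input values are illustrative assumptions and are not taken from the PathScale documentation. The volatile temporaries are only there to pin down one fixed grouping per function.

```c
/* Evaluate a + b + c grouped as (a + b) + c. */
double sum_left(double a, double b, double c) {
    volatile double t = a + b;   /* force the intermediate to be rounded to double */
    return t + c;
}

/* Evaluate the same sum regrouped as a + (b + c). */
double sum_right(double a, double b, double c) {
    volatile double t = b + c;   /* a small c can be absorbed by a large b here */
    return a + t;
}
```

With a = 1.0e20, b = -1.0e20 and c = 1.0, sum_left returns 1.0 while sum_right returns 0.0 in IEEE 754 double precision, because adding 1.0 to -1.0e20 rounds back to -1.0e20. A compiler that is allowed to reassociate may choose either grouping.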

The -OPT:IEEE_arithmetic option specifies the level of conformance to the IEEE 754 floating-point roundoff and exception handling behaviour:

  • -OPT:IEEE_arithmetic=1 defines strict conformance to IEEE standard (default at -O0, -O1, and -O2)
  • -OPT:IEEE_arithmetic=2 allows some relaxation of accuracy
  • -OPT:IEEE_arithmetic=3 allows any mathematically equivalent transformations to be applied

The -OPT:IEEE_NaN_Inf=(on|off) option controls conformance to the IEEE standard for NaN and Infinity. The default is on.

Note: the GNU-style flag -ffast-math (which is implied by -Ofast) is equivalent to -OPT:IEEE_arithmetic=2 -fno-math-errno. If you wish to adhere strictly to IEEE arithmetic then you should use the -fno-fast-math flag, which implies -OPT:IEEE_arithmetic=1 -fmath-errno.

Loop nest optimization

The loop nest optimizer (LNO) performs loop transformations to optimize nested loops by making better use of cache. The -LNO options are only enabled at general optimization level -O3 or higher.   
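
One classic cache-oriented loop transformation is loop interchange. The C sketch below writes both loop orders by hand for a row-major matrix stored in a flat array. It illustrates the general technique only and makes no claim about which transformations PathScale applies to any given loop; all function names are hypothetical.

```c
#include <stddef.h>

/* Fill an n x n row-major matrix with small integer values (i + j). */
void fill_matrix(double *a, size_t n) {
    for (size_t i = 0; i < n; ++i)
        for (size_t j = 0; j < n; ++j)
            a[i * n + j] = (double)(i + j);
}

/* Column-first traversal: the inner loop jumps n elements per step,
   so consecutive accesses fall on different cache lines for large n. */
double sum_strided(const double *a, size_t n) {
    double sum = 0.0;
    for (size_t j = 0; j < n; ++j)
        for (size_t i = 0; i < n; ++i)
            sum += a[i * n + j];
    return sum;
}

/* Interchanged loops: the inner loop now walks contiguous memory. */
double sum_contiguous(const double *a, size_t n) {
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i)
        for (size_t j = 0; j < n; ++j)
            sum += a[i * n + j];
    return sum;
}
```

Both functions return the same value for this input, since every partial sum is a small integer that is exactly representable. Only the memory access pattern differs.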

Interprocedural analysis

Interprocedural analysis and optimization can be turned on (at any optimization level) with the -ipa option, and is turned on by default with the -Ofast option. Note that the interprocedural analysis is "whole program" and must be enabled for all source files. When -ipa is used, the majority of compiler optimization is done at the link stage rather than at the compile stage, so compilation will be fast but linking may take significantly longer. The interprocedural analysis flag must be turned on at both compile and link time.

Alias Options

It is possible to allow the compiler to make assumptions about aliasing, which could improve the performance of your code.

  • -OPT:alias=typed (this is implied by -OPT:Ofast)
  • -OPT:alias=restrict
  • -OPT:alias=disjoint
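
The C sketch below shows what aliasing means in practice: the result of the function depends on whether its two pointer arguments overlap. Assuming the usual meaning of these options, a setting such as -OPT:alias=restrict lets the compiler assume that no such overlap occurs, so code that relies on overlap may then behave differently. The function name add_scalar is hypothetical.

```c
#include <stddef.h>

/* Add the value pointed to by scale to every element of data.
   If scale points into data, the value read through scale changes
   part-way through the loop, so standard C requires a reload on
   each iteration. A compiler told that the pointers never alias
   may instead load *scale once before the loop. */
void add_scalar(double *data, size_t n, const double *scale) {
    for (size_t i = 0; i < n; ++i) {
        data[i] += *scale;
    }
}
```

Under standard C semantics, calling add_scalar on {1, 2, 3} with scale pointing at the first element yields {2, 4, 5}, because the first iteration changes the value that later iterations read. Only enable the stronger alias assumptions if your code never passes overlapping pointers in this way.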

Automatic optimization tuning

The pathopt2 tool can be used to help tune the PathScale compiler for higher performance for a specific code. This tool iteratively tests different compiler options and combinations of options, tracks the results, selects the best options within a subgroup, and then elevates those options within the test hierarchy.

OpenMP

For the PathScale compiler use the -mp option to enable OpenMP support. There is no support for OpenMP directives in C++ that use exceptions, classes or templates.
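
As a minimal sketch of a C routine that the -mp option would parallelize, consider the dot product below. The function name and the file name in the comment are hypothetical. Without -mp the directive is ignored and the loop simply runs serially.

```c
/* Build with OpenMP enabled, for example:
 *   cc -mp dot.c -o dot.x
 * (dot.c is a hypothetical file name.) */

/* Dot product of x and y, both of length n. The reduction clause
   gives each thread a private partial sum and combines them at the end. */
double dot(const double *x, const double *y, int n) {
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; ++i) {
        sum += x[i] * y[i];
    }
    return sum;
}
```

Because the order in which the partial sums are combined is not fixed, results for general floating-point inputs may differ slightly from a serial run.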

GCC object compatibility

PathScale is object-compatible with GCC, which means that you can mix and match GNU- and PathScale-compiled binaries and libraries when linking. The front-end is source compatible with the GNU compiler suite for C/C++.

Debugging

The following compiler flags may be useful for helping debug your code:

  Flag                  Meaning
  -g                    Generate debugging information (changes optimization to -O0 unless explicitly overridden)
  -C                    Enable array bounds checking for Fortran 90 codes. If you then set the environment variable F90_BOUNDS_CHECK_ABORT=YES the code will crash on an out-of-bounds access
  -trapuv               Initializes variables with NaN. If the program uses the variable it will crash rather than producing incorrect results
  -zerouv               Initializes variables to 0
  -OPT:alias=no_parm    If your program gets the right answers with this flag and wrong without, you are likely breaking Fortran aliasing rules
  -LANG:rw_const=on     Prevent segmentation fault when a constant parameter in Fortran is written to

Further Information

Refer to the online documentation from PathScale.