The following table shows some optimisation options that are a good starting point for further optimisation. Depending on the application, only conservative optimisation of floating-point calculations may be acceptable, in order to preserve compatibility with results from less optimised code. If this is not the case, more aggressive optimisation can be used to further increase performance.
{|
! !! GNU (*) !! Intel (**) !! PGI
|-
| Conservative optimisation of floating-point calculations || -O3 -march=native || -O3 -xHost -ipo || -fast -Mipa=fast
|-
| Aggressive optimisation of floating-point calculations || -O3 -march=native -ffast-math || -O3 -xHost -ipo -fp-model fast=2 || -fast -Mipa=fast -h fp3
|}
(*) "-march=native" is not yet supported in gcc version 4.2.4, which is the current (January 2013) default on Cy-Tera.
(**) "-ipo" enables multi-file interprocedural optimisation. Alternatively, "-ip" enables single-file interprocedural optimisation, which results in faster compilation but slower code.
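The difference between the conservative and the aggressive row comes down to how strictly the compiler must preserve the written order of floating-point operations. The following sketch, written in Python only because its floats are IEEE 754 doubles, shows that regrouping a sum can change the result. The function names are made up for this illustration. Options such as -ffast-math permit the compiler to regroup sums in a similar way, which is one reason results may differ from less optimised code.

```python
def left_grouped(a, b, c):
    """Evaluate (a + b) + c, i.e. the order written in the source."""
    return (a + b) + c


def right_grouped(a, b, c):
    """Evaluate a + (b + c), a regrouping an aggressive optimiser may choose."""
    return a + (b + c)


if __name__ == "__main__":
    a, b, c = 1e16, -1e16, 1.0
    # The large terms cancel first, so the small term survives.
    print(left_grouped(a, b, c))   # 1.0
    # The small term is absorbed by rounding before the large terms cancel.
    print(right_grouped(a, b, c))  # 0.0
```

Whether a particular compiler actually performs such a regrouping depends on the code and the compiler version, so comparing results against a conservatively compiled build remains the practical check.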
Note the -xHost option for the Intel compiler (and its equivalent, -march=native, for the GNU compiler), which is recommended for all HPC applications to make the most of the given processor architecture.
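As a rough sketch of how the GNU and Intel flag sets from the table could be kept in one place in a build script, the helper below assembles a compile command from them. The dictionary layout, the function names, the `march_native` switch and the file name `solver.c` are assumptions made for this example, not part of any existing tool. Only the flags themselves are taken from the table.

```python
# Flags copied from the table; the keys are labels chosen for this sketch.
FLAGS = {
    "gnu": {
        "conservative": ["-O3", "-march=native"],
        "aggressive": ["-O3", "-march=native", "-ffast-math"],
    },
    "intel": {
        "conservative": ["-O3", "-xHost", "-ipo"],
        "aggressive": ["-O3", "-xHost", "-ipo", "-fp-model", "fast=2"],
    },
}


def optimisation_flags(compiler, level="conservative", march_native=True):
    """Return the flag list for a compiler family and optimisation level.

    Set march_native=False for a GNU compiler that does not accept
    -march=native (see footnote (*)).
    """
    flags = list(FLAGS[compiler][level])
    if compiler == "gnu" and not march_native:
        flags.remove("-march=native")
    return flags


def compile_command(driver, sources, compiler, level="conservative",
                    output="a.out", march_native=True):
    """Build an argument list: driver, flags, output file, source files."""
    flags = optimisation_flags(compiler, level, march_native)
    return [driver, *flags, "-o", output, *sources]


if __name__ == "__main__":
    # Hypothetical source file name, used only to show the resulting command.
    print(" ".join(compile_command("gcc", ["solver.c"], "gnu", "aggressive")))
    print(" ".join(compile_command("icc", ["solver.c"], "intel")))
```

Keeping the conservative and aggressive sets side by side makes it easy to build both variants and compare their numerical results before adopting the faster one.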
Compiler-specific details can be found, e.g., here:
A good, up-to-date source of information, not only about optimisation flags, is The RWTH HPC-Cluster User's Guide.