Open64/PathScale compiler suite reference card

Open64/PathScale Compiler suite

(For more info about Open64 compiler infrastructure, see here)

(PathScale compiler suite is also dubbed EKO: Every Known Optimization)

opencc
pathcc
C compiler driver
openCC
pathCC
C++ compiler driver
openf90
openf95
pathf90
pathf95
Fortran compiler driver
assign [PathScale] Change or display the I/O processing directives for a Fortran file or unit.
be Backend
cg.so Code generation
coco [PathScale] ISO/IEC 1539-3 Fortran conditional compilation preprocessor.
explain [PathScale] Display detailed Fortran error message for a given message ID (e.g. "pathf95-0724").
gfec GCC-based C frontend (produces WHIRL)

WHIRL=Winning Hierarchical Intermediate Representation Language

gspin Output abstract syntax tree of GCC to SPIN
ipa.so Inter-procedural analysis
lno.so Loop nest optimization
mfef95 Cray F90-based Fortran frontend (produces WHIRL)
wgen Transform SPIN to WHIRL
wopt.so WHIRL global scalar optimization

Now the compiler...

(The default compiler options can be placed in the etc/compiler.defaults file. This compiler.defaults file can also be placed in the directory pointed to by the PSC_COMPILER_DEFAULTS_PATH or OPEN64_COMPILER_DEFAULTS_PATH environmental variable. Alternatively, one can also specify environmental variables such as OPEN64_GENFLAGS, OPEN64_CFLAGS, OPEN64_CXXFLAGS, OPEN64_FFLAGS for Open64, or PSC_GENFLAGS, PSC_CFLAGS, PSC_CXXFLAGS, PSC_FFLAGS for PathScale.)

Compile

-c Compile *.c and assemble *.s. NO linking.
-Idir Also search dir for header files.

This can also be controlled by environmental variables C_INCLUDE_PATH and CPLUS_INCLUDE_PATH.

-S Compile *.c into assembly code *.s. NO linking.
-E Run preprocessor only. The output is sent to stdout.
-C When running preprocessor, don't discard comments in the program.
-dM When used with the -E option, display definitions of all built-in macros, e.g.
opencc -E -dM - < /dev/null
-o file Place output in file
-show
-v
When compiling, also display the programs invoked by the compiler.
-show0
-###
Like -show, but do NOT invoke the programs.
-showt When compiling, display which functions are being compiled and the time/memory used in preprocessing, parsing, etc.
-dumpversion Print the version number.
-show-defaults Print the default optimization level and compilation target.


C/C++ dialect

-ansi
-std=c90
-std=c++98
Conform strictly to the ISO C90 standard (ISO C++98 for C++ programs). In particular, C programs can't use C++ style "//" comments or the inline keyword.

__STRICT_ANSI__ will be defined if this option is used.
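As a rough illustration of how a program might react to strict mode, the sketch below switches to the GCC-style alternate keyword __inline__ when __STRICT_ANSI__ is defined. The macro name PORTABLE_INLINE and both functions are made up for this example, and it assumes that the GCC-based frontend accepts __inline__ under -ansi the way GCC does.

```c
/* Only C90-style comments are used so the file also compiles with -ansi. */
#ifdef __STRICT_ANSI__
#define PORTABLE_INLINE __inline__   /* assumed GCC-style alternate keyword */
#else
#define PORTABLE_INLINE inline
#endif

/* Small helper that the compiler is allowed to inline in either mode. */
static PORTABLE_INLINE int square(int x)
{
    return x * x;
}

int sum_of_two_squares(int a, int b)
{
    return square(a) + square(b);
}

/* Returns 1 when the translation unit was compiled in strict mode. */
int strict_mode(void)
{
#ifdef __STRICT_ANSI__
    return 1;
#else
    return 0;
#endif
}
```

Compiling the same file with and without -ansi should change only the value returned by strict_mode().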

-std=s Determine the language standard. s can be c90, c++98, c++0x ...

__STRICT_ANSI__ will be defined if this option specifies strict conformance.

-pedantic
-pedantic-errors
Report all uses of forbidden extensions as warnings/errors.
Should be used with the "-std" switch.
-openmp
-mp
Enable OpenMP.
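A minimal sketch of code that benefits from this switch is shown below. The function name sum_of_squares is invented for the example. When the file is compiled without -openmp (or -mp), the pragma is expected to be ignored and the loop simply runs serially with the same result.

```c
/* Sum 1*1 + 2*2 + ... + n*n. With OpenMP enabled, iterations are divided
   among threads and the per-thread partial sums are combined by the
   reduction clause. */
long sum_of_squares(int n)
{
    long total = 0;
    int i;

    #pragma omp parallel for reduction(+:total)
    for (i = 1; i <= n; ++i) {
        total += (long)i * i;
    }
    return total;
}
```

A possible build line would be "opencc -openmp -c sum.c", where sum.c is a placeholder file name.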
-gnu3
-gnu4
Enable GCC 3.x/4.x compatibility mode.
-fno-asm Don't recognize asm, inline or typeof as keywords, so that these words can be used as identifiers in C programs.
-fno-builtin Don't recognize built-in functions that do not begin with the __builtin_ prefix.
-funsigned-char
-fsigned-char
Whether by default char is signed or unsigned.
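The effect of these two switches can be observed from inside a program. The sketch below uses CHAR_MIN from <limits.h>; the function names are made up for illustration, and the value quoted for the signed case assumes a two's complement target.

```c
#include <limits.h>

/* Returns 1 if plain char is signed in this build, 0 if it is unsigned. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Widening a char whose high bit is set gives a negative int when char is
   signed (-16 on two's complement targets) and 240 when it is unsigned. */
int widen_high_byte(void)
{
    char c = (char)0xF0;
    return (int)c;
}
```

Building the file once with -fsigned-char and once with -funsigned-char should flip both results.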

Preprocessor

-Dname
-Dname=value
Predefine the macro name, with value 1, or with the specified value
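The following sketch shows one common way for source code to cooperate with -D. BUFFER_SIZE and ENABLE_TRACE are hypothetical macro names chosen for the example; they are not predefined by the compiler.

```c
/* Fall back to a default when -DBUFFER_SIZE=value is not on the command line. */
#ifndef BUFFER_SIZE
#define BUFFER_SIZE 256
#endif

int buffer_size(void)
{
    return BUFFER_SIZE;
}

/* A plain -DENABLE_TRACE gives the macro the value 1. */
int tracing_enabled(void)
{
#ifdef ENABLE_TRACE
    return ENABLE_TRACE;
#else
    return 0;
#endif
}
```

With "opencc -DBUFFER_SIZE=1024 -DENABLE_TRACE -c config.c" (config.c being a placeholder name), buffer_size() would return 1024 and tracing_enabled() would return 1.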
-Uname Undefine the (built-in or -D defined) macro name
-M
-MM
Output a rule (to stdout) suitable for Makefile describing the dependencies of the source file.

-MM only outputs header files not in the system header directories.

This option implies -E option.

-MF file The output of -M is written to file. This can also be controlled by environmental variable DEPENDENCIES_OUTPUT.
-MD The same as -M -MF combined, but doesn't imply -E.

*.d files will be generated.

-Wp,opt Pass opt to the C preprocessor.
-ftpp
-cpp
Run Fortran preprocessors.
-fcoco [PathScale] Run ISO/IEC 1539-3 Fortran conditional compilation preprocessor.

Warning messages

-Wall
-fullwarn
Enable all warnings.

-fullwarn will generate comment level diagnostics.

-W
-Wextra
Enable extra warnings.
-w Suppress all warnings.
-Werror Treat warnings as errors.
-Winline Warn if a function can't be inlined by compiler but is declared as such in the program.

Link

-Ldir Also search dir for library files. This can also be controlled by environmental variable LIBRARY_PATH.
-llibrary Link to liblibrary

The linker searches libraries and object files in the order they are specified, so

  foo.o -lz bar.o
  
will search library z after file foo.o but before bar.o, so if bar.o refers to functions in z, then -lz must appear AFTER bar.o
-s Remove all symbol information from the executable

-static Produce statically linked executable

-shared
-fPIC
-rdynamic
Produce shared libraries. For details, see here.

-rdynamic is needed for some uses of dlopen or to allow obtaining backtraces from within a program.
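As an example of the backtrace use case, the sketch below prints the current call stack with backtrace() and backtrace_symbols() from <execinfo.h>. This assumes a glibc-based system; the function name print_backtrace and the MAX_FRAMES limit are made up for the example.

```c
#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_FRAMES 32   /* arbitrary limit chosen for this sketch */

/* Print the current call stack to stderr and return the number of frames.
   Without -rdynamic, frames from the executable itself usually show only
   raw addresses instead of function names. */
int print_backtrace(void)
{
    void *frames[MAX_FRAMES];
    int depth = backtrace(frames, MAX_FRAMES);
    char **symbols = backtrace_symbols(frames, depth);
    int i;

    if (symbols == NULL) {
        return depth;   /* allocation failed, nothing to print */
    }
    for (i = 0; i < depth; ++i) {
        fprintf(stderr, "#%d %s\n", i, symbols[i]);
    }
    free(symbols);      /* one free() releases the whole array */
    return depth;
}
```

Linking the program with -rdynamic exports its symbols to the dynamic symbol table, which is what lets backtrace_symbols() resolve names.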

-nostartfiles Don't link to the standard startup files (so the start point of a program is not main, but _start).

To compile crt1.o, one has to use this option.

Also see here for examples.

-nodefaultlibs Don't link to the standard system libraries (e.g. libgcc.a).
-nostdlib Don't link to the standard system libraries (e.g. libgcc.a) or startup files.
-static-libgcc
-shared-libgcc
Whether libgcc should be statically or dynamically linked.
-Wl,opt Pass opt to the linker.

For example, to link to a library statically, say libstdc++, but link to others dynamically, one can do

    -Wl,-Bstatic -lstdc++ -Wl,-Bdynamic -lm
    
-Wl,-M Enable linker to display link map information.
-Wl,-t Enable linker to display the files it is processing.
-Wl,-rpath=dir Tell linker to add dir to the runtime shared/dynamic libraries search path.
-Wl,--start-group
-Wl,--end-group
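The archives listed between this pair are searched repeatedly until no new undefined references are created, so that circular dependencies between the libraries can be resolved.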

Debugging

-g
-g3
Produce debugging information.

-g3 will include extra information such as macro definitions.

-ggdb
-ggdb3
Produce as much debugging information as possible for GDB to use.
-dD Dump all macro definitions at end of preprocessing.
-trapuv Initialize local variables to the value NaN, so uninitialized local variables will be trapped.
-keep Save all temporary/intermediate files produced during compiling.

Profiling

-pg Produce profiling information for gprof.
-finstrument-functions
-finstrument-functions-exclude-file-list=file,file,..
-finstrument-functions-exclude-function-list=sym,sym,..
Allow the user to supply custom profiling functions.

Basically the user will need to implement the following two functions: __cyg_profile_func_enter and __cyg_profile_func_exit.
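A minimal sketch of the two hooks is given below. The signatures follow the GCC convention for -finstrument-functions; the call_depth counter, the current_call_depth helper and the NO_INSTRUMENT macro are invented for this example. The hooks carry the GCC no_instrument_function attribute (assumed to be accepted by the GCC-based frontend) so that they are not instrumented themselves, which would otherwise recurse endlessly.

```c
#include <stdio.h>

/* Keep the hooks themselves out of the instrumentation. */
#define NO_INSTRUMENT __attribute__((no_instrument_function))

static int call_depth = 0;   /* nesting depth of instrumented calls */

/* Called on entry to every instrumented function. */
NO_INSTRUMENT void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    ++call_depth;
    fprintf(stderr, "%*senter %p (called from %p)\n",
            call_depth * 2, "", this_fn, call_site);
}

/* Called just before every instrumented function returns. */
NO_INSTRUMENT void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    (void)call_site;         /* unused in this sketch */
    fprintf(stderr, "%*sexit  %p\n", call_depth * 2, "", this_fn);
    --call_depth;
}

NO_INSTRUMENT int current_call_depth(void)
{
    return call_depth;
}
```

The file is then compiled and linked together with the rest of the program, which is built with -finstrument-functions. The printed addresses can be mapped back to function names with a tool such as addr2line.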

Optimization

-O0 Don't optimize.
-O
-O1
Optimize.

When any optimization option is used, __OPTIMIZE__ is defined.

-O2 Optimize even more.

This is default.

-O3 Optimize yet more.
-Os Optimize for code size.

This enables all -O2 optimizations that don't increase code size.

This will cause __OPTIMIZE_SIZE__ to be defined.

-Ofast The same as -O3, -ipa, -ffast-math, -OPT:Ofast, -fno-math-errno combined.
-ffast-math Optimize floating-point arithmetic aggressively at the cost of accuracy or consistency.

The macro __FAST_MATH__ will be defined.

-fp-accuracy=level Set the accuracy level of floating-point operations.

level can be strict, strict-contract, relaxed, aggressive

-funsafe-math-optimizations Enable unsafe floating-point operation optimizations, e.g. use associative math, use reciprocal instead of division, disregard floating-point exceptions (division by 0, overflow, underflow, etc).
-OPT:fast_math
-OPT:fast_complex
-OPT:fast_exp
-OPT:fast_sqrt
Generate fast but less accurate code for transcendental functions or other functions.
-OPT:fast_io Inline I/O functions such as printf, scanf
-OPT:IEEE_arith=3 Generate fast but less accurate code which is less conformant with the IEEE 754 standard.
-OPT:roundoff=3 Use any mathematically valid transformation of floating-point expressions and ignore possible cumulative round-off errors.
-fno-math-errno Do not set errno after math function (e.g. sqrt) calls.
-WOPT:unroll=2 Unroll the loops.
-fb-create
-fb-opt
-fb-phase
Profile guided optimization (PGO)/Feedback directed optimization (FDO).
-ipa Link time/Inter-procedural analysis/optimization.
-apo Automatic parallelization.
-mso Multi-core scalability optimization, e.g. do less aggressive prefetching if all cores on the same CPU have a shared/common L3 cache.
-LNO:simd=2 Vectorize loops aggressively (LNO=Loop Nest Optimization)
-LNO:simd_verbose Display diagnostic information about automatic loop vectorization during compilation.
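To try out the vectorization switches, a simple loop such as the one sketched below is a reasonable starting point. The function name saxpy is chosen for the example, and whether the loop is actually vectorized depends on the compiler version and target; that part is an assumption, not a guarantee. The restrict qualifiers (C99) promise that the arrays do not overlap, which typically makes such a loop easier to vectorize.

```c
/* y[i] = a * x[i] + y[i] for i in [0, n).
   Unit-stride accesses and no loop-carried dependence. */
void saxpy(int n, float a, const float *restrict x, float *restrict y)
{
    int i;

    for (i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Compiling with something like "opencc -O3 -LNO:simd=2 -LNO:simd_verbose -c saxpy.c" (placeholder file name) should report what the loop nest optimizer decided for this loop.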
-LNO:vintr=2 Vectorize floating-point operation aggressively.
-LNO:vintr_verbose Display diagnostic information about floating-point operation vectorization during compilation.
-LNO:prefetch=3 Aggressively prefetch memory to improve performance.
-LNO:prefetch_verbose Display diagnostic information about memory prefetching optimization during compilation.
-INLINE:aggressive=ON Inline aggressively.
-INLINE:list=ON Display diagnostic information about inlining.
-march=cpu Generate code for specific cpu.
-mtune=cpu Tune for specific cpu, e.g. auto, anyx86, athlon, athlon64, barcelona, core, em64t, opteron....
-mext Generate code for specific SSE/SIMD extensions or new instructions.

ext can be mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, aes, avx, fma4, xop, pclmul, ieee-fp

Interesting features

-C (Fortran) Generate code to check array bounds.

OpenMP environmental variables

The following environmental variables will affect OpenMP programs.

O64-prefix ones are Open64 specific and PSC-prefix are PathScale specific:

O64_OMP_SET_AFFINITY
PSC_OMP_AFFINITY
Set to TRUE to use the threads' CPU affinity.

For PathScale, the default is TRUE.

O64_OMP_AFFINITY_MAP
PSC_OMP_AFFINITY_MAP
PSC_OMP_CPU_STRIDE
PSC_OMP_CPU_OFFSET
Set the threads' CPU affinity.

If PSC_OMP_AFFINITY_MAP is in use, then PSC_OMP_CPU_STRIDE and PSC_OMP_CPU_OFFSET are ignored.

O64_OMP_SPIN_USER_LOCK Set to TRUE to use user-level spin mechanism (Pthread mutexes) for OpenMP locks.

The default is FALSE.

PSC_OMP_LOCK_SPIN Set to 0 to inhibit user-level spin mechanism (Pthread mutexes) for OpenMP locks.

The default is nonzero.

O64_OMP_SPIN_COUNT
PSC_OMP_THREAD_SPIN
Specify the number of times to check a semaphore before falling back to operating system schedule/reschedule mechanism.

The default value is 20000 for O64_OMP_SPIN_COUNT and 100 for PSC_OMP_THREAD_SPIN.

PSC_OMP_STATIC_FAIR Determine the default static scheduling policy when no chunk size is specified.
O64_OMP_VERBOSE Set to any value to display runtime debugging trace information.
PSC_OMP_SILENT Set to any value to inhibit runtime debugging trace information.
PSC_OMP_GUARD_SIZE Set the number of bytes (e.g. 2m, 4m) allocated for the guard area that is placed below each OpenMP thread.

By default it is 32MB for 64-bit programs and 0 for 32-bit ones.

PSC_OMP_DISABLED Set to any value to disable OpenMP.
PSC_OMP_SERIAL_OUTLINE Set to 1 to localize private variables in the single-thread case.
OMP_SLAVE_STACK_SIZE Set the number of bytes (e.g. 2m, 4m) allocated for each OpenMP thread to use as its private stack.

Built-in macros

One can use

   opencc -E -dM - < /dev/null

to see all built-in macros and their values.

For a comprehensive list of pre-defined C/C++ compiler macros across all platforms, see here

__cplusplus Defined if the C++ compiler is in use.

This is a standard C++ macro.

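__FILE__
__BASE_FILE__
Name of the current input file (as a C string constant). __BASE_FILE__ expands to the name of the main input file instead.

__FILE__ is an ANSI C standard macro; __BASE_FILE__ is a GNU extension.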

__LINE__ Current input line number (as an integer constant)

This is an ANSI C standard macro.

__FUNCTION__
__func__
If inside a function, the current function name (as a C string constant)

__func__ is a predefined identifier in the C99 standard; __FUNCTION__ is a GNU extension with the same meaning.
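These location macros are often combined into a small diagnostic helper. The sketch below formats "file:line (function)" into a caller-supplied buffer; FORMAT_LOCATION, base_name and describe_location are hypothetical names used only for this example.

```c
#include <stdio.h>
#include <string.h>

/* Strip any leading directory part from a path such as the one in __FILE__. */
static const char *base_name(const char *path)
{
    const char *slash = strrchr(path, '/');
    return slash ? slash + 1 : path;
}

/* The macro is expanded at the point of use, so __LINE__ and __func__
   describe the caller of the macro, not this definition. */
#define FORMAT_LOCATION(buf, size) \
    snprintf((buf), (size), "%s:%d (%s)", \
             base_name(__FILE__), __LINE__, __func__)

/* Writes the location of this function into buf and returns the length
   that snprintf reports. */
int describe_location(char *buf, size_t size)
{
    return FORMAT_LOCATION(buf, size);
}
```

Calling describe_location() yields a string that ends in "(describe_location)", preceded by the source file name and the line on which the macro was expanded.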

__DATE__
__TIME__
Date & time on which the preprocessor is run. (as C string constants)

These are ANSI C standard macros.

__TIMESTAMP__ Last modification time of the input file (as a C string constant)
__STDC__ Evaluates to 1 to indicate that the compiler conforms to the ISO C standard.
__GNUC__
__GNUC_MINOR__
__GNUC_PATCHLEVEL__
Evaluate to integer constants representing the GNU (C/C++/Fortran) compiler version numbers (major/minor/patch level).
__OPENCC__
__OPENCC_MINOR__
__OPENCC_PATCHLEVEL__
__PATHCC__
__PATHCC_MINOR__
__PATHCC_PATCHLEVEL__
Evaluate to integer constants representing the Open64/PathScale compiler version numbers (major/minor/patch level).
__VERSION__ Evaluate to a C string constant representing the GNU (C/C++/Fortran) compiler version, e.g. 4.1.2 20080704 (Red Hat 4.1.2-48).
__OPEN64__
__PATHSCALE__
Evaluate to a C string constant representing the Open64/PathScale compiler version, e.g. 4.2.4.
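The version macros above are typically used for compile-time compiler detection. The sketch below is one possible arrangement; the function names and the returned strings are made up for this example, and PathScale is tested first on the assumption that a compiler may define more than one of these macros.

```c
/* Identify the compiler family from the predefined macros. */
const char *compiler_family(void)
{
#if defined(__PATHCC__)
    return "PathScale";
#elif defined(__OPENCC__)
    return "Open64";
#elif defined(__GNUC__)
    return "GNU-compatible";
#else
    return "unknown";
#endif
}

/* Fold the emulated GNU version into one comparable number,
   e.g. 4.1.2 becomes 40102. Returns 0 if __GNUC__ is not defined. */
int gnu_compat_version(void)
{
#if defined(__GNUC__)
    return __GNUC__ * 10000 + __GNUC_MINOR__ * 100 + __GNUC_PATCHLEVEL__;
#else
    return 0;
#endif
}
```

A single number like this makes version checks such as "gnu_compat_version() >= 40100" easy to write.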
__SSE__
__SSE2__
__SSE3__
__SSSE3__
Defined for processors that support SSE/SSE2... instructions.
__OPTIMIZE__
__OPTIMIZE_SIZE__
Is defined if any optimization is used.

Furthermore, __OPTIMIZE_SIZE__ is defined if the optimization is for size, not speed.
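A program can use these macros to record how it was built, for example in a version banner. In the sketch below the function names and the returned strings are invented for illustration; __FAST_MATH__ is the macro that -ffast-math defines.

```c
/* Report the kind of optimization this translation unit was built with. */
const char *build_mode(void)
{
#if defined(__OPTIMIZE_SIZE__)
    return "optimized for size";
#elif defined(__OPTIMIZE__)
    return "optimized for speed";
#else
    return "not optimized";
#endif
}

/* Returns 1 if the file was compiled with -ffast-math, 0 otherwise. */
int fast_math_enabled(void)
{
#ifdef __FAST_MATH__
    return 1;
#else
    return 0;
#endif
}
```

__OPTIMIZE_SIZE__ is tested first because __OPTIMIZE__ is also defined when optimizing for size.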

__COUNTER__ It expands to sequential integral values starting from 0. In conjunction with the ## operator, this provides a convenient means to generate unique identifiers.
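The unique-identifier trick can be sketched as follows. STATIC_CHECK, CONCAT and the other names are hypothetical helpers written for this example. The two-level CONCAT macro is needed so that __COUNTER__ is expanded to a number before ## pastes the tokens together.

```c
/* Two-level concatenation: the outer macro expands its arguments first. */
#define CONCAT_INNER(a, b) a##b
#define CONCAT(a, b) CONCAT_INNER(a, b)

/* Each use declares a typedef with a fresh name (static_check_<n>).
   The array size becomes -1, a compile error, when cond is false. */
#define STATIC_CHECK(cond) \
    typedef char CONCAT(static_check_, __COUNTER__)[(cond) ? 1 : -1]

/* Two checks in the same scope do not clash because the names differ. */
STATIC_CHECK(sizeof(char) == 1);
STATIC_CHECK(sizeof(int) >= 2);

/* Consecutive expansions of __COUNTER__ differ by exactly one. */
enum { FIRST_ID = __COUNTER__, SECOND_ID = __COUNTER__ };

int counter_step(void)
{
    return SECOND_ID - FIRST_ID;
}
```

The example deliberately avoids depending on the absolute counter values, since any earlier use of __COUNTER__ in the same translation unit (for instance in a header) shifts them.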