Open64/PathScale Compiler suite
(For more info about the Open64 compiler infrastructure, see here.) (The PathScale compiler suite is also dubbed EKO: Every Known Optimization.)
| opencc pathcc | C compiler driver |
| openCC pathCC | C++ compiler driver |
| openf90 openf95 pathf90 pathf95 | Fortran compiler driver |
| assign | [PathScale] Change or display the I/O processing directives for a Fortran file or unit. |
| be | Backend |
| cg.so | Code generation |
| coco | [PathScale] ISO/IEC 1539-3 Fortran conditional compilation preprocessor. |
| explain | [PathScale] Display detailed Fortran error message for a given message ID (e.g. "pathf95-0724"). |
| gfec | GCC-based C frontend (produces WHIRL). WHIRL = Winning Hierarchical Intermediate Representation Language. |
| gspin | Output abstract syntax tree of GCC to SPIN |
| ipa.so | Inter-procedural analysis |
| lno.so | Loop nest optimization |
| mfef95 | Cray F90-based Fortran frontend (produces WHIRL) |
| wgen | Transform SPIN to WHIRL |
| wopt.so | WHIRL global scalar optimization |
Now the compiler...
(The default compiler options can be placed in the etc/compiler.defaults file. This compiler.defaults file can also be placed under the directory pointed to by the PSC_COMPILER_DEFAULTS_PATH or OPEN64_COMPILER_DEFAULTS_PATH environmental variable. Alternatively, one can also specify environmental variables such as OPEN64_GENFLAGS, OPEN64_CFLAGS, OPEN64_CXXFLAGS, OPEN64_FFLAGS for Open64, or PSC_GENFLAGS, PSC_CFLAGS, PSC_CXXFLAGS, PSC_FFLAGS for PathScale.)
Compile
| -c | Compile *.c and assemble *.s. NO linking. |
| -Idir | Also search dir for header files. This can also be controlled by environmental variables C_INCLUDE_PATH and CPLUS_INCLUDE_PATH. |
| -S | Compile *.c into assembly code *.s. NO assembling or linking. |
| -E | Run preprocessor only. The output is sent to stdout. |
| -C | When running preprocessor, don't discard comments in the program. |
| -dM | When used with -E option, display definitions of all built-in macros, e.g.
opencc -E -dM - < /dev/null |
| -o file | Place output in file |
| -show -v | When compiling, also display the programs invoked by the compiler. |
| -show0 -### | Like -show, but do NOT invoke the programs. |
| -showt | When compiling, display what functions it's compiling and time/memory used in preprocessing/parsing etc. |
| -dumpversion | Print the version number. |
| -show-defaults | Print the default optimization level and compilation target. The defaults come from the compiler.defaults file described above. |
C/C++ dialect
| -ansi -std=c90 -std=c++98 | Conform strictly to the ISO C90 (or, for C++, C++98) standard. In particular, C programs can't use C++-style "//" comments or the inline keyword.
__STRICT_ANSI__ will be defined if this option is used. |
| -std=s | Determine the language standard. s can be c90, c++98, c++0x ...
__STRICT_ANSI__ will be defined if this option specifies strict conformance. |
| -pedantic -pedantic-errors | Flag every use of a forbidden extension as a warning (-pedantic) or an error (-pedantic-errors). Should be used with the "-std" switch. |
| -openmp -mp | Enable OpenMP. |
| -gnu3 -gnu4 | Enable GCC 3.x/4.x compatibility mode. |
| -fno-asm | Don't recognize asm, inline or typeof as a keyword, so that these words can be used in C programs as identifiers. |
| -fno-builtin | Don't recognize built-in functions that do not begin with __builtin_ as prefix. |
| -funsigned-char -fsigned-char | Whether by default char is signed or unsigned. |
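The effect of the dialect switches above can be observed from inside a program. The following is a minimal sketch, not part of the compiler documentation: the function names are made up for illustration, and it relies only on the standard CHAR_MIN constant and on the __STRICT_ANSI__ macro mentioned in the table. Comments use the "/* */" form so the file still compiles under -ansi.

```c
#include <limits.h>

/* Returns 1 when plain char is signed in this build, 0 when it is
   unsigned. CHAR_MIN is 0 for an unsigned char and negative otherwise,
   so -funsigned-char / -fsigned-char should flip this result. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Returns 1 when strict conformance was requested (for example with
   -ansi), 0 otherwise. */
int strict_ansi_requested(void)
{
#ifdef __STRICT_ANSI__
    return 1;
#else
    return 0;
#endif
}
```

Building the same file once with -funsigned-char and once with -fsigned-char, then calling plain_char_is_signed(), is a quick way to confirm which default is in effect.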
Preprocessor
| -Dname -Dname=value | Predefine the macro name, with value 1, or with the specified value. |
| -Uname | Un-define the (built-in or -D defined) macro name |
| -M -MM | Output a rule (to stdout) suitable for Makefile describing the dependencies of the source file.
-MM only outputs header files not in the system header directories. This option implies -E option. |
| -MF file | The output of -M is written to file. This can also be controlled by environmental variable DEPENDENCIES_OUTPUT. |
| -MD | The same as -M -MF combined, but doesn't imply -E. *.d files will be generated. |
| -Wp,opt | Pass opt to the C preprocessor. |
| -ftpp -cpp | Run Fortran preprocessors. |
| -fcoco | [PathScale] Run ISO/IEC 1539-3 Fortran conditional compilation preprocessor. |
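As an illustration of -Dname and -Dname=value, here is a small sketch of source code that picks up values from the command line. BUFFER_SIZE and VERBOSE are hypothetical macro names chosen for this example, not macros known to the compiler.

```c
/* BUFFER_SIZE is a hypothetical macro for this sketch. It can be
   overridden at compile time, e.g. with -DBUFFER_SIZE=1024. */
#ifndef BUFFER_SIZE
#define BUFFER_SIZE 256
#endif

/* -DVERBOSE (no value given) defines VERBOSE with the value 1;
   -UVERBOSE later on the command line removes it again. */
#ifdef VERBOSE
#define VERBOSE_ENABLED 1
#else
#define VERBOSE_ENABLED 0
#endif

/* Expose the configured values so they can be inspected at run time. */
int configured_buffer_size(void)
{
    return BUFFER_SIZE;
}

int verbose_enabled(void)
{
    return VERBOSE_ENABLED;
}
```

Compiled without any -D switch, the fallback value 256 is used and verbose mode is off.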
Warning messages
| -Wall -fullwarn | Enable all warnings. -fullwarn will generate comment level diagnostics. |
| -W -Wextra | Enable extra warnings. |
| -w | Suppress all warnings. |
| -Werror | Treat warnings as errors. |
| -Winline | Warn if a function can't be inlined by compiler but is declared as such in the program. |
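To see what -Winline is about, consider a function that is declared inline but is hard for a compiler to inline. The sketch below uses a variadic function, which GCC refuses to inline; whether the Open64/PathScale inliner reports it the same way is an assumption. The function name is illustrative.

```c
#include <stdarg.h>

/* Declared inline, but it walks a variable argument list, which
   compilers generally cannot expand at the call site. With -Winline
   a diagnostic may be issued here. */
static inline int sum_ints(int count, ...)
{
    va_list args;
    int total = 0;
    int i;

    va_start(args, count);
    for (i = 0; i < count; i++)
        total += va_arg(args, int);
    va_end(args);
    return total;
}
```

The function still works when it is not inlined; the warning only says that the inline request could not be honored.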
Link
| -Ldir | Also search dir for library files. This can also be controlled by environmental variable LIBRARY_PATH. |
| -llibrary | Link to liblibrary. The linker searches libraries and object files in the order they are specified, so foo.o -lz bar.o will search library z after file foo.o but before bar.o; if bar.o refers to functions in z, then -lz must appear AFTER bar.o. |
| -s | Remove all symbol information from the executable. |
| -static | Produce a statically linked executable. |
| -shared -fPIC -rdynamic |
Produce shared libraries. For details, see here.
-rdynamic is needed for some uses of dlopen or to allow obtaining backtraces from within a program. |
| -nostartfiles | Don't link to the standard startup files (so the start point of a program is not main, but _start). To compile crt1.o, one has to use this option. Also see here for examples. |
| -nodefaultlibs | Don't link to the standard system libraries (e.g. libgcc.a). |
| -nostdlib | Don't link to the standard system libraries (e.g. libgcc.a) or startup files. |
| -static-libgcc -shared-libgcc | Whether libgcc should be statically or dynamically linked. |
| -Wl,opt | Pass opt to the linker.
For example, to link to a library statically, say libstdc++, but link to others dynamically, one can do
-Wl,-Bstatic -lstdc++ -Wl,-Bdynamic -lm
|
| -Wl,-M | Enable linker to display link map information. |
| -Wl,-t | Enable linker to display the files it is processing. |
| -Wl,-rpath=dir | Tell linker to add dir to the runtime shared/dynamic libraries search path. |
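| -Wl,--start-group -Wl,--end-group |
The archives listed between this pair are searched repeatedly by the linker until no new undefined references can be resolved, which helps with circular dependencies between static libraries. |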
Debugging
| -g -g3 | Produce debugging information. -g3 will include extra information such as macro definitions. |
| -ggdb -ggdb3 | Produce as much debugging information as possible for GDB to use. |
| -dD | Dump all macro definitions at end of preprocessing. |
| -trapuv | Initialize local variables to the value NaN, so uninitialized local variables will be trapped. |
| -keep | Save all temporary/intermediate files produced during compiling. |
Profiling
| -pg | Produce profiling information for gprof. |
| -finstrument-functions
-finstrument-functions-
-finstrument-functions- | Let the user supply their own profiling functions.
The user needs to implement the following two functions: __cyg_profile_func_enter and __cyg_profile_func_exit. |
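A minimal sketch of the two hooks used by -finstrument-functions follows. The hook signatures are the ones documented for GCC; the assumption here is that the Open64/PathScale GCC-based front ends accept them, and the no_instrument_function attribute, unchanged. The counters and helper names are illustrative only.

```c
#include <stdio.h>

/* Keep the hooks themselves from being instrumented, which would
   otherwise recurse forever. */
#define NO_INSTRUMENT __attribute__((no_instrument_function))

static int call_depth = 0;    /* current nesting level           */
static long enter_count = 0;  /* total number of function entries */

/* Called on entry to every instrumented function. */
NO_INSTRUMENT void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    (void)call_site;
    enter_count++;
    call_depth++;
    fprintf(stderr, "enter %p (depth %d)\n", this_fn, call_depth);
}

/* Called just before every instrumented function returns. */
NO_INSTRUMENT void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    (void)this_fn;
    (void)call_site;
    call_depth--;
}

/* Helpers for inspecting the counters. */
NO_INSTRUMENT int current_call_depth(void)
{
    return call_depth;
}

NO_INSTRUMENT long total_calls_seen(void)
{
    return enter_count;
}
```

When the rest of the program is compiled with -finstrument-functions and linked with this file, each function call is expected to print one line to stderr; the addresses can then be mapped back to names with a symbol tool.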
Optimization
| -O0 | Don't optimize. |
| -O -O1 | Optimize. When any optimization option is used, __OPTIMIZE__ is defined. |
| -O2 | Optimize even more. This is the default. |
| -O3 | Optimize yet more. |
| -Os | Optimize for code size. This enables all -O2 optimizations that don't increase code size. This will cause __OPTIMIZE_SIZE__ to be defined. |
| -Ofast | The same as -O3, -ipa, -ffast-math, -OPT:Ofast, -fno-math-errno combined. |
| -ffast-math | Optimize floating-point arithmetic aggressively at the cost of accuracy or consistency. The macro __FAST_MATH__ will be defined. |
| -fp-accuracy=level | Set the accuracy level of floating-point operations. level can be strict, strict-contract, relaxed, aggressive. |
| -funsafe-math-optimizations | Enable unsafe floating-point operation optimizations, e.g. use associative math, use reciprocal instead of division, disregard floating-point exceptions (division by 0, overflow, underflow, etc). |
| -OPT:fast_math -OPT:fast_complex -OPT:fast_exp -OPT:fast_sqrt | Generate fast but less accurate code for transcendental functions or other functions. |
| -OPT:fast_io | Inline I/O functions such as printf, scanf |
| -OPT:IEEE_arith=3 | Generate fast but less accurate code which is less conformant with the IEEE 754 standard. |
| -OPT:roundoff=3 | Use any mathematically valid transformation of floating-point expressions and ignore possible cumulative round-off errors. |
| -fno-math-errno | Do not set errno after math function (e.g. sqrt) calls. |
| -WOPT:unroll=2 | Unroll the loops. |
| -fb-create -fb-opt -fb-phase | Profile guided optimization (PGO)/Feedback directed optimization (FDO). |
| -ipa | Link time/Inter-procedural analysis/optimization. |
| -apo | Automatic parallelization. |
| -mso | Multi-core scalability optimization, e.g. do less aggressive prefetching if all cores on the same CPU have a shared/common L3 cache. |
| -LNO:simd=2 | Vectorize loops aggressively (LNO=Loop Nest Optimization) |
| -LNO:simd_verbose | Display diagnostic information about automatic loop vectorization during compilation. |
| -LNO:vintr=2 | Vectorize floating-point operation aggressively. |
| -LNO:vintr_verbose | Display diagnostic information about floating-point operation vectorization during compilation. |
| -LNO:prefetch=3 | Aggressively prefetch memory to improve performance. |
| -LNO:prefetch_verbose | Display diagnostic information about memory prefetching optimization during compilation. |
| -INLINE:aggressive=ON | Inline aggressively. |
| -INLINE:list=ON | Display diagnostic information about inlining. |
| -march=cpu | Generate code for specific cpu. |
| -mtune=cpu | Tune for specific cpu, e.g. auto, anyx86, athlon, athlon64, barcelona, core, em64t, opteron.... |
| -mext | Generate code for specific SSE/SIMD extensions or new instructions. ext can be mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, aes, avx, fma4, xop, pclmul, ieee-fp |
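Options such as -ffast-math, -funsafe-math-optimizations and -OPT:roundoff=3 let the compiler treat floating-point addition as associative. The sketch below shows why that can change results: in IEEE double arithmetic the two groupings of the same three numbers give different sums. The function names are illustrative, and the volatile temporaries are there only to force each intermediate to be rounded to double and to prevent compile-time folding.

```c
/* Computes (a + b) + c with the intermediate rounded to double. */
double sum_left_to_right(double a, double b, double c)
{
    volatile double partial = a + b;
    return partial + c;
}

/* Computes a + (b + c) with the intermediate rounded to double. */
double sum_right_to_left(double a, double b, double c)
{
    volatile double partial = b + c;
    return a + partial;
}

/* Returns 1 when the translation unit was built with fast math
   enabled (the __FAST_MATH__ macro from the table above). */
int fast_math_enabled(void)
{
#ifdef __FAST_MATH__
    return 1;
#else
    return 0;
#endif
}
```

With a = 1e20, b = -1e20 and c = 1.0, the first grouping yields 1.0 while the second yields 0.0, because 1.0 is lost when it is added to -1e20. A compiler that is allowed to reassociate may pick either grouping.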
Interesting features
| -C | (Fortran) Generate code to check array bounds. |
OpenMP environmental variables
The following environmental variables affect OpenMP programs. The O64-prefixed ones are Open64 specific and the PSC-prefixed ones are PathScale specific:
| O64_OMP_SET_AFFINITY PSC_OMP_AFFINITY |
Set to TRUE to use the threads' CPU affinity.
For PathScale, the default is TRUE. |
| O64_OMP_AFFINITY_MAP PSC_OMP_AFFINITY_MAP PSC_OMP_CPU_STRIDE PSC_OMP_CPU_OFFSET |
Set the threads' CPU affinity.
If PSC_OMP_AFFINITY_MAP is in use, then PSC_OMP_CPU_STRIDE and PSC_OMP_CPU_OFFSET are ignored. |
| O64_OMP_SPIN_USER_LOCK | Set to TRUE to use the user-level spin mechanism (Pthread mutexes) for OpenMP locks. The default is FALSE. |
| PSC_OMP_LOCK_SPIN | Set to 0 to inhibit the user-level spin mechanism (Pthread mutexes) for OpenMP locks. The default is nonzero. |
| O64_OMP_SPIN_COUNT PSC_OMP_THREAD_SPIN | Specify the number of times to check a semaphore before falling back to the operating system's schedule/reschedule mechanism. The default value is 20000 for O64_OMP_SPIN_COUNT and 100 for PSC_OMP_THREAD_SPIN. |
| PSC_OMP_STATIC_FAIR | Determine the default static scheduling policy when no chunk size is specified. |
| O64_OMP_VERBOSE | Set to any value to display runtime debugging trace information. |
| PSC_OMP_SILENT | Set to any value to inhibit runtime debugging trace information. |
| PSC_OMP_GUARD_SIZE | Set the number of bytes (e.g. 2m, 4m) allocated for the guard area that is placed below each OpenMP thread. By default it is 32MB for 64-bit programs and 0 for 32-bit ones. |
| PSC_OMP_DISABLED | Set to any value to disable OpenMP. |
| PSC_OMP_SERIAL_OUTLINE | Set to 1 to localize private variables in the single-thread case. |
| OMP_SLAVE_STACK_SIZE | Set the number of bytes (e.g. 2m, 4m) allocated for each OpenMP thread to use as the private stack for the thread. |
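A small OpenMP test program is handy for trying out the variables above. The following is a sketch only: the function names are illustrative, and it uses just the standard _OPENMP macro and omp_get_max_threads() so that it also builds when OpenMP is not enabled.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Sum the integers 1..n. The loop iterations are divided among the
   threads when the file is compiled with OpenMP enabled (-openmp or
   -mp); otherwise the pragma is ignored and the loop runs serially. */
long parallel_sum(int n)
{
    long total = 0;
    int i;

#pragma omp parallel for reduction(+:total)
    for (i = 1; i <= n; i++)
        total += i;
    return total;
}

/* Number of threads a parallel region may use; 1 without OpenMP. */
int available_threads(void)
{
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}
```

Running a program built from this with, for example, O64_OMP_VERBOSE set (Open64) or PSC_OMP_DISABLED set (PathScale) should make the effect of those variables visible while the computed sum stays the same.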
Built-in macros
One can use opencc -E -dM - < /dev/null
to see all built-in macros and their values.
For a comprehensive list of pre-defined C/C++ compiler macros across all platforms, see here
| __cplusplus | Defined when a C++ compiler is in use. This is a standard C++ macro. | |
| __FILE__ __BASE_FILE__ | Name of the current input file (as a C string constant). __FILE__ is an ANSI C standard macro; __BASE_FILE__ is a GCC extension that names the main input file. | |
| __LINE__ | Current input line number (as an integer constant). This is an ANSI C standard macro. | |
| __FUNCTION__ __func__ | Inside a function, the current function name (as a C string). __func__ is a C99 standard predefined identifier; __FUNCTION__ is the older GCC spelling. | |
| __DATE__ __TIME__ |
Date & time on which the preprocessor is run. (as C string constants)
These are ANSI C standard macros. | |
| __TIMESTAMP__ | Last modification time of the input file (as a C string constant) | |
| __STDC__ | Evaluate to 1 to mean the compiler is ISO standard conformant. | |
| __GNUC__ __GNUC_MINOR__ __GNUC_PATCHLEVEL__ |
Evaluate to integer constants representing the GNU (C/C++/Fortran) compiler version numbers (major/minor/patch level). | |
| __OPENCC__ __OPENCC_MINOR__ __OPENCC_PATCHLEVEL__ __PATHCC__ __PATHCC_MINOR__ __PATHCC_PATCHLEVEL__ |
Evaluate to integer constants representing the Open64/PathScale compiler version numbers (major/minor/patch level). | |
| __VERSION__ | Evaluate to a C string constant representing the GNU (C/C++/Fortran) compiler version, e.g. 4.1.2 20080704 (Red Hat 4.1.2-48). | |
| __OPEN64__ __PATHSCALE__ |
Evaluate to a C string constant representing the Open64/PathScale compiler version, e.g. 4.2.4. | |
| __SSE__ __SSE2__ __SSE3__ __SSSE3__ | Defined for processors that support SSE/SSE2... instructions. | |
| __OPTIMIZE__ __OPTIMIZE_SIZE__ | __OPTIMIZE__ is defined if any optimization is used. Furthermore, __OPTIMIZE_SIZE__ is defined if the optimization is for size, not speed. | |
| __COUNTER__ | It expands to sequential integral values starting from 0. In conjunction with the ## operator, this provides a convenient means to generate unique identifiers. |
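Several of the macros in this table can be combined in ordinary code. The sketch below is an illustration under the assumption that the compiler in use supports __COUNTER__ as listed above; all function, macro and variable names are made up for the example.

```c
#include <stdio.h>

/* Two-level concatenation so that __COUNTER__ is expanded to its
   numeric value before the ## operator pastes it onto the prefix. */
#define CONCAT_INNER(a, b) a##b
#define CONCAT(a, b) CONCAT_INNER(a, b)
#define UNIQUE_NAME(prefix) CONCAT(prefix, __COUNTER__)

/* Each use of __COUNTER__ yields the next integer. */
static const int first_id = __COUNTER__;
static const int second_id = __COUNTER__;

/* Two distinct variables generated from the same prefix. */
int UNIQUE_NAME(scratch_) = 0;
int UNIQUE_NAME(scratch_) = 0;

/* Difference between two consecutive __COUNTER__ expansions. */
int counter_step(void)
{
    return second_id - first_id;
}

/* Writes "file:line (function)" for this location into buf. */
int describe_location(char *buf, size_t size)
{
    return snprintf(buf, size, "%s:%d (%s)", __FILE__, __LINE__, __func__);
}

/* Reports how the file was optimized, based on the built-in macros. */
const char *optimization_mode(void)
{
#if defined(__OPTIMIZE_SIZE__)
    return "size";
#elif defined(__OPTIMIZE__)
    return "speed";
#else
    return "none";
#endif
}
```

The indirection through CONCAT is what makes the unique names work; pasting __COUNTER__ directly with ## would produce the literal token prefix__COUNTER__ instead of a number.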