Open64/PathScale Compiler suite
(For more info about Open64 compiler infrastructure, see here.) (PathScale compiler suite is also dubbed EKO: Every Known Optimization.)
opencc pathcc | C compiler driver |
openCC pathCC | C++ compiler driver |
openf90 openf95 pathf90 pathf95 | Fortran compiler driver |
assign | [PathScale] Change or display the I/O processing directives for a Fortran file or unit. |
be | Backend |
cg.so | Code generation |
coco | [PathScale] ISO/IEC 1539-3 Fortran conditional compilation preprocessor. |
explain | [PathScale] Display detailed Fortran error message for a given message ID (e.g. "pathf95-0724"). |
gfec | GCC-based C frontend (produces WHIRL)
WHIRL=Winning Hierarchical Intermediate Representation Language |
gspin | Output abstract syntax tree of GCC to SPIN |
ipa.so | Inter-procedural analysis |
lno.so | Loop nest optimization |
mfef95 | Cray F90-based Fortran frontend (produces WHIRL) |
wgen | Transform SPIN to WHIRL |
wopt.so | WHIRL global scalar optimization |
Now the compiler...
(The default compiler options can be placed in the etc/compiler.defaults file. This compiler.defaults file can also be placed under the directory pointed to by the PSC_COMPILER_DEFAULTS_PATH or OPEN64_COMPILER_DEFAULTS_PATH environmental variable. Alternatively, one can also specify environmental variables such as OPEN64_GENFLAGS, OPEN64_CFLAGS, OPEN64_CXXFLAGS, OPEN64_FFLAGS for Open64, or PSC_GENFLAGS, PSC_CFLAGS, PSC_CXXFLAGS, PSC_FFLAGS for PathScale.)
Compile
-c | Compile *.c and assemble *.s. NO linking. |
-Idir | Also search dir for header files. This can also be controlled by environmental variables C_INCLUDE_PATH and CPLUS_INCLUDE_PATH. |
-S | Compile *.c into assembly codes *.s. NO linking. |
-E | Run preprocessor only. The output is sent to stdout. |
-C | When running preprocessor, don't discard comments in the program. |
-dM | When used with -E option, display definitions of all built-in macros, e.g.
opencc -E -dM - < /dev/null |
-o file | Place output in file |
-show -v | When compiling, also display the programs invoked by the compiler. |
-show0 -### | Like -show, but do NOT invoke the programs. |
-showt | When compiling, display what functions it's compiling and time/memory used in preprocessing/parsing etc. |
-dumpversion | Print the version number. |
-show-defaults | Print the default optimization level and compilation target.
The default compiler options can be placed in the etc/compiler.defaults file. This compiler.defaults file can also be placed under the directory pointed by PSC_COMPILER_DEFAULTS_PATH or OPEN64_COMPILER_DEFAULTS_PATH environmental variable. |
C/C++ dialect
-ansi -std=c90 -std=c++98 | Conform strictly to the ISO C90 standard (or C++98 for -std=c++98). In particular, C programs can't use C++ style "//" comments or the inline keyword.
__STRICT_ANSI__ will be defined if this option is used. |
-std=s | Determine the language standard. s can be c90, c++98, c++0x ...
__STRICT_ANSI__ will be defined if this option specifies strict conformance. |
-pedantic -pedantic-errors | Report every use of a forbidden extension as a warning (-pedantic) or as an error (-pedantic-errors). Should be used with the "-std" switch. |
-openmp -mp | Enable OpenMP. |
-gnu3 -gnu4 | Enable GCC 3.x/4.x compatibility mode. |
-fno-asm | Don't recognize asm, inline or typeof as a keyword, so that these words can be used in C programs as identifiers. |
-fno-builtin | Don't recognize built-in functions that do not begin with __builtin_ as prefix. |
-funsigned-char -fsigned-char | Whether by default char is signed or unsigned. |
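As a rough illustration of how these dialect switches become visible inside a program, the sketch below reports whether plain char is signed (which -fsigned-char and -funsigned-char control) and whether a strict-conformance mode defined __STRICT_ANSI__. The function names are hypothetical and exist only for this example.

```c
#include <limits.h>

/* Returns 1 if plain char is a signed type in this translation unit,
   0 otherwise. CHAR_MIN is negative exactly when char is signed, which
   is what -fsigned-char / -funsigned-char select. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Returns 1 if the compiler was run in a strict-conformance mode
   (for example -ansi), which defines __STRICT_ANSI__; 0 otherwise. */
int strict_ansi_requested(void)
{
#ifdef __STRICT_ANSI__
    return 1;
#else
    return 0;
#endif
}
```

Compiling the same file once with -funsigned-char and once with -fsigned-char should flip the result of plain_char_is_signed(), assuming the front end follows the usual GCC behaviour.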
Preprocessor
-Dname -Dname=value |
Predefine the macro name, with value 1, or with the specified value |
-Uname | Un-define the (built-in or -D defined) macro name |
-M -MM | Output a rule (to stdout) suitable for Makefile describing the dependencies of the source file.
-MM only outputs header files not in the system header directories. This option implies -E option. |
-MF file | The output of -M is written to file. This can also be controlled by environmental variable DEPENDENCIES_OUTPUT. |
-MD | The same as -M -MF combined, but doesn't imply -E. *.d files will be generated. |
-Wp,opt | Pass opt to the C preprocessor. |
-ftpp -cpp | Run Fortran preprocessors. |
-fcoco | [PathScale] Run ISO/IEC 1539-3 Fortran conditional compilation preprocessor. |
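The -D and -U switches are easiest to see from the source side. The sketch below assumes two hypothetical macro names, BUFFER_SIZE and VERBOSE, that are not part of the compiler; it only shows the common pattern of giving a macro a fallback value that a command-line definition can override.

```c
/* BUFFER_SIZE is a hypothetical macro used only for this sketch.
   Compiling with -DBUFFER_SIZE=1024 overrides the fallback below;
   omitting -D (or adding -UBUFFER_SIZE) leaves the fallback in effect. */
#ifndef BUFFER_SIZE
#define BUFFER_SIZE 256
#endif

/* VERBOSE is likewise hypothetical. A bare -DVERBOSE defines it with
   the value 1; the #ifdef only checks whether it is defined at all. */
#ifdef VERBOSE
#define VERBOSE_ENABLED 1
#else
#define VERBOSE_ENABLED 0
#endif

/* Returns the buffer size chosen at compile time. */
int configured_buffer_size(void)
{
    return BUFFER_SIZE;
}

/* Returns 1 if VERBOSE was defined at compile time, 0 otherwise. */
int verbose_enabled(void)
{
    return VERBOSE_ENABLED;
}
```

Running the preprocessor alone with -E on such a file is a quick way to confirm which value was actually substituted.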
Warning messages
-Wall -fullwarn | Enable all warnings. -fullwarn will generate comment level diagnostics. |
-W -Wextra | Enable extra warnings. |
-w | Suppress all warnings. |
-Werror | Treat warnings as errors. |
-Winline | Warn if a function can't be inlined by compiler but is declared as such in the program. |
Link
-Ldir | Also search dir for library files. This can also be controlled by environmental variable LIBRARY_PATH. |
-llibrary | Link to liblibrary. The linker searches libraries and object files in the order they are specified, so foo.o -lz bar.o will search library z after file foo.o but before bar.o. If bar.o refers to functions in z, then -lz must appear AFTER bar.o. |
-s | Remove all symbol information from the executable. |
-static | Produce a statically linked executable. |
-shared -fPIC -rdynamic |
Produce shared libraries. For details, see here.
-rdynamic is needed for some uses of dlopen or to allow obtaining backtraces from within a program. |
-nostartfiles | Don't link to the standard startup files (so the start point of a program is not main, but _start). To compile crt1.o, one has to use this option. Also see here for examples. |
-nodefaultlibs | Don't link to the standard system libraries (e.g. libgcc.a). |
-nostdlib | Don't link to the standard system libraries (e.g. libgcc.a) or startup files. |
-static-libgcc -shared-libgcc | Whether libgcc should be statically or dynamically linked. |
-Wl,opt | Pass opt to the linker.
For example, to link to a library statically, say libstdc++, but link to others dynamically, one can do -Wl,-Bstatic -lstdc++ -Wl,-Bdynamic -lm |
-Wl,-M | Enable linker to display link map information. |
-Wl,-t | Enable linker to display the files it is processing. |
-Wl,-rpath=dir | Tell linker to add dir to the runtime shared/dynamic libraries search path. |
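-Wl,--start-group -Wl,--end-group |
The archives listed between this pair are searched repeatedly by the linker until no new undefined references are resolved, which helps when libraries depend on each other circularly. |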
Debugging
-g -g3 | Produce debugging information. -g3 will include extra information such as macro definitions. |
-ggdb -ggdb3 | Produce as much debugging information as possible for GDB to use. |
-dD | Dump all macro definitions at end of preprocessing. |
-trapuv | Initialize local variables to the value NaN, so uninitialized local variables will be trapped. |
-keep | Save all temporary/intermediate files produced during compiling. |
Profiling
-pg | Produce profiling information for gprof. |
-finstrument-functions
-finstrument-functions-
-finstrument-functions- | Let users supply their own profiling functions.
The user needs to implement two functions, __cyg_profile_func_enter and __cyg_profile_func_exit, which are called on entry to and exit from each instrumented function. |
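A minimal sketch of the two hooks might look like the following. The hook signatures follow the GCC convention; the counters and accessor functions are hypothetical names introduced for this example, and it is an assumption that the Open64/PathScale front ends accept the no_instrument_function attribute in the same way GCC does.

```c
/* Counters updated by the hooks; the names are hypothetical. */
static unsigned long enter_count = 0;
static unsigned long exit_count = 0;

/* The hooks themselves must not be instrumented, otherwise they would
   call themselves recursively. */
void __cyg_profile_func_enter(void *this_fn, void *call_site)
    __attribute__((no_instrument_function));
void __cyg_profile_func_exit(void *this_fn, void *call_site)
    __attribute__((no_instrument_function));

/* Called on entry to every instrumented function. */
void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    (void)this_fn;   /* address of the function being entered */
    (void)call_site; /* address the function was called from */
    enter_count++;
}

/* Called on exit from every instrumented function. */
void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    (void)this_fn;
    (void)call_site;
    exit_count++;
}

/* Total number of function entries recorded so far. */
__attribute__((no_instrument_function))
unsigned long total_entries(void)
{
    return enter_count;
}

/* Number of instrumented calls that have been entered but not yet left. */
__attribute__((no_instrument_function))
unsigned long calls_in_flight(void)
{
    return enter_count - exit_count;
}
```

When the rest of the program is compiled with -finstrument-functions, the counters advance automatically; without that switch the hooks are ordinary functions that are never called implicitly.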
Optimization
-O0 | Don't optimize. |
-O -O1 | Optimize. When any optimization option is used, __OPTIMIZE__ is defined. |
-O2 | Optimize even more. This is default. |
-O3 | Optimize yet more. |
-Os | Optimize for code size. This enables all -O2 optimizations that don't increase code size. This will cause __OPTIMIZE_SIZE__ to be defined. |
-Ofast | The same as -O3, -ipa, -ffast-math, -OPT:Ofast, -fno-math-errno combined. |
-ffast-math | Optimize floating-point arithmetic aggressively at cost in accuracy or consistency. The macro __FAST_MATH__ will be defined. |
-fp-accuracy=level | Set the accuracy level of floating-point operations. level can be strict, strict-contract, relaxed, aggressive. |
-funsafe-math-optimizations | Enable unsafe floating-point operation optimizations, e.g. use associative math, use reciprocal instead of division, disregard floating-point exceptions (division by 0, overflow, underflow, etc). |
-OPT:fast_math -OPT:fast_complex -OPT:fast_exp -OPT:fast_sqrt |
Generate fast but less accurate code for transcendental functions or other functions. |
-OPT:fast_io | Inline I/O functions such as printf, scanf |
-OPT:IEEE_arith=3 | Generate fast but less accurate code which is less conformant with the IEEE 754 standard. |
-OPT:roundoff=3 | Use any mathematically valid transformation of floating-point expressions and ignore possible cumulative round-off errors. |
-fno-math-errno | Do not set errno after math function (e.g. sqrt) calls. |
-WOPT:unroll=2 | Unroll the loops. |
-fb-create -fb-opt -fb-phase | Profile guided optimization (PGO)/Feedback directed optimization (FDO). |
-ipa | Link time/Inter-procedural analysis/optimization. |
-apo | Automatic parallelization. |
-mso | Multi-core scalability optimization, e.g. do less aggressive prefetching if all cores on the same CPU have a shared/common L3 cache. |
-LNO:simd=2 | Vectorize loops aggressively (LNO=Loop Nest Optimization) |
-LNO:simd_verbose | Display diagnostic information about automatic loop vectorization during compilation. |
-LNO:vintr=2 | Vectorize floating-point operation aggressively. |
-LNO:vintr_verbose | Display diagnostic information about floating-point operation vectorization during compilation. |
-LNO:prefetch=3 | Aggressively prefetch memory to improve performance. |
-LNO:prefetch_verbose | Display diagnostic information about memory prefetching optimization during compilation. |
-INLINE:aggressive=ON | Inline aggressively. |
-INLINE:list=ON | Display diagnostic information about inlining. |
-march=cpu | Generate code for specific cpu. |
-mtune=cpu | Tune for specific cpu, e.g. auto, anyx86, athlon, athlon64, barcelona, core, em64t, opteron.... |
-mext | Generate code for specific SSE/SIMD extensions or new instructions. ext can be mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, aes, avx, fma4, xop, pclmul, ieee-fp |
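Several of the floating-point switches above (-ffast-math, -funsafe-math-optimizations, -OPT:roundoff=3) allow the compiler to regroup arithmetic as if it were associative. The sketch below shows why that can change results: the two hypothetical functions compute the same sum with different groupings, and a volatile temporary pins each grouping so that the difference stays visible regardless of optimization level.

```c
/* Computes (a + b) + c. The volatile temporary forces the compiler to
   keep this grouping instead of reassociating the additions. */
double sum_left_first(double a, double b, double c)
{
    volatile double partial = a + b;
    return partial + c;
}

/* Computes a + (b + c), again with the grouping pinned by volatile. */
double sum_right_first(double a, double b, double c)
{
    volatile double partial = b + c;
    return a + partial;
}
```

With a = 1e20, b = -1e20 and c = 1.0, the first grouping yields 1.0 while the second yields 0.0, because adding 1.0 to -1e20 is lost to rounding in double precision. A compiler permitted to reassociate may silently turn one form into the other, which is the accuracy cost these options trade for speed.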
Interesting features
-C | (Fortran) Generate code to check array bounds. |
OpenMP environmental variables
The following environmental variables affect OpenMP programs. Those with the O64 prefix are Open64 specific and those with the PSC prefix are PathScale specific:
O64_OMP_SET_AFFINITY PSC_OMP_AFFINITY |
Set to TRUE to use the threads' CPU affinity.
For PathScale, the default is TRUE. |
O64_OMP_AFFINITY_MAP PSC_OMP_AFFINITY_MAP PSC_OMP_CPU_STRIDE PSC_OMP_CPU_OFFSET |
Set the threads' CPU affinity.
If PSC_OMP_AFFINITY_MAP is in use, then PSC_OMP_CPU_STRIDE and PSC_OMP_CPU_OFFSET are ignored. |
O64_OMP_SPIN_USER_LOCK | Set to TRUE to use user-level spin mechanism (Pthread mutexes) for OpenMP locks. The default is FALSE. |
PSC_OMP_LOCK_SPIN | Set to 0 to inhibit user-level spin mechanism (Pthread mutexes) for OpenMP locks. The default is nonzero. |
O64_OMP_SPIN_COUNT PSC_OMP_THREAD_SPIN |
Specify the number of times to check a semaphore before falling back to the operating system's schedule/reschedule mechanism. The default value is 20000 for O64_OMP_SPIN_COUNT and 100 for PSC_OMP_THREAD_SPIN. |
PSC_OMP_STATIC_FAIR | Determine the default static scheduling policy when no chunk size is specified. |
O64_OMP_VERBOSE | Set to any value to display runtime debugging trace information. |
PSC_OMP_SILENT | Set to any value to inhibit runtime debugging trace information. |
PSC_OMP_GUARD_SIZE | Set the number of bytes (e.g. 2m, 4m) allocated for the guard area that is placed below each OpenMP thread. By default it is 32MB for 64-bit programs and 0 for 32-bit ones. |
PSC_OMP_DISABLED | Set to any value to disable OpenMP. |
PSC_OMP_SERIAL_OUTLINE | Set to 1 to localize private variables in the single-thread case. |
OMP_SLAVE_STACK_SIZE | Set the number of bytes (e.g. 2m, 4m) allocated for each OpenMP thread to use as its private stack. |
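For trying out the variables above, a small OpenMP program is handy. The following is a minimal sketch with hypothetical function names; it uses only the standard parallel for and reduction constructs, and guards the pragma so the file also compiles cleanly when OpenMP is not enabled.

```c
/* Sums the integers 1..n. When compiled with -openmp (or -mp) the loop
   iterations are divided among threads and the per-thread partial sums
   are combined by the reduction clause; otherwise the loop runs
   serially. The result is the same in both cases. */
long sum_to(long n)
{
    long total = 0;
    long i;
#ifdef _OPENMP
#pragma omp parallel for reduction(+:total)
#endif
    for (i = 1; i <= n; i++) {
        total += i;
    }
    return total;
}

/* Returns 1 if this file was compiled with OpenMP enabled, 0 otherwise.
   _OPENMP is defined by conforming compilers when OpenMP is switched on. */
int openmp_enabled(void)
{
#ifdef _OPENMP
    return 1;
#else
    return 0;
#endif
}
```

A program built around sum_to could then be launched with, say, O64_OMP_VERBOSE or PSC_OMP_DISABLED set in the environment to observe the effect; the exact output depends on the runtime library in use.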
Built-in macros
One can use opencc -E -dM - < /dev/null to see all built-in macros and their values.
For a comprehensive list of pre-defined C/C++ compiler macros across all platforms, see here
__cplusplus | Is defined if the C++ compiler is in use. This macro is defined by the ISO C++ standard. | |
__FILE__ __BASE_FILE__ | Name of the current input file (as a C string constant). __FILE__ is an ANSI C standard macro. | |
__LINE__ | Current input line number (as an integer constant). This is an ANSI C standard macro. | |
__FUNCTION__ __func__ | If inside a function, the current function name (as a C string). __func__ is a predefined identifier from the C99 standard; __FUNCTION__ is a GNU extension with the same meaning. | |
__DATE__ __TIME__ | Date and time at which the preprocessor is run (as C string constants). These are ANSI C standard macros. | |
__TIMESTAMP__ | Last modification time of the input file (as a C string constant) | |
__STDC__ | Evaluates to 1 to indicate that the compiler conforms to the ISO C standard. | |
__GNUC__ __GNUC_MINOR__ __GNUC_PATCHLEVEL__ |
Evaluate to integer constants representing the GNU (C/C++/Fortran) compiler version numbers (major/minor/patch level). | |
__OPENCC__ __OPENCC_MINOR__ __OPENCC_PATCHLEVEL__ __PATHCC__ __PATHCC_MINOR__ __PATHCC_PATCHLEVEL__ |
Evaluate to integer constants representing the Open64/PathScale compiler version numbers (major/minor/patch level). | |
__VERSION__ | Evaluate to a C string constant representing the GNU (C/C++/Fortran) compiler version, e.g. 4.1.2 20080704 (Red Hat 4.1.2-48). | |
__OPEN64__ __PATHSCALE__ |
Evaluate to a C string constant representing the Open64/PathScale compiler version, e.g. 4.2.4. | |
__SSE__ __SSE2__ __SSE3__ __SSSE3__ |
Defined when the target processor supports the corresponding SSE/SSE2... instructions. | |
__OPTIMIZE__ __OPTIMIZE_SIZE__ |
Is defined if any optimization is used. Furthermore, __OPTIMIZE_SIZE__ is defined if the optimization is for size, not speed. |
|
__COUNTER__ | It expands to sequential integral values starting from 0. In conjunction with the ## operator, this provides a convenient means to generate unique identifiers. |
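The sketch below pulls a few of these macros together. All function and macro names other than the built-ins are hypothetical. It shows __COUNTER__ combined with ## to generate unique typedef names for a compile-time check, __func__ for the enclosing function name, and __OPTIMIZE__ for detecting an optimized build.

```c
/* Two-level paste so that __COUNTER__ is expanded before ## is applied. */
#define PASTE2(a, b) a##b
#define PASTE(a, b) PASTE2(a, b)
#define UNIQUE_NAME(prefix) PASTE(prefix, __COUNTER__)

/* Compile-time check: declares an array type of size -1 (an error) when
   cond is false. UNIQUE_NAME gives every check its own typedef name, so
   several checks can coexist in one file. */
#define STATIC_CHECK(cond) \
    typedef char UNIQUE_NAME(static_check_)[(cond) ? 1 : -1]

STATIC_CHECK(sizeof(char) == 1);
STATIC_CHECK(sizeof(long) >= sizeof(int));

/* Successive expansions of __COUNTER__ differ by one. */
int counter_step(void)
{
    int first = __COUNTER__;
    int second = __COUNTER__;
    return second - first;
}

/* Returns the name of the enclosing function via __func__ (C99). */
const char *current_function(void)
{
    return __func__;
}

/* Returns 1 if an optimization option defined __OPTIMIZE__, else 0. */
int built_with_optimization(void)
{
#ifdef __OPTIMIZE__
    return 1;
#else
    return 0;
#endif
}
```

Comparing built_with_optimization() between an -O0 build and an -O2 build is a simple way to confirm which options actually reached the compiler.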