Compilers: IBM XL C/C++ Enterprise Edition Version 8.0 for AIX
Compilers: IBM XL Fortran Enterprise Edition Version 10.1 for AIX
Compilers: IBM XL C/C++ Enterprise Edition Version 9.0 for AIX
Compilers: IBM XL Fortran Enterprise Edition Version 11.1 for AIX
OS: IBM AIX 5L V5.3
Last updated: 18-May-2007
Performs optimizations for maximum performance, including maximum interprocedural analysis on all of the objects presented on the "link" step. This level of optimization increases the compiler's memory usage and compile time. -O5 provides all of the functionality of the -O4 option, plus the functionality of the -qipa=level=2 option.
-O5 is equivalent to the following flags
Performs optimizations for maximum performance. This includes interprocedural analysis on all of the objects presented on the "link" step.
-O4 is equivalent to the following flags
-O3 Performs additional optimizations that are memory intensive, compile-time intensive, and may change the semantics of the program slightly, unless -qstrict is specified. We recommend these optimizations when the desire for run-time speed improvements outweighs the concern for limiting compile-time resources.
-O2 is equivalent to the following flags
Produces object code containing instructions that will run on the specified processors. "auto" selects the processor on which the compile is being done. "pwr5x" is the POWER5+ processor.
Supported values for this flag are
Specifies the architecture system for which the executable program is optimized. This includes instruction scheduling and cache setting. The supported values for suboption are:
This option inlines glue code that optimizes external function calls when compiling.
Performs high-order transformations on loops during optimization.
Enhances optimization by doing detailed analysis across procedures (interprocedural analysis or IPA). The level determines the amount of interprocedural analysis and optimization that is performed.
level=0 Does only minimal interprocedural analysis and optimization.
level=1 Turns on inlining, limited alias analysis, and limited call-site tailoring.
level=2 Turns on full interprocedural data flow and alias analysis.
The option used in the first pass of a profile directed feedback (PDF) compile that causes PDF information to be generated. The profile directed feedback optimization gathers data on both execution path and data values. It does not use hardware counters, nor does it gather any data other than path and data values for PDF-specific optimizations.
The option used in the second pass of a profile directed feedback compile that causes PDF information to be utilized during optimization.
-qxlf90=suboption
Determines whether the compiler provides the Fortran 90 or the Fortran 95 level of support for certain aspects of the language. The suboption can be one of the following:
  signedzero | nosignedzero
    Determines how the SIGN(A,B) function handles signed real 0.0. In addition, determines whether negative internal values will be prefixed with a minus when formatted output would produce a negative sign zero.
  autodealloc | noautodealloc
    Determines whether the compiler deallocates allocatable arrays that are declared locally without either the SAVE or the STATIC attribute and have a status of currently allocated when the subprogram terminates.
  oldpad | nooldpad
    When the PAD= specifier is present in the INQUIRE statement, specifying -qxlf90=nooldpad returns UNDEFINED when there is no connection, or when the connection is for unformatted I/O. This behavior conforms with the Fortran 95 standard and above. Specifying -qxlf90=oldpad preserves the Fortran 90 behavior.
Default:
  o signedzero, autodealloc and nooldpad for the xlf95, xlf95_r, xlf95_r7 and f95 invocation commands.
  o nosignedzero, noautodealloc and oldpad for all other invocation commands.
Generates 64-bit ABI binaries. The default is to generate 32-bit ABI binaries.
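As a rough illustration of what changes between the two ABIs, the sketch below reports the pointer and long widths of the current build. The function names are hypothetical, and the widths mentioned in the comments assume the usual ILP32 and LP64 data models rather than anything stated in this description.

```cpp
#include <climits>
#include <cstddef>

// Width of a data pointer in bits. Assumption: 32 under an ILP32 (32-bit ABI)
// build and 64 under an LP64 (64-bit ABI) build.
std::size_t pointer_bits() { return sizeof(void*) * CHAR_BIT; }

// Width of long in bits. Under ILP32 this is typically 32; under LP64 it is 64.
std::size_t long_bits() { return sizeof(long) * CHAR_BIT; }

// True when a pointer can be stored in a long without truncation.
// Code that relies on this may behave differently when the ABI changes.
bool pointer_fits_in_long() { return sizeof(long) >= sizeof(void*); }
```

Calling these helpers from a small test program built with and without the 64-bit option is one way to confirm which ABI a given build line actually produced.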
Sets the bit in the file's XCOFF header indicating that this executable will request the use of large pages when they are available on the system and when the user has the appropriate privilege.
Indicates that a program, designed to execute in a large page memory environment, can take advantage of large 16 MB pages provided on POWER4 and higher based systems.
Indicates that the compiler understands how to do alloca().
Causes the Fortran compiler to allocate dynamic arrays on the heap instead of the stack.
Enables the generation of vector instructions for processors that support them.
Specifies whether to use volatile or non-volatile vector registers. Volatile vector registers are registers whose value is not preserved across function calls so the compiler will not depend on values in them across function calls.
The __IBM_FAST_VECTOR macro defines a different iterator for the std::vector template class. This iterator results in faster code, but is not compatible with code using the default iterator for a std::vector template class. All uses of std::vector for a data type must use the same iterator. Add -D__IBM_FAST_VECTOR to the compile line, or "#define __IBM_FAST_VECTOR 1" to your source code to use the faster iterator for std::vector template class. You must compile all sources with this macro.
Causes AIX to define "ischar()" (and friends) as macros rather than subroutines.
Causes the C++ compiler to generate Run-Time Type Identification (RTTI) code.
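The language features that depend on this information are dynamic_cast and typeid. A minimal sketch follows; the Shape, Circle, and Square types and both helper functions are hypothetical names used only for illustration.

```cpp
#include <typeinfo>

// A polymorphic base class: RTTI applies to types with at least one virtual function.
struct Shape { virtual ~Shape() {} };
struct Circle : Shape {};
struct Square : Shape {};

// dynamic_cast consults run-time type information to check the object's real type.
bool is_circle(const Shape& s) {
    return dynamic_cast<const Circle*>(&s) != 0;
}

// typeid on a polymorphic reference yields the dynamic type, not the static one.
bool same_dynamic_type(const Shape& a, const Shape& b) {
    return typeid(a) == typeid(b);
}
```

Code that uses either feature needs RTTI generation enabled in every translation unit that defines or inspects the polymorphic types.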
Specifies what aggregate alignment rules the compiler uses for file compilation, where the alignment options are:
  bit_packed  The compiler uses the bit_packed alignment rules.
  full        The compiler uses the RISC System/6000 alignment rules. This is the same as power.
  mac68k      The compiler uses the Macintosh alignment rules. This suboption is valid only for 32-bit compilations.
  natural     The compiler maps structure members to their natural boundaries.
  packed      The compiler uses the packed alignment rules.
  power       The compiler uses the RISC System/6000 alignment rules.
  twobyte     The compiler uses the Macintosh alignment rules. This suboption is valid only for 32-bit compilations. The mac68k option is the same as twobyte.
The default is -qalign=full.
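To see what an alignment rule changes in practice, the sketch below compares the layout of the same two-member structure under the build's default rules and under one-byte packing. It uses #pragma pack, an extension accepted by several common compilers, as a per-structure stand-in for a compiler-wide alignment option; that substitution, and all type and function names, are assumptions made for illustration.

```cpp
#include <cstddef>

// Laid out with whatever alignment rules the build uses by default.
struct DefaultRecord {
    char   tag;
    double value;
};

// Laid out with one-byte packing: no padding is inserted between members.
// Assumption: the compiler accepts the push/pop form of #pragma pack.
#pragma pack(push, 1)
struct PackedRecord {
    char   tag;
    double value;
};
#pragma pack(pop)

// Byte offset of 'value' under each rule.
std::size_t default_value_offset() { return offsetof(DefaultRecord, value); }
std::size_t packed_value_offset()  { return offsetof(PackedRecord, value); }
```

Because the offsets and total sizes differ between the two layouts, objects that exchange such structures need to be compiled with the same alignment setting.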
Causes the compiler to treat "char" variables as signed instead of the default of unsigned.
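The difference shows up when a char holding a value above 127 is widened to int. The following sketch, with hypothetical helper names and an assumed 8-bit char, reports which behavior the current build has.

```cpp
#include <limits>

// True when plain char is a signed type in this build.
bool plain_char_is_signed() { return std::numeric_limits<char>::is_signed; }

// Stores the bit pattern 0xFF in a plain char and widens it to int.
// Assuming 8-bit chars, the result is -1 when char is signed and 255 when unsigned.
int widen_all_ones() {
    char c = static_cast<char>(0xFF);
    return static_cast<int>(c);
}
```

Programs that use plain char values as array indices or compare them against constants above 127 are the ones most likely to depend on this setting.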
Indicates that the input Fortran source program is in fixed form.
Adds an underscore to global entities to match the C compiler ABI.
Causes the system loader to put the heap in its own segment of the size specified. This is only required for 32-bit applications, as their segments are 256 MB. If the last digit of the value is "C", then it also turns off the malloc pool option for that executable.
-qalias=ansi | noansi
  If ansi is specified, type-based aliasing is used during optimization, which restricts the lvalues that can be safely used to access a data object. The default is ansi for the xlc, xlC, and c89 commands. This option has no effect unless you also specify the -O option.
-qalias=std | nostd
  Indicates whether the compilation units contain any non-standard aliasing (see Compiler Reference for more information). If so, specify nostd.
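As an illustration of the lvalue restriction that type-based aliasing imposes, the sketch below reads the bits of a float without accessing it through an lvalue of another type. The function name is hypothetical, and the code assumes a 32-bit IEEE 754 float.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit float");

// Under type-based aliasing the optimizer may assume that a uint32_t lvalue
// never refers to a float object, so an access such as
//     *reinterpret_cast<std::uint32_t*>(&f)
// is not safe. Copying the bytes with memcpy is well defined.
std::uint32_t float_bits(float f) {
    std::uint32_t bits = 0;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

Code that cannot be rewritten this way is the kind of code for which the noansi suboption exists.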
Turns off aggressive optimizations that have the potential to alter the semantics of your program. -qstrict sets -qfloat=nofltint:norsqrt. -qnostrict sets -qfloat=rsqrt. This option is only valid with -O2 or higher optimization levels.
Default:
  o -qnostrict at -O3 or higher.
  o -qstrict otherwise.
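To show why a floating-point transformation can alter a program's results, the sketch below evaluates the same three-term sum in two groupings. Reassociation is used here only as a generic example of a value-changing rewrite; this description does not state which transformations the option controls beyond the -qfloat suboptions listed, and the function names are hypothetical.

```cpp
// Sum evaluated in source order: (a + b) + c.
double sum_left_to_right(double a, double b, double c) {
    return (a + b) + c;
}

// The same sum with the additions regrouped: a + (b + c).
double sum_reassociated(double a, double b, double c) {
    return a + (b + c);
}
```

With IEEE 754 doubles and the inputs 1e30, -1e30, and 1.0, the first grouping yields 1.0 while the second yields 0.0, because 1.0 is lost when it is added to -1e30. An optimizer that is free to regroup the operations can therefore change the answer.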
Accepts almost any C dialect.
Specifies whether to include standard object code in the object files. The noobject suboption can substantially reduce overall compilation time, by not generating object code during the first IPA phase.
The threads suboption allows the IPA optimizer to run portions of the optimization process in parallel threads, which can speed up the compilation process on multi-processor systems. All the available threads, or the number specified by N, are used. N must be a positive integer. Specifying nothreads does not run any parallel threads; this is equivalent to running one serial thread.
Causes the compiler to output a traceback if it abends.
Suppresses the message with the message number specified.
Environment variables set before the run:
The fdpr command (Feedback Directed Program Restructuring) is a performance-tuning utility that may help improve the execution time and the real memory utilization of user-level application programs. The fdpr program optimizes the executable image of a program by collecting information on the behavior of the program while the program is used for some typical workload, and then creating a new version of the program that is optimized for that workload. The new program generated by fdpr typically runs faster and uses less real memory.
  -q, --quiet
    Set quiet output mode, suppressing informational messages.
  -O
    Switch on basic optimizations only.
  -O2
    Switch on less aggressive optimization flags.
  -O3
    Switch on aggressive optimization flags.
  -O4
    Switch on aggressive optimization flags together with aggressive function inlining.
  -A, --align-code
    Align program code according to given
  -bldcg, --build-dcg
    Build a DCG (data connectivity graph) for enhanced data reordering (applicable only with the -RD flag).
  -shci, --selective-hot-code-inline
    Perform selective inlining of functions in order to decrease the total execution counts.
  -sdp, --stride-data-prefetch
    Perform data prefetching within frequently executed loops based on stride analysis, according to an aggressiveness factor between (1,9), where 1 is least aggressive