Intel® MPI Library Developer Reference for Linux* OS
This section describes the global options of the Intel® MPI Library's Hydra process manager. Global options are applied to all argument sets in the launch command. Argument sets are separated by a colon ':'.
-usize <usize>
Use this option to set MPI_UNIVERSE_SIZE, which is available as an attribute of MPI_COMM_WORLD.
| <size> | Define the universe size |
|---|---|
| SYSTEM | Set the size equal to the number of cores passed to mpiexec through the hostfile or the resource manager. |
| INFINITE | Do not limit the size. This is the default value. |
| <value> | Set the size to a numeric value ≥ 0. |
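For example, the following illustrative command sets the universe size to the number of cores reported by the resource manager (the application name is a placeholder):
$ mpiexec.hydra -usize SYSTEM -n 2 ./a.out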
-hostfile <hostfile> or -f <hostfile>
Use this option to specify host names on which to run the application. If a host name is repeated, this name is used only once.
See also the I_MPI_HYDRA_HOST_FILE environment variable for more details.
Use the -perhost, -ppn, -grr, and -rr options to change the process placement on the cluster nodes.
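For example, assuming a hypothetical host file that lists two nodes named node1 and node2, a launch might look like this:
$ cat ./hosts
node1
node2
$ mpiexec.hydra -f ./hosts -n 4 ./a.out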
-machinefile <machine file> or -machine <machine file>
Use this option to control process placement through a machine file. To define the total number of processes to start, use the -n option. To pin processes within a machine, use the option binding=map in the machine file. For example:
$ cat ./machinefile
node0:2 binding=map=0,3
node1:2 binding=map=[2,8]
node0:1 binding=map=8
For details, see the -binding option description.
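With the machine file shown above, the corresponding launch command might look like this (the executable name is a placeholder):
$ mpiexec.hydra -machinefile ./machinefile -n 5 ./a.out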
-genv <ENVVAR> <value>
Use this option to set the <ENVVAR> environment variable to the specified <value> for all MPI processes.
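For example, the following illustrative command sets OMP_NUM_THREADS to 4 for all MPI processes (the application name is a placeholder):
$ mpiexec.hydra -genv OMP_NUM_THREADS 4 -n 2 ./a.out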
-genvall
Use this option to enable propagation of all environment variables to all MPI processes.
-genvnone
Use this option to suppress propagation of any environment variables to any MPI processes.
-genvexcl <list of env var names>
Use this option to suppress propagation of the listed environment variables to any MPI processes.
-genvlist <list of env var names>
Use this option to pass a list of environment variables with their current values. <list> is a comma-separated list of environment variables to be sent to all MPI processes.
-pmi-connect <mode>
Use this option to choose the caching mode of process management interface (PMI) messages. Possible values for <mode> are:
| <mode> | The caching mode to be used |
|---|---|
| nocache | Do not cache PMI messages. |
| cache | Cache PMI messages on the local pmi_proxy management processes to minimize the number of PMI requests. Cached information is automatically propagated to child management processes. |
| lazy-cache | The cache mode with on-request propagation of the PMI information. |
| alltoall | Information is automatically exchanged between all pmi_proxy management processes before any get request can be done. This is the default mode. |
See the I_MPI_HYDRA_PMI_CONNECT environment variable for more details.
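For example, the following illustrative command selects the lazy-cache mode (the application name is a placeholder):
$ mpiexec.hydra -pmi-connect lazy-cache -n 4 ./a.out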
-perhost <# of processes>, -ppn <# of processes>, or -grr <# of processes>
Use this option to place the specified number of consecutive MPI processes on every host in the group using round-robin scheduling. See the I_MPI_PERHOST environment variable for more details.
When running under a job scheduler, these options are ignored by default. To control process placement with these options, disable the I_MPI_JOB_RESPECT_PROCESS_PLACEMENT variable.
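For example, assuming a hypothetical two-node host file ./hosts, the following illustrative command places two consecutive ranks on each node:
$ mpiexec.hydra -f ./hosts -n 4 -ppn 2 ./a.out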
-rr
Use this option to place consecutive MPI processes on different hosts using round-robin scheduling. This option is equivalent to "-perhost 1". See the I_MPI_PERHOST environment variable for more details.
-trace [<profiling_library>] or -t [<profiling_library>]
Use this option to profile your MPI application with Intel® Trace Collector using the indicated <profiling_library>. If you do not specify <profiling_library>, the default profiling library libVT.so is used.
Set the I_MPI_JOB_TRACE_LIBS environment variable to override the default profiling library.
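For example, the following illustrative command traces a two-process run with the default profiling library (the application name is a placeholder):
$ mpiexec.hydra -trace -n 2 ./a.out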
Use this option to profile your MPI application with Intel® Trace Collector using the libVTim.so library.
-aps
Use this option to collect statistics from your MPI application using Application Performance Snapshot. The collected data includes hardware performance metrics, memory consumption data, internal MPI imbalance and OpenMP* imbalance statistics. When you use this option, a new folder aps_result_<date>-<time> with statistics data is generated. You can analyze the collected data with the aps utility, for example:
$ mpirun -aps -n 2 ./myApp
$ aps aps_result_20171231_235959
If you use the options -trace or -check_mpi, the -aps option is ignored.
-mps
Use this option to collect only MPI and OpenMP* statistics from your MPI application using Application Performance Snapshot. Unlike the -aps option, -mps doesn't collect hardware metrics. The option is equivalent to:
$ mpirun -n 2 aps -c mpi,omp ./myapp
-trace-pt2pt
Use this option to collect information about point-to-point operations using Intel® Trace Analyzer and Collector. This option requires that you also use the -trace option.
-trace-collectives
Use this option to collect information about collective operations using Intel® Trace Analyzer and Collector. This option requires that you also use the -trace option.
Use the -trace-pt2pt and -trace-collectives options to reduce the size of the resulting trace file or the number of message checker reports. These options work with both statically and dynamically linked applications.
-configfile <filename>
Use this option to specify the file <filename> that contains the command-line options. Blank lines and lines that start with '#' as the first character are ignored.
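As a minimal sketch (the file name, its contents, and the executable are placeholders, and the exact layout of the file depends on your launch), a configuration file might look like this:
$ cat ./mpiexec.conf
# illustrative set of command-line options
-n 2 -ppn 1 -f ./hosts ./a.out
$ mpiexec.hydra -configfile ./mpiexec.conf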
-branch-count <num>
Use this option to restrict the number of child management processes launched by the Hydra process manager, or by each pmi_proxy management process.
See the I_MPI_HYDRA_BRANCH_COUNT environment variable for more details.
-pmi-aggregate or -pmi-noaggregate
Use this option to switch on or off, respectively, the aggregation of PMI requests. The default value is -pmi-aggregate, which means that aggregation is enabled.
See the I_MPI_HYDRA_PMI_AGGREGATE environment variable for more details.
-gdb
Use this option to run an executable under the GNU* debugger. You can use the following command:
$ mpiexec.hydra -gdb -n <# of processes> <executable>
-gdba <pid>
Use this option to attach the GNU* debugger to an existing MPI job. You can use the following command:
$ mpiexec.hydra -gdba <pid>
-nolocal
Use this option to avoid running the <executable> on the host where mpiexec.hydra is launched. You can use this option on clusters that deploy a dedicated master node for starting the MPI jobs and a set of dedicated compute nodes for running the actual MPI processes.
-hosts <nodelist>
Use this option to specify a particular <nodelist> on which the MPI processes should be run. For example, the following command runs the executable a.out on the hosts host1 and host2:
$ mpiexec.hydra -n 2 -ppn 1 -hosts host1,host2 ./a.out
If <nodelist> contains only one node, this option is interpreted as a local option. See Local Options for details.
-iface <interface>
Use this option to choose the appropriate network interface. For example, if the IP emulation of your InfiniBand* network is configured to ib0, you can use the following command:
$ mpiexec.hydra -n 2 -iface ib0 ./a.out
See the I_MPI_HYDRA_IFACE environment variable for more details.
-demux <mode>
Use this option to set the polling mode for multiple I/O. The default value is poll.
Arguments
| <spec> | Define the polling mode for multiple I/O |
|---|---|
| poll | Set poll as the polling mode. This is the default value. |
| select | Set select as the polling mode. |
See the I_MPI_HYDRA_DEMUX environment variable for more details.
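For example, the following illustrative command switches the process manager to the select polling mode (the application name is a placeholder):
$ mpiexec.hydra -demux select -n 2 ./a.out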
-enable-x or -disable-x
Use this option to control Xlib* traffic forwarding. The default value is -disable-x, which means the Xlib traffic is not forwarded.
Use this option to insert the MPI process rank at the beginning of all lines written to the standard output.
Use this option to preload the ILP64 interface.
-s <spec>
Use this option to direct standard input to the specified MPI processes.
Arguments
| <spec> | Define MPI process ranks |
|---|---|
| all | Use all processes. |
| <l>,<m>,<n> | Specify an exact list and use processes <l>, <m>, and <n> only. The default value is zero. |
| <k>,<l>-<m>,<n> | Specify a range and use processes <k>, <l> through <m>, and <n>. |
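For example, the following illustrative command directs standard input only to ranks 0 and 1 (the input file and executable names are placeholders):
$ mpiexec.hydra -s 0,1 -n 4 ./a.out < ./input.txt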
Use this option to disable processing of the mpiexec.hydra configuration files.
Use this option to avoid intermingling of data output from the MPI processes. This option affects both the standard output and the standard error streams.
When using this option, end the last output line of each process with the end-of-line '\n' character. Otherwise the application may stop responding.
Use this option to specify the path to the executable file.
-tmpdir <dir>
Use this option to set a directory for temporary files. See the I_MPI_TMPDIR environment variable for more details.
Use this option to display the version of the Intel® MPI Library.
Use this option to display build information of the Intel® MPI Library. When this option is used, the other command line arguments are ignored.
Use this option to explicitly specify the local host name for the launching node.
-rmk <RMK>
Use this option to select the resource management kernel to be used. The Intel® MPI Library only supports pbs.
See the I_MPI_HYDRA_RMK environment variable for more details.
-outfile-pattern <file>
Use this option to redirect stdout to the specified file.
-errfile-pattern <file>
Use this option to redirect stderr to the specified file.
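For example, the following illustrative command writes each stream to its own file (file and executable names are placeholders):
$ mpiexec.hydra -outfile-pattern ./out.txt -errfile-pattern ./err.txt -n 2 ./a.out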
Use this option to specify the path to the executable file.
Use this option to specify the working directory in which the executable file runs.
Use this option to perform the "umask <umask>" command for the remote executable file.
-gdb-ia
Use this option to run processes under the Intel® architecture-specific GNU* debugger.
Use this option to specify the pattern that is prepended to the process output.
Use this option to print debug information from mpiexec.hydra, such as:
- Service processes arguments
- Environment variables and arguments passed to start an application
- PMI requests/responses during a job life cycle
See the I_MPI_HYDRA_DEBUG environment variable for more details.
Use this option to print out the MPI rank mapping.
Use this option to print the exit codes of all processes.
-bootstrap <bootstrap server>
Use this option to select a built-in bootstrap server to use. A bootstrap server is the basic remote node access mechanism that is provided by the system. Hydra supports multiple runtime bootstrap servers such as ssh, rsh, pdsh, fork, persist, slurm, ll, lsf, or sge to launch MPI processes. The default bootstrap server is ssh. By selecting slurm, ll, lsf, or sge, you use the corresponding srun, llspawn.stdio, blaunch, or qrsh internal job scheduler utility to launch service processes under the respective job scheduler (SLURM*, LoadLeveler*, LSF*, or SGE*).
Arguments
| <arg> | String parameter |
|---|---|
| ssh | Use secure shell. This is the default value. |
| rsh | Use remote shell. |
| pdsh | Use parallel distributed shell. |
| pbsdsh | Use the Torque* and PBS* pbsdsh command. |
| slurm | Use the SLURM* srun command. |
| ll | Use the LoadLeveler* llspawn.stdio command. |
| lsf | Use the LSF* blaunch command. |
| sge | Use the Univa* Grid Engine* qrsh command. |
See I_MPI_HYDRA_BOOTSTRAP for details.
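For example, when running inside a SLURM* allocation, the following illustrative command launches the service processes through srun (the application name is a placeholder):
$ mpiexec.hydra -bootstrap slurm -n 2 ./a.out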
-bootstrap-exec <bootstrap server>
Use this option to set the executable to be used as a bootstrap server. The default bootstrap server is ssh. For example:
$ mpiexec.hydra -bootstrap-exec <bootstrap_server_executable> -f hosts -env <VAR1> <VAL1> -n 2 ./a.out
See I_MPI_HYDRA_BOOTSTRAP for more details.
-bootstrap-exec-args <args>
Use this option to provide additional parameters to the bootstrap server executable file.
$ mpiexec.hydra -bootstrap-exec-args <arguments> -n 2 ./a.out
For tight integration with the SLURM* scheduler (including support for suspend/resume), use the method outlined on the SLURM page here: http://www.schedmd.com/slurmdocs/mpi_guide.html#intel_mpi
See I_MPI_HYDRA_BOOTSTRAP_EXEC_EXTRA_ARGS for more details.
-binding
Use this option to pin or bind MPI processes to a particular processor and avoid undesired process migration. In the following syntax, the quotes may be omitted for a one-member list. Each parameter corresponds to a single pinning property. See the example after the parameter tables below.
This option is related to the family of I_MPI_PIN environment variables, which have higher priority than the -binding option. Hence, if any of these variables are set, the option is ignored.
This option is supported on both Intel® and non-Intel microprocessors, but it may perform additional optimizations for Intel microprocessors that it does not perform for non-Intel microprocessors.
Syntax
-binding "<parameter>=<value>[;<parameter>=<value> ...]"
Parameters
| Parameter | |
|---|---|
| pin | Pinning switch |

| Values | |
|---|---|
| enable \| yes \| on \| 1 | Turn on the pinning property. This is the default value. |
| disable \| no \| off \| 0 | Turn off the pinning property. |
| Parameter | |
|---|---|
| cell | Pinning resolution |

| Values | |
|---|---|
| unit | Basic processor unit (logical CPU) |
| core | Processor core in a multi-core system |
| Parameter | |
|---|---|
| map | Process mapping |

| Values | |
|---|---|
| spread | The processes are mapped consecutively to separate processor cells. Thus, the processes do not share the common resources of the adjacent cells. |
| scatter | The processes are mapped to separate processor cells. Adjacent processes are mapped upon the cells that are the most remote in the multi-core topology. |
| bunch | The processes are mapped to separate processor cells by #processes/#sockets processes per socket. Each socket processor portion is a set of the cells that are the closest in the multi-core topology. |
| p0,p1,...,pn | The processes are mapped upon the separate processors according to the processor specification in the p0,p1,...,pn list: the i-th process is mapped upon the processor pi. |
| [m0,m1,...,mn] | The i-th process is mapped upon the processor subset defined by the mi hexadecimal mask using the following rule: the j-th processor is included in the subset mi if the j-th bit of mi equals 1. |
| Parameter | |
|---|---|
| domain | Processor domain set on a node |

| Values | |
|---|---|
| cell | Each domain of the set is a single processor cell (unit or core). |
| core | Each domain of the set consists of the processor cells that share a particular core. |
| cache1 | Each domain of the set consists of the processor cells that share a particular level 1 cache. |
| cache2 | Each domain of the set consists of the processor cells that share a particular level 2 cache. |
| cache3 | Each domain of the set consists of the processor cells that share a particular level 3 cache. |
| cache | The set whose elements are the largest domains among cache1, cache2, and cache3. |
| socket | Each domain of the set consists of the processor cells that are located on a particular socket. |
| node | All processor cells on a node are arranged into a single domain. |
| <size>[:<layout>] | Each domain of the set consists of <size> processor cells. Note: the domain size is limited by the number of processor cores on the node. Each member location inside the domain is defined by the optional <layout> parameter value. If the <layout> parameter is omitted, compact is assumed as the value of <layout>. |
| Parameter | |
|---|---|
| order | Linear ordering of the domains |

| Values | |
|---|---|
| compact | Order the domain set so that adjacent domains are the closest in the multi-core topology. |
| scatter | Order the domain set so that adjacent domains are the most remote in the multi-core topology. |
| range | Order the domain set according to the BIOS processor numbering. |
| Parameter | |
|---|---|
| offset | Domain list offset |

| Values | |
|---|---|
| <n> | Integer number of the starting domain among the linearly ordered domains. This domain gets number zero. The numbers of other domains are cyclically shifted. |
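For example, the following illustrative command pins each of four processes to its own processor core, mapping adjacent ranks onto cores that are the most remote in the multi-core topology (the executable name is a placeholder):
$ mpiexec.hydra -binding "pin=yes;cell=core;map=scatter" -n 4 ./a.out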