Help:MPI
From MMAE
MPI stands for Message Passing Interface.
MPI is not just a piece of software: it is an application programming interface, a protocol, and a library. MPI has been implemented for many operating systems and parallel architectures by many different vendors, so more than one version is likely to be available on any given cluster.
Software availability
NOTE: You must choose a version of MPI compatible with the compiler you are using, and you must use the correct version of mpirun to match.
- Roman cluster
- mpich version 1 or 2 is on most cluster nodes
- lam mpi is on all upgraded nodes
- MMAE student cluster
- Sun MPI was at one time installed on the Sun machines, but is currently not set up. Ask if you wish to use it.
- i2 / Euler cluster
- mpich version 1 is installed, but does not seem to work between nodes
- lam mpi is installed and compiled to use the Intel Fortran compiler. NOTE: to use it, you must add /opt/lam/bin to the START of your path, and you must do it near the top of your .bashrc, before the conditional for interactive shells
- i2 / deli cluster (Hilbert)
- lam mpi, mpich, lam mpi for Intel compilers
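For the Euler cluster note above, the path change could look roughly like the following ~/.bashrc excerpt. This is a sketch, not the cluster's actual configuration: only /opt/lam/bin comes from the text above, and the LAM_BIN variable name is made up for illustration. The likely reason for placing it before the interactive-shell test is that the shells started on remote nodes are non-interactive and would otherwise skip it.

```shell
# Hypothetical ~/.bashrc excerpt. Place it near the top of the file,
# before any test that returns early for non-interactive shells.
LAM_BIN=/opt/lam/bin    # install location given in the list above

case "$PATH" in
    "$LAM_BIN"|"$LAM_BIN":*)
        ;;                              # already first on the path; nothing to do
    *)
        PATH="$LAM_BIN:$PATH"           # prepend so LAM's mpirun is found first
        ;;
esac
export PATH
```

After logging in again, `which mpirun` should report a path under /opt/lam/bin if the change took effect.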
MPI web links
- wikipedia:Message Passing Interface
- Notes On Using Mpi (Fortran)
- MPI Forum standards group
- mcs.anl.gov
- MPI home
- MPICH home, a popular MPI implementation
- tutorials
- Quick overview of SEND
- http://www10.informatik.uni-erlangen.de/~mohr/MPI/
- MPI tutorials
- Parallel Programming with MPI
- LLNL mpi tutorials
Cluster use
Please be considerate of other users. Check the system load with the uptime command or look at ganglia on the head node's web server and check the load on the machines you are using. Ask if you are having trouble finding this.
using MPICH
The cluster machines running RedHat 9 have mpich installed in /usr/local/mpich-1.2.5/bin, which you can add to your path in your .bashrc.
Please consult the man pages for complete information.
Some of the mpich commands available are:
mpicc mpiCC mpif77 mpif90 mpiman mpireconfig mpirun
Compile
To compile a single source file, use, for example:
- mpicc filename.c
- mpif77 filename.f
- mpif90 filename.f90
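To try the compile step end to end, a minimal test program can help. The sketch below writes a small C source file and compiles it with mpicc if that wrapper is on your path. The file name hello_mpi.c and the -o output name are illustrative choices, not something this page prescribes; without -o the executable would be called a.out, as in the run examples further down.

```shell
# Write a minimal MPI test program (hello_mpi.c is a hypothetical file name).
cat > hello_mpi.c <<'EOF'
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);                 /* start MPI */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's id */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes */
    printf("hello from rank %d of %d\n", rank, size);
    MPI_Finalize();                         /* shut MPI down cleanly */
    return 0;
}
EOF

# Compile only if an MPI compiler wrapper is available on this machine.
if command -v mpicc >/dev/null 2>&1; then
    mpicc -o hello_mpi hello_mpi.c
else
    echo "mpicc not found; add your MPI bin directory to your path" >&2
fi
```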
sample Makefile
running MPI programs
Before running your code on multiple nodes, you must set up a host file and copy the executable and data files to each node. (rsh, rcp, and rsync may be useful for this.)
Assuming you have a hosts file listing machines you have access to,
- mpirun -machinefile ~/hosts.fluent -np 5 a.out
This would run your code on the first 5 cpus listed in the hosts.fluent file in your home directory.
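As an illustration of the host file used above, the sketch below creates one with five entries, one machine per line, and prints the matching mpirun command. The host names node01 to node05 and the file name hosts.example are placeholders; substitute machines you actually have access to.

```shell
# Hypothetical host names; replace them with real cluster nodes.
HOSTS_FILE=./hosts.example
cat > "$HOSTS_FILE" <<'EOF'
node01
node02
node03
node04
node05
EOF

# Use one process per listed host.
NP=$(grep -c . "$HOSTS_FILE")

# Print the command rather than running it: the executable must first
# be copied to every node, as described above.
echo "mpirun -machinefile $HOSTS_FILE -np $NP a.out"
```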
using LAM mpi
The cluster machines running newer versions of RedHat, Fedora Core, and CentOS have LAM mpi installed, and should use the same commands as above with some additions.
Some LAM mpi commands are:
hcc hcp hf77 laminfo lamnodes lamshrink lamtrace mpiCC mpic++ mpicc mpiexec mpif77 mpimsg mpirun mpitask tkill tping
To start LAM mpi, type:
lamboot hostfile
Where hostfile lists the hosts to use. To use more than one cpu on a host, either list the host multiple times or include cpu=2 after its hostname. If ssh asks for a password AND rsh works on the cluster, you can instead tell LAM to use rsh:
LAMRSH=rsh lamboot hostfile
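The cpu=2 form of the host file mentioned above might look like this. The host names and the file name lamhosts.example are placeholders, and the awk line is only a convenience for checking how many cpus the file describes before booting.

```shell
# Hypothetical host names; cpu=2 marks each host as having two cpus.
LAM_HOSTFILE=./lamhosts.example
cat > "$LAM_HOSTFILE" <<'EOF'
node01 cpu=2
node02 cpu=2
EOF

# Count the cpus described by the file (a host without cpu=N counts as 1).
TOTAL_CPUS=$(awk 'NF { n = 1
    for (i = 2; i <= NF; i++) if ($i ~ /^cpu=[0-9]+$/) n = substr($i, 5)
    total += n } END { print total + 0 }' "$LAM_HOSTFILE")
echo "hostfile describes $TOTAL_CPUS cpus"
```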
Once lamboot has successfully run, to run a.out, try any of these:
- mpirun n0-3 a.out
- run on 4 nodes, 0 to 3
- mpirun -np 4 a.out
- run on first 4 nodes
- mpirun -np 8 n1-4 a.out
- run 8 processes on 4 nodes (1 to 4)
To check which nodes mpi is correctly running on, type
lamnodes
To clean up anything you might have left running, without shutting down your lam subcluster, type
lamclean
To stop LAM mpi, type
lamhalt
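Putting the commands above together, a whole LAM session can be wrapped in a small shell function. This is only a sketch: run_lam_job is a made-up helper name, not a LAM command, and it assumes the host file and executable are already in place on the nodes.

```shell
# Sketch of a complete LAM session: boot, verify, run, shut down.
# run_lam_job is a hypothetical helper, not part of LAM itself.
run_lam_job() {
    lam_hostfile=$1
    shift
    if ! command -v lamboot >/dev/null 2>&1; then
        echo "lamboot not found; is the LAM bin directory on your path?" >&2
        return 127
    fi
    lamboot "$lam_hostfile" || return 1   # start LAM on the listed hosts
    lamnodes                              # show which nodes actually booted
    mpirun "$@"                           # run the job with the given arguments
    lam_status=$?
    lamhalt                               # shut LAM down even if the job failed
    return "$lam_status"
}
```

For example, `run_lam_job hostfile -np 4 a.out` would boot LAM on the hosts in hostfile, run four processes, and halt LAM afterwards.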
SPECIAL NOTE: The first node in the hosts file will be the master node, not the one you start from!
LAM documentation is available at http://www.lam-mpi.org/. Some reference documentation is in the following man pages:
- lam
- overview and introduction to LAM
- introu
- lam user commands
