OpenMPI

Unsorted notes

Build

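--with-openib[=DIR]   (InfiniBand / OpenFabrics verbs support)
--with-libnuma        (libnuma memory-affinity support)
--with-tm             (Torque/PBS TM support)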

# yum install libibverbs libibverbs-devel
$ ./configure --prefix=/opt/openmpi/ --with-openib --enable-mpi-f90 F77=ifort FC=ifort
$ make
# make install

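Running

mpirun --mca btl openib,sm,self
mpirun --mca btl ^tcp
mpirun --mca btl tcp,sm,self
mpirun --mca pls_rsh_agent rsh
Note: ^ (caret) means "do not use". It applies to the whole list and cannot be combined with an include list in the same value.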
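
An MCA parameter passed with --mca on the command line can also be set through an environment variable named OMPI_MCA_<parameter>. The following is a minimal sketch of that, not a recommended setup: the program name ./a.out and the process count 4 are placeholders that do not come from these notes.

```shell
# Same effect as "mpirun --mca btl tcp,sm,self": the variable name is
# OMPI_MCA_ followed by the parameter name.
export OMPI_MCA_btl=tcp,sm,self

# ./a.out and "-np 4" are placeholders (assumptions), not taken from the notes.
if command -v mpirun >/dev/null 2>&1 && [ -x ./a.out ]; then
    mpirun -np 4 ./a.out
else
    echo "mpirun or ./a.out not available; OMPI_MCA_btl=$OMPI_MCA_btl"
fi
```

A value given with --mca on the command line takes precedence over the environment variable, so the variable is convenient as a per-session default.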

Checking

ompi_info
ompi_info --param btl openib
ompi_info --param all all | grep openib
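
To check from a script whether the openib BTL was actually built, the ompi_info output can be filtered. This is a sketch under an assumption: it expects component lines of the form "MCA btl: openib (...)", which is how ompi_info usually lists components; the helper name has_btl is made up for this example.

```shell
# has_btl NAME: reads ompi_info output on stdin and succeeds if a BTL
# component called NAME is listed. (has_btl is a hypothetical helper.)
has_btl() {
    grep -Eq "MCA btl: $1( |\$)"
}

if command -v ompi_info >/dev/null 2>&1; then
    if ompi_info | has_btl openib; then
        echo "openib BTL is available"
    else
        echo "openib BTL not found"
    fi
else
    echo "ompi_info not found in PATH"
fi
```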

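Performance evaluation

Start ib_rdma_bw on the first node, then
run ib_rdma_bw on the second node with the first node's host name as its argument.
The installed test programs can be listed with rpm -ql openib-perftest.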

MPE2(MPI Parallel Environment)

http://www.mcs.anl.gov/research/projects/perfvis/download/index.htm#MPE


$ ./configure --prefix=/usr/local/mpi2 F77=ifort MPI_CC=mpicc MPI_F77=mpif77
$ make
# make install

Configuration files

~/.openmpi/mca-params.conf
~/.openmpi/hostfile
/etc/openmpi-mca-params.conf
/etc/openmpi-default-hostfile



pls_rsh_agent = rsh
btl = openib,sm,self   or   btl = tcp,sm,self

For selecting rsh, "plm_rsh_agent = rsh" is preferable (1.3.x names this framework plm rather than pls).
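
As a sketch of what such a parameter file might look like, the script below writes one into a scratch directory instead of ~/.openmpi, so that an existing file is not overwritten. The scratch location is an assumption of this example; copy the result to ~/.openmpi/mca-params.conf once it looks right.

```shell
# Write a sample mca-params.conf into a temporary directory (assumption:
# a scratch copy is wanted first, not an in-place edit of ~/.openmpi).
demo_dir=$(mktemp -d)
conf="$demo_dir/mca-params.conf"

cat > "$conf" <<'EOF'
# One "name = value" pair per line; lines starting with # are comments.
plm_rsh_agent = rsh
btl = tcp,sm,self
EOF

# Show the effective (non-comment) settings.
grep -v '^#' "$conf"
```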


Example hostfile
alpha.xxx.com slots=4
beta.xxx.com slots=2
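Note: fully qualified host names may be required (unconfirmed). The "cpu" keyword does not seem to work; use "slots".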


In 1.3.x and later, the hostfile (its path) is specified with
the environment variable OMPI_MCA_orte_default_hostfile.

export OMPI_MCA_orte_default_hostfile=$HOME/.openmpi/default-hostfile

Alternatively, add the line

orte_default_hostfile = hostfile

to etc/openmpi-mca-params.conf (under the installation prefix), where hostfile is the path to the hostfile.
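
The hostfile and the environment variable can be tried together as follows. This is only a sketch: it reuses the two example hosts from above, writes the hostfile to a scratch directory (an assumption, to leave ~/.openmpi untouched), and sums the slots= values as a quick sanity check of how many processes the file allows.

```shell
# Create a sample hostfile in a temporary directory.
demo_dir=$(mktemp -d)
hostfile="$demo_dir/default-hostfile"

cat > "$hostfile" <<'EOF'
alpha.xxx.com slots=4
beta.xxx.com slots=2
EOF

# Point Open MPI (1.3.x and later) at this hostfile.
export OMPI_MCA_orte_default_hostfile="$hostfile"

# Sum all slots=N entries in the hostfile.
total_slots=$(awk '{
    for (i = 2; i <= NF; i++)
        if ($i ~ /^slots=/) { sub(/^slots=/, "", $i); n += $i }
} END { print n + 0 }' "$hostfile")

echo "hostfile: $OMPI_MCA_orte_default_hostfile"
echo "total slots: $total_slots"
```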

FAQ at open-mpi.org

http://www.open-mpi.org/faq/

OpenMPI Frameworks(v1.3)

The frameworks are grouped into three layers.

1. MPI layer (OMPI)
2. Run-time layer (ORTE)
3. Operating system / platform layer (OPAL)

OMPI frameworks
  • allocator: Memory allocator
  • bml: BTL management layer
  • btl: MPI point-to-point Byte Transfer Layer, used for MPI point-to-point messages on some types of networks
  • coll: MPI collective algorithms
  • crcp: Checkpoint/restart coordination protocol
  • dpm: MPI-2 dynamic process management
  • io: MPI-2 I/O
  • mpool: Memory pooling
  • mtl: Matching transport layer, used for MPI point-to-point messages on some types of networks
  • osc: MPI-2 one-sided communications
  • pml: MPI point-to-point management layer
  • pubsub: MPI-2 publish/subscribe management
  • rcache: Memory registration cache
  • topo: MPI topology routines
ORTE frameworks
  • errmgr: RTE error manager
  • ess: RTE environment-specific services
  • filem: Remote file management
  • grpcomm: RTE group communications
  • iof: I/O forwarding
  • odls: OpenRTE daemon local launch subsystem
  • oob: Out of band messaging
  • plm: Process lifecycle management
  • ras: Resource allocation system
  • rmaps: Resource mapping system
  • rml: RTE message layer
  • routed: Routing table for the RML
  • snapc: Snapshot coordination
OPAL frameworks
  • backtrace: Debugging call stack backtrace support
  • carto: Cartography (host/network mapping) support
  • crs: Checkpoint and restart service
  • installdirs: Installation directory relocation services
  • maffinity: Memory affinity
  • memchecker: Run-time memory checking
  • memcpy: Memory copy support
  • memory: Memory management hooks
  • paffinity: Processor affinity
  • timer: High-resolution timers