Showing posts with label Intel. Show all posts

Monday, 27 October 2014

Intel MPI: How to Use and Debug

Running /bin/hostname

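mpirun directory:
/opt/intel/impi/<version>/intel64/bin

Source mpivars.sh from that directory to set up the environment:
$ source /opt/intel/impi/<version>/intel64/bin/mpivars.sh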

Create a machinefile:
$ cat mach.txt
node1
node2

Test run:
$ mpirun -r ssh -f mach.txt -ppn 1 -np 2 /bin/hostname

mpirun is a utility that runs mpdboot and then mpiexec, so options for mpdboot come first, followed by options for mpiexec. '-machinefile' is an mpiexec option.
With mpirun there is no need to run mpdboot separately (that is needed only with mpiexec) or to create mpd.hosts.

If you would rather use mpiexec, do the following.

Create mpd.hosts in your working directory.

For example:
$ cat mpd.hosts
node1
node2

Start the mpd ring:
$ mpdboot
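
The post stops at mpdboot, so here is a rough sketch of what a full MPD session might look like. The helper name mpd_session_cmds is made up for illustration, and it assumes the local node is listed in mpd.hosts (mpdboot's -n count includes the local machine). It only prints the commands so they can be reviewed before running them by hand.

```shell
#!/bin/sh
# Hypothetical helper: print an MPD-based session for a given hosts file.
# Nothing is executed; the commands are only echoed for review.
# Assumption: one hostname per line, local node included in the file.
mpd_session_cmds() {
    hostsfile=$1
    shift
    # Count non-blank lines, i.e. the number of nodes in the ring.
    nodes=$(grep -c -v '^[[:space:]]*$' "$hostsfile")
    echo "mpdboot -n $nodes -f $hostsfile -r ssh"   # start one mpd per node over ssh
    echo "mpdtrace"                                 # list the nodes that joined the ring
    echo "mpiexec -n $nodes $*"                     # launch one rank per node
    echo "mpdallexit"                               # shut the ring down afterwards
}
```

For example, `mpd_session_cmds mpd.hosts /bin/hostname` prints the four commands for the two-node file above.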

Try another one: cpi.c
$ mpiicc cpi.c
$ mpirun -f mach.txt -ppn 4 -np 8 ./a.out
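
In the run above, -np 8 is just 2 nodes times -ppn 4. A minimal sketch of deriving that from the machinefile, assuming one hostname per line; the function name mpirun_cmd is hypothetical, and it prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper: print an mpirun command line whose -np equals
# (number of hosts in the machinefile) x (processes per node).
# Assumption: one hostname per line; blank lines are ignored.
mpirun_cmd() {
    machinefile=$1
    ppn=$2
    shift 2
    hosts=$(grep -c -v '^[[:space:]]*$' "$machinefile")
    echo "mpirun -f $machinefile -ppn $ppn -np $((hosts * ppn)) $*"
}
```

Calling `mpirun_cmd mach.txt 4 ./a.out` with the two-node mach.txt reproduces the command shown above.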

Debugging

Note: iptables rules can sometimes block MPI traffic between nodes. You may need to flush or edit them.
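
Before flushing anything, it may help to see which INPUT rules could be dropping traffic. A small sketch that filters iptables-save output for REJECT/DROP rules; the function name input_blocking_rules is hypothetical, and it does not catch a DROP chain policy (the ':INPUT DROP' line), only explicit rules.

```shell
#!/bin/sh
# Hypothetical helper: read `iptables-save` output on stdin and print the
# INPUT-chain rules whose target is REJECT or DROP.
# Limitation: a default DROP policy on the chain is not reported.
input_blocking_rules() {
    grep -E '^-A INPUT .*-j (REJECT|DROP)'
}
```

Typical use (as root) would be `iptables-save | input_blocking_rules`.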

#Pass DEBUG environment variables
export I_MPI_DEBUG=5

#Check mpd is up
$ mpdtrace

#To specifically use IB HCA port 2 instead of default port 1
export I_MPI_DAPL_PROVIDER=ofa-v2-mlx4_0-2

Note: DAPL versions on all nodes must match. Older versions of Intel MPI do not support DAPL v2.0. When installing the OS, make sure the necessary InfiniBand drivers (e.g. DAPL 1.2 if using an old Intel MPI) are installed.
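
A provider name passed in I_MPI_DAPL_PROVIDER has to exist in the DAPL configuration on each node. A minimal sketch of checking that, assuming the providers are listed one per line with the name in the first column; the function name dapl_provider_listed and the default path /etc/dat.conf are assumptions (some distributions keep the file elsewhere, e.g. under /etc/rdma).

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given DAPL provider name appears as the
# first field of a line in dat.conf.
# Assumption: the file path defaults to /etc/dat.conf; pass another as $2.
dapl_provider_listed() {
    provider=$1
    datconf=${2:-/etc/dat.conf}
    awk -v p="$provider" '$1 == p { found = 1 } END { exit !found }' "$datconf"
}
```

For example: `dapl_provider_listed ofa-v2-mlx4_0-2 || echo "provider not in dat.conf"`.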

Wednesday, 8 October 2014

Testing Intel OFED Installation.

Setup: Intel TrueScale QDR (QLogic) HCAs with Intel OFED.

First, install OFED from Intel.

Testing OpenMPI (with Intel compilers).

cd /usr/mpi/intel/openmpi-<version>-qlc/tests/osu_benchmarks

source /usr/mpi/intel/openmpi-<version>-qlc/bin/mpivars.sh

mpirun -host node1,node2 -np 2 ./osu_latency
#Latency for a 0-byte message should be less than 2 us.

mpirun -host node1,node2 -np 2 ./osu_bw
#Bandwidth for larger messages should exceed 3000 MB/s.

Note: When running MPI apps, do not use InfiniBand Verbs (IBV), OpenIB, DAPL, etc. Use PSM instead.
If the latency for a 0-byte message is around 5 us, the Verbs interface is being used instead of PSM.
If this happens, try exiting the ssh session and logging in again.
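
The latency rule of thumb above can be checked automatically. A rough sketch, assuming osu_latency prints '#'-prefixed header lines followed by "size latency" rows; the function name classify_latency and its output labels are made up for illustration, and the 2 us default comes from the threshold quoted above.

```shell
#!/bin/sh
# Hypothetical helper: read osu_latency output on stdin and classify the
# first data row (the smallest message size).
# Prints "likely-psm" if its latency is below the limit (default 2 us),
# otherwise "check-fabric". Prints nothing if there is no data row.
classify_latency() {
    awk -v limit="${1:-2}" '
        /^#/ || NF < 2 { next }     # skip header and blank lines
        { print (($2 + 0 < limit + 0) ? "likely-psm" : "check-fabric"); exit }
    '
}
```

Usage would be along the lines of `mpirun -host node1,node2 -np 2 ./osu_latency | classify_latency`.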

Tuesday, 30 September 2014

Installing Intel TrueScale Fabric HCA Host Software

Background: I have a Haswell system with CentOS 7.0 on which I would like to have InfiniBand software installed.

Removing Previous OFED installation

Before installing the Intel OFED, get the latest OFED from the OpenFabrics Alliance. Run its install.pl script to uninstall the previously installed OFED software.

After a clean uninstall, reboot the machine. Manually remove any remaining ib modules after the reboot.
For example:
$ rmmod ib_qib
$ rmmod ib_mad
$ rmmod ib_core
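
To see which ib modules are still loaded after the reboot, the lsmod output can be filtered. A minimal sketch; the function name loaded_ib_modules is hypothetical, and it only lists modules. Removal order still matters, since a module that others depend on cannot be removed first.

```shell
#!/bin/sh
# Hypothetical helper: read `lsmod` output on stdin and print the names of
# loaded modules that start with "ib_". The header line is skipped.
loaded_ib_modules() {
    awk 'NR > 1 && $1 ~ /^ib_/ { print $1 }'
}
```

Run it as `lsmod | loaded_ib_modules`; an empty result means nothing is left to remove.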

Next, install the OFED software by running
$ ./install.pl -k <kernel-version> -s /lib/modules/<kernel-version>/build --umad-dev-rw

Reboot.

Installing the Intel OFED Software

Grab the installer from the Intel Download Center.

Since I am using CentOS 7, I will choose IntelIB-Basic.RHEL7-x86_64.7.3.0.0.26.tgz

Run the INSTALL script provided with the package with the --force option (otherwise the script complains because my OS is CentOS 7).

OFED components that I installed include:
  • IPoIB and MPI over uDAPL
  • ib_qib #Intel TrueScale cards
  • libibumad
  • OpenSM #Subnet Manager
  • SDP (Sockets Direct Protocol)
  • SRP
  • Perftest
  • Intel MPI
  • Debug info
NOTE: Do not install iWARP.

Tuesday, 9 September 2014

Compiling Linux Kernel 3.16.1 for CentOS 7.0 on Haswell Server

Background: I would like to upgrade the existing kernel for CentOS 7.0 (3.10.0-123.6.3.el7.x86_64) to 3.16.1 (latest kernel from kernel.org - not available in repo yet).


Download the latest kernel from www.kernel.org into /usr/src/kernels. Untar it and change into the kernel directory (in this case linux-3.16.1).

Install the necessary packages:
$ yum groupinstall "Development Tools"
$ yum install ncurses ncurses-devel

Do a proper cleanup:
$ make mrproper

Copy the current kernel config (in /boot) to .config (in the working directory) to use as a base.
Note that 'make mrproper' deletes .config, so copy it after the cleanup.
$ cp /boot/config-3.10.0-123.el7.x86_64 .config

Edit the configuration further if needed:
$ make menuconfig

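To match the build to the number of CPU cores (for faster compilation):
$ export CONCURRENCY_LEVEL=$(getconf _NPROCESSORS_ONLN)
Note: CONCURRENCY_LEVEL is read by Debian's make-kpkg; plain make takes its job count from the -j option instead.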

Compile and build the rpms
$ make rpm

Install the kernel
$ rpm -ivh /root/rpmbuild/RPMS/x86_64/kernel-3.16.1-1.x86_64.rpm


This should set up the initrd and grub settings. If not, do so manually:
$ mkinitrd /boot/initramfs-3.16.1.img 3.16.1
$ grubby --add-kernel=/boot/vmlinuz-3.16.1 --initrd=/boot/initramfs-3.16.1.img --title="CentOS Linux 7.0 3.16.1" --make-default --copy-default
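
Before pointing grub at the new kernel, it is worth confirming that both files named in the grubby command exist. A minimal sketch; the function name boot_files_present is hypothetical, and the file naming simply follows the paths used in the grubby command above.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if both the kernel image and the
# initramfs for the given version exist in the boot directory.
# Assumption: files are named vmlinuz-<version> and initramfs-<version>.img.
boot_files_present() {
    version=$1
    bootdir=${2:-/boot}
    [ -f "$bootdir/vmlinuz-$version" ] && [ -f "$bootdir/initramfs-$version.img" ]
}
```

For example: `boot_files_present 3.16.1 || echo "kernel or initramfs missing in /boot"`.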