Thursday 9 October 2014

Configuring a Compute Node (of a Supercomputer)

This post describes how to set up a compute node after an upgrade (e.g. an OS update).
These instructions are based on the actual setup of an HPC system that I administered.

Migrate User Accounts

Write a script to transfer user accounts. This optional step creates the user home directories cleanly.
Requires: /etc/shadow and /etc/passwd from the login node.
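
A minimal sketch of such a script, assuming regular accounts have a UID of 500 or higher (the RHEL 6 default) and that the login node's passwd file has been copied to a local path. The function names and the example path are hypothetical, not the script that was actually used:

```shell
#!/bin/sh
# Print "user:uid:gid:home" for each regular account in a passwd-format file.
# Assumption: regular accounts have UID >= 500; UID 65534 (nfsnobody) is skipped.
list_regular_users() {
    awk -F: '$3 >= 500 && $3 != 65534 { print $1 ":" $3 ":" $4 ":" $6 }' "$1"
}

# Create each missing home directory, owned by its user and private to them.
# Needs root because of chown.
create_homes() {
    list_regular_users "$1" | while IFS=: read -r user uid gid home; do
        [ -d "$home" ] && continue
        mkdir -p "$home" && chown "$uid:$gid" "$home" && chmod 700 "$home"
    done
}

# Example (hypothetical path): create_homes /root/passwd.login
```

Running list_regular_users on its own first is a cheap way to review which accounts would be touched.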

Next, copy /etc/passwd, /etc/shadow, /etc/group, and /etc/hosts over from the master node. Remember to back up the existing files on the compute node first.
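
The existing files can be backed up with a timestamped copy before they are overwritten. A minimal sketch; backup_file is a hypothetical helper, not part of the original setup:

```shell
#!/bin/sh
# Copy a file to <file>.bak-<YYYYmmdd-HHMMSS> next to the original,
# preserving mode and timestamps, and print the backup path.
backup_file() {
    src=$1
    dest="$src.bak-$(date +%Y%m%d-%H%M%S)"
    cp -p "$src" "$dest" && echo "$dest"
}

# Example, run as root on the compute node:
# for f in /etc/passwd /etc/shadow /etc/group /etc/hosts; do backup_file "$f"; done
```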

Make the user home directories reachable as /home01 (a symlink to /home):
$ cd /
$ ln -s home home01

Set Up Internet Access

$ echo 'GATEWAY=10.10.8.243' >> /etc/sysconfig/network

$ echo -e 'nameserver 10.10.8.236\nnameserver 202.83.248.3\nnameserver 123.136.66.68' >> /etc/resolv.conf

$ service network restart
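
Because >> appends unconditionally, rerunning the commands above duplicates the entries. A small sketch of an idempotent alternative; append_once is a hypothetical helper:

```shell
#!/bin/sh
# Append a line to a file only if that exact line is not already present.
append_once() {
    line=$1
    file=$2
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Example, run as root:
# append_once 'GATEWAY=10.10.8.243' /etc/sysconfig/network
# append_once 'nameserver 10.10.8.236' /etc/resolv.conf
```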

Make a copy of the existing iptables rules, then flush them. Comment out the rules in /etc/sysconfig/iptables.
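
One way to carry out this step, as a sketch. It assumes that "the rules" means the "-A ..." lines, so the table headers, default policies, and COMMIT lines stay in place; comment_out_rules and the backup path are hypothetical:

```shell
#!/bin/sh
# Comment out every "-A ..." rule line in an iptables rules file.
# sed keeps the untouched original next to it as <file>.orig.
comment_out_rules() {
    sed -i.orig 's/^-A /#&/' "$1"
}

# Example, run as root:
# iptables-save > /root/iptables.running.bak   # save the live rules first
# iptables -F                                  # then flush them
# comment_out_rules /etc/sysconfig/iptables
```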



Mount File Systems

On Fuji, $DATA is visible only on the head node and does not need to be mounted on compute nodes.

Mounting the NFS filesystems

To list the filesystems exported by each server:
$ showmount -e fujiB
$ showmount -e fs03
Steps:
$ mkdir /apps
$ mount -t nfs fujiB:/appslocal/APPS /apps
$ mount -t nfs fujiB:/appslocal/LOCAL /usr/local

$ mount -t nfs fs03:/home /home
$ mkdir /dx80
$ mount -t nfs fs03:/dx80 /dx80

To add to /etc/fstab:
fujiB:/appslocal/APPS    /apps        nfs    rsize=32768,wsize=32768,soft,intr,tcp    1 1
fujiB:/appslocal/LOCAL    /usr/local    nfs    rsize=32768,wsize=32768,soft,intr,tcp    1 1
fs03:/home        /home        nfs    rsize=32768,wsize=32768,tcp,soft,intr,noatime 1 1
fs03:/dx80        /dx80        nfs    rsize=32768,wsize=32768,soft,intr,tcp    1 1
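
Running mount -a -t nfs mounts every NFS entry in /etc/fstab, which exercises the new lines without a reboot. The result can then be checked against /proc/mounts. A minimal sketch; is_mounted is a hypothetical helper:

```shell
#!/bin/sh
# Succeed if the given mount point appears in a mounts table.
# The table defaults to /proc/mounts; the second argument exists for testing.
is_mounted() {
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Example: report any of the NFS mount points that is missing.
# for mp in /apps /usr/local /home /dx80; do
#     is_mounted "$mp" || echo "not mounted: $mp"
# done
```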


Mounting the PanFS


PanFS requires ntpd:
$ chkconfig ntpdate on
$ chkconfig ntpd on
$ service ntpd start

Create the scratch mount points:
$ mkdir /scratch1
$ mkdir /scratch

Install PanFS RPMs:
$ cd /usr/local/panasas
$ rpm -ivh <panasas-version-rpm>

Load the PanFS module:
$ insmod /lib/modules/<kernel-version>/extra/panfs/panfs.ko

Mount the PanFS filesystem:
$ mount -t panfs panfs://fujipan/pscratch2 /scratch
$ mount -t panfs panfs://fujipan/pscratch1 /scratch1

Note: The PanFS client clock must be in sync with the PanFS server; otherwise a clock skew error may be reported. To fix clock skew:
#Simple, but lost after reboot; not needed if ntpd is running
$ ntpdate ntp.acrc.a-star.edu.sg

Add to /etc/fstab:
panfs://fujipan/pscratch2 /scratch    panfs    rw,noauto,panauto,blksize=2M,dir-caching-threshold=32M,rmlist=(10.10.9.179;10.10.9.194;10.10.9.195),callback-network-allow=10.10.0.0/255.255.0.0,callback-default-one=1,_netdev 0 0

panfs://fujipan/pscratch1 /scratch1    panfs    rw,noauto,panauto,blksize=2M,dir-caching-threshold=32M,rmlist=(10.10.9.179;10.10.9.194;10.10.9.195),callback-network-allow=10.10.0.0/255.255.0.0,callback-default-one=1,_netdev 0 0

#The _netdev option (which might need the netfs service) delays mounting until the network is up, which might also give PanFS enough time to come up.

Create /etc/sysconfig/modules/local.modules so that the module is loaded at boot:
$ cat local.modules
#!/bin/sh

#Load PanFS module
insmod /lib/modules/2.6.32-220.el6.x86_64/extra/panfs/panfs.ko


$ chmod +x local.modules
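
The script above hard-codes the kernel version, so it stops working after a kernel update. A possible variant derives the path from the running kernel instead. This is a sketch that assumes the PanFS RPM installs its module under the same extra/panfs directory for each kernel; the function names are hypothetical:

```shell
#!/bin/sh
# Build the PanFS module path for a kernel release (default: the running kernel).
panfs_module_path() {
    echo "/lib/modules/${1:-$(uname -r)}/extra/panfs/panfs.ko"
}

# Load the module only if it exists for this kernel and is not loaded yet.
load_panfs() {
    mod=$(panfs_module_path)
    if [ ! -f "$mod" ]; then
        echo "panfs module not found: $mod" >&2
        return 1
    fi
    grep -q '^panfs ' /proc/modules || insmod "$mod"
}

# In local.modules, the last line of the script would then be: load_panfs
```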

Miscellaneous

MPI sometimes needs an unlimited or very high locked-memory limit.
#Change the locked memory limit
$ echo -e '* soft memlock unlimited\n* hard memlock unlimited' >> /etc/security/limits.conf

#Install necessary software
$ yum install -y gsl

#Copy /opt/intel
$ rsync -avr /usr/local/opt/intel /opt/

#Allow PanFS through SELinux
$ yum install policycoreutils-python
$ checkmodule -M -m panfs.te -o panfs.mod
$ semodule_package -o panfs.pp -m panfs.mod
$ semodule -i panfs.pp

$ cat panfs.te

module panfs 1.0;

require {
    type mount_t;
    class capability net_raw;
    class rawip_socket { read bind create getattr write ioctl shutdown node_bind };
}

#============= mount_t ==============
allow mount_t self:rawip_socket { bind create ioctl shutdown write read getattr };
allow mount_t self:capability net_raw;

#!!!! This avc is allowed in the current policy
allow mount_t self:rawip_socket create;



******************************END***********************************
