These instructions are based on the real setup of an HPC system that I administered.
Migrate User Accounts
Optionally, write a script to transfer the user accounts so that the user home directories are created cleanly. Requires: /etc/shadow and /etc/passwd from the login node.
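A minimal sketch of such a script is shown here. It is not the script used on the original system: the function names, the UID cutoff of 500 (the usual first regular UID on RHEL 6) and the exclusion of UID 65534 are all assumptions. It reads a passwd file copied from the login node and creates any missing home directory.

```shell
#!/bin/sh
# Sketch only: function names and the UID cutoff are assumptions.

# Print "user:uid:gid:home" for regular accounts in a passwd file.
# $1: passwd file, $2: lowest UID treated as a regular user (default 500)
list_regular_users() {
    passwd_file=$1
    min_uid=${2:-500}
    awk -F: -v min="$min_uid" \
        '$3 >= min && $3 != 65534 { print $1 ":" $3 ":" $4 ":" $6 }' "$passwd_file"
}

# Create each regular user's home directory if it does not exist yet.
# $1: passwd file, $2: optional root prefix for a dry run (empty = real /)
create_home_dirs() {
    passwd_file=$1
    root_prefix=${2:-}
    list_regular_users "$passwd_file" | while IFS=: read -r user uid gid home; do
        [ -d "$root_prefix$home" ] && continue
        mkdir -p "$root_prefix$home"
        chmod 700 "$root_prefix$home"
        # chown needs root; ignore the failure during an unprivileged dry run
        chown "$uid:$gid" "$root_prefix$home" 2>/dev/null || true
    done
}

# Example (as root, on the compute node):
#   create_home_dirs /root/from_login/passwd
```

The path /root/from_login/passwd is only a placeholder for wherever the copied file was saved.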
Next, copy over /etc/passwd, /etc/shadow, /etc/group, and /etc/hosts from the master node. Remember to back up the existing files on the compute node first.
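One possible way to combine the backup and the copy is sketched below. The helper name install_with_backup and the staging directory in the example are hypothetical. Each existing file is saved with a timestamp suffix before the new one is copied in.

```shell
#!/bin/sh
# Sketch only: the helper name and the staging directory are assumptions.

# $1: directory holding the files copied from the master node
# $2: destination directory (normally /etc)
# remaining arguments: file names to install
install_with_backup() {
    src_dir=$1
    dest_dir=$2
    shift 2
    stamp=$(date +%Y%m%d%H%M%S)
    for name in "$@"; do
        # Keep the compute node's own version as <name>.bak.<timestamp>
        if [ -f "$dest_dir/$name" ]; then
            cp -p "$dest_dir/$name" "$dest_dir/$name.bak.$stamp" || return 1
        fi
        cp -p "$src_dir/$name" "$dest_dir/$name" || return 1
    done
}

# Example (as root):
#   install_with_backup /root/from_master /etc passwd shadow group hosts
```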
Make the user home directories also reachable as /home01:
$ cd /
$ ln -s home home01
Set up internet
$ echo 'GATEWAY=10.10.8.243' >> /etc/sysconfig/network
$ echo -e 'nameserver 10.10.8.236\nnameserver 202.83.248.3\nnameserver 123.136.66.68' >> /etc/resolv.conf
$ service network restart
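Because the echo commands above append with >>, running them a second time duplicates the entries. A small helper along the following lines avoids that. The name append_once is hypothetical, and the sketch assumes the target file ends with a newline.

```shell
#!/bin/sh
# Sketch only: append_once is a hypothetical helper name.

# Append a line to a file unless an identical line is already present.
# $1: line to add, $2: file
append_once() {
    line=$1
    file=$2
    # -x: whole-line match, -F: fixed string (no regex), -q: no output
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Example (as root):
#   append_once 'GATEWAY=10.10.8.243' /etc/sysconfig/network
#   append_once 'nameserver 10.10.8.236' /etc/resolv.conf
```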
Make a copy of the existing iptables rules, then flush them. Comment out the rules in /etc/sysconfig/iptables.
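The copy and the commenting-out can be done in one pass, for example with the sketch below. The function name and the .orig suffix are assumptions, and only lines starting with "-A " are treated as rules. The running rules still have to be flushed separately with iptables -F.

```shell
#!/bin/sh
# Sketch only: the function name and the ".orig" suffix are assumptions.

# $1: rules file (default /etc/sysconfig/iptables)
comment_out_iptables_rules() {
    rules_file=${1:-/etc/sysconfig/iptables}
    backup_file=$rules_file.orig
    # Keep the first backup; a second run must not overwrite it
    [ -e "$backup_file" ] || cp -p "$rules_file" "$backup_file" || return 1
    # Regenerate the rules file from the backup with every "-A ..." line commented out
    sed 's/^-A /#&/' "$backup_file" > "$rules_file"
}

# Example (as root):
#   comment_out_iptables_rules
#   iptables -F
```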
Mount File Systems
Fuji file systems do not require $DATA (visible only on the head node) to be mounted on compute nodes.
Mounting the NFS filesystems
To list the filesystems available to mount:
$ showmount -e fujiB
$ showmount -e fs03
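If only the export paths are of interest, the showmount output can be filtered. The sketch below assumes the usual layout (one "Export list for ..." header line, then one export per line with the path in the first column). The function name list_exports is hypothetical.

```shell
#!/bin/sh
# Sketch only: list_exports is a hypothetical helper name.

# Read `showmount -e <host>` output on stdin and print only the export paths.
list_exports() {
    # Skip the header line and keep the first column of lines that start with "/"
    awk 'NR > 1 && $1 ~ /^\// { print $1 }'
}

# Example:
#   showmount -e fujiB | list_exports
```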
Steps:
$ mkdir /apps
$ mount -t nfs fujiB:/appslocal/APPS /apps
$ mount -t nfs fujiB:/appslocal/LOCAL /usr/local
$ mount -t nfs fs03:/home /home
$ mkdir /dx80
$ mount -t nfs fs03:/dx80 /dx80
To add to /etc/fstab:
fujiB:/appslocal/APPS /apps nfs rsize=32768,wsize=32768,soft,intr,tcp 1 1
fujiB:/appslocal/LOCAL /usr/local nfs rsize=32768,wsize=32768,soft,intr,tcp 1 1
fs03:/home /home nfs rsize=32768,wsize=32768,tcp,soft,intr,noatime 1 1
fs03:/dx80 /dx80 nfs rsize=32768,wsize=32768,soft,intr,tcp 1 1
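When these entries are added by a script rather than by hand, it helps to skip mount points that are already configured. The following is a sketch under that assumption. The helper name add_fstab_entry is hypothetical, and it compares only the mount point (the second field).

```shell
#!/bin/sh
# Sketch only: add_fstab_entry is a hypothetical helper name.

# Append an fstab entry unless its mount point is already configured.
# $1: full fstab line, $2: fstab file (default /etc/fstab)
add_fstab_entry() {
    entry=$1
    fstab_file=${2:-/etc/fstab}
    # Field 2 of an fstab entry is the mount point
    mount_point=$(printf '%s\n' "$entry" | awk '{ print $2 }')
    if awk -v mp="$mount_point" \
        '$1 !~ /^#/ && $2 == mp { found = 1 } END { exit !found }' "$fstab_file"; then
        return 0   # already present, leave the file alone
    fi
    printf '%s\n' "$entry" >> "$fstab_file"
}

# Example (as root):
#   add_fstab_entry 'fs03:/dx80 /dx80 nfs rsize=32768,wsize=32768,soft,intr,tcp 1 1'
```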
Mounting the PanFS
PanFS requires NTPD:
$ chkconfig ntpdate on
$ chkconfig ntpd on
$ service ntpd start
To create the scratch mount points:
$ mkdir /scratch1
$ mkdir /scratch
Install PanFS RPMs:
$ cd /usr/local/panasas
$ rpm -ivh <panasas-version-rpm>
Load the PanFS module:
$ insmod /lib/modules/<kernel-version>/extra/panfs/panfs.ko
Mount the PanFS filesystem:
$ mount -t panfs panfs://fujipan/pscratch2 /scratch
$ mount -t panfs panfs://fujipan/pscratch1 /scratch1
PS: The PanFS client's clock must be in sync with the PanFS server, otherwise a clock skew error may be reported. To fix clock skew:
#Simple, but lost after reboot; not needed if ntpd is running
$ ntpdate ntp.acrc.a-star.edu.sg
Add to /etc/fstab:
panfs://fujipan/pscratch2 /scratch panfs rw,noauto,panauto,blksize=2M,dir-caching-threshold=32M,rmlist=(10.10.9.179;10.10.9.194;10.10.9.195),callback-network-allow=10.10.0.0/255.255.0.0,callback-default-one=1,_netdev 0 0
panfs://fujipan/pscratch1 /scratch1 panfs rw,noauto,panauto,blksize=2M,dir-caching-threshold=32M,rmlist=(10.10.9.179;10.10.9.194;10.10.9.195),callback-network-allow=10.10.0.0/255.255.0.0,callback-default-one=1,_netdev 0 0
#The _netdev option (might need the netfs service) delays the mount until the network is up, which may also give PanFS enough time to come up.
Add /etc/sysconfig/modules/local.modules
$ cat local.modules
#!/bin/sh
#Load PanFS module
insmod /lib/modules/2.6.32-220.el6.x86_64/extra/panfs/panfs.ko
$ chmod +x local.modules
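The local.modules file above hardcodes the kernel version, so it stops working after a kernel update. A variant that derives the path from the running kernel could look like the sketch below. The function names are hypothetical, and it still assumes a PanFS module built for that kernel has been installed under the same extra/panfs directory.

```shell
#!/bin/sh
# Sketch only: function names are assumptions.

# Print the expected module path for a kernel version (default: running kernel).
panfs_module_path() {
    kernel_version=${1:-$(uname -r)}
    printf '/lib/modules/%s/extra/panfs/panfs.ko\n' "$kernel_version"
}

# Load the PanFS module unless it is already loaded.
load_panfs() {
    # /proc/modules lists one loaded module per line, name first
    grep -q '^panfs ' /proc/modules && return 0
    module=$(panfs_module_path)
    if [ ! -f "$module" ]; then
        echo "no PanFS module found for $(uname -r): $module" >&2
        return 1
    fi
    insmod "$module"
}

# In local.modules, the last line would then be:
#   load_panfs
```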
Miscellaneous
Sometimes MPI needs an unlimited / high locked memory limit
#Change the locked memory limit
$ echo -e '* soft memlock unlimited\n* hard memlock unlimited' >> /etc/security/limits.conf
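The new limit applies only to sessions started after the change, so it is worth checking from a fresh login. The sketch below compares the value printed by ulimit -l (either "unlimited" or a size in KB) against a required minimum. The function name and the 1 GB threshold in the example are assumptions.

```shell
#!/bin/sh
# Sketch only: the function name and the example threshold are assumptions.

# $1: value printed by `ulimit -l` ("unlimited" or a size in KB)
# $2: required minimum in KB
memlock_is_sufficient() {
    current=$1
    required_kb=$2
    [ "$current" = "unlimited" ] && return 0
    [ "$current" -ge "$required_kb" ]
}

# Example (in a new login session):
#   memlock_is_sufficient "$(ulimit -l)" 1048576 || echo "locked memory limit too low"
```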
#Install necessary software
$ yum install -y gsl
#Copy /opt/intel
$ rsync -avr /usr/local/opt/intel /opt/
#Configure SELinux to allow PanFS
$ yum install policycoreutils-python
$ checkmodule -M -m panfs.te -o panfs.mod
$ semodule_package -o panfs.pp -m panfs.mod
$ semodule -i panfs.pp
$ cat panfs.te
module panfs 1.0;
require {
type mount_t;
class capability net_raw;
class rawip_socket { read bind create getattr write ioctl shutdown node_bind };
}
#============= mount_t ==============
allow mount_t self:rawip_socket { bind create ioctl shutdown write read getattr };
allow mount_t self:capability net_raw;
#!!!! This avc is allowed in the current policy
allow mount_t self:rawip_socket create;
******************************END***********************************