Node Installation Documentation
This document is my log of how to install a new Linux node and add it to the Microsoft HPC 2012 R2 U4 cluster.
Get ready
Install Linux and enable SSH
- I used the normal Ubuntu 14.04 desktop USB, as the others didn't work.
- It all worked pretty smoothly really.
sudo apt-get update
sudo apt-get upgrade
sudo apt-get install openssh-server
sudo usermod -aG sudo user
- if you need to add any sudo-ers.
sudo nano /etc/ssh/sshd_config
- if you need to set ssh users.
- Add a line
AllowGroups ssh
- Also, be good and add
DenyUsers root
and
DenyGroups root
when you've set up sudo-ers.
sudo usermod -aG ssh user
- to add each user to ssh.
sudo service ssh restart
- to apply changes. Don't lock yourself out, muppet-brain.
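Since a bad AllowGroups line is the easy way to get locked out, a quick check before restarting ssh may be worth it. This is only a sketch: allowed_by_config is a made-up helper name, and it does a plain string comparison, whereas sshd also accepts wildcard patterns in AllowGroups.

```shell
# Succeeds if any group in the space-separated list $2 appears on an
# AllowGroups line of the sshd config file $1.
# Plain string comparison only; wildcard patterns are not handled.
allowed_by_config() {
    conf=$1
    user_groups=$2
    # Collect every group named after the AllowGroups keyword.
    allowed=$(awk 'tolower($1) == "allowgroups" { for (i = 2; i <= NF; i++) print $i }' "$conf")
    for g in $user_groups; do
        for a in $allowed; do
            if [ "$g" = "$a" ]; then
                return 0
            fi
        done
    done
    return 1
}

# On the node, before restarting (sshd -t only checks the config syntax):
#   sudo /usr/sbin/sshd -t \
#       && allowed_by_config /etc/ssh/sshd_config "$(id -nG)" \
#       && sudo service ssh restart
```

Group changes from usermod only show up in id -nG after logging in again, so keep the existing session open until a fresh login has been tried.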
Sort out InfiniBand support
- The cards I used were the old Voltaire ones, so a bit of hacking was needed:-
sudo nano /etc/modules
- and add, one per line:
ib_mthca
rdma_ucm
ib_umad
ib_uverbs
ib_ipoib
ib_srp
ib_sdp
sudo modprobe ib_ipoib
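To see which of those modules are actually loaded, something like the following may help. It is a minimal sketch: missing_modules is a hypothetical helper, and it assumes the drivers are built as loadable modules and so appear in /proc/modules (anything built into the kernel will be reported as missing).

```shell
# Prints each module named in $2.. that is not listed in the file $1,
# which is expected to look like /proc/modules (module name in column 1).
missing_modules() {
    modfile=$1
    shift
    for m in "$@"; do
        awk -v m="$m" '$1 == m { found = 1 } END { exit !found }' "$modfile" \
            || echo "$m"
    done
}

# On the node:
#   missing_modules /proc/modules ib_mthca rdma_ucm ib_umad ib_uverbs \
#       ib_ipoib ib_srp ib_sdp
```

No output means everything in the list is loaded.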
sudo nano /etc/network/interfaces
and add the below, where x is the node number+1. (eg, fi--didelx15 should be 12.0.0.16). Don't add anything about eth0 or eth1 or it will break.
auto ib0
iface ib0 inet static
    address 12.0.0.x
    netmask 255.255.255.0
    broadcast 12.0.0.255
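The "node number + 1" rule is easy to get wrong by hand, so here is a small sketch that derives the address from the node name. The function names ib_addr_for_host and ib_stanza are made up for illustration, and it assumes the node name ends in its number (as fi--didelx15 does); it does not check that the result stays below 255.

```shell
# Prints the ib0 address for a node name using the "node number + 1" rule.
# Returns 1 if the name does not end in digits.
ib_addr_for_host() {
    short=${1%%.*}               # drop any domain suffix
    digits=${short##*[!0-9]}     # keep only the trailing digits
    [ -n "$digits" ] || return 1
    # Strip leading zeros so that e.g. 09 is not read as octal.
    num=$(printf '%s' "$digits" | sed 's/^0*//')
    echo "12.0.0.$(( ${num:-0} + 1 ))"
}

# Prints the ib0 stanza for /etc/network/interfaces for the given node name.
ib_stanza() {
    addr=$(ib_addr_for_host "$1") || return 1
    printf 'auto ib0\niface ib0 inet static\n'
    printf '    address %s\n' "$addr"
    printf '    netmask 255.255.255.0\n    broadcast 12.0.0.255\n'
}

# Example: ib_stanza fi--didelx15
```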
- We may need to disable IPv6.
sudo nano /etc/sysctl.conf
and add the following somewhere:
net.ipv6.conf.all.disable_ipv6=1
net.ipv6.conf.default.disable_ipv6=1
net.ipv6.conf.lo.disable_ipv6=1
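If the node gets re-done more than once, appending these settings only when they are missing avoids duplicate lines. A minimal sketch, where ensure_line is a hypothetical helper rather than anything from the original install:

```shell
# Appends line $1 to file $2 unless an identical line is already there.
ensure_line() {
    grep -qxF -- "$1" "$2" 2>/dev/null || printf '%s\n' "$1" >> "$2"
}

# On the node, as root:
#   ensure_line 'net.ipv6.conf.all.disable_ipv6=1'     /etc/sysctl.conf
#   ensure_line 'net.ipv6.conf.default.disable_ipv6=1' /etc/sysctl.conf
#   ensure_line 'net.ipv6.conf.lo.disable_ipv6=1'      /etc/sysctl.conf
#   sysctl -p     # reload /etc/sysctl.conf without rebooting
#   cat /proc/sys/net/ipv6/conf/all/disable_ipv6   # 1 means IPv6 is disabled
```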
Add the HPC mount for some useful bits
sudo mkdir -p /hpclinux
sudo apt-get install cifs-utils
sudo mount -t cifs //fi--didelxhn/HPCLinux /hpclinux -o user=adminuser,dom=dide.local
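The mount command above does not survive a reboot, so later steps that copy from /hpclinux can fail if the share has dropped. A small sketch for checking first; is_mounted is a made-up helper that just looks at column 2 of /proc/mounts:

```shell
# Succeeds if directory $1 is a mount point according to the mounts table
# in $2 (defaults to /proc/mounts, where the mount point is column 2).
is_mounted() {
    awk -v d="$1" '$2 == d { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# On the node:
#   is_mounted /hpclinux || sudo mount -t cifs //fi--didelxhn/HPCLinux \
#       /hpclinux -o user=adminuser,dom=dide.local
```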
Adding to the domain
Install NTP support
sudo apt-get install ntp
sudo cp /hpclinux/linux_inst/ntp.conf /etc/ntp.conf
- (That sets the only server to be time.imperial.ac.uk)
sudo /etc/init.d/ntp stop
sudo ntpdate time.imperial.ac.uk
sudo /etc/init.d/ntp start
Domain things
sudo apt-get install winbind libpam-winbind libnss-winbind krb5-user krb5-config libpam-krb5
- The domain, when asked, is DIDE.local - case sensitive.
sudo cp /hpclinux/linux_inst/nsswitch.conf /etc/nsswitch.conf
- adds winbind to passwd and group, and removes [NOTFOUND=return] from hosts.
sudo cp /hpclinux/linux_inst/smb.conf /etc/samba/smb.conf
- lots of config for DIDE.
sudo cp /hpclinux/linux_inst/krb5.conf /etc/krb5.conf
- lots more config for DIDE.
ifconfig -a
- and make note of the IP address if you haven't already.
sudo nano /etc/hosts
and replace with:-
127.0.0.1 localhost
129.31.x.y fi--didelx99.dide.local fi--didelx99.dide.ic.ac.uk fi--didelx99
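The hosts file follows the same pattern on every node, so it can be generated rather than typed. A minimal sketch, assuming the two domain suffixes are always dide.local and dide.ic.ac.uk as in the example; hosts_entries is a hypothetical helper name.

```shell
# Prints the two /etc/hosts lines for a node, given its IP address ($1)
# and short name ($2).
hosts_entries() {
    ip=$1
    short=$2
    printf '127.0.0.1 localhost\n'
    printf '%s %s.dide.local %s.dide.ic.ac.uk %s\n' "$ip" "$short" "$short" "$short"
}

# Example, with the address noted from ifconfig -a:
#   hosts_entries 129.31.x.y fi--didelx99
```

Check the output by eye before pasting it over /etc/hosts.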
sudo net cache flush
sudo service smbd restart
sudo service nmbd restart
sudo service winbind restart
sudo kinit adminuser@DIDE.LOCAL
sudo net ads join -U adminuser
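It is worth confirming the join took before moving on. The sketch below wraps a few standard Samba checks (net ads testjoin, wbinfo -t, wbinfo -u); the check and verify_join function names are made up here, and which checks matter will depend on what the DIDE smb.conf sets up.

```shell
# Runs the command given after the description and reports ok / FAILED.
check() {
    desc=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "ok: $desc"
    else
        echo "FAILED: $desc"
        return 1
    fi
}

# Not called automatically; run verify_join by hand on the node.
verify_join() {
    check "machine account (net ads testjoin)" sudo net ads testjoin
    check "winbind trust secret (wbinfo -t)" wbinfo -t
    check "domain users listed (wbinfo -u)" wbinfo -u
}
```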
Preparing drive mounting
sudo apt-get install libpam-mount
sudo cp /hpclinux/linux_inst/pam_mount.conf.xml /etc/security/pam_mount.conf.xml
- this enables looking for .pam_mount.conf.xml in the home folder, and automatically sets up a mount point (on fi--san02) to that folder beforehand.
sudo cp /hpclinux/linux_inst/.pam_mount.conf.xml /etc/skel
- for convenience really. Suggest that users copy all the "." files from /etc/skel to their home folder, to get a nice experience when ssh-ing.
- The home folder is set to /media/home, and automatically mounts \\fi--san02\homes\username. Users should edit
.pam_mount.conf.xml
in their home folder, and add the volumes they want mounted. For example:-
<?xml version="1.0" encoding="utf-8" ?>
<pam_mount>
  <volume options="nodev,nosuid" user="*" mountpoint="/media/f2gsim" path="GlobalSim" server="fi--didef2.dide.ic.ac.uk" fstype="cifs" />
</pam_mount>
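For users adding several volumes, a tiny generator keeps the attributes consistent with the example. This is only a sketch: volume_line is a hypothetical helper, it reuses the options from the example above unchanged, and it does no XML escaping of its arguments.

```shell
# Prints one pam_mount <volume> element.
# $1 = mount point, $2 = share path, $3 = server
volume_line() {
    printf '<volume options="nodev,nosuid" user="*" mountpoint="%s" path="%s" server="%s" fstype="cifs" />\n' \
        "$1" "$2" "$3"
}

# Example:
#   volume_line /media/f2gsim GlobalSim fi--didef2.dide.ic.ac.uk
```

Paste the resulting lines between the pam_mount opening and closing tags in .pam_mount.conf.xml.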
Installing HPC
cd /hpclinux
sudo python setup.py -install -clusname:fi--didelxhn -certfile:hpc4.pfx