Setting up the Master Node
The Box cluster machine has no CD-ROM drive, so there are two ways to install the master node: attach a CD/DVD drive to a SATA port, or boot the node over the network (PXE), which requires another machine. We use the first option and attach a CD-ROM drive.
After this, we set up a) the DHCP server and b) the NFS server.
We also need to configure the network for IP masquerading.
DHCP server
A DHCP server is needed to assign IP addresses dynamically to the slave nodes during the PXE boot process. We install it with the following command (on newer Debian and Ubuntu releases the package is named isc-dhcp-server):
apt-get install dhcp3-server
We use the following configuration in /etc/dhcp/dhcpd.conf:
ddns-update-style none;
default-lease-time 600;
max-lease-time 7200;
authoritative;
subnet 192.168.1.0 netmask 255.255.255.0 {
    default-lease-time 86400;
    range 192.168.1.2 192.168.1.20;
    max-lease-time 604800;
    option routers 192.168.1.1;
    option subnet-mask 255.255.255.0;
    option domain-name "local";
    option domain-name-servers 163.143.1.100;
    option nis-domain "hpc";
    option broadcast-address 192.168.1.255;
    allow booting;
    allow bootp;
    if (substring (option vendor-class-identifier, 0, 20)
            = "PXEClient:Arch:00002") {
        # ia64
        filename "elilo.efi";
        next-server 192.168.1.1;
    } elsif ((substring (option vendor-class-identifier, 0, 9)
            = "PXEClient") or
            (substring (option vendor-class-identifier, 0, 9)
            = "Etherboot")) {
        # i386 and x86_64
        filename "pxelinux.0";
        next-server 192.168.1.1;
    } else {
        filename "/install/sbin/kickstart.cgi";
        next-server 192.168.1.1;
    }
    host hpcs02 {
        hardware ethernet 00:15:17:31:11:00;
        option host-name "hpcs02";
        fixed-address 192.168.1.2;
    }
    host hpcs03 {
        hardware ethernet 00:15:17:31:0D:8C;
        option host-name "hpcs03";
        fixed-address 192.168.1.3;
    }
    # and so on .........
}
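Since every slave node needs its own host entry, the entries can be generated rather than typed by hand. The following is a minimal sketch, assuming a list of "name MAC IP" triples; gen_host_stanza and gen_all_stanzas are hypothetical helper names, not part of the DHCP package.

```shell
#!/bin/sh
# Print one dhcpd.conf host entry from a node name, MAC address and IP.
# gen_host_stanza is a hypothetical helper for this sketch.
gen_host_stanza() {
    printf 'host %s {\n' "$1"
    printf '    hardware ethernet %s;\n' "$2"
    printf '    option host-name "%s";\n' "$1"
    printf '    fixed-address %s;\n' "$3"
    printf '}\n'
}

# Read "name mac ip" triples from standard input, one node per line.
gen_all_stanzas() {
    while read -r name mac ip; do
        # Skip blank lines and comments.
        case $name in ''|'#'*) continue ;; esac
        gen_host_stanza "$name" "$mac" "$ip"
    done
}

# The two nodes listed in the configuration above.
gen_all_stanzas <<'EOF'
hpcs02 00:15:17:31:11:00 192.168.1.2
hpcs03 00:15:17:31:0D:8C 192.168.1.3
EOF
```

The output is meant to be pasted inside the subnet block. After editing, dhcpd -t -cf /etc/dhcp/dhcpd.conf checks the file for syntax errors without starting the server.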
Next, edit /etc/network/interfaces and add the following stanza for eth1, the interface that faces the slave nodes:
auto eth1
iface eth1 inet static
    address 192.168.1.1
    netmask 255.255.255.0
    network 192.168.1.0
    broadcast 192.168.1.255
    gateway 163.143.166.104
    dns-nameservers 163.143.1.100
Now restart networking and the DHCP server:
service networking restart
service isc-dhcp-server restart
More network configuration
Edit /etc/hosts to include all the nodes. For example:
127.0.0.1 localhost
192.168.1.1 hpcs01.u-aizu.ac.jp hpcs01
192.168.1.2 hpcs02.u-aizu.ac.jp hpcs02
192.168.1.3 hpcs03.u-aizu.ac.jp hpcs03
192.168.1.4 hpcs04.u-aizu.ac.jp hpcs04
192.168.1.5 hpcs05.u-aizu.ac.jp hpcs05
192.168.1.6 hpcs06.u-aizu.ac.jp hpcs06
192.168.1.7 hpcs07.u-aizu.ac.jp hpcs07
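Because the entries follow a regular pattern, they can also be generated with a short loop. This is only a sketch: gen_hosts is a hypothetical helper, and it assumes that node number N always maps to 192.168.1.N and hpcsNN, as in the example above.

```shell
#!/bin/sh
# Print /etc/hosts lines for nodes numbered from $1 to $2.
# gen_hosts is a hypothetical helper; the address and name pattern
# are taken from the example entries.
gen_hosts() {
    i=$1
    while [ "$i" -le "$2" ]; do
        printf '192.168.1.%d hpcs%02d.u-aizu.ac.jp hpcs%02d\n' "$i" "$i" "$i"
        i=$((i + 1))
    done
}

gen_hosts 1 7
```

The output can be reviewed and then appended to /etc/hosts below the localhost line.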
Also create a text file, /etc/machines, that lists the node names one per line. We will use this file for scripting common tasks across the nodes. For example:
hpcs01
hpcs02
...etc
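As an illustration of how /etc/machines might be used for scripting, the sketch below runs one command on every listed node. It assumes that password-less ssh logins to the nodes are already set up; run_on_all and the REMOTE_SH variable are hypothetical names introduced here.

```shell
#!/bin/sh
# Run one command on every node named in a machines file.
# run_on_all and REMOTE_SH are hypothetical names for this sketch.
# REMOTE_SH defaults to ssh; set REMOTE_SH=echo for a dry run.
run_on_all() {
    machines=$1
    shift
    while read -r node; do
        # Skip blank lines and comments.
        case $node in ''|'#'*) continue ;; esac
        printf '== %s ==\n' "$node"
        # Redirect stdin so the remote command cannot swallow the node list.
        ${REMOTE_SH:-ssh} "$node" "$@" < /dev/null
    done < "$machines"
}
```

A typical call would be run_on_all /etc/machines uptime. Setting REMOTE_SH=echo first prints the commands that would be run instead of executing them.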
NFS server
Install the packages: apt-get install nfs-common nfs-kernel-server
Edit /etc/exports to export /home and /var/lib/tftpboot:
/home 192.168.1.0/24(rw,no_root_squash,sync,no_subtree_check)
/var/lib/tftpboot 192.168.1.0/24(rw,no_root_squash,sync,no_subtree_check)
Export these directories with exportfs -av
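If the setup is scripted, it helps to add the export lines only when they are missing, so that the script can be rerun safely. A minimal sketch follows; add_export is a hypothetical helper, and the options string simply repeats the one used above.

```shell
#!/bin/sh
# Append an export line for directory $2 to exports file $1 unless the
# directory is already exported. add_export is a hypothetical helper.
OPTS='192.168.1.0/24(rw,no_root_squash,sync,no_subtree_check)'

add_export() {
    file=$1
    dir=$2
    # An existing entry starts with the directory followed by a space or tab.
    if grep -q "^$dir[[:space:]]" "$file" 2>/dev/null; then
        return 0
    fi
    printf '%s %s\n' "$dir" "$OPTS" >> "$file"
}
```

For example, add_export /etc/exports /home followed by add_export /etc/exports /var/lib/tftpboot, run as root. After exportfs -av, the command showmount -e localhost lists what the server currently exports.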
IP Masquerading
First, edit /etc/sysctl.conf and uncomment the following line:
net.ipv4.ip_forward=1
Then apply the change:
sudo sysctl -p
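The same two steps can be scripted and verified. The sketch below assumes that the setting is present in the file as a commented-out line; enable_forward_conf and forwarding_enabled are hypothetical helper names.

```shell
#!/bin/sh
# Uncomment net.ipv4.ip_forward=1 in a sysctl configuration file.
# enable_forward_conf is a hypothetical helper for this sketch.
enable_forward_conf() {
    conf=$1
    tmp=$(mktemp) || return 1
    # Drop the leading "#" (and any spaces) in front of the setting,
    # then write the result back without replacing the file itself.
    sed 's/^#[[:space:]]*net\.ipv4\.ip_forward[[:space:]]*=[[:space:]]*1/net.ipv4.ip_forward=1/' \
        "$conf" > "$tmp" && cat "$tmp" > "$conf"
    status=$?
    rm -f "$tmp"
    return $status
}

# Succeed if forwarding is on. The optional file argument defaults to
# the kernel's live setting under /proc.
forwarding_enabled() {
    [ "$(cat "${1:-/proc/sys/net/ipv4/ip_forward}")" = 1 ]
}
```

Run as root, enable_forward_conf /etc/sysctl.conf followed by sysctl -p applies the change, and forwarding_enabled && echo "forwarding is on" confirms that the kernel picked it up.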
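For IP masquerading we add a NAT rule for traffic from the cluster subnet that leaves through the outward-facing interface eth0, and then save the rule set:
sudo iptables -t nat -A POSTROUTING -s 192.168.1.0/24 -o eth0 -j MASQUERADE
sudo sh -c "iptables-save > /etc/network/iptables"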
To make the masquerading permanent, add an up line to the eth1 stanza in /etc/network/interfaces so that the saved rules are restored whenever the interface comes up:
auto eth1
iface eth1 inet static
    address 192.168.1.1
    netmask 255.255.255.0
    network 192.168.1.0
    broadcast 192.168.1.255
    up iptables-restore < /etc/network/iptables