Supercomputing 2012 Documentation.
- The following IBM-provided software was saved on our repository for easy distribution:
- ibm_fw_sraidmr_m5100-23.7.0-0037_linux_32-64.bin - M5100 series RAID firmware.
- ibm_utl_sraidmr_megacli-8.04.07_linux_32-64.zip - MegaCli command-line utility.
- Install pre-reqs (/lib/ld-linux.so.2, libncurses.so.5, libstdc++.so.6):
- yum install compat-glibc.x86_64 ncurses-libs-5.7-3.20090208.el6.i686 libstdc++-4.4.6-4.el6.i686
- Download and install the MegaCli utility, then run the firmware upgrade; a sketch of the firmware step follows the MegaCli block below:
%STARTCONSOLE%
mkdir -p /tmp/ibm_megacli
cd /tmp/ibm_megacli
wget http://vmrepo.heprc.uvic.ca/ibm_utl_sraidmr_megacli-8.04.07_linux_32-64.zip
unzip ibm_utl_sraidmr_megacli-8.04.07_linux_32-64.zip
yum localinstall Lib_Utils-1.00-09.noarch.rpm MegaCli-8.04.07-1.noarch.rpm
cd /usr/local/sbin
ln -s /opt/MegaRAID/MegaCli/MegaCli64 MegaCli
%ENDCONSOLE%
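- Download and run the firmware upgrade (will upgrade all connected controllers). A minimal sketch: the mirror URL is assumed to match the MegaCli zip above, and the -s (silent/unattended) flag is the usual convention for IBM updaters:
%STARTCONSOLE%
mkdir -p /tmp/ibm_fw
cd /tmp/ibm_fw
# assumed mirror location, alongside the MegaCli zip
wget http://vmrepo.heprc.uvic.ca/ibm_fw_sraidmr_m5100-23.7.0-0037_linux_32-64.bin
chmod +x ibm_fw_sraidmr_m5100-23.7.0-0037_linux_32-64.bin
# -s (assumed) runs the update unattended on all connected controllers
./ibm_fw_sraidmr_m5100-23.7.0-0037_linux_32-64.bin -s
%ENDCONSOLE%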
- Verify the adapter and firmware details:
%STARTCONSOLE%
MegaCli -AdpAllInfo -aALL | less
%ENDCONSOLE%
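- To check just the firmware package level, the same report can be filtered; a sketch (the "FW Package" label is taken from typical MegaCli output):
%STARTCONSOLE%
MegaCli -AdpAllInfo -aALL | grep -i 'FW Package'
%ENDCONSOLE%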
Reconfiguring RAID 0 Virtual Drives
The SC2012 servers (IBM x3650 M4) contain two HDD enclosures, each with eight drives. The first physical drive will be used to host the operating system and utilities. The remaining drives will be defined as two RAID 0 arrays: one with the seven free drives in the first enclosure, and one with the eight drives in the second.
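Before reconfiguring, the sixteen physical drives and their enclosure:slot addresses can be enumerated; a sketch using the standard -PDList report:
%STARTCONSOLE%
[root@sc03 sbin]# MegaCli -PDList -aALL | awk '/Enclosure Device ID|Slot Number/'
%ENDCONSOLE%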
- List the current virtual drives:
%STARTCONSOLE%
[root@sc03 sbin]# MegaCli -LDInfo -Lall -aALL
%ENDCONSOLE%
- Remove the redundant virtual drive:
%STARTCONSOLE%
[root@sc03 sbin]# MegaCli -CfgLdDel -L1 -a0
Adapter 0: Deleted Virtual Drive-1(target id-1)
Exit Code: 0x00
[root@sc03 sbin]#
%ENDCONSOLE%
- Determine the enclosure device IDs:
%STARTCONSOLE%
[root@sc03 sbin]# MegaCli -EncInfo -aALL | awk '/Enclosure|Device ID/'
Enclosure 0:
Device ID : 252
Enclosure type : SGPIO
Enclosure Serial Number : N/A
Enclosure Zoning Mode : N/A
Enclosure 0:
Device ID : 62
Enclosure type : SGPIO
Enclosure Serial Number : N/A
Enclosure Zoning Mode : N/A
[root@sc03 sbin]#
%ENDCONSOLE%
- Determine which slots are already in use:
%STARTCONSOLE%
[root@sc03 sbin]# MegaCli -CfgDsply -aALL | awk '/GROUP|Enclosure|Slot/'
Number of DISK GROUPS: 1
DISK GROUP: 0
Enclosure Device ID: 252
Slot Number: 0
Enclosure position: N/A
Number of DISK GROUPS: 0
[root@sc03 sbin]#
%ENDCONSOLE%
- Define a new RAID 0 virtual drive on the seven free drives of the first enclosure:
%STARTCONSOLE%
[root@sc03 sbin]# MegaCli -CfgLdAdd -r0 [252:1,252:2,252:3,252:4,252:5,252:6,252:7] RA -strpsz1024 -a0
Adapter 0: Created VD 1
Adapter 0: Configured the Adapter!!
Exit Code: 0x00
[root@sc03 sbin]#
%ENDCONSOLE%
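The eight-drive array on the second enclosure (device ID 62) would be created the same way; a sketch, assuming all eight of its slots (0-7) are free:
%STARTCONSOLE%
[root@sc03 sbin]# MegaCli -CfgLdAdd -r0 [62:0,62:1,62:2,62:3,62:4,62:5,62:6,62:7] RA -strpsz1024 -a0
%ENDCONSOLE%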
- Verify the new virtual drive:
%STARTCONSOLE%
[root@sc03 sbin]# MegaCli -LDInfo -Lall -aALL
%ENDCONSOLE%
Moving from IBM to LSI firmware
This gave a minor stability improvement, though it is hard to quantify.
Complete output at: https://gist.github.com/2d9da810da246e0e161b
%STARTCONSOLE%
[root@sc04 ~]# fdtClient -P 4 -c 10.20.3.101 -fl filelist-8-15.txt -d /
06/11 11:20:54 Net Out: 23.864 Gb/s Avg: 21.642 Gb/s 98.84% ( 03s )
06/11 11:20:59 Net Out: 15.864 Gb/s Avg: 21.550 Gb/s 100.00% ( 00s )
FDTReaderSession ( e66cfbb2-a692-4e91-a0a6-fa21abfa3282 ) final stats:
Started: Tue Nov 06 11:15:39 PST 2012
Ended: Tue Nov 06 11:21:03 PST 2012
Transfer period: 05m 24s
TotalBytes: 858179094400
TotalNetworkBytes: 858179094400
Exit Status: OK
%ENDCONSOLE%
Trying all 15 disks gives the same result:
%STARTCONSOLE%
06/11 15:25:29 Net In: 0.000 b/s Avg: 21.438 Gb/s 100.00% ( 00s )
FDTWriterSession ( afabec72-9117-404a-8254-bb18856a2558 ) final stats:
Started: Tue Nov 06 15:15:27 PST 2012
Ended: Tue Nov 06 15:25:30 PST 2012
Transfer period: 10m 03s
TotalBytes: 1609085802000
TotalNetworkBytes: 1609085802000
Exit Status: OK
%ENDCONSOLE%
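As a sanity check, the rate implied by the final stats roughly matches the reported average (1609085802000 bytes over the 10m 03s transfer period):
%STARTCONSOLE%
# 1609085802000 bytes * 8 bits / 603 s / 1e9 = ~21.35 Gb/s
echo 'scale=3; 1609085802000*8/603/10^9' | bc
%ENDCONSOLE%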
Building the lustre client RPMs for the current kernel.
- On a base install of Scientific Linux 6.3, install the latest kernel and all pre-reqs.
%STARTCONSOLE%
yum update
yum groupinstall 'Development tools'
yum install expect
%ENDCONSOLE%
- Edit /etc/selinux/config and disable SELinux:
%STARTCONSOLE%
SELINUX=disabled
%ENDCONSOLE%
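- The config change takes effect at the next boot; to put the running system into permissive mode immediately:
%STARTCONSOLE%
setenforce 0
%ENDCONSOLE%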
- Install the lustre client source RPM:
%STARTCONSOLE%
rpm -Uvh http://downloads.whamcloud.com/public/lustre/lustre-1.8.8-wc1/el6/client/RPMS/x86_64/lustre-client-source-1.8.8-wc1_2.6.32_220.17.1.el6.x86_64_gbc88c4c.x86_64.rpm
%ENDCONSOLE%
- Switch to the source directory, compile the lustre client code and build the RPMs.
%STARTCONSOLE%
cd /usr/src/lustre-1.8.8
./configure
make
make rpms
%ENDCONSOLE%
- Install the client either directly from the built RPMs:
%STARTCONSOLE%
cd /root/rpmbuild/RPMS/x86_64
rpm -ivh lustre-client-1.8.8-wc1_2.6.32_279.14.1.el6.x86_64_gbc88c4c.x86_64.rpm lustre-client-modules-1.8.8-wc1_2.6.32_279.14.1.el6.x86_64_gbc88c4c.x86_64.rpm
%ENDCONSOLE%
- or create a repository with these RPMs and install from it (a createrepo sketch follows; see also "Activating a lustre client node" below):
%STARTCONSOLE%
yum install lustre-client
%ENDCONSOLE%
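The repository referenced above can be built from the generated RPMs with createrepo; a sketch with hypothetical paths:
%STARTCONSOLE%
yum install createrepo
# hypothetical web-root layout matching the kernel version in the RPM names
mkdir -p /var/www/html/lustre/x86_64/2.6.32-279.14.1
cp /root/rpmbuild/RPMS/x86_64/lustre-client*.rpm /var/www/html/lustre/x86_64/2.6.32-279.14.1/
createrepo /var/www/html/lustre/x86_64/2.6.32-279.14.1
%ENDCONSOLE%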
Activating a lustre client node.
- Configure the lustre network by creating "/etc/modprobe.d/lustre-lnet.conf". Assuming the "ethmlx" network interface is on the same layer 2 network as the lustre filesystem (10.20.3.nnn), the configuration file should contain the following (a verification sketch appears after the install step below):
%STARTCONSOLE%
options lnet networks=tcp(ethmlx)
%ENDCONSOLE%
- Install the hepnet-lustre repository by creating "/etc/yum.repos.d/hepnet-lustre.repo". The file should contain:
%STARTCONSOLE%
[hepnet-lustre]
name=HEPnet Canada SC12 repository
#baseurl=http://rpms.hepnetcanada.ca/sl/6X/lustre/$basearch/2.6.32-220
#baseurl=http://rpms.hepnetcanada.ca/sl/6X/lustre/$basearch/2.6.32-279.2.1
#baseurl=http://rpms.hepnetcanada.ca/sl/6X/lustre/$basearch/2.6.32-279.5.1
baseurl=http://rpms.hepnetcanada.ca/sl/6X/lustre/$basearch/2.6.32-279.14.1
enabled=0
gpgcheck=0
http_caching=none
%ENDCONSOLE%
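- To confirm the repository resolves before installing (a quick check):
%STARTCONSOLE%
yum --enablerepo=hepnet-lustre list available 'lustre-client*'
%ENDCONSOLE%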
- Install the lustre client:
%STARTCONSOLE%
yum install --enablerepo=hepnet-lustre lustre-client
%ENDCONSOLE%
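- Before mounting, the LNET configuration can be verified; lctl ships with the lustre client (a sketch):
%STARTCONSOLE%
modprobe lnet
lctl network up
lctl list_nids
%ENDCONSOLE%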
- Create the mount point and mount the filesystem:
%STARTCONSOLE%
mkdir -p /lustreSC
mount -t lustre 10.20.3.103@tcp:/lustreSC /lustreSC
%ENDCONSOLE%
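- Once mounted, the filesystem and its OSTs can be checked with lfs, also part of the client package:
%STARTCONSOLE%
lfs df -h /lustreSC
%ENDCONSOLE%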