Difference: SC2012Documentation (1 vs. 10)

Revision 10 - 2012-11-15 - crlb

Line: 1 to 1
 
META TOPICPARENT name="ColinLeavettBrown"

Supercomputing 2012 Documentation.

Line: 189 to 189
 name=HEPnet Canada SC12 repository
 #baseurl=http://rpms.hepnetcanada.ca/sl/6X/lustre/$basearch/2.6.32-220
 #baseurl=http://rpms.hepnetcanada.ca/sl/6X/lustre/$basearch/2.6.32-279.2.1
Changed:
<
<
i#baseurl=http://rpms.hepnetcanada.ca/sl/6X/lustre/$basearch/2.6.32-279.5.1
>
>
#baseurl=http://rpms.hepnetcanada.ca/sl/6X/lustre/$basearch/2.6.32-279.5.1
 baseurl=http://rpms.hepnetcanada.ca/sl/6X/lustre/$basearch/2.6.32-279.14.1
 enabled=0
 gpgcheck=0
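
The baseurl lines above appear to name each repository directory after a kernel release. A minimal sketch, assuming that convention (it is inferred from the URLs, not documented here), of deriving the baseurl line for a given kernel release string. The function names `kernel_repo_dir` and `hepnet_baseurl` are hypothetical.

```shell
# Hypothetical helper: map a kernel release to the repository directory
# name by dropping the distribution suffix (".el6.x86_64" and the like).
# The directory naming is an assumption inferred from the baseurl lines.
kernel_repo_dir() {
    printf '%s\n' "${1%%.el[0-9]*}"
}

# Print the baseurl line for a kernel release.
# The single quotes keep $basearch literal so that yum expands it.
hepnet_baseurl() {
    printf 'baseurl=http://rpms.hepnetcanada.ca/sl/6X/lustre/$basearch/%s\n' \
        "$(kernel_repo_dir "$1")"
}

# Usage (on a client node): hepnet_baseurl "$(uname -r)"
```

The output can be compared with the active baseurl line in the repo file after a kernel update.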

Revision 9 - 2012-11-13 - crlb

Line: 136 to 136
  Exit Status: OK %ENDCONSOLE%
Added:
>
>

Building the lustre client RPMs for the current kernel.

  • On a base install of Scientific Linux 6.3, install the latest kernel and all pre-reqs.
%STARTCONSOLE%
yum update
yum groupinstall 'Development tools'
yum install expect
%ENDCONSOLE%

  • Edit /etc/selinux/config and disable selinux:
%STARTCONSOLE%
SELINUX=disabled
%ENDCONSOLE%

  • Reboot.

  • Install the lustre client source RPM:
%STARTCONSOLE%
rpm -ivhU http://downloads.whamcloud.com/public/lustre/lustre-1.8.8-wc1/el6/client/RPMS/x86_64/lustre-client-source-1.8.8-wc1_2.6.32_220.17.1.el6.x86_64_gbc88c4c.x86_64.rpm
%ENDCONSOLE%

  • Switch to the source directory, compile the lustre client code and build the RPMs.
%STARTCONSOLE%
cd /usr/src/lustre-1.8.8
./configure
make
make rpms
%ENDCONSOLE%

  • Install the client either directly from the built RPMs:
%STARTCONSOLE%
cd /root/rpmbuild/RPMS/x86_64
rpm -ivhf lustre-client-1.8.8-wc1_2.6.32_279.14.1.el6.x86_64_gbc88c4c.x86_64.rpm lustre-client-modules-1.8.8-wc1_2.6.32_279.14.1.el6.x86_64_gbc88c4c.x86_64.rpm
%ENDCONSOLE%

  • or by creating a repository with these RPMs and installing from it (see also "Activating a lustre client node" below):
%STARTCONSOLE%
yum install lustre-client
%ENDCONSOLE%
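
The RPM file names in the install step embed the kernel release the modules were built against, with hyphens replaced by underscores (2.6.32_279.14.1.el6.x86_64 for kernel 2.6.32-279.14.1.el6.x86_64). A minimal sketch, assuming that naming pattern also holds for other kernels, of deriving the expected file names from a kernel release string. The function names are hypothetical, and the lustre version and git tag default to the values shown above.

```shell
# Hypothetical helpers; the naming pattern is inferred from the RPM
# file names in the install step and is not guaranteed by the build.

# Replace every "-" in the kernel release with "_".
kernel_to_rpm_tag() {
    printf '%s\n' "$1" | sed 's/-/_/g'
}

# $1 = kernel release (for example the output of `uname -r`)
# $2 = lustre version tag, $3 = git tag (defaults: values shown above)
lustre_client_rpms() {
    tag=$(kernel_to_rpm_tag "$1")
    ver=${2:-1.8.8-wc1}
    git=${3:-gbc88c4c}
    for pkg in lustre-client lustre-client-modules; do
        printf '%s-%s_%s_%s.x86_64.rpm\n' "$pkg" "$ver" "$tag" "$git"
    done
}

# Usage: lustre_client_rpms "$(uname -r)"
```

Comparing this list with the contents of /root/rpmbuild/RPMS/x86_64 is one way to confirm that the RPMs were built against the running kernel rather than an older one.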
 

Activating a lustre client node.

  • Configure the lustre network by creating "/etc/modprobe.d/lustre-lnet.conf". Assuming the "ethmlx" network interface is on the same layer 2 network as the lustre filesystem (10.20.3.nnn), the configuration file should contain:
Line: 149 to 189
 name=HEPnet Canada SC12 repository
 #baseurl=http://rpms.hepnetcanada.ca/sl/6X/lustre/$basearch/2.6.32-220
 #baseurl=http://rpms.hepnetcanada.ca/sl/6X/lustre/$basearch/2.6.32-279.2.1
Changed:
<
<
baseurl=http://rpms.hepnetcanada.ca/sl/6X/lustre/$basearch/2.6.32-279.5.1
>
>
i#baseurl=http://rpms.hepnetcanada.ca/sl/6X/lustre/$basearch/2.6.32-279.5.1
baseurl=http://rpms.hepnetcanada.ca/sl/6X/lustre/$basearch/2.6.32-279.14.1
 enabled=0
 gpgcheck=0
 http_caching=none
Line: 157 to 198
 
  • Install the lustre client:
%STARTCONSOLE%
Changed:
<
<
yum install --enablerepo=hepnet-lustre lustre-client net-snmp-libs
>
>
yum install --enablerepo=hepnet-lustre lustre-client
 %ENDCONSOLE%

  • Mount the filesystem:

Revision 8 - 2012-11-08 - crlb

Line: 135 to 135
  TotalNetworkBytes: 1609085802000 Exit Status: OK %ENDCONSOLE% \ No newline at end of file
Added:
>
>

Activating a lustre client node.

  • Configure the lustre network by creating "/etc/modprobe.d/lustre-lnet.conf". Assuming the "ethmlx" network interface is on the same layer 2 network as the lustre filesystem (10.20.3.nnn), the configuration file should contain:
%STARTCONSOLE%
options lnet networks=tcp(ethmlx)
%ENDCONSOLE%

  • Install the hepnet-lustre repository by creating "/etc/yum.repos.d/hepnet-lustre.repo". The file should contain:
%STARTCONSOLE%
[hepnet-lustre]
name=HEPnet Canada SC12 repository
#baseurl=http://rpms.hepnetcanada.ca/sl/6X/lustre/$basearch/2.6.32-220
#baseurl=http://rpms.hepnetcanada.ca/sl/6X/lustre/$basearch/2.6.32-279.2.1
baseurl=http://rpms.hepnetcanada.ca/sl/6X/lustre/$basearch/2.6.32-279.5.1
enabled=0
gpgcheck=0
http_caching=none
%ENDCONSOLE%

  • Install the lustre client:
%STARTCONSOLE%
yum install --enablerepo=hepnet-lustre lustre-client net-snmp-libs
%ENDCONSOLE%

  • Mount the filesystem:
%STARTCONSOLE%
mkdir -p /lustreSC
mount -t lustre 10.20.3.103@tcp:/lustreSC /lustreSC
%ENDCONSOLE%
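
The two configuration files described above can be generated rather than typed by hand. The following is a sketch only: the function name `write_lustre_client_config` and the staging-directory approach are illustrative assumptions, and the module options file is written under etc/modprobe.d (the standard location for module options). Files land under a staging directory so they can be reviewed before being copied into /etc.

```shell
# Hypothetical helper: write both client config files under a staging
# directory so they can be reviewed before copying into /etc.
# $1 = staging directory
# $2 = network interface on the lustre network
# $3 = kernel directory on the repository server
write_lustre_client_config() {
    stage=$1 iface=$2 kdir=$3
    mkdir -p "$stage/etc/modprobe.d" "$stage/etc/yum.repos.d"

    # LNET module options: use TCP over the given interface.
    printf 'options lnet networks=tcp(%s)\n' "$iface" \
        > "$stage/etc/modprobe.d/lustre-lnet.conf"

    # \$basearch is escaped so that yum, not the shell, expands it.
    # enabled=0 keeps the repository off unless --enablerepo is given.
    cat > "$stage/etc/yum.repos.d/hepnet-lustre.repo" <<EOF
[hepnet-lustre]
name=HEPnet Canada SC12 repository
baseurl=http://rpms.hepnetcanada.ca/sl/6X/lustre/\$basearch/$kdir
enabled=0
gpgcheck=0
http_caching=none
EOF
}

# Usage: write_lustre_client_config /tmp/lustre-stage ethmlx 2.6.32-279.5.1
```

Because the repository is written with enabled=0, it only takes part in a yum transaction when selected explicitly with --enablerepo=hepnet-lustre, as in the install step above.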

Revision 7 - 2012-11-06 - igable

Line: 121 to 121
  TotalNetworkBytes: 858179094400 Exit Status: OK %ENDCONSOLE%
Added:
>
>
Trying all 15 disks gives the same result:

%STARTCONSOLE%
06/11 15:25:29 Net In: 0.000 b/s Avg: 21.438 Gb/s 100.00% ( 00s )

FDTWriterSession ( afabec72-9117-404a-8254-bb18856a2558 ) final stats:
Started: Tue Nov 06 15:15:27 PST 2012
Ended: Tue Nov 06 15:25:30 PST 2012
Transfer period: 10m 03s
TotalBytes: 1609085802000
TotalNetworkBytes: 1609085802000
Exit Status: OK
%ENDCONSOLE%

 \ No newline at end of file

Revision 6 - 2012-11-06 - igable

Line: 101 to 101
 %STARTCONSOLE% [root@sc03 sbin]# MegaCli -LDInfo -Lall -aALL %ENDCONSOLE% \ No newline at end of file
Added:
>
>

Moving to LSI firmware from IBM

The LSI firmware gave some minor stability improvement, though it is hard to quantify.

Complete output at: https://gist.github.com/2d9da810da246e0e161b
%STARTCONSOLE%
[root@sc04 ~]# fdtClient -P 4 -c 10.20.3.101 -fl filelist-8-15.txt -d /
06/11 11:20:54 Net Out: 23.864 Gb/s Avg: 21.642 Gb/s 98.84% ( 03s )
06/11 11:20:59 Net Out: 15.864 Gb/s Avg: 21.550 Gb/s 100.00% ( 00s )

FDTReaderSession ( e66cfbb2-a692-4e91-a0a6-fa21abfa3282 ) final stats:
Started: Tue Nov 06 11:15:39 PST 2012
Ended: Tue Nov 06 11:21:03 PST 2012
Transfer period: 05m 24s
TotalBytes: 858179094400
TotalNetworkBytes: 858179094400
Exit Status: OK
%ENDCONSOLE%
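
As a cross-check on the FDT "Avg" figures, the average rate can be recomputed from the final stats as TotalBytes * 8 / transfer period. The sketch below assumes Gb/s means 10^9 bits per second (FDT's own unit convention is not confirmed here), and the function name `fdt_avg_gbps` is hypothetical.

```shell
# Hypothetical helper: average rate in Gb/s (taken as 10^9 bits per
# second) for a transfer of $1 bytes lasting $2 minutes and $3 seconds.
fdt_avg_gbps() {
    awk -v bytes="$1" -v min="$2" -v sec="$3" \
        'BEGIN { printf "%.2f\n", bytes * 8 / (min * 60 + sec) / 1e9 }'
}

fdt_avg_gbps 858179094400 5 24     # FDTReaderSession, 05m 24s
fdt_avg_gbps 1609085802000 10 3    # FDTWriterSession, 10m 03s
```

Both results come out slightly below the averages FDT reports (21.550 and 21.438 Gb/s). The cause is not established here. One possibility is that the transfer period includes session setup and teardown time.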

Revision 5 - 2012-11-05 - igable

Line: 20 to 20
 
    • ./ibm_fw_sraidmr_m5100-23.7.0-0037_linux_32-64.bin -s

  • Install and run MegaCli:
Changed:
<
<
>
>
%STARTCONSOLE%
  mkdir -p /tmp/ibm_megacli
  cd /tmp/ibm_megacli
  wget http://vmrepo.heprc.uvic.ca/ibm_utl_sraidmr_megacli-8.04.07_linux_32-64.zip
Line: 28 to 28
  yum localinstall Lib_Utils-1.00-09.noarch.rpm MegaCli-8.04.07-1.noarch.rpm
  cd /usr/local/sbin
  ln -s /opt/MegaRAID/MegaCli/MegaCli64 MegaCli
Changed:
<
<
>
>
%ENDCONSOLE%
 
  • Try the MegaCli command:
Changed:
<
<
>
>
%STARTCONSOLE%
  MegaCli -AdpAllInfo -aALL | less
Changed:
<
<
>
>
%ENDCONSOLE%
 
Line: 42 to 43
 The SC2012 servers (IBM x3650 M4) contain two HDD enclosures, each with eight drives. The first physical drive will be used to host the operating system and utilities. The remaining drives in each enclosure will be defined as two RAID 0 arrays: one with seven drives, and the other with eight.

  • List all virtual drives:
Changed:
<
<
>
>
%STARTCONSOLE%
 [root@sc03 sbin]# MegaCli -LDInfo -Lall -aALL
Changed:
<
<

Adapter 0 -- Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Name                :root
RAID Level          : Primary-0, Secondary-0, RAID Level Qualifier-0
Size                : 237.486 GB
Parity Size         : 0
State               : Optimal
Strip Size          : 128 KB
Number Of Drives    : 1
Span Depth          : 1
Default Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteThrough, ReadAheadNone, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy   : Enabled
Encryption Type     : None
PI type: No PI

Is VD Cached: No

Virtual Drive: 1 (Target Id: 1)
Name                :
RAID Level          : Primary-0, Secondary-0, RAID Level Qualifier-0
Size                : 474.972 GB
Parity Size         : 0
State               : Optimal
Strip Size          : 1.0 MB
Number Of Drives    : 2
Span Depth          : 1
Default Cache Policy: WriteThrough, ReadAdaptive, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteThrough, ReadAheadNone, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy   : Enabled
Encryption Type     : None
PI type: No PI

Is VD Cached: No

Adapter 1 -- Virtual Drive Information:
Adapter 1: No Virtual Drive Configured.

Exit Code: 0x00
[root@sc03 sbin]#

>
>
%ENDCONSOLE%
 
  • Remove redundant virtual drive:
Changed:
<
<
>
>
%STARTCONSOLE%
 [root@sc03 sbin]# MegaCli -CfgLdDel -L1 -a0

Adapter 0: Deleted Virtual Drive-1(target id-1)

Exit Code: 0x00
[root@sc03 sbin]#

Changed:
<
<
>
>
%ENDCONSOLE%
 
  • Determine enclosure IDs:
Changed:
<
<
>
>
%STARTCONSOLE%
 [root@sc03 sbin]# MegaCli -EncInfo -aALL | awk '/Enclosure|Device ID/'
 Enclosure 0:
 Device ID : 252
Line: 119 to 71
  Enclosure Serial Number : N/A
  Enclosure Zoning Mode : N/A
  [root@sc03 sbin]#
Changed:
<
<
>
>
%ENDCONSOLE%
 
  • Determine enclosure IDs and used slots:
Changed:
<
<
>
>
%STARTCONSOLE%
 [root@sc03 sbin]# MegaCli -CfgDsply -aALL |awk '/GROUP|Enclosure|Slot/'
 Number of DISK GROUPS: 1
 DISK GROUP: 0
Line: 131 to 83
 Enclosure position: N/A
 Number of DISK GROUPS: 0
 [root@sc03 sbin]#
Changed:
<
<
>
>
%ENDCONSOLE%
 
  • Define new RAID 0 virtual device on the seven free drives:
Changed:
<
<
>
>
%STARTCONSOLE%
 [root@sc03 sbin]# MegaCli -CfgLdAdd -r0 [252:1,252:2,252:3,252:4,252:5,252:6,252:7] RA -strpsz1024 -a0

Adapter 0: Created VD 1

Line: 143 to 95
  Exit Code: 0x00
  [root@sc03 sbin]#
Changed:
<
<
>
>
%ENDCONSOLE%
 
  • Display virtual devices:
Changed:
<
<
>
>
%STARTCONSOLE%
 [root@sc03 sbin]# MegaCli -LDInfo -Lall -aALL
Changed:
<
<

Adapter 0 -- Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Name                :root
RAID Level          : Primary-0, Secondary-0, RAID Level Qualifier-0
Size                : 237.486 GB
Parity Size         : 0
State               : Optimal
Strip Size          : 1.0 MB
Number Of Drives    : 1
Span Depth          : 1
Default Cache Policy: WriteBack, ReadAhead, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteThrough, ReadAheadNone, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy   : Enabled
Encryption Type     : None
PI type: No PI

Is VD Cached: No

Virtual Drive: 1 (Target Id: 1)
Name                :
RAID Level          : Primary-0, Secondary-0, RAID Level Qualifier-0
Size                : 1.623 TB
Parity Size         : 0
State               : Optimal
Strip Size          : 128 KB
Number Of Drives    : 7
Span Depth          : 1
Default Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteThrough, ReadAheadNone, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy   : Enabled
Encryption Type     : None
PI type: No PI

Is VD Cached: No

Adapter 1 -- Virtual Drive Information:
Adapter 1: No Virtual Drive Configured.

Exit Code: 0x00
[root@sc03 sbin]#

>
>
%ENDCONSOLE%
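
The full -LDInfo listing is long. When only the size, strip size and drive count of each virtual drive matter, the output can be condensed with awk. A minimal sketch, assuming the field labels shown in the listings above, with the hypothetical function name `summarize_vds`. It reads MegaCli output on standard input and does not call MegaCli itself.

```shell
# Hypothetical helper: condense `MegaCli -LDInfo -Lall -aALL` output
# (read from stdin) to one line per virtual drive.
# Patterns are anchored at line start so "Parity Size" is not mistaken
# for "Size".
summarize_vds() {
    awk '
        /^Virtual Drive:/        { vd = $3 }
        /^Size[ ]*:/             { sub(/^[^:]*:[ ]*/, ""); size = $0 }
        /^Strip Size[ ]*:/       { sub(/^[^:]*:[ ]*/, ""); strip = $0 }
        /^Number Of Drives[ ]*:/ {
            sub(/^[^:]*:[ ]*/, "")
            printf "VD %s: size=%s strip=%s drives=%s\n", vd, size, strip, $0
        }
    '
}

# Usage: MegaCli -LDInfo -Lall -aALL | summarize_vds
```

The summary line is printed when "Number Of Drives" is seen, which relies on that field following the two size fields, as it does in the listings above.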
 \ No newline at end of file

Revision 4 - 2012-10-25 - crlb

Line: 35 to 35
  MegaCli -AdpAllInfo -aALL | less
Added:
>
>
  • Also, the command "MegaCli -h | less" seems to list all the options, but don't assume any implied syntax; there is no consistent description of any of the options.
 

Reconfiguring RAID 0 Virtual Drives

Line: 134 to 135
 
  • Define new RAID 0 virtual device on the seven free drives:
Changed:
<
<
[root@sc03 sbin]# MegaCli -CfgLdAdd -r0 [252:1,252:2,252:3,252:4,252:5,252:6,252:7] -a0
>
>
[root@sc03 sbin]# MegaCli -CfgLdAdd -r0 [252:1,252:2,252:3,252:4,252:5,252:6,252:7] RA -strpsz1024 -a0
  Adapter 0: Created VD 1
Line: 156 to 157
 Size : 237.486 GB
 Parity Size : 0
 State : Optimal
Changed:
<
<
Strip Size : 128 KB
>
>
Strip Size : 1.0 MB
 Number Of Drives : 1
 Span Depth : 1
Changed:
<
<
Default Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU
>
>
Default Cache Policy: WriteBack, ReadAhead, Direct, No Write Cache if Bad BBU
 Current Cache Policy: WriteThrough, ReadAheadNone, Direct, No Write Cache if Bad BBU
 Default Access Policy: Read/Write
 Current Access Policy: Read/Write

Revision 3 - 2012-10-24 - igable

Line: 1 to 1
 
META TOPICPARENT name="ColinLeavettBrown"
Added:
>
>

Supercomputing 2012 Documentation.

 
Added:
>
>
 
Added:
>
>

MegaRaid Upgrade

  • The following IBM provided software was saved on our repository for easy distribution:
    • ibm_fw_sraidmr_m5100-23.7.0-0037_linux_32-64.bin - M5100 series RAID firmware.
    • ibm_utl_sraidmr_megacli-8.04.07_linux_32-64.zip - MegaCli command
 
Deleted:
<
<
-- ColinLeavettBrown - 2012-10-24

Supercomputing 2012 Documentation.

 \ No newline at end of file
Added:
>
>
  • Install pre-reqs (/lib/ld-linux.so.2, libncurses.so.5, libstdc++.so.6):
    • yum install compat-glibc.x86_64 ncurses-libs-5.7-3.20090208.el6.i686 libstdc++-4.4.6-4.el6.i686

  • Install and run MegaCli:
      mkdir -p /tmp/ibm_megacli
      cd /tmp/ibm_megacli
      wget http://vmrepo.heprc.uvic.ca/ibm_utl_sraidmr_megacli-8.04.07_linux_32-64.zip
      unzip ibm_utl_sraidmr_megacli-8.04.07_linux_32-64.zip
      yum localinstall Lib_Utils-1.00-09.noarch.rpm MegaCli-8.04.07-1.noarch.rpm
      cd /usr/local/sbin
      ln -s /opt/MegaRAID/MegaCli/MegaCli64 MegaCli

  • Try the MegaCli command:
      MegaCli -AdpAllInfo -aALL | less

Reconfiguring RAID 0 Virtual Drives

The SC2012 servers (IBM x3650 M4) contain two HDD enclosures, each with eight drives. The first physical drive will be used to host the operating system and utilities. The remaining drives in each enclosure will be defined as two RAID 0 arrays: one with seven drives, and the other with eight.

  • List all virtual drives:
[root@sc03 sbin]# MegaCli -LDInfo -Lall -aALL
                                     

Adapter 0 -- Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Name                :root
RAID Level          : Primary-0, Secondary-0, RAID Level Qualifier-0
Size                : 237.486 GB
Parity Size         : 0
State               : Optimal
Strip Size          : 128 KB
Number Of Drives    : 1
Span Depth          : 1
Default Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteThrough, ReadAheadNone, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy   : Enabled
Encryption Type     : None
PI type: No PI

Is VD Cached: No


Virtual Drive: 1 (Target Id: 1)
Name                :
RAID Level          : Primary-0, Secondary-0, RAID Level Qualifier-0
Size                : 474.972 GB
Parity Size         : 0
State               : Optimal
Strip Size          : 1.0 MB
Number Of Drives    : 2
Span Depth          : 1
Default Cache Policy: WriteThrough, ReadAdaptive, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteThrough, ReadAheadNone, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy   : Enabled
Encryption Type     : None
PI type: No PI

Is VD Cached: No



Adapter 1 -- Virtual Drive Information:
Adapter 1: No Virtual Drive Configured.

Exit Code: 0x00
[root@sc03 sbin]#

  • Remove redundant virtual drive:
[root@sc03 sbin]# MegaCli -CfgLdDel -L1 -a0 
                                     
Adapter 0: Deleted Virtual Drive-1(target id-1)

Exit Code: 0x00
[root@sc03 sbin]#

  • Determine enclosure IDs:
[root@sc03 sbin]# MegaCli -EncInfo -aALL | awk '/Enclosure|Device ID/'
    Enclosure 0:
    Device ID                     : 252
    Enclosure type                : SGPIO
    Enclosure Serial Number       : N/A 
    Enclosure Zoning Mode         : N/A 
    Enclosure 0:
    Device ID                     : 62
    Enclosure type                : SGPIO
    Enclosure Serial Number       : N/A 
    Enclosure Zoning Mode         : N/A 
[root@sc03 sbin]# 

  • Determine enclosure IDs and used slots:
[root@sc03 sbin]# MegaCli -CfgDsply -aALL  |awk '/GROUP|Enclosure|Slot/'
Number of DISK GROUPS: 1
DISK GROUP: 0
Enclosure Device ID: 252
Slot Number: 0
Enclosure position: N/A
Number of DISK GROUPS: 0
[root@sc03 sbin]#

  • Define new RAID 0 virtual device on the seven free drives:
[root@sc03 sbin]# MegaCli -CfgLdAdd -r0 [252:1,252:2,252:3,252:4,252:5,252:6,252:7] -a0
                                     
Adapter 0: Created VD 1

Adapter 0: Configured the Adapter!!

Exit Code: 0x00
[root@sc03 sbin]#

  • Display virtual devices:
[root@sc03 sbin]# MegaCli -LDInfo -Lall -aALL
                                     

Adapter 0 -- Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Name                :root
RAID Level          : Primary-0, Secondary-0, RAID Level Qualifier-0
Size                : 237.486 GB
Parity Size         : 0
State               : Optimal
Strip Size          : 128 KB
Number Of Drives    : 1
Span Depth          : 1
Default Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteThrough, ReadAheadNone, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy   : Enabled
Encryption Type     : None
PI type: No PI

Is VD Cached: No


Virtual Drive: 1 (Target Id: 1)
Name                :
RAID Level          : Primary-0, Secondary-0, RAID Level Qualifier-0
Size                : 1.623 TB
Parity Size         : 0
State               : Optimal
Strip Size          : 128 KB
Number Of Drives    : 7
Span Depth          : 1
Default Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteThrough, ReadAheadNone, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy   : Enabled
Encryption Type     : None
PI type: No PI

Is VD Cached: No



Adapter 1 -- Virtual Drive Information:
Adapter 1: No Virtual Drive Configured.

Exit Code: 0x00
[root@sc03 sbin]#
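
The physical drive list passed to -CfgLdAdd is a bracketed, comma-separated list of enclosure:slot pairs, which is tedious to type for seven or eight drives. A minimal sketch of generating it, with the hypothetical function name `pd_list`. The enclosure ID and slot range are the values found in the steps above and would differ on other hardware.

```shell
# Hypothetical helper: print a MegaCli physical drive list for slots
# $2..$3 of enclosure $1, in the [E:S,E:S,...] form used above.
pd_list() {
    enc=$1 slot=$2 last=$3 list=
    while [ "$slot" -le "$last" ]; do
        # Prepend a comma separator only once the list is non-empty.
        list="${list:+$list,}$enc:$slot"
        slot=$((slot + 1))
    done
    printf '[%s]\n' "$list"
}

pd_list 252 1 7
```

The result can be substituted into the command shown above, for example MegaCli -CfgLdAdd -r0 "$(pd_list 252 1 7)" -a0. The quotes keep the shell from treating the brackets as a glob pattern.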

Revision 2 - 2012-10-24 - crlb

Line: 1 to 1
 
META TOPICPARENT name="ColinLeavettBrown"
Line: 7 to 6
 -- ColinLeavettBrown - 2012-10-24

Supercomputing 2012 Documentation.

\ No newline at end of file
Added:
>
>
 \ No newline at end of file

Revision 1 - 2012-10-24 - crlb

Line: 1 to 1
Added:
>
>
META TOPICPARENT name="ColinLeavettBrown"

-- ColinLeavettBrown - 2012-10-24

Supercomputing 2012 Documentation.

 