-- ColinLeavettBrown - 2010-07-20

Lustre Procedures

Content

  1. Determine which OSS is serving an OST.
  2. Temporarily deactivate an OST.
  3. Re-activating an OST.
  4. Determining which files have objects on a particular OST.
  5. Restoring a corrupted PV after a motherboard replacement.

Proc 1: Determine which OSS is serving an OST.

On the MGS/MDT server:

[crlb@elephant01 ~]$ lctl get_param osc.*.ost_conn_uuid
osc.lustre-OST0000-osc-ffff8803bd5ab400.ost_conn_uuid=206.12.154.2@tcp
osc.lustre-OST0000-osc.ost_conn_uuid=206.12.154.2@tcp
osc.lustre-OST0001-osc-ffff8803bd5ab400.ost_conn_uuid=206.12.154.3@tcp
osc.lustre-OST0001-osc.ost_conn_uuid=206.12.154.3@tcp
osc.lustre-OST0002-osc-ffff8803bd5ab400.ost_conn_uuid=206.12.154.4@tcp
osc.lustre-OST0002-osc.ost_conn_uuid=206.12.154.4@tcp
osc.lustre-OST0003-osc-ffff8803bd5ab400.ost_conn_uuid=206.12.154.5@tcp
osc.lustre-OST0003-osc.ost_conn_uuid=206.12.154.5@tcp
[crlb@elephant01 ~]$

The IP address in each line identifies the OSS node serving that OST; for example, OST0003 is served by 206.12.154.5.
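The listing above shows each OST twice, once per OSC device. As a minimal sketch, the output can be reduced to one line per OST with a small filter. The function name ost_to_oss is hypothetical and not part of Lustre; it only assumes the parameter naming shown in the listing above.

```shell
# Hypothetical helper: reads "lctl get_param osc.*.ost_conn_uuid" output on
# stdin and prints one "OSTname address" pair per OST, without duplicates.
ost_to_oss() {
    # Keep the OSTnnnn part of the parameter name and the value after "=".
    sed -n 's/^osc\.[^.]*-\(OST[0-9a-fA-F]*\)-osc[^.]*\.ost_conn_uuid=\(.*\)$/\1 \2/p' | sort -u
}
```

It would be used as "lctl get_param osc.*.ost_conn_uuid | ost_to_oss".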

Proc 2: Temporarily deactivate an OST.

On the MGS/MDT server:

  • Determine the device number for the MDT's OSC corresponding to the OST to be deactivated (a device is identified by its endpoints, e.g. lustre-OSTnnnn-osc and lustre-mdtlov_UUID):

[crlb@elephant01 ~]$ lctl dl | grep osc
  5 UP osc lustre-OST0000-osc lustre-mdtlov_UUID 5
  6 UP osc lustre-OST0001-osc lustre-mdtlov_UUID 5
  7 UP osc lustre-OST0002-osc lustre-mdtlov_UUID 5
  8 UP osc lustre-OST0003-osc lustre-mdtlov_UUID 5
 11 UP osc lustre-OST0000-osc-ffff8803bd5ab400 a91b4601-8f1d-5061-2175-7ac02693cc0f 5
 12 UP osc lustre-OST0001-osc-ffff8803bd5ab400 a91b4601-8f1d-5061-2175-7ac02693cc0f 5
 13 UP osc lustre-OST0002-osc-ffff8803bd5ab400 a91b4601-8f1d-5061-2175-7ac02693cc0f 5
 14 UP osc lustre-OST0003-osc-ffff8803bd5ab400 a91b4601-8f1d-5061-2175-7ac02693cc0f 5
[crlb@elephant01 ~]$ 

  • To deactivate OST0003 (device 8 in the list above), issue:

[crlb@elephant01 ~]$ sudo lctl --device 8 deactivate
[sudo] password for crlb: 
[crlb@elephant01 ~]$

  • The "lctl dl | grep osc" command can be used to check the change in status; a deactivated device is listed as IN instead of UP.
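Picking the right device number by eye is error-prone, because each OST appears twice in the listing. The following sketch selects only the MDT-side OSC, i.e. the device whose name ends in "-osc". The function name mdt_osc_device is hypothetical, and the filter assumes the device naming shown in the listing above; other Lustre versions may name these devices differently.

```shell
# Hypothetical helper: reads "lctl dl" output on stdin and prints the device
# number and state of the MDT-side OSC for the OST given as $1 (e.g. OST0003).
mdt_osc_device() {
    # Columns: device number, state, type, name, UUID, refcount.
    # Client-side OSCs carry a "-ffff..." suffix and are skipped by the "$" anchor.
    awk -v ost="$1" '$3 == "osc" && $4 ~ ("-" ost "-osc$") { print $1, $2 }'
}
```

It would be used as "lctl dl | mdt_osc_device OST0003"; the first field of the result is the number to pass to "lctl --device".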

Proc 3: Re-activating an OST.

On the MGS/MDT server:

  • Determine the device number for the MDT's OSC corresponding to the OST to be re-activated (a device is identified by its endpoints, e.g. lustre-OSTnnnn-osc and lustre-mdtlov_UUID):

[crlb@elephant01 ~]$ lctl dl | grep osc
  5 UP osc lustre-OST0000-osc lustre-mdtlov_UUID 5
  6 UP osc lustre-OST0001-osc lustre-mdtlov_UUID 5
  7 UP osc lustre-OST0002-osc lustre-mdtlov_UUID 5
  8 IN osc lustre-OST0003-osc lustre-mdtlov_UUID 5
 11 UP osc lustre-OST0000-osc-ffff8803bd5ab400 a91b4601-8f1d-5061-2175-7ac02693cc0f 5
 12 UP osc lustre-OST0001-osc-ffff8803bd5ab400 a91b4601-8f1d-5061-2175-7ac02693cc0f 5
 13 UP osc lustre-OST0002-osc-ffff8803bd5ab400 a91b4601-8f1d-5061-2175-7ac02693cc0f 5
 14 UP osc lustre-OST0003-osc-ffff8803bd5ab400 a91b4601-8f1d-5061-2175-7ac02693cc0f 5
[crlb@elephant01 ~]$ 

  • To re-activate OST0003 (device 8 in the list above), issue:

[crlb@elephant01 ~]$ sudo lctl --device 8 activate
[sudo] password for crlb: 
[crlb@elephant01 ~]$

  • The "lctl dl | grep osc" command can be used to check the change in status; a re-activated device is listed as UP again.
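To see at a glance whether any OSC is still deactivated, the same listing can be filtered on its state column. This is only a sketch; the function name inactive_oscs is hypothetical, and it assumes the column layout shown in the listing above.

```shell
# Hypothetical helper: reads "lctl dl" output on stdin and prints the device
# number, state and name of every osc device that is not in the UP state.
inactive_oscs() {
    awk '$3 == "osc" && $2 != "UP" { print $1, $2, $4 }'
}
```

It would be used as "lctl dl | inactive_oscs"; no output means every OSC is UP.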

Proc 4: Determining which files have objects on a particular OST.

This procedure can be performed on any Lustre node:

  • Determine the UUID for the OST of interest:

[crlb@elephant01 ~]$ lfs df
UUID                 1K-blocks      Used Available  Use% Mounted on
lustre-MDT0000_UUID   91743520    496624  86004016    0% /lustreFS[MDT:0]
lustre-OST0000_UUID  928910792 717422828 164301980   77% /lustreFS[OST:0]
lustre-OST0001_UUID  928910792 720414360 161310444   77% /lustreFS[OST:1]
lustre-OST0002_UUID  928910792 730323340 151401464   78% /lustreFS[OST:2]
lustre-OST0003_UUID  928910792 348690392 533034416   37% /lustreFS[OST:3]

filesystem summary:  3715643168 2516850920 1010048304   67% /lustreFS

[crlb@elephant01 ~]$ 

  • To list the files with objects on OST0003:

[crlb@elephant01 ~]$ lfs find --obd lustre-OST0003_UUID /lustreFS/
   .
   .
/lustreFS//BaBar/work/allruns_backup/17320691/A26.0.0V01x57F/config.tcl
/lustreFS//BaBar/work/allruns_backup/17320691/A26.0.0V01x57F/17320691.moose.01.root
/lustreFS//BaBar/work/allruns_backup/17320691/status.txt
/lustreFS//BaBar/work/allruns_backup/17320697/A26.0.0V01x57F/B+B-_generic.dec
   .
   .
[crlb@elephant01 ~]$ 

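  • Once no files have objects on an OST (OST0004 in this example), it can be permanently deactivated with "lctl conf_param", which is run on the MGS:

[root@elephant bin]# lfs find --obd lustre-OST0004_UUID /lustreFS/  | wc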
      0       0       0
[root@elephant bin]# lctl conf_param lustre-OST0004.osc.active=0
[root@elephant bin]#
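The UUID lookup in the first step can be scripted so that it feeds directly into "lfs find". A minimal sketch follows; the function name ost_uuid is hypothetical, and it assumes the "lfs df" column layout shown above, where the last column ends in "[OST:n]".

```shell
# Hypothetical helper: reads "lfs df" output on stdin and prints the UUID of
# the OST whose index is given as $1 (the n in "[OST:n]").
ost_uuid() {
    # index() does a plain substring match, so "[OST:1]" cannot match "[OST:11]".
    awk -v idx="$1" 'index($NF, "[OST:" idx "]") > 0 { print $1 }'
}
```

It would be used as "lfs df | ost_uuid 3", and the result passed to "lfs find --obd".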

Proc 5: Restoring a corrupted PV after a motherboard replacement.

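In this example the PV on /dev/sdb of e3 is restored using the first sector of /dev/sdb on e4. This assumes that /dev/sdb on e4 has the same partition layout as on e3.

  • On e4, save the first 512-byte sector of /dev/sdb (the MBR, including the partition table):

dd if=/dev/sdb of=/tmp/boot.txt bs=512 count=1

  • On e3, copy the saved sector over, write it to /dev/sdb and check the resulting partition table:

scp crlb@e4:/tmp/boot.txt boot-e4.txt
dd if=boot-e4.txt of=/dev/sdb bs=512 count=1
fdisk -l /dev/sdb

  • Still on e3, re-create the PV with its original UUID (taken from /etc/lvm/backup/vg00), then restore and activate the volume group:

pvcreate --restorefile /etc/lvm/backup/vg00 --uuid f26AZq-ycTI-7QKf-3yn9-3VCe-w1V3-dOaKlk /dev/sdb
vgcfgrestore vg00
pvscan
vgchange -ay vg00

  • Remount the filesystems and the OSTs, and check the result:

mount -a
mountOSTs
df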
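
The UUID passed to pvcreate has to be read out of the LVM backup file. The sketch below shows one way to do that, assuming the usual LVM2 text metadata layout in which each PV stanza contains an "id = ..." line followed by a "device = ..." hint line. The function name pv_uuid_for_device is hypothetical, and the device entry is only a hint recorded by LVM, so the result should be checked against the file by eye before use.

```shell
# Hypothetical helper: reads an LVM metadata backup (e.g. /etc/lvm/backup/vg00)
# on stdin and prints the id of the PV whose device hint equals $1.
pv_uuid_for_device() {
    awk -v dev="$1" '
        # Remember the most recent id; the PV id precedes its device line.
        $1 == "id" && $2 == "="     { gsub(/"/, "", $3); id = $3 }
        # On a matching device hint, print the id seen just before it.
        $1 == "device" && $2 == "=" { gsub(/"/, "", $3); if ($3 == dev) print id }
    '
}
```

It would be used as "pv_uuid_for_device /dev/sdb < /etc/lvm/backup/vg00".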
Topic revision: r23 - 2011-01-07 - crlb
 