BuildingATLASCernVM27

A step-by-step walkthrough of how to build the ATLAS cloud image based on CernVM 2.7. The image targets the KVM hypervisor only, for either Nimbus or OpenStack clouds.

Links

Building a KVM CernVM 2.7.2 for ATLAS production

In the remainder of this document, the following formatting convention is used to differentiate terminal commands from file content:

This background colour denotes terminal input

This background colour denotes file content

Set up the batch node image

Get the KVM and Xen batch node images from the CernVM downloads page:

mkdir -p atlas-VMs/{kvm,xen}
cd atlas-VMs
wget --no-check-certificate http://cernvm.cern.ch/releases/25/cernvm-batch-node-2.7.2-1-2-x86_64.ext3.gz
wget --no-check-certificate http://cernvm.cern.ch/releases/25/cernvm-batch-node-2.7.2-1-2-x86_64.hdd.gz

Unzip the images

gunzip *gz

Dual Hypervisor

Make a dual-hypervisor image by copying the Xen kernel, modules, and grub configuration into the KVM image:
sudo kpartx -av cernvm-batch-node-2.7.2-1-2-x86_64.hdd
sudo mount /dev/mapper/loop0p1 kvm/
sudo mount -o loop cernvm-batch-node-2.7.2-1-2-x86_64.ext3 xen/
sudo cp kvm/boot/grub/grub.conf kvm/boot/grub/grub.conf-kvm
sudo cp xen/boot/grub/grub.conf kvm/boot/grub/grub.conf-xen
sudo find xen/boot/ -maxdepth 1 -type f -exec cp {} kvm/boot/ \;
sudo rsync -a xen/lib/modules/* kvm/lib/modules/
sudo mkdir kvm/root/.ssh
sudo chmod 700 kvm/root/.ssh
sudo cp ~/.ssh/authorized_keys kvm/root/.ssh/
sudo ls -l kvm/root/.ssh/authorized_keys
sudo umount kvm
sudo umount xen
sudo kpartx -d cernvm-batch-node-2.7.2-1-2-x86_64.hdd

Clean up after yourself

rm cernvm-batch-node-2.7.2-1-2-x86_64.ext3
rmdir kvm xen

Increase image size

Resize the image
qemu-img resize cernvm-batch-node-2.7.2-1-2-x86_64.hdd +11G

Change the image partition table:

sudo kpartx -av cernvm-batch-node-2.7.2-1-2-x86_64.hdd
sudo fdisk /dev/loop0
sudo kpartx -d cernvm-batch-node-2.7.2-1-2-x86_64.hdd

The fdisk session went through this key sequence:

p
d
n
p
1
(enter to accept default 1 as first)
(enter to accept default 327680 as last)
p
w

The last step is to grow the root file system once the VM is booted; see below.

Create virsh domain XML

Use the XML of an existing domain as a starting point:

sudo virsh dumpxml sl58TestCVMFSserver > cernvm-batch-node-2.7.2-x86_64.xml

Edit the VM name, image location, UUID, and MAC address. After editing, the file content looks like this:

<domain type='kvm'>
  <name>cernvm-2.7.2</name>
  <uuid>6a27b6e5-1f00-4baf-a1e8-44fd00023131</uuid>
  <memory unit='GiB'>2</memory>
  <currentMemory unit='GiB'>2</currentMemory>
  <vcpu placement='static'>1</vcpu>
  <os>
    <type arch='x86_64'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source file='/opt/qemu/cernvm-batch-node-2.7.2-x86_64' />
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </disk>
    <interface type='bridge'>
      <mac address='fa:16:3e:7a:a3:a2'/>
      <source bridge='br0'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <serial type='pty'>
      <target port='0'/>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
    <input type='mouse' bus='ps2'/>
    <graphics type='vnc' port='-1' autoport='yes'/>
    <video>
      <model type='cirrus' vram='9216' heads='1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </memballoon>
    
  </devices>
</domain>

Run the Image

sudo virsh create cernvm-batch-node-2.7.2-x86_64.xml --console
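If the console does not attach, check that the guest is running and re-attach to its serial console (press Ctrl+] to detach); the domain name below is the one set in the XML above:

sudo virsh list --all
sudo virsh console cernvm-2.7.2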

Grow the root partition

Expand the root file system to fill the resized partition:
resize2fs /dev/vda1
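The enlarged root file system can be confirmed with:

df -h /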

Disable SELinux

Ensure that /etc/selinux/config exists and contains:

# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#     enforcing - SELinux security policy is enforced.
#     permissive - SELinux prints warnings instead of enforcing.
#     disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of these two values:
#     targeted - Targeted processes are protected,
#     mls - Multi Level Security protection.
SELINUXTYPE=targeted 
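If the libselinux utilities are present, getenforce should report Disabled after the next boot:

getenforce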

Set-up Context Helper

Pre-create the files and directories where the context helper places files:
mkdir -p /etc/grid-security/certificates
touch /etc/grid-security/hostkey.pem
chmod 600 /etc/grid-security/hostkey.pem
touch /var/lib/cloud_type
touch /etc/condor/central_manager

Install context helper and context service:

wget https://raw.github.com/hep-gc/cloud-scheduler/dev/scripts/ec2contexthelper/context \
          https://raw.github.com/hep-gc/cloud-scheduler/dev/scripts/ec2contexthelper/contexthelper \
          https://raw.github.com/hep-gc/cloud-scheduler/dev/scripts/ec2contexthelper/setup.py
python setup.py install
chmod +x /etc/init.d/context
chkconfig context on

Check that contexthelper is in your path.
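For example, assuming a standard PATH:

which contexthelper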

Install git

As the title says
conary install git

Install Shoal Client

git clone git@github.com:hep-gc/shoal.git
cd shoal/shoal-client/
python setup.py install
Optionally clean up after yourself:
cd
rm -rf shoal

Set-up Puppet

First bring facter up to date
wget http://downloads.puppetlabs.com/facter/facter-1.7.3.tar.gz
tar zxvf facter-1.7.3.tar.gz 
cd facter-1.7.3
ruby install.rb 
cd ../
rm -rf facter-1.7.3*

Install an up-to-date version of puppet:

wget http://downloads.puppetlabs.com/puppet/puppet-2.7.23.tar.gz
tar zxvf puppet-2.7.23.tar.gz 
cd puppet-2.7.23
ruby install.rb 
cd ../
rm -rf puppet-2.7.23*
puppet resource group puppet ensure=present
puppet resource user puppet ensure=present gid=puppet shell='/sbin/nologin'
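Confirm that the new versions are the ones picked up on the PATH (they should report 1.7.3 and 2.7.23 respectively):

facter --version
puppet --version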

Depending on whether we are using a masterful or a masterless set-up, either configure puppet to connect to the heprc puppet master by editing /etc/puppet/puppet.conf to contain

[agent]
    server = puppet-master.heprc.uvic.ca

or retrieve the modules to run locally

git clone https://github.com/MadMalcolm/atlasgce-modules.git modules
puppet module install thias/sysctl
and create the manifest, using my gist as an example:

cd /etc/puppet/manifests
curl https://gist.github.com/MadMalcolm/6735198/raw/466a9ba1fb1f957d1d7b812305a9dd650b588a64/csnode_config.pp > csnode.pp

and apply it in the rc.local:

#!/bin/sh
#
# This script will be executed *after* all the other init scripts.
# You can put your own initialization stuff in here if you don't
# want to do the full Sys V style init stuff.

touch /var/lock/subsys/local

# Make sure the context helper has time to finish
sleep 5

# Contextualize with puppet
/usr/bin/puppet apply /etc/puppet/manifests/csnode.pp --logdest /var/lib/puppet/log/csnode.log
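Before relying on rc.local, the manifest can be dry-run with puppet's no-op mode, which reports what would change without applying anything:

puppet apply --noop --verbose /etc/puppet/manifests/csnode.pp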

Disable condor and puppet on startup

Don't let condor start until after puppet (and contexthelper) have started:
chkconfig condor off
chkconfig puppet off

Old Setup - now performed by puppet

I'm leaving this here for reference for now ...

Create worker user accounts

groupadd -g 102 condor
useradd -g condor -u 102 -s /sbin/nologin condor
for i in `seq -w 1 32`; do id=$((500+${i#0})); groupadd -g $id slot$i; useradd -g slot$i -u $id -s /bin/bash slot$i; done

Optionally, add ssh keys to root for debugging

sudo vi /root/.ssh/authorized_keys

And insert the desired id_rsa.pub keys.

Network tuning

sudo vi /etc/sysctl.conf

Add the following to the end of the file:

# Network tuning: http://fasterdata.es.net/fasterdata/host-tuning/linux/
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216 
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.core.netdev_max_backlog = 30000
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_sack = 1
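These settings can be applied to the running system without a reboot:

sudo sysctl -p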

sudo vi /etc/rc.local

Add the following to the end of the file:

# increase txqueuelen for 10G NICs
/sbin/ifconfig eth0 txqueuelen 10000

Grid Environment

Create a grid configuration file:

sudo vi /etc/profile.d/grid-setup.sh

And add the following contents:

# Keep grid setup out of environment for root and sysadmin.
if [[ ! "$USER" =~ ^slot[0-9]+$ ]] ; then
  return 0
fi

export GLOBUS_FTP_CLIENT_GRIDFTP2=true
export VO_ATLAS_SW_DIR=/cvmfs/atlas.cern.ch/repo/sw

## Set up grid environment:
# Use EMI in AtlasLocalRootBase
export ATLAS_LOCAL_ROOT_BASE=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase
source $ATLAS_LOCAL_ROOT_BASE/user/atlasLocalSetup.sh --quiet
source ${ATLAS_LOCAL_ROOT_BASE}/packageSetups/atlasLocalEmiSetup.sh --quiet
export ALRB_useGridSW=emi
## Fix for using AtlasLocalRootBase with a kit
unset  AtlasSetupSite
rm ~/.asetup

# Site-specific variables (e.g. Frontier and Squid servers) are set based on ATLAS_SITE_NAME (from JDL).
# This auto-setup is only temporarily needed, and will soon become automatic 
. /cvmfs/atlas.cern.ch/repo/sw/local/bin/auto-setup

Create the grid-security and certificates directories, and pre-create the hostkey so that it will have the right ACL mode when the real key is contextualized in:

mkdir -p /etc/grid-security/certificates
touch /etc/grid-security/hostkey.pem
chmod 600 /etc/grid-security/hostkey.pem

CernVM FS Configuration

sudo vi /etc/cvmfs/default.local

Add the following to the end of the file:

CVMFS_REPOSITORIES=atlas.cern.ch,atlas-condb.cern.ch,grid.cern.ch
CVMFS_QUOTA_LIMIT=18000
CVMFS_CACHE_BASE=/scratch/cvmfs
CVMFS_HTTP_PROXY="http://chrysaor.westgrid.ca:3128;http://cernvm-webfs.atlas-canada.ca:3128;DIRECT"

Note: This could do with some generalization, or maybe this will look very different with shoal in play?

Turn off cvmfs in chkconfig, and configure puppet to create /scratch/cvmfs owned by cvmfs.cvmfs and then start cvmfs.
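Once cvmfs is running, the configuration can be sanity-checked with the standard helpers (exact output varies between cvmfs versions):

cvmfs_config chksetup
cvmfs_config probe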

One could create a cern.ch.local configuration file:

sudo vi /etc/cvmfs/domain.d/cern.ch.local

with the following content:

# For Europe:
#CVMFS_SERVER_URL="http://cvmfs-stratum-one.cern.ch:8000/opt/@org@;http://cernvmfs.gridpp.rl.ac.uk:8000/opt/@org@;http://cvmfs.racf.bnl.gov:8000/opt/@org@;http://cvmfs.fnal.gov:8000/opt/@org@;http://cvmfs02.grid.sinica.edu.tw:8000/opt/@org@"
# For North America:
CVMFS_SERVER_URL="http://cvmfs.racf.bnl.gov:8000/opt/@org@;http://cvmfs.fnal.gov:8000/opt/@org@;http://cvmfs-stratum-one.cern.ch:8000/opt/@org@;http://cernvmfs.gridpp.rl.ac.uk:8000/opt/@org@;http://cvmfs02.grid.sinica.edu.tw:8000/opt/@org@"
# For Australia:
#CVMFS_SERVER_URL="http://cvmfs.fnal.gov:8000/opt/@org@;http://cvmfs.racf.bnl.gov:8000/opt/@org@;http://cernvmfs.gridpp.rl.ac.uk:8000/opt/@org@;http://cvmfs-stratum-one.cern.ch:8000/opt/@org@;http://cvmfs02.grid.sinica.edu.tw:8000/opt/@org@"

But we'd want some cloud awareness to be baked into the VM by cloud_scheduler, to choose which CVMFS_SERVER_URL to use. Mike says he could easily do this via inserting a JSON file somewhere for our usage.

CernVM Configuration

sudo vi /etc/cernvm/site.conf

Add the following to the end of the file:

CERNVM_CVMFS2=on
CERNVM_EDITION=Basic
CERNVM_ORGANISATION=atlas
CERNVM_USER_SHELL=/bin/bash
CVMFS_REPOSITORIES=atlas,atlas-condb,grid

Blank Space Partition Configuration

sudo mkdir -p /scratch
sudo vi /etc/fstab

Add the following to fstab depending on whether the VM is for OpenStack or Nimbus.

OpenStack

LABEL=ephemeral0 /scratch ext4 noatime 1 0

Nimbus

LABEL=blankpartition0 /scratch ext2 noatime 1 0

Note: ext4 support exists in the e4fsprogs package in CernVM v2.6. We should try using ext4 with no journaling; the performance should be better. It is disabled like this: tune4fs -O ^has_journal /dev/sdb and you can verify whether the has_journal property is there using /sbin/dumpe4fs /dev/sdb . Or, just create it without journaling in the first place: mkfs.ext4 -O ^has_journal /dev/sdb . Maybe try the nobarrier and noatime options too, although they might not be as significant. However, in the case of Nimbus, some new development will be needed to use ext4 partitions.
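A sketch of that no-journal preparation, assuming the blank-space disk appears as /dev/sdb as in the note above, and that the label matches the fstab entry for your cloud:

mkfs.ext4 -O ^has_journal -L ephemeral0 /dev/sdb
/sbin/dumpe4fs -h /dev/sdb | grep -i journal
mount -o noatime /dev/sdb /scratch

If the journal was omitted successfully, the dumpe4fs output will not list the has_journal feature.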

HTCondor configuration

Replace the default HTCondor init.d script with the one from Cloud Scheduler github repository:

cd /etc/init.d
sudo mv condor /tmp
sudo wget https://raw.github.com/hep-gc/cloud-scheduler/master/scripts/condor/worker/condor --no-check-certificate
sudo chmod 755 condor
sudo vi condor

Make the following modification to the condor init.d script:

CONDOR_CONFIG_VAL=/opt/condor/bin/condor_config_val

Customize the HTCondor configuration by retrieving and modifying the sample local configuration from the Cloud Scheduler github repository:

cd /etc/condor
sudo wget https://raw.github.com/hep-gc/cloud-scheduler/dev/scripts/condor/worker/condor_config.local --no-check-certificate
sudo vi condor_config.local

Modify the following lines. Note - the LOCK and EXECUTE directories will be automatically created with the correct ownerships if they don't already exist.

LOCK=/var/lock/condor
LOG=/var/log/condor
RUN=/var/run/condor
SPOOL=/var/lib/condor/spool
EXECUTE=/scratch/condor-execute

And create the LOG, RUN, and SPOOL directories:

mkdir /var/log/condor
chown condor. /var/log/condor
mkdir /var/run/condor
chown condor. /var/run/condor
mkdir -p /var/lib/condor/spool
chown -R condor. /var/lib/condor
and add the following modifications to the bottom of condor_config.local:

## Point to the java executable.
JAVA = /usr/lib/jvm/jre-1.6.0-openjdk.x86_64/bin/java

## Job slot/user accounts:
STARTER_ALLOW_RUNAS_OWNER = False
DEDICATED_EXECUTE_ACCOUNT_REGEXP = slot[0-9]+
USER_JOB_WRAPPER=/opt/condor/libexec/jobwrapper.sh
SLOT1_USER = slot01
SLOT2_USER = slot02
SLOT3_USER = slot03
SLOT4_USER = slot04
SLOT5_USER = slot05
SLOT6_USER = slot06
SLOT7_USER = slot07
SLOT8_USER = slot08
SLOT9_USER = slot09
SLOT10_USER = slot10
SLOT11_USER = slot11
SLOT12_USER = slot12
SLOT13_USER = slot13
SLOT14_USER = slot14
SLOT15_USER = slot15
SLOT16_USER = slot16
SLOT17_USER = slot17
SLOT18_USER = slot18
SLOT19_USER = slot19
SLOT20_USER = slot20
SLOT21_USER = slot21
SLOT22_USER = slot22
SLOT23_USER = slot23
SLOT24_USER = slot24
SLOT25_USER = slot25
SLOT26_USER = slot26
SLOT27_USER = slot27
SLOT28_USER = slot28
SLOT29_USER = slot29
SLOT30_USER = slot30
SLOT31_USER = slot31
SLOT32_USER = slot32
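The 32 SLOTn_USER lines can also be generated with a short shell loop rather than typed out by hand:

for i in `seq 1 32`; do printf "SLOT%d_USER = slot%02d\n" $i $i; done >> /etc/condor/condor_config.local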

Create the batch job wrapper script:

sudo vi  /opt/condor/libexec/jobwrapper.sh

with the following:

#!/bin/bash -l
#
# Condor startd jobwrapper
# Executes using bash -l, so that all /etc/profile.d/*.sh scripts are sourced.
#
THISUSER=`/usr/bin/whoami`
export HOME=`getent passwd $THISUSER | awk -F : '{print $6}'`
exec "$@"

and make it executable:

sudo chmod 755  /opt/condor/libexec/jobwrapper.sh

Set time to UTC

Configure the system to use UTC time by editing the file:

sudo vi /etc/sysconfig/clock

and setting

UTC="true"

Note: this doesn't appear to be working; the date is given in e.g. PDT.
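One possible workaround (not verified here) is to point the system time zone at UTC directly:

ln -sf /usr/share/zoneinfo/UTC /etc/localtime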

Context Helper

On Nimbus, Cloud Scheduler can simply write files into the VM. On OpenStack, a service running on the VM must retrieve the information from the metadata server. Install the context helper as follows:

cd /tmp
wget https://raw.github.com/hep-gc/cloud-scheduler/master/scripts/ec2contexthelper/context
wget https://raw.github.com/hep-gc/cloud-scheduler/master/scripts/ec2contexthelper/contexthelper
wget https://raw.github.com/hep-gc/cloud-scheduler/master/scripts/ec2contexthelper/setup.py
python setup.py install
chmod +x /etc/init.d/context
chkconfig context on
cd

Using Puppet

When using Puppet to configure the VM, be sure to disable condor from starting automatically: chkconfig condor off. That way Condor will only start after being configured correctly.

Resizing the Image

First resize the raw image using the qemu image tools, loop-mount it, and edit the partition table:
sudo qemu-img resize /opt/qemu/cernvm-large-node-2.7.2-x86_64 +11G
sudo kpartx -av /opt/qemu/cernvm-large-node-2.7.2-x86_64
sudo fdisk /dev/loop1

Delete the old partition and make a new one spanning the full size; the sequence of key presses was:

p
d
n
p
1
(enter to accept default 1 as first)
(enter to accept default 327680 as last)
p
w

Delete the loop mount:

sudo kpartx -d /opt/qemu/cernvm-large-node-2.7.2-x86_64

Boot the virtual machine:

sudo virsh create ~/docs/cernvm-large-node-2.7.2-x86_64.xml --console

and (in the VM console) resize the file system in the VM:

resize2fs /dev/vda1

Save the VM image

To save the virtual machine image, shut down the running machine and copy the file:

sudo virsh shutdown "CernVM 2.7.1-x86_64"
cp /opt/qemu/cernvm-batch-node-2.7.1-2-3-x86_64.hdd .