Difference: BuildingATLASCernVM27 (1 vs. 25)

Revision 25 - 2014-02-06 - rptaylor

Line: 1 to 1
 
META TOPICPARENT name="BuildingDHAtlasVM"

BuildingATLASCernVM27

Step by step walk through on how to build the cloud image for ATLAS based on CernVM 2.7. This will be for a KVM hypervisor only for either Nimbus or OpenStack clouds.
Line: 348 to 348
 

Network tuning

Added:
>
>
These are generally recommended settings; see http://fasterdata.es.net/host-tuning/linux/.
 
sudo vi /etc/sysctl.conf
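After editing the file, the settings can be applied to the running system without a reboot:

sudo sysctl -p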

Revision 23 - 2014-01-20 - frank

Line: 1 to 1
 
META TOPICPARENT name="BuildingDHAtlasVM"

BuildingATLASCernVM27

Step by step walk through on how to build the cloud image for ATLAS based on CernVM 2.7. This will be for a KVM hypervisor only for either Nimbus or OpenStack clouds.

Added:
>
>

TODO

Setup the /etc/grid-security/certificates for CERN:

lrwxrwxrwx 1 root root   12 Jan 21 00:32 1d879c6c.0 -> CERN-TCA.pem
lrwxrwxrwx 1 root root   23 Jan 21 00:32 1d879c6c.signing_policy -> CERN-TCA.signing_policy
-rw-r--r-- 1 root root   83 Jan 21 00:24 CERN-Root-2.crl_url
-rw-r--r-- 1 root root  518 Jan 21 00:24 CERN-Root-2.info
-rw-r--r-- 1 root root  518 Jan 21 00:24 CERN-Root-2.namespaces
-rw-r--r-- 1 root root 2370 Jan 21 00:24 CERN-Root-2.pem
-rw-r--r-- 1 root root  304 Jan 21 00:24 CERN-Root-2.signing_policy
-rw-r--r-- 1 root root   46 Jan 21 00:24 CERN-Root.crl_url
-rw-r--r-- 1 root root  334 Jan 21 00:24 CERN-Root.info
-rw-r--r-- 1 root root  566 Jan 21 00:24 CERN-Root.namespaces
-rw-r--r-- 1 root root 1350 Jan 21 00:24 CERN-Root.pem
-rw-r--r-- 1 root root  284 Jan 21 00:24 CERN-Root.signing_policy
-rw-r--r-- 1 root root   72 Jan 21 00:32 CERN-TCA.crl_url
-rw-r--r-- 1 root root  379 Jan 21 00:32 CERN-TCA.info
-rw-r--r-- 1 root root 2204 Jan 21 00:32 CERN-TCA.pem
-rw-r--r-- 1 root root  269 Jan 21 00:32 CERN-TCA.signing_policy
-rw-r--r-- 1 root root 1521 Jan 20 23:50 bffbd7d0.0
-rw-r--r-- 1 root root  248 Jan 20 23:50 bffbd7d0.signing_policy
lrwxrwxrwx 1 root root   13 Jan 21 00:29 d254cc30.0 -> CERN-Root.pem
lrwxrwxrwx 1 root root   20 Jan 21 00:29 d254cc30.namespaces -> CERN-Root.namespaces
-rw-r--r-- 1 root root 1015 Jan 21 00:30 d254cc30.r0
lrwxrwxrwx 1 root root   24 Jan 21 00:30 d254cc30.signing_policy -> CERN-Root.signing_policy
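The hash-named entries (1d879c6c.0, d254cc30.0, etc.) are symlinks derived from the certificate subject hashes. A sketch of how they could be regenerated, assuming the CERN-TCA.pem and CERN-Root.pem files (and their signing_policy/namespaces files) are already in place; the exact hash values depend on the openssl version:

cd /etc/grid-security/certificates
for ca in CERN-TCA CERN-Root; do
  hash=$(openssl x509 -noout -hash -in ${ca}.pem)
  ln -sf ${ca}.pem ${hash}.0
  ln -sf ${ca}.signing_policy ${hash}.signing_policy
  [ -f ${ca}.namespaces ] && ln -sf ${ca}.namespaces ${hash}.namespaces
done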
 

Links

Revision 22 - 2014-01-16 - rptaylor

Line: 1 to 1
 
META TOPICPARENT name="BuildingDHAtlasVM"

BuildingATLASCernVM27

Step by step walk through on how to build the cloud image for ATLAS based on CernVM 2.7. This will be for a KVM hypervisor only for either Nimbus or OpenStack clouds.
Line: 360 to 360
 And add the following contents:
Changed:
<
<
# Keep grid setup out of environment for root and sysadmin.
>
>
# Keep grid setup out of environment for root, sysadmin and condor.
if [[ ! "$USER" =~ ^slot[0-9]+$ ]] ; then
  return 0
fi
Line: 373 to 373
export ATLAS_LOCAL_ROOT_BASE=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase
source $ATLAS_LOCAL_ROOT_BASE/user/atlasLocalSetup.sh --quiet
source ${ATLAS_LOCAL_ROOT_BASE}/packageSetups/atlasLocalEmiSetup.sh --quiet
Changed:
<
<
export ALRB_useGridSW=emi
>
>
#export ALRB_useGridSW=emi #not needed
## Fix for using AtlasLocalRootBase with a kit
unset AtlasSetupSite
rm ~/.asetup

Revision 21 - 2013-12-04 - frank

Line: 1 to 1
 
META TOPICPARENT name="BuildingDHAtlasVM"

BuildingATLASCernVM27

Step by step walk through on how to build the cloud image for ATLAS based on CernVM 2.7. This will be for a KVM hypervisor only for either Nimbus or OpenStack clouds.
Line: 171 to 171
 
Added:
>
>

Disable SELinux

Ensure that /etc/selinux/config exists and contains:

# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#     enforcing - SELinux security policy is enforced.
#     permissive - SELinux prints warnings instead of enforcing.
#     disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of these two values:
#     targeted - Targeted processes are protected,
#     mls - Multi Level Security protection.
SELINUXTYPE=targeted 

 

Set-up Context Helper

Set up the locations where the context helper places files

Revision 20 - 2013-11-28 - frank

Line: 1 to 1
 
META TOPICPARENT name="BuildingDHAtlasVM"

BuildingATLASCernVM27

Step by step walk through on how to build the cloud image for ATLAS based on CernVM 2.7. This will be for a KVM hypervisor only for either Nimbus or OpenStack clouds.
Line: 193 to 193
  Check that contexthelper is in your path.
Added:
>
>

Install git

As the title says
conary install git

Install Shoal Client

git clone git@github.com:hep-gc/shoal.git
cd shoal/shoal-client/
python setup.py install
Optionally clean up after yourself:
cd
rm -rf shoal
 

Set-up Puppet

First bring facter up to date
Line: 213 to 231
ruby install.rb
cd ../
rm -rf puppet-2.7.23*
Added:
>
>
puppet resource group puppet ensure=present
puppet resource user puppet ensure=present gid=puppet shell='/sbin/nologin'
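A quick check that the account and group now exist (a sanity check, not part of the original recipe):

id puppet
getent passwd puppet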
 

Depending on whether we are using a masterful or a masterless set-up, either configure puppet to connect to the heprc puppet master by editing /etc/puppet/puppet.conf to contain

Line: 226 to 246
 git clone https://github.com/MadMalcolm/atlasgce-modules.git modules puppet module install thias/sysctl
Changed:
<
<
and create the manifest, use my gist as an example and apply it in the rc.local
>
>
and create the manifest, use my gist as an example:

cd /etc/puppet/manifests
curl https://gist.github.com/MadMalcolm/6735198/raw/466a9ba1fb1f957d1d7b812305a9dd650b588a64/csnode_config.pp > csnode.pp

and apply it in the rc.local:

#!/bin/sh
#
# This script will be executed *after* all the other init scripts.
# You can put your own initialization stuff in here if you don't
# want to do the full Sys V style init stuff.

touch /var/lock/subsys/local

# Make sure context helper has time to finish
sleep 5

# Contextualize with puppet
/usr/bin/puppet apply /etc/puppet/manifests/csnode.pp --logdest /var/lib/puppet/log/csnode.log
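Note: the log is written under /var/lib/puppet/log, which may not exist on a fresh image; if puppet apply complains, create it first (an assumption based on the --logdest path above):

mkdir -p /var/lib/puppet/log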
 
Changed:
<
<

Disable condor on startup

>
>

Disable condor and puppet on startup

 Don't let condor start until after puppet (and contexthelper) have started:
chkconfig condor off
Added:
>
>
chkconfig puppet off
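Both services can be confirmed disabled with:

chkconfig --list condor
chkconfig --list puppet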
 

Revision 19 - 2013-10-29 - frank

Line: 1 to 1
 
META TOPICPARENT name="BuildingDHAtlasVM"

BuildingATLASCernVM27

Step by step walk through on how to build the cloud image for ATLAS based on CernVM 2.7. This will be for a KVM hypervisor only for either Nimbus or OpenStack clouds.
Line: 215 to 215
 rm -rf puppet-2.7.23*
Changed:
<
<
Configure puppet to connect to the heprc puppet master, edit the /etc/puppet/puppet.conf to contain
>
>
Depending on whether we are using a masterful or a masterless set-up, either configure puppet to connect to the heprc puppet master by editing /etc/puppet/puppet.conf to contain
 
[agent]
    server = puppet-master.heprc.uvic.ca
Added:
>
>
or retrieve the modules to run locally
git clone https://github.com/MadMalcolm/atlasgce-modules.git modules
puppet module install thias/sysctl
and create the manifest, use my gist as an example and apply it in the rc.local

Disable condor on startup

Don't let condor start until after puppet (and contexthelper) have started:
chkconfig condor off
 

Old Setup - now performed by puppet

I'm leaving this here for reference for now ...

Revision 18 - 2013-10-22 - frank

Line: 1 to 1
 
META TOPICPARENT name="BuildingDHAtlasVM"

BuildingATLASCernVM27

Step by step walk through on how to build the cloud image for ATLAS based on CernVM 2.7. This will be for a KVM hypervisor only for either Nimbus or OpenStack clouds.
Line: 72 to 72
 qemu-img resize cernvm-batch-node-2.7.2-1-2-x86_64.hdd +11G
Added:
>
>
Change the image partition table:
sudo kpartx -av cernvm-batch-node-2.7.2-1-2-x86_64.hdd
sudo fdisk /dev/loop0
sudo kpartx -d cernvm-batch-node-2.7.2-1-2-x86_64.hdd

The fdisk key presses went in this sequence:

p
d
n
p
1
(enter to accept default 1 as first)
(enter to accept default 327680 as last)
p
w

The last step is to grow the root file system once the VM is booted, see below.
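Optionally, confirm the new virtual size before booting the VM:

qemu-img info cernvm-batch-node-2.7.2-1-2-x86_64.hdd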

 

Create virsh domain XML

Line: 146 to 167
 

Grow the root partition

Expand the root file system to fill the root partition
Changed:
<
<
....
>
>
resize2fs /dev/vda1
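A quick check that the file system now fills the enlarged partition:

df -h /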
 
Added:
>
>

Set-up Context Helper

Set up the locations where the context helper places files
mkdir -p /etc/grid-security/certificates
touch /etc/grid-security/hostkey.pem
chmod 600 /etc/grid-security/hostkey.pem
touch /var/lib/cloud_type
touch /etc/condor/central_manager

Install context helper and context service:

wget https://raw.github.com/hep-gc/cloud-scheduler/dev/scripts/ec2contexthelper/context \
          https://raw.github.com/hep-gc/cloud-scheduler/dev/scripts/ec2contexthelper/contexthelper \
          https://raw.github.com/hep-gc/cloud-scheduler/dev/scripts/ec2contexthelper/setup.py
python setup.py install
chmod +x /etc/init.d/context
chkconfig context on

Check that contexthelper is in your path.
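For example:

which contexthelper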

Set-up Puppet

First bring facter up to date
wget http://downloads.puppetlabs.com/facter/facter-1.7.3.tar.gz
tar zxvf facter-1.7.3.tar.gz 
cd facter-1.7.3
ruby install.rb 
cd ../
rm -rf facter-1.7.3*

Install up to date version of puppet:

wget http://downloads.puppetlabs.com/puppet/puppet-2.7.23.tar.gz
tar zxvf puppet-2.7.23.tar.gz 
cd puppet-2.7.23
ruby install.rb 
cd ../
rm -rf puppet-2.7.23*

Configure puppet to connect to the heprc puppet master, edit the /etc/puppet/puppet.conf to contain

[agent]
    server = puppet-master.heprc.uvic.ca
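With the agent configured, a dry run against the master can be used to verify connectivity (assumes the node's certificate has been or will be signed on the master):

puppet agent --test --noop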

Old Setup - now performed by puppet

I'm leaving this here for reference for now ...
 

Create worker user accounts

Revision 17 - 2013-10-21 - frank

Line: 1 to 1
 
META TOPICPARENT name="BuildingDHAtlasVM"

BuildingATLASCernVM27

Step by step walk through on how to build the cloud image for ATLAS based on CernVM 2.7. This will be for a KVM hypervisor only for either Nimbus or OpenStack clouds.
Line: 23 to 23
 This background colour denotes file content
Changed:
<
<

Get the KVM batch node image

>
>

Set up the batch node image

Get the KVM and Xen batch node images from the CernVM downloads page:
 
Changed:
<
<
wget --no-check-certificate https://cernvm.cern.ch/releases/19/cernvm-batch-node-2.7.1-2-3-x86_64.hdd.gz
gunzip cernvm-batch-node-2.7.1-2-3-x86_64.hdd.gz
mv cernvm-batch-node-2.7.1-2-3-x86_64.hdd /opt/qemu/
>
>
mkdir -p atlas-VMs/{kvm,xen}
cd atlas-VMs
wget --no-check-certificate http://cernvm.cern.ch/releases/25/cernvm-batch-node-2.7.2-1-2-x86_64.ext3.gz
wget --no-check-certificate http://cernvm.cern.ch/releases/25/cernvm-batch-node-2.7.2-1-2-x86_64.hdd.gz

Unzip the images

gunzip *gz

Dual Hypervisor

Make a dual hypervisor image:
sudo kpartx -av cernvm-batch-node-2.7.2-1-2-x86_64.hdd
sudo mount /dev/mapper/loop0p1 kvm/
sudo mount -o loop cernvm-batch-node-2.7.2-1-2-x86_64.ext3 xen/
sudo cp kvm/boot/grub/grub.conf kvm/boot/grub/grub.conf-kvm
sudo cp xen/boot/grub/grub.conf kvm/boot/grub/grub.conf-xen
sudo find xen/boot/ -maxdepth 1 -type f -exec cp {} kvm/boot/ \;
sudo rsync -a xen/lib/modules/* kvm/lib/modules/
sudo mkdir kvm/root/.ssh
sudo chmod 700 kvm/root/.ssh
sudo cp ~/.ssh/authorized_keys kvm/root/.ssh/
sudo ls -l kvm/root/.ssh/authorized_keys
sudo umount kvm
sudo umount xen
sudo kpartx -d cernvm-batch-node-2.7.2-1-2-x86_64.hdd

Clean up after yourself

rm cernvm-batch-node-2.7.2-1-2-x86_64.ext3
rmdir kvm xen

Increase image size

Resize the image
qemu-img resize cernvm-batch-node-2.7.2-1-2-x86_64.hdd +11G
 
Line: 37 to 78
 Use the XML of an existing domain as a starting point. Then edit the source file, uuid and bridge MAC address
Changed:
<
<
sudo virsh dumpxml sl58TestCVMFSserver > cernvm-batch-node-2.7.1-2-3-x86_64.hdd.xml
>
>
sudo virsh dumpxml sl58TestCVMFSserver > cernvm-batch-node-2.7.2-x86_64.xml
 
Changed:
<
<
After editing the file content looks like this:
>
>
Edit the VM name, image location, uuid, and MAC. After editing the file content looks like this:
 
<domain type='kvm'>
Changed:
<
<
<name>cernvm-batch-node-2.7.2-x86_64</name>
<uuid>760de8f3-c9c4-4d91-9030-d7c7c5fdd5c8</uuid>
>
>
<name>cernvm-2.7.2</name>
<uuid>6a27b6e5-1f00-4baf-a1e8-44fd00023131</uuid>
  <memory unit='GiB'>2</memory>
  <currentMemory unit='GiB'>2</currentMemory>
  <vcpu placement='static'>1</vcpu>
Line: 66 to 107
  /usr/libexec/qemu-kvm
Changed:
<
<
>
>
 
Line: 95 to 137
 
Deleted:
<
<
Note: It would be easier to refer to the VM from virsh if you don't put spaces in the domain name.
 

Run the Image

Changed:
<
<
sudo virsh create cernvm-batch-node-2.7.1-2-3-x86_64.hdd.xml --console
>
>
sudo virsh create cernvm-batch-node-2.7.2-x86_64.xml --console
 
Added:
>
>

Grow the root partition

Expand the root file system to fill the root partition
....
 

Create worker user accounts

Revision 16 - 2013-10-04 - frank

Line: 1 to 1
 
META TOPICPARENT name="BuildingDHAtlasVM"

BuildingATLASCernVM27

Step by step walk through on how to build the cloud image for ATLAS based on CernVM 2.7. This will be for a KVM hypervisor only for either Nimbus or OpenStack clouds.
Line: 11 to 11
 
Changed:
<
<

Building a KVM CernVM 2.7.0 for ATLAS production

>
>

Building a KVM CernVM 2.7.2 for ATLAS production

  In the remainder of this document, the following formatting convention is used to differentiate terminal commands from file content:
Line: 44 to 44
 
<domain type='kvm'>
Changed:
<
<
<name>CernVM 2.7.1-x86_64</name>
<uuid>592be209-5fbc-45b3-99f4-35d613396d18</uuid>
>
>
<name>cernvm-batch-node-2.7.2-x86_64</name>
<uuid>760de8f3-c9c4-4d91-9030-d7c7c5fdd5c8</uuid>
  <memory unit='GiB'>2</memory>
  <currentMemory unit='GiB'>2</currentMemory>
  <vcpu placement='static'>1</vcpu>
Line: 66 to 66
  /usr/libexec/qemu-kvm
Changed:
<
<
>
>
 
Line: 423 to 422
 

Using Puppet

When using Puppet to configure the VM, be sure to disable condor from starting automatically: chkconfig condor off. That way Condor will only start after being configured correctly.
Added:
>
>

Resizing the Image

First resize the raw image using qemu image tools, loop mount it, edit the partition table:
sudo qemu-img resize /opt/qemu/cernvm-large-node-2.7.2-x86_64 +11G
sudo kpartx -av /opt/qemu/cernvm-large-node-2.7.2-x86_64
sudo fdisk /dev/loop1

Delete the old partition and make a new one over the full size; my sequence of key presses was:

p
d
n
p
1
(enter to accept default 1 as first)
(enter to accept default 327680 as last)
p
w

Delete the loop mount:

sudo kpartx -d /opt/qemu/cernvm-large-node-2.7.2-x86_64

Boot the virtual machine:

sudo virsh create ~/docs/cernvm-large-node-2.7.2-x86_64.xml --console

and (in the VM console) resize the file system in the VM:

resize2fs /dev/vda1
 

Save the VM image

To save the virtual machine image shut down the running machine and copy the file

Revision 15 - 2013-10-03 - rptaylor

Line: 1 to 1
 
META TOPICPARENT name="BuildingDHAtlasVM"

BuildingATLASCernVM27

Step by step walk through on how to build the cloud image for ATLAS based on CernVM 2.7. This will be for a KVM hypervisor only for either Nimbus or OpenStack clouds.
Line: 205 to 205
 
CVMFS_REPOSITORIES=atlas.cern.ch,atlas-condb.cern.ch,grid.cern.ch
Changed:
<
<
CVMFS_QUOTA_LIMIT=5000
>
>
CVMFS_QUOTA_LIMIT=18000
CVMFS_CACHE_BASE=/scratch/cvmfs
 CVMFS_HTTP_PROXY="http://chrysaor.westgrid.ca:3128;http://cernvm-webfs.atlas-canada.ca:3128;DIRECT"

Note: This could do with some generalization, or maybe this will look very different with shoal in play?

Added:
>
>
Turn off cvmfs in chkconfig, and configure puppet to create /scratch/cvmfs owned by cvmfs.cvmfs and then start cvmfs.
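Done by hand instead of via puppet, those steps would look roughly like this (a sketch; the service/init script names are assumed to match the CernVM 2.7 image):

chkconfig cvmfs off
mkdir -p /scratch/cvmfs
chown cvmfs.cvmfs /scratch/cvmfs
service cvmfs start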
  One could create a cern.ch.local configuration file:
Line: 280 to 282
 
cd /etc/init.d
Changed:
<
<
sudo mkdir -p /home/condor/old-scripts
sudo mv condor /home/condor/old-scripts/
sudo chown -R condor.condor /home/condor/old-scripts/
>
>
sudo mv condor /tmp
sudo wget https://raw.github.com/hep-gc/cloud-scheduler/master/scripts/condor/worker/condor --no-check-certificate
sudo chmod 755 condor
sudo vi condor

Revision 14 - 2013-10-02 - rptaylor

Line: 1 to 1
 
META TOPICPARENT name="BuildingDHAtlasVM"

BuildingATLASCernVM27

Step by step walk through on how to build the cloud image for ATLAS based on CernVM 2.7. This will be for a KVM hypervisor only for either Nimbus or OpenStack clouds.
Line: 404 to 404
 UTC="true"
Added:
>
>
Note: this doesn't appear to be working; the date is given in e.g. PDT.
 

Context Helper

On Nimbus, Cloud Scheduler can simply write files into the VM. On OpenStack, a service running on the VM must retrieve information from the metadata server. Follow these instructions:
Line: 419 to 420
 cd
Added:
>
>

Using Puppet

When using Puppet to configure the VM, be sure to disable condor from starting automatically: chkconfig condor off. That way Condor will only start after being configured correctly.
 

Save the VM image

To save the virtual machine image shut down the running machine and copy the file

Revision 13 - 2013-09-27 - rptaylor

Line: 1 to 1
 
META TOPICPARENT name="BuildingDHAtlasVM"

BuildingATLASCernVM27

Step by step walk through on how to build the cloud image for ATLAS based on CernVM 2.7. This will be for a KVM hypervisor only for either Nimbus or OpenStack clouds.
Line: 171 to 171
 fi

export GLOBUS_FTP_CLIENT_GRIDFTP2=true

Added:
>
>
export VO_ATLAS_SW_DIR=/cvmfs/atlas.cern.ch/repo/sw
## Set up grid environment:
# Use EMI in AtlasLocalRootBase
Line: 185 to 186
# Site-specific variables (e.g. Frontier and Squid servers) are set based on ATLAS_SITE_NAME (from JDL).
# This auto-setup is only temporarily needed, and will soon become automatic
. /cvmfs/atlas.cern.ch/repo/sw/local/bin/auto-setup
Deleted:
<
<
export VO_ATLAS_SW_DIR=/cvmfs/atlas.cern.ch/repo/sw
 

Create the grid-security and certificates directories, and pre-create the hostkey so that it will have the right ACL mode when the real key is contextualized in:

Revision 12 - 2013-09-26 - rptaylor

Line: 1 to 1
 
META TOPICPARENT name="BuildingDHAtlasVM"

BuildingATLASCernVM27

Step by step walk through on how to build the cloud image for ATLAS based on CernVM 2.7. This will be for a KVM hypervisor only for either Nimbus or OpenStack clouds.
Line: 186 to 186
 # This auto-setup is only temporarily needed, and will soon become automatic . /cvmfs/atlas.cern.ch/repo/sw/local/bin/auto-setup
Changed:
<
<
## fix for atlas software location disappearing from condor job environment
export VO_ATLAS_SW_DIR
>
>
export VO_ATLAS_SW_DIR=/cvmfs/atlas.cern.ch/repo/sw
 

Create the grid-security and certificates directories, and pre-create the hostkey so that it will have the right ACL mode when the real key is contextualized in:

Revision 11 - 2013-09-19 - rptaylor

Line: 1 to 1
 
META TOPICPARENT name="BuildingDHAtlasVM"

BuildingATLASCernVM27

Step by step walk through on how to build the cloud image for ATLAS based on CernVM 2.7. This will be for a KVM hypervisor only for either Nimbus or OpenStack clouds.
Line: 172 to 172
  export GLOBUS_FTP_CLIENT_GRIDFTP2=true
Deleted:
<
<
# Workaround for condor not setting $HOME for worker sessions.
# voms-proxy-info requires this.
if [[ -z "$HOME" ]] ; then
  export HOME=`eval echo ~$USER`
fi
## Set up grid environment:
# Use EMI in AtlasLocalRootBase
export ATLAS_LOCAL_ROOT_BASE=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase
Line: 196 to 190
 export VO_ATLAS_SW_DIR
Changed:
<
<
Ensure the following files exist with these permissions:
>
>
Create the grid-security and certificates directories, and pre-create the hostkey so that it will have the right ACL mode when the real key is contextualized in:
 
Changed:
<
<
drwxr-xr-x 6 root root 4096 Aug 8 05:30 /etc/grid-security/
-rw-r--r-- 1 root root 5381 Apr 5 14:41 /etc/grid-security/hostcert.pem
-rw------- 1 root root 1679 Apr 5 14:41 /etc/grid-security/hostkey.pem
>
>
mkdir -p /etc/grid-security/certificates
touch /etc/grid-security/hostkey.pem
chmod 600 /etc/grid-security/hostkey.pem
 
Deleted:
<
<
 

CernVM FS Configuration

Line: 214 to 207
 
CVMFS_REPOSITORIES=atlas.cern.ch,atlas-condb.cern.ch,grid.cern.ch
Changed:
<
<
CVMFS_QUOTA_LIMIT=4000
>
>
CVMFS_QUOTA_LIMIT=5000
 CVMFS_HTTP_PROXY="http://chrysaor.westgrid.ca:3128;http://cernvm-webfs.atlas-canada.ca:3128;DIRECT"
Line: 337 to 330
## Job slot/user accounts:
STARTER_ALLOW_RUNAS_OWNER = False
DEDICATED_EXECUTE_ACCOUNT_REGEXP = slot[0-9]+
Changed:
<
<
USER_JOB_WRAPPER=/usr/local/bin/condor-job-wrapper
>
>
USER_JOB_WRAPPER=/opt/condor/libexec/jobwrapper.sh
SLOT1_USER = slot01
SLOT2_USER = slot02
SLOT3_USER = slot03
Line: 375 to 368
 Create the batch job wrapper script:
Changed:
<
<
sudo vi /usr/local/bin/condor-job-wrapper
>
>
sudo vi /opt/condor/libexec/jobwrapper.sh
 

with the following:

#!/bin/bash -l
Added:
>
>
#
# Condor startd jobwrapper
# Executes using bash -l, so that all /etc/profile.d/*.sh scripts are sourced.
#
THISUSER=`/usr/bin/whoami`
export HOME=`getent passwd $THISUSER | awk -F : '{print $6}'`
 exec "$@"
Added:
>
>
 

and make it executable:

Line: 388 to 388
 and make it executable:
Changed:
<
<
sudo chmod 755 /usr/local/bin/condor-job-wrapper
>
>
sudo chmod 755 /opt/condor/libexec/jobwrapper.sh
 

Revision 10 - 2013-09-19 - frank

Line: 1 to 1
 
META TOPICPARENT name="BuildingDHAtlasVM"

BuildingATLASCernVM27

Step by step walk through on how to build the cloud image for ATLAS based on CernVM 2.7. This will be for a KVM hypervisor only for either Nimbus or OpenStack clouds.
Line: 407 to 407
 
Added:
>
>

Context Helper

On Nimbus, Cloud Scheduler can simply write files into the VM. On OpenStack, a service running on the VM must retrieve information from the metadata server. Follow these instructions:

cd /tmp
wget https://raw.github.com/hep-gc/cloud-scheduler/master/scripts/ec2contexthelper/context
wget https://raw.github.com/hep-gc/cloud-scheduler/master/scripts/ec2contexthelper/contexthelper
wget https://raw.github.com/hep-gc/cloud-scheduler/master/scripts/ec2contexthelper/setup.py
python setup.py install
chmod +x /etc/init.d/context
chkconfig context on
cd
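On an OpenStack cloud you can verify from inside the VM that the metadata server is reachable (the same endpoint the ephemeral-storage script queries):

curl -m 10 http://169.254.169.254/openstack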
 

Save the VM image

To save the virtual machine image shut down the running machine and copy the file

Revision 9 - 2013-09-18 - frank

Line: 1 to 1
 
META TOPICPARENT name="BuildingDHAtlasVM"

BuildingATLASCernVM27

Step by step walk through on how to build the cloud image for ATLAS based on CernVM 2.7. This will be for a KVM hypervisor only for either Nimbus or OpenStack clouds.
Line: 191 to 191
# Site-specific variables (e.g. Frontier and Squid servers) are set based on ATLAS_SITE_NAME (from JDL).
# This auto-setup is only temporarily needed, and will soon become automatic
. /cvmfs/atlas.cern.ch/repo/sw/local/bin/auto-setup
Added:
>
>
## fix for atlas software location disappearing from condor job environment
export VO_ATLAS_SW_DIR
 

Ensure the following files exist with these permissions:

Revision 8 - 2013-09-16 - frank

Line: 1 to 1
 
META TOPICPARENT name="BuildingDHAtlasVM"

BuildingATLASCernVM27

Step by step walk through on how to build the cloud image for ATLAS based on CernVM 2.7. This will be for a KVM hypervisor only for either Nimbus or OpenStack clouds.
Line: 193 to 193
 . /cvmfs/atlas.cern.ch/repo/sw/local/bin/auto-setup
Added:
>
>
Ensure the following files exist with these permissions:
drwxr-xr-x 6 root root 4096 Aug  8 05:30 /etc/grid-security/
-rw-r--r-- 1 root root 5381 Apr  5 14:41 /etc/grid-security/hostcert.pem
-rw------- 1 root root 1679 Apr  5 14:41 /etc/grid-security/hostkey.pem
 

CernVM FS Configuration

Revision 7 - 2013-08-09 - frank

Line: 1 to 1
 
META TOPICPARENT name="BuildingDHAtlasVM"

BuildingATLASCernVM27

Changed:
<
<
Step by step walk through on how to build the cloud image for ATLAS based on CernVM 2.7. This will be for a KVM hypervisor only for starters.
>
>
Step by step walk through on how to build the cloud image for ATLAS based on CernVM 2.7. This will be for a KVM hypervisor only for either Nimbus or OpenStack clouds.
 
Deleted:
<
<
 
Line: 24 to 23
 This background colour denotes file content
Changed:
<
<

1. Get the KVM batch node image

Home Up Down
>
>

Get the KVM batch node image

 
wget --no-check-certificate https://cernvm.cern.ch/releases/19/cernvm-batch-node-2.7.1-2-3-x86_64.hdd.gz
Line: 35 to 32
 
Changed:
<
<

2. Create virsh domain XML

Home Up Down
>
>

Create virsh domain XML

  Use the XML of an existing domain as a starting point. Then edit the source file, uuid and bridge MAC address
Line: 103 to 98
  Note: It would be easier to refer to the VM from virsh if you don't put spaces in the domain name.
Changed:
<
<

3. Run the Image

Home Up Down
>
>

Run the Image

 
sudo virsh create cernvm-batch-node-2.7.1-2-3-x86_64.hdd.xml --console
Changed:
<
<

4. Create sysadmin account (optional, not required)

Home Up Down

groupadd -g 500 sysadmin
useradd -g sysadmin -u 500 sysadmin
mkdir -p /home/sysadmin/.ssh
chmod 700 /home/sysadmin/.ssh
touch /home/sysadmin/.ssh/authorized_keys
chmod 600 /home/sysadmin/.ssh/authorized_keys
chown -R sysadmin.sysadmin /home/sysadmin/.ssh
vi /home/sysadmin/.bashrc

Add the following contents to the end of the file:

export GLOBUS_LOCATION=/cvmfs/grid.cern.ch/3.2.11-1/globus
export MYPROXY_SERVER=myproxy.cloud.nrc.ca
. /cvmfs/grid.cern.ch/3.2.11-1/globus/etc/globus-user-env.sh 

visudo

And add the following to the user section:

sysadmin ALL=(ALL) NOPASSWD: ALL

5. Create worker user accounts:

Home Up Down
>
>

Create worker user accounts

 
groupadd -g 102 condor
Line: 157 to 114
 
Changed:
<
<

6. Optionally, add ssh keys to root for debugging

Home Up Down
>
>

Optionally, add ssh keys to root for debugging

 
sudo vi /root/.ssh/authorized_keys
Line: 168 to 123
 And insert the desired id_rsa.pub keys.
Changed:
<
<

7. Network tuning:

Home Up Down
>
>

Network tuning

 
sudo vi /etc/sysctl.conf
Line: 201 to 154
 
Changed:
<
<

8. OpenStack Ephemeral Storage Script

Home Up Down

Create a script to format and mount the virtual block device on OpenStack clouds.

sudo vi /etc/init.d/openstack_ephemeral_storage

containing:

#!/bin/bash
#
# openstack_ephemeral_storage
#
# chkconfig: 2345 90 99
# description: Format and mount ephemeral storage on the virtual block device.
#
### BEGIN INIT INFO
# Provides:          openstack_ephemeral_storage
# Required-Start:    
# Required-Stop:
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: openstack_ephemeral_storage daemon
# Description:       The openstack_ephemeral_storage daemon formats and mounts the virtual block device. 
### END INIT INFO

device=/dev/vdb
lockfile=/var/log/openstack_ephemeral_storage
self="`basename $0`"

check_openstack() {
  logger -s -t "$self" "Querying metadata server"
  query=`curl -m 10 http://169.254.169.254/openstack 2> >(logger -s -t "$self")`
  logger -s -t "$self" "Metadata server returned: $query"
  # check if query was successful, and first word matches "latest", or a date in the format "2013-03-08"
  if [[ $? -eq 0 ]] && [[ $query =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2} || $query =~ ^latest ]] ; then 
    logger -s -t "$self" "Cloud type is OpenStack; continuing."
  else
    logger -s -t "$self" "Cloud type is not OpenStack; exiting."
    exit 0
  fi
}

case "$1" in
  start)
    check_openstack
    if [[ ! -f $lockfile ]] ; then
      fs_label=`/sbin/tune2fs -l $device | /bin/awk '/Filesystem volume name:/ {print $4}'`
      logger -s -t "$self" "Current label of $device: $fs_label"
      if [[ "$fs_label" == "ephemeral0" ]] ; then
        logger -s -t "$self" "Re-labelling and mounting pre-formatted volume on $device"
        ( /sbin/tune2fs -L blankpartition0 $device 2>&1 && /bin/mount /scratch 2>&1 ) | logger -s -t "$self"
      else
        logger -s -t "$self" "Formatting and mounting $device"
        ( /sbin/mkfs.ext2 -L blankpartition0 $device 2>&1 && /bin/mount /scratch 2>&1 ) | logger -s -t "$self"
      fi
      touch $lockfile
    fi
    ;;
  stop)
    # unimplemented
    ;;
  status)
    check_openstack
    if [ -f $lockfile ]; then
      echo "The secondary storage $device has been formatted and mounted into /scratch by $0."
    else
      echo "The secondary storage $device hasn't been formatted and mounted into /scratch by $0."
    fi
    ;;
  *)
    echo "Usage: $0 {start|stop|status}"
    exit 1
    ;;
esac
 
exit 0

sudo chmod 755 /etc/init.d/openstack_ephemeral_storage
sudo /sbin/chkconfig --add openstack_ephemeral_storage

9. Grid Environment:

Home Up Down
>
>

Grid Environment

  Create a grid configuration file:
Line: 332 to 194
 
Changed:
<
<

10. CVMFS configuration

Home Up Down
>
>

CernVM FS Configuration

 
sudo vi /etc/cvmfs/default.local
Line: 371 to 231
 But we'd want some cloud awareness to be baked into the VM by cloud_scheduler, to choose which CVMFS_SERVER_URL to use. Mike says he could easily do this via inserting a JSON file somewhere for our usage.
Changed:
<
<

11. CernVM configuration:

Home Up Down
>
>

CernVM Configuration

 
sudo vi /etc/cernvm/site.conf
Line: 390 to 248
 
Changed:
<
<

12. Blank space partition configuration:

Home Up Down
>
>

Blank Space Partition Configuration

 
sudo mkdir -p /scratch
sudo vi /etc/fstab
Changed:
<
<
Add the following to fstab:
>
>
Add the following to fstab depending on whether the VM is for OpenStack or Nimbus.

OpenStack

LABEL=ephemeral0 /scratch ext4 noatime 1 0

Nimbus

 
LABEL=blankpartition0 /scratch ext2 noatime 1 0
Line: 408 to 272
 Note: ext4 support exists in the e4fsprogs package in CernVM v2.6. We should try using ext4 with no journaling; the performance should be better. It is disabled like this: tune4fs -O ^has_journal /dev/sdb and you can verify whether the has_journal property is there using /sbin/dumpe4fs /dev/sdb . Or, just create it without journaling in the first place: mkfs.ext4 -O ^has_journal /dev/sdb . Maybe try the nobarrier and noatime options too, although they might not be as significant. However, in the case of Nimbus, some new development will be needed to use ext4 partitions.
Changed:
<
<

13. HTCondor configuration:

Home Up Down
>
>

HTCondor configuration

  Replace the default HTCondor init.d script with the one from Cloud Scheduler github repository:
Line: 518 to 381
 sudo chmod 755 /usr/local/bin/condor-job-wrapper
Changed:
<
<

14. Set time to UTC

Home Up Down
>
>

Set time to UTC

 Configure the system to use UTC time by editing the file:
Line: 533 to 396
 UTC="true"
Changed:
<
<

15. Save the VM image

Home Up
>
>

Save the VM image

  To save the virtual machine image shut down the running machine and copy the file

Revision 6 - 2013-07-29 - rptaylor

Line: 1 to 1
 
META TOPICPARENT name="BuildingDHAtlasVM"

BuildingATLASCernVM27

Step by step walk through on how to build the cloud image for ATLAS based on CernVM 2.7. This will be for a KVM hypervisor only for starters.
Line: 153 to 153
 
groupadd -g 102 condor
useradd -g condor -u 102 -s /sbin/nologin condor
Changed:
<
<
for i in `seq -w 1 32`; do id=$((500+${i#0})); groupadd -g $id atlas$i; useradd -g atlas$i -u $id -s /bin/bash atlas$i; done
>
>
for i in `seq -w 1 32`; do id=$((500+${i#0})); groupadd -g $id slot$i; useradd -g slot$i -u $id -s /bin/bash slot$i; done
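Spot-check a couple of the generated accounts and UIDs:

id slot01
id slot32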
 
Line: 304 to 304
 
# Keep grid setup out of environment for root and sysadmin.
Changed:
<
<
if [[ ! "$USER" =~ ^atlas[0-9]+$ ]] ; then
>
>
if [[ ! "$USER" =~ ^slot[0-9]+$ ]] ; then
 return 0
fi

export GLOBUS_FTP_CLIENT_GRIDFTP2=true

Changed:
<
<
# Workaround for condor not setting $HOME for atlas users.
>
>
# Workaround for condor not setting $HOME for worker sessions.
# voms-proxy-info requires this.
if [[ -z "$HOME" ]] ; then
  export HOME=`eval echo ~$USER`
Line: 463 to 463
  ## Job slot/user accounts: STARTER_ALLOW_RUNAS_OWNER = False
Changed:
<
<
DEDICATED_EXECUTE_ACCOUNT_REGEXP = atlas[0-9]+
>
>
DEDICATED_EXECUTE_ACCOUNT_REGEXP = slot[0-9]+
 USER_JOB_WRAPPER=/usr/local/bin/condor-job-wrapper
Changed:
<
<
SLOT1_USER = atlas01 SLOT2_USER = atlas02 SLOT3_USER = atlas03 SLOT4_USER = atlas04 SLOT5_USER = atlas05 SLOT6_USER = atlas06 SLOT7_USER = atlas07 SLOT8_USER = atlas08 SLOT9_USER = atlas09 SLOT10_USER = atlas10 SLOT11_USER = atlas11 SLOT12_USER = atlas12 SLOT13_USER = atlas13 SLOT14_USER = atlas14 SLOT15_USER = atlas15 SLOT16_USER = atlas16 SLOT17_USER = atlas17 SLOT18_USER = atlas18 SLOT19_USER = atlas19 SLOT20_USER = atlas20 SLOT21_USER = atlas21 SLOT22_USER = atlas22 SLOT23_USER = atlas23 SLOT24_USER = atlas24 SLOT25_USER = atlas25 SLOT26_USER = atlas26 SLOT27_USER = atlas27 SLOT28_USER = atlas28 SLOT29_USER = atlas29 SLOT30_USER = atlas30 SLOT31_USER = atlas31 SLOT32_USER = atlas32
>
>
SLOT1_USER = slot01 SLOT2_USER = slot02 SLOT3_USER = slot03 SLOT4_USER = slot04 SLOT5_USER = slot05 SLOT6_USER = slot06 SLOT7_USER = slot07 SLOT8_USER = slot08 SLOT9_USER = slot09 SLOT10_USER = slot10 SLOT11_USER = slot11 SLOT12_USER = slot12 SLOT13_USER = slot13 SLOT14_USER = slot14 SLOT15_USER = slot15 SLOT16_USER = slot16 SLOT17_USER = slot17 SLOT18_USER = slot18 SLOT19_USER = slot19 SLOT20_USER = slot20 SLOT21_USER = slot21 SLOT22_USER = slot22 SLOT23_USER = slot23 SLOT24_USER = slot24 SLOT25_USER = slot25 SLOT26_USER = slot26 SLOT27_USER = slot27 SLOT28_USER = slot28 SLOT29_USER = slot29 SLOT30_USER = slot30 SLOT31_USER = slot31 SLOT32_USER = slot32
 

Create the batch job wrapper script:

Revision 5 - 2013-07-24 - rptaylor

Line: 1 to 1
 
META TOPICPARENT name="BuildingDHAtlasVM"

BuildingATLASCernVM27

Step by step walk through on how to build the cloud image for ATLAS based on CernVM 2.7. This will be for a KVM hypervisor only for starters.
Line: 113 to 113
 

Changed:
<
<

4. Create sysadmin account

>
>

4. Create sysadmin account (optional, not required)

 Home Up Down
Line: 147 to 147
 

Changed:
<
<

5. Create other user accounts:

>
>

5. Create worker user accounts:

 Home Up Down
Line: 333 to 333
 

Changed:
<
<

10. CVMFS configuration NEEDS WORK

>
>

10. CVMFS configuration

 Home Up Down
Line: 351 to 351
 Note: This could do with some generalization, or maybe this will look very different with shoal in play?
Changed:
<
<
The CVMFS parameters are sourced from files in the following order (from /etc/cvmfs/domain.d/cern.ch.conf):

# /etc/cvmfs/default.conf
# /etc/cvmfs/default.local
# /etc/cvmfs/domain.d/<your_domain>.conf
# /etc/cvmfs/domain.d/<your_domain>.local
# /etc/cvmfs/config.d/<your_repository>.conf
# /etc/cvmfs/config.d/<your_repository>.local

The file cern.ch.conf sets

CVMFS_SERVER_URL=${CERNVM_SERVER_URL:="http://cvmfs-stratum-one.cern.ch/opt/@org@;http://cernvmfs.gridpp.rl.ac.uk/opt/@org@;http://cvmfs.racf.bnl.gov/opt/@org@"}

So the old cern.ch.local configuration file did nothing. For now we'll therefore not add a local configuration! On further reflection, we could create a new cern.ch local configuration file:

>
>
One could create a cern.ch.local configuration file:
 
sudo vi /etc/cvmfs/domain.d/cern.ch.local
Line: 384 to 368
 #CVMFS_SERVER_URL="http://cvmfs.fnal.gov:8000/opt/@org@;http://cvmfs.racf.bnl.gov:8000/opt/@org@;http://cernvmfs.gridpp.rl.ac.uk:8000/opt/@org@;http://cvmfs-stratum-one.cern.ch:8000/opt/@org@;http://cvmfs02.grid.sinica.edu.tw:8000/opt/@org@"
Changed:
<
<
But we'd want some cloud awareness to be baked into the VM by cloud_scheduler. Mike says he would easily do this via inserting a JSON file somewhere for our usage.
>
>
But we'd want some cloud awareness to be baked into the VM by cloud_scheduler, to choose which CVMFS_SERVER_URL to use. Mike says he could easily do this via inserting a JSON file somewhere for our usage.
 

Revision 4 - 2013-07-19 - rptaylor

Line: 1 to 1
 
META TOPICPARENT name="BuildingDHAtlasVM"

BuildingATLASCernVM27

Step by step walk through on how to build the cloud image for ATLAS based on CernVM 2.7. This will be for a KVM hypervisor only for starters.
Line: 367 to 367
 CVMFS_SERVER_URL=${CERNVM_SERVER_URL:="http://cvmfs-stratum-one.cern.ch/opt/@org@;http://cernvmfs.gridpp.rl.ac.uk/opt/@org@;http://cvmfs.racf.bnl.gov/opt/@org@"}
Changed:
<
<
So the old cern.ch.local configuration file did nothing. For now we'll therefor not add a local configuration! On further reflection, we could create a new cern.ch local configuration file:
>
>
So the old cern.ch.local configuration file did nothing. For now we'll therefore not add a local configuration! On further reflection, we could create a new cern.ch local configuration file:
 
sudo vi /etc/cvmfs/domain.d/cern.ch.local
Line: 384 to 384
 #CVMFS_SERVER_URL="http://cvmfs.fnal.gov:8000/opt/@org@;http://cvmfs.racf.bnl.gov:8000/opt/@org@;http://cernvmfs.gridpp.rl.ac.uk:8000/opt/@org@;http://cvmfs-stratum-one.cern.ch:8000/opt/@org@;http://cvmfs02.grid.sinica.edu.tw:8000/opt/@org@"
Changed:
<
<
But we'd want come cloud awareness to be baked into the VM by cloud_scheduler. Mike says he would easily do this via inserting a JSON file somewhere for our usage.
>
>
But we'd want some cloud awareness to be baked into the VM by cloud_scheduler. Mike says he would easily do this via inserting a JSON file somewhere for our usage.
 

Revision 3 - 2013-07-19 - frank

Line: 1 to 1
 
META TOPICPARENT name="BuildingDHAtlasVM"

BuildingATLASCernVM27

Step by step walk through on how to build the cloud image for ATLAS based on CernVM 2.7. This will be for a KVM hypervisor only for starters.
Line: 101 to 101
 
Added:
>
>
Note: It would be easier to refer to the VM from virsh if you don't put spaces in the domain name.
 

3. Run the Image

Line: 534 to 535
 

Changed:
<
<

14. HTCondor configuration:

>
>

14. Set time to UTC

Home Up Down
Configure the system to use UTC time by editing the file:
Line: 547 to 548
 
UTC="true"
Added:
>
>

15. Save the VM image

Home Up

To save the virtual machine image shut down the running machine and copy the file

sudo virsh shutdown "CernVM 2.7.1-x86_64"
cp /opt/qemu/cernvm-batch-node-2.7.1-2-3-x86_64.hdd .

Revision 2 - 2013-07-19 - frank

Line: 1 to 1
 
META TOPICPARENT name="BuildingDHAtlasVM"

BuildingATLASCernVM27

Step by step walk through on how to build the cloud image for ATLAS based on CernVM 2.7. This will be for a KVM hypervisor only for starters.
Line: 349 to 349
  Note: This could do with some generalization, or maybe this will look very different with shoal in play?
Changed:
<
<
Create a new cern.ch local configuration file:
>
>
The CVMFS parameters are sourced from files in the following order (from /etc/cvmfs/domain.d/cern.ch.conf):

# /etc/cvmfs/default.conf
# /etc/cvmfs/default.local
# /etc/cvmfs/domain.d/<your_domain>.conf
# /etc/cvmfs/domain.d/<your_domain>.local
# /etc/cvmfs/config.d/<your_repository>.conf
# /etc/cvmfs/config.d/<your_repository>.local

The file cern.ch.conf sets

CVMFS_SERVER_URL=${CERNVM_SERVER_URL:="http://cvmfs-stratum-one.cern.ch/opt/@org@;http://cernvmfs.gridpp.rl.ac.uk/opt/@org@;http://cvmfs.racf.bnl.gov/opt/@org@"}

So the old cern.ch.local configuration file did nothing. For now we'll therefor not add a local configuration! On further reflection, we could create a new cern.ch local configuration file:

 
sudo vi /etc/cvmfs/domain.d/cern.ch.local
Line: 359 to 376
 
# For Europe:
Changed:
<
<
#CVMFS_SERVER_URL=${CERNVM_SERVER_URL:="http://cvmfs-stratum-one.cern.ch:8000/opt/@org@;http://cernvmfs.gridpp.rl.ac.uk:8000/opt/@org@;http://cvmfs.racf.bnl.gov:8000/opt/@org@;http://cvmfs.fnal.gov:8000/opt/@org@;http://cvmfs02.grid.sinica.edu.tw:8000/opt/@org@"}
>
>
#CVMFS_SERVER_URL="http://cvmfs-stratum-one.cern.ch:8000/opt/@org@;http://cernvmfs.gridpp.rl.ac.uk:8000/opt/@org@;http://cvmfs.racf.bnl.gov:8000/opt/@org@;http://cvmfs.fnal.gov:8000/opt/@org@;http://cvmfs02.grid.sinica.edu.tw:8000/opt/@org@"
 # For North America:
Changed:
<
<
CVMFS_SERVER_URL=${CERNVM_SERVER_URL:="http://cvmfs.racf.bnl.gov:8000/opt/@org@;http://cvmfs.fnal.gov:8000/opt/@org@;http://cvmfs-stratum-one.cern.ch:8000/opt/@org@;http://cernvmfs.gridpp.rl.ac.uk:8000/opt/@org@;http://cvmfs02.grid.sinica.edu.tw:8000/opt/@org@"}
>
>
CVMFS_SERVER_URL="http://cvmfs.racf.bnl.gov:8000/opt/@org@;http://cvmfs.fnal.gov:8000/opt/@org@;http://cvmfs-stratum-one.cern.ch:8000/opt/@org@;http://cernvmfs.gridpp.rl.ac.uk:8000/opt/@org@;http://cvmfs02.grid.sinica.edu.tw:8000/opt/@org@"
 # For Australia:
Changed:
<
<
#CVMFS_SERVER_URL=${CERNVM_SERVER_URL:="http://cvmfs.fnal.gov:8000/opt/@org@;http://cvmfs.racf.bnl.gov:8000/opt/@org@;http://cernvmfs.gridpp.rl.ac.uk:8000/opt/@org@;http://cvmfs-stratum-one.cern.ch:8000/opt/@org@;http://cvmfs02.grid.sinica.edu.tw:8000/opt/@org@"}
>
>
#CVMFS_SERVER_URL="http://cvmfs.fnal.gov:8000/opt/@org@;http://cvmfs.racf.bnl.gov:8000/opt/@org@;http://cernvmfs.gridpp.rl.ac.uk:8000/opt/@org@;http://cvmfs-stratum-one.cern.ch:8000/opt/@org@;http://cvmfs02.grid.sinica.edu.tw:8000/opt/@org@"
 
Changed:
<
<
Note: there seems to be a CVMFS bug that prevents this from working. It works if the ${CERNVM_SERVER_URL:= part (and the closing brace) are left out.
>
>
But we'd want come cloud awareness to be baked into the VM by cloud_scheduler. Mike says he would easily do this via inserting a JSON file somewhere for our usage.
 

Line: 515 to 532
 
sudo chmod 755 /usr/local/bin/condor-job-wrapper
Added:
>
>

14. HTCondor configuration:

Home Up Down
Configure the system to use UTC time by editing the file:

sudo vi /etc/sysconfig/clock

and setting

UTC="true"

Revision 1 - 2013-07-18 - frank

Line: 1 to 1
Added:
>
>
META TOPICPARENT name="BuildingDHAtlasVM"

BuildingATLASCernVM27

Step by step walk through on how to build the cloud image for ATLAS based on CernVM 2.7. This will be for a KVM hypervisor only for starters.

Links

Building a KVM CernVM 2.7.0 for ATLAS production

In the remainder of this document, the following formatting convention is used to differentiate terminal commands from file content:

This background colour denotes terminal input

This background colour denotes file content

1. Get the KVM batch node image

Home Up Down

wget --no-check-certificate https://cernvm.cern.ch/releases/19/cernvm-batch-node-2.7.1-2-3-x86_64.hdd.gz
gunzip cernvm-batch-node-2.7.1-2-3-x86_64.hdd.gz
mv cernvm-batch-node-2.7.1-2-3-x86_64.hdd /opt/qemu/

2. Create virsh domain XML

Home Up Down

Use the XML of an existing domain as a starting point. Then edit the source file, uuid and bridge MAC address

sudo virsh dumpxml sl58TestCVMFSserver > cernvm-batch-node-2.7.1-2-3-x86_64.hdd.xml

After editing the file content looks like this:

<domain type='kvm'>
  <name>CernVM 2.7.1-x86_64</name>
  <uuid>592be209-5fbc-45b3-99f4-35d613396d18</uuid>
  <memory unit='GiB'>2</memory>
  <currentMemory unit='GiB'>2</currentMemory>
  <vcpu placement='static'>1</vcpu>
  <os>
    <type arch='x86_64'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source file='/opt/qemu/cernvm-batch-node-2.7.1-2-3-x86_64.hdd'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </disk>
    <interface type='bridge'>
      <mac address='fa:16:3e:7a:a3:a2'/>
      <source bridge='br0'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <serial type='pty'>
      <target port='0'/>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
    <input type='mouse' bus='ps2'/>
    <graphics type='vnc' port='-1' autoport='yes'/>
    <video>
      <model type='cirrus' vram='9216' heads='1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </memballoon>
    
  </devices>
</domain>

3. Run the Image

Home Up Down

sudo virsh create cernvm-batch-node-2.7.1-2-3-x86_64.hdd.xml --console

4. Create sysadmin account

Home Up Down

groupadd -g 500 sysadmin
useradd -g sysadmin -u 500 sysadmin
mkdir -p /home/sysadmin/.ssh
chmod 700 /home/sysadmin/.ssh
touch /home/sysadmin/.ssh/authorized_keys
chmod 600 /home/sysadmin/.ssh/authorized_keys
chown -R sysadmin.sysadmin /home/sysadmin/.ssh
vi /home/sysadmin/.bashrc

Add the following contents to the end of the file:

export GLOBUS_LOCATION=/cvmfs/grid.cern.ch/3.2.11-1/globus
export MYPROXY_SERVER=myproxy.cloud.nrc.ca
. /cvmfs/grid.cern.ch/3.2.11-1/globus/etc/globus-user-env.sh 

visudo

And add the following to the user section:

sysadmin ALL=(ALL) NOPASSWD: ALL

5. Create other user accounts:

Home Up Down

groupadd -g 102 condor
useradd -g condor -u 102 -s /sbin/nologin condor
for i in `seq -w 1 32`; do id=$((500+${i#0})); groupadd -g $id atlas$i; useradd -g atlas$i -u $id -s /bin/bash atlas$i; done

6. Optionally, add ssh keys to root for debugging

Home Up Down

sudo vi /root/.ssh/authorized_keys

And insert the desired id_rsa.pub keys.

7. Network tuning:

Home Up Down

sudo vi /etc/sysctl.conf

Add the following to the end of the file:

# Network tuning: http://fasterdata.es.net/fasterdata/host-tuning/linux/
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216 
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.core.netdev_max_backlog = 30000
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_sack = 1

sudo vi /etc/rc.local

Add the following to the end of the file:

# increase txqueuelen for 10G NICS
/sbin/ifconfig eth0 txqueuelen 10000

8. OpenStack Ephemeral Storage Script

Home Up Down

Create a script to format and mount the virtual block device on OpenStack clouds.

sudo vi /etc/init.d/openstack_ephemeral_storage

containing:

#!/bin/bash
#
# openstack_ephemeral_storage
#
# chkconfig: 2345 90 99
# description: Format and mount ephemeral storage on the virtual block device.
#
### BEGIN INIT INFO
# Provides:          openstack_ephemeral_storage
# Required-Start:    
# Required-Stop:
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: openstack_ephemeral_storage daemon
# Description:       The openstack_ephemeral_storage daemon formats and mounts the virtual block device. 
### END INIT INFO

device=/dev/vdb
lockfile=/var/log/openstack_ephemeral_storage
self="`basename $0`"

check_openstack() {
  logger -s -t "$self" "Querying metadata server"
  query=`curl -m 10 http://169.254.169.254/openstack 2> >(logger -s -t "$self")`
  logger -s -t "$self" "Metadata server returned: $query"
  # check if query was successful, and first word matches "latest", or a date in the format "2013-03-08"
  if [[ $? -eq 0 ]] && [[ $query =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2} || $query =~ ^latest ]] ; then 
    logger -s -t "$self" "Cloud type is OpenStack; continuing."
  else
    logger -s -t "$self" "Cloud type is not OpenStack; exiting."
    exit 0
  fi
}

case "$1" in
  start)
    check_openstack
    if [[ ! -f $lockfile ]] ; then
      fs_label=`/sbin/tune2fs -l $device | /bin/awk '/Filesystem volume name:/ {print $4}'`
      logger -s -t "$self" "Current label of $device: $fs_label"
      if [[ "$fs_label" == "ephemeral0" ]] ; then
        logger -s -t "$self" "Re-labelling and mounting pre-formatted volume on $device"
        ( /sbin/tune2fs -L blankpartition0 $device 2>&1 && /bin/mount /scratch 2>&1 ) | logger -s -t "$self"
      else
        logger -s -t "$self" "Formatting and mounting $device"
        ( /sbin/mkfs.ext2 -L blankpartition0 $device 2>&1 && /bin/mount /scratch 2>&1 ) | logger -s -t "$self"
      fi
      touch $lockfile
    fi
    ;;
  stop)
    # unimplemented
    ;;
  status)
    check_openstack
    if [ -f $lockfile ]; then
      echo "The secondary storage $device has been formatted and mounted into /scratch by $0."
    else
      echo "The secondary storage $device hasn't been formatted and mounted into /scratch by $0."
    fi
    ;;
  *)
    echo "Usage: $0 {start|stop|status}"
    exit 1
    ;;
esac
 
exit 0

sudo chmod 755 /etc/init.d/openstack_ephemeral_storage
sudo /sbin/chkconfig --add openstack_ephemeral_storage
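The init script's status action (defined above) can then be used to check whether the device has been formatted and mounted:

sudo service openstack_ephemeral_storage status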

9. Grid Environment:

Home Up Down

Create a grid configuration file:

sudo vi /etc/profile.d/grid-setup.sh

And add the following contents:

# Keep grid setup out of environment for root and sysadmin.
if [[ ! "$USER" =~ ^atlas[0-9]+$ ]] ; then
  return 0
fi

export GLOBUS_FTP_CLIENT_GRIDFTP2=true

# Workaround for condor not setting $HOME for atlas users.
# voms-proxy-info requires this.
if [[ -z "$HOME" ]] ; then
  export HOME=`eval echo ~$USER`
fi

## Set up grid environment:
# Use EMI in AtlasLocalRootBase
export ATLAS_LOCAL_ROOT_BASE=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase
source $ATLAS_LOCAL_ROOT_BASE/user/atlasLocalSetup.sh --quiet
source ${ATLAS_LOCAL_ROOT_BASE}/packageSetups/atlasLocalEmiSetup.sh --quiet
export ALRB_useGridSW=emi
## Fix for using AtlasLocalRootBase with a kit
unset  AtlasSetupSite
rm ~/.asetup

# Site-specific variables (e.g. Frontier and Squid servers) are set based on ATLAS_SITE_NAME (from JDL).
# This auto-setup is only temporarily needed, and will soon become automatic 
. /cvmfs/atlas.cern.ch/repo/sw/local/bin/auto-setup

10. CVMFS configuration NEEDS WORK

Home Up Down

sudo vi /etc/cvmfs/default.local

Add the following to the end of the file:

CVMFS_REPOSITORIES=atlas.cern.ch,atlas-condb.cern.ch,grid.cern.ch
CVMFS_QUOTA_LIMIT=4000
CVMFS_HTTP_PROXY="http://chrysaor.westgrid.ca:3128;http://cernvm-webfs.atlas-canada.ca:3128;DIRECT"
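After changing the CVMFS configuration it is worth running the standard client checks (cvmfs_config ships with the client; treat this as a suggestion rather than part of the original recipe):

sudo cvmfs_config chksetup
sudo cvmfs_config probe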

Note: This could do with some generalization, or maybe this will look very different with shoal in play?

Create a new cern.ch local configuration file:

sudo vi /etc/cvmfs/domain.d/cern.ch.local

with the following content:

# For Europe:
#CVMFS_SERVER_URL=${CERNVM_SERVER_URL:="http://cvmfs-stratum-one.cern.ch:8000/opt/@org@;http://cernvmfs.gridpp.rl.ac.uk:8000/opt/@org@;http://cvmfs.racf.bnl.gov:8000/opt/@org@;http://cvmfs.fnal.gov:8000/opt/@org@;http://cvmfs02.grid.sinica.edu.tw:8000/opt/@org@"}
# For North America:
CVMFS_SERVER_URL=${CERNVM_SERVER_URL:="http://cvmfs.racf.bnl.gov:8000/opt/@org@;http://cvmfs.fnal.gov:8000/opt/@org@;http://cvmfs-stratum-one.cern.ch:8000/opt/@org@;http://cernvmfs.gridpp.rl.ac.uk:8000/opt/@org@;http://cvmfs02.grid.sinica.edu.tw:8000/opt/@org@"}
# For Australia:
#CVMFS_SERVER_URL=${CERNVM_SERVER_URL:="http://cvmfs.fnal.gov:8000/opt/@org@;http://cvmfs.racf.bnl.gov:8000/opt/@org@;http://cernvmfs.gridpp.rl.ac.uk:8000/opt/@org@;http://cvmfs-stratum-one.cern.ch:8000/opt/@org@;http://cvmfs02.grid.sinica.edu.tw:8000/opt/@org@"}

Note: there seems to be a CVMFS bug that prevents this from working. It works if the ${CERNVM_SERVER_URL:= part (and the closing brace) are left out.

11. CernVM configuration:

Home Up Down

sudo vi /etc/cernvm/site.conf

Add the following to the end of the file:

CERNVM_CVMFS2=on
CERNVM_EDITION=Basic
CERNVM_ORGANISATION=atlas
CERNVM_USER_SHELL=/bin/bash
CVMFS_REPOSITORIES=atlas,atlas-condb,grid

12. Blank space partition configuration:

Home Up Down

sudo mkdir -p /scratch
sudo vi /etc/fstab

Add the following to fstab:

LABEL=blankpartition0 /scratch ext2 noatime 1 0

Note: ext4 support exists in the e4fsprogs package in CernVM v2.6. We should try using ext4 with no journaling; the performance should be better. It is disabled like this: tune4fs -O ^has_journal /dev/sdb and you can verify whether the has_journal property is there using /sbin/dumpe4fs /dev/sdb . Or, just create it without journaling in the first place: mkfs.ext4 -O ^has_journal /dev/sdb . Maybe try the nobarrier and noatime options too, although they might not be as significant. However, in the case of Nimbus, some new development will be needed to use ext4 partitions.
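Collected from the note above, the no-journal ext4 experiment would look like this (a sketch only; /dev/sdb is just the device name used in the note):

# create the file system without a journal in the first place
mkfs.ext4 -O ^has_journal /dev/sdb
# or disable journaling on an existing file system
tune4fs -O ^has_journal /dev/sdb
# verify that has_journal is gone
/sbin/dumpe4fs /dev/sdb | grep -i journal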

13. HTCondor configuration:

Home Up Down

Replace the default HTCondor init.d script with the one from Cloud Scheduler github repository:

cd /etc/init.d
sudo mkdir -p /home/condor/old-scripts 
sudo mv condor /home/condor/old-scripts/
sudo chown -R condor.condor /home/condor/old-scripts/
sudo wget https://raw.github.com/hep-gc/cloud-scheduler/master/scripts/condor/worker/condor --no-check-certificate
sudo chmod 755 condor
sudo vi condor

Make the following modification to the condor init.d script:

CONDOR_CONFIG_VAL=/opt/condor/bin/condor_config_val

Customize the HTCondor configuration by retrieving and modifying the sample local configuration from the Cloud Scheduler github repository:

cd /etc/condor
sudo wget https://raw.github.com/hep-gc/cloud-scheduler/dev/scripts/condor/worker/condor_config.local --no-check-certificate
sudo vi condor_config.local

Modify the following lines. Note - the LOCK and EXECUTE directories will be automatically created with the correct ownerships if they don't already exist.

LOCK=/var/lock/condor
LOG=/var/log/condor
RUN=/var/run/condor
SPOOL=/var/lib/condor/spool
EXECUTE=/scratch/condor-execute

And create the LOG, RUN, and SPOOL directories:

  • mkdir /var/log/condor
  • chown condor. /var/log/condor
  • mkdir /var/run/condor
  • chown condor. /var/run/condor
  • mkdir -p /var/lib/condor/spool
  • chown -R condor. /var/lib/condor
and add the following modifications to the bottom of condor_config.local:

## Point to the java executable.
JAVA = /usr/lib/jvm/jre-1.6.0-openjdk.x86_64/bin/java

## Job slot/user accounts:
STARTER_ALLOW_RUNAS_OWNER = False
DEDICATED_EXECUTE_ACCOUNT_REGEXP = atlas[0-9]+
USER_JOB_WRAPPER=/usr/local/bin/condor-job-wrapper
SLOT1_USER = atlas01
SLOT2_USER = atlas02
SLOT3_USER = atlas03
SLOT4_USER = atlas04
SLOT5_USER = atlas05
SLOT6_USER = atlas06
SLOT7_USER = atlas07
SLOT8_USER = atlas08
SLOT9_USER = atlas09
SLOT10_USER = atlas10
SLOT11_USER = atlas11
SLOT12_USER = atlas12
SLOT13_USER = atlas13
SLOT14_USER = atlas14
SLOT15_USER = atlas15
SLOT16_USER = atlas16
SLOT17_USER = atlas17
SLOT18_USER = atlas18
SLOT19_USER = atlas19
SLOT20_USER = atlas20
SLOT21_USER = atlas21
SLOT22_USER = atlas22
SLOT23_USER = atlas23
SLOT24_USER = atlas24
SLOT25_USER = atlas25
SLOT26_USER = atlas26
SLOT27_USER = atlas27
SLOT28_USER = atlas28
SLOT29_USER = atlas29
SLOT30_USER = atlas30
SLOT31_USER = atlas31
SLOT32_USER = atlas32

Create the batch job wrapper script:

sudo vi  /usr/local/bin/condor-job-wrapper

with the following:

#!/bin/bash -l
exec "$@"

and make it executable:

sudo chmod 755 /usr/local/bin/condor-job-wrapper
 