Docker on CernVM
The latest CernVM image ships a modern Docker version, so it is time to revisit the docker universe in HTCondor.
Being able to run Docker on CernVM lets us run different OS versions on the same VM. The base layer of CernVM runs
SL7 natively. A docker container then runs the same software that provides the OS layer of a CernVM, but inside a container.
Jobs running in this container can therefore see a different CernVM repository, which presents a different OS to the job.
This allows us to run software for different experiments that require different versions of RHEL on the same VM.
The setup is now fully integrated with HTCondor in the testing project on csv2-dev2. Here we describe the required steps.
HTCondor configuration on the VM
A recent version of CernVM is required; we use cernvm4-micro-2020.01-1.hdd, which comes with
"Docker version 19.03.5, build 633a0ea".
With cloudscheduler v2 we contextualize the VMs via cloud-init. The relevant file is docker_start.yaml, shown below.
It is appended after all other yaml files, performs the final HTCondor configuration and
installs the required helper scripts:
#cloud-config
merge_type: 'list(append)+dict(recurse_array)+str()'
write_files:
  - content: |
      DOCKER = /usr/bin/docker_wrapper
      DOCKER_VOLUMES = CVMFS_PROD, CVMFS_SL7, CVMFS_SLC5
      DOCKER_VOLUME_DIR_CVMFS_PROD = /cvmfs/cernvm-prod.cern.ch
      DOCKER_VOLUME_DIR_CVMFS_PROD_MOUNT_IF = WantProd =!= Undefined && WantProd
      DOCKER_VOLUME_DIR_CVMFS_SL7 = /cvmfs/cernvm-sl7.cern.ch
      DOCKER_VOLUME_DIR_CVMFS_SL7_MOUNT_IF = WantSl7 =!= Undefined && WantSl7
      DOCKER_VOLUME_DIR_CVMFS_SLC5 = /cvmfs/cernvm-slc5.cern.ch
      DOCKER_VOLUME_DIR_CVMFS_SLC5_MOUNT_IF = WantSlc5 =!= Undefined && WantSlc5
      DOCKER_MOUNT_VOLUMES = CVMFS_PROD, CVMFS_SL7, CVMFS_SLC5
      DOCKER_DROP_ALL_CAPABILITIES = false
      DOCKER_EXTRA_ARGUMENTS = --init
    owner: root:root
    permissions: 0644
    path: /etc/condor/config.d/docker
  - content: |
      #!/bin/bash
      # Wrapper around /usr/bin/docker: drops --user and adds CERNVM_ROOT for CernVM volumes.
      NEW_ARGS=()
      while [[ $# -gt 0 ]]
      do
        case $1 in
          --user)
            shift # past argument
            shift # past value
            ;;
          --volume)
            NEW_ARGS+=("$1")
            ((num_pars++))
            shift
            NEW_ARGS+=("$1")
            if [[ $1 == /cvmfs/cernvm* ]]; then
              # strip a trailing ":<target>" to get the repository path
              mmount=${1%:*}
              if [[ $mmount == /cvmfs/cernvm-sl7* ]]; then
                mmount="CERNVM_ROOT=${mmount}/cvm4"
              else
                mmount="CERNVM_ROOT=${mmount}/cvm3"
              fi
              pos_new=$(expr $num_pars + 1)
              NEW_ARGS+=("-e" "$mmount")
              # NEW_ARGS+=("" "")
              ((num_pars+=2))
            fi
            shift
            ((num_pars++))
            ;;
          *) # unknown option
            NEW_ARGS+=("$1") # save it in an array for later
            shift # past argument
            ;;
        esac
      done
      set -- "${NEW_ARGS[@]}"
      #touch /tmp/dc
      #echo "calling with $num_pars ($pos_new) parameters as /usr/bin/docker $@" >> /tmp/dc
      #if [[ $pos_new -gt 0 ]]; then
      #  echo " $pos_new ${NEW_ARGS[@]:$(expr $pos_new - 1):$(expr $pos_new + 1)}" >> /tmp/dc
      #  echo " with $mmount" >> /tmp/dc
      #  echo " /usr/bin/docker $@" >> /tmp/dc
      #fi
      # quote "$@" so that arguments containing spaces are passed on unchanged
      /usr/bin/docker "$@"
    owner: root:root
    permissions: 0755
    path: /usr/bin/docker_wrapper
  - content: |
      20 * * * * root docker container prune -f --filter "until=1h"
    owner: root:root
    permissions: 0644
    path: /etc/cron.d/docker_prune
runcmd:
  - [ service, docker, start ]
  - [ 'wget', 'https://cernvm.cern.ch/releases/production/cvm-docker.2018.10-2.cernvm.x86_64.tar' ]
  - 'cat cvm-docker.2018.10-2.cernvm.x86_64.tar | docker import - my_cernvm'
  - [ usermod, '-G', docker, condor ]
  - [ usermod, '-G', docker, slot01 ]
  - [ usermod, '-G', docker, slot02 ]
  - [ usermod, '-G', docker, slot03 ]
  - [ usermod, '-G', docker, slot04 ]
  - [ usermod, '-G', docker, slot05 ]
  - [ usermod, '-G', docker, slot06 ]
  - [ usermod, '-G', docker, slot07 ]
  - [ usermod, '-G', docker, slot08 ]
  - [ usermod, '-G', docker, slot09 ]
  - [ usermod, '-G', docker, slot10 ]
  - [ usermod, '-G', docker, slot11 ]
  - [ usermod, '-G', docker, slot12 ]
  - [ usermod, '-G', docker, slot13 ]
  - [ usermod, '-G', docker, slot14 ]
  - [ usermod, '-G', docker, slot15 ]
  - [ usermod, '-G', docker, slot16 ]
  - [ condor_reconfig ]
The condor user and all slot users must be able to execute docker commands, so they are added to the docker group.
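The usermod entries in the runcmd section follow a regular pattern, so the list of accounts could also be generated. A minimal sketch, assuming the slot accounts are named slot01 to slotNN as in the configuration above. The function name slot_users is hypothetical, and the commented usermod call uses -a to keep existing supplementary groups, which differs from the plain -G used in the yaml file:

```shell
#!/bin/sh
# Print the accounts that need docker access: condor plus slot01..slotNN.
# slot_users is a hypothetical helper, not part of HTCondor or CernVM.
slot_users() {
    count=$1
    echo condor
    i=1
    while [ "$i" -le "$count" ]; do
        printf 'slot%02d\n' "$i"
        i=$((i + 1))
    done
}

# Possible use during contextualization (needs root, so left commented out):
#   for user in $(slot_users 16); do usermod -a -G docker "$user"; done
```

Generating the names keeps the list in step with the number of slots if that number changes.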
The HTCondor version currently in use does not remove docker containers once they have been used, hence the cron job
that prunes old containers. This should no longer be necessary with HTCondor 8.8.
CernVM also cannot run as an unprivileged user, because its init step creates symbolic links in the root directory of the
container, and only root is allowed to do that. This is why the wrapper script removes the --user option from
the docker command line. Again, a newer HTCondor might not need the wrapper if it is compiled with the 'right'
options.
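The removal of the user option can be looked at in isolation. The following is a minimal sketch, not the wrapper itself: it prints the surviving arguments one per line instead of calling docker, the function name strip_user_option is hypothetical, and the sample argument list in the comment is invented for illustration.

```shell
#!/bin/sh
# Print all arguments except "--user" and the value that follows it.
# strip_user_option is a hypothetical name used only for this sketch.
strip_user_option() {
    skip_next=0
    for arg in "$@"; do
        if [ "$skip_next" -eq 1 ]; then
            skip_next=0          # this argument is the value of --user
            continue
        fi
        if [ "$arg" = "--user" ]; then
            skip_next=1          # drop the option and remember to drop its value
            continue
        fi
        printf '%s\n' "$arg"
    done
}

# Example with an invented argument list:
#   strip_user_option run --user 1000:1000 --init my_cernvm /init
```

Like the wrapper, the sketch only recognizes the two-argument form; a combined --user=value argument would pass through unchanged.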
The volume part of the wrapper script handles the volume mount options and sets CERNVM_ROOT,
which chooses the OS version used in the docker container. HTCondor supports conditional mounts of docker
volumes: each DOCKER_VOLUME_DIR_*_MOUNT_IF expression in the configuration above is evaluated for the job, so a job
switches a mount on by setting the matching attribute (WantProd, WantSl7 or WantSlc5). If a user requests more than one
volume, the last one in the list defines the content of the CERNVM_ROOT environment variable
inside the docker container and therefore the OS version in the CernVM container.
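The mapping from a mounted repository to CERNVM_ROOT can be sketched as a small function. This mirrors the logic of the wrapper script above (cvm4 for the SL7 repository, cvm3 for the others) but is only an illustration; the function name cernvm_root_for_volume is hypothetical.

```shell
#!/bin/sh
# Print the CERNVM_ROOT assignment for a --volume value.
# Prints nothing for volumes that are not CernVM repositories.
# cernvm_root_for_volume is a hypothetical name used only for this sketch.
cernvm_root_for_volume() {
    volume=$1
    case $volume in
        /cvmfs/cernvm*) ;;       # CernVM repository, handled below
        *) return 0 ;;           # any other volume: no CERNVM_ROOT
    esac
    mount_point=${volume%:*}     # strip a trailing ":<target>" if present
    case $mount_point in
        /cvmfs/cernvm-sl7*) echo "CERNVM_ROOT=${mount_point}/cvm4" ;;
        *)                  echo "CERNVM_ROOT=${mount_point}/cvm3" ;;
    esac
}

# Example:
#   cernvm_root_for_volume /cvmfs/cernvm-slc5.cern.ch
```

Because every CernVM volume yields its own assignment, setting only one of the Want* attributes per job keeps the resulting OS version unambiguous.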
The CernVM docker image is also imported on the VM during contextualization (as my_cernvm), so that jobs can use it immediately.
The following job then selects the docker universe and sets the CERNVM repository to use.
Universe = docker
docker_image = my_cernvm
executable = /init
arguments = /bin/cat /etc/redhat-release
dir = $ENV(HOME)/logs/$ENV(CDATE)
output = $(dir)/docker_$(Cluster).$(Process).out
error = $(dir)/docker_$(Cluster).$(Process).err
log = $(dir)/docker_$(Cluster).$(Process).log
Requirements = group_name =?= "testing" && TARGET.Arch == "x86_64"
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
request_cpus = 1
request_memory = 1500
request_disk = 15G
+WantSlc5 = True
# +WantProd = True
# +WantSl7 = True
queue 2
The executable must be /init, otherwise important symlinks will not be set up properly. The arguments line then holds the actual job command and its arguments, which are executed in this environment.
Contextualization of VM and of docker container providing CVMFS
To contextualize the VM as well as the docker container, we have to test running cloud-init multiple times:
1) contextualization of the VM as before, except that a few steps are delayed until we start the container
2) monitoring and accounting would need to run in the contextualization of the docker container; for this we provide additional yaml files to cloud-init via its -f option.
--
seuster - 2020-03-10