Docker on CernVM

Since the latest CernVM image, a modern Docker version is available, so it is time to revisit the docker universe in HTCondor. Being able to run Docker on CernVM lets us run different OS versions on the same VM. The base layer of CernVM runs SL7 natively. A Docker container then runs the same software that provides the OS layer of a CernVM, but inside a container. Jobs that run in this container see a different CernVM repository, which provides a different OS to them. This allows us to run software for different experiments that require different versions of RHEL on the same VM. This is now fully integrated with HTCondor in the testing project on csv2-dev2. Here we describe the required steps for this approach.


HTCondor configuration on the VM

A recent version of CernVM is required; we use cernvm4-micro-2020.01-1.hdd, which comes with "Docker version 19.03.5, build 633a0ea". With cloudscheduler v2 we contextualize the VMs via cloud-init. The relevant file is docker_start.yaml, shown here. It is appended after all other yaml files so that it is processed at the very end; it does the final HTCondor configuration and installs the required helper scripts:
merge_type: 'list(append)+dict(recurse_array)+str()'
write_files:
-   content: |
        DOCKER = /usr/bin/docker_wrapper
        DOCKER_VOLUME_DIR_CVMFS_PROD = /cvmfs/cernvm-prod.cern.ch
        DOCKER_VOLUME_DIR_CVMFS_PROD_MOUNT_IF = WantProd =!= Undefined && WantProd
        DOCKER_VOLUME_DIR_CVMFS_SL7 = /cvmfs/cernvm-sl7.cern.ch
        DOCKER_VOLUME_DIR_CVMFS_SL7_MOUNT_IF = WantSl7 =!= Undefined && WantSl7
        DOCKER_VOLUME_DIR_CVMFS_SLC5 = /cvmfs/cernvm-slc5.cern.ch
        DOCKER_VOLUME_DIR_CVMFS_SLC5_MOUNT_IF = WantSlc5 =!= Undefined && WantSlc5
    owner: root:root
    permissions: 0644
    path: /etc/condor/config.d/docker
-   content: |
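        #!/bin/bash
        # NOTE: several lines of this script were lost in this copy of the page. The
        # case patterns (--user, -v|--volume), the mmount assignment and the
        # CERNVM_ROOT= prefix are reconstructed assumptions, not the original code.
        NEW_ARGS=(); num_pars=0; pos_new=0; mmount=""
        while [[ $# -gt 0 ]]; do
            case $1 in
                --user)   # assumed: drop the user option so the container runs as root
                    shift # past argument
                    shift # past value
                    ;;
                -v|--volume)   # assumed: keep the mount and derive CERNVM_ROOT from it
                    NEW_ARGS+=("$1" "$2")
                    shift # past argument
                    if [[ $1 == /cvmfs/cernvm* ]]; then
                        mmount=${1%%:*}   # assumed: host path before the first colon
                        if [[ $mmount == /cvmfs/cernvm-sl7* ]]; then
                            :   # sl7-specific handling lost in this copy
                        fi
                        num_pars=${#NEW_ARGS[@]}
                        pos_new=$(expr $num_pars + 1)
                        NEW_ARGS+=("-e" "CERNVM_ROOT=$mmount")
                    fi
                    shift # past value
                    ;;
                *)    # unknown option
                    NEW_ARGS+=("$1") # save it in an array for later
                    shift # past argument
                    ;;
            esac
        done
        set -- "${NEW_ARGS[@]}"
        #touch /tmp/dc
        #echo "calling with $num_pars ($pos_new) parameters as /usr/bin/docker $@" >> /tmp/dc
        #if [[ $pos_new -gt 0 ]]; then
        #   echo " $pos_new ${NEW_ARGS[@]:$(expr $pos_new - 1):$(expr $pos_new + 1)}" >> /tmp/dc
        #   echo "    with $mmount" >> /tmp/dc
        #   echo "    /usr/bin/docker $@" >> /tmp/dc
        #fi
        /usr/bin/docker "$@"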
    owner: root:root
    permissions: 0755
    path: /usr/bin/docker_wrapper
-   content: |
        20 * * * * root docker container prune -f --filter "until=1h"
    owner: root:root
    permissions: 0644
    path: /etc/cron.d/docker_prune
runcmd:
 - [ service, docker, start ]
 - [ 'wget', 'https://cernvm.cern.ch/releases/production/cvm-docker.2018.10-2.cernvm.x86_64.tar' ]
 - 'cat cvm-docker.2018.10-2.cernvm.x86_64.tar | docker import - my_cernvm'
 - [ usermod, '-G', docker, condor ]
 - [ usermod, '-G', docker, slot01 ]
 - [ usermod, '-G', docker, slot02 ]
 - [ usermod, '-G', docker, slot03 ]
 - [ usermod, '-G', docker, slot04 ]
 - [ usermod, '-G', docker, slot05 ]
 - [ usermod, '-G', docker, slot06 ]
 - [ usermod, '-G', docker, slot07 ]
 - [ usermod, '-G', docker, slot08 ]
 - [ usermod, '-G', docker, slot09 ]
 - [ usermod, '-G', docker, slot10 ]
 - [ usermod, '-G', docker, slot11 ]
 - [ usermod, '-G', docker, slot12 ]
 - [ usermod, '-G', docker, slot13 ]
 - [ usermod, '-G', docker, slot14 ]
 - [ usermod, '-G', docker, slot15 ]
 - [ usermod, '-G', docker, slot16 ]
 - [ condor_reconfig ]
The condor user and all slot users must be able to execute docker commands, so they are added to the docker group. The current HTCondor version does not clean up old, used Docker containers, hence the hourly cron job that prunes stopped containers created more than an hour ago. This should not be necessary with HTCondor 8.8.
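
To check the group membership on a running VM, something like the following could be used. It is only a sketch: the function names docker_users and missing_from_group are made up for this page, and the account names (condor, slot01 to slot16) are taken from the configuration above.

```shell
# Print the account names that need docker access: condor plus slot01..slotNN.
# The slot count is a parameter; the configuration above uses 16.
docker_users() {
    n=${1:-16}
    printf 'condor\n'
    i=1
    while [ "$i" -le "$n" ]; do
        printf 'slot%02d\n' "$i"
        i=$((i + 1))
    done
}

# Print every listed account that is not in the given group.
# Accounts that do not exist are printed as well.
missing_from_group() {
    group=$1
    shift
    for user in "$@"; do
        if ! id -nG "$user" 2>/dev/null | tr ' ' '\n' | grep -qx -- "$group"; then
            printf '%s\n' "$user"
        fi
    done
}

# Example on a worker VM (prints nothing if everything is set up):
#   missing_from_group docker $(docker_users 16)
```

An empty result means that all accounts can talk to the docker daemon through the docker group.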

Also, CernVM cannot run as an unprivileged user: its init step creates symbolic links in the root directory of the container, and only root is allowed to do that. Hence the wrapper script, which removes the user option from the docker command line. Again, a newer HTCondor might not need the wrapper if it is compiled with the 'right' options.

The volume part of the wrapper script handles the volume mount options and sets CERNVM_ROOT, which chooses the OS version used in the Docker container. HTCondor provides conditional mounts of Docker volumes that can be switched on and off via configuration variables: each DOCKER_VOLUME_DIR_..._MOUNT_IF expression above is evaluated against the job, so a volume is mounted only if the job sets the matching Want... attribute to True. If a user requests more than one volume, the last one in the list defines the content of the CERNVM_ROOT environment variable inside the Docker container, and with it the OS version in the CernVM container.
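
The "last volume wins" rule can be illustrated with a small sketch. The function name last_cernvm_root is hypothetical, and it assumes that the volume arrives as "-v host:container" (or "--volume") and that CERNVM_ROOT is simply the host path of the mount; the real wrapper may derive the value differently.

```shell
# Print the CERNVM_ROOT value implied by a list of docker arguments:
# the host path of the last "-v"/"--volume" mount under /cvmfs/cernvm*.
# Prints nothing if no such mount is present.
last_cernvm_root() {
    root=""
    while [ "$#" -gt 0 ]; do
        case $1 in
            -v|--volume)
                if [ "$#" -ge 2 ]; then
                    case $2 in
                        /cvmfs/cernvm*) root=${2%%:*} ;;   # strip ":container_path"
                    esac
                    shift   # past the option
                fi
                shift       # past the value (or the lone option)
                ;;
            *)
                shift
                ;;
        esac
    done
    if [ -n "$root" ]; then
        printf '%s\n' "$root"
    fi
}

# Two CernVM volumes requested: the later one (slc5) is selected.
last_cernvm_root run \
    -v /cvmfs/cernvm-prod.cern.ch:/cvmfs/cernvm-prod.cern.ch \
    -v /cvmfs/cernvm-slc5.cern.ch:/cvmfs/cernvm-slc5.cern.ch \
    my_cernvm /init
```

A job that sets both +WantProd and +WantSlc5 would therefore end up with whichever repository comes last on the docker command line, so it is safer to request only one.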

Also, the CernVM Docker image is preinstalled on the VM (imported as my_cernvm), so that jobs can use it immediately.

The following job then selects the docker universe and sets the CernVM repository to use.

Universe   = docker
docker_image            = my_cernvm
executable              = /init
arguments               = /bin/cat /etc/redhat-release
dir           = $ENV(HOME)/logs/$ENV(CDATE)
output        = $(dir)/docker_$(Cluster).$(Process).out
error         = $(dir)/docker_$(Cluster).$(Process).err
log           = $(dir)/docker_$(Cluster).$(Process).log

Requirements = group_name =?= "testing" && TARGET.Arch == "x86_64" 
should_transfer_files = YES
when_to_transfer_output = ON_EXIT

request_cpus = 1
request_memory = 1500
request_disk = 15G

+WantSlc5 = True
# +WantProd = True
# +WantSl7 = True

queue 2

The executable must be /init; otherwise important symlinks will not be set up properly. The arguments then contain the job to execute in this environment, together with its own arguments.
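
To see how /init and the job arguments fit together, the image can also be tried by hand on the VM. The sketch below only prints a docker command line; it is not what HTCondor runs literally. The function name manual_init_cmd is hypothetical, and using the repository path both as the mount and as CERNVM_ROOT is an assumption.

```shell
# Print (not run) a docker command line that starts /init in the given image
# and passes the job command to it as arguments.
# $1 = image name, $2 = CernVM repository path, remaining arguments = job command.
manual_init_cmd() {
    image=$1
    repo=$2
    shift 2
    printf 'docker run --rm -v %s:%s -e CERNVM_ROOT=%s %s /init' \
        "$repo" "$repo" "$repo" "$image"
    for arg in "$@"; do
        printf ' %s' "$arg"
    done
    printf '\n'
}

# Same job as in the submit file above, with the SLC5 repository selected.
manual_init_cmd my_cernvm /cvmfs/cernvm-slc5.cern.ch /bin/cat /etc/redhat-release
```

Running the printed command as root should show the release file of the selected OS version, provided the assumptions above match the wrapper that is actually deployed.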

Contextualization of VM and of docker container providing CVMFS

To contextualize the VM as well as the Docker container, we have to test running cloud-init multiple times:
1) contextualization of the VM as before, except that a few steps are delayed until we start the container;
2) monitoring and accounting would need to run in the contextualization of the Docker container. For this we provide additional yaml files to cloud-init via its -f option.

-- seuster - 2020-03-10
Topic revision: r4 - 2020-03-10 - seuster