This documentation describes a simple recommended frontier-squid configuration, for use with Shoal. It is a simplified subset of information from https://twiki.comp.uvic.ca/twiki/bin/view/Unix/AtlasSquid (restricted access). Refer to that parent document periodically for updated information. This setup is intended for a small lightweight squid running in a VM with very modest amounts of RAM and storage, to handle (some of) the load of worker VMs running in that cloud.

See also the details about shoal-agent configuration: ShoalAgentSquidInstallation

Squid Service Information

Squid is a caching tool used as part of the Frontier conditions data distribution system, and the CVMFS software distribution system. Frontier replicates the central database of conditions data from the Tier 0 to Tier 1 and 2 sites around the world. Conditions data (not to be confused with physics data or Monte Carlo data) can be thought of as metadata about the ATLAS detector itself; for example, detector calibration and alignment information, and beam conditions during a run.

For CVMFS, each worker node fills its local CVMFS cache on an as-needed basis when software files are requested, by connecting to a Stratum-1 CVMFS server through a Squid proxy server.

Squid servers are used at Tier 2 and 3 sites to prevent each worker node from individually querying a Frontier or CVMFS server; instead the Squid caches the information from the server and returns it to each worker node. Together, Frontier/CVMFS (top-down) and Squid (bottom-up) make up a hierarchical caching system spanning from a central server all the way to each worker node at every Tier 2 and 3 site. The Squid server used by an ATLAS job is determined by:

  • for Frontier: the FRONTIER_SERVER environment variable set by the job itself, as defined in AGIS (the ATLAS Grid Information System)
  • for CVMFS: the CVMFS_HTTP_PROXY configuration property. Multiple Squid servers can be listed, as an ordered list and/or in redundancy groups
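As a sketch of the CVMFS side, the ordered-list and redundancy-group syntax of CVMFS_HTTP_PROXY looks like this (the squid host names below are placeholders, not real servers):

```shell
# /etc/cvmfs/default.local (hypothetical host names)
# "|" separates load-balanced proxies within a redundancy group;
# ";" separates ordered fallback groups, tried left to right.
CVMFS_HTTP_PROXY="http://squid1.example.org:3128|http://squid2.example.org:3128;http://backupsquid.example.org:3128"
```

Here the client load-balances across squid1 and squid2, and falls back to backupsquid only if neither responds.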
Important files:
  • /etc/init.d/frontier-squid # {start|stop|restart|condrestart|reload|status|cleancache|removecache|rotate|rotateiflarge}
  • The log files in /var/log/squid
    • access.log Access log
    • cache.log Cache log
    • note that squid will panic and die if it can't write to its log files!
  • /var/cache/squid/ Cache directory
  • /etc/cron.d/frontier-squid.cron Log rotation cron jobs
  • /etc/squid/customize.sh Any customizations of the squid configuration should be done here

Documentation

An example access.log entry:

206.12.48.69 - - [27/Oct/2015:10:37:48 +0000] "GET http://cern.ch/atlas-computing/links/kitsDirectory/EvgenJobOpts/MC15JobOpts-00-01-74_v0.tar.gz HTTP/1.0" 302 628 TCP_HIT:DIRECT 320 "-"
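The whitespace-separated fields of such a line can be pulled apart with awk; a minimal sketch using the sample entry above (TCP_HIT indicates the object was served from the cache):

```shell
# Sample access.log line; fields of interest are the client IP,
# the HTTP status code, and the cache result code.
line='206.12.48.69 - - [27/Oct/2015:10:37:48 +0000] "GET http://cern.ch/atlas-computing/links/kitsDirectory/EvgenJobOpts/MC15JobOpts-00-01-74_v0.tar.gz HTTP/1.0" 302 628 TCP_HIT:DIRECT 320 "-"'
echo "$line" | awk '{print "client:", $1, "status:", $(NF-4), "cache:", $(NF-2)}'
```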

Future Plans

Frontier-squid v3.5 has several differences but is not yet ready for use: there are "If-Modified-Since" bugs affecting Frontier access. In frontier-squid v3.5, multi-process squids share the same log files and monitoring ports across all the workers. This will allow monitoring of multi-process squids to work correctly, so they can finally be considered fully production-ready. (Note that the configuration needed to set up multi-process squids has changed too.)

Also, there may be a way to address the collapsed_forwarding issue in CVM-878 (namely, one slow client can slow down access for every other client accessing the same file). The read_ahead_gap option causes squid to read from the origin server as fast as it can into memory, and then serve the object from there to each client as quickly or as slowly as that client needs. This would address the collapsed_forwarding issue, but it can use a lot of CPU and memory, and is quite slow for large files. However, squid v3.5 has a much faster memory cache implementation, so read_ahead_gap may be more efficient there than in v2.

Installation

OS

Use SL6 x86_64.

Pre-configuration

  • Allow access in iptables and ACLs/SecurityGroups:
    • on port 3128 from everywhere
    • optionally:
      • on port 3401 (UDP only) for monitoring from CERN and FNAL. Use the subnets defined for the HOST_MONITOR ACL in /etc/squid/squid.conf.frontierdefault
      • on port 3401 from any other monitoring system you wish to use
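The rules above might look like the following with iptables. The monitoring subnet shown is a placeholder; substitute the actual subnets defined for the HOST_MONITOR ACL in /etc/squid/squid.conf.frontierdefault:

```shell
# Squid proxy port, open to everywhere (tighten to your clouds' subnets if preferred)
iptables -A INPUT -p tcp --dport 3128 -j ACCEPT
# SNMP monitoring port (UDP only), restricted to the monitoring subnets.
# 192.0.2.0/24 is a placeholder; use the HOST_MONITOR subnets here.
iptables -A INPUT -p udp --dport 3401 -s 192.0.2.0/24 -j ACCEPT
```

Remember to make equivalent openings in any cloud-level security groups as well.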

Filesystems

  • /sbin/lvcreate -L 80G -n var.cache.squid rootvg
  • /sbin/lvcreate -L 40G -n var.log.squid rootvg
  • /sbin/mkfs.ext4 /dev/rootvg/var.cache.squid
  • /sbin/mkfs.ext4 /dev/rootvg/var.log.squid
  • Update fstab and mount the filesystems
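The fstab entries for the two logical volumes created above would look roughly like this (device paths follow the lvcreate names; mount points must exist before running mount -a):

```
# /etc/fstab additions for the squid cache and log filesystems
/dev/rootvg/var.cache.squid  /var/cache/squid  ext4  defaults  0 0
/dev/rootvg/var.log.squid    /var/log/squid    ext4  defaults  0 0
```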

Install Squid

  • Install UMD repo. Alternatively:
    rpm --import http://frontier.cern.ch/dist/rpms/cernFrontierGpgPublicKey
    yum install http://frontier.cern.ch/dist/rpms/RPMS/noarch/frontier-release-1.1-1.noarch.rpm
    yum install frontier-squid
  • Set the ulimit for open files.
    • echo "ulimit -n 8192" >> /etc/sysconfig/frontier-squid
  • Choose a suitable max size for the squid log rotation (see the frontier-squid documentation for guidance). As a rule of thumb, the available space should be at least 2.5 times the value of SQUID_MAX_ACCESS_LOG
    • Then e.g. echo "export SQUID_MAX_ACCESS_LOG=8G" >> /etc/sysconfig/frontier-squid
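A quick sanity check of the rule of thumb, using the 40G /var/log/squid filesystem created earlier (a sketch; set fs_gb to your actual filesystem size):

```shell
# Largest safe SQUID_MAX_ACCESS_LOG = available space / 2.5
fs_gb=40
max_log_gb=$(( fs_gb * 10 / 25 ))   # integer arithmetic for 40 / 2.5
echo "largest safe SQUID_MAX_ACCESS_LOG: ${max_log_gb}G"
```

With 40G available the limit works out to 16G, so the 8G chosen above leaves comfortable headroom.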

Configure Squid

Configuration changes must be made in the /etc/squid/customize.sh script, not the squid.conf file, since that file can be changed by upgrades or service restarts.

The contents of customize.sh should be something like this:

awk --file `dirname $0`/customhelps.awk --source '{
# To allow destination-based access. See https://twiki.cern.ch/twiki/bin/view/Frontier/InstallSquid#Restricting_the_destination
uncomment("acl MAJOR_CVMFS")
uncomment("acl ATLAS_FRONTIER")
insertline("^# http_access deny !RESTRICT_DEST", "http_access allow MAJOR_CVMFS")
insertline("^# http_access deny !RESTRICT_DEST", "http_access allow ATLAS_FRONTIER")

# fix for access to nightlies CVMFS repo
insertline("^# acl RESTRICT_DEST dstdom_regex", "acl ATLAS_NIGHTLIES dstdom_regex ^(cvmfs-atlas-nightlies\\.cern\\.ch)$")
insertline("^# http_access deny !RESTRICT_DEST", "http_access allow ATLAS_NIGHTLIES")

# To allow source-based access.
#setoption("acl NET_LOCAL src", "<list of subnets>")

setoption("refresh_stale_hit", "5 seconds") #Try to avoid "clientProcessExpired: collapsed request STALE!" errors
# The amount of RAM to use for caching
setoption("cache_mem", "4000 MB")
# Set this to 95% of the cache directory size in MB 
setoptionparameter("cache_dir", 3, "38000")

print
}'
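The 95% rule for the cache_dir parameter is simple arithmetic; the sketch below assumes decimal GB, which makes the 38000 above correspond to a 40 GB cache directory. Scale cache_fs_gb to your actual cache filesystem size:

```shell
# cache_dir size = 95% of the cache filesystem size, in MB (decimal GB assumed)
cache_fs_gb=40
cache_dir_mb=$(( cache_fs_gb * 1000 * 95 / 100 ))
echo "setoptionparameter cache_dir value: ${cache_dir_mb}"
```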

To regenerate the squid.conf file from customize.sh and apply the changes without restarting squid (which avoids clearing the cache), run /sbin/service frontier-squid reload; you can then inspect the generated /etc/squid/squid.conf to check the result.

Start Squid

  • /etc/init.d/frontier-squid start
  • /sbin/chkconfig frontier-squid on

Shoal Agent

See ShoalAgentSquidInstallation.

System Testing

Component Tests

  • Make sure that the squid cache and squid log filesystems have free space
  • /etc/init.d/frontier-squid status
  • Try connecting to port 3128 of the squid server
  • You can get statistics about the squid server, including the number of available and used file descriptors, by running sudo squidclient mgr:info.
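The mgr:info report is plain text and easy to filter. The sketch below greps file-descriptor lines from a captured sample (the numbers are made up for the example; on a live server pipe squidclient mgr:info into the grep instead):

```shell
# Illustrative mgr:info excerpt; the field names match squid's info report,
# but the numbers here are invented for the example.
info='Maximum number of file descriptors:   8192
Available number of file descriptors: 8150'
printf '%s\n' "$info" | grep -i 'file descri'
```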

Comprehensive Tests

Test Frontier access on a worker VM

#!/bin/bash

SQUID_SERVER="mysquid.example.com"

export ATLAS_LOCAL_ROOT_BASE=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase
source ${ATLAS_LOCAL_ROOT_BASE}/user/atlasLocalSetup.sh --quiet
source ${ATLAS_LOCAL_ROOT_BASE}/packageSetups/atlasLocalDiagnostics.sh --quiet
export FRONTIER_SERVER="(serverurl=http://frontier.triumf.ca:3128/ATLAS_frontier)(serverurl=http://atlasfrontier-ai.cern.ch:8000/atlr)(serverurl=http://ccfrontier.in2p3.fr:23128/ccin2p3-AtlasFrontier)(serverurl=http://lcgft-atlas.gridpp.rl.ac.uk:3128/frontierATLAS)(proxyurl=http://${SQUID_SERVER}:3128)"
${ATLAS_LOCAL_ROOT_BASE}/utilities/fngetTest.sh

Test CVMFS access on a worker VM

  • cvmfs_config chksetup will test access to all proxy/server combinations that the CVMFS client is currently configured to use

-- rptaylor - 2015-12-14
