Overall Architecture
Key Features
- independent of the Cloud Scheduler
- nothing special needed on the user's VMs
- 'watches' the Condor history file; more lightweight than grepping log files
- easy data replication (if needed; provided by MySQL)
- flexible query system from any host (using MySQL client)
- RabbitMQ takes care of message delivery (the data collector can go offline without losing any data)
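To illustrate what 'watching' the history file could look like, here is a minimal Python sketch. It is not the actual nep52accountant code. It assumes the usual Condor history layout, where each completed job is a run of `Attribute = value` lines closed by a line starting with `***`. The function name `read_new_records` and the sample values are hypothetical.

```python
import io


def read_new_records(stream, offset=0):
    """Return (records, new_offset) for jobs appended after `offset`.

    Assumed layout: each job is a run of 'Attr = value' lines closed by a
    line starting with '***'. The offset only advances past complete
    records, so a half-written job is picked up on the next call.
    """
    stream.seek(offset)
    records, record = [], {}
    while True:
        line = stream.readline()
        if not line.endswith("\n"):  # end of file, or a partially written line
            break
        if line.startswith("***"):
            if record:
                records.append(record)
            record = {}
            offset = stream.tell()
        elif " = " in line:
            name, value = line.split(" = ", 1)
            record[name.strip()] = value.strip().strip('"')
    return records, offset


# Toy history content (made-up values) standing in for the real file.
history = io.StringIO(
    'GlobalJobId = "host#1.0#1314000000"\n'
    'Owner = "alice"\n'
    "ExitCode = 0\n"
    "*** Offset = 0 ClusterId = 1 ProcId = 0\n"
    'Owner = "bob"\n'  # record still being written: no '***' line yet
)
jobs, next_offset = read_new_records(history)
```

Remembering `next_offset` between calls means only newly appended jobs are read each time, which is what makes this cheaper than re-scanning log files.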
Online Accounting Stats
Accounting stats are available online via the science.cloud.nrc.ca web portal at:
Condor Setup
- Install the following Python libraries using your favorite method:
- Install the accounting reporting tool
- Download the code from GitHub at https://github.com/hep-gc/nep52accounting
git clone git@github.com:hep-gc/nep52accounting.git
cd nep52accounting/client
- Open the nep52accountant file and make sure the Python executable on the first line of the script is correct. Edit as needed.
- Run make install to install the data reporting component and init.d script:
sudo make install
- Configure the accounting reporting tool
vim /etc/nep52accountant.config
- Start the accounting component:
/etc/init.d/nep52accountant start
- Check the log file to make sure it is running properly:
tail /var/log/nep52accountant.log
- Set it up to start at system boot:
chkconfig --add nep52accountant
Server Setup
Coming soon...
Collected Data
Currently, the following data is collected from every job that completes and has the NEP52ACNT Condor hook set:
GlobalJobId
User
Owner
x509userproxysubject
QDate
JobStartDate
CompletionDate
JobCurrentStartDate
JobDuration (computed)
StageInStart
StageInFinish
CommittedTime
CumulativeSuspensionTime
RemoteSysCpu
RemoteUserCpu
RemoteWallClockTime
LocalSysCpu
LocalUserCpu
BytesSent
ExitCode
DiskUsage
RemoteHost
Cmd
This is just a preliminary set of Condor classad attributes to get things started. More attributes can easily be added at a later date.
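This page does not say how the computed JobDuration value is derived. One plausible reading, shown here purely as an assumption, is the elapsed time between JobStartDate and CompletionDate, which Condor records as Unix timestamps in seconds. The function name `job_duration` and the sample values are hypothetical.

```python
def job_duration(classad):
    """Seconds between job start and completion (assumed definition)."""
    start = int(classad["JobStartDate"])
    end = int(classad["CompletionDate"])
    if end < start:
        raise ValueError("CompletionDate precedes JobStartDate")
    return end - start


# Made-up ClassAd values for illustration only.
job = {
    "QDate": "1314620000",
    "JobStartDate": "1314620400",
    "CompletionDate": "1314624000",
}
duration = job_duration(job)  # 3600 seconds
```

If the real tool uses a different definition (for example, one based on RemoteWallClockTime), the same pattern applies with other attributes.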
Accounting Database Backup
Currently, the accounting database is backed up locally on babar.cloud.nrc.ca on a daily basis, with backup rotation. This is based on a method that uses logrotate (http://scottlinux.com/2011/03/04/rotate-mysql-backups-with-logrotate/).
This should allow us to quickly rewind the database to a prior version if needed.
babar.cloud.nrc.ca is also fully backed up to another system on a daily basis. This keeps a copy of the rotated database backups in case the entire science.cloud.nrc.ca system is lost.
Running Queries Against the Accounting Database
To query the accounting database, use a recent mysql client and connect to babar.cloud.nrc.ca as user nep52acntRO. (Contact Andre.Charbonneau@nrc-cnrc.gc.ca for the password.)
Note that the nep52acntRO user only has SELECT privileges on the nep52.completed_jobs table.
For example:
mysql -h babar.cloud.nrc.ca -u nep52acntRO -p nep52 -e "<mysql query goes here...>"
To store the output in a text file for processing by other applications, redirect the output of the mysql command to a file. For example:
mysql -h babar.cloud.nrc.ca -u nep52acntRO -p nep52 -e "SELECT QDate, JobStartDate FROM completed_jobs ORDER BY QDate" > query.dat
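When its output is redirected, the mysql client writes tab-separated rows with the column names on the first line. A minimal Python sketch for reading such a file follows. The function name `load_query_output` is hypothetical, and the sample rows are made up to stand in for query.dat.

```python
import csv
import io


def load_query_output(stream):
    """Parse tab-separated mysql batch output into a list of dicts."""
    return list(csv.DictReader(stream, delimiter="\t", quoting=csv.QUOTE_NONE))


# Toy content standing in for query.dat (made-up timestamps).
sample = io.StringIO(
    "QDate\tJobStartDate\n"
    "1314620000\t1314620400\n"
    "1314620100\t1314620900\n"
)
rows = load_query_output(sample)

# Time each job spent queued before it started, in seconds.
waits = [int(r["JobStartDate"]) - int(r["QDate"]) for r in rows]
```

For a real file, open it with `open("query.dat", newline="")` and pass the file object to `load_query_output`. Note that NULL database values appear as the literal text NULL in this output, so they need to be filtered out before converting to integers.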
Query Examples
Here you will find a list of commonly used queries. Add your favorite queries here!
# Get total number of completed jobs
SELECT COUNT(*) FROM completed_jobs;
# Get the total amount of job runtime
SELECT SUM(JobDuration) FROM completed_jobs;
# Get number of users
SELECT COUNT(DISTINCT Owner) FROM completed_jobs;
# Get total number of completed jobs per user
SELECT Owner, COUNT(*) FROM completed_jobs GROUP BY Owner;
# Get the total amount of job runtime per user
SELECT Owner, SUM(JobDuration) FROM completed_jobs GROUP BY Owner;
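To try these queries without access to babar.cloud.nrc.ca, the sketch below loads a toy completed_jobs table into an in-memory SQLite database, used here only as a stand-in for MySQL. The rows are made up, only three columns are modelled, and the column types are assumptions, since the real schema is not shown on this page.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE completed_jobs (GlobalJobId TEXT, Owner TEXT, JobDuration INTEGER)"
)
# Made-up rows for illustration only.
conn.executemany(
    "INSERT INTO completed_jobs VALUES (?, ?, ?)",
    [("host#1.0", "alice", 3600), ("host#2.0", "alice", 1800), ("host#3.0", "bob", 600)],
)

total_jobs = conn.execute("SELECT COUNT(*) FROM completed_jobs").fetchone()[0]
total_runtime = conn.execute("SELECT SUM(JobDuration) FROM completed_jobs").fetchone()[0]
user_count = conn.execute("SELECT COUNT(DISTINCT Owner) FROM completed_jobs").fetchone()[0]

# Jobs and runtime per user in a single pass.
per_user = conn.execute(
    "SELECT Owner, COUNT(*), SUM(JobDuration) FROM completed_jobs "
    "GROUP BY Owner ORDER BY Owner"
).fetchall()
```

The last query combines the two per-user examples above into one statement, which saves a round trip when both figures are needed.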
--
AndreCharbonneau - 2011-08-29