Overall Architecture

Key Features

  • independent of the Cloud Scheduler
  • nothing special needed on the user's VMs
  • 'watches' the condor history file; more lightweight than grepping log files
  • easy data replication (if needed; provided by MySQL)
  • flexible query system from any host (using MySQL client)
  • RabbitMQ handles message delivery (the data collector can go offline without losing any data)
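
As an illustration of the 'watching' idea in the feature list above, here is a minimal sketch of how new job records could be picked up from a Condor history file. It is not taken from the nep52accountant source. It assumes the usual history layout of one `Attribute = value` pair per line, with each job record closed by a line starting with `***`. The function name `read_new_records` and the offset bookkeeping are hypothetical.

```python
def read_new_records(path, offset=0):
    """Return (records, new_offset) for complete job records found after offset.

    ASSUMPTION: each record is a run of 'Name = value' lines closed by a
    line starting with '***'. Hypothetical helper, not the tool's own code.
    """
    records = []
    record = {}
    new_offset = offset
    with open(path, "rb") as handle:
        handle.seek(offset)
        while True:
            raw = handle.readline()
            if not raw.endswith(b"\n"):
                break  # end of file, or a line that is still being written
            line = raw.decode("utf-8", "replace").strip()
            if line.startswith("***"):
                # Record separator: the job collected so far is complete.
                if record:
                    records.append(record)
                record = {}
                # Only advance past data that belongs to complete records.
                new_offset = handle.tell()
            elif " = " in line:
                name, value = line.split(" = ", 1)
                record[name] = value.strip('"')
    return records, new_offset
```

Calling the function periodically with the offset returned by the previous call reads only the jobs appended since then. A record that is still being written is left for the next call.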

Online Accounting Stats

Accounting stats are available online via the science.cloud.nrc.ca web portal at:

Condor Setup

  1. Install the following python libraries using your favorite method:
    • pika
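    • json (included in the Python standard library since version 2.6, so no separate install is needed there)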
  2. Install the accounting reporting tool
    1. Download the code from github at https://github.com/hep-gc/nep52accounting
      git clone git@github.com:hep-gc/nep52accounting.git
      cd nep52accounting/client
            
    2. Open the nep52accountant file and check that the interpreter path on the first line of the script points to a valid Python executable. Edit as needed.
    3. Run make install to install the data reporting component and init.d script
      sudo make install
            
  3. Configure the accounting reporting tool
    vim /etc/nep52accountant.config
       
  4. Start the accounting component:
    /etc/init.d/nep52accountant start
       
  5. Check the log file to make sure it is running correctly:
    tail /var/log/nep52accountant.log
       
  6. Set it up to start at system boot:
    chkconfig --add nep52accountant
       

Server Setup

Coming soon...

Collected Data

Currently, the following data is collected from every job that completes and has the NEP52ACNT Condor hook set:

GlobalJobId
User
Owner
x509userproxysubject
QDate
JobStartDate
CompletionDate
JobCurrentStartDate
JobDuration (computed)
StageInStart
StageInFinish
CommittedTime
CumulativeSuspensionTime
RemoteSysCpu
RemoteUserCpu
RemoteWallClockTime
LocalSysCpu
LocalUserCpu
BytesSent
ExitCode
DiskUsage
RemoteHost
Cmd

This is just a preliminary set of Condor classad attributes to get things started. More attributes can easily be added at a later date.
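
As a rough sketch of how a completed job's classad could be reduced to the attributes listed above and serialized for delivery, consider the following. The function name `build_message` and the JSON layout are hypothetical. The source says only that JobDuration is computed. The formula used here (CompletionDate minus JobCurrentStartDate) is an assumption, not the tool's documented behaviour.

```python
import json

# Attribute names copied from the list above (JobDuration is derived below).
COLLECTED = [
    "GlobalJobId", "User", "Owner", "x509userproxysubject", "QDate",
    "JobStartDate", "CompletionDate", "JobCurrentStartDate", "StageInStart",
    "StageInFinish", "CommittedTime", "CumulativeSuspensionTime",
    "RemoteSysCpu", "RemoteUserCpu", "RemoteWallClockTime", "LocalSysCpu",
    "LocalUserCpu", "BytesSent", "ExitCode", "DiskUsage", "RemoteHost", "Cmd",
]


def build_message(classad):
    """Return a JSON string with the collected attributes of one job.

    `classad` is a dict of attribute name -> value; attributes that are
    missing from the job are stored as null.
    """
    record = {name: classad.get(name) for name in COLLECTED}
    # ASSUMPTION: duration = completion time - start of the last run.
    start = classad.get("JobCurrentStartDate")
    end = classad.get("CompletionDate")
    if start is not None and end is not None:
        record["JobDuration"] = int(end) - int(start)
    else:
        record["JobDuration"] = None
    return json.dumps(record, sort_keys=True)
```

Adding an attribute then amounts to appending its name to `COLLECTED`, provided the receiving side stores it as well.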

Accounting Database Backup

Currently, the accounting database is backed up locally on babar.cloud.nrc.ca on a daily basis, with backup rotation. This is based on a method that uses logrotate (http://scottlinux.com/2011/03/04/rotate-mysql-backups-with-logrotate/).

This should allow us to quickly rewind the database to a prior version if needed. babar.cloud.nrc.ca gets fully backed up to another system on a daily basis. This keeps a copy of the rotated database backups in case the entire science.cloud.nrc.ca system is lost.

Running Queries Against the Accounting Database

To query the accounting database, use a recent mysql client and connect to babar.cloud.nrc.ca as user nep52acntRO. (Contact Andre.Charbonneau@nrc-cnrc.gc.ca for the password.) Note that the nep52acntRO user has only SELECT privileges on the nep52.completed_jobs table.

For example:

mysql -h babar.cloud.nrc.ca -u nep52acntRO -p nep52 -e "<mysql query goes here...>"

If you want the output to be stored in a text file to be included in other applications for processing, simply redirect the output of the mysql command to a file. For example:

mysql -h babar.cloud.nrc.ca -u nep52acntRO -p nep52 -e "SELECT QDate, JobStartDate FROM completed_jobs ORDER BY QDate" > query.dat
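
When its output is redirected, the mysql client writes tab-separated columns with a header row. A file such as query.dat can therefore be loaded with a few lines of Python. The sketch below shows one way to do that. The function name `read_query_output` is hypothetical, and values are kept as strings exactly as mysql printed them (a NULL column appears as the text NULL).

```python
import csv


def read_query_output(handle):
    """Return the rows of a tab-separated mysql result as a list of dicts.

    The first line is taken as the header row with the column names.
    """
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return list(reader)


if __name__ == "__main__":
    # query.dat is the file produced by the redirect shown above.
    with open("query.dat", newline="") as handle:
        for row in read_query_output(handle):
            print(row["QDate"], row["JobStartDate"])
```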

Query Examples

Here you will find a list of commonly used queries. Add your favorite queries here!

# Get total number of completed jobs
SELECT COUNT(*) FROM completed_jobs;

# Get the total amount of job runtime
SELECT SUM(JobDuration) FROM completed_jobs;

# Get number of users
SELECT COUNT(DISTINCT Owner) FROM completed_jobs;

# Get total number of completed jobs per user
SELECT Owner, COUNT(*) FROM completed_jobs GROUP BY Owner;

# Get the total amount of job runtime per user
SELECT Owner, SUM(JobDuration) FROM completed_jobs GROUP BY Owner;
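
To try out queries like these without access to the accounting database, an in-memory SQLite table can serve as a stand-in. The sketch below is only that: the two-column schema and the sample rows are invented for illustration, and the real completed_jobs table lives in MySQL with many more columns. It combines the two per-user queries above into one.

```python
import sqlite3


def make_sample_db():
    """Create an in-memory stand-in for completed_jobs (invented data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE completed_jobs (Owner TEXT, JobDuration INTEGER)")
    conn.executemany(
        "INSERT INTO completed_jobs VALUES (?, ?)",
        [("alice", 120), ("alice", 300), ("bob", 60)],
    )
    return conn


def per_user_summary(conn):
    """Return (Owner, job count, total runtime) tuples, one per user."""
    query = (
        "SELECT Owner, COUNT(*), SUM(JobDuration) "
        "FROM completed_jobs GROUP BY Owner ORDER BY Owner"
    )
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for owner, jobs, runtime in per_user_summary(make_sample_db()):
        print(owner, jobs, runtime)
```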

-- AndreCharbonneau - 2011-08-29

Topic revision: r11 - 2011-11-22 - andrec
 