Cloud Scheduler Requirements

This document contains all potential and realized requirements for the cloud scheduler that is being designed as of May 2009. Though the categories below are representative of formal requirements engineering, this document is intended to be informal. Keep requirements brief and categorize them as accurately as possible. Also, please note whether each requirement is absolute (nonnegotiable), flexible, or potential (any ideas are appreciated).

If the categories below do not suffice, or subcategories are required, please add your own.

Note: This is a scratch requirements document for gathering ideas for building a cloud scheduler (whose current form is described in the Cloud Scheduler Design). Additions are welcome. A more formal requirements document will be developed from the ideas and requirements collected here.

Terminology

Herein:

  • VM stands for virtual machine.
  • A "Cloud" refers to a collection of clusters running grid middleware (currently, the Globus Toolkit) as well as cloud software (currently, the Nimbus service). These clusters are connected to a job scheduler that is responsible for distributing jobs to the cloud's clusters.
  • "Cloud resources" is a more general term that refers to the clusters and individual computers that make up the cloud.
  • The "cloud environment" refers to the configuration and status of the VMs in the cloud.

Functional Requirements

(Describe the function of the intended system. What should the scheduler do?)

  • The cloud scheduler will communicate with a Condor job scheduler in order to retrieve information on queued jobs.
    • The scheduler may also support retrieving job information from other job schedulers. (Abstract, job-scheduler-independent interface?)
      • DRMAA (http://www.drmaa.org/) is an API for submitting jobs to job schedulers and controlling them. It works with Grid Engine, Condor, and PBS/Torque.
  • The scheduler will be able to obtain information on clusters (that are part of the cloud). This information may include:
    • Currently running VMs, and on what clusters the VMs are running;
    • Details of currently running VMs (OS, memory, storage, processors);
    • Details of clusters, including how much space is available for creating new VMs;
    • Etc.
  • This cluster information may be obtained by the scheduler by any of the following methods (to be determined by further discussion and design):
    • By querying an MDS responsible for maintaining information on the cloud;
    • By internal storage of cluster information (i.e., by maintaining records of the scheduler's own instructions that have been sent to the cloud);
    • By independently querying the cloud resources and clusters for information.
  • The cloud scheduler will use information on currently queued jobs and on the current cloud environment to determine the VMs that should be running in the cloud.
    • The cloud scheduler will attempt to create a cloud environment that facilitates the quick and efficient completion of queued jobs.
    • Specifically, the cloud scheduler will determine how many VMs of each type should be running, and at which locations in the cloud.
    • The cloud scheduler will send instructions to clusters in the cloud to create and destroy VMs (and possibly pause VMs or move them to different clusters), thereby changing the cloud environment.
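
As an illustration only, the flow described above (read queued jobs through a scheduler-independent interface, compare them with the current cloud environment, and emit create/destroy instructions) might be sketched in Python as follows. Every name here (VMType, Cluster, JobSource, StaticJobSource, plan) is hypothetical, as is the policy of one VM per queued job placed on the first cluster with enough free memory; none of this is a decided design.

```python
from abc import ABC, abstractmethod
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class VMType:
    """Hypothetical description of the VM a job needs."""
    os: str
    memory_mb: int


@dataclass
class Cluster:
    """Hypothetical snapshot of one cluster in the cloud."""
    name: str
    free_memory_mb: int
    running: Counter  # VMType -> number of VMs currently running


class JobSource(ABC):
    """Scheduler-independent view of the job queue (Condor, DRMAA, ...)."""

    @abstractmethod
    def queued_requirements(self):
        """Return a list with the VMType required by each queued job."""


class StaticJobSource(JobSource):
    """Stand-in for a real job scheduler, for illustration only."""

    def __init__(self, requirements):
        self._requirements = list(requirements)

    def queued_requirements(self):
        return list(self._requirements)


def plan(job_source, clusters):
    """Return a list of (action, cluster_name, vm_type) instructions.

    Assumed policy: one VM per queued job. Surplus VMs are destroyed;
    missing VMs are created on the first cluster with enough free memory.
    Memory released by a destroy is not reused within the same pass.
    """
    wanted = Counter(job_source.queued_requirements())
    running = Counter()
    for cluster in clusters:
        running.update(cluster.running)

    actions = []
    # Destroy VMs whose type exceeds current demand.
    for cluster in clusters:
        for vm_type, count in cluster.running.items():
            surplus = max(running[vm_type] - wanted[vm_type], 0)
            for _ in range(min(count, surplus)):
                actions.append(("destroy", cluster.name, vm_type))
                running[vm_type] -= 1

    # Create VMs for demand that is not yet met.
    free = {cluster.name: cluster.free_memory_mb for cluster in clusters}
    for vm_type, count in wanted.items():
        for _ in range(max(count - running[vm_type], 0)):
            target = next(
                (name for name, mem in free.items() if mem >= vm_type.memory_mb),
                None,
            )
            if target is None:
                break  # no cluster can host this VM type right now
            free[target] -= vm_type.memory_mb
            actions.append(("create", target, vm_type))
    return actions
```

A real implementation would replace StaticJobSource with an adapter for Condor (or a DRMAA-based one) and would translate each returned instruction into the corresponding command sent to the cluster.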

Design Requirements

(Describe what standards the intended system should meet. How, and in accordance with what standards or guidelines, will the scheduler be built?)

  • [Potential]: The scheduler will be built into a VM running in the cloud.
  • [Potential]: The initial scheduler prototype will be built in Python, Perl, or another suitable "simple but powerful" language.

  • The cloud scheduler will initially use the workspace command to manipulate workspaces on the cloud (that is, the scheduler must be designed to work with workspace commands).

Client Requirements

(Any suggested requirements from clients of the intended system. These may fit into other categories as well.)

  • The Cloud Scheduler should make monitoring information available, ideally as XML.
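
To give a feel for what XML monitoring output could look like, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The element and attribute names (cloud, cluster, vm, id, os, memory_mb) and the shape of the input dictionary are assumptions made for illustration; no schema has been agreed on.

```python
import xml.etree.ElementTree as ET


def status_to_xml(clusters):
    """Serialize a {cluster_name: [vm_dict, ...]} mapping to an XML string.

    Each vm_dict is assumed to carry "id", "os" and "memory_mb" keys.
    """
    root = ET.Element("cloud")
    for name, vms in clusters.items():
        cluster_el = ET.SubElement(root, "cluster", name=name)
        for vm in vms:
            ET.SubElement(
                cluster_el,
                "vm",
                id=vm["id"],
                os=vm["os"],
                memory_mb=str(vm["memory_mb"]),  # attribute values must be strings
            )
    return ET.tostring(root, encoding="unicode")
```

A client could then parse the result with any XML library, for example ET.fromstring(status_to_xml(...)), and iterate over the cluster and vm elements.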

Performance Requirements

(Describe the performance constraints on the intended system. How fast / with how many resources should the scheduler function?)

-- DuncanPenfoldBrown - 01 Jun 2009


Topic revision: r7 - 2009-06-12 - dpb
 