Cloud Scheduler: Introduction and Design Concepts

The cloud scheduler is software that sits within a cloud and is responsible for creating and manipulating cloud resources in order to create an effective (fast, efficient, correct, etc.) environment for job execution. The scheduler does this based on information about the current cloud environment (how many of what type of virtual machines are running on the cloud, and where) and on information from a traditional job scheduler.

Cloud Scheduler on the Grid

  • The basic cloud scheduler environment:
[Figure: cloud scheduler environment. Image: Patrick Armstrong]

This image shows the location, function, and interactions of the cloud scheduler in a cloud environment. In brief, the system works as follows. A user submits jobs to a job scheduler, which queues them and distributes them to available resources on the cloud. The cloud scheduler queries the job scheduler for its current job information, such as how many jobs are queued and what types of resources those jobs require. The cloud scheduler also has access to the current status of the cloud, including which virtual machines are running on which physical machines in the grid. This cloud information will be gathered by a separate MDS (Monitoring and Discovery Service) that is responsible for keeping it accurate and up to date. Note that the cloud MDS is not pictured above.

Based on the information from the job scheduler and the cloud MDS, the cloud scheduler will decide which cloud nodes (virtual machines) to create in order to best suit the jobs waiting in the job scheduler. That is, the cloud scheduler will create, destroy, or move virtual machines in the cloud to produce an efficient job execution environment, given the queued jobs and the current cloud configuration.
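As a rough illustration of this decision step, the sketch below compares what the queued jobs need with what is already running and derives which VMs to create or destroy. It is only a sketch under simplifying assumptions: each job is assumed to need exactly one VM of a named type, moving VMs is ignored, and the function name and VM type names are hypothetical.

```python
from collections import Counter

def plan_vm_changes(queued_jobs, running_vms):
    """Return (to_create, to_destroy) as Counters keyed by VM type.

    queued_jobs: one VM type name per queued job (from the job scheduler).
    running_vms: one VM type name per running VM (from the cloud MDS).
    """
    demand = Counter(queued_jobs)
    supply = Counter(running_vms)
    # Counter subtraction keeps only positive counts.
    to_create = demand - supply   # VM types the jobs need but the cloud lacks
    to_destroy = supply - demand  # running VMs that no queued job needs
    return to_create, to_destroy

# Hypothetical VM type names, for illustration only.
to_create, to_destroy = plan_vm_changes(
    ["sl5-64bit", "sl5-64bit", "sl5-32bit"],
    ["sl5-64bit", "debian-32bit"],
)
```

With this input, one more "sl5-64bit" VM and one "sl5-32bit" VM would be created, and the idle "debian-32bit" VM would be destroyed.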

Design Details

The following describes the evolving design details of the cloud scheduler.

Phase 1: Cloud Manipulation (6/05/2009)

Development will initially focus on building an internal structure for cluster and cluster node data from static, external configuration files (similar to Nimbus's use of VMM pool configuration files). Dynamic information on clusters will be incorporated as the cloud MDS (under development) matures.

In essence, the phase 1 scheduler will seek to:

  1. Read in a static file containing information on available clusters
  2. Store that information in a dynamic data structure
  3. Arbitrarily choose a resource (of the clusters from the configuration) and start a VM on that resource via Nimbus workspace control commands
  4. Update internal cluster/resource data to reflect the usage of resources by the created VM
  5. Destroy the created VM via Nimbus workspace control commands
  6. Update the internal cluster/resource data to reflect the state of resources after VM destruction
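The six steps above can be sketched as follows. This is a minimal illustration, not the scheduler's actual code: the INI-style file format, the field names, and the memory-only bookkeeping are all assumptions, and the Nimbus workspace control calls are left as placeholders rather than guessed at.

```python
import configparser
import random

# Hypothetical static configuration; the real file format is not specified.
EXAMPLE_CONFIG = """
[cluster-a]
address = cluster-a.example.org
memory_mb = 8192

[cluster-b]
address = cluster-b.example.org
memory_mb = 4096
"""

def load_clusters(text):
    """Steps 1-2: parse the static file into a dict of mutable cluster records."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        name: {"address": parser[name]["address"],
               "free_memory_mb": parser[name].getint("memory_mb"),
               "vms": []}
        for name in parser.sections()
    }

def start_vm(clusters, vm_name, memory_mb, rng=random):
    """Steps 3-4: pick any cluster with enough memory and record the usage."""
    candidates = [n for n, c in clusters.items() if c["free_memory_mb"] >= memory_mb]
    if not candidates:
        raise RuntimeError("no cluster has enough free memory")
    chosen = rng.choice(candidates)  # arbitrary choice, as in phase 1
    # Placeholder: the real scheduler would invoke Nimbus workspace control here.
    clusters[chosen]["free_memory_mb"] -= memory_mb
    clusters[chosen]["vms"].append((vm_name, memory_mb))
    return chosen

def destroy_vm(clusters, cluster_name, vm_name):
    """Steps 5-6: remove the VM and return its memory to the cluster."""
    cluster = clusters[cluster_name]
    for entry in cluster["vms"]:
        if entry[0] == vm_name:
            # Placeholder: the real scheduler would invoke Nimbus workspace control here.
            cluster["vms"].remove(entry)
            cluster["free_memory_mb"] += entry[1]
            return
    raise KeyError(vm_name)

clusters = load_clusters(EXAMPLE_CONFIG)
where = start_vm(clusters, "test-vm", 2048)
free_while_running = clusters[where]["free_memory_mb"]
destroy_vm(clusters, where, "test-vm")
```

After the final call, both clusters are back to their configured free memory, which is the invariant steps 4 and 6 are meant to maintain.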

The next steps, once the above is accomplished, involve designing a selection scheme by which the scheduler picks the "best" available resources and then starts VMs on those resources.

Design and Structure


class clusterNode

  • Variables:
    • string name, headnodeAddress
    • int cpuSpeed, memoryMB, storageGB
    • bool cpu64
      • Switch for indicating a 32/64-bit processor
    • clusterNode [list] siblings
      • List of clusterNode pointers to a node's siblings
  • Methods:
    • [tbc]

class cloudCluster

  • Variables:
    • string name, networkAddress
    • clusterNode [list] workers
    • bool cpu32, cpu64
      • Flags indicating whether the cluster has 32- and/or 64-bit CPUs
    • int numberWorkers, totalCPU, totalMemory, totalStorage, avgCPU, avgMemory, avgStorage
      • The totalX and avgX variables may be unnecessary

  • Methods:
    • [tbc]
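One possible rendering of clusterNode and cloudCluster as Python dataclasses is sketched below. The Python-style names, the MHz unit for cpuSpeed, and the choice to compute the cluster-level values on demand are assumptions made for illustration; computing them as properties is one way to act on the note that the totalX and avgX variables may be unnecessary. Only the memory totals are shown; CPU and storage would follow the same pattern.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class ClusterNode:
    name: str
    headnode_address: str
    cpu_speed: int   # unit is not given in the design; MHz is assumed here
    memory_mb: int
    storage_gb: int
    cpu64: bool      # True for a 64-bit processor, False for 32-bit
    # Sibling links are excluded from repr/eq to avoid recursion on cycles.
    siblings: List["ClusterNode"] = field(
        default_factory=list, repr=False, compare=False)

@dataclass
class CloudCluster:
    name: str
    network_address: str
    workers: List[ClusterNode] = field(default_factory=list)

    @property
    def number_workers(self) -> int:
        return len(self.workers)

    @property
    def total_memory(self) -> int:
        return sum(w.memory_mb for w in self.workers)

    @property
    def avg_memory(self) -> float:
        return self.total_memory / self.number_workers if self.workers else 0.0

    @property
    def cpu32(self) -> bool:
        return any(not w.cpu64 for w in self.workers)

    @property
    def cpu64(self) -> bool:
        return any(w.cpu64 for w in self.workers)

# Hypothetical nodes, for illustration only.
a = ClusterNode("node01", "head.example.org", 2400, 4096, 100, True)
b = ClusterNode("node02", "head.example.org", 2000, 2048, 50, False)
a.siblings.append(b)
b.siblings.append(a)
cluster = CloudCluster("example", "cluster.example.org", [a, b])
```

Deriving the aggregates from the workers list means they cannot drift out of date when a node is added or removed, at the cost of recomputing them on each access.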

class resourcePool

  • Variables:
    • cloudCluster [list] clusters
    • clusterNode [priority list] resourceNodes**

  • Methods:
    • [tbc]

**: Options for the resourceNodes list(s):

1. A single priority list, with priority based on an evaluation of ALL node properties (some combination of cpuSpeed, memory, and storage). To differentiate between 32- and 64-bit CPUs, either check each machine individually at every lookup, OR keep separate 32-bit and 64-bit resourceNodes lists. (Currently leaning toward the latter.)

2. Multiple priority lists: resourceCPU, resourceMemory, resourceStorage. Each list is simply sorted by its own resource, largest value first. As in option 1, 32- or 64-bit can be checked per machine, or separate sets of lists can be kept for 32- and 64-bit resources.
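The two options can be compared with a short sketch. Everything here is illustrative: the Node tuple, the weights in score(), and the function names are assumptions, since the design does not yet fix how node properties are combined. Option 1 is shown with separate 32- and 64-bit heaps, the variant the notes lean toward.

```python
import heapq
from typing import NamedTuple

class Node(NamedTuple):
    name: str
    cpu_speed: int
    memory_mb: int
    storage_gb: int
    cpu64: bool

def score(node):
    """Hypothetical composite weighting; the design does not fix one."""
    return node.cpu_speed / 1000 + node.memory_mb / 1024 + node.storage_gb / 100

def build_option1(nodes):
    """Option 1: one priority heap per architecture, keyed on the composite score."""
    heaps = {32: [], 64: []}
    for n in nodes:
        # heapq is a min-heap, so the score is negated; the name breaks ties.
        heapq.heappush(heaps[64 if n.cpu64 else 32], (-score(n), n.name, n))
    return heaps

def build_option2(nodes):
    """Option 2: one list per resource, each sorted with the largest value first."""
    return {
        "cpu": sorted(nodes, key=lambda n: n.cpu_speed, reverse=True),
        "memory": sorted(nodes, key=lambda n: n.memory_mb, reverse=True),
        "storage": sorted(nodes, key=lambda n: n.storage_gb, reverse=True),
    }

# Hypothetical nodes, for illustration only.
nodes = [
    Node("n1", 2000, 8192, 100, True),
    Node("n2", 3000, 2048, 500, True),
    Node("n3", 2400, 4096, 200, False),
]
heaps = build_option1(nodes)
best_64 = heaps[64][0][2].name   # highest composite score among 64-bit nodes
lists = build_option2(nodes)
```

With these sample nodes, option 1 ranks n1 first among 64-bit machines, while option 2 puts n2 first for CPU and storage and n1 first for memory. This shows the trade-off: a single score is simpler to query, but per-resource lists let a job ask for the best node on the one dimension it cares about.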

-- DuncanPenfoldBrown - 01 Jun 2009

Topic revision: r5 - 2009-06-09 - dpb