Last modified on 01/14/09 10:00:00

Container Management

We will manage resources by using the concept of containers that include all the necessary components for successful execution. We are currently building our containers based on the user account of the job submitter and will incorporate existing solutions as appropriate.

Before going into the details of configuring a container, one needs to know the procedure for adding resources to create or expand a pool of compute resources, which eventually forms the basis of grid computing. The following steps, though not an exhaustive list, show a general pattern that might be followed when adding resources. Since we aim to improve the performance of SSG's research, we will look at this from a specific job's perspective when choosing a compute resource. In SSG's case this is the R statistical language, i.e., we will be looking to identify resources that support R.

  1. Identify Resource
    Two types:
    1. Bare-bones resource - typically a resource within an organization/university
    2. Resource from existing grid communities such as TeraGrid, SURAgrid, LSU, etc. Factors influencing the choice of a particular resource in the grid community might include proximity, application support, and service and maintenance support.
  1. Determine Grid-ability
    (For a bare-bones resource.) This means checking whether the target resource has the Globus Toolkit installed, since Globus is the widely accepted de facto standard for building grids.
  1. Determine Intended Work-flow Tool Support
    Right now, we concentrate on R language support. There are two possibilities: target resources that already have R installed, or resources that do not have R installed but where, with the proper request, it can be installed (either system-wide or in the $HOME directory).
  1. Obtain an Account
    This usually means filling out a request form or asking for an account to be created by e-mail.
  1. Configure Container/User Account
    Initial steps to configure the account are:
    • Add an entry for the user account in the /etc/grid-security/grid-mapfile on the particular resource
    • Get to know the queue configuration, policies, job-prioritization, and resource limits
    • For applications to be executed on many different systems in a grid, the applications must be able to run on these systems with no user intervention. This means that the differences between computer systems must be hidden, or that information must be provided to the application so that it can adapt to the system it is executing on. SURAgrid recommends a minimal set of environment variables that should be set on systems that are part of SURAgrid. The initial set of environment variables includes <VO-NAME>_SCRATCH, <VO-NAME>_SHARED_SCRATCH, and <VO-NAME>_SHARED_PROJECTS.
      Hence, the user account might be prepared by setting these variables in a shell script that is sourced from the login shell's startup file ($HOME/.bashrc).
  1. Add Resource to the Compute Pool
    GridWay, a meta-scheduler that now comes with the Globus 4.2 installation kit, is a workload management and scheduling system that gives a user a single-point interface to the distributed resources on the grid. Adding a resource to the compute pool means adding the resource in $GW_LOCATION/etc/gwd.conf.
  1. Execute job with the new resource in Compute Pool
    Before submitting a job through GridWay, make sure that the job runs on the newly added resource on its own, submitted directly to the local batch system (through qsub).
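The account configuration in step 5 can be sketched in Python as two small text generators: one for a grid-mapfile line and one for the shell script that sets the VO environment variables. This is only an illustrative sketch. The function names, the VO name MYVO, the certificate subject, the user name jdoe, and all directory paths are hypothetical; the one fixed assumption is the usual grid-mapfile layout of a quoted certificate subject followed by the local account name.

```python
import shlex

# Variable suffixes named in the text; the VO prefix is supplied by the caller.
VO_VARIABLE_SUFFIXES = ("SCRATCH", "SHARED_SCRATCH", "SHARED_PROJECTS")


def grid_mapfile_entry(subject_dn, local_user):
    """Return one grid-mapfile line: quoted certificate subject, then local account."""
    return f'"{subject_dn}" {local_user}'


def vo_environment_script(vo_name, directories):
    """Return shell text that exports <VO-NAME>_<SUFFIX> for each suffix.

    `directories` maps each suffix to a site-specific path (hypothetical here).
    """
    lines = []
    for suffix in VO_VARIABLE_SUFFIXES:
        path = directories[suffix]  # raises KeyError if a variable is left out
        lines.append(f"export {vo_name}_{suffix}={shlex.quote(path)}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical subject, account, VO name, and paths, for illustration only.
    print(grid_mapfile_entry("/O=Example Grid/OU=Example Org/CN=Jane Doe", "jdoe"))
    print(
        vo_environment_script(
            "MYVO",
            {
                "SCRATCH": "/scratch/jdoe",
                "SHARED_SCRATCH": "/shared/scratch",
                "SHARED_PROJECTS": "/shared/projects",
            },
        ),
        end="",
    )
```

The generated export lines could be saved to a file in the account's home directory and sourced from $HOME/.bashrc. The grid-mapfile line would normally be added by the resource's administrator rather than by the user.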

Issues mainly arise in the configuration and execution phases of adding a new resource to the compute pool.
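Since problems tend to surface during configuration and execution, a small pre-flight check run on the target resource may help catch them early. The sketch below is one possible approach, not an established tool: it reports which of the recommended <VO-NAME>_* environment variables are unset and which of the commands qsub and R cannot be found on the PATH. The function names and the VO name MYVO are hypothetical.

```python
import os
import shutil

# Variable suffixes recommended for grid accounts in the text above.
VO_VARIABLE_SUFFIXES = ("SCRATCH", "SHARED_SCRATCH", "SHARED_PROJECTS")


def missing_vo_variables(vo_name, environ=None):
    """Return the <VO-NAME>_* variable names that are unset or empty."""
    env = os.environ if environ is None else environ
    names = [f"{vo_name}_{suffix}" for suffix in VO_VARIABLE_SUFFIXES]
    return [name for name in names if not env.get(name)]


def missing_commands(commands=("qsub", "R"), which=shutil.which):
    """Return the commands that cannot be located on the PATH."""
    return [command for command in commands if which(command) is None]


if __name__ == "__main__":
    # "MYVO" is a placeholder; substitute the VO name used on the resource.
    problems = missing_vo_variables("MYVO") + missing_commands()
    for problem in problems:
        print("missing:", problem)
    if not problems:
        print("environment looks ready for a direct qsub test")
```

Passing `environ` and `which` as parameters keeps the checks easy to exercise without touching the real environment. A clean result only shows that the prerequisites are present; the job should still be run directly through qsub before it is submitted through GridWay.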

Existing Approaches

Resource Specific Containers

This section references container configuration notes for specific resources.

  • CheahaContainer
  • OlympusContainer - configuration of
  • AltixContainer
  • NimbusContainer - The University of Chicago Science Cloud, codenamed "Nimbus", provides compute capability in the form of Xen virtual machines (VMs) that are deployed on physical nodes of the University of Chicago TeraPort cluster (currently 16 nodes) using the workspace service.
    Here we are looking to explore cloud computing rather than to use their cloud for all our computing needs. We are doing this as part of our container management solution, to find out whether it is effective for us to use VMs to achieve our container management goals.
  • SURAgridContainer
  • OSG