wiki:BasicTests
Last modified on 10/19/07 15:29:59

The workflow migration is being explored on the UABgrid meta-scheduler, stage.uabgrid.uab.edu. You can log in to your account on this host over SSH with your BlazerId and password.

Account setup for UABgrid

In order to use the grid resources, however, you will need to get a UABgrid certificate and upload it to your .globus directory on your stage account. You can get the UABgrid certificates here:

https://ca.uabgrid.uab.edu/user/custom_request_cert.php

After your certificate is created, you will need to download the key and certificate as two separate files: userkey.pem and usercert.pem. These then need to be placed in your ~/.globus directory on stage.

You can download the two files from here:

https://ca.uabgrid.uab.edu/user/manage_cert.php

To access the files, click on the "Download" button under the "userkey.pem" and "usercert.pem" columns.

After you upload these files with SCP, be sure that the key file has the following permissions:

chmod 400 ~/.globus/userkey.pem
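
The upload and the permission check could be scripted roughly as follows. This is only a sketch: the function names upload_credentials and check_key_mode are made up for illustration, and the upload assumes both .pem files sit in the current directory on your workstation.

```shell
# Hypothetical helper: copy the key and certificate to stage.
# Run it on your workstation; $1 is your BlazerId.
upload_credentials() {
    user="$1"
    host=stage.uabgrid.uab.edu
    # Create ~/.globus on stage if it is not there yet, then copy both files.
    ssh "$user@$host" 'mkdir -p ~/.globus' &&
    scp userkey.pem usercert.pem "$user@$host:.globus/"
}

# Hypothetical helper: succeed only if the file's mode is r-------- (400).
check_key_mode() {
    key_file="$1"
    # Characters 2-10 of "ls -ld" output are the permission bits.
    perms=$(ls -ld "$key_file" | cut -c2-10)
    [ "$perms" = "r--------" ]
}
```

On stage, check_key_mode ~/.globus/userkey.pem && echo ok should print ok once the chmod above has been applied.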

This takes care of setting up your account. This process is slightly more manual at this time than it needs to be, but your certificate is good for a year, so you won't need to take these steps again for a while.

Using the Metascheduler

Once you have these files in place, you are ready to use the meta-scheduler. I have enabled access to your accounts on cheaha.ac.uab.edu. Puri will need to do the same for Olympus (and we should create accounts there if you don't already have them, though cheaha will suffice for initial testing).

Initialize Your Grid Session

The first step is to initialize your grid computing session. You do this with

grid-proxy-init -valid 12:00

This will prompt you for the password protecting your private key, which you selected when you downloaded the key.* Strictly speaking, the -valid 12:00 option is not required. I included it here to show that you can control the lifetime of your session so that it covers the run-time window of your job. The output will look like this:

$ grid-proxy-init -valid 12:00
Your identity: /C=US/ST=Alabama/L=Birmingham/O=University of Alabama at Birmingham/OU=UABgrid/CN=jpr/emailAddress=jpr@uab.edu
Enter GRID pass phrase for this identity:
Creating proxy ....................................... Done
Your proxy is valid until: Sat Oct 20 02:08:11 2007
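
Before submitting a long job it can be worth checking that the proxy will outlive it. The helpers below are a sketch with made-up names (proxy_covers_job, format_lifetime); they only do the arithmetic on a number of seconds that you pass in. I am assuming the grid-proxy-info command from the same toolkit is on your path and that its -timeleft option prints the remaining lifetime in seconds.

```shell
# Hypothetical helper: succeed if a proxy with $1 seconds left
# outlives a job expected to run for $2 minutes.
proxy_covers_job() {
    seconds_left="$1"
    job_minutes="$2"
    [ "$seconds_left" -ge $((job_minutes * 60)) ]
}

# Hypothetical helper: print a lifetime given in seconds as H:MM.
format_lifetime() {
    printf '%d:%02d\n' $(( $1 / 3600 )) $(( ($1 % 3600) / 60 ))
}
```

Under that assumption, something like proxy_covers_job "$(grid-proxy-info -timeleft)" 90 || grid-proxy-init -valid 12:00 would renew the session only when fewer than 90 minutes remain.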

After this you can run jobs using the meta-scheduler. The commands that are most helpful are:

  • gwhost - lists the available resources
  • gwps - lists the current jobs
  • gwsubmit - submits a job to an available resource

Run a Test Job

The first thing you will probably want to do is make sure your environment is functioning properly. Create a simple three-line job file called testjob with this content:

EXECUTABLE=/bin/uname
ARGUMENTS=-a
RESCHEDULE_ON_FAILURE=no
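
If you would rather not open an editor, the same file can be written from the shell. This is a minimal sketch; write_testjob is a made-up name, and the content is exactly the job template shown above.

```shell
# Hypothetical helper: write the test job template to the file named
# by $1 (default: testjob in the current directory).
write_testjob() {
    cat > "${1:-testjob}" <<'EOF'
EXECUTABLE=/bin/uname
ARGUMENTS=-a
RESCHEDULE_ON_FAILURE=no
EOF
}
```

Calling write_testjob with no argument leaves a file called testjob in the current directory.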

Submit the job with the command:

gwsubmit -t testjob

After the job is submitted, you can monitor its progress with gwps. You'll probably want to filter the process list on your user name like this:

gwps -u $USER

Your job will start in the pending state and progress through several intermediate steps that indicate resource selection, submission to the remote host, and finally completion in the done state.

Examine the Output

When your job is done, the standard output and error are available in your current directory under the names stdout.JID and stderr.JID, where JID is the job number assigned by the meta-scheduler and visible in the gwps output.
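
If you want a script to block until the output has arrived, one simple approach is to poll for the stdout.JID file. The sketch below assumes you run it from the directory where you submitted the job; wait_for_output is a made-up name, and gwps remains the authoritative source for the job state.

```shell
# Hypothetical helper: wait until stdout.JID exists in the current
# directory, checking once per second for at most $2 seconds (default 300).
wait_for_output() {
    jid="$1"
    limit="${2:-300}"
    waited=0
    while [ ! -e "stdout.$jid" ]; do
        # Give up once the time limit has been reached.
        [ "$waited" -ge "$limit" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

For example, wait_for_output 418 600 && cat stdout.418 waits up to ten minutes for job 418 and then prints its output.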

For the above job, you should have an empty stderr file, and your stdout should be similar to:

Linux compute-0-16.local 2.6.9-55.ELsmp #1 SMP Wed May 2 14:04:42 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux

This shows that the job was run on host compute-0-16, in this case on cheaha.
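
Both checks, an empty stderr file and the host name in stdout, can be done in one step. This is a sketch only; job_host is a made-up name, and it relies on the node name being the second field of the uname -a output, as in the line above.

```shell
# Hypothetical helper: fail if stderr.JID has content, otherwise print
# the host name that uname -a recorded in stdout.JID.
job_host() {
    jid="$1"
    # -s is true for a non-empty file, so bail out if stderr has content.
    if [ -s "stderr.$jid" ]; then
        echo "stderr.$jid is not empty" >&2
        return 1
    fi
    # The node name is the second field of uname -a output.
    awk 'NR == 1 { print $2 }' "stdout.$jid"
}
```

Run it from the directory that holds the output files, with the job number as the only argument.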

You can get a brief summary of your job execution stats with the command gwhistory JID, where JID is the job id. This will give you an overview of the amount of time spent in various states for your job. The output will be similar to:

$ gwhistory 418
HID START    END      PROLOG  WRAPPER EPILOG  MIGR    REASON QUEUE    HOST       
0   14:20:47 14:21:30 0:00:01 0:00:41 0:00:01 0:00:00 ----   all.q    cheaha.ac.uab.edu/SGE
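
The time columns can be added up to see where the wall-clock time went. The sketch below reads gwhistory output on standard input and prints, for each record, the HID, the elapsed seconds between START and END, and the sum of PROLOG, WRAPPER and EPILOG in seconds. The name summarize_history is made up, and the code assumes the column order shown above and a job that starts and ends on the same day.

```shell
# Hypothetical helper: summarize gwhistory output read from stdin.
# Prints "HID elapsed_seconds staged_seconds" for each history record.
summarize_history() {
    awk '
        # Convert H:MM:SS (or HH:MM:SS) to seconds.
        function secs(t,   p) {
            split(t, p, ":")
            return p[1] * 3600 + p[2] * 60 + p[3]
        }
        # Skip the header line; columns: HID START END PROLOG WRAPPER EPILOG ...
        NR > 1 { print $1, secs($3) - secs($2), secs($4) + secs($5) + secs($6) }
    '
}
```

For the record above, gwhistory 418 | summarize_history would print 0 43 43: the 43 seconds between 14:20:47 and 14:21:30 are fully accounted for by the prolog, wrapper and epilog times.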

If your job did not run successfully, you can do some debugging by looking at the job log file:

less $GW_LOCATION/var/JID/job.log

Again, JID is the job number seen in the gwps output. This file can be a good way to familiarize yourself with what is going on behind the scenes.

Job Cleanup

Once you are done with your job, you can remove it from the job queue display with the gwkill command. This is a somewhat misleading name, since at this point the job is already done and all you are doing is removing it from the job list. You can also remove output files if you no longer need them.
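
Removing the output files for a finished job can be wrapped in a small function. This is a sketch; remove_job_output is a made-up name, and it only touches the stdout.JID and stderr.JID files in the current directory.

```shell
# Hypothetical helper: delete the stdout.JID and stderr.JID files for
# one job from the current directory.
remove_job_output() {
    jid="$1"
    # Refuse an empty id so nothing is removed by accident.
    [ -n "$jid" ] || return 1
    rm -f "stdout.$jid" "stderr.$jid"
}
```

Assuming gwkill takes the job number as its argument, gwkill 418 && remove_job_output 418 would clear job 418 from the list and then delete its output.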

Running through this test should help you get familiar with the basic operations. You can read more about the GridWay operations in the GridWay Users Manual. Our next step is to run one of the test snow scripts.


* If you forgot the password, just get another copy of the key and select a new password during the download. You'll need to replace your userkey.pem file, of course.