Last modified on 05/07/09 13:47:02

Setup

GridAtlas comes in a pre-packaged tar.gz file that includes everything that you need to use the system or to continue development.

Step 1

You will need at least one machine for a GridAtlas Aggregator (GAA) and at least one machine for a GridAtlas Daemon (GAD). Each service should be run on a different computer. Transfer the gridatlas.<date>.tar.gz file to each machine, into the $HOME directory of the user under which you wish to install the service, and extract the contents using the following command:

tar xvfz gridatlas.<date>.tar.gz

Step 2

Make sure you are using bash. The PATH environment variable now needs to be updated on each of the machines. Create a new file named $HOME/.gridatlas.env and enter the following contents:

export CATALINA_HOME=~/apache-tomcat-6.0.14
export ANT_HOME=${CATALINA_HOME}/apache-ant-1.7.1/
export PATH=${ANT_HOME}/bin:${COG_LIB}:${PATH}
export JAVA_HOME=$CATALINA_HOME/jdk1.6.0_11
export PATH=${JAVA_HOME}/bin:${PATH}
export PATH=${CATALINA_HOME}/bin:${PATH}
function starttomcat() {
        pushd `pwd`
        cd $HOME
        sh $HOME/apache-tomcat-6.0.14/bin/startup.sh
        popd
}

function shutdowntomcat {
        pushd `pwd`
        cd $HOME
        sh $HOME/apache-tomcat-6.0.14/bin/shutdown.sh
        popd
}

function comp {
        pushd `pwd`
        cd ~/apache-tomcat-6.0.14/webapps/axis
        javac -nowarn GridAtlasService.java
        cp GridAtlasService.java GridAtlasService.jws
        cp GridAtlasService.java ~/apache-tomcat-6.0.14/webapps/axis/WEB-INF/jwsClasses/
        cp IGridAtlasService.java IGridAtlasService.jws
        cp IGridAtlasService.java ~/apache-tomcat-6.0.14/webapps/axis/WEB-INF/jwsClasses/
        javac -nowarn ~/apache-tomcat-6.0.14/webapps/axis/WEB-INF/jwsClasses/GridAtlasService.java
        cp IAPIHelper.class ~/apache-tomcat-6.0.14/webapps/axis/WEB-INF/classes
        cp APIHelper.class ~/apache-tomcat-6.0.14/webapps/axis/WEB-INF/classes
        cp IHibernateDatabase.class ~/apache-tomcat-6.0.14/webapps/axis/WEB-INF/classes
        cp HibernateDatabase.class ~/apache-tomcat-6.0.14/webapps/axis/WEB-INF/classes
        cp HibernateUtil.class ~/apache-tomcat-6.0.14/webapps/axis/WEB-INF/classes

        cp IAPIHelper.java ~/apache-tomcat-6.0.14/webapps/axis/WEB-INF/src
        cp APIHelper.java ~/apache-tomcat-6.0.14/webapps/axis/WEB-INF/src
        cp IHibernateDatabase.java ~/apache-tomcat-6.0.14/webapps/axis/WEB-INF/src
        cp HibernateDatabase.java ~/apache-tomcat-6.0.14/webapps/axis/WEB-INF/src
        cp HibernateUtil.java ~/apache-tomcat-6.0.14/webapps/axis/WEB-INF/src

        cp IAPIHelper.class ~/apache-tomcat-6.0.14/webapps/axis/WEB-INF/src
        cp APIHelper.class ~/apache-tomcat-6.0.14/webapps/axis/WEB-INF/src
        cp IHibernateDatabase.class ~/apache-tomcat-6.0.14/webapps/axis/WEB-INF/src
        cp HibernateDatabase.class ~/apache-tomcat-6.0.14/webapps/axis/WEB-INF/src
        cp HibernateUtil.class ~/apache-tomcat-6.0.14/webapps/axis/WEB-INF/src
        popd
}

function stop {
        shutdowntomcat
}

function start {
        starttomcat
}

function restart {
        shutdowntomcat
        sleep 3
        starttomcat
}

export CLASSPATH=$CLASSPATH:$CATALINA_HOME/webapps/axis/lib/wsdl4j-1.5.1.jar:$CATALINA_HOME/webapps/axis/lib/saaj.jar:$CATALINA_HOME/webapps/axis/lib/mailapi.jar:$CATALINA_HOME/webapps/axis/lib/log4j-1.2.8.jar:$CATALINA_HOME/webapps/axis/lib/jaxrpc.jar:$CATALINA_HOME/webapps/axis/lib/commons-logging-1.0.4.jar:$CATALINA_HOME/webapps/axis/lib/commons-discovery-0.2.jar:$CATALINA_HOME/webapps/axis/lib/axis.jar:$CATALINA_HOME/webapps/axis/lib/axis-ant.jar:$CATALINA_HOME/webapps/axis/lib/activation.jar:$CATALINA_HOME/webapps/axis/lib/xmlsec-1.3.0.jar:$CATALINA_HOME/webapps/axis/WEB-INF/classes/:$CATALINA_HOME/webapps/axis/lib/xerces-2.6.2.jar:$CATALINA_HOME/webapps/GridAtlas/localhost/axis/GridAtlasService_jws/*:$CATALINA_HOME/lib/axis-jaxrpc-1.4.jar:$CATALINA_HOME/lib/hibernate3.jar:~/apache-tomcat-6.0.14/lib/servlet-api.jar:~/apache-tomcat-6.0.14/lib/axis.jar:.

export AXIS_HOME=$CATALINA_HOME/webapps/axis/
export AXIS_LIB=$AXIS_HOME/lib
export AXIS_CLASSPATH=$AXIS_LIB/axis-ant.jar:$AXIS_LIB/axis.jar:$AXIS_LIB/commons-discovery-0.2.jar:$AXIS_LIB/saaj.jar:$AXIS_LIB/commons-logging-1.0.4.jar:$AXIS_LIB/jaxrpc.jar:$AXIS_LIB/log4j-1.2.8.jar:$AXIS_LIB/wsdl4j-1.5.1.jar:$AXIS_LIB/activation.jar:$AXIS_LIB/mailapi.jar:$AXIS_LIB/xmlsec-1.3.0.jar:.

alias log='vi $CATALINA_HOME/logs/catalina.out'

Now open up your $HOME/.bashrc file and add the following at the bottom:

source $HOME/.gridatlas.env

Log out and log back in, or run:

source ~/.bashrc
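
After sourcing the file, it can be worth confirming that the tools added to PATH actually resolve. The following is a minimal sketch, not part of the GridAtlas package; the function name ga_check_tools is made up for this illustration.

```shell
# Sketch only: ga_check_tools is a hypothetical helper, not shipped with GridAtlas.
# It reports, for each command name given, whether the shell can find it on PATH.
ga_check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "ok:      $tool -> $(command -v "$tool")"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    # Non-zero exit status if at least one tool could not be found.
    return "$missing"
}
```

For example, running ga_check_tools java javac ant should list three "ok" lines if $HOME/.gridatlas.env was picked up correctly.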

Step 3

Create a file named gridatlasconfiguration.props in your $HOME directory and insert the following contents. Note that a GAA and GAD have different configurations.

AGGREGATOR=
AGGREGATOR_HOSTNAMES=
AGGREGATOR_PORTS=
MY_HOSTNAME=
MY_WEB_SERVICE_PORT=

AGGREGATOR

If this service will be a GAA, set this property to true (e.g. AGGREGATOR=true); otherwise set it to false.

AGGREGATOR_HOSTNAMES

Set this to the hostnames of the aggregators, separated by commas. If the service is itself the aggregator, leave this field empty.

AGGREGATOR_PORTS

Set this to the ports of the aggregator services (separated by commas if you are talking to multiple aggregators). Leave empty if this service does not talk to another aggregator. The default aggregator port is 11080.

MY_HOSTNAME

Enter the hostname of the machine the service is being deployed on. If you don't know the name of the machine, you can obtain it by typing 'hostname' at the bash prompt.

MY_WEB_SERVICE_PORT

The port that your web service is listening on. If you are using the defaults, this is port 11080. To change the port, edit $CATALINA_HOME/conf/server.xml and change all instances of 11080 to the port you want to use. Note that 8080 is the conventional default HTTP port for Tomcat and 8443 is its conventional HTTPS port (setting up https is not covered here).
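
To make the GAD case concrete, the sketch below writes a configuration file for a daemon that reports to a single aggregator. The function name ga_write_gad_props and the hostname gaa.example.org are assumptions made for this example only; the property names are the ones listed above.

```shell
# Sketch only: ga_write_gad_props is a hypothetical helper, not shipped with GridAtlas.
# Arguments: aggregator hostname, aggregator port, this service's port,
# and (optionally) the output file.
ga_write_gad_props() {
    aggregator_host=$1
    aggregator_port=$2
    my_port=$3
    out=${4:-$HOME/gridatlasconfiguration.props}
    # A GAD is not an aggregator, so AGGREGATOR is false and the
    # aggregator fields point at the GAA it reports to.
    cat > "$out" <<EOF
AGGREGATOR=false
AGGREGATOR_HOSTNAMES=$aggregator_host
AGGREGATOR_PORTS=$aggregator_port
MY_HOSTNAME=$(hostname)
MY_WEB_SERVICE_PORT=$my_port
EOF
}
```

Calling ga_write_gad_props gaa.example.org 11080 11080 would create the file in $HOME with the default ports. For a GAA, the equivalent file would instead set AGGREGATOR=true and leave the two AGGREGATOR_ fields empty.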

Step 4

Now you need to set up a database for each GAA and GAD that is being set up. Any database that Hibernate supports should, in theory, work; however, testing has only been performed on MySQL. A MySQL database script is included in the package for your convenience ($CATALINA_HOME/webapps/axis/gridatlas.sql). To use this script, create a database called gridatlas in MySQL:

mysql> create database gridatlas;
Query OK, 1 row affected (0.00 sec)

Then execute the following command at the command prompt. Note that the provided script uses a database called 'gridatlas'. If you created a database with a different name, or already have a database with this name and do not want all of its data deleted, edit the USE gridatlas statement near the top of $CATALINA_HOME/webapps/axis/gridatlas.sql to point to the desired database:

mysql -D gridatlas -u <DATABASE_USER_NAME> -p < $CATALINA_HOME/webapps/axis/gridatlas.sql

Here is the SQL description for the GridAtlas database in case you want to try and use a database other than MySQL:

CREATE DATABASE /*!32312 IF NOT EXISTS*/ `gridatlas` /*!40100 DEFAULT CHARACTER SET latin1 */;

USE `gridatlas`;

DROP TABLE IF EXISTS `gridatlas_apps`;
CREATE TABLE `gridatlas_apps` (
  `AppID` int(11) NOT NULL auto_increment,
  `AppName` varchar(30) NOT NULL,
  `version` varchar(5) default NULL,
  `path` varchar(50) NOT NULL,
  `description` varchar(100) default NULL,
  `username` varchar(12) NOT NULL,
  `hostname` varchar(100) default NULL,
  PRIMARY KEY  (`AppID`)
) ENGINE=INNODB;

DROP TABLE IF EXISTS `gridatlas_appinputs`;
CREATE TABLE `gridatlas_appinputs` (
  `name` varchar(50) NOT NULL,
  `value` varchar(200) NOT NULL,
  `description` varchar(200) default NULL,
  `AppID` int(11) NOT NULL,
  PRIMARY KEY  (`name`),
  FOREIGN KEY (`AppID`) REFERENCES gridatlas_apps(`AppID`) ON DELETE CASCADE
)ENGINE=INNODB;

DROP TABLE IF EXISTS `gridatlas_daemons`;
CREATE TABLE `gridatlas_daemons` (
  `daemon` varchar(100) default NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

DROP TABLE IF EXISTS `gridatlas_users`;
CREATE TABLE `gridatlas_users` (
  `username` varchar(12) NOT NULL,
  `hostname` varchar(100) default NULL,
  PRIMARY KEY  (`username`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

Step 5

Now you must point hibernate at your database. Edit the file $CATALINA_HOME/webapps/axis/WEB-INF/classes/hibernate.cfg.xml

vim $CATALINA_HOME/webapps/axis/WEB-INF/classes/hibernate.cfg.xml

You will need to edit the following to point to your database server and name (line 1) using your username (line 2) and password (line 5).

1. <property name="connection.url">jdbc:mysql://localhost/gridatlas</property>
2. <property name="connection.username">myusername</property>
3. <property name="connection.driver_class">com.mysql.jdbc.Driver</property> <!-- ONLY EDIT IF NOT USING MYSQL.  You must find the appropriate driver for your database! USE GOOGLE! -->
4. <property name="dialect">org.hibernate.dialect.MySQLDialect</property> <!-- ONLY EDIT IF NOT USING MYSQL.  You must find the appropriate driver for your database! USE GOOGLE! -->
5. <property name="connection.password">mypassword</property>
6. <property name="transaction.factory_class">org.hibernate.transaction.JDBCTransactionFactory</property> <!-- ONLY EDIT IF NOT USING MYSQL -->
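
Once the file has been edited, a quick way to double-check which database Hibernate will connect to is to print a single property back out. This is a rough sketch that assumes each property sits on one line, as in the listing above; ga_hibernate_property is a name invented for this example.

```shell
# Sketch only: ga_hibernate_property is a hypothetical helper, not shipped with GridAtlas.
# Prints the value of one <property name="..."> element from hibernate.cfg.xml.
# Assumes the opening and closing tags are on the same line.
ga_hibernate_property() {
    name=$1
    cfg=${2:-$CATALINA_HOME/webapps/axis/WEB-INF/classes/hibernate.cfg.xml}
    sed -n "s|.*<property name=\"${name}\">\([^<]*\)</property>.*|\1|p" "$cfg"
}
```

For example, ga_hibernate_property connection.url prints the JDBC URL and ga_hibernate_property connection.username prints the database user. Avoid printing connection.password on a shared terminal.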

Possible caveat: if your database is on a remote machine and the JDBC connection points there, make sure the remote database accepts incoming connections from the machine where the current instance of GridAtlas is being installed. A symptom of a refused connection is the error "org.hibernate.exception.GenericJDBCException: Cannot open connection" in the Tomcat log file. To allow the connection, execute the following command within the MySQL database to which you are connecting:

mysql> grant all privileges on gridatlas.* to <DATABASE_USER_NAME>@<IP> identified by "<PASSWORD>";
Query OK, 0 rows affected (0.01 sec)

Step 6

Start up tomcat. You can simply type

starttomcat

To stop tomcat:

shutdowntomcat

To restart tomcat:

restart

(Note that the starttomcat function places you in your $HOME directory before starting up tomcat. You must be in your $HOME directory whenever you start up tomcat, or there will be an error reading your gridatlasconfiguration.props file.)

Step 7

Verify that everything is running by checking that your service's WSDL is showing up. (Note that <tomcatport> is 11080 if you did not change the $CATALINA_HOME/conf/server.xml file.)

http://<yourhostnameOrIP>:<tomcatport>/axis/GridAtlasService.jws?wsdl

You should now see the WSDL. If you don't, you can get more information by looking at the $CATALINA_HOME/logs/catalina.out file to see what the issue is. Now browse to the GUI client

http://<yourhostnameOrIP>:<tomcatport>/GridAtlas

This should direct you to the getapplications.jsp page and you should see a big GridAtlas JPG.
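
The WSDL check can also be done from the command line, which is convenient on machines without a browser. The sketch below assumes curl is installed; the function names ga_wsdl_url and ga_check_wsdl are invented for this example, and the URL layout is the one given in this step.

```shell
# Sketch only: these helpers are hypothetical, not shipped with GridAtlas.
# Build the WSDL URL for a host and port (port defaults to 11080).
ga_wsdl_url() {
    printf 'http://%s:%s/axis/GridAtlasService.jws?wsdl\n' "$1" "${2:-11080}"
}

# Fetch the URL and look for the WSDL root element name in the response.
ga_check_wsdl() {
    url=$(ga_wsdl_url "$@")
    if curl -fsS "$url" | grep -q 'definitions'; then
        echo "WSDL is being served at $url"
    else
        echo "No WSDL at $url; check \$CATALINA_HOME/logs/catalina.out" >&2
        return 1
    fi
}
```

Running ga_check_wsdl "$(hostname)" on the machine hosting the service checks the default port; pass a second argument if you changed the port in server.xml.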

Step 8

Start up all of your other GADs and your GAA, then add an application from a GAD. Check that the GAA also received the application information after you added it to your GAD.

Usage

After all of the GAAs and GADs have been configured properly and have started up, we can start populating information about the applications in our VO. You have two options at this step:

  1. You can create a custom client using the API that we have published.
  2. You can use the existing web client that we have generated for this project.

When you started up tomcat, this also started up the web client, which is located at http://<GA-hostname>:<tomcatport>/GridAtlas/ (port 11080 unless you changed it). Browse to the web client of a GAD (note that you should always register applications from the GADs, as the application information is automatically propagated up to the GAAs whenever a new application is registered). The following tabs will be located at the top:

Home

The Home page displays the list of applications and inputs that the logged in user has access to.

Login

Logging in with a particular user grants you access to add applications or delete applications for the particular user. (Note that admin can add and delete applications of all users).

Logout

The Logout button logs out the user that is currently logged in. You will be taken back to the ‘Home’ page once the process completes.

Register

The administrator should ‘register’ all of the usernames that wish to use GridAtlas. These usernames should directly correlate with the usernames that they use to submit jobs. By default, the user ‘admin’ has been created, which can add and delete applications for any user. Otherwise, each user must individually log in and add or delete the applications they have access to themselves. Whenever a user registers on a GAD, the information is automatically propagated to the associated GAA(s).

Add Application

This is the page where a user specifies the applications he has access to. The user must give the name of the application, path of the application on that particular hostname, version, description and if the application can be used by ‘ALL’ users or only by himself. The user will also add all associated inputs that must be used for the particular application. This information is automatically propagated to the associated GAA(s).

Delete Application

This page allows you to delete any application that the logged in user has added. This information is automatically propagated to the associated GAA(s).

After all of your GAD’s have been set up, you can browse to the GAA(s) GridAtlas web clients to verify all of the information. Notice how all of the information is successfully propagated up to the GAA including the IP address of the host so that the GAA knows exactly where the application resides.

Job Submission

GridAtlas can take the user-registered information and extrapolate the details in such a way that the user only needs to know the application name and application inputs that were registered in the GridAtlas Daemons. Once you have registered applications, they can be referred to in your gridatlas script as ${GA_<APPLICATIONNAME or APPINPUT>}, where APPLICATIONNAME or APPINPUT is the name of the application or input that you are referring to. GridAtlas automatically creates a new submission file on the fly for you. The following is an example of submitting the registered application blast with the inputs numthreads and database.

  1. Register and/or log in with the username that you wish to submit the job for.
  2. Add an application named ‘BLAST’ with the GridAtlas Daemon that includes the hostname and username that you wish to submit the job for.
  3. On the host where your GridAtlas Daemon is running, create a script called gasubmit.ga and insert the following contents:
    EXECUTABLE  = ${GA_BLAST} # The application ‘BLAST’ will be looked up in GAA.
    ARGUMENTS   = -d ${GA_DATABASE} -a ${GA_NUMTHREADS} -p blastp -i input.fas_${TASK_ID} -o results.out_${TASK_ID} # Application inputs looked up will be ‘database’ and ‘numthreads’
    STDIN_FILE  = /dev/null
    STDOUT_FILE = gw.out.${JOB_ID}
    STDERR_FILE = gw.err.${JOB_ID}
    INPUT_FILES = input.fas_${TASK_ID}
    OUTPUT_FILES = results.out_${TASK_ID}
    
  4. There is a java class located in gridatlas-warpper/GridAtlasScriptCreator.java that is used to create and submit gridway scripts for you (Note as of 4.29.09, the java class does not 'submit' the scripts... but just creates the gridway template scripts for you).

Flow of GridAtlas – Setting up applications

Figure 3 – Daemon Startup

Whenever a GridAtlas Daemon starts up, it notifies the associated GridAtlas Aggregator(s) by invoking their web services, telling each one to drop all applications that it knows of for that GridAtlas Daemon. Whenever it requests this deletion, the GridAtlas Daemon also sends the GridAtlas Aggregator(s) a new list of all of its current applications and inputs. This avoids synchronization issues that could otherwise arise between what the GridAtlas Daemons and the GridAtlas Aggregator(s) know about the applications.
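
The drop-then-resend idea can be modelled in a few lines. The sketch below is only a toy model: it stands in a tab-separated text file for the aggregator's database and a function call for the web service invocation, neither of which is how GridAtlas itself is implemented. The name gaa_resync and the file layout (daemon hostname, tab, application name) are assumptions for this illustration.

```shell
# Sketch only: a toy model of the startup synchronization, not GridAtlas code.
# Usage: gaa_resync DAEMON_HOST TABLE_FILE APP...
gaa_resync() {
    daemon=$1
    table=$2
    shift 2
    touch "$table"
    tmp=$(mktemp) || return 1
    # Step 1: drop every application currently recorded for this daemon.
    awk -F '\t' -v d="$daemon" '$1 != d' "$table" > "$tmp"
    # Step 2: record the daemon's current list of applications.
    for app in "$@"; do
        printf '%s\t%s\n' "$daemon" "$app" >> "$tmp"
    done
    mv "$tmp" "$table"
}
```

Because stale rows are removed before the fresh list is written, an application that was deleted on the daemon while the aggregator was unreachable disappears from the aggregator's view at the next daemon startup.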

Figure 4 – GAD and GAA interaction

Figure 4 represents the typical interaction that takes place in GridAtlas when registering usernames, adding applications, or deleting applications. The client first sends a request to the GAD, which is then propagated up to the GAA, notifying it of the change. Note that all changes should be made by invoking the requests from the GADs; otherwise the GAA would hold wrong information about the daemons. Because of this, the custom web client that we have created will only allow a user to register a username, add an application, or delete an application from a GAD.

Flow of GridAtlas – Submitting jobs

GridAtlas provides the user with an easy way to submit jobs. Instead of needing to create a complex submission script regarding application details, a user now only needs to provide the name of the application and the name of any application inputs he wishes to submit. The details are fetched from the GridAtlas Aggregator and an appropriate submission script is created and submitted on the fly for the user according to the GridAtlas submission script that was generated.

Figure 5: GridAtlas Translator Example

Here is an example of a hand-written script that we needed to create when submitting jobs directly to GridWay.

EXECUTABLE  = ./blastall.sh
ARGUMENTS   = -p blastp -i input.fas_${TASK_ID} -o results.out_${TASK_ID}
STDIN_FILE  = /dev/null
STDOUT_FILE = gw.out.${JOB_ID}
STDERR_FILE = gw.err.${JOB_ID}
INPUT_FILES = input.fas_${TASK_ID}
OUTPUT_FILES = results.out_${TASK_ID}
REQUIREMENTS = HOSTNAME = "everest.cis.uab.edu"

This file directly references the blastall.sh script. Now I will show the blastall.sh script.

# Pick the executable, thread count and database for the host GridWay chose.
if [ "${GW_HOSTNAME}" = "olympus.cis.uab.edu" ]; then
	EXECUTABLE=/shared/apps/blast/bin/blastall
	NUMTHREADS=2
	DATABASE=/shared/data/blastdb/nr/nr
elif [ "${GW_HOSTNAME}" = "everest.cis.uab.edu" ]; then
	EXECUTABLE=/opt/Bio/ncbi/bin/blastall
	NUMTHREADS=2
	DATABASE=/home/puri/blast/db/nr
elif [ "${GW_HOSTNAME}" = "cheaha.cis.uab.edu" ]; then
	EXECUTABLE=/opt/Bio/ncbi/bin/blastall
	NUMTHREADS=8
	DATABASE=/home/puri/blast/db/nr
elif [ "${GW_HOSTNAME}" = "ferrum.cis.uab.edu" ]; then
	EXECUTABLE=/store/dBLAST/blast-2.2.17/bin/blastall
	NUMTHREADS=8
	DATABASE=/store/dBLAST/DBs/nr
else
	# Unknown host: report it and stop with a failure status.
	echo "Host \"${GW_HOSTNAME}\" configuration is not known"
	exit 1
fi

echo "Hostname according to Gridway = ${GW_HOSTNAME}"
echo "Output from \"uname -a\" = `uname -a`"
# Run BLAST with the host-specific settings plus the arguments passed in.
time $EXECUTABLE -d $DATABASE -a $NUMTHREADS "$@"

Notice that we had to specify the executable’s location for every host that the application has a chance of being submitted to.

With GridAtlas, the information about the application is stored entirely in the GridAtlas Aggregator(s) database so there is no need to specify nested if statements to get the job done. Instead you just specify the application name and any application inputs that will be needed. The following is an example of a GridAtlas script.

EXECUTABLE  = ${GA_APPLICATIONNAME} # The application ‘applicationname’ will be looked up in GAA.
ARGUMENTS   = -d ${GA_INPUT1} -a ${GA_INPUT2} -p blastp -i input.fas_${TASK_ID} -o results.out_${TASK_ID} # Application inputs looked up will be ‘input1’ and ‘input2’
STDIN_FILE  = /dev/null
STDOUT_FILE = gw.out.${JOB_ID}
STDERR_FILE = gw.err.${JOB_ID}
INPUT_FILES = input.fas_${TASK_ID}
OUTPUT_FILES = results.out_${TASK_ID}

Any variable that begins with ${GA_ will be looked up in the GridAtlas Aggregator for a match according to the hostname and username that submitted the application. If an application is not found, an error will be thrown and the user must correct it before proceeding. (Notice that the application must go beside EXECUTABLE and the application inputs must go beside ARGUMENTS.) The appropriate submission script will then be automatically generated and submitted for the user. This process ensures that a user does not need to know anything other than the name of the application they wish to run and the names of the application inputs. All of the details are pushed back onto the GridAtlas Aggregator.
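
To illustrate the kind of translation described above, here is a minimal sketch of placeholder expansion. It is not the GridAtlasScriptCreator implementation: the function name ga_expand is made up, and a flat NAME=value file stands in for the answers a GridAtlas Aggregator would give for one hostname and username. Other variables such as ${TASK_ID} are left untouched for GridWay.

```shell
# Sketch only: ga_expand is a hypothetical helper, not GridAtlas code.
# Usage: ga_expand TEMPLATE_FILE LOOKUP_FILE
# LOOKUP_FILE holds one NAME=value per line and stands in for the aggregator.
# Limitation: values containing '|' or '&' would confuse the sed replacement.
ga_expand() {
    template=$1
    lookup=$2
    result=$(cat "$template") || return 1
    # Collect each distinct ${GA_NAME} placeholder used in the template.
    names=$(grep -o '\${GA_[A-Za-z0-9_]*}' "$template" | sed 's/^\${GA_//; s/}$//' | sort -u)
    for name in $names; do
        # Look up the registered value; fail if nothing is registered.
        value=$(sed -n "s/^${name}=//p" "$lookup" | head -n 1)
        if [ -z "$value" ]; then
            echo "GridAtlas lookup failed for: $name" >&2
            return 1
        fi
        # '|' is the sed delimiter so that '/' in paths needs no escaping.
        result=$(printf '%s\n' "$result" | sed "s|\\\${GA_${name}}|${value}|g")
    done
    printf '%s\n' "$result"
}
```

With a lookup file containing BLAST=/opt/Bio/ncbi/bin/blastall, a template line EXECUTABLE = ${GA_BLAST} becomes EXECUTABLE = /opt/Bio/ncbi/bin/blastall, and a placeholder with no registered value makes the function stop with an error, mirroring the behaviour described above.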