Last modified on 01/28/08 12:55:20

    January 2008 R-group Status Report

    At the moment, the best default configuration for a Globus-facing ROCKS system seems to be Globus + PBS + MPICH, because Globus's GRAM (the job submit engine) supports PBS and MPICH out of the box. This is also the configuration that seems to ship with ROCKS.

    The default configuration for ROCKS clusters, at least locally, has been SGE + MPICH. It seems that SGE is the preferred scheduler for ROCKS, and that choice forces a compromise with respect to Globus: an external GRAM provider for SGE needs to be used. It seems this provider was built off the PBS one, and it supports MPICH. Installing it has been somewhat error prone, and it's not yet clear whether our install errors or the features of the code itself contribute to the submit-time problems. The SGE GRAM is included with the Globus Roll, so some further investigation into that is warranted.

    R jobs that don't require multiple processors, and therefore don't need MPI to harness multiple CPUs, seem to execute successfully when submitted to the Globus+SGE+single-job GRAM chain. The main server-side issue here is configuration management across platforms. This issue is shared by all R jobs and should be addressed by the containerization approach.
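    As a rough illustration of what the single-job case looks like at the GRAM level, the sketch below builds a pre-WS GRAM RSL string in Python. The helper name build_rsl, the R path, and the script name are assumptions for illustration only; the RSL attributes used (executable, arguments, count, jobType) are standard GRAM ones, but the exact string our jobs need has not been verified against the local SGE GRAM provider.

```python
def build_rsl(executable, arguments=(), count=1, job_type="single"):
    """Build a GRAM RSL string (hypothetical helper, not part of Globus).

    job_type is restricted to the values discussed here:
    "single" for one-process R jobs, "mpi" for MPI-launched jobs,
    "multiple" for N independent copies.
    """
    if job_type not in ("single", "multiple", "mpi"):
        raise ValueError("unsupported jobType: %s" % job_type)

    parts = ["(executable=%s)" % executable]
    if arguments:
        # Quote each argument so embedded spaces survive RSL parsing.
        quoted = " ".join('"%s"' % arg for arg in arguments)
        parts.append("(arguments=%s)" % quoted)
    parts.append("(count=%d)" % count)
    parts.append("(jobType=%s)" % job_type)

    # A leading "&" marks a conjunction of attribute relations.
    return "&" + "".join(parts)


# Example: a single-CPU R job. The paths are placeholders.
single_job = build_rsl("/usr/bin/R", ["--vanilla", "-f", "job.R"])
print(single_job)
```

    A string like this would typically be handed to a GRAM client and directed at the SGE job manager on the cluster; only the jobType and count would change for an MPI job.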

    The next problem we run into is with R jobs that require MPI support. Submitting them through Globus has been very problematic. The default stack for R + MPI is R code -> snow library -> Rmpi -> MPI implementation. We should be able to stay MPI-implementation neutral, because Rmpi supports the popular MPI implementations: LAM/MPI, OpenMPI, MPICH, and MPICH2. Testing an Rmpi build against MPICH on Cheaha has clarified some differences between MPI and MPI-2 features. The snow library assumes MPI_Comm_spawn support, which is an MPI-2 feature, and MPICH implements MPI, not MPI-2. The snow maintainer has gotten MPICH working before, and I'm exploring whether it's still reasonable to use MPICH.

    Without spawn support, mpirun needs to be used to launch the process pool. This brings the sticking point back to SGE+MPICH: when mpirun is used to start the snow-based job, it doesn't create the R instances on other clusters. This could be a limitation in the MPICH configuration of my current container. Further debugging is required on that front.

    It looks like it will be possible to allow Rmpi to fully control the selection of the MPI implementation for the container by just putting the relevant Rmpi library into the R code library path. This will require coordination with the scheduler launch script because of the need for mpirun wrapping in the case of MPICH.
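    The coordination described above can be sketched as a small launch-script helper. This is a minimal sketch under stated assumptions, not the actual scheduler script: the function name build_launch, the /opt/rmpi directory layout (one Rmpi build per MPI implementation), and the use of Rscript as the R entry point are all hypothetical. R_LIBS is R's standard library search path variable, and mpirun -np is the usual MPICH launch form. The grouping of LAM/MPI, OpenMPI, and MPICH2 as spawn-capable reflects their MPI-2 dynamic process support.

```python
import os
import posixpath

# Implementations with MPI-2 spawn support, so snow can start its own workers.
SPAWN_CAPABLE = {"lam", "openmpi", "mpich2"}
# MPICH lacks spawn, so the process pool must be started by mpirun.
KNOWN_IMPLS = SPAWN_CAPABLE | {"mpich"}


def build_launch(mpi_impl, nprocs, r_script,
                 rmpi_lib_root="/opt/rmpi", base_env=None):
    """Return (command, env) for launching an R job (hypothetical helper).

    rmpi_lib_root is an assumed layout: <root>/<impl> holds the Rmpi
    build linked against that MPI implementation.
    """
    if mpi_impl not in KNOWN_IMPLS:
        raise ValueError("unknown MPI implementation: %s" % mpi_impl)

    env = dict(os.environ if base_env is None else base_env)

    # Put the matching Rmpi build first on R's library search path.
    rmpi_lib = posixpath.join(rmpi_lib_root, mpi_impl)
    existing = env.get("R_LIBS")
    env["R_LIBS"] = rmpi_lib if not existing else rmpi_lib + ":" + existing

    r_command = ["Rscript", r_script]  # placeholder R entry point
    if mpi_impl in SPAWN_CAPABLE:
        command = r_command
    else:
        # No spawn support: wrap the R command in mpirun.
        command = ["mpirun", "-np", str(nprocs)] + r_command
    return command, env


cmd, env = build_launch("mpich", 4, "job.R", base_env={})
print(cmd, env["R_LIBS"])
```

    Keeping the implementation choice in one place like this means the R code itself never names an MPI implementation. Whether a plain R command under mpirun is enough for a snow job is exactly the part that remains untested.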

    Allowing Rmpi to be swapped around may also affect how the R code is structured, e.g., can makeCluster() gloss over the difference, or does the app need to use getCluster()? The makeCluster() code seems to test the waters with a getCluster() call, but the non-working mpirun launch has prevented testing.