Welcome to the R Group

This project explores migrating R-based statistical analysis jobs to a grid-based workflow using the UABgrid meta-scheduler instance of GridWay. Details and documentation will be recorded here as the exploration progresses.

  • BasicTests - Run through these to ensure you can use the meta-scheduler environment
  • R Test Scripts - Make sure your cluster-specific accounts are configured to run R properly. Use these scripts to test your configuration.
  • RNotes - Notes on running R
  • GridWayNotes - Exploration notes on GridWay
  • GettingStarted - A brief run-through of executing an example batch script and R batch scripts on SGE
  • SSG R-Methodological Analysis Scripts - Notes on executing the R scripts provided by the SSG on cheaha
  • CommandLineProcessing - Exploring command line processing in R
  • Installing R - Exploring installation of R on a Linux (openSUSE 10.3) desktop machine
  • WorkflowLogic - An overview of the generic workflow structure and logic.
  • ModifiedScripts - Modified versions of the SSG's R-Methodological analysis R script (MigAnalysis.R) and SGE job-submission script (arrayjobsge)
  • ResourceSelection - Documenting the availability and selection of compute resources to power the workflow.
  • ContainerManagement - Building containers to house the jobs

Discussions

Status Reports

Presentations

Powering Statistical Genetics with the Grid: Using GridWay to Automate R-based Workflows
Grid Enabling Workshop, Mardi Gras Conference, January 30, 2008, Baton Rouge, LA.

Resources

  • OpenMPI - Newer MPI implementation, more likely to be available on distributed clusters than the older LAM/MPI.
  • LAM/MPI implementation - foundation of Rmpi/Snow, now in maintenance mode and superseded by OpenMPI
  • RWebServices - example implementation of web service interface to R related to caGrid project
  • http://wiki.fhcrc.org/caBioc - caGrid BioConductor site for RWebServices
  • Omega Project for Statistical Computing - site for the R & S Java linking toolkit SJava and other statistical tools.
  • Bcfg2 - a configuration management tool being explored for use on UABgrid that promises a solution to application maintenance across distributed clusters

References

R and the Grid

  • RWebServices - Related Work (pdf) - describes similar efforts at parallelizing R workflows using web services. Covers the shared library approach of RWebServices and the message stream approach of OSS and Rserve, which are akin to our current effort using GridWay as the distribution fabric.
  • RWebServices - Lessons Learned (pdf) - describes some of the insights gained from the RWebServices shared library implementation approach. Section 3 describes considerations for adapting to web services. The coarse-grained workflow considerations are valid for all large distributed configurations.
  • RWebServices - Connecting R to Java (pdf) - details considerations in linking loosely coupled and tightly coupled data systems. These concerns exist above the GridWay layer we are exploring but are valuable for understanding higher-level interface considerations.

Resource Configuration

Support

Mailing Lists

  • R Group - Working group to explore the migration of R workflows to the UABgrid collaboration environment
