h3. Running Jobs

For local job submissions, PBS (the Portable Batch System) can be used to create batch jobs and submit them to a large number of cluster machines. The other batch facilities used on LinkSCEEM resources are SLURM and SGE.
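
As a rough illustration of what a PBS batch job looks like, the sketch below assembles the text of a small job script. It assumes Torque-style resource syntax ({{nodes=...:ppn=...}}); the job name, executable, and the helper {{render_pbs_script}} are hypothetical, and the resource syntax and queue names accepted on a given LinkSCEEM machine may differ.

```python
def render_pbs_script(job_name, nodes, ppn, walltime, command, queue=None):
    """Return the text of a PBS job script (Torque-style resource syntax assumed)."""
    lines = [
        "#!/bin/bash",
        f"#PBS -N {job_name}",               # job name shown by qstat
        f"#PBS -l nodes={nodes}:ppn={ppn}",  # nodes and processes per node
        f"#PBS -l walltime={walltime}",      # maximum run time, HH:MM:SS
    ]
    if queue is not None:
        lines.append(f"#PBS -q {queue}")     # queue name is site-specific
    lines += [
        "",
        "# Jobs start in the home directory; change to the submission directory.",
        "cd $PBS_O_WORKDIR",
        "",
        f"mpirun -np {nodes * ppn} {command}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical example: a 16-process MPI job on 2 nodes.
script = render_pbs_script("hello_mpi", nodes=2, ppn=8,
                           walltime="00:10:00", command="./hello_mpi")
print(script)
```

Saved to a file such as {{hello_mpi.pbs}}, a script of this shape would typically be submitted with {{qsub hello_mpi.pbs}}.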

For remote job submissions, the Globus Toolkit is layered on top of PBS and adds the authentication, scheduling, and resource description tools required for remote submission. If a number of jobs are to be run independently, Condor-G provides a software layer on top of Globus that offers advanced job submission and monitoring capabilities from a single script. In the case of a workflow pipeline, where jobs have dependencies, a Condor DAGMan script can control multiple Condor-G scripts, enforcing order-of-execution dependencies. If you require such tools, please let us know and we can try to implement them.
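
To make the idea of order-of-execution dependencies concrete, the sketch below writes a small DAGMan input file using the {{JOB}} and {{PARENT ... CHILD}} statements and computes one execution order that respects the dependencies. The three job names and their submit files are hypothetical, and the helper {{render_dag}} is only an illustration, not part of Condor.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def render_dag(submit_files, parents):
    """Build DAGMan input text.

    submit_files: job name -> Condor submit file (hypothetical names)
    parents:      job name -> list of jobs that must finish first
    """
    lines = [f"JOB {name} {path}" for name, path in submit_files.items()]
    for child, deps in parents.items():
        for parent in deps:
            lines.append(f"PARENT {parent} CHILD {child}")
    return "\n".join(lines) + "\n"


# Hypothetical three-stage pipeline: prepare -> simulate -> analyse.
submit_files = {
    "prepare": "prepare.sub",
    "simulate": "simulate.sub",
    "analyse": "analyse.sub",
}
parents = {"simulate": ["prepare"], "analyse": ["simulate"]}

dag_text = render_dag(submit_files, parents)
print(dag_text)

# One order in which the jobs could run without violating a dependency.
order = list(TopologicalSorter(parents).static_order())
print(order)
```

A file with this content would normally be handed to {{condor_submit_dag}}, which then submits each job only after its parents have completed.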

In this lesson we demonstrate how jobs may be submitted using three different job submission systems. We take the three compilation examples from the previous section (Compiling Jobs) and run them as parallel jobs on the different resources. The table below shows which LinkSCEEM resource and batch system is used in each of the examples.

|| LinkSCEEM Resource || Batch System ||
|CaSToRC |Portable Batch System (PBS)|
|NARSS |Custom or SLURM|
|Bibliotheca Alexandrina| Sun Grid Engine (SGE)|

Batch facilities such as PBS, SLURM, and SGE differ in their user interfaces as well as in the implementation of the batch environment. Common to all, however, is the availability of tools and commands for the most important operations in batch processing: job submission, job monitoring, and job control (hold, delete, resource request modification). In this lesson, we list the basic batch operations and their options, explain how to use the PBS and SGE batch environments, and describe the queue structure.
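
As a quick orientation, the common operations map onto the standard user commands of each system roughly as in the sketch below. The lookup table lists the usual command names; the helper {{batch_command}} and the file name and job ID in the example are hypothetical, and individual sites may wrap or restrict these commands.

```python
# Usual user commands for the basic batch operations in each system.
BATCH_COMMANDS = {
    "PBS": {
        "submit": "qsub", "monitor": "qstat", "hold": "qhold",
        "delete": "qdel", "modify": "qalter",
    },
    "SGE": {
        "submit": "qsub", "monitor": "qstat", "hold": "qhold",
        "delete": "qdel", "modify": "qalter",
    },
    "SLURM": {
        "submit": "sbatch", "monitor": "squeue", "hold": "scontrol hold",
        "delete": "scancel", "modify": "scontrol update",
    },
}


def batch_command(system, operation, *args):
    """Return the command line for an operation on the given batch system."""
    return " ".join([BATCH_COMMANDS[system][operation], *args])


# Hypothetical script name and job ID, for illustration only.
print(batch_command("PBS", "submit", "job.pbs"))
print(batch_command("SLURM", "submit", "job.sh"))
print(batch_command("SGE", "delete", "12345"))
```

PBS and SGE share most command names but differ in the directives and options they accept, so a job script written for one generally needs adapting for the other.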

* *[PBS (Portable Batch System)]*
* *[Checking accounting of Parallel Jobs on Cy-Tera]*
* *[The SGE Batch System]*
* *[Running a parallel job at Bibliotheca Alexandrina]*
* *[The LoadLeveler Parallel Job Scheduling System]*

Back to [Getting Started on LinkSCEEM HPC resources]