User Documentation


PBS (Portable Batch System)

PBS (Portable Batch System) is used to create and submit batch jobs to a number of cluster resources. A batch job is a shell script containing the commands you want to run on a set of execution machines. The script can also declare the characteristics (attributes) of the job and the resources it needs (such as memory, CPU, and time).

qsub & msub

The qsub command is used to submit a batch job to a queue. On Cy-Tera, msub is used instead. All options to qsub/msub can be specified either on the command line or as a line in the script (known as an embedded option). Command line options take precedence over embedded options. A script is submitted by passing its file name to qsub, or to msub on Cy-Tera.
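As a minimal sketch of submission and of the precedence rule: the script below embeds a job name with a #PBS line, and the submission command overrides it on the command line. The file name hello.pbs, both job names, and the small resource request are hypothetical placeholders, and the submission step is skipped when the submit command is not installed.

```shell
# Write a small job script with an embedded option (-N sets the job name).
cat > hello.pbs <<'EOF'
#!/bin/bash
#PBS -N embedded_name
#PBS -l walltime=00:05:00,nodes=1:ppn=1
echo "Hello from $(uname -n)"
EOF

# Submit it. The -N given here takes precedence over the embedded -N.
# On Cy-Tera, run with SUBMIT=msub instead.
SUBMIT=${SUBMIT:-qsub}
if command -v "$SUBMIT" >/dev/null 2>&1; then
    "$SUBMIT" -N cli_name hello.pbs
else
    echo "$SUBMIT not found; hello.pbs was written but not submitted"
fi
```

With this submission the job should appear in the queue as cli_name rather than embedded_name.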


The main qsub options are listed below. The sample batch script illustrates qsub usage and options. Also see the qsub man page for other options.

-A project Defines the account associated with the job.
-l resource-list Specifies the resources the job needs:
    nodes: number of 2-processor nodes (default: 1 node)
    ppn: how many processors per node to use (1 or 2) (default: ppn=1)
    resource: resource to be used. See the section on available resources.
    Examples:
    #PBS -l walltime=00:30:00,nodes=4:ppn=2
    #PBS -l walltime=00:30:00,nodes=4:ppn=2:himem
-q queue_name Specifies the queue name.
-N jobname Specifies the job name.
-o out_file Stores the standard output of the job in the file out_file. After the job is done, this file will be found in the directory from which the qsub command was issued.
-e err_file Stores the standard error of the job in the file err_file. After the job is done, this file will be found in the directory from which the qsub command was issued.
-j oe Merges standard output and standard error into the standard output file.
-k oe Places the standard output and standard error files in your $HOME directory. The filenames will be of the form .o and .e respectively. If this option is used in conjunction with -j oe, standard output and standard error are combined into the standard output file. The -k option overrides the -o option.
-V Exports all your environment variables to the batch job.
-m be Sends mail at the beginning and end of a job.
-M myemail@myuniv.edu Sends any mail to the given email address.
-X Enables X11 forwarding.

Most users will prefer to automate the submission of their jobs, which is possible via PBS scripts.
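The sample script this page refers to is not reproduced here, so the following is only a sketch assembled from the options listed in this section. The project, queue, job, and output file names are placeholders (assumptions), as is the final command, which stands in for your own workload.

```shell
#!/bin/bash
# Embedded options: each #PBS line is read as if given on the command line.
# All names below (my_project, my_queue, sample_job) are placeholders.
#PBS -A my_project
#PBS -N sample_job
#PBS -q my_queue
#PBS -l walltime=00:30:00,nodes=4:ppn=2
#PBS -o sample_job.out
#PBS -j oe
#PBS -m be
#PBS -M myemail@myuniv.edu
#PBS -V

# PBS sets PBS_O_WORKDIR to the directory the job was submitted from;
# fall back to the current directory when run outside the batch system.
cd "${PBS_O_WORKDIR:-$PWD}"

# Placeholder workload: replace with the commands you want to run.
JOB_HOST=$(uname -n)
echo "Job running on $JOB_HOST in $(pwd)"
```

Because the shell treats #PBS lines as comments, the script can also be run directly for a quick check before it is submitted.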


The main msub options are listed below. The sample batch script illustrates msub usage and options. Also see the msub man page for other options.

-A project Defines the account associated with the job.
-l resource-list Specifies the resources the job needs:
    nodes: number of 2-processor nodes (default: 1 node)
    ppn: how many processors per node to use (1 or 2) (default: ppn=1)
    resource: resource to be used. See the section on available resources.
    Examples:
    #PBS -l walltime=00:30:00,nodes=4:ppn=2
-q queue_name Specifies the queue name.
-N jobname Specifies the job name.
-o out_file Stores the standard output of the job in the file out_file. After the job is done, this file will be found in the directory from which the msub command was issued.
-e err_file Stores the standard error of the job in the file err_file. After the job is done, this file will be found in the directory from which the msub command was issued.
-j oe Merges standard output and standard error into the standard output file.
-k oe Places the standard output and standard error files in your $HOME directory. The filenames will be of the form .o and .e respectively. If this option is used in conjunction with -j oe, standard output and standard error are combined into the standard output file. The -k option overrides the -o option.
-V Exports all your environment variables to the batch job.
-m be Sends mail at the beginning and end of a job.
-M myemail@myuniv.edu Sends any mail to the given email address.

As with qsub, submission through msub can be automated with PBS scripts.

Interactive Jobs

qsub -I & msub -I
The -I option tells qsub/msub that you want to run an interactive job. This is useful for direct debugging and development. You may also use other qsub/msub options, such as those documented in the batch sample script. For example, the following command:

qsub -I -l walltime=00:30:00,nodes=2:ppn=2

or, on Cy-Tera:

msub -I -l walltime=00:30:00,nodes=2:ppn=2

will run an interactive job with a wall clock limit of 30 minutes, using two nodes and two processors per node.

After you enter the command, you will have to wait for Torque to start the job. As with any job, your interactive job waits in the queue until the specified number of nodes is available. If you request a small number of nodes for a short time, the wait should be shorter because your job can backfill among larger jobs. Once the job starts, you are logged into the launch node and can use the appropriate command to start your program.

Example:

or on Cy-Tera:

When you are done with your runs, you can use the exit command to end the job.

qstat

The qstat command displays the status of batch jobs.

-a gives the status of all jobs on the system.
-n lists the nodes allocated to a running job in addition to basic information.
-f PBS_JOBID gives detailed information on a particular job.
-q provides summary information on all the queues.

See the qstat man page for other available options.
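For illustration, the options above might be used as follows. The job ID is a hypothetical placeholder, and the commands are skipped on machines where qstat is not installed.

```shell
# Hypothetical job ID; use the ID printed when you submitted your job.
JOB_ID=${JOB_ID:-12345}

if command -v qstat >/dev/null 2>&1; then
    qstat -a             # status of all jobs on the system
    qstat -n "$JOB_ID"   # basic information plus the allocated nodes
    qstat -f "$JOB_ID"   # detailed information on this job
    qstat -q             # summary information on all the queues
    QSTAT_RAN=yes
else
    echo "qstat not found; run this on a cluster login node"
    QSTAT_RAN=no
fi
```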

showq

Can only be used on Cy-Tera

Displays information about active, eligible, blocked, and/or recently completed jobs. Since the resource manager is not actually scheduling jobs, the job ordering it displays is not valid. The showq command displays the actual job ordering under the Moab Workload Manager. When used without flags, this command displays all jobs in active, idle, and non-queued states.

-b display blocked jobs only.
-c display details about recently completed jobs.
-g display grid job and system IDs for all jobs.
-i display extended details about idle jobs.
-p display only jobs assigned to the specified partition.
-r display extended details about active (running) jobs.
-R display only jobs which overlap the specified reservation.
-v display local and full resource manager job IDs as well as partitions. If specified with the -i option, also displays the job reservation time.

qdel

The qdel command deletes a queued job or kills a running job. The syntax is qdel PBS_JOBID. Note: You only need to use the numeric part of the Job ID.
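As a small sketch of the note above, assuming the full job ID has the usual number.server form: shell parameter expansion can strip everything after the first dot. The ID used here is a made-up placeholder, and the snippet only prints the qdel command rather than running it.

```shell
# Hypothetical full job ID, as printed at submission time.
FULL_JOB_ID="12345.headnode"

# Remove the longest suffix starting at the first dot, keeping the numeric part.
NUMERIC_ID="${FULL_JOB_ID%%.*}"

# Show the command that would delete the job.
echo "qdel $NUMERIC_ID"
```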

canceljob

Can only be used on Cy-Tera

The canceljob command selectively cancels the specified job(s), whether active, idle, or non-queued. The syntax is canceljob JOBID.

