Analysis/Job Submission

The backend job scheduler is Gridengine, which functions similarly to PBS Pro/OpenPBS.

Resource Requests

The table below summarises the main resource attributes used by most jobs. Other attributes, which help fine-tune how a job is scheduled, are described in the subsections.

Resource	Attribute	Description	Default Value
Parallel Environments (`-pe`)	smp	Allocate X CPUs on the same compute node	Optional; if not specified, a job uses 1 CPU
Parallel Environments (`-pe`)	mpi	Allocate X CPUs across multiple compute nodes; mainly used by jobs built on the Open MPI framework	Optional; if not specified, a job uses 1 CPU
Resource request list (`-l`)	mem	The maximum amount of memory a job can use	Default 1024M
Resource request list (`-l`)	jobfs	The maximum amount of disk space a job can use	Default 1G
Resource request list (`-l`)	walltime	The run time limit (elapsed time) before a job is killed by the job scheduler	Default 0:30:0
Resource request list (`-l`)	ngpus	The number of GPGPUs	N/A
project_name (`-P`)	project_name	Make the job consume the resource quota defined for a project. See the Projects subsection for details.	Optional; if not specified, the default per-user quota is consumed instead.

Submit a Batch Job

A batch job is submitted with the qsub command, using the following pattern:

# submit a job which calls a script (bash, other shell, Python scripts, etc.)
qsub -N JOB_NAME -pe smp NUMBER_OF_CPU -l ATTR1=VAL1,ATTR2=VAL2 SCRIPT

# submit a job which calls a BINARY (anything that is not a script, such as sleep, dd, etc.)
qsub -N JOB_NAME -pe smp NUMBER_OF_CPU -l ATTR1=VAL1,ATTR2=VAL2 -b y BINARY

Examples

# a very big sleep job that needs 16 x CPUs, 2 x GPGPUs, 64GB memory, 10G disk space
qsub -b y -N generic_gpgpu -pe smp 16 -l ngpus=2,mem=64G,jobfs=10G sleep 1m

# a smaller sleep job that requires the specific A2 GPGPU...
qsub -b y -N t1000_gpgpu -pe smp 8 -l ngpus=2,gpu_model=A2,mem=16G,jobfs=10G sleep 1m

# a big job that runs on multiple H100 nodes inside the same physical rack/cabinet F (rack awareness)
qsub -b y -N h100_gpgpu -pe mpi 256 -l ngpus=2,gpu_model=H100,rack=f,mem=128G,jobfs=100G sleep 1m

Job Status

TODO

Submission Script

For larger and more complex analyses, a qsub submission script is very useful. A submission script contains pre-populated qsub parameters and can easily be reused, distributed and version controlled. It looks like:

#!/bin/bash
#
# It prints the actual path of the job scratch directory.
#$ -pe smp 8
#$ -j y
#$ -e $JOB_ID_$JOB_NAME.out
#$ -o $JOB_ID_$JOB_NAME.out
#$ -cwd
#$ -N dd_smp
#$ -l mem=1G,jobfs=110G,tmpfree=150G,walltime=00:30:00
#$ -P project_name

echo "$HOSTNAME $TMPDIR $jobfs"

# about 107GB
dd if=/dev/zero of=$TMPDIR/dd.test bs=512M count=200
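
The size quoted in the comment can be checked with shell arithmetic. This is a small sketch that assumes the M suffix in bs=512M means MiB (1024 x 1024 bytes), as it does in GNU dd. How the scheduler interprets the G in jobfs=110G (decimal or binary) is not stated here, so both figures are printed.

```shell
# bytes written by: dd bs=512M count=200 (assuming M = 1024*1024 bytes)
bytes=$(( 512 * 1024 * 1024 * 200 ))

gb=$(( bytes / 1000000000 ))           # decimal gigabytes, rounded down
gib=$(( bytes / 1024 / 1024 / 1024 ))  # binary gibibytes

echo "$bytes bytes = $gb GB = $gib GiB"
```

Under either interpretation the file fits within the jobfs=110G request made in the script.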

To submit, pass the script file to qsub.
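
A minimal sketch of the submission step, assuming the script above was saved as dd_smp.sh (a hypothetical file name). Since the qsub parameters are embedded in the `#$` lines, no further options should be needed on the command line. The guard around qsub is only there so the snippet can be tried on a machine without Gridengine installed.

```shell
# hypothetical file name for the submission script shown above
SCRIPT=dd_smp.sh

if command -v qsub >/dev/null 2>&1; then
    # the #$ lines inside the script supply -pe, -l, -N, -P and so on
    qsub "$SCRIPT"
else
    echo "qsub not found; on the cluster this would run: qsub $SCRIPT"
fi
```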

Local Scratch

Local scratch is temporary storage! All data in it is deleted when the job completes. Make sure results are copied back to persistent storage before the job ends!

Each compute node is equipped with dedicated local storage, which acts as the “Tier 0” scratch storage. When a job starts, it is given a dedicated (but temporary) directory on the scratch storage, and its path is assigned to the variable $TMPDIR. Inside the submission script, $TMPDIR can be used as the working directory for job data.
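
One common way to use $TMPDIR is a stage-in, compute, stage-out sequence. The sketch below is an illustration under assumptions, not a site-provided template: the job name scratch_demo, the results directory and the tiny sort workload are all hypothetical placeholders, and the fallback to /tmp exists only so the script can be tried outside a job.

```shell
#!/bin/bash
#$ -N scratch_demo
#$ -cwd
#$ -l jobfs=1G

# $TMPDIR is set by the scheduler inside a job; fall back to /tmp so the
# sketch can also be tried outside the cluster.
SCRATCH="${TMPDIR:-/tmp}"
WORK="$(mktemp -d "$SCRATCH/scratch_demo.XXXXXX")"

# hypothetical destination on shared storage (here: under the submit directory)
RESULTS="$PWD/results"
mkdir -p "$RESULTS"

# 1. stage in: place input data on local scratch
#    (printf stands in for copying a real input file from shared storage)
printf 'b\na\nc\n' > "$WORK/input.txt"

# 2. compute: read and write on local scratch only
sort "$WORK/input.txt" > "$WORK/output.txt"

# 3. stage out: copy results back before the job ends and scratch is wiped
cp "$WORK/output.txt" "$RESULTS/output.txt"

# remove the working directory once the results are safe
rm -rf "$WORK"
```

The stage-out step is the important one: anything left only in $TMPDIR is lost when the job completes.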

It is generally a good idea to use this “Tier 0” local scratch, as it gives the best disk performance compared to network shared storage (such as the home directory).

Walltime Limit

TBA

Projects

TBA

Rack Awareness

TBA

CSE Research Computing