The workflow to submit batch processing to the Condor system is as follows:
Create a directory in which to submit jobs to the Condor system.
Make sure that the directory and files with which you plan to work are readable and writable by other users, which include Condor processes.
For example, type the following:
mkdir condor
cd condor
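The directory setup can also be scripted. The following is a minimal Python sketch, not part of the HMDC tooling: the helper name make_job_dir is hypothetical, and it assumes that "readable and writable by other users" means adding read, write, and search permission for group and others. Your site may prefer narrower permissions.

```python
import os
import stat


def make_job_dir(path):
    """Create path if needed and open it up to group and other users."""
    os.makedirs(path, exist_ok=True)
    current = stat.S_IMODE(os.stat(path).st_mode)
    # Add r, w, and x for group and others. On a directory the x bit is the
    # "search" permission, which is required to reach the files inside it.
    os.chmod(path, current | stat.S_IRWXG | stat.S_IRWXO)
    return path
```

Calling make_job_dir("condor") and then os.chdir("condor") mirrors the two shell commands in the example.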
You can request that a project directory be set up for you to use for batch processing. If you perform your batch processing within your home directory, your data and program files can consume much of your allotted space, which can cause problems with logging into the system. Working in a project space is therefore recommended. For more information on project spaces, go to our Projects and Shared Space page.
Choose an execution environment, called a universe, for your jobs.
At HMDC, you always use the vanilla universe. This execution environment supports processing of individual serial jobs, but has few other restrictions on the types of jobs that you can execute.
Make your jobs batch ready.
Batch processing runs in the background, meaning that you cannot input to your executable interactively. You must create a program or script that reads in your inputs from a file, and writes out your outputs to another file.
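As an illustration of a batch-ready program, here is a minimal Python sketch that takes all of its input from one file and writes all of its output to another, with no interactive prompts. The function name run_batch, the file names, and the one-number-per-line input format are assumptions made for this example.

```python
import sys


def run_batch(input_path, output_path):
    """Read one number per line from input_path; write a summary to output_path."""
    with open(input_path) as src:
        # Skip blank lines so a trailing newline does not break the run.
        values = [float(line) for line in src if line.strip()]
    total = sum(values)
    with open(output_path, "w") as dst:
        dst.write(f"count={len(values)}\n")
        dst.write(f"sum={total}\n")
    return len(values), total


# Hypothetical invocation: python summarize.py input.txt output.txt
if __name__ == "__main__" and len(sys.argv) == 3:
    run_batch(sys.argv[1], sys.argv[2])
```

Because both paths come from the command line, the same script can be queued many times with different arguments and never waits for keyboard input.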
You also must identify the full path and executable source to use for your Condor cluster. The default executable for the condor_submit_util script is the R language. In the RCE, the path and executable source for this language is /usr/bin/R. Any command-line application or program can be submitted as a batch job (Matlab, Stata, Python, etc.).
If you choose to use the condor_submit_util script to create the submit description file (or submit file) and submit your jobs to the Condor system for batch processing automatically, skip to the next step.
If you choose to submit your batch processing to the Condor system manually, create a submit file.
A submit file is a plain-text file that describes a batch of jobs for the Condor software. This file contains the following descriptors:
Executable program path and file name
Program arguments (properly quoted -- see manual)
Input and output file names
Log and error file names
Here is an example of a basic submit file:
Universe = vanilla
Executable = /usr/bin/R
Arguments = --no-save --no-restore
should_transfer_files = NO
Requirements = Memory >= 32
output = $HOME/mybatchjob/output.txt
error = $HOME/mybatchjob/error.txt
Log = $HOME/mybatchjob/log.txt
Queue 1
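If you write many similar submit files, generating them from a script may be less error-prone than editing by hand. The following is a minimal Python sketch under that assumption. The helper render_submit_file and the EXAMPLE dictionary are hypothetical names, and the descriptors are copied from the example submit file rather than taken from any complete list.

```python
def render_submit_file(descriptors, queue_count=1):
    """Return submit-file text: one 'name = value' line per descriptor, then Queue."""
    lines = [f"{name} = {value}" for name, value in descriptors.items()]
    # The Queue statement comes last, after all descriptors are set.
    lines.append(f"Queue {queue_count}")
    return "\n".join(lines) + "\n"


# Values copied from the example submit file; dict order is preserved.
EXAMPLE = {
    "Universe": "vanilla",
    "Executable": "/usr/bin/R",
    "Arguments": "--no-save --no-restore",
    "should_transfer_files": "NO",
    "Requirements": "Memory >= 32",
    "output": "$HOME/mybatchjob/output.txt",
    "error": "$HOME/mybatchjob/error.txt",
    "Log": "$HOME/mybatchjob/log.txt",
}
```

Writing the result of render_submit_file(EXAMPLE) to a file such as mybatchjob.submit (a hypothetical name) yields the same nine lines as the example.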
If you chose to use the script, execute the condor_submit_util command to write the submit file and submit your program automatically to the Condor job queue.
If you chose to write your own submit file, execute the condor_submit <submit file>.submit command to submit your jobs to the queue.
Condor then checks the submit file for errors, creates a ClassAd object and the object attributes for that cluster, and then places this object in the queue for processing.
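The manual submission step can be wrapped in a script. The following is a minimal Python sketch, assuming condor_submit is on your PATH and takes the submit file as its only argument, as in the command above. The function names and the file name myjob.submit are hypothetical.

```python
import shutil
import subprocess


def build_submit_command(submit_file):
    """Return the argument list for submitting one submit file."""
    return ["condor_submit", submit_file]


def submit(submit_file):
    """Run condor_submit and return its output, or None if it is not installed."""
    if shutil.which("condor_submit") is None:
        return None
    # check=True raises CalledProcessError if condor_submit exits with an error,
    # for example when it rejects the submit file.
    result = subprocess.run(
        build_submit_command(submit_file),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

For example, submit("myjob.submit") would return whatever condor_submit prints on success, and it raises an exception if the command fails.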