Using condor_submit_util (recommended for new users)

After you set up your working directory and define your batch processing parameters, you can submit your script and input files for processing. You can use condor_submit_util to build your submit file and submit your batch, or you can write the submit file yourself and submit it to the cluster with the condor_submit command. If you are new to batch processing with HTCondor, we recommend using condor_submit_util.

To build a submit file automatically and submit your program for batch processing, you can use the Automated Condor Submission script (aka condor_submit_util) in two modes: interactive or command line.

Note: If you do not specify any options when you use condor_submit_util, it enters interactive mode automatically. Also, if you do not specify required options when you use condor_submit_util in command-line mode, the script enters interactive mode automatically, or it reports an error and returns you to the command-line prompt.

For examples of using condor_submit_util, see Other Batch Examples.

Working interactively with condor_submit_util

When you use the script in interactive mode, you can press the Return key to accept default values. Default values are specified in the prompts inside square brackets, and appear at the end of the prompt.

To use the condor_submit_util script in interactive mode:

  1. Execute the condor_submit_util command.

    Type the following at the command prompt in your Condor working directory:

    > condor_submit_util
    *** No arguments specified, defaulting to interactive mode...
    *** Entering interactive mode.
    *** Press return to use default value.
    *** Some options allow the use of '--' to unset the value.
  2. The script first prompts you for the executable program that you want to submit for batch processing, and then for the list of arguments to pass to that executable:

    Enter executable to submit [/usr/bin/R]: <executable name>
    Enter arguments to /usr/bin/R [--no-save --vanilla]: <arguments>

    The default argument --no-save tells R not to save the workspace at exit. The default argument --vanilla tells R not to read any user or site profiles or restore saved data at startup, and not to save data files at exit.

    If you do not have any arguments to apply to your executable, then type -- to supply no arguments.

  3. Next, the script prompts you to provide a name or pattern for the input, output, log, and error files for this Condor cluster submission. You can include a relative path in these entries, if you choose:

    Enter input file base [in]: <input path and file name or pattern>
    Enter output file base [out]: <output path and file name or pattern>
    Enter log file base [log]: <log path and file name or pattern>
    Enter error file base [error]: <error path and file name or pattern>

    Note: If you are using the batch example, the input file is bootstrap.R.

  4. After you specify the files, the script prompts you for the number of iterations, that is, the number of times to run your program:

    Enter number of iterations [10]: <integer>
  5. The system creates the submit file for this batch process using your responses to script prompts.

    An example submit file is shown here. To view the contents of your submit file, include the option -v (verbose) when you launch the condor_submit_util script:

    *** creating submit file '<login account name>-<date-time>.submit'

    Universe = vanilla
    Executable = /usr/bin/R
    Arguments = --no-save --vanilla
    when_to_transfer_output = ON_EXIT_OR_EVICT
    transfer_output_files = <output file>

    input = <input file>
    output = <output file>
    error = <error file>
    Log = <log file>
    Queue <integer>
  6. If you use the verbose option, the script prompts you to confirm that the submit file is correct. To continue, press Return or type y.

    Condor checks the submit file for errors, creates the ClassAd object for your submission, and adds that object to the end of the queue for processing. The script lists messages that report this progress in your terminal window, and includes the cluster number assigned to the batch process. For example:

    Is this correct? (Enter y or n) [yes]: y
    ] submitting job to condor...
    ] removing submit file '<login account name>-<date-time>'
    *** Job successfully submitted to cluster <cluster ID>.
  7. Finally, the script asks whether you want to receive email when your batch processing is complete. Press Return or type y to receive email, or type n to skip notification and exit the script.

    If you choose to receive email, the script prompts you for the notification address before it exits. The default address is your email account on the server on which you launched the script. For example:

    Would you like to be notified when your jobs complete? (Enter y or n)
    [yes]: y
    Please enter your email address [<your email account on this server>]:
    *** creating watch file '/nfs/fs1/projects/condor_watch/<Condor machine>.<batch cluster>.<your email>'
  8. View your job queue to ensure that your batch processing begins execution successfully.

    Use the condor_q command to check the queue. For example:

    > condor_q

    -- Submitter: vnc.hmdc.harvard.edu : <10.0.0.47:60603> : vnc.hmdc.harvard.edu
    ID      OWNER    SUBMITTED     RUN_TIME    ST PRI SIZE CMD
    9.0     arose    10/4 11:02    0+00:00:00  R  0   9.8  dwarves.pl
    9.1     arose    10/4 11:02    0+00:00:00  R  0   9.8  dwarves.pl
    9.2     arose    10/4 11:02    0+00:00:00  I  0   9.8  dwarves.pl
    9.3     arose    10/4 11:02    0+00:00:00  R  0   9.8  dwarves.pl

    4 jobs; 1 idle, 3 running, 0 held
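If you check the queue often, a small filter can reduce the listing to a one-line summary. The following is a minimal sketch, not part of condor_submit_util: the helper name summarize_states is hypothetical, the sample rows are modeled on the listing above, and it assumes that job rows start with a cluster.process ID and that ST is the sixth whitespace-separated field. The column layout may differ between HTCondor versions, so check your own condor_q output first.

```shell
#!/bin/sh
# Sample rows modeled on the condor_q listing above. In practice you might
# pipe the real command instead, for example: condor_q | summarize_states
sample_queue='ID      OWNER    SUBMITTED     RUN_TIME    ST PRI SIZE CMD
9.0     arose    10/4 11:02    0+00:00:00  R  0   9.8  dwarves.pl
9.1     arose    10/4 11:02    0+00:00:00  R  0   9.8  dwarves.pl
9.2     arose    10/4 11:02    0+00:00:00  I  0   9.8  dwarves.pl
9.3     arose    10/4 11:02    0+00:00:00  R  0   9.8  dwarves.pl'

# Hypothetical helper: count jobs per state.
# Assumption: ST is field 6 (the SUBMITTED date and time are two fields).
summarize_states() {
    awk '$1 ~ /^[0-9]+\.[0-9]+$/ { n[$6]++ }
         END { printf "%d idle, %d running, %d held\n", n["I"], n["R"], n["H"] }'
}

summary=$(printf '%s\n' "$sample_queue" | summarize_states)
echo "$summary"
```

Rows that do not begin with a job ID, such as the header and the submitter line, are skipped.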

Working with command arguments to condor_submit_util

When you use the script in command-line mode, you must specify all required options or the script does not execute. For example, the default number of iterations for the script is 10. If you do not have 10 input files in your working directory and you do not enter the option to specify the correct number of iterations that you plan to perform, the script does not execute and returns a message similar to the following:

> condor_submit_util -v
*** Fatal error; exiting script
*** Reason: could not find input file 'in.7'.
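One way to avoid this error is to count your input files before you submit and pass that count with -n. The sketch below is only an illustration: count_inputs is a hypothetical helper, the scratch directory and its seven empty files stand in for a real working directory, and it assumes that input files are numbered from 0 with no gaps. It prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper: count files named <base>.0, <base>.1, ... with no gaps.
count_inputs() {
    base=$1
    n=0
    while [ -e "$base.$n" ]; do
        n=$((n + 1))
    done
    echo "$n"
}

# Demonstration in a scratch directory with seven placeholder input files
# (in.0 through in.6), so in.7 is the first name that does not exist.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
for i in 0 1 2 3 4 5 6; do : > "in.$i"; done

iterations=$(count_inputs in)
# Print (rather than run) the command you would then type.
echo "condor_submit_util -i in -n $iterations"
```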

To use the condor_submit_util script in command-line mode:

  1. Execute the condor_submit_util command with the appropriate arguments. For detailed information about the options, see Script options.

    At a minimum, you must include the following options on the command line:

    • Executable program file name

    • Executable file arguments, or --noargs option

    • Input file, or --noinput option

    • Number of iterations, if you do not have 10 input files

    At a minimum, type the following at the command prompt from within your Condor working directory:

    > condor_submit_util -x <program> -a <arguments> -i <input files> 
  2. Condor creates a submit file and checks it for errors, creates the ClassAd object, and adds that object to the end of the queue for processing. The script supplies messages that report this progress, and includes the cluster number assigned to your Condor cluster. For example:

    > condor_submit_util -x <program> --noargs

    Submitting job(s)..........
    Logging submit event(s)..........
    10 job(s) submitted to cluster 24.

    If the script encounters a problem when creating the submit file, it enters interactive mode automatically and prompts you for the correct inputs.

  3. View your job queue to ensure that your batch processing begins execution.

    Use the condor_q command to check the queue.

Passing Arguments to the Program

You can pass arguments to the batch program using the --args flag in your submit file. For example, if you change the arguments line in your submit file to something like the following:

Arguments = --no-save --vanilla --args <arguments>

Then the contents of <arguments> will be passed in to the program as command-line arguments. The syntax for passing and handling these arguments differs depending on the statistics program in use.

Passing Arguments to R

To parse command-line arguments in R, use the following command in your R script:

args <- commandArgs(TRUE)

This puts the command-line arguments (the contents of <arguments>) into the variable args.
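To make clear which part of the Arguments line your R script sees, the shell sketch below mimics the selection that commandArgs(TRUE) performs: it keeps only the arguments that follow --args. The helper name trailing_args and the values 100 and results.csv are hypothetical; the sketch does not run R.

```shell
#!/bin/sh
# Hypothetical helper mimicking what commandArgs(TRUE) returns in R:
# only the arguments that follow the first --args on the command line.
trailing_args() {
    seen=0
    out=""
    for arg in "$@"; do
        if [ "$seen" -eq 1 ]; then
            out="${out:+$out }$arg"
        elif [ "$arg" = "--args" ]; then
            seen=1
        fi
    done
    echo "$out"
}

# For the line "Arguments = --no-save --vanilla --args 100 results.csv",
# the R script would receive only the last two values.
r_args=$(trailing_args --no-save --vanilla --args 100 results.csv)
echo "$r_args"
```

In the R script, these values arrive as character strings, so convert numeric arguments (for example with as.numeric) before you use them.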

Script options

The condor_submit_util script makes running jobs on the batch servers easier and more intuitive. It handles submission for you: it constructs the appropriate submit file for your job and calls the condor_submit command. To use this utility, you need a program to run. The format for using this script is:

condor_submit_util [OPTIONS]

In addition, the script can notify you by email when your job is done, so you do not have to check the queue constantly with condor_q. In future releases, the script will also be able to keep usage data so administrators can track overall performance.

The script can be run in two ways, interactively or from the command line. When running interactively, the script prompts you for the values required to run the batch job. If you supply arguments on the command line, these arguments are used in addition to default values for any values you do not supply.

Options

  • -h, --help
    Print help page and exit.
  • -V, --version
    Print version information and exit.
  • -v, --verbose
    Show information about what goes on during script execution.
  • -I, --Interactive
    Enter interactive mode, in which the script prompts you for the required values.
  • -s, --submitfile FILE
    Specify the name of the created submit file (default is <user-name-datetime>.submit).
  • -k, --keep
    Do not delete the created submit file.
  • -N, --Notify
    Receive notification by email when jobs are complete.
  • -x, --executable FILE
    The executable for Condor to run (default is /usr/bin/R).
  • -a, --arguments ARGS
    Any arguments you want to pass to the executable (should be quoted, default is "--no-save --vanilla").
  • -i, --input [FILE|PATT]
    Either an explicit file name or base name of input files to the executable (default is in).
  • -o, --output [PATT]
    Base name of output files for the executable (default is out).
  • -e, --error [PATT]
    Base name of error files for the executable (default is error).
  • -l, --log [PATT]
    Base name of log files for the executable (default is log).
  • -n, --iterations NUM
    Number of iterations to submit (default is 10).
  • -f, --force
    Overwrite any existing files.
  • --noinput
    Use no input file for executable.
  • --noargs
    Send no arguments to executable.

Examples

  1. You have a compiled executable (named foo) that takes a data set and does some analysis. You have five different data sets to run against (named data.0, data.1 ... data.4). You want to save the submit file and be notified when the job is done.

    condor_submit_util -x foo --noargs -i "data" -n 5 -k -N
  2. You have an R program that has some random output. You want to run it 10 times to see the results.

    condor_submit_util -i random.R -n 10
  3. You have an R program that will take a long time to complete. You only need to run it once, but you want to be notified when it is done.

    condor_submit_util -i long.R -n 1 -N

Notes: The values for -o, -e, and -l are base names for the implied files. Each actual file name is the base name plus a numerical extension that matches its Condor process number (indexed from 0). For example, if you execute condor_submit_util -o "out" -n 3, three output files named out.0, out.1, and out.2 are created.
Also, for -i, the script first checks whether the name supplied is an actual file on disk. If it is not, the script uses the argument as a base name, as it does for -o, -e, and -l.
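The naming rules in these notes can be sketched in a few lines of shell. This is an illustration of the rules as described here, not the script's actual code; the helper names implied_names and resolve_input are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list the file names implied by a base name ($1)
# and an iteration count ($2), numbered from 0.
implied_names() {
    base=$1
    count=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "$base.$i"
        i=$((i + 1))
    done
}

# Hypothetical helper: pick the input file for process number $2 given the
# -i value $1. An existing file is used as is; otherwise the value is
# treated as a base name.
resolve_input() {
    if [ -f "$1" ]; then
        echo "$1"
    else
        echo "$1.$2"
    fi
}

# Names implied by: condor_submit_util -o "out" -n 3
names=$(implied_names out 3 | paste -s -d ' ' -)
echo "$names"
```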

    Option conventions

    For most condor_submit_util options, there are two conventions that you can use to specify that option on the command line:

    • The -<letter> convention - Use this short form as a shortcut.

      For example, the short option to receive email notification when your batch processing is complete is -N.

    • The --<term> convention - Use this long form to make it easy to see which option you are using.

      For example, the long option to receive email notification when your batch processing is complete is --Notify.

    Both conventions perform the same function: -N and --Notify both request email notification when your batch processing is complete.

    Pattern Arguments

    For file-related options, such as the output file name or the error file name, you can use a pattern-matching argument. For example, if you specify the option -i "run", Condor looks for an input file with the name run. If there is no file named run, Condor looks for a file name that begins with run., such as run.14.

    If there are multiple files with names that begin with the pattern that you specify, then for the first execution within a cluster, Condor uses the file with the name that matches first in alphanumeric order. For successive executions within a cluster, Condor uses the files with names that match successively in alphanumeric order.
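    The order in which matching files are used can be surprising when the numeric suffixes have different lengths. The sketch below assumes that "alphanumeric order" means plain character-by-character sort order, which this page does not spell out; the file names are hypothetical and are created in a scratch directory.

```shell
#!/bin/sh
# Scratch directory with three hypothetical input files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
: > run.1
: > run.2
: > run.10

# List the matches in plain character order. run.10 sorts before run.2
# because the comparison goes character by character and "1" precedes "2".
ordered=$(ls run.* | LC_ALL=C sort | paste -s -d ' ' -)
echo "$ordered"
```

    If the pairing between files and executions matters, zero-padded suffixes such as run.01, run.02, and run.10 keep character order and numeric order the same.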

    Saving and Reusing a Submit File

    When you use condor_submit_util in command-line mode to submit a program for batch processing, include the option -k (keep) to save the submit file created by the utility.

    You can edit and reuse that submit file to submit similar programs to the Condor queue for batch processing. You also can include Condor macros to further improve the usability of the file. See the HTCondor documentation for detailed information about how to use Condor macros.

    For example, if you plan to submit several iterations of a program for batch processing, you can use a single submit file for all iterations. In that submit file, you use the $(PROCESS) macro to specify unique input, output, error, and log files for each iteration.

    Use of the $(PROCESS) macro requires that you develop a naming convention for files or subdirectories that includes the full range of process IDs for your iterations.

    To use an existing submit file when you submit a batch process, you cannot use the script and must execute the condor_submit command instead. Type the following:

    condor_submit my.submit
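    A reusable submit file that uses the $(PROCESS) macro might look like the sketch below, which writes the file from the shell. The base names in, out, error, and log, the iteration count of 10, and the scratch directory are assumptions borrowed from the utility's defaults; adjust them to your own naming convention. The submit command itself is left as a comment because it requires a machine where HTCondor is installed.

```shell
#!/bin/sh
# Work in a scratch directory so that no existing file is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Write a reusable submit file. The quoted 'EOF' keeps the shell from
# expanding $(PROCESS), so the macro reaches HTCondor unchanged.
cat > my.submit <<'EOF'
Universe   = vanilla
Executable = /usr/bin/R
Arguments  = --no-save --vanilla

# Process N reads in.N and writes out.N, error.N, and log.N
input  = in.$(PROCESS)
output = out.$(PROCESS)
error  = error.$(PROCESS)
Log    = log.$(PROCESS)
Queue 10
EOF

# Submit it on a machine where HTCondor is installed:
# condor_submit my.submit
```

    With Queue 10, the process IDs run from 0 through 9, so the files in.0 through in.9 must exist before you submit.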