Held Process

To view information about processes that Condor placed on hold, type condor_q -hold. For example:

> condor_q -hold

-- Submitter: vnc.hmdc.harvard.edu : <10.0.0.47:60603> : vnc.hmdc.harvard.edu
 ID OWNER HELD_SINCE HOLD_REASON
 17.0 arose 10/5 12:53 via condor_hold (by user arose)
 17.1 arose 10/5 12:53 via condor_hold (by user arose)
 17.2 arose 10/5 12:53 via condor_hold (by user arose)
 17.3 arose 10/5 12:53 via condor_hold (by user arose)
 17.4 arose 10/5 12:53 via condor_hold (by user arose)
 17.5 arose 10/5 12:53 via condor_hold (by user arose)
 17.6 arose 10/5 12:53 via condor_hold (by user arose)
 17.7 arose 10/5 12:53 via condor_hold (by user arose)
 17.9 arose 10/5 12:53 via condor_hold (by user arose)

9 jobs; 0 idle, 0 running, 9 held

Refer to the Condor Manual for a description of the codes that indicate why a process was placed on hold. In section 2.5, "Submitting a Job," go to subsection 2.5.2.2, "ClassAd Job Attributes," and look for the entry HoldReasonCode:

http://research.cs.wisc.edu/htcondor/manual/latest/2_5Submitting_Job.html
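
To see the hold reason code next to each held process without leaving the command line, something like the following sketch may help. It assumes that condor_q on your system supports the -constraint and -format options and that held jobs have JobStatus equal to 5. The function name list_hold_reasons is only an illustration and is not part of Condor:

```shell
#!/bin/sh
# Sketch: print "<cluster>.<process> code=<HoldReasonCode> <HoldReason>"
# for every held job. list_hold_reasons is a hypothetical helper name.
list_hold_reasons() {
    # JobStatus == 5 is assumed to select held jobs.
    condor_q -constraint 'JobStatus == 5' \
        -format "%d." ClusterId \
        -format "%d " ProcId \
        -format "code=%d " HoldReasonCode \
        -format "%s\n" HoldReason
}

# Only run the query when condor_q is actually installed.
if command -v condor_q >/dev/null 2>&1; then
    list_hold_reasons
else
    echo "condor_q not found on PATH" >&2
fi
```

Compare the number printed after code= with the HoldReasonCode entry in the manual to find out why the process was held.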

To place a process on hold, type the command condor_hold <cluster ID>.<process ID>. For example:

> condor_hold 8.33
Job 8.33 held

To place on hold all processes in a cluster that have not yet completed, type condor_hold <cluster ID>. For example:

> condor_hold 8
Cluster 8 held.

The status of those uncompleted processes in cluster 8 is now H (on hold):

> condor_q

-- Submitter: vnc.hmdc.harvard.edu : <10.0.0.47:60603> : vnc.hmdc.harvard.edu
 ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
 8.2 sspade 10/4 11:19 0+00:00:00 H 0 9.8 dwarves.pl
 8.5 sspade 10/4 11:19 0+00:00:00 H 0 9.8 dwarves.pl
 8.6 sspade 10/4 11:19 0+00:00:00 H 0 9.8 dwarves.pl

3 jobs; 0 idle, 0 running, 3 held

To release a process from hold, type the command condor_release <cluster ID>.<process ID>. For example:

> condor_release 8.33
Job 8.33 released.

To release the full cluster from hold, type the command condor_release <cluster ID>. For example:

> condor_release 8
Cluster 8 released.

You can instruct the Condor system to place your jobs on hold if they spend more than a specified amount of time suspended (that is, not processing). For example, include the following attribute in your submit file to place your jobs on hold if they spend more than 50 percent of their time suspended:

Periodic_hold = CumulativeSuspensionTime > (RemoteWallClockTime / 2.0)
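
For context, here is a minimal sketch of where that attribute might sit in a complete submit file. The executable name is borrowed from the condor_q listing above. The universe, the output, error, and log file names, and the queue count are illustrative assumptions, so replace them with your own values:

```text
# Sketch of a submit file; all values except periodic_hold are assumptions.
universe      = vanilla
executable    = dwarves.pl
output        = dwarves.$(Cluster).$(Process).out
error         = dwarves.$(Cluster).$(Process).err
log           = dwarves.log

# Hold the job when it has been suspended for more than half
# of its total wall-clock time.
periodic_hold = CumulativeSuspensionTime > (RemoteWallClockTime / 2.0)

queue 10
```

A job held by this expression stays in the queue until you release it with condor_release, as described above.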