A log file records what occurred during your cluster processing: when the job was submitted, when execution began and ended, when a process was restarted, and whether any issues occurred. When processing finishes, the exit conditions for that process are noted in the log file.
Refer to the Condor Manual for a description of the entries in the process log file. See section 2.6, "Managing a Job," at the following URL, and then subsection 2.6.6, "In the log file":
http://research.cs.wisc.edu/htcondor/manual/latest/2_6Managing_Job.html
To view the log file for a process and determine where an error occurred, use the cat command. For example, the following log file indicates that the process completed normally:
> cat log.1
000 (012.001.000) 10/04 12:14:51 Job submitted from host: <10.0.0.47:60603>
...
001 (012.001.000) 10/04 12:15:00 Job executing on host: <10.0.0.61:37097>
...
005 (012.001.000) 10/04 12:15:00 Job terminated.
    (1) Normal termination (return value 0)
        Usr 0 00:00:00, Sys 0 00:00:00 - Run Remote Usage
        Usr 0 00:00:00, Sys 0 00:00:00 - Run Local Usage
        Usr 0 00:00:00, Sys 0 00:00:00 - Total Remote Usage
        Usr 0 00:00:00, Sys 0 00:00:00 - Total Local Usage
    7 - Run Bytes Sent By Job
    163 - Run Bytes Received By Job
    7 - Total Bytes Sent By Job
    163 - Total Bytes Received By Job
...
The following example log file is for a process that did not complete execution. The shadow exception entry shows that the file named as standard input could not be opened:
> cat log.4
000 (09.000.000) 09/20 14:47:31 Job submitted from host: <x1.hmdc.harvard.edu>
...
007 (09.000.000) 09/20 15:02:10 Shadow exception!
    Error from starter on x1.hmdc.harvard.edu: Failed to open 'scratch.1/frieda/workspace/v67/condor-test/test3/run_0/b.input' as standard input: No such file or directory (errno 2)
    0 - Run Bytes Sent By Job
    0 - Run Bytes Received By Job
...
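When a log file is long, or when you have many of them, the same check can be scripted instead of read by eye. The following is a minimal Python sketch and is not part of Condor. The function name summarize_log and the status strings it returns are hypothetical. The parsing assumes only the event layout visible in the two examples above: a three-digit event code, the job ID in parentheses, a date and time, and then a message, with any detail lines following the event line.

```python
import re

# Event line layout, as seen in the examples above (an assumption, not a spec):
#   "005 (012.001.000) 10/04 12:15:00 Job terminated."
EVENT_HEADER = re.compile(
    r"^(?P<code>\d{3}) \((?P<job>[\d.]+)\) "
    r"(?P<date>\d{2}/\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) (?P<message>.*)$"
)
RETURN_VALUE = re.compile(r"Normal termination \(return value (-?\d+)\)")


def summarize_log(text):
    """Collect the events in a log and guess how the process ended.

    Hypothetical helper: the status strings are illustrative only.
    """
    events = []
    return_value = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "...":
            continue  # skip blank lines and the "..." event separators
        header = EVENT_HEADER.match(line)
        if header:
            events.append({
                "code": header.group("code"),
                "when": header.group("date") + " " + header.group("time"),
                "message": header.group("message"),
                "details": [],
            })
        elif events:
            # Any other line is treated as detail for the most recent event.
            events[-1]["details"].append(line)
            found = RETURN_VALUE.search(line)
            if found:
                return_value = int(found.group(1))

    codes = [event["code"] for event in events]
    if return_value == 0:
        status = "completed normally"
    elif return_value is not None:
        status = "terminated with nonzero return value"
    elif "007" in codes:  # 007 is the shadow exception event shown above
        status = "shadow exception"
    else:
        status = "no termination event found"
    return {"status": status, "return_value": return_value,
            "codes": codes, "events": events}


# Sample text copied from the first example above (usage lines omitted).
sample = """\
000 (012.001.000) 10/04 12:14:51 Job submitted from host: <10.0.0.47:60603>
...
001 (012.001.000) 10/04 12:15:00 Job executing on host: <10.0.0.61:37097>
...
005 (012.001.000) 10/04 12:15:00 Job terminated.
    (1) Normal termination (return value 0)
...
"""
summary = summarize_log(sample)
print(summary["status"], summary["codes"])
```

To use the sketch on a real file, read it with open("log.1").read() and pass the text to summarize_log. For a log like the second example, the detail lines stored with the 007 event hold the error message from the starter.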