Basic HTCondor commands
Experiment with basic HTCondor commands
We are going to look at two fundamental HTCondor commands, condor_q and condor_status. They are used to monitor your jobs and your slots, respectively.
Viewing slots
This command can be very simple:
$ condor_status
Running this command on the CERN pool would produce a lot of output, because we have 100k slots. Here we should see something simpler.
[gks@htcondor-t-0 ~]$ condor_status
Name                            OpSys  Arch    State      Activity  LoadAv  Mem   ActvtyTime
email@example.com               LINUX  X86_64  Unclaimed  Idle      0.000   3790  5+18:35:27
firstname.lastname@example.org  LINUX  X86_64  Unclaimed  Idle      0.050   3790  5+18:35:20

              Machines  Owner  Claimed  Unclaimed  Matched  Preempting  Drain
X86_64/LINUX         2      0        0          2        0           0      0
Total                2      0        0          2        0           0      0
|Name||email@example.com||Slot name and hostname|
|State||Unclaimed||State of the slot|
|Activity||Idle||Is there activity on the slot?|
|LoadAv||0.050||Load average, a measure of CPU activity on the slot|
|Mem||3790||Memory available to the slot, in MB|
|ActvtyTime||5+18:35:27||Amount of time spent in current activity (days + hours:minutes:seconds)|
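The "days + hours:minutes:seconds" format of the activity time column can be converted to a plain number of seconds, which is handy when comparing slots. The following is a minimal sketch; the function name `actvty_time_to_seconds` is made up for this illustration and is not part of HTCondor.

```python
def actvty_time_to_seconds(value: str) -> int:
    """Convert a 'days+HH:MM:SS' string such as '5+18:35:27' to seconds."""
    days, clock = value.split("+")
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    # Fold days into hours, hours into minutes, minutes into seconds.
    return ((int(days) * 24 + hours) * 60 + minutes) * 60 + seconds


# Value taken from the sample condor_status output.
print(actvty_time_to_seconds("5+18:35:27"))  # prints 498927
```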
After the slot data, you can see summary information about the whole pool. There is one row of summary for each machine architecture/operating system combination. The columns are the different states that a slot can be in. The final row gives a summary of slot states for the whole pool.
Run condor_status yourself and compare its output to the sample output above. How do they differ?
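If you want to check the summary block programmatically, it can be read into a small table of counts. This is only a sketch: it assumes the summary has the same whitespace-separated layout as the sample output above, and the helper names `parse_summary` and `totals_are_consistent` are invented for this example.

```python
from typing import Dict

# Summary block copied from the sample condor_status output above.
SUMMARY = """\
              Machines  Owner  Claimed  Unclaimed  Matched  Preempting  Drain
X86_64/LINUX         2      0        0          2        0           0      0
Total                2      0        0          2        0           0      0
"""


def parse_summary(text: str) -> Dict[str, Dict[str, int]]:
    """Map each row label to a {column name: count} dictionary."""
    lines = [line.split() for line in text.splitlines() if line.strip()]
    header, rows = lines[0], lines[1:]
    return {row[0]: dict(zip(header, map(int, row[1:]))) for row in rows}


def totals_are_consistent(summary: Dict[str, Dict[str, int]]) -> bool:
    """Check that the Total row equals the column-wise sum of the other rows."""
    platforms = [counts for label, counts in summary.items() if label != "Total"]
    return all(
        total == sum(counts[column] for counts in platforms)
        for column, total in summary["Total"].items()
    )


summary = parse_summary(SUMMARY)
print(summary["X86_64/LINUX"]["Unclaimed"])  # prints 2
print(totals_are_consistent(summary))        # prints True
```

With a single architecture/operating system row, as here, the Total row simply repeats that row.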
Viewing whole machines only
$ condor_status -compact
Note how this output compares to the full per-slot listing above.
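To get a feel for the difference between slots and whole machines, the sketch below groups slot names by host. It assumes slot names of the form `slotN@hostname`; the host names are hypothetical and the helper `slots_per_machine` is not an HTCondor function.

```python
from collections import Counter
from typing import Iterable


def slots_per_machine(slot_names: Iterable[str]) -> Counter:
    """Count slots per host: 'slot1@host' maps to 'host', a bare 'host' is kept."""
    return Counter(name.rsplit("@", 1)[-1] for name in slot_names)


# Hypothetical slot names, not taken from a real pool.
names = [
    "slot1@node-a.example.org",
    "slot2@node-a.example.org",
    "slot1@node-b.example.org",
]
counts = slots_per_machine(names)
print(len(counts))                    # prints 2 (machines)
print(counts["node-a.example.org"])   # prints 2 (slots on that machine)
```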
Viewing Jobs
The condor_q command lists jobs that are on this submit machine and that are running or waiting to run. The _q part of the name is meant to suggest the word “queue”, or list of jobs waiting to finish.
The simplest version of this command shows only your jobs:
$ condor_q
The main part of the output (which for you will be empty, as you haven't submitted any jobs yet) looks like this:
-- Schedd: bigbird02.cern.ch : <184.108.40.206:9618?... @ 08/28/17 13:07:42
OWNER    BATCH_NAME     SUBMITTED   DONE   RUN  IDLE  TOTAL  JOB_IDS
bejones  CMD: hello.sh  8/28 13:07     _     _     1      1  459934.0
|OWNER||The user ID of the user who submitted the job|
|BATCH_NAME||The executable or the "jobbatchname" specified within submit file(s)|
|SUBMITTED||The date and time when the job was submitted|
|DONE||Number of jobs in this batch that have completed|
|RUN||Number of jobs in this batch that are currently running|
|IDLE||Number of jobs in this batch that are idle, waiting for a match|
|HOLD||Column will show up if there are jobs on "hold" because something about the submission/setup needs to be corrected by the user|
|TOTAL||Total number of jobs in this batch|
|JOB_IDS||Job ID or range of Job IDs in this batch|
The final line of the output gives a summary of job states:
1 jobs; 0 completed, 0 removed, 1 idle, 0 running, 0 held, 0 suspended
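The job totals line is regular enough to be split into counts with a short script. The sketch below assumes the line keeps the "number label" pattern shown above; `parse_totals` is a name invented for this example.

```python
import re
from typing import Dict


def parse_totals(line: str) -> Dict[str, int]:
    """Turn '1 jobs; 0 completed, ...' into {'jobs': 1, 'completed': 0, ...}."""
    return {label: int(number) for number, label in re.findall(r"(\d+)\s+([a-z]+)", line)}


# Totals line copied from the sample condor_q output.
totals = parse_totals(
    "1 jobs; 0 completed, 0 removed, 1 idle, 0 running, 0 held, 0 suspended"
)
print(totals["jobs"])  # prints 1
print(totals["idle"])  # prints 1
```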
Viewing everyone's jobs
In the lab, where you have your own schedd, this may not show much. In other environments, to show all users' jobs, run:
$ condor_q -all