Explore condor status

Explore condor_status

The goal of this exercise is to try out some of the most common options to the condor_status command, so that you can view slots effectively.

The main part of this exercise should take just a few minutes, but if you have more time later, come back and work on the extension ideas at the end to become a condor_status expert!

Selecting Slots

The condor_status program has many options for selecting which slots are listed. You've already learned the basic condor_status and the condor_status -compact variation (which you may wish to retry now, before proceeding).

Another convenient option is to list only those slots that are available now:
$ condor_status -avail
Of course, the individual execute machines report their slots to the collector only at certain time intervals, so this list will not reflect the up-to-the-second reality of all slots. This limitation applies to all condor_status output, not just to the -avail option.

As with condor_q, you can limit the slots that are listed in two easy ways. To list just the slots on a specific machine:
$ condor_status <hostname>
For example, if you want to see the slots on htcondor-10.os-internal:
$ condor_status htcondor-10.os-internal
Name                          OpSys      Arch   State     Activity LoadAv Mem   ActvtyTime

slot1@htcondor-10.os-internal LINUX      X86_64 Unclaimed Idle      0.000 3790  0+17:19:50

                     Machines Owner Claimed Unclaimed Matched Preempting  Drain

        X86_64/LINUX        1     0       0         1       0          0      0

               Total        1     0       0         1       0          0      0
To list a specific slot on a machine:
$ condor_status <slot>@<hostname>
Note, however, that the lab is configured to use "partitionable slots". This means that there is one main slot (slot1) that covers all the resources of the machine. When requests are matched, they are carved out of the partitionable slot as dynamic slots, e.g. slot1_1@<hostname>.

Note: You can name more than one hostname, slot, or combination thereof on the command line, in which case slots for all of the named hostnames and/or slots are listed.
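As a minimal sketch of mixing hostnames and slot IDs in one command: the second host name below (htcondor-11.os-internal) is an assumption for illustration, so substitute machines that actually exist in your pool.

```shell
# Host names are placeholders; htcondor-11.os-internal is assumed, not confirmed.
host_a="htcondor-10.os-internal"
host_b="htcondor-11.os-internal"

# A slot ID has the form <slot>@<hostname>.
slot_b="slot1@${host_b}"

# Guarded so the snippet is harmless where HTCondor is not installed.
if command -v condor_status >/dev/null 2>&1; then
  # All slots on host_a, plus only slot1 on host_b, in one listing.
  condor_status "$host_a" "$slot_b"
fi
```

The quotes around each argument are not strictly required for these names, but they keep the @ and any unusual characters away from the shell.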

Let’s get some practice using condor_status selections!

  1. List all slots in the pool — how many are there total?
  2. Practice using all forms of condor_status that you have learned:
    • List the available slots
    • List the slots on a specific machine
    • List a specific slot from that machine
    • Try listing the slots from a few (but not all) machines at once
    • Try using a mix of hostnames and slot IDs at once

Viewing a Slot ClassAd

Just as with condor_q, you can use condor_status to view the complete ClassAd for a given slot (often confusingly called the “machine” ad):
$ condor_status -l <hostname>
Because slot ClassAds may have 150–200 attributes (or more!), it probably makes the most sense to show the ClassAd for a single slot at a time, as shown above. Here are some examples of common, interesting attributes taken directly from condor_status output:
Name = "slot1@htcondor-10.os-internal"
NextFetchWorkDelay = -1
NumDynamicSlots = 0
NumPids = 0
OpSys = "LINUX"
OpSysAndVer = "CentOS7"
OpSysLegacy = "LINUX"
OpSysLongName = "CentOS Linux release 7.3.1611 (Core)"
OpSysMajorVer = 7
OpSysName = "CentOS"
As you may be able to tell, there is a mix of attributes about the machine as a whole (hence the name “machine ad”) and about the slot in particular.
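Since a full slot ClassAd is long, it can help to filter the -l output down to a family of attributes. The sketch below uses plain grep for this; show_attrs is a hypothetical helper name, not part of HTCondor, and the sample lines are copied from the attributes listed above.

```shell
# show_attrs: keep only ClassAd lines whose attribute name starts with
# the given prefix (case-insensitive). Hypothetical helper, not an HTCondor tool.
show_attrs() {
  grep -i "^$1"
}

# Try it on a few sample lines first, without needing a pool:
printf '%s\n' 'NumPids = 0' 'OpSys = "LINUX"' 'OpSysMajorVer = 7' | show_attrs OpSys

# Against a live pool (runs only if condor_status is installed):
if command -v condor_status >/dev/null 2>&1; then
  condor_status -l htcondor-10.os-internal | show_attrs OpSys
fi
```

On the sample input, only the two OpSys lines are printed; the NumPids line is dropped.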

Go ahead and examine a machine ClassAd now.

Viewing Slots by ClassAd Expression

Often, it is helpful to view slots that meet some particular criteria. For example, if you know that your job needs a lot of memory to run, you may want to see how many high-memory slots there are and whether they are busy. You can filter the list of slots like this using the -constraint option and a ClassAd expression.

Now, all the machines in the lab are the same, so it is hard to narrow down the list with a meaningful query. Here is an example that uses a regexp on the machine name to do so:
$ condor_status -const 'OpSysAndVer =?= "CentOS7" && regexp("htcondor-1\d", Machine)'
Note: Be very careful with using quote characters appropriately in these commands. In the example above, the single quotes (') are for the shell, so that the entire expression is passed to condor_status untouched, and the double quotes (") surround a string value within the expression itself.
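The high-memory case mentioned earlier can be sketched in the same way. This example assumes the Memory attribute (reported in MB, as in the Mem column above) and an arbitrary threshold of 2048; because the threshold comes from a shell variable, the expression is wrapped in double quotes for the shell and the inner ClassAd string quotes are escaped.

```shell
# Arbitrary threshold in MB, chosen only for illustration.
min_mem_mb=2048

# Double quotes let the shell expand the variable; the \" pairs survive
# as literal double quotes around the ClassAd string value.
constraint="Memory >= ${min_mem_mb} && State == \"Unclaimed\""

# Show exactly what condor_status will receive.
echo "$constraint"

# Run the query only where HTCondor is installed.
if command -v condor_status >/dev/null 2>&1; then
  condor_status -constraint "$constraint"
fi
```

Printing the expression before using it is a cheap way to catch quoting mistakes. Keep in mind that with partitionable slots, the Memory of slot1 shrinks as dynamic slots are carved out of it.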

If you are interested in learning more about writing ClassAd expressions, look at section 4.1 and especially 4.1.4 of the HTCondor Manual. This is definitely advanced material, so if you do not want to read it, that is fine. But if you do, take some time to practice writing expressions for the condor_status -constraint command.

Note: The condor_q command accepts the -constraint option as well! As you might expect, the option allows you to limit the jobs that are listed based on a ClassAd expression.
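For instance, a job's JobStatus attribute is 1 while it is idle and 2 while it is running, so a sketch of listing only idle jobs might look like this (the variable name is just for illustration):

```shell
# ClassAd expression selecting idle jobs (JobStatus 1 = Idle, 2 = Running).
idle_jobs='JobStatus == 1'

# Run the query only where HTCondor is installed.
if command -v condor_q >/dev/null 2>&1; then
  condor_q -constraint "$idle_jobs"
fi
```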

Formatting Output (Optional)

The condor_status command accepts the same -format (-f) and -autoformat (-af) options that condor_q accepts, and the options have the same meanings in both commands. Of course, the attributes available in machine ads may differ from the ones that are available in job ads. Use the HTCondor Manual or look at individual slot ClassAds to get a better idea of what attributes are available.
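As a rough sketch of what these options make possible, the example below prints two attributes per slot with -af and tallies slots by state. count_by_state is a hypothetical helper built from awk and sort, not an HTCondor command, and the sample slot names are made up.

```shell
# count_by_state: tally lines by their second whitespace-separated column.
# Hypothetical helper for post-processing "Name State" output.
count_by_state() {
  awk '{ n[$2]++ } END { for (s in n) print s, n[s] }' | sort
}

# Try it on made-up sample lines first, without needing a pool:
printf '%s\n' 'slot1@a Unclaimed' 'slot1_1@a Claimed' 'slot1@b Unclaimed' | count_by_state

# Against a live pool (runs only if condor_status is installed):
if command -v condor_status >/dev/null 2>&1; then
  # -af prints the named attributes, one slot per line.
  condor_status -af Name State | count_by_state
  # The same attribute via -format, which takes a printf-style format string.
  condor_status -format '%s\n' Name
fi
```

Because -af emits one plain line per slot, its output combines easily with standard text tools like the ones used here.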