Use queue N, $(Cluster) and $(Process)

The goal of this exercise is to learn to submit many jobs from a single queue statement, and then to control filenames and arguments per job.

Submitting Many Jobs With One Submit File

Suppose you have a program that you want to run many times. The program takes an argument, and you want to change the argument for each run of the program. With what you know so far, you have a couple of choices (assuming that you cannot change the job itself to work this way):

  • Write one submit file; submit one job, change the argument in the submit file, submit another job, change the submit file, …
  • Write many submit files that are nearly identical except for the program argument

Neither of these options seems very satisfying. Fortunately, we can do better with HTCondor.

Running Many Jobs With One queue Statement

Here is a C program that uses a simple stochastic (random) method to estimate the value of π — feel free to try to figure out the method from the code, but it is not critical for this exercise. The single argument to the program is the number of samples to take. More samples should result in better estimates!
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>

int main(int argc, char *argv[])
{
  struct timeval my_timeval;
  int iterations = 0;
  int inside_circle = 0;
  int i;
  double x, y, pi_estimate;

  gettimeofday(&my_timeval, NULL);
  srand48(my_timeval.tv_sec ^ my_timeval.tv_usec);

  if (argc == 2) {
    iterations = atoi(argv[1]);
  } else {
    printf("usage: circlepi ITERATIONS\n");
    exit(1);
  }

  for (i = 0; i < iterations; i++) {
    x = (drand48() - 0.5) * 2.0;
    y = (drand48() - 0.5) * 2.0;
    if (((x * x) + (y * y)) <= 1.0) {
      inside_circle++;
    }
  }
  pi_estimate = 4.0 * ((double) inside_circle / (double) iterations);
  printf("%d iterations, %d inside; pi = %f\n", iterations, inside_circle, pi_estimate);
  return 0;
}
  1. In a new directory for this exercise, save the code to a file named circlepi.c
  2. Compile the code
    $ gcc -static -o circlepi circlepi.c
    
  3. If there are errors, check the file contents and compile command carefully; otherwise, see the instructors
  4. Test the program with just a few samples:
    $ ./circlepi 10000
    

Now suppose that you want to run the program many times, to produce many estimates. This is exactly what a statement like queue 3 is useful for. Let’s see how it works.

  1. Write a normal submit file for this program
    • Pass 1 billion (1000000000) as the command line argument to circlepi
    • Remember to use queue 3 instead of just queue
  2. Submit the file. Note the slightly different message from condor_submit:
    3 job(s) submitted to cluster NNNN.
    
  3. Before the jobs execute, look at the job queue to see the multiple jobs

Look at condor_q -nobatch and note the output. Look in particular at the ID column. You should see that the jobs have the same cluster ID, but have job IDs incremented from 0. Do you remember how to ask HTCondor to list all the jobs from one cluster?
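
Putting the steps above together, the submit file for this part might look like the following sketch (the executable name and the circlepi filenames are assumptions about how you set up the exercise):

    executable = circlepi
    arguments  = 1000000000
    output     = circlepi.out
    error      = circlepi.err
    log        = circlepi.log
    queue 3

Note that all three jobs share the same output, error, and log filenames here; the next section looks at the consequences of that.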

Using queue N With Output

When all three jobs in your single cluster are finished, examine the resulting files.

  • What is in the output file?
  • What is in the error file (hopefully nothing!)?
  • What is in the log file? Look carefully at the job IDs in each event.
  • Is this what you expected? Is it what you wanted?

Using $(Process) to Distinguish Jobs

As you saw with the experiment above, we need a way to separate output (and error) files per job that is queued, not just for the whole cluster of jobs. Fortunately, HTCondor has a way to separate the files easily.

When processing a submit file, HTCondor defines and uses a special variable for the process number of each job. If you write $(Process) in a submit file, HTCondor will replace it with the process number of the job, independently for each job that is queued. For example, you can use the $(Process) variable to define a separate output file name for each job. Suppose the following two lines are in a submit file:
output = my-output-file-$(Process).out
queue 10
Even though the output filename is defined only once, HTCondor will create separate output filenames for each job:
  First job:         my-output-file-0.out
  Second job:        my-output-file-1.out
  Third job:         my-output-file-2.out
  ...
  Last (tenth) job:  my-output-file-9.out

Let’s see how this works for our program that estimates π.

  1. In your submit file, change the definitions of output and error to use $(Process), in a way that is similar to the example above
  2. Remove any output, error, and log files from previous runs
  3. Submit the updated file
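
For reference, the changed lines might look like this sketch (the circlepi filename prefix is an assumption):

    output = circlepi-$(Process).out
    error  = circlepi-$(Process).err
    queue 3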

When all three jobs are finished, examine the resulting files again.

  • How many files are there of each type? What are their names?
  • Is this what you expected? Is it what you wanted from the π estimation process?

Using $(Cluster) to Separate Files Across Runs

With $(Process), you can get separate output (and error) filenames for each job within a run. However, the next time you submit the same file, all of the output and error files are overwritten by new ones created by the new jobs. Maybe this is the behavior that you want. But sometimes, you may want to separate files by run, as well.

In addition to $(Process), there is also a $(Cluster) variable that you can use in your submit files. It works just like $(Process), except that it is replaced with the cluster number of the entire submission. Because the cluster number is the same for all jobs within a single submission, it does not distinguish files by job within a submission. Combined with $(Process), however, it separates files by run as well. For example, consider this output statement:
output = my-output-file-$(Cluster)-$(Process).out
For one particular run, it might result in output filenames like this:
  First job:   my-output-file-2444-0.out
  Second job:  my-output-file-2444-1.out
  Third job:   my-output-file-2444-2.out
  ...

If you like, change your submit file from the previous exercise to use both $(Cluster) and $(Process). Submit your file twice to see the separate files for each run. Be careful how many jobs you run total, as the number of output files grows quickly!
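
If you make that change, the relevant lines might look like the following sketch (again, the circlepi filename prefix is an assumption):

    output = circlepi-$(Cluster)-$(Process).out
    error  = circlepi-$(Cluster)-$(Process).err
    queue 3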

Using $(Process) and $(Cluster) in Other Statements

The $(Cluster) and $(Process) variables can be used in any submit file statement, although they are more useful in some kinds of statements than in others. For instance, it is hard to imagine a good reason to use $(Process) in a rank statement (i.e., for preferring some execute slots over others), and in general the $(Cluster) variable makes little sense outside of filenames.

But in some situations, the $(Process) variable can be very helpful. Common uses are in the following kinds of statements — can you think of a scenario in which each use might be helpful?

  • log
  • transfer_input_files
  • transfer_output_files
  • arguments

Unfortunately, HTCondor does not let you perform math on the $(Process) number when using it. So, for example, if you use $(Process) as a numeric argument to a command, it will always result in jobs getting the arguments 0, 1, 2, and so on. If you have control over your program and the way in which it uses command-line arguments, then you are fine. Otherwise, you might need to transform the $(Process) numbers into something more appropriate using a wrapper script.
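
As a minimal sketch of that idea (the script name and the scaling rule are assumptions, not part of the exercise), a wrapper could map the process numbers 0, 1, 2, … onto sample counts of 1 million, 2 million, 3 million, and so on, then hand off to circlepi:

```shell
#!/bin/sh
# circlepi_wrapper.sh (hypothetical name).
# HTCondor would pass $(Process) as the first argument; default to 0.
process=${1:-0}

# Scale 0, 1, 2, ... into 1000000, 2000000, 3000000, ... samples.
samples=$(( (process + 1) * 1000000 ))
echo "job $process will take $samples samples"

# Hand off to the real program, assumed to sit alongside the wrapper.
if [ -x ./circlepi ]; then
    exec ./circlepi "$samples"
fi
```

The submit file would then use the wrapper as the executable, pass arguments = $(Process), and transfer circlepi as an input file so that the wrapper can find it.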
