The Standard Template Library/Why STL

(Author: D. Menasce)

In this chapter I would like to illustrate what was the original idea that led Alexander Stepanov to the development of the STL in the early '80.

Consider the following example, a C++ array of integers (int) for wich we would like to compute the sum (to show line numbers press the "Run this code" button, which also displays the result in real time: note that the "Share" button allows you to interactively modify the code and check your alternative implementations):

Example:

{{{1}}}

Let's examine the code line by line:

 1 // g++ -o ex1 ex1.cc && ./ex1

This is simply a comment statement to remind how to compile and run this same code.

 3 #include <iostream>
 4
 5 using namespace std;

In order to be able to use the cout statement to produce printouts to the standard output, we have to include the iostream header file (line 3) with the relevant declarations and we need to specify the std namespace as well (line 5).

At line 7 we define a C++ array of five integers: the left-hand side of the assignment operator (=) is an int array declaration, while the righthand side is just an initialization to 5, user-chosen, values.

 7 int myArray[] = {16, 2, 77, 40, 12071};

Now, at line 14 we iterate over the five values of the array: note that at this point the value of 5 iterations is hard-coded in the source and this is a bad choice. Should we enlarge or shrink the size of the array we could forget to update the correct value here with potentially disastrous effects at run time. Note that the code is syntactically correct, therefore the compiler would never detect an inconsistency here. We'll see in a later version how to elegantly remedy this problem.

14  for ( n=0 ; n<5 ; ++n )
15  {
16   result += myArray[n];
17  }

Finally we print out the result to STANDARD OUTPUT and bring execution to end with a status of success (return 0).

18  cout << "Sum: " << result << endl ;
19  return 0;
20 }

So far so good: let's now see how to fix the potential bug mentioned before (the hard-coded value of 5 iterations):

Example:

{{{1}}}

The fix is shown at line 14: given a C++ array, the operator sizeof(type) returns the size in bytes of the object representation of type. Since type is an array of 5 locations of int's, the total size allocation returned will be 5x4 (since each integer requires 4 bytes of internal representation). The array size will therefore be given by the ratio of the total size in bytes of the array and the size of each if its elements.

Using the sizeof operator we have now made the code more robust to changes: enlarging or shrinking the array definition at line 7 no longer requires us to adapt the code at line 17, this is automatically adjusted now.


Let's now make some progress and add to the code an array of double's:

Example:

{{{1}}}

The code is rather self-explaining: we had to duplicate many lines of codes in order to accomodate both int's and double's. So far this is almost unavoidable, but once we begin operating on these arrays it will become evident that the level of code duplication begins to increase beyond reason and necessity.

Suppose in fact that we would like to write a function capable of numerically sorting the two arrays, something like this new version of the code:

Example:

{{{1}}}

At lines 10 and 11 we prepare the forward declaration of the two functions capable of sorting the arrays, which will be implemented at lines 39 and 60, respectively.

10 int    * sortI(int    array[], int size) ;
11 double * sortD(double array[], int size) ;

The signature of function sortI accepts an integer array and an integer number returning a pointer to an integer array.

The signature of function sortD is similar but acts on double variables.

The instructions at lines 18 and 19 actually execute the sorting of the two input arrays (myArrayI and myArrayD) returning pointers to the sorted versions (respectively, sortedI and sortedD)

18  int    *sortedI = sortI (myArrayI,sizeI) ;
19  double *sortedD = sortD (myArrayD,sizeD) ;

Finally the content of the two sorted arrays are printend to STDOUT at lines 21 to 33. Let's now examine the implementation of these two sorting functions: they look identical!

Lines 41 to 56 are exactly identical to lines 62 to 77! The only difference between the two is the list of formal parameters, at lines 39 and 60, respectively.


This is clearly inefficient and error prone: maintaining two identical pieces of code simply because they act on objects of a different type is just a waste of time and resources.

That's exactly why the STL was invented to begin with: it had been the first implementation of the concept of generic (or template) programming.

Let's see what this means by modifying the code to a new version:

Example:

{{{1}}}

Notice the new forward declaration at lines 10 and 11 (line 11 is actually a continuation of line 10):

10 template <class T>
11 T * sort(T array[], int size) ;

This statement is interpreted by the compiler in the following way: the sort function has a signature that accepts in input an array (called array) whose type definition is deferred to a later time (we'll see when in a moment): for the time being the type is indicated in a symbolic way with the user-chosen keyword T. The function then return a pointer to a variable of the same type T.

The implementation of the function, at line 40, is now unique: the type of object upon which the algorithm operates is specified in an abstract way at line 40 to 41.


This finally brings us to the reason why STL was invented: the STL library is a collection of C++ tools capable of operating uniformly (with the same behavior) on almost any kind of data type, standard or user-defined.

Let's illustrate this point with a final example:

Example:

{{{1}}}


First notice the two additional include directives at line 4 and 5: these are two STL header files that declare objects used later on in our code.

A first difference with respect to version ex5.cc appears at line 17: we define an STL template object, vector, specialized to operate on type int. The lefthand side of the assignement operator is the templed definition of the vector object, while the righthand side is the name of the object and it initialization. In particular we initialize the vector arrI to contain a copy of the C++ array myArrayI with the same size. We will see later on where to find and understand the documentation of such an STL object: for the time being take the initialization for granted as shown in the example.

At line 21 (and 20) the array are sorted and, immediately thereafter, their sorted values are printed on STDOUT.

As you can see the economy of code is plainly evident: not only, the vector objects are carefully optimized both in storage and access efficiency, making their use highly recommended.

In the next chapter we will make an overvie of the STL library and then delve into practical examples, divided by subject.



This page belongs to the list of courses of HEP Software Foundation. Go back on the main page.