Building Tests from Specifications


John D. McGregor



I was visiting a new project recently and the main topic was their development process. As a side issue we began to discuss the need (or the lack of need) for component test plans. We had already decided that a main design document would be an API specification document. Several developers did not see the need to write unit test plans with "themselves" as the intended audience. It was (and may still be) their belief that each developer should be left alone behind their own API. Even those developers who supported the idea of written test "plans" thought of them as documentation to be done after the fact.


There are a couple of benefits to formal, written test plans, particularly those based on the specification.



As I promised last month, in this column I will discuss the actual construction of tests from specifications. I will first discuss two types of specification and then I will give a basic sequence of activities for creating tests from these specifications. I will also discuss how the amount of testing that is useful varies from one method to another. Very simple specifications will result in very little formal testing while more complex specifications, that define the characteristic behavior of the object, should be tested thoroughly.


Two Specifications for each Component


Every component has two specifications from which to build tests. The first is the component's explicit specification, which I described last month. This specification may be one of the two types discussed in the last column: interface or protocol. I will describe some of the implications for each of these types later in this column.


The second is the "universal" specification that corresponds to the overall environment within which the component will be used. For example, few component specifications go to the trouble to explicitly specify what happens in the event that insufficient memory is available for a memory allocation. Yet this implicit specification is a critical element in the successful operation of many components. I recently experienced failure with a piece of software that "performed an illegal operation" when I made a certain menu selection. Since I had used this application for some time and performed this operation many times I was suspicious about the environment. A little experimentation led me to see that the software evidently needed to create a temporary file of a size that exceeded what was currently available. Obviously that feature had not been tested with a disk full condition or more likely a disk nearly full condition.


Although this second specification is implicit, it can be made explicit via a checklist. The checklist will vary from one environment to another, but Figure 1 provides a portion of one such list. Note that there are columns for both positive and negative tests. There should be few occasions for negative tests since a negative test examines the behavior that occurs outside of a specification. Essentially we don't know how the software should behave outside its specification. Typically when we do use a negative test the intention is to ensure that the software behaves "gracefully" with respect to this implicit specification.


Figure 1

Condition                        Positive Test Developed   Negative Test Developed
Database connected
File found
File permissions correctly set
Memory allocation successful
Remote message received
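
One row of such a checklist can be turned into a pair of tests. The following is a minimal sketch for the "File found" row only; the SettingsReader class, its firstLineOr method and the fallback behavior are hypothetical and stand in for whatever "graceful" behavior a real component promises.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

// Hypothetical component: reads the first line of a settings file.
class SettingsReader {
    // Returns the first line of the file, or the fallback when the file
    // is missing or empty. Other I/O failures are passed to the caller.
    static String firstLineOr(Path file, String fallback) throws IOException {
        try {
            return Files.readAllLines(file).stream().findFirst().orElse(fallback);
        } catch (NoSuchFileException missing) {
            return fallback; // assumed graceful behavior for "file not found"
        }
    }
}

class EnvironmentChecklistDemo {
    public static void main(String[] args) throws IOException {
        // Positive test: the file is found.
        Path file = Files.createTempFile("settings", ".txt");
        Files.write(file, "mode=fast".getBytes());
        System.out.println(SettingsReader.firstLineOr(file, "mode=default"));

        // Negative test: the same path after the file has been deleted.
        Files.delete(file);
        System.out.println(SettingsReader.firstLineOr(file, "mode=default"));
    }
}
```

The other rows (disk nearly full, memory allocation failure, lost connection) usually need a way to simulate the condition, which is outside the scope of this sketch.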





Notations for Specifications

There are a number of ways of presenting a specification. Each has positive and negative aspects when it comes to constructing test cases.

Implementation Languages

Probably the most widely used notation is the programming language that will ultimately be used in writing the implementation. The usefulness of this approach is directly related to the semantic richness of the language. For example, Java has an interface construct that directly supports the interface type of specification. Eiffel has assertion constructs (require, ensure and invariant clauses) that support the description of assumptions that would otherwise be implicit in the specification.


The implementation language is typically used to provide method signatures. These signatures provide what I last month referred to as an interface type of specification. In general the signature takes the form:


<modifier> <return object> methodName(<parameter list>) throws <exception list>


A specific example in Java is


public void setManager(Manager) throws NoSuchEmployeeException



This notation has several advantages. There are tools that work with the language to automate some of the checking. There is less chance for transcription errors if the specification does not have to be copied from a separate description language into the implementation language. Finally, developers and managers will appreciate this approach since there is no "extra" work because the specification is also part of the final product.


One disadvantage of this notation is that it does not provide a syntax that supports the easy description of pre- and post-conditions. However, these conditions can be written as methods. Later the methods can be used during testing to check that required pre-conditions and guaranteed post-conditions are correct. I also write a class invariant method. Then, during the execution of each test case, I will call the invariant method both before and after the action to ensure that the invariant has been maintained.
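
As a minimal sketch of this idea, the pre-condition, post-condition and class invariant below are written as ordinary methods and called around the action in a test case. The Department class, its method names and the decision to make NoSuchEmployeeException an unchecked exception are assumptions made for illustration; only the setManager signature comes from the example above.

```java
import java.util.HashSet;
import java.util.Set;

// Assumed to be unchecked here to keep the sketch short.
class NoSuchEmployeeException extends RuntimeException {
    NoSuchEmployeeException(String name) { super("no such employee: " + name); }
}

// Hypothetical class used only for illustration.
class Department {
    private final Set<String> employees = new HashSet<>();
    private String manager; // null until a manager is assigned

    void hire(String name) { employees.add(name); }

    // Pre-condition of setManager, written as an ordinary method.
    boolean canSetManager(String name) { return employees.contains(name); }

    void setManager(String name) {
        if (!canSetManager(name)) throw new NoSuchEmployeeException(name);
        manager = name;
    }

    // Post-condition of setManager.
    boolean managerIs(String name) { return name.equals(manager); }

    // Class invariant: the manager, if any, is one of the employees.
    boolean invariant() { return manager == null || employees.contains(manager); }
}

class InvariantCheckDemo {
    // Runs one test case, checking the invariant before and after the action.
    static boolean runSetManagerCase(Department d, String name) {
        if (!d.invariant() || !d.canSetManager(name)) return false;
        d.setManager(name);
        return d.managerIs(name) && d.invariant();
    }

    public static void main(String[] args) {
        Department d = new Department();
        d.hire("Ada");
        System.out.println(runSetManagerCase(d, "Ada"));
    }
}
```

Keeping the checks as methods of the class under test means the test driver can reuse them for every test case instead of restating the conditions.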


Formal Methods

Formal methods, techniques that rely solely on the form of the statement rather than any underlying semantics, provide very precise ways of specifying the form of an object. Object-Z, for example, is an extension of the Z formal specification language. It uses a mathematical, logical notation to specify method behavior. The language does not have an adequate representation for the second dimension of object interaction, its time-varying behavior.


An advantage of this method is the ability to build tools that can manipulate the specification and even automatically generate proofs of correctness. The disadvantages include the skill set required to effectively use the method and the time required to generate the level of detail provided by this form of representation. Perhaps a more telling disadvantage is the limited expressiveness of the notation. Wegner[9] describes the notion of an interaction and discusses the limitations of the algorithmic notations for representing complex interactions.

Design Description Languages

These languages may be special purpose prototyping languages such as Rapide[7], the more complete interface description languages (IDLs) for frameworks such as CORBA[3] or general modeling languages such as the Unified Modeling Language (UML)[8]. These languages provide special constructs that make the definition of actions easy while not necessarily providing the means to define detailed algorithms.


As with implementation languages, these languages are usually supported by tools. Rapide and the various IDLs have compilers that check syntax and, in some cases, detect certain types of inconsistencies. CORBA's IDL is the standard for specification in the OMG. The disadvantage is that these descriptions generally must be transcribed into other languages with the possibility of various types of errors. The description language often does not exactly match the implementation language. Details that were not included in the specification and must be added may or may not be compatible with the original intent of the specification.


Finite State Mechanisms

Languages such as statecharts[Harel] provide a mechanism for describing the changes in an object over time. As the values of its attributes change over time the behavior of an object may change as well. The state model describes these changes in a way that can be used to guide the implementation of the object. Communication protocols have used this approach to specification for defining the very exacting and complex interactions among attributes that occur. These languages by themselves do not have the means to specify the behavior of individual objects completely. They do provide an effective means of capturing protocol information. Interface information, such as pre-conditions, must be inferred by combining all occurrences of a method from all transitions.


The advantage of this approach is that it provides a formal grammar that can be analyzed and yet it is the basis for a graphical notation that is easy to understand. The disadvantage is that the notation does not provide the concise listing of behaviors that the interface approach does. The notation also does not provide a clean way to present the pre-conditions and post-conditions for each method.
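
A small sketch may make the use of a state model for testing more concrete. The two-state connection protocol below is hypothetical. The code derives one test description per transition and infers the states in which a method may be called by combining every transition on which that method appears.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

class ProtocolModel {
    enum State { CLOSED, OPEN }

    // One transition of a hypothetical connection protocol.
    static final class Transition {
        final State from;
        final String method;
        final State to;
        Transition(State from, String method, State to) {
            this.from = from;
            this.method = method;
            this.to = to;
        }
    }

    static final List<Transition> TRANSITIONS = Arrays.asList(
        new Transition(State.CLOSED, "open", State.OPEN),
        new Transition(State.OPEN, "send", State.OPEN),
        new Transition(State.OPEN, "close", State.CLOSED));

    // Infers part of a method's pre-condition: the set of source states
    // of every transition labeled with that method.
    static Set<State> statesAllowing(String method) {
        Set<State> result = EnumSet.noneOf(State.class);
        for (Transition t : TRANSITIONS) {
            if (t.method.equals(method)) result.add(t.from);
        }
        return result;
    }

    // One test description per transition: start state, stimulus, expected state.
    static List<String> transitionTests() {
        List<String> tests = new ArrayList<>();
        for (Transition t : TRANSITIONS) {
            tests.add("in " + t.from + " call " + t.method + " expect " + t.to);
        }
        return tests;
    }

    public static void main(String[] args) {
        transitionTests().forEach(System.out::println);
        System.out.println("send allowed in " + statesAllowing("send"));
    }
}
```

Covering every transition once is only one possible coverage criterion; covering sequences of transitions would require a longer sketch.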

Design Patterns

A design pattern describes a micro-architecture for some portion of a system. The design pattern format of Gamma et al[2], shown in Table 2, provides a less specific but very descriptive technique for specifying how a set of objects should interact to achieve specific design goals. A particular pattern of interaction is applicable across a wide range of problem domains, but affects only a very narrow slice of any particular program. An application will require numerous design patterns and some interactions among objects may not fit within the scope of any of the design patterns.


Design patterns provide support for the protocol style of specification. The Collaborations section of the pattern not only defines interactions, it also provides information about the assumed order in which messages are expected to be sent. The sample code also provides protocol information in one specific language.


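Table 2: Design Pattern Format

Pattern Name and Classification
Intent
Also Known As
Motivation
Applicability
Structure
Participants
Collaborations
Consequences
Implementation
Sample Code
Known Uses
Related Patterns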




The advantage of a design pattern specification is that it is based on previous successful design experience. It is also based on experience spanning numerous problem domains. The disadvantage is that a design pattern specification will likely fail the completeness test because a single pattern will usually define parts of the interfaces of several classes rather than the complete interface for only a few classes.


For testing purposes I also use the design pattern format to specify "recurring themes" within a single application even if they have not been validated over multiple uses or users. Any good developer will write code for similar situations in a similar way. The inheritance/polymorphism approach of object-oriented software encourages and supports this patterning. Each subclass differs from its siblings in roughly the same way. That is, usually when several subclasses are developed from the same superclass, roughly the same set of methods is overridden in each subclass.


This patterning was what really made the HIT technique (see the explanation below) powerful[4]. Not only do you determine what need not be retested, but what does need to be retested often falls into these patterns. This allows tests to be reused not only vertically in an inheritance hierarchy but horizontally as well. The tests used vertically can usually be reused as is. For horizontal reuse the tests usually must be modified, but the modifications follow the pattern, which makes them easy to achieve.



Building Tests


No single one of the notations described above provides the complete view needed to provide a comprehensive test suite. I design development processes that use two or more modeling techniques to capture the entire specification. For individual classes, the Unified Modeling Language provides the richest set of models in a single notation. I usually supplement this with a form of "design patterns" captured using one of the accepted formats such as the one above. The patterns are used to capture the specifications for "chunks" of a system such as the modifiable sections of frameworks. The word design is in quotes to signify that I use the pattern approach to document actions of good programmers but often in a very domain or system specific way (see "Construct tests only once" below). Finally, I often utilize design description languages (see "Verify the specification" below), either because of their expressiveness (as when modeling the architecture) or because they are required (as when building a CORBA-based system).


Method signatures (described in class diagrams or in a formal spec) provide valuable information for the construction of tests, but they do not provide sufficient information to determine when the method may be used, nor do they indicate whether the computed result is correct. A method may only be used in one way but it is more usual for a method to be called by several different objects.


Verify the specification

In the last column we discussed the correctness, completeness and consistency attributes of high-quality specifications. For individual classes, a formal specification can be used to prove the correctness of the specification; however, the effort required to produce the specification and the proof has limited this approach to life-critical systems. For specifications that involve only a few variables, Beizer illustrates the use of Karnaugh-Veitch maps for verifying the specification[1]. A more widely applicable, although not as exacting, approach is the creation of a prototype.


By creating a prototype of the software architecture for the system, several specifications both at the class and cluster levels are examined. The "relative" completeness of these specifications is verified in that they possess the behaviors necessary to support the prototype or they are modified and the behavior is added. A limited assessment of correctness can be obtained by running specific test cases. The advantage of the executable prototype is the ability to compute and evaluate the results. Finally, the consistency of the specification is examined relative to the completeness of the prototype. That is, contradictions are identified only between the concepts that are actually prototyped. The more complete the prototype, the more likely that most contradictions are identified.


Analyze the specification

In preparation for constructing tests from the specification of each method, the domains of possible values for test inputs need to be determined. Each parameter to the method is examined. Each input is either an object or, in some languages, a primitive. Don't overlook the object's own attributes as inputs. Each of the object's attributes is global to its methods. See my previous column[5] for an example using a set of tables.


A primitive variable represents a specific value in the domain of its data type, which is clearly specified in the definition of the type. When the input is an object, determining the domain of values can require a little more effort. I use the state space of the object as its domain of values. Each state is a unique combination of values of the object's attributes.


The second step in analyzing the specification is to identify boundaries in the domains of values for the test inputs. For an object, each state is a combination of values of attributes so every state transition crosses a boundary. Each state is a discrete "value" even though an individual attribute of the object may change value. The high-level state in which we are interested here does not change every time any single attribute of the object changes. Furthermore, for object-oriented systems, the state machines may have nested states. This results in complex combinations of boundary conditions.
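
To illustrate the difference between attribute values and high-level states, here is a minimal sketch of a hypothetical bounded stack. Its size attribute changes on every push, but the high-level state changes only when a boundary is crossed, and those crossings are where boundary tests would be placed. The class, its three states and the assumption that capacity is at least 2 are all illustrative.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical class; capacity is assumed to be at least 2 so that
// all three high-level states can occur.
class BoundedStack {
    enum State { EMPTY, PARTIAL, FULL }

    private final Deque<Integer> items = new ArrayDeque<>();
    private final int capacity;

    BoundedStack(int capacity) { this.capacity = capacity; }

    void push(int value) {
        if (items.size() == capacity) throw new IllegalStateException("full");
        items.push(value);
    }

    int pop() { return items.pop(); }

    // High-level state: many attribute changes leave it unchanged.
    State state() {
        if (items.isEmpty()) return State.EMPTY;
        return items.size() == capacity ? State.FULL : State.PARTIAL;
    }

    public static void main(String[] args) {
        BoundedStack s = new BoundedStack(3);
        System.out.println("initially: " + s.state());
        for (int i = 1; i <= 3; i++) {
            s.push(i);
            System.out.println("after push " + i + ": " + s.state());
        }
    }
}
```

With a capacity of 3, the first and third pushes cross a state boundary while the second does not, so boundary tests would concentrate on the first and third.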


Construct a basic set of tests

The construction of a test suite for a specific method begins with the selection of test inputs from the domains of the method inputs. A value is selected for each input using the information created in "Analyze the specification". Not all combinations of values are acceptable. The pre-condition for a method constrains the combinations that should occur for positive tests. Use the OIDs to be certain that combinations of values correspond to system scenarios.
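
The selection step can be sketched as follows, assuming a hypothetical withdraw(amount) method whose pre-condition is that the amount is positive and does not exceed the balance. The candidate values would come from the boundary analysis; combinations that satisfy the pre-condition become positive tests and the rest are candidates for negative tests.

```java
import java.util.ArrayList;
import java.util.List;

class InputSelection {
    // Hypothetical pre-condition of withdraw(amount) on an account
    // with the given balance.
    static boolean precondition(int balance, int amount) {
        return amount > 0 && amount <= balance;
    }

    // Returns the {balance, amount} pairs that satisfy the pre-condition
    // (positive == true) or violate it (positive == false).
    static List<int[]> select(int[] balances, int[] amounts, boolean positive) {
        List<int[]> chosen = new ArrayList<>();
        for (int balance : balances) {
            for (int amount : amounts) {
                if (precondition(balance, amount) == positive) {
                    chosen.add(new int[] {balance, amount});
                }
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        int[] balances = {0, 50};
        int[] amounts = {-1, 1, 50, 51};
        for (int[] pair : select(balances, amounts, true)) {
            System.out.println("positive test: balance=" + pair[0] + " amount=" + pair[1]);
        }
        System.out.println("negative candidates: " + select(balances, amounts, false).size());
    }
}
```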


The completeness of the test suite can be measured by evaluating how completely the set of possible outputs is covered. The outputs include:



Construct only the tests that are necessary


The best technique for building tests is to not build them. Reusability is an important development goal for many software projects that use object-oriented techniques. It can be equally useful for testing. This was the focus when we developed Hierarchical Incremental Testing (HIT) a few years ago[4].


HIT is an analysis technique that examines the inheritance relationships among classes and determines the type of testing that should be performed on each method in the inheriting class. The technique paves the way either for reducing the amount of testing actually conducted (do we run a test case or not?) or for reusing test cases and rerunning them. As a trivial example, it is not necessary to test the "getter" and "setter" methods for variables that are inherited unmodified from another class.
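
The flavor of this analysis can be shown with a deliberately simplified sketch. It is not the published HIT analysis table: it only sorts method names into three assumed categories (new, redefined, inherited unchanged) and attaches a rough testing decision to each. The real technique also considers interactions between inherited methods and changed members. All names below are hypothetical.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class HitSketch {
    enum Kind { NEW, REDEFINED, INHERITED }

    // Classifies each method by comparing the parent's methods with the
    // methods declared directly in the child class.
    static Map<String, Kind> classify(Set<String> parentMethods, Set<String> declaredInChild) {
        Map<String, Kind> kinds = new TreeMap<>();
        for (String m : parentMethods) {
            kinds.put(m, declaredInChild.contains(m) ? Kind.REDEFINED : Kind.INHERITED);
        }
        for (String m : declaredInChild) {
            kinds.putIfAbsent(m, Kind.NEW);
        }
        return kinds;
    }

    // Rough testing decision for each category (an assumption of this sketch).
    static String plan(Kind kind) {
        switch (kind) {
            case NEW: return "write new tests";
            case REDEFINED: return "reuse specification-based tests, add implementation-based tests";
            default: return "no retest in this simplified sketch";
        }
    }

    public static void main(String[] args) {
        Set<String> parent = new TreeSet<>(Arrays.asList("getName", "setName", "pay"));
        Set<String> child = new TreeSet<>(Arrays.asList("pay", "assignTeam"));
        classify(parent, child).forEach((m, k) -> System.out.println(m + ": " + plan(k)));
    }
}
```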


The result of applying HIT is a smaller, just as effective test suite. The effort required to construct the test cases and to create the test data is at most the same as a non-inheriting class and may be greatly reduced depending upon the percentage of methods being inherited. [4] provides the analysis table for C++. The table can easily be modified for other languages. A script that performs the HIT analysis on models represented in UML in Rational Rose is available from


Construct tests only once


If you have to construct tests, follow the developer's lead in order to reduce effort. Every good developer uses "regularity" in organizing their designs and code. Often they develop new code by cutting and pasting from previous code and saving much time. Testers can achieve the same productivity if they are willing to study the design, identify patterns in the specifications and use those patterns to guide the repetitive construction of test cases. Application frameworks often take advantage of this regularity to ease the job of extending the framework.


Create a test suite that covers the code of one instance of the pattern. Once this instance has been used and debugged, clone the instance for the other instances of the pattern. Typically the initial pattern will involve the highest level concrete classes.


On a recent project one of the developers created a report writing framework. It was a very general and very powerful framework. Constructing the test suite for the first report was non-trivial due to the generality, but it was accomplished using many of the same language constructs used in the report framework. Constructing the additional test suites was simply a matter of copying, pasting and editing the initial code.


Constructing Tests from Patterns

In the last column I briefly presented the Bridge pattern. This pattern defines a hierarchy of implementation classes that support a common interface. Building specification tests for these implementation classes is greatly facilitated by the PACT architecture that was presented in an earlier column[6]. In fact if we only wanted to develop specification-based tests we probably would only need to create a single class since all of the implementation classes provide the same interface. However, since we will most likely want to develop tests that cover specific portions of the implementations, which all vary, individual test classes are constructed. The common test cases for the common specification are inherited by each test class.
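
The arrangement described here might look like the following sketch. The CounterImpl interface and its two implementations are hypothetical stand-ins for the implementation hierarchy of a Bridge, and the test classes use plain boolean methods rather than any particular test framework. The test cases for the common specification are written once in an abstract test class, and each concrete test class inherits them and may add implementation-specific cases.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Common interface supported by every implementation class (hypothetical).
interface CounterImpl {
    void increment(String key);
    int count(String key);
}

class MapCounter implements CounterImpl {
    private final Map<String, Integer> counts = new HashMap<>();
    public void increment(String key) { counts.merge(key, 1, Integer::sum); }
    public int count(String key) { return counts.getOrDefault(key, 0); }
}

class ListCounter implements CounterImpl {
    private final List<String> seen = new ArrayList<>();
    public void increment(String key) { seen.add(key); }
    public int count(String key) {
        int n = 0;
        for (String s : seen) {
            if (s.equals(key)) n++;
        }
        return n;
    }
}

// Test cases for the common specification, written once.
abstract class CounterImplTest {
    abstract CounterImpl create();

    boolean testStartsAtZero() { return create().count("a") == 0; }

    boolean testIncrementCounts() {
        CounterImpl c = create();
        c.increment("a");
        c.increment("a");
        c.increment("b");
        return c.count("a") == 2 && c.count("b") == 1;
    }

    boolean runAll() { return testStartsAtZero() && testIncrementCounts(); }
}

class MapCounterTest extends CounterImplTest {
    CounterImpl create() { return new MapCounter(); }
}

class ListCounterTest extends CounterImplTest {
    CounterImpl create() { return new ListCounter(); }

    // Implementation-specific case: many duplicates in the underlying list.
    boolean testManyDuplicates() {
        CounterImpl c = create();
        for (int i = 0; i < 1000; i++) c.increment("x");
        return c.count("x") == 1000;
    }

    @Override
    boolean runAll() { return super.runAll() && testManyDuplicates(); }
}

class BridgeTestDemo {
    public static void main(String[] args) {
        System.out.println(new MapCounterTest().runAll());
        System.out.println(new ListCounterTest().runAll());
    }
}
```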


One caution. Design patterns are not class definitions. A class may play roles in multiple patterns. So the interface of an implementation class in the Bridge pattern may also define other methods that reflect the participation of the class in other patterns, and these methods also need to be tested.


One of the areas that we are currently researching at Clemson University is the relationship between patterns and the efficient development of tests. One fact seems obvious. If you are designing using standard design patterns, there should be correspondingly standard patterns for the design of the test software. That is not to say that the pattern for the test software will be the same as the one for the production software. But, there should be a fixed pattern as to how the test software is designed for a given design pattern for the application software.




Component testing is one of the most useful but controversial levels of testing. Some organizations make it the focus of their testing program while others see no need to test at this low level. Tests based on the specification of the component contribute heavily to the quality of the component. No single notation provides sufficient information in an easy-to-interpret form to support the construction of a comprehensive test suite. Using several different models to present specification information ensures that all aspects of an object's behavior are consistently, completely and correctly communicated.


Constructing tests effectively involves analysis of the specification, analysis of the component's role in the design and the systematic selection of test data. Each of these activities can be made more efficient and effective by careful analysis and preparation and by utilizing the advantages of object-oriented structures and relationships.


Before you start sending piles of email, I am fully aware of what is missing from this column: an example! Next month I will work through a slice in an application.



1. Boris Beizer. Software Testing Techniques, Second Edition. International Thomson Press, 1990.

2. Erich Gamma, Richard Helm, Ralph Johnson and John Vlissides. Design Patterns: Elements of Reusable Object-Oriented Software, Addison-Wesley, 1995.

3. Ron Ben-Natan. CORBA: A Guide to the Common Object Request Broker Architecture, McGraw-Hill, 1995.

4. Mary Jean Harrold, Kevin Fitzpatrick and John D. McGregor. Incremental Testing of Object-Oriented Class Structures, Fourteenth International Conference on Software Engineering, May 1992.

5. John D. McGregor. Planning for Component Testing, Journal of Object-Oriented Programming, February 1997.

6. John D. McGregor. Parallel Architecture for Component Testing, Journal of Object-Oriented Programming, May 1997.

7. Rapide Design Team. The Rapide-1 Executable Language Reference Manual, Program Analysis and Verification Group, Computer Systems Lab., Stanford University, version 1 edition, October 1994.

8. Using Rational Rose 4.0, Rational Software Corp., 1996.

9. Peter Wegner. Why Interaction is More Powerful than Algorithms, Communications of the ACM, May 1997, pp. 80 - 91.