Interrupts

Mark Smotherman
Last updated: July 2017

Summary: Interrupts are a vital part of sequencing a modern computer. They were developed for exception handling and were later applied to I/O events.

.. under construction ..


Contents

Machines discussed:

Comments on interrupts:

Additional topics:

Note

In my 1989 paper, "A Sequencing-Based Taxonomy of I/O Systems and Review of Historical Machines," Computer Architecture News, vol. 17, no. 5, pp. 5-15, September 1989 (and reprinted as Paper 1 of Chapter 7, in M. Hill, N. Jouppi, and G. Sohi (eds.), Readings in Computer Architecture, Morgan Kaufmann, San Francisco, 2000),


Short Review of the History of Interrupts


 
However, the rest of the story includes:
 


 


Comments on interrupts

Turner and Rawlings' comments on interrupts

From their 1958 paper cited above.

It is necessary to interlace computing and information transfers to achieve required speeds. It is well known that almost all the cycle time of input-output devices is available for computing if coding which is clever enough can be prepared. In practice, the difficulty of such coding is so great that little use is usually made of this theoretically available time.
 
...
[Discussing possible coding and hardware errors when interrupts are used] The heart of these difficulties is a subtle difference between interrupt and noninterrupt modes of operation. In the conventional mode, not only does the result of calculation depend only on the initially stored code and on the data, but so does every intermediate state of the computer. With interrupt, because of the random timing of interruption, this is no longer true; in fact, even with a correct program, only the answers are expected to be predictable. One of the startling effects of this variability is that a programming or coding error, by producing a variable error or sometimes none at all, may be indistinguishable from a random computer malfunction. Many hours were spent in discovering this peculiarity.
 
... The nature of INTERRUPT with randomly-timed devices is such that the thin line which has separated program difficulties from hardware difficulties is now invisible. A problem with a program error may be successfully run many times before the peculiar circumstances which produce a failure show up. When they do show up there is little or no evidence to be analyzed.
[capitalization in original]

Strachey's comments on interrupts

A widely-cited paper that described the use of interrupts is Christopher Strachey, "Timesharing in Large Fast Computers," International Conference on Information Processing, June 1959, Paris, pp. 336-341 [IFIP conference; the proceedings were published by UNESCO in 1960].

In the paper, he first describes a simple interruption scheme for handling magnetic tape transfers and then argues for a more sophisticated design:

This plan leaves several difficulties unresolved; it makes, for example, no provision for lock-out during transfers. Another difficulty is the possibility of several "interrupt program" signals coming at the same time (from different units) or of one special sequence of orders being interrupted by another (or even worse, by itself).
 
A fairly straightforward way out of these difficulties is to divide all possible interrupt signals into a small number of classes. The classes are allotted in order of priority, and any interrupt signal can only interrupt the program immediately if this is of a lower priority class. If the program being executed is of the same or higher priority class, the interrupt signal is merely stored. At the end of a sequence of orders of one priority, the machine scans the waiting list and starts to operate on the call with the highest remaining priority.
 
The top priority would contain operations which cannot afford to wait e.g. transfer of words to or from magnetic tape. The lowest priority would be the main program. There might have to be one or two intermediate classes, the principle of classification being that those which have inertia must have a higher priority than those that have none.
 
... it may be necessary to store the content of various arithmetic and index registers before proceeding with the new program. If this is necessary the general rule should be to store them in locations which are associated with the interrupting program and to restore them again at the end of that program. The amount of information to be stored is determined by what use the interrupting program makes of these common registers. In this way no time will be wasted storing more information than is necessary.

By "lock-out" he means preventing access to an I/O buffer by the main program until the I/O transfer is complete.

He goes on to describe a time-shared machine that could support a long-running "base load" production program, a computer operator starting "short-run" programs as needed, a programmer debugging a new program, and an engineer who could be maintaining and testing the peripherals.

He describes a software "Director" that would manage the transitions between the different program sequences and have access to special instructions, memory protection limit registers, and a facility for limiting execution time. Today, we would recognize his description of a Director as an operating system running in a protected execution mode with access to memory base and bounds registers, privileged I/O instructions, and a watchdog timer.

He gives a tentative priority list of program sequences from highest priority to lowest:

and then notes:

It may be desirable to divide these into various groups, or even allow the priority of a program to be decided by the Director.

With regard to the relative innovation of the ideas presented, this paper appears a bit late in the chronology of computers with interrupt systems. For example, compare this paper with Forgie's 1957 paper on the TX-2, Brooks' 1957 paper on interrupt systems, and Gill's 1958 paper on parallel programming. On the other hand, Strachey explicitly states twice in his paper that he knows of no actual or projected computer that implemented or would implement his ideas, and Gill's own 1958 paper cites Strachey's idea of two programs running simultaneously on one computer. (In the discussion session included as part of the published Strachey paper, Ted Codd of IBM briefly mentions that the IBM Stretch is similar in approach and that he hopes to describe multiprogramming on the Stretch at the next EJCC.) Strachey filed for a patent on his ideas in Great Britain in February 1959 and in the U.S. in February 1960. He was granted US Patent 3,222,647 in December 1965.

Dijkstra's comments on interrupts

Denning's comments on interrupts

Peter J. Denning, "Principles for Reliable Operating Systems," January 28, 2003

Hardy's comments on interrupts

Norm Hardy's history of interrupts

Asanović's comments on interrupts

Krste Asanović is one of the chief architects of the RISC-V instruction set architecture. He described two major system design viewpoints regarding interrupts, each of which is supported in RISC-V:

He also identified a third viewpoint for high-performance real-time systems that must avoid interrupt-handling overhead and that tend to poll I/O devices on a regular schedule governed by a timer interrupt.

See the slides and video of his presentation on the RISC-V interrupt system design at the RISC-V Workshop at MIT CSAIL, Cambridge, MA, July 12, 2016.


Design Tree for Interrupts

(adapted from Blaauw and Brooks, section 7.4, on interrupts)

  1. terminology and connotations (there are no well-defined, widely-accepted distinctions among the various terms)
    1. asynchronous, external
      1. interrupt - external device (I/O, clock, etc.)
      2. machine check - hardware error or failure
    2. synchronous, internal
      1. general terms
        1. alarm
        2. exception - special or undefined case (e.g., divide by zero)
        3. internal interrupt
        4. program check
        5. trap (sprung like a "mousetrap")
      2. terms associated with no continued execution
        1. abort - exception in the middle of an instruction
      3. terms associated with continued execution (either restart or resume)
        1. fault - missing information (e.g., page fault)
      4. terms associated with user request to OS
        1. mme (master mode entry)
        2. software interrupt (e.g., INT opcode in x86)
        3. svc (supervisor call)
        4. syscall
        5. trap (trap always on SPARC)

  2. entry point (i.e., where CPU execution diverted)
    1. fixed memory address (e.g., PDP-8)
      1. poll to find cause (e.g., PDP-8)
      2. code indicating cause placed in register
    2. one of many memory locations (interrupt vector)
      1. number of vectors
        1. single, globally shared (e.g., Electrologica X-1, S/360, x86)
        2. multiple, one per process (e.g., IBM Stretch)
      2. vector contents
        1. interrupt vector entry holds new PC (and maybe new PSW) (e.g., S/360, x86)
        2. interrupt vector holds blocks of instructions (e.g., SPARC)
    3. different PC (e.g., DYSEAC, TX-2)

  3. return linkage
    1. fixed memory location(s) (e.g., PDP-8, S/360)
    2. memory stack (e.g., x86, M68K)
    3. register (e.g., SPARC)

  4. permission
    1. single interrupt enable bit (e.g., PDP-8, 8086 w/o PIC)
      • enable/disable interrupt instructions
    2. interrupt enable mask in PSW (e.g., S/360)
    3. interrupt priority code in PSW (e.g., SPARC)
    4. nonmaskable (cannot be ignored)
    5. breakpoint bit in each instruction (e.g., DYSEAC)
      • device priority along with break and dismiss bit in each instruction (e.g., TX-2)

  5. techniques to accelerate interrupt handling
    1. cause determination (see cause register and interrupt vector)
    2. additional register sets
    3. interrupt poll to coalesce multiple interrupts (e.g., ITI on B5500 in 1960s, test pending interrupt on S/370 XA)
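
To make the entry-point alternatives in branch 2 above concrete, the following C sketch contrasts the two classic styles: a single fixed entry point at which software polls to find the cause (PDP-8 style) against a vector of per-cause entry points (S/360 and x86 style). The device functions and table size are invented for the example.

    #include <stdbool.h>

    #define NUM_CAUSES 8                     /* table size invented for illustration */

    /* Style 2.1: one fixed entry point; software polls to find the cause
     * (cf. PDP-8). Both device functions are hypothetical. */
    extern bool device_requesting(int dev);
    extern void service_device(int dev);

    void fixed_entry_interrupt(void)
    {
        for (int dev = 0; dev < NUM_CAUSES; dev++)
            if (device_requesting(dev))
                service_device(dev);
    }

    /* Style 2.2: an interrupt vector; the hardware in effect indexes a table
     * by cause to find the new PC (cf. S/360 new PSWs, x86 interrupt table). */
    typedef void (*isr_t)(void);
    isr_t vector_table[NUM_CAUSES];

    void vectored_dispatch(int cause)
    {
        vector_table[cause]();               /* direct entry, no polling loop */
    }

The tradeoff is visible in branch 5.1: the vectored style accelerates cause determination at the cost of more hardware, while the fixed-address style keeps the hardware minimal and pushes the work into the polling loop.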


Performance Techniques

Rob Warnock writing in comp.arch on 16 Jul 2001:

As far as the areas that I've been personally involved in -- mostly high-speed networking -- the best hardware/software tradeoffs have been mostly just judgment calls by very experienced folks. What I've noticed is that a *LOT* of "lessons learned" decades ago are still being re-learned by later generations, over and over again. We as an industry don't seem to have been doing a very good job of passing on the "secret sauce" to subsequent generations [in many cases, even *within* the same companies!].
 
For example, way back in the early 1970s at Digital Communications Assoc. [in Atlanta, made communications front-ends for DEC PDP-10s and IBM 360s using DEC PDP-8e's, successfully sold in *competition* with DEC's own PDP-11-based front-ends!], we made simple interrupt-per-character PIO-based terminal controllers that were -- in terms of host CPU cycles -- *more* efficient (and a lot cheaper!) than DEC's complex DMA-based controllers, partly because we used three simple techniques, two programming and one hardware: (1) interrupt "coalescing", (2) "dallying" before dismissing an interrupt, and (3) interrupt "holdoff". That is, (1) during a single interrupt, process *all* of the data that's available or that *becomes* available (looping to repeat the availability test as needed) during a single interrupt; (2) spin-wait/test a certain amount (which may even be context/data-dependent!) before dismissing an interrupt, so as to catch new data and process it without having to dismiss/interrupt again; and (3) once you finally *do* dismiss an interrupt, arrange the hardware so that you won't get another one until a certain (sometimes dynamically) chosen time interval has expired (also known as "interrupt rate-limiting", which, together with the other two, prevents "thrashing" in the interrupt service routine).
 
[There's another variant of #2 that's worth mentioning: when doing PIO writes to a FIFO and hitting a FIFO-full condition, "dallying" (spin-waiting) a certain amount first rather than enabling a watermark interrupt and going away. If you pick the dally amount appropriately, by "wasting" a few cycles you actually can save *lots* of total CPU time, since you (hopefully) avoid a later heavy-weight interrupt/context-switch/dismiss.]
 
These techniques were *NOT* unique to DCA!! In talking to other engineers at conferences & trade shows, we learned that such things were generally widely known among workers in the field (especially embedded systems, datacomm, and real-time folks), but were not often discussed openly, being considered somewhat "trade secrets". [That is, why tell your competitor why his performance sucks and yours doesn't?!?]
 
In the 30+ years since then, I've been astounded how many times the techniques of coalescing/dallying/holdoff have been rediscovered, but I've been *more* astounded how many times they haven't been used *at all* in many places where they were definitely needed, or how they were reinvented badly. [E.g., if you're implementing hardware assistance for interrupt holdoff, it is *extremely* important that the holdoff period be effective only *after* the dismissing of an interrupt, and not delay the response to or processing of a "new" event that arrives when the system has been idle for longer than the holdoff.] "Computer Science" courses don't seem to ever discuss such practicalities, either.

(E.g., see US 3,789,365 issued January 1974 and US 4,342,082 issued July 1982)
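
As a rough illustration of Warnock's three techniques, here is a hedged C sketch of a receive-interrupt handler. The device-access functions, the holdoff facility, and both tuning constants are invented for the example; as Warnock notes, real drivers choose such values empirically, sometimes dynamically.

    #include <stdbool.h>
    #include <stdint.h>

    extern bool rx_data_available(void);               /* hypothetical device queries */
    extern uint8_t rx_read_byte(void);
    extern void deliver(uint8_t b);
    extern void set_interrupt_holdoff(uint32_t usec);  /* hypothetical rate-limit hardware */

    #define DALLY_SPINS   200   /* how long to linger before dismissing */
    #define HOLDOFF_USEC  50    /* minimum gap before the next interrupt */

    void rx_interrupt(void)
    {
        int spins = DALLY_SPINS;

        for (;;) {
            /* (1) Coalescing: drain everything that is available, or that
             * becomes available, during this one interrupt. */
            while (rx_data_available()) {
                deliver(rx_read_byte());
                spins = DALLY_SPINS;      /* fresh data resets the dally budget */
            }

            /* (2) Dallying: spin-wait a bounded amount before dismissing, to
             * catch late-arriving data without taking another interrupt. */
            if (spins-- <= 0)
                break;
        }

        /* (3) Holdoff: after dismissal, suppress further interrupts for an
         * interval. Per Warnock's caveat, holdoff applies only *after* a
         * dismissal; it must never delay the first event after an idle period. */
        set_interrupt_holdoff(HOLDOFF_USEC);
    }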


Deferred Interrupt Handler Structure

[under development]

As Peter Freeman writes in his 1975 Software System Principles textbook, "Interrupts are often viewed as a mixed blessing by the designers of software systems. In many systems, the interrupt is the basis for much, if not all, parallel operation. It is used to send signals between independent processes and is thus of central importance to operating systems. ... On the other hand, the existence of interrupts increases the complexity of the operating system."

Freeman goes on to note that "Many interrupts are time-dependent so a quick response is necessary. If the action to be taken is long in duration with respect to the expected time of the next event, the interrupt handler may simply place a request in a work queue for later action."

In this section I want to trace the development of the idea of a deferred interrupt handler. The idea is also known as a fork routine (e.g., RSX-11M), deferred procedure call (e.g., Windows), interrupt continuation (e.g., Mach 3.0), bottom half interrupt handler (e.g., Linux), softirq (e.g., Linux), tasklet (e.g., Linux), and interrupt queueing (e.g., proposed extension to AIX locking in US 5274823, "Interrupt handling serialization for process level programming").

Interrupt stacking is another term that is sometimes used to mean deferred interrupt handling (e.g., IBM TSS), but it is also used in some hardware manuals to mean the pushing of processor state on a memory stack as part of the hardware response to an interrupt (e.g., ARM).

Alexander Koenen-Dresp, in section 5.2.1 of his doctoral thesis "Applying the Engineering Statechart Formalism to the Evaluation of Soft Real-Time in Operating Systems: A Use Case Tailored Modeling and Analysis Technique," Saarland University, 2008, gives an architectural categorization of deferred interrupts:

For example, in RSX-11M, interrupt handlers that will run for more than 500 microseconds and/or will update a system data structure should fork a routine for later execution by adding a four-word fork block to a FIFO fork queue. The fork routines must be executed before any user processes are scheduled. So, in the above categorization, some of the interrupt handling in RSX-11M is deferred, on demand, conducted prior to scheduling, and does not possess priority compliance.
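
In outline, the fork mechanism is just a FIFO of small work records drained on the way back from interrupt level. The C sketch below is a loose model of that pattern, not RSX-11M code; the names and field layout are invented, and a real kernel would mask interrupts around the queue updates.

    #include <stddef.h>

    /* A "fork block": roughly RSX-11M's four words -- a queue link, the
     * routine to call later, and two words of context for it. */
    struct fork_block {
        struct fork_block *next;
        void (*routine)(unsigned w1, unsigned w2);
        unsigned word1, word2;
    };

    static struct fork_block *fork_head, *fork_tail;   /* FIFO of deferred work */

    /* Called from an interrupt handler whose work is too long to do at
     * interrupt level or must touch shared system data structures. */
    void fork_later(struct fork_block *fb)
    {
        fb->next = NULL;
        if (fork_tail) fork_tail->next = fb; else fork_head = fb;
        fork_tail = fb;
    }

    /* Called on the way out of interrupt level, before any user process is
     * scheduled: drain the fork queue in FIFO order. */
    void run_fork_queue(void)
    {
        while (fork_head) {
            struct fork_block *fb = fork_head;
            fork_head = fb->next;
            if (!fork_head) fork_tail = NULL;
            fb->routine(fb->word1, fb->word2);
        }
    }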

Koenen-Dresp noted that modern operating systems such as Linux are typically hybrid designs with some amount of immediate interrupt handling (e.g., timer interrupts and interprocessor interrupts) and some amount of deferred interrupt handling.

The motivations for deferred interrupts include:

Mercury Programming System

In "JN: An Operating System for an Embedded Java Network Computer," UCSC-CRL-96-29, 1996, Bruce Montague writes that the idea of deferred interrupt handling appears to date from at least 1960 with the soft real-time monitor developed by IBM for the Project Mercury space program. While it is likely that the idea developed independently a number of times, he identifies two other IBM systems that were influenced by the the Mercury Programming System:

Manchester Atlas Supervisor

The Manchester Atlas supervisor is described in T. Kilburn, R.B. Payne, and D.J. Howarth, "The Atlas Supervisor," in Proc. AFIPS EJCC, 1961, pp. 279-294. The supervisor allowed interrupt handlers to invoke routines within the supervisor for lengthy operations. These routines ran with interrupts enabled but at a higher priority than user programs:

The interrupt routines are designed to handle calls for action with the minimum delay and in the shortest time; the character-by-character transfers to and from peripheral equipments, for example, occur at high frequency and it is essential that the transfers be carried out with the minimum possible use of the central computer and within the time limit allowed by the peripheral equipment for filling or emptying the buffer. Since several interrupt flip-flops can become set simultaneously, but cannot be acted upon while another interrupt routine is still in progress, it is essential that a short time limit be observed by each interrupt routine. On some occasions, however, longer sequences are required ... In such cases, the interrupt routine initiates a routine to be obeyed under extracode control, known as a supervisor extra code routine.

Supervisor extracode routines (S.E.R.'s) form the principal "branches" of the supervisor program. They are activated either by interrupt routines or by extracode instructions occurring in an object program. ... They can moreover be interrupted by interrupt routines, which may in turn initiate other S.E.R.'s. It is thus possible for several S.E.R.'s to be activated at the same time, in the same way as it is possible for several interrupt flip-flops to be set at the same time. Although several S.E.R.'s may be activated, obviously not more than one can be obeyed at any one moment; the rest are either halted or held awaiting execution.
 
This matter is organized by a part of the supervisor called the "co-ordinator routine" which is held in fixed store. Activation of an S.E.R. always occurs via the co-ordinator routine, which arranges that any S.E.R. in progress is not interrupted by other S.E.R.'s. As these are activated, they are recorded in subsidiary store in lists and an entry is extracted from one of these lists whenever an S.E.R. ends or halts itself. Once started, an S.E.R. is always allowed to continue if it can; a high priority S.E.R. does not "interrupt" a low priority S.E.R. but is entered only on conclusion or halting of the current S.E.R. The co-ordinator has the role of the program equivalent of the "inhibit interrupt flip-flop", the lists of activated S.E.R.'s being the equivalent of the setting of several interrupt flip-flops. The two major differences are that no time limit is placed on an S.E.R., and that an S.E.R. may halt itself for various reasons; this is in contrast to interrupt routines, which observe a time limit and are never halted. ...
 
In order that the activity of each branch of the computing system be maintained at the highest possible level, the S.E.R.'s awaiting execution are recorded in four distinct lists. Within each list, the routines are obeyed in the order in which they were activated, but the lists are assigned priorities, so that the top priority list is emptied before entries are extracted from the next list. The top priority list holds routines initiated by completion of drum transfers, and also routines entered as a result of computer failures such as core store parity. The second list holds routines arising from magnetic tape interruptions and the third holds routines arising from peripheral interruptions. The lowest priority list contains one entry for each object program currently under execution, and entry to an S.E.R. through an extracode instruction in an object program is recorded in this list. On completion of an S.E.R., the co-ordinator routine selects for execution the first activated S.E.R. in the highest priority list.
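
The co-ordinator's dispatching rule (FIFO within each of four lists, strict priority between lists, and no preemption of a running S.E.R.) can be sketched in a few lines of C. The list assignments follow the quotation above; everything else here is invented for illustration.

    struct ser { struct ser *next; void (*body)(void); };

    /* List 0: drum transfers and machine failures; list 1: magnetic tape;
     * list 2: peripherals; list 3: one entry per object program. */
    #define NUM_LISTS 4
    static struct ser *head[NUM_LISTS], *tail[NUM_LISTS];

    /* Activation (from an interrupt routine or an extracode instruction):
     * append, so routines are obeyed in the order they were activated. */
    void activate_ser(int list, struct ser *s)
    {
        s->next = NULL;
        if (tail[list]) tail[list]->next = s; else head[list] = s;
        tail[list] = s;
    }

    /* Entered only when the current S.E.R. ends or halts itself; a running
     * S.E.R. is never preempted by another S.E.R., even a higher-priority one. */
    void coordinator(void)
    {
        for (int l = 0; l < NUM_LISTS; l++) {
            if (head[l]) {
                struct ser *s = head[l];
                head[l] = s->next;
                if (!head[l]) tail[l] = NULL;
                s->body();   /* runs with interrupts enabled, above user programs */
                return;
            }
        }
    }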

SDS Sigma-7

Two systems with deferred handling are cited by Richard Watson in Timesharing System Design Concepts, 1970:

OS designs influenced by David Cutler

Montague refers to deferred interrupt handling as a "Cutler kernel" design approach. The genealogical tree of operating systems involving David Cutler is:

Linux deferred handlers


Precise Interrupts


Saving and Restoring Implementation-Dependent State


Corrections and additions are welcome!



mark@cs.clemson.edu