Mark Smotherman
Last updated: July 2017
Summary: Interrupts are a vital part of sequencing a modern computer. They were developed for exception handling and were later applied to I/O events.
.. under construction ..
Machines discussed:
Comments on interrupts:
Additional topics:
In my 1989 paper, "A Sequencing-Based Taxonomy of I/O Systems and Review of Historical Machines," Computer Architecture News, vol. 17, no. 5, pp. 5-15, September 1989 (and reprinted as Paper 1 of Chapter 7, in M. Hill, N. Jouppi, and G. Sohi (eds.), Readings in Computer Architecture, Morgan Kaufmann, San Francisco, 2000),
(cf. SWAC with a 4th address in the instruction used to specify a
branch address on overflow)
In the context of arithmetic overflow handling in the UNIVAC I,
Codd states,
"in the NBS DYSEAC the very significant step was made of
extending interruption to input-output operations."
[E.F. Codd, "Multiprogramming," pp. 77-153, in F.L. Alt and
M. Rubinoff (eds.), Advances in Computers, Vol. 3. New York:
Academic Press, 1962.]
... versus ...
[discussing interlocks]
Thus the assigned priority levels are not under program control and
cannot change during execution.
Memory included:
Also, the X-1 could provide:
At that point, the predefined subroutine call instruction corresponding
to the interrupt class was inserted into the instruction register.
Executing this instruction formed a link word (note the contents listed
above), which was saved into the corresponding word in low memory, and
control was transferred to the interrupt handler.
The interrupt handler began execution with interrupts disabled ("not
susceptible"). The interrupt handler had the responsibility of saving
the A, S, and B registers and then reading the class word. All bits
(i.e., interrupt request signals) in the class word would be reset upon
the read.
The enable bit and permit bits could be individually set and reset
under program control to provide for disabling all or some of the
interrupts. This allowed for a priority system among classes.
Returning from the interrupt handler was accomplished by use of a
"restoring jump" which restored the information from the saved link word.
The CDC 6600 is well-known for the absence of I/O interrupts. However,
it did have exceptions that interrupted the execution of the central
processor, and the peripheral processing units could also interrupt
the execution of the central processor by executing an exchange jump
instruction.
Thornton states that, after circuit design problems forced a project
restart in 1961, "Substantially more parallel processing and a new I/O
scheme were added to the scope of the design." He describes the parallel
processing related to the I/O as follows:
From their 1958 paper cited above.
A widely-cited paper that described the use of interrupts is
Christopher Strachey, "Timesharing in Large Fast Computers,"
International Conference on Information Processing, June 1959, Paris,
pp. 336-341 [IFIP conference; the proceedings were published by UNESCO
in 1960].
In the paper, he first describes a simple interruption scheme for handling
magnetic tape transfers and then argues for a more sophisticated design:
By "lock-out" he means preventing access to an I/O buffer by the main
program until the I/O transfer is complete.
He goes on to describe a time-shared machine that could support a
long-running "base load" production program, a computer operator starting
"short-run" programs as needed, a programmer debugging a new program, and
an engineer who could be maintaining and testing the peripherals.
He describes a software "Director" that would manage the transitions between
the different program sequences and have access to special instructions,
memory protection limit registers, and a facility for limiting execution time.
Today, we would recognize his description of a Director as an operating system
running in a protected execution mode with access to memory base and bounds
registers, privileged I/O instructions, and a watchdog timer.
He gives a tentative priority list of program sequences from highest priority
to lowest:
and then notes:
With regard to the relative innovation of the ideas presented, this
paper appears a bit late in the chronology of computers with interrupt
systems. For example, compare this paper with Forgie's 1957 paper on
the TX-2, Brooks' 1957 paper on interrupt systems, and Gill's 1958
paper on parallel programming. On the other hand, Strachey
explicitly states twice in his paper that he knows of no actual or
projected computer that implemented or would implement his ideas, and
Gill cites Strachey's idea of two programs running simultaneously on one
computer in Gill's own 1958 paper. (In the discussion session included
as part of the published Strachey paper, Ted Codd of IBM briefly mentions
that the IBM Stretch is similar in approach and that he hopes to describe
multiprogramming on the Stretch at the next EJCC.) Strachey filed for a
patent on his ideas in Great Britain in February 1959 and in the U.S.
in February 1960. He was granted US Patent 3,222,647 in December 1965.
Peter J. Denning, "Principles for Reliable Operating Systems,"
January 28, 2003
Norm Hardy's history of interrupts
Krste Asanović is one of the chief architects of the RISC-V
instruction set architecture. He described two major system design
viewpoints regarding interrupts, each of which is supported in RISC-V:
He also identifies a third viewpoint for high-performance real-time
systems that must avoid interrupt-handling overhead and that tend
to poll I/O devices on a regular schedule governed by a timer interrupt.
See the slides and video of his presentation on the RISC-V interrupt
system design at the RISC-V Workshop at MIT CSAIL, Cambridge, MA,
July 12, 2016.
(adapted from Blaauw and Brooks, section 7.4, on interrupts)
Rob Warnock writing in comp.arch on 16 Jul 2001:
(E.g., see US 3,789,365 issued January 1974 and US 4,342,082
issued July 1982)
[under development]
As Peter Freeman writes in his 1975 Software System Principles textbook,
"Interrupts are often viewed as a mixed blessing by the designers of
software systems. In many systems, the interrupt is the basis for much,
if not all, parallel operation. It is used to send signals between
independent processes and is thus of central importance to operating
systems. ... On the other hand, the existence of interrupts increases
the complexity of the operating system."
Freeman goes on to note that "Many interrupts are time-dependent so a
quick response is necessary. If the action to be taken is long in duration
with respect to the expected time of the next event, the interrupt handler
may simply place a request in a work queue for later action."
In this section I want to trace the development of the idea of
a deferred interrupt handler. The idea is also known as a
fork routine (e.g., RSX-11M),
deferred procedure call (e.g., Windows),
interrupt continuation (e.g., Mach 3.0),
bottom half interrupt handler (e.g., Linux),
softirq (e.g., Linux),
tasklet (e.g., Linux),
and interrupt queueing (e.g., proposed extension to AIX locking in
US 5274823, "Interrupt handling serialization for process level programming").
Interrupt stacking is another term that is sometimes used to mean deferred
interrupt handling (e.g., IBM TSS), but it is also used in some hardware
manuals to mean the pushing of processor state on a memory
stack as part of the hardware response to an interrupt (e.g., ARM).
Alexander Koenen-Dresp,
in section 5.2.1 of his doctoral thesis
"Applying the Engineering Statechart Formalism
to the Evaluation of Soft Real-Time in Operating Systems: A Use Case
Tailored Modeling and Analysis Technique,"
Saarland University, 2008,
gives an architectural categorization of deferred interrupts:
For example, in RSX-11M, interrupt handlers that will run more than
500 microseconds and/or will update a system data structure should
fork a routine for later execution by adding a four-word fork block
to a FIFO fork queue. The fork routines must be executed before any
user processes are scheduled. So, in the above categorization, some
of the interrupt handling in RSX-11M is deferred, on demand, conducted
prior to scheduling, and does not possess priority compliance.
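As a rough illustration of this fork-queue pattern, here is a small C
sketch; the structure and routine names are illustrative only, not the
actual RSX-11M data structures or executive routines.

    #include <stddef.h>

    struct fork_block {                  /* RSX-11M used a four-word block   */
        struct fork_block *next;
        void (*routine)(void *arg);      /* the deferred work                */
        void *arg;
    };

    static struct fork_block *fork_head;
    static struct fork_block **fork_tail = &fork_head;

    /* Called from an interrupt handler, with interrupts blocked: queue the
       remainder of the work instead of doing it at interrupt level.         */
    void fork_enqueue(struct fork_block *fb)
    {
        fb->next = NULL;
        *fork_tail = fb;
        fork_tail = &fb->next;
    }

    /* Called on the way out of interrupt level, before the scheduler will
       consider any user process.  (A real kernel would briefly block
       interrupts while unlinking each entry.)                               */
    void fork_drain(void)
    {
        while (fork_head) {
            struct fork_block *fb = fork_head;
            fork_head = fb->next;
            if (fork_head == NULL)
                fork_tail = &fork_head;
            fb->routine(fb->arg);        /* runs below interrupt priority    */
        }
    }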
Koenen-Dresp noted that modern operating systems such as Linux are
typically hybrid designs with some amount of immediate interrupt
handling (e.g., timer interrupts and interprocessor interrupts)
and some amount of deferred interrupt handling.
The motivations for deferred interrupts include:
In
"JN: An Operating System for an Embedded Java Network Computer,"
UCSC-CRL-96-29, 1996, Bruce Montague writes that the idea of deferred
interrupt handling appears to date from at least 1960 with the soft
real-time monitor developed by IBM for the Project Mercury space program.
While it is likely that the idea developed independently a number of
times, he identifies two other IBM systems that were influenced by
the Mercury Programming System:
The Manchester Atlas supervisor is described in
T. Kilburn, R.B. Payne, and D.J. Howarth, "The Atlas Supervisor,"
in Proc. AFIPS EJCC, 1961, pp. 279-294. The supervisor allowed interrupt
handlers to invoke routines within the supervisor for lengthy operations.
These routines ran with interrupts enabled but at a higher priority than
user programs:
Supervisor extracode routines (S.E.R.'s) form the principal "branches"
of the supervisor program. They are activated either by interrupt routines
or by extracode instructions occurring in an object program.
...
They can moreover be interrupted by interrupt routines, which may in
turn initiate other S.E.R.'s. It is thus possible for several S.E.R.'s
to be activated at the same time, in the same way as it is possible for
several interrupt flip-flops to be set at the same time. Although several
S.E.R.'s may be activated, obviously not more than one can be obeyed at
any one moment; the rest are either halted or held awaiting execution.
This matter is organized by a part of the supervisor called the
"co-ordinator routine" which is held in fixed store. Activation of an
S.E.R. always occurs via the co-ordinator routine, which arranges that
any S.E.R. in progress is not interrupted by other S.E.R.'s. As these
are activated, they are recorded in subsidiary store in lists and an
entry is extracted from one of these lists whenever an S.E.R. ends or
halts itself. Once started, an S.E.R. is always allowed to continue
if it can; a high priority S.E.R. does not "interrupt" a low priority
S.E.R. but is entered only on conclusion or halting of the current S.E.R.
The co-ordinator has the role of the program equivalent of the "inhibit
interrupt flip-flop", the lists of activated S.E.R.'s being the equivalent
of the setting of several interrupt flip-flops. The two major differences
are that no time limit is placed on an S.E.R., and that an S.E.R. may halt
itself for various reasons; this is in contrast to interrupt routines,
which observe a time limit and are never halted.
...
In order that the activity of each branch of the computing system be
maintained at the highest possible level, the S.E.R.'s awaiting
execution are recorded in four distinct lists. Within each list, the
routines are obeyed in the order in which they were activated, but the
lists are assigned priorities, so that the top priority list is emptied
before entries are extracted from the next list. The top priority list
holds routines initiated by completion of drum transfers, and also
routines entered as a result of computer failures such as core store
parity. The second list holds routines arising from magnetic tape
interruptions and the third holds routines arising from peripheral
interruptions. The lowest priority list contains one entry for each
object program currently under execution, and entry to an S.E.R. through
an extracode instruction in an object program is recorded in this list.
On completion of an S.E.R., the co-ordinator routine selects for execution
the first activated S.E.R. in the highest priority list.
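The co-ordinator's selection rule can be sketched in a few lines of C.
The structure names and list indices below are mine; the four lists
correspond to the priority classes described in the quotation.

    #define NUM_LISTS 4   /* drum/failure, magnetic tape, peripherals, object programs */

    struct ser {                       /* an activated S.E.R. awaiting execution */
        struct ser *next;
        void (*body)(void);
    };

    static struct ser  *list_head[NUM_LISTS];
    static struct ser **list_tail[NUM_LISTS] = {
        &list_head[0], &list_head[1], &list_head[2], &list_head[3]
    };

    /* Interrupt routines (or extracode entries) activate an S.E.R. by
       appending it to the list for its class.                               */
    void activate_ser(int priority, struct ser *s)
    {
        s->next = NULL;
        *list_tail[priority] = s;
        list_tail[priority] = &s->next;
    }

    /* Called only when the current S.E.R. ends or halts itself -- a running
       S.E.R. is never preempted by another S.E.R.                           */
    struct ser *select_next_ser(void)
    {
        for (int p = 0; p < NUM_LISTS; p++) {
            if (list_head[p]) {
                struct ser *s = list_head[p];
                list_head[p] = s->next;
                if (list_head[p] == NULL)
                    list_tail[p] = &list_head[p];
                return s;        /* first-activated entry of the highest list */
            }
        }
        return NULL;             /* nothing awaiting execution */
    }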
Two systems with deferred handling are cited by Richard Watson in
Timesharing System Design Concepts, 1970:
Montague refers to deferred interrupt handling as a "Cutler kernel" design
approach. The genealogical tree of operating systems involving David Cutler
is:
Corrections and additions are welcome!
mark@cs.clemson.edu
However, the rest of the story includes:
The computer reacts to an "overflow" situation automatically. The
sequence of instructions is interrupted, and the pair of instructions
in memory location 000 is inserted. This insertion is effected after
both instructions of the original pair have been executed even though
overflow may have been caused by the first instruction of the pair.
If memory location 000 contains an instruction which transfers
control, a new sequence is initiated. It is important to note that,
if no transfer of control is ordered, the original sequence of
instructions is resumed after executing the pair contained in 000.
There is a rudimentary interrupt system for machine errors. When the
stop condition of the Machine Error indicator in [sic] not enabled by
its mask switch, it causes a trap. The next instruction is now taken
from the console switches, word 8000, instead of from the specified
location. There is no status preservation whatever; there is no way
to know where the offending trap occurred. Moreover, the Distributor
and Accumulator are reset; the main purpose is to get a fresh restart
after an intermittent error.
Each input-output instruction ... indicates whether or not a
Program-Jump operation is desired after completion of the
input-output operation. If a Program-Jump is desired, then a
signal is produced after the input-output operation is completed
which causes the program to jump, i.e., the counter-register being
used as the source of the next instruction is temporarily
abandoned, and the instruction word in the address specified by
the other counter-register is executed in its stead. ...
After effecting a program-jump, the new program that is initiated
may, as soon as its final instruction is performed, cause the
machine to resume the interrupted program where it left off, if
the programmer so desires.
...
It might be noted again that any consecutive areas of the memory in
DYSEAC, ranging in size from a single word to the entire memory, can
be loaded or printed out even while a program of computation is
proceeding. Automatic interlocks are provided to guard against
inadvertent attempts to use a memory location for conflicting
purposes. For example, if a print-out order is given, the machine
is automatically prevented from writing into any memory location
affected by the print-out order until the word in that location
has been printed out.
...
The Concurrent Input-Output Control has the function of regulating
the detailed progress of all input-output operations requested in
the course of the internal program. It directs the flow of traffic
between the memory and the Input-Output Buffer, ....
It receives signals from the Input-Output Buffer as each complete
word is transferred, and it not only keeps count of the total number
of words handled, but also keeps track of which address in the
high-speed memory contains the word currently to be transferred.
This information is transmitted to the Memory Address Switches at
the proper time.
In the comparatively unlikely event that both the Input-Output Buffer
and some internal unit should simultaneously demand access to the
high-speed memory, the Concurrent Input-Output Control resolves the
conflict by according temporary priority to the external operation.
Likewise it referees conflicts or inconsistencies between the internal
program and the progress of external operations. It also notifies the
Program Control unit at the termination of an input-output operation
so that the program-jump may take place.
For example, if the programmer directs that a given section of the
memory be loaded from an external input unit and then some time
afterwards directs that the contents of a memory location in this
section be used as the source of an instruction, the program will
automatically be temporarily halted until the sought-for memory
location has actually received the word intended for it from the
input unit.
The clever feature was that rather than requiring the processor to use
its precious time, swapping memory in and out of the drum storage space,
the transfers would be automatically done on demand. This was implemented
using another clever technique. If a request was made for data not
currently in the core memory, an interrupt (then called an extracode
instruction) was called. This paused the execution of the program and
automatically loaded the page of memory which the data was in (a page
being a 512-word block of memory).
A further optimisation to this technique was also implemented. Rather
than check for every instruction to ensure it is in core memory, which
would largely waste time, as the sequential nature of instruction reads
means they are more than likely to be in the current page, a different
method was used. They eliminated the check for each instruction, and
instead implemented an exception handler to deal with the error accesses.
reviewed in depth the competing aspects of the machines presented at the
UCLA seminar. Nearly all exploited some level of parallel operation.
Such items as separate functional arithmetic units, instruction lookahead,
and an independent input/output processor were being offered as the way to
achieve high speed.
I believe that the mention of separate functional arithmetic units is a
reference to the Gamma 60, instruction lookahead is a reference to the
Stretch, and the independent input/output processor is a reference to
the LARC.
The logic gate speed of the new silicon gate was about five times faster
than the 1604. We needed a factor of fifteen or twenty to satisfy our
objective and take parallel operation well beyond that discussed at UCLA.
A first step involved total separation of the input/output operation from
the central processing unit (CPU). Such operations would be executed in
ten independent peripheral processing units (PPUs).
...
The CPU was thus free of I/O operations and, in fact, free to be
designed independently. Cray proceeded with the very innovative barrel
PPU design....
...
The 6600 came under severe criticism from time sharing enthusiasts.
Properties of the machine that caused the most concern were the lack of
virtual memory and the lack of interrupts on the PPUs. The CPU, however,
was interruptible using the very fast exchange jump to achieve rapid
switching of jobs. PPUs had to execute polling action to support interactive
terminals. Ultimately, the great power of the machine overcame much of the
opposition.
...
Each PPU independently executed programs for I/O operations and certain
operating system functions. ... Some early problems arose from the fact
that all ten PPUs could be referencing central memory polling tables for
more work, thus blocking CPU memory references.
...
The essential concept of the PPU was to off-load the detailed interaction
with standard peripheral controllers from the CPU. During the course of
executing an application program, any I/O calls were assigned to a
specific PPU. The CPU might continue or might be switched to another job,
but in either case the assigned PPU would execute the I/O function and
report back when complete. Several such PPU operations could be in execution
at once.
...
The isolation of the CPU "behind" central memory from the PPUs and the I/O
was beneficial.
Interrupt versus polling. The opposite philosophy is vividly
summarized in words attributed to Seymour Cray, designer of the 6600:
"Don't tell me about [such events] when I'm doing something else; all I
can do is save them for handling later. Instead you save them. When
I'm ready to handle them, I'll ask."
The implication is that it is more efficient to complete bursts of
CPU work without interpolated task switches than it is to strain to
the utmost to give immediate response to peripheral events.
...
What is unquestionably true is that the supervisor and perhaps the
machines are simpler when synchronization is done by polling
than when it is done by interruption.
I was responsible as principal programmer for the programming aspects
of the fleet test which was done with these computers onboard ship that
had been designed by Seymour Cray. And this machine was the AN/USQ-17 and
interrupts were then known. There were machines that had interrupts, but
in the late 50's Seymour Cray wouldn't use them. And so here was the
world's largest real time system, a command and control, tactical combat
system, which did everything with monitored buffering, no interrupts.
... I was involved in the efforts to redo this whole thing, the one thing
we learned on the fleet test, we passed the fleet test and the NTDS system
became, we got an order for it, but we substituted a new computer,
the AN/USQ-20, later known as the Univac 1206, which had an interrupt
structure and it had a different quinary factor but it, that machine
was much easier to program the system.
The machine supported interrupts with the following wart: Upon interrupt
the hardware reported the address of the instruction being interrupted.
That address identified only the word holding the instruction. The hardware
remembered whether the first instruction of that word had been executed.
This memory was inaccessible except to the mechanism that resumed after
interrupts. This made nested interrupts or switching contexts upon interrupt
excessively arcane. This is about the only Cray wart that I recall.
Comments on interrupts
Turner and Rawlings' comments on interrupts
It is necessary to interlace computing and information transfers to achieve
required speeds. It is well known that almost all the cycle time of
input-output devices is available for computing if coding which is clever
enough can be prepared. In practice, the difficulty of such coding is so
great that little use is usually made of this theoretically available
time.
...
[Discussing possible coding and hardware errors when interrupts are used]
The heart of these difficulties is a subtle difference between interrupt
and noninterrupt modes of operation. In the conventional mode, not only does
the result of calculation depend only on the initially stored code and on
the data, but so does every intermediate state of the computer. With
interrupt, because of the random timing of interruption, this is no
longer true; in fact, even with a correct program, only the answers
are expected to be predictable. One of the startling effects of this
variability is that a programming or coding error, by producing a
variable error or sometimes none at all, may be indistinguishable from
a random computer malfunction. Many hours were spent in discovering
this peculiarity.
... The nature of INTERRUPT with randomly-timed devices is such that the thin
line which has separated program difficulties from hardware difficulties
is now invisible. A problem with a program error may be successfully run
many times before the peculiar circumstances which produce a failure show
up. When they do show up there is little or no evidence to be analyzed.
[capitalization in original]
Strachey's comments on interrupts
This plan leaves several difficulties unresolved; it makes, for example,
no provision for lock-out during transfers. Another difficulty is the
possibility of several "interrupt program" signals coming at the same time
(from different units) or of one special sequence of orders being interrupted
by another (or even worse, by itself).
A fairly straightforward way out of these difficulties is to divide all
possible interrupt signals into a small number of classes. The classes are
allotted in order of priority, and any interrupt signal can only interrupt
the program immediately if this is of a lower priority class. If the program
being executed is of the same or higher priority class, the interrupt signal
is merely stored. At the end of a sequence of orders of one priority, the
machine scans the waiting list and starts to operate on the call with the
highest remaining priority.
The top priority would contain operations which cannot afford to wait
e.g. transfer of words to or from magnetic tape. The lowest priority would
be the main program. There might have to be one or two intermediate classes,
the principle of classification being that those which have inertia must
have a higher priority than those that have none.
... it may be necessary to store the content of various arithmetic and
index registers before proceeding with the new program. If this is necessary
the general rule should be to store them in locations which are associated
with the interrupting program and to restore them again at the end of that
program. The amount of information to be stored is determined by what use the
interrupting program makes of these common registers. In this way no time
will be wasted storing more information than is necessary.
It may be desirable to divide these into various groups, or even allow
the priority of a program to be decided by the Director.
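Strachey's stored-signal scheme can be sketched as follows in C; the
number of classes, the bit-mask representation of the waiting list, and
the routine names are assumptions of the sketch, not details from his
paper or patent.

    #define CLASSES 4                       /* 0 = highest priority (e.g. tape) */

    static unsigned pending;                /* one stored-signal bit per class  */
    static int current_class = CLASSES - 1; /* the main program = lowest class  */

    void run_sequence(int cls);             /* hypothetical: obey that class's
                                               sequence of orders               */

    /* An interrupt signal of class 'cls' arrives from some unit.               */
    void signal_arrives(int cls)
    {
        if (cls < current_class) {          /* only a lower-priority program
                                               may be interrupted immediately   */
            int interrupted = current_class;
            current_class = cls;
            run_sequence(cls);
            current_class = interrupted;    /* resume what was interrupted      */
        } else {
            pending |= 1u << cls;           /* same or higher class running:
                                               the signal is merely stored      */
        }
    }

    /* At the end of a sequence of orders of one priority, scan the waiting
       list and start the call with the highest remaining priority.             */
    void sequence_ends(void)
    {
        for (int cls = 0; cls < CLASSES; cls++) {
            if (pending & (1u << cls)) {
                pending &= ~(1u << cls);
                int interrupted = current_class;
                current_class = cls;
                run_sequence(cls);
                current_class = interrupted;
                return;
            }
        }
    }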
Dijkstra's comments on interrupts
In this connection the history of the real-time interrupt is
illuminating. This was an invention from the second half of the 50s,
which enabled the completion of a communication with the external
world to interrupt the execution of one program in favour of another.
Its advantage was that it enabled the implementation of rapid
reaction to changed external circumstances without paying the price
of a lot of processor time lost in unproductive waiting. The
disadvantage was that the operating system had to ensure correct
execution of the various computations despite the unpredictability
of the moments at which the interrupts would take place and the
central processor would be switched from one computation to another;
the nondeterminism implied by this unpredictability has caused
endless headaches for those operating system designers that did
not know how to cope with it. We have seen two reactions to the
challenge of this added complexity.
The one reaction was to enhance the debugging facilities, as IBM
did for the design of OS/360. (This was the operating system IBM
tried to design for its 360-Series of machines, which were introduced
in the first half of the 60s; IBM's problems with this design
facilitated in 1968 the recognition of the world-wide phenomenon
that became known as "the software crisis".) IBM built, in fact,
special-purpose monitors that exactly recorded when the central
processor honoured which interrupt; when something had gone wrong,
the monitor could be turned into a controller, thus forcing a
replay of the suspect history and making the "experiment" repeatable.
The other reaction could be observed at the THE (Technological
University Eindhoven), viz. to determine the conditions under which
one could feasibly and safely reason about such nondeterministic
programs and subsequently to see to it that these conditions were
met by hardware and software.
The difference was striking, showing once more that debugging is
no alternative for intellectual control. While OS/360 remained a
mess forever after, the Multiprogramming System designed at the
THE was so robust that no system malfunction ever gave rise to a
spurious call for hardware maintenance. Needless to say, the whole
episode has made a lasting impression on me.
One moral is that the real-time interrupt was only the wave,
whereas the tide was the introduction of nondeterminism and the
development of the mathematical techniques to cope with it.
The third arrangement, known as "the interrupt", circumvents all
these dilemmas. While the computer calculates at full speed, a
piece of dedicated hardware monitors the outside world for
completion signals from communication devices. When a completion
is detected, the program under execution is interrupted after the
current instruction and in such a fashion that it can be resumed
at a later moment as if nothing had happened, thus instantaneously
freeing the central processor for a suddenly more urgent task.
After the interrupt the processor would execute a standard program
establishing the source of the interruption and taking appropriate action.
It was a great invention, but also a Box of Pandora. Because the
exact moments of the interrupts were unpredictable and outside our
control, the interrupt mechanism turned the computer into a
nondeterministic machine with a nonreproducible behavior, and
could we control such a beast?
In the mean time a pattern emerged for the cooperation between me
and my hardware colleagues Bram J. Loopstra and Carel S. Scholten.
After the functional specification of the next machine had been
written down (usually by me), that document served as a kind of
contract between us: it told them what machine to design and
construct, while I knew what I could count upon while writing
all the basic software for the machine. The target of this
division of labour was that my programs would be ready by the
time the construction of the machine had been completed.
Looking back I now observe that the above arrangement has had
a profound influence on how I grew up as programmer: I found it
perfectly normal to program for not yet existing machines. As a
byproduct it became firmly ingrained in my mind that I programmed
for the abstract machine as specified in the original document,
and not for the actual piece of hardware: the original document
was not a description but a prescription, and in the case of a
discrepancy not the text but the actual hardware would be at fault.
...
Of course I could not exclude from my designs typographical errors
and similar blemishes, but such shortcomings did not matter as long
as the machine was not ready yet, and after the completion of the
machine they could be readily identified as soon as they manifested
themselves, but this last comforting thought was denied to me in 1957
with the introduction of the real-time interrupt. When Loopstra and
Scholten suggested this feature for the X1, our next machine, I got
visions of my program causing irreproducible errors and I panicked.
Eventually, Loopstra and Scholten flattered me out of my resistance
and I studied their proposal. The first thing I investigated was
whether I could demonstrate that the machine state could be saved
and restored in such a way that interrupted programs could be
continued as if nothing had happened. I demonstrated instead that
it could not be done and my friends had to change their proposal.
Admittedly the scenarios under which the original proposal would
fail were very unlikely, but this can have only strengthened my
conviction that I had to rely on arguments rather than on experiments.
At the time that conviction was apparently not so widespread, for
up to seven years later I would find flaws in the interrupt hardware
of new commercial machines.
Denning's comments on interrupts
Hardy's comments on interrupts
Asanović's comments on interrupts
Design Tree for Interrupts
Performance Techniques
As far as the areas that I've been personally involved in -- mostly
high-speed networking -- the best hardware/software tradeoffs have been
mostly just judgment calls by very experienced folks. What I've noticed
is that a *LOT* of "lessons learned" decades ago are still being re-learned
by later generations, over and over again. We as an industry don't seem
to have been doing a very good job of passing on the "secret sauce" to
subsequent generations [in many cases, even *within* the same companies!].
For example, way back in the early 1970s at Digital Communications
Assoc. (in Atlanta, made communications front-ends for DEC PDP-10s
and IBM 360s using DEC PDP-8e's, successfully sold in *competition*
with DEC's own PDP-11-based front-ends!], we made simple interrupt-per-
character PIO-based terminal controllers that were -- in terms of host
CPU cycles -- *more* efficient (and a lot cheaper!) than DEC's complex
DMA-based controllers, partly because we used three simple techniques,
two programming and one hardware: (1) interrupt "coalescing", (2) "dallying"
before dismissing an interrupt, and (3) interrupt "holdoff". That is,
(1) during a single interrupt, process *all* of the data that's available or
that *becomes* available (looping to repeat the availability test as needed)
during a single interrupt; (2) spin-wait/test a certain amount (which may
even be context/data-dependent!) before dismissing an interrupt, so as to
catch new data and process it without having to dismiss/interrupt again;
and (3) once you finally *do* dismiss an interrupt, arrange the hardware
so that you won't get another one until a certain (sometimes dynamically)
chosen time interval has expired (also known as "interrupt rate-limiting",
which, together with the other two, prevents "thrashing" in the interrupt
service routine).
[There's another variant of #2 that's worth mentioning: when doing PIO
writes to a FIFO and hitting a FIFO-full condition, "dallying" (spin-waiting)
a certain amount first rather than enabling a watermark interrupt and
going away. If you pick the dally amount appropriately, by "wasting" a
few cycles you actually can save *lots* of total CPU time, since you
(hopefully) avoid a later heavy-weight interrupt/context-switch/dismiss.]
These techniques were *NOT* unique to DCA!! In talking to other engineers
at conferences & trade shows, we learned that such things were generally
widely known among workers in the field (especially embedded systems,
datacomm, and real-time folks), but were not often discussed openly,
being considered somewhat "trade secrets". [That is, why tell your
competitor why his performance sucks and yours doesn't?!?]
In the 30+ years since then, I've been astounded how many times the
techniques of coalescing/dallying/holdoff have been rediscovered, but
I've been *more* astounded how many times they haven't been used *at all*
in many places where they were definitely needed, or how they were
reinvented badly. [E.g., if you're implementing hardware assistance
for interrupt holdoff, it is *extremely* important that the holdoff
period be effective only *after* the dismissing of an interrupt, and
not delay the response to or processing of a "new" event that arrives
when the system has been idle for longer than the holdoff.] "Computer
Science" courses don't seem to ever discuss such practicalities, either.
Deferred Interrupt Handler Structure
Mercury Programming System
Manchester Atlas Supervisor
The interrupt routines are designed to handle calls for action with
the minimum delay and in the shortest time; the character-by-character
transfers to and from peripheral equipments, for example, occur at high
frequency and it is essential that the transfers be carried out with the
minimum possible use of the central computer and within the time limit
allowed by the peripheral equipment for filling or emptying the buffer.
Since several interrupt flip-flops can become set simultaneously, but
cannot be acted upon while another interrupt routine is still in progress,
it is essential that a short time limit be observed by each interrupt
routine. On some occasions, however, longer sequences are required ...
In such cases, the interrupt routine initiates a routine to be obeyed
under extracode control, known as a supervisor extracode routine.
SDS Sigma-7
OS designs influenced by David Cutler
Linux deferred handlers
Precise Interrupts
Saving and Restoring Implementation-Dependent State