S-1 Supercomputer (1975-1988)

Mark Smotherman
Last update: September 2023

... under construction ...
corrections are welcome

... at Livermore Lowell [Wood] asked the graduates students [Tom McWilliams and Curt Widdoes] to team up and give some thought to designing a supercomputer from scratch.
-- William Broad, Star Warriors, p. 32

Summary: The S-1 project was an attempt to build a family of multiprocessor supercomputers. The project was envisioned by Lowell Wood at the Lawrence Livermore National Lab in 1975 and staffed for the first three years by two Stanford University Computer Science graduate students, Tom McWilliams and Curt Widdoes.

That two graduate students could design and almost completely build a supercomputer by themselves is an amazing feat, comparable to the design and building of the CDC 6600 by Seymour Cray and a small staff a dozen years earlier. However, McWilliams and Widdoes are even better known for the major advances in CAD tools for logic design that they developed as part of the early days of the project and for the startup company they founded, Valid Logic Systems.

The project was supported by the US Navy and ramped up in 1978 with the addition of more students, including Mike Farmwald and Jeff Rubin, and again in 1979. Dr. Carl Haussman provided the day-to-day oversight as the project team grew in size.

Five generations of S-1 processors were planned, and two MSI/ECL generations were built. The project independently invented two-bit branch prediction, directory-based cache coherency, and multiprocessor synchronization using load linked and store conditional. The project also influenced the development of programming languages and compilers including Common LISP and gcc.

S1 article
LLNL Newsline, January 10, 1979 (Courtesy LLNL)

Introduction

Dr. Lowell Wood, a physicist at LLNL and protege of Edward Teller, led the special studies group at LLNL, which was called the O-Group. The O-Group members had many interests, but their work mainly revolved around ideas for a national missile defense. Wood was also an interviewer for the Hertz Foundation, which awarded prestigious scholarships to graduate students interested in the applied sciences. From this position, Wood could occasionally recruit top students to work in the summers at the lab.

Two of the Hertz Foundation scholarship recipients that Wood recruited were Curt Widdoes (in 1973) and Tom McWilliams (in 1975). Widdoes had enrolled in the Ph.D. program in computer science at Stanford and was working on the design of the Minerva multiprocessor system in 1975, when McWilliams started his summer job at the lab. Wood encouraged McWilliams to meet Widdoes and challenged them to design and build a supercomputer.

In fact, Wood envisioned a family of multiprocessor supercomputers, with each having nodes of comparable power of contemporary commercial supercomputers. The plan was to build five generations of processors with the same general architecture and to develop computer-aided logic design tools that would ease the task of reimplementing the processors in each new logic technology family. The fifth generation was planned to use wafer-scale integration (WSI).

With the support of the US Navy, two MSI generations of processors were built (but were not strictly compatible):

Mark I (1978)
- each node was to have the processing power of a CDC 7600 (benchmarks indicated the completed processor was 1/3 the power of a 7600 or about the power of a 370/168)
- one node built
- (plans had called for 16 nodes interconnected by crossbar)
- 10 MIPS, 5300 chips, ECL-10K implementation (10 MHz)
- originally planned (ca. 1976) to have 4K word icache, 4K word dcache, both four-way set-associative with 4-word line size
- no segmentation
- design began with Curt Widdoes doing the Ebox and Tom McWilliams doing the Ibox, but they ended up doing the logic design jointly
- design required two man-years of effort using the specially-developed CAD tools; there were 211 high-level diagrams and 144 low-level diagrams
- 12 boards, 5300 ICs
- Jeff Rubin helped debug (the experience showed the need for a logic simulation tool)
- Mike Farmwald wrote the microcode
- Mark I photos, 3 "pages" of logic, pages unfold to allow access to wiring
Mark IIA (1982)
- each node was to have the power of a Cray-1
- several nodes built
- 15 MIPS, ECL-100K
- 4K word icache, 16K word dcache, both four-way set-associative with 16-word line size
- hardware FP
- CISCy: vector operations, selection, matrix and signal processing operations were added
- segmentation was added, with a variable boundary between the segment number and the offset
- relative pointers
- pipeline stages controlled by writable microcode
- decoded icache expanded the 36-bit instruction word to a 56-bit icache format to reduce the inst. decoding time
- (ca. 1977) branch prediction bits held in icache
- FP was emulated
- "advance computation" in early pipeline stages of simple instructions (execution done twice - once in Ibox and again in Abox for ease of branch misprediction recovery; they used the term "value prediction")
- high-performance emulation of PDP-10 and Univac AN/UYK-7 (32 bit word)
- 64 (later 72) boards of logic, 25000 ICs
- Ibox designers: Jeff Rubin and Tom McWilliams
- Abox designers: Mike Farmwald and Bill Bryson
- design done using SCALD II running on S-1 Mark I
- 20-30 people involved with designing and building Mark IIA
- early physical design was similar to the Mark I - see Figure 4 of the Widdoes and Correll Compcon 80 paper shows the Mark IIA, 5 "pages" of logic, pages unfold to allow access
- later design used a fixed frame in which each processor was built with four pages extending out from a central hub - see 1983 "S-1 Multiprocessor Test And Evaluation Facility" brochure.

The third-generation processor design was called the AAP (Advanced Architecture Processor). There were also a few references to it as the Mark IIB. The AAP was a RISC-like redesign. The processor had a 32-bit orientation, but each processor register and memory word was augmented with four tag bits for better support of Lisp, Prolog, and garbage collection. The AAP retained the cache-coherent shared-memory multiprocessor programming model of the Mark IIA, but it used dual counter-rotating slotted rings as the global interconnection network. This interconnection was designed to scale to 256 processors. Each AAP processor was supposed to be the size of "a microwave oven".

Project Timeline

(separate page)

CAD tools

... tbd ...

SCALD - structured computer-aided logic design

design tool that was graphics-based and hierarchical, optimized for ease of graphical entry of a design
operated at level of logic modules and buses, goal was to automate a lot of the lower-level details
produced lists for automatic wire-wrapping machines (i.e., for circuit board implementation)

SCALD I components

SUDS (Stanford University Drawing System) schematic editor
macro expander - Tom McWilliams, 8K lines of code initially
router - wire lister, Curt Widdoes, 12K lines of code initially

(SCALD I ended up with approx. 30K lines of Pascal code)

produced wire wrap list
could also produce change list (wrap/unwrap) for updating an old board

SCALD II

packager - Curt Widdoes, 30K lines of code initially
timing verifier - Tom McWilliams, 6K lines of code initially
(later) automated placement of chips - Jeff Rubin

led to Valid Logic as startup (see below - which needs to be incorporated here)

Instruction set architecture

... preliminary ...

influenced by PDP-10
36 bit word chosen for addressing

inst. set was widely peer-reviewed; over 100 people from Stanford, MIT, and CMU reviewed it, including Forest Baskett, who was McWilliams' and Widdoes' adviser

Mark I
- developed in 1975 by Tom McWilliams and Curt Widdoes (McWilliams describes Mark I instruction set as "superset of PDP-10")
- 36-bit word
- 9-bit bytes (with two-word byte pointers)
- 32 registers, mapped on to the first 32 words of memory
- R3 is mapped to the PC for indexing purposes
- three FP formats: 18-bit halfword, 36-bit single word, and 72-bit double word
- two-address format with three-operand versions available which implicitly use a register as the additional source or destination
- skip, jump, and hop control transfer instructions
- loop-closing versions of skip and jump that combine an increment or decrement with the conditional change of control flow
Mark IIA additions
- vector insts
- queue insts
- transcendental function insts (exp, log, sin, cos, ...) - table lookup methods described by Farmwald
- signal processing insts (fft, ...)
- matrix multiply and matrix transpose insts
- selection insts (min, max)
- qsort partition inst (qpart)
AAP (Mark IIB)
- 32-bit design, with four tag bits added to each processor register and memory word
- RISC-like (in many ways it feels like SPARC)
- fixed-length instructions (one word each)
- load/store architecture
- byte-addressable memory, 32-bit linear address, paging
- designed for four-stage pipeline, optional delayed and anulling branches
- "magicode" for trap handling (similar to DEC Alpha PAL routines)
- two 32-entry register files
  1. user register set
  2. magicode register set, high 16 registers can only be accessed in privileged mode and contain PSR, interrupt status register, etc.
- [describe tag usage]

Branch prediction

Two bits were used - a prediction bit ("jump bit") and a dynamic reverse bit ("wrong bit"). The dynamic reverse bit was set whenever the branch was mispredicted; two mispredicts in a row caused the jump bit to toggle. This scheme has a state diagram just like the two-bit predictor given in Hennessy and Patterson but with the strongly-taken state at the top left having the bit pattern of 10 (rather than 11) and weakly-taken having 11 (rather than 10). A.J. Smith cites an unpublished memo by Widdoes at Stanford in February 1977 that outlines the scheme; Widdoes says it was a joint invention between himself and McWilliams.

(The 2-bit counter scheme was independently invented by Jim Smith at CDC in 1979-1980.)

CISCy instructions like min, max, and qpart were added to Mark IIA to handle situations where branches were unpredictable.
[see Farmwald's dissertation]
[tie in design pressure to deal with branches and the Los Alamos and IBM experience with poor performance with branch prediction on the Stretch; could compare this to recent trend to use predication as the response to unpredictable branches]

I/O structure

"I also don't see that you've mentioned the I/O architecture. I/O was accomplished with the assistance of I/O processors that communicated with the S-1 through I/O memories. There were special instructions to assist in mapping the 9-bit quarterwords, 18-bit halfwords, 36-bit singlewords, and 72-bit doublewords into multiples of 8-bit bytes, possibly with different endianness, There were Unibus and Qbus interfaces to the IOM. We had a PDP-11 as one I/O processor, but the production IOP for both Unix and Amber was a Q-bus-based 68010 system. The same IOP code supported both operating systems; it provided console I/O, mass storage, and networking."

Multiprocessor structure

... tbd ...

16 processors interconnected to 16 memory modules by a crossbar
- influenced by C.mmp
- originally intended to implement software-based cache coherency
- later implemented a directory-based cache coherency scheme (which was also independently invented by Censier and Feaurier)
- the central directory scheme used a 17-bit vector (16 presence and one dirty bit -- the same approach was later used in DASH)
- coherency state transitions used the inter-processor interrupt bus to signal nodes to invalidate
- I/O device were attached to specific processors

moved from RMW approach to LL/SC (Jensen, Hagensen, and Broughton)

AAP had ring interconnection

Compilers, interpreters, and software tools

... tbd ...

emulator and optimizing assembler - Jeff Rubin

Pastel - Jeff Broughton

The early OS work was done in PL/I, but the team later switched to their own systems-programming version of Pascal, which they called "Pastel".

Lisp - Richard Gabriel and Rod Brooks

"S-1 Lisp, never completely functional, was the test bed for adapting advanced compiler techniques to Lisp implementation."
"One implementation of Common Lisp, namely S-1 Lisp, already has a compiler that produces code for numerical computations that is competitive in execution speed to that produced by a Fortran compiler."

later project to port C and Unix, involved Richard Stallman doing the C front-end

"Hoping to avoid the need to write the whole compiler myself, I obtained the source code for the Pastel compiler, which was a multi-platform compiler developed at Lawrence Livermore Lab. It supported, and was written in, an extended version of Pascal, designed to be a system-programming language. I added a C frontend, and began porting it to the Motorola 68000 computer. But I had to give that up when I discovered that the compiler needed many megabytes of stack space, and the available 68000 Unix system would only allow 64K. I then determined that the Pastel compiler was designed to parse the entire input file into a syntax tree, convert the whole syntax tree into a chain of "instructions," and then generate the whole output file, without ever freeing any storage. At this point, I concluded I would have to write a new compiler from scratch. That new compiler is now known as GCC; none of the Pastel compiler is used in it, but I managed to adapt and use the C frontend that I had written. But that was some years later; first, I worked on GNU Emacs."
-- Richard Stallman, The GNU Operating System and the Free Software Movement
"I didn't really know much about optimizing compilers at the time, because I'd never worked on one. But I got my hands on a compiler, that I was told at the time was free. It was a compiler called PASTEL, which the authors say means ``off-color PASCAL''.
"Pastel was a very complicated language including features such as parametrized types and explicit type parameters and many complicated things. The compiler was of course written in this language, and had many complicated features to optimize the use of these things. For example: the type ``string'' in that language was a parameterized type; you could say ``string(n)'' if you wanted a string of a particular length; you could also just say ``string'', and the parameter would be determined from the context. Now, strings are very important, and it is necessary for a lot of constructs that use them to run fast, and this means that they had to have a lot of features to detect such things as: when the declared length of a string is an argument that is known to be constant throughout the function, to save to save the value and optimize the code they're going to produce, many complicated things. But I did get to see in this compiler how to do automatic register allocation, and some ideas about how to handle different sorts of machines.
"Well, since this compiler already compiled PASTEL, what I needed to do was add a front-end for C, which I did, and add a back-end for the 68000 which I expected to be my first target machine. But I ran into a serious problem. Because the PASTEL language was defined not to require you to declare something before you used it, the declarations and uses could be in any order, in other words: Pascal's ``forward'' declaration was obsolete, because of this it was necessary to read in an entire program, and keep it in core, and then process it all at once. The result was that the intermediate storage used in the compiler, the size of the memory needed, was proportional to the size of your file. And this also included stack-space, you needed gigantic amounts of stack space, and what I found as a result was: that the 68000 system available to me could not run the compiler. Because it was a horrible version of Unix that gave you a limit of something like 16K words of stack, this despite the existence of six megabytes in the machine, you could only have 16Kw of stack or something like that. And of course to generate its conflict matrix to see which temporary values conflicted, or was alive at the same time as which others, it needed a quadratic matrix of bits, and that for large functions that would get it to hundreds of thousands of bytes. So i managed to debug the first pass of the ten or so passes of the compiler, cross compiled on to that machine, and then found that the second one could never run.
... "The new C compiler is something that I've written this year since last spring. I finally decided that I'd have to throw out PASTEL. This C compiler uses some ideas taken from PASTEL, and some ideas taken from the University of Arizona Portable Optimizer."
-- Stallman lecture at KTH (Stockholm, Sweden), October 1986

An automatic parallelization tool was under development called the Paralyzer (not to be confused with the Illiac IV parallelization tool of the same name).

how - PASCAL
where - Maclisp

also Fred Chow, ...

"The first assembler for the AAP was written in Lisp. (The primary language was to be Pastel, so the amount of assembly code was expected to be fairly minimal.) For various reasons, notably assembler runtime performance, it was eventually rewritten in C." - J. Bruner

Operating systems

Amber

The S-1 operating system was called Amber. The OS design was influenced by work at MIT, including Multics, ITS, and the MIT Lisp Machines, and by the Tenex operating system from BBN. The goal was a layered OS structure that could be tailored for support of real-time, time-sharing, and batch applications.

"The design of Amber was begun in 1979 by a team of six. Hon Wah Chin was project leader. Team members were Ted Anderson, Jeff Broughton, Charles Frankston, Lee Parks, and Daniel Weinreb. All team members were familiar with Multics. Lee Parks and Ted Anderson had participated in implementation of a small scale Multics like system as undergraduates [Parks1979]. Daniel Weinreb had worked on the MIT Lisp Machine project as a undergraduate [LispMachine].
"Most of the first year of the effort was spent in design and discussion. The essence of the current capability scheme was devised by Jeff Broughton toward the end of the design period. Prior to this time, the design was much closer to Multics in its concepts of segments, directories, access control and the like. In fact some code which was already written had to be modified to accommodate the new scheme, but the changes were not drastic.
"By that time it was apparent that the S-1 Mark IIA computer system, which Amber was designed for, was going to be ready later than expected. Had the hardware been closer to completion, it is likely that a less ambitious and more expeditious system would have been implemented. However, the continued non-availability of the target computer system was very detrimental to completion of coding efforts. It was increasingly difficult, both technically and psychologically, to continue building a kernel with no real feedback on how successful the elements of the structure thus far implemented were.
"At the end of the first year, Daniel Weinreb and Lee Parks left the project. Hon Wah Chin assumed other duties with the S-1 project and Jeff Broughton became team leader. One and a half years into the project Earl Killian joined in the midst of the switch to Pastel as the implementation language. Three years into the project Charles Frankston took an extended leave to continue his education. Jay Pattin joined the Amber team over four years after the inception of the project."
-- Charles Frankston (see http://www.mit.edu/~cbf/thesis.htm)

and

"The S-1 architecture supported a variable boundary between the segment number and the segment offset which was important for keeping the address to a mere 36 bits. The Mark II also had relative pointers (i.e. pointers that are an offset from the address of the word containing the offset), which was a nice feature for storing databases on disk that are mapped to different addresses in every process.
"There were some good OS ideas that were implemented, including a file system that did not require salvaging/repair when the machine was rebooted after a crash (a background process could recover lost blocks while the system was running and doing useful work). Unlike today's equivalents, it was not based on journaling, but rather careful ordering of operations. The filesystem supported a property-list for every file (a Lisp machine idea I think)."
-- Earl Killian

... combination of capability-based access with access-control lists ...

... processor affinity scheduling ...

The S-1 design made extensive use of diagnostic processors and techniques for fault tolerance. The OS supported dynamic reconfiguration ...

... more to do ...
[am I missing papers from Amber? what impact?]

Unix

There were two operating systems on the S-1 Mark IIA: Amber and Unix.

"The Unix port was based upon the 7th Edition, although the lack of virtual memory wasn't much of an issue because of the huge memory we had at the time (128 megabytes, where a byte was 9 bits wide). The C compiler was built using the (Johnson) Portable C compiler; we wrote (and later rewrote) the assembler and linker. A major challenge in porting Unix and C was the fairly loose distinction at that time between integers and pointers in C. The tagged architecture of the S-1 divided pointers into a 5-bit tag and a 31-bit address. Tags 0 and 31 were invalid, so as to trap in hardware spurious references through small positive and negative integers. However, in C it was common to use the all-zero bit pattern for (void *)NULL, which was an illegal pointer value. [Pastel did not share this problem because the nil pointer had a (nonzero) tag and there was no implicit association of 0 with nil.]" - J. Bruner

Impact

... tbd ...

The most visible impact of the S-1 project was on CAD tools. Two other major contributions were 2-bit dynamic branch prediction and the load linked / store conditional mutual exclusion primitives (which are now found in MIPS, Alpha, and PowerPC).

[inst. set - influenced development of RISC? - DEC through Baskett, MIPS through Hennessy and Killian, ...]
[did DEC PRISM epicode influence AAP magicode?]
[branch prediction - but most attribute two bits to Smith]
[decoded icache - ...]
[cache modes using tags in page table and TLB - used in MIPS, ...]
[MP, directory-based cache coherency - DASH, ...]
[compilers - gcc through Stallman, optimization through Chow, Lisp through ...; MIPS compilers, ...]
[OS - ...]

S-1 Alumni

(separate page)

The "S-1" Name

In the 1976 report by McWilliams, Widdoes, and Wood, the computer design is called the LLL Programmable Digital Filter (LLL for Lawrence Livermore Laboratory). The name was subsequently changed to S-1. Donald MacKenzie in his book, Knowing Machines: Essays on Technical Change, writes in footnote 125 on pp. 300-301:

There was no single reason for the choice of the name. S-1 was "kind of a sacred label round here [Livermore]" (Wood interview), since it had been the name of the committee of the Office of Scientific Research and Development responsible for the atomic bomb project during the Second World War. The hydrophone systems were known as SOSUS (Sound Surveillance System) arrays, and this led to the proposed computer's being referred to in negotiations with the Navy as SOSUS-1; however, the abbreviation to S-1 was convenient for security reasons, since the existence of the SOSUS arrays was at that point classified.
Finally, McWilliams and Widdoes were registered for doctorates in the Computer Science Department at Stanford University and were trying to recruit fellow graduate students to the project, which they therefore referred to as Stanford-1.

References

Web Pages

Gio Wiederhold, The Lawrence Livermore Laboratory S-1 Project and Stanford University [includes pictures from a 2006 reunion, from 2009 of the Mark I in storage at the Computer History Museum, and from 2009 of the Mark II in storage at LLNL]

Online Documents

anon., "S-1 Multiprocessor Test And Evaluation Facility," LLNL Brochure, 1983. (available online from LLNL, 193488.pdf)
anon., "S-1 Uniprocessor Architecture," LLNL, 1983. (available online from bitsavers, UCID-19782_S1_Uniprocessor_Arch_Apr83.pdf)
anon., "Amber Kernel Specification, S-1 Project," LLNL Tech. Rept. UCID-20378, 1985. (available from the Internet Archive, 198136.pdf)
T. Axelrod, P. Dubois, and P. Eltgroth, "A Simulator for MIMD Performance Prediction: Application to the S-1 MkIIa Multiprocessor," Parallel Computing, vol. 1, 1984, pp. 237-274. (available online in technical report format from LLNL, 192765.pdf)
R.A. Brooks, R.P. Gabriel, and G. L. Steele, Jr., "S-1 Common Lisp Implementation," Proceedings ACM Symposium on LISP and Functional Programming, Pittsburgh, 1982, pp. 108-113. (available online through ACM Digital Library)
J.M. Broughton, P.M. Farmwald, and T.M. McWilliams, "S-1 Multiprocessor System," SPIE Vol 341 Real Time Signal Processing V, 1982, pp. 327-332. (available online in technical report format from LLNL, 191422.pdf)
W.R. Bryson, "Packaging of the S-1 Advanced Architecture Processor," LLNL Tech Rept. UCID-21559, 1988. (available online from OSTI, 6999343)
J.D. Bruner, et al., "Cache Coherency on the S-1 AAP," LLNL Tech Rept. UCRL-97646, 1987. (available online from LLNL, 212414.pdf)
J.D. Bruner, "The S-1 AAP Architecture," LLNL Tech. Report UCID-21561, October 1988. (available online from OSTI, doi:10.2172/6999313)
P.M. Farmwald, "The S-1 Mark IIA Supercomputer," in J.S. Kowalik, (ed.), High-Speed Computation, Springer-Verlag, NATO ASI Series, vol. 7, 1984, pp. 145-155. (available online in technical report format from LLNL, 195854.pdf)
B.T. Hailpern and B.L. Hitson, "S-1 Architecture Manual," CSL Report STAN-CS-79-715, Stanford University, January 1979. (available online, STAN-CS-79-715_S-1_Arch_Man.pdf)
E.H. Jensen, G.W. Hagensen, and J.M. Broughton, "A New Approach to Exclusive Data Access in Shared Memory Multiprocessors," LLNL Tech. Rept. UCRL-97663, November 1987. (available online from LLNL, 212157.pdf)
J.L. Manferdelli, "Signal Processing Aspects Of The S-1 Multiprocessor Project," LLNL Tech Rept. UCRL-84658, 1980. (available online from LLNL, 185329.pdf)
T.M. McWilliams and L.C. Widdoes, Jr., "SCALD: Structured Computer-Aided Logic Design," 15th Design Automation Conference, Las Vegas, 1978, pp. 271-277. (available online through ACM Digital Library and in technical report format from LLNL, 178568.pdf)
T.M. McWilliams and L.C. Widdoes, Jr., "The SCALD Physical Design Subsystem," 15th Design Automation Conference, Las Vegas, 1978, pp. 278-284. (available online through ACM Digital Library and in technical report format from LLNL, 178574.pdf)
V.Y. Moldenhauer, "Ring Interconnection Network for the S-1 AAP Multiprocessor," LLNL Tech. Rept. UCRL-97650, 1987. (available online from LLNL, 212415.pdf)
S-1 Technical Staff, "The S-1 Project: Advancing the Digital Computing Technology Base for National Security Applications," presentation at Open House for Members and Guests of the Navy and DOE Communities, 1985. (available online at bitsavers, S1_OpenHouse_Apr85.pdf) [includes overview of AAP]
T. McWilliams, L.C. Widdoes, and L. Wood, "Preliminary Design of an Advanced Programmable Digital Filter Network for Large Passive Acoustic ASW," LLNL Tech. Rept. UCID-17299, 1976. (available online at the Internet Archive, 174143.pdf) [first specification of the S-1 architecture]
M.M. Murray, "Diagnostics on the S-1 Advanced Architecture Processor (AAP)," LLNL tech. rept. UCID-21568, 1988. (available online from OSTI, 7184517)
J.A. Pattin, "Amber," LLNL tech. rept. UCID-21567, 1988. (available online from OSTI, 6895826)
G.S. Taylor, "Radix 16 SRT Dividers with Overlapped Quotient Selection Stages: A 225 Nanosecond Double Precision Divider for the S-1 Mark IIB," IEEE 7th Symposium on Computer Arithmetic, 1985. (available online)
L.C. Widdoes and S. Correll, "The S-1 Project: Developing High-Performance Digital Computers," Energy and Technology Review, LLNL, 1979. (available online from bitsavers, S1_MKIIA_Article_Sep79.pdf)
see also:
- L.C. Widdoes and S. Correll, "The S-1 Project: Developing High Performance Computers," IEEE Spring Compcon, San Francisco, Feb. 1980, pp. 282-291. (available online in technical report format from LLNL, 183586.pdf)
L. Wood, et al., 1979 Annual Report, The S-1 Project, Volume I: Architecture. Lawrence Livermore Laboratory, Livermore, CA, 1979. (available online from OSTI, 5991007)
L. Wood, et al., 1979 Annual Report, The S-1 Project, Volume II: Hardware. Lawrence Livermore Laboratory, Livermore, CA, 1979. (available online from OSTI, 5921576)

Oral Histories

SCALD Oral History: #1 of 3 (Tom McWilliams and Curt Widdoes Together) Computer History Museum, February 2008. (available online from CHM, accession number 102658241)
SCALD Oral History: #2 of 3 (Curt Widdoes Alone) Computer History Museum, February 2008. (available online from CHM, accession number 102658234)
SCALD Oral History: #3 of 3 (Tom McWilliams Alone) Computer History Museum, February 2008. (available online from CHM, accession number 102658344)

Dissertations and Theses

P.M. Farmwald, On the Design of High Performance Digital Arithmetic Units. PhD thesis, Stanford University, Aug. 1981. Advisor: Forest Baskett. (LLNL technical report version available from the Internet Archive, 205212.pdf)
C.B. Frankston, "The Amber Operating System," B.Sc. Thesis, MIT, May 1984. Advisor: Fernando Corbató. (available online through MIT, thesis.htm)
T.M. McWilliams, Verification of Timing Constraints on Large Digital Systems. PhD thesis, Stanford University, May 1980. Advisor: Forest Baskett. (LLNL technical report version available from the Internet Archive, 185870.pdf)
see also:
- T.M. McWilliams, "Verification Of Timing Constraints In Large Digital Systems," LLNL Tech. Rept. UCRL-83791, 1980. (available online from LLNL, 183780.pdf)
- T.M. McWilliams, "SCALD Timing Verifier, A New Approach To Timing Constraints In Large Digital Systems," LLNL Tech. Rept. UCRL-83910, 1980. (available online from LLNL, 183834.pdf)
L.C. Widdoes, Jr., Automatic Physical Design Of Large Wire-wrap Digital Systems. PhD thesis, Stanford University, December 1980. Advisor: Forest Baskett. (LLNL technical report version available from the Internet Archive, 188493.pdf)

LLNL Institutional Research and Development Annual Reports

FY86, UCRL-53689-86, see "Supercomputer Research and Development," pp. 203-217.
FY87, UCRL-53689-87, see "Supercomputer Research and Development," pp. 108-114.
FY88, UCRL-53689-88, see "Computer-Aided Design Tools for Very Large-Scale Integration," pp. 128-132. [no mention of the S-1 Project]

Acknowledgements

Thanks to Jordin Kare for first pointing me to the S-1. Ted Anderson, Tina Darmohray, Earl Killian, and Curt Widdoes have been very helpful to me in correcting my understanding of the overall project, the processor designs, and the OS. My thanks also go to Maxine Trost, archivist at LLNL, for providing me with scanned S-1 articles and pictures. Thanks to Harry Quackenboss for help in correcting a typo in an earlier version.

[History page] [Mark's homepage] [CPSC homepage] [Clemson Univ. homepage]

mark@cs.clemson.edu