Mark Smotherman
previous project names: SSO, 8816, 8800
The 432 was originally intended as Intel's 16-bit microprocessor; see interview of Dave House
[regarding Colwell, et al.] They identify several design mistakes that hampered performance
They estimate that compiler weaknesses account for a 25-35% loss of throughput, while instruction-set design weaknesses account for another 5-10%; they also point out that these losses are independent of the object-oriented nature of the 432.
They postulate even better performance could have been obtained by adding
Yet, in spite of the corrections and additions, they see a factor of two to three performance hit for the 432 style of object orientation.
Robert Colwell writes in two postings to comp.arch from April 1995:
I think it's a shame that the failure of the product (due at least partly
to its very low performance) so obscured the contributions this design
might have made. (I say that because my research convinced me that this
slowness was not intrinsic to the design philosophy. The poor performance was
due to a combination of other factors.) This project could have turned out
quite differently.
Robert Colwell wrote:
There were some really neat ideas in that system. As I pointed out
in my thesis in '85, the i432 was a wonderful research project
masquerading as a bad product.
I believe the research was done at CMU as C.mmp, Wulf, Jones, et al.
Let's be very careful here. The pioneering work into capability machines
was indeed done at (among other places) CMU. But the i432 went far beyond
where the CMU work left off. For instance, in the i432, even the physical
processors were themselves objects, with SW-readable data records
describing their states. Processes were also objects, and were managed just
like any other object. The 432 folks extended the object paradigm to an
unprecedented extent. It was almost breathtaking when you first realized
what they'd attempted; a kind of grand unification of computer systems.
and
The 432 address cache had 4 entries, and when that cache missed, a
7-levels-of-indirection table walk ensued. Gordon Bell picked on this as
"it's obvious that this will cripple the performance, no wonder it's slow."
In the research I did, I never saw this effect. It's not so simple as
blaming the 432's addressing for its low performance.
1) I don't think that cache was too small (I had lots of simulations to
back that up)
So you're in Gordon's camp. You missed the real lessons of the i432. Read
the paper again.
You were right, but for the wrong reasons. That's not a trivial distinction.
You're wrong. It's a technical paper. Speculating on why I might have done
the work is pointless, and has nothing to do with the technical merits of
its contribution. But in case you really are interested, my interest in
doing the research, and then in writing it up, was this: the field of
computer architecture has a very difficult time applying the canonical
scientific approach. We never get to do a design twice, first one way, and
then another way, so that we can see the effects of a given design choice.
The i432 cost Intel $100M's. All we can do is analyze the final result,
and incrementally tweak the design (via its simulator) to see if we can
extract the various influences of the myriad design decisions embodied
therein.
So I did that analysis to see why such a radical design had gone so far
wrong. I became convinced that the field as a whole was missing the point:
the 432 wasn't slow because it was object-oriented. It was slow because it
got some basic things wrong. That's an important lesson for ANY chip
development.
I was never impressed by the i432. It seemed to suffer from some of the same
architectural excesses and lack of attention to basic performance as the
Burroughs 1700.
In a previous job, I was asked to analyse the 432 as a competitor product at
the time it was coming out. The state of the project could be gleaned from
the published ISSCC paper on the 432. This paper described a hardware LRU
algorithm for deciding which cache line to flush, and then went on to admit
that they had only two words of cache.
My conclusions were:
a) They were in a hurry to complete the project and they cut out cache to
make it fit on the chip. A panic of this magnitude usually suggests that
things are not really in control.
2) I have no evidence that they chopped it down due to die size pressure
3) Even if they did have to downsize that cache, I don't think it qualifies
necessarily as "panic"; it's a normal part of chip design.
b) Given all the complicated addressing stuff, without adequate cache, it
was not going to perform.
My conclusion was that we did not need to worry about the 432, and I was
right.
P.S. I always saw that paper by Doug Jensen and his students as being CMU
defending its own.
mark@cs.clemson.edu