Clemson University
CPSC 464/664 Lecture Notes
Fall 2003
Mark Smotherman


Transistors and VLSI Chips

  1. Transistor Operation
    1. n-type transistor (Intel)
    2. Paul DeMone article on CMOS (part 1)
    3. Paul DeMone article on CMOS (part 2)
    4. Java CMOS basic gate demonstration
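
    The gate demonstration above can be sketched in code: a static CMOS gate is a pull-up network of PMOS switches and a complementary pull-down network of NMOS switches. This minimal sketch models a 2-input NAND; the function name and structure are illustrative, not taken from the linked Java demo.

```python
# Truth table of a CMOS 2-input NAND gate, modeled as its two switch networks.

def cmos_nand(a: int, b: int) -> int:
    # PMOS conducts when its gate is 0; the NAND pull-up is two PMOS in parallel to VDD.
    pull_up = (a == 0) or (b == 0)
    # NMOS conducts when its gate is 1; the NAND pull-down is two NMOS in series to GND.
    pull_down = (a == 1) and (b == 1)
    # In static CMOS exactly one network conducts, so the output is never floating.
    assert pull_up != pull_down
    return 1 if pull_up else 0

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", cmos_nand(a, b))
```

    The complementary-network structure is why static CMOS draws (almost) no current at rest: there is never a direct path from VDD to GND once the output settles.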

  2. Chip fabrication
    1. Intel pages
    2. steps
      1. design and layout
      2. mask creation (one mask per layer, high fixed cost of masks is a problem for low volume chips)
      3. wafer processing (light exposed through mask then wafer developed, cleaned, and inspected) -- steps are repeated for each layer
      4. each die on wafer tested and then cut out
      5. good dies are inserted into package then tested
      6. delivery of remaining chips after burn-in and speed-binning
    3. Sematech page on how semiconductor chips are made
    4. Close-up view of interconnect in an SRAM chip (IBM)
      see also Semiconductor manufacture and interconnect packaging (IBM)
    5. Notes on Fabrication and Layout, Kenneth Yun, UCSD
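
    The economics behind steps 4-5 (testing dies, keeping only the good ones) can be sketched with the standard textbook approximations for dies per wafer and die yield (as in Hennessy & Patterson). The wafer size, die area, and defect density below are illustrative numbers, not figures from the lecture.

```python
import math

def dies_per_wafer(wafer_diam_mm: float, die_area_mm2: float) -> int:
    r = wafer_diam_mm / 2
    gross = math.pi * r**2 / die_area_mm2                     # dies by raw area
    edge_loss = math.pi * wafer_diam_mm / math.sqrt(2 * die_area_mm2)
    return int(gross - edge_loss)                             # partial dies at the edge are lost

def die_yield(defects_per_mm2: float, die_area_mm2: float, alpha: float = 4.0) -> float:
    # Negative-binomial defect model: bigger dies are hit much harder.
    return (1 + defects_per_mm2 * die_area_mm2 / alpha) ** (-alpha)

n = dies_per_wafer(300, 120)      # 300 mm wafer, 120 mm^2 die (illustrative)
y = die_yield(0.004, 120)         # 0.004 defects/mm^2 (illustrative)
print(n, "gross dies,", f"{y:.0%} yield,", int(n * y), "good dies")
```

    Since the mask set is a fixed cost, its price is amortized over good dies shipped; this is why low-volume chips (step 2 above) suffer most from high mask costs.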

  3. Die photos

  4. Transistor-level design
    1. Modern VLSI Design, Wayne Wolf
    2. Notes on Cell Design and Layout, Kenneth Yun, UCSD
    3. MOSIS design rules

  5. HDL-level design example from UCR
    1. block diagram
    2. design hierarchy
    3. top level microprocessor design
    4. control unit design
    5. control state machine design

  6. Synthesis tools
    1. HDL synthesized into netlists, which are then checked and optimized
    2. netlists placed and routed into physical layout, from which the fabrication masks are generated
    3. combined HDL and block diagram displays
    4. T. Chan, et al., "Challenges of CAD Development for Datapath Design" (Intel Tech. Journal, Q1, 1999)
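
    A netlist (step 1 above) is just a list of gate instances and the wires connecting them; tools check and optimize this structure before layout. A minimal sketch of the idea, with a made-up three-gate circuit evaluated in topological order:

```python
# A gate-level netlist as a plain data structure: (output wire, gate type, input wires),
# listed in topological order so each gate's inputs are computed before it is evaluated.
NETLIST = [
    ("n1", "NAND", ("a", "b")),
    ("n2", "NOT",  ("c",)),
    ("y",  "OR",   ("n1", "n2")),
]

GATES = {
    "NAND": lambda x, y: 1 - (x & y),
    "NOT":  lambda x: 1 - x,
    "OR":   lambda x, y: x | y,
}

def evaluate(netlist, inputs):
    wires = dict(inputs)                 # start from the primary input values
    for out, gate, ins in netlist:
        wires[out] = GATES[gate](*(wires[i] for i in ins))
    return wires

print(evaluate(NETLIST, {"a": 1, "b": 1, "c": 1})["y"])
```

    Real synthesis tools work on far richer representations, but the checks they run (unconnected wires, combinational loops, equivalence against the HDL) are all questions about a graph like this one.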

  7. Technological implications
    1. transistor density - Moore's Law
    2. performance measures
      1. transistor switching time is on the order of a few picoseconds (varies according to process, etc.)
      2. a logic gate is built of several transistors, so the gate propagation delay is larger (e.g., 25 ps for a FO4 gate in 180 nm process)
      3. each pipeline stage uses the equivalent of roughly 10 gate levels, so the propagation delay through a pipe stage is larger still (e.g., on the order of 250 ps => 4 GHz)
        1. clock skew and jitter (e.g., 51 ps in 180 nm Pentium 4) - usually assumed to be constant for a given process and thus becomes a bigger percentage of the clock cycle time as the clock frequency increases
        2. latch delay = 3 FO4 gate delays (e.g., 75 ps in 180 nm)
        3. logic delay = additional gate delays
        4. optimizations exist to reduce skew/jitter/latch overhead (e.g., time borrowing circuits, domino pipelines)
      4. longest critical stage governs minimum clock cycle time (and thus maximum clock frequency)
      5. but note, as in Pentium 4, some sections of the processor may run at double, half, or quarter speed
    3. wire delay becoming as important as transistor switching speed
      1. the distance a signal can travel over a wire in one clock cycle shrinks as clock frequency rises
      2. Pentium 4 devotes entire pipeline stages for chip crossings
    4. static power (i.e., leakage) becoming a major factor in power budget
    5. Stefan Rusu, Intel, "Trends and Challenges in VLSI Technology Scaling Towards 100nm"
    6. special issue, IBM Journal of Research and Development, "Scaling CMOS to the limit," Vol. 46, Nos. 2/3, 2002
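
    The timing numbers quoted above can be combined in a back-of-the-envelope cycle budget. This sketch uses the notes' 180 nm figures (25 ps FO4 delay, 3-FO4 latch, 51 ps skew/jitter); summing the components gives a somewhat longer cycle than the 250 ps headline number, but the point is how the fixed latch and skew overhead grows as a fraction of the cycle. The c/2 wire-speed figure is an optimistic assumption for illustration only.

```python
# Cycle-time budget from the 180 nm numbers in the notes above.
FO4_PS = 25.0              # one fanout-of-4 gate delay
logic = 10 * FO4_PS        # ~10 gate levels of useful logic per stage
latch = 3 * FO4_PS         # latch overhead = 3 FO4 = 75 ps
skew_jitter = 51.0         # ps, the 180 nm Pentium 4 figure

cycle_ps = logic + latch + skew_jitter
freq_ghz = 1000.0 / cycle_ps
overhead_pct = 100 * (latch + skew_jitter) / cycle_ps
print(f"cycle = {cycle_ps:.0f} ps -> {freq_ghz:.2f} GHz; "
      f"latch+skew = {overhead_pct:.0f}% of the cycle")

# Wire reach per cycle at an (optimistic) signal speed of c/2; real on-chip
# wires are RC-dominated and much slower, hence dedicated chip-crossing stages.
C_MM_PER_PS = 0.3          # speed of light is about 0.3 mm/ps
for f in (1.0, 2.0, 4.0):                   # GHz
    reach = 0.5 * C_MM_PER_PS * 1000 / f    # mm per cycle at c/2
    print(f"{f:.0f} GHz: at most ~{reach:.0f} mm of ideal wire per cycle")
```

    Note that the 126 ps of latch and skew overhead is fixed per stage, so halving the logic per stage (deeper pipelining) raises the clock by much less than 2x.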

  8. On-chip caches
    1. Mike Haertel: "The relationship between cache size and performance tends to be vaguely logarithmic: if you double the size of the cache (and magically manage to keep it at the same speed), performance increases by +x%. Then you have to double the cache again to get another +x%. Obviously this depends heavily on your workload, but it's a reasonable rule of thumb. ...

      Nowadays in many processors, both CISC and RISC, more than half the die area is devoted to cache. For example, looking at a K8 (Sledgehammer) die photo, it looks like about 60% of the die is L2 cache and maybe another 10% is L1 cache. That leaves just 30% for all the core logic. ... Now, if the +x% gain for 2x cache growth is in the ballpark of 5-7%, (which seems like a reasonable assumption looking at the SPEC database) then a 1.2x increase in cache size is probably worth at most +2% performance."
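
    The arithmetic behind Haertel's final estimate can be checked directly: if each doubling is worth x%, then a growth factor g contributes roughly log2(g) doublings, for a gain of about log2(g) * x. A minimal sketch using the 5-7% per-doubling range from the quote:

```python
import math

def cache_gain_pct(growth_factor: float, pct_per_doubling: float) -> float:
    # Logarithmic rule of thumb: gain scales with the number of doublings.
    return math.log2(growth_factor) * pct_per_doubling

for x in (5.0, 7.0):
    print(f"1.2x cache at {x}%/doubling -> about +{cache_gain_pct(1.2, x):.1f}%")
```

    With log2(1.2) ~= 0.26, the 1.2x growth yields +1.3% to +1.8%, consistent with the "at most +2%" figure in the quote.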

  9. Wire-exposed ("communications exposed") instruction sets
    1. RAW (MIT)
    2. TRIPS (UT Austin)
    3. Imagine (Stanford)

  10. Asynchronous logic
    1. University of Manchester Asynchronous Logic home page
    2. Sun Research Labs Asynchronous Design Group


Key Points

