| | ENIAC | IBM 704 | IBM S/360 M50 | VAX 11-780 | Sun SPARCStation IPC | Dell 4600 |
---|---|---|---|---|---|---|
date | 1946 | 1955 | 1965 | 1978 | 1990 | 2003 |
addition time | 200 µsec | 24 µsec | 4 µsec | 400 nsec | 40 nsec | 208 psec |
memory cycle time | | 12 µsec | 2 µsec | 200 nsec | 80 or 100 nsec | 3 nsec |
standard memory size | | 168 KB | 64 KB | 128 KB | 8 MB (48 MB max) | 256 MB |
rental | | $48,000/mo. | $32,000/mo. | $6,000/mo. | | |
purchase | $500,000 | $1,390,000 | $409,000 | $128,000 | $9,995 (w/ color monitor) | $800 |
constant 2010 dollars | $5.6M | $11.3M | $2.8M | $428,000 | $16,670 | $950 |
CPU technology | 17,500 vacuum tubes | 5,000 vacuum tubes in CPU | ~ 25 K SLT circuits on 72 circuit boards (2x1 and 1x3 diodes, one transistor, and three printed resistors on AOI module; medium-speed gate was 30 nsec) | 1500 Schottky TTL chips on 20 circuit boards (1-4 gates or circuits per chip; 3-nsec gates) | approx. 1 M transistors in LSI S1C0010 IU (separate FPU) | 55 M transistors in Pentium 4 |
But recently ...
For fun - media representations of computing power
comparison of execution times (e.g., in seconds) on three machines
| | machine X | machine Y | machine Z |
---|---|---|---|
program 1 | 20 | 10 | 40 |
program 2 | 40 | 80 | 20 |
total execution time | 60 | 90 | 60 |
average execution time | 30 | 45 | 30 |
normalized speedups and means (relative to machine X)
| | machine X | machine Y | machine Z |
---|---|---|---|
program 1 | 1 | 2 | 0.5 |
program 2 | 1 | 0.5 | 2 |
arithmetic mean | 1 | 1.25 | 1.25 |
geometric mean | 1 | 1 | 1 |
normalized speedups and means (relative to machine Y)
| | machine X | machine Y | machine Z |
---|---|---|---|
program 1 | 0.5 | 1 | 0.25 |
program 2 | 2 | 1 | 4 |
arithmetic mean | 1.25 | 1 | 2.125 |
geometric mean | 1 | 1 | 1 |
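
The two normalized tables above can be reproduced with a few lines of Python. This is only a sketch: the names `times`, `speedups`, `arithmetic_mean`, and `geometric_mean` are illustrative and do not come from the notes. It computes speedup as reference time divided by machine time, then takes both means for each choice of reference machine.

```python
from math import prod

# Execution times in seconds for program 1 and program 2,
# copied from the execution-time table.
times = {
    "X": [20, 40],
    "Y": [10, 80],
    "Z": [40, 20],
}

def speedups(reference):
    """Per-program speedup of every machine relative to `reference`."""
    ref = times[reference]
    return {m: [r / t for r, t in zip(ref, ts)] for m, ts in times.items()}

def arithmetic_mean(values):
    return sum(values) / len(values)

def geometric_mean(values):
    return prod(values) ** (1 / len(values))

if __name__ == "__main__":
    for reference in ("X", "Y"):
        print(f"relative to machine {reference}")
        for machine, s in speedups(reference).items():
            print(f"  {machine}: speedups={s} "
                  f"AM={arithmetic_mean(s):.3f} GM={geometric_mean(s):.3f}")
```

Running it with both references shows the pattern in the tables: the arithmetic mean of normalized speedups changes with the reference machine (machine Z goes from 1.25 to 2.125), while the geometric mean stays at 1 for all three machines.
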
same comparisons done using rates (1/execution_time)
| | machine X | machine Y | machine Z |
---|---|---|---|
program 1 | 0.05 | 0.1 | 0.025 |
program 2 | 0.025 | 0.0125 | 0.05 |
arithmetic mean | 0.0375 | 0.05625 | 0.0375 |
harmonic mean | 0.0333 | 0.0222 | 0.0333 |
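
The rate table can be checked the same way. The sketch below (again with illustrative names, not taken from the notes) converts each execution time to a rate and compares the arithmetic and harmonic means of those rates with the average execution time.

```python
# Execution times in seconds for program 1 and program 2,
# copied from the execution-time table.
times = {
    "X": [20, 40],
    "Y": [10, 80],
    "Z": [40, 20],
}

def arithmetic_mean(values):
    return sum(values) / len(values)

def harmonic_mean(values):
    # n divided by the sum of reciprocals
    return len(values) / sum(1 / v for v in values)

if __name__ == "__main__":
    for machine, ts in times.items():
        rates = [1 / t for t in ts]          # programs per second
        avg_time = arithmetic_mean(ts)       # seconds per program
        print(f"{machine}: AM(rate)={arithmetic_mean(rates):.5f} "
              f"HM(rate)={harmonic_mean(rates):.4f} "
              f"1/avg_time={1 / avg_time:.4f}")
```

The harmonic mean of the rates equals the reciprocal of the average execution time, so it ranks the machines the same way total execution time does. The arithmetic mean of the rates does not: it makes machine Y look fastest even though Y has the largest total time.
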
CPU performance

$$\text{CPU\_time} = IC \times CPI \times CCT$$

$$i\ \text{insts} \times j\ \frac{\text{cycles}}{\text{inst}} \times k\ \frac{\text{secs}}{\text{cycle}} = ijk\ \text{secs}$$

$$\text{CPU\_time} = \frac{IC \times CPI}{\text{Clock\_Frequency}}$$

$$\text{CPU\_time} = \sum_i \left( IC_i \times CPI_i \right) \times CCT = IC \times \sum_i \left( \frac{IC_i}{IC} \times CPI_i \right) \times CCT$$

Consider the following instruction mix

operation | frequency (IC_i / IC) | cycle count (CPI_i) | weighted CPI_i |
---|---|---|---|
ALU ops | 50% | 1 | _____ |
Loads | 20% | 2 | _____ |
Stores | 10% | 3 | _____ |
Branches | 20% | 4 | _____ |

a) What is the average CPI? (Use the weighted arithmetic average.)

$$CPI = (.5 \times 1) + (.2 \times 2) + (.1 \times 3) + (.2 \times 4) = .5 + .4 + .3 + .8 = 2.0$$

b) What is the cycle count distribution according to operation type?

operation | frequency in inst stream | CPI | percentage of total cycle count |
---|---|---|---|
ALU ops | 50% | 1 | 25% (=.5/2) |
Loads | 20% | 2 | 20% (=.4/2) |
Stores | 10% | 3 | 15% (=.3/2) |
Branches | 20% | 4 | 40% (=.8/2) |
| | | avg = 2 | |

Alternatively, consider that out of every 100 instructions, 50 are ALU ops, taking one cycle each, etc.

| | ALU ops | loads | stores | branches | total |
---|---|---|---|---|---|
instructions | 50 | 20 | 10 | 20 | 100 |
cycles per instruction | 1 | 2 | 3 | 4 | |
clock cycles | 50 | 40 | 30 | 80 | 200 |

So, branches represent 20/100 of the instruction count (20%) but 80/200 of the cycle count (40%). Note that on average each instruction takes two cycles (avg CPI = 2), but branches take twice the average number of cycles. So, it's not surprising that the branch cycle count percentage is twice the branch instruction count percentage.

c) If branch cycles are reduced to 2, do you have enough information to determine how much faster the modified processor will be than the original processor?

$$\text{speedup} = \frac{\text{CPU\_time}_{old}}{\text{CPU\_time}_{new}} = \frac{IC \times CPI_{old} \times CCT}{IC \times CPI_{new} \times CCT} = \frac{CPI_{old}}{CPI_{new}}$$

$$CPI_{old} = 2.0$$

$$CPI_{new} = (.5 \times 1) + (.2 \times 2) + (.1 \times 3) + (.2 \times 2) = .5 + .4 + .3 + .4 = 1.6$$

$$\text{speedup} = 2.0 / 1.6 = 1.25$$
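
The three parts of this exercise can be replayed in a short Python sketch. The names `mix`, `average_cpi`, and `cycle_share` are hypothetical helpers introduced here for illustration. As in the derivation above, the speedup calculation assumes that the instruction count and clock cycle time are the same for both processors, so only CPI changes.

```python
# Instruction mix from the exercise: operation -> (frequency, cycles).
mix = {
    "ALU ops": (0.5, 1),
    "Loads": (0.2, 2),
    "Stores": (0.1, 3),
    "Branches": (0.2, 4),
}

def average_cpi(mix):
    """Weighted arithmetic average: sum of frequency * cycles."""
    return sum(freq * cycles for freq, cycles in mix.values())

def cycle_share(mix):
    """Fraction of all clock cycles spent in each operation type."""
    cpi = average_cpi(mix)
    return {op: freq * cycles / cpi for op, (freq, cycles) in mix.items()}

# Part (c): same mix, but branches now take 2 cycles instead of 4.
new_mix = dict(mix, Branches=(0.2, 2))

cpi_old = average_cpi(mix)
cpi_new = average_cpi(new_mix)
# IC and CCT are assumed unchanged, so they cancel in the ratio.
speedup = cpi_old / cpi_new

if __name__ == "__main__":
    print(f"CPI_old = {cpi_old:.2f}")
    for op, share in cycle_share(mix).items():
        print(f"  {op}: {share:.0%} of cycles")
    print(f"CPI_new = {cpi_new:.2f}")
    print(f"speedup = {speedup:.2f}")
```

Changing the entries of `mix` makes it easy to try other instruction mixes or other per-class cycle counts against the same formulas.
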