[Note: used different textbook in Fall 2008] 330 Fall 2008 -- Exam 1 (with answers) 1. Give the power of 10 associated with these prefixes. (1 pt. each) exa __10**18__ giga __10**9___ kilo __10**3___ mega __10**6___ micro _10**(-6)_ milli _10**(-3)_ nano _10**(-9)_ peta __10**15__ 2. Define "instruction set architecture" and give at least one example. (4 pts.) An ISA is the programmer-visible interface to a computer; it is the complete set of instructions available to the machine language programmer, including the definition of the operations, operand data types and representation, memory and register storage spaces, I/O, and instruction representation and addressing modes. specific examples include SPARC, i386, PowerPC, ARM, S/360 (note RISC, CISC, VLIW, etc. are types or categories of ISAs) 3. Matching -- technology/performance terms. Write the correct term from the list into each blank. (1 pt. each) embedded computer datapath throughput arithmetic mean Linpack desktop computer transistor CPU time harmonic mean Whetstone server wafer CPI geometric mean Dhrystone supercomputer yield workload speedup SPEC a. __supercomputer__ a computer with highest performance and cost b. __server_________ a computer that provides computation, file storage, and/or printing to multiple users across a network c. __wafer__________ the slice of silicon upon which integrated circuits are built d. __speedup________ the ratio of the execution times of two computer systems e. __geometric_mean_ used to summarize a set of normalized execution rates f. __SPEC___________ a set of benchmark suites that are used to compare the performance of various desktops and small servers 4. Give the CPU time equation and define the terms you use. (10 pts.) CPU time = IC * CPI * CCT IC = instruction count CPI = clock cycles per instruction CCT = clock cycle time ( = 1/CR where CR = clock rate ) 5. Find the execution time for a program that executes 10 billion instructions on a processor with an avg. CPI of 2.5 and a clock frequency of 2 GHz. (5 pts.) 10*10**9 insts. * 2.5 cycles/inst. CPU time = ---------------------------------- = 12.5 secs 2*10**9 cycles/sec 6. For the following workload and cycle values, find the average CPI. (4 pts.) type | freq cycles -------+-------------- CPI = 0.2*1 + 0.2*2 + 0.6*4 alu | 0.2 1 = .2 + .4 + 2.4 ld/st | 0.2 2 = 3.0 branch | 0.6 4 7. If a new compiler for the computer in question 6 could reduce the number of instructions to 1/2 of the original total and alter the instruction frequencies in the following manner, what would be the total speedup? (8 pts.) type | freq cycles -------+-------------- CPI = 0.5*1 + 0.25*2 + 0.25*4 alu | 0.5 1 = .5 + .5 +1.0 ld/st | 0.25 2 = 2.0 branch | 0.25 4 CPU time old IC * 3 * CCT speedup = ------------ = ---------------- = 3 CPU time new (IC/2) * 2 * CCT [a typo on the original exam meant that another answer was also considered correct] 8. Explain why the SPEC benchmarks are better for performance evaluation than Whetstone and Dhrystone. (9 pts.) [answer needs to emphasize that SPEC uses real programs rather than synthetic and thus can be argued to be more representative; SPEC programs and data sets are larger than Whetstone and Dhrystone and thus exercise the caches and memory systems.] 9. Matching -- logic terms. Write the correct term into each blank. (1 pt. each) minterm race condition half adder ALU register sum of products circuit depth full adder PLA shift register don't care fan-in decoder latch RS latch glitch fan-out multiplexer flip-flop JK flip-flop a. __don't_care_____ unused value that can be arbitrarily assigned 0 or 1 b. __glitch_________ undesired signal lasting only a short time c. __sum_of_products a form of logical representation that employs a logical OR of minterms d. __race_condition_ where the output of a circuit depends on small differences in signal timing e. __circuit_depth__ the number of gates in a circuit that form the longest path from any input to any output f. __multiplexer____ a circuit in which n select values route one of 2**n input values to the single output g. __flip-flop______ a memory element in which the stored state can only change once per clock cycle h. __shift_register_ a circuit that connects several flip-flops into a linear structure where the output of each flip-flop can be the input to either of its neighbors 10. Simplify the following Karnaugh maps of function F. (3 pts. each) \ BC A \ 00 01 11 10 +----+----+----+----+ 0 | 1 | 1 | 1 | 1 | F = fn(A,B,C) = (~A) + C +----+----+----+----+ 1 | 0 | 1 | 1 | 0 | +----+----+----+----+ \ BC A \ 00 01 11 10 +----+----+----+----+ 0 | 1 | 0 | 1 | 1 | F = fn(A,B,C) = B + (~C) +----+----+----+----+ 1 | d | d | d | d | +----+----+----+----+ 11. Give the expanded characteristic table and the excitation table for the JK flip-flop. (8 pts.) J K Q(t) | Q(t+1) Q(t) Q(t+1) | J K ----------+------- ------------+------------ 0 0 0 | __0__ 0 0 | __0__ __d__ 0 0 1 | __1__ 0 1 | __1__ __d__ ----------+------- ------------+------------ 0 1 0 | __0__ 1 0 | __d__ __1__ 0 1 1 | __0__ 1 1 | __d__ __0__ ----------+------- ------------+------------ 1 0 0 | __1__ 1 0 1 | __1__ ----------+------- 1 1 0 | __1__ 1 1 1 | __0__ ----------+------- Extra Credit. Draw a gate-level circuit diagram of a clocked JK latch (note: this is not a JK flip-flop). The diagram should show all the logic gates required to implement the latch; that is, do not rely on block diagrams for any or all of the circuit. Inputs are J, K, and C (clock); outputs are Q and ~Q. (up to 5 pts.) [see http://www.asic-world.com/images/digital/jk_latch.gif from homework 2, where the clock is labeled as E] 12. Consider a state machine that outputs a 1 whenever it recognizes a repeated input in its binary input stream. That is, the state machine behaves like this input (I): 1 0 1 0 1 1 0 0 1 1 1 0 0 0 1 1 1 1 0 0 0 0 output (S): 0 0 0 0 0 1 0 1 0 1 1 0 1 1 0 1 1 1 0 1 1 1 (For this trace, the state machine starts in a state which records a prior input of 0.) (a) Give the state diagram. (6 pts.) (b) Give the state transition table with inputs I, current state Q(t), next state Q(t+1), and output S. (6 pts.) (c) Give the simplified logic expressions for Q(t+1) and S. (6 pts.) (d) Draw the circuit using a D flip-flop. (6 pts.) Extra Credit. Explain how to add a reset to the circuit. (up to 5 pts.) (a) 1/0 .---. .-----. .---. / v.===./ v.===./ \ 0/1 | | 0 | | 1 | | 1/1 \ /`==='^ /`==='^ / `---' `-----' `---' 0/0 (b) I Q(t) | Q(t+1) S (c) Q(t+1) = I ---------+----------- 0 0 | 0 1 S = (~I)*(~Q(t)) + I*Q(t) 0 1 | 0 0 // exclusive-nor 1 0 | 1 0 1 1 | 1 1 (d) ----. I---*--------------*-------o| . | | .---o| .---. ----. | | | ----' |___\ \ | | | ___| >----S | | | ----. | / / | `---|----| .---' ----' | *----| . | .--------. | ----' `---| D Q |-----' > | | ~Q |--- `--------' (ex cr) "and" ~reset with the I input to the D-FF. The value of S is not explicitly specified for the reset case; but, if it should be 0 on a reset, then "and" ~reset with the output of the "or" gate as well to form S R----*-------------------------------------------. | | | ----. | I--*-|-------------*-------o| . | | | | .---o| .---. ----. | ----. __|_o__ | | ----' |___\ \ `---o| .___S | | | | ___| >-------| . | | | | ----. | / / ----' `...' `---|----| .---' ----' | *----| . | .--------. | ----' `---| D Q |-----' > | | ~Q |--- `--------'