CPSC 3300 - Spring 2018
Homework 1


1. For a processor with 3 GHz clock frequency (a.k.a. clock rate), what
   is the clock cycle time? (6 pts.)

     CCT = 1/CR = 1 / (3 GHz) = 1 / (3x10^9 cycles/sec)
         = (1/3) * 10^(-9) secs/cycle = 1/3 nsec/cycle = 0.333 nsec


2. Find the execution time for a program that executes 3 billion
   instructions on a processor with an average CPI of 2.0 and a clock
   frequency of 4 GHz. (10 pts.)

                IC * CPI   3x10^9 insts * 2.0 cycles/inst
     CPU time = -------- = ------------------------------ = 1.5 sec
                   CR            4x10^9 cycles/sec


3. P1 has a 3.0 GHz clock frequency and an average CPI of 1.5 on a given
   program. If the processor executes that program in 10 seconds, find:
   (6 pts. each)
   (a) the number of cycles, and
   (b) the number of instructions.
   [This is question 1.5(b) for processor P1 from the textbook.]

   (a)             IC * CPI   total cycles
        CPU time = -------- = ------------
                      CR           CR

        total cycles = CPU time * CR
                     = 10 sec * 3x10^9 cycles/sec
                     = 30 billion cycles

   (b)  total cycles = IC * CPI

        so: IC = total cycles / CPI
               = 30 billion cycles / 1.5 cycles/inst
               = 20 billion insts


4. What is the MIPS rate for processor P1 in question 2 above? Use the
   MIPS formula that contains only the clock rate and CPI. (6 pts.)

                CR            4x10^9 cycles/sec
     MIPS = ---------- = ---------------------- = 2000 MIPS
            CPI * 10^6   2.0 cycles/inst * 10^6


5. For the following workload and cycle values, find the average CPI.
   [This is a weighted average that uses the instruction frequencies as
   weights. You do not divide by three.] (6 pts.)

     type   | freq  cycles
     -------+--------------
     alu    | 0.4     1
     ld/st  | 0.4     4
     branch | 0.2     2

     avg CPI = .4*1 + .4*4 + .2*2 = .4 + 1.6 + .4 = 2.4


6. If a new compiler for the computer in question 5 could reduce the
   number of instructions to 80% of the original total and alter the
   instruction frequencies in the following manner, what would be the
   total speedup? (18 pts.)

     type   | freq  cycles
     -------+--------------
     alu    | 0.6     1
     ld/st  | 0.3     4
     branch | 0.1     2

               CPU time old   IC old * CPI old * CCT old
     speedup = ------------ = --------------------------
               CPU time new   IC new * CPI new * CCT new

     CPI new = .6*1 + 0.3*4 + 0.1*2 = .6 + 1.2 + 0.2 = 2.0

     factor | old    new
     -------+---------------
     IC     | IC     0.8*IC
     CPI    | 2.4    2.0
     CCT    | CCT    CCT

     no change in CCT, so it cancels out

               IC old * CPI old * CCT     IC * 2.4      2.4
     speedup = ---------------------- = ------------ = ----- = 1.5
               IC new * CPI new * CCT   0.8*IC * 2.0   0.8*2


7. Consider a vector computer in which vector mode can provide a speedup
   of 10. Using Amdahl's Law: (12 pts. each)

   (a) What is the overall speedup for each incremental twentieth in the
       fraction of vectorization (i.e., f = 0.05, 0.1, 0.15, ..., 0.95,
       1.0)? You should express the speedup values with two decimal
       places. (Hint: use a spreadsheet. You can also chart the values.)

       f        0.00  0.05  0.10  0.15  0.20  0.25
       speedup  1.00  1.05  1.10  1.16  1.22  1.29

       f        0.30  0.35  0.40  0.45  0.50
       speedup  1.37  1.46  1.56  1.68  1.82

       f        0.55  0.60  0.65  0.70  0.75
       speedup  1.98  2.17  2.41  2.70  3.08

       f        0.80  0.85  0.90  0.95  1.00
       speedup  3.57  4.26  5.26  6.90  10.00

       10                                           *
        9
        8
        7                                         *
        6
        5                                       *
        4                                   * *
        3                               * *
        2                   * * * * * *
        1   * * * * * * * *
           0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0

   (b) What amount of vectorization is required for an overall speedup
       of 2? Solve the problem algebraically and express the answer as a
       reduced fraction rather than a decimal or a percentage.

       overall_speedup = 1/[ (1-f) + f/s ]

                 1
       2 = ------------
           (1-f) + f/10

       2*(1-f) + 2*f/10 = 1

       10*(1-f) + 10*f/10 = 5

       10 - 10*f + f = 5

       10 - 9*f = 5

       -9*f = -5

       f = 5/9

   (c) What amount of vectorization is required for an overall speedup
       of 5? Solve and express the answer as directed in part (b).

       overall_speedup = 1/[ (1-f) + f/s ]

                 1
       5 = ------------
           (1-f) + f/10

       5*(1-f) + 5*f/10 = 1

       10*(1-f) + 10*f/10 = 2

       10 - 10*f + f = 2

       10 - 9*f = 2

       -9*f = -8

       f = 8/9


8. Arithmetic/Harmonic/Geometric. Circle one of A, H, or G, as applies.
   (2 pts. each)

   G - Used for averaging normalized times or rates.
   H - Used for averaging execution rates.
   A - Used for averaging execution times.
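

Checking the arithmetic. The formulas used in questions 2, 5, 6, and 7
(CPU time, weighted-average CPI, and Amdahl's Law) can be checked with a
short script. The following is only a sketch in Python, which the
assignment itself does not use; the function names are made up for
illustration and are not from the course or the textbook.

```python
from fractions import Fraction


def cpu_time(inst_count, cpi, clock_rate_hz):
    """CPU time in seconds: (IC * CPI) / CR."""
    return inst_count * cpi / clock_rate_hz


def average_cpi(mix):
    """Weighted-average CPI; mix is a list of (frequency, cycles) pairs."""
    return sum(freq * cycles for freq, cycles in mix)


def amdahl_speedup(f, s):
    """Overall speedup when fraction f of the work is sped up by factor s."""
    return 1 / ((1 - f) + f / s)


def required_fraction(target, s):
    """Solve target = 1/[(1-f) + f/s] for f, returned as an exact Fraction."""
    target, s = Fraction(target), Fraction(s)
    return (1 - 1 / target) / (1 - 1 / s)


if __name__ == "__main__":
    # Question 2: 3 billion instructions, CPI 2.0, 4 GHz clock.
    print("CPU time:", cpu_time(3e9, 2.0, 4e9), "sec")

    # Questions 5 and 6: old and new instruction mixes.
    old_cpi = average_cpi([(0.4, 1), (0.4, 4), (0.2, 2)])
    new_cpi = average_cpi([(0.6, 1), (0.3, 4), (0.1, 2)])
    # IC and CCT cancel, leaving (1 * old CPI) / (0.8 * new CPI).
    print("speedup: %.2f" % (old_cpi / (0.8 * new_cpi)))

    # Question 7(a): speedup for f = 0.00, 0.05, ..., 1.00 with s = 10.
    for i in range(21):
        f = i / 20
        print("%.2f  %.2f" % (f, amdahl_speedup(f, 10)))

    # Questions 7(b) and 7(c): exact fractions for speedups of 2 and 5.
    print(required_fraction(2, 10), required_fraction(5, 10))
```

required_fraction uses exact rational arithmetic so that the answers to
7(b) and 7(c) come out as reduced fractions rather than decimals; the
other functions use ordinary floating point, so their results should be
compared after rounding.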