Homework 6 examples
Purposes:
(1) perform calculations using Amdahl's Law;
(2) perform calculations to determine MIPS and MFLOPS;
(3) consider instruction execution patterns for multithreading.
1. Consider enhancing a scalar machine by providing a vector mode, which
is 4 times faster than the normal mode of operation.
(a) If the percentage of vectorization is 25%, what is the overall
speedup?
                      1                 1            1
   speedup = ------------------- = ------------ = ------- = 16/13 = 1.23
              (1-1/4) + (1/4)/4     3/4 + 1/16     13/16
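The calculation above can be checked with a short script. This is a minimal
sketch in Python 3; the function name amdahl_speedup and its parameter names
f (fraction of work enhanced) and s (speedup of the enhanced mode) are
hypothetical, chosen here only for illustration. Exact fractions are used so
that the intermediate value 16/13 is visible.

```python
from fractions import Fraction


def amdahl_speedup(f, s):
    """Overall speedup when a fraction f of the work runs s times faster.

    Hypothetical helper: implements 1 / ((1 - f) + f / s).
    """
    return 1 / ((1 - f) + f / s)


# Part (a): 25% of the work is vectorized, vector mode is 4 times faster.
speedup = amdahl_speedup(Fraction(1, 4), 4)
print(speedup)                  # exact value as a fraction: 16/13
print(round(float(speedup), 2)) # rounded decimal value: 1.23
```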
(b) What percent of vectorization is needed to achieve an overall
speedup of 2?
               1
   2 = -----------------   =>  (1-f) + (f/4) = 1/2
         (1-f) + (f/4)
                           =>  4 - 4f + f = 2      (multiply both sides by 4)
                           =>  -3f = -2
                           =>  f = 2/3, so the vectorization needed is about 67%
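The same algebra can be done once for any target. Solving
1/S = (1-f) + f/s for f gives f = (1 - 1/S) / (1 - 1/s). The sketch below
assumes Python 3; required_fraction, target, and s are hypothetical names
introduced only for this illustration.

```python
from fractions import Fraction


def required_fraction(target, s):
    """Fraction of work that must be enhanced to reach an overall speedup.

    Hypothetical helper: Amdahl's Law solved for f,
    f = (1 - 1/target) / (1 - 1/s).
    """
    target = Fraction(target)
    s = Fraction(s)
    return (1 - 1 / target) / (1 - 1 / s)


# Part (b): overall speedup of 2 with a vector mode that is 4 times faster.
f = required_fraction(2, 4)
print(f)                    # 2/3
print(f"{float(f):.0%}")    # 67%
```

A result greater than 1 would mean the target is unreachable, since the
overall speedup can never exceed the speedup s of the enhanced mode.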
2. Consider a program that executes 100 million instructions in 5 seconds.
What is the MIPS rating for this program?
          100*10^6 insts
   MIPS = -------------- = 20 MIPS
          5 secs * 10^6
Consider a processor with a CPI value of 10 cycles/inst. and a clock
frequency of 200 MHz. What is the MIPS rating for this processor?
           200*10^6 cycles/sec
   MIPS = --------------------- = 20 MIPS
          10 cycles/inst * 10^6
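Both forms of the MIPS calculation in question 2 can be written as small
functions. This is a sketch assuming Python 3; mips_from_run and
mips_from_clock are hypothetical names, not part of any library.

```python
def mips_from_run(instructions, seconds):
    """MIPS from a measured run: instruction count / (time * 10^6)."""
    return instructions / (seconds * 1e6)


def mips_from_clock(clock_hz, cpi):
    """MIPS from processor parameters: clock rate / (CPI * 10^6)."""
    return clock_hz / (cpi * 1e6)


# Program: 100 million instructions in 5 seconds.
print(mips_from_run(100e6, 5))      # 20.0

# Processor: 200 MHz clock, 10 cycles per instruction.
print(mips_from_clock(200e6, 10))   # 20.0
```

The two formulas agree because instruction count / time is the same
quantity as clock rate / CPI.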
3. Consider the program in question 2 that executes 100 million instructions
in 5 seconds. If 15% of these instructions are floating-point operations,
what is the MFLOPS rating for this program?
            0.15 flops/inst * 100*10^6 insts
   MFLOPS = -------------------------------- = 3 MFLOPS
                     5 secs * 10^6
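The MFLOPS calculation follows the same pattern as MIPS, scaled by the
fraction of instructions that are floating-point operations. A minimal
sketch, assuming Python 3; the function name mflops and its parameter names
are hypothetical.

```python
def mflops(instructions, fp_fraction, seconds):
    """MFLOPS: floating-point operations executed / (time * 10^6)."""
    flops = fp_fraction * instructions
    return flops / (seconds * 1e6)


# 100 million instructions in 5 seconds, 15% of them floating-point.
rating = mflops(100e6, 0.15, 5)
print(f"{rating:.2f} MFLOPS")   # 3.00 MFLOPS
```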
4. Consider the following two threads acting on a shared variable "sv":
initially: sv = 0;
thread 1:
sv++;
thread 2:
sv=2;
When compiled, the relevant portions of these threads are:
   thread 1:                   thread 2:
   (1.1) ld   r1, sv           (2.1) st r2, sv   // assume that r2 has
   (1.2) addi r1, r1, 1                          // been preloaded
   (1.3) st   r1, sv                             // with the value 2
How many different interleavings are possible for the four instructions?
The answer is less than 4! (4 factorial) because the ordering
within thread 1 must be observed. If you consider four slots for
the four total instructions executed, you can assign (without loss
of generality) thread 1 to three of them (for 4 choose 3) and
assign the single instruction from thread 2 to the remaining slot.
                      4     1
   # interleavings = ( ) * ( ) = 4 * 1 = 4
                      3     1
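The slot-counting argument generalizes: with n1 instructions in one thread
and n2 in the other, the number of interleavings is (n1+n2 choose n1). A
short check, assuming Python 3.8 or later for math.comb; the variable names
n1, n2, and count are chosen here for illustration.

```python
from math import comb, factorial

n1, n2 = 3, 1   # instructions in thread 1 and thread 2

# Choose which of the n1 + n2 slots hold thread 1's instructions;
# the order inside each thread is fixed, so nothing else is free.
count = comb(n1 + n2, n1)
print(count)              # 4

# Upper bound if no ordering had to be respected.
print(factorial(n1 + n2)) # 24
```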
What are the possible values of sv that can result?
   interleaving 1    interleaving 2    interleaving 3    interleaving 4
   sv = 0            sv = 0            sv = 0            sv = 0
   1.1  r1=0         1.1  r1=0         1.1  r1=0         2.1  sv=2
   1.2  r1=1         1.2  r1=1         2.1  sv=2         1.1  r1=2
   1.3  sv=1         2.1  sv=2         1.2  r1=1         1.2  r1=3
   2.1  sv=2         1.3  sv=1         1.3  sv=1         1.3  sv=3
   sv = 2            sv = 1            sv = 1            sv = 3
So the possible final values of sv are 1, 2, and 3.
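The four interleavings can also be enumerated and executed mechanically.
The sketch below, in Python 3, is only a model of the question: it assumes
each instruction executes atomically, that memory operations take effect in
the interleaved order, and that r2 already holds 2. All function and
variable names (ld_r1, interleavings, run, and so on) are hypothetical.

```python
from itertools import combinations


# Each instruction is modeled as a function on a state dictionary.
def ld_r1(s):   s["r1"] = s["sv"]    # (1.1) ld   r1, sv
def addi_r1(s): s["r1"] += 1         # (1.2) addi r1, r1, 1
def st_r1(s):   s["sv"] = s["r1"]    # (1.3) st   r1, sv
def st_r2(s):   s["sv"] = s["r2"]    # (2.1) st   r2, sv


THREAD1 = [("1.1", ld_r1), ("1.2", addi_r1), ("1.3", st_r1)]
THREAD2 = [("2.1", st_r2)]


def interleavings(t1, t2):
    """Yield every merge of t1 and t2 that keeps each thread's own order."""
    total = len(t1) + len(t2)
    for slots in combinations(range(total), len(t1)):
        it1, it2 = iter(t1), iter(t2)
        yield [next(it1) if i in slots else next(it2) for i in range(total)]


def run(schedule):
    """Execute one interleaving from the initial state and return sv."""
    state = {"sv": 0, "r1": 0, "r2": 2}   # assumption: r2 preloaded with 2
    for _, op in schedule:
        op(state)
    return state["sv"]


results = {}
for schedule in interleavings(THREAD1, THREAD2):
    labels = " ".join(label for label, _ in schedule)
    results[labels] = run(schedule)
    print(labels, "-> sv =", results[labels])

print(sorted(set(results.values())))   # [1, 2, 3]
```

The same enumeration works for longer threads, although the number of
interleavings grows quickly with the instruction counts.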