Reference no: EM131968161
Assignment -
Instructions - Calculators are allowed. Answer all questions.
Question 1 - Performance Evaluation
1.1) Consider a MIPS system with a 2 GHz clock and the following instruction types:
Load (5 clock cycles), Store (5 clock cycles), R-type (2 clock cycles), Branch (4 clock cycles), and Jump (3 clock cycles)
Assume that a program has 450 R-type instructions, 220 Load instructions, 160 Store instructions, 120 Branch instructions, and 50 Jump instructions. Calculate its CPI, CPU time, and MIPS rate.
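The three quantities follow directly from the instruction mix. The sketch below is one possible way to check the arithmetic, assuming the usual definitions: CPI = total cycles / instruction count, CPU time = total cycles / clock rate, and MIPS rate = clock rate / (CPI x 10^6). The names MIX, CLOCK_HZ, and performance are illustrative and not part of the question.

```python
# Instruction mix from the question: type -> (instruction count, cycles each)
MIX = {
    "R-type": (450, 2),
    "Load": (220, 5),
    "Store": (160, 5),
    "Branch": (120, 4),
    "Jump": (50, 3),
}
CLOCK_HZ = 2e9  # 2 GHz clock


def performance(mix, clock_hz):
    """Return (CPI, CPU time in seconds, MIPS rate) for an instruction mix."""
    instructions = sum(count for count, _ in mix.values())
    cycles = sum(count * per_instr for count, per_instr in mix.values())
    cpi = cycles / instructions
    cpu_time = cycles / clock_hz
    mips_rate = clock_hz / (cpi * 1e6)
    return cpi, cpu_time, mips_rate


cpi, cpu_time, mips_rate = performance(MIX, CLOCK_HZ)
print(f"CPI = {cpi:.2f}")
print(f"CPU time = {cpu_time * 1e6:.3f} us")
print(f"MIPS rate = {mips_rate:.1f}")
```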
1.2) Suppose that we want to enhance a processor used for Web serving. The new processor is 15 times faster on computation in the Web serving application than the original processor. It is observed that the original processor spends 40% of its time waiting for I/O. What is the overall speedup gained by incorporating the enhancement?
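A question of this shape is commonly approached with Amdahl's law. The sketch below assumes that the 60% of time not spent waiting for I/O is the fraction that benefits from the 15x enhancement; the function name amdahl_speedup is illustrative.

```python
def amdahl_speedup(enhanced_fraction, enhancement_factor):
    """Overall speedup when only part of the execution time is accelerated."""
    unaffected = 1.0 - enhanced_fraction
    return 1.0 / (unaffected + enhanced_fraction / enhancement_factor)


io_wait = 0.40                    # fraction of time waiting for I/O (not sped up)
computation = 1.0 - io_wait       # assumed to be the enhanced fraction
speedup = amdahl_speedup(computation, 15)
print(f"Overall speedup = {speedup:.3f}")
```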
1.3) Two computers (A and B) execute four loops of a scientific program in the number of clock cycles shown below:
Loop | Comp A | Comp B
1 | 40 | 30
2 | 50 | 20
3 | 30 | 25
4 | 24 | 15
Based on the table, calculate the central tendency of the speedup for the loops (from A to B).
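For ratios such as speedups, the central tendency is conventionally summarized with the geometric mean. The sketch below assumes that convention and prints the arithmetic mean only for comparison; the variable names are illustrative.

```python
import math

# Clock cycles per loop, taken from the table
comp_a = [40, 50, 30, 24]
comp_b = [30, 20, 25, 15]

# Per-loop speedup of B relative to A (cycles on A divided by cycles on B)
speedups = [a / b for a, b in zip(comp_a, comp_b)]

geometric_mean = math.prod(speedups) ** (1.0 / len(speedups))
arithmetic_mean = sum(speedups) / len(speedups)

print("Per-loop speedups:", [round(s, 3) for s in speedups])
print(f"Geometric mean  = {geometric_mean:.3f}")
print(f"Arithmetic mean = {arithmetic_mean:.3f}")
```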
Question 2 - Processor Data Path
2.1) "Operands stored as constants or in registers are accessed quickly." Give your reason(s).
2.2) Briefly describe the term "stored program concept".
2.3) Identify the type of instruction format shown in Figure 1 below.
2.4) Show the four state elements of a MIPS system and briefly describe each of them.
2.5) Show the MIPS single-cycle data path for the following instruction (where S0 and T0 are 32-bit general purpose registers and S0 has been initialized with a value of zero):
sw T0, 4(S0)
2.6) Consider the following MIPS assembly code (where S0, S1, and S6 are 32-bit general purpose registers and S0 has been initialized with a value of zero):
i) lw S6, 5(S0)
ii) sw S1, 8(S0)
iii) add S1, S1, S0
Show the 32-bit physical addresses of the memory access instructions.
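A minimal sketch of the address arithmetic, assuming the address is simply base register plus offset with no address translation (the question says "physical", but no memory mapping is given). Only the load and the store access memory; the add instruction does not. Note that a word access on MIPS normally requires a word-aligned address, so the sketch also reports alignment. The helper effective_address is hypothetical.

```python
MASK32 = 0xFFFFFFFF


def effective_address(base, offset):
    """Base register value plus offset, wrapped to 32 bits."""
    return (base + offset) & MASK32


S0 = 0  # initialized to zero, as stated in the question

# (instruction text, offset) for the two memory access instructions
accesses = [("lw S6, 5(S0)", 5), ("sw S1, 8(S0)", 8)]
results = {text: effective_address(S0, offset) for text, offset in accesses}

for text, address in results.items():
    aligned = address % 4 == 0
    print(f"{text}: 0x{address:08X} (word-aligned: {aligned})")
```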
2.7) Assume that a compiler (C/C++/Java) running on a MIPS system generates the following assembly code (where S0, S1, S2, S3, and S4 are 32-bit general purpose registers). Show its equivalent high-level code.
bne S3, S4, next
add S0, S1, S2
j L1
next:
sub S0, S0, S3
L1:
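One way to read the control flow above is as an if/else statement: the bne skips the add when S3 and S4 differ, and the jump skips the sub otherwise. The sketch below expresses that reading in Python, treating the registers as plain integers and ignoring 32-bit overflow; the function name equivalent is illustrative.

```python
def equivalent(s0, s1, s2, s3, s4):
    """High-level reading of the bne/add/j/sub sequence; returns the new S0."""
    if s3 == s4:          # bne S3, S4, next  falls through when S3 == S4
        s0 = s1 + s2      # add S0, S1, S2
    else:                 # next:
        s0 = s0 - s3      # sub S0, S0, S3
    return s0             # L1:


print(equivalent(10, 2, 3, 7, 7))  # S3 == S4, so S0 = S1 + S2
print(equivalent(10, 2, 3, 7, 8))  # S3 != S4, so S0 = S0 - S3
```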
Question 3 - Pipeline Processor System
3.1) Consider an unpipelined machine with five stages (Instruction Fetch, Instruction Decode/Register Fetch, Execute/Address Calculation, Memory Access, and Write Back). Assume that it has 1 us clock cycles. The machine uses four cycles for ALU operations and branches and five cycles for memory operations. Assume that the relative frequencies of ALU operations, branches, and memory operations are 45%, 15%, and 40%, respectively. Pipelining the machine adds 1.5 ns of overhead to the clock. Find out how much speedup we will gain in the instruction execution rate.
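A sketch of the usual calculation, assuming an ideal pipeline that completes one instruction per clock, so the pipelined time per instruction is the clock cycle plus the overhead. The question states the clock as "1us" and the overhead in ns; the sketch reads the clock literally as 1000 ns and also shows the result if 1 ns was intended. The function name pipeline_speedup is illustrative.

```python
def pipeline_speedup(clock_ns, overhead_ns, mix):
    """Average unpipelined instruction time divided by pipelined cycle time.

    mix is a list of (relative frequency, cycles) pairs.
    Assumes an ideal pipelined CPI of 1 (no stalls).
    """
    average_cycles = sum(freq * cycles for freq, cycles in mix)
    unpipelined_ns = clock_ns * average_cycles
    pipelined_ns = clock_ns + overhead_ns
    return unpipelined_ns / pipelined_ns


# ALU 45% at 4 cycles, branches 15% at 4 cycles, memory 40% at 5 cycles
MIX = [(0.45, 4), (0.15, 4), (0.40, 5)]

speedup_us = pipeline_speedup(1000.0, 1.5, MIX)  # clock read as 1 us = 1000 ns
speedup_ns = pipeline_speedup(1.0, 1.5, MIX)     # alternative reading: 1 ns clock

print(f"Speedup with a 1 us clock: {speedup_us:.3f}")
print(f"Speedup with a 1 ns clock: {speedup_ns:.3f}")
```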
3.2) Based on question 3.1, derive an equation for the pipeline speedup in the presence of pipeline stall clock cycles per instruction.
3.3) A single-cycle MIPS system is pipelined by subdividing its data path into five pipeline stages. Briefly describe the pipeline stages of the processor.
3.4) Show the MIPS 5-stage pipeline sequence for the given instruction stream. If there are any data hazards among the instructions, show them in the pipeline sequence diagram together with the solution(s) applied by the pipeline system.
addi S1, S0, 5
add S4, S1, S5
xor S10, S1, S4
sub S6, S10, S5
or S3, S6, S7
and S2, S6, S9
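Before drawing the diagram, it can help to list the read-after-write (RAW) dependences in the stream. The sketch below is a simplified check, assuming the first operand is the destination, the remaining non-numeric operands are sources, and a dependence matters only when the consumer is one or two instructions after the producer (the register file is assumed to write in the first half of a cycle and read in the second half). It does not handle a register being redefined in between, which does not occur in this stream. The helpers parse and raw_hazards are hypothetical.

```python
PROGRAM = [
    "addi S1, S0, 5",
    "add S4, S1, S5",
    "xor S10, S1, S4",
    "sub S6, S10, S5",
    "or S3, S6, S7",
    "and S2, S6, S9",
]


def parse(line):
    """Split an instruction into (opcode, destination, source registers)."""
    opcode, rest = line.split(None, 1)
    operands = [item.strip() for item in rest.split(",")]
    destination = operands[0]
    # Immediates such as 5 are not registers, so they are dropped
    sources = [item for item in operands[1:] if not item.lstrip("-").isdigit()]
    return opcode.lower(), destination, sources


def raw_hazards(program, window=2):
    """Return (producer index, consumer index, register) for nearby RAW pairs."""
    parsed = [parse(line) for line in program]
    found = []
    for i, (_, destination, _) in enumerate(parsed):
        for j in range(i + 1, min(i + 1 + window, len(parsed))):
            if destination in parsed[j][2]:
                found.append((i, j, destination))
    return found


hazards = raw_hazards(PROGRAM)
for producer, consumer, register in hazards:
    print(f"{PROGRAM[consumer]!r} reads {register} written by {PROGRAM[producer]!r}")
```

Each reported pair would need either forwarding or stall cycles in the pipeline diagram, depending on which solution the pipeline system is assumed to use.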
3.5) What is a structural hazard? Which stages of a 5-stage MIPS pipeline system are prone to structural hazards? Describe in detail.
Question 4 - Memory System
4.1) In a MIPS system, the program counter (PC) is incremented by 4 after each instruction fetch during sequential execution. Why?
4.2) Discuss the importance of text segment memory in MIPS systems.
4.3) Consider the following MIPS instruction (where S0 and S1 are two 32-bit general purpose registers):
lw S1, 4(S0)
Assume that S0 is initialized with a value of zero and that the memory locations with addresses 0x00000004 to 0x00000007 hold the 32-bit value 0xFF223344. Show the value of register S1 after the execution of the load word operation.
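The load can be modelled with a small byte array. The sketch below assumes big-endian byte order for both storing and loading the word; the byte order only affects how the four bytes are laid out in memory, not the value that a matching load returns. The names memory and load_word are illustrative.

```python
# A tiny byte-addressable memory; addresses 4..7 hold the 32-bit value
memory = bytearray(16)
memory[4:8] = (0xFF223344).to_bytes(4, "big")  # byte order is an assumption


def load_word(mem, base, offset):
    """Model lw: read the 4 bytes at base + offset as one 32-bit word."""
    address = base + offset
    if address % 4 != 0:
        raise ValueError("word access must be word-aligned")
    return int.from_bytes(mem[address:address + 4], "big")


S0 = 0                          # initialized to zero, as stated
S1 = load_word(memory, S0, 4)   # lw S1, 4(S0)
print(f"S1 = 0x{S1:08X}")
```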
4.4) Consider an 8 MB cache (assume that there is no level-2 cache in the system) and a 4 GB main memory (organization: 2G x 16). The size of a main memory block is 256 bits (sixteen 16-bit words). Assume that the processor can access only one 16-bit word at a time from the cache. Based on the given cache, memory block, and main memory details, show the address fields used by the processor to access the following cache organizations:
4.4.1) Direct
4.4.2) Fully associative
4.4.3) 2-way set associative
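The field widths for the three organizations can be derived from the sizes given in 4.4. The sketch below assumes the memory is word-addressable (2G words of 16 bits, hence a 31-bit address) and that the lowest field selects one of the 16 words in a block; with byte addressing the widths would differ. The names address_fields and bits are hypothetical.

```python
MEMORY_WORDS = 2 * 2**30   # 2G words of 16 bits (4 GB)
WORD_BYTES = 2             # one word is 16 bits
BLOCK_WORDS = 16           # 256-bit block = sixteen 16-bit words
CACHE_BYTES = 8 * 2**20    # 8 MB cache


def bits(n):
    """Number of address bits needed to select one of n items (n a power of two)."""
    return n.bit_length() - 1


def address_fields(ways):
    """Field widths for a cache; ways=1 is direct mapped, ways=None fully associative."""
    address_bits = bits(MEMORY_WORDS)
    word_bits = bits(BLOCK_WORDS)
    lines = CACHE_BYTES // (BLOCK_WORDS * WORD_BYTES)
    sets = 1 if ways is None else lines // ways
    index_bits = bits(sets)
    tag_bits = address_bits - index_bits - word_bits
    return {"tag": tag_bits, "index": index_bits, "word": word_bits}


direct = address_fields(1)
fully = address_fields(None)
two_way = address_fields(2)

print("Direct mapped:        ", direct)
print("Fully associative:    ", fully)
print("2-way set associative:", two_way)
```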
4.5) Briefly discuss cache read and write policies.