## University System of Taiwan (台灣聯合大學系統), Academic Year 98 Master's Program Entrance Examination Paper

Subject: Computer Systems (Computer Organization) (500B)

Departments: National Central University (中大), Department of Electrical Engineering (Electronics Group); National Chiao Tung University (交大), Institute of Electronics (Group B); National Chiao Tung University (交大), Institute of Electrical and Control Engineering (Group B); National Tsing Hua University (清大), Department of Electrical Engineering (Group C)

一. [15%] Assume the floating-point format used in this problem is an 8-bit IEEE 754 normalized format with 1 sign bit, 4 exponent bits, and 3 mantissa bits. It is identical to the 32-bit and 64-bit formats in terms of the meaning of the fields and the special encodings. The bit fields in a number are ordered (sign, exponent, mantissa). Assume we use the unbiased round-to-nearest-even mode specified in the IEEE floating-point standard. Copy the following table onto the answer sheet and fill in your answers for (1), (2), and (3).

| (1) | (2) | (3) |
|-----|-----|-----|
|     |     |     |

- (1) The exponent field employs an *excess-N* coding. What should *N* be if it follows the IEEE 32-bit standard?
- (2) Encode the binary number 0.0011011 using this 8-bit IEEE format. Apply rounding if necessary.
- (3) What decimal value does the bit pattern 11010101 represent?

二. [10%] You are only allowed to use the following MIPS instructions in this problem: add, addu, addi, sub, subi, nor, or, ori, sll, slr, slt, beq, lw, sw.

- (1) The following instruction is not supported by MIPS:

  sgt \$t1, \$t2, \$t3 (if \$t2 > \$t3 then \$t1 = 1, otherwise \$t1 = 0)

  Find the shortest sequence of MIPS instructions that performs the same operation.
- (2) The ARM processor supports the following instruction:

  $$r3 = r2 + (r1 << 3)$$

  Write the minimum sequence of MIPS instructions that performs the same operation.

三.
[9%] Suppose that you have a computer that, on average, exhibits the following characteristics on the programs you run (X denotes that the corresponding stage is not needed):

| Instruction | Distribution | IF | ID | Ex | MEM | WB |
|-------------|--------------|-----|----|-----|-----|-----|
| Load | 25% | 2ns | | 2ns | 2ns | 1ns |
| Store | 10% | | | 2ns | 1ns | X |
| Arithmetic | 45% | | | 1ns | X | 1ns |
| Branch | 20% | | | 2ns | X | X |

- (1) [4%] If your computer is implemented as a single-cycle processor, what is its throughput (measured in instructions per second)? If it is implemented as a multi-cycle processor, what is its throughput?
- (2) [2%] Which implementation (single-cycle or multi-cycle) has the relatively simpler controller? Which implementation requires more registers? Why?
- (3) [3%] If your computer is implemented as a 5-stage pipelined processor, what is its idealized throughput, assuming that there are no hazards between instructions?

四. [12%]

- (1) [2%] In MIPS, there are instructions such as lb, lbu, and sb, but there is no sbu (store byte unsigned). Why not?
- (2) [4%] A MIPS branch instruction modifies PC+4 if the condition is true. The maximum range of the jump is (PC+4)-A to (PC+4)+B, where both A and B are positive numbers. What are A and B?
- (3) [6%] You wish to call a subroutine named FOO. This subroutine will use register \$t1. Write a MIPS code segment that performs the following operations: before calling FOO, save \$t1 on the stack, then call FOO. Once FOO returns, copy the value from \$t1 to \$t2, and restore the saved value from the stack back to \$t1. Note that this is an example of a caller-saved subroutine.

五. [4%] Is it possible to eliminate the pipeline stalls caused by data dependencies just by reordering code? Is the forwarding technique capable of eliminating all pipeline stalls caused by data dependencies? Defend your answer.

六. [16%] Consider the MIPS instruction sequence shown below. Assume this code is executed on a five-stage (IF, ID, EXE, MEM, WB) pipelined MIPS CPU with automatic stall handling. Note that \$s0-\$s7 are numbered 16 to 23 in MIPS.

sub \$s3, \$s2, \$s1
lw \$s4, 100(\$s3)
add \$s5, \$s4, \$s3
sw \$s5, 100(\$s3)

- (1) [6%] If this pipelined CPU does not have any forwarding capability, draw a multi-clock-cycle pipeline diagram, with the necessary stalls (or NOPs), that represents correct program execution on this CPU.
- (2) [4%] Suppose the code in (1) is executed on the following datapath. In the 8th clock cycle, what are the values of these control lines: Read register 2, Write register, Branch, MemtoReg?
- (3) [6%] If this pipelined CPU has forwarding capability, redraw the multi-clock-cycle pipeline diagram, with the necessary stalls (or NOPs), that represents correct program execution on this CPU.

七. [9%] The following figure is the datapath of the multi-cycle MIPS implementation with the necessary control lines. Assume each instruction is partitioned into up to five stages (IF, ID, EXE, MEM, WB).

- (1) [5%] Assume this MIPS CPU is executing the instruction sw \$s1, 100(\$s2). Explain its behavior at each stage in detail.
- (2) [4%] There is a 4-input multiplexer before the ALU, as circled in the datapath. Explain the use of each of the 4 sources of that multiplexer.

八. [5%] Match each memory hierarchy element on the left with the closest phrase on the right:

| Memory hierarchy element | Phrase |
|--------------------------|--------|
| 1. L1 cache | a. A cache for a cache |
| 2. L2 cache | b. A cache for disks |
| 3. TLB | c. A data structure used by a virtual memory system |
| 4. Main memory | d. A cache for page table entries |
| 5. Page table | e. A cache for a main memory |

九.
[20%] The Intrinsity FastMATH is a fast embedded microprocessor that uses the MIPS architecture and a direct-mapped cache containing 256 blocks with 16 words per block. Suppose the processor has a CPI of 2.0, assuming all references hit in the primary cache, and a clock rate of 5 GHz.

- (1) [10%] "Illustrate" and "explain" the implementation of the cache.
- (2) [3%] Show the bit positions in the address.
- (3) [2%] Indicate the capacity of the cache.
- (4) [5%] Assume the CPU has a primary cache access time of 100 ns, including all the miss handling. Suppose the miss rate per instruction at the cache is 2%. How much faster will the processor be if we add a secondary cache that has a 5 ns access time for either a hit or a miss and is large enough to reduce the miss rate to primary cache to 0.5%?
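
As a study aid for the cache geometry in problem 九, the following is a minimal sketch of how an address could be split into fields for a direct-mapped cache with 256 blocks of 16 words each. It assumes 32-bit byte addresses and 4-byte words, neither of which is stated in the problem. The names `split_address`, `TAG_BITS`, and `DATA_CAPACITY_BYTES` are hypothetical and chosen only for illustration.

```python
# Sketch only. Assumptions not stated in the exam: 32-bit byte addresses, 4-byte words.
NUM_BLOCKS = 256        # from the problem statement
WORDS_PER_BLOCK = 16    # from the problem statement
BYTES_PER_WORD = 4      # assumption (MIPS word size)
ADDRESS_BITS = 32       # assumption

# Field widths, derived from the power-of-two sizes above.
BYTE_OFFSET_BITS = (BYTES_PER_WORD - 1).bit_length()
BLOCK_OFFSET_BITS = (WORDS_PER_BLOCK - 1).bit_length()
INDEX_BITS = (NUM_BLOCKS - 1).bit_length()
TAG_BITS = ADDRESS_BITS - INDEX_BITS - BLOCK_OFFSET_BITS - BYTE_OFFSET_BITS

# Data storage only; tag and valid bits are not counted here.
DATA_CAPACITY_BYTES = NUM_BLOCKS * WORDS_PER_BLOCK * BYTES_PER_WORD


def split_address(addr):
    """Return (tag, index, block_offset, byte_offset) for a byte address."""
    byte_offset = addr & (BYTES_PER_WORD - 1)
    block_offset = (addr >> BYTE_OFFSET_BITS) & (WORDS_PER_BLOCK - 1)
    index = (addr >> (BYTE_OFFSET_BITS + BLOCK_OFFSET_BITS)) & (NUM_BLOCKS - 1)
    tag = addr >> (BYTE_OFFSET_BITS + BLOCK_OFFSET_BITS + INDEX_BITS)
    return tag, index, block_offset, byte_offset


if __name__ == "__main__":
    print("tag/index/block offset/byte offset bits:",
          TAG_BITS, INDEX_BITS, BLOCK_OFFSET_BITS, BYTE_OFFSET_BITS)
    print("data capacity in bytes:", DATA_CAPACITY_BYTES)
    print("fields of 0x12345678:", split_address(0x12345678))
```

Under these assumptions the index selects one of the 256 blocks, the block offset selects one of the 16 words in that block, and the remaining upper bits form the tag that is compared on each access.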