## National Tsing Hua University Examination Paper

1. (10%) Three lines of a computer program are traced as follows, where A and B are registers and the symbol "⊕" denotes the exclusive-or operation. What will have happened by the end of the third operation?

$$A \leftarrow A \oplus B$$

$$B \leftarrow A \oplus B$$

$$A \leftarrow A \oplus B$$

2. (15%) The carry function of a full-adder cell with inputs A, B, and C is

$$\text{Carry} = AB + (A+B)C = AB + (A \oplus B)C,$$

where "+" and "⊕" denote logical OR and exclusive-OR, respectively.

- (a) Write down the Sum function.
- (b) Starting from the carry function, the following derivation concludes that the operator "+" equals the operator "⊕". Is this correct? If not, why?

$$AB + (A+B)C = AB + (A \oplus B)C$$

$$\Rightarrow (A+B)C = (A \oplus B)C$$

$$\Rightarrow A+B = A \oplus B$$

$$\Rightarrow + = \oplus$$

3. (10%) Give the block diagram of a 1-digit BCD adder using only full adders and primitive logic gates. The two 4-bit inputs are $A_3A_2A_1A_0$ and $B_3B_2B_1B_0$; the carry-in is $C_0$; and the carry-out is $C_4$.

## Academic Year 85, Department (Institute) of Electrical Engineering, Group B, Master's Program Entrance Examination

Subject: Computer Organization. Subject code: 3004. Page 2 of 3. \*Please write your answers on the answer sheet.

4. Answer the following questions and give brief explanations.

- (a) The IEEE 754 standard for single-precision floating-point (FP) numbers has a 23-bit mantissa and an 8-bit exponent. How many bits of precision does it have? (5%)
- (b) The original Intel Pentium processor had a design flaw in its floating-point unit (FPU): in some rare cases, dividing two FP numbers on the buggy Pentium returned an answer with less precision than expected. The Pentium uses the IEEE 754 standard to represent FP numbers. Intel reported that the worst-case inaccuracy occurred in the 12th bit of the mantissa. What is the guaranteed range of single-precision numbers in the buggy Pentium? (5%)
- (c) The IBM POWER2 FPU also conforms to the IEEE 754 FP standard.
In addition, it has two double-precision execution units and supports the compound multiply-add instruction, i.e., it can execute two double-precision multiply-add instructions every cycle, resulting in up to four FP operations per cycle. Based on this information, suggest the width (number of bits) of the FPU's interface to the data cache. (5%)

5. (12%) A nonpipelined processor X has a clock rate of 25 MHz and an average CPI (cycles per instruction) of 4. Processor Y, an improved successor of X, is designed with a five-stage linear instruction pipeline. However, due to latch delay and clock skew effects, the clock rate of Y is only 20 MHz.

- (a) If a program containing 100 instructions is executed on both processors, what is the speedup of processor Y compared with processor X?
- (b) Calculate the MIPS rate of each processor during the execution of this particular program.

6. (13%) Consider a shared-memory multiprocessor system with p processors. Let m be the average number of global memory references per instruction executed on a typical processor, t be the average access time to the shared memory, and x be the MIPS rate of a uniprocessor using local memory.

- (a) For a multiprocessor system with p = 32 RISC processors, m = 0.4, and $t = 1\ \mu s$, what MIPS rate must each processor have to achieve an effective multiprocessor performance of 56 MIPS?
- (b) Suppose p = 32 CISC processors with x = 8 MIPS each are used in the above multiprocessor system with m = 1.6 and $t = 1\ \mu s$. What will the effective MIPS rate be?

7. (12%) Consider a three-level virtual memory system in which the access time of each memory level is $t_{A1} = 10^{-5}$ s (cache), $t_{A2} = 10^{-4}$ s (main memory), and $t_{A3} = 10^{-2}$ s (secondary memory). The hit ratios are $H_1 = 0.9$ for the cache and $H_2 = 0.95$ for main memory.

- (a) What is the access time of the three-level memory system?
- (b) To what value must the cache hit ratio $H_1$ be increased ($H_2$ does not change) so that the access time becomes 50% of its original value?

8. (13%) A computer consists of a CPU and an I/O device D connected to main memory M via a shared one-word bus. The CPU can execute a maximum of $10^6$ instructions per second. An average instruction requires five machine cycles, three of which use the memory bus. A memory read or write operation uses one machine cycle. Suppose the CPU is continuously executing "background" programs that require 90 percent of its instruction execution rate but no I/O instructions. Now the I/O device is to be used to transfer very large blocks of data to and from M.

- (a) If programmed I/O is used and each one-word I/O transfer requires the CPU to execute two instructions, estimate the maximum I/O data-transfer rate possible through I/O device D.
- (b) Estimate the maximum I/O data-transfer rate possible through DMA transfer.
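
The arithmetic behind Question 8 can be sketched in a few lines of Python. This is one possible reading of the problem, not an official solution. The function and constant names are hypothetical, and the sketch assumes that (i) programmed I/O is limited only by the 10 percent of the instruction rate that the background programs leave unused, and (ii) DMA works by cycle stealing, moves one word per stolen machine cycle, and may use every bus cycle the background programs leave free.

```python
from fractions import Fraction

# Values given in Question 8.
MAX_INSTR_PER_SEC = 10**6            # peak CPU instruction rate
CYCLES_PER_INSTR = 5                 # machine cycles per average instruction
BUS_CYCLES_PER_INSTR = 3             # cycles per instruction that use the bus
BACKGROUND_SHARE = Fraction(9, 10)   # background programs take 90% of the rate


def programmed_io_rate(instr_per_word=2):
    """Words per second when each one-word transfer costs `instr_per_word` instructions.

    Assumption: only the instruction rate left over by the background
    programs is available for I/O instructions.
    """
    spare_instr = MAX_INSTR_PER_SEC * (1 - BACKGROUND_SHARE)
    return spare_instr / instr_per_word


def dma_rate():
    """Words per second under cycle stealing.

    Assumptions: one word moves per stolen machine cycle, and DMA may use
    every bus cycle that the background programs do not use.
    """
    total_cycles = MAX_INSTR_PER_SEC * CYCLES_PER_INSTR
    busy_bus_cycles = MAX_INSTR_PER_SEC * BACKGROUND_SHARE * BUS_CYCLES_PER_INSTR
    return total_cycles - busy_bus_cycles


if __name__ == "__main__":
    print(f"programmed I/O: {int(programmed_io_rate()):,} words/s")
    print(f"DMA:            {int(dma_rate()):,} words/s")
```

Under these assumptions the sketch gives 50,000 words/s for programmed I/O and 2,300,000 words/s for DMA. `Fraction` is used so that the 90 percent share is handled exactly rather than with floating-point rounding. A different model of bus contention would change the DMA figure.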