All homework problems are to be done individually. All homework submissions are to be made via Courseworks. For all programming problems you will be required to submit source code, a README file documenting your files and code, and a test run of your programs. The README should explain any way in which your solution differs from what was assigned, and any assumptions you made. Refer to the homework submission page on the class web site for additional submission instructions.
For your convenience, all programming can be developed on any machine that can run Linux. However, only those programs which compile using the gcc compiler on the CLIC machines will be graded. Furthermore, it is critically important that all submitted program listings and executions be thoroughly documented.
In this homework you will construct a program that can profile the execution of a function using binary translation. The program will follow the execution of a function and record the number of times each basic block is executed. The following steps will lead you through this process.
We have provided a source skeleton that you should use for this assignment. The skeleton should compile and run out of the box. In the source you will find several NOT_IMPLEMENTED macros. Your job is to implement these pieces.
Your objective in Step 0 is to use StartProfiling to binary patch fib() so that it returns immediately. Although this may sound pointless, the technique is very useful in the real world. Debugging code often needs to be removed after some time, and on-the-fly binary patching lets you remove functionality without recompiling your code. If fib() were debugging code in the kernel, binary patching it to return immediately would let you speed up the kernel without rebooting the machine.
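The simplest form of such a patch overwrites the first byte of the function with the one-byte RET opcode (0xC3). The sketch below shows one way this might be done on Linux. The helper names patch_bytes and patch_to_return are hypothetical and not part of the skeleton, and the code assumes the caller knows the protection the target pages normally carry.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define IA32_RET 0xC3 /* one-byte near return */

/*
 * Hypothetical helper (not part of the skeleton): overwrite len bytes at
 * addr. base_prot is the protection the pages normally carry, for example
 * PROT_READ | PROT_EXEC for code. Returns 0 on success, -1 on failure.
 */
static int patch_bytes(void *addr, const unsigned char *bytes, size_t len,
                       int base_prot)
{
    assert(len > 0);

    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0)
        return -1;

    /* mprotect works on whole pages, so round the range outward. */
    uintptr_t mask = (uintptr_t)page_size - 1;
    uintptr_t start = (uintptr_t)addr & ~mask;
    uintptr_t end = ((uintptr_t)addr + len + mask) & ~mask;
    size_t span = (size_t)(end - start);

    /* Add write permission, patch, then restore the usual protection. */
    if (mprotect((void *)start, span, base_prot | PROT_WRITE) != 0)
        return -1;
    memcpy(addr, bytes, len);
    return mprotect((void *)start, span, base_prot);
}

/* Make the function at func return as soon as it is entered. */
static int patch_to_return(void *func, int base_prot)
{
    const unsigned char ret = IA32_RET;
    return patch_bytes(func, &ret, 1, base_prot);
}
```

For live code, a call might look like patch_to_return((void *)fib, PROT_READ | PROT_EXEC). Keeping PROT_EXEC set while patching matters if the patching code shares a page with the target.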
You should patch the first instruction of fib() with a callout. The callout should emulate the behavior of a RET: instead of returning to its own call site (the normal behavior), it should return directly to the return PC that fib()'s caller left on the stack.
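Two details of the callout can be illustrated without touching live code: how a 5-byte CALL rel32 is encoded at the patch site, and what the callout must do to the stack. The sketch below is only a model. The toy_cpu structure and all function names are hypothetical, and the stack is an array of 32-bit slots rather than real memory.

```c
#include <assert.h>
#include <stdint.h>

#define IA32_CALL_REL32 0xE8
#define CALL_LEN 5
#define TOY_STACK_SLOTS 16

/* Encode "CALL target" as it would appear at address site. */
static void encode_call(unsigned char out[CALL_LEN], uint32_t site,
                        uint32_t target)
{
    /* The displacement is relative to the instruction after the CALL. */
    uint32_t rel = target - (site + CALL_LEN);
    out[0] = IA32_CALL_REL32;
    out[1] = (unsigned char)(rel & 0xFF);         /* little-endian rel32 */
    out[2] = (unsigned char)((rel >> 8) & 0xFF);
    out[3] = (unsigned char)((rel >> 16) & 0xFF);
    out[4] = (unsigned char)((rel >> 24) & 0xFF);
}

/* Toy model of the program counter and a downward-growing stack. */
struct toy_cpu {
    uint32_t eip;
    uint32_t esp;                     /* index into stack[] */
    uint32_t stack[TOY_STACK_SLOTS];
};

/* Model of CALL: push the address of the next instruction, then jump. */
static void toy_call(struct toy_cpu *cpu, uint32_t target)
{
    assert(cpu->esp > 0);
    cpu->stack[--cpu->esp] = cpu->eip + CALL_LEN;
    cpu->eip = target;
}

/* Model of the callout: perform the RET that fib() would have executed. */
static void toy_callout_emulate_ret(struct toy_cpu *cpu)
{
    assert(cpu->esp + 2 <= TOY_STACK_SLOTS);
    cpu->esp += 1;                      /* drop the return address into fib() */
    cpu->eip = cpu->stack[cpu->esp++];  /* pop the caller's return PC */
}
```

In the model, a caller at 0x400 that calls fib() at 0x1000, which in turn calls the callout, ends up back at 0x405 with the stack pointer where it started, as if fib() had executed a plain RET.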
Use the pseudocode in Figures 2.12a and 2.12b of the textbook as a guide. An IA32 opcode map has been provided in ia32DecodeTable so that you do not have to type it all out. Use it as you see fit.
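The core of decoding is working out how long each instruction is, which on IA32 depends on the opcode and on the ModRM and SIB bytes. The sketch below handles only a small, hand-picked set of opcodes and does not use ia32DecodeTable, whose layout is not described here. It assumes 32-bit operand and address sizes and no prefix bytes. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes taken by ModRM, optional SIB, and displacement (32-bit addressing). */
static size_t modrm_len(const uint8_t *p)
{
    assert(p != NULL);
    uint8_t mod = p[0] >> 6;
    uint8_t rm = p[0] & 7;
    size_t len = 1;                       /* the ModRM byte itself */

    if (mod == 3)
        return len;                       /* register operand, nothing more */
    if (rm == 4) {
        len += 1;                         /* SIB byte follows */
        if (mod == 0 && (p[1] & 7) == 5)
            len += 4;                     /* SIB with no base: disp32 */
    } else if (mod == 0 && rm == 5) {
        len += 4;                         /* absolute disp32 */
    }
    if (mod == 1)
        len += 1;                         /* disp8 */
    if (mod == 2)
        len += 4;                         /* disp32 */
    return len;
}

/* Length of the instruction at p, or 0 if this sketch does not know it. */
static size_t insn_length(const uint8_t *p)
{
    assert(p != NULL);
    uint8_t op = p[0];

    if ((op >= 0x50 && op <= 0x5F) || op == 0x90 || op == 0xC3 || op == 0xC9)
        return 1;                         /* PUSH/POP r32, NOP, RET, LEAVE */
    if ((op >= 0x70 && op <= 0x7F) || op == 0xEB)
        return 2;                         /* Jcc rel8, JMP rel8 */
    if ((op >= 0xB8 && op <= 0xBF) || op == 0xE8 || op == 0xE9)
        return 5;                         /* MOV r32,imm32; CALL/JMP rel32 */
    if (op == 0x0F && p[1] >= 0x80 && p[1] <= 0x8F)
        return 6;                         /* Jcc rel32 */

    switch (op) {
    case 0x01: case 0x03: case 0x29: case 0x2B: case 0x39: case 0x3B:
    case 0x85: case 0x89: case 0x8B: case 0x8D:
        return 1 + modrm_len(p + 1);      /* ADD/SUB/CMP/TEST/MOV/LEA */
    case 0x83:
        return 1 + modrm_len(p + 1) + 1;  /* group 1 with imm8 */
    default:
        return 0;                         /* not covered by this sketch */
    }
}
```

A table-driven decoder replaces the if chain and switch with one lookup per opcode byte, but the ModRM handling stays the same.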
For each basic block you encounter, dump the instructions in that block in the same format as in Step 1. Stop this process when you reach the StopProfiling function.
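A basic block ends at the first instruction that transfers control, so following execution requires recognizing those instructions and, for direct branches, computing where they go. The sketch below shows one possible test, assuming no prefix bytes. The function names are hypothetical, and treating CALL as a block terminator is a design choice rather than something the assignment text specifies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 32-bit value. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* True if the instruction at p can transfer control. */
static bool ends_basic_block(const uint8_t *p)
{
    assert(p != NULL);
    uint8_t op = p[0];

    if (op >= 0x70 && op <= 0x7F) return true;   /* Jcc rel8 */
    if (op == 0xEB || op == 0xE9) return true;   /* JMP rel8 / rel32 */
    if (op == 0xE8) return true;                 /* CALL rel32 */
    if (op == 0xC3 || op == 0xC2) return true;   /* RET, RET imm16 */
    if (op == 0x0F && p[1] >= 0x80 && p[1] <= 0x8F)
        return true;                             /* Jcc rel32 */
    if (op == 0xFF) {
        uint8_t reg = (p[1] >> 3) & 7;           /* /2../5: indirect CALL/JMP */
        return reg >= 2 && reg <= 5;
    }
    return false;
}

/*
 * Target of a direct relative branch located at address pc. Sets *ok to
 * false when the target is not encoded in the instruction (RET, indirect
 * branches, or non-branches).
 */
static uint32_t branch_target(const uint8_t *p, uint32_t pc, bool *ok)
{
    assert(p != NULL && ok != NULL);
    *ok = true;
    if ((p[0] >= 0x70 && p[0] <= 0x7F) || p[0] == 0xEB) {
        int32_t disp = p[1];
        if (disp >= 0x80)
            disp -= 0x100;                       /* sign-extend rel8 */
        return pc + 2 + (uint32_t)disp;
    }
    if (p[0] == 0xE8 || p[0] == 0xE9)
        return pc + 5 + read_le32(p + 1);
    if (p[0] == 0x0F && p[1] >= 0x80 && p[1] <= 0x8F)
        return pc + 6 + read_le32(p + 2);
    *ok = false;
    return 0;
}
```

For a conditional branch, the fall-through successor is simply the address of the next instruction, so a block ending in Jcc has two possible successors.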
Dump the contents of the table in StopProfiling.
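One simple way to hold the counts is a fixed-size table keyed by the starting address of each basic block. The sketch below is only an illustration. The structure, the function names, the capacity, and the output format are all assumptions, and a linear search is used for brevity where a hash table would scale better.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_BLOCKS 256   /* arbitrary capacity chosen for this sketch */

struct block_entry {
    uintptr_t start;       /* address of the block's first instruction */
    unsigned long count;   /* number of times the block was entered */
};

struct block_table {
    struct block_entry entries[MAX_BLOCKS];
    size_t used;
};

/* Return the entry for start, or NULL if the block has not been seen. */
static struct block_entry *table_find(struct block_table *t, uintptr_t start)
{
    assert(t != NULL);
    for (size_t i = 0; i < t->used; i++)
        if (t->entries[i].start == start)
            return &t->entries[i];
    return NULL;
}

/* Count one execution of the block at start. Returns -1 if the table is full. */
static int table_record(struct block_table *t, uintptr_t start)
{
    struct block_entry *e = table_find(t, start);
    if (e == NULL) {
        if (t->used == MAX_BLOCKS)
            return -1;
        e = &t->entries[t->used++];
        e->start = start;
        e->count = 0;
    }
    e->count++;
    return 0;
}

/* Print one line per block, in the order the blocks were first seen. */
static void table_dump(const struct block_table *t, FILE *out)
{
    assert(t != NULL && out != NULL);
    for (size_t i = 0; i < t->used; i++)
        fprintf(out, "block %#lx executed %lu times\n",
                (unsigned long)t->entries[i].start, t->entries[i].count);
}
```

With a table like this, the profiler would call table_record each time it enters a basic block, and StopProfiling would call table_dump once.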