# COMS W4115 Programming Languages and Translators Lecture 18: Translation of Arrays, Booleans,    Ifs, Whiles, Procedures April 1, 2015

## Lecture Outline

1. Arrays
2. Boolean expressions
3. If statements
4. Names
5. Procedures
6. Parameter-passing mechanisms

## 1. Translation of Arrays

• Referencing a one-dimensional array
• In C and Java, array elements are numbered `0, 1,..., n-1` for an array `A` with n elements.
• Element `A[i]` begins in location `(base + i × w)` where `base` is the relative address of the storage allocated for `A` and `w` is the width of each element.
• Common layouts for multidimensional arrays
• Row-major order
• Column-major order
• See Fig. 6.22 (p. 383) for an SDD generating three-address code for assignments with array references.
• Example: three-address code for the expression `c + a[i][j]` assuming the width of an integer is 4
• ``````
t1 = i * 12
t2 = j * 4
t3 = t1 + t2
t4 = a[t3]
t5 = c + t4
``````

## 2. Translation of Boolean Expressions

• Boolean expressions are composed of boolean operators (&&, ||, !) applied to boolean variables, relational expressions, and other boolean expressions.
• Short-circuit evaluation: Some languages, such as C and Java, do not require an entire boolean expression to be evaluated.
• Given `x && y`, if `x` is false, then we can conclude the entire expression is false without evaluating `y`.
• Given `x || y`, if `x` is true, then we can conclude the entire expression is true without evaluating `y`.
• Numerical encoding
• In C, the numerical value 0 represents false; a nonzero value represents true.
• Positional encoding
• The value of a boolean expression can be represented by a position in three-address code, and the boolean operators can be translated into jumps.
• The expression
• ``````
if (x < 100 || x > 200 && x != y)
x = 0;
``````
can be translated into the following three-address instructions:
``````
if x < 100 goto L2
ifFalse x > 200 goto L1
ifFalse x != y goto L1
L2: x = 0
L1:
``````

## 3. Translation of If-statements

• Boolean expressions often appear in the context of flow-of-control statements such as:
• If statements
• If-else statement
• See Figs. 6.36 (p. 402) and 6.37 for SDDs translating these statements with booleans into three-address code.
• For the expression
• ``````
if (x < 100) || x > 200 && x != y)
x = 0;
``````
these SDDs produce the following three-address instructions:
``````
if x < 100 goto L2
goto L3
L3: if x > 200 goto L4
goto L1
L4: if x != y goto L2
goto L1
L2: x = 0
L1:
``````
This code can be transformed into the code in Section 4 by eliminating the redundant goto and changing the directions of the tests in the second and third if-statements.

## 4. Translation of While-statements

• Consider the production `S → while ( B ) S1` for while-statements. The shape of the code for implementing this production can take the form:
• ``````
begin: // beginning of code for S
code to evaluate B
if B is true goto B.true
if B is false goto B.false
B.true:
code to evaluate S1
goto begin
B.false:  // this is where control flow will go after executing S
``````
• Here is an SDD for this translation (from Fig. 6.36, p. 402):
• ``````
S → while ( B ) S1 {
begin = newlabel()
B.true = newlabel()
B.false = S.next
S1.next = begin
S.code = label(begin) || B.code ||
label(B.true) || S1.code ||
gen('goto' begin)
}
``````

## 5. Names

• A name is a character string used to identify some element in a program.
• Names in most languages are identifiers.
• In some programming languages, certain identifiers fall into distinct namespaces that do not interfere with one another. For example, in the following C structure declaration the three uses of `id` fall into three distinct namespaces:
• ``````
struct id {  /* here id is a structure tag */
int id;    /* here id is a structure member */
} id;        /* here id is a structure variable */
``````
The structure tag `id`, the structure member `id`, and the structure variable `id` are in different namespaces and hence distinct identifiers that can be distinguished by context.
• A binding is an association between two things such as between a name and its type or between a symbol and the operation it represents. The time at which this association is determined is called the binding time. Bindings can take place from language design time to runtime.
• The textual region of a program in which a binding is active is called its scope. A scope is often with respect to a namespace.
• In a language with static scoping, the bindings between names and objects are determined at compile time by examining the text of the program. Static scoping is sometimes called lexical scoping. In static scoping, a name refers to its lexically closest declaration.
• In a language with dynamic scoping, the bindings between names and objects depend on the order in which procedures are called at runtime.
• The lifetime of a name-object binding is the time between the creation and destruction of that binding.
• The lifetime of an object is the time between the creation and destruction of the object. Depending on the language, the lifetime of an object may be different than that of lifetime of the name-object binding.
• In C, an object is a location (named region) in storage. The storage class determines the lifetime of the storage associated with an object.
• In C, there are two storage classes: automatic and static. Automatic objects are local to a block and are discarded on exit from the block. Static objects retain their values across entry from and reentry to functions and blocks.
• Object lifetimes are usually determined by the storage-allocation strategy used to manage the storage for that object.
• Static objects are allocated memory in the code space and have an address that is retained throughout the execution of a program.
• Stack objects are allocated in a last-in, first-out order on the runtime stack usually in conjunction with procedure calls and returns.
• Heap objects are dynamically allocated and deallocated at arbitrary times on the runtime heap. Some languages such as Java and C# use a garbage collection mechanism to identify and reclaim heap objects that become unreachable during program execution.

## 6. Procedures

• A procedure P in a programming language is a collection of statements that defines a parameterized computation. An invocation of P is called an activation of P.
• We use the term actual parameters to denote the parameters used in the call of a procedure.
• We use the term formal parameters to denote the parameters used in the definition of a procedure.
• We will often call a procedure that returns a value a function. (C uses the term function for procedure as well.)
• The type of the function `return_type f(arg1_type a, arg2_type b)` can be denoted by the type expression
•    `arg1_type x arg2_type → return_type`
• Some design issues for implementing procedures
• choice of parameter-passing mechanism
• storage allocation for local variables: static or dynamic
• can procedure declarations nest
• can procedures be passed as parameters, returned as values
• can procedure names be overloaded
• generic procedures, ones whose computations can be done on different types
• does language have closures (encapsulations of procedures with their runtime context)

## 7. Parameter-Passing Mechanisms

• Programming languages differ in how the values of parameters are passed to called procedures.
• Call by value
• The actual parameter is evaluated if it is an expression or copied if it is a variable. The r-value is placed in the location belonging to the corresponding formal parameter of the called procedure.
• C and Java use call by value. C leaves the order in which the parameters are evaluated unspecified; Java evaluates the parameters left to right.
• "swap" example from C
• Consider the following C program fragment
• ``````
a = 1;
b = 2;
swap(a, b);
printf("a = %d, b = %d\n", a, b);
``````
where the function `swap` is defined as
``````
void swap(int x, int y) {
int temp;
temp = x;
x = y;
y = temp;
}
``````
• Now consider the same program fragment with `swap(&a, &b)` in place of `swap(a, b)` and with `swap` defined as
• ``````
void swap(int *px, int *py) {
int temp;
temp = *px;
*px = *py;
*py = temp;
}
``````
• Call by reference
• The address of the actual parameter is passed to the callee as the value of the corresponding formal parameter.
• If the parameter is an expression, the expression is evaluated and its value is stored in a new location before the call. The address of that location is passed.
• Useful for passing large parameters to procedures.
• Used for reference parameters in C++. In C++, `swap` can be written with reference parameters as
• ``````
void swap(int &x, int &y) {
int temp;
temp = x;
x = y;
y = temp;
}
``````
In the body, `x` and `y` are `int`'s, not pointers to `int`'s. The caller passses as parameters the variables whose values are to be swapped, not their addresses.
• Call by name
• A call-by-name parameter is re-evaluated in the caller's referencing environment each time it is used. The effect is as though the called procedure is textually expanded at the point of the call with each actual parameter substituted for the corresponding formal parameter at every occurrence in the body of the procedure. Local names in the called procedure may need to be renamed to keep them distinct.
• Used in Algol 60.
• Also used at compile time by macros in the C preprocessor.
• Example: Consider the macro definition in C
• `#define max(a, b) ((a) > (b) ? (a) : (b))`
The C statement
`x = max(p+q, r*s);`
will be replaced by the statement
`x = ((p+q) > (r*s) ? (p+q) : (r*s);`