- Three-address code is an intermediate code. It is used by the optimizing compilers.
- In three-address code, the given expression is broken down into several separate instructions. These instructions can easily translate into assembly language.
- Each Three address code instruction has at most three operands. It is a combination of assignment and a binary operator.
A source-language expression like w= x+y*z might be translated into the sequence of three-address instructions,
t1 = y * z
t2 = x + t 1
w= t1+t2
where t 1 and t 2 are compiler-generated temporary names.
This unraveling of multi-operator arithmetic expressions and of nested flow-of-control statements makes three-address code desirable for target-code generation and optimization.
Three-address code is a linearized representation of a syntax tree or a DAG in which explicit names correspond to the interior nodes of the graph.
Addresses and Instructions
Three-address code is built from two concepts: addresses and instructions.
In object-oriented terms, these concepts correspond to classes, and the various
kinds of addresses and instructions correspond to appropriate sub classes. Alternatively, three-address code can be implemented using records with fields for the addresses; records called quadruples and triples.
An address can be one of the following:
- A name . For convenience, we allow source-program names to appear as addresses in three-address code. In an implementation, a source name is replaced by a pointer to its symbol-table entry, where all information about the name is kept.
- A constant . In practice, a compiler must deal with many different types of constants and variables. Type conversions within expressions are considered.
- A compiler-generated temporary . It is useful, especially in optimizing compilers, to create a distinct name each time a temporary is needed. These temporaries can be combined, if possible, when registers are allocated to variables.
Symbolic labels will be used by instructions that alter the ow of control. A symbolic label represents the index of a three-address instruction in the sequence of instructions. Actual indexes can be substituted for the labels, either by making a separate pass or by "backpatching".
Here is a list of the common three-address instruction forms:
Assignment Instructions
x = y op z and x = op y |
Here,
- x, y and z are the operands.
- op represents the operator.
It assigns the result obtained after solving the right side expression of the assignment operator to the left side operand.
Copy Instructions
x = y |
Here,
- x and y are the operands.
- = is an assignment operator.
It copies and assigns the value of operand y to operand x.
Conditional Jump
If x relop y goto X |
Here,
- x & y are the operands.
- X is the tag or label of the target statement.
- relop is a relational operator.
If the condition “x relop y” gets satisfied, then-
- The control is sent directly to the location specified by label X.
- All the statements in between are skipped.
If the condition “x relop y” fails, then-
- The control is not sent to the location specified by label X.
- The next statement appearing in the usual sequence is executed.
Unconditional Jump
goto X |
Here, X is the tag or label of the target statement.
On executing the statement,
- The control is sent directly to the location specified by label X.
- All the statements in between are skipped.
Procedure Call
param x call p return y |
Here, p is a function which takes x as a parameter and returns y.
1. Assignment instructions of the form x = y op z , where op is a binary arithmetic or logical operation, and x , y , and z are addresses.
2. Assignments of the form x = op y , where op is a unary operation. Essential unary operations include unary minus, logical negation, and conversion operators that, for example, convert an integer to a floating-point number.
3. Copy instructions of the form x = y , where x is assigned the value of y .
4. An unconditional jump goto L . The three-address instruction with label L is the next to be executed.
5. Conditional jumps of the form if x goto L and ifFalse x goto L . These instructions execute the instruction with label L next if x is true and false, respectively. Otherwise, the following three-address instruction in sequence is executed next, as usual.
6. Conditional jumps such as if x relop y goto L , which apply a relational operator ( < , == , >= , etc.) to x and y , and execute the instruction with label L next if x stands in relation relop to y . If not, the three-address instruction following if x relop y goto L is executed next, in sequence.
7. Procedure calls and returns are implemented using the following instructions: param x for parameters; call p , n and y = call p , n for procedure and function calls, respectively; and return y , where y , representing a returned value, is optional. Their typical use is as the sequence of three- address instructions
param
param
x 1
x 2
x
p n
param n
call ,
generated as part of a call of the procedure p ( x 1 ; x 2 ; : : : ; x n ). The in-
teger n , indicating the number of actual parameters in \ call p , n ," is
not redundant because calls can be nested. That is, some of the rst
param statements could be parameters of a call that comes after p returns
its value; that value becomes another parameter of the later call. The
implementation of procedure calls is outlined in Section 6.9.
8. Indexed copy instructions of the form x = y [ i ] and x [ i ] = y . The instruction x = y [ i ] sets x to the value in the location i memory units beyond location y . The instruction x [ i ] = y sets the contents of the location i units beyond x to the value of y .
9. Address and pointer assignments of the form x = & y , x = * y , and * x = y . The instruction x = & y sets the r -value of x to be the location ( l -value) of y . 2 Presumably y is a name, perhaps a temporary, that denotes an expression with an l -value such as A[i][j] , and x is a pointer name or temporary. In the instruction x = * y , presumably y is a pointer or a
temporary whose r -value is a location. The r -value of x is made equal to the contents of that location. Finally, * x = y sets the r -value of the object pointed to by x to the r -value of y .
Consider:
do i = i+1; while (a[i] < v);
A benefit of quadruples over triples can be seen in an optimizing compiler, where instructions are often moved around. With quadruples, if we move an instruction that computes a temporary t , then the instructions that use t require no change. With triples, the result of an operation is referred to by its position, so moving an instruction may require us to change all references to that result. This problem does not occur with indirect triples, which we consider next. Indirect triples consist of a listing of pointers to triples, rather than a listing of triples themselves.
Implementation of Three Address Code
There are 3 representations of three address code namely
- Quadruple
- Triples
- Indirect Triples
1. Quadruples
A quadruple (or just "quad ") has four fields, which we call op, arg 1, arg 2,
and result . The op field contains an internal code for the operator. For instance,
the three-address instruction x = y + z is represented by placing + in op , y in
arg 1 , z in arg 2 , and x in result .
Advantage
- Easy to rearrange code for global optimization.
- One can quickly access value of temporary variables using symbol table.
Disadvantage
- Contain lot of temporaries.
- Temporary variable creation increases time and space complexity.
2. Triples
A triple has only three fields, which we call op , arg 1 , and arg 2 .
Disadvantage –
- Temporaries are implicit and difficult to rearrange code.
- It is difficult to optimize because optimization involves moving intermediate code. When a triple is moved, any other triple referring to it must be updated also. With help of pointer one can directly access symbol table entry.
3. Indirect triples
Indirect triples consist of a listing of pointers to triples, rather than a listing of triples themselves.
Static Single-Assignment Form
REF
https://www.gatevidyalay.com/three-address-code/
No comments:
Post a Comment