Exam Flashcards

Question

What is a symbol table? (Sometimes called an identification table)

Answer 1

It is an information table that can be accessed by all phases in a syntax-directed compiler. * Allows information to be associated with identifiers which can be shared among different compile phases. Each time an identifier/variable is declared or used, the symbol table provides information collected about it.
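
A minimal sketch of a scoped symbol table, assuming a stack-of-dictionaries design. The class name `SymbolTable` and its methods are illustrative, not taken from the course material.

```python
class SymbolTable:
    """A scoped symbol table: a stack of dictionaries, innermost scope last."""

    def __init__(self):
        self.scopes = [{}]  # start with the global scope

    def open_scope(self):
        self.scopes.append({})

    def close_scope(self):
        self.scopes.pop()

    def declare(self, name, type_):
        scope = self.scopes[-1]
        if name in scope:
            raise KeyError(f"{name!r} already declared in this scope")
        scope[name] = type_

    def lookup(self, name):
        # Search from the innermost scope outwards.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None


table = SymbolTable()
table.declare("x", "int")
table.open_scope()
table.declare("x", "float")  # shadows the outer x
inner = table.lookup("x")
table.close_scope()
outer = table.lookup("x")
```

Every compiler phase that holds a reference to `table` sees the same information about `x`.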

Answer 2

It is often packaged as a compiler compiler. (SableCC, antlr) * Includes scanner and parser generators. * Some also include symbol table managers, and code-generation tools.

Answer 3

They are the lexical elements used in specifying the production rules constituting a formal grammar.

Answer 4

A grammar symbol that cannot be rewritten. ## Footnote (By convention written in lowercase) (On the picture it's the id, assign and $ symbols.)

Answer 5

Nonterminal symbols are those symbols which can be replaced. ## Footnote (By convention written in uppercase) (On the picture it's the Val and Expr symbols)

Answer 6

They specify which symbols may replace which other symbols. The rule z0 → z1 specifies that z0 can be replaced by z1.

Answer 7

A special nonterminal. It's usually the symbol on the left-hand side of the grammar's first rule. (On the picture the start symbol is "Prog")

Answer 8

A regular expression is a sequence of characters that forms a search pattern, mainly used for pattern matching with strings (string matching). ## Footnote Typical uses are "find and replace"-like operations. Written like this: ab\*(c|ε)
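
As a hedged illustration, the pattern ab\*(c|ε) can be tried out with Python's `re` module. Python has no ε symbol, so this sketch writes "c or the empty string" as `c?`. The helper name `matches` is made up for the example.

```python
import re

# ab*(c|ε): an 'a', zero or more 'b', then an optional 'c'.
pattern = re.compile(r"ab*c?")


def matches(text):
    """True if the whole string belongs to the language of the pattern."""
    return pattern.fullmatch(text) is not None
```

`matches("abbbc")` and `matches("a")` hold, while `matches("abcc")` does not.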

Answer 9

By associating a regular expression with each token.

Answer 10

One of the simplest parsing techniques used in practical compilers. ## Footnote This is a top-down process in which the parser attempts to verify that the syntax of the input stream is correct as it is read from left to right.
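
A minimal sketch of such a top-down parser, assuming the toy grammar Expr → num ('+' num)\*. The grammar and all names (`accepts`, `peek`, `match`, `expr`) are assumptions made for this example only.

```python
def accepts(tokens):
    """Return True if tokens match Expr -> num ('+' num)*, else False."""
    pos = 0

    def peek():
        # "$" marks the end of the input.
        return tokens[pos] if pos < len(tokens) else "$"

    def match(expected):
        nonlocal pos
        token = peek()
        kind = "num" if token.isdigit() else token
        if kind != expected:
            raise SyntaxError(f"expected {expected}, got {token!r}")
        pos += 1

    def expr():
        # One parsing procedure per nonterminal; input is read left to right.
        match("num")
        while peek() == "+":
            match("+")
            match("num")

    try:
        expr()
        match("$")
        return True
    except SyntaxError:
        return False
```

For example, `accepts(["1", "+", "2"])` succeeds, while `accepts(["1", "+"])` fails because a number is expected after `+`.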

Answer 11

A Parse tree (sometimes called a derivation tree) visualizes how a string is structured by a grammar. It has the following characteristics: * The root is the grammar's start symbol *S*. * Each node is either a grammar symbol or λ. * Its interior nodes are nonterminals.

Answer 12

A semantic processing activity that traverses the AST to record all identifiers and their types in a symbol table.

Answer 13

It walks the AST bottom up from its leaves towards the root.

Answer 14

Tombstone diagrams consist of a set of “puzzle pieces” representing compilers and other related language processing programs. ## Footnote They are used to illustrate transformations from a *source language* to a *target language* realised in an *implementation language*.

Answer 15

A program *P* implemented in the language *L*.

Answer 16

A Translator/Compiler implemented in the language *L*, that can translate the language *S* into the language *T*.

Answer 17

A machine *M* implemented in hardware.

Answer 18

A language interpreter written in *L* that interprets the language *M*.

Answer 19

When you have created a compiler that you want to run on a machine that cannot run the code the compiler is currently written in. (see image for an example)

Answer 20

A way to compile a compiler written in its own language. ## Footnote The first compiler has to be compiled by a compiler written in another language.

Answer 21

A recognition device that reads input strings over the alphabet of the language and decides whether the input strings belong to that language.

Answer 22

BNF (Backus-Naur Form) is a notation technique for context-free grammars.

Answer 23

EBNF adds extra features to BNF: * ? which means that the symbol (or group of symbols in parentheses) to the left of the operator is optional (it can appear zero or one times) * \* which means that something can be repeated any number of times (and possibly be skipped altogether) * + which means that something can appear one or more times

Answer 24

When it generates a sentential form that has two or more distinct parse trees.

Answer 25

Throws Visitor is a specialized visitor used to collect information about throws that may "escape" from a given construct. ## Footnote These visitors compute the throwsSet field, which records exceptions that may be thrown.

Answer 26

Semantics Visitor is used to check that the type rules imposed on language constructs are satisfied. ## Footnote This includes checking that control expressions are Boolean-valued, that parameters have correct types in calls, that expected types are returned from calls, and so forth. This analysis, often called static semantics, is a necessary part of all compilers.

Answer 27

Reachability Visitor is a specialized visitor used to analyze control structures for reachability and proper termination. ## Footnote These visitors set two flags, isReachable and terminatesNormally, used for error analysis and optional code optimization.

Answer 28

1. Name 2. Signature 3. Actions.

Answer 29

* Number of I/O. * Types of I/O.

Answer 30

That which transforms input values into output values, changes internal state, and possibly affects global variables.

Answer 31

* Code that most assemblers generate. * External references, local instruction addresses, and data addresses are not yet bound. * Instead, addresses are assigned relative either to the beginning of the module or to some symbolically named locations.

Answer 32

Some compilers generate an absolute binary format that can be directly executed when the compiler is finished. ## Footnote This process is usually faster than the other approaches. However, the ability to interface with other code may be limited. In addition, the program must be recompiled for each execution unless some means is provided for archiving the memory image.

Answer 33

* Sequential. * Conditional Selection. * Looping Construct. Must have all three to provide full power of a Computing Machine.

Answer 34

A subprogram is a program called by another program to perform a particular task or function for the program. When a task needs to be performed multiple times, you can make it into a separate subprogram. 1. A subprogram has a single entry point. 2. The caller is suspended during execution of the called subprogram. 3. Control always returns to the caller when the called subprogram's execution terminates.

Answer 35

* Subprograms encapsulate local variables, specifics of algorithm applied * Once compiled, programmer cannot access these details in other programs. * Application of subprogram does not require user to know details of input data layout (just its type) * Form of information hiding.

Answer 36

* Formal parameters: * Names (and types) of arguments to the subprogram used in defining the subprogram body. * Actual parameters: * Arguments supplied for formal parameters when the subprogram is called. **Actual/Formal Parameter Correspondence:** * Attributes of variables are used to exchange information: * Name: Call-by-name. * Memory location: Call-by-reference. * Value: * Call-by-value (one way from actual to formal parameter). * Call-by-value-result (two ways between actual and formal parameter). * Call-by-result (one way from formal to actual parameter).

Answer 37

The parameter passed isn't evaluated before it is used in the function. ## Footnote E.g. int func(a) { return a + 1; } func(2 + 3); The return statement in the body of func will effectively read return (2 + 3) + 1; instead of return a + 1; with a = 5 (as it would in call by value).
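
Python itself passes evaluated arguments, so call-by-name can only be simulated. A rough sketch, assuming the argument is wrapped in a zero-argument function (a thunk); all function names here are illustrative.

```python
evaluations = 0


def expensive():
    """Stands in for the argument expression 2 + 3 and counts its evaluations."""
    global evaluations
    evaluations += 1
    return 2 + 3


def func_by_value(a):
    return a + 1  # a is already a number when the body runs


def func_by_name(a):
    return a() + 1  # a is a thunk; the expression is evaluated when it is used


def ignore_by_name(a):
    return 0  # the argument is never used, so it is never evaluated


result_value = func_by_value(expensive())        # evaluated once, at the call
result_name = func_by_name(lambda: expensive())  # evaluated inside the body
ignore_by_name(lambda: expensive())              # evaluated zero times
```

Both calls return 6, but the argument expression ran only twice in total because the last call never touched its parameter.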

Answer 38

No new variable is declared at call-time, instead the parameter is used directly. ## Footnote E.g. int x = 2 + 3; int func(a) { a = a + 1; } func(x); Print(x); // x = 6

Answer 39

The actual parameter is evaluated at call time. ## Footnote E.g. int func(a) { return a + 1; } func(2 + 3); Behind the scenes this happens: a = 5, so the return statement reads return a + 1;

Answer 40

A parameter passing mode, used in Ada language to handle "IN OUT" parameters. In CallByValueResult, the actual parameter supplied by the caller is copied into the callee's formal parameter; the function is run; and the (possibly modified) formal parameter is then copied back to the caller. This allows a function to modify the state of its caller, similar to what you get with Call By Reference. **The (semantic) differences between Call By Reference and Call By ValueResult are:** * No alias is created between the formal and actual parameters. If lexical scoping is used, the difference can be apparent.
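
The aliasing difference shows up when the same variable is passed for two parameters. The sketch below simulates both modes, assuming the caller's variables live in a dictionary `env`; the function names are hypothetical.

```python
def increment_both_by_reference(env, name_a, name_b):
    # Formals are aliases: every read and write goes to the caller's variable.
    env[name_a] = env[name_a] + 1
    env[name_b] = env[name_b] + 1


def increment_both_by_value_result(env, name_a, name_b):
    a, b = env[name_a], env[name_b]  # copy in
    a = a + 1
    b = b + 1
    env[name_a] = a                  # copy out
    env[name_b] = b


ref_env = {"x": 5}
increment_both_by_reference(ref_env, "x", "x")

vr_env = {"x": 5}
increment_both_by_value_result(vr_env, "x", "x")
```

With call-by-reference `x` ends up as 7 (both increments hit the same location); with call-by-value-result it ends up as 6, because each formal was incremented from its own copy of 5.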

Answer 41

Pass-by-result is an implementation model for out-mode parameters. * No value is transmitted to the subprogram. * The corresponding formal parameter acts as a local variable * Just before control is transferred back to the caller, its value is transmitted back to the caller’s actual parameter.

Answer 42

You have to consider two things: 1. Efficiency. 2. One-way or two-way. These two are in conflict with one another! * Good programming → limited access to variables, which means one-way whenever possible * Efficiency → pass by reference is the fastest way to pass structures of significant size * Also, functions should not allow reference parameters.

Answer 43

A function or expression is said to have a side effect if, in addition to returning a value, it also modifies some state or has an observable interaction with calling functions or the outside world. ## Footnote For example, a function might modify a global variable or static variable.
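
A small illustration, with made-up function names: both functions return the same value, but only the second one also modifies a global variable.

```python
counter = 0


def pure_add(a, b):
    return a + b  # no side effect: only returns a value


def add_and_count(a, b):
    global counter
    counter += 1  # side effect: modifies state outside the function
    return a + b


total = add_and_count(1, 2)
```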

Answer 44

Sequence control: the control of the order of execution of the operations both primitive and user defined. * Implicit: * Determined by the order of the statements in the source program or by the built-in execution model. * Explicit: * The programmer uses statements to change the order of execution (e.g. uses If statement).

Answer 45

1. Prefix: **!**x 2. Infix: x **+** y 3. Postfix: x**++**

Answer 46

* Counter-controlled iterators. (A counter (int i; i …)) * Logical-test iterators. (A logical expression (true)) * Recursion.

Answer 47

For-loops: * Controlled by loop variable of scalar type with bounds and increment size. * Scope of loop variable? * Extent beyond loop? * Within loop? * When are loop parameters calculated? * Once at start. * At beginning of each pass.

Answer 48

* While-loops * Test performed before entry to loop * **repeat**…**until** and **do**…**while** * Test performed at end of loop * Loop always executed at least once * Design Issues: 1. Pre-test or post-test? 2. Should this be a special case of the counting loop statement (or a separate statement)?

Answer 49

The different phases can be seen as different transformation steps to transform source code into object code. The different phases correspond roughly to the different parts of the language specification: * Syntax analysis → Syntax * Contextual analysis → Contextual constraints * Code generation → Semantics

Answer 50

A multi-pass compiler makes several passes over the program. The output of a preceding phase is stored in a data structure and used by subsequent phases. (Mention something about:) Issues in language design. Issues in code generation.

Answer 51

A single-pass compiler is a compiler that passes through the parts of each compilation unit only once, immediately translating each part into its final machine code. This is in contrast to a multi-pass compiler, which converts the program into one or more intermediate representations in steps between source code and machine code, and which reprocesses the entire compilation unit in each sequential pass. * An issue with single-pass compilers is that identifiers have to be declared before they are used, since the compiler makes only one pass and reads the code sequentially from start to end.

Answer 52

An Informal Definition of the ac Language * ac: adding calculator. * Types: * integer. * float: allows 5 fractional digits after the decimal point. * Automatic type conversion from integer to float. * Keywords: * f: float. * i: integer. * p: print. * Variables: * 23 names from lowercase Roman alphabet except the three reserved keywords f, i, and p. * Target of translation: dc (desk calculator) * Reverse Polish notation (RPN).

Answer 53

By using a CFG or a parse tree.

Answer 54

Contextual analysis: * Scope checking: verify that all applied occurrences of identifiers are declared * Type checking: verify that all operations in the program are used according to their type rules.

Answer 55

Example: The ac language offers two types: integer and float, and all identifiers must be type-declared in a program before they can be used. The type checking process walks the AST bottom-up from its leaves towards the root and checks that it corresponds with the above rules.
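
A rough sketch of this bottom-up check, assuming AST nodes are tuples such as `("+", left, right)`, `("id", name)`, `("inum", text)` and `("fnum", text)`. This node layout and the function name `type_of` are assumptions for illustration, not the textbook's representation.

```python
def type_of(node, symbols):
    """Return the ac type of an AST node; children are typed before the parent."""
    kind = node[0]
    if kind == "inum":
        return "integer"
    if kind == "fnum":
        return "float"
    if kind == "id":
        name = node[1]
        if name not in symbols:
            raise TypeError(f"{name} is used before it is declared")
        return symbols[name]
    if kind in ("+", "-"):
        left = type_of(node[1], symbols)
        right = type_of(node[2], symbols)
        # Automatic conversion: integer is widened to float when the types differ.
        return "float" if "float" in (left, right) else "integer"
    raise ValueError(f"unknown node kind {kind!r}")


symbols = {"a": "integer", "b": "float"}
int_sum = type_of(("+", ("id", "a"), ("inum", "5")), symbols)
mixed_sum = type_of(("+", ("id", "a"), ("id", "b")), symbols)
```

Here `int_sum` is `"integer"` and `mixed_sum` is `"float"`; an undeclared identifier raises `TypeError`.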

Answer 56

Visitor approach * GOF * The Gang of Four defines the Visitor as: Represent an operation to be performed on elements of an object structure. Visitor lets you define a new operation without changing the classes of the elements on which it operates. * Using static overloading * Reflective * By using reflection, we can view node types on a per-visitor basis and avoid the need to specify visit methods for every node type. Reflection is a programming language’s ability to inspect, reason about, manipulate, and act upon elements of the language, such as object types. The method visit(AbstractNode n) does not call n.accept(this) to perform the second dispatch. Instead, the dispatch method is invoked to determine the best match for visiting the supplied node n. * (dynamic) * (SableCC style)

Answer 57

Since it is OO you probably have an abstract node that all other nodes inherit from. ## Footnote Create an abstract method for how to handle the node. Then override this method in every node subclass.

Answer 58

Mutual recursion is a form of recursion where objects are defined in terms of each other. ## Footnote Example: A forest is a list of trees. A tree is a value and a forest. forest: [tree[1], ... , tree[i]] tree: value forest. forest and tree are mutually recursive data structures since each contains the other.
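
The forest/tree definition can be sketched with two functions that call each other. Representing a tree as a `(value, forest)` tuple and a forest as a list is an assumption made for this example.

```python
def count_tree(tree):
    """A tree is a value plus a forest of subtrees."""
    value, forest = tree
    return 1 + count_forest(forest)


def count_forest(forest):
    """A forest is a list of trees."""
    return sum(count_tree(tree) for tree in forest)


leaf_a = ("a", [])
leaf_b = ("b", [])
root = ("root", [leaf_a, ("mid", [leaf_b])])
node_count = count_tree(root)
```

`count_tree` is defined in terms of `count_forest` and vice versa; for the tree above the count is 4.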

Answer 59

Lexical scoping (a.k.a. static scoping) is a convention used in many programming languages that sets the scope of a variable so that it may only be referenced from within the block of code in which it is defined. ## Footnote The scope is determined when the code is compiled. A variable declared in this fashion is sometimes called a *private* variable.

Answer 60

If you have a production rule: stmt → stmt expr then you have left recursion, since the nonterminal being defined (stmt) is also the first (leftmost) symbol on the right-hand side. This is a problem for top-down parsers such as recursive descent: the parser re-enters stmt without consuming any input, so it recurses infinitely and never gets to expr.
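
A small demonstration of the problem and one common fix. The card gives no base alternative for stmt, so this sketch assumes stmt → stmt expr | expr and rewrites it as the loop stmt → expr expr\*. Tokens are the literal string `"expr"` purely for illustration.

```python
def parse_stmt_left_recursive(tokens, pos=0):
    # stmt -> stmt expr: the first action is to call ourselves again
    # without consuming any input, so this never terminates normally.
    pos = parse_stmt_left_recursive(tokens, pos)
    return pos + 1  # consuming expr is never reached


def left_recursive_blows_up():
    try:
        parse_stmt_left_recursive(["expr", "expr"])
    except RecursionError:
        return True
    return False


def parse_stmt_iterative(tokens):
    """stmt -> expr expr*; returns the number of tokens consumed."""
    pos = 0
    if pos >= len(tokens) or tokens[pos] != "expr":
        raise SyntaxError("expected expr")
    pos += 1
    while pos < len(tokens) and tokens[pos] == "expr":
        pos += 1
    return pos
```

The rewritten grammar accepts the same strings but always consumes a token before repeating.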

Answer 61

An LL grammar is a formal grammar that can be parsed by an LL parser, which parses the input from **_L_**eft to right and constructs a **_L_**eftmost derivation of the sentence (hence LL, compared with an LR parser, which constructs a **_R_**ightmost derivation in reverse).

Answer 62

An LL/LR parser is called an LL(k)/LR(k) parser if it uses k tokens of lookahead when parsing a sentence.

Answer 63

Lookahead establishes the maximum number of incoming tokens that a parser can use to decide which rule it should use.

Answer 64

A Look-Ahead LR (LALR) parser is a simplified version of an LR parser. * An LALR parser has more language recognition power than an LR(0) parser. * An LALR parser requires the same number of states as an LR(0) parser.

Answer 65

A regular grammar is said to be ε-free if it has no ε-productions except possibly for the production S → ε.

Answer 66

When a parent node *N* has not yet been constructed, but its children have, the children of *N* are called the *handle* of *N*.

Answer 67

Creating a parent node *N* and connecting the children in the *handle* to *N* is called *reducing* to *N*.

Answer 68

A Shift step advances in the input stream by one symbol. That shifted symbol becomes a new single-node parse tree.

Answer 69

A Simple LR or SLR parser is a type of LR parser with small parse tables and a relatively simple parser generator algorithm. ## Footnote As with other types of LR(1) parser, an SLR parser is quite efficient at finding the single correct bottom-up parse in a single left-to-right scan over the input stream, without guesswork or backtracking. The parser is mechanically generated from a formal grammar for the language.

Answer 70

Scope rules regulate visibility of identifiers. They relate every applied occurrence of an identifier to a binding occurrence.

Answer 71

**Dynamic binding:** * The method being called upon an object is looked up by name at runtime. **Static binding:** * Types of variables and expressions fixed at compilation time. Stored in the compiled program as an offset in a virtual method table ("v-table")

Answer 72

There are different kinds of Block structure. ## Footnote The C-based languages allow any compound statement (a statement sequence surrounded by matched braces) to have declarations and thereby define a new scope. Such compound statements are called blocks.

Answer 73

The binding is invisible to the programmer. C# has implicit binding since you just say y = 2 and the compiler does the binding behind the scenes.

Answer 74

In explicit binding the programmer has to specify where to bind the value. C uses explicit binding with malloc, where you have to free the space when you no longer use the variable.

Answer 75

* A shift-reduce parser scans and parses the input text in one forward pass over the text, without backing up. (That forward direction is generally left-to-right within a line, and top-to-bottom for multi-line inputs.) * The parser builds up the parse tree incrementally, bottom up, and left to right, without guessing or backtracking. * At every point in this pass, the parser has accumulated a list of subtrees or phrases of the input text that have been already parsed. * Those subtrees are not yet joined together because the parser has not yet reached the right end of the syntax pattern that will combine them.
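
A toy sketch of the shift and reduce steps, assuming the grammar E → E '+' 'n' | 'n'. The reduce decisions are hand-coded here for readability; a real LR parser would drive them from a generated parse table. The function name `shift_reduce` is illustrative.

```python
def shift_reduce(tokens):
    """Parse E -> E '+' 'n' | 'n' bottom-up and return the actions taken."""
    stack, actions = [], []
    remaining = list(tokens)
    while True:
        if stack[-3:] == ["E", "+", "n"]:
            stack[-3:] = ["E"]            # the handle E + n is on top: reduce
            actions.append("reduce E -> E + n")
        elif stack == ["n"]:
            stack[:] = ["E"]              # the handle n is on top: reduce
            actions.append("reduce E -> n")
        elif remaining:
            stack.append(remaining.pop(0))  # shift the next input symbol
            actions.append("shift")
        else:
            break
    if stack != ["E"]:
        raise SyntaxError(f"could not reduce {stack} to E")
    return actions


trace = shift_reduce(["n", "+", "n"])
```

The stack holds the subtrees parsed so far; they are joined only when the right end of a rule has been reached.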

Answer 76

A Reduce-Reduce error is caused when a grammar allows two or more different rules to be reduced at the same time, for the same token. ## Footnote When this happens, the grammar becomes ambiguous since a program can be interpreted in more than one way. This error can be caused when the same rule is reached by more than one path.

Answer 77

How to implement procedures and functions (and how to pass their parameters and return values): * Value vs. reference. * Recursion.

Answer 78

* Constant-size representation: * The representation of all values of a given type should occupy the same amount of space. * *Direct* versus *indirect* representation.

Answer 79

Is the value of a type given directly or indirectly (pointers)

Answer 80

* Static arrays: * their size (number of elements) is known at compile time. * Dynamic arrays: * their size can not be known at compile time because the number of elements may vary at run-time.

Answer 81

* At compile time, data is bound to a fixed location in memory. The compiler can: * Compute exactly how much memory is needed for globals. * Allocate memory at a fixed position for each global variable.

Answer 82

* Procedure calls and their activations are managed by means of stack memory allocation. * It works in last-in-first-out (LIFO) method. * The local variables “live” as long as the procedure they are declared in. * Is very useful for recursive procedure calls.

Answer 83

* Local variables are allocated and de-allocated only at runtime. * Both stack and heap can grow and shrink dynamically and unexpectedly.

Answer 84

A display generalizes our use of a frame pointer. Rather than maintaining a single register, we maintain a set of registers which comprise the display.

Answer 85

The heap is memory set aside for dynamic allocation. ## Footnote Unlike the stack, there's no enforced pattern to the allocation and deallocation of blocks from the heap; you can allocate a block at any time and free it at any time. This makes it much more complex to keep track of which parts of the heap are allocated or free at any given time; there are many custom heap allocators available to tune heap performance for different usage patterns.

Answer 86

A record is associated with each node in the heap. ## Footnote The record for node N indicates how many other nodes or roots point to N.

Answer 87

Usually invoked when a user’s request for memory fails because the free list is exhausted. ## Footnote The garbage collector visits all live nodes, and returns all other memory to the free list. If sufficient memory has been recovered from this process, the user’s request for memory is satisfied.

Answer 88

As a collection algorithm, reference counting tracks, for each object, a count of the number of references to it held by other objects. If an object's reference count reaches zero, the object has become inaccessible, and can be destroyed.
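
A simplified sketch of reference counting, with hypothetical names (`Node`, `link`, `add_ref`, `release`). It also shows the well-known limitation that two objects referring to each other never reach a count of zero.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.ref_count = 0
        self.children = []


freed = []  # names of destroyed nodes, in order


def add_ref(node):
    node.ref_count += 1


def release(node):
    node.ref_count -= 1
    if node.ref_count == 0:
        freed.append(node.name)
        for child in node.children:
            release(child)  # the dead node no longer references its children
        node.children = []


def link(parent, child):
    parent.children.append(child)
    add_ref(child)


a, b = Node("a"), Node("b")
add_ref(a)   # a root variable points to a
link(a, b)   # a -> b
release(a)   # the root drops a: a and then b are freed

c, d = Node("c"), Node("d")
add_ref(c)   # a root variable points to c
link(c, d)
link(d, c)   # cycle: c -> d -> c
release(c)   # the root drops c, but the cycle keeps both counts above zero
```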

Answer 89

* Start from a set of roots, and traverse all of the reachable memory-allocated objects, copying them from one half of memory into the other half. * Old space and new space. * When we copy the reachable data, we compact it so that it is in a contiguous chunk. * A contiguous area of memory in new space can quickly and easily be allocated and freed.
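
A minimal sketch of the copying step, assuming old space is a dictionary from address to `(value, [pointer addresses])` and new space is a list whose indices act as the new addresses. This layout and the name `copy_collect` are assumptions for illustration.

```python
def copy_collect(from_space, roots):
    """Copy everything reachable from roots into a compact new space."""
    to_space = []     # contiguous: objects are appended one after another
    forwarding = {}   # old address -> new address

    def copy(addr):
        if addr in forwarding:
            return forwarding[addr]      # already copied (handles cycles)
        new_addr = len(to_space)
        forwarding[addr] = new_addr
        value, pointers = from_space[addr]
        to_space.append((value, []))     # reserve the slot before recursing
        to_space[new_addr] = (value, [copy(p) for p in pointers])
        return new_addr

    new_roots = [copy(root) for root in roots]
    return to_space, new_roots


old_space = {10: ("x", [30]), 20: ("garbage", []), 30: ("y", [10])}
new_space, new_roots = copy_collect(old_space, roots=[10])
```

The unreachable object at address 20 is simply never copied, and the survivors sit next to each other in `new_space`.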

Answer 90

Tracing garbage collection consists of determining which objects should be deallocated, by tracing which objects are reachable by a chain of references from certain "root" objects, and considering the rest as "garbage" and collecting them.

Answer 91

* A disadvantage of mark-sweep garbage collection is that it introduces very large system pauses. **Generational garbage collection is based on the following observations:** * Most objects die young. * Over 90% of the garbage collected in a GC cycle was created after the previous GC cycle. * If an object survives a GC cycle, the chances of it becoming garbage in the short term are low. **Algorithm:** * Consider a newly created object to be in Gen0; if it is not collected by a cycle of garbage collection, it is promoted to the next higher generation, Gen1. * If an object in Gen1 survives a GC, it is promoted to Gen2. * Lower generations are collected more often. * Higher generations are collected less often.

Answer 92

* The first tracing garbage collection algorithm * Garbage cells are allowed to build up until heap space is exhausted (i.e. a user program requests a memory allocation, but there is insufficient free space on the heap to satisfy the request.) * At this point, the mark-sweep algorithm is invoked, and garbage cells are returned to the free list. * Performed in two phases: * Mark: identifies all live cells by setting a mark bit. Live cells are cells reachable from a root. * Sweep: returns garbage cells to the free list. * Compaction: we push live cells to one end of the heap * We can add a compaction phase as shown in Fig. 12.17.
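
The two phases can be sketched as follows, assuming the heap is modelled as a dictionary from cell name to the names it points to. The model and the function name `mark_sweep` are illustrative; the optional compaction phase is left out.

```python
def mark_sweep(heap, roots):
    """Return (set of live cells, free list) for the given heap and roots."""
    # Mark: set the "mark bit" of every cell reachable from a root.
    marked = set()
    worklist = list(roots)
    while worklist:
        cell = worklist.pop()
        if cell not in marked:
            marked.add(cell)
            worklist.extend(heap[cell])

    # Sweep: every unmarked cell is garbage and goes back to the free list.
    free_list = [cell for cell in heap if cell not in marked]
    return marked, free_list


heap = {"a": ["b"], "b": [], "c": ["d"], "d": ["c"], "e": ["a"]}
live, free_list = mark_sweep(heap, roots=["a"])
```

Note that the cycle between `c` and `d` is reclaimed, since neither cell is reachable from a root.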

Answer 93

Dangling pointers and wild pointers in computer programming are pointers that do not point to a valid object of the appropriate type. These are special cases of memory safety violations. ## Footnote **Cause of dangling pointers:** In many languages (e.g., the C programming language) deleting an object from memory explicitly or by destroying the stack frame on return does not alter associated pointers.

Answer 94

If the result of a previous instruction is needed but not yet ready, the pipeline stalls.

Answer 95

Contextual constraints are those aspects of a programming language's syntax that are inexpressible by a CFG, for example scope rules and type rules.

Answer 96

An activation record is another name for Stack Frame. It's the data structure that composes a call stack. It is generally composed of: * Locals to the callee * Parameters of the callee * Return address to the caller The Call Stack is thus composed of any number of activation records that get added to the stack as new subroutines are added, and removed from the stack (usually) as they return.

Answer 97

A way of separating an algorithm from the object structure on which it operates. * Allows one to add new virtual functions to a family of classes without modifying the classes themselves. * One creates a visitor class that implements all of the appropriate specializations of the virtual function. * The visitor takes the instance reference as input, and implements the goal through double dispatch.
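
A compact sketch of double dispatch, assuming a two-node expression AST. The class and method names (`Num`, `Add`, `accept`, `visit_num`, `visit_add`) are chosen for this example.

```python
class Num:
    def __init__(self, value):
        self.value = value

    def accept(self, visitor):
        return visitor.visit_num(self)  # second dispatch: picks the node-specific method


class Add:
    def __init__(self, left, right):
        self.left = left
        self.right = right

    def accept(self, visitor):
        return visitor.visit_add(self)


class EvalVisitor:
    def visit_num(self, node):
        return node.value

    def visit_add(self, node):
        return node.left.accept(self) + node.right.accept(self)


class PrintVisitor:
    def visit_num(self, node):
        return str(node.value)

    def visit_add(self, node):
        return f"({node.left.accept(self)} + {node.right.accept(self)})"


tree = Add(Num(1), Add(Num(2), Num(3)))
value = tree.accept(EvalVisitor())
text = tree.accept(PrintVisitor())
```

Adding `PrintVisitor` required no change to `Num` or `Add`, which is the point of the pattern.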

(127 cards)