module 1 Flashcards
what are the categories of language evaluation criteria?
- readability
- writability
- reliability
- cost
what are languages developed around?
Von Neumann architecture
draw and describe the von Neumann architecture
- Data and programs stored in memory
- Memory is separate from the CPU
- Instructions and data are piped from memory to the CPU
- Basis for imperative languages
why do we need different languages?
New software development methodologies (e.g., object-oriented software development) led to new programming paradigms and, by extension, new programming languages
what are the 4 language categories?
- imperative: variables, assignment statements, iteration, scripting, visual, and object-oriented languages (Java, C, JavaScript)
- functional: main means of making computations is by applying functions to given parameters
- logic: rule-based (prolog)
- markup/programming hybrid: markup languages extended to support some programming (JSTL)
what are the tradeoffs in language design?
- reliability vs cost of execution (Java requires all references to array elements to be checked for proper indexing, which adds run-time cost; see the sketch after this list)
- readability vs writability (APL provides many powerful operators that make complex computations very compact to write but hard to read)
- writability vs reliability (C++ pointers are powerful and flexible but unreliable)
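A minimal Java sketch (assumed example, not from the cards) of the reliability-vs-execution-cost tradeoff: every array element access is range-checked at run time, so an out-of-range index is caught and reported instead of silently corrupting memory as an unchecked C/C++ pointer access might.

```java
public class BoundsCheckDemo {
    public static void main(String[] args) {
        int[] a = new int[3];
        try {
            a[5] = 42;   // each element access pays for a run-time index check
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```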
3 implementation methods
- compilation: programs are translated into machine language (large commercial applications)
- pure interpretation: programs are interpreted by another program known as an interpreter (small programs, efficiency is not concern)
- hybrid implementation systems: compromise between compilers and pure interpreters (small and medium systems)
what are the phases of the compilation process?
- lexical analysis: converts characters in the source program into lexical units
- syntax analysis: transforms lexical units into parse trees
- semantic analysis: generates intermediate code
- code generation: machine code is generated
linking and loading
process of collecting system program units and linking them to a user program
von Neumann bottleneck
- the connection speed between a computer's memory and its processor limits the speed of the computer
- program instructions can be executed faster than they can be moved from memory to the CPU
- memory and the CPU are separated in the von Neumann architecture, so the CPU must fetch data for every operation it performs
- this is a limiting factor in the speed of computers
pure interpretation
- no translation
- easier implementation of programs
- slower execution
- often requires more space
hybrid implementation systems
- the high-level language program is translated to an intermediate language that allows easy interpretation
- faster than pure interpretation
just in time implementation systems
- initially translate programs to an intermediate language
- compile the intermediate language of the subprograms into machine code when they are called
what do preprocessors do?
processes a program immediately before the program is compiled, expanding embedded preprocessor macros (commonly used to specify that code from another file is to be included)
syntax
form or structure of the expressions, statements, and program units
semantics
the meaning of the expressions, statements, and program units
sentence
string of characters over some alphabet
language
set of sentences
lexeme
lowest level syntactic unit of a language
token
category of lexemes
recognizer
reads input strings over the alphabet of the language and decides whether the input strings belong to the language
generator
device that generates sentences of a language
context-free grammar
language generators, originally meant to describe the syntax of natural languages; they also describe the syntax of most programming languages
backus-naur form
BNF is equivalent to context-free grammars
Invented by John Backus to describe the syntax of Algol 58
grammar
a finite non-empty set of rules
A start symbol is a special element of the nonterminals of a grammar
Syntactic lists are described using recursion
An abstraction (or nonterminal symbol) can have more than one RHS
ambiguous grammar
A grammar is ambiguous if and only if it generates a sentential form that has two or more distinct parse trees
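A standard illustration (assumed, not taken from the card): the grammar E → E + E | E * E | id is ambiguous, because the sentence id + id * id has two distinct leftmost derivations and therefore two distinct parse trees.

```latex
\begin{align*}
E &\Rightarrow E + E \Rightarrow \mathrm{id} + E \Rightarrow \mathrm{id} + E * E
   \Rightarrow \mathrm{id} + \mathrm{id} * E \Rightarrow \mathrm{id} + \mathrm{id} * \mathrm{id} \\
E &\Rightarrow E * E \Rightarrow E + E * E \Rightarrow \mathrm{id} + E * E
   \Rightarrow \mathrm{id} + \mathrm{id} * E \Rightarrow \mathrm{id} + \mathrm{id} * \mathrm{id}
\end{align*}
```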
attribute grammar
a context free grammar with some additions:
- For each grammar symbol x there is a set A(x) of attribute values
- Each rule has a set of functions that define certain attributes of the nonterminals in the rule
- Each rule has a (possibly empty) set of predicates to check for attribute consistency
dynamic semantics
describes the meaning of programs when they are executed; there is no single widely acceptable notation or formalism for describing semantics
operational semantics
Describe the meaning of a program by executing its statements on a machine, either simulated or actual
The change in the state of the machine (memory, registers, etc.) defines the meaning of the statement
denotational semantics
The meaning of language constructs is defined only by the values of the program’s variables
expression
expressions are decimal numbers, variables, or binary expressions with one arithmetic operator and two operands, each of which can be an expression
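Written out as grammar rules (a sketch; the nonterminal names are assumed), the card's description corresponds to:

```latex
\begin{align*}
\langle expr \rangle &\to \langle dec\_num \rangle \mid \langle var \rangle \mid \langle binary\_expr \rangle \\
\langle binary\_expr \rangle &\to \langle expr \rangle \; \langle operator \rangle \; \langle expr \rangle
\end{align*}
```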
axiomatic semantics
Axioms or inference rules are defined for each statement type in the language (to allow transformations of logic expressions into more formal logic expressions)
weakest precondition
the least restrictive precondition that will guarantee the postcondition
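A small worked example (assumed, in the usual axiomatic-semantics style): for the assignment a = b + 1 with postcondition {a > 1}, substituting the right-hand side for a in the postcondition yields the weakest precondition.

```latex
\[
  wp(\,a = b + 1,\ \{a > 1\}\,) \;=\; \{\, b + 1 > 1 \,\} \;=\; \{\, b > 0 \,\}
\]
```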
loop invariant
a property of a program loop that is true before (and after) each iteration
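A minimal Java sketch (assumed example) of a loop invariant: the assertion below holds before every iteration (and after the loop finishes), which is exactly what makes it an invariant. Run with `java -ea` so assertions are enabled.

```java
public class InvariantDemo {
    public static void main(String[] args) {
        int n = 10;
        int sum = 0;
        for (int i = 0; i <= n; i++) {
            // invariant: sum == 0 + 1 + ... + (i - 1) == i * (i - 1) / 2
            assert sum == i * (i - 1) / 2;
            sum += i;
        }
        System.out.println(sum);   // 55
    }
}
```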
benefits of using BNF
provides a clear and concise syntax description
The parser can be based directly on the BNF
Parsers based on BNF are easy to maintain
2 parts of syntax analysis
- lexical analyzer
- syntax analyzer (parser)
why are lexical and syntax analysis separated?
- simplicity (separation simplifies the parser)
- efficiency (separation allows the lexical analyzer to be optimized)
- portability (the lexical analyzer may be platform dependent, while the parser is portable)
what is a lexical analyzer
- a pattern matcher for character strings
- a “front-end” for the parser
- Identifies substrings of the source program that belong together - lexemes
lexer responsibilities
- Tokenizing source
- Removing comments
- (Often) dealing with pragmas (i.e., significant comments)
- Saving text of identifiers, numbers, strings
- Saving source locations (file, line, column) for error messages
dynamic scoping
bindings depend on the flow of execution at run time
a name refers to the most recent declaration encountered along the chain of active subprogram calls
static scoping
bindings are based on purely textual rules
a name refers to the declaration in the nearest lexically enclosing scope
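A minimal Java sketch (assumed example) contrasting the two scoping cards above. Java is statically scoped, so sub2 resolves x to the nearest lexically enclosing declaration (the class-level x) and prints 1; under dynamic scoping, sub2 would instead see the x declared in the most recently called, still-active subprogram (sub1) and print 2.

```java
public class ScopeDemo {
    static int x = 1;                     // outer (class-level) declaration

    static void sub1() {
        int x = 2;                        // local x, visible only inside sub1
        System.out.println("sub1 sees x = " + x);
        sub2();                           // dynamic scoping would expose this x to sub2
    }

    static void sub2() {
        System.out.println("sub2 sees x = " + x);   // static scoping: prints 1
    }

    public static void main(String[] args) {
        sub1();
    }
}
```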
the longest possible token rule
Nearly universal rule: always take the longest possible token from the input
you return only when the next character can’t be used to continue the current token
what are the 3 ways scanners are built
- Ad-hoc (fastest and most compact)
- Semi-mechanical pure DFA (usually as nested case statements)
- Table-driven DFA (produced by ANTLR)
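A minimal ad-hoc lexer sketch in Java (assumed example): a pattern matcher that splits a source string into lexemes and token categories, always taking the longest possible token — it keeps consuming characters until the next one can no longer extend the current token.

```java
import java.util.ArrayList;
import java.util.List;

public class TinyLexer {
    record Token(String kind, String lexeme) {}

    static List<Token> lex(String src) {
        List<Token> tokens = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (Character.isWhitespace(c)) {                 // skip white space
                i++;
            } else if (Character.isLetter(c)) {              // identifier: letters then letters/digits
                int start = i;
                while (i < src.length() && Character.isLetterOrDigit(src.charAt(i))) i++;
                tokens.add(new Token("IDENT", src.substring(start, i)));
            } else if (Character.isDigit(c)) {               // integer literal: longest run of digits
                int start = i;
                while (i < src.length() && Character.isDigit(src.charAt(i))) i++;
                tokens.add(new Token("INT_LIT", src.substring(start, i)));
            } else {                                         // anything else: single-character operator
                tokens.add(new Token("OP", String.valueOf(c)));
                i++;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints: IDENT(index) OP(=) INT_LIT(2) OP(*) IDENT(count) OP(+) INT_LIT(17) OP(;)
        for (Token t : lex("index = 2 * count + 17;")) {
            System.out.print(t.kind() + "(" + t.lexeme() + ") ");
        }
        System.out.println();
    }
}
```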
lexer rules
Lexer rules specify token definitions and more or less follow the syntax of parser rules except that lexer rules cannot have arguments, return values, or local variables.
actions
blocks of text written in the target language and enclosed in curly braces.
The recognizer triggers them according to their locations within the grammar.
token attributes
attributes include useful token properties such as the token type and text matched for a token
Actions can access these attributes via:
$label.attribute
2 categories of parsers
- top-down:
Produce the parse tree, beginning at the root
Order is that of a leftmost derivation
Traces or builds the parse tree in preorder
- bottom-up:
Produce the parse tree, beginning at the leaves
Order is that of the reverse of a rightmost derivation
what are the 2 most common top down parsers
- recursive descent
- LL parser (antlr)
2 most important classes of grammars
LL
- left to right, leftmost derivation
- top-down
- predictive
LR
- left to right, rightmost derivation in reverse
- bottom-up
- shift-reduce
pairwise disjointness
the property that the correct RHS of a nonterminal can be chosen on the basis of one token of lookahead; its absence is the inability to determine the correct RHS from one token of lookahead, which complicates top-down parsing
binding
A binding is an association between an entity and an attribute, such as between a variable and its type or value, or between an operation and a symbol
reserved word
a special word that cannot be used as a user-defined name
keyword
word that is special only in certain contexts
variable
an abstraction of a memory cell
type
determines the range of values of variables and the set of operations that are defined for values of that type; in the case of floating point, type also determines the precision
value
the contents of the location with which the variable is associated
abstract memory cell
the physical cell or collection of cells associated with a variable
static
objects are given an absolute address that is retained throughout the program’s execution
stack
objects are allocated and deallocated in last-in, first-out order, usually related to subroutine calls and returns
heap
objects allocated and deallocated at arbitrary times
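A minimal Java sketch (assumed example) mapping the three storage categories above onto familiar constructs: a static field keeps one location for the whole run, locals live on the stack for the duration of a call, and objects created with new live on the heap until the garbage collector reclaims them.

```java
public class StorageDemo {
    static int counter = 0;              // static: one absolute location, retained for the whole execution

    static void call() {
        int local = ++counter;           // stack: allocated when call() is entered, freed on return
        int[] data = new int[local];     // heap: allocated at an arbitrary time, reclaimed once unreachable
        System.out.println("call #" + local + ", array length " + data.length);
    }

    public static void main(String[] args) {
        call();
        call();
    }
}
```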
garbage collection
identifies and reclaims unreachable objects
explicit deallocation vs implicit deallocation
explicit: simple and fast, provided that the programmer can correctly identify the end of an object’s lifetime
implicit: automatic; eliminates manual deallocation errors such as dangling references and memory leaks
scope of a variable
range of statements over which it is visible
local vs nonlocal variables
local: those that are declared in that unit
nonlocal: those that are visible in the unit but not declared there
ambiguous grammar
a common warning sign: a rule whose left-hand nonterminal appears more than once on the right side (e.g., E → E + E), since such rules can allow two or more parse trees for the same sentence
elaboration
process by which declarations become active when control first enters a scope
referencing environment
collection of all names that are visible in the statement
In a static-scoped language, it is the local variables plus all of the visible variables in all of the enclosing scopes
In a dynamic-scoped language, the referencing environment is the local variables plus all visible variables in all active subprograms
how to calculate leftmost derivation?
expand the leftmost nonterminal in the sentential form at each step (for example, in <id> = <expr>, the <id> is rewritten first), and keep going until the sentential form contains only terminals
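A small worked example with an assumed toy grammar (<assign> → <id> = <expr>, <id> → a | b, <expr> → <id> + <id> | <id>); at every step the leftmost nonterminal is the one rewritten:

```latex
\begin{align*}
\langle assign \rangle
 &\Rightarrow \langle id \rangle = \langle expr \rangle
  \Rightarrow a = \langle expr \rangle
  \Rightarrow a = \langle id \rangle + \langle id \rangle
  \Rightarrow a = b + \langle id \rangle
  \Rightarrow a = b + b
\end{align*}
```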
pairwise disjointness test
for each nonterminal with more than one RHS, compare the sets of tokens each RHS can begin with; if two RHSs of the same nonterminal can begin with the same token, the grammar fails the test
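A worked example with assumed rules, stated in the usual FIRST-set form of the test (the sets of tokens that can begin each alternative RHS of a nonterminal must be pairwise disjoint):

```latex
\begin{align*}
&A \to aB \mid bAb \mid Bb, \quad B \to cB \mid d: \\
&\quad \mathrm{FIRST}(aB) = \{a\},\ \mathrm{FIRST}(bAb) = \{b\},\ \mathrm{FIRST}(Bb) = \{c, d\}
  \quad \text{(disjoint: passes)} \\
&A \to aB \mid BAb, \quad B \to aB \mid b: \\
&\quad \mathrm{FIRST}(aB) = \{a\},\ \mathrm{FIRST}(BAb) = \{a, b\}
  \quad \text{(both contain } a\text{: fails)}
\end{align*}
```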
recursive descent parser trace
the trace prints the next token and lexeme each time the lexical analyzer is called (e.g., "Next token is: # Next lexeme is a") and prints an Enter line each time a nonterminal's parsing subprogram is entered (with a matching Exit line when it returns), showing the order in which the grammar's nonterminals are expanded
3 variations of imperative language group
von Neumann, object-oriented, scripting languages
2 variations of declarative language group
functional, logic-based
difference between lifetime and scope
scope is spatial: the region of code over which the variable is visible and available for use; lifetime is temporal: the period during which the variable is bound to a particular memory cell, from when it is allocated until it is destroyed
order of programming methodologies utilized
- machine efficiency
- structured programming
- process-oriented
- object-oriented