Test 1 Flashcards
Why Study Theory of Programming Languages?
- Increased Capacity to express ideas
- Improved background for choosing languages that are appropriate
- Increased ability to learn about new languages
- Better understanding of implementation
- Better use of languages that are known
- Overall advancement of computing
What are the three Language Evaluation Criteria ?
Readability, Writability, Reliability
Readability
- How easy it is for others to read and follow development and work on it.
1. Simplicity:- # of language features to provide
- Minimal Feature Multiplicity
- Minimal Operator Loading/overloading (provides flexibility, if
used appropriately makes it easier to understand)
- Orthogonality:
- Every possible combination is meaningful combinations
- Exceptions: In c++ cant make function return array &
arrays are passed to functions by reference not value - The more orthogonal the design of a language, the fewer exceptions the
language rules require. - Too much orthogonality can lead to unnecessary complexity.
- Data Types:
- The presence of adequate facilities for defining data types and data structures
- Syntax Design:
- Effects readability -> special words & compound statements
- semantics depend on syntax **. ( statements’ appearance should indicate their purpose)
Writability
- Same as readability
- Abstraction:
- Procedure Abstraction (functions or simple procedures)
- Data Abstraction (Classes & Objects)
- Expressivity:
- How powerful & convenient to specify a computation
Reliability
- Features the program provides to make it more reliable
- All factors of Readability & Reliability
- Type Checking:
- happens at compile time or runtime
- Exception Handling:
- program deals with exceptions
- Aliasing:
- two or more distinct names used to access same memory cell
- VERY DANGEROUS
Cost
Three most important contributors to language costs:
1. Program Development 2. Maintenance 3. Reliability
Other Criteria used for Evaluation Programming Languages
- Portability:
- ease with which programs can be moved from one implementation to another
- Generality:
- applicability to a wide range of applications
- well-definedness:
- Completeness & precision of lang. definition
How are design criteria weighted differently from different perspectives?
- Language implementors:
->difficulty of
implementing the constructs and features of the language - Language Users:
- > writability first and readability later
-Language designers:
-> emphasize elegance and the ability
to attract widespread use.
What are the three Language design trade-offs?
- Reliability vs. Cost of Execution
- Readability vs. Writability
- Writability (flexibility) vs. Reliability
Reliability vs. Cost of Execution
Java demands all references to array elements be checked for proper
indexing, which leads to increased execution costs
Readability vs. Writability
APL provides many powerful operators (and a large number of new
symbols), allowing complex computations to be written in a compact program but
at the cost of poor readability
Writability vs. Reliability
C++ pointers are powerful and very flexible but are unreliable
What are the two Language Categories?
Declarative and Imperative Languages
Declarative Languages
Focus on WHAT the computer is to do
-Higher level
-In tune with the programmer’s point of
view
Imperative Languages
Focus on HOW the computer should do it
-Predominate
What are three types of Declarative Languages?
Functional:
- employ a computational model based
on the recursive definition of functions.
- FL, ML, Haskell
Dataflow:
- model computation as the flow of information (tokens)
Logic or Constraint-Based languages
- prolog - find values that satisfy certain specified relationships
What are the three types of imperative languages?
Von Neumann:
- statements (assignments in particular) that influence subsequent computation - most familiar and successful
Scripting:
- Subset of von neumann - distinguished by their emphasis on ”gluing together” components that were originally developed as independent programs
Object-oriented:
- closely related ^ - Have more structured and distributed model of memory and computation
What are the three general methods that programming languages can be implemented by?
- Compilation:
- Programs are translated into ML
- Use: large commercial applications
- Pure Interpretation:
- Programs interpreted by another program known as an interpreter
- Use: Small programs / efficiency is not an issue
- Hybrid Implementation Systems:
- A compromise between compilers and pure interpreters
- Use: Small & medium systems when efficiency is not a first concern
Compilation
- Translate high-level language to ML
- Slow translation, fast execution
- Phases:
- Lexical Analysis:
- characters to lexical units
- Syntax Analysis:
- lexical units to parse trees (syntactic
structure)
- lexical units to parse trees (syntactic
- Semantics Analysis:
- generate intermediate code
- Code Generation:
- machine code generated
- Lexical Analysis:
- Load Module:
- user and system code together
- Linking and Loading:
- Collect program units & link them to
user program
- Collect program units & link them to
von Neumann bottleneck
the primary limiting factor in computer speed
Pure Interpretation
- No translation
- Easier implementation (runtime errors immediately displayed)
- Slower execution
- Needs more space
- Rare for traditional high-level languages
- JavaScript, PHP
Hybrid Implementation Systems
- High level language –> intermediate language –> easy interpretation
- Faster than pure interpretation
-Perl programs –> partially compiled to
detect errors before interpretation
Just-in-time implementation Systems
- Translate programs to an intermediate language
- Intermediate lang –> machine code
- Machine code kept for subsequent calls
- Java Programs, .NET languages
(Delayed Compilers)
Token
- Represents a lexical unit
- a pair of a token name and attribute value
- Defined Tokens:
- String Literal
- Identifiers
- Comment
- Punctuation
- Integers, numbers
- Key words
- Plus, Minus
Lexeme
- Sequence of characters in a program that matches the pattern for a token.
- Lowest level syntactic unit
Regular Expressions
Specifies a Token Pattern
RE is always defined over a set of symbols called alphabet (Σ)
Each symbol in alphabet is a RE
epsilon (ϵ) –> empty string (not in alphabet)
What are the operators to construct more complicated RE?
- Choice | (lowest precedence)
- Concatentation –> ab == “ab”
- Repetitions (highest precedence) –> (a|b)* == ( (a|b) )*
What are some RE extentions?
r+ = rr* r* r? = (ϵ|r) r{n} = r{2} = rr {a,b,c] = a | b | c [0-9] = (0 | 1 | 2 | 3 | ... 9) [abj - oZ] = ab[j-o]Z [ˆabc] means any character except a, b, or c
^r matches a r, but only at the beginning of a line.
r$ matches a r, but only at the end of a line.
Algebraic laws for regular expressions
..
Context Free Grammars
- Specifies language syntax
- Define a class of languages called context-free
languages - Noam Chomsky in the mid-1950s
What is BNF?
Backus-Naur Form (1959)
- CFG = BNF
What are BNF Fundamentals?
Nonterminals: - act like syntactic variables - represent classes of syntactic structures
Terminals:
- Lexemes or tokens
A Rule or Production
LHS - > RHS
LHS: always a nonterminal
RHS: terminals and/or non
Grammar
a finite non-empty set of rules
Has a nonterminal as the start symbol
Derivation
-A derivation is a repeated application of rules, starting with the start symbol
and ending with a sentence (all terminal symbols)
- Any string derived from the start symbol is a sentential
form.
-
A Language
The set of sentential forms containing only terminal symbols.
Parse Tree
A labeled tree representation of a derivation that filters out the order in which productions are applied to replace nonterminals.
Gives structural representation of a program
- Interior nodes: non terminals
- Leaf node: terminals
Ambiguity
if grammar generates two or more distinct parse trees
- Problem: compilers often base the semantics of those
structures on their syntactic form.
How to solve ambiguity?
- Rewrite CFG
- Introduce rules to specify which tree is preferred
(PRECEDENCE)
EBNF
Three extentions:
- Option - Alternation - Repetition
metasymbols added ( notational tools, not terminal symbols)
EBNF: Option
[ ]
EBNF: Alternation
( * | \ | % )
EBNF: Repetition
{, }
Static Semantics
CFGs cant describe all syntax of prog. lang.
( type compatible, declaring vars before usage, wrong arguments in funct. call, redefinition of ident. )
Defines restrictions on the
structure of valid texts that are hard or impossible to
express
Attribute Grammar
Device used that describes structure of prog lang.
Have additions to CFGs
Primary Value:
- Static semantic specification - Compiler design (SS checking)
Dynamic Semantics
- Behavior of program
- Result & output expected
3 Common Methods:
- Operational
- Denotational
- Axiomatic
Operational Semantics
- executing program’s statements on a machine ( stimulated or actual)
- change of machine state –> defines statement meaning
- To use:
High-Level Lang –> virtual machine needed - Better Alternative: complete computer simulation
- Build translator ( source code –> machine code)
- build simulator
- Good if used informally
What are the uses of Operational Semantics?
- Language manuals and textbooks
- Teaching programming languages
What are the two different levels or uses of Operational Semantics?
- Natural Operational Semantics: At the highest level, the interest is in the final result of the execution of a complete program
- Structural operational semantics: At lowest level, through a complete sequence of state changes when program is executed, the precise meaning is determined
Denotational Semantics
Formal method for describing meaning of program
Most abstract semantics description method
Mapping each language construct to a math function
Defined based on state of program
State of Program –> values of all variables existing in program
What is the evaluation of Denotational Semantics?
Can be used to prove correctness of program
Provides a rigorous way to think about programs
Can be an aid to language design
Has been used in compiler generation systems
Because of its complexity, it is of little use to language
users
Axiomatic Semantics
Based on formal logic
Axioms or inference rules are defined for each
statement type in the language
The logic expressions are called assertions
Original purpose: formal program verification
If program satisfies precondition, it should always satisfy post-condition after execution
If input meets precondition, will always meet post-condition
TO USE:
- Must have a set of inference rules
What is the evaluation of Axiomatic Semantics?
Developing axioms or inference rules for all of the
statements in a language is difficult
It is a good tool for correctness proofs, and an excellent framework for reasoning about programs, but it is not as useful for language users and compiler writers
What is an Inference Rule?
A method of inferring the truth of one assertion on the basis of the values of other assertions.
What is an Axiom?
A logical statement that is assumed to be
true.
Therefore, an axiom is an inference rule without
an antecedent.