Remember to fill-out and attach the team self-evaluation questionnaire.
We will distribute the description of the second project Wednesday this
week.
A grammar specifies (1) conceptual structure and (2) lexical detail. We are interested in understanding and representing (1), which is called "abstract" syntax because it leaves out the irrelevant details (2).
There can be many paths to meaning, and there can be many opinions on how expressions are structured, even if there is agreement on which expressions are legal.
There is less disagreement for programming languages than for natural languages, because PLs are designed, and they have unique meanings.
The structure of an expression is (almost always) a tree. The key structural relationship is what is a subexpression, i.e. child, of what. Usually the number of children is fixed and their order is significant, but sometimes subexpressions may be grouped in sets.
Parsing follows a grammar to check and discard the lexical detail of
an expression, while discovering its abstract syntactic structure.
Consider for example typical syntax for simple arithmetic expressions, including variables. We can represent the set of all these expressions as a recursive type, and therefore as an ML datatype:
Remember that you can declare a pointer-based type in C with exactly the same conceptual organization.datatype expression =
Constant of int |
Plus of expression * expression |
Minus of expression * expression |
Times of expression * expression;
With the data structure defined above, it is easy to write a recursive function that evaluates expressions:
Notice how this function follows the structure of the expression datatype exactly.fun eval (Constant n) = n
| eval (Plus (exp1,exp2) ) = (eval exp1 env) + (eval exp2 env)
| eval (Minus (exp1,exp2) ) = (eval exp1 env) - (eval exp2 env)
| eval (Times (exp1,exp2) ) = (eval exp1 env) * (eval exp2 env)
Remember the dragon book on compiler design. The dragon is "complexity of compiler design". The knight who vanquishes the dragon carries the emblem "syntax directed translation". This means first recovering abstract syntax, then doing code generation from the abstract syntax tree.
(2) Another use is as a common "interlingua" for multiple concrete syntaxes. Note that the concrete syntax of some programming languages is not strings.
Consider a web page creation program. The same functionality (i.e. semantics) can be controlled by many interfaces:
(3) Abstract syntax makes it possible to systematically consider all possible expressions in a given sublanguage, and therefore to prove that a property is true of all these expressions.a compiled language (e.g. TeX) an interpreted language (e.g. HTML) an interactive command line (e.g. vi) a graphical user interfaces (GUI) with menu choices direct manipulation a subroutine library, i.e. an API (application programming interface) Microsoft Bob
For example one can use the recursive definition above to prove a property
of all simple arithmetic expressions (i.e. of expression) by induction.
For example x[2*j+1] := sqrt(u+v) is a command with two expressions embedded in it: 2*j+1 and sqrt(u+v)
Definitions:
Variables are created and initialized by declarations. More formally,
the effect of a declaration is to produce one or more bindings between
an identifier and a value.