You will find the book Elements
of ML Programming (ML97 Edition) useful. This book is now
on reserve at the S&E Library, and important sections have been available
since Friday as UCSD e-reserves.
Deep structure is more important than surface appearance because knowing the structure of an expression is a first step towards knowing its meaning. A concrete syntax specification defines precisely which strings of symbols are legal expressions. For example, 0.0 may be allowed but not 0. or 00.0
A grammar specifies (1) conceptual structure and (2) lexical detail. We are interested in understanding and representing (1), which is called "abstract" syntax because it leaves out the irrelevant details (2).
There can be many paths to meaning, and there can be many opinions on how expressions are structured, even if there is agreement on which expressions are legal.
There is less disagreement for programming languages than for natural languages, because PLs are designed, and they have unique meanings.
The structure of an expression is (almost always) a tree. The key structural relationship is what is a subexpression, i.e. child, of what. Usually the number of children is fixed and their order is significant, but sometimes subexpressions may be grouped in sets.
Parsing follows a grammar to check and discard the lexical detail of
an expression, while discovering its abstract syntactic structure.
Remember the dragon book on compiler design. The dragon is "complexity of compiler design". The knight who vanquishes the dragon carries the emblem "syntax directed translation". This means first recovering abstract syntax, then doing code generation from the abstract syntax tree.
(2) Another use is as a common "interlingua" for multiple concrete syntaxes. Note that the concrete syntax of some programming languages is not strings. Consider a program for editing web pages. The same functionality (i.e. semantics) can be controlled by many interfaces:
(3) Abstract syntax makes it possible to systematically consider all possible expressions in a given sublanguage, and therefore to prove that a property is true of all these expressions.a compiled language (e.g. TeX) an interpreted language (e.g. HTML) an interactive command line (e.g. vi) a graphical user interfaces (GUI) a subroutine library, i.e. an API (application programming interface)
For example in the current project your code must enumerate all possible
expressions, to find one that is an answer. Mathematically, one can
use the recursive definition of expressions given last week to prove a
property of all simple arithmetic expressions by induction.
For example x[2*j+1] := sqrt(u+v) is a command with two expressions embedded in it: 2*j+1 and sqrt(u+v)
Definition:
Declarations are also called definitions. A declaration can be of a variable, a function, a macro, a type, etc. Remember that functions are values so declaring a function is really not so different from declaring a conventional variable. In some experimental PLs, even types and macros are themselves first-class values.
In typical languages like C, a variable declaration creates a binding between an identifier and a storage location. The declaration may or may not also initialize the variable, i.e. bind the location to a value.
The most fundamental variety of command is the assignment command.
An assignment changes the value that is bound to a storage location.
Definition: For a functional language, an environment is a mapping from names to values.
In other words, an environment is a set of bindings where each binding is an ordered pair <name, value>. The simplest way to implement an environment is to use a list of pairs. This is adequate theoretically but for practical use a more efficient data structure must be used.
Environments are ubiquitous in PLs. A PL implementation must keep track of environments both at compile-time and at run-time.
In this example three identifiers are declared at the same time:
type boolean = {false, true};In this example two identifiers are declared at the same time:
function not2 (x: boolean): boolean;
begin
if x then return false else return true;
end;