In section on Friday, Greg will talk about the syntax of C and Pascal
in some simple programs. He'll go over the tools for compiling under
Linux and Solaris (ieng9), and discuss using function pointers in general
terms. This section will be useful for doing the first assignment.
A value is a piece of data that is involved in computing. A type is a collection of values on which the same operations make sense.
For example:
Example of different operations on integers and reals:
A built-in type (or operation) is one that is supplied by the language. The opposite of built-in is programmer-defined. Typically, integers, reals, and characters are built-in, and maybe booleans and strings. For types that are not built-in, a PL often has a standard way to make them programmer-defined.
Different languages often use different names for the same primitive type. This is just a "syntactic" difference.
Sometimes there are minor semantic differences, e.g. maybe -7 mod 3 = 2 but -7 rem 3 = -1.
An implementational issue is a semantic question that is left unspecified by the definition of a PL. Implementation issues are choices that different compilers or interpreters can make differently. The most troublesome PL differences are often implementational. For example, are floating point numbers 32 bit or 64 bit?
Implementational differences can be very troublesome because they are not well-documented (like syntactic differences) and they do not have work-arounds (like documented semantic differences).
Ideally, nothing in the specification of a language should be implementation-dependent.
For example in Pascal you can write type direction = {n, e, s, w}
Note that n, e, s, and w are not characters. They are identifiers (i.e. names) which introduce brand-new values that previously did not exist in the language.
Remember, a type is a set of values on which the same operations make sense. Which operations are provided for values of type direction? Pascal provides the built-in functions succ, pred, and ord. For example:
succ(s) = w
pred(s) = e
ord(s) = 2
Using enumeration types makes programs clearer, and helps to catch errors at compile-time. For example writing s*w by mistake gives a compiler error message.
In earlier versions of C, enumeration types are not available. Instead, it's good style to write
#define n 0
#define e 1
#define s 2
#define w 3
However this doesn't allow catching errors at compile-time. Conclusion: it's good for a language to be designed so that the compiler can enforce type correctness rules.
Enumeration types don't substitute for the programmer's good sense.
He or she must still choose to only use the values of the type in sensible
ways, for example to enumerate directions in a sensible order such as clockwise.
With an "abstract data type" (we'll study these later), the programmer
defining the type can specify what the type-specific sensible operations
are, and can disallow all other operations.
There are big differences between programming languages in how a user can define his own types of compound values. Many differences are only syntactic (Pascal records are called structures in C) but still, we need some concepts that let us understand the commonalities.type point = record x, y: real end;
Our aim is to classify the wide variety of compound types. We will do
this by looking at the mathematical ways of constructing sets of compound
values. In ordinary mathematics we know about intersections, unions,
and Cartesian products. For programming we need some slightly more
complex concepts.
Mathematically, the Cartesian product of sets A and B is the set of all possible ordered pairs <x,y> where the first component x of a pair is from A and the second component y is from B. Formally, the Cartesian product is written A x B and defined to be the set { <x,y> : x in A, y in B }.
If A has n elements, i.e. its cardinality is n, and the cardinality of B is n, then the cardinality of Ax B is mn. Formally, |A x B| = |A| |B|.
Records in Pascal or structures in C are Cartesian product types. In records and structures components are given names. The benefit of these names is mostly documentation, since the components could be referred to by their numerical position: first, second, third, etc.
In ML product types are defined without giving names to components, in a more mathematical way. For example:
Note that a tuple is the same as a row in a relational database, where components are often called "fields" or "attributes".type person = string * string * int * real;