CSE130 LECTURE NOTES


November 3, 1999
 
 

ADMINISTRATION

Reminder: The midterm will be on Wednesday next week, November 10.  I am distributing sample exam questions today.  The sections this coming Friday and Monday will be devoted to midterm preparation.

I understand that the second project is harder than the first one, and that several teams would like more time.  Because of the tight schedule for the whole quarter, and out of fairness to teams who will complete the project on time, I won't extend the deadline.  However I will reduce the lateness penalty from 20% to 10% for reports submitted on Monday instead of Friday.
 
 

THE DEFINITION OF AN ADT

An abstract data type introduces a new type indirectly, by providing a set of operations that can produce and operate on a new set of values. The new set of values is characterized by specifying the relationships between the operations introduced.

All PL types are high-level since the details of how the values of the type are represented are abstracted away.  For regular types these details are taken care of by the compiler.  For ADTs, these details are given by the programmer who creates the type.  In both cases, a programmer who uses the type does not need to know the implementation details.

Formally: An ADT has two parts: a specification part and an implementation part. A specification has two parts: a signature and constraints. An implementation has two parts: a concrete type and a signature implementation.

Informally:

The specification part of an ADT is "public": it is what a user of the ADT needs to know.

The implementation part of an ADT is "private": only the implementer needs to know the details of exactly how values are represented, and how the basic operations on values are accomplished.
 
 

AN EXAMPLE ADT FOR PRIORITY QUEUES

Some PLs provide ADT facilities while others do not.  Even if a PL does not provide ADT features, a programmer can still choose to think in terms of ADTs.  Using an ADT is then a question of discipline.

This example uses ML syntax, but the ideas are applicable to any PL.  First here's a signature:

signature pqsig = sig
    type pq
    val onil: pq
    val ohd: pq -> int
    val otl: pq -> pq
    val null: pq -> bool
    val insert: int -> pq -> pq
    val sort: int list -> pq
end;
Next here is a sketch of appropriate constraints on the functions introduced in the signature:
null nil = true;
forall i:int   x:pq       null (insert i x) = false
forall i,j:int x:pq   insert(i,insert(j,x)) = insert(j,insert(i,x))
...
Of course many more equations are also true about relationships between the operations named in the signature.

One obvious concrete representation for priority queues is as unordered lists. Here's an appropriate (incomplete) implementation:

structure pqstruct : pqsig = struct
     datatype pq = wrap of int list

     fun doinsert (n: int) [] = [n]
       | doinsert n (a::x) = if n <= a then n::(a::x)
                             else a::(doinsert n x)
     val onil = (wrap [])
     fun ohd (wrap (a::x)) = ...
     fun otl (wrap (a::x)) = ...
     fun null (wrap []) = true
       | null (wrap (a::x)) = false
     fun insert n (wrap x) = wrap (doinsert n x)
     fun sort x = (wrap x)
end

Notice that the sort function is implemented as a "no-op". This implies that the ohd and otl functions must have non-trivial implementations.

Important note: The semantic part is conceptually important but programming languages typically do not require it. There are two reasons for this: (1) verifying that an implementation actually satisfies a specification is an unsolved problem in artificial intelligence, and (2) the semantics of specification languages is an open question.
 
 

PUBLIC VERSUS PRIVATE

The specification part of an ADT is "public": it is what a user of the ADT needs to know, and should be able to know.

The implementation part of an ADT is "private": only the implementer needs to know the details of exactly how values are represented, and how the basic operations on values are accomplished.

Languages have various mechanisms for making ADT function names visible.  For example in ML:

 - pqstruct.onil;
val it = wrap [] : pqstruct.pq
- open pqstruct;
- onil;
val it = wrap [] : pq
Implementation parts typically contain completely private auxiliary functions that are not declared in the public signature.

ADTs provide "representation independence" (better called implementation independence) to their consumers.  Hiding the concrete implementation of the type is crucial for this.  ML is poorly designed in that the concrete implementation is not completely hidden.  In the example above, the wrapper wrap is visible to users of the ADT.

The same signature can have multiple implementations.  These may differ, for example, on efficiency.  Remember all the data structures you know for sets.
 
 

PURE VERSUS UPDATABLE ADTS

A pure ADT is one where all operations are pure functions.  This means that operations have no side-effects.  In particular, they do not modify or update their input arguments.  They just use these arguments to generate outputs, which are fresh values of the ADT (or of other types).

Most concrete types are pure.  For example, no operation on integers actually modifies an integer.  Instead, all the operations like + produce fresh outputs.

An updatable ADT is one where some operations actually change values of the ADT.  For example, suppose we had an operation called pop that took a priority queue as argument and modified it by removing the highest-priority item.  This operation would be impure and the whole ADT would then be impure also.
 
 

PACKAGES

Abstract data types are generalized in modern languages (e.g. Ada) to so-called "packages".

A package has a public signature and a private implementation, like an ADT.  The difference is that a package can simultaneously provide many types, many specific values, and many operations.

This increases flexibility and allows mutually dependent types but loses the analogy with concrete data types.

Packages preserve the crucial ADT idea of "information-hiding."  Details that a user does not need to know are not visible to him.

Packages are best-known in Ada.  ML signatures and structures are in fact packages also.  Ada and ML both provide parametrized ADTs and packages.

One important use for packages is as units of separate compilation. All the types in a package typically provide related functionality and are implemented by the same person or team, so it is reasonable for the classes in a package to have access to each other's implementation, i.e. to all the state variables and methods of each other.
 
 

RESTRICTING VISIBILITY

In Java the programmer can choose four levels of visibility for the components (i.e the state variables and methods) of an object. public any user of the object can access the component friend any class in the same package have access protected only subclasses can access the component private only methods in this class have access.  The default is "friend".

Traditional abstract data types only allow the "public" and "private" possibilities.

The visibilities "friend" and "protected" take into account the role of classes as modules of separate implementation and compilation.
 
 



Copyright (c) by Charles Elkan, 1999.