Department of Computer Science and Engineering CSE 130
University of California at San Diego Spring 2002

The Basics of Using ML


The ML system we have is called Standard ML of New Jersey. The system is installed in the directory

/software/nonrdist/smlnj
The binary is bin/sml within that directory. The best general reference is the book Elements of ML Programming by Jeffrey Ullman. See also the ACS documentation at http://www-acs.ucsd.edu/offerings/userhelp/HTML/sml,d.html.

Launching Standard ML of New Jersey puts you into the interactive system. The top-level prompt is a hyphen (-); the secondary prompt, printed when input is incomplete, is an equals sign (=). Input to the top-level interpreter (i.e., declarations and expressions) must be terminated by a semicolon and a carriage return before the system will evaluate it. If you get the secondary prompt when you do not expect it, typing ; will often complete your input.

When input is complete, the system then prints out a response indicating the effect of the evaluation. Expressions are treated as implicit declarations of a standard variable named it. For example,

       - 3; <return>              user input appears after the prompt
       val it = 3 : int           this is the system response
This means that the value of the last top-level expression evaluated can be referred to using the name it.

Typing the interrupt character control-C will interrupt the compiler and return you to top level (unless some function is catching the Interrupt exception, which is a dangerous thing to do). Typing control-D (EOF) at top level will cause an exit to the shell, or to the parent process from which ML was run.
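As a small illustration of how it behaves across successive inputs, the following lines can be typed one at a time at the - prompt. The name n is introduced here only for illustration; the responses shown in the comments are what the interactive system is expected to print.

```sml
3 + 4;        (* response: val it = 7 : int *)
it * 2;       (* it is 7 at this point, so: val it = 14 : int *)
val n = it;   (* an explicit declaration leaves it unchanged: val n = 14 : int *)
```

Note that only expressions rebind it; a val declaration such as the last line binds its own name instead.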

To execute system commands, you must first open the OS.Process structure by typing

        - open OS.Process;
The signatures of all the functions made available by this structure are displayed. One is the function system of type string -> status, where the status result reports whether the command succeeded. This spawns a process to execute its argument string as a shell command. To find out what the current directory is, for example, evaluate the expression
        - system "pwd";
This will cause the current directory to be printed out. To change the current working directory of ML, use the function chDir from the OS.FileSys structure, of type string -> unit, whose argument should be a path name denoting a directory.

The operator use, of type string -> unit, interprets its argument as a Unix file name relative to the current directory and loads the text from that file as though it had been typed in. use should normally be executed at top level, but the loaded files can also contain calls of use to recursively load other files. So if you have ML code in file hw, you can execute it with
        - use "hw";
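The directory and shell operations described above can be combined as in the following sketch. It assumes a release of the system that provides the Standard ML Basis Library functions OS.FileSys.getDir, OS.FileSys.chDir, and OS.Process.isSuccess; older releases may differ. The names here, run, ok, and parent are chosen only for illustration.

```sml
(* Remember the starting directory so that we can return to it. *)
val here = OS.FileSys.getDir ();

(* Run a shell command and report whether it succeeded. *)
fun run cmd = OS.Process.isSuccess (OS.Process.system cmd);

val ok = run "pwd";              (* the shell prints the current directory *)

OS.FileSys.chDir "..";           (* move to the parent directory *)
val parent = OS.FileSys.getDir ();
OS.FileSys.chDir here;           (* and back to where we started *)
```

Using the fully qualified names means the sketch works whether or not OS.Process has been opened.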
If you are using multiple windows, you can edit hw in a second window. If you do not have access to multiple windows, try the following. Create a file called (say) start with the following contents:
        open OS.Process;
        fun vi f = system ("vi " ^ f);
Inside ML enter
        - use "start";
You can then edit the file hw without exiting ML by typing
        - vi "hw";
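A possible refinement, offered only as a sketch: the hypothetical helper edit below (not part of the original start file) runs the editor and then reloads the file as soon as the editor exits. It assumes that use may be called from inside a function, which Standard ML of New Jersey permits even though use is normally executed at top level.

```sml
(* Hypothetical helper: edit file f with vi, then load it once vi exits.
   The result of system is discarded; the value of the sequence is that of use f. *)
fun edit f = (OS.Process.system ("vi " ^ f); use f);
```

With this definition in start, typing edit "hw"; replaces the two separate steps vi "hw"; and use "hw";.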
The error messages produced by the compiler are not always as helpful as they should be, and there are often too many of them. The compiler attempts to recover from syntactic and type errors so that it can detect as many errors as possible during a compilation. Unfortunately, it is not very graceful in recovery, and the process can cause numerous spurious secondary error messages.

When compiling files, the error messages include a line number. For simple syntactic errors this line number is often accurate or off by just one line. For other classes of errors, including type errors, the line number may not be very useful, since it will often just indicate the end of the declaration containing the error, and this declaration can be quite large.

There are a number of different forms of type error message, and it may require some practice before you become adept at interpreting them. The most common form indicates a mismatch between the type of a function (or operator) and its argument (or operand). A representation of the offending expression is usually included, but this is an image of the internal abstract syntax for the expression and may differ significantly from the original source code. For instance, an if...then...else... expression is represented internally as a case expression over a boolean value:

case ... of true => ... | false => ...
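To make the correspondence concrete, here is a minimal sketch with illustrative names (describe and describe' are not from the course materials). The commented-out definition is the kind of code that provokes a type mismatch, because its two branches have different types; the two definitions after it are equivalent, the second being written in the case form that appears in error messages.

```sml
(* Rejected by the type checker: one branch is an int, the other a string.
   fun describe n = if n > 0 then 1 else "non-positive";
*)

(* Accepted: both branches are strings. *)
fun describe n = if n > 0 then "positive" else "non-positive";

(* The same function written as a case expression over a boolean value. *)
fun describe' n =
    case n > 0 of
        true  => "positive"
      | false => "non-positive";
```

If an error message shows a case over true and false that you never wrote, look for the corresponding if...then...else... in your source.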


Written originally by Val Donaldson.  Adapted by Charles Elkan and Greg Hamerly.