CSE134A LECTURE NOTES

November 13, 2002
 
 

ANNOUNCEMENTS

We returned Project 2 in office hours, starting on Monday.  If you don't have yours yet, go to office hours tomorrow.  We're handing back detailed score sheets showing why points were lost.  See this explanation of scores to understand why you lost points.  If you still have questions, then contact the relevant TA directly.

Remember that Project 3 is due on Friday.  The next project uses XML, so today's lecture will be on that topic.
 
 

WORKING IN TEAMS AND REAL-WORLD SKILLS

I have heard from some students who are frustrated that their partners are not doing a fair share of the work.  This is an opportunity for you to develop and use real-world skills.  These skills include:

(a) interviewing potential team-mates and picking good ones;
(b) exerting leadership and getting team-mates to work hard;
(c) knowing when to contact your supervisor for help;
(d) knowing when to fire a team-mate;
(e) setting realistic expectations.

These are all very difficult, but very important.  By learning these skills now, you will do better in your career.

About expectations:  It is not reasonable to expect an A in every class you take.  You should not drop a class just because you think you will get less than a certain grade.  First, you may be wrong.  Second, you will be wasting your own time, and money from the taxpayer.  Third, your transcript will make you look like a quitter, instead of someone who completes jobs on time.

Grades are less important than genuine learning and real-world skills.  A big part of 134A is about these real-world skills, including team skills and writing skills.  (You should divide all your writing into logical paragraphs, even email messages.)

It's true that GPA is important for jobs and grad school, but interviews are even more important for jobs.  For grad school, GRE scores and letters of recommendation, taken together, are more important.  Also, good employers judge how much you know through more than just your grades.  They ask tough technical questions in interviews.  See the career service for interview prep help.

I appreciate all the effort most students are putting in.  If you are frustrated, you can take action to overcome the obstacles you face, learn new skills, and end up with a good grade for the class.

 

XML

XML is a human-readable notation for writing and exchanging structured information of all sorts.  XML stands for "eXtensible Markup Language."  Strictly speaking it is a meta-markup language, i.e. a language for defining markup languages.  It is used for "marking up" (i.e. indicating explicitly) syntax and (very slightly) for indicating semantics, i.e. meaning.  Important concepts: XML is a language for portable data.  It can also be used as the notation for a programming language, e.g. VoiceXML.

For good short XML tutorials see here.  Parts of today's lecture are based on this site.
 
 

XML SYNTAX

An XML document is a tree with exactly one root element, and no overlapping elements.  XML is case-sensitive, so <Name> and <name> are different tags.  XML is based on Unicode, so names and text can use non-Western characters.
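The tree rules above can be checked with any XML parser.  The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if text parses as one well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

nested = "<a><b>text</b><c/></a>"      # properly nested, one root
overlapping = "<a><b>text</a></b>"     # elements overlap
two_roots = "<a/><b/>"                 # more than one root element
wrong_case = "<Name>Alan</name>"       # start and end tags differ in case

results = [is_well_formed(s) for s in (nested, overlapping, two_roots, wrong_case)]
print(results)
```

Only the first string should be accepted; the other three each break one of the rules stated above.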

Start tags are written <elementname ...> and end tags are written </elementname>.  Start tags can have attributes, which have the syntax name="value".  XML defines no syntax inside attribute values, so structured information is better expressed with nested elements.  Also, each attribute name can appear at most once in a given start tag.
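As a rough illustration of these points about attributes, here is a sketch using Python's standard xml.etree.ElementTree module.  The lifespan attribute is a hypothetical example of packing structure into an attribute value; the parser hands it over as one opaque string, and the application has to split it.

```python
import xml.etree.ElementTree as ET

# Attributes arrive as a dictionary of strings.
person = ET.fromstring('<person born="1912" died="1954" id="p342"/>')
attributes = dict(person.attrib)
born = int(person.get("born"))     # the application converts types itself

# No XML-defined syntax inside a value: the parser does not split "1912-1954".
compact = ET.fromstring('<person lifespan="1912-1954"/>')
born_text, died_text = compact.get("lifespan").split("-")

# Repeating an attribute name in one start tag is a well-formedness error.
try:
    ET.fromstring('<person born="1912" born="1913"/>')
    duplicate_accepted = True
except ET.ParseError:
    duplicate_accepted = False
```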

Tags are nested, and can appear inside free text.

In free text, the special characters < and & must be written as &lt; and &amp; respectively.  Any XML parser translates these back before passing the text to any application using the parser.
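The translation of special characters can be seen in a short sketch with Python's standard library.  The escape function from xml.sax.saxutils goes in the opposite direction, preparing raw text for inclusion in a document; the note element and the sample strings are invented for this example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# The parser translates &lt; and &amp; before the application sees the text.
note = ET.fromstring("<note>if x &lt; y &amp;&amp; y &lt; z</note>")
decoded = note.text

# Going the other way: escape raw text before placing it in a document.
raw = "AT&T <rocks>"
escaped = escape(raw)
round_trip = ET.fromstring("<note>" + escaped + "</note>").text

# Unescaped special characters make the document ill-formed.
try:
    ET.fromstring("<note>AT&T</note>")
    unescaped_accepted = True
except ET.ParseError:
    unescaped_accepted = False
```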

<?xml version="1.0" encoding="ISO-8859-1" standalone="no"?>                optional XML declaration
<!DOCTYPE person SYSTEM "http://www.ucsd.edu/person.dtd">
        <person born="1912" died="1954" id="p342">
           <name>
             <first_name>Alan</first_name>
             <last_name>Turing</last_name>
           </name>
           <!-- Did the word computer scientist exist in Turing's day? -->       this is a comment
           <profession>computer scientist</profession>
           <profession>mathematician</profession>
           <profession>cryptographer</profession>
        </person>
A tag beginning <? and ending ?> is a processing instruction.  Processing instructions are considered to be markup, but not elements, so they can appear outside the root element.  Script code, e.g. PHP code written between <?php and ?>, is a special case.
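To show how an application sees the example document, here is a minimal sketch using Python's standard xml.etree.ElementTree module.  The XML declaration and DOCTYPE lines are left out because this sketch does no validation, and the variable names are chosen only for illustration.

```python
import xml.etree.ElementTree as ET

DOCUMENT = """<person born="1912" died="1954" id="p342">
  <name>
    <first_name>Alan</first_name>
    <last_name>Turing</last_name>
  </name>
  <!-- Did the word computer scientist exist in Turing's day? -->
  <profession>computer scientist</profession>
  <profession>mathematician</profession>
  <profession>cryptographer</profession>
</person>"""

person = ET.fromstring(DOCUMENT)

# Nested elements are reached by path; attributes by name.
full_name = person.findtext("name/first_name") + " " + person.findtext("name/last_name")
professions = [p.text for p in person.findall("profession")]
lifespan = int(person.get("died")) - int(person.get("born"))

# The default ElementTree parser discards comments, so only elements remain.
child_tags = [child.tag for child in person]
```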
 
 

DOCUMENT TYPE DEFINITIONS

An XML document is well-formed if it satisfies the XML syntax rules.  If it satisfies a document type definition (DTD) also, then it is valid.

A DTD specifies application-specific syntax.  It cannot specify constraints like "this piece of data is a year after 2000" or even "this piece of data is a number."  XML schemas can specify data types, but they are more complex and less widely used.
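Because a DTD cannot express such constraints, the application has to check them itself after parsing.  A minimal sketch in Python follows; the project element, the due attribute, and the function name is_year_after_2000 are all hypothetical.

```python
import xml.etree.ElementTree as ET

def is_year_after_2000(text):
    """Application-level check that a DTD cannot express."""
    try:
        return int(text) > 2000
    except ValueError:
        return False      # not a number at all

# All three documents have the same structure, so a DTD could not
# tell them apart; only the application-level check can.
good = ET.fromstring('<project due="2002"/>')
too_early = ET.fromstring('<project due="1999"/>')
not_a_number = ET.fromstring('<project due="next Friday"/>')

checks = [is_year_after_2000(e.get("due", "")) for e in (good, too_early, not_a_number)]
```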

In an XML document, the DTD to use is given by a document type declaration, which looks like a special tag, for example

<!DOCTYPE person SYSTEM "http://www.ucsd.edu/person.dtd">
In general DTDs can be thousands of lines long, but the basics are simple.  For example:
<!ELEMENT person (name, job*)>
<!ELEMENT name (first, middle?, last)>
<!ELEMENT first (#PCDATA)>
<!ELEMENT paragraph (#PCDATA | name | footnote | date)*>
<!ELEMENT image EMPTY>
#PCDATA means parsed character data.  As in any free text, the characters < and & must be written as &lt; and &amp; in parsed character data, and the parser translates them back before passing the text to the application.  If #PCDATA is one choice among others, the content of the element is said to be mixed.
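Mixed content means that text and child elements are interleaved.  The sketch below shows how one particular parser API, Python's standard xml.etree.ElementTree, exposes this: the text before the first child is in .text, and the text following each child is in that child's .tail.  The sample paragraph is invented to match the paragraph declaration above.

```python
import xml.etree.ElementTree as ET

paragraph = ET.fromstring(
    "<paragraph>Written by <name>Alan Turing</name> in <date>1936</date>.</paragraph>"
)

# Walk the mixed content in document order.
pieces = [paragraph.text]
for child in paragraph:
    pieces.append((child.tag, child.text))   # a nested element
    pieces.append(child.tail)                # free text after that element

# itertext() yields all character data, ignoring the markup.
plain_text = "".join(paragraph.itertext())
```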

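The number of appearances allowed for a nested element is indicated by a suffix: * means zero or more, ? means zero or one, and + means one or more.  With no suffix, the element must appear exactly once.  Parentheses indicate grouping, a comma means "followed by," and a vertical bar means "or."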
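Content models are essentially regular expressions over child element names.  The following sketch makes that concrete by translating two of the declarations above by hand into Python regular expressions.  It is only an illustration of the idea, not how a real validating parser works; the names CONTENT_MODELS and children_match are made up.

```python
import re
import xml.etree.ElementTree as ET

# Hand translations of two declarations.  Each child tag name is followed
# by a space so that names cannot run together.
CONTENT_MODELS = {
    "person": re.compile(r"(name )(job )*"),            # (name, job*)
    "name": re.compile(r"(first )(middle )?(last )"),   # (first, middle?, last)
}

def children_match(element):
    """Check the sequence of child tags against the element's content model."""
    model = CONTENT_MODELS[element.tag]
    sequence = "".join(child.tag + " " for child in element)
    return model.fullmatch(sequence) is not None

ok_name = ET.fromstring("<name><first>Alan</first><last>Turing</last></name>")
with_middle = ET.fromstring(
    "<name><first>Alan</first><middle>Mathison</middle><last>Turing</last></name>"
)
wrong_order = ET.fromstring("<name><last>Turing</last><first>Alan</first></name>")
no_jobs = ET.fromstring("<person><name/></person>")
missing_name = ET.fromstring("<person><job>mathematician</job></person>")
```

Note that children_match looks at only one level of the tree; a full check would apply it to every element in the document.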
 
 



Copyright (c) by Charles Elkan, 2002.