CSE 134A
Discussion section
Friday, 11/22/2002
TA: Greg Hamerly
 

Midterm Grades

The midterms were handed back today at the end of the hour. You can see the statistics at the Gradesource website. If you have any questions about the grading, please wait for the rubric to come out first. Then, if you still have a question, contact the person who graded the problem you are asking about.
 

XSL and PHP

In order to use XSL with XML, we need software that applies the XSL transform to the XML document. Such a program is called an XSLT processor, where XSLT stands for XSL Transformations.

Fortunately, the version of PHP that is on ieng9 has XSLT functionality compiled in. The XSLT functionality is made available through standard PHP functions, and is implemented by the Sablotron and expat libraries on the back end. The expat library does the actual parsing of XML 1.0 documents, while Sablotron does the XSL transformations.

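See http://www.php.net/manual/en/ref.xslt.php for a reference to the XSLT functions available in PHP. The most important functions are the ones used in the example below: xslt_create(), xslt_process(), xslt_error(), xslt_errno(), and xslt_free().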


These functions make XSLT very easy. Here is an example (that works on ieng9) of formatting a document using these functions:

File: stocks.xml
<?xml version="1.0" encoding="ISO-8859-1" ?>
<portfolio>
  <stock exchange="nyse">
    <name>zacx corp</name>
    <symbol>ZCXM</symbol>
    <price>28.875</price>
  </stock>
  <stock exchange="nasdaq">
    <name>zaffymat inc</name>
    <symbol>ZFFX</symbol>
    <price>92.250</price>
  </stock>
</portfolio> 

File: stocks.xsl
<?xml version='1.0'?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
    <table border="2" bgcolor="yellow">
      <tr>
        <th>Symbol</th>
        <th>Name</th>
        <th>Price</th>
      </tr>
      <xsl:for-each select="portfolio/stock">
      <tr>
        <td><xsl:value-of select="symbol"/></td>
        <td><xsl:value-of select="name"/></td>
        <td><xsl:value-of select="price"/></td>
      </tr>
      </xsl:for-each>
    </table>
</xsl:template>
</xsl:stylesheet>
      

File: process.php
<html>
<head><title>XSLT test</title></head>
<body bgcolor="#ffffff">

<?php
    // Allocate a new XSLT processor
    $xh = xslt_create();

    // Process the document
    if ($result = xslt_process($xh, "/SOME_PATH/stocks.xml", "/SOME_PATH/stocks.xsl")) {
        print "SUCCESS:\n<br>\n";
        print "<pre>\n";
        print($result);
        print "</pre>\n";
    } else {
        print "Sorry, could not transform the XML using the XSL. The reason is " .
             xslt_error($xh) . " and the ";
        print "error code is " . xslt_errno($xh);
    }

    xslt_free($xh);

?>

</body>
</html>
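
The xslt_* functions used above belong to the Sablotron-based extension of PHP 4, which is no longer bundled as of PHP 5. On a newer installation, the same kind of transform might be sketched with the DOMDocument and XSLTProcessor classes instead, assuming the xsl extension is enabled. The function name TransformXml is made up for illustration, and the shortened XML and XSL strings are only stand-ins for stocks.xml and stocks.xsl.

```php
<?php
// Hypothetical helper: applies the stylesheet in $xsl to the document in $xml
// and returns the result as a string (or false on failure).
function TransformXml($xml, $xsl) {
  $xmlDoc = new DOMDocument();
  $xmlDoc->loadXML($xml);
  $xslDoc = new DOMDocument();
  $xslDoc->loadXML($xsl);

  // Assumes PHP 5 or later with the xsl extension enabled
  $proc = new XSLTProcessor();
  $proc->importStylesheet($xslDoc);
  return $proc->transformToXml($xmlDoc);
}

// Shortened stand-in for stocks.xml
$xml = '<portfolio><stock exchange="nyse"><name>zacx corp</name>'
     . '<symbol>ZCXM</symbol><price>28.875</price></stock></portfolio>';

// Shortened stand-in for stocks.xsl
$xsl = '<xsl:stylesheet version="1.0"'
     . ' xmlns:xsl="http://www.w3.org/1999/XSL/Transform">'
     . '<xsl:template match="/"><table>'
     . '<xsl:for-each select="portfolio/stock">'
     . '<tr><td><xsl:value-of select="symbol"/></td>'
     . '<td><xsl:value-of select="price"/></td></tr>'
     . '</xsl:for-each></table></xsl:template></xsl:stylesheet>';

print TransformXml($xml, $xsl);
```

To work from files rather than strings, DOMDocument also has a load() method that takes a file name.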

 

"Screen scraping" with POST

For this project you must interact with the US Postal Service website to verify mailing addresses and to get the canonical form of each address. Unfortunately, as you may have figured out, the interface for the USPS website doesn't use the GET method; it uses POST. So how can we do it? Simple: we have to interact with the website using POST rather than GET. The two are very similar in concept, but POST happens to be harder to use.

The most significant difference between POST and GET is the way in which data is transferred from the client to the server. Using GET, the data is appended to the URL after a question mark ("?"). For example, this is a client request that we know uses GET:

http://www.google.com/search?q=post+php
To request this document, the client might send something like this to the server:
GET /search?q=post+php HTTP/1.0
Note that the header of the HTTP request must be followed by a blank line (that is, TWO newlines in a row) before the server will start processing. Try the following: telnet to www.google.com at port 80 (in UNIX, type "telnet www.google.com 80"), type in the above line, and then press return twice. What do you get?
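
The request line above can also be put together in code. Here is a minimal sketch; the function name BuildGetRequest is made up for illustration, and it relies on the standard PHP urlencode() function, which escapes special characters and turns spaces into "+" the same way a browser form does.

```php
<?php
// Hypothetical helper (not part of the project code): builds the text of an
// HTTP/1.0 GET request for $path with the given query parameters.
function BuildGetRequest($path, $params) {
  $pairs = array();
  foreach ($params as $name => $value) {
      // urlencode() escapes special characters and turns spaces into "+"
      $pairs[] = urlencode($name) . "=" . urlencode($value);
  }
  $query = implode("&", $pairs);
  // The request line is followed by a blank line, which ends the header
  return "GET $path?$query HTTP/1.0\r\n\r\n";
}

print BuildGetRequest("/search", array("q" => "post php"));
```

The sketch ends each line with "\r\n", which is the line ending HTTP specifies; most servers also accept a bare newline, which is why the telnet experiment works.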

Using POST, the data is not transferred from the client to the server on the URL. Instead, it's transferred in the body of the HTTP request, in much the same format as with the GET method. Suppose we want to post the following data to the USPS server:

Firm=&Urbanization=&Delivery+Address=1600+Pennsylvania+Ave+NW&City=Washington&State=DC&Zip+Code=&Submit=Process
So the client's request looks more like this (for the USPS server) using POST:
POST /cgi-bin/zip4/zip4inq2 HTTP/1.1
Host: www.usps.com
Content-type: application/x-www-form-urlencoded
Content-length: 111
Connection: close

Firm=&Urbanization=&Delivery+Address=1600+Pennsylvania+Ave+NW&City=Washington&State=DC&Zip+Code=&Submit=Process
Here is some PHP code that shows how to do this:
<?php

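function PostToHost($host, $path, $data_to_send) {
  // Open a TCP connection to the web server on port 80 (10 second timeout)
  $fp = fsockopen($host, 80, $errno, $errstr, 10);
  if (!$fp) {
      print "Could not connect to $host: $errstr ($errno)\n";
      return false;
  }
  printf("Open!\n");
  // HTTP header lines end in CRLF; a blank line separates header from body
  fputs($fp, "POST $path HTTP/1.1\r\n");
  fputs($fp, "Host: $host\r\n");
  fputs($fp, "Content-type: application/x-www-form-urlencoded\r\n");
  fputs($fp, "Content-length: ".strlen($data_to_send)."\r\n");
  fputs($fp, "Connection: close\r\n\r\n");
  // Send exactly Content-length bytes of data, with nothing extra after it
  fputs($fp, $data_to_send);
  printf("Sent!\n");
  // Read the whole response (header and body) until the server closes
  $res = "";
  while(!feof($fp)) {
      $res .= fgets($fp, 128);
  }
  printf("Done!\n");
  fclose($fp);

  return $res;
}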

$data = "Firm=&Urbanization=&Delivery+Address=1600+Pennsylvania+Ave+NW&City=Washington&State=DC&Zip+Code=&Submit=Process";

printf("Go!\n");
$x = PostToHost(
              "www.usps.com",
              "/cgi-bin/zip4/zip4inq2",
              $data
);
print "\n\n$x\n\n";
?>
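
The $data string above was typed out by hand. In the project the address comes from the user, so it has to be encoded before it is sent. Here is a minimal sketch of that step, assuming the form field names shown above; the function name BuildPostBody is made up for illustration.

```php
<?php
// Hypothetical helper: turns an array of form fields into the
// application/x-www-form-urlencoded string that goes in the POST body.
function BuildPostBody($fields) {
  $pairs = array();
  foreach ($fields as $name => $value) {
      // Both names and values are encoded; spaces become "+"
      $pairs[] = urlencode($name) . "=" . urlencode($value);
  }
  return implode("&", $pairs);
}

$body = BuildPostBody(array(
    "Firm"             => "",
    "Urbanization"     => "",
    "Delivery Address" => "1600 Pennsylvania Ave NW",
    "City"             => "Washington",
    "State"            => "DC",
    "Zip Code"         => "",
    "Submit"           => "Process"
));

print "$body\n";
// The Content-length header must give the length of the body in bytes
print "Content-length: " . strlen($body) . "\n";
```

For these fields the result is the same string as $data above, and its length is 111. The result can be passed as the third argument to PostToHost.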

      

For more information about POST, see http://www.faqts.com/knowledge_base/view.phtml/aid/12039/fid/51 or search on Google.
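
One more detail for the scraping itself: the string returned by PostToHost holds the HTTP response header followed by the HTML page. Here is a minimal sketch for separating the two, assuming the server ends its header with a blank line as HTTP requires. SplitResponse is a made-up name, and the sample response is invented for the demonstration.

```php
<?php
// Hypothetical helper: splits a raw HTTP response into header and body.
function SplitResponse($response) {
  // The header ends at the first blank line (CRLF CRLF)
  $pos = strpos($response, "\r\n\r\n");
  if ($pos === false) {
      // No blank line found: treat everything as header
      return array($response, "");
  }
  return array(substr($response, 0, $pos), substr($response, $pos + 4));
}

// Invented sample response, for demonstration only
$sample = "HTTP/1.1 200 OK\r\nContent-type: text/html\r\n\r\n<html>hello</html>";
list($header, $page) = SplitResponse($sample);
print "$page\n";
```

If the server answers an HTTP/1.1 request with "Transfer-Encoding: chunked", the body will also contain chunk sizes mixed in with the HTML. Sending the request as HTTP/1.0 is one way to avoid that.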


Prepared by Greg Hamerly -- 11/22/2002