CSE134A LECTURE NOTES

April 30, 2001
 
 

ANNOUNCEMENTS

The midterm will be on Wednesday May 16.  You may use one PHP book and one MySQL book, your own personal notes, and the published lecture notes.
 
 

SEEING HOW A SELECT QUERY WILL BE EXECUTED

Use the explain SQL command to see how a query will be executed.  For example, here's how an index is used for a query involving a range restriction on the value of the date field:
mysql> explain select distinct ticker from postings where date>'2001-04-23';
+----------+-------+---------------+------+---------+------+-------+-------+
| table    | type  | possible_keys | key  | key_len | ref  | rows  | Extra |
+----------+-------+---------------+------+---------+------+-------+-------+
| postings | range | date          | date |    NULL | NULL | 11321 |       |
+----------+-------+---------------+------+---------+------+-------+-------+
Which index is used is shown by the key column.  Which value from another table will be looked up in this index is shown by the ref column.  How the index is used is shown by the type column.  The least efficient value for type is ALL, then index, then range, ref, eq_ref.

The product of rows from each table is an estimate of the number of combinations to be processed.  If only index is shown under Extra, then the the actual table does not need to be accessed at all, which is usually a major efficiency bonus.

Here's how an index is used for an equijoin:

mysql> explain select distinct ticker from postings, sorted where postings.id=sorted.k;
+----------+--------+---------------+---------+---------+----------+--------+-------------+
| table    | type   | possible_keys | key     | key_len | ref      | rows   | Extra       |
+----------+--------+---------------+---------+---------+----------+--------+-------------+
| sorted   | index  | k             | k       |       4 | NULL     | 124113 | Using index |
| postings | eq_ref | PRIMARY       | PRIMARY |       4 | sorted.k |      1 |             |
+----------+--------+---------------+---------+---------+----------+--------+-------------+
And here's how both are used together:
mysql> explain select distinct ticker from postings, sorted
            where postings.id=sorted.k and date>'2001-04-23' order by ticker;
+----------+-------+---------------+------+---------+-------------+-------+-------------+
| table    | type  | possible_keys | key  | key_len | ref         | rows  | Extra       |
+----------+-------+---------------+------+---------+-------------+-------+-------------+
| postings | range | PRIMARY,date  | date |    NULL | NULL        | 11321 |             |
| sorted   | ref   | k             | k    |       4 | postings.id |    10 | Using index |
+----------+-------+---------------+------+---------+-------------+-------+-------------+
 
 

GOOD AND BAD DATABASE DESIGN

Typically you know which columns you want to store data in, e.g. first and last name, SSN, zip code, date of message, date of userid creation, etc.  But how should you group these into tables?

The most important guideline is to avoid redundancy.  Consider this schema:

postings: first name, last name, address, zip, title, body
To reduce redundancy, we should have two tables.  We still have redundancy in the names, so we should introduce unique ids for authors:
postings: aid, title, body
  person: aid, first name, last name, address, zip
Note that we will often want unique ids for many different entities, e.g. for authors and separately for messages.  That's why the new field is named aid and not just id.

Why is redundancy bad?  There are at least four reasons:

How can we prevent these problems?
 
 

CONSTRAINTS

For any database, we usually possess some meta-knowledge about properties that the actual data should always satisfy.  For example: Ideally, we would be able to state all these constraints in some formal language.  The database server would give an error message whenever an insert or delete or update operation would cause any constraint to be violated.

Different database servers can enforce more or fewer types of constraint.  For example, MySQL can enforce the first constraint by saying that the id field is not null in the postings table.  The fourth constraint can be enforced with a primary key declaration on the pair of fields (name,address).

The second constraint is accommodated by not declaring id to be unique  for the postings table.  The third constraint is accommodated by making the postings and person tables separate.  This constraint would be violated if we used one big table.

The last constraint is an example of a functional dependency. Unlike more sophisticated database systems, MySQL has no features to enforce functional dependencies.

Guideline: Before choosing a database design, write down what constraints you know should be true.  Then select a design that enforces these constraints.
 
 

COOKIES

From www.php.net:
    int setcookie (string name [, string value [, int expire [, string path [, string domain [, int secure]]]]])


This defines a cookie to be sent along with the rest of the header information. Cookies must be sent before any other headers are sent (this is a restriction of cookies, not PHP). This requires you to place calls to this function before any <html> or <head> tags.

All the arguments except the name argument are optional. If only the name argument is present, the cookie by that name will be deleted from the remote client. You may also replace any argument with an empty string ("") in order to skip that argument. The expire and secure arguments are integers and cannot be skipped with an empty string. Use a zero (0) instead. The expire argument is a regular Unix time integer as returned by the time() or mktime() functions. The secure indicates that the cookie should only be transmitted over a secure HTTPS connection.

Common Pitfalls:


 



Copyright (c) by Charles Elkan, 2001.