CSE134A LECTURE NOTES

April 18, 2001
 
 

ANNOUNCEMENTS

Check the Discus message board for detailed instructions for submitting the first project on Monday.
 
 

SUMMARIES IN "SELECT" QUERIES

Here is a sequence of select queries illustrating how to generate reports about the information in a database.
select count(*) from postings;

select count(*) from postings where author = 'Smith';

select max(date) from postings where author = 'Smith';

select distinct author from postings order by author;

select count(distinct author) from postings;

select dayofweek(date) as d, count(*) from postings group by d;

The last example above reports how many messages are in the database from each of the seven days of the week.

Note that distinct is a keyword, written without parentheses, but dayofweek() is a builtin function.  Other builtin group-based functions include min, max, sum, and avg.
 
 

A "SELECT" QUERY PUZZLE

Here is the complex query I ended with on Monday.  What does it do?

        select author, count(*) as c,
           count(*) * ( timestamp() - timestamp(max(date)) ) /
                      ( timestamp(max(date)) - timestamp('2000-11-11') ) as n
           from postings
           where date > '2000-11-11' and c > 5
           group by author
           order by n desc

Answer: Suppose that messages are downloaded occasionally from another server.  This query ranks authors by the expected number of messages not yet downloaded.

When you are doing a project using a database system, do use complex queries.  Do not write your own code to do work that the server would do for you.
 
 

JOIN QUERIES

Suppose that we have information about messages in two different tables.  We need to use both to decide which messages to display.  For example:
select postings.*, scores.score from postings, scores
   where postings.id=scores.id
   order by score
   limit 10;
This type of query is called a "join" query.

Notice that the join is specified in the where clause, just like a logical condition on values in a single record.  You can do a join of three or more tables.  The tables used must be specified explicitly in the from clause.  Different tables may have different columns with the same names.

Executing join queries can be very inefficient.  In the worst case, for each record in the first table, the whole second table must be scanned.  Database systems are very good, but not perfect, at executing queries in clever ways to make them efficient.
 
 

UPDATING TABLES

So far we seen how to do queries but not inserts, changes, or deletes.
delete from postings where date < '2000-11-11';
update postings set author = 'John Smith' where author = 'Smith';
insert postings (author, date) values('Brown',curdate());
Syntax note: curdate() is a builtin SQL function, not a PHP function.
Semantics note: the where clause is optional in delete and update commands.  If you leave it out, every record will be affected.  This is almost never what you want.

Usually your PHP script will build up an SQL command by concatenating various substrings, and then send the command to be executed by the MySQL server.  For example:

$cmd = "insert postings (author, date) values('Brown'," . today() . ")";
$result = mysql_query($cmd, $link);
In the PHP code above, today() is a PHP function.

What happens when we create a new tuple and only provide values for some of its fields?  Then the database server fills in default values for the remaining fields.  The most common default value is NULL.  
 
 



Copyright (c) by Charles Elkan, 2001.