Relational Model
INLS 256, Database I
A table
may represent an entity or a relationship from an entity-relationship diagram..
The
header represents the metadata about what is represented by the table.
The
rows are also called tuples, which represent an occurrence of the
relation.
The
columns are the attributes of the relation.
Each
attribute is associated with a set of values that can fill the attribute. This
is called the domain of the attribute.
The
values in a simple domain are atomic – basic elements that can’t be
broken down further.
Advantage
of domain
as a concept. Many types of database commands only make sense if the domains of
the attribute(s) being operated on is the same
A
database that supports the definition of domains does 2 things:
Composite
domain:
Cartesian product of two or more simple domains. All members of one set
combined with all members of next set.
To
fully support domains, you must:
Relational
Schemas
A relational
schema is an intensional definition of a relation that lists the attributes in
the relation, e.g. R(A1, A2, A3, . . ., An). This is also the heading of the
table.
Each
attribute in the schema is the name of a role played by a domain in the
schema.
The
extension of the relation R is the set of n-tuples, where each n-tuple
has a value for each of the attributes in the schema
In
set terminology, a relation is a subset of the Cartesian product of the domains
of the attributes that define R.
r(R) Í (dom(A1) X dom(A2) X . . .
X dom(An))
We talked about the problems of null values, with
their ambiguous interpretations:
a.
value
unknown
b.
attribute
does not apply
c.
has
no value
4. No duplicate tuples.
Qualified
Attribute Names:
In
order to specify precisely which attribute one means, you can qualify
the attribute name by prefixing the table name.
Keys:
The
primary key is a unique identifier of a tuple (record
There
may be more than one field or combination of fields that are unique identifiers.
Any combination of attributes is a candidate key if:
1.
It
is a unique identifier; and
2.
It
contains the minimum number of fields necessary to make a unique key.
One of the set of candidate keys is chosen to be the
primary key, the others are alternate keys.
Foreign Keys:
The idea of a foreign key is to identify references
to one tuple by its primary key that is made inside another tuple.
A foreign key is an attribute or combination of
attributes in relation R2 whose values must match the primary key values in R1.
In referring to the R1 tuple by its primary key in R2, you must be sure it
exists in R1.
Integrity Rules/Constraints:
These have to do with ensuring that the information
in the DB is correct for the DB and its world.
Entity Integrity: No attribute in the primary key of a base table (relation) can accept a null value (e.g., unknown, inapplicable
Referential Integrity: If base relation R2
contains a foreign key referring to the primary key of R1, then every value of
the foreign key must be equal to the primary key values of some tuple in R1, or
else every value must be null.
Specifically, what type of enforcement is needed?
1.
Restrictions
on duplicating primary key values on addition or update.
2.
Requirement
of no null values of primary key, and perhaps of foreign key, depending on
semantics of domain.
3.
What
happens on update operations – insert, delete, modify.
a.
What
happens on insert (the easy case).
i.
If
add primary key tuple. Must enforce the entity integrity constraint, and make sure
that it has a unique primary key.
ii.
If
add tuple in which a foreign key is needed, must make sure it refers to an
existing key, or is null (the referential constraint).
b.
What
happens on delete.
i.
If
you delete the primary key tuple, what should happen in the tuples in which it
is a foreign key?
1.
Cascade:
delete those tuples.
2.
Restrict:
you can delete a primary key tuple only if it isn’t a foreign key in any other
tuple.
3.
Nullify:
when you delete a primary key, all those tuples in which it was a foreign key now
get nulls (e.g. unknown) for the values of those attributes.
4.
Change:
when you delete a primary key, all those tuples in which it was a foreign key
have the values changed for those attributes to refer to some other tuple.
ii.
If
you delete a record which contains a foreign key, the primary entity can still
exist.
c.
What
happens on update (modify)?
i.
If
you update the primary key values, what happens to the foreign key values?
1.
Cascade:
when primary key value changes, all its foreign key appearances also change.
2.
Restricted:
only allow primary key update if it has no foreign keys.
3.
Nullify:
when primary key value changes, nullify all its foreign key appearances.
ii.
If
you update the tuple in which a foreign key appears, cannot change foreign key
values unless you want to refer to a new foreign entity
Mapping
between ER and EER diagrams and relational model. There are some basic steps
that make this a fairly straightforward process.
Note: If both entities have total participation, and don’t participate in any other relationship, you could merge the entities into one table.
4. 1:N binary relationships are also modeled using a foreign key. The primary key of the entity at the 1 side of the relationship is included as a foreign key in the table of the other entity. Also include any attributes of the relationship.
A category whose superclasses do not have the same
primary key requires the addition of a new key for the category, called the surrogate
key. The category table contains the surrogate key, and any attributes of
the category. The surrogate key is also added to the tables of the superclasses
as a foreign key.
Relational algebra represents the formal
underpinnings for relational theory.
Overview of the Operations:
1. select: it chooses specified tuples from a relation (a subset of rows).
2. project: this chooses specified attributes from a relation (a subset of attributes).
3. cartesian product: builds a new relation from two existing relations, where each tuple in the new relation is the concatenation of a tuple from one relation with a tuple from the other, creating all possible combinations.
R1 X R2
4. union: builds a new relation from two existing relations, where the new relation contains all the tuples appearing in either (or both) of the old relations.
R1 È R2
5. intersect: builds a new relation from two existing relations, where the new relation contains all tuples that appear in both of the old relations.
R1 Ç R2
6. difference: builds a new relation from two existing relations, where the new relation contains all the tuples that appear in the first of the old relations but not in the second.
R1 – R2
7. join: builds a new relation from two existing relations, where the new relation contains all the tuples from the product of the old relations, such that the new tuples satisfy some condition.
R1 |X| R2
8. divide: builds a new relation from two existing relations, where the one old relation is binary, and the other is unary. The new relation contains all values of one attribute of the binary relation that match all values in the unary relation.
R1 ¸ R2
Note that all these operations have the property of closure.
Details of Operations:
Select – Select a subset of the tuples in a relation. The selection condition specifies which ones will be chosen.
senrolled = capacity (COURSE)
Project – Choose a subset of the columns in a relation.
(SID, sname, bdate, major, instate?)
pstudent-name, bdate (STUDENT)
Combining Operations: You create more complex queries by doing several basic operations in a row, operating on the results; nesting operations. To do this, you may want to give the intermediate table a name.
pfname, lname, bdate sbdate>1/1/74 (STUDENT)
is the same as
YOUNG-STUDENTS <- sbdate>1/1/74 (STUDENT)
RESULT <- pfname, lname, bdate (YOUNG-STUDENTS)
Both of these give the names and birthdates of young students.
When you make a new table like this, you may also want to rename the attributes in the table, which you can also do.
RESULT (young-fname, young-lname, bdate) <- pfname, lname, bdate (YOUNG-STUDENTS)
Union Compatibility: For some set operations, the two relations must be union compatible.
Union – A UNION B. All tuples belonging to A or B.
Intersection – A INTERSECT B. All tuples belonging to both A and B
Difference – A MINUS B. All tuples in A that aren’t in B.
Cartesian Product – A TIMES B. Concatenate each tuple in A with each tuple in B.
DEFINE ALIAS A for S
means to “rename” S as A, with all the same attributes.
Join:
This is used to “connect” different tables using one or more attributes on each that are defined on the same domain.
RESULT <- A |X| x=y B
Equijoin: where the comparison is =. This is the most common join. The result will contain two identical attributes, e.g., S join T on bdate header =
(s.id#, s.name, s.bdate, t.emp#, t.name, t.bdate) where s.bdate = t.bdate
You could eliminate the duplicate columns by filtering it through a projection, and only showing one of the bdates. This is called the natural join. Its label is *.
Division: divide A by B. If A is of degree (m+n), and B is of degree (n), then the result, D, will be of degree (m).
Aggregate Functions are included for convenience – e.g., sum, min, max, avg, count.
Outer Join is used for queries where you want some information on each tuple, and some additional information if it applies. E.g., List all students names, and if they have already done their field experience, where they did it. For those tuples that the condition doesn’t apply to, that attribute will be null.
STUDENTS (ss#, sname, . . . ) S-FE (ss#, fe-loc, . . . )
RESULT <- psname, fe-loc (STUDENTS |X|SS#=SS# S-FE)
This is a left outer join.
Outer Union takes the union of relations that aren’t union compatible.