Relational algebra Guide, Meaning , Facts, Information and Description
The relational algebra is a set of operations that manipulate relations as they are defined in the relational model and as such describes part of the data manipulation aspect of this data model. Because of their algebraic properties these operations are often used in database query optimization as an intermediate representation of a query to which certain rewrite rules can be applied to obtain a more efficient version of the query.The exact set of operations may differ per definition and also depends on whether the unlabeled relational model (that uses mathematical relations) or the labeled relational model (that uses the labeled generalization of mathematical relations) is used. We will assume the labeled case here as this is the most common way the relational model is defined. That means that we assume that tuples are partial functionss from attribute names to values. The attribute a of a tuple t is denoted in this article as t(a).
|
|
|
More formally the semantics of the selection is defined as follows:
- σaθb(R) = { t : t ∈ R, t(a) θ t(b) }
- σaθv(R) = { t : t ∈ R, t(a) θ v }
|
|
More formally the semantics of the projection is defined as follows:
- πa1,..,an(R) = { t[a1,..,an] : t ∈ R }
The result of a projection πa1,..,an(R) is only defined if {a1,..,an} is a subset of the header of R.
The cartesian product (also called cross product or
cross join) is a binary operation that is very similar to the
Cartesian product in set theory. It is written as R
× S where R and S are relations. The result of
the cartesian product is the set of all combinations of tuples in
R and S. For an example consider the tables Employee
and Dept and their Cartesian product:
The Cartesian Product
|
|
|
More formally the semantics of the Cartesian product is defined as follows:
- R × S = { t ∪ s : t ∈ R, s ∈ S }
The set union is a binary operation that is written as R ∪
S and is defined as the set union in set theory. For an
example consider the tables Engineer, Manager and their
union:
The Set Union
|
|
|
Note that Harriet only appears once in the result because the result is a set.
Formally, the union is defined as follows:
- R ∪ S = { t : t ∈ R ∨ t ∈ S }
|
|
|
Formally, the semantics of the union is defined as follows:
- R - S = { t : t ∈ R, ¬ t ∈ S }
|
|
Formally the semantics of the rename operator is defined as follows:
- ρa/b(R) = { t[a/b] : t ∈ R }
These operations are enough to express all queries that can be
expressed in tuple calculus and domain calculus which is
essentially the same as first-order logic.
Next to the six basic operations some other operations are also
often included even though they can be expressed by combinations
of the basic operations. This is because they either have
interesting algebraic properties or because they can be
implemented more efficiently than their simulations. These
operations are the set intersection, the natural join and
the division.The Expressive Power
Other Operations expressible with the Basic Operations
|
|
|
Formally, the union is defined as follows:
- R ∩ S = { t : t ∈ R, t ∈ S }
- R ∩ S = R - (R - S)
The Natural Join
The natural join is a binary operation that is written as R |X| S where R and S are relations. The result of the natural join is the set of all combinations of tuples in R and S that are equal on their common attribute names. For an example consider the tables Employee and Dept and their natural join:
|
|
|
The natural join is arguably one of the most important operators since it allows the combination of relations that are associated by a foreign key. For example, in the above example there holds probably a foreign key from Employee.DeptName to Dept.DeptName and then the natural join of Employee and Dept combines all employees with their departments. Note that this works because the foreign key holds between columns with the same name. If this is not the case such as in the foreign key from Dept.manager to Emp.emp-number then we have to rename these columns before we take the natural join. Such a join is sometimes also referred to as an equijoin (see θ-join).
More formally the semantics of the natural join is defined as follows:
- R |X| S = { t ∪ s : t ∈ R, s ∈ S, fun(t ∪ s) }
The natural join can be simulated with the basic operations as follows. Assume that a1,...,an are the attribute names unique to R, b1,...,bm are the attribute names common to R and S and c1,...,cm are the attribute unique to S. Furthermore assume that the attribute names d1,...,dm are neither in R nor in S. In a first step we can now rename the common attribute names in S: : S' := ρd1/b1(...ρdm/bm( S)...) Then we take the Cartesian product and select the tuples that are to be joined: : T := σb1=d1(...σbm=dm(R × S')...) Finally we take a projection to get rid of the renamed attributes: : U := πa1,...,an,b1,...,bm,c1,...,cm(T)
|
|
|
If DBProject contains all the tasks of the Database project then the result of the division above contains exactly all the students that have completed the Database project.
More formally the semantics of the division is defined as follows:
- R ÷ S = { t[a1,...,an] : ∀ s ∈ S ( (t[a1,...,an] ∪ s) ∈ R) }
The simulation of the division with the basic operations is as follows. We assume that a1,...,an are the attribute names unique to R and b1,...,bm are the attribute names of S. In the first step we project R on its unique attribute names and construction all combinations with tuples in S:
- T := πa1,...,an(R) × S
- U := T - R
- V := πa1,...,an(U)
- W := πa1,...,an(R) - V
The Generalized Selection
The generalized selection is a unary operation that is written as σφ(R) where φ is a propositional formula that consists of atoms as allowed in the normal selection and the logical operators ∧ (and), ∨ (or) and ¬ (negation). This selection selects all those tupels in R for which φ holds. For an example consider the following tables where the first table gives the relation Person and the second the result of σAge≥30 ∧ Weight≤60(Person).
|
|
More formally the semantics of the generalized selection is defined as follows:
- σφ(R) = { t : t ∈ R, φ(t) }
The simulation of a generalized selection that is not a fundamental selection with the fundamental operators is defined by the following rules:
- σφ∧ψ(R) = σφ(R) ∩ σψ(R)
- σφ∨ψ(R) = σφ(R) ∪ σψ(R)
- σ¬φ(R) = R - σφ(R)
The θ-Join and Equijoin
If we want to combine tuples from two relations where the combination condition is not simply the equality of shared columns then it is conventient to have a more general form of join operator, which is the θ-join. The θ-join is a binary operation that is written as R |X|aθb S or R |X|aθv S where a and b are attribute names, θ a binary operation in the set {<, ≤, =, >, ≥}, v is a value constant and R and S are relations. The result if this operation consists of all combinations of tuples in R and S that satisfy the equation. The result of the θ-join is only defined if the headers of S and R are disiont, i.e., do not contain a common attribute.
The simulation of this operation in the fundamental operations is therefore as follows:
- R |X|φ S = σφ(R × S)
|
|
|
More formally the semantics of the semijoin is defined as follows:
- R |X S = { t : t ∈ R, s ∈ S, fun(t ∪ s) }
The semijoin can be simulated using the natural join as follows. Assume that a1,...,an are the attribute names of R, then it holds that:
- R |X S = πa1,..,an(R |X| S)
or full outer join ...
...
...
TQL, a relational query language draft proposal This is an Article on Relational algebra. Page Contains Information, Facts Details or Explanation Guide About Relational algebra Operations for null values
Outer Join
Left Outer Join
Right Outer Join
Operations for Domain Computations
The Aggregation Operation
The Extend Operation
Algebraic properties
Use of Algebraic Properties for Query Optimization
Related External Links
