The relational algebra is a formal query language for the relational database model. Each operator in the relational algebra accepts relation instances as input and returns one as output.

In addition to the standard set operations of union, intersection, set difference, and cross product, relational algebra contains these operators:

  • σ (sigma) The selection operator. This operator selects rows from a relation based on a test criterion. For example,σage > 18(X) returns all rows from X whose age attribute is greater than three.
  • π (pi) The projection operator. This operator selects columns from a relation. πuid,name(X) returns a relation instance whose rows consist of the uid and name fields in each row of X.
  • (HTML has no symbol for this, looks like a bowtie) the join operator. A join with join condition c between sets R and S is equivalent to σc(R × S).

A collection of operations to manipulate relations using the relational database model. Since instances of a relation, or relation states, consist of sets of n-tuples, relational algebra consists of two kinds of operations: set operations and relational operations. Set operations used in relational algebra are the same as those found in set theory, I will detail only relational operations here.

    Select -- indicated by a lower-case sigma.
    This operation selects rows (instances of a relation) based on a predicate (condition). The format of the select operator is as follows:

    σ <predicate> (<relation>)

    Select returns a list of tuples that have the same attribute as the relation on which the select was performed.

    Project -- indicated by pi.
    This operation selects columns of a table (attributes) with no condition to be met. The format for the project operator is as follows:

    π <attribute1, attribute2, attribute n> (<relation>)

    Join -- indicated by the bowtie symbol. Why a bowtie? We may never know...
    This operation combines tables related through a foreign key. This operation is equivalent to performing a select on the cross-product of tables but it's now combined into one handy-dandy operation for your querying pleasure. As for format:

    <relation> (bowtie)<predicate> <relation>

    A special kind of join is known as Equijoin, or a natural join. This operation is indicated by an asterisk, an joins tables over the equality of a foreign key of one tuple with the primary key of another. Format:

    <relation> *<key><trelation>

    Division - Indicated by the division symbol. Since I'm still not sure myself what this thing actually does, I only know that it's necessary in some cases, I'll use the formal definition from my textbook here and let you figure things out (see the bottom of the writeup for the source):

    In general, the division operation is applied to two relations, R(Z) / S(X), where X ⊂ Z. (X is a subset of Z, if this html character doesn't work on your browser!) Let Y = Z - X...that is, let Y be the set of attributes of R that are not attributes of S. The result of the division is a relation T(Y) that includes a tuple t if tuples tR appear in R with tR[Y] = t, and with tR[X] = tS for every tuple tS in S. This means that, for a tuple t to appear in the result T of the division, the values in t must appear in R in combination with every tuple in S.

    Function : This operations performs some function on a relation (or joined relations for that matter) such as count:

    <grouping attribute>ƒ(<function> <argument>) <relation>

    The grouping attribute here is optional. If used, the function will return a set of relations: R(<grouping attribute>, <result>), otherwise R will contain only the result(s).

These operations in combination with standard set operations allow us to implement complex database queries and make life all the more interesting.


Source: Fundamentals of Database Systems by Elmasri and Navathe (there is a nice node on this text, you should check it out), and the skills and abilities I am acquiring as a result of Introduction to Database Systems at my University.

Expanding on psydereal's explanations above...

Joins are a powerful part of relational algebra in database theory.

Cartesian Product (R1 × R2)

Technically, Cartesian Products are not considered true joins, but they are similar enough to warrant mentioning. The Cartesian product concatenates every tuple in the first relation with every tuple in the second relation. This includes illogical concatenations, limiting the usefulness of a raw Cartesian Product. However, the Cartesian Product is the primitive from which more useful joins are born.

Theta join (R1 |X|predicate R2)

The tuples in a Cartesian product of two relations satisfying some predicate making use of any comparison sign (= ≠ < > ≤ ≥). This is the type of join that is signified with the bow tie shape that cannot currently be represented in HTML.

Example:

NoderInfo |X|NoderInfo.XP < NoderWriteups.totalRep NoderWriteups

would give you all the NoderInfo and NoderWriteups table information for noders with less XP than the sum total of their writeup reputations.

Equijoin

An equijoin of a special case of the Theta join where the only comparison used in the predicate is =.

Natural join (R1 |X| R2)

A natural join is a special case of the Equijoin, where the two relations already have at least one field in common. The relations are joined, and redundant fields are ignored. Because a natural join means that there are common fields, the predicate is not explicitly stated, unlike other Theta joins.

Outer join (R1 ]X| R2)

Tuples from R1 that do not have matching values in the common attributes of R2 are also included in the result of the join. Missing values in the second relation are set to null.

Example:

NoderInfo ]X| NoderWriteups

would give you a list of all NoderInfo and all NoderWriteups for all noders, even if they haven't written any writeups yet.

Technically ]X| (looking sort of like a bowtie with flags on it) is a left, or natural, outer join, but |X[ and ]X[ can also be used, called a right outer join and full outer join, respectively.

Semijoin (R1 |>predicate R2)

A join containing the tuples of R1 that would be used in a theta join with R2. This can be rewritten as the projection of all attributes A in R1 of the theta join of the two relations:

πA(R1 |X|predicate R2)

Example:

NoderInfo |>NoderInfo.XP < NoderWriteups.totalRep NoderWriteups

would only give you NoderInfo for noders with less XP than the sum total of their writeup reputations.

As you can see, the semijoin symbol looks like an arrowhead pointing at the second relation, or half a bowtie.

OK, so this is a bit of a rant, the write ups to this point have all conflated (I love it when I get to use that word) relational algebra with relational databases. It's surely true that a very common use for relational algebra is in database theory, but they're so much more than that.

This explanation could get technical but, bear with me and I'll keep it as simple as possible, but no simpler (thanks Einstein). So what is relational algebra? It's an algebra (I'll use the first reference there as it's good enough) for relations. I'll restate here what I need, though.

An algebra is a set of things, with rules for putting things together in order to make meaningful sentences, and a set of transformation rules that transform sentences into other sentences. That's not really helpful, so let me try an example and focus on simple arithmetic. Let my set of things be all natural numbers and the plus operator. Also, let my rules for meaningful sentences be that a meaningful sentence is one of

  • number
  • number + number
This is pretty simple but good enough to illustrate the point, it means that "5" is a meaningful sentence, "2 + 2" is a meaningful sentence, but "+" and "2 2" are not meaningful sentences. Finally, a transformation rule (and I'll pick a really easy one):
  • if n is a number, then n + 0 is n

What we see from the definition is that algebras are, really, very general. There's a wonderful paper out there "Simple Word Problems in Universal Algebras" by Knuth and Bendix which is anything but simple but it gives you a hint of just how far algebras can go.

A relation is a set of ordered pairs of things. All this means is that one thing can be connected to another thing by means of a relation. For example, if my relation is childOf then "(me, Mom)" and "(me, Dad)" are elements of the set defined by childOf. Ordering is important (Mom, me) would not be an element of childOf, but could be an element of a different relation called parentOf. Relations can have all sorts of properties that we use as short hand notations to express certain ideas. But the most general definition is as above. Oh, if you're worried about relations of 3 or more things I can nest my relations, so if I want 3 things to be related I can just write "(a, (b, c))" and if you just thought wait, that looks like an S-expression except for the use of comma instead of period, you'd be right; S-expressions represent a certain type of relation. But there are also other kinds of relations and, indeed, the transformation rules of an algebra form one or more relations.

At this point you can see that relations and algebras are really very general concepts. Certainly Codd's paper from 1970 defined one particular set of things, and relations between those things and, thus, defined a relational algebra. But there are other relational algebras out there. The first one that comes to my mind was an unnamed, I think, relational algebra that was created in order to provide formal semantics to the Z formal method. For various reasons, I needed to understand that relational algebra and ended up in a one day class where it was explained to me. I still have, and treasure, the materials and notes I took that day. This relational algebra is wonderfully expressive and based on a branch of mathematics known as category theory. That relational algebra is also incredibly deep and understanding it is no easy matter. All I want to add with respect to it is that I was told that its creator, when the project was over is reported to have said "That's enough of this practical stuff, I'm off to do some theory.

So, to finish my rant. Yes, relational algebras are often used in database theory, but they can be used for so much more.

Log in or register to write something here or to contact authors.