The SQL Design Patterns book has a section on Relational division. (Also available online). It mentions that

Relational Division is truly exotic operator. Search any popular database forum for problems involving Relational Division. There certainly would be a few, but chances are that they might be not recognized as such. For example, chapter 1 problem 6 is almost literal reproduction of the message posted at the newsgroup microsoft.public.sqlserver.programming. If nothing else, this section would help you to be able to quickly identify a problem as Relational Division. After all, using the appropriate names is very important for effective communication.

The apparent Relational Division rarity is very puzzling. Informally, Relational Division corresponds to the universal quantifier \forall, which should be as frequent as its existential \exists counterpart!

Set Intersection Query

Consider a set union query:

Given a family of sets, e.g {{1,3,5},{3,4,5},{5,6}} find their union

The answer is easy — {1,3,4,5,6} — but how do we express it as a query?

First, we need a schema, so we represent set of sets as a table with two columns, set id and element:


table Sets {
id integer,
element integer
}

ID ELEMENT
1 1
1 3
1 5
2 3
2 4
2 5
3 5
3 6

Then the union query is simple projection of the Sets relation to the element column.

Now flip the union to intersection:

Given a family of sets, e.g {{1,3,5},{3,4,5},{5,6}} find their intersection

and, suddenly, this query is not easy to write!

The critical observation is that there is relational division behind it. Here is informal query that we are after:

Find Elements which belong to all sets

Compare it with:

Find Elements who belong to at least one set

The set intersection query sounds almost verbatim to the the job applicants query that we studied earlier!

Therefore, we just divide our Sets relation into the set of all set IDs — {1,2,3}:

Sets / {1,2,3} = {5}

where the set of all IDs is just a projection of the Sets relation onto the ID column.

OK, this was just one more cute case for relational division. It still doesn’t look like something a database application programmer does in his everyday life.

Enterprise Application Schema

Consider a typical enterprise application schema:

Employees
Employee_ID
First_Name
Last_Name
Email
Tel_Num
Hire_Date
Job_ID
Salary
Commission_Percentage
Manager_ID
Department_ID

Jobs
Job_ID
Job_Title
Min_Salary
Max_Salary

Job_History
Employee_ID
Start_Date
End_Date
Job_ID
Department_ID

Departments
Department_ID
Department_Name
Manager_ID
Location_ID

Locations
Location_ID
Street_Add
City
Post_Code
Country_ID

Countries
Country_ID
Country_Name
Region_ID

Regions
Region_ID
Region_Name

There are zillions of things you can query in this database, but how many can you think are Relational Division? When this problem was posted on OTN, people suggested these queries:

Find employees who during their job history were tossed around all the departments

Assuming that a department could have offices in multiple cities (that is, the PK for the department table is something like department_id, location_id) then find departments with offices in all regions or countries could be another.

That’s all, two queries in this pretty modest database? Even more, those queries are not very convincing ones: for example the first one is pretty meaningless for a large corporation, as there is almost never a chance for a single person to be transferred across all the departments.

The explanation is the following. In order to have a nontrivial case of division the two relations must have some attributes in common. Otherwise, it becomes a Cartesian product, which is still formally a Relational Division, but pretty uninteresting one. On the other hand, when the two relations have an attribute in common, then it is very likely that there is referential integrity constraint. And referential constraint usually goes together with a unique key constraint on master table. Consider

Emp(empno, depno)

and

Dept(deptno, dname)

There are 2 cases. We can divide both relations as they are

Emp/Dept

or we can divide Emp by the projection of the Dept to the deptno column.

The second case is vacuous: the empno is a unique key — each employee works in no more than one department, therefore there is expected no employees who works for all the departments.

The first case is pretty shallow as well. We are looking for all the employees whose department number sets are the subsets of all the department sets corresponding to each department name. But those sets are a single element sets — sure each department name has only one department number assigned to it. Therefore, a relational division, which is a set containment join reduces to a natural join of Emp and Dept followed by projection, where we remove the surrogate key deptno!

In a word, it is one of the most distinguished relational database features — referential integrity — that undermines a possibility for a non vacuous relational division query.

4 Responses to “Why Relational Division is so uncommon?”

  1. paul c Says:

    In “Given a set of sets, e.g {{1,3,5},{3,4,5},{5,6}} find their union”, don’t you mean find the union of their elements?


  2. I changed “set” into a “family”. Union of family of sets is generalization of the union of the two sets.
    P.S. I’m not sure if this section really belongs here; perhaps it should better be moved into other article…

  3. Hans Castorp Says:

    The intersection is very easy to find: simply use the intersect operator.


  4. To clarify, we don’t have individual table per each set. We have one table with two columns: set element, and set id.


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: