Relations as Finite Varieties

January 2, 2014

It is common wisdom that database field is firmly grounded in the two math disciplines: predicate logic and set theory. However, neither logic nor set theory are dominant math subjects. Just counting tags at can give rough idea of the importance of an individual math topic, and it is evident that algebraic geometry governs the math world. This field is so rich that it spawned new sub-fields, such as category theory. Furthermore, hints that category theory is more popular than either predicate logic or set theory! Some category theorists even proposed to rebuild the entire math foundation with category theory toolkit…

Does algebraic geometry offer any insights to database theory? First, we recall what basic mathematical objects of algebraic geometry are. The basic geometric object is affine variety — a system of polynomial equations. Affine variety is a set of tuples in R^n, and our focus on database applications prompts that we must narrow our scope to finite varieties, when polynomial system has finite number of roots.

OK, this abstract motivation is fine, but given a [database] relation, how do we construct a variety out of it? This is accomplished after we learn the two fundamental operations over varieties.

1. Set intersection of two varieties V and W is a set which is, again, a variety. The defining set of polynomials for V \cap W is a union of the polynomial sets defining the V and U.
2. Set union of two varieties V and W is a set which is, again, a variety. The defining set of polynomials for V \cup W is a set of all pairwise products of the polynomials from V and W.

Union and intersection of varieties is exactly what is needed for being able to exhibit a variety corresponding to a relation. Consider an unary relations with a single tuple


Geometrically, it corresponds to a single point on the x axis, so it becomes immediately obvious what equation defines it:

x - 1 = 0

Likewise, the relation


is defines by

y - 1 = 0

Next, we construct join of the two relations

[x y]
 1 1

Join is set intersection, and intersection of varieties gives us the system of defining equations

x - 1 = 0
y - 1 = 0

Let’s expand our example and add one more attribute:

[x y z]
 1 1 1

Our system of equations grows with one more constraint:

x - 1 = 0
y - 1 = 0
z - 1 = 0

Now, knowing how to construct “single tuple varieties”, we are ready to move onto relations with more than one tuple. This is accomplished via union. Consider a relation

[x y z]
 1 1 1
 2 1 1

which is a union of already familiar relation

[x y z]
 1 1 1


[x y z]
 2 1 1

To build the union of varieties we need a polynomial system defining the second variety

x - 2 = 0
y - 1 = 0
z - 1 = 0

The polynomial system for the union is

(x-1)(x-2) = 0
(y-1)(x-2) = 0
(z-1)(x-2) = 0
(x-1)(y-1) = 0
(y-1)(y-1) = 0
(z-1)(y-1) = 0
(x-1)(z-1) = 0
(y-1)(z-1) = 0
(z-1)(z-1) = 0

Wait a minute, 9(!) equations for a relation with just couple of tuples — this doesn’t look promising… The critical observation is that not all of these equations are independent. A well known method to greatly reduce polynomial system is Groebner basis. Executing the following command at Wolfram Alpha:

GroebnerBasis[{(x-1)(x-2), (y-1)(x-2), (z-1)(x-2),
(x-1)(y-1), (y-1)(y-1), (z-1)(y-1),
(x-1)(z-1), (y-1)(z-1), (z-1)(z-1)} ,{z, y, x}]

we obtain the system

x^2-3 x+2 = 0
y-1 = 0
z-1 = 0

As an afterthought, this result is obvious. The first equation constraints x to being either 1 or 2, the second equation asserts that y is equal to 1, while the third one asserts z=1. It is good to learn the general method, though.

With this technique we can proceed and find a variety corresponding to the relation

[x y z]
 1 1 1
 2 1 1
 3 2 1
 3 2 2

-6 + 11 x - 6 x^2 + x^3 = 0
-4 + 3 x - x^2 + 2 y = 0
3 - x - 3 z + x z = 0
2 - 3 z + z^2 = 0

An inquisitive reader might have already noticed functional dependency x -> y lurking in plain view. In the next installment we’ll investigate what functional dependency is from algebraic geometry perspective.


2 Responses to “Relations as Finite Varieties”

  1. Evgeny Says:

    Is this representation column-order-independent? If yes, is it possible to demonstrate so? If no, is it possible to make it column-order-independent as described for example in chapter “Symmetric Functions” here: ?

  2. I don’t understand the question. Multivariate polynomials correspond to relations in named perspective (“named” vs. “unnamed” is the Alice book terminology). In other words, attributes are referred by variable names and not by positions.

    The referred “Symmetric Functions” book section introduced a fancy unique constraint. Here is how to interpret it in the new setting. Equip the Boxes relation with 6 attributes: length, height, depth, A, B, C, where A, B, and C are functionally dependent of length, height, depth via symmetric polynomials. The uniqueness constraint says that {A, B, C} is a key, or in other words we have functional dependency {A, B, C}->{A, B, C, length, height, depth}. What a functional dependency is in geometric terms is covered here.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: