## Relations as Finite Varieties

### January 2, 2014

It is common wisdom that database field is firmly grounded in the two math disciplines: predicate logic and set theory. However, neither logic nor set theory are dominant math subjects. Just counting tags at mathoverflow.net can give rough idea of the importance of an individual math topic, and it is evident that algebraic geometry governs the math world. This field is so rich that it spawned new sub-fields, such as category theory. Furthermore, mathoverflow.net hints that category theory is more popular than either predicate logic or set theory! Some category theorists even proposed to rebuild the entire math foundation with category theory toolkit…

Does algebraic geometry offer any insights to database theory? First, we recall what basic mathematical objects of algebraic geometry are. The basic geometric object is affine variety — a system of polynomial equations. Affine variety is a set of tuples in $R^n$, and our focus on database applications prompts that we must narrow our scope to finite varieties, when polynomial system has finite number of roots.

OK, this abstract motivation is fine, but given a [database] relation, how do we construct a variety out of it? This is accomplished after we learn the two fundamental operations over varieties.

1. Set intersection of two varieties $V$ and $W$ is a set which is, again, a variety. The defining set of polynomials for $V \cap W$ is a union of the polynomial sets defining the $V$ and $U$.
2. Set union of two varieties $V$ and $W$ is a set which is, again, a variety. The defining set of polynomials for $V \cup W$ is a set of all pairwise products of the polynomials from $V$ and $W$.

Union and intersection of varieties is exactly what is needed for being able to exhibit a variety corresponding to a relation. Consider an unary relations with a single tuple

```[x]
1
```

Geometrically, it corresponds to a single point on the x axis, so it becomes immediately obvious what equation defines it:

$x - 1 = 0$

Likewise, the relation

```[y]
1
```

is defines by

$y - 1 = 0$

Next, we construct join of the two relations

```[x y]
1 1
```

Join is set intersection, and intersection of varieties gives us the system of defining equations

$x - 1 = 0$
$y - 1 = 0$

Let’s expand our example and add one more attribute:

```[x y z]
1 1 1
```

Our system of equations grows with one more constraint:

$x - 1 = 0$
$y - 1 = 0$
$z - 1 = 0$

Now, knowing how to construct “single tuple varieties”, we are ready to move onto relations with more than one tuple. This is accomplished via union. Consider a relation

```[x y z]
1 1 1
2 1 1
```

which is a union of already familiar relation

```[x y z]
1 1 1
```

with

```[x y z]
2 1 1
```

To build the union of varieties we need a polynomial system defining the second variety

$x - 2 = 0$
$y - 1 = 0$
$z - 1 = 0$

The polynomial system for the union is

$(x-1)(x-2) = 0$
$(y-1)(x-2) = 0$
$(z-1)(x-2) = 0$
$(x-1)(y-1) = 0$
$(y-1)(y-1) = 0$
$(z-1)(y-1) = 0$
$(x-1)(z-1) = 0$
$(y-1)(z-1) = 0$
$(z-1)(z-1) = 0$

Wait a minute, 9(!) equations for a relation with just couple of tuples — this doesn’t look promising… The critical observation is that not all of these equations are independent. A well known method to greatly reduce polynomial system is Groebner basis. Executing the following command at Wolfram Alpha:

```GroebnerBasis[{(x-1)(x-2), (y-1)(x-2), (z-1)(x-2), (x-1)(y-1), (y-1)(y-1), (z-1)(y-1), (x-1)(z-1), (y-1)(z-1), (z-1)(z-1)} ,{z, y, x}] ```
we obtain the system

$x^2-3 x+2 = 0$
$y-1 = 0$
$z-1 = 0$

As an afterthought, this result is obvious. The first equation constraints `x` to being either 1 or 2, the second equation asserts that `y` is equal to 1, while the third one asserts `z=1`. It is good to learn the general method, though.

With this technique we can proceed and find a variety corresponding to the relation

```[x y z]
1 1 1
2 1 1
3 2 1
3 2 2
```

$-6 + 11 x - 6 x^2 + x^3 = 0$
$-4 + 3 x - x^2 + 2 y = 0$
$3 - x - 3 z + x z = 0$
$2 - 3 z + z^2 = 0$

An inquisitive reader might have already noticed functional dependency `x -> y` lurking in plain view. In the next installment we’ll investigate what functional dependency is from algebraic geometry perspective.