## Querying Parse Trees

### October 12, 2016

In previous blog entry we have glimpsed into new SQL Developer 4.2 feature — parse trees. Again, this is something that development tools don’t readily expose with the most likely reason being the lack of foresight what users can do with this information. Blunt answer proposed here is: “query it”.

In general, querying is a formal and meaningful way to ask questions about something. In context of parsing the structure of the objects which we ask those questions about is well defined. In other words, the input for a query is a parse tree. Then, what is the structure of query answer?

There are at least two possibilities. The most natural is perhaps to have the output of the query to have the same structure as the input, that is the tree. However, I yet to see a compelling query language over trees, so perhaps choosing something more familiar to database developer – a relation (table) – would be a better choice?

A [query] language which takes an input in one form (a tree), but produces the results in another form (a relation/table) might seem counter intuitive, yet we’ll demonstrate that it is practical. The first objection might be that such query language, which input is a structure of one kind, but the output is something else is “not closed”. This is the matter of wrong appearance. Trees are binary relations; therefore, it is just that our proposed query language makes no effort to keep the query structures confined to this narrow kind of relations.

As we are going to talk in depth about this new query language, it would become awkward without naming it. So let’s give it a name — *Arbori*. (The name credit goes to Brian Jeffries from SQL Dev team). With this prelude, let’s give a taste of Arbori query.

Code quality metrics, or code guidelines can be viewed as queries or constraints over a code base. For example, Trivadis coding guideline #33 reads: “*Always close locally opened cursors*“. Reformulating it as a query, it would sound like this: “*Find all cursors which are not closed in the enclosing procedure*“.

When querying parse trees it is always insightful to focus on a concrete code sample and figure out what parse tree nodes we are after.

CREATE PACKAGE BODY "Trivadis code guideline #33" AS --bad PROCEDURE not_close_cursor ( out_count OUT INTEGER ) AS CURSOR c1 IS SELECT COUNT(*) FROM all_users; BEGIN out_count := 0; OPEN c1; FETCH c1 INTO out_count; END not_close_cursor; -- Good PROCEDURE close_cursor ( out_count OUT INTEGER ) AS CURSOR c1 IS CURSOR c1 IS SELECT COUNT(*) FROM all_users; BEGIN out_count := 0; OPEN c1; FETCH c1 INTO out_count; CLOSE c1; END close_cursor; END;

The parse tree for this code is

The cursor declaration is a node with payload “cursor_d” (1). It has two children, the node labeled with keyword “CURSOR”, followed by the identifier “C1”.

Next, we identify the parse node where the cursor is closed (2). It is a node labeled as “close_statement” and we have to make sure that its second child matches the name of the cursor that we have identified on previous step.

Finally, the cursor declaration and closing nodes that we just have identified have to be in the same scope, that is there has to be an ancestor node corresponding to the procedure that encloses both (3).

This is all the thought that is needed. The further development is just the formal translation of those 3 insights into formal Arbori query.

However, a proposition of mastering new query language is something that was hard to convince SQL Dev feature decision makers. This is why Arbori query is hidden and could only be enabled by an auxiliary command:

set hidden param arboriEditor = true;

Executing this command results in appearance of an additional button on the Code Outline panel:

Clicking it invokes Arbori dockable query panel. For our purposes it is convenient to organize all three panels side-by-side:

If a reader feels intimidated by a perspective of learning yet another query language, then this perception is without merit. In fact, anybody proficient in SQL should have no problem learning it. In SQL terms, each Arbori query accesses a single input object – the parse tree, which allows to greatly simplify query syntax: there is no need for the FROM clause. Also the attributes restriction is implicit from the predicates (or is achieved by other means), so that there is no SELECT clause. The entire query is reduced to a single WHERE clause.

Unlike SQL, each query is named. Here we started writing a query appropriately named “leaking_cursors”. There is no substance in query itself yet (it is even invalid Arbori syntax).

The major Arbori query components are unary predicates. Basically, we want to say that the node payload is such and such. Before we expressed our interest finding a nodes labeled “cursor_d”. This is how it is formally written in Arbori syntax:

We need to introduce an attribute name — “cursor_decl” (1) — and specify that a parse node named “cursor_decl” is required to have a payload “cursor_d” (2). If this syntax feels alien to the reader, written in SQL this would look something like this:

select cursor_decl from parse_tree where payload(cursor_decl).includes('cursor_d')

Admittedly, coupling square and round parenthesis in Arbori might be too much for some readers. Please don’t hesitate to propose a better syntax!

Executing this simple query produces the list of cursor declarations (3). Changing list selection updates node references on the parse tree and, as I described in the previous article, tree node selection is consistent with selection in the original PL/SQL or SQL code editor. The numbers in the odd brackets/parenthesis [13,15) mean semi-open interval/segment of integers: at position 13 of the input PL/SQL sample code there is a token “CURSOR”, while at the next position 14 there is a token “C1”.

This is some progress towards desired query, so let’s find all the nodes labeled “subprg_body”. We know how to do that already, so we just add one more condition. More importantly, we need a condition which requires procedure node to be an ancestor of the cursor declaration node. This is formally specified as a binary ordering relation enclosed_proc < cursor_decl:

Informally, ancestor date of birth precedes that of descendant. A reader familiar with concept of nested sets would find this notation quite intuitive, for example the node [49,51) “cursor_decl” is nested inside the [41,80) “enclosed_proc” node.

The next step, is finding the nodes corresponding to cursor closures. This is easy:

[cursor_close_stmt) close_stmt

We can also add a requirement for “cursor_close_stmt” to be nested inside “enclosed_proc”. However, we also need to express new requirement:

- The nodes “cursor_decl” and “cursor_close_stmt” refer to the same cursor name

This warrants additional Arbori syntax, and slight tactical maneuver. First, let’s forget about “cursor_decl” and “cursor_close_stmt” and shift our attention onto their children. Specifically, we are interested in their children which convey the cursor name, that is the ones labeled “identifier” (among other things). Let’s call these modified attributes “cursor_decl_id” and “cursor_close_stmt_id”. Since we have conditions onto their parents, it is handy to just have a convenient way to refer to them. That Arbori syntax is a caret after an attribute name. For example

[cursor_close_stmt_id^) close_stmt

postulates that a parent node of “cursor_close_stmt_id” is labeled “close_stmt”. There is one more additional syntax: specifically we want “cursor_decl_id” and “cursor_close_stmt_id” to refer to the same identifier:

?cursor_close_stmt_id = ?cursor_decl_id

Now we have all the ingredients to finalize our query. Realizing, however, that we found not quite yet what we were after, lets rename it into “well_behaved_cursors”:

We are just one step away from finishing the task. One more query is wanted, named appropriately as “all_cursors”, which is the variant of “well_behaved_cursors” that ignores cursor closing statements. Then, we need to subtract “well_behaved_cursors” from “all_cursors”.

Apart from exotic syntax, little of what I have just described would raise brows of a reader with database background. Query gradually evolving for performing more and more complex tasks is something database developers and DBAs routinely do. Querying the code is quite easy!

## Reshaping Parse Trees

### September 22, 2016

#### Introduction

Parsing and internal parse structures — trees — are not something that development tools readily expose. It is assumed that most users are comfortable with numerous features that depend on parsing, such as formatting, code block collapsing/expanding, code advisors and so on, but aren’t really interested learning the formal code structure. In SQL Developer 4.2 we have attempted to encourage users to experiment with parse trees.

The location, where the new feature is introduced, is somewhat obscure — it is the former 4.2 “Outline” panel, which is was accessible via “Quick Outline” context menu item. Minor name change reflects the amended functionality

Perhaps some explanation of what you see at the left panel is warranted. Oracle PL/SQL language is ADA derivative, which formal grammar can be found here. In the above example, the top most node labeled `subprg_spec `

has been recognized as matching the first production for the rule

subprg_spec : PROCEDURE_ IDENTIFIER .fml_part. | FUNCTION_ designator .fml_part. RETURN_ ty_mk ;

Consequently, it has 3 children labeled `PROCEDURE`

, `TEST`

(which is `identifier`

) and , `fml_part`

.

For some obscure reason, there is no trace of this genuine formal grammar in oracle PL/SQL documentation, and “railroad” syntax diagrams are different. The good news, however, is that oracle documentation SQL grammar matches almost verbatim to the parse trees that SQL Developer exposes.

#### Playing with parse trees

Selecting a node on the tree highlights the recognized portion of the code in the editor

Some nodes, such as those labeled with `fml_part`

in the example above, are collapsed. This is because parse trees for even moderately long code become overwhelming. Expanding tree branch manually is quite laborious; this is why there are 2 more effective ways to accomplish it.

The first alternative is to click “expand branch” context menu item

Even though the root node is not collapsed, this context menu item is still there, and this action will expand all the collapsed descendant branches.

#### Selectively collapsing/expanding tree branches

The second alternative is little more sophisticated. Suppose, in our working example you would like to have all the nodes with payload `fml_part`

expanded. Then, you click onto the second button on the toolbar (it is marked with plus symbol and “Collapse nodes with payload” tooltip).

If you check the `fml_part`

menu item, then all the nodes with this payload become expanded.

A sharp eyed reader might have noticed that, by default, `select`

is expanded, while `seq_of_stmts`

is collapsed. This is because code outline is working not only for PL/SQL editor but also SQL Worksheet. In SQL worksheet queries — `select`

statements — are the main laborers, this is why their structure is expanded. However, if you work mostly in PL/SQL editor, then those `select`

statements are less interesting. For example, a `select`

query` which is a part of a cursor declaration can be collapsed, unless a user decides to investigate it. As for `

`seq_of_stmts`

, again, PL/SQL package bodies are frequently very large, so having collapsed sequence of statements — the bulk of procedure or function — is handy.

#### Selectively hiding the nodes

Another way to bring a parse tree to manageable size is hiding the nodes. In general, if node disappears, something must be done about this node children. Naturally, orphans become attached to node’s parent. For example, deleting node B, reattaches grand children C and D to node A.

Which nodes are not so much interesting? Consider a typical SQL query

A chain of tree nodes marked with `select_list`

faithfully reflects SQL grammar, but a user might want to see more concise picture. The 3rd (“Hide nodes with payload”) button on toolbar serves this purpose

Unchecking the `select_list`

menu item (it is unchecked by default) reshapes the tree like this

#### Conclusion

There is only so much that tree reshaping can do. For large trees it is laborious to search for interesting information. In the next installment we’ll describe an approach which is more satisfying for a reader with database background — querying.

## SQL Performance Analysis and Tuning with SQL Developer

### September 20, 2016

SQL Developer 4.2 early adopter release is out, but SQL performance analysis improvements somehow slipped from the list of additional features. Here is detailed walkthrough of three amendments:

– Cancelling Long Running Queries (while extracting partial statistics)

– Object Hyperlinks

– Hints

## Relational Algebra with CAS

### January 4, 2016

Two years ago we have ventured into mathematical foundation of relational theory. From algebraic geometry perspective relations were viewed as Finite Varieties. In the followup we were able to describe functional dependencies via explicit analytic formulas and provide intuitive interpretation of the classic result from database dependency theory — Heath’s theorem. Recent arXiv preprint provides a little more coherent exposition of those ideas together with some excursion into Quantum Theory.

Here we complement the aforementioned arXiv article with demonstration that your average off-the-shelf *Computer Algebra System* is actually a [rudimentary] RDBMS. The CAS system of choice is CoCoA.

The starting point is being able to exhibit a system of polynomial equations constraining a relation defined as a set of set of tuples.

Given a binary relation with attributes and

[x y] 1 1 2 1 3 2

we execute the following series of CoCoA commands. First, we need to specify the polynomial ring:

Use XY ::= QQ[x,y];

Then, list the tuples:

Points := mat([[1, 1], [2, 1], [3, 2]]);

Finally, specify the ideal, and print it out:

I := IdealOfPoints(XY, Points); I;

The output:

ideal(y^2 -3*y +2, x*y -x -3*y +3, x^2 -3*x -2*y +4)

Next, for the second relation:

[x z] 1 1 2 1 3 1 3 2

we execute the series of commands

Use XZ ::= QQ[x,z]; Points := mat([[1, 1], [2, 1], [3, 1],[3, 2]]); J := IdealOfPoints(XZ, Points); J;

which outputs

ideal(z^2 -3*z +2, x*z -x -3*z +3, x^3 -6*x^2 +11*x -6)

What is the natural join of the two relations? It is the sum . However, we must switch to polynomial ring which contains both and :

Use XYZ ::= QQ[x,y,z]; I:=ideal(y^2 -3*y +2, x*y -x -3*y +3, x^2 -3*x -2*y +4); J:=ideal(z^2 -3*z +2, x*z -x -3*z +3, x^3 -6*x^2 +11*x -6);

After redefining verbatim both ideals in the larger ring (can it be accomplished easier?), we calculate their “join”:

I+J;

which outputs

ideal(y^2 -3*y +2, x*y -x -3*y +3, x^2 -3*x -2*y +4, z^2 -3*z +2, x*z -x -3*z +3, x^3 -6*x^2 +11*x -6)

This is, again an ideal of points, which is evident with the command

RationalSolve(GBasis(I+J));

[[1, 1, 1], [2, 1, 1], [3, 2, 1], [3, 2, 2]]

Here we leveraged `GBasis`

function as a way to convert an ideal into list (because `RationalSolve`

accepts a list, not an ideal).

Next, projection is an *elimination ideal*. Once again, it is zero-dimensional ideal of points, so that a typical database user would like to list the tuples:

RationalSolve(GBasis(elim(z, I+J)));

[[1, 1], [2, 1], [3, 2]]

Finally, let’s hint what are counterparts of the rest of RA operations. The set union is the intersection of an ideals. The set difference is colon ideal. The only operation which doesn’t have an obvious analog in polynomial algebra is the least challenging one — renaming.

## Distributivity of union over join

### August 19, 2014

Recently, Litak & Mikulas & Hidders published the extended version of their earlier work. However, in a typical mathematical tradition following the famous saying “when fine building is revealed to the public eye, the scaffolding should be removed”, reading it requires some detective work. In particular, lets investigate the motivation behind their axiom system:

x ^ (y v z) = (x ^ (z v (R00 ^ y))) v (x ^ (y v (R00 ^ z))). % AxRH1

```
```(R00 ^ (x ^ (y v z))) v (y ^ z) =

= ((R00 ^ (x ^ y)) v z) ^ ((R00 ^ (x ^ z)) v y). % AxRH2

`(x ^ y) v (x ^ z) = x ^ ( ((x v z) ^ y) v ((x v y) ^ z) ). % AxRL1`

Let’s start with well known lattice inequality

x v (y ^ z) > (x v y) ^ (x v z).

Here we abused notation a little and assume the order symbol > to be reflexive relation. If the LHS of inequality is greater (or equal) than the RHS, then perhaps one can hope to substitute `y`

and `z`

with greater values at RHS, or with lower values at LHS so that inequality becomes an identity? This is the technique employed by Sergey Polin when refuting McKenzie’s conjecture. Specifically, he introduced a non-decreasing chain of values

y1 = y

z1 = z

```
```y2 = y v (x ^ z1).

z2 = z v (x ^ y1).

`y3 = y v (x ^ z2).`

z3 = z v (x ^ y2).

...

together with weakened distributivity law

x v (y ^ z) = (x v yN) ^ (x v zN).

Unfortunately, literal Polin-style weakened distributivity law fails in relational lattices for n=1,…,6, at least. However, Polin’s technique can be naturally generalized when an algebra offers constant(s). The axiom `AxRH1`

is weakened distributivity law with `y`

elevated to `y v (R00 ^ z)`

and `z`

elevated to `y v (R00 ^ z)`

.

In a similar venue, the axiom `AxRL1`

is Polin-style weakening applied to LHS of distributivity law. (Litak et.al. attribute the axiom to Padmanabhan work.)

Genesis of `AxRH2`

is probably various earlier discovered conditional distributivity identities involving the header relation `R00`

. For professional logician converting implication into equality is a walk in the park.

How about the dual distributivity law of union over join? The stricter conditional distributivity of union over join established in an earlier paper hints that the axiom might be more complicated. One might have to lower/elevate both sides of inequality. After some trial permutations of variables and the header constant `R00`

, it revealed the following identity:

(x v (y ^ (x v z))) ^ (x v (z ^ (x v y))) = x v ((y v (x ^ R00)) ^ (z v (x ^ R00))).

This can be translated into relational calculus expression

(( exists p1 exists p2 exists p6(x(p1,p3,p5,p7) | (y(p2,p3,p6,p7)& exists p1 exists p3 exists p4 exists p6(x(p1,p3,p5,p7) | z(p4,p5,p6,p7))))& exists p1 exists p4 exists p6(x(p1,p3,p5,p7) | (z(p4,p5,p6,p7)& exists p1 exists p5 exists p2 exists p6(x(p1,p3,p5,p7) | y(p2,p3,p6,p7)))))

<->

exists p1(x(p1,p3,p5,p7) | ( exists p2 exists p6 exists p1 exists p5(y(p2,p3,p6,p7) | (x(p1,p3,p5,p7)&$F))& exists p4 exists p6 exists p1 exists p3(z(p4,p5,p6,p7) | (x(p1,p3,p5,p7)&$F))))).

which validity is easily verified with Prover9.

This axiom doesn’t seem to be independent from the `{AxRH1,AxRH2,AxRL1}`

with Mace4 models lookup up to cardinality 26 at least, although Prover9 fails to confirm it after 3 hours running time. (This is not entirely surprising, with union over join distributivity law being often enforced by join over union).

## Automatic database links in SQLDeveloper

### April 7, 2014

People invent really complicated ways to achieve simple tasks such as copying a table from one database to another. With database link (which is arguably RDBMS’s the most under appreciated feature) it is just one SQL command:

create table emp as select * from scott.emp@dblink1;

But who has time to create database links? In SQLDeveloper 4.0 you don’t have to. Assuming that you have the right connection the script

set autodblink on create table emp as select * from scott.emp@connection1; -- assuming there is connection "connection1"

would create temporary db link, then run the statement and, finally, drop the link.

Another example, which benefits from this functionality is comparing a table in two different databases:

set autodblink on (select * from employees@connection1 minus select * from employees) union (select * from employees minus select * from employees@connection1);

In releases prior to SQLDeveloper 4.1, however, there was a major omission — a failure to run individual query which outputs result set into a grid (as opposed to running it as a script). Admittedly, there was a technical culprit for that. When a query is executed into a grid, the contract between the server (RDBMS) and client (SQLDeveloper) is somewhat loosely defined. Specifically, the client asks the server to execute the query, then fetches 50 records. The cursor remains open “just in case” if user might want to scroll down beyond the first batch of records. In other words, the end of sql execution is not precisely demarcated. How does this affect SQLDeveloper automatic DB link feature? Well, SQLDeveloper needs to know when to delete temporary database links, and the only solution is attaching listener to the event of grid closure. Hence, after you you set dblinks on, and execute a query into a grid, but before closing the grid, you’ll witness database link under the links node in the navigator. Please be aware that deleting temporary links from the client is not bulletproof; for example, server process going down with ORA-600 and taking user session with it, would result in leftover database links. If your database development work revolves around environment with strict security requirements, it is advisable to reevaluate using this feature.

In releases prior to SQLDeveloper 4.1 there were also smaller snags. For example, PL/SQL anonymous blocks execution was not accompanied by database link creation. Also, database link names with identifiers separated by dots were not recognized. Regarding the second issue, please keep in mind that SQLDeveloper’s syntax for connection names is less restricted than that of database links. For example, something like 11.45.34.12_SCOTT is legitimate connection alias, but is not a database link.

With these bug fixes behind, let’s venture into applications. How about copying [remote] database schema? In introductory computer science class students learn that there are two kinds of objects in programming: data and code. Here is the command which copies the data:

set autodblink on; BEGIN FOR rec IN (SELECT TABLE_NAME FROM user_tables@gbr30060_DB11GR24_hr) LOOP begin DBMS_OUTPUT.PUT_LINE('Processing Table: ' || rec.TABLE_NAME ); execute immediate 'create table ' || rec.TABLE_NAME || ' as select * from ' || rec.TABLE_NAME || '@gbr30060_DB11GR24_hr'; exception when others then DBMS_OUTPUT.PUT_LINE('Skipped Table: ' || rec.TABLE_NAME ||' -&amp;gt; -ERROR- '||SQLERRM); end; END LOOP; END; /

and the code

BEGIN FOR rec IN (SELECT name, type, LISTAGG(text, 'chr(13)') WITHIN GROUP (ORDER BY line) AS code FROM user_source@gbr30060_DB11GR24_hr GROUP BY name, type ) LOOP begin DBMS_OUTPUT.PUT_LINE('Processing Object: ' || rec.NAME ); execute immediate 'create ' || rec.code; exception when others then DBMS_OUTPUT.PUT_LINE('Skipped : ' || rec.type || '' || rec.name ||' --ERROR-- '||SQLERRM); end; END LOOP; END; /

Relational Lattice is simplified “attribute-free” Relational Algebra. It allows to axiomatize Relational Algebra in the same spirit as Relation Algebra. Here is axiom system by Litak&Hidders&Mikulas:

`x ^ y = y ^ x.`

(x ^ y) ^ z = x ^ (y ^ z).

x ^ (x v y) = x.

x v y = y v x.

(x v y) v z = x v (y v z).

x v (x ^ y) = x.

```
```

`x ^ (y v z) = (x ^ (z v (R00 ^ y))) v (x ^ (y v (R00 ^ z))).`

(x ^ y) v (x ^ z) = x ^ ( ((x v z) ^ y) v ((x v y) ^ z) ).

(R00 ^ (x ^ (y v z))) v (y ^ z) = ((R00 ^ (x ^ y)) v z) ^ ((R00 ^ (x ^ z)) v y).

The authors hint the proof of each individual axiom “by direct calculation”. Let’s prove them by converting each equality into equivalent Predicate Calculus formula.

The first snag is that Relational Lattice expressions don’t refer to [relation] attributes. We have to equip each individual relation variable with attributes. This is inherently ambiguous step, although one observation might help.

Consider an expression of two variables `x`

and `y`

, e.g. `x ^ (x v y)`

. There might be more than one common attribute for `x`

and `y`

, but when calculating a result of either operation *join* `^`

or *meld* `v`

, two common attributes match the same way as a single *composite attribute*. For example, for all practical purposes joining by two conditions matching first and last name is the same as matching by their concatenation — full name. Therefore, considering expressions with two variables we equip `x`

with two attributes: `p1`

, which is in `x`

but not in `y`

, together with `p2`

, which is both in `x`

and `y`

. Likewise, we equip `y`

with two attributes: `p2`

, which is both in `x`

and `y`

, together with `p3`

, which is in `y`

but not in `x`

. This canonical attribute introduction generalizes to any number of relvars.

Now that all relvars are equipped with attributes we can translate complex Relational Lattice expressions into Predicate Calculus as well. Join operation is straightforward: it is Boolean conjunction. The set of free variables in the resulting formula is a union of free variables of the operands. Meld operation is Boolean disjunction, however we must add existential quantification over arguments which are in the first operand but are absent from the second one, and arguments which are in the second operand but not the first one (symmetric difference). For example,

`x v y `

translates to

`exists p1 exists p3 x(p1,p2) | y(p2,p3)`

Translation procedure is entirely mechanical and begs for automation. In QBQL you just write down an axiom into the input file `qbql/lang/parse/RL.axiom`

, then run `qbql.lang.parse.Translate.main()`

and, finally, witness the translated expression in the output. The result then can be fed directly into Prover9.

Let’s try this algorithm for several axioms. The absorption law

`x v (x ^ y) = x.`

translates into predicate equivalence

`exists p2(x(p1,p3) | (x(p1,p3)&y(p2,p3)))`

<->

x(p1,p3).

The Prover9 proof of the later is remarkable:

`1 (exists p2 (x(p1,p3) | x(p1,p3) & y(p2,p3))) x(p1,p3) # label(non_clause) # label(goal). [goal].`

2 $F. [deny(1)].

It is certainly shorter than the proof in one of the earlier paper, and warrants an explanation, to say the least. Prover9 succinct output is a result of several steps done internally, such as Scolemization, rewriting to Conjunctive Normal Form, etc. In this case, Scolemization followed by CNF transformation produced the false value, thus proving the theorem.

Let’s investigate more relational operations. Relational complement translates into Boolean negation, while inversion is a little bit more sophisticated. Here is a theorem, which can be considered definition for `R00`

(better known in D&D community as `DUM`

):

`(y' v y`)'=R00.`

It translates into

`-( exists p1(-(y(p1)) | ( exists p1 y(p1))))`

<->

$F.

where one don’t need help of any automatic proof system to be convinced of its validity. There we see yet again that `R00`

is analog of nullary predicate “false”.

Now that we established conversion of Relational Lattice expressions into Predicate Calculus it is naturally to ask if we can translate in the opposite direction. Consider arbitrary calculus formula and let’s drill down into its structure. There are several possibilities of what syntactic constructions we encounter on each level.

If we come across existential quantifiers, then we introduce projection, i.e. meld with an empty relation. Universal quantifier is dual to existential via De Morgan’s formula. There we have to introduce negation, which is domain dependent. If we encounter Boolean conjunction, then we introduce [natural] join (leaving all free variables unaffected). The only step which is not straightforward is handling disjunction. We translate it into the *outer union* AKA `<OR>`

. Outer union is domain-dependent operation, but as you see ignoring domain dependence rewarded us with very succinct proof of Codd’s theorem. Free variables flow unobstructed through outer union, similar to natural join; giving another reason why this transformation step was so easy.

The last step is rewriting outer union in terms of standard lattice operations using one of the following equivalent definitions:

`x <OR> y = (x ^ R11) v (y ^ (x v R11)).`

x <OR> y = (R11 ^ y) v (x ^ y) v (x ^ R11).

x <OR> y = (x ^ (y v R11)) v (y ^ (x v R11)).

x <OR> y = (x' ^ y')'.

x <OR> y = ((R00 ^ x) ^ y) v (R11 ^ y) v (x ^ R11).