SQL identifiers syntax look odd by today’s programming language standards. It is case insensitive (which by itself is not a sin); however, in what looks like a futile attempt to please everybody, many SQL implementations offer *quoted identifiers*. This cumbersome syntax is a source of numerous bugs in tools, and myself being in the role of Oracle SQL Developer developer while often caught in the bug crossfire quietly hated this feature.

There is more in quoted identifiers idea than just admitting names in mixed case, though. A name becomes an arbitrary string literal: the name can start with number, contain spaces, or even Kanji symbols. An identifier can be a whole sentence! Wait a minute, didn’t Formal Logic and, consequently, Database Theory established already that a predicate is a sentence?

Consider the following sentence: *“Max fed Folly at 2 o’clock”*. This is a proposition, which can be considered a tuple in a relation:

Fed=[person pet time] claire folly 200 max folly 200 max folly 300 max scruffy 200 ;

Therefore, a relation name `Fed`

is a shorthand for “`Person`

fed `pet`

at `time`

[o’clock]”. Fully named relations may appear silly for database professional who often operate tables and views with hundreds of attributes; although, database theoretician can fire back suggesting that database design with relations of excessive arity is preposterous.

Let shift the focus from relations to predicates. QBQL library has several predicates: LE, Plus, `Times`

, `Exp`

, `IntegerInterval`

, etc, but their names, although sometimes hint what predicate is, contain no information about its usage. Moreover, a typical query, such as *“Find all pets who were not fed after 2 o’clock”*

Pets ^ (Fed /^ "time = rgt" /^ LE /^ [lft]2 )';

has to jump through renaming hoops in order to match the canonical names of predicate attributes (`lft`

and `rgt`

) to that of the `Fed`

relation. Yes, renaming via composition `/^`

operation with equality predicate `"time = rgt"`

is neat, but do we want to emphasize this point every time a user writes a simple query? SQL would just use `time >= 2`

symbolic notation instead of identifier `LE`

with all the attributes renamed in-place. Note, that we already used quoted notation for equality predicate `"time = rgt"`

, so perhaps writing the numerical order predicate as `"lft <= rgt"`

or, even, with attributes renamed `"two <= time"`

where we assumed to join it with relation `[two]2`

, or even with the later inlined into the predicate `"2 <= time"`

. The transformed query

Pets ^ (Fed /^ "time >= 2")';

is certainly more user friendly. Queries with ternary predicates such as `Plus`

and `Times`

make the argument even more convincing. Consider

"x+y=5" /^ "y=3";

It is immediate that this query outputs `[x]2`

, while in the old notation

Plus /^ [z]5 /^ [y]3;

a user is forced to look up predicate definition in QBQL library. Moreover, this query has being fortunate as we don’t have to rename the canonical attribute names `x`

, `y`

and `z`

. In little more complex query the names almost always get in a way, so we have to rename them explicitly like we did in the original version of the “not fed after 2 o’clock” query. For example, multiplication defined via addition and logarithm in the old notation looks like this

Mult = (Exp /^ "x=r" /^ "y=f1") /^ (Exp /^ "x=s" /^ "y=f2") /^ (Exp /^ "x=t" /^ "y=p") /^ (Plus /^ "x=r" /^ "y=s" /^ "z=t");

Compare it to

"f1 <*> f2 = p" = "exp(r)=f1" /^ "exp(s)=f2" /^ "exp(t)=p" /^ "r+s=t";

A reader may wonder what is a big deal of having symbolic notation for arithmetic predicates and functions. Didn’t SQL have `where`

clause syntax such as `x+y=5 and y=3`

since the beginning of time? Yes, but all these predicates are hidden from the user in RDBMS engine black box; one can’t program their own predicates. In QBQL, both defining predicates, and writing queries is easy. Consider finding all even numbers between 5 (inclusive) and 10 (exclusive). First, we define a ternary predicate “`integer`

between `lower_bound`

(inclusive) and `upper_bound`

(exclusive)”. We have already described the technique of coding index function when defining predicates in the earlier articles. Here is the implementation — IntegerInterval.java. The `getSymbolicNames()`

function defined as

public static String[] getSymbolicNames() { return new String[] { "for(int i = from; i<to; i++)", "from <= int i < to", }; }

returns two strings that are matched against quoted predicate name in a query. For example in

"5 <= int x < 10" ^ "2*y=x";

we have joined two predicates, where the first one matches the second symbolic name of the `IntegerInterval`

definition with arguments renamed via the mappings `5->from`

and `10->to`

. (The first alternative, that is symbolic name `"for(int i = from; i<to; i++)"`

, pokes fun at java loop syntax, but is fully legitimate, otherwise).

The query result is little surprising:

`[x y]`

5 2.5

6 3

7 3.5

8 4

9 4.5

A quick investigation of the Times predicate reveals that the `y`

attribute is calculated via index function that performs division, so it can be fractional. We have to explicitly restrict y to be integer valued via the Integers predicate

"5 <= int x < 10" ^ "2*y=x" ^ "int y";

Having what intuitively seems to be the same predicate in both bounded and unbounded versions seems to be an imperfection. Perhaps, one day QBQL would be able to push inequality predicates `5 <= x`

and `x < 10`

inside the `int x`

predicate…

In a similar venue, query

"x + y = 5" ^ "x - y = 1";

would greatly benefit from QBQL ability to join predicates. At this moment QBQL is able to join a predicate and relation only, even if both are located in distant parts of the query. However, the above linear system doesn’t have any relation to begin evaluation, and certainly QBQL doesn’t know anything about Gaussian elimination method.