## TABLE_DUM divided by Relation

### September 1, 2011

Here is curious identity:

TABLE_DUM divided by a relation evaluates to the relation’s complement.

Formally in QBQL:

TABLE_DUM /= x = <NOT> x.

What division am I talking about? Googling “relational division” brings up a list of usual suspects — articles by Date, Celko, etc — with some confusion leading to “Todd’s division”, ternary (!) division operation and so on. QBQL follows much cleaner definitions of set equality join, set containment join and alike readily found in academic literature. In the above identity the “/=” is binary set equality join operation, informally known as relational division.

Dual identity is also interesting. First, the dual of TABLE_DUM, aka R00 is the universal relation R11. The dual of unary complement operation is inversion. Duality among multiple division-like operations is not evident, but the reader might verify that set intersection join analogous to composition in the algebra of Binary Relations fits the bill:

R11 /^ x = <INV> x.

## Empty String vs NULL

### July 29, 2011

How weird is the language you are programming on? User survey on stackoverflow is the second most voted thread. Somewhere on page 2 we encounter the following SQL code snippet:

create table wtf (key number primary key, animal varchar2(10)); insert into wtf values (1,'dog'); insert into wtf values (2,''); insert into wtf values (3,'cat'); select * from wtf where animal 'cat'; 
I don’t agree with the sentiment of that post. How exactly returning an animal (or any other thing for that matter) named with empty string of characters would make user happy? For all practical purposes empty string of data is useless and equating it with NULL just reduces complexity.

## Search code in SQL Developer 3.0

### June 22, 2011

Oracle 11g introduced “syntax aware” code search facility — PL/Scope. One can look up identifier definitions, usages, exceptions, which is much more sophisticated and targeted search compared to the old way of finding all of the occurrences of a string in the ALL_SOURCE. Certainly few developers are expected to use PL/Scope À la carte. IDEs, such as SQL Developer, accommodated it within its [object] search. Here is a screen shot illustrating a search of formal parameter P3:

So, with PL/Scope available is the old way, that is finding all of the occurrences of a string in the ALL_SOURCE, obsolete? Yes, it has been effectively deprecated, which turned out to be ill-considered decision. Apparently, some people like the old way better, especially that there is no way to search the comments. The fix is expected in next SQL Developer release, but for now here is a workaround. Among the other things it demonstrates that many features are nearly duplicated, and certainly a search can be accommodated by SQL Developer reports. In fact, there is a off-the-shelf report “Search Source Code” already, although many would find the owner = USER predicate impeding their goal. With little effort, however, one can customize it to approximate to what the Search Panel does (for RDBMS versions below 11.1). One can even prettify the output with html formatting; here are the steps:

1. Go to reports tab -> User Defined Reports and select “Add Report” from context menu
2. Name the report, e.g. “Fancy all_source”, and copy and paste the following prettified query

select
owner "Owner",
name  "PL/SQL Object Name",
type  "Type",
line  "Line",
'<html><font bgcolor=#dddddd>'
||substr(text,1,instr(upper(text),upper(:TEXT_STRING))-1)||'</font>'
||'<font bgcolor=#bbffbb>'
||substr(text,instr(upper(text),upper(:TEXT_STRING)),length(:TEXT_STRING))
||'</font>'
||'<html><font bgcolor=#dddddd>'
||substr(text,instr(upper(text),upper(:TEXT_STRING))+length(:TEXT_STRING))
"Text",
from sys.all_source
where (:TEXT_STRING is null or
instr(upper(text),upper(:TEXT_STRING)) > 0)
and length(text)<3000  --not like '%wrapped%'
and name not like 'BIN\$%'
order by owner, name, type, line


3. Use Go To option from context menu on PL/SQL object name to navigate to the PL/SQL compilation unit.

Here is a screen shot of the report output:

## ANSI join syntax

### March 2, 2011

Here is unconventional way to write a query in ANSI SQL join syntax

SELECT 1
FROM DEPARTMENTS C
JOIN EMPLOYEES A
JOIN JOBS B
ON C.DEPARTMENT_ID = A.DEPARTMENT_ID
ON A.JOB_ID = B.JOB_ID


It is valid syntax according to ANSI 92. Indeed, the fragment

EMPLOYEES A
JOIN JOBS B
ON C.DEPARTMENT_ID = A.DEPARTMENT_ID


is a qualified join which is a joined table which is a table reference.

Therefore, one can just take conventional restrict-project-cartesian product query, and replace all the commas in the FROM clause with the JOIN keyword, then replace WHERE and AND keywords with the ON.

An argument in favor of ANSI style is allegedly cleaner textbook example

SELECT 1
FROM DEPARTMENTS C
JOIN EMPLOYEES A  ON C.DEPARTMENT_ID = A.DEPARTMENT_ID
JOIN JOBS B  ON A.JOB_ID = B.JOB_ID


although, as we have witnessed, ANSI grammar doesn’t enforce it. A developer is free to mix the first and the second variant in unbelievably messy query!

My feelings is reflected in a quote by Anthony Molinaro (author of “SQL Cookbook”):

“Old style is short and sweet and perfect. ANSI dumbed it down, and for people who’ve been developing for sometime, it’s wholly unnecessary”.

## SQL challenge #2

### February 14, 2011

Here is a solution to Iggy Fernandez The Second SQL challenge:

WITH T(W2,OUT) AS ( SELECT WORD2, WORD1 || ' ' || WORD2 || ' ' || WORD3 FROM RIDDLE UNION ALL SELECT R.WORD2, T1.out || ' ' || R.WORD2 || ' ' || T2.out FROM T T1, RIDDLE R, T T2 WHERE T1.W2 = R.WORD1 AND T2.W2 = R.WORD3 ) select out from t

(Be sure to have table columns of sufficient width). It outputs something like:

TRYING TO TYPE ONE HUNDRED DISTINCT WORDS IN A SINGLE PARAGRAPH IS REALLY TOUGH IF I CANNOT REPEAT ANY OF THEM THEN PROBABLY THOSE WITH MANY LETTERS SHOULD BE USED MAYBE SOME READERS WILL UTILIZE DICTIONARIES THESAURUSES THESAURI OR POSSIBLY EVEN ENCYCLOPEDIAS BUT MY PREFERENCE HAS ALWAYS BEEN THAT GRAY MATTER BETWEEN YOUR EARS SERIOUSLY MARILYN CHALLENGES SUCH AS THIS REQUIRE SKILLS BEYOND

P.S. It looks like I didn’t figure out tree structure correctly as the rhs of the tree is almost gibberish:

SCIENCE AND PHYSICS SO WHAT DO YOU ASK READING COMPREHENSION WRITING ABILITY GOOD OLD FASHIONED ELBOW GREASE SCIENTISTS DONT CARE ABOUT STRUCTURE THEY WANT RESULTS HEY LOOK ONLY ELEVEN MORE LEFT

Why the forementioned query didn’t connect it to the first part is another mystery…

## Making fun of Java in cubicle

### November 20, 2010

This is a legitimate program in QBQL:

"for( int i = 0; i<10; i++ )" /=
"sum += i";

which returns

"for( int i = 0; i<10; i++ )"/="sum += i"=[sum] 45 ; 
Here is detailed article.

## Pivot and Unpivot

### October 14, 2010

SQL has many flaws, and the fact that its syntax poorly reflects duality of pivot and unpivot queries is probably somewhere at the bottom of the list. To be fair, relational algebra doesn’t do any better. What are pivot and unpivot queries in QBQL?

Consider the following matrix example:

$\begin{bmatrix} 3 & 1 \\1 & 0 \end{bmatrix}$

This 2×2 matrix can be represented relationally in two ways:
• as ternary relation with two subscripts and one entry attribute:

Au=[i j s]
1 1 3
1 2 1
2 1 1
2 2 0
;


• as rectangular array with dimension j pivoted into the columns j1 and j2:

Ap=[i j1 j2]
1 3  1
2 1  0
;


The expressing Ap in terms of Au is pivot operation, while representing Au in terms of Ap is unpivot.

Au =
((Ap /^ [j1=s]) /^ [j]1)
v
((Ap /^ [j2=s]) /^ [j]2)
.

Ap =
((Au /^ [j1=s]) /^ [j]1)
^
((Au /^ [j2=s]) /^ [j]2)
.


Notice that both expressinos are almost identical, so the duality of pivot and unpivot is exposed as formal relational lattice duality of join and union.

## QBQL amendments

### October 8, 2010

QBQL is, to put it bluntly, “Relational Algebra done right”; and this posting is about the recent improvements.

1. The operations from Date&Darwen’s “New Relational Algebra” are implemented as user-defined ones. Their definitions in relational lattice terms are in the udf.def file.

2. The include directive allows nesting QBQL programs inside each other. For example, the code snippet

include udf.def; [x] 1 2 <OR> [x] 2 3; 

refers to user-defined operations introduced in the udf.def file (with generalized union operation  <OR> among them) and then executes the query.

3. Artificial distinction between database and program files is gone. To run the program you specify a single argument — the program name. For example,

C:\> java -jar C:\qbql_trunk\dist\qbql.jar C:\qbql_trunk\src\qbql\program\current.prg

runs the program current.prg. It is up to the current.prg to instantiate any database(s) with the include directive. (As we have seen in the previous bullet one might actually don’t need any explicit database initilalization).

Little more detailed instructions how to run QBQL are on updated Getting Started page.

## Impedance Mismatch?

### September 30, 2010

The previous post about Boolean datatype sparkled a lively discussion at comp.databases.theory. Here, I’ll embark upon Tony Andrews comment that SQL DBMS world somehow abhors unary relations. This is very strange from theoretical perspective, as one can expect lower arity relations be prevalent according to Zipf law.

This is one more artifact of infamous impedance mismatch between front-end application and DBMS. The client behavior is imperative: “Tell me if my customer is creditworthy or not, so I can proceed with her application. (I don’t care about the list of all other creditworthy customers!)” . It clearly wants a boolean answer. What database reply might be? “Look, I supplied all the [relevant] information, isn’t decision and taking action your responsibility?”

P.S. We all know that ORM tools (Hibernate, Linq, etc) solved impedance problem already, so the issue is no longer there! Just kidding.

## On Boolean datatype in SQL and beyond

### September 16, 2010

The question if Boolean is a legitimate SQL datatype is the source of long debates; witness the 10 years long discussion at Ask Tom, for example. Here is my perspective.

To begin with, what is Boolean datatype, and why it is ubiquitous in each and every procedural programming language? Mathematically, depending how you look into it, Boolean objects form a ring or lattice. Then, any algebraic system prompts a datatype. We have integers (of finite and infinite precision), floating point numbers (as bastardized version of reals), alphanumeric strings, complex numbers, vectors, matrices, and so on. Boolean is just yet another type among many (and not the one that is terribly hard to implement, either).

However, relational databases are different. Relations are predicates and the simplest form of predicate is a proposition. For example, the assertion “Employee ___ has been hired at ___” is a binary predicate, while “Employee SCOTT has been hired at 01/01/2010″ is a proposition. What values does a proposition have? Here, you have it: True and False. The relations whose values are True and False are sometimes known in database world under colorful names of DEE and DUM, but practitioner would have hard time finding anything of resemblance in his favorite RDBMS. The takeout lesson here is that if we allow propositional values of relational attributes then, perhaps, why don’t we allow predicates as well? This leads us to the old idea of nested relations.

Let’s get back to earth, and borrow an example from the above Ask Tom discussion. Consider the Boolean attribute IS_CREDIT_WORTHY of the relation CUSTOMERS. This is sloppy database design: it is obvious that for credit agency this Boolean attribute is calculated, while genuine customer data contains credit score. Given a relation CUSTOMER with attribute CREDIT_SCORE, or CREDIT_LIMIT one can always make a view

create view CREDIT_WORTHY_CUSTOMERS as select * from CUSTOMERS where CREDIT_LIMIT > 0 

Therefore, I interpret all these pleas for Boolean datatype as “I’m too lazy (or impatient) to do proper database design, please give us Boolean shortcut. (After all, you have stuffed RDBMS with other things that don’t fit there, like Blob, XML and Java!)”. Customer is always right isn’t e?