Oracle 11g introduced “syntax aware” code search facility — PL/Scope. One can look up identifier definitions, usages, exceptions, which is much more sophisticated and targeted search compared to the old way of finding all of the occurrences of a string in the ALL_SOURCE. Certainly few developers are expected to use PL/Scope À la carte. IDEs, such as SQL Developer, accommodated it within its [object] search. Here is a screen shot illustrating a search of formal parameter P3:

So, with PL/Scope available is the old way, that is finding all of the occurrences of a string in the ALL_SOURCE, obsolete? Yes, it has been effectively deprecated, which turned out to be ill-considered decision. Apparently, some people like the old way better, especially that there is no way to search the comments. The fix is expected in next SQL Developer release, but for now here is a workaround. Among the other things it demonstrates that many features are nearly duplicated, and certainly a search can be accommodated by SQL Developer reports. In fact, there is a off-the-shelf report “Search Source Code” already, although many would find the owner = USER predicate impeding their goal. With little effort, however, one can customize it to approximate to what the Search Panel does (for RDBMS versions below 11.1). One can even prettify the output with html formatting; here are the steps:

1. Go to reports tab -> User Defined Reports and select “Add Report” from context menu
2. Name the report, e.g. “Fancy all_source”, and copy and paste the following prettified query

select
    owner "Owner",
    name  "PL/SQL Object Name",
    type  "Type",
    line  "Line",
    '<html><font bgcolor=#dddddd>'
    ||substr(text,1,instr(upper(text),upper(:TEXT_STRING))-1)||'</font>'
    ||'<font bgcolor=#bbffbb>'
    ||substr(text,instr(upper(text),upper(:TEXT_STRING)),length(:TEXT_STRING))
    ||'</font>'
    ||'<html><font bgcolor=#dddddd>'
    ||substr(text,instr(upper(text),upper(:TEXT_STRING))+length(:TEXT_STRING))
          "Text",
    owner  sdev_link_owner,
    name   sdev_link_name,
    type   sdev_link_type,
    line   sdev_link_line 
from sys.all_source
where (:TEXT_STRING is null or
           instr(upper(text),upper(:TEXT_STRING)) > 0)
    and length(text)<3000  --not like '%wrapped%'
    and name not like 'BIN$%'
    order by owner, name, type, line

3. Use Go To option from context menu on PL/SQL object name to navigate to the PL/SQL compilation unit.

Here is a screen shot of the report output:

ANSI join syntax

March 2, 2011

Here is unconventional way to write a query in ANSI SQL join syntax

SELECT 1
   FROM DEPARTMENTS C
        JOIN EMPLOYEES A
             JOIN JOBS B
     ON C.DEPARTMENT_ID = A.DEPARTMENT_ID
     ON A.JOB_ID = B.JOB_ID

It is valid syntax according to ANSI 92. Indeed, the fragment

EMPLOYEES A
             JOIN JOBS B
     ON C.DEPARTMENT_ID = A.DEPARTMENT_ID

is a qualified join which is a joined table which is a table reference.

Therefore, one can just take conventional restrict-project-cartesian product query, and replace all the commas in the FROM clause with the JOIN keyword, then replace WHERE and AND keywords with the ON.

An argument in favor of ANSI style is allegedly cleaner textbook example

SELECT 1
   FROM DEPARTMENTS C
        JOIN EMPLOYEES A  ON C.DEPARTMENT_ID = A.DEPARTMENT_ID
        JOIN JOBS B  ON A.JOB_ID = B.JOB_ID

although, as we have witnessed, ANSI grammar doesn’t enforce it. A developer is free to mix the first and the second variant in unbelievably messy query!

My feelings is reflected in a quote by Anthony Molinaro (author of “SQL Cookbook”):

“Old style is short and sweet and perfect. ANSI dumbed it down, and for people who’ve been developing for sometime, it’s wholly unnecessary”.

SQL challenge #2

February 14, 2011

Here is a solution to Iggy Fernandez The Second SQL challenge:

WITH T(W2,OUT) AS (
SELECT WORD2, WORD1 || ' ' || WORD2 || ' ' || WORD3 FROM RIDDLE
UNION ALL
SELECT
R.WORD2,
T1.out || ' ' || R.WORD2 || ' ' || T2.out
FROM T T1, RIDDLE R, T T2
WHERE T1.W2 = R.WORD1 AND T2.W2 = R.WORD3
) select out from t

(Be sure to have table columns of sufficient width). It outputs something like:

TRYING TO TYPE ONE HUNDRED DISTINCT WORDS IN A SINGLE PARAGRAPH IS REALLY TOUGH IF I CANNOT REPEAT ANY OF THEM THEN PROBABLY THOSE WITH MANY LETTERS SHOULD BE USED MAYBE SOME READERS WILL UTILIZE DICTIONARIES THESAURUSES THESAURI OR POSSIBLY EVEN ENCYCLOPEDIAS BUT MY PREFERENCE HAS ALWAYS BEEN THAT GRAY MATTER BETWEEN YOUR EARS SERIOUSLY MARILYN CHALLENGES SUCH AS THIS REQUIRE SKILLS BEYOND

P.S. It looks like I didn’t figure out tree structure correctly as the rhs of the tree is almost gibberish:

SCIENCE AND PHYSICS SO WHAT DO YOU ASK READING COMPREHENSION WRITING ABILITY GOOD OLD FASHIONED ELBOW GREASE SCIENTISTS DONT CARE ABOUT STRUCTURE THEY WANT RESULTS HEY LOOK ONLY ELEVEN MORE LEFT

Why the forementioned query didn’t connect it to the first part is another mystery…

Making fun of Java in cubicle

November 20, 2010

This is a legitimate program in QBQL:

"for( int i = 0; i<10; i++ )" /= 
    "sum += i";

which returns

"for( int i = 0; i<10; i++ )"/="sum += i"=[sum]
45
;

Here is detailed article.

Pivot and Unpivot

October 14, 2010

SQL has many flaws, and the fact that its syntax poorly reflects duality of pivot and unpivot queries is probably somewhere at the bottom of the list. To be fair, relational algebra doesn’t do any better. What are pivot and unpivot queries in QBQL?

Consider the following matrix example:

\begin{bmatrix} 3 & 1 \\1 & 0 \end{bmatrix}

This 2×2 matrix can be represented relationally in two ways:
• as ternary relation with two subscripts and one entry attribute:

Au=[i j s]
    1 1 3
    1 2 1
    2 1 1
    2 2 0
;

• as rectangular array with dimension j pivoted into the columns j1 and j2:

Ap=[i j1 j2]
    1 3  1
    2 1  0
;

The expressing Ap in terms of Au is pivot operation, while representing Au in terms of Ap is unpivot.

Au =
((Ap /^ [j1=s]) /^ [j]1)
v
((Ap /^ [j2=s]) /^ [j]2)
. 

Ap =
((Au /^ [j1=s]) /^ [j]1)
^
((Au /^ [j2=s]) /^ [j]2)
.

Notice that both expressinos are almost identical, so the duality of pivot and unpivot is exposed as formal relational lattice duality of join and union.

QBQL amendments

October 8, 2010

QBQL is, to put it bluntly, “Relational Algebra done right”; and this posting is about the recent improvements.

1. The operations from Date&Darwen’s “New Relational Algebra” are implemented as user-defined ones. Their definitions in relational lattice terms are in the udf.def file.

2. The include directive allows nesting QBQL programs inside each other. For example, the code snippet

include udf.def;
[x] 1 2 <OR> [x] 2 3;

refers to user-defined operations introduced in the udf.def file (with generalized union operation <OR> among them) and then executes the query.

3. Artificial distinction between database and program files is gone. To run the program you specify a single argument — the program name. For example,

C:\> java -jar C:\qbql_trunk\dist\qbql.jar C:\qbql_trunk\src\qbql\program\current.prg

runs the program current.prg. It is up to the current.prg to instantiate any database(s) with the include directive. (As we have seen in the previous bullet one might actually don’t need any explicit database initilalization).

Little more detailed instructions how to run QBQL are on updated Getting Started page.

Impedance Mismatch?

September 30, 2010

The previous post about Boolean datatype sparkled a lively discussion at comp.databases.theory. Here, I’ll embark upon Tony Andrews comment that SQL DBMS world somehow abhors unary relations. This is very strange from theoretical perspective, as one can expect lower arity relations be prevalent according to Zipf law.

This is one more artifact of infamous impedance mismatch between front-end application and DBMS. The client behavior is imperative: “Tell me if my customer is creditworthy or not, so I can proceed with her application. (I don’t care about the list of all other creditworthy customers!)” . It clearly wants a boolean answer. What database reply might be? “Look, I supplied all the [relevant] information, isn’t decision and taking action your responsibility?”

P.S. We all know that ORM tools (Hibernate, Linq, etc) solved impedance problem already, so the issue is no longer there! Just kidding.

The question if Boolean is a legitimate SQL datatype is the source of long debates; witness the 10 years long discussion at Ask Tom, for example. Here is my perspective.

To begin with, what is Boolean datatype, and why it is ubiquitous in each and every procedural programming language? Mathematically, depending how you look into it, Boolean objects form a ring or lattice. Then, any algebraic system prompts a datatype. We have integers (of finite and infinite precision), floating point numbers (as bastardized version of reals), alphanumeric strings, complex numbers, vectors, matrices, and so on. Boolean is just yet another type among many (and not the one that is terribly hard to implement, either).

However, relational databases are different. Relations are predicates and the simplest form of predicate is a proposition. For example, the assertion “Employee ___ has been hired at ___” is a binary predicate, while “Employee SCOTT has been hired at 01/01/2010″ is a proposition. What values does a proposition have? Here, you have it: True and False. The relations whose values are True and False are sometimes known in database world under colorful names of DEE and DUM, but practitioner would have hard time finding anything of resemblance in his favorite RDBMS. The takeout lesson here is that if we allow propositional values of relational attributes then, perhaps, why don’t we allow predicates as well? This leads us to the old idea of nested relations.

Let’s get back to earth, and borrow an example from the above Ask Tom discussion. Consider the Boolean attribute IS_CREDIT_WORTHY of the relation CUSTOMERS. This is sloppy database design: it is obvious that for credit agency this Boolean attribute is calculated, while genuine customer data contains credit score. Given a relation CUSTOMER with attribute CREDIT_SCORE, or CREDIT_LIMIT one can always make a view

create view CREDIT_WORTHY_CUSTOMERS as
select * from CUSTOMERS where CREDIT_LIMIT > 0

Therefore, I interpret all these pleas for Boolean datatype as “I’m too lazy (or impatient) to do proper database design, please give us Boolean shortcut. (After all, you have stuffed RDBMS with other things that don’t fit there, like Blob, XML and Java!)”. Customer is always right isn’t e?

Here is couple of queries which are easily expressed in SQL, but are much more challenging in Relational Algebra

1. Find a maximum in a set

select max(x) from S

This query is expressed in Relational Calculus as

\{x |  \forall y S(x) \wedge S(y)  \to y \le x \}

hence, it can be expressed in Relational Algebra as well. So, given unary relation S(x) and binary relation LE(x,y), what relational algebra query outputs a maximal element of S?

2. Sort a set

select x from S order by x

There is no concept of ordering in Relational Algebra, of course. Therefore, the query has to be reformulated a little. A list can be represented as binary relation with domain of non-negative integers as a first argument. For example, alphabetically ordered list "a, b, c" becomes

[position, value]
0 a
1 b
2 c

Formally, given two unary relations S (e.g numbers) and R (e.g. strings) together with binary order relations LES (e.g. numerical order) and LER (e.g. alphabetical order) on their respective domains write a query that “matches” S and R. Leveraging previous problem we can express the query procedurally:
“Find the maximal element in both S and R, match them together, and repeat this procedure with the S and R stripped off their respective maximums.”
The problem is that Relational algebra doesn’t allow recursion/iteration, so the question is if this query is expressible by standard means.

Relational Lattice theory originated as Relational Algebra cast into genuine mathematical shape. It gives a satisfactory foundation for ad-hock query transformation laws, such as push-selection-via-projection, which reduce to elementary algebraic laws of associativity, distributivity, absorption and so on. Relational Algebra cast into Relational Lattice looks like Boolean Algebra with every operation split twofold: there are two versions of disjunction, two version of conjunction, two negations, and finally, two versions of each constant true and false.

Here is a recent development which, on one hand, answers a long standing puzzle and, on the other hand, skews this little symmetry. Consider the following operation:

x # y = (x' v y')'.

It is idempotent, symmetric, associative. In addition, it honors both absorption laws over outer union:

x + (x # y) = x.
x # (x + y) = x.

Therefore, this new operation together with outer union forms a lattice! The constants R00 and R11 become the supremum and infimum in the new lattice:

x # R00 = R00.
x # R11 = x.

Please note, that this operation is different from inner join by the mere fact that it is associative. The relationship between the two is:

x # y < x * y.

Unlike inner join, which is just a projection of natural join into common set of attributes, it appears to be much less intuitive.

To summarize, the bi-lattice structure that has been hinted two years ago is well and alive. However, the question “How many versions of logical AND operation is there” now appears to give a surprising answer of 3.

Follow

Get every new post delivered to your Inbox.