On Boolean datatype in SQL and beyond
September 16, 2010
The question if
Boolean is a legitimate SQL datatype is the source of long debates; witness the 10 years long discussion at Ask Tom, for example. Here is my perspective.
To begin with, what is Boolean datatype, and why it is ubiquitous in each and every procedural programming language? Mathematically, depending how you look into it, Boolean objects form a ring or lattice. Then, any algebraic system prompts a datatype. We have integers (of finite and infinite precision), floating point numbers (as bastardized version of reals), alphanumeric strings, complex numbers, vectors, matrices, and so on. Boolean is just yet another type among many (and not the one that is terribly hard to implement, either).
However, relational databases are different. Relations are predicates and the simplest form of predicate is a proposition. For example, the assertion “Employee ___ has been hired at ___” is a binary predicate, while “Employee SCOTT has been hired at 01/01/2010” is a proposition. What values does a proposition have? Here, you have it: True and False. The relations whose values are True and False are sometimes known in database world under colorful names of
DUM, but practitioner would have hard time finding anything of resemblance in his favorite RDBMS. The takeout lesson here is that if we allow propositional values of relational attributes then, perhaps, why don’t we allow predicates as well? This leads us to the old idea of nested relations.
Let’s get back to earth, and borrow an example from the above Ask Tom discussion. Consider the Boolean attribute
IS_CREDIT_WORTHY of the relation
CUSTOMERS. This is sloppy database design: it is obvious that for credit agency this Boolean attribute is calculated, while genuine customer data contains credit score. Given a relation
CUSTOMER with attribute
CREDIT_LIMIT one can always make a view
create view CREDIT_WORTHY_CUSTOMERS as
select * from CUSTOMERS where CREDIT_LIMIT > 0
Therefore, I interpret all these pleas for Boolean datatype as “I’m too lazy (or impatient) to do proper database design, please give us Boolean shortcut. (After all, you have stuffed RDBMS with other things that don’t fit there, like Blob, XML and Java!)”. Customer is always right isn’t e?
And, finally, returning back to asktom discussion, what about [PL/SQL] function which returns a BOOLEAN: how such a function can be leveraged in a SQL query without BOOLEAN datatype available? Suppose this function takes two parameters DEPTNO and DNAME and returns TRUE if a department with such attributes exist. One may argue that whatever this function does, table function constructing a set of tuples with valid DEPTNO and DNAME can do as well. There is one important caveat, though: database relations are finite. Pipelined table functions can accommodate infinite relations, so any database query with some boolean function in the select operator (AKA SQL WHERE clause) can be expressed as an equivalent query against a view supported by pipelined function. A function which returns BOOLEAN can be viewed as an index over a relation specified by pipelined function. This perspective goes quite beyond SQL idiosyncrasies, which I touched in this article.