“SQL Design Patterns” Book Errata
March 1, 2007
p.17
Not exactly a typo, but a comment from Laurent Schneider that grouping by the empty set can be explicitly expressed as
group by ()
p.19-21
Missing and
symbols.
p.20
Missing symbols and β:
if x ≥ x0 then else β endif
p.28 and elsewhere
inner view should read as inline view
p.21
Missing symbols and β again:
if x ≥ x0 then else β endif =
1 x-x0 + β 1 x0-x
Also in the following text fragment:
Easy: the above formula for general step function should handle the case where α is an expression, rather than constant.
p.47.
Since the solution differs between various vendors, now it’s a good time to branch out this information
This was copyedited from
The solution differs between various vendors, so that we have to branch our story here.
which is IMO easier to read.
p.82
Missing symbol :
(x+1) = …
(6) = 59.589686421
p.95
Missing symbol in the expression:
(A B) \ (A
B)
p.147
Another typesetting issue of missing arrow:
∅ → {A,B,C}
X → Y
p.170 (brought to my attention by Gary Howard)
Typo, the root node should be
[2 -1]
[1 0]
and not
[1 -1]
[1 0]
p.172 (by David Brusowankin)
Consider the child node 1.3.2.5 which matrix encoding is
[107 -19]
[ ]
[62 -11]
and it’s parent 1.3.2
[19 -7]
[ ]
[11 -4]
so the child.a12 = -19 is the negative of the parent.a11 = 19
and the child.a22 = -11 is the negative of the parent.a21 = 11
The queries:
Find all the employees who report directly to Jones.
select child.name
from MatrixTreeNodes parent, MatrixTreeNodes child
where parent.name = ‘Jones’
and child.a12 = parent.a11 and child.a22 = parent.a21
Who is Jones’ manager?
select parent.name
from MatrixTreeNodes parent, MatrixTreeNodes child
where child.name = ‘Jones’
and child.a12 = parent.a11 and child.a22 = parent.a21
p.173
David Brusowankin and Steve Johnston challenged the algebraic step of division removal. Here is the detailed exposition:
We take
descendant.a11/descendant.a21 <=
(node.a11-node.a12)/(node.a21-node.a22)
and multiply both sides by descendant.a21*(node.a21-node.a22) to get
descendant.a11*(node.a21-node.a22) <=
(node.a11-node.a12)*descendant.a21
which is the same as
descendant.a11*node.a21-descendant.a11*node.a22 <=
node.a11*descendant.a21-node.a12*descendant.a21
At this point we have already reached our goal and removed the division. We can try to simplify more but Steve comment makes me suspect that the following steps have an error hidden somewhere.
Given that we already have
descendant.a11*node.a21 >= node.a11*descendant.a21
we deduce
-descendant.a11*node.a22 <= -node.a12*descendant.a21which turns to
descendant.a11*node.a22 >= node.a12*descendant.a21
p.181
Steve Johnston noticed that the parent and child matrix entries are also confused in the following query:
select max(floor(a11/a12)) as N from MatrixTreeNodes
where a12 = :parent_a11 and a22 = :parent_a21
Then, there is a case when a node has no children, so the query should be amended as follows:
select case when count(1)=0 then 0 else max(floor(a11/a12)) end as N
from MatrixTreeNodes
where a11 = :parent_a12 and a21 = :parent_a22
Finally, we want to insert into position N+1, not N, because N is occupied.
p.193
Missing symbol :
(1,
2,
3,…,
k)
Figures for chapter 1

Figure 1.1: Cartesian product of the set A = {2,3,4,6,9} by itself. Counting all the elements x that are no greater than y produces the sequence number of y in the set A.

Figure 1.2: Cartesian product of the set A = {2,3,4,6,9} by itself. The predecessor of y is the maximal number in a set of x that are less than y. There is no predecessor for y = 2.

Figure 1.3: Indicator functions 1A(x) and 1B(x).

Figure 1.4: 1A(x)+1B(x) is not an indicator function, because it is equal to 2 for any x∈A∩B . By inclusion-exclusion principle we have to subtract 1A(x)1B(x).

Figure 1.5: Step function f(x)=1x.

Figure 1.6: Endpoints of the covering interval are constructed from the endpoints of the first and last intervals in a chain of intervals with no gaps.

Figure 1.7: In the chain of intervals fst, int, lst the endpoint int.x is covered by the fst interval. Likewise, the fst.y is covered by the int.

Figure 1.8: The chain starting with the fst interval and ending with lst is not maximal, since the lst.y is covered by some other interval.

Figure 1.9: A chain of intervals beginning with fst1 and ending with lst1 has a gap. Therefore, there is an element lst2, such that lst2.y is not covered by any other interval.

Figure 1.10: Interval Packing. A set of integers 0,1,2,3,5,6,7,9 is packed into the intervals [0,3], [5,7], and [9,9].
Figures for chapter 5

Figure 5.14: Nested sets structure for the tree in fig. 5.2a with set containment reversed. Now, a child set is required to contain its parent.

Figure 5.15: A different view of the reverse nested sets structure for the tree in fig. 5.14. A set is indistinguishable from path to the root.
July 16, 2007 at 5:02 pm
there are two figures 6.15
October 11, 2007 at 7:10 pm
I have some concerns about the section titled ‘Inserting Nodes’ in the book SQL Design Patters’ The sections starts on page 180.
The first area of concern reads
1. Query all the immediate children and find the oldest child.
select max(floor(a11/a12)) as N from MatrixTreeNodes
where a11 = :parent_a12
and a21 = :parent_a22.
I think this query should read:
select max(floor(a11/a12)) as N from MatrixTreeNodes
where a12 = :parent_a11
and a22 = :parent_a21.
After making this correction, the query should return N, the number of children of the parent node. This can also be called the position of the oldest child among all the other children of the parent node.
Following the above query, the book states the following
2. Insert the node at the n-th position:
insert into MatrixTreeNodes(a11,a12,a21,a22) values
(:parent_a11*(:N+1) - :parent_a12,
:parent_a11,
:parent_a21*(:N+1) - :parent_a22,
:parent_a21);
The insert statement implies that you carry the value of N directly down from the first query. This is not the case. You are not inserting into the nth position as it is already occupied. You are actually inserting into the N+1 position.
Additionally, if there are no children for the parent node, the first query returns a null value.
I would recommend the additional corrections to the first query:
MSSQL:
select isnull(max(floor(a11/a12)),0) as N from MatrixTreeNodes
where a11 = :parent_a12
and a21 = :parent_a22.
ORACLE:
select nvl(max(floor(a11/a12)),0) as N from MatrixTreeNodes
where a11 = :parent_a12
and a21 = :parent_a22.
I would also recommend adding a statement that N needs to be incremented because it is referring to the current oldest node and not where the new node should be inserted.
:newN = N + 1
insert into MatrixTreeNodes(a11,a12,a21,a22) values
(:parent_a11*(:newN+1) - :parent_a12,
:parent_a11,
:parent_a21*(:newN+1) - :parent_a22,
:parent_a21);
October 11, 2007 at 8:23 pm
Now that I think about this, my comment
“After making this correction, the query should return N, the number of children of the parent node.”
is not correct. It does not account for deletions.
I wonder if an enhancement needs to be considered to fill gaps when children are deleted.
October 12, 2007 at 5:59 pm
I am having some serious problems implementing Nested Intervals using matrix encoding based on whats in this book.
Here is what I have found:
After creating a table and insert procedure that follows the suggestions in this book I could not get the ‘query all descendants’ to work. After going through the numbers over and over again I went back to a query proposed in the book.
select descendant.*
from MatrixTreeNodes descendant, MatrixTreeNodes node
where descendant.a11/descendant.a21 between node.a11/node.a21
and (node.a11-node.a12)/(node.a21-node.a22)
and node.name = … — predicate uniquely identifying a node
It then states the following:
Unfortunately, this query would produce a wrong result. None of the database vendors support the rational number datatype. The ratios of integers would be silently casted into float numbers with accompanying errors due to lack of precision. All the expressions must be rewritten with divisions within the means of integer arithmetic:
The resulting query is this:
select descendant.*
from MatrixTreeNodes descendant, MatrixTreeNodes node
where descendant.a11*node.a21 >= descendant.a21*node.a11
and descendant.a11*node.a22 >= descendant.a21*node.a12
and node.name = … — predicate identifying a node uniquely
This does not work for what I have created.
So, taking the first query, I pulled out the conditions and went through the process of moving things around to get rid of the division. I came up with the following. I had to switch the greater than with the less than.
select dec.*
from
tblSiteLogonHierarchy dec,tblSiteLogonHierarchy node
where dec.matrixUL*node.matrixLL ((node.matrixUL*dec.matrixLL) - (node.matrixUR*dec.matrixLL))
and node.lSiteLogonID =-200
order by 4
matrixUL = UpperLeft (a11 from book)
matrixLL = Lower Left (a21 from book)
matrixUR = Upper Right (a12 from book)
matrixLR = Lower Right (a22 from book)
lSiteLogonID = unique identifier
What am I missing here? After some very small scale experimintation, my revised query seems to work as I would expect.
Does anyone have an opinion on this? Is the book wrong? Am I wrong?
October 12, 2007 at 6:13 pm
Vadim, I dont suppose you happen to stop by here to examine this at all? Would love to hear your input on this.
October 17, 2007 at 5:10 pm
Updated the errata, sorry for late reply
October 26, 2007 at 6:26 pm
Regarding “Relocating Tree Branches”, I don’t believe that the code on page 182 accounts for the case in which the new node root has existing children.
October 26, 2007 at 11:04 pm
If new root is not empty, sure there might be nodes with conflicting encoding. In terms of materialized path, suppose we move all the children of 1.3 into new location 1.7.2 Next, assume the node 5.4 is in the relocated branch. The new location for that node is encoded as 1.7.2.5.4 and we have no guarantee that it is not already occupied. So yes, the discussion should go deeper and introduce some policy how to insert children safely, but this is IMO easier problem than figuring out how to modify encodings when moving nodes.
November 8, 2007 at 4:46 am
On page 179: finding the parent of
[7 -2]
[ ]
[4 -1]
1 - mod (4, 1) = 1 - 0 = 1
Not 0 so the root matrix is not obtained.
Is the root a special case somehow ?
Thanks,
David
November 8, 2007 at 10:08 pm
Consider the matrix equation from the top of page 178 adapted to the matrix from your example
[2..x]…[n.-1]…[7.-2]
[....].*.[....].=.[....]
[1..y]…[1..0]…[4.-1]
(dots are used as spacers). Due to matrix multiplication law, the 3 unknowns x,y and n should satisfy the following equations:
2n + x = 7
n + y = 4
It is a system of two equations with three variable that we are solving, and modulo function is just a handy tool. In the first equation the n is multiplied by a number greater than 1. There is no ambiguity, and we get a single integer solution n=4, x=-1. The second equation, however, admits two solutions n=4, y=0 and n=5, y=-1 and the modulo function finds the wrong one (that is the one inconsistent with the first equation)!
To summarize, you are right, the modulo function is the robust means to find the parent matrix upper right element, but the lower right element is better calculated some other way. You can use linear equation as above, or even more simple, leverage the determinant constraint.
March 18, 2008 at 9:40 pm
One can achieve much higher precision without overflow by use of DECIMAL (38,0). Still, this has its limits too.