More Errata: Ancestor Query in Matrix Encoding

July 3, 2010

Apparently, there is yet another bug in Matrix tree encoding ancestors query. Let’s trace derivation step-by-step to find out where the error crept in and fix it. A working example

create table hierarchy (
    name varchar2(10),
    a11 integer,
    a12 integer,
    a21 integer,
    a22 integer

insert into hierarchy values('KING',  2,   1,   1,   0);
insert into hierarchy values('JONES', 3,   2,   2,   1);
insert into hierarchy values('SCOTT', 4,   3,   3,   2);
insert into hierarchy values('ADAMS', 5,   4,   4,   3);
insert into hierarchy values('FORD',  7,   3,   5,   2);
insert into hierarchy values('SMITH', 11,  7,   8,   5);
insert into hierarchy values('BLAKE', 5,   2,   3,   1);
insert into hierarchy values('ALLEN',  8,   5,   5,   3);
insert into hierarchy values('WARD',  13,  5,   8,   3);
insert into hierarchy values('MARTIN',  18,  5,   11,  3);

is especially helpful for checking each individual assertion.

First, the interval boundaries are a11/a21 and (a11-a12)/(a21-a22). Which one is lower and which one is higher? Let’s query it:

select name, '[' || (a11-a12)/(a21-a22) || ',' || a11/a21 || ')' 
from hierarchy;

NAME '['||(A11-A12)/(A21-A22)||','||A11/A21||')'
---------- ---------------------------------------------
KING [1,2)
JONES [1,1.5)
SCOTT [1,1.33333333333333333333333333333333333333)
ADAMS [1,1.25)
FORD [1.33333333333333333333333333333333333333,1.4)
SMITH [1.33333333333333333333333333333333333333,1.375)
BLAKE [1.5,1.66666666666666666666666666666666666667)
ALLEN [1.5,1.6)
WARD [1.6,1.625)
MARTIN [1.625,1.63636363636363636363636363636363636364)

That looks correct, for example, the node 'JONES' is parent of 'FORD', therefore, semi-open interval [1,1.5) contains [1.3333...,1.4).

Now we are ready to write an initial version of the query listing all the descendants of ‘FORD’ (or any other node for that matter). We would have a self-join of the hierachy table with the two tables aliased as ‘ancestor’ and ‘descendant’. The join condition claims that the descendant’s interval is contained within ancestor’s:

from hierarchy ancestor, hierarchy descendant
where = 'FORD'
and (ancestor.a11-ancestor.a12)/(ancestor.a21-ancestor.a22) 
<= (descendant.a11-descendant.a12)/(descendant.a21-descendant.a22)
and descendant.a11/descendant.a21 <= ancestor.a11/ancestor.a21


Next we get rid of the division:

from hierarchy ancestor, hierarchy descendant
where = 'FORD'
and (ancestor.a11-ancestor.a12)*(descendant.a21-descendant.a22) 
<=  (descendant.a11-descendant.a12)*(ancestor.a21-ancestor.a22)
and descendant.a11*ancestor.a21 <= ancestor.a11*descendant.a21


The final step, creating FBI and leveraging it in the query (so that the join were performed via index range scan) remains the same as in the book.


2 Responses to “More Errata: Ancestor Query in Matrix Encoding”

  1. David Brusowankin Says:

    FBI? Sorry but my acronym recall algorithm (ARA as opposed to RAR) isn’t returning anything for that. I must be dense.


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s

%d bloggers like this: