More Errata: Ancestor Query in Matrix Encoding

July 3, 2010


Apparently, there is yet another bug in Matrix tree encoding ancestors query. Let’s trace derivation step-by-step to find out where the error crept in and fix it. A working example

create table hierarchy (
    name varchar2(10),
    a11 integer,
    a12 integer,
    a21 integer,
    a22 integer
);

insert into hierarchy values('KING',  2,   1,   1,   0);
insert into hierarchy values('JONES', 3,   2,   2,   1);
insert into hierarchy values('SCOTT', 4,   3,   3,   2);
insert into hierarchy values('ADAMS', 5,   4,   4,   3);
insert into hierarchy values('FORD',  7,   3,   5,   2);
insert into hierarchy values('SMITH', 11,  7,   8,   5);
insert into hierarchy values('BLAKE', 5,   2,   3,   1);
insert into hierarchy values('ALLEN',  8,   5,   5,   3);
insert into hierarchy values('WARD',  13,  5,   8,   3);
insert into hierarchy values('MARTIN',  18,  5,   11,  3);
commit;

is especially helpful for checking each individual assertion.

First, the interval boundaries are a11/a21 and (a11-a12)/(a21-a22). Which one is lower and which one is higher? Let’s query it:

select name, '[' || (a11-a12)/(a21-a22) || ',' || a11/a21 || ')' 
from hierarchy;

NAME '['||(A11-A12)/(A21-A22)||','||A11/A21||')'
---------- ---------------------------------------------
KING [1,2)
JONES [1,1.5)
SCOTT [1,1.33333333333333333333333333333333333333)
ADAMS [1,1.25)
FORD [1.33333333333333333333333333333333333333,1.4)
SMITH [1.33333333333333333333333333333333333333,1.375)
BLAKE [1.5,1.66666666666666666666666666666666666667)
ALLEN [1.5,1.6)
WARD [1.6,1.625)
MARTIN [1.625,1.63636363636363636363636363636363636364)

That looks correct, for example, the node 'JONES' is parent of 'FORD', therefore, semi-open interval [1,1.5) contains [1.3333...,1.4).

Now we are ready to write an initial version of the query listing all the descendants of ‘FORD’ (or any other node for that matter). We would have a self-join of the hierachy table with the two tables aliased as ‘ancestor’ and ‘descendant’. The join condition claims that the descendant’s interval is contained within ancestor’s:

select descendant.name 
from hierarchy ancestor, hierarchy descendant
where ancestor.name = 'FORD'
and (ancestor.a11-ancestor.a12)/(ancestor.a21-ancestor.a22) 
<= (descendant.a11-descendant.a12)/(descendant.a21-descendant.a22)
and descendant.a11/descendant.a21 <= ancestor.a11/ancestor.a21
;

NAME
----------
FORD
SMITH

Next we get rid of the division:

select descendant.name 
from hierarchy ancestor, hierarchy descendant
where ancestor.name = 'FORD'
and (ancestor.a11-ancestor.a12)*(descendant.a21-descendant.a22) 
<=  (descendant.a11-descendant.a12)*(ancestor.a21-ancestor.a22)
and descendant.a11*ancestor.a21 <= ancestor.a11*descendant.a21
;

NAME
----------
FORD
SMITH

The final step, creating FBI and leveraging it in the query (so that the join were performed via index range scan) remains the same as in the book.

Advertisements

2 Responses to “More Errata: Ancestor Query in Matrix Encoding”

  1. David Brusowankin Says:

    FBI? Sorry but my acronym recall algorithm (ARA as opposed to RAR) isn’t returning anything for that. I must be dense.

    –David


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: