Plan display: Horizontal or Vertical?
March 27, 2009
A tree structure is typically visualized as tree that “grows horizontally”, Windows file manager being a ubiquitous example. SQL execution plan is a tree which, until recently, has been rendered exclusively horizontal. Several client tools, however, broke up with this tradition and offered vertical display, while some articles cheer up this new option. Here is my analysis of pros and cons of each alternative.
1. Vertical display takes more space. The plan tree width is constrained by the amount of horizontal space, divided by the width of the average plan node. For big plans you have to scroll both horizontally and vertically.
2. Standard database textbooks present execution plan in vertical display and, evidently, the entire CS community is preferring to render trees this way. However, textbook examples are rarely chosen to illustrate large trees, which are common in practice.
3. One can argue that it is easier to see the order of execution on tree planted vertically. For parent nodes that have two children it is not easy to see the children if indentation is small, and child nodes are expanded.
The last one seems to be legitimate argument, but my question is: does it matter? Unless you are a professional optimizer developer, do you have time to analyze each and every node of the plan? The question is why the particular SQL query is slow, and how to tune it. Studying the execution flow of rather large plan contributes little to understanding the problem. A performance analyst would get much better return on investment of his time by searching for performance bottlenecks.
Therefore, in addition to plan structure we need some numbers as a basis for quantitative analysis of the performance problem. Horizontal tree display provides some advantage into this respect, because it naturally integrates with the grid. Have a look into this example
and notice additional columns on the right side displaying cost, buffer gets, elapsed time, and rowsource start counter. First, notice the cost entirely mismatching buffer gets and execution time. Therefore, the plan is effectively random. Second, please note that the full scan of the X$KTFBHC has been started 139 times. There is no nested loops of filter node in the ancestor chain, so how is this possible? The answer is that the DBA_TABLESPACE_USAGE_METRICS view definition
CREATE OR REPLACE VIEW "SYS"."DBA_TABLESPACE_USAGE_METRICS" AS SELECT t.name, sum(f.allocated_space) - ( select nvl(sum( rb.space), 0) from sys.recyclebin$ rb where t.ts# = rb.ts# ) + ( select sum(fc.ktfbhcsz - fc.ktfbhcusz) from x$ktfbhc fc where t.ts# = fc.ktfbhctsn), ...
includes scalar subqueries, which are executed with tuple iteration semantics.
The last question about this execution statistics is why buffer gets and execution tree don’t accumulate in the ancestor chain. Sure the time at any node should be at least as big as the sum of the children? The answer is that this is a partial execution statistics. For most long running queries there is no point waiting them to complete. Canceling after, say, 5 min would still provide pretty good picture where the bottleneck is.
To conclude, SQL Optimization is well known to be a difficult problem. Statistics information is incomplete, robust cost metrics is elusive, and the search space is explosive. The optimization goals are often conflicting. The very first idea that every SQL performance analyst discovers: “The optimization is only as good as its cost estimates”. In the grand scheme of things plan display orientation has minuscule value.
Finally, if you are not convinced, please vote for your favorite plan display orientation here.