Custom Formatting in SQLDev 4.2

January 3, 2017


Formatting is one of the software development features that always steers controversy. To put it bluntly, everybody wants to format the code their own way. In previous SQL Dev versions the answer to that challenge was having elaborate tree-table widget spiced with several dozens of various options.

Arguably, the combined SQL and PL/SQL grammar is the most cumbersome programming language in existence, with total number of grammar productions exceeding 20000. From that perspective, the list of “Advanced Format” options offered by SQL Dev feels inadequate. To address this mismatch, the latest release introduced “Custom Format”. It offers to a developer means to specify formal conditions which SQL and PL/SQL grammatical constructs to indent,  align, pad, add extra line breaks and so on.

As usual, the best way to demonstrate a new feature is an example. Consider the following enhancement request by Stefan Poschenrieder to remove the indent of the NOT NULL ENABLE inline constraint in the following code

CREATE TABLE dim_aspect (
    dim_aspectid       NUMBER(11,0)
        NOT NULL ENABLE,
    source             VARCHAR2(50),
    hier1_desc         VARCHAR2(50),
    hierarchy1         VARCHAR2(50),
    dwh_inserteddate   DATE
);

The first step is identifying the target grammar symbol in the Code Outline panel. Here is screen snapshot illustrating it:
code outline.png
As illustrated there, positioning a cursor over the NOT keyword highlights the corresponding node in the parse tree. Therefore, the grammar symbol of interest is probably “inline_constraint“. Clicking onto the node labeled inline_constraint to double check this guess witnesses it highlighting “NOT NULL ENABLE” in the code editor. (Please be aware that clicking onto a hyperlinked looking grammar symbol has side effect of opening documentation page featuring that symbol railroad diagram definition).

As soon as the grammar symbol of interest is known, lets find it at Custom Format preferences panel. Admittedly, glaring omission there is the search widget. Yet, grammar symbols are arranged in lexicographic order, so that a reader would have no trouble spotting inline_constraint in the following formatting rule:

...
| [node) index_subpartition_clause[69,119)# 
| [node) inline_constraint & ![node) inline_constraint[14,67)
| [node) inline_ref_constraint
...

This fragment of rather convoluted disjunctive condition instructs the formatter to indent the nodes with payload “index_subpartition_clause[69,119)#“, or the nodes labeled with “inline_constraint” (but not those that also labeled as “inline_constraint[14,67)“), or the nodes labeled “inline_ref_constraint“.  Therefore, commenting out the middle conjunction

...
| [node) index_subpartition_clause[69,119)# 
--| [node) inline_constraint & ![node) inline_constraint[14,67)
| [node) inline_ref_constraint
...

is likely a solution to the problem. After making this change, a reader is advised to test what effect do modified formatting rules have on a SQL or PL/SQL code sample in the preview editor. This action is performed with the bottom-left “Run” button. The two neighbors “Import…” and “Export…” are for saving and retrieving your work. By default the modified “format.arbori” file is kept in the SQLDev product preferences directory.

Let’s reinforce this custom formatting idea with two more examples, both from Steven Feuerstein. In both cases we’ll learn some additional syntax of that formal formatting rules specification. The first example is just a bug witnessed by the following test case:

CREATE OR REPLACE PACKAGE BODY plch_pkg IS BEGIN
   FOR indx IN 1..100000 LOOP
      g_strings(    
         indx       --<--- ???
      ) := 'String ' || indx;
   END LOOP;
END;
/

The rule responsible for that unwanted indentation of the “indx” parameter is

| [node) pls_expr & [node-1) '(' & ![node) numeric_literal

It requires the parse node to be labeled with “pls_expr” but not “numeric_literal“. Also, it stipulates that the prior sibling node (formally, “node-1“) is labeled with open parenthesis. Adding the condition for the node successor not to be the closing parenthesis

| [node) pls_expr & [node-1) '(' & ![node) numeric_literal & ![node+1) ')' 

is all it takes to fix this snag.

In the second example

CREATE PACKAGE fmt AS
    PROCEDURE create_checklist (
        user_id_in       IN INTEGER,
        question_id_in   IN INTEGER
    );
    --<-- wanted extra line break 
    PROCEDURE remove_checklist (
        checklist_id_in   IN INTEGER
    );

    PROCEDURE remove_checklist (
        user_id_in       IN INTEGER,
        question_id_in   IN INTEGER
    );

END fmt;

the formatted code have only single line breaks after each package member. The formatting rule responsible for this functionality is “extraLines“. It formally a disjunction of the two rules “sql_stmts” and “significant_statements“. In the later rule we see the following conjunctive condition:

| [node) basic_decl_item & [node+20 < node)

The syntax of first part is familiar: it requires the node to be labeled as “basic_decl_item“.

The second part is little more cumbersome. The “[ node” and “node )” refer to the node beginning and ending position, correspondingly. Each node in the parse tree recognizes a sequence of tokens. In our example nodes labeled as “basic_decl_item” recognizes a sequence of tokens beginning with the PROCEDURE keyword and ending with semicolon. The beginning position is the offset of the PROCEDURE keyword, while the ending position is the offset of the semicolon. The condition  “[node+20 < node)” requires more than 20 tokens between the node beginning and ending position, or informally the package member to be of “sufficient length”.

Please note that package data members are recognized as “basic_decl_item” as well. Therefore, the proper fix depends on if a user wants to inset double line breaks after data members or not. The length condition was a naive attempt on distinguishing data and procedures, but more elaborate condition, distinguishing data and procedures is possible too.

Let’s walk trough yet another example demonstrating how flexible it is. Consider the following example adapted from ToadWorld ER:

Can we vertically align THEN values of CASE statement ? Current Format:

SELECT CASE
         WHEN a.col5 = b.col5 THEN 'COL5'
         WHEN a.col55555555555555 = b.col55 THEN 'COL55555'
         WHEN a.col555555 = b.col55555 THEN 'COL555555'
       END AS VAL
FROM   table1;

Desired Format:

SELECT CASE
         WHEN a.col5 = b.col5                THEN 'COL5'
         WHEN a.col55555555555555 = b.col55  THEN 'COL55555'
         WHEN a.col555555   = b.col5555      THEN 'COL555555'
       END AS VAL
FROM   table1;

For the reference, the default formatting in SQLDeveloper looks less than desirable:

SELECT
        CASE WHEN
            a.col5 = b.col5
        THEN
            'COL5'
        WHEN
            a.col55555555555555 = b.col55
        THEN
            'COL55555'
        WHEN
            a.col555555 = b.col55555
        THEN
            'COL555555'
        END
    AS val
FROM
    table1

However, to satisfy this request it requires exactly 4 changes of the formatting program.

First, before taking on alignments, we should fix unwanted indentation. How about not indenting conditions a.col5 = b.col5 and some such? By examining the parse tree, we notice that the code fragment a.col5 = b.col5 is formally a condition. Therefore, it must be that the line

| [node) condition  &  [node-1) 'WHEN'

of the formatting program that is responsible for the effect. Comment (or delete it), and rerun the formatter to confirm this hypothesis:

SELECT
        CASE WHEN a.col5 = b.col5 THEN
            'COL5'
        WHEN a.col55555555555555 = b.col55 THEN
            'COL55555'
        WHEN a.col555555 = b.col55555 THEN
            'COL555555'
        END
    AS val
FROM
    table1;

What about not indenting expressions, like string literal ‘COL5’? Just comment out the following rule:

--| [node) expr  &  [node-1) 'THEN'

The effect of this change, however, is somewhat discouraging:

SELECT
        CASE WHEN a.col5 = b.col5 THEN 'COL5' WHEN a.col55555555555555 = b.col5 THEN 'COL55555' WHEN a.col555555 = b.col5 THEN 'COL555555' WHEN a.col55555555
 = b.col5 THEN 'COL5555555' END
    AS val
FROM
    table1;

Here we need to look onto parse tree yet again to identify what grammar symbol would indent the entire WHEN-THEN fragments. The symbol in question is the  searched_case_expression#, so that the missing indentation rule alternative is just

| [node) searched_case_expression#

With this amendments the sample code formats to:

SELECT
        CASE
            WHEN a.col5 = b.col5 THEN 'COL5'
            WHEN a.col55555555555555 = b.col55 THEN 'COL55555'
            WHEN a.col555555 = b.col5555 THEN 'COL555555'
        END
    AS val
FROM
    table1;

Please note that even though there is a set options for CASE expressions (which, again, is a concession to traditional format design), flipping those options won’t produce format as clean as we just did.

So far we made 3 custom formatting program changes . Now we are ready to take on alignments. Let’s first investigate how current  formatting rules align the equalities,  because reverse engineering similar functionality is the easiest way to master new skills. Here is the code:

/**
 * All alignments (paddings) ...
**/
paddedIdsInScope: (
-- types
  [id) identifier & [id+1) datatype & [scope) relational_properties
| [id) decl_id & ([id+1) prm_spec_unconstrained_type | [id+1) 'IN' | [id+1) 'OUT' ) & [scope) fml_part
| [id) decl_id & ([id+1) constrained_type | [id+1) object_d_rhs) & [scope) adt_definition
| [id) decl_id & ([id+1) constrained_type | [id+1) object_d_rhs) & [scope) decl_list
-- =>
| [id) sim_expr & [id+1) '=' & [id+1+1) '>' & [scope) paren_expr_list
-- :=
| [id) name & [id+1) ':' & [id+1+1) '=' & [scope) seq_of_stmts
-- =
| [id) column & [id+1) '=' & [id+1+1) expr & [scope) where_clause
| [id) column & [id+1) '=' & [id+1+1) expr & [scope) on_using_condition
) & scope < id ->

The first highlighted condition specify that we are interested in columns, followed by equality, followed by some expression. They would be chosen together with the containing where_clause. Therefore, if we choose columns within the CASE scope

| [id) column & [id+1) '=' & [id+1+1) expr & [scope) case_expression

then it would align the equalities (but not THEN expressions):

SELECT
        CASE
            WHEN a.col5                = b.col5 THEN 'COL5'
            WHEN a.col55555555555555   = b.col55 THEN 'COL55555'
            WHEN a.col555555           = b.col55555 THEN 'COL555555'
        END
    AS val
FROM
    table1;

The condition we are after is a slight variation of thereof:

| [id) condition & [id+1) 'THEN' & [id+1+1) expr & [scope) case_expression

which produces:

SELECT
        CASE
            WHEN a.col5 = b.col5               THEN 'COL5'
            WHEN a.col55555555555555 = b.col55 THEN 'COL55555'
            WHEN a.col555555 = b.col55555      THEN 'COL555555'
        END
    AS val
FROM
    table1;
Advertisements

11 Responses to “Custom Formatting in SQLDev 4.2”

  1. Daniel Nyeste Says:

    Hey Vladim,

    As the new SQLDeveloper is out, I try to bring together a custom format.
    I want to align the declaration items to default expressions to get from this:

    DECLARE

    tot_emps NUMBER;
    tot_depts NUMBER := 50;
    v_verylongresultname VARCHAR2(32767) := ”;
    v_val VARCHAR2(50) := ”;
    v_onechar VARCHAR2(1) := ‘I’;

    something like this:

    DECLARE

    tot_emps NUMBER;
    tot_depts NUMBER := 50;
    v_verylongresultname VARCHAR2(32767) := ”;
    v_val VARCHAR2(50) := ”;
    v_onechar VARCHAR2(1) := ‘I’;

    I tried to use this alignment code inside paddedIdsInScope, but does not work:

    | [id) constrained_type & [id+1) ‘:’ & [id+1+1) ‘=’ & [scope) decl_list

    In the code outline it shows constrained_type (or unconstrained_type) should be used as id there but I’m stuck. Any idea what’s going wrong here?

    Regards,
    Daniel Nyeste

    • Daniel Nyeste Says:

      Oh well, blog engine doesn’t like white spaces apperently. Please look at my example at the following pastebin:
      https://pastebin.com/5Zada6gV


    • You want the next node at the same level:

      | [id) constrained_type & [id+1) default_expr_opt & [scope) decl_list

      Here is parse tree fragment:


      [28,37) basic_decl_item
      . [28,36) basic_d object_d
      . . [28,29) decl_id identifier
      . . [29,36) object_d_rhs
      . . . [29,33) constrained_type
      . . . . [29,30) 'VARCHAR2' unconstrained_...
      . . . . [30,33) constraint paren_expr_list
      . . . . . [30,31) '('
      . . . . . [31,32) and_expr arg arith_expr ...
      . . . . . [32,33) ')'
      . . . [33,36) default_expr_opt
      . . . . [33,34) ':'
      . . . . [34,35) '='
      . . . . [35,36) and_expr arith_expr boolean...
      . [36,37) ';'

      Output:

      DECLARE
          tot_emps               NUMBER;
          tot_depts              NUMBER                 := 50;
          v_verylongresultname   VARCHAR2(32767)        := '';
          v_val                  VARCHAR2(50)           := '';
          v_onechar              VARCHAR2(1)            := 'I';
      BEGIN
          NULL;
      END;
      
  2. Daniel Nyeste Says:

    Hey Vadim,

    Thank you for your help despite the mistake I made in your name the first time. I’m really sorry for that.

    On the other hand, your answer works correctly 🙂

    Regards,
    Daniel


  3. I’m trying to deploy SQLcl as an auto-formatting solution for our team project, that is use it in a Git hook. I’m almost there, but I really wish Oracle would bundle `format.arbori` into code style XML. Because code style XML embeds an absolute path to `format.arbori`, it seems to me that I have to:

    1. Update code style XML and replace absolute path each time I export.
    2. cd/pushd in my script to tun sqlcl from a fixed directory to use this relative path.
    3. Do some shell programming to resolve relative file paths because of #2.

    Can you suggest any better options? If not, can embedding be channelled as a SQL Developer enhancement somewhere?


    • The default formatting program — format.prg — is read from oracle.dbtools-common.jar (oracle/dbtools/app). If you replace it with your custom version, then you can disseminate the modified SQLcl (or just oracle.dbtools-common.jar) to the members of your team.

      P.S. I agree that formatter not recognizing relative path to the formatting arbori program is a bug.


      • To be clear: it does recognize relative paths to format.arbori, but only relative to the current directory, which makes the whole thing clunky. I did finish a portable script that I could commit to a repo, but it’d have been easier to deal with a single code style XML.

        Thanks for your response!

  4. Piter Says:

    I’m looking for a condition that creates Line Breaks after ‘(‘ . I want move subquery one line higher and ‘)’ on end subquery. Can you help me?


    • One major shortcoming of 17.2 formatter is that there is no dedicated rule for line breaks. Line breaks are artifacts of indents, which limits what can be done with custom format. In 17.3 there is a separate rule for line breaks before and after a specified parse tree node. There I was able to achieve something sensible

      SELECT 1
      FROM dual
      WHERE
          1 = (
              SELECT 1 FROM dual
          )

      with “Line breaks->subqueries” option, and

      SELECT 1
      FROM dual
      WHERE
          1 = ( SELECT 1 FROM dual )

      without.

      P.S. this option looks broken in 17.2

  5. NP Says:

    Hi Vadim,

    I l.ike to use leading comma in the column list i SELECT statement so I set “Line break” to be “Before Comma” in the Fomatting Options.

    Unfortunately this also makes some function calls harder to read.

    Example:

    — Unformatted
    SELECT M.MATERIAL
    , SUBSTR(M.MATERIAL, 1, INSTR(M.MATERIAL, ‘:’, -1, 1) – 1) MATERIAL_SHORT

    — After format:
    SELECT M.MATERIAL
    , SUBSTR(
    M.MATERIAL
    , 1
    , INSTR(
    M.MATERIAL
    , ‘:’
    , -1
    , 1
    ) – 1
    ) MATERIAL_SHORT

    Is there a way to exempt the commas within function calls from the Line Break rule?

    Thanks.

    NP


    • This area has been improved for upcoming release 17.3 (due in couple of weeks at the end of the quarter). Here is default formatting:

      SELECT
          m.material,
          substr(m.material,1,instr(m.material,':',-1,1) - 1) material_short
      FROM
          dual

      And with leading commas:

      SELECT
          m.material
         ,substr(m.material,1,instr(m.material,':',-1,1) - 1) material_short
      FROM
          dual

      In 17.3 there are several notable improvements:
      1. Dedicated rules for line breaks
      2. A new advanced format option not to indent procedure arguments (default setting “no indent”)
      3. Dedicated rules to override excessive line breaks, so that, for example, line break is never inserted before semicolon.
      4. All advanced formatting options are “wired” into the custom format program so that they effectively turn on and off custom formatting rules, which results in more robust formatting job.

      All these amendments warrant a blog post, as soon as the release is out.


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: