DB2 Battle: Optimization Profile vs. Statement Concentrator (Part 2)

Today I wanted to try out using DB2 optimization profiles for a statement impacted by the statement concentrator. In part 1 I gave the background, showed how I created an optimization profile and that a simple explain statement didn’t return what I expected. In this final part I am going to look at DB2 section actuals to hopefully proof that my optimization guideline works as I had hoped.

Because all my “explain plan for select …” statements resulted in the same access plan, I decided to use session actuals to look at how statements are really executed within DB2. The actuals are kind of a live log of the real statement execution costs and the applied access plan. The first step towards session actuals is to have a workload and an activity event monitor and to switch the monitor on:

db2 "create workload betw applname('python') collect activity data with details,section"
db2 "create event monitor betwmon for activities write to table"
db2 "set event monitor betwmon state 1"

The above statements create a workload which collects section data. The application name (APPLNAME) is “python” because I use a Python script (see below) for parts of the testing.

Script:

import ibm_db
conn = ibm_db.connect("hltest","hloeser","secretpw")
ibm_db.exec_immediate(conn, 'set current optimization profile="HLOESER"."PROFILE_BETW"')
ibm_db.exec_immediate(conn, 'select id, s from betw where id between 2 and 20')

Once the script has been executed, the session actuals can be turned into explain information and then that information formatted. To identify which data entry to transform a query like the following one can be used. The table name consists of the generic prefix “ACTIVITYSTMT” and the monitor name:

select appl_id,uow_id,activity_id,stmt_text from ACTIVITYSTMT_BETWMON

The output then serves as input to the “explain_from_activity” procedure:

CALL EXPLAIN_FROM_ACTIVITY('*LOCAL.hloeser.150227075721',1,2, 'BETWMON',NULL,?,?,?,?,?)

The procedure takes the logged execution statistics for the section and turns it into regular explain data. The explain data can be stored in an user copy of the DB2 explain tables. The tool db2exfmt is then used to format the explain data to typical DB2 explain output. To identify what data to format the procedure above returns the needed information, something like this:

  Value of output parameters
  --------------------------
  Parameter Name  : EXPLAIN_SCHEMA
  Parameter Value : HLEXP   

  Parameter Name  : EXPLAIN_REQUESTER
  Parameter Value : HLOESER 

  Parameter Name  : EXPLAIN_TIME
  Parameter Value : 2015-02-27-11.55.21.008920

  Parameter Name  : SOURCE_NAME
  Parameter Value : SYSSH200

  Parameter Name  : SOURCE_SCHEMA
  Parameter Value : NULLID  

  Parameter Name  : SOURCE_VERSION
  Parameter Value : 

  Return Status = 0

Calling db2exfmt with that information as input:

db2exfmt -d hltest -w  2015-02-27-11.55.21.008920 -n SYSSH200 -s NULLID -e HLEXP

Long story short, with all the setup I managed to get different access plans and costs for the query with the BETWEEN predicate, depending on the range of values. It was either an index or a table scan. db2exfmt also showed a “Explain from section”, the use of my statement-level optimization guideline as shown earlier, and that the statement concentrator still was on (see “effective statement”).

Here the two access plans as result from the REOPT ALWAYS in the optimization profile:

Access Plan:
-----------
    Total Cost:         590.416
    Query Degree:        0
      Rows 
   Rows Actual
     RETURN
     (   1)
      Cost 
       I/O 
       |
      98302 
      4850 
     TBSCAN
     (   2)
     590.416 
       NA 
       |
      98304 
       NA 
 TABLE: HLOESER 
      BETW
       Q1
Access Plan:
-----------
    Total Cost:         20.3399
    Query Degree:        0
            Rows 
         Rows Actual
           RETURN
           (   1)
            Cost 
             I/O 
             |
           18.9981 
             19 
           FETCH 
           (   2)
           20.3399 
             NA 
         /---+----\
     18.9981       98304 
       19           NA 
     IXSCAN   TABLE: HLOESER 
     (   3)        BETW
     13.5646        Q1
       NA 
       |
      98304 
       NA 
 INDEX: HLOESER 
     BETWIX
       Q1

As a last test I removed the line from my script which sets the current optimization profile. With that all statements turn back into the generic and complex access plan:

Access Plan:
-----------
    Total Cost:         20.469
    Query Degree:        0
                Rows 
             Rows Actual
               RETURN
               (   1)
                Cost 
                 I/O 
                 |
               167.205 
                 19 
               NLJOIN
               (   2)
               20.469 
                 NA 
          /------+-------\
         1               167.205 
        NA                 19 
      TBSCAN             FETCH 
      (   3)             (   4)
    0.000186812          20.4688 
        NA                 NA 
        |              /---+----\
         0         167.205       98304 
        NA           19           NA 
 TABFNC: SYSIBM    IXSCAN   TABLE: HLOESER 
      GENROW       (   5)        BETW
        Q1         13.6489        Q3
                     NA 
                     |
                    98304 
                     NA 
               INDEX: HLOESER 
                   BETWIX
                     Q3

So what is the conclusion? Applying the statement concentrator turns similar statements into a generic one to cut down on query compilation and optimization costs. This is useful in many cases. If queries use predicates like the BETWEEN it can result in suboptimal plans. In such a case a DB2 optimization profile might be useful. But how to proof it? It turns out that session actuals are very useful to dig deeper into what is happening within the database engine. With few steps it is possible to look at how a statement is really executed. And sometimes a database engine does exactly as the user is hoping for… ;-)

Update on 02-Jul-2015: Some additional information can be found in part 3 of this journey.