Featured image of post Db2 and JSON array access

Db2 and JSON array access

Db2 supports JSON data and querying it with SQL, but access to JSON arrays, especially unnesting that data, is tricky. Great, if there is some background in recursive SQL.

Db2 JSON Support

Since many years, starting with Db2 11.1.4.4 has improved support for querying JSON data. At that time, I wrote in the linked blog post:

Application developers benefit from new JSON support that features functions like JSON_QUERY, JSON_TABLE and JSON_EXISTS (Hello, XML developers!).

At the core, there is JSON_QUERY to return a SQL/JSON value from the provided JSON value based on a SQL/JSON path expression. The function JSON_TABLE does the same, but returns a table. It allows to specify a path to the root of relevant data within the queries JSON objects, then a set of paths to extract the values making up the resulting columns. I gave a talk on that topic and used this Jupyter Notebook to demonstrate the capabilities.

All works well, except when the data and the tasks are more complex.

{
  "title":"The best dummy title ever",
  "authors": 
  [ 
    {"first":"John","last":"Doe"},
    {"first":"Erika","last":"Mustermensch"}
  ]
}

Given the above data for a book with a title and two authors, how could I extract both authors within a single query? The following query is capable of extracting the book title and returning the author information as a JSON array.

SELECT J.ID, J.TYPE, T.*
FROM JDATA J, 
     JSON_TABLE(J.DATA, 'strict $' COLUMNS(
        "title" varchar(40) path '$.title',
        "authors" varchar(300) format json path '$.authors' with conditional wrapper)
        error on error) T
WHERE TYPE='book';

The query results in something like this:

ID   TYPE    title                       authors
---- ------- --------------------------- ----------------------------------------------------------------------------------------------
   1 book    The best dummy title ever   [ { "first" : "John", "last" : "Doe" }, { "first" : "Erika", "last" : "Mustermensch" } ] 

The array of authors is returned, but not the individual authors extracted. What is missing is to un-nest the array entries into records or rows. Mark Barinstein answered a Stack Overflow question on that topic and provided a solution. It is based on a CTE, a common table expression. Does that ring a bell…?

Let’s UNNEST - the recursive way

In the following, I provide my variation of Mark’s UNNEST function, then show how to use it.

CREATE OR REPLACE FUNCTION JSON_UNNEST (OBJ CLOB(4K), JSON_PATH VARCHAR(128))
-- table structure
RETURNS TABLE (INDEX INT, VAL CLOB(4K))
RETURN
-- Common Table Expression (CTE) to recursively obtain each array element
  WITH ELEM_ARRAY (ELEMENTS) AS 
(
-- extract the array in question
  VALUES JSON_OBJECT(KEY 'elements' VALUE JSON_QUERY(OBJ, JSON_PATH) FORMAT JSON)
)
-- and build up a table with an index and the actual value
, ELEMENTS (INDEX, VAL) AS
(
-- the first array element
  SELECT 0, JSON_QUERY(ELEMENTS, '$.elements[0]')
  FROM ELEM_ARRAY
  WHERE JSON_EXISTS(ELEMENTS, '$.elements[0]')
    UNION ALL
-- combined with the next (index+1)
  SELECT E.INDEX+1, JSON_QUERY(A.ELEMENTS, '$.elements['|| TO_CHAR(E.INDEX+1) ||']')
  FROM ELEMENTS E, ELEM_ARRAY A
-- limit the index to a million to avoid warning of infinite loop
  WHERE JSON_EXISTS(A.ELEMENTS, '$.elements['|| TO_CHAR(E.INDEX+1) ||']') AND E.INDEX<1000000
)
-- all done, now return the index and the value for each array element
SELECT INDEX, VAL 
FROM ELEMENTS;

As you can see from my comments in the SQL code, based on the path to the JSON array, each array element is extracted and added to a “holding table”. The counter to address the array elements is increased and with the next iteration the next element is extracted.

JSON data in action

With the function JSON_UNNEST in place, let’s take a look at how to use it. First, I created a table with both book and contact data.

-- drop an existing table if needed
-- drop table jdata;
create table jdata (id integer not null unique, type varchar(20), data CLOB(2028));

insert into jdata values (1, 'book','{
  "title":"The best dummy title ever",
  "authors": [{"first":"John","last":"Doe"}, {"first":"Erika","last":"Mustermensch"}]
}');

insert into jdata values (2, 'book','{
  "title":"The second best dummy title ever",
  "authors": [{"first":"Jane","last":"Doe"}, {"first":"Max","last":"Mustermensch"}]
}');

insert into jdata values (3, 'contact','{
  "name":"John Doe",
  "phones": [{"type":"mobile","number":"1-408-555-5555"}, {"type":"business","number":"1-415-555-5555"}]
}');

insert into jdata values (4, 'contact','{
  "name":"Jane Doe",
  "phones": [{"type":"mobile","number":"1-519-555-1212"}, {"type":"business","number":"1-519-555-2121"}]
}');

Then, I issued the following query to extract the individual authors. The function UNNEST_JSON unwraps the array and returns its elements. The JSON_TABLE is then used to extract parts (first and last name) of the JSON object and return it as columns.

SELECT T.ID, A.INDEX, JSON_VALUE(T.DATA, 'strict $.title' returning VARCHAR(40)) as title, IT.* 
FROM 
  jdata T
, TABLE(JSON_UNNEST(T.data, '$.authors')) A
, JSON_TABLE
  (
    A.VAL, 'strict $' COLUMNS
    (
        first      VARCHAR(20) PATH '$.first',
        last       VARCHAR(20) PATH '$.last'
)
       ERROR ON ERROR) it
where t.type='book';

The result is an output like this:

Db2 result table for the extracted array values

We could query the contact data in the same way. If the JSON object within the array is simple and only few attributes need to be extracted, then we could do without JSON_TABLE. Here, I show how to return the extracted array data using the JSON_UNNEST and JSON_VALUE.

SELECT T.ID, A.INDEX, 
       JSON_VALUE(T.DATA, 'strict $.name' returning VARCHAR(20)) as name,
       JSON_VALUE(A.VAL, 'strict $.number' returning varchar(15)) as number,
       JSON_VALUE(A.VAL, 'strict $.type' returning varchar(10)) as type
FROM jdata T, TABLE(JSON_UNNEST(T.data, '$.phones')) A
WHERE T.TYPE='contact';
ID          INDEX       NAME                 NUMBER          TYPE      
----------- ----------- -------------------- --------------- ----------
          3           0 John Doe             1-408-555-5555  mobile    
          3           1 John Doe             1-415-555-5555  business  
          4           0 Jane Doe             1-519-555-1212  mobile    
          4           1 Jane Doe             1-519-555-2121  business  

  4 record(s) selected.

Conclusions

Db2 has the needed core capabilities to work with JSON data. Sometimes, the “syntactic sugar” is missing for improved comfort and convenience. If you know how to write SQL - or at least how to search for help - then you can implement the “sugar” on your own… 🍬🍬🍬 With that I close for the year and all the best for 2026. 🍀🍀🍀

If you have feedback, suggestions, or questions about this post, please reach out to me on Mastodon (@data_henrik@mastodon.social) or LinkedIn.