In my blog post Db2 vector for similarity search, I discussed a demo using embeddings for similarity search. It is a “typical” AI use case for vector capabilities. Today, I am going to show you a use case totally without artificial intelligence - no GPUs, no models, etc. required. And it is even very easy to set up.
Vectors for geographical coordinates
Here on earth, every point on the surface can be located by its latitude and longitude (more or less). The pair makes up the geographical coordinates of a point on a map. And, in practice for our demo, is a nice two-dimensional vector. Thus, it should be an easy to set up scenario to demonstrate the Db2 vector data type and its distance search.
To have enough data, I downloaded the list of world cities from simplemaps, then unzipped the file worldcities.csv. The file holds information on about 48000 cities worldwide. For my demo, I only used the city name, the country information, and the latitude/longitude data.
City data in Db2
After I created a test database, I connected to it, then ran the following SQL statement:
create table if not exists cities(name varchar(60), country varchar(30), lat float(32), long float(32), coord vector(2,float32));
The above creates a table cities with four columns. The column coord has the vector data type with two dimensions and float32 numbers.
Keeping the connection open, I ran the IMPORT command in my shell:
db2 "import from worldcities.csv of del method P(1,5,3,4) skipcount 1 insert into cities (name, country, lat, long)"
The method is needed to pick the right columns out of the CSV file, then insert them in the stated column list. Next, back in the Db2 CLI, I ran the following UPDATE statement to create a vector value and attach it to the existing row. The vector is a string in the form [lat, long]. I made up that string on the fly by concatenating the required parts.
update cities set coord=vector(varchar('['||lat||','||long||']'),2,float32);
With the above I was ready for some geospatial-like searches on the city data set.
Db2 vector distance for geospatial search
Here is a simple query that retrieves the 19 neighboring cities to Friedrichshafen, Germany. Even though it says LIMIT 20 to only return 20 rows in the result, it actually is 19 neighbors. The center point, Friedrichshafen in this case, is returned as first row.
select name,
country,
vector_distance((select coord
from cities
where name='Friedrichshafen'),
coord,
euclidean) as distance
from cities
order by distance
limit 20;

Pretty easy, right? And it is impressive that my hometown is close to other countries. To test it further, I also ran the following queries:
select name, country, vector_distance(
(select coord
from cities
where name='San Jose' and country='United States'),
coord,euclidean) as distance
from cities
order by distance limit 30;
The above is interesting for those in Silicon Valley, the below query for some Canada-based IBMers…
select name, country, vector_distance(
(select coord from cities
where name='Markham' and country='Canada'),
coord,euclidean) as distance
from cities
order by distance limit 30;
Conclusions
Db2 has a built-in vector data type (since version 12.1.2, there is already 12.1.3 available). That data type is not only useful for AI-based use case, but also valuable to other scenario. Here, in my blog, I showed how to query geographical data.
If you have feedback, contact me through the usual channels, including Mastodon (@data_henrik@mastodon.social) or LinkedIn.