Graph databases are used to store and analyze data that is connected in a network-like structure. For example, cities connected by roads, people in a social network, or category hierarchies where categories have sub-categories. While there are dedicated graph databases, extensions like ltree and pgRouting add graph functionality to Postgres.

Steps

  • Enable the ltree and pgrouting extensions
  • Create a table to store hierarchical data
  • Insert and retrieve hierarchical data
  • Perform hierarchical queries using ltree
  • Create a table to store network data
  • Insert and query network data

Enable ltree and pgRouting

ltree adds a new LTREE type to Postgres for storing hierarchies, like categories. In Neon, ltree is already installed, you just need to enable it using the following command.

CREATE EXTENSION IF NOT EXISTS ltree;

pgrouting is a separate extension that can be used for routing and shortest path calculations in network data, like finding the shortest path between two cities.

CREATE EXTENSION IF NOT EXISTS pgrouting;

Create a table to store hierarchical data

With ltree, you can store hierarchical relationships such as category trees, organizational charts, or file directories. You can create a table to store a category tree using the following command.

CREATE TABLE categories (
  id SERIAL PRIMARY KEY,
  name TEXT NOT NULL,
  path LTREE
);

Insert and retrieve hierarchical data

The ltree type allows inserting hierarchical paths:

INSERT INTO categories (name, path) VALUES
  ('Electronics', 'Electronics'),
  ('Laptops', 'Electronics.Laptops'),
  ('Gaming Laptops', 'Electronics.Laptops.Gaming');

You can retrieve all subcategories under a given path using <@ as follows.

SELECT * FROM categories WHERE path <@ 'Electronics.Laptops';
idnamepath
2LaptopsElectronics.Laptops
3Gaming LaptopsElectronics.Laptops.Gaming

Create a table to store network data

With pgrouting, you can model roads, social networks, or any graph-like data structure. For example, the following table stores a graph of roads that have an associated cost. Each road has a source and a target, and an associated cost.

CREATE TABLE roads (
  id SERIAL PRIMARY KEY,
  source INT,
  target INT,
  cost FLOAT
);

Insert and query network data

Insert edges representing connections between nodes using the following command.

INSERT INTO roads (source, target, cost) VALUES
  (1, 2, 4.0),
  (2, 3, 3.0),
  (3, 4, 2.5),
  (1, 4, 10.0);

You can then find the minimal cost path between two nodes using the pgr_dijkstra() function, which is an implementation of Dijkstra's Algorithm.

SELECT * FROM pgr_dijkstra(
  'SELECT id, source, target, cost FROM roads',
  1, 4, false
);

The above query returns the following, which shows the shortest path from node 1 to node 4 is by visiting each node in order (1, 2, 3, 4) with an aggregate cost of 9.5.

seqpath_seqnodeedgecostagg_cost
111140
222234
33332.57
444409.5