# PostgreSQL Window Functions

**Summary**: in this tutorial, you will learn how to use the PostgreSQL window functions to perform the calculation across a set of rows related to the current row.

## Setting up sample tables

First, create two tables named `products`

and `product_groups`

for the demonstration:

Second, insert some rows into these tables:

## Introduction to PostgreSQL window functions

The easiest way to understand the window functions is to start by reviewing the aggregate functions. An aggregate function aggregates data from a set of rows into a single row.

The following example uses the `AVG()`

aggregate function to calculate the average price of all products in the `products`

Â table.

To apply the aggregate function to subsets of rows, you use the `GROUP BY`

clause. The following example returns the average price for every product group.

As you see clearly from the output, the `AVG()`

function reduces the number of rows returned by the queries in both examples.

Similar to an aggregate function, a window function operates on a set of rows. However, it does not reduce the number of rows returned by the query.

The term *window* describes the set of rows on which the window function operates. A window function returns values from the rows in a window.

For instance, the following query returns the product name, the price, product group name, along with the average prices of each product group.

In this query, the `AVG()`

function works as a *window function* that operates on a set of rows specified by the `OVER`

clause. Each set of rows is called a window.

The new syntax for this query is the `OVER`

clause:

In this syntax, the `PARTITION BY`

distributes the rows of the result set into groups and the `AVG()`

function is applied to each group to return the average price for each.

Note that a window function always performs the calculation on the result set after the `JOIN`

, `WHERE`

, `GROUP BY`

and `HAVING`

clause and before the final `ORDER BY`

clause in the evaluation order.

## PostgreSQL Window Function Syntax

PostgreSQL has a sophisticated syntax for window function call. The following illustrates the simplified version:

In this syntax:

### window_function(arg1,arg2,...)

The `window_function`

is the name of the window function. Some window functions do not accept any argument.

### PARTITION BY clause

The `PARTITION BY`

clause divides rows into multiple groups or partitions to which the window function is applied. Like the example above, we used the product group to divide the products into groups (or partitions).

The `PARTITION BY`

clause is optional. If you skip the `PARTITION BY`

clause, the window function will treat the whole result set as a single partition.

### ORDER BY clause

The `ORDER BY`

clause specifies the order of rows in each partition to which the window function is applied.

The `ORDER BY`

clause uses the `NULLS FIRST`

or `NULLS LAST`

option to specify whether nullable values should be first or last in the result set. The default is `NULLS LAST`

option.

### frame_clause

The `frame_clause`

defines a subset of rows in the current partition to which the window function is applied. This subset of rows is called a frame.

If you use multiple window functions in a query:

you can use the `WINDOW`

clause to shorten the query as shown in the following query:

It is also possible to use the `WINDOW`

clause even though you call one window function in a query:

## PostgreSQL window function List

The following table lists all window functions provided by PostgreSQL. Note that some aggregate functions such as `AVG()`

, `MIN()`

, `MAX()`

, `SUM()`

, and `COUNT()`

can be also used as window functions.

Name | Description |
---|---|

CUME_DIST | Return the relative rank of the current row. |

DENSE_RANK | Rank the current row within its partition without gaps. |

FIRST_VALUE | Return a value evaluated against the first row within its partition. |

LAG | Return a value evaluated at the row that is at a specified physical offset row before the current row within the partition. |

LAST_VALUE | Return a value evaluated against the last row within its partition. |

LEAD | Return a value evaluated at the row that is offset rows after the current row within the partition. |

NTILE | Divide rows in a partition as equally as possible and assign each row an integer starting from 1 to the argument value. |

NTH_VALUE | Return a value evaluated against the nth row in an ordered partition. |

PERCENT_RANK | Return the relative rank of the current row (rank-1) / (total rows â€“ 1) |

RANK | Rank the current row within its partition with gaps. |

ROW_NUMBER | Number the current row within its partition starting from 1. |

## The ROW_NUMBER(), RANK(), and DENSE_RANK() functions

The `ROW_NUMBER()`

, `RANK()`

, and `DENSE_RANK()`

functions assign an integer to each row based on its order in its result set.

The `ROW_NUMBER()`

function assigns a sequential number to each row in each partition. See the following query:

The `RANK()`

function assigns ranking within an ordered partition. If rows have the same values, theÂ `RANK()`

function assigns the same rank, with the next ranking(s) skipped.

See the following query:

In the laptop product group, both `Dell Vostro`

and `Sony VAIO`

products have the same price, therefore, they receive the same rank 1. The next row in the group is `HP Elite`

Â that receives the rank 3 because the rank 2 is skipped.

Similar to the `RANK()`

function, the `DENSE_RANK()`

function assigns a rank to each row within an ordered partition, but the ranks have no gap. In other words, the same ranks are assigned to multiple rows and no ranks are skipped.

Within the laptop product group, rank 1 is assigned twice to `Dell Vostro`

and `Sony VAIO`

. The next rank is 2 assigned to `HP Elite`

.

## The FIRST_VALUE and LAST_VALUE functions

The `FIRST_VALUE()`

function returns a value evaluated against the first row within its partition, whereas the `LAST_VALUE()`

function returns a value evaluated against the last row in its partition.

The following statement uses the `FIRST_VALUE()`

to return the lowest price for every product group.

The following statement uses the `LAST_VALUE()`

function to return the highest price for every product group.

Notice that we added the frame clause `RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING`

because by default the frame clause is `RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW`

.

## The LAG and LEAD functions

The `LAG()`

function has theÂ ability to access data from the previous row, while the `LEAD()`

function can access data from the next row.

Both `LAG()`

and `LEAD()`

functions have the same syntax as follows:

In this syntax:

`expression`

â€“ a column or expression to compute the returnedÂ value.`offset`

â€“ the number of rows preceding (`LAG`

)/ following (`LEAD`

) the current row. It defaults toÂ 1.`default`

â€“ the default returned value if the`offset`

goes beyond the scope of the window. The`default`

is`NULL`

if you skip it.

The following statement uses the `LAG()`

function to return the prices from the previous row and calculates the difference between the price of the current rowÂ and the previous row.

The following statement uses the `LEAD()`

function to return the prices from the next row and calculates the difference between the price of the current rowÂ and the next row.

In this tutorial, we have introduced you to the PostgreSQL window functions and shown you some examples of using them to query data.