Window functions

Window functions perform calculations across a set of table rows that are somehow related to the current row. They are often used for ranking, aggregating, and calculating running totals.

functions

dense_rank

Ranks the current row within its partition without gaps. In other words, if the value of any new row encountered is equal to the value of one of the previous rows then it will receive the next successive rank without any gaps in ranking.

The rank function provides the same behaviour, but with gaps in ranking.

Syntax

Alias: denseRank (case-sensitive)

dense_rank (column_name)
  OVER ([[PARTITION BY grouping_column] [ORDER BY sorting_column] 
        [ROWS or RANGE expression_to_bound_rows_withing_the_group]] | [window_name])
FROM table_name
WINDOW window_name as ([[PARTITION BY grouping_column] [ORDER BY sorting_column])

For more detail on window function syntax see: Window Functions - Syntax.

Returned value

  • A number for the current row within its partition, without gaps in ranking. UInt64.

Example

The following example is based on the example provided in the video instructional Ranking window functions in Tinybird.

Query:

SELECT player, salary, 
       dense_rank() OVER (ORDER BY salary DESC) AS dense_rank
FROM (
  select c1::String as team, c2::String as player, c3::UInt32 as salary, c4::String as position from values (
      ('Port Elizabeth Barbarians', 'Gary Chen', 195000, 'F'),
      ('New Coreystad Archdukes', 'Charles Juarez', 190000, 'F'),
      ('Port Elizabeth Barbarians', 'Michael Stanley', 150000, 'D'),
      ('New Coreystad Archdukes', 'Scott Harrison', 150000, 'D'),
      ('Port Elizabeth Barbarians', 'Robert George', 195000, 'M'),
      ('South Hampton Seagulls', 'Douglas Benson', 150000, 'M'),
      ('South Hampton Seagulls', 'James Henderson', 140000, 'M')
  )
)

Result:

┌─player──────────┬─salary─┬─dense_rank─┐
│ Gary Chen       │ 195000 │          1 │
│ Robert George   │ 195000 │          1 │
│ Charles Juarez  │ 190000 │          2 │
│ Michael Stanley │ 150000 │          3 │
│ Douglas Benson  │ 150000 │          3 │
│ Scott Harrison  │ 150000 │          3 │
│ James Henderson │ 140000 │          4 │
└─────────────────┴────────┴────────────┘

first_value

Returns the first value evaluated within its ordered frame. By default, NULL arguments are skipped, however the RESPECT NULLS modifier can be used to override this behaviour.

Syntax

first_value (column_name) [[RESPECT NULLS] | [IGNORE NULLS]]
  OVER ([[PARTITION BY grouping_column] [ORDER BY sorting_column] 
        [ROWS or RANGE expression_to_bound_rows_withing_the_group]] | [window_name])
FROM table_name
WINDOW window_name as ([PARTITION BY grouping_column] [ORDER BY sorting_column])

Alias: any.

Using the optional modifier RESPECT NULLS after first_value(column_name) will ensure that NULL arguments aren't skipped.

Alias: first_value_respect_nulls

For more detail on window function syntax see: Window Functions - Syntax.

Returned value

  • The first value evaluated within its ordered frame.

Example

In this example the first_value function is used to find the highest paid footballer from a fictional dataset of salaries of Premier League football players.

Query:

SELECT player, salary, 
       first_value(player) OVER (ORDER BY salary DESC) AS highest_paid_player
FROM (
  select c1::String as team, c2::String as player, c3::UInt32 as salary, c4::String as position from values (
      ('Port Elizabeth Barbarians', 'Gary Chen', 195000, 'F'),
      ('New Coreystad Archdukes', 'Charles Juarez', 190000, 'F'),
      ('Port Elizabeth Barbarians', 'Michael Stanley', 150000, 'D'),
      ('New Coreystad Archdukes', 'Scott Harrison', 150000, 'D'),
      ('Port Elizabeth Barbarians', 'Robert George', 195000, 'M'),
      ('South Hampton Seagulls', 'Douglas Benson', 150000, 'M'),
      ('South Hampton Seagulls', 'James Henderson', 140000, 'M')
  )
)

Result:

┌─player──────────┬─salary─┬─highest_paid_player─┐
│ Gary Chen       │ 196000 │ Gary Chen           │
│ Robert George   │ 195000 │ Gary Chen           │
│ Charles Juarez  │ 190000 │ Gary Chen           │
│ Scott Harrison  │ 180000 │ Gary Chen           │
│ Douglas Benson  │ 150000 │ Gary Chen           │
│ James Henderson │ 140000 │ Gary Chen           │
│ Michael Stanley │ 100000 │ Gary Chen           │
└─────────────────┴────────┴─────────────────────┘

lagInFrame

Returns a value evaluated at the row that is at a specified physical offset row before the current row within the ordered frame.

Syntax

lagInFrame(x[, offset[, default]])
  OVER ([[PARTITION BY grouping_column] [ORDER BY sorting_column] 
        [ROWS or RANGE expression_to_bound_rows_withing_the_group]] | [window_name])
FROM table_name
WINDOW window_name as ([[PARTITION BY grouping_column] [ORDER BY sorting_column])

For more detail on window function syntax see: Window Functions - Syntax.

Parameters

  • x: Column name.
  • offset: Offset to apply. (U)Int*. (Optional - 1 by default).
  • default: Value to return if calculated row exceeds the boundaries of the window frame. (Optional - default value of column type when omitted).

Returned value

  • Value evaluated at the row that is at a specified physical offset before the current row within the ordered frame.

Example

This example looks at historical data for a specific stock and uses the lagInFrame function to calculate a day-to-day delta and percentage change in the closing price of the stock.

Query:

SELECT
    date,
    close,
    lagInFrame(close, 1, close) OVER (ORDER BY date ASC) AS previous_day_close,
    COALESCE(ROUND(close - previous_day_close, 2)) AS delta,
    COALESCE(ROUND((delta / previous_day_close) * 100, 2)) AS percent_change
FROM (
    select c1::Date as date, c2::Float32 as open, c3::Float32 as high, c4::Float32 as low, c5::Float32 as close, c6::UInt32 as volume from values (
    ('2024-06-03', 113.62, 115.00, 112.00, 115.00, 438392000),
    ('2024-06-04', 115.72, 116.60, 114.04, 116.44, 403324000),
    ('2024-06-05', 118.37, 122.45, 117.47, 122.44, 528402000),
    ('2024-06-06', 124.05, 125.59, 118.32, 121.00, 664696000),
    ('2024-06-07', 119.77, 121.69, 118.02, 120.89, 412386000)
    )  
)
ORDER BY date DESC

Result:

┌───────date─┬──close─┬─previous_day_close─┬─delta─┬─percent_change─┐
│ 2024-06-07 │ 120.89 │                121 │ -0.11 │          -0.09 │
│ 2024-06-06 │    121 │             122.44 │ -1.44 │          -1.18 │
│ 2024-06-05 │ 122.44 │             116.44 │     6 │           5.15 │
│ 2024-06-04 │ 116.44 │                115 │  1.44 │           1.25 │
│ 2024-06-03 │    115 │                115 │     0 │              0 │
└────────────┴────────┴────────────────────┴───────┴────────────────┘

last_value

Returns the last value evaluated within its ordered frame. By default, NULL arguments are skipped, however the RESPECT NULLS modifier can be used to override this behaviour.

Syntax

last_value (column_name) [[RESPECT NULLS] | [IGNORE NULLS]]
  OVER ([[PARTITION BY grouping_column] [ORDER BY sorting_column] 
        [ROWS or RANGE expression_to_bound_rows_withing_the_group]] | [window_name])
FROM table_name
WINDOW window_name as ([[PARTITION BY grouping_column] [ORDER BY sorting_column])

Alias: anyLast.

Using the optional modifier RESPECT NULLS after first_value(column_name) will ensure that NULL arguments aren't skipped.

Alias: lastValueRespectNulls

For more detail on window function syntax see: Window Functions - Syntax.

Returned value

  • The last value evaluated within its ordered frame.

Example

In this example the last_value function is used to find the lowest paid footballer from a fictional dataset of salaries of Premier League football players.

Query:

SELECT player, salary,
       last_value(player) OVER (ORDER BY salary DESC RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS lowest_paid_player
FROM (
  select c1::String as team, c2::String as player, c3::UInt32 as salary, c4::String as position from values (
    ('Port Elizabeth Barbarians', 'Gary Chen', 196000, 'F'),
    ('New Coreystad Archdukes', 'Charles Juarez', 190000, 'F'),
    ('Port Elizabeth Barbarians', 'Michael Stanley', 100000, 'D'),
    ('New Coreystad Archdukes', 'Scott Harrison', 180000, 'D'),
    ('Port Elizabeth Barbarians', 'Robert George', 195000, 'M'),
    ('South Hampton Seagulls', 'Douglas Benson', 150000, 'M'),
    ('South Hampton Seagulls', 'James Henderson', 140000, 'M')
  )
)

Result:

┌─player──────────┬─salary─┬─lowest_paid_player─┐
│ Gary Chen       │ 196000 │ Michael Stanley    │
│ Robert George   │ 195000 │ Michael Stanley    │
│ Charles Juarez  │ 190000 │ Michael Stanley    │
│ Scott Harrison  │ 180000 │ Michael Stanley    │
│ Douglas Benson  │ 150000 │ Michael Stanley    │
│ James Henderson │ 140000 │ Michael Stanley    │
│ Michael Stanley │ 100000 │ Michael Stanley    │
└─────────────────┴────────┴────────────────────┘

leadInFrame

Returns a value evaluated at the row that is offset rows after the current row within the ordered frame.

Syntax

leadInFrame(x[, offset[, default]])
  OVER ([[PARTITION BY grouping_column] [ORDER BY sorting_column] 
        [ROWS or RANGE expression_to_bound_rows_withing_the_group]] | [window_name])
FROM table_name
WINDOW window_name as ([[PARTITION BY grouping_column] [ORDER BY sorting_column])

For more detail on window function syntax see: Window Functions - Syntax.

Parameters

  • x: Column name.
  • offset: Offset to apply. (U)Int*. (Optional - 1 by default).
  • default: Value to return if calculated row exceeds the boundaries of the window frame. (Optional - default value of column type when omitted).

Returned value

  • value evaluated at the row that is offset rows after the current row within the ordered frame.

Example

Query:

SELECT
    n,
    m,
    leadInFrame(m) OVER (
        PARTITION BY n ORDER BY m ASC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS next
FROM
    (
        select (number % 12) + 1 n, toStartOfMonth(today()) - INTERVAL number MONTH m
        from numbers(36)
        order by m
    )

Result:

--------------------------------
|  n | m          | next       |
--------------------------------
|  1 | 2022-12-01 | 2023-12-01 |
|  1 | 2023-12-01 | 2024-12-01 |
|  1 | 2024-12-01 | 1970-01-01 |
|  2 | 2022-11-01 | 2023-11-01 |
|  2 | 2023-11-01 | 2024-11-01 |
|  2 | 2024-11-01 | 1970-01-01 |
|  3 | 2022-10-01 | 2023-10-01 |
|  3 | 2023-10-01 | 2024-10-01 |
|  3 | 2024-10-01 | 1970-01-01 |
|  4 | 2022-09-01 | 2023-09-01 |
|  4 | 2023-09-01 | 2024-09-01 |
|  4 | 2024-09-01 | 1970-01-01 |
|  5 | 2022-08-01 | 2023-08-01 |
|  5 | 2023-08-01 | 2024-08-01 |
|  5 | 2024-08-01 | 1970-01-01 |
|  6 | 2022-07-01 | 2023-07-01 |
|  6 | 2023-07-01 | 2024-07-01 |
|  6 | 2024-07-01 | 1970-01-01 |
|  7 | 2022-06-01 | 2023-06-01 |
|  7 | 2023-06-01 | 2024-06-01 |
|  7 | 2024-06-01 | 1970-01-01 |
|  8 | 2022-05-01 | 2023-05-01 |
|  8 | 2023-05-01 | 2024-05-01 |
|  8 | 2024-05-01 | 1970-01-01 |
|  9 | 2022-04-01 | 2023-04-01 |
|  9 | 2023-04-01 | 2024-04-01 |
|  9 | 2024-04-01 | 1970-01-01 |
| 10 | 2022-03-01 | 2023-03-01 |
| 10 | 2023-03-01 | 2024-03-01 |
| 10 | 2024-03-01 | 1970-01-01 |
| 11 | 2022-02-01 | 2023-02-01 |
| 11 | 2023-02-01 | 2024-02-01 |
| 11 | 2024-02-01 | 1970-01-01 |
| 12 | 2022-01-01 | 2023-01-01 |
| 12 | 2023-01-01 | 2024-01-01 |
| 12 | 2024-01-01 | 1970-01-01 |
--------------------------------

nth_value

Returns the first non-NULL value evaluated against the nth row (offset) in its ordered frame.

Syntax

nth_value (x, offset)
  OVER ([[PARTITION BY grouping_column] [ORDER BY sorting_column] 
        [ROWS or RANGE expression_to_bound_rows_withing_the_group]] | [window_name])
FROM table_name
WINDOW window_name as ([[PARTITION BY grouping_column] [ORDER BY sorting_column])

For more detail on window function syntax see: Window Functions - Syntax.

Parameters

  • x: Column name.
  • offset: nth row to evaluate current row against.

Returned value

  • The first non-NULL value evaluated against the nth row (offset) in its ordered frame.

Example

In this example the nth-value function is used to find the third-highest salary from a fictional dataset of salaries of Premier League football players.

Query:

SELECT player, salary, nth_value(player,3) OVER(ORDER BY salary DESC) AS third_highest_salary FROM 
(
    select c1::String as team, c2::String as player, c3::UInt32 as salary, c4::String as position from values (
        ('Port Elizabeth Barbarians', 'Gary Chen', 195000, 'F'),
        ('New Coreystad Archdukes', 'Charles Juarez', 190000, 'F'),
        ('Port Elizabeth Barbarians', 'Michael Stanley', 100000, 'D'),
        ('New Coreystad Archdukes', 'Scott Harrison', 180000, 'D'),
        ('Port Elizabeth Barbarians', 'Robert George', 195000, 'M'),
        ('South Hampton Seagulls', 'Douglas Benson', 150000, 'M'),
        ('South Hampton Seagulls', 'James Henderson', 140000, 'M')
    )
)

Result:

┌─player──────────┬─salary─┬─third_highest_salary─┐
│ Gary Chen       │ 195000 │                      │
│ Robert George   │ 195000 │                      │
│ Charles Juarez  │ 190000 │ Charles Juarez       │
│ Scott Harrison  │ 180000 │ Charles Juarez       │
│ Douglas Benson  │ 150000 │ Charles Juarez       │
│ James Henderson │ 140000 │ Charles Juarez       │
│ Michael Stanley │ 100000 │ Charles Juarez       │
└─────────────────┴────────┴──────────────────────┘

percent_rank

returns the relative rank (i.e. percentile) of rows within a window partition.

Syntax

Alias: percentRank (case-sensitive)

percent_rank (column_name)
  OVER ([[PARTITION BY grouping_column] [ORDER BY sorting_column] 
        [RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING]] | [window_name])
FROM table_name
WINDOW window_name as ([PARTITION BY grouping_column] [ORDER BY sorting_column] RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)

The default and required window frame definition is RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.

For more detail on window function syntax see: Window Functions - Syntax.

Example

Query:

SELECT player, salary, 
       percent_rank() OVER (ORDER BY salary DESC) AS percent_rank
FROM (
    select c1::String as team, c2::String as player, c3::UInt32 as salary, c4::String as position from values (
      ('Port Elizabeth Barbarians', 'Gary Chen', 195000, 'F'),
      ('New Coreystad Archdukes', 'Charles Juarez', 190000, 'F'),
      ('Port Elizabeth Barbarians', 'Michael Stanley', 150000, 'D'),
      ('New Coreystad Archdukes', 'Scott Harrison', 150000, 'D'),
      ('Port Elizabeth Barbarians', 'Robert George', 195000, 'M'),
      ('South Hampton Seagulls', 'Douglas Benson', 150000, 'M'),
      ('South Hampton Seagulls', 'James Henderson', 140000, 'M')
    )
)

Result:

┌─player──────────┬─salary─┬───────percent_rank─┐
│ Gary Chen       │ 195000 │                  0 │
│ Robert George   │ 195000 │                  0 │
│ Charles Juarez  │ 190000 │ 0.3333333333333333 │
│ Michael Stanley │ 150000 │                0.5 │
│ Scott Harrison  │ 150000 │                0.5 │
│ Douglas Benson  │ 150000 │                0.5 │
│ James Henderson │ 140000 │                  1 │
└─────────────────┴────────┴────────────────────┘

rank

Ranks the current row within its partition with gaps. In other words, if the value of any row it encounters is equal to the value of a previous row then it will receive the same rank as that previous row. The rank of the next row is then equal to the rank of the previous row plus a gap equal to the number of times the previous rank was given.

The dense_rank function provides the same behaviour but without gaps in ranking.

Syntax

rank (column_name)
  OVER ([[PARTITION BY grouping_column] [ORDER BY sorting_column] 
        [ROWS or RANGE expression_to_bound_rows_withing_the_group]] | [window_name])
FROM table_name
WINDOW window_name as ([[PARTITION BY grouping_column] [ORDER BY sorting_column])

For more detail on window function syntax see: Window Functions - Syntax.

Returned value

  • A number for the current row within its partition, including gaps. UInt64.

Example

The following example is based on the example provided in the video instructional Ranking window functions in Tinybird.

Query:

SELECT player, salary, 
       rank() OVER (ORDER BY salary DESC) AS rank
FROM (
    select c1::String as team, c2::String as player, c3::UInt32 as salary, c4::String as position from values (
      ('Port Elizabeth Barbarians', 'Gary Chen', 195000, 'F'),
      ('New Coreystad Archdukes', 'Charles Juarez', 190000, 'F'),
      ('Port Elizabeth Barbarians', 'Michael Stanley', 150000, 'D'),
      ('New Coreystad Archdukes', 'Scott Harrison', 150000, 'D'),
      ('Port Elizabeth Barbarians', 'Robert George', 195000, 'M'),
      ('South Hampton Seagulls', 'Douglas Benson', 150000, 'M'),
      ('South Hampton Seagulls', 'James Henderson', 140000, 'M')
    )
)

Result:

┌─player──────────┬─salary─┬─rank─┐
│ Gary Chen       │ 195000 │    1 │
│ Robert George   │ 195000 │    1 │
│ Charles Juarez  │ 190000 │    3 │
│ Douglas Benson  │ 150000 │    4 │
│ Michael Stanley │ 150000 │    4 │
│ Scott Harrison  │ 150000 │    4 │
│ James Henderson │ 140000 │    7 │
└─────────────────┴────────┴──────┘

row_number

Numbers the current row within its partition starting from 1.

Syntax

row_number (column_name)
  OVER ([[PARTITION BY grouping_column] [ORDER BY sorting_column] 
        [ROWS or RANGE expression_to_bound_rows_withing_the_group]] | [window_name])
FROM table_name
WINDOW window_name as ([[PARTITION BY grouping_column] [ORDER BY sorting_column])

For more detail on window function syntax see: Window Functions - Syntax.

Returned value

  • A number for the current row within its partition. UInt64.

Example

The following example is based on the example provided in the video instructional Ranking window functions in Tinybird.

Query:

SELECT player, salary, 
       row_number() OVER (ORDER BY salary DESC) AS row_number
FROM (
    select c1::String as team, c2::String as player, c3::UInt32 as salary, c4::String as position from values (
      ('Port Elizabeth Barbarians', 'Gary Chen', 195000, 'F'),
      ('New Coreystad Archdukes', 'Charles Juarez', 190000, 'F'),
      ('Port Elizabeth Barbarians', 'Michael Stanley', 150000, 'D'),
      ('New Coreystad Archdukes', 'Scott Harrison', 150000, 'D'),
      ('Port Elizabeth Barbarians', 'Robert George', 195000, 'M')
    )
)

Result:

┌─player──────────┬─salary─┬─row_number─┐
│ Gary Chen       │ 195000 │          1 │
│ Robert George   │ 195000 │          2 │
│ Charles Juarez  │ 190000 │          3 │
│ Michael Stanley │ 150000 │          4 │
│ Scott Harrison  │ 150000 │          5 │
└─────────────────┴────────┴────────────┘
Updated