The UNNEST
function in BigQuery is used to transform arrays or repeated fields within a table into individual rows. This is particularly useful when dealing with data stored in nested structures, such as arrays or structs, as it allows you to flatten and work with the data more easily.
This can be useful for a variety of purposes, such as:
- Extracting individual elements from an array
- Counting the number of elements in an array
- Finding the minimum or maximum value in an array
- Grouping data by the values in an array
Toy Examples
Example 1: Explode an array into separate rows
Suppose you have a table called orders with a column called items which contains an array of items for each order. You can use UNNEST
to explode the array into separate rows:
SELECT
order_id,
item
FROM orders,
UNNEST(items) AS item
Example 2: Combine UNNEST with other functions
You can also combine UNNEST
with other functions to perform more complex operations. For example, let’s say you have a table called users with a column called interests which contains an array of interests for each user. You can use UNNEST
along with COUNT
to count the number of users for each interest:
SELECT
interest,
COUNT(*) AS user_count
FROM users,
UNNEST(interests) AS interest
GROUP BY
interest
Toy Example Walkthrough
First create some data with an array field in and take a look
WITH toy_table AS (
SELECT "Marquin" AS user, ["bread", "grapes", "pringles", "pizza"] AS favourite_food
UNION ALL
SELECT "Jo", ["pizza", "banana"]
UNION ALL
SELECT "Sam", ["corn", "kiwi", "rice", 'jam', "chocolate"]
UNION ALL
SELECT "Peter", ["ice cream", "apple", "chocolate", "sausage", "pizza"]
)
SELECT *
FROM toy_table;
UNNEST use case: We want to find out which food item is most commonly on people’s favourite food list.
The table in its current form does not really allow for the counting of each item in the array.
What we need is for each item in the array to be represented in its own row. The query below achieves this.
WITH toy_table AS (
SELECT "Marquin" AS user, ["bread", "grapes", "pringles", "pizza"] AS favourite_food
UNION ALL
SELECT "Dejan", ["pizza", "banana"]
UNION ALL
SELECT "Heather", ["corn", "kiwi", "rice", 'jam', "chocolate"]
UNION ALL
SELECT "Peter", ["ice cream", "apple", "chocolate", "sausage", "pizza"]
)
SELECT
user,
items
FROM toy_table,
UNNEST(favourite_food) AS items;
So now each favourite food is on its own row, it is able to be counted and grouped to find the most popular foods
WITH toy_table AS (
SELECT "Marquin" AS user, ["bread", "grapes", "pringles", "pizza"] AS favourite_food
UNION ALL
SELECT "Dejan", ["pizza", "banana"]
UNION ALL
SELECT "Heather", ["corn", "kiwi", "rice", 'jam', "chocolate"]
UNION ALL
SELECT "Peter", ["ice cream", "apple", "chocolate", "sausage", "pizza"]
),
item_on_each_row AS (
SELECT
user,
items
FROM toy_table,
UNNEST(favourite_food) AS items
)
SELECT
items,
COUNT(*) AS times_as_favourite
FROM item_on_each_row
GROUP BY
items
ORDER BY
COUNT(*) DESC
Additional Resources
Working with Arrays. BigQuery official documentation