Window Function in SQL

Introduction

A window function is a special type of function that allows you to perform calculations on a specific subset, or "window," of rows from a result set. It's like looking at a window into your data and performing calculations on that smaller portion. Window functions are different from regular aggregate functions like SUM or AVG because they calculate values for each row individually, taking into account the rows within the defined window. They can be used to perform various calculations, such as aggregations, ranking, and statistical operations, without the need for grouping the data or using subqueries.

 There are several types of window functions commonly used in SQL. Here are some of the most frequently used ones:

Aggregate Window Functions

These functions perform calculations on a subset of rows within a window and return a single aggregated result. 

SUM(): Calculates the sum of a column within the window.

//Calculate the total revenue for each region, including a column with the overall average revenue.

SELECT
    region,
    SUM(revenue) AS total_revenue,
    AVG(revenue) OVER () AS overall_avg_revenue
FROM
    sales
GROUP BY
    region;

AVG(): Computes the average of a column within the window.



SELECT
    product_id,
    product_name,
    category,
    price,
    AVG(price) OVER (PARTITION BY category) AS avg_price
FROM
    products;

COUNT(): Counts the number of rows within the window.

SELECT
    product_id,
    product_name,
    category,
    price,
    COUNT(*) OVER (PARTITION BY category) AS category_count
FROM
    products;

MAX(): Finds the maximum value within the window.

SELECT
    product_id,
    product_name,
    category,
    price,
    MAX(price) OVER (PARTITION BY category) AS max_price
FROM
    products;

MIN(): Finds the minimum value within the window.

SELECT
    emp_id,
    emp_name,
    department,
    salary,
    MIN(salary) OVER (PARTITION BY department) AS min_salary
FROM
    employees;

Ranking Window Functions

These functions assign a rank or position to each row within a window based on a specified criterion. 

ROW_NUMBER(): Assigns a unique number to each row within the window.

SELECT
    ROW_NUMBER() OVER (ORDER BY department) AS row_num,
    employee_id,
    employee_name,
    department,
    salary
FROM
    employee
ORDER BY
    department;

RANK(): Assigns a rank to each row, with gaps in case of ties.

SELECT
    RANK() OVER (ORDER BY price DESC) AS product_rank,
    product_id,
    product_name,
    category,
    price
FROM
    products;

DENSE_RANK(): Assigns a rank to each row without gaps in case of ties.

SELECT
    sale_id,
    product_name,
    category,
    sale_amount,
    DENSE_RANK() OVER (ORDER BY sale_amount DESC) AS dense_rank
FROM
    sales;

NTILE(): Divides the rows into specified buckets or percentiles.

SELECT
    student_id,
    student_name,
    score,
    NTILE(3) OVER (ORDER BY score DESC) AS tile_number
FROM
    students;

Analytic Window Functions

These functions provide additional analytical capabilities and often require both partitioning and ordering of rows. 

LAG(): Retrieves the value from a previous row within the window.

SELECT
    order_id,
    customer_id,
    order_date,
    order_total,
    LAG(order_total, 1, 0) OVER (PARTITION BY customer_id ORDER BY order_date) AS previous_order_total
FROM
    orders;

LEAD(): Retrieves the value from a subsequent row within the window.

SELECT
    employee_id,
    first_name,
    last_name,
    department,
    salary,
    LEAD(salary, 1, 0) OVER (PARTITION BY department ORDER BY employee_id) AS next_salary
FROM
    employees;

FIRST_VALUE(): Returns the value of a specified expression from the first row in the window.

SELECT
    employee_id,
    first_name,
    last_name,
    department,
    salary,
    FIRST_VALUE(salary) OVER (PARTITION BY department ORDER BY employee_id) AS first_salary
FROM
    employees;

LAST_VALUE(): Returns the value of a specified expression from the last row in the window.

SELECT
    employee_id,
    first_name,
    last_name,
    department,
    salary,
    LAST_VALUE(salary) OVER (PARTITION BY department ORDER BY employee_id
                             ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS last_salary
FROM
    employees;

Here are some  examples that aim to provide valuable insights and enhance your understanding of  Window functions effectively

Calculate the total revenue for each region, including a column with the overall average revenue.

SELECT
    region,
    SUM(revenue) AS total_revenue,
    AVG(revenue) OVER () AS overall_avg_revenue
FROM
    sales
GROUP BY
    region;

Rank products based on their sales quantities within each category.

SELECT
    category,
    product,
    sales_quantity,
    RANK() OVER (PARTITION BY category ORDER BY sales_quantity DESC) AS product_rank
FROM
    sales
ORDER BY
    category, product_rank;

Calculate the running total of sales revenue for each day.

SELECT
    sales_date,
    SUM(revenue) OVER (ORDER BY sales_date) AS running_total
FROM
    daily_sales
ORDER BY
    sales_date;

Determine the percentage of total sales revenue contributed by each product within its category.

SELECT
    category,
    product,
    revenue,
    revenue / SUM(revenue) OVER (PARTITION BY category) * 100 AS revenue_percentage
FROM
    sales
ORDER BY
    category, product;

Find the highest revenue achieved in each quarter.

SELECT
    DATE_TRUNC('quarter', sales_date) AS quarter,
    MAX(revenue) AS highest_revenue
FROM
    sales
GROUP BY
    quarter
ORDER BY
    quarter;

Calculate the average rating of movies within each genre.

SELECT
    genre,
    AVG(rating) AS average_rating
FROM
    movies
GROUP BY
    genre;

Determine the difference in sales quantity between the current row and the previous row.

SELECT
    order_date,
    sales_quantity,
    LAG(sales_quantity) OVER (ORDER BY order_date) AS previous_sales_quantity,
    sales_quantity - LAG(sales_quantity) OVER (ORDER BY order_date) AS sales_quantity_difference
FROM
    sales;

Rank customers based on their total purchase amounts.

SELECT
    customer_id,
    total_purchase_amount,
    RANK() OVER (ORDER BY total_purchase_amount DESC) AS purchase_rank
FROM
    (
    SELECT
        customer_id,
        SUM(purchase_amount) AS total_purchase_amount
    FROM
        purchases
    GROUP BY
        customer_id
    ) AS purchase_summary;

Calculate the cumulative percentage of sales revenue for each product.

SELECT
    product_id,
    sales_revenue,
    SUM(sales_revenue) OVER (ORDER BY sales_revenue DESC) AS cumulative_revenue,
    SUM(sales_revenue) OVER (ORDER BY sales_revenue DESC) / SUM(sales_revenue) OVER () * 100 AS cumulative_percentage
FROM
    sales;

Determine the average salary of employees within each department.

SELECT
    department,
    salary,
    AVG(salary) OVER (PARTITION BY department ORDER BY department) AS average_salary
FROM
    employees
ORDER BY
    department;

Calculate the moving average of sales quantities over a specific period.

SELECT
    Date,
    SalesQuantity,
    AVG(SalesQuantity) OVER (ORDER BY Date ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS MovingAverage
FROM
    Sales
ORDER BY
    Date;

Rank students based on their exam scores within each subject.

SELECT
    Subject,
    StudentName,
    ExamScore,
    RANK() OVER (PARTITION BY Subject ORDER BY ExamScore DESC) AS SubjectRank
FROM
    ExamScores;

Determine the percentage growth in revenue compared to the previous year for each quarter.

SELECT
    Year,
    Quarter,
    Revenue,
    (Revenue - LAG(Revenue) OVER (ORDER BY Year, Quarter)) / LAG(Revenue) OVER (ORDER BY Year, Quarter) * 100 AS RevenueGrowth
FROM
    RevenueData;

Find the top-selling product within each category.      

SELECT
    Category,
    Product,
    SalesQuantity,
    RANK() OVER (PARTITION BY Category ORDER BY SalesQuantity DESC) AS ProductRank
FROM
    SalesData;

Calculate the median salary of employees within each department. 

SELECT
    Department,
    Salary,
    PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY Salary) OVER (PARTITION BY Department) AS MedianSalary
FROM
    EmployeeData;

Determine the difference in sales revenue between the current row and the next row.


     
SELECT
    Date,
    Revenue,
    LEAD(Revenue) OVER (ORDER BY Date) - Revenue AS RevenueDifference
FROM
    SalesData;

Rank cities based on their population density.


     
SELECT
    City,
    PopulationDensity,
    RANK() OVER (ORDER BY PopulationDensity DESC) AS CityRank
FROM
    CityData;

Calculate the cumulative sum of sales quantities for each product.

     
SELECT
    Product,
    SalesQuantity,
    SUM(SalesQuantity) OVER (PARTITION BY Product ORDER BY Date) AS CumulativeSalesQuantity
FROM
    SalesData;

Determine the percentage of customers who made a repeat purchase within each month.

     
SELECT
    Month,
    COUNT(DISTINCT CustomerID) AS TotalCustomers,
    COUNT(DISTINCT CASE WHEN RepeatPurchase = 1 THEN CustomerID END) / COUNT(DISTINCT CustomerID) * 100 AS RepeatPurchasePercentage
FROM
    PurchaseData
GROUP BY
    Month;

Rank employees based on their performance scores within each department.

      
SELECT
    Department,
    Employee,
    PerformanceScore,
    RANK() OVER (PARTITION BY Department ORDER BY PerformanceScore DESC) AS EmployeeRank
FROM
    EmployeeData;

Calculate the average order value for each customer, including the overall average order value.

      
SELECT
    CustomerID,
    OrderValue,
    AVG(OrderValue) OVER (PARTITION BY CustomerID) AS AverageOrderValue,
    AVG(OrderValue) OVER () AS OverallAverageOrderValue
FROM
    OrderData;

Determine the difference in ratings between the current movie and the highest-rated movie within its genre.

      
SELECT
    Movie,
    Genre,
    Rating,
    MAX(Rating) OVER (PARTITION BY Genre) - Rating AS RatingDifference
FROM
    MovieData;

Calculate the maximum temperature recorded in each month.

SELECT
    EXTRACT(MONTH FROM Date) AS Month,
    MAX(Temperature) AS MaxTemperature
FROM
    WeatherData
GROUP BY
    EXTRACT(MONTH FROM Date);

Determine the percentage of total sales revenue contributed by each customer.

SELECT
    CustomerID,
    SalesRevenue,
    SalesRevenue / SUM(SalesRevenue) OVER () * 100 AS RevenuePercentage
FROM
    SalesData;

Rank countries based on their GDP per capital.

SELECT
    Country,
    GDPperCapita,
    RANK() OVER (ORDER BY GDPperCapita DESC) AS CountryRank
FROM
    CountryData;

Calculate the running total of customer orders within each week.

SELECT
    Week,
    CustomerID,
    OrderCount,
    SUM(OrderCount) OVER (PARTITION BY Week ORDER BY CustomerID) AS RunningTotal
FROM
    OrderData;

Calculate the average rating of movies released each year.

SELECT
    EXTRACT(YEAR FROM ReleaseDate) AS Year,
    AVG(Rating) AS AverageRating
FROM
    MovieData
GROUP BY
    EXTRACT(YEAR FROM ReleaseDate);

Determine the percentage of total revenue achieved by each salesperson.

SELECT
    Salesperson,
    Revenue,
    Revenue / SUM(Revenue) OVER () * 100 AS RevenuePercentage
FROM
    SalesData;

Calculate the cumulative product of sales quantities for each product.

SELECT
    Product,
    SalesQuantity,
    EXP(SUM(LOG(SalesQuantity)) OVER (PARTITION BY Product ORDER BY Date)) AS CumulativeProduct
FROM
    SalesData;

Determine the difference in population between the current city and the next city.

SELECT
    City,
    Population,
    LEAD(Population) OVER (ORDER BY City) - Population AS PopulationDifference
FROM
    CityData;

Determine the percentage decrease in sales revenue compared to the previous month for each month.

SELECT
    Month,
    SalesRevenue,
    (SalesRevenue - LAG(SalesRevenue) OVER (ORDER BY Month)) / LAG(SalesRevenue) OVER (ORDER BY Month) * 100 AS RevenueDecreasePercentage
FROM
    SalesData;

Find the highest-rated movie within each genre.

SELECT
    Genre,
    Movie,
    Rating
FROM (
    SELECT
        Genre,
        Movie,
        Rating,
        RANK() OVER (PARTITION BY Genre ORDER BY Rating DESC) AS MovieRank
    FROM
        MovieData
) AS RankedMovies
WHERE
    MovieRank = 1;

Calculate the median age of employees within each department.

SELECT
    Department,
    PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY Age) OVER (PARTITION BY Department) AS MedianAge
FROM
    EmployeeData;

Calculate the cumulative count of unique customers for each product.

SELECT
    Product,
    CustomerID,
    COUNT(DISTINCT CustomerID) OVER (PARTITION BY Product ORDER BY Date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS CumulativeUniqueCustomers
FROM
    SalesData;

Determine the percentage increase in sales quantity compared to the previous month for each month.

SELECT
    Month,
    SalesQuantity,
    (SalesQuantity - LAG(SalesQuantity) OVER (ORDER BY Month)) / LAG(SalesQuantity) OVER (ORDER BY Month) * 100 AS QuantityIncreasePercentage
FROM
    SalesData;

Find the top-performing employee within each department.

SELECT
    Department,
    Employee,
    PerformanceScore
FROM (
    SELECT
        Department,
        Employee,
        PerformanceScore,
        RANK() OVER (PARTITION BY Department ORDER BY PerformanceScore DESC) AS EmployeeRank
    FROM
        EmployeeData
) AS RankedEmployees
WHERE
    EmployeeRank = 1;

Determine the difference in customer satisfaction ratings between the current row and the highest-rated row within each department.

SELECT
    Department,
    CustomerID,
    SatisfactionRating,
    SatisfactionRating - MAX(SatisfactionRating) OVER (PARTITION BY Department) AS RatingDifference
FROM
    CustomerData;

These are just a few examples of the types of window functions available in SQL. The specific functions and syntax may vary depending on the database system you are using. Window functions provide a powerful toolset for performing complex calculations and analysis within SQL queries.

Thank you for reading, and I hope this post has helped provide you with a better understanding of the Window Function in SQL.