What are Window functions ROW_NUMBER(),Rank(), DENSE_RANK() in SQL?

Introduction

In the dynamic landscape of SQL Server, Windows functions stand out as powerful tools for analytical processing. They offer a distinct way to perform calculations across a specified range of rows related to the current row. In this blog, we'll delve into the world of Window Functions using a simple table example, exploring various scenarios, and employing normalization techniques.

Understanding Window Functions

Window Functions operate within a specified window of rows related to the current row without reducing the result set. They bring a new level of analytical capabilities to SQL queries. Let's start by creating a simple table and populating it with data.

-- Create a simple table
CREATE TABLE Sales (
    SaleID INT PRIMARY KEY,
    ProductID INT,
    SaleDate DATE,
    Amount DECIMAL(10, 2)
);

-- Insert sample data
INSERT INTO Sales VALUES (1, 101, '2023-01-01', 1500.00);
INSERT INTO Sales VALUES (2, 102, '2023-01-02', 2000.00);
INSERT INTO Sales VALUES (3, 101, '2023-01-03', 1200.00);
INSERT INTO Sales VALUES (4, 102, '2023-01-04', 1800.00);

Sales

Sales

ROW_NUMBER() - Enumerating Rows

The ROW_NUMBER() function assigns a unique number to each row based on a specified order. This can be useful for pagination or ranking purposes.

-- Enumerate rows based on SaleDate
SELECT SaleID, SaleDate, Amount,
       ROW_NUMBER() OVER (ORDER BY SaleDate) AS RowNum
FROM Sales;

Row_Number

RANK() - Ranking Rows

RANK() assigns a unique rank to each distinct row, leaving gaps for tied values.

-- Rank rows based on Amount
SELECT SaleID, SaleDate, Amount,
       RANK() OVER (ORDER BY Amount DESC) AS SalesRank
FROM Sales;

Rank

DENSE_RANK() - Dense Ranking Rows

DENSE_RANK() is similar to RANK(), but without gaps for tied values.

-- Dense rank rows based on Amount
SELECT SaleID, SaleDate, Amount,
       DENSE_RANK() OVER (ORDER BY Amount DESC) AS DenseSalesRank
FROM Sales;

Dense Rank

In this specific dataset, since there are no tied values in the Amount column, the rankings provided by both RANK() and DENSE_RANK() are identical.

Here's a brief explanation of the differences:

  • RANK()

    • Assigns a unique rank to each distinct row.
    • Leaves gaps in the ranking for tied values. If two rows have the same value, they both get the same rank, and the next rank is skipped.
  • DENSE_RANK()

    • Similar to RANK().
    • Does not leave gaps for tied values. If two rows have the same value, they both get the same rank, and the next rank is not skipped.

SUM() - Cumulative Sum

SUM() as a window function enables the calculation of cumulative sums.

-- Calculate cumulative sum of Amount
SELECT SaleID, SaleDate, Amount,
       SUM(Amount) OVER (ORDER BY SaleDate) AS CumulativeSum
FROM Sales;

Sum

AVG() - Moving Average

AVG() as a window function calculates a moving average over a specified window of rows.

-- Calculate 3-day moving average of Amount
SELECT SaleID, SaleDate, Amount,
       AVG(Amount) OVER (ORDER BY SaleDate ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS MovingAverage
FROM Sales;

AVG

LEAD() and LAG() - Accessing Adjacent Rows

LEAD() and LAG() provide access to subsequent and preceding rows, respectively.

-- Access next and previous SaleDate
SELECT SaleID, SaleDate, Amount,
       LEAD(SaleDate) OVER (ORDER BY SaleDate) AS NextSaleDate,
       LAG(SaleDate) OVER (ORDER BY SaleDate) AS PreviousSaleDate
FROM Sales;

LEAD_LAG

Normalizing Data Using Window Functions

Normalization is a key database design principle. Let's use Window Functions to normalize data by creating a new table to store aggregated information.

-- Create a normalized table
CREATE TABLE ProductSales (
    ProductID INT PRIMARY KEY,
    TotalSales DECIMAL(10, 2),
    AverageSaleAmount DECIMAL(10, 2)
);

-- Insert normalized data using Window Functions
INSERT INTO ProductSales
SELECT ProductID,
       SUM(Amount) OVER (PARTITION BY ProductID) AS TotalSales,
       AVG(Amount) OVER (PARTITION BY ProductID) AS AverageSaleAmount
FROM Sales
GROUP BY ProductID;

ProductSales

Product Sales

In this normalized table, the ProductSales table captures the total sales and average sale amount for each product, eliminating redundant information.

Conclusion

Window Functions in SQL Server are a treasure trove for data analysts and developers alike. From ranking rows to calculating cumulative sums and moving averages, these functions elevate your analytical capabilities. Understanding and mastering Windows functions empower you to derive valuable insights from your data and enhance the efficiency of your SQL queries. As demonstrated in the scenarios above, they can also play a pivotal role in normalizing data, contributing to a well-designed and efficient database schema. Happy querying!