Search For Missing Values Within a Numerical Sequence

Scope

This short article aims to show a method for quickly identifying gaps in a numerical sequence within a SQL Server table, to locate absences of consequence within given data. This approach is useful when there is a need, for example, to verify that there are no missing documents in a given list (think of the possibility of having to monitor a flow of invoices, then you must check if in a given range some of them are missing, basing your analysis on the document's registration number).

Basic concept

The script is based on a simple concept. Suppose we could have a table containing each number among those possible for managing our case. In that case, we could compare our actual list (where some numbers will potentially not be present) with the ideal list that contains them all and then verify the deficiencies.

Having a numbers table in each SQL server instance is highly recommended. In most cases, many queries can be improved using this numbers table. There is no reason to create a new table each time. Moreover, we can use one table for any query in any database in the server instance.M

Tip!

You can create an Accessories database with accessories like numbers tables, dates tables, CLR functions, T-SQL functions, and more. This database should be in a read-only mode and will make the queries faster with fewer locks and more security. All users should have privileges to read from this database.

If you have not yet created a numbers table, this is the time to do so.

---------------------------------------------------> Numbers table  
CREATE TABLE AriNumbersTbl (Number  int  not null)  
GO  
   
INSERT INTO AriNumbersTbl(Number)  
SELECT TOP 10000000 row_number() over(order by (select NULL)) as N  
FROM master..spt_values t1  
CROSS JOIN master..spt_values t2  
CROSS JOIN master..spt_values t3  
GO  
   
ALTER TABLE AriNumbersTbl ADD CONSTRAINT PK_AriNumbersTbl PRIMARY KEY CLUSTERED (Number)  
GO  

Application

We'll now create a table named Products for our comparisons. It will have the following two fields, the first is a hypothetical unique ID of a product, whereas the second is a description of the product itself. In a table like this, we will insert some test records.

CREATE TABLE Products( ProductID INT PRIMARY KEY, ProductDes VARCHAR (50))  
INSERT INTO Products VALUES(1 , 'Product 1' )  
INSERT INTO Products VALUES(2 , 'Product 2' )  
INSERT INTO Products VALUES(3 , 'Product 3' )  
INSERT INTO Products VALUES(4 , 'Product 4' )  
INSERT INTO Products VALUES(5 , 'Product 5' )  
INSERT INTO Products VALUES(10 , 'Product 10' )  

We have 5 consecutive records, and the sixth has an ID = 10. The following implements the preceding logic. If we wish to spot potential missing values, the result we should expect will consist of a record with an ID from 6 to 9, missing in the Products table but not in the AriNumbersTbl. Let's see how to do this.

SELECT number  
FROM AriNumbersTbl  
WHERE number NOT IN (SELECT ProductID FROM Products) AND  
      number <= (SELECT MAX(ProductID) FROM Products )  

This simple SELECT contains two subqueries in the WHERE clause. We will extract from the AriNumbersTbl table the values not contained in (NOT IN) the selection of all ID values from the Products table (SELECT ProductID FROM Products). If we stop here, we will obtain the values 6, 7, 8, and 9, but those between 11 and 99999 are absent from the Products table but not into AriNumbersTbl.

As an upper limit, we can then put the maximum value readable from the Products table. Since the higher ID is 10 at the moment, we are not interested in spotting the values that will go beyond that limit, being obvious for them to satisfy the basic conditions for our analysis. So, we add a second condition to extract, among all the values of AriNumbersTbl, those that won't go beyond the maximum ID present in the Products table (SELECT MAX(ProductID) FROM Products).

We'll obtain a list like this by executing our script in Management Studio using the example data.

image

Complete script

--> Don't forget that you need a numbers table which is not part of this script, since it is something that should be in any server instance.  
   
-- Products Table creation and population  
CREATE TABLE Products( ProductID INT PRIMARY KEY, ProductDes VARCHAR (50))  
INSERT INTO Products VALUES(1 , 'Product 1' )  
INSERT INTO Products VALUES(2 , 'Product 2' )  
INSERT INTO Products VALUES(3 , 'Product 3' )  
INSERT INTO Products VALUES(4 , 'Product 4' )  
INSERT INTO Products VALUES(5 , 'Product 5' )  
INSERT INTO Products VALUES(10 , 'Product 10' )  
    
-- Identifying numerical holes in products list  
SELECT *  
FROM AriNumbersTbl  
WHERE number NOT IN (SELECT ProductID FROM Products) AND  
      number <= (SELECT MAX( ProductID) FROM Products )  
    
-- Deleting sample tables  
DROP TABLE Products  


Similar Articles