CUMEDIST Analytic Functions In SQL Server 2012

The CUME_DIST () are analytic functions in SQL Server 2012

The purpose of CUME_DIST () function is to calculate cumulative distribution value in a group of the values in a given result set or a partition.

The CUME_DIST () formula

“The values less than or equal to the current value row number” / “Total number of row in a group or partition”. The return type is float (53) and the values are always between 0 and 1.

The syntax of CUME_DIST () is given below.

  1. CUME_DIST ( )  
  2. OVER ([ partition_by_clause ] order_by_clause )  
Example 1 CUME_DIST ()
  1. CREATE TABLE [CompanyShareRates]  
  2. (  
  3. [Id] INT IDENTITY,  
  4. [Date] DATETIME,  
  5. [NameVARCHAR(100),  
  6. [ShareRate] NUMERIC(18,2)  
  7. )  
  8.   
  9. INSERT INTO [CompanyShareRates] VALUES('2017-02-03','TCS',200.00)  
  10. INSERT INTO [CompanyShareRates] VALUES('2017-02-04','TCS',210.00)  
  11. INSERT INTO [CompanyShareRates] VALUES('2017-02-05','TCS',230.00)  
  12. INSERT INTO [CompanyShareRates] VALUES('2017-02-06','TCS',230.00)  
  13. INSERT INTO [CompanyShareRates] VALUES('2017-02-07','TCS',250.50)  
  14.   
  15.   
  16. SELECT  
  17. ROW_NUMBER() OVER (ORDER BY [Name],[ShareRate]) AS [Row Number]  
  18. ,[Date]  
  19. ,[NAME]  
  20. ,[ShareRate]  
  21. ,CUME_DIST () OVER (ORDER BY [Name],[ShareRate]) AS [CUME_DIST]  
  22. FROM [CompanyShareRates]  

Explanation

CUME_DIST is calculated in Excel.



In ‘TCS’ group given above, we have 4 share rates in 5 rows.

Column E

For the 1st row, there is 1 row with value (200) or lower. Hence, the ‘E’ Column value will be 1.

For the 2nd row, there are 2 rows with value (210) or lower. Hence, the ‘E’ Column value will be 2.

For the 3rd and 4th rows, there are 4 rows with value (230) or lower. Hence, the ‘E’ Column value will be 4 for both the row.

For the 5th row, there are 5 rows with the value (250.5) or lower. Hence, the ‘E’ Column value will be 5.

Column F

Total number of rows in result set. i.e. -- 5

Column G 
  • CMUE_DIST for row 1 = 1/5 = 0.2
  • CMUE_DIST for row 2 = 2/5 = 0.4
  • CMUE_DIST for row 3 = 4/5 = 0.8
  • CMUE_DIST for row 4 = 4/5 = 0.8
  • CMUE_DIST for row 5 = 5/5 = 1
Example 2 CUME_DIST () with Partition by Clause
  1. INSERT INTO [CompanyShareRates] VALUES('2017-02-03','Hero Motocorp',390.50)  
  2. INSERT INTO [CompanyShareRates] VALUES('2017-02-03','GAIL',302.00)   
  3. INSERT INTO [CompanyShareRates] VALUES('2017-02-04','Hero Motocorp',391.00)  
  4. INSERT INTO [CompanyShareRates] VALUES('2017-02-04','GAIL',302.25)   
  5. INSERT INTO [CompanyShareRates] VALUES('2017-02-05','Hero Motocorp',390.00)  
  6. INSERT INTO [CompanyShareRates] VALUES('2017-02-05','GAIL',203.25)   
  7. INSERT INTO [CompanyShareRates] VALUES('2017-02-06','Hero Motocorp',389.50)  
  8. INSERT INTO [CompanyShareRates] VALUES('2017-02-06','GAIL',401.00)   
  9. INSERT INTO [CompanyShareRates] VALUES('2017-02-07','Hero Motocorp',389.00)  
  10. INSERT INTO [CompanyShareRates] VALUES('2017-02-07','GAIL',401.00)  
  11.   
  12.   
  13. SELECT  
  14. ROW_NUMBER() OVER (Partition by Name ORDER BY [Name],[ShareRate]) AS [Row Number]  
  15. ,[Date]  
  16. ,[NAME]  
  17. ,[ShareRate]  
  18. ,CUME_DIST () OVER (Partition by Name ORDER BY [Name],[ShareRate]) AS [CUME_DIST]  
  19. FROM [CompanyShareRates]