What is MS SQL Table Partition?

What is table partitioning?

Table partitioning in SQL is a database design technique that involves dividing a large table into smaller, more manageable pieces called partitions. Each partition is essentially a subset of the table's data, and it has its storage characteristics and may be stored in a separate filegroup or physical location. The goal of table partitioning is to improve query performance, simplify data management, and enhance maintenance tasks.

Here are some key concepts related to table partitioning in SQL.

Partition Key

A partitioned table is divided based on a column or set of columns called the partition key. The values in this key determine which partition a specific row belongs to.

The partition key is chosen based on the distribution of data and the requirements of queries.

Types of partitioning

  1. Range Partitioning: Rows are grouped based on a specified range of values in the partition key. For example, you might partition a sales table by date ranges.
  2. List Partitioning: Rows are grouped based on a predefined list of values in the partition key. This is useful when the partitioning is based on discrete values.
  3. Hash Partitioning: Rows are distributed across partitions based on a hash function applied to the partition key. This can help evenly distribute data without requiring specific knowledge of data ranges.
  4. Composite (Multi-level) Partitioning: Combining multiple partitioning methods.

Benefits of table partitioning

  1. Improved Query Performance: Queries that involve the partition key can benefit from partition pruning, where the database engine only accesses relevant partitions, reducing the amount of data scanned.
  2. Easier Data Management: Partitioning can simplify the process of managing large tables by allowing for more efficient data loading, unloading, and archiving of partitions.
  3. Enhanced Maintenance Operations: For tasks like index rebuilds or updates, partitioning can make these operations more efficient as they can be applied to specific partitions instead of the entire table.

What is MDF and NDF?

In Microsoft SQL Server, MDF and NDF are file extensions associated with database files. These files store the data and objects for an SQL Server database. Here's what each of these file types represents

MDF (Primary data file)

  • The primary data file has an extension of .mdf. It is the primary database file that contains the startup information for the database and points to the other files in the database.
  • The MDF file holds system tables, user tables, and other database objects.
  • When you create a new database, at least one MDF file is created, and it is associated with the primary filegroup. Additional filegroups can be added, each with its data file (NDF), but the primary data file (MDF) is required.

NDF (Secondary data file)

  • The secondary data files have extensions of .ndf. These are additional data files that can be associated with user-defined filegroups within a database.
  • When you create additional filegroups, you can add NDF files to store user data on different disks or storage devices. This helps distribute I/O and improve performance.
  • NDF files are optional, and a database can have zero or more secondary data files associated with different file groups.

Create database

Create a new database or use any other existing DB that is available on your machine.

Create file-group

In SQL Server, a filegroup is a logical container for database files. A database is divided into one or more file groups, and each filegroup contains one or more data files. The primary purpose of using filegroups is to provide a level of organization and control over the physical storage of database objects. This allows administrators to allocate database objects to specific filegroups based on various criteria, such as usage patterns, performance considerations, and manageability.

First, we want to create a file group Like below.

ALTER DATABASE PartitionTest
ADD FILEGROUP Jan_2023
GO
ALTER DATABASE PartitionTest
ADD FILEGROUP Feb_2023
GO
ALTER DATABASE PartitionTest
ADD FILEGROUP Mar_2023
GO

Select Available File Group after file group creation like above.

You can Select and see the Available File Group using the below query.

SELECT name AS AvailableFilegroups
FROM sys.filegroups
WHERE type = 'FG'

Create .ndf file attached with File Group which we created in Point-1.

ALTER DATABASE PartitionTest
ADD FILE
(

  NAME = [Partition_Jan_2023],
 FILENAME = 'D:\SQL_Do_Not_Delete\Data\Jan_2023.ndf',
  SIZE = 5 MB, 
    MAXSIZE = UNLIMITED,
    FILEGROWTH = 10 MB
) TO FILEGROUP Jan_2023
GO


ALTER DATABASE PartitionTest
ADD FILE
(
  NAME = [Partition_Feb_2023],
  FILENAME = 'D:\SQL_Do_Not_Delete\Data\FEB_2023.ndf',
    SIZE = 5 MB, 
    MAXSIZE = UNLIMITED,
    FILEGROWTH = 10 MB
) TO FILEGROUP Feb_2023
GO


ALTER DATABASE PartitionTest
ADD FILE
(
  NAME = [Partition_Mar_2023],
  FILENAME = 'D:\SQL_Do_Not_Delete\Data\Mar_2023.ndf',
    SIZE = 5 MB, 
    MAXSIZE = UNLIMITED,
    FILEGROWTH = 10 MB
) TO FILEGROUP Mar_2023
GO

To Check the file created and associated with the filegroup.

You can see using the below query, the ndf file created and associated with file groups.

SELECT
name as [FileName],
physical_name as [FilePath]
FROM sys.database_files
where type_desc = 'ROWS'
GO

Create a partition function

CREATE PARTITION FUNCTION [TransactionPartitioningByMonth] (datetime)
AS RANGE RIGHT FOR VALUES (  '20230201', '20230301'   );

Create partition scheme

CREATE PARTITION SCHEME [TransactionPartitioningByMonth]
AS PARTITION [TransactionPartitioningByMonth]
TO (Jan_2023, Feb_2023, Mar_2023   );

Create table

CREATE TABLE [dbo].[Transactions](
         [ID] [bigint] IDENTITY(1,1) NOT NULL,
         [Date] [datetime] NOT NULL,
         [Amoutn] [nvarchar](500) NULL,
 CONSTRAINT [PK_Transactions] PRIMARY KEY CLUSTERED
(
         [ID] ASC,
         [Date] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF)
)
GO

Apply partition to the table

BEGIN TRANSACTION
ALTER TABLE [dbo].[Transactions] DROP CONSTRAINT[PK_Transactions] WITH ( ONLINE = OFF )
ALTER TABLE [dbo].[Transactions] ADD  CONSTRAINT [PK_Transactions] PRIMARY KEY CLUSTERED
(
         [ID] ASC,
         [Date] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [TransactionPartitioningByMonth]([Date])
COMMIT TRANSACTION

Insert data into the created table

Insert data into create table
GO
SET IDENTITY_INSERT [dbo].[Transactions] ON
GO
INSERT [dbo].[Transactions] ([ID], [Date], [Amoutn]) VALUES (4, CAST(N'2023-02-01T00:00:00.000' AS DateTime), N'10')
GO
INSERT [dbo].[Transactions] ([ID], [Date], [Amoutn]) VALUES (5, CAST(N'2023-02-02T00:00:00.000' AS DateTime), N'20')
GO
INSERT [dbo].[Transactions] ([ID], [Date], [Amoutn]) VALUES (6, CAST(N'2023-02-03T00:00:00.000' AS DateTime), N'30')
GO
INSERT [dbo].[Transactions] ([ID], [Date], [Amoutn]) VALUES (7, CAST(N'2023-03-01T00:00:00.000' AS DateTime), N'10')
GO
INSERT [dbo].[Transactions] ([ID], [Date], [Amoutn]) VALUES (8, CAST(N'2023-03-02T00:00:00.000' AS DateTime), N'20')
GO
INSERT [dbo].[Transactions] ([ID], [Date], [Amoutn]) VALUES (9, CAST(N'2023-03-03T00:00:00.000' AS DateTime), N'30')
GO
SET IDENTITY_INSERT [dbo].[Transactions] OFF
GO

Select row count partition-wise

SELECT
p.partition_number AS PartitionNumber,
f.name AS PartitionFilegroup,
p.rows AS NumberOfRows
FROM sys.partitions p
LEFT JOIN sys.destination_data_spaces dds ON p.partition_number = dds.destination_id
LEFT JOIN sys.filegroups f ON dds.data_space_id = f.data_space_id
WHERE OBJECT_NAME(OBJECT_ID) = 'Transactions'

Result

Move Specific Partition to another table.

Create stating table

BEGIN TRANSACTION
SET ANSI_NULLS ON
SET QUOTED_IDENTIFIER ON
CREATE TABLE [dbo].[staging_Transactions_20240130-105335](
         [ID] [bigint] NOT NULL,
         [Date] [datetime] NOT NULL,
         [Amoutn] [nvarchar](500) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
) ON [Jan_2023]
ALTER TABLE [dbo].[staging_Transactions_20240130-105335] ADD  CONSTRAINT [staging_Transactions_20240130-105335_PK_Transactions] PRIMARY KEY CLUSTERED
(
         [ID] ASC,
         [Date] ASC
)WITH (PAD_INDEX = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [Jan_2023]
ALTER TABLE [dbo].[staging_Transactions_20240130-105335]  WITH CHECK ADD  CONSTRAINT [chk_staging_Transactions_20240130-105335_partition_1] CHECK  ([Date]<N'2023-02-01T00:00:00')
ALTER TABLE [dbo].[staging_Transactions_20240130-105335] CHECK CONSTRAINT [chk_staging_Transactions_20240130-105335_partition_1]
COMMIT TRANSACTION

Move to a different table to the same DB

BEGIN TRANSACTION
ALTER TABLE  [Transactions] SWITCH PARTITION 1 TO  staging_Transactions_20240130-105335] WITH (WAIT_AT_LOW_PRIORITY (MAX_DURATION = 0 MINUTES, ABORT_AFTER_WAIT = NONE))
COMMIT TRANSACTION

Delete file group and ndf file

 /**
 --##################
 Remove file group and partition.
 #################################
 ALTER DATABASE PartitionTest REMOVE FILE [Partition_Mar_2023]
GO
ALTER DATABASE PartitionTest REMOVE FILEGROUP [Mar_2023]
 ALTER DATABASE PartitionTest REMOVE FILE [Partition_Feb_2023]
GO
ALTER DATABASE PartitionTest REMOVE FILEGROUP [Feb_2023]
 ALTER DATABASE PartitionTest REMOVE FILE [Partition_Jan_2023]
GO
ALTER DATABASE PartitionTest REMOVE FILEGROUP [Jan_2023]
**/