In Focus

Change Data Capture (CDC) in SQL Server 2008

CDC is capture insert, update and delete activity on SQL tables.

CDC is capture insert, update and delete activity on SQL tables. CDC contains a column structure that is exactly the same as the column structure of the source table; in other words it mirrors the column structure of the tracked source table along with the metadata that is required to understand changes done in the table's data.

CDC is for providing information about the DML (Data Manipulation Language) changes on the table and database. It help us to remove expensive techniques like trigger, timestamp column and complex join queries.

To use CDC, it must be configured.

Setup & Configure CDC

To use CDC, it must be enabled at the database level; by default it is disabled. To enable CDC, you must be the member of "SYSADMIN" (Fixed role of SQL Server). You can enable CDC only on a user Database, not a system database.

To determine whether a database is CDC enabled or not, run the following T-SQL:

select name, is_cdc_enabled from sys.databases

CDCSQL1.gif

To enable CDC on a database, use the system stored procedure called "SYS.SP_CDC_ENABLE_DB" .Execute following T-SQL.

USE AdventureWorks
Go
EXEC
sys.sp_cdc_enable_db
GO

CDCSQL2.gif

The following CDC tables are created under the CDC schema.

  1. cdc.captured_columns
  2. cdc.change_tables
  3. cdc.ddl_history
  4. cdc.index_columns
  5. cdc.lsn_time_mapping

The next Step is to enable CDC on a table, to track changes in the table data.

CDCSQL3.gif

USE AdventureWorks
GO
EXEC
sys.sp_cdc_enable_table
@source_schema = 'dbo',
@source_name = 'Customer',
@role_name = 'cdc_Customer'
GO

CDCSQL4.gif

The sys.sp_cdc_enable_table system stored procedure has a few parameters.

table2.jpg

While Enabling CDC, it will create a certain stored produce, SQL job and function.

CDCSQL5.gif

CDCSQL6.gif

CDCSQL7.gif

Example of CDC:

Perform the couple of insert statements on "Customer" table.

Insert into Customer values('Jignesh Trivedi','Address 1','Address 2','Address 3','City 1','State 1')

Insert into Customer values('Tejas Trivedi','Address 1','Address 2','Address 3','City 3','State 4')

Now let us check the effect of the above query when run on the database.

To determine the change in table data, SQL has two functions.

  1. fn_cdc_get_net_changes_dbo_TableName.

  2. fn_cdc_get_all_changes_dbo_TableName.

And these functions have a couple of parameters.

Parameter

Description

from_lsn

LSN No that lowest or start result set.

to_lsn

LSN No that highest or end result set.

row_filter_option

The Option governs the content of metadata column as well as row rows return in the result set.

Refer to MSDN for more information about the fn_cdc_get_net_changes_dbo_TableName function.

Result:

declare @begin_lsn binary(10), @end_lsn binary(10)
-- get the first LSN for customer changes
select @begin_lsn = sys.fn_cdc_get_min_lsn('dbo_customer')
-- get the last LSN for customer changes
select @end_lsn = sys.fn_cdc_get_max_lsn()
-- get net changes; group changes in the range by the pk|
select * from cdc.fn_cdc_get_net_changes_dbo_customer(@begin_lsn, @end_lsn, 'all');
-- get individual changes in the range
select * from cdc.fn_cdc_get_all_changes_dbo_Customer(@begin_lsn, @end_lsn, 'all');

CDCSQL8.gif

To find out difference between above two functions, do some more DML on "Customer" table.

UPDATE Customer
SET AddressLine1 = 'Address 5'
WHERE CustomerID = 1

Insert into Customer values('Rakesh Trivedi','Address 4','Address 4','Address 4','City 4','State 4')

DELETE Customer WHERE CustomerID = 3

Now we run above two functions.

CDCSQL9.gif

Function "fn_cdc_get_all_changes_dbo_TableName" contains all changes between from LSN and To LSN. And function "fn_cdc_get_net_changes_dbo_TableName" contains only net changes on table.

Reference:

http://msdn.microsoft.com/en-us/library/bb522649.aspx
http://msdn.microsoft.com/en-us/library/cc627369.aspx