Harnessing Regular Expressions in SQL Server using CLR Integration

This article discusses how to create a Regex library of CLR functions for SQL Server. CLR Integration is available in SQL Server 2005 and above.


Purpose

Many times it is useful to have regular expression support in SQL Server. This article discusses how to create a Regex library of CLR functions for SQL Server. CLR Integration is available in SQL Server 2005 and above.

Implementation Details
 

Create a SQL Server Project in Visual Studio 2008.

Add a new User-Defined Function file to the project.

Create CLR functions in this file as in the following sample.


    [Microsoft.SqlServer.Server.SqlFunction] 

    public static SqlBoolean RegexMatch(string input, string pattern)

    {

        return Regex.Match(input, pattern).Success;

    }


Build and deploy the solution to the database. The CLR functions will show up in the database Functions section under Programmability as shown below.

CLR_Functions_View.jpg

Ensure that CLR Integration is enabled in the SQL Server Surface Area Configuration tool.

Enable CLR for the database as shown below

EXEC sp_configure 'clr enabled' , '1'

Sample Usage


You can use the CLR functions in your SQL as show below.


 

DECLARE @input AS NVARCHAR(4000)

DECLARE @pattern AS NVARCHAR(4000)

 

SET @input = 'red,magenta,green'

SET @pattern = '^((?<colour>.+?)(,|$))+$'

 

 

-- matches if green in the input

IF (dbo.RegexMatch(@input, '(G|g)reen') = 1)

      SELECT 'green found in string' AS [Message]

 

-- gets the number of captures in the named group colour

SELECT dbo.RegexMatchGroupCaptureCount(@input, @pattern, 'colour') AS CaptureCount

 

-- gets the value of capture at index 2 in the named group colour

SELECT dbo.RegexMatchGroupAt(@input, @pattern, 'colour', 2) AS CaptureItem

 

SELECT * FROM dbo.RegexMatchGroupCaptures(@input, @pattern, 'colour')

 

SET @pattern = '^\s*|\s*$'

 

-- performs a trim

SELECT dbo.RegexReplace('  This should perform a trim  ', @pattern, '') AS Trimmed

 

-- splits the input string on comma delimiter

SELECT * FROM dbo.RegexSplit(@input, ',')

Results for above SQL


CLR_SQL_Results.jpg 

 

Regex Library for SQL Server

 

Project Toolkit.SqlServer.CLR  contains code that provides the following API to support Regular Expressions in the database.

CLR_ClassDiag.jpg

CLR Functions Library :

Function

Parameters

RegexMatch

input

string

Input string

pattern

string

The Regex pattern

Return type : SqlBoolean 

Returns true if matched else false.

RegexMatchGroupAt

Input

string

Input string

pattern

string

The Regex pattern

groupName

string

Name of the named group

captureIndex

int

The index into the capture

Returns: SqlString

The value of the capture for the group at the index.

RegexMatchGroupIndexAt

Input

string

Input string

pattern

string

The Regex pattern

groupIndex

string

Index of the group

captureIndex

int

The index into the capture

Returns: SqlString

The value of the capture for the group at the index.

RegexMatchGroupCaptureCount

input

string

Input string

pattern

string

The Regex pattern

groupName

string

Name of the named group

Returns: SqlInt32

The number of captures for a specified named group.

RegexMatchGroupIndexCaptureCount

input

string

Input string

pattern

string

The Regex pattern

groupIndex

string

Index of the group

Returns: SqlInt32

The number of captures for a specified named group.

RegexMatchGroupCaptures

input

string

Input string

pattern

string

The Regex pattern

groupName

string

Name of the named group

Returns: IEnumerable i.e. a Table Valued Variable  containing the capture values

RegexMatchGroupIndexCaptures

input

string

Input string

pattern

string

The Regex pattern

groupIndex

string

Index of the group

Returns: IEnumerable i.e. a Table Valued Variable  containing the capture values

RegexReplace

input

string

Input string

pattern

string

The Regex pattern

replacement

string

The replacement string

Returns: SqlString

The string after the replacement.

RegexSplit

input

string

Input string

pattern

string

The Regex pattern

Returns: IEnumerable i.e. a Table Valued Variable  containing the split values