Regular expressions, commonly known as regex, are powerful and flexible tools used for pattern matching and manipulation of text. A regular expression is essentially a sequence of characters that defines a search pattern.
Regular expressions are used to perform various operations on strings, such as searching for specific patterns, validating input, or transforming text data. They are widely supported in programming languages, text editors, and command-line tools.
A regular expression consists of two types of characters: literal characters and metacharacters. Literal characters represent themselves and match the same characters in the text being searched. For example, the regular expression "cat" matches the string "cat" exactly.
Metacharacters, on the other hand, have special meanings and are used to define the rules and patterns within the regular expression. Some common metacharacters include:
- Dot (.) - Matches any single character except a new line.
- Asterisk (*) - Matches zero or more occurrences of the preceding character or group.
- Plus sign (+) - Matches one or more occurrences of the preceding character or group.
- Question mark (?) - Matches zero or one occurrence of the preceding character or group.
- Square brackets ([]) - Define a character class and match any single character within the brackets.
- Caret (^) - Matches the beginning of a line or string.
- Dollar sign ($) - Matches the end of a line or string.
- Pipe (|) - Represents the OR operator, allowing for multiple alternatives.
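The metacharacters listed above can be tried out with a short C# sketch. The class name and the sample strings are made up for illustration; only the static Regex.IsMatch method from System.Text.RegularExpressions is used.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; the inputs are illustrative only.
public static class MetacharacterDemo
{
    public static void Main()
    {
        // Dot: any single character except a newline
        Console.WriteLine(Regex.IsMatch("cat", "c.t"));      // True
        Console.WriteLine(Regex.IsMatch("ct", "c.t"));       // False

        // Asterisk and plus: zero-or-more versus one-or-more
        Console.WriteLine(Regex.IsMatch("ac", "ab*c"));      // True
        Console.WriteLine(Regex.IsMatch("ac", "ab+c"));      // False

        // Question mark: the preceding character is optional
        Console.WriteLine(Regex.IsMatch("color", "colou?r")); // True

        // Square brackets: any one character from the class
        Console.WriteLine(Regex.IsMatch("bat", "[bc]at"));   // True
        Console.WriteLine(Regex.IsMatch("rat", "[bc]at"));   // False

        // Caret and dollar sign: start and end anchors
        Console.WriteLine(Regex.IsMatch("concat", "^cat"));  // False
        Console.WriteLine(Regex.IsMatch("concat", "cat$"));  // True

        // Pipe: either alternative may match
        Console.WriteLine(Regex.IsMatch("hotdog", "cat|dog")); // True
    }
}
```

Note that IsMatch succeeds if the pattern matches anywhere in the input, which is why the anchors matter in the "concat" cases.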
Regular expressions can be as simple as matching a specific word or character, or they can be more complex and involve combinations of metacharacters to define intricate patterns.
Here are a few examples of what you can do with regular expressions:
- Search for email addresses within a document: \b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b
- Validate a phone number in a specific format: \(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}
- Find and replace all occurrences of a word (sed-style substitution syntax): s/old_word/new_word/g
- Extract all URLs from a webpage: https?://[\w./?-]+
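As a minimal sketch, the kinds of patterns listed above could be used from C# roughly as follows. The sample text, the class name, and the constant names are invented for this example. Two assumptions are worth flagging: the email pattern escapes the dot before the top-level domain so that it matches a literal period, and the phone pattern is written as a ten-digit format with an optional parenthesized area code, with ^ and $ added so that it validates the whole input rather than a substring.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; names and sample data are illustrative only.
public static class PatternExamples
{
    public const string EmailPattern =
        @"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b";

    // Assumed phone format; anchored so the entire string must match.
    public const string PhonePattern =
        @"^\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}$";

    public const string UrlPattern = @"https?://[\w./?-]+";

    public static void Main()
    {
        string text = "Contact alice@example.com or visit https://example.com/docs today.";

        // Search: first email address and first URL in the text
        Console.WriteLine(Regex.Match(text, EmailPattern).Value); // alice@example.com
        Console.WriteLine(Regex.Match(text, UrlPattern).Value);   // https://example.com/docs

        // Validate: the whole input must look like a phone number
        Console.WriteLine(Regex.IsMatch("(555) 123-4567", PhonePattern)); // True
        Console.WriteLine(Regex.IsMatch("555-1234", PhonePattern));       // False

        // Find and replace: the C# counterpart of s/old_word/new_word/g
        Console.WriteLine(Regex.Replace("old_word and old_word", "old_word", "new_word"));
        // new_word and new_word
    }
}
```

These patterns are deliberately loose. Real email and URL syntax is more permissive than any short pattern, so a sketch like this is better suited to extraction than to strict validation.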
Regular expressions provide a concise and powerful way to work with text data, and learning how to use them effectively can greatly enhance your ability to manipulate and process strings in various contexts.
In C#, regular expressions are supported through the System.Text.RegularExpressions namespace, which provides classes and methods for working with regular expressions. Here's an overview of how you can use regular expressions in C#:
Creating a regular expression object
You can create a Regex object by instantiating it with a pattern string as the constructor argument. For example:
Regex regex = new Regex(@"\b\w+"); // Matches one or more word characters
Matching patterns
The Regex object provides methods for matching patterns in strings. The Match method returns the first occurrence of the pattern in the input string, and the Matches method returns all occurrences as a collection of Match objects. For example:
string input = "Hello, world!";
Match match = regex.Match(input);
if (match.Success)
{
    string matchedText = match.Value; // "Hello"
    int startIndex = match.Index;     // 0
    int length = match.Length;        // 5
}
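The Matches method mentioned above can be sketched in the same way. This example reuses the \b\w+ pattern and the "Hello, world!" input; the class name is made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class showing Regex.Matches.
public static class MatchesDemo
{
    public static void Main()
    {
        Regex regex = new Regex(@"\b\w+");
        string input = "Hello, world!";

        // Matches returns a MatchCollection with every non-overlapping match.
        foreach (Match m in regex.Matches(input))
        {
            Console.WriteLine("{0} at index {1}", m.Value, m.Index);
        }
        // Prints:
        // Hello at index 0
        // world at index 7
    }
}
```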
Checking for matches
The Regex object has an IsMatch method that checks if a pattern matches a given string. It returns a boolean indicating whether there was a match. For example:
bool isMatch = regex.IsMatch(input);
if (isMatch)
{
    // There is a match
}
Finding and replacing patterns
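The Regex class provides a Replace method that finds matches of a pattern and replaces them. By default it replaces every match; overloads let you limit the number of replacements or supply a MatchEvaluator delegate that computes each replacement. For example:
string replacedText = regex.Replace(input, "Hi", 1); // Replace only the first match
// replacedText will be "Hi, world!" (without the count argument it would be "Hi, Hi!")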
These are just some of the basic operations you can perform with regular expressions in C#. The Regex class offers many more methods and options for working with patterns, such as specifying options like case-insensitivity or multiline matching, capturing groups, and more.
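To make the options and capturing groups mentioned above more concrete, here is a small sketch. The class name, the date pattern, and the sample strings are assumptions chosen for illustration; RegexOptions.IgnoreCase, RegexOptions.Multiline, and named groups of the form (?<name>...) are standard parts of .NET regular expressions.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; patterns and inputs are illustrative only.
public static class OptionsAndGroupsDemo
{
    public static void Main()
    {
        // IgnoreCase: letters match regardless of case
        Regex word = new Regex("hello", RegexOptions.IgnoreCase);
        Console.WriteLine(word.IsMatch("HELLO, world")); // True

        // Multiline: ^ and $ match at the start and end of each line
        Regex firstWord = new Regex(@"^\w+", RegexOptions.Multiline);
        Console.WriteLine(firstWord.Matches("one two\nthree four").Count); // 2

        // Named capturing groups extract parts of a match
        Regex date = new Regex(@"(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})");
        Match m = date.Match("Released on 2023-05-17.");
        if (m.Success)
        {
            Console.WriteLine(m.Groups["year"].Value);  // 2023
            Console.WriteLine(m.Groups["month"].Value); // 05
            Console.WriteLine(m.Groups["day"].Value);   // 17
        }
    }
}
```

Without RegexOptions.Multiline, the second pattern would match only "one", because ^ would then anchor to the start of the whole string.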
Remember to escape special characters in the pattern string using a backslash (\) if you want to match those characters literally.
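When the text to match comes from a variable rather than being typed by hand, escaping it manually is error-prone. One option is the static Regex.Escape method, shown in the sketch below; the class name and the sample strings are invented for this example.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class showing why escaping matters.
public static class EscapeDemo
{
    public static void Main()
    {
        string literal = "3.5*2";

        // Unescaped, "." means any character and "5*" means zero or more 5s,
        // so the pattern also matches text that has nothing to do with "3.5*2".
        Console.WriteLine(Regex.IsMatch("3x2", literal)); // True

        // Regex.Escape adds backslashes in front of the metacharacters.
        string escaped = Regex.Escape(literal);
        Console.WriteLine(escaped);                                  // 3\.5\*2
        Console.WriteLine(Regex.IsMatch("3x2", escaped));            // False
        Console.WriteLine(Regex.IsMatch("price: 3.5*2", escaped));   // True
    }
}
```

Verbatim string literals (prefixed with @) are also helpful here, since they let you write a regex backslash once instead of doubling it for the C# compiler.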
Regular expressions in C# provide a powerful and flexible way to work with text data, allowing you to search, validate, and transform strings efficiently.
Detailed Regex Example
Let's create a complete application with a few patterns.
Pattern #1:
Regex objNotNaturalPattern = new Regex("[^0-9]");
Pattern #2:
Regex objNaturalPattern = new Regex("0*[1-9][0-9]*");
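Pattern #1 matches any single character that is not a digit from 0 to 9, so IsMatch returns true whenever the input contains at least one non-digit character.
- Inside square brackets, a leading ^ negates the character class, meaning "any character except these".
- The [] brackets define a character class, which can contain ranges such as 0-9, a-z, or A-Z.
With this pattern, input 'abc' will return true, and '123' will return false.
Pattern #2 matches natural numbers, that is, integers greater than 0. The 0* part allows any number of leading zeros, including none. The [1-9] part requires at least one digit from 1 to 9, and the final [0-9]* allows any further digits after it. Because the pattern is not anchored with ^ and $, it succeeds if such a sequence appears anywhere in the input, which is why the source code below combines it with Pattern #1 to reject inputs that contain non-digit characters.
In this case, input '0007' will return true, and '00' will return false.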
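The behavior of the two patterns, separately and combined, can be checked with a short sketch. The class and method names here are made up for illustration; the patterns themselves are Pattern #1 and Pattern #2 from above.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class for Pattern #1 and Pattern #2.
public static class NaturalNumberDemo
{
    private static readonly Regex NotNatural = new Regex("[^0-9]");
    private static readonly Regex Natural = new Regex("0*[1-9][0-9]*");

    // True only when the input has no non-digit characters
    // and contains at least one nonzero digit.
    public static bool IsNatural(string s)
    {
        return !NotNatural.IsMatch(s) && Natural.IsMatch(s);
    }

    public static void Main()
    {
        Console.WriteLine(NotNatural.IsMatch("abc")); // True
        Console.WriteLine(NotNatural.IsMatch("123")); // False

        Console.WriteLine(Natural.IsMatch("0007"));   // True
        Console.WriteLine(Natural.IsMatch("00"));     // False

        // Pattern #2 alone is unanchored, so it finds the "7" inside "abc7".
        Console.WriteLine(Natural.IsMatch("abc7"));   // True

        // Combining both patterns rejects the mixed input.
        Console.WriteLine(IsNatural("abc7"));         // False
        Console.WriteLine(IsNatural("0007"));         // True
    }
}
```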
Basic operators to understand in Regex:
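- "*" matches zero or more occurrences of the preceding element.
- "?" matches zero or one occurrence of the preceding element.
- "^" negates a character class when it is the first character inside [], and anchors the match to the start of the string or line elsewhere.
- "[]" defines a character class, optionally using ranges such as 0-9 or a-z.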
The complete source code example provides functions to check IsNaturalNumber, IsWholeNumber, IsPositiveNumber, IsInteger, IsNumber, IsAlpha, and IsAlphaNumeric.
// Source Code starts
using System.Text.RegularExpressions;
using System;
/*
<HowToCompile>
csc Validation.cs
</HowToCompile>
*/
class Validation
{
    public static void Main()
    {
        String strToTest;
        Validation objValidate = new Validation();
        Console.Write("Enter a String to Test for Alphabets:");
        strToTest = Console.ReadLine();
        if (objValidate.IsAlpha(strToTest))
        {
            Console.WriteLine("{0} is Valid Alpha String", strToTest);
        }
        else
        {
            Console.WriteLine("{0} is not a Valid Alpha String", strToTest);
        }
    }

    // Function to test for Positive Integers.
    public bool IsNaturalNumber(String strNumber)
    {
        Regex objNotNaturalPattern = new Regex("[^0-9]");
        Regex objNaturalPattern = new Regex("0*[1-9][0-9]*");
        return !objNotNaturalPattern.IsMatch(strNumber) &&
               objNaturalPattern.IsMatch(strNumber);
    }

    // Function to test for Positive Integers with zero inclusive
    // Note: an empty string contains no non-digit characters, so it also returns true.
    public bool IsWholeNumber(String strNumber)
    {
        Regex objNotWholePattern = new Regex("[^0-9]");
        return !objNotWholePattern.IsMatch(strNumber);
    }

    // Function to Test for Integers both Positive & Negative
    public bool IsInteger(String strNumber)
    {
        Regex objNotIntPattern = new Regex("[^0-9-]");
        Regex objIntPattern = new Regex("^-[0-9]+$|^[0-9]+$");
        return !objNotIntPattern.IsMatch(strNumber) && objIntPattern.IsMatch(strNumber);
    }

    // Function to Test for Positive Number both Integer & Real
    public bool IsPositiveNumber(String strNumber)
    {
        Regex objNotPositivePattern = new Regex("[^0-9.]");
        Regex objPositivePattern = new Regex("^[.][0-9]+$|[0-9]*[.]*[0-9]+$");
        Regex objTwoDotPattern = new Regex("[0-9]*[.][0-9]*[.][0-9]*");
        return !objNotPositivePattern.IsMatch(strNumber) &&
               objPositivePattern.IsMatch(strNumber) &&
               !objTwoDotPattern.IsMatch(strNumber);
    }

    // Function to test whether the string is valid number or not
    public bool IsNumber(String strNumber)
    {
        Regex objNotNumberPattern = new Regex("[^0-9.-]");
        Regex objTwoDotPattern = new Regex("[0-9]*[.][0-9]*[.][0-9]*");
        Regex objTwoMinusPattern = new Regex("[0-9]*[-][0-9]*[-][0-9]*");
        String strValidRealPattern = "^([-]|[.]|[-.]|[0-9])[0-9]*[.]*[0-9]+$";
        // Require at least one digit so that a lone "-" is not accepted as a number.
        String strValidIntegerPattern = "^-?[0-9]+$";
        Regex objNumberPattern = new Regex("(" + strValidRealPattern + ")|(" + strValidIntegerPattern + ")");
        return !objNotNumberPattern.IsMatch(strNumber) &&
               !objTwoDotPattern.IsMatch(strNumber) &&
               !objTwoMinusPattern.IsMatch(strNumber) &&
               objNumberPattern.IsMatch(strNumber);
    }

    // Function To test for Alphabets.
    // Note: an empty string contains no non-letter characters, so it also returns true.
    public bool IsAlpha(String strToCheck)
    {
        Regex objAlphaPattern = new Regex("[^a-zA-Z]");
        return !objAlphaPattern.IsMatch(strToCheck);
    }

    // Function to Check for AlphaNumeric.
    public bool IsAlphaNumeric(String strToCheck)
    {
        Regex objAlphaNumericPattern = new Regex("[^a-zA-Z0-9]");
        return !objAlphaNumericPattern.IsMatch(strToCheck);
    }
}
// Source Code End
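The listing above validates input by rejecting disallowed characters with a negated character class. A different approach, shown here only as an alternative sketch and not as the method used in the listing, is to anchor a single positive pattern with ^ and $ so that the whole string must match. The class name is hypothetical. One behavioral difference is that these versions reject the empty string, whereas a negated-class check accepts it.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical alternative: one anchored pattern per check.
public static class AnchoredValidation
{
    // Compiled once and reused, instead of constructing a Regex on every call.
    private static readonly Regex NaturalPattern = new Regex("^0*[1-9][0-9]*$");
    private static readonly Regex AlphaPattern = new Regex("^[a-zA-Z]+$");

    public static bool IsNaturalNumber(string s)
    {
        return NaturalPattern.IsMatch(s);
    }

    public static bool IsAlpha(string s)
    {
        return AlphaPattern.IsMatch(s);
    }

    public static void Main()
    {
        Console.WriteLine(IsNaturalNumber("0007")); // True
        Console.WriteLine(IsNaturalNumber("00"));   // False
        Console.WriteLine(IsNaturalNumber("12a"));  // False
        Console.WriteLine(IsAlpha("abc"));          // True
        Console.WriteLine(IsAlpha(""));             // False
    }
}
```

In .NET, $ also matches just before a final newline, so where a trailing newline must be rejected, the \z anchor is the stricter choice.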