Using Regular Expressions with C# and .NET Framework

Regular expressions are not a new concept; Perl and PHP developers have used them extensively for years. We can do many things with regular expressions (regex for short). For example, we can do pattern matching and text parsing, as well as text editing such as replacing or deleting parts of a string.

Regular expression support in .NET is provided by the System.Text.RegularExpressions namespace, and the Regex class in that namespace does most of the work.
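A minimal sketch of the most basic kind of match, assuming a hypothetical BasicMatchDemo class (the class and method names are illustrative only). It uses the static Regex.IsMatch method to test whether a literal pattern occurs anywhere in the input:

```csharp
using System;
using System.Text.RegularExpressions;

public static class BasicMatchDemo
{
    // Returns true when the literal text "cat" occurs anywhere in the input.
    // ContainsCat is a hypothetical helper written for this illustration.
    public static bool ContainsCat(string input)
    {
        return Regex.IsMatch(input, "cat");
    }

    public static void Main()
    {
        Console.WriteLine(ContainsCat("concatenate")); // True
        Console.WriteLine(ContainsCat("dog"));         // False
    }
}
```

Without anchors the pattern may match at any position, which is why "concatenate" succeeds.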

With pattern matching we can match a specific string or a specific location within a string. The following anchors are the symbols most commonly used to tie a match to a location.

 

Character    Description
^            The match must begin at the start of the string (or at the start of a line in multiline mode).
$            The match must end at the end of the string, or before a final \n at the end of the string (or at the end of a line in multiline mode).
\A           The match must begin at the first character of the string.
\Z           The match must end at the end of the string, or before a final \n at the end of the string.
\z           The match must end at the very last character of the string.
\G           The match must occur at the point where the previous match ended.
\b           The match must occur on a boundary between a \w (alphanumeric) and a \W (nonalphanumeric) character.
\B           The match must not occur on a \b boundary.

An important thing to remember is to put the @ sign before the pattern string. This makes it a verbatim string literal, so the C# compiler treats backslashes literally instead of as escape sequences. For example:
Regex.IsMatch("pattern", @"\Apattern\Z")
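The difference between \Z and \z only shows up when the input ends with a newline. The following is a small sketch under that assumption; AnchorDemo and its method names are hypothetical and exist only for illustration:

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnchorDemo
{
    // \Z accepts the end of the string or a position just before a final \n.
    public static bool MatchesUpperZ(string input)
    {
        return Regex.IsMatch(input, @"\Apattern\Z");
    }

    // \z accepts only the very end of the string.
    public static bool MatchesLowerZ(string input)
    {
        return Regex.IsMatch(input, @"\Apattern\z");
    }

    public static void Main()
    {
        Console.WriteLine(MatchesUpperZ("pattern"));    // True
        Console.WriteLine(MatchesUpperZ("pattern\n"));  // True
        Console.WriteLine(MatchesLowerZ("pattern\n"));  // False
        Console.WriteLine(MatchesUpperZ("a pattern"));  // False, \A requires the start
    }
}
```

When validating input that must not contain a trailing newline, \z is usually the safer choice.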

We can also match special (escaped) characters, as shown in the table:

 

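Character    Description
\a           Matches a bell (alarm) character.
\b           Matches a backspace inside a [] character class; anywhere else \b denotes a word boundary.
\t           Matches a tab.
\r           Matches a carriage return.
\v           Matches a vertical tab.
\f           Matches a form feed.
\n           Matches a new line.
\e           Matches an escape.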

Wildcards, quantifiers, and alternation are also used:

 

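Character    Description
*            Matches the preceding character zero or more times.
+            Matches the preceding character one or more times.
?            Matches the preceding character zero or one time.
{n}          Matches exactly n times, where n is a non-negative integer.
{n,}         Matches at least n times, where n is a non-negative integer.
{n,m}        Matches at least n and at most m times, where m and n are non-negative integers.
?            When it follows another quantifier (*, +, ?, {n}, {n,}, {n,m}), the matching pattern is nongreedy.
.            Matches any single character except "\n".
x|y          Matches either x or y.
[xyz]        Matches any one of the enclosed characters.
[a-z]        Matches any character in the given range.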
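To make the greedy versus nongreedy distinction concrete, here is a short sketch. QuantifierDemo and its methods are hypothetical names chosen for this example; the only library calls are Regex.Match and Match.Value:

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuantifierDemo
{
    // Greedy: .+ grabs as much as possible before the final '>'.
    public static string GreedyTag(string input)
    {
        return Regex.Match(input, "<.+>").Value;
    }

    // Nongreedy: .+? stops at the first '>' it can.
    public static string LazyTag(string input)
    {
        return Regex.Match(input, "<.+?>").Value;
    }

    // {2,4} takes between two and four digits, preferring the longest run.
    public static string FirstDigitRun(string input)
    {
        return Regex.Match(input, @"\d{2,4}").Value;
    }

    public static void Main()
    {
        string html = "<b>one</b><b>two</b>";
        Console.WriteLine(GreedyTag(html));            // <b>one</b><b>two</b>
        Console.WriteLine(LazyTag(html));              // <b>
        Console.WriteLine(FirstDigitRun("id 123456")); // 1234
    }
}
```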

Another set of special characters matches whole classes of characters, such as digits or white space:

 

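Character    Description
\d           Matches any decimal digit.
\D           Matches any nondigit.
\s           Matches any white-space character.
\S           Matches any non-white-space character.
\w           Matches any word character (letter, digit, or underscore).
\W           Matches any nonword character.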
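As an illustration of these classes in text editing, the sketch below uses Regex.Replace to strip nondigits and to collapse runs of white space. The CharClassDemo class and its helpers are hypothetical names introduced for this example:

```csharp
using System;
using System.Text.RegularExpressions;

public static class CharClassDemo
{
    // Removes every character that is not a decimal digit.
    public static string DigitsOnly(string input)
    {
        return Regex.Replace(input, @"\D", "");
    }

    // Replaces each run of white space (spaces, tabs, newlines) with one space.
    public static string CollapseWhitespace(string input)
    {
        return Regex.Replace(input, @"\s+", " ");
    }

    public static void Main()
    {
        Console.WriteLine(DigitsOnly("a1b22c333"));          // 122333
        Console.WriteLine(CollapseWhitespace("a  b\t c"));   // a b c
    }
}
```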

Regular expressions also support backreferences. A backreference matches the same text that an earlier group in the pattern has already captured, so we can use it to find repeating groups. For example, the regular expression (?<word>\s\w+)\k<word>\b, in which word is an arbitrary group name, can find repeating words.

 

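Backreference construct    Definition
\number                    Numbered backreference: matches the text captured by group number.
\k<name>                   Named backreference: matches the text captured by the group called name.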
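Both backreference forms can be sketched as follows. The pattern here is a variant of the repeated-word idea rather than the exact expression from the text, and BackreferenceDemo, its methods, and the group name word are all hypothetical choices made for this example:

```csharp
using System;
using System.Text.RegularExpressions;

public static class BackreferenceDemo
{
    // Named backreference: captures a word, then requires the same word
    // again after some white space. Returns the repeated word, or an
    // empty string when nothing repeats.
    public static string FindRepeatedWord(string input)
    {
        Match m = Regex.Match(input, @"\b(?<word>\w+)\s+\k<word>\b");
        return m.Success ? m.Groups["word"].Value : "";
    }

    // Numbered backreference: \1 must equal whatever group 1 captured,
    // so this detects two identical adjacent characters.
    public static bool HasDoubledLetter(string input)
    {
        return Regex.IsMatch(input, @"(\w)\1");
    }

    public static void Main()
    {
        Console.WriteLine(FindRepeatedWord("this is is a test")); // is
        Console.WriteLine(HasDoubledLetter("letter"));            // True
        Console.WriteLine(HasDoubledLetter("later"));             // False
    }
}
```

The leading \b keeps the "is" inside "this" from counting as the first half of a repeat.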

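Regular expressions have options, which determine how the regular expression engine interprets the pattern. They can be passed as RegexOptions values or, for some options, set inside the pattern with an inline character.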

 

RegexOptions member        Inline character    Description
None                       N/A                 No options are set.
IgnoreCase                 i                   Case-insensitive matching.
Multiline                  m                   Multiline mode: ^ and $ match at the start and end of each line.
ExplicitCapture            n                   The only valid captures are explicitly named or numbered groups of the form (?<name>…).
Compiled                   N/A                 The regular expression will be compiled to an assembly.
Singleline                 s                   Single-line mode: . matches every character, including \n.
IgnorePatternWhitespace    x                   Unescaped white space is excluded from the pattern.
RightToLeft                N/A                 The search moves from right to left instead of from left to right.
ECMAScript                 N/A                 ECMAScript-compliant behavior is enabled.
CultureInvariant           N/A                 Cultural differences in language are ignored.

For example, suppose the string s holds three lines of text:
abc
def
ghi

If we run Regex.IsMatch(s, "^def$"), it returns false. The reason is that, by default, ^ and $ match only at the start and end of the whole string, and the whole three-line string is not def. However, if we use Regex.IsMatch(s, "^def$", RegexOptions.Multiline), we’ll be able to match, because the Multiline option lets ^ and $ match at the start and end of every line.
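The three-line case can be reproduced with the sketch below, which assumes the lines are separated by "\n". MultilineDemo and its method names are hypothetical. The sketch also shows the inline (?m) form and one caveat: in .NET, $ matches before "\n" but not before "\r\n", so Windows line endings need something like \r?$.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MultilineDemo
{
    // Default behavior: ^ and $ refer to the whole string.
    public static bool MatchDefault(string s)
    {
        return Regex.IsMatch(s, "^def$");
    }

    // RegexOptions.Multiline: ^ and $ refer to each line.
    public static bool MatchMultiline(string s)
    {
        return Regex.IsMatch(s, "^def$", RegexOptions.Multiline);
    }

    // The same option set inline, with an optional \r for CRLF input.
    public static bool MatchInlineCrLfSafe(string s)
    {
        return Regex.IsMatch(s, @"(?m)^def\r?$");
    }

    public static void Main()
    {
        string s = "abc\ndef\nghi";
        Console.WriteLine(MatchDefault(s));           // False
        Console.WriteLine(MatchMultiline(s));         // True
        Console.WriteLine(MatchInlineCrLfSafe(s));    // True

        string crlf = "abc\r\ndef\r\nghi";
        Console.WriteLine(MatchMultiline(crlf));      // False, "\r" sits before $
        Console.WriteLine(MatchInlineCrLfSafe(crlf)); // True
    }
}
```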