Using Regular Expressions with C# and .NET Framework
Regular expression pattern matching in .NET is provided by the System.Text.RegularExpressions namespace. The most basic use is a call to Regex.IsMatch, which reports whether a pattern occurs in an input string.
A pattern can match specific text or a specific position within a string. The following anchors are the most common symbols for matching positions.
Character | Description |
---|---|
^ | The match must begin at the start of the string (or at the start of a line in multiline mode). |
$ | The match must end at the end of the string or before a final \n at the end of the string (or at the end of a line in multiline mode). |
\A | The match must begin at the first character of the string, regardless of multiline mode. |
\Z | The match must end at the end of the string or before a final \n at the end of the string, regardless of multiline mode. |
\z | The match must end at the very last character of the string. |
\G | The match must occur at the point where the previous match ended. |
\b | The match must occur on a boundary between a \w (alphanumeric) and a \W (nonalphanumeric) character. |
\B | The match must not occur on a \b boundary. |
An important thing to remember is to put the @ sign before the pattern string. This makes it a verbatim string literal, so the C# compiler treats backslashes literally instead of as escape sequences. For example:
Regex.IsMatch("pattern", @"\Apattern\Z")
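To make the anchors and the @ prefix concrete, here is a minimal sketch. The class and method names (AnchorDemo, IsExactlyPattern, IsStrictlyPattern) are made up for illustration; only Regex.IsMatch comes from the framework.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnchorDemo
{
    // True when the input is exactly "pattern"; \Z also tolerates one trailing "\n".
    public static bool IsExactlyPattern(string input)
    {
        return Regex.IsMatch(input, @"\Apattern\Z");
    }

    // Same check with \z, which does not tolerate a trailing newline.
    public static bool IsStrictlyPattern(string input)
    {
        return Regex.IsMatch(input, @"\Apattern\z");
    }

    public static void Main()
    {
        Console.WriteLine(IsExactlyPattern("pattern"));     // True
        Console.WriteLine(IsExactlyPattern("pattern\n"));   // True
        Console.WriteLine(IsStrictlyPattern("pattern\n"));  // False

        // The verbatim literal @"\bcat\b" and the regular literal "\\bcat\\b"
        // describe the same pattern; the verbatim form is easier to read.
        Console.WriteLine(Regex.IsMatch("a cat sat", @"\bcat\b"));     // True
        Console.WriteLine(Regex.IsMatch("concatenate", "\\bcat\\b"));  // False
    }
}
```

Without the @ prefix, a literal such as "\bcat\b" would be compiled by C# with \b as a backspace character, and the regex engine would never see the intended word-boundary anchors.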
We can also match a set of special characters using the character escapes shown in the table:
Character | Description |
---|---|
\a | a bell (alarm). |
\b | a backspace when used inside a character class ([\b]); anywhere else \b denotes a word boundary. |
\t | a tab. |
\r | a carriage return. |
\v | a vertical tab. |
\f | a form feed. |
\n | a new line. |
\e | an escape. |
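Quantifiers, wildcards, alternation, and character sets are also used:
Character | Description |
---|---|
* | the preceding element zero or more times. |
+ | the preceding element one or more times. |
? | the preceding element zero or one time. |
{n} | the preceding element exactly n times; n is a non-negative integer. |
{n,} | the preceding element at least n times; n is a non-negative integer. |
{n,m} | the preceding element at least n and at most m times; m and n are non-negative integers. |
? | placed after another quantifier (*?, +?, ??, {n,m}?), makes the matching nongreedy: as few repetitions as possible. |
. | any single character except "\n". |
x\|y | either x or y. |
[xyz] | any one of the enclosed characters. |
[a-z] | any one character in the given range. |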
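The difference between greedy and nongreedy quantifiers is easiest to see side by side. The following is a small sketch; the helper FirstMatch and the sample strings are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuantifierDemo
{
    // Returns the text of the first match, or an empty string when nothing matches.
    public static string FirstMatch(string input, string pattern)
    {
        return Regex.Match(input, pattern).Value;
    }

    public static void Main()
    {
        string html = "<b>one</b><b>two</b>";

        // Greedy: .* takes as much as it can, so the match runs to the last </b>.
        Console.WriteLine(FirstMatch(html, @"<b>.*</b>"));   // <b>one</b><b>two</b>

        // Nongreedy: .*? stops at the first </b> that completes the match.
        Console.WriteLine(FirstMatch(html, @"<b>.*?</b>"));  // <b>one</b>

        // ? makes the preceding "u" optional.
        Console.WriteLine(FirstMatch("colour color", @"colou?r"));  // colour

        // {2,3} matches two or three digits and prefers three.
        Console.WriteLine(FirstMatch("12345", @"[0-9]{2,3}"));      // 123

        // Alternation inside a group: "gray" or "grey".
        Console.WriteLine(FirstMatch("grey skies", @"gr(a|e)y"));   // grey
    }
}
```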
Another set of escapes stands for whole classes of characters:
Character | Description |
---|---|
\d | digit. |
\D | nondigit. |
\s | white-space. |
\S | non-white-space. |
\w | any word character (letter, digit, or underscore). |
\W | any nonword character. |
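As a rough illustration of these classes, the sketch below pulls numbers out of a string, collapses runs of white space, and checks that a string contains only word characters. The class CharClassDemo and its method names are hypothetical.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class CharClassDemo
{
    // Returns every run of digits (\d+) found in the input.
    public static string[] ExtractNumbers(string input)
    {
        return Regex.Matches(input, @"\d+")
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }

    // Trims the input and replaces each run of white space (\s+) with one space.
    public static string CollapseWhitespace(string input)
    {
        return Regex.Replace(input.Trim(), @"\s+", " ");
    }

    // True when the whole input consists of one or more word characters (\w).
    public static bool IsSingleWord(string input)
    {
        return Regex.IsMatch(input, @"\A\w+\z");
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", ExtractNumbers("Order 66 shipped 3 items"))); // 66,3
        Console.WriteLine(CollapseWhitespace("  too   many\t spaces \n"));               // too many spaces
        Console.WriteLine(IsSingleWord("user_42"));                                      // True
        Console.WriteLine(IsSingleWord("user 42"));                                      // False
    }
}
```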
Regular expressions also support backreferences. A backreference matches the same text that an earlier capturing group already matched, so we can use it to find repeating groups, for example. With a group named word, the regular expression (?<word>\s\w+)\k<word>\b can find repeating words.
Backreference construct | Definition |
---|---|
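\number | Numbered backreference: matches the text captured by group number (for example, \1). |
\k<name> | Named backreference: matches the text captured by the group called name. |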
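Both backreference forms can be sketched as follows. The patterns here are a simplified variant chosen for illustration rather than the exact expression from the text, and the names BackreferenceDemo, FindRepeatedWords, and HasRepeatedWord are hypothetical.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class BackreferenceDemo
{
    // Named backreference: \k<word> must repeat whatever (?<word>\w+) captured.
    public static string[] FindRepeatedWords(string input)
    {
        return Regex.Matches(input, @"\b(?<word>\w+)\s+\k<word>\b")
                    .Cast<Match>()
                    .Select(m => m.Groups["word"].Value)
                    .ToArray();
    }

    // Numbered backreference: \1 refers to the first capturing group.
    public static bool HasRepeatedWord(string input)
    {
        return Regex.IsMatch(input, @"\b(\w+)\s+\1\b");
    }

    public static void Main()
    {
        string text = "This is is a test of the the pattern";
        Console.WriteLine(string.Join(",", FindRepeatedWords(text))); // is,the
        Console.WriteLine(HasRepeatedWord(text));                     // True
        Console.WriteLine(HasRepeatedWord("no repeats here"));        // False
    }
}
```

The leading \b keeps the match from starting in the middle of a word, so the "is" inside "This" is not reported as the first half of a repetition.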
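Regular expressions have options that determine how a pattern is interpreted and run. An option can be passed as a RegexOptions value or, where an inline character exists, switched on inside the pattern itself, for example (?i).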
RegexOption member | Inline character | Description |
---|---|---|
None | N/A | no options are set. |
IgnoreCase | i | case-insensitive matching. |
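Multiline | m | multiline mode: ^ and $ match at the start and end of each line, not only of the whole string. |
ExplicitCapture | n | the only valid captures are explicitly named or numbered groups of the form (?<name>…). |
Compiled | N/A | the regular expression is compiled to MSIL instead of being interpreted, which speeds up matching at the cost of startup time. |
Singleline | s | single-line mode: the period (.) matches every character, including \n. |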
IgnorePatternWhitespace | x | unescaped white space is excluded from the pattern. |
RightToLeft | N/A | the search moves from right to left instead of from left to right. |
ECMAScript | N/A | ECMAScript-compliant behavior is enabled. |
CultureInvariant | N/A | Specifies that cultural differences in language are ignored. |
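A few of these options in action, as a minimal sketch. The class OptionsDemo, its method names, and the sample strings are made up for illustration; the RegexOptions members and the inline (?i) syntax are the ones listed in the table.

```csharp
using System;
using System.Text.RegularExpressions;

public static class OptionsDemo
{
    // Case-insensitive matching through the RegexOptions argument.
    public static bool ContainsHelloOption(string input)
    {
        return Regex.IsMatch(input, "hello", RegexOptions.IgnoreCase);
    }

    // The same behavior using the inline character i inside the pattern.
    public static bool ContainsHelloInline(string input)
    {
        return Regex.IsMatch(input, "(?i)hello");
    }

    // With Singleline, the period also matches "\n".
    public static bool DotCrossesNewline(string input, bool singleline)
    {
        RegexOptions options = singleline ? RegexOptions.Singleline : RegexOptions.None;
        return Regex.IsMatch(input, "a.b", options);
    }

    public static void Main()
    {
        Console.WriteLine(Regex.IsMatch("HELLO world", "hello")); // False
        Console.WriteLine(ContainsHelloOption("HELLO world"));    // True
        Console.WriteLine(ContainsHelloInline("HELLO world"));    // True

        Console.WriteLine(DotCrossesNewline("a\nb", false));      // False
        Console.WriteLine(DotCrossesNewline("a\nb", true));       // True

        // Options are flags, so they can be combined with the | operator.
        Console.WriteLine(Regex.IsMatch("abc\nDEF\nghi", "^def$",
            RegexOptions.IgnoreCase | RegexOptions.Multiline));   // True
    }
}
```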
For example, suppose the string s holds three lines of text:
abc
def
ghi
If we run Regex.IsMatch(s, "^def$"), it returns false. By default ^ and $ anchor to the start and end of the entire string, and the three-line text as a whole is not def. However, Regex.IsMatch(s, "^def$", RegexOptions.Multiline) returns true: the Multiline option makes ^ and $ match at the start and end of every line, so the second line matches.
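The example can be run as the sketch below, which assumes the three lines are separated by "\n". The class MultilineDemo and its helper are hypothetical names. The last two calls cover a case the text does not mention: with Windows-style "\r\n" line endings, $ still only matches before "\n", so the pattern has to allow for the "\r".

```csharp
using System;
using System.Text.RegularExpressions;

public static class MultilineDemo
{
    // True when some line (or the whole string, without Multiline) is exactly "def".
    public static bool HasDefLine(string s, bool multiline)
    {
        RegexOptions options = multiline ? RegexOptions.Multiline : RegexOptions.None;
        return Regex.IsMatch(s, "^def$", options);
    }

    public static void Main()
    {
        string s = "abc\ndef\nghi"; // assumption: lines joined with "\n"

        Console.WriteLine(HasDefLine(s, false)); // False
        Console.WriteLine(HasDefLine(s, true));  // True

        // With "\r\n" line endings the "\r" sits between "def" and "\n".
        string windows = "abc\r\ndef\r\nghi";
        Console.WriteLine(HasDefLine(windows, true));                                // False
        Console.WriteLine(Regex.IsMatch(windows, @"^def\r?$", RegexOptions.Multiline)); // True
    }
}
```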