Exforsys.com
 

 

Regular Expressions and C#, .NET

 


This article explores regular expressions in the context of C# and the .NET support for them: meta-characters and their descriptions, character escapes, substitutions, character classes, regular expression options, and atomic zero-width assertions.



What are regular expressions?

Regular expressions are patterns that can be used to match strings. You can think of one as a formula for matching strings that follow some pattern. Regular expressions can also be considered a small language designed to manipulate text. With a pattern in hand, you can ask questions such as


  • “Does the given string match the pattern?”, or
  • “Does the given string contain characters that match a pattern?”.

Regular expressions may be used to find one or more occurrences of a pattern of characters within a string. You may then replace the matched text with other characters or perform some other task based on the result. These patterns can be simple or very complex. Regular expressions generally comprise two types of characters:


1) Literal or Normal Characters such as “abcd123”
2) Special characters that have a special meaning, such as “.” or “$” or “^”


The special characters are what make regular expressions such a powerful means of manipulating strings and text.
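The two questions above, and the replace operation, can be sketched with the static methods of the System.Text.RegularExpressions.Regex class. This is only a minimal illustration; the class name RegexBasics, the helper names, and the sample patterns are made up for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexBasics
{
    // "Does the given string match the pattern?"
    // ^ and $ anchor the pattern to the start and end of the input
    // ($ also tolerates a single trailing newline).
    public static bool IsAllDigits(string input) => Regex.IsMatch(input, "^[0-9]+$");

    // "Does the given string contain characters that match a pattern?"
    public static bool ContainsDigit(string input) => Regex.IsMatch(input, "[0-9]");

    // Replace every run of digits with a single '#'.
    public static string MaskDigits(string input) => Regex.Replace(input, "[0-9]+", "#");

    public static void Main()
    {
        Console.WriteLine(IsAllDigits("abcd123"));   // False
        Console.WriteLine(ContainsDigit("abcd123")); // True
        Console.WriteLine(MaskDigits("abcd123"));    // abcd#
    }
}
```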


.NET support for Regular Expressions:

.NET provides an extensive set of regular expression constructs that you can use to create, modify, or compare strings. They can be classified as follows:


a) Character Escapes
b) Substitutions
c) Character Classes
d) Regular Expression Options
e) Atomic Zero-Width Assertions
f) Quantifiers
g) Grouping Constructs
h) Backreference Constructs
i) Alternation Constructs
j) Miscellaneous Constructs


Meta-characters and their Description

.


Matches any single character. For example, the regular expression s.t matches the strings sat and sit, but not sight.


$


Matches the end of a line. For instance, the regular expression reason$ matches the end of the string "He has a reason" but does not match the string "He has his reasons".


^


Matches the beginning of a line. For instance, the regular expression ^Where matches the beginning of the string "Where is my cap" but does not match "Do you know Where it is".
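The three examples given so far for ".", "$" and "^" can be checked with Regex.IsMatch. The sketch below reuses the strings from the descriptions; the class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnchorDemo
{
    // "." stands for any single character between s and t.
    public static bool MatchesSDotT(string s) => Regex.IsMatch(s, "s.t");

    // "$" ties the word to the end of the input.
    public static bool EndsWithReason(string s) => Regex.IsMatch(s, "reason$");

    // "^" ties the word to the beginning of the input.
    public static bool StartsWithWhere(string s) => Regex.IsMatch(s, "^Where");

    public static void Main()
    {
        Console.WriteLine(MatchesSDotT("sat"));                        // True
        Console.WriteLine(MatchesSDotT("sight"));                      // False
        Console.WriteLine(EndsWithReason("He has a reason"));          // True
        Console.WriteLine(EndsWithReason("He has his reasons"));       // False
        Console.WriteLine(StartsWithWhere("Where is my cap"));         // True
        Console.WriteLine(StartsWithWhere("Do you know Where it is")); // False
    }
}
```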


*


Matches zero or more occurrences of the character or expression immediately preceding. For example, the regular expression .* matches any number of any characters.


\


This is the escape or quoting character. The character after it is treated as an ordinary character. For example, \^ matches the caret character (^) rather than the beginning of a line. Similarly, the expression \. matches the “.” character.
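A short sketch of escaping in C#: the escape character is the backslash, and writing the pattern as a verbatim string (prefixed with @) keeps the C# compiler from interpreting that backslash itself. Regex.Escape is a real helper that escapes a whole literal for you; the class and method names below are invented for the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapingDemo
{
    // Unescaped "." matches any character, so "abc" qualifies.
    public static bool HasAnyChar(string s) => Regex.IsMatch(s, ".");

    // Escaped "\." matches only a literal period.
    public static bool HasLiteralDot(string s) => Regex.IsMatch(s, @"\.");

    // Escaped "\^" matches a literal caret instead of the start of a line.
    public static bool HasLiteralCaret(string s) => Regex.IsMatch(s, @"\^");

    // Regex.Escape turns arbitrary text into a pattern that matches it literally.
    public static bool EqualsLiterally(string input, string literal) =>
        Regex.IsMatch(input, "^" + Regex.Escape(literal) + "$");

    public static void Main()
    {
        Console.WriteLine(HasAnyChar("abc"));                // True
        Console.WriteLine(HasLiteralDot("abc"));             // False
        Console.WriteLine(HasLiteralDot("a.c"));             // True
        Console.WriteLine(HasLiteralCaret("2^8"));           // True
        Console.WriteLine(EqualsLiterally("1x5^2", "1.5^2")); // False
    }
}
```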


[ ]


Matches any one of the characters between the brackets.
For example, the regular expression s[ia]t matches sat and sit, but not set.


[c1-c2]


Ranges of characters can be specified by using a hyphen.
For example, the regular expression [0-9] matches any digit. Multiple ranges can be specified as well. The regular expression [A-Za-z] matches any uppercase or lowercase letter.


[^c1-c2]


To match any character except those in the range (the complement range), use the caret as the first character after the opening bracket.
For example, the expression [^123a-z] matches any character except 1, 2, 3, and lowercase letters.
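The bracket expressions can be tried out as follows. This is a minimal sketch using the patterns from the descriptions; the class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BracketDemo
{
    // One of the listed characters must sit between s and t.
    public static bool IsSatOrSit(string s) => Regex.IsMatch(s, "^s[ia]t$");

    // Ranges: only letters, upper or lower case.
    public static bool IsLettersOnly(string s) => Regex.IsMatch(s, "^[A-Za-z]+$");

    // Complement: true if some character is NOT 1, 2, 3 or a lowercase letter.
    public static bool HasOtherChar(string s) => Regex.IsMatch(s, "[^123a-z]");

    public static void Main()
    {
        Console.WriteLine(IsSatOrSit("sit"));      // True
        Console.WriteLine(IsSatOrSit("set"));      // False
        Console.WriteLine(IsLettersOnly("Regex")); // True
        Console.WriteLine(HasOtherChar("a1"));     // False
        Console.WriteLine(HasOtherChar("b4"));     // True
    }
}
```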


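\< \>


Match the beginning (\<) or end (\>) of a word in tools such as grep and vi. For example, \<the\> matches "the" in the string "for the older" but does not match "the" in "rather". The .NET regular expression engine does not support \< and \>; use the word boundary assertion \b instead, as in \bthe\b.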
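In .NET the whole-word match is written with the word boundary assertion \b on both sides of the word. A minimal sketch, with an invented class and method name:

```csharp
using System;
using System.Text.RegularExpressions;

public static class WordBoundaryDemo
{
    // \b asserts a position between a word character and a non-word character,
    // so "the" is only found when it stands alone as a word.
    public static bool ContainsWordThe(string s) => Regex.IsMatch(s, @"\bthe\b");

    public static void Main()
    {
        Console.WriteLine(ContainsWordThe("for the older")); // True
        Console.WriteLine(ContainsWordThe("rather"));        // False
    }
}
```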


( )


Treats the expression between ( and ) as a group and saves the characters matched by it in a numbered capture. Many classic tools save up to nine captures per regular expression, referenced as \1 through \9; .NET numbers groups the same way starting at \1 but does not stop at nine.


|


Matches either of two alternatives (a logical OR). For example, (him|her) matches the line "it belongs to him" and the line "it belongs to her" but does not match the line "it belongs to them."
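Grouping, backreferences and alternation can be combined in a short sketch. Match.Groups[1] holds the text saved by the first pair of parentheses, and \1 refers back to it inside the pattern. The class name, method names and the doubled-letter pattern are assumptions made for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupDemo
{
    // Returns the alternative that matched, or "" when neither does.
    public static string FindPronoun(string s)
    {
        Match m = Regex.Match(s, "(him|her)");
        return m.Success ? m.Groups[1].Value : "";
    }

    // ([a-z]) saves one letter; \1 requires the same letter again.
    public static string FirstDoubledLetter(string s)
    {
        Match m = Regex.Match(s, @"([a-z])\1");
        return m.Success ? m.Value : "";
    }

    public static void Main()
    {
        Console.WriteLine(FindPronoun("it belongs to him"));         // him
        Console.WriteLine(FindPronoun("it belongs to her"));         // her
        Console.WriteLine(FindPronoun("it belongs to them") == "");  // True
        Console.WriteLine(FirstDoubledLetter("hello"));              // ll
    }
}
```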


+


Matches one or more occurrences of the character or regular expression immediately preceding. For example, the regular expression 9+ matches 9, 99, 999.


?


Matches 0 or 1 occurrence of the character or regular expression immediately preceding.


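{i}


Matches exactly i instances of the preceding character or expression.
For example, the expression A[0-9]{3} matches "A" followed by exactly 3 digits, as in A123. Because the pattern is not anchored, it also finds A123 inside A1234; use ^A[0-9]{3}$ to reject longer input.


{i,j}


Matches between i and j instances of the preceding character or expression.
For example, the expression [0-9]{4,6} matches any sequence of 4, 5, or 6 digits.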
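The quantifiers +, ?, {i} and {i,j} can be exercised as below. Note that an unanchored A[0-9]{3} still finds a match inside A1234, which is a common source of confusion; adding ^ and $ makes the digit count strict. The class and method names, and the colou?r pattern, are assumptions for this sketch.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuantifierDemo
{
    // Unanchored: succeeds if "A" plus three digits occurs anywhere.
    public static bool ContainsCode(string s) => Regex.IsMatch(s, "A[0-9]{3}");

    // Anchored: the whole input must be "A" plus exactly three digits.
    public static bool IsCode(string s) => Regex.IsMatch(s, "^A[0-9]{3}$");

    // "+" is greedy and takes as many 9s as it can.
    public static string FirstNines(string s) => Regex.Match(s, "9+").Value;

    // {4,6} takes between four and six digits, preferring the longest run.
    public static string FirstDigitRun(string s) => Regex.Match(s, "[0-9]{4,6}").Value;

    // "?" makes the preceding "u" optional.
    public static bool IsColor(string s) => Regex.IsMatch(s, "^colou?r$");

    public static void Main()
    {
        Console.WriteLine(ContainsCode("A1234"));      // True
        Console.WriteLine(IsCode("A1234"));            // False
        Console.WriteLine(IsCode("A123"));             // True
        Console.WriteLine(FirstNines("999 bottles"));  // 999
        Console.WriteLine(FirstDigitRun("1234567"));   // 123456
        Console.WriteLine(IsColor("colour"));          // True
    }
}
```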


Character Escapes

The escape character (a single backslash) signals to the regular expression parser that the character following the backslash is not an operator.


\b


Matches a backspace when used inside a [ ] character class; elsewhere in a pattern, \b is the word boundary assertion.


\t


Matches a tab.


\r


Matches a carriage return.


\v


Matches a vertical tab.


\f


Matches a form feed.


\n


Matches a new line.


\e


Matches an escape.


\040


Matches an ASCII character as octal (up to three digits).


\x20


Matches an ASCII character using hexadecimal representation (exactly two digits).


\cC


Matches an ASCII control character; for example, \cC is control-C.


\u0020


Matches a Unicode character using hexadecimal representation (exactly four digits).
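A few of these escapes in use. Each one is written with a leading backslash, and verbatim strings (@"...") pass that backslash through to the regular expression parser unchanged. This is a sketch only; the class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CharacterEscapeDemo
{
    // \t matches a tab, so it can be used as a field separator.
    public static string[] SplitOnTabs(string s) => Regex.Split(s, @"\t");

    // Three ways of writing the space character (decimal 32).
    public static bool HasSpaceOctal(string s) => Regex.IsMatch(s, @"\040");
    public static bool HasSpaceHex(string s) => Regex.IsMatch(s, @"\x20");
    public static bool HasSpaceUnicode(string s) => Regex.IsMatch(s, @"\u0020");

    // \cC matches control-C (character code 3).
    public static bool HasControlC(string s) => Regex.IsMatch(s, @"\cC");

    // Inside [ ] the escape \b means backspace, not word boundary.
    public static bool HasBackspace(string s) => Regex.IsMatch(s, @"[\b]");

    public static void Main()
    {
        Console.WriteLine(string.Join("|", SplitOnTabs("a\tb\tc"))); // a|b|c
        Console.WriteLine(HasSpaceOctal("a b"));                     // True
        Console.WriteLine(HasSpaceHex("a b"));                       // True
        Console.WriteLine(HasSpaceUnicode("ab"));                    // False
        Console.WriteLine(HasControlC("\u0003"));                    // True
        Console.WriteLine(HasBackspace("a\bb"));                     // True
    }
}
```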


Substitutions:

Substitutions are the special constructs used in replacement patterns, and they are allowed only within replacement patterns.


Character


Description


$number


Substitutes the last substring matched by the group whose decimal number is number; for example, $1 refers to the first group.


${name}


Substitutes the last substring matched by a (?<name> ) group.


$$


Substitutes a single "$" literal.


$&


Substitutes a copy of the entire match itself.


$`


Substitutes all the text of the input string before the match.


$'


Substitutes all the text of the input string after the match.


$+


Substitutes the last group captured.


$_


Substitutes the entire input string.
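Substitutions appear in the third argument of Regex.Replace. The following sketch shows $number, ${name}, $& and $$; the sample names, class name and method names are invented for the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SubstitutionDemo
{
    // $1 and $2 refer to the first and second numbered groups.
    public static string SwapByNumber(string s) =>
        Regex.Replace(s, "([A-Za-z]+) ([A-Za-z]+)", "$2, $1");

    // ${last} and ${first} refer to groups declared with (?<name> ).
    public static string SwapByName(string s) =>
        Regex.Replace(s, "(?<first>[A-Za-z]+) (?<last>[A-Za-z]+)", "${last}, ${first}");

    // $& inserts a copy of the whole match.
    public static string BracketNumbers(string s) =>
        Regex.Replace(s, "[0-9]+", "[$&]");

    // $$ produces one literal dollar sign, followed here by the match itself.
    public static string AddDollarSign(string s) =>
        Regex.Replace(s, "[0-9]+", "$$$&");

    public static void Main()
    {
        Console.WriteLine(SwapByNumber("John Smith"));  // Smith, John
        Console.WriteLine(SwapByName("John Smith"));    // Smith, John
        Console.WriteLine(BracketNumbers("cost 25"));   // cost [25]
        Console.WriteLine(AddDollarSign("cost 25"));    // cost $25
    }
}
```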


Character Classes

A character class defines a set of characters; it matches when any one character in the set matches.


Character class


Description


.


Matches any character except \n. If modified by the Singleline option, a period character matches any character.


[aeiou]


Matches any single character included in the specified set of characters.


[^aeiou]


Matches any single character not in the specified set of characters.


[0-9a-fA-F]


Use of a hyphen (-) allows specification of contiguous character ranges.


\p{name}


Matches any character in the named character class specified by {name}.


\P{name}


Matches text not included in groups and block ranges specified in {name}.


\w


Matches any word character.


\W


Matches any non-word character.
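The character classes can be sketched as follows. Lu is the Unicode general category for uppercase letters and is used here as one possible value of {name}; the class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CharacterClassDemo
{
    // "." stops at \n unless RegexOptions.Singleline is set.
    public static bool DotMatchesAcross(string s, bool singleline) =>
        Regex.IsMatch(s, "a.b", singleline ? RegexOptions.Singleline : RegexOptions.None);

    // [0-9a-fA-F] combines three contiguous ranges into one class.
    public static bool IsHex(string s) => Regex.IsMatch(s, "^[0-9a-fA-F]+$");

    // \W matches every non-word character, so removing them keeps only word characters.
    public static string KeepWordChars(string s) => Regex.Replace(s, @"\W", "");

    // \p{Lu} matches any uppercase letter.
    public static int CountUppercase(string s) => Regex.Matches(s, @"\p{Lu}").Count;

    public static void Main()
    {
        Console.WriteLine(DotMatchesAcross("a\nb", false)); // False
        Console.WriteLine(DotMatchesAcross("a\nb", true));  // True
        Console.WriteLine(IsHex("1aF"));                    // True
        Console.WriteLine(KeepWordChars("Hello, World!"));  // HelloWorld
        Console.WriteLine(CountUppercase("Hello World"));   // 2
    }
}
```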






 

 


Copyright © 2000 - 2009 exforsys.com. All Rights Reserved
