Monday, February 16, 2009

Regular Expressions in ASP.NET

I worked with PERL for a long time and of course I used the power of regular expressions intensively. I was glad to read that ASP.NET has built-in support for regular expressions, but after reading about it I realized I can say bye-bye to the elegant one-line commands I am used to from PERL.
After reading the documentation and examples, and playing with the code myself, I finally understand how to achieve with the ASP.NET class hierarchy what was so easy to do in PERL.
This post is intended for people who know what regular expressions are and want to understand how to use them in C#.
This is what I needed to do:
I had a string consisting of some prefix, underscore and then a number: {prefix}_{num}
I wanted to get the number from that string.
The PERL way would be something like
my ($num) = $my_string =~ /_([0-9]+)$/;
That's it... so simple.
Now this is how I did it in C#:

// requires: using System.Text.RegularExpressions;
Match match = Regex.Match(my_string, @"_(?<num>[0-9]+)$");
int num = Int32.Parse(match.Groups["num"].Value);

I used a named group: the (?<num>...) syntax gives the group I was looking for a name, "num".
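One thing the snippet above does not cover is a string that doesn't match the pattern. In that case the "num" group is empty and Int32.Parse throws a FormatException. Here is a minimal sketch of a more defensive version; the class and method names (SuffixNumber, TryGet) are made up for illustration, while Match.Success and int.TryParse are the standard .NET members.

```csharp
using System;
using System.Text.RegularExpressions;

static class SuffixNumber
{
    // Hypothetical helper, not part of the original code:
    // returns true and sets num when input ends with "_<digits>".
    public static bool TryGet(string input, out int num)
    {
        num = 0;
        Match match = Regex.Match(input, @"_(?<num>[0-9]+)$");

        // Success is false when the pattern did not match at all;
        // TryParse also returns false if the digits overflow an int.
        return match.Success && int.TryParse(match.Groups["num"].Value, out num);
    }
}
```

Calling SuffixNumber.TryGet(my_string, out num) then lets the caller decide what to do when there is no number, instead of catching an exception.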
First I tried it without naming, omitting the ?<num> from the pattern and looking at the first group, match.Groups[0], but what I got was the whole _{num} string, not only the number I was looking for. When I used named grouping I got only the number, but the difference puzzled me, so I looked into it a little more. After some digging I found out that the first group, Groups[0], is always the whole match, and the captured groups (what PERL calls $1, $2 and so on) start at Groups[1]. So if I didn't want to use named groups, I would have to use Groups[1] instead of Groups[0].
It is quite confusing (and not really documented, at least where I was looking), so it seems clearer to me to stick with named groups.
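To make the numbering concrete, here is a small sketch that looks at each group for a sample string. The input "prefix_123" and the names GroupIndexDemo and GroupByIndex are made up for illustration; the pattern is the unnamed variant of the one used above.

```csharp
using System;
using System.Text.RegularExpressions;

static class GroupIndexDemo
{
    // Hypothetical helper: returns the text of the group at the given index
    // for the unnamed pattern.
    public static string GroupByIndex(string input, int index)
    {
        return Regex.Match(input, @"_([0-9]+)$").Groups[index].Value;
    }

    static void Main()
    {
        string myString = "prefix_123"; // sample input, assumed for the demo

        Console.WriteLine(GroupByIndex(myString, 0)); // _123 (the whole match)
        Console.WriteLine(GroupByIndex(myString, 1)); // 123  (first capturing group)

        // A named group still gets a number, so both lookups find the same text.
        Match named = Regex.Match(myString, @"_(?<num>[0-9]+)$");
        Console.WriteLine(named.Groups["num"].Value); // 123
        Console.WriteLine(named.Groups[1].Value);     // 123
    }
}
```

So the named group is a convenience on top of the numbered one: Groups["num"] and Groups[1] refer to the same capture here, and Groups[0] is the whole match in both cases.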
