Mark Leighton Fisher (4252)

http://mark-fisher.home.mindspring.com/

I am a Systems Engineer at Regenstrief Institute [regenstrief.org]. I also own Fisher's Creek Consulting [comcast.net].
Friday January 26, 2007
01:25 PM

.NET Regexes: Why Groups and Captures?

[ #32261 ]

When you need to grab part of a regex for later use, why does .NET require dealing with both groups and captures? In my experience, the canonical method for dealing with regex captures has you numbering your captures starting at 1 from their opening parenthesis, so that in a regex like:

    ^[^"']*((.)([^"']*)(.))

capture $1 is the whole quoted string (quotes included), captures $2 and $4 are the opening and closing quotes, and capture $3 is the text between the quotes.
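As a rough illustration of that numbering rule, here is a minimal sketch in Python, whose `re` module also numbers groups by the position of their opening parenthesis. The sample input string is made up for the example and is not from the post.

```python
import re

# The pattern from the post. Groups are numbered left to right
# by the position of their opening parenthesis.
pattern = re.compile(r"""^[^"']*((.)([^"']*)(.))""")

# Hypothetical sample input, chosen only for illustration.
match = pattern.search('say "hello" now')

# groups() returns group 1 through group 4, in that order.
whole, open_quote, inner, close_quote = match.groups()

print(whole)        # "hello"
print(open_quote)   # "
print(inner)        # hello
print(close_quote)  # "
```

Group 1 opens first, so it wraps the other three; groups 2, 3 and 4 follow in the order their parentheses open.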

In the .NET regex classes, you can't directly access the captures; you must go through their containers, the Groups. What would match my own personal model is one where each Capture is linked somehow to the Captures contained within it, with each Regex linked to all of its Captures in opening-parenthesis-first order, as in the example above. If there are no Captures, there is no link. ("Link" does not mean a literal link; it could just as well be some kind of collection reference.)

I suspect there is a use for Groups, but I'm not sure what it is. Maybe when you are generating your regexes with a program you could use a Group to simplify your code.

The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • To translate from Perl terminology to .NET, think of a group as a capture. If you just want the value that you would get in $1, $2, etc., then access group 1, 2, etc. and use the .Value property (I think).

    Captures are something else in .NET: they are useful if you have a quantified group. So if you have

    ([ab])*

    and matched the string aabba, then your group object would just have the value "a" (the last repetition matched), but there would be 5 capture objects held in the collection within the group object, having the values "a", "a", "b", "b", "a".
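The distinction the comment draws can be approximated in Python, with one loud caveat: Python's standard `re` module, like Perl, keeps only the last repetition of a quantified group and has no per-repetition capture list. The sketch below therefore rebuilds that list by hand; it only imitates what the comment says .NET's capture collection holds and is not the .NET API.

```python
import re

text = "aabba"

# A quantified group keeps only its last repetition here, like Perl's $1
# (and like the group value the comment describes for .NET).
match = re.fullmatch(r"([ab])*", text)
last_value = match.group(1)

# Assumption for illustration: emulate a per-repetition capture list by
# matching the group body again, one repetition at a time, over the
# text that the whole pattern matched.
captures = [m.group(0) for m in re.finditer(r"[ab]", match.group(0))]

print(last_value)  # a
print(captures)    # ['a', 'a', 'b', 'b', 'a']
```

The single value is what Perl-style numbered access gives you; the five-element list is the extra information that a quantified group would otherwise throw away.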