Regular Expressions Overview


What is a regular expression?

Regular expressions are a powerful method for finding a substring in a string, file or other stream of text data. They are particularly useful for validating complex user input and for searching for patterns in source files and other semi-structured text data.

A regular expression class exhibits the following functionality: parsing a regular expression, scanning a string for matches, and returning the matched groups.

Here is a simple example: the following regular expression matches a time string in h:mm or hh:mm format:

   {[0-9]?[0-9]}:{[0-9][0-9]}

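The symbols in this expression have the following meaning:

   { }      Marks a group; the text matched inside it is returned as a separate result
   [0-9]    Matches any single digit
   ?        Makes the preceding item optional
   :        Matches a literal colon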
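As a point of comparison, the same time pattern can be sketched with the standard C++ <regex> library. This is not Str Library code: the function name findTime is hypothetical, and the standard library writes groups with parentheses where the expression above uses braces.

```cpp
#include <regex>
#include <string>

// Looks for a time in h:mm or hh:mm form anywhere inside `text`.
// On success, stores the two captured pieces in `hours` and `minutes`
// and returns true; otherwise leaves them untouched and returns false.
// Uses the standard <regex> library, not Str Library's Regexp class.
bool findTime(const std::string& text, std::string& hours, std::string& minutes) {
    // Parentheses mark the two groups here, where the expression above uses braces.
    static const std::regex pattern("([0-9]?[0-9]):([0-9][0-9])");
    std::smatch m;
    if (!std::regex_search(text, m, pattern)) {
        return false;
    }
    hours = m[1].str();    // first group: one or two digits before the colon
    minutes = m[2].str();  // second group: exactly two digits after the colon
    return true;
}
```

The optional first digit is what lets one expression accept both h:mm and hh:mm.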

 

Library regular expression classes

There are two classes used with regular expressions. The first one - Regexp - contains most of the functionality: parsing a regular expression and scanning for matches within a string.

The second class - RegexpMatch - is a helper class that will hold an array with the results for one particular match. You use a temporary instance of RegexpMatch every time you call Regexp::Match.

To use regular expressions with Str Library, you must first define the symbol STR_USE_REGEX so that appropriate code will be compiled and linked in your project. Also make sure your project file / makefile includes str_impadvanced.cpp as part of the build process.
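One way to meet the first requirement is to define the symbol for the whole project rather than in a single source file, so that str_impadvanced.cpp presumably sees it as well. The sketch below is illustrative only: the flags in the comments are the usual GCC/Clang and MSVC spellings for defining a preprocessor symbol, and the helper regexSupportEnabled is hypothetical, included just to show how code guarded by the symbol gets switched on.

```cpp
// Project-wide definition, passed on the compiler command line:
//   g++ -DSTR_USE_REGEX ...     (GCC / Clang)
//   cl  /DSTR_USE_REGEX ...     (MSVC)
// For this stand-alone sketch the symbol is defined in the source instead.
#ifndef STR_USE_REGEX
#define STR_USE_REGEX
#endif

// Hypothetical helper: reports whether code guarded by STR_USE_REGEX
// would be compiled in for this translation unit.
constexpr bool regexSupportEnabled() {
#ifdef STR_USE_REGEX
    return true;
#else
    return false;
#endif
}
```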

 

Usage example

The following snippet parses the regular expression shown above, calls Match to test whether the pattern can be found inside a string, and checks that the two groups in the result are as expected:

   Regexp re;
   re.Parse("{[0-9]?[0-9]}:{[0-9][0-9]}");
   RegexpMatch mc;
   if (re.Match("2:57", &mc)) {     // Successful match
     ASSERT(mc.GetCount() == 2);    // One result per group
     Str item = mc.GetMatch(0);     // First group
     ASSERT(item == "2");
     item = mc.GetMatch(1);         // Second group
     ASSERT(item == "57");
   }

Note that the example above would also match a string such as "Pink2:57 Floyd", because Match looks for the pattern anywhere inside the string; this is expected behavior. If you want the pattern to start at the beginning of the string and finish at its end, you can use special directives in the expression itself.
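The difference between the two kinds of match can be illustrated with the standard C++ <regex> library, which offers a separate function for each. This is a sketch for comparison only: it does not show the Str Library directives, and the function names containsTime and isTime are hypothetical.

```cpp
#include <regex>
#include <string>

// Substring search: true when the time pattern occurs anywhere in `text`.
bool containsTime(const std::string& text) {
    static const std::regex pattern("[0-9]?[0-9]:[0-9][0-9]");
    return std::regex_search(text, pattern);
}

// Whole-string match: true only when the pattern starts at the beginning
// of `text` and finishes at its end.
bool isTime(const std::string& text) {
    static const std::regex pattern("[0-9]?[0-9]:[0-9][0-9]");
    return std::regex_match(text, pattern);
}
```

With these definitions, containsTime accepts "Pink2:57 Floyd" while isTime rejects it; both accept "2:57".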

 

See also: Regular expressions classes, All methods, STR_USE_REGEX

Web links: A regular expression tutorial (at living-source)