7.2 Simple Uses of Regular Expressions

If we were looking for all lines of a file that contain the string abc, we might use the Windows NT findstr command:

>findstr abc somefile > results

In this case, abc is the regular expression that the findstr command tests against each input line. Lines that match are sent to standard output, and end up in the file results because of the command-line redirection.

In Perl, we can speak of the string abc as a regular expression by enclosing the string in slashes:

if (/abc/) {
  print $_;
}

But what is being tested against the regular expression abc in this case? Why, it's our old friend, the $_ variable! When a regular expression is enclosed in slashes (as above), the $_ variable is tested against the regular expression. If the regular expression matches, the match operator returns true. Otherwise, it returns false.
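As a minimal sketch of that true-or-false behavior, the fragment below puts two sample strings into $_ and reports whether each one matches. The helper name line_status and the sample strings are assumptions made for illustration; they are not part of the original text.

```perl
use strict;
use warnings;

# Hypothetical helper: report whether a piece of text contains abc.
sub line_status {
    local $_ = shift;    # the match operator tests $_ by default
    return /abc/ ? "match" : "no match";
}

print line_status("xxabcxx"), "\n";    # abc appears in sequence
print line_status("a b c"), "\n";      # the letters are not adjacent
```

The first call prints match and the second prints no match, because the three characters must appear together and in order.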

For this example, the $_ variable is presumed to contain a line of text, which is printed if it contains the characters abc in sequence anywhere within the line, much like the findstr command above. Unlike findstr, which operates on every line of a file, this Perl fragment looks at just one line. To work on all lines, add a loop, as in:

while (<>) {
  if (/abc/) {
    print $_;
  }
}

What if we didn't know the number of b's between the a and the c? That is, what if we want to print the line if it contains an a followed by zero or more b's, followed by a c? With findstr, we'd say:

>findstr ab*c somefile > results

In Perl, we can say exactly the same thing:

while (<>) {
  if (/ab*c/) {
    print $_;
  }
}

Just like findstr, this loop looks for an a followed by zero or more b's followed by a c.
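To see what "zero or more b's" means in practice, here is a small sketch that tries the pattern against a handful of strings. The helper name has_ab_star_c and the sample strings are assumptions chosen for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the text contains an a,
# then zero or more b's, then a c.
sub has_ab_star_c {
    local $_ = shift;    # the match operator tests $_ by default
    return /ab*c/ ? 1 : 0;
}

for my $sample ("ac", "abc", "abbbbc", "xxabbcxx", "abd") {
    my $verdict = has_ab_star_c($sample) ? "matches" : "does not match";
    print "$sample $verdict\n";
}
```

Every sample except abd matches. Note that ac matches too, because zero b's is an acceptable count.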

We'll visit more uses of pattern matching in the section "More on the Matching Operator," later in the chapter, after we talk about all kinds of regular expressions.

Another simple regular expression operator is the substitute operator, which replaces the part of a string that matches the regular expression with another string. The substitute operator consists of the letter s, a slash, a regular expression, a slash, a replacement string, and a final slash, looking something like:

s/ab*c/def/;

The variable (in this case, $_) is matched against the regular expression (ab*c). If the match is successful, the part of the string that matched is discarded and replaced by the replacement string (def). If the match is unsuccessful, nothing happens.
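The two outcomes described above can be sketched as follows. The helper name replace_ab_star_c and the sample strings are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: apply the substitution to a copy of the text
# and return the result.
sub replace_ab_star_c {
    local $_ = shift;    # the substitute operator works on $_ by default
    s/ab*c/def/;         # replace the matched part, if there is one
    return $_;
}

print replace_ab_star_c("xxabbbcyy"), "\n";    # prints xxdefyy
print replace_ab_star_c("hello world"), "\n";  # no match, so unchanged
```

In the first call only the matched part (abbbc) is replaced; the surrounding xx and yy are left alone. In the second call nothing matches, so the string comes back as it went in.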

As with the match operator, we'll revisit the myriad options on the substitute operator later, in the section "Substitutions."