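6.8. Extracting a Range of Lines

Problem

You want to extract all lines from one starting pattern through an ending pattern, or from a starting line number up to an ending line number.

A common example of this is extracting the first 10 lines of a file (line numbers 1 to 10) or just the body of a mail message (everything past the blank line).

Solution

Use the operators .. or ... with patterns or line numbers. The .. operator can return true even when both of its tests succeed on the same line. In the line-number examples, $FIRST_LINE_NUM and $LAST_LINE_NUM are placeholders for literal numbers (see the Discussion).

    while (<>) {
        if (/BEGIN PATTERN/ .. /END PATTERN/) {
            # line falls between BEGIN and END in the
            # text, inclusive.
        }
    }

    while (<>) {
        if ($FIRST_LINE_NUM .. $LAST_LINE_NUM) {
            # operate only between first and last line, inclusive.
        }
    }

The ... operator waits until the next line before it tests its second condition:

    while (<>) {
        if (/BEGIN PATTERN/ ... /END PATTERN/) {
            # line is between BEGIN and END on different lines
        }
    }

    while (<>) {
        if ($FIRST_LINE_NUM ... $LAST_LINE_NUM) {
            # operate only between first and last line, but not same
        }
    }

Discussion

The range operators, .. and ..., return a true or false value in scalar context (such as the test of an if or while), and that value depends partly on what they last returned. The expression left .. right is false until the left operand is true. From then on it stops evaluating the left operand and stays true until the right operand becomes true, after which the cycle restarts. In other words, the first condition switches the range on and the second switches it off.

These conditions are completely arbitrary: any two expressions, even function calls, can serve as operands. In practice the operands are usually line numbers, patterns, or a mix of the two.

    # command line to print lines 15 through 17 inclusive (see below)
    perl -ne 'print if 15 .. 17' datafile

    # print out all <XMP> .. </XMP> displays from HTML doc
    while (<>) {
        print if m#<XMP>#i .. m#</XMP>#i;
    }

    # same, but as shell command
    % perl -ne 'print if m#<XMP>#i .. m#</XMP>#i' document.html

If either operand is a numeric literal, the range operators implicitly compare against the $. variable, the current input line number. This works only for literal numbers written in the code, not for variables that hold line numbers. A variable operand is simply tested for truth, so with variables you must compare against $. yourself.

    # FAILS: $top and $bottom are just true values, so every line is printed
    perl -ne 'BEGIN { $top=3; $bottom=5 } print if $top .. $bottom' /etc/passwd

    # works: explicit comparison against $.
    perl -ne 'BEGIN { $top=3; $bottom=5 } print if $. == $top .. $. == $bottom' /etc/passwd

    # also works: numeric literals
    perl -ne 'print if 3 .. 5' /etc/passwd

The difference between .. and ... is their behavior when both operands can be true on the same line. Consider these two cases:

    print if /begin/ .. /end/;
    print if /begin/ ... /end/;

Given a line that matches both /begin/ and /end/, both forms return true for that line. The .. version prints no further lines, because once the first test matches it tests the second condition on the same line, and that test tells it the range has ended. The ... version keeps going until the next line that matches /end/, because it never tests both operands on the same line.

You may mix and match conditions of different sorts, as in:

    while (<>) {
        $in_header = 1 .. /^$/;
        $in_body   = /^$/ .. eof();
    }

The first assignment sets $in_header to true from the first input line through the blank line that ends the header, as in a mail message. The second sets $in_body to true from the first blank line through end of file. Because the range operators do not retest their initial condition while a range is active, later blank lines (such as those between paragraphs) go unnoticed.

Here's an example. It reads files containing mail messages and prints addresses it finds in headers. Each address is printed only once. The extent of the header is from a line beginning with "From" (with or without a colon, matched case-insensitively) through the first blank line.

    %seen = ();
    while (<>) {
        next unless /^From:?\s/i .. /^$/;
        while (/([^<>(),;\s]+\@[^<>(),;\s]+)/g) {
            print "$1\n" unless $seen{$1}++;
        }
    }

If all this range business seems mighty strange, chalk it up to trying to support the s2p and a2p translators for converting sed and awk code into Perl. Both those tools have range operators that must work in Perl.

See Also

The .. and ... operators in the "Range Operators" section of perlop(1); the entry for $. in perlvar(1)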
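The difference between the two-dot and three-dot forms described in the Discussion can be sketched as a small self-contained script. The subroutine names select_two_dot and select_three_dot and the sample lines are assumptions made for illustration; they are not part of the recipe.

```perl
use strict;
use warnings;

# Hypothetical helper: collect the lines selected by the two-dot form.
# Once /begin/ matches, .. tests /end/ on the same line, so a line
# matching both patterns opens and closes the range at once.
sub select_two_dot {
    my @picked;
    for (@_) {
        push @picked, $_ if /begin/ .. /end/;
    }
    return @picked;
}

# Hypothetical helper: collect the lines selected by the three-dot form.
# After /begin/ matches, ... waits for a later line before testing /end/.
sub select_three_dot {
    my @picked;
    for (@_) {
        push @picked, $_ if /begin/ ... /end/;
    }
    return @picked;
}

# Sample input (illustrative only); the second line matches both patterns.
my @lines = (
    "preamble",
    "begin and end together",
    "middle",
    "the end",
    "epilogue",
);

print "two-dot:   $_\n" for select_two_dot(@lines);
print "three-dot: $_\n" for select_three_dot(@lines);
```

With this input the two-dot form should select only "begin and end together", while the three-dot form should select that line plus "middle" and "the end". Each range operator keeps its own state between calls to the subroutine that contains it, so the sample input is chosen to leave both ranges closed at the end.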