1.15. Parsing Comma-Separated Data

Problem

You have a data file containing comma-separated values that you need to read in, but these data fields may have quoted commas or escaped quotes in them. Most spreadsheets and database programs use comma-separated values as a common interchange format.

Solution

Use the procedure in Mastering Regular Expressions.

    sub parse_csv {
        my $text = shift;      # record containing comma-separated values
        my @new  = ();
        push(@new, $+) while $text =~ m{
            # the first part groups the phrase inside the quotes.
            # see explanation of this pattern in MRE
            "([^\"\\]*(?:\\.[^\"\\]*)*)",?
              |  ([^,]+),?
              | ,
        }gx;
        push(@new, undef) if substr($text, -1,1) eq ',';
        return @new;           # list of values that were comma-separated
    }

Or use the standard Text::ParseWords module.

    use Text::ParseWords;

    sub parse_csv {
        return quotewords(",", 0, $_[0]);
    }

Discussion

Comma-separated input is a deceptively complex format. It sounds simple, but it requires an escaping system because the fields themselves can contain commas. This makes the pattern-matching solution complex and rules out a simple split /,/.

Fortunately, Text::ParseWords hides the complexity from you. Pass its quotewords function the separator (a comma, in this case), a true or false value that controls whether the returned fields keep their surrounding quotes, and the CSV string.

If you want to represent quotation marks inside a field delimited by quotation marks, escape them with backslashes, as in the field "a \"glug\" bit," in the sample record.

Here's how you'd use the parse_csv subroutine. The q<> operator single-quotes the string, so the embedded quotation marks and backslashes reach parse_csv unchanged.

    $line = q<XYZZY,"","O'Reilly, Inc","Wall, Larry","a \"glug\" bit,",5,"Error, Core Dumped">;
    @fields = parse_csv($line);
    for ($i = 0; $i < @fields; $i++) {
        print "$i : $fields[$i]\n";
    }

See Also

The explanation of regular expression syntax in perlre(1) and Chapter 2 of Programming Perl; the documentation for the standard Text::ParseWords module (also in Chapter 7 of Programming Perl); the section "An Introductory Example: Parsing CSV Text" in Chapter 7 of Mastering Regular Expressions.
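The Discussion's point that quoted commas rule out a plain split can be checked with a short side-by-side comparison. This is only a minimal sketch: the shortened sample record and the variable names @naive and @fields are assumptions made for illustration, not part of the recipe.

```perl
use strict;
use warnings;
use Text::ParseWords qw(quotewords);

# Shortened sample record (an assumption for illustration).
my $line = q<XYZZY,"","O'Reilly, Inc","Wall, Larry",5>;

# Naive approach: split on every comma, including those inside quoted fields.
my @naive = split /,/, $line;

# quotewords(delimiter, keep_quotes, string): a false second argument
# strips the surrounding quotes from each returned field.
my @fields = quotewords(',', 0, $line);

printf "split /,/  : %d fields\n", scalar @naive;
printf "quotewords : %d fields\n", scalar @fields;
print "$_ : $fields[$_]\n" for 0 .. $#fields;
```

On this record the plain split should cut both quoted names in two and report seven pieces, while quotewords should report the five intended fields.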