1.4 Filehandles

Unless you're using artificial intelligence to model a solipsistic philosopher, your program needs some way to communicate with the outside world. In lines 3 and 4 of our grade example you'll see the word GRADES, which exemplifies another of Perl's data types, the filehandle. A filehandle is just a name you give to a file, device, socket, or pipe to help you remember which one you're talking about, and to hide some of the complexities of buffering and such. (Internally, filehandles are similar to streams from a language like C++, or I/O channels from BASIC.)

Filehandles make it easier for you to get input from and send output to many different places. Part of what makes Perl a good glue language is that it can talk to many files and processes at once. Having nice symbolic names for various external objects is just part of being a good glue language.[17]

[17] Some of the other things that make Perl a good glue language are: it's 8-bit clean, it's embeddable, and you can embed other things in it via extension modules. It's concise, and networks easily. It's environmentally conscious, so to speak. You can invoke it in many different ways (as we saw earlier). But most of all, the language itself is not so rigidly structured that you can't get it to "flow" around your problem. It comes back to that TMTOWTDI thing again.

You create a filehandle and attach it to a file by using the open function. In its basic form, open takes two parameters: the filehandle and the filename you want to associate it with. Perl also gives you some predefined (and preopened) filehandles. STDIN is your program's normal input channel, while STDOUT is your program's normal output channel. And STDERR is an additional output channel so that your program can make snide remarks off to the side while it transforms (or attempts to transform) your input into your output.[18]

[18] These filehandles are typically attached to your terminal, so you can type to your program and see its output, but they may also be attached to files (and such). Perl can give you these predefined handles because your operating system already provides them, one way or another. Under UNIX, processes inherit standard input, output, and error from their parent process, typically a shell. One of the duties of a shell is to set up these I/O streams so that the child process doesn't need to worry about them.
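As a small illustration of the predefined output handles, here is a minimal sketch. The value in $input and the wording of both messages are made up for the example; only STDOUT and STDERR come from the text above.

```perl
use strict;
use warnings;

my $input = 42;    # hypothetical input value, assumed for illustration

# The real result goes to the normal output channel.
print STDOUT "Result: ", $input * 2, "\n";

# The side remark goes to the additional output channel.
print STDERR "Note: doubled the input, as asked\n";
```

If you run this from a UNIX shell with standard output redirected to a file, the result should land in the file while the note still shows up on your terminal, since the two handles are separate channels.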

Since you can use the open function to create filehandles for various purposes (input, output, piping), you need to be able to specify which behavior you want. As you would on the UNIX command line, you simply add characters to the filename:

open(SESAME, "filename");               # read from existing file
open(SESAME, "<filename");              #   (same thing, explicitly)
open(SESAME, ">filename");              # create file and write to it
open(SESAME, ">>filename");             # append to existing file
open(SESAME, "| output-pipe-command");  # set up an output filter
open(SESAME, "input-pipe-command |");   # set up an input filter

As you can see, the name you pick is arbitrary. Once opened, the filehandle SESAME can be used to access the file or pipe until it is explicitly closed (with, you guessed it, close(SESAME)), or the filehandle is attached to another file by a subsequent open on the same filehandle.[19]
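To see the file modes working together, here is a rough sketch that creates a file, appends to it, and reads it back, closing the filehandle between steps. The scratch file name is hypothetical, and the File::Temp module is used only to get a throwaway directory. The "or die" checks are an addition: open returns false on failure and leaves the reason in $!.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical scratch file in a temporary directory (removed at exit).
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/sesame.txt";

# ">" creates the file (or empties an existing one) for writing.
open(SESAME, ">$file") or die "Can't create $file: $!";
print SESAME "first line\n";
close(SESAME);

# ">>" adds to the end of the existing file.
open(SESAME, ">>$file") or die "Can't append to $file: $!";
print SESAME "second line\n";
close(SESAME);

# "<" opens the existing file for reading.
open(SESAME, "<$file") or die "Can't read $file: $!";
my $first  = <SESAME>;    # each read returns the next line
my $second = <SESAME>;
close(SESAME);

print "Read back: $first";
print "Read back: $second";
```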

[19] Opening an already opened filehandle implicitly closes the first file, making it inaccessible to the filehandle, and opens a different file. You must be careful that this is what you really want to do. Sometimes it happens accidentally, like when you say open($handle,$file), and $handle happens to contain the null string. Be sure to set $handle to something unique, or you'll just open a new file on the null filehandle.
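The implicit close is easy to demonstrate. In this sketch (file names and contents are invented for the example), SESAME is opened on one file and then immediately opened again on another. The next read comes from the second file, and the first is no longer reachable through the handle.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical scratch files in a temporary directory (removed at exit).
my $dir = tempdir(CLEANUP => 1);
my %contents = (alpha => "from alpha\n", beta => "from beta\n");
for my $name (sort keys %contents) {
    open(OUT, ">$dir/$name") or die "Can't create $dir/$name: $!";
    print OUT $contents{$name};
    close(OUT);
}

open(SESAME, "<$dir/alpha") or die "Can't read alpha: $!";
open(SESAME, "<$dir/beta")  or die "Can't read beta: $!";  # alpha is implicitly closed here

my $line = <SESAME>;    # reads from beta, not alpha
close(SESAME);

print "SESAME now yields: $line";
```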

Once you've opened a filehandle for input (or if you want to use STDIN), you can read a line using the line reading operator, <>. This is also known as the angle operator, because of its shape. The angle operator encloses the filehandle (<SESAME>) you want to read lines from.[20] An example using the STDIN filehandle to read an answer supplied by the user would look something like this:

[20] The empty angle operator, <>, will read lines from all the files specified on the command line, or STDIN, if none were specified. (This is standard behavior for many UNIX filter programs.)

print STDOUT "Enter a number: ";          # ask for a number
$number = <STDIN>;                        # input the number
print STDOUT "The number is $number\n";   # print the number

Did you see what we just slipped by you? What's the STDOUT doing in those print statements there? Well, that's one of the ways you can use an output filehandle. A filehandle may be supplied as the first argument to the print statement, and if present, tells the output where to go. In this case, the filehandle is redundant, because the output would have gone to STDOUT anyway. Much as STDIN is the default for input, STDOUT is the default for output. (In line 18 of our grade example, we left it out, to avoid confusing you up till now.)
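The same first-argument form works for any filehandle you have opened for output, not just STDOUT. The following is a sketch under assumed names: REPORT and the scratch file are hypothetical. One line goes to the file because a filehandle is given, and one goes to STDOUT because none is.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical scratch file in a temporary directory (removed at exit).
my $dir = tempdir(CLEANUP => 1);
my $log = "$dir/report.txt";

open(REPORT, ">$log") or die "Can't create $log: $!";
print REPORT "This line goes to the file\n";   # explicit filehandle
print "This line goes to STDOUT\n";            # no filehandle, so STDOUT
close(REPORT);

# Read the file back to confirm where the first line ended up.
open(REPORT, "<$log") or die "Can't read $log: $!";
my $saved = <REPORT>;
close(REPORT);

print "File contains: $saved";
```

Note that there is no comma between the filehandle and the first item to print. With a comma, Perl would treat REPORT as something to print rather than as the destination.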

We also did something else to trick you. If you try the number-prompting example above, you may notice that you get an extra blank line. This happens because the read does not automatically remove the newline from your input line (your input would be, for example, "9\n"). For those times when you do want to remove the newline, Perl provides the chop and chomp functions. chop will indiscriminately remove (and return) the last character of the string passed to it, while chomp will only remove the end-of-record marker (generally, "\n"), and return the number of characters so removed. You'll often see this idiom for inputting a single line:

chop($number = <STDIN>);    # input number and remove newline

which means the same thing as

$number = <STDIN>;          # input number
chop($number);              # remove newline
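The difference between chop and chomp matters when the input might not end in a newline. Here is a small sketch using literal strings in place of real input (the variable names are chosen just for this example):

```perl
use strict;
use warnings;

# chomp removes only a trailing end-of-record marker.
my $line    = "9\n";
my $removed = chomp($line);   # $line is now "9"; one character was removed
my $again   = chomp($line);   # nothing left to remove, so this returns 0

# chop removes the last character, whatever it happens to be.
my $text = "9";
my $last = chop($text);       # takes the "9" itself; $text is now empty

print "chomp removed $removed character(s), then $again, leaving '$line'\n";
print "chop removed '$last', leaving '$text'\n";
```

For that reason, the input idiom can also be written as chomp($number = <STDIN>), which leaves the input untouched if it happens to arrive without a trailing newline.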