What is Truth? - The Perl Journal, Summer 1999

Nathan Torkington

Some programming languages give you a single TRUE and a single FALSE. Others make you represent each with integers (typically 1 and 0). But not Perl. Truth plays a larger role in Perl than most other languages, and its subtleties often confuse beginners. In Perl, truthfulness is determined from a few simple rules. To understand one of those rules, though, we need to first learn about good programming practice, warnings, laziness, and undef. Then we can learn the true nature of truth and see how to apply it in our own programs.

`undef`

Perl was designed for system administrators who want to automate the automatable tasks of their jobs. Such people typically produce small programs, and don't need to worry about formally verifiable correctness, corporate coding standards, and other such things. They just want to get the job done. For this reason, Perl's default behavior allows the programmer to be lazy and leave off parentheses around function arguments, to use subroutines before they're defined, and to use variables without initializations or even definitions. However, these very same practices that enable small programs to be written quickly can bog down larger programs by permitting subtle errors. By assuming that the programmer is all-knowing and perfect, Perl fails to notice things that smart and lazy programmers do deliberately but are mistakes for the rest of us. That's what Perl's -w flag is for. The -w flag turns on warnings for ambiguous or possibly erroneous practices. One of the things it catches is the use of a variable before it has a value. Here's the simplest possible demonstration of that:

 
#!/usr/bin/perl -w  
print $x;

This actually generates two warnings: "main::x only used once" and "use of uninitialized value". The first warning comes after Perl has finished compiling our program and realizes "hey, I only saw that variable once. That's probably a mistake." The second warning comes at runtime, when we attempt to print out $x without having given it a value. In cases like these, $x contains "the undefined value" or "the uninitialized value", which is written as undef. This value is completely separate from any other Perl value: it isn't the empty string, nor is it zero. It's almost like NULL in C. undef is the special value a variable holds whenever it hasn't yet been assigned a value. We can test for undef with the defined function:

if (defined $x) {
    print "x has value $x\n";
} else {
    print "x is undefined\n";
}

Any attempt to use $x as though it were a real value (by treating it like a string or a number, for instance) generates a warning at runtime if you're using -w. This means we can't use == or eq with $x: those are the number and string comparison operators, and we'll get a warning for trying to use undef as a number or string. If we do try, Perl will emit its warning if we used -w, but it keeps on going whether or not we used -w. This is how the lazy programmers were able to leave warnings off and have their code still work: Perl treats undef as either zero or the empty string, depending on how it's used. That's why we don't see anything when we try to print an undefined variable. Similarly:

  
#!/usr/bin/perl -w
$y = $x + 3;       # $y = 3, warning emitted
$y = length $x;    # $y = 0, warning emitted

We can return a scalar variable to its initial pristine undefined state by using undef as either a function or a value:

undef $x;
$x = undef;
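As a minimal sketch of the points above, the following records what defined reports as a scalar moves between states. The variable names are illustrative only; they don't come from any particular program.

```perl
use strict;
use warnings;

my $x;                                   # declared, never assigned: undef
my @states;
push @states, defined($x) ? "defined" : "undefined";

$x = "";                                 # the empty string is a defined value
push @states, defined($x) ? "defined" : "undefined";

$x = 0;                                  # so is zero
push @states, defined($x) ? "defined" : "undefined";

undef $x;                                # back to the undefined state
push @states, defined($x) ? "defined" : "undefined";

print "@states\n";                       # undefined defined defined undefined
```

Because defined only asks "does this have a value?", it never triggers the uninitialized-value warning itself.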

defined and undef are good for testing and setting scalars. Don't try them with arrays, though. Presently, defined(@array) returns true if Perl has allocated storage for the array, something that is weird and not useful to the average programmer. To return an array to its initial state, we say:

 
@array = ();        # good

To say @array = undef is to make @array contain a one-element list, with the single element being the scalar value undef. This is hardly ever what we want.
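The difference is easy to check by counting elements after each assignment. This is only a sketch with made-up variable names:

```perl
use strict;
use warnings;

my @array = (1, 2, 3);

@array = ();                       # empties the array
my $after_empty = @array;          # element count is now 0

@array = undef;                    # a one-element list holding undef
my $after_undef = @array;          # element count is now 1
my $first_is_defined = defined($array[0]) ? 1 : 0;

print "after (): $after_empty, after undef: $after_undef\n";
```

After the second assignment the array is not empty, even though its only element is undefined.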

Back to truth

So how does this fit into truth in Perl? Perl has some simple rules for determining whether something is true or false. One of those rules involves undef. Here are the rules:

1. true/false are scalar concepts
2. undef is false
3. "" is false
4. 0 is false
5. 0.0 is false
6. "0" is false
7. all else is true

Rule 1 is important: only scalars can be tested for true and false. We'll see why this is important after examining the rest of the rules. Rule 2 says that any uninitialized value is false. Because undef behaves as 0 and "", it's easy to see why rules 3 and 4 are there. Rule 5 is a weird one: internally, Perl can store numbers either as integers or as floating point numbers (that is, numbers with a decimal point), and converts between the two as needed. Because 0 (an integer) is false, it would be inconsistent to have the floating point version of 0 (0.0) be anything other than false. Rule 6 is similar to Rule 5: because Perl converts between strings and numbers on demand, the string "0" must be false as well. Not "0.0", though, nor "0.00".
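The false values, and the near misses "0.0" and "0.00", can be checked directly. The sketch below tests each one in a boolean context; the labels and the %verdict hash are only there for illustration.

```perl
use strict;
use warnings;

# Pair a printable label with each value under test.
my @cases = (
    [ 'undef',  undef  ],
    [ '""',     ""     ],
    [ '0',      0      ],
    [ '0.0',    0.0    ],
    [ '"0"',    "0"    ],
    [ '"0.0"',  "0.0"  ],
    [ '"0.00"', "0.00" ],
);

my %verdict;
for my $case (@cases) {
    my ($label, $value) = @$case;
    $verdict{$label} = $value ? "true" : "false";   # boolean test
    printf "%-7s is %s\n", $label, $verdict{$label};
}
```

The first five come out false. The strings "0.0" and "0.00" come out true, because they are neither the empty string nor exactly "0".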

Rule 7 simply says that if it's not one of those five false values, it's true. This means that references are true, positive numbers are true, negative numbers are true (this surprised one of my students, and immediately identified the bug he'd been working for hours to fix), and every string is true with the exception of the empty string and "0". Truth in Perl is really used to test for interesting values: if we've got a variable $name that might hold someone's name, it'll either be true (if we have a valid name) or it'll be false (if it's undefined, an empty string, or a form of 0). Either way, true means we want to work on it and false means we don't:

if ($name) {
    print "Hello, $name!\n";
} else {
    print "You didn't enter your name. What do you have to hide?\n";
}

Similarly, if we've asked for someone's age:

if ($age) {
   $average = $total / $age;
} else {
   print "You didn't give me an age value.\n";
}

The 0 age is almost always wrong. For cases when we do want to permit an age of 0, we have to drop back and test with defined:

if (defined $age) { 
   if ($age) {
       $average = $total / $age; 
   } else { 
       print "Can't average a 0 age.\n"; 
       $average = 0; 
   } 
} else { 
   print "You really didn't give me an age value.\n"; 
}

It's worth emphasizing: Truth and definedness are different. There are four false values that are defined: 0, "0", 0.0, and "". If you're in doubt, you can write a small program to test this:

 
#!/usr/bin/perl -w
$x = 0; 
print "$x is defined\n" if defined $x; 
$x = 0.0; 
print "$x is defined\n" if defined $x; 
# ...
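One way to finish that program is to loop over all four values and test each for both definedness and truth. This is a sketch, not the only way to write it; the counters are illustrative.

```perl
use strict;
use warnings;

# The four defined-but-false values from the text.
my @values = (0, "0", 0.0, "");
my ($defined_count, $true_count) = (0, 0);

for my $x (@values) {
    $defined_count++ if defined $x;
    $true_count++    if $x;
    printf "'%s' is %s and %s\n", $x,
        (defined($x) ? "defined" : "undefined"),
        ($x ? "true" : "false");
}
```

Every line of output should report "defined and false".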

Truth in Context

Every Perl expression is used in a particular context. At its most basic level, context is the answer to the question, "Am I producing a single value, or am I producing many values?" If the expression is meant to produce a single value, it's in scalar context. If the expression is to produce many values, it's in list context. Scalar context is most often seen in assignments to scalar variables, boolean testing, and scalar subscripts:

  
$x = EXPR;      # EXPR in scalar context
if (EXPR)       # EXPR in scalar context
$foo[EXPR]      # EXPR in scalar context

List context is most often seen in assignments to array variables, arguments to an unprototyped function (or a function prototyped to take a list), and slices:

@x = EXPR;      # EXPR in list context
foo(EXPR);      # EXPR in list context
@foo[EXPR]      # EXPR in list context

Context is used by Perl programmers a lot. When we say this, we're forcing @array into scalar context:

  
$count = @array;    # number of elements in @array

Perl has rules for evaluating lists, arrays, and hashes in scalar context. An array in scalar context behaves like the number of things in the array, so $count gets set to the number of elements in @array. Because boolean tests also provide scalar context, we can say:

if (@array) {
    print "There are values in \@array\n";    # \@ keeps the name literal
}

@array in scalar context is 0 (false) if the array is empty, and some non-zero number (true) if the array has values in it. However, it's not just if that provides scalar context. Any boolean operator forces its arguments into scalar context:

if ($must_process_array && @array) {
    # process the array
}
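Here is a small sketch that puts these pieces together: an array's count in scalar context, its truth in a boolean test, and its use on the right of &&. The arrays and the flag value are made up for the example.

```perl
use strict;
use warnings;

my @empty = ();
my @full  = ("a", "b", "c");

my $count = @full;                               # scalar context: element count
my $empty_is_true = @empty ? "true" : "false";   # boolean test is scalar context
my $full_is_true  = @full  ? "true" : "false";

my $must_process_array = 1;                      # hypothetical flag
my $processed = 0;
if ($must_process_array && @full) {              # && also gives scalar context
    $processed = 1;
}

print "count=$count empty=$empty_is_true full=$full_is_true processed=$processed\n";
```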

Applications

Here's the meaty part. Perl programmers use true and false all the time. If you've used Perl for any length of time, you'll have seen something like this:

open(FH, "< $filename") || 
         die "Couldn't open $filename: $!\n";

The open function returns true if it succeeds, false if it doesn't. If it returns false, the || operator evaluates its right-hand side and calls die. This relies on the short-circuit property of Perl's boolean operators: if we're testing logical OR, and the left-hand side of || (or of or) is true, we know that the OR will be true and so we don't have to test the right-hand side. Similarly, AND can return false as soon as the left-hand side is false, because there's no way the AND can ever be true. Perl programmers also use logical OR to supply variables with default values. Here's an excerpt from one of my programs:

$Data_File = $opt_t || "/tmp/web.data";

If $opt_t was given a value (say, from the command-line option parsing module Getopt::Std), that value will be assigned to $Data_File. If $opt_t wasn't given a value (because the user didn't supply a -t command line option) then $Data_File gets set to "/tmp/web.data". This single expression replaces a more verbose if statement:

if ($opt_t) {
   $Data_File = $opt_t;
} else {
   $Data_File = "/tmp/web.data";
}
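The sketch below shows both the short-circuit behavior and one consequence of the truth rules: a defined but false option value such as "0" also falls through to the default. The $default_file helper and the sample path /var/tmp/custom.data are hypothetical, added only to count how often the right-hand side is evaluated.

```perl
use strict;
use warnings;

my $calls = 0;
# Hypothetical helper: counts how often the default is actually computed.
my $default_file = sub { $calls++; return "/tmp/web.data" };

my $opt_t = "/var/tmp/custom.data";                # pretend -t was supplied
my $with_option = $opt_t || $default_file->();     # left side true: helper skipped

undef $opt_t;                                      # pretend -t was omitted
my $without_option = $opt_t || $default_file->();  # left side false: helper runs

$opt_t = "0";                                      # defined, but still false
my $zero_with_or      = $opt_t || $default_file->();               # falls through
my $zero_with_defined = defined($opt_t) ? $opt_t : $default_file->();  # keeps "0"

print "$with_option $without_option $zero_with_or $zero_with_defined ($calls calls)\n";
```

When "0" or the empty string is a legitimate value, testing with defined instead of || keeps it from being replaced.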

Finally, let's end with a more involved example: sorting on two criteria. We already know that the sort function sorts a list of values in ASCII order:

@sorted = sort @unsorted;

If we want to sort by some other means, for instance numerically, we have to give sort a code block (or a subroutine name) that compares two elements and returns -1, 0, or +1 to indicate how they should be sorted. This example uses the Perl operator <=> to sort an array of numbers from smallest to largest:

# sort numerically
@sorted = sort { $a <=> $b } @unsorted;
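To see the difference between the default string sort and a numeric sort, try a list where the two orders disagree. The sample numbers are arbitrary.

```perl
use strict;
use warnings;

my @unsorted = (10, 9, 100, 2);

my @as_strings = sort @unsorted;                 # compared as strings
my @as_numbers = sort { $a <=> $b } @unsorted;   # compared as numbers

print "string order:  @as_strings\n";            # 10 100 2 9
print "numeric order: @as_numbers\n";            # 2 9 10 100
```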

Because 0 is used to tell sort "these two elements sort to the same position", we can use the || operator to connect two sorts:

@sorted = sort { -s $a <=> -s $b || $a cmp $b } @filenames;

Here we're sorting filenames, so the block of code gets called with $a and $b as the two filenames to compare. We first use the -s operator to fetch the size of the files and compare those file sizes numerically. If they're the same size, <=> will return 0, and the || will instead use the value of the right-hand side, a string comparison of the filenames. We can write that more prettily as follows:

@sorted = sort { -s $a <=> -s $b    # compare file sizes first
                        ||
                 $a cmp $b          # then break ties by name
               } @filenames;
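The same two-level pattern can be tried without touching the filesystem. This sketch swaps file size for string length, so it is an analogue of the example above rather than the same sort; the word list is made up.

```perl
use strict;
use warnings;

my @words = ("pear", "fig", "apple", "kiwi", "plum", "date");

# Compare by length first; when <=> returns 0 (false), || falls through
# to the alphabetical comparison.
my @sorted = sort { length($a) <=> length($b)
                        ||
                    $a cmp $b } @words;

print "@sorted\n";    # fig date kiwi pear plum apple
```

The four-letter words tie on length, so the order among them comes entirely from cmp.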

Conclusion

And on that elegant note I leave you. We've seen inside Perl's notions of true and false, and found some common applications. Hopefully now when you see a || or an if (@array) you'll know what's going on. Here are some things I didn't discuss, which you can learn about from the online documentation or O'Reilly's Programming Perl:

how lists and hashes behave in scalar context
the important difference between or and ||
how subroutines can learn their context with wantarray
how context propagates through subroutine calls to the return statement.

__END__

Nathan Torkington has defied common wisdom and quit his day job to write, edit, hack, and teach Perl to unsuspecting C++ programmers. In his spare time he works on the answer to important questions like "where did I leave my shoes", and "what do you mean, my shirt's on backwards?"
