
Operator Overloading in Perl

Hildo Biersma

This article describes operator overloading in Perl. Operator overloading is a language feature that allows user-defined types to act in the same way as built-in types, such as strings and numbers. As such, operator overloading is one of the main strengths of C++ and one of the most glaring omissions in Java.

This article will start by implementing a simple user-defined type, then extend it to act as much like a built-in type as possible. After reading it, you will be able to decide how and when operator overloading should be used and how to implement these features for your own types.

Defining Your Own Types

Perl 5 has always allowed you to add your own data types to the language via object-oriented programming. In this article, we will define our own 'Date' type, which stores date stamps with a resolution of one day. It will support easy formatting in textual, European, and US formats, provide easy comparison between dates, and allow simple arithmetic on dates.

We start by defining a simple Date class, in its own module. The class contains a new method shown below:

use Time::Local;    # provides timelocal()

# Constructor: get day, month, year, return object
sub new {
    my ($class, %args) = @_;    # Argument checking
    $args{'month'} -= 1;        # should be done here
    $args{'year'}  -= 1900;     # and here

    my $ctime = timelocal(0, 0, 0, $args{'day'},
                          $args{'month'}, $args{'year'});
    my $this = { 'ctime' => $ctime };
    return bless $this, $class;
}

The constructor as shown computes a Unix timestamp and stores this inside the object. This is an implementation detail, of course; the object might as well store the day, month and year values instead of the Unix time. A full implementation would also perform error checking and throw an exception if the arguments are invalid or incomplete.
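As a rough illustration of the error checking mentioned above, the sketch below validates the three arguments before building the object. It is not the article's implementation: the specific checks and error messages are assumptions, and it uses timegm() instead of timelocal() so that the result does not depend on the local time zone.

```perl
package Date;

use strict;
use warnings;
use Carp;
use Time::Local;    # provides timegm()

# Constructor with basic argument checking (a sketch, not the article's code)
sub new {
    my ($class, %args) = @_;

    foreach my $field (qw(day month year)) {
        croak "Missing argument '$field'" unless defined $args{$field};
        croak "Argument '$field' must be a positive integer"
            unless $args{$field} =~ /^[1-9]\d*$/;
    }
    # timegm() takes years above 999 literally, so insist on those
    croak "Year must be 1000 or later" unless $args{'year'} >= 1000;

    # timegm() dies on impossible dates such as 31 February;
    # the eval lets us report that with our own message
    my $ctime = eval {
        timegm(0, 0, 0, $args{'day'}, $args{'month'} - 1, $args{'year'});
    };
    croak "Invalid date: $args{'day'}-$args{'month'}-$args{'year'}"
        unless defined $ctime;

    return bless { 'ctime' => $ctime }, $class;
}

package main;

my $ok  = eval { Date->new('day' => 29, 'month' => 2, 'year' => 2000) };
my $bad = eval { Date->new('day' => 31, 'month' => 2, 'year' => 2000) };
print "29-2-2000 accepted\n" if $ok;
print "31-2-2000 rejected: $@" unless $bad;
```

Callers that want to recover from a bad date can wrap the constructor call in eval, as the last lines do.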

The Date class can be used as shown below:

use Date;

my $d1 = new Date('day' => 31,  'month' => 12, 'year' => 1999);
my $d2 = new Date('month' => 2, 'day' => 29,   'year' => 2000);

Adding Methods To The Date Class

Now, to make the class a bit more useful, we add three formatting 
methods that allow us to show the date in a textual format, a US 
format, and a European format:

# Create a nice string for a date, like "Dec 31, 1999"
sub as_string {
    my ($this) = @_;

    my ($dd, $mm, $yy) = (localtime($this->{'ctime'}))[3,4,5];
    $mm = (qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec))[$mm];
    $yy += 1900;
    return "$mm $dd, $yy";
}
    
    
# Return in US format, e.g. 12-31-1999
sub us_fmt {
    my ($this) = @_;
    
    my ($dd, $mm, $yy) = (localtime($this->{'ctime'}))[3,4,5];
    $mm += 1;
    $yy += 1900;
    return "$mm-$dd-$yy";
}
    
    
# Return in European format, e.g. 31-12-1999
sub euro_fmt {
    my ($this) = @_;
    
    my ($dd, $mm, $yy) = (localtime($this->{'ctime'}))[3,4,5];
    $mm += 1;
    $yy += 1900;
    return "$dd-$mm-$yy";
}

These methods can be used as shown below:

use Date;
    
my $d1 = new Date('day' => 31,  'month' => 12, 'year' => 1999);
my $d2 = new Date('month' => 2, 'day' => 29,   'year' => 2000);

foreach my $dateval ($d1, $d2) {
    my $sd = $dateval->as_string();
    my $ed = $dateval->euro_fmt();
    my $ud = $dateval->us_fmt();
    print "Text: $sd; US format: $ud; Euro format: $ed\n";
}

The script generates the following output:

Text: Dec 31, 1999; US format: 12-31-1999; Euro format: 31-12-1999
Text: Feb 29, 2000; US format: 2-29-2000; Euro format: 29-2-2000

A Minor Problem

A minor problem occurs when we display date values without using any of the formatting methods displayed above. If you print $d1, you will get a string that indicates the class, the implementation, and the memory address of your object, like this: Date=HASH(0x80f11fc). That's not too informative.

When you use the date values with other operations, e.g. numerical addition, numerical subtraction, comparison, or sorting, you get unfortunate effects: Perl operates on the reference itself, which behaves like the string "Date=HASH(0x80f11fc)" in string context and like the memory address 0x80f11fc in numeric context.
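The effect is easy to reproduce. The sketch below uses a hypothetical PlainDate class with no overloading at all; the class name and the timestamp are illustrative only, and the address printed will differ from run to run.

```perl
use strict;
use warnings;

package PlainDate;    # hypothetical class without any overloading

sub new { my ($class, $ctime) = @_; return bless { 'ctime' => $ctime }, $class }

package main;

my $d = PlainDate->new(946598400);    # 31 Dec 1999, 00:00 UTC

my $as_string = "$d";      # something like PlainDate=HASH(0x80f11fc)
my $as_number = $d + 0;    # the memory address as a plain integer
my $plus_one  = $d + 1;    # address + 1: a number, no longer a date

print "String: $as_string\n";
print "Number: $as_number\n";
print "Plus 1: $plus_one\n";
```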

Introducing Overloading

Operator overloading circumvents these problems, because it lets you provide your own versions of built-in operations like addition and subtraction. Your new (overloaded) versions are automatically invoked in any expression involving objects of your class.

With our Date class, overloading can be applied to the built-in operators such as conversion-to-text ("stringification"), comparison, addition, and subtraction. All operator overloading features in Perl use the overload module, which is a standard part of Perl 5.004 and later versions.

Let's start by adding overloading for the conversion-to-string operator. Whenever this is called, we do not want to see things like Date=HASH(0x80f11fc); instead, we want to invoke the as_string() method. We start by altering the Date class as follows:

package Date;
    
use overload ('""' => 'as_string');

That's all you need! From now on, printing a date object displays the proper value.

The syntax for the overload module is quite simple: following the use overload, list the operations to be overloaded, followed by their implementation. The implementation can either be a reference to a subroutine (\&as_string), or a string with the name of a method to be called ('as_string').

The difference between supplying a subroutine reference and a method name has to do with inheritance: when you supply a reference, you make sure the overloaded operator calls that exact subroutine, inheritance be damned. When you supply a method name, the overloaded operator will call that name and respect inheritance. In general, use of method names is preferable.
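A small experiment makes the difference visible. The four classes below are hypothetical and exist only for this sketch: one parent overloads stringification by method name, the other by subroutine reference, and each has a child that redefines as_string().

```perl
use strict;
use warnings;

package ByName;         # overloads "" with a method name
use overload '""' => 'as_string';
sub new       { return bless {}, shift }
sub as_string { return 'base text' }

package ByNameChild;
our @ISA = ('ByName');
sub as_string { return 'child text' }    # picked up by the overload

package ByRef;          # overloads "" with a subroutine reference
use overload '""' => \&as_string;
sub new       { return bless {}, shift }
sub as_string { return 'base text' }

package ByRefChild;
our @ISA = ('ByRef');
sub as_string { return 'child text' }    # ignored by the overload

package main;

my $by_name = ByNameChild->new();
my $by_ref  = ByRefChild->new();
print "Method name:    $by_name\n";    # child text
print "Subroutine ref: $by_ref\n";     # base text
```

Calling $by_ref->as_string() directly would still return 'child text'; only the overloaded operator is pinned to the parent's subroutine.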

Overloading More Methods

Now that we've seen how to overload the stringification operator, let's do more. It would be useful if we could add a number of days to a date, e.g. $day + 2, and have that work properly.

We start by adding an add() method to our Date class:

# Add an integer number of days to a date
sub add {
    my ($this, $days) = @_;

    my $retval = { 'ctime' => $this->{'ctime'} };
    $retval->{'ctime'} += $days * 24 * 60 * 60;
    bless $retval, ref($this);

    return $retval;
}

The add() method must take care to build a new object, which it modifies and returns. Users wouldn't be happy if $b = $a + 1 modified $a! Also, the object is built using the two-argument form of bless, making this code safe for inheritance. (When called with a derived object, we create a new object of the exact same class.) Combined with the appropriate change to the use overload instruction, this allows you to run the program below:

use Date;
    
my $d1 = new Date('day' => 31, 'month' => 12, 'year' => 1999);
my $d2 = $d1 + 1;
my $d3 = 100 + $d1;
print "$d1 $d2 $d3";

The output is shown here:

	Dec 31, 1999 Jan 1, 2000 Apr 9, 2000

But wait! We have $d1 + 1 as well as 100 + $d1 - how does that work?

Overloading And Commutativity

Whether you write $object + number or number + $object, Perl calls the same method, with the object as its first argument; obviously, a plain number cannot be asked to add a date value to itself and return a new date. This works fine for commutative operators such as + and *, where the order of the operands doesn't matter. For non-commutative operators like -, it is not good enough, so Perl passes a third parameter to the method being called. This parameter is false if the arguments are in their original order, but true if they have been swapped, as in the case of 100 + $date.
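To see the third parameter in action, consider the toy class below. Days is a hypothetical class invented for this sketch (it is not part of the Date module) and it assumes the other operand is always a plain number.

```perl
use strict;
use warnings;

package Days;    # hypothetical toy class: just wraps a number of days

use overload '+' => 'add', '-' => 'subtract', '""' => 'as_string';

sub new       { my ($class, $n) = @_; return bless { 'n' => $n }, $class }
sub as_string { my ($this) = @_; return "$this->{'n'} days" }

# Addition is commutative, so the swapped flag can be ignored
sub add {
    my ($this, $other, $swapped) = @_;
    return ref($this)->new($this->{'n'} + $other);
}

# Subtraction is not: the object is always the first argument,
# and $swapped says whether it was really the right-hand operand
sub subtract {
    my ($this, $other, $swapped) = @_;
    my $result = $swapped ? $other - $this->{'n'} : $this->{'n'} - $other;
    return ref($this)->new($result);
}

package main;

my $d = Days->new(10);
my @results = ($d + 3, 3 + $d, $d - 3, 3 - $d);
print join(', ', @results), "\n";    # 13 days, 13 days, 7 days, -7 days
```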

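All this allows us to build a proper implementation of the subtraction operator. We want to support subtracting a number of days from a date ($date - 3, giving a new date) and subtracting one date from another ($date1 - $date2, giving the number of days between them), while rejecting number - $date, which has no sensible meaning.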

The code below shows how to do this. Note that we've moved the cloning of a date into its own method, copy(), which is also invoked from add(). This is just for convenience. An alternative design strategy would be to require the Date class and all derived classes to support a less messy clone() method.

# Add an integer number of days to a date
sub add {
    my ($this, $days) = @_;

    my $retval = $this->copy();
    $retval->{'ctime'} += $days * 24 * 60 * 60;

    return $retval;
}

use Carp;    # provides confess()

# Subtract a number of days from a date or subtract two dates
sub subtract {
    my ($first, $second, $reverse) = @_;

    if (ref($second)) {    # Second parameter is a reference
        if (UNIVERSAL::isa($second, 'Date')) {
            my $val = $first->{'ctime'} - $second->{'ctime'};
            $val /= 24 * 60 * 60;
            return $val;
        }
        confess "Cannot subtract non-date [$second] from [$first]";
    } else {               # Second parameter not a reference
        if ($reverse) {
            confess "Cannot call [[$second - $first]";
        }
        my $retval = $first->copy();
        $retval->{'ctime'} -= $second * 24 * 60 * 60;
        return $retval;
    }
}


# Copy constructor
sub copy {
    my ($this) = @_;

    return bless { %$this }, ref($this);
}

The subtract() method shown above uses the UNIVERSAL class to determine whether the second object is a Date object or an object of a derived class, before accessing the ctime field inside the object. This code is careful to check the type (is this really a Date object?), while still allowing other classes to be derived from Date. Alternatively, you could support "interfaces" (in the Java design style) and assume a ctime() method is supported by the second object. It all depends on your programming and design style.
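The interface-style alternative might look like the sketch below. The ctime() accessor, the simplified constructor that takes a raw timestamp, and the unrelated Timestamp class are all assumptions made for illustration; the sketch only covers the date-minus-date case.

```perl
use strict;
use warnings;

package Date;

use Carp;
use Scalar::Util qw(blessed);
use overload '-' => 'subtract';

# Simplified constructor for this sketch: takes a Unix timestamp directly
sub new   { my ($class, $ctime) = @_; return bless { 'ctime' => $ctime }, $class }

# Accessor that forms the "interface"
sub ctime { my ($this) = @_; return $this->{'ctime'} }

# Subtract any object that offers a ctime() method; returns a number of days
sub subtract {
    my ($first, $second, $reverse) = @_;

    unless (blessed($second) && $second->can('ctime')) {
        confess "Can only subtract objects that provide ctime()";
    }
    return ($first->ctime() - $second->ctime()) / (24 * 60 * 60);
}

package Timestamp;    # hypothetical class, unrelated to Date

sub new   { my ($class, $ctime) = @_; return bless { 'ctime' => $ctime }, $class }
sub ctime { my ($this) = @_; return $this->{'ctime'} }

package main;

my $date  = Date->new(946684800);         # 1 Jan 2000, 00:00 UTC
my $stamp = Timestamp->new(946598400);    # 31 Dec 1999, 00:00 UTC
my $days  = $date - $stamp;
print "Difference: $days day(s)\n";       # Difference: 1 day(s)
```

The trade-off is the usual one: the isa() check rejects anything outside the Date hierarchy, while the can() check accepts any class that happens to offer a ctime() method.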

Let's use these methods with the following code:

use Date;

my $d1 = new Date('day' => 1,   'month' => 1,  'year' => 1999);
my $d2 = new Date('day' => 31,  'month' => 12, 'year' => 1999);
my $d3 = $d1 - 1;
my $days = $d1 - $d2;
print "$d1 / $d2 / $d3 / $days\n";

# The next one dies, so let's see...
eval { 100 - $d1 };
print $@;

We see the following output:

Jan 1, 1999 / Dec 31, 1999 / Dec 31, 1998 / -364
Cannot call [[100 - Jan 1, 1999] at Date.pm line 80
        Date::subtract('Jan 1, 1999', 100, 1) called at date1.pl line 10
        eval {...} called at date1.pl line 10

Full overloading implementations

A fully overloaded user-defined data type needs far more than this. Besides mere string conversion and simple arithmetic, you want to be able to compare two date objects using ==, >, or gt, sort objects using Perl's built-in sort function, use more complex operators such as += and ++, and cope with calls to undefined methods.

Please refer to the overload documentation bundled with Perl to see which operators can be overloaded: more than fifty are supported. In many cases, this is so much work it's not worth the effort. Of course, Perl can help with this as well.

Automatically generating overloaded methods

Suppose that you do not change the Date class above, but invoke:

use Date;

my $d1 = new Date('day' => 1,  'month' => 1, 'year' => 1999);
$d1++;
$d1 += 5;
print "Date: $d1\n";

Possibly to your surprise, this will work properly. The reason is quite simple: Perl is able to build its own implementation of the ++ and += operators using the + operator that has been defined. As you can guess, the efficiency of these generated operators is slightly less than that of hand-crafted methods that just modify an existing object, but hey, you get them for free.
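One consequence of this autogeneration is worth seeing: because ++ and += are built from +, they create a new object rather than modifying the old one. The Counter class below is a hypothetical stand-in for Date, used only to keep the sketch short.

```perl
use strict;
use warnings;

package Counter;    # hypothetical class that overloads only + and ""

use overload '+' => 'add', '""' => 'as_string';

sub new       { my ($class, $n) = @_; return bless { 'n' => $n }, $class }
sub as_string { my ($this) = @_; return "$this->{'n'}" }

# Returns a new object; the original is left untouched
sub add {
    my ($this, $n) = @_;
    return ref($this)->new($this->{'n'} + $n);
}

package main;

my $c   = Counter->new(1);
my $old = $c;    # second reference to the same object

$c++;            # autogenerated: behaves like $c = $c + 1
$c += 5;         # autogenerated: behaves like $c = $c + 5

print "$old -> $c\n";    # 1 -> 7
```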

In a similar fashion, Perl autogenerates all comparison operators if you provide cmp and <=>. It creates unary minus (negation) from the subtraction operator, and supports concatenation using string conversion. This magic drastically cuts down the number of operators you need to write.

The code below defines a single compare() method, used for string and numerical comparisons. Once this has been defined, all Perl comparison operators will work properly.

package Date;
    
use overload ('cmp' => 'compare',
             '<=>' => 'compare',
             '""'  => 'as_string',
             '-'   => 'subtract',
             '+'   => 'add');
    
# Compare two values by comparing their ctimes
sub compare {
    my ($first, $second) = @_;
    
    unless (UNIVERSAL::isa($second, 'Date')) {
      confess "Can only compare two Date objects, not $second";
    }
    return ($first->{'ctime'} <=> $second->{'ctime'});
}
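A short usage sketch shows what the single compare() method buys you. It bundles a cut-down Date class so that it runs on its own; as an assumption for reproducibility it uses timegm() and gmtime() rather than the local-time functions, and it omits the type check from compare().

```perl
use strict;
use warnings;

package Date;

use Time::Local;    # provides timegm()
use overload 'cmp' => 'compare', '<=>' => 'compare', '""' => 'as_string';

sub new {
    my ($class, %args) = @_;
    my $ctime = timegm(0, 0, 0, $args{'day'}, $args{'month'} - 1, $args{'year'});
    return bless { 'ctime' => $ctime }, $class;
}

sub as_string {
    my ($this) = @_;
    my ($dd, $mm, $yy) = (gmtime($this->{'ctime'}))[3,4,5];
    $mm = (qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec))[$mm];
    return sprintf("%s %d, %d", $mm, $dd, $yy + 1900);
}

sub compare {
    my ($first, $second) = @_;
    return $first->{'ctime'} <=> $second->{'ctime'};
}

package main;

my $y2k  = Date->new('day' => 1,  'month' => 1,  'year' => 2000);
my $eve  = Date->new('day' => 31, 'month' => 12, 'year' => 1999);
my $leap = Date->new('day' => 29, 'month' => 2,  'year' => 2000);

my @sorted = sort { $a <=> $b } ($y2k, $leap, $eve);
print "Sorted: ", join(' / ', @sorted), "\n";

print "$eve is earlier than $y2k\n" if $eve < $y2k;      # built from <=>
print "$leap sorts after $y2k\n"    if $leap gt $y2k;    # built from cmp
```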

The Fallback Mechanism

A question remains: what will Perl do if you try to use an operator you haven't defined? Normally, Perl will complain and throw a fatal exception. Witness the output produced by trying to use the exponentiation operator on a Date:

Operation '**': no method found,
    left argument in overloaded package Date,
    right argument has no overloaded magic at date2.pl line 4.

As you can see, Perl dies. If you define a useful numerical conversion operator (called 0+) for your class, all normal numerical operators could act on that converted numerical value. (Obviously, no such useful conversion exists for the Date class.)

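It turns out Perl can do just that - provided that you allow it to. This is done through a fallback mechanism that allows Perl to use conversions, followed by the normal implementation of all operators. The fallback mechanism is enabled at the use overload line, and can be set to one of three values. When fallback is undefined (the default), Perl tries to autogenerate a missing operator and dies if it cannot. When fallback is defined but false (e.g. 0), Perl does not autogenerate anything and dies whenever an operator has not been overloaded. When fallback is true (e.g. 1), Perl tries autogeneration first and, if that fails, silently falls back to its built-in behavior, applying the conversion operators to the object.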

As a silly example, the code below alters the Date class to turn on fallback and use a numerical conversion that generates the weekday number. Now, ** will exponentiate using the weekday number.

package Date;
    
use overload ('fallback' => 1,
              '0+'       => 'to_number',
              'cmp'      => 'compare',
              '<=>'      => 'compare',
              '""'       => 'as_string',
              '-'        => 'subtract',
              '+'        => 'add');

# Silly numerical conversion operator
sub to_number {
    my ($this) = @_;
    
    return (localtime($this->{'ctime'}))[6];
}
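The following self-contained sketch shows the fallback at work. It keeps only the fallback and 0+ entries of the overload list, and as an assumption it uses timegm() and gmtime() so that the weekday does not depend on the local time zone.

```perl
use strict;
use warnings;

package Date;

use Time::Local;    # provides timegm()
use overload 'fallback' => 1, '0+' => 'to_number';

sub new {
    my ($class, %args) = @_;
    my $ctime = timegm(0, 0, 0, $args{'day'}, $args{'month'} - 1, $args{'year'});
    return bless { 'ctime' => $ctime }, $class;
}

# Silly numerical conversion: weekday number, 0 = Sunday .. 6 = Saturday
sub to_number {
    my ($this) = @_;
    return (gmtime($this->{'ctime'}))[6];
}

package main;

my $d = Date->new('day' => 31, 'month' => 12, 'year' => 1999);    # a Friday

my $weekday = $d + 0;     # no '+' method here, so Perl converts via 0+
my $squared = $d ** 2;    # likewise for **

print "Weekday number: $weekday\n";    # Weekday number: 5
print "Squared: $squared\n";           # Squared: 25
```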

Overloading And Inheritance

Overloading can be combined at will with inheritance. Any subclass automatically inherits all methods from the parent class and can then go on to override or add any operator at will. As an example of this, we'll define a EuroDate class that behaves exactly the same way as the normal Date class, except that the string representation is now in European format, not text format.

package EuroDate;
    
use strict;
use Date;
use vars qw(@ISA);
    
@ISA = qw(Date);
    
use overload ('""'  => 'euro_fmt');
    
1;

Everything behaves exactly as you expect.
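As a quick check, the sketch below puts a cut-down Date class and the EuroDate subclass in one file. The single-file layout, and the use of timegm() and gmtime() in place of the local-time functions, are assumptions made to keep the example self-contained.

```perl
use strict;
use warnings;

package Date;

use Time::Local;    # provides timegm()
use overload '""' => 'as_string', '+' => 'add';

sub new {
    my ($class, %args) = @_;
    my $ctime = timegm(0, 0, 0, $args{'day'}, $args{'month'} - 1, $args{'year'});
    return bless { 'ctime' => $ctime }, $class;
}

sub as_string {
    my ($this) = @_;
    my ($dd, $mm, $yy) = (gmtime($this->{'ctime'}))[3,4,5];
    $mm = (qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec))[$mm];
    return sprintf("%s %d, %d", $mm, $dd, $yy + 1900);
}

sub euro_fmt {
    my ($this) = @_;
    my ($dd, $mm, $yy) = (gmtime($this->{'ctime'}))[3,4,5];
    return sprintf("%d-%d-%d", $dd, $mm + 1, $yy + 1900);
}

sub add {
    my ($this, $days) = @_;
    return bless { 'ctime' => $this->{'ctime'} + $days * 24 * 60 * 60 }, ref($this);
}

package EuroDate;

our @ISA = ('Date');
use overload '""' => 'euro_fmt';    # only stringification is replaced

package main;

my $d1 = EuroDate->new('day' => 31, 'month' => 12, 'year' => 1999);
my $d2 = $d1 + 1;       # inherited '+' still returns a EuroDate
print "$d1 / $d2\n";    # 31-12-1999 / 1-1-2000
```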
In the case of inheritance, there are a few traps that you should avoid:

Limitations of operator overloading

However simple and elegant the Perl operator overloading mechanism may be, it has some limitations that C++ doesn't have. The most important of these are:

Conclusion

This article has shown you how to use operator overloading for your own datatypes. As you've seen, simple things can be done very easily, and complex behavior can be created when you need to. The Perl overloading mechanism is general, yet flexible enough to be applied to your own classes as well.

So, why are you still debugging using Data::Dumper and suffering through output of the form MyType=HASH(0xDEADBEEF)? Why are you still invoking compare($first, $second) when implementing both $first == $second and $first eq $second would be trivial?



Hildo Biersma (hpp@elvenkind.com) is heavily into Perl, C++, and objects. He uses Perl for data conversion and large-scale Web applications.
