
Version Control with makepatch

Johan Vromans

"When you have two copies of a piece of information, at least one of them is wrong."

This theorem is often used in information technology to emphasize that you should avoid copying information, because when you do, you have to spend effort keeping all the copies up to date.

However, copying often cannot be avoided; programs, data files, documents, and web pages are frequently copied all over the world. This article describes a technique that helps keep copies of sets of documents consistent and up to date. Although the technique is most widely used in software development, it is applicable to virtually any type of data.

diff and patch

When people collaborate on a program, how can they keep all the copies consistent, so that the changes from different people don't conflict? One solution is to ship the latest version to everyone after every change, but that's not feasible if the program is large. Another solution is to publish changes as individual update files in the format generated by the Unix diff program (also available for Win32; see Listing 1: How diff and patch Work, and diff and patch for Win32 and the Mac). With diff and patch, you can apply someone else's changes to your own program (or document, or data file, or web page).
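To make the idea of an update file concrete, here is a minimal sketch that uses Python's standard difflib module rather than the diff program itself. The file names and contents are invented for illustration; the result is in the unified format that diff -u also produces.

```python
import difflib

# Two versions of a small file; names and contents are made up for this example.
old_lines = ["print 'Hello';\n", "print 'World';\n"]
new_lines = ["print 'Hello';\n", "print 'Perl';\n"]

# unified_diff yields the lines of an update file describing old -> new.
update = list(difflib.unified_diff(
    old_lines, new_lines,
    fromfile="pkg-1.6/hello.pl", tofile="pkg-1.7/hello.pl",
))

print("".join(update), end="")
```

Lines starting with - exist only in the old version and lines starting with + only in the new one; a program such as patch reads exactly this information to update the recipient's copy.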

The patch program, which automates the process of integrating the update files into a set of source files, was written by none other than Larry Wall, author of Perl.

patch solved only part of the problem, however. Synchronizing two versions of a source tree (or web site) requires more than just changing individual source files. Sometimes new files need to be created, and obsolete files need to be removed. diff and patch don't do these things well.

In the rest of this article, I'll assume that we're talking about a program with multiple files of source code, although the techniques apply to any collection of files.

The Problem

To properly update a source tree, we need to worry about a few things:

The makepatch Package

The makepatch package performs all of the tasks that diff and patch don't. It contains two Perl programs: makepatch and applypatch. makepatch builds a patch kit that can be applied reliably; applypatch integrates the patch kit on the receiving end.

This article describes version 2.00a of the makepatch package.

Generating The Patch Kit

makepatch generates a patch kit from two source trees: the original, and the new tree. Here's how it does that:

The generated patch kit is valid input for the patch program, making use of patch's feature of ignoring everything it does not understand.

As a special service, makepatch prepends a small shell script to the patch kit that, when fed to a standard Bourne shell, creates the necessary directories and files and removes obsolete ones. Of course, this requires that the receiving platform supports both the shell and Unix filename conventions, so the shell script is pretty much useful only for Unix. These limitations can be overcome by using the applypatch utility instead.

Applying The Patch Kit

applypatch takes care of everything that patch doesn't:

To allow applypatch to do its job, makepatch appends additional information (like checksums) to the patch kit.
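The value of a checksum on the receiving end can be sketched as follows. This is an illustration in Python, not applypatch's code: the choice of SHA-256 and the function names are assumptions made for the example, and the article does not say which checksum algorithm makepatch records.

```python
import hashlib

def checksum(data: bytes) -> str:
    # SHA-256 is an assumption for this sketch, not necessarily what makepatch uses.
    return hashlib.sha256(data).hexdigest()

def safe_to_patch(current: bytes, expected_checksum: str) -> bool:
    """True if the local file matches the original the patch was made from."""
    return checksum(current) == expected_checksum

original = b"print 'Hello';\n"
recorded = checksum(original)   # what the sender would record in the kit

print(safe_to_patch(original, recorded))               # unmodified local copy
print(safe_to_patch(b"print 'Changed';\n", recorded))  # locally modified copy
```

Checking every file this way before touching anything lets the receiving side refuse a kit that was built against a different version of the sources, instead of leaving the tree half updated.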

applypatch requires only Perl and patch; no other operating system support is necessary. This makes it possible to apply patches on any operating system that supports these two programs.

General Usage

Suppose you have an archive pkg-1.6.tar.gz, containing the sources for the pkg package version 1.6. You also have a directory tree pkg-1.7 containing the sources for version 1.7. The following command generates a patch kit that updates the 1.6 sources into their 1.7 versions:

    makepatch pkg-1.6.tar.gz pkg-1.7 > pkg-1.6-1.7.patch

By default, makepatch provides a few lines of progress information:

    Extracting pkg-1.6.tar.gz to /tmp/mp21575.d/old ...
    Manifest MANIFEST for pkg-1.6 contains 1083 files.
    Manifest MANIFEST for pkg-1.7 contains 1292 files.
    Processing the filelists ...
    Collecting patches ...
      266 files need to be patched.
      216 files and 8 directories need to be created.
      7 files need to be removed.
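The last three lines of that output correspond to three categories of files: those present in both trees but changed, those present only in the new tree, and those present only in the old tree. A minimal Python sketch of such a classification is shown here. It is not makepatch's actual code, and it represents each tree as a hypothetical mapping from path to contents instead of reading files from disk.

```python
def classify(old_files, new_files):
    """Split two {path: content} mappings into the three update categories."""
    old_paths, new_paths = set(old_files), set(new_files)
    to_create = sorted(new_paths - old_paths)   # only in the new tree
    to_remove = sorted(old_paths - new_paths)   # only in the old tree
    # In both trees, but with different contents.
    to_patch = sorted(p for p in old_paths & new_paths
                      if old_files[p] != new_files[p])
    return to_patch, to_create, to_remove

# Invented example trees.
old_tree = {"README": "v1.6", "lib/a.pl": "sub a {}", "lib/old.pl": "obsolete"}
new_tree = {"README": "v1.7", "lib/a.pl": "sub a {}", "lib/new.pl": "sub n {}"}

to_patch, to_create, to_remove = classify(old_tree, new_tree)
print("patch: ", to_patch)
print("create:", to_create)
print("remove:", to_remove)
```

Files that are identical in both trees, such as lib/a.pl here, fall into none of the categories and need no entry in the patch kit.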

To apply the generated patch kit, go to the directory containing the 1.6 sources and feed the kit to applypatch:

    cd old/pkg-1.6
    applypatch pkg-1.6-1.7.patch 

applypatch verifies that it is executing in the right place and makes all necessary updates. By default, it provides no feedback.

Over the last couple of years, makepatch has been used extensively by several developers and teams all over the Internet, including the Perl 5.6 development team. The program has evolved from a simple wrapper around the diff program to a tool that provides a lot of interesting features for everyone involved in maintaining source documents. I'll mention just a few of these.

Fetching Source Files From Archives

The set of sources makepatch operates on need not be explicitly present on disk. makepatch can process files that are archived in any of several popular archive formats (.tar, .tar.gz, .tgz, .tar.bz2 and .zip). Other archive formats can be easily added without changing the program.
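One way to support several archive formats, and to add new ones without touching the program logic, is a table that maps file name suffixes to handlers. The Python sketch below illustrates that design only; the table, the handler labels, and the function name are hypothetical and do not describe how makepatch is configured.

```python
# Suffix -> handler label. The labels are hypothetical names for this sketch.
ARCHIVE_HANDLERS = {
    ".tar.gz": "gzip+tar",
    ".tgz": "gzip+tar",
    ".tar.bz2": "bzip2+tar",
    ".tar": "tar",
    ".zip": "unzip",
}

def archive_handler(name):
    """Return the handler label for an archive name, or None if no suffix matches."""
    # Try longer suffixes first so a longer suffix always wins over a shorter one.
    for suffix in sorted(ARCHIVE_HANDLERS, key=len, reverse=True):
        if name.endswith(suffix):
            return ARCHIVE_HANDLERS[suffix]
    return None

print(archive_handler("pkg-1.6.tar.gz"))
print(archive_handler("pkg-1.7"))

# Supporting another format is a data change, not a code change.
ARCHIVE_HANDLERS[".tar.xz"] = "xz+tar"
print(archive_handler("pkg-1.8.tar.xz"))
```

A name that matches no suffix, like the directory pkg-1.7, yields None and would simply be treated as a tree that is already on disk.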

Selecting The Source Files

The list of files constituting the source tree can be specified in a MANIFEST file, but it can also be generated on the fly by recursively traversing the source tree. File names can be excluded using shell-style wildcards and Perl regular expression patterns. There are predefined patterns to exclude the files generated by revision control systems, and they can be activated with a single command line option.
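The two kinds of exclusion pattern can be sketched in Python with the standard fnmatch and re modules. The function name, the file list, and the patterns are assumptions chosen for the example; in particular, the patterns are not makepatch's predefined ones, and Python's regular expressions stand in for Perl's.

```python
import fnmatch
import re

def select_files(paths, wildcards=(), regexes=()):
    """Return the paths that match none of the exclusion patterns."""
    compiled = [re.compile(r) for r in regexes]
    kept = []
    for path in paths:
        # Shell-style wildcard: the whole path must match the pattern.
        if any(fnmatch.fnmatchcase(path, w) for w in wildcards):
            continue
        # Regular expression: a match anywhere in the path excludes it.
        if any(rx.search(path) for rx in compiled):
            continue
        kept.append(path)
    return kept

# Invented file list and patterns.
files = ["Makefile", "lib/a.pl", "lib/a.pl~", "RCS/a.pl,v", "core"]
kept = select_files(files, wildcards=["*~", "core"], regexes=[r"(^|/)RCS/"])
print(kept)
```

Wildcards are convenient for simple cases such as editor backup files, while regular expressions can express conditions on directory components that a wildcard cannot.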

A Word About Manifest Files

A manifest file lists the files comprising a package. Manifest files are traditionally called MANIFEST and reside in the top level directory of the package. Although there is no formal standard for the contents of manifest files, makepatch uses the following rules:

makepatch Options

makepatch accepts lots of options. Full detail is available in the documentation provided with the package, but here are brief descriptions:

These options needn't be specified on the command line. makepatch looks for options in the following order:

In all option files, empty lines and lines starting with ; or # are ignored. All other lines are considered to contain options exactly as if they had been supplied on the command line.
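The rule for option files is simple enough to sketch in a few lines of Python. This is an illustration under stated assumptions, not makepatch's parser: the function name is hypothetical, the option names in the sample are placeholders rather than real makepatch options, and shlex.split is used as an approximation of how a shell would split a command line.

```python
import shlex

def read_option_lines(text):
    """Turn the contents of an option file into a flat list of arguments."""
    args = []
    for line in text.splitlines():
        # Skip empty lines and lines starting with ';' or '#'.
        if not line.strip() or line.startswith((";", "#")):
            continue
        # Split the rest roughly the way a shell splits a command line.
        args.extend(shlex.split(line))
    return args

sample = """\
# hypothetical option file
; the option names below are placeholders, not real makepatch options
-first-option

-second-option "a value with spaces"
"""
options = read_option_lines(sample)
print(options)
```

The resulting list can then be handed to the same option parser that processes the real command line, so options behave identically wherever they come from.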

For an extensive list of the possible options, see the makepatch documentation.

Current status and future directions

The current version of the makepatch package is 2.00a, found at authors/id/JV/makepatch-2.00a.tar.gz on the CPAN. It requires Perl 5 and suitable versions of the diff and patch programs.

The next version of applypatch will apply its own patches, eliminating the need for the patch program. Also, a future version of makepatch might be able to generate the patch information itself, eliminating the need for the diff program on the source platform. This will be especially interesting for users on platforms like Windows, where these programs are not available by default.

__END__


Johan Vromans (jvromans@squirrel.nl) has been engaged in software engineering since 1975. He has been a Perl user since version 2 and participated actively in its development. Besides writing makepatch, he also wrote Getopt::Long, the Perl5 Pocket Reference, and co-authored The Webmaster's Handbook. He offers Perl consulting and courses with the Squirrel Consultancy (http://www.squirrel.nl).
