
The Perl Machine

Ray F. Piodasoll

One year ago, "Perl and Nuclear Weapons Don't Mix" appeared in TPJ. In that article, I discussed some of my work at NORAD, where my buggy Perl programs were burnt into the ROMs of nuclear guidance systems. As I mentioned toward the end of the article, I was laid off as a result. After a brief series of consulting gigs, I've now found my dream job, one that combines my love of computer architecture with my affinity for Perl.

You may have heard of JavaOS, an attempt to create a compact operating system that runs Java directly on microprocessors. The Perl Machine is an even more ambitious project - an entire computer designed from the ground up to run Perl. Like the dedicated Lisp Machines of yesteryear, the Perl Machine is built by people who love their language and want to see its utility and spirit replicated at every level of computation, from hardware to firmware to software - and even in the casing.

Physical Appearance

The Perl Machine workstation has a translucent faux-pearl spherical shell, two feet in diameter. The bottom edge is flattened so it won't roll away. The translucency allows you to inspect the guts of the computer even when it's running. For users with typing-aggravated injuries, the Perl Machine comes with three foot pedals, for $, @, and %.

The Perl Machine notebook will be an 'oyster shell' that opens up to a 12.1 inch active-matrix display and full size keyboard.

Architecture

The main CPU is a Digital 21264 Alpha chip, clocking in at 1000 MHz and boasting 64K data and instruction caches. The CPU performs instruction pipelining of OPs (see Chip Salzenberg's article: Guts: Basic Anatomy), which are prefetched for an extra performance boost.

The Perl Machine has custom hardware for regular expressions. The OysterShucker, a coprocessor, uses FPGAs (field-programmable gate arrays) to build fast regular expression DFAs (see Mark-Jason Dominus' article on page 46) on the fly. Compiling a typical regex into the FPGA takes about 0.1 seconds, after which you can perform typical regex matches in a single clock cycle. On the 21264, that's 900 million matches in the first second and one billion every second thereafter. A nice optimization for /o.
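On a conventional Perl, the closest software analogue is to compile a pattern once with qr// and reuse it. The sketch below shows that, along with the arithmetic behind the 900-million figure. The sub names and the one-match-per-cycle model are assumptions made for illustration; they are not part of the OysterShucker's interface.

```perl
use strict;
use warnings;

# Compile the pattern once; every later match reuses the compiled form.
my $perl_word = qr/\bperl\b/i;

# Count how many of the given lines contain the word "perl".
sub count_matches {
    my @lines = @_;
    return scalar grep { $_ =~ $perl_word } @lines;
}

# Matches finished after $elapsed_ms milliseconds, assuming a 1000 MHz
# clock, one match per cycle, and 100 ms spent loading the FPGA first.
sub matches_completed {
    my ($elapsed_ms) = @_;
    my $matches_per_ms = 1_000_000;   # 1000 MHz = one million cycles per ms
    my $compile_ms     = 100;         # the 0.1 second compile
    return 0 if $elapsed_ms <= $compile_ms;
    return $matches_per_ms * ($elapsed_ms - $compile_ms);
}

print matches_completed(1000), "\n";   # matches done in the first second
```

Working in whole milliseconds keeps the arithmetic in integers, so the first-second total is exact rather than subject to floating-point rounding.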

An ASIC called the Spinneret encodes the CGI module, enabling commonly called methods like start_form() and textfield() to be processed separately from the main instruction stream.

As we'll no doubt be learning from future installments of Chip's column, the basic datatype in the guts of Perl is the SV, the C structure containing a scalar value. When a conventional computer reads an SV, it has to read a few bytes at a time. Not so on the Perl Machine. SVs, AVs (array values), and HVs (hash values) are each stored in a different kind of memory. Special machine instructions make it possible to retrieve only the most frequently-accessed fields.

Firmware

On conventional computers, Perl has a large 'footprint' - the compiler is often more than a megabyte, which can take a large fraction of a second to load into memory. On the Perl Machine, the Perl compiler never has to be loaded, because it's in firmware. The firmware compiler also automatically inlines subroutines for extra performance.

Upgrading to a more recent version of Perl is simply a matter of replacing one chip with another - there's no lengthy configuration ordeal. Every upgrade dumps the previous Perl version onto disk, so if a change to Perl breaks your program you can always fall back to older versions of Perl. The Perl Machine ships with every major version of Perl since Perl 1.0, and every minor version since Perl 5.0.

Reference counting is implemented in microcode, as is hashing, which uses a lookup table to hash four bytes in a single instruction cycle. Since many Perl programs spend lots of time storing and fetching data from hashes, this is a big win.
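For readers who haven't watched reference counting at work, here is a minimal sketch on a conventional Perl, using the core B module to peek at an SV's count. The refcount() helper is a name invented for this example, and the absolute numbers may vary between Perl builds, so the sketch only compares them.

```perl
use strict;
use warnings;
use B ();

# Return the interpreter's reference count for whatever $_[0] points at.
# Using $_[0] directly avoids copying the reference, which would itself
# add one to the count being measured.
sub refcount {
    return B::svref_2object($_[0])->REFCNT;
}

my $data  = [1, 2, 3];          # one reference to the anonymous array
my $first = refcount($data);

my $alias  = $data;             # a second reference to the same array
my $second = refcount($data);

undef $alias;                   # drop the second reference again
my $third = refcount($data);

print "count rose by ", $second - $first, ", then fell by ", $second - $third, "\n";
```

Every copy and every undef adjusts a counter like this one, which is why moving the bookkeeping out of software is attractive.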

The Perl Machine's kernel is called the Nucleus - that's the name for an irritant in an oyster that isn't yet big enough to be called a pearl. The Nucleus has system calls for processing HTML, XML, Usenet news, and Internet mail.

Single-namespace mode. Just as you can boot Unix workstations in single-user mode, you can boot the Perl Machine in single-namespace mode, which ignores all package declarations. That enables some optimizations that I don't have space to describe here; as an example, hashing in this mode uses the Perl Machine's 128-bit-wide addressing to directly map keys to their hashed values. As long as your hash keys are less than sixteen characters, hashing is instantaneous.
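The idea of using the key itself as the address can be sketched in ordinary Perl. This is only an illustration: key_to_address() is a hypothetical name, the keys are assumed to be plain byte strings, and a real direct-mapped scheme would also have to distinguish keys that differ only in trailing NUL bytes.

```perl
use strict;
use warnings;

# Turn a short key into a 128-bit "address", shown as 32 hex digits.
# Keys must be byte strings of fewer than sixteen characters.
sub key_to_address {
    my ($key) = @_;
    die "key too long for direct mapping\n" if length($key) >= 16;
    # Pad the key with NUL bytes to exactly 16 bytes, then show it as hex.
    return unpack("H32", pack("a16", $key));
}

print key_to_address("perl"), "\n";
```

Because the address is just the padded key, no hash function runs at all, which is the sense in which the lookup is "instantaneous".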

Compiled Code vs. Cultured Code

We're used to thinking of compilation as a discrete stage: either our code is compiled or it's not. The Perl Machine lets you obscure the boundary between compiled and non-compiled code by introducing an alternative to compiling called culturing.

The Perl Machine lets regular Perl scripts call directly into compiled Perl bytecode, and vice versa. When you culture your Perl code, you begin with a little kernel of compiled code - perhaps a common module - and you extend it with layers of not-yet-compiled code around it. When it works, you "harden" the layer of code you've just added, compiling it down into the kernel.

Economics

Both the Perl Machine workstation and the notebook are expected to retail for $3999. To make this extremely low price feasible, The Perl Factory copped a sales technique from Gillette and Nintendo: sell the main unit (razors, game systems) at a loss, and make the money back on peripherals (razor blades, cartridges). The Perl Factory will charge module developers a licensing fee.

The CPAN will still fulfill the role it does today: thousands of modules and scripts will continue to be freely available. The Perl Machine isn't about to stop that. But you have to wonder about folks like Lincoln Stein, whose CGI module has collectively saved the world millions of hours. He probably wishes he had a tiny per-site, or even per-use, royalty. Likewise, some of the people using CGI.pm on heavily-trafficked Web sites probably wish that it could run even faster - and would be willing to pay for it, mod_perl (see Stately Scripting with mod_perl) notwithstanding.

The Perl Machine has the answer: a physical slot into which you can plug cartridges, just like Nintendo. If a module developer wants to release his module as a cartridge, he pays us a licensing fee and we turn it into a cartridge for him: a hardware implementation of his module. Pure software modules will remain free; you pay only if you need the speed of hardware.

The Network REALLY IS The Computer

Perl Machines are always on the Internet. If your regular Internet connection is lost, a built-in cellular modem continues the connection seamlessly. If your local cell is down, Perl Machines use spread-spectrum Ethernet over shortwave radio to gateway packets. Nothing short of massive electromagnetic interference or immersion in a Faraday cage will keep your Perl Machine off the Internet.

Constant connectivity has its advantages. The Perl community is famous for its cooperative nature, especially when it comes to code redistribution and reuse. It's only logical to take this cooperation to the next level and deploy a computational infrastructure where you can use not just my program, but my computer: my CPU cycles, my network connectivity, my disk drive.

CPOS. Every Perl Machine owner can elect to join the CPOS: the Comprehensive Perl Operating System. This gives member computers limited access to the resources of all the other computers. Brute-force computation with threads distributed over thousands of computers is a snap. When you need the horsepower, other Perl Machines will help.

This has a wonderful corollary: the utility of a Perl Machine is proportional to how many people own them. It's a stunning violation of the law of supply and demand: The more there are, the more each is worth.

The Perl Wizard. Besides the materialistic resources of MIPS, disks, and bandwidth, there's another important resource: expertise. Perl Machine owners will be among the best programmers, and so we've incorporated a feature to help bring their shared brainpower to your desktop. Many Perl hackers have wished that the behavior of -w and use strict were more customizable; the Perl Wizard system lets everyone define precisely what they consider bad programming practice - and then lets them provide that set of restrictions to others.

When you say use 'Ray F. Piodasoll' in your program, the Perl Machine downloads my restrictions to your computer. I like use strict 'refs' but not use strict 'vars', for instance, so that's what you'll get. I also deplore the for (1..100000) construct, which chews up a lot of memory. If you write a statement like that, you'll be greeted by a picture of me wagging my finger and explaining why that's a bad idea.
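Until the Perl Wizard reaches your desk, the same preferences can be spelled out by hand. The sketch below turns on only the 'refs' half of strict and replaces the range loop with a C-style loop that keeps a single counter; sum_to() is a name made up for the example. Note that later Perl releases iterate over a range in foreach without building a temporary list, so the memory complaint applies mainly to older versions.

```perl
use strict 'refs';    # symbolic references are fatal; 'vars' is left off
use warnings;

# Sum 1..$n with a C-style loop, which keeps only one counter in memory.
sub sum_to {
    my ($n) = @_;
    my $total = 0;
    for (my $i = 1; $i <= $n; $i++) {
        $total += $i;
    }
    return $total;
}

my $grand_total = sum_to(100000);
print "$grand_total\n";
```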

Reliable Storage. You can partition your disk to obey PFS, the Perl File System. You may have heard of disks that use check digits to make storage more reliable, or parallel tape backup systems that store each datum twice. However, even the best of these systems is still vulnerable to earthquakes and other natural disasters. PFS is like RAID striping, but it uses six different disks on six different continents. Every datum is stored across the globe, so that even if thermonuclear war takes out half the planet, there will be enough information in the other half to reconstruct your data.
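The article doesn't say how PFS encodes data, so the following is only a sketch of the familiar striping-with-parity idea, assuming five data disks plus one XOR parity disk to make up the six. Single parity survives the loss of just one disk, well short of half the planet; stripe() and rebuild() are names invented here.

```perl
use strict;
use warnings;

# Split $data into $n equal chunks (NUL-padded) plus one XOR parity chunk.
sub stripe {
    my ($data, $n) = @_;
    my $len    = int((length($data) + $n - 1) / $n) || 1;
    my $padded = $data . ("\0" x ($len * $n - length($data)));
    my @chunks = unpack("(a$len)$n", $padded);
    my $parity = "\0" x $len;
    $parity ^= $_ for @chunks;          # bytewise XOR of all data chunks
    return (@chunks, $parity);
}

# Rebuild the chunk at index $lost by XORing every surviving chunk.
sub rebuild {
    my ($lost, @disks) = @_;
    my $chunk = "\0" x length($disks[$lost == 0 ? 1 : 0]);
    for my $i (0 .. $#disks) {
        $chunk ^= $disks[$i] unless $i == $lost;
    }
    return $chunk;
}

my $text  = "Just another Perl hacker,";
my @disks = stripe($text, 5);           # five data disks + one parity disk
$disks[2] = undef;                      # one continent goes dark
$disks[2] = rebuild(2, @disks);         # recover it from the other five
my $restored = substr(join("", @disks[0 .. 4]), 0, length($text));
print "$restored\n";
```

Surviving the loss of several disks at once would need a stronger code than single parity, or outright mirroring.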


Ray F. Piodasoll (fool@readable.com) is a systems architect for The Perl Factory in Cambridge, Massachusetts. He occasionally writes articles intended to be published in April, on the first of the month. Ray would like to thank the following people who have contributed to the design of The Perl Machine: Geoffrey Broadwell, Jürgen Christoffel, Anthony David, Mark-Jason Dominus, Robert Ferrell, Dan Gruhl, Benjamin Holzman, Tom Horsley, Tuomas Lukka, Jon Orwant, Tom Phoenix, John Redford, Gurusamy Sarathy, Abigail, Dan Sugalski, Jeff Sumler, Nathan Torkington, Adam Turoff, Larry Virden, and Ilya Zakharevich.
