The Perl Wizard's Quiz

Tom Christiansen

1. What value is returned by a lone return; statement?
    a. The empty list value ().
    b. The undefined value in scalar context, and the empty list value () in list context.
    c. The result of the last evaluated expression in that subroutine's block.
    d. The undefined value.

2. What's the difference between /^Foo/s and /^Foo/?
    a. The first would allow the match to cross newline boundaries.
    b. The first would match Foo other than at the start of the record if the previous match were /^Foo/gcm, new in the 5.004 release.
    c. The second would match Foo other than at the start of the record if $* were set.
    d. There is no difference because /s only affects whether dot can match newline.

3. What does length(%HASH) produce if you have thirty-seven random keys in a newly created hash?
    a. 5
    b. 37
    c. 74
    d. 2

4. What does read() return at end of file?
    a. 0
    b. "0 but true"
    c. "\0"
    d. undef

5. How do you produce a reference to a list?
    a. [ @array ]
    b. \($s, @a, %h, &c)
    c. You can't produce a reference to a list.
    d. \@array

6. Why aren't Perl's patterns regular expressions?
    a. Because Perl allows both minimal matching and maximal matching in the same pattern.
    b. Because Perl uses a non-deterministic finite automaton rather than a deterministic finite automaton.
    c. Because Perl patterns can have look-ahead assertions and negations.
    d. Because Perl patterns have backreferences.

7. Why doesn't Perl have overloaded functions?
    a. Because you can inspect the argument count, return context, and object types all by yourself.
    b. It does, along with overloaded operators as well as overridden functions and methods.
    c. Because Perl doesn't have function prototypes.
    d. Because it's too hard.

8. Why is it hard to call this function: sub y { "because" }
    a. It's not.
    b. Because y is a predefined function.
    c. Because it has no prototype.
    d. Because y is a kind of quoting operator.

9. How do you print out the next line from a filehandle with all its bytes reversed?
    a. print reverse scalar <FH>
    b. print scalar reverse scalar <FH>
    c. print scalar reverse <FH>
    d. print reverse <FH>

10. When would local $_ in a function ruin your day?
    a. When your caller was in the middle of a while(<>) loop.
    b. When your caller was in the middle of a while(m//g) loop.
    c. When $_ was imported from another module.
    d. When your caller was in the middle of a foreach(@a) loop.

11. Which of these is a difference between C++ and Perl?
    a. C++ can have objects whose data cannot be accessed outside its class, but Perl cannot.
    b. C++ supports multiple inheritance, but Perl does not.
    c. C++ will not call destructors on objects that go out of scope if a reference to that object still exists, but Perl will.
    d. Perl can have objects whose data cannot be accessed outside its class, but C++ cannot.

12. Assuming both a local($var) and a my($var) exist, what's the difference between ${var} and ${"var"}?
    a. ${var} is the package variable $var, and ${"var"} is the scoped variable $var.
    b. There is no difference.
    c. ${var} is a package variable $var, and ${"var"} a global variable $var.
    d. ${var} is the lexical variable $var, and ${"var"} is the dynamic variable $var.

13. How do you match one letter in the current locale?
    a. /[a-z]/i
    b. /[^\W_\d]/
    c. /[:isalpha:]/
    d. /[a-zA-Z]/

14. If EXPR is an arbitrary expression, what is the difference between $Foo::{EXPR} and *{"Foo::".EXPR} ?
    a. The second is disallowed under use strict "refs".
    b. The first only happens at runtime, the second at only compile time.
    c. One is just a regular hash, the other a typeglob access for a strangely named variable.
    d. The first can create new globs dynamically, but the second cannot.

15. Assuming $_ contains HTML, which of the following substitutions will remove all tags in it?
    a. s/<.*>//g;
    b. s/<.*?>//gs;
    c. s/<\/?[A-Z]\w*(?:\s+[A-Z]\w*(?:\s*=\s*(?:(["']).*?\1|[\w- .]+))?)*\s*>//gsix;
    d. You can't do that.

16. What does new $cur->{LINK} do? (Assume the current package has no new() function of its own.)
    a. $cur->new()->{LINK}
    b. new($cur->{LINK})
    c. $cur ? ($cur->{LINK}->new()) : (new()->{LINK})
    d. $cur->{LINK}->new()

17. What does $result = f() .. g() really return?
    a. It produces a syntax error.
    b. True if and only if both f() and g() are true, or if f() and g() are both false, but returns false otherwise.
    c. False so long as f() returns false, after which it returns true until g() returns true, and then starts the cycle again.
    d. The last number from the list of numbers returned in the range between f()'s return value and g()'s.

18. What happens when you return a reference to a private variable?
    a. The underlying object is silently copied.
    b. Nothing bad - it just works.
    c. The compiler doesn't let you.
    d. You get a core dump later when you use it.

19. How do you give functions private variables that retain their values between calls?
    a. Include them as extra parameters in the prototype list, but don't pass anything in at that slot.
    b. Use localized globals.
    c. Create a scope surrounding that sub that contains lexicals.
    d. Perl doesn't support that.

20. What happens to objects lost in "unreachable" memory, such as the object returned by $Ob->new() in { my $ap; $ap = [ $Ob->new(), \$ap ]; } ?
    a. Their destructors are called when the memory becomes unreachable.
    b. Their destructors are never called.
    c. Perl doesn't support destructors.
    d. Their destructors are called when that interpreter thread shuts down.

21. What does Perl do if you try to exploit the execve(2) race condition involving setuid scripts?
    a. Sends mail to root and exits.
    b. Runs the fake script with setuid permissions.
    c. Runs the fake script, but without setuid permissions.
    d. Reboots your machine.


1. b. This way functions that wish to return failure can just use a simple return without worrying about the context in which they were called.
a: That would only be true in list context.
c: That's what happens when the function ends without return being used at all.
d: That would only be true in scalar context.
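
A minimal sketch of the point, assuming a hypothetical function find_nothing() that fails with a bare return; the caller's context decides what comes back:

```perl
use strict;
use warnings;

# Hypothetical function (not from the quiz) that signals failure
# with a bare return.
sub find_nothing { return; }

my @list   = find_nothing();   # list context: the empty list
my $scalar = find_nothing();   # scalar context: undef

print "list has ", scalar(@list), " elements\n";
print "scalar is ", (defined $scalar ? "defined" : "undefined"), "\n";
```

Either caller can test the result for truth without caring how the function was written.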

2. c. The deprecated $* flag does double duty, filling the roles of both /s and /m. By using /s, you suppress any settings of that spooky variable, and force your carets and dollars to match only at the ends of the string and not at ends of line as well - just as they would if $* weren't set at all.
a: /s only makes a dot able to match a newline, and then only if the string actually has a newline in it.
b: Although the /c modifier is indeed new as of 5.004 (and is used with /g), this has no particular interaction with /s.
d: /s does more than that.

3. a. length() is a built-in function prototyped as sub length($), and that scalar prototype silently changes aggregates into radically different forms. The scalar sense of a hash is false (0) if it's empty, otherwise it's a string representing the fullness of the hash buckets, like '18/32' or '39/64'. The length of that string is likely to be 5. Likewise, length(@a) would be 2 if there were 37 elements in @a.
b: length %HASH is nothing at all like scalar keys %HASH, which is a good bit more useful.
c: length %HASH is nothing at all like the size of the list of all the keys and values in %HASH.
d: You probably think it decided there were 37 keys, and that length(37) is 2. Close, but not quite.
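
A rough sketch for trying this yourself. The key names are made up, and the scalar() calls are spelled out so that the conversion is visible. One caveat that postdates this quiz: newer perls (5.26 and later) report the key count, not the bucket string, as the scalar value of a hash, so the first length printed depends on your version.

```perl
use strict;
use warnings;

# Thirty-seven made-up keys in a fresh hash.
my %hash = map { ("key$_" => 1) } 1 .. 37;

my $key_count   = keys %hash;      # scalar keys: the useful number
my $hash_scalar = scalar(%hash);   # bucket string on older perls
my @a           = (1) x 37;
my $array_len   = length(scalar(@a));   # length("37")

print "keys: $key_count\n";
print "scalar(%hash) is '$hash_scalar', length ", length($hash_scalar), "\n";
print "length of scalar(\@a): $array_len\n";
```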

4. a. A defined (but false) 0 value is the proper indication of the end of file for read() and sysread().
b: You're thinking of the ioctl() and fcntl() functions which return this when the C version returned 0, reserving undef for when the C version returns -1. For example, fcntl(STDIN,F_GETFL,1) returns "0 but true" depending on whether and how standard input has been redirected. (The F_GETFL flag can be loaded from the Fcntl module.)
c: That's a string of length 1 consisting of the NUL character, whose ord() is 0, which is false. The string, however, is true. read() doesn't return strings, but rather byte-counts.
d: That would signal an I/O error, not normal end of file. The circumfix operator <> returns undef when it reaches end of file, but a normal read does not.
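
To watch the byte counts go by, here is a small sketch that reads from an in-memory filehandle (an assumption made so the example needs no real file) four bytes at a time:

```perl
use strict;
use warnings;

my $data = "abcdef";
open(my $fh, '<', \$data) or die "cannot open in-memory file: $!";

my @counts;
while (1) {
    my $n = read($fh, my $buf, 4);
    die "read error: $!" unless defined $n;   # undef means an I/O error
    push @counts, $n;
    last if $n == 0;                          # defined but false: end of file
}
close($fh);

print "@counts\n";   # prints "4 2 0"
```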

5. c. A list is not an array, although in many places one may be used for the other. An array has an AV allocated, whereas a list is just some values on a stack somewhere. You cannot alter the length of a list, for example, any more than you could alter a number by saying something like 23++. While an array contains a list, it is not a list itself.
a: That makes a reference to a newly allocated anonymous array, and populates it with a copy of the contents of @array.
b: The backslash operator is distributive across a list, and produces a list in return, this being (\$s, \@a, \%h, \&c). Well, in list context. In scalar context, it's a strange way to get a reference to the function &c.
d: @array is not a list, but an array.
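
The differences among the wrong answers can be sketched like this; the variable names are illustrative only:

```perl
use strict;
use warnings;

my @array = (1, 2, 3);
my $copy  = [ @array ];   # new anonymous array holding a copy
my $same  = \@array;      # reference to @array itself

push @array, 4;
print scalar(@$copy), "\n";   # the copy still has 3 elements
print scalar(@$same), "\n";   # the reference sees all 4

# The backslash distributes over a parenthesized list.
my ($s, @a, %h) = (0);
my @refs  = \($s, @a, %h);
my @kinds = map { ref($_) } @refs;
print "@kinds\n";             # SCALAR ARRAY HASH
```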

6. d. A regular expression (by definition) must be able to determine the next state in the finite automaton without requiring any extra memory to keep around previous state. A pattern /([ab]+)c\1/ requires the state machine to remember old states, and thus disqualifies such patterns from being regular expressions in the classic sense of the term.
a: The mere presence of minimal and maximal repetitions does not disqualify a language from being "regular."
b: Both NFAs and DFAs can be used to solve regular expressions. Given an NFA, a DFA for it can be constructed, and vice versa. For example, classical grep uses an NFA, while classical egrep uses a DFA. Whether a pattern matches a particular string doesn't change, but where the match occurs may. In any case, they're both regular. However, an NFA can also be modified to handle backtracking, while a DFA cannot.
c: The (?=foo) and (?!foo) constructs no more violate whether the language is regular than ^ and $, which are also zero-width statements.
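
A short sketch of the backreference at work. The anchors are added here (they are not in the pattern above) so that the whole string has to fit the shape "X c X":

```perl
use strict;
use warnings;

# Anchored variant of /([ab]+)c\1/: whatever group 1 matched
# must appear again, verbatim, after the c.
my $pattern = qr/^([ab]+)c\1$/;

for my $string ("abcab", "bbcbb", "abcba") {
    printf "%-6s %s\n", $string,
        ($string =~ $pattern ? "matches" : "does not match");
}
```

The engine has to remember what the parentheses captured in order to decide the last case, which is exactly the extra memory a finite automaton doesn't have.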

7. a. In Perl, the number of arguments is available to a function via the scalar sense of @_, the return context is available via wantarray(), and the types of the arguments via ref() (if they're references) and simple pattern matching like /^\d+$/ (otherwise). In low-level languages like C++, where you can't do this, you must resort to overloading of functions.
b: Actually, Perl does support overloaded operators via use overload, overridden functions as in use Cwd qw!chdir!, and overridden methods via inheritance and polymorphism. It just doesn't support functions automatically overloaded on parameter signature or return type. Not that such isn't longed for.
c: Perl actually does have function prototypes, but this isn't used for the traditional sort of prototype checking, but rather for creating functions that exactly emulate Perl's built-ins, which can implicitly force context conversion or pass-by-reference without the caller being aware.
d: Just because it's hard isn't likely to rule out something from being implemented - someday.
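
As a sketch of doing the inspection by hand, here is a hypothetical describe() function (not part of any module) that reports its argument count, its calling context, and a rough guess at each argument's type:

```perl
use strict;
use warnings;

sub describe {
    my $argc    = @_;    # scalar sense of @_: number of arguments
    my $context = wantarray          ? "list"
                : defined(wantarray) ? "scalar"
                :                      "void";
    # ref() for references, a simple pattern match for plain scalars.
    my @kinds   = map { ref($_) || (/^\d+$/ ? "number" : "string") } @_;
    return "$argc args, $context context: @kinds";
}

my $in_scalar = describe([1, 2], 42, "hello");
my ($in_list) = describe({ a => 1 });

print "$in_scalar\n";   # 3 args, scalar context: ARRAY number string
print "$in_list\n";     # 1 args, list context: HASH
```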

8. d. The y/// operator is the sed-savvy synonym for tr///. That means y(3) would be like tr(), which would be looking for a second string, as in tr/a-z/A-Z/, tr(a-z)(A-Z), or tr[a-z][A-Z].
a: Most people don't call functions with ampersands anymore. If they did, as in &y(), it wouldn't be so hard.
b: y isn't really a function, per se. If it were, you would never see y!abc!xyz!, since proper functions do not like getting banged on that way.
c: Functions don't require prototypes in Perl.
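
A minimal sketch of the ampersand workaround mentioned above, assuming your perl accepts the definition as the quiz says it does:

```perl
use strict;
use warnings;

sub y { "because" }

# A bare y(...) would be parsed as the start of a y/// transliteration,
# so the call goes through the ampersand form.
my $why = &y();
print "$why\n";
```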

9. b. Surprisingly enough, you have to put both the reverse() and the <FH> into scalar context separately for this to work.
a: Although scalar <FH> did retrieve just the next line, the reverse() is still in the list context imposed on it by print, so it takes its list of one element and reverses the order of the list, producing exactly the next line. An expensive way of writing print scalar <FH>.
c: Although the first use of scalar inhibits the list context being imposed on reverse() by print(), it doesn't carry through to change the list context that reverse() is imposing on <FH>. So reverse() catenates all its arguments and does a byte-for-byte flip on the resulting string.
d: That reads all lines in FH, then reverses that list of lines and passes the resulting reversed list off to print. This is actually a very useful thing, and simulates tail -r behavior but without the annoying buffer limitations of that utility. Nonetheless, it's not what we want.
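
Here is a small sketch of answers b and a side by side, reading from an in-memory filehandle (an assumption, so that no real file is needed):

```perl
use strict;
use warnings;

my $text = "abc\ndef\n";
open(my $fh, '<', \$text) or die "cannot open in-memory file: $!";

# Both reverse() and the readline are in scalar context:
# one line is read and its bytes are flipped.
my $flipped = scalar reverse scalar <$fh>;

# Here reverse() is in list context: it reverses a one-element
# list, so the next line comes back unchanged.
my @unchanged = reverse scalar <$fh>;
close($fh);

print $flipped eq "\ncba"   ? "bytes flipped\n"   : "not flipped\n";
print $unchanged[0] eq "def\n" ? "line unchanged\n" : "line changed\n";
```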

10. b. The /g state on a global variable is not protected with local. That'll teach you to stop using locals. Too bad $_ can't be the target of a my() - yet.
a: However, if you do a while(<>) and forget to first localize $_, you'll hurt someone above you. That's because even though foreach implicitly localizes $_, while(<>) does not.
c: Doing a local() on an imported variable is not harmful. Of course, in the case of $_, it's virtually unnecessary, since $_ is always forced to mean the version in the main package, that is, $main::_.
d: This looks close to the bizarre phenomenon known as variable suicide, but as of this writing, you should be safe from it.

11. d. Perl can use closures with unreachable private data as objects, and C++ doesn't support closures. Furthermore, C++ does support pointer arithmetic via int *ip = (int*)&object, allowing you to look all over the object. Perl doesn't have pointer arithmetic. It also doesn't allow #define private public to change access rights to foreign objects. On the other hand, once you start poking around in /dev/mem, no one is safe.
a: See above for why.
b: Both support multiple inheritance.
c: Exchange "Perl" and "C++" in that answer, and you would be telling the truth. C++ is too primitive to know when an object is no longer in use, because it has no garbage collection system. Perl does.

12. d. Odd though it appears, this is how it works. Note that because the second is a symbol table lookup, it is disallowed under use strict "refs". The words global, local, package, symbol table, and dynamic all refer to the kind of variables that local() affects, whereas the other sort, those governed by my(), are variously known as private, lexical, or scoped variables.
a: Try again. You're close.
b: One is the scoped variable, the other the package variable. Which is which, though?
c: There is no difference between a package variable and a global variable. All package variables are globals, and vice versa.
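
A sketch of the two lookups, with a package variable declared via our (standing in for the local() case, which is an assumption of this example) and a lexical of the same name in an inner block:

```perl
use strict;
use warnings;

our $var = "package";

my ($from_bareword, $from_string);
{
    my $var = "lexical";
    no strict 'refs';            # the string form is a symbolic reference
    $from_bareword = ${var};     # same as $var: the lexical
    $from_string   = ${"var"};   # symbol table lookup: $main::var
}

print "\${var}   is the $from_bareword variable\n";
print "\${\"var\"} is the $from_string variable\n";
```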

13. b. We don't have full POSIX regexes, so you can't get at the isalpha() macro from ctype.h except indirectly. You ask for one byte which is neither a non-alphanumunder, nor an under, nor a numeric. That leaves just the alphabetics, which is what you want.
a: You still forgot the locale-specific letters. The /i flag doesn't bring them in.
c: Lamentably, this reasonably standard syntax is not yet supported in Perl.
d: You forgot the locale-specific letters.
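
A quick sketch of the double negative in answer b, using plain ASCII probes; locale-specific letters only come into play when a locale is actually in effect, which this example does not attempt:

```perl
use strict;
use warnings;

# Not a non-word character, not an underscore, not a digit:
# what remains is a letter.
my @probes  = ("a", "Z", "_", "7", "-");
my @letters = grep { /[^\W_\d]/ } @probes;

print "@letters\n";   # a Z
```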

14. a. Dereferencing a string with *{"STR"} is disallowed under the refs stricture, although *{STR} would not be. This is similar in spirit to the way ${"STR"} is always the symbol table variable, while ${STR} may be the lexical variable. If it's not a bareword, you're playing with the symbol table in a particularly dynamic fashion.
b: Assuming that the expressions don't get resolved at compile time, this all has to wait until run time. Something like *Foo::varname, however, would be looked up at compile time.
c: The %Foo:: hash is always the symbol table associated with package Foo; such a hash can hardly be called regular. Both versions actually refer to the same typeglob, although somewhat differently.
d: Although you can get package Foo's symbol table via the hash %Foo::, you cannot usefully generate new typeglobs (symbols) this way. You could copy old ones into that slot, though, effectively doing the Exporter's job by hand.

15. d. If it weren't for HTML comments, improperly formatted HTML, and tags with interesting data like <SCRIPT>, you could do this. Alas, you cannot. It takes a lot more smarts, and quite frankly, a real parser.
a: As written, the dot will not cross newline boundaries, and the star is being too greedy. If you add a /s, then yes, it will remove all tags - and a great deal else besides.
b: It is easy to construct a tag that will cause this to fail, such as <IMG SRC='foo.gif' ALT="> ">.
c: For a good deal of HTML, this will actually work, but it will fail on cases with annoying comments, poorly formatted HTML, and tags like <SCRIPT> and <STYLE>, which can contain things like while (<FH>) {} without those being counted as tags. Comments that will annoy you include <!-- <foo bar = "-->"> which will remove characters when it shouldn't; it's just a comment followed by ">. And even something like <!-- <foo bar = "--> most browsers will get right, but the substitution will not. And if you have improper HTML, you get into even more trouble, like this: <foo bar = "bleh" @> text text text <foo bar = "bleh"> in which case the .*? will gobble up much more than you thought it would.
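
To see answer b fail on the tag mentioned above, here is a short sketch; the trailing word "caption" is added only so there is some text after the tag:

```perl
use strict;
use warnings;

my $html = q{<IMG SRC='foo.gif' ALT="> ">caption};

(my $stripped = $html) =~ s/<.*?>//gs;

# The minimal match stops at the first ">", which sits inside
# the ALT attribute, so debris from the tag is left behind.
print "[$stripped]\n";   # [ ">caption]
```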

16. a. The indirect object syntax only has a single token lookahead. That means if new() is a method, it only grabs the very next token, not the entire following expression. This is why new $obj[23] arg doesn't work, as well as why print $fh[23] "stuff\n" doesn't work. Mixing notations between the OO and IO notations is perilous. If you always use arrow syntax for method calls, and nothing else, you'll never be surprised.
b: If the current package did in fact have its own new() function, then this would be the right answer, but for the wrong reasons. Within a class, it might appear to make no difference since the new() subroutine would get its argument in $_[0] whether it's called as a function or a method. However, a method call can use inheritance, while a function call never does. That means esoteric overridden new() methods would be duped out of calling their derived class' constructor first, and we wouldn't want that to happen, now would we?
c: Perl may be crazy, but it's not quite that crazy. Yet.
d: Just because it looks like a unary function doesn't mean a method call parses like one. You just want it to work this way. If you want that, write that.

17. c. This is scalar context, not list context, so we have the bistable flip-flop range operator famous in parsing of mail messages, as in $in_body = /^$/ .. eof(). Except for the first time f() returns true, g() is entirely ignored, and f() will be ignored later when g() is evaluated. Double dot is the inclusive range operator; f() and g() will both be evaluated on the same record. If you don't want that to happen, the exclusive range operator, triple dots, can be used instead. For extra credit, describe this: $bingo = ( a() .. b() ) ... ( c() .. d() );
a: You'd be amazed at how many things in Perl don't cause syntax errors.
b: That sounds more like a negated logical xor. A logical xor is !$a != !$b, so you've just described !$a == !$b. Interesting, and perhaps even useful, but unrelated to .., our scalar range operator.
d: That might work in list context, but never in scalar. The list operator .. is a totally different creature than the scalar one. They're just spelled the same way, kind of like when you can the rusty old can down by the guys' can just because you can. Context, as always, is critical.
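
A minimal sketch of the scalar flip-flop, using made-up BEGIN and END marker lines in place of a mail message:

```perl
use strict;
use warnings;

my @lines = ("preamble", "BEGIN", "payload", "END", "trailer");

my @picked;
for my $line (@lines) {
    # False until the left side matches, then true through the
    # line on which the right side matches.
    push @picked, $line if ($line =~ /^BEGIN/) .. ($line =~ /^END/);
}

print "@picked\n";   # BEGIN payload END
```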

18. b. Perl keeps track of your variables, whether dynamic or otherwise, and doesn't free things before you're done using them.
a: Even though the reference returned is for all intents and purposes a copy of the original (Perl uses return by reference), the underlying referent has not changed.
c: Perl seldom stops you from doing what you want to do, and tries very hard to do what you mean to do. This is one of those cases.
d: Perl is not C or C++.
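
A small sketch, assuming a hypothetical make_list() function, of a reference to a my variable outliving the subroutine that created it:

```perl
use strict;
use warnings;

# Hypothetical function returning a reference to its own private array.
sub make_list {
    my @private = (1, 2, 3);
    return \@private;
}

my $first = make_list();
push @$first, 4;            # still valid after the sub has returned

my $second = make_list();   # each call gets a fresh @private

print "@$first\n";    # 1 2 3 4
print "@$second\n";   # 1 2 3
```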

19. c. Only lexical variables are truly private, and they will persist even when their block exits if something still cares about them. Thus: { my $i = 0; sub next_i { $i++ } sub last_i { --$i } } creates two functions that share a private variable. The $i variable will not be deallocated when its block goes away because next_i() and last_i() need to be able to access it.
a: Perl is not the Korn shell, nor anything like it. If you tried this, your program probably wouldn't even compile.
b: The local() operator merely saves the old value of a global variable, restoring that value when the block in which the local occurred exits. Once the subroutine exits, the temporary value is lost. Before then, other functions can access the temporary value of that global variable.
d: It would be difficult to keep private state in a function otherwise.
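
The block from answer c, laid out as a runnable sketch with a few calls added to show the shared state:

```perl
use strict;
use warnings;

{
    my $i = 0;               # visible only to the two subs below
    sub next_i { $i++ }
    sub last_i { --$i }
}

my @seen = (next_i(), next_i(), last_i());
print "@seen\n";   # 0 1 1
```

Nothing outside the enclosing block can name $i, yet both functions keep seeing the same variable from call to call.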

20. d. When the interpreter exits, it first does an exhaustive search looking for anything that it allocated. This allows Perl to be used in embedded and multithreaded applications safely, and furthermore guarantees correctness of object code.
a: Under the current implementation, the reference-counted garbage collection system won't notice that the object in $ap's array cannot be reached, because the array reference itself never has its reference count go to zero.
b: That would be very bad, because then you could have objects whose class-specific cleanup code didn't get called ever.
c: A class's DESTROY function, or that of its base classes, is called for any cleanup. It is not expected to deallocate memory, however.

21. a. It has been said that all programs advance to the point of being able to automatically read mail. While not quite there yet (well, without loading a module), Perl will at least automatically send it.
b: That would be bad. Very Bad. What do you think we are? A shell or something?
c: It would be improper to run anything at all in the face of such naughtiness.
d: An appealing idea, though, isn't it? After all, Perl does possess super(user)powers at this point. You just never know what it might do. In the interests of courtesy, though, Perl stays out of your power supply just as it stays out of your living room.

__END__

Tom Christiansen is the co-author of Programming Perl, Learning Perl, and the upcoming Perl Cookbook. He would like to thank Nathan Torkington, Abigail, and Jeffrey Friedl for their suggestions.