UK2 Perl Competency Test

What is CPAN?

CPAN is the Comprehensive Perl Archive Network, a large collection of Perl software and documentation. It offers useful Perl modules for a wide range of common tasks.

It provides many opportunities to avoid reinvention of wheels by reusing other (mostly already tested and proven) code.

What should a Perl script start with?

The basic, obvious answer is the shebang line with the path to the Perl interpreter:

#!/usr/bin/perl

However, a good Perl script will also use strict, and either use warnings or pass the -w flag on the shebang line. Taint mode (-T) is also a security-conscious choice.
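
As a minimal sketch of the advice above, a script might open like this. The greeting sub is hypothetical and only there to give the script something to run:

```perl
#!/usr/bin/perl
use strict;      # require declared variables, forbid symbolic references
use warnings;    # lexically scoped alternative to the -w flag

# Hypothetical example sub, only here to give the script a body.
sub greeting {
    my ($name) = @_;
    return "Hello, $name";
}

print greeting('world'), "\n";
```

With strict enabled, a misspelt variable name becomes a compile-time error rather than a silent bug.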

How would you install a Perl module?

Manually: you need to untar the .tar.gz file that most modules are distributed in, then run:

perl Makefile.PL
make
make install

(Run make install as root, unless you are installing into a directory you can write to, which you can choose by passing PREFIX= to the perl Makefile.PL step.)

Using the CPAN module:

perl -MCPAN -e shell
# when you get the shell, do:
install Some::Module

# or, immediately with:
perl -MCPAN -e 'install Some::Module'

What is a hash in Perl?

A hash in Perl is an unordered collection of key/value pairs, in which each value is looked up by a unique string key rather than by a numeric index. Other languages call this an "associative array" or "dictionary".
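
A short sketch of basic hash usage may help here. The names and ages are made-up illustration data:

```perl
use strict;
use warnings;

# Keys are strings; values can be any scalar.
my %age = (
    alice => 31,
    bob   => 27,
);

$age{carol} = 45;                              # add or overwrite an entry
my $bobs_age = $age{bob};                      # look up a value by key
my $has_dave = exists $age{dave} ? 1 : 0;      # test for a key without creating it
my $count    = keys %age;                      # number of keys (scalar context)

print "bob is $bobs_age; $count people known\n";
```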

How would you connect to a database? Provide a basic example. (For the sake of the example, assume a MySQL database running on the local host.)

The DBI module from CPAN is the way to go... something similar to the following:

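use DBI;
# get a connection:
my $dbh = DBI->connect('DBI:mysql:database=database_name;host=localhost', 'dbuser', 'dbpass');

# or, with the DSN and credentials in variables, and errors raised as exceptions:
my ($dsn, $user, $password) = ('DBI:mysql:database=database_name;host=localhost', 'dbuser', 'dbpass');
$dbh = DBI->connect($dsn, $user, $password,
                    { RaiseError => 1, AutoCommit => 1 });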

What does the following piece of code do? Why would you do it this way, rather than having the SQL query within the for loop?

my $sth = $dbh->prepare('SELECT * FROM users WHERE id = ?');

my @ids = (5,6,10);

for my $id (@ids) {
    $sth->execute($id);                 # returns a status, not the row itself
    my $row = $sth->fetchrow_hashref;   # fetch the matching row
    # ... do something with $row
}

It prepares the SQL statement with a placeholder for the id value to look for, then executes it multiple times, once for each ID from @ids.

Doing it this way provides two benefits:

For database engines which support it, the execution plan for the query only has to be worked out once and is then reused for each execution. For a query which is run many times over the lifetime of the DB connection, this can provide significant efficiency savings.
Correct quoting of the parameters passed in to the execute call is handled for you, avoiding the need to carefully check/quote the values yourself to avoid SQL injection attacks.

How would you iterate through the elements of a hash, in ascending alphabetical order of the keys?

A simple example:

for my $key (sort keys %hash) {
    print "$key is $hash{$key}\n";
}

Using a regular expression, extract the person's name, the activity they did and the day, month and year from the following strings:

my @strings = (
    'Bob went fishing on 13/06/2006 at 13:30',
    'Sarah went swimming on 04/01/2007 at 17:05',
    'Fred went running on 21/10/2006 at 10:15',
);

Candidates could do this in various ways; just look for one that works. One way would be:

for my $line (@strings) {
    # m{...} avoids having to escape the slashes in the date
    if (my ($name, $activity, $day, $month, $year) =
        $line =~ m{^(\w+) went (.+) on ([0-9]{2})/([0-9]{2})/([0-9]{4})}) {

        print "name: $name activity: $activity day: $day month: $month year: $year\n";
    }
}

How would you validate an email address (checking it looks syntactically valid, not whether it actually exists)?

Good answers would include "using Email::Valid or Mail::RFC822::Address or various other CPAN modules". Not-so-good answers would be rolling your own solution - a regexp to validate an email address properly according to RFC822 / RFC2822 is absolutely horrendous. This wheel has been invented too many times already.

How would you implement a timeout around a blocking syscall?

Good answers would include the use of an eval block with an alarm, such as:

eval {
    local $SIG{ALRM} = sub { die "alarm\n"; };
    alarm 5;  # time out after 5 seconds
 
    # do your syscall or whatever here
 
    alarm 0;  # if we reach here it completed before timeout,
              # so disable the alarm
};
 
# now check for errors:
if ($@) {
    # an error from the eval, let's see what:
    if ($@ eq "alarm\n") {
        # it was the timeout:
    } else {
        # it was some other unexpected failure... better
        # either handle it or propagate it:
        die $@;
    }
}
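
The same pattern can be wrapped in a reusable sub. This is only a sketch: with_timeout is a hypothetical helper name, and a 3-second select stands in for the blocking call. It assumes a platform where alarm is available (e.g. Unix-like systems):

```perl
use strict;
use warnings;

# Hypothetical helper: run $code, giving up after $seconds.
# Returns (1, result) on success and (0) on timeout; other errors are rethrown.
sub with_timeout {
    my ($seconds, $code) = @_;
    my $result;
    my $ok = eval {
        local $SIG{ALRM} = sub { die "alarm\n" };
        alarm $seconds;
        $result = $code->();
        alarm 0;          # finished in time, so cancel the alarm
        1;
    };
    alarm 0;              # make sure no alarm is left pending after a failure
    return (1, $result) if $ok;
    return (0) if $@ eq "alarm\n";
    die $@;               # some other failure: propagate it
}

# A 3-second wait standing in for a blocking call, limited to 1 second.
my ($finished) = with_timeout(1, sub { select(undef, undef, undef, 3); 'done' });
print $finished ? "completed\n" : "timed out\n";
```

The second alarm 0, outside the eval, covers the case where the code dies for some reason other than the timeout and would otherwise leave the alarm armed.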

Another good answer would be using Sys::SigAction from CPAN, since it allows safe signal handling. (Perl before 5.8.0 had 'unsafe' signal handling, meaning there was a risk that a signal would arrive and be handled whilst Perl was changing internal data structures, possibly causing subtle problems.)

What does map do?

map evaluates a block of code, or a single expression, for each element of the list it is passed, and returns the list of results.

A real-world example would be turning a hash of keys and values into a HTML select box (CGI.pm can be used to create these nicely, but this is a useful example):

my %options = ( apple => 'A green apple', banana => 'A yellow banana' );

my @tags = map { qq[<option value="$_">$options{$_}</option>] } keys %options;

The above code will give an array of <option> tags, with the value of each tag being the hash key, and the text displayed to the user being the value of that hash element.

What will @tags contain after running the following code?

my %options = ( apple => 'A green apple', banana => 'A yellow banana' );

my @tags = map { qq[<option value="$_">$options{$_}</option>] } keys %options;

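As explained in the answer to the previous question, it will contain one <option> tag per hash entry. The order is not guaranteed, because keys returns hash keys in no particular order, so it will look something like: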

# output of print Dumper(\@tags) :
$VAR1 = [
          '<option value="banana">A yellow banana</option>',
          '<option value="apple">A green apple</option>'
        ];
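
Because hash keys come back in no guaranteed order, the two tags may appear either way round. If a predictable order matters, one option (a sketch, not part of the original question) is to sort the keys first:

```perl
use strict;
use warnings;

my %options = ( apple => 'A green apple', banana => 'A yellow banana' );

# Sorting the keys makes the order of the generated tags predictable.
my @tags = map { qq[<option value="$_">$options{$_}</option>] } sort keys %options;

print "$_\n" for @tags;
```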

What does grep do?

grep is similar to map in that it evaluates an expression or block of code once for each item in the list it is passed. Unlike map, it returns only the elements for which the expression returned true.
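
A small sketch contrasting the two, using a made-up list of numbers:

```perl
use strict;
use warnings;

my @numbers = (1 .. 10);

my @even    = grep { $_ % 2 == 0 } @numbers;   # keeps only matching elements
my @doubled = map  { $_ * 2 }      @numbers;   # map, for comparison: one result per element
my $n_big   = grep { $_ > 7 } @numbers;        # scalar context: number of matches

print "even: @even\n";
print "$n_big numbers are greater than 7\n";
```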

What would the following code be used for? Why would it be written that way?

my %items = map { $_ => 1 } qw(apple orange banana pear);
if ($items{$fruit}) { print "Yes, we have some ${fruit}s"; }

It creates a hash, which is used as a lookup table. It is functionally equivalent to:

my %items = ( apple => 1, orange => 1, banana => 1, pear => 1 );

On a small dataset it would likely be more efficient (and clearer) to use grep to look for the item:

if (grep { $_ eq $fruit } qw(apple orange banana pear)) {
    # ...
}

However, as the dataset grows, the time taken to search the list grows linearly with its size, whereas the time for a hash lookup remains roughly constant.

Another, quicker but slightly more obscure, way to build the lookup hash is a hash slice assignment:

my @colours = qw/red green yellow orange white mauve blue ochre
                 pink purple gold silver grey brown steel/;

my %colours;
# create a key in %colours for every element of @colours; the values
# are undef, so test membership with exists $colours{$name}.
@colours{@colours} = ();
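
Since the slice assignment leaves every value undefined, membership has to be tested with exists rather than by truthiness. A shortened sketch with a three-colour list:

```perl
use strict;
use warnings;

my @colours = qw/red green blue/;

my %colours;
@colours{@colours} = ();   # keys exist, values are undef

# The values are undef, so test with exists rather than truthiness.
my $has_red  = exists $colours{red}  ? 1 : 0;
my $has_pink = exists $colours{pink} ? 1 : 0;
my $truthy   = $colours{red}         ? 1 : 0;   # 0: the value is undef

print "red: $has_red, pink: $has_pink\n";
```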

Show a way to take a string, capitalising the first letter of every word, and lowercasing the rest (e.g. given 'Fred bloggs', 'fred bloggs', 'FRED BLOGGS' or 'fReD bloGgs' it would return 'Fred Bloggs')

Obviously, as with anything in Perl, There's More Than One Way To Do It, but one suitable method would be:

$name = join ' ', map { ucfirst $_ } split /\s/, lc $name;
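
To check the one-liner against the inputs from the question, it can be wrapped in a sub. The name capitalise_words is hypothetical:

```perl
use strict;
use warnings;

# Hypothetical wrapper around the one-liner, so it can be reused and tested.
sub capitalise_words {
    my ($name) = @_;
    return join ' ', map { ucfirst $_ } split /\s/, lc $name;
}

print capitalise_words($_), "\n"
    for 'Fred bloggs', 'fred bloggs', 'FRED BLOGGS', 'fReD bloGgs';
```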

Show a quick way to randomly sort an array.

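A quick, if slightly unclear, way would be the following (the comparator is inconsistent, so the result is not a uniform shuffle; shuffle from List::Util is the more robust choice):

my @shuffled = sort { (-1,1)[rand 2] } @array;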

# e.g :
for my $thing (sort { (-1,1)[rand 2] } @array) {
    print "$thing\n";
}
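
As an alternative to the random comparator, the core List::Util module provides shuffle, which is generally the more dependable answer. A minimal sketch with a made-up array:

```perl
use strict;
use warnings;
use List::Util qw(shuffle);   # core module

my @array    = (1 .. 10);
my @shuffled = shuffle @array;   # returns a new list; @array is unchanged

print "$_\n" for @shuffled;
```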

What is the following?

my ($apples, $oranges) = @stock{ qw(apples oranges) };

It's a hash slice. It sets $apples and $oranges to the values of $stock{apples} and $stock{oranges}.
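
A short sketch with made-up stock figures, showing that a slice can be read from and assigned to:

```perl
use strict;
use warnings;

my %stock = ( apples => 12, oranges => 7, pears => 3 );

# Read two values at once with a slice (note the @ sigil).
my ($apples, $oranges) = @stock{ qw(apples oranges) };

# Slices are also assignable: update two entries in one statement.
@stock{ qw(apples pears) } = (10, 0);

print "apples: $apples, oranges: $oranges\n";
```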