Category Archives: Perl

Perl, the duct tape of the Internet. Posts about Perl development, releases of my CPAN modules, etc.

Easy CLI option parsing with Getopt::Lucid

I often write Perl scripts which need to read options given on the command line. Normally I turn to the venerable old Getopt::Long which does the job.

However, I was writing a script which needed to be able to accept only certain parameters, which were mostly optional, and also take a list of filenames. I wanted this to be possible in any order, e.g.:

myscript --foo=foo --bar=bar file1 file2
myscript file1 file --foo foo

Getopt::Lucid makes this all pretty easy, and also makes the code pretty self-documenting, too. Straight from the documentation, showing the various types of parameters it can parse:

@specs = (
    Switch("version|V"),
    Counter("verbose|v"),
    Param("config|C"),
    List("lib|l|I"),
    Keypair("define"),
    Switch("help|h"),
);

$opt = Getopt::Lucid->getopt( \@specs );

$verbosity = $opt->get_verbose;
@libs = $opt->get_lib;
%defs = $opt->get_define;

A real-world example,from one of my scripts which handles ID3-tagging:

# The options we can take - these correspond to the name of the tag that
# they'll set:
my @options = qw(track title artist album comment);

my @option_specs = (
    Getopt::Lucid::Switch('verbose|v'),
    map { Getopt::Lucid::Param($_) } @options,
);
my $opt = Getopt::Lucid->getopt(\@option_specs);

my @tags_to_set = grep { $opt->{seen}{$_} } @options;
my @files = @{ $opt->{target} };

if (!@tags_to_set) {
    say "Nothing to do.  Use one or more of the options:\n" .
        join ', ', map {'--'.$_} @options;
    exit;
}

(The script then goes on to loop over all files, and use Music::Tag to set the ID3 tags requested).

Easy file finding with File::Find::Rule

Recently I found File::Find::Rule on the CPAN, and I’m impressed how easy it makes it to get a list of files to work on.

A fairly common way to do this in Perl would be something like:

my $dirh = new DirHandle($somedir);
while (my $entry = $dirh->read) {
    # Skip hidden files and directories:
    next if ($entry =~ /^\./ || !-f $entry);

    # Skip if it doesn't match the name we want:
    next if ($entry !~ /\.txt$/);

    print "Found: $somedir/$entry\n";
}

File::Find::Rule makes things rather easier:

my @files = File::Find::Rule->file()->name('*.txt')->in($somedir);

Various conditions can be chained together to find exactly what you want.

Another example, showing combining rules with ->any() to find files matching any of those conditions:

# find avis, movs, things over 200M and empty files
my @files = File::Find::Rule->any(
    File::Find::Rule->name( '*.avi', '*.mov' ),
    File::Find::Rule->size( '>200M' ),
    File::Find::Rule->file->empty,
)->in('/home');

There’s plenty of other ways to do this, but I think File::Find::Rule gives a way to clearly and concisely state what you want and get the job done.

Favourite new Perl features

I’ve been starting to make use of the new features introduced in perl 5.10 recently (after being constrained by my main dev environments still running perl 5.8.8, and not having the time to upgrade).

My favourite features so far are:

The smart match operator

The new smart-match operator, ~~, is a great example of DWIM.

A few examples:

if (@a ~~ 'foo')  # list contains at least one item equalling 'foo'
if (@a ~~ /foo/) # list contains at least one item matching /fo+/
if (@a ~~ @b)   # lists contain same values

That’s just a brief overview; there’s plenty more documentation

say

Not a big change, but the new say keyword acts just like print, but adds an implicit newline to the end – so say 'Hello'; is just the same as print "Hello\n";

It’s more useful in cases where you would have had to add parenthesis to get correct precedence – something like: print join(';', @foo) . "\n"; can now be written more concisely as just say join ';', @foo;.

Switch (given) statement

given ($foo) {
    when (/^abc/) { abc(); }
    when (/^def/) { def(); }
    when (/^xyz/) { xyz(); }
    default { die "Unrecognised foo"; }
}

Defined-or

// is now the defined-or operator.

It’s pretty common to use conditional assignments like: $a ||= $b to assign to $a unless $a already has a value. Now you can use $a //= $b to test for definedness rather than truthiness.

Likewise, if ($hash{foo} // $hash{bar}) will be true if either of them is defined (even if they’re defined but have a false value).

Named regex captures

Parenthesised sub-expressions in regular expressions can now be given a name, and accessed via the special %+ hash:

if ($foo =~ m{ (? \d{4} ) - (? \d{2}) - (? \d{2}) }xms) {
    say "Year: $+{year}";
}

The features above are my own personal favourites, in no particular order. The full (large) set of changes can be found in the perldelta for 5.10.0.

Quick Fibonacci calculations are nothing new

Just read this post by Ben Newman (found via Reddit).

Now, the use of C++ templates to calculate the value at compile time rather than runtime is midly clever and amusing (if also impractical and convoluted) but the fact that it can calculate a Fibonacci number quickly is nothing new; it’s solely down to remembering the values you’ve already calculated, and not calculating them again needlessly.
Continue reading Quick Fibonacci calculations are nothing new

Creating HTML tables from database queries with HTML::Table::FromDatabase

A task I find myself doing reasonably often when programming is producing a HTML table based on the result of a database query.

This often ends up with the same kind of boring code being written again and again, which get tedious.

For example:

print <
idfoobar
TABLESTART

my $sth = $dbh->prepare(
    "select id, foo, bar from mytable where something = 'somethingelse'"
);
$sth->execute() or die "Failed to query";

while (my $row = $sth->fetchrow_hashref) {
    print '';
    print join '', @$row{qw(id foo bar)};
    print "\n";
}
print "\n";
$sth->finish;

Not hard, but it does get tedious.

HTML::Table makes things better by taking out most of the HTML drudgery, but you still need to loop through adding rows to your table.

This is where my HTML::Table::FromDatabase comes in – it’s a subclass of HTML::Table which accepts an executed DBI statement handle, and automatically produces the table for you.

For instance:

my $sth = $dbh->prepare(
    "select id, foo, bar from mytable where something = 'somethingelse'"
);
$sth->execute() or die "Failed to query";

my $table = HTML::Table::FromDatabase->new( -sth => $sth );
$table->print;

Much easier, and HTML::Table::FromDatabase does all the tedious work.

Sometimes that won’t be quite flexible enough though; you might have something you want to do to certain columns or values before outputting them.

That’s where HTML::Table::FromDatabase’s callbacks come in handy. For a basic example, let’s say that one of the columns you’re fetching contains URLs, and you want to wrap them in anchor tags to make them clickable links. Simply done with:

 my $table = HTML::Table::FromDatabase->new(
    -sth => $sth,
    -callbacks => [
        {
            column => 'url',
            transform => sub { $_ = shift; qq[$_]; },
        },
    ],
 );

Another example – looking for all cells whose value is a number, and formatting them to two decimal places:

 my $table = HTML::Table::FromDatabase->new(
    -sth => $sth,
    -callbacks => [
        {
            value => qr/\d+/,
            transform => sub { return sprintf '%.2f', shift },
        },
    ],
 );

You can apply as many callbacks as you need.

As HTML::Table::FromDatabase is a subclass of HTML::Table, all of HTML::Table’s options can still be used to control how the generated table appears, for example:

  • -class => 'classname' to give the table a specific class to help you apply CSS styling
  • -border => 1 to apply borders, -padding => 3 to set cell padding
  • -evenrowclass and -oddrowclass if you want to have different styling for even and odd rows (e.g. alternating row backgrounds).

The full list of options can be found in the HTML::Table documentation, I’m not going to duplicate it all here.

Currently, the row headings used in the generated table are taken from the column names in the query, but I plan to release a new version sometime soon which allows you to alias them, if you want to do so.

(The code samples in this post are intentionally kept relatively simple, omitting obvious things like connecting to the database first, error checking etc).

(This post also appears on Perlbuzz)

Playing with Ohloh

I’ve been having a quick play with Ohloh, and it seems pretty good. It’s “a website which provides a web services suite and online community platform that aims to map the landscape of open source software development.”

I figured it was worth getting my Perl modules listed, if only to boost the amount of Perl code listed there – I don’t think enough people sing Perl’s praises as they’re busy doing real work with it, so it appears to some to be going “the way of the dinosaurs”.

Ohloh seems impressive so far, with features to hook in to your source control system (Subversion in my case) to see contributors, change history etc. The only drawback is that it does not like re-organisation of the repo, and I re-organised mine to get all the code I’m willing to publically expose under a certain path in the repo, so I can point svnserve at that path, whilst some other code sits at another level. This means that, as far as Ohloh can see, there’s only ever been one commit to my projects. It’s a known problem (according to this forum post).

So far I’ve added SMS::AQL and HTML::Table::FromDatabase – other projects to follow.

New song lyrics search site

A whistling badger

I’ve been meaning to whack up a post about this – I launched a new song lyrics search website the other day called LyricsBadger.

It uses my Lyrics::Fetcher Perl module to fetch song lyrics from a variety of sites, and remembers what it’s been asked for before so that it can present lists of artists/songs which it’s already been asked for.

I built it as a testbed for Lyrics::Fetcher and to get some experience with Template Toolkit for Perl (which absolutely rocks!). The entire site is powered by one Perl script and a handful of templates, and uses a ScriptAlias directive to pass all requests to the one script so that it can provide nice clean URLs like /lyrics/Artist/Title.

Why not go and give LyricsBadger a try?

SimpleStreamer – Flash video streamer for Wii or PC

I recently got a Wii as a new toy, and wanted something good to stream videos to it.

I had a look round at some of the various “Media Center” options out there, but didn’t find anything I liked, so I’m writing my own.

So far I have a working system in Perl which allows you to browse the pre-set video dir(s) and presents a list of files + folders, when you select one it then uses a Flash streaming player, and either presents the file directly if it was a .flv video, otherwise it uses ffmpeg to transcode it to FLV on the fly and stream it to the Flash player.

Continue reading SimpleStreamer – Flash video streamer for Wii or PC