Category: Programming

Intro to Testing at SVPerl

As promised, here are the slides from my talk: Intro to Testing.

This version of the talk covers a little more ground than the previous one: it gives you the preferred ways of dealing with leaking output and warnings, and it includes the new section on Cucumber.

You can also play with the demo programs which live on GitHub.

June 6, 2015
Automating ingest of StillStream signups

I help manage the StillStream internet radio site. We’re in the midst of a conversion of the site to WordPress to make it a lot more usable for both our listeners and our DJs, but at the moment we’re still running off our old custom PHP site coded by Palancar back when he first created the station.

One of the things this means is that we get all new signups by artists as emails delivered to me, and I have to manually enter the signup details into the site via phpMyAdmin. Our new system will automatically store the data in the database and let us approve the artists to be shown publicly, but for right now it’s a lot of cutting and pasting. That’s tedious, and it’s easy to make a mistake, so I decided to see if I could automate it at least a little.

First, I’ve already got Gmail tagging the signups with their own label. gmvault, a nice Python program that can connect to Gmail and back up one’s emails, has recently been enhanced to allow you to pull down emails with a specific label. I log in to Gmail, check what the proper “in:” string is to select those emails, and then run gmvault to download just those:

gmvault sync --type custom --gmail-req "in:stillstream-inbox-artists" my.email@gmail.com

Joes-MacBook-Pro:~ joemcmahon$ cd gmvault-db/db/

Joes-MacBook-Pro:db joemcmahon$ ls

2014-05 2014-06 2014-07 2014-08 2014-09 2014-10 chats

(The ‘chats’ directory only contains Google Chat messages, so I’ll just ignore it.) Now I’ve got the emails, and I need to process them. Each email is essentially a dump of data from the signup forms. Here’s a sample email body – I’ve dropped the headers because they’re all the same and don’t contain any data I use for the process, and the field contents are redacted so my example artist doesn’t get spammed.

An artist has digitally signed the StillStream agreement.

Full Name: [John Doe]

Artist Name: [Does in the Woods]

URL: [http:/doesinthewoods.com]

Addr1: [123 Someplace Street]

Addr2: []

City: [Bay City]

State: [AK]

Zip: [99999]

Country: [US]

Email: [john.doe@doesinthewoods.com]

Phone: [715-555-1212]

Agreement Version: [3]

REMEMBER – artist is not in the database automatically any more – you must add them BY HAND.

What I need to do now is transform this into a nice MySQL UPDATE or INSERT statement I can execute on the StillStream server to add the artist to our database. A quick little Perl program seems the right idea.
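As a rough illustration of that transformation, here is a minimal sketch, written in Python rather than the Perl I had in mind. The table name `artists` and every column name in `COLUMNS` are made up for the example and are not the real StillStream schema. It pulls the `Field: [value]` pairs out of an email body and builds a parameterized INSERT, so the values are never pasted directly into the SQL text.

```python
import re

# Matches lines such as "Artist Name: [Does in the Woods]".
FIELD = re.compile(r"^([A-Za-z0-9 ]+):\s*\[(.*)\]\s*$")

# Hypothetical mapping from email field names to database columns.
COLUMNS = {
    "Full Name": "full_name",
    "Artist Name": "artist_name",
    "URL": "url",
    "Addr1": "addr1",
    "Addr2": "addr2",
    "City": "city",
    "State": "state",
    "Zip": "zip",
    "Country": "country",
    "Email": "email",
    "Phone": "phone",
    "Agreement Version": "agreement_version",
}


def parse_signup(body):
    """Return a dict of field name -> value for every 'Name: [value]' line."""
    fields = {}
    for line in body.splitlines():
        match = FIELD.match(line.strip())
        if match:
            fields[match.group(1)] = match.group(2)
    return fields


def to_insert(fields, table="artists"):
    """Build a parameterized INSERT and its value list from parsed fields."""
    known = [name for name in fields if name in COLUMNS]
    columns = ", ".join(COLUMNS[name] for name in known)
    placeholders = ", ".join(["%s"] * len(known))
    sql = f"INSERT INTO {table} ({columns}) VALUES ({placeholders})"
    return sql, [fields[name] for name in known]
```

Lines that don’t look like `Name: [value]` (the opening sentence and the REMEMBER reminder) are simply skipped, and an empty field such as `Addr2: []` comes through as an empty string.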

October 25, 2014
xmonad on OS X Mavericks
I installed XQuartz today, and while looking around for a low-distraction window manager, I came across xmonad. It looked interesting, and I started following the installation instructions and found they were out of date. Here’s an updated set of instructions for installing xmonad.
1. Install XQuartz.
2. Install homebrew if you don’t already have it.
3. brew update
4. brew install ghc cabal-install wget
5. cabal update
6. export LIBRARY_PATH=/usr/local/lib:/usr/X11/lib
7. cabal install xmonad
8. Launch XQuartz and go to Preferences (command-,). Set the following:
  - Output
    
    Enable “Full-screen mode”
  - Input
    
    Enable “Emulate three button mouse”
    
    Disable “Follow system keyboard layout”
    
    Disable “Enable key equivalents under X11”
    
    Enable “Option keys sent Alt_L and Alt_R”
  - Pasteboard
    
    Enable all of the options
xmonad has been installed in $HOME/.cabal/bin/xmonad. You now need to create an .xinitrc that will make XQuartz run xmonad. Edit ~/.xinitrc and add these lines:
```
[[ -f ~/.Xresources ]] && xrdb -load ~/.Xresources
xterm &
$HOME/.cabal/bin/xmonad
```
You can now launch XQuartz; nothing seems to happen, but press command-option-A and the xmonad “desktop” (one huge xterm) will appear, covering the whole screen. Great! It’s using the default teeny and nasty xterm font, though. Let’s pretty it up a bit by making it use Monaco instead. Edit ~/.Xresources (the file the .xinitrc above loads) and add these lines:
```
xterm*background: Black
xterm*foreground: White
xterm*termName: xterm-color
xterm*faceName: Monaco
```
Quit XQuartz with command-Q, relaunch it, and hit command-option-A again to bring the xmonad desktop back. The terminal should now be displaying in Monaco.

At this point, you should take a look at the guided tour and get familiar with xmonad. If you’re looking for a distraction-free working environment, this might be good for you. I’m going to give it a try and see how it works out.
October 23, 2014

Shellshock scanner

So I had a bunch of machines with a standard naming convention that I needed to scan for the Shellshock bug. Since I just needed to run a command on each one and check the output, and I had SSH access, it seemed easy enough to put together a quick script to manage the process.

Here’s a skeleton of that script, with the details on what machines I was logging into elided. This does a pretty reasonable job, checking 300 machines in about a minute. You need to have a more recent copy of Parallel::ForkManager, as versions prior to 1.0 don’t have the ability to return a data structure from the child.

$|++;    # unbuffer STDOUT so progress shows up as each host is checked
use strict;
use warnings;
use Parallel::ForkManager 1.07;

my $MAX_PROCESSES = 25;
my $pm = Parallel::ForkManager->new($MAX_PROCESSES);
my @servers = @SERVER_NAMES;    # host list elided; @SERVER_NAMES must be defined first
my %statuses;       # status => list of hosts that reported it
my @diagnostics;    # host name and raw output for results we couldn't classify

# Runs in the parent as each child exits, and files that child's result.
$pm->run_on_finish(
    sub {
        my ($pid, $exit_code, $ident, $exit_signal, $core_dump,
            $data_structure_reference) = @_;
        if (defined($data_structure_reference)) {
            my ($host_id, $status, $results) = @{$data_structure_reference};
            if ($status eq 'Unknown') {
                push @diagnostics, $host_id, $results;
            } else {
                push @{ $statuses{$status} }, $host_id;
            }
        } else {
            warn qq|No message received from child process $pid!\n|;
        }
    }
);

print "Testing servers: ";
for my $host_id (@servers) {
    # The parent gets the child's PID and moves on; the child falls through.
    my $pid = $pm->start and next;
    # Run the Shellshock probe on the remote host, capturing stdout and stderr.
    my $result = <<`EOF`;
ssh -o StrictHostKeyChecking=no $host_id <<'ENDSSH' 2>&1
env x='() { :;}; echo vulnerable' bash -c "echo this is a test"
ENDSSH
EOF
    # Classify the host by what came back from ssh or from bash.
    my $status;
    if ($result =~ /Permission denied/is) {
        $status = q{Inaccessible};
    } elsif ($result =~ /key verification failed/s) {
        $status = q{Key changed};
    } elsif ($result =~ /timed out/is) {
        $status = q{Timed out};
    } elsif ($result =~ /vulnerable/s) {
        $status = q{Vulnerable};
    } elsif ($result =~ /ignoring function definition attempt/s) {
        $status = q{Patched};
    } elsif ($result =~ /Name or service not known/s) {
        $status = q{Nonexistent};
    } else {
        $status = q{Unknown};
    }
    print "$host_id, ";
    # Hand the result back to the parent's run_on_finish callback.
    $pm->finish(0, [$host_id, $status, $result]);
}
$pm->wait_all_children;
print "done!\n";
for my $status (sort keys %statuses) {
    print "$status: ", join(',', @{ $statuses{$status} }), "\n";
}
if (@diagnostics) {
    print "The following hosts returned an undiagnosed status:\n",
          join("\n", @diagnostics), "\n";
}

Note that this doesn’t test the most recent version (#3) of the bug; I have modified it slightly to test for that, but that’s a reasonable exercise for the reader.

September 28, 2014

ETL into WordPress: lessons learned

I had a chance this weekend to do a little work on importing a large (4000 or so articles and pages) site into WordPress. It was an interesting bit of work, with a certain amount of learning required on my part – which translated into some flailing around to establish the toolset.

Lesson 1: ALWAYS use a database in preference to anything else when you can.

I wasted a couple hours trying to clean up the data for CSV import using any of a number of WordPress plugins. Unfortunately, CSV import is half-assed at best – more like about quarter-assed, and any cleanup in Excel is excruciatingly slow.

Some of the data came out with mismatched quotes, leaving me with aberrant entries in the spreadsheet that caused Excel to throw an out-of-memory error and refuse to process them when I tried to delete the bad rows or even cells from those bad rows.

Even attempting to work with the CSV data using Text::CSV in Perl was problematic because the site export data (from phpMyAdmin) was fundamentally broken. I chalk that partially up to the charset problems we’ll talk about later.

I loaded up the database using MAMP, which worked perfectly well, and was able to use Perl DBI to pull the pages and posts out without a hitch, even the ones with weirdo character set problems.

Lesson 2: address character set problems first

I had a number of problems with the XMLRPC interface to WordPress (which otherwise is great, see below) when the data contained improperly encoded non-ASCII characters. I was eventually forced to write code to swap the strings into hex, find the bad 3- and 4-character runs, and replace them with the appropriate Latin-1 substitutes. (Note that these don’t quite match that table; I had to look for the ‘e2ac’ or ‘c3’ delimiter characters in the input to figure out where the bad characters were.) Once I hit on this idea, it worked very well.
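To make the idea concrete, here is a small sketch in Python (not the Perl I actually wrote, which isn’t shown here). The entries in `SUBSTITUTES` and the marker bytes 0xC3 and 0xE2 are illustrative assumptions, not the real list from this import. One helper dumps the hex of any run that starts at a marker byte so you can see what needs fixing; the other swaps each known bad run for its Latin-1 replacement.

```python
# Hypothetical table: hex of a bad byte run -> the character it should become.
SUBSTITUTES = {
    "c3a9": "\u00e9",   # e with acute accent
    "e28099": "'",      # curly apostrophe, flattened to a plain one
}

# Assumed marker bytes that start a suspicious run.
MARKERS = (0xC3, 0xE2)


def suspect_runs(raw, width=3):
    """Return the hex of each `width`-byte run that starts at a marker byte."""
    return [raw[i:i + width].hex() for i, byte in enumerate(raw) if byte in MARKERS]


def repair(raw):
    """Replace every known bad run in `raw` with its Latin-1 substitute."""
    # Longest runs first, so a short entry can't clobber part of a longer one.
    for bad_hex in sorted(SUBSTITUTES, key=len, reverse=True):
        good = SUBSTITUTES[bad_hex].encode("latin-1")
        raw = raw.replace(bytes.fromhex(bad_hex), good)
    return raw
```

In practice the table grows as `suspect_runs` turns up sequences that `repair` doesn’t know about yet.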

Lesson 3: build in checkpointing from the start for large import jobs

The various problems ended up causing me to repeatedly wipe the WordPress posts database and restart the import, which wasted a lot of time. I did not count that toward the overall time needed to complete when I charged my client. If I had, it would have been more like 20-24 hours instead of 6. Fortunately the imports were, until a failure occurred, a start-it-and-forget-it process. It was necessary to wipe the database between tries because WordPress otherwise very carefully preserves all the previous versions, and cleaning them out is even slower.

I hit on the expedient of recording the row ID of an item each time one successfully imported and dumping that list out in a Perl END block. If the program fell over and exited due to a charset problem, I got a list of the rows that had processed OK which I could then add to an ignore list. Subsequent runs could simply exclude those records to get me straight to the stuff I hadn’t done yet and avoid duplicate entries.

I had previously tried just logging the bad ones and going back to redo those, but it turned out to be easier to exclude than include.
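The checkpointing pattern can be sketched like this, in Python rather than Perl, with a `try`/`finally` standing in for the END block. The JSON checkpoint file and the `import_one` callback are assumptions made for the example; the real import called the WordPress XMLRPC interface at that point.

```python
import json
from pathlib import Path


def run_import(rows, import_one, checkpoint):
    """Import (row_id, row) pairs, skipping ids saved by earlier runs."""
    checkpoint = Path(checkpoint)
    done = set(json.loads(checkpoint.read_text())) if checkpoint.exists() else set()
    try:
        for row_id, row in rows:
            if row_id in done:
                continue            # already imported on a previous run
            import_one(row)         # may raise, e.g. on a charset problem
            done.add(row_id)
    finally:
        # Like the END block: always save what succeeded, even after a crash.
        checkpoint.write_text(json.dumps(sorted(done)))
    return done
```

Because the list is written out even when the import dies partway through, the next run picks up at the first row that hasn’t been done, and nothing gets imported twice.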

Lesson 4: WordPress::API and WordPress XMLRPC are *great*.

I was able to find the WordPress::API module on CPAN, which provides a nice object-oriented wrapper around WordPress XMLRPC. With that, I was able to programmatically add posts and pages about as fast as I could pull them out of the local database.
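For a sense of what a wrapper like that does underneath, here is a rough Python sketch that calls WordPress’s `wp.newPost` XML-RPC method directly through the standard library’s `xmlrpc.client`, instead of going through WordPress::API. The endpoint URL, blog ID, and credentials in the usage comment are placeholders.

```python
import xmlrpc.client


def add_post(server, blog_id, username, password, title, body, post_type="post"):
    """Create one published post or page via wp.newPost; returns the new ID."""
    content = {
        "post_type": post_type,       # "post" or "page"
        "post_status": "publish",
        "post_title": title,
        "post_content": body,
    }
    return server.wp.newPost(blog_id, username, password, content)


# Usage (placeholder endpoint and credentials):
#   server = xmlrpc.client.ServerProxy("https://example.com/xmlrpc.php")
#   post_id = add_post(server, 1, "importer", "secret", "Hello", "<p>Hi</p>")
```

Taking the server proxy as a parameter keeps the function easy to exercise against a stand-in object before pointing it at a live site.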

Lesson 5: XMLRPC just doesn’t support some stuff

You can’t add users or authors via XMLRPC, sadly. In the future, the better thing to do would probably be to log directly in to the server you’re configuring, load the old data into the database, and use the PHP API calls directly to create users and authors as well as directly load the data into WordPress. I decided not to embark on this, this time, because I’m faster and more able in Perl than I am in PHP, and I decided it would be faster to go that way than try to teach myself a new programming language and solve the problem simultaneously.

Overall

I’d call this mostly successful. The data made it into the WordPress installation, and I have an XML dump from WordPress that will let me restore it at will. All of the data ended up where it was supposed to go, and it all looks complete. I have a stash of techniques and sample code to work with if I need to do it again.

June 21, 2014
Bluetooth, LineIn, Soundflower: talking over Skype and playing music
2025: This is ridiculously outdated. In the era of Zoom, you just set the camera up on the tripod and go. Left here for reference when someone says, “hey, old-timer, what was doing classes like before Zoom and FaceTime?”.

Someone who wants to teach dance classes online asked me if there was a reasonable way (i.e., without spending a lot of money) to set up a Skype link that can be used for both music and a wireless microphone setup.

The plan is to put something together that allows her to
- Get far enough away from the camera that she can be seen head to toe (being able to see the footwork is important) and with a wide enough angle that she doesn’t have to dance unnaturally in one spot.
- Send iTunes output and her voice over the line at the same time to one or more people, in sync to the music.
- Have some kind of a wireless mic to be able to communicate to her students without shouting.
- Be able to hear her students talk back without their hearing their own voices delayed, or her hearing her own voice delayed.
This turns out to be more complicated than it might seem. The iSight camera doesn’t work very well for this; its field of view is quite narrow, and on top of that it’s very difficult to adjust it so that it’s pointed properly. This was relatively easy to solve: a Logitech HD Pro Webcam C920 works fine for both the wide-angle and head-to-toe issues; it can be mounted on a tripod (it has the necessary threading to mount on a standard photo tripod), and after an upgrade to a more powerful laptop – her 2008 MacBook Air was just not cutting it! – the video issue was solved.

The audio issue was thornier. Originally, I hit up Sweetwater Sound for a real wireless mic setup; after realizing this was going to be well north of $300 once I got the mic, the base station, and the computer interface to actually hook it up with, and that this was going to be a lot of different hardware issues to deal with as well, I decided I’d better scout around for a better option.

I was stuck until the instructor suggested a Bluetooth headset instead. It’s a reasonable, good-enough audio input channel at 8 kHz – she wants to talk across it, not record studio-quality audio, so a little bit tinny is OK – and it’s definitely wireless. After a bit of investigation, I settled on the Jawbone ERA as the most-likely-workable option. The ERA is light, small, fits tightly (important for a dancer) and is the current best headset suggestion from Wirecutter, who I have learned to trust on stuff like this. It’s easy to connect a Bluetooth headset to OS X (getting it to talk properly to the software’s a different issue, see below). This takes a lot of hardware complication out of the way. Skype supports Bluetooth, so I thought I’d solved the problem.

Unfortunately, an audio test with the music and voice both going through the Bluetooth mic showed me I’d have to get more creative; the music was either inaudible or distorted (that 8 kHz bandwidth made it sound hideous, when you could hear it at all). It needed to be audible and undistorted if it was going to be possible for a student on the far end to use it to dance along with.

A lot of Googling finally led me to thisevilempire’s blog entry on how to play system audio in Skype calls on OS X. This got me part of the way: I had, according to tests with the Skype Audio Tester “number”, gotten the audio to play nicely across the link, but I was getting a half-second delay of my voice back on the same channel, which made it hard to talk continuously. Not good enough for an instructor.

More searching found a post on Lockergnome spelling out how to transmit clean audio, overlay voice, and hear the returned call without an echo. Here’s how:
1. Install Soundflower and LineIn, both free.
2. Make sure the Bluetooth headset is on.
3. Open the Sound preference pane in System Preferences.
4. Set the
  1. Jawbone ERA as the input device
  2. Soundflower (64ch) as the output.
5. Duplicate LineIn in the Applications folder, and rename both copies: one to “LineIn Bluetooth” and the other to “LineIn System”. The names aren’t important; this is just so you can tell them apart.
6. Launch both copies of LineIn. You’ll need to drag one window aside to reveal them both; they initially launch in exactly the same spot.
7. Choose the “LineIn Bluetooth” instance in the Dock, and set
  1. Input to “ERA by Jawbone”
  2. Output to Soundflower (2ch).
  3. Click the “Pass thru” button.
8. Select the other instance, “LineIn System”, and set
  1. Input to Soundflower (16ch)
  2. Output to Soundflower (2ch).
  3. Click the “Pass thru” button.
9. Run Soundflowerbed (installed in the Applications folder by the Soundflower install). In the menu bar, click on the little flower icon, and
  1. Select “None” under Soundflower (2ch)
  2. Select “Built-in Output” under Soundflower (16ch).
10. Run Skype, and open its preferences.
  1. Select “Soundflower 2ch” in its Microphone pulldown, and leave everything else alone.
  2. If you have an alternate camera attached, switch the Camera pulldown to the appropriate camera.
You should now be able to make a Skype call, and play music from iTunes, DVD Player, or YouTube over the wire at full fidelity, and talk at the same time. You should hear the far end’s voice on your speakers, along with the music you’re sending across (undelayed).

Try to keep the headset away from the speakers to minimize the chances of feedback.

It’s not all that difficult; it’s just the tricky bits of being able to reroute the audio internally via the two LineIn instances and Soundflower. Getting those tricky bits right is the difficult part.

I’ve tested this with the Skype test call and it seems to have worked; the big test will be the full-up video camera plus the streaming audio. We’ll give that a shot soon and I’ll follow up on whether the Bluetooth mic is good enough, or if a better mic is needed.

Update: Undoing the process!

It’s necessary to restore the normal audio routing after the call; you can do this with System Preferences.
1. Open System Preferences and select Sound.
2. Set Input to Internal Microphone. If you’re wearing the ERA, it will make a little descending bleep to let you know it’s been disconnected.
3. Set Output to Internal Speakers.
4. Quit both copies of LineIn.
5. Check the Soundflowerbed menu; it should have both Soundflower 2ch and Soundflower 64ch pointing to None. Quit Soundflowerbed.
6. Turn off the Bluetooth headset; put it on its charger for a while.
7. Quit Skype.
You should be all set.
March 11, 2014
Mojolicious Revolutions

3rd in my series of talks at SVPerl about Mojolicious; this one reviews using the server-side features for building bespoke mock servers, and adds a quick overview of the Mojo client features, which I had missed until last week. Color me corrected.

Mojolicious Revolutions

March 7, 2014

youtube-dl: it just works

I was having trouble watching the Théâtre du Châtelet performance of Einstein on the Beach at home; my connection was stuttering and buffering, which makes listening to highly-pulsed minimalist music extremely unrewarding. Nothing like a hitch in the middle of the stream to throw you out of the zone that Glass is trying to establish. (This is a brilliant staging of this opera and you should go watch it Right Now.)

So I started casting around for a way to download the video and watch it at my convenience. (Public note: I would never redistribute the recording; this is solely to allow me to timeshift the recording such that I can watch it continuously.) I looked at the page and thought, “yeah, I could work this out, but isn’t there a better way?” I searched for a downloader for the site in question, and found it mentioned in a comment in the GitHub pages for youtube-dl.

I wasn’t 100% certain that this would work, but a quick perusal seemed to indicate that it was a nicely sophisticated Python script that ought to be able to do the job. I checked it out and tried a run; it needed a few things installed, most importantly ffmpeg. At this point I started getting a little excited, as I knew ffmpeg should technically be quite nicely able to do any re-encoding etc. that the stream might need.

A quick brew install later, I had ffmpeg, and I asked for the download (this is where we’d gotten to while I’ve been writing this post):

$ youtube_dl/__main__.py http://culturebox.francetvinfo.fr/einstein-on-the-beach-au-theatre-du-chatelet-146813
 [culturebox.francetvinfo.fr] einstein-on-the-beach-au-theatre-du-chatelet-146813: Downloading webpage
 [culturebox.francetvinfo.fr] EV_6785: Downloading XML config
 [download] Destination: Einstein on the beach au Théâtre du Châtelet-EV_6785.mp4
 ffmpeg version 1.2.1 Copyright (c) 2000-2013 the FFmpeg developers
 built on Jan 12 2014 20:50:55 with Apple LLVM version 5.0 (clang-500.2.79) (based on LLVM 3.3svn)
 configuration: --prefix=/usr/local/Cellar/ffmpeg/1.2.1 --enable-shared --enable-pthreads --enable-gpl --enable-version3 --enable-nonfree --enable-hardcoded-tables --enable-avresample --enable-vda --cc=cc --host-cflags= --host-ldflags= --enable-libx264 --enable-libfaac --enable-libmp3lame --enable-libxvid
 libavutil 52. 18.100 / 52. 18.100
 libavcodec 54. 92.100 / 54. 92.100
 libavformat 54. 63.104 / 54. 63.104
 libavdevice 54. 3.103 / 54. 3.103
 libavfilter 3. 42.103 / 3. 42.103
 libswscale 2. 2.100 / 2. 2.100
 libswresample 0. 17.102 / 0. 17.102
 libpostproc 52. 2.100 / 52. 2.100
 [h264 @ 0x7ffb5181ac00] non-existing SPS 0 referenced in buffering period
 [h264 @ 0x7ffb5181ac00] non-existing SPS 15 referenced in buffering period
 [h264 @ 0x7ffb5181ac00] non-existing SPS 0 referenced in buffering period
 [h264 @ 0x7ffb5181ac00] non-existing SPS 15 referenced in buffering period
 [mpegts @ 0x7ffb52deb000] max_analyze_duration 5000000 reached at 5013333 microseconds
 [mpegts @ 0x7ffb52deb000] Could not find codec parameters for stream 2 (Unknown: none ([21][0][0][0] / 0x0015)): unknown codec
 Consider increasing the value for the 'analyzeduration' and 'probesize' options
 [mpegts @ 0x7ffb52deb000] Estimating duration from bitrate, this may be inaccurate
 [h264 @ 0x7ffb51f9aa00] non-existing SPS 0 referenced in buffering period
 [h264 @ 0x7ffb51f9aa00] non-existing SPS 15 referenced in buffering period
 [hls,applehttp @ 0x7ffb51815c00] max_analyze_duration 5000000 reached at 5013333 microseconds
 [hls,applehttp @ 0x7ffb51815c00] Could not find codec parameters for stream 2 (Unknown: none ([21][0][0][0] / 0x0015)): unknown codec
 Consider increasing the value for the 'analyzeduration' and 'probesize' options
 Input #0, hls,applehttp, from 'http://ftvodhdsecz-f.akamaihd.net/i/streaming-adaptatif/evt/pf-culture/2014/01/6785-1389114600-1-,320x176-304,512x288-576,704x400-832,1280x720-2176,k.mp4.csmil/index_2_av.m3u8':
 Duration: 04:36:34.00, start: 0.100667, bitrate: 0 kb/s
 Program 0
 Metadata:
 variant_bitrate : 0
 Stream #0:0: Video: h264 (Main) ([27][0][0][0] / 0x001B), yuv420p, 704x396, 12.50 fps, 25 tbr, 90k tbn, 50 tbc
 Stream #0:1: Audio: aac ([15][0][0][0] / 0x000F), 48000 Hz, stereo, fltp, 102 kb/s
 Stream #0:2: Unknown: none ([21][0][0][0] / 0x0015)
 Output #0, mp4, to 'Einstein on the beach au Théâtre du Châtelet-EV_6785.mp4.part':
 Metadata:
 encoder : Lavf54.63.104
 Stream #0:0: Video: h264 ([33][0][0][0] / 0x0021), yuv420p, 704x396, q=2-31, 12.50 fps, 90k tbn, 90k tbc
 Stream #0:1: Audio: aac ([64][0][0][0] / 0x0040), 48000 Hz, stereo, 102 kb/s
 Stream mapping:
 Stream #0:0 -> #0:0 (copy)
 Stream #0:1 -> #0:1 (copy)
 Press [q] to stop, [?] for help
 frame=254997 fps=352 q=-1.0 size= 1072839kB time=02:49:59.87 bitrate= 861.6kbits/s

Son of a gun. It works.

I’m waiting for the download to complete to be sure I got the whole video, but I am pretty certain this is going to work. Way better than playing screen-capture games. We’ll see how it looks when we’re all done, but I’m quite pleased to have it at all. The download appears to be happening at about 10x realtime, so I should have it all in about 24 minutes, give or take (it’s a four-hour, or 240 minute, presentation).

Update: Sadly, does not work for PBS videos, but you can actually buy those; I can live with that.

January 12, 2014

Test::Routine slides

This is my Test::Routine slide deck for the presentation I ended up doing from memory at the last SVPerl.org meeting. I remembered almost all of it except for the Moose trigger and modifier demos – but since I didn’t have any written yet, we didn’t miss those either!

Update: My WordPress installation seems to have misplaced this file. I’ll look around for it and try to put it back soon.

December 7, 2013
Intro to Perl Testing at SVPerl

A nice evening at SVPerl – we talked about the basic concepts of testing, and walked through some examples of using Test::Simple, Test::More, and Test::Exception to write tests. We did a fair amount of demo that’s not included in the slides – we’ll have to start recording these sometime – but you should be able to get the gist of the talk from the slides.

November 8, 2013