Things I know about stuff

Debeasy - Read .deb File Metadata in Ruby

As part of a script to manage apt-repos at $dayjob, I realised it would be nice to have a way to read Debian package file metadata from the file without having to parse the output of dpkg and friends.

Enter Debeasy.

It’s a very simple gem to extract package metadata. You can also find it on RubyGems.
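For context, a .deb file is an ar archive whose members are `debian-binary`, a control tarball and a data tarball, and the package metadata lives in the control tarball. The sketch below is not Debeasy's implementation or API; it only illustrates the first step, walking the ar container in plain Ruby without calling dpkg. The method name `read_ar_members` is made up for this example.

```ruby
AR_MAGIC = "!<arch>\n"

# Read an ar archive from +io+ and return a hash of member name => raw bytes.
# Illustrative only: it ignores GNU long-name tables and other ar extensions.
def read_ar_members(io)
  raise ArgumentError, "not an ar archive" unless io.read(8) == AR_MAGIC

  members = {}
  # Each member starts with a fixed 60-byte header.
  while (header = io.read(60))
    break if header.bytesize < 60

    name = header[0, 16].strip.sub(%r{/\z}, "") # some ar writers append "/"
    size = header[48, 10].strip.to_i            # decimal size of the payload
    members[name] = io.read(size)
    io.read(1) if size.odd?                     # payloads are padded to an even length
  end
  members
end
```

Opening a package with `File.open(path, "rb") { |f| read_ar_members(f).keys }` should list the three members; the control tarball would then need to be decompressed and untarred to reach the `control` file itself.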

Hadoop JobTracker REST Interface

Hadoop’s JobTracker and NameNode are very bad at presenting information about themselves in an easily consumable fashion. Sure, you can probe them with JMX, but that limits your probing toolset to Java or something that runs on the JVM (e.g. JRuby, Jython, Scala or Clojure). Plus, it means dealing with the verbose and fairly opaque JMX interface.

For the JobTracker, it would be great if there were a simple REST API to query jobs and their properties.

To that end, I’ve written a very simple JRuby & Sinatra app that connects to the JobTracker using the Hadoop libraries, and presents information about jobs in progress, queued, failed, etc via a simple REST interface, with JSON as the output format.

You can find it on GitHub. It was written for use with internal systems at Forward3D.
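To give a feel for the shape of such an interface, here is a minimal sketch using only Ruby's standard library. Sinatra sits on top of Rack, and a Rack app is just an object that responds to `call(env)` and returns a status, headers and a body, so the sketch builds one by hand. The `Job` struct, the `/jobs` routes and the JSON fields are assumptions made for illustration, not the real app's API; the real app gets its data from the Hadoop libraries.

```ruby
require "json"

# Hypothetical job record; a real app would fill these in from the Hadoop
# client libraries rather than from a list passed in by hand.
Job = Struct.new(:id, :name, :state)

# Build a Rack-style app (anything that responds to #call(env)) over +jobs+.
def jobs_app(jobs)
  headers = { "content-type" => "application/json" }
  lambda do |env|
    # Assumed routes: /jobs lists everything, /jobs/<state> filters by state.
    match = %r{\A/jobs(?:/(\w+))?\z}.match(env["PATH_INFO"].to_s)
    if match.nil?
      [404, headers, [JSON.generate({ "error" => "not found" })]]
    else
      state = match[1]
      selected = state ? jobs.select { |job| job.state == state } : jobs
      [200, headers, [JSON.generate(selected.map(&:to_h))]]
    end
  end
end
```

Calling `jobs_app(jobs).call({ "PATH_INFO" => "/jobs/failed" })` returns the failed jobs as a JSON array, and the same lambda could be handed to any Rack server.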

I did plan to write something similar for the NameNode, but never got around to it. One day…

Puppet With Embedded Ruby

One thing that sucks about deploying Ruby apps is that your configuration management tool (e.g. Puppet) is usually also written in Ruby, which means it can’t manage the installed Ruby version easily. Also, you may get into a situation where your app requires one version of Ruby, and Puppet requires another.

Opscode’s approach to this with Chef is to ship a completely standalone Chef, with all its dependencies completely built into a single package, running from an embedded Ruby interpreter. They called this Omnibus packaging.

Omnibus is quite easy to pick up and use, so I’ve used it to create a Puppet Omnibus package. It comes with all the gems required for Puppet to run, and Ruby 1.9.3 to run it. You can find the code for building your own package in my repository on GitHub.

I’ve taken the approach of not building all binary dependencies from source; instead, I have the final OS package require other OS packages containing the binary dependencies it needs (which amounts to OpenSSL, libaugeas and a few others). This saves a bit of time and complexity, but at the risk of a breaking change in those libraries breaking the Omnibus package. It’s entering production now, so we’ll see how it goes.

At some point in the future I may host a package repo for it.
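The split described above, between dependencies bundled into the package and dependencies left to the OS, might look roughly like this in an Omnibus project definition. This is a sketch, not the contents of the actual repository: the file name, install directory, version and package names are all hypothetical, and the method names follow the Omnibus project DSL as found in recent releases, so check them against the Omnibus version you are using. It is a configuration fragment that only runs inside Omnibus.

```ruby
# config/projects/puppet.rb (hypothetical file name and values throughout)
name "puppet"
maintainer "Your Name <you@example.com>"
homepage "https://example.com/puppet-omnibus"

install_dir "/opt/puppet-omnibus"
build_version "3.1.0"
build_iteration 1

# Built from source by Omnibus and shipped inside the package.
# Each name needs a matching software definition in the project.
dependency "ruby"
dependency "puppet"

# Not built by Omnibus: the finished package simply declares a dependency on
# these OS packages. The names are examples and vary by distribution.
runtime_dependency "openssl"
runtime_dependency "libaugeas0"
```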

Ruby DAAP Server

DAAP is Apple’s closed music-sharing/streaming protocol that’s embedded in iTunes and other media products.

My general experience with iTunes music sharing is that it works pretty well. At work I use OpenVPN to tunnel back into my home network, since iTunes has the restriction of only allowing libraries to be shared on a local network.

I’ve previously used forked-daapd and Tangerine as my DAAP server. However, I’ve never been fully happy with either, and I thought it would be an interesting Ruby project to write a DAAP server from scratch.

The fruit of my labour is the (un)imaginatively titled rubydaap.

It’s a simple Sinatra app which uses Thin as the Rack webserver. The database is MongoDB, because it’s fast, simple to use, and DAAP doesn’t require us to consider data in a relational way. I used dmap-ng for the DAAP protocol parsing, which saved a bunch of time, along with the excellent TagLib bindings for Ruby. After the initial scan, it uses inotify on Linux (and equivalents on other platforms) to watch for file changes.
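For readers unfamiliar with the wire format, DAAP responses are built from DMAP chunks: a four-character code, a four-byte big-endian length, then the payload, with container chunks holding further chunks as their payload. The sketch below hand-rolls that framing in plain Ruby to show the idea. It is not dmap-ng's API, and the helper names are invented for this example.

```ruby
# Encode one DMAP chunk: 4-byte ASCII code, 4-byte big-endian length, payload.
def dmap_chunk(code, payload)
  payload = payload.b # treat the payload as raw bytes
  [code, payload.bytesize].pack("a4N") + payload
end

# Encode a chunk whose payload is a 32-bit unsigned integer.
def dmap_uint32(code, value)
  dmap_chunk(code, [value].pack("N"))
end

# Split +data+ into [code, payload] pairs, one level deep.
# Container payloads can be passed back in to walk the next level down.
def dmap_parse(data)
  data = data.b
  chunks = []
  offset = 0
  while offset + 8 <= data.bytesize
    code, length = data.byteslice(offset, 8).unpack("a4N")
    chunks << [code, data.byteslice(offset + 8, length)]
    offset += 8 + length
  end
  chunks
end
```

A single track could then be encoded as `dmap_chunk("mlit", dmap_uint32("miid", 42) + dmap_chunk("minm", "Blue Monday"))`, where `mlit`, `miid` and `minm` are the DMAP codes for a listing item, an item id and an item name.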

rubydaap is a bit broken right now, as it needs to be more careful about how it reads files in the Scanner thread – it’s possible to read MP3 files with funky tags that’ll cause the app to stop. If you have a reasonably sensible music collection, it should work pretty well. When I get a free day or so, I’ll make the Scanner much more robust (I’ve managed to crash Ruby reading a bad MP3 file’s tags using taglib-ruby!).
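One possible direction for a more forgiving scanner, sketched under assumptions: wrap each file's tag read in its own `rescue`, so that one bad file is recorded and skipped rather than stopping the scan. This is not rubydaap's Scanner code; `scan_files` and the `reader` callable are hypothetical stand-ins for whatever actually reads the tags. Note that a `rescue` only catches Ruby exceptions, so it would not help when a native extension crashes the interpreter outright.

```ruby
# Scan +paths+, calling +reader+ (a hypothetical tag-reading callable) on each.
# Returns [tags_by_path, errors_by_path] so bad files can be reported later.
def scan_files(paths, reader)
  tags = {}
  errors = {}
  paths.each do |path|
    begin
      tags[path] = reader.call(path)
    rescue StandardError => e
      # Record the failure and carry on with the next file.
      errors[path] = "#{e.class}: #{e.message}"
    end
  end
  [tags, errors]
end
```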

Apple DAAP Documentation

There used to be an awesome chunk of reverse-engineered documentation on Apple’s DAAP, the proprietary music-sharing protocol they created, at http://tapjam.net/daap, but it seems to have disappeared off the web. I’ve captured it from the Wayback Machine and stuck it on GitHub.

Puppet and Red Hat Satellite/Spacewalk

The place I currently work at recently decided that we would standardise the Linux distributions we had in use, with the aim being to use only Red Hat Enterprise Linux wherever possible. To deliver packages, we ended up with RHN Satellite installed on a VM. Now, we provision pretty much everything possible with Puppet, and Satellite required clicky-clicky in the web interface to configure machines. That’s not how I like to do things.

A note on Satellite/Spacewalk

In case you’re confused, Spacewalk is the open-sourced code that makes up RHN Satellite, with the branding removed. The module described below will work with Spacewalk as well as Satellite. (In theory: I haven’t tested it.)

Assigning machines to Satellite channels with Puppet

Satellite is weird, in that you assign machines to “channels” (repositories) on the Satellite server itself; RHEL clients running yum ask the Satellite: “Hey, what channels am I subscribed to?”, and Satellite returns whichever ones the client is permitted to use. With bog-standard yum repositories, you configure the client to look at a URL, and that’s it.

The first challenge was figuring out how a RHEL client can change the list of channels it is currently subscribed to. The Satellite API doesn’t make any mention of a mechanism for doing this, but there’s a Python script called spacewalk-channel which is part of the rhn-setup package, which can change a client’s channel subscriptions from the client.

This code makes use of an API that doesn’t appear in the Satellite API docs, called up2date. It was pretty simple to figure out how this API works, and engineer a Puppet type & provider that hooks into it.

The code is on GitHub. It can’t handle changing the base channel; only child channel subscriptions. I do have a mechanism for changing the base channel in our environment, but it needs cleaning up before it can be made public (it’s a bit too specific for the way we do things). It also doesn’t register the machine against Satellite, though that should be trivial for you to script – in my environment, our provisioning steps do the registration and install Puppet, at which point the module takes over.
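The core of any such provider is comparing the channels a client currently has with the ones it should have, then issuing the adds and removes. The sketch below shows that comparison in plain Ruby. It is not the module's code: the module talks to the up2date API directly, whereas this example builds `spacewalk-channel` command lines instead, and the method name and channel labels are made up for illustration. It only constructs the argument lists and does not run anything.

```ruby
# Work out which spacewalk-channel invocations would move a client from its
# +current+ child channels to the +desired+ set. Returns an array of argv
# arrays; nothing is executed here.
def channel_commands(current, desired, user:, password:)
  auth = ["--user", user, "--password", password]
  adds = (desired - current).map do |channel|
    ["spacewalk-channel", "--add", "--channel", channel, *auth]
  end
  removes = (current - desired).map do |channel|
    ["spacewalk-channel", "--remove", "--channel", channel, *auth]
  end
  adds + removes
end
```

Each returned array could be passed to `system(*argv)` on the client. Keeping the diffing separate from the execution makes the logic easy to check without a Satellite server to hand.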

Ideas for expansion

One thing I’d really like this module to do is read the child channel’s configuration, and see if it has a GPG key associated with it, then bring the key down to the system and install it. At the moment, I have a separate module that installs all the extra GPG keys for packages we require, and it’s ugly – to do this, I use stschulte’s rpmkey type.