Postings in December 2012

The Movie Database, Grilo and mock testing

This year at Openismus had a strong focus on media files and related metadata. So it isn't surprising that we also got in touch with Grilo. Grilo is a framework for discovering media files and decorating it with relevant metadata. One missing piece was background information about movies, like story summaries, artwork, information about artists. With the support of our customers we decided to change that.

Now an obvious choice for getting information on movies is Amazon's incredibly comprehensive Internet Movie Database. Sadly there doesn't seem to be any official web API that could be used by open source software. If I missed something, please tell me in the comments.

The Movie Database plugin for Grilo

Anyway, with IMDb out of reach we had to look for an alternative - and we've found a great one: The Movie Database. This database is a community driven project, built with the needs of media players in mind. It comes with a great and easy to use API. Like IMDb it seems to know just each and every relevant and irrelevant movie ever made. Some entertaining stuff like movie trivia, goofs and quotes are missing, but for most movies the database knows the IMDb id - so all the entertainment is just one click away.

Now that we've found a good movie database Jens started implementing a TMDb plugin for Grilo, and a first version of the plugin was released with grilo-plugins 0.2.2. I've later finished it while integrating it with our customer's project. Murray took care of examples and documentation.

The plugin implements a metadata resolver: It takes a media object, checks for a title or the unique movie id and then fills the media object with information it found on TMDb. Biggest obstacle when using the plugin is the need for an API key. It should be easy to get. The database operator hope these keys will help them to deal with misbehaving clients.

Another issue is related to Grilo's architecture. Grilo doesn't permit its sources to improve any metadata. So if you interpolated a movie title from the filename and pass a media object with that title to the TMDb plugin, the plugin is not permitted to replace your usually ugly title with the pretty and official title it found in the database. To work around that issue you would not resolve the TMDb data in a single step. Instead you'd do a first resolve operation to just retrieve the TMDb's movie id. In a second step you'd delete your own title, but keep the resolved movie id and let the plugin work with that data:

/* Create dummy media from filename */
media = grl_media_new ();
grl_media_set_title (media, "the_last_unicorn_720p");

/* Ask grilo to resolve the TMDb id */
metadata_key_tmdb_id = grl_registry_lookup_metadata_key (registry, "tmdb-id");
keys = grl_metadata_key_list_new (metadata_key_tmdb_id, GRL_METADATA_KEY_INVALID);
grl_source_resolve_sync (source, media, keys, options, &error);
/* Exercise: Release memory and handle possible errors */

/* Use the resolved id to get full metadata */
movie_id = grl_data_get_string (GRL_DATA (media), metadata_key_tmdb_id);

if (movie_id) {
    grl_data_remove (GRL_DATA (media), GRL_METADATA_KEY_TITLE);
    grl_source_resolve_sync (source, media, keys, options, &error);
    /* ... */

Behind your back the same number of network requests is needed for both approaches.

Testing the Plugin

At Openismus we have a mantra we strongly believe in:

No code is done before it has proper tests.

Or more radically:

It doesn't work if it isn't tested.

In the case of the TMDb plugin this commitment caused us a bit of work. Grilo didn't have any infrastructure for testing network based plugins yet. Also if we'd do those tests we'd like to run them offline:

So we took the challenge and also implemented a mock testing framework for Grilo. It became handy that Grilo already routes all its network access through a dedicated abstraction, so that it can implement request throttling. So we just hooked into that layer and introduced a few environment variables that influence behavior of that layer: First you'd set GRL_NET_CAPTURE_DIR to some folder name and then let your plugin perform a few interesting operations. In a next step you'd edit the recorded files and rename them nicely to suite your needs. Essential in that work is a generated key-value file of the name grl-net-mock-data-$PID.ini which maps URLs to captured responses. Once done you can set GRL_NET_MOCKED to point to that file. Grilo will then stop doing real network requests and answer all requests with the information this file provides.

Details about testing with the Grilo mock framework are in the reference manual. A few examples for using the framework can be found in Grilo's test folder.

Media Discovery with QtGStreamer

Earlier this year we at Openismus proposed a Qt based project that would utilize GStreamer for handling media files. Especially we were interested in using the GstDiscoverer class which provides a really nice and easy to use API for discovering properties of media files, such as the container format and the audio and video formats, but also more interesting things like EXIF information, when used with photos.

Now combining code from different worlds with their different paradigms isn't exactly fun. The resulting code often is a disgusting Frankenstein monster not fitting at any place, unless you wrap one of the libraries to match the project's preferred code style. Luckily in the case of Qt and GStreamer Collabora's George Kiagiadakis created QtGStreamer and therefore did most of the hard work already. Still that library didn't support our beloved GstDiscoverer class yet. So we had the choice: Use something different, or wrap that thing. Now we love doing free software, also we use GstDiscoverer with great success in the Rygel UPnP AV/DLNA Media Server already, and in the end the media files shall get played via GStreamer in the end. So we decided to just wrap that class for QtGStreamer.

Doing that work actually was surprisingly easy: A few loose ends here (#680235), a bit of nitpicking there (#680233, #GB680237). Biggest effort was doing the regression tests. This tests also demonstrate how easy the wrapped GstDiscoverer is to use. Synchronous media discovery is done like that:

QGst::DiscovererPtr discoverer = QGst::Discoverer::create(QGst::ClockTime::fromSeconds(1));
QGst::DiscovererInfoPtr info;

try {
    info = discoverer->discoverUri("file:///home/mathias/blockbuster.ogv");
} catch(const QGlib::Error &error) {
    qWarning("Discovery failed: %s", qPrintable(error.message()));
    // ...maybe also check error.domain() and .code()

You also can try asynchronous discovery if you have a Qt build that integrates GMainLoop:

QGst::DiscovererPtr discoverer = QGst::Discoverer::create(QGst::ClockTime::fromSeconds(1));

// Connect C++ member methods to the signals
QGlib::connect(discoverer, "starting", this, &DiscovererTest::onStartingDiscovery);
QGlib::connect(discoverer, "discovered", this, &DiscovererTest::onUriDiscovered);
QGlib::connect(discoverer, "finished", this, &DiscovererTest::onDiscoveryFinished, QGlib::PassSender);


QEventLoop loop;

Usually only X11 builds match that requirement, but it should be possible to just hook QEventDispatcherGlib into your own application if needed.

The discovered data is accessible by the various attributes and methods of QGst::DiscovererInfo:

QGst::DiscovererInfoPtr info = ...;

qDebug() << info->uri();
qDebug() << info->tags();
qDebug() << info->duration();
// ...

Q_FOREACH(const QGst::DiscovererVideoInfoPtr &info, info->videoStreams()) {

Sadly our customer wasn't that much a fan of Qt as we thought, so we didn't have much use of our own for this work yet. This situation also delayed finishing the last few bits of that patches. Luckily Murray just took the time recently to do that last bits of work, and to get the patches merged. The code is in the git repository now and should get released with QtGStreamer 0.10.3. So whenever your Qt application needs to discover media file properties you also can use QtGStreamer now.