Postings tagged with Qt
Media Discovery with QtGStreamer
Earlier this year we at Openismus proposed a Qt-based project that would use GStreamer for handling media files. We were especially interested in the GstDiscoverer class, which provides a really nice and easy-to-use API for discovering properties of media files, such as the container format and the audio and video formats, but also more interesting things like EXIF information when used with photos.
Now, combining code from different worlds with their different paradigms isn't exactly fun. The resulting code often is a disgusting Frankenstein monster that fits in nowhere, unless you wrap one of the libraries to match the project's preferred code style. Luckily, in the case of Qt and GStreamer, Collabora's George Kiagiadakis created QtGStreamer and therefore did most of the hard work already. Still, that library didn't support our beloved GstDiscoverer class yet. So we had the choice: use something different, or wrap that thing. Now, we love doing free software, we already use GstDiscoverer with great success in the Rygel UPnP AV/DLNA Media Server, and in the end the media files will get played via GStreamer anyway. So we decided to just wrap that class for QtGStreamer.
Doing that work actually was surprisingly easy: a few loose ends here (#680235), a bit of nitpicking there (#680233, #680237). The biggest effort was writing the regression tests. These tests also demonstrate how easy the wrapped GstDiscoverer is to use. Synchronous media discovery is done like this:
QGst::DiscovererPtr discoverer = QGst::Discoverer::create(QGst::ClockTime::fromSeconds(1));
QGst::DiscovererInfoPtr info;
try {
    info = discoverer->discoverUri("file:///home/mathias/blockbuster.ogv");
} catch (const QGlib::Error &error) {
    qWarning("Discovery failed: %s", qPrintable(error.message()));
    // ...maybe also check error.domain() and .code()
}
You can also try asynchronous discovery if you have a Qt build that integrates GMainLoop:
QGst::DiscovererPtr discoverer = QGst::Discoverer::create(QGst::ClockTime::fromSeconds(1));
// Connect C++ member methods to the signals
QGlib::connect(discoverer, "starting", this, &DiscovererTest::onStartingDiscovery);
QGlib::connect(discoverer, "discovered", this, &DiscovererTest::onUriDiscovered);
QGlib::connect(discoverer, "finished", this, &DiscovererTest::onDiscoveryFinished, QGlib::PassSender);
discoverer->start();
QEventLoop loop;
loop.exec();
Usually only X11 builds match that requirement, but it should be possible to just hook QEventDispatcherGlib into your own application if needed.
The discovered data is accessible through the various attributes and methods of QGst::DiscovererInfo:
QGst::DiscovererInfoPtr info = ...;
qDebug() << info->uri();
qDebug() << info->tags();
qDebug() << info->duration();
// ...
// Use a distinct loop variable so it does not shadow the outer "info"
Q_FOREACH(const QGst::DiscovererVideoInfoPtr &videoInfo, info->videoStreams()) {
    ...
}
Sadly our customer wasn't as much of a fan of Qt as we thought, so we haven't had much use of our own for this work yet. This situation also delayed finishing the last few bits of those patches. Luckily Murray recently took the time to do those last bits of work and to get the patches merged. The code is in the git repository now and should get released with QtGStreamer 0.10.3. So whenever your Qt application needs to discover media file properties, you can now use QtGStreamer as well.
Using Full Text Search Engines as Datastore
It's a common design to use full text search engines only for free text searches, but to store the actual structured data in a separate database. Such designs come at a cost. Therefore Openismus asked me to build upon my previous post, where I analyzed several FTS engines. This time I'll research whether we could use the full text search index itself as our primary data store.
Relations
A first obvious limitation is the lack of joins. So to use the FTS index as
data store, you must denormalize your data. That is, instead
of storing your movie database in distinct entity tables like Movie and
Artist, linked by relationship tables like isLeadActor or isDirector,
you must find a way to put everything into one single flat table. This isn't
entirely nice in terms of redundancy and consistency. On the other hand joining
tables is what makes relational databases slow and hinders distributing them
across servers. Is there someone whispering "NoSQL"? Well. Yes, while I
absolutely dislike their striking marketing: They are on to something, and
with our journey today we enter their land.
Seems I've lost myself in chatting, so back on topic. To store data in an FTS index we must denormalize our data. Luckily FTS engines make this easier than it sounds. In contrast to the relational model, there is no need to create complex relationships just to assign more than one actor or director to a movie: when adding artists to your movie you just tag each name with the proper field prefix before adding it to the index, and you are done. FTS engines natively support multi-value fields!
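A minimal sketch of the prefix idea, using nothing but the C++ standard library. The `Index` type, `addField()` and `lookup()` are hypothetical names made up for illustration; a real FTS engine has its own API for adding prefixed terms, and the `prefix:value` term layout is just an assumption.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical in-memory stand-in for an FTS index:
// maps each prefixed term to the ids of the documents containing it.
using Index = std::map<std::string, std::set<int>>;

// Add every value of a multi-value field as its own term,
// tagged with the field prefix (for instance "actor" or "director").
void addField(Index &index, int docId, const std::string &prefix,
              const std::vector<std::string> &values)
{
    for (const std::string &value : values)
        index[prefix + ":" + value].insert(docId);
}

// Exact lookup of one prefixed term; unknown terms yield an empty set.
std::set<int> lookup(const Index &index, const std::string &prefix,
                     const std::string &value)
{
    const auto it = index.find(prefix + ":" + value);
    return it == index.end() ? std::set<int>() : it->second;
}
```

Calling `addField(index, 1, "actor", {...})` with two names stores two terms for the same document, which is all a multi-value field is in this model. No relationship table is involved.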
With some additional effort it also should be possible to store more structured
data in those multi-value fields, things like (release-date, country), or
(actor, role): You'd add more prefixes and use the positional information
stored for phrase searches to reliably identify those fields. Sadly my time is
too limited to research this in more detail, but the Internet surely has
documents about this. Well, or for additional fun you can try to figure it out
yourself.
Exact Matches
You can just add unanalyzed fields and use term queries on them, as kamstrup pointed out.
Data Types
So we've learned that lack of relations isn't much of a problem for many useful datasets, but structured data is not only about relationships, it also is about data types. Full Text Search engines only support lexicographical order, so they surely fail for dates and numbers. You surely cannot use them to find documents within a given range!
I am sorry to disappoint you. The people researching FTS are smarter than that.
Actually, properly sorting and ranging dates while only using lexicographic
order is trivial. Most probably you have done it yourself already. Simply store
your dates in ISO format, that is YYYY-MM-DDThh:mm:ss.SSSNNN or any prefix of
this, and you are done. Omit the separators if you prefer. ISO 8601 is
explicitly designed for lexicographic sorting.
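To illustrate, here is a small sketch of a date range check that relies on plain string comparison only. The function name `inDateRange()` is made up for this example, and it assumes that all three strings use exactly the same ISO layout.

```cpp
#include <string>

// True if `date` lies within [first, last]. All three strings must use the
// same ISO 8601 layout (e.g. YYYY-MM-DD), so string order equals date order.
bool inDateRange(const std::string &date, const std::string &first,
                 const std::string &last)
{
    return first <= date && date <= last;
}
```

The same comparison breaks for formats that put the day first: as strings, "01.04.1999" sorts before "31.03.1999", although April 1st comes after March 31st.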
So how do you do this with numbers? You could prefix them, for instance with zeros, to get a fixed width. This works reasonably if you know your number ranges, and in most cases you do. Sometimes you know the range from your application's context, e.g. the first known celluloid film was recorded in 1888. More easily you just use your technical limits, like [-2^63..2^63-1] for long integers. While first experiments really followed that approach, padding numbers with up to 18 zeros isn't exactly efficient or pretty. Also we didn't talk about floating point numbers yet. Therefore FTS engines like Lucene or Xapian provide more efficient mechanisms for turning numbers into sortable strings. First they write a prefix indicating number precision (64 bit, 32 bit, 10 bit, ...). Then they convert the numbers to some unsigned format, and apply some kind of base-128 encoding to the resulting bytes. The most significant bit gets stored first. For floating point numbers they shuffle some bits of the number's IEEE-754 representation. The resulting, sortable 64 bit integer then is encoded like any other number. You can consult Lucene's documentation, and the source code of Lucene::NumericUtils, or Xapian::sortable_serialise for details.
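The following sketch shows the general idea in plain C++. It is not the actual Lucene or Xapian encoding: there is no precision prefix and no base-128 step, and all function names are invented for this example. It only demonstrates zero padding, the sign bit flip for integers, and the usual bit shuffle for IEEE-754 doubles, each producing strings whose lexicographic order matches the numeric order.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Fixed-width decimal: left-pad with zeros so that "9" sorts before "10".
std::string zeroPadded(std::uint64_t value, std::size_t width = 20)
{
    const std::string digits = std::to_string(value);
    return digits.size() >= width
        ? digits : std::string(width - digits.size(), '0') + digits;
}

// Store the 64 bits most significant byte first, so that comparing the
// strings byte by byte orders them like the unsigned numbers.
std::string bigEndianBytes(std::uint64_t bits)
{
    std::string key(8, '\0');
    for (int i = 7; i >= 0; --i) {
        key[i] = static_cast<char>(bits & 0xff);
        bits >>= 8;
    }
    return key;
}

// Signed integers: flipping the sign bit moves negative numbers
// below the positive ones in unsigned order.
std::string sortableInt(std::int64_t value)
{
    const std::uint64_t signBit = std::uint64_t(1) << 63;
    return bigEndianBytes(static_cast<std::uint64_t>(value) ^ signBit);
}

// IEEE-754 doubles: flip all bits of negative numbers,
// and only set the sign bit of non-negative ones.
std::string sortableDouble(double value)
{
    static_assert(sizeof(double) == sizeof(std::uint64_t), "64 bit doubles expected");
    std::uint64_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    const std::uint64_t signBit = std::uint64_t(1) << 63;
    bits = (bits & signBit) ? ~bits : (bits | signBit);
    return bigEndianBytes(bits);
}
```

With these helpers, `sortableInt(-5) < sortableInt(3)` and `sortableDouble(-2.5) < sortableDouble(-1.0)` hold as plain string comparisons, so a range query becomes a range over keys. NaN values are not handled in this sketch.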
Benchmarks
Hope I didn't lose you with all this theory, now it is benchmark time!
To test how useful FTS engines are for storing arbitrary data I've extended my previous benchmark to better support range searches, and to support exact matching of fields. I've also added Michal Hruby's patch for supporting prefix searches. Since the prefix search gives countless hits, the query results are now consistently limited to 10,000 rows. I've dropped QtCLucene for now since it doesn't seem to support numeric range searches and such. It was forked from Java Lucene a long time ago. For SQLite I ran two sets of tests: bm_sqlite doesn't create indices for fields like movie title or artist names. Since such a setup is unfair when comparing with FTS engines, the second set bm_sqlite_index creates indices for all fields we perform lookups for. For Tracker we again test the Nepomuk media ontology (bm_tracker) and an optimized ontology (bm_tracker_flat) that attaches all properties to the same RDF class. I had to disable prefix searches for bm_tracker: The query ran for more than 2 hours on the dataset with 17k movies. I seriously wish I'd get sponsored to improve Tracker's data model!
Source code still is in the fts-benchmark repository, tagged as
release/0.3.
Results and Discussion
Each query got run 7 times on 5 different data sets. This time I didn't take
the mean of the query execution times. The individual results of each dataset
are grouped together and labeled with qxx_t1 to qxx_t7. Data and result
sets grow with each group.
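For readers who want to collect such per-run timings themselves, a timing loop could look roughly like this. This is not the code of the fts-benchmark repository, just a hedged sketch with the standard library. `timeRuns()` is a made-up helper, and the query is passed in as an arbitrary callable.

```cpp
#include <chrono>
#include <functional>
#include <vector>

// Run `query` `runs` times and return each run's wall-clock time in
// milliseconds, so individual runs (t1..t7) can be reported separately.
std::vector<double> timeRuns(const std::function<void()> &query, int runs = 7)
{
    std::vector<double> millis;
    for (int i = 0; i < runs; ++i) {
        const auto start = std::chrono::steady_clock::now();
        query();
        const auto stop = std::chrono::steady_clock::now();
        millis.push_back(
            std::chrono::duration<double, std::milli>(stop - start).count());
    }
    return millis;
}
```

Keeping the individual values instead of a mean makes warm-up effects and spikes visible, such as a slow first run.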
Also be careful when reading the charts, as time is scaled logarithmically. You might want to consult the raw data tables below for details. Please keep in mind that the basic goal of these benchmarks is to test scalability, not raw performance. Therefore I don't mind much if an engine is 10 times slower than another for small data sets. Constant performance is the ideal result.
You'll also notice that some charts have gaps for bm_tracker. As explained above, I had to skip bm_tracker for a few data sets, as Tracker took way too long to perform those benchmarks.
![rating:[90 TO 99]](http://taschenorakel.de/files/fts-benchmark/ftsds1.png)
Lucene++ appears significantly slower than its competition for small data sets, but then gives comparable results for data sets with more than 3,000 movies. Still I would not overrate this finding: We are talking about lookup times in the range of 10 ms. That's still pretty fast and close to measurement limits, as the spikes in the other engines' results show.
![release:[1999/01/01 TO 1999/09/30]](http://taschenorakel.de/files/fts-benchmark/ftsds2.png)
These results are similar to the rating:[90 TO 99] query.

For the release=1999/03/31 query you see the importance of having an index for your lookup keys: Performance of bm_lucene++ and bm_sqlite_index remains almost constant, while the effort of the other engines grows dramatically as the data size grows.
Xapian's bad performance comes as a surprise, but actually I am to blame here:
For stupid reasons I've implemented this very search as a range search in
Lucene++ and Xapian (release:[1999/03/31 TO 1999/03/31]). As the results
indicate, Lucene++ seems to put more effort into optimizing range searches,
and compensates for my mistake.

The title=The Matrix query gives results similar to release=1999/03/31, except
that Xapian behaves as expected now. When given a proper query it also shows
constant lookup time for exact phrase searches.

With the director=Quentin Tarantino query you see the advantage you get from using denormalized tables: Lucene++ and Xapian are just as efficient as in the previous tests, but as a not so big surprise Tracker with the flat ontology now beats all remaining engines, including bm_sqlite_index.

Performance of the different engines is similar when performing prefix searches (T*).
Raw Result Data
| rating:[90 TO 99] - 9 movies, 3 matches | |||||||
|---|---|---|---|---|---|---|---|
| | t1 | t2 | t3 | t4 | t5 | t6 | t7 |
| bm_lucene++ | 12.333 ms | 10.409 ms | 9.885 ms | 9.821 ms | 10.221 ms | 9.840 ms | 9.986 ms |
| bm_sqlite | 0.196 ms | 0.169 ms | 0.169 ms | 0.173 ms | 0.166 ms | 0.167 ms | 0.167 ms |
| bm_sqlite_index | 0.207 ms | 0.183 ms | 0.172 ms | 0.192 ms | 0.193 ms | 0.173 ms | 0.172 ms |
| bm_tracker | 0.992 ms | 0.655 ms | 0.582 ms | 0.589 ms | 0.554 ms | 0.549 ms | 0.525 ms |
| bm_tracker_flat | 0.693 ms | 0.463 ms | 0.437 ms | 0.461 ms | 0.450 ms | 0.443 ms | 0.436 ms |
| bm_xapian | 0.242 ms | 0.201 ms | 0.200 ms | 0.198 ms | 0.200 ms | 0.199 ms | 0.197 ms |
| rating:[90 TO 99] - 1,099 movies, 17 matches | |||||||
| | t1 | t2 | t3 | t4 | t5 | t6 | t7 |
| bm_lucene++ | 12.949 ms | 13.057 ms | 12.981 ms | 13.018 ms | 13.150 ms | 12.840 ms | 12.644 ms |
| bm_sqlite | 0.696 ms | 0.546 ms | 0.516 ms | 0.530 ms | 0.515 ms | 0.518 ms | 0.522 ms |
| bm_sqlite_index | 0.448 ms | 0.234 ms | 0.231 ms | 0.237 ms | 0.236 ms | 0.231 ms | 0.231 ms |
| bm_tracker | 5.051 ms | 4.485 ms | 4.441 ms | 4.486 ms | 4.425 ms | 4.831 ms | 4.828 ms |
| bm_tracker_flat | 1.465 ms | 1.133 ms | 1.110 ms | 1.104 ms | 1.108 ms | 1.108 ms | 1.108 ms |
| bm_xapian | 1.445 ms | 1.285 ms | 1.159 ms | 7.824 ms | 1.878 ms | 1.669 ms | 1.393 ms |
| rating:[90 TO 99] - 3,216 movies, 35 matches | |||||||
| | t1 | t2 | t3 | t4 | t5 | t6 | t7 |
| bm_lucene++ | 14.287 ms | 13.596 ms | 13.453 ms | 13.912 ms | 13.875 ms | 14.559 ms | 13.981 ms |
| bm_sqlite | 3.524 ms | 4.110 ms | 4.129 ms | 1.916 ms | 1.732 ms | 2.300 ms | 9.584 ms |
| bm_sqlite_index | 0.423 ms | 2.036 ms | 4.617 ms | 4.577 ms | 0.388 ms | 1.957 ms | 7.981 ms |
| bm_tracker | 12.776 ms | 11.816 ms | 12.449 ms | 11.755 ms | 11.762 ms | 11.983 ms | 11.764 ms |
| bm_tracker_flat | 2.935 ms | 2.517 ms | 2.374 ms | 2.264 ms | 2.250 ms | 2.261 ms | 2.258 ms |
| bm_xapian | 9.292 ms | 2.702 ms | 10.573 ms | 6.773 ms | 3.098 ms | 11.438 ms | 3.035 ms |
| rating:[90 TO 99] - 17,251 movies, 260 matches | |||||||
| | t1 | t2 | t3 | t4 | t5 | t6 | t7 |
| bm_lucene++ | 58.996 ms | 56.894 ms | 62.172 ms | 57.028 ms | 57.255 ms | 57.540 ms | 57.259 ms |
| bm_sqlite | 36.682 ms | 28.260 ms | 34.116 ms | 34.786 ms | 35.195 ms | 35.813 ms | 35.221 ms |
| bm_sqlite_index | 45.802 ms | 62.460 ms | 31.603 ms | 32.982 ms | 33.302 ms | 31.904 ms | 31.656 ms |
| bm_tracker | 67.022 ms | 64.609 ms | 64.649 ms | 65.243 ms | 64.183 ms | 64.887 ms | 64.283 ms |
| bm_tracker_flat | 14.730 ms | 14.179 ms | 14.132 ms | 14.221 ms | 14.248 ms | 20.225 ms | 35.888 ms |
| bm_xapian | 94.872 ms | 47.067 ms | 85.202 ms | 28.575 ms | 142.854 ms | 48.562 ms | 52.567 ms |
| rating:[90 TO 99] - 121,587 movies, 1,510 matches | |||||||
| | t1 | t2 | t3 | t4 | t5 | t6 | t7 |
| bm_lucene++ | 283.122 ms | 392.801 ms | 382.164 ms | 403.929 ms | 384.512 ms | 408.292 ms | 361.548 ms |
| bm_sqlite | 293.488 ms | 236.636 ms | 249.677 ms | 232.674 ms | 270.198 ms | 282.806 ms | 218.726 ms |
| bm_sqlite_index | 231.638 ms | 311.523 ms | 198.781 ms | 279.063 ms | 219.294 ms | 192.589 ms | 276.822 ms |
| bm_tracker | - | - | - | - | - | - | - |
| bm_tracker_flat | 181.478 ms | 272.453 ms | 251.730 ms | 256.744 ms | 293.067 ms | 230.615 ms | 245.113 ms |
| bm_xapian | 376.176 ms | 417.637 ms | 411.263 ms | 366.596 ms | 393.168 ms | 372.888 ms | 412.411 ms |
| release:[1999/01/01 TO 1999/09/30] - 9 movies, 2 matches | |||||||
| | t1 | t2 | t3 | t4 | t5 | t6 | t7 |
| bm_lucene++ | 18.768 ms | 10.167 ms | 10.799 ms | 10.215 ms | 10.443 ms | 10.917 ms | 10.210 ms |
| bm_sqlite | 0.165 ms | 0.166 ms | 0.164 ms | 0.164 ms | 0.168 ms | 0.164 ms | 0.164 ms |
| bm_sqlite_index | 0.175 ms | 0.175 ms | 0.170 ms | 0.169 ms | 0.169 ms | 0.169 ms | 0.170 ms |
| bm_tracker | 1.074 ms | 0.569 ms | 0.546 ms | 0.561 ms | 0.544 ms | 0.549 ms | 0.546 ms |
| bm_tracker_flat | 0.877 ms | 0.480 ms | 0.460 ms | 0.458 ms | 0.461 ms | 0.458 ms | 0.456 ms |
| bm_xapian | 0.183 ms | 0.175 ms | 0.175 ms | 0.178 ms | 0.178 ms | 0.180 ms | 0.175 ms |
| release:[1999/01/01 TO 1999/09/30] - 1,099 movies, 34 matches | |||||||
| | t1 | t2 | t3 | t4 | t5 | t6 | t7 |
| bm_lucene++ | 19.154 ms | 19.449 ms | 18.811 ms | 19.419 ms | 19.692 ms | 19.315 ms | 18.862 ms |
| bm_sqlite | 0.691 ms | 0.686 ms | 0.684 ms | 0.687 ms | 0.690 ms | 0.702 ms | 0.698 ms |
| bm_sqlite_index | 0.365 ms | 0.311 ms | 0.317 ms | 0.312 ms | 0.311 ms | 0.312 ms | 0.313 ms |
| bm_tracker | 6.231 ms | 5.543 ms | 5.734 ms | 5.522 ms | 5.663 ms | 5.538 ms | 5.465 ms |
| bm_tracker_flat | 1.998 ms | 1.494 ms | 1.466 ms | 1.469 ms | 1.470 ms | 1.454 ms | 1.469 ms |
| bm_xapian | 5.336 ms | 1.590 ms | 7.241 ms | 1.977 ms | 2.651 ms | 4.013 ms | 2.544 ms |
| release:[1999/01/01 TO 1999/09/30] - 3,216 movies, 84 matches | |||||||
| | t1 | t2 | t3 | t4 | t5 | t6 | t7 |
| bm_lucene++ | 32.202 ms | 31.513 ms | 31.362 ms | 30.894 ms | 31.345 ms | 31.741 ms | 31.518 ms |
| bm_sqlite | 6.169 ms | 2.645 ms | 7.560 ms | 20.764 ms | 10.385 ms | 13.278 ms | 10.206 ms |
| bm_sqlite_index | 19.176 ms | 4.358 ms | 12.576 ms | 15.448 ms | 15.745 ms | 5.572 ms | 5.770 ms |
| bm_tracker | 15.507 ms | 14.803 ms | 13.629 ms | 15.465 ms | 13.930 ms | 14.515 ms | 13.652 ms |
| bm_tracker_flat | 3.956 ms | 3.488 ms | 3.183 ms | 3.176 ms | 3.213 ms | 3.193 ms | 3.157 ms |
| bm_xapian | 18.414 ms | 5.874 ms | 11.902 ms | 12.932 ms | 19.995 ms | 21.098 ms | 13.009 ms |
| release:[1999/01/01 TO 1999/09/30] - 17,251 movies, 374 matches | |||||||
| | t1 | t2 | t3 | t4 | t5 | t6 | t7 |
| bm_lucene++ | 93.892 ms | 93.900 ms | 93.549 ms | 93.555 ms | 93.924 ms | 94.396 ms | 93.795 ms |
| bm_sqlite | 37.831 ms | 44.905 ms | 47.617 ms | 45.894 ms | 43.796 ms | 45.752 ms | 47.048 ms |
| bm_sqlite_index | 48.475 ms | 47.805 ms | 43.046 ms | 47.393 ms | 44.689 ms | 47.842 ms | 54.208 ms |
| bm_tracker | 72.507 ms | 72.667 ms | 72.233 ms | 73.570 ms | 72.997 ms | 72.991 ms | 72.527 ms |
| bm_tracker_flat | 29.351 ms | 48.892 ms | 55.351 ms | 49.793 ms | 88.375 ms | 55.393 ms | 45.917 ms |
| bm_xapian | 59.522 ms | 168.591 ms | 55.750 ms | 83.424 ms | 113.679 ms | 62.803 ms | 127.895 ms |
| release:[1999/01/01 TO 1999/09/30] - 121,587 movies, 2,265 matches | |||||||
| | t1 | t2 | t3 | t4 | t5 | t6 | t7 |
| bm_lucene++ | 543.495 ms | 564.582 ms | 609.045 ms | 519.248 ms | 561.844 ms | 663.549 ms | 590.518 ms |
| bm_sqlite | 165.617 ms | 387.256 ms | 293.285 ms | 335.219 ms | 324.528 ms | 324.022 ms | 371.839 ms |
| bm_sqlite_index | 375.504 ms | 315.671 ms | 321.115 ms | 371.228 ms | 300.951 ms | 344.073 ms | 356.366 ms |
| bm_tracker | - | - | - | - | - | - | - |
| bm_tracker_flat | 241.569 ms | 316.626 ms | 398.308 ms | 349.426 ms | 398.289 ms | 318.078 ms | 363.809 ms |
| bm_xapian | 529.377 ms | 556.989 ms | 577.643 ms | 576.194 ms | 626.388 ms | 545.251 ms | 570.695 ms |
| release=1999/03/31 - 9 movies, 1 matches | |||||||
| | t1 | t2 | t3 | t4 | t5 | t6 | t7 |
| bm_lucene++ | 10.065 ms | 10.068 ms | 9.702 ms | 9.974 ms | 9.837 ms | 9.751 ms | 10.356 ms |
| bm_sqlite | 0.164 ms | 0.165 ms | 0.171 ms | 0.168 ms | 0.167 ms | 0.164 ms | 0.162 ms |
| bm_sqlite_index | 0.171 ms | 0.169 ms | 0.171 ms | 0.172 ms | 0.175 ms | 0.165 ms | 0.164 ms |
| bm_tracker | 0.659 ms | 0.476 ms | 0.473 ms | 0.469 ms | 0.464 ms | 0.468 ms | 0.468 ms |
| bm_tracker_flat | 0.510 ms | 0.395 ms | 0.385 ms | 0.384 ms | 0.389 ms | 0.383 ms | 0.389 ms |
| bm_xapian | 0.154 ms | 0.152 ms | 0.151 ms | 0.153 ms | 0.152 ms | 0.156 ms | 0.152 ms |
| release=1999/03/31 - 1,099 movies, 2 matches | |||||||
| | t1 | t2 | t3 | t4 | t5 | t6 | t7 |
| bm_lucene++ | 10.853 ms | 10.545 ms | 10.718 ms | 10.390 ms | 10.521 ms | 10.754 ms | 10.661 ms |
| bm_sqlite | 0.515 ms | 0.528 ms | 0.505 ms | 0.512 ms | 0.502 ms | 0.507 ms | 0.505 ms |
| bm_sqlite_index | 3.139 ms | 0.184 ms | 0.175 ms | 3.440 ms | 0.183 ms | 0.212 ms | 0.205 ms |
| bm_tracker | 4.559 ms | 4.229 ms | 4.177 ms | 4.220 ms | 4.383 ms | 4.532 ms | 4.464 ms |
| bm_tracker_flat | 0.977 ms | 0.830 ms | 0.800 ms | 0.808 ms | 0.802 ms | 0.811 ms | 0.802 ms |
| bm_xapian | 0.672 ms | 0.685 ms | 0.774 ms | 0.752 ms | 0.916 ms | 1.285 ms | 0.663 ms |
| release=1999/03/31 - 3,216 movies, 2 matches | |||||||
| | t1 | t2 | t3 | t4 | t5 | t6 | t7 |
| bm_lucene++ | 10.799 ms | 10.762 ms | 11.399 ms | 10.676 ms | 10.704 ms | 10.169 ms | 10.325 ms |
| bm_sqlite | 1.912 ms | 1.462 ms | 1.453 ms | 1.163 ms | 1.151 ms | 1.157 ms | 4.858 ms |
| bm_sqlite_index | 0.366 ms | 0.350 ms | 0.355 ms | 1.883 ms | 0.364 ms | 0.345 ms | 0.371 ms |
| bm_tracker | 11.707 ms | 11.548 ms | 11.433 ms | 11.425 ms | 11.465 ms | 11.450 ms | 11.912 ms |
| bm_tracker_flat | 1.661 ms | 1.511 ms | 1.513 ms | 1.714 ms | 1.507 ms | 1.612 ms | 1.510 ms |
| bm_xapian | 1.278 ms | 1.364 ms | 1.359 ms | 1.821 ms | 1.994 ms | 1.429 ms | 3.192 ms |
| release=1999/03/31 - 17,251 movies, 3 matches | |||||||
| | t1 | t2 | t3 | t4 | t5 | t6 | t7 |
| bm_lucene++ | 12.485 ms | 12.281 ms | 12.323 ms | 11.981 ms | 12.137 ms | 11.808 ms | 12.552 ms |
| bm_sqlite | 8.247 ms | 6.259 ms | 6.007 ms | 6.300 ms | 6.125 ms | 5.958 ms | 5.921 ms |
| bm_sqlite_index | 0.379 ms | 0.297 ms | 0.285 ms | 0.284 ms | 0.252 ms | 0.254 ms | 0.251 ms |
| bm_tracker | 61.537 ms | 60.815 ms | 61.014 ms | 60.821 ms | 61.013 ms | 60.850 ms | 60.820 ms |
| bm_tracker_flat | 11.063 ms | 8.021 ms | 8.414 ms | 8.690 ms | 7.798 ms | 7.811 ms | 8.313 ms |
| bm_xapian | 5.545 ms | 4.561 ms | 4.956 ms | 4.388 ms | 4.321 ms | 4.687 ms | 4.396 ms |
| release=1999/03/31 - 121,587 movies, 12 matches | |||||||
| | t1 | t2 | t3 | t4 | t5 | t6 | t7 |
| bm_lucene++ | 14.005 ms | 14.031 ms | 12.792 ms | 14.354 ms | 12.736 ms | 13.862 ms | 13.374 ms |
| bm_sqlite | 64.517 ms | 61.783 ms | 61.669 ms | 62.418 ms | 61.377 ms | 61.326 ms | 62.036 ms |
| bm_sqlite_index | 9.994 ms | 0.403 ms | 0.358 ms | 0.351 ms | 0.368 ms | 0.363 ms | 3.368 ms |
| bm_tracker | - | - | - | - | - | - | - |
| bm_tracker_flat | 62.160 ms | 62.760 ms | 56.630 ms | 60.929 ms | 54.310 ms | 53.189 ms | 58.016 ms |
| bm_xapian | 29.180 ms | 28.239 ms | 28.080 ms | 28.054 ms | 27.777 ms | 27.615 ms | 27.505 ms |
| title=The Matrix - 9 movies, 1 matches | |||||||
| | t1 | t2 | t3 | t4 | t5 | t6 | t7 |
| bm_lucene++ | 9.248 ms | 8.929 ms | 9.139 ms | 9.455 ms | 9.609 ms | 9.128 ms | 9.110 ms |
| bm_sqlite | 0.163 ms | 0.163 ms | 0.163 ms | 0.161 ms | 0.160 ms | 0.163 ms | 0.164 ms |
| bm_sqlite_index | 0.167 ms | 0.165 ms | 0.178 ms | 0.164 ms | 0.164 ms | 0.163 ms | 0.165 ms |
| bm_tracker | 0.733 ms | 0.484 ms | 0.475 ms | 0.478 ms | 0.481 ms | 0.475 ms | 0.476 ms |
| bm_tracker_flat | 0.575 ms | 0.400 ms | 0.380 ms | 0.382 ms | 0.379 ms | 0.387 ms | 0.379 ms |
| bm_xapian | 0.226 ms | 0.197 ms | 0.194 ms | 0.191 ms | 0.191 ms | 0.194 ms | 0.190 ms |
| title=The Matrix - 1,099 movies, 1 matches | |||||||
| | t1 | t2 | t3 | t4 | t5 | t6 | t7 |
| bm_lucene++ | 10.758 ms | 10.578 ms | 10.083 ms | 10.230 ms | 10.555 ms | 10.630 ms | 10.831 ms |
| bm_sqlite | 0.728 ms | 0.524 ms | 0.504 ms | 0.501 ms | 0.506 ms | 0.500 ms | 0.501 ms |
| bm_sqlite_index | 0.218 ms | 0.203 ms | 0.201 ms | 0.198 ms | 0.199 ms | 0.277 ms | 0.233 ms |
| bm_tracker | 5.906 ms | 5.409 ms | 5.426 ms | 5.453 ms | 5.420 ms | 5.410 ms | 5.344 ms |
| bm_tracker_flat | 1.685 ms | 1.471 ms | 1.455 ms | 1.455 ms | 1.440 ms | 1.448 ms | 1.439 ms |
| bm_xapian | 0.445 ms | 0.385 ms | 0.398 ms | 0.373 ms | 0.836 ms | 0.451 ms | 0.374 ms |
| title=The Matrix - 3,216 movies, 1 matches | |||||||
| | t1 | t2 | t3 | t4 | t5 | t6 | t7 |
| bm_lucene++ | 10.138 ms | 10.144 ms | 10.652 ms | 10.124 ms | 10.169 ms | 10.070 ms | 10.547 ms |
| bm_sqlite | 2.587 ms | 1.180 ms | 1.198 ms | 2.202 ms | 1.411 ms | 1.422 ms | 1.288 ms |
| bm_sqlite_index | 0.323 ms | 0.300 ms | 0.306 ms | 0.298 ms | 0.493 ms | 0.304 ms | 0.304 ms |
| bm_tracker | 15.097 ms | 14.727 ms | 14.692 ms | 14.759 ms | 14.840 ms | 14.888 ms | 14.791 ms |
| bm_tracker_flat | 3.727 ms | 3.529 ms | 3.558 ms | 3.545 ms | 3.504 ms | 3.504 ms | 3.520 ms |
| bm_xapian | 0.432 ms | 0.353 ms | 0.345 ms | 0.349 ms | 0.348 ms | 0.342 ms | 0.692 ms |
| title=The Matrix - 17,251 movies, 1 matches | |||||||
| | t1 | t2 | t3 | t4 | t5 | t6 | t7 |
| bm_lucene++ | 12.462 ms | 11.871 ms | 12.020 ms | 11.603 ms | 12.469 ms | 11.850 ms | 11.823 ms |
| bm_sqlite | 6.093 ms | 6.096 ms | 6.130 ms | 5.941 ms | 5.882 ms | 5.959 ms | 6.789 ms |
| bm_sqlite_index | 1.431 ms | 0.304 ms | 0.201 ms | 0.200 ms | 0.201 ms | 0.199 ms | 0.199 ms |
| bm_tracker | 79.019 ms | 78.831 ms | 78.514 ms | 78.491 ms | 79.423 ms | 78.506 ms | 78.759 ms |
| bm_tracker_flat | 19.173 ms | 20.160 ms | 19.373 ms | 19.043 ms | 18.992 ms | 18.961 ms | 19.207 ms |
| bm_xapian | 0.422 ms | 0.344 ms | 0.339 ms | 0.335 ms | 0.336 ms | 0.339 ms | 0.345 ms |
| title=The Matrix - 121,587 movies, 1 matches | |||||||
| | t1 | t2 | t3 | t4 | t5 | t6 | t7 |
| bm_lucene++ | 13.367 ms | 13.395 ms | 12.906 ms | 13.164 ms | 12.856 ms | 13.348 ms | 12.862 ms |
| bm_sqlite | 62.625 ms | 61.341 ms | 61.296 ms | 61.361 ms | 61.248 ms | 61.195 ms | 61.607 ms |
| bm_sqlite_index | 0.328 ms | 0.312 ms | 0.300 ms | 0.303 ms | 0.301 ms | 7.473 ms | 0.330 ms |
| bm_tracker | - | - | - | - | - | - | - |
| bm_tracker_flat | 138.148 ms | 131.762 ms | 130.937 ms | 131.431 ms | 131.471 ms | 130.975 ms | 130.770 ms |
| bm_xapian | 0.833 ms | 0.681 ms | 0.674 ms | 0.687 ms | 0.665 ms | 0.667 ms | 0.665 ms |
| director=Quentin Tarantino - 9 movies, 1 matches | |||||||
| | t1 | t2 | t3 | t4 | t5 | t6 | t7 |
| bm_lucene++ | 9.112 ms | 9.540 ms | 9.671 ms | 9.258 ms | 9.510 ms | 9.597 ms | 9.126 ms |
| bm_sqlite | 0.273 ms | 0.243 ms | 0.243 ms | 0.241 ms | 0.239 ms | 0.239 ms | 0.239 ms |
| bm_sqlite_index | 0.282 ms | 0.243 ms | 0.257 ms | 0.244 ms | 0.245 ms | 0.243 ms | 0.337 ms |
| bm_tracker | 0.810 ms | 0.547 ms | 0.542 ms | 0.544 ms | 0.541 ms | 0.554 ms | 0.567 ms |
| bm_tracker_flat | 0.606 ms | 0.410 ms | 0.398 ms | 0.403 ms | 0.383 ms | 0.459 ms | 0.392 ms |
| bm_xapian | 0.215 ms | 0.204 ms | 0.195 ms | 0.197 ms | 0.195 ms | 0.208 ms | 0.194 ms |
| director=Quentin Tarantino - 1,099 movies, 9 matches | |||||||
| | t1 | t2 | t3 | t4 | t5 | t6 | t7 |
| bm_lucene++ | 11.574 ms | 12.063 ms | 11.780 ms | 12.169 ms | 12.253 ms | 11.801 ms | 11.939 ms |
| bm_sqlite | 13.775 ms | 8.831 ms | 9.583 ms | 9.506 ms | 9.193 ms | 9.154 ms | 9.452 ms |
| bm_sqlite_index | 13.332 ms | 8.963 ms | 10.201 ms | 9.064 ms | 8.925 ms | 10.095 ms | 8.756 ms |
| bm_tracker | 5.173 ms | 4.644 ms | 4.546 ms | 4.473 ms | 4.552 ms | 4.472 ms | 4.455 ms |
| bm_tracker_flat | 1.137 ms | 0.857 ms | 0.851 ms | 0.855 ms | 0.844 ms | 0.842 ms | 0.844 ms |
| bm_xapian | 0.898 ms | 0.878 ms | 0.893 ms | 0.873 ms | 1.000 ms | 0.882 ms | 0.843 ms |
| director=Quentin Tarantino - 3,216 movies, 10 matches | |||||||
| | t1 | t2 | t3 | t4 | t5 | t6 | t7 |
| bm_lucene++ | 12.343 ms | 12.175 ms | 12.307 ms | 12.004 ms | 12.235 ms | 12.947 ms | 12.194 ms |
| bm_sqlite | 40.967 ms | 37.867 ms | 38.607 ms | 37.618 ms | 37.487 ms | 37.124 ms | 38.147 ms |
| bm_sqlite_index | 43.470 ms | 36.820 ms | 37.027 ms | 36.779 ms | 36.957 ms | 36.585 ms | 36.782 ms |
| bm_tracker | 13.707 ms | 13.074 ms | 12.763 ms | 12.740 ms | 12.848 ms | 12.779 ms | 12.855 ms |
| bm_tracker_flat | 2.015 ms | 1.559 ms | 1.531 ms | 1.525 ms | 1.530 ms | 1.545 ms | 1.511 ms |
| bm_xapian | 0.933 ms | 0.886 ms | 0.908 ms | 2.944 ms | 1.023 ms | 1.030 ms | 0.799 ms |
| director=Quentin Tarantino - 17,251 movies, 13 matches | |||||||
| | t1 | t2 | t3 | t4 | t5 | t6 | t7 |
| bm_lucene++ | 13.704 ms | 14.413 ms | 14.331 ms | 15.096 ms | 14.026 ms | 14.492 ms | 14.205 ms |
| bm_sqlite | 307.961 ms | 308.146 ms | 308.565 ms | 307.942 ms | 308.342 ms | 308.387 ms | 308.991 ms |
| bm_sqlite_index | 308.011 ms | 305.433 ms | 305.347 ms | 304.567 ms | 304.920 ms | 305.567 ms | 304.404 ms |
| bm_tracker | 72.690 ms | 72.075 ms | 72.005 ms | 71.999 ms | 71.938 ms | 71.946 ms | 72.108 ms |
| bm_tracker_flat | 7.489 ms | 6.996 ms | 6.877 ms | 6.987 ms | 7.148 ms | 7.088 ms | 7.021 ms |
| bm_xapian | 1.087 ms | 0.963 ms | 1.010 ms | 1.151 ms | 1.088 ms | 0.965 ms | 0.959 ms |
| director=Quentin Tarantino - 121,587 movies, 14 matches | |||||||
| | t1 | t2 | t3 | t4 | t5 | t6 | t7 |
| bm_lucene++ | 13.546 ms | 13.955 ms | 13.981 ms | 13.854 ms | 13.740 ms | 14.114 ms | 15.816 ms |
| bm_sqlite | 4,752.853 ms | 2,793.690 ms | 2,800.197 ms | 2,795.611 ms | 2,800.578 ms | 2,794.765 ms | 2,801.000 ms |
| bm_sqlite_index | 2,806.890 ms | 2,789.648 ms | 2,788.729 ms | 2,791.168 ms | 2,788.102 ms | 2,790.845 ms | 2,789.475 ms |
| bm_tracker | - | - | - | - | - | - | - |
| bm_tracker_flat | 47.801 ms | 46.303 ms | 46.701 ms | 46.640 ms | 46.467 ms | 46.862 ms | 46.448 ms |
| bm_xapian | 20.098 ms | 1.260 ms | 1.176 ms | 1.162 ms | 1.156 ms | 1.149 ms | 1.148 ms |
| T* - 9 movies, 9 matches | |||||||
| | t1 | t2 | t3 | t4 | t5 | t6 | t7 |
| bm_lucene++ | 17.303 ms | 17.072 ms | 16.927 ms | 16.539 ms | 16.816 ms | 16.758 ms | 16.797 ms |
| bm_sqlite | 0.547 ms | 0.544 ms | 0.547 ms | 0.541 ms | 0.541 ms | 0.546 ms | 0.544 ms |
| bm_sqlite_index | 0.553 ms | 0.549 ms | 0.554 ms | 0.553 ms | 0.658 ms | 0.547 ms | 0.544 ms |
| bm_tracker | - | - | - | - | - | - | - |
| bm_tracker_flat | 2.525 ms | 2.302 ms | 2.423 ms | 2.415 ms | 2.372 ms | 2.356 ms | 2.305 ms |
| bm_xapian | 3.086 ms | 2.871 ms | 2.947 ms | 2.893 ms | 3.104 ms | 3.022 ms | 3.126 ms |
| T* - 1,099 movies, 1,098 matches | |||||||
| | t1 | t2 | t3 | t4 | t5 | t6 | t7 |
| bm_lucene++ | 358.775 ms | 355.830 ms | 350.287 ms | 349.816 ms | 347.998 ms | 356.585 ms | 347.143 ms |
| bm_sqlite | 64.679 ms | 142.927 ms | 143.776 ms | 142.847 ms | 145.319 ms | 147.244 ms | 135.600 ms |
| bm_sqlite_index | 62.383 ms | 151.941 ms | 144.456 ms | 144.108 ms | 141.330 ms | 173.728 ms | 169.799 ms |
| bm_tracker | - | - | - | - | - | - | - |
| bm_tracker_flat | 199.108 ms | 213.355 ms | 202.793 ms | 196.659 ms | 194.937 ms | 194.708 ms | 195.267 ms |
| bm_xapian | 419.323 ms | 516.929 ms | 677.357 ms | 591.280 ms | 599.091 ms | 643.124 ms | 497.649 ms |
| T* - 3,216 movies, 3,204 matches | |||||||
| | t1 | t2 | t3 | t4 | t5 | t6 | t7 |
| bm_lucene++ | 842.413 ms | 968.828 ms | 958.367 ms | 1,002.383 ms | 932.222 ms | 946.388 ms | 1,004.821 ms |
| bm_sqlite | 327.669 ms | 415.921 ms | 440.198 ms | 408.543 ms | 432.575 ms | 537.572 ms | 412.061 ms |
| bm_sqlite_index | 310.218 ms | 432.201 ms | 413.221 ms | 404.165 ms | 479.691 ms | 431.758 ms | 436.533 ms |
| bm_tracker | - | - | - | - | - | - | - |
| bm_tracker_flat | 727.867 ms | 711.970 ms | 722.046 ms | 717.685 ms | 719.927 ms | 713.077 ms | 713.843 ms |
| bm_xapian | 1,442.238 ms | 1,470.821 ms | 1,415.183 ms | 1,392.164 ms | 1,437.493 ms | 1,464.149 ms | 1,520.747 ms |
| T* - 17,251 movies, ≥ 10,000 matches | |||||||
| | t1 | t2 | t3 | t4 | t5 | t6 | t7 |
| bm_lucene++ | 3,006.139 ms | 3,127.174 ms | 3,136.617 ms | 3,151.197 ms | 3,131.469 ms | 3,141.155 ms | 3,056.497 ms |
| bm_sqlite | 1,481.321 ms | 1,388.573 ms | 1,468.062 ms | 1,533.263 ms | 1,422.012 ms | 1,442.638 ms | 1,456.166 ms |
| bm_sqlite_index | 1,346.717 ms | 1,451.410 ms | 1,508.228 ms | 1,411.643 ms | 1,460.563 ms | 1,514.390 ms | 1,391.342 ms |
| bm_tracker | - | - | - | - | - | - | - |
| bm_tracker_flat | 2,945.536 ms | 2,938.230 ms | 2,957.149 ms | 2,959.569 ms | 2,972.291 ms | 2,933.668 ms | 2,936.655 ms |
| bm_xapian | 3,391.825 ms | 3,490.307 ms | 3,474.203 ms | 3,483.310 ms | 3,560.886 ms | 3,505.060 ms | 3,398.937 ms |
| T* - 121,587 movies, ≥ 10,000 matches | |||||||
| | t1 | t2 | t3 | t4 | t5 | t6 | t7 |
| bm_lucene++ | 3,627.408 ms | 3,625.588 ms | 3,546.610 ms | 3,508.233 ms | 3,599.160 ms | 4,597.857 ms | 4,101.686 ms |
| bm_sqlite | 2,182.548 ms | 2,109.730 ms | 2,109.812 ms | 2,121.573 ms | 2,104.320 ms | 2,117.912 ms | 2,145.342 ms |
| bm_sqlite_index | 2,108.863 ms | 2,103.648 ms | 2,131.009 ms | 2,132.823 ms | 2,109.655 ms | 2,137.286 ms | 2,106.779 ms |
| bm_tracker | - | - | - | - | - | - | - |
| bm_tracker_flat | 8,757.130 ms | 9,316.640 ms | 8,708.298 ms | 8,781.584 ms | 8,788.042 ms | 8,699.770 ms | 8,721.099 ms |
| bm_xapian | 4,805.474 ms | 4,528.004 ms | 4,692.763 ms | 4,640.065 ms | 4,618.215 ms | 4,647.170 ms | 4,674.588 ms |
Full Text Search Engines, Part I
Openismus asked me to research how best to index media files and provide full text searching. For the last two years, I have used Tracker for this kind of thing. I like Tracker, but I want to avoid being biased. Therefore, I decided to evaluate alternatives.
Performance is an obvious requirement. We also want to provide a library to permit other applications to access the data we collected. Therefore, SQLite and Lucene (in its C++ incarnations) are obvious contenders. Lucene++ is an emerging project that was suggested by Mikkel Kamstrup Erlandsen at Canonical. QtCLucene is a bit special: So far Qt doesn't provide official support for this library and doesn't install its header files. Still, it is used by Qt's help system, which makes QtCLucene a widely deployed and well-tested C++ implementation of Lucene.
Sadly, the big names like MySQL or PostgreSQL do not fit: MySQL's embedded server library is licensed under GPL (instead of LGPL, for instance), which greatly limits legal use cases. PostgreSQL doesn't provide any embedding at all. Because I enjoy RDF and SPARQL I also wondered about testing the Redland RDF libraries, but I found that they don't provide any full text search at all.
Contenders
- Tracker 0.14.0-2ubuntu1
- SQLite 3.7.9-2ubuntu1
- Lucene++ 3.0.3.4 (e28b15b02ff9de2208965e9af8eb80983380cdcd)
- QtCLucene as provided by libqtcore4 4.8.1-0ubuntu4
- Xapian 1.2.8-1
Test Platform
- Ubuntu 12.04
- Intel Core 2 Duo P8400 (2.26GHz), 4 GiB RAM
- HDD: WDC WD2500BEVT-2, encrypted (aes-cbc-essiv:sha256)
Test Scenario
To get somewhat realistic data I fetched a copy of the Internet Movie Database from ftp.fu-berlin.de. Since it is quite a large database (about 1 GiB when compressed with gzip), I extracted a few subsets of it: all movies with at least 500,000, 50,000, 15,000, 1,000 and 50 user votes. This data was then imported into a fresh instance of Tracker, SQLite, Lucene++ and QtCLucene. After that I ran a few trivial full text searches:
"The Matrix"
Fast Furious
"Star Trek" OR "Star Wars"
Lord Rings King
Keanu Reeves
"Brad Pitt" OR "Bruce Willis"
Jackson Samuel
Quentin Tarantino
Wachowski
Thomas Neo Anderson
Neo
Each scenario was repeated five times. To avoid cache effects, the engines were tested one after another for a given set of parameters, rather than repeating the same engine back to back. Tracker was tested with two different scenarios: First I tried the Nepomuk-based multimedia ontology shipped with Tracker (nmm), after that I also tried a flattened ontology (fmm), which is a much better fit for the data model of pure full text search indices like Lucene. All engines were used with default parameters. No magic configuration options or pragmas were applied. Feel free to repeat the tests with your own optimized settings, and report the results when doing so.
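The "repeat each scenario a few times" part can be sketched in a few lines of C++. This is not the code from the benchmark repository, just a minimal illustration using the standard library only; the helper names timeRepeated and mean are made up for this sketch.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper (not taken from the benchmark sources): runs
// `scenario` `repetitions` times and returns the wall-clock duration
// of each run in milliseconds.
inline std::vector<double> timeRepeated(const std::function<void()> &scenario,
                                        int repetitions = 5)
{
    std::vector<double> samples;
    for (int i = 0; i < repetitions; ++i) {
        const auto start = std::chrono::steady_clock::now();
        scenario();
        const auto stop = std::chrono::steady_clock::now();
        samples.push_back(
            std::chrono::duration<double, std::milli>(stop - start).count());
    }
    return samples;
}

// Hypothetical helper: arithmetic mean of the samples, 0 for an empty list.
inline double mean(const std::vector<double> &samples)
{
    if (samples.empty())
        return 0.0;
    double sum = 0.0;
    for (std::size_t i = 0; i < samples.size(); ++i)
        sum += samples[i];
    return sum / static_cast<double>(samples.size());
}
```

Calling timeRepeated with a function that runs one search query yields five samples per engine and dataset, which can then be averaged with mean.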
Source Code and Data
The source code of these benchmarks can be found at Gitorious, and can be built using autotools or qmake, whichever you prefer.
Run src/benchmark.sh to reproduce the tests. The log files can be turned into a CSV file by running src/report.sh.
The charts have been created with LibreOffice:
It should be sufficient to copy the CSV data into the data sheet of
logs/report.ods.
Select "English (USA)" as language in the import dialog, to ensure that
numbers are recognized properly. After that you still might have to sort
the rows by the columns suite, num_movies and experiment. The data
sorting dialog provides an option for marking the first row as column header.
Update: I've pushed some more changes, so to exactly reproduce the results
discussed in this post, check out the tags releases/0.1 for the
initial results, and releases/0.2 to also include Xapian tests.
Results

| Movies | Lucene++ | QtCLucene | SQLite | Tracker (Nepomuk) | Tracker (Flat) | Xapian |
|---|---|---|---|---|---|---|
| 9 | 6.84 ms | 3.46 ms | 43.2 ms¹⁾ | 36.2 ms | 7.13 ms | 52.561 ms¹⁾ |
| 1,099 | 2.93 ms | 5.72 ms | 3.63 ms | 26.4 ms | 3.32 ms | 5.94 ms |
| 3,216 | 2.32 ms | 5.37 ms | 2.87 ms | 21.2 ms | 2.89 ms | 4.97 ms |
| 17,251 | 1.98 ms | 5.10 ms | 2.50 ms | 14.2 ms | 2.19 ms | 3.58 ms |
| 121,587 | 1.21 ms | 5.21 ms | 3.96 ms²⁾ | 10.4 ms | 1.80 ms | 2.30 ms |
- ¹⁾ The dataset is tiny. I suspect that some startup overhead is invalidating this result.
- ²⁾ We might see first signs of a memory barrier here.

| Movies | Lucene++ | QtCLucene | SQLite | Tracker (Nepomuk) | Tracker (Flat) | Xapian |
|---|---|---|---|---|---|---|
| 9 | 2.23 ms | 0.572 ms | 0.159 ms | 1.33 ms | 0.494 ms | 0.271 ms |
| 1,099 | 6.06 ms | 2.18 ms | 1.17 ms | 90.3 ms | 1.67 ms | 0.955 ms |
| 3,216 | 8.72 ms | 3.41 ms | 1.55 ms | 335 ms | 3.57 ms | 1.50 ms |
| 17,251 | 13.1 ms | 5.33 ms | 1.92 ms | 2,380 ms | 7.52 ms | 2.35 ms |
| 121,587 | 17.0 ms | 44.2 ms | 17.4 ms | 86,800 ms | 19.885 ms | 18.1 ms |
| Complexity | O(log(n)²) | O(log(n)²) | O(log(n)²) | O(n log(n)) | O(sqrt(n)) | O(log(n)²) |
QtCLucene, SQLite, Tracker (Nepomuk) and Xapian seem to hit a memory barrier at 121,587 movies.

| Movies | Lucene++ | QtCLucene | SQLite | Tracker (Nepomuk) | Tracker (Flat) | Xapian | Raw Data |
|---|---|---|---|---|---|---|---|
| 9 | 80 KiB | 76 KiB | 368 KiB | 4.4 MiB | 2.3 MiB | 424 KiB | 104 KiB |
| 1,099 | 4.9 MiB | 4.8 MiB | 32 MiB | 59 MiB | 29 MiB | 21 MiB | 7.8 MiB |
| 3,216 | 12 MiB | 12 MiB | 75 MiB | 114 MiB | 53 MiB | 47 MiB | 18 MiB |
| 17,251 | 39 MiB | 39 MiB | 257 MiB | 305 MiB | 155 MiB | 170 MiB | 57 MiB |
| 121,587 | 154 MiB | 154 MiB | 1.0 GiB | 906 MiB | 521 MiB | 683 MiB | 198 MiB |
Discussion
The performance of Tracker is devastating. This is not at all the result you want to see for a project you actually like and enjoy using. You clearly see the bad impact of the many joins it must perform for mapping the ontologies and queries to SQL. This is surprising, since in my opinion Nepomuk's multimedia ontology is quite a typical ontology. Also, the datasets themselves are not that huge for something that initially started as a file indexer. The (sadly quite unrealistic) flat ontology might give a few hints on how to improve Tracker. The execution times with this ontology are comparable to those of the other engines. Still, the observed (and only estimated) complexity class for executing queries is worrying.
Lucene++ shines at writing data: it is just incredibly fast when building its index. In contrast to the other engines, it even spends less time per movie the bigger its index grows. It is noticeably slower than QtCLucene or SQLite when looking up terms. Still, I'd call an average time of 17 ms for finding matches within 122k documents quite a good achievement. Additionally, Lucene++ seems to be implemented efficiently enough to not hit any memory barrier yet at this scale.
QtCLucene is about two times slower than Lucene++ or SQLite when building its index, but the index size doesn't seem to impact insertion time per movie. It pays this back with good lookup performance: it is about 2 to 3 times faster than Lucene++. It seems to hit a memory barrier at 122k documents.
SQLite's performance is just in the middle between Lucene++ and QtCLucene when building the index. When searching terms it even beats QtCLucene, again by a factor of 2 to 3.
Lucene++ and QtCLucene consume less disk space than the original files, most probably because the raw data stores movies and artists in separate files. The records in these files must be linked with each other. Lucene just does this more efficiently. SQLite and Tracker consume significantly more disk space than Lucene or the original data. Partly this can be explained by fields being stored twice: once in their table and another time in the full text search index. Column indexes also play a role. Still, this doesn't explain why disk consumption is significantly higher.
Xapian's characteristics are quite similar to those of SQLite. It doesn't yet hit the memory barrier that affects SQLite's insert performance at 122k documents, maybe because it consumes only 2/3 of the disk space. I enjoyed its API for being much closer to modern C++ than any other engine. It gives more low-level access to all the FTS mechanics: For instance, you have to attach values and feed the indexer yourself. You also have to deal with token prefixes. These are details that Lucene just hides behind a Field class and its attributes. I'm not sure yet which approach I prefer.
Conclusion
Tracker is out. Lucene++, QtCLucene and SQLite are quite comparable in terms of performance, with Lucene++ being the fastest engine when building the index, and with SQLite being the fastest when performing full text searches. There are some first signs that Lucene++ is more memory efficient than its competitors. This needs further investigation. Also we should investigate capabilities for doing point and range searches, instead of full text searches.
FOSDEM 2012
FOSDEM in 2012 was an exciting (and naturally, exhausting) conference again. It's great to have so many relevant people who are all active in the free software world together in one place. It's also a great opportunity to discuss radical new ideas, ideally while experimenting with Belgian beer. Which is what we usually did when we weren't at the conference site.
It was nice to see Jarno and Esko at the conference, too. We even stayed in the same hotel. I hope they enjoyed the Ethiopian lunch as much as I did. And perhaps they're not too angry any more that we led them to drink Absinthe ;-)
Jon and I gave two talks. Jon's talk (slides) was about Maliit as a project, explaining what Maliit is (and what it is not), combined with a short history lesson about the project. I tried to outline the difficulties of mobile text input in general (slides), picking some use-cases that are known from the desktop world and showing why simply copying the use-cases and their known interaction models does not work very well. I honestly liked Jon's talk more though.
Neither of us actually managed to visit other talks, even though we wanted to. We had to ask Jarno, Esko and others about what great talks we missed. Apparently there were quite a few :-(
Our Maliit T-Shirts were well received, though we usually only handed them out when someone listened to our Maliit ramblings long enough.
We were asked about accessibility several times, which is currently not within the scope of Maliit but perhaps something to think about in the future.
We also got to talk with the people working on (text) input at Red Hat and Intel, mostly in the context of Wayland. There are some interesting opportunities to get things (more) right this time around.
Thanks to our employer, Openismus, for sending us there!
The infrastructure of the Maliit project
It took us a while to transform the Maliit project into a real opensource project. At first there was only public code, later some wiki pages @ meego.com together with constantly changing components in the official MeeGo bugtracker, then a public mailing list.
After that we tried to become independent of MeeGo, but neither freedesktop.org nor the GNOME project could give us a suitable home. So we had to go with our own infrastructure in the end, which probably was the best we could do, in any case. We now enjoy our own website (mostly a wiki, for which we can also analyze the traffic), our own IRC channel, our own public bugtracker, our own mailing lists and a build bot. We also make use of other services such as launchpad.org and the openSUSE Build Service, both for packaging and as part of our continuous integration setup. Both services provide nightly builds for Maliit, for example (though we still lack packages for ARM).
But there was always one thing missing: T-Shirts. Now that this is solved, too, we can finally call Maliit a real opensource project ;-) Hopefully we'll soon have another group photo of the people who've been involved in the project over the years. I'll make sure to bring a couple of T-Shirts to FOSDEM, so make sure to grab Jon or me if you want one.
How we enable others to write 3rd party plugins with Maliit
We finally published a video about Maliit - an input method framework including a virtual keyboard - and 3rd party plugins. Kudos goes to Jon for making time for that.
This video highlights one of Maliit's key features: pluggable input methods which come with their very own user interfaces. The Chinese input methods show how Maliit offers support for composed characters. The video is proof that 3rd party development for Maliit (open-source and proprietary) is not only possible but also happening.
When maliit.org states that "it should be easy to customize existing input methods or develop powerful new input methods, whether for profit, research or fun", we actually mean it.
The harder question is of course how to motivate others to actually get started on input method development with Maliit. For that, we have a multipronged strategy:
- Provide sufficiently polished reference plugins that can show off Maliit capabilities but also serve as inspiration for new plugins (hence the BSD license for reference plugins). Our reference plugins are currently using Qt/C++ (Maliit Keyboard) and QML (Nemo Keyboard). We also have PySide support, but no one has contributed a reference plugin yet. This gives choice to interested input method developers, and we think that's important. The reference plugins serve another role when it comes to designing new API: They become our testbed, allowing us to verify our API proposals.
- Ship Maliit with a bunch of example plugins and example applications. None of them try to be complete. They are all self-contained though and usually show one feature at a time. This can be tedious to maintain, but we believe that examples need to stay small and focused, otherwise developers won't look at them.
- Documentation that is easy to consume. Our documentation is not as concise and clear as we'd like it to be, but it's slowly improving. We also experiment with videos that can serve as an introduction to more in-depth (text) documentation.
- Packages for most common Linux distributions. This one seems obvious, but sadly, it's quite a lot of work for us to keep up with it (and we already use automated services such as Launchpad and the openSUSE Build Service). In the hope of attracting dedicated packagers we wrote down some packaging guidelines.
- An architecture that had 3rd party plugins and multiple toolkit support in mind from the start. The plugin developer facing API needs to be easy to use and clearly documented. This will be the focus of the upcoming 0.9x series.
We will demo Maliit @ FOSDEM 2012, hope to see you there!
Qt Quick best practices: Using Components
The next article in the series about Qt Quick best practices has been published (but don't miss the other one about property bindings). This time, I talked about Components, and how they can help to keep your QML code clean and maintainable. The team behind the N9 Developer blog has been a great help to me, especially Ville Lavonious and Matti Airas. I am also thankful for the additional input (and proofreading!) from Jon Nordby and Sauli Kauppi. Thanks guys!
Miniature 0.5 'London 1851' released
From the release notes: "Miniature now supports different languages thanks to a determined community of translators. Thank you for your effort! This is why we are dedicating this release to the first international chess tournament, celebrated in London on 1851.
Miniature 0.5 is being released for MeeGo Harmattan (Nokia N9 & N950) and Maemo (Nokia N900). Thanks to everybody involved in the initial Maemo attempts and the experimental version that was made available after the Miniature 0.4 release."
We also improved usability, compared to the previous release, but there's still a ton of work left.
A bit of history
I started working on Miniature – a chess client for freechess.org – in November 2009, after reading the Call for Contributors. Even though we had a pretty cool P2P feature (based on Telepathy and developed mostly by Dariusz Mikulski), it never quite reached the original goal: playing chess online. Back then I was learning how to create UIs with Qt Graphics View, which was all the rage at the time. Well, we now know that writing real UIs with that technology is a major PITA, but for my pet project, it was just too much. I got lost in the struggle.
For the next 18 months, Miniature was basically dead. Another failed project that had started so promisingly. Quim did not want to give up though. After the N9 announcement, he launched a second Call for Contributors.
Perhaps I responded to his mail because I was embarrassed at the idea of people wasting time trying to salvage the working parts of Miniature; there simply wasn't much to salvage! So I started again, this time with a very clear goal: online chess, and online chess only. Let others create the actual UI and whatnot. Focusing on one prominent feature and not having to worry about the UI worked well for me, even though I had to iterate over some architecture ideas until I felt comfortable. Quim in the meantime started to prototype the UI with QML. It was impressive to see his results, a level of polish I could have never achieved with my Qt Graphics View approach. At some point the backend was good enough to be sewn together with the frontend and suddenly we had achieved what I had failed at before: a touch-enabled chess client for the N9 that can play chess online.
Having my own useful application available on the N9, published through OVI store, means a lot to me. I hope others will enjoy Miniature as much as we enjoyed re-creating it the second time around.
Using MeeGo Keyboard from git on your Nokia N9
Usually AEGIS, the N9's security framework, protects system packages from being replaced. As such, files belonging to a system package can't be overwritten. And that's definitely a good thing, because otherwise each download from OVI store would put the user at a considerable risk.
Maliit is such a system package, but its flexible architecture allows for a creative way to replace the MeeGo Keyboard with a more recent version. This can be useful if you want to testdrive new features and to … nah who am I kidding, it's purely for fun!
Be warned though, the following hack requires you to enable developer mode on your N9. Don't ever activate it unless you're absolutely sure what you're doing to your N9. It would be unforgivable to brick this beauty because of some misguided hack the planet attitude.
First we need to find a MeeGo Keyboard tag that will be compatible with the installed Maliit framework version on your device. Check that the output of
$ apt-cache showpkg meego-keyboard
matches the dependencies mentioned in the tag's Debian control file and the packages installed in your scratchbox ARMEL target.
Apply the community patch on top of the chosen tag. It renames the package to meego-keyboard-community and only installs the plug-in's .so file, together with a renamed CSS file (libmeegotouch requires that CSS file names match with library names).
This means that we won't uninstall the regular package, as we still depend on most of the other files that meego-keyboard installs.
Now build the Debian package. Copy it over and log in to the device, then gain root access via devel-su. It's recommended to make a backup of /usr/lib/meego-im-plugins before installing the package.
After installing libmeego-keyboard-community, remove libmeego-keyboard.so from /usr/lib/meego-im-plugins, to avoid in-fights between the two plug-ins. Use
$ gconftool-2 -s /meegotouch/inputmethods/onscreen/enabled -t list --list-type string [libmeego-keyboard-community.so, en_gb.xml]
$ gconftool-2 -s /meegotouch/inputmethods/onscreen/active -t list --list-type string [libmeego-keyboard-community.so, en_gb.xml]
to activate the community plug-in. The language settings applets will most likely get confused, so be prepared that enabling new language layouts might only work directly via GConf from now on.
Gain user access and kill meego-im-uiserver. It should now load the new community plug-in. If you want to get the original MeeGo Keyboard back, uninstall the community package and copy the .so back from your backup. Alternatively, you can try to reinstall it:
$ apt-get install --reinstall meego-keyboard
Have fun!
Best practices in using Qt Quick
I am writing a series about best practices in using Qt Quick. It will be published on the official N9 Developer blog. The introduction and first article have already appeared. Your feedback on that series is very much welcomed.
Better GTK+ support in Maliit
So far, using Maliit's virtual keyboard in GTK+ applications required fetching and compiling a GTK+ input method bridge yourself. Not any more. With the latest release, GTK+ applications should just work out of the box, thanks to Jon's integration efforts. Right at the same time, Łukasz was looking into using Maliit together with GTK+ applications on his Ubuntu desktop. He did a great job testing Jon's improvement and also contributed patches to properly update GTK+'s input method module cache. When compared to the Qt support, the gap in terms of supported features is still quite large. We would like to further improve the GTK+ support and contributions are certainly welcome.
Devhelp books in QtCreator
Had a few problems focusing on my work today, so I came up with this little hack: A script converting devhelp books into Qt help collections.
Together with QtCreator's autotools plugin it should help turning QtCreator into a proper GNOME IDE.
Next steps: Let QtCreator index devhelp books automatically. Also on my wishlist: A code model for GObject properties and signals, and Glade integration.
Well, now back to real work.
Real users, real feedback
We released Maliit 0.80.7 on Friday. These last days, I have been doubly proud of our project. Not only did the N9's virtual keyboard get astonishing reviews across the board, but what's even better: We managed to keep this software open-source. In our communities, there will always be those who focus too much on technical aspects. I remember the technical struggles we had even within MeeGo! But now we get feedback from real users who couldn't care less about what Qt or MeeGo Touch is, and to be honest, that's a refreshing change.
Being here at Qt's Developer Days 2011, it feels great to get such feedback directly, from first-time users of the Nokia N9. Especially the fine haptic feedback and the keyboard's accuracy get noticed.
I also had the opportunity to see a Japanese input method running on the N9, powered by Maliit. Seeing how well this plugin already integrates with the platform, I feel that our architecture yet again has been justified. I am looking forward to seeing more Maliit plugins, and more platforms using Maliit!
They call us crazy, but we store Contacts in Tracker
Visa authorities playing bad games with Chandni gave me the chance to talk about the QtContacts tracker plugin, which I and others were working on for the past few months. In case you've missed that early talk, here are the slides.
Interesting to just watch George from KDE talk about similar things.

Using C++ enums in QML
When mapping Qt/C++ API's to QML, or, to put it more precisely, making a Qt/C++ API available to QML, road bumps are to be expected. One such bump is the mapping of C++ enums.
If you happen to create enums inside a QObject, then they can be exported to QML via the Q_ENUMS helper:
class SomeEnumsWrapper
: public QObject
{
Q_OBJECT
Q_ENUMS(SomeState)
public:
enum SomeState {
BeginState, // Remember that in QML, enum values must start
IntermediateState, // with a capital letter!
EndState
};
};
You will still need to declare this class as an abstract type for QML to be able to use enums from it (put in your main function for example):
qmlRegisterUncreatableType<SomeEnumsWrapper>("com.mydomain.myproject", 1, 0,
"SomeEnums", "This exports SomeState enums to QML");
Now in QML, the enums can be accessed as SomeEnums.BeginState. Note how the enum is accessed through the exported type name, not an instance.
But what if you've put your enums into a dedicated C++ namespace? Then the same mechanism can be used. Let's start with the namespace:
namespace SomeEnums {
enum SomeState {
BeginState,
IntermediateState,
EndState
};
}
We can re-use the idea of wrapping enums in a QObject type, with one tiny change:
class SomeEnumsWrapper
: public QObject
{
Q_OBJECT
Q_ENUMS(SomeState)
public:
enum SomeState {
BeginState = SomeEnums::BeginState, // Keeps enum values in sync!
IntermediateState = SomeEnums::IntermediateState,
EndState = SomeEnums::EndState
};
};
The process of forwarding all your enums through this mechanism can be tedious, but being able to use enums properly in QML will improve the readability and maintainability of your QML code.
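Since the forwarding is done by hand, it is easy to forget a value. A minimal sketch of how the compiler could check the mapping, assuming plain C++11 and leaving out QObject, Q_OBJECT and Q_ENUMS so that it builds without Qt; the helpers toWrapper and fromWrapper are made up for this example and are not part of Qt:

```cpp
#include <cassert>

namespace SomeEnums {
enum SomeState {
    BeginState,
    IntermediateState,
    EndState
};
}

// Stand-in for the QObject-based wrapper. The QObject base class and the
// Q_OBJECT / Q_ENUMS macros are omitted here so the sketch compiles
// without Qt.
class SomeEnumsWrapper
{
public:
    enum SomeState {
        BeginState = SomeEnums::BeginState,
        IntermediateState = SomeEnums::IntermediateState,
        EndState = SomeEnums::EndState
    };
};

// Hypothetical helpers: convert between the namespace enum and the wrapper.
constexpr SomeEnumsWrapper::SomeState toWrapper(SomeEnums::SomeState state)
{
    return static_cast<SomeEnumsWrapper::SomeState>(state);
}

constexpr SomeEnums::SomeState fromWrapper(SomeEnumsWrapper::SomeState state)
{
    return static_cast<SomeEnums::SomeState>(state);
}

// Compile-time checks: the build breaks if the two enums drift apart.
static_assert(toWrapper(SomeEnums::BeginState) == SomeEnumsWrapper::BeginState,
              "BeginState out of sync");
static_assert(toWrapper(SomeEnums::IntermediateState) == SomeEnumsWrapper::IntermediateState,
              "IntermediateState out of sync");
static_assert(toWrapper(SomeEnums::EndState) == SomeEnumsWrapper::EndState,
              "EndState out of sync");
```

The static_assert lines cost nothing at runtime and only guard the values listed there, so a newly added enum value still needs its own line.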
For a fully working example check Maliit's Qt Quick support
Qt Contributors Summit is over
Really enjoyed the Qt Contributors Summit. Nice, open-minded people. Café Moskau turned out to be an awesome location for technically oriented meetings.
Even held my own little session about my griefs with the QObject life-cycle. We found a few opportunities for improvement, but we also sadly had to conclude that proper two-phase construction and destruction isn't possible in C++, unless you forbid stack allocation and usage of the delete operator. Actually had my little perverse moment of pleasure when realising that Thiago seems a bit jealous of the freedom GObject gets from plain C.
Still wondering a bit if there's really no way to implement proper two-phase destruction in C++. Must we really bribe the C++ standard committee to enhance the specification?
Input methods and Wayland in Qt5
I was attending the Qt Contributors' Summit 2011. During the keynote, it was promised that everything is up for discussion, so I took my chance to discuss improving input method support for Qt5.
After some initial discussions with Kristian Høgsberg (Wayland, of course) and Jørgen Lind (who works on Qt Lighthouse), I also addressed Wayland. It became clear that one needs some kind of input method interface directly in Wayland. Kristian immediately started with a small prototype, in order to explain better how a Wayland compositor can provide a much better window management policy than what we currently have with Maliit and X11.
I think the session itself was really successful. I was surprised at the strong interest in this topic.
It became apparent that we should do something about Qt's input context API. For instance, add more input method hints, come up with a better interface that describes the focus widget, preedit handling, orientation support and so on.
Now we only need to agree on how to make it happen :-)
Qt Contributors' Summit
A bit odd for someone with my background? Does that mean I am leaving the GNOME universe?
No. It just happens in Berlin, and I've just spent lots of time on letting QtContacts use some awesome GNOME technology (tracker). At the summit I'll try to convince some Qt core guys that maybe UTF-8 would be a much better choice for the Linux port of Qt. It would improve interaction with the kernel, DBus and GNOME libraries so much. Well, and maybe I can get them to consider more reasonable memory management for QObject: With Qt leaving the GUI corner, its simple parent-ownership model doesn't fit anymore. QtQuick already skips that obsolete model. Now let's also let C++ components benefit.
PS: If someone ever wants to modernize libebook, then looking at QtContacts API is a good exercise. It was designed to explicitly fix the issues we had with libebook during Fremantle. Actually even thought of making a GIR typelib for QtContacts - but that's a different story and maybe even doesn't make sense.
QGraphicsItem: When you're doing it wrong
When you read through the Qt GraphicsView documentation, you might miss the detail that QGraphicsItem::pos() refers to the center of an item: "Their coordinates are usually centered around its center point (0, 0), [...] An item's position is the coordinate of the item's center point in its parent's coordinate system;".
Oh great, so the word "usually" indicates it is not enforced, and everyone can choose his own semantics when implementing QGraphicsItems ... which is precisely what I did, accidentally. My QGraphicsItems' pos() refers to their top left corner. One chooses the semantics when overriding boundingRect() (which apparently uses an item's position to map the bounding rect into the parent's space). So let's check the boundingRect documentation, whether it contains this hint. Hm, no. But it contains an example:
QRectF CircleItem::boundingRect() const
{
qreal penWidth = 1;
return QRectF(-radius - penWidth / 2, -radius - penWidth / 2,
diameter + penWidth, diameter + penWidth);
}
Damn, so the bounding rect does not start at (0, 0), even though it's in the item's coordinate space ... I ended up introducing a policy for differentiating between graphics items that do it right (by following the Qt conventions) and those that I created, for whenever I have to deal with item positions. The other possibility - to fix every item to follow the Qt convention - would have been too much work, sadly.
I wish I had discovered this earlier. (Extra rant: That's why too much documentation that is more about story telling than about being to the point is just as wrong as no documentation.)
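One way to live with both conventions is to never look at pos() alone, but to always ask for the item's center in parent coordinates. The following is only a sketch of that idea, not the policy code I actually used: it assumes untransformed items and uses plain stand-in structs instead of QPointF and QRectF so that it compiles without Qt. The helper centerInParent is made up for this example.

```cpp
#include <cassert>

// Plain stand-ins for QPointF and QRectF.
struct PointF { double x; double y; };
struct RectF { double x; double y; double width; double height; };

// Center of a rectangle, in the item's own coordinate system.
inline PointF rectCenter(const RectF &rect)
{
    return PointF{rect.x + rect.width / 2, rect.y + rect.height / 2};
}

// Hypothetical helper: center of an item in its parent's coordinate system,
// given the item's position and bounding rect. Assumes no rotation or
// scaling. For items following the Qt convention (bounding rect centered
// around (0, 0)) this is simply the position; for items whose bounding rect
// starts at (0, 0) it is the position shifted by half the item's size.
inline PointF centerInParent(const PointF &pos, const RectF &boundingRect)
{
    const PointF center = rectCenter(boundingRect);
    return PointF{pos.x + center.x, pos.y + center.y};
}
```

With this, a centered item at (100, 50) with bounding rect (-10, -10, 20, 20) and a top-left item at (90, 40) with bounding rect (0, 0, 20, 20) both report (100, 50) as their center.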
Operator Overloading
Just wondered right now why Qt doesn't provide a greater-than operator for QSize.
Well, indeed: How would you define this operator? Maybe like this?
inline bool
operator >(const QSize &a, const QSize &b)
{
return a.width() * a.height() > b.width() * b.height();
}
Or is this the proper definition?
inline bool
operator >(const QSize &a, const QSize &b)
{
return (a.width() > b.width() || a.height() > b.height());
}
Mathematicians might intuitively choose the first alternative, aka covered area. I claim that for UI problems the second interpretation is usually the useful one.
Funnily the Qt author(s) of QSize implicitly agree with my claim, as they provide:
inline bool QSize::isValid() const
{
return wd>=0 && ht>=0;
}
Which means that "!(b - a).isValid()" computes the same result as my preferred interpretation of the greater-than operator.
Well, my currently preferred interpretation, within the scope of my current problem. Oh, and sans integer overflows and such "minor problems" - of course. Does someone really care about such "minor details"? :-)
So what does this tell us? API design is fun. Even more if you add operator overloading to the soup.
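To make the difference between the two interpretations tangible, here is a small sketch with a stand-in Size struct instead of QSize, so it compiles without Qt. The function names are made up for this example. It also spells out the isValid() relation: "larger in at least one dimension" is true exactly when (b - a) is not valid.

```cpp
#include <cassert>

// Minimal stand-in for QSize.
struct Size {
    int width;
    int height;

    // Mirrors QSize::isValid(): both dimensions are non-negative.
    bool isValid() const { return width >= 0 && height >= 0; }
};

inline Size operator-(const Size &a, const Size &b)
{
    return Size{a.width - b.width, a.height - b.height};
}

// First interpretation: compare the covered area.
inline bool greaterByArea(const Size &a, const Size &b)
{
    return a.width * a.height > b.width * b.height;
}

// Second interpretation: a exceeds b in at least one dimension,
// which means a does not fit into b.
inline bool greaterInAnyDimension(const Size &a, const Size &b)
{
    return a.width > b.width || a.height > b.height;
}

// Same result as greaterInAnyDimension(), expressed via isValid().
inline bool greaterViaIsValid(const Size &a, const Size &b)
{
    return !(b - a).isValid();
}
```

For a = 10x100 and b = 50x50 the two interpretations disagree: the area of a is smaller, yet a does not fit into b.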
*Disclaimer: There is nothing Qt-specific in this post. Qt only provides the example.*
Using DBus as lock-daemon
Recently I found this comment in the source code I am working with:
// what if both processes read in the same time and write at the same time, no increment
Please! Don't do such things! Don't just leave such comments in the hope that someone else will come along and fix it later. Please take the time to apply a locking mechanism.
The obvious choice when dealing with files would be to create a lock file. Unfortunately, creating a file based lock isn't trivial, especially when you deal with portable software. Questions arise like: Is that system call really atomic in this context? Is the feature even available? Which characters can be used in the file name? Can it start with a dot? How much time does it take? Will it reduce the lifetime of my flash media?
Uh, my head is spinning! Stop!!!
Somewhat understandable that my colleague just left a comment.
Well, fortunately there is a more trivial solution for that problem, if you have DBus available. More experienced DBus hackers already know this and will feel bored now, but to everyone else:
DBus service names can be used as locking mechanism!
Implementation would look similar to this:
bus.request_name('de.taschenorakel.locker.example')
bus.wait_for_name('de.taschenorakel.locker.example')
# ... now do some work while holding the name ...
bus.release_name('de.taschenorakel.locker.example')
Easy, isn't it? Doesn't hit the file system. Fully implemented. Ready to use. Tested daily on your desktop.
One implementation of that concept can be found in qtcontacts-tracker.
QML Hype
So yesterday I skipped the chance to watch some "exciting" QML demos in Helsinki. This was quite surprising to some of my KDE-rooted team mates. They didn't understand how I could not show the slightest sign of excitement.
Well, actually I have been wondering for months: What is the fancy and awesome, the brilliantly new, the exciting part of QML? It doesn't seem to be rocket science. It's nothing new. Declarative UIs have been done for ages. To name just a few implementations, there are Windows and PM/Shell RC files, Glade, GtkBuilder. You want to mix declarations with managed code? XUL and XAML have visited that land. You want to use JavaScript for your UIs? Flash, XUL, Dynamic HTML and Web Widgets, GObject Introspection.
So what am I missing, except that Qt finally catches up with its competition? It's a welcome addition, but why should I die of excitement?
Logging facility for Miniature
On Maemo 5, log output from your app isn't always accessible to the user. This has created problems for Miniature bug reports (see bug #8124). To solve this, I created a "Game Log" screen which allows filtering the messages (by a given log level; I might want to allow combinations, too). It also has a nice fat "Copy all" button, so that the log output can be quickly attached to a bug report.
Now I "only" need to add useful log information =D
Thanks to Openismus for letting me work on this.
Testdriving the UI Extensions for Mobile on Maemo 5
Reading about Nokia's UI Extensions for Mobile (sources available) I wanted to quickly try it myself. So I looked at the provided examples, and this video is the result of what I came up with (well, of course it is FOSDEM-induced). The provided API allowed me to easily apply my previous Qt knowledge, which is a nice touch. Sources for the example in the video app can be found here and here. Happy hacking!
Update: The UI Extensions for Mobile will not compile on 64bit architectures.
How to customize your view with delegates
The usual way in Qt, when an item view wants to render itself, is to ask the item model for everything: layout, style and the actual data. Storing view-specific layout information in the model itself means that views and models can only ever exist in a 1:1 relationship. The logical conclusion: If you need to share a model between different views, you will probably have to add proxy models for each view. I wasn't satisfied with that idea, so I continued to research QStyledItemDelegates. They have the following nice properties:
- They are owned by the view.
- They have complete control over the cell they are assigned to (although that might not be too obvious).
- They only get called if there's work to do, that is, nothing is wasted unless a cell becomes visible in a view.
So instead of having one proxy model per view, I can now keep the single model/multiple views approach by writing custom delegates for each view. For a tabular view, that means I want to install custom delegates per column or row (see Qlom's main window implementation).
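The single model/multiple views idea can be sketched without any Qt at all. The following is only a plain C++ model of the design, not Qt code: Model, Delegate, PercentDelegate and View are hypothetical stand-ins, and only the method names displayText and setItemDelegateForColumn are borrowed from Qt to show which role each part plays.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Stand-in for an item model: raw numbers only, no layout or style information.
typedef std::vector<std::vector<double> > Model;

// Stand-in for QStyledItemDelegate: turns a raw model value into display text.
struct Delegate {
    virtual ~Delegate() {}
    virtual std::string displayText(double value) const {
        std::ostringstream out;
        out << value;
        return out.str();
    }
};

// A view-specific formatter that shows a fraction as a percentage.
struct PercentDelegate : Delegate {
    std::string displayText(double value) const {
        std::ostringstream out;
        out << value * 100 << "%";
        return out.str();
    }
};

// Stand-in for an item view: refers to the shared model and keeps its own
// per-column delegates (not owned here; the caller keeps them alive).
class View {
public:
    explicit View(const Model &model) : m_model(model) {}

    void setItemDelegateForColumn(std::size_t column, const Delegate *delegate) {
        m_delegates[column] = delegate;
    }

    // Formats one cell with the column's delegate, or the default one.
    std::string cellText(std::size_t row, std::size_t column) const {
        assert(row < m_model.size() && column < m_model[row].size());
        std::map<std::size_t, const Delegate *>::const_iterator it = m_delegates.find(column);
        const Delegate &delegate = (it != m_delegates.end()) ? *it->second : m_default;
        return delegate.displayText(m_model[row][column]);
    }

private:
    const Model &m_model;
    Delegate m_default;
    std::map<std::size_t, const Delegate *> m_delegates;
};
```

Two View objects constructed from the same Model can then format the same column differently, one with the default delegate and one with a PercentDelegate, while the model stays free of presentation details.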
How to change colors of a cell
Use the delegate's paint method, apply your changes to the QStyleOptionViewItem's palette and forward the paint request to the parent class. I had problems getting the correct background role, that is why I simply used the supplied painter to draw the background myself.
How to format data from the model in the view
The delegates' displayText method comes in handy. The model's data is wrapped in a QVariant already, so your custom delegate can apply all kinds of string formatting here. Just be aware that this method will never be called if the QVariant returned from the model (for the queried model index) is null.
How to replace a cell with a custom widget
For this, we abuse the fact that delegates are owned by their view. We only need to find a method that has a model index parameter, e.g., the paint method, and we are good to go! Inside that method, we query the parent() and use the item view's setIndexWidget method (perhaps check if there already is a widget at the given index, so that we don't end up re-creating widgets for every paint request).
In Qlom, the embedded widget is a simple button, so I was interested in its pressed signal. For that, I used a QSignalMapper to bind custom data (here: the model index) to the delegate's buttonPressed signal. Now the view can connect to the delegates' buttonPressed signal, unwrap the custom QObject to find the model index and display a nice message box with the exact model index of the clicked button.
Feedback and especially corrections welcome!
Fragile API and invalid iterators
For the Miniature project I sometimes need to iterate over all graphics items in a QGraphicsScene, or more precisely, all items of a specific parent item, the chess board itself. For that, I use the QGraphicsItem::childItems() API. However, when used with STL-style iterators you have to be careful, since value types of that form can easily break the iterator idiom:
for (QList<QGraphicsItem *>::iterator iter = parent_item->childItems().begin();
iter != parent_item->childItems().end();
++iter)
{
(*iter)->doSth(); // Ooops! Invalid iterator deref'ing!
}
The value-type list that is returned by parent_item->childItems() has not been bound to a variable explicitly, so it is bound to an anonymous variable instead. On top of that, each call in the loop header returns a fresh temporary, so begin() and end() do not even refer to the same list. The compiler will not complain here although the code is broken, from a C++ point of view. Worse yet, "(*iter)->doSth();" might work often enough for you to not see the crashes right away, depending on the compiled code and how the architecture handles memory.
Eventually though it will crash, and according to Murphy this will happen right after the big release. Valgrind of course would have complained about the illegal memory access right away (when run through that part of your code), because the list that we queried in the loop header is destroyed at the end of that expression, before we even enter the loop body, leaving us with an iterator that points into the void. This is C++-specific: in other languages even anonymous variables are only ever cleaned up after leaving the current block context, but in C++ their lifetime is only guaranteed until the end of the current expression.
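The lifetime rule can be made visible with a small, Qt-free sketch. Everything in it is an assumption made for illustration: ItemList stands in for QList<QGraphicsItem *>, the free function childItems() stands in for QGraphicsItem::childItems(), and a global counter tracks how many lists are alive at any moment.

```cpp
#include <cassert>
#include <vector>

// Number of ItemList objects that currently exist.
static int g_liveLists = 0;

// Stand-in for QList<QGraphicsItem *>: a value type that tracks its own lifetime.
struct ItemList {
    std::vector<int> items;
    ItemList() : items(3, 42) { ++g_liveLists; }
    ItemList(const ItemList &other) : items(other.items) { ++g_liveLists; }
    ~ItemList() { --g_liveLists; }
    std::vector<int>::iterator begin() { return items.begin(); }
    std::vector<int>::iterator end() { return items.end(); }
};

// Stand-in for QGraphicsItem::childItems(): returns the list by value.
ItemList childItems() { return ItemList(); }

// Mimics the broken loop header and reports how many lists survive it.
int listsAliveAfterBrokenHeader() {
    std::vector<int>::iterator first = childItems().begin(); // temporary dies at the ';'
    std::vector<int>::iterator last = childItems().end();    // a second, unrelated temporary
    (void) first; // both iterators dangle now; they must not be
    (void) last;  // dereferenced or compared
    return g_liveLists;
}

// Mimics binding the list to a variable and reports how many lists are alive.
int listsAliveAfterBinding() {
    ItemList children = childItems();
    assert(children.begin() != children.end()); // safe: the list is still alive
    return g_liveLists;
}
```

Calling listsAliveAfterBrokenHeader() yields 0: no list exists any more although two iterators into "it" do. listsAliveAfterBinding() yields 1, since the named variable keeps the list alive until the end of its block.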
There is one fix and one workaround to that problem. First the fix: let QGraphicsItem::childItems() return a reference to a list member instead, or simply expose the begin/end iterators for the internal data structure directly. Assuming this is not possible, we are left with the workaround - bind the returned list to a variable explicitly, before the iterator is used:
QList<QGraphicsItem *> children = parent_item->childItems();
for (QList<QGraphicsItem *>::iterator iter = children.begin();
iter != children.end();
++iter)
{
(*iter)->doSth(); // Fine!
}
This of course prolongs the life time of the returned list unnecessarily. We only needed it inside the loop, but now it won't be cleaned up until the flow of control leaves the surrounding block context. To finally get the desired behaviour we could limit the surrounding block context:
{
QList<QGraphicsItem *> children = parent_item->childItems();
for (QList<QGraphicsItem *>::iterator iter = children.begin();
iter != children.end();
++iter)
{
(*iter)->doSth(); // Fine!
}
}
Perhaps a cleaner way is to move the loop code into a function of its own instead, although that might not always be feasible, depending on the code inside the loop body.
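A rough sketch of that variant, again without Qt: Item and Parent are hypothetical stand-ins for a custom QGraphicsItem type and its parent item, and doSthForAll is a made-up helper name. Because the helper takes the list by const reference, the temporary returned by childItems() lives exactly as long as the call, which is the lifetime we wanted in the first place.

```cpp
#include <cassert>
#include <vector>

// Stand-in for a custom QGraphicsItem type.
struct Item {
    int visits;
    Item() : visits(0) {}
    void doSth() { ++visits; }
};

// Stand-in for the parent item; childItems() returns by value, like the Qt API.
struct Parent {
    std::vector<Item *> children;
    std::vector<Item *> childItems() const { return children; }
};

// The former loop, moved into a function of its own. The argument may be a
// temporary: it stays alive until the call has returned.
void doSthForAll(const std::vector<Item *> &items) {
    for (std::vector<Item *>::const_iterator iter = items.begin();
         iter != items.end();
         ++iter)
    {
        assert(*iter != 0);
        (*iter)->doSth(); // Fine!
    }
}
```

The call site then shrinks to doSthForAll(parent.childItems()); and no extra block or named variable is needed.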
Another issue might be the need to dynamically cast QGraphicsItems (QGI) to the desired type, once you implemented your custom QGraphicsItem type. This wouldn't be too bad if there wasn't this horrible mix of QGraphicsObjects and QGraphicsItems in Qt 4.6 - some items inherit from QObject (QGraphicsSvgItem), others don't (QGraphicsPixmapItem). So now you also have to decide between dynamic_casts and qobject_casts, how nice ... not! Honestly, it would be easier here to simply forget about qobject_casts, but that might break in subtle ways.
These two issues were enough for me to avoid the QGI::childItems() API whenever possible. One possible replacement can be provided by signals and slots [1]: First, add a slot representing the loop body to your custom QGI type. Second, connect a signal to the instances of your QGI type that would have been iterated in the loop, that is, have a custom signal representing the loop iteration. Then, instead of iterating over the children of the parent item you simply emit the custom signal which triggers the "loop body" execution for each connected custom QGI instance.
[1] requires QObject inheritance for your custom QGraphicsItem type
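The idea of replacing the loop by a signal emission can be sketched in plain C++11. This is not Qt's signal/slot machinery: Signal is a deliberately tiny stand-in built on std::function, Piece plays the role of the custom QGI type, and connectPiece is a made-up helper.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Minimal stand-in for a Qt signal: stores callbacks and calls them on emit.
class Signal {
public:
    void connect(const std::function<void()> &slot) {
        assert(slot); // refuse empty callbacks
        m_slots.push_back(slot);
    }

    // Runs the "loop body" once per connected object.
    void emitSignal() const {
        for (std::size_t i = 0; i < m_slots.size(); ++i)
            m_slots[i]();
    }

private:
    std::vector<std::function<void()> > m_slots;
};

// Stand-in for the custom QGraphicsItem type; doSth() is the former loop body.
struct Piece {
    int updates;
    Piece() : updates(0) {}
    void doSth() { ++updates; }
};

// Connects one piece to the signal; the piece must outlive the signal's use.
void connectPiece(Signal &signal, Piece &piece) {
    signal.connect([&piece]() { piece.doSth(); });
}
```

Each piece is connected once when it is created, and a single emitSignal() call then replaces the iteration over childItems(). No list is returned by value anywhere, so there is nothing left that could dangle.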
About delegates and cell renderers - data formatting in Qt
Warning: the following blog post has a rant-to-usefulness ratio of 3:1.
Last week I needed to perform some data formatting on a Qt list view. From reading the documentation alone I could not find a satisfying answer. When asking on IRC the answer was to use proxy models. This would have worked, but having two models for one view can create all kinds of correspondence problems (think of sorting etc.). And there is this interesting bug report, complaining that there is no easy way to format data. Status: rejected! So formatting should be a responsibility of the view? Wow, who would have thought ... however, if proxy models are a no-go we are left with ... custom delegates.
(cut some nonsense about MVC)
In the Qt world, a delegate is responsible for three tasks:
- editing contents displayed in a view (possibly by creating new editor widgets on the fly),
- updating the model with the modified contents,
- rendering the contents in a view.
Somehow, that's two tasks too many for something that is not a controller. Let's take a look at GTK+ cell renderers: they only perform one task, and that's exactly what their name suggests (correction: actually, they are responsible for editing and updating as well; I should have read more). A text cell renderer renders text, a pixbuf cell renderer renders pixbufs, and so on. Simple but tremendously flexible.
So how can I inject this flexibility into delegates? The QItemDelegate won't be very useful unless you want to use a QPainter for everything. But there is this styled item delegate, added with version 4.4 of Qt. And if we look at the roles & accepted types table we can see how this - together with displayText - could translate nicely into Qt "cell renderers". So we create custom styled delegates for each data type we need to display in our view: text delegates, pixmap^Wdecoration delegates ... Once we have defined a set of custom delegate classes we then request the view to use them on a per-row or on a per-column basis. A proof of concept can be found here.
EDIT: Had to correct some nonsense, this post was too much of a rant and I wasn't thinking clearly. What I originally wanted to express was that with QStyledItemDelegate we can have something very similar to GtkCellRenderer and use them both in a very similar way, too. I think the Qt documentation would have been easier to follow with a straightforward, simple data formatting example, hence my rant. Sorry =/
Qt 4.6 for Maemo: It works!
Today at work, David King kindly informed me that there was some new Qt package in extras-devel. This could only mean one thing - I immediately fired up my scratchbox environment and installed the packages, trying to confirm that this new version would run with Qlom. And in fact, it was surprisingly painless. Thanks to autotroll, a simple QT_PATH env variable did all the magic, hooray!
Both of us were impressed with the UI improvements. It's certainly a big step forward regarding the Hildonisation of Qt on Maemo 5. The application menus look correct now. Button sizes, colors, animations, etc - it all comes together nicely, finally.
There are still some widgets that need more work, but for a tech preview this is a pleasant surprise.
On another note, the timing for the Miniature project could not have been much better. We immediately switched to Qt 4.6, and it even runs on the N900. It feels good to know that we can stop using hacks and that we can start to do (most) things properly, staying as cross-platform as possible. Needless to say, Quim was happy, too.
Miniature - it moves!
How it began
When I read Quim's thread about the idea for a better Maemo chess app I knew I wanted to join the project. To me, it's all about the device and the sparkling Hildon UX. I really want a good chess app, for myself! I want to play chess online, everywhere! And I want to analyze games as (OK, maybe after =p) they happen. No more "I'll check this position later" (we all know this rarely happens).
So I finally started last Friday. At this point, Quim and Andreas had already created a beautiful, content-rich wiki page. It took a while for me to digest it all, and I added information where appropriate.
Kick-starting the development
Andreas had registered a garage project, but we eventually decided to use gitorious for our repository. Gitorious' UI definitely improved over the recent months, and the possibility to have teams working on a single project - also known as not-so-extreme-dvcs-development - makes gitorious a better choice than github, at the moment.
Saturday night (what better things to do than coding some Qt - my soul will be forever lost) I had a first running example (see screenshot). Currently, Miniature can move between positions, using next/prev menu navigation (we don't need this functionality per se, but it's perhaps a good demonstration that the simple approach I took works).
So no matter the toolkit, no matter the outdated packages or the endless confusion I had with the various Qt repos at gitorious - this project is really fun! Hopefully we get to make a 0.1 release soon.
Wt - a Qt-ish web toolkit written in C++
First of all, C++ for web apps doesn't necessarily make it easier to write them. Compared to other solutions (let's say Django) you'll end up writing a lot more code. And you have to be very very careful with memory leaks.
But that isn't the point. The main advantage of this framework is how close it is to desktop applications. Porting your Qt desktop application to the web is certainly easier with Wt than with any other solution, as you keep most of the widget API and also the signal and slots paradigm (which should allow porting the app while reusing the same business logic as on the desktop).
Also, this is the first time I could attach a powerful debugger, namely gdb, to a web app and debug it as if it were a normal desktop app. Together with the compiler-guaranteed type safety this is a huge improvement for code robustness.
A toolkit that actually uses C++
There are also some distinct advantages of this toolkit over Qt itself:
- no MOC preprocessing (unless you integrate it with your Qt libraries, of course),
- boost::signals & boost::bind wrapped in an easy-to-use WSignal, keeping most of the boost API accessible for the user. Which means: the compiler can check whether your signal connections will work!
- boost::any instead of QVariant: another big advantage. The stored type of a boost::any is inferred using template voodoo, so again the compiler can check its correct usage for you. Getting values out sadly requires a cast (boost::any_cast), it seems.
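The last point about getting values out can be tried without Boost. The sketch below swaps in std::any from C++17, which closely follows the boost::any design, so it needs a C++17 compiler and is not Wt or Boost code. The helper names holds, toIntOr and demo are made up for this example.

```cpp
#include <any>
#include <cassert>
#include <string>

// Returns true if `value` currently stores exactly a T.
// The pointer form of any_cast returns a null pointer on a type mismatch.
template <typename T>
bool holds(const std::any &value) {
    return std::any_cast<T>(&value) != nullptr;
}

// Extracts an int, or returns `fallback` if something else is stored.
// The reference form of any_cast throws std::bad_any_cast on a mismatch.
int toIntOr(const std::any &value, int fallback) {
    try {
        return std::any_cast<int>(value);
    } catch (const std::bad_any_cast &) {
        return fallback;
    }
}

// Small self-check covering the matching and the mismatching case.
void demo() {
    std::any value = 42; // the stored type (int) is deduced from the initializer
    assert(holds<int>(value));
    assert(toIntOr(value, -1) == 42);

    value = std::string("forty-two"); // same variable, different stored type
    assert(!holds<int>(value));
    assert(toIntOr(value, -1) == -1);
}
```

The stored type is deduced when a value goes in, but the check on the way out happens at run time inside any_cast, so a wrong guess shows up as a null pointer or an exception rather than as a compile error.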
Installation
The fastest way is to simply get the source from the git repo and to follow the instructions:
- (enter your jhbuild shell if your project happens to use any GNOME stuff)
- enter the cloned git repo
- "$ mkdir build"
- "$ cd build #dont ignore this, it helps later on!"
- "$ cmake -i ../ #interactive mode asks lots of stupid things but also various path settings!"
- "$ make"
- "$ make install"
For your own projects make sure to link against all the needed Wt libs: libwt (core), libwtext (if you want to use anything from Wt::Ext, see below), libwthttp (if you want to build apps with inbuilt web server, quite helpful). If you happen to use autotools then you might want to include AC_CHECK_LIB macros into your configure.ac. I had trouble finding non-name-mangled function names, but "$ readelf --dynamic --symbols /path/to/lib/lib.so | grep -v _Z | tail" should help.
ExtJS
Wt wraps a highly advanced (in terms of bringing desktop UX to the web) JavaScript framework - ExtJS. Sadly, it is also big (roughly 80kb have to travel to the client first) and very buggy. It would seem that you could make ExtJS use jQuery instead, perhaps that's worth a try. For now I disabled ExtJS in my little project, since I couldn't debug some of its issues. The Wt-native widgets seem to be quite solid in comparison (even if they look a bit boring).
The installation instructions that come with Wt don't tell you how to install ExtJS, so here is what I found out:
- download ExtJS framework (you probably want version 3),
- copy it into your chosen wt webroot (let's say "/var/www/wt/"), consult your CMakeCache.txt,
- extract the zip archive,
- THEN copy /var/www/wt/ext/adapter/ext/ext-base.js to /var/www/wt/ext/ (the error message when starting the web server gave that hint away),
- make sure your project is linked against the libwtext library.
Who wants my feedback for the Maemo6-Qt tech preview?
During the Maemo Summit there were several talks about the upcoming Maemo 6 platform and also about the transition from GTK+/Hildon to Qt. One of them - the "(Introduction to the) Harmattan UI framework" also mentioned where to get the code for the Maemo 6 UI Framework from. So I went there and cloned the two available repos (tech preview of the framework, tech preview of the homescreen) together with the most recent Qt version that is also on that site.
But sadly, compiling the code wasn't possible for me, apparently because some header files were missing. The talk didn't mention a specific resource for feedback so I tried Maemo's Bugzilla. An e-mail to the git repo maintainer didn't yield any response (yet), either.
I reckon that my bug report on the Maemo bugtracker was filed against the wrong module (Qt on Fremantle), but where else should I post problems like that? Even if at this stage the above mentioned repos are probably not much more than a widget gallery it seems like a wasted opportunity to collect feedback, no?
First steps with Qt
I started to look into Qt with the help of "Foundations of Qt Development" by Johan Thelin. It shares with its counterpart the decent text and the sometimes not-so-convincing examples. Perhaps code examples from a book should generally be maintained like a real project (including bugtracker, mailing list/forum and public repository), since that allows the examples to improve over time.
Now, a personal highlight while reading the first chapters of the book was the screenshot of a paper'n'pencil design at the beginning of the second chapter. From there it continued with the derived use cases and then went on to explain the needed events and event handlers (signals and slots in QtSpeak) - a very convincing approach. However, the opportunity of introducing the reader to TDD was missed, and Qt unit testing is only mentioned late in the book (chapter 16). This actually implies that writing unit tests after writing the application code is OK. Nothing could be further from the truth, since testability is a design choice.
My first impressions of Qt are twofold. There are already quite a few things I strongly dislike in Qt. Nevertheless I have to admit that it enables you to rapidly develop solid desktop applications, without boilerplating your code.





