
The Google Summer of Code ended four months ago. Since then, both projects have received some improvements, some tiny and some significant.

Glib kqueue backend

The patch is available in NetBSD's pkgsrc, thanks to Julio Merino for packaging it. You will need to rebuild the glib2 package as follows:

$ cd /usr/pkgsrc/devel/glib2

Please report any issues to my personal e-mail (see the About page).

I was recently contacted by Antoine Jacoutot, an OpenBSD and GNOME contributor. He has integrated the gio patch into OpenBSD's package collection, so the functionality is now available on OpenBSD too. Antoine also mentioned that he can help with pushing the patch upstream, and I will work on it in Jan/Feb.


As far as I know, today libinotify is packaged in two systems:

  1. NetBSD -- Thanks to Thomas Klausner for packaging.
  2. FreeBSD -- Thanks to Stanislav Sedov for FreeBSD patches and packaging.

Some problems were found on OpenBSD -- there is no pthread_barrier(3) there, so I will have to write my own synchronization primitive instead of using this convenient one.

libinotify has made it possible to run incrond on NetBSD and FreeBSD. I will prepare a package for it soon. Also, porting is planned.

If you want to build and install the latest libinotify on your BSD system:

$ git clone git://
$ cd libinotify-kqueue
$ autoreconf -fvi
$ ./configure
$ make
$ make test
# make install

Some tests (IN_OPEN, IN_CLOSE_WRITE, IN_CLOSE_NOWRITE) will fail, and this is expected -- these events are simply not implemented.

So, the work is in progress. Happy Holidays to everyone!

Testing the behavior

All planned features for libinotify-kqueue are implemented, and now I am working on tests.

Tests are especially important for a library that emulates an existing behavior. So I have decided to cover the library with behavioral tests rather than unit tests.

How do we test the behaviour of an entity like inotify? To answer this question, we first have to figure out what it provides.

So, with inotify we can:

  1. Start a watch;
  2. Modify a watch;
  3. Register notifications from watch(es);
  4. Remove a watch.

Great. But if we want to register notifications from the file system, we first have to produce some activity on the file system. So we need two entities: one produces activity (a producer) and one registers it (a consumer).

Next, we have to ensure that the registered events are the ones we expected. So the scheme by which the producer and the consumer communicate might look like this:

  1. The producer tells the consumer which events are expected to appear;
  2. The consumer starts registering events;
  3. The producer starts generating file system activity;
  4. The consumer registers all the expected events OR reaches the timeout;
  5. The consumer tells the producer which events have been left unregistered;
  6. The producer processes this info and passes or fails the appropriate test case.
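
Steps 4 and 5 of this scheme can be illustrated with a small sketch. This is not the code from the test suite; the struct and function names are hypothetical, and the `wd`/`mask` fields merely stand in for the fields of an inotify event.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One expected notification. Hypothetical type: wd and mask mirror the
 * fields of struct inotify_event, seen is the consumer's bookkeeping. */
struct expected_event {
    int      wd;
    uint32_t mask;
    int      seen;   /* set to 1 once the consumer has registered it */
};

/* Step 4: mark the first not-yet-seen expectation that matches an
 * incoming event. Returns 1 if the event was expected, 0 otherwise. */
int
register_event(struct expected_event *exp, size_t n, int wd, uint32_t mask)
{
    assert(exp != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!exp[i].seen && exp[i].wd == wd && (exp[i].mask & mask) != 0) {
            exp[i].seen = 1;
            return 1;
        }
    }
    return 0;
}

/* Step 5: after all events arrived or the timeout expired, count the
 * expectations that were left unregistered. Zero means "test passed". */
size_t
count_unregistered(const struct expected_event *exp, size_t n)
{
    size_t left = 0;

    for (size_t i = 0; i < n; i++)
        if (!exp[i].seen)
            left++;
    return left;
}
```

In such a sketch the producer would fill the array before generating activity and fail the test case if `count_unregistered()` returns a non-zero value.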

The communication between the producer and the consumer can also be generalized like this:

  1. The producer tells the consumer what to do;
  2. The consumer executes the action;
  3. The producer waits while the consumer works (the producer can do its own work here too);
  4. The consumer reports the results to the producer;
  5. The producer passes or fails the test case.

Besides monitoring for expected events, an action can also be:

  • Starting a watch or updating flags of an existing watch;
  • Stopping a watch.

These actions give us additional functionality, quite suitable for covering sophisticated scenarios.

The test suite has to run on both Linux and NetBSD. All tests have to pass on Linux; some of them are expected to fail on NetBSD. I will intentionally write some tests knowing that they will fail on NetBSD.

The test suite should also have entities for logging and tracking passed and failed tests.

Considering all these requirements, I have developed my own tiny BDD framework. Its latest version can be viewed here (later on, more recent versions will be available in the master branch).

It was an interesting experience to make it run on both Linux and NetBSD. I have represented both the consumer and the producer by threads (it is probably possible to run them in a single thread, but as I see it that would require building an FSM, and with an FSM it would be harder to understand the flow of the tests), and the hardest part (as always) was their synchronization. Pthread barriers fit perfectly here; they worked fine on Linux but did not work well on NetBSD. So I had to implement my own primitives on top of mutexes and condition variables.
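
For illustration, here is a minimal sketch of how a barrier can be built on top of a mutex and a condition variable. It is not the primitive from the test suite; the `simple_barrier` type and its functions are hypothetical names, and the generation counter is one common way to make the barrier reusable and robust against spurious wakeups.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for pthread_barrier_t. */
struct simple_barrier {
    pthread_mutex_t lock;
    pthread_cond_t  all_arrived;
    unsigned        needed;      /* threads required to release the barrier */
    unsigned        waiting;     /* threads currently blocked */
    unsigned        generation;  /* bumped on every release */
};

int
simple_barrier_init(struct simple_barrier *b, unsigned needed)
{
    assert(b != NULL && needed > 0);
    b->needed = needed;
    b->waiting = 0;
    b->generation = 0;
    if (pthread_mutex_init(&b->lock, NULL) != 0)
        return -1;
    if (pthread_cond_init(&b->all_arrived, NULL) != 0) {
        pthread_mutex_destroy(&b->lock);
        return -1;
    }
    return 0;
}

/* Blocks until `needed` threads have called it. Returns 1 in the last
 * thread to arrive and 0 in all the others. */
int
simple_barrier_wait(struct simple_barrier *b)
{
    int last = 0;

    pthread_mutex_lock(&b->lock);
    unsigned gen = b->generation;
    if (++b->waiting == b->needed) {
        b->waiting = 0;
        b->generation++;                 /* start a new round */
        pthread_cond_broadcast(&b->all_arrived);
        last = 1;
    } else {
        while (gen == b->generation)     /* ignore spurious wakeups */
            pthread_cond_wait(&b->all_arrived, &b->lock);
    }
    pthread_mutex_unlock(&b->lock);
    return last;
}

void
simple_barrier_destroy(struct simple_barrier *b)
{
    pthread_cond_destroy(&b->all_arrived);
    pthread_mutex_destroy(&b->lock);
}

static void *
demo_worker(void *arg)
{
    /* Pass the wait result back through the thread's exit value. */
    return (void *)(intptr_t)simple_barrier_wait(arg);
}

/* Two threads meet at the barrier; returns how many of them were told
 * they arrived last, or -1 on a setup error. */
int
simple_barrier_demo(void)
{
    struct simple_barrier b;
    pthread_t t;
    void *res = NULL;

    if (simple_barrier_init(&b, 2) != 0)
        return -1;
    if (pthread_create(&t, NULL, demo_worker, &b) != 0) {
        simple_barrier_destroy(&b);
        return -1;
    }
    int mine = simple_barrier_wait(&b);
    pthread_join(t, &res);
    simple_barrier_destroy(&b);
    return mine + (int)(intptr_t)res;
}
```

In a producer/consumer test flow, both threads would call `simple_barrier_wait()` at each synchronization point, so neither proceeds until the other is ready.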

Currently I am covering inotify with tests on Linux. There are 2 tests with 16 test cases in total now, but it is just the beginning and I expect to end up with 50-70 test cases. Anyway, the first test already crashes my library to death :) But I will resolve all these issues only after writing all the tests.

Test suite running on Debian GNU/Linux, larger version, 10K

First results on the libinotify-kqueue compatibility library

A few hours ago I merged a big squashed commit into the master branch of the second part of my GSoC project.

The code is the implementation of the inotify emulation and a test application. Not all the planned events are supported currently, but it is just a matter of time.

During the last two weeks I have changed some design decisions and rewritten some pieces of code several times. The main complication is in the inotify behavior: when you start watching a directory, inotify also watches all the files in this directory. So we have a dependency between a user-supplied entry (a directory) and automatically watched entries (the files in that directory). Obviously, the dependent files should also be monitored and events from them should be delivered to the user. Handling this mess in C is an unforgettable experience :)

The library has now reached its final conceptual form, but I will write a more detailed overview a bit later, after a set of refactorings and clean-ups.

The screenshot below illustrates what it does and how it works:

A larger version (20K) is also available.

Midterm evaluations

It is time for the Google Summer of Code midterm evaluations, and here I will summarize the current project status.

GIO/kqueue backend

All the planned functionality is implemented:

  • File and directory monitoring, supported events are
  • Monitoring for files and directories which do not exist yet;
  • Monitoring cancellation.

Current limitations

  • Pair moves (i.e. when a GFileMonitor emits DELETED and CREATED events for a file move) are not supported;
  • The backend opens a file descriptor for each monitored entry, so the file system of an observed file cannot be unmounted.

All the issues pointed out by you and the Glib guys have been resolved and merged into my kqueue/master branch.

Further plans

  • Make the kqueue backend behave more like the inotify one, if there are significant differences;
  • Fixes and further improvements (e.g. more error handling and robustness);
  • Tests. I have not found any monitor tests in the Glib test suite, so I am going to write my own.

Brief design overview

The backend runs an additional thread, the kqueue thread, where the monitoring is performed.

When the user creates a GFileMonitor instance for a file, the backend tries to open that file, and if the file is not found, the backend pushes it into a missing files list. If the file is found, the backend pushes its FD to a queue and then sends a signal (via a socket) to the kqueue thread. The kqueue thread wakes up and takes this descriptor for the next kevent() call.

When a call to kevent() returns, the kqueue thread searches for signalled descriptors, marshals them and writes them into a local socket.

This socket is watched by Glib in its main event loop. The backend reads the notifications coming from the kqueue thread, converts kqueue filter flags into GIO event flags and then emits the "changed" signal on the appropriate GFileMonitor instances.

The backend periodically traverses the missing files list (if it is not empty) and tries to start monitoring each entry. If the file has appeared on the FS and monitoring has started successfully, the backend removes the entry from the missing files list. Otherwise, it keeps checking for missing files until the user cancels the monitoring on the appropriate GFileMonitor instance.

libinotify compatibility library

In development.

I have finally made the general design decisions, so it is planned to work as follows.

On each call to inotify_init() or inotify_init1() a new thread and a new kqueue will be created. The thread will sleep in kevent() on the monitored descriptors, like in the gio/kqueue backend. A socketpair will be created; one of its descriptors will be returned to the user as the inotify interface, and the other one will be used by the kqueue thread.
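
The socketpair part of this plan can be sketched as follows. This is only an illustration under simplifying assumptions, not the library's code: `emulated_init()` and `worker_main()` are hypothetical names, and the worker writes one fixed message instead of sleeping in kevent() and marshalling real events.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical worker standing in for the kqueue thread. A real one would
 * block in kevent(); this one just sends a single message to the user. */
void *
worker_main(void *arg)
{
    int fd = (int)(intptr_t)arg;
    const char msg[] = "event";   /* 6 bytes including the trailing NUL */

    if (write(fd, msg, sizeof(msg)) < 0) {
        /* A real implementation would have to report this failure. */
    }
    close(fd);
    return NULL;
}

/* Sketch of what an inotify_init() replacement could do: create a
 * socketpair, hand one end to a new worker thread and return the other
 * end to the caller. Returns the user's descriptor or -1 on error. */
int
emulated_init(pthread_t *worker)
{
    int fds[2];

    assert(worker != NULL);
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) == -1)
        return -1;
    if (pthread_create(worker, NULL, worker_main,
                       (void *)(intptr_t)fds[1]) != 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    return fds[0];
}
```

The caller can then read(), poll() or select() on the returned descriptor much as it would on a real inotify descriptor.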

The tricky ones are the inotify_add_watch() and inotify_rm_watch() functions. Both return a value to the end user, indicating failure or success. These functions will push an object (representing the requested operation) to a special internal queue (separate for each kqueue thread). Then the appropriate kqueue thread will be awakened and will perform the requested operation (add/modify/remove an entry). The kqueue thread will also put the operation status (success or failure, the identifier of an added watch, etc.) into per-operation storage so that it can then be used as the return value of inotify_add_watch() and inotify_rm_watch().

The difficulty lies in implementing this scheme: inotify_add_watch() and inotify_rm_watch() have to sleep while the kqueue thread performs the requested operation.

I have decided to use SIMPLEQ for the queue and pthread barriers for thread synchronization, so the implementation now looks clear to me.

The inotify emulation library will not support some kinds of events, like IN_CLOSE[_XXX], IN_OPEN and so on, because kqueue has no analogues for them. Some other events, like IN_MOVED_FROM or IN_MOVED_TO (for a directory), I will try to simulate by caching and diffing the directory's contents between notifications.
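
The diffing idea can be sketched like this. It is a simplified illustration, not the library's algorithm: the types and function names are hypothetical, snapshots are plain arrays, entries are matched by inode number, and hard links (several names for one inode) are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* One cached directory entry (hypothetical type). */
struct dir_entry {
    ino_t       ino;
    const char *name;
};

/* What changed between two snapshots of the same directory. */
struct diff_counts {
    size_t renamed;   /* same inode, different name: a move pair */
    size_t removed;   /* inode is gone from the directory */
    size_t added;     /* inode was not in the old snapshot */
};

const struct dir_entry *
find_by_inode(const struct dir_entry *list, size_t n, ino_t ino)
{
    for (size_t i = 0; i < n; i++)
        if (list[i].ino == ino)
            return &list[i];
    return NULL;
}

/* Compare the cached snapshot with a fresh one. Quadratic, which is
 * acceptable for a sketch but not for large directories. */
struct diff_counts
diff_snapshots(const struct dir_entry *old, size_t n_old,
               const struct dir_entry *cur, size_t n_cur)
{
    struct diff_counts d = { 0, 0, 0 };

    assert((old != NULL || n_old == 0) && (cur != NULL || n_cur == 0));
    for (size_t i = 0; i < n_old; i++) {
        const struct dir_entry *match = find_by_inode(cur, n_cur, old[i].ino);

        if (match == NULL)
            d.removed++;
        else if (strcmp(match->name, old[i].name) != 0)
            d.renamed++;
    }
    for (size_t i = 0; i < n_cur; i++)
        if (find_by_inode(old, n_old, cur[i].ino) == NULL)
            d.added++;
    return d;
}
```

A rename detected this way could be reported as an IN_MOVED_FROM/IN_MOVED_TO pair, while removed and added entries would map to delete and create events.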

Also, on Linux a user can determine the amount of pending notification data on the inotify file descriptor with ioctl() (FIONREAD). With a userspace emulation it will not be possible to do the same on BSDs, I think.

Yet another progress update

I have spent the last two weeks passing my last exams and working on gio/kqueue backend improvements. The code has been reviewed by two developers from gtk-devel-list@, Dan Winship and Colomban Wendling. They have pointed out a set of issues regarding coding style, overengineering, reimplementations of already existing routines, and so on.

My mentor Julio Merino has also posted some comments regarding the code.

There are two commits, a big squashed one and another one; together they contain all the necessary changes.

Now the development of the backend is entering its final stage. All the expected functionality is implemented, and the further plans are:

  • Tests, tests, tests. I perform so-called "smoke testing" with my Smalltalk script after every commit, but I will write some simple unit tests too. Also, I will try to compile and run GNOME with my glib tree; it would be a good stress test for the backend :)
  • Documentation. The code is already well commented in the gtk-doc format, but that is just reference documentation. I will summarize design decisions and other stuff in a separate document, and probably some posts from this blog will be included there too.
  • I am plagued by vague doubts regarding the similarities and differences in the behavior of the inotify and kqueue backends. Currently I am setting up the build environment for Glib on my Debian netbook to check how the inotify backend really behaves. If the difference is significant, I will probably adjust the kqueue backend to behave like the inotify one. The aim is to avoid surprising developers of end-user software with unexpected events that would not occur under the same conditions with the inotify backend.

Also, I have just created a new Git repo for the second part of the project. Stay tuned!

Almost ready

During the last 2+ weeks I have resolved many pending TODOs in the gio/kqueue project, including

  • Memory allocation policy improvements;
  • Some performance tricks;
  • Little refactorings and documentation.

Also, the missing files monitoring was implemented and merged into the kqueue/master branch.

So, almost all the planned functionality for GIO kqueue backend is ready and I am testing it actively now.

The backend works fine on tiny homebrew examples, like

rm file
touch file
touch file
echo 'Hello' >> file
mv file file2
rm file2

but that is not enough. I have written a GNU Smalltalk script that generates a directory tree and simulates heavy file system activity on it. It touches, deletes and moves files and directories and generates subdirectories in that tree ~50 times per second. The test application (not published yet) monitors activity on that tree. I have discovered some issues, but I am not sure whether they are related to the backend or to the application. Currently I am working on resolving them.

Anyway, the rapid monitoring on a tree of ~1000 nodes looks fantastic! :)

I have also posted two announcements, one on the NetBSD tech-pkg list and another one on the Glib/GTK+ developers list.

Stay tuned.

First results

Almost two weeks have passed since my last post here.

It is a hot time at the university: there are credit tests now and exams ahead. However, I have merged two big commits into my kqueue/master branch.

The first one just adds a kqueue backend stub to the Glib source tree; it is not as interesting as the second one, which introduces the core functionality for file monitoring with kqueue.

Currently the following events can be monitored:


Further plans for gio/kqueue extension:

  1. Implement (emulate) G_FILE_MONITOR_EVENT_CREATED and (if possible) G_FILE_MONITOR_EVENT_UNMOUNTED;
  2. Implement missing files monitoring (i.e. wait until a file is created and then start monitoring it);
  3. Resolve all 20+ TODOs (pending features and improvements, doubtful solutions in the code);
  4. Write tests and benchmarks.

Larger version, 56K

The gloomy screenshot above demonstrates what is working for now. Stay tuned!

Integrating kevent() into GMainLoop

In this post I will try to explain how the GMainLoop works. GMainLoop is the heart of every GTK+-powered application, so understanding what happens under the hood is mandatory for a Glib hacker. At least, I think that the best way to understand something better is to explain it, so I am writing this post.

What GMainLoop gives us

As I have already said, every GTK+ application runs a GMainLoop. GMainLoop processes all incoming events:

  • user input: keyboard, mouse, etc;
  • activity on file descriptors and sockets;
  • timers (timeouts)

and invokes the appropriate handlers for all kinds of events. As I have mentioned, GIO receives inotify events using the standard GIO channels; these events are also dispatched via GMainLoop.

There is polling all the way down

GMainLoop is like an endless loop of main loop iterations. On each iteration, it checks for new events from all event sources. Depending on the application, event sources may be:

  • IO channels -- files and sockets;
  • Window subsystem -- X11, Windows, Quartz, etc;
  • Timers (timeouts).

GMainLoop usually runs in a single thread. Now imagine a situation: a GTK+ application has installed an IO channel watch to handle incoming data on a socket in the Glib way. But if it is running in a single thread, how can it process GDK events and wait for data on a socket at the same time?

The first idea is that GMainLoop uses timed-out polling. The scheme would be something like this:

  1. (A) check for available IO on the polled file descriptors using select() or poll() with a short timeout;
    • if there is data available, invoke the appropriate handlers;
  2. (B) when the timeout elapses or the data has been processed, process the pending events from the X11 server;
  3. repeat.

But this scheme is very, very bad -- it requires constant switching between (A) and (B) and consumes a lot of CPU. Thank God, GMainLoop works in another way.

On GNU/Linux, GMainLoop invokes a single poll() per iteration, with an infinite timeout when no timers are pending. But why does the GUI not get blocked while we are waiting for data on a socket? The key point is that the native GTK+ environment is the X11 Window System. Remember its network transparency that people often swear at? An X11 client and the X11 server communicate via a socket, and this socket is polled in the same file descriptor set as all the other sockets and files.

When the user clicks a button, the X11 server sends an event to the socket and poll() signals us about it. When data is available on another polled socket, the same poll() signals us about it. When inotify notifies us about filesystem activity, the same poll() tells us about it too. One mechanism, multiple purposes, truly ingenious.

However, this ingenious simplicity has a significant shortcoming...
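
The "one poll() for everything" idea can be shown with a minimal sketch. It has nothing to do with the actual GMainLoop code: `wait_for_any()` is a hypothetical helper, and in a real application the two descriptors would be, say, the X11 connection and an inotify descriptor rather than arbitrary ones.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Waits on two descriptors with a single poll() call. Returns the index
 * (0 or 1) of the first readable descriptor, or -1 on timeout or error.
 * A negative timeout_ms blocks indefinitely. */
int
wait_for_any(int fd_a, int fd_b, int timeout_ms)
{
    struct pollfd fds[2] = {
        { .fd = fd_a, .events = POLLIN },
        { .fd = fd_b, .events = POLLIN },
    };

    assert(fd_a >= 0 && fd_b >= 0);
    if (poll(fds, 2, timeout_ms) <= 0)
        return -1;
    for (int i = 0; i < 2; i++)
        if (fds[i].revents & POLLIN)
            return i;
    return -1;
}
```

Whichever source becomes readable first wakes up the same call, so no busy switching between the sources is needed.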

The problem

And the shortcoming is the following: if we want to add a custom event source to GMainLoop gracefully, we need to pass it a file descriptor for polling (timers do not count!). Sometimes that is not possible.

Each event source in Glib is represented with the following set of functions:

struct GSourceFuncs {
  gboolean (*prepare)  (GSource    *source,
                        gint       *timeout_);
  gboolean (*check)    (GSource    *source);
  gboolean (*dispatch) (GSource    *source,
                        GSourceFunc callback,
                        gpointer    user_data);
  void     (*finalize) (GSource    *source); /* Can be NULL */

  /* For use by g_source_set_closure */
  GSourceFunc     closure_callback;
  GSourceDummyMarshal closure_marshal; /* Really is of type GClosureMarshal */
};

The prepare and (AFAIK, if a file descriptor is present) check functions are called on each GMainLoop iteration. If either of them returns TRUE, the dispatch function will be called. This function should invoke the user callback for the event.

In my project I need to check filesystem activity with kevent(), so I could just add it to my implementation of prepare. But that is a completely wrong way -- I would need to monitor all file events with kqueue and all other events with poll(). Unfortunately, it is unacceptable to block the execution first in kevent() and then in poll(). It is also unacceptable to use timed-out polling and call both kevent() and poll() with short timeouts during an iteration, because of the high CPU consumption of such a scheme.

The best solution is to make a true Glib BSD port -- to drop poll() and use kevent() everywhere instead. An efficient replacement would require a lot of unplanned changes in the Glib core, and it is an error-prone way. An inefficient replacement would not require a lot of changes, but nobody needs it.

Another way is to run a kevent() loop in a separate thread. I do not like this way either, because it automatically introduces an additional thread into each process that uses GIO file monitoring. Even today, context switching is still expensive.

So, I have decided to look for alternative solutions.

How it is done in Windows

I have programmed for Windows for a long time, and I know that a Windows application communicates with the system via window messages (which it retrieves from a message queue). So there should not be any sockets involved and the trick with poll() should not work, and so I first turned to the Win32 Glib & GTK+ implementation.

On Windows, g_poll() is not just a wrapper over poll() as on Unix systems; it is completely different there. It uses the WaitForMultipleObjectsEx() and MsgWaitForMultipleObjectsEx() routines to monitor events. The first function supports only HANDLEs and is used for sockets. The second one is used when GTK is loaded, since MsgWaitForMultipleObjectsEx() can monitor window messages too. So we are facing the same situation as on GNU/Linux -- the same routine can be used for working with both sockets and window events. Another argument for implementing a true Glib BSD port with kevent() instead of poll() :)

How it is done in Mac OS X

Unfortunately, the Windows port has not given me any ideas, so I turned to the Mac OS X GTK+ port. I have never worked with Macs and this programming environment is new to me. A big comment at the beginning of /gdk/quartz/gdkeventloop-quartz.c states the following:

 * Both cases share a single problem: the OS X API's don't allow us to
 * wait simultaneously for file descriptors and for events. So when we
 * need to do a blocking wait that includes file descriptor activity, we
 * push the actual work of calling select() to a helper thread (the
 * "select thread") and wait for native events in the main thread.

This is exactly my problem! And the gtk-quartz developers have solved it with a separate thread, as I have already considered in this post. Okay, if they have done so, I will do the same.

P.S. Git 'em All!

I have uploaded my Glib source tree to Github. The main development branch is kqueue/master; AFAIR it is based on tag 2.26.1. There is also a kqueue/sandbox branch, where I will do experimental things before merging them into kqueue/master. The first commit, which adds a kqueue check to configure and a plugin stub to the source, is already there.

GIO file monitoring overview

I have spent the last two weeks trying to update my NetBSD 5.0.2 to current. It was an interesting quest and I still have not completed it :)

The CVS source snapshot failed to build, and the binary one did not work properly (I had some problems with X). So I have settled on v5.1 -- it is not as outdated for a developer as v5.0.2, and it works. Though I have to start Emacs with LD_LIBRARY_PATH=/usr/pkg/lib (it will not launch otherwise) and MPlayer sometimes crashes the entire system, I think it is quite usable.

But this post is not about it. Here I will try to summarize how the GIO file monitoring works and how it uses inotify.

GFileMonitor for an end-user

GFileMonitor is the monitoring entity in Glib. Let's take a look at its public interface (the official documentation is here and here):


enum           GFileMonitorEvent;

GFileMonitor*  g_file_monitor            (GFile *file,
                                          GFileMonitorFlags flags,
                                          GCancellable *cancellable,
                                          GError **error);


  "changed"                              : Run Last

To start monitoring a specific file or directory, the programmer should call g_file_monitor() with the appropriate arguments. The returned object is the monitor itself; it will emit the "changed" signal on every event that occurs. The signal/slot system is common in Glib and it is no surprise that it is used here too.

The slot should have the following prototype:

void           user_function            (GFileMonitor     *monitor,
                                         GFile            *file,
                                         GFile            *other_file,
                                         GFileMonitorEvent event_type,
                                         gpointer          user_data)       : Run Last

The event_type argument will show what kind of event has happened:

typedef enum {
} GFileMonitorEvent;

That's all. Clear and simple :)

inotify for an end-user

Now let's take a look at inotify -- the subsystem that is used by GFileMonitor on Linux. inotify is implemented in the kernel; the userspace interface is:

int            inotify_init             (void);
int            inotify_add_watch        (int fd,
                                         const char *pathname,
                                         uint32_t mask);

(see this LWN article for more details).

Again, the interface is fairly simple. The inotify_init() function initializes the inotify subsystem and returns a file descriptor. inotify_add_watch() specifies a file and events to monitor. The file descriptor then can be used to monitor events with select(), poll(), etc. The application will receive notifications from this file descriptor in the following form:

struct inotify_event {
    int wd;          /* Watch descriptor */
    uint32_t mask;   /* Mask of events */
    uint32_t cookie; /* Unique cookie associating related
                        events (for rename (2)) */
    uint32_t len;    /* Size of 'name' field */
    char name[];     /* Optional null-terminated name */
};

The following event types are available:

  1. IN_ACCESS -- File was read from;
  2. IN_MODIFY -- File was written to;
  3. IN_ATTRIB -- File's metadata (inode or xattr) was changed;
  4. IN_CLOSE_WRITE -- File was closed (and was open for writing);
  5. IN_CLOSE_NOWRITE -- File was closed (and was not open for writing);
  6. IN_OPEN -- File was opened;
  7. IN_MOVED_FROM -- File was moved away from watch;
  8. IN_MOVED_TO -- File was moved to watch;
  9. IN_DELETE -- File was deleted;
  10. IN_DELETE_SELF -- The watch itself was deleted.

And again, clear and simple.
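
To make the interface above more concrete, here is a minimal Linux-only sketch that watches a temporary directory, creates a file in it and decodes the resulting events. The helper names (`read_event_mask()`, `demo_create_event()`) and the temporary path are hypothetical, and IN_CREATE is an inotify event that is not in the list above.

```c
#define _DEFAULT_SOURCE   /* for mkdtemp() */
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/inotify.h>
#include <unistd.h>

/* Reads one batch of events from an inotify descriptor and ORs their
 * masks together. Returns 0 if nothing could be read. */
uint32_t
read_event_mask(int ifd)
{
    char buf[4096]
        __attribute__((aligned(__alignof__(struct inotify_event))));
    uint32_t seen = 0;
    ssize_t len;

    assert(ifd >= 0);
    len = read(ifd, buf, sizeof(buf));
    if (len <= 0)
        return 0;
    /* Events are variable-sized: a fixed header followed by len bytes. */
    for (char *p = buf; p < buf + len; ) {
        const struct inotify_event *ev = (const struct inotify_event *)p;

        seen |= ev->mask;
        p += sizeof(struct inotify_event) + ev->len;
    }
    return seen;
}

/* Watches a fresh temporary directory, creates a file in it and returns
 * the combined mask of the events that were reported (0 on failure). */
uint32_t
demo_create_event(void)
{
    char dir[] = "/tmp/inotify-demo-XXXXXX";
    char path[sizeof(dir) + 8];
    uint32_t seen = 0;
    int ifd;

    if (mkdtemp(dir) == NULL)
        return 0;
    snprintf(path, sizeof(path), "%s/file", dir);

    ifd = inotify_init();
    if (ifd >= 0 &&
        inotify_add_watch(ifd, dir, IN_CREATE | IN_CLOSE_WRITE) >= 0) {
        int fd = open(path, O_CREAT | O_WRONLY, 0600);

        if (fd >= 0) {
            close(fd);                    /* queues IN_CLOSE_WRITE too */
            seen = read_event_mask(ifd);
        }
    }
    if (ifd >= 0)
        close(ifd);
    unlink(path);
    rmdir(dir);
    return seen;
}
```

A real application would usually keep the descriptor open and wait on it with poll() or select() instead of issuing a blocking read() right away.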

The glue

And now, finally, let's take a look under the hood, at how GFileMonitor uses inotify. When g_file_monitor_file() is invoked, the following call chain occurs:


Depending on the type of the file, the toplevel function g_file_monitor() calls g_file_monitor_file() for files and g_file_monitor_directory() for directories. I have shown the call stack for the file case.

g_file_monitor_file() takes the observed file's GFile object interface, iface, and calls its monitor_file member function by pointer. For local files, which are represented by the GLocalFile class, this interface member function points to g_local_file_monitor_file(). This function, in turn, calls _g_local_file_monitor_new().

_g_local_file_monitor_new() is the most interesting one. First of all, it obtains the type of the monitor object to create with get_default_local_file_monitor() and then creates an instance of the obtained type with g_object_new(). Here we are dealing with classes and metaclasses, like in Smalltalk, Common Lisp and their modern pop successors.

get_default_local_file_monitor() returns the type of the monitor to be created. On its first invocation, the function searches for a suitable plugin across the available GIO extensions. GIO supports several types of extensions, and the function uses G_LOCAL_FILE_MONITOR_EXTENSION_POINT_NAME to select only the file monitoring ones. Then each matched extension is asked for its GLocalFileMonitorClass subclass object. Each such object has an is_supported() member function, and if this method returns true, the class is picked as the default class for file monitoring objects. This type will be returned on all subsequent calls to get_default_local_file_monitor().

So, the _g_local_file_monitor_new() function acts much like a "virtual constructor".

Thus, GIO does not refer to its inotify backend directly.

The rest is trivial: in the gio/inotify directory of Glib's source tree we can find the GInotifyFileMonitor and GInotifyDirectoryMonitor classes, which work with inotify directly.

Considering that inotify provides a file descriptor for monitoring, the implementation of the inotify backend can be summarized quite simply:

  1. When the extension is initialized, invoke inotify_init() and install GIO watches on the obtained file descriptor;
  2. When a GInotify{File,Directory}Monitor is created, invoke inotify_add_watch() with the appropriate parameters (obtained from g_file_monitor());
  3. In the inotify channel watch callback function, read a struct inotify_event from file descriptor, decode it to GIO format (GFileMonitorEvent event code, etc) and emit the "changed" signal.

That is it!

Getting started on Google Summer of Code 2011

This summer I will work for The NetBSD Foundation. This blog is dedicated to tracking my progress.

My project is "Add kqueue support to GIO". The main intention is to bring native file monitoring support to GIO. Currently it uses inotify "as is"; during the summer I will give it an abstraction layer and a kqueue back-end. Additionally, an inotify-over-kqueue compatibility library is planned; it will help to easily port and run any software that uses inotify directly.

The subject is interesting and fun, and it promises a lot of research of the kqueue and inotify implementations.

Here is a short plan of actions:

  1. Community bonding period
  2. May 23 - June 23: the core work
  3. June 24 - July 10: wrap up for a milestone (midterm evaluations)
  4. July 11 - August 1: the rest of the work
  5. August 2 - August 7: offline
  6. August 8 - August 22: final stage

A more complete one will be available here (as soon as I receive permission to modify that page).

Currently I am updating my NetBSD 5.0.2 to current. The Toshiba A210-199 laptop overheated and halted during the build three times in a row, so I have placed it on a windowsill to use the cool air from the street for cooling.

Larger version, 1.5M
