dmitrymatveev.co.uk
Integrating kevent() into GMainLoop

In this post I will try to explain how GMainLoop works. GMainLoop is the heart of every GTK+-powered application, so understanding what happens under the hood is mandatory for a Glib hacker. Besides, I think that the best way to understand something is to explain it, so I am writing this post.

What GMainLoop gives us

As I have already said, every GTK+ application runs a GMainLoop. GMainLoop processes all incoming events:

  • user input: keyboard, mouse, etc;
  • activity on file descriptors and sockets;
  • timers (timeouts)

and invokes the appropriate handlers for each kind of event. As I have mentioned, GIO receives inotify events through the standard GIO channels, and these events are dispatched via GMainLoop too.

There is polling all the way down

GMainLoop is essentially an endless loop of iterations. On each iteration, it checks for new events from all event sources. Depending on the application, event sources may be:

  • IO channels -- files and sockets;
  • Window subsystem -- X11, Windows, Quartz, etc;
  • Timers (timeouts).

GMainLoop usually runs in a single thread. Now imagine a situation: a GTK+ application has installed an IO channel watch to handle incoming data on a socket in the Glib way. But if it is running in a single thread, how can it process GDK events and wait for data on a socket at the same time?

The first idea is that GMainLoop uses timed-out polling. The scheme would be something like this:

  1. (A) check for available IO on the polled file descriptors using select() or poll() with a short timeout;
    • if there is data available, invoke the appropriate handlers;
  2. (B) when the timeout elapses or the data has been processed, process the pending events from the X11 server;
  3. repeat.

But this scheme is very, very bad -- it requires constant switching between (A) and (B) and consumes a lot of CPU. Thank God, GMainLoop works in another way.

In GNU/Linux, GMainLoop invokes a single poll() per iteration, and when no timers are pending its timeout is infinite. But why does the GUI not get blocked while we are waiting for data on a socket? The key point is that the native GTK+ environment is the X11 Window System. Remember its network transparency that people so often curse? An X11 client and the X11 server communicate via a socket, and this socket is polled in the same file descriptor set as all the other sockets and files.

When the user clicks a button, the X11 server sends an event to the socket and poll() signals us about it. When data is available on another polled socket, the same poll() signals us about it. When inotify notifies us about filesystem activity, the same poll() tells us about it too. One mechanism, multiple purposes -- really ingenious.
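The "one poll(), many purposes" idea can be sketched in plain C. This is only an illustration, not Glib code: wait_for_either() is a hypothetical helper, and the two descriptors stand in for, say, the X11 connection and an application socket.

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper (not part of Glib): block in ONE poll() call on two
 * descriptors and report which of them became readable.
 * Returns a bitmask: bit 0 set for fd_a, bit 1 set for fd_b, -1 on error. */
int wait_for_either(int fd_a, int fd_b, int timeout_ms)
{
    struct pollfd fds[2] = {
        { .fd = fd_a, .events = POLLIN, .revents = 0 },
        { .fd = fd_b, .events = POLLIN, .revents = 0 },
    };

    /* A timeout of -1 blocks until at least one descriptor is ready. */
    if (poll(fds, 2, timeout_ms) < 0)
        return -1;

    int ready = 0;
    if (fds[0].revents & POLLIN)
        ready |= 1;
    if (fds[1].revents & POLLIN)
        ready |= 2;
    return ready;
}
```

The caller does not need to know in advance which descriptor will fire first; it simply dispatches on the returned bitmask after every wake-up.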

However, this ingenious simplicity has a significant shortcoming...

The problem

And the shortcoming is the following: if we want to add a custom event source to GMainLoop gracefully, we need to give it a file descriptor to poll (timers do not count!). Sometimes this is not possible.

Each event source in Glib is represented with the following set of functions:

struct GSourceFuncs {
  gboolean (*prepare)  (GSource    *source,
                        gint       *timeout_);
  gboolean (*check)    (GSource    *source);
  gboolean (*dispatch) (GSource    *source,
                        GSourceFunc callback,
                        gpointer    user_data);
  void     (*finalize) (GSource    *source); /* Can be NULL */

  /* For use by g_source_set_closure */
  GSourceFunc     closure_callback;        
  GSourceDummyMarshal closure_marshal; /* Really is of type GClosureMarshal */
};

The prepare function is called on each GMainLoop iteration before polling, and the check function is called after the poll returns. If either of them returns TRUE, the source is considered ready and its dispatch function is called. This function should invoke the user callback for the event.
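To make the roles of these callbacks concrete, here is a toy model of a single loop iteration in plain C. It deliberately does not use Glib: ToySource, ToySourceFuncs and toy_iteration() are hypothetical names, the poll() step is only marked by a comment, and the sketch assumes (as the Glib documentation describes) that a source counts as ready when either prepare or check returns true.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for GSource / GSourceFuncs; these are NOT the Glib types. */
typedef struct ToySource ToySource;

typedef struct {
    bool (*prepare)(ToySource *source, int *timeout_ms);
    bool (*check)(ToySource *source);
    bool (*dispatch)(ToySource *source);
} ToySourceFuncs;

struct ToySource {
    const ToySourceFuncs *funcs;
    int  pending;     /* events waiting to be handled */
    int  dispatched;  /* how many times dispatch has run */
    bool ready;       /* scratch flag used during one iteration */
};

/* One simplified iteration. Returns the number of sources dispatched. */
int toy_iteration(ToySource **sources, size_t n)
{
    int timeout_ms = -1;   /* -1 means "block forever" */
    int count = 0;

    /* Pass 1: prepare. A source may be ready already or ask for a timeout. */
    for (size_t i = 0; i < n; i++) {
        int wanted = -1;
        sources[i]->ready = sources[i]->funcs->prepare(sources[i], &wanted);
        if (sources[i]->ready)
            timeout_ms = 0;
        else if (wanted >= 0 && (timeout_ms < 0 || wanted < timeout_ms))
            timeout_ms = wanted;
    }

    /* A real loop would now call poll() on all descriptors with timeout_ms. */
    (void)timeout_ms;

    /* Pass 2: check. Sources that were not ready get asked again. */
    for (size_t i = 0; i < n; i++)
        if (!sources[i]->ready)
            sources[i]->ready = sources[i]->funcs->check(sources[i]);

    /* Pass 3: dispatch every source that reported itself ready. */
    for (size_t i = 0; i < n; i++)
        if (sources[i]->ready) {
            sources[i]->funcs->dispatch(sources[i]);
            count++;
        }
    return count;
}

/* Example source: ready whenever it has pending events. */
static bool counter_prepare(ToySource *s, int *timeout_ms)
{
    (void)s;
    *timeout_ms = -1;   /* no timeout wanted */
    return false;       /* cannot tell before polling */
}

static bool counter_check(ToySource *s)
{
    return s->pending > 0;
}

static bool counter_dispatch(ToySource *s)
{
    s->pending--;
    s->dispatched++;
    return true;        /* keep the source installed */
}

const ToySourceFuncs counter_funcs = {
    counter_prepare, counter_check, counter_dispatch
};
```

A source created as `ToySource s = { &counter_funcs, 2, 0, false };` is dispatched on two consecutive calls to toy_iteration() and then stays idle.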

In my project I need to check filesystem activity with kevent(), so I could just call it in my implementation of prepare. But that is completely the wrong way -- I would have to monitor all file events with kqueue and all other events with poll(). Unfortunately, it is unacceptable to block first in kevent() and then in poll(). It is also unacceptable to use timed-out polling, calling both kevent() and poll() with short timeouts during an iteration, because of the high CPU consumption of such a scheme.

The best solution would be to make a true Glib BSD port -- to drop poll() and use kevent() everywhere instead. An efficient replacement would require a lot of unplanned changes in the core of Glib, and that is an error-prone path. An inefficient replacement would not require many changes, but nobody needs it.

Another way is to run a kevent() loop in a separate thread. I do not like this way either, because it automatically introduces an additional thread in every process that uses GIO file monitoring. Even today, context switching is not free.

So, I have decided to look for alternative solutions.

How it is done in Windows

I have programmed for Windows for a long time and I know that a Windows application communicates with the system via window messages (it retrieves them from a message queue). So there should not be any sockets involved, and the trick with poll() should not work there. That is why I looked at the Win32 Glib & GTK+ implementation first.

On Windows, g_poll() is not just a wrapper over poll() as on Unix systems; it is implemented completely differently. It uses the WaitForMultipleObjectsEx() and MsgWaitForMultipleObjectsEx() routines to monitor events. The first function supports only HANDLEs and is used for sockets. The second one is used when GTK is loaded, since MsgWaitForMultipleObjectsEx() can monitor window messages too. So we face the same situation as in GNU/Linux -- a single routine serves both sockets and window events. One more argument for implementing a true Glib BSD port with kevent() instead of poll() :)

How it is done in Mac OS X

Unfortunately, the Windows port has not given me any ideas, so I turned to the Mac OS X GTK+ port. I have never worked with Macs and this programming environment is new to me. A big comment at the beginning of /gdk/quartz/gdkeventloop-quartz.c states the following:

 * Both cases share a single problem: the OS X API's don't allow us to
 * wait simultaneously for file descriptors and for events. So when we
 * need to do a blocking wait that includes file descriptor activity, we
 * push the actual work of calling select() to a helper thread (the
 * "select thread") and wait for native events in the main thread.

This is exactly my problem! And the gtk-quartz developers have solved it with a separate thread, the approach I considered earlier in this post. Okay, if they have done so, I will do the same.
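The helper-thread pattern can be sketched with POSIX threads and a pipe. This is a minimal illustration under my own assumptions, not the gtk-quartz code and not my kqueue backend: Helper, helper_main() and helper_start() are hypothetical names, and a plain poll() on one descriptor stands in for the blocking kevent() (or select()) call. The helper blocks in its own wait and then writes a byte into a pipe whose read end the main loop can poll like any other descriptor.

```c
#include <poll.h>
#include <pthread.h>
#include <unistd.h>

/* Hypothetical helper-thread state; the names are illustrative only. */
typedef struct {
    int watched_fd;  /* descriptor the helper blocks on (stand-in for kevent()) */
    int wakeup_wr;   /* write end of the pipe polled by the main loop */
} Helper;

/* Runs in the helper thread: block until watched_fd is readable, then
 * wake the main thread by writing one byte into the wake-up pipe. */
static void *helper_main(void *arg)
{
    Helper *h = arg;
    struct pollfd pfd = { .fd = h->watched_fd, .events = POLLIN, .revents = 0 };

    if (poll(&pfd, 1, -1) > 0) {
        char byte = 'k';
        if (write(h->wakeup_wr, &byte, 1) != 1)
            return NULL;   /* nothing sensible to do on failure in a sketch */
    }
    return NULL;
}

/* Start the helper. Returns the read end of the wake-up pipe (to be added
 * to the main loop's poll set), or -1 on failure. */
int helper_start(Helper *h, int watched_fd, pthread_t *thread)
{
    int p[2];

    if (pipe(p) != 0)
        return -1;
    h->watched_fd = watched_fd;
    h->wakeup_wr = p[1];
    if (pthread_create(thread, NULL, helper_main, h) != 0) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    return p[0];
}
```

The main thread keeps a single blocking wait; the price is the extra thread and one more descriptor per process.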

P.S. Git 'em All!

I have uploaded my Glib source tree to Github. The main development branch is kqueue/master; AFAIR it is based on tag 2.26.1. There is also a kqueue/sandbox branch, where I will do experimental things before merging them into kqueue/master. The first commit, which adds a kqueue check to configure and a plugin stub to the source, is already there.

GIO file monitoring overview

I have spent the last two weeks trying to update my NetBSD 5.0.2 to current. It was an interesting quest and I still have not completed it :)

The CVS source snapshot failed to build, and the binary one did not work properly (I had some problems with X). So I have settled on v5.1 -- it is not as outdated for a developer as v5.0.2, and it works. Though I have to start Emacs with LD_LIBRARY_PATH=/usr/pkg/lib (it will not launch otherwise) and MPlayer sometimes crashes the entire system, I think it is quite usable.

But this post is not about that. Here I will try to summarize how GIO file monitoring works and how it uses inotify.

GFileMonitor for an end-user

GFileMonitor is the monitoring entity in Glib. Let's take a look at its public interface (the official documentation here and here):

Synopsis

enum           GFileMonitorEvent;
               GFileMonitor;

GFileMonitor*  g_file_monitor            (GFile *file,
                                          GFileMonitorFlags flags,
                                          GCancellable *cancellable,
                                          GError **error);

Signals

  "changed"                              : Run Last

To start monitoring a specific file or directory, the programmer should call g_file_monitor() with the appropriate arguments. The returned object is the monitor itself; it will emit the "changed" signal on every event that occurs. The signal/slot system is common in Glib, so it is no surprise that it is used here too.

The slot should have the following prototype:

void           user_function            (GFileMonitor     *monitor,
                                         GFile            *file,
                                         GFile            *other_file,
                                         GFileMonitorEvent event_type,
                                         gpointer          user_data);

The event_type argument will show what kind of event has happened:

typedef enum {
  G_FILE_MONITOR_EVENT_CHANGED,
  G_FILE_MONITOR_EVENT_CHANGES_DONE_HINT,
  G_FILE_MONITOR_EVENT_DELETED,
  G_FILE_MONITOR_EVENT_CREATED,
  G_FILE_MONITOR_EVENT_ATTRIBUTE_CHANGED,
  G_FILE_MONITOR_EVENT_PRE_UNMOUNT,
  G_FILE_MONITOR_EVENT_UNMOUNTED,
  G_FILE_MONITOR_EVENT_MOVED
} GFileMonitorEvent;

That's all. Clear and simple :)

inotify for an end-user

Now let's take a look at inotify -- the subsystem that is used by GFileMonitor on Linux. inotify is implemented in the kernel; the userspace interface is:

int            inotify_init             (void);
int            inotify_add_watch        (int fd,
                                         const char *pathname,
                                         uint32_t mask);

(see this LWN article for more details).

Again, the interface is fairly simple. The inotify_init() function initializes the inotify subsystem and returns a file descriptor. inotify_add_watch() specifies a file and the events to monitor on it. The file descriptor can then be used to wait for events with select(), poll(), etc. The application will receive notifications from this file descriptor in the following form:

struct inotify_event {
    int wd;          /* Watch descriptor */
    uint32_t mask;   /* Mask of events */
    uint32_t cookie; /* Unique cookie associating related
                        events (for rename (2)) */
    uint32_t len;    /* Size of 'name' field */
    char name[];     /* Optional null-terminated name */
}; 

The following event types are available:

  1. IN_ACCESS -- File was read from;
  2. IN_MODIFY -- File was written to;
  3. IN_ATTRIB -- File's metadata (inode or xattr) was changed;
  4. IN_CLOSE_WRITE -- File was closed (and was open for writing);
  5. IN_CLOSE_NOWRITE -- File was closed (and was not open for writing);
  6. IN_OPEN -- File was opened;
  7. IN_MOVED_FROM -- File was moved away from watch;
  8. IN_MOVED_TO -- File was moved to watch;
  9. IN_DELETE -- File was deleted;
  10. IN_DELETE_SELF -- The watch itself was deleted.

And again, clear and simple.
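Here is a small Linux-only sketch of how this interface is typically used. watch_path() and collect_masks() are hypothetical helpers of my own, not part of any library: the first creates a descriptor and installs one watch, the second walks a buffer that was filled by read() on that descriptor and ORs together the masks of all events in it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/inotify.h>
#include <unistd.h>

/* Hypothetical helper: create an inotify descriptor watching `path`.
 * Returns the descriptor (usable with poll() or select()), or -1. */
int watch_path(const char *path, uint32_t mask)
{
    int fd = inotify_init();

    if (fd < 0)
        return -1;
    if (inotify_add_watch(fd, path, mask) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: walk a buffer filled by read() on an inotify
 * descriptor and OR together the masks of all events found in it. */
uint32_t collect_masks(const char *buf, size_t len)
{
    uint32_t seen = 0;
    size_t off = 0;

    while (off + sizeof(struct inotify_event) <= len) {
        struct inotify_event ev;

        /* Copy the fixed-size header; this avoids alignment problems. */
        memcpy(&ev, buf + off, sizeof ev);
        seen |= ev.mask;
        /* Events are variable-sized: skip the header and the optional name. */
        off += sizeof ev + ev.len;
    }
    return seen;
}
```

A caller would wait until the descriptor is readable, call read(fd, buf, sizeof buf) with a buffer of a few kilobytes, and pass the buffer and the returned length to collect_masks().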

The glue

And now, finally, let's take a look under the hood at how GFileMonitor uses inotify. When g_file_monitor() is invoked, the following call chain occurs:

g_file_monitor()
  g_file_monitor_file()
    iface->monitor_file()
      g_local_file_monitor_file()
        _g_local_file_monitor_new()
          get_default_local_file_monitor()
          g_object_new()

Depending on the type of the file, the toplevel function g_file_monitor() calls g_file_monitor_file() for files and g_file_monitor_directory() for directories. I have shown the call stack for the file case.

g_file_monitor_file() takes the GFile interface of the observed file's object, iface, and calls its monitor_file member function through a pointer. For local files, which are represented by the GLocalFile class, this interface member points to g_local_file_monitor_file(). This function, in turn, calls _g_local_file_monitor_new().

_g_local_file_monitor_new() is the most interesting one. First of all, it obtains the type of the monitor object to create with get_default_local_file_monitor() and then creates an instance of this type with g_object_new(). Here we meet classes and metaclasses, as in Smalltalk, Common Lisp and their modern popular successors.

get_default_local_file_monitor() returns the type of the monitor to be created. On its first invocation, the function searches for a suitable plugin among the available GIO extensions. GIO supports several types of extensions, and the function uses G_LOCAL_FILE_MONITOR_EXTENSION_POINT_NAME to select only the file monitoring ones. Then each matching extension is asked for its GLocalFileMonitorClass subclass object. Each such object has an is_supported() member function, and if this method returns true, the class is picked as the default class for file monitoring objects. This type is then returned on all subsequent calls to get_default_local_file_monitor().

So, the _g_local_file_monitor_new() function acts much like a "virtual constructor".
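The "probe once, then cache" selection can be modelled in a few lines of plain C. This is only a sketch of the idea, not the GIO code: ToyMonitorClass, the extension table and its backend names are made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a monitor class offered by an extension (hypothetical). */
typedef struct {
    const char *name;
    bool (*is_supported)(void);
} ToyMonitorClass;

static bool never(void)  { return false; }
static bool always(void) { return true; }

/* Illustrative table; the real list comes from the GIO extension point. */
static const ToyMonitorClass toy_extensions[] = {
    { "unavailable-backend", never  },
    { "inotify",             always },
    { "fallback",            always },
};

static int toy_probes = 0;  /* how many times the table has been scanned */

/* Pick the first supported class on the first call and remember it. */
const ToyMonitorClass *toy_default_monitor_class(void)
{
    static const ToyMonitorClass *chosen = NULL;

    if (chosen == NULL) {
        size_t n = sizeof toy_extensions / sizeof toy_extensions[0];

        toy_probes++;
        for (size_t i = 0; i < n; i++) {
            if (toy_extensions[i].is_supported()) {
                chosen = &toy_extensions[i];
                break;
            }
        }
    }
    return chosen;
}

/* Accessor used to observe the caching behaviour. */
int toy_probe_count(void)
{
    return toy_probes;
}
```

The caller only ever sees whatever class was chosen; which backend is behind it is decided at run time by the is_supported() answers.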

Thus, GIO does not refer to its inotify backend directly.

The rest is trivial: in the gio/inotify directory of Glib's source tree we can find the GInotifyFileMonitor and GInotifyDirectoryMonitor classes, which work with inotify directly.

Considering that inotify provides a file descriptor for monitoring, the implementation of the inotify backend can be summarized quite simply:

  1. When the extension is initialized, invoke inotify_init() and install GIO watches on the obtained file descriptor;
  2. When a GInotify{File,Directory}Monitor is created, invoke inotify_add_watch() with the appropriate parameters (obtained from g_file_monitor());
  3. In the inotify channel watch callback function, read a struct inotify_event from the file descriptor, decode it into the GIO format (GFileMonitorEvent event code, etc) and emit the "changed" signal.
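The decoding part of the last step could look roughly like the sketch below. The mapping is my own plausible guess and is not copied from the Glib sources; ToyMonitorEvent is a local enum that only mimics a few GFileMonitorEvent codes, and toy_decode_mask() is a hypothetical function.

```c
#include <stdint.h>
#include <sys/inotify.h>

/* Local stand-in for a few GFileMonitorEvent codes (not the Glib enum). */
typedef enum {
    TOY_EVENT_NONE = -1,
    TOY_EVENT_CHANGED,
    TOY_EVENT_CHANGES_DONE_HINT,
    TOY_EVENT_DELETED,
    TOY_EVENT_CREATED,
    TOY_EVENT_ATTRIBUTE_CHANGED
} ToyMonitorEvent;

/* Translate a single-event inotify mask into a monitor event code.
 * The table below is an assumption made for illustration only. */
ToyMonitorEvent toy_decode_mask(uint32_t mask)
{
    /* The "subject is a directory" flag does not change the event kind. */
    mask &= ~(uint32_t)IN_ISDIR;

    switch (mask) {
    case IN_MODIFY:
        return TOY_EVENT_CHANGED;
    case IN_CLOSE_WRITE:
        return TOY_EVENT_CHANGES_DONE_HINT;
    case IN_ATTRIB:
        return TOY_EVENT_ATTRIBUTE_CHANGED;
    case IN_MOVED_FROM:
    case IN_DELETE:
    case IN_DELETE_SELF:
        return TOY_EVENT_DELETED;
    case IN_MOVED_TO:
    case IN_CREATE:
        return TOY_EVENT_CREATED;
    default:
        return TOY_EVENT_NONE;  /* e.g. IN_ACCESS or IN_OPEN: not reported */
    }
}
```

Events that have no counterpart on the GIO side, such as plain reads, are simply dropped in this sketch.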

That is it!
