In this post I will try to explain how the GMainLoop works. GMainLoop is the heart of every GTK+-powered application, so understanding what happens under the hood is mandatory for a Glib hacker. At least, I think that the best way to understand something better is to explain it, so I am writing this post.

What GMainLoop gives us

As I have already said, every GTK+ application runs a GMainLoop. GMainLoop processes all incoming events:

  • user input: keyboard, mouse, etc;
  • activity on file descriptors and sockets;
  • timers (timeouts)

and invokes the appropriate handlers for all kinds of events. As I have mentioned, GIO receives inotify events using the standard GIO Channels, these events are also dispatched via GMainLoop.

There is polling all the way down

GMainLoop is like an endless loop of main loop iterations. On each iteration, it checks for new events from all event sources. Depending on the application, event sources may be:

  • IO channels -- files and sockets;
  • Window subsystem -- X11, Windows, Quartz, etc;
  • Timers (timeouts).

GMainLoop usually runs in a signle thread. And now, imagine a situation: a GTK+ application has installed an IO channel watch to handle incoming data on a socket in a Glib way. But if it is running in a single thread, how it can process GDK events and wait for data on a socket at the same time?

The first idea is that GMainLoop uses timed-out polling. The scheme would be something like this:

  1. (A) check for available IO on the polled file descriptors using select() or poll() with a short timeout;
    • if there is data available, invoke the appropriate handlers;
  2. (B) when timeout elapses or data processed, process the pending events from X11 server;
  3. repeat.

But this scheme is very, very bad -- it requires a constant switching between (A) and (B) and consumes a lot of CPU. Thanks God, GMainLoop works in another way.

In GNU/Linux, GMainLoop always invokes a single poll() during an iteration with an infinite timeout. But why GUI does not getting blocked when we are waiting for data on a socket? The key point is that native GTK+ environment is X11 Window System. Remember its network transparency that people often swear? X11 client and X11 server communicate via socket, and this socket is polled in the same file descriptor set as well as all other sockets or files.

When user clicks a button, X11 server sends an event to a socket and poll() signals us about it. When data is available on another polled socket, the same poll() signals us about it. When inotify notifies us about filesystem activity, the same poll() will say us about it too. One mechanism, multiple purposes, really genius.

However, this genious simplicity has a significant shortcoming...

The problem

And the shortcomming is the following: if we want to add a custom event source to GMainLoop gracefully, we will need to pass a file descriptor to it for polling (timers do not counting!). Sometimes it is not possible.

Each event source in Glib is represented with the following set of functions:

struct GSourceFuncs {
  gboolean (*prepare)  (GSource    *source,
                        gint       *timeout_);
  gboolean (*check)    (GSource    *source);
  gboolean (*dispatch) (GSource    *source,
                        GSourceFunc callback,
                        gpointer    user_data);
  void     (*finalize) (GSource    *source); /* Can be NULL */

  /* For use by g_source_set_closure */
  GSourceFunc     closure_callback;        
  GSourceDummyMarshal closure_marshal; /* Really is of type GClosureMarshal */
};

prepare and (AFAIK, if a file descriptor is present) check functions are called on each GMainLoop iteration. If they all returned TRUE, the dispatch function will be called. This function should invoke a user callback for an event.

In my project I need to check filesystem activity with kevent(), so I could just add it into my implementation of prepare. But it is a completely wrong way -- I will need to monitor all file events with kqueue and all other events with poll. Unfortunately, it is unacceptable to block the execution at first with kqueue() and then with poll(). It is also unacceptable to use timed-out polling and to call both kqueue() and poll() with short timeouts during an iteration because of big CPU consumption of such scheme.

The best solution is to make a true Glib BSD port -- to drop poll() and to use kevent() everywhere instead. The efficient replacement will require a lot of unplanned changes in the core Glib, and it is an error-prone way. The unefficient replacement will not require a lot of changes, but nobody need it.

Another way is to run a kevent() loop in a separate thread. I also do not like this way, because it will automatically introduce an additional thread for each process that use GIO file monitoring. Even today, context switching is a hard task.

So, I have decided to look for alternative solutions.

How it is done in Windows

I have programmed for Windows for a long time and I know that that an Windows application communicates with the system via windows messages (and it peeks it from a message queue). So there should not be any sockets and trick with poll() should not pass, and at first I have referred to Win32 Glib & GTK+ implementation.

In Windows, g_poll() is not just a wrapper over poll() as in Unix systems, it is completely different here. It uses WaitForMultipleObjectsEx() and MsgWaitForMultipleObjectsEx() routines to monitor events. The first function supports only HANDLEs and is used for sockets. The second one is used when GTK is loaded, since MsgWaitForMultipleObjectsEx() can monitor window messages too. So we are facing with the same situation as in GNU/Linux -- the same routine can be used for working with both sockets and window events. Another point to implement a true Glib BSD port with kevent instead of poll() :)

How it is done in Mac OS X

Unfortunately, the Windows port has not gave me any ideas, so I have referred to a Mac OS X GTK+ port. I have never worked with Macs and this programming environment is new for me. A big comment in the beginning of /gdk/quartz/gdkeventloop-quartz.c states the following:

 * Both cases share a single problem: the OS X API's don't allow us to
 * wait simultaneously for file descriptors and for events. So when we
 * need to do a blocking wait that includes file descriptor activity, we
 * push the actual work of calling select() to a helper thread (the
 * "select thread") and wait for native events in the main thread.

This is exactly my problem! And gtk-quartz developers have solved it with a separate thread, as I have already assumed in this post. Okay, if they have done so, I will do the same.

P.S. Git 'em All!

I have uploaded my Glib source tree to Github. The main development branch is kqueue/master, AFAIR it is based on tag 2.26.1. There is also branch kqueue/sandbox, where I will do some experimental things before merging it to the kqueue/master. The first commit, that adds kqueue checking to configure and plugin stub to source, is already there.