Posted on 1-Jan-2012 at 0:49:28 The Google Summer of Code ended four months ago. Since then, both projects have received some improvements, tiny and significant.
Glib kqueue backend
The patch is available in NetBSD's pkgsrc; thanks to Julio Merino for packaging it. You will need to rebuild the glib2 package as follows:
$ cd /usr/pkgsrc/devel/glib2
$ PKG_DEFAULT_OPTIONS=kqueue make
Please report any issues to my personal e-mail (see the About page).
Recently I was contacted by Antoine Jacoutot, an OpenBSD and GNOME contributor. He has integrated the gio patch into OpenBSD's package collection, so the functionality is now available in OpenBSD too. Antoine also mentioned that he can help with pushing the patch upstream, and I will work on it in Jan/Feb.
libinotify
As far as I know, today libinotify is packaged in two systems:
- NetBSD -- http://pkgsrc.se/devel/libinotify. Thanks to Thomas Klausner for packaging.
- FreeBSD -- http://www.freebsd.org/cgi/cvsweb.cgi/ports/devel/libinotify/. Thanks to Stanislav Sedov for FreeBSD patches and packaging.
Some problems were found on OpenBSD -- there is no pthread_barrier(3), so I will have to write my own synchronization primitive instead of using this convenient one.
libinotify has helped to run incrond on NetBSD and FreeBSD. Soon I will prepare a package for it. Also, porting is planned.
If you want to build and install the latest libinotify on your BSD system:
$ git clone git://github.com/dmatveev/libinotify-kqueue.git
$ cd libinotify-kqueue
$ autoreconf -fvi
$ ./configure
$ make
$ make test
# make install
Some tests (IN_OPEN, IN_CLOSE_WRITE, IN_CLOSE_NOWRITE) will fail, which is expected -- these events are simply not implemented.
So, the work is in progress. Happy Holidays to everyone!
Posted on 15-Jul-2011 at 0:17:25 It's time for the Google Summer of Code midterm evaluations, and here I will summarize the current project status.
GIO/kqueue backend
All the planned functionality is implemented:
- File and directory monitoring. Supported events are:
  - G_FILE_MONITOR_EVENT_CREATED
  - G_FILE_MONITOR_EVENT_ATTRIBUTE_CHANGED
  - G_FILE_MONITOR_EVENT_CHANGED
  - G_FILE_MONITOR_EVENT_MOVED
  - G_FILE_MONITOR_EVENT_DELETED
- Monitoring for files and directories which do not exist yet;
- Monitoring cancellation.
Current limitations
- G_FILE_MONITOR_EVENT_PRE_UNMOUNT and G_FILE_MONITOR_EVENT_UNMOUNTED events are not supported;
- Pair moves (i.e. when a GFileMonitor emits DELETED and CREATED events for a file move) are not supported;
- The backend opens a file descriptor for each monitored entry, so the file system holding an observed file cannot be unmounted.
All the issues pointed out by you and the Glib developers were resolved, and the fixes were merged into my kqueue/master branch.
Further plans
- Make the kqueue backend behave more like the inotify one if significant differences turn up;
- Fixes and further improvements (e.g. more error handling and robustness);
- Tests. I have not found any monitor tests in the Glib test suite, so I am going to write my own.
Brief design overview
The backend runs an additional thread, the kqueue thread, where the monitoring is performed.
When the user creates a GFileMonitor instance for a file, the backend tries to open that file. If the file is not found, the backend pushes it onto a missing files list.
If the file is found, the backend pushes its FD to a queue and then sends a signal (via a socket) to the kqueue thread. The kqueue thread wakes up and picks up this descriptor for subsequent kevent() calls.
When a call to kevent() returns, the kqueue thread searches for signalled descriptors, marshals them and writes them to a local socket.
This socket is watched by Glib in its main event loop. The backend reads the notifications coming from the kqueue thread, converts kqueue filter flags into GIO event flags and then emits the "changed" signal on the appropriate GFileMonitor instances.
The backend periodically traverses the missing files list (if it is not empty) and tries to start monitoring each entry. If the file has appeared on the FS and monitoring has started successfully, the backend removes the entry from the missing files list. Otherwise, it keeps checking the missing files until the user cancels monitoring on the appropriate GFileMonitor instance.
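The missing files pass can be sketched in a few lines of C. This is only an illustration under my own assumptions: the missing_entry structure and the rescan_missing() function are hypothetical names, not taken from the backend, and a real implementation would hand each freshly opened descriptor over to the kqueue thread instead of just storing it.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical entry of the "missing files" list. */
struct missing_entry {
    const char *path;   /* file the user asked to monitor */
    int         fd;     /* -1 while the file is still missing */
};

/* One rescan pass: try to open every entry that is still missing.
 * Returns the number of entries that remain missing afterwards. */
size_t rescan_missing(struct missing_entry *entries, size_t count)
{
    size_t still_missing = 0;

    assert(entries != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (entries[i].fd >= 0)
            continue;                       /* found on an earlier pass */
        entries[i].fd = open(entries[i].path, O_RDONLY);
        if (entries[i].fd < 0)
            still_missing++;                /* try again on the next pass */
        /* else: a real backend would now register the fd with kqueue */
    }
    return still_missing;
}
```

Calling rescan_missing() from a periodic timer until it returns 0 (or until the monitor is cancelled) gives the behaviour described above.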
libinotify compatibility library
In development.
I have finally made the general design decisions, so it is planned to work as follows.
On each call to inotify_init() or inotify_init1(), a new thread and a new kqueue will be created. The thread will sleep on the monitored descriptors via the kqueue, like in the gio/kqueue backend. A socketpair will be created; one of its descriptors will be returned to the user as the inotify descriptor, and the other one will be used by the kqueue thread.
The tricky ones are the inotify_add_watch() and inotify_rm_watch() functions. Both return a value to the end user, indicating failure or success. These functions will push an object (representing the requested operation) to a special internal queue (separate for each kqueue thread). Then the appropriate kqueue thread will be awakened and will perform the requested operation (add/modify/remove an entry). The kqueue thread will also put the operation status (success or failure, the identifier of an added watch, etc) into a per-operation storage, so that it can then be used by inotify_add_watch() and inotify_rm_watch() as the return value.
The difficulty lies in implementing this scheme: inotify_add_watch() and inotify_rm_watch() have to sleep while the kqueue thread performs the requested operation.
I have decided to use SIMPLEQ for the queue and pthread barriers for thread synchronization, so the implementation now looks clear to me.
The inotify emulation library will not support some kinds of events, like IN_CLOSE[_XXX], IN_OPEN and so on, because kqueue has no analogues for them. Some other events, like IN_MOVED_FROM or IN_MOVED_TO (for a directory), I will try to simulate by caching the directory's contents and diffing them between notifications.
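One way such diffing could work is to remember (inode, name) pairs and to treat an inode that reappears under another name as a rename. The sketch below is an assumption about the approach, not the library's algorithm; dir_entry and renamed_to() are hypothetical names, and hard links (one inode under several names) are deliberately ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* One cached directory entry: inode number plus file name. */
struct dir_entry {
    ino_t       inode;
    const char *name;
};

/* Given an entry from the old snapshot, return its new name if the same
 * inode shows up under a different name in the new snapshot.
 * Returns NULL if the entry is unchanged or no longer present. */
const char *renamed_to(const struct dir_entry *old_entry,
                       const struct dir_entry *after, size_t after_count)
{
    assert(old_entry != NULL);
    for (size_t i = 0; i < after_count; i++) {
        if (after[i].inode != old_entry->inode)
            continue;
        if (strcmp(after[i].name, old_entry->name) == 0)
            return NULL;            /* same inode, same name: nothing moved */
        return after[i].name;       /* same inode, new name: a rename */
    }
    return NULL;                    /* inode gone: deleted or moved away */
}
```

A rename found this way could be reported as an IN_MOVED_FROM / IN_MOVED_TO pair, while an inode that vanished from the snapshot would become a delete.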
Also, on Linux a user can determine how many bytes of notifications are pending on the inotify file descriptor with ioctl(FIONREAD). With a userspace emulation it will not be possible to do the same on BSDs, I think.
Posted on 7-Jul-2011 at 15:58:22 I have spent the last two weeks passing my last exams and working on gio/kqueue backend improvements. The code has been reviewed by two developers from gtk-devel-list@, Dan Winship and Colomban Wendling. They have pointed out a set of issues regarding the coding style, overengineering, reimplementations of already existing routines, and so on.
My mentor Julio Merino has also posted some comments regarding the code.
There are two commits, a big squashed one and another one; together they contain all the necessary changes.
Now the development of the backend is entering its final stage. All the expected functionality is implemented, and the further plans are:
- Tests, tests, tests. I perform so-called "smoke testing" with my Smalltalk script after every commit, but will write some simple unit tests too. Also, I will try to compile and run GNOME with my glib tree, it would be a good stress test for the backend :)
- Documentation. The code is already well commented in the gtk-doc format, but that is just reference documentation. I will summarize design decisions and other stuff in a separate document, and probably some posts from this blog will be included there too.
- I am plagued by vague doubts regarding the similarities and differences between the behavior of the inotify and kqueue backends. Currently I am setting up the build environment for Glib on my Debian netbook to check how the inotify backend really behaves. If the difference is significant, I will probably adjust the kqueue backend to behave like the inotify one. The aim is to avoid surprising developers of end-user software with unexpected events that would not occur under the same conditions with the inotify backend.
Also, I have just created a new Git repo for the second part of the project. Stay tuned!
Posted on 20-Jun-2011 at 4:30:37 During the last 2+ weeks I have resolved many pending TODOs in the gio/kqueue project, including
- Memory allocation policy improvements;
- Some performance tricks;
- Little refactorings and documentation;
Also, the missing files monitoring was implemented and merged into the kqueue/master branch.
So, almost all the planned functionality for the GIO kqueue backend is ready and I am actively testing it now.
The backend works fine on tiny homebrew examples, like
rm file
touch file
touch file
echo 'Hello' >> file
mv file file2
rm file2
but it is not enough. I have written a GNU Smalltalk script that generates a directory tree and simulates heavy file system activity on it. It touches, deletes and moves files and directories and generates subdirectories in that tree about 50 times per second. The test application (not published yet) monitors activity on that tree. I have discovered some issues, but I am not sure whether they are related to the backend or to the application. Currently I am working on resolving them.
Anyway, the rapid monitoring on a tree of ~1000 nodes looks fantastic! :)
I have also posted two announcements, one on the NetBSD tech-pkg list and another one on the Glib/GTK+ developers list.
Stay tuned.
Posted on 4-Jun-2011 at 15:18:27 Almost two weeks have passed since my last post here.
It is a busy time at the university: there are pass/fail tests now and exams ahead. However, I have merged two big commits into my kqueue/master branch.
The first one just adds a kqueue backend stub to the Glib source tree and is not as interesting as the second one, which introduces the core functionality for file monitoring with kqueue.
Currently the following events can be monitored:
- G_FILE_MONITOR_EVENT_DELETED;
- G_FILE_MONITOR_EVENT_ATTRIBUTE_CHANGED;
- G_FILE_MONITOR_EVENT_CHANGED;
- G_FILE_MONITOR_EVENT_MOVED.
Further plans for gio/kqueue extension:
- Implement (emulate) G_FILE_MONITOR_EVENT_CREATED and (if possible) G_FILE_MONITOR_EVENT_UNMOUNTED;
- Implement monitoring of missing files (i.e. wait until a file is created and then start monitoring it);
- Resolve all 20+ TODOs (pending features and improvements, doubtful solutions in the code);
- Write tests and benchmarks.

[Screenshot]
A gloomy screenshot above demonstrates what is working for now. Stay tuned!
Posted on 22-May-2011 at 16:23:20 In this post I will try to explain how the GMainLoop works. GMainLoop is the heart of every GTK+-powered application, so understanding what happens under the hood is mandatory for a Glib hacker. At least, I think that the best way to understand something better is to explain it, so I am writing this post.
What GMainLoop gives us
As I have already said, every GTK+ application runs a GMainLoop. GMainLoop processes all incoming events:
- user input: keyboard, mouse, etc;
- activity on file descriptors and sockets;
- timers (timeouts)
and invokes the appropriate handlers for all kinds of events. As I have mentioned, GIO receives inotify events through the standard GIO channels, so these events are also dispatched via GMainLoop.
There is polling all the way down
GMainLoop is like an endless loop of main loop iterations. On each iteration, it checks for new events from all event sources. Depending on the application, event sources may be:
- IO channels -- files and sockets;
- Window subsystem -- X11, Windows, Quartz, etc;
- Timers (timeouts).
GMainLoop usually runs in a single thread. Now imagine a situation: a GTK+ application has installed an IO channel watch to handle incoming data on a socket in the Glib way. But if it is running in a single thread, how can it process GDK events and wait for data on a socket at the same time?
The first idea is that GMainLoop uses timed-out polling. The scheme would be something like this:
- (A) check for available IO on the polled file descriptors using select() or poll() with a short timeout;
- if there is data available, invoke the appropriate handlers;
- (B) when the timeout elapses or the data has been processed, process the pending events from the X11 server;
- repeat.
But this scheme is very, very bad -- it requires constant switching between (A) and (B) and consumes a lot of CPU. Thank God, GMainLoop works in another way.
On GNU/Linux, GMainLoop invokes a single poll() per iteration, and when no timer is pending its timeout is infinite. But why does the GUI not get blocked while we are waiting for data on a socket? The key point is that the native GTK+ environment is the X11 Window System. Remember its network transparency, which people so often swear at? The X11 client and the X11 server communicate via a socket, and this socket is polled in the same file descriptor set as all the other sockets and files.
When the user clicks a button, the X11 server sends an event to the socket and poll() signals us about it. When data is available on another polled socket, the same poll() signals us about it. When inotify notifies us about filesystem activity, the same poll() tells us about it too. One mechanism, multiple purposes -- truly ingenious.
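The idea of one poll() call serving several unrelated event sources can be shown with a small sketch. The wait_for_any() helper and the MAX_SOURCES limit are made up for this illustration; GMainLoop itself builds its descriptor set in a more elaborate way.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

#define MAX_SOURCES 16

/* Block in a single poll() on all descriptors at once.
 * Returns the index of the first readable descriptor,
 * or -1 on timeout, error or an unsupported count. */
int wait_for_any(const int *fds, int count, int timeout_ms)
{
    struct pollfd pfds[MAX_SOURCES];

    assert(fds != NULL);
    if (count <= 0 || count > MAX_SOURCES)
        return -1;

    for (int i = 0; i < count; i++) {
        pfds[i].fd = fds[i];
        pfds[i].events = POLLIN;    /* we only care about incoming data */
        pfds[i].revents = 0;
    }

    if (poll(pfds, (nfds_t)count, timeout_ms) <= 0)
        return -1;

    for (int i = 0; i < count; i++)
        if (pfds[i].revents & POLLIN)
            return i;
    return -1;
}
```

Whether a descriptor belongs to the X11 connection, a network socket or inotify makes no difference here: the caller passes them all in one array and sleeps until any of them has data.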
However, this ingenious simplicity has a significant shortcoming...
The problem
And the shortcoming is the following: if we want to add a custom event source to GMainLoop gracefully, we need to pass it a file descriptor for polling (timers do not count!). Sometimes this is not possible.
Each event source in Glib is represented with the following set of functions:
struct GSourceFuncs {
  gboolean (*prepare)  (GSource    *source,
                        gint       *timeout_);
  gboolean (*check)    (GSource    *source);
  gboolean (*dispatch) (GSource    *source,
                        GSourceFunc callback,
                        gpointer    user_data);
  void     (*finalize) (GSource    *source); /* Can be NULL */
  /* For use by g_source_set_closure */
  GSourceFunc         closure_callback;
  GSourceDummyMarshal closure_marshal; /* Really is of type GClosureMarshal */
};
The prepare function and (AFAIK, if a file descriptor is present) the check function are called on each GMainLoop iteration. If either of them returns TRUE, the source is considered ready and its dispatch function will be called. This function should invoke the user callback for the event.
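To make the order of the calls easier to follow, here is a toy model of one iteration in plain C. It is not GLib code: toy_source, toy_iteration() and the counter example are invented for this sketch, and it assumes that a source counts as ready when either its prepare or its check function reports readiness.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for a GSource with its GSourceFuncs (not GLib types). */
struct toy_source {
    bool (*prepare)(void *state);   /* ready without polling? */
    bool (*check)(void *state);     /* ready after polling? */
    void (*dispatch)(void *state);  /* invoke the user callback */
    void *state;
    bool  ready;
};

/* One iteration: prepare all, (poll), check all, dispatch the ready ones. */
size_t toy_iteration(struct toy_source *sources, size_t count)
{
    size_t dispatched = 0;

    assert(sources != NULL || count == 0);
    for (size_t i = 0; i < count; i++)
        sources[i].ready = sources[i].prepare(sources[i].state);

    /* A real main loop would block in poll() at this point. */

    for (size_t i = 0; i < count; i++)
        if (!sources[i].ready)
            sources[i].ready = sources[i].check(sources[i].state);

    for (size_t i = 0; i < count; i++) {
        if (sources[i].ready) {
            sources[i].dispatch(sources[i].state);
            dispatched++;
        }
    }
    return dispatched;
}

/* Example source: a counter of pending items that dispatch drains. */
struct counter { int pending; int handled; };

static bool counter_prepare(void *state) { (void)state; return false; }
static bool counter_check(void *state)
{
    return ((struct counter *)state)->pending > 0;
}
static void counter_dispatch(void *state)
{
    struct counter *c = state;
    c->handled += c->pending;
    c->pending = 0;
}
```

The comment in the middle of toy_iteration() marks the spot where the trouble described below begins: everything a source wants to wait for has to be expressible as a descriptor for that one poll().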
In my project I need to check filesystem activity with kevent(), so I could just call it from my implementation of prepare. But this is a completely wrong way -- I would need to monitor all file events with kqueue and all other events with poll(). Unfortunately, it is unacceptable to block execution first in kevent() and then in poll(). It is also unacceptable to use timed-out polling and to call both kevent() and poll() with short timeouts during an iteration, because of the high CPU consumption of such a scheme.
The best solution is to make a true Glib BSD port -- to drop poll() and to use kevent() everywhere instead. An efficient replacement would require a lot of unplanned changes in the Glib core, and that is an error-prone way. An inefficient replacement would not require many changes, but nobody needs it.
Another way is to run a kevent() loop in a separate thread. I do not like this way either, because it automatically introduces an additional thread into each process that uses GIO file monitoring. Even today, context switching is not cheap.
So, I have decided to look for alternative solutions.
How it is done in Windows
I programmed for Windows for a long time, and I know that a Windows application communicates with the system via window messages (which it peeks from a message queue). So there should not be any sockets, and the trick with poll() should not work, which is why I first turned to the Win32 implementation of Glib and GTK+.
In Windows, g_poll() is not just a wrapper over poll() as in Unix systems; it is completely different. It uses the WaitForMultipleObjectsEx() and MsgWaitForMultipleObjectsEx() routines to monitor events. The first function supports only HANDLEs and is used for sockets. The second one is used when GTK is loaded, since MsgWaitForMultipleObjectsEx() can monitor window messages too. So we are facing the same situation as in GNU/Linux -- the same routine can be used for working with both sockets and window events. Another argument for implementing a true Glib BSD port with kevent() instead of poll() :)
How it is done in Mac OS X
Unfortunately, the Windows port did not give me any ideas, so I turned to the Mac OS X GTK+ port. I have never worked with Macs and this programming environment is new to me. A big comment at the beginning of /gdk/quartz/gdkeventloop-quartz.c states the following:
* Both cases share a single problem: the OS X API's don't allow us to
* wait simultaneously for file descriptors and for events. So when we
* need to do a blocking wait that includes file descriptor activity, we
* push the actual work of calling select() to a helper thread (the
* "select thread") and wait for native events in the main thread.
This is exactly my problem! And gtk-quartz developers have solved it with a separate thread, as I have already assumed in this post. Okay, if they have done so, I will do the same.
P.S. Git 'em All!
I have uploaded my Glib source tree to Github. The main development branch is kqueue/master; AFAIR it is based on tag 2.26.1. There is also a kqueue/sandbox branch, where I will try experimental things before merging them into kqueue/master. The first commit, which adds a kqueue check to configure and a plugin stub to the source tree, is already there.
Posted on 15-May-2011 at 14:31:30 I have spent the last two weeks trying to update my NetBSD 5.0.2 to current. It was an interesting quest and I still have not completed it :)
The CVS source snapshot failed to build, and the binary one did not work properly (I had some problems with X). So I have settled on v5.1 -- it is not as outdated for a developer as v5.0.2, but it works. Though I start Emacs with LD_LIBRARY_PATH=/usr/pkg/lib (it will not launch otherwise) and MPlayer sometimes crashes the entire system, I think it is quite usable.
But this post is not about it. Here I will try to summarize how the GIO file monitoring works and how it uses inotify.
GFileMonitor for an end-user
GFileMonitor is the monitoring entity in Glib. Let's take a look at its public interface (the official documentation here and here):
Synopsis
enum                GFileMonitorEvent;
                    GFileMonitor;
GFileMonitor*       g_file_monitor      (GFile *file,
                                         GFileMonitorFlags flags,
                                         GCancellable *cancellable,
                                         GError **error);
Signals
"changed" : Run Last
To start monitoring a specific file or directory, the programmer should call g_file_monitor() with the appropriate arguments. The returned object is the monitor itself; it will emit the "changed" signal for every event that occurs. The signal/slot system is common in Glib, so it is no surprise that it is used here too.
The slot should have the following prototype:
void user_function (GFileMonitor      *monitor,
                    GFile             *file,
                    GFile             *other_file,
                    GFileMonitorEvent  event_type,
                    gpointer           user_data)   : Run Last
The event_type argument will show what kind of event has happened:
typedef enum {
  G_FILE_MONITOR_EVENT_CHANGED,
  G_FILE_MONITOR_EVENT_CHANGES_DONE_HINT,
  G_FILE_MONITOR_EVENT_DELETED,
  G_FILE_MONITOR_EVENT_CREATED,
  G_FILE_MONITOR_EVENT_ATTRIBUTE_CHANGED,
  G_FILE_MONITOR_EVENT_PRE_UNMOUNT,
  G_FILE_MONITOR_EVENT_UNMOUNTED,
  G_FILE_MONITOR_EVENT_MOVED
} GFileMonitorEvent;
That's all. Clear and simple :)
inotify for an end-user
Now let's take a look at inotify -- the subsystem that is used by GFileMonitor on Linux. inotify is implemented in the kernel; the userspace interface is:
int inotify_init (void);
int inotify_add_watch (int fd,
                       const char *pathname,
                       uint32_t mask);
(see this LWN article for more details).
Again, the interface is fairly simple. The inotify_init() function initializes the inotify subsystem and returns a file descriptor. inotify_add_watch() specifies a file and events to monitor. The file descriptor then can be used to monitor events with select(), poll(), etc. The application will receive notifications from this file descriptor in the following form:
struct inotify_event {
    int      wd;     /* Watch descriptor */
    uint32_t mask;   /* Mask of events */
    uint32_t cookie; /* Unique cookie associating related
                        events (for rename (2)) */
    uint32_t len;    /* Size of 'name' field */
    char     name[]; /* Optional null-terminated name */
};
The following event types are available:
- IN_ACCESS -- File was read from;
- IN_MODIFY -- File was written to;
- IN_ATTRIB -- File's metadata (inode or xattr) was changed;
- IN_CLOSE_WRITE -- File was closed (and was open for writing);
- IN_CLOSE_NOWRITE -- File was closed (and was not open for writing);
- IN_OPEN -- File was opened;
- IN_MOVED_FROM -- File was moved away from watch;
- IN_MOVED_TO -- File was moved to watch;
- IN_DELETE -- File was deleted;
- IN_DELETE_SELF -- The watch itself was deleted.
And again, clear and simple.
The glue
And now, finally, let's take a look under the hood at how GFileMonitor uses inotify. When g_file_monitor() is invoked for a file, the following call chain occurs:
g_file_monitor()
  g_file_monitor_file()
    iface->monitor_file()
      g_local_file_monitor_file()
        _g_local_file_monitor_new()
          get_default_local_file_monitor()
          g_object_new()
Depending on the type of the file, the toplevel function g_file_monitor() calls g_file_monitor_file() for files and g_file_monitor_directory() for directories. I have shown the call stack for the file case.
g_file_monitor_file() takes the observed file's GFile object interface, iface, and calls its monitor_file member function by pointer. For local files, which are represented with GLocalFile class, this interface member function points to g_local_file_monitor_file(). This function, in turn, calls _g_local_file_monitor_new().
_g_local_file_monitor_new() is the most interesting one. First of all, it obtains the type of the monitor object to create with get_default_local_file_monitor() and then creates an instance of that type with g_object_new(). Here we meet classes and metaclasses, as in Smalltalk, Common Lisp and their modern popular successors.
get_default_local_file_monitor() returns the type of the monitor to be created. On its first invocation, the function searches for a suitable plugin among the available GIO extensions. GIO supports several types of extensions, and the function uses G_LOCAL_FILE_MONITOR_EXTENSION_POINT_NAME to filter out only the file monitoring ones. Then each matched extension is asked for its GLocalFileMonitorClass subclass object. Each such object has an is_supported() member function, and if this method returns true, the class is picked as the default class for file monitoring objects. This type will be returned on all subsequent calls to get_default_local_file_monitor().
So, the _g_local_file_monitor_new() function acts much like a "virtual constructor".
Thus, GIO does not refer to its inotify backend directly.
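The selection logic boils down to "pick the first class that says it is supported, and remember the answer". Here is a plain C model of that idea; monitor_class, the registered table and default_monitor_class() are stand-ins made up for this sketch and do not mirror GIO's real data structures. The cache is not thread-safe, which a real implementation would have to handle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a GLocalFileMonitorClass subclass. */
struct monitor_class {
    const char *name;
    bool (*is_supported)(void);
};

static bool never(void)  { return false; }
static bool always(void) { return true; }

/* Hypothetical list of registered extensions, in priority order. */
static const struct monitor_class registered[] = {
    { "backend-a", never },
    { "backend-b", always },
    { "backend-c", always },
};

/* Return the first supported class; the search runs once and is cached. */
const struct monitor_class *default_monitor_class(void)
{
    static const struct monitor_class *cached = NULL;

    if (cached == NULL) {
        for (size_t i = 0; i < sizeof registered / sizeof registered[0]; i++) {
            assert(registered[i].is_supported != NULL);
            if (registered[i].is_supported()) {
                cached = &registered[i];
                break;
            }
        }
    }
    return cached;
}
```

The caller never names a concrete backend: it asks for the default class and instantiates whatever comes back, which is why a kqueue backend can be plugged in without touching the code that creates monitors.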
The rest is trivial: in the gio/inotify directory at Glib's source tree we can find GInotifyFileMonitor and GInotifyDirectoryMonitor classes, which operate with inotify directly.
Considering that inotify provides a file descriptor for monitoring, the implementation of the inotify backend can be summarized quite simply:
- When the extension is initialized, invoke inotify_init() and install GIO watches on the obtained file descriptor;
- When a GInotify{File,Directory}Monitor is created, invoke inotify_add_watch() with the appropriate parameters (obtained from g_file_monitor());
- In the inotify channel watch callback function, read a struct inotify_event from the file descriptor, decode it to the GIO format (GFileMonitorEvent event code, etc) and emit the "changed" signal.
That is it!
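The decoding step from the last item could look roughly like this. The enum and the mapping below are assumptions chosen for illustration only: the local EVENT_* names merely mirror a few GFileMonitorEvent values, and the actual translation table in gio/inotify may differ.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/inotify.h>

/* Local stand-ins for a few GFileMonitorEvent values (not the GIO enum). */
enum monitor_event {
    EVENT_NONE,
    EVENT_CHANGED,
    EVENT_ATTRIBUTE_CHANGED,
    EVENT_DELETED,
    EVENT_CREATED
};

/* Translate an inotify event mask into one monitor event (assumed mapping). */
enum monitor_event decode_mask(uint32_t mask)
{
    if (mask & IN_MODIFY)
        return EVENT_CHANGED;
    if (mask & IN_ATTRIB)
        return EVENT_ATTRIBUTE_CHANGED;
    if (mask & (IN_DELETE | IN_DELETE_SELF | IN_MOVED_FROM))
        return EVENT_DELETED;       /* the file left the watched place */
    if (mask & IN_MOVED_TO)
        return EVENT_CREATED;       /* the file appeared in the watched place */
    return EVENT_NONE;              /* e.g. IN_ACCESS or IN_OPEN: nothing to report */
}
```

A kqueue backend needs the same kind of table, only with kqueue filter flags on the input side.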