Building Mozilla Seamonkey 2.17 on NetBSD

Recently I've decided to use Mozilla Seamonkey as my default browser, because:

  • it reminds me of the good old days;
  • I find it less bloated and slow than today's mainstream Firefox and Chromium (even though Seamonkey includes its own mail client, IRC client, and so on).

I am running NetBSD-current (6.99.19) on amd64. The build invoked with

$ cd /usr/pkgsrc/www/seamonkey
$ make install clean
went fine; however, there was an issue at the installation stage:

===> Building binary package for seamonkey-2.17
=> Creating binary package /usr/pkgsrc/packages/All/seamonkey-2.17.tgz
pkg_create: can't stat 
pkg_create: can't stat 
pkg_create: lstat failed for file 
No such file or directory
*** Error code 2

make: stopped in /usr/pkgsrc/www/seamonkey
*** Error code 1

make: stopped in /usr/pkgsrc/www/seamonkey

Looks like it is not a pkgsrc issue: judging by Google search results, the same thing occurs on GNU/Linux systems. To fix it, simply do

$ cd /usr/pkgsrc/www/seamonkey 
$ cp work/comm-release/mozilla/dist/xpi-stage/inspector/install.rdf \
$ cp -rv work/comm-release/mozilla/dist/xpi-stage/inspector/defaults/ \
And then invoke make install again.

Haskell: escaping the record update syntax hell

The problem

Suppose you have a data type Sample:

data Sample = Sample {
    someNum :: Int
  , someText :: String
  , someFoo :: Foo
} deriving (Eq, Show)

data Foo = Foo {
    fooBool :: Bool
  , fooPair :: (Int, Int)
} deriving (Eq, Show)

...and you have a value of this type, s:

let s = Sample 1 "Sample" $ Foo True (2, 3)

Having s, you need to change something inside its Foo. How do you do it? The most obvious way is to write something like:

s2 = s { someFoo = foo { fooBool = False } }
  where foo = someFoo s

Looks crappy, but it is okay. And what if we need to increment s2's someFoo's fooPair? The issue is that the increment depends on the previous value, so we need to write something like

s3 = s2 { someFoo = foo { fooPair = newPair } }
  where foo = someFoo s2
        newPair = (p1 + 1, p2 + 1)
        (p1, p2) = fooPair foo

Wow! It looks completely scary. Imagine how it would look if we had three or four nesting levels!

The idea

We can make things easier with simple helper functions:

modFoo :: Sample -> (Foo -> Foo) -> Sample
modFoo s fcn = s { someFoo = fcn $ someFoo s }

modFooPair :: Foo -> ((Int, Int) -> (Int, Int)) -> Foo
modFooPair f fcn = f { fooPair = fcn $ fooPair f }

Using these functions, we can define s3 as:

s3 = modFoo s2 $ \f -> modFooPair f $ \(a, b) -> (a + 1, b + 1)

It looks definitely better! But now we notice that both modFoo and modFooPair follow the same pattern:

  1. Take an object and a function as parameters;
  2. Apply the function to the selected field's value;
  3. Return a new object, based on the passed one, with the selected field set to the function's return value.

It is boring to write such boilerplate code for each data field by hand. Can't it be automated?

The solution

Yes, it can. With the Template Haskell extension, we can inspect data type definitions and generate the code we want.

This approach lies at the heart of the Data.Mutators package. For each field of each record-syntax constructor of a given data type, Data.Mutators generates set- and mod- functions. For example, given a data type

data ObjectType = ObjectType {
  something :: SomethingType
}

after invoking genMutators ''ObjectType, we get the following functions:

setSomething :: ObjectType -> SomethingType -> ObjectType
modSomething :: ObjectType -> (SomethingType -> SomethingType) -> ObjectType

Obviously, the set- function sets the field value. The mod- function applies a function to the field value -- quite handy when we need to modify a field based on its existing value.

The names of the generated functions are built by upper-casing the first character of the field name and prefixing it with "set" or "mod". This behavior can be adjusted with an alternative function, genMutators', which takes an additional argument of type String -> (String, String). Given a field name, this function should return a tuple with the names for the setter and the modifier (in exactly that order).
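
For illustration, here is what hand-written equivalents of these generated functions could look like, using Int in place of SomethingType (a sketch; the actual Template Haskell output may differ in details):

```haskell
-- Hand-written equivalents of what genMutators ''ObjectType would
-- produce (illustration only; Int stands in for SomethingType).
data ObjectType = ObjectType { something :: Int } deriving (Eq, Show)

setSomething :: ObjectType -> Int -> ObjectType
setSomething o v = o { something = v }

modSomething :: ObjectType -> (Int -> Int) -> ObjectType
modSomething o f = o { something = f (something o) }
```

With these in scope, modSomething (ObjectType 1) (+ 1) evaluates to ObjectType {something = 2}.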


Some tips on using Amber Smalltalk with nginx

Amber is an implementation of the Smalltalk-80 language that compiles to JavaScript. Amber is extremely useful for client-side web development as a neat replacement for JS; it is integrated with jQuery out of the box and is fully compatible with other JavaScript libraries.

Amber comes with a nice IDE that works directly in the browser. However, to save code from the browser-based IDE to disk, Amber requires an HTTP server with WebDAV support. Though Amber comes with a tiny WebDAV server written in Amber itself, production environments require more powerful HTTP servers, like nginx or Apache.

In this post I will explain how to configure nginx to work with Amber.

Basic configuration

WebDAV functionality in nginx is provided by the ngx_http_dav_module module. This module is not built by default, so nginx needs to be rebuilt with the --with-http_dav_module configure option.

However, Debian GNU/Linux and Ubuntu have an nginx-extras package that includes a lot of modules, including this one.

Consider the following minimal configuration (file /etc/nginx/nginx.conf):

server {
    root /opt/mysite;
    client_body_temp_path /opt/mysite/tmp;

    location / {
        dav_methods PUT DELETE MKCOL COPY MOVE;
        dav_access  group:rw all:r;
    }
}

Here we have defined a basic static site. The /opt/mysite directory should contain an index.html page with Amber Smalltalk included (see Getting started and A first application). Note that /opt/mysite must be accessible to nginx (see more under "Permissions").


Permissions

Now you should have something working at http://localhost. But when you try to save any changes from the IDE, the commit operation will fail. We need to set the appropriate permissions on some directories to make it work.

Most likely you and nginx run in the system under different users. On Debian and Ubuntu nginx runs as www-data, and you work as, for example, user. The rest of this section is full of tautology, sorry.

During development, the website's files and directories usually belong to the user user, and user is free to modify them. To make user www-data also able to write to the specific directories, we need to take a few simple steps:

  • Add user www-data to user's group. In modern Linux distros users belong to eponymous groups, so by default the user user belongs to the group user. The following command should be performed as root:
    # usermod -a -G user www-data
  • Modify the permissions of the /st and /js subdirectories. When a package is committed, Amber stores the Smalltalk and JavaScript sources there, respectively. The following commands should be performed as user:
    $ cd /opt/mysite
    $ chmod 775 st js

Mode 775 allows any member of the group user to write into these directories. Not very secure, but enough for our needs. Since we have added user www-data to the group user, the nginx process can now write there, so commits should work.

A more sophisticated configuration

Suppose we have a web application written in Smalltalk, Ruby, Python, or something else, and we want to use Amber there. Modern web frameworks are usually supplied with their own tiny HTTP servers to make initial development and configuration easier. Not a problem! We can still serve WebDAV requests and static content (images, JS and CSS files) with nginx, and dynamic content with the web application. Moreover, separating the responsibilities this way is good practice.

Say our custom application runs on the same machine as nginx and uses port 8000. Only a few changes to the configuration are required:

server {
    root /opt/mysite;
    client_body_temp_path /opt/mysite/tmp;

    location ~ ^/(images|js|st|css) {
        dav_methods PUT DELETE MKCOL COPY MOVE;
        dav_access  group:rw all:r;
    }

    location / {
        proxy_pass http://localhost:8000;
    }
}

We assume that the static content is available under /images, /js, /st, and /css -- the default layout when using Amber. If a request starts with /images, /js, /st, or /css, the underlying content is served by nginx; we have used a regular expression for that. DAV methods are also allowed there (since by default Amber puts files into the /js and /st directories on commit).

All other requests are proxied to our custom application. The only thing left to tune is to include the Amber scripts in the pages generated by the custom application.

And even more sophisticated configuration

Imagine that our web application grows and becomes more complex, and it uses RESTful URLs for the dynamic content. For example, users of the web application can edit their profile data on the page http://localhost/edit/profile.

If you place Amber on a page under such a location, committing your Smalltalk and JavaScript code from the Amber IDE will fail. The reason is simple: Amber generates DAV requests, and nginx tries to write the files into the /opt/mysite/edit/st and /opt/mysite/edit/js directories.

Of course you could create these directories, and further directories for each level of your RESTful web app. But since we already have the /opt/mysite/st and /opt/mysite/js directories, I would like to store the Amber sources for any page there. How can we achieve that?

Again, the solution is fairly easy. The updated nginx site configuration should look like:

server {
    root /opt/mysite;
    client_body_temp_path /opt/mysite/tmp;

    location ~ ^(/|/edit/)(images|js|st|css) {
        rewrite /edit/(.*) /$1;
        dav_methods PUT DELETE MKCOL COPY MOVE;
        dav_access  group:rw all:r;
    }

    location / {
        proxy_pass http://localhost:8000;
    }
}

Here, if a request starts with /edit and then continues as a normal Amber or static data request, we simply drop the /edit part, so the Smalltalk sources are written into the /opt/mysite/st directory, not into /opt/mysite/edit/st. Note that GET requests like /edit/images/1.png are also rewritten, and in that case the file /opt/mysite/images/1.png is accessed. To fix this, we just need to move the rewrite and dav_* directives into a separate location, such as ^(/|/edit/)(js|st).
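
With that change applied, the relevant part of the configuration might look like this (a sketch along the lines described above; untested):

```nginx
server {
    root /opt/mysite;
    client_body_temp_path /opt/mysite/tmp;

    # DAV methods and the /edit/ rewrite apply only to the Amber
    # source trees, so committed files always land in /js and /st.
    location ~ ^(/|/edit/)(js|st) {
        rewrite /edit/(.*) /$1;
        dav_methods PUT DELETE MKCOL COPY MOVE;
        dav_access  group:rw all:r;
    }

    # Other static content is served as-is, without rewriting.
    location ~ ^/(images|css) {
    }

    location / {
        proxy_pass http://localhost:8000;
    }
}
```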

Debugging GNU Smalltalk on NetBSD: Episode II

See also: Episode I

Setting up the NetBSD kernel debugging environment

NetBSD has its own kernel debugger named DDB. As the original documentation states, it is useful for gathering crash tracebacks, examining the values of variables, and other minor debugging tasks, but if you're doing serious kernel hacking you will want to set things up to work with the remote debugger, KGDB, instead.

There are a few HOWTOs on the Internet about setting up remote debugging for the NetBSD kernel.

The first one is the official NetBSD documentation chapter. It describes how to set up debugging using two computers connected with a null-modem cable.

The second one is a tutorial by Alexander Shishkin. It uses QEMU to host the debugged system, so the work can be done on a single PC. However, it relies on a custom script for generating a disk image with the kernel and a basic userland, which looks a bit tricky.

I wanted to use a normal system from the official distribution ISO image, and I had only one laptop with NetBSD, so QEMU was the solution. My way is thus a combination of both methods mentioned above.

Building the kernel

Building the kernel is fairly easy. All we need is to modify the configuration to enable KGDB and to generate a full symbol table. The following steps are taken from the official documentation already referenced:

# cd /usr/src/sys/arch/i386/conf
# cp GENERIC DEBUGGING

GENERIC and DEBUGGING are build configuration files. These files determine what will be included in the kernel, which options will be enabled or disabled, and so on. GENERIC is the basic configuration file, and the stock NetBSD kernel is built as GENERIC.

I have named the new configuration "DEBUGGING". In the DEBUGGING configuration file, the following lines have to be commented out with a preceding hash sign (#):

#options    DDB                     # in-kernel debugger
#options    DDB_HISTORY_SIZE=100    # enable history editing

and the following lines have to be uncommented by removing the preceding hash sign:

options     KGDB                    # remote debugger
options     "KGDB_DEVNAME=\"com\"",KGDB_DEVADDR=0x3f8,KGDB_DEVRATE=9600
makeoptions DEBUG="-g"              # compile full symbol table

The KGDB_DEVADDR option sets the address of the serial port that will be used for debugging: 0x3f8 is tty00, 0x2f8 is tty01.

There are just a few things left to do to build the kernel:

# config DEBUGGING
# cd ../compile/DEBUGGING
# make depend
# make

That is all! After a successful compilation we get the netbsd and netbsd.gdb files in the current directory: the new kernel and the debugging symbols for GDB, respectively.

Preparing the guest system

Now we need to get a basic system working in QEMU. The following commands create a 2 GB QEMU hard disk image and run the NetBSD installer on it:

$ qemu-img create netbsd.img 2G
$ qemu -hda netbsd.img -cdrom i386cd-5.0.2.iso -boot d -m 196 -localtime

All the QEMU options in the last command are quite straightforward; the most interesting are:

  • -hda -- specifies the hard disk image we created a step before (netbsd.img);
  • -cdrom -- specifies the installation CD ISO image (i386cd-5.0.2.iso in my case);
  • -boot d -- tells QEMU to boot from the virtual CD-ROM instead of the hard disk image.

After a successful installation we need to restart QEMU with somewhat different options:

$ qemu -hda netbsd.img -boot c -m 196 -localtime \
       -net user -net nic,model=rtl8139 \
       -redir tcp:5555::22

The two -net options enable networking in the guest. The -redir option allows us to connect to the guest over ssh via port 5555 on localhost.

After the system boots, log in as root and run

# dhclient

to obtain an IP address. ping will not work, but pkgsrc will. I have installed Git and OpenSSH.

Installing the new kernel is quite simple. With QEMU user networking, the host machine usually appears to the guest as, so the kernel can be transferred with scp:

# mv /netbsd /netbsd.old
# scp user@ /

Final steps

Again, we need to restart QEMU with a new configuration:

$ qemu -hda netbsd.img -boot c -m 196 -localtime \
       -net user -net nic,model=rtl8139 \
       -redir tcp:5555::22 \
       -serial tcp::4444,server

The last option, -serial tcp::4444,server, creates a pipe from the host's port 4444 to the guest's serial port (remember the KGDB options in the kernel configuration).

QEMU will start, but it will not launch the guest system until we connect to this port. It is time to open gdb:

(gdb) symbol-file /usr/src/sys/arch/i386/compile/DEBUGGING/netbsd.gdb 
(gdb) target remote :4444

The QEMU window will appear, and we will need to boot the guest system with a different command at the NetBSD bootloader prompt:

boot -d

After a few seconds the guest system will hit the initial breakpoint, and in the gdb shell we will get something like this:

0xc053556c in breakpoint ()

Great! Now everything is ready for debugging.

(to be continued)

Debugging GNU Smalltalk on NetBSD: Episode I

In the previous post I described how to build GNU Smalltalk on the fascinating operating system NetBSD. The interpreter worked pretty well, but I wanted something more than just simple scripts.

The problem

So I tried to run Seaside. netstat said that port 8080 was open, but I could not reach http://localhost:8080/seaside in the browser.

My first suspicion fell on sockets. Of course, it would be hard to debug sockets with such complicated tools as Swazoo and Seaside on top, so I took Samuel Montgomery-Blinn's simple TCP echo server example for the tests. The code has been slightly simplified to run in a single green thread, to serve a single client, and to handle only a single message:

Eval [
    | client server string |

    PackageLoader fileInPackage: #TCP.

    server := TCP.ServerSocket port: 8000.
    server waitForConnection.

    client := server accept.
    string := client nextLine.
    client nextPutAll: string; nextPut: Character nl.

    client flush.
    client close.
]

This sample works fine on GNU/Linux but does not work on NetBSD. I successfully connected to port 8000 with telnet, but after typing a message and hitting Enter, the server did not reply with the echo. The server process still hung in memory.

Great, it is time to take a look under the hood and understand how GNU Smalltalk sockets work.

Sockets: it is streams all the way down

GNU Smalltalk sockets are implemented in a cute way. The "end-user" objects are not actually sockets; they are just adaptors that implement a Stream interface over concrete socket implementations.


End-user class hierarchy

Obviously, a socket class does not actually implement methods like #nextLine itself -- such a method is abstract and is implemented somewhere up in the Stream class. The design patterns literature calls these "template methods"; I call it good OO design. The template methods are expressed in terms of other methods whose behavior may be specified or changed in the subclasses.

The underlying implementations are actually FileDescriptors.


Implementation class hierarchy

Again, it is quite logical: core BSD sockets are represented as file descriptors in user space (remember that everything is a file in Unix). Depending on the type of a file descriptor, common system calls on it (such as read(2), write(2), fcntl(2)) end up invoking different code in kernel space.

Files, sockets, and all I/O in general are communication with the outside world. This cannot be implemented in pure Smalltalk; at the lowest level we have to deal with the API that the operating system provides for us. In the case of files and sockets we are working with file descriptors -- integer values on Unix systems.

In GNU Smalltalk, file descriptors are represented by the FileDescriptor class. Every object of this class holds a numeric instance variable fd -- the actual Unix file descriptor.

All the high-level I/O methods that the programmer uses in an application are expressed through low-level access methods like #fileOp:, #fileOp:ifFail:, #fileOp:with:, #fileOp:with:ifFail: and so on. These methods call the same primitive, VMpr_FileDescriptor_fileOp, and the subsequent processing happens on the VM side. Depending on the index passed to #fileOp: from a higher-level method, a different file operation is performed.

The basic socket implementation class AbstractSocketImpl overrides the #fileOp: methods to call VMpr_FileDescriptor_socketOp primitive instead of VMpr_FileDescriptor_fileOp.

Now, after digging into the implementation details, let's return to the echo server example. If we interrupt the hung server process, we get the following stack trace:

optimized [] in Sockets.StreamSocket>>newReadBuffer:

As we can see, our process has stuck on the call to AbstractSocketImpl>>ensureReadable, which was implicitly invoked via a chain of calls from Stream>>nextLine.

Stream>>nextLine does a simple thing: it checks whether there is data available and reads it byte by byte until a newline character is reached.

AbstractSocketImpl>>ensureReadable is a little more interesting. It blocks the current Smalltalk thread and waits until data is available for reading. It involves the VMpr_FileDescriptor_socketOp primitive too. Let's now go down from Smalltalk to the virtual machine side.

Asynchronous I/O for the win

Our sample server is synchronous. First it waits for a client connection, and then it waits again while the client sends us a line of text. All these operations are synchronous: we cannot do anything else inside a single Smalltalk thread while waiting for an event.

Such operations are called "blocking". If we wrote our echo server in C, we would use blocking sockets, so system calls like accept(2) and recv(2) would block our server process until a client connects and sends some data, respectively. It is a very simple and straightforward scheme that is often used in simple applications.

We could assume that GNU Smalltalk's #waitForConnection and #nextLine are implemented in the same way, since these methods provide us with the same blocking behavior, but actually that is not true.

GNU Smalltalk implements green threads (a.k.a. Smalltalk Processes) for multitasking inside the VM; it does not support native system threads, so calling accept(2) or recv(2) on a truly blocking socket would block the entire virtual machine for the duration of the call. That is completely unacceptable, so socket I/O is implemented in a cuter way, with non-blocking sockets.

When a Smalltalk process needs to wait for a specific event (client connection or incoming data) on a specific socket, the AbstractSocketImpl>>ensureReadable is called. #ensureReadable creates and locks a Semaphore to block the current Smalltalk process.

On the virtual machine side, via a call to the primitive VMpr_FileDescriptor_socketOp with operation codes 14 and 13, the following happens:

  1. SIGIO signal handler is installed on the socket;
  2. Socket is added to a table of polled descriptors;
  3. If there is no code to execute and all Smalltalk processes are sleeping (waiting for data), sigsuspend(2) is called. In this state the virtual machine process sleeps awaiting the arrival of any Unix signal. I have not tested it, but I assume the VM process can handle SIGIO even without calling sigsuspend(2).
  4. If there is an activity on a file descriptor, i.e. incoming connection or data, the VM process will receive SIGIO and the signal handler (installed on the first step) will be executed;
  5. This handler will check the table of polled descriptors. For every ready for I/O descriptor VM will unlock the appropriate semaphore and the appropriate Smalltalk process will resume its execution;
  6. The descriptor is removed from a table of polled descriptors.

Now we are back on the Smalltalk side. After resuming from #ensureReadable, we know that the descriptor is ready for I/O and calling accept(2) or recv(2) will not block the interpreter. That's it!

A set of simple debugging printfs inserted into the VM showed that the VM really does go to sleep after the call to #nextLine. It looks like the gst process simply does not receive SIGIO on incoming data. I saw only one way to check that: to debug the NetBSD kernel.

See also: Episode II

How to build GNU Smalltalk on NetBSD

GNU Smalltalk is being developed under GNU/Linux (primarily), so if you have used it on GNU/Linux, everything should work well.

But if you change your working environment to a different operating system, like NetBSD, you can run into some trouble, even (mostly?) at the compilation stage.

Okay, so what we need to build GNU Smalltalk on NetBSD properly?

First of all, BSD Make is not GNU Make. I could not build GNU Smalltalk with BSD Make (remember that BSD is Unix and GNU's Not Unix?).

Next, even with gmake the compilation failed. In my case, the linker threw an 'undefined reference to...' error mentioning one of the pthread functions. I do not know why autotools did not handle it, but all we need to fix it is to add -lpthread to LDFLAGS.

After that the compilation completed successfully... but it is not the end of the story. After installation I tried to create a basic Seaside image:

$ gst-load -iI Seaside

...and DLD (GNU Smalltalk's Dynamic Library Loader) said that there is no such module 'iconv'.

Knowing about GNU Smalltalk's binding development features, I decided to check the Iconv package:



Okay, the Iconv package dynamically loads the iconv shared library on startup. But there were no shared libraries at all in the GNU Smalltalk source and build directories!

In the compilation logs I found a lot of libtool warnings: 'Warning: linker path does not have real file for library ...'. libtool could not find libm and other standard libraries, although they are all available in /usr/lib. So we need to tell libtool about that and add -L/usr/lib to LDFLAGS. And it worked!

So the build procedure is:

$ autoreconf -vi
$ ./configure
$ gmake LDFLAGS="-L/usr/lib -lpthread"

Is Seaside secure?

Now playing: Iggy Pop - Livin' On The Edge Of The Night

Seaside is known as a heretical web framework: as every Seaside programmer knows, it uses continuations and stores session state information directly in the URL.

A typical Seaside URL contains two arguments, _s and _k. _s identifies a session (it is the same for all pages generated within a single session), and _k is used to look up a continuation in that session. Depending on the current continuation, the appropriate components and content are rendered [1].

What is a continuation? Briefly, a continuation is a snapshot of the application state.

Well, what happens if we copy such a URL from one browser and then open it in another one? If we do it quickly (before the session expires), we will reach the same place in the application -- even if the second browser is launched on a different PC!

If our web application supports user accounts, we can even appear as a different user in the system without authentication. All we need is to obtain a generated URL with _s and _k from a logged-in user.

I have successfully reproduced this with Seaside 2.8 on this blog (actually, I do not know the exact version of Seaside shipped with GNU Smalltalk). Although I use cookies for authentication and check them every time in WASession>>start:, I was able to remove a post from a separate browser without logging in. Is it a bug or a feature? I think it is a feature of Seaside and a bug of the application :). My point is to move all the state into cookies and use only RESTful URLs for such actions. In that case we do not rely solely on continuations and can handle the situation fully.

Please correct me if I'm wrong.



Garbage collector, object dumper and an interesting situation

In the previous post [*] there was a great discussion between me and me (anonymous is also me).

I removed a comment that had some replies. The comment object was removed from a collection (and I assume that this collection was dumped to disk without this object), but since it was still referenced, it was not GC'ed. The GNU Smalltalk VM still holds it in memory, and the reply comments are rendered correctly (though they point to nowhere).

Ok, but what will happen if I restart the image? :) I assume that since each comment holds a reference to its "root" in the reply tree, the removed comment is still stored in the dumped file -- not as a member of the collection, however, but nested inside another comment (or comments).

I will pass the db to hexdump in order to get more info.

[*] Currently there is no "previous post": it was corrupted during development. My binary NoSQL persistence workaround is not very robust; well, I need to implement an SQL DB backend.

smalltalk (5)
netbsd (4)
howto (2)
kgdb (1)
seaside (1)
nginx (1)
seamonkey (1)
io (1)
qemu (1)
security (1)
haskell (1)
template haskell (1)
gc (1)
pkgsrc (1)
gdb (1)
vm (1)
amber (1)