Managing my email with guile - interfacing to notmuch's C library
Creating the bindings for a sizable C library by hand is a lot of work and not very rewarding. I’m lazy and don’t want to do that, so I looked around for other options, and the solution I found surprised me.
In this series of posts I record how to use Guile as a scripting language and solve various tasks related to email work.
If my experience with Python has taught me anything, it is that interfacing programs between languages can be quite painful. I remember trying boost.python, then cython, and later hearing about pybind and python-cffi. With all those projects around, why is there no simple solution? Each one started well and then was painful the rest of the way. With Guile I didn’t search for long, and I was quickly blown away.
NYACC is a project that did what I wanted very quickly. It can read the C code and automagically generate all the bindings you need. I still experienced some difficulties, and I still spent a lot of time looking at the C code in notmuch and at its Python bindings for guidance, yet the overall experience was a lot nicer than what I remember from the Python world. My work was focused on designing a usable interface and thinking about how I want my implementation to work. The typing and generation of the bindings was done entirely by NYACC. What I like about it is that you don’t repeat yourself: you take the C header file and NYACC builds the bindings, because it understands the C code directly. In the many Python projects I have used, you must write a new copy of the interface, which you then need to maintain.
Creating the module
(define-ffi-module (ffi notmuch)
  #:library '("libnotmuch")
  #:include '("notmuch.h"))
That is all you need to start with. Write it to a file called ffi/notmuch.ffi inside your Guile load path. Then, as the NYACC documentation says, just execute:
guild compile-ffi ffi/notmuch.ffi
and you get the ffi/notmuch.scm file with ALL the bindings for what is declared in notmuch.h. It even provides wrappers/unwrappers between Guile and C types and their enums. I was really amazed at how well it works. For reasons I’m unaware of, you still need to call pointer->string when dealing with string pointers. Since this is written in the documentation, it might be a limitation or a design decision.
With all the bindings already implemented for you, the only thing left is to implement some adapters to interact with the library the way you like and not the way it was written to be used in the C world.
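Since pointer->string has to be called by hand, one of the first adapters worth having is a small conversion helper. The sketch below is only an illustration: pointer->string-or-false is a hypothetical name, not something NYACC or notmuch provides. It relies on null-pointer? and %null-pointer from (system foreign), and it assumes that a C function may hand back a NULL pointer, on which pointer->string would raise an error.

```scheme
(use-modules (system foreign))

;; Hypothetical helper (not part of NYACC or notmuch): convert a C string
;; pointer to a Scheme string, returning #f for a NULL pointer instead of
;; raising an error.
(define (pointer->string-or-false ptr)
  (if (null-pointer? ptr)
      #f
      (pointer->string ptr)))

;; Round trip through a C string, no notmuch involved.
(define c-str (string->pointer "inbox"))
(pointer->string-or-false c-str)          ; the Scheme string "inbox"
(pointer->string-or-false %null-pointer)  ; #f
```

The same helper could replace the bare pointer->string call in any adapter that reads strings from the library.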
Building the interface
Make sure that the generated file ffi/notmuch.scm is in your load path and import it. The workflow is now much easier, since all the bindings are already at your disposal. I can directly use the module to create my adapters to use notmuch in Guile.
Wrappers around the wrappers - the adapters
NYACC creates a binding for notmuch_database_open, which looks more complicated than what I presented in the previous post, but that is because it provides additional wrappers/unwrappers for the types. The same holds for all the other exposed functions.
NYACC also defines constructors for types. For example, make-notmuch_database_t* creates a pointer to that type, and I get it with a nice representation in the REPL, which is much nicer than what I had in the previous post with make-bytevector. My adapter to open the database is now much cleaner.
(use-modules (system foreign)
             (system ffi-help-rt) ;; functions from nyacc
             (ffi notmuch))       ;; the module just created

(define (open-database path mode)
  ;; nyacc provides the pointer "constructor"
  (let ((ffi-db (make-notmuch_database_t*)))
    (notmuch_database_open (string->pointer path)
                           mode
                           (pointer-to ffi-db))
    ffi-db))
Next I set up a query and, by default, exclude messages tagged deleted or spam. I should read those options from the notmuch-config, yet I don’t want to create that interface at the moment, so I just hard-code them here.
(define (query-db db str)
  (let ((query (notmuch_query_create db (string->pointer str))))
    (for-each
     (lambda (tag)
       (notmuch_query_add_tag_exclude query (string->pointer tag)))
     (list "deleted" "spam"))
    query))
To process the query I need to see the matching messages. For that I define result-messages and the extra utility function count-messages:
(define (result-messages query)
  (let ((messages (make-notmuch_messages_t*)))
    (notmuch_query_search_messages query (pointer-to messages))
    messages))

(define (count-messages query)
  (let ((counter (make-int32)))
    (notmuch_query_count_messages query (pointer-to counter))
    (fh-object-ref counter)))
Iterating over the messages
The previous functions allowed me to get the messages matching the query, yet I also need to process them, which means iterating over each message. Looping in Guile is done via recursion, and I use the named let to express recursion for an iterative process. Here I make heavy use of the C++ functions to iterate over the messages, very similar to how it is implemented in the C++ code. It gets annoying to differentiate between plural and singular in the function names, because there are functions for both messages and message. I’m guilty of this crime in my own software, yet with so many more prefixes and suffixes in the function names here, it was tougher on my eyes this time. The message iterator, with inline redundant explanations, is as follows:
(define (messages-iter query proc)
  ;; get all messages that match the query
  (let ((obj (result-messages query)))
    ;; This is the named let, LOOP is the procedure which accepts
    ;; as many arguments as there are bindings
    ;; ITEM are the individual messages, here I get the first one to initialize it
    ;; ACC is a list accumulating the results of the iteration
    (let loop ((item (notmuch_messages_get obj))
               (acc '()))
      ;; Terminate iteration if the obj, which is a pointer for the messageS
      ;; is not pointing to a valid message anymore.
      (if (= 0 (notmuch_messages_valid obj))
          (begin
            ;; Extremely important to clear memory of the messageS
            (notmuch_messages_destroy obj)
            ;; This is the return value, the list of results
            acc)
          (let ((result (proc item))) ;; I apply proc to a message
            ;; Extremely important to clear the memory of the message
            (notmuch_message_destroy item)
            ;; This moves the pointer of messageS to the next message
            (notmuch_messages_move_to_next obj)
            ;; Recursion in play, LOOP is called, it gets the next message
            ;; because the pointer was just moved and RESULT is placed at the
            ;; head of ACC, for the next iteration
            (loop (notmuch_messages_get obj)
                  (cons result acc)))))))
The power of Scheme is that I can abstract that iteration and pass a function to process the messages. In the C++ code I found all those pointer-manipulating functions being called all over the place, each time an iteration was needed.
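The named let pattern of the iterator can be tried on a plain Scheme list, without notmuch. This is a minimal sketch and collect-results is a hypothetical name. Because each result is consed onto the head of the accumulator, the results pile up in reverse visiting order, so this version calls reverse at the end, a step the message iterator does not take.

```scheme
;; Hypothetical helper: apply PROC to every element of ITEMS with a named let,
;; accumulating the results the same way the message iterator does.
(define (collect-results proc items)
  (let loop ((rest items) (acc '()))
    (if (null? rest)
        ;; ACC holds the results last-first, so restore the visiting order
        (reverse acc)
        (loop (cdr rest) (cons (proc (car rest)) acc)))))

(collect-results string-upcase '("date" "to" "from"))
;; => ("DATE" "TO" "FROM")
```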
A simple function to extract selected headers is again just an iteration, this time of notmuch_message_get_header with different arguments. I get the benefit of abstracting the behavior in a function of variable arity, with one argument for each header I want. It returns a new function that only takes a message as argument.
(define (get-headers . labels)
  (lambda (message)
    (map (lambda (label)
           (pointer->string
            (notmuch_message_get_header message (string->pointer label))))
         labels)))

;; Use it like this, where msg is a notmuch message pointer
;; ((get-headers "date" "to" "from") msg)

;; Use it with the iterator like this:
(let* ((db (open-database "/home/titan/.mail/" 0))
       (query (query-db db "discussions on some mailing list"))
       (result (messages-iter query (get-headers "date" "to" "from"))))
  ;; always clear memory
  (notmuch_query_destroy query)
  (notmuch_database_destroy db)
  ;; return the result
  result)
Tagging is again a procedure I apply to a message, so I only need to implement that function like I did with get-headers. In this case apply-tags-to-message returns a function that consumes the message and applies the desired tags, which are all given in the same string. This lets me reuse my configured tags from previous setups (tags : "+sent +project -inbox"). The next function is quite dense, as it needs to iterate over each tag that is going to be applied or removed.
(define (apply-tags-to-message tags)
  (lambda (message)
    (let loop ((rest (string-tokenize tags)))
      (unless (null? rest)
        (let ((tag (string->pointer (substring (car rest) 1))))
          (if (string-prefix? "-" (car rest))
              (notmuch_message_remove_tag message tag)
              (notmuch_message_add_tag message tag)))
        (loop (cdr rest))))))
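The string handling in that function can be checked on its own, without opening a database. The sketch below is only an illustration and parse-tag-string is a hypothetical helper. As in the function above, a token that does not start with - is treated as an addition, and its first character is dropped in either case.

```scheme
;; Hypothetical helper: split a tag string such as "+sent +project -inbox"
;; into a list of (operation . tag-name) pairs without touching notmuch.
(define (parse-tag-string tags)
  (map (lambda (token)
         (cons (if (string-prefix? "-" token) 'remove 'add)
               ;; drop the leading "+" or "-"
               (substring token 1)))
       ;; string-tokenize splits on whitespace by default
       (string-tokenize tags)))

(parse-tag-string "+sent +project -inbox")
;; => ((add . "sent") (add . "project") (remove . "inbox"))
```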
I use this function just as I showed in the previous section; however, I need to open the database in READ_WRITE mode, that is, I must pass the read-write mode (NOTMUCH_DATABASE_MODE_READ_WRITE, which is 1 in notmuch.h) to the open-database function. Keep in mind that the tagging function does not return anything meaningful, so the result of the iterator will be a list of unspecified values.
Deleting message files
This task has some new challenges. Notmuch has two functions to get the file names of a message, notmuch_message_get_filename and notmuch_message_get_filenames, singular and plural cases again. The first one gets a single filename for the message, and most of the time a message has exactly one corresponding file. However, when you interact with mailing lists you might end up with several copies of the same message, and the same happens if you manage multiple email accounts and receive the same message on several of them. For that reason there is the plural function. It returns an iterator over the filenames, which is processed just like the messages iterator.
The code is so similar that I will not write it here; take it as an exercise. In an upcoming post I’ll show my attempt at using macros to write an iterator that deals with both messages and filenames.
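As a hint for the exercise, the shared shape of both iterators can also be captured with a plain higher-order procedure instead of macros. This is a sketch under assumptions, not the macro version from the upcoming post: make-iterator and mock-iterator are hypothetical names, and the demonstration walks over a Scheme list standing in for the C iterator so that it runs without notmuch. Unlike the message iterator, it does not destroy the individual items.

```scheme
;; Hypothetical generic iterator: VALID?, GET, NEXT! and DESTROY! are the four
;; operations that differ between messages and filenames.
(define (make-iterator valid? get next! destroy!)
  (lambda (obj proc)
    (let loop ((acc '()))
      (if (valid? obj)
          (let ((result (proc (get obj))))
            (next! obj)
            (loop (cons result acc)))
          (begin
            (destroy! obj)
            (reverse acc))))))

;; Stand-in for a C iterator: a one-element vector holding the remaining items.
(define (mock-iterator items) (vector items))

(define iterate-mock
  (make-iterator
   (lambda (it) (pair? (vector-ref it 0)))                   ; valid?
   (lambda (it) (car (vector-ref it 0)))                     ; get
   (lambda (it) (vector-set! it 0 (cdr (vector-ref it 0))))  ; next!
   (lambda (it) (vector-set! it 0 '()))))                    ; destroy!

(iterate-mock (mock-iterator '("a.eml" "b.eml")) string-length)
;; => (5 5)

;; With the NYACC bindings it might be wired up like this (untested assumption):
;; (make-iterator (lambda (it) (not (= 0 (notmuch_filenames_valid it))))
;;                notmuch_filenames_get
;;                notmuch_filenames_move_to_next
;;                notmuch_filenames_destroy)
```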
As I said at the beginning of the post, NYACC is an amazing solution for creating bindings from Guile to C. It felt much more comfortable to use than anything I used in the Python world when I had to work with those tools some years ago.
After creating some basic adapters, extending my script to process my emails with notmuch was an absolute pleasure. I did run into problems like running out of memory and running out of file handles. I had forgotten about those old problems since I moved to the land of Python, yet when interfacing with C, you need to take that responsibility again, whether you are in Python or Guile.
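One way to make forgetting a destroy call harder is to tie acquisition and release together with dynamic-wind. The following is a minimal sketch with a mock resource: call-with-resource is a hypothetical helper, and whether it would have prevented the specific problems I hit is an assumption.

```scheme
;; Hypothetical helper: acquire a resource with OPEN, hand it to PROC and
;; always call CLOSE afterwards, even if PROC raises an error.
(define (call-with-resource open close proc)
  (let ((resource (open)))
    (dynamic-wind
      (lambda () #t)
      (lambda () (proc resource))
      (lambda () (close resource)))))

;; Demonstration with a mock resource that records whether it was released.
(define released '())

(call-with-resource
 (lambda () 'mock-db)                              ; open
 (lambda (db) (set! released (cons db released)))  ; close
 (lambda (db) (list db 'processed)))               ; proc
;; => (mock-db processed), and released is now (mock-db)

;; With the adapters from this post it might look like this (untested assumption):
;; (call-with-resource (lambda () (open-database "/path/to/mail" 0))
;;                     notmuch_database_destroy
;;                     (lambda (db) ...))
```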
I implemented enough features in this email processing script that I can already do everything afew was able to do. As such, I have already replaced it with my script. It is not necessarily better, yet it makes me proud to use my own stuff, designed the way I want to use it.