Guile macros - avoiding code duplication when interfacing to notmuch
When working on the interface to notmuch I stumbled upon the case where iterating over messages and iterating over filenames produced the same code structure, with only the called functions changing. I hate to see code duplication, so I went exploring for solutions available in Guile to solve this issue.
In this series of posts I record how to use Guile as a scripting language and solve various tasks related to email work.
I always hear that macros are the killer feature of Lisp languages: that in Lisp, macros are code-generating functions which do all kinds of magic tricks. This case of code duplication sounded like the right moment to test that claim.
I must say that this is the first time I have dealt with a language feature of this kind. It was quite a complicated topic to get my head around and make work. I noticed that the much acclaimed Scheme feature, hygienic macros, made my simple goals harder to achieve. I’m in no position to judge them. My software development experience has shown me that the quality of a language is not measured by how easy it is to do big things, but rather by how hard it makes it to do the wrong things. Maybe the solution I propose next is just a bad way of doing things, and that is why it was hard to get done. I’ll revisit this when I have more experience; after all, macros are the killer feature.
Looking for the common code
In the last post I showed how to iterate over messages; here is my solution
to iterate over filenames. Notice the similarities: the structure is the
same and the functions called are named similarly, because they do the same
thing and only differ in the type they act on, messages there and
filenames here. A fundamental difference is that here there is no function to
destroy the filename string pointer. I really hope that one gets garbage
collected, because I found no reference on freeing that pointer. It also
means that, when comparing the iterators, the messages version has an extra
call to destroy each item.
(define (filenames-iter message proc)
  (let ((obj (notmuch_message_get_filenames message)))
    (let loop ((item (notmuch_filenames_get obj))
               (acc (quote ())))
      (if (= 0 (notmuch_filenames_valid obj))
          (begin
            (notmuch_filenames_destroy obj)
            acc)
          (let ((result (proc item)))
            ;; notice that here there is no call to clear
            ;; the item memory with a function notmuch_filename_destroy
            (notmuch_filenames_move_to_next obj)
            (loop (notmuch_filenames_get obj) (cons result acc)))))))
The outer let, where I define the main object I’m iterating over, is
actually an afterthought. I used to pass that object as a function
argument. Yet when I wrote the macro I realized that the macro would paste
the expression that creates the object at every place it was used, so the
object was instantiated again and again; I ran out of memory and my
computer crashed. Thus I keep that structure in these early examples,
so that you can later recognize it in the macro. Macros are a new
programming discipline for me; I might need this reminder later in life.
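To make that pitfall concrete, here is a minimal sketch with plain Guile values. The names created, make-thing, pair-unbound and pair-bound are made up for this illustration, and make-thing only stands in for an expensive constructor such as a notmuch query. A template that mentions its argument twice evaluates it twice, while binding it once with let evaluates it once.

```scheme
;; Hypothetical stand-in for an expensive constructor (e.g. a notmuch query).
(define created 0)
(define (make-thing)
  (set! created (+ created 1))
  created)

;; The template mentions expr twice, so the expression runs twice per use.
(define-syntax pair-unbound
  (syntax-rules ()
    ((_ expr) (cons expr expr))))

;; Binding expr once with let means the constructor runs a single time.
(define-syntax pair-bound
  (syntax-rules ()
    ((_ expr) (let ((obj expr)) (cons obj obj)))))

(define before created)
(pair-unbound (make-thing))
(define unbound-calls (- created before))  ; make-thing ran 2 times

(define middle created)
(pair-bound (make-thing))
(define bound-calls (- created middle))    ; make-thing ran 1 time
```

With a real query object behind the expression, each extra evaluation allocates another object, which is presumably how the memory ran out.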
From string to symbol
I know how to edit strings; that is a common task in software. What I learned for the first time with Guile was how to transform a string into a symbol with string->symbol.
It couldn’t be simpler: a dedicated function does the job. It is
important to realize that the symbol can’t do anything on its own. You
can’t put it as the first element of a list and expect Guile to
find the function it names and call it. In my first
iteration of working with these code blocks I didn’t use macros, only
functions, and used eval to get the procedure I wanted, like this:
((eval (string->symbol "notmuch_filenames_get") (interaction-environment)) filenames-iterator)
I had to evaluate the symbol I just created in the current
interaction environment for it to become the procedure I wanted to call, and only
then could I apply it to filenames-iterator to get the current filename.
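The same two steps can be tried without notmuch. This is a minimal sketch that assumes only core Guile procedures; string-upcase stands in for a libnotmuch binding, and the variable names are made up for the example.

```scheme
;; string->symbol only builds a name; nothing is called yet.
(define name (string->symbol (string-append "string-" "upcase")))

;; eval looks the symbol up in the given environment and
;; returns the procedure bound to it.
(define proc (eval name (interaction-environment)))

;; Only now can the procedure be applied to an argument.
(define result (proc "notmuch"))  ; "NOTMUCH"
```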
The most common macros I read about are fine,
but they seem to show too little of the power of a macro to
rewrite code. They are too small a change to your writing conventions,
and once you read them they don’t make much difference.
The threading macros were more striking to me. I realized when writing Lisp code that it is nice to have a function, test that it works, then wrap it in another one and so on, composing them as the way to process the data. However, when I came back and read that code, I had to read back and forth many times, from the root to its leaves and back. It does have some benefits, because the extra time forces you to engage a bit more with the context of the code you are reading. However, many times I was just looking for where the functions start and end, and I was more confused than enlightened.
Threading macros let you convert nested function calls into a flat list of
function calls and thereby improve readability. The next macro is
->; it places each element as the
first argument of the next element (which would be a function call). Try to
follow that rewriting in the next code block.
The value gets inserted as
the first argument in a function call that is defined by the next element,
that is, the function named fun. If fun takes more than
one argument, I write a list, and the macro injects value as the
first argument. If fun takes only one argument, I write the symbol
directly, and the macro takes care of placing value in the
parentheses. Then the macro calls itself recursively; that is how the
nesting is recreated.
(define-syntax ->
  (syntax-rules ()
    ((_ value) value)
    ((_ value (fun . other-args) next ...)
     (-> (fun value . other-args) next ...))
    ((_ value fun next ...)
     (-> (fun value) next ...))))
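A quick way to check the rewriting is to compare a nested expression with its threaded form on plain values. This is only a sketch: the macro is repeated so the snippet runs on its own, and the names nested and threaded are made up for the comparison.

```scheme
;; Same threading macro as above, repeated to keep the snippet standalone.
(define-syntax ->
  (syntax-rules ()
    ((_ value) value)
    ((_ value (fun . other-args) next ...)
     (-> (fun value . other-args) next ...))
    ((_ value fun next ...)
     (-> (fun value) next ...))))

;; Nested form, read from the inside out.
(define nested
  (string-append (number->string (- 10 3)) "!"))

;; Threaded form, read top to bottom; it expands to the nested form.
(define threaded
  (-> 10
      (- 3)              ; becomes (- 10 3)
      number->string     ; becomes (number->string (- 10 3))
      (string-append "!")))
```

Both definitions evaluate to the string "7!".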
Here is an example of how I would use this macro. syntax->datum
takes a syntax element (kind of a symbol with information about its
environment of evaluation) and turns it into a datum, which is the symbol.
That symbol becomes a string, from which I drop the last character, which
should be an s, as that is the regular plural of nouns. At least that
works well enough for the names I need here.
(define (singular stx)
  (-> stx
      syntax->datum
      symbol->string
      (string-drop-right 1)))
My notmuch iterator macro
I went over the Guile documentation on macros many times and I wasn’t able to achieve what I wanted. I missed having more examples and explanations. Because of macro hygiene, I couldn’t just have my macro write symbols directly; I always needed to include the lexical context.
I started exploring NYACC’s code (after all, it is a tool that generates
Scheme code) and found enough inspiration for what I wanted. The next
function, nm-symbols, takes the lexical context from the identifier tmpl-id
and a variable number of arguments that are composed together as
strings. The return value is a syntax object naming the function I
want, carrying the lexical context of tmpl-id. This is how I compose
symbols with the names I want so that they resolve to the functions
in libnotmuch. I needed to wrap it in eval-when for things to
work inside the macro presented later on.
(eval-when (expand load eval)
  (define (nm-symbols tmpl-id . args)
    (define (stx->str stx)
      (symbol->string (syntax->datum stx)))
    (datum->syntax
     tmpl-id
     (string->symbol
      (apply string-append
             (map (lambda (ss) (if (string? ss) ss (stx->str ss)))
                  args)))))
  ;; the function singular defined before needs to be here
  )
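The core of that trick, building a name from strings and handing it the lexical context of an identifier from the call site, can be sketched in a smaller macro. Everything named demo here is hypothetical and only stands in for the libnotmuch bindings; the symbol construction is inlined, so this sketch needs no eval-when helper.

```scheme
;; Hypothetical procedures standing in for the libnotmuch bindings.
(define (demo_messages_count) 3)
(define (demo_filenames_count) 5)

(define-syntax demo-count
  (lambda (x)
    (syntax-case x ()
      ((_ type)
       (with-syntax
           ((fn (datum->syntax
                 #'type  ; borrow the lexical context of the call site
                 (string->symbol
                  (string-append
                   "demo_"
                   (symbol->string (syntax->datum #'type))
                   "_count")))))
         ;; fn is now an identifier such as demo_messages_count
         #'(fn))))))

(define message-count (demo-count messages))    ; calls demo_messages_count
(define filename-count (demo-count filenames))  ; calls demo_filenames_count
```

Here message-count is 3 and filename-count is 5, because each generated identifier resolves to the matching procedure defined at the top.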
That is all the setup I need. I can now write my iterator macro. Notice how
I construct the symbols I want with nm-symbols: it uses #'type to get
the lexical context, and type is the symbol I use to refer to the kind of
notmuch object being iterated (messages or filenames).
#' is reader shorthand that turns the symbol into a syntax object; only
from there can I extract the lexical context.
After defining the syntax elements, I directly write my iterator using them. I would have loved to find a way not to emit the destructor call (which frees memory) when the symbol is not defined, but for now I check for the definition at runtime and call the destructor only if the check succeeds.
(define-syntax nm-iter
  (lambda (x)
    (syntax-case x ()
      ((_ type query proc)
       (with-syntax ((valid? (nm-symbols #'type "notmuch_" #'type "_valid"))
                     (destroy (nm-symbols #'type "notmuch_" #'type "_destroy"))
                     (get (nm-symbols #'type "notmuch_" #'type "_get"))
                     (next (nm-symbols #'type "notmuch_" #'type "_move_to_next"))
                     (item-destroy (nm-symbols #'type "notmuch_" (singular #'type) "_destroy")))
         #'(let ((obj query))
             (let loop ((item (get obj))
                        (acc '()))
               (if (= 0 (valid? obj))
                   (begin
                     (destroy obj)
                     acc)
                   (let ((result (proc item)))
                     (when (defined? (quote item-destroy))
                       (item-destroy item))
                     (next obj)
                     (loop (get obj) (cons result acc)))))))))))
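The runtime check on the destructor can be looked at in isolation. The following is a sketch under assumptions: demo_message_destroy, freed and destroy-if-defined are made-up names, and the lookup goes through module-ref on the current module, which is a different route from the direct call the macro emits.

```scheme
;; Hypothetical destructor; only the "message" flavour exists here.
(define freed '())
(define (demo_message_destroy item)
  (set! freed (cons item freed)))

;; Call the destructor named by sym only when the current module binds it.
(define (destroy-if-defined sym item)
  (when (defined? sym)
    ((module-ref (current-module) sym) item)))

(destroy-if-defined 'demo_message_destroy "msg-1")    ; recorded in freed
(destroy-if-defined 'demo_filename_destroy "file-1")  ; skipped, no such binding
```

Since defined? consults whichever module is current when the code runs, this style of check probably behaves differently once the code lives in its own module and is called from another one.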
The way I use this macro for messages is:
(nm-iter messages (result-messages query) (get-headers "from" "subject"))
and for filenames is:
(nm-iter filenames (notmuch_message_get_filenames message*) pointer->string)
This solution took me a lot of time to research. It was hard and painful to get it to work. At the time of writing, I’m still not sure I understand all the elements that I ended up using, but it was a fun experience to get it to work. During the process of learning and trying, I ran out of file handles and also completely filled up my memory and crashed my system. All the good experiences that come from breaking things while learning.
The end result, from a software perspective, is worse than where I started. The solution takes more lines of code. The level of nesting of the functions and macros used is deeper. The overall readability, and thus maintainability, dropped. I’m also including implementation details in the macro (the part where I check for the symbol definition), which doesn’t feel right.
The good thing is that I managed to get something new working. Despite
the additional code, there is less code duplication, and should I need an
iterator over threads I would get it for free instead of duplicating the
loop once more.