You want to write a custom org backend? Let's write the onlybold backend together to get you started
Hi Emacsers,
Recently I've been playing with org-element and org-export.
Specifically, I was interested in the mechanism of the org exporter system and its flexibility.
The goal of this post is to get you started with the creation of org backends.
To do so, we build an org backend that:
keeps only bold elements,
surrounds bold elements with *** before and after,
surrounds paragraph elements with :: before and after,
surrounds section elements with <-- before and --> after (removing the last newline).
We call it onlybold.
Before we start, if you are interested, I recommend reading the following files in org-mode's source code:
testing/lisp/test-ox.el (AMAZING).
You can get org-mode's source code by running the following command:
git clone https://git.savannah.gnu.org/git/emacs/org-mode.git
Let's get started.
what we want to achieve
We want to export this org buffer:
I like *bold-1* and *bold-2* and you?
I don't. I prefer *bold-3*.

I've loved *bold-4* since I was a child.

I'm /italic/.
into another buffer like this:
<--::***bold-1*** ***bold-2*** ***bold-3***::
::***bold-4*** ::-->
org export mechanism
When org exports an org buffer, it basically does two things:
it parses the org buffer, producing a tree (a nested elisp list) that represents the org buffer, and
it recursively builds a string by traversing the tree and deciding what to do with each node by looking up the transcode function that the org backend associates with it.
This means that org does the hard work of "parsing" and "traversing" for us.
To build our onlybold org backend, or any other org backend, in the simplest case we just have to provide the transcode functions (or simply transcoders).
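To make the two steps concrete, here is a toy model of the "traverse and look up a transcoder" step. It is only a sketch and does not use org at all: the node shape (TYPE CHILD...), the function toy-export and the toy-* transcoders are made up for illustration, and real org trees also carry a property list for each node.

```elisp
;; Toy model only, not org's implementation.
;; A node is (TYPE CHILD...); a leaf is a plain string.

(defun toy-bold (_node contents)
  "Surround CONTENTS with three stars."
  (concat "***" contents "***"))

(defun toy-paragraph (_node contents)
  "Surround CONTENTS with two colons."
  (concat "::" contents "::"))

(defun toy-export (node transcoders)
  "Return NODE as a string, using the alist TRANSCODERS (TYPE . FUNCTION)."
  (if (stringp node)
      node                              ; leaves are kept verbatim
    (let ((fn (cdr (assq (car node) transcoders)))
          ;; Children are transcoded first, then concatenated.
          (contents (mapconcat (lambda (child)
                                 (toy-export child transcoders))
                               (cdr node) "")))
      ;; A node type without a transcoder is skipped.
      (if fn (funcall fn node contents) ""))))

(toy-export '(paragraph "I like " (bold "bold-1") " and " (italic "x") ".")
            '((paragraph . toy-paragraph)
              (bold . toy-bold)))
;; => "::I like ***bold-1*** and .::"
```

The italic node disappears because the alist has no entry for it, which mirrors what happens later in this post with the real exporter.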
transcoders, org-export-define-backend and org-export-to-buffer
The function org-export-define-backend takes as arguments:
the name of the backend we want to define and
an alist of transcoders.
A transcoder (or transcode function) is a function that handles an org element when it is being exported.
For instance, our backend onlybold must define a transcoder for bold elements that surrounds bold text with three stars (***), like this:
bold text -> ***bold text***
Most transcoders take three arguments:
the element as it appears in the parsed tree,
a contents string, which is the children of the element already "transcoded" and concatenated,
the communication channel, which contains all the information the export system needs to correctly export the document (the obvious ones are the title, date and author of the document, which can be defined inside the document using lines starting with #+TITLE:, #+DATE: or #+AUTHOR:).
Let's define onlybold-bold, the transcoder of bold elements:
(defun onlybold-bold (bold contents info)
  (concat "***" contents "***"))
Now we can define the first version of the onlybold backend, which transcodes only bold elements:
(org-export-define-backend 'onlybold
  '((bold . onlybold-bold)))
Then we define the command onlybold-export, which pops up the buffer *onlybold* containing the current buffer exported with the onlybold backend:
(defun onlybold-export ()
  (interactive)
  (org-export-to-buffer 'onlybold "*onlybold*"))
Now, if we call the command onlybold-export inside our org buffer, the buffer *onlybold* pops up with nothing in it.
We might be disappointed, but we aren't. This is totally normal.
In a given backend, when an element doesn't have a transcoder to handle it, the element is skipped. (In the same vein, if a transcoder returns nil for an element, the element is also skipped.)
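If you prefer to check this from code rather than from a pop-up buffer, org-export-string-as exports a string with a given backend and returns the result. The snippet below is a small sketch that repeats the first version of the backend so it can be evaluated on its own; the sample string is made up, and passing t as the third argument asks for the body only.

```elisp
(require 'ox)

(defun onlybold-bold (_bold contents _info)
  "Surround the CONTENTS of a bold element with three stars."
  (concat "***" contents "***"))

;; First version of the backend: only bold elements have a transcoder.
(org-export-define-backend 'onlybold
  '((bold . onlybold-bold)))

;; The section that holds the paragraph has no transcoder, so it is
;; skipped together with everything inside it, bold included.
;; The returned string should therefore be empty.
(org-export-string-as "I like *bold-1* and you?" 'onlybold t)
```

This makes the point of the paragraph above visible: the bold transcoder never gets a chance to contribute because its ancestors are dropped first.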
parsed tree, section elements and paragraph elements
In our org buffer, the bold elements belong to paragraphs that belong to a section. We can see this by looking at the parsed tree in the buffer *Pp Eval Output* after running the following command (from the org buffer):
M-x pp-eval-expression RET (org-element-parse-buffer)
We get the following tree (... stands for information that is not related to the shape of the tree):
(org-data
 nil
 (section
  (...)
  (paragraph
   (...)
   #("I like " ...)
   (bold
    (...)
    #("bold-1" ...))
   #("and " ...)
   (bold
    (...)
    #("bold-2" ...))
   #("and you?\nI don't. I prefer " ...)
   (bold
    (...)
    #("bold-3" ...))
   #(".\n" ...))
  (paragraph
   (...)
   #("I've loved " ...)
   (bold
    (...)
    #("bold-4" ...))
   #("since I was a child.\n" ...))
  (paragraph
   (...)
   #("I'm " ...)
   (italic
    (...)
    #("italic" ...))
   #("." ...))))
Indeed, bold elements belong to paragraph elements that belong to a section element.
And as we have just seen, if a backend doesn't provide a transcoder for an element, this element will be ignored in the exported result.
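The parent/child relationship can also be checked programmatically instead of reading the printed tree. The following is a minimal sketch: the helper name my/bold-parent-types and the sample text are made up for illustration, while org-element-map, org-element-type and the :parent property come from org-element.

```elisp
(require 'org)
(require 'org-element)

(defun my/bold-parent-types (org-text)
  "Return the type of the parent of each bold object found in ORG-TEXT."
  (with-temp-buffer
    (insert org-text)
    (let ((org-inhibit-startup t))
      (org-mode))
    ;; Visit every bold object of the parsed tree and collect
    ;; the type of the element that contains it.
    (org-element-map (org-element-parse-buffer) 'bold
      (lambda (bold)
        (org-element-type (org-element-property :parent bold))))))

(my/bold-parent-types
 "I like *bold-1* and *bold-2*.\n\nI prefer *bold-3*.\n")
;; => (paragraph paragraph paragraph)
```

The same :parent property is what the plain-text transcoder later in this post relies on.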
So let's write onlybold-section, the transcoder of section elements, which surrounds their contents with <-- and -->:
(defun onlybold-section (section contents info)
  (concat "<--" contents "-->"))
and onlybold-paragraph, the transcoder of paragraph elements, which surrounds their contents with :::
(defun onlybold-paragraph (paragraph contents info)
  (concat "::" contents "::"))
Then we modify the onlybold backend like this:
(org-export-define-backend 'onlybold
  '((bold . onlybold-bold)
    (section . onlybold-section)
    (paragraph . onlybold-paragraph)))
Now, if we call the command onlybold-export inside our org buffer, the buffer *onlybold* pops up with this content:
<--::I like ***bold-1*** and ***bold-2*** and you?
I don't. I prefer ***bold-3***.
::
::I've loved ***bold-4*** since I was a child.
::
::I'm .
::
-->
This is better:
the bold elements have been transcoded as we expected,
the "normal" text remains the same as in our org buffer and,
the italic element has been ignored (which was expected because we didn't provide a transcoder for italic elements).
only keep bold elements
plain-text elements are the leaves of the parsed tree and they are strings. This is the right level at which to operate in order to keep only bold elements.
So now, let's handle the plain-text elements and keep only bold elements.
There are at least two ways to do it:
using the filter system provided by the org export system (and so providing a filter that applies to plain-text elements) or,
providing a specific transcoder for plain-text elements.
We implement the latter.
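For comparison, here is a rough sketch of what the first option (a filter) could look like. It is not the approach followed in this post, and all the onlybold-f-* names and the backend name onlybold-filtered are made up for the example. It relies on two things from the export system: filters are registered through the :filters-alist keyword of org-export-define-backend, and a filter is called with three arguments (the string, the backend name and the communication channel). A filter that returns nil leaves the string unchanged, so the sketch returns the empty string to drop the text.

```elisp
(require 'ox)

(defun onlybold-f-bold (_bold contents _info)
  "Surround the CONTENTS of a bold element with three stars."
  (concat "***" contents "***"))

(defun onlybold-f-keep (_element contents _info)
  "Return CONTENTS unchanged, so the children of the element are kept."
  contents)

(defun onlybold-f-plain-text (text _backend _info)
  "Filter for plain text: keep TEXT only when its parent is a bold element."
  (if (eq 'bold (org-element-type (org-element-property :parent text)))
      text
    ;; Returning nil would mean \"no change\"; the empty string drops TEXT.
    ""))

(org-export-define-backend 'onlybold-filtered
  '((bold . onlybold-f-bold)
    (section . onlybold-f-keep)
    (paragraph . onlybold-f-keep))
  :filters-alist '((:filter-plain-text . onlybold-f-plain-text)))

;; Only the bold part should survive in the returned string.
(org-export-string-as "I like *bold-1* and /italic/." 'onlybold-filtered t)
```

A transcoder is arguably the simpler choice here because the decision lives next to the other transcoders of the backend, which is one reason to prefer the second option.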
Let's write the transcoder onlybold-plain-text, which checks whether the parent of the plain-text element (the string) is a bold element. If this is the case, we return the string, and if not we return nil:
(defun onlybold-plain-text (text info)
  (when (eq 'bold (org-element-type (org-element-property :parent text)))
    text))
Note that the arity (number of arguments) of onlybold-plain-text is different from that of the transcoders we've seen so far: it takes only the text and the communication channel, since a string has no contents.
Then we add it to the onlybold backend:
(org-export-define-backend 'onlybold
  '((bold . onlybold-bold)
    (section . onlybold-section)
    (paragraph . onlybold-paragraph)
    (plain-text . onlybold-plain-text)))
Now, if we call the command onlybold-export inside our org buffer, the buffer *onlybold* pops up with this content:
<--::***bold-1*** ***bold-2*** ***bold-3***::
::***bold-4*** ::
::::
-->
We have filtered the text to keep only bold elements.
remove empty paragraphs and the last newline of the section
Let's go further and remove the last empty paragraph.
To do so, we can "ask" the transcoder onlybold-paragraph to return nil when its contents are "empty", specifically when they are the empty string "" or a single newline "\n". Here is the new implementation:
(defun onlybold-paragraph (paragraph contents info)
  (if (member contents '("" "\n"))
      nil
    (concat "::" contents "::")))
Now, if we call the command onlybold-export inside our org buffer, the buffer *onlybold* pops up with this content:
<--::***bold-1*** ***bold-2*** ***bold-3***::
::***bold-4*** ::
-->
We are almost happy :)
Only one thing remains...
The end of the section, -->, alone on the last line is "quite ugly".
Let's put it just after the :: that closes the last paragraph.
We can do this by modifying onlybold-section and "asking" it to remove the last newline of its contents, which is matched by the regexp "\n\\'":
(defun onlybold-section (section contents info)
  (let ((cts (replace-regexp-in-string "\n\\'" "" contents)))
    (concat "<--" cts "-->")))
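To see what this regexp does in isolation: \\' matches only at the very end of the string, so "\n\\'" removes a single trailing newline and leaves every other newline alone. The helper name my/chomp-last-newline below is made up for this illustration.

```elisp
(defun my/chomp-last-newline (string)
  "Return STRING without its final newline, if it ends with one."
  (replace-regexp-in-string "\n\\'" "" string))

(my/chomp-last-newline "::a::\n::b::\n")  ; => "::a::\n::b::"
(my/chomp-last-newline "::a::\n\n")       ; => "::a::\n"
(my/chomp-last-newline "no newline")      ; => "no newline"
```

Only one newline is removed per call, so contents that end with several newlines keep all but the last one.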
Now, if we call the command onlybold-export inside our org buffer, the buffer *onlybold* pops up with this content:
<--::***bold-1*** ***bold-2*** ***bold-3***::
::***bold-4*** ::-->
We are done ;)
I hope that this toy example helps you get started with the creation of org backends.
acknowledgments
I want to take the opportunity of this post to thank:
Nicolas Goaziou, who is the author and maintainer of org-export-define-backend and org-element-at-point,
all the people who work on and contribute to org-mode (built-in and external packages),
all the people who work on and contribute to Emacs (built-in and external packages).
And I want to tell you that:
Each time a piece of your code is heavy, I know that:
this piece of code fixes a bug or,
this piece of code handles an edge case or,
this piece of code provides flexibility (via options) to the end user.
Each time your code is simple, I know that you worked hard to make it simple.
And most importantly, each time I read a piece of your code I feel closer to you.
Emacs is pure joy and it is thanks to you.