cleopatra: Bootstrapping an Extensible Toolchain

You are about to read the first version of cleopatra, the toolchain initially implemented to build the website you are reading. Since then, cleopatra has been completely rewritten as an independent, more generic command-line program. That being said, the build process described in this write-up remains the one implemented in cleopatra the Second.

A literate program is a particular type of software program where code is not directly written in source files, but rather in a text document as code snippets. In some sense, literate programming allows for writing both the software program and its technical documentation in the same place.

That being said, cleopatra is a toolchain to build a website before being a literate program, and one of its objectives is to be part of the very website it is used to generate. To achieve this, cleopatra has been written as a collection of org files which can be either “tangled” using Babel or “exported” as an HTML document. Tangling here refers to extracting marked code blocks into files.

The page you are currently reading is cleopatra’s entry point. Its primary purpose is to introduce two Makefiles: Makefile and bootstrap.mk.

Revisions

This revisions table has been automatically generated from the git history of this website repository, and the change descriptions may not always be as useful as they should.

You can consult the source of this file in its current version here.

2020-07-14 Prepare the introduction of a RSS feed c62a61d
2020-04-02 Refactor the build process to use cleopatra the Second 46b2e7a
2020-02-29 Continue to work on soupault configuration 9cc680e
2020-02-27 Theme reloading 1a9268f
2020-02-27 Make a generation process prebuild depends on its tangled file a105b5e
2020-02-26 Display source blocks names and tangle filenames in HTML output e6cd97f
2020-02-26 Improving the end of the Bootstrapping cleopatra document af723a5
2020-02-26 Introduce a notion of dependency between generation processes 5945bc8
2020-02-25 Allow cleopatra to create missing directories 767aa5b
2020-02-24 More tweaking of too long code lines 1ea740e
2020-02-23 Reduce the length of long lines of code in cleopatra 89e27fc
2020-02-23 Move plugin-specific SASS rules in Soupault.org b79765f
2020-02-23 Try to improve the situation with overflowing source blocks af208a5
2020-02-23 Yet another attempt to only init npm and Emacs when necessary a031c80
2020-02-23 Ignore build.log 69f9af2
2020-02-23 cleopatra is completely boostrapped 2869989
2020-02-23 Do not remove cleopatra files with 'make clean' 39fd0f0
2020-02-23 Provide a rule to initialize Emacs packages e5480d5
2020-02-23 Integrate the scripts and plugins used by soupault in Soupault.org 4bb8750
2020-02-23 Reworking cleopatra presentation 404d052
2020-02-23 Polish cleopatra aa6de8b
2020-02-23 Give up on clean URLs fbe5ef4
2020-02-23 First complete draft for the Root of Generation section a85c838
2020-02-22 Explain 'tangle-org.el' 7f8a29e
2020-02-22 Remove an orphan sentence 519b4db
2020-02-22 List current generation processes and document how to add one c8a860b
2020-02-22 Increase the width of the log file header generated by cleopatra 719d177
2020-02-22 Remove useless dependencies in the `build' rule of cleopatra f074109
2020-02-22 Integrate `update-gitignore.sh' inside cleopatra f6bec11
2020-02-22 Use `tangle-org.el' during bootstrap d50ee0c
2020-02-22 Make ~make~ to call itself with the `build` rule when none is given 54a467c
2020-02-22 Provide a generic and reliable way to extends cleopatra dae198a
2020-02-22 Initiate the redaction of Bootstrap.org eb53819
2020-02-21 Various improvement in cleopatra e792118
2020-02-20 Adopt a literate programming for `main.sass' b3a3096
2020-02-20 Make an heavy use of Makefile variables 36b9264
2020-02-20 Make cleopatra extensible 046606c
2020-02-19 Rework the Makefiles for a cleaner handling of generated scripts c87e51b
2020-02-19 Various improvement in the content generation process 06809e8
2020-02-19 Initiate a literate programming approach for the Makefile rules 2b78fd3
2020-02-18 Disable `custom-entry-overriden' warning of Coq 15a9b74
2020-02-16 Adopt a literate programming approach for the configuration 2dc4e8b
2020-02-15 Make the website 3rd-party free and improve loading performance d173e77
2020-02-14 Provide an index page for write-ups with more value 707ccc2
2020-02-05 Make the output of `make` cleaner b0d0016
2020-02-05 Fix link to locally defined terms in coqdoc output a1719f5
2020-02-05 Keep the list of html files to ignore up-to-date when building a69a2cc
2020-02-04 Initial commit with previous content and a minimal theme 9754a53

Makefile serves two purposes: it initializes a few global variables, and it provides a rule to generate bootstrap.mk. At this point, some readers may wonder why we need Makefile in this context; the motivation behind this choice is reminiscent of a boot sequence. The rationale is that we need a “starting point” for cleopatra. The toolchain cannot live solely inside org files, otherwise there would not be any code to execute the first time we tried to generate the website. We need an initial Makefile, one that has little chance to change, so that we can almost consider it read-only. Contrary to the other Makefiles that we will generate, this one will not be deleted by make clean.

This is similar to your computer: it requires a firmware to boot, whose purpose —in a nutshell— is to find and load an operating system.

Modifying the content of Makefile in this document will modify Makefile. This means one can easily put cleopatra into an inconsistent state, which would prevent further generation. This is why the generated Makefile should be versioned, so that you can restore it using git if you made a mistake when you modified it.

For readers interested in using cleopatra for their own websites, this document tries to highlight the modifications they would have to make.

1 Global Constants and Variables

First, Makefile defines several global “constants” (although as far as I know make does not support true constant values, it is expected that further generation processes will not modify them).

In a nutshell,

ROOT
Tell Emacs where the root of your website sources is, so that tangled output filenames can be given relative to it rather than to the org files. For instance, the tangle parameter of the BEGIN_SRC block for Makefile looks like :tangle Makefile, instead of :tangle ../../Makefile.
CLEODIR
Tell cleopatra where its sources live. If you place it inside the site/ directory (as it is intended), and you enable the use of org files to author your contents, then cleopatra documents will be part of your website. If you don’t want that, just move the directory outside the site/ directory, and update the CLEODIR variable accordingly.

For this website, these constants are defined as follows.

ROOT := $(shell pwd)
CLEODIR := site/cleopatra
Makefile

We then introduce two variables to list the output of the generation processes, with two purposes in mind: keeping the .gitignore up-to-date automatically, and providing rules to remove them.

ARTIFACTS
Short-term artifacts which can be removed frequently without too much hassle. They will be removed by make clean.
CONFIGURE
Long-term artifacts whose generation can be time consuming. They will only be removed by make cleanall.
ARTIFACTS := build.log
CONFIGURE :=
Makefile

Generation processes shall declare new build outputs using the += assignment operator. Using another operator will likely produce an undesirable result, since it would overwrite the outputs already declared by other generation processes.

2 Easy Tangling of Org Documents

cleopatra is a literate program implemented with Org mode, an Emacs major editing mode. We provide the necessary bits to easily tangle Org documents.

The configuration of Babel is done using an Emacs Lisp script called tangle-org.el, whose status is similar to that of Makefile. It is part of the bootstrap process, and therefore lives “outside” of cleopatra (it is not deleted by make clean, for instance). However, it is overwritten each time this document is tangled. If you modify it and find that cleopatra no longer works properly, you should restore it using git.

(require 'org)
(cd (getenv "ROOT"))
(setq org-confirm-babel-evaluate nil)
(setq org-src-preserve-indentation t)
(add-to-list 'org-babel-default-header-args
             '(:mkdirp . "yes"))
(org-babel-do-load-languages
 'org-babel-load-languages
 '((shell . t)))
(org-babel-tangle)
scripts/tangle-org.el

We define variables that ensure that the ROOT environment variable is set and tangle-org.el is loaded when using Emacs.

EMACSBIN := emacs
EMACS := ROOT="${ROOT}" ${EMACSBIN}
TANGLE := --batch \
          --load="${ROOT}/scripts/tangle-org.el" \
          2>> build.log
Makefile

Finally, we introduce a canned recipe to seamlessly tangle a given file.

define emacs-tangle =
echo "  tangle  $<"
${EMACS} $< ${TANGLE}
endef
Makefile
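For a concrete picture of what the canned recipe runs, here is a shell sketch of the equivalent command line. The function name tangle is an assumption made for illustration and is not part of cleopatra; EMACSBIN and ROOT mirror the Makefile variables of the same name, with defaults chosen for this sketch.

```shell
# tangle FILE: shell equivalent of the emacs-tangle canned recipe.
# EMACSBIN and ROOT mirror the Makefile variables; the defaults below
# (emacs, current directory) are assumptions made for this sketch.
tangle() {
    root=${ROOT:-$(pwd)}
    echo "  tangle  $1"
    # Same argument order as "${EMACS} $< ${TANGLE}": the Org file first,
    # then batch mode and the Babel configuration script. Whatever Emacs
    # writes on stderr is appended to build.log.
    ROOT="$root" "${EMACSBIN:-emacs}" "$1" --batch \
        --load="$root/scripts/tangle-org.el" \
        2>> build.log
}

# Example, from the root of the website sources:
#   tangle site/cleopatra/Bootstrap.org
```

Running the function by hand can help when debugging a tangling failure, since build.log then only contains the messages of a single Emacs run.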

3 Bootstrapping

The core purpose of Makefile remains to bootstrap the chain of generation processes. This chain is divided into three stages: prebuild, build, and postbuild.

This translates as follows in Makefile.

default : postbuild ignore

init :
        @rm -f build.log

prebuild : init

build : prebuild

postbuild : build

.PHONY : init prebuild build postbuild ignore
Makefile

A generation process in cleopatra is a Makefile which provides rules for these three stages, along with the utilities used by these rules. More precisely, a generation process proc is defined in proc.mk. The rules of proc.mk for each stage are expected to be prefixed by proc-, e.g., proc-prebuild for the prebuild stage.

Ultimately, the following dependencies are expected within the chain of generation processes.

prebuild : proc-prebuild
build : proc-build
postbuild : proc-postbuild

proc-build : proc-prebuild
proc-postbuild : proc-build

Because cleopatra is a literate program, generation processes are defined in Org documents (which may contain additional utilities like scripts or templates), and therefore need to be tangled before they can be used. cleopatra relies on a particular behavior of make regarding the include directive: if there exists a rule to generate a Makefile used as an operand of include, make will use this rule to update (if necessary) said Makefile before actually including it.
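This behavior of include can be observed in isolation. The following shell sketch assumes GNU make is available in PATH; the function name demo_include_remake, the file gen.mk, and the target hello are hypothetical names chosen for the demonstration.

```shell
# demo_include_remake: create a throwaway Makefile whose included file does
# not exist yet, then ask make for a target defined only in that file.
demo_include_remake() {
    dir=$(mktemp -d) || return 1
    cat > "$dir/Makefile" <<'EOF'
include gen.mk

# Rule creating the included file. The "target : ; recipe" form keeps this
# example free of literal tab characters.
gen.mk : ; @echo 'hello : ; @echo generated' > $@
EOF
    # gen.mk is missing, so make first runs the rule above, restarts, and
    # only then resolves "hello", which gen.mk defines. Older versions of
    # make warn about the missing file on stderr, hence the redirection.
    (cd "$dir" && make -s hello 2>/dev/null)
    status=$?
    rm -rf "$dir"
    return "$status"
}

# Example:
#   demo_include_remake
```

The same mechanism is what lets a generated proc.mk be tangled on demand the first time it is included.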

Therefore, rules of the following form achieve our ambition of extensibility.

include ${PROC}.mk

prebuild : ${PROC}-prebuild
build : ${PROC}-build
postbuild : ${PROC}-postbuild

${PROC}-prebuild : ${PROC}.mk ${AUX}
${PROC}-build : ${PROC}-prebuild
${PROC}-postbuild : ${PROC}-build

${PROC}.mk ${AUX} &:\
   ${CLEODIR}/${IN}
        @$(emacs-tangle)

CONFIGURE += ${PROC}.mk ${AUX}

.PHONY : ${PROC}-prebuild \
         ${PROC}-build \
         ${PROC}-postbuild

where

  • ${IN} is the Org document which contains the generation process code
  • ${PROC} is the name of the generation process
  • ${AUX} lists the utilities of the generation process tangled from ${IN} with ${PROC}.mk

We use &: in place of : to separate the targets from their dependencies in the “tangle rule.” This tells make that a single run of the recipe generates all these files (a “grouped targets” rule, available since GNU make 4.3).

Writing these rules manually, as yours truly had to do in the early days of his website, has proven to be error-prone.

One desirable feature for cleopatra would be to generate them automatically, by looking for relevant :tangle directives inside the input Org document. The challenge lies in the “relevant” part: there is a risk of false positives. However, as a first step towards a fully automated solution, we can leverage the evaluation features of Babel here.
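As a rough illustration of what such an automated lookup might involve, here is a minimal shell sketch that lists the :tangle targets declared on the #+BEGIN_SRC lines of an Org file. The function name list_tangle_targets is hypothetical and not part of cleopatra. The sketch ignores file-level header arguments (#+PROPERTY lines) and per-heading properties, so it can miss targets, and it cannot tell whether a matching line sits inside an example block, which is one possible source of false positives.

```shell
# list_tangle_targets FILE: print the distinct :tangle targets found on the
# "#+BEGIN_SRC" lines of FILE, one per line, sorted.
list_tangle_targets() {
    # Keep only the lines opening a source block (case-insensitive).
    grep -i '^[[:space:]]*#+begin_src' "$1" \
        | sed -n 's/.*:tangle[[:space:]]\{1,\}\([^[:space:]]\{1,\}\).*/\1/p' \
        | grep -v -x -e 'no' -e 'yes' \
        | sort -u
    # "no" disables tangling and "yes" derives the filename from the Org
    # file, so neither names an explicit target; both are filtered out.
}

# Example:
#   list_tangle_targets site/cleopatra/Bootstrap.org
```

Deciding which of the listed files belong in a given tangle rule would still be left to the author, which is why this only counts as a starting point.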

Here is a bash script which, given the proper variables, would generate the expected Makefile rule.

<<extends>> :=
cat <<EOF
include ${PROC}.mk

prebuild : ${PROC}-prebuild
build : ${PROC}-build
postbuild : ${PROC}-postbuild

${PROC}-prebuild : ${PROC}.mk ${AUX}
${PROC}-build : ${PROC}-prebuild
${PROC}-postbuild : ${PROC}-build

${PROC}.mk ${AUX} &:\\
   \${CLEODIR}/${IN}
        @\$(emacs-tangle)

CONFIGURE += ${PROC}.mk ${AUX}

.PHONY : ${PROC}-prebuild \\
         ${PROC}-build \\
         ${PROC}-postbuild
EOF

The previous source block is given a name (extends), and an explicit list of variables (IN, PROC, and AUX). Thanks to the noweb syntax of Babel, we can insert the result of the evaluation of extends inside another source block when the latter is tangled.
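The backslashes in extends separate two expansion times: ${PROC}, ${IN}, and ${AUX} are replaced by bash when the block is evaluated, while \${CLEODIR} and \\ reach the generated Makefile as ${CLEODIR} and a line continuation, to be interpreted later by make. The sketch below isolates the first two lines of the tangle rule to make this visible; the function name render_tangle_rule and its positional arguments are assumptions made for illustration.

```shell
# render_tangle_rule PROC IN AUX: print the first two lines of the tangle
# rule, using the same escaping as the extends block.
render_tangle_rule() {
    PROC=$1 IN=$2 AUX=$3
    # Unquoted here-document: ${PROC}, ${IN} and ${AUX} are expanded now,
    # "\\" becomes a single backslash and "\$" a literal dollar sign.
    cat <<EOF
${PROC}.mk ${AUX} &:\\
   \${CLEODIR}/${IN}
EOF
}

# Example:
#   render_tangle_rule bootstrap Bootstrap.org scripts/update-gitignore.sh
```

With these arguments, the output should match the first two lines of the tangle rule for bootstrap.mk.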

We derive the rule to tangle bootstrap.mk using extends, which gives us the following Makefile snippet.

include bootstrap.mk

prebuild : bootstrap-prebuild
build : bootstrap-build
postbuild : bootstrap-postbuild

bootstrap-prebuild : bootstrap.mk scripts/update-gitignore.sh
bootstrap-build : bootstrap-prebuild
bootstrap-postbuild : bootstrap-build

bootstrap.mk scripts/update-gitignore.sh &:\
   ${CLEODIR}/Bootstrap.org
        @$(emacs-tangle)

CONFIGURE += bootstrap.mk scripts/update-gitignore.sh

.PHONY : bootstrap-prebuild \
         bootstrap-build \
         bootstrap-postbuild

Makefile

Beware that, as a consequence, modifying the extends code block is as “dangerous” as modifying Makefile itself. Keep that in mind if you start hacking cleopatra!

Additional customizations of cleopatra will be part of bootstrap.mk, rather than Makefile.

4 Generation Processes

Using the extends noweb reference, cleopatra is easily extensible. In this section, we first detail the structure of a typical generation process. Then, we construct bootstrap.mk by enumerating the generation processes that are currently used to generate the website you are reading.

Each generation process shall

  1. Define proc-prebuild, proc-build, and proc-postbuild
  2. Declare dependencies between stages of generation processes
  3. Declare build outputs (see ARTIFACTS and CONFIGURE)

5 Wrapping-up

BEGIN_MARKER="# begin generated files"
END_MARKER="# end generated files"

# remove the previous list of generated files to ignore
sed -i -e "/${BEGIN_MARKER}/,/${END_MARKER}/d" .gitignore
# remove trailing empty lines
sed -i -e :a -e '/^\n*$/{$d;N;};/\n$/ba' .gitignore

# output the list of files to ignore
echo "" >> .gitignore
echo "${BEGIN_MARKER}" >> .gitignore
for f in "$@"; do
    echo "${f}" >> .gitignore
done
echo "${END_MARKER}" >> .gitignore
scripts/update-gitignore.sh
ignore :
        @echo "  update  gitignore"
        @scripts/update-gitignore.sh \
           ${ARTIFACTS} \
           ${CONFIGURE}

clean :
        @rm -rf ${ARTIFACTS}

cleanall : clean
        @rm -rf ${CONFIGURE}
bootstrap.mk