
Hi, I’m lthms.
I don’t like syntax highlighting, but I like types and functional programming languages. He/him.
Interested in starting a discussion? Don’t hesitate to shoot me an email.
soupault
We use soupault to build this website. (soupault is an awesome free software project, with a unique approach to static website generation. You should definitely check out their website!)
Installation §
We install soupault in a local switch. We use a witness file _opam/.init to determine whether or not our switch has already been created during a previous invocation of cleopatra.
CONFIGURE += _opam rss.json
ARTIFACTS += out
soupault-prebuild : _opam/init
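The witness-file idea does not depend on make. As a minimal sketch, assuming a hypothetical helper named run_once (it is not part of cleopatra or opam), the same pattern can be written in shell: the setup command only runs when the witness file is missing, and the witness is only created when the command succeeds. The echo command below is a stand-in for the real switch creation.

```shell
# Hypothetical helper, for illustration only: run a setup command once,
# and remember that it succeeded by creating a witness file.
run_once () {
  local witness="${1}"
  shift

  if [ ! -f "${witness}" ]; then
    # The witness is only created when the setup command succeeds.
    "$@" && mkdir -p "$(dirname "${witness}")" && touch "${witness}"
  fi
}

# The second call does nothing, because the witness already exists.
workdir="$(mktemp -d)"
run_once "${workdir}/_opam/.init" echo "creating the switch"
run_once "${workdir}/_opam/.init" echo "creating the switch"
```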
Using soupault is as simple as calling it, without any particular command-line arguments.
soupault-build : dependencies-prebuild style.min.css
	@cleopatra echo "Executing" "soupault"
	@soupault
We now describe our configuration file for soupault.
Configuration §
Global Settings §
The options of the [settings] section of a soupault configuration are often self-explanatory, and we do not spend too much time detailing them.
[settings]
strict = true
site_dir = "site"
build_dir = "out/~lthms"
doctype = "<!DOCTYPE html>"
clean_urls = false
generator_mode = true
complete_page_selector = "html"
default_content_selector = "main"
default_content_action = "append_child"
page_file_extensions = ["html"]
ignore_extensions = [
  "v", "vo", "vok", "vos", "glob",
  "html~", "org"
]
default_template_file = "templates/main.html"
pretty_print_html = false
Setting Page Title §
We use the “page title” widget to set the title of the webpage based on the first (and hopefully the only) <h1> tag of the page.
[widgets.page-title]
widget = "title"
selector = "h1"
default = "~lthms"
prepend = "~lthms: "
Acknowledging soupault §
When creating a new soupault project (using soupault --init), the default configuration file suggests advertising the use of soupault. Rather than hard-coding the version of soupault in use (which is error-prone), we determine it with the following script.
soupault --version | head -n 1 | tr -d '\n'
The configuration of the widget, initially provided by soupault, becomes less subject to obsolescence (that is, as long as soupault does not change the output of its --version option).
[widgets.generator-meta]
widget = "insert_html"
html = """<meta name="generator" content="soupault 4.2.0">"""
selector = "head"
Prefixing Internal URLs §
On the one hand, internal links can be absolute, meaning they start with a leading /, and therefore are relative to the website root. On the other hand, websites (especially static websites) can be placed in a larger context. For instance, my personal website lives inside the ~lthms directory of the soap.coffee domain. (In my experience in hosting webapps and websites, this set-up is way harder to get right than I initially expected.)
The purpose of this plugin is to rewrite internal URLs which are relative to the root, in order to properly prefix them.
From a high-level perspective, the plugin structure is the following.
First, we validate the widget configuration.
prefix_url = config["prefix_url"]

if not prefix_url then
  Plugin.fail("Missing mandatory field: `prefix_url'")
end

if not Regex.match(prefix_url, "^/(.*)") then
  prefix_url = "/" .. prefix_url
end

if not Regex.match(prefix_url, "(.*)/$") then
  prefix_url = prefix_url .. "/"
end
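To make the effect of this validation concrete, here is a minimal sketch of the same normalization transcribed to shell. The function name normalize_prefix is hypothetical and only used for illustration; the plugin itself is the Lua code above.

```shell
# Illustration only: ensure the prefix starts and ends with "/",
# like the Lua snippet above does.
normalize_prefix () {
  local prefix="${1}"

  case "${prefix}" in
    /*) ;;                        # already starts with "/"
    *)  prefix="/${prefix}" ;;
  esac

  case "${prefix}" in
    */) ;;                        # already ends with "/"
    *)  prefix="${prefix}/" ;;
  esac

  printf '%s\n' "${prefix}"
}

normalize_prefix '~lthms'     # prints /~lthms/
normalize_prefix '/~lthms/'   # prints /~lthms/
```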
Then, we propose a generic function to enumerate and rewrite tags which can have URLs as attribute values.
function prefix_urls (links, attr, prefix_url)
  index, link = next(links)

  while index do
    href = HTML.get_attribute(link, attr)

    if href then
      if Regex.match(href, "^/") then
        href = Regex.replace(href, "^/*", "")
        href = prefix_url .. href
      end

      HTML.set_attribute(link, attr, href)
    end

    index, link = next(links, index)
  end
end
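The rewriting rule applied to each attribute can be checked in isolation. The sketch below transcribes it to shell under a hypothetical name, prefix_href, and assumes the prefix has already been normalized. Like the Lua function, it strips every leading / before prepending the prefix, so a URL starting with // would be rewritten as well.

```shell
# Illustration only: rewrite a single URL the way prefix_urls does.
prefix_href () {
  local prefix="${1}"
  local href="${2}"

  case "${href}" in
    /*)
      # Drop every leading "/" before prepending the prefix.
      href="$(printf '%s' "${href}" | sed -e 's|^/*||')"
      printf '%s\n' "${prefix}${href}"
      ;;
    *)
      # Relative and external URLs are left untouched.
      printf '%s\n' "${href}"
      ;;
  esac
}

prefix_href '/~lthms/' '/posts/index.html'    # prints /~lthms/posts/index.html
prefix_href '/~lthms/' 'https://soap.coffee'  # prints https://soap.coffee
```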
Finally, we use this generic function for relevant tags.
prefix_urls(HTML.select(page, "a"), "href", prefix_url)
prefix_urls(HTML.select(page, "link"), "href", prefix_url)
prefix_urls(HTML.select(page, "img"), "src", prefix_url)
prefix_urls(HTML.select(page, "script"), "src", prefix_url)
prefix_urls(HTML.select(page, "use"), "href", prefix_url)
Again, configuring soupault to use this plugin is relatively straightforward.
[widgets.urls-rewriting]
widget = "urls-rewriting"
prefix_url = "~lthms"
after = "mark-external-urls"
Marking External Links §
function mark(name)
  return '<span class="icon"><svg><use href="/img/icons.svg#'
         .. name ..
         '"></use></svg></span>'
end

links = HTML.select(page, "a")

index, link = next(links)

while index do
  href = HTML.get_attribute(link, "href")

  if href then
    if Regex.match(href, "^https?://github.com") then
      icon = HTML.parse(mark("github"))
      HTML.append_child(link, icon)
    elseif Regex.match(href, "^https?://") then
      icon = HTML.parse(mark("external-link"))
      HTML.append_child(link, icon)
    end
  end

  index, link = next(links, index)
end
[widgets.mark-external-urls]
after = "generate-history"
widget = "external-urls"
Generating a Table of Contents §
The toc widget allows for generating a table of contents for HTML files which contain a node matching a given selector (in the case of this document, #generate-toc).
[widgets.table-of-contents]
widget = "toc"
selector = "#generate-toc"
action = "replace_content"
valid_html = true
min_level = 2
max_level = 3
numbered_list = false
heading_links = true
heading_link_text = " §"
heading_links_append = true
heading_link_class = "anchor-link"
[widgets.append-toc-title]
widget = "insert_html"
selector = "#generate-toc"
action = "prepend_child"
html = '<h2>Table of Contents</h2>'
after = "table-of-contents"
Generating Per-File Revisions Tables §
Users Instructions
This widget generates a so-called “revisions table” for the file whose name is contained in the DOM element of id history, based on its history. Paths should be relative to the directory from which you start the build process (typically, the root of your repository). The revisions table notably provides hyperlinks to a git webview for each commit.
For instance, considering the following HTML snippet
<div id="history">
  site/posts/FooBar.org
</div>
This plugin will replace the content of this <div> with the revisions table of site/posts/FooBar.org.
Customization
The base URL of the webview for the document you are currently reading is https://src.soap.coffee/soap.coffee/lthms.git.
The template used to generate the revision table is the following.
<details id="history">
  <summary>Revisions</summary>
  <p>
    This revisions table has been automatically generated
    from <a href="https://src.soap.coffee/soap.coffee/lthms.git">the
    <code>git</code> history of this website repository</a>, and the
    change descriptions may not always be as useful as they should.
  </p>
  <p>
    You can consult the source of this file in its current version
    <a href="https://src.soap.coffee/soap.coffee/lthms.git/tree/{{file}}">here</a>.
  </p>
  <table class="fullwidth">
  {{#history}}
  <tr>
    <td class="date"
        {{#created}}
        id="created-at"
        {{/created}}
        {{#modified}}
        id="modified-at"
        {{/modified}}
    >{{date}}</td>
    <td class="subject">{{subject}}</td>
    <td class="commit">
      <a href="https://src.soap.coffee/soap.coffee/lthms.git/commit/{{filename}}/?id={{hash}}">{{abbr_hash}}</a>
    </td>
  </tr>
  {{/history}}
  </table>
</details>
Implementation
We use the built-in preprocess_element widget to implement this plugin, which means we need a script which gets its input from the standard input, and echoes its output to the standard output.
[widgets.generate-history]
widget = "preprocess_element"
selector = "#history"
command = 'scripts/history.sh templates/history.html'
action = "replace_element"
This plugin proceeds as follows:
- Using an ad-hoc script, it generates a JSON containing for each revision
  - The subject, date, hash, and abbreviated hash of the related commit
  - The name of the file at the time of this commit
- This JSON is passed to a mustache engine (haskell-mustache) with a proper template
- The content of the selected DOM element is replaced with the output of haskell-mustache
This translates in Bash like this.
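function main () {
    local file="${1}"
    local template="${2}"
    local tmp_file

    tmp_file="$(mktemp)"
    # ${file} is left unquoted: word splitting strips the whitespace
    # which surrounds the path in the selected DOM element.
    generate_json ${file} > "${tmp_file}"
    haskell-mustache "${template}" "${tmp_file}"
    rm "${tmp_file}"
}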
Generating the expected JSON is therefore as simple as:
- Fetching the logs
- Reading 8 lines from the logs, and parsing the filename from the 6th line
- Outputting the JSON
We will use git to get the information we need. By default, git subcommands use a pager when their output is likely to be long. This typically includes git-log. To disable this behavior, git exposes the --no-pager option. Besides, we also need --follow and --stat to deal with file renaming. Without --follow, git-log stops when the file first appears in the repository, even if this “creation” is actually a renaming. Therefore, the git command line we use to collect our history is
function gitlog () {
    local file="${1}"

    git --no-pager log \
        --follow \
        --stat=10000 \
        --pretty=format:'%s%n%h%n%H%n%cs%n' \
        "${file}"
}
This function will generate a sequence of 8 lines containing all the relevant information we are looking for, for each commit, namely:
- Subject
- Abbreviated hash
- Full hash
- Date
- Empty line
- Change summary
- Shortlog
- Empty line
For instance, the gitlog function will output the following lines for the last commit of this very file:
Website reorg
05617fa
05617fad8255248ee8ac8796e40a99529e1c8e8c
2022-10-23

 site/{ => posts}/cleopatra/soupault.org | 3 +++
 1 file changed, 3 insertions(+)

Among other things, the 6th line contains the filename. We need to extract it, and we do that with sed. In case of file renaming, we need to parse something of the form both/to/{old => new}.
function parse_filename () {
    local line="${1}"
    local shrink='s/ *\(.*\) \+|.*/\1/'
    local unfold='s/\(.*\){\(.*\) => \(.*\)}/\1\3/'

    echo ${line} | sed -e "${shrink}" | sed -e "${unfold}"
}
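As a quick check of the two sed expressions, the sketch below repeats the body of parse_filename so that it runs on its own, and feeds it two change-summary lines: the renaming shown in the example output above, and a line without renaming whose path, site/posts/FooBar.org, is only illustrative. Note that \+ is a GNU sed extension, so this sketch assumes GNU sed.

```shell
# Same body as parse_filename above, repeated so that the snippet is
# self-contained. Requires GNU sed because of the \+ operator.
parse_filename () {
  local line="${1}"
  local shrink='s/ *\(.*\) \+|.*/\1/'
  local unfold='s/\(.*\){\(.*\) => \(.*\)}/\1\3/'

  echo ${line} | sed -e "${shrink}" | sed -e "${unfold}"
}

# No renaming: only the trailing statistics are removed.
parse_filename ' site/posts/FooBar.org | 10 +++++-----'
# prints site/posts/FooBar.org

# Renaming: the {old => new} part is replaced by the new name.
parse_filename ' site/{ => posts}/cleopatra/soupault.org | 3 +++'
# prints site/posts/cleopatra/soupault.org
```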
The next step is to process the logs to generate the expected JSON. We have to deal with the fact that JSON does not allow the last item of an array to be concluded by ",". Besides, we also want to indicate which commit is responsible for the creation of the file. To do that, we use two variables: idx and last_entry. When idx is equal to 0, we know it is the latest commit. When idx is equal to last_entry, we know we are looking at the oldest commit for that file.
function generate_json () {
    local input="${1}"
    local logs

    # Assign separately from `local', which would otherwise mask the
    # exit status of gitlog.
    logs="$(gitlog ${input})"

    if [ ! $? -eq 0 ]; then
        exit 1
    fi

    let "idx=0"
    let "last_entry=$(echo "${logs}" | wc -l) / 8"

    local subject=""
    local abbr_hash=""
    local hash=""
    local date=""
    local file=""
    local created="true"
    local modified="false"

    echo -n "{"
    echo -n "\"file\": \"${input}\""
    echo -n ",\"history\": ["

    while read -r subject; do
        read -r abbr_hash
        read -r hash
        read -r date
        read -r # empty line
        read -r file
        read -r # short log
        read -r # empty line

        # Every entry but the first one is preceded by a separator.
        if [ ${idx} -ne 0 ]; then
            echo -n ","
        fi

        # The oldest commit is the one which created the file.
        if [ ${idx} -eq ${last_entry} ]; then
            created="true"
            modified="false"
        else
            created="false"
            modified="true"
        fi

        output_json_entry "${subject}" \
                          "${abbr_hash}" \
                          "${hash}" \
                          "${date}" \
                          "$(parse_filename "${file}")" \
                          "${created}" \
                          "${modified}"

        let idx++
    done < <(echo "${logs}")

    echo -n "]}"
}
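The separator handling is easier to see in isolation. As a minimal sketch, the hypothetical function json_string_array below prints its arguments as a JSON array of strings, emitting the "," before every item but the first one, which is the same trick generate_json plays with idx.

```shell
# Illustration only: print the arguments as a JSON array of strings.
# The arguments are assumed not to require any JSON escaping.
json_string_array () {
  local idx=0
  local item

  printf '['
  for item in "$@"; do
    # The separator comes before every item except the first one,
    # so the last item is never followed by ",".
    if [ "${idx}" -ne 0 ]; then
      printf ','
    fi
    printf '"%s"' "${item}"
    idx=$((idx + 1))
  done
  printf ']\n'
}

json_string_array a b c   # prints ["a","b","c"]
json_string_array         # prints []
```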
Generating the JSON object for a given commit is as simple as
function output_json_entry () {
    local subject="${1}"
    local abbr_hash="${2}"
    local hash="${3}"
    local date="${4}"
    local file="${5}"
    local created="${6}"
    local modified="${7}"

    echo -n "{\"subject\": \"${subject}\""
    echo -n ",\"created\":${created}"
    echo -n ",\"modified\":${modified}"
    echo -n ",\"abbr_hash\":\"${abbr_hash}\""
    echo -n ",\"hash\":\"${hash}\""
    echo -n ",\"date\":\"${date}\""
    echo -n ",\"filename\":\"${file}\""
    echo -n "}"
}
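One caveat is that the function above inserts the commit subject verbatim, so a subject containing a double quote or a backslash would produce invalid JSON. A possible mitigation, sketched here under the hypothetical name json_escape and not part of the original script, is to escape these two characters with sed before printing. Control characters are not handled by this sketch.

```shell
# Illustration only: escape backslashes and double quotes so that the
# result can be placed between double quotes in a JSON document.
json_escape () {
  # Backslashes are escaped first, so that the ones introduced for the
  # double quotes are not escaped a second time.
  printf '%s\n' "${1}" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

json_escape 'Fix "history" widget'   # prints Fix \"history\" widget
```

With such a helper, the subject line could be emitted as "$(json_escape "${subject}")" instead of "${subject}".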
And we are done! We can safely call the main function to generate our revisions table.
main "$(cat)" "${1}"
Rendering Equations Offline §
Users instructions
Inline equations written in the DOM under the class imath and using the LaTeX syntax can be rendered once and for all by soupault. For instance, <span class="imath">\LaTeX</span> is rendered as expected.
Using this widget requires being able to inject raw HTML in input files.
Implementation
var katex = require("katex");
var fs = require("fs");

// Read the equation from the standard input.
var input = fs.readFileSync(0);
// Display mode is enabled as soon as DISPLAY is set in the environment.
var displayMode = process.env.DISPLAY != undefined;

var html = katex.renderToString(String.raw`${input}`, {
    throwOnError : false,
    displayMode : displayMode
});

console.log(html)
We reuse once again the preprocess_element widget. The selector is .imath (i stands for inline in this context), and we replace the previous content with the result of our script.
[widgets.inline-math]
widget = "preprocess_element"
selector = ".imath"
command = "node scripts/render-equations.js"
action = "replace_content"
[widgets.display-math]
widget = "preprocess_element"
selector = ".dmath"
command = "DISPLAY=1 node scripts/render-equations.js"
action = "replace_content"
RSS Feed §
[index]
index = true
dump_json = "rss.json"
extract_after_widgets = ["urls-rewriting"]
[index.fields]
title = { selector = ["h1"] }
modified-at = { selector = ["#modified-at"] }
created-at = { selector = ["#created-at"] }
Series Navigation §
function get_title_from_path (path)
  if Sys.is_file(path) then
    local content_raw = Sys.read_file(path)
    local content_dom = HTML.parse(content_raw)
    local title = HTML.select_one(content_dom, "h1")

    if title then
      return String.trim(HTML.inner_html(title))
    else
      Plugin.fail(path .. ' has no <h1> tag')
    end
  else
    Plugin.fail(path .. ' is not a file')
  end
end

function generate_nav_item_from_title (title, url, template)
  local env = {}
  env["url"] = url
  env["title"] = title
  local new_content = String.render_template(template, env)
  return HTML.parse(new_content)
end

function generate_nav_items (cwd, cls, template)
  local elements = HTML.select(page, cls)

  local i = 1
  while elements[i] do
    local element = elements[i]
    local url = HTML.strip_tags(element)
    local path = Sys.join_path(cwd, url)
    local title_str = get_title_from_path(path)

    HTML.replace_content(
      element,
      generate_nav_item_from_title(title_str, url, template)
    )

    i = i + 1
  end
end

cwd = Sys.dirname(page_file)

home_template = 'This article is part of the series “<a href="{{ url }}">{{ title }}</a>.”'
nav_template = '<a href="{{ url }}">{{ title }}</a>'

generate_nav_items(cwd, ".series", home_template)
generate_nav_items(cwd, ".series-prev", nav_template)
generate_nav_items(cwd, ".series-next", nav_template)
[widgets.series]
widget = "series"
Injecting Minified CSS §
style = HTML.select_one(page, "style")

if style then
  css = HTML.create_text(Sys.read_file("style.min.css"))
  HTML.replace_content(style, css)
end
[widgets.css]
widget = "css"
Cleaning-up §
function remove_if_empty(html)
  if String.trim(HTML.inner_html(html)) == "" then
    HTML.delete(html)
  end
end

function remove_all_if_empty(cls)
  local elements = HTML.select(page, cls)

  local i = 1
  while elements[i] do
    local element = elements[i]
    remove_if_empty(element)
    i = i + 1
  end
end

remove_all_if_empty("p") -- introduced by org-mode
remove_all_if_empty("div.code") -- introduced by coqdoc
[widgets.clean-up]
widget = "clean-up"