# LuaTeX Comes of Age (2017)

LWN.net needs you!

Without subscribers, LWN would simply not exist. Please consider
signing up for a subscription and helping
to keep LWN publishing

August 22, 2017

TeX has been the tool of
choice for the preparation of papers and documents for mathematicians,
physicists, and other authors of technical material for many
years. Although it takes some effort to learn how to use this venerable
work of free software, its devotees become addicted to its ability to
produce publication-quality manuscripts from a plain-text,
version-control-friendly format.

Most TeX users use LaTeX,
which is a set of commands and macros built on
top of TeX that automate cross-referencing, indexing, and other aspects of
document creation. TeX, LaTeX, a host of associated utilities, fonts, and related
programs are assembled into a large package called TeX Live. It’s available through
the package managers of many Linux distributions, but to get an up-to-date
version you may need to download it from its maintainers directly.

The 2017 version of TeX Live was recently released. As usual, this new
release contains no big surprises and should create no compatibility
issues. All of those who use TeX in their daily work will eventually get
around to starting the big download before bedtime to reap the benefits of
dozens of incremental improvements. Those who read the release notes
may encounter one notable tidbit, however: the version of LuaTeX,
a key component of TeX Live, is now numbered 1.0.4.

LuaTeX is a project with several components and goals, all of which take
TeX in new directions. This project modernizes font handling, using Unicode
for input and output, including for math. It allows you to use any OpenType or
TrueType font on your system, and to select typefaces, styles, variants, and
font features easily and flexibly. LuaTeX embeds the Lua scripting language
into TeX, allowing authors a new level of power and control. Previously,
authors who needed to bend TeX to do non-standard or unusual typesetting
tasks were obliged to program in the TeX language itself, which is an arcane and
specialized skill. With LuaTeX, authors can accomplish these tasks by
writing scripts in a language with a more familiar syntax.

As recently as just a few years ago, however, documentation writers were
warning against using the still-evolving LuaTeX in critical work. For
example, there is a cautionary note in “A guide to
LuaLaTeX” [PDF], which is one of the few available introductions to the
project.

LuaTeX is now out of beta, however, which means that the time for
such caution has passed. It has become “functionally complete”; most of the
functionality and interfaces are considered to be stable.
You can now undertake large LuaTeX projects without worrying about your
document breaking with the next upgrade. A “large LuaTeX project”
in this context means not just a long document that you happen to process
using LuaTeX, but one that makes extensive use of the Lua scripting that
the project makes possible.

In fact, LuaTeX has been the generally preferred TeX implementation for
some time. It has become an inseparable part of ConTeXt, which is
a large
project that is a popular alternative to LaTeX for publishing all kinds of
documents, especially books and pamphlets. LuaTeX passing version 1 means
that its official status has caught up to its
de facto status as the center of development in the TeX world.

#### Terminology

As terms in the TeX world can become confusing, this brief aside may be
called for. The engine is the command that you type to compile
your document. The name of the engine determines, among other things, what
format will be used, and what kind of output file TeX will
create. There are two choices of output file: the original DVI file and the
generally more popular and useful PDF. There are a handful of formats, such
as Plain TeX and LaTeX, that are large sets of macros that alter the
behavior of the TeX typesetting program that underlies everything, and that
expose various settings and commands to the author.

For example, the engine `pdftex` will use the Plain TeX
format to create a PDF; the `latex` command will use the LaTeX
format to create a DVI; and `lualatex` will use the LaTeX format
and create a PDF by running a version of the `pdftex` program
that has been rewritten in C and can interoperate with the Lua scripting
language.

Most of the terms with weird mixtures of case refer to formats (LaTeX)
or projects such as LuaTeX.

#### Why LuaTeX?

LuaTeX has several advantages over the other varieties of
TeX, and a couple of possible disadvantages. There are some, usually older,
LaTeX packages that are incompatible with LuaTeX; if you depend on one of
these, then you may want to stick with XeLaTeX (a part of XeTeX), which is the Unicode-aware
LaTeX-like project, or even the older pdfLaTeX, if necessary. You may,
however, want to consider either making the jump to ConTeXt, which can
duplicate the capabilities of many old LaTeX packages, or trying to
reproduce the effects you want with some Lua scripting.

Another disadvantage is that, for some documents, `luatex` can be
slower than the other engines. According to the Introduction of the LuaTeX
Reference [230-page PDF], when using Plain TeX, for example,
`pdftex` will almost always be faster than `luatex`,
but complex documents, especially using the recent incarnation of ConTeXt,
often run faster with `luatex`. In any case, the difference is
typically about a factor of two or so.

One of the delightful advantages of LuaTeX is its simple and powerful
font handling, which I demonstrated in a previous article about recent TeX
developments. In addition to the features introduced in that article, I
should mention that LuaTeX has highly developed support for directional
typesetting, which means it can handle languages that go from right to left.
It also embeds the MetaPost library (MPlib), which brings an
efficient graphical subsystem into the TeX engine.

LuaTeX has a handful of commands to suppress different kinds of error
messages. Some of these don’t merely turn off the messages, but cause
things to be permitted that were previously forbidden. For example, you can
tell TeX that it’s OK to have a paragraph break inside an equation, which
frees you up to format your math input more flexibly.

There are several improvements in math typesetting related to the
possibility of using wide glyphs in equations. The distance between
equations and their numbers can now be controlled. There is a new type of
leader, whose alignment is based on the largest enclosing box (rather than
the smallest). You can set the minimum word length in which hyphenation
will be allowed.

Aside from font handling, these are all minor enhancements. The defining
feature of LuaTeX is its embedding of the Lua scripting language, and
everything that enables.

#### Lua

Lua is a deceptively simple, but sophisticated, modern scripting
language. Some of Lua’s unusual, or unusually nice, features are: a single data
structure (the “table”); functions that can return multiple
results; proper tail calls; lexical scoping (closures, etc.); built-in
coroutines; and “metatables” and metamethods.
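
As a rough illustration of a few of these features, here is a small standalone Lua sketch that can be run with the plain `lua` interpreter. The names `minmax`, `counter`, and `Vec` are invented for this example, and the functions are left global only to keep the sketch short.

```lua
-- Tables serve as both arrays and records.
local primes = { 2, 3, 5, 7 }

-- A function can return several results at once.
function minmax(list)
  local lo, hi = list[1], list[1]
  for _, v in ipairs(list) do
    if v < lo then lo = v end
    if v > hi then hi = v end
  end
  return lo, hi
end

-- Closures: each counter keeps its own private state in n.
function counter()
  local n = 0
  return function()
    n = n + 1
    return n
  end
end

-- Metatables let a table define the behavior of operators such as "+".
Vec = {}
Vec.__index = Vec
Vec.__add = function(a, b)
  return Vec.new(a.x + b.x, a.y + b.y)
end
function Vec.new(x, y)
  return setmetatable({ x = x, y = y }, Vec)
end

local lo, hi = minmax(primes)
local next_id = counter()
next_id()
local sum = Vec.new(3, 4) + Vec.new(1, 1)
print(lo, hi, next_id(), sum.x, sum.y)
```

The single `print` call shows the two results of `minmax`, the second value produced by the counter, and the components of the vector sum.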

The features that make Lua so popular as an embedded scripting language,
however, are its small footprint
(10,000 lines of C and a compiled, static library of 500K)
and its design from the ground up as a language for embedding in C and C++
programs. Its compact bytecode interpreter and incremental garbage collection make for good
performance in both time and space; a JIT compiler is available through the
separate LuaJIT project. The tradeoff is its relatively spare
standard library, but this is supplemented by a healthy ecosystem of official and
contributed packages.

Lua is widely used, for example, by game developers,
who can program the core game behavior in C while defining the game play
logic in Lua. You can get the language up and running on any system with an
ANSI C compiler; there are no extra dependencies. Many distributions also
provide packages and, finally, there are binaries provided for a variety of operating
systems. Lua has an interactive mode that you can start by typing
`lua` in the terminal. This of course helps in learning the
language through experimentation but, unlike Python and Lisp, typing an
expression won’t print its result in Lua versions before 5.3; you need to call the
`print` function (or prefix the expression with `=`).

Fortunately, the application of Lua within TeX documents rarely requires
anything beyond basic knowledge of the language. Those who wish a
systematic introduction may want to study the book Programming in
Lua; all of the editions of the book, covering recent versions of the
language, are listed on the Lua website,
including an early edition available free of charge.

#### Lua in LaTeX

By simply including the line `\usepackage{luacode}` in your
LaTeX document’s preamble, and processing it with the `lualatex`
command, you can mix Lua in with your LaTeX. There are two major ways of
incorporating Lua into your document; the first is to insert the results of
a Lua calculation into the TeX token stream, by using
`tex.print` (or `tex.sprint`, which inserts the output
inline, avoiding a line break) instead of the normal Lua `print`
function. This makes it possible, for example, to use Lua to compute a
numerical table within a LaTeX
document that formats the table, or within another one that graphs it.

Below is
another example that uses Lua to calculate a fancy paragraph shape and
construct the TeX command for defining it. The `\parshape`
command in LaTeX or Plain TeX lets you define any shape for the current
paragraph, but it’s verbose and cumbersome, because you need to type in a
list of line lengths and indents, following the number of lines that you
want it to apply to. Wouldn’t it be nice to be able to simply say that you
would like a paragraph to have the shape of a particular mathematical
expression? Here is a little LaTeX document that typesets the beginning of
a great American novel in the wavy shape of the cos²
function:

```
\documentclass{article}
\usepackage{luacode}
\usepackage{fontspec}
\begin{document}
\setlength{\parindent}{0pt}
\pagestyle{empty}
\begin{luacode*}
function mpshape()
   local shape = ""
   local n = 23    -- number of lines to shape
   local bl = 10   -- unaltered line length, in cm
   for i = 1, n do
      -- indent follows 2*cos^2; the line is shortened on both sides
      local indent = string.format("%4f", 2*math.cos(i*2*math.pi/n)^2)
      local length = string.format("%4f", bl - 2*indent)
      shape = shape.." "..indent.."cm "..length.."cm"
   end
   -- the doubled backslash is a Lua escape for a single backslash
   tex.sprint("\\parshape= "..n.." "..shape)
end
\end{luacode*}

\luadirect{mpshape()}
Call me Ishmael. Some years ago—never mind how long precisely—having [text deleted]

\end{document}
```

Here is the output:

A few things here need a little explanation. We need to include the
fontspec package to enable Unicode input: this should be a standard part of
your preamble if you are using LuaTeX, in which case you probably have a
handful of additional fontspec commands to set up your fonts (the em-dashes
in the input are the only Unicode here). The `luacode*`
environment passes its contents directly to Lua, and frees you from
worrying about escaping special TeX characters such as backslashes. This is
explained in the documentation
for the `luacode` LaTeX package. Everything
between the lines `\begin{luacode*}` and
`\end{luacode*}` is not TeX, but pure Lua. The Lua code here
serves to define the function `mpshape`, which takes no
arguments. The function uses an “arithmetic for” loop with a
common syntax. Note that all blocks are terminated with
`end`, and that white space is insignificant (except that
a comment runs to the end of its line). The “..” operator used several times is string
concatenation; Lua converts numbers to strings as needed.

We’ve used string formatting, which works in Lua just as in C and many
other languages. The dimensions are in cm, and `bl` is the
unaltered line length. This document follows a common pattern, which is to
define some functions in a `luacode*` environment, and invoke
them in the document as needed. You can also put the function definitions
in an external file.
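
The formatting and conversion steps used in `mpshape` can be tried on their own in the standalone `lua` interpreter. The following is a minimal sketch; the helper name `shape_entry` is invented here and does not appear in the document above. One detail worth noticing is that `%4f` requests a minimum field width of 4 with the default six decimals, not four decimal places.

```lua
-- Build one "indent length" pair of a \parshape list, the way mpshape does.
-- shape_entry is a hypothetical helper used only for this illustration.
function shape_entry(indent_cm, base_cm)
  -- "%4f": minimum width 4, default precision of 6 decimals.
  local indent = string.format("%4f", indent_cm)
  -- indent is now a string; Lua converts it back to a number for arithmetic.
  local length = string.format("%4f", base_cm - 2*indent)
  -- ".." concatenates strings.
  return indent.."cm "..length.."cm"
end

print(shape_entry(0.5, 10))   --> 0.500000cm 9.000000cm
```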

When you want to actually execute some Lua code, call it as the argument
of a `\luadirect` command. Here you must use caution if you are
inserting Lua code directly rather than merely calling a function, as we do
here: the one place where line breaks are significant in Lua is in
comments, which begin with “--” and extend to the end of the
line. But TeX changes line breaks to spaces, which means that a comment in
your code will comment out everything that comes after it. Here the
`\luadirect` command runs our `mpshape` function,
which in turn inserts the `\parshape` command into the TeX
stream. I’ve kept things simple for illustration, but you can easily see
how this could be generalized. Since Lua can compile code from a string with its `load` function,
you could pass a mathematical expression into `mpshape` as an
argument, along with values for `bl` and `n` (the
number of lines to process, here hard-coded as 23, which I found by trial
and error).
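
One way such a generalization might look is sketched below. It is only an illustration under several assumptions: the function name `parshape_spec`, the convention that the expression is written in a variable `t` running from 1/n to 1 over the lines of the paragraph, and the four-decimal formatting are all choices made for this example. The function returns the specification as a string instead of printing it, so it can be tested in the plain `lua` interpreter.

```lua
-- Lua 5.1 compiles strings with loadstring; later versions use load.
local compile = loadstring or load

-- Build a \parshape specification from a Lua expression in the variable t.
-- expr: expression giving the indent in cm; n: number of lines;
-- bl: unaltered line length in cm. All three names are illustrative.
function parshape_spec(expr, n, bl)
  local chunk, err = compile("return function(t) return "..expr.." end")
  if not chunk then
    error("bad shape expression: "..err)
  end
  local f = chunk()
  local shape = ""
  for i = 1, n do
    local indent = f(i/n)
    -- each line is indented on the left and shortened by twice the indent
    shape = shape..string.format(" %.4fcm %.4fcm", indent, bl - 2*indent)
  end
  return "\\parshape= "..n..shape
end

-- The shape used in the article's example:
print(parshape_spec("2*math.cos(2*math.pi*t)^2", 23, 10))
```

Inside a LuaTeX document the returned string would be handed to `tex.sprint`. Because the expression is compiled and executed as Lua code, it should only come from a source you trust.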

#### Altering typesetting with callbacks

There is a whole different level of Lua-TeX integration afforded by
callbacks from TeX’s processing stages. TeX processes documents in
a series of steps, such as reading input, hyphenating, inserting glue,
ligaturing, breaking lines, and many more (see the LuaTeX reference manual
linked to above). At each of these stages, TeX is dealing with a linked
list of nodes, which are elements such as glyphs, kerns, lines,
etc., with associated collections of properties. In order to modify one of
TeX’s processing stages, you write a Lua function that manipulates the
relevant node list, and register this function as a callback
attached to the stage. When the TeX engine reaches the stage in question,
it will call the registered function, and continue as normal after it
returns, but with a modified node list.

Here is an example of a LaTeX document that uses this technique to
gradually lighten the grey value of each line in a paragraph, creating a
fade-out effect. Consider that it would be impossible to create this effect
by, for example, inserting color commands in the input file, because one
does not know where TeX will break lines until one actually runs the
document through it; and the line breaks depend on the line width. You need
to insert the color commands as part of the line breaking process, and this
is precisely what LuaTeX callbacks enable.

Here is a complete document that you can process with the
`lualatex` command to get the output in the figure. It will make
more sense than it does at first glance after the explanation that
follows.

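```
\documentclass{article}
\usepackage{luacode}
\usepackage{fontspec}
\usepackage[total={10cm,25cm},centering]{geometry}
\begin{document}
\setlength{\parindent}{0pt}
\pagestyle{empty}
\begin{luacode*}
function fadelines(head)
   WHAT = node.id("whatsit")
   COL = node.subtype("pdf_colorstack")
   colorize = node.new(WHAT,COL)
   gvalue = 0
   -- Node type 0 is an hlist; after line breaking, each line is one.
   for line in node.traverse_id(0,head) do
      colorize.data = gvalue.." g"
      -- Insert a copy of the color node before the current line.
      head = node.insert_before(head, line, node.copy(colorize))
      gvalue = math.min(gvalue + 0.06, 1)
   end
   return head
end

luatexbase.add_to_callback("post_linebreak_filter", fadelines, "fadelines")
\end{luacode*}

Call me Ishmael. [text deleted]

\end{document}
```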

Here is the result:

The `luacode` section starts with another function
definition, this time taking an argument that is the `head` of
a list of TeX nodes; which list of nodes it gets passed will be determined later
by how the function is registered as a callback. The first three lines of
the function define `colorize` to be a “whatsit” node,
which is a node type used for such things as PDF output. The
`for` block loops over nodes of type 0, which are “hlist
nodes”, beginning at the `head`. For each node it defines a
string with the value “gvalue g”, where the number
`gvalue` starts at 0 (full black) and increases by 0.06 at each
iteration. The string is used as a new value for the `data`
field of the `colorize` node, which is then inserted before the
current node. The `node.insert_before` function takes care of
keeping the link structure correct as the new node is inserted into the
linked list. When TeX constructs the output PDF, the colorize nodes become
constructs that set the color.
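
The bookkeeping involved in inserting into a linked list can be seen in a toy model written in plain Lua. This is deliberately not the LuaTeX node library: `new_node`, `insert_before`, `fade`, and `demo` are made-up stand-ins that run in the standalone `lua` interpreter, and the step of 0.25 is chosen only to keep the output short.

```lua
-- A toy doubly linked list standing in for a TeX node list.
function new_node(kind, data)
  return { kind = kind, data = data }
end

-- Insert `new` before `current` and return the (possibly new) head.
function insert_before(head, current, new)
  new.prev, new.next = current.prev, current
  if current.prev then
    current.prev.next = new
  else
    head = new   -- inserting before the first node moves the head
  end
  current.prev = new
  return head
end

-- A "callback": put a grey-level marker in front of every line node.
function fade(head)
  local grey = 0
  local n = head
  while n do
    if n.kind == "line" then
      head = insert_before(head, n, new_node("color", grey.." g"))
      grey = math.min(grey + 0.25, 1)
    end
    n = n.next
  end
  return head
end

-- Build a three-line "paragraph", run the callback, and list the result.
function demo()
  local a, b, c = new_node("line"), new_node("line"), new_node("line")
  a.next, b.prev, b.next, c.prev = b, a, c, b
  local out = {}
  local n = fade(a)
  while n do
    out[#out + 1] = n.data or n.kind
    n = n.next
  end
  return table.concat(out, " | ")
end

print(demo())   --> 0 g | line | 0.25 g | line | 0.5 g | line
```

The point of the model is the return value of `insert_before`: when a node is placed in front of the first element, the head of the list changes, so the caller has to keep the returned head.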

After the function definition, the
`luatexbase.add_to_callback` call registers the function as a
callback attached to the post-linebreak phase. Our function
`fadelines` will be called immediately after TeX finishes
breaking the text into lines. The node list traversed by the function will
be the final, typeset list of lines. The string in the final argument can
be any label, and is used for a subsequent unregister command if
desired.

If all this seems a bit arcane, that’s because it is. There is little in
the way of gentle tutorial material to teach one how to do this kind of
work. But, after studying some example
code [52-page amusing PDF], carefully reading a few sections of the
reference manual, and some experimentation, I was pleasantly surprised at
how quickly I could go from an idea to a working implementation. Although
anything that you can do with these techniques you could also do, in
theory, by programming purely in TeX, using Lua and the interfaces defined
in the LuaTeX project is far simpler. If you’ve had the pleasure of trying
to read and understand a LaTeX style file, for example, the code here will
seem far less arcane by comparison.

A simple modification of our `fadelines` function will change
the color of the output text letter-by-letter. Here is the new function and
its output:

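```
function fadelines(head)
   GLYPH = node.id("glyph")
   WHAT = node.id("whatsit")
   COL = node.subtype("pdf_colorstack")
   colorize = node.new(WHAT,COL)
   cvalue = 0
   -- Visit every glyph (printed character) node in the list.
   for glyph in node.traverse_id(GLYPH,head) do
      colorize.data = cvalue.." "..(1 - cvalue).." .5".." rg"
      -- Insert a copy of the color node before the current glyph.
      head = node.insert_before(head, glyph, node.copy(colorize))
      cvalue = math.min(cvalue + .0008, 1)
   end
   return head
end

luatexbase.add_to_callback("pre_linebreak_filter", fadelines, "fadelines")
```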

At the beginning of the function the `GLYPH` variable is set
to the id of the node type that represents a single printed character. The constructed string
used for the `data` field of the coloring node now has the form
“`R G B rg`”, and is made to change gradually as the
loop traverses the `GLYPH` nodes. Finally, we register the
function to the `pre_linebreak_filter` callback, to get access
to the list of glyph nodes.
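
The `data` strings are ordinary PDF color operators: `g` sets a grey level and `rg` sets an RGB color, each component running from 0 to 1. A slightly more defensive way to build them, sketched here with invented helper names (`gray_op`, `rgb_op`, `fade_rgb`), is to use `string.format` with a fixed number of decimals. Lua's default number-to-string conversion can switch to exponent notation for very small values, which PDF does not accept.

```lua
-- Format with fixed decimals so small values never appear as e.g. "8e-06".
function gray_op(level)
  return string.format("%.3f g", level)
end

function rgb_op(r, g, b)
  return string.format("%.3f %.3f %.3f rg", r, g, b)
end

-- The color used by the letter-by-letter example for a given cvalue.
function fade_rgb(cvalue)
  return rgb_op(cvalue, 1 - cvalue, 0.5)
end

print(gray_op(0))      --> 0.000 g
print(fade_rgb(0))     --> 0.000 1.000 0.500 rg
print(fade_rgb(1))     --> 1.000 0.000 0.500 rg
```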

#### Parting words

Since any LuaTeX document may contain any code whatsoever in the form of
embedded Lua scripts, you must use caution in processing documents from
untrusted sources. “Normal” TeX has traditionally refused to run
operating system commands (to run external programs, for example) unless
you specifically enabled them, but that safety check is absent from
LuaTeX. You can, however, invoke `lualatex` with the
`--safer` flag, which disables several features that could cause
mischief, including spawning processes and creating files. See pp. 44-45 of
the above-linked reference manual for details.

If you are undertaking a large project using LaTeX, such as writing a
textbook, or a series of smaller projects where you may want to get TeX to
perform typesetting tricks that are not covered by a LaTeX package, I
believe it is well worth it to become acquainted with the techniques
described here. Although TeX is a Turing-complete language, actually
writing TeX code to do anything non-trivial is dark magic. In contrast,
after very little study you can do things with LuaTeX that would be
practically impossible without it. Lua is pleasant to program in. The
ability to insert the results of Lua computations into the TeX document,
and to hook into TeX’s processing stages through callbacks, gives you powers that
used to be the exclusive possession of the most advanced TeX wizards.


WRITTEN BY

## Vanic
