August 22, 2017
This article was contributed by Lee Phillips
TeX has been the tool of
choice for the preparation of papers and documents for mathematicians,
physicists, and other authors of technical material for many
years. Although it takes some effort to learn how to use this venerable
work of free software, its devotees become addicted to its ability to
produce publication-quality manuscripts from a plain-text,
version-control-friendly format.
Most TeX users use LaTeX,
which is a set of commands and macros built on
top of TeX that allow automated cross-referencing, indexing, creation of
a table of contents, and automatic formatting of many types of
documents. TeX, LaTeX, a host of associated utilities, fonts, and related
programs are assembled into a large package called TeX Live. It’s available through
the package managers of many Linux distributions, but to get an up-to-date
version, one often needs to download it from its
maintainers directly.
The 2017 version of TeX Live was recently released. As usual, this new
release contains no big surprises and should create no compatibility
issues. All of those who use TeX in their daily work will eventually get
around to starting the big download before bedtime to reap the benefits of
dozens of incremental
improvements. Enthusiastic readers of release
notes may encounter one notable tidbit, however: the version of LuaTeX,
a key component of TeX Live, is now numbered 1.0.4.
LuaTeX is a project with several components and goals, all of which take
TeX in new directions. This project modernizes font handling, using Unicode
for input and output, including for math. It allows you to use any OpenType or
TrueType font on your system, and to select typefaces, styles, variants, and
font features easily and flexibly. LuaTeX embeds the Lua scripting language
into TeX, allowing authors a new level of power and control. Previously,
authors who needed to bend TeX to do non-standard or unusual typesetting
tasks were obliged to program in the TeX language itself, which is an arcane and
specialized skill. With LuaTeX, authors can accomplish these tasks by
writing scripts in a language with a more familiar syntax.
As recently as just a few years ago, however, documentation writers were
warning against using the still-evolving LuaTeX in critical work. For
example, there is a cautionary note in “A guide to
LuaLaTeX” [PDF], which is one of the few available introductory
documents about the
project.
LuaTeX is now out of beta, however, which means that the time for
worrying about this has now passed. According to the official roadmap, LuaTeX has
become “functionally complete”; most of the
functionality and interfaces are considered to be stable.
You can now undertake large LuaTeX projects without worrying about your
document breaking with the next upgrade. A “large LuaTeX project”
in this context means not just a long document that you happen to process
using LuaTeX, but one that makes extensive use of the Lua scripting that
the project makes possible.
In fact, LuaTeX has been the generally preferred TeX implementation for
some time. It has become an inseparable part of ConTeXt, which is
a large
project that is a popular alternative to LaTeX for publishing all kinds of
documents, especially books and pamphlets. LuaTeX passing version 1 means
that its official status has caught up to its
de facto status as the center of development in the TeX world.
Terminology
As terms in the TeX world can become confusing, this brief aside may be
called for. The engine is the command that you type to compile
your document. The name of the engine determines, among other things, what
format will be used, and what kind of output file TeX will
create. There are two choices of output file: the original DVI file and the
generally more popular and useful PDF. There are a handful of formats, such
as Plain TeX and LaTeX, that are large sets of macros that alter the
behavior of the TeX typesetting program that underlies everything, and that
expose various settings and commands to the author.
For example, the engine pdftex will use the Plain TeX format to create a
PDF; the latex command will use the LaTeX format to create a DVI; and
lualatex will use the LaTeX format and create a PDF by running a version
of the pdftex program that has been rewritten in C and can interoperate
with the Lua scripting language.
Most of the terms with weird mixtures of case refer to formats (LaTeX)
or projects such as LuaTeX.
Why LuaTeX?
LuaTeX provides many advantages over the more traditional versions of
TeX, and a couple of possible disadvantages. There are some, usually older,
LaTeX packages that are incompatible with LuaTeX; if you depend on one of
these, then you may want to stick with XeLaTeX (a part of XeTeX), which is the Unicode-aware
LaTeX-like project, or even the older pdfLaTeX, if necessary. You may,
however, want to consider either making the jump to ConTeXt, which can
duplicate the capabilities of many old LaTeX packages, or trying to
reproduce the effects you want with some Lua scripting.
Another disadvantage is that, for some documents, luatex can be slower
than the other engines. According to the Introduction of the LuaTeX
Reference [230-page PDF], when using Plain TeX, for example, pdftex will
almost always be faster than luatex, but complex documents, especially
using the recent incarnation of ConTeXt, often run faster with luatex. In
any case, the difference is typically about a factor of two or so.
One of the delightful advantages of LuaTeX is its simple and powerful
font handling, which I demonstrated in a previous article about recent TeX
developments. In addition to the features introduced in that article, I
should mention that LuaTeX has highly developed support for directional
typesetting, which means it can handle languages that go from right to left
or vertically. Another advantage, for some, is easier access to MetaPost, bringing an
efficient graphical subsystem into the TeX engine.
LuaTeX has a handful of commands to suppress different kinds of error
messages. Some of these don’t merely turn off the messages, but cause
things to be permitted that were previously forbidden. For example, you can
tell TeX that it’s OK to have a paragraph break inside an equation, which
frees you up to format your math input more flexibly.
There are several improvements in math typesetting related to the
possibility of using wide glyphs in equations. The distance between
equations and their numbers can now be controlled. There is a new type of
leader, whose alignment is based on the largest enclosing box (rather than
the smallest). You can set the minimum word length in which hyphenation
will be allowed.
Aside from font handling, these are all minor enhancements; there are
dozens more listed in the reference manual linked above. The headline
feature of LuaTeX is its embedding of the Lua scripting language, and
everything that enables.
Lua
Lua is a deceptively simple, but sophisticated, modern scripting
language. Some of Lua’s unusual, or unusually nice, features are: a single data
structure (the “table”); functions that can return multiple
results; proper tail calls; lexical scoping (closures, etc.); built-in
coroutines; and “metatables” and metamethods.
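The features listed above can be seen together in a short sketch. This is only an illustration that runs in the standalone lua interpreter; every name in it (divmod, make_counter, new_vec, squares) is made up for the example and is not part of Lua or LuaTeX.

```lua
-- A table serves as both an array and a record.
local point = { x = 3, y = 4, "first", "second" }

-- Functions can return several results at once.
function divmod(a, b)
  return math.floor(a / b), a % b
end

-- A closure: the returned function keeps its own private counter.
function make_counter()
  local count = 0
  return function()
    count = count + 1
    return count
  end
end

-- A metatable with an __add metamethod gives these tables a "+" operator.
local Vec = {}
Vec.__index = Vec
Vec.__add = function(a, b)
  return setmetatable({ x = a.x + b.x, y = a.y + b.y }, Vec)
end
function new_vec(x, y)
  return setmetatable({ x = x, y = y }, Vec)
end

-- A coroutine that yields the first n squares one at a time.
function squares(n)
  return coroutine.wrap(function()
    for i = 1, n do coroutine.yield(i * i) end
  end)
end

local q, r = divmod(17, 5)
local next_id = make_counter()
local v = new_vec(1, 2) + new_vec(3, 4)
print(point.x, point[1], q, r, next_id(), v.x, v.y)
for sq in squares(3) do print(sq) end
```

The functions are deliberately global rather than local: inside LuaTeX, a function defined in one Lua chunk has to be global to be visible from a later chunk in the same document.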
The features that make Lua so popular as an embedded scripting language,
however, are its small
footprint (10,000 lines of C and a compiled, static library of 500K)
and its design from the ground up as a language for embedding in C and C++
programs. Its compact bytecode interpreter and incremental garbage
collection make for good performance in both time and space (a JIT
compiler is available through the separate LuaJIT project). The tradeoff
is its relatively spare
standard library, but this is supplemented by a healthy ecosystem of official and
contributed packages.
Lua is widely used, for example, by game developers,
who can program the core game behavior in C while defining the game play
logic in Lua. You can get
the latest version of the language up and running on any system with an
ANSI C compiler — there are no extra dependencies. On many distributions, there
is also a recent version available through the package management system,
and, finally, there are binaries provided for a variety of operating
systems. Lua has an interactive mode that you can start by typing lua in
the terminal. This of course helps in learning the language through
experimentation but, unlike Python and Lisp, typing an expression won’t
display its result, unfortunately; you need to use the print function.
Fortunately the application of Lua within TeX documents rarely requires
anything beyond basic knowledge of the language. Those who wish a
systematic introduction may want to study the book Programming in
Lua; all of the editions of the book, covering recent versions of the
language, can be found here,
including an early edition available free of charge.
Lua in LaTeX
By simply including the line \usepackage{luacode} in your LaTeX
document’s preamble, and processing it with the lualatex command, you can
mix Lua in with your LaTeX. There are two major ways of incorporating Lua
into your document; the first is to insert the results of a Lua
calculation into the TeX token stream, by using tex.print (or tex.sprint,
which inserts the output inline, avoiding a line break) instead of the
normal Lua print function. This article shows how to do that by using Lua
to compute a numerical table within a LaTeX document that formats the
table, and another one that graphs it.
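As a minimal sketch of this first approach, the Lua below formats a number and hands it to tex.sprint. The helper names sqrt_text and insert_sqrt are hypothetical, and the fallback to print is an assumption added only so the same code also runs in the standalone interpreter, where the tex table does not exist.

```lua
-- Outside LuaTeX the global "tex" table is absent, so fall back to print;
-- the fallback is only there so the sketch also runs in plain Lua.
local emit = (tex and tex.sprint) or print

-- Hypothetical helper: the square root of x, formatted to four decimals.
function sqrt_text(x)
  return string.format("%.4f", math.sqrt(x))
end

-- Hypothetical helper: insert the formatted value into the TeX stream.
function insert_sqrt(x)
  emit(sqrt_text(x))
end

insert_sqrt(2)
```

In a document, the two function definitions would go inside a luacode* environment, and the final call would be written in the running text as the argument of \luadirect, at the point where the number should appear.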
Below is another example that uses Lua to calculate a fancy paragraph
shape and construct the TeX command for defining it. The \parshape command
in LaTeX or Plain TeX lets you define any shape for the current paragraph,
but it’s verbose and cumbersome, because you need to type in a list of
line lengths and indents, following the number of lines that you want it
to apply to. Wouldn’t it be nice to be able to simply say that you would
like a paragraph to have the shape of a particular mathematical
expression? Here is a little LaTeX document that typesets the beginning of
a great American novel in the wavy shape of the cos² function:
    \documentclass{article}
    \usepackage{luacode}
    \usepackage{fontspec}
    \begin{document}
    \setlength{\parindent}{0pt}
    \pagestyle{empty}
    \begin{luacode*}
    function mpshape()
       shape = ""
       n = 23   -- number of lines to shape
       bl = 10  -- unaltered line length, in cm
       for i = 1, n do
          indent = string.format("%4f", 2*math.cos(i*2*math.pi/n)^2)
          length = string.format("%4f", bl - 2*indent)
          shape = shape.." "..indent.."cm "..length.."cm"
       end
       tex.sprint("\\parshape= "..n.." "..shape)
    end
    \end{luacode*}
    \luadirect{mpshape()}
    Call me Ishmael. Some years ago—never mind how long precisely—having
    [text deleted]
    \end{document}
Here is the output:
A few things here need a little explanation. We need to include the
fontspec package to enable Unicode input: this should be a standard part of
your preamble if you are using LuaTeX, in which case you probably have a
handful of additional fontspec commands to set up your fonts (the em-dashes
in the input are the only Unicode here). The luacode* environment passes
its contents directly to Lua, and frees you from worrying about escaping
special TeX characters such as backslashes. This is explained in the
documentation for the luacode LaTeX package. Everything between the lines
\begin{luacode*} and \end{luacode*} is not TeX, but pure Lua. The Lua code
here serves to define the function mpshape, which takes no arguments. The
function uses an “arithmetic for” loop with a common syntax. Note that all
blocks are terminated with end, and that white space is insignificant
(except for comments). The “..” operator used several times is string
concatenation; Lua converts numbers to strings as needed (and, in
arithmetic such as 2*indent, strings to numbers).
We’ve used string formatting, which works in Lua just as in C and many
other languages. The dimensions are in cm, and bl is the unaltered line
length. This document follows a common pattern, which is to define some
functions in a luacode* environment, and invoke them in the document as
needed. You can also put the function definitions in an external file.
When you want to actually execute some Lua code, call it as the argument
of a \luadirect command. Here you must use caution if you are inserting
Lua code directly rather than merely calling a function, as we do here:
the one place where line breaks are significant in Lua is in comments,
which begin with “--” and extend to the end of the line. But TeX changes
line breaks to spaces, which means that a comment in your code will
comment out everything that comes after it. Here the \luadirect command
runs our mpshape function, which in turn inserts the \parshape command
into the TeX stream. I’ve kept things simple for illustration, but you can
easily see how this could be generalized. Since Lua can compile code from
a string at run time with its load function, you could pass a mathematical
expression into mpshape as an argument, along with values for bl and n
(the number of lines to process, here hard-coded as 23, which I found by
trial and error).
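One way that generalization might look is sketched below. The function name parshape_spec and its argument order are assumptions made for illustration; the expression is passed as a string in the variable x and compiled with load (or loadstring on Lua 5.1). The function returns the command text instead of calling tex.sprint, so it can be tried in the standalone interpreter.

```lua
-- Lua 5.1 compiles strings with loadstring; Lua 5.2 and later use load.
local compile = loadstring or load

-- Hypothetical generalization of mpshape: expr is a Lua expression in x,
-- n is the number of lines to shape, bl is the full line length in cm.
function parshape_spec(expr, n, bl)
  -- Wrap the expression in a chunk that receives x as its argument.
  local f = assert(compile("local x = ...; return " .. expr))
  local parts = {}
  for i = 1, n do
    local indent = f(i * 2 * math.pi / n)  -- x covers one period over n lines
    local length = bl - 2 * indent         -- same indent on both sides
    parts[#parts + 1] = string.format("%.4fcm %.4fcm", indent, length)
  end
  return "\\parshape=" .. n .. " " .. table.concat(parts, " ")
end

-- Inside a luacode* environment the result would go to tex.sprint;
-- here it is printed so that the sketch runs on its own.
print(parshape_spec("2*math.cos(x)^2", 23, 10))
```

Compiling the expression once, outside the loop, avoids recompiling it for every line. Because load runs arbitrary code, an expression should only be taken from a source you trust.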
Altering typesetting with callbacks
There is a whole different level of Lua-TeX integration afforded by
callbacks from TeX’s processing stages. TeX processes documents in
a series of steps, such as reading input, hyphenating, inserting glue,
ligaturing, breaking lines, and many more (see the LuaTeX reference manual
linked to above). At each of these stages, TeX is dealing with a linked
list of nodes, which are elements such as glyphs, kerns, lines,
etc., with associated collections of properties. In order to modify one of
TeX’s processing stages, you write a Lua function that manipulates the
relevant node list, and register this function as a callback
attached to the stage. When the TeX engine reaches the stage in question,
it will call the registered function, and continue as normal after it
returns, but with a modified node list.
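The smallest useful callback is one that inspects the node list without changing it. The sketch below, meant to sit inside a luacode* environment, counts the line boxes of each paragraph after line breaking and reports the count on the terminal and in the log. The name count_lines is hypothetical, and the guard around the registration is an assumption added so that the file also loads outside LuaLaTeX, where luatexbase does not exist.

```lua
-- Illustrative callback; the name count_lines is hypothetical.
-- It leaves the node list untouched and only reports the number of lines.
function count_lines(head)
  local HLIST = node.id("hlist")  -- numeric id of the node type used for lines
  local count = 0
  for _ in node.traverse_id(HLIST, head) do
    count = count + 1
  end
  texio.write_nl("term and log",
                 "paragraph broken into " .. count .. " lines")
  return head                     -- hand the unchanged list back to TeX
end

-- luatexbase exists only inside LuaLaTeX; skip the registration elsewhere.
if luatexbase then
  luatexbase.add_to_callback("post_linebreak_filter",
                             count_lines, "count_lines")
end
```

Returning head is what lets TeX carry on with the list; a callback that modifies the list returns the (possibly new) head in the same way.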
Here is an example of a LaTeX document that uses this technique to
gradually lighten the grey value of each successive line in a paragraph,
creating a fade-out effect. Consider that it would be impossible to create
this effect
by, for example, inserting color commands in the input file, because one
does not know where TeX will break lines until one actually runs the
document through it; and the line breaks depend on the line width. You need
to insert the color commands as part of the line breaking process, and this
is precisely what LuaTeX callbacks enable.
Here is a complete document that you can process with the lualatex command
to get the output in the figure. It will make more sense after the
explanation that follows.
    \documentclass{article}
    \usepackage{luacode}
    \usepackage{fontspec}
    \usepackage[total={10cm,25cm},centering]{geometry}
    \begin{document}
    \setlength{\parindent}{0pt}
    \pagestyle{empty}
    \begin{luacode*}
    function fadelines(head)
       WHAT = node.id("whatsit")
       COL = node.subtype("pdf_colorstack")
       colorize = node.new(WHAT,COL)
       gvalue = 0
       for line in node.traverse_id(0,head) do  -- 0 is the id of hlist nodes
          colorize.data = gvalue.." g"
          -- insert_before returns the list head, which changes when the
          -- new node is placed in front of the first node
          head = node.insert_before(head, line, node.copy(colorize))
          gvalue = math.min(gvalue + 0.06, 1)
       end
       return head
    end
    luatexbase.add_to_callback("post_linebreak_filter", fadelines, "fadelines")
    \end{luacode*}
    Call me Ishmael. [text deleted]
    \end{document}
Here is the result:
The luacode section starts with another function definition, this time
taking an argument that is the head of a list of TeX nodes; which list of
nodes it gets passed will be determined later by how the function is
registered as a callback. The first three lines of the function define
colorize to be a “whatsit” node, which is a node type used for such things
as PDF output. The for block loops over nodes of type 0, which are “hlist
nodes”, beginning at the head. For each node it defines a string with the
value “gvalue g”, where the number gvalue starts at 0 (full black) and
increases by 0.06 at each iteration. The string is used as a new value for
the data field of the colorize node, which is then inserted before the
current node. The node.insert_before function takes care of keeping the
link structure correct as the new node is inserted into the linked list.
When TeX constructs the output PDF, the colorize nodes become constructs
that set the color.
After the function definition, the luatexbase.add_to_callback call
registers the function as a callback attached to the post-linebreak phase.
Our function fadelines will be called immediately after TeX finishes
breaking the text into lines. The node list traversed by the function will
be the final, typeset list of lines. The string in the final argument can
be any label, and is used for a subsequent unregister command if desired.
If all this seems a bit arcane, that’s because it is. There is little in
the way of gentle tutorial material to teach one how to do this kind of
work. But, after studying some example
code [52-page amusing PDF], carefully reading a few sections of the
reference manual, and some experimentation, I was pleasantly surprised at
how quickly I could go from an idea to a working implementation. Although
anything that you can do with these techniques you could also do, in
theory, by programming purely in TeX, using Lua and the interfaces defined
in the LuaTeX project is far simpler. If you’ve had the pleasure of trying
to read and understand a LaTeX style file, for example, the code here will
seem far less arcane by comparison.
A simple modification of our fadelines function will change the color of
the output text letter-by-letter. Here is the new function and its output:
    function fadelines(head)
       GLYPH = node.id("glyph")
       WHAT = node.id("whatsit")
       COL = node.subtype("pdf_colorstack")
       colorize = node.new(WHAT,COL)
       cvalue = 0
       for glyph in node.traverse_id(GLYPH,head) do
          colorize.data = cvalue.." "..1 - cvalue.." .5".." rg"
          -- keep the head returned by insert_before in case it changes
          head = node.insert_before(head, glyph, node.copy(colorize))
          cvalue = math.min(cvalue + .0008, 1)
       end
       return head
    end
    luatexbase.add_to_callback("pre_linebreak_filter", fadelines, "fadelines")
At the beginning of the function the GLYPH variable is set to the id of
the node type representing a single printed character. The constructed
string used for the data field of the coloring node now has the form
“R G B rg”, and is made to change gradually as the loop traverses the
glyph nodes. Finally, we register the function to the pre_linebreak_filter
callback, to get access to the list of glyph nodes.
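To see how the color string evolves, the construction can be pulled out into a small helper and run in the standalone interpreter. The name rg_data is hypothetical, and the use of string.format with a fixed number of decimals is a substitution for the plain concatenation in the listing, chosen so the output does not depend on how a given Lua version prints numbers.

```lua
-- Hypothetical helper: build the "R G B rg" string for a given cvalue,
-- with red rising, green falling and blue fixed at 0.5.
function rg_data(cvalue)
  return string.format("%.4f %.4f 0.5 rg", cvalue, 1 - cvalue)
end

-- Print the color reached after every 250th glyph, using the same
-- step (0.0008 per glyph) and the same cap at 1 as the listing.
for glyph = 0, 1000, 250 do
  print(glyph, rg_data(math.min(glyph * 0.0008, 1)))
end
```

With a step of 0.0008, the red component reaches its cap only after 1250 glyphs, which is why the color change is gradual across a paragraph of ordinary length.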
Parting words
Since any LuaTeX document may contain any code whatsoever in the form of
embedded Lua scripts, you must use caution in processing documents from
untrusted sources. “Normal” TeX has traditionally refused to run
operating system commands (to run external programs, for example) unless
you specifically enabled them, but that safety check is absent from
LuaTeX. You can, however, invoke lualatex with the --safer flag, which
disables several features that could cause mischief, including spawning
processes and creating files. See pp. 44-45 of
the above-linked reference manual for details.
If you are undertaking a large project using LaTeX, such as writing a
textbook, or a series of smaller projects where you may want to get TeX to
perform typesetting tricks that are not covered by a LaTeX package, I
believe it is well worth it to become acquainted with the techniques
described here. Although TeX is a Turing-complete language, actually
writing TeX code to do anything non-trivial is dark magic. In contrast,
after very little study you can do things with LuaTeX that would be
practically impossible without it. Lua is pleasant to program in. The
ability to insert the results of Lua computations into the TeX document is
already immensely useful; the next level, direct access to TeX internals,
gives you powers that
used to be the exclusive possession of the most advanced TeX wizards.