Literate programming: Knuth is doing it wrong (2014)


Oct 3, 2014

Literate programming advocates this: Order your code for others to read,
not for the compiler. Beautifully typeset your code so one can curl up in bed
to read it like a novel. Keep documentation in sync with code.
What’s not to like about this vision? I have two beefs with it: the ends are
insufficiently ambitious, focusing on a passive representation; and the means
are insufficiently polished, over-emphasizing typesetting at the cost of prose
quality. Elaboration, in reverse order:

Canonizing typesetting over organization

When I look around at the legacy of literate programming, systems to do
so-called semi- or quasi-literate programming dominate. These are systems that
focus on generating beautifully typeset documentation without allowing the
author to arbitrarily order code. I think this is exactly backwards; codebases
are easy to read primarily due to the author’s efforts to orchestrate the
presentation, and only secondarily by typesetting improvements. As a concrete
example, just about every literate program out there begins with cruft like
this (see footnote 1):

// Some #includes

or:

-- Don't mind these imports.

I used to think people just didn’t understand Knuth’s vision. But then I went
and looked at his literate programs. Boom, #includes:






The example Pascal program in Knuth’s original paper didn’t have any imports
at all. But when it comes to organizing larger codebases, we’ve been putting
imports thoughtlessly at the top. Right from day one.

Exhibit 2:





“Skip ahead if you are impatient to see the interesting stuff.”
Well gee, if only we had, you know, a tool to put the interesting stuff up
front.

Exhibit 3:






This is the start of the piece. There’s a sentence of introduction, and then
this:

We use a utility field to record the vertex degrees.

#define deg u.I

That’s a steep jump across several levels of abstraction. Literally the first
line of code shown is a macro to access a field for presumably a struct whose
definition — whose very type name — we haven’t even seen
yet. (The variable name for the struct is also hard-coded in; but I’ll stop
nit-picking further.)
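
To make the size of that jump concrete, here is a minimal sketch of what a reader has to reconstruct before the macro means anything. It takes the quoted sentence at its word: `u` is a utility field and `I` is its integer interpretation. The `Utility` union, the `Vertex` struct and the two helper functions are simplified stand-ins invented for illustration; they are not the declarations from the GraphBase headers, which the excerpt never shows.

```cpp
// Hypothetical stand-in for a general-purpose "utility" slot: one field
// that different programs may reuse for different purposes.
union Utility {
  long I;         // interpreted as an integer
  const char* S;  // interpreted as a string
};

// Hypothetical vertex record (an assumption, not the real declaration).
struct Vertex {
  const char* name;
  Utility u;  // the utility field the macro refers to
};

// The macro from the excerpt: after this, `v->deg` expands to `v->u.I`.
#define deg u.I

// Record one more edge touching v by bumping its degree.
inline void add_edge_endpoint(Vertex* v) { v->deg += 1; }

// Build a vertex, touch it twice, and report the recorded degree.
inline long demo_degree() {
  Vertex v{"a", {0}};  // name "a", utility field zeroed via its first member
  add_edge_endpoint(&v);
  add_edge_endpoint(&v);
  return v.deg;
}
```

Read in this order (data first, then the shorthand, then its use) the macro is unremarkable. Presented first and alone, it asks the reader to take both the struct and the union on faith.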

Exhibit 4: Zoom out just a little bit on the previous example:





Again, there are #includes at the top but I won’t belabor that. Let’s look at
what’s in these #includes. “GraphBase data structures” seems kinda relevant to
the program. Surely the description should inline and describe the core data
structures the program will be using. In the immortal words of Fred Brooks:

“Show me your flowcharts [code] and conceal your tables [data types], and I
shall continue to be mystified. Show me your tables, and I won’t usually need
your flowcharts; they’ll be obvious.”

Surely a system to optimize order for exposition shouldn’t be stymied by
code in a different file.

On the whole, people have failed to appreciate the promise of literate
programming because the early examples are just not that good, barring the
small program in Knuth’s original paper. The programs jump across abstraction
layers. Problems are ill-motivated. There’s a pervasive mindset of top-down
thinking, of starting from main, whether or not that’s easiest to read. The
ability to change order is under-used, perhaps because early literate tools
made debugging harder, but mostly, I think, because of all the emphasis (right
from the start) on just how darn cool the typesetting was (see footnote 2).

All this may seem harsh on Knuth, but I figure Knuth can take it. He’s, well,
Knuth, and I’m nobody. He came up with literate programming as the successor
to structured programming, meaning that he was introducing ordering considerations
at a time when most people were still using gotos as a matter of
course. There was no typesetting for programmers or academics, no internet, no
hyperlinking. No, these early examples are fine for what they are. They
haven’t developed because we programmers have failed to develop them
over time. We’ve been too quick to treat them as sacred cows to be merely
interpreted (not noticing the violence our interpretations do to the original
idea anyway). I speculate that nobody has actually read anybody else’s
literate programs in any sort of detail. And so nobody has been truly inspired
to do better. We’ve been using literate programming, like the vast majority of
us use TAOCP, as a signalling device to show that we are hip to what’s cool.
(If you have spent time reading somebody else’s literate programs, I want to
hear about your experiences!)

Canonizing passive reading over interactive feedback

I’ve been indirectly maligning typesetting, but it’s time to aim squarely at
it. There’s a fundamental problem with generating a beautifully typeset
document for a codebase: it’s dead. It can’t render inside just about any
actual programming environment (editor or IDE) on this planet, and so we can’t
make changes to it while we work on the codebase. Everybody reads a PDF about
a program at most once, when they first encounter it. After that, re-rendering
it is a pain, and proof-reading the rendered document, well, forget about it.
That dooms generated documentation to be an afterthought, forever at risk of
falling stale, or at least rendering poorly.

You can’t work with it, you can’t try to make changes to it to see what
happens, and you certainly can’t run it interactively. All you can do,
literally, is curl up with it in bed. And promptly fall asleep. I mean, who
reads code in bed without a keyboard?!

What’s the alternative? In the spirit of presenting a target of my own for
others to attack, I’ll point you at some literate code I wrote last year
for a simple interpreter. A sample of what it looks like:

 // Programs are run in two stages:
 //  a) _read_ the text of the program into a tree of cells
 //  b) _evaluate_ the tree of cells to yield a result
 cell* run(istream& in) {
   cell* result = nil;
   do {
       // TEMP and 'update' help recycle cells after we're done with
       // them.
       // Gotta pay attention to this all the time; see the 'memory'
       // layer.
       TEMP(form, read(in));
       update(result, eval(form));
   } while (!eof(in));
   return result;
 }
 
 cell* run(string s) {
   stringstream in(s);
   return run(in);
 }
 
 :(scenarios run)
 :(scenario examples)
 # function call; later we'll support a more natural syntax for
 # arithmetic
 (+ 1 1)
 => 2
 
 # assignment
 (=> 3
 
 # list; deliberately looks just like a function call
 '(1 2 3)
 => (1 2 3)
 
 # the function (fn) doesn't have to be named
 ((fn (a b)  # parameters (params)
     (+ a b))  # body
    3 4)  # arguments (args) that are bound to params inside this call
 => 7

A previous post describes the format, but we won’t need more details for this
example. Just note that it is simple plaintext that will open up in your text
editor. There is minimal prose, because just the order of presentation does so
much heavy lifting. Comments are like code: the less you write, the less there
is to go bad. I’m paying the cost of ‘//’ to delineate comments because I
haven’t gotten around to fixing it, because it’s just not that important to
clean it up. You can’t see it in this sample, but the program at large
organizes features in self-contained layers, with later features hooking into
the code for earlier ones. Here’s a test harness. (With, I can’t resist
pointing out, the includes at the bottom.) Here’s a garbage collector. Here I
replace a flat namespace of bindings with dynamic scope. In each case, code is
freely intermingled with tests to exercise it (like the scenarios above),
tests that can be run from the commandline.

 $ build_and_test_until 029  # exercises the core interpreter
 $ build_and_test_until 030  # exercises dynamic scope
 ...
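
The commands above build the program from only the layers up to a given number. As a rough sketch of that idea, the selection step might look like the following. This is not the actual build script, whose internals this post does not show. The naming scheme (a numeric prefix such as `029` on each layer file) and the functions `layer_number` and `layers_until` are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Parse the leading digits of a layer file name such as "029eval.cc".
// Returns -1 when the name has no numeric prefix (assumed naming scheme).
inline int layer_number(const std::string& name) {
  std::size_t i = 0;
  int n = 0;
  while (i < name.size() &&
         std::isdigit(static_cast<unsigned char>(name[i]))) {
    n = n * 10 + (name[i] - '0');
    ++i;
  }
  return i == 0 ? -1 : n;
}

// Keep only the layers numbered <= limit, in ascending numeric order, so
// that later layers can hook into code introduced by earlier ones.
inline std::vector<std::string> layers_until(std::vector<std::string> names,
                                             int limit) {
  assert(limit >= 0);
  // Drop files that are not layers or that lie beyond the requested limit.
  names.erase(std::remove_if(names.begin(), names.end(),
                             [limit](const std::string& s) {
                               int n = layer_number(s);
                               return n < 0 || n > limit;
                             }),
              names.end());
  // Order the survivors by layer number, keeping ties in input order.
  std::stable_sort(names.begin(), names.end(),
                   [](const std::string& a, const std::string& b) {
                     return layer_number(a) < layer_number(b);
                   });
  return names;
}
```

Under these assumptions, calling `layers_until` with a limit of 29 would keep a file named `029eval.cc` and drop one named `030scope.cc`; the surviving files would then be concatenated in order and compiled as one program.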

Having built the program with just a subset of layers, you’re free to poke at
it and run what-if experiments. Why did Kartik write this line like so? Make a
change, run the tests. Oh, that’s why. You can add logging to trace through
execution, and you can use a debugger, because you’re sitting at your
workstation like a reasonable programmer, not curled up in bed.
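
Running the tests after every change presupposes a harness that can turn scenarios like the ones in the earlier sample into checks. Here is a minimal sketch of the parsing half, under loudly stated assumptions: `#` starts a comment that runs to the end of the line, an expression may span several lines, and a line beginning with `=>` gives the expected result of the expression before it. The real format is described in the earlier post and may differ; `strip` and `parse_scenario` are hypothetical names.

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Remove a trailing "# comment" and surrounding whitespace from one line.
// Simplification: assumes '#' never appears inside a string literal.
inline std::string strip(const std::string& line) {
  std::string s = line.substr(0, line.find('#'));
  const char* ws = " \t\r\n";
  std::size_t b = s.find_first_not_of(ws);
  if (b == std::string::npos) return "";
  std::size_t e = s.find_last_not_of(ws);
  return s.substr(b, e - b + 1);
}

// Split scenario text into (expression, expected result) pairs.
// Consecutive non-empty lines are joined with single spaces into one
// expression; a line starting with "=>" closes the pair.
inline std::vector<std::pair<std::string, std::string>>
parse_scenario(const std::string& text) {
  std::vector<std::pair<std::string, std::string>> cases;
  std::istringstream in(text);
  std::string line, expr;
  while (std::getline(in, line)) {
    std::string s = strip(line);
    if (s.empty()) continue;  // blank or comment-only line
    if (s.compare(0, 2, "=>") == 0) {
      cases.emplace_back(expr, strip(s.substr(2)));
      expr.clear();
    } else {
      if (!expr.empty()) expr += ' ';
      expr += s;
    }
  }
  return cases;
}
```

A harness built on this would feed each expression to the interpreter and compare the printed result with the expected string. Keeping the format this plain is what lets a test sit a few lines below the code it exercises.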

Eventually I’d like to live in a world where our systems for viewing live,
modifiable, interactive code are as adept at typesetting as our publishing
systems are. But until that day, I’ll choose simple markdown-like plain-text
documentation whose structure the author labored over. Every single time.

footnotes

1. Literate Haskell, and to a lesser extent CoffeeScript, allow very flexible
ordering in the language, which mitigates this problem. But then we have their
authors telling us that their tools can be used with any language, blithely
ignoring the fact that other languages may need better tools. Everybody’s
selling mechanisms, nobody’s inculcating the right policies.

2. We’ve all had the little endorphin rush of seeing our crappy little papers
or assignments magically improved by sprinkling a little typesetting. And we
tend to take well-typeset documents more seriously. The flip side to this
phenomenon: if it looks done you won’t get as much feedback on it.




2 Comments

  1. Yeah, the word "read" seems to have seduced us into an obviously false analogy. "Reading" code is not like "reading" a novel. Perhaps this could seem more plausible if your work was primarily along the lines of Knuth's, rather than combining the same bag of shopworn, well-known techniques into academically rather uninteresting production code.

    Another article on this topic I recommend is "Code Is Not Literature," from Peter Seibel: https://gigamonkeys.com/code-reading/

  2. I think it's clear to everyone that Knuth's style of literate programming didn't achieve any sort of mainstream adoption.

    However, this article doesn't mention a newer form of literate programming that has gone mainstream. Notebooks! Whether it's jupyter, colab, whatever the Julia one is called, etc. People use literate programming all the time for data science and machine learning.
