Forbidden Emacs Lisp Knowledge: Block Comments


12/02/2022

Note: The \037 sequence appearing in the code snippets is one
character, escaped for readability.

It’s been eight years since I started using Emacs and Emacs Lisp and I
still keep running into dusty corners. Traditionally, Lisp dialects
use the semicolon for line comments, with block and s-expression
comments being optional features.

Dialect             Line comment  Block comment  S-expression comment
Clojure, Hy         ;             n/a            #_
Common Lisp[1]      ;             #|...|#        #+(or)
Emacs Lisp, Lush    ;             n/a            n/a
ISLisp, LFE, uLisp  ;             #|...|#        n/a
NewLisp             ;, #          n/a            n/a
Picolisp[2]         #             #{...}#        n/a
Racket, Scheme[3]   ;             #|...|#        #;
TXR Lisp            ;             n/a            #;
WAT[4]              ;;            (;...;)        n/a

Emacs Lisp is special though. Here’s an unusual section from the
Emacs Lisp reference on comments:

The #@COUNT construct, which skips the next COUNT characters,
is useful for program-generated comments containing binary data.
The Emacs Lisp byte compiler uses this in its output files (see
“Byte Compilation”). It isn’t meant for source files, however.

At first sight, this seems useless. The feature is meant for .elc
files, not .el files, and judging by a file produced by the byte
compiler, its only use there is to emit docstrings:

;;; This file uses dynamic docstrings, first added in Emacs 19.29.

[...]

#@11 docstring\037
(defalias 'my-test #[...])

This is kind of like a block-comment, except there is no comment
terminator. For this reason, the characters to be commented out need
to be counted. You’d think that the following would work, but it
fails with an “End of file during parsing” error:

(defvar my-variable #@8 (/ 1 0) 123)
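The failure can be reproduced without evaluating a buffer by handing the same text to the reader as a string. This is only a sketch: my-read-or-error is a hypothetical helper, not part of Emacs, and it assumes the reader treats string input the same way as the buffer input above.

```elisp
;; Hypothetical helper: read one form from TEXT and, if the reader
;; signals an error, return the error symbol instead of the form.
(defun my-read-or-error (text)
  "Read one form from TEXT; return the error symbol if reading fails."
  (condition-case err
      (car (read-from-string text))
    (error (car err))))

;; The snippet from above.  The reader never finds the end of the
;; comment, so this should report `end-of-file' rather than a form.
(my-read-or-error "(defvar my-variable #@8 (/ 1 0) 123)")
```

Catching the error this way makes it easy to try variations of the comment text in the scratch buffer.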

It took me a dive into the reader to find out why:

#define FROM_FILE_P(readcharfun)                            \
  (EQ (readcharfun, Qget_file_char)                         \
   || EQ (readcharfun, Qget_emacs_mule_file_char))

static void
skip_dyn_bytes (Lisp_Object readcharfun, ptrdiff_t n)
{
  if (FROM_FILE_P (readcharfun))
    {
      block_input ();                /* FIXME: Not sure if it's needed.  */
      fseek (infile->stream, n - infile->lookahead, SEEK_CUR);
      unblock_input ();
      infile->lookahead = 0;
    }
  else
    { /* We're not reading directly from a file.  In that case, it's difficult
         to reliably count bytes, since these are usually meant for the file's
         encoding, whereas we're now typically in the internal encoding.
         But luckily, skip_dyn_bytes is used to skip over a single
         dynamic-docstring (or dynamic byte-code) which is always quoted such
         that \037 is the final char.  */
      int c;
      do {
        c = READCHAR;
      } while (c >= 0 && c != '\037');
    }
}

Due to encoding difficulties, the #@COUNT construct is always used
with a terminating \037, AKA the unit separator character. While it
seems that the FROM_FILE_P macro applies when using the reader
with get-file-char or get-emacs-mule-file-char (which are used
by load internally), I never managed to trigger that code path.
The reader therefore seems to always ignore the count argument,
essentially turning #@COUNT into a block comment facility.
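A small experiment makes this easier to see. The sketch below reads the same list from a string three times, varying only the count. my-read-with-count is a hypothetical helper, and the expectation that all three calls agree rests on the non-file branch of skip_dyn_bytes quoted above, so results may differ on Emacs versions where that code has changed.

```elisp
;; Hypothetical helper: build "(1 #@COUNT ignored<US> 2)" and read it.
;; The commented-out span (space, "ignored", unit separator) is always
;; 9 characters long, whatever COUNT claims.
(defun my-read-with-count (count)
  "Read a list containing a #@COUNT comment that spans 9 characters."
  (car (read-from-string
        (format "(1 #@%d ignored\037 2)" count))))

(my-read-with-count 9)    ; count matches the real length
(my-read-with-count 1)    ; count far too small
(my-read-with-count 500)  ; count far too large
```

If the count were honored, the last two calls would either read `ignored` as a symbol or run off the end of the string. With the reader skipping to the unit separator instead, each call should yield the list (1 2).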

Given this information, one could obfuscate Emacs Lisp code to hide
something unusual going on:

(message "Fire the %s!!!" #@11 "rockets")\037

(reverse "sekun"))

A more legitimate use case is a multi-line shebang:

#!/bin/sh
#@0 -*- emacs-lisp -*-
exec emacs -Q --script "$0" -- "$@"
exit
#\037

(when (equal (car argv) "--")
  (pop argv))

(while argv
  (message "Argument: %S" (pop argv)))

In case you want to experiment with this and want to use the correct
counts, here’s a quick and dirty command:

(defun cursed-elisp-block-comment (beg end)
  (interactive "r")
  (save-excursion
    (save-restriction
      (narrow-to-region beg end)
      (goto-char (point-min))
      ;; account for space and terminator
      (insert (format "#@%d " (+ (- end beg) 2)))
      (goto-char (point-max))
      (insert "\037"))))
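For experiments outside a buffer, the same arithmetic can be applied to a string. This is a sketch under the same assumptions as the command: my-block-comment-string is a hypothetical name, and the count is in characters, which only equals the byte count for ASCII text.

```elisp
;; Hypothetical string counterpart of the command above.
(defun my-block-comment-string (text)
  "Return TEXT wrapped as a #@COUNT comment ending in the unit separator.
COUNT covers the leading space, TEXT and the terminator, in characters."
  (format "#@%d %s\037" (+ (length text) 2) text))

;; Reproduces the docstring line from the byte compiler output shown
;; earlier: "#@11 docstring" followed by the unit separator.
(my-block-comment-string "docstring")

;; Splice a commented-out form into a list and read it back.
(car (read-from-string
      (concat "(1 " (my-block-comment-string "(/ 1 0)") " 2)")))
```

The second expression should read as (1 2), with the division never reaching the evaluator.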

There’s one more undocumented feature though: #@00 is
special-cased as an EOF comment:

/* Read a decimal integer.  */
while ((c = READCHAR) >= 0
       && c >= '0' && c <= '9')
  {
    if ((STRING_BYTES_BOUND - extra) / 10 <= nskip)


Written by Vanic
