emacs

recursion

C-c-c-conjecturing, and dealing with recursion in Emacs (more excursus)

Monday, 24 February 2025

I’m not putting this in the lambda-calculus series, though it touches on issues from the last post in the series, but specifically issues of recursion. I was curious to go back and recall how The Little Schemer dealt with problems of recursion (and the Y Combinator (which we still haven’t got properly to yet, but we will, I promise)).

In Chapter 9 of The Little Schemer (“and Again, and Again, and Again,…"), it starts off by querying the reader if they want caviar and how to find it in a list, and then essentially gets (from caviar and grits) into issues around the halting problem.

Figure 1: Duane Bibby illustration heading Chapter 9 of “The Little Schemer”

But a number of pages into this chapter the book queries of a particular function: “Is this function total?” and presents the following:

(define C
  (lambda (n)
    (cond
     ((one? n) 1)
     (else
      (cond
       ((even? n) (C (÷ n 2)))
       (else (C (add1 (× 3 n)))))))))

Code Snippet 1: a function puzzle from "The Little Schemer" (p.155)

I didn’t know what C was (a number of readers probably have recognised it already, but please don’t spoil it for the others just yet).

But whether we get the point of the function or not,we can implement it in Emacs Lisp, translating from the Scheme into Emacs Lisp (defining an equivalent of even? as an Emacs Lisp evenp first) as:

;; -*- lexical-binding: t; -*-
(defun evenp (n)
  "Returns `t' if `n' is even."
  (if (= (% n 2) 0)
      t
    nil))

(defun C (n)
  "Function C from Little Schemer p.155."
  (cond
   ((= n 1)
    1)
   ((evenp n)
    (C (/ n 2)))
   (t
    (C (1+ (* 3 n))))))

Code Snippet 2: translation of the The Little Schemer 'C' function into Elisp

And we can try running this on various numbers:

(C 1) ; 1
(C 9) ; 1
(C 108) ; 1
(C 837799) ; 1
(C 8400511) ; 1
(C 63728126) ; 1

That is, for all of these numbers it just outputs 1. Well, we can try a couple of other values:

(C 0) ; cl-prin1: (error "Lisp nesting exceeds ‘max-lisp-eval-depth’")
(C 63728127) ; cl-prin1: (error "Lisp nesting exceeds ‘max-lisp-eval-depth’")

So C fails for 0 and numbers above 63728126, with a “excessive lisp nesting”-type issue.

But since the only non-error result we’re getting is 1, we could write a much simpler function.¹

;; -*- lexical-binding: t; -*-
(defun ersatz-C (n)
  "Returns 1 if `n' is 1;
enter an infinite nesting loop
if `n' is 0 or above 63728126;
for any other integer value,
pause briefly for sake of plausibly,
and then return 1."
  (cond
   ((= n 1)
    1)
   ((= n 0)
    (ersatz-C n))
   ((> n 63728126)
    (ersatz-C n))
   (t (progn
        (sleep-for 0.005)
        1))))

Code Snippet 3: A simpler function to return "1"

But obviously function C is doing something more interesting. So, let’s make it actually show us what it’s doing:

;; -*- lexical-binding: t; -*-
(defun C-verbose (n &optional count)
  "For a number `n', return `1' if it's `1'; otherwise,
if it's even, run the same function on half of `n';
else, run the same function on one more than three times
`n'."
  (let ((count (or count 0)))
    (print (format "Step %s: %s" count n))
    (setq count (1+ count))
    (cond
     ((= n 1)
      1)
     ((evenp n)
      (C-verbose (/ n 2) count))
     (t
      (C-verbose (1+ (* 3 n)) count)))))

(C-verbose 7) ; =
;; "Step 0: 7"
;; "Step 1: 22"
;; "Step 2: 11"
;; "Step 3: 34"
;; "Step 4: 17"
;; "Step 5: 52"
;; "Step 6: 26"
;; "Step 7: 13"
;; "Step 8: 40"
;; "Step 9: 20"
;; "Step 10: 10"
;; "Step 11: 5"
;; "Step 12: 16"
;; "Step 13: 8"
;; "Step 14: 4"
;; "Step 15: 2"
;; "Step 16: 1"

(C-verbose 9) ; =
;; "Step 0: 9"
;; "Step 1: 28"
;; "Step 2: 14"
;; "Step 3: 7"
;; "Step 4: 22"
;; "Step 5: 11"
;; "Step 6: 34"
;; "Step 7: 17"
;; "Step 8: 52"
;; "Step 9: 26"
;; "Step 10: 13"
;; "Step 11: 40"
;; "Step 12: 20"
;; "Step 13: 10"
;; "Step 14: 5"
;; "Step 15: 16"
;; "Step 16: 8"
;; "Step 17: 4"
;; "Step 18: 2"
;; "Step 19: 1"

Code Snippet 4: Write `C' to make it show us what it's doing; with some examples of use

Perhaps not surprisingly, we see that setting n=9 makes the function take more steps to reach 1 than setting n=7. But it’s not actually a direct linear relation, because if we set n=10, it’s a much shorter path:

(C-verbose 10) ; =
;; "Step 0: 10"
;; "Step 1: 5"
;; "Step 2: 16"
;; "Step 3: 8"
;; "Step 4: 4"
;; "Step 5: 2"
;; "Step 6: 1"

So we might still wonder about exactly what’s going on with function C, how it works, and what’s the point of it anyway.

Conjectures about simple arithmetic operations and integer pathways #

The Little Schemer does give a clue about what C is, referring to the question of whether it’s a total function or not:

It doesn’t yield a value for 0, but otherwise nobody knows. Thank you, Lothar Collatz (1910–1990).

A bit of research and we can find quite a bit about C: it turns out to be the Collatz conjecture: one of the most famous unsolved problems in mathematics.

The Wikipedia page on the Collatz conjecture is useful here in laying out basics. As well, Eric Roosendaal‘s “On the 3x + 1 problem”. And Quantamagazine’s “The Simple Math Problem We Still Can’t Solve” presents a fairly accessible overview, including recent developments.

There’s some very interesting visualisations of the Collatz Functions paths as a Collatz tree, which is a rather aesthetically-appealling visual object. Here’s Algoritmarte’s 3D live render, based on a discussion from the Numberphile channel (for Numberphile’s original video, done by hand, see the fn):²

Algoritmarte's Collatz Tree visualisation

But Collatz conjecture is an unsolved problem that looks like it should certainly be solvable. It seems quite trivial in some sense: inductive logic would seem to suggest that the same basic patterns would repeat, even if there are more of them, or they are longer, given that every number that’s been tested has converged to 1 (or, more precisely, fallen into the “4-2-1” loop) and so this tempts people in trying to solve it. But Paul Erdős said of it “Mathematics may not be ready for such problems”.³

Figure 2: https://xkcd.com/710/ [see: https://www.explainxkcd.com/wiki/index.php/710:_Collatz_Conjecture ]

Well, we’ll not try to solve it, but rather use it for thinking about issues of recursion and long calculations in Emacs Lisp, and doing some practical engineering-type testing of different methods of dealing with such functions.

One thing we might note, as a sort of aside, is that The Little Schemer‘s C isn’t properly the Collatz function. The conjecture is really about whether the process will ever reach the repeating “4-2-1” loop, because the process of dividing even numbers by 2 and multiple odd numbers and adding 1 to them would actually mean that once we’ve reached 1, we need to multiple it by 3 and add 1, which means we’re back at 4, which is even, so we’ll divide by 2 and get 2, which is even, so we divide by 2 and get 1, which is odd, so we…. And so on.

We can write this “real Collatz” function:

;; -*- lexical-binding: t; -*-
(defun real-collatz (n)
  "For an integer `n', if it's evan, divide it by 2;
if it's odd, multiple it by 3 and add 1."
  (cond
   ((evenp n)
    (real-collatz (/ n 2)))
   (t (real-collatz (1+ (* n 3))))))

Code Snippet 5: the real basic Collatz operations

If we now try to feed it 9, we’ll hit the lisp-nesting error, but if we turn on toggle-debug-on-error, we can inspect what’s going on, and it’s exactly the “4-2-1” loop we predicted:

(real-collatz 9) ; cl-prin1: (error "Lisp nesting exceeds ‘max-lisp-eval-depth’")

;; DEBUG:
....
  real-collatz(1)
  (cond ((evenp n) (real-collatz (/ n 2))) (t (real-collatz (1+ (* n 3)))))
  real-collatz(2)
  (cond ((evenp n) (real-collatz (/ n 2))) (t (real-collatz (1+ (* n 3)))))
  real-collatz(4)
  (cond ((evenp n) (real-collatz (/ n 2))) (t (real-collatz (1+ (* n 3)))))
  real-collatz(1)
  (cond ((evenp n) (real-collatz (/ n 2))) (t (real-collatz (1+ (* n 3)))))
  real-collatz(2)
  (cond ((evenp n) (real-collatz (/ n 2))) (t (real-collatz (1+ (* n 3)))))
  real-collatz(4)
  (cond ((evenp n) (real-collatz (/ n 2))) (t (real-collatz (1+ (* n 3)))))
  real-collatz(1)
  (cond ((evenp n) (real-collatz (/ n 2))) (t (real-collatz (1+ (* n 3)))))
  real-collatz(2)
  (cond ((evenp n) (real-collatz (/ n 2))) (t (real-collatz (1+ (* n 3)))))
  real-collatz(4)
  (cond ((evenp n) (real-collatz (/ n 2))) (t (real-collatz (1+ (* n 3)))))
  real-collatz(1)
  (cond ((evenp n) (real-collatz (/ n 2))) (t (real-collatz (1+ (* n 3)))))
  real-collatz(2)
  (cond ((evenp n) (real-collatz (/ n 2))) (t (real-collatz (1+ (* n 3)))))
  real-collatz(4)
  (cond ((evenp n) (real-collatz (/ n 2))) (t (real-collatz (1+ (* n 3)))))
  real-collatz(1)
  (cond ((evenp n) (real-collatz (/ n 2))) (t (real-collatz (1+ (* n 3)))))
  real-collatz(2)
  (cond ((evenp n) (real-collatz (/ n 2))) (t (real-collatz (1+ (* n 3)))))
  real-collatz(4)
  (cond ((evenp n) (real-collatz (/ n 2))) (t (real-collatz (1+ (* n 3)))))
  real-collatz(1)
  (cond ((evenp n) (real-collatz (/ n 2))) (t (real-collatz (1+ (* n 3)))))
  real-collatz(2)
  (cond ((evenp n) (real-collatz (/ n 2))) (t (real-collatz (1+ (* n 3)))))
  real-collatz(4)
  (cond ((evenp n) (real-collatz (/ n 2))) (t (real-collatz (1+ (* n 3)))))
  real-collatz(1)
  (cond ((evenp n) (real-collatz (/ n 2))) (t (real-collatz (1+ (* n 3)))))
  real-collatz(2)
  (cond ((evenp n) (real-collatz (/ n 2))) (t (real-collatz (1+ (* n 3)))))
  real-collatz(4)
  (cond ((evenp n) (real-collatz (/ n 2))) (t (real-collatz (1+ (* n 3)))))
  real-collatz(8)
  (cond ((evenp n) (real-collatz (/ n 2))) (t (real-collatz (1+ (* n 3)))))
  real-collatz(16)
  (cond ((evenp n) (real-collatz (/ n 2))) (t (real-collatz (1+ (* n 3)))))
  real-collatz(5)
  (cond ((evenp n) (real-collatz (/ n 2))) (t (real-collatz (1+ (* n 3)))))
  real-collatz(10)
  (cond ((evenp n) (real-collatz (/ n 2))) (t (real-collatz (1+ (* n 3)))))
  real-collatz(20)
  (cond ((evenp n) (real-collatz (/ n 2))) (t (real-collatz (1+ (* n 3)))))
  real-collatz(40)
  (cond ((evenp n) (real-collatz (/ n 2))) (t (real-collatz (1+ (* n 3)))))
  real-collatz(13)
  (cond ((evenp n) (real-collatz (/ n 2))) (t (real-collatz (1+ (* n 3)))))
  real-collatz(26)
  (cond ((evenp n) (real-collatz (/ n 2))) (t (real-collatz (1+ (* n 3)))))
  real-collatz(52)
  (cond ((evenp n) (real-collatz (/ n 2))) (t (real-collatz (1+ (* n 3)))))
  real-collatz(17)
  (cond ((evenp n) (real-collatz (/ n 2))) (t (real-collatz (1+ (* n 3)))))
  real-collatz(34)
  (cond ((evenp n) (real-collatz (/ n 2))) (t (real-collatz (1+ (* n 3)))))
  real-collatz(11)
  (cond ((evenp n) (real-collatz (/ n 2))) (t (real-collatz (1+ (* n 3)))))
  real-collatz(22)
  (cond ((evenp n) (real-collatz (/ n 2))) (t (real-collatz (1+ (* n 3)))))
  real-collatz(7)
  (cond ((evenp n) (real-collatz (/ n 2))) (t (real-collatz (1+ (* n 3)))))
  real-collatz(14)
  (cond ((evenp n) (real-collatz (/ n 2))) (t (real-collatz (1+ (* n 3)))))
  real-collatz(28)
  (cond ((evenp n) (real-collatz (/ n 2))) (t (real-collatz (1+ (* n 3)))))
  real-collatz(9)
  eval((real-collatz 9) nil)
  elisp--eval-last-sexp(nil)
.....

We’ll make use of it later on though, somewhat (essentially with the n=1 condition displaced.)

Dealing with long winding paths, through hailstorms #

Ok, so two things. One, let’s not just print things (longer paths are going to make his quite messy), but return a list of the steps our function goes through and a count of them. We do still want to be able to inspect the path patterns. (Wikipedia notes on this “[t]he sequence of numbers involved [in the paths] is sometimes referred to as the hailstone sequence, hailstone numbers or hailstone numerals (because the values are usually subject to multiple descents and ascents like hailstones in a cloud)".)

And, two, let’s use the cl-labels trick from last time to get tail-call optimisation in Emacs.

(And a minor thing: not bother with a separate evenp function; just use the calculation directly.)

;; -*- lexical-binding: t; -*-
(defun collatz-cllabels-tco (n)
  "Calculates the 'Collatz path' for `n',
return a list of all of the steps cons'ed
with count of the steps in the path.

[Uses `cl-labels' to get tail-call optimisation.]"
  (cl-labels ((collatz* (n r)
                  (setq r (cons n r))
                  (cond
                     ((= n 1)
                      r)
                     ((= (% n 2) 0)
                        (collatz* (/ n 2) r))
                     (t
                        (collatz* (1+ (* 3 n)) r)))))
    (let* ((clst (collatz* n nil))
          (clngth (1- (length clst))))
      (cons (nreverse clst) clngth))))

Code Snippet 6: A tail-call optimising implementation of a recursive Collatz function

Here you will note that we’ve got our collatz* function (defined in cl-labels) being self-recursive only properly in tail positions, so it can be tail-call optimised. We start by calling collatz* with arguments n (whatever value we passed for n) and nil, with collatz* taking two arguments, n and r and setting r to the cons of n with r (we’ll pass this value on, collecting the steps), and then implementing the rest of the Collatz function in the same way as previously, except passing along the r value as our collector. At the end, we’ll count the steps and cons the list of the steps themselves, using nreverse to flip the list over (because it’ll have the last results on the “outer” left and the first results on the “inner” right), with the step count.

Some examples:

(collatz-cllabels-tco 9) ; ((9 28 14 7 22 11 34 17 52 26 13 40 20 10 5 16 8 4 2 1) . 20)
                ; path from 9 to 1; 20 steps

(collatz-cllabels-tco 21) ; ((21 64 32 16 8 4 2 1) . 8)
                 ; path from 21 to 1; 8 steps

Excellent, but what about “63728127”?:— the number that tripped us up before:

;; -*- lexical-binding: t; -*-

(collatz-cllabels-tco 63728127) ; it works!

;; ...but the result is quite long, you can try it yourself;
;; use setq to be able to see all of it, e.g.,
;; (setq c-result (collatz-cllabels-tco 63728127)).

;; for, let's just see how many steps it is:
(defun collatz-return-steps (n)
  "Output just the number of steps that
`n' takes to reach 1."
  (format "%s takes %s steps to reach 1"
          n
          (cdr
           (collatz-cllabels-tco n))))


(collatz-return-steps 63728127) ; =
"63728127 takes 949 steps to reach 1"

Code Snippet 7: simple counting of Collatz steps on the path

“Let’s try it with a bigger number!", you say. Well, we can try it on most-positive-fixnum (which is I think probably set to 2305843009213693951 by default) — but remember the size of the number and the number of steps is not a direct linear relation, so you might be disappointed:

(collatz-return-steps most-positive-fixnum) ; =
"2305843009213693951 takes 860 steps to reach 1"

This is less than for 63728127.

Ok, so let’s make something really big then. 2^1000 + 1.

(collatz-return-steps (1+ (expt 2 1000))) ; =
"10715086071862673209484250490600018105614048117055336074437503883703510511249361224931983788156958581275946729175531468251871452856923140435984577574698574803934567774824230985421074605062371141877954182153046474983581941267398767559165543946077062914571196477686542167660429831652624386837205668069377 takes 7248 steps to reach 1"

It takes 7,248 steps. What about 2^10000 + 1?

(collatz-return-steps (1+ (expt 2 10000))) ; =
"19950631168807583848837421626835850838234968318861924548520089498529438830221946631919961684036194597899331129423209124271556491349413781117593785932096323957855730046793794526765246551266059895520550086918193311542508608460618104685509074866089624888090489894838009253941633257850621568309473902556912388065225096643874441046759871626985453222868538161694315775629640762836880760732228535091641476183956381458969463899410840960536267821064621427333394036525565649530603142680234969400335934316651459297773279665775606172582031407994198179607378245683762280037302885487251900834464581454650557929601414833921615734588139257095379769119277800826957735674444123062018757836325502728323789270710373802866393031428133241401624195671690574061419654342324638801248856147305207431992259611796250130992860241708340807605932320161268492288496255841312844061536738951487114256315111089745514203313820202931640957596464756010405845841566072044962867016515061920631004186422275908670900574606417856951911456055068251250406007519842261898059237118054444788072906395242548339221982707404473162376760846613033778706039803413197133493654622700563169937455508241780972810983291314403571877524768509857276937926433221599399876886660808368837838027643282775172273657572744784112294389733810861607423253291974813120197604178281965697475898164531258434135959862784130128185406283476649088690521047580882615823961985770122407044330583075869039319604603404973156583208672105913300903752823415539745394397715257455290510212310947321610753474825740775273986348298498340756937955646638621874569499279016572103701364433135817214311791398222983845847334440270964182851005072927748364550578634501100852987812389473928699540834346158807043959118985815145779177143619698728131459483783202081474982171858011389071228250905826817436220577475921417653715687725614904582904992461028630081535583308130101987675856234343538955409175623400844887526162643568648833519463720377293240094456246923254350400678027273837755376406726898636241037491410966718557050759098100246789880178271925953381282421954028302759408448955014676668389697996886241636313376393903373455801407636741877711055384225739499110186468219696581651485130494222369947714763069155468217682876200362777257723781365331611196811280792669481887201298643660768551639860534602297871557517947385246369446923087894265948217008051120322365496288169035739121368338393591756418733850510970271613915439590991598154654417336311656936031122249937969999226781732358023111862644575299135758175008199839236284615249881088960232244362173771618086357015468484058622329792853875623486556440536962622018963571028812361567512543338303270029097668650568557157505516727518899194129711337690149916181315171544007728650573189557450920330185304847113818315407324053319038462084036421763703911550639789000742853672196280903477974533320468368795868580237952218629120080742819551317948157624448298518461509704888027274721574688131594750409732115080498190455803416826949787141316063210686391511681774304792596709377 takes 72378 steps to reach 1"

72,378 steps.

Can we go one bigger?

(collatz-return-steps (1+ (expt 2 100000)))
;; Debugger entered--Lisp error: (overflow-error)

No. But it’s because Emacs has maximum-integer-width set to 65536 (so a number can only be 65,536 digits long). We can set it to something bigger, and it’ll work.

(let ((integer-width 99999999))
  (collatz-return-steps (1+ (expt 2 100000)))) ; =

;; I'll spare you the number `n' itself as it's realy quite long,
;; but that number takes 717,858 steps to reach 1.

You’ll note, if you’ve been evaluating along in your own Emacs that these results have been nearly instantaneous so far. This one takes a bit longer.

(I subsequently tried doing collatz-cllabels-tco on 2^1000000 + 1, but with only 16Gb in my machine, Linux’s OOM-killer killed Emacs before it reached the end. I should probably try setting up some sort of swap better.)

Streaming though the hail #

We can also try the stream package we looked at before, which creates “lazy” conses, of potentially infinite sequences, which can be evaluated one piece at a time, as needed.

(defun collatz-stream (n)
  "For a number `n', return `1' if it's `1'; otherwise,
if it's even, run the same function on half of `n';
else, run the same function on one more than three times
`n'.
(Also, collect each step into a list and, when `n' is `1',
print the list.)
[Uses the `stream' package for lazy-evaluated conses.]"
  (cl-labels ((collatz* (n)
                (stream-cons n
                             ;; note that this is the `real' collatz func
                             ;; no checking for 1 here; we do it below
                              (cond
                               ((= (% n 2) 0)
                                (collatz* (/ n 2)))
                               (t (collatz* (1+ (* 3 n))))))))
    (let ((clst (collatz* n))
          (last 0)
          (out nil))
      (while (not (= last 1)) ;; stop when we reach 1
        (setq last (stream-pop clst))
        (setq out (cons last out)))
      (let* ((steps (1- (length out)))
            (rlist (nreverse out)))
        (cons rlist steps)))))

Code Snippet 8: Using lazy cons sequences in "stream" for Collatz calculations

This, however, turns out to in fact be much slower than our cl-labels tco recursion.

In fact, we can quantify this a bit further using Adam Porter‘s bench-multi macro:

(bench-multi
  :times 5
  :forms (("collatz-stream" (let ((integer-width 99999999))
            (collatz-stream (1+ (expt 2 10000)))))
          ("collatz-cllabels-tco" (let ((integer-width 99999999))
            (collatz-cllabels-tco (1+ (expt 2 10000)))))))

Code Snippet 9: bench-marking cl-labels vs stream collatz functions

Here the forms are ranked in order by fastest first; the second column says how much slower the form is compared against the fastest; then there are total runtime (for all iterations; we did here 5 for each); and total garbage collections performed by Emacs during those tests; and the total amount of time the garbage collection tasks themselves took.

Form	x fastest	Total runtime	# of GCs	Total GC runtime
collatz-cllabels-tco	fastest	1.310546	2	0.484381
collatz-stream	12.72	16.665472	45	11.808777

I’m not quite sure why collatz-stream is this much slower than collatz-cllabels-tco is the case. My guess is that because of the delayed/lazy evaluation, we aren’t able to get real reduction of stack frames because we’re only evaluating one piece/instance of the Collatz function on each stream-pop, and there’s no advantage to using cl-labels here really.

The stream method certainly triggers more garbage collection, which is part of it, though I’m not entirely sure why it should involve triggering more garbage collection. (I suppose we could play with garbage collection settings in emacs and see if we can deal garbage collection and make the stream method better.)

Caught in loops still? #

But what about a plain looping method for the Collatz function. We generally saw with the Fibonacci series that loops seemed more efficient.

We can try with both plain while loops and also cl-loops:

(defun collatz-while-loop (n)
  "Collatz function using emacs `while' loop."
  (let ((m n)
        (clst (list n)))
    (while (not (= m 1))
      (setq m
            (cond
             ((= (% m 2) 0)
              (/ m 2))
             (t (1+ (* 3 m)))))
      (setq clst (cons m clst)))
          (let* ((steps (1- (length clst)))
            (rlist (nreverse clst)))
            (cons rlist steps))))

(defun collatz-cl-loop (n)
  "Collatz function using `cl-loop' with
`while' clause condition."
  (let ((m n)
        (clst (list n)))
    (cl-loop while (not (= m 1))
             do (progn
                  (setq m
                        (cond
                         ((= (% m 2) 0)
                          (/ m 2))
                         (t (1+ (* 3 m)))))
                  (setq clst (cons m clst)))
             (let* ((steps (1- (length clst)))
                    (rlist (nreverse clst)))
               (cons rlist steps))))

Code Snippet 10: looping methods for the Collatz function

And now comparing all of the methods:

(bench-multi
  :times 5
  :forms (("collatz-stream"
           (let ((integer-width 99999999))
             (collatz-stream (1+ (expt 2 10000)))))
          ("collatz-cllabels-tco"
           (let ((integer-width 99999999))
             (collatz-cllabels-tco (1+ (expt 2 10000)))))
          ("collatz-while-loop"
           (let ((integer-width 99999999))
             (collatz-while-loop (1+ (expt 2 10000)))))
          ("collatz-cl-loop"
           (let ((integer-width 99999999))
             (collatz-cl-loop (1+ (expt 2 10000)))))))

Code Snippet 11: bench-marking our Collatz function implementations

We find the following results:

Form	x fastest	Total runtime	# of GCs	Total GC runtime
collatz-while-loop	fastest	0.876315	1	0.240661
collatz-cl-loop	1.02	0.896261	1	0.235273
collatz-cllabels-tco	1.43	1.255294	2	0.449736
collatz-stream	18.77	16.446478	48	11.677753

Both types of loops turn out faster than even collatz-cllabels-tco, with the plain emacs while loop method seeming slightly faster than the cl-loop method, but probably within margin of error (or maybe just a bit of extra machine in cl-loop which loses a small amount of efficiency.)

For our TCO tail-recurive cll maybe it’s the additional garbage collection that slows down collatz-cllabels-tco.

What if we just look at the two loop methods, and run more trials?

(bench-multi
  :times 100
  :forms (("collatz-while-loop"
           (let ((integer-width 99999999))
             (collatz-while-loop (1+ (expt 2 10000)))))
          ("collatz-cl-loop"
           (let ((integer-width 99999999))
             (collatz-cl-loop (1+ (expt 2 10000)))))))

Code Snippet 12: bench-marking loop methods of implementing the Collatz function

Our results are:

Form	x fastest	Total runtime	# of GCs	Total GC runtime
collatz-while-loop	fastest	17.584334	20	4.851656
collatz-cl-loop	1.03	18.027418	20	5.070158

Still the plain while is slightly faster than cl-loop.

An attempt at promoting our recursive tco function #

I’d really like to somehow improve our recursive TCO collatz function.

What if we pull the collector r out in a higher-scoping let and so don’t have to pass it into collatz* each time; thus:

(defun collatz-cllabels-tco-improved (n)
  "Calculates the 'Collatz path' for `n',
return a list of all of the steps cons'ed
with count of the steps in the path.

[Uses `cl-labels' to get tail-call optimisation.]"
  (let ((r nil))
    (cl-labels ((collatz* (n)
                    (setq r (cons n r))
                    (cond
                       ((= n 1)
                        r)
                       ((= (% n 2) 0)
                          (collatz* (/ n 2)))
                       (t
                          (collatz* (1+ (* 3 n)))))))
      (let* ((clst (collatz* n))
            (clngth (length clst)))
        (cons (nreverse clst) clngth)))))

Code Snippet 13: trying to eke out more performance for TCO cl-labels recursive Collatz function

(bench-multi
  :times 10
  :forms (("collatz-cllabels-tco"
           (let ((integer-width 99999999))
             (collatz-cllabels-tco (1+ (expt 2 10000)))))
          ("collatz-cllabels-tco-improved"
           (let ((integer-width 99999999))
             (collatz-cllabels-tco-improved (1+ (expt 2 10000)))))))

Code Snippet 14: bench-marking the two cl-labels recursive Collatz functions

Here are the results:

Form	x fastest	Total runtime	# of GCs	Total GC runtime
collatz-cllabels-tco-improved	fastest	2.198572	3	0.719563
collatz-cllabels-tco	1.17	2.579470	4	0.923288

We do seem to eke out a bit more performance.

But compared to the loops, it’s still not as performant:

(bench-multi
  :times 10
  :forms (("collatz-cllabels-tco"
           (let ((integer-width 99999999))
             (collatz-cllabels-tco (1+ (expt 2 10000)))))
          ("collatz-cllabels-tco-improved"
           (let ((integer-width 99999999))
             (collatz-cllabels-tco-improved (1+ (expt 2 10000)))))
          ("collatz-while-loop"
           (let ((integer-width 99999999))
             (collatz-while-loop (1+ (expt 2 10000)))))
          ("collatz-cl-loop"
           (let ((integer-width 99999999))
             (collatz-cl-loop (1+ (expt 2 10000)))))))

Code Snippet 15: benchmarking two cl-labels and two loop Collatz implementations

We seem to have done a bit better. Now collatz-cllabels-tco-improved is only 1.26x slower than the fastest, collatz-while-loop. We save an extra gc over the older collatz-cllabels-tco, but still have to run an additional one compared to the two loop functions.

Form	x fastest	Total runtime	# of GCs	Total GC runtime
collatz-while-loop	fastest	1.754987	2	0.486154
collatz-cl-loop	1.01	1.769489	2	0.486728
collatz-cllabels-tco-improved	1.26	2.208662	3	0.732637
collatz-cllabels-tco	1.44	2.528099	4	0.899793

Putting all of the methods we’ve come up with together and running more trials:

(bench-multi
  :times 100
  :forms (("collatz-stream"
           (let ((integer-width 99999999))
             (collatz-stream (1+ (expt 2 10000)))))
          ("collatz-cllabels-tco"
           (let ((integer-width 99999999))
             (collatz-cllabels-tco (1+ (expt 2 10000)))))
          ("collatz-cllabels-tco-improved"
           (let ((integer-width 99999999))
             (collatz-cllabels-tco-improved (1+ (expt 2 10000)))))
          ("collatz-while-loop"
           (let ((integer-width 99999999))
             (collatz-while-loop (1+ (expt 2 10000)))))
          ("collatz-cl-loop"
           (let ((integer-width 99999999))
             (collatz-cl-loop (1+ (expt 2 10000)))))))

Code Snippet 16: full benchmark of all of our Collatz implementations

Here, interestingly, we end up with collatz-cl-loop slightly edging out collatz-while-loop. And our favourite (well, my favourite) collatz-cllabels-tco-improved function doesn’t do too poorly: it’s only x1.28 slower than the fastest (collatz-cl-loop); and even the old collatz-cllabels-tco with additional argument passing is only x1.53 slower

Form	x fastest	Total runtime	# of GCs	Total GC runtime
collatz-cl-loop	fastest	17.733980	19	4.817641
collatz-while-loop	1.01	17.850171	20	4.977408
collatz-cllabels-tco-improved	1.28	22.736898	33	7.961263
collatz-cllabels-tco	1.53	27.146827	47	10.735285
collatz-stream	18.97	336.443089	961	240.167718

The collatz-stream function remains significantly slower, x19 so.

There is an obvious correlation between runtime and garbage collection, (19 vs 20 vs 33 vs 47 vs 961), so again possibly tweaking garbage collection could change things. Although, if we get rid of the garbage collection runtime differences we end up with:

Form	Total runtime w/o GC	x fastest
collatz-while-loop	12.872762999999999	fastest
collatz-cl-loop	12.916338999999999	1.003
collatz-cllabels-tco-improved	14.775635000000001	1.148
collatz-cllabels-tco	16.411541999999997	1.275
collatz-stream	96.27537099999998	7.479

Now collatz-while-loop is again indeed actually the fastest, but still just barely more so than collatz-cl-loop; both recursive TCO’ing cl-labels aren’t actually too much slower than the loops, though the improved version is indeed better; and collatz-stream is still quite slow in comparison. So garbage collection is part of the difference, but not the only one, especially for the streams method.

Escaping (at least some) loops: out of recursion and into… different recursion (next time) #

This was really an excursus (on an excursus). And, next time — really — we’ll deal with the Y Combinator. I know this one really wasn’t a lambda calculus post at all, but an excursion on our previous excursion into recursion, here mostly practical considerations for Emacs. But recursion is crucial for the upcoming discussion of the Y Combinator, and paradoxes (and their uses). And that’ll be interesting for lambda calculus, and its connections with Lisp.

And, after we’re through with Y Combinators and Fibonacci sequences and barbers who shave everyone who does’t shave themselves, and how these things all tie together, we can get to something I’ve been working on for a while:— given that, connections aside, Lisp isn’t lambda calculus: can we come up with a good implementation of lambda calculus in Lisp?

(And specifically in Emacs Lisp: which is maybe one of the Lisps least naturally suited for this: a Scheme or Clojure would be a better choice really. But it’s more fun to try to get it to work in Elisp, with all of the additional challenges it presents.)

(I’m perhaps very slightly giving the game away; or, at least providing a clue in the naming of it if you can think of what C-words “ersatz” rhymes with.) ↩︎
See The Collatz Tree | Algoritmarte. And the video it’s based on https://www.youtube.com/watch?v=LqKpkdRRLZw
↩︎
The Quantamagazine article notes:

Do not try to solve this math problem.

You will be tempted. This problem is simply stated, easily understood, and all too inviting. Just pick a number, any number: If the number is even, cut it in half; if it’s odd, triple it and add 1. Take that new number and repeat the process, again and again. If you keep this up, you’ll eventually get stuck in a loop. At least, that’s what we think will happen.

Take 10 for example: 10 is even, so we cut it in half to get 5. Since 5 is odd, we triple it and add 1. Now we have 16, which is even, so we halve it to get 8, then halve that to get 4, then halve it again to get 2, and once more to get 1. Since 1 is odd, we triple it and add 1. Now we’re back at 4, and we know where this goes: 4 goes to 2 which goes to 1 which goes to 4, and so on. We’re stuck in a loop.

Or try 11: It’s odd, so we triple it and add 1. Now we have 34, which is even, so we halve it to get 17, triple that and add 1 to get 52, halve that to get 26 and again to get 13, triple that and add 1 to get 40, halve that to get 20, then 10, then 5, triple that and add 1 to get 16, and halve that to get 8, then 4, 2 and 1. And we’re stuck in the loop again.

The infamous Collatz conjecture says that if you start with any positive integer, you’ll always end up in this loop. And you’ll probably ignore my warning about trying to solve it: It just seems too simple and too orderly to resist understanding. In fact, it would be hard to find a mathematician who hasn’t played around with this problem.

I couldn’t ignore it when I first learned of it in school. My friends and I spent days trading thrilling insights that never seemed to get us any closer to an answer. But the Collatz conjecture is infamous for a reason: Even though every number that’s ever been tried ends up in that loop, we’re still not sure it’s always true. Despite all the attention, it’s still just a conjecture.

….

It’s easy to verify that the Collatz conjecture is true for any particular number: Just compute the orbit until you arrive at 1. But to see why it’s hard to prove for every number, let’s explore a slightly simpler function, ℊ.

g(n)= {n/2n+1, if n is even

{n+1, if n is odd

We might conjecture that orbits under ℊ always get to 1. I’ll call this the “Nollatz” conjecture, but we could also call it the n + 1 conjecture. We could explore this by testing more orbits, but knowing something is true for a bunch of numbers — even 268 of them — isn’t a proof that it’s true for every number. Fortunately, the Nollatz conjecture can actually be proved. Here’s how.

First, we know that half of a positive integer is always less than the integer itself. So if n is even and positive, then ℊ(n) = n/2 < n. In other words, when an orbit reaches an even number, the next number will always be smaller. …. This tells us that when an orbit under ℊ reaches an odd number greater than 1, we’ll always be at a smaller number two steps later. And now we can outline a proof of the Nollatz conjecture: Anywhere in our orbit, whether at an even or an odd number, we’ll trend downward. The only exception is when we hit 1 at the bottom of our descent. But once we hit 1 we’re trapped the loop, just as we conjectured. … [For f, the Collatz conjencture,] as with ℊ, applying f to an even number makes it smaller. And as with ℊ, applying f to an odd number produces an even number, which means we know what happens next: f will cut the new number in half.

But here’s where our argument falls apart. Unlike our example above, this number is bigger than n: 3n+12 = 3n2 + 12, and 3n2 = 1.5n, which is always bigger than n. The key to our proof of the Nollatz conjecture was that an odd number must end up smaller two steps later, but this isn’t true in the Collatz case. Our argument won’t work.

↩︎