Last weekend I had a conversation with an undergraduate student new to computer science, who was reading CLRS. “I wish,” they said, “that all the pseudocode in my algorithms textbook was just written in Python.” “Ah,” I said, “but textbook authors sometimes want their work to endure beyond a decade.” “But Python’s been around for a long time,” came the reply, “and it’s very readable, and you can’t execute pseudocode anyway, so what’s the harm?”
It’s true that Python has been around for three decades and popular for at least two of them, although post-2000 non-pseudocode-based textbooks generally went with Java instead. I do feel that tying algorithms - these immortal mathematical objects - to languages relevant within a half-century sliding window cheapens their presentation somehow. I’m old enough to have lived through the waning of several once-popular languages. Pseudocode is timeless! Its lack of standardization & executability is arguably a problem; this presumably is why Donald Knuth defined his own assembly language running on an imaginary computer architecture to use in The Art of Computer Programming.
But what if pseudocode could itself be made executable? Not only that, what if formal proofs could be written about its correctness that connect directly to the definitions in the pseudocode itself? This was the basic idea behind PlusCal, a pseudocode-like language developed by Leslie Lamport that transpiles to TLA⁺. PlusCal sees most of its use in distributed & concurrent systems, but I wanted to see how it handled pseudocode of the sort found in CLRS or published papers. I wanted to know:
- Do algorithms written in PlusCal look like their pseudocode counterparts?
- Can we execute the algorithms written in PlusCal without many alterations from the form most resembling the original pseudocode? Can we easily find the bugs?
- How difficult is it to write formal proofs of correctness for PlusCal pseudocode?
- How readable are those proofs compared to the informal proof given in the paper?
I also wanted to see how Python performed on the first two questions. Let’s find out!
The Algorithm
For our case study we’re going to stretch back in time to a paper published in 1980, now 43 years old - 10 years older than the first edition of CLRS. The paper is Lexicographically Least Circular Substrings by Kellogg S. Booth, now-retired Professor Emeritus at the Department of Computer Science at the University of British Columbia. I knew this paper from my time in competitive programming during undergrad. Specifically I knew the pseudocode given in the paper was wrong; the corrected version of the code was disseminated within the competitive programming community. There’s an interesting aside here about technical correctness vs. general correctness in published results; the algorithm in the paper is literally wrong, even though it comes with a proof of correctness, but the paper is still a good paper because the fundamental insight driving the algorithm is correct, and the technical problems can be fixed. So if you hear a statistic like “X% of math/computer science papers contain significant technical errors” that is almost certainly true, but those errors aren’t necessarily fatal. These are the sort of errors that are recorded via publication of errata, not retraction. Actually there is an errata page for this very paper, but the amended algorithm is still not correct!
So what does the algorithm actually do?
Given a string, the algorithm outputs the rotation of the string that comes first alphabetically - the lexicographically-least circular substring.
For example, the string "abca" has the rotation "aabc", which is first alphabetically among its four possible rotations:
"abca"
"bcaa"
"caab"
"aabc"
The algorithm outputs a non-negative number denoting the number of times the input string must be rotated to the left to reach its lexicographically-minimal rotation. For "abca", this is 3. Some strings, like "baba", have multiple lexicographically-minimal rotations (1 and 3); in these cases, the algorithm outputs the smallest rotation (1). The algorithm outputs 0 when given strings of length zero or one.
Given a string of length \(n\), the algorithm has \(O(n)\) time & memory complexity.
The naive algorithm of generating every possible string rotation and comparing them has \(O(n^2)\) time complexity.
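The naive approach is easy to sanity-check in a few lines of Python (the same brute-force idea serves as a test oracle later in this post):

```python
# Enumerate all rotations of "abca" the brute-force way and confirm that
# "aabc" is alphabetically first, reached by rotating left 3 times.
s = "abca"
rotations = [s[i:] + s[:i] for i in range(len(s))]
print(rotations)                        # ['abca', 'bcaa', 'caab', 'aabc']
print(rotations.index(min(rotations)))  # 3
```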
Where is this algorithm used? Beyond contrived programming competition problems, perhaps you have a database of circular DNA strings and you want to store them in normalized form for easier searching. The problem also crops up when normalizing cycles generally, for example cycles of a graph. It is also found in various places in mathematics.
What is the actual algorithm? Just to set expectations, it is short but very difficult to understand. It is a modification of the famously-inscrutable pre-process step of the Knuth-Morris-Pratt string search algorithm. Thankfully for the purpose of this post you don’t need to understand why the algorithm actually works. Here is the paper’s pseudocode transcribed verbatim; the input string’s characters are denoted by \(b_k\), while \(n\) denotes the string’s length. \(f\) is an array in which we store some scratch-work. Behold a labyrinth of index arithmetic!
$$ \begin{aligned} &\text{integer procedure } LCS(b_0 \ldots b_{n-1})\text{:} \cr &\quad\text{begin} \cr 1.&\quad\quad f(0) \gets nil; \cr 2.&\quad\quad k \gets 0; \cr 3.&\quad\quad \text{for } j \gets 1 \text{ to } 2n-1 \text{ do} \cr &\quad\quad\quad \text{begin} \cr 4.&\quad\quad\quad\quad \text{if } j - k \geq n \text{ then return } k; \cr 5.&\quad\quad\quad\quad i \gets f(j - k - 1); \cr 6.&\quad\quad\quad\quad \text{while } b_{j \pmod{n}} \neq b_{k+i+1 \pmod{n}} \text{ and } i \neq nil \text{ do} \cr &\quad\quad\quad\quad\quad \text{begin} \cr 7.&\quad\quad\quad\quad\quad\quad \text{if } b_{j \pmod{n}} < b_{k+i+1 \pmod{n}} \text{ then} \cr 8.&\quad\quad\quad\quad\quad\quad\quad k \gets j - i - 1; \cr 9.&\quad\quad\quad\quad\quad\quad i \gets f(i); \cr &\quad\quad\quad\quad\quad \text{end} \cr 10.&\quad\quad\quad\quad \text{if } b_{j \pmod{n}} \neq b_{k+i+1 \pmod{n}} \text{ and } i = nil \text{ then} \cr &\quad\quad\quad\quad\quad \text{begin} \cr 11.&\quad\quad\quad\quad\quad\quad \text{if } b_{j \pmod{n}} < b_{k+i+1 \pmod{n}} \text{ then} \cr 12.&\quad\quad\quad\quad\quad\quad\quad k \gets j; \cr 13.&\quad\quad\quad\quad\quad\quad f(j - k) \gets nil; \cr &\quad\quad\quad\quad\quad \text{end} \cr &\quad\quad\quad\quad \text{else} \cr 14.&\quad\quad\quad\quad\quad f(j - k) \gets i + 1; \cr &\quad\quad\quad \text{end} \cr &\quad \text{end} \cr \end{aligned} $$
The algorithm has a nested loop. Naively you’d expect this to mean it has \(O(n^2)\) time complexity instead of \(O(n)\). Not so!
Translation
Reading the pseudocode, a few things pop out:
- The pseudocode is inconsistent in using begin/end statements versus indentation to encapsulate code blocks; in the original paper I believe this is due to column width constraints, but one has to be careful to write the instructions in the correct block.
- There is only one return statement: the one in instruction 4. If the loop actually terminates, the return value is undefined; perhaps the loop is guaranteed to return in instruction 4 before terminating, but most programming languages can’t be convinced of this and would refuse to compile such code.
- Mixing in the value \(nil\) with integer values in function \(f\); will this cause problems?
- Initialization: how many values does \(f\) need to store? What are the values of \(f\) initialized to? What is the initial value of variable \(i\)?
These are all very typical issues when translating pseudocode. We handle them as follows:
- Check your code, then check it again. No really! I’ve yet to translate this algorithm without making at least one mistake.
- This is left out of the pseudocode for some reason, but reading the rest of the paper it’s made clear that \(k\) is the final result value we’re looking for. So the end of the pseudocode really should look like: $$ \begin{aligned} &\quad\quad\quad \text{end} \cr 15.&\quad\quad \text{return } k; \cr &\quad \text{end} \cr \end{aligned} $$
- The value \(nil\) does indeed cause problems. If you trace execution for a simple string like "ab", you’ll see that in instruction 6 the value \(b_{k+i+1 \pmod{n}}\) is calculated when \(i\) has value \(nil\). Even short-circuiting that calculation by checking \(i \neq nil\) first only shifts the problem to instruction 10. The solution, somewhat inscrutably, is to make \(nil\) have the value \(-1\).
- Some experimentation shows \(f\) needs to be initialized with \(2n\) values, all set to \(-1\). Variable \(i\) is contained within the scope of the for loop, and is initialized in instruction 5.
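To see concretely why \(-1\) works as \(nil\), note what happens to the index arithmetic; a two-line Python check (the concrete values of n and k below are arbitrary):

```python
# With i = -1 standing in for nil, the index (k + i + 1) % n collapses to
# k % n, so instruction 6's guard compares b[j % n] against b[k % n] --
# a well-defined comparison, where a true nil would crash the index math.
n, k = 5, 3  # arbitrary example values
i = -1       # nil
assert (k + i + 1) % n == k % n
```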
So quite a bit of elbow grease is necessary to understand how a translated program would look.
Python
Here’s a straightforward Python translation:
def lcs(b):
    n = len(b)
    f = [-1] * (2 * n)
    k = 0
    for j in range(1, 2 * n):
        if j - k >= n:
            return k
        i = f[j - k - 1]
        while b[j % n] != b[(k + i + 1) % n] and i != -1:
            if b[j % n] < b[(k + i + 1) % n]:
                k = j - i - 1
            i = f[i]
        if b[j % n] != b[(k + i + 1) % n] and i == -1:
            if b[j % n] < b[(k + i + 1) % n]:
                k = j
            f[j - k] = -1
        else:
            f[j - k] = i + 1
    return k
It actually looks pretty similar to the pseudocode! Very readable. The code seems to give correct results without any crashes on a few test strings.
PlusCal
If you get into string algorithms, you quickly learn to dread off-by-one errors. This is compounded by about half the original sources using 0-indexed strings and the other half using 1-indexed strings; it is further compounded because most programming languages have 0-indexed strings & arrays, except for a few like Fortran and… PlusCal & TLA⁺. Fortunately the immense flexibility of PlusCal & TLA⁺ makes defining our own 0-indexed string type simple, to match the 0-indexed presentation of our chosen algorithm. I’ll omit that code below for simplicity. Here’s the PlusCal translation I came up with:
--algorithm LeastCircularSubstring
variables
  b \in Corpus;
  n = ZLen(b);
  f = [index \in 0..2*n |-> nil];
  i = nil;
  j = 1;
  k = 0;
define
  Corpus == ZSeq(CharacterSet)
  nil == -1
end define;
begin
L3: while j < 2 * n do
L4:   if j - k >= n then goto Done; end if;
L5:   i := f[j - k - 1];
L6:   while b[j % n] /= b[(k + i + 1) % n] /\ i /= nil do
L7:     if b[j % n] < b[(k + i + 1) % n] then
L8:       k := j - i - 1;
        end if;
L9:     i := f[i];
      end while;
L10:  if b[j % n] /= b[(k + i + 1) % n] /\ i = nil then
L11:    if b[j % n] < b[(k + i + 1) % n] then
L12:      k := j;
        end if;
L13:    f[j - k] := nil;
      else
L14:    f[j - k] := i + 1;
      end if;
LVR:  j := j + 1;
    end while;
end algorithm;
Very similar to the pseudocode!
It looks even better typeset; TLA⁺ has a command to translate specs to \(\LaTeX\) output. You can see the result of that as a PDF here. I think it would have been neat to typeset the % n operator as \(\pmod{n}\), but it still looks nice.
A few things to note about the PlusCal code:
- The operators ZLen(s) and ZSeq(S) are defined in a ZSequences module I wrote for zero-indexed sequences; ZSeq(S) is the set of all strings over the character set S. The line b \in Corpus nondeterministically initializes b as an arbitrary element of this set. CharacterSet is an input to this spec.
- In PlusCal you can label statements to refer to them in later proofs or invariants; I have set these to correspond to the pseudocode instruction numbers. The Done label is a built-in label denoting the program state upon termination.
- PlusCal doesn’t have a notion of variable scope; all variables are global. It also doesn’t have for-loops, only while-loops. We have an extra instruction, labeled LVR for loop variant, that increments j at the bottom of the loop.
- This algorithm is written without return statements; similar to the pseudocode, we just look at the value of k after the algorithm has terminated.
There’s one more weird thing about PlusCal. I mentioned it transpiles to TLA⁺; it actually does this inside the source file itself. A block is set aside below this code for the corresponding TLA⁺ code to be emitted. I used to really dislike this. Not only does it involve generated code, but generated code inside a human-written source file!? This turns out to make sense when you’re writing proofs and invariants over this code in TLA⁺ itself. I’d be interested in alternatives, though.
Execution & Testing
Python
The Python code executes as-is, so let’s find those bugs! We could dream up a bunch of tricky strings, but we live in the modern era so let’s act like modern people and use property-based testing. We will generate a ton of random strings, get the output of our algorithm, and compare it to the naive “check every rotation” approach. Here’s that naive approach in Python:
def naive_lcs(s):
    n = len(s)
    if n == 0:
        return 0
    rotations = [s[i:] + s[:i] for i in range(n)]
    least_rotation = min(rotations)
    return rotations.index(least_rotation)
We can then use Hypothesis to auto-generate strings from a very simple alphabet, just "0" and "1":
from hypothesis import given
from hypothesis.strategies import text, sampled_from

@given(text(sampled_from("01")))
def test_lcs(b):
    assert lcs(b) == naive_lcs(b)
Running pytest, we very quickly find a bug! The algorithm incorrectly handles the string "010", returning 0 when it should return 2.
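To reproduce the failure without the Hypothesis harness, here are both functions restated from above with the failing input (nothing new, just the same code made self-contained):

```python
# The buggy translation returns 0 for "010", while the brute-force
# oracle correctly returns 2 (rotate left twice to get "001").
def lcs(b):
    n = len(b)
    f = [-1] * (2 * n)
    k = 0
    for j in range(1, 2 * n):
        if j - k >= n:  # the faulty early return
            return k
        i = f[j - k - 1]
        while b[j % n] != b[(k + i + 1) % n] and i != -1:
            if b[j % n] < b[(k + i + 1) % n]:
                k = j - i - 1
            i = f[i]
        if b[j % n] != b[(k + i + 1) % n] and i == -1:
            if b[j % n] < b[(k + i + 1) % n]:
                k = j
            f[j - k] = -1
        else:
            f[j - k] = i + 1
    return k

def naive_lcs(s):
    n = len(s)
    if n == 0:
        return 0
    rotations = [s[i:] + s[:i] for i in range(n)]
    return rotations.index(min(rotations))

print(lcs("010"), naive_lcs("010"))  # 0 2
```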
Here we can defer to the published errata, which says to modify instruction 4, the early return statement, from:
$$
\begin{aligned}
4.&\quad\quad\quad\quad \text{if } j - k \geq n \text{ then return } k; \cr
\end{aligned}
$$
to:
$$
\begin{aligned}
4.&\quad\quad\quad\quad \text{if } j - k > n \text{ then return } k; \cr
\end{aligned}
$$
So using \(>\) instead of \(\geq\).
Let’s test it out!
Unfortunately after this fix Hypothesis immediately finds another failing string: "0010", where we get 0 when we should get 3.
The errata says that the early return in instruction 4 is only a “shortcut” and isn’t necessary for the correctness of the algorithm - let’s try removing it to see if that helps:
def no_early_return_lcs(b):
    n = len(b)
    f = [-1] * (2 * n)
    k = 0
    for j in range(1, 2 * n):
        i = f[j - k - 1]
        while b[j % n] != b[(k + i + 1) % n] and i != -1:
            if b[j % n] < b[(k + i + 1) % n]:
                k = j - i - 1
            i = f[i]
        if b[j % n] != b[(k + i + 1) % n] and i == -1:
            if b[j % n] < b[(k + i + 1) % n]:
                k = j
            f[j - k] = -1
        else:
            f[j - k] = i + 1
    return k
The Hypothesis test passes! Maybe it just isn’t testing enough inputs though, so let’s pump it up a bit and expand the alphabet:
from hypothesis import given, settings
from hypothesis.strategies import text, sampled_from

@settings(max_examples=10000)
@given(text(sampled_from("abc")))
def test_lcs(b):
    assert no_early_return_lcs(b) == naive_lcs(b)
The test runs for about twenty seconds without finding a failing example.
Indeed, no_early_return_lcs matches the bugfixed code that was disseminated within the competitive programming community. This code is also available on the Wikipedia article; I have emailed Dr. Booth to see whether he will add to the errata.
A classic case of an optimization compromising the correctness of an algorithm or system!
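To wrap up the Python side, here is the corrected translation restated once more as a self-contained snippet, spot-checked against the answers established earlier in this post:

```python
# Corrected translation (early return removed), with the worked answers
# from this post asserted directly.
def no_early_return_lcs(b):
    n = len(b)
    f = [-1] * (2 * n)
    k = 0
    for j in range(1, 2 * n):
        i = f[j - k - 1]
        while b[j % n] != b[(k + i + 1) % n] and i != -1:
            if b[j % n] < b[(k + i + 1) % n]:
                k = j - i - 1
            i = f[i]
        if b[j % n] != b[(k + i + 1) % n] and i == -1:
            if b[j % n] < b[(k + i + 1) % n]:
                k = j
            f[j - k] = -1
        else:
            f[j - k] = i + 1
    return k

assert no_early_return_lcs("abca") == 3   # the worked example
assert no_early_return_lcs("baba") == 1   # ties break toward the smaller shift
assert no_early_return_lcs("010") == 2    # first Hypothesis counterexample
assert no_early_return_lcs("0010") == 3   # second Hypothesis counterexample
assert no_early_return_lcs("") == 0       # length-zero and length-one edge cases
assert no_early_return_lcs("a") == 0
```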
You can see the full Python source file here.
PlusCal
Python had its fun, now let’s see whether PlusCal finds those bugs.
We can use the TLC model checker to exhaustively test every string in a certain character set below a certain length.
First, like Python, we have to define the naive algorithm for lexicographically-minimal string rotation to have something to test against.
We’re writing this in TLA⁺ itself, not PlusCal.
Here TLA⁺ really suffers from its minimal standard library; I have to go all the way down to defining what lexicographic order means for my ZSequences module:
a \preceq b ==
  LET
    s1len == ZLen(a)
    s2len == ZLen(b)
    RECURSIVE IsLexLeq(_, _, _)
    IsLexLeq(s1, s2, i) ==
      CASE i = s1len \/ i = s2len -> s1len <= s2len
        [] s1[i] < s2[i] -> TRUE
        [] s1[i] > s2[i] -> FALSE
        [] OTHER -> IsLexLeq(s1, s2, i + 1)
  IN IsLexLeq(a, b, 0)
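For contrast, this order comes for free in Python: built-in string comparison is exactly this element-by-element order, with a proper prefix sorting before its extensions, which is why the Python oracle could simply call min() on the rotations:

```python
# Python's built-in string comparison already implements the lexicographic
# order defined above, including the shorter-prefix-first rule.
assert "aabc" < "abca"  # first differing character decides
assert "ab" < "aba"     # a proper prefix compares as smaller
assert min(["abca", "bcaa", "caab", "aabc"]) == "aabc"
```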
Then define what it means to rotate the string along with the set of all rotations:
Rotation(s, r) ==
  IF s = EmptyZSeq
  THEN EmptyZSeq
  ELSE [i \in ZIndices(s) |-> s[(i + r) % ZLen(s)]]

Rotations(s) ==
  IF s = EmptyZSeq
  THEN {}
  ELSE {[
      shift |-> r,
      seq |-> Rotation(s, r)
    ] : r \in ZIndices(s)
  }
Then define a function checking whether a given rotation is minimal:
IsLeastMinimalRotation(s, r) ==
  LET rotation == Rotation(s, r) IN
  /\ \A other \in Rotations(s) :
      /\ rotation \preceq other.seq
      /\ rotation = other.seq => (r <= other.shift)
This is a lot of code! Here we glimpse the dark truth at the heart of formal methods: it’s really just like writing the same program twice in two wildly different ways, then checking whether they do the same thing.
Anyway, now that’s all set up we can define an invariant. Invariants are formulas that must remain true in every state:
Correctness ==
  pc = "Done" => IsLeastMinimalRotation(b, k)
Pretty simple, it just says that when our PlusCal code reaches the Done state then k must be the least minimal rotation of our string. pc is a well-known variable name generated during the PlusCal transpilation process denoting what state the algorithm is in. We can then create a model to pass to TLC, which in this case means defining our character set as {0, 1} and the maximum string length as, say, 6.
Does it find the bug?
Error: Invariant Correctness is violated.
Error: The behavior up to this point is:
State 1: <Initial predicate>
/\ b = (0 :> 0 @@ 1 :> 1 @@ 2 :> 0)
/\ f = (0 :> -1 @@ 1 :> -1 @@ 2 :> -1 @@ 3 :> -1 @@ 4 :> -1 @@ 5 :> -1 @@ 6 :> -1)
/\ i = -1
/\ j = 1
/\ k = 0
/\ n = 3
/\ pc = "L3"
State 2:
...
State 18: <Action line 10, col 1 to line 10, col 69 of module MCLeastCircularSubstring>
/\ b = (0 :> 0 @@ 1 :> 1 @@ 2 :> 0)
/\ f = (0 :> -1 @@ 1 :> -1 @@ 2 :> 0 @@ 3 :> -1 @@ 4 :> -1 @@ 5 :> -1 @@ 6 :> -1)
/\ i = -1
/\ j = 3
/\ k = 0
/\ n = 3
/\ pc = "Done"
TLC spits out a long state trace (partially omitted above) ending in the state where pc = "Done". Same as Python, we see the string "010" erroneously results in k = 0.
After switching instruction 4 to use \(>\) instead of \(\geq\), we then see:
Error: Invariant Correctness is violated.
Error: The behavior up to this point is:
State 1: <Initial predicate>
...
State 35: <Action line 10, col 1 to line 10, col 75 of module MCLeastCircularSubstring>
/\ b = (0 :> 0 @@ 1 :> 0 @@ 2 :> 1 @@ 3 :> 0)
/\ f = (0 :> -1 @@ 1 :> 0 @@ 2 :> -1 @@ 3 :> 0 @@ 4 :> 1 @@ 5 :> -1 @@ 6 :> -1 @@ 7 :> -1 @@ 8 :> -1)
/\ i = 0
/\ j = 5
/\ k = 0
/\ n = 4
/\ pc = "Done"
So we see string "0010" is our new culprit, same as found with Python.
What about after removing the early return?
As you might expect, TLC terminates without error!
You can see the final file here and its dependencies here.
Our algorithm is correct for every string composed of 6 or fewer “0” or “1” characters.
We can expand this as much as we want; checking all strings composed of 8 or fewer “a”, “b”, or “c” characters takes about two minutes.
The only limit is memory, computing power, and patience.
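For a rough sense of scale, here is a back-of-the-envelope count of the input strings each TLC configuration enumerates (TLC additionally explores every intermediate state of each execution, so these are only lower bounds on the work):

```python
# Count the strings of length <= L over an alphabet of size A:
# sum of A^l for l in 0..L (including the empty string).
binary_up_to_6 = sum(2**length for length in range(7))
ternary_up_to_8 = sum(3**length for length in range(9))
print(binary_up_to_6)   # 127 strings over {0, 1}
print(ternary_up_to_8)  # 9841 strings over {a, b, c}
```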
Formal Proof
What if we’re impatient? What if we want to know that this algorithm is truly correct for all strings of finite length across all possible character sets? This algorithm has such a history, with decades of corrections and incorrections and whispered accounts of how it’s really done that we deserve to know, once and for all, that this is really the algorithm we’re looking for. No more revising errata, the buck stops here in this very post! Can we do it?
Okay we probably could do it, but I’d estimate it would take at least two weeks. I actually accomplished a near-decade-long goal and finally taught myself the TLA⁺ proof language for this blog post, hoping to prove the correctness of our algorithm! For warmup I proved correct a very, very simple algorithm that just iterates through a sequence of numbers to find its greatest element. That took me three entire days! The proof was subsequently greatly simplified with help from Stephan Merz and future proofs will surely be faster with all the experience I acquired, but still! This post is about a very complicated algorithm. So I am sorry to say the buck does not actually stop here. The buck is passed along to whoever is willing to invest two weeks of their life formally proving this algorithm correct. That person could still be future me, who knows! But it is not imminent-future me.
Still, with my experience I can answer the questions about proofs I had at the beginning of the post. I wanted to know how difficult it was to write formal proofs of correctness for PlusCal pseudocode, and how readable those proofs were. Difficulty? Actually quite low. Proofs about algorithms can be thought of as an inch deep and a mile wide. You rarely encounter anything more mathematically-challenging than \(a \leq b \land b \leq c \implies a \leq c\). However you have to prove so many things that the time investment is quite large. Usually you prove correctness of a program by defining something called an inductive invariant, then demonstrating that every possible code path maintains the correctness of that invariant (the idea being upon termination this invariant proves the correctness of your algorithm). So I’d be writing a fairly detailed proof for every instruction in our program, of which there are twelve, and some of those have branches which would themselves require sub-proofs. A lot of work! Once you learn the quirks of the TLA⁺ proof manager it’s definitely tractable, though.
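The inductive-invariant idea is easy to illustrate in Python with the warm-up example from this section; the asserts below are my informal rendering of what each proof obligation says, not the actual TLA⁺ proof:

```python
# Find the greatest element of a sequence, with its inductive invariant
# spelled out as runtime asserts. In TLA+ each assert becomes a proof
# obligation that must be discharged for every instruction of the program.
# (The names f, h, i mirror the post's description; the code is my own
# illustration.)
def find_max(f):
    h = f[0]
    for i in range(1, len(f)):
        assert h == max(f[:i])  # invariant: h is the max of the prefix seen so far
        h = max(h, f[i])        # the interesting step: f[i] <= h' where h' = max(h, f[i])
    assert h == max(f)          # at termination the invariant gives correctness
    return h

print(find_max([3, 1, 4, 1, 5, 9, 2]))  # 9
```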
How about proof readability?
I would say quite readable!
Mileage may vary but the individual proof steps are simple, and once you understand what’s being proved & how proofs are usually structured it’s easy to pick out the interesting parts.
For example, with my “find the greatest element in a sequence” algorithm it’s clear the most interesting part of the proof happens when we compare the current value of the sequence with the current highest recorded value, and choose one or the other.
That part of the proof is here.
The rest of the proof is boilerplate by comparison: necessary for the proof manager to give us its stamp of correctness, but intellectually not too interesting.
Still, even if you don’t know TLA⁺ that proof is fairly readable!
You can see we’re showing that the index element f[i] <= h', the highest recorded value, where h' = max(h, f[i]) (the ' means “prime” and denotes the value of the variable in the next state).
TLA⁺ proofs are very interesting when compared to another proof language I’ve used, Lean. The experience of writing Lean proofs is wonderful. There is never any confusion about what the current proof goal is, what assumptions you can use, and what moves you can make. However, Lean proofs as written are absolutely unintelligible. Like what the hell does this mean!?
rw nat.succ_eq_add_one,
rw ← nat.add_succ,
rw (show nat.succ 1 = 2, by refl),
rw left_distrib (d + 1) d 2,
rw mul_comm 2 (d + 1),
rw mul_comm d (d + 1),
rw add_comm,
This was taken from my post on doing a math assignment in Lean. You’ve heard about programming paradigms, now you’re gonna learn about proof paradigms! Lean proofs only make sense if you’re reading them inside a Lean programming environment. As you move the cursor through the text, a window on the right shows how each instruction transforms the proof goals or assumptions. By contrast, TLA⁺ proofs make sense in and of themselves. The tradeoff here is they are much harder to write! It’s never quite clear why the backend provers are having trouble proving something, and the only way to figure it out is to break a statement down into further proofs and subproofs until you realize it just didn’t know it could assume some variable was an integer, so you delete like 15 lines and replace them with a single statement. However, the end result is a structured proof that is readable as written, without accompanying software.
Conclusion
I’ll admit, Python made a very compelling case for itself here. The final algorithm, naive algorithm, and property-based testing code combined took up only 35 very readable lines! I’ve been convinced, immortal mathematical objects be damned, Python is a perfectly fine alternative to pseudocode in papers and textbooks. Of course I would suggest sticking to a very restricted subset of Python, probably avoiding any fancy features (no list comprehensions, pattern matching, etc.) and not using the standard library. I wonder whether such a restrictive subset of Python has been standardized. Perhaps it is simply Python 1.0!
The verdict for PlusCal is a bit more ambiguous. It gained a lot of verbosity from having 1-indexed arrays, an extremely minimal standard library, and in-file code generation. It ran around 275 lines (100 of those generated) split across four source files just to match Python’s 35-line execution & test functionality. Also, it took more effort to write: probably an entire day of work for the basics plus another day for polish, vs. less than an hour for the Python version.
To be fair, sequential string algorithms are not PlusCal’s strength. It really shines in concurrent & distributed algorithms, where the model checker explores all possible interleavings of instructions. PlusCal also has two killer features here for paper & textbook authors:
- It can output paper-ready \(\LaTeX\) pseudocode directly from the code itself, so the code that’s published is the same as the code that’s been parsed & model-checked
- The formal proof system directly checks that same code
I don’t think it would be too hard to write a \(\LaTeX\) generator for a restricted subset of Python code (if such a thing would even be desirable), but Python doesn’t have model checking. Hypothesis is wonderful for sequential algorithms, but nothing can really compare to TLA⁺ model checking for quickly testing designs of concurrent & distributed systems.
What about proofs? Here’s a probably-not-correct hot take: proofs will be the future of TLA⁺ relevance. Model checking was the killer app for the past decade, but property-based testing, model-based testing, and fuzzing have come so far in that time and they operate on the actual running system instead of its abstract specification. MongoDB worked with an apparently impressive stealth-mode company in this space called Antithesis that set up very effective network fuzzing for their database. Everybody’s also heard of Jepsen. However, TLA⁺ can retreat to infinity and prove an algorithm correct for all possible inputs! It also works well in tandem with the model checker, as you can quickly check the validity of a statement before proving it (to quote Leslie Lamport, it’s easier to prove something if it’s true!) Given some UX improvement I could see proofs becoming a viable option, especially if everybody gets an LLM tutor to walk them through the learning curve.
Okay, so what’s the final word? Should you use PlusCal for pseudocode in your paper? Yes, that or Python. I have come to believe that both are superior to an ad-hoc unspecified pseudo-math language. If you want to see PlusCal pseudocode in an actually published paper, take a look at the WPaxos paper by Ailidani Ailijiang, Aleksey Charapko, Murat Demirbas, and Tevfik Kosar.
Afterword
If you enjoyed this post, consider hiring me for a contract! See my portfolio on the home page. On the other hand if you didn’t enjoy this post, hiring me would be an effective method of preventing me from writing other ones. You win either way!