Using TLA⁺ at Work
Designing a snapshot coordination system

Here’s a short report of a time I used TLA⁺ at work, with interesting results. TLA⁺ is a formal specification language that is particularly effective when applied to concurrent & distributed systems. TLA⁺ made it tractable for an ordinary software engineer to reason about a tricky distributed systems problem, and it found a bug introduced by an “optimization” I tried to add (classic). The bug required 12 sequential steps to occur and would not have been uncovered by ordinary testing.
[Read More]