irundebian 5 days ago

How can I check an implementation against a TLA+ spec?

  • pron 4 days ago

    First, the art and science of formal verification has largely moved away from proving programs correct to better/cheaper ways of catching more bugs (at least where mainstream software is concerned). So the most important question is "is a TLA+ specification a cost-effective way of catching bugs even without a formal tie to the implementation?" My opinion on that is an emphatic yes. (BTW, code reviews are an effective assurance technique even though they're not formal at all).

    However, there are ways to create formal ties between a TLA+ spec and a running program. As always, the sounder the method, the more costly it is (and the cost often grows faster than the increased confidence). One approach being researched is checking execution logs against the model, perhaps online as the program is running.

    At the end of this talk, https://youtu.be/TP3SY0EUV2A, I mention the technique and theory behind it, referring to this paper, Model-Based Trace-Checking [1]. I believe research is ongoing and you can find more recent papers on the subject.

    [1]: https://arxiv.org/abs/1111.2825
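
    A minimal sketch of the trace-checking idea in Python (the lock "spec", event names, and log format are all made up for illustration; a real setup would derive the allowed transitions from the actual TLA+ model and map log events onto the spec's variables):

      # Toy spec: a lock that alternates between "unlocked" and "locked".
      ALLOWED = {
          ("unlocked", "acquire"): "locked",
          ("locked", "release"): "unlocked",
      }

      def check_trace(events, state="unlocked"):
          """Replay a log of events; fail on the first step the spec disallows."""
          for i, ev in enumerate(events):
              nxt = ALLOWED.get((state, ev))
              if nxt is None:
                  raise AssertionError(f"step {i}: {ev!r} not allowed in state {state!r}")
              state = nxt

      check_trace(["acquire", "release", "acquire"])  # passes
      check_trace(["acquire", "acquire"])             # fails: double acquire

    The same replay can run online against a live event stream instead of a recorded log.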

    • 0x1ceb00da 4 days ago

      What are some of the lesser known "better/cheaper ways of catching more bugs"?

      • pron 4 days ago

        If we look at practices that fall under formal methods, I would include things like TLA+ (or Alloy) specifications and concolic testing (https://en.wikipedia.org/wiki/Concolic_testing) (which is sometimes used as a step in the process of automatic test generation).
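
        A toy illustration of the concolic idea in Python, using the z3 solver (the function under test is hypothetical, and the path constraint is tracked by hand here; real tools instrument the program to collect the constraints automatically):

          from z3 import Int, Solver, sat  # pip install z3-solver

          def under_test(x):
              if x * 2 == 1234:            # hard-to-hit branch
                  raise RuntimeError("bug")
              return x

          under_test(0)                    # concrete run: path constraint is x*2 != 1234

          # Symbolic step: negate the branch condition and ask the solver
          # for an input that drives execution down the other path.
          x = Int("x")
          s = Solver()
          s.add(x * 2 == 1234)
          if s.check() == sat:
              under_test(s.model()[x].as_long())  # 617 -- now hits the bug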

        Practices that are not usually considered formal methods include clever semi-blackbox fuzzing (like Antithesis) and mutation testing to improve test quality.
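
        And a minimal sketch of mutation testing (the mutant is written by hand here; tools generate such mutants automatically and report the ones your tests fail to kill):

          def price(total):
              return total * 0.9 if total > 100 else total

          def mutant(total):
              return total * 0.9 if total >= 100 else total  # ">" mutated to ">="

          def suite(fn):
              assert fn(50) == 50
              assert fn(200) == 180

          suite(price)   # passes
          suite(mutant)  # also passes: the suite misses the boundary case,
                         # so the mutant "survives", revealing a weak test.
                         # Adding `assert fn(100) == 100` would kill it.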

  • tombert 5 days ago

    I haven't tried it yet, but there is a PlusCal-to-Go compiler now: https://distcompiler.github.io/

    I have no idea how well it works, but it seems pretty interesting.

    I do think that's where something like Isabelle is a bit more compelling: it offers first-class support for exporting programs that can be directly executed, which in theory inherit all the proofs from the parent Isabelle theory (though that's not strictly true).

  • Aurel300 5 days ago

    Welcome to the research field of program refinement. There are a couple of approaches one can take:

    - Certify: the program isn't actually checked/proven to satisfy the spec, but experts in the field pore over the code and try to find any disparities or edge cases.

    - Top-down: given a spec, try to generate an implementation that satisfies it by construction. This depends a lot on the concrete framework, but e.g. Coq supports extraction of executable code from its (verified) source. The results are not very well decoupled from the spec and are often lacking in performance (single-threaded, unoptimised, etc).

    - Bottom-up: given an actual implementation, prove that it behaves according to the spec. This can allow for optimised programs and concurrency, but both complicate the proofs. (This happens to be my PhD research topic; a toy sketch of this direction follows below.)

    Some work to look at, if you are curious: Event-B, Trillium, IronFleet
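
    For a flavour of the bottom-up direction, here is a toy sketch in Python (the counter "spec", the two-step implementation, and the abstraction function are all invented for illustration): it exhaustively checks that every implementation step maps, through the abstraction function, to a step or a stutter allowed by the spec.

      BOUND = 3

      def spec_next(a):                      # spec: a counter that only increments
          return {a + 1} if a < BOUND else set()

      def impl_next(s):                      # impl: an increment split into two steps
          count, staged = s
          steps = set()
          if not staged and count < BOUND:
              steps.add((count, True))       # stage an increment
          if staged:
              steps.add((count + 1, False))  # commit it
          return steps

      def abstract(s):                       # the refinement mapping
          return s[0]

      # Exhaustively check: every impl step is a spec step or a stutter.
      frontier, seen = [(0, False)], {(0, False)}
      while frontier:
          s = frontier.pop()
          for t in impl_next(s):
              a, b = abstract(s), abstract(t)
              assert b == a or b in spec_next(a), f"{s} -> {t} not allowed by spec"
              if t not in seen:
                  seen.add(t)
                  frontier.append(t)
      print("refinement checked on", len(seen), "implementation states")

    On real programs with infinite state spaces this becomes a proof obligation rather than an exhaustive check, which is where the frameworks above come in.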

  • k__ 5 days ago

    Someone writes theorems, proves them in their head, and tries to write down that proof.

    Someone else reads both, tries to reason about the proof and theorem, and notifies the creator if there are mismatches.

    A more practical person goes and translates all that math/theory to code.

    Another practical person will check the implementation against the theory and notify the implementor of mismatches.

    TLA+ is basically an automated version of step two in this process. Someone has an idea for a theorem, writes it down in TLA+, and the system will tell them whether it makes any sense.

  • 7e 4 days ago

    For something that's actually practical for most engineers and at scale, I would look at:

    1) https://www.stateright.rs and 2) https://antithesis.com

    • irundebian 3 days ago

      Thank you. I took a look at both: Stateright seems to check/derive the implementation against the spec, while Antithesis seems to do testing rather than verification.