
Except that's another question to which the answer is currently, "you don't", at least not at any interesting scale, not with TLA+ and not with anything else. But TLA+ is a tool aimed at helping achieve correctness effectively, i.e. at a good cost/benefit ratio. Between highly scalable, low-confidence testing and high-confidence, low-scale end-to-end verification lies a huge swath of software correctness techniques, and the flexible TLA+ tries to cover a rather large part of that spectrum.

In other words, we ask, is this better than what we do now in a sufficiently large portion of cases, rather than is this the silver bullet? The answer to the second question is no. I think the answer to the first one is yes.



Formal program derivation satisfying a given specification has been around for well over half a century now. Hoare logic and the predicate calculus are entirely adequate for formal derivation in any language with well-documented semantics. Lamport himself has contributed significantly to the field. Unfortunately the American-dominated programming industry has a strange phobia about mathematical reasoning[1]. The irrationality is exposed by patently ridiculous statements like the claim that formal derivation doesn't scale. If you believe that, then you can't believe any nontrivial results from the hard physical sciences or pure mathematics. The actuality is that formal derivation makes building nontrivial constructs considerably easier. It is a tool, and like any tool, when used properly it acts as a capability multiplier.

[1] This is doubly confusing to me because it's not a matter of difficulty. Anyone with the mental horsepower to learn a modern programming language is easily capable of gaining proficiency in the predicate calculus.


That's like saying that we've been able to turn lead into gold for a while now. It depends on what you mean by "we" and by "can." Some specialists have turned some lead atoms into gold in a billion-dollar particle accelerator.

Common, "ordinary" software can neither be mechanically derived nor verified using any feasibly affordable process at common scales. Let me remind people that the largest programs ever verified end-to-end were one-fifth the size of jQuery, and they required years of effort by world experts as well as careful simplification.

I've written a fair bit on program correctness and formal verification/synthesis [1], and I, myself, am a practitioner and advocate of formal methods, but I am not aware of any in-the-field, industrial scale, general (non-niche), non-specialist, repeatable, affordable formal synthesis/verification techniques that can handle systems and specifications of the scale TLA+ is commonly used for by ordinary engineers. If you know of such examples, I'd be interested in seeing them.

[1]: https://pron.github.io/


I recently found Microsoft’s P programming language https://github.com/p-org/P and I’m wondering how it fits into this space. The P README says:

P is a language for asynchronous event-driven programming. P allows the programmer to specify the system as a collection of interacting state machines, which communicate with each other using events. P unifies modeling and programming into one activity for the programmer. Not only can a P program be compiled into executable code, but it can also be systematically tested using Model Checking. P has been used to implement and validate the USB device driver stack that ships with Microsoft Windows 8 and Windows Phone. P is currently being used extensively inside Amazon (AWS) for model checking complex distributed systems.


This is a good callout. The P language evolved into the P# framework (a framework rather than a language), which then evolved into Coyote (https://microsoft.github.io/coyote/)

Coyote allows developers to systematically test concurrency expressed through .NET async/await tasks. I encouraged my team to embrace Coyote based concurrency testing right from the start as we set out to build a new Azure service from the ground up. The results have been encouraging. The unification of the programming and modeling language is actually _quite_ useful in practice. For one, developers who aren't experts in formal methods are able to write concurrency tests fairly easily with minimal guidance. Secondly, the model and code never go out of sync as they are one and the same. The trick through which we are able to scale out these techniques to large scale software-in-the-wild is to have numerous tests, each test exercising concurrency in one set of APIs. This approach has scaled out in a decent manner. We run our entire distributed system in one process with various tests exercising concurrency and faults triggered through a focused set of APIs. We'll blog more about our usage of Coyote in the coming months.

While TLA+ is very powerful and useful, we've also found language integrated tools like Coyote to be quite effective in testing and catching concurrency bugs in real world code in Azure.


The size of P programs verifiable by a model checker is similar to that of TLA+ specifications checkable by its model checker, but because TLA+ can specify systems at a non-compilable level of detail, it also allows specifying and (perhaps non-soundly) verifying much larger systems than those you can soundly verify in P. There are many programming languages with model checkers, but they are commonly used in niche areas -- like hardware or embedded software -- where programs are relatively small and simple compared to most "ordinary" software.


This is a fair call-out. I'd like to share a counter-example where we 'model check' our real-world service written in C# with pretty decent effectiveness. As you rightly call out, the number of execution points around which the model checker explores interleavings in a concurrent situation cannot be too large. If you look at _most_ micro-services, the really interesting points around which interleavings have to be explored are the calls to external storage systems, say, Cosmos DB or Azure Storage etc. This is because most micro-services maintain their state in these external systems and are otherwise stateless. This insight allows us to explore interleavings only around calls to these external services, as that's where a ton of concurrency bugs reside, and to ignore interleavings at arbitrary points _within_ the services. Leveraging this insight allows us to utilize concurrency testing tools like Coyote on real-world code and services, with the testing scaling decently with the size of the program.
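As a hypothetical miniature of this idea (Coyote itself works on .NET async tasks; this is just the shape of the approach in a few lines of Python, with a dict standing in for the external store):

```python
# Illustrative sketch only: explore interleavings ONLY at calls to
# "external storage" (a dict stands in for something like Cosmos DB),
# not at arbitrary points within the handlers.
from itertools import permutations

def run_schedule(schedule):
    store = {"counter": 0}
    local = {}
    # Each step is (handler_id, operation); only storage ops are steps.
    for handler, op in schedule:
        if op == "read":
            local[handler] = store["counter"]
        else:  # "write": read-modify-write with no concurrency control
            store["counter"] = local[handler] + 1
    return store["counter"]

# Two handlers each increment the counter; enumerate every interleaving
# of their storage calls that preserves each handler's own order.
steps = [("A", "read"), ("A", "write"), ("B", "read"), ("B", "write")]

def valid(s):
    return all(s.index((h, "read")) < s.index((h, "write")) for h in "AB")

results = {run_schedule(s) for s in permutations(steps) if valid(s)}
# results == {1, 2}: the interleaving that yields 1 is the lost update.
```

Because only the four storage calls are interleaving points, there are just six schedules to check, and the lost-update bug is found exhaustively; interleaving at every program point instead would blow up combinatorially.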


First, thanks for the link to your github.io. At a glance it looks quite interesting and I look forward to reading more.

Let me state a couple of assumptions that I'm making and please let me know if they're incorrect. First, by mechanically derived you mean automatically generating a program text from a specification with no additional human input. Second, by mechanically verified you mean something like calling a function that takes a specification and a program text and returns true if the program text satisfies the specification, with no additional human input.

I do not believe either mechanical derivation or verification is possible in the general case using these definitions. Both appear obviously undecidable in the general case. Nevertheless I believe it's a fruitful field of research since there are already specific cases where one or both are already possible and it would be valuable to grow that set. My somewhat jaded view of the focus on mechanical verification is that it's what the industry that funds much of the research wants, since they probably dream of it as a way to save money on programmers and testing.

Here and elsewhere when I talk about formal derivation of a program I mean as a method by which a human programmer constructs the program text. The simplest useful example of this that I can think of is deriving a totally correct loop from a specification. The loop invariant method is so easy and so powerful there is absolutely no valid reason why every programmer shouldn't use it when writing any loop. Dijkstra wrote a textbook[1] with a number of other non-trivial examples that you may enjoy reading if you haven't already. In short, Dijkstra shows how formally deriving a program is equivalent to deriving a mathematical proof. It's actually a relatively informal book and not at all mathematically demanding. He and Scholten fully developed the concepts with a much more mathematically rigorous textbook too[2].
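To make the loop-invariant method concrete, here is a minimal sketch (the example is my own illustration in Python, not Dijkstra's; the invariant is stated as comments and checked with asserts):

```python
# Deriving a summation loop from its specification via a loop invariant.
#
# Specification:   establish  total == sum(xs)  for a list xs.
# Invariant P:     total == sum(xs[0:i])  and  0 <= i <= len(xs)
# Initialisation:  i = 0, total = 0 establishes P (the empty sum is 0).
# Guard:           i != len(xs); on exit, P and not-guard give the postcondition.
# Termination:     each iteration increases i toward len(xs).

def derived_sum(xs):
    i, total = 0, 0
    assert total == sum(xs[0:i])          # P holds initially
    while i != len(xs):
        total, i = total + xs[i], i + 1
        assert total == sum(xs[0:i])      # P is maintained by the body
    # P and i == len(xs)  ==>  total == sum(xs), the postcondition
    return total
```

The program text and the proof of total correctness are developed together; the asserts merely make the derivation's obligations visible at runtime.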

So with that in mind, I'd personally like to see less of a focus on how to remove the programmer from the process and more on making it easier for him to formally derive his programs. One approach I imagine is a kind of side by side editor with good visual feedback where on the one side you build up your specification and on the other side you write code to ensure the various conditions the specification adds hold everywhere in the state space. If I were to undertake such a thing myself I'd probably start with a very simple purpose built language to avoid getting in over my head, but in principle any language with formally definable semantics could be used. A new project would start with the empty specification and the empty program, and obviously total correctness holds. All current major editors can be thought of as implementations of this model that only support the degenerate case of the empty specification that admits all programs. The initial verification strategy would rely on the programmer as its oracle, but additional strategies could be used where feasible. But even simply providing a nice way for a programmer to track that yes he's quite sure he's satisfied this part of the spec would be helpful.

[1] https://www.goodreads.com/book/show/2276288.A_Discipline_of_...

[2] https://www.goodreads.com/book/show/3144463-predicate-calcul...


There are various avenues of research in this area (you may want to look at David Harel's Behavioural Programming [1], and for more deductive approaches look at Back's refinement calculus, which is very similar to what you have in mind), but to date, we do not have any methodology that allows us to either synthesise or verify programs that correspond to specifications that are routinely written and verified in TLA+; we're nowhere even remotely close. It's comparing a battle-tested, industrial-grade tool with hints of more or less promising directions in the lab. The gap between the lab and the field is so large that we must be careful to treat them as almost two different worlds.

[1]: http://www.wisdom.weizmann.ac.il/~bprogram/more.html


I agree with everything you state, but I think we're talking past each other a bit now. My claim is that even without any additional tooling beyond TLA+ to produce a specification and your development environment of choice, just using mathematical reasoning as you derive your program text makes it considerably easier to build systems that satisfy the specification. That mathematical method using predicate transformers is well-developed and has been for decades. This formal approach to deriving the program text results in an object that is a mathematical proof of its own correctness. The human act of writing the program is exactly the same as deriving a proof. I'm simply incapable of accepting the proposition that avoiding mathematical rigor scales better than using it.


> just using mathematical reasoning as you derive your program text makes it considerably easier to build systems that satisfy the specification.

And I am saying that since this has never been achieved in practice for systems of considerable scale, this is a rather hypothetical use of the word "easier." Is it possible that one day this will actually be easier? Sure. Is this the case now? I don't think so.

> The human act of writing the program is exactly the same as deriving a proof.

The human act of painting your ceiling is exactly the same as painting the Sistine Chapel, and the human act done by a first-year student taking a calculus exam is exactly the same as a Fields Medalist writing a proof in a paper. Look at how, say, a verified quicksort is written in Idris [1] and compare that to how it's written in Java/Python (one could write quicksort in Idris as in Java, but then the program would not serve as proof of its own correctness). I think that the claim that, with what we know how to do today, producing the former is as easy as the latter, let alone easier, requires some empirical evidence.

[1]: https://github.com/bmsherman/blog/wiki/Quicksort-in-Idris


> And I am saying that since this has never been achieved in practice for systems of considerable scale, this is a rather hypothetical use of the word "easier." Is it possible that one day this will actually be easier? Sure. Is this the case now? I don't think so.

Respectfully, I recommend that you read Dijkstra's A Discipline of Programming that I linked above. I think you'll find it interesting at the very least and possibly even enlightening. I assume you are a product of the American educational system, where these concepts are rarely taught. There are exceptions though, for example the great Leslie Lamport has also done notable work[1] in this area. You'll find that out of necessity on some level we all have to reason about state spaces and the possible coordinates in those spaces to write software. Using formal mathematical reasoning is strictly better than the sloppy approach generally used. I'm not claiming that it's a magic bullet though. Even mathematicians make errors after all.

There are various middle grounds too. Done properly, TDD and its cousins amount to an implied specification and "ensure" that the program written will satisfy the tested subset of said implied specification. The primary practical limitation of this approach is that testing can never show the absence of errors, only their presence.

> The human act of painting your ceiling is exactly the same as painting the Sistine Chapel and the human act done by a first-year student taking a calculus exam is exactly the same as a Field medalist writing a proof in a paper.

You've lost me here, I don't at all understand how these analogies have anything to do with the kind of reasoning a human can do about programs vs the kind of "reasoning" programs can do about programs.

The point is that it doesn't matter what language a program is written in. The program text is always a mathematical object describing a state space (usually reasoned about as a set of state spaces like structs or objects) and transformations in those state spaces. Granted, some languages, like Perl, have ill-defined semantics that make reasoning about correctness impracticable, but often it's possible to restrict oneself to a subset of the language that can be reasoned about. For what it's worth I've applied these techniques quite successfully in my professional life. I hate getting paged and I hate debugging, so I've found it useful to learn to write correct software. I seriously doubt that I'm the only one, but as I've said before there is some kind of bizarre phobia American programmers have about being told that what they are doing is math that makes it very hard to talk about with most of them.

[1] https://lamport.azurewebsites.net/pubs/lamport-win.pdf


I think you are not sufficiently familiar with this field. Dijkstra wrote "Discipline" years before computational complexity results on program analysis were known, and I think you misunderstand both the tremendous gap between what works on small examples in the lab and what is scalable enough to be applied widely in the field, and the difference between something that might be possible in principle and something that's affordably feasible in practice. We know more than Dijkstra did, the techniques have improved, and still no "ordinary" (non-niche) real-world software of any significant size has been written using formal refinement to date. If you claim this is possible today -- go ahead and do it for a medium-sized program of, say, at least 100KLOC. A formalism like TLA+ is well suited for this task, so go ahead and try.

I am not American, I don't like making generalisations about Americans or any other nationality, and, as you could see from my blog, I am very much in favour of employing maths in system design -- that's what TLA+ is all about. But confusing "mathematical object" with "can be feasibly reasoned about" is just ignoring computational complexity. I can give you an example of two subroutines, each only a few lines long and each very simple to understand, and yet you won't be able to answer a most basic question about their composition. Complexity theory tells us that mathematical problems have an inherent hardness, and the fact you can write them does not mean you can feasibly solve them. There are no simple answers here. In some carefully constructed ways formal methods can help in some specific circumstances, different methods helping in different use-cases, and you cannot just say that it's "better" overall. For example, while programs of millions of lines of code are routinely tested, no one has ever managed to formally reason about deep properties of programs of this size.
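To make that concrete with one standard illustration (not necessarily the pair of subroutines meant above): each routine below is a few lines long and trivially understood in isolation, yet whether their composition terminates for every positive input is the open Collatz conjecture.

```python
# Two trivially simple subroutines whose composed behaviour is an
# open mathematical problem.

def step(n):
    # One completely elementary transformation.
    return n // 2 if n % 2 == 0 else 3 * n + 1

def iterate(n):
    # Does this loop terminate for EVERY positive integer n?
    # Nobody knows: that question is the Collatz conjecture.
    assert n >= 1
    count = 0
    while n != 1:
        n = step(n)
        count += 1
    return count
```

Both pieces can be "feasibly reasoned about" as mathematical objects, and still the most basic question about their composition -- termination -- resists all known proof techniques.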


I take it that means you've never read Discipline or Predicate Calculus and Program Semantics.

I don't doubt your expertise in TLA+, as I hope you can see from my studious agreement with you on the items that are clearly within your expertise, but on this point you're not even wrong. You are attempting to refute the claim that a human programmer can derive a program that has desired properties more effectively with mathematical reasoning than without it by objecting that mechanical verification has high computational complexity[1]. That is sheer nonsense. It's like saying that because factoring products of primes is hard, it's in no way feasible to multiply them rigorously, so you should just use an estimate instead.

As an aside, the avionics for the Space Shuttle were over 250KLOC and were proved correct from the bottom up. And they functioned flawlessly. This was done before computational complexity results on program analysis were known, but that lack of knowledge was observably irrelevant to building correct software. Sadly, the hardware left a bit to be desired.

[1] It's undecidable in the general case.


No ordinary, real-world software has ever been fully and affordably formally proven. There have been specialised 1MLOC programs checked with a model checker, but the techniques don't generalise. I would be interested in a source to your claim about the Space Shuttle software, because even though that's both high-cost and niche software, I think you might have the details wrong. Anyway, I encourage you to try to affordably write ordinary software in this way. This is certain to win you some high-profile award, as no one has been able to do it so far. Either you succeed or you fail, but there's really no point in insisting it can be done without pointing to anyone who has done it.


This discussion would be more fruitful if you carefully read my responses. I'm not actually disagreeing with you, but you're missing my point. I don't mean to be rude, but you really appear to be suffering from man-with-a-hammer syndrome here. As I've already shown, what can be mechanically verified with a model checker has no bearing on a programmer's ability to formally derive totally correct software. It would be extremely surprising if it were otherwise, since that would mean that one of the most fundamental theorems in computing science is incorrect.

> Anyway, I encourage you to try to affordably write ordinary software in this way

I already stated in an earlier response that I've been doing this professionally. And I have been for over a decade.

Regarding the Shuttle, you'll have to hit a library, the only online source I can find is Feynman's minority report and it doesn't go into great detail. It does however show how properly decomposing a complex system into component parts is necessary to keep it intellectually manageable. And bear in mind that while Feynman is best known as a physicist, he knew enough about computing science to write a book on it.


And my point is that a discussion of hypothetical ability is not what I am after. I can show you why there are good reasons to doubt the programmer's ability to reason about a large program in light of computational complexity results accumulated over the last twenty years (and you are simply wrong about the bearing of model checking on the situation), but instead of arguing over hypotheticals, I'm merely pointing out that the ability you believe exists has never, to the best of my knowledge, actually been realised in a scalable, repeatable, affordable way in the field. If you claim you could achieve it, please try.

> And I have been for over a decade.

What exactly have you done for over a decade?

I am quite familiar with predicate transformers, but the distance between using Hoare logic and the like on code units -- which is relatively easy and pretty widely done (e.g. with SPARK) -- and formally (which means mechanically checkable, even if it is human directed) deriving a large program from a specification is gargantuan. Proving/deriving code units from contracts and the like is not the problem I'm talking about. Neither Dijkstra nor anyone else has come remotely close to achieving it, and achieving it is a guaranteed Turing award.


> I can show you why there are good reasons to doubt the programmer's ability to reason about a large program in light of computational complexity results accumulated in the last twenty years (and you are simply wrong about the bearings of model checking to the situation)

OK. Show me.

This is why it's important to write the specification with an eye toward keeping the implementation manageable. This is probably a drawback of using a tool like TLA+ to check a specification. The tool makes it relatively easy to write a terribly structured specification, check it with TLA+ and declare victory, and leave the programmer in a hopeless situation. A well designed specification will unsurprisingly mirror well designed code[1]. What I have empirically tested is that keeping my functions/methods simple enough to be easily specified also keeps them easy to formally derive correct program text for. The problem of composition is treated entirely separately. You don't need to understand the fully concatenated predicate transformers for each function/method, you just need to know what wp (or another appropriate predicate transformer) is for each and be sure to satisfy it.
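As a hypothetical miniature of that compositional point (names and the example are mine, purely illustrative): the caller below needs only the callee's postcondition, never its body or the concatenated predicate transformers.

```python
# Composition via contracts: the caller reasons only from the callee's
# pre/postcondition, treating its body as a black box.

def isqrt_floor(n):
    # Precondition:  n >= 0
    # Postcondition: r*r <= n < (r+1)*(r+1)
    assert n >= 0
    r = 0
    while (r + 1) * (r + 1) <= n:
        r += 1
    assert r * r <= n < (r + 1) * (r + 1)   # postcondition established
    return r

def is_perfect_square(n):
    # To compose, we need only isqrt_floor's postcondition:
    # r*r <= n < (r+1)*(r+1), hence n is a square iff r*r == n.
    r = isqrt_floor(n)
    return r * r == n
```

The correctness argument for `is_perfect_square` never inspects the loop inside `isqrt_floor`; it is satisfied entirely by the stated contract, which is the separation-of-concerns point above.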

> What exactly have you done for over a decade?

Please understand this is a greatly abridged response. My general habit is to look at the requirements and infer the implied specification since a formal specification is sadly almost always missing. Thankfully most business code is theoretically trivial so postulating a reasonable specification from the requirements is easy. So, I then document my assumptions about what I believe the customer is actually asking for formally. Before I write code I think about the postcondition that I want that code to establish and determine what preconditions, if any, must hold. I put this as comments in the test modules. Then I do a more or less standard TDD approach, with the tests trivially derived from the specification[2], because management likes tests and they're paying the bills. Then I derive the code to make the tests pass from the specification using predicate transformers from the precondition to the desired postcondition. And then I run the tests and move on to the next properly isolated concern.
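A toy sketch of that workflow (the business rule, names, and numbers are all hypothetical, chosen only to show the shape of spec-as-comments, spec-derived tests, and derived code):

```python
# Step 1: document the implied specification, inferred from requirements.
#   Precondition:  prices is a list of non-negative numbers
#   Postcondition: result == sum(prices), with a 10% discount applied
#                  when the undiscounted total exceeds 100

def total_with_discount(prices):
    assert all(p >= 0 for p in prices)        # precondition
    subtotal = sum(prices)
    # Derived case split straight from the postcondition:
    result = subtotal * 0.9 if subtotal > 100 else subtotal
    return result

# Step 2: tests read off the specification (what management sees as TDD).
def test_total_with_discount():
    assert total_with_discount([40, 50]) == 90       # 90 <= 100: no discount
    assert total_with_discount([60, 60]) == 108.0    # 120 > 100: 10% off
```

The test is derived from the spec rather than the code, so it exercises the postcondition's case split directly.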

Obviously no human discipline is infallible and I still make mistakes, but I make far fewer mistakes than I would using the typical sloppy approach most programmers use. One time I got chewed out by my manager for allowing one bug to make it into production in nine months, because I had raised expectations of the quality of my code so high. And by the way I've always just presented my methodology as TDD because of the mathematical phobia that's so prevalent in American industry. If I tell management that this code is easy to maintain, you just need programmers proficient in the predicate calculus they will dismiss me. If on the other hand I say look I have well over 90% branch coverage I'll get an attaboy and management will be happy.

By this time I hope it's clear that my thoughts on the subject were not formed in the lab, but at a Big Tech Company You've Definitely Heard Of. I hope we can agree that the general tendency in industry to ignore or even deride what computing science can teach us is unfortunate.

> formally (which means mechanically checkable, even if it is human directed)

Dubious[3]. A mathematical formalism does not require algorithmic automated checking to exist. And even if said checking is decidable, which it mostly isn't, it may not run in a practical time on any possible mechanical computer we can conceivably build. However if you mean like any other formal mathematics where "the symbols do the work" so long as you follow the appropriate rules of derivation then I am in complete agreement.

[1] Proper separation of concerns and such.

[2] Obviously no tests are going to exhaustively check a non-trivial specification, but my approach is still strictly better than the typical practice of primarily optimizing for satisfying coverage tools.

[3] https://en.wikipedia.org/wiki/Formalism_(philosophy_of_mathe...


> OK. Show me.

Here is a simple specification: write a program that, given a positive even integer, returns two prime integers whose sum is the input, and never fails to do so. Start with a formal statement of the requirement, and derive a program that provably satisfies it. Decompose your program however you like.

> Then I derive the code to make the tests pass from the specification

Do you use predicate transformers all the way from a high-level spec to each and every line in a 1MLOC distributed and interactive system, with management consoles, reports and ten-thousand users per server?

> A mathematical formalism does not require algorithmic automated checking to exist.

It most certainly does. That is the literal meaning of "formal" in formal logic/method/synthesis/verification. A formal logic has a proof theory made of inference rules. It means that conclusions are reached by manipulation of the syntactic form -- hence formal -- without use of any mathematical intuition. Alan Turing then showed that a proof system can always be described as a Turing machine, and that formal = mechanical. There is a huge gap between informal mathematical reasoning and formal reasoning. I believe I've seen someone mention an exponential blowup in proof size of formal proofs compared to mathematical proofs.

> Obviously no human discipline is infallible and I still make mistakes, but I make far fewer mistakes than I would using the typical sloppy approach most programmers use.

There's a very, very wide gap between "disciplined" and formal. I am all for disciplined and for using formal methods in various places. End-to-end formal derivation/verification, i.e. from high-level spec to code, in a scalable, feasible manner, is yet to be achieved.


> Here is a simple specification: write a program that, given a positive even integer, returns two prime integers whose sum is the input, and never fails to do so.

I'm not going to solve Goldbach's conjecture on hackernews or anywhere else. I wish I could, I really do, but my incapability there doesn't invalidate my arguments. I'm not even annoyed though, that was a cute trick and I appreciate your cleverness. Fortunately Fields Medal tier mathematical problems have never come up in my professional programming career.
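To spell the trick out in code (an illustrative sketch, not either commenter's program): the brute-force search below satisfies the quoted specification if and only if Goldbach's conjecture is true, so proving it "never fails" is proving the conjecture.

```python
# This program meets the spec exactly when Goldbach's conjecture holds:
# proving the raise statement unreachable IS proving the conjecture.

def is_prime(n):
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def goldbach_pair(n):
    # Precondition: n is an even integer, n >= 4
    assert n >= 4 and n % 2 == 0
    for p in range(2, n):
        if is_prime(p) and is_prime(n - p):
            return p, n - p
    raise ValueError("counterexample to Goldbach's conjecture: %d" % n)
```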

> It most certainly does. That is the literal meaning of "formal" in formal logic/method/synthesis/verification. A formal logic has a proof theory made of inference rules. It means that conclusions are reached by manipulation of the syntactic form -- hence formal -- without use of any mathematical intuition.

Once again, I completely agree with this. Let the symbols do the work. But we keep coming back to the same misunderstanding. A program can feasibly be derived that can't feasibly be verified without access to the discrete documented steps of the derivation[1]. Showing the work of the derivation is sufficient to convince any honest interlocutor of the correctness of the proof. It may well be enough to allow an algorithm to perform the verification too, I'll defer to your expertise on that. In any event, as I said before using the analogy of primes, verification is rather a lot harder than construction.

Edit: Also I want you to know I'm finding this a pleasant discussion and am thankful that you're taking the time to help develop my thoughts and hopefully yours too.

[1] By analogy, factoring the product of two primes is easy when in addition to being given the product you are also given the multiplier and multiplicand.


> Fortunately Fields Medal tier mathematical problems have never come up in my professional programming career.

Ah, but you see, I've used this problem only because you could be easily convinced of its difficulty because it's a well-known one and others have tried solving it unsuccessfully. But hard problems come up all the time, especially when concurrency is involved. You just don't know they're hard at first glance. For example, the specification that a database be kept on multiple computers to offer redundancy without loss of data is a staple of distributed computing, but finding an algorithm that does it (perhaps together with some other latency requirements) is extremely hard and it took many years to find the first one that could do it.

That easy specifications are easy to derive doesn't help us, because they might also be the same programs that are easy to test or be convinced of their correctness in some other ways. The whole point of formal methods is to help us with the hard cases, which, sadly, are very frequent.

> A program can feasibly be derived that can't feasibly be verified without access to the discrete documented steps of the derivation.

Not if it's done formally. A formal derivation guarantees proof, and, in fact, the problem of verification is easily reducible to derivation, i.e., derivation is no easier. Here is the proof: X is a program, and so, also a specification, and P(X) is some property that you want to verify for X. A formal derivation of a program from the specification X ∧ P(X) will, therefore, prove P(X). This is the "trick" I used with the "Goldbach program", except it's not a trick, but a basic result in the study of the hardness of program analysis, a subject that's been extensively studied for a few decades now. The claim that it's easier to derive than to prove is just mathematically false. With regards to claims about informal "proofs" or "derivations" it's hard to say anything, because the problem is not well-stated.


> finding an algorithm that does it (perhaps together with some other latency requirements) is extremely hard and it took many years to find the first one that could do it.

I agree. And correctly implementing those algorithms is much easier using a formal derivation with a mathematical tool such as the predicate calculus than it is by just eyeballing it and hoping your tests catch all errors.

> But hard problems come up all the time, especially when concurrency is involved

Predicate transformers are well suited for reasoning about nondeterminism and concurrency.

> Not if it's done formally

Program derivation with predicate transformers is done formally. It's a chain of equivalences, implications, and consequences as appropriate to the problem. The program text is literally developed as a proof, and each transformation of the code must be justified by axiom, theorem, or lemma and at the end of that derivation all properties of the specification hold. You can't possibly be claiming that no formal mathematics was done before the invention of automated checkers, because that would be both ridiculous and observably false.

> Here is the proof: X is a program, and so, also a specification,

I disagree. A specification maps to either an empty or countably infinite set of programs that satisfy it. Also, every program satisfies countably infinitely many specifications. They are not isomorphic or even dual. In this case that conflation doesn't seem to impair our understanding too much, though.

> and P(X) is some property that you want to verify for X.

Right, and the task of the programmer is to write a program text that provably establishes that P(X) holds everywhere in the program state space.

> A formal derivation of a program from the specification X ∧ P(X) will, therefore, prove P(X)

Yes this is what I meant when I said you should use the same techniques to derive the program text that you would to derive any other mathematical proof.

> it's not a trick, but a basic result in the study of the hardness of program analysis

It's a trick because it's a clever way of saying "write a program for which there is no known algorithm." As of now nobody is going to be able to do that regardless of their methodology. I get the impression that you mean this to be a compelling point, but I'm not grasping how I'm better off not using mathematical reasoning to derive the programs that I do know how to write.

> The claim that it's easier to derive than to prove is just mathematically false

It's not even false; it's incoherent. But I didn't make that claim. Both derivation and verification are ways to prove a program satisfies a specification.

As you say, a derivation of a program that satisfies a specification is a proof. Specifically, a proof by construction. The claim is that it's considerably easier to derive a program that satisfies some property than it is to verify if an arbitrary program satisfies that property. For example, say we have a specification S: "returns a product of two primes" and two programs. P1: "Return 2^82589933-1 * 2^77232917-1" and P2: "Return 2^8258993365541237147-1". The proof of the correctness of P1 is trivial because I constructed it to be so[1]. As for P2, I have no idea if it satisfies S, but I am certain that it'll be considerably harder to convince myself and others that it does than it was for P1.
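A small-number analogue makes the asymmetry checkable (the Mersenne exponents above are far too large to factor directly; the numbers below are illustrative stand-ins of my own choosing):

```python
# Small-number analogue of the P1/P2 example.
# P1 is correct by construction; checking P2 against S requires a search.

def is_prime(n):
    """Trial-division primality check, adequate for small n."""
    return n > 1 and all(n % d for d in range(2, int(n**0.5) + 1))

# P1: built as a product of two numbers already known to be prime.
p, q = 31, 127            # both Mersenne primes: 2**5 - 1 and 2**7 - 1
P1 = p * q                # satisfies S by construction

# P2: an opaque constant; to verify S we must hunt for a factorization.
P2 = 3937

def satisfies_S(n):
    """Verify 'n is a product of two primes' by trial factorization."""
    return any(n % d == 0 and is_prime(d) and is_prime(n // d)
               for d in range(2, int(n**0.5) + 1))

print(satisfies_S(P1))   # True, but the construction already told us that
print(satisfies_S(P2))   # True, found only after searching out 31 * 127
```

Checking P1 costs nothing beyond the primality facts already in hand; checking P2 forces a search, and at the sizes in the original example that search is infeasible.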

> With regards to claims about informal "proofs" or "derivations" it's hard to say anything, because the problem is not well-stated.

I've made no claims of the sort.

[1] https://primes.utm.edu/largest.html


> And correctly implementing those algorithms is much easier using a formal derivation

Again, no one in the history of humanity has managed to do this affordably for programs of significant size. If you have, you are pretty much guaranteed a Turing award.

> You can't possibly be claiming that no formal mathematics was done before the invention of automated checkers, because that would be both ridiculous and observably false.

Formal mathematics is not necessarily checked by a computer, but it is necessarily checkable by a computer.

> I disagree. A specification maps to either an empty or countably infinite set of programs that satisfy it.

It's not a matter of disagreement. This is the proof. A program is a specification, but it is not necessarily the only implementation of this specification. There are mathematical definitions of what it means to implement a specification (it's called the abstraction-refinement relation, and one definition is that of "simulation": https://en.wikipedia.org/wiki/Simulation_preorder). The issue of the hardness of program analysis and synthesis is not a matter of opinions, but has been studied quite extensively for decades. You can read papers on the subject by people like Philippe Schnoebelen, Rajeev Alur and quite a few others. I can't give an introduction to the subject here.

> I'm better off not using mathematical reasoning to derive the programs that I do know how to write.

That's not the point. The point of formal methods is not to do the things we know how to do, but to do things we're not sure we know how to do. Formal methods tries to help us specify some properties and then write programs that satisfy them, even if -- and especially if -- doing so is hard or even uncertain to be possible. I'm not yet clear on what it is that you do, but I am explaining what formal program synthesis/refinement means in the context of formal methods, and what it is that hasn't been achieved. For example, taking the distributed system specification and deriving a program that implements something like Paxos. We know that easy things are easy. Formal methods are mostly for the cases that are particularly tricky.

> The claim is that it's considerably easier to derive a program that satisfies some property than it is to verify if an arbitrary program satisfies that property.

Your claim is mathematically disproven by the proof I've given. This isn't an argument, but a mathematical result: program verification is reducible to synthesis.


> Again, no one in the history of humanity has managed to do this affordably for programs of significant size. If you have, you are pretty much guaranteed a Turing award.

I offer that perhaps the issue here is an ignorance of the history of humanity, at least with respect to computing?

You have heard of decomposing a large program into smaller programs, right? That's how it is done and has been done.

Dijkstra did do what you claim is impossible several times in his career, for example the THE multiprogramming system. And he did win a Turing award for doing so[1]. He also had more than a few colleagues and students over the years and at least some learned what he was awarded for.

Of course if you're going to move the goalposts such that it having been done means the problem isn't of "significant size" then I'm happy to dismiss your argument as sophistry.

> Formal mathematics is not necessarily checked by a computer, but it is necessarily checkable by a computer.

Irrelevant to my point. And human programmers routinely derive programs specified to cause pathological behavior in checkers, which is to say making them impracticable to check.

> That's not the point. The point of formal methods is not to do the things we know how to do, but to do things we're not sure we know how to do.

This is a foolish self-imposed constraint. The point of formal mathematical reasoning is to be unambiguous, to reduce cognitive load by letting the symbols do as much of the work as possible, and to be confident in the result arrived at. This is just as useful for solving known problems[2] as it is for researching novel algorithms. And as I'm sure you'd agree there are plenty of concurrency problems that are solved in the literature but are still tricky to implement correctly even given a formal specification. Formal mathematical reasoning, for example using predicate transformers to derive a program text that satisfies the specification, is a good way to keep that trickiness within the capabilities of those of us who are not von Neumann-tier intellectual powerhouses, but are at least bright enough to manage the predicate calculus.

> Your claim is mathematically disproven by the proof I've given. This isn't an argument, but a mathematical result: program verification is reducible to synthesis.

You didn't at all prove that, but I trust you that reducibility of verification to synthesis has been proved[3]. It's still irrelevant to my claim about it being easier for a programmer to derive programs that meet an arbitrary specification than to test that an arbitrary program text satisfies it. Human programmers aren't computers. Please stop misrepresenting my claims. Are you one of those philosophical types who believes that human beings are Turing machines? That would certainly explain your confusion and the persistent disconnect in our views.

[1] https://amturing.acm.org/award_winners/dijkstra_1053701.cfm

[2] Most working programmers can't even implement binary search correctly, to say nothing of writing a correct lock-free queue or similar.

[3] That's less interesting than it sounds. Given a working verifier we can just enumerate all strings and return the first one the verifier accepts. Obviously the complexity of the verification dominates and the synthesis will fall within the same computational complexity class.
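Footnote [3]'s construction can be sketched as a toy enumerator (the expression grammar and the search bound are illustrative assumptions; real synthesis-by-enumeration is exactly this hopeless at scale):

```python
# Synthesis from a verifier by brute enumeration: generate candidate
# program texts in order of increasing size and return the first one
# the verifier accepts. The verifier's cost dominates everything.
from itertools import count, product

ATOMS = ['n', '1', '2']
OPS = ['+', '*', '-']

def candidates():
    """Enumerate arithmetic-expression texts in one variable n, smallest first."""
    yield from ATOMS
    for size in count(1):
        for parts in product(ATOMS, repeat=size + 1):
            for ops in product(OPS, repeat=size):
                expr = parts[0]
                for op, atom in zip(ops, parts[1:]):
                    expr = f'({expr} {op} {atom})'
                yield expr

def synthesize(verifier, limit=100000):
    """Return the first enumerated program text the verifier accepts."""
    for i, text in enumerate(candidates()):
        if i >= limit:
            return None
        f = eval(f'lambda n: {text}')
        if verifier(f):
            return text

# Verifier for the spec 'f(n) == 2*n + 1 on 0..9'.
spec = lambda f: all(f(n) == 2 * n + 1 for n in range(10))
print(synthesize(spec))   # finds '((n + n) + 1)'
```

As the footnote says, the synthesis inherits the verifier's complexity class; what it doesn't inherit is any practicality.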


> I offer that perhaps the issue here is an ignorance of the history of humanity, at least with respect to computing?

Perhaps, but as no one has offered a counterexample, I have no way of knowing.

> Dijkstra did do what you claim is impossible several times in his career

No, he did not. The meaning of a technique not being scalable is that while it does work for small instances it does not for large ones.

We have learned so much since Dijkstra, and can apply even more powerful end-to-end verification/synthesis than he had at his disposal and for larger programs, but we still can't scale it, just as he couldn't, and, indeed, didn't.

> You have heard of decomposing a large program into smaller programs, right? That's how it is done and has been done.

No, it has not been done, but unlike 50 years ago, in the decades we've studied the "hardness" of program analysis/synthesis we have an idea as to why. There are now complexity theory results that show that decomposition cannot always work. In addition, we now know -- what we didn't 50 years ago -- that there are essential barriers to scaling. Things that are easy when small can be absolutely impossible when large.

> This is a foolish self-imposed constraint.

It's not a constraint as much as saying that the claim you have a technique that can more effectively yield programs that we already know how to get more-or-less right and the claim you have a technique that can effectively, feasibly and scalably produce programs that are otherwise very hard to get right are two very different claims. It's like the difference between improving a polynomial algorithm by a constant and finding a polynomial algorithm for NP-complete problems. The first might get you published; the second will make you world-famous.

> It's still irrelevant to my claim about it being easier for a programmer to derive programs that meet an arbitrary specification than to test that an arbitrary program text satisfies it.

The proof is relevant because it precisely proves that your claim, as stated, is, in fact, false. Your claim might be made about "common cases" etc., but that would require a much more careful wording, as well as a careful empirical study rather than insistence.


> No, he did not. The meaning of a technique not being scalable is that while it does work for small instances it does not for large ones.

I see, so a complete and correct multiprogramming operating system running on unreliable hardware is a "small instance" according to you. This is the sophistry I promised to dismiss. Consider it done.

> There are now complexity theory results that show that decomposition cannot always work.

Obviously decomposition can't always work. Primitives can't be decomposed by definition. Yet another trivial truth.

> It's not a constraint as much as saying that the claim you have a technique that can more effectively yield programs that we already know how to get more-or-less right and the claim you have a technique that can effectively, feasibly and scalably produce programs that are otherwise very hard to get right are two very different claims. It's like the difference between improving a polynomial algorithm by a constant and finding a polynomial algorithm for NP-complete problems. The first might get you published; the second will make you world-famous.

This word-salad has nothing to do with anything I've written.

> The proof is relevant because it precisely proves that your claim, as stated, is, in fact, false. Your claim might be made about "common cases" etc., but that would require a much more careful wording, as well as a careful empirical study rather than insistence.

This is more interesting, but also misleading. Obviously every single program ever written falls within the class of "common cases," since, asymptotically speaking, very nearly all program texts have never been written, let alone studied. Look ma, I can do sophistry too!


> Hoare logic and the predicate calculus are entirely adequate for formal derivation in any language with well-documented semantics.

Formal verification, yes. Formal derivation of a correct program from a spec, I really don't think so. In particular, I don't know of an easy way of deriving the required induction hypotheses/loop invariants to prove correctness of a program which includes recursion or loops.
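Once an invariant has been guessed, the remaining obligations are mechanical; the guess itself is the creative step this comment points at. Here is a standard small example with the invariant checked at runtime (the function name and the use of assertions as a stand-in for proof obligations are my own illustrative choices):

```python
# Integer division by repeated subtraction, annotated with its
# loop invariant: n == q*d + r and r >= 0.

def divmod_by_subtraction(n, d):
    """Compute (q, r) with n == q*d + r and 0 <= r < d, for n >= 0, d > 0."""
    assert n >= 0 and d > 0                # precondition
    q, r = 0, n
    assert n == q * d + r and r >= 0       # invariant established
    while r >= d:                          # guard
        q, r = q + 1, r - d
        assert n == q * d + r and r >= 0   # invariant preserved
    # invariant plus negated guard (r < d) give the postcondition:
    assert n == q * d + r and 0 <= r < d
    return q, r

print(divmod_by_subtraction(17, 5))   # (3, 2)
```

The invariant together with the negated guard discharges the postcondition; discovering such an invariant for a nontrivial loop is precisely the part no easy recipe covers.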


I use "formal" to mean using mathematical methods like those a mathematician uses when deriving a proof, which is to say a human-directed process. I would say "mechanical derivation" if I meant having a program do it automatically.



