Category theory is a universal modeling language (math.mit.edu)
214 points by bmc7505 on April 29, 2021 | 91 comments


I just recently stumbled upon the "univalent foundations" approach to mathematics, i.e. homotopy type theory (HoTT), and holy schnikes this is a deep rabbit hole to fall into. It would appear that category theory and set theory can both be represented in HoTT.

I think the world is not going to reap the true benefits of computers for scientific progress until we systematize formal knowledge within a single shared syntactical universe. Open science means nothing if it has become so siloed and complex that results are largely unverified and frequently unverifiable.


It's not really a matter of what formal system is most powerful. It's a matter of what formal system can represent ordinary math in the simplest and clearest fashion. I think this is what proponents of category theory argue. Basically, category theory lets one take many standard mathematical theorems and generalize them widely - i.e., a lot of standard math theories can be seen as theories about categories, and when you see them that way, the generality of a category means the theorem applies broadly.


I know category theory is popular in CS-land but over in math land it falls squarely in the "general nonsense" bucket for most working mathematicians.

E.g. most mathematicians learn the definitions of categories and the Yoneda Lemma perhaps as a chapter in some other text -- say a text on more modern algebraic geometry -- and that's about it.

The thing to understand is that mathematicians have jobs to do like everyone else -- they are not philosophers sitting in contemplation, but eager beavers trying to answer specific questions and solve open problems. So they are interested in picking up the tools that will help them solve unsolved problems more than picking up the tools that will help repackage the research of others. And the main feature of category theory is its ability to repackage the discoveries of others, rather than necessarily driving new research.

That's not to say that occasionally category theory can't be useful, but just not that many people are going to be working in those research fields where they find anything beyond the chapter's worth of Yoneda lemma plus some of the diagram formalism to be useful, and everyone's time and attention span is limited to what will help them make discoveries in their field.

Rarely is the answer to the question of how to solve some problem in differential geometry, PDEs, number theory, symplectic geometry, combinatorics, etc. -- "more category theory". The same is true for formal logic. Most people learn the incompleteness theorems and maybe even some things about computability, but they do not use formal logic to advance their research agenda, even though in theory proofs can be encoded in formal systems, and so in some sense these formal systems are a unifying framework in which to view mathematical proofs.

On the other hand, there have been some spectacular breakthroughs with physics/string theory, so a lot of people were trying to brush up on that. But here, too, progress is quite difficult, and most people aren't going to go out of their way to invest too much time in it beyond a summer school or so.


I share your view of what working mathematicians think like on this subject. I summarize it thus: how many interesting theorems are there in Category Theory? And the answer you get from proponents varies, but it seems to be like 1-3 with some being kind of a stretch. I would love to learn of a body of theorems like those of analysis or topology or algebra, but by comparison there seems to be very little "there" there. I would be happy to be proven incorrect on this!


Yeah, there's a lot of confusion between using concepts like morphisms as notation in one's work and category theory as a research program in and of itself. I don't think there is any research program there. But a lot of the more abstract topics like Motives use this language of morphisms, and so people consider that to be category theory as well. I've even heard of people calling all the diagram-chasing techniques like spectral sequences category theory, but I have to keep pressing the notation-versus-ideas distinction and give that a firm "no". If we start organizing mathematical subjects by their notational or bookkeeping systems, then we will not end up with useful categories (so to speak:)).


I am personally interested in this, as a cryptography PhD student, because the state of “provable security” is an absolute dumpster fire. In my opinion this is one particular application where machine-checkable proofs need to be established as a baseline standard.

Mathematics is not a jobs program—we should be building on previous work in the most efficient and effective way we can, and getting everyone to use standard techniques that easily translate to mechanizable proofs is the only way that is going to happen.


There are very few areas where mechanizable proofs have had success in math. I think some of the classification of finite groups was done with computer assistance, and some of the cases of the four color problem as well. But basically these are hand-crafted programs. Think of someone telling you that software engineers will be replaced by software-writing computers -- that was the (failed) promise of UML. This is one of those situations where the software is generally not smarter than the person writing the code, and mathematical research, at any point in time, is driven by about 200-300 of the smartest people alive, with another few thousand adding finishing touches and elaborations on their work, and perhaps 10,000 adding small marginal adjustments to it. The idea that mathematicians' jobs are going to be automated by software is orders of magnitude less likely than the idea that software engineers will be automated by software. Lots of grand plans, but not much in the way of productive, transferable progress.


My comment was not to imply that software will replace people “doing math”, but that the math which is “done” will be encoded in a standard format that can be reused (and verified) more readily.

Really, this is about software—if you want to do anything with a computer, it must be formalized. Some of this formalization takes place before a line of code is written, usually in the form of mathematical statements. I’d argue that this part of the process ought to be standardized just like library code is standardized, or we will be carrying a very heavy load of collective technical debt as software is asked to do more and more complex and high-assurance tasks.


I don't quite understand -- we know that, for example, correct program analysis is theoretically impossible. There is no tool to verify a general software program. Why should there be a tool to verify a general theorem? Verification is at least as complicated as writing the program you are verifying.


> Why should there be a tool to verify a general theorem?

We want to verify proofs, not theorems. And that's far from "theoretically impossible."


Sorry, of course I meant to say "proof". A proof is just a more general case of a program. A program has logical operations, variables, control flow, etc. You can include all of these elements into a proof, and often do. Every program can be made into a proof, but not every proof can be made into a program. If you can verify the correctness of an arbitrary proof, you can verify the correctness of an arbitrary program. And even in practice many of the reasoning errors that happen in proofs also happen in programs -- forgetting to consider unforeseen cases, not being correct about the possible ranges of variables, not anticipating all the external inputs into the program, making incorrect assertions about the state of variables, etc.


> Every program can be made into a proof

I'm not sure what you mean by this, but in my usage it's exactly the other way around. A proof can easily be interpreted as a program (as per the BHK and similar interpretations) but to interpret a program as a proof of something, you at least need to also formally prove that it halts. Edit: your edit clarified things a bit. The control flow in formal (and informal!) proofs is restricted compared to the control flow in programs: in order to do recursion or loops, you must have a formal argument that the recursion / loop terminates.
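
To make the proofs-as-programs direction concrete, here's a throwaway Haskell sketch (names are just illustrative): a total function inhabiting a type is a proof of the corresponding implication.

    -- "A and B implies B and A", written as a program (a total function):
    swapPair :: (a, b) -> (b, a)
    swapPair (x, y) = (y, x)

    -- "A implies (B implies A)":
    constK :: a -> b -> a
    constK x _ = x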


So normally when we say "proof", we assume the proof is correct. But in this discussion we are talking about possibly incorrect proofs, with the goal of verifying the correctness of the proof. So let's change "proof" to a set of statements/procedures which may evaluate to "true", and we need to verify that they do evaluate to true. Things like loops do appear in proofs as quantifiers. So for example a proof may perform a sequence of steps in which some set is depleted, and conclude that "as we must deplete all the elements of the set, the claim holds", but determining whether all elements are actually depleted by the algorithm spelled out in the proof may be hard. These types of claims appear in actual proofs, and the types of errors that are found in proofs often boil down to some case not being considered which would cause the set not to be depleted.

Update: I'm unable to reply due to HN recursion limits, but as to your strictly-smaller argument, that is not enough with infinite sets, which are common in proofs. Consider a proof about all surfaces, which you think you have divided into a countable number of cases, but in reality there are some other types of surfaces you haven't considered. In math you are making assertions, in some sense, about the world, albeit the mathematical world, and you are claiming to have explored all the relevant parts of it for the sets of interest, but you may not have explored some parts you didn't know existed. The idea that you can formally verify such a sequence of claims is... mind-boggling. There is a reason why very little progress has been made in formalizing math, and mistakes are sometimes discovered in even published proofs.


This is all a bit vague, so I may be misinterpreting here. But in the given example, the proof must also prove that at each step the set is getting strictly smaller, so the computer still only has to check that the proof is correct. That is, it doesn't have to check for itself whether the set is getting smaller, it just asks you to prove that the set is getting smaller in a way that it understands.

Anything more fancy than that is left up to proof automation, which is free to give up and say "nope, that leap's too big for me."

> So let's change "proof" to set of statements/procedures which may evaluate to "true", and we need to verify that they do evaluate to true.

I'm not sure that's a good way of looking at it. I think the right image is to think of a proof as a program, and checking the proof as typechecking the program. The program has the additional restriction that whenever it does loops or recursion, it has to prove that something is getting strictly smaller with each iteration.
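
For a rough feel of how that restriction plays out in practice, here's a small Lean 4 sketch (my own toy example, not from any real development): the checker accepts a recursive definition when the argument visibly shrinks, and balks otherwise.

    -- Accepted: the argument strictly decreases at each recursive call.
    def sumTo : Nat → Nat
      | 0     => 0
      | n + 1 => (n + 1) + sumTo n

    -- Rejected without extra work: nothing obviously decreases, so the checker
    -- would demand a proof of a decreasing measure (which nobody has for Collatz).
    -- def collatz : Nat → Nat
    --   | 0 => 0
    --   | 1 => 0
    --   | n => if n % 2 == 0 then 1 + collatz (n / 2) else 1 + collatz (3 * n + 1)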


Wrt your update,

> The idea that you can formally verify such a sequence of claims is... mind-boggling.

Eh, if you're convinced that something's correct, it's not too mind-boggling that you can also explain it to a computer, given that it accepts all the same basic assumptions that you do.

> There is a reason why very little progress has been made in formalizing math, and mistakes are sometimes discovered in even published proofs.

Yep: it looks like Lean is making good progress on their goal of formalizing the "undergraduate curriculum," but that's only the tiniest sliver of math. Math is hard, and doing it formally is even harder, so all I'm saying is that contrary to your earlier statements, it's definitely possible.


> Verification is at least as complicated as writing the program you are verifying.

I guess, writing a proof is basically like writing a program. Then checking a proof ought to be like *running* a program, not proving an arbitrary property of it. If your proof/program evaluates to "false", it's a wrong proof. So this is all a question of having an appropriate semantic model of computation/proof-checking, on which you can check your proof.

You could then ask: "why is it that I can always run a proof? What if running it does not terminate?" To which I'd have to admit that I am really, seriously not an expert on this, but I vaguely recall that interactive theorem proving systems like Coq have non-Turing-complete type systems. Maybe it's for a relevant reason?


Nobody seriously thinks they can automate mathematicians, arguably the most creative people on the planet, out of a job.

Edit: also, I think you're wrong in saying that mathematical research "is driven by about 2-300 of the smartest people alive," but that's a rant for another time.


Perhaps look into Metamath Zero / mm0, which... well, I'll just quote from the project [1]:

> Metamath Zero is a language for writing specifications and proofs. Its emphasis is on balancing simplicity of verification and human readability of the specification. That is, it should be easy to see what exactly is the meaning of a proven theorem, but at the same time the language is as pared down as possible to minimize the number of complications in potential verifiers.

> The goal of this project is to build a formally verified (in MM0) verifier for MM0, down to the hardware, to build a strong trust base on which to build verifiers.

[1]: https://github.com/digama0/mm0


> I am personally interested in this, as a cryptography PhD student, because the state of “provable security” is an absolute dumpster fire. In my opinion this is one particular application where machine-checkable proofs need to be established as a baseline standard.

As a non-cryptographer and generally a layman, I wonder how useful machine-checked proofs about crypto are in security. What do they tell you?

The underlying reason for my skepticism is best illustrated with Spectre: it made architecture people sweat bullets, but also rendered previous applications of formal methods less useful. What good is a formal proof that a crypto algorithm is constant-time, if it assumed a model of computation that is a research toy? You can only verify properties that you can formulate within a given semantic model. I suppose, for hardware those are way less understood than for programming languages.

(Which is not to say that formal verification is hopeless or useless as it is. Just really curious what kind of proofs a crypto person would benefit from machine-checking, and how.)


Wasn't Spectre a hardware-level vulnerability and not at all related to cryptography? I remember it being something along the lines of, the way the CPU cache worked could be exploited and allow an attacker to read the data it contained. Correct me if I'm wrong.


You are absolutely right! I brought up Spectre not meaning to say it's an issue for cryptography (although maybe it is? it's above my paygrade :) ), but to illustrate one issue with formal proofs in security that I am trying to resolve for myself.

Spectre is a hardware-level vulnerability, based on how code executed under misspeculation (the processor makes a branch prediction, executes stuff, and then figures out the prediction was wrong) leaves detectable traces in the cache. Technically, that is not specified, because caches and speculative execution are not a part of the architecture. This renders security proofs at a higher level of abstraction not necessarily correct, or simply non-applicable (think properties like whether a program executes in constant time or whether it is sandboxed), if they rely on the ISA being all there is to know about the hardware. Now, Spectre specifically might be "solved" as a problem, but the proof-minded people would be concerned with how to prove that. How to prove that things like this are not a security issue: https://asplos-conference.org/abstracts/asplos21-paper148-ex...

And to do that, it turns out they need to extend the models under the hood of their proofs with speculation semantics. That's a non-trivial task by itself, and I bet it would trigger non-trivial conceptual changes to mechanized proofs too, so this all might be a labor-intensive game of catch-up. Alternatively, one could raise the question of what fully formal security proofs we want. Formal proofs make explicit assumptions about the underlying model of computation. Some of those assumptions might not hold because we (incorrectly) thought we could abstract some complexities of hardware away. Is that all right for the kind of security properties we want to have proven? It might be for some, but not others.

So with this in mind, I am curious how crypto experts think about properties they want to prove for their work. What value would the crypto community seek to extract from formal proofs? Are leaky abstractions an issue for them? It's all very interesting.


one thing where proofs are invaluable is the security of distributed systems where you need to exhaustively consider many uncoordinated actors.


The thing with “solving open problems” is whether one is solving particular instances of an abstract problem or a general case of a practical problem.

The latter is what software engineers call “design patterns.”


Regarding what proponents of category theory argue as the purpose of category theory, [nlab/0] and [nlab/1] have some extremist takes, [chan] has a video series that shows how a math student is typically sold on learning category theory, and [bauer] is a video which explores using a corner of category theory to formalize some basic computability results in a very different way from the usual formalism centered on Turing machines.

[nlab/0]: https://ncatlab.org/nlab/show/nPOV

[nlab/1]: https://ncatlab.org/nlab/show/applications+of+%28higher%29+c...

[chan]: https://www.youtube.com/playlist?list=PLgAugiET8rrLkzTFuwmiV...

[bauer]: https://vimeo.com/510188470


This kind of foundational (ZFC, CT, HoTT) stuff is pretty niche in mathematics. Set theory, though touted as the foundation of math, isn't really used to its full extremes (HOD, etc.) by anyone but set theorists. It's more like... a common datatype? And while CT could also be a foundation, as far as I'm aware its only practical uses as a tool are found in commutative algebra.

I think the truth is that most math is pretty simple to describe and doesn't require a grand framework to formalize.


> I think the truth is that most math is pretty simple to describe and doesn't require a grand framework to formalize.

I agree -- most fields of math are happy to establish their own axioms, and leave the reducibility of those axioms to some "foundations" as an exercise -- an important one, but not one that necessarily furthers the field itself.

However, Voevodsky's stated aim for his Univalent Foundations project was not to formalize everything under a grand theory of everything. He got the bejeebus scared out of him when one of his results was very, very subtly wrong, in a way that was difficult for mathematicians to verify for years, even given a reasonable inkling of a counterexample. So he sought to improve the state of the art in proof verification.

You might argue, it's better to produce a verification tool for one field of mathematics than all of them at once. But you'll have a hard time finding any significantly complex theorem that doesn't draw insight from multiple places. (Arguably, that's why they're significantly complex in the first place). And all fields of mathematics have some basic elements in common. So, I'd argue, you're unlikely to do a good job for any but the most narrow domains of mathematics without seeking some level of generality.

Any "foundation of mathematics" can be used for this purpose, and plenty of theorem provers have been built on each (Twelf, PVS, Agda). The draw of category theory is that it seems to capture at a fundamental level many of the basic conceptual elements we see across mathematics. (Homomorphisms are so universal it's not even funny.) It seems reasonable to pick something that starts us closer to where we want to be; the difficulties we encounter are more likely to be fundamental problems and not issues of encoding.


Couldn’t agree more. I am a current (mid career) PhD student in cryptography and WOW is it a mess. Provable security arguments are generally written in the implicit propositional logic + set theory of working mathematicians, but it is woefully inadequate for the job. Some very powerful composable frameworks have been developed, but they are really straining against the clunkiness of the formal system in which they have been constructed. The best new ideas I’ve seen for formalizing cryptographic settings all belong in category theory or topology, but have been formalized in set theory because that seems to be the only tool that the average cryptographer intuitively knows how to use.


I'm not opposed to the use of proof assistant software for difficult theorems. And if one foundation or another is more applicable to that software, then that's fine.

It's more the narrative of the grand unification of mathematics that's spun around 'foundational theories' that irks me. In reality it's just various emulation schemes.

Rephrasing other mathematicians' work and claiming fundamental status, while not producing many novel theorems... just strikes me as very distasteful.


Only if one allows their ego to get in the way of the search for abstract truth does this become “distasteful”.

There’s no claim of fundamental status here, merely a desire to standardize aspects of what is very much an artisanal process.


In addition to CT's application to commutative algebra, it's pretty nice for reasoning about programs and logic.

Programs: http://cseweb.ucsd.edu/~rtate/publications/proofgen/proofgen..., https://blog.sumtypeofway.com/posts/introduction-to-recursio..., the Functor/Applicative/Monad hierarchy in Haskell et al

Logic: https://publish.uwo.ca/~jbell/catlogprime.pdf

CT as a bridge between programs and logic: https://existentialtype.wordpress.com/2011/03/27/the-holy-tr...


All formal systems have their limitations as far as what is provable.

Example: for any statement A unprovable in a particular formal system F, there are, trivially, other formal systems in which A is provable (take A as an axiom). On the other hand, there is the extremely powerful standard axiom system of Zermelo-Fraenkel set theory (denoted ZF, or, with the axiom of choice, ZFC), which is more than sufficient for the derivation of all ordinary mathematics. Now there are, by Gödel's first theorem, arithmetical truths that are not provable even in ZFC. Proving them would thus require a formal system that incorporates methods going beyond ZFC. There is thus a sense in which such truths are not provable using today's "ordinary" mathematical methods and axioms, nor can they be proved in a way that mathematicians would today regard as unproblematic and conclusive.


This is well known. Type theories also do not assume the law of excluded middle. What's your point and its relevance to the comment?


The issue is not one of “power”—one can make almost any formal system sufficiently “powerful” by adding a few axioms that get you over the rough spots in your proofs.

The problem is that the modern world increasingly relies on mathematics, in nearly all aspects of life, to make consequential decisions about how to interpret data and act on it. The challenge is not one of power, but of fidelity—using a mathematical structure that is a suitable abstraction for the real world situation. Use the wrong abstraction, and you end up with paradoxes like the incompleteness theorems.

I believe a goal of univalent foundations is to make the abstract structure that one is working with absolutely explicit, evidenced by the (almost) total absence of axioms from the system. If you want to work in ZFC, you can add those axioms yourself.


An interesting thing happened when category theory was used to model sets. It turned out set membership was problematic. When wearing your category theory glasses "parameterized elements" (subobject classifier) were the natural thing. Think tangent planes to a sphere instead of points on a sphere.


Isn't that a similar idea to the Hilbert program, which failed due to Gödel's incompleteness theorem?


Hilbert's program was about whether every statement could be proved, ex nihilo, either true or false. The question remains: given a proof, can we verify it? Arguably, any proof conceived by a human mind should be verifiable in principle. So the Incompleteness Theorem doesn't really apply, either formally or morally.

A more interesting argument could be made on the basis of Rice's Theorem: for a Turing-complete language, the only non-trivial properties provable about a program are fundamentally syntactic. (Type systems can be seen as a way of elaborating your syntax to capture finer-grained details of your semantics). Here we have a richer and nearer-to-hand collection of programs whose properties are inaccessible -- take, for instance, whether the 3n+1 algorithm halts on all inputs. Again, though, if we have a proof of some property, we still ought to be able to verify it.


Isn’t verifying a proof equivalent to determining whether a certain computer program halts?


No, but it would be equivalent to verifying that a given computer program halts within a specified number of steps N.

This is the same distinction between (among other things) P and NP problems. For any problem in P, given an instance of the problem I can produce a solution in polynomial time. For any problem in NP, given an instance of the problem and a candidate solution, I can determine whether the solution is valid in polynomial time.

Determining whether a certain computer program halts would be equivalent to producing a proof, not verifying an existing one.

Verifying a proof is, in principle, only a matter of making sure that each step of the proof is valid. Coming up with the right steps is a creative endeavor. Certainly, coming up with the right set of allowable steps is a creative endeavor, and proving that your system is expressive enough for interesting proofs is a creative endeavor. But once given a proof, verifying that it legally applies legal steps is a purely mechanical endeavor.


> But once given a proof, verifying that it legally applies legal steps is a purely mechanical endeavor.

But isn't this the same as having a program and then executing it? Which then has the problem of whether it finishes? So back to the halting problem?


No, because a proof (especially a human-constructed proof) is a finite object built out of basic pieces. The only question as to the validity of a proof is whether the basic pieces have been fit together according to the established rules of the game. We don't have to have any notion of "execution" of a proof to verify that its construction is valid.

Let me put it this way. I tell you that I can make apple juice if you give me an apple and a juicer. Do you have to give me anything to believe me?

Several researchers have indeed put together a theory of "execution" of a proof; see for instance Prof. Frank Pfenning's research program on logic-based process calculi. These theories tend to have the common theme of execution as normalization -- the chosen mode of execution actively changes and evolves the proof structure until it reaches a terminal form, usually one with a particularly simple logical structure. Cut elimination is a central theme: any logical system worth calling "computational" (one argues) will support cut elimination, allowing you to replace a proof with steps like `A -> B, B -> C |- A -> C` with a direct proof of `A -> C` that does not use the intermediate hypothesis `B`.
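
Under the proofs-as-programs reading, that cut is just function composition; a one-line Haskell sketch (illustrative only):

    -- A proof of A -> B composed with a proof of B -> C yields a proof of A -> C.
    -- Cut elimination corresponds to normalizing the composite so the
    -- intermediate B no longer appears as an explicit stopover.
    cut :: (a -> b) -> (b -> c) -> (a -> c)
    cut f g = g . f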

But we're not doing that when we're verifying a proof. We're looking at the finite, static structure of a proof term, and checking that it was put together legally.


The problem with Hilbert’s program was, loosely, that the power to construct propositions was stronger than the power to mechanically check proofs. That is, the formal language of propositional logic + number theory can construct propositions that are not decidable in a finite number of steps.

There are two responses to this discovery: tweak the formal system so you can’t accidentally run into these traps, or shrug and just be more careful about how you prove things.

Now that the world is increasingly computerized, it's become clear that the right choice is to stop using a default form of mathematical reasoning that causes major issues for mechanized proof assistants, and instead build a new foundation that is friendly to computers (but can be extended to support the power of classical formalisms).


Parent is referring to something a bit different - Hilbert’s program was about all of mathematics, even that which hadn’t been discovered yet, whereas the parent is talking about formalizing existing mathematical knowledge in some HoTT-enabled programming language or proof-checker (hence the emphasis on verifying and the problem of siloed knowledge).

Obviously, Gödelian incompleteness doesn't apply to true theorems with existing proofs.


Naive outsider intuition is that you can encode all category theorems as graphs, but only because you can encode anything as a graph, and the property of category theory that could make it a universal modelling language is really just an artifact of it being a way to talk about things that can be encoded as graphs.

At best, perception may be graphs all the way down. As though we were living in a simulation specified by a Matrix of adjacencies.


The main thing categories add to graphs is identifications of paths. Every pair of edges f: A -> B and g : B -> C generates a third edge g.f, and we identify the path (g)(f) with the new edge (g.f). Not only that, we add h.(g.f) = (h.g).f for any h: C -> D, which basically says that "." acts like concatenation of paths (it doesn't matter how you group the edges in a path, you get the same thing).

Those are the basic axioms to say you're working with a nice operation that turns paths into edges in a reasonable way. It gets interesting when you add more identifications (commutative diagrams) and operations (functors). Prototypical example: vertices are sets, edges are functions, you get identification of paths (sequential compositions of functions) whenever the underlying composite functions are pointwise equal.
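
If it helps to see it in code, here's a rough Haskell sketch of the same axioms (a hand-rolled class for illustration; base ships essentially this as Control.Category):

    -- A category: arrows with identities and an associative composition.
    -- Haskell can't check the laws, so they live in comments.
    class Cat arr where
      idA  :: arr a a
      comp :: arr b c -> arr a b -> arr a c
      -- laws: comp f idA == f
      --       comp idA f == f
      --       comp h (comp g f) == comp (comp h g) f

    -- Prototypical example: objects are types, arrows are functions,
    -- composition is ordinary function composition.
    instance Cat (->) where
      idA  = \x -> x
      comp g f = \x -> g (f x)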


> The main thing categories add to graphs is identifications of paths.

There's another, very different way to describe "identification of paths", although I agree that both are fundamental to categories.

What you've described is better known in graph theory as the "transitive closure" of a graph. Indeed, every graph (and even every multigraph, with multiple edges between two nodes) gives a category via its transitive closure. But this category is the free category on a graph, and there are many categories that are not "free", so there is still something missing.

What categories add on top of the transitive closure is the ability to say that two paths between the same nodes are fundamentally equivalent. (In other words, you are identifying two paths -- see the confusion?) If those paths could be further composed with other paths, then the resulting composites are also equivalent (i.e. if a = b then ac = bc); proceeding in this way gets you a new category.

It's the ability to say that two distinct "paths" (lists of edges) give you the same "edge" (single step) that really distinguishes category theory from graph theory. Commutative diagrams are popular in category theory precisely because they graphically display which paths produce equivalent edges.


> It's the ability to say that two distinct "paths" (lists of edges) give you the same "edge" (single step) that really distinguishes category theory from graph theory. Commutative diagrams are popular in category theory precisely because they graphically display which paths produce equivalent edges.

It's the idea of graph isomorphism that confuses me about this, because I have trouble separating it from recognizing two equivalent paths as a type. The main reason is I don't have depth in the concepts at all, but the secondary one is that digraphs with attributes seem to provide a covering abstraction for this. I should probably RTFM (or, more accurately, Do-TF-Graduate-Degree), but every time I read an explanation it creates a curiosity I can't leave alone.


I'm nervous that there might be a confusion between levels of abstraction here. Let me try to ground the terminology a bit, and see if that helps first.

As a recognized term, "graph isomorphism" would lift to category theory as "category isomorphism". These are not entities within a category (uh, per se), but they're relationships about and between categories. A graph/category isomorphism describes how two graphs/categories can be equivalently described in terms of each other.

In graph theory, a bidirectional path is a pair of paths within a graph. Two nodes are part of the same strongly-connected component if you can travel between them both ways. In the transitive closure of a graph, two nodes are in the same strongly-connected component if and only if the edges a->b and b->a exist within the graph.

In category theory, an isomorphism of objects is a pair of arrows f, g within a category between the same two objects -- such that their compositions fg and gf are both equivalent to the identity arrow. (In graphs, the paths fg and gf must be considered equivalent to the empty path.)

In a graph, you might naturally think of isomorphic objects as those that "relate" the same way to all other objects -- that is, if you didn't already know which node you were looking at, you couldn't tell the two apart from a vantage point anywhere else in the graph. This intuition broadly carries into category theory, though you can have multiple edges between nodes -- that's why the `fg = gf = 1` rule needs to be added. The transitive closure guarantees that for any isomorphic objects A and B, any path entering (leaving) A (B) can be extended to enter (leave) B (A), and the identity condition guarantees that the extension we added to cross between the two objects didn't add or remove information.

Put a slightly different way, if I have a path X -> Y that passes through either A or B, there is no way to distinguish whether the path went through A or went through B.
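
In code terms, a toy Haskell sketch of that rule (laws only in comments):

    -- An isomorphism: a pair of arrows whose composites are the identities.
    -- Laws (not machine-checked here): from . to = id and to . from = id.
    data Iso a b = Iso { to :: a -> b, from :: b -> a }

    -- Example: a pair and its swap carry exactly the same information.
    swapIso :: Iso (a, b) (b, a)
    swapIso = Iso (\(x, y) -> (y, x)) (\(y, x) -> (x, y))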


Thank you, yes, levels of abstraction is precisely what I'm being tedious about. These are very generous explanations and I respect your time on this, please ignore if I veer into a cranky Gish gallop.

This distinction between a bidirectional path in a graph vs. a functor between objects is a great clarification, and I can begin to comprehend how the representative directional arrow in a graph and the logical operation of mapping in CT are different - I'd been treating them both as just morphisms between objects.

The universality of CT implied a conjecture to me where, either there are theorems of CT that cannot be expressed as graphs, or there are graph theorems that cannot be expressed in CT.

"Expressed," is handwavy, but because we're on the edge of notation/encoding and representation vs. whether there a homology between graphs and categories, when I get into the question of a universal modelling languege, it raises the question of what the minimum necessary set of instructions to express theorems in that language (either CT or graphs), and whether the smallest program that can express CT will resolve to a graph.

If the Kolmogorov complexity of the program that produces theorems in CT is greater than the one which produces theorems for graphs, and that program itself reduces to a graph, it implies to me that the less complex program is the more universal modelling language.

This was why, in terms of modelling, I thought that CT seemed more like an ornamented subset of graph theory. However, that's like saying math is just a subset of Gödel numbering, which, well, yes, but that's not very helpful. :)


> Gish gallop

Thanks for that; I learned a new phrase ^_^

Yes, admittedly your post throws a bit of a wide net, but I think I see what you're angling for.

> it raises the question of what the minimum necessary set of instructions to express theorems in that language

That's a really good question. Take the NAND gate as an example: we know we can implement any boolean function with only NAND gates. So the circuit language consisting purely of the NAND gate (and wiring rules for connecting gates) is ridiculously compact, even minimal. In some sense, it has a very very low entropy.

However, the circuits we construct are conversely ridiculously complex! Any given circuit is just a wad of NANDs, and trying to extract meaning from that circuit beyond executing it as a black box is going to be very difficult. You could say that these circuits have a very high entropy.

In comparison, suppose we allow ourselves the usual set of {AND, OR, NOT} gates. We know that any logical function implementable by {NAND} is implementable by {AND, OR, NOT}, and vice versa, so there is no loss in functionality. But circuits expressed in this language have far more structure, making them individually much easier to analyze. We can define such notions as "conjunctive normal form", in that any circuit can be transformed into an equivalent three-layer circuit: a layer of NOTs feeding into a layer of ORs feeding into a layer of ANDs. These circuits have far lower entropy in exchange for a modest increase in our circuit language.
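
Concretely (a throwaway Haskell sketch), the structured basis really is recoverable from NAND alone, which is the sense in which the smaller language loses no functionality:

    -- The single primitive:
    nand :: Bool -> Bool -> Bool
    nand True True = False
    nand _    _    = True

    -- Recovering the {AND, OR, NOT} basis from NAND alone:
    not' :: Bool -> Bool
    not' a = nand a a

    and' :: Bool -> Bool -> Bool
    and' a b = not' (nand a b)

    or' :: Bool -> Bool -> Bool
    or' a b = nand (not' a) (not' b)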

So when you say this:

> it implies to me that the less complex program is the more universal modelling language.

I think we have two competing notions of "complexity" at hand, and it's very hard to say which one ought to win out. As a personal preference, I like to imbue my creations with more inherent structure, so I prefer things like {AND, OR, NOT} over things like {NAND}. It makes the constructions easier to understand.

If you're studying the languages and not the things created in those languages, then things like {NAND} may be superior. But you usually need some kind of language in which to describe your languages -- did you notice I was using set-theoretic notation? -- and so you have the same question again at a meta level.

One of the attractions of category theory is that it's also a pretty good language for speaking about category theory!


Super interesting, thank you. I forsee these topics distracting me well into old age.

This idea of adding the structure of AND, OR, NOT over a field of NAND gates, without losing information, sounds like the beginnings of where Categories become more meaningful than graphs. The analogy would be that AND would be like a functor category, where again it is represented as an 'arrow', but really it's more like a lossless abstraction, in a strict information or logical sense.

This circuit that is just a wad of NANDs is not unlike a generalized universal intermediate representation of a program, or even an esolang program like BF, because while they aren't efficient for human reasoning or computation at all, they are consistent universal forms for symbolic representation.

The problem in code is always, "this would just be a compute/work problem if only this thing were made of <consistent-data>," and what attracted me to the idea of CT was that it seemed to be a tool for reasoning 'different' things into compositions of their consistent abstractions. I still think that's true, but now have a better sense of why.

I will revisit the short course that led me here as well (https://applied-compositional-thinking.engineering/lectures/)


I've never studied categories, but my reading of the parent comment was that a category was the combination of a graph and a separate identification of paths within the graph, in addition to the bare graph. Thus having isomorphic graphs would be necessary but not sufficient for two categories to be equivalent.

I guess you could represent that as an extra-complex graph, but I'm guessing that would be unwieldy, and may not actually be useful for analysis.


> I've never studied categories, but my reading of the parent comment was that a category was the combination of a graph and a separate identification of paths within the graph, in addition to the bare graph.

Pretty much, yes. In group theory we call this a "presentation": we give the generating elements of the group, followed by the equations to "mod" by. For instance, the group presented by <x, y | xy = yx> is the free commutative group on two elements -- the equation we've modded by forces commutativity, but leaves us otherwise unconstrained. Or the group presented by <x | x^5 = 1> -- that's the cyclic group of order 5.

Categories are novel algebraic structures in some respects, but conceptually relatively standard in others. A naive "presentation" of a category is exactly as you say: you define some fundamental edges and assert any desired equivalences of paths composed from them [0]. But, given the expressive power of categories, sometimes this is still quite explicit and clunky. In particular, you don't usually want to spell out all of the basic arrows you're starting with.

> I guess you could represent that as an extra-complex graph, but I'm guessing that would be unwieldy

That being the case, it's often easier to define a model category by stating how it relates to other, already-understood categories. A "category with finite products" is any category that relates in a particular way to the category with two objects and no non-trivial morphisms. Of course, the phrase "a particular way" is load-bearing, but it's formalized on the notion of "functor" and "universal property". It's a much more implicit kind of definition, but in turn it gives you a very powerful tool for speaking about your category.

[0] https://ncatlab.org/nlab/show/presentation+of+a+category+by+...


Categories aren't just graphs; they have additional properties, the most important of which is composition (ex. function composition). They can also be thought of as a type algebra, like groups or rings, but again with different properties.

Edit: these extra properties allow you to reason about a wide class of problems when they're represented as categories, not just round trip them from unencoded to encoded to unencoded. For example, Pijul uses a category of patches to model file patching and prove certain properties of the version control system.


While Category Theory does make an extensive use of the language of diagrams, this in no way implies that it can be reduced to them. Category Theory is much deeper than graph theory. In fact, the very notion of a category should be seen as ancillary to the much more important concept of a functor (which, in turn, if you believe the word of the founders, was only needed to provide a base for a rigorous definition of natural equivalence).


Visual perception, and perhaps touch, because there is a general mutual exclusivity of visual and tactile 'things' (once you draw a boundary around them).

But taste, smell, and sound ... you can't so readily and purely taxonomize these...


Open source implementation and academic papers on using CT for databases: https://www.categoricaldata.net


As an engineer looking in from the outside of academia, all of the discussion I see pop up around Category Theory seems hauntingly similar to ideas discussed when producing SQL schemas in well-normalized databases.

Is 3NF/BCNF something that can be described in terms of Category Theory?

I always felt like there was some super deep & fundamental link between these mathematical concepts and relational modeling ideas. Reading "Out of the Tar Pit" was a big trigger for me for connecting dots between imperative/functional/relational programming, as well as the implication that a good combination of these things would have a positive impact on domain modeling.


The answer is yes. There's even a notion of 'categorical normal form' that extends 3NF that guarantees the direct representability of such relational databases as set-valued functors. Spivak talks about it in e.g. http://math.mit.edu/~dspivak/informatics/FunctorialDataMigra...
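
If you want a quick feel for the idea before reading the paper, here's a toy sketch (made-up names, and far cruder than Spivak's actual formalism): the schema is a tiny category whose objects are tables and whose arrows are foreign keys, and an instance assigns rows to objects and functions to arrows -- a set-valued functor.

    -- Toy schema: two objects (tables) and one generating arrow (a foreign key),
    -- worksIn : Employee -> Dept. Identities and composites are left implicit.
    data Obj   = Employee | Dept deriving (Eq, Show)
    data Arrow = WorksIn         deriving (Eq, Show)

    -- An instance sends each object to a set of rows and each arrow to a
    -- function between those row sets -- the "set-valued functor".
    data Instance = Instance
      { rowsOf :: Obj -> [String]
      , onArr  :: Arrow -> String -> Maybe String
      }

    demo :: Instance
    demo = Instance
      { rowsOf = \o -> case o of
          Employee -> ["alice", "bob"]
          Dept     -> ["engineering", "sales"]
      , onArr = \WorksIn emp ->
          lookup emp [("alice", "engineering"), ("bob", "sales")]
      }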


Thank you for this. I was having a hard time finding Batman and Mr. Wayne in the same document at the same time. Definition 1.1.1 is blowing my mind right now.

It would seem I should dive into category theory to see how far this rabbit hole goes. From a practical modeling perspective, there certainly feels like a "done/correct" point where any further normalization seems wrong, even if you ignore all the specific rules.


> It would seem I should dive into category theory to see how far this rabbit hole goes.

And that was the last we ever heard from bob1029.


Welcome to the applied category theory community :-)


That's precisely what it is. Homotopy/Category theory is strongly-normalising.

Which is roughly what the Univalence axiom tells us: identity is equivalent to equivalence.

The identity type is the normal/canonical/unique form for all objects of a particular type.

Yet another perspective is the question "Do A and B have the same structure? Are they isomorphic?". When you are dealing with finite data types, the answer to such questions is inevitably in the domain of Finite Model Theory ( https://en.wikipedia.org/wiki/Finite_model_theory ). Finite categories are the same sort of objects as DB schemas.


> I always felt like there was some super deep & fundamental link between these mathematical concepts and relational modeling ideas.

The relational model relies on set theory (more specifically, relational algebra). An alternative view on data and data modeling is based on 1) sets and 2) functions, and is called the concept-oriented model [1, 2]. It is actually quite similar to category theory and maybe could even be described in terms of category theory. It is also quite useful for data processing, and there is one possible implementation which is an alternative to map-reduce and join-groupby approaches [3].

[1] A. Savinov, On the importance of functions in data modeling https://www.researchgate.net/publication/348079767_On_the_im...

[2] A. Savinov, Concept-oriented model: Modeling and processing data using functions https://www.researchgate.net/publication/337336089_Concept-o...

[3] https://github.com/asavinov/prosto Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby


The categorical databases formalism is also known as 'sketches'. For example, here are some slides from NASA et al that connect to David's work but use this alternative phrasing. It shows how to use category theory to 'align' ontologies, with an example about voting. https://www.nasa.gov/sites/default/files/ivv_wojtowicz_sketc...


What's the tradeoff? I mean, how smart do you need to be to produce/understand something useful?

Modelling is about communication and understanding; if the language is expressive, precise, and utterly abstruse for most people, it's a niche application at most (and it's likely very valuable in its niche).

There's a reason most informal methods become more popular.


I can imagine that there is some theorem that says Category theory is a universal modeling language in some sense. For example, it includes all of set theory, so it "includes" everything classical math can model.

But the real question is whether it is a _useful_ modeling language for everything. Engineers know everything has a tradeoff, and category theorists would do well to recognize the limitations of their tools or risk turning people (like me) away after years of study.


The trade-off has to do with computability: because (finitely presented) categories can express so much, reasoning about them (for example, the 'word problem' for them) is undecidable. So to use CT as a modeling language requires automated theorem proving techniques, which are computationally expensive, and fast reasoning is a large part of what we're commercializing.


> But the real question is whether it is a _useful_ modeling language for everything.

Category theory is definitely not useful for modeling everything. However, it's useful for modeling a whole lot. Categories have few axioms, and those they do have align well with how humans break down complex problems, so a lot of systems can be modeled neatly with categories. And category models are useful for proving existence, uniqueness, and notions of "maximum" and "minimum" for their "arrows", which are often useful things people try to prove about systems.


I haven't seen any particularly convincing examples of this outside of pure math.


No, it isn't.

It's a really nice framework though, but there's plenty (pleeeeenty!) of things that lie beyond its scope.

Idealizing CT and its applications only does more harm than good to the field.

Edit: Sure, I'll be glad to talk about it. Have plenty of experience doing (also, avoiding doing) things w/ CT. Just drop me an email (see profile), you're always welcome.


You forgot the part where you explain <why>/point out/link to examples of things that lie beyond its scope.


Isn't there supposed to be a nice three-way equivalence between certain systems of logic, category theory, and lambda calculus? Presumably something that lies beyond the scope of CT also lies beyond the scope of those other systems.


I can see ways to represent lambda calculus constructs using CT and vice versa. I guess that 'systems of logic' includes lambda calculus, or systems with equivalent semantics; so yeah, I can believe there's a connection at some level.

The (one?) problem with CT is that it, by definition, puts too much importance on the relation between things via composition, which is a powerful paradigm in itself, but also quite restrictive. 'Structure' is not usually encoded like that in a natural way.

CT is just part of a larger framework that is currently being unveiled (!!!); but, to be quite frank, people who focus mostly on CT seem to be missing the forest for the trees.


Not to disagree, I just would make a slight correction: in this case it is more like people seem to be missing the trees for the forest.


See e.g. Baez & Stay 2009, "Physics, Topology, Logic, and Computation: A Rosetta Stone". [1]

[1] https://arxiv.org/abs/0903.0340


Could you provide some examples where this was the case?


It almost feels like your goal is to collect emails :). You made me curious. Please share your insights


>It almost feels like your goal is to collect emails

Now that's a new one, keep going pal :)


I'm not a mathematician, but I found Spivak's book on the linked page to be very accessible. They've started a company as well.

https://conexus.com/


I'll second that. Not a mathematician either, and I liked that book too.


Newer work from the same author: Seven Sketches in Compositionality: An Invitation to Applied Category Theory https://arxiv.org/abs/1803.05316


There's a physical book as well. (It's nice!)

https://mitpress.mit.edu/books/category-theory-sciences


The problem is that RDBMSes do not structure information as humans do, making them very bad at modeling and working with information the way we do. We think in a graph of relationships which we recurse through constantly, and graphs and recursion are the two things RDBMSes handle worst, so the architecture of the data is simply wrong for the job.


Relational databases as introduced by Codd AFAIK are pretty much just storages for Prolog's fact statements: they map exactly one-to-one. The underlying data structures used for implementation (B-trees etc.) may or may not be optimal for the Prolog-style queries, of course.


Sure, but Category theory can handle recursion. See Section "6.6 Application: Primitive Recursion" of "OLOGS: A CATEGORICAL FRAMEWORK FOR KNOWLEDGE REPRESENTATION" (http://math.mit.edu/~dspivak/informatics/olog.pdf).


For anyone interested in learning about category theory, this series is very good: https://www.youtube.com/playlist?list=PLbgaMIhjbmEnaH_LTkxLI...


To be honest, I liked The Catsters better. Short and to the point.


Are there any prominent category theorists who are not “characters” and present it in a humble / conventional style of mathematical exposition? A lot of the style of presentation of CT is off-putting.


I got into informatics and ontology because I stumbled across David's paper on ologs. Years later, I'm still waiting for someone to implement a knowledge representation system that uses them, and hoping that it won't end up being me.


I have learned enough Category theory to realise that it isn’t helpful in solving any software design/programming problems I care about. And also that Type theory is a much simpler and cleaner foundation for math than Category theory.


The formalization of what it means for modules to exist and compose with each other is valuable, but learning category theory as a mathematician would is not sufficient for grokking this. Category theory's insights are often obscured by terminology that fails to reify the abstractions at a level where its wide scope becomes clear.

The candidate I would push forward as the proper reification would be Mark Burgess' Promise Theory. Mark's starting point isn't, "what's the most general algebra I can form for encoding mathematical objects", but "what happens if I generalize points in spacetime to be a special case of any kind of graph". When you do this, you make each point in space responsible for either promising information to another point, or accepting the outcome of a promise, as bidirectionality cannot be directly inferred.

So if every point in spacetime cannot be assumed to be connected both ways, then you have to explicitly define the communication channels that exist between points. This idea unifies both physical and digital "spaces", as everything is now modeled as a communications network from atoms up. Thus, Promise Theory makes the connection between compositionality and information and the rest of reality more clear than categories alone.

Mark acknowledges that Category Theory and Promise Theory are both similar and related. For instance, they both deal with notions of interiority. And Mark encodes his axioms with categories. He's written several books on the topic, like "In Search of Certainty" and "Smart Spacetime", and has published papers on arxiv. I recommend "Smart Spacetime" to see how powerful this kind of semantics can get.

https://arxiv.org/abs/1411.5563



