I always found semantic versioning a little too verbose. Particularly when deciding when to release major versions. OSX was on version 10 for many years but of course released a new "major" version every year.
Semantic versioning is just something everyone does in software development, but is it really that necessary?
Semantic versioning is for APIs, not for functionality. So it's for developers consuming that API (whether a library, or a service).
For releases in production, use a calendar version. v2025-11-02 is a clear release tag. Add precision as required. There should be an SBOM/manifest (Software Bill of Materials) of the versioned major components and configuration for that overall release.
For users, it depends on the type of user and what they expect. Their focus is on functionality. So when there's a new feature, bump the number.
It's a bit like car model names. They can be random extension letters like "-X", or "6Si".
So, amongst others, they had Oracle 8i at the height of the dot com boom (i for "Internet"), then a few years later when clustering became big news there was Oracle 10g (the g standing for "grid", I think?), and so on.
Actually, it looks like they might still be doing it - I just checked, and their current version is 23ai...
Developers are "users" (of a library, API, tool...), and "API functionality" is a subset of "functionality": what purpose would such distinction serve?
For example, in end user desktop software (say a text editor), how would you indicate a security bug fix for an old version v2023-11-02 without forcing users to pay for a new version of v2025-09-15?
Again, versioning is a tool, and depending on the release structure of a project, SemVer might work well or it might not (including for APIs/libraries).
Semver is semantic. It tells you about the changes in the API, not in the implementation. So it's relevant to the users (ie developers) of that API.
If I fix a bug in the implementation that doesn't affect the API itself, the semver doesn't change.
So it works to define the versioning of an interface.
Release versioning (like vyyyy-mm-dd) is about SRE and configuration management, it's about documenting what is actually operating in production.
User versioning is about user expectations. If you're doing a security bug fix, then a) it should be free to users affected, and b) it should be documented as the reason for the release.
For a security bug fix of v2023-11-02, you can add a "hotfix" extension to the numbering so v2023-11-02.001 or equivalent.
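One way to make such a hotfix suffix mechanical: a small sketch (the tag format and the `parse_release` helper are my own invention, not a standard) showing that hotfixes sort after their base release but before the next calendar release:

```python
from datetime import date

def parse_release(tag: str) -> tuple:
    """Parse tags like 'v2023-11-02' or 'v2023-11-02.001' into a sortable key."""
    body = tag.lstrip("v")
    if "." in body:
        day, hotfix = body.split(".")
    else:
        day, hotfix = body, "000"
    y, m, d = (int(part) for part in day.split("-"))
    return (date(y, m, d), int(hotfix))

tags = ["v2025-09-15", "v2023-11-02.001", "v2023-11-02"]
# Base release, then its hotfix, then the next calendar release.
print(sorted(tags, key=parse_release))
```

The zero-padded hotfix component keeps plain string sorting consistent with semantic sorting for up to 999 hotfixes per release day.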
I can see how you are using the multiple versioning schemes to achieve certain goals, but that does not make your approach "one true way".
"Semantic" really means "has some background meaning", and the meaning SemVer gives all the numbers is in no way limited to API versioning: I am confused whether you are making a proposal to consider it only that, arguing that different schemes should be used for different types of software, or stating that this is the only thing SemVer can be used for?
On any of those points, I disagree and you haven't made a convincing argument why that would make sense.
Note that I am not saying that SemVer is the only true way to do versioning (very much not so) for all software, just that it depends on your release strategy what is most applicable, and not on the type of software you are shipping.
The original reasoning behind Semver was that it described the changes from the point of view of the consumer of the interface. If the interface didn't change, then the semver didn't change.
So yes, I am arguing it should not be used beyond that.
Horses for courses... semver is about semantic meaning to the consumer of an interface, release versioning is about configuration and SRE, user versioning is about functionality and (for a client app) security.
As for proposing the "One True Way", no, I'm not proposing that, I'm saying that I use 3 different versioning strategies for 3 different use cases.
In a production system that I am involved with, the current production is:
API: v5.0.0 (yes we just introduced major breaking changes, however, we have an endpoint for older semvers)
Release: v2025-10-30 (with release notes saying which API Semvers were available and which versions of the user app)
User app: v3.5.1 (there was a security patch to v3.5)
But very similar versioning schemes have been used by different software for a long time before SemVer was formalized as such: again, if you accept any user interface as the interface, vs API interface (or interfaces for developers) as you originally proposed, I have no issue with that.
Eg. GNU coding standards[1] said the following at least 18 years ago:
You should identify each release with a pair of version numbers, a
major version and a minor. We have no objection to using more than
two numbers, but it is very unlikely that you really need them.
I'd also note that your user app versioning seems to use the same approach.
I don't accept "any user interface as the interface, vs API interface". API has a specific meaning: it is for application programmers, which implies it is for developers of other systems/applications that interface with the system.
What I'm saying is that there are 3 different "users" (actors/use-cases/your-word-of-choice) for a system and each of them need different information about a system that can be expressed in a specific format and implementation of a versioning schema:
1. Other systems, connecting via a documented API (Semver tells devs when they need to be concerned about changes in a particular release to production).
In this case, the users (other developers) are specifically concerned with changes in meaning of an interface, including new information, changes in existing formats, or removal of information.
By definition, this is semantic and Semver provides a schema to support that. Each digit/component of the semver has a specific meaning associated with it.
2. SRE and operations managing the system in production (a calendar based version tells them when a system was released into production, potentially any hotfixes applied, and what SBOM to consult when operating)
It doesn't have to be calendar based, the release version is an arbitrary label. As long as the interpretation of that label is clear between the development and operations teams, it is fine.
Calendar based labelling is usually appropriate for changes to configuration of a system in production.
3. End users of any UX/UI, noting that some systems do not have end users, only other systems. My example was just one of many different "end user version" schemes. Just look at any iOS apps you have installed and what their latest "release" is.
The versioning here can be anything that is clear to the users. A financial system might be versioned by the financial year, or the release of tax and other codes on a regular basis.
Consumer applications might add arbitrary buzzwords like "ai" or "e" or anything else created by the marketing teams.
I believe you are conflating a few things under the wrong categories: something that has only a single "stream" of releases (for instance a web application, but it could be anything: an API, or a desktop/phone application) only needs to communicate where it's at for "debugging" purposes (talking about a feature/bug from previous releases, referring to the problem a customer has...). A calendar version works fine as long as it matches the release cadence (e.g. if you do 100s of releases a day, you might need more than a date) and allows easy referentiality.
If you strictly maintain a single API version and expect all users to stay in sync, calendar versioning works there too: it has a different meaning from the numbers in SemVer, but it is semantic in that it informs the user of when the changes were introduced in your stream of releases.
Just like it does make sense to use SemVer for desktop software where you have paying customers on 2.1 and 3.0.
Your versioning should have a goal and fulfill that goal. Your attempt to shoehorn software types into versioning schemes makes no sense, even if I am with you that SemVer can drive good API development practices if applied to the letter (or can result in you quickly getting to 145.14.7 where SemVer did not really help and is likely not suitable for your product release cadence).
If you're getting to 145.14.7 with SemVer then sack your API designer.
Remember the major version bumps when it is not backward compatible. New functionality can be backward compatible.
You can keep older semver APIs working with endpoints that support upgrading the older format to the newer internally, which allows you to enable things like sunsetting API versions to provide automated upgrade notifications.
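As a sketch of that idea (the field names, version numbers, and upgrader functions here are hypothetical, not from any real API): each upgrader maps a request body from one major version to the next, so the handler only ever deals with the latest shape:

```python
# Hypothetical upgraders: each converts a request body from one major
# API version to the next, chained until we reach the latest version.
def upgrade_v3_to_v4(body: dict) -> dict:
    body = dict(body)
    body["customer_id"] = body.pop("cust_id")  # hypothetical rename in v4
    return body

def upgrade_v4_to_v5(body: dict) -> dict:
    body = dict(body)
    body.setdefault("currency", "USD")  # hypothetical new field in v5
    return body

UPGRADERS = {3: upgrade_v3_to_v4, 4: upgrade_v4_to_v5}
LATEST = 5

def handle(major: int, body: dict) -> dict:
    # An old-version endpoint upgrades step by step to the newest format,
    # so there is a single code path for the actual business logic.
    while major < LATEST:
        body = UPGRADERS[major](body)
        major += 1
    return body

print(handle(3, {"cust_id": 7}))  # → {'customer_id': 7, 'currency': 'USD'}
```

Dropping an entry from `UPGRADERS` is then the natural sunsetting point: requests for that major version can return an automated upgrade notice instead.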
All I'm saying is that user version numbers are more a marketing exercise than anything logical.
All I am saying is that they are a communication tool regardless of the product, and there is no one single answer to which is best based on the type of software being built.
You seem to want to encode good API design practices into a versioning scheme, and I claim it does not work that way ;)
Versioning is a tool to communicate changes and backwards compatibility to the users. SemVer makes sense in a lot of cases, but it neither covers everything (eg. compare with Debian/Ubuntu packaging versions), nor is it always needed (think of REST API versions which usually only go with major versions, and commonly multiple major versions from the same codebase).
If you subscribe to extended mind theory and Merleau Ponty's brand of phenomenology, tools are just an extension of your cognitive process, and "shelling out" in this way is really to be expected of high intelligence, if not consciousness. Some would say it might even be a prerequisite for consciousness, that you need to be a being-in-the-world etc etc
Not to say that GPT is conscious, in its current form I think it certainly isn't, but rather I would say reasoning is a positive development, not an embarrassing one
I can't compute 297298*248 immediately in my head, and if I were to try it I'd have to hobble through a multiplication algorithm, in my head... it's quite similar to what they're doing here, it's just they can wire it right into a real calculator instead of slowly running a shitty algo on wetware
Yeah humans have done this physically for very long. We have puny little teeth but have knives and butchering tools much better than any animal's teeth. We have little hair, but make clothes that can keep us warm in the arctic or even in space. We have an underdeveloped colon and digestive system, but instead we pre-digest the food by cooking it on fire. In some sense the stove is part of our digestive system, the jacket is part of our dermis (like the shell of a snail, except we build it through a different process), and we have external teeth in the form of utensils and stone tools.
Now, ideally, the LLMs could also design their own tools, when they realize there is a recurring task that can be accomplished better and more reliably by coding up a tool.
Heidegger ready-to-hand is another take on this same idea. Something I took to heart years ago and was a big part of my using and contributing to free software as much as possible. Proprietary software is a form of mind control along these lines of thought and I don't like that one bit.
Okay, but, like, how much worse is it to re-render the whole thing? And it's not like Turbo hasn't been around for more than a decade, doing exactly that, for free, automatically.
I'm a bit confused. What are they sending over the wire exactly? I thought the whole point of quantum communication is you use entanglement to instantly send from point A to point B, and there is no wire?
Entanglement doesn’t allow instant communication, you can’t beat the speed of light, and you still need a communication channel. Instead, entanglement is used to prevent undetected eavesdropping on the communication: https://en.wikipedia.org/wiki/Quantum_network#Secure_communi...
> It took the author just few minutes to solve this but for someone like Perplexity it would take hours of engineering and maintenance to implement a solution for each custom implementation which is likely just not worth it.
These are trivial for an AI agent to solve though, even with very dumb watered down models.
Yeah I quite agree with this take. I don't understand why editors aren't utilizing language servers more for making changes. Crazy to see agents running grep and sed and awk and stuff, all of that should be provided through a very efficient cursor-based interface by the editor itself.
And for most languages, they shouldn't even be operating on strings, they should be operating on token streams and ASTs
Strings are a universal interface with no dependencies. You can do anything in any language across any number of files. Any other abstraction heavily restricts what you can accomplish.
Also, LLMs aren't trained on ASTs, they're trained on strings -- just like programmers.
No, it’s not really “any string.” Most strings sent to an interpreter will result in a syntax error. Many Unix commands will report an error if you pass in an unknown flag.
In theory, there is a type that describes what will parse, but it’s implicit.
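You can see this empirically. A quick check (using Python's built-in `compile()` as a stand-in for any interpreter's parser) shows that essentially no random strings are valid programs:

```python
import random
import string

random.seed(0)  # deterministic sample

def parses(src: str) -> bool:
    """True if src is syntactically valid Python."""
    try:
        compile(src, "<string>", "exec")
        return True
    except SyntaxError:
        return False

samples = ["".join(random.choices(string.printable, k=20)) for _ in range(1000)]
ok = sum(parses(s) for s in samples)
print(f"{ok}/1000 random strings are valid Python")
```

The implicit type is exactly the language's grammar: the overwhelming majority of possible strings fall outside it.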
Exactly. LLMs are trained on huge amounts of bash scripts. They “know” how to use grep/awk/whatever. ASTs are, I assume, not really part of that training data. How would they know how to work well with one? LLMs are trained on what humans do to code. Yes, I assume down the road someone will train more efficient versions that can work more closely with the machine. But LLMs work as well as they do because they have a large body of “sed” statements in their statistical models.
treesitter is more or less a universal AST parser you can run queries against. Writing queries against an AST that you incrementally rebuild is massively more powerful and precise in generating the correct context than manually writing infinitely many shell pipeline oneliners and correctly handling all of the edge cases.
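To illustrate the precision argument (using Python's stdlib `ast` module rather than tree-sitter itself, which needs per-language bindings, but the idea is the same): an AST query finds only actual calls, where a grep for the name would also match comments, strings, and bare references:

```python
import ast

source = '''
def fetch(url):
    print("fetching", url)  # grep for "open" matches this comment: open sesame
    return open(url)

handle = open
'''

tree = ast.parse(source)
# Find only *calls* to open(), ignoring the comment and the bare reference.
calls = [node.lineno for node in ast.walk(tree)
         if isinstance(node, ast.Call)
         and isinstance(node.func, ast.Name)
         and node.func.id == "open"]
print(calls)  # → [4]
```

A shell pipeline would need increasingly baroque regexes to exclude the comment and the `handle = open` alias; the syntax tree makes those distinctions for free.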
I agree with you, but the question is more whether existing LLMs have enough training with AST queries to be more effective with that approach. It’s not like LLMs were designed to be precise in the first place
It's so weird that codex/claude code will manually read through sometimes dozens of files in a project because they have no easy way to ask the editor to "Find Usages".
Even though efficient use of CLI tools might make the token burn not too bad, the models will still need to spent extra effort thinking about references in comments, readmes, and method overloading.
Which is why I wrote a code extractor MCP which uses Tree-sitter -- surely something that directly connects MCP with LSP would be better but the one bridge layer I found for that seemed unmaintained. I don't love my implementation which is why I'm not linking to it.
I agree the current way tools are used seems inefficient. However there are some very good reasons they tend to operate on code instead of syntax trees:
* Way way way more code in the training set.
* Code is almost always a more concise representation.
There has been work in the past training graph neural networks or transformers that get AST edge information. It seems like some sort of breakthrough (and tons of $) would be needed for those approaches to have any chance of surpassing leading LLMs.
Experimentally, having agents use ast-grep seems to work pretty well. So, still representing everything as code, but using a syntax-aware search/replace tool.
Didn't want to bury the lead, but I've done a bunch of work with this myself. It goes fine as long as you give it both the textual representation and the ability to walk along the AST. You give it the raw source code, and then also give it the ability to ask a language server to move a cursor that walks along the AST, and then every time it makes a change you update the cursor location accordingly. You basically have a cursor in the text and a cursor in the AST and you keep them in sync so the LLM can't mess it up. If I ever have time I'll release something but right now just experimenting locally with it for my rust stuff
On the topic of LLMs understanding ASTs, they are also quite good at this. I've done a bunch of applications where you tell an LLM a novel grammar it's never seen before _in the system prompt_ and that plus a few translation examples is usually all it takes for it to learn fairly complex grammars. Combine that with a feedback loop between the LLM and a compiler for the grammar where you don't let it produce invalid sentences and when it does you just feed it back the compiler error, and you get a pretty robust system that can translate user input into valid sentences in an arbitrary grammar.
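A minimal sketch of that feedback loop (the grammar is a toy, and `stub_model` is a deterministic stand-in for a real LLM call, which would be resent the grammar plus the error): the checker's error message is fed back until the output parses:

```python
import re

# Toy grammar standing in for a real compiler/parser for the target language.
GRAMMAR = re.compile(r"^MOVE (N|S|E|W) ([1-9][0-9]*)$")

def check(sentence: str):
    """Return None if valid, else an error message to feed back to the model."""
    if GRAMMAR.match(sentence):
        return None
    return f"invalid sentence {sentence!r}: expected 'MOVE <N|S|E|W> <positive int>'"

def stub_model(prompt: str, feedback=None):
    # Stand-in for an LLM: first attempt is wrong, the retry after
    # seeing the compiler error is right.
    return "go north 3" if feedback is None else "MOVE N 3"

def translate(user_input: str, max_retries: int = 3) -> str:
    feedback = None
    for _ in range(max_retries):
        candidate = stub_model(user_input, feedback)
        feedback = check(candidate)
        if feedback is None:
            return candidate
    raise ValueError("model never produced a valid sentence")

print(translate("move three steps north"))  # → MOVE N 3
```

The loop never lets an invalid sentence escape, which is what makes the system robust even when individual generations are sloppy.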
Sounds like cool stuff, along the lines of structure editing!
The question is not whether it can work, but whether it works better than an edit tool using textual search/replace blocks. I'm curious what you see as the advantage of this approach? One thing that comes to mind is that having a cursor provides some natural integration with LSP signature help
Yes agentic loop with diagnostic feedback is quite powerful. I'd love to have more controllable structured decode from the big llm providers to skip some sources of needing to loop - something like https://github.com/microsoft/aici
After being pleasantly surprised at how well an AI did at a task I asked of it a few months ago that I thought was much more complicated, I was amused at how badly it did when I asked it to refactor some code to change variable names in one single source file to match a particular coding standard. After doing the work that a good junior developer might have needed a couple of days for, it failed hard at refactoring, working more at the level of a high school freshman.
Structured output generally gives a nice performance boost, so I agree.
Specifically, I'd love to see widespread structured output support for context free grammars. You get a few here and there - vLLM for example. Most LLMs as a service only support JSON output which is better than nothing but doesn't cover this case at all.
Something with semantic analysis - scope informed output, would be a cherry on the top, but while technically possible, I don't see arriving anytime soon. But hey - maybe an opportunity for product differentiation.
AST is only half of the picture. Semantics (aka the action taken by the abstract machine) are what’s important. What code helps with is identifying patterns which helps in code generation (defmacro and api services generations) because it’s the primary interface. AST is implementation detail.
If you look at the API exposed by LSP you'll understand why. It's very hard to use LSP outside an editor, because a lot of it is of the form "where is the symbol in file X, on line Y, between these two columns, used"
I love how arrogant humans get when you hit at their “that’s supposed to be a human thing!” nerve. Sign language is language, dance is language, writing is language, speaking is language, semaphores in a sail boat is language, Morse code is language, two’s complement is language, a mushroom communicating with other mushrooms with a small vocabulary of tokens is language, to think otherwise is incredibly small minded.
You don’t need consciousness or phenomena or qualia to have language. Heck, every computer understands some sort of language. ChatGPT is capable of conversing in English better than anyone in this room, and it doesn’t need consciousness or any of that stuff to do it. Language is neither unique nor special.
We have formal definitions: anything that can have a grammar. These definitions are incredibly broad; big-endian two’s complement integer encoding has a grammar consisting of two tokens, 0 and 1. All you need is something that can form a sentence and something that can parse it, and the thing going on between them is language. Languages vary in complexity, but the floor for that is way lower than you are thinking. Simple languages consisting of one or two tokens are still languages.
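For instance, a complete parser for that two-token language fits in a few lines (a sketch; the width and function name are my own choices):

```python
def parse_twos_complement(bits: str, width: int = 8) -> int:
    """Parse a 'sentence' in the two-token language {0, 1} as a signed integer."""
    if len(bits) != width or set(bits) - {"0", "1"}:
        raise ValueError(f"not a valid sentence: {bits!r}")
    value = int(bits, 2)
    # Big-endian: the leading bit is the sign bit.
    return value - (1 << width) if bits[0] == "1" else value

print(parse_twos_complement("11111111"))  # → -1
print(parse_twos_complement("00000101"))  # → 5
```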
There are multiple definitions of language, not one. As another commenter pointed out, if you make the definition broad enough it loses usefulness.
> it’s anything that can have a grammar,
"Can" have a grammar is not restricting. "Does" is the correct verb, and that hasn't been proved.
> All you need is something that can form a sentence and something that can parse it and the thing going on between them is language.
OK, you've actually drawn out an important point. CAN the fungi form a sentence?
I can encode the 2nd sentence in "War and Peace" into polypeptides, but the polypeptides are still just an encoding of the language I picked (English? Russian?), and it doesn't make exchange of polypeptides by any other organism (even grad students!) a linguistic exercise.
Only things for which you have parsing rules and a parser and a thing that can form sentences. This is literally the CS definition. It just doesn’t require consciousness.
Truly random tappings on a tree are not language, but if a type of bird signals that predators are nearby by striking a tree with their beak with consistent rules that are understood by other birds or creatures, then you have language. But you would have us not study these things because you think they are beneath us. I would instead say we are nothing special, and thus everything is special, or at least historically we always miss out on scientific insight when we arrogantly assume humans have X but other life forms do not.
Show me two rocks where one forms a sentence and another one parses it and sure.
Now if your definition of language is something arrogant like “a system of communication where combinations of tokens are given semantic meaning that correspond with conscious states and phenomena” then I would say that is incredibly limiting and ignores the preponderance of language all around you both in nature and technology. If someone builds a computer that can parse x86 assembly, it doesn’t cease to be parsing language if I stipulate that humans never existed and this computer just happens to exist. The tree still falls in the forest even if there is no human consciousness there to perceive it and the same goes for language. If it is encoded by something and decoded by something else fairly consistently (fuzzy is fine if the communication is still generally effective) then you have language.
More importantly the existence of unconscious systems that can generate and parse sentences in arbitrary languages means consciousness isn’t very relevant or necessary when analyzing language and perhaps focusing on it too much actually gets in the way of meaningful research and discovery.