One approach to guarding against Hyrum’s Law is GREASE (“Generate Random Extensions And Sustain Extensibility”, as used in the TLS 1.3 protocol), i.e. deliberately randomizing behavior so that nobody can form inadvertent dependencies on unspecified behavior: https://textslashplain.com/2020/05/18/a-bit-of-grease-keeps-...
(Which I find a hilarious second order example of Hyrum's law - if you add true randomness to prevent people from depending on the iteration order, they might use it as a way to randomly access items in the map!)
This was mostly annoying because Go doesn't really provide any ordered map in std (and for a while didn't give you generics, which are needed to make a general-purpose one that's performant). I feel like the best response to people relying on the order of iteration is to recognize that as a valid need and provide it via a separate type.
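For illustration, a minimal sketch of what such a separate type could look like in Go with generics (names like OrderedMap are made up here; this isn't a stdlib or proposed API):

    // A minimal insertion-ordered map sketch; illustrative only, not a
    // standard-library type. Requires Go 1.18+ for generics.
    package ordered

    type OrderedMap[K comparable, V any] struct {
        keys   []K
        values map[K]V
    }

    func New[K comparable, V any]() *OrderedMap[K, V] {
        return &OrderedMap[K, V]{values: make(map[K]V)}
    }

    func (m *OrderedMap[K, V]) Set(k K, v V) {
        if _, ok := m.values[k]; !ok {
            m.keys = append(m.keys, k) // remember first-insertion order
        }
        m.values[k] = v
    }

    func (m *OrderedMap[K, V]) Get(k K) (V, bool) {
        v, ok := m.values[k]
        return v, ok
    }

    // Range visits entries in insertion order, which the built-in map
    // deliberately refuses to promise.
    func (m *OrderedMap[K, V]) Range(f func(K, V) bool) {
        for _, k := range m.keys {
            if !f(k, m.values[k]) {
                return
            }
        }
    }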
There's also the Python approach, where dict() iteration became insertion-ordered as of Python 3.7+ (it was also ordered in 3.6, but only as an implementation detail, not as a language feature).
At the time I thought this was a questionable decision that'd cause nightmare debugging scenarios, where someone writes code accidentally depending on insertion-ordered iteration, only to deploy it on a Python runtime without it (<=3.5). I'm sure this has happened to someone, but at least it hasn't happened to me.
Matt Kulukundis (sp?) mentions this problem in his talk about Google's rather fancier Swiss tables.
At that point their debug builds would do a "coin toss" with a 50.3% chance of heads during hash table insertion, and he expected some idiots would use that to generate random bits and then be annoyed when it broke. I believe in production it's entirely seeded from ASLR, so you're actually leaking address-layout info if you try to use those bits as entropy rather than for their intended purpose: making the hash table defeat your bad unit tests.
Our team went with a similar approach when refactoring Protobuf debug APIs (https://bughunters.google.com/blog/6405366705946624/fixing-d...). People were relying on debug output and trying to parse it, so in the new implementation we threw up big warning flags and made the output unstable so that you couldn't make the mistake.
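The blog post has the real details; the general idea is something like this sketch in Go (hypothetical names and formatting, not the actual protobuf implementation):

    // Sketch of deliberately unstable debug output (hypothetical; the real
    // protobuf change is described in the linked post).
    package debugtext

    import (
        "fmt"
        "time"
    )

    // A marker that changes on every process start, so anything that tries to
    // parse this output breaks in tests immediately rather than in production
    // years later.
    var perProcessMarker = fmt.Sprintf("unstable-debug-%d", time.Now().UnixNano())

    func Format(fields map[string]string) string {
        out := perProcessMarker + " {\n"
        for k, v := range fields { // built-in map: iteration order is already randomized
            out += fmt.Sprintf("  %s: %q\n", k, v)
        }
        return out + "}"
    }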
The key lesson to take away is: if you want something to be an implementation detail, make sure to have multiple differing implementations :)
When people don't know which implementation will be used, they tend to stick to the standard or the documentation.
We see this in places like the C language or in Common Lisp. Library writers must take into account that the implementation they develop their library against may not be the same implementation their users will run it on.
Telling people who depended on probably-internal behavior that wasn’t an explicit part of your contract to fuck off when they whine about you changing something.
When the clients have the power (e.g. a team that contributes 90% of the company's revenue), breaking the contract is often a non-option from the beginning.
What are some fun or unusual examples of Hyrum’s Law that you’ve run into? Were you the user, inadvertently depending on some implementation detail? Or were users depending on your implementation’s details?
I reviewed a CL from Hyrum at Google where he was trying to remove a `set_timeout(float)` method in favor of `set_timeout(absl::Duration)` and changed the former to delegate to the latter. It turned out that there was some special handling of inf/nan in the legacy API, despite no mention in the documentation, and his CL broke a number of tests. It was amusing to experience Hyrum's law so directly :)
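The real code was C++/absl, but the trap translates roughly like this (hypothetical Go rendering, made-up names):

    // Hypothetical rendering of the set_timeout trap; the real API was C++/absl.
    package client

    import (
        "math"
        "time"
    )

    type Client struct{ deadline time.Duration }

    // SetTimeoutSeconds is the legacy API. The undocumented detail: callers
    // passed +Inf (or NaN) to mean "no timeout", and the old implementation
    // happened to honor that.
    func (c *Client) SetTimeoutSeconds(s float64) {
        if math.IsInf(s, 1) || math.IsNaN(s) {
            c.deadline = 0 // legacy quirk: treated as "no deadline"
            return
        }
        c.SetTimeout(time.Duration(s * float64(time.Second)))
    }

    // SetTimeout is the new, Duration-based API. Delegating the legacy method
    // straight to it, without the special case above, is what broke the tests.
    func (c *Client) SetTimeout(d time.Duration) { c.deadline = d }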
> I guess the linux kernel still supporting binaries from ages ago might be too.
Actually it's Linus making sure that kernel changes don't break old binaries from ages ago. And in fact there were many cases of undocumented behavior being relied on by apps, and Linus would send big rants about how you never break userspace, whether the behavior is documented or not.
Don't have any of them on hand right now, unfortunately.
At Google (the long-time, off-and-on home of Hyrum), they migrated from other hash maps to swisstable. The unordered map they had been using iterated through elements in insertion order. They explicitly couldn't support that without adding overhead to the new table, so they randomized the map's iteration order.
The migrations to the new table were fun: you would migrate some code and then find a test that assumed some seemingly distant protocol buffer array held its elements in the map's insertion order.
Usually the ordering was just a defect of the test, but it required actually digging in to be sure
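Go makes the same failure mode easy to reproduce, since its map iteration order is also unspecified; a sketch of the kind of test that only ever passed by accident (the Google case was C++ and protocol buffers):

    // A test that only passed because the old map happened to iterate in
    // insertion order.
    package ordertest

    import (
        "reflect"
        "testing"
    )

    func tagsFrom(attrs map[string]string) []string {
        var tags []string
        for k := range attrs { // order is unspecified; Go randomizes it
            tags = append(tags, k)
        }
        return tags
    }

    func TestTags(t *testing.T) {
        got := tagsFrom(map[string]string{"env": "prod", "region": "eu", "tier": "web"})
        want := []string{"env", "region", "tier"} // asserts insertion order the map never promised
        if !reflect.DeepEqual(got, want) {
            t.Errorf("got %v, want %v", got, want)
        }
    }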
IIRC there's a (CppCon?) talk about this rollout where Hyrum is in the audience so that he can heckle each time they describe a change which "obviously" can't break anybody, because of course Hyrum's Law meant it did break somebody at Google when they did it.
Maybe somebody who remembers better can link it and/or correct my memory of exactly what's going on.
I joined a company once where some APIs were, for some reason, returning strings instead of booleans (e.g. "false" instead of false).
One of the systems returned a typo for some edge cases, e.g. "ture" instead of "true". But instead of fixing this, other systems came to rely on the typo being returned, so it was difficult to just fix it and move on.
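You end up writing (and keeping) compatibility shims like this sketch (hypothetical code, obviously not the real system):

    // Hypothetical shim for the stringly-typed booleans described above.
    package boolish

    import "fmt"

    func Parse(s string) (bool, error) {
        switch s {
        case "true", "ture": // sic -- other systems now rely on the typo
            return true, nil
        case "false":
            return false, nil
        default:
            return false, fmt.Errorf("not a boolean-ish string: %q", s)
        }
    }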
I implemented a development-only feature which spun up a thread and permanently held a lock on the database, effectively making the app useless until restarted.
A couple months later, I had a user thank me for it. I still don't know why
Let's just say "this HTTP endpoint expects its body to be JSON" and "we use a library that allows invalid JSON" (e.g. trailing commas) aren't exactly a good combination.
Code easily ends up depending on the specific library's quirks and config, regardless of what the actual contract said.
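Go's encoding/json happens to be strict here, which makes the mismatch easy to demonstrate (the lenient server-side library is the assumption in this sketch):

    // The contract says "JSON", but clients were only ever tested against a
    // lenient server-side parser that tolerates trailing commas (assumed here).
    package contract

    import (
        "encoding/json"
        "testing"
    )

    func TestTrailingComma(t *testing.T) {
        body := []byte(`{"name": "example", "retries": 3,}`)

        var v map[string]any
        if err := json.Unmarshal(body, &v); err == nil {
            t.Fatal("expected a strict JSON parser to reject the trailing comma")
        }
        // A lenient parser would accept this body, and clients that send it
        // become dependent on that specific library (and its config) staying put.
    }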
I'm not trying to be overly dramatic, but I think it was precisely when the industry accepted this as a law -- instead of treating it as something that needs to be trained out of junior programmers -- that software quality started to tank.
Consider any other form of engineering. Take some kind of screw. It has a documented specification in terms of torque, material strength and whatnot. Good engineers on the customer side will use the screws in a way that keeps within the spec. And good engineers on the supplier side will find ways to fulfill that specification as cheaply (which usually also means as narrowly) as possible.
There could be a kind of Hyrum's Law at play, if hardware engineers were idiots. Let's say that the screws accidentally overfulfil the specification today by 20%, and a customer measures the material to figure this out, and starts to depend on that. A year later, the supplier finds a cheaper way to produce the screws (or introduces a binning process) and as a consequence, the screws only exceed the spec by 5%, and the customer's product breaks. Who's responsible? The customer. And I should add, obviously.
This is fixed by training engineers. No one in their right mind would start to introduce artificial faults into their screws purely for the reason to prevent users from depending on the excess strength or something. But that's exactly the kind of thing that's regularly suggested to guard against Hyrum's Law.
So whenever I see software developers on my team look at the source code of a library to figure out "whether it's thread-safe" or "whether the sort order is stable" or whatever, I die a little on the inside. The problem is, you almost have to do that, because we're two generations of software engineers into this and the practice is so accepted now that libraries no longer bother to document what they do (and don't) guarantee.
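To be fair, standard libraries sometimes do document exactly this; Go, for example, spells out the sort-stability question so nobody has to read the implementation:

    // Whether a sort is stable is part of the documented contract in Go's
    // standard library: sort.Slice makes no promise about the order of equal
    // elements, sort.SliceStable does.
    package stability

    import "sort"

    type record struct {
        group string
        seq   int
    }

    func sortRecords(rs []record) {
        // Relying on sort.Slice preserving input order for equal keys would be
        // exactly the kind of unwritten dependency described above.
        sort.SliceStable(rs, func(i, j int) bool { return rs[i].group < rs[j].group })
    }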
And then people wonder why every existing piece of software needs a fully staffed team nowadays just to keep it working. And why as a consequence, even core products built by competent companies (like Google Maps) have regressions in 15-year-old features every other week.
Yes, the relationship between "Hyrum's Law" and having a terrible documentation culture is strong.
The important part of the "Law" is the bit where it says "it does not matter what you promise in the contract".
That's definitely going to be true in a place where neither writing nor reading documentation is taken seriously (and one of the main things this site teaches us is that Google is such a place).
Yeah, no, humans can't hold complicated documentation in their heads. The whole experience of being human is finding a way to get by with a simple model which fits in your head but isn't wrong enough to cause trouble, and that's exactly what Hyrum is reflecting.
What you're getting at is the C++ "Just don't make any mistakes" approach to software engineering, which is a disaster that has cost our civilisation a great deal.
That is emphatically not what I am "getting at". I cannot express my opposition to that approach strongly enough.
I do not believe that the place people should hold documentation is "in their heads".
I do not believe that Hyrum's "Law" is helpful in getting to a situation where people's beliefs about what software will do match reality.
This isn't the first time I've come across people (particularly around the Rust community) who have got the idea into their heads that saying "reading documentation is important" is somehow close to saying "Real programmers don't make mistakes". I think that conflation is doing great harm.
If you accept that people are going to rely on things you didn't contract for by mistake then you're right back to Hyrum's Law. It's that easy.
Hyrum's Law isn't about what you should do, or how things should be, it's telling you an observable fact about our world that some people don't like. So no, the law isn't going to fix people's beliefs any more than Newton's Laws did for weird beliefs about motion.
I'd actually say the importance of documentation is better understood in the Rust community. Including, which is vital here, the importance of not relying on humans remembering to read all this text when you've got better options. Rust has documentation telling you that you're not promised whether or not some elements which compare equal are swapped when you [T]::sort_unstable. But it doesn't need to spend a lot of time warning you that you shouldn't [T]::sort_unstable a slice of type T which doesn't even claim to have ordering, because the compiler rejects such nonsense anyway.
Indeed even the naming is an example. In C++ that function is just named sort. Because you know, an unstable sort is faster†. Will it sometimes surprise some poor noob because it's unstable? Sure, but apparently that's OK because if they had read and properly digested the documentation they would know it's an unstable sort. I suggest that if the function were named better, the user would be much less likely to make this mistake before they even glance at the documentation.
> Hyrum's Law isn't about what you should do, or how things should be, it's telling you an observable fact about our world
But it isn't.
Neither Hyrum, nor anybody else, has ever seen a system where "all observable behaviors" were depended on by somebody. And if they somehow had, they couldn't know that it hadn't mattered how clear the documentation had been.
There are two much weaker statements which I think are true:
- "No matter how carefully you document your contracts, it will happen from time to time that you leave something unstated and people reasonably guess wrongly what you intended."
- "No matter how carefully you document your contracts, from time to time some people will choose to rely on things you didn't promise, without caring about that."
As well as actually being true, these statements have the advantage of not falsely implying that you can't improve the situation by putting effort into documentation.
Surely your statements are just corollaries which are noticeable with fewer users? Hyrum's Law is more succinct because of its prefix, "With a sufficient number of users".
And I think we do see small systems where all observable behaviors are indeed depended upon; lots of trivial systems exhibit exactly this property. It's just that it doesn't trigger the part of Hyrum's Law that apparently annoys you -- "it does not matter what you promise in the contract" -- because if anybody did write a contract it would state the entire behavior, and no surprises would be possible.
And that permits a valuable conclusion from Hyrum's law. It's better to design my interface so that it's so simple any fool will use it right, than to document all the weird sharp edges of my interface so that I can potentially win an "Um, actually" episode each time a fool cuts themselves on the sharp edges. That's not always possible but often in our industry it's apparent nobody was even trying.
Laws of nature are not prescriptive: they only predict what will happen. It is up to us to fend off consequences we don't want, by whatever means we can muster. The law means that calling out non-promises in the name is favored. But that is often not practical.
> it was precisely when the industry accepted this as a law -- instead of treating it as something that needs to be trained out of junior programmers -- that software quality started to tank.
And when exactly do you think that happened? When do you think there was this golden age where people relied only on documented behavior?
Windows and Linux have both been bending over backwards to retain all sorts of undocumented features since the early/mid 90s. Glibc was notoriously hampered by emacs relying on details of its internals. C programs relying on implicitly or explicitly undefined behavior have caused endless handwringing for as long as there has been a C standard. The list goes on and on. Relying on implementation details has been the modus operandi in computing since day 1.
- Person 1 makes an observation that they assert is true
- Person 2 refers to this observation as “Person 1’s Law” for convenience when discussing the observation and whether or not it is true.
There’s nothing wrong or “graceless” with Person 1 then using the term themselves, even for the purpose of arguing in favor of their assertion (which is what this web page is doing).
One thing I found surprising is how careless many companies are about creating and nurturing a culture of know-how transfer among teammates. They seem to value compartmentalisation so much that they forget they are not de-risking themselves from the Bus Factor.
feels like there are two slightly distinct phenomena worth pointing out -- implicit (or explicit) dependency on something that is actually true vs dependency on something that is not true
in the latter case the client forms a dependency on observed service behaviour based on an incorrect assumption about the contract -- i.e. as with the Google Chubby service story -- so they end up getting burned when the server does not satisfy it (of course it's the developer's responsibility to prevent that dependency from forming in the first place)
the former is when clients form dependencies on assumptions that are actually correct but were never intended to be observable -- e.g. you use some nice library or data structure internally, users start to depend on its performance, and it becomes hard for you to change the implementation
One could argue that React’s strict mode behaviour in dev is a guard against this -- it prevented you from building things that would break with the (then) upcoming functionality.
What are other approaches?