Managing mutable data in Elixir with Rust (lambdafunctions.com)
131 points by clarkema on Feb 15, 2024 | 58 comments


Rustler is great. Though this gets me thinking about how you can maintain as many Elixir invariants and conventions as possible, even while escaping them under the covers. Being able to call FeGraph.set/2 and have db actually be mutated violates Elixir's common call patterns, even if it's technically allowed.

For example: I wonder if it wouldn't be more "erlangy"/"elixiry" to model the mutable ops behind a genserver that you send messages to. In the Elixir world it's perfectly normal to make GenServer.call/3 and expect the target PID to change its internal state in a non-deterministic way. It's one of the only APIs that explicitly blesses this. The ETS API is another.

Alternatively, you could have the ref store both a DB sequence and a ref ID (set to the last DB sequence), and compare them on operations. If you call FeGraph.set/2 with the same db ref two times, you compare the ref ID to the sequence and panic if they aren't equal. Callers always need to operate with the latest ref. Then at least the local semantics are maintained.
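The versioning idea can be sketched outside Elixir entirely. Here's a minimal Rust sketch (all names hypothetical, not FeGraph's actual API) where each handle records the sequence number it was issued at, and stale handles are rejected:

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, Mutex};

// Hypothetical store: every successful mutation bumps a sequence number,
// and each handle remembers the sequence it was issued at.
struct Store {
    seq: AtomicU64,
    data: Mutex<Vec<i64>>,
}

#[derive(Clone)]
struct Handle {
    store: Arc<Store>,
    issued_at: u64,
}

impl Handle {
    fn new(store: Arc<Store>) -> Self {
        let issued_at = store.seq.load(Ordering::SeqCst);
        Handle { store, issued_at }
    }

    // A mutation is only accepted from the freshest handle; it consumes the
    // old handle and returns a new one, mimicking Elixir's convention of
    // rebinding the result of each call.
    fn push(self, value: i64) -> Result<Handle, &'static str> {
        let current = self.store.seq.load(Ordering::SeqCst);
        if current != self.issued_at {
            return Err("stale handle: the store has been mutated since this ref was issued");
        }
        self.store.data.lock().unwrap().push(value);
        self.store.seq.fetch_add(1, Ordering::SeqCst);
        Ok(Handle::new(self.store))
    }
}
```

The consume-and-return shape means a caller holding an old copy of the handle gets an error instead of silently observing mutation.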

Maybe this is less relevant for the FeGraph example, since Elixir libs dealing with data are more willing to treat the DB as a mutable thing (ETS, Digraph). But it's not universal. Postgrex, for example, follows the DB-as-PID convention. Defaulting to an Elixiry pattern for Rustler implementations is probably good practice.


That's an interesting point that I should perhaps have covered in the original article.

The real code that this is based on is in fact hidden behind a GenServer for this exact reason -- to maintain the expectations of other Elixir code that has to interact with it. The advantage of the escape hatch, as another commenter mentions, is allowing efficient sparse mutations of a large chunk of data, without having to pay a copy penalty every time. I definitely wouldn't recommend sharing the db handle widely.


Did you consider a port (written in Rust) instead of a NIF?

When you're presenting a GenServer-like message-passing interface, a port is a natural fit, with none of the risks related to linking a NIF into the VM itself.

(admittedly those risks are much lower with Rust than C)


In our case one of these NIF stores is created per user for a specific task; ironically, with the amount of polish that Rustler puts around NIFs I suspect it would have been more work and more risk to go down the port route and manage everything manually.


Have you measured performance? If mutating from Elixir like this can bring serious benefits, maybe there's a place for mutable versions of libraries like Explorer and Nx.


Explorer does actually use Rust (and Polars) for a lot of its work -- it's one of the libraries I looked at while figuring out my memory management issues.


But would it benefit from mutating the value of one reference? At the moment it does not do that, right?


No, it doesn't -- looking at the website that's an explicit trade-off of pure performance vs 'Elixir-ish-ness'. It would certainly break a lot of expectations to have data mutating like that without it being hidden away somewhere, so I can understand why they went that way.

In my case the data I'm dealing with is more of a store than a single data item, so I'm leaning on the example of things like ETS. Also it's within a single application rather than being a large generally-available library, so the trade-offs are different. It would be interesting to know if they did tests though.


Probably not ideal since Polars builds on Apache Arrow and that tends to want to treat the structure as immutable if I recall correctly.


> For example: I wonder if it wouldn't be more "erlangy"/"elixiry" to model the mutable ops behind a genserver that you send messages to.

It depends on the use case. For example, when creating a resource (basically a refcounted datastructure), it might make sense to allow mutable access only through a process as the "owner" of the resource. But if you have only read-only data behind that resource, sharing the resource similar to ETS might be what you want.


I also want to give a shout out to the Rustler folks for creating a great library! We use Rustler quite extensively at Doctave, and have written about our experiences with Rustler before [0] (though our architecture has advanced quite a bit since the article was written).

Integrating Elixir and Rust has been delightfully straightforward and is a great choice for calling into libraries not available in Elixir, or offloading CPU intensive tasks.

[0]: https://www.doctave.com/blog/2021/08/19/using-rust-with-elix...


Getting rustler up and running for us was very easy. Thank you to the team for making this excellent library.

We had some inconsistent build results (ours is an umbrella app), but apart from forcing a compilation and losing the ability to cache the Rust builds, everything else has worked so well that we're happy to get access to the massive Rust ecosystem.


It’s exactly this use case that nudged me (primarily an Elixir dev) to start learning Rust a few years back.

Unfortunately, I haven’t had a project where I’ve needed to use Rustler yet, though.


Nice. I thought that Zig would be a nice language for writing NIFs - but of course Rust would be good too. Cool!


Rust is perfect for this because Rust code can be very reliable, which matters for NIFs in Erlang because a NIF can crash the whole VM.

So using C and Zig libraries without fully understanding them can be a death trap, while in Rust, as long as it doesn't use unsafe code, you can feel pretty good about using it.


This has nothing to do with Rust itself. While the compiler does prevent a lot of common pitfalls, you can still write erroneous code with it.

It's entirely the rustler project's effort (and goal) to wrap any kind of Rust program so that it will not bring down the BEAM under any circumstance, which they have done a great job achieving.


It's still to a large degree Rust itself. It's the language design which makes it possible to wrap the NIF C API in a safe fashion (e.g. using lifetimes, phantom data, etc.). The only additional safety feature we use is catch_unwind (https://doc.rust-lang.org/std/panic/fn.catch_unwind.html) to prevent panics from unwinding into the BEAM (and killing it).
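For illustration (a simplified sketch, not Rustler's actual code), the boundary guard amounts to catching any panic before it can unwind across the FFI into the VM:

```rust
use std::panic::{self, AssertUnwindSafe};

// Sketch: run the NIF body and convert any panic into an error value
// instead of letting it unwind into the caller -- unwinding across the
// C FFI into the BEAM would be undefined behaviour.
fn guarded_nif<F, T>(body: F) -> Result<T, String>
where
    F: FnOnce() -> T,
{
    panic::catch_unwind(AssertUnwindSafe(body)).map_err(|e| {
        // panic!("...") payloads with a string literal are &'static str
        e.downcast_ref::<&str>()
            .map(|s| s.to_string())
            .unwrap_or_else(|| "nif panicked".to_string())
    })
}
```

A real wrapper would then translate the `Err` case into an Erlang exception or error tuple rather than a Rust `String`.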



Cool writeup. A little ironic, since Erlang's `digraphs` are also mutable!


Erlang's digraphs are stored in an ETS table, so aren't they only mutable in the same way that ETS tables are mutable?

I don't normally see people consider (D)ETS tables as mutable, however.


ETS tables are absolutely mutable, they even have specific functions to iterate over them while being mutated (https://www.erlang.org/doc/man/ets#safe_fixtable-2). I use them extensively to share data in a "lock-free" fashion with other processes (a `gen_server` that gets all messages and aggregates data in ETS tables, retrieval via direct reads on a known table name instead of gen_server:call). Mnesia is also (usually) ETS down below.


Yeah. I think even though the article doesn’t use it as an example, what’s really desirable about escape hatching to a systems language is the ability to in-place mutate lots of data. Specifically sparse mutations of a large chunk of data where a copy penalty would be wasteful. ETS is basically just swapping pointers (which I hope is mutation under the hood)
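To make the cost concrete, here's a tiny illustrative sketch (hypothetical functions, not from the article) contrasting an in-place sparse update with the copy-per-write you'd pay if a large buffer were treated as a single immutable value:

```rust
// In-place sparse update: O(1) per write, no allocation.
fn update_in_place(buf: &mut Vec<u8>, idx: usize, val: u8) {
    buf[idx] = val;
}

// Value-semantics update: an O(n) clone of the whole buffer to change one byte.
fn update_persistent(buf: &Vec<u8>, idx: usize, val: u8) -> Vec<u8> {
    let mut copy = buf.clone();
    copy[idx] = val;
    copy
}
```

For a handful of writes into a multi-megabyte structure, the difference between these two shapes is exactly the copy penalty the NIF escape hatch avoids.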


This is super cool. I learn something new every day.


Immutable data is not a “foundation of scalability and robustness”.


> Immutable data is not a “foundation of scalability and robustness”.

It may not be the only way to get to scalability and robustness. But it certainly is the cornerstone of how Erlang gets there.

1. First, the way Erlang treats data ensures that every piece of data can be sent over the wire by default. This helps pave the way for another amazing characteristic of Erlang, and that is when you refer to and use an object, it's essentially transparent to your code whether that object is on this machine or another machine in the cluster. This would not be possible without the fact that all data structures are remotable, which is enabled by the immutable data. (See also side note below.)

2. The immutable data also leads to clean rollback semantics, making it easy to always have a self-consistent state of the system ready to use even after some kind of fault.

3. The immutable data also leads to very clean and easy ways to handle multithreading because you never have to worry about making object copies. You can be assured that it's ok for two threads to use the same memory object because there's no way either of them can change it.
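Point 3 is easy to demonstrate outside Erlang too (a small Rust sketch, purely illustrative): threads can share one read-only allocation with no locks and no copies:

```rust
use std::sync::Arc;
use std::thread;

// Several threads read the same heap allocation concurrently. Because the
// data is shared read-only, no locks or defensive copies are needed --
// Arc::clone only bumps a reference count.
fn parallel_sums(data: Arc<Vec<u64>>, workers: usize) -> Vec<u64> {
    (0..workers)
        .map(|_| {
            let data = Arc::clone(&data);
            thread::spawn(move || data.iter().sum::<u64>())
        })
        .collect::<Vec<_>>() // spawn all threads before joining any
        .into_iter()
        .map(|h| h.join().unwrap())
        .collect()
}
```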

Side note: Alan Kay, the inventor of OO, has said that people get the entire idea of what he was talking about all wrong. He said that object orientation isn't about objects, but about communication. He was talking about the idea of an object being more like what we'd call a web endpoint today, where when you instantiate it you communicate with it by sending it messages. It's funny to me that a functional language like Erlang best embodies that OO idea today. Go code can, too.

"I'm sorry that I long ago coined the term 'objects' for this topic because it gets many people to focus on the lesser idea. The big idea is 'messaging'" - Alan Kay <https://en.wikipedia.org/wiki/Alan_Kay>

He goes on in the original underlying document to say "OOP to me means only messaging, local retention and protection and hiding of state-process, and extreme late-binding of all things." All of these ideas are front-and-center in Erlang (and by extension Elixir).


As an aside: The particular technique used in the OP (resource objects) is actually not transparent. If you send the object handle to another machine, it will have an opaque handle that it can't do anything with apart from storing and sending around. However, if the other machine sends it back (and the resource object hasn't been deallocated in between) it will be the same as the one that you sent.


> you instantiate it you communicate with it by sending it messages

Does this mean a web-endpoint has to be immutable? If you send it the same parameters multiple times, is it required to respond with the same response every time? If not, does that not mean it is in fact mutable?

I read elsewhere that in Elixir programs, there is no difference in messaging a local "agent" or a remote one? The caller does not know whether the other party is remote or not. Is it still guaranteed to be immutable?

Just asking since I don't know much about Elixir.


> Does this mean a web-endpoint has to be immutable? If you send it the same parameters multiple times, is it required to respond with the same response every time? If not, does that not mean it is in fact mutable?

This is a concept called purity, and it's only loosely related to immutability. Immutability makes purity easier to implement and reason about, but does not guarantee it. Erlang/Elixir are not pure. For example, `DateTime.now("Etc/UTC")` will return different things at different times.

As a counter-example, Haskell functions are pure, so the `getCurrentTime` function cannot return a value directly, as it would be different every time. Instead it uses the return type `IO UTCTime`, which acts like instructions on how to calculate the time, rather than the time itself.

https://en.wikipedia.org/wiki/Pure_function


When they say all the data in Erlang/Elixir is immutable, it does not mean that there is no state. There's definitely data in the programs that change values, because like you point out, how in the heck can you write a useful system that doesn't have any state anywhere?

State or data that changes values is typically put in one of 3 different places:

1. On the stack. It's pretty typical to have a loop function like handle_message(app_state, request). The current state of the app is in the call parameters. After handling the message, handle_message() calls itself again with the new, updated state. Somewhere else in the system we keep track of the last value of that state, and if the process crashes, we just use the last state.

2. Another place to hold state is in external storage somewhere.

3. The third main place to hold state is via references to other objects, which do #1 or #2 above.

Regarding whether there is a difference between messaging local versus remote objects-- there's an operator for sending messages to another object. It works the same for local and remote. I think it's possible to inspect the actor address and see where it is, but the mechanism works the same.


I'm not sure Joe Armstrong would agree with your comment.


After a certain scale, it actually does start to slow you down

https://discord.com/blog/using-rust-to-scale-elixir-for-11-m...


This is a great engineering blog post. Other good ones from Discord on scaling (Elixir, Rust, Cassandra vs. ScyllaDB):

- https://discord.com/blog/how-discord-scaled-elixir-to-5-000-... — continually improving elixir

- https://discord.com/blog/how-discord-stores-trillions-of-mes... — moving to scyllaDB


That was an excellent write-up, and a great case study in how to iteratively approach a performance problem. First address the algorithmic issues. Then, once the algorithm is no longer the bottleneck, look into dropping into a higher-performance, lower-overhead language like Rust. Jumping directly to Rust with the original algorithm would not have helped much. And in the general case, getting the algorithm right might make dropping down into Rust unnecessary.

Very nice work.


Sadly, due to his untimely passing, I don't think Joe Armstrong ever really got a good chance to analyze Rust's borrow checker from his perspective. So I'd be reluctant to be too dogmatic about what he'd think about it.

Personally I think that if you can stomach the additional complexity (which is a non-trivial "if", but a doable one), Rust's approach supersedes immutability. Full immutability was an interesting theory in the 1990s, and I mean that respectfully and not sarcastically, but in the end I think it was overkill and overcompensation. The correct thing to do is not to eliminate mutability, but to firmly, firmly control it. Rust has a sophisticated method for doing so, with compiler support. It may not be the only one, but it seems a very solid one. Immutability is another method of controlling it, but in my view, it's actually kind of a blunt instrument applied to a complex problem.

In my considered opinion, in the end, immutability isn't even important to Erlang. What matters in Erlang is that you can't send references across messages, so there is no way to witness mutation done by another process. It was not necessary within a given process to be immutable, and I suspect that has been a non-trivial ball and chain around Erlang's legs in terms of adoption even to this day. There was never any need for a newbie Erlang developer to also have to learn how to program immutably within a process.


> Sadly, due to his untimely passing, I don't think Joe Armstrong ever really got a good chance to analyze Rust's borrow checker from his perspective. So I'd be reluctant to be too dogmatic about what he'd think about it.

I agree that Joe was a great explorer of ideas. I'm not sure if he expressed thoughts on Rust, but he would probably look at it again from time to time.

> In my considered opinion, in the end, immutability isn't even important to Erlang. What matters in Erlang is that you can't send references across messages, so there is no way to witness mutation done by another process.

In some ways, you may be right. You can always mutate the process dictionary if immutable data really bothers you. But even with the process dictionary, it's not possible to construct a self-referring data structure as an Erlang term, which is important! That makes garbage collection simple.

Also, functional programming makes Erlang processes effectively preemptive, even though they're built from cooperative user-space threads. Tokio tasks can loop and tie up the OS thread they're running on; but an Erlang process will always come to a function call in finite time and can be descheduled at that time, so all runnable processes will get a share of CPU.

Edit to add: It's also important to note that immutability is a property of Erlang (and Elixir), not a property of BEAM. The BEAM VM has opcodes for mutation, and the Erlang compiler will emit them in certain sequences --- if you never use the old value again, it's ok to mutate it rather than create a new modified value; you're most likely to see that with tuples, IIRC.


I don’t really care what people making claims say when they make claims without evidence. ”Who” makes a claim has no bearing on its truth.

Immutability is a tool, not a rule, and I am free to reject any assertion otherwise when those assertions provide no evidence, or shitty anecdotes.

Prove your claims.

Certainly, immutability is a foundation for performance problems.

Another provable rule in computing is that more lines of code = more bugs. Immutability uses more lines of code.

Another demonstrable fact is that Haskell based programs have just as many bugs as any other programming language whether you have immutability or not. Therefore, immutability is not a bastion of robustness.

You’re going to have significant difficulty proving to me that immutability = scalability and robustness when both are demonstrably not true just by taking measurements of the things you expect to improve out of those foundations.

Immutability is not a silver bullet. It is a tool that is sometimes useful, but has significant drawbacks, including shitty performance, and significantly limiting how your data can be managed (without that limitation paying off in any significant way)


You begin with "I don’t really care what people making claims say when they make claims without evidence". May I hold you to your own standards?

Because the rest of your post is pretty LOL-worthy in light of your opening sentence.


Sure:

1) immutability has performance problems: source: literally every measurement of immutable vs. mutable data structures ever performed.

Source 2: logic - copying data is slower than not copying it

Source 3: cache lines: modern CPUs rely pretty heavily on cache lines and branch prediction to improve performance. Immutability measurably harms both.

2) immutability requires more code and loc is the best predictor of defects

Clarification: runtime immutability requires more code

Source: it takes more lines of code to return deep copies of objects than to not do that.

Source: https://www.researchgate.net/publication/316922118_An_Invest...

Package densities are the best predictors of defects

3) Haskell projects have as many bugs as any other language

Source: the best evidence we have here is “the large scale study of programming languages on GitHub”, but I suggest that you look deeper here, as the authors’ qualification of defects is somewhat questionable (a project that never fixes defects would have a low defect rate in this study, and it additionally doesn’t properly compare project sizes and other things). Anyway, in responses that do have better controls in place (and hilariously even in this paper itself, where we see Haskell projects tend to see higher defect rates as they go on while C projects tend to have fewer), we see that Haskell does absolutely no better than anything else for bugs and defects.


When you say "the large scale study of programming languages on GitHub", are you referring to this? https://web.cs.ucdavis.edu/~filkov/papers/lang_github.pdf

"Table 7: Functional languages have a smaller relationship to defects than other language classes where as procedural languages are either greater than average or similar to the average."

"The data indicates functional languages are better than procedural languages; it suggests that strong typing is better than weak typing; that static typing is better than dynamic; and that managed memory usage is better than un-managed."

You got owned by your own source.

As for your un-sourced claim that "copying data is slower than not copying it", I'd suggest learning how immutable-first languages practice data sharing between objects to minimize the amount of copying needed.


I wanted to write that making defensive copies is something you need to do in mutable situations to preserve safety, (not in immutable situations!), but it looks like enough commenters have hit that point.

So lets disabuse your mistrust of immutability in another domain!

Here is some typical "go fast and mutable!" nonsense code:

    int foo(int i, int j) {
      while (i < 10) {
        j += i;
        i++;
      }
      return j;
    }
Let's compile it with https://godbolt.org/, turn on some optimisations and inspect the IR (-O2 -emit-llvm). Copying out the part that corresponds to the while loop:

  4:
    %5 = sub i32 9, %0, !dbg !20
    %6 = add nsw i32 %0, 1, !dbg !20
    %7 = mul i32 %5, %6, !dbg !20
    %8 = zext i32 %5 to i33, !dbg !20
    %9 = sub i32 8, %0, !dbg !20
    %10 = zext i32 %9 to i33, !dbg !20
    %11 = mul i33 %8, %10, !dbg !20
    %12 = lshr i33 %11, 1, !dbg !20
    %13 = trunc i33 %12 to i32, !dbg !20
    tail call void @llvm.dbg.value(metadata i32 poison, metadata !17, metadata !DIExpression()), !dbg !18
    tail call void @llvm.dbg.value(metadata i32 poison, metadata !16, metadata !DIExpression()), !dbg !18
    %14 = add i32 %1, %0, !dbg !20
    %15 = add i32 %14, %7, !dbg !20
    %16 = add i32 %15, %13, !dbg !20
    br label %17, !dbg !21

  17:
    %18 = phi i32 [ %1, %2 ], [ %16, %4 ]
Well, would you look at that! Clang decided (even in this hot loop) never to re-assign any of the left-hand-sides, even though my instructions were just: "mutate j in-place. mutate i in-place."


Listen dude. I’m not going to argue about this:

Immutability is measurably slower. Full stop.

The fact that you can come up with silly, overly simplistic, non-idiomatic anecdotes showing that sometimes a compiler will prefer calculation doesn’t change that. It is a commonly known fact in low level programming that just because you’re calculating something doesn’t make it slower by default.

When Haskell devs can produce a game engine that doesn’t look like PS2 on a 4090, we can chat again about how immutability is supposedly not slow.


> Source 2: logic - copying data is slower than not copying it

> Source: it takes more lines of code to return deep copies of objects than to not do that.

Defensive copying and deep copying is not a thing you have to do in immutable languages. Even under the covers, it's not happening the way you seem to think it is. If I had a large immutable map in use by some other process, and needed a version of it with an element changed or added, why would I deep copy it when I can just point to that same map instance, and add a pointer to the key-value pair I want to substitute [1]? I think this is a common reservation people have about immutable programming because they come into it with an OO mindset. At least, I know I did.

In a really simplified example, a = (1, 2, 3, ..., 100) and b = (2, 3, ..., 100) are not allocated as two full lists in memory space. a contains 1 followed by a pointer to b. Because you have guarantees that b will never change, the single instance of b can be recycled in other data structures (or passed to many other functions and threads) and you avoid the complexity of managing race conditions, mutexes, semaphores, which are a significant source of bugs in other languages.

See [2] for a more realistic implementation.

1. https://en.wikipedia.org/wiki/Radix_tree

2. https://en.wikipedia.org/wiki/Hash_array_mapped_trie
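The sharing described above can be sketched in a few lines of Rust (illustrative only; real persistent structures like the tries in [1] and [2] are trees, not linked lists):

```rust
use std::rc::Rc;

// A persistent cons list: "prepending" builds one new cell that points at
// the existing (immutable) tail, so no element is ever copied.
enum List {
    Nil,
    Cons(i64, Rc<List>),
}

fn prepend(head: i64, tail: &Rc<List>) -> Rc<List> {
    Rc::new(List::Cons(head, Rc::clone(tail))) // shares the tail, O(1)
}

fn len(list: &Rc<List>) -> usize {
    match &**list {
        List::Nil => 0,
        List::Cons(_, tail) => 1 + len(tail),
    }
}
```

Because the tail can never change, handing it out to other structures or threads is safe without any synchronization.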


In order to flatten these to regain cache hits, a deep copy is required all the way down.

There’s a reason that the fastest Haskell game engine looks like ps2 graphics on modern hardware.


If you're requesting proofs, don't make so many bold statements you cannot prove yourself.


Provided above.


The only link you've provided in this discussion is about the relationship of LoC to bugs, nothing to do with immutability or FP: https://www.researchgate.net/publication/316922118_An_Invest...

You have posted nothing else besides your own assertions.


There isn't some universe where there exists a list of axiomatic, unfalsifiable proofs for what does or doesn't constitute the foundations of scalability and robustness. Rather, you have a tradition of development practices that have much more often led to scalability and robustness than the alternatives, and Joe Armstrong was someone who trailblazed that tradition in blood, sweat, and tears, so to speak, as did many of the developers here, many of whom have had experience writing software that handles millions, sometimes hundreds of millions of requests per second, and who would very likely agree with Joe on a lot of things.

Not everything people say on a discussion board is some scientific claim, subject to scientific inquiry and in need of a thesis defense. But if you really off-the-cuff dismiss Joe Armstrong's opinion on a matter because it hasn't met your criteria of proof, despite you thinking you are somehow being the rational scientist here, you are actually revealing your own stupidity.


The author claimed that immutability is a foundation for scalability and robustness.

I reject that claim.

In this comment, you simultaneously agree and disagree with me.

I don’t give a shit what Joe Armstrong says about immutability because the facts are the facts:

1) immutability cause performance problems

2) immutability significantly limits how you can manage data, which is counter to what computers are meant to do

3) immutability measurably does not reduce bugs in programs

I am not dismissing <insert name> off the cuff. I am dismissing them because their claim does not align with metrics you expect to improve as a result of their claim.

>it doesn’t have to be a scientific claim

When you are telling people to “make immutability a foundation of their programming”, you 100% are opening yourself to scientific scrutiny. If you cannot back up this claim with actual metrics, and you’re just going to say “hurr durr, just let me make claims without calling me out to providing evidence please”, why should anyone believe you?


You continue to express strong claims on strong claims without backing a single one of them up. And then you drop to mocking others for challenging those claims.

Have you ever heard the saying, “assertions made without evidence can be dismissed without evidence?” My experience differs.

Immutability has been the foundation of many of our large scale programs. It makes safe concurrent programming easier, and languages built around immutable data structures usually optimize memory handling in ways that are not available when simply writing “functional style” code in non-functional languages. ie under the hood they’re using persistent data structures, structural sharing, tail call optimization, etc.


I think the main thing (in the context of Elixir) is that immutability gives you concurrency out-of-the-box, which in turn gives you predictability and scalability. It is very easy to spin up a couple of nodes that can communicate with each other thanks to this ecosystem.


> 3) immutability measurably does not reduce bugs in programs

I'd be curious to see how you back this claim up. Are you referring to something published that we can all go and read?


It doesn't sound like you've used a natively mutable language before. If all you've ever used is immutable.js or something like that then I can understand why you think this way. Otherwise, you've gotta be trolling.


A bit of the pot calling the kettle black here. You still aren't providing evidence yourself while making strong claims.

For your points:

1) Yes, immutability can cause performance problems in some contexts. However, it can also help on the whole. Mutability in concurrent systems requires all sorts of complications, such as mutexes, that slow things down considerably. Even in single-threaded systems, mutability leads to defensive copying in practice. Furthermore, persistent data structures[0] exist for lists, dictionaries, etc., that achieve very good space and time performance by mutating internally while exposing an immutable interface.

At any rate, even if it is slower, most of the time the performance difference just doesn't matter.

2) How does it limit how you can manage data? It's still possible to mix immutable and mutable data if necessary, but immutable data can be transformed just as mutable data can.

3) You say it measurably does not reduce bugs in programs, again with no evidence. Immutability eliminates entire classes of commonly-encountered bugs, including many pernicious ones related to concurrency. These are bugs that happen commonly with mutable data, but simply don't for immutable data.

In addition, there is some limited empirical evidence to the contrary, which is rare for this kind of thing. Immutable-first Clojure had the lowest proportion of Github issues labeled as bugs, even beating out static languages. [1]

[0] https://en.wikipedia.org/wiki/Persistent_data_structure

[1] https://dl.acm.org/doi/10.1145/2635868.2635922


> you have a tradition of development practices that have much more often lead you to scalability and robustness than than the alternatives

I'm not GP, but these traditions are usually not backed by any evidence but by cargo-culting and cults of personality. Not to mention people who over-hype their favourite technologies to high heavens, poisoning the well for everyone else (no, most of the telecom industry doesn't run Erlang, Naughty Dog didn't ship Lisp on PlayStation 2, and Prolog didn't lead to fifth-generation computing).


Do they need to be backed by evidence to be correct? What if they are just right? Do other companies not running Erlang mean it's wrong? Because that's a bold argument that could be restated to disprove many things -- X technology claims to do Y well and has a track record of doing it really well, but Z companies don't use X technology, therefore...[insert whatever].

I don't believe in blindly believing things without evidence either, especially if I have never encountered them before, but I also don't believe in blindly dismissing the experience of world-renowned experts in their field because they didn't provide me a point-by-point prooftext of every claim they made (again, we aren't sitting here discussing a dissertation or mathematical proof). Their experience and what they've provided to the world is the evidence. We took this 19th-century German ultra-materialist philosophy too far here in the west, and that's what gave us postmodernism/poststructuralism with its disastrous consequences, but it still seems like we haven't learned anything from that.

The ancients had it right that there are different types of knowledge, and different ways of knowing things (and knowing them to be true, at least as far as it mattered). We here in the modern era, with the most unfettered access to information, have quite possibly the narrowest definition, ironically.


> The ancients had it right that theres different types of knowledge

People hawking these traditions usually do for consulting money, not spiritual fulfilment. I am all for non-materialism, but only as long as it's not used to exploit me. Belonging to a post-Colonial country, I know exactly where that leads.


> Another demonstrable fact is that Haskell based programs have just as many bugs as any other programming language whether you have immutability or not. Therefore, immutability is not a bastion of robustness.

Bugs happen when you think you can program something correctly, but can't.

If you look at the implementations of transactions in any other language... Oh wait there aren't any!

People keep trying to implement it in their own languages, figure out it's a non-starter (because of uncontrolled mutation), and give up.


I mostly agree with this (having worked on a simple STM for C++), but with a couple caveats:

- Clojure doesn't enforce purity (it can't), but from what I hear its STM seems to work pretty well (aside from some perf issues possibly? haven't used it). That's because "mostly pure" functional programming is encouraged by both the language itself and the culture and ecosystem around it, so uncontrolled side effects are less likely to be a problem.

- I think STM can work "well enough" in unmanaged languages as long as you don't try to boil the ocean and make it perfectly transparent, safe, and fast under all circumstances (Microsoft, IBM, Intel and several others tried for years and failed). That means there will inevitably be huge footguns for non-expert programmers (e.g., any side effect might be invoked every time a transaction is optimistically and transparently retried). These footguns can be mitigated by affordances like commit/abort handlers and infallible transactions.



