One thing I've noticed since coding with immutable data structures & functions (rather than mutable OOP programs) is how fast the tests run, and how easy they are to run in parallel.
I/O only happens in a few functions, and most other code just takes data in -> transforms -> returns data out. This means only a few functions need to 'wait' on something outside of themselves to finish, and there are far fewer delays in the code.
This is coding in Clojure for me, but you can do it in any language that has functions (preferably one with efficient persistent data structures, like the tree-based PersistentVector in Clojure).
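To make that concrete, here's a minimal sketch (with hypothetical names) of what that shape looks like in Clojure: the transformation is a pure function, and only the function at the edge touches the outside world.

    (defn summarise-order
      "Pure: a plain map in, a plain map out. Trivial to test, safe to run in parallel."
      [{:keys [items] :as order}]
      (assoc order :total (reduce + (map :price items))))

    ;; I/O stays at the edges; fetch-orders! and save-summary! are the only
    ;; things that have to wait on the outside world.
    (defn process-orders! [fetch-orders! save-summary!]
      (let [orders    (fetch-orders!)                  ; I/O in
            summaries (pmap summarise-order orders)]   ; pure, runs in parallel
        (doseq [s summaries]
          (save-summary! s))))                         ; I/O out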
Immutable data structures give you easy parallelism, but there's a hidden runtime cost: you have to allocate far more objects. For example, I was able to save a ton of object allocations here: https://github.com/mime-types/ruby-mime-types/pull/93 mostly by mutating. For tasks that aren't easily parallelizable, immutable structures may actually be slower.
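To illustrate the allocation point in Clojure terms (a rough sketch of the same idea, not the change in that PR): building up a persistent map allocates a new, mostly shared map on every step, while mutating a plain java.util.HashMap keeps reusing one object.

    (defn counts-persistent [words]
      ;; every update returns a fresh persistent map
      (reduce (fn [m w] (update m w (fnil inc 0))) {} words))

    (defn counts-mutating [words]
      ;; one object, mutated in place
      (let [^java.util.HashMap m (java.util.HashMap.)]
        (doseq [w words]
          (.put m w (inc (.getOrDefault m w 0))))
        m))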
I mostly hear about how fast FP languages are, so maybe they use some tricks to avoid those allocations. I'd be interested in hearing more about it.
Yes, mutating something in place is more efficient, because you don't need to allocate new memory.
I don't know all that much about other functional languages, as I learned perl -> ruby -> javascript -> a little bit of C & Java & Go -> now Clojure. But I find Clojure's collection data structures interesting. The vector (a collection like a list or array) looks immutable, but under the hood it's a tree. When you append to a vector, you seem to get a whole new vector back.
In reality, the new element was added as a node in the tree, and the pointers were adjusted so that the new and old versions share almost all of their nodes & allocations. With a plain array or list, you would have to copy every element anew.
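A small demo of that behaviour (just illustrative):

    (def v1 (vec (range 1000)))   ; a 1000-element persistent vector
    (def v2 (conj v1 :new))       ; "appending" returns a new vector

    (count v1)   ;=> 1000  -- the original is untouched
    (count v2)   ;=> 1001
    (peek v2)    ;=> :new

    ;; Under the hood both are 32-way trees: v2 reuses almost every node of
    ;; v1, and only the short path to the new tail gets allocated -- not a
    ;; fresh copy of all 1000 elements.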
Personally, I find most things in business easily parallelisable. You mostly decouple the I/O parts of something from the logic parts. But yeah, I still have much to learn. Thanks for the link! Interesting :)
That's cool. In Ruby, Hash merges are really expensive. I played around with a non-mutating data structure that holds references to two hashes and behaves as if they had been merged, instead of allocating a new one. It was a fun thought experiment but wasn't 100% API backwards compatible, as mutations got ugly. Thanks for the link.
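Sketching the same idea in Clojure rather than Ruby (hypothetical, not the actual experiment): a read-only 'view' over two maps that answers lookups as if they had been merged, without allocating a merged hash. As noted, writes are where it gets ugly.

    (defn merged-view
      "Lookups behave like (merge m1 m2): m2 wins on conflicts."
      [m1 m2]
      (fn lookup
        ([k] (lookup k nil))
        ([k not-found]
         (if-let [e (find m2 k)]
           (val e)
           (get m1 k not-found)))))

    (def cfg (merged-view {:a 1 :b 2} {:b 3}))
    (cfg :a)   ;=> 1
    (cfg :b)   ;=> 3, same precedence as merge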
> Transients are an optimisation on persistent data structures which decreases the number of memory allocations needed. This makes a huge difference for performance-critical code.
Can you say more? I don't know much about how C++ uses destructor positioning. It seems to me that by the time you get to GC it's too late; you've already made the allocation, which is the expensive part.
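For reference, the transient idiom being described looks like this in Clojure (a minimal sketch): the intermediate work happens on an in-place version of the collection, which is then frozen back into a persistent one.

    (defn into-vec [xs]
      (persistent!
        (reduce conj! (transient []) xs)))

    ;; Same result as (reduce conj [] xs), but the intermediate steps mutate
    ;; one transient vector instead of allocating a new persistent one each
    ;; time; clojure.core/into uses this trick internally.
    (into-vec (range 5))   ;=> [0 1 2 3 4]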