More

stuarthalloway · on May 17, 2024

Thanks Kyle! It is evident that our docs were insufficient. Rich and I have written what we hope is clearer, more comprehensive documentation about Datomic’s transaction model. We hope that this can preempt common misconceptions and we welcome all feedback!

https://docs.datomic.com/transactions/model.html

stuarthalloway · on May 16, 2024

Formal verification is very powerful but still not full assurance. Fun fact: Testing and monitoring of Datomic has sometimes uncovered design flaws in underlying storages that formal verification missed.

asa400 · on May 16, 2024

Is there anything I can read about what capabilities Datomic requires of the underlying storages it uses?

nine_k · on May 16, 2024

What kind of flaws? I would expect performance problems.

jupp0r · on May 16, 2024

To start with, you usually perform verification on a behavioral model and not on the code itself. This opens up the possibility that there are behavioral differences between the code itself and the model which wouldn't be caught.

nine_k · on May 16, 2024

Could you offer an example?

jupp0r · on May 17, 2024

If you work through the TLA+ tutorial[1] it will help you get a good idea of the benefits and limitations of verification.

[1] https://learntla.com/

stuarthalloway · on May 16, 2024

It is worth noting here that Datomic's intra-transaction semantics are not a decision made in isolation, they emerge naturally from the information model.

Everything in a Datomic transaction happens atomically at a single point in time. Datomic transactions are totally ordered, and this ordering is visible via the time t shared by every datom in the transaction. These properties vastly simplify reasoning about time.

With this information model intermediate database states are inexpressible. Intermediate states cannot all have the same t, because they did not happen at the same time. And they cannot have different ts, as they are part the same transaction.

refset · on May 16, 2024

Thank you for the explanations. Do you happen to know why transactions ("transaction requests") are represented as lists and not sets?

stuarthalloway · on May 16, 2024

When we designed Datomic (circa 2010), we were concerned that many languages had better support for lists than for sets, in particular list literals and no set literals.

Clojure of course had set literals from the beginning...

richhickey · on May 16, 2024

An advantage of using lists is that tx data tends to be built up serially in code. Having to look at your tx data in a different (set) order would make proofreading alongside the code more difficult.

stuarthalloway · on May 16, 2024

Ed Wible is a founder of Nubank and chose Datomic. He and Lucas Cavalcanti gave a talk on it at Clojure/conj 2014.

https://www.youtube.com/watch?v=7lm3K8zVOdY

stuarthalloway · on May 16, 2024

What does the following expression return?

(let [x 1] [(inc x) (inc x)])

In Clojure the answer is [2 2]. A beginner might guess [2 2] or [2 3]. Both are reasonable guesses, so a beginner needs to be quite careful!

But that isn't particularly interesting, because beginners always have to be quite careful. When you are learning any technology, you are a beginner once and experienced ever after. Tool design should optimize for the experienced practitioner. Immutability removes an enormous source of complexity from programs, so where it is feasible it is often desirable.

rtpg · on May 16, 2024

this is a lot less of a convincing argument when you consider the fact that we’re talking about a database interaction. If we want to do the FP weenie thing, I would assume a transaction context is monadic!

Another ding on your argument is that in the metaphor you _would_ get [2 3] if you aren’t in a transaction! Of course the reasons for this are not incorrect, but I think generally “the end result of this list of actions is different if you are in a transition for a part of it, even if you are the only client operating” is a bit surprising compared to default transaction semantics in other systems.

Of course, people should learn about their tools and not assume everything works exactly the same. I think it’s incorrect to assume that the choice made here was the only one that would make sense for Datomic.

stuarthalloway · on May 15, 2024

Datomic software needed no changes as a result of Jepsen testing. All functionality in datomic.api is unchanged.

klysm · on May 16, 2024

Congrats, that is a rare outcome!

aphyr · on May 16, 2024

Yeah, I think this is next to Zookeeper as one of the most positive Jepsen reports. :-)

stuarthalloway · on May 15, 2024

As a proponent of just such tools I would say also that "enough rope to shoot(?) yourself" is inherent in tools powerful enough to get anything done, and is not a tradeoff encountered only when reaching for high power or low ceremony.

nickpeterson · on May 16, 2024

I always loved the broken phrase because it implies something really went terribly wrong ;)

stuarthalloway · on May 15, 2024

As Jepsen confirmed, Datomic’s mechanisms for enforcing invariants work as designed. What does this mean practically for users? Consider the following transactional pseudo-data:

[

[Stu favorite-number 41]

;; maybe more stuff

[Stu favorite-number 42]

]

An operational reading of this data would be that early in the transaction I liked 41, and that later in the transaction I liked 42. Observers after the end of the transaction would hopefully see only that I liked 42, and we would have to worry about the conditions under which observers might see that 41.

This operational reading of intra-transaction semantics is typical of many databases, but it presumes the existence of multiple time points inside a transaction, which Datomic neither has nor wants — we quite like not worrying about what happened “in the middle of” a transaction. All facts in a transaction take place at the same point in time, so in Datomic this transaction states that I started liking both numbers simultaneously.

If you incorrectly read Datomic transactions as composed of multiple operations, you can of course find all kinds of “invariant anomalies”. Conversely, you can find “invariant anomalies” in SQL by incorrectly imposing Datomic’s model on SQL transactions. Such potential misreadings emphasize the need for good documentation. To that end, we have worked with Jepsen to enhance our documentation [1], tightening up casual language in the hopes of preventing misconceptions. We also added a tech note [2] addressing this particular misconception directly.

[1] https://docs.datomic.com/transactions/transactions.html#tran...

[2] https://docs.datomic.com/tech-notes/comparison-with-updating...

aphyr · on May 15, 2024

To build on this, Datomic includes a pre-commit conflict check that would prevent this particular example from committing at all: it detects that there are two incompatible assertions for the same entity/attribute pair, and rejects the transaction. We think this conflict check likely prevents many users from actually hitting this issue in production.

The issue we discuss in the report only occurs when the transaction expands to non-conflicting datoms--for instance:

[Stu favorite-number 41]

[Stu hates-all-numbers-and-has-no-favorite true]

These entity/attribute pairs are disjoint, so the conflict checker allows the transaction to commit, producing a record which is in a logically inconsistent state!

On the documentation front--Datomic users could be forgiven for thinking of the elements of transactions as "operations", since Datomic's docs called them both "operations" and "statements". ;-)

stuarthalloway · on May 15, 2024

Mea culpa on the docs, mea culpa. Better now [1].

In order for user code to impose invariants over the entire transaction, it must have access to the entire transaction. Entity predicates have such access (they are passed the after db, which includes the pending transaction and all other transactions to boot). Transaction functions are unsuitable, as they have access only to the before db. [2]

Use entity predicates for arbitrary functional validations of the entire transaction.

[1] https://docs.datomic.com/transactions/transactions.html#tran...

[2] https://docs.datomic.com/transactions/transaction-functions....

lgrapenthin · on May 15, 2024

Somewhat unrelated ad docs: It appears that "Query" opens a deadlink

JB024066 · on May 16, 2024

Thanks for the report! just fixed the link.

Voultapher · on May 15, 2024

The man the myth the legend himself. I haven't ceased to be awed by how often the relevant person shows up in the HN comment section.

Loved your talks.

puredanger · on May 15, 2024

Datomic transactions are not “operations to perform”, they are a set of novel facts to incorporate at a point in time.

Just like a git commit describes a set of modifications, do you or should you want to care about which order or how the adds, updates, and deletes occur in a single git commit? OMG no, that sounds awful.

The really unusual thing is that developers expect intra-transaction ordering to be a thing they accept from any other database. OMG, that sounds awful, how do you live like that.

cdchn · on May 16, 2024

Do developers not expect intra-transaction ordering from within a transaction?

kccqzy · on May 16, 2024

It depends on the previous experience of said developers, and such expectation varies widely.

voganmother42 · on May 15, 2024

Nested transactions or savepoints also exist in other systems

stuarthalloway · on April 20, 2021

Do you have a specific example? Datomic is great for evolving in prod (https://blog.datomic.com/2017/01/the-ten-rules-of-schema-gro...), and Clojure is a great language for generating data (or ETL) when you want to make radical changes in dev.

blacktriangle · on April 20, 2021

It was a service ticket system where we moved from a boolean status to a string for more states (noob modeling mistake I know). The problem came up a few months later when a dev pulled every attribute, saw the old field, and wrote new code against the old field, leading to some really confused tickets.

Looking at that blog post though, it seems that using #9: Annotate your Schema could have helped us avoid this situation. Thats an interesting feature of Datomic I was unaware of.

stuarthalloway · on July 23, 2020

All Datomic consulting will continue.

nojito · on July 23, 2020

It says this in your post.

Cognitect will continue to offer professional services for Datomic customers, but will transition out of general consulting development.

Does this mean your general consulting services are going away?

mark_l_watson · on July 23, 2020

I friend of mine hired the principles at Cognitect to kick off a Clojure/Datomic project. They were like a force of nature, really, and very quickly got things set up and helped with architectural decisions, etc. I would imagine that they have helped hundreds of customers this way.

From the blog post, I think that the answer to your question is yes: they will stop general consulting services.

randompwd · on July 23, 2020

Here's hoping they release a getting started project now that they've dipped out of general consulting.There must have been a standard-ish template in use.

dkersten · on July 23, 2020

Great, so it sounds like not much will change for external users of Clojure and Datomic. Sounds like the the biggest change is in terms of funding.

Sounds good to me, congratulations!

EDIT: I guess my confusion was between professional services for Datomic and general consulting, since both were mentioned in the context of Datomic.