Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

There are several reasons for CouchDB's use of a single, strictly append-only file, but the b-tree is not one of them.

In fact, it's one of CouchDB's most clever tricks to store a b-tree in an append-only fashion (it's far more common to update in place).

Two reasons CouchDB is strictly append only.

1) Safety: By never overwriting data, it avoids a whole class of data corruption issues that plague update-in-place schemes.

2) Performance: Writing at the end of the file allocates new blocks, leading to lower fragmentation and less seeking than an update-in-place scheme.

"durability" does not mean "the file is...never left in an odd state". The right word there is "consistent". CouchDB provides both (In fact it provides all four ACID properties, but the scope of a transaction is constrained to a single document update).

Finally, I should point out that using a single file, as opposed to a series of append-only files (like Berkeley JE, for example) is just a pragmatic choice. Nothing in principle would prevent a JE style approach, it's just a lot of (quite hard) work to do well.



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: