Hacker News | irfansharif's comments

> Actually, I always believed that the internal key-value store that they use would never scale to represent table workloads.

Care to elaborate here? As someone working at that layer of the system, our RocksDB usage is but a blip in any execution trace (as it should be; in a distributed system, network overhead dominates single-node key-value perf). That aside, plenty of RDBMSs are designed to sit atop internal key-value stores. See MySQL+InnoDB[0], or MySQL+RocksDB[1] as used at Facebook.

[0]: https://en.wikipedia.org/wiki/InnoDB

[1]: https://myrocks.io/


Don't get me wrong, both RocksDB and the work done by CCDB are pretty cool.

Yet I still believe that layering a row model as the V of a K-V store introduces, by definition, inefficiencies when accessing the columns of a row the way row stores do, compared to pure row storage. It's not that it can't work, but I believe it can never be as efficient as a more row-oriented storage (say, like Postgres).


I have no idea what you're saying. What's a "row-oriented storage" if not storing all the column values of a row in sequence, in contrast to storing all the values of a column across the table in sequence (aka a "column store")? What does the fact that it's exposed behind a KV interface have to do with anything? What's "more" about Postgres' "row-orientedness" compared to MySQL's?

In case you didn't know, a row [idA, valA, valB, valC] is not stored as [idA: [valA, valB, valC]]. It's more like [id/colA: valA, id/colB: valB, id/colC: valC] (modulo caveats around what we call column families[0], where you can get something closer to the former if you want). My explanation here is pretty bad, but [1][2] go into more detail, and there's a rough sketch after the links below.

[0]: https://www.cockroachlabs.com/docs/stable/column-families.ht...

[1]: https://www.cockroachlabs.com/blog/sql-in-cockroachdb-mappin...

[2]: https://github.com/cockroachdb/cockroach/blob/master/docs/te...
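
For what it's worth, here's a minimal sketch of that per-column mapping in Go, assuming a hypothetical users table and a made-up "/users/<id>/<col>" key format; the real encoding (ordered byte strings, table/index IDs, column-family suffixes) is covered in the links above:

    package main

    import "fmt"

    // Hypothetical sketch of the idea above, not CockroachDB's actual key
    // encoding: each non-primary-key column of a row becomes its own KV
    // pair, keyed by (table, primary key, column).
    func main() {
        type kv struct{ key, val string }

        // Hypothetical table: users(id PRIMARY KEY, name, email, age).
        id := "42"
        row := map[string]string{"name": "ada", "email": "ada@example.com", "age": "36"}

        var pairs []kv
        for col, val := range row {
            pairs = append(pairs, kv{key: "/users/" + id + "/" + col, val: val})
        }
        for _, p := range pairs {
            fmt.Println(p.key, "=>", p.val)
        }
    }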


I know it well; I've read most of the CCDB posts and architecture documentation. Pretty good job.

There are several ways to map a row to K-V stores, and different databases have chosen different approaches, I'm not referring specifically to CCDB's.

Whether you do [idA: [valA, valB, valC]] or [id/colA: valA, id/colB: valB, id/colC: valC], what I'm saying is that I believe it is less efficient than [idA, valA, valB, valC], which also supports compound keys more naturally (i.e. [idA, idB, idC, valA, ...]). The latter layouts are how Postgres stores rows.
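
To make the layouts I mean concrete, here's a rough sketch (very simplified; it ignores tuple headers, nulls, real key encodings, and so on):

    package main

    import (
        "fmt"
        "strings"
    )

    // Rough illustration of the contrast: a "packed" row record that keeps a
    // compound key and all of the row's values contiguous (Postgres-like,
    // very simplified), versus one KV pair per column.
    func main() {
        // Packed, row-oriented record: [idA, idB, valA, valB, valC] stored together.
        packed := strings.Join([]string{"idA=1", "idB=7", "valA=x", "valB=y", "valC=z"}, "|")
        fmt.Println("row-oriented record:", packed)

        // The same logical row as one KV entry per column.
        perColumn := map[string]string{
            "/t/1/7/colA": "x",
            "/t/1/7/colB": "y",
            "/t/1/7/colC": "z",
        }
        for k, v := range perColumn {
            fmt.Println("kv entry:", k, "=>", v)
        }
    }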


> Just, please, pretty please, give us a --insecure

Have you taken a look at [0]? (All caveats around running an insecure cluster apply)

[0]: https://www.cockroachlabs.com/docs/stable/deploy-cockroachdb...


TrueTime is not a publicly available GCP service.


We take our claims of correctness very seriously. The behavior you're describing, depending on exactly what you're doing, is likely an allowed serializable history, which is the isolation level we claim to support. See [0] for a deeper dive on the subject. Either way, you should show us what you found (if you haven't already).

As for your performance analysis, do share your results and methodology. We published an in-depth, reproducible comparison with YugaByte here [1] in addition to publishing our TPCC-100k numbers[2]. If you're seeing performance that doesn't line up with the above, let us know.

[0]: https://www.cockroachlabs.com/blog/consistency-model/

[1]: https://www.cockroachlabs.com/blog/unpacking-competitive-ben...

[2]: https://www.cockroachlabs.com/blog/tpcc-100k/


Thanks, "serializable history" could well have been the root cause. I did discuss this in the forum but was unable to resolve it at that time.



Still seems quite hidden compared to the usual https://sas.io/pricing


As someone working for CRDB, I disagree with you re: "the whole point is that it enables Google scale". There's a wide gulf below that point where CockroachDB is still eminently useful, especially considering the alternatives.


CockroachDB runs the Jepsen test suite nightly. We've been following along with Aphyr's recent test additions (`multi-register`, for instance, which immediately caught [0]), porting them over when appropriate. We definitely have work to do incorporating the more DDL-focused tests that tripped up YugaByte.

[0]: https://github.com/cockroachdb/cockroach/pull/40600


That's pretty damn cool. I wish I had more time to seriously try out CockroachDB.


v20.1 is slated for release in April.


Those numbers are surprising to me. Do you have latency charts to share? And are you using think time? http://www.tpc.org/tpcc/detail.asp


Hmm, it does seem like they are not using think time.

From the official TPC-C specification[0]:

> The maximum throughput is achieved with infinitely fast transactions resulting in a null response time and minimum required wait times. The intent of this clause is to prevent reporting a throughput that exceeds this maximum, which is computed to be 12.86 tpmC per warehouse.

So the maximum result they should be able to get at 100K warehouses is 1,286,000 tpmC.
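
A quick sketch of that arithmetic, 12.86 tpmC per warehouse times the warehouse count (4,794,240 is the Alibaba warehouse count mentioned below):

    package main

    import "fmt"

    // The TPC-C spec caps throughput at 12.86 tpmC per warehouse, so the
    // maximum reportable result is simply 12.86 * warehouses.
    func main() {
        const tpmCPerWarehouse = 12.86
        for _, warehouses := range []int{100_000, 4_794_240} {
            fmt.Printf("%d warehouses -> max %.0f tpmC\n", warehouses, tpmCPerWarehouse*float64(warehouses))
        }
    }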

To reach higher, they should do what Alibaba did[1] and use 4,794,240 warehouses. They got officially accepted with a result which dwarfs even Planetary’s incorrect benchmark.

[0]: http://www.tpc.org/tpc_documents_current_versions/pdf/tpc-c_...

[1]: https://www.alibabacloud.com/blog/oceanbase-did-better-than-...


> cdb went with Go but their storage layer ended up being rewritten with C,

The storage layer was not rewritten in C; it's just RocksDB (which is C++). There's ongoing work to replace it with a custom-built LSM store written in Go: https://github.com/cockroachdb/pebble

