> *It sounds like Uber is using MySQL as just a data bucket with primary keys* T...

smoodles · on July 26, 2016

Operating Cassandra at the scale that Uber is going to require is going to be painful, and as operationally draining as MySQL, if not more so.

There are really not a large number of options here anymore with the departure of FoundationDB from the market. CockroachDB might be an option in a few years, though I'm still confused why they are moving towards a SQL-ish vs key-value interface...

nickpsecurity · on July 26, 2016

"departure of FoundationDB from the market"

Pissed me off so much. It was the only thing close to Google's F0 RDBMS on the market, at a reasonable rate, and the beginning of a good offer to enterprises. Then, "poof!" It's a good example of why I tell companies not to put anything critical into something from a startup. If they do, they'd better have a synchronized backup option tested and ready to go.

"why they are moving towards a SQL-ish vs key-value interface..."

That's easy: most databases and buyers use SQL. Key-value is preferred by startups and by non-critical side projects in big companies. You see those here a lot, but they aren't representative of most of the market. You need first-rate SQL support. I think EnterpriseDB shows that it's also a good idea to clone a market leader's features onto an alternative database.

ngrilly · on July 29, 2016

> Only thing close to Google's F0 RDBMS

Did you mean F1 (instead of F0)?

nickpsecurity · on July 29, 2016

Yeah, yeah. I keep getting the 0 and 1 mixed up. Thank you.

ddorian43 · on July 26, 2016

What DBs would you suggest at their scale that are operationally easier than Cassandra?

MikeKusold · on July 26, 2016

I was at MesosCon and ended up talking to some Uber people. They are currently using Cassandra in prod. I can't speak as to why they use MySQL the way they do though.

verma7 · on July 27, 2016

I gave a talk at MesosCon about how we (are starting to) run Cassandra across multiple datacenters at Uber (https://www.youtube.com/watch?v=U2jFLx8NNro, https://schd.ws/hosted_files/mesosconna2016/60/mesoscon-uber...).

mandeepj · on July 26, 2016

Thanks for sharing this. I was wondering - the limitations they have listed could be easily overcome with Cassandra.

dastbe · on July 26, 2016

So arguably, they are using MySQL as a storage engine rather than as a database.

They don't explicitly answer the question "Why didn't you use InnoDB/WiredTiger/etc. for your dataplane?", but you get the idea that they were very happy with the specific characteristics of MySQL for their use case and so they built on top of it. It also sounds like they had some deadlines (specifically, the death of their datastore) that they had to meet :).
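To make the "storage engine rather than database" point concrete, here is a minimal sketch of a relational table used purely as a key-value bucket behind a primary key. It uses Python's built-in `sqlite3` in place of MySQL so that it runs anywhere. The table name `bucket` and the helpers `open_bucket`, `put`, and `get` are hypothetical and are not taken from Uber's system.

```python
import json
import sqlite3


def open_bucket(path=":memory:"):
    """Open a database holding a single key -> opaque-value table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS bucket (k TEXT PRIMARY KEY, v TEXT NOT NULL)"
    )
    return conn


def put(conn, key, value):
    """Store a JSON-serializable value under key, overwriting any old value."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO bucket (k, v) VALUES (?, ?)",
            (key, json.dumps(value)),
        )


def get(conn, key):
    """Return the value stored under key, or None if the key is absent."""
    row = conn.execute("SELECT v FROM bucket WHERE k = ?", (key,)).fetchone()
    return json.loads(row[0]) if row is not None else None
```

The SQL layer does nothing here beyond primary-key lookups. The schema lives inside the serialized value, which is roughly what "data bucket with primary keys" means.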

carapace · on July 26, 2016

I had that same thought, that the time spent rolling their own system could be better spent just learning some existing good-enough thing.

A great way to get familiar with something is to be the folks who write it. It's also much more fun to design and implement something new than to just learn some other fella's software. I'm guilty of this myself.

But I've started to remind myself that "somebody else has had this problem" and there's probably a good enough solution out there already.

Put another way, is what you are trying to do really so novel? In the case of Uber's infrastructure, you would have to talk for a while to convince me that they really, really need something not-off-the-shelf.

mdani · on July 27, 2016

IMHO, some possible reasons for not using Cassandra could be the following.

You can't use Cassandra if you need atomic increments (yes, they're included, but they're painfully slow due to the several round trips required to satisfy Paxos).

Also there are no transaction rollbacks (atomic batches always go one way - forward).

You may hit GC pauses if the JVM is not tuned properly.

If the use case involves lots of deletes, then tombstone-related issues need to be considered.
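The tombstone point can be illustrated with a toy model. This is a sketch of the general idea only, assuming an append-only store where a delete writes a marker instead of erasing data. It is not Cassandra's actual storage format, and `ToyLog` and its methods are hypothetical names.

```python
TOMBSTONE = object()  # sentinel marking a deleted key


class ToyLog:
    """Append-only store: deletes are recorded as tombstones, not erased."""

    def __init__(self):
        self.entries = []  # (key, value or TOMBSTONE), oldest first

    def put(self, key, value):
        self.entries.append((key, value))

    def delete(self, key):
        # Nothing is removed; the log grows by one marker.
        self.entries.append((key, TOMBSTONE))

    def scan(self):
        """Return (live_rows, tombstones_read) after reading every entry."""
        latest = {}
        for key, value in self.entries:
            latest[key] = value  # newer entries shadow older ones
        live = {k: v for k, v in latest.items() if v is not TOMBSTONE}
        tombstones = sum(1 for v in latest.values() if v is TOMBSTONE)
        return live, tombstones

    def compact(self):
        """Rewrite the log, keeping only the live rows."""
        live, _ = self.scan()
        self.entries = list(live.items())
```

With 1000 writes followed by 999 deletes, a scan returns one live row but still has to read through all 1999 entries until `compact()` runs. Delete-heavy workloads pay that read cost for as long as the markers are retained.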

stubish · on July 27, 2016

I wouldn't have trusted Cassandra back then either. 0.9, 1.0 or maybe 1.2 was reaching sufficient maturity to actually be recommended. Modern Cassandra has come on in leaps and bounds, with the 2.x series finally becoming stable this year and just recently 3.0.x finally getting blessed by the community as stable enough for production. And ScyllaDB is hot on their heels.