Why MongoDB thinks single server durability is overvalued (mongodb.org)
49 points by rgeorge28 on Feb 10, 2010 | hide | past | favorite | 29 comments


Tokyo Cabinet Dude™ mentioned the same thing a few months ago: http://1978th.net/tech-en/promenade.cgi?id=6

Summarized: "Please use replication because hardware breaks down."


What sites are using MongoDB in production?


We're using it for http://www.serverdensity.com where our database is around 600GB replicated over multiple data centres.


How do you handle security when replicating across data centers? It's my understanding that MongoDB supports authentication, but not encryption.


Yeah, we have a site-to-site VPN through Cisco hardware firewalls. This is a good way to do it because it removes extra complexity from the database. Let the database handle just what it needs to do, and do it well, because there are probably better tools for the other stuff elsewhere.


An encrypted tunnel that isn't initiated at the database level should work fine.

Addendum: I'm more than a little puzzled that someone would be aware that a particular nosql database didn't support encryption natively and yet wasn't aware how tunneling works.
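Whatever carries the tunnel (IPsec on the firewall, SSH, tinc), a quick sanity check is that the database port answers on the tunnel/private address but not on the public interface. A minimal reachability probe, sketched in Python (the addresses in the comments are placeholders, not from the thread):

```python
import socket

def port_open(host, port, timeout=2.0):
    """Return True if a plain TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False

# With the VPN up, MongoDB (default port 27017) should be reachable
# through the tunnel but not from the public internet:
#   port_open("10.0.0.5", 27017)     # private/tunnel address (placeholder)
#   port_open("203.0.113.7", 27017)  # public address (placeholder)
```

Running the second check from outside the VPN is a cheap way to confirm the database itself never has to speak TLS or handle authentication for replication traffic.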


You are making an assumption in your addendum that uggedal does not understand how tunneling works. All he did was ask how they handle security, not "how does tunneling work?"


I've actually used http://www.tinc-vpn.org/ to tunnel Tokyo Tyrant between data centers. It was a bit flaky at times though, so hardware Cisco tunneling looks like a good option if you have access to the firewall in your environment (eliminates all cloud providers).



We are using it in production, currently replicating across multiple datacenters. We're mainly using it as a CRUD datastore with advanced stored procedures and atomic update capabilities. At peak moments our MongoDB servers are handling around 10,000 operations / second.


Can you tell me your hardware specs and CPU usage? Just curious since I'm considering mongo for some of our future games.


Well, it's kind of hard to tell, since we're combining our MongoDB nodes with other services too (Hadoop DFS datanodes, to name a common example). We're running MongoDB on 12 different virtual dedicated servers, each comparable to a dual-core Intel Xeon with 4GB of RAM.

But I can tell you that you really shouldn't worry about MongoDB and performance unless you want a really high load on a single server; I've heard it can do 20k write ops / sec on a single commodity server. From what I can see, our results roughly line up with that: MongoDB performance certainly isn't a problem.


http://get.harmonyapp.com/ (by the author of MongoMapper if I'm right)

I use MongoDB in production for 2 customer projects too.


You are right - both are projects of John Nunemaker.


A bunch of large sites use it for specialized tasks, and Business Insider uses it for everything. They have a list here: http://www.mongodb.org/display/DOCS/Production+Deployments


We're in the process of moving from SimpleDB to MongoDB for http://www.introspectrum.com, with a bit south of 100GB of data at the moment.


I am, for a small, new site (all aspects of the site's data needs). Haven't had any troubles since I started using it last summer.


Defensio. Our Mongo database is ~250GB.


I use it in production for everything on two smaller sites and one largish one, and as a message queue on a number of others. For what it's worth I have been really impressed by its performance and flexibility, and the professional attitude of its developers. Definitely worth checking out.


I love how he mentions "water damage" in the list of things that can happen to a server. That made me snicker... who doesn't hate those leaky datacenter roofs!


At work, a server that I am responsible for the software on, but not the general IT-type management, was becoming increasingly flaky. It kept powering down at odd times, once managing to corrupt a MySQL database pretty badly. Shortly after a firmware update, it ceased coming up. I honestly did assume it was something in my software update, as updating this server was often the next step after QAing the software. Once I realized that I couldn't even ping the device (which I am 2000 miles from), I had to send in the local IT personnel to log in physically.

This is when they discovered that the device would no longer physically turn on. Next step is to de-rack the device, at which point some strange discoloration was discovered in the holes in the case. Next step was to open the case, where it was discovered that the entire machine was full of mold. The roof was leaking, and this was on the top of the rack directly under the leak. In Silicon Valley, leaky roofs can apparently take a while to discover, but they will eventually make their presence known.

Now, this was in a local office, albeit a relatively well-equipped one that holds hundreds of rack machines, not a colo. But you know what? Shit happens. And you don't really have a guarantee that shit won't happen just because someone has magically slapped the word "data center" on a building.


once managing to corrupt a MySQL database pretty badly.

Rare as a Himalayan Snowcock spotting!


How about a sprinkler system? Or the fire department? Or a flood? These sorts of things are actually more common than you may think.

See: http://www.youtube.com/watch?v=ttcQy3bCiiU also, http://www.datacenterknowledge.com/archives/2007/12/04/rains...


Yeah, I know it does happen occasionally. It just amused me that he listed it in first position, no less (and I'm sure he did it for that very reason ;) ).

As common as water damage may be, it is probably a rounding error compared to less spectacular failures such as dying disk drives and PSUs.

Regardless, that datacenter flood video is absolutely priceless. Thanks for that!


Hey, it happens. One company I worked at moved into a new office (after I left) near the top floor of a highrise. There was a crack in the roof under one of the building's cooling units, and their computer room was near the plenum. During a big rain storm, water leaked down from the roof, through the plenum, ran along the ceiling and dripped all over the racks. Oddly, many of the computers remained running despite having water pooled inside of them.


I have some concerns about MongoDB and since it gets such good reviews from a lot of people, I think I am either making a big deal out of nothing or everyone else is crazy.

I wrote up some comments elsewhere but this thread is more populated with people _using_ mongo in production.

Elsewhere = http://news.ycombinator.com/item?id=1110366


"First, there are many scenarios in which that server loses all its data no matter what. If there is water damage, fire, some hardware problems, etc..."

Err! So there are burglars and other evildoers sneaking through the city. Does that mean I should stop locking my door, since burglaries happen anyway?

"In the real world, traditional durability often isn’t even done correctly. If you are using a DBMS that uses a transaction log for durability, you either have to turn off hardware buffering or have a battery backed RAID controller."

Or you write to raw devices, which perform pretty neatly. Does the fact that a _lot_ of databases are set up and managed in a crappy way justify treating that as the norm? I don't think so.
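For readers unfamiliar with the "transaction log" point above: durability on commit boils down to an fsync before acknowledging the write. A sketch of what that looks like at the OS level (the filename and record format are made up for illustration); note that fsync only guarantees the data left the OS page cache, which is why the article mentions disabling hardware write buffering or using a battery-backed RAID controller:

```python
import os

def durable_append(path, record: bytes):
    """Append a record and force it to stable storage before returning,
    the way a transaction log must before a commit is acknowledged."""
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
    try:
        os.write(fd, record)
        os.fsync(fd)  # flush OS buffers to the device; with a write-back
                      # disk cache and no battery, data can still be lost
    finally:
        os.close(fd)
```

The expensive part is the fsync, not the write: it serializes on the disk, which is exactly the cost the article argues MongoDB avoids by relying on replication instead.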

Here's the disclaimer: I don't know anything about MongoDB and not a helluvalot about NoSQL databases.

But I think the reasoning behind this post is not very sound. No matter which database.


[deleted]


which database servers advertise "single server durability" as a feature

Any database server that advertises being ACID compliant:

http://en.wikipedia.org/wiki/ACID#Durability
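As one concrete example of what the "D" buys you, here's a minimal sketch using SQLite (chosen only because it ships with Python; the database filename is made up). Once commit() returns with synchronous=FULL, the row is on stable storage and survives the process dying:

```python
import sqlite3, tempfile, os

db_path = os.path.join(tempfile.mkdtemp(), "ledger.db")  # hypothetical path

conn = sqlite3.connect(db_path)
conn.execute("PRAGMA synchronous=FULL")  # fsync on every commit
conn.execute("CREATE TABLE log (k TEXT, v TEXT)")
conn.execute("INSERT INTO log VALUES (?, ?)", ("key", "value"))
conn.commit()  # durability point: data is on disk when this returns
conn.close()

# Reopen: the committed row is still there after the first
# connection (standing in for a crashed process) has gone away.
conn2 = sqlite3.connect(db_path)
row = conn2.execute("SELECT v FROM log WHERE k = ?", ("key",)).fetchone()
```

This is the single-server guarantee the article argues is overvalued relative to replication.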


Well, I guess I fail reading comprehension then :-)



