Picking SQL or NoSQL

petepete · on June 20, 2016

Basically, unless you have a really, really good understanding of your problem and know (for sure) that you won't need joins, just use an RDBMS.

tuna-piano · on June 20, 2016

When I began working on my first app backend, I had heard so much about the MEAN stack I thought that would be the way to go.

Turns out NoSQL should definitely not be the default answer for a project.

eric_h · on June 20, 2016

I was chatting with a guy at a barbecue who'd just finished a General Assembly Web Development Immersive course and had just been hired as one of 3 developers.

When I asked what his stack was, he said something to the effect of "Rails, with a MongoDB backend".

I pressed him further asking how far along the project was, and found out they had just started. He seemed offended when I suggested that leading with Mongo was a premature optimization. "But we're just dealing with blobs of json".

Admittedly, I have no idea what this guy's name is, or the company he worked for (and this is therefore unadulterated speculation), but my bet is that they have either collapsed from the world of shit they created for themselves, or they've had to hire someone (or several) to dig them out of said world of shit and it's cost them quite dearly.

Don't get me wrong, NoSQL has its place, but not as the "paper of record" for the very beginning of a project.

[Edit: kinda skipped over my point, which was that it's never "just a blob of json", it has structure which, if not explicitly defined somewhere, will bite you in the ass]
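To make that concrete, here is a minimal sketch of what "explicitly defined somewhere" could look like if you do keep JSON blobs. The `Signup` record and its field names are invented for illustration; nothing here comes from the project described above.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Signup:
    """Hypothetical record shape; the field names are made up for this sketch."""
    email: str
    plan: str
    seats: int


# The structure the "blob" is assumed to have, written down in one place.
EXPECTED_FIELDS = {"email": str, "plan": str, "seats": int}


def parse_signup(raw: str) -> Signup:
    """Turn a JSON blob into a Signup, failing loudly when the shape is wrong."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = EXPECTED_FIELDS.keys() - data.keys()
    unexpected = data.keys() - EXPECTED_FIELDS.keys()
    if missing or unexpected:
        raise ValueError(f"missing={sorted(missing)} unexpected={sorted(unexpected)}")
    for name, expected_type in EXPECTED_FIELDS.items():
        value = data[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected_type):
            raise ValueError(f"{name} should be {expected_type.__name__}")
    return Signup(**data)
```

A blob with a typo'd key such as `"seat"` gets rejected at write time instead of surfacing months later as a missing field in a report.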

kpil · on June 20, 2016

"If I have seen further, it is by standing on the shoulders of giants" - Isaac Newton.

"Let's shove my head into my own ass. NOSQL!" - said by too many young and aspiring developers.

It's a leap backwards to the seventies and DL/1, but without transactions, that's what it is.

Edit:

I have never worked with a system that has blobs of "data" and has been in production for a while where that turned out to be a good decision.

Also, data integrity is hard. Building it on your own is really hard. Most people have trouble getting it right when the hard bits are already provided for them.

SQL isn't perfect, but it's the easy way out 99 times out of 100.
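As a rough illustration of the "hard bits already provided", here is a small sketch using Python's built-in `sqlite3` module. The `accounts` table and the `transfer` helper are hypothetical, chosen only to show a `CHECK` constraint and a transaction rollback doing the integrity work for you.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (
        id      INTEGER PRIMARY KEY,
        owner   TEXT    NOT NULL UNIQUE,
        balance INTEGER NOT NULL CHECK (balance >= 0)  -- the database enforces this
    );
    INSERT INTO accounts (id, owner, balance) VALUES (1, 'alice', 100), (2, 'bob', 0);
""")


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst atomically; return False if the database rejects it."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint fired; the credit to dst was rolled back as well.
        return False
```

With these rows, `transfer(conn, 1, 2, 150)` returns `False` and leaves both balances untouched, even though the credit to `bob` had already been applied inside the transaction. Reproducing that guarantee by hand on top of a store without transactions is the part that tends to go wrong.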

eric_h · on June 20, 2016

The problem is really that SQL isn't the easiest way to start from nothing, take some data, manipulate it and display it on screen. You have to take a couple of steps to set up a schema, take the data you have, convert it to that schema, then manipulate it and display it on screen.

Mongo and its ilk give you instant gratification and a tremendously false sense of productivity. It's the easiest way for step 1 without question; it's steps 2 and onwards where it starts to show its problems.

combatentropy · on June 21, 2016

It's odd that he put NoSQL at a lower level of abstraction than SQL. In a way, it makes sense, because yes you do have to reimplement many SQL things.

But I always thought that the attraction to NoSQL was that you didn't have to unravel your application's objects into tables. You just stuck them in there. You worried about reports later.

Zeimyth · on June 20, 2016

An interesting article. I'm glad to see someone giving an equal treatment of SQL and NoSQL databases for a change. I appreciate the analogy of SQL being an optimized and general-purpose high-level language and NoSQL being a more powerful but specialized low-level or assembly language.

However, I feel like the article fundamentally fails to address the core question of "When should I pick NoSQL over SQL?" If SQL is indeed the choice 99 times out of 100, how can you know when you are faced with that 1-in-100 time when NoSQL is actually the right option?

devuo · on June 21, 2016

It's really "not" that complicated. Start by assuming by default that you will use an RDBMS, then:

1) Attempt to model a database schema that fully and completely supports said data. This will force you to get a sufficiently deep understanding of the domain problem and how its different parts relate to one another.

2) Is there a part of the domain problem whose dynamic or unstructured nature cannot be properly represented in a fixed database schema, and/or unstructured data that must be queried efficiently? Add a "companion" Document DB to your architecture.

3) Are there performance issues/requirements on a given part of your system, and do you need to hold volatile data such as a cache of processed data, or efficiently handle user sessions across multiple machines, or even run a queuing system for processing tremendous amounts of incoming data? Then add a "companion" Key-Value DB to your architecture.

... And many, many other examples of other specialized NoSQL DBs could be added here, but you get the idea.

Summing up: Understand the data > Attempt to model an RDBMS schema that supports said data > Use proper "companion" DBs for whatever data does not fit in an RDBMS.
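A toy sketch of that layering, under heavy assumptions: SQLite (via Python's `sqlite3`) plays the RDBMS that owns the data, and a plain dict stands in for the "companion" key-value store from step 3. The tables, the `customer_total` helper, and the cache key format are all invented for illustration.

```python
import sqlite3

# The RDBMS is the system of record.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total_cents INTEGER NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Ada');
    INSERT INTO orders VALUES (10, 1, 2500), (11, 1, 1200);
""")

# Stand-in for a companion key-value store; a real deployment would use a separate service.
cache = {}


def customer_total(customer_id):
    """Return the sum of a customer's orders, caching the computed value."""
    key = f"customer_total:{customer_id}"
    if key in cache:
        return cache[key]
    row = db.execute(
        "SELECT COALESCE(SUM(total_cents), 0) FROM orders WHERE customer_id = ?",
        (customer_id,),
    ).fetchone()
    cache[key] = row[0]
    return row[0]
```

The cache only ever holds values derived from the relational tables, so losing it costs a recomputation rather than data. That is roughly the sense of "companion" here: it sits beside the RDBMS and does not replace it.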

PeCaN · on June 20, 2016

> If SQL is indeed the choice 99 times out of 100, how can you know when you are faced with that 1-in-100 time when NoSQL is actually the right option?

If you have to ask yourself “Should I use SQL or NoSQL?” then you should use SQL. If you have to use NoSQL, you won't have to ask yourself. Usually it's pretty obvious when you will not use joins at all.

maxxxxx · on June 21, 2016

From my experience it will be pretty obvious at that point.

xs83 · on June 21, 2016

Basically if you need to model any kind of relationship between objects then NoSQL will start to hurt and hurt hard!

kedean · on June 21, 2016

That's not always true, of course. NoSQL databases can usually handle 1st level relationships just fine. The problem arises when the related item needs to be its own entity.

For example, if you have a table of users who have permissions represented by text, that's really easy in a NoSQL database, as most of them provide some kind of list datatype (list<X> in Cassandra, arrays in Mongo, lists in Redis). The problem arises when consumers of the database need to be able to add or remove from the set of possible permissions. If it's limited forever to 'read'/'write'/'execute', or is unlimited, you're golden! But if it's some sort of system where an admin needs to be able to dynamically create customized permissions, an RDBMS will be the clear-cut winner.
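Here is a minimal sketch of the relational version of that permissions example, using Python's built-in `sqlite3`. The table names and the `labels_for` helper are made up for illustration; the point is only that a permission becomes its own row that an admin can rename or delete in one place.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
db.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE permissions (id INTEGER PRIMARY KEY, label TEXT NOT NULL UNIQUE);
    CREATE TABLE user_permissions (
        user_id       INTEGER NOT NULL REFERENCES users(id),
        permission_id INTEGER NOT NULL REFERENCES permissions(id) ON DELETE CASCADE,
        PRIMARY KEY (user_id, permission_id)
    );
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO permissions VALUES (1, 'read'), (2, 'write');
    INSERT INTO user_permissions VALUES (1, 1), (1, 2), (2, 1);
""")


def labels_for(user_name):
    """Return the permission labels granted to a user, sorted alphabetically."""
    rows = db.execute(
        """
        SELECT p.label
        FROM users u
        JOIN user_permissions up ON up.user_id = u.id
        JOIN permissions p ON p.id = up.permission_id
        WHERE u.name = ?
        ORDER BY p.label
        """,
        (user_name,),
    ).fetchall()
    return [label for (label,) in rows]


def rename_permission(old_label, new_label):
    """Admin action: one row changes, and every user who holds it sees the new label."""
    db.execute("UPDATE permissions SET label = ? WHERE label = ?", (new_label, old_label))


def delete_permission(label):
    """Admin action: the cascade removes the grant from every user."""
    db.execute("DELETE FROM permissions WHERE label = ?", (label,))
```

With permissions stored as a list of strings on each user document, the same rename or delete means finding and rewriting every user that carries the string, which is where the "own entity" problem starts to hurt.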