I was chatting with a guy at a barbecue who'd just finished a General Assembly Web Development Immersive course and had just been hired as one of 3 developers.
When I asked what his stack was, he said something to the effect of "Rails, with a MongoDB backend".
I pressed him further, asking how far along the project was, and found out they had just started. He seemed offended when I suggested that leading with Mongo was a premature optimization. "But we're just dealing with blobs of json".
Admittedly, I have no idea what this guy's name is, or the company he worked for (and this is therefore unadulterated speculation), but my bet is that they have either collapsed from the world of shit they created for themselves, or they've had to hire someone (or several) to dig them out of said world of shit and it's cost them quite dearly.
Don't get me wrong, NoSQL has its place, but not as the "paper of record" for the very beginning of a project.
[Edit: kinda skipped over my point, which was that it's never "just a blob of json", it has structure which, if not explicitly defined somewhere, will bite you in the ass]
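To make that concrete, here's a rough sketch of what "explicitly defined somewhere" can look like even if you keep the JSON. The `User` shape and its fields are made up for illustration; the only point is that the structure gets written down and checked at the boundary instead of living in people's heads.

```python
import json
from dataclasses import dataclass

# Hypothetical record shape: the "blob" always had this structure,
# it just wasn't written down anywhere.
@dataclass(frozen=True)
class User:
    name: str
    email: str
    age: int

EXPECTED_FIELDS = {"name": str, "email": str, "age": int}

def parse_user(raw: str) -> User:
    """Parse a JSON string and fail loudly if it doesn't match the expected shape."""
    data = json.loads(raw)
    for field, typ in EXPECTED_FIELDS.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        if not isinstance(data[field], typ):
            raise ValueError(f"{field} should be {typ.__name__}")
    # Extra keys are silently dropped here; rejecting them would be just as defensible.
    return User(**{field: data[field] for field in EXPECTED_FIELDS})
```

A relational schema gives you roughly this check for free on every write, which is sort of the whole argument.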
"If I have seen further, it is by standing on the shoulders of giants" - Isaac Newton.
"Let's shove my head into my own ass. NOSQL!" - said by too many young and aspiring developers.
It's a leap backwards to the seventies and DL/1, but without transactions, that's what it is.
Edit:
I have never worked with a system that has blobs of "data" and that's been in production for a while, where that turned out to be a good decision.
Also, data integrity is hard. Building it on your own is really hard. Most people have trouble getting it right when the hard bits are already provided for them.
SQL isn't perfect, but it's the easy way out 99 times out of 100.
The problem is really that SQL isn't the easiest way to, starting from nothing, take some data, manipulate it and display it on screen. You have to take a couple of steps to set up a schema, take the data you have, convert it to that schema then manipulate it and display it on screen.
Mongo and its ilk give you instant gratification and a tremendously false sense of productivity. It's the easiest way for step 1 without question, it's steps 2 and onwards where it starts to show its problems.
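For scale, the "couple of steps" on the SQL side look something like this with Python's built-in sqlite3. The table, column names and sample rows are invented for the example, so treat it as a sketch rather than a recipe.

```python
import sqlite3

# Made-up input data, as it might arrive from a form or a CSV (everything is a string).
raw_rows = [
    {"name": "alice", "score": "42"},
    {"name": "bob", "score": "17"},
]

def top_scores(rows, limit=10):
    conn = sqlite3.connect(":memory:")
    # Step 1: set up a schema.
    conn.execute("CREATE TABLE scores (name TEXT NOT NULL, score INTEGER NOT NULL)")
    # Step 2: convert the data you have to that schema.
    conn.executemany(
        "INSERT INTO scores (name, score) VALUES (?, ?)",
        [(r["name"], int(r["score"])) for r in rows],
    )
    # Step 3: manipulate it.
    result = conn.execute(
        "SELECT name, score FROM scores ORDER BY score DESC LIMIT ?", (limit,)
    ).fetchall()
    conn.close()
    return result

# Step 4: display it.
for name, score in top_scores(raw_rows):
    print(f"{name}: {score}")
```

It's more typing than dumping dicts into a collection, but the conversion in step 2 is exactly where bad data gets caught early.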
It's odd that he put NoSQL at a lower level of abstraction than SQL. In a way, it makes sense, because yes you do have to reimplement many SQL things.
But I always thought that the attraction to NoSQL was that you didn't have to unravel your application's objects into tables. You just stuck them in there. You worried about reports later.
An interesting article. I'm glad to see someone giving an equal treatment of SQL and NoSQL databases for a change. I appreciate the analogy of SQL being an optimized and general-purpose high-level language and NoSQL being a more powerful but specialized low-level or assembly language.
However, I feel like the article fundamentally fails to address the core question of "When should I pick NoSQL over SQL?" If SQL is indeed the choice 99 times out of 100, how can you know when you are faced with that 1-in-100 time when NoSQL is actually the right option?
It's really "not" that complicated. Start by assuming by default that you will use an RDBMS, then:
1) Attempt to model a database schema that fully and completely supports said data. This will force you to get a sufficiently deep understanding of the domain problem and how its different parts relate to one another.
2) Is there a part of the domain problem whose dynamic or unstructured nature cannot be properly represented in a fixed database schema, and/or does this unstructured data need to be queried efficiently? Then add a "companion" Document DB to your architecture.
3) Are there performance issues/requirements on a given part of your system, such that you need to hold volatile data like a cache of processed data, or efficiently handle user sessions across multiple machines, or even run a queuing system for processing tremendous amounts of incoming data? Then add a "companion" Key-Value DB to your architecture.
... And many, many other examples of other specialized NoSQL DBs could be added here, but you get the idea.
Summing up: Understand the data > Attempt to model an RDBMS schema that supports said data > Use proper "companion" DBs for whatever data does not fit in an RDBMS.
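As a toy illustration of the "companion" idea from step 3: the RDBMS stays the system of record and the key-value store only ever holds copies. Here a plain dict stands in for something like Redis or memcached, and the `products` table is hypothetical.

```python
import sqlite3

# System of record: a relational table (in-memory SQLite for the sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, price_cents INTEGER NOT NULL)"
)
conn.execute("INSERT INTO products (id, name, price_cents) VALUES (1, 'widget', 499)")

# Companion store: a dict standing in for a real key-value DB.
cache = {}

def get_product(product_id):
    """Read through the cache; fall back to the RDBMS on a miss."""
    if product_id in cache:
        return cache[product_id]
    row = conn.execute(
        "SELECT id, name, price_cents FROM products WHERE id = ?", (product_id,)
    ).fetchone()
    if row is not None:
        cache[product_id] = row  # only cache rows that actually exist
    return row
```

If the cache gets wiped, nothing is lost, which is the property you want from a companion DB. A real setup would also need invalidation on writes, which this leaves out.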
> If SQL is indeed the choice 99 times out of 100, how can you know when you are faced with that 1-in-100 time when NoSQL is actually the right option?
If you have to ask yourself “Should I use SQL or NoSQL?” then you should use SQL. If you have to use NoSQL, you won't have to ask yourself. Usually it's pretty obvious when you will not use joins at all.
That's not always true, of course. NoSQL databases can usually handle 1st level relationships just fine. The problem arises when the related item needs to be its own entity.
For example, if you have a table of users who have permissions represented by text, that's really easy in a NoSQL database, as most of them provide some kind of list datatype (list<X> in Cassandra, arrays in Mongo, lists in Redis). The problem arises when consumers of the database need to be able to add or remove from the set of possible permissions. If it's limited forever to 'read'/'write'/'execute', or is unlimited, you're golden! But if it's some sort of system where an admin needs to be able to dynamically create customized permissions, an RDBMS will be the clear-cut winner.
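Here's roughly what the relational version of that permissions case looks like, sketched with sqlite3. Table names and the `grant` / `permissions_of` helpers are my own invention for the example; the point is that a permission is its own row, so an admin can add one, and the database refuses to delete one that's still in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE permissions (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE user_permissions (
    user_id INTEGER NOT NULL REFERENCES users(id),
    permission_id INTEGER NOT NULL REFERENCES permissions(id),
    PRIMARY KEY (user_id, permission_id)
);
""")
conn.execute("INSERT INTO users (id, name) VALUES (1, 'alice')")
conn.executemany("INSERT INTO permissions (name) VALUES (?)", [("read",), ("write",)])

def grant(user_id, permission_name):
    """Grant an existing permission; unknown names are rejected."""
    row = conn.execute(
        "SELECT id FROM permissions WHERE name = ?", (permission_name,)
    ).fetchone()
    if row is None:
        raise ValueError(f"unknown permission: {permission_name}")
    conn.execute(
        "INSERT INTO user_permissions (user_id, permission_id) VALUES (?, ?)",
        (user_id, row[0]),
    )

def permissions_of(user_id):
    rows = conn.execute(
        "SELECT p.name FROM permissions p "
        "JOIN user_permissions up ON up.permission_id = p.id "
        "WHERE up.user_id = ? ORDER BY p.name",
        (user_id,),
    ).fetchall()
    return [r[0] for r in rows]

# An admin creates a custom permission at runtime: just another row.
conn.execute("INSERT INTO permissions (name) VALUES ('approve_refunds')")
grant(1, "read")
grant(1, "approve_refunds")
```

With the list-in-a-document approach, a typo'd or since-deleted permission name just sits in someone's array until something trips over it. Here, `DELETE FROM permissions WHERE name = 'read'` raises `sqlite3.IntegrityError` for as long as alice still holds it.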