Bottom line: you do not need Hadoop until you have more than 2 TB of (uncompressed) data to process.
Modern servers (bare-metal ones, not what AWS sells you) are REALLY FAST and can crunch massive amounts of data.
Just use proper tools and well-optimized code written in C/C++/Go/etc., not all the crappy Java framework-in-a-framework^N architecture that abstracts away any thinking about CPU speed.
Bottom line, the popular saying is true:
"Hadoop is about writing crappy code and then running it on a massive scale."
Dell sells a server with 6 TB of RAM (I believe), so I think the limit is way over 2 TB. If you want to be able to query the data quickly for analytical workloads, MPP databases like Vertica scale up to 150+ TB (at Facebook). I honestly don't know at what scale you need Hadoop, but it's become a large number very quickly.
My question is: what do you mean by 2 TB? At my current client, we have 5 TB of data at rest (that's relatively recent; before that we had 2-ish). However, we have over 30 applications doing complex fraud calculations on it. "Moving data" (data being read and then worked on) is about 40 TB daily. Even with SSDs and 256 GB of RAM, a single machine would get overwhelmed by this.
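For scale, here is the back-of-envelope arithmetic behind that 40 TB/day figure, as a minimal sketch. It assumes decimal terabytes and a load spread perfectly evenly over 24 hours; the function name is made up for illustration. Real load from 30+ applications is bursty and not purely sequential, so treat the result as a floor on what the hardware has to sustain, not a sizing answer.

```python
def sustained_rate_mb_per_s(tb_per_day: float) -> float:
    """Average throughput in MB/s needed to move `tb_per_day` TB in 24 hours.

    Assumes decimal units (1 TB = 1e12 bytes, 1 MB = 1e6 bytes) and a
    perfectly even load, which real workloads never have.
    """
    bytes_per_day = tb_per_day * 1e12
    seconds_per_day = 24 * 60 * 60
    return bytes_per_day / seconds_per_day / 1e6


rate = sustained_rate_mb_per_s(40)
print(f"{rate:.0f} MB/s")  # prints "463 MB/s"
```

The average alone doesn't say whether one box copes; that depends on peak load, access patterns, and how much CPU each fraud calculation burns per byte read.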
If you're only running one app on less than 1 TB, maybe you don't need something as complex as Hadoop. But given that a cluster is easy to set up (I built a really simple one, a NameNode plus two DataNodes, in 45 minutes, going in cold), it might not be a bad idea.
I'll take this further and say that some Hadoop tools that are not from Apache are really nice to work with even for non-Hadoop work. For example, I've got to join several 1 GB files together to go from a relational CSV model to a document-store model. Can I do this with command-line tools? Maybe. Cascading makes this really easy. Each file family is a tap. I get tuple joins naturally. I wrote an ArangoDB tap to auto-load into ArangoDB. It was fun, testable, and easy. All of this runs sans Hadoop on my little MBP.
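For anyone who hasn't done this kind of relational-to-document reshaping, here is a rough sketch of the idea in plain Python. This is not Cascading and not the code described above; it is a simple in-memory hash join. The customers/orders layout, the column names, and the `rows` and `nest` helpers are all invented for illustration, and inline strings stand in for the CSV files.

```python
import csv
import io
from collections import defaultdict


def rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def nest(parents, children, key, field):
    """Hash join: attach each parent's matching child rows as a nested list."""
    by_key = defaultdict(list)
    for child in children:
        child = dict(child)                        # copy so the input is untouched
        by_key[child.pop(key)].append(child)       # drop the join key from the child
    return [{**p, field: by_key.get(p[key], [])} for p in parents]


# Hypothetical sample data standing in for the real CSV files.
customers = rows("customer_id,name\n1,Ada\n2,Grace\n")
orders = rows("order_id,customer_id,total\n10,1,9.50\n11,1,20.00\n12,2,5.25\n")

docs = nest(customers, orders, key="customer_id", field="orders")
for doc in docs:
    print(doc["name"], len(doc["orders"]))
```

Each resulting dict is one document with its child rows embedded. The sketch holds the child table in memory, which is fine for a few 1 GB files on a laptop; what a framework adds on top is the plumbing for sources, sinks, and testing.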
A fun fact about the Cascading tool set is that I can take my little app from my desktop and plop it onto a Hadoop cluster with little change (swapping the local taps for Hadoop ones). Will I do that in my present example? No. Can I think of places where that's really useful? Yes: the regression tests for 35 fraud models, run daily with each build. That's somewhere around 500 full model executions over limited but meaningful data. All easily done courtesy of a framework that targets Hadoop.