
Yes, this seems to be yet another tool that falls prey to what I think of as "The Bisquick Problem". Bisquick is a product that is basically pre-mixed flour, salt, baking powder that you can use to make pancakes, biscuits, and waffles. But why would you buy this instead of its constituent parts? Does Bisquick really save that much time? Is it worth the loss of component flexibility?

Worst of all, if you accept Bisquick, then you open the door to an explosion of Bisquick options. It's a combinatorial explosion of pre-mixed ingredients. In a dystopian future, perhaps people stop buying flour or salt, and the ONLY way you can make food is to buy the right kind of Bisquick. That might make for a kind of mashup of a baking show and Black Mirror.

Anyway, yeah, Airflow (and so many other tools) feel like Bisquick. It has all the strengths, but also all the weaknesses, of that model.



The art of software engineering is all about finding the right abstractions.

Higher-order abstractions can be a productivity boon but have costs when you fight their paradigm or need to regularly interact with lower layers (in ways the designs didn't presume).

Airflow and similar tools are doing four things:

A) Centralized cron for distributed systems. If you don't have a unified runtime for your system, the old ways of using Unix cron or a "job system" become complex, because there is no centralized management and no clarity about when developers should use one scheduling tool versus another.

B) Job state management. Jobs can fail and may need to be retried, people alerted, etc. Most scheduling systems have some way to deal with failure too, but these tools now treat it as stored state.

C) DAGs. Complex batch jobs are often composed of many stages with dependencies, and you need the state to track and retry stages independently (especially if they are costly).

D) What many of these tools also try to do is tie the computation performing a given job to the scheduling tool. This now seems to be an antipattern. They also try to have "premade" job stages or "operators" for common tasks. These are a mix of wrappers that talk to different compute systems and actual compute mechanisms themselves.
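To make B and C concrete, here is a minimal sketch of a DAG runner that keeps per-stage state and retries stages independently. It is not how Airflow is implemented; `run_dag`, `make_task`, the state labels, and the three stage names are all hypothetical, and the only library used is the standard-library `graphlib` (Python 3.9+).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_dag(tasks, deps, state=None, max_retries=2):
    """Run tasks in dependency order and record a state per task.

    tasks: dict mapping task name -> zero-argument callable
    deps:  dict mapping task name -> set of upstream task names
    state: dict from a previous run (name -> status), or None for a fresh run
    """
    state = {} if state is None else state
    graph = {name: deps.get(name, set()) for name in tasks}
    for name in TopologicalSorter(graph).static_order():
        if state.get(name) == "success":
            continue  # finished in an earlier run, do not repeat costly work
        if any(state.get(up) != "success" for up in graph[name]):
            state[name] = "upstream_failed"
            continue
        for _attempt in range(1 + max_retries):
            try:
                tasks[name]()
                state[name] = "success"
                break
            except Exception:
                state[name] = "failed"
    return state


# Hypothetical three-stage pipeline in which "transform" is broken at first.
calls = {"extract": 0, "transform": 0, "load": 0}
broken = {"transform"}


def make_task(name):
    def task():
        calls[name] += 1
        if name in broken:
            raise RuntimeError(f"{name} failed")
    return task


tasks = {name: make_task(name) for name in calls}
deps = {"extract": set(), "transform": {"extract"}, "load": {"transform"}}

first_state = dict(run_dag(tasks, deps, max_retries=1))
broken.clear()  # someone fixes the transform stage
second_state = run_dag(tasks, deps, state=dict(first_state), max_retries=1)
```

On the second run only the failed stage and its downstream stage execute; "extract" is skipped because its success was stored. In a real system the state dict would live in a database rather than in memory, which is the "stored state" part of B.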

If you have the kind of system that is either sufficiently distributed or heterogeneous enough that you can't use existing schedulers, you need something with #A. If you also need complex job management, you need #A, #B and #C, and having rebuilt my own many times, I find that using a standard system is better when coordinating between many engineers. What seems necessary in general is #D.


I meant to say D seems unnecessary


Just playing devil's advocate for a bit: the horror of your Bisquick scenario depends in part on the assumption that salt, flour, etc are fungible across applications, which is not quite true. Flour, sugar, and probably other trace ingredients for managing texture benefit from using different types in different recipes. If any of those benefit from economies of scale, it could well be optimal in some sense to have mixes for everything. This is much closer to being true in software, where different circumstances demand different concrete implementations of abstractions like, say, "scheduler" (analogous to grade/type of abstract ingredient like "flour").

Ed: I should say, I really like this metaphor, and I expect it will crop up in my thinking in the future.


I realize this is a metaphor and I'm answering the metaphor and not the underlying problem, but: camping. Seriously, want some quick pancakes or donuts when you're out in the field? Bisquick and just change up how much water you add.


Also disaster survival.

Couscous, Bisquick and other low-or-no-heat, premixed, just-add-water solutions are a godsend to have when a tornado takes out your gas line or electric grid.


On top of that, such Bisquick takes hours of learning, deployment, and troubleshooting, and has complex failure modes compared to a few cron jobs and trivial scripts.


Assume that future kinds of Bisquick can have negative amounts of flour, salt or baking powder. Now recipes in the dystopian future just require a simple change of basis.


Yes, that is true. If you allow negative ingredients you can indeed reach all points in the characteristic state-space of baking, even when limited to picking from a huge set of proprietary Bisquicks. Which is a hopeful thought. I think.
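For anyone who wants to take the change-of-basis joke literally, here is a rough sketch. It assumes three made-up mixes over (flour, salt, baking powder); the helper `solve_mix` and all the numbers are hypothetical. The caveat it shows is that the trick only works if the available mixes are linearly independent.

```python
from fractions import Fraction


def solve_mix(mixes, target):
    """Find amounts x so that sum(x[j] * mixes[j]) equals target.

    mixes:  list of n mixes, each a tuple of n ingredient amounts
    target: tuple of n ingredient amounts to reproduce
    Returns a list of Fractions (possibly negative), or None when the
    mixes are linearly dependent and so cannot serve as a basis.
    """
    n = len(target)
    # One augmented row per ingredient: coefficients of each mix, then target.
    rows = [
        [Fraction(mixes[j][i]) for j in range(n)] + [Fraction(target[i])]
        for i in range(n)
    ]
    for col in range(n):
        pivot = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if pivot is None:
            return None
        rows[col], rows[pivot] = rows[pivot], rows[col]
        p = rows[col][col]
        rows[col] = [v / p for v in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                f = rows[r][col]
                rows[r] = [a - f * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][n] for i in range(n)]


# Hypothetical mixes, each given as (flour, salt, baking powder).
mix_a = (2, 1, 0)
mix_b = (1, 0, 1)
mix_c = (0, 1, 1)

# Recover plain flour from the three mixes.
plain_flour = solve_mix([mix_a, mix_b, mix_c], (1, 0, 0))

# Two mixes that are multiples of each other do not span the space.
stuck = solve_mix([(1, 1, 0), (2, 2, 0), (0, 0, 1)], (1, 0, 0))
```

Here plain flour comes out as one third of mix A, plus one third of mix B, minus one third of mix C, so the negative amount really is required. The second call returns None, which is the dystopian case: a huge set of Bisquicks that all share the same ratios still cannot give you flour.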


Implying that Airflow is a simple mixture of a few ingredients is selling it quite short. There are a lot of knobs and switches in Airflow (i.e. features) that have been built and battle-tested over a lot of users. It has quite a lot of dependencies across the scheduler, webserver, cli, and worker modes. And there is a lot of new development going into Airflow in recent months (new API, DAG serialization, making the scheduler HA).


Brawndo. It's what plants crave.


Your comment doesn't provide insight into when and when not to use it.


I guess as I've grown older I've grown wary of black-and-white thinking. The insight I would share with you is to be wary of Bisquick, but do not dismiss it outright. All creation is combination, and you won't succeed saying no to all combinations. In the same way, you won't succeed saying yes to every combination.


I think what you’re saying is, you cannot start from first principles if you want to accomplish most things, but you need to understand first principles to not misuse those things.


I guess so; everything comes with pros and cons, you know.



