I don't normally use Python. When I see something that I'd like to install, and the docs say "written in Python", my heart sinks. I know I'll have to find out how to install multiple dependencies in various packaging systems, all of which will be incompatible with (but surely much better than) my system package manager.
I find all of this ironic given that the main gripe people used to have against Common Lisp was "but how do I make a self-contained binary?". Somehow this was never required of Python.
This means the users don't have to know it's made with Python or install anything, and it just works.
However, Python is not like Go or Rust, and providing such an installer requires more work, so a huge part of the user base (which includes a lot of non-professional coders) doesn't have the skill, time or resources to do it.
And few people promote it.
I should write an article on that because really, nobody wants to set up Python just to use a tool.
Do check out Nuitka though; it has great support for Qt, NumPy and advanced Python constructs, and a gentle learning curve. You may even get up to a 4x perf boost.
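For what it's worth, the basic invocation is small. A minimal sketch, assuming Nuitka is installed into the project's environment and my_tool.py is a placeholder for your entry point (exact flags can vary between Nuitka versions):

    python -m pip install nuitka
    python -m nuitka --onefile my_tool.py   # emits a single self-contained executable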
>That's more of cultural problem in the Python community.
The Python community also loves making minor breaking changes. Like the language itself, with the changes from Python 2 to Python 3, then later on the changes to e.g. introduce "async" as a keyword and break all existing code that had a variable of the same name. Some dynamic languages, like Clojure, take the approach that it's not okay to waste tens to hundreds of thousands of hours of users' collective time just to satisfy the library developer's aesthetic urges, and packages very rarely make breaking changes. Personally I'd much prefer a language/library with some deprecated cruft than one that wastes days of my time updating my code every time some developer finds a new way to make their API "cleaner".
> The Python community also loves making minor breaking changes.
This is almost entirely untrue. Python 3 was a huge deal because the community held off for so long on making breaking changes. The async thing was probably a misstep. That happens.
That's actually very true, the author just didn't choose a very good example.
The reality of Python is this: there's no standard and no obligation to keep any Python code working. The matrix of supported environments is huge. So, to even build and test packages for more than 4 minor language versions is prohibitively expensive.
The 4 here is a very important number. Python's own release cycle is designed such that they don't care about more than that many minor versions back: and typically, you can no longer build a Python of that age on a modern OS with a modern compiler.
Another consequence of this is that if you need to install an older package, even if in principle it should work with current Python, you'll discover that Python packages have a lot of very particular and very convoluted dependency relationships. These typically work if you can allow yourself to be within the last 4 versions of Python, but if you fall behind -- all bets are off. You start to discover that you can no longer install things due to botched dependency specifications, you start installing things manually, repackaging while fixing dependency definitions and so on.
Some of these problems can be blamed on / explained in terms of others. E.g. the interface of the SSL library Python uses changed in incompatible ways not so long ago, which now prevents an older codebase from building with the new library and a newer codebase from building with the old one. There's a similar story with the C interface, Tk bindings, and probably more that I didn't encounter personally.
> community held off for so long on making breaking changes.
That's bullshit. The community bravely moves fast and breaks things. Just not in the case of the Python 2.x -> Python 3.x transition. But that's rather an odd exception to the rule.
Python was created 30 years ago, before Java, around the time the first Game Boy came out.
It accumulated a lot of things, and it’s a constant battle to balance what to change (to keep the project lean and modern) and what to keep (to maintain compat).
Whatever move you chose, someone in the community will say the core devs are doing it wrong.
I read as many posts saying Python is moving too fast as ones saying it’s not adopting “feature x” fast enough. As many people saying we should get rid of the GIL as saying they are deprecating too many APIs.
When you are as popular as Python, with a community as diverse, it’s a very hard job.
Why is Python so popular? It reminds me of a less flexible JavaScript, except a little cleaner feeling. However, the flexibility (functions are data, etc.) is the main allure of JavaScript, besides it being required for the web.
What draws everyone to python? Why don’t people use c# or Java or c++ or some other language? What’s the killer benefit they get?
NumPy, scikit (learn/image), pandas, and probably TensorFlow (never used that one).
Tbf NumPy+pandas is like using a MATLAB that also has an easy way to do HTTP requests/web scraping, so there's that.
I haven't really done data science in 6 years now (except that one time I demoed our JupyterLab+Dask setup vs a classic Hadoop), but to me those are the clear reasons.
Also, the first time I used Python I was doing a CTF and had to add between 70 and 130 NOPs (randomly) before the payload. I think I could now do that in Perl and AWK, but at the time it was a Python one-liner, and I do think Python was the easiest of the three to debug, hence the best solution.
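Something in the spirit of that one-liner, as a rough sketch (payload.bin is a hypothetical file; 0x90 is the x86 NOP byte):

    python3 -c "import sys, random; sys.stdout.buffer.write(b'\x90' * random.randint(70, 130))" > sled.bin
    cat sled.bin payload.bin > exploit.bin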
Today, Python is popular because Python is popular.
It doesn't have any engineering merits of its own. But you also cannot compete against it on engineering merits. It doesn't matter if you do everything better in your language than how it's done in Python. Unless you find a way to make it popular, Python will stay popular and your language will fade into oblivion.
What was the initial seed that created this popularity? -- well, it seemed cool for nonsense reasons. The choice of Python was often reactionary, to spite those who chose Java. Python just happened to fill the niche of "what language should I write in if I think Java sucks?"
This had an appeal to better programmers of that time (early 2000s), and this made others believe that language was what made those better programmers better. The better programmers started to flee Python some 5-10 years later, but it had already enough momentum to attract lots of mediocre programmers, and before you knew it, Python started to be used everywhere, and while the quality of Python programmers consistently dropped, the popularity only grew.
So... if you are planning on a project for yourself, or for a group of skilled and motivated individuals -- there's no reason to choose Python, in fact, you'll come to regret it if you do. But, if your goal is to make a project in the context of an industry giant, then Python is a great pick -- you'll have an infinite stream of replaceable programmers, lots of 3rd-party libraries. You'll save a lot on developing the product. It won't be a high-quality one, but in this context nobody cares about quality anyways.
>Python was created 30 years ago, before Java, at the time the first game boy came around.
C++ was created even earlier, and it largely managed to maintain backwards compatibility; most code from back then will still compile, albeit with warnings.
Python was never lean nor modern. It was a still-born language, intended more like a toy / joke, if you want to give credit to its author.
It's very aspirational to claim that there's some process that involves intelligent thought that decides what features should be added or removed from the language. From where I stand, a petri dish has more intelligence than all Python core devs combined.
The reason to have an interpreter, or any kind of separation of software into modules, is that you can audit common pieces of software just once, that you don't waste space on users' computers storing it, and that you don't have to build walls to protect separate environments holding similar but not identical copies of the same piece of software.
Another benefit of an environment like Python is that you have source code available and you can fix problems before or instead of engaging the original developers.
I had the misfortune to work with tools written by someone like you (a lot of Linux LDAP utilities are like this). It's a huge pain to deal with.
It's really not an awful idea in general, but might be for a user like you. I'm not sure if you read the article but it details the many various types of users and why they may use Python projects and libraries. Having an executable available will make installing the tool much much easier for some users. And you can still have the source code and regular packages available for those that know how to handle it.
OP mistakes the problems Python users have for the actual problems the Python packaging system has.
The overwhelming majority of "pain" the author talks about is self-inflicted. The overwhelming majority of Python users are bottom-of-the-barrel worst programmers to walk the face of the Earth. Their problems come from just being really, really bad at what they are supposed to be good at.
So, one complaint you could make is that Python packaging isn't very friendly to incompetent users... well, that's true, but I wouldn't care about that. Quite the contrary: I wouldn't mind it if things were more complex, but worked well, once you mastered them.
Your idea of distributing a single binary is bad for all the reasons I described in the previous post. You just decided that it's not so bad because it helps users who suck to suck less... But I don't want a "solution" that would sacrifice many other desired properties to deal with my potential incompetence. I'd rather learn how to do things right than enjoy being stupid and have someone else cover for my stupidity.
> that you can audit common pieces of software just once, that you don't waste space on users' computers
Python completely gave up on that idea years ago. Even when not packaging your Python project into binaries, the official recommendation is that each application gets its own complete copy of all its dependencies and never shares library code between projects.
Why does it matter what Python has given up on? -- Python sucks, this is the central point of my argument. They've given up on doing the right thing? -- Is this news to anyone? This has been like that since as far back as I can remember... so, at least two decades now.
In other words, who cares what "recommendation" says (recommended by who? Same Python core devs? -- why should anyone take those clowns seriously)?
- you have to install pipx and they recommend installing it... with pip. Back to all the same problems.
- pipx implicitly uses a Python version underneath. For things like mypy or black, you get stuck with the syntax of that version, and you'll get errors a lot of users can't solve
- pipx is not stable enough, it just fails sometimes
- people have to know about a third party tool
- it adds to the paradox of choice
- pipx PATH injection is hit or miss
I should have listed pipx in the article, it's not a solution, it's a crutch.
This was the main reason I started writing my random tools and scripts in Go.
I can just write, cross-compile and deploy.
Trying to get a "simple" Python tool with library dependencies working on a random server is a huge pain in the keister I'd rather not deal with if I have the option.
This became even easier now that I can "simply" feed my old Python stuff to GPT4 and have it transform it to mostly-working Go code =)
My problem is the exact opposite. Every distribution under the sun comes with some form of Python but getting Go running can be a hassle (sometimes even impossible). For scripts, Python works great in my experience.
Every time I need to load some Go-related project into my workflow I somehow end up at some open GitHub issue that says "we can't do X in Go yet but we plan on implementing it in 2021 when <some compiler feature> is done".
It's not just Go, but its opinionated nature (solving difficult problems by picking one solution, making it the default you can't change, and then assuming nobody ever has a problem with it) and the fact that it compiles to binaries make it susceptible to "it always works except if your system does <something my system happens to do>". Things like "paths are always UTF-8" are great assumptions Go tools utterly fail to deal with.
> For scripts, Python works great in my experience
For scripts _with no external dependencies_ I do agree with you.
But as soon as you do something non-trivial then you have requests and boto in there and maybe a few smaller utility ones. And now you've got issues. You can't just install them globally, because that'll change it for ALL Python programs on the machine.
Not the original commenter, but Go programs are all statically-compiled, with everything needed at run time bundled into the executable. Go does use a runtime, but that's incorporated into the exe, so no external libraries required. As such, you can get a program written in Go running on a new machine just by downloading it :)
For machines where you may actually have a Go installation already, the majority of programs now can also be installed from source via Go's built-in package manager, which installs them to your home directory. A lot of dev utils, like goimports, the language server, etc. are typically installed this way. It's usually just a single command (though the exact command can vary between projects, it is usually signposted fairly clearly).
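For illustration, both cases boil down to a single command each; the binary name and target platform below are just examples:

    GOOS=linux GOARCH=amd64 go build -o mytool .         # cross-compile a static binary for Linux/amd64
    go install golang.org/x/tools/cmd/goimports@latest   # fetch, build and install a dev tool into ~/go/bin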
For 99% of stuff I need to move either one statically compiled Go binary and maybe a configuration yaml. That's it.
With Python I need to have Python installed in there, and it has to be the correct version because of <reasons>. Then I need to figure out how I can install the dependencies for the program, with the correct version(s).
This usually would require me to figure out what's the current virtual env du jour and how do I get it working on a random *nix distro on the target computer.
Then I can install what's in requirements.txt or whatever the venv tool is using. And now I can maybe run it if all went well.
And when I need to update said program, with Go I can just scp/rsync a new binary on top of the old one at it'll Just Work.
With Python I need to move the Python source file(s) there, check for possible updates to libraries, update them using the venv tool and re-check that I have the correct Python version installed once more.
Now picture a situation where I need to do the process above for, let's say, 12 different servers for different clients every fortnight. All servers are Intel-based, but run a variety of Linux distros. I can use the exact same Go binary for every single one; the Python solution is a complete clusterf...
I do need to spend a bit more time when developing a Go solution compared to Python, but the ease of deployment is definitely worth it.
It seems too easy coming from Python land where dependencies are handled so poorly, but for a lot of Go programs (depends on how you're holding it), a simple `scp program server:` is enough.
Most programmers don't give a crap about the user. Python assumes that the user will be a Python programmer, and want to maintain a whole independent part of their system just to run a Python program.
> I don't normally use Python. When I see something that I'd like to install, and the docs say "written in Python", my heart sinks.
I do use python regularly, but still have this reaction. If I've set up a project that runs a CLI script, or Django web server etc, I can manage dependencies. If it's someone else's code, all hope is lost! It also makes me wonder why it's not wrapped into a binary using one of the various tools available.
Maybe you are doing something wrong? I have installed countless Python packages using "pip install --user pkg-name". If something fucks up I just rm -rf ~/.local/lib/python3.11/site-packages and reinstall the package. I've stuck to this workflow for over ten years and the only major issue I have had with it was with TensorFlow, which didn't support the latest Python version.
Nowadays I have to add the flag "--break-system-packages", but the workflow still works as well as it ever did.
Sample apps using PyTorch are the best examples of an adventure when run on Windows. One needs to carefully consider a WSL2 machine, a full VM, or Anaconda, because each comes with some caveats. WSL doesn't implement sound/microphone access, full VMs don't pass through the main display GPU, and Anaconda-based ones complain that the host system doesn't support hard symlinks (unless Developer mode is turned on).
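For anyone unfamiliar with that workflow, it is roughly this (package name and Python version are placeholders):

    python3 -m pip install --user --break-system-packages some-package
    # nuke-and-reinstall when things break:
    rm -rf ~/.local/lib/python3.11/site-packages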
Why does it matter if a program is a self-contained binary? It seems like such an odd requirement. If you really want that it seems in principle easy to compile it all into one script with no imports and put a shebang at the top. What's the point, though?
PyInstaller is pretty good if you need a standalone installer.
> Why does it matter if a program is a self-contained binary?
Because simple is better than complex. I'd rather download a binary than download a binary + dependencies + set them up + bookkeep for when I want to delete all those files.
> Pyinstaller is pretty good for if you need a standalone installer.
PyInstaller doesn't work for many edge cases. If you're using a Python package that uses a compiled binary written in another language, good luck on your way down the rabbit hole of PyInstaller config. I personally could not succeed in packaging a uvicorn app, for example.
> Why does it matter if a program is a self-contained binary? It seems like such an odd requirement.
Because sometimes I just want to write `my_program | your_program` without learning how to install and set up a program in a language I don't personally use, particularly when that language has the worst packaging ecosystem of any mainstream language.
This doesn't match my experience at all. Can you give any concrete examples of packages that are difficult to install because they are written in Python?
"difficult" is in the eyes of the beholder. But one quite simple and well maintained package, which many have still have problems with is librosa. See for example Stack Overflow questions about it or the issue tracker. Some of the challenges come from having dependencies on native libraries (audio codecs), that need to be installed manually. Others come from using numba, which break on every new Python version.
librosa is a library. That means you need to learn to do Python development to be able to use it obviously.
This is the problem. People aren't even talking about the same thing. This thread started off as something about single file distributables and the only concrete example given is a library.
I assume the gold standard is just downloading a compiled binary and using that. Which is the norm for almost everyone ... except python projects.
When it comes to compiling/running from source, Python is in the middle of the pack. It's not as smooth as rust (install cargo from your package manager or from rustup, run "cargo build") but not as bad as C++ (run through the install section of the readme of each dependency, then debug whatever build system the project is using).
Perl, Ruby, JavaScript, PHP, Bash, Go, C, C++, Rust and even Java are easier to deal with than Python tools. Seeing that it is a Python tool usually makes my heart drop and makes me look for alternatives before attempting to get it running.
pip installing one thing is usually not a problem.
You can do the binary thing with Python for sure. Does Lisp really work much better?
Part of the issue with Python is versions of the language. So you really shouldn't be installing packages locally in most cases, but you should use one of these tools so you're running your dev environment isolated from other environments.
Most of the popular ones are a lot better than Python. Python might be comparable to CAD Lisp or that abomination embedded in GIMP (though I think they stopped supporting it) in terms of quality of design and execution.
> So you really shouldn't be installing packages locally in most cases
Where do you suggest I install them instead? On someone else's computer? What kind of advice is that?..
Many languages have much better tools that make it easy to compile/bundle an artefact that can be run without the end user having to build or install anything.
Surely "to run the program from source, you need to know how to build/run it" applies to every language.
But I'm more comfortable working through problems I might encounter in languages I've worked with, than in languages I'm not familiar with.
I'd say Python is notorious, since you're probably going to hear about at least six or seven ways people are able to run things; so if you're not familiar with it, you're likely to do something wrong. -- e.g. elsewhere in this thread, you've pointed out that stuff should be run from a venv; that's something you have to know.
It's true: some people like Docker for Python, some like venv, some like conda... some lose it entirely and install everything locally! (acting like I haven't done this at least once!)
Of course badly written code is buggy and will require bug fixing because it only works on the machine of the developer. I'm not sure this isn't true in other languages. I've seen go CLI utilities that were dynamically linking Gnome libraries and could not start without installing Gnome.
I recall the negative reaction I got, on Hacker News and on Twitter, back in 2018, when I made the point that the Python community was one of the main groups driving the adoption of Docker: "Package management in Python is so hopelessly broken that the Python community is eager to use Docker everywhere, as the only way to save the situation."
In response, people laughed: "No one is using Docker for package management! Python has an abundance of good tools for that!"
And yet it remains true: if package and dependency management in Python were seamless, then Docker would have had fewer advocates back in the early days.
And one of the criticisms of Docker remains that it is sometimes used as a Band-Aid for flaws in some computer programming language, flaws that need a more fundamental solution.
I’ve been writing python code for about 19 years, and I remember yours or similar arguments about docker.
I completely agree. Docker has more or less saved python from its own packaging issues, especially in the cloud.
On local machines, I really found the most comical show of python distribution wreckage in RedHat/CentOS distributions. Back in 2015 or so, yum was tied to the system version of python. Heaven help you if you didn’t realize it and went mucking around on it (like me).
This is likely true for me–I’m a fan of Docker and a longtime Python developer. Docker rose in popularity at around the same time that many applications were in between Python 2 and 3, and putting apps in containers was a bit of a magic wand to wave away various issues, some of which still exist and some that were worse during that transition.
I still like Docker regardless of Python–I wouldn’t want to go back to development using a vm-per-app, or dealing with mismatched db/cache/queue/system package versions between projects or across different developers machines.
Insert the "if those people (behind the window with a poster) could read" meme.
Seriously, if Python developers could produce a coherent thought, we wouldn't be here. They aren't bothered by a contradiction you pointed out. They still push for Docker adoption, even though it's not helping them, because they "want to believe".
For me, none of the things that are supposed to "relieve packaging pain" have ever done anything but cause more pain.
Plain virtual environments with venv work fine. If you need a different version of python, download the source, build it and install with make altinstall, then use it to make your venv.
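A rough sketch of that workflow, with 3.11.9 as an example version (make altinstall installs python3.11 alongside the system interpreter instead of replacing it):

    curl -O https://www.python.org/ftp/python/3.11.9/Python-3.11.9.tgz
    tar xzf Python-3.11.9.tgz && cd Python-3.11.9
    ./configure --enable-optimizations && make -j"$(nproc)"
    sudo make altinstall
    python3.11 -m venv ~/venvs/myproject               # use the freshly built interpreter for the venv
    ~/venvs/myproject/bin/pip install -r requirements.txt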
Thank you for your post, I was looking for something like altinstall, I wasn't aware of it. I've used dnf to install python3.8 side by side with the default python3.6 on Almalinux and they seem to work fine, but I keep wondering if I haven't broken something.
As long as you're installing/updating Python via your distro's package manager you won't break anything.
Most horror stories of borked system Pythons are caused by people uninstalling/updating packages from the system Python via pip or fiddling with their PATH.
altinstall is only useful if you have to compile Python from source to get it onto your system.
In your example, one of the abstractions is 'what Python does it use'. I'm dealing with this very issue right now: distributing a simple Python tool to a Windows end user. They're on heavily managed government machines, and it turns out that their 'python' command invokes a python.exe somewhere on a network drive, and for some reason the '-m venv venv' fails, which then causes all the rest to fail as well, of course. Now I don't know yet if this network thing is really the root cause; I've just received some error logs this morning that showed it, so I'm on another round of trying to remote-debug this, via email, with someone who is (while patient and willing) not a Python programmer.
Python packaging is such a ridiculous PITA; it would be laughable if it weren't so sad. I tried Nuitka as suggested upthread and sent my customer a single binary to try, let's see if it works.
Except many python packages have started to move away from requirements.txt to pyproject.toml files, so you have to deal with that. Plus now you have to either remember where you put all your venvs and which tool is where or do weird hacky things with $PATH.
That might work for a subset of projects, but try that with anything that does for instance numerical analysis stuff, like machine learning. There is a reason why Conda is a requirement for so many Python projects.
If the provider of requirements.txt did their homework, yes. However, I often got incomplete requirements files, or they contained mutually incompatible versions (numba and numpy sometimes don't get along).
What if there are native dependencies required? It starts to get a bit hairy whenever something cannot be provided by pip alone. Shipping binary libs is not a suitable option.
Half the problems in this long article can be avoided by not using Windows. I wouldn't try developing a .NET Forms application with Sublime Text on Gentoo, would I?
Open Source is done by people solving their own problems and, by their own volition, sharing their solutions with the community. Some of those projects are useful enough that they become commercial or semi-commercial endeavors.
But yet, resources are limited, and compatibility with Windows will only be there if more people care about it than they care for other features.
Most Linux distributions have no licensing costs, are highly compatible with basically the same hardware Windows runs on, and even run on very inexpensive hardware like the Raspberry Pi, where Windows doesn't even run. So, given that resources for development are not infinite, I think we should be glad that Linux and other Unix-like environments are the baseline compatibility target for Python packages instead of Windows.
Also, most Python software that is of interest to non-programmers, like Jupyter Notebooks, PyTorch and Keras, is still of a technical nature. It is not like people are installing mail clients or word processors written in Python with pip install. Even if those projects are being used by non-professional programmers, the user must learn some programming to use them. It is reasonable that someone wanting to use such sophisticated packages will have to devote some time to learning how to properly install them. Again, someone could argue this is unacceptable, but it comes from the reality that making this easier requires resources, programmer time, testing, and access to environments for said testing. Most open source projects don't have the luxury of vast farms of devices with hundreds of different combinations of hardware and software the way a company like Microsoft does. But on the other hand, they are not asking you for thousands of dollars for a Microsoft subscription, nor do they receive a tax the moment you buy a machine with their operating system installed.
I’m a fairly recent convert to poetry, but I definitely acknowledge the bootstrapping problem with it— it seems to work best if you have some non-Python means of acquiring poetry itself, whether that's a PPA, nixpkgs, or something else that's already "at hand".
Anyone who rarely uses python, or who uses it all the time, should take the time to read this article and its companions in the series. Good, solid, advice that will make life easier. Just read it and try out the suggestions made. I was amazed at how close it came to my own workflow developed over many years since migrating from perl to python.
The disadvantage of this approach is that you can't easily switch major Python versions per project. I use miniconda [1], and I'll create a conda env per project, with the newest version of Python that is compatible with the dependencies. Also, conda doesn't just install Python packages, it can also install cuda drivers (e.g. for GPU accelerated pytorch/tensorflow [2]), it can also install ffmpeg, etc.
The complaint in the article is true, that you do generally need to do a mixture of `conda install` and `pip install` depending on whether the package is in a conda package repository or not; but I've not found this too bad in practice - I look up whether the package is in a conda repo, and install from pip if it isn't. Also it would probably be better for conda to be automatically activated for a given project / folder, rather than needing manual activation via `conda activate <name>`, but you get used to it.
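In practice the per-project workflow is just a few commands; the environment name and packages below are only examples:

    conda create -n myproject python=3.11
    conda activate myproject
    conda install numpy pandas          # packages available in a conda channel
    pip install some-pypi-only-package  # fall back to pip when conda doesn't have it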
1. Anaconda tooling and PyPI tooling are just different things, you cannot substitute one for the other, so whether you had any success with Anaconda or not is really irrelevant to OPs problem.
OP's advice comes from not understanding what Anaconda Python is (most likely they simply never tried it). The problem OP is fighting is a complex of problems related to Python packaging the way PyPI sees it (i.e. wheels, eggs, the PyPI index, dependency specification, version specification as done according to PyPI).
Anaconda Python ignores all of that. It distributes a different set of packages, governed by different rules. PyPI tools and Anaconda tools aren't interchangeable the way pyenv virtual environments are with, eg. venv virtual environments.
On the other hand, if the user needs something from Anaconda world, then no PyPI tools are of any use to them. It's just a different thing, essentially, it's a different language. I mean, it's the same grammar rules and the interpreter binary is mostly the same, but everything else is different. It's kind of like advising someone having issues with error handling in GNU Emacs to not use Spacemacs. It's just irrelevant, although for the outsider might feel like it's somehow related.
As for your conda example? -- so what? OP didn't want to use conda in the first place... why are you showing this to me?
I’m not convinced by the argument, as its tone implies it’s written for people who aren’t aware of who their audience is.
Sure, not all of these tools are right for all audiences, but that doesn’t mean that they aren’t good suggestions for wide swathes of the ecosystem.
I think it’s important instead to think about your largest user groups and provide workflow ideas for those groups. For example, with Python we may have (this is a simplified list):
* Web Development
* Data Science
* Embedded
* Compilation / Transpilation (e.g., PyO3)
Of these groups, I’d expect the tools mentioned to be reasonable for two or three of the groups, but less so in groups like data science where there is a higher concentration of non-development first specialties.
There are always outliers, but I think choosing your target audience is important. On a team, I wouldn’t necessarily want to teach a person a random workflow if the team all uses one of these types of tools.
> "There are usually two kinds of coders giving advice. A fresh one that has no idea how complex things really are yet. Or an experienced one, that has forgotten it."
For a language purported to be user friendly, Python sure does have an extremely unfriendly package management ecosystem. I'm sure a lot of it comes down to familiarity though. With nodejs I'll use asdf to switch and manage runtime versions, and npm to install packages per project. Dealing with transpilers and bundlers is terrible experience though.
The best programming language that I know to get everything right with the out of box experience is Rust. It has a tool that allows me to switch Rust compiler versions and cargo Just Works.
I recently switched back to venv and pip with pip-tools with dependencies configured via pyproject.toml. I have a basic Makefile containing convenience command aliases with inter-dependencies.
While not perfect, the venv/pip approach has simplified my development process, allowed conventional deployments to PaaS, and made things easier for new developers.
I will gladly dispense with pip-tools and Makefile if Python had/has a more capable standard library CLI.
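The core of that setup is small; a sketch assuming pip-tools is installed and dependencies are declared in pyproject.toml (the Makefile targets just wrap commands like these):

    pip-compile pyproject.toml -o requirements.txt   # pin the full dependency tree
    pip-sync requirements.txt                        # make the venv match the pins exactly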
This is a plate of cow dung with glass shards on steroids. I wouldn't even know where to begin to explain how bad this is... I mean, you already had an abomination in the form of pip-tools and pyproject.toml, and then you decided to pepper this trash with Make? Like, come on... you are, with a straight face, suggesting people treat a toothache with a chainsaw...
I think this article is rather long-winded and not super informative. The previous post [1] lays out what one should actually do, and I agree with most of it, at least from a developer perspective.
This post is the justification of the one you link to. It's not meant to be practical, but rather an answer to the same comments I get over and over again when I suggest the method you just posted.
Even with it though, you can see in the thread people don't read it and completely ignore its rationale :)
I understand what you're trying to achieve here, but I think the content isn't very well organized. The people that have the chops to understand most of what you wrote won't need to read it, and those that don't won't come away with any clearer understanding.
If you can trim the content to about 30% length, I think it would be better.
I don't think I'd characterize some of those things as lying. Setting up a development environment is no different than setting up other kinds of software: you obtain the software, and follow instructions to get it up and running. There's no substitute for this, unless you're setting up an environment for someone else (eg, a student), in which case, they don't need to know about many of the details.
Node.js installs the libraries into the `node_modules` subfolder in the working directory of the project. Python installs the libraries in to a system dir (unless you're using venv or conda or whatever). This means that dependency mismatches are resolved on a per-project level with node.js but are on a system-wide basis for python. If `npm install` is failing, it's not unreasonable to just run `rm -rf node_modules` and then try again. Worst that happens is that project is still broken. Meanwhile, if I nuke `/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/*` on my mac to try and resolve a problem, I've broken everything else that uses python 3.11 on my system.
The failure scenario with Python is when two or more projects need conflicting versions of numpy or scipy or whatever. Then you start fighting with pyenv/poetry/venv/conda/docker to get a different environment to run it in. As noted, it works, eventually, but it's just more work. More work is just more work and we as developers hate doing busy work like that.
Always use containers! Any big project should be sandboxed, having completely isolated environment, dependencies, and you have all installation steps well described and reproducible. Containers in Linux are lightweight citizens and using them is absolutely necessary to avoid any dependency hell and broken or trashed system. Plus, you can simply distribute them.
Even for short scripts I write? A unique container for every script? How do I go about calling one script in one container from a script in a different container.
> Containers in Linux are lightweight
Much of the world isn't running Linux. Deploying a single project you have developed, targeting a single OS and CPU arch, onto a server that you have complete control over is the easiest case and not really where most problems show up. The hard problem is being able to email a script I wrote to a colleague and have it run on their machine.
I think you’re right. I also think that’s a big problem for python as a language. For a lot of use cases just choosing a different language from the jump that allows you to easily compile a standalone executable can make life so, so much easier.
I've had success using pyenv on Mac and Linux. If I had to use Windows, I'd probably try pyenv-win [0].
I have run into issues trying to use packages that were available for one platform but not another, due to native code, etc. Most of the time I could find a pure Python alternative, but not always. This can lead to using containers, which adds complexity, which is a drawback because one of the advantages of Python to me is the simplicity (assuming you have something like pyenv).
I've used Poetry in the past, but it added enough complexity/overhead that I probably won't again.
I've had some success putting a line in a README for an internal tool that other devs can use to pip install from a Git repo. Again, assuming you have pyenv or the like, starting from a clean venv and pip install from Git seems to be pretty straightforward.
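That README line is typically just something like this (the repo URL and tag are placeholders):

    python -m pip install "git+https://github.com/example-org/example-tool.git@v1.2.3"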
Use conda to set up new environments and switch between them, and use pip to install modules. It's really not as complicated as people make it seem. I haven't had any issues with this set up across Linux and MacOS, doing web / ML / data development for the last 3 years. It also avoids the pitfalls mentioned in the blog post.
I have the same workflow. The only issue is that size grows really quickly as nothing is shared between the environments. With just a handful of environments my conda folder is 15 GB or so.
This problem is not unique to Python, it’s just the proliferation of incompatible tools that makes Python quite so annoying to use.
I’ve found that none of this is really an issue if you provide a container as a dev environment. Before someone accuses me of “yet another standard”ing this, it’s my standard across all my programming environments.
It’s not a silver bullet, especially with Apple silicon Docker can be a massive pain in the butt, but it works for me. Inside the container I can be guaranteed that only a single Python version is installed (correctly) and packages are installed in a standard way from a package file. Adding a package can be scripted so that it gets executed inside the container and adds it to your packages file.
The other massive benefit is that your local machine needs no locally installed tools other than Docker (or podman, or whatever). This works for me and my team; it’s not perfect, but it is a complex problem.
It's not "this" problem. It's many problems together.
Some of them, while maybe not unique to Python are completely preventable. Where the right thing to do was obvious, but Python core dev decided to cut corners, or do nothing etc. So, we need to be fair and put the blame where it belongs. Python packaging format didn't just "happen", it was designed... by a bunch of half-wits, but that's exactly the problem! Python import system didn't "just happen", it was designed, by mostly the same kind of people -- and, again, that's the problem!
If anyone had wanted to put good effort and thinking into designing the packaging format and the import system, they'd have realized that they need to deal with multiple versions of the same package somehow, they would've realized that they need to distinguish between binary distributions and source / documentation / data packages, and that they need a reliable tool for installing those packages. But nothing of the kind was done. Every "development" in this area was incrementally adding more technical debt.
Is technical debt a new thing? -- surely not. Is it preventable? -- well, not so much, but being cognizant of it one could establish certain practices to mitigate the adverse effects. And this, this particular thing, has never been done in Python. Is this a common failure? -- well, again yes, but this doesn't vindicate Python core developers, they are just as guilty as a bunch of other people who suck at programming.
Let's say I want to run Stable Diffusion WEBUI (Python 3.10) and Whisper (Python 3.11).
Last I tried to run both at the same time it was hell - when I downgraded Python for Stable Diffusion as per the instructions, Whisper stopped working.
Not advocating to use them right now, but the fact is bootstrapping Python is finally acknowledged as one major cause of packaging issues and a priority to solve.
Those will likely cause more problems (as per https://xkcd.com/927/). While bootstrapping Python is a problem, it's a symptom, not a cause. The underlying cause is that there is a substantial part of the ecosystem (which the "data science" component overlaps with) that is coupled to hardware, the OS and external (non-Python) software libraries, and that doesn't align well with the single-language approach that a part of the Python community wants (vague gesture to the "web dev" side of the community). This drives the pip vs. conda arguments, the discussion around docker, nix, homebrew etc. You can't tool your way out of this divide over what the community wants (and while there are two vague meta-groups, there are really sub-groups which want different aspects of both).
It's worth looking at how some other popular languages (don't) handle this: from what I've seen, cgo is generally frowned upon, so there are rewrites of libraries into Go (Python does the same), or network/IPC connections are used to bridge the gap; Rust also has the rewrite attitude, but for bridging to other languages there's build.rs (which calls other tools and wrappers), though those usually require you to have the external library already installed (via some other means, e.g. homebrew), or at least don't feature as a self-contained part of the cargo ecosystem (there's a cargo plugin which downloads a version of the library from somewhere); nodejs seems to either try to build from source (which fails if you don't have Python installed for gyp), or download random binaries which sometimes work.
I use Nix shells & derivations to manage python versions and dependencies. The obvious disadvantage is needing to learn Nix, but the advantage is one system can track all the project's dependencies, not just the Python ones (e.g. you can also track the dependencies of Python modules written in other languages like Numpy).
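For example, an ad-hoc shell with a pinned Python plus a couple of packages can be spun up in one command (the attribute names here are only illustrative and depend on your nixpkgs revision):

    nix-shell -p "python311.withPackages (ps: with ps; [ numpy requests ])"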
You want something that manages different versions of Python itself. pyenv is a popular option for doing this, but I'm partial to rtx which is largely language agnostic and also handles Python virtual environments in a very elegant way.
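With pyenv that typically looks something like this (version numbers are just examples), so each project directory can pin its own interpreter:

    pyenv install 3.10.13
    pyenv local 3.10.13      # writes .python-version for this directory
    python -m venv .venv && .venv/bin/pip install -r requirements.txt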
The solution I'm using on Windows is conda; I think it's called "miniconda" that you have to install. You'll need to make sure you activate the correct conda environment in the separate terminal windows to use both at once; works with CUDA.
But yeah, everyone's going to have their own way in that ecosystem.
I like venvs and hate dealing with the system wide impact of anaconda (and the weird side projects you need for it to resolve dependencies within a single human lifetime) but if I see something that uses it, I just suck it up.
It's annoying that my specific platform isn't supported, but unless I'm paying for some kind of support contract, I don't expect the devs to make things easy for me. They make what they need, and what apparently tons of other people also need, so I'm the one who will need to spend a few minutes getting everything sorted out.
If you're on Windows you're going to have trouble with some packages. If you're on Linux, you're going to have trouble with others. MacOS and BSD aren't any different. From what I can tell, LTS Linux (Ubuntu, Fedora) is the most compatible, luckily.
Yeah, I'd prefer to install everything in a venv. However, if I write software that works through venvs, someone with some obscure Python implementation is bound to report that the dependencies don't install in their system because of differences. They didn't choose that system just to be annoying, and neither did I. All of these solutions are valid answers to someone's problem.
> Every time she digs, there is one more thing to learn, and for each thing, hundreds of ways to fail.
Such is the life of someone trying to find workarounds for an upstream problem. "It works on this platform, not yours, but you can simulate that platform by..." implies that you're going to need to learn another platform or don't use the project until it's got support for your technology stack.
There's plenty of Python that doesn't run on most Windows machines just like there are plenty of VBA macros that don't work on LibreOffice. Whether you solve that through workarounds, conversion systems, or installing a different platform you need to learn (in a VM or on bare metal), stuff is going to get messy.
As for "just use venvs": to use venvs for various projects, you need to install cargo, clang/g++/msvc, sometimes Fortran, or even assemblers. All of these fail in spectacular fashion. Binary packages aren't a solution either, they'll just randomly fail to load (glibc mismatch, musl, or hell, "go binary loaded on Alpine by Python" just seems to fail for obscure Linux linker reasons). There is no easy answer that doesn't require some kind of additional tooling if you're trying to use big, unwieldy packages. This is the case for every programming language!
The advice (python3.x -m venv + python -m pip install ) is good, but I think it is incomplete. What's the best practice for making the venv reproducible? pip freeze + pip install -r? What about upgrading or deleting a package? Also, what if I don't want to install pure development packages (black, mypy, pylint, doc generation packages etc.) in the deployment venv?
> Among other things, it advised not to use homebrew, pyenv, anaconda, poetry, and other tools.
Why are there so many low-quality programmers jumping on this problem? Why can't they just step aside and leave it to people who actually understand what they are doing?
OP has no clue what they are writing about. Anaconda is a different distribution of Python with a bunch of its own packages, tools, processes... It definitely doesn't belong in the same group with homebrew, pyenv or Poetry. What are Python users supposed to do if they get a project exclusively distributed as an Anaconda package?
The real list of things you shouldn't use to install Python packages is anything that's governed by PyPA -- it's all garbage. Unfortunately, this is impractical. The situation is in a way very similar to Linux and proprietary drivers. If you want Linux to run on your laptop, you will most likely end up needing proprietary drivers. And that's bad. And you have to compromise.
So... setuptools? -- bad, shouldn't be used, but, realistically, you will end up using it anyways. pip? -- even worse, but it's very hard to avoid. In situations where you create a library, you'll probably have to also at some point face the need to provide it in Anaconda distribution, and you'll have to use conda too... which is also bad, but there's nothing you can do about it, and even if projects like mamba exist, it's still not going to help you because you must test your library with the tools the users are going to use on it, and it's a choice between not supporting a bunch of users and having to use crappy tools.
The deprecation of setuptools is a joke. There's no replacement. What PyPA does is that they produce ever more bullshit wrappers for setuptools pretending they are doing something new.
I think that by "apt packages" you mean DEB packages. As someone who has to package a lot of Python both for RHEL and for Debian-like... well... so far and in the observable future, setuptools isn't going anywhere. On the other hand, I could easily do without it, and sometimes do. But it's not because PyPA produced any replacement to it. It's because I simply grow frustrated with how bad of a tool setuptools is, and I just write my own.
The claims PyPA makes about setuptools are because nobody wants to support that project anymore. It's awfully written at every level, from the minutia of formatting and up to the high-level design decisions. The problem is that everything else that PyPA came up with so far is a joke. The only reason I keep paying attention to it is because I'm being paid to fix problems created by using it.
I'm also going to tell you a secret... well, of sorts. DEB, and RPM too are also pathetically bad... since you seem to have some experience with DEB -- to assemble this stuff you typically have to use a bunch of Make macros. Excruciatingly painful to debug. Virtually no documentation and very few people who could help you if you have a problem. And, when it comes to Python packaging, things also changed over time. Not so long ago I saw some write-up about how to package Python including virtual environment in the package. In truth, if I were a distro maintainer, and someone came to me with a package like that, I'd told them to try elsewhere. It's a sign of disrespect to the efforts of the distro maintainers to assemble a working and coherent system. But, guess who does stuff like that? -- big companies, like Amazon or Microsoft. Instead of spending effort to figure out how to play nice with the distro's packages, they distribute a huge blob that's not humanly possible to audit.
I’m all too familiar with how difficult it is to create a DEB package from scratch. I honestly can’t believe how difficult it is to create what is really just an archive.
And then there’s the issue of a lot of ML packages having a native component… I spent half a day yesterday trying to build a project in an M1 Mac. Not even using an x86 Linux Docker image helped as tensorflow complained it was compiled for AVX-512 and the CPU didn’t support it.
Very true. One (quite narrow) use-case does have a simple solution: if a) your install targets are only Debian derivatives, and b) you can survive with only the Python packages provided by the OS, then don't use pip at all, install everything via apt and have a long lie-in.
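For that narrow case it really is just the distro package manager (package names follow Debian's python3-<name> convention; these are just examples):

    sudo apt install python3-requests python3-yaml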
Even after using Python for almost a decade, I sometimes feel the entire debacle around packaging and development environments is too sophisticated for a user base that grows every day with more beginners.
It kills me that the Python folks don't have a resource like CTAN and an established naming convention so that things (mostly) work when installed, and when things don't, help is just a "minimum (not) working example" away on tex.stackexchange.com.
Worse than Javascript's 5+ different types of mostly incompatible module formats, transpilers for different version targets, megalithic node_modules directories and the community's love of importing new packages for one-line functions that can disappear and take down the internet?
As much as we all have criticized the npm ecosystem, after seeing what exists in the python world I actually came round to it and consider it very good.
One major difference is a venv doesn't allow nested dependencies. With npm, you might install version 2.3 of a package, only to find that some other package has installed version 1.4 under itself.
Python also has "wheels", which are pre-compiled distributions, rather than full-source distributions. These are often smaller and install faster.
> One major difference is a venv doesn't allow nested dependencies. With npm, you might install version 2.3 of a package, only to find that some other package has installed version 1.4 under itself.
A feature, not a bug. Anyone who has built a sufficiently large project and had to abandon some dependencies knows this pain.
Javascript is a larger technical mess once you pull it open and look inside.
But for day to day use it's much better than Python. They've done a fantastic job overall allowing you not to need to worry about all the different formats and transpilers and whatnot.
In Python at least you "only" have to figure out venv, pip and conda and you're mostly set. In C/C++ world adding a dependency is such a pain that most projects have less than five dependencies. Most of the time it's simpler to just quickly write and debug a worse version yourself than to figure out how to find and integrate a library. Until you reach sufficient scale, at which point the sane thing seems to be to make your own build system instead
>In Python at least you "only" have to figure out venv, pip and conda and you're mostly set. In C/C++ world adding a dependency is such a pain that most projects have less than five dependencies. Most of the time it's simpler to just quickly write and debug a worse version yourself than to figure out how to find and integrate a library.
That's from the developer's perspective, not the user's. The user of a C++ project doesn't have to care about any of that; they just run the binary, or ./configure && make to build from source. The user of a Python project has to struggle through worst case days of pain depending on how many dependencies are randomly broken.
If your user is interacting with the package ecosystem, you have already failed multiple steps earlier. Which Python projects are routinely doing, but that's a cultural issue about packaging.
For "users" that have to touch the code: well, those get the full developer's perspective. And I've lost more hours than I would like to admit trying to get C++ projects to build. A project has to be really small or packaged really well to make it as simple as ./configure && make
Can you elaborate on what this system lacks, or what the key properties are that make it "the worst"? I know it's a mess, but I can't pinpoint the reason why.
The non-hierarchical, not-reused, reinvented package convention. The most popular package repository being incredibly hard to find one package you want among a sea of crap. The lack of a single standard solution to the most common problems. Frequent backwards compatibility issues.
It's definitely not the worst. But it's hilarious how bad it is considering how popular it is and how long it's been around
That so many things don't run on the latest stable version is one of them. Python 3.11 was published all the way back in October, and it's still not stable and well supported across the ecosystem.
Yes, it is way better, which is why many later package managers, e.g. cargo and composer, took ideas from Ruby to improve on. Ruby has issues, but package management is not one of them.