
I think Python needs far fewer packages in its standard library:

- array: everyone uses numpy for the same purposes

- bisect: very niche feature, not a good fit for standard library

- glob: should probably be part of os

- graphlib: why is it part of the standard library? even if you need to work with graphs, chances are you have different requirements or data format

- shlex: again, not really necessary in standard library (unless it's used in shutil, I don't know)

- statistics: should be handled by scipy

And that's just the modules mentioned in this article. There are tons of other modules that are outdated, unused, or don't fit into the standard library: getopt, curses, urllib (everyone uses requests), xml*, html, tkinter.

What Python should be doing is deprecating those modules and/or moving them outside of the standard library. Not adding more of them.



Sorry to pile on with the disagreements, but I also try to keep my scripts as free of external dependencies as possible (at the expense of reinventing the wheel a bit), both to minimize supply-chain attack surface and to keep things as future-proof as possible.

For that, some of the included libraries really ease the pain of having to reinvent things like text-wrapping, special cases when iterating over things, creating TUIs, etc.

The more you read the docs, the more you appreciate the thought and effort of the Python developers in making things useful and convenient for everyone.


Python package management is a nightmare. A script that only uses the standard library can be shared as a single file.

To me, the breadth of the standard library is one of Python’s main strengths.


Strange selection you got there.

This is what will be (eventually) removed: https://docs.python.org/3/library/superseded.html

> urllib (everyone uses requests)

iirc requests is actually a wrapper for urllib. And there are certainly many scripts using urllib directly to avoid having an out-of-stdlib dependency. The API is not that bad.
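For what it's worth, assembling a request with nothing but the stdlib really isn't painful. A minimal sketch (the URL, parameters, and User-Agent string here are made up for illustration):

```python
from urllib.request import Request
from urllib.parse import urlencode, urlparse

# Build a GET request with query parameters, stdlib only.
params = urlencode({"q": "python", "page": 2})
url = f"https://example.com/search?{params}"
req = Request(url, headers={"User-Agent": "my-script/1.0"})

print(urlparse(req.full_url).query)  # q=python&page=2
```

Pass `req` to `urllib.request.urlopen()` to actually send it.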


requests uses urllib3, which, despite the name, is an independent third-party library rather than a fork of the stdlib urllib, presumably created to address missing features or design flaws.


Thanks for the link! It's nice to see that some cleanup is ongoing.


They do sort out dead batteries [1]. But pretty much every case you've listed is not exactly dead, but more like a philosophical disagreement.

For example some modules are not what they seem to be; graphlib doesn't purport to be a general graph library and statistics doesn't mean to replace scipy. They are just groups for otherwise independent bits of code, named so that similar future code can be grouped together. In some cases you have a waaaaaay too high threshold for "unused" modules, for example [1] kept getopt because it mirrors a popular C library.

[1] https://peps.python.org/pep-0594/


> kept getopt because it mirrors a popular C library

It's an excellent reason to keep it as a separately maintained module. Having in the standard library two modules for the same purpose just adds to the confusion.


Strong disagree on 'statistics', I'm not pulling in the entirety of SciPy just so I can call stdev in a small script.
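For a small script, this is the whole job (the numbers are made up for illustration):

```python
from statistics import mean, stdev

samples = [2.5, 3.1, 2.8, 3.4, 2.9]
print(mean(samples))
print(stdev(samples))  # sample standard deviation (n - 1 denominator)
```

No third-party install, no import-time cost of a numerical stack.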


You can also use numpy to compute stdev, and I would argue that in most contexts where you need to compute stdev you are also likely to use other numpy and scipy features.


It is truly the jQuery of the Python world.


I personally love the (extra)-batteries included approach to the standard library.

Having them included by default is much more convenient; I may want to do some very simple stats without needing all of scipy, etc.


Array - What's the use case here? (And I'm not being snarky or even critical.)

My usage is generally lists for small stuff where I don't care about performance, and pandas/numpy for analysis. So maybe array is for when I don't want numpy, or when I'm doing something real and large where list performance would truly be a problem?

Also, I looked for a PEP to answer these questions and couldn't find it. (The one I found was very old). Or any stats on performance improvements?


I agree with some of these, and some could be consolidated, but the features of `glob` and `shlex` are useful for writing scripts, which I think is a common use for Python. Seems to me that leaving them in is reasonable.

I also think having basic statistic computations is useful. There are people who may want to calculate simple statistics without adding a whole dependency.
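A quick sketch of what those two modules buy you in a script (the file names and command are invented; `glob.glob` results depend on the working directory):

```python
import glob
import shlex

# Expand a wildcard in-process, no shell needed.
py_files = glob.glob("*.py")

# Quote each name so spaces and quotes survive a trip through a shell.
cmd = "wc -l " + " ".join(shlex.quote(f) for f in py_files)
print(cmd)

# shlex also goes the other way: split a command line like a POSIX shell.
print(shlex.split("grep -n 'hello world' main.py"))
```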


I use the array module to make the Python/C extension interface easier, to save space for big data sets, and (with from/to), simplify I/O.

I don't want a dependency on NumPy, especially as I don't need any other feature of NumPy beyond storing a homogeneous, resizeable vector of simple data types.
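A minimal sketch of that pattern (illustrative values only):

```python
from array import array

# A homogeneous, compact vector of C doubles.
a = array("d", [1.0, 2.0, 3.0])
a.append(4.0)

# Round-trip through raw bytes -- handy for simple binary I/O
# and for handing buffers to C extensions.
raw = a.tobytes()
b = array("d")
b.frombytes(raw)

print(a.itemsize)  # 8 bytes per item on typical platforms
```

Compare that to a list of Python floats, where every element is a full boxed object.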


> What Python should be doing is deprecating those modules and/or moving them outside of the standard library. Not adding more of them.

No, thanks.


No. The "batteries included" approach is awesome for many reasons. Removing in-use modules would also create serious upgrade problems.


Yes, but it includes wrong batteries. If anything, it should be including numpy or pandas, not arrays.


> array: everyone uses numpy for the same purposes

It still is used in situations where numpy doesn't fit (serialization of data, etc)

Agree with shlex and glob

> statistics: should be handled by scipy

It was just added. I guess for people who want to import numpy to calculate the mean of an array? (please don't do this - you don't need to do this)

getopt is a necessary evil, and I think requests uses a lot of built-in stuff

Python does deprecate some modules once in a while


Are you trolling?

For one thing, you can pry Tkinter from my cold dead hands. For another, I used array just last week.

This whole "no one is using it so throw it away" culture is part of the improving-to-death of the modern Python ecosystem. I think it must come from the Python 3 fiasco. It's become fashionable in a subset of Python community to aggressively deprecate working code.


Given how many warts remain in the language, not aggressive enough.


binary search is standard imo.


I agree, Python's stdlib doesn't have a lot of standard algorithms and data structures; bisect is a very welcome exception.
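For example:

```python
import bisect

scores = [10, 20, 20, 30]

# Find insertion points in O(log n) on an already-sorted list.
print(bisect.bisect_left(scores, 20))   # 1: before the existing 20s
print(bisect.bisect_right(scores, 20))  # 3: after the existing 20s

# insort keeps the list sorted as you add items.
bisect.insort(scores, 25)
print(scores)  # [10, 20, 20, 25, 30]
```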


glob is part of pathlib in Python 3. Pathlib is great.
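For example (the scratch directory and files here exist only for the demo):

```python
import tempfile
from pathlib import Path

# Make a scratch directory with two files to glob over.
root = Path(tempfile.mkdtemp())
(root / "a.py").touch()
(root / "b.txt").touch()

# Path.glob covers most of what the glob module does.
py_files = [p.name for p in root.glob("*.py")]
print(py_files)

# Path.rglob("*.py") would recurse, like glob's "**/*.py" with recursive=True.
```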


But, unfortunately, very slow.

https://youtu.be/tFrh9hKMS6Y


glob works on more than paths, though


Like what for example?


They probably mean fnmatch, which is what glob actually uses under the hood. That's indeed a useful module for simple wildcard filtering.


Exactly! I use fnmatch a lot to translate glob strings to regexes (I'm a developer on a Houdini and in-house software pipeline at a VFX company).

Being able to provide solid globbing options to users in any scenario that is 'path-y' can provide massively more power to them, in lots of different cases I'd say.
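A small sketch of that translate-to-regex trick (the file names are made up):

```python
import fnmatch
import re

# fnmatch.translate turns a glob pattern into a regex string
# (the exact string varies between Python versions).
pattern = fnmatch.translate("shot_*.exr")
rx = re.compile(pattern)

names = ["shot_010.exr", "shot_010.jpg", "plate_010.exr"]
print([n for n in names if rx.match(n)])  # ['shot_010.exr']
```

Handy when you want users to write familiar glob syntax but need a compiled regex internally.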


My bad, I did mix it up with fnmatch.


glob could be part of os (or maybe even pathlib), that is true. But I have used all of them except graphlib at one point or another, with shlex being the most regular.


In the case of graphlib, I somewhat agree - graph functions are necessary for many modern applications, and the choice of functions supported by graphlib is... interesting. It would probably make more sense to either extend them or just replace it with a more useful library.
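To be fair, graphlib's one job is topological ordering of a dependency graph, which it does fine; a minimal sketch with a made-up dependency map:

```python
from graphlib import TopologicalSorter

# Each key depends on the nodes in its set.
deps = {"build": {"compile"}, "compile": {"fetch"}, "test": {"build"}}

order = list(TopologicalSorter(deps).static_order())
print(order)  # ['fetch', 'compile', 'build', 'test']
```

Anything beyond that (shortest paths, traversals, graph algorithms generally) is indeed out of scope and needs a real graph library.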



