[HN Gopher] Cooperative Package Management for Python
___________________________________________________________________
Cooperative Package Management for Python
Author : Tomte
Score : 59 points
Date : 2021-09-01 05:56 UTC (4 hours ago)
(HTM) web link (lwn.net)
(TXT) w3m dump (lwn.net)
| derriz wrote:
| I probably haven't thought this fully through but wouldn't it be
| simpler to just have a system venv - root protected - perhaps
| distributed using the system's package manager? Then if you mess
| up site-packages at least you wouldn't break the system tools.
| BiteCode_dev wrote:
| It's already the case; that's why you see some mistaken
| tutorials telling you to "sudo pip install".
|
| A venv is just a directory and some path setup, and your OS
| already has one dedicated to its own Python install, even if
| it's not called a venv but something like dist-packages plus
| manual PATH fudging.
|
| This proposal makes the distinction clearer by putting a
| safeguard in place to NOT mess with the system stdlib.
|
| You should be using "--user" or a local venv, and "-m" to call
| commands.
|
| See my other comments for more tooling.
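For reference, a stdlib-only sketch of the two install locations being discussed (nothing here comes from the article; it just prints the current interpreter's own paths):

```python
import site
import sysconfig

# Default target of a bare "pip install" (on many distros this
# is shared with the OS package manager -- the clash at issue):
print(sysconfig.get_path("purelib"))

# Target of "pip install --user" (per-user, leaves the OS alone):
print(site.getusersitepackages())
```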
| derriz wrote:
| I'm missing something I guess - I don't understand your
| comment in the context of this claim near the start of the
| article:
|
| "The root cause of the problem is that distribution package
| managers and Python package managers ("pip" is shorthand to
| refer to those throughout the rest of the article) often
| share the same "site-packages" directory for storing
| installed packages."
|
| Having a system venv - managed by the system's package
| manager - would mean this "root cause" would go away, no?
|
| Of course by poking around the filesystem tree and messing
| with managed files as root, you could mess up the system venv
| but that's possible with any installed package.
| faho wrote:
| >Having a system venv - managed by the system's package
| manager - would mean this "root cause" would go away, no?
|
| Oh, but you need to explain to pip that it's managed by the
| system's package manager. Which is what this does with the
| "EXTERNALLY-MANAGED" file.
|
| And you need to do that because the system's package
| manager and pip have separate sources - with `pip` you get
| the packages from pypi, with the system's package manager
| from its repo.
| BiteCode_dev wrote:
| As I said, a venv is nothing more than a directory + PATH
| setup. OS already have a dedicated one. The problem is not
| that it does not exist, the problem is that pip is using
| it.
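A minimal sketch backing this up, using only the stdlib `venv` module: the "virtualness" is essentially a directory containing a small `pyvenv.cfg` plus a bin/ (or Scripts/) subdirectory:

```python
import pathlib
import tempfile
import venv

# Create a bare venv (with_pip=False keeps the demo fast):
root = pathlib.Path(tempfile.mkdtemp()) / "demo-venv"
venv.create(root, with_pip=False)

# The marker file that makes an interpreter treat this
# directory as an isolated environment:
print((root / "pyvenv.cfg").read_text())
```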
| JonathanBeuys wrote:
| On my computer, I only use Python software that is in the Debian
| repos. So all dependencies are handled by Debian.
|
| Everything else I run in a Docker container.
|
| This way I never need a venv.
|
| I hope these changes will not make my workflow harder.
|
| Running 3rd party software in containers not only makes
| dependency handling easier. It also is a security thing. Can you
| really trust the whole dependency tree below any python code that
| you want to run? I prefer to keep it separate from my main OS.
| BiteCode_dev wrote:
| No, in fact it will make your workflow easier: no matter
| what, no script using pip will break your install.
| JonathanBeuys wrote:
| I don't know if I actually have something you would call an
| "install".
|
| I run 3rd party software in fresh Debian containers. So there
| is nothing in there I would be afraid to "break".
| BiteCode_dev wrote:
| The container has a system python stdlib, and it's used by
| apt. If it breaks, everything breaks. I've destroyed it
| twice, with style I must say.
|
| If any script (or you yourself) runs "pip install" with
| admin rights (the default in a lot of container techs), you
| have a chance of breaking it.
| JonathanBeuys wrote:
| "Chance to break" what?
|
| It is not as if I manually work in a terminal inside a
| container that is somehow permanent and valuable.
|
| I just once create a dockerfile with the commands to set
| up the application I want to use. That's it. From then
| on, I use that container whenever I want to use that
| application.
| BiteCode_dev wrote:
| If pip installs something that has a dependency with the
| same name as any of the packages in your install, it can
| cause a conflict, because by default they will overwrite
| each other.
| chriswarbo wrote:
| Eww. I get the rationale, but Python's packaging/import logic is
| already ridiculously convoluted. The underlying problem is using
| global mutable state (e.g. /usr/lib/.../site-packages). Giving
| each application its own package directory is better; there are
| Python-specific solutions to that (like virtualenv mentioned in
| the article), but I prefer Nix since it's language-agnostic (it
| can handle non-Python dependencies too).
|
| I'm also not a fan of all the Rube Goldberg machines being
| cobbled together to appease Docker's fundamentally broken way of
| working:
|
| > Distros that produce official images for single-application
| containers (e.g., Docker container images) should remove the
| EXTERNALLY-MANAGED file, preferably in a way that makes it not
| come back if a user of that image installs package updates inside
| their image (think RUN apt-get dist-upgrade). On dpkg-based
| systems, using dpkg-divert --local to persistently rename the
| file would work. On other systems, there may need to be some
| configuration flag available to a post-install script to re-
| remove the EXTERNALLY-MANAGED file.
|
| Here a "single-application container" is assumed to contain _an
| entire OS_ rather than, you know, a single application. That was
| _supposed_ to make dependency management easier, since we can
| tailor the OS to just that one application; but based on this
| article it sounds like even that was a lie. Should we revisit the
| assumption that Docker makes dependency management easier? No,
| the OS maintainers are now expected to change their distros, to
| add another layer to the Rube Goldberg machine.
|
| I also find it pretty alarming that 'apt-get dist-upgrade'
| is given as an example of something we might want to do to
| an "official image". What's the point of sanctioning a
| snapshot as "official" if we're just going to immediately,
| and non-deterministically, overwrite arbitrary parts of it
| based on external-server-state du jour?
| olau wrote:
| > Giving each application its own package directory is better
|
| One simple way to do that is
|
|     cd projectfoo
|     mkdir libs
|     pip install -t libs
|     PYTHONPATH=libs python foo.py
|
| That way you can use system packages like a database connection
| library together with the local application dependency farm.
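A small sketch of the mechanism at work (the "libs" name is just the one from the comment above): entries in PYTHONPATH are prepended to the child interpreter's sys.path, ahead of site-packages, so local packages win while system packages stay visible:

```python
import os
import subprocess
import sys

# Ask a child interpreter what its sys.path looks like when
# PYTHONPATH points at a project-local "libs" directory:
env = {**os.environ, "PYTHONPATH": "libs"}
out = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.path)"],
    env=env, capture_output=True, text=True,
).stdout
print("libs" in out)  # True
```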
| BiteCode_dev wrote:
| While I agree about Nix being a cleaner solution, it will
| never be adopted, because... it's a cleaner solution.
|
| Nix is pretty much incompatible with everything, because our
| entire legacy software stack has state, and we need to
| accommodate it.
|
| So we keep adding those hacks because we favor compat above
| purity.
|
| Even Nix has to, after all; there are plenty of bash calls
| in NixOS config scripts as soon as you do something
| non-trivial.
| chriswarbo wrote:
| Sure, but I'm not saying everyone should adopt Nix; I'm
| saying the solution to problems caused by global mutable
| state is to avoid adding more global mutable state.
|
| If 'apt-get dist-upgrade' is breaking the packages we pip-
| installed into /usr/local, I'd ask (a) why we're using two
| package managers, and (b) why 'put these files in these
| locations' is being solved implicitly by running imperative,
| non-deterministic commands to mutate things in-place, rather
| than e.g. extracting a .tar.gz of known-good dependencies.
| BiteCode_dev wrote:
| > If 'apt-get dist-upgrade' is breaking the packages we
| pip-installed into /usr/local, I'd ask (a) why we're using
| two package managers,
|
| Debian packagers won't use pip for obvious reasons. Devs
| need pip because it's portable and doesn't need packager
| validation.
|
| It would have been better if something like Nix had been
| adopted 30 years ago by both communities, but it hasn't
| been, so we have several package managers. It's even worse
| now with poetry, conda, snap, Flatpak and docker.
|
| > why 'put these files in these locations' is being solved
| implicitly by running imperative, non-deterministic
| commands to mutate things in-place, rather than e.g.
| extracting a .tar.gz of known-good dependencies.
|
| pip now uses wheels, and basically does that. Whl files are
| zips, and pip just unpacks them at a known location. The
| problem is, this location is currently shared with the OS if
| you don't use --user or a venv. This article addresses that
| problem. My other comment also talks about some other
| solutions.
|
| It's very hacky because, well, legacy and all that.
| georgyo wrote:
| You missed what he was saying.
|
| Nix solves this problem in multiple different ways. One
| of which is packaging, but also the entire /nix/store is
| mounted read-only.
|
| A sudo pip install would have no ability to break the
| system or remove something that was installed by nix.
| BiteCode_dev wrote:
| I get that, but since our entire legacy stack is stateful,
| there is no way around that unless you nixify the whole
| stack.
|
| Immutability only works if the entire chain is immutable.
| [deleted]
| BiteCode_dev wrote:
| It's a good safeguard, and it's going in the direction of the
| other initiatives to make python package management default
| behavior saner.
|
| PEP 582 is another one to follow up:
| https://www.python.org/dev/peps/pep-0582/
|
| It basically uses the concept of node_modules, making Python
| interpreters load any local directory named `__pypackages__`.
| There are 2 differences though:
|
| * Unlike JS, python can only have one version of each lib for a
| given setup.
|
| * Since having several versions of python often matters, you may
| have several __pypackages__/X.Y sub dirs to cater to each of
| them.
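A sketch of the per-version lookup described above, following PEP 582's proposed `__pypackages__/X.Y/lib` layout (the directory names come from the PEP; everything else here is illustrative):

```python
import pathlib
import sys

# PEP 582 proposes that the interpreter look for a project-local
# directory keyed by the running major.minor version:
tag = f"{sys.version_info.major}.{sys.version_info.minor}"
candidate = pathlib.Path.cwd() / "__pypackages__" / tag / "lib"

# If the directory exists, packages in it would shadow
# site-packages, much like node_modules does for JS:
if candidate.is_dir():
    sys.path.insert(0, str(candidate))
print(candidate)
```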
|
| It also forces you to use "-m" to call commands written in
| Python, which is the best practice anyway. I hope it will push
| jupyter to fix "-m" on windows for them because that's a blocker
| for beginners.
|
| If you are not already using "-m", start now. It solves a lot of
| different problems with running python cli programs. It's an old
| flag, but too few people know about it.
|
| E.g.: instead of running "black" or "pylint", do "python -m
| black" or "python -m pylint". Of course you may want to
| choose a specific version of Python, so "python3.8 -m black"
| on unix, or "py -3.8 -m black" on windows.
|
| To test out the __pypackages__ concept, give a try to the pdm
| project: https://github.com/pdm-project/pdm
|
| Lastly, some other tools that I wish people knew more about,
| which solve packaging issues:
|
| * pyflow (https://github.com/David-OConnor/pyflow): it's a
| package manager like poetry, but it also installs whatever
| Python you want, like pyenv. Except it provides the binary,
| no need to compile anything. It's a young project with
| plenty of bugs, but I hope it succeeds because it's really a
| great concept. Give it a try, it needs users and we would
| all benefit from it becoming popular, as it's a very sane
| way of setting up a Python dev env.
|
| * shiv (https://shiv.readthedocs.io/): it leverages the concept
| of zipapp, see PEP 441 from 2013, meaning the ability that python
| has to execute code inside a zip file. It's a successor to pex.
| Basically it lets you bundle your code + all deps inside a zip,
| like a Java .war file. You can then run the resulting zip, a
| .pyz file, as if it were a regular .py file. It will unzip
| on the first execution automatically and run transparently.
| It makes deployment almost as easy as with golang.
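The stdlib `zipapp` module mentioned above (PEP 441) can be tried in a few lines; shiv builds on the same idea. This is a minimal sketch, not how shiv itself packages dependencies:

```python
import pathlib
import subprocess
import sys
import tempfile
import zipapp

# A tiny "application": a directory whose __main__.py is the
# entry point that runs when the archive is executed.
src = pathlib.Path(tempfile.mkdtemp()) / "app"
src.mkdir()
(src / "__main__.py").write_text('print("hello from a .pyz")')

# Bundle the directory into a runnable .pyz archive:
target = src.parent / "app.pyz"
zipapp.create_archive(src, target)

# Execute the archive directly with the interpreter:
out = subprocess.run(
    [sys.executable, str(target)], capture_output=True, text=True
).stdout
print(out.strip())  # hello from a .pyz
```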
|
| * nuitka (https://nuitka.net/): takes your code and all
| dependencies, turns them into C, and compiles it. Although it
| does require a bit of setup, since it needs headers and a
| compiler, it results reliably in a standalone compiled executable
| that will run on the same architecture with no need for anything
| else. Also it will speed up your Python program, up to 4 times.
| In my experience, it's also easier and more robust than
| pyinstaller, cx_freeze and so on.
|
| * pyodide (https://pyodide.org/en/stable/): python compiled to
| WASM to run in the Web browser. Useless for web programming given
| the huge size of the runtime, but great for teaching, as it
| allows students to basically access a zero install full featured
| python dev env by clicking a link. Try it out, it's awesome, you
| can even create and query a sqlite db, thanks to the virtual FS:
| https://notebook.basthon.fr/
| ericvsmith wrote:
| nuitka is at https://nuitka.net/
| BiteCode_dev wrote:
| thanks, I'll fix my bad copy paste.
| gigatexal wrote:
| What's still annoying to this day is Python imports. I have
| been doing Python for a while now and I still get tripped up
| when some bit of code in a folder isn't importable. I've
| found adding the package dir that I'm working on to
| PYTHONPATH works, but then my editor doesn't resolve
| packages for autocomplete.
| georgyo wrote:
| If you have foo.py and bar.py in a directory and you want to
| import foo.py from bar.py you only need to add an empty
| __init__.py
|
| You would then `import .foo`
| terom wrote:
| Python 3.3+ no longer requires an `__init__.py` file to
| make a package: https://www.python.org/dev/peps/pep-0420/
|
| > You would then `import .foo`
|
| The `import .foo` is a syntax error, and `from . import foo`
| does not work for scripts [1].
|
| The basic `import foo` works if the directory is on your
| PYTHONPATH. That will be the case if you run `bar.py` as a
| script.
|
| It gets more difficult if you want to separate scripts and
| importable modules into separate sibling directories (bin vs
| lib). You can't use relative imports to get around that, you
| need to use a virtualenv (or manual PYTHONPATH wrangling).
|
| [1] https://stackoverflow.com/a/14132912
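The "run bar.py as a script" case above can be checked quickly; the file names follow the comment's example, and running a script puts its own directory at the front of sys.path, so a plain `import foo` works with no `__init__.py`:

```python
import pathlib
import subprocess
import sys
import tempfile

# Two sibling modules in a scratch directory:
d = pathlib.Path(tempfile.mkdtemp())
(d / "foo.py").write_text("VALUE = 42")
(d / "bar.py").write_text("import foo; print(foo.VALUE)")

# Running bar.py as a script makes its directory sys.path[0],
# so the absolute "import foo" resolves:
out = subprocess.run(
    [sys.executable, str(d / "bar.py")],
    capture_output=True, text=True,
).stdout
print(out.strip())  # 42
```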
| BiteCode_dev wrote:
| Use virtualenvs, that's the problem they solve.
|
| Python now ships with the venv cli tool.
|
| If you point your editor to the python interpreter in the venv,
| it will detect everything automatically.
| korijn wrote:
| I'm thrilled to see people addressing this (decades old?)
| footgun. I hope there will be more collaboration in the
| future. The split universes that system and
| language-specific package managers operate in cause many
| more issues.
___________________________________________________________________
(page generated 2021-09-01 10:00 UTC)