[HN Gopher] Pypi.org is running a survey on the state of Python ...
___________________________________________________________________
Pypi.org is running a survey on the state of Python packaging
Author : zbentley
Score : 163 points
Date : 2022-09-07 15:05 UTC (7 hours ago)
(HTM) web link (pypi.org)
(TXT) w3m dump (pypi.org)
| sanshugoel wrote:
| Pick up poetry and fix it. I thought it would be fun to use
| poetry, but it trips over itself here and there.
| clintonb wrote:
| The survey is at https://www.surveymonkey.co.uk/r/M5XKQCT.
| MonkeyMalarky wrote:
| Seems more oriented to (potential) contributors than end users
| of the packaging system. Who cares about mission statements and
| inclusivity? Secure funding and pay developers to make the
| tools.
| woodruffw wrote:
| > Who cares about mission statements and inclusivity? Secure
| funding and pay developers to make the tools.
|
| These are connected things.
|
| I maintain a PyPA member project (and contribute to many
| others), and the latter is aided by the former: the mission
| statement keeps the community organized around shared goals
| (such as standardizing Python's packaging tooling), and
| inclusivity ensures a healthy and steady flow of new
| contributors (and potential corporate funding sources).
| nomdep wrote:
| The PSF are not engineers looking for a better developer
| experience, but politicians looking for power. That's why we
| had the pipenv fiasco a few years ago.
| zo1 wrote:
| What was the pipenv fiasco?
| crazytalk wrote:
| This survey is the literal definition of a leading question.
| Found about 2 boxes I could tick, before being forced to order
| a list of the designer's preferences according to how much I
| agree with them. The only data that can be generated from a
| survey like this is the data you wanted to find (see also the
| Boston Consulting Group article earlier today). I cannot
| honestly respond to it.
|
| The only question I have is, what grant application(s) is the
| survey data being used to support?
| KingEllis wrote:
| The absence of the go binary as a tool (e.g. "go get ...",
| "go install ...", etc.) is odd, considering that is what has
| been eating Python's lunch lately.
| erwincoumans wrote:
| I am pretty happy with PyPI/pip, it is an easy way to distribute
| Python and C++ code wrapped in a Python extension to others. For
| a C++ developer it is becoming harder to distribute native
| executables, since macOS and Windows require signing binaries.
| Python package version conflicts and backwards incompatibility
| can be an issue.
| mdmglr wrote:
| My wishlist:
|
| We need a way to configure an ordered list of indexes that pip
| searches for packages; --extra-index-url or using a proxy index
| is not the solution (see the sketch at the end of this list).
|
| Also: namespaces, and not based on a domain. So, for example:
| pip install apache:parquet
|
| Also some logic either in the pip client or index server to
| minimize typosquatting
|
| Also, pip should adopt a lock file similar to npm/yarn's,
| instead of requirements.txt.
|
| And also "pip list" should output a dependency tree like "npm
| list"
|
| I should not have to compile source when I install. Every package
| should have wheels available for the most common arch+OS combos.
|
| Also we need a way to download only what you need. Why does
| installing scipy or numpy install more dependencies than the
| conda version? For example pywin and scipy.
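|
| (For reference, the mechanism ruled out above looks roughly like
| the sketch below; pip treats all configured indexes as more or
| less equal-priority candidates rather than an ordered list,
| which is what enables dependency confusion.
| `internal.example.com` is a placeholder.)
|
|     $ pip install somepkg \
|           --index-url https://internal.example.com/simple \
|           --extra-index-url https://pypi.org/simple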
| ciupicri wrote:
| > apache:parquet
|
| How are you going to name the file storing the wheel for that
| package? Using ":" on Windows is going to be problematic.
| tempest_ wrote:
| If you are using poetry you can add something to the
| pyproject.toml to handle the indexes, though I am not sure if
| they are ordered or not:
|
|     [[tool.poetry.source]]
|     name = "my-pypi"
|     url = "https://my-pypi-index.wherever"
|     secondary = true
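|
| (With a source like that defined, `poetry add somepackage
| --source my-pypi` pins a dependency to that index -- assuming a
| reasonably recent Poetry.)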
| orf wrote:
| For improvements I commented: Remove setup.py files and mandate
| wheels. This is the root cause of a lot of the evil in the
| ecosystem.
|
| Next on the list would be pypi namespaces, but there are good
| reasons why that is very hard.
|
| The mission statement they are proposing, "a packaging ecosystem
| for all", completely misses the mark. How about a "packaging
| ecosystem that works" first?
|
| I spent a bunch of time recently fixing our internal packaging
| repo (nexus) because the switch from md5 hashes to sha256 hashes
| broke everything, and re-locking a bajillion lock files would
| take literally months of man-hours.
|
| I've been a Python user for the last 17 years, so I'm sympathetic
| to how we got to the current situation and aware that we've
| actually come quite far.
|
| But every time I use Cargo I am insanely jealous, impressed and
| sad that we don't have something like it. Poetry is closest, but
| it's a far cry.
| baggiponte wrote:
| I recommend PDM over poetry!
| blibble wrote:
| > The mission statement they are proposing, "a packaging
| ecosystem for all", completely misses the mark. How about a
| "packaging ecosystem that works" first?
|
| I think at the point a programming language is going on about
| "mission statements" for a packaging tool, you know they've
| lost the plot
|
| copy Maven from 2004 (possibly with less XML)
|
| that's it, problem solved
| dalke wrote:
| > Remove setup.py files and mandate wheels
|
| What alternative is there for me?
|
| My package has a combination of hand-built C extensions and
| Cython extensions, as well as a code generation step during
| compilation. These are handled through a subclass of
| setuptools.command.build_ext.build_ext.
|
| Furthermore, I have compile-time options to enable/disable
| certain configuration options, like enabling/disabling support
| for OpenMP, via environment variables so they can be passed
| through from pip.
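|
| (A minimal sketch of that pattern -- the names here are
| illustrative, not my actual code:)
|
|     import os
|     from setuptools import setup, Extension
|     from setuptools.command.build_ext import build_ext
|
|     class BuildExt(build_ext):
|         def build_extensions(self):
|             # read from the environment, so it can be set when
|             # invoking pip
|             if os.environ.get("USE_OPENMP") == "1":
|                 for ext in self.extensions:
|                     ext.extra_compile_args.append("-fopenmp")
|                     ext.extra_link_args.append("-fopenmp")
|             super().build_extensions()
|
|     setup(
|         ext_modules=[Extension("mypkg._ext", ["src/_ext.c"])],
|         cmdclass={"build_ext": BuildExt},
|     )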
|
| OpenMP is a compile-time option because the default C compiler
| on macOS doesn't include OpenMP. You need to install it using
| one of various approaches, which is why I only have a source
| distribution for macOS, along with a description of those
| approaches.
|
| I have not found a non-setup.py way to handle my configuration,
| nor to provide macOS wheels.
|
| Even for the Linux wheels, I have to patch the manylinux Docker
| container to whitelist libomp (the OpenMP library), using
| something like this:
|
|     RUN perl -i -pe \
|         's/"libresolv.so.2"/"libresolv.so.2", "libgomp.so.1"/' \
|         /opt/_internal/pipx/venvs/auditwheel/lib/python3.9/site-packages/auditwheel/policy/manylinux-policy.json
|
| Oh, and if compiling where platform.machine() == "arm64" then I
| need to not add the AVX2 compiler flag.
|
| The non-setup.py packaging systems I've looked at are for
| Python-only code bases. Or, if I understand things correctly,
| I'm supposed to make a new specialized package which implements
| PEP 518, which I can then use to boot-strap my code.
|
| Except, that's still going to use effectively arbitrary code
| during the compilation step (to run the codegen) and still use
| setup.py to build the extension. So it's not like the evil
| disappears.
| [deleted]
| korijn wrote:
| I empathize with your situation and it's a great example. As
| crazy as this may sound, I think you would have to build
| every possible permutation of your library and make all of
| them available on PyPI. You'd need some new mechanism based
| on metadata to represent all the options and figure out how
| to resolve against available system libraries. Especially
| that last part seems very complicated. But I do think it's
| possible.
| orf wrote:
| To be clear, I'm not suggesting we remove the ability to
| compile native extensions.
|
| I'm suggesting we find a better way to build them, something
| a bit more structured, and decouple that specific use case
| from setup.py.
|
| It would be cool to be able to structure this in a way that
| means I can describe what system libraries I may need without
| having to execute setup.py and find out, and express compile
| time flags or options in a structured way.
|
| Think of it like Cargo.toml vs build.rs.
| dalke wrote:
| I agree it would be cool and useful.
|
| But it appears to be such a hard problem that modern
| packaging tools ignore it, preferring to take on other
| challenges instead.
|
| My own attempts at extracting Python configuration
| information to generate a Makefile for personal use
| (because Makefiles understand dependencies better than
| setup.py) are a mess, caused by my failure to understand
| what all the configuration options do.
|
| Given that's the case, when do you think we'll be able to
| "Remove setup.py files and mandate wheels"?
|
| I'm curious what evils you're thinking of. I assume the
| need to run arbitrary Python code just to find metadata is
| one of them. But can't that be resolved with a
| pyproject.toml which uses setuptools only for the build
| backend? So you don't need to remove setup.py, only
| restrict when it's used, yes?
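|
| (I.e., roughly this in pyproject.toml, with setup.py kept only
| as the build script; the requires list here is a sketch and
| varies by project:)
|
|     [build-system]
|     requires = ["setuptools", "wheel", "Cython"]
|     build-backend = "setuptools.build_meta"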
| infogulch wrote:
| The closest thing I've seen to a solution in this space
| is Riff, discussed yesterday [1], which solves the
| external dependency problem for rust projects.
|
| [1]: https://news.ycombinator.com/item?id=32739954
| dec0dedab0de wrote:
| The ability to create a custom package that can run any
| custom code you want at install time is very powerful. I
| think a decent solution would be to have a way to mark a
| package as trusted, and only allow pre/post scripts if they
| are indeed trusted. Maybe even have specific permissions
| that can be granted, but that seems like a ton of work to
| get right across operating systems.
|
| My specific use cases are adding custom CA certs to certifi
| after it is installed, and modifying the maximum version of
| a requirement listed for an abandoned library that works
| fine with a newer version.
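|
| (The certifi tweak is just a file append -- a minimal sketch,
| assuming `CUSTOM_CA_PEM` holds the extra cert as a PEM string:)
|
|     import certifi
|
|     # append a custom CA to the bundle certifi hands out
|     with open(certifi.where(), "a") as bundle:
|         bundle.write("\n" + CUSTOM_CA_PEM)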
|
| I think the best solutions would be an official way to
| ignore dependencies for a specific package, and to specify
| replacement packages in a project's dependencies. Something
| like this, if it were a Pipfile:
|
|     public-package = {version = "~=1.0", replace_with = 'path/to/local-package'}
|     abandoned-package = {version = "~=*", ignore_dependencies = True}
|
| But the specific problem doesn't matter, what matters is
| that there will always be exceptions. This is Python, we're
| all adults here, and we should be able to easily modify
| things to get them to work the way we want them to. Any
| protections added should include a way to be dangerous.
|
| I know your point is more about requiring static metadata
| than using wheels per se. I just believe that all things
| Python should be flexible and hack-able. There are other
| more rigid languages if you're into that sort of thing.
|
| edit:
|
| before anyone starts getting angry I know there are other
| ways to solve the problems I mentioned.
|
| forking/vendoring is a bit of overkill for such a small
| change, and doesn't solve for when a dependency of a
| dependency needs to be modified.
|
| monkeypatching works fine, however it would need to be done
| at all the launch points of the project, and even then if I
| open a repl and import a specific module to try something
| it won't have my modifications.
|
| modifying an installed package at runtime works reasonably
| well, but it can cause a performance hit at launch, and
| while it only needs to be run once, it still needs to be
| run once. So if the first thing you do after recreating a
| virtualenv is to try something with an existing module, we
| have the same problem as monkey patching.
|
| 'just use docker' or maybe the more toned down version:
| 'create a real setup script for developers' are both valid
| solutions, and where I'll probably end up. It was just very
| useful to be able to modify things in a pinch.
| kevin_thibedeau wrote:
| Setup.py can do things wheels can't. Most notably, it's the
| _only_ installation method that can invoke 2to3 at install
| time without requiring a dev to create multiple packages.
| orf wrote:
| It's lucky Python 2 isn't supported anymore then, and
| everyone has had like a decade to run 2to3 once and publish a
| package for Python 3, so that use case becomes meaningless.
| mistrial9 wrote:
| Very unfortunately, the direct burden of Python 2 is placed
| on the packagers... users of Python 2 (me included) like
| their libs and have no horse in this demonization campaign.
| orf wrote:
| Pay for support for Python 2 then? At which point it's a
| burden on the person you are paying.
| coldtea wrote:
| You'd be surprised at how many billions of lines of production
| code are still on 2 (and could not care less whether it's
| end-of-lifed)
| orf wrote:
| I'm not surprised at all, but regardless they also should
| not be similarly surprised if people could not care less
| about that use case.
| ziml77 wrote:
| I tend to just give up on a package if it requires a C
| toolchain to install. Even if I do end up getting things set up
| in a way that the library's build script is happy with, I'll be
| inflicting pain on anyone else who then tries to work with my
| code.
| [deleted]
| tux3 wrote:
| It feels so suboptimal to need the C toolchain to do things,
| but having no solid way to depend on it as a non-C library
| (especially annoying in Rust, which insists on building
| everything from source and never installing libraries
| globally).
|
| I make a tool/library that requires the C toolchain at
| runtime. That's even worse than build time, I need end users
| to have things like lld, objdump, ranlib, etc installed
| anywhere they use it. My options are essentially:
|
| - Requiring users to just figure it out with their system
| package manager
|
| - Building the C toolchain from source at build time and
| statically linking it (so you get to spend an hour or two
| recompiling _all of LLVM_ each time you update or clear your
| package cache! Awesome!),
|
| - Building just LLD/objdump/... at build time (but users still
| need to install LLVM. So you get both slow installs AND have
| to deal with finding a compatible copy of libLLVM),
|
| - Pre-compiling all the C tools and putting them in a storage
| bucket somewhere, for all architectures and all OS versions.
| But then you don't have support for things like the M1 or new
| OS versions right away, or for people on uncommon OSes. And
| now I need to maintain a build machine for all of these myself.
|
| - Pre-compile the whole C toolchain to WASM, build Wasmtime
| from source instead, and just eat the cost of Cranelift
| running LLVM 5-10x slower than natively...
|
| I keep trying to work around the C toolchain, but I still
| can't see any very good solution that doesn't make my users
| have extra problems one way or another.
|
| Hey RiiR evangelism people, anyone want to tackle all of
| LLVM? .. no? No one? :)
| cycomanic wrote:
| I know this is an unpopular opinion on here, but I believe all
| this packaging madness is forced on languages because Windows
| (and to a lesser degree macOS) has essentially no package
| management.
|
| In particular, installing a toolchain to compile C code for
| Python is no issue on Linux, but such a pain on Windows.
| humanrebar wrote:
| C tends to work in those cases because there aren't a
| significant number of interesting C dependencies to add...
| because there is no standard C build system, packaging
| format, or packaging tools.
|
| When juggling as many transitive dependencies in C as folks
| do with node, python, etc., there's plenty of pain to deal
| with.
| progval wrote:
| > For improvements I commented: Remove setup.py files and
| mandate wheels.
|
| This would make most C extensions impossible to install on
| anything other than x86_64-pc-linux-gnu (or arm-linux-
| gnueabihf/aarch64-linux-gnu if you are lucky) because
| developers don't want to bother building wheels for them.
| urschrei wrote:
| cibuildwheel (which is an official, supported tool) has made
| this enormously easier. I test and generate wheels with a
| compiled (Rust! Because of course) extension using a Cython
| bridge for all supported Python versions for 32-bit and
| 64-bit Windows, macOS x86_64 and arm64, and whatever
| manylinux is calling itself this week. No user compilation
| required. It took about half a day to set up, and is
| extremely well documented.
| mathstuf wrote:
| I think it'd make other things impossible too. One project I
| help maintain is C++ and is mainly so. It optionally has
| Python bindings. It also has something like 150 options to
| the build that affect things. There is zero chance of me ever
| attempting to make `setup.py` any kind of sensible "entry
| point" to the build. Instead, the build detects "oh, you want
| a wheel" and _generates_ `setup.py` to just grab what the C++
| build then drops into a place where `build_ext` or whatever
| expects them to be using some fun globs. It also fills in
| "features" or whatever the post-name `[name]` stuff is called
| so you can do some kind of post-build "ok, it has a feature I
| need" inspection.
| korijn wrote:
| ...and ensure _all_ package metadata required to perform
| dependency resolution can be retrieved through an API (in other
| words without downloading wheels).
| orf wrote:
| Yeah, that's sort of what I meant by my suggestion.
| Requirements that can only be resolved by downloading and
| executing code are a huge burden on tooling.
| LukeShu wrote:
| If the package is available as a wheel, you don't need to
| execute code to see what the requirements are; you just
| need to parse the "METADATA" file. However, the only way to
| get the METADATA for a wheel (using PyPA standard APIs,
| anyway) is to download the whole wheel.
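|
| (Parsing it is easy once you have the wheel -- a stdlib-only
| sketch, with a made-up wheel filename:)
|
|     import zipfile
|     from email.parser import Parser
|
|     with zipfile.ZipFile("some_pkg-1.0-py3-none-any.whl") as whl:
|         name = next(n for n in whl.namelist()
|                     if n.endswith(".dist-info/METADATA"))
|         meta = Parser().parsestr(whl.read(name).decode())
|
|     print(meta.get_all("Requires-Dist"))  # the dependency list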
|
| For comparison, pacman (the Arch Linux package manager)
| packages have a fairly similar ".PKGINFO" file in them; but
| in order to support resolving dependencies without
| downloading the packages, the server's repository index
| includes not just a listing of the (name, version) tuple
| for each package, it also includes each package's full
| .PKGINFO.
|
| Enhancing the PyPA "Simple repository API" to allow
| fetching the METADATA independently of the wheel would be a
| relatively simple enhancement that would make a big
| difference.
|
| ----
|
| As I was writing this comment, I discovered that PyPA did
| this; they adopted PEP 658 in March of this year!
| https://github.com/pypa/packaging.python.org/commit/1ebb57b7...
| korijn wrote:
| Yeah. Well, mandating wheels and getting rid of setup.py at
| least avoids having to run scripts, and indeed enables the
| next step which would be indexing all the metadata and
| exposing it through an API. I just thought it wouldn't
| necessarily be obvious to all readers of your comment.
| orf wrote:
| Just to be clear, package metadata already is sort of
| available through the pypi json api. I've got the entire
| set of all package metadata here:
| https://github.com/orf/pypi-data
|
|     $ gzcat release_data/c/d/cdklabs.cdk-hyperledger-fabric-network.json.gz \
|         | jq '. | to_entries | .[].value.info.requires_dist' | head
|     [
|       "typeguard (~=2.13.3)",
|       "publication (>=0.0.3)",
|       "jsii (<2.0.0,>=1.63.2)",
|       "constructs (<11.0.0,>=10.0.5)",
|       "aws-cdk-lib (<3.0.0,>=2.33.0)"
|     ]
|
| It's just that not everything has it, and there isn't a way
| to differentiate between "missing" and "no dependencies".
| And it's also only for the `dist` releases. But anyway,
| poetry uses this information during dependency
| resolution.
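|
| (A minimal sketch of that JSON API lookup in Python --
| `requests` here is just an example package name:)
|
|     import json
|     import urllib.request
|
|     url = "https://pypi.org/pypi/requests/json"
|     with urllib.request.urlopen(url) as resp:
|         info = json.load(resp)["info"]
|
|     # may be None: "missing" and "no dependencies" look the same
|     print(info["requires_dist"])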
| dalke wrote:
| What if I have a dependency on a commercial third-party
| Python package which is on Conda but not on PyPI?
| mistrial9 wrote:
| you are placing open code in a vendor lock-in, to start
| dalke wrote:
| Yes, I understand that.
|
| I see I misunderstood korijn's comment. My earlier reply
| is off-topic, so I won't continue further off the track.
| IceHegel wrote:
| If I see some JS, Go, or Rust code online I know I can probably
| get it running on my machine in less than 5 min. Most of the
| time, it's a 'git clone' and a 'yarn' | 'go install' | 'cargo
| run', and it just works.
|
| With python, it feels like half the time I don't even have the
| right version of python installed, or it's somehow not on the
| right path. And once I actually get to installing dependencies,
| there are often very opaque errors. (The last 2 years on M1 were
| really rough)
|
| Setting up Pytorch or Tensorflow + CUDA is a nightmare I've
| experienced many times.
|
| Having so many ways to manage packages is especially harmful for
| python because many of those writing python are not professional
| software engineers, but academics and researchers. If they write
| something that needs, for example, CUDA 10.2, Python 3.6, and a
| bunch of C audio drivers - good luck getting that code to work in
| less than a week. They aren't writing install scripts, or testing
| their code on different platforms, and the python ecosystem makes
| the whole process worse by providing 15 ways of doing basically
| the same thing.
|
| My proposal:
|
| - Make poetry part of pip
|
| - Make local installation the default (must pass -g for global)
|
| - Provide official first party tooling for starting a new package
|
| - Provide official first party tooling for migrating old
| dependency setups to the new standard
|
| edit: fmt
| bno1 wrote:
| I wish pip had some package deduplication implemented. Even
| some basic local environments have >100MB of dependencies. ML
| environments go into the gigabytes range from what I remember.
| mcdermott wrote:
| Installing packages, creating a manifest of dependencies,
| managing virtual environments, packaging, checking/formatting
| code, etc... should be built into the Python toolchain (the
| python binary itself). Needing to choose a bunch of third-party
| tools to make it work... makes Python, well... un-pythonic.
| siproprio wrote:
| State of the art python packaging must include support for common
| use cases such as conda+machine learning.
|
| It's incredible how even Julia's Pkg.jl supports Python
| packaging in combination with conda better than the official
| Python packaging tools do.
|
| This is very clearly a question of the culture of the core
| Python developers (such as Brett Cannon), who seem to think the
| machine learning people with their compilers and JITs are not
| an important part of the community.
| bzxcvbn wrote:
| I just wish they'd change their name so that my students stop
| snickering. (The name is pronounced like the French word for
| "piss".)
| d0mine wrote:
| Isn't PyPI pronounced like: pie (the food) + pea + eye?
|
| That is different from the French "pipi": pea + pea.
| bzxcvbn wrote:
| Sure. Now tell that to a bunch of immature college juniors.
| If you read the letters "pypi" in French, it sounds exactly
| like "pipi".
|
| And wait until you learn what "bit" sounds like in French.
| di wrote:
| Yep: https://pypi.org/help/#pronunciation
| yewenjie wrote:
| Is there any hope that, even if a consensus package-management
| solution emerges going forward, old packages will be easily
| portable to it?
| black3r wrote:
| Is there a problem with the package format itself though? There
| are lots of serious problems tied to distribution rather than
| the package format, and fixing those would make the experience
| way easier, especially for beginners and people used to other
| package managers...
|
| lacking binary wheels on PyPI, problems with shipping a project
| with its dependencies, confusion about there being multiple
| "package managers" (pip, pip-tools, poetry, pipenv, conda) and
| multiple formats of dependency lists (setup.py, setup.cfg,
| requirements.txt, pyproject.toml, ...), sys.path-associated
| confusion (global packages, user-level packages, and anything
| specified in PYTHONPATH, ...)
| ferdowsi wrote:
| Nothing will improve as long as the Python powers insist on
| packaging being an exercise for the community.
| at_a_remove wrote:
| I have a terrible admission to make: one of the reasons I like
| Python is its huge standard library, and I like that because I
| just ... despise looking for libraries, trying to install them,
| evaluating their fitness, and so on.
|
| I view dependencies outside of the standard library as a kind of
| technical debt, not because I suffer from Not Invented Here and
| want to code it myself, no, I look and think, "Why isn't this in
| the standard library with a working set of idioms around it?"
|
| I haven't developed anything with more than five digits of code
| to it, which is fine for me, but part of it is just ... avoidance
| of having to screw with libraries. Ran into a pip issue I won't
| go into (it requires a lot of justification to see how I got
| there) and just ... slumped over.
|
| This has been a bad spot in Python for a long, long time. While
| people are busy cramming their favorite feature from their last
| language into Python, this sort of thing has languished.
|
| Sadly, I have nothing to offer but encouragement, I don't know
| the complexities of packaging, it seems like a huge topic that
| perhaps nobody really dreamed Python would have to seriously deal
| with twenty years ago.
| itake wrote:
| More packages in the standard library means it can run on fewer
| machines and that more extra junk needs to be installed.
|
| Minimal standard library languages let you pick and choose what
| needs to be run. Golang is a nice happy medium since it's
| compiled.
| samwillis wrote:
| > despise looking for libraries, trying to install them,
| evaluating their fitness, and so on.
|
| This is exactly why I prefer the larger opinionated web
| frameworks (Django, Vue.js) to the smaller more composable
| frameworks (Flask, React). I don't want to make decisions every
| time I need a new feature; I want something that "just works".
|
| Python and Django just work, and brilliantly at that!
| [deleted]
| 5d8767c68926 wrote:
| Currently dealing with Flask, and the endless decision fatigue
| makes me sad. Enormous variations in quality of code,
| documentation, SO answers, etc. And that's before even
| considering the potential for supply-chain attacks.
|
| With Django there is a happy path answer for most everything.
| If I run into a problem, I know I'm not the first.
| manuelabeledo wrote:
| This is one of the reasons why I don't quite like Node. It
| feels like _everything_ is a dependency.
|
| It seems ridiculous to me that there isn't a native method for
| something as simple and ubiquitous as putting a thread to
| sleep, or that there is an external library (underscore) that
| provides 100+ methods that seem to be staples in any modern
| language.
|
| Python is nice in that way. It is also opinionated in a
| cohesive and community driven manner, e.g. PEP8.
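|
| (In Python, by contrast, that one is a stdlib one-liner:)
|
|     import time
|     time.sleep(1.5)  # no third-party package needed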
| mrweasel wrote:
| If requests and a basic web framework were in the standard
| library, you'd effectively eliminate the majority of my
| dependencies.
|
| Honestly, I don't see package management being an issue for
| most end-users. Between the builtin venv, conda, and Docker, I
| feel that the use-cases for most are well covered.
|
| The only focus area I really see is better documentation.
| More precisely: easier-to-read documentation. Perhaps a set of
| templates to help people get started with something like
| pyproject.
|
| It feels like the survey is looking for a specific answer, or
| maybe it's just that surveys are really hard to do. In any case
| I find responses to be mostly: I have no opinion one way or the
| other.
| rpcope1 wrote:
| Something like bottle.py would be an excellent candidate for
| inclusion. The real reason to avoid putting anything into the
| standard library is that it seems to often be the place where
| code goes to stagnate and die for Python.
| at_a_remove wrote:
| I am not sure why that has turned into a truism.
|
| Really good code in the standard library should reach a
| level of near perfection, then eventually transition into
| hopeful speed gains, after which you're really only
| changing that code because the language has changed or the
| specification has updated.
| qayxc wrote:
| > I view dependencies outside of the standard library as a kind
| of technical debt
|
| That's an interesting position. So are you suggesting that very
| specialised packages such as graph plotting, ML-packages, file
| formats, and image processing should be part of the standard
| library? What about very OS/hardware-specific packages, such as
| libraries for microcontrollers?
|
| There are many areas that don't have a common agreed-upon set
| of idioms or functionality and that are way too specialised to
| be useful for most users. I really don't think putting those
| into the standard library would be a good idea.
| at_a_remove wrote:
| Hrm. Graph-plotting ... yes. File formats ... yes, as many as
| possible. Image processing, given the success of ImageMagick,
| I'd say yes there as well. I don't know enough about ML to say.
|
| OS-specific packages, quite possibly.
|
| The thing about the standard library is that it is like high
| school: there's a lot of stuff you think you will never need,
| and you're right about _most_ of it, but for the stuff you do
| need, you're glad you had something going, at least.
| qayxc wrote:
| ImageMagick is actually a good example: I use Python as my
| primary tool for shell scripting (I don't like
| "traditional" shell scripts for various reasons) - if I can
| use Python to control external tools such as ImageMagick,
| why would I want to include all its functionality, codecs,
| effects, etc. in the standard library?
|
| Including too much leads to a huge burden for the
| maintainers and consequently results in this:
| https://peps.python.org/pep-0594/
|
| Quote:
|
| > Times have changed. With the introduction of PyPI (nee
| Cheeseshop), setuptools, and later pip, it became simple
| and straightforward to download and install packages.
| Nowadays Python has a rich and vibrant ecosystem of third-
| party packages. It's pretty much standard to either install
| packages from PyPI or use one of the many Python or Linux
| distributions.
|
| > On the other hand, Python's standard library is piling up
| with cruft, unnecessary duplication of functionality, and
| dispensable features.
| shrimpx wrote:
| I like poetry for its simplicity but I can't tell how "official"
| it is in the python ecosystem. I hope it doesn't die out. I think
| it's the simplest possible way to maintain deps and publish to
| PyPI if you don't have any weird edge cases.
| wlkr wrote:
| Couldn't agree more. Poetry is fantastic and provides that
| 'just works' experience for most cases. It's not official
| (although possibly should be adopted) but has gained ground by
| virtue of its quality. Fortunately it's very actively developed
| so will hopefully stick around.
| tempest_ wrote:
| I like poetry. It still has a way to go though since it is
| slow as all hell doing almost anything and its error messages
| are closer to a stack trace than something actionable.
| shrimpx wrote:
| I agree that the stack trace error messages are weird. That
| aspect feels uncharacteristically hacky for an otherwise
| pretty polished tool.
| IceHegel wrote:
| It should be made the official package manager, IMO.
| fsniper wrote:
| Python lacks the flexible universal binary distribution solution
| which nearly all of the newcomers have. Consider Golang, Rust,
| or Docker images. For now, Docker image distribution is probably
| the only available solution, and volume management is the worst
| problem on that front.
| black3r wrote:
| I just wish that PyPI would enforce binary wheels going forward
| (at least for linux x64/arm64 for people who use Docker, but
| ideally for all common platforms). They already supply the
| cibuildwheel tool to automate their builds, so it shouldn't be
| that hard for library developers...
|
| Software developers shouldn't need to figure out what build-time
| dependencies their libraries need...
| JonathonW wrote:
| I haven't had too much trouble with packages missing binary
| wheels lately. Occasionally Pip doesn't find them, which looks
| the same as if the binary wheel were missing entirely (looking
| at you, Anaconda-- update your Pip already), but they're
| usually there.
|
| But I'm usually on Windows if I need binary wheels; maybe the
| coverage is a bit different on Linux.
| verst wrote:
| Try installing grpcio-tools on Darwin/arm64 with Python 3.10.
| More often than not I run into problems where low level
| headers required by some cryptography libraries cannot be
| found and as a result compilation fails.
| dalke wrote:
| Should PyPI kick my project off because it doesn't support MS
| Windows?
|
| My package uses C and Cython extensions. While I support macOS
| and Linux-based OSes, I don't know how to develop on or support
| MS Windows.
|
| I've tried to be careful about the ILP32 vs LP64 differences,
| but I suspect there are going to be many places where I messed
| up.
|
| I also use "/dev/stdin" to work-around my use of a third-party
| library that has no way to read from stdin. As far as I can
| tell, there's no equivalent in Windows, so I'll have to raise
| an exception for that case, and modify my tests cases.
| ciupicri wrote:
| Can't you use "CON" instead of "/dev/stdin" on Windows?
| dalke wrote:
| I asked that question nearly 11 years ago on StackOverflow, at
| https://stackoverflow.com/questions/7395157/windows-equivale...
| ;)
|
| Quoting the best comment, "echo test | type CON or echo
| test | type CONIN$ will read from the console, not from
| stdin."
| Kim_Bruning wrote:
| XKCD's opinion:
|
| https://xkcd.com/1987/
| 7373737373 wrote:
| And that's just installing packages; _creating_ packages is
| another hellscape altogether.
| YetAnotherNick wrote:
| I wish there were some package manager in the middle between
| conda and pip. Conda is too strict and often gets stuck in SAT
| solving. pip doesn't even ask when reinstalling a version
| currently being used.
|
| Edit: Typo: reinstalling a version of package currently being
| used
| woodruffw wrote:
| > pip doesn't even ask when reinstalling a version currently
| being used.
|
| Just as an explanation: a "version" in Python packaging can
| come from one of many potential distributions, including a
| local distribution (such as a path on disk) that might
| differ from a canonical released distribution on PyPI.
| Having `pip install ...` always re-install based on its
| candidate selection rules is _generally_ good (IMO), since an
| explicit run of `pip install` implies user intent to search for
| a potentially new or changed distribution.
| YetAnotherNick wrote:
| I meant this for a dependency, not the package I am installing.
| woodruffw wrote:
| I'm not sure I understand what you mean -- `pip` should not
| be reinstalling transitive dependencies. If you install A
| and B and both depend on C, C should only be installed
| once.
| Izkata wrote:
| I think they're referring to requirements files. I've
| seen the same behavior - on day 1, pip installs packages
| A and B, then on day 2 when someone else modified the
| requirements file it installs C and reinstalls B even
| though B hasn't changed.
|
| This one I've seen when the dependency doesn't specify an
| exact version and you included "--upgrade".
|
| There's a second case that I think was fixed in pip 21 or
| 22, where two transitive dependencies overlap - A and B
| depend on C, but with different version ranges. If A
| allows a newer version of C than B allows, C can get
| installed twice.
| 5d8767c68926 wrote:
| My ask would be to get rid of the need for conda altogether.
|
| Conda obviously offers a lot of value in sharing hairy compiled
| packages, but it does not play well with anything else. None of
| the available tooling really works with both conda and pip. It
| fragments the already lousy packaging story.
| lmeyerov wrote:
| we find mamba solves deps solving for conda (we fail to do GPU
| dependencies without it), and I think it's getting integrated
|
| my main thing w/ conda is it's bananas figuring out how to make
| a new recipe, which is pretty surprising
| kylebarron wrote:
| Agreed, I've found packaging for conda to be so much harder
| than packaging for pip
| thefinaluser wrote:
| Try poetry. It wraps pip and fixes a lot of its issues
| lightspot21 wrote:
| Seconding Poetry. IMO it should have been the standard
| package manager - it just works (TM)
| Smaug123 wrote:
| Although when I tried to use it to `poetry install`
| stable-diffusion, it consumed 70GB of RAM before I killed it.
| dwagnerkc wrote:
| Try using mamba (https://github.com/mamba-org/mamba)
|
| We ran into many unsolvable or 30m+ solvable envs with conda
| that mamba handled quickly.
|
| The underlying solver can be used with conda directly as well,
| but I have not done that
| (https://www.anaconda.com/blog/a-faster-conda-for-a-growing-c...)
| ryan29 wrote:
| I didn't take the survey because I've never packaged anything for
| PyPI, but I wish all of the package managers would have an option
| for domain validated namespaces.
|
| If I own example.com, I should be able to have
| 'pypi.org/example.com/package'. The domain can be tied back to my
| (domain verified) GitHub profile and it opens up the possibility
| of using something like 'example.com/.well-known/pypi/' for self-
| managed signing keys, etc..
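|
| (A hypothetical sketch of that .well-known layout -- none of
| this is a real standard today:)
|
|     https://example.com/.well-known/pypi/
|         keys.json        # self-managed signing keys
|         packages.json    # packages claimed under this namespace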
|
| I could be using the same namespace for every package manager in
| existence if domain validated namespaces were common.
|
| Then, in my perfect world, something like Sigstore could support
| code signing with domain validated identities. Domain validated
| signatures make a lot of sense. Domains are relatively
| inexpensive, inexhaustible, and globally unique.
|
| For code signing, I recognize a lot of project names and
| developer handles while knowing zero real names for the companies
| / developers involved. If those were sitting under a recognizable
| organizational domain name (example.com/ryan29) I can do a
| significantly better job of judging something's trustworthiness
| than if it's attributed to 'Ryan Smith Inc.', right?
| simonw wrote:
| That's a really interesting idea, but I worry about what
| happens when a domain name expires and is re-registered
| (potentially even maliciously) by someone else.
| ryan29 wrote:
| I think you'd probably need some buy in from the domain
| registries and ICANN to make it really solid. Ideally,
| domains would have something similar to public certificate
| transparency logs where domain expirations would be recorded.
| I even think it would be reasonable to log registrant changes
| (legal registrant, not contact info). In both cases, it
| wouldn't need to include any identifiable info, just a simple
| expired/ownership changed trigger so others would know they
| need to revalidate related identities.
|
| I don't know if registries would play ball with something
| like that, but it would be useful and should probably exist
| anyway. I would even argue that once a domain rolls through
| grace, redemption, etc. and gets dropped / re-registered,
| that should invalidate it as an account recovery method
| everywhere it's in use.
|
| There's a bit of complexity when it comes to the actual
| validation because of stuff like that. I think you'd need buy
| in from at least one large company that could do the actual
| verification and attest to interested parties via something
| like OAuth. Think along the lines of "verify your domain by
| logging in with GitHub" and at GitHub an organization owner
| that's validated their domain would be allowed to grant OAuth
| permission to read the verified domain name.
| dane-pgp wrote:
| You've already talked about Sigstore (which is an excellent
| technology for this space), so we can consider developers
| holding keys that are stored in an append-only log. Then it
| doesn't matter if the domain expires and someone re-
| registers it, since they don't have the developer's private
| keys.
|
| Of course there are going to be complexities involving key-
| rollover and migrating to a different domain, but a
| sufficiently intelligent Sigstore client could handle the
| various messages and cryptographic proofs needed to secure
| that. The hard part is how to issue a new key if you lose
| the old one, since that probably requires social vouching
| and a reputation system.
|
| [0] https://docs.sigstore.dev/
| jacques_chester wrote:
| > _Then it doesn 't matter if the domain expires and
| someone re-registers it, since they don't have the
| developer's private keys._
|
| A principal reason to use sigstore is to get out of the
| business of handling private keys entirely. It turns a
| key management problem into an identity problem, the
| latter being much easier to solve at scale.
| ryan29 wrote:
| > Then it doesn't matter if the domain expires and
| someone re-registers it, since they don't have the
| developer's private keys.
|
| That's a good point in terms of invalidation, but a new
| domain registrant should be able to claim the namespace
| and start using it.
|
| I think one possible solution to that would be to assume
| namespaces can have their ownership changed and build
| something that works with that assumption.
|
| Think along the lines of having 'pypi.org/example.com' be
| a redirect to an immutable organization;
| 'pypi.org/abcd1234'. If a new domain owner wants to take
| over the namespace they won't have access to the existing
| account and re-validating to take ownership would force
| them to use a different immutable organization;
| 'pypi.org/ef567890'.
|
| If you have a package locking system (like NPM), it would
| lock to the immutable organization and any updates that
| resolve to a new organization could throw a warning and
| require explicit approval. If you think of it like an
| organization lock:
|
|     v1: pypi.org/example.com --> pypi.org/abcd1234
|
|     v2: pypi.org/example.com --> pypi.org/ef123456
|
| If you go from v1 to v2 you _know_ there was an ownership
| change or, at the very least, an event that you need to
| investigate.
|
| Losing control of a domain would be recoverable because
| existing artifacts wouldn't be impacted and you could use
| the immutable organization to publish the change since
| that's technically the source of truth for the artifacts.
| Put another way, the immutable organization has a pointer
| back the current domain validated namespace:
|
|     v1: pypi.org/abcd1234 --> example.com
|
|     v2: pypi.org/abcd1234 --> example.net
|
| If you go from v1 to v2 you _know_ the owner of the
| artifacts you want has moved from the domain example.com
| to example.net. The package manager could give a warning
| about this and let an artifact consumer approve it, but
| it 's less risky than the change above because the owner
| of 'abcd1234' hasn't changed and you're already trusting
| them.
|
| I think that's a reasonably effective way of solving
| attacks that rely on registering expired domains to take
| over a namespace and it also makes it fairly trivial for
| namespace owners to point artifact consumers to a new
| domain if needed.
|
| Think of the validated domain as more of a vanity pointer
| than an actual artifact repository. In fact, thinking
| about it like that, you don't actually need any
| cooperation or buy in from the domain registries.
|
| > The hard part is how to issue a new key if you lose the
| old one, since that probably requires social vouching and
| a reputation system.
|
| It's actually really hard because as you increase the
| value of a key, I think you decrease the security
| practices around handling them. For example, some people
| will simply drop their keys into OneDrive if there's any
| inconvenience associated with losing them.
|
| I would really like to have something where I can use a
| key generated on a tamper proof device like a YubiKey and
| not have to worry about losing it. Ideally, I could
| register a new key without any friction.
| westurner wrote:
| CT: Certificate Transparency logs log creation and revocation
| events.
|
| The Google/trillian database which supports Google's CT logs
| uses Merkle trees but stores the records in a centralized
| data store - meaning there's _at least one_ SPOF (Single Point
| of Failure) which one party has root on and sole backup
| privileges for.
|
| Keybase, for example, stores their root keys - at least - in
| a distributed, redundantly-backed-up blockchain that nobody
| has root on; and key creation and revocation events are
| publicly logged similarly to what are now called "CT logs".
|
| You can link your Keybase identity with your other online
| identities by proving control by posting a cryptographic
| proof; thus adding an edge to a WoT Web of Trust.
|
| While you can add DNS record types like CERT, OPENPGPKEY,
| SSHFP, CAA, RRSIG, NSEC3; DNSSEC and DoH/DoT/DoQ cannot be
| considered to be universally deployed across all TLDs.
| Should/do e.g. ACME DNS challenges fail when a TLD doesn't
| support DNSSEC, or hasn't secured root nameservers to a
| sufficient baseline, or? DNS is not a trustless system.
|
| ENS (the Ethereum Name Service) is a trustless system. Reading
| ENS records does not cost ENS clients any
| gas/particles/opcodes/ops/money.
|
| Blockcerts is designed to issue any sort of credential, and
| allow for signing of any RDF graph like JSON-LD.
|
| List_of_DNS_record_types:
| https://en.wikipedia.org/wiki/List_of_DNS_record_types
|
| Blockcerts: https://www.blockcerts.org/
| https://github.com/blockchain-certificates :
|
| > _Blockcerts is an open standard for creating, issuing,
| viewing, and verifying blockchain-based certificates_
|
| W3C VC-DATA-MODEL: https://w3c.github.io/vc-data-model/ :
|
| > _Credentials are a part of our daily lives; driver's
| licenses are used to assert that we are capable of operating
| a motor vehicle, university degrees can be used to assert our
| level of education, and government-issued passports enable us
| to travel between countries. This specification provides a
| mechanism to express these sorts of credentials on the Web in
| a way that is cryptographically secure, privacy respecting,
| and machine-verifiable_
|
| W3C VC-DATA-INTEGRITY: "Verifiable Credential Data Integrity
| 1.0" https://w3c.github.io/vc-data-integrity/#introduction :
|
| > _This specification describes mechanisms for ensuring the
| authenticity and integrity of Verifiable Credentials and
| similar types of constrained digital documents using
| cryptography, especially through the use of digital
| signatures and related mathematical proofs. Cryptographic
| proofs enable functionality that is useful to implementors of
| distributed systems. For example, proofs can be used to: Make
| statements that can be shared without loss of trust,_
|
| W3C _TR_ DID (Decentralized Identifiers)
| https://www.w3.org/TR/did-core/ :
|
| > _Decentralized identifiers (DIDs) are a new type of
| identifier that enables verifiable, decentralized digital
| identity. A DID refers to any subject (e.g., a person,
| organization, thing, data model, abstract entity, etc.) as
| determined by the controller of the DID. In contrast to
| typical, federated identifiers, DIDs have been designed so
| that they may be decoupled from centralized registries,
| identity providers, and certificate authorities.
| Specifically, while other parties might be used to help
| enable the discovery of information related to a DID, the
| design enables the controller of a DID to prove control over
| it without requiring permission from any other party. DIDs
| are URIs that associate a DID subject with a DID document
| allowing trustable interactions associated with that
| subject._
|
| > _Each DID document can express cryptographic material,
| verification methods, or services, which provide a set of
| mechanisms enabling a DID controller to prove control of the
| DID. Services enable trusted interactions associated with the
| DID subject. A DID might provide the means to return the DID
| subject itself, if the DID subject is an information resource
| such as a data model._
| jacques_chester wrote:
| Sigstore uses Trillian for its transparency log, Rekor.
| dane-pgp wrote:
| For another example of how Ethereum might be useful for
| certificate transparency, there's a fascinating paper from
| 2016 called "EthIKS: Using Ethereum to audit a CONIKS key
| transparency log" which is probably way ahead of its time.
|
| Abstract:
| https://link.springer.com/chapter/10.1007/978-3-662-53357-4_...
|
| PDF: https://jbonneau.com/doc/B16b-BITCOIN-ethiks.pdf
| rhyselsmore wrote:
| Needs compensating controls to get it right.
|
| * Dependencies are managed in a similar way to Go - where
| hashes of installed packages are stored and compared client
| side. This means that a hijacker could only serve up the
| valid versions of packages that I've already installed.
|
| * This is still a "centralized" model where a certain level
| of trust is placed in PyPI - a mode of operation where the
| "fingerprint" of the TLS key is validated would assist here.
| However it comes with a few constraints.
|
| Of course the above still comes with the caveat that you have
| to trust PyPI. I'm not saying that this is an unreasonable
| ask. It's just how it is.
| jacques_chester wrote:
| Maven Central requires validation of a domain name in order to
| use a reverse-domain package[0].
|
| It's not without problems. One is that folks often don't
| control the domain (consider Go's charming habit of conflating
| version control with package namespacing). Another is what was
| noted below: resurrection attacks on domains can be quite
| trivial and already happen in other forms (eg registering
| lapsed domains for user accounts and performing a reset).
|
| [0] https://central.sonatype.org/faq/how-to-set-txt-record/
| shireboy wrote:
| I've been tinkering with stable diffusion lately and this has
| been a rude introduction to python. Coming from .net (nuget) and
| JavaScript (npm), it's baffling that there isn't an established
| solution for python. It looks to me like people are trying, but
| different libraries use different techniques. To a newcomer this
| is confusing.
| timtom39 wrote:
| ML/AI is the deep end of python dependencies. Lots of hardware
| specific requirements (e.g. CUDA, cuDNN, AVX2 TensorFlow
| binaries, etc). A typical python web application is a lot
| simpler.
| antod wrote:
| _> I've been tinkering with stable diffusion lately and this
| has been a rude introduction to python. Coming from .net
| (nuget) and JavaScript (npm), it's baffling that there isn't an
| established solution for python._
|
| Python has had multiple legacy solutions going back a long time
| before nuget and npm existed, and before central registries of
| dependencies. Every new solution has to cope with all that
| compatibility/transitional baggage. Also a bunch of use cases
| .NET or JS never really had to deal much with - eg being a core
| system language for Linux distros, and supporting cross-
| platform installs back in the download-and-run-something days.
| The scope of areas Python gets used in means its packaging is
| pulled in more directions than most other languages, which
| mostly stick to a main niche.
|
| So the history and surface area of problems to solve in Python
| packaging is larger than what most other languages have had to
| deal with. It also takes years for the many 3rd party tools to
| try out new approaches, gain traction and then slowly get their
| best ideas synthesized and adapted into the much more
| conservative core Python stdlib.
|
| Not saying it is great, just laying out some of the reasons it
| is what it is.
| zbentley wrote:
| I imagine many of you have feedback that could be useful to folks
| making decisions about the future of Python packaging, a common
| subject of complaint in many discussions here.
|
| Remember not to just complain, but to offer specific
| problems/solutions--i.e. avoid statements like "virtualenvs suck,
| why can't it be like NPM?" and prefer instead feedback like "the
| difference between Python interpreter version and what virtualenv
| is being used causes confusion".
| Kwpolska wrote:
| "virtualenvs suck, why can't it be like NPM?" is a specific
| problem and a specific solution. The problem being having to
| manage venvs (which have many gotchas and pitfalls, and no
| standardization), and the solution is to replace those with
| packages being installed into the project folder with
| standardized and well-known tools.
| qbasic_forever wrote:
| Keep an eye on https://peps.python.org/pep-0582/ it's a
| proposal to add local directory/node_modules-like behavior to
| package installs. It stalled out a few years ago but I heard
| there is a lot more discussion and push to get it in now.
|
| I think if this PEP makes it in then like 90% of people's
| pain with pip just completely goes away almost overnight.
| Love it or hate it the NPM/node_modules style of all
| dependencies dumped in a local directory solves a _ton_ of
| problems in the packaging world. It would go a long way
| towards making the experience much smoother for most python
| users.
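|
| (Roughly, PEP 582 would have the interpreter pick up packages
| from a local `__pypackages__` directory instead of a venv -- a
| simplified sketch of the proposed layout:)
|
|     myproject/
|         myscript.py
|         __pypackages__/
|             3.10/
|                 lib/
|                     requests/   # found before site-packages
|                                 # when running myscript.py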
___________________________________________________________________
(page generated 2022-09-07 23:00 UTC)