[HN Gopher] The return of lazy imports for Python
___________________________________________________________________
The return of lazy imports for Python
Author : mfiguiere
Score : 49 points
Date : 2022-12-25 17:18 UTC (5 hours ago)
(HTM) web link (lwn.net)
(TXT) w3m dump (lwn.net)
| bfung wrote:
| The only arguments PEP 690 has are A) [performance on startup] or
| B) [when imported functions are not used] in the main body of the
| code.
|
| For B), easy enough to run one of many linters to detect this
| case and have people write less bad code.
|
| A) is way more subjective and can be fixed in many ways.
|
| With so many more Python coders these days having less coding
| experience, my personal feeling is: please stop throwing in
| production-issue-causing features that I then have to fix. Glad
| the PEP is rejected.
|
| Old programmer wisdom is to load all your configs and assumptions
| as early as possible to eliminate a whole space of problems with
| your code, making it faster and easier to read/reason about later.
| rrdharan wrote:
| I'm curious about the other ways to fix startup performance.
|
| I've seen a moderately sized (~300k LoC) Python CLI project
| that had a horrendous, anger-inducing startup time until they
| switched to the lazy import approach basically
| described/standardized by PEP 690 and the improvement was
| massive.
| m000 wrote:
| Not a word though about the elephant in the room: circular
| imports.
|
| It is absurdly easy in Python to end up with a circular import
| situation where no real circular dependency exists. E.g. you
| can't have A.a1 -> B.b1 and B.b2 -> A.a2. So, you are forced to
| lay out your code in some quite awkward ways.
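|
| A minimal sketch of the A.a1 -> B.b1 / B.b2 -> A.a2 case above,
| using throwaway modules written to a temporary directory. All
| module names (eager_a, deferred_a, and so on) are hypothetical.
| The eager pair fails at import time even though no function
| actually depends on itself; moving the imports into the functions
| that need them is one common workaround.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Top-level "from X import Y" in both modules: a cycle at import time.
EAGER = {
    "eager_a": """
        from eager_b import b1   # starts running eager_b before a2 exists

        def a1():
            return "a1 -> " + b1()

        def a2():
            return "a2"
    """,
    "eager_b": """
        from eager_a import a2   # eager_a is only partially initialised here

        def b1():
            return "b1"

        def b2():
            return "b2 -> " + a2()
    """,
}

# Same functions, but each import is deferred until the function is called.
DEFERRED = {
    "deferred_a": """
        def a1():
            from deferred_b import b1
            return "a1 -> " + b1()

        def a2():
            return "a2"
    """,
    "deferred_b": """
        def b1():
            return "b1"

        def b2():
            from deferred_a import a2
            return "b2 -> " + a2()
    """,
}


def write_modules(directory, sources):
    """Write each source string to <directory>/<name>.py."""
    for name, body in sources.items():
        Path(directory, name + ".py").write_text(textwrap.dedent(body))


workdir = tempfile.mkdtemp()
write_modules(workdir, {**EAGER, **DEFERRED})
sys.path.insert(0, workdir)
importlib.invalidate_caches()

try:
    import eager_a
    eager_failed = False
except ImportError as exc:
    eager_failed = True
    print("eager import failed:", exc)

import deferred_a
import deferred_b

print(deferred_a.a1())
print(deferred_b.b2())
```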
| overgard wrote:
| The only way to really fix that is to disallow code running
| outside of functions, i.e., basically a whole new language.
| capableweb wrote:
| Isn't having circular dependencies more awkward? Conceptually,
| it makes things more intertwined when instead you can build a
| better and more separated architecture.
| marius_k wrote:
| Is there an elegant way to import type hints without circular
| imports?
| [deleted]
| lopz wrote:
| Something like this does the trick:
|
| from typing import TYPE_CHECKING
|
| if TYPE_CHECKING:
|     import WhateverClass
|
| https://docs.python.org/3/library/typing.html#typing.TYPE_C
| H...
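|
| A slightly fuller sketch of the same idea, assuming the
| type-only dependency is fractions.Fraction (chosen only because
| it ships with Python; any class would do). The annotations are
| quoted so they are never evaluated at runtime, which is what lets
| the import stay inside the TYPE_CHECKING branch.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen by static type checkers only. TYPE_CHECKING is False at
    # runtime, so this import never executes and cannot join a cycle.
    from fractions import Fraction


def halve(value: "Fraction") -> "Fraction":
    # The quoted annotations are stored as plain strings.
    return value / 2


print(halve(4))
print(halve.__annotations__)
```

| The cost is the one nomel describes below: the name Fraction
| exists for the type checker but not for the running program.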
| nomel wrote:
| I'm a fan of having a single state, used for everything.
| Splitting the code up into two states, one for the linter
| and one for the execution, seems like a recipe for
| incorrectness and confusion. I would hate to refactor
| something like that.
| closed wrote:
| The issue is that sometimes a function can take a type
| that is an optional dependency, so you don't want to
| import it unless you are type checking.
|
| (And some types are defined in the typeshed so only exist
| to be imported during type checking; eg the type checker
| lib itself is a dependency in this case)
| [deleted]
| nijave wrote:
| Lazy imports don't really seem that useful. The only time I've
| found them useful (in a Ruby project) was for unit tests/local
| development where only a small subset of the application is
| loaded at a time. Anything long running you generally want the
| predictability of loading everything up front. For command line
| utilities, it seems like you're going to need to load the module
| at some point or another regardless (if you're actually using it)
| so I'm not sure how you'd see a gain unless there's some
| async/multi thread hacks.
| [deleted]
| dalke wrote:
| Some package developers don't want their users to have the two
| step process of "import" then "use." NumPy imports 137 modules
| with "import numpy", of which 94 are specifically in the NumPy
| hierarchy:
|
|     >>> import sys
|     >>> len(sys.modules)
|     83
|     >>> import numpy as np
|     >>> len(sys.modules)
|     220
|     >>> sum(1 for k in sys.modules if "numpy" in k)
|     94
|
| so people can write a one-liner like:
|
|     >>> np.polynomial.chebyshev.Chebyshev([0,1,3])(np.linspace(-1.0, 1.0, 5))
|     array([ 2., -2., -3., -1., 4.])
|
| without having to import np.polynomial.chebyshev.Chebyshev
| first.
|
| This API design requires importing most of NumPy at startup,
| which has a cost they didn't consider so important because
| their users are primarily doing long-term computing and
| notebook-style development, where startup cost is relatively
| small.
|
| I've complained about this because I live in the short-lived
| program world, where it's annoying to have a 0.1 second import
| overhead if I only need one function from NumPy:
|
|     py310% time python -c 'pass'
|     0.025u 0.006s 0:00.03 66.6% 0+0k 0+0io 0pf+0w
|     py310% time python -c 'import numpy'
|     0.142u 0.292s 0:00.14 307.1% 0+0k 0+0io 0pf+0w
|
| As I understand it, SciPy wants a similar API design goal, but
| has a lot more packages. They've developed lazy imports to try
| to have the best of both worlds.
|
| > For command line utilities, it seems like you're going to
| need to load the module at some point or another regardless (if
| you're actually using it)
|
| Thing is, you might not actually use it. If the command-line
| tool uses subcommands, each different subcommand might need
| only a subset of the full set of packages.
|
| Perhaps only one of the subcommands uses NumPy, while for 95%
| of uses, NumPy isn't used at all.
|
| As the discussion for this feature points out, this can be
| addressed by only importing when needed. (One of the reasons
| I've started using click over argparse is click does more of
| this separation for me.) However, it's somewhat fragile, in
| that it's easy to add a rarely-needed expensive import at
| top-level without noticing it, and requires some non-standard
| tooling to detect issues, like the non-predictability you
| mentioned.
|
| I personally want something like the lazy-/auto- importer in my
| package, so I can reduce the two step process. My last released
| package used module-level __getattr__ functions, which gets me
| mostly there, except for notebook auto-completion of the lazy
| wrappers. (It works in the command-line shell though.)
|
| I can't import everything on startup because parts of my
| package depend on third-party packages, which might not be
| installed. I instead want to raise an ImportError when those
| lazy objects are accessed. Plus, one of the third-party
| packages is through a Python/Java bridge, which has its own
| startup costs that I want to avoid.
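|
| A rough sketch of the module-level attribute hook mentioned
| above (PEP 562), not dalke's actual code. In a real package the
| two functions would live in __init__.py as __getattr__ and
| __dir__; here they are attached to a synthetic module so the
| example fits in one file. The names mypkg, jsonlib, javabridge and
| hypothetical_missing_java_bridge are all made up for
| illustration.

```python
import importlib
import types

# Hypothetical table: public attribute name -> module that provides it.
_LAZY_SUBMODULES = {
    "jsonlib": "json",
    "javabridge": "hypothetical_missing_java_bridge",
}

mypkg = types.ModuleType("mypkg")


def _module_getattr(name):
    # Called only when normal attribute lookup on the module fails.
    if name not in _LAZY_SUBMODULES:
        raise AttributeError(f"module 'mypkg' has no attribute {name!r}")
    # May raise ImportError if the optional dependency is not installed.
    module = importlib.import_module(_LAZY_SUBMODULES[name])
    setattr(mypkg, name, module)  # cache so the hook is not hit again
    return module


def _module_dir():
    # Make dir(mypkg) list the lazy names.
    return sorted(_LAZY_SUBMODULES)


mypkg.__getattr__ = _module_getattr
mypkg.__dir__ = _module_dir

print(mypkg.jsonlib.dumps({"lazy": True}))  # json is imported here, on first use

try:
    mypkg.javabridge
except ImportError as exc:
    print("optional dependency missing:", exc)
```

| The ImportError surfaces only when the lazy attribute is
| touched, which matches the behaviour asked for above.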
| slaymaker1907 wrote:
| You could do the numpy style API lazily. They would just need
| to expose each API as an object that does the imports dynamically.
| klyrs wrote:
| In really large projects (e.g. SciPy as mentioned in the
| article), lazy imports make sense. Especially with the
| popularity of decorators, importing a file without any apparent
| module-level code will actually need to run a nontrivial amount
| of code. Multiply that by a few thousand files in a library
| with a tree of "import * from ..." and you're looking at
| perhaps seconds of startup time. Lazy importing can short-
| circuit that, but still make symbols available for ease of use.
| kelsolaar wrote:
| Numpy and Matplotlib were quite slow to import also, I
| haven't timed them recently though.
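|
| For anyone wanting to time imports themselves, CPython 3.7+ has
| an -X importtime option that prints a per-module timing table
| to stderr. A small sketch that runs it in a fresh interpreter;
| the helper name import_time_report is hypothetical, and json is
| only a stand-in for a heavier package such as numpy.

```python
import subprocess
import sys


def import_time_report(module_name):
    """Return the -X importtime table for importing module_name."""
    result = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", f"import {module_name}"],
        capture_output=True,
        text=True,
        check=True,
    )
    # The timing table is written to stderr, one line per imported module.
    return result.stderr


report = import_time_report("json")
# Print the last few lines, which include the requested module itself.
print("\n".join(report.splitlines()[-5:]))
```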
| isitmadeofglass wrote:
| Might not grasp the full context here, but it's trivial to lazily
| import modules in your own code. I know every beginner's guide
| will advise you not to do that, but that's just because it's an
| easy footgun for new programmers. If you have some cli tool that
| only needs scipy for certain sub commands you can just move it to
| those subcommand calls so it's loaded when needed instead of up
| front.
| Waterluvian wrote:
| As long as you eagerly check if it exists. Don't wait until
| part way through your program to discover dependency issues.
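|
| One way to combine both suggestions, sketched under the
| assumption that the optional dependency is a top-level module:
| importlib.util.find_spec reports whether the module can be
| located without executing it, and the real import stays inside
| the subcommand. The statistics module stands in for scipy here,
| and is_available and cmd_mean are hypothetical names.

```python
import importlib.util


def is_available(module_name):
    """Report whether a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def cmd_mean(values):
    import statistics  # deferred: only paid for when this subcommand runs

    return statistics.mean(values)


# Checked once at startup, before any real work begins.
REQUIRED = ["statistics"]
missing = [name for name in REQUIRED if not is_available(name)]
if missing:
    raise SystemExit("missing dependencies: " + ", ".join(missing))

print(cmd_mean([1, 2, 3]))
```

| Note that find_spec only proves the module can be located; an
| import that fails while executing would still surface later.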
| scott_w wrote:
| There is a downside: manually doing so means your import occurs
| every time you call that function. This would avoid that by
| only importing once lazily.
| ledauphin wrote:
| it's an extra function call, but module imports are cached so
| you're not incurring the actual import cost.
| T-A wrote:
| You can say
|
| mylib = None
|
| in the global scope and then
|
| global mylib
|
| if mylib is None: import mylib
|
| in your function to avoid the extra function call.
| bobbylarrybobby wrote:
| Isn't `is` a function? A cheap function, but still a
| python function
| coredog64 wrote:
| I usually only import argparse inside my 'if __name__ ==
| "__main__"' stanza.
| dalke wrote:
| That's not really the issue with argparse and subcommands.
|
| argparse with subcommands generally requires specifying all
| of the options for all of the subcommands, even if you only
| want one subcommand.
|
| These in turn may require importing subcommand-specific
| modules, to handle things like the right 'type' handler in an
| add_argument() parameter. This callback function might,
| depending on the input value, select one from a dozen
| different additional packages.
|
| It's possible to avoid this, by deferring argument->type
| processing until later, and having a single large module
| containing all of the help strings and epilogs, though this
| will separate your argparse code from your subcommand code,
| and in general make things more complicated. I did this for a
| while.
|
| Alternatively, you can create your own subcommand dispatch
| system using an nargs="?" to get the subcommand and an
| nargs=argparse.REMAINDER to capture the rest of the flags, to
| pass to a new ArgumentParser, and develop a top-level --help
| replacement. I tried this too.
|
| I've since decided to use click, which does a better job at
| compartmentalizing at least this level of subcommand imports.
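|
| A minimal sketch of the hand-rolled dispatch described above,
| not dalke's actual code: the top-level parser takes the
| subcommand with nargs="?" and everything else with
| argparse.REMAINDER, then hands the remainder to a parser built
| inside the chosen handler. The tool name, the mean and join
| subcommands and their options are invented for illustration,
| and the top-level --help replacement is left out.

```python
import argparse


def run_mean(argv):
    import statistics  # imported only when this subcommand is chosen

    parser = argparse.ArgumentParser(prog="tool mean")
    parser.add_argument("values", type=float, nargs="+")
    args = parser.parse_args(argv)
    return statistics.mean(args.values)


def run_join(argv):
    parser = argparse.ArgumentParser(prog="tool join")
    parser.add_argument("--sep", default=" ")
    parser.add_argument("words", nargs="*")
    args = parser.parse_args(argv)
    return args.sep.join(args.words)


SUBCOMMANDS = {"mean": run_mean, "join": run_join}


def main(argv):
    # add_help=False so -h/--help is not handled by the top-level parser.
    top = argparse.ArgumentParser(prog="tool", add_help=False)
    top.add_argument("command", nargs="?")
    top.add_argument("rest", nargs=argparse.REMAINDER)
    ns = top.parse_args(argv)
    if ns.command not in SUBCOMMANDS:
        top.error("expected one of: " + ", ".join(sorted(SUBCOMMANDS)))
    # Only the selected handler builds its parser and runs its imports.
    return SUBCOMMANDS[ns.command](ns.rest)


print(main(["mean", "1", "2", "3"]))
print(main(["join", "--sep", ",", "a", "b"]))
```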
| Mehdi2277 wrote:
| That trivial approach only makes the shallow import lazy. I don't
| see a good way to do a lazy deep import. A lot of the libraries I
| import then transitively import hundreds or more other files, and
| I may only need a small subset of those transitive imports. The
| lazy import PEP would have meant that whenever the import was
| finally executed, the imports in that file are also lazy and only
| done if needed.
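|
| For comparison, the standard library's own per-module mechanism
| is importlib.util.LazyLoader. The sketch below follows the
| recipe in the importlib documentation, with colorsys as an
| arbitrary example module. It is still shallow in the sense
| described above: once the module body finally runs, that
| module's own top-level imports execute eagerly.

```python
import importlib.util
import sys


def lazy_import(name):
    """Return a module object whose code runs on first attribute access."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        raise ImportError(f"no module named {name!r}")
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    # Arms the lazy module; the real module body has not run yet.
    loader.exec_module(module)
    return module


colorsys = lazy_import("colorsys")
# The first attribute access below triggers the actual import.
print(colorsys.rgb_to_hsv(1.0, 0.0, 0.0))
```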
| overgard wrote:
| Personally, unless it has explicit "lazy" syntax I kind of hate
| the idea. One thing I always liked about python was how
| predictable and simple module imports are.
| slaymaker1907 wrote:
| I think such a mechanism already exists. You can just use the
| functional import syntax inside of a block. However, I think
| lazy imports could be ok so long as the language can show that
| a particular module is side effect free (i.e. no globals aside
| from something like constexpr).
| masklinn wrote:
| > You can just use the functional import syntax inside of a
| block.
|
| You don't even need functional import syntax, but as TFA
| notes this comes at a cost as it has to invoke the entire
| import machinery, which it can only skip to an extent (once a
| module is loaded and cached) as import hooks can have odd
| behaviours.
| code_runner wrote:
| Doubtful lazy imports would've helped at all... I joined a
| project where almost every import statement has side effects,
| some of which took multiple minutes to read things into memory
| etc.
|
| Tried some poor-man's debugging and never hit a breakpoint on the
| first significant line of code... took a while to figure out as
| it was my first Python project.
|
| It almost feels like Python needs a scripting and non-scripting
| mode, or some kind of warning logging "you did everything wrong"
| deepsun wrote:
| Python is a very good scripting language, so good that people
| sometimes mistake it for an application language.
| rtzuul wrote:
| The import code in Python is a mess, probably made slower since
| the advent of importlib.
|
| You need developers who care about fast, clean code to fix the
| issue. Those kinds of developers usually don't fare well in the
| Python swamp, so it won't happen.
| tersers wrote:
| I get an impression that regardless of topic, it's difficult for
| any decisions to be made for the future of Python? The discussion
| seems to always revolve around "what if?"s in not the most
| collaborative fashion. I wonder if what most languages need is
| fewer experts in computer science or language theory or anything
| technical, and more folks that can do facilitation.
| setr wrote:
| Why do you think managers take over everything? Regardless of
| topic, any sufficiently large problem eventually becomes
| primarily a coordination problem
| epgui wrote:
| Personally, I'd much rather languages be designed from
| mathematical foundations and/or very careful theory.
| CoastalCoder wrote:
| There certainly are languages like that, but I think you'll
| find there are tradeoffs to consider. Especially in
| commercial software.
| dragonwriter wrote:
| Python decision-making is rather conservative, preferring deep
| exploration of implications, because it's a big, established
| language with a lot of existing use to support, and because of
| the 2->3 experience.
|
| I don't think lack of facilitation skill is an issue; it's a
| deliberate policy choice.
| estebank wrote:
| Is there _any_ production ready language that isn't
| conservative with its decision making?
| dragonwriter wrote:
| > Is there any production ready language that isn't
| conservative with its decision making?
|
| There's variations of degree, but probably not. Part of
| being production-ready is stability.
| A4ET8a8uTh0 wrote:
| The what if is a valid question I think. In my little corner of
| the universe, my boss is genuinely ( and the more I think about
| it, reasonably ) worried about introducing more dependency on
| Python in our daily work.
|
| There are a lot of reasons not to introduce it, but 'what ifs' at
| a company like ours could be devastating. I still think proper
| precautions can be taken, but it is harder for me to say that I
| would just say yes if I was in his shoes.
| wpietri wrote:
| I really appreciate both the good writeup here and the fact that
| so many people are thinking through changes so carefully.
| pard68 wrote:
| I didn't understand this when it first came up and I still don't.
| If you want to defer your imports then wait until it's needed. It
| might be useful if you need to load a behemoth of a module for
| some rarely used part of a CLI tool. Otherwise, an X ms load at
| startup is hardly any different than the same X ms load in the
| middle of the execution. And on a server it's actually worse.
| [deleted]
___________________________________________________________________
(page generated 2022-12-25 23:00 UTC)