[HN Gopher] Frankenstein's `__init__`
       ___________________________________________________________________
        
       Frankenstein's `__init__`
        
       Author : todsacerdoti
       Score  : 82 points
       Date   : 2025-04-19 11:32 UTC (11 hours ago)
        
 (HTM) web link (ohadravid.github.io)
 (TXT) w3m dump (ohadravid.github.io)
        
       | smitty1e wrote:
       | To paraphrase Wheeler/Lampson: "All problems in python can be
       | solved by another level of indiscretion."
        
         | jerf wrote:
         | Oh, that's good. That has some depth to it. I'm going to have
         | to remember that one. Of course one can switch out "Python" for
         | whatever the local situation is too.
        
       | _Algernon_ wrote:
       | fix: add time.sleep(0.1) in the .close method.
        
         | gnfargbl wrote:
         | # NOTE: doubled this to 0.2s because of some weird test
         | failures.
        
         | Const-me wrote:
         | Using sleep instead of proper thread synchronization is
         | unreliable. Some people have slow computers like a cheap tablet
         | or free tier cloud VM. Other people have insufficient memory
         | for the software they run and their OS swaps.
         | 
         | When I need something like that, in my destructor I usually ask
         | the worker thread to shut down voluntarily (using a special posted
         | message, manual reset event, or an atomic flag), then wait for
         | the worker thread to quit within a reasonable timeout. If it
         | didn't quit within the timeout, I log a warning message if I
         | can.
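
A minimal Python sketch of the shutdown pattern described above, assuming a hypothetical `Worker` class (not code from the article): the owner sets a `threading.Event`, waits for the thread with a timeout, and logs a warning if the thread is still alive.

```python
import logging
import threading

logger = logging.getLogger(__name__)


class Worker:
    """Hypothetical worker that owns a background thread."""

    def __init__(self, poll_interval: float = 0.01) -> None:
        self._stop_requested = threading.Event()
        self._poll_interval = poll_interval
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self) -> None:
        # Event.wait() returns True as soon as the flag is set, so the loop
        # exits promptly instead of sleeping for a guessed duration.
        while not self._stop_requested.wait(self._poll_interval):
            pass  # one unit of real work would go here

    def close(self, timeout: float = 2.0) -> bool:
        """Ask the thread to stop; return True if it quit within the timeout."""
        self._stop_requested.set()
        self._thread.join(timeout)
        stopped = not self._thread.is_alive()
        if not stopped:
            logger.warning("worker did not stop within %.1fs", timeout)
        return stopped
```

The timeout only bounds how long `close()` waits; unlike a bare `time.sleep`, the caller learns whether the thread actually finished.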
        
           | btown wrote:
           | Notably on this point, resource contention on e.g. a
           | Kubernetes cluster can cause these kinds of things to fail
           | spuriously. Sleeps have their uses, but assuming that work is
           | done while you're sleeping is rarely a good idea, because
           | when it rains it will pour.
        
           | crdrost wrote:
           | I believe _Algernon_ was making a joke about how you could
           | take this problem and make the code even worse, not proposing
           | a solution to the problem.
           | 
           | It's also not even a problem with slow computers or
           | insufficient memory, __init__ does I/O here, it connects to
           | ZeroMQ, so it could have arbitrary latency in various
           | circumstances exceeding the 100 milliseconds that we would be
           | sleeping for. So the joke is, this fixes the problem in your
           | test environment where everything is squeaky clean and you
           | know that ZeroMQ is reachable, and now you have bugs in prod
           | still.
        
       | zokier wrote:
       | Pretty simple to fix by changing the _init to something like:
       |     def _init(self):
       |         init_complete = threading.Event()
       | 
       |         def worker_thread_start():
       |             FooWidget.__init__(self)
       |             init_complete.set()
       |             self.run()
       | 
       |         worker_thread = Thread(target=worker_thread_start, daemon=True)
       |         worker_thread.start()
       |         init_complete.wait()
       | 
       | Spawning worker thread from constructor is not that crazy, but
       | you want to make sure the constructing is completed by the time
       | you return from the constructor.
        
         | ohr wrote:
         | Thanks, I hate it! Jk, but would you consider this a good
         | solution overall?
        
           | im3w1l wrote:
           | Not him, but I would also consider making FooBarWidget _have_
           | a FooWidget rather than _be_ a FooWidget.
           | 
           | Normally with a FooWidget you can create one on some thread
           | and then perform operations on it on that thread. But in the
           | case of the FooBarWidget you can not do operations on it
           | because operations must be done in the special thread that is
           | inaccessible.
        
           | shiandow wrote:
           | I can't come up with a single good reason why you would wish
           | to use inheritance in this case.
           | 
           | Except possibly for type checking, but
           | 
           | 1. Python has duck typing anyway
           | 
           | 2. It's debatable whether these two classes should be
           | interchangeable
           | 
           | 3. You shouldn't need to use inheritance just to make the
           | type checking work.
           | 
           | It could be considered a clever hack in some situations, but
           | it's completely unsurprising that it causes issues. Putting
           | band-aids on it after you find a bug does not fix the real
           | problem.
        
           | gpderetta wrote:
           | I would consider this a hack. But a reasonable hack if you
           | can't change the base class.
        
         | greatgib wrote:
         | To be clear, a good init is not supposed to create a thread or
         | do any work that might not be instant, or things like that.
         | 
         | It would have been better to add an additional start, run, or
         | exec method that does that kind of thing, even if it makes usage
         | an inch more complicated with an additional call.
        
           | lmm wrote:
           | > It would have been better to add an additional start, run, or
           | exec method that does that kind of thing, even if it makes usage
           | an inch more complicated with an additional call.
           | 
           | That breaks RAII - you shouldn't give the user a way to
           | create the object in an invalid state. Although if it's
           | intended to be used as a context manager anyway then maybe
           | doing it on context enter rather than on init would be nicer.
        
             | greatgib wrote:
             | It doesn't necessarily have to be in an invalid state. You
             | can have your object hold a None value for the socket, or a
             | state like "not_started", and the different methods will
             | look at that to handle everything properly.
             | 
             | Also, it was not meant to be used as a context manager, as
             | there is just a close method; it is the caller that decided
             | to wrap it in a context.
             | 
             | But as you suggest, it would indeed have been a great idea
             | to add an __enter__ and do the starting there, as that would
             | be the proper way.
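
A rough sketch of the "start on `__enter__`" idea, assuming a hypothetical `Service` class: `__init__` only stores state, the thread is started when the `with` block is entered, and it is stopped and joined on exit.

```python
import threading


class Service:
    """Hypothetical object whose worker only runs inside a `with` block."""

    def __init__(self) -> None:
        self._stop = threading.Event()
        self._thread = None  # nothing is running yet

    def __enter__(self) -> "Service":
        # The worker here just blocks until asked to stop; a real one
        # would do its work in a loop.
        self._thread = threading.Thread(target=self._stop.wait, daemon=True)
        self._thread.start()
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        self._stop.set()
        self._thread.join()

    @property
    def running(self) -> bool:
        return self._thread is not None and self._thread.is_alive()
```

With this shape the "not started" state still exists, but the `with` statement makes it hard to use the object outside its started lifetime by accident.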
        
           | crdrost wrote:
           | I don't think it's totally crazy to, say, open a DB
           | connection in __init__() even though that's not an
           | instantaneous operation. That's not a hill I would die on,
           | you can just say "open a connection and hand it to me as an
           | argument," but it just looks a little cleaner to me if the
           | lifecycle of the connection is being handled inside this DB-
           | operations class. (You can also defer creating the connection
           | until it is actually used, or require an explicit call to
           | connect it, but then you are also writing a bunch of
           | boilerplate to make the type checker happy for cases where
           | the class is having its methods called before it was properly
           | connected.)
        
             | mjr00 wrote:
             | It's not totally crazy in that I see it all the time, but
             | it's one of the two most common things I've found make
             | Python code difficult to reason about.[0] After all, if you
             | open a DB connection in __init__() -- how do you close it?
             | This isn't C++ where we can tie that to a destructor. I've
             | run into _so_ many Python codebases that do this and have
             | tons of unclosed connections as a result.
             | 
             | A much cleaner way (IMO) to do this is use context managers
             | that have explicit lifecycles, so something like this:
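             |     from contextlib import contextmanager
             |     from typing import Generator
             | 
             |     class _DbClient:
             |         def __init__(self, connection):
             |             self._connection = connection
             | 
             |         def do_thing_that_requires_connection(self, ...):
             |             ...
             | 
             |     @contextmanager
             |     def create_db_client(
             |         host: str, port: int
             |     ) -> Generator[_DbClient, None, None]:
             |         # Connect before the try so a failed connect does not
             |         # reach connection.close() with no connection bound.
             |         connection = mydblib.connect(host, port)
             |         try:
             |             client = _DbClient(connection)
             |             yield client
             |         finally:
             |             connection.close()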
             | 
             | Which lets you write client code that looks like
             |     with create_db_client('localhost', 5432) as db_client:
             |         # port 3306 if you're a degenerate
             |         db_client.do_thing_that_requires_connection(...)
             | 
             | This gives you type safety, connection safety, has minimal
             | boilerplate for client code, _and_ ensures the connection
             | is created and disposed of properly. Obviously in larger
             | codebases there are some more nuances, and you might want to
             | implement a `typing.Protocol` for `_DbClient` that lets you
             | pass it around, but IMO the general idea is much better
             | than initializing a connection to a DB, ZeroMQ socket, gRPC
             | client, etc in __init__.
             | 
             | [0] The second is performing "heavy", potentially failing
             | operations outside of functions and classes, which can
             | cause failures when importing modules.
        
               | lijok wrote:
               | > This isn't C++ where we can tie that to a destructor
               | 
               | `def __del__`
        
               | mjr00 wrote:
               | C++ destructors are deterministic. Relying on a
               | nondeterministic GC call to run __del__ is _not_ good
               | code.
               | 
               | Also worth noting that the Python spec does _not_ say
               | __del__ must be called, only that it _may_ be called
               | after all references are deleted. So, no, you can't tie
               | it to __del__.
        
               | 12_throw_away wrote:
               | > def __del__
               | 
               | Nope, do not ever do this, it will not do what you want.
               | You have no idea _when_ it will be called. It can get
               | called at shutdown where the entire runtime environment
               | is in the process of being torn down, meaning that
               | nothing actually works anymore.
        
               | tayo42 wrote:
               | I get the points everyone is making and they make sense,
               | but sometimes you need persistent connections. Opening and
               | closing constantly can cause issues.
        
               | mjr00 wrote:
               | There's nothing about the contextmanager approach that
               | says you're opening and closing any more or less frequently
               | than with an __init__ approach with a separate `close()`
               | method. You're just statically ensuring 1) the close
               | method gets called, and 2) database operations can only
               | happen on an open connection. (Or, notably, a connection
               | that we expect to be open, as something external to the
               | system may have closed it in the meantime.)
               | 
               | Besides, persistent connections are a bit orthogonal
               | since you should be using a connection pool in practice,
               | which most Python DB libs provide out of the box. In
               | either case, the semantics are the same, open/close
               | becomes lease from/return to pool.
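
To illustrate the "lease from / return to pool" semantics, here is a toy sketch built only on the standard library. The `Pool` class and its `lease` method are hypothetical names for illustration; real database libraries ship their own pool implementations with different APIs.

```python
import queue
from contextlib import contextmanager


class Pool:
    """Toy connection pool: connections are leased, never closed by users."""

    def __init__(self, connections) -> None:
        self._idle = queue.Queue()
        for conn in connections:
            self._idle.put(conn)

    @contextmanager
    def lease(self):
        conn = self._idle.get()  # blocks until a connection is free
        try:
            yield conn
        finally:
            self._idle.put(conn)  # returned to the pool, not closed

    def idle_count(self) -> int:
        return self._idle.qsize()
```

The client code keeps the same `with` shape as before; only the meaning of entering and leaving the block changes.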
        
           | eptcyka wrote:
           | What if the object is in an invalid state unless the thread
           | is started?
        
             | greatgib wrote:
             | Nothing prevents you from having a valid "not_yet_started"
             | state.
        
               | rcxdude wrote:
               | No, but it does complicate things substantially. I don't
               | understand why you would have a useless half-constructed
               | state, especially in python where init isn't all that
               | magical.
        
               | lgas wrote:
               | The benefits of making illegal states unrepresentable are
               | just not widely understood outside of pure FP circles. I
               | think it's hard for stuff like this to catch on in python
               | in particular because everything is so accessible and
               | mutable, it would be hard to talk about anything that
               | provides the level of confidence that eg. ADTs in Haskell
               | provide. And it's hard to understand why you would want
               | to try to do something like that in a context like python
               | unless you already have experience with it in a
               | statically typed context.
        
               | eptcyka wrote:
               | And what would you do with this object after
               | construction? Call start() on it immediately?
        
               | greatgib wrote:
               | Kind of yes and no; it depends on your usage. But it clearly
               | would have worked well for the automated test case if the
               | start had happened on entering the context.
        
           | jhji7i77l wrote:
           | > To be clear, a good init is not supposed to create a thread
           | or do any work that might not be instant, or things like
           | that.
           | 
           | Maybe it indirectly ensures that only one thread is created
           | per instantiation; there are better ways of achieving that
           | though.
        
       | wodenokoto wrote:
       | > solving an annoying problem with a complete and utter disregard
       | to sanity, common sense, and the feelings of other human beings.
       | 
       | While I enjoy comments like these (and the article overall!),
       | they stand stronger if followed by an example of a solution that
       | regards sanity, common sense and the feelings of others.
        
         | crdrost wrote:
         | So in this case that would be, a FooBarWidget is not a subclass
         | of FooWidget but maybe still a subclass of AbstractWidget above
         | that. It contains a thread and config as its state variables.
         | That thread instantiates a FooWidget with the saved config, and
         | runs it, and finally closes its queue.
         | 
         | The problem still occurs, because you have to define what it
         | means to close a FooBarWidget and I don't think python Thread
         | has a "throw an exception into this thread" method. Just
         | setting the should_exit property, has the same problem as the
         | post! The thread might still be initing the object and any
         | attempt to tweak across threads could sometimes tweak before
         | init is complete because init does I/O. But once you are there,
         | this is just a tweak of the FooWidget code. FooWidget could
         | respond to a lock, a semaphore, a queue of requests, any number
         | of things, to be told to shut down.
         | 
         | In fact, Python has a nice built-in module called asyncio,
         | which implements tasks, and tasks can be canceled and other
         | such things like that, probably you just wanted to move the
         | foowidget code into a task. (See @jackdied's great talk "Stop
         | Writing Classes" for more on this general approach. It's not
         | about asyncio specifically, rather it's just about how the
         | moment we start throwing classes into our code, we start to
         | think about things that are not just solving the problem in
         | front of us, and solving the problem in front of us could be
         | done with just simple structures from the standard library.)
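
A small sketch of the asyncio idea mentioned above, assuming a hypothetical `widget_loop` coroutine standing in for the widget's run loop: the task is cancelled from outside, and the coroutine gets a chance to clean up when `CancelledError` is raised inside it.

```python
import asyncio


async def widget_loop(log: list) -> None:
    """Hypothetical stand-in for the widget's run loop."""
    try:
        while True:
            await asyncio.sleep(0.01)  # placeholder for real async work
    except asyncio.CancelledError:
        log.append("cleaned up")  # release sockets, files, etc. here
        raise  # re-raise so the task is recorded as cancelled


async def main() -> list:
    log = []
    task = asyncio.create_task(widget_loop(log))
    await asyncio.sleep(0.03)  # let the loop run briefly
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    log.append("cancelled" if task.cancelled() else "finished")
    return log
```

Running it with `asyncio.run(main())` shows the cleanup step happening before the task is reported as cancelled. Cancellation is delivered at an `await`, so there is no window in which the task is told to stop before it exists.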
        
       | mjr00 wrote:
       | Doing anything like this in __init__ is crazy. Even
       | `Path("config.json").read_text()` in a constructor isn't a good
       | idea.
       | 
       | Friends don't let friends build complicated constructors that can
       | fail; this is a huge violation of the Principle of Least
       | Astonishment. If you require external resources like a zeromq
       | socket, use connect/open/close/etc methods (and a contextmanager,
       | probably). If you need configuration, create a separate function
       | that parses the configuration, then returns the object.
       | 
       | I appreciate the author's circumstances may have not allowed them
       | to make these changes, but it'd drive me nuts leaving this as-is.
        
         | echelon wrote:
         | Not just Python. Most languages with constructors behave badly
         | if setup fails: C++ (especially), Java, JavaScript. Complicated
         | constructors are a nuisance and a danger.
         | 
         | Rust is the language I'm familiar with that does this
         | exceptionally well (although I'm sure there are others). It's
         | strictly because there are no constructors. Constructors are
         | not special language constructs, and any method can function in
         | that way. So you pay attention to the function signature just
         | like everywhere else: return Result<Self, T> explicitly, heed
         | async/await, etc. A constructor is no different than a static
         | helper method in typical other languages.
         | 
         | new Thing() with fragility is vastly inferior to new_thing() or
         | Thing::static_method_constructor() without the submarine
         | defects.
         | 
         | Enforced tree-style inheritance is also weird after
         | experiencing a traits-based OO system where types don't have to
         | fit onto a tree. You're free to pick behaviors at will. Multi-
         | inheritance was a hack that wanted desperately to deliver what
         | traits do, but it just made things more complicated and tightly
         | coupled. I think that's what people hate most about "OOP", not
         | the fact that data and methods are coupled. It's the weird
         | shoehorning onto this inexplicable hierarchy requirement.
         | 
         | I hope more languages in the future abandon constructors and
         | strict tree-style and/or multi-inheritance. It's something
         | existing languages could bolt on as well. Loose coupling with
         | the same behavior as ordinary functions is so much easier to
         | reason about. These things are so dated now and are best left
         | in the 60's and 70's from whence they came.
        
           | fouronnes3 wrote:
           | How would you do traits in Python?
        
             | echelon wrote:
             | I'd really need to think long and hard about it, but my
             | initial feeling is that we'd attach them to data classes or
             | a similar new construct. I don't think you'd want to reason
             | about the blast radius with ordinary classes. Granted,
             | that's more language complexity, creates two unequal
             | systems, and makes much more to reason about. There's a lot
             | to juggle.
             | 
             | As much fun as putting a PEP together might be, I don't
             | think I have the bandwidth to do so. I would really like to
             | see traits in Python, though.
        
             | svilen_dobrev wrote:
             | there were "traits" in python :/ . Search for pyprotocols.
             | 
             | (searching myself.. not much left of it)
             | 
             | 2004 - https://simonwillison.net/2004/Mar/23/pyprotocols/
             | 
             | some related rejected PEP ..
             | https://peps.python.org/pep-0246/ talking about something
             | new to be expected.. in 2001?2005?
             | 
             | no idea did it happen and which one would that be..
        
               | dragonwriter wrote:
               | Possibly Guido, in that rejection, was talking about what
               | ended up as ABCs, original PEP:
               | https://peps.python.org/pep-3119/
               | 
               | Further work was done in this area, building on ABCs,
               | with Protocols, original PEP:
               | https://peps.python.org/pep-0544/
        
             | vaylian wrote:
             | Python has a trait-like system. It's called "abstract base
             | classes": https://docs.python.org/3/library/abc.html
             | 
             | The abstract base class should use the @abstractmethod
             | decorator for methods that the implementing class needs to
             | implement itself.
             | 
             | Obviously, you can also use abstract base classes in other
             | ways, but they can be used as a way to define
             | traits/interfaces for classes.
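
As a rough illustration of the abstract-base-class approach, with hypothetical class names: the base class declares one required method with `@abstractmethod` and offers one provided method built on top of it, loosely similar to a trait with a default method.

```python
from abc import ABC, abstractmethod


class Describable(ABC):
    """Interface-like base class: subclasses must implement name()."""

    @abstractmethod
    def name(self) -> str:
        ...

    def describe(self) -> str:
        # Provided method that relies only on the required one.
        return f"<{self.name()}>"


class Button(Describable):
    def name(self) -> str:
        return "button"
```

Unlike a Rust trait, the implementing class still has to inherit from the ABC (or be registered with it), and instantiating a class that leaves `name()` unimplemented raises `TypeError`.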
        
               | dragonwriter wrote:
               | Protocols, too (they are defined on top of ABCs)
               | 
               | https://typing.python.org/en/latest/spec/protocol.html#proto...
        
               | gpderetta wrote:
               | Protocols are definitely the closest thing to traits in
               | Python. ABCs are pretty much OO interfaces.
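
For comparison, a short sketch of `typing.Protocol`, again with hypothetical names: `Socketish` never inherits from `SupportsClose`, yet it satisfies the protocol because it has a matching `close()` method.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SupportsClose(Protocol):
    def close(self) -> None:
        ...


class Socketish:  # note: no inheritance from SupportsClose
    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


def shutdown(resource: SupportsClose) -> None:
    # A static type checker accepts any object with a compatible close().
    resource.close()
```

The `@runtime_checkable` decorator is only needed for `isinstance` checks, and those checks look at method presence, not signatures; the main benefit is for static type checkers.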
        
           | rcxdude wrote:
           | The annoying thing is you can actually just use the rust
           | solution in C++ as well (at least for initial construction),
           | but basically no-one does.
        
           | whatevaa wrote:
           | In C#, it is also not a good idea to have constructors with
           | failable io in them.
        
             | neonsunset wrote:
             | Yup, and IO being async usually creates impedance mismatch
             | in constructors.
             | 
             | Had to refactor quite a few anti-pattern constructors like
             | this into `async Task<Thing> Create(...)` back in the day.
             | No idea what was the thought process of the original
             | authors, if there was any...
        
           | hdjrudni wrote:
           | I don't have enough experience with traits, but they also
           | sound like a recipe for creating a mess. I find anything more
           | than like 1 level of inheritance starts to create trouble.
           | But perhaps that's the magic of traits? Instead of creating
           | deep stacks, you mix-and-match all your traits on your leaf
           | class?
        
             | echelon wrote:
             | > I find anything more than like 1 level of inheritance
             | starts to create trouble.
             | 
             | That's the beauty of traits (or "type classes"). They're
             | _behaviors_ and they don't require thinking in terms of
             | inheritance. Think interfaces instead.
             | 
             | If you want your structure or object to print debug
             | information when logged to the console, you custom
             | implement or auto-derive a "Debug" trait.
             | 
             | If you want your structure or object to implement copies,
             | you write your own or auto-derive a "Clone" trait. You can
             | control whether they're shallow or deep if you want.
             | 
             | If you want your structure or object to be convertible to
             | JSON or TOML or some other format, you implement or auto-
             | derive "Serialize". Or to get the opposite behavior of
             | hydrating from strings, the "Deserialize" trait.
             | 
             | If you're building a custom GUI application, and you want
             | your widget to be a button, you implement or auto-derive
             | something perhaps called "Button". You don't have to
             | shoehorn this into some kind of "GObject > Widget > Button"
             | kind of craziness.
             | 
             | You can take just what you need and keep everything flat.
             | 
             | Here's a graphical argument I've laid out:
             | https://imgur.com/a/bmdUV3H
        
         | rcxdude wrote:
         | Why? An object encapsulates some state. If it doesn't do
         | anything unless you call some other method on it first, it
         | should just happen in the constructor. Otherwise you've got one
         | object that's actually two types: the object half-initialised
         | and the object fully initialised, and it's very easy to confuse
         | the two. Especially in python there's basically no situation
         | where you're going to need that half-state for some language
         | restriction.
        
           | mjr00 wrote:
           | It's a _lot_ easier to reason about code if I don't need to
           | worry that something as simple as
           | 
           |     my_variable = MyType()
           | 
           | might be calling out to a database with invalid credentials,
           | establishing a network connection which may fail to connect,
           | or reading a file from the filesystem which might not exist.
           | 
           | You are correct that you don't want an object that can be
           | half-initialized. In that case, any external resources
           | necessary should be allocated _before_ the constructor is
           | called. This is particularly true if the external resources
           | must be closed after use. You can see my other comment[0] in
           | this thread for a full example, but the short answer is use
           | context managers; this is the Pythonic way to do RAII.
           | 
           | [0] https://news.ycombinator.com/item?id=43736940
        
             | lyu07282 wrote:
             | > might be calling out to a database with invalid
             | credentials, establishing a network connection which may
             | fail to connect, or reading a file from the filesystem
             | which might not exist.
             | 
             | In Python it's not unusual that
             | 
             |     import some_third_partylib
             | 
             | will do exactly that. I've seen libraries that will load a
             | half gigabyte machine learning model into memory on import
             | and one that sends some event to sentry for telemetry.
             | 
             | People write such shit code, it's unbelievable.
        
           | banthar wrote:
           | Python context managers somewhat require it. You have to
           | create an object on which you can call `__enter__`.
        
             | ptsneves wrote:
             | It is a tragedy that Python almost got some form of RAII
             | but then figured out an object has 2 stages of usage.
             | 
             | I also strongly disagree that constructors cannot fail. An
             | object that is not usable should fail fast and stop code
             | flow as early as possible. Failing early is a good thing.
        
               | jakewins wrote:
               | There is no contradiction between "constructors cannot
               | fail" and "fail early", nobody is arguing the constructor
               | should do fallible things and then hide the failure.
               | 
               | What you should do is the fallible operation _outside_
               | the constructor, _before_ you call __init__, then ask for
               | the opened file, socket, lock, what-have-you as an
               | argument to the constructor.
               | 
               | Fallible initialisation operations belong in factory
               | functions.
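
A minimal sketch of that split, using hypothetical names (`LogReader`, `open_log_reader`): the factory function performs the fallible `open()`, and the constructor only stores the already-opened stream.

```python
from typing import TextIO


class LogReader:
    def __init__(self, stream: TextIO) -> None:
        # Plain assignment only; nothing here is expected to fail.
        self._stream = stream

    def first_line(self) -> str:
        return self._stream.readline().rstrip("\n")

    def close(self) -> None:
        self._stream.close()


def open_log_reader(path: str) -> LogReader:
    """Factory: the fallible open() happens here, before construction."""
    stream = open(path, encoding="utf-8")  # may raise OSError
    return LogReader(stream)
```

Failure still happens early, at the call to `open_log_reader`, but tests can build a `LogReader` from any in-memory stream without touching the filesystem.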
        
             | jakewins wrote:
             | This is the exact opposite? They explicitly encourage doing
             | resource-opening in the __enter__ method call, and then
             | returning the opened resource encapsulated inside an
             | object.
             | 
             | Nothing about the contract encourages doing anything
             | fallible in __init__
        
           | hdjrudni wrote:
           | > Otherwise you've got one object that's actually two types:
           | the object half-initialised and the object fully initialised,
           | and it's very easy to confuse the two.
           | 
           | You said it yourself -- if you feel like you have two
           | objects, then literally use two objects if you need to split
           | it up that way, FooBarBuilder and FooBar. That way FooBar is
           | always fully built, and if FooBarBuilder needs to do funky
           | black magic, it can do so in `build()` instead of `__init__`.
        
             | exe34 wrote:
             | no thanks, that's how we end up with FactoryFactory. If the
             | work needs to be done upon startup, then it needs to be
             | done. if it is done in response to a later event, then it
             | should be done later.
        
               | duncanfwalker wrote:
               | For me the point is that __init__ is special - it's the
               | language's way to construct an object. If we want
               | side-effecting code we can use a non-special static
               | constructor like
               | 
               |     class Foo:
               |         @classmethod
               |         def connect(cls):
               |             return cls()._connect()
               | 
               | The benefit is that we can choose when we're doing 1)
               | object construction, 2) side effects, or 3) both. The
               | downside is that a client might try to use __init__()
               | directly, so object creation might need to be documented
               | more than it would otherwise.
        
         | jhji7i77l wrote:
         | > Even `Path("config.json").read_text()` in a constructor isn't
         | a good idea.
         | 
         | If that call is necessary to ensure that the instance has a
         | good/known internal state, I absolutely think it belongs in
         | __init__ (whether called directly or indirectly via a method).
        
           | mjr00 wrote:
           | You're right that consistent internal state is important, but
           | you can accomplish this with
           | 
           |     class MyClass:
           |         def __init__(self, config: str):
           |             self._config = config
           | 
           | And if your reaction is "that just means something else needs
           | to call Path("config.json").read_text()", you're absolutely
           | right! It's separation of concerns; let some other method
           | deal with the possibility that `config.json` isn't
           | accessible. In the real world, you'd presumably want even
           | more specific checks that specific config values were present
           | in the JSON file, and your constructor would look more like
           | 
           |     def __init__(self, host: str, port: int):
           | 
           | and you'd validate that those values are present in the same
           | place that loads the JSON.
           | 
           | This simple change makes code _so_ much more readable and
           | testable.
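
One way this could look in practice, as a sketch: `parse_settings` and `Settings` are hypothetical names, and the `host`/`port` keys follow the constructor signature in the comment above. Validation lives next to the JSON parsing, and `MyClass` receives plain values.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    host: str
    port: int


def parse_settings(text: str) -> Settings:
    """Validate raw JSON text; raises ValueError on bad input."""
    raw = json.loads(text)  # json.JSONDecodeError is a ValueError subclass
    if not isinstance(raw, dict):
        raise ValueError("config must be a JSON object")
    missing = {"host", "port"} - raw.keys()
    if missing:
        raise ValueError(f"config is missing keys: {sorted(missing)}")
    return Settings(host=str(raw["host"]), port=int(raw["port"]))


class MyClass:
    def __init__(self, host: str, port: int) -> None:
        self.host = host
        self.port = port
```

The caller would read the file (for example with `Path("config.json").read_text()`), pass the text to `parse_settings`, and then construct `MyClass(settings.host, settings.port)`; tests can skip the file entirely.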
        
             | asplake wrote:
             | Better still, pass the parsed config. Let the app decide
             | how it is configured, and let it deal with any problems.
        
         | abdusco wrote:
         | My go-to is a factory classmethod:
         | 
         |     class Foo:
         |         def __init__(self, config: Config): ...
         | 
         |         @classmethod
         |         def from_config_file(cls, filename: str):
         |             config = ...  # parse config file
         |             return cls(config)
        
       | pjot wrote:
       | Rather than juggling your parent's __init__ on another thread,
       | it's usually clearer to:
       | 
       | 1. Keep all of your object-initialization in the main thread
       | (i.e. call super().__init__() synchronously).
       | 
       | 2. Defer any ZMQ socket creation that you actually use in the
       | background thread into the thread itself.
        
         | ledauphin wrote:
         | yeah, this just feels like one of those things that's begging
         | for a lazy (on first use) initialization. if you can't share or
         | transfer the socket between threads in the first place, then
         | your code will definitionally not be planning to use the object
         | in the main thread.
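
A sketch of lazy, on-first-use creation, under the assumption that only the worker thread ever touches the resource. `FakeSocket` is a hypothetical stand-in for a thread-bound object such as a ZeroMQ socket; it simply records which thread created it.

```python
import threading
from functools import cached_property


class FakeSocket:
    """Stand-in for a resource that must be used on its creating thread."""

    def __init__(self) -> None:
        self.owner = threading.get_ident()


class Widget:
    def __init__(self) -> None:
        # Cheap and synchronous: no socket is created here.
        self.worker_ident = None

    @cached_property
    def socket(self) -> FakeSocket:
        # Runs once, on first access, in whichever thread accesses it.
        return FakeSocket()

    def run(self) -> None:
        self.worker_ident = threading.get_ident()
        self.socket  # first use happens on the worker thread
```

Note that `cached_property` does not lock, so this only holds if a single thread performs the first access, which matches the case described above.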
        
       | devrandoom wrote:
       | That's Python's constructor, innit?
        
       ___________________________________________________________________
       (page generated 2025-04-19 23:01 UTC)