[HN Gopher] Mpire: A Python package for easier and faster multip...
       ___________________________________________________________________
        
       Mpire: A Python package for easier and faster multiprocessing
        
       Author : lnyan
       Score  : 115 points
       Date   : 2023-08-11 15:30 UTC (7 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | jw887c wrote:
        | Multiprocessing is great as a first pass at parallelization,
        | but I've found debugging it to be very hard, especially for
        | junior employees.
       | 
        | For languages like Python, it seems much easier to follow when
        | you can push everything to horizontally scaled single
        | processes.
        
         | wheelerof4te wrote:
          | Or just use numpy's arrays, which come with their own
          | integrated parallelism.
        
         | flakes wrote:
          | Depends on the workflow. For one-off jobs or client tooling,
          | parallelism makes sense so the user gets rapid feedback.
          | 
          | For batch pipelines that work on many requests, a serial
          | workflow has a lot of the advantages you mention. Serial
          | execution makes the load more predictable and makes scaling
          | easier to reason about.
        
         | uniqueuid wrote:
          | I agree. The main problems aren't syntax, they are
          | architectural: catching and retrying individual failures in
          | a pool.map, anticipating OOM with heavy tasks, and
          | understanding the process lifecycle and the underlying
          | pickle/IPC machinery.
         | 
         | All these are much more reliably solved with horizontal
         | scaling.
         | 
          | [edit] By the way, a very useful bit of minimal sugar on top
          | of multiprocessing for one-off tasks is tqdm's process_map,
          | which automatically shows a progress bar:
         | https://tqdm.github.io/docs/contrib.concurrent/
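          | 
          | A minimal sketch of it in use (process_map and chunksize are
          | from the tqdm docs linked above; work is a stand-in):
          | 
          |     from tqdm.contrib.concurrent import process_map
          | 
          |     def work(x):
          |         return x * x
          | 
          |     if __name__ == "__main__":
          |         # chunksize cuts IPC overhead for many small tasks
          |         results = process_map(work, range(1000),
          |                               max_workers=4, chunksize=10)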
        
           | Terretta wrote:
            | From the linked Mpire readme:
            | 
            |   Suppose we want to know the status of the current task:
            |   how many tasks are completed, how long before the work
            |   is ready? It's as simple as setting the progress_bar
            |   parameter to True:
            | 
            |       with WorkerPool(n_jobs=5) as pool:
            |           results = pool.map(time_consuming_function,
            |                              range(10), progress_bar=True)
            | 
            |   And it will output a nicely formatted tqdm progress bar.
        
           | xapata wrote:
           | How is coordinating between different machines any different
           | than coordinating between different processes?
           | 
           | A multiprocessing implementation is a good prototype for a
           | distributed implementation.
        
             | uniqueuid wrote:
              | To be honest, I don't think the two are similar at all.
              | 
              | Parallelizing across machines involves networks, and
              | well, that's why we have Jepsen, and Byzantine failures,
              | and eventual consistency, and net splits, and leader
              | election, and discovery - in short, a stack of hard
              | problems that is usually much larger in and of itself
              | than what you're trying to solve with multiprocessing.
        
               | [deleted]
        
               | xapata wrote:
               | True, the networking causes trouble. I usually rely on a
               | communication layer that addresses those troubles. A good
               | message queue makes the two paradigms quite similar. Or
               | something like Dask (https://www.dask.org/). Having your
               | single-machine development environment able to reproduce
               | nearly all the bugs that arise in production is a
               | wonderful thing.
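                | 
                | For instance, with Dask the local and distributed
                | cases look nearly identical (a sketch; Client() with
                | no arguments spins up a local cluster, work is a
                | stand-in):
                | 
                |     from dask.distributed import Client
                | 
                |     def work(x):
                |         return x * x
                | 
                |     if __name__ == "__main__":
                |         client = Client()  # or Client("scheduler:8786")
                |         futures = client.map(work, range(100))
                |         print(sum(client.gather(futures)))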
        
         | dr_kiszonka wrote:
         | Parsl has quite good debugging facilities built in, which
         | include automatic logging and visualizations.
         | 
         | https://parsl.readthedocs.io/en/stable/faq.html
         | 
         | https://parsl.readthedocs.io/en/stable/userguide/monitoring....
        
       | captaintobs wrote:
       | Why is this faster than the stdlib? What does it do to achieve
       | better performance?
        
         | uniqueuid wrote:
         | It's in the readme of the github project.
         | 
          | > In short, the main reasons why MPIRE is faster are:
          | 
          | > - When fork is available we can make use of copy-on-write
          | > shared objects, which reduces the need to copy objects
          | > that need to be shared over child processes
          | 
          | > - Workers can hold state over multiple tasks. Therefore
          | > you can choose to load a big file or send resources over
          | > only once per worker
          | 
          | > - Automatic task chunking
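          | 
          | If I read the readme right, the worker-state point looks
          | roughly like this (a sketch; use_worker_state and
          | worker_init are from the MPIRE docs, load_big_file is a
          | stand-in):
          | 
          |     from mpire import WorkerPool
          | 
          |     def init(worker_state):
          |         # runs once per worker, not once per task
          |         worker_state["data"] = load_big_file()
          | 
          |     def task(worker_state, idx):
          |         return worker_state["data"][idx]
          | 
          |     with WorkerPool(n_jobs=4, use_worker_state=True) as pool:
          |         results = pool.map(task, range(10), worker_init=init)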
        
           | misnome wrote:
           | Isn't this given "for free" by the fact that it's fork, even
           | in standard multiprocessing? What does the library do extra?
        
             | indeedmug wrote:
              | Yea, I am struggling to figure out what the secret sauce
              | of this library is, and whether that sauce introduces
              | foot guns down the line.
              | 
              | The stdlib multiprocessing already uses fork on Linux. I
              | once ran the same multiprocessing code on Linux and
              | Windows, and there was a significant improvement in
              | performance when running Linux.
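              | 
              | For comparison, the stdlib lets you pick the start
              | method explicitly (a sketch; fork isn't available on
              | Windows, work is a stand-in):
              | 
              |     import multiprocessing as mp
              | 
              |     def work(x):
              |         return x * x
              | 
              |     if __name__ == "__main__":
              |         ctx = mp.get_context("fork")  # or "spawn"
              |         with ctx.Pool(4) as pool:
              |             print(pool.map(work, range(10)))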
        
               | mufti_menk wrote:
                | They're deprecating fork in a version or two; one of
                | the main issues with it is that it copies locks across
                | processes, which can cause deadlocks.
        
           | niemandhier wrote:
            | COW can come back and bite you by causing runtimes that
            | aren't easily predictable.
            | 
            | Your code goes down a rarely used branch and suddenly a
            | large object gets copied. (In CPython even reads can
            | trigger copies, since reference counting writes to every
            | object it touches.)
        
       | IshKebab wrote:
        | Why has Python never added something like Web Workers or
        | isolates? That seems like the obvious thing to do, but they
        | only have multiprocessing hacks.
        
         | misnome wrote:
          | There has been a lot of movement towards running multiple
          | copies of the interpreter in the same process space over the
          | last several releases. I'm sure it'll come at some point.
        
         | nine_k wrote:
         | It sort of has, but it's a work in progress.
         | 
         | https://lwn.net/Articles/820424/
        
       | bee_rider wrote:
       | Ah darn, was hoping for some MPI Python interface.
        
       | anotherpaulg wrote:
        | I often use lox for this sort of thing. It can use threads or
        | processes, and has a very ergonomic API.
       | 
       | https://github.com/BrianPugh/lox
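        | 
        | From memory of its README, usage is roughly (an untested
        | sketch; scatter/gather are lox's names, work is a stand-in):
        | 
        |     import lox
        | 
        |     @lox.thread(4)  # or @lox.process(4)
        |     def work(x):
        |         return x * x
        | 
        |     for i in range(10):
        |         work.scatter(i)
        |     results = work.gather()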
        
         | ewokone wrote:
          | Thanks for sharing, this looks really promising for what I
          | need.
        
         | [deleted]
        
       | MitPitt wrote:
       | Always dreamed of multiprocessing with tqdm, this is great
        
       | jmakov wrote:
       | How is this different from ray.io?
        
         | uniqueuid wrote:
          | Ray is parallelism across machines; this is only across
          | cores.
        
           | whoiscroberts wrote:
           | Ray is cross core and cross machine.
        
             | theLiminator wrote:
             | Is ray faster across cores than stdlib multiprocessing is?
        
       | amelius wrote:
       | Very cool.
       | 
       | Except I'm a bit concerned that it might have _too many_
       | features. E.g. rendering of progress bars and such. This should
       | really be in a separate package and not referenced from this
       | package.
       | 
       | The multiprocessing module might not be great, but at least the
       | maintainers have always been careful about feature creep.
        
       | miohtama wrote:
       | Another good library for concurrency and parallel tasks is
       | futureproof:
       | 
       | https://github.com/yeraydiazdiaz/futureproof
       | 
       | > concurrent.futures is amazing, but it's got some sharp edges
       | that have bit me many times in the past.
       | 
       | > Futureproof is a thin wrapper around it addressing some of
       | these problems and adding some usability features.
        
       | trostaft wrote:
        | The particular pain point of multiprocessing in Python for me
        | has been the limitations of the serializer. To that end,
        | multiprocess, the replacement by the dill team, has been
        | useful as a drop-in replacement, but I'm still looking for
        | better alternatives. This seems to support dill as an optional
        | serializer, so I'll take a look!
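        | 
        | For anyone else curious, it appears to be a WorkerPool flag (a
        | sketch; use_dill is from the MPIRE docs and needs dill
        | installed):
        | 
        |     from mpire import WorkerPool
        | 
        |     with WorkerPool(n_jobs=4, use_dill=True) as pool:
        |         # lambdas aren't picklable with the default pickler
        |         results = pool.map(lambda x: x * x, range(10))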
        
       | singhrac wrote:
       | I've spent a lot of time writing and debugging multiprocessing
       | code, so a few thoughts, besides the general idea that this looks
       | good and I'm excited to try it:
       | 
        | - automatic restarting of workers after N tasks is very nice;
        | I have had to hack that into places before because of
        | (unresolvable) memory leaks in application code (the stdlib
        | equivalent is sketched at the end of this comment)
       | 
       | - is there a way to attach a debugger to one of the workers? That
       | would be really useful, though I appreciate the automatic
       | reporting of the failing args (also hack that in all the time)
       | 
        | - often, the reason a whole set of jobs is not making any
        | progress is a thundering herd of file reads (god forbid over
        | NFS). It would be lovely to detect that using lsof or
        | something similar
       | 
       | - it would also be extremely convenient to have an option that
       | handles a Python MemoryError and scales down the parallelism in
       | that case; this is quite difficult but would help a lot since I
       | often have to run a "test job" to see how much parallelism I can
       | actually use
       | 
       | - I didn't see the library use threadpoolctl anywhere; would it
       | be possible to make that part of the interface so we can limit
       | thread parallelism from OpenMP/BLAS/MKL when multiprocessing?
       | This also often causes core thrashing
       | 
       | Sorry for all the asks, and feel free to push back to keep the
       | interface clean. I will give the library a try regardless.
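        | 
        | (On the worker-restart and thread-limit points: the stdlib's
        | maxtasksperchild and threadpoolctl's threadpool_limits get you
        | part of the way; a rough sketch, work is a stand-in:)
        | 
        |     from multiprocessing import Pool
        |     from threadpoolctl import threadpool_limits
        | 
        |     def work(x):
        |         # keep BLAS/OpenMP to one thread inside each worker
        |         with threadpool_limits(limits=1):
        |             return x * x
        | 
        |     if __name__ == "__main__":
        |         # recycle each worker after 100 tasks (leak mitigation)
        |         with Pool(4, maxtasksperchild=100) as pool:
        |             print(pool.map(work, range(1000)))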
        
       | milliams wrote:
        | Why does everyone compare against `multiprocessing` when
        | `concurrent.futures`
        | (https://docs.python.org/3/library/concurrent.futures.html) has
        | been a part of the standard library for 11 years? It's a much
        | improved API and there are _almost_ no reasons to use
        | `multiprocessing` any more.
        
         | craigching wrote:
          | Someone downvoted you; I upvoted because I think you have a
          | good point, but it would be nice to back it up. I think I
          | agree with you, but I have only used concurrent.futures with
          | threads.
        
           | milliams wrote:
            | I'll give some more detail. concurrent.futures is designed
            | to be a new, consistent API wrapper around the
            | functionality in the multiprocessing and threading
            | libraries. One example of an improvement is the API for
            | the map function. In multiprocessing, it only accepts a
            | single argument for the function you're calling, so you
            | have to either do partial application or use starmap. In
            | concurrent.futures, the map function will pass through any
            | number of arguments.
           | 
            | The API was designed to be a standard that could be used
            | by other libraries. Before, if you started with threads
            | and then realised you were GIL-limited, switching from the
            | threading module to the multiprocessing module was a
            | complete change. With concurrent.futures, the only thing
            | that needs to change is:
            | 
            |     with ThreadPoolExecutor() as executor:
            |         executor.map(...)
            | 
            | to
            | 
            |     with ProcessPoolExecutor() as executor:
            |         executor.map(...)
            | 
            | The API has been adopted by other third-party modules too,
            | so you can do Dask distributed computing with:
            | 
            |     with distributed.Client().get_executor() as executor:
            |         executor.map(...)
            | 
            | or MPI with:
            | 
            |     with MPIPoolExecutor() as executor:
            |         executor.map(...)
            | 
            | and nothing else needs to change.
           | 
           | This is why I chose to use it to teach my Parallel Python
           | course (https://milliams.com/courses/parallel_python/).
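            | 
            | To make the map difference concrete (a small sketch; add
            | is a stand-in):
            | 
            |     from concurrent.futures import ProcessPoolExecutor
            | 
            |     def add(a, b):
            |         return a + b
            | 
            |     if __name__ == "__main__":
            |         with ProcessPoolExecutor() as executor:
            |             # multiple iterables, no partial/starmap needed
            |             print(list(executor.map(add, [1, 2, 3],
            |                                     [10, 20, 30])))
            |         # -> [11, 22, 33]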
        
             | henrydark wrote:
             | > Before if you started with thread and then realised you
             | were GIL-limited then switching from the threading module
             | to the multiprocessing module was a complete change
             | 
             | Is this true?
             | 
             | I've been switching back and forth between
             | multiprocessing.Pool and multiprocessing.dummy.Pool for a
             | very long time. Super easy, barely an inconvenience.
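              | 
              | For reference, the swap I mean (multiprocessing.dummy
              | mirrors the Pool API with threads; work is a stand-in):
              | 
              |     from multiprocessing import Pool           # processes
              |     # from multiprocessing.dummy import Pool   # threads
              | 
              |     def work(x):
              |         return x * x
              | 
              |     if __name__ == "__main__":
              |         with Pool(4) as pool:
              |             print(pool.map(work, range(10)))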
        
         | xapata wrote:
         | It's somewhat like metaclasses. As the joke goes: If you have
         | to ask, then you don't need metaclasses.
        
         | whalesalad wrote:
          | I can think of a lot of reasons to use multiprocessing. I do
          | it quite often. You can't always architect things to fit
          | inside of a `with` context manager. Sometimes you need
          | fine-grained control over when the process starts and stops,
          | how you handle various signals, etc.
        
           | tyingq wrote:
           | The with context manager doesn't seem mandatory. It seems
           | mostly like a convenience to call executor.shutdown()
           | implicitly when the block is done.
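            | 
            | i.e. you can manage the lifecycle by hand (a sketch; work
            | is a stand-in):
            | 
            |     from concurrent.futures import ProcessPoolExecutor
            | 
            |     def work(x):
            |         return x * x
            | 
            |     if __name__ == "__main__":
            |         executor = ProcessPoolExecutor(max_workers=4)
            |         future = executor.submit(work, 7)
            |         print(future.result())
            |         # explicit, instead of the implicit __exit__ call
            |         executor.shutdown(wait=True)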
        
             | whalesalad wrote:
             | I think there is a time and a place for everything. I use
             | concurrent.futures in certain situations when I need to
             | utilize threads/procs to do work in a very rudimentary and
             | naive way. But in more sophisticated systems you want to
             | control startup/shutdown of a process.
             | 
             | TBH, assuming your stack allows it, gevent is my preferred
             | mechanism for concurrency in Python. Followed by asyncio.
             | 
             | For places where I really need to get my hands dirty I will
             | lean on manually controlling processes/threads.
        
       | liendolucas wrote:
       | I've written a very tiny multiprocessing pipeline in Python. It's
       | documented.
       | 
       | I've actually never made use of it but at the time I got a bit
       | obsessed and wanted to write it. It does seem to work as
       | expected.
       | 
        | It's highly hackable, as it is only a single file and a couple
        | of classes.
        | 
        | Maybe it's useful to someone; here's the link:
       | https://github.com/lliendo/SimplePipeline
        
       ___________________________________________________________________
       (page generated 2023-08-11 23:00 UTC)