[HN Gopher] Show HN: Chard - simple async/await background tasks...
___________________________________________________________________
Show HN: Chard - simple async/await background tasks for Django
Author : drpancake
Score : 55 points
Date : 2022-09-11 13:16 UTC (9 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| etaioinshrdlu wrote:
| It looks like it requires all the code in your task to be
| async?
| spapas82 wrote:
| My problem with this (and similar projects) is that there
| actually _is_ a dependency: Running the worker!
|
| This may look like a small thing but in a production environment
| this means that instead of running 1 thing for your django app (a
| gunicorn or uwsgi app server) you need to add another thing (the
| worker). This results in:
|
| * Monitoring the worker (getting alerts when it stops, making
| sure it keeps running, etc.)
|
| * Starting the worker when your server starts
|
| * Re-starting the worker when you deploy changes (this is
| critical and easily missed; your worker won't pick up any
| changes to your app when you deploy, and if you don't re-start
| it, it will run stale code)
|
| * Making sure you have some kind of logging and exception
| tracking for the worker
|
| All this adds up, especially if you have to do it all over
| again for every new app.
|
| So, my ideal async task thingie would be to start an async worker
| _on the fly_ when there are tasks; the worker will be killed when
| the task has finished (or stay alive some time to pick any new
| tasks, whatever). So no plumbing of running a worker
| independently.
|
| My understanding is that although this is trivial in languages
| like Java, C# or Elixir (and even C), it isn't really possible
| in Python because of how Python works!
|
| Also, if I am not mistaken, the async functionality that
| Python has added and Django is adopting does not help here;
| i.e., even if you use all the async features of Django and
| Python, you can't create and run an async job without running
| an extra process (a worker), along with all the plumbing that
| goes with it!
|
| Or am I mistaken? I'd really like to be mistaken on that :)
| nerdponx wrote:
| It depends a lot on how the application is set up.
|
| > So, my ideal async task thingie would be to start an async
| worker on the fly when there are tasks; the worker will be
| killed when the task has finished (or stay alive some time to
| pick any new tasks, whatever). So no plumbing of running a
| worker independently.
|
| If you are running a single process in a single thread, then
| you have one async event loop scheduling all async tasks. In
| that context, background jobs can run and stay running as long
| as the event loop is running, and you won't have any problems
| unless you accidentally call some blocking routine in the
| background task (then you'll block up the whole event loop
| because it's all still single-threaded).
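| A minimal sketch of that single-loop setup (plain asyncio, not
| tied to any web framework):

```python
import asyncio

async def background_job():
    # Cooperates with other coroutines on the same event loop.
    await asyncio.sleep(0.1)
    return "done"

async def main():
    # Schedule the job; it runs "in the background" whenever the
    # loop gets control.
    task = asyncio.create_task(background_job())
    # Beware: a blocking call like time.sleep() here would stall
    # the entire loop, since everything shares a single thread.
    return await task

result = asyncio.run(main())
print(result)  # done
```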
|
| You could even have multiple subthreads with one event loop in
| each, or multiple subprocesses, each with one or more
| subthreads, each with its own event loop. There is nothing
| about the design of the Python language that prevents you from
| doing this. The GIL is annoying but not a dealbreaker.
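| A rough sketch of the one-loop-per-thread variant (the names
| here are purely illustrative):

```python
import asyncio
import threading

results = {}

def run_loop(name):
    # Each thread gets its own, independent event loop via
    # asyncio.run(), which creates a fresh loop per call.
    async def work():
        await asyncio.sleep(0.05)
        results[name] = "ok"
    asyncio.run(work())

threads = [threading.Thread(target=run_loop, args=(f"worker-{i}",))
           for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(results))  # ['worker-0', 'worker-1', 'worker-2']
```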
|
| Things get trickier when you are dealing with a pool of workers
| being managed by some external process. I can't speak for other
| languages, but this is the industry standard setup in Python
| web development. You could of course disregard the fact that
| anything is different about your setup and go about the DIY
| process of spawning threads or async background tasks, building
| machinery to manage their lifetimes, etc. But then you're
| circumventing not only your web framework but also the master
| server process that manages the worker pool, and things could
| get really weird and messy.
|
| Exactly how weird it gets, and how much DIY framework hacking
| you need to do, depends on the implementation details of your
| particular framework and the "master server" system that you
| are using.
|
| Thus we ended up with standalone task queueing/running tools
| like Celery and Dramatiq (and now I guess Chard), which let you
| avoid fighting your framework and writing your own background-
| task runner, at the expense of running a separate daemon and
| pool of workers. From the perspective of a
| developer, this is a great tradeoff because it lets me write
| web framework code that's focused on serving web stuff, and it
| lets me write background task code that doesn't have web
| framework details cooked into it. You could even go a step
| further and use something like Google Cloud Tasks or Pub/Sub if
| you want to avoid managing and monitoring another thing.
|
| But I think it's important to be clear that these systems exist
| not because of anything particularly weird or bad about Python
| itself, but because of how Python web frameworks generally are
| expected to work.
| spapas82 wrote:
| Great explanation thank you!
|
| My go-to stack is Django over gunicorn over nginx like you
| describe (industry standard setup in Python web development)
| and I wouldn't like to change that. So, I guess I need to
| keep using the extra worker and all the plumbing that goes
| along with it :(
| drpancake wrote:
| Hey everyone. The idea here is to take advantage of the new
| async/await ORM features in Django 4.1* to build a lightweight
| background task queue for Django that runs in a single worker
| process without any external dependencies (i.e. no
| Redis/RabbitMQ).
|
| It's pretty basic and experimental. Feedback and PRs welcome!
|
| *The Django ORM still uses a sync-to-async compatibility layer
| behind the scenes but they plan to phase this out in future
| versions.
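| For anyone curious how a database-backed single-process worker
| can look in principle, here is a toy sketch with an in-memory
| list standing in for the Task table (not Chard's actual
| implementation):

```python
import asyncio

# In-memory stand-in for the database-backed Task table; a real
# worker would poll a Django queryset instead.
queue = [{"name": "send_email", "status": "QUEUED"},
         {"name": "resize_image", "status": "QUEUED"}]

async def run_task(task):
    # Real tasks would be user-defined async functions.
    await asyncio.sleep(0)
    task["status"] = "DONE"

async def worker(poll_interval=0.01):
    # One process, one event loop: poll for queued work, await it.
    while any(t["status"] == "QUEUED" for t in queue):
        for task in [t for t in queue if t["status"] == "QUEUED"]:
            await run_task(task)
        await asyncio.sleep(poll_interval)

asyncio.run(worker())
print([t["status"] for t in queue])  # ['DONE', 'DONE']
```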
| phoebefactorial wrote:
| I've been very excited to see Django embrace asyncio, and your
| package looks like a great way to do async task queues.
|
| I've built a lot of very low traffic Django sites that are all
| behind user login for only a few users (think company internal-
| only CRUD tooling) and being able to use a task queue without
| having to set up a Redis thing is a big bonus for me.
|
| Excited to see how this evolves, thank you for sharing it!
| canadiantim wrote:
| This looks awesome! I'll definitely check it out. Always on the
| lookout for minimalist django libraries. I'll give it a go in a
| couple weeks. Thanks for making it and sharing!
| xoudini wrote:
| > without any external dependencies (i.e. no Redis/RabbitMQ)
|
| You still depend on a database for the `Task` model. This
| would be a no-go for me for that reason, since there's no
| reasonable way to influence its behaviour, short of writing a
| custom database router to keep every third-party library from
| hitting the same database as your core logic.
|
| If you absolutely must use a model, take a look at enumeration
| types[1] for a slightly "neater" way to declare choices.
|
| [1]:
| https://docs.djangoproject.com/en/4.1/ref/models/fields/#enu...
| adparadox wrote:
| I love the simplicity of this idea because for lots of sites
| the database works just fine as a queue backend and it reduces
| the amount of infrastructure needed. I currently use
| https://github.com/dabapps/django-db-queue for
| https://devmarks.io which also uses the database to store tasks
| instead of a dedicated queue infrastructure. `Django Q` also
| has an option to use the database, but I haven't tested it at
| all:
| https://django-q.readthedocs.io/en/latest/configure.html#orm.
| And if you are already running `redis` for your site,
| https://github.com/rq/django-rq is another option.
|
| The big benefit of this package is that it is async-first,
| which will pay off as Django continues to add async
| capabilities. Nice work! I'm looking forward to trying this out
| and seeing how it works!
| lmeyerov wrote:
| Awesome, we were just discussing internally how to do such an
| architecture
|
| Maybe a "where to help" / roadmap section for a sense of 'what is
| left for a stable & reasonably complete 1.0 api'?
___________________________________________________________________
(page generated 2022-09-11 23:01 UTC)