[HN Gopher] Finding and optimizing N+1 queries on a relational d...
       ___________________________________________________________________
        
       Finding and optimizing N+1 queries on a relational database
        
       Author : kiyanwang
       Score  : 16 points
       Date   : 2024-09-09 06:46 UTC (2 days ago)
        
 (HTM) web link (engineering.mixpanel.com)
 (TXT) w3m dump (engineering.mixpanel.com)
        
       | echelon wrote:
       | One solution is to not use an ORM.
       | 
       | These problems are framework problems. I find the cognitive
       | overhead of worrying whether a framework might do the wrong thing
       | or diagnosing problems after they arise often override any cost
       | savings frameworks provide in the first place.
        
         | 9dev wrote:
         | I totally share your frustrations over ORMs and the issues they
         | cause, but I seriously doubt your cost calculations. The sheer
         | number of working CRUD applications that don't rely on fancy
         | queries is mind-boggling, and most developers engaging in high-
         | level system design seem mostly oblivious to this. The ORM may
         | not make for pretty or efficient queries, but it enables armies
         | of mediocre programmers to crunch out generic line of business
         | applications very fast. The performance of these apps may not
         | be what it could be, but while we are still busy discussing the
         | finer details of query planner optimisation, folks are making a
         | fortune with N+1 queries in production that virtually nobody
         | cares about.
        
         | Daishiman wrote:
         | Not using an ORM is a solution in the same way that the
         | solution to changing a car's oil is to not own a car.
         | 
         | I've spent a good 20 years building web apps. The speed and
         | productivity of having an ORM like Django's a well-integrated
         | into business logic is unparalleled with the possible exception
         | of Rails and Laravel, maybe.
         | 
         | Systems that have 50+ tables become _trivial_ to maintain by
         | one person. You can refactor an underlying database table
         | without having to rewrite queries. There are entire classes of
         | refactoring that go away with the ORM. Serving database objects
         | with Django Rest Framework or building up CRUD interfaces with
         | the Django admin would simply be impossible otherwise.
         | 
         | Getting the hang of the ORM took me all of 6 months to build up
         | to the nastiest of edge cases, which are functionally as
         | difficult to debug as any complex SQL query with window
         | functions or esoteric functions.
         | 
         | The discourse is irrationally framed towards the first months
         | of learning, for a technology that I've been using for 14 years
         | with _no_ sign of finding a replacement.
        
         | ljm wrote:
         | You'll get N+1 or quadratic queries whether or not you use an
         | ORM. At the end of the day you are allocating time on the
         | database and it doesn't care about the way you build the query.
         | Runtime compute tends to be a lot cheaper than DB compute too -
         | might as well run those cycles before the DB does.
        
           | henryfjordan wrote:
           | If you actually have to reckon with the SQL queries in your
           | app, N+1 becomes less likely.
           | 
           | Consider `rows.map { row ->
           | repository.fetchRelatedRow(row.id) }`. That's much more
           | obviously going to cause issues than `rows.map {
           | row.relatedRow }`. You don't need to cross-reference the
           | model code for row to see if relatedRow is eager loaded or
           | not, the answer is right there for you.
        
       | henning wrote:
       | Database queries are likely to be the most constrained resource
       | in a typical enterprise application. They are essential but also
       | extremely slow.
       | 
       | Because of this, that limited resource needs to be carefully
       | managed and it needs to be straightforward to reason about the
       | I/O behavior of your application.
       | 
       | Attempting to hide them away and obscure what is actually
       | happening in your application is extremely counter-productive.
        
       | cool_dude85 wrote:
       | Not knowing anything about web development or Django or whatever,
       | why isn't this just in a real database? Where you can do a join
       | instead of 2000 individual queries?
        
         | senko wrote:
         | You can do joins in Django just fine, and you can even NOT
         | write them and have Django automagically do the correct thing
         | if you know how to use it.
        
           | ljm wrote:
           | The ORMs provided by full stack frameworks like Rails, Django
           | and DotNet are basically engines in and of themselves.
           | They're doing a lot of work at runtime to reduce pressure on
           | the DB and ensure connections are returned to the pool as
           | soon as possible.
           | 
           | They're not 'automagic'; you can reason with them quite
           | easily.
        
           | cool_dude85 wrote:
           | Then what's up with the pattern they show of querying every
           | board and then iterating through to query again for every
           | report associated with the board?
        
         | henryfjordan wrote:
         | Object Relational Mappers like the one Django use are a whole
         | topic of debate unto themselves.
         | 
         | They strive to map SQL concepts into a more code-friendly
         | domain. Rows are represented as objects, references become
         | attributes of those objects which can be eager-loaded by a join
         | when you fetch the parent object or lazy-loaded with their own
         | query when they are actually accessed. The N+1 problem comes
         | from this lazy-loading scenario.
         | 
         | This programming model abstracts away the SQL which can be
         | really cool and easy until you try to do something off the
         | beaten path. Junior devs love it for the ease, and diving into
         | the internals once things go wrong turns them into Senior devs!
        
       | senko wrote:
       | Heh, wrote something similar (django-queryinspect) a while ago,
       | then abandoned it since iirc django debug toolbar has it builtin,
       | and I moved to some other stuff.. maybe I should dust it off.
       | 
       | N+1 is fairly well understood and documented gotcha, and Django
       | has excellent tools that make it easy to avoid it (as the article
       | mentions).
       | 
       | To use Django ORM effectively in nontrivial apps, you do need to
       | understand how relational databases work, and it's helpful to
       | understand SQL. But it does make me a whole lot more productive
       | (and happy!) than writing SQL by hand.
        
       | IceHegel wrote:
       | This is the best write-up of tools to use to avoid shooting
       | yourself in the foot with Django N+1s that I have seen.
       | 
       | However, having used Django professionally for a while, I can say
       | that it is really starting to show its age even if you deal with
       | the N+1 problems. You still have to deal with:
       | 
       | 1. Blocking database calls that cannot easily be used in an
       | asyncio context
       | 
       | 2. No types
       | 
       | 3. 50% of package code being written without any regard for
       | performance
       | 
       | If you're starting a project today and considering Django, I
       | highly recommend you go with Nestjs and Prisma instead.
        
       | werdnapk wrote:
       | In Rails you can make this 3 queries. One to retrieve the board,
       | one to get all the report IDs associated with the board and a
       | third to get all the reports with the associated report IDs.
       | 
       | An example in Rails would look like:                 board =
       | Board.includes(:reports).find(board_id)
        
       ___________________________________________________________________
       (page generated 2024-09-11 23:00 UTC)