[HN Gopher] Finding and optimizing N+1 queries on a relational d...
___________________________________________________________________
Finding and optimizing N+1 queries on a relational database
Author : kiyanwang
Score : 16 points
Date : 2024-09-09 06:46 UTC (2 days ago)
(HTM) web link (engineering.mixpanel.com)
(TXT) w3m dump (engineering.mixpanel.com)
| echelon wrote:
| One solution is to not use an ORM.
|
| These problems are framework problems. I find the cognitive
| overhead of worrying whether a framework might do the wrong thing
| or diagnosing problems after they arise often override any cost
| savings frameworks provide in the first place.
| 9dev wrote:
| I totally share your frustrations over ORMs and the issues they
| cause, but I seriously doubt your cost calculations. The sheer
| number of working CRUD applications that don't rely on fancy
| queries is mind-boggling, and most developers engaging in high-
| level system design seem mostly oblivious to this. The ORM may
| not make for pretty or efficient queries, but it enables armies
| of mediocre programmers to crunch out generic line of business
| applications very fast. The performance of these apps may not
| be what it could be, but while we are still busy discussing the
| finer details of query planner optimisation, folks are making a
| fortune with N+1 queries in production that virtually nobody
| cares about.
| Daishiman wrote:
| Not using an ORM is a solution in the same way that the
| solution to changing a car's oil is to not own a car.
|
| I've spent a good 20 years building web apps. The speed and
| productivity of having an ORM like Django's a well-integrated
| into business logic is unparalleled with the possible exception
| of Rails and Laravel, maybe.
|
| Systems that have 50+ tables become _trivial_ to maintain by
| one person. You can refactor an underlying database table
| without having to rewrite queries. There are entire classes of
| refactoring that go away with the ORM. Serving database objects
| with Django Rest Framework or building up CRUD interfaces with
| the Django admin would simply be impossible otherwise.
|
| Getting the hang of the ORM took me all of 6 months to build up
| to the nastiest of edge cases, which are functionally as
| difficult to debug as any complex SQL query with window
| functions or esoteric functions.
|
| The discourse is irrationally framed towards the first months
| of learning, for a technology that I've been using for 14 years
| with _no_ sign of finding a replacement.
| ljm wrote:
| You'll get N+1 or quadratic queries whether or not you use an
| ORM. At the end of the day you are allocating time on the
| database and it doesn't care about the way you build the query.
| Runtime compute tends to be a lot cheaper than DB compute too -
| might as well run those cycles before the DB does.
| henryfjordan wrote:
| If you actually have to reckon with the SQL queries in your
| app, N+1 becomes less likely.
|
| Consider `rows.map { row ->
| repository.fetchRelatedRow(row.id) }`. That's much more
| obviously going to cause issues than `rows.map {
| row.relatedRow }`. You don't need to cross-reference the
| model code for row to see if relatedRow is eager loaded or
| not, the answer is right there for you.
| henning wrote:
| Database queries are likely to be the most constrained resource
| in a typical enterprise application. They are essential but also
| extremely slow.
|
| Because of this, that limited resource needs to be carefully
| managed and it needs to be straightforward to reason about the
| I/O behavior of your application.
|
| Attempting to hide them away and obscure what is actually
| happening in your application is extremely counter-productive.
| cool_dude85 wrote:
| Not knowing anything about web development or Django or whatever,
| why isn't this just in a real database? Where you can do a join
| instead of 2000 individual queries?
| senko wrote:
| You can do joins in Django just fine, and you can even NOT
| write them and have Django automagically do the correct thing
| if you know how to use it.
| ljm wrote:
| The ORMs provided by full stack frameworks like Rails, Django
| and DotNet are basically engines in and of themselves.
| They're doing a lot of work at runtime to reduce pressure on
| the DB and ensure connections are returned to the pool as
| soon as possible.
|
| They're not 'automagic'; you can reason with them quite
| easily.
| cool_dude85 wrote:
| Then what's up with the pattern they show of querying every
| board and then iterating through to query again for every
| report associated with the board?
| henryfjordan wrote:
| Object Relational Mappers like the one Django use are a whole
| topic of debate unto themselves.
|
| They strive to map SQL concepts into a more code-friendly
| domain. Rows are represented as objects, references become
| attributes of those objects which can be eager-loaded by a join
| when you fetch the parent object or lazy-loaded with their own
| query when they are actually accessed. The N+1 problem comes
| from this lazy-loading scenario.
|
| This programming model abstracts away the SQL which can be
| really cool and easy until you try to do something off the
| beaten path. Junior devs love it for the ease, and diving into
| the internals once things go wrong turns them into Senior devs!
| senko wrote:
| Heh, wrote something similar (django-queryinspect) a while ago,
| then abandoned it since iirc django debug toolbar has it builtin,
| and I moved to some other stuff.. maybe I should dust it off.
|
| N+1 is fairly well understood and documented gotcha, and Django
| has excellent tools that make it easy to avoid it (as the article
| mentions).
|
| To use Django ORM effectively in nontrivial apps, you do need to
| understand how relational databases work, and it's helpful to
| understand SQL. But it does make me a whole lot more productive
| (and happy!) than writing SQL by hand.
| IceHegel wrote:
| This is the best write-up of tools to use to avoid shooting
| yourself in the foot with Django N+1s that I have seen.
|
| However, having used Django professionally for a while, I can say
| that it is really starting to show its age even if you deal with
| the N+1 problems. You still have to deal with:
|
| 1. Blocking database calls that cannot easily be used in an
| asyncio context
|
| 2. No types
|
| 3. 50% of package code being written without any regard for
| performance
|
| If you're starting a project today and considering Django, I
| highly recommend you go with Nestjs and Prisma instead.
| werdnapk wrote:
| In Rails you can make this 3 queries. One to retrieve the board,
| one to get all the report IDs associated with the board and a
| third to get all the reports with the associated report IDs.
|
| An example in Rails would look like: board =
| Board.includes(:reports).find(board_id)
___________________________________________________________________
(page generated 2024-09-11 23:00 UTC)