[HN Gopher] Write an SQL query builder in 150 lines of Python
       ___________________________________________________________________
        
       Write an SQL query builder in 150 lines of Python
        
       Author : genericlemon24
       Score  : 129 points
       Date   : 2021-08-24 16:01 UTC (3 days ago)
        
 (HTM) web link (death.andgravity.com)
 (TXT) w3m dump (death.andgravity.com)
        
       | krosaen wrote:
       | Interesting to see what goes into an ORM library - but as others
       | note, in my experience learning SQL ends up being better. The
       | things that ORMs make easy are already pretty straightforward,
       | and when you get to more advanced queries, the ORM ends up
       | getting in the way and/or in order to use the ORM properly you
       | have to understand SQL deeply anyways.
       | 
       | For learning SQL, my favorite resource to get started:
       | 
       | Become a SELECT star! https://wizardzines.com/zines/sql/
       | 
       | followed by
       | 
       | The Best Medium-Hard Data Analyst SQL Interview Questions
       | https://quip.com/2gwZArKuWk7W
        
       | chrisofspades wrote:
       | What are you doing to handle more complicated WHERE clauses, such
       | as WHERE last_name = 'Doe' AND (first_name = 'Jane' OR first_name
       | = 'John')?
        
         | genericlemon24 wrote:
         | That's what the fancier __init__() at the end of the article[1]
         | is for :)
         | 
         | Here's the tl;dr of a real-world example[2]:
         | query = Query().SELECT(...).WHERE(...)         for subtags in
         | tags:             tag_query = BaseQuery({'(': [], ')': ['']},
         | {'(': 'OR'})             tag_add = partial(tag_query.add, '(')
         | for subtag in subtags:                 tag_add(subtag)
         | query.WHERE(str(tag_query))
         | 
         | It can be shortened by making a special subclass, but I only
         | needed this once or twice so I didn't bother yet:
         | for subtags in tags:             tag_query = QList('OR')
         | for subtag in subtags:                 tag_query.append(subtag)
         | query.WHERE(str(tag_query))
         | 
         | [1]: https://death.andgravity.com/query-builder-how#more-init
         | 
         | [2]:
         | https://github.com/lemon24/reader/blob/10ccabb9186f531da04db...
        
       | mcdonje wrote:
       | SQL is an absolutely amazing domain specific language, but people
       | keep building ORMs & other SQL abstractors. I get building one as
       | a learning or resume project. I don't see how it's useful in most
       | scenarios. Like, there's some difficulty breaking up the queries
       | in a logical way (which leads to duplication), so the solution is
       | another library that you need to write or install and learn how
       | to use?
       | 
       | Even if it's easy to use, I can't imagine it'd take less time to
       | read the documentation than to write some SQL. And then there's
       | another dependency; Another thing to check if there's performance
       | issues, another attack vector.
        
         | awinter-py wrote:
         | benefits of ORMs:                 - allow certain apps to work
         | with any sql dialect (though tbh I question the value of this
         | in the age of docker)         - makes simple selects easier
         | - enables plugins for web libraries
         | 
         | I think the 3rd thing is the most valuable. If you're in an
         | ecosystem like django / rails, you can consume plugins that
         | integrate with your sql backend and your schema more or less
         | automatically.
        
           | wutwutwutwut wrote:
           | > tbh I question the value of this in the age of docker
           | 
           | Haha. In the world outside of HN, big companies enforce
           | specific database servers for reasons such as centralized
           | monitoring, security, compliance and competence. Setting all
           | this aside because of "docker" is classic HN.
        
         | greenie_beans wrote:
         | you're missing the forest for the trees. this is a good python
         | tutorial for somebody at my skill level
        
         | kiliancs wrote:
         | I always like to bring up Ecto. Really nice sweet spot IMO.
         | query = from u in "users",                 where: u.age > 18 or
         | is_nil(u.email),                 select: %{name: u.name, age:
         | u.age}              # Returns maps as defined in select
         | Repo.all(query)
         | 
         | https://hexdocs.pm/ecto/Ecto.html#module-query
        
         | mozey wrote:
         | Doing it the other way around makes more sense, i.e. write SQL
         | to generate code. See https://github.com/kyleconroy/sqlc, why
         | aren't more people following this approach?
        
           | rrrrrrrrrrrryan wrote:
           | This is awesome.
           | 
           | Composing a SQL query is often the _easiest_ part. The issue
           | is that throwing that static SELECT statement string in your
           | code has a ton of downsides (as discussed everywhere in this
           | thread).
           | 
           | And fiddling with an ORM is tremendously frustrating when
           | you're just trying to get it to compose the SQL you've
           | already written in your head.
           | 
           | For a dev who's bothered to learn SQL, flipping the thing on
           | its head makes a ton of sense.
        
         | emodendroket wrote:
         | Yeah, everyone has a strong aversion to doing it but it's kind
         | of self perpetuating. Once you get comfortable writing SQL to
         | do everything, the ORMs feel like a frustrating game of trying
         | to get the library to generate the SQL you have in mind.
        
         | genericlemon24 wrote:
         | Sometimes you need dynamic SQL, because the DB doesn't have
         | stored procedures you can use for the same purpose (e.g.
         | SQLite).
         | 
         | I talk more about this use case here:
         | https://death.andgravity.com/query-builder-why#the-problem
        
         | laszlokorte wrote:
         | The issue with plain SQL is simply that you can not compose
         | queries at runtime.
         | 
         | For example let user decide which columns to select, in which
         | order to fetch the rows, which table to query etc. Or not even
         | let the user decide but decide based on some config file or the
         | system state.
         | 
         | You end up with a bunch of string manipulation that is fragile
         | and does not compose well.
         | 
         | What solution is there except from simulating SQL via your
         | languages nestable data structures?
        
           | stadium wrote:
           | Spark SQL is great for this. SQL strings and data frames code
           | can be used interchangeably, and performance is the same for
           | either type of query.
           | https://spark.apache.org/docs/2.1.1/sql-programming-
           | guide.ht...
        
           | xeromal wrote:
           | Honestly though, how often does this scenario actually
           | happen. Devs love to build for what if, but there's not a ton
           | of scenarios where you need to build completely custom
           | queries based on user input. A nice SPROC with some inputs
           | can do most of that while maintaining performance
        
             | rspeele wrote:
             | Pretty much every web app I've written has had some sort of
             | search feature with a table of results, where the user can
             | filter on multiple columns of the search. That requires
             | either:
             | 
             | a. Dynamically adding "and SomeColumn like @someValue" to
             | the where clause of the query
             | 
             | or
             | 
             | b. Having the query include _all_ filters but ignore them
             | when they are left blank, such as
             | 
             | `where @someValue is null or SomeColumn like @someValue`
             | 
             | The latter approach does allow you get by without string-
             | building queries for filtering, but tends to perform worse
             | if you have a lot of columns you might filter on.
             | 
             | And you _still_ have to play the string concatenation game
             | if you let the user click column headers to change the
             | ORDER BY.
             | 
             | With a decent ORM, like Entity Framework for C#, any of
             | this is very simple and there is no need to write SQL
             | string manipulation code in your application. But under the
             | covers that is what is happening.
        
               | tshaddox wrote:
               | Is this just a complaint about a lack of good tooling for
               | SQL code? User-configurable columns are probably always
               | going to be implemented as an array of columns that gets
               | built at runtime and then sent to whatever database
               | management system you're using. In any programming
               | language and database system, isn't the only thing
               | automatically preventing you from building a malformed
               | array of columns at runtime some sort of type-checking or
               | linting? It seems to be that this mistake is just as easy
               | to make using an ORM or data access library without type-
               | checking or linting.
        
             | UglyToad wrote:
             | The main place I've run into this is a table with sorting,
             | pagination and filtering while lazy loading data and I
             | encounter this need a lot.
             | 
             | With some runtime query builder most databases will have
             | decent performance for the space between "too much to
             | reasonably load in memory" to "we need a dedicated query
             | service like elastic". Unfortunately taking the example of
             | sort by X then by Y where X and Y are dynamic I don't know
             | of a nice solution in SQL only in MSSQL 2012 or MySQL 5.
        
               | taeric wrote:
               | I think you are playing into the comment, though. You'd
               | be surprised how far you can go with a table that doesn't
               | support sorting and filtering. Basic pagination can often
               | be foregone, if you can instead resort to offering export
               | and let them use whatever software they are used to for
               | this purpose. (Pagination is also just easier if you
               | forego custom sort and filtering.)
               | 
               | At some point, yes, you will want these things. But the
               | basics can get you really far.
        
               | UglyToad wrote:
               | For sure you can offer a good approximation through SQL
               | but I'm unlucky enough that I've needed to support
               | dynamic sorting and query builders against tables at
               | least 4 times for different datasets in the past years.
               | 
               | I'm particularly proud of the last one I built that still
               | absolutely flies despite doing wildcard querying on text
               | fields (no full text index) with multiple combined
               | queries and a dynamically computed status from aggregate
               | child properties in MySQL 5 with its lack of window
               | functions. The solution was basically dynamically
               | construct an efficient as possible inner query before
               | applying the joins.
               | 
               | To clarify I believe in doing as much as possible in raw
               | SQL and have a fondness for the stored procedure driven
               | system I worked with but this problem remains unsolvable
               | in SQL alone and I just keep encountering it.
        
           | genericlemon24 wrote:
           | Someone on Reddit suggested stored procedures, which seems
           | like a good alternative.
           | 
           | Alas, SQLite doesn't have them, so query building it is.
        
             | laszlokorte wrote:
             | Except if you build a complete query builder as a stored
             | procedure I do not see the problem solved.
             | 
             | Simply stated the problem is: Viewing code as data (in a
             | lisp sense) and transforming arbitrary data into a query
             | that can be executed.
        
             | justsomeuser wrote:
             | With stored procedures you run your code inside of the SQL
             | server process.
             | 
             | With SQLite, your entire application IS the process, and
             | your SQLite data moves from disk to your app processes RAM.
             | 
             | I think the SQLite model is much better as you get to use
             | your modern language and tooling (which is better than the
             | language used for stored procedures which has not changed
             | in 20 years, and is generally a bear to observe, test,
             | develop).
        
             | musingsole wrote:
             | Stored procedures are a nightmare that shepherd your
             | application into an illegible, unmanageable monstrosity.
             | 
             | Stored procedures are the slipperiest slope I've seen as a
             | developer.
        
               | BoxOfRain wrote:
               | That's a bit of an absolutist stance I think, a good
               | programmer can use stored procedures just fine and in
               | some cases it can even improve performance. Yes it makes
               | it easier for bad programmers to write bad code, but bad
               | programmers will write bad code no matter what tools they
               | have in their toolbox.
        
               | moksly wrote:
               | This isn't my experience at all, but I suppose it depends
               | on how you build your applications. I used to be of your
               | opinion, before we moved more and more of our code based
               | from .Net to Python an I was a linq junkie, but these
               | days I think you're crippling your pipeline if you don't
               | utilise SQL when and how it's reasonable to do so.
               | 
               | There are a lot of data retrieval where a stores
               | procedure will will save you incredibly amounts of
               | resources because it gives you the exact dataset you need
               | exactly when you need it.
        
               | taffer wrote:
               | In my experience, such problems only occur when people
               | believe that common software engineering practices do not
               | apply when writing stored procedures. Just have your
               | stored procedures version controlled, tested, and
               | deployed like any other code.
        
               | kbenson wrote:
               | The real problem is that you're almost always shifting
               | work from a language that is well known and understood by
               | you and/or your team to one that is less, or even poorly,
               | understood and known, and you end up incurring the cost
               | of novice programmers, which can be a real problem for
               | both security and performance.
               | 
               | If you have good knowledge and experience in the language
               | your preferred version of SQL implements, that's good. If
               | you just have people that understand how to optimize
               | schemas and queries, you might find that you encounter
               | some of the same problems as if you shelled out to
               | somewhat large and complex bash scripts. The value of
               | doing so over using your core app language is
               | _debatable_.
        
               | taffer wrote:
               | If your front-end is written in React and your business
               | logic is written in SQL, is it really fair to call what's
               | left in the middle tier a "core app language"? If you're
               | writing a SaaS application today, you're more likely to
               | want to rewrite your Java middle tier in Go than
               | replacing your DBMS.
        
               | kbenson wrote:
               | Not everyone is making a web app, or even something
               | amenable to using React native, and even if they are,
               | there's no guarantee that their middle tier isn't also in
               | JavaScript.
               | 
               | That said, I wasn't making a case about replacing your
               | DBMS. I specifically avoided that because yes, most
               | people stay with what they know and used, and even if
               | they switch, they switch for a different project, not
               | within the same project. There are _some_ cases where
               | multiple DBMS back-end support is useful, but I think
               | that 's a fairly small subset (software aimed towards
               | enterprises which wants to ease into your existing system
               | and note add new requirements, and open source software
               | meant to use one of the many DBMS back-ends you might
               | have).
               | 
               | My actual point is more along the lines of:
               | 
               | - Most DBMS hosted languages I've seen are pretty shitty
               | in comparison to what you're already using.
               | 
               | - The tooling for it is likely much worse or possibly
               | non-existent.
               | 
               | - You are probably less familiar with it and likely to
               | fall into the pitfalls of the language. All languages
               | have them, shitty languages have more. See first point.
               | 
               | - If you accept those points and the degree to which you
               | accept them should definitely play a role in deciding to
               | use stored procedure you've written in the language your
               | DBMS provides.
               | 
               | - I think trade offs are actually similar to what you
               | would see writing chunks of your program in bash and
               | calling out to that bash script. People can write well
               | designed and safe bash programs. It's not easy, and there
               | are a lot of pitfalls, and you can do it in the main
               | language you're writing probably. Thus the reasons
               | against calling out to bash for chunks of core are likely
               | similar to the reasons against calling a stored
               | procedure.
        
               | taffer wrote:
               | You make some good points. I personally have experience
               | with business web applications, i.e. large complex data
               | models with rather simple updates and report-like
               | queries. These types of queries combined with a well
               | normalized data model map well to set-based SQL. Of
               | course, it's a different story if you're writing
               | technical applications or games that are more about
               | crunching numbers than querying and updating data.
               | 
               | To me, the shitty procedural languages you mention are
               | just for gluing queries together. The important stuff
               | happens in SQL and the simplicity of keeping it all in
               | the database is worth it.
        
               | tshaddox wrote:
               | That seems to assume some very specific "scene" of
               | developers, or at least a team environment where it's
               | impossible to select for developers who know SQL. That
               | may very well be the case in your situation, but I see no
               | reason to expect it would be the case in most situations.
               | SQL isn't exactly a new or niche language.
        
               | kbenson wrote:
               | I'm not talking about plain SQL, I'm talking about the
               | included procedural language that often ships with it[1]
               | as an extension and which is much more DBMS specific.
               | 
               | So, the question is, does your team know SQL/PSM, or
               | PL/SQL, or PL/pgSQL, or some other variant, and how well.
               | 
               | 1:
               | https://en.wikipedia.org/wiki/SQL#Procedural_extensions
        
               | tshaddox wrote:
               | My team would know the variant that we use, yes.
        
               | empthought wrote:
               | Sounds like you're pretty bad at writing stored
               | procedures.
        
               | freedomben wrote:
               | Sounds like you've never worked on a team.
        
           | sumtechguy wrote:
           | I propose that the issue is at the ODBC layer.
           | 
           | We have 2 modes of using that thing.
           | 
           | Tightly bound. Basically you have a bit of query string with
           | question marks in it and you bind out your data points.
           | Either for sending/recieving.
           | 
           | Loosely bound. Here is a totally composed string ready to go
           | just run it. Also a good way to make SQL injection attacks.
           | 
           | Both involve string manipulation.
           | 
           | I think the issue is the ODBC interface does not really map
           | to what SQL is and does. The column binding sort of does. But
           | not table, not where, not having, not sort, etc. So we end up
           | re-inventing something that pastes over that. Building a
           | string to feed into an API that feeds that over the network
           | to a string parser on the other side to decompose it again
           | then runs it and goes back the other way in a similar manner.
        
           | phs2501 wrote:
           | This is my middle-ground solution for Python:
           | https://github.com/bdowning/sql-athame
           | 
           | Still fundamentally manipulating SQL text (which is a feature
           | as I don't want to learn a full DSL), but it handles
           | wrangling embedded placeholders while you're composing stuff
           | and some other common compositional tasks. It's worked well
           | for me anyway but I'm under no illusions it'd be right for
           | everyone.
           | 
           | Not an original concept regardless; my original version of
           | this was in Node: https://github.com/bdowning/sql-assassin,
           | but a few years after I wrote that (and mostly didn't use it)
           | I found https://github.com/gajus/slonik which was very
           | similar and much more fleshed-out; I rolled _some_ of its
           | concepts and patterns into sql-athame.
        
           | forinti wrote:
           | It would be easier to do that with a template engine.
        
             | genericlemon24 wrote:
             | I cover this solution here:
             | https://death.andgravity.com/own-query-builder#jinjasql-
             | sqlp...
        
           | agent327 wrote:
           | Something like C++'s {fmt}, specialized for SQL.
           | 
           | fmt ("SELECT foo, bar FROM mytable WHERE id = {}", id);
           | 
           | 'id' gets properly quoted, so that's not a concern anymore.
        
             | joshuamorton wrote:
             | This is simpler than what gp is asking for, they want the
             | ability to dynamically change which columns are requested
             | too, so change it to "foo, bar, baz", but still do so with
             | reasonable safety.
        
           | GLGirty wrote:
           | > simulating SQL via your languages nestable data structures
           | 
           | I've gone down this rabbit hole... you've described the in-
           | house query engine I work with.
           | 
           | Instead of sql, queries are nested json representing a
           | dumbest-possible sql parse tree. It's only a subset of sql
           | features, but it provides, as you point out, composability.
           | By having the query as a data struct, all sorts of
           | manipulations become possible... toggle fields for access
           | control; recovery from dropped or renamed columns (schema has
           | a changelog); auto-join related tables as required; etc.
           | 
           | It's magic when it works but it suffers from the drawbacks
           | noted elsewhere in the thread about traditional ORMS...
           | onboarding new devs, debugging when something goes wrong.
           | 
           | Adding new features to the dsl is a pain because the surface
           | area is not just _the query_ but _all the queries it might
           | form_ when composed with other queries in the system O_o
        
         | nobody0 wrote:
         | Your point is valid, it make sense just speak SQL to DB in some
         | scenarios . A good ORM will enable people to do that easily as
         | well.
         | 
         | However, good orms like LINQ in C# really shines in a lot of
         | places.
        
           | teh_klev wrote:
           | Wee nitpick, LINQ isn't an ORM, it's a querying API. Maybe
           | you meant the original LINQ to SQL ORM which was deprecated
           | in favour of Entity Framework (EF).
           | 
           | > A good ORM will enable people to do that easily as well
           | 
           | I agree. EF isn't a terrible ORM and covers many use cases
           | developers encounter when interacting with a database server
           | in your average line of business app. But you do need to keep
           | an eye on the SQL it generates with more complex queries i.e.
           | beware of cartesian hand grenades. Fortunately MS provides
           | some guidance on this kind of thing, but sadly many
           | developers often don't bother to read the docs, and this is
           | why ORM's get a bad name:
           | 
           | https://docs.microsoft.com/en-
           | us/ef/core/performance/efficie...
        
         | jrm4 wrote:
         | This all day. I teach intro Web App stuff (meaning I've never
         | built anything huge) and it's just painful to be like:
         | 
         | Hey everyone, look at this language with this incredibly
         | intuitive and simple syntax, perhaps the most "easy to read and
         | understand" thing you'll see in computing...
         | 
         | ...and now here's a bunch of painful stupid garbage that you'll
         | have to learn about, because industry has accreted a bunch of
         | crap around it to make it turn on and work.
        
         | emidln wrote:
         | I've written a lot raw SQL in the dialect my language supports.
         | This is fine for static queries. When queries are dynamic
         | (maybe they come from an admin panel or some other part with
         | lots of optionality), static SQL isn't enough. You then get to
         | do one of three things: ad-hoc string manipulation, rely on a
         | query AST->compiler (like SQLAchemy's core or HoneySQL), or an
         | ORM. ad-hoc string manipulation is a real security and
         | reliability nightmare, and if you don't have a good standalone
         | query AST to SQL compiler, an ORM is the next best thing.
        
           | codetrotter wrote:
           | > ad-hoc string manipulation is a real security and
           | reliability nightmare
           | 
           | Parameters.
           | 
           | A couple of examples:
           | 
           | https://www.psycopg.org/docs/usage.html#passing-
           | parameters-t...
           | 
           | https://www.php.net/manual/en/mysqli.quickstart.prepared-
           | sta...
        
             | jteppinette wrote:
             | How do you parametrize things like dynamic joins, where
             | clauses, field selections, or aggregations without string
             | manipulation or gross duplication?
        
             | emidln wrote:
             | Parameters have zero bearing on whether you should
             | dynamically construct SQL strings. If parameters can solve
             | all of your problems, you don't have a dynamic SQL query,
             | you have user-submitted values in a WHERE clause.
             | 
             | What can be parameterized depends wildly on the SQL
             | database in question. I haven't used one that could
             | parameterize table names (for use in dynamic JOINs or CTEs)
             | and many cannot parameterize values outside of a WHERE
             | clause. Dynamically selecting which function to call,
             | clauses to add or subtract, and sort orders are just a
             | slice of places parameters don't help.
             | 
             | In short, parameters alone do no eliminate the need for a
             | query builder. A good query builder should appropriately
             | parameterize values as the underlying database supports and
             | hopefully uses a type system or validation to constrict the
             | domain of values it uses to construct parts of the
             | expression that cannot be parameterized.
        
             | lowercased wrote:
             | You can't order/sort by parameters.
        
         | Cthulhu_ wrote:
         | My main argument for ORMs is the mapping between SQL columns
         | and your data model; having to write out SQL and code to load
         | an SQL result set into your data model feels like a lot of
         | duplication. And it gets a lot more difficult when you have
         | more complicated data models spanning multiple database columns
         | - that is, if your data model is like a document that has to be
         | inserted and updated into a set of normalized database tables.
         | 
         | That said, if that wasn't as much of an issue, or if I still
         | had the patience and typing skills to write a lot of boring
         | code, I wouldn't object too much. I'd also treat SQL as a first
         | class citizen, so dedicated .sql files that the application
         | will load or embed and use. That way, the SQL can be developed,
         | maintained and run outside of the context of your application -
         | editors can execute them directly to a (local) development
         | database.
        
           | vlunkr wrote:
           | Agreed, lots of comments here saying that people don't want
           | to learn SQL, I don't buy that argument. I think most people
           | already know SQL because at some point it's inevitable, and
           | it's also not that hard.
           | 
           | The value of the ORM is what the name implies: mapping your
           | data from rows into something convenient to use in your
           | language.
           | 
           | What people are really complaining about is ORMs that
           | needlessly try to re-invent SQL, but they don't all do this.
           | Someone else brought up Ecto, it's a brilliant ORM because
           | the resulting SQL is always obvious.
        
           | jakearmitage wrote:
           | What if your SQL is your data model? I believe object-
           | oriented design is forcing us to do a lot of duplication for
           | very little gains, just like it did on game design and the
           | industry started moving towards data-oriented design and
           | entity-component system.
           | 
           | The sooner we move away from the big lie that is DDD, the
           | more productive we can be.
        
             | [deleted]
        
             | true_religion wrote:
             | I think if sql is your data model, then it's a good case
             | for a query builder.
             | 
             | But at my company, our in house app is written using django
             | but treats the django model asan interface to
             | elasticsearch, Postgres, and redis.
             | 
             | The django model is sub optimal for sql, but it's a good
             | base for interacting with the system as a whole.
        
         | BoxOfRain wrote:
         | I feel a lot of arguments for ORMs boil down to "I can't really
         | be arsed to learn SQL properly" which is fair enough from a
         | developer time / performance ratio point of view, but not from
         | a purely technical point of view.
         | 
         | Like you say, I find there's way more mental friction dealing
         | with new ORMs and wading through the documentation than there
         | is in just going straight for raw SQL. There's less "magic"
         | that way too which is always nice when you're debugging or
         | optimising things. Different strokes for different folks I
         | guess.
        
           | huetius wrote:
           | Ymmv, but I like the both/and approach. I've seen high
           | quality codebases where the benefits of both are leveraged:
           | ORM for simple queries, idiomatic code reuse; raw sql for
           | complex or performance critical queries. Being smart about
           | when to use which is the key, but I have seen it work very
           | well.
        
             | GVIrish wrote:
             | This is what I think is the reasonable take on ORMs. If
             | you're using one because you don't wanna learn SQL, that is
             | a bad reason that will most likely result in more work in
             | the long run.
             | 
             | But if you're using it to increase productivity for
             | straightforward database calls, it can be a useful tool in
             | a lot of scenarios.
        
           | jasode wrote:
           | _> I feel a lot of arguments for ORMs boil down to "I can't
           | really be arsed to learn SQL properly" _
           | 
           | I learned SQL almost 30 years ago and used it in
           | Oracle/Informix/Sybase/IBMDB2/MSSQL/MySQL/SQLite/etc and have
           | written exam questions on SQL for outer-joins, nested
           | subqueries, crosstab, etc.
           | 
           | All that said, concatenating raw strings to form SQL that has
           | no compile time type-checking is tedious and error prone.
           | Thankfully, my "expert" knowledge of SQL does not blind me to
           | the benefits of what a well-written ORM can do for
           | productivity. (E.g. results caching, compile-time validation,
           | auto-completion in the IDE, etc)
           | 
           | The real technical reason why ORMs persist as a desirable
           | feature is the decades-old technical concept of _" working
           | memory of field variables"_ that mirror the disk file table
           | structure.
           | 
           | In the older languages like 1960s COBOL, and later
           | 1980s/1990s 4GL type of business languages such as dBASE,
           | PowerBuilder, SAP ABAP... they all have data tables/records
           | _as a 1st-class language syntax without using any quoted
           | strings_ :
           | 
           | - In COBOL, you had RECORDS and later versions of COBOL had
           | _embedded SQL_ without quoted strings
           | 
           | - dBASE/Foxpro had SCATTER/GATHER which is sort of like a
           | built-in ORM to pull disk-based data from dbf files into
           | memory
           | 
           | - SAP ABAP has built-in SQL (no quoted raw strings required)
           | with no manual typing of variables or structs to pull data
           | into working memory
           | 
           | The issue with "general purpose" languages like
           | Python/Javascript/C++ is that SQL is not 1st-class language
           | syntax to read database data into memory variables and write
           | them back out. The SCATTER/GATHER concept has to be _bolted-
           | on_ as a library ... hence you get the reinvention of of the
           | "db-->memory-->db" data read/write paradigm into Python via
           | projects like SQLAlchemy.
           | 
           | After tediously typing out hundreds of variable names or
           | copy-pasting fields from SQL tables via strings such as ...
           | _db.execquery( "SELECT cust_id, name, address FROM customer
           | WHERE zipcode=12345;")_
           | 
           | ... it finally dawns on some programmers that this is
           | _mindless repetitive work_ when you have hundreds of tables.
           | You end up writing a homemade ORM just to reduce the error-
           | prone insanity.
           | 
           | Yes, a lot of ORM projects are bad quality, with terrible
           | performance, bugs, etc. But that it still doesn't change the
           | fact that non-trivial projects with hundreds of tables and
           | thousands of fields need _some type of abstraction_ to reduce
           | the mindless boilerplate of raw SQL strings.
        
             | chousuke wrote:
             | I feel like there should be a distinction between full-
             | blown ORMs and just query builders.
             | 
             | I dislike ORMs that hide database operations from the
             | application code by pretending that in-memory application
             | objects are in concept interchangeable with persistable
             | objects; though just saying that hurts a bit because I
             | don't want to think in terms of objects, but in _data_.
             | 
             | Query builders are fine if they allow you to build
             | efficient queries in a type-safe and composable manner to
             | get data in and out of the database in the form your
             | application requires at the site of the database query, but
             | I don't want to be forced to pretend I fetch "User" objects
             | if all I really need from the database are the name and
             | e-mail.
        
             | [deleted]
        
             | [deleted]
        
           | luhn wrote:
           | I feel the opposite, the arguments _against_ ORMs largely
           | boil down to  "I can't really be arsed to learn SQL
           | properly." ORMs are a leaky abstraction by nature and
           | attempting to use one without understanding the underlying
           | technology will only lead to pain and a developer with grudge
           | against ORMs.
           | 
           | I think ORMs compliment SQL, not replace it. Without ORMs,
           | you end up getting a lot of clunky boilerplate trying to do
           | simple CRUD operations to your data. It's a bit like
           | Greenspun's 10th law: "Any sufficiently advanced RDBMS-backed
           | program contains an ad hoc, informally-specified, bug-ridden,
           | slow implementation of half of an ORM."
        
         | Emanation wrote:
         | I built one because I wanted to generate all my SQL scripts
         | from a graphql schema.
         | 
         | Handjamming SQL where statements for arbitrarily deeply nested
         | types is insanity.
        
         | yasserf wrote:
         | I agree with your point that adding multiple layers = more
         | attack vectors and abstraction of a really good domain specific
         | language. But what seems to happen on most of the projects I
         | work on is we end up hiding away extremely common logic behind
         | helper functions. It always starts off with SQL and then slowly
         | gets moved into higher level functions that offer a better
         | developer experience.
         | 
         | Shameless plug, but I just posted a library I wrote (for node
         | https://github.com/vramework/postgres-
         | typed/blob/master/READ...) which pretty much is a tiny layer
         | ontop of pg-node (which is query based / with value parameters)
         | and provides small util functions with typescript support
         | derived straight from postgres tables.
         | 
         | In an ideal world (one I think we are getting very close to) I
         | think we will end up having SQL queries validated in our code
         | against the actual DB structure the same way we have any other
         | compilation error. But until then we'll need to rely on tests
         | or helper libraries, and for the purpose of refactoring and
         | development I find the latter more enjoyable (although still
         | far from perfect).
        
           | throwaway858 wrote:
           | > In an ideal world (one I think we are getting very close
           | to) I think we will end up having SQL queries validated in
           | our code against the actual DB structure the same way we have
           | any other compilation error.
           | 
           | This library for TypeScript works exactly like this
           | 
           | https://github.com/MedFlyt/mfsqlchecker
        
           | x-shadowban wrote:
           | F# (and surely there are others) offers type providers, which
           | can reach into your db schemas at compile time and typecheck
           | linq-ish or even raw sql strings and the subsequent uses of
           | the result set. But in the simplest case it makes compilation
           | a function of not only source files but also db schema state.
           | You can extract schema defs and point the type provider at
           | the "cached" defs, but make sure rebuilding those is part of
           | the build process.
        
           | mcdonje wrote:
           | >SQL queries validated in our code against the actual DB
           | structure the same way we have any other compilation error
           | 
           | Good point about validation. The ideal scenario you propose
           | would indeed be ideal.
        
         | zzzeek wrote:
         | if you want to write one SQL query, you should not use a SQL
         | builder and/or ORM.
         | 
         | if you want to write thousands of SQL queries that naturally
         | derive from an object model of 30, 100, or 800 domain classes
         | or so and would otherwise be thousands of lines of repetitive
         | boilerplate if all typed out individually, then you use a query
         | builder and/or ORM.
         | 
         | different tools for different jobs.
        
         | jamamp wrote:
         | I work for an enterprise in-house web application, where over
         | time we have a ton of tables, like over several hundred
         | unfortunately. Writing queries for each any every table and
         | manually mapping those to objects sounds exhausting.
         | EntityFramework helps a lot with that, for CRUD operations.
         | 
         | Additionally, I think unit testing would benefit from an ORM as
         | well. Instead of calling an actual database, you can easily
         | have an in-memory database with an ORM with mocked data.
         | Though, I suppose that is technically possibly with straight
         | SQL queries as well, but I imagine it's easier with an ORM.
        
           | zaybqp wrote:
           | How would it be easier with ORM when switching to in-memory
           | database for standard SQL is just changing connection string?
        
           | furstenheim wrote:
           | The main problem with in memory databases is that they are
           | not the same database. I've had several issues which I had to
           | find a fix that worked on one database and was legal syntax
           | on the other. Say maybe one database allows force index but
           | in memory does not.
           | 
           | I've even seen one database layer for in memory and a
           | completely different one for real database. That means that
           | your unit tests are not really testing your production code
        
             | zaybqp wrote:
             | You can also make the same argument about ORM's mock DB
             | layer.
        
       | mikewarot wrote:
       | I've built SQL query builders to get things done behind the
       | scenes a few times. Users just love being able to search any
       | field, and get their results.
        
       | hownottowrite wrote:
       | Or just use sqlalchemy...
        
         | genericlemon24 wrote:
         | I wrote an entire article about why I didn't do that:
         | https://death.andgravity.com/own-query-builder#sqlalchemy-co...
         | 
         | In general, SQLAlchemy is an excellent choice; in my specific
         | case (single dev with limited time), the overhead would be too
         | much (even with my previous experience with it).
        
           | hownottowrite wrote:
           | That's a great article and a good way to go.
        
           | mynameismon wrote:
           | A curious question: why not Python's built in Sqlite3 package
           | [0]?
           | 
           | [0] https://docs.python.org/3/library/sqlite3.html
        
             | genericlemon24 wrote:
             | I am using exactly that; the query builder goes on top of
             | it :)
        
         | chooseaname wrote:
         | Exactly. Because nobody should ever use their dev skills to
         | build up their understanding of a tool by recreating some
         | aspect of it.
        
           | hownottowrite wrote:
           | Exactly. Why bother when you can just write another to-do
           | list manager.
           | 
           | Seriously, it's just a joke. Chill out.
        
       | Asm2D wrote:
       | This reminds me my older project called xql, for node.js:
       | https://github.com/jsstuff/xql
       | 
       | and fiddle: https://kobalicek.com/experiments/fiddle-xql.html
       | 
       | Interesting how all these builders look the same :)
        
       | thangalin wrote:
       | Many SQL abstractions substitute SQL keywords (e.g., "SELECT")
       | for a fluent interface function (e.g., "select"). The query
       | builder and output tend to be language-specific.
       | 
       | Several years ago, I prototyped a relational mapping language
       | based on XPath expressions. The result is a database-, language-,
       | and output-agnostic mechanism to generate structured documents
       | from flat table structurse. Although the prototype uses XML,
       | other formats are possible by changing the RXM library.
       | 
       | The queries might be bidirectional. Given a document and the RXM
       | query that generates the document, it may be possible to update
       | the database using the same query that produced the document. I
       | haven't tried, though.
       | 
       | https://bitbucket.org/djarvis/rxm/src/master/
        
       | globular-toast wrote:
       | > query.SELECT('one').FROM('table')
       | 
       | I don't really like this API. SQL is weird because it's written
       | backwards. What does `query.SELECT("one")` on its own represent?
       | A query on any table that happens to have a field called "one"? I
       | know you're not trying to build an ORM, but
       | `Table("table").select("one")` makes a lot more sense for an API
       | since the object `Table("table")` actually has a purpose and
       | would make sense to pass around etc.
        
         | gpderetta wrote:
         | also _select_ is really _project_ (and _where_ is _select_ ),
         | but that's the battle for another day.
        
         | genericlemon24 wrote:
         | > What does `query.SELECT("one")` on its own represent?
         | 
         | data['SELECT'].append('one')
         | 
         | I'm intentionally trying to _not_ depart too much from SQL  /
         | list-of-strings-for-each-clause model, since I'd have to
         | invent/learn another one.
         | 
         | For a full-blown query builder, I agree
         | Table("table").select("one") is better.
        
         | blackbear_ wrote:
         | I concur, despite the downvotes. This is also why the select is
         | last LINQ to SQL [1] queries:                   var
         | companyNameQuery =             from cust in nw.Customers
         | where cust.City == "London"             select
         | cust.CompanyName;
         | 
         | [1] https://docs.microsoft.com/en-
         | us/dotnet/framework/data/adone...
        
       | zer0faith wrote:
       | Please God no...don't do this.. This is exactly how security
       | vulnerabilities are introduced. Validate input and use
       | parametrized queries over something like this...
        
         | genericlemon24 wrote:
         | It's not either-or; I'm using the query builder to build
         | _parametrized_ queries.
         | 
         | Here are two examples:
         | 
         | https://death.andgravity.com/query-builder-why#introspection
         | https://github.com/lemon24/reader/blob/10ccabb9186f531da04db...
        
       | greenie_beans wrote:
       | this is a good guide for an intermediate-level python developer.
       | thanks.
        
         | genericlemon24 wrote:
         | Glad you liked it!
        
       | bluehark wrote:
       | I highly recommend pypika by Kayak:
       | https://github.com/kayak/pypika
       | 
       | Have used in multiple projects and have found it's the right
       | balance between ORMs and writing raw SQL. It's also easily
       | extensible and takes care of the many edge cases and nuances of
       | rolling your own SQL generator.
        
         | orangepanda wrote:
         | > and takes care of the many edge cases and nuances of rolling
         | your own SQL generator
         | 
         | Want to elaborate?
        
       | mynameismon wrote:
       | Ah, good ol' SQL injection attacks... If the author is reading,
       | please do add a disclaimer mentioning that the code would be
       | vulnerable to SQL injection attacks.
       | 
       | From a purely learning point of view, it is indeed an interesting
       | read, however.
        
         | genericlemon24 wrote:
         | (author here)
         | 
         | Yup, good thing to note, thank you!
         | 
         | I've kinda assumed everyone knows about the named parameters
         | some bindings offer, but that's probably not true. Will add a
         | disclaimer.
         | 
         | In the previous articles, I do talk about how this is an
         | upgrade from plain SQL, not from regular full-blown query
         | builders / ORMs.
        
           | quietbritishjim wrote:
           | > I do talk about how this is an upgrade from plain SQL
           | 
           | This statement is about functionality / ease of use, which is
           | fairly orthogonal from (preventing) SQL injection: with plain
           | SQL it's perfectly possible to avoid injection attacks, in
           | fact that's probably the most common and easiest way to do
           | it. In that sense, if anything this is a downgrade from
           | regular SQL.
        
             | genericlemon24 wrote:
             | > In that sense, if anything this is a downgrade from
             | regular SQL.
             | 
             | It likely is.
             | 
             | Ideally, instead of dynamically building queries, one would
             | use stored procedures.
             | 
             | I'm using this with SQLite, which doesn't have stored
             | procedures, so it's an acceptable downgrade for me.
             | Appending strings to lists gets messy quickly, example:
             | https://death.andgravity.com/query-builder-why#preventing-
             | co...
        
           | em500 wrote:
           | Definitely not true, I've encountered plenty of SQL users at
           | work who never heard of injection attacks or parameterized
           | queries. Some of them even built some ad-hoc query builders
           | to replace some of their own repetitive queries. (Note that
           | parameterized queries alone are not sufficient: often people
           | would try to parameterize table or column names.)
        
       ___________________________________________________________________
       (page generated 2021-08-27 23:02 UTC)