[HN Gopher] Write an SQL query builder in 150 lines of Python
___________________________________________________________________
Write an SQL query builder in 150 lines of Python
Author : genericlemon24
Score : 129 points
Date : 2021-08-24 16:01 UTC (3 days ago)
(HTM) web link (death.andgravity.com)
(TXT) w3m dump (death.andgravity.com)
| krosaen wrote:
| Interesting to see what goes into an ORM library - but as others
| note, in my experience learning SQL ends up being better. The
| things that ORMs make easy are already pretty straightforward,
| and when you get to more advanced queries, the ORM ends up
| getting in the way and/or in order to use the ORM properly you
| have to understand SQL deeply anyways.
|
| For learning SQL, my favorite resource to get started:
|
| Become a SELECT star! https://wizardzines.com/zines/sql/
|
| followed by
|
| The Best Medium-Hard Data Analyst SQL Interview Questions
| https://quip.com/2gwZArKuWk7W
| chrisofspades wrote:
| What are you doing to handle more complicated WHERE clauses, such
| as WHERE last_name = 'Doe' AND (first_name = 'Jane' OR first_name
| = 'John')?
| genericlemon24 wrote:
| That's what the fancier __init__() at the end of the article[1]
| is for :)
|
| Here's the tl;dr of a real-world example[2]:
| query = Query().SELECT(...).WHERE(...) for subtags in
| tags: tag_query = BaseQuery({'(': [], ')': ['']},
| {'(': 'OR'}) tag_add = partial(tag_query.add, '(')
| for subtag in subtags: tag_add(subtag)
| query.WHERE(str(tag_query))
|
| It can be shortened by making a special subclass, but I only
| needed this once or twice so I didn't bother yet:
| for subtags in tags: tag_query = QList('OR')
| for subtag in subtags: tag_query.append(subtag)
| query.WHERE(str(tag_query))
|
| [1]: https://death.andgravity.com/query-builder-how#more-init
|
| [2]:
| https://github.com/lemon24/reader/blob/10ccabb9186f531da04db...
| mcdonje wrote:
| SQL is an absolutely amazing domain specific language, but people
| keep building ORMs & other SQL abstractors. I get building one as
| a learning or resume project. I don't see how it's useful in most
| scenarios. Like, there's some difficulty breaking up the queries
| in a logical way (which leads to duplication), so the solution is
| another library that you need to write or install and learn how
| to use?
|
| Even if it's easy to use, I can't imagine it'd take less time to
| read the documentation than to write some SQL. And then there's
| another dependency; Another thing to check if there's performance
| issues, another attack vector.
| awinter-py wrote:
| benefits of ORMs: - allow certain apps to work
| with any sql dialect (though tbh I question the value of this
| in the age of docker) - makes simple selects easier
| - enables plugins for web libraries
|
| I think the 3rd thing is the most valuable. If you're in an
| ecosystem like django / rails, you can consume plugins that
| integrate with your sql backend and your schema more or less
| automatically.
| wutwutwutwut wrote:
| > tbh I question the value of this in the age of docker
|
| Haha. In the world outside of HN, big companies enforce
| specific database servers for reasons such as centralized
| monitoring, security, compliance and competence. Setting all
| this aside because of "docker" is classic HN.
| greenie_beans wrote:
| you're missing the forest for the trees. this is a good python
| tutorial for somebody at my skill level
| kiliancs wrote:
| I always like to bring up Ecto. Really nice sweet spot IMO.
| query = from u in "users", where: u.age > 18 or
| is_nil(u.email), select: %{name: u.name, age:
| u.age} # Returns maps as defined in select
| Repo.all(query)
|
| https://hexdocs.pm/ecto/Ecto.html#module-query
| mozey wrote:
| Doing it the other way around makes more sense, i.e. write SQL
| to generate code. See https://github.com/kyleconroy/sqlc, why
| aren't more people following this approach?
| rrrrrrrrrrrryan wrote:
| This is awesome.
|
| Composing a SQL query is often the _easiest_ part. The issue
| is that throwing that static SELECT statement string in your
| code has a ton of downsides (as discussed everywhere in this
| thread).
|
| And fiddling with an ORM is tremendously frustrating when
| you're just trying to get it to compose the SQL you've
| already written in your head.
|
| For a dev who's bothered to learn SQL, flipping the thing on
| its head makes a ton of sense.
| emodendroket wrote:
| Yeah, everyone has a strong aversion to doing it but it's kind
| of self perpetuating. Once you get comfortable writing SQL to
| do everything, the ORMs feel like a frustrating game of trying
| to get the library to generate the SQL you have in mind.
| genericlemon24 wrote:
| Sometimes you need dynamic SQL, because the DB doesn't have
| stored procedures you can use for the same purpose (e.g.
| SQLite).
|
| I talk more about this use case here:
| https://death.andgravity.com/query-builder-why#the-problem
| laszlokorte wrote:
| The issue with plain SQL is simply that you can not compose
| queries at runtime.
|
| For example let user decide which columns to select, in which
| order to fetch the rows, which table to query etc. Or not even
| let the user decide but decide based on some config file or the
| system state.
|
| You end up with a bunch of string manipulation that is fragile
| and does not compose well.
|
| What solution is there except from simulating SQL via your
| languages nestable data structures?
| stadium wrote:
| Spark SQL is great for this. SQL strings and data frames code
| can be used interchangeably, and performance is the same for
| either type of query.
| https://spark.apache.org/docs/2.1.1/sql-programming-
| guide.ht...
| xeromal wrote:
| Honestly though, how often does this scenario actually
| happen. Devs love to build for what if, but there's not a ton
| of scenarios where you need to build completely custom
| queries based on user input. A nice SPROC with some inputs
| can do most of that while maintaining performance
| rspeele wrote:
| Pretty much every web app I've written has had some sort of
| search feature with a table of results, where the user can
| filter on multiple columns of the search. That requires
| either:
|
| a. Dynamically adding "and SomeColumn like @someValue" to
| the where clause of the query
|
| or
|
| b. Having the query include _all_ filters but ignore them
| when they are left blank, such as
|
| `where @someValue is null or SomeColumn like @someValue`
|
| The latter approach does allow you get by without string-
| building queries for filtering, but tends to perform worse
| if you have a lot of columns you might filter on.
|
| And you _still_ have to play the string concatenation game
| if you let the user click column headers to change the
| ORDER BY.
|
| With a decent ORM, like Entity Framework for C#, any of
| this is very simple and there is no need to write SQL
| string manipulation code in your application. But under the
| covers that is what is happening.
| tshaddox wrote:
| Is this just a complaint about a lack of good tooling for
| SQL code? User-configurable columns are probably always
| going to be implemented as an array of columns that gets
| built at runtime and then sent to whatever database
| management system you're using. In any programming
| language and database system, isn't the only thing
| automatically preventing you from building a malformed
| array of columns at runtime some sort of type-checking or
| linting? It seems to be that this mistake is just as easy
| to make using an ORM or data access library without type-
| checking or linting.
| UglyToad wrote:
| The main place I've run into this is a table with sorting,
| pagination and filtering while lazy loading data and I
| encounter this need a lot.
|
| With some runtime query builder most databases will have
| decent performance for the space between "too much to
| reasonably load in memory" to "we need a dedicated query
| service like elastic". Unfortunately taking the example of
| sort by X then by Y where X and Y are dynamic I don't know
| of a nice solution in SQL only in MSSQL 2012 or MySQL 5.
| taeric wrote:
| I think you are playing into the comment, though. You'd
| be surprised how far you can go with a table that doesn't
| support sorting and filtering. Basic pagination can often
| be foregone, if you can instead resort to offering export
| and let them use whatever software they are used to for
| this purpose. (Pagination is also just easier if you
| forego custom sort and filtering.)
|
| At some point, yes, you will want these things. But the
| basics can get you really far.
| UglyToad wrote:
| For sure you can offer a good approximation through SQL
| but I'm unlucky enough that I've needed to support
| dynamic sorting and query builders against tables at
| least 4 times for different datasets in the past years.
|
| I'm particularly proud of the last one I built that still
| absolutely flies despite doing wildcard querying on text
| fields (no full text index) with multiple combined
| queries and a dynamically computed status from aggregate
| child properties in MySQL 5 with its lack of window
| functions. The solution was basically dynamically
| construct an efficient as possible inner query before
| applying the joins.
|
| To clarify I believe in doing as much as possible in raw
| SQL and have a fondness for the stored procedure driven
| system I worked with but this problem remains unsolvable
| in SQL alone and I just keep encountering it.
| genericlemon24 wrote:
| Someone on Reddit suggested stored procedures, which seems
| like a good alternative.
|
| Alas, SQLite doesn't have them, so query building it is.
| laszlokorte wrote:
| Except if you build a complete query builder as a stored
| procedure I do not see the problem solved.
|
| Simply stated the problem is: Viewing code as data (in a
| lisp sense) and transforming arbitrary data into a query
| that can be executed.
| justsomeuser wrote:
| With stored procedures you run your code inside of the SQL
| server process.
|
| With SQLite, your entire application IS the process, and
| your SQLite data moves from disk to your app processes RAM.
|
| I think the SQLite model is much better as you get to use
| your modern language and tooling (which is better than the
| language used for stored procedures which has not changed
| in 20 years, and is generally a bear to observe, test,
| develop).
| musingsole wrote:
| Stored procedures are a nightmare that shepherd your
| application into an illegible, unmanageable monstrosity.
|
| Stored procedures are the slipperiest slope I've seen as a
| developer.
| BoxOfRain wrote:
| That's a bit of an absolutist stance I think, a good
| programmer can use stored procedures just fine and in
| some cases it can even improve performance. Yes it makes
| it easier for bad programmers to write bad code, but bad
| programmers will write bad code no matter what tools they
| have in their toolbox.
| moksly wrote:
| This isn't my experience at all, but I suppose it depends
| on how you build your applications. I used to be of your
| opinion, before we moved more and more of our code based
| from .Net to Python an I was a linq junkie, but these
| days I think you're crippling your pipeline if you don't
| utilise SQL when and how it's reasonable to do so.
|
| There are a lot of data retrieval where a stores
| procedure will will save you incredibly amounts of
| resources because it gives you the exact dataset you need
| exactly when you need it.
| taffer wrote:
| In my experience, such problems only occur when people
| believe that common software engineering practices do not
| apply when writing stored procedures. Just have your
| stored procedures version controlled, tested, and
| deployed like any other code.
| kbenson wrote:
| The real problem is that you're almost always shifting
| work from a language that is well known and understood by
| you and/or your team to one that is less, or even poorly,
| understood and known, and you end up incurring the cost
| of novice programmers, which can be a real problem for
| both security and performance.
|
| If you have good knowledge and experience in the language
| your preferred version of SQL implements, that's good. If
| you just have people that understand how to optimize
| schemas and queries, you might find that you encounter
| some of the same problems as if you shelled out to
| somewhat large and complex bash scripts. The value of
| doing so over using your core app language is
| _debatable_.
| taffer wrote:
| If your front-end is written in React and your business
| logic is written in SQL, is it really fair to call what's
| left in the middle tier a "core app language"? If you're
| writing a SaaS application today, you're more likely to
| want to rewrite your Java middle tier in Go than
| replacing your DBMS.
| kbenson wrote:
| Not everyone is making a web app, or even something
| amenable to using React native, and even if they are,
| there's no guarantee that their middle tier isn't also in
| JavaScript.
|
| That said, I wasn't making a case about replacing your
| DBMS. I specifically avoided that because yes, most
| people stay with what they know and used, and even if
| they switch, they switch for a different project, not
| within the same project. There are _some_ cases where
| multiple DBMS back-end support is useful, but I think
| that 's a fairly small subset (software aimed towards
| enterprises which wants to ease into your existing system
| and note add new requirements, and open source software
| meant to use one of the many DBMS back-ends you might
| have).
|
| My actual point is more along the lines of:
|
| - Most DBMS hosted languages I've seen are pretty shitty
| in comparison to what you're already using.
|
| - The tooling for it is likely much worse or possibly
| non-existent.
|
| - You are probably less familiar with it and likely to
| fall into the pitfalls of the language. All languages
| have them, shitty languages have more. See first point.
|
| - If you accept those points and the degree to which you
| accept them should definitely play a role in deciding to
| use stored procedure you've written in the language your
| DBMS provides.
|
| - I think trade offs are actually similar to what you
| would see writing chunks of your program in bash and
| calling out to that bash script. People can write well
| designed and safe bash programs. It's not easy, and there
| are a lot of pitfalls, and you can do it in the main
| language you're writing probably. Thus the reasons
| against calling out to bash for chunks of core are likely
| similar to the reasons against calling a stored
| procedure.
| taffer wrote:
| You make some good points. I personally have experience
| with business web applications, i.e. large complex data
| models with rather simple updates and report-like
| queries. These types of queries combined with a well
| normalized data model map well to set-based SQL. Of
| course, it's a different story if you're writing
| technical applications or games that are more about
| crunching numbers than querying and updating data.
|
| To me, the shitty procedural languages you mention are
| just for gluing queries together. The important stuff
| happens in SQL and the simplicity of keeping it all in
| the database is worth it.
| tshaddox wrote:
| That seems to assume some very specific "scene" of
| developers, or at least a team environment where it's
| impossible to select for developers who know SQL. That
| may very well be the case in your situation, but I see no
| reason to expect it would be the case in most situations.
| SQL isn't exactly a new or niche language.
| kbenson wrote:
| I'm not talking about plain SQL, I'm talking about the
| included procedural language that often ships with it[1]
| as an extension and which is much more DBMS specific.
|
| So, the question is, does your team know SQL/PSM, or
| PL/SQL, or PL/pgSQL, or some other variant, and how well.
|
| 1:
| https://en.wikipedia.org/wiki/SQL#Procedural_extensions
| tshaddox wrote:
| My team would know the variant that we use, yes.
| empthought wrote:
| Sounds like you're pretty bad at writing stored
| procedures.
| freedomben wrote:
| Sounds like you've never worked on a team.
| sumtechguy wrote:
| I propose that the issue is at the ODBC layer.
|
| We have 2 modes of using that thing.
|
| Tightly bound. Basically you have a bit of query string with
| question marks in it and you bind out your data points.
| Either for sending/recieving.
|
| Loosely bound. Here is a totally composed string ready to go
| just run it. Also a good way to make SQL injection attacks.
|
| Both involve string manipulation.
|
| I think the issue is the ODBC interface does not really map
| to what SQL is and does. The column binding sort of does. But
| not table, not where, not having, not sort, etc. So we end up
| re-inventing something that pastes over that. Building a
| string to feed into an API that feeds that over the network
| to a string parser on the other side to decompose it again
| then runs it and goes back the other way in a similar manner.
| phs2501 wrote:
| This is my middle-ground solution for Python:
| https://github.com/bdowning/sql-athame
|
| Still fundamentally manipulating SQL text (which is a feature
| as I don't want to learn a full DSL), but it handles
| wrangling embedded placeholders while you're composing stuff
| and some other common compositional tasks. It's worked well
| for me anyway but I'm under no illusions it'd be right for
| everyone.
|
| Not an original concept regardless; my original version of
| this was in Node: https://github.com/bdowning/sql-assassin,
| but a few years after I wrote that (and mostly didn't use it)
| I found https://github.com/gajus/slonik which was very
| similar and much more fleshed-out; I rolled _some_ of its
| concepts and patterns into sql-athame.
| forinti wrote:
| It would be easier to do that with a template engine.
| genericlemon24 wrote:
| I cover this solution here:
| https://death.andgravity.com/own-query-builder#jinjasql-
| sqlp...
| agent327 wrote:
| Something like C++'s {fmt}, specialized for SQL.
|
| fmt ("SELECT foo, bar FROM mytable WHERE id = {}", id);
|
| 'id' gets properly quoted, so that's not a concern anymore.
| joshuamorton wrote:
| This is simpler than what gp is asking for, they want the
| ability to dynamically change which columns are requested
| too, so change it to "foo, bar, baz", but still do so with
| reasonable safety.
| GLGirty wrote:
| > simulating SQL via your languages nestable data structures
|
| I've gone down this rabbit hole... you've described the in-
| house query engine I work with.
|
| Instead of sql, queries are nested json representing a
| dumbest-possible sql parse tree. It's only a subset of sql
| features, but it provides, as you point out, composability.
| By having the query as a data struct, all sorts of
| manipulations become possible... toggle fields for access
| control; recovery from dropped or renamed columns (schema has
| a changelog); auto-join related tables as required; etc.
|
| It's magic when it works but it suffers from the drawbacks
| noted elsewhere in the thread about traditional ORMS...
| onboarding new devs, debugging when something goes wrong.
|
| Adding new features to the dsl is a pain because the surface
| area is not just _the query_ but _all the queries it might
| form_ when composed with other queries in the system O_o
| nobody0 wrote:
| Your point is valid, it make sense just speak SQL to DB in some
| scenarios . A good ORM will enable people to do that easily as
| well.
|
| However, good orms like LINQ in C# really shines in a lot of
| places.
| teh_klev wrote:
| Wee nitpick, LINQ isn't an ORM, it's a querying API. Maybe
| you meant the original LINQ to SQL ORM which was deprecated
| in favour of Entity Framework (EF).
|
| > A good ORM will enable people to do that easily as well
|
| I agree. EF isn't a terrible ORM and covers many use cases
| developers encounter when interacting with a database server
| in your average line of business app. But you do need to keep
| an eye on the SQL it generates with more complex queries i.e.
| beware of cartesian hand grenades. Fortunately MS provides
| some guidance on this kind of thing, but sadly many
| developers often don't bother to read the docs, and this is
| why ORM's get a bad name:
|
| https://docs.microsoft.com/en-
| us/ef/core/performance/efficie...
| jrm4 wrote:
| This all day. I teach intro Web App stuff (meaning I've never
| built anything huge) and it's just painful to be like:
|
| Hey everyone, look at this language with this incredibly
| intuitive and simple syntax, perhaps the most "easy to read and
| understand" thing you'll see in computing...
|
| ...and now here's a bunch of painful stupid garbage that you'll
| have to learn about, because industry has accreted a bunch of
| crap around it to make it turn on and work.
| emidln wrote:
| I've written a lot raw SQL in the dialect my language supports.
| This is fine for static queries. When queries are dynamic
| (maybe they come from an admin panel or some other part with
| lots of optionality), static SQL isn't enough. You then get to
| do one of three things: ad-hoc string manipulation, rely on a
| query AST->compiler (like SQLAchemy's core or HoneySQL), or an
| ORM. ad-hoc string manipulation is a real security and
| reliability nightmare, and if you don't have a good standalone
| query AST to SQL compiler, an ORM is the next best thing.
| codetrotter wrote:
| > ad-hoc string manipulation is a real security and
| reliability nightmare
|
| Parameters.
|
| A couple of examples:
|
| https://www.psycopg.org/docs/usage.html#passing-
| parameters-t...
|
| https://www.php.net/manual/en/mysqli.quickstart.prepared-
| sta...
| jteppinette wrote:
| How do you parametrize things like dynamic joins, where
| clauses, field selections, or aggregations without string
| manipulation or gross duplication?
| emidln wrote:
| Parameters have zero bearing on whether you should
| dynamically construct SQL strings. If parameters can solve
| all of your problems, you don't have a dynamic SQL query,
| you have user-submitted values in a WHERE clause.
|
| What can be parameterized depends wildly on the SQL
| database in question. I haven't used one that could
| parameterize table names (for use in dynamic JOINs or CTEs)
| and many cannot parameterize values outside of a WHERE
| clause. Dynamically selecting which function to call,
| clauses to add or subtract, and sort orders are just a
| slice of places parameters don't help.
|
| In short, parameters alone do no eliminate the need for a
| query builder. A good query builder should appropriately
| parameterize values as the underlying database supports and
| hopefully uses a type system or validation to constrict the
| domain of values it uses to construct parts of the
| expression that cannot be parameterized.
| lowercased wrote:
| You can't order/sort by parameters.
| Cthulhu_ wrote:
| My main argument for ORMs is the mapping between SQL columns
| and your data model; having to write out SQL and code to load
| an SQL result set into your data model feels like a lot of
| duplication. And it gets a lot more difficult when you have
| more complicated data models spanning multiple database columns
| - that is, if your data model is like a document that has to be
| inserted and updated into a set of normalized database tables.
|
| That said, if that wasn't as much of an issue, or if I still
| had the patience and typing skills to write a lot of boring
| code, I wouldn't object too much. I'd also treat SQL as a first
| class citizen, so dedicated .sql files that the application
| will load or embed and use. That way, the SQL can be developed,
| maintained and run outside of the context of your application -
| editors can execute them directly to a (local) development
| database.
| vlunkr wrote:
| Agreed, lots of comments here saying that people don't want
| to learn SQL, I don't buy that argument. I think most people
| already know SQL because at some point it's inevitable, and
| it's also not that hard.
|
| The value of the ORM is what the name implies: mapping your
| data from rows into something convenient to use in your
| language.
|
| What people are really complaining about is ORMs that
| needlessly try to re-invent SQL, but they don't all do this.
| Someone else brought up Ecto, it's a brilliant ORM because
| the resulting SQL is always obvious.
| jakearmitage wrote:
| What if your SQL is your data model? I believe object-
| oriented design is forcing us to do a lot of duplication for
| very little gains, just like it did on game design and the
| industry started moving towards data-oriented design and
| entity-component system.
|
| The sooner we move away from the big lie that is DDD, the
| more productive we can be.
| [deleted]
| true_religion wrote:
| I think if sql is your data model, then it's a good case
| for a query builder.
|
| But at my company, our in house app is written using django
| but treats the django model asan interface to
| elasticsearch, Postgres, and redis.
|
| The django model is sub optimal for sql, but it's a good
| base for interacting with the system as a whole.
| BoxOfRain wrote:
| I feel a lot of arguments for ORMs boil down to "I can't really
| be arsed to learn SQL properly" which is fair enough from a
| developer time / performance ratio point of view, but not from
| a purely technical point of view.
|
| Like you say, I find there's way more mental friction dealing
| with new ORMs and wading through the documentation than there
| is in just going straight for raw SQL. There's less "magic"
| that way too which is always nice when you're debugging or
| optimising things. Different strokes for different folks I
| guess.
| huetius wrote:
| Ymmv, but I like the both/and approach. I've seen high
| quality codebases where the benefits of both are leveraged:
| ORM for simple queries, idiomatic code reuse; raw sql for
| complex or performance critical queries. Being smart about
| when to use which is the key, but I have seen it work very
| well.
| GVIrish wrote:
| This is what I think is the reasonable take on ORMs. If
| you're using one because you don't wanna learn SQL, that is
| a bad reason that will most likely result in more work in
| the long run.
|
| But if you're using it to increase productivity for
| straightforward database calls, it can be a useful tool in
| a lot of scenarios.
| jasode wrote:
| _> I feel a lot of arguments for ORMs boil down to "I can't
| really be arsed to learn SQL properly" _
|
| I learned SQL almost 30 years ago and used it in
| Oracle/Informix/Sybase/IBMDB2/MSSQL/MySQL/SQLite/etc and have
| written exam questions on SQL for outer-joins, nested
| subqueries, crosstab, etc.
|
| All that said, concatenating raw strings to form SQL that has
| no compile time type-checking is tedious and error prone.
| Thankfully, my "expert" knowledge of SQL does not blind me to
| the benefits of what a well-written ORM can do for
| productivity. (E.g. results caching, compile-time validation,
| auto-completion in the IDE, etc)
|
| The real technical reason why ORMs persist as a desirable
| feature is the decades-old technical concept of _" working
| memory of field variables"_ that mirror the disk file table
| structure.
|
| In the older languages like 1960s COBOL, and later
| 1980s/1990s 4GL type of business languages such as dBASE,
| PowerBuilder, SAP ABAP... they all have data tables/records
| _as a 1st-class language syntax without using any quoted
| strings_ :
|
| - In COBOL, you had RECORDS and later versions of COBOL had
| _embedded SQL_ without quoted strings
|
| - dBASE/Foxpro had SCATTER/GATHER which is sort of like a
| built-in ORM to pull disk-based data from dbf files into
| memory
|
| - SAP ABAP has built-in SQL (no quoted raw strings required)
| with no manual typing of variables or structs to pull data
| into working memory
|
| The issue with "general purpose" languages like
| Python/Javascript/C++ is that SQL is not 1st-class language
| syntax to read database data into memory variables and write
| them back out. The SCATTER/GATHER concept has to be _bolted-
| on_ as a library ... hence you get the reinvention of of the
| "db-->memory-->db" data read/write paradigm into Python via
| projects like SQLAlchemy.
|
| After tediously typing out hundreds of variable names or
| copy-pasting fields from SQL tables via strings such as ...
| _db.execquery( "SELECT cust_id, name, address FROM customer
| WHERE zipcode=12345;")_
|
| ... it finally dawns on some programmers that this is
| _mindless repetitive work_ when you have hundreds of tables.
| You end up writing a homemade ORM just to reduce the error-
| prone insanity.
|
| Yes, a lot of ORM projects are bad quality, with terrible
| performance, bugs, etc. But that it still doesn't change the
| fact that non-trivial projects with hundreds of tables and
| thousands of fields need _some type of abstraction_ to reduce
| the mindless boilerplate of raw SQL strings.
| chousuke wrote:
| I feel like there should be a distinction between full-
| blown ORMs and just query builders.
|
| I dislike ORMs that hide database operations from the
| application code by pretending that in-memory application
| objects are in concept interchangeable with persistable
| objects; though just saying that hurts a bit because I
| don't want to think in terms of objects, but in _data_.
|
| Query builders are fine if they allow you to build
| efficient queries in a type-safe and composable manner to
| get data in and out of the database in the form your
| application requires at the site of the database query, but
| I don't want to be forced to pretend I fetch "User" objects
| if all I really need from the database are the name and
| e-mail.
| [deleted]
| [deleted]
| luhn wrote:
| I feel the opposite, the arguments _against_ ORMs largely
| boil down to "I can't really be arsed to learn SQL
| properly." ORMs are a leaky abstraction by nature and
| attempting to use one without understanding the underlying
| technology will only lead to pain and a developer with grudge
| against ORMs.
|
| I think ORMs compliment SQL, not replace it. Without ORMs,
| you end up getting a lot of clunky boilerplate trying to do
| simple CRUD operations to your data. It's a bit like
| Greenspun's 10th law: "Any sufficiently advanced RDBMS-backed
| program contains an ad hoc, informally-specified, bug-ridden,
| slow implementation of half of an ORM."
| Emanation wrote:
| I built one because I wanted to generate all my SQL scripts
| from a graphql schema.
|
| Handjamming SQL where statements for arbitrarily deeply nested
| types is insanity.
| yasserf wrote:
| I agree with your point that adding multiple layers = more
| attack vectors and abstraction of a really good domain specific
| language. But what seems to happen on most of the projects I
| work on is we end up hiding away extremely common logic behind
| helper functions. It always starts off with SQL and then slowly
| gets moved into higher level functions that offer a better
| developer experience.
|
| Shameless plug, but I just posted a library I wrote (for node
| https://github.com/vramework/postgres-
| typed/blob/master/READ...) which pretty much is a tiny layer
| ontop of pg-node (which is query based / with value parameters)
| and provides small util functions with typescript support
| derived straight from postgres tables.
|
| In an ideal world (one I think we are getting very close to) I
| think we will end up having SQL queries validated in our code
| against the actual DB structure the same way we have any other
| compilation error. But until then we'll need to rely on tests
| or helper libraries, and for the purpose of refactoring and
| development I find the latter more enjoyable (although still
| far from perfect).
| throwaway858 wrote:
| > In an ideal world (one I think we are getting very close
| to) I think we will end up having SQL queries validated in
| our code against the actual DB structure the same way we have
| any other compilation error.
|
| This library for TypeScript works exactly like this
|
| https://github.com/MedFlyt/mfsqlchecker
| x-shadowban wrote:
| F# (and surely there are others) offers type providers, which
| can reach into your db schemas at compile time and typecheck
| linq-ish or even raw sql strings and the subsequent uses of
| the result set. But in the simplest case it makes compilation
| a function of not only source files but also db schema state.
| You can extract schema defs and point the type provider at
| the "cached" defs, but make sure rebuilding those is part of
| the build process.
| mcdonje wrote:
| >SQL queries validated in our code against the actual DB
| structure the same way we have any other compilation error
|
| Good point about validation. The ideal scenario you propose
| would indeed be ideal.
| zzzeek wrote:
| if you want to write one SQL query, you should not use a SQL
| builder and/or ORM.
|
| if you want to write thousands of SQL queries that naturally
| derive from an object model of 30, 100, or 800 domain classes
| or so and would otherwise be thousands of lines of repetitive
| boilerplate if all typed out individually, then you use a query
| builder and/or ORM.
|
| different tools for different jobs.
| jamamp wrote:
| I work for an enterprise in-house web application, where over
| time we have a ton of tables, like over several hundred
| unfortunately. Writing queries for each any every table and
| manually mapping those to objects sounds exhausting.
| EntityFramework helps a lot with that, for CRUD operations.
|
| Additionally, I think unit testing would benefit from an ORM as
| well. Instead of calling an actual database, you can easily
| have an in-memory database with an ORM with mocked data.
| Though, I suppose that is technically possibly with straight
| SQL queries as well, but I imagine it's easier with an ORM.
| zaybqp wrote:
| How would it be easier with ORM when switching to in-memory
| database for standard SQL is just changing connection string?
| furstenheim wrote:
| The main problem with in memory databases is that they are
| not the same database. I've had several issues which I had to
| find a fix that worked on one database and was legal syntax
| on the other. Say maybe one database allows force index but
| in memory does not.
|
| I've even seen one database layer for in memory and a
| completely different one for real database. That means that
| your unit tests are not really testing your production code
| zaybqp wrote:
| You can also make the same argument about ORM's mock DB
| layer.
| mikewarot wrote:
| I've built SQL query builders to get things done behind the
| scenes a few times. Users just love being able to search any
| field, and get their results.
| hownottowrite wrote:
| Or just use sqlalchemy...
| genericlemon24 wrote:
| I wrote an entire article about why I didn't do that:
| https://death.andgravity.com/own-query-builder#sqlalchemy-co...
|
| In general, SQLAlchemy is an excellent choice; in my specific
| case (single dev with limited time), the overhead would be too
| much (even with my previous experience with it).
| hownottowrite wrote:
| That's a great article and a good way to go.
| mynameismon wrote:
| A curious question: why not Python's built in Sqlite3 package
| [0]?
|
| [0] https://docs.python.org/3/library/sqlite3.html
| genericlemon24 wrote:
| I am using exactly that; the query builder goes on top of
| it :)
| chooseaname wrote:
| Exactly. Because nobody should ever use their dev skills to
| build up their understanding of a tool by recreating some
| aspect of it.
| hownottowrite wrote:
| Exactly. Why bother when you can just write another to-do
| list manager.
|
| Seriously, it's just a joke. Chill out.
| Asm2D wrote:
| This reminds me my older project called xql, for node.js:
| https://github.com/jsstuff/xql
|
| and fiddle: https://kobalicek.com/experiments/fiddle-xql.html
|
| Interesting how all these builders look the same :)
| thangalin wrote:
| Many SQL abstractions substitute SQL keywords (e.g., "SELECT")
| for a fluent interface function (e.g., "select"). The query
| builder and output tend to be language-specific.
|
| Several years ago, I prototyped a relational mapping language
| based on XPath expressions. The result is a database-, language-,
| and output-agnostic mechanism to generate structured documents
| from flat table structurse. Although the prototype uses XML,
| other formats are possible by changing the RXM library.
|
| The queries might be bidirectional. Given a document and the RXM
| query that generates the document, it may be possible to update
| the database using the same query that produced the document. I
| haven't tried, though.
|
| https://bitbucket.org/djarvis/rxm/src/master/
| globular-toast wrote:
| > query.SELECT('one').FROM('table')
|
| I don't really like this API. SQL is weird because it's written
| backwards. What does `query.SELECT("one")` on its own represent?
| A query on any table that happens to have a field called "one"? I
| know you're not trying to build an ORM, but
| `Table("table").select("one")` makes a lot more sense for an API
| since the object `Table("table")` actually has a purpose and
| would make sense to pass around etc.
| gpderetta wrote:
| also _select_ is really _project_ (and _where_ is _select_ ),
| but that's the battle for another day.
| genericlemon24 wrote:
| > What does `query.SELECT("one")` on its own represent?
|
| data['SELECT'].append('one')
|
| I'm intentionally trying to _not_ depart too much from SQL /
| list-of-strings-for-each-clause model, since I'd have to
| invent/learn another one.
|
| For a full-blown query builder, I agree
| Table("table").select("one") is better.
| blackbear_ wrote:
| I concur, despite the downvotes. This is also why the select is
| last LINQ to SQL [1] queries: var
| companyNameQuery = from cust in nw.Customers
| where cust.City == "London" select
| cust.CompanyName;
|
| [1] https://docs.microsoft.com/en-
| us/dotnet/framework/data/adone...
| zer0faith wrote:
| Please God no...don't do this.. This is exactly how security
| vulnerabilities are introduced. Validate input and use
| parametrized queries over something like this...
| genericlemon24 wrote:
| It's not either-or; I'm using the query builder to build
| _parametrized_ queries.
|
| Here are two examples:
|
| https://death.andgravity.com/query-builder-why#introspection
| https://github.com/lemon24/reader/blob/10ccabb9186f531da04db...
| greenie_beans wrote:
| this is a good guide for an intermediate-level python developer.
| thanks.
| genericlemon24 wrote:
| Glad you liked it!
| bluehark wrote:
| I highly recommend pypika by Kayak:
| https://github.com/kayak/pypika
|
| Have used in multiple projects and have found it's the right
| balance between ORMs and writing raw SQL. It's also easily
| extensible and takes care of the many edge cases and nuances of
| rolling your own SQL generator.
| orangepanda wrote:
| > and takes care of the many edge cases and nuances of rolling
| your own SQL generator
|
| Want to elaborate?
| mynameismon wrote:
| Ah, good ol' SQL injection attacks... If the author is reading,
| please do add a disclaimer mentioning that the code would be
| vulnerable to SQL injection attacks.
|
| From a purely learning point of view, it is indeed an interesting
| read, however.
| genericlemon24 wrote:
| (author here)
|
| Yup, good thing to note, thank you!
|
| I've kinda assumed everyone knows about the named parameters
| some bindings offer, but that's probably not true. Will add a
| disclaimer.
|
| In the previous articles, I do talk about how this is an
| upgrade from plain SQL, not from regular full-blown query
| builders / ORMs.
| quietbritishjim wrote:
| > I do talk about how this is an upgrade from plain SQL
|
| This statement is about functionality / ease of use, which is
| fairly orthogonal from (preventing) SQL injection: with plain
| SQL it's perfectly possible to avoid injection attacks, in
| fact that's probably the most common and easiest way to do
| it. In that sense, if anything this is a downgrade from
| regular SQL.
| genericlemon24 wrote:
| > In that sense, if anything this is a downgrade from
| regular SQL.
|
| It likely is.
|
| Ideally, instead of dynamically building queries, one would
| use stored procedures.
|
| I'm using this with SQLite, which doesn't have stored
| procedures, so it's an acceptable downgrade for me.
| Appending strings to lists gets messy quickly, example:
| https://death.andgravity.com/query-builder-why#preventing-
| co...
| em500 wrote:
| Definitely not true, I've encountered plenty of SQL users at
| work who never heard of injection attacks or parameterized
| queries. Some of them even built some ad-hoc query builders
| to replace some of their own repetitive queries. (Note that
| parameterized queries alone are not sufficient: often people
| would try to parameterize table or column names.)
___________________________________________________________________
(page generated 2021-08-27 23:02 UTC)