[HN Gopher] Better SQL JOINs
___________________________________________________________________
Better SQL JOINs
I think foreign keys could be valuable to improve how we write SQL
joins, in the special but common case when joining on columns that
exactly match a foreign key. The idea is to add a new ternary
operator, which would be allowed only in the FROM clause. It would
take three operands: 1) referencing_table_alias 2)
foreign_key_constraint_name 3) referenced_table_alias POSSIBLE
BENEFITS * Eliminate risk of joining on the wrong columns Although
probably an uncommon class of bugs, a join can be made on the wrong
columns, which could go undetected if the desired row is included
by coincidence, such as if the test environment might only contain
a single row in some table, and the join condition happened to be
always true. * Conciser syntax In a traditional join, you have to
explicitly state all columns for the referencing and referenced
table. This is somewhat addressed by the USING join form, but USING
has other drawbacks, why I tend to avoid it except for one-off
queries. When having to use fully-qualified table aliases, that
adds even further to the verboseness. * Makes abnormal joins stand
out If joining on something else than foreign key columns, or some
inequality expression, such joins will continue to be written in
the traditional way, and will therefore stand out and be more
visible, if all other foreign key-based joins are written using the
new syntax. When reading SQL queries, I think this would be a great
improvement, since the boring normal joins on foreign keys could be
given less attention, and focus could instead be made on making
sure you understand the more complex joins. SYNTAX Syntax is
hard, but here is a proposal to start the discussion:
from_item join_type from_item WITH
[referencing_table_alias]->[foreign_key_constraint_name] =
[referenced_table_alias] [ AS join_using_alias ] EXAMPLE To
experiment with the idea, I wanted to find some real-world queries
written by others, to see how such SQL queries would look like,
using traditional joins vs foreign key joins. I came up with the
idea of searching Github for "LEFT JOIN", since just searching for
"JOIN" would match a lot of non-SQL code as well. Here is one of
the first examples I found, a query below from the Grafana project
[1] [1] https://github.com/grafana/grafana/blob/main/pkg/services/a
ccesscontrol/database/resource_permissions.go
SELECT p.*, ? AS resource_id, ur.user_id AS
user_id, u.login AS user_login, u.email AS
user_email, tr.team_id AS team_id, t.name AS team,
t.email AS team_email, r.name as role_name FROM
permission p LEFT JOIN role r ON p.role_id = r.id
LEFT JOIN team_role tr ON r.id = tr.role_id LEFT JOIN team
t ON tr.team_id = t.id LEFT JOIN user_role ur ON r.id =
ur.role_id LEFT JOIN user u ON ur.user_id = u.id
WHERE p.id = ? Here is how the FROM clause could be
rewritten: FROM permission p LEFT JOIN
role r WITH p->permission_role_id_fkey = r LEFT JOIN
team_role tr WITH tr->team_role_role_id_fkey = r LEFT JOIN
team t WITH tr->team_role_team_id_fkey = t LEFT JOIN
user_role ur WITH ur->user_role_role_id_fkey = r LEFT JOIN
"user" u WITH ur->user_role_user_id_fkey = u WHERE p.id =
1; In PostgreSQL, the foreign keys could also be given
shorter names, since they only need to be unique per table and not
per namespace. I think a nice convention is to give the foreign
keys the same name as the referenced table, except if the same
table is referenced multiple times or is self-referenced.
Rewriting our example, using such naming convention for the foreign
keys: FROM permission p LEFT JOIN role r
WITH p->role = r LEFT JOIN team_role tr WITH tr->role = r
LEFT JOIN team t WITH tr->team = t LEFT JOIN user_role ur
WITH ur->role = r LEFT JOIN "user" u WITH ur->user = u
WHERE p.id = 1;
Author : JoelJacobson
Score : 19 points
Date : 2021-12-25 21:31 UTC (1 hours ago)
| magicalhippo wrote:
| While I haven't given it serious thought, I've been wishing for
| something like this SELECT u.name, r.role
| FROM users u JOIN user_roles r USING FOREIGN KEY
| WHERE u.id = :id
|
| Using this syntax one could optionally specify the foreign key
| name as well, in case it is ambiguous. It's more explicit than
| NATURAL JOIN and it feels very SQL-ish to me.
| mgradowski wrote:
| Related -- Postgres (other RDBMS's too, probably) has JOIN b
| USING (column, ...) and NATURAL JOIN. IIRC, both fail when
| there's ambiguity in column names.
| coolgeek wrote:
| > FROM permission p > LEFT JOIN role r WITH
| p->permission_role_id_fkey = r
|
| <snip>
|
| You're using the alias "r" here for two different things:
| - a table alias for the role table - a whatever (foreign
| key index object?) alias for p->permission_role_id_fkey
|
| > the special but common case when joining on columns that
| exactly match a foreign key
|
| I've been programming with RDBMSs since 1996. I'd say that
| approximately 99% of the thousands of JOINs I've written were
| based on PKs/FKs.
|
| The example that you are attempting to improve already operates
| on PKs/FKs.
|
| I don't understand the point of this proposed improvement at all
| zmmmmm wrote:
| Have always felt SQL is pointlessly complex for the
| overwhelmingly common case of inner joining a child table to
| parent via foreign key to primary key. Surely this could / should
| be made implicit somehow. I shouldn't even have to name the
| parent table, just let me select or reference parent table
| columns via indirection through the fk column should be enough.
|
| I often think half the ORMs in existence might not have been
| invented if SQL was just a bit more ergonomic.
| gregw2 wrote:
| WITH is a keyword already used for SQL common table
| expressions/subquery syntax. Perhaps use a distinct and different
| keyword... VIA perhaps?
| JoelJacobson wrote:
| VIA works for me, but I thought reusing an existing reserved
| keyword would be less invasive. I don't think WITH is currently
| allowed in the FROM clause, so think it should work. But if an
| entirely new keyword can be accepted, then I agree VIA is nice.
| [deleted]
| paozac wrote:
| Not sure about this, I wouldn't like having to look up the FK
| name every time or hope it was named following the convention.
|
| The first thing (among many others) that I would change in SQL is
| the position of the SELECT clause:
|
| FROM .. JOIN .. SELECT .. WHERE ..
|
| instead of
|
| SELECT .. FROM .. JOIN .. WHERE ..
|
| That would make the construction of many queries more natural.
| JoelJacobson wrote:
| Never thought about the order, but now when you mention it, I
| realize I always write SELECT FROM
|
| and start writing the FROM clause first.
|
| So if we could rewrite history, I agree the order you suggest
| would make more sense, but it's probably unrealistic to change
| it, except for entirely new SQL inspired languages.
|
| To comment on having to look up foreign keys. The idea I had in
| mind is to allow changing the default formatting of foreign key
| names, so you could figure out the name, except in special
| cases such as if there would be two foreign keys referencing
| the same table. For such cases you should explicitly name the
| foreign keys.
| emidln wrote:
| This is a good use case for an AST-based query compiler which
| would allow construction of clauses in arbitrary order. Using
| HoneySQL in Clojure, I frequently would write:
| (-> (from [:cities :c]) (join [:population :c] [:=
| :c.name :p.city-name]) (select :c.name :c.pizza-rank
| :p.population) (order-by [:c.pizza-rank :asc]))
|
| It's handy for a lot of reasons, but unless I had mostly static
| queries (changing just some where clause params), I would
| always seek out a AST library rather than attempt string
| building for a SQL use case.
___________________________________________________________________
(page generated 2021-12-25 23:01 UTC)