[HN Gopher] Anatomy of a SQL Engine
___________________________________________________________________
Anatomy of a SQL Engine
Author : ingve
Score : 152 points
Date : 2025-04-26 22:00 UTC (1 days ago)
(HTM) web link (www.dolthub.com)
(TXT) w3m dump (www.dolthub.com)
| jimbokun wrote:
| Very nice write up enumerating all the stages of SQL query
| execution. Interesting even if you don't care about the DoIt
| database specifically.
| Austizzle wrote:
| Man, this title tripped me up for a minute because I pronounce it
| with the letters like Ess-Queue-Ell
|
| So the "A" in "A ess-queue-ell" engine felt like it should have
| been an "An" until I realized it was meant to be pronounced like
| "sequel"
| perching_aix wrote:
| Not necessarily, I see native speakers completely ignore this a
| lot.
|
| Have you ever considered pronouncing it as squirrel by the way?
| kreetx wrote:
| Many (most?) non-native English speakers do pronounce it as
| ess-queue-ell, especially in their own languages, so yes, the
| use of "a" instead of "an" does look off from that perspective.
| SloopJon wrote:
| When I read _SQL for Dummies_ almost thirty years ago, it made
| a point of distinguishing "sequel" as a historical predecessor
| to standard "SQL." As I recall, the author even asserted that
| SQL is not an acronym/initialism for structured query language.
| I felt funny saying sequel for the next decade or so, because I
| wasn't an old timer experienced with this pre-SQL technology.
|
| Now I usually say sequel because everyone else does. That and
| it rolls off the tongue better than S-Q-L.
| jtolmar wrote:
| I prefer "ess queue ell" these days, but the first DBA I ever
| worked with pronounced it "squirrel".
| gopalv wrote:
| This is a great write up about a pull-style volcano SQL engine.
|
| The IR I've used is the Calcite implementation, this looks very
| concept adjacent enough that it makes sense on the first read.
|
| > tmp2/test-branch _> explain plan select count(_ ) from xy join
| uv on x = u;
|
| One of the helpful things we did was to build a graphviz dot
| export for the explains plans, which saved us days and years of
| work when trying to explain an optimization problem between the
| physical and logical layers.
|
| My version would end up displayed as SVG like this
|
| https://web.archive.org/web/20190724161156/http://people.apa...
|
| But the calcite logical plans also have that dot export modes.
|
| https://issues.apache.org/jira/browse/CALCITE-4197
| th0ma5 wrote:
| This is really great!!
| gavinray wrote:
| Calcite also has a relatively-unknown web tool for plan
| visualization that lets you step through execution.
|
| It's a method from "RuleMatchVisualizer":
|
| https://github.com/apache/calcite/blob/36f6dddd894b8b79edeb5...
|
| Here's a screenshot of what the webpage looks like, for anyone
| curious:
|
| https://github.com/GavinRay97/GraphQLCalcite/blob/92b18a850d...
| ignoreusernames wrote:
| I recommend anyone who works with databases to write a simple
| engine. It's a lot simpler than you may think and it's a great
| exercise. If using python, sqlglot
| (https://github.com/tobymao/sqlglot) let's you skip all the
| parsing and it even does some simple optimizations. From the
| parsed query tree it's pretty straightforward to build a logical
| plan and execute that. You can even use python's builtin ast
| module to convert sql expressions into python ones (so no need
| for a custom interpreter!)
| Abde-Notte wrote:
| Second this - building even a simple engine gives real insight
| into query planning and execution. Once parsing is handled, the
| core ideas are a lot more approachable than they seem.
| albert_e wrote:
| Sorry for slight digression.
|
| In a larger system we are building we need a text-to-sql
| capability for some structured data retrieval.
|
| Is there a way one could utilize this library (sqlglot) to
| build a multi-dialect sql generator -- that is not currently
| solved by directly relying on a LLM that is better at code
| generation in general?
| LtdJorge wrote:
| This is a SQL to X library, though. I don't think it's what
| you need.
| gavinray wrote:
| You can use an LLM to generate query-builder expressions from
| popular libraries in whatever language.
|
| For example, on the JVM there is jOOQ, which allows you to
| write something like: select(field("foo"),
| avg("bar")).from(table("todos"))
|
| And then it will render dialect-specific SQL. It has very
| advanced emulation functionality for things like JSON
| aggregations and working around quirks of dialects.
|
| Alternatively, you can ask an LLM to generate a specific
| dialect of SQL, and then use jOOQ to parse it to an AST, and
| then render it as a different dialect, like:
| val parser= DSL.using(SQLDialect.POSTGRES).parser()
| val parsedQuery = parser.parseQuery(postgresQuery)
| val renderedMySQL =
| DSL.using(SQLDialect.MYSQL).renderInlined(parsedQuery)
| println(renderedMySQL)
|
| Unsure if functionality like this exists in other Query
| Builder libraries for other languages.
| KyleBrandt wrote:
| Using dolthub's go-mysql-server for Grafana's upcoming SQL
| expressions feature (private preview in Grafana 12, but in the
| OSS version with a feature toggle).
|
| GMS lets you provide your own table and database implementations,
| so we use GMS to perform SQL queries against Grafana's dataframes
| - so users can join or manipulate different data source queires,
| but we don't have to insert the data into SQL to do this thanks
| to GMS.
___________________________________________________________________
(page generated 2025-04-27 23:01 UTC)