[HN Gopher] Show HN: SQLite query inside a Bash function
___________________________________________________________________
Show HN: SQLite query inside a Bash function
Part of the workflow for building my website is the generation of a
table in tab-separated column format (.tsv). The source data is
found in four other .tsv files. I use an SQLite query to perform a
4-way join and write out the new table. For convenience, I wrote a
script that encapsulates the query inside a Bash function. The
example below illustrates this technique.
repertoire() { pushd $CMM_SOURCES/_data sqlite3
<<EOS .headers on .mode tabs .import
category.tsv category .import composition.tsv composition
.import concert.tsv concert .import program.tsv program
.once repertoire.tsv SELECT category.name
AS category, composition.key,
composition.composer, composition.name AS composition,
concert.name AS concert FROM category,
concert, composition, program WHERE
julianday(concert.date) < julianday('now') AND
composition.category = category.name AND program.key =
composition.key AND program.date = concert.date
ORDER BY category.sequence, composition.key
; EOS popd }
Author : chmaynard
Score : 119 points
Date : 2021-07-07 15:28 UTC (7 hours ago)
| manx wrote:
| I recommend adding a little trick to ignore existing ~/.sqliterc
| files, which makes the script more portable:
| cat <<EOF | sqlite3 -init <(echo "") "$DBFILE" ...your
| sql... EOF
|
| For one-line queries I use: q() { echo "$@" |
| sqlite3 -init <(echo "") "$DBFILE"; }
| t0astbread wrote:
| The cat in the first script can be omitted I think (by using
| the heredoc on sqlite directly).
| tyingq wrote:
| Interesting. I noticed that "-init" doesn't error out if you
| give it a filename that can't be a file, like . or /. Using
| /dev/null seems to work also.
| sigmonsays wrote:
| If you're trying to put sql queries into a bash script, I beg the
| question to rewrite said script in a programming language. There
| are of course a bunch of other factors that would justify a
| rewrite so this isn't a hard rule.
|
| However, having written hundreds of lines of bash gluing systems
| together for various CI jobs and automations, at some point you
| need to stop writing bash and switch to something with a richer
| data model and easier to maintain/test.
| solotronics wrote:
| What language do you believe is better for DevOps / Linux
| systems engineering than Bash?
| cranekam wrote:
| Once you need concurrency or have anything but the most basic
| logic almost anything trumps bash. It's possible to implement
| basically anything in bash, of course, but that doesn't make
| it sensible. Its error handling is poor, there are too many
| ways to do things (eg [ vs [[ vs test), dealing with non-
| scalar data is a pain, it has subtle gotchas (accidentally
| spawning a subshell and shadowing variables), it's cryptic
| (${foo#bar} and friends are cool but much harder to remember
| than something like .replace()), and so on.
|
| If I'm just writing small tools that wire a few basic
| commands together bash is fine. For anything that actually
| needs data to be crunched or decent error handling Python,
| Go, Perl (though also cryptic), C++ are, IMO, a far better
| choice.
| atribecalledqst wrote:
| The cryptic nature of Bash syntax is why I generally try to
| do all my scripting in Python these days, if it exceeds a
| certain level of complexity. i.e. if I want to just group a
| bunch of shell commands together with no special logic,
| Bash is usually OK. For anything more complicated than an
| if statement or while loop I have to refer back to old
| scripts to remember the syntax. Not an ideal situation.
|
| I did recently have a case where I needed to run a shell
| command with a multi-line output, and an easy call to
| os.popen() wasn't returning the full output. So, back to
| Bash arrays and weird text parsing syntax (IFS=" " read
| tempArray...) I went...
| literallyaduck wrote:
| Powershell with its inconsistent returns, weird pipeline
| behavior at random, and bad default output settings is not
| better but like all crappy pop songs pushed by the music
| industry it will gain popularity.
|
| NodeJS and Python could be contenders but Bash has them beat
| in longevity.
| solotronics wrote:
| How is python better? If I want to say install 5 different
| yum packages and then do a kubernetes command the bash is
| pretty straightforward. With python I would have to do
| subprocess.call and then handle the exit codes and errors
| in python.
| literallyaduck wrote:
| Both Node and Python attempt an OS abstraction layer
| which in theory could be an advantage in practice it is a
| leaky abstraction. Both have many packages which could
| enhance scripting and even the opportunity for private
| repos. Installing from yum makes me believe you are using
| Fedora so cross platform might not be a concern for you.
| If running on windows is a concern Cygwin could be an
| opportunity. I'd say Bash, Node, Python, then PowerShell
| in order of advantage.
| 0xbadcafebee wrote:
| https://en.wikipedia.org/wiki/Begging_the_question
| _russross wrote:
| In English, meaning is defined by common usage. I know the
| formal logic hipsters have mounted a persistent campaign to
| make an exception here, but the fact remains that "begging
| the question" means exactly what ordinary people think it
| means.
| handrous wrote:
| This is one of the cases for which, when writing for a
| broad audience, it's best to avoid the phrase in question,
| in either of its senses--precisely to avoid confusion or
| distraction like what's happening in this thread, with the
| common definition, and to avoid confusing the uninitiated
| with its more "correct", but jargony, sense--but also to
| accept the common meaning in others' language without
| complaint (ahem).
| gxs wrote:
| While I agree with you in principle, I think it's more than
| permissible to make this distinction on this forum given
| the audience, IMO, even if the person who made the comment
| admittedly lacked tact in their approach.
| knorker wrote:
| I've spoken and written English for over three decades, and
| this is the first time I've seen "begging the question" be
| used to mean this.
|
| It's usually meant as "raises the question", but language
| lawyers insist it means what the wiki page said.
|
| But this is neither of those two. And if you're the only
| one who uses a phrase a certain way, and it means other
| things to everyone else, then that's actually wrong, even
| in a descriptive language.
| Cloudef wrote:
| Meanwhile the static sites i generate from cronjobs with bash
| have been rock-solid for years (jq and sqlite are awesome!),
| while the ones i did in python breaks on updates or cause other
| headaches (not just dynamic sites). Sure if its webapp, i'd do
| it in react or something.
|
| Bash/shell is great once you realize it's a glue language. You
| avoid doing anything but pipelines, and if theres no
| command/tool to do something, you can write that in other
| language and include it in the pipeline.
| spicybright wrote:
| But if OP isn't writing hundreds of lines of glue code and
| their solution works well for their use case, there's not much
| reason to re-write anything for scale that'll never happen.
| thomashabets2 wrote:
| In my opinion this is not just for scale. In my opinion
| anything more than a one-liner is too large of a scale for
| bash.
|
| More thoughts on this: https://blog.habets.se/2021/06/The-
| uselessness-of-bash.html
|
| Another piece of evidence is that I do many many code
| reviews. Not once have I reviewed bash script changes that
| did not have subtle bugs that would bite us in real life. Not
| once!
| mixedCase wrote:
| Doing all my scripting in Go because some people don't care
| to learn bash seems like a small version of hell.
|
| If I'm using bash, I just need to glue a few tools together
| and I've most likely decided already that concurrency more
| complex than simple wait(1p) is unnecessary, and that fine-
| grained error handling is unimportant.
|
| If I really cared about those things, I'd probably use
| something like Haskell that actually improves on error
| handling, unlike Go, or if I'm scripting something for a
| specific software project, whatever that project's language
| is.
| thomashabets2 wrote:
| It's not so much about them not knowing bash, as it takes
| MANY lines to do things correctly sometimes. Per the
| example in my blog post full of mkfifo and stuff.
|
| But sure, my choice is Go. You do haskell if it's best
| for you.
| mohanmca wrote:
| Thanks.
| rahimiali wrote:
| If you're sweating quoting issues in here-documents, you can put
| the sqlite query in a separate file and tell sqlite to execute
| that file instead.
|
| Here's a helpful example from SO:
| https://stackoverflow.com/a/21759264/711585
| dashezup wrote:
| True, I don't see the benefits of writing it in bash/shell
| script in this case. Plus you can get proper syntax
| highlighting for SQL in your text editor for " _.sql " file.
| sqlite3 database.db <query.sql
|
| Some other DBMS have similar things, for MariadDB you could
| write it in a "_.sql" file as well, setting variables and use
| it so that you can write a generic sql file. (seems you can't
| use the variable thing with sqlite3) MariaDB>
| SET @username='UserName@localhost'; MariaDB> SET
| @hostname='localhost'; MariaDB> SOURCE create-user.sql;
|
| Such things is usually cleaner/simpler than writing in bash.
| 1vuio0pswjnm7 wrote:
| Why not sh instead of bash. This is a script; it is not an
| interactive shell. sh is faster as a scripting language. Don't
| take my word for it. If you are using a UNIX-like OS such as BSD
| or GNU/Linux, sh has been chosen as the default scripting shell,
| not bash. Bash might be the default login shell (interactive use)
| but it is not the default scripting shell (non-interactive use).
|
| This is only one function. We cannot see the rest of the script.
| Maybe there is a compelling reason1 to to use pushd and popd
| instead of using sh features.
|
| 1. i.e., no way to achieve same result in sh
| AlphaSite wrote:
| Is comparability with sh really a worthwhile goal anymore?
|
| I haven't seen many places with sh but not full bash (busybox
| being the main exemptions but that breaks most things due to
| muslc anyway).
| 1vuio0pswjnm7 wrote:
| What OS are you using. It may use bash for login, but I would
| be willing to bet it does not use bash for scripting (non-
| interactive use).
|
| It is possible that many shell script authors do not
| understand the difference between a login shell (optimised
| for interactive use) and a scripting shell (optimised for
| non-interactive use). Thus the comparison could be
| worthwhile. sh is faster, more portable and arguably presents
| fewer potential gotchas. FWIW, shellschock affected bash but
| not sh.
|
| NetBSD uses ash-derived sh as both login and scripting shell.
| No bash. Debian and many, many other GNU/Linux distributions
| (including musl-based Void) all use NetBSD-derived sh called
| "dash" as the default scripting shell. I am not aware of any
| Berkeley distribution that uses bash as the default scripting
| shell.
| chmaynard wrote:
| FYI, here is the entire script:
|
| https://raw.githubusercontent.com/chmaynard/Sources/core/wor...
| chubot wrote:
| Yup, I do this all the time! :) I write Python programs that use
| sqlite and set up the DB state with bash.
| _benj wrote:
| this is really cool! Would you mind if I ask about "performance"?
| I'm working on a toy project with luajit and I can either use ffi
| and SQLite C API (which can be a little complicated) or just use
| the CLI and capture the output of a command as a string and parse
| that.
|
| As I'm writing this I'm thinking that I should just run o=some
| tests using both approaches, but, how's the performance for you?
| Any noticeable delays or something of the sort?
| chmaynard wrote:
| My guess is that the query itself takes the same amount of time
| either way, C API or CLI. There may be some I/O overhead when
| using sqlite3, but I doubt if it's anything to worry about. The
| beauty of using sqlite3 is that it hides the complexity of the
| C API, which is procedural.
| e12e wrote:
| If you don't intend to fill in/expand variables - you probably
| want to quote the HEREDOC tag,so rather than:
| sqlite3 <<EOS ... WHERE user = $(rm /etc/shadow)...
| EOS
|
| You might want: sqlite3 <<'EOS' ... WHERE
| user = $(rm /etc/shadow)... EOS
| remram wrote:
| You are not using any variables or anything from bash in that
| query. Wouldn't it make more sense to extract it to a .sql file
| and just run that, rather than your (rather long) heredoc?
| chmaynard wrote:
| Agreed. I like having the query _in situ_ during development
| because it simplifies editing. But there are disadvantages that
| you and others have identified.
| lxe wrote:
| You can encapsulate almost anything in a bash function via
| heredocs like that. The disadvantage is that editor features like
| indentation and syntax highlighting rarely work.
| tyingq wrote:
| The "here document" (<<EOS) is a nice workaround to keep from
| having to correctly quote a bunch of SQL syntax.
|
| I do agree, though, with some of the other comments that you will
| probably want a "real language" other than bash if this grows at
| all. Passing in string data for a where clause, for example,
| would get cumbersome and error prone.
|
| Python or Perl would be good choices if it goes there.
| chmaynard wrote:
| I wasn't aware that '<<' is a shell directive. That often
| happens when I mindlessly copy code snippets from somewhere
| else.
|
| https://tldp.org/LDP/abs/html/here-docs.html
| elwell wrote:
| Speaking of SQLite, this CoRecursive podcast a few days ago was
| enjoyable: https://corecursive.com/066-sqlite-with-richard-hipp/
| sgt wrote:
| Quite agree. And he seems like such an agreeable person too.
| Listening to Richard Hipp and his thought process felt familiar
| in the sense that he is the traditional old school programmer,
| and I know a few of those.
| _benj wrote:
| second this! I read the transcript and it was quite enjoyable
| indeed
| VWWHFSfQ wrote:
| neat
| [deleted]
| dinedal wrote:
| You can do this with https://github.com/dinedal/textql as well,
| and a little easier as you don't need to import the data
| explicitly. SQL_QUERY=<<-EOS SELECT
| category.name AS category, composition.key,
| composition.composer, composition.name AS composition,
| concert.name AS concert FROM category,
| concert, composition, program WHERE
| julianday(concert.date) < julianday('now') AND
| composition.category = category.name AND program.key =
| composition.key AND program.date = concert.date
| ORDER BY category.sequence, composition.key
| ; EOS textql --header --dlm=tab --sql $SQL_QUERY
| category.tsv composition.tsv concert.tsv program.tsv
| SahAssar wrote:
| Seems like textql is a wrapper around sqlite, I'm guessing
| staying with sqlite directly makes it easier to interact with
| other tooling.
|
| One nitpick with textql is that it says "sqlite import will not
| accept stdin, breaking unix pipes. textql will happily do so.",
| but that's not true, you just need to tell it to use stdin by
| doing: sqlite3 '.import /dev/stdin
| {youtablename}'
|
| I do this all the time for using gzip and sqlite to get
| compressed import with progress. For example filtering a
| compressed CSV from a SQL query with progress for import:
| pv icons.csv.gz | gzip -dc | sqlite3 ':memory:' -csv '.import
| /dev/stdin data' '.output result.csv' '.headers on' "SELECT
| name, keywords from data where name = 'chart-area'"
| tyingq wrote:
| Do keep in mind that /dev/stdin{out,err} aren't completely
| portable. Bash emulates them if they don't exist, but this
| usage bypasses Bash. So there are unix like platforms where
| this trick won't work. They do, though, seem to be supported
| on the ones that are most popular now (Linux, MacOS,
| FreeBSD).
| kingosticks wrote:
| Or with https://simonwillison.net/2021/Jun/19/sqlite-utils-
| memory/
| seppel wrote:
| Or with http://harelba.github.io/q/
| hnjst wrote:
| Or with https://csvkit.readthedocs.io/en/latest/
| zimpenfish wrote:
| +1 for csvkit because it is _extremely_ useful when
| dealing with people who love spreadsheets to the point of
| insanity.
| tkinom wrote:
| Personally, I like to see sqlite builds into the bash.
|
| For features such as: Continuous logging the
| bash commands, env, results, exec time in a sqlite database for
| security/auditing and regression testing. The
| db file can be kept opened, no need to spawn a separate process
| for each command logged.
|
| Anyone else interests in such approach. Anyone thinks of any
| potential issue for such integration?
|
| If there are enough interests, I can start a open source project
| for this.
| chmaynard wrote:
| Interesting idea. How would you add sqlite bindings to bash?
| Via a plugin?
___________________________________________________________________
(page generated 2021-07-07 23:01 UTC)