[HN Gopher] Show HN: Checksum.sh verify every install script
___________________________________________________________________
Show HN: Checksum.sh verify every install script
The pattern of downloading and executing installation scripts
without verifying them has bothered me for a while. I started
messing around with a way to verify the checksum of scripts before
I execute them. I've found it a really useful tool for installing
things like Rust or Deno. It's written entirely as a shell script,
and it's easy to read and understand what's happening. I hope it
may be useful to someone else!
Author : gavinuhma
Score : 65 points
Date : 2022-10-28 18:38 UTC (4 hours ago)
(HTM) web link (checksum.sh)
| dontbenebby wrote:
| >The pattern of downloading and executing installation scripts
| without verifying them has bothered me for a while.
|
| Thanks for sharing this work OP! I didn't see a license mentioned
| -- did you intend this to go into the public domain? I like how
| you set up a cool domain name and did some sick graphics, but I'm
| not sure how I can _legally_ use your code in the future.
|
| That being said, I appreciate the work you put into this project.
|
| I'm not going to list off specific examples, but MANY open source
| projects serve either PGP keys or hashes in the clear. Or they
| serve just hashes over HTTPS and now you have a trust issue.
|
| Or, in one case, my favorite -- they had lovingly listed out the
| MD5 sum for the program... but they served both that checksum,
| and the code itself... over HTTPS.
|
| Now, to be fair, HTTPS _does_ provide an integrity check, so
| there's a benefit beyond privacy or whatever but... this is a
| RAMPANT problem in the open source community.
|
| I ran into it mostly when trying to find esoteric security tools
| when I was attempting OSCP and interviewing around for
| penetration testing roles.
|
| I got the sense rapidly shifting from "I was so scared of the
| CFAA I did an entire master's thesis on the design of censorship
| circumvention tools" to "Oh gee, I used to be such a narcissist,
| demanding a high falutin salary when I couldn't even fire up
| Metasploit to wipe a server."
|
| (The implication being that some folks abused their access when
| my powers were weak, and now, in time for spooky season, it's
| time to lean in to letting people take whatever drug they want if
| they feel scared -- reality scares me too some days.)
| gavinuhma wrote:
| Good catch. Let me add a license
| dontbenebby wrote:
| Thanks, it wasn't meant in a gotcha way.
| gavinuhma wrote:
| I totally just forgot to add one. Added MIT just now.
| Appreciate it!
| orf wrote:
| I feel like bash/sh should have this built in
| dundarious wrote:
| There are two big problems with the use of `echo $s` in
| bash/POSIX sh:
|
| 1. Never use echo to output untrusted content as the first
| argument
|
| Let's say `s='-e 1\n2'`, then `echo $s` will output:
|
| > 1
|
| > 2
|
| Instead of:
|
| > -e 1\n2
|
| Always use printf if you want to start output with untrusted
| content, e.g., `printf %s\\\n "$s"`.
|
| 2. Never use unquoted variable expansion when trying to exactly
| reproduce contents of the variable
|
| Similarly, unquoted variable expansion re-tokenizes the contents
| and will not preserve spaces appropriately. Say
| `s='"a<space><space>b"'` (where each <space> is a literal ' ', HN
| seems to be collapsing 2 spaces down to 1), then `echo $s` will
| output:
|
| > "a<space>b"
|
| Instead of:
|
| > "a<space><space>b"
|
| You can get the latter with `echo "$s"` but use `printf %s\\\n
| "$s"` to fix both issues.
|
| PS: If you fail to use quoted expansion with printf, for example
| like so, `printf %s\\\n $s`, then you'll notice the problem right
| away, as it will effectively turn that into `for i in $s ; do
| printf %s\\\n "$i" ; done`. That's actually a very useful feature
| of printf if you know to use it.
|
| Edit: These problems exist for bash/POSIX sh at least. Perhaps
| you're using a shell that works differently, like zsh, because
| otherwise issue 2 would probably have led to some checksum fails
| for you already.
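A minimal sketch of the two pitfalls dundarious describes, runnable in any POSIX sh. The helper names `emit` and `one_per_line` are made up for illustration; they are not part of checksum.sh.

```sh
#!/bin/sh
# emit: print one argument exactly as given, followed by a newline.
# printf never treats the data as an option, and %s does not expand
# backslash escapes inside the argument.
emit() {
    printf '%s\n' "$1"
}

# one_per_line: printf reuses its format for every remaining argument,
# so each argument lands on its own line.
one_per_line() {
    printf '%s\n' "$@"
}

s='-e 1\n2'
emit "$s"            # the literal text, including the leading -e and the \n

s='"a  b"'           # two spaces between a and b
emit "$s"            # quoted expansion keeps both spaces

one_per_line $s      # unquoted on purpose: the shell splits $s into words first
```

The last call shows the "feature" mentioned in the PS: once the shell has split the unquoted value, printf applies the format once per word.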
| googlryas wrote:
| Great post, you are wise in the ways of the shell. Minutiae
| like this are exactly why I stop writing shell scripts the
| moment I start, and reach for python or some other sane
| language. But, I can't help but respect when I see masters of
| sh work their magic.
| dundarious wrote:
| Honestly, 90% of problems with scripts are people forgetting
| to put double quotes around stuff. The other stuff doesn't
| come up that much, and once you write a few decent scripts,
| the other stuff is as easy as noticing someone wrote `open =
| True` in Python, not realizing they've redefined a builtin
| function, and the fix is just to do `is_open = True`.
|
| So just put double quotes around all your variable expansions
| unless you know you shouldn't -- 90% of scripts would be
| "fixed" with just that. And don't bother putting curly braces
| into the variable expansion unless you know you need to.
| People tend to think `echo ${s}` is somehow better than `echo
| $s` when it's exactly the same -- the curly braces are just a
| way to allow you to, e.g., write `"${s}_"` as distinct from
| `"${s_}"`. AFAIK in fish `${s}` is identical to `"$s"`, but
| that's a different kettle of sh.
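To make the quoting and brace points concrete, here is a small sketch. `count_args` and the variable values are invented for the demonstration, and the default `IFS` is assumed.

```sh
#!/bin/sh
# count_args: report how many arguments the shell actually passed.
count_args() {
    printf '%s\n' "$#"
}

v='a  b'
unquoted=$(count_args $v)     # word splitting turns $v into separate arguments
quoted=$(count_args "$v")     # double quotes keep it as a single argument

s='install'
s_='other'
suffixed="${s}_"              # value of s followed by a literal underscore
different="${s_}"             # value of the separate variable s_
plain="$s"                    # no braces needed; identical to "${s}"

printf '%s %s %s %s %s\n' "$unquoted" "$quoted" "$suffixed" "$different" "$plain"
```

The braces only matter where the variable name would otherwise run into the following characters; the double quotes are what prevent re-tokenizing.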
| rnhmjoj wrote:
| For more caveats like this one I recommend reading:
| https://www.etalabs.net/sh_tricks.html
| gavinuhma wrote:
| This is awesome. Thank you! I've been through so many
| iterations but it's been fun to improve
| gavinuhma wrote:
| Like this? https://github.com/gavinuhma/checksum.sh/pull/2
| dundarious wrote:
| Missed the other `echo $s` piped into shasum. But I echo the
| sentiment of another commenter that I'd rather rely on
| `shasum --check` to give the OK or not.
| gavinuhma wrote:
| Got it. Thanks.
|
| Re --check, I suppose the way to do that would be to
| download the file to disk, which --check requires as far
| as I can tell. So I could download the file to disk,
| --check, and then remove it. I think most of these install
| scripts are trying not to leave any artifacts around from
| install, other than the resulting binary.
| dundarious wrote:
| You only need to create a temp file for the checksum
| file, not the downloaded contents. In the below example,
| no file exists on disk with the contents of `$s`.
|
| > $ s='1<space><space>2'
|
| > $ printf %s\\\n "$s" | shasum -a 256 > tmp.sum
|
| > $ printf %s\\\n "$s" | shasum --check tmp.sum
|
| > -: OK
|
| So you can just `printf '%s<space><space>-\n' "$c" >
| tmp.sum` and check with `printf %s\\\n "$s" | shasum
| --check --status tmp.sum || { echo "checksum failed" >&2
| ; exit 1 ; }`
|
| Having to create temp files is a wrinkle (could probably
| avoid it by using process substitution if you want to
| give up on POSIX sh), but so is writing bash scripts in
| general.
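The temp-file approach above could be wrapped up roughly like this. It is a sketch under assumptions: `verify_stdin` is a hypothetical name, `sha256sum` from GNU coreutils stands in for `shasum -a 256`, and `mktemp` is assumed to be available even though POSIX does not require it.

```sh
#!/bin/sh
# verify_stdin EXPECTED_SHA256
# Reads content on stdin and succeeds only if its SHA-256 matches.
# Only the one-line checksum file touches disk; the content never does.
verify_stdin() {
    sumfile=$(mktemp) || return 1
    # "<hash>  -" is the checksum-file line that refers to standard input.
    printf '%s  -\n' "$1" > "$sumfile"
    sha256sum --check --status "$sumfile"
    rc=$?
    rm -f "$sumfile"
    return "$rc"
}
```

Something like `printf %s\\\n "$s" | verify_stdin "$c" || exit 1` would then gate execution, where `$s` and `$c` are the script text and expected checksum from the example above.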
| gavinuhma wrote:
| Solid! I couldn't figure this out, which is why I stopped
| using "--check". I'll take a look
| yjftsjthsd-h wrote:
| If I may pile on with a general suggestion for people writing
| shell scripts: Use shellcheck. Always. It will catch these
| things automatically for you:)
| throwawaaarrgh wrote:
| If we kept a mirrored or distributed decentralized network of
| just cryptographic hashes, that might solve a huge number of
| problems around distributing files securely.
| ithkuil wrote:
| Awesome. I made something similar in
| https://github.com/mkmik/runck
|
| But I didn't buy a fancy domain name :-)
| gavinuhma wrote:
| Haha thanks! Honestly when I saw the domain was available it
| motivated me to finish the project and share it
| thewataccount wrote:
| Serious question - What is the benefit of verifying a hash? Are
| we really worried about file integrity? Why don't people use GPG?
|
| The hash only verifies file integrity, and that the content of
| the url doesn't switch the script later. But keep in mind in most
| scenarios, an attacker would also just change the hash listed
| too (they're usually on the same website). This only mitigates
| one very specific attack.
|
| Why don't we use GPG here? That way we can verify ownership and
| file integrity with at minimum TOFU, plus optional manual
| verification? If we're going through the work of adding a wrapper
| and all that, we may as well, no?
|
| This has the benefit that you only need to import the owner's
| cert once, all future changes have the same cert. Where hashes
| are obviously different every time, you have to trust the source
| of the hash every time it changes. With GPG at the very least you
| have TOFU with certs - and very best can have better assurance of
| the initial download too.
|
| EDIT: Just want to clarify - I'm openly asking why the "developer
| community" is going the direction of hashes for script
| verification vs GPG signatures.
|
| I don't mean to diminish your project, your project looks fun,
| and does make verifying hashes easier :)
| [deleted]
| tomrod wrote:
| I'm not terribly deep in this space. What is the conceptual
| difference of hash vs GPG sig?
| atoav wrote:
| A hash is the same when the values of the content are the
| same. But when you get a new (maliciously hacked) install
| script chances are that you won't have an old hash lying
| around to check whether the script changed. Any attacker who
| could swap the script could also swap the hash, unless it is
| a different channel.
|
| With GPG the developer has a key pair (one private, one
| public). They can then sign all their scripts with their
| private key and publish the public one wherever. You can then
| take that public key and verify that the script has been
| indeed signed by the developer's private key.
| thewataccount wrote:
| Admittedly, the complexity is likely the main reason GPG isn't
| more commonplace.
|
| This is the overview:
|
| Developer generates a private/public key they use for all of
| their projects.
|
| You import their public key once - you can verify this from
| their github, twitter, etc but that's optional.
|
| They can sign a file with their key. You can check this
| signature against their public key. This will guarantee the
| file was signed by using that key and is unmodified.
|
| If someone hijacks the website after this point and signs the
| new downloads with their own key - then you will be able to
| see it's invalid.
|
| If you manually verify the key then you'll know your initial
| download is valid - if you trust on first use then you at
| least know all future files signed from that developer with
| that cert are valid.
|
| They also are effectively a hash for file integrity.
|
| tl;dr - hashes tell you if a file is changed. Signatures tell
| you if the file is changed, and who the person that made the
| file is.
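For comparison with a bare hash check, the signature flow in this overview might look like the following sketch. `verify_signed`, `author.asc`, `install.sh`, and `install.sh.sig` are placeholder names, and the script assumes GnuPG is installed and that the author publishes a detached signature.

```sh
#!/bin/sh
# verify_signed FILE SIGFILE
# Succeeds only if SIGFILE is a good detached signature over FILE from a
# key already in the local keyring. An unknown key or a modified file
# makes gpg exit non-zero.
verify_signed() {
    gpg --batch --verify "$2" "$1" 2>/dev/null
}

# One-time step (trust on first use, or after checking the fingerprint
# through another channel):
#   gpg --import author.asc
#
# Every later download:
#   verify_signed install.sh install.sh.sig && sh install.sh
```

Unlike a published hash, nothing here has to be re-fetched and re-trusted when the script changes; only the signature file is new.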
| Jarwain wrote:
| Hash essentially proves that the file you downloaded is the
| same as the file that was uploaded. It tells you nothing
| about Who uploaded the file. An attacker could make you
| download their own file, but then the hash of the file won't
| match what's published (unless the attacker changes the
| published hash).
|
| A GPG sig proves that the file was signed & uploaded by the
| author, which de facto doubles as proof that it's the same
| file. The idea here is that the author uploads their public
| key, signs the package with their private key, and now
| there's an association between the package and the author. An
| attacker would have to obtain the author's private key, or
| replace the public key with their own. Changing the public
| key, however, is a big red flag.
| pvg wrote:
| Because for all of its problems, Web PKI is a working,
| practical, large scale system of verification and GPG isn't -
| you don't get much by trying to replicate what your web browser
| and CAs do for you but clunkier.
| XCSme wrote:
| > would also just change the hash listed too
|
| In my project I "host" the hash on a different medium, so in
| order to compromise the file download the attacker would have
| to compromise both the file hosting server and the hash hosting
| medium (which in my case is GitHub).
|
| I also don't really display the hashes, as the download only
| happens when the script is updated, so your current version of
| the script will check the hash on GitHub vs the hash of the
| file download from the file hosting server.
|
| EDIT: To be clear, this doesn't solve the problem with the
| initial install and it is also not related to the Checksum.sh
| script.
| thewataccount wrote:
| Interesting idea,
|
| Does the script get the new version url&expected hash from
| the website alone? Or does it get the expected hash from the
| website, then calculate the URL from github?
|
| Basically I'm wondering if that prevents just needing to
| attack the website - if the url to download the update and
| the expected hash are in the same place then it's still a
| single point of failure.
| XCSme wrote:
| The latest file download URL is always the same /latest,
| hosted on my server.
|
| The version number and latest file hash are also fixed
| URLs, stored on GitHub.
|
| So for an update, the script checks GitHub for latest
| version number, if newer it downloads the latest version
| from my server, computes the hash and compares it to the
| hash stored on the fixed GitHub URL before proceeding.
|
| I think there's no way to replace the file with a malicious
| one that will be distributed to the users unless you get
| access to both my server and the GitHub repository.
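A rough sketch of the update check described here, not XCSme's actual code. `fetch` and `update_from` are hypothetical helpers, the hash endpoint is assumed to return only a hex SHA-256 digest, and `curl`, `sha256sum`, and `mktemp` are assumed to be installed.

```sh
#!/bin/sh
# fetch URL: write the body of URL to stdout.
fetch() {
    curl -fsSL "$1"
}

# update_from FILE_URL HASH_URL DEST
# Downloads FILE_URL, then installs it at DEST only if its SHA-256 equals
# the digest published at HASH_URL (ideally on a different host or service).
update_from() {
    tmp=$(mktemp) || return 1
    if fetch "$1" > "$tmp" &&
       expected=$(fetch "$2" | tr -d '[:space:]') &&
       actual=$(sha256sum < "$tmp" | cut -d' ' -f1) &&
       [ -n "$expected" ] && [ "$actual" = "$expected" ]
    then
        mv "$tmp" "$3"
    else
        rm -f "$tmp"
        echo "update rejected: hash mismatch or download failure" >&2
        return 1
    fi
}
```

The point of the design is that the two URLs live on different services, so swapping the file alone, or the digest alone, makes the comparison fail.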
| thewataccount wrote:
| Yeah I think that should work.
|
| It does have the downside still that changes to the
| website/github might break future updates in a way that
| isn't (easily) verifiable.
|
| While this is a solution personally I still like the idea
| of GPG more since it'll work for any new files, works for
| your new projects automagically, etc.
|
| But I think you did at least fix the future update
| problem with auto-updates, which is a lot more work than
| most people put into it so thank you for addressing the
| issue!
| koolba wrote:
| Just remember that any script that fetches anything else remotely
| would still pass the checksum as only the initial script is
| checked.
| ChadNauseam wrote:
| Yep. As an example, rustup happens to be in this category as
| the checksums for rustc, cargo, etc. aren't checked.
| gavinuhma wrote:
| It's really interesting. There should be a massive ledger of
| checksums for software
| jandrese wrote:
| It's called apt. Or dnf. Or most any package manager.
| Having a gigantic general list runs into the problem of how
| do you update it and how do you verify the updates?
| yjftsjthsd-h wrote:
| You use GPG and trust the people publishing things, who
| sign the artifact that you actually download. Which is how
| every package manager I've seen works internally, anyways.
| jandrese wrote:
| It's the age old root of trust problem. In practice, the
| good-enough answer is that if it passes SSL/TLS authentication on
| the official domain then we wouldn't be able to stop an injection
| attack either way. Validating against the source is no good if
| it is the source that is compromised.
|
| That's also kind of the issue with a lot of these shell
| injection attacks. Sure someone could insert environment
| variables or other shenanigans to take over your machine, but
| if they have that much control over your shell there are
| countless other ways they could also do it. Guarding against
| this one particular case doesn't buy you much.
| gavinuhma wrote:
| Definitely. Important to note. There is a long long supply
| chain
| neeh0 wrote:
| I wrote hundreds of those checks in scripts, makefiles, CI and
| whatever else. After I found Nix (and NixOS) it's ridiculous not
| to use it. Use it.
| gavinuhma wrote:
| I hadn't heard of NixOS. Super cool
| NovemberWhiskey wrote:
| I don't know; what's the threat model here?
|
| If the script is deliberately malicious as originally published,
| then the publisher will provide a valid checksum; so it doesn't
| help.
|
| If the script source is subverted by an attacker, then it only
| helps if the attacker doesn't also have the means to change the
| published checksum too.
|
| If an attacker can modify the site which publishes the URL for
| the script and the checksum, they can modify both at the same
| time.
| nerdponx wrote:
| Why not use the -c option? Especially if you're using Bash or Zsh
| which have "here-strings":
|
|     checksum() {
|         hash="$1"
|         file="$2"
|         sha256sum -c <<< "${hash}  ${file}"
|     }
|
| Or if you need to use a POSIX-ish shell:
|
|     checksum() {
|         hash="$1"
|         file="$2"
|         printf '%s  %s' "$hash" "$file" | sha256sum -c
|     }
|
| Of course you can add a `--binary` option (uses '%s *%s' instead
| of '%s  %s'), options to use different hash functions, etc.
|
| I also think it's weird to use `alias` inside a function, instead
| of just using a parameter to store the name of the program to
| execute.
| gavinuhma wrote:
| Great point on alias, thanks. I think that was a relic of an
| older iteration.
|
| I'll work through these suggestions. Appreciate it. Feel free
| to send a PR if you want.
|
| For the here string I think that won't work because the file
| isn't being saved locally, it's just being piped (so $2 is a
| URL). I can't do the usual `shasum -c <<<
| "132e320edb0027470bfd836af8dadf174e4fee00 install.sh"`, which
| takes a local filename but not the file content. As far as I
| could tell anyway. I'll try it some more
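One way to keep the pipe-only flow gavinuhma describes while still checking the exact bytes is to buffer the stream briefly and release it only on a match. This is a sketch, not the checksum.sh implementation: `verified_cat` is a hypothetical name, and `sha256sum` and `mktemp` are assumed to be available.

```sh
#!/bin/sh
# verified_cat EXPECTED_SHA256
# Buffers stdin in a temp file, and copies it to stdout only if its
# SHA-256 matches. On a mismatch nothing is written to stdout, so a
# downstream `| sh` receives empty input and runs nothing.
verified_cat() {
    tmp=$(mktemp) || return 1
    cat > "$tmp"
    actual=$(sha256sum < "$tmp" | cut -d' ' -f1)
    if [ "$actual" = "$1" ]; then
        cat "$tmp"
        rc=0
    else
        echo "checksum mismatch: got $actual" >&2
        rc=1
    fi
    rm -f "$tmp"
    return "$rc"
}

# Intended use, with $url and $hash supplied by the caller:
#   curl -fsSL "$url" | verified_cat "$hash" | sh
```

Buffering in a file rather than a shell variable avoids the trailing-newline stripping that command substitution performs, so the digest is computed over exactly what was downloaded. The temp file is removed before the function returns.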
___________________________________________________________________
(page generated 2022-10-28 23:01 UTC)