[HN Gopher] A lesson in dockerizing shell scripts
___________________________________________________________________
A lesson in dockerizing shell scripts
Author : bhupesh
Score : 150 points
Date : 2024-02-03 14:10 UTC (8 hours ago)
(HTM) web link (bhupesh.me)
(TXT) w3m dump (bhupesh.me)
| gbN025tt2Z1E2E4 wrote:
| I can appreciate the work to shrink the image, but copying the
| various standardized CLI tools and related library files into the
| image versus installing them with APK can introduce _many_
| compatibility challenges down the road as new base Alpine
| versions are released which can be difficult to detect if they
| don't immediately generate total build errors. Using static
| binary versions of the various CLI tools would be a better
| approach here, though that inevitably means larger binaries
| to begin with, again ballooning the docker image size... all
| for a minimal gain of 14MB overall, which is not worth it for
| a production build unless you're working in the most minimal
| of minimal embedded OS environments. Even that case is
| arguably negated by including FZF -and- findutils, since
| there is so much duplication in functionality between the two
| tools already.
|
| Overall this approach results in an image so fragile I would
| never use the resulting product in a high-priority production
| environment or even just my local dev environment as I want to
| code in it, not have to fix numerous compatibility issues in my
| tools all over 14MB of space.
| bhupesh wrote:
| Author here
|
| > copying the various standardized CLI tools and related
| library files into the image versus installing them with APK
| can introduce _many_ compatibility challenges down the road as
| new base Alpine versions are released which can be difficult to
| detect if they don't immediately generate total build errors
|
| I'm maybe missing some context here. Are you saying that the
| default location of these binaries can change (the ones that
| get copied directly)? Or is it about the shared libraries
| getting updated and the tools depending on these libraries will
| eventually break?
| mrweasel wrote:
| > so you are saying that the default location of these
| binaries can change
|
| They could, Debian is in the process of unifying the bin
| directories, see: https://wiki.debian.org/UsrMerge
|
| Realistically it's not much of an issue.
|
| Given that you start out with a 31.4 MB image, I honestly
| don't think the introduced complexity in your build is worth
| it. It's a good lesson for people who don't know about build
| images and ship an entire build pipeline in their Docker
| image, but for a bash script and a <50 MB image the
| complexity is a bit weird.
| bhupesh wrote:
| Oh, wasn't aware of UsrMerge, thanks for sharing.
| rst wrote:
| Can't necessarily speak for the author, but here's one thing
| that can happen:
|
| If the underlying system has a newer version of git than the
| one freeze-dried into your container, repositories managed
| there by native-git might be in a new format which container-
| git can't handle. (There might be some new, spiffier way of
| handling packs, for instance, or they might have finally
| managed to upgrade the hash function.) And similar issues
| potentially arise for everything else you're packaging.
| gbN025tt2Z1E2E4 wrote:
| COPY --from=ugit-ops /usr/lib/libpcre* /usr/lib/
| COPY --from=ugit-ops /usr/lib/libreadline* /usr/lib/
| COPY --from=ugit-ops /lib/libc.musl-* /lib/
| COPY --from=ugit-ops /lib/ld-musl-* /lib/
|
| No, what I'm saying is that to get your copied versions of
| the various CLI tools your shell script needs, you're
| blanket copying fully different versions of common library
| files into operating system lib folders as shown above. That
| can break OS lib symlinks and/or wholly overwrite the
| _current_ OS lib files Alpine uses, now or in future
| releases, potentially destroying OS lib dependencies. The
| same goes for copying bash, tr, git, and other binaries to
| OS bin folders. No No NO!
|
| That is _insanely_ shortsighted. There's a safe way to do
| that and then there is the way you did it. If you want to
| learn to do it right and are deadset against static binary
| versions of those tools for the sake of file size, look at
| how Exodus does it so that they don't destroy OS bin folders
| and library dependency files in the process of making a
| binary able to be moved from one OS to another.
|
| Exodus: https://github.com/intoli/exodus
|
| This is why I'm saying your resulting docker image is
| incredibly fragile and something I would never depend on
| long-term as it's almost guaranteed to crash and burn as
| Alpine OS upgrades OS bins and lib dependency files in the
| future. That it works now in this version is an aberration at
| best and in reality, there probably are things that are
| broken in Alpine OS that you aren't even aware of because you
| may not be using the functionality you broke _yet_.
|
| OS package managers handle dependencies for a reason.
| parhamn wrote:
| > That is _insanely_ shortsighted.
|
| Relax. While I wouldn't recommend OP's approach either,
| you're not particularly right here.
|
| Exodus clearly states:
|
| > Exodus is a tool that makes it easy to successfully
| relocate Linux ELF binaries from one system to another...
| Server-oriented distributions tend to have more limited and
| outdated packages than desktop distributions, so it's
| fairly common that one might have a piece of software
| installed on their laptop that they can't easily install on
| a remote machine.
|
| Exodus is specifically designed for moving between
| different systems.
|
| He is largely moving between the same base image. In the
| article the base layer is `alpine:3.18` and the target image
| is `alpine:3.18`, and in the latter part of the article
| `scratch` (less to zero conflict surface). One would assume
| those two would be coupled.
|
| There are other technical merits to not doing what he's
| doing but you haven't listed any and dismissed his work.
| I'd venture if you actually knew what you're talking about
| you'd have better things to add to this conversation than
| "OS package managers handle dependencies for a reason."
|
| Perhaps next time give some feedback that would help the
| writer get closer to a well-working Exodus-like solution.
| It's Hacker News; "don't roll your own" discouragement
| should be frowned upon.
| gbN025tt2Z1E2E4 wrote:
| We see it differently. Exodus is useful in this capacity
| as much as any other for preventing overwriting, similar
| base OS image or not.
| TJSomething wrote:
| Overwriting what? The destination's a completely empty
| root.
| benreesman wrote:
| Or just write a clean specification and get a docker image close
| to optimal, and if it's not, you can prove cryptographically if
| by some chance you beat the defaults:
|
| https://xeiaso.net/blog/i-was-wrong-about-nix-2020-02-10/
|
| I've got plenty of gripes with nixlang, but being worse than
| Dockerfile-lang isn't one of them.
| Cu3PO42 wrote:
| Yes, you can use Nix to get extremely small Docker images. I
| have personally used it to that effect, but it's not a magic
| bullet. In this specific case, it gives pretty bad results
| even. I have written the simplest possible Nix derivation for
| ugit and the resulting Docker image is 158MB gzipped. I haven't
| explored fully why that is, but that's much worse than even the
| first effort from the OP.
| politelemon wrote:
| Thanks for sharing this. I like what the author did, they pursued
| a goal and kept working at it, until they found a balancing
| point.
|
| I think my experience in similar pursuits would have led me to
| stop very early on - 31.4 MB is already pretty good, to be fair.
| Looking at the amount of potential maintenance required in the
| future, for example if the original ugit tool starts to need more
| dependencies which then have to be wrangled and inspected, makes
| me think that the size I didn't reduce is worth the tradeoff,
| since the dependencies can be managed with package managers
| without having to think too much, and, as the author says,
| Linux is pretty awesome about these things already.
| SOLAR_FIELDS wrote:
| It always depends on your use case but yeah, in the world of
| docker images, 30 MB often feels like nothing, because gigabyte
| plus sizes are not at all out of the norm. To some extent it's
| a design flaw of the way images and layers work but also the
| tool doesn't seem to discourage the ballooning either
| bhupesh wrote:
| Hey author here
|
| True, 31.4 MB is definitely a good stopping point. But the nerd
| inside me kicked in and wanted to know what "exactly" is
| required to run ugit. It was a fun experience.
| osigurdson wrote:
| > or maybe ends up sponsoring...
|
| Sponsorship for a 500 line shell script. Wow!
| Cu3PO42 wrote:
| They aren't asking for sponsorship on the tool they created.
| They expressed that they do not have interest in investing even
| more work to rewrite it in Rust, Go, or what have you; unless
| someone paid them to do it. And I think that is completely
| fair!
|
| If someone has no inherent interest in doing something, is
| not otherwise obligated to do it, and it is not done as a
| favor to friends or something, then paying that person to do
| the job anyway is a very accepted practice in our society.
| Almost all of our
| employers pay us to do things we might otherwise not do.
| osigurdson wrote:
| There is nothing wrong with extracting as much value as
| possible from a small effort. It just seems highly unlikely
| anyone would sponsor it so the request seems somewhat
| ridiculous.
|
| alias lsa='ls -a'
|
| Sponsor me!
| raziel2p wrote:
| Is someone currently, for free, doing the work the author
| is suggesting sponsorship for? If not, it's not ridiculous.
| osigurdson wrote:
| Will you sponsor the maintenance of my alias command
| above then? I don't think anyone else is maintaining such
| a command. Or, is it ridiculous?
| SOLAR_FIELDS wrote:
| Dive is a great tool for debugging this. I like image reduction
| work just because it gives me a chance to play with Dive:
| https://github.com/wagoodman/dive
|
| One easy low hanging fruit I see a LOT for ballooning image sizes
| is people including the kitchen sink SDK/CLI for their cloud
| provider (like AWS or GCP), when they really only need 1/100 of
| that. The full versions of both of these tools are several
| hundred MB each.
| bloopernova wrote:
| Do you have a link to a recommended guide to slimming down the
| cloud provider tools?
| bhupesh wrote:
| Can vouch for dive, the final system tree was generated by dive
| (should have acknowledged it, my bad)
| tuananh wrote:
| ugh, i would hate to maintain this dockerfile. i actually don't
| mind a 34MB docker image vs a 17MB image like this
| mhitza wrote:
| I didn't see it in the final tree listing, but I would expect the
| fzf.tar.gz to linger around after extraction as it was never
| removed. If that is so, removing it should help squeeze a few
| more bytes out of the final image.
| tuananh wrote:
| it's multi-stage build. they only copy fzf bin to the final
| image (scratch)
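In other words, roughly (stage names illustrative):

```dockerfile
FROM alpine:3.18 AS build
ADD fzf.tar.gz /tmp/    # tarball auto-extracted here, in this stage only

FROM scratch
# only the extracted binary crosses over; the tarball and the rest of
# the build stage never reach the final image
COPY --from=build /tmp/fzf /usr/bin/fzf
```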
| c0l0 wrote:
| [...]
| COPY --from=ugit-ops /usr/bin/tr /usr/bin/tr
| COPY --from=ugit-ops /bin/bash /bin/
| COPY --from=ugit-ops /bin/sh /bin/
| # copy lib files
| COPY --from=ugit-ops /usr/lib/libncursesw.so.6 /usr/lib/
| COPY --from=ugit-ops /usr/lib/libncursesw.so.6.4 /usr/lib/
| COPY --from=ugit-ops /usr/lib/libpcre* /usr/lib/
| COPY --from=ugit-ops /usr/lib/libreadline* /usr/lib/
| [...]
|
| For me, insane sh*t like this proves that those who do not learn
| from distribution and package management infrastructure
| engineering history are condemned to reinvent it, poorly.
| bhupesh wrote:
| Hey author here.
|
| I understand that you might have some context about package
| managers that I am missing. Would genuinely like some resources
| about your comment or maybe a bit of explanation.
|
| Thanks
| c0l0 wrote:
| Hey there Bhupesh - apologies for the snark! I was just
| venting some of the frustration I feel every day with modern
| "devops" tooling ;)
|
| I am in a bit of a rush right now (which is why I try my
| absolute best to keep procrastinating on HN at the
| absolute minimum, I swear! ;)), but I will try to share some
| insight later (potentially as a comment on your blog).
| bhupesh wrote:
| Thanks, appreciate the help!
| codethief wrote:
| I'd be interested in this, too, so I'd be grateful if you
| could notify us here, wherever you end up posting your
| comment!
| tiziano88 wrote:
| It may be worth looking at Nix if you haven't already
| gbN025tt2Z1E2E4 wrote:
| I explained a bit here in my reply to your other comment:
|
| https://news.ycombinator.com/item?id=39243450
| chasil wrote:
| I have been able to run ksh93 in an nspawn container under
| systemd in a tiny fraction of what is presented here.
|
| I did this by tracking the output of the ldd command and moving
| only needed libraries into the container.
|
| Why is docker so big?
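The ldd approach described above can be sketched as follows; BIN and ROOT are illustrative, and the parsing of ldd output is a rough heuristic rather than a robust packager:

```shell
#!/bin/sh
# Copy one binary plus the shared libraries ldd reports into a minimal root.
BIN=/bin/sh
ROOT=./mini-root
mkdir -p "$ROOT/bin"
cp "$BIN" "$ROOT/bin/"
# ldd lines look like "libc.so.6 => /lib/.../libc.so.6 (0x...)";
# keep only the absolute paths.
ldd "$BIN" | grep -o '/[^ )]*' | while read -r lib; do
  mkdir -p "$ROOT$(dirname "$lib")"
  cp "$lib" "$ROOT$lib" 2>/dev/null || true
done
find "$ROOT" -type f
```

The resulting directory can then be used as the root of an nspawn or chroot container holding only what the binary actually links against.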
| renewiltord wrote:
| How does removing the shebang save two megabytes? Seems like a
| lot. Is it the env binary?
| bhupesh wrote:
| Yes, the size of env is close to 2MB. I may be wrong here,
| though; something seems off.
|
| I wasn't able to dig deep enough on why that was the case,
| considering the "env" utility was coming from busybox which on
| copy averages close to 900Kb.
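One way to check where the bytes actually go, sketched below; on Alpine, /usr/bin/env usually resolves to the single multi-call busybox binary, which is why "copying env" can drag in far more than env itself (paths differ on glibc systems):

```shell
#!/bin/sh
# Stat env and see what it really points at.
ls -lL /usr/bin/env
readlink -f /usr/bin/env
# inside an Alpine container you could compare with, e.g.:
#   docker run --rm alpine:3.18 sh -c 'ls -lL /usr/bin/env; readlink -f /usr/bin/env'
```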
| zdw wrote:
| Are "Random shell scripts from the internet" categorically worse
| than "random docker images from the internet"?
|
| With the shell script, you can literally read it in an editor to
| make sure it isn't doing anything that weird. A single pass
| through shellcheck would likely tell you if it's doing anything
| that is too weird/wrong in terms of structure.
|
| Auditing a docker container is way more difficult/complex.
|
| "Dockerize all the things", especially in cases when the prereqs
| aren't too weird, seems like it wastes space, and also is harder
| to maintain - if any of the included components has a security
| patch, it's rebuild the container time...
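The read-then-lint workflow described above can be sketched like this; the script here is a stand-in written on the spot, so substitute the real downloaded file:

```shell
#!/bin/sh
# Audit a shell script before running it.
cat > ugit.sh <<'EOF'
#!/usr/bin/env bash
echo "hello from ugit"
EOF
bash -n ugit.sh && echo "syntax OK"     # parse without executing anything
# deeper static analysis, if shellcheck happens to be installed
command -v shellcheck >/dev/null 2>&1 && shellcheck ugit.sh || true
```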
| galleywest200 wrote:
| Reading the Dockerfile should tell you what was done to create
| the image. If you have trust issues around the "base" images
| such as Debian or Fedora that is a different set of inquiries.
|
| As for patching, you can tell your Dockerfile to always pull
| the latest versions of the items you are most concerned about.
| At that point rebuilding the container is as simple as deleting
| it with "docker container stop <id> && docker container rm
| <id>" and then run your docker-compose command again.
| zdw wrote:
| Does anyone read/diff the build commands every time they get
| a new `latest` docker image?
|
| There would already be implicit trust in whatever the local
| OS's package manager laid down, and trying to add another set
| of hard to audit binaries on top is not really an
| improvement.
| photonthug wrote:
| > Are "Random shell scripts from the internet" categorically
| worse than "random docker images from the internet"?
|
| Yes, because inspection aside, at least with a docker
| invocation you can specify the volumes
| zdw wrote:
| Does anyone in practical invocation specify the volumes?
|
| Or would they wrap it in _yet another shell script that calls
| docker with a set of options_ , or a compose file, etc?
|
| This quickly turns into complexity stacked on complexity...
| yjftsjthsd-h wrote:
| > Does anyone in practical invocation specify the volumes?
|
| First: yes, I have run docker with -v recently.
|
| Second:
|
| > Or would they wrap it in yet another shell script that
| calls docker with a set of options, or a compose file, etc?
|
| > This quickly turns into complexity stacked on
| complexity...
|
| I agree that it can get out of hand, but a Dockerfile, a
| compose file, and whatever is going inside the container
| can be an entirely reasonable set of files to have so long
| as you stick with that and are reasonable about what goes
| in each. Where to put it differently, I think it's okay
| because they actually are separation of concerns.
| msm_ wrote:
| Yes I run:
|
| sudo docker run -it -v (pwd):(pwd) my_dev_image
|
| many times every day, to create a development enviromnent
| in CWD. My_dev_image is a debian-based image with common
| developer utilities (pip, npm, common packages installed).
| I don't feel comfortable installing random packages from
| the internet on my host machine, so I use docker for
| everything.
| nopurpose wrote:
| https://github.com/containers/bubblewrap allows specifying
| volumes for scripts too
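A hedged sketch of constraining a script with bubblewrap so that only the current directory is writable; the flags shown are standard bwrap options, the final echo stands in for the actual script, and unprivileged user namespaces must be enabled on the host:

```shell
#!/bin/sh
# Run a command in a bubblewrap sandbox with read-only system dirs.
command -v bwrap >/dev/null 2>&1 || { echo "bwrap not installed"; exit 0; }
bwrap --ro-bind /usr /usr --ro-bind /bin /bin --ro-bind /lib /lib \
      --dev /dev --proc /proc \
      --bind "$PWD" "$PWD" --chdir "$PWD" \
      --unshare-all \
      sh -c 'echo running sandboxed' \
  || echo "user namespaces unavailable here"
```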
| amcpu wrote:
| The dive utility helps tremendously for exploring the
| filesystem contents of a container image. Combine that with the
| output of `docker inspect` to look at the metadata and you
| should be able to have a good understanding of what it will do
| when running as a container.
| zdw wrote:
| Evaluating the whole contents of a filesystem is
| significantly more complex than evaluating one shell script.
| sigotirandolas wrote:
| A script running in a container is mostly isolated from the
| host by default, so it can't just upload whatever SSH keys /
| Bitcoin wallets / other stuff you have lying around or add some
| payload on your ~/.bashrc unless you explicitly share those
| files with the container.
| zdw wrote:
| Yes, I understand https://xkcd.com/1200/ as well.
|
| Running _anything_ without understanding what it does is
| more dangerous than trying to understand it before running
| it.
|
| I'm arguing for _less complexity and easier auditing_ ,
| instead of a series of complex layers that each add to a
| security story, but make the overall result much harder to
| audit.
| eropple wrote:
| To move directionally in the way you describe, you probably
| have to make the user experience of running scripts of any
| kind _much weirder_. macOS does this to some extent by
| prompting via GUI if something tries to access data
| directories on your system (though it confuses iTerm2 for
| "anything iTerm2 runs" and that sucks), but I think people
| would have a lot more problems with trying to do that in a
| server shell.
|
| To that end, Linux namespacing is probably a better way to
| constrain the blast radius for most people. That's not to
| say it should be an _either-or_ , but in the absence of a
| _both-and_ because the userland is not set up for
| sufficient policing, I think Docker containers are a pretty
| clearly better solution.
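The kernel namespacing mentioned above doesn't even require Docker; util-linux's unshare gives an ad-hoc sandbox on its own. A sketch, with a fallback message for hosts where unprivileged user namespaces are disabled:

```shell
#!/bin/sh
# Enter a fresh user namespace and check the mapped uid inside it.
unshare --map-root-user --user sh -c 'echo "uid inside: $(id -u)"' 2>/dev/null \
  || echo "user namespaces unavailable here"
```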
| ReleaseCandidat wrote:
| This is true, but we are talking about running this script on
| some codebase (or whatever you want to "git undo"). I mean "I
| don't trust this script, but let's run it on our source code"
| sounds a bit weird.
| sigotirandolas wrote:
| I agree, in this case it's hard to defend against a rogue
| script or container image, as you need to give it read-
| write access to your source code, so it could add a
| malicious payload to your source code or install a Git hook
| to break out of the container into your host or get some
| malicious source code onto your company's Git server.
|
| There are measures that could defend against this (run all
| your development tools inside containers, and mandatory PRs
| with reviews) but they are probably beyond what many/most
| developers are willing to do security-wise.
|
| There are a lot of scenarios where I think security through
| isolation/containerization makes a lot of sense (e.g. for
| code analysis tools, end-user applications like video
| games, browsers, etc.) but not too much for this particular
| one.
| swozey wrote:
| If you want an example of how little most ops/infra teams
| care about vetting OCI images: I used to work on low level
| k8s multitenant networking stuff, think CDNs. Most of them
| use something like multus to split up
| vfio paths between tenants. Think chopping your NIC into 24
| private channels and each channel is one customer. The ENTIRE
| path has to be private, the container starts and claims that
| network path on the physical NIC. No network packet can ever be
| accessed by another channel, server or container. I was alpha-
| testing multus which controls this network pathing that every
| customer would take ingress and egress out of a cluster and put
| up some test containers on dockerhub.
|
| Multus sits at the demarc line between the container and the
| NIC channel. I'm not saying it's possible or ever been done but
| if I were going to set up a traffic mirror somewhere it'd
| logically have to be there or after the NIC..
|
| I wrote it 5 years ago. I have no idea what version of multus
| it's running but even today it's getting pulls, last pull 19
| days ago. Overall pulls over 5 years is over 10k.
|
| These containers would spin up every time a container starts on
| k8s that attaches an ovf interface. So, it's pretty much
| guaranteed that this is in use somewhere in someones scaling
| infra. I don't know if I SHOULD delete the image and
| potentially take down someones infra or just let them keep
| chugging at it. I'm not paying for dockerhub.
|
| https://hub.docker.com/repository/docker/swozey/multus/gener...
|
| edit: Looks like it's installing the latest multus package so
| not AS terrible but .. multus is not something to play loose
| with versioning..
|
| Also I really wish Dockerhub gave you more stats/analytics. It
| really means nothing in the end but I'm curious. They don't
| even tell you the number beyond 10k, it just says 10k+
| downloads.
|
| https://github.com/k8snetworkplumbingwg/multus-cni
| buffet_overflow wrote:
| Something like this would show up in perimeter
| network/firewall logs correct? But if someone was mirroring
| traffic to the same cloud provider you deploy in, it would be
| less obvious to find out _which_ set of cloud IPs aren't
| actually your own.
| tryauuum wrote:
| assuming you have both perimeter logs and a system which
| notifies a human if something is weird in logs.
|
| Do big clouds have a solution for this? I don't usually use
| GCP / AWS so I don't know what they have
| beeboobaa wrote:
| > Auditing a docker container is way more difficult/complex.
|
| I assume you mean auditing docker images. In which case, sure.
| That's why you grab their dockerfile and build it yourself.
|
| Though using dive[1] it's pretty easy to inspect docker images
| too, as long as they extend a base image you trust.
|
| [1] https://github.com/wagoodman/dive
| iforgotpassword wrote:
| > That's why you grab their dockerfile and build it yourself.
|
| Then you still didn't audit anything. What you need to do is
| inspect the docker file, follow everything it pulls in and
| audit that, finally audit the script itself that the whole
| container gets built for in the first place. Whereas when you
| just download the script and run that directly, you only need
| to do the last step.
| beeboobaa wrote:
| All of that is the same as a shell script, yes. A
| dockerfile is essentially just a glorified shell script
| installing dependencies, which you'd otherwise just be
| doing yourself.
| agumonkey wrote:
| oh dang, dive is really a nice tool, per layer diff and/or
| accumulated changes .. really nice
| 2OEH8eoCRo0 wrote:
| I never use containers from the web unless they're created by
| the company or developer themselves. If they don't produce one
| then I build my own.
| zilti wrote:
| Whenever I think there can't be any worse of a "use case" to
| dockerize something, someone comes along and proves me wrong...
|
| For the last goddamn time: Docker is not a package manager!
| codethief wrote:
| > In the Alpine ecosystem, it is generally not advised to pin
| minimum versions of packages.
|
| I think it would be more accurate to say, in the Alpine
| ecosystem, it is generally not advised to pin versions of
| packages _at all_. Actually, this is not so much a recommendation
| as it is a statement of impossibility: You can't pin package
| versions (without your Docker builds starting to fail in a week
| or two), period. In other words: Don't use Alpine if you want
| reproducible (easily cacheable) Docker builds.
|
| I had to learn this the hard way:
|
| - There is no way to pin the apk package sources ("cache"), like
| you can on Debian (snapshot.debian.org) and Ubuntu
| (snapshot.ubuntu.com). The package cache tarball that apk
| downloads will disappear from pkgs.alpinelinux.org again in a few
| weeks.
|
| - Even if you managed to pin the sources (e.g. by committing the
| tarball to git as opposed to pinning its URL), or if you decided
| to pin the package versions individually, package versions that
| are up-to-date today will likely disappear from
| pkgs.alpinelinux.org in a few weeks.
|
| - Many images that build upon Alpine (e.g. nginx) don't pin the
| base image's patch version, so you get another source of entropy
| in your builds from that alone.
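For reference, strict apk pinning looks like the sketch below (version string illustrative); per the above, the build starts failing as soon as that exact version rotates off the Alpine mirrors:

```dockerfile
FROM alpine:3.18
# exact pin: breaks when this version disappears from the apk repositories
RUN apk add --no-cache 'git=2.40.1-r0'
```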
|
| Personally, I'm very excited about snapshot images like
| https://hub.docker.com/r/debian/snapshot where all package
| versions and the package sources are pinned. All I, as the
| downstream consumer, will have to do in order to stay up-to-date
| (and patch upstream vulnerabilities) is bump the snapshot date
| string on a regular basis.
|
| Unfortunately, the images don't seem quite ready for consumption
| yet (they are only published once a month) but see the discussion
| on https://github.com/docker-library/official-
| images/issues/160... for a promising step in this direction.
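With such a snapshot image, staying current reduces to bumping one date string; the tag format below is illustrative, so check the debian/snapshot page for the actual scheme:

```dockerfile
# every apt operation below resolves against the archive as of this date
FROM debian/snapshot:bookworm-20240201
RUN apt-get update \
 && apt-get install -y --no-install-recommends git \
 && rm -rf /var/lib/apt/lists/*
```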
| bhupesh wrote:
| > I think it would be more accurate to say, in the Alpine
| ecosystem, it is generally not advised to pin versions of
| packages at all. Actually, this is not so much a recommendation
| as it is a statement of impossibility: You can't pin package
| versions (without your Docker builds starting to fail in a week
| or two), period. In other words: Don't use Alpine if you want
| reproducible (easily cacheable) Docker builds.
|
| Agreed, should have been clear with my sentiment there. Thanks
| for stating this :)
|
| > Personally, I'm very excited about snapshot images like
| https://hub.docker.com/r/debian/snapshot where all package
| versions and the package sources are pinned. All I, as the
| downstream consumer, will have to do in order to stay up-to-
| date (and patch upstream vulnerabilities) is bump the snapshot
| date string on a regular basis.
|
| This is really helpful, thanks for sharing. Looks like it will
| be a good change, fingers crossed.
| Cu3PO42 wrote:
| While I likely would not have made the same tradeoffs, I do
| relate to the desire to get the image as small as reasonably
| possible and commend the efforts. Going to "FROM scratch" is
| likely going to get you one of the best results possible before
| you start patching the application and switching out components.
|
| I find it mildly ironic, however, that bundling the dependencies
| of a shell script is - in some ways - the exact opposite of
| saving space, even if it is likely to make running your script
| more convenient.
|
| Unfortunately, I don't have a great alternative to offer. The
| obvious approach is to either let the users handle dependencies
| (which you can also do with ugit) or write package definitions
| for every major distribution. And if I were the author, I
| wouldn't want to do that for a small side project either.
| yjftsjthsd-h wrote:
| > Unfortunately, I don't have a great alternative to offer. The
| obvious approach is to either let the users handle dependencies
| (which you can also do with ugit) or write package definitions
| for every major distribution. And if I were the author, I
| wouldn't want to do that for a small side project either.
|
| Well... There's nix. Complete packaging system, fully
| deterministic results, lots of features, huge number of
| existing packages to draw from, works on your choice of Linux
| distro as well as Darwin and WSL. All at the tiny cost of a
| little bit of your sanity and being its own very deep rabbit
| hole.
| Cu3PO42 wrote:
| I do love Nix, and I think many more people should use it,
| but I don't really consider that a good alternative in the
| context of my original comment.
|
| I'd argue writing a Nix derivation isn't that different from
| writing a package definition for any one Linux distribution.
| It solves the distribution problem for people who use that
| particular distribution/tool, not everyone. Now, Nix can be
| installed on any distribution, but if I was going for
| widespread adoption, I might point to Nix being a solution,
| but I probably wouldn't advertise it as the main one.
| k__ wrote:
| When FirecrackerOS?!
|
| Fly.io, deliver us.
| codethief wrote:
| Does anyone here have experience using Nix to build minimal
| Docker images? How well does it work, and how does it compare to
| the author's approach of manually copying shared libraries into a
| scratch image?
| SirensOfTitan wrote:
| It works quite well and you can get very minimal docker images
| using nix with very few tricks compared to this.
|
| ...with that said, building those Nix images on a Mac is still
| a bit rough: there are official docs and work on getting a
| builder VM set up, but it remains rough around the edges.
| codethief wrote:
| Responding to myself: I see that someone else here in this
| thread commented on Nix:
| https://news.ycombinator.com/item?id=39241768
| ilaksh wrote:
| How would I use this? Say I just made a bad commit in my
| terminal. How would I run this container to fix it? The container
| doesn't have my working directory does it? Or is that the idea,
| to mount a volume with the working dir or something?
|
| In that case, maybe it could be helpful, but to make it
| convenient, don't I need a script that stays in my main system
| and invokes the docker run command for me?
|
| So if you do that and just give me a one liner install command to
| copy paste then I guess this actually makes sense. A small docker
| container could eliminate a lot of potential gotchas with trying
| to install dependencies in arbitrary environments.
|
| Except it's a bash script. I guess it would make more sense to
| get rid of the dependency on fzf or something nonstandard. Then
| they can just install your bash script.
|
| For cases where you have more dependencies that really can't be
| eliminated then this would make more sense to me.
|
| Why does it need fzf? Is it intended to run the container
| interactively?
| bhupesh wrote:
| > How would I use this? Say I just made a bad commit in my
| terminal. How would I run this container to fix it? The
| container doesn't have my working directory does it? Or is that
| the idea, to mount a volume with the working dir or
| something?
|
| You can refer to usage guidelines on dockerhub
| https://hub.docker.com/r/bhupeshimself/ugit
|
| > So if you do that and just give me a one liner install
| command to copy paste then I guess this actually makes sense. A
| small docker container could eliminate a lot of potential
| gotchas with trying to install dependencies in arbitrary
| environments.
|
| Yes, that was also an internal motivation behind doing this.
|
| > Why does it need fzf? Is it intended to run the container
| interactively?
|
| Hey, fzf is required by ugit (the script) itself. I didn't
| want to rely on CLI arguments to give users the ability to
| undo a matching git command. Adding a fuzzy search utility
| makes it easier for people to search what they can undo
| about "git tag", for example.
| otteromkram wrote:
| It's not that hard to undo a git commit.
|
| I don't see what value the author's side project is bringing
| other than adding complexity to a simple task (or, more likely,
| bolstering their resume).
| kjkjadksj wrote:
| What's wrong with make, or dare I even suggest a package manager
| like conda? I get that a half dozen dependencies can be
| specified in tools like Docker, but it's just another way to do
| the same old task that's been solved a dozen ways for decades.
| We are sharing a shell script here. Seems crazy to me to run an
| entire redundant file system to share a couple-hundred-line bash
| script. Plus now users need Docker skills as well as command-
| line skills to install and run this tooling. There are corners
| of the command-line user/programmer world that have thankfully
| not been polluted by Docker yet, so it's not nearly as
| widespread a tool as the older ways of setting up environments
| for bash scripts.
| swozey wrote:
| I think you're seeing this from the perspective of someone who
| runs a container for development and not someone who has to run
| a development container at hyperscale.
|
| We can't pass around bash scripts anymore. Every system has to
| be fungible, reproducible en masse, and as agnostic to the
| underlying technology it's on as possible.
| kjkjadksj wrote:
| You aren't writing machine code that can run on anything,
| though; you have this Docker dependency in order to run the
| container. It's just trading one dependency for another
| because Docker is in style these days. I don't think
| deploying bash scripts at scale was some insurmountable
| challenge before Docker showed up.
| swozey wrote:
| We don't have a "docker dependency" - we run OCI
| containers. You're equating Docker, which is a tooling
| ecosystem, with containers.
|
| Containers have been around for a LONG time; Solaris zones,
| jails, cgroups, etc. are all _built in_ to the kernels we
| use today.
|
| You don't need to use docker.
|
| The idea is fungible services: even if it's literally just
| a container that starts a Go binary, I can quickly scale
| 1000s of COMPLETELY independent processes and ORCHESTRATE
| THEM over thousands of clusters from one centralized
| system.
|
| If I need to shift 1000s of that one Go binary to US-WEST-1
| because US-EAST-1 is down, I can automate it or run one
| command based on a kubernetes label and shift traffic.
|
| These are just a few of the massive benefits we get with
| containers.
|
| I can deploy an ENTIRE datacenter with a yaml file. My
| ENTIRE company's infrastructure MTTR (mean time to recovery)
| from a total outage, starting from a GitHub repo, is less
| than 35 minutes - and we're a billion-dollar company, and
| 80% of that time is starting load balancers and clusters.
| The only non-agnostic hardware stuff in any of this is the
| load balancers and network-related things, as each provider
| has its own APIs, IAM/policies, etc. that are completely
| unique between providers/datacenters. Nothing cares about
| what RAM, distro, CPU, or anything else is being used; we
| can deploy anywhere, ARM or x86.
|
| Without containers I would need a $150k F5 load balancer to
| distribute load between a ton of $30k Dell PowerEdges (and
| I'd need this x1000s).
|
| I've been in infrastructure for 15+ years at massive scale -
| webhosts, CDNs - and I do NOT want to go back to not using
| containers, ever. None of my team writes any non-container
| code or infra. The FIRST thing we do in every single repo
| is make a Dockerfile and docker-compose.yml to easily work
| on things, and at every single company in the last decade
| of my SRE career we've migrated the servers to containers
| and never once regretted it.
| swozey wrote:
| I've been writing containers for 10+ years and this last few
| years I've started using supervisord as pid 1 that manages
| multiple processes inside the container for various things that
| CAN'T function as disparate microservices in the event that one
| fails/updated/etc a lot more.
|
| And man, I love it. It's totally against the one-process-per-
| container/twelve-factor rules and should NOT be done in most
| cases, but when it comes to troubleshooting - I can exec into
| a container anywhere and restart services, because supervisord
| sits there monitoring for the service (say mysql) to exit and
| will immediately restart it. And because supervisord is PID 1,
| as long as that never dies your container doesn't die. You get
| the benefit of containerization and servers without the pain
| of both - like having to re-image/snapshot a server once
| you've thoroughly broken it, versus just restarting a
| container. I can sit there for hours editing .conf files
| trying to get something to work without ever touching my
| Dockerfile/scripts or restarting a container.
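|
| Roughly, the supervisord side of that setup looks like this
| (a hedged sketch, not the commenter's actual config; the
| program names and paths are illustrative):

```ini
; /etc/supervisord.conf - run supervisord in the foreground so it
; stays alive as the container's PID 1
[supervisord]
nodaemon=true

; restart mysql whenever it exits, without the container dying
[program:mysql]
command=/usr/bin/mysqld_safe
autorestart=true
```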
|
| I don't have to make some changes, update the
| entrypoint/dockerfile, push build out, get new image, deploy
| image, exec in..
|
| I can sit there and restart mysql, postgres, redis, zookeeper, as
| much as I want until I figure out what I need done in one go and
| then update my scripts/dockerfiles THEN prepare the actual
| production infra where it is split into microservices for
| reliability and scaling, etc.
|
| I've written a ton of these for our QA teams so they can hop
| into one container and test/break/qa/upgrade/downgrade
| everything super quick. It doesn't give you FULL e2e, but it's
| not like we'd stop doing the tests we already do now.
|
| I mention this because it was something I did once a long,
| long time ago but had completely forgotten you could do, until
| I recently went that route again - and it really does have
| some useful scenarios.
|
| https://gdevillele.github.io/engine/admin/using_supervisord/
|
| I'm also really tired of super-tiny containers that are
| absolute nightmares to troubleshoot when you need to. I work
| on prod infra, so I need to get things back online immediately
| when a fire is happening, and having to attach debug
| containers or manually install packages to troubleshoot is
| such a showstopper. I know they're "attack vectors", but I
| have a vetted list of aliases, bash profiles, and
| troubleshooting tools like jq, mtr, etc. that are installed in
| every non-scratch container. My containers are all
| standardized and have the exact same tools, logs, paths, etc.,
| so that everyone hopping into one knows what they can do.
|
| If you're migrating your architecture to ARM64, those
| containers spin up SO fast that the extra 150-200MB of
| packages, to have a sane system to work on when you have a
| fire burning under you, is worth it. At some scale the cross-
| datacenter/cluster/region image replication would be
| problematic, but you SHOULD have a container caching proxy in
| front of EVERY cluster anyway. Or at least at the
| datacenter/rack. It could be a container ON your clusters
| with its storage volume on a single Ceph cluster, etc.
| adrianmonk wrote:
| > _The use of env is considered a good practice when writing
| shell scripts, used to tell the OS which shell interpreter to use
| to run the script_
|
| When using a shebang line, the reason for 'env' is actually
| something different.
|
| You can just leave out 'env' and do a shebang with 'bash'
| directly like this: #! /usr/bin/bash
|
| But the problem with that is portability. On different systems,
| the correct path may be /bin/bash or /usr/bin/bash. Or more
| unusual places like /usr/local/bin/bash. On old Solaris systems
| that came with ksh, bash might be somewhere under /opt with all
| the other optional software.
|
| But 'env' is at /usr/bin/env on most systems, and it will search
| $PATH to find bash for you, wherever it is.
|
| If you're defining a Docker container, presumably you know
| exactly where bash is going to be, so you can just put that path
| on the shebang line.
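|
| For example, 'env' resolves its argument through $PATH, the
| same lookup the shebang relies on (a quick sketch):

```shell
# 'env' searches $PATH for its argument, which is why
# '#!/usr/bin/env bash' works whether bash lives in /bin,
# /usr/bin, or /usr/local/bin
command -v bash                                 # where bash was found here
/usr/bin/env bash -c 'echo "launched via env"'  # prints: launched via env
```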
|
| TLDR: You don't have to have a shebang, but you can have a
| shebang at no cost, because _your_ shebang doesn't need an env.
| hitpointdrew wrote:
| Dockerizing a shell script????
|
| Unless your tool is converted to a service, how would anyone
| ever use this? Do you expect them to run their project inside
| of your container?
|
| This is very bizarre.
| oftenwrong wrote:
| It's quite typical. You `docker run`, and specify the options
| to mount the work tree of the project into the container.
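|
| A minimal sketch of that invocation (the image name appears
| elsewhere in the thread; the mount target /app and interactive
| flags are assumptions - check the project's Docker Hub page for
| the actual usage):

```shell
# run the tool against the current repo by bind-mounting it in;
# --rm discards the container afterwards, -it keeps fzf usable
docker run --rm -it \
  -v "$(pwd)":/app -w /app \
  bhupeshimself/ugit
```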
| avgcorrection wrote:
| > Yeah, I know, I know. REWRITE IT IN GO/RUST/MAGICLANG. The
| script is now more than 500+ lines of bash.
|
| These screeds get more and more random.
|
| The standard advice was always to just not let a program in Bash
| get beyond X lines. Then move to a real programming language.
| Like Python (est. 1991).
| citruscomputing wrote:
| This is neat :)
|
| I love going and making containers smaller and faster to build.
|
| I don't know if it's useful for alpine, but adding a
| --mount=type=cache argument to the RUN command that `apk add`s
| might shave a few seconds off rebuilds. Probably not worth it, in
| your case, unless you're invalidating the cached layer often
| (adding or removing deps, intentionally building without layer
| caching to ensure you have the latest packages).
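|
| For reference, the BuildKit cache-mount pattern for apk looks
| roughly like this (package names are illustrative; the target
| path follows the Dockerfile reference's apk example):

```dockerfile
# syntax=docker/dockerfile:1
FROM alpine:3.19
# keep apk's download cache in a build cache mount so repeated
# builds skip re-downloading packages (note: no --no-cache here)
RUN --mount=type=cache,target=/var/cache/apk \
    apk add bash fzf findutils
```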
|
| Hadolint is another tool worth checking out if you like spending
| time messing with Dockerfiles:
| https://github.com/hadolint/hadolint
| nunez wrote:
| I love reducing Docker images to their smallest forms. It's great
| for security (minimizes the bill of materials and makes it easier
| to update at-risk libraries and such), makes developers really
| think about what their application absolutely needs to do what it
| needs to do (again, great for security), and greatly improves
| startup performance (because they are smaller).
|
| We can definitely go smaller than 20MB and six layers.
|
| Here's a solution that compresses everything into a single 8.7MB
| layer using tar and an intermediate staging stage:
| https://gist.github.com/carlosonunez/b6af15062661bf9dfcb8688...
|
| Remember, every layer needs to be pulled individually, and
| Docker will only pull a handful of layers at a time. Having
| everything in a single layer takes advantage of TCP window
| scaling to receive the file as quickly as the pipe can send it
| (and you can receive it), and requires only one TCP session
| handshake instead of _n_ of them. This is important when
| working within low-bandwidth or flappy networks.
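|
| The shape of that approach, sketched (this is not the linked
| gist, which uses tar; this simplified variant gets the same
| single-layer result via one COPY, and the packages and copied
| paths are illustrative):

```dockerfile
# stage 1: install everything, then gather the files the tool needs
FROM alpine:3.19 AS staging
RUN apk add --no-cache bash fzf findutils \
 && mkdir /staged \
 && cp -a /bin /lib /usr /etc /staged/

# stage 2: one COPY from the staging stage = exactly one layer
FROM scratch
COPY --from=staging /staged/ /
ENTRYPOINT ["/bin/bash"]
```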
|
| That said, in a real-world scenario where I care about
| readability and maintainability, I'd either write this in Go with
| gzip-tar compression in the middle (single statically-compiled
| binaries for the win!) or I'd just use Busybox (~5MB base image)
| and copy what's missing into it since that base image ships with
| libc.
___________________________________________________________________
(page generated 2024-02-03 23:00 UTC)