[HN Gopher] `COPY --chmod` reduced the size of my container image...
___________________________________________________________________
`COPY --chmod` reduced the size of my container image by 35%
Author : unmole
Score : 518 points
Date : 2022-03-26 03:11 UTC (19 hours ago)
(HTM) web link (blog.vamc19.dev)
(TXT) w3m dump (blog.vamc19.dev)
| maxekman wrote:
| I would recommend Google Ko if you are packaging Go apps:
| https://github.com/google/ko
| lenn0x wrote:
| Came here to suggest just this. Ever since coming across ko,
| it's been excellent in our CI pipelines.
| 3np wrote:
| This and other common considerations and gotchas are brought up
| in Shekow's excellent Docker optimization guide.
|
| https://www.augmentedmind.de/2022/02/06/optimize-docker-imag...
|
| Was posted here last month:
| https://news.ycombinator.com/item?id=30406076
| osener wrote:
| In my experience docker-slim[0] is the way to go for creating
| minimal and secure Docker images.
|
| I wasted a lot of time in the past trying to ship with Alpine
| base images and statically compiling complicated software. All
| the performance, compatibility, and package availability headaches
| this brings are not worth it when docker-slim does a better job of
| removing the OS from your images while letting you use any base
| image you want.
|
| Tradeoff is that you give up image layering to some extent and it
| might take a while to get dead-file-elimination exactly right if
| your software loads a lot of files dynamically (you can instruct
| docker-slim to include certain paths and probe your executable
| during build).
|
| If docker-slim is not your thing, "distroless" base images [1]
| are also pretty good. You can do your build with the same distro
| and then in a multi stage docker image copy the artifacts into
| distroless base images.
|
| [0] https://github.com/docker-slim/docker-slim
|
| [1] https://github.com/GoogleContainerTools/distroless
| throwaway894345 wrote:
| Typically I just use Go in a scratch base image where possible,
| since it's super easy to compile a static binary. Drop in some
| certs and /etc/passwd from another image and your 2.5MB image
| is good to go.
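|
| A minimal sketch of that pattern (tags and paths illustrative;
| assumes the build image carries the certs and passwd):
|
|     FROM golang:1.18 AS build
|     WORKDIR /src
|     COPY . .
|     # CGO off so nothing links against glibc
|     RUN CGO_ENABLED=0 go build -o /app .
|
|     FROM scratch
|     # Borrow TLS roots and a passwd file from the build image
|     COPY --from=build /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
|     COPY --from=build /etc/passwd /etc/passwd
|     COPY --from=build /app /app
|     USER nobody
|     ENTRYPOINT ["/app"]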
| cornstalks wrote:
| From the docker-slim README:
|
|     Node.js application images:
|     from ubuntu:14.04    - 432MB  => 14MB   (minified by 30.85X)
|     from debian:jessie   - 406MB  => 25.1MB (minified by 16.21X)
|     from node:alpine     - 66.7MB => 34.7MB (minified by 1.92X)
|     from node:distroless - 72.7MB => 39.7MB (minified by 1.83X)
|
| Why are the minified Alpine images bigger than Ubuntu/Debian?
| Are a bunch of binaries using static linking and inflating the
| image? Or something else?
| kylequest wrote:
| You can try using the xray command that will give you clues
| (docker-slim isn't just minification :-)). The diff
| capability in the Slim web portal is even more useful (built
| on top of xray).
| kylequest wrote:
| Layers support in docker-slim is something that's on the todo
| list (exact time is tbd).
|
| The recent set of engine enhancements reduces the need for
| the explicit include flags. Also, new application-specific
| include capabilities are WIP (a number of new flags have
| already been added for node.js, next.js and nuxt.js
| applications).
| AtNightWeCode wrote:
| I have quit using Alpine. It caused too many issues in
| production. For some workloads, in Go for instance, you can use
| scratch directly. But slim and distroless are my preferred base
| images.
| Beltalowda wrote:
| What issues did you experience with Alpine and Go?
| unmole wrote:
| > I have quit using Alpine.
|
| As I said to the author of TFA a few days ago: Alpine delenda
| est.
| mt42or wrote:
| Yes, we have experienced many issues with Alpine too. Ubuntu,
| at 25MB compressed, is OK.
| [deleted]
| hericium wrote:
| Until it's not. A well-working deb/apt is being replaced by the
| self-sabotage called snap before up-to-date packages even land
| there.
| paskozdilar wrote:
| Snap is horrible.
|
| The software required for running a Snap Store instance
| is proprietary [0], and there are no free software
| implementations as far as I know. Also, the default
| client code hardcodes [1] Canonical snap store, so you
| have to patch and maintain your own version of snapd if
| you want to self-host.
|
| Snapd also hardcodes auto-updates that are also
| impossible to turn off without patching and maintaining
| your own version of snapd / blocking the outgoing
| connections to Canonical servers, so snapd is also
| horrible for server environments. To top that, the
| developers have this "I know what's good for you, you
| don't" attitude [2] that so much reminds me of You Know
| Who.
|
| [0] https://www.techrepublic.com/article/why-canonical-
| views-the...
|
| [1] https://www.happyassassin.net/posts/2016/06/16/on-
| snappy-and...
|
| [2] https://forum.snapcraft.io/t/disabling-automatic-
| refresh-for...
| johnisgood wrote:
| Yep. I am trying my best to boycott Canonical and their
| closed source Snap which is akin to Apple Store or Play
| Store, but for desktop... for... LINUX. Goes against the
| philosophy in every way imaginable.
| paskozdilar wrote:
| That being said, I was wondering how many people actually
| find the Snap system and ecosystem useful. Reverse
| engineering snapd (which is licensed under GPLv3) and the
| snap app format in order to create a compatible server
| would be a fun project.
| dontcare007 wrote:
| So, another systemd in the making... how long until snap
| takes over DNS resolution...
| paskozdilar wrote:
| Say what you want about systemd, but it is still Free
| Software.
| progforlyfe wrote:
| It's really sad especially given how Canonical introduced
| so many people to Linux through Ubuntu. I understand they
| need to monetize to survive but I wish it wasn't like
| this. I miss the "Ubuntu One" service, a simple Dropbox
| like alternative that you pay for. Completely optional
| and server side. Integrated into the UI.
| heavyset_go wrote:
| Are people actually using Snap in containers? It feels
| like a convenience feature for desktop users.
| ZiiS wrote:
| They don't give you the choice. Applications are only
| available as either a snap or a deb, with many you might
| want to containerise (i.e. Chromium for CI) being snaps. I
| don't believe they can work in unprivileged Docker
| containers.
| shaicoleman wrote:
| There are Chromium debs for Ubuntu 20.04 available from
| the Linux Mint repo.
|
| To get Chromium working under Docker, I'm using the following
| params:
|
|     --headless --no-sandbox --disable-gpu
|     --window-size=1920,1080 --disable-dev-shm-usage
| heavyset_go wrote:
| I understand that, it's just my impression with using
| Snap that it's used to ship desktop applications in a
| consistent way.
| yunohn wrote:
| Which apps are you using via Docker, that are only
| available as a snap?
| bigpod wrote:
| you cannot use snap inside a docker or any other OCI
| container, first of all snap is a containerised package
| as well so it doesnt make much sense but what is more
| important it requires SystemD and as far as i know if
| systemd isnt PID 1 snap deamon wont run and its CLI will
| output it cant run.
| burnoutgal wrote:
| Personally I hate scratch images because once you lose
| busybox, you lose the ability to exec into containers. It's a
| great escape hatch for troubleshooting.
| kevin_nisbet wrote:
| There are a few options that help here. With host access, I
| tend to just use nsenter most of the time to do different
| troubleshooting. It can be a bit of a pain doing network
| troubleshooting though since the resolv.conf will be
| different without the fs namespace.
|
| And kubernetes has debug containers and the like now.
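|
| For reference, the nsenter incantation looks roughly like this
| (container name illustrative):
|
|     # Find the container's init PID, then join its net and pid
|     # namespaces while keeping the host's filesystem and tools
|     PID=$(docker inspect --format '{{.State.Pid}}' mycontainer)
|     sudo nsenter --target "$PID" --net --pid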
| wereHamster wrote:
| What go build options are recommended to have go output a
| static binary (that can run inside a scratch image)?
| onei wrote:
| If you're using plain Go, you get static binaries for free.
| If you're linking against some C library you might be out of
| luck. You can try setting CGO_ENABLED=0 in your
| environment, but I've had mixed success in practice.
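|
| For reference, the usual incantation, plus a quick sanity check:
|
|     CGO_ENABLED=0 go build -o app .
|     # Should report something like "statically linked"
|     file app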
| throwaway894345 wrote:
| I've only had success with CGO_ENABLED. If you depend on
| a C library, then obviously it won't work, but mercifully
| few Go programs depend on C libraries (not having
| straight C ABI compatibility is a real blessing, since
| virtually none of the ecosystem depends on C and its
| jerry-rigged build tooling).
| wereHamster wrote:
| Well this is not what I'm seeing. I need to add -ldflags
| "-linkmode external -extldflags -static" to the go build
| command otherwise the binary doesn't run inside a scratch
| image.
| dimitrios1 wrote:
| CGO_ENABLED=0 has always worked for me in production, but
| it did come with some tradeoffs, notably, network
| performance.
| qwertox wrote:
| Thanks for mentioning these.
|
| I've been using Alpine religiously for years, until the build
| problems became too big. Mostly long build times and removed
| packages on major version updates.
|
| Now I first try with Alpine and if there is the slightest hint
| of a problem, I move over to debian-slim. So things like Nginx
| are still on Alpine for me, while anything related to Python
| no longer is.
|
| At first I thought your mention of docker-slim was an error for
| debian-slim, but I followed the link and am glad to have learned
| something useful.
| codethief wrote:
| Could anyone ELI5 how exactly docker-slim achieves
| minifications of up to 30x and more? I've read the README but
| it still seems like black magic to me.
| bigpod wrote:
| Basically, docker-slim checks what your program opens/uses
| (using a mechanism similar to strace), and whatever it does
| open/use is copied to a new image. As for how it gets those
| kinds of numbers: it removes the parts of the rootFS that are
| not required, which is basically your base image's standard
| files like /etc, /home, /usr, /root and so on. It also removes
| all development dependencies, source code and other cruft you
| might have copied in for use during the build or similar.
| 28304283409234 wrote:
| While absolutely genius, it would be more awesome if we
| could shift this to the left, and have dpkg or apt, or
| something new, only fetch and place those binaries that are
| needed.
| kylequest wrote:
| It'll be possible to do something like that in the future
| where docker-slim will generate a spec that describes
| what your system and app package managers need to
| install. Using the standard package managers will be
| tricky for partial package installs though because it's
| pretty common that the app doesn't need the whole
| package. Even now docker-slim gives you a list of files
| your app needs, but the info is too low level to be
| usable by the standard package management tools.
| madduci wrote:
| While this is the go-to approach, I find it really hard to later
| debug problems in images that don't even have a shell or a
| minimal set of tools to help diagnose a problem.
| bigpod wrote:
| You can use --include-shell to include a shell. But my
| recommendation is to always keep a non-slimmed image somewhere
| so you can refer to it; some debugging can be done in either.
| That also allows you to use something like slim.ai's (the
| company behind docker-slim) web-based SaaS features, which let
| you see the differences between two images and browse the file
| system in a nice web-based file tree. Some bugs may stem from
| removing something that is necessary for the app to work
| (sometimes caused by a library not being loaded during the
| "slimming" process); for those types of errors you need to
| know how your app runs more than anything. It's a tradeoff.
| sokoloff wrote:
| Related, is there any built-in facility that would log file
| system accesses that failed on the resulting image? Seems
| like that would give the answer in a substantial fraction of
| cases.
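|
| A recent strace could approximate this from the outside, I
| suppose (the -Z flag assumes strace >= 5.2; binary name
| illustrative):
|
|     # -Z keeps only failing syscalls, %file limits to path lookups
|     strace -f -Z -e trace=%file ./app 2>&1 | grep ENOENT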
| atombender wrote:
| I recommend avoiding this kind of thinking, which leads to
| bundling all sorts of stuff in every single container. My
| philosophy is that images should contain as little as
| possible.
|
| To debug a container, a better way is to enter the
| container's kernel namespaces using a tool such as nsenter
| [1]. Then you can use all your favourite tools, but still
| access the container as if you're inside it. Of course, this
| means accessing it on the same host that it's running on.
|
| If you're on Kubernetes, debug containers [2] are currently
| in beta, and should be much nicer to work with, as you can do
| just "kubectl debug" to start working with an existing pod.
|
| [1] https://man7.org/linux/man-pages/man1/nsenter.1.html
|
| [2] https://kubernetes.io/docs/tasks/debug-application-
| cluster/d...
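|
| The latter boils down to a single command (pod/container names
| illustrative; needs the ephemeral containers feature enabled):
|
|     kubectl debug -it mypod --image=busybox --target=mycontainer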
| [deleted]
| dmix wrote:
| I noticed you start to find every operating system quirk ever
| when you start writing Dockerfiles. I've run into so many strange
| things just converting a simple predictable shell script into a
| Dockerfile.
| viraptor wrote:
| Tbh, that's just a docker quirk. There were so many ways to do
| this with a more reasonable implementation and yet...
| dmix wrote:
| I initially just wanted to replicate my production env
| locally; then after that experience I wanted to replicate my
| Docker env in production so that would also be predictable.
| fivea wrote:
| > I noticed you start to find every operating system quirk ever
| when you start writing Dockerfiles.
|
| I've been using Dockerfiles extensively for years and I'm yet
| to find anything that fits the definition of an OS quirk.
|
| The quirkiest thing I've noticed in Dockerfiles is the ADD vs
| COPY thing.
|
| > I've run into so many strange things just converting a simple
| predictable shell script into a Dockerfile.
|
| What exactly are you trying to do setting up a Dockerfile that
| requires a full blown shell script?
|
| A Dockerfile should have little more beyond updating/installing
| system packages with a package manager, and copying files into
| the container image. First you run a build to get your
| artifacts ready for packaging, and afterwards you package those
| artifacts by running your Dockerfile.
| TobTobXX wrote:
| I don't know what qualifies as a quirk, but I sometimes had to
| think about stuff like "who's PID 1?", "do I have the
| CAP_WHATEVER capability?", "do I need to 'reap' subprocesses?"
| and other stuff that 'just works' when you have a decent
| system with a decent init process and all the other things.
| ghoulishboo wrote:
| I'm kind of impressed with how aggressively this comment
| misunderstands its parent and then beats the hell out of that
| misunderstanding. Docker seems to be like catnip for people
| itching to laboriously explain The Way to others, and it's
| remarkable how often that requires bending reading
| comprehension to its unrecognizable limits in order to create
| that opportunity. You've thoroughly misunderstood what
| someone would be trying to do when turning a script into a
| Docker container and fallen back on your understanding of how
| containers are used, which is pretty demonstrably limited
| (based upon your own words).
|
| Nothing you said is relevant to their comment at all,
| including your counter-anecdote. Their point about OS quirks
| tracks, in my experience, and I'm baffled you think that
| point has anything to do with ADD/COPY; do you think Docker
| is the operating system? You really come across as quite
| inexperienced with Docker and systems administration in
| general, here, and may want to study for meaning before
| leaping to explain.
| staticassertion wrote:
| > I'm kind of impressed with how aggressively this comment
| misunderstands its parent and then beats the hell out of
| that misunderstanding.
|
| This is just peak irony.
| ghoulishboo2 wrote:
| staticassertion wrote:
| You learned what irony was in a literary theory class?
| Too wrote:
| Why not run your existing predictable script in a single RUN
| command? What needs to be converted?
|
| The mistake many make is seeing Dockerfiles as a 1:1 mapping of
| a shell script with RUN prefixed on every line. It's not; you
| should only split a RUN if you have a good reason, e.g. to add
| a new COPY in between for layer-caching reasons, or to switch
| user.
|
| With the --mount options from BuildKit you can ensure the apt
| cache does not get layered, and you can mount any big temporary
| files needed from the host instead of copying them first.
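|
| A sketch of the cache-mount variant (package choice
| illustrative; note the stock ubuntu image auto-cleans the apt
| cache, so disable docker-clean if you want reuse across builds):
|
|     # syntax=docker/dockerfile:1
|     FROM ubuntu:21.10
|     # The apt cache lives in the mount, not in a layer
|     RUN --mount=type=cache,target=/var/cache/apt \
|         apt-get update && apt-get install -y ca-certificates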
| rmetzler wrote:
| Also, each RUN, COPY, ADD, etc. creates a new layer in the
| image, so you should put related commands in the same
| instruction. The example in the blog post violates this
| principle, which is the main reason for the unexpected size.
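|
| E.g. one layer instead of three, with the cleanup in the same
| instruction so it actually shrinks the image (packages
| illustrative):
|
|     RUN apt-get update \
|         && apt-get install -y --no-install-recommends \
|            openssl ca-certificates \
|         && rm -rf /var/lib/apt/lists/*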
| _jal wrote:
| Welcome to the 'ops' in devops.
| speed_spread wrote:
| More like DevOops
| travisd wrote:
| "Welcome! Sorry."
| kkfx wrote:
| A classic deploy with system-provided isolation (GNU/Linux
| cgroups (firejail/bubblewrap), FreeBSD capsicum, etc.) reduces
| the size and the overhead far more...
| spullara wrote:
| Everyone is talking about workarounds when this should be fixed
| in the file system. This is just dumb. Changing metadata
| shouldn't require the entire file to be copied lol.
| Hendrikto wrote:
| > If you are wondering why a metadata update would make
| OverlayFS duplicate the entire file, it is for security
| reasons. You can enable "metadata only copy up"[0] feature
| which will only copy the metadata instead of the whole file.
|
| [0]:
| https://www.kernel.org/doc/html/latest/filesystems/overlayfs...
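|
| For the curious, the knob looks like this when mounting by hand
| (paths illustrative; there is also a kernel-wide module option):
|
|     mount -t overlay overlay \
|         -o lowerdir=/lower,upperdir=/upper,workdir=/work,metacopy=on \
|         /merged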
| benreesman wrote:
| You folks do all realize that almost all of this is trying to
| work around the fact that Ulrich Drepper is cramming dynamically-
| linked glibc up our uh, software stack, right?
|
| Linus doesn't break userland. A tarball is a deployment strategy
| if someone isn't dicking with /usr/lib under you.
| maccolgan wrote:
| I still use NSS.
| dhruvrrp wrote:
| Slight nitpick, but `apt-get update && apt-get install -y openssl
| dumb-init iproute2 ca-certificates` in the dockerfile is not the
| recommended approach.
|
| That command means that a container image is no longer
| reproducible. You cannot rebuild it (with any code changes for
| your service) and be guaranteed to get the same image as the
| one in production, due to changes in the packages.
|
| Always better to go with the base image, add your packages to the
| base and then use that new image as the base image for your
| application.
| philipswood wrote:
| Yes, this is bad not only from the reproducibility perspective,
| but you now also have two layers for the stuff that got
| updated.
|
| I mean the unupdated files in the base image, plus the copy-on-
| write changes in the subsequent layers.
| MrStonedOne wrote:
| wichert wrote:
| At that point your base image is not reproducible, so your
| improvement is going to be very limited.
| fivea wrote:
| > That command itself means that a docker container is no
| longer reproducible.
|
| It's a tradeoff between making container images reproducible,
| and not shipping security vulnerabilities.
|
| People tend to prefer the latter.
|
| Furthermore, you can exec your way into a container and check
| exactly which package version you installed.
| hericium wrote:
| > It's a tradeoff between making container images
| reproducible, and not shipping security vulnerabilities.
|
| You can regenerate your base images every day or more often
| and have consistent containers created from an image. Freshly
| generated image can be tested in a pipeline to avoid issues
| and you won't hit issues like inability to scale due to
| misbehaving new containers.
| fivea wrote:
| > You can regenerate your base images every day or more
| often and have consistent containers created from an image.
|
| That solves nothing, as it just moves the unreproducibility
| to a base image at the cost of extra complexity. Arguably
| that can even make the problem worse as you just add a
| delta between updates where there is none if you just run
| apt get upgrade.
|
| > Freshly generated image can be tested in a pipeline to
| avoid issues and you won't hit issues like inability to
| scale due to misbehaving new containers.
|
| You already get that from container images you build after
| running apt get upgrade.
| hericium wrote:
| `apt` runs during the creation of 1-3 VM images per
| architecture and not during creation of dozens of
| container images based on each VM image.
|
| When we have VM images upon which all our usual Docker
| images were successfully built, we trust it more than
| `FROM busybox/alpine/ubuntu` with following Docker
| builds. I've detailed the process in a neighboring
| comment[1] but you're right that it doesn't suit all
| workflows.
|
| [1] https://news.ycombinator.com/item?id=30810251
| orf wrote:
| For AMIs (and other VM images) it might make more sense.
| With containers? Not so much. And with a distributed
| Docker image caching layer it makes even less sense.
| darkwater wrote:
| I mean, HN is the land of "offload this to a SaaS" and when
| we can actually offload something to a distro, like
| "guarantee that an upgrade in the same distro version is
| just security patches and won't break anything", it is
| recommended to avoid doing it?
| cornel_io wrote:
| Security assfarts will yell at you for either approach.
| It'll just be different breeds yelling at you depending
| which route you go, and which one most recently bit
| people on the ass.
| flatiron wrote:
| We have a maximum image age of 60 days at work. You've got to
| rebase at least every 60 days, or when something blows up.
| Keeps everyone honest, and it's honestly not that bad. New
| sprint, new image, then promotion. And with a container
| repository, and it being internal, does reproducibility really
| matter? Just pull an older version if push comes to shove.
| wahnfrieden wrote:
| I don't know (I know) why people aren't moving to
| platforms like lambda to avoid NIH-ing system security
| patching operations. We can still run mini monoliths
| without massive architectural change if we don't get too
| distracted by FaaS microservice hype
| withinboredom wrote:
| Why would someone pay per-request when you can have
| infinite always-warm requests for a flat-rate?
| wahnfrieden wrote:
| When your workloads are unpredictable and spike suddenly
| such that you can't scale quickly enough to avoid having
| a bunch of spare capacity waiting around and have HA
| requirements. In this scenario more is spent on avoiding
| variable spend to achieve a "flat" rate
| matsemann wrote:
| The image is already built, so it won't rerun those
| commands when scaling up new instances. Or am I
| misunderstanding your comment?
| hericium wrote:
| I'm applying security patches, necessary updates and
| similar during system image creation (VM image - for
| example AWS AMI - the one later referred in Dockerfile's
| FROM). Hashicorp's Packer[1] comes in handy. System
| images are built and later tested in an automated fashion
| with no human involvement.
|
| Testing phase involves building Docker image from fresh
| system image, creating container(s) from new Docker image
| and testing resulting systems, applications and services.
| If everything goes well, the system image (not Docker
| image) replaces previously used system image (one without
| current security patches).
|
| We have somewhat dynamic and frequent Docker images
| creation. Subsequent builds based on the same system
| image are consistent and don't cause problems like
| inability to scale. Docker does not mess with the system
| prepared by Packer - it doesn't run apt or download from 3rd-
| party remote hosts, but only issues commands resulting in
| consistent results.
|
| This way we no longer have issues like inability to scale
| using new Docker images and humans are rarely bothered
| outside testing phase issues. No problems with containers
| though, as no untested stuff is pushed to registries.
|
| [1] https://www.packer.io/
| hericium wrote:
| Wow, I messed up VMs and Docker images a bit in the above
| post. We're using Packer for both.
| dhruvrrp wrote:
| Basically you recreate your personal base image (with the
| apt-get commands) every X days, so you have the latest
| security patches. And then you use the latest of those
| base images for your application. That way you have a
| completely reproducible docker image (since you know
| which base image was used) without skipping on the
| security aspect.
| orf wrote:
| Eh, that's a heavy-handed and not great way of ensuring
| reproducibility.
|
| The smart way of doing it would be to:
|
| 1. Use the direct SHA reference to the upstream "Ubuntu"
| image you want.
|
| 2. Have a system (Dependabot, renovate) to update that
| periodically
|
| 3. When building, use "cache from" and "cache to" to push
| the image cache somewhere you can access
|
| And... that's it. You'll be able to rebuild any image
| that is still cached in your cache registry. Just re-use
| an older upstream Ubuntu SHA reference and change some
| code, and the apt commands will be cached.
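|
| Sketched out (digest placeholder and registry illustrative):
|
|     # In the Dockerfile, pin the base by digest, e.g.
|     #   FROM ubuntu@sha256:<digest of the tag you tested>
|     # Then build with a durable remote cache:
|     docker buildx build \
|       --cache-from type=registry,ref=registry.example.com/app:buildcache \
|       --cache-to type=registry,ref=registry.example.com/app:buildcache,mode=max \
|       -t app:latest .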
| fivea wrote:
| > Basically you recreate your personal base image (with
| the apt-get commands) every X days, so you have the
| latest security patches.
|
| How exactly does that a) assure reproducibility if you
| use a custom unreproducible base image, b) improve your
| security over daily builds with container images built by
| running apt get upgrade?
|
| In the end that just needlessly adds complexity for the
| sake of it, to arrive at a system that's neither
| reproducible nor equally secure.
| vamc19 wrote:
| If I build an image using the Dockerfile in the blog post
| 10 days later, there is no guarantee that my application
| would work. The packages in Ubuntu's repositories might
| be updated to new versions that are buggy/no longer
| compatible with my application.
|
| OP's suggestion is to build a separate image with
| required packages, tag it with something like
| "mybaseimage:25032022" and use it as my base image in the
| Dockerfile. This way, no matter when I rebuild the
| Dockerfile, my application will always work. You can
| rebuild the base image and application's image every X
| days to apply security patches and such. This also means
| I now have to maintain two images instead of one.
|
| Another option is to use an image tag like
| "ubuntu:impish-20220316" (instead of "ubuntu:21.10") as
| base image and pin the versions of the packages you are
| installing via apt.
|
| I personally don't do this since core packages in
| Ubuntu's repositories rarely introduce breaking changes
| in the same version. Of course, this depends on package
| maintainers, so YMMV.
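|
| The pinned variant would look something like this (the openssl
| version string is illustrative):
|
|     FROM ubuntu:impish-20220316
|     RUN apt-get update \
|         && apt-get install -y openssl=1.1.1l-1ubuntu1.3 \
|            ca-certificates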
| OJFord wrote:
| Whether you have a separate base or not, it relies on you
| keeping an old image.
|
| The advantage a separate base has is allowing you to
| continue to update your code on top of it, even while the
| new bases are broken.
|
| You could still do that without it though, just by
| forking out of the single image at the appropriate layer.
| Not as easy, but how often does it happen?
| hddqsb wrote:
| That's a bold claim. Do you have any references to support it?
| The examples in Docker's documentation use apt-get directly and
| I don't see any recommendation to use a base image as you
| describe.[1][2]
|
| With Debian, there are snapshot images[3] which seem like a
| better approach for making apt-get reproducible. You'd simply
| have to change the "FROM" line in the Dockerfile to something
| like "FROM debian/snapshot:stable-20220316" (where 20220316 is
| the date of the image you are trying to reproduce, helpfully
| given in /etc/apt/sources.list).
|
| With the approach you describe, you would have to carefully
| manage the base images: tag them, record which one was used to
| create each application image, and keep them around in order to
| reproduce older application images.
|
| I'm sure there are situations where the approach you describe
| is useful (e.g. with other package managers, especially ones
| that don't have a notion of lockfiles), but it adds complexity
| and I don't think it's necessarily justified in the case of
| apt-get (at least on Debian).
|
| [1]: https://docs.docker.com/engine/reference/builder/#exec-
| form-...
|
| [2]: https://docs.docker.com/develop/develop-
| images/dockerfile_be...
|
| [3]: https://hub.docker.com/r/debian/snapshot
| vladvasiliu wrote:
| But the base images seem to not be stable themselves. The
| article's example of ubuntu:21.10 was released on Mar 18 2022
| as of today (Mar 26) [0]. So if the base image is not fixed,
| the reproducibility is already gone.
|
| [0] https://hub.docker.com/_/ubuntu?tab=tags&page=1&name=21.10
| hddqsb wrote:
| `COPY --chmod` is quite new, and as mentioned in the post
| requires BuildKit (i.e. `docker buildx`).
|
| A more portable solution is to use `chmod` right after unzipping
| the binary. The `COPY` command will then preserve the executable
| permission.
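|
| Roughly (URL and file names illustrative):
|
|     FROM ubuntu:21.10 AS download
|     ADD https://example.com/mybinary.zip /tmp/
|     # Fix the executable bit here, in the throwaway stage
|     RUN apt-get update && apt-get install -y unzip \
|         && unzip /tmp/mybinary.zip -d /tmp \
|         && chmod 0755 /tmp/mybinary
|
|     FROM ubuntu:21.10
|     # COPY preserves the mode, so no extra chmod layer is needed
|     COPY --from=download /tmp/mybinary /usr/local/bin/mybinary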
| nodesocket wrote:
| A very common mistake I see (though not related to image size
| per se) when running Node apps is to do CMD ["npm", "run",
| "start"]. First, this is wasteful of memory, as npm runs as the
| parent process and forks node to run the main script. The
| bigger problem is that the npm process does not send signals
| down to its child, so SIGINT and SIGTERM are not passed from
| npm into node, which means your server may not be gracefully
| closing connections.
| afiori wrote:
| does this also apply to npx or yarn?
| wereHamster wrote:
| Yes.
| arnaudsm wrote:
| What do you recommend instead?
| nodesocket wrote:
| Just invoke your script:
|
| CMD ["/usr/local/bin/node", "server.js"]
| TameAntelope wrote:
| pm2 is good for some things.
| nodesocket wrote:
| pm2 is great when running on servers, but using pm2 in
| containers feels wrong and, again, wasteful. Just invoke your
| script. If it crashes, fine: Kubernetes or Docker handles
| that. Logs are handled by k8s. For monitoring I use DataDog.
| chousuke wrote:
| What's the point of pm2? Every time I've seen it it's
| just been part of a messy misconfigured system and
| whatever it's actually doing could've been accomplished
| entirely with a tiny systemd unit running node directly.
| TameAntelope wrote:
| From their site (I'm on mobile, so the paste is a little
| rough): behavior configuration, source map support, container
| integration, watch & reload, log management, monitoring,
| module system, max memory reload, cluster mode, hot reload,
| development workflow, startup scripts, deployment workflow,
| PaaS compatible, Keymetrics monitoring, API.
| speedgoose wrote:
| You shouldn't use pm2 in software containers. That makes
| things more complex and not standard.
| TameAntelope wrote:
| It gives you a bunch of stuff you don't get running the
| script directly, and costs nothing, so why wouldn't I opt
| for that?
|
| They even have an explicit, "run in container" mode.
| jwdunne wrote:
| Can echo this. A colleague's node container was maxing
| out the CPU, and just removing PM2 and running node directly
| solved the problem. That was easier than debugging why
| PM2 was having such a hard time.
|
| To be fair, it was a straight-up conversion of an old VM
| in Vagrant, and Docker was looked at as a one-to-one
| replacement before learning otherwise.
| unmole wrote:
| > This is first memory wasteful, as npm is running as the
| parent process and forking node to run the main script.
|
| With Linux's CoW semantics, wouldn't the child share pages with
| the parent?
| nodesocket wrote:
| If you exec into a container that runs npm and run top, you'll
| see npm (the parent) using resident memory and the node
| process (the child) itself using memory. I'm pretty sure the
| npm memory is just wasted.
| latchkey wrote:
| > _CMD [ "npm", "run", "start"]_
|
| Probably not the best search-fu, but confirmed...
|
| https://github.com/search?q=%22CMD+%5B%22npm%22%2C+%22run%22...
| encryptluks2 wrote:
| 0des wrote:
| How dare you
| dedoussis wrote:
| To avoid any potential issue with signal propagation a good
| practice is to always use a lightweight init system such as
| dumb-init [1]. One could assume that the node process would
| register signal handlers for all possible signals, but I prefer
| to not have to make this assumption and use an init system
| instead.
|
| [1] https://github.com/Yelp/dumb-init
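|
| Typical usage in a Debian/Ubuntu-based image (script name
| illustrative):
|
|     RUN apt-get update && apt-get install -y dumb-init
|     # PID 1 that forwards signals and reaps zombies
|     ENTRYPOINT ["/usr/bin/dumb-init", "--"]
|     CMD ["node", "server.js"]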
| AtNightWeCode wrote:
| That is funny. I would assume we don't even have NPM installed
| on the final Docker images. Some people simply don't know what
| they are doing.
| vermaden wrote:
| No need for such superstitious actions with FreeBSD jail
| containers :)
| 2OEH8eoCRo0 wrote:
| I found this to be useful as somebody new to containers in
| general. It's about building from scratch using buildah.
|
| https://fedoramagazine.org/build-smaller-containers/
| oefrha wrote:
| This problem can also be solved through squashing the layers,
| which is a much more general solution.
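|
| E.g. with the (still experimental) daemon flag, which merges
| everything the build adds into a single layer on top of the
| base:
|
|     docker build --squash -t myapp:latest .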
| dpedu wrote:
| What's a good tool for doing this? I assume that you don't want
| to merge your custom layers with the base image layer.
| FooBarWidget wrote:
| And squashing is _still_ behind experimental flag after all
| these years.
| encryptluks2 wrote:
| Not with Buildah
| FooBarWidget wrote:
| Which doesn't work on macOS. And which has no caching.
| encryptluks2 wrote:
| I think it is built into the podman build code as well.
| nhoughto wrote:
| Do it all in userland with the jib CLI; it doesn't work for all
| cases, but its constraints generally keep you honest.
| FooBarWidget wrote:
| That only works for Java. Nearly all of the stuff I work
| on involves native executables as well as the need to
| set up the OS environment in the container (libraries,
| user accounts, etc)
| nhoughto wrote:
| Works for lots of examples where you are packaging a
| statically-ish linked thing into a container, could be
| node/golang/python/java etc. Def lots of other scenarios
| where it doesn't work, but sometimes you can push that to
| a base image built differently. Keep the majority of
| things simple, less footguns.
|
| (Edit: I realize most of those aren't statically linked,
| better description might be "things copied straight into
| container, not installed")
| riccardomc wrote:
| Funny, I had a very similar experience at a client of mine last
| month. They were using Apache Spark images and installing all
| kind of python libraries on top of them. The biggest contributors
| to image size were:
|
| - miniconda (~2GB)
|
| - a final RUN chown -R statement (~750MB)
|
| We reduced the image size, and the corresponding Spark cluster
| footprint, considerably by playing around with dependencies in
| order to stick with plain pip, and by using COPY --chown.
|
| I also recommend [dive](https://github.com/wagoodman/dive) to
| analyse what contributes to each layer.
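|
| The fix looked roughly like this (user and paths illustrative):
|
|     # Ownership is set while the layer is written, so there is
|     # no duplicating chown layer afterwards
|     COPY --chown=spark:spark ./app /opt/app
|
|     # And to inspect what each layer contributes:
|     dive myimage:latest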
| dpedu wrote:
| I love docker, but it is baffling that the obvious need for a
| COPY or ADD argument like this has not been satisfied yet.
| marsven_422 wrote:
| You have a problem, you use docker to solve it, now you have two
| problems.
| georgia_peach wrote:
| Strikingly more accurate than the original regex quote.
|
| They believe it was Wilde who said, " _If you want to tell
| people the truth, you'd better make them laugh or they'll kill
| you._ "
| dark-star wrote:
| This shouldn't apply if you use btrfs as the backend
| filesystem though, should it?
| TOGoS wrote:
| My understanding (based on other comments in the thread - I'm
| no Docker internals expert) is that it's about the size of
| Docker image files, which contain a tarball (or similar) of
| files that each layer adds or modifies on top of its base
| layer. There's no way for them to say "same as this other file,
| just with permissions changed". Which has always seemed to me
| like a bad design decision on Docker's part, because there's
| lots of room for deduplication within the images that just
| cannot be done due to the format they chose. Why not have the
| layers reference individual files by content hash + metadata? If
| there's a lot of small, unusual files, you could just bundle
| them with the image, sort of like how Git packs objects
| together for efficiency, but still retains the identity of
| each.
| polote wrote:
| A better title would be: using chmod doubled the size of my
| docker container image
| wabain wrote:
| I think the OP is confusing the runtime and image format a bit
| here. At runtime OverlayFS can use metadata-only copy up to
| describe changed files, but the container image is still defined
| as a sequence of layers where each layer is a tar file. There's
| no special handling for metadata-only changes of a file from a
| parent layer. As the OCI image spec puts it [1]:
|
| > Additions and Modifications are represented the same in the
| changeset tar archive.
|
| [1]: https://github.com/opencontainers/image-
| spec/blob/02efb9a75e...
| softwarebeware wrote:
| This site has a nice theme. I wish more of the internet looked
| like text files.
| habitue wrote:
| So many gotchas like this in dockerfiles. I think the issue stems
| from it being such a leaky abstraction. To use it correctly you
| need to know docker's internals inside and out, as well
| as Linux inside and out.
|
| The default choices are baffling in docker, it really is a worse-
| is-better kind of tool.
|
| Has anyone worked on a replacement for dockerfiles? I know
| buildah is an alternative to docker build, but it just uses the
| same file format
| silisili wrote:
| Agreed, but not sure the answer.
|
| I run Linux as I always have. Building and running are super
| simple.
|
| I feel like Docker was created more or less to let Mac devs do
| Linux things. Wastefully. And without a lot of reason, tbh. And
| of course, they don't generally even understand Linux.
| dimitrios1 wrote:
| Why would a technology built on top of cgroups, a feature
| only available in the linux kernel, be created to "let mac
| devs do linux things"? In fact, running docker on Mac was
| painful in the early days with boot2docker.
| silisili wrote:
| Just my experience on my team. The Linux guys were already
| building and running things locally. So the sales pitch so
| to speak from our team were the Mac guys saying 'hey, now
| we can build and test locally!', whereas the Linux guys
| just kinda found it a slight annoyance.
|
| Things have certainly changed with the rise of kube, ecr,
| and such. But in the time of doing standard deploys
| into static vms, it didn't make a ton of sense.
| dimitrios1 wrote:
| I encourage you to investigate where docker came from,
| and the rise of containerization in general. The notion
| that you have is rather misinformed and anachronistic.
| Competing against standard deploys onto VMs, especially
| using proprietary software, is exactly why
| containerization gained a foothold.
|
| Whatever this anecdote your team told you about Mac guys,
| this just has nothing to do with docker's, and containers
| in general, rise to fame. It wouldn't be until much later
| when Mac users were starting to rely on tools like
| Vagrant for development environments where docker was
| seen as an alternative to that. If your team were real
| linux guys, they probably would have already known about
| lxc, as well as all the other technologies that lead up
| to it: jails, solaris containers, and vserver, so seeing
| this as "some annoying mac thing" is especially puzzling
| to me.
| silisili wrote:
| You know, I try to be reasonable so, you're right - my
| initial comment was way too broad and dismissive.
|
| I told a personal tale about adoption(not creation),
| which isn't exactly fair to the creators.
|
| It's a slightly different and perhaps jaded view when a
| perfectly solid workflow is upended, and when asking why
| get responses like 'consistent OS and dependencies',
| which our vms already had, and 'we can run it locally',
| which half of us already did.
|
| Admittedly, there is a lot of value in a consistent and
| repeatable environment specification(vs bespoke
| everywhere), being able to do so without needing to spin
| up vms, and yes - running linuxy things on Mac and Win,
| among other things.
| viraptor wrote:
| You can also use buildah commands without the whole dockerfile
| abstraction. As a structured alternative there's also an option
| to build container images from nix expressions.
| fenollp wrote:
| https://github.com/moby/buildkit
| Too wrote:
| Never tried it myself, but this is the most serious attempt
| I've seen at an alternative docker syntax: https://earthly.dev/
|
| You also have mockerfiles, being more of a proof of concept if
| I understand correctly https://matt-rickard.com/building-a-new-
| dockerfile-frontend/
| adamgordonbell wrote:
| Earthly is great (disclosure: work on it)
|
| But also check out IckFiles, an Intercal frontend for moby
| buildkit:
|
| https://github.com/adamgordonbell/compiling-
| containers/tree/...
| xinnixxinix wrote:
| Sure, there are, but they all have enough of a learning curve
| that they don't seem to take hold with the masses.
|
| Nix, Guix, Bazel, Habit, and others, all solve this problem
| more elegantly. There are some big folks out there quietly
| using Nix to solve:
|
| * reproducible builds
|
| * shared remote/CI builds
|
| * trivial cross-arch support
|
| * minimal container images
|
| * complete knowledge of all SW dependencies and what-is-live-
| where
|
| * "image" signing and verification
|
| I know docker and k8s well and it's kind of silly how much
| simpler the stack could be made if even 1% of the effort spent
| working around Docker were spent by folks investing in tools
| that are principally sound instead of just looking easy at
| first glance.
|
| Miss me with the complaints about syntax. It's just like Rust.
| Any pain of learning is very quickly forgotten by the unbridled
| pace at which you can move. And besides, it's nothing compared
| to (looks at calendar) 5 years of "Top 10 Docker Pitfalls!" as
| everyone tries to pretend the teetering pile of Go is making
| their tech debt go away.
|
| I never thought I'd come around to being someone wary of the
| word "container", as someone who sorta made it betting on them.
| There is so little care for actually managing and understanding
| the depth of one's software stack, well, we have this. (Pouring
| one out for yet another Dockerfile with apt-get commands in
| it.)
| habitue wrote:
| > it's kind of silly how much simpler the stack could be made
| if even 1% of the effort spent working around Docker were
| spent by folks investing in tools that are principally sound
| instead of just looking easy at first glance.
|
| This is the phrasing I was groping around for. Thank you
| yjftsjthsd-h wrote:
| > Miss me with the complaints about syntax. It's just like
| Rust.
|
| Yeah and it's competing against Dockerfiles, which I suppose
| in this analogy is like Python or bash with fewer footguns;
| syntax and parts of the functional paradigm are absolutely
| putting nix at a usability/onboarding disadvantage to docker.
| rtpg wrote:
| Docker provides a solution for balls of mud. You now have a
| more reproducible ball of mud!
|
| Bazel and company require you to clean up your ball of mud
| first. So your payoff is further away (and can sometimes be
| theoretical)
|
| Ultimately it's less about Docker and more about tooling
| supporting reproducibility (apt but with version pinning
| please), but in the meantime Docker does get you somewhere
| and solve real problems without having to mess around with
| stuff too much.
|
| And of course the "now you have a single file that you can
| run stuff with after building the image ". I don't believe
| stuff like Nix offers that
| ParetoOptimal wrote:
| > And of course the "now you have a single file that you
| can run stuff with after building the image ". I don't
| believe stuff like Nix offers that
|
| Yes it does? Also, any nix expression can trivially be
| built into a much more space efficient docker container.
| rtpg wrote:
| Can you generate a tar file that you can "just run" (or
| something to that effect)? My impression was that Nix
| works more like a package installer, but deterministic
| mananaysiempre wrote:
| With the new (nominally experimental) CLI, use `nix
| bundle --bundler github:NixOS/bundlers#toArx` (or
| equivalently just `nix bundle`) to build a self-
| extracting shell script[1], `...#toDockerImage` to build
| a Docker image, etc.[2,3], though there's no direct
| AppImage support that I can see (would be helpful to
| eliminate the startup overhead caused by self-
| extraction).
|
| If you want a QEMU VM for a complete system rather than a
| set of files for a single application, use `nixos-rebuild
| build-vm`, though that is intended more for testing than
| for deployment.
|
| The Docker bundler seems to be using more general Docker-
| compatible infrastructure in Nixpkgs[4].
|
| [1] https://github.com/solidsnack/arx
|
| [2] https://nixos.org/manual/nix/unstable/command-
| ref/new-cli/ni...
|
| [3] https://github.com/NixOS/bundlers
|
| [4] https://nixos.org/manual/nixpkgs/stable/#sec-pkgs-
| dockerTool...
| rekado wrote:
| `guix pack` (with its various options) can produce an
| archive that you could run after unpacking anywhere.
| henrydark wrote:
| There might be quicker ways to do this, but with one
| extra line a derivation exports a docker image which can
| in turn be turned to a tar with one more line.
|
| Nix's image building is pretty neat. You can control how
| many layers you want, which I currently maximize so that
| docker pulls from AWS ECR are a lot faster
| wereHamster wrote:
| > * trivial cross-arch support
|
| Uhm, can't get Nix to build a crossSystem on MacBook M1, it
| fails compiling cross GCC. I wouldn't say it's trivial. Maybe
| the Nix expressions look trivial, but getting them to
| actually evaluate is not.
| brigandish wrote:
| I've been using Packer with the Docker post-processor. I've had
| to give up multi-stage builds but being able to ditch
| Dockerfiles and simply write a shell script without a thousand
| &&\'s is more than enough reason to keep me using it.
| fivea wrote:
| > I've had to give up multi-stage builds but being able to
| ditch Dockerfiles and simply write a shell script without a
| thousand &&\'s is more than enough reason to keep me using
| it.
|
| I don't understand your point. If all you want to do is set a
| container image by running a shell script, why don't you just
| run the shell script in your Dockerfile?
|
| Or better yet, prepare your artifacts before, and then build
| the Docker image by just copying your files.
|
| It sounds like you decided to take the scenic route of Docker
| instead of just taking the happy path.
| googleantitrust wrote:
| What is the point of these technology inventions? The desire of
| hoarding more bananas than the monkey can eat will make this
| planet uninhabitable one day.
| toniti wrote:
| Using `COPY --chmod` is not the correct solution for this. It
| works, of course, but it isn't very logical from a Dockerfile
| readability standpoint. The real issue is the incorrect use of
| multi-stage builds. In multi-stage builds you define additional
| build stages where you prepare your binaries (e.g. compiling
| them) and copy them to the final runtime stage, so your final
| stage remains clean of the temporary files created by your
| build steps. Based on your comment, in your current build stage
| you run curl, extract, etc., but you don't actually finish
| preparing the binary by correcting the executable bit. Instead,
| you copy the half-prepared binary to the runtime stage and then
| try to continue your further modifications there. Similarly, if
| you skipped the extraction step, copied the zip instead and
| extracted it in the runtime stage, you would end up with both
| the zip and the final binary in your exported image.
|
| Another red flag is that you run `apt-get` after copying the
| binary to the runtime stage (because you still want to tweak
| the binary there). That means any time the source for the
| binary changes, the `apt` commands need to run again and are
| not cached. If you just add the executable bit in your build
| stage you can reorder them, so the `COPY` comes after `RUN`.
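|
| In sketch form (stage and file names illustrative):
|
|     FROM ubuntu:21.10 AS download
|     # ...curl, extract, and finish preparing the binary here:
|     RUN chmod 0755 /tmp/mybinary
|
|     FROM ubuntu:21.10
|     # apt layer first: it only rebuilds when the package list
|     # changes
|     RUN apt-get update && apt-get install -y openssl dumb-init \
|         iproute2 ca-certificates
|     # Binary last: a new binary invalidates only this layer
|     COPY --from=download /tmp/mybinary /usr/local/bin/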
| vamc19 wrote:
| You are correct - I should be running chmod in the download
| stage and that is what I did before realizing `--chmod`
| existed. However, `--chmod` is still a valid solution.
|
| The reason I did not stop with running chmod in the first stage
| is because this seemed like a common problem - what if I was
| ADDing a binary or a shell script directly from a remote source
| and I did not have a download stage?
|
| I'm sure there are better ways to write that Dockerfile - I'm
| by no means an expert. It just so happens that I noticed this
| problem when the Dockerfile (it was from a different project. I
| was modifying it) was in this state and I had nothing better to
| do than ~yak shave~ investigate why the image size was a bit
| larger than I expected :)
___________________________________________________________________
(page generated 2022-03-26 23:01 UTC)