[HN Gopher] Real-world stories of how we've compromised CI/CD pi...
___________________________________________________________________
Real-world stories of how we've compromised CI/CD pipelines
Author : usrme
Score : 226 points
Date : 2022-01-17 11:08 UTC (11 hours ago)
(HTM) web link (research.nccgroup.com)
(TXT) w3m dump (research.nccgroup.com)
| tialaramex wrote:
| A recurring theme is that they obtain _secret_ credentials from a
| service which needs to verify credentials, and then turn around
| and use those to impersonate the entity providing those
| credentials. For example, getting Jenkins to run some Groovy
| reveals the credentials Jenkins uses to verify who is accessing
| it, and then you can just use those credentials yourself.
|
| To fix this - almost anywhere - stop using shared secrets. Every
| time you visit a (HTTPS) web site, you are provided with the
| credentials to verify its identity. But, you don't gain the
| ability to impersonate the site because they're not _secret_
| credentials, they're _public_. You can and should use this in a
| few places in typical CI/CD type infrastructure today, and we
| should be encouraging other services to enable it too ASAP.
|
| In a few places they mention MFA. Again, most MFA involves
| secrets, for example TOTP Relying Parties need to know what code
| you should be typing in, so, they need the seed from which to
| generate that code, and attackers can steal that seed. WebAuthn
| doesn't involve secrets, so, attackers who steal WebAuthn
| credentials don't achieve anything. Unfortunately chances are you
| enabled one or more vulnerable credential types "just in case"...
| cerved wrote:
| > a hardcoded git command with a credential was revealed
|
| _cries in security_
| i_like_waiting wrote:
| reminds me of tons of docker tutorials, where all of them put
| a default password in plaintext in the docker-compose file
| rietta wrote:
| I put devonly: as part of every placeholder secret in docker-
| compose.yml or similar config that is committed to Git. The
| goal is a developer who has just cloned the repo should be
| able to run the setup script and have the whole system
| running with random seed data without futzing with copying
| secrets from coworkers.
| nickjj wrote:
| > I put devonly: as part of every placeholder secret in
| docker-compose.yml or similar config that is committed to
| Git. The goal is a developer who has just cloned the repo
| should be able to run the setup script and have the whole
| system running with random seed data without futzing with
| copying secrets from coworkers.
|
| This problem is solvable without hard coding env variables
| into your docker-compose.yml file.
|
| You can commit an .env.example file to version control
| which has non-secret defaults set so that all a developer
| has to do is run `cp .env.example .env` before `docker-
| compose up --build` and they're good to go.
|
| There's examples of this in all of my Docker example apps
| for Flask, Rails, Django, Phoenix, Node and Play at: https:
| //github.com/nickjj?tab=repositories&q=docker-*-exampl...
|
| It's nice because it also means the same docker-compose.yml
| file can be used in dev vs prod. The only thing that
| changes are a few environment variables.
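|
| A minimal sketch of that layout (file contents and variable
| names here are illustrative, not taken from the linked repos):
|
|     # .env.example is committed and holds non-secret dev
|     # defaults, e.g. POSTGRES_PASSWORD=devpassword;
|     # .env itself is git-ignored
|     cp .env.example .env
|     # compose substitutes ${POSTGRES_PASSWORD} etc. from .env
|     docker-compose up --build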
| staticassertion wrote:
| With buildkit Docker now has support for secrets natively
| with `--secret`. This mounts a file that will only be
| exposed during build.
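|
| A small sketch of that flow (the secret id and paths are just
| examples):
|
|     # in the Dockerfile, the secret is mounted only for this
|     # RUN step and never baked into a layer:
|     #   RUN --mount=type=secret,id=npmrc \
|     #       cp /run/secrets/npmrc ~/.npmrc && npm ci && rm ~/.npmrc
|     DOCKER_BUILDKIT=1 docker build \
|       --secret id=npmrc,src="$HOME/.npmrc" -t app .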
| inetknght wrote:
| > I put devonly: as part of every placeholder secret in
| docker-compose.yml or similar config that is committed to
| Git.
|
| I use `insecure` instead. I think it makes it clear that the
| password, and the file, aren't secure by default and should be
| treated as such.
| Lucasoato wrote:
| Is it just my impression or does security in Jenkins seem much
| more challenging and more time-consuming than in GitLab? This
| post gives many examples where GitLab was attacked, and of course
| bad practices like privileged containers can lead to the
| compromise of a server independently of the technology used, but
| from my experience with Jenkins, I've seen passwords used in
| plaintext so many times, even in big companies.
| 2ion wrote:
| Jenkins is security game over if you overlook a small crucial
| configuration option or if you install any plugin (and it's
| unusable without some plugins), as plugin development is a
| free-for-all and dependencies between plugins are many. We
| basically decided that one instance of Jenkins plus slaves was
| unfixable and unconfigurable to use securely across multiple
| teams with developers of differing trust levels (external
| contributors vs normal in-house devs) and started fresh with a
| different CI design.
|
| Jenkins is a batteries excluded pattern in one of its worst
| possible incarnations.
|
| Jenkins is basically a CI framework for trusted users only.
| Untrusted workloads must not have access to anything Jenkins.
| ramoz wrote:
| I don't really like either. Both have traditionally been bad and
| tied to on-prem legacy workloads, building for SVN apps or teams
| new to git. It's usually a mess.
| nathanlied wrote:
| As someone adjacently interested in the field: care to
| elaborate on what systems you do like? It's always
| interesting to get new perspectives.
| staticassertion wrote:
| We've been happy with buildkite and hashicorp vault. One
| nice feature we've leveraged in our CI is that vault lets
| us revoke tokens after use, so we have very short lived
| tokens and they're made that much shorter by having the
| jobs clean up after themselves.
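|
| The cleanup step is roughly this (the kv path is made up, and
| the job is assumed to already have VAULT_ADDR/VAULT_TOKEN set):
|
|     # fetch what the job needs...
|     vault kv get -field=password secret/ci/deploy-key
|     # ...then revoke the job's own token so the credential dies
|     # with the job instead of waiting out its TTL
|     vault token revoke -self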
| kiallmacinnes wrote:
| > but from my experience with Jenkins, I've seen using
| passwords in plaintext so many times, even in big companies
|
| I reckon this has to do with how the CI tools are configured.
|
| Everyone knows you shouldn't commit a secret to Git, so tools
| like GitLab CI which require all their config be in git
| naturally will see less of this specific issue.
| formerly_proven wrote:
| Jenkins was also affected by numerous Java serialization
| vulnerabilities. It also used to be that any worker could
| escalate to the main Jenkins server pretty much by design, not
| sure what the current situation is.
| [deleted]
| 0xbadcafebee wrote:
| Jenkins is a tire fire, and security is just one of their tires
| on fire. Every aspect of it encourages bad practices.
| dlor wrote:
| This is a great resource. I'd love to see more reports like it
| published. CI/CD pipelines often run with highly elevated
| permissions (access to source code, artifact repositories, and
| production environments), but they are traditionally neglected.
| Kalium wrote:
| I wouldn't say they are traditionally neglected, precisely.
| CI/CD systems are often treated as a place where devs hold
| infinite power with developer convenience prioritized above all
| else. Developers, who are generally not security experts, often
| expect to wholly own their build and deployment processes.
|
| I've seen few things get engineer pushback quite like trying to
| tell engineers that they need to rework how they build and
| deploy because someone outside their team said so. It's just
| dev, not production, so why should they be so paranoid about
| it? Sheesh, stop screwing up their perfectly good workflows...
| kevin_nisbet wrote:
| I suspect this is also an under-considered area even in
| organizations that pay lots of attention to security, so it
| would be good for it to get more mindshare. After we discovered
| some of our own CI/CD related vulnerabilities[1], it felt like
| most approaches we looked at had similar problems, and it took a
| lot of research to find the rare solution that we could be
| confident in.
|
| [1] - https://goteleport.com/blog/hack-via-pull-request/
| 0xbadcafebee wrote:
| Been there, done that, bought the t-shirt....
|
| We had this "deploy" Jenkins box set up with limited access
| for devs, because it had assume-role privs to an IAM role to
| manage AWS infra with Terraform. The devs run their tests on
| a different Jenkins box, and when they pass, they upload
| artifacts to a repo and trigger this "deploy" Jenkins box to
| promote the new build to prod. The devs can do their own CI,
| but CD is on a box they don't have access to, hence less
| chance for accidental credential leakage. Me being Mr.
| Devops-play-nice-with-the-devs, I let them issue PRs against
| the CD box's repo. Commits to PRs get run on the deploy
| Jenkins in a stage environment to validate the changes.
|
| This one dev wanted to change something in AWS. But for
| whatever reason, they didn't ask me (maybe because they knew
| I'd say no, or at least ask them about it?). So instead the
| dev opens a PR against the CD jobs, proposing some syntax
| change. _Then_ the dev modifies a script which was being
| included as part of the CD jobs, and makes the script
| download some binaries and make AWS API calls (I found out
| via CloudTrail). Once they've made the calls, they rewrite
| Git history to remove the AWS API commits and force-push to
| the PR branch, erasing evidence that the code was ever
| pushed. Then they close the PR with "need to refactor".
|
| In the morning I'm looking through my e-mail, and see all
| these GitHub commits with code that looks like it's doing
| something in AWS... and I go look at the PR, and the code in
| my e-mails isn't anywhere in any of the commits. He actually
| tried to cover it up. And I would never have known about any
| of this if I hadn't enabled 'watching' on all commits to the
| repo.
|
| Who'd have thought e-mail would be the best append-only
| security log?
| maestrae wrote:
| I'm very curious as to what happened next. How did the
| conversation go with the dev and did they get to keep their
| job?
| 0xbadcafebee wrote:
| I didn't tell his boss. I did tell my boss, in an e-mail
| with evidence. We both had a little chat with the dev
| where we made it clear that if this happened under
| slightly different circumstances (if he was trying to
| access data/systems he wasn't supposed to, if it was one
| of the HIPAA accounts, etc) he'd not only be shitcanned,
| he'd be facing serious legal consequences. We were
| satisfied by his reaction and didn't push it further.
|
| I was actually fired early in my career as a contractor
| when an over-zealous security big-wig decided to go over
| my boss's boss's head. I had punched a hole in the
| firewall to look at Reddit, and because I also had a lot
| of access, this meant I wasn't trustworthy and had to go.
| People (like me) make stupid mistakes; we should give
| them a second chance.
| hinkley wrote:
| Our deploy scripts make a call to a separate box that
| actually does the deployment, ostensibly to avoid this sort
| of problem and have some more control over simultaneous
| deployments. But it is very hard to explain to anyone how
| to diagnose a deployment failure on such a system, and once
| in a while the log piping gets gummed up and you don't get
| any status reports until the job completes.
|
| I've maybe managed to explain this process to one other
| extant employee, so pretty much everybody bugs me or one of
| the operations people any time there's an issue. That could
| be a liability in an outage situation, but I don't have a
| concrete suggestion how to avoid this sort of thing.
| mdoms wrote:
| > The credentials gave the NCC Group consultant access as a
| limited user to the Jenkins Master web login UI which was only
| accessible internally and not from the Internet. After a couple
| of clicks and looking around in the cluster they were able to
| switch to an administrator account.
|
| These kinds of statements are giving major "draw the rest of the
| owl" vibes.
|
| https://i.kym-cdn.com/photos/images/newsfeed/000/572/078/d6d...
| staticassertion wrote:
| Thank you for writing this up.
|
| Some thoughts:
|
| 1. Hardcoded credentials are a plague. You should consider
| tagging all of your secrets so that they're easier to scan for.
| Github automatically scans for secrets, which is great.
|
| 2. Jenkins is particularly bad for security. I've seen it owned a
| million and one times.
|
| 3. Containers are overused as a security boundary and footguns
| like `--privileged` completely eliminate any boundary.
|
| 4. Environment variables are a dangerous place to store secrets -
| they're global to the process and therefore easy to leak. I've
| thought about this a lot lately, especially after log4j. I think
| one pattern that may help is clearing the variables after you've
| loaded them into memory.
|
| Another pattern I've considered is encrypting the variables. A
| lot of the time what you have is something like this:
|
| Secret Store -> Control Plane Agent -> Container -> Process
|
| Where secrets flow from left to right. The control plane agent
| and container have full access to the credentials and they're
| "plaintext" in the Process's environment.
|
| In theory you should be able to pin the secrets to that process
| with a key. During your CD phase you would embed a private key
| into the process's binary (or a file on the container) and then
| tell your Secret Manager to use the associated public key to
| transmit the secrets. The process could decrypt those secrets
| with its private key but they're E2E encrypted across any hops
| between the Secret Store and Process and they can't be leaked
| without explicitly decrypting them first.
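|
| A rough way to picture that pinning idea with stock tooling
| (file names are made up and a real secret manager would handle
| the key exchange, but the shape is the same):
|
|     # CD time: generate a keypair and bake the private half
|     # into the image
|     openssl genpkey -algorithm RSA -out /etc/myapp/secrets.key
|     openssl pkey -in /etc/myapp/secrets.key -pubout \
|       -out secrets.pub
|
|     # secret store side: encrypt the credential to that key
|     printf '%s' "$DB_PASSWORD" |
|       openssl pkeyutl -encrypt -pubin -inkey secrets.pub \
|         -out db_password.enc
|
|     # process side: only the holder of the baked-in private key
|     # recovers the plaintext, so intermediate hops never see it
|     openssl pkeyutl -decrypt -inkey /etc/myapp/secrets.key \
|       -in db_password.enc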
| teddyh wrote:
| > _Environment variables are a dangerous place to store secrets
| - they're global to the process and therefore easy to leak._
|
| The two _real_ problems with environment variables are:
|
| 1. Environment variables are traditionally readable by _any
| other process in the system_. There are settings you can do on
| modern kernels to turn this off, but how do you know that you
| will always run on such a system?
|
| 2. Environment variables are _inherited_ by all subprocesses by
| default, unless you either unset them after you fork() (but
| before you exec()), or if you take special care to use execve()
| (or similar) function to provide your own custom-made
| environment for the new process.
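|
| The shell equivalent of that second point is to hand the child
| an explicit environment instead of letting it inherit yours
| (the script name is just a placeholder):
|
|     # the child sees only what is listed here, not the caller's
|     # DB_PASSWORD, AWS_* variables, etc.
|     env -i PATH=/usr/bin HOME=/tmp ./build-step.sh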
| staticassertion wrote:
| > 1. Environment variables are traditionally readable by any
| other process in the system. There are settings you can do on
| modern kernels to turn this off, but how do you know that you
| will always run on such a system?
|
| I think that this would require being the same user as the
| process you're trying to read. Access to /proc/<pid>/environ
| should require that iirc. You can very easily go further by
| restricting procfs using hidepid.
|
| And ptrace restrictions are pretty commonplace now I think?
| So the attacker has to be a parent process or root.
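|
| For reference, those two knobs look roughly like this on a
| typical Linux box:
|
|     # hide other users' /proc/<pid>/ entries (and thus environ)
|     mount -o remount,hidepid=2 /proc
|     # Yama ptrace scope: 1 = only ancestors (or CAP_SYS_PTRACE)
|     # may ptrace a process
|     sysctl kernel.yama.ptrace_scope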
|
| > 2. Environment variables are inherited by all subprocesses
| by default, unless you either unset them after you fork()
| (but before you exec()), or if you take special care to use
| execve() (or similar) function to provide your own custom-
| made environment for the new process.
|
| Yeah, this goes to my "easy to leak" point.
|
| Either way though you're talking about "attacker has remote
| code execution", which is definitely worth considering, but I
| don't think it matters with regards to env vs anything else.
|
| Files suffer from (1), except generally worse. File handles
| suffer from (2) afaik.
|
| Embedding the private key into the binary doesn't help too
| much if the attacker is executing with the ability to ptrace
| you, but it does make leaking much harder ie: you can't trick
| a process into dumping cleartext credentials from the env
| just by crashing it.
| teddyh wrote:
| > _I think that this would require being the same user as
| the process you're trying to read._
|
| IIRC, this was not always the case. But fair enough, this
| might not be a relevant issue for any modern system.
| otterley wrote:
| A foreign process's environment variables are only readable
| if the current UID is root or is the same as the foreign
| process's UID. As user joe I can't see user andrea's process's
| envvars.
| teddyh wrote:
| All right, fair enough. But I'm not sure this was always
| the case on traditional Unixes, though.
| momenti wrote:
| I'm considering manually storing a secret under some
| inaccessible directory on the host, e.g. `/root/passwords.txt`,
| then expose this via Docker secrets[1] to the container.
| Finally, in the entrypoint script, I'd set e.g. the user
| passwords of some SQL server, which is then run as a non-
| privileged user. Would that be reasonably safe?
|
| [1] https://docs.docker.com/engine/swarm/secrets/
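|
| For what it's worth, the swarm-mode flow looks roughly like
| this (names are illustrative; POSTGRES_PASSWORD_FILE is the
| official postgres image's convention, other images differ):
|
|     docker secret create db_password /root/passwords.txt
|     docker service create --name db \
|       --secret db_password \
|       -e POSTGRES_PASSWORD_FILE=/run/secrets/db_password \
|       postgres:14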
| colek42 wrote:
| When you start doing security this way you end up chasing your
| tail. There are so many ways to mess it up.
|
| There is a really good article that explains a different way of
| securing these systems through sets of attestations.
|
| https://grepory.substack.com/p/der-softwareherkunft-software...
| notreallyserio wrote:
| I think your agent idea is good. I'd want to add in a way for
| the agent to detect when a key is used twice (to catch other
| processes using the key) or when the code you wrote didn't get
| the key directly (to catch proxies), and then a way to kill or
| suspend the process for review. Would be pretty sweet.
| lox wrote:
| We've been using Sysbox (https://github.com/nestybox/sysbox) for
| our Buildkite based CI/CD setup, allows docker-in-docker without
| privileged containers. Paired with careful IAM/STS design we've
| ended up with isolated job containers with their own IAM roles
| limited to least-privilege.
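|
| Concretely, the job container is just started with the sysbox
| runtime instead of `--privileged` (image name is a
| placeholder):
|
|     docker run --runtime=sysbox-runc --rm -it my-dind-ci-image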
| lima wrote:
| Never heard of Sysbox before. At a first glance, the comparison
| table in their GitHub repo and on their website[1] has a number
| of inaccuracies which make me question the quality of their
| engineering:
|
| -- They claim that their solution has the same isolation level
| ("4 stars") as gVisor, unlike "standard containers", which
| are "2 stars" only (with Firecracker and Kubevirt being "5
| stars"). This is very wrong - as far as I can tell, they use
| regular Linux namespaces with some light eBPF-based filesystem
| emulation, while the vast majority of syscalls are still handled
| by the host kernel. Sorry, but this is still "2 stars" and far
| away from the isolation guarantees provided by gVisor (fully
| emulating the kernel in userspace, which is at the same level
| or even better than Firecracker) and nowhere close to a VM.
|
| -- Somehow, regular VMs (Kubevirt) get a "speed" rating of only
| "2 stars" - worse than gVisor ("3 stars") and Firecracker ("4
| stars"), even though they both rely on virtually the same
| virtualization technology. If anything, gVisor is the slowest
| but most efficient solution while QEMU maintains some
| performance advantage over Firecracker[2]. These are basically
| random scores, and it's not a good first impression: if you do a
| detailed comparison like that, at least do a proper evaluation
| before giving your own product the best score!
|
| -- They claim that "standard containers" cannot run a full OS.
| This isn't true - while it's typically a bad idea, this works
| just fine with rootless podman and, more recently, rootless
| docker. Allowing this is the whole point of user namespaces,
| after all! Maybe their custom procfs does a better job of
| pretending to be a VM - but it's simply false that you can't do
| these things without it. You can certainly run a full OS inside
| Kata/Firecracker, too, I've actually done that.
|
| Nitpicking over rating scales aside, the claim that their
| solution offers large security improvements over any other
| solution with user namespaces isn't true and the whole thing
| seems very marketing-driven. The isolation offered by user
| namespaces is still very weak and not comparable to gVisor or
| Firecracker (both in production use by Google/AWS for untrusted
| workloads!). False marketing is a big red flag, especially for
| something as critical as a container runtime.
|
| Anyone who wants unprivileged system containers might want to
| look into rootless docker or podman rather than this.
|
| [1]: https://www.nestybox.com
|
| [2]: https://www.usenix.org/system/files/nsdi20-paper-
| agache.pdf
| pritambaral wrote:
| > They claim that "standard containers" cannot run a full OS.
| ... this works just fine with rootless podman and, more
| recently, rootless docker.
|
| > Anyone who wants unprivileged system containers might want
| to look into rootless docker or podman rather than this.
|
| Perhaps I'm missing something, but I have been running full
| OS userlands using "standard containers" in production for
| years, via LXD[1].
|
| [1]: https://linuxcontainers.org/
| lima wrote:
| LXD uses privileged containers, though - this exposes a lot
| more attack surface, since uid 0 inside the container
| equals uid 0 outside.
| ctalledo wrote:
| Thanks for the feedback; I am one of the developers of
| Sysbox. Some answers to the above comments:
|
| - Regarding the container isolation, Sysbox uses a
| combination of Linux user-namespace + partial procfs & sysfs
| emulation + intercepting some sensitive syscalls in the
| container (using seccomp-bpf). It's fair to say that gVisor
| performs better isolation on syscalls, but it's also fair to
| say that by adding Linux user-ns and procfs & sysfs
| emulation, Sysbox isolates the container in ways that gVisor
| does not. This is why we felt it was fair to put Sysbox at a
| similar isolation rating as gVisor, although if you view it
| from purely a syscall isolation perspective it's fair to say
| that gVisor offers better isolation. Also, note that Sysbox
| is not meant to isolate workloads in multi-tenant
| environments (for that we think VM-based approaches are
| better). But in single-tenant environments, Sysbox does void
| the need for privileged containers in many scenarios because
| it allows well isolated containers/pods to run system
| workloads such as Docker and even K8s (which is why it's
| often used in CI infra).
|
| - Regarding the speed rating, we gave Firecracker a higher
| speed rating than KubeVirt because while they both use
| hardware virtualization, the former runs microVMs that are
| highly optimized and have much less overhead than the full VMs
| that typically run on KubeVirt. While QEMU may be faster than
| Firecracker in some metrics in a one-instance comparison,
| when you start running dozens of instances per host, the
| overhead of the full VM (particularly memory overhead) hurts
| its performance (which is the reason Firecracker was
| designed).
|
| - Regarding gVisor performance, we didn't do a full
| performance comparison vs. KubeVirt, so we may stand
| corrected if gVisor is in fact slower than KubeVirt when
| running multiple instances on the same host (would appreciate
| any more info you may have on such a comparison, we could not
| find one).
|
| - Regarding the claim that standard containers cannot run a
| full OS, what the table in the GH repo is indicating is that
| Sysbox allows you to create unprivileged containers (or pods)
| that can run system software such as Docker, Kubernetes, k3s,
| etc. with good isolation and seamlessly (no privileged
| container, no changes in the software inside the container,
| and no tricky container entrypoints). To the best of our
| knowledge, it's not possible to run say Kubernetes inside a
| regular container unless it's a privileged container with a
| custom entrypoint. Or inside a Firecracker VM. If you know
| otherwise, please let us know.
|
| - Regarding "The claim that their solution offers large
| security improvements over any other solution with user
| namespaces isn't true". Where do you see that claim? The
| table explicitly states that there are solutions that provide
| stronger isolation.
|
| - Regarding "The isolation offered by user namespaces is
| still very weak and not comparable to gVisor or Firecracker".
| User namespaces by itself mitigates several recent CVEs for
| containers, so it's a valuable feature. It may not offer VM-
| level isolation, but that's not what we are claiming.
| Furthermore, Sysbox uses the user-ns as a baseline, but adds
| syscall interception and procfs & sysfs emulation to further
| harden the isolation.
|
| - "False marketing is a big red flag, especially for
| something as critical as a container runtime." That's not
| what we are doing.
|
| - Rootless Docker/Podman are great, but they work at a
| different level than Sysbox. Sysbox is an enhanced "runc",
| and while Sysbox itself runs as true root on the host (i.e.,
| Sysbox is not rootless), the containers or pods it creates
| are well isolated and void the need for privileged containers
| in many scenarios. This is why several companies use it in
| production too.
| lox wrote:
| I don't spend a lot of time on those comparison-style charts
| if I'm honest, but that is good (and valid) feedback for
| them. I also hadn't heard of it, I discovered sysbox via
| jpettazo's updated post at
| https://jpetazzo.github.io/2015/09/03/do-not-use-docker-
| in-d..., he's an advisor of nestybox the company that
| develops sysbox.
|
| For the CI/CD usecase on AWS, sysbox presented the right
| balance of trade-offs between something like Firecracker
| (which would require bare metal hosts on AWS) and the docker
| containers that already existed. We specifically needed to run
| privileged containers so that we could run docker-in-docker
| for CI workloads, so rootless docker or podman wouldn't have
| helped. Sysbox lets us do that with a significant improvement
| in security over just running privileged docker containers as
| most CI environments end up doing.
|
| Just switching their docker-in-docker CI job containers to
| sysbox would have mitigated 4 of the compromises from the
| article with nearly zero other configuration changes.
| lima wrote:
| > We specifically need to run privileged containers so that
| we could run docker-in-docker for CI workloads, so rootless
| docker or podman wouldn't have helped.
|
| rootless docker works inside an unprivileged container
| (that's how our CI works).
| xmodem wrote:
| Could you elaborate a bit more on how you get the containers
| into their own IAM roles?
| nijave wrote:
| Not sure if this applies to the parent, but here's one way to
| do it with Buildkite.
|
| Queues map pipelines to agents. Agents can be assigned IAM
| roles. If you want a certain build to run as an IAM role, you
| give it a queue where the agents have that role. For AWS,
| Buildkite has a CloudFormation stack that sets up auto
| scaling groups and some other resources for your agents to
| run.
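|
| Concretely, the agents on the role-bearing instances get
| started with a queue tag, and pipelines needing that role
| target the same queue (names are examples):
|
|     # on the instance whose profile carries the deploy IAM role
|     buildkite-agent start --tags "queue=deploy"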
| xmodem wrote:
| Most CI systems will have some way of assigning builds to
| groups of agents. But it would in some cases be useful to
| grant different privileges to different containers running
| on the same agent, which is what I understood OP to have.
| orf wrote:
| AWS has IAM service accounts for containers. Comes for free
| with EKS, not sure how you'd do it without EKS.
|
| Basically it adds a signed web identity file into the
| container which can be used to assume roles.
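|
| Under the hood that token file gets exchanged for temporary
| credentials roughly like so (the SDKs normally do this for
| you; AWS_ROLE_ARN and the token file are injected by EKS):
|
|     aws sts assume-role-with-web-identity \
|       --role-arn "$AWS_ROLE_ARN" \
|       --role-session-name ci-job \
|       --web-identity-token "file://$AWS_WEB_IDENTITY_TOKEN_FILE"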
| captn3m0 wrote:
| Ref: https://docs.aws.amazon.com/eks/latest/userguide/iam-
| roles-f...
| otterley wrote:
| Amazon ECS also offers task roles which do the same thing:
| https://docs.aws.amazon.com/AmazonECS/latest/developerguide
| /...
| selecsosi wrote:
| For ECS: https://docs.aws.amazon.com/AmazonECS/latest/develop
| erguide/...
| lox wrote:
| Yup, we have a sidecar process/container that runs for each
| job and assumes an AWS IAM Role for that specific pipeline
| (with constraints like whether it's an approved PR as well).
| The credentials are provided to the job container via a
| volume mount. This allows us to have shared agents with very
| granular roles per-pipeline and job.
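|
| Roughly, the sidecar does something like this (role name and
| paths are illustrative):
|
|     # assume the per-pipeline role...
|     aws sts assume-role \
|       --role-arn "arn:aws:iam::123456789012:role/ci-$PIPELINE" \
|       --role-session-name "job-$BUILDKITE_JOB_ID" \
|       > /secrets/aws-creds.json
|     # ...and the job container picks the temporary credentials
|     # up from the shared /secrets volume mount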
| mvdwoord wrote:
| The company I currently do contract work for decided it would be
| best to have one large team in Azure DevOps and subdivide all
| teams into repositories etc. with prefixes and homegrown
| "Governer" scripts, which are enforced in all pipelines.
|
| A global find on terms like "key", "password" etc. was great
| fun. It really showed that most people, our team included,
| struggled with getting the pipeline to work at all, let alone
| doing it in a secure manner.
|
| This is a 50k+ employee financial institution. I am honestly
| surprised these kinds of attacks are not much more widespread.
| rawgabbit wrote:
| You would think by now we would have better credential methods. I
| still see usernames and passwords for system credentials. I see
| tokens created by three-legged auth flows. I don't get how that
| is an improvement. The problem is that most deployed code doesn't
| have just one credential but a dozen. Multiply that by several
| environments and you get security fatigue and apathy.
| MauranKilom wrote:
| Interesting to learn that credentials in environment variables
| are frowned upon. I mean, makes sense if your threat model
| includes people pushing malicious code to CI, but aren't you more
| or less done for at that point anyway? If "legitimate" code can
| do a certain thing, then malicious code can too. I guess
| you'll want to limit the blast radius, but drawing these
| boundaries seems like a nightmare for everyone...
| imachine1980_ wrote:
| says everybody to all sec teams
| xmodem wrote:
| > makes sense if your threat model includes people pushing
| malicious code to CI, but aren't you more or less done for at
| that point anyway? If "legitimate" code can do a certain thing,
| then malicious code can do too.
|
| The answer is very much, 'it depends'. For one thing,
| developers can run whatever code in CI before it's been
| reviewed. I could just nab the env vars and post them wherever.
| If there are no sensitive env vars for me to nab and you have
| enforced code review, then I need a co-conspirator, and my
| change is probably going to leave a lot more of a paper trail.
|
| Another risk is accidental disclosure - I have on at least two
| occasions accidentally logged sensitive environment variables
| in our CI environment. Now your threat model is not just a
| malicious developer pushing code - it's a developer making a
| mistake, plus anyone with read access to the CI system.
|
| I don't know about your org, but at my job, the set of people
| who have read access to CI is a lot larger than the set who can
| push code, which is again a lot larger than the set of people
| who can merge code without a reviewer signing off.
|
| > but drawing these boundaries seems like a nightmare for
| everyone...
|
| As someone currently struggling with how to draw them, yup.
| nickjj wrote:
| > For one thing, developers can run whatever code in CI
| before it's been reviewed.
|
| Yeah I don't think this gets talked about enough.
|
| If you're talking about private repos in an organization then
| CI often runs on any pull request. That means a developer is
| able to make CI run in an unreviewed PR. Of course for it to
| make its way into a protected branch (main, etc.) it'll
| likely need a code review, but nothing is stopping the
| developer who opened the unreviewed PR from modifying the CI
| yaml file in a commit to make that PR's pipeline do something
| different.
|
| Requiring a team lead or someone to allow every individual
| PR's pipeline to run (what GitHub does by default in public
| repos) would add too much friction and not all major git
| hosts support the idea of locking down the pipelines file by
| decoupling it from the code repo.
|
| Edit: Depending on which CI provider you use, this situation
| is mostly preventable -- "mostly" in the sense that you can
| control how much damage can be done. Check out this comment
| later in this thread:
| https://news.ycombinator.com/item?id=29967077
| cshokie wrote:
| It is also possible to segment the build pipelines into
| separate CI and PR builds, where only CI builds have access
| to any secrets. The PR pipeline just builds and tests, and
| all of the other tasks only happen in CI once a change has
| been merged into the main branch. That mitigates the
| "random person creating a malicious PR" problem because it
| has to be accepted and merged before it can do anything
| bad.
| reflectiv wrote:
| Each environment should have its own keys for the services
| it is talking to so _ideally_ this would restrict the scope
| of damage.
| nickjj wrote:
| > Each environment should have its own keys for the
| services it is talking to so ideally this would restrict
| the scope of damage.
|
| If a developer changes the CI pipeline file to make their
| PR's code run in `deployment: "production"` instead of
| `deployment: "test"` doesn't that bypass this?
|
| Edit:
|
| I'll leave my original question here because I think it's
| an important one but I answered this myself. It depends
| on which CI provider you're using but some of them do let
| you restrict specific deployments so they run only on
| specific branches or only by specific folks (such as repo
| admins).
|
| In the above case if the production deployment was only
| allowed to run on the main branch and the only way code
| makes its way into the main branch is after at least 1
| person reviewed + merged it (or whatever policy your
| company wants) then a rogue developer can't edit the
| pipeline in an unreviewed PR to make something run in
| production.
|
| Also, with deployment-specific environment variables, a
| rogue developer is not able to edit a pipeline
| file to try and run commands that may affect production,
| such as doing a `terraform apply` or pushing an
| unreviewed Docker image to your production registry.
| acdha wrote:
| > If a developer changes the CI pipeline file to make
| their PR's code run in `deployment: "production"` instead
| of `deployment: "test"` doesn't that bypass this?
|
| As a concrete example, GitLab has the concept of
| protected branches and code owners, both of which allow
| you to restrict access to the corresponding environments'
| credentials to a smaller group of people who have
| permission to touch the sensitive branches. That allows
| you to say things like "anyone can run in development but
| only our release engineers can merge to
| staging/production" or "changes to the CI configuration
| must be approved by the DevOps team", respectively.
|
| That does, of course, not prevent someone from running a
| Bitcoin miner in whatever environment you use to run
| untrusted merge requests but that's better than access to
| your production data.
| Nextgrid wrote:
| CI should only have environment variables needed for
| testing. For building/deploying to production, it just
| has to _push_ the code/package/container image, not
| _run_ it, meaning it has no need for production-level
| credentials.
|
| CI should never ever have access to anything related to
| production; not just for security but also to prevent
| potentially bad code being run in tests from trashing
| production data.
| formerly_proven wrote:
| Yeah but I mean... that's why CI and CD are separate
| things. CI _should not_ need any privileges. CI builds
| _should_ be hermetic (no network access, no persistence,
| ideally fully reproducible from what's going in). CI
| _should not_ talk to servers on your network, let alone
| have credentials, especially for production systems.
| tremon wrote:
| Our CI systems have read access to the data definition
| store (which is a sql database right now), because we
| don't store interop data definitions in code. So fully
| hermetic no, because our code (repository) is not fully
| self-contained. The definition store has its own audits
| and change tracking, but it's separate from the interface
| code.
| derefr wrote:
| That seems fine, in the same way that e.g. an XML library
| fetching DTDs from a known public URL is fine.
|
| However, it'd probably be better if you could have the CI
| _framework_ collect and inject that information into the
| build using some hard-coded deterministic logic, rather
| than giving the build itself (developer-driven Arbitrary
| Code Execution) access to that capability.
|
| Same idea as e.g. injecting Kubernetes Secrets into Pods
| as env-vars at the controller level, rather than giving
| the Pod itself the permission to query Secrets out of the
| controller through its API.
| NAHWheatCracker wrote:
| CI for me has to access repositories. Eg. downloading
| libraries from PyPI, Maven, NPM, etc...
|
| For private repositories, that means access to
| credentials. Probably read-only credentials, but it
| requires network access.
|
| Would you be suggesting that everyone should commit all
| dependencies?
| formerly_proven wrote:
| That's a fair point. I think for many projects which only
| consume dependencies the answer can be "yes, just commit
| all your dependencies" in many instances. Pulling from
| internal repos shouldn't be critical though as long as
| the read-only tokens you're using can only access what
| the developers can read anyway; pulling "unintended" code
| in a CI _should_ never be able to escalate because we're
| running anything a dev pushed.
| NAHWheatCracker wrote:
| I've heard of doing this, although not worked on such a
| project. Committing node_modules seems wild to me and I
| foresee issues with committing all the Python wheels
| necessary for different platforms on some projects.
|
| I'm a proponent of lock-file approaches, which gain 99%
| of the benefits with far less pain. It requires network
| access, though.
| emteycz wrote:
| Yarn v2 is much better for committing dependencies than
| committing node_modules.
|
| However, it's problematic. Only use it if you're certain
| it will solve a specific problem you have.
|
| Consider using Docker to build images that include a
| snapshot of node_modules.
| derefr wrote:
| You could "just"
|
| 1. stand up a private package mirror that you control,
| that uses a whitelist for what packages it is willing to
| mirror;
|
| 2. configure your project's dependency-fetching logic to
| fetch from said mirror;
|
| 3. configure CI to only allow outbound network access to
| your package mirror's IP.
|
| The disadvantage -- but also the _point_ -- of this is
| that it is then a release manager's responsibility, not
| a developer's responsibility, to give the final say-so
| for adding a dependency to the project (because only the
| release manager has the permissions to add packages to
| the mirror.)
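|
| A sketch of steps 2 and 3 (mirror URL and IP are
| placeholders):
|
|     # point dependency fetching at the whitelisting mirror
|     pip install --index-url https://pypi.mirror.internal/simple \
|       -r requirements.txt
|     # and only allow CI egress to that mirror
|     iptables -A OUTPUT -d 10.0.5.20 -p tcp --dport 443 -j ACCEPT
|     iptables -A OUTPUT -p tcp --dport 443 -j DROP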
| NAHWheatCracker wrote:
| That's an interesting approach. I'm not keen on the
| bureaucratic aspect, which leads to more friction than
| it's worth in my experience.
|
| I guess that's beside the point if your goal is only to
| reduce risk of compromised CI/CD.
| mulmen wrote:
| You can compromise and give the developers the ability to
| add dependencies as well. In a separate "dependency
| mirror" package. But those have to be code reviewed like
| anything else. So you have a paper trail but adding a
| dependency is lower friction. And you still can't
| accidentally (or maliciously!) pull in an evil dependency
| without at least two people involved.
| nightpool wrote:
| Sure, but that's a completely different scenario than the
| one the OP/TFA is talking about. We're talking about
| disclosing privileged information or access tokens that a
| single developer on their own shouldn't be able to handle
| without code review, like being able to use your CI
| system to access production accounts. Software libraries
| don't fall under this category, since the developer would
| already need to have accessed them to write the code in
| the first place!
| NAHWheatCracker wrote:
| I was focused on the "no network access" aspect of
| formerly_proven's comment. "no network access" would make
| CI nearly pointless from my perspective.
|
| I agree that there are tokens and variables that are
| dangerous to expose via CI, but throwing the baby out
| with the bathwater confused me.
| pc86 wrote:
| Based on what?
|
| Every CI system I've _ever_ seen has pulled dependencies
| in from the network.
| inetknght wrote:
|     git clone --recursive --branch "$commitid" "$repourl" "$repodir"
|     img="$(docker build --network=none -f "$dockerfile" "$repodir")"
|     docker run --rm -ti --network=none "$img"
|
| Sure, CI pulls in from the network... but execution
| occurs without network.
| staticassertion wrote:
| Can you explain wtf is happening here?
| inetknght wrote:
| I'd assumed some familiarity with common CI systems (and
| assumed the commands would be tailored to your use case).
| Let me walk through it in some more depth.
|
| First:
|
| > git clone --recursive --branch "$commitid" "$repourl"
| "$repodir"
|
| The `git clone` will take your git repository URL as the
| $repourl variable. It will also take your commit id
| (commit hash or tagged version which a pull request
| points to) as the $commitid variable (`--branch
| $commitid`). It will also take a $repodir variable which
| points to the directory that will contain the contents of
| the cloned git repository and already checked out at the
| commit id specified. It will do so recursively
| (`--recursive`): if there are submodules then they will
| also automatically be cloned.
|
| This of course assumes that you're cloning a public
| repository and/or that any credentials required have
| already been set up (see `man git-config`).
|
| Then:
|
| > img="$(docker build --network=none -f "$dockerfile"
| "$repodir")"
|
| Okay so this is sort of broken: you'd need a few more
| parameters to `docker build` to get it to work "right".
| But as-is, `docker build` usually has network access so
| `--network=none` will specify that the build process will
| _not_ have access to the network. I hope your build
| system doesn't automatically download dependencies
| because that will fail (and also suggests that the build
| system may be susceptible to attack). You specify a
| dockerfile to build using `-f "$dockerfile"`. Finally,
| you specify the build context using "$repodir" -- and
| that assumes that your whole git repository should be
| available to the dockerfile.
|
| However, `docker build` will write a lot more than _just_
| the image name to standard output and so this is where
| some customization would need to occur. Suffice to say
| that you can use `--quiet` if that's all you want; I do
| prefer to see the output because it normally contains
| intermediate image names useful for debugging the
| dockerfile.
|
| Finally:
|
| > docker run --rm -ti --network=none "$img"
|
| Finally, it runs the built image in a new container with
| an auto-generated name. `-ti` here is wrong: it will
| attach a standard input/output terminal and so if it
| drops you into an interactive program (such as bash) then
| it could hang the CI process. But you can remove that. It
| also assumes that your dockerfile correctly specifies
| ENTRYPOINT and/or CMD. When the container has exited then
| the container will automatically be removed (--rm) --
| usually they linger around and pollute your docker host.
| Finally, the --network=none also ensures that your
| container does not have network access so your unit tests
| should also be capable of running without the network or
| else they will fail. You could use `--volume` to specify
| a volume with data files if you need them. You might also
| want to look at `--user` if you don't want your container
| to have root privileges...
|
| And of course if you want _integration tests_ with other
| containers then you should create a dedicated docker
| network and specify its alias with `--network`: see `man
| docker-network-create`; you can use `docker network
| create --internal` to create a network which shouldn't
| let containers out.
|
| Does that answer your question?
| staticassertion wrote:
| Yes, thank you for the detailed explanation.
| structural wrote:
| After the CI system (with network access) pulls down the
| code, including submodules, this code is then placed into
| a container with no network access to perform the actual
| build.
| pharmakom wrote:
| Is such a clean separation possible? I've seen some crazy
| things...
| snovv_crash wrote:
| Now try this with eg. OpenCV, or ONNX, or tflite, or a
| million other packages that try to download additional
| information at compile time.
| heavenlyblue wrote:
| The solution to that is simple: do not do any tests with
| production secrets :)
| marcosdumay wrote:
| Wait. Isn't any CI step done before review expected to be
| configured for a test environment?
|
| I'm failing to understand how that procedure even works. How
| do you run the tests?
| detaro wrote:
| The article is more specific than that: They shouldn't be
| shared with code run by people/jobs who shouldn't have access
| to it. I.e. don't have secrets used for deploys in the
| environment that runs automatically on every PR if deploys are
| gated behind review by a more limited list of users.
| mulmen wrote:
| > I mean, makes sense if your threat model includes people
| pushing malicious code to CI, but aren't you more or less done
| for at that point anyway?
|
| Maybe. Back in the old days if you had the commit bit your
| badge didn't get you into the server room. I get the impression
| a lot of shops are effectively giving their devs root but in
| the cloud this time, which isn't necessary.
| movedx wrote:
| This is exactly why I both love and hate CI/CD.
|
| Ultimately most CI/CD setups are basically systems administrators
| with privileged access to everything, network connected and
| running 24/7. It's pretty dangerous stuff.
|
| I don't have an answer though, except maybe to keep the CI and CD
| in separate, isolated instances that require manual intervention
| to bridge the gap on a case by case basis. That doesn't scale
| very well though.
| contingencies wrote:
| _A distributed system is one where the failure of a machine
| you've never heard of stops you from getting any work done._ -
| Leslie Lamport
|
| ... via https://github.com/globalcitizen/taoup
| hinkley wrote:
| I think in general we put too much logic into our CI/CD
| configurations.
|
| There is an argument to be made for a minimalist CI/CD
| implementation that can handle task scheduling and
| dependencies, understands how to fetch and tag version control,
| count version numbers and not much else. Even extracting test
| result summaries, while handy, maybe should be handled another
| way.
|
| For many of us, if CI is down you can't deploy anything to
| production, not even roll back to a previous build. Everything
| but the credentials should be under version control, and the
| right people should be able to fire off a one-liner from a
| runbook that has two to four _sanity checked_ arguments in
| order to trigger a deployment.
| thomasmarcelis wrote:
| >>In our final scenario, the NCC Group consultant got booked on a
| scenario-based assessment:
|
| >>"Pretend you have compromised a developer's laptop."
|
| Most companies will fail right here. Especially outside of the
| tech world, security hygiene on developers' laptops is very bad
| from what I have seen.
| jiggawatts wrote:
| A weakness of modern secret management is that it isn't.
|
| A secret value ought to be very carefully guarded _even from the
| host machine itself_.
|
| .NET for example has SecureString, which is a good start -- it
| can't be accidentally printed or serialised insecurely. If it is
| serialised, then it is automatically encrypted by the host OS
| data protection API.
|
| Windows even has TPM-hosted certificates! They're essentially a
| smart card plugged into the motherboard.
|
| A running app can _use_ a TPM credential to sign requests but it
| can't read or copy it.
|
| These advancements are just completely ignored in the UNIX world,
| where everything is blindly copied into easily accessible
| locations in plain text...
| pxx wrote:
| Except afaict SecureString doesn't reliably do that and
| shouldn't be used. https://github.com/dotnet/platform-
| compat/blob/master/docs/D...
| jiggawatts wrote:
| "It's not perfectly secure so use a totally insecure
| alternative instead" seems like terrible advice.
| pxx wrote:
| No it's "don't use this thing which doesn't say what it
| does on the tin and is therefore a foot gun." Something
| that is obviously insecure will be treated with more
| caution / put on the correct side of the authorization
| boundary than something that claims to be.
| INTPenis wrote:
| Many of these points are about running pipelines in privileged
| containers, something I actually took extra time to resolve for
| my team. That's when I discovered kaniko first, and shortly
| after, podman/buildah.
|
| After that podman and buildah have gotten a lot of great reviews
| from people so I think they're awesome.
|
| For an old time Unix sysadmin it just doesn't make sense to run
| something as root unless you absolutely have to.
|
| Which also makes the client's excuse in the article so strange:
| they had to run the container privileged to run static code
| analysis. wtf. Doesn't that just mean they run a tool against a
| binary artefact from a previous job? I fail to see how that
| requires privileges.
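|
| For the build-an-image-in-CI case specifically, the rootless
| route can be as simple as (registry name is a placeholder):
|
|     podman build -t registry.example.com/app:ci .
|     podman push registry.example.com/app:ci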
___________________________________________________________________
(page generated 2022-01-17 23:01 UTC)