[HN Gopher] Abusing Go's Infrastructure
___________________________________________________________________
Abusing Go's Infrastructure
Author : efge
Score : 297 points
Date : 2024-05-25 12:50 UTC (10 hours ago)
(HTM) web link (reverse.put.as)
(TXT) w3m dump (reverse.put.as)
| arccy wrote:
| it's a known issue https://github.com/golang/go/issues/31866
| yjftsjthsd-h wrote:
| That fix would help with accidents, but wouldn't someone
| _intentionally_ doing it just add a go.mod and a .go file to
| the root?
| jerf wrote:
| How do you "fix" that at all?
|
| In the end, there is no definition of "a source control
| repository that is a Go module" that is robust to this sort
| of "attack"... although calling it an "attack" is kind of
| dubious, the reasons why this is a bad thing strike me as
| very strained and relatively weak. Mostly it hurts Google by
| hosting too much stuff, but, good luck bringing them down
| that way.
| oooyay wrote:
| Color me unsurprised Marwan is on this issue. He and Aaron
| wrote Athens, Marwan wrote (to my knowledge) the first Go
| download protocol implementation that Athens is based on.
|
| This issue is kind of curious because Athens already uses the
| go mod download -json command mentioned as a preflight check
| for module verification. More or less, if the repo passes the
| go module command's understanding of a module then Athens will
| serve it. In more verbose terms:
|
| - a module version, pseudo version, or +incompatible must be
| able to be formulated
|
| - that module (and its dependencies) must produce a valid
| checksum
|
| The checksum of modules just has to do with the current .mod
| and all files + recursively for each dependency. So, as the
| author pointed out you can have lots of space for arbitrary
| files by design so long as you have a basic go program.
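|
| A minimal sketch of how such a checksum can be computed, assuming
| the "h1:" scheme as I understand it from
| golang.org/x/mod/sumdb/dirhash: SHA-256 each file, write one
| summary line per file in sorted order, then SHA-256 and
| base64-encode the summary. Under that reading each dependency gets
| its own go.sum entry rather than being folded into this hash. The
| module path and file contents below are hypothetical.

```python
import base64
import hashlib


def hash1(files: dict[str, bytes]) -> str:
    """Compute an "h1:"-style hash over a mapping of file name -> contents."""
    summary = hashlib.sha256()
    for name in sorted(files):
        if "\n" in name:
            raise ValueError("file names containing newlines are not supported")
        file_digest = hashlib.sha256(files[name]).hexdigest()
        # One summary line per file: hex digest, two spaces, file name, newline.
        summary.update(f"{file_digest}  {name}\n".encode("utf-8"))
    return "h1:" + base64.b64encode(summary.digest()).decode("ascii")


# Hypothetical module contents, keyed as "module@version/path".
module_files = {
    "example.com/demo@v1.0.0/go.mod": b"module example.com/demo\n\ngo 1.22\n",
    "example.com/demo@v1.0.0/main.go": b"package main\n\nfunc main() {}\n",
    # Nothing in the hash cares whether a file is Go source or arbitrary bytes.
    "example.com/demo@v1.0.0/assets/blob.bin": bytes(range(256)),
}

if __name__ == "__main__":
    print(hash1(module_files))
```

| Every file in the tree feeds the hash in the same way, which is
| why arbitrary payloads ride along with a minimal Go program.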
| kyrra wrote:
| Googler, opinions are my own. I know nothing about this space.
|
| I would hope the Go team collaborated with GCP and Drive, as
| hosting malicious files is something Google has to deal with all
| the time. This isn't much different from other endpoints Google
| already allows people to put random data on.
| 8organicbits wrote:
| I know pypi has some non-python projects as well. Python needs
| the ability to distribute wheels, which are compiled binaries, as
| the user may not be able to compile library code. Lots of that
| code is written in C, but Golang[1] is also possible. I can't
| find an example, but I believe I've seen this used for
| distributing applications (not libraries) as well. It's kinda
| cool to write some app in C, upload to pypi, and then ask users
| to install with `pip install`.
|
| [1] https://github.com/popatam/gopy_build_wheel_example
| bee_rider wrote:
| Hypothetically if they did try to add some requirement to use
| Python, people could just comply maliciously by providing the
| most minimal stub of Python code, right? Linux, but ls is
| written in Python. So it is probably better just to not play
| games.
| Maxatar wrote:
| You could embed the binary data in a Python string and then
| have the installer dump that string to a file.
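|
| A minimal sketch of that idea, assuming the payload is kept as a
| base64 string inside the Python source. The payload bytes, the
| PAYLOAD_B64 name and the dump_payload helper are all made up for
| illustration.

```python
import base64
import tempfile
from pathlib import Path

# Hypothetical payload. In a real package this would be a long string literal;
# it is computed here only to keep the sketch short.
PAYLOAD_B64 = base64.b64encode(b"\x7fELF-not-a-real-binary\x00\x01\x02").decode("ascii")


def dump_payload(target: Path) -> Path:
    """Decode the embedded string and write the original bytes to target."""
    target.write_bytes(base64.b64decode(PAYLOAD_B64))
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        out = dump_payload(Path(tmp) / "tool.bin")
        print(out.name, out.stat().st_size, "bytes")
```

| The file is valid Python from top to bottom, so a "must contain
| Python" rule would not filter it out.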
| rfoo wrote:
| pip install cmake
|
| or even proprietary binaries, pip install nvidia-cudnn-cu12
| IshKebab wrote:
| Yeah I copied CMake's idea of using PyPI and I also use it to
| distribute some pure Rust CLI tools using Maturin. It works
| really well. Pip is... well it's about on par with most other
| package managers, i.e. not great, not terrible, but it has
| some pretty huge advantages over any other software
| distribution method on Linux:
|
| * Very likely to be installed already on Linux and probably
| Mac too.
|
| * Doesn't require root to install. You can even have isolated
| installs via pyenv.
|
| * I don't have to ask anyone's permission to publish a
| package.
|
| * I only have to make one package.
|
| If anyone can think of a better option I'm all ears but until
| then I'm fairly happy with this hack.
| Too wrote:
| Some of those arguments are becoming more and more
| difficult as pip and distros are pushing for use of venvs
| and now require a scary --break-system-packages argument
| if you were to use the preinstalled launcher.
| IshKebab wrote:
| That is a good point. Distro package managers have
| somehow screwed this up too.
| mort96 wrote:
| I guess that was much more useful as a use case before pip
| started requiring you to be in a
| venv/virtualenv/pipenv/pyenv/whatever to download packages
| 12_throw_away wrote:
| I've never encountered this requirement in many years of
| daily use - pip for me has always happily installed anything
| if it can.
|
| Now I've definitely seen customized distributions of python
| from package managers that have taken steps to prevent you
| from using pip. IIRC, the python you get from `apt-get
| install python` in Debian does this? I.e., it's designed to
| support system utilities, not as a user's general purpose
| python environment, and they want `apt-get` to control this
| environment, not pip. So they've removed pip and ensure_pip
| and easy_install from your core system python environment.
|
| TLDR: In my experience, that requirement doesn't come from
| pip, it's your distro taking steps to prevent
| https://xkcd.com/1987/
| thayne wrote:
| I'm not sure if that is upstream or an Ubuntu or Debian
| patch, but that is the case on Ubuntu 24.04, at least
| unless you pass the --break-system-packages option.
| garblegarble wrote:
| Sorta, although it is a Python feature for distros to use:
| installations can be marked as externally managed and pip
| will refuse (without being forced) to make changes unless
| in a venv[1][2]
|
| 1: PEP 668 https://peps.python.org/pep-0668/
|
| 2: https://packaging.python.org/en/latest/specifications/extern...
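|
| A rough sketch of the check PEP 668 describes, assuming the
| marker is a file named EXTERNALLY-MANAGED in the interpreter's
| stdlib directory and that being inside a virtual environment
| lifts the restriction. The function names are hypothetical and
| this is not pip's actual code.

```python
import sys
import sysconfig
from pathlib import Path


def externally_managed_marker() -> Path:
    """Path where PEP 668 says the marker file lives for this interpreter."""
    stdlib_dir = sysconfig.get_path("stdlib")
    return Path(stdlib_dir) / "EXTERNALLY-MANAGED"


def in_virtual_environment() -> bool:
    """True when running inside a venv (prefix differs from the base prefix)."""
    return sys.prefix != sys.base_prefix


def installer_should_refuse(marker_exists: bool, in_venv: bool) -> bool:
    """An installer following PEP 668 refuses only outside a venv with the marker present."""
    return marker_exists and not in_venv


if __name__ == "__main__":
    marker = externally_managed_marker()
    refuse = installer_should_refuse(marker.is_file(), in_virtual_environment())
    print(f"marker: {marker} (present: {marker.is_file()})")
    print("an installer following PEP 668 would refuse:", refuse)
```

| So the refusal is opt-in per installation: the distro drops the
| marker file, and the installer honours it.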
| verdverm wrote:
| CUE's module system is finally rolling out, MVS like Go's, but
| built on OCI infra. If you are interested in dependency
| management systems, here are some links
|
| - proposal: https://github.com/cue-lang/proposal/tree/main/designs/modul...
|
| - custom registry: https://cuelang.org/docs/tutorial/working-with-a-custom-modu...
|
| - road map: https://github.com/orgs/cue-lang/projects/10/views/8
|
| - in 0.9.0-alpha.5, modules become enabled by default:
| https://github.com/cue-lang/cue/releases/tag/v0.9.0-alpha.5
|
| For Go Sum, the Trillian project backs the transparency log:
| https://github.com/google/trillian
|
| CUE plans to piggyback on the OCI options with attestations and
| such
| dgellow wrote:
| What does that have to do with the linked article?
| verdverm wrote:
| The CUE team worked with the Go team on the module system.
| From these interactions, and community input, they decided
| against using a proxy like Go has. The "exploit" in the
| article was one of the reasons they made this decision, and
| chose to use OCI registries instead. The V1 proposal actually
| proposed using the same Go proxy servers as a stopgap, which
| received significant pushback from the community (I was
| probably the loudest voice against the idea). The Go team was
| supportive at the time, but this would have been exactly what
| OP talks about, having non-Go projects in the proxy/sumdb.
|
| So CUE's module design can be seen as an evolution on Go's,
| building on the good parts while addressing some of the
| shortcomings.
|
| Fun fact, CUE started as a fork of Go, mainly for the
| internal compiler tooling and packages
| ithkuil wrote:
| I toyed with the idea of piggybacking on (i.e. abusing) the
| golang proxy and sumdb to have a free transparent log of
| checksums of arbitrary URLs
|
| https://getsum.pub/
| arccy wrote:
| sounds convoluted. If you just want a public transparency log,
| the public rekor instance under the sigstore project is much
| more appropriate for that.
|
| https://www.sigstore.dev/
|
| https://docs.sigstore.dev/logging/overview/
| lpapez wrote:
| Sure, but the gosum database is a critical piece of worldwide
| software infrastructure, so you can count on it being
| accessible behind many firewalls and always up. And it's
| completely free and anonymous.
|
| Perfect for the purpose.
| ithkuil wrote:
| Yeah when I did that there was no public rekor instance run
| by the sigstore project so I chose the only available public
| transparency log I could bend to my needs (x509 transparency
| logs were an alternative but it'd quickly hit rate limits by
| acme providers)
| skybrian wrote:
| Interesting! Looks like it's being used by some npm packages
| [1] and soon homebrew will be using it [2]. Any other
| interesting usage?
|
| As a user, the npm usage doesn't seem very prominent. On an
| npm package's web page, there's a checkmark next to the version
| number on the right side that I hadn't paid any attention to
| before, with more information at the very bottom of the page.
| Here's an example. [3]
|
| [1] https://blog.sigstore.dev/npm-provenance-ga/
| [2] https://blog.sigstore.dev/homebrew-build-provenance/
| [3] https://www.npmjs.com/package/fast-check
| mynameisvlad wrote:
| It's at the bottom of the page _on mobile_. On desktop,
| that's the first thing on the right hand side of the screen
| IIRC.
| skybrian wrote:
| For me on desktop, the version seems to be the fourth
| thing down in the right column, under weekly downloads,
| and there's a checkmark. (Or maybe I'm missing
| something.)
| miki123211 wrote:
| Any online service that lets users upload material that is then
| publicly visible will eventually be used for command-and-control,
| copyright infringement and hosting CSAM. This is especially true
| for services that have other important uses besides file hosting
| and hence are hard to block.
|
| This already happened to Twitter[1], Telegram[2], and even the
| PGP key infrastructure[3], not to mention obvious suspects like
| GitHub.
|
| [1] https://pentestlab.blog/2017/09/26/command-and-control-twitt...
| [2] https://www.blazeinfosec.com/post/leveraging-telegram-as-a-c...
| [3] https://torrentfreak.com/openpgp-keyservers-now-store-irremo...
| liquidgecka wrote:
| And Gmail and Google Groups, and Google Drive, and Gchat, on
| and on. The data you store doesn't even have to be public. With
| Gmail, abusers would distribute credentials to log in and read
| attachments that they had uploaded via IMAP.
|
| (I am a former Google SAD-SRE [Spam, Abuse, Delivery])
| howenterprisey wrote:
| Just curious, "Delivery" doesn't seem to be the same sort of
| thing as "Spam" and "Abuse": why are the three grouped?
| fred256 wrote:
| No inside information, but presumably this means Delivery
| to other organizations, which, among other things, includes
| maintaining outbound IP reputation, which is closely
| related to Spam and Abuse.
| romafirst3 wrote:
| Delivery is what happens if it's not spam or abuse.
| zepolen wrote:
| Question, how would you know without invading the user's
| privacy?
| kccqzy wrote:
| An algorithm that processes private user data is by itself
| not invading anyone's privacy. It's clear to me that
| invasion of privacy only happens when humans look at
| private user data directly, or look at user data that's not
| sufficiently processed by an algorithm.
|
| Otherwise, something as simple as a spell checker would be
| an invasion of privacy because it literally looks at every
| word in an email you write. That's absurd.
| _heimdall wrote:
| At least in my opinion, there's a big difference with
| where the data lives and where the checking algorithm is
| run. I don't think a spell checker would fall into what
| I'd consider a privacy concern as long as the spell
| checker is running locally on my device.
| imachine1980_ wrote:
| I don't work in the area of email nor at Google, but I see
| two problems.
|
| 1) You need to constantly update the spell checker, so
| each time you say "this is a word" or something like that,
| the data is most likely sent back; the problem is that the
| word is part of the data. I assume Google does something
| similar with data sent to spam and marked as not spam. This
| is full email redirection and analysis, not partial like
| old word processing.
|
| 2) I feel AI makes this even harder, so now you can't
| check patterns as simply as before, and you need to check
| the whole content constantly.
| carom wrote:
| Companies are legally obligated to scan for CSAM in the US.
| toast0 wrote:
| I don't think that's accurate... Do you have a link?
|
| I do think there is an obligation to report if any is
| found, but I don't think they need to look.
| gloryjulio wrote:
| Just a side note, I found the name sad sre funny and blursed
| at the same time
| cdelsolar wrote:
| What's blursed mean?
| jprete wrote:
| Simultaneously blessed and cursed.
| follower wrote:
| > I am a former Google SAD-SRE
|
| From long enough ago that I should apologize to you for
| libgmail: https://libgmail.sourceforge.net ? :D
| nerdponx wrote:
| It seems like it would be pretty easy to use PyPI for this,
| because packages can contain arbitrary non-Python files. And
| you can also do things like base 64 encoding your files in
| strings in Python code.
| weinzierl wrote:
| Not sure if it has already happened, but the not so obvious one
| is HuggingFace.
| miki123211 wrote:
| No idea if it's used for CSAM or malware, but copyright
| infringement on a massive scale?
| https://huggingface.co/datasets/EleutherAI/pile
| 9dev wrote:
| Off topic: took a look at the domain, had a foreboding on the
| innuendo, found mostly what I expected on put.as ...
| moonlion_eth wrote:
| I made the mistake of doing that at work
| yazzku wrote:
| Not off-topic at all; came here just for this.
| KolmogorovComp wrote:
| https://put.as/ Mildly NSFW.
| mdtrooper wrote:
| Yes, it is a Spanish plural word. But....
| IshKebab wrote:
| Maybe I'm being stupid but what exactly is the issue here? It's
| probably a bit wasteful of the proxy to cache non-Go repos, but
| even if it didn't you could make it store arbitrary data just by
| having it cache a Go repo surely? Sounds like a complete
| non-issue unless I've missed something.
| arandomusername wrote:
| you're right.
| gnfargbl wrote:
| I don't think you've missed anything. The news here appears to
| be that an unsecured public proxy is willing to proxy things and
| make them available to the public in an unsecured fashion.
|
| The article does make the point that some monitored networks
| might trust golang proxy URLs more than arbitrary web URLs and
| that this could be used for bypassing reputation filters etc --
| but there are already several ways to do that, and this one
| doesn't seem _particularly_ special.
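|
| For reference, a small sketch of what those proxy URLs look like,
| assuming the documented GOPROXY layout ($base/$module/@v/...) and
| its escaping rule, where each uppercase letter becomes "!" plus
| the lowercase letter. The module path in the usage line is a
| placeholder, and the helper names are made up.

```python
DEFAULT_PROXY = "https://proxy.golang.org"


def escape(path: str) -> str:
    """Escape a module path or version: 'A'..'Z' become '!' + lowercase letter."""
    return "".join("!" + ch.lower() if "A" <= ch <= "Z" else ch for ch in path)


def proxy_urls(module: str, version: str, base: str = DEFAULT_PROXY) -> dict[str, str]:
    """Build the per-version endpoints a GOPROXY server exposes for a module."""
    prefix = f"{base}/{escape(module)}/@v"
    v = escape(version)
    return {
        "list": f"{prefix}/list",      # known versions, one per line
        "info": f"{prefix}/{v}.info",  # JSON metadata for the version
        "mod": f"{prefix}/{v}.mod",    # the go.mod file
        "zip": f"{prefix}/{v}.zip",    # the whole module tree
    }


if __name__ == "__main__":
    for kind, url in proxy_urls("example.com/Demo", "v1.0.0").items():
        print(kind, url)
```

| The .zip endpoint is the one that carries arbitrary files, all
| served from a hostname many filters already allow.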
| palata wrote:
| That's maybe naive, but... how is that different than just
| pushing files to e.g. a GitHub repository? Is it just the fact
| that you need to create an account for GitHub? Because I can
| store arbitrary data there, too. Without the 500M limit...
| withinboredom wrote:
| GitHub has some pretty stringent rate limits for anon requests.
| erik_seaberg wrote:
| W3C laid the groundwork for _everything_ on the Web to be heavily
| cacheable, so it's weird that there are so few general-purpose
| proxy caches. Are publishers sending short "Cache-Control:
| max-age" or "Vary: Cookie" responses when they didn't need to?
| Are too many ISPs paying for transit rather than peering?
| lmz wrote:
| In general there's no way to ensure the cache hasn't tampered
| with the contents (e.g. ISP proxy ad injection on non HTTPS
| sites). For software downloads usually there are signatures and
| checksums. Arbitrary content, not so much.
| arccy wrote:
| There was HTTP SXG (signed exchanges) but it never seemed to
| get any traction https://web.dev/articles/signed-exchanges
| iainmerrick wrote:
| Even if the content is signed, there's still the issue that
| the proxy gets to see everything you read, right?
| yencabulator wrote:
| Maybe use cache only when Subresource Integrity is present.
|
| https://developer.mozilla.org/en-US/docs/Web/Security/Subres...
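|
| A sketch of what that check could look like, assuming the usual
| Subresource Integrity format of "<algorithm>-<base64 digest>" and
| a single hash per value (real integrity attributes may list
| several). The cache_may_serve helper is hypothetical.

```python
import base64
import hashlib
import hmac

SUPPORTED = ("sha256", "sha384", "sha512")


def sri_value(content: bytes, algorithm: str = "sha384") -> str:
    """Build an integrity value such as 'sha384-<base64 digest>' for content."""
    if algorithm not in SUPPORTED:
        raise ValueError(f"unsupported algorithm: {algorithm}")
    digest = hashlib.new(algorithm, content).digest()
    return f"{algorithm}-" + base64.b64encode(digest).decode("ascii")


def cache_may_serve(content: bytes, integrity: str) -> bool:
    """Accept a cached body only if it matches the publisher's integrity value."""
    algorithm = integrity.split("-", 1)[0]
    try:
        expected = sri_value(content, algorithm)
    except ValueError:
        return False
    return hmac.compare_digest(expected, integrity)


if __name__ == "__main__":
    body = b"console.log('hi');\n"
    tag = sri_value(body)
    print(tag)
    print(cache_may_serve(body, tag), cache_may_serve(body + b"x", tag))
```

| This covers tampering only; it does nothing about the proxy
| seeing what you fetch.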
___________________________________________________________________
(page generated 2024-05-25 23:00 UTC)