[HN Gopher] Keylogger discovered in image generator extension
       ___________________________________________________________________
        
       Keylogger discovered in image generator extension
        
       Author : belladoreai
       Score  : 234 points
       Date   : 2024-06-09 17:29 UTC (5 hours ago)
        
 (HTM) web link (old.reddit.com)
 (TXT) w3m dump (old.reddit.com)
        
       | LtWorf wrote:
       | No domain and website registered?
        
       | skilled wrote:
       | Looks like a pretty small project. Only had 40 stars on GitHub
       | before the repo was removed.
       | 
       | Was this the main method of GPT4 and Claude integrations for
       | ComfyUI?
        
         | belladoreai wrote:
         | It was an extension for ComfyUI, which has 37k stars on GitHub.
         | The way ComfyUI is commonly used is that a person shares a
         | "workflow" file, which utilizes various obscure extensions
         | (called "custom nodes") and then the people who want to run the
         | workflow on their own computer will install all these obscure
         | custom nodes that have like 40 stars on GitHub or so.
        
           | szundi wrote:
           | Just like an npm install
        
         | LtWorf wrote:
         | Using stars as popularity doesn't work.
         | 
         | I have personally never starred anything that I use. And 90% of
         | the open source that I use isn't on github.
        
       | WarOnPrivacy wrote:
       | Some entity called Nullbulge Group claims they took over the
       | repo.
       | 
       | Today's capture (before the repo got 404'd) has their
       | belligerence spiel.
       | https://web.archive.org/web/20240609135118/https://github.co...
       | 
       | This is the capture from 3 days prior:
       | https://web.archive.org/web/20240525021402/https://github.co...
        
         | belladoreai wrote:
         | I have not seen a statement from Nullbulge so it's not
         | appropriate to say that they took over the repo.
         | 
         | The author of the repo is claiming that their repo is hacked,
         | but this is an obvious lie, because their very first GitHub
         | commit is the one where they push the malware. Nobody would
         | hack an empty GitHub account.
         | 
         | I don't know if the author of the repo is lying when they say
         | that Nullbulge is behind the attack (perhaps the author is part
         | of Nullbulge, perhaps not).
        
           | millzlane wrote:
           | I wouldn't be so sure no one would hack an idle account. I
           | had my Spotify account taken before I even used it. I think
           | in my case they used my account to pump up other lesser known
           | artists.
        
             | belladoreai wrote:
             | Okay, sure. But if we have an account which has never had
             | any legitimate activity on it ever - an account that has
             | only ever been used to push malware - then I don't know if
             | it matters much who is the "rightful owner" of the account.
             | Things would be different if the GitHub account had some
             | legitimate activity before the "hack".
        
               | millzlane wrote:
               | I agree it doesn't matter much. Could be a noob mistake
               | by the account owner and this is damage control.
        
             | janoc wrote:
             | There was also an actively exploited XSS vulnerability on
             | Github in the recent days.
             | 
             | Doesn't mean that this guy was not a malicious actor, only
             | that one shouldn't be so quick to cast stones without
             | evidence.
        
               | belladoreai wrote:
               | The person who created the custom node is the same person
               | who "hacked" it. Whether or not the account is
               | technically owned by some unrelated civilian is not
               | important, because there is no other activity on the
               | account.
        
         | zamalek wrote:
         | Must be script kiddies. You have the opportunity to deploy
         | anything to a machine that almost certainly has a powerful GPU,
         | and choose a key logger that exists in signature databases?
         | _Genius._
        
           | Stagnant wrote:
           | Telegram and discord webhooks are 100% signs of an
           | unsophisticated attacker and they are a very common sight in
           | malware samples. Github is full of skiddie "info stealer"
           | projects that use telegram api / discord webhook to deliver
           | the stolen data. They make no sense to use since anybody can
           | spam that webhook endpoint. Not 100% sure about discord, but
           | at least in the case of telegram anybody can even read and
           | download all the data that has been sent to it.
        
           | uyzstvqs wrote:
           | Quick search reveals anti-AI motivated script kiddies. Also
           | some degen NSFW "art" content on DeviantArt and Reddit by the
           | same name, their likely origin.
        
         | 8fingerlouie wrote:
         | Something is fishy here.
         | 
         | According to the original report, the "key logger" was in the
         | custom wheels in the requirements.txt, but looking at that
         | repository there have been only two commits, which according to
         | Reddit both had malicious code in them.
         | 
         | Of course, proper discovery would be easier if the GitHub
         | account still existed.
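To make the point about custom wheels in requirements.txt concrete, here is a minimal sketch of a pre-install check. It assumes the risky entries are the ones that bypass a plain "name==version" lookup: direct URLs, wheel files, or an overridden package index. The patterns and the function name `flag_requirements` are illustrative assumptions, not taken from the original report, and the check would not catch a malicious package published under an ordinary name.

```python
import re

# Heuristic patterns for requirement lines that do not resolve through a plain
# "name==version" lookup on the default index. Illustrative, not exhaustive.
DIRECT_REF = re.compile(r"(https?://|git\+|file:|\.whl\b)", re.IGNORECASE)
INDEX_OVERRIDE = re.compile(
    r"^\s*(-i|--index-url|--extra-index-url|-f|--find-links)\b"
)


def flag_requirements(text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs that deserve a manual look."""
    flagged = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split(" #", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("#"):
            continue  # skip blank lines and full-line comments
        if DIRECT_REF.search(line) or INDEX_OVERRIDE.search(line):
            flagged.append((number, line))
    return flagged


if __name__ == "__main__":
    # Hypothetical file contents; example.invalid is a placeholder host.
    sample = (
        "requests==2.31.0\n"
        "foo @ https://example.invalid/wheels/foo-1.0-py3-none-any.whl\n"
        "--extra-index-url https://example.invalid/simple\n"
        "numpy>=1.24\n"
    )
    for number, line in flag_requirements(sample):
        print(f"line {number}: {line}")
```

A flagged line is only a prompt to look closer; plenty of legitimate projects pin direct wheel URLs.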
        
       | nsingh2 wrote:
       | Not surprised at all, ComfyUI extensions are just arbitrary
       | python code. The first time I tried ComfyUI extensions I put it
       | in a podman container with GPU passthrough and blocked network
       | access.
        
         | smarm52 wrote:
         | Hopefully this will be just the incentive they need to do
         | something safer. Something similar happened before the move
         | from PKL to SAFETENSOR for model files.
        
         | Maxious wrote:
         | Comfy UI manager recently added some security levels so that by
         | default you can't accidentally leave a public instance that
         | allows remotely installing arbitrary python code
         | https://github.com/ltdrdata/ComfyUI-Manager?tab=readme-ov-fi...
        
       | 14 wrote:
       | Is there no way to defend against a keylogger? What can you do if
       | a simple keylogger can steal your passwords?
        
         | millzlane wrote:
         | Use 2FA I'd imagine.
        
         | Latty wrote:
         | Ideally, don't use passwords: Passkeys where supported, SSH
         | Keys, client certificates, social login via a service that does
         | support one of these methods.
         | 
         | Magic link emails can also work, but are potentially vulnerable
         | if you copy/pasted it rather than clicking depending on the
         | keylogger's capability and clipboard visibility, although the
         | window for attack is small, it's a much more sophisticated
         | attack that leaves more traces (good sites will reject reuse).
         | 
         | Second best, also use a second factor: U2F ideally, TOTP with
         | the same caveats as magic link emails, and at the bottom of the
         | barrel SMS which is better than nothing but known to be very
         | flawed.
         | 
         | Honestly, if you are anything other than a casual user, and
         | don't have devices with support baked in already, it's crazy
         | not to spend ~£60 on a pair of security keys for passkey/U2F.
         | It's not a lot of money and is just _so_ much more secure.
        
           | danieldk wrote:
           | _Ideally, don't use passwords: Passkeys where supported, SSH
           | Keys, client certificates, social login via a service that
           | does support one of these methods._
           | 
           | If a process has the privileges to run as a keylogger, it can
           | also grab your local SSH private keys and possibly harvest
           | passwords and passkeys from your local password manager vault
           | [1]. The process has local access and, since it is a key
           | logger, presumably your master password. (The complexity
           | depends a bit on the password manager, e.g. IIRC macOS
           | keychain always requires a roundtrip through the secure
           | enclave).
           | 
           |  _Honestly, if you are anything other than a casual user, and
           | don't have devices with support baked in already, it's crazy
           | not to spend ~£60 on a pair of security keys for
           | passkey/U2F. It's not a lot of money and is just so much more
           | secure._
           | 
           | 100% this. A secure enclave or a hardware key is the only way
           | to keep your key material safe.
           | 
           | Also, app sandboxing should be the default. macOS App Store
           | Apps are sandboxed. Unfortunately, these days the standard is
           | still for applications to have unfettered access to a user's
           | files.
           | 
           | [1] Passkeys can also be on a security key, but e.g. Yubikeys
           | only have a small number of resident key slots and I think
           | _passkeys_ to most people means key material synced through
           | iCloud/1Password/your favorite cloud.
        
         | Retr0id wrote:
         | Aside from not using passwords or using 2FA, sandboxing helps.
         | 
         | A VM with GPU passthrough set up would be one example (although
         | this is usually a pain to set up and I expect most people
         | aren't doing it).
         | 
         | As a more user-friendly example, if you install an iOS app
         | (local-model LLM and image generation apps exist), the
         | sandboxing provided by the OS ought to be more than enough to
         | prevent keyloggers, short of 0day exploits.
        
           | nsingh2 wrote:
           | Not as secure as VMs but GPU passthrough with Docker/Podman
           | is much easier to set up, and you can even use the GPU on the
           | host machine at the same time.
        
             | creata wrote:
             | Are you giving it access to /dev/dri, or doing some fancier
             | sandboxing?
             | 
             | (Would you even need anything fancier? I think /dev/dri is
             | supposed to isolate users.)
        
               | nsingh2 wrote:
               | Nvidia provides a toolkit to do this [1], getting a GPU
               | into a container is as easy as running `podman run
               | --device nvidia.com/gpu=all`. The process is similar for
               | Docker, but rootless Docker requires some extra steps
               | IIRC.
               | 
               | [1] https://docs.nvidia.com/datacenter/cloud-native/container-to...
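The container setup described in this sub-thread can be sketched as a small helper that assembles the `podman run` command line. The `--device nvidia.com/gpu=all` argument comes from the comment above; `--network none` is one way to get the "blocked network access" mentioned earlier in the thread. The image name, host directory, and the function name `build_podman_command` are hypothetical placeholders.

```python
import shlex


def build_podman_command(workdir: str, image: str) -> list[str]:
    """Assemble a `podman run` argv for a GPU-enabled container with no network."""
    return [
        "podman", "run", "--rm",
        "--device", "nvidia.com/gpu=all",  # GPU device, as quoted in the comment
        "--network", "none",               # no network interfaces besides loopback
        "-v", f"{workdir}:/work",          # the only host directory that is shared
        "-w", "/work",                     # start in the shared directory
        image,
    ]


if __name__ == "__main__":
    # Both arguments are placeholders, not real paths or images.
    argv = build_podman_command("/srv/comfyui", "localhost/comfyui-sandbox:latest")
    print(shlex.join(argv))
```

Note that with `--network none` a web UI inside the container is not reachable from the host either, so this shape suits batch-style runs; an interactive setup would need some other way of blocking outbound traffic.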
        
         | ehsankia wrote:
         | "keylogger" may not be the right term here? I'm not familiar
         | with how that term is broadly used, but my definition of
         | that term is a tool that logs your keypresses. Here, it seems
         | like it was scraping your chrome/firefox data for login
         | cookies?
         | 
         | Honestly there's quite a lot of malware that goes after those
         | files, I wonder if there's a way to require high privilege for
         | accessing chrome/firefox appdata, or just block it entirely
         | from other apps.
        
           | stuffoverflow wrote:
           | Yeah you're right, people misuse the term keylogger
           | frequently. This kind of malware is broadly called
           | "stealers" and usually does not involve keylogging.
           | 
           | Actual keyloggers tend to be rare nowadays due to them being
           | easier to detect and the fact that in general the browser
           | data is a more valuable target.
        
         | sureglymop wrote:
         | I mean, anything with root access can very easily use libevdev
         | to get all keystrokes as well as mouse positions. (It's maybe
         | 10 lines of code to do that).
         | 
         | So, don't run stuff as root. If it needs root access, run it in
         | a virtual machine (personally I use qubes os for this).
        
       | 42lux wrote:
       | That discussion on reddit really is something else: so much
       | misinformation and pretend knowledge at work. It's as scary as
       | the malware.
        
         | SoftTalker wrote:
         | And this is the input for AI training.....
        
           | nicce wrote:
           | Not just any input, but paid input :-)
        
       | Ukv wrote:
       | The user's reddit profile: https://archive.is/G5GIW
       | 
       | They have a couple of other tools hosted on HuggingFace, both
       | having the malicious dependencies and both requiring entering API
       | keys, namely:
       | 
       | "SillyTavern Character Generator": https://archive.is/gETq3
       | (requirements.txt: https://archive.is/xqqtA)
       | 
       | "Image Description with Claude Models and GPT-4 Vision":
       | https://archive.is/6Ydgs (requirements.txt:
       | https://archive.is/9Sp5C)
       | 
       | They've also posted some BeamNG mods, and were casting doubt on
       | accusations that some other account's mod contained malware:
       | https://archive.is/zLiaZ
       | 
       | That other account's reddit profile: https://archive.is/r9V1M
        
       | tamimio wrote:
       | Lesson for the people who run and execute stuff without looking
       | at the code first.
        
         | vvpan wrote:
         | Which is everybody in the world except for a handful of people.
        
           | tamimio wrote:
           | Not really, and it takes a few minutes because most of these
           | packages (including npm) are small. You don't have to read
           | the WireGuard codebase because it's reputable enough, but for
           | obscure or unknown add-ons/package code, it's on you to
           | double-check, just like reading the 'readme'.
        
             | froggertoaster wrote:
             | You can't "not really" this away. Most people don't bother
             | looking at small package code, much less code for packages
             | that are far more complex.
        
             | redserk wrote:
             | So just sneak the code in a dependency of a dependency.
             | 
             | Who's diving 3-4 layers deep into dependencies?
        
               | netule wrote:
               | No need to hide it inside dependencies, just modify the
               | code before building and pushing the package to PyPi.
        
             | bdlowery wrote:
             | I haven't looked at the source code of a single npm package
             | I've installed in the past 5 years.
             | 
             | "It takes a few minutes"
             | 
             | Dude my web dev projects have like 1,000s of dependencies.
             | I'm not going to check the source code of every package
             | tailwind requires.
        
               | fbdab103 wrote:
               | Even if you did review it, a motivated attacker is not
               | going to have an exfiltrate_user_data(). The xz backdoor
               | exploit was incredibly sophisticated, and one key of the
               | design was sneaking a "." into a single line of a build
               | test script.
               | 
               | A cursory audit of primary dependencies has almost zero
               | chance of catching anything but a brazen exploit.
        
               | redserk wrote:
               | Yeah. Realistically I think the best course of action is
               | just assume you're already using a library that can
               | exfiltrate data.
               | 
               | This requires allowlisting egress traffic and possibly
               | even architecting things to prevent any one library from
               | seeing too many things. This approach can be a big pain
               | though and could be difficult to implement practically.
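As a rough illustration of what allowlisting egress traffic means, the matching rule can be sketched as below. The hosts in `ALLOWED_SUFFIXES` and the function name `is_allowed` are assumptions chosen for the example. Real enforcement would have to live outside the process (a firewall or an outbound proxy), since a malicious library running in the same process could simply skip a check like this.

```python
# Hypothetical allowlist: hosts the application is expected to contact.
ALLOWED_SUFFIXES = ("pypi.org", "files.pythonhosted.org")


def is_allowed(host: str, allowed: tuple[str, ...] = ALLOWED_SUFFIXES) -> bool:
    """True if host equals an allowed name or is a subdomain of one."""
    host = host.lower().rstrip(".")  # normalise case and a trailing dot
    return any(host == name or host.endswith("." + name) for name in allowed)


if __name__ == "__main__":
    for candidate in ("upload.pypi.org", "evilpypi.org", "api.telegram.org"):
        print(candidate, is_allowed(candidate))
```

Matching on a leading dot rather than a bare suffix is deliberate: it keeps a lookalike such as `evilpypi.org` from passing as `pypi.org`.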
        
             | akira2501 wrote:
             | This is why I refuse to use almost anything on npm. If you
             | have a zero dependency project I'll consider it. If you
             | have a dependency that also has a set of dependencies then
             | I will never use your code.
        
             | gotbeans wrote:
             | Imo this makes no sense. There's zero chance you will start
             | inspecting all dependencies even in a relatively small
             | application, which nowadays could already pull in a large
             | number of deps.
             | 
             | I don't see how doing any of this manually will help.
        
             | genter wrote:
             | Would you have caught the XZ backdoor?
        
           | creata wrote:
           | Most people should only download software from people they
           | trust (to not be evil and also to be competent).
           | 
           | If you download code off some unknown person's GitHub repo,
           | you'd be stupid not to read it very very carefully!
        
         | Geee wrote:
         | Ain't nobody got time for that. LLMs should be capable of
         | analysing code for anything malicious / suspicious.
        
           | yonixw wrote:
           | Since LLMs and keyloggers are Turing machines, it won't
           | happen. (Or more precisely: it won't beat the cat and mouse
           | game of obfuscations.)
        
           | kibwen wrote:
           | Unfortunately, no, because the existence of LLMs that can
           | automatically determine code that is suspicious will be
           | offset by the existence of LLMs that can generate malicious
           | code that bypasses the detection abilities of the
           | aforementioned LLMs.
        
             | CrazyStat wrote:
             | Generative Adversarial LLMs, let's go!
        
               | protosam wrote:
               | Perhaps we could just call these ALLMs (Adversarial Large
               | Language Models). You're already dropping the N in GAN, I
               | see no need for the G.
               | 
               | As an end result I think someone clever could make a
               | LLaMA pun for the name of a LLaMA based ALLM.
        
           | astromaniak wrote:
           | No, they cannot work with a large code base, not yet. And they
           | have very limited talent for logic and debugging. They may
           | improve at some point, and will probably be hooked up to
           | external tools.
        
         | StressedDev wrote:
         | Everyone runs code they have not inspected. For example, almost
         | no one has read all of the code in FreeBSD, Linux (kernel),
         | macOS, OpenBSD, or Windows. I also doubt people are reading
         | all of the code in their favorite Linux distribution.
         | 
         | Even inspecting the code is not enough because a lot of
         | security vulnerabilities are not obvious. Basically, security
         | is hard, and often there are not a lot of good solutions.
         | 
         | Here are some tricks I have found which have helped me minimize
         | my risk:
         | 
         | 1) Use different machines for different purposes. Basically,
         | you should not use 1 PC (or Mac) for everything. I have one for
         | my finances, one for gaming, and a general-purpose PC. If one
         | gets hacked, the others are still fine.
         | 
         | 2) Get software from trustworthy sources. Most of the major
         | software companies are not going to ship malicious code. For
         | open-source software, use software from popular projects which
         | have a good reputation.
         | 
         | 3) Ask yourself why is someone providing this software? Is it
         | for money? Are they creating it because they enjoy it? How do
         | they support themselves? For example, Google's business model
         | is building a dossier on people so it can deliver ads they are
         | more likely to click on. When Google gives you something for
         | "free", they will probably use it to track you, or track
         | visitors to your website.
         | 
         | 4) Support the people who build the software you use. If it's
         | commercial software, pay for it, do not pirate it. If it's open
         | source, donate time or money to the projects you use. Also,
         | thank the people who work on the software, and ALWAYS treat
         | them with respect.
         | 
         | 5) Avoid pirated software, software from "free" porn web sites,
         | etc. People who provide illegal software, or sketchy software
         | are probably willing to put back doors in it.
        
           | creata wrote:
           | > For open-source software, use software from popular
           | projects which have a good reputation.
           | 
           | On this topic, how much should a person trust central
           | repositories of well-known operating system distributions
           | (e.g. Arch, Debian)? I know only trusted people can upload to
           | them, and the only time I've ever heard of malware slipping
           | past them was XZ, but I don't know how much care they take.
        
       | uyzstvqs wrote:
       | I'm curious if it'd be possible to use a Code LLM to scan GitHub
       | repos and detect possible malware hiding in source code.
        
         | ChrisMarshallNY wrote:
         | I have a feeling that we'll be seeing some businesses built
         | around exactly that.
        
           | bdangubic wrote:
           | Github? ;)
        
             | tehlike wrote:
             | Socket.dev is not built around this but makes use of this.
        
         | pingou wrote:
         | I'm afraid a few simple tweaks, especially if the hackers
         | themselves have access to the code LLM to try out their code,
         | will be sufficient to evade detection.
        
           | nicce wrote:
           | Endless race like with Anti-Virus software.
        
         | dmazzoni wrote:
         | If such a tool became commonplace, bad actors would just run it
         | on their own malware and keep tweaking it until the LLM failed
         | to detect it.
        
       | creata wrote:
       | Why does there seem to be such a disregard for security in deep
       | learning?
       | 
       | There's examples like this post, but also, until recently, almost
       | every deep learning model was literally distributed as a pickle
       | file.
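For context on why pickle-distributed models are a concern: unpickling can import names and call them, so loading a file amounts to running it. The sketch below lists the opcodes responsible without ever loading the data, using the standard `pickletools` module. The `Demo` class is a harmless stand-in for a hostile object, and `RISKY` and `risky_opcodes` are names made up for this example.

```python
import pickle
import pickletools

# Opcodes that make the unpickler import a name or call/instantiate something.
RISKY = {"GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ", "NEWOBJ_EX"}


def risky_opcodes(data: bytes) -> set[str]:
    """Report import/call opcodes in a pickle stream without unpickling it."""
    return {op.name for op, _arg, _pos in pickletools.genops(data)} & RISKY


class Demo:
    """Harmless stand-in: unpickling an instance would call print()."""

    def __reduce__(self):
        return (print, ("this would run during pickle.loads",))


if __name__ == "__main__":
    # Plain containers and numbers need no imports or calls to rebuild.
    print(risky_opcodes(pickle.dumps({"weights": [0.1, 0.2]})))
    # Serialising Demo() is safe; only loading the result would trigger the call.
    print(risky_opcodes(pickle.dumps(Demo())))
```

Legitimate model checkpoints also use these opcodes to rebuild tensors, so their presence alone does not separate good files from bad ones, which is part of why a data-only format is the more robust fix.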
        
         | wisemang wrote:
         | "Security is not my field, I'm a stats guy": a qualitative root
         | cause analysis of barriers to adversarial machine learning
         | defenses in industry [0]
         | 
         | [0] https://dl.acm.org/doi/abs/10.5555/3620237.3620448
        
         | chx wrote:
         | It's not specific to deep learning, practically every industry
         | will look at security as a cost just not worth it. When we
         | start throwing the CEO into jail instead of making them pay an
         | 18.5M fine for losing the data of 41 million customers that's
         | when things will change. Until then, it's just the cost of
         | doing business.
        
         | dysoco wrote:
         | From my outsider perspective, it's a field that moves very
         | fast, there seem to be new tools being released every week so:
         | 
         | 1) As the developer if you focus on hardening, you might be too
         | late to release.
         | 
         | 2) People downloading shiny new libs/files/programs constantly.
         | 
         | 3) Influx of people not that versed in the basics of computer
         | security playing around with local LLM models, image
         | generators, etc.
        
           | justinclift wrote:
           | That seems like an almost exact duplicate of the NodeJS/NPM
           | issues?
           | 
           | Those same points (but the NodeJS/NPM version of them) are a
           | lot of why that ecosystem is having security and reputation
           | issues as well.
        
         | sys_64738 wrote:
         | Isn't this just one of the milestones that'll eventually
         | happen? Blind panic due to security always occurs at some
         | point. There must be a 'law' defined for this somewhere.
        
       | Seattle3503 wrote:
       | How do people feel about using docker to prevent this sort of
       | thing? Does it strike the right balance between usability and
       | security?
        
         | belladoreai wrote:
         | Well, Docker is great for this as long as you're not one of the
         | unlucky few whose machine is bricked because of Docker. So,
         | mostly yes, I suppose.
        
           | lyu07282 wrote:
           | What does that even mean?
        
             | belladoreai wrote:
             | "Bricking" is when your electronic device stops working,
             | i.e. becomes a brick. Docker is known to occasionally brick
             | Windows machines.
        
               | jiggawatts wrote:
               | Wait... what!?
               | 
               | This is the first I'm hearing of this. Do you have any
               | references?
        
             | justinclift wrote:
             | Docker itself doesn't seem to have the best quality control
             | for their official releases, so blindly upgrading Docker
             | will likely bite you in the ass if you do it for a few
             | years. :(
        
       | badrunaway wrote:
       | What can be done to stop all this? We need some sort of OS level
       | layer to validate these things. If we put a local LLM which
       | checks the bytecode of things which are getting installed/running
       | for security, will that solve all this? My heart goes out to
       | those who must have lost their money due to this.
        
         | creata wrote:
         | One basic measure (one part of a solution) would be to split
         | Comfy into two parts: the part that does all the work (running
         | plugins, generating images) should have access to nothing but
         | read-only access to the files it needs, the GPU, and a socket
         | to communicate with the other part.
        
           | badrunaway wrote:
           | A cleaner API, you mean, which exposes only what is necessary?
        
             | creata wrote:
             | I meant sandbox the less trusted bit.
        
         | KennyBlanken wrote:
         | Well, for one, the keylogger is detected by antivirus programs.
         | 
         | I keep coming across various projects whose executables trigger
         | antivirus programs, and I think that when those triggers
         | happen, "it's fine, don't worry" claims need to be treated with
         | more skepticism.
         | 
         | At the same time, antivirus vendors need to stop being so lazy
         | and using strings and such that are clearly part of an open
         | source program/library for their signatures.
        
           | badrunaway wrote:
           | I believe there should be a clear indicator in the UI of every
           | OS when any new program listens to your keystrokes. It should
           | be the norm.
        
         | FileSorter wrote:
         | I think this is one of the use cases for a sandboxed WASM
         | plugin system.
        
           | creata wrote:
           | But almost everyone working on these plugins _really_ wants
           | to use Python and PyTorch.
        
             | badrunaway wrote:
             | Nobody ported Python to WASM yet?
        
       | aintnolove wrote:
       | I peered down the ComfyUI rabbit hole [1] and it is shockingly
       | powerful. Did Adobe drop the ball on image generation? What are
       | they doing over there? There has to be a better, more secure way
       | to bundle up all this imagegen logic.
       | 
       | [1] https://learn.thinkdiffusion.com/bria-ai-for-background-remo...
        
         | belladoreai wrote:
         | Yep, it's super powerful.
         | 
         | I would say that the "more secure way" is to just use ComfyUI
         | without installing any obscure nodes from unknown developers.
         | You can do pretty much anything using just the default nodes
         | and the big node packs.
        
       ___________________________________________________________________
       (page generated 2024-06-09 23:01 UTC)