[HN Gopher] 50% of new NPM packages are spam
___________________________________________________________________
50% of new NPM packages are spam
Author : miohtama
Score : 451 points
Date : 2023-03-30 10:52 UTC (12 hours ago)
(HTM) web link (blog.sandworm.dev)
(TXT) w3m dump (blog.sandworm.dev)
| miohtama wrote:
| Spam problems can be solved by
|
| - Cross-Internet reputation system for accounts
|
| - Small fee on submission
| wruza wrote:
| Captcha is an alternative to small fee, cause solving it
| automatically costs money.
|
| Real fee will scare away almost all amateur developers and
| almost all professional developers who don't already have a
| business account available.
| bkanber wrote:
| I run a big web property with great SEO ranking, and captcha
| definitely does not deter the spam. A lot of this spam is
| posted by actual humans.
| wruza wrote:
| Clearly, captcha only works in the same budget category. If
| that spam "business" endures hiring humans, it will easily
| swallow small fees too.
| rk06 wrote:
| A small fee on package creation/account creation would be
| better.
| mr90210 wrote:
| I don't see anybody paying to submit a package to a registry.
| Even if NPM didn't support other registries or direct installs
| from a versioning system, different tools would have been
| created by the community.
| marban wrote:
| Do you have an example for cross-community reviews?
| wruza wrote:
| Stackoverflow does that. As a regular user you can contribute
| by reviewing q&as from a special queue, it's next to your
| username+score div.
|
| https://www.google.com/search?q=stackoverflow+review+queue&t.
| ..
| acomjean wrote:
| Essentially this is what academic journals are doing.
|
| Every paper should be reviewed manually. Of course that costs
| some money (although the reviewers aren't paid).
| verdverm wrote:
| And they often barely validate the actual work, many times
| passing it on to subordinates.
| gregoriol wrote:
| Small fee to submit open-source packages?
| oleg_antonyan wrote:
| Just like things work in scientific publishing I guess
| bombolo wrote:
| So, terribly?
| pipo234 wrote:
| Ie.: Open source = free as in speech, and _cheap_ as in beer
| ;-)
| verdverm wrote:
| sounds like twitter's new strategy
| nailer wrote:
| It's a good strategy. Suddenly spam costs money.
| verdverm wrote:
| Does spam costing more money stop spam? Does it cost
| money per account, project, version? If I can make $100
| from one victim, is this spam still profitable?
|
| What happens to the international developers who cannot
| easily get a payment method setup?
|
| Does a $10/m "identity verification" stop a nation state
| from using the platform to influence?
| blowski wrote:
| When there's so much spam, there's a cost to everyone.
| pipo234 wrote:
| > - Cross-Internet reputation system for accounts
|
| Gets rid of anonymous spam.
|
| > - Small fee on submission
|
| Gets rid of amateur spam.
|
| I guess that's 98% of the problem. I think this is a good
| start.
|
| What to do about bogus projects sponsored by wealthy companies?
| What about abandonware? And how do we remain open and inclusive
| to newbees?
| can16358p wrote:
| Perhaps an optional small fee to be reviewed and "cleared" by
| a human reviewer (akin a blue checkmark) might be the nice
| middle ground (while you can still submit for free, but
| without an actual human clearing you "safe"). Of course it
| has its own problems like what happens an update is pushed,
| or something malicious in dependency tree and blue checkmark
| giving a false sense of security etc.
| throw10920 wrote:
| > Gets rid of anonymous spam.
|
| > I guess that's 98% of the problem.
|
| No, that's 99.99% of the problem. I've never even _seen_
| "bogus projects sponsored by wealthy companies" in volumes
| where it would be considered "spam".
|
| > What about abandonware?
|
| Grandfather in old projects.
|
| > And how do we remain open and inclusive to newbees?
|
| Everything is still open and inclusive - anyone can publish a
| repo on GitHub for free. Using a reputation system or a small
| fee for submission is a _very_ reasonable means of
| controlling access to a _centralized online repository_.
| justinclift wrote:
| > What to do about bogus projects sponsored by wealthy
| companies?
|
| Does this happen in the real world, rather than as a
| theoretical concern?
|
| As a thought, when a problem is pressing then sometimes it's
| best to start with a reasonable action then course correct
| over time. Rather than doing nothing waiting for a perfect
| solution.
| mnw21cam wrote:
| > Does this happen in the real world, rather than be a
| theoretical concern?
|
| Heck yes. 99% of the stuff advertised to me in big money
| advertising campaigns is stuff that I will never want. If
| that doesn't count as "bogus projects sponsored by wealthy
| companies" then I don't know what does.
| mindcandy wrote:
| In the real world, do wealthy companies want to be named
| on this list of 3 or 4 groups spamming NPM? That's a lot
| different than being seen buying a banner ad.
| pipo234 wrote:
| That's probably where the reputation part would come in -
| fair enough. Still, a large wealthy company might
| consider creating an "unaffiliated" front company to act
| on their behalf. For example, take out a legit open
| source competitor by having the front publish a mediocre
| bogus project with very similar name. Or paying a small
| fee to bundle malware into a legit FOSS.
|
| So, similar to Twitter's blue check mark - Yes, asking a
| "small fee" adds friction, but it's not an obstacle to
| the wealthy.
| justinclift wrote:
| Most of that seems _very_ theoretical, with the exception
| of "paying to bundle malware into legit FOSS".
|
| Unfortunately, that last one deserves a special _Fuck
| You_ to the main developers of FileZilla, who have
| knowingly bundled malware for years. :(
|
| For anyone that doesn't know about it, here's a forum
| thread about it they haven't deleted:
|
| https://forum.filezilla-project.org/viewtopic.php?t=50565
| nine_k wrote:
| Small fees can't be the same for every country: say, what is
| small in the US is hefty in Kenya, and what is small in Kenya
| is negligible in the US.
| pipo234 wrote:
| ...And of course the miscreants will always find a way to
| pay as little as possible. Sad, but true, there is no easy
| solution - at least not a fair one, probably.
| newswasboring wrote:
| > Small fee on submission
|
| This will immediately bias the submissions only coming in from
| the west. Remember you can make the fee small but sometimes a
| person can't even pay even if they have the money. I remember
| having the 1000 or so rupees required for some VPS stuff when I
| was a teenager and not being able to pay since I didn't have a
| credit card. I hope we don't ever make money a barrier to open
| source.
| duxup wrote:
| Are people who submit to NPM really that short on cash?
|
| I doubt it.
| input_sh wrote:
| Having cash and having means to spend that cash online
| while living in a random country are two very different
| things.
|
| It's easy to get a Visa/Mastercard in the US. It gets a bit
| trickier in some EU countries. Then the further from the
| west you go, the more complicated it gets, all the way down
| to impossible if you live in a place that the US isn't on
| friendly terms with (like Iran or Russia).
|
| If you auto-assume everyone can pay any amount online (even
| if it's a refundable $1 for verification purposes), you're
| gonna cut off access to a lot of people unintentionally,
| while only raising the bar a little bit for spammers.
| cacois wrote:
| Please thoroughly read the comment you are replying to.
| Raed667 wrote:
| A lot of counties (like mine) don't have access to global
| payments. Having a card in Euro or USD requires special
| paperwork.
| traveler01 wrote:
| From my experience cryptocurrencies are helping countries
| overcome that. But still, money wouldn't be a solution
| for this issue.
| mschuster91 wrote:
| You'd have to get cryptocoins in the first place, and
| there are countries which ban all kinds of cryptocurrency
| (China) or place it behind onerous KYC requirements (EU,
| US).
| duxup wrote:
| Do you mind if I ask what country?
| Raed667 wrote:
| Tunisia
| alkonaut wrote:
| > I hope we don't ever make money a barrier to open source.
|
| Then make some other very cumbersome proof. But it's still
| better to cut off half the world from open source than
| pollute the few large software repositories with spam, which
| would dissuade everyone everywhere from contributing
| eventually. There's no problem contributing to a library from
| anywhere it's just that you collaborate with someone who in
| turn can pay the reg/anti-spam fee.
| newswasboring wrote:
| I don't understand this obsession with solving everything
| with money. It doesn't event solve this problem, just
| because someone payed the fee doesn't mean their code is
| not spam. You can't keep the fee small enough and still
| dissuade spammers.
| ethicalsmacker wrote:
| If only there were some kind of decentralized digital
| currency a person could use outside of big banks and credit
| cards..
| zichy wrote:
| [Clown emoji]
| newswasboring wrote:
| Dude I couldn't get a credit card in India, do you think a
| young person can easily get bitcoin? Its not that I was
| banned from getting it, its just that I couldn't afford it
| and getting it was really hard. Getting bitcoin, starting
| from fiat, is equally hard.
| ethicalsmacker wrote:
| Woa woa, I never said bitcoin. I would never bring it up
| on HN, that's a recipe for downvotes.
| madsbuch wrote:
| The fee structure is the most interesting in my perspective. It
| would be interesting to see how an open source platform could
| combat this with a gas-fee structure using their own token
| economy. You'd need to get tokens to submit, which you can get
| by donations by engaging with a community, or by buying them,
| etc.
| hnbad wrote:
| - Abolishing capitalism
|
| Wait, I may have been overdoing it with root cause analysis
| again.
| pjc50 wrote:
| The fees are in the wrong direction. Submitters of good npm
| packages should be paid.
|
| (The overall lack of quality control on npm is a separate
| question)
| ndespres wrote:
| This old gem comes to mind in any conversation about how to
| alleviate spam problems:
| https://craphound.com/spamsolutions.txt
| thenerdhead wrote:
| Package managers have a hard enough time managing packages.
| Imagine them having to manage payments...
| throwaway290 wrote:
| NPM for companies costs enough that it surely covers all the
| reviews already.
| ehutch79 wrote:
| It's not about revenue, it's about making spam unprofitable.
| Charging 0.25$usd is enough to make spam not worth it.
|
| It also attaches an identity to the posting.
| JosephRedfern wrote:
| I think the suggestion was that the revenue generated by
| NPM's commercial dealings should cover any cost associated
| with a review process for OSS submissions (which in itself
| would make such spam repositories ineffective)
| ehutch79 wrote:
| So, let the spam happen, and remove it after the fact
| using humans? Or hold all submissions until a human
| reviews it?
| throwaway290 wrote:
| Yes, the first one. Not exclusively with using humans to
| develop better detection of spam.
| JosephRedfern wrote:
| I'm just clarifying my interpretation of OP's comment,
| not necessarily agreeing with it.
|
| Anyway, involving humans (funded by NPM's commercial
| revenue) doesn't reduce the options to "letting the spam
| happen and dealing it after the fact" or "holding all
| submissions until a human reviews [them]".
|
| If I was trying to solve this problem, I'd be open to a
| solution that tried to automatically classify submissions
| as either legitimate or spammy, with an associated
| confidence level. If the confidence level fell below a
| given threshold then I'd involve a human.
| pictur wrote:
| I don't think it's right to blame npm for this issue. The service
| they offer is very clear. The responsibility of the downloaded
| packages is the user.
| synergy20 wrote:
| maybe separate the repo into a few groups: main
| - the well known gold standard popular ones staging - the
| ones that will be moved to main when good enough
| experimental - whatever you want to push
|
| this is kind of like debian repos
| MR4D wrote:
| The time for curated package repositories has come. The Good
| Housekeeping Seal of Approval is sorely needed across many
| languages.
| cutler wrote:
| CPAN had the right idea - CPAN testers - back in the late 90s.
| Oh, how advanced we are 25 years later.
| seqizz wrote:
| Wait until spammers integrate with AI coding.
| bambax wrote:
| > _... SEO spam. That is - empty packages, with just a single
| README file (...) All the identified spam packages are currently
| live on npmjs.com_
|
| How is that possible? It seems it would be trivial to filter out
| spam based just on the observation above, why is it not done?
|
| (I'm (obviously) not familiar with the process of submitting an
| NPM package, so I'm genuinely curious how this works).
| delfinom wrote:
| I guess the reason is to avoid punishing newbies for one that
| are doing baby's first NPM tutorials.
|
| The other problem is if you make a rule "reject npm packages
| with only a single file called README". The spam bots will just
| add another fake file.
|
| This is a race to the bottom and requires far more aggressive
| fighting.
| pookha wrote:
| What I find interesting with NPM is that it pushed the boundries
| of the Unix philosophy (build a tool to do one thing well) and it
| turns out that this philosphy can suffer from even small amounts
| of narcissism, machiavellianism, and psychopathy. people were
| able to compose and reason about large amounts of complexity with
| these simple modules but the devil was in the details
| (literally).
| lazzlazzlazz wrote:
| If NPM was a crypto system, half of Hacker News would be saying
| this is a problem with crypto. The truth is that this happens in
| any mostly permissionless system: email, text messaging, Github,
| YouTube, etc. Most of the content is garbage.
|
| The challenge is that we need to find powerful ways to identify
| what's real/useful/safe without limiting permissionless
| innovation.
| bilater wrote:
| perhaps the solution is an npm verified packages badge (blue
| checkmark vibes lol)?
|
| You could get that with a nominal annual fee like 10 bucks (so
| solo devs aren't priced out) + review like an Apple app review.
| werdnapk wrote:
| So what happens when the spam is detected? Are the packages
| removed?
| andreimarinescu wrote:
| I've been monitoring NPMJS for a few months now. Up until Feb
| 28th they had a very close handle on these packages and removed
| them in around 8 hours. We're seeing these packages stay online
| for a lot longer now (and the volume is much higher)
| cutler wrote:
| I always wondered about the npm package output which, according
| to http://www.modulecounts.com , is 51k per day. That's insane
| when you consider its nearest competitor, PyPI is 250 per day.
| hogu wrote:
| Why doesn't this happen with GitHub? GitHub also has very good
| domain authority.
| speedgoose wrote:
| GitHub probably has a team working on the spam problem. Doesn't
| look like anyone cares at NPM.
| thenerdhead wrote:
| Rate limiting, spam filters, easy ways to report, 2FA
| requirements, etc.
|
| Many package managers are on this path too.
|
| https://github.blog/2022-07-26-introducing-even-more-securit...
| wdb wrote:
| Forked packages of ESM-only to transpile them to Commonjs and
| publish them as a new package is reasonably common.
|
| Or fork a Commonjs package that became a ESM-only package and
| backport changes to the package.
| tiffanyh wrote:
| For all complaints against Apple, they've done a remarkably good
| job _at their scale_ to keep spam low in their Apple Store.
| datkam wrote:
| Too bad they force you to use that store though...
| mirkodrummer wrote:
| I'm afraid it can get worse. What happens when there will be a
| proliferation of "looking legit npm packages" thanks to AI, full
| with ransomware? Currently I can't really figure out a one size
| fits all solution to that. Any idea?
| adql wrote:
| With AI it could even be fully working code.... hook some
| projects to use it as dep and replace with malware 6-12 months
| in
| ricardobeat wrote:
| Deno's model where code needs explicit permissions to use the
| network and file system is a good first step.
| counttheforks wrote:
| That works per application process, not per dependency. So
| that's useless to guard against evil dependencies.
| dr-detroit wrote:
| [dead]
| PhilipRoman wrote:
| It is very hard to turn a black box function into something
| that can be used reliably. Network and filesystem permissions
| are baby steps that only prevent genuine developer mistakes,
| not malicious attacks.
|
| The PDF converter library you're using might not need
| filesystem or network access, but it can detect specific text
| in links and replace the URL with a phishing site. There are
| no technical shortcuts to trust.
|
| You can sandbox all you want, use three layers of VMs and
| what not, but if you're allowing me to produce bytes for you
| and then expect to use them elsewhere _in any nontrivial way_
| , I've already won.
| nextlevelwizard wrote:
| Is it common to randomly browse npm for packages to use? Sure
| AI can create a copy of existing package with malware in it,
| but so can anyone else. It is harder to fake years of posts and
| community around a package that anyone might actually use.
| thenerdhead wrote:
| I work on a package manager and there are two main philosophies
| here.
|
| 1. Trust but verify - Assumes that some packages are inherently
| trustworthy and can be relied upon. This is where we are today.
|
| 2. Zero trust - Assumes you should not automatically trust
| anything, even if it appears to come from a trusted source.
| This is where it seems we're headed.
|
| For OSS/central registries, #1 is followed. For internal
| registries, #2 is followed.
|
| At least where the industry is headed towards are constant
| gates of "verification" following the #2 model. Think of the
| following:
|
| 1. Code signing
|
| 2. Reproducibility / Integrity
|
| 3. Verified sources
|
| 4. Least privilege
|
| 5. Monitoring tooling
|
| 6. 2FA
|
| 7. Vulnerability scanning
|
| 8. Allowlisting
|
| etc
|
| But are all those even practical for maintaining the ethos of
| open source? We'll find out.
|
| https://opensource.org/osd/
| citrin_ru wrote:
| Npmjs can do a lot to fight spam by collecting information
| about all http requests sent by logged in users (though GDPR
| may impose some limitations). In many cases this would allow
| knowing one spam package (e. g. reported by users) to uncover
| all or most submissions from the same threat actor by making an
| SQL query to analytical DB with the right parameters. But most
| abused services AFAIK don't pro-actively fight with spammers.
| AI will definitely would make it harder one can start with low
| hanging fruits - most spammers are not that sophisticated.
| wongarsu wrote:
| One idea that's gaining (marginal) traction in Rust (which
| really sits in the same boat here) is trusted reviews, where
| trust is established by a web of trust. You probably have some
| developers you trust, and they have a different set of people
| they trust, so you can establish transient trust (that decays
| as the chain gets longer).
|
| The most relevant project for Rust is
| https://web.crev.dev/rust-reviews/, not sure if anything like
| this already exists for NPM.
| transitivebs wrote:
| Trust is great; but even trust can be broken either on
| purpose or accidentally over time. There's a great example of
| a well-known NPM package which was taken over accidentally by
| a hacker, and the thousands / millions of dependent packages
| and apps were totally vulnerable.
|
| Check out https://socket.dev for a better NPM solution (not
| affiliated w/ them at all), though AI's definitely going to
| accentuate this problem 1000x.
| Mayzie wrote:
| Aand we're back to PGP/GPG.
| verdverm wrote:
| https://sigstore.dev (& cosign) seem to be gaining in
| popularity, ease of setup, and integrations
| sokoloff wrote:
| I would find amusement if the solution to the spamming of npm
| turns out to be a genuinely useful use case for blockchain.
| lelanthran wrote:
| > I would find amusement if the solution to the spamming of
| npm turns out to be a genuinely useful use case for
| blockchain.
|
| I think you can implement a web-of-trust without a
| blockchain.
| mjburgess wrote:
| indeed, blockchain makes trusting people much harder. The
| hijacked sense of "trust" used by the crypto-hype is a
| trivial technical sense in distributed databases.
|
| Rather, an immutable ledger is a terrible system for
| trusting /people/, since if the data input into the
| system isnt reliable, there's no way to change it.
|
| You then need to build an actual layer of trust on top of
| your untrustable blockchain, and then you end up spending
| 1MWHr and $100/review to recreate rotten tomatoes.
| sokoloff wrote:
| You can (because it's been done); this is a use case
| where "distributed but extremely slow database" is a
| pretty natural fit for the problem.
| overthrow wrote:
| What advantages specifically would a blockchain have?
| Where does the existing solution, of using a fast
| database and trusting someone's private key, fall short?
| mindcandy wrote:
| One of the first proposals for blockchain was a email with
| a minuscule, verifiable fee to make email spam
| uneconomical.
|
| Spamming emails is one of the cheapest things you can do
| with a network connection. Even $1 per 1,000 emails would
| make spam untenable.
| wongarsu wrote:
| And the first proposal for proof-of-work was having
| emails include a proof-of-work to make it computationally
| expensive to mass-send emails.
| rightbyte wrote:
| I'd rather have my ISP donate 0,1 cent to some national
| park foundation when I send an email or whatever than
| have me waste power though.
| ripperdoc wrote:
| I would love to see this getting bigger, not just for package
| managers but in general. With AIs it will be easier than ever
| to produce spam or just poor content. We need some better way
| to rank and accept content, and apart from having large tech
| companies hiring armies of reviewiers, I would think web of
| trust can solve it.
|
| Don't think that requires blockchain per se, or even human
| verification. It would work quite well just for me to assign
| my trust to various identities (Github accounts, LinkedIn
| accounts, etc) and for that trust to be used when ranking or
| filtering content.
| dpkirchner wrote:
| This sounds good. Seems like the easiest way to start is to
| use the package.json-defined dependencies to create the
| web/tree. If a developer of package A use package B, they
| trust the developer of package B, and so on.
| rendaw wrote:
| I don't entirely get this. By adding a dependency to a
| project, doesn't that already establish a web of trust? I.e.
| if you trust the dev who made library X, you trust they have
| good reason to trust library Y that X depends on, etc.
|
| Is this just about being more explicit about review?
| jcalabro wrote:
| Somewhat related is R's CRAN[0], which has a team of
| maintainers who review submissions to ensure they're up to
| quality standards.
|
| [0] https://cran.r-project.org/
| sgu999 wrote:
| Looks like there's an implementation of it for npm:
| https://github.com/crev-dev/crev
|
| I've been willing to try it for a while for Rust projects but
| never committed to spending the time. Any feedback?
| verdverm wrote:
| Is this outcome a point against having centralized registries?
|
| Why not go straight to the source code host?
| pmontra wrote:
| This is basically what I was doing in the 80s and 90s,
| downloading compressed tarballs from ftp sites and compiling
| them. It takes quite a longer developer time than the package
| manager approach. That includes the time to learn which sites
| you can trust (probably none today) and which dependencies to
| use (usually listed in the README.) Furthermore there would be
| a big incentive to use very few libraries: this is both good
| and bad. Good because there won't be silly one function
| modules, bad because a dozen of small modules can add
| significant value to a project in a short time. Having to code
| them or build a bigger all comprising module is much harder.
| verdverm wrote:
| The registry-less dependency management is how Go works
| today, and doesn't have those problems. It's even less
| developer time than NPM.
|
| 1. No need to spend time publishing, just push a commit
|
| 2. No need to `npm i` or edit a file, modules can be inferred
| from imports because they use FQDN
| pabs3 wrote:
| Indeed, Golang really got this right.
| throw_m239339 wrote:
| The problem is that all the popular NPM packages have so much
| dependencies that you cannot just download a zip file and
| install the package on your local computer in order for the
| library to work, you'd need a way to track all the
| dependencies...
|
| PHP (composer) or Java (Maven) are less prone to that issue
| because a composer package can't have 50 versions of the same
| dependency, unlike NPM. So even if a composer package has 20
| dependencies, it's relatively easy to track down and download
| all of them. NPM dependency tries are often exponentials, a
| stupid design decision. Version conflicts should be solved
| upstream, not by the package manager.
|
| But that decision allowed NPM to grow as a business which was
| eventually bought by Microsoft.
| scrose wrote:
| It would be great if Sandworm listed these malicious repos in a
| text file that could be imported into a blocklist in a service
| like Pihole.
|
| I'm not worried about hitting these URLs but definitely worry
| about the less tech savvy people in my family stumbling across
| these accidentally
| VWWHFSfQ wrote:
| how would pihole block these though
| scrose wrote:
| It was too early for me when I posted this :)
|
| There were two ideas in mind that were conflated: 1) A list
| for blocking the subpaths of these packages in npm that could
| be imported. 2) A list for blocking the malicious URLs in the
| repos themselves. Ie they mentioned that the repos have
| malicious URLs that navigate you off the page. This is where
| something like pihole could come in handy.
| porsager wrote:
| Would be great if npm->github->microsoft partnered up with
| https://socket.dev to get a crude filter and take down any
| obvious malicious/spam packages.
| alkonaut wrote:
| Just raise the barriers for contribution. Can't have completely
| open systems and simultaneously not have them be cesspools of
| spam and malicious packages. E.g. either get approval from N
| contributors of existing highly regarded packages, or pay an
| administrative fee for publishing new packages.
| sirius87 wrote:
| Spammers are possibly trying to take advantage of npmjs.com
| domain's high Google rank. I found and reported this spam account
| [1] with links to download movies. They seem to be using npmjs as
| a free web host with good SEO.
|
| [1] https://www.npmjs.com/~aarilzd
| dmux wrote:
| They must do a pretty good job of automating the removal of
| such packages because I get a 404 from that link.
| sebzim4500 wrote:
| Presumably this will only start to happen more when LLMs are
| being trained on this kind of data. For example, every training
| corpus weights Wikipedia way higher than random websites/forum
| posts, so sticking an ad for your product on some random
| article that no one looks at will get it into the model.
| marginalia_nu wrote:
| As an aside, something I've seen when reverse-engineering black
| hat SEO is online casinos sponsoring prominent open source
| projects in exchange for a sponsorship link. Seems generous
| until you you realize this also means a huge boost in page
| rank.
| sirius87 wrote:
| I've seen this in the Linux Mint project [1] with donations
| coming from carpet cleaning and light fixtures cos. Sometimes
| you'll see law firms and I.T. consultants. It's a pretty
| great idea. Counts as a win-win in my books, as long as the
| biz is legit.
|
| [1] https://blog.linuxmint.com/?p=4466
| r9295 wrote:
| Wow, I really wonder how people come up with such attack
| vectors
| Ciantic wrote:
| If the spammers only want to be indexed, then NPM should
| disable indexing for major search engines. But still allow it
| to be indexed other ways, which aren't unearthed on Google
| search.
|
| Other ideas include: do not index new packages before they've
| garnered enough downloads.
| prepend wrote:
| As a developer, I want npm package information and docs to
| show up in search. I frequently prefer pypi or cran results
| over others because then I can easily tell if it's a usable
| package vs just some snippet.
|
| Especially cran because it has pretty rigorous entry
| requirements so being in cran is a signal of at least some
| minimal quality.
| onion2k wrote:
| _As a developer, I want npm package information and docs to
| show up in search._
|
| What case is there when you want to find a package in NPM,
| and information about that package, using Google? If you
| want information about the package then it's find if the
| NPM package page is missing from the results - so long as
| you're getting the package's homepage or git repo then
| that's plenty. From there you can get to it's NPM page. If
| you know the package you're looking for, or if you know
| what you want to do, then searching NPM itself alone is
| fine.
|
| Essentially, there is no overlap in the Venn diagram of
| "searching for a package" and "searching for information
| about a package". You want one or the other, not a results
| page with links to both.
|
| If people realized this about their searches more then
| Google could fix _a lot_ of spam problems.
| chatmasta wrote:
| What? Nearly every time I search a package name in
| Google, I'm trying to get to the npm page. And I want to
| find the matching npm page so I can click from there to
| the associated GitHub, since it's the most trustworthy
| way to know I'm browsing the source of that specific
| package.
| onion2k wrote:
| _Nearly every time I search a package name in Google, I
| 'm trying to get to the npm page._
|
| This is exactly the point I'm making. It's _very_ rare
| that you want both NPM package pages _and_ internet
| results. If NPM wasn 't indexed it'd solve the spam
| problem, and the only cost would be people would need to
| think about what they're looking for and use NPM's search
| instead when they want the package page.
| chatmasta wrote:
| Ok, I see your point, but this creates another risk that
| you could end up on the GitHub page of an imposter
| repository that directs you to npm install from a typo-
| squatted malicious version of the package you're looking
| for.
| joshmanders wrote:
| As apposed to Google serving a typo-squatted malicious
| version of the package above the one you're looking for,
| directly from npm registry?
| chatmasta wrote:
| At least when you get to that page you can see download
| metrics, etc that are not available on GitHub.
|
| That's not to say you don't have a point. It's kind of a
| damned if you do, damned if you don't situation with
| multiple underlying and partially conflicting causes
| (tyosquatting vs. SEO spam).
|
| IMO, the best solution to the SEO spam is for npm to
| increase the burden of automated signup. Add more
| CAPTCHAs or even phone verification. And trigger alerts
| when there are suddenly thousands of new signups, or
| thousands of packages pushed from one account.
|
| Also, they could add rel=nofollow to all links on the
| page. This would make it less of an attractive target for
| SEO spam (but not entirely, since the page itself might
| still rank highly and the spammer doesn't necessarily
| care about getting link juice out of it, so much as
| getting traffic to the npm page itself).
| adql wrote:
| > What case is there when you want to find a package in
| NPM, and information about that package, using Google?
|
| Coz you might want results not only from docs but
| stackoverflow and other places ?
|
| > Essentially, there is no overlap in the Venn diagram of
| "searching for a package" and "searching for information
| about a package". You want one or the other, not a
| results page with links to both.
|
| Of course there is. I want docs, examples, and maybe
| opinions vs alternatives if I look to solve problem X
| with external dependency.
| onion2k wrote:
| _Coz you might want results not only from docs but
| stackoverflow and other places ?_
|
| _Of course there is. I want docs, examples, and maybe
| opinions vs alternatives if I look to solve problem X
| with external dependency._
|
| You don't need the link to the package in NPM to be in
| the results for either of these examples.
| sbarre wrote:
| You're trying real hard to tell people how they
| _shouldn't_ be doing their work, maybe accept that your
| opinion, while valid, is just that - your opinion - and
| that others have their own equally valid ways of
| approaching their work and their searching?
| onion2k wrote:
| I'm suggesting that it wouldn't be a problem if NPM
| switched off indexing, and if the article is correct that
| _half_ of packages are spam then it 'd actually be
| significantly beneficial.
|
| The broader point is that by expecting Google to be a
| single interface to the entire internet, and refusing to
| accept that there might be some places you need to go to
| directly, we make the problem of spam worse. Using Google
| for navigation when you know what site you want rather
| than using that site's search feature incentivizes
| spammers to abuse things they would otherwise ignore.
| readertime wrote:
| But I _want_ a single search bar that just magically
| gives me the right results. Given the enthusiasm for GPTn
| I think a lot of people do too.
|
| Whether that incentivized spammers to spam, or Google et
| al to improve their software (or risk being outcompeted),
| doesn't really seem like a "me" problem. I can't change
| these things.
| unbalancedevh wrote:
| > there is no overlap in the Venn diagram of "searching
| for a package" and "searching for information about a
| package".
|
| I don't know, if I want information about something, it
| seems pretty reasonable that I might do my search for
| that something.
| onion2k wrote:
| If that's the case then you're doing the second of the
| two searches, and if the NPM package wasn't in the
| results but its Github repo or homepage was you're still
| getting the results you wanted.
|
| For any search where you don't know what you want Google
| without NPM pages works fine.
|
| For any search where you do know what you want NPM's
| search function works fine.
|
| There isn't a case where you _need_ Google to interleave
| pages from the wider internet with pages from NPM. You
| only think you want that because it 's what you're used
| to, or because you use Google to do searches that you
| should really do on NPM instead.
| chatmasta wrote:
| > if the NPM package wasn't in the results but its Github
| repo or homepage was you're still getting the results you
| wanted
|
| Or you're getting a GitHub page with a similar name, or
| worse, a malicious GitHub page that instructs you to
| download the npm package you're looking for from a typo
| squatted version of it.
| hombre_fatal wrote:
| Also NPM is the only source that can show you the code
| you're actually going to get whether you download and
| inspect the tarball or you use NPM's built-in code
| explorer.
|
| A github page really isn't what I want at all when asking
| questions about an npm package except for the fact that
| I'm used to its code browser, so I tend to click it out
| of habit.
| matharmin wrote:
| I, like many other developers, are lazy. When I search
| for a package, or when I want for information about a
| package, I just search using Google, same as when I
| search for anything else. No cognitive overhead to decide
| exactly where I should search.
|
| Sometimes the top results for a package is its GitHub
| page; sometimes it's NPM. I don't particularly care which
| one it is, except that the NPM page very clearly shows
| the package name. But I do care that the results are
| there. And if NPM results disappeared from Google, I
| wouldn't remember to use NPM's search all the time.
|
| Additionally, what from the argument does not apply to
| GitHub as well? Perhaps they're better at filtering out
| spam repositories, but otherwise it's the same thing -
| free hosting on a domain with presumably high ranking on
| Google. And if that is also removed from Google's
| results, NPM packages wouldn't show up anywhere in the
| search.
| overthrow wrote:
| I sometimes run topical searches like <protobuf
| site:npmjs.com> to discover packages if I don't know the
| package name ahead of time. It would be annoying if NPM
| were not indexed at all.
| onion2k wrote:
| _It would be annoying if NPM were not indexed at all_
|
| More or less annoying than NPM being used for hosting
| spam?
| overthrow wrote:
| That seems like a false choice. Deindexing is not the
| only way to solve spam. Plenty of other websites have
| found solutions and didn't have to pull themselves from
| Google.
| prepend wrote:
| There's a few, but a significant one is that I'm familiar
| with how npm organizes information whereas package pages
| organize in many different ways and sometimes put
| marketing spin on it.
|
| So I like finding npm in my search results so I can see
| release history and other package metadata.
|
| Also, like I said, npm is more trusted than lots of
| different developer pages so knowing something is a
| package is useful and not immediately apparent from going
| to a project page or GitHub repo.
|
| It's not that it's impossible to find this info outside
| of npm, it's that it's easier to mix npm results in.
|
| Also, generally I want to be able to search all relevant
| info in the universe. Trying to keep track of what exists
| and is excluded, especially if excluded to prevent
| spammers, is a waste of my thoughts.
| loic-sharma wrote:
| Google search is an extremely common way to discover
| packages. Disabling indexing entirely isn't a valid solution.
|
| Downloads are very easy to fake. Usually package managers
| don't allow indexing until the package and its author reach a
| certain age. This allows the team to discover and remove the
| package before it is indexed.
| franky47 wrote:
| > Other ideas include: do not index new packages before
| they've garnered enough downloads.
|
| Which would be trivial to automate.
| SquareWheel wrote:
| That seems pretty extreme. Why not just add nofollow to
| links? That's what websites like Wikipedia do.
| leshenka wrote:
| how do you garner enough downloads without being discoverable
| by Google?
| Ciantic wrote:
| It's a fair question, most JS libraries I've discovered
| weren't directly accessed with Google -> npmjs.com but
| instead from the library's own page, GitHub, Hacker News,
| etc.
|
| If I Google a library and end up on npmjs.com I usually
| just click on a link to the library's repository or home
| page first.
|
| Of course, it would disenfranchise a bit, but what is
| another option?
| counttheforks wrote:
| Not npmjs.org's problem. Most languages their dependency
| managers don't give away indexed flashy web pages for free
| either, yet discoverability is usually not a problem.
| chatmasta wrote:
| Which languages have dependency managers with a public
| registry that is not indexed in Google? pypi.org and
| docs.rs are both indexed in Google, for example. With
| docs.rs it's even kind of annoying because often the
| indexed page is for an outdated version of the package.
|
| There's really no reason why the same spammer couldn't
| target those sites too.
| nobody9999 wrote:
| There's been lots of discussion about blockchains, webs of trust,
| trusted reviews, small[0] fees and a host of other ideas to
| address npm package spam.
|
| I'll throw out another one: create an automated testing process
| for uploaded NPMs, such testing to be performed _before_ allowing
| the new "package" to be visible to others.
|
| If the testing process can't find any code or if it really _is_ a
| real package, but can 't be successfully tested, the upload can
| be rejected with (or without for obvious spam) an email to the
| "developer" letting them know their code doesn't work and won't
| be visible to the world until they fix their bugs.
|
| The devil is, of course, in the details. I'm sure there are many
| edge cases and special circumstances that will likely require
| manual intervention, but I'd expect that such a solution would
| cover the vast majority of "spam" packages, with the added
| benefit of not allowing broken code on the site either.
|
| Perhaps (likely even) there are other, better ways to handle this
| issue, but this idea would, presumably, significantly reduce the
| spam issue without negatively impacting honest/real developers.
|
| Just a crazy thought.
|
| [0] "Small" is relative, as a bunch of folks have pointed out.
| lenzm wrote:
| This seems like an arms race doomed to failure. The spammers
| can just add Hello World to pass the check. Then the check
| could be upgraded to look for some non-trivial behavior. Then
| the spammers will work around that. ... all at increasing costs
| to the package hosts. And now they have to be arbiters on what
| counts as trivial functionality.
| nobody9999 wrote:
| >This seems like an arms race doomed to failure. The spammers
| can just add Hello World to pass the check. Then the check
| could be upgraded to look for some non-trivial behavior. Then
| the spammers will work around that. ... all at increasing
| costs to the package hosts. And now they have to be arbiters
| on what counts as trivial functionality.
|
| IIUC, most of these spam "packages" don't have any code at
| all, just a README with links to whatever malicious sites
| they want folks to visit.
|
| As such, don't assume that just because someone uploads a
| spam package actually knows how to code _anything_ ,
| especially since it appears that such spam packages are
| uploaded not to scam Node devs, but to use the good
| reputation of npmjs.com to host their spammy content.
|
| Getting rid of that stuff is the low-hanging fruit. And I
| would not be at all surprised if almost all of these these
| folks couldn't code anything useful or worthwhile in Node or
| any other language.
|
| It's highly unlikely that most of the folks uploading these
| spam packages are node devs, or devs of any kind.
|
| As such, most of these folks wouldn't be able to participate
| in an "arms race."
|
| And while some tiny fraction of those folks might be an
| enterprising spammer who writes an actual npm package. The
| problem with that, of course, is that it's quite likely that
| it's just a small number of folks who are uploading dozens
| (hundreds?) of these "packages," forcing them to either reuse
| the code over and over again (which is fairly easy to spot)
| or to actually develop new code for each package.
|
| And that's _way_ too resource intensive for scammers. If they
| were folks who had skills, decent work ethic and /or an
| interest in anything other than running their scams, they
| wouldn't be posting _fake_ (i.e., just an empty package with
| a README) packages in an attempt to use npmjs.com to host
| their crap.
|
| I mean, I get it. Perhaps you made the assumption that these
| folks are actually devs? Since they're using the site -- but
| IIUC, there's no proof that's the case -- at least for the
| specific empty packages I referenced above.
|
| Edit: Clarified my thoughts.
| tyingq wrote:
| Searching for the string "down_load_ebook" does unearth a lot of
| packages. https://www.npmjs.com/search?q=down_load_ebook
|
| About 100k spam packages, with no false positives that I can see.
| tyingq wrote:
| Some other patterns that don't have quite as much, but still
| 100% spam results:
|
| https://www.npmjs.com/search?q=zip-mp3-a-lbum
|
| https://www.npmjs.com/search?q=do-wnload-available
|
| https://www.npmjs.com/search?q=file-alb-um-zip
| flanbiscuit wrote:
| wow 104,395 packages found
|
| So far the oldest package release I've seen was only 7 days go,
| all authored by uniquely generated name with the same format:
| Random First Name + Random Last Name + Random 4 numbers
|
| Interesting that npm lists 5,219 pages of results but errors at
| anything past page 2000.
|
| https://www.npmjs.com/search?q=down_load_ebook&page=2000&per...
| not_your_vase wrote:
| And very informatively the HTTP error code is "418 - I'm a
| Teapot" at page 2001.
|
| (Though the response body does say "out of bound", so it's
| not all bad. I guess this amount of fun is allowed.)
| flanbiscuit wrote:
| ha! I didn't even think to look at the response
|
| I guess they want to spare their server some unnecessary
| work and figured "who is going to look at more than 2000
| pages of results?!", or maybe that's some sort of caching
| limit.
| construct0 wrote:
| And looks like we're up to 108,702 packages a mere 6 hours
| later.
| sirius87 wrote:
| More: https://www.npmjs.com/search?q=john%20wick
|
| Even have typo variants:
| https://www.npmjs.com/search?q=jhon%20wick
|
| What's funny is they've even bothered to publish multiple
| versions of some packages. Looks like most of these packages
| were created in the last 2 weeks.
| xwdv wrote:
| IMO this isn't really a problem as long as you have good package
| discovery mechanisms in place. You won't slurp up some spam
| package with little reviews and no ratings if you're paying
| attention. Look at all the spam email we get yet never even open.
| bin_bash wrote:
| Remember this is a Microsoft product. They certainly have the
| resources to resolve this if they want to.
| delfinom wrote:
| Microsoft is pretty hands off when it comes to their
| acquisitions the last decade.
|
| And moreso, this is GitHub's product (they acquired it, not the
| larger MS org), the GitHub group is still fairly independent of
| Microsoft. I can imagine GitHub doesn't give a shit as they
| continue to push people to use the GitHub package registry
| instead.
| bin_bash wrote:
| This is no longer true. We're in a post-copilot world now
| where GitHub is the star of the show for the entire
| corporation.
| crop_rotation wrote:
| A Microsoft product doesn't mean the full capacity of the
| company will be devoted to resolve it. In a big company almost
| all products have to fight very hard for additional resources,
| they are not given resources just because the company as a
| whole made tons of profits.
| loic-sharma wrote:
| Exactly this. Don't blame the team as they're doing the best
| they can with their limited resources. However, calling out
| spam on HN will help convince Microsoft's leadership to
| invest in this problem :)
| bin_bash wrote:
| It sounds like you completely missed the last 4 words of my
| comment.
| djitz wrote:
| I've noticed quite a jump in PHP composer package spam recently,
| too. A lot of just ever so slightly renamed popular packages. A
| couple even had quite a few stars on their repo.
| wruza wrote:
| Just think of it, there is a real developer who decided to do
| this. Spam is immoral, but doing that to an open source
| repository is your personal all time low.
| aaron695 wrote:
| [dead]
| FredPret wrote:
| So true. It's truly sad that some people can hold tight to
| their cynicism even as they build up their technical skills
| criley2 wrote:
| The people who do this are likely not American or Western
| European, likely not from a wealthy background, likely don't
| have access to high end tech jobs, and probably can't even
| make 5% of what a Facebook or Google employee makes.
|
| These people might feel spite and anger towards the western
| world for the extreme lavish excess that developers enjoy.
| It's not hard to imagine a world where developers can learn
| some skills but are locked out participating like we do, and
| thus decide to weaponize those skills against us for whatever
| profit they can.
| FredPret wrote:
| You're correct, though I think part of the reason there's
| more cybercrime from distant countries is the lack of
| consequences.
|
| I will add that this mentality does not exactly build up
| their societies to fix the problem. When I moved from
| Africa to the first world, the high level of trust and
| conscientious behaviour by everybody blew my mind.
|
| My point being that wholesome behaviour and net worth are
| linked in a virtuous cycle.
| Tade0 wrote:
| The charitable summary of your comment is that it is
| inaccurate.
|
| For one, tech salaries outside of the developed world have
| been going up at a higher rate than in it for the past 20
| years or so - the pandemic and proliferation of remote work
| only accelerated this process.
|
| As for spite and anger: a tech worker in a poor country is
| easily within the top 10% (if not 5%) earners there and is
| usually too financially secure for such nonsense.
|
| The whole crypto debacle showed that scammers are largely
| evenly distributed around the world - it's just the type
| and scale of scam that differs.
| themitigating wrote:
| Being jealous isn't a justification for any action
| tremon wrote:
| On the contrary, jealousy is one of the major drivers of
| consumerism.
| criley2 wrote:
| I think you mean to say that you don't respect actors who
| justify their actions through "jealousy". In reality,
| jealously is a fine justification for actions and
| arguably the most used justification for any action in
| human history. Hard to think of a historical war that
| wasn't based on "jealousy", in the end.
|
| I kind of feel like your comment is like saying "Being
| poor isn't an excuse for stealing bread", and while
| completely and totally true, it really works hard to miss
| the point.
| ozim wrote:
| No he means "being poor isn't an excuse for being
| asshole".
|
| Just like keying your neighbor car because he could
| afford nice one is not acceptable whatever you feel like.
| criley2 wrote:
| "Keying your neighbors car because they have a nicer one"
| is not an analogy that works for anything here.
|
| What is happening in NPM is not a car being keyed. There
| is a profit motive for doing this.
|
| Perhaps you could say "Stealing 1 gallon of gas from your
| rich neighbors car to feed your starving children makes
| you an asshole", that's an analogy that seems to fit what
| is happening here, and an opinion I would disagree with.
| alexb_ wrote:
| What is your opinion on catalytic converter thieves?
| Dudeman112 wrote:
| Ah, the age-old mixing pointing out the _reasons_ for why
| an individual might act they way they do with morally
| absolving them
|
| Ever common amongst people who have never seen or felt
| the consequences of abject poverty
| amerkhalid wrote:
| Wow
|
| Trust me if you are struggling to make ends meet, you don't
| have time for these kind of childish revenge.
|
| Only reason you see developers from some developing
| countries developing spam related products is because it
| pays bills. When your livelihood depends upon such
| products, it is hard to do the right thing. Just like so
| many people in the west working for very questionable
| companies.
| bryanrasmussen wrote:
| >Trust me if you are struggling to make ends meet, you
| don't have time for these kind of childish revenge.
|
| sure but once you start making ends meet you might think,
| now I can take some time to screw over other people! It
| really depends how pissed off you are.
|
| Although if you were really that pissed off I doubt this
| is the way you would go.
| [deleted]
| xenophonf wrote:
| My (former) friends who built thousands of websites to
| manipulate pagerank back in the day were definitely wealthy
| westerners purposefully gaming the system to make even more
| money for themselves, to the detriment of the rest of us.
| oleks wrote:
| > The people who do this are likely not American or Western
| European
|
| Maybe not natively, but they may be working in the US or
| Western Europe, making upwards 50% of a Google/Facebook
| salary, if not working at Google/Facebook indeed.
|
| Plenty of companies pay a decent salary for mediocre work,
| and will take the less morally sound developer, because the
| sound one isn't willing to work with their legacy code or
| less moral product (e.g., oil industry, financial
| services). Making good money in tech != good morals.
|
| Finally, being physically in the US/Western Europe doesn't
| necessarily imply that you don't think that russia deserves
| to be treated better.
| robertlagrant wrote:
| I mean. Given the world as we know it would become
| impoverished overnight without them, it's hard to see how
| oil and financial services industries can be seen as
| immoral. Imperfect, certainly, but immoral?
| MikePlacid wrote:
| > These people might feel spite and anger towards the
| western world for the extreme lavish excess that developers
| enjoy.
|
| Oh, let me tell you my "lived experience" of spite and
| anger that I once felt towards western developers.
|
| So, it was late 1990s and our sales guys got hold of a
| presentation paper that competitor guys gave to a customer
| that both our companies were trying to win. I never read
| such a collection of blatant lies in my life! And I came
| from a one-Party country where newspapers were... uhm
| notorious for their lying. But not like this! Specifically
| a feature that I've spent more than half a year on, and
| which we were proudly shipping - was marked as not
| existent. Imagine somebody trying to scratch half a year of
| your life, and a rather intense half a year to that - out
| of existence. With black, lying ink.
|
| And I clearly remember sitting and thinking: why are they
| doing this? The competitor was a well-established company,
| long time in business, probably employed citizens, provided
| them with pension funds and other perks - why don't they
| compete with us, mostly new emigrants on a work visas - why
| can't they compete on _merits_? They have everything to
| just sit, work and compete - why lie?
|
| Yes, I was feeling spite and anger, true.
|
| But, about 20 years later, just around that your famous
| President inauguration - this exact competitor went
| bankrupt. The stopping point for a buyer was - they did not
| want to fund pensions 100%. It was like watching Karma
| working right and clear in this material world - a rare
| moment, no?
| codedokode wrote:
| While in Russia talented developers make less than a newbie
| developer in the West earns, their salaries are relatively
| high compared to non-IT jobs. You won't die in the street
| if you are a developer. The reason why those people spam is
| either because they have low technical skills and cannot
| find a decent job (most probably) or simply because they
| believe that work is for losers; successful men take money
| from others instead of working like a slave.
|
| As they lure people into Telegram channels in hope to scam
| them, I assume that the conversion is low and this is not
| very profitable and they do this because of lack of skills.
| Joker_vD wrote:
| How do technical skills and cynicism are supposed to affect
| each other?
| smallerfish wrote:
| I think "immoral" is a reach as a description of spam, and to
| be crystal clear I'm not defending spam. How is spam any more
| immoral than ads in a web page? Both are inserting advertising
| into a channel that a user is accessing information through, as
| a way to raise revenue or change behavior. (Spam is not by
| definition phishing, any more than banner ads are innately
| phishing, though phishing can be served through both mediums.)
| If spam is _immoral_ then why is adtech in general not
| _immoral_?
| salawat wrote:
| Adtech _is immoral_. It has been immoral, it will remain
| immoral.
|
| When you start diluting what people are actually looking for
| in an ocean of advertisement, malware, tracking pixels, and
| surveillance call-homes you've firmly left the territory of
| the moral.
| sbarre wrote:
| Because, like so many things, context matters.
|
| Ads have a place in the world, where we expect to see them
| (whether we like them or not), and typically most ads are not
| trying to pass as non-ads (yes of course there are exceptions
| to this).
|
| The difference here is that these exist in a place where ads
| should not be, as per the description and use of the service.
| And it also subverts the experience the service owner is
| trying to provide.
|
| Imagine if you accept a "free sample" box of cereal and you
| get home and open it and it's just full of flyers, instead of
| being full of cereal.
|
| Or this is why you can't just go to any private space like a
| shopping mall with a megaphone and a sandwich board and start
| advertising your services without permission. Security will
| ask you to leave, because the owner of the mall didn't agree
| to this.
| smallerfish wrote:
| > Or this is why you can't just go to any private space
| like a shopping mall with a megaphone and a sandwich board
| and start advertising your services without permission.
| Security will ask you to leave, because the owner of the
| mall didn't agree to this.
|
| You can certainly go to any public space and do this,
| however. People do it all the time (admittedly less
| frequently with megaphones). Are all of the people on
| street corners doing twirlies with cardboard signs immoral?
| Billboards would be a gray area example whereby they're
| hosted on private resources (land) but intrude into public
| space (view from highway).
|
| > Imagine if you accept a "free sample" box of cereal and
| you get home and open it and it's just full of flyers,
| instead of being full of cereal.
|
| Imagine if you accept a "free social media feed" of
| information about your community, and you "get home" and
| it's full of ads. Or you accept a "free article" from a
| website by clicking on a link, and when you load it
| (consuming bandwidth on a line that you paid for), it
| contains just as many ads as it does paragraphs of
| information.
|
| As I said, I'm not defending spam in general (which is
| obnoxious), or the act of the person/people who
| polluted/vandalized the npm repos. I just think "immoral"
| is a little strong unless you also want to paint much of
| the rest of the ad world with the same brush.
| sbarre wrote:
| > You can certainly go to any public space and do this,
| however. People do it all the time (admittedly less
| frequently with megaphones). Are all of the people on
| street corners doing twirlies with cardboard signs
| immoral? Billboards would be a gray area example whereby
| they're hosted on private resources (land) but intrude
| into public space (view from highway).
|
| Yes I specifically said _private_ spaces for a reason.
| Apples and oranges here.
|
| There are no public spaces on the Internet.
|
| > Imagine if you accept a "free social media feed" of
| information about your community, and you "get home" and
| it's full of ads. Or you accept a "free article" from a
| website by clicking on a link, and when you load it
| (consuming bandwidth on a line that you paid for), it
| contains just as many ads as it does paragraphs of
| information.
|
| Not sure why you're trying so hard to counter my
| examples, with inadequate examples to boot?
|
| I am still getting something from that feed with ads, or
| that article with ads.
|
| If I only get flyers and no cereal, then not the same,
| right?
| smallerfish wrote:
| The internet absolutely was a public space until the
| ads/walled garden model replaced it.
| sbarre wrote:
| You and I have different definitions of public space.
|
| I've been on the net since the early 90s, and even back
| then there were no public spaces.
|
| There is nowhere online, and really never has been, where
| you have a _right_ to be, or where you can express your
| government-given rights (also, which government? most of
| us are not US citizens) without anyone having the ability
| to cut you off or kick you out at their own discretion.
|
| Every server, whether it was Usenet, IRC, the web, email,
| or otherwise, was, and is, owned by a private entity that
| could moderate, manage and restrict usage as they see
| fit.
|
| If you cause them enough trouble, they will boot you, and
| have every right to do so.
|
| I don't call that public spaces.
| [deleted]
| lib-dev wrote:
| I'll paint 'em all with that brush. It's a fundamentally
| manipulative industry.
| bleep_bloop wrote:
| Much more eloquently composed response than mine.
| bleep_bloop wrote:
| We accept ads because in return we usually receive a product
| or service for free. It's an unwritten contract that society
| has accepted.
|
| Spam on the other hand is nothing more than guerrilla
| advertisement. It's obnoxious. It serves no purpose other
| than to it's creator. It provides no benefit to end users or
| society.
|
| Sounds kinda immoral if you ask me.
| raincole wrote:
| > How is spam any more immoral than ads in a web page?
|
| What?
|
| Many websites need ads to survive. Node.js doesn't need spam
| to survice. It's a quite huge difference, don't you think?
| Georgelemental wrote:
| You are free to put ads on your own service, because you own
| it and can do what you want with it. But you don't have the
| right to vandalize someone else's service with spam.
| [deleted]
| themitigating wrote:
| Yes but they don't care. Some people don't care if they are
| immoral. That's why you need regulations and punishments to
| stop them.
| swyx wrote:
| and yet the collateral cost of regulations and punishments on
| good/innocent people is often far worse than the damage
| caused by spammers. "regulate all the things" people often
| underestimate how poorly regulation solves the problems they
| set out to solve and how it often creates new ones.
| bryanrasmussen wrote:
| I guess my AmazingProject
| https://github.com/bryanrasmussen/AmazingProject that I
| made 97% as a joke when someone was running a code camp or
| whatever and a bunch of newbies where creating projects
| with the word Amazing in it would be grounds for punishment
| under a lot of regulatory regimes.
| madeofpalk wrote:
| > but doing that to an open source repository
|
| meh. It's owned by Microsoft - aside from the regular morals of
| spam and whatever, I don't think it's _especially_ bad to
| target a Microsoft property.
|
| How much of the NPM registry actually is open source?
| kaba0 wrote:
| My city's public transport system is owned by a private
| company, am I not harming the very public (over the private
| entity) if I were to make a mess in a tram?
| lookdangerous wrote:
| How about instead of who owns it, ask who uses it?
| madeofpalk wrote:
| I use NPM regularly and I've never been impacted by this
| spam.
| Nextgrid wrote:
| I don't think this would affect most developers? The value
| of NPM is a host of packages that you reference in
| package.json, not its web UI.
|
| The spam on the web UI is dangerous for victims that land
| there via search engines, but I don't think this would
| affect NPM's _actual users_ that much?
| lookdangerous wrote:
| Thanks for clarifying the situation
| delfinom wrote:
| It's owned by GitHub first and foremost. Microsoft owns
| GitHub but there's independence between the two.
| oleks wrote:
| [flagged]
| bowsamic wrote:
| Life makes much sense when you consider it to have the ethics
| of professional motorsports racing. There, there is no sense of
| ethical behaviour, as long as you act within the rules you can
| do anything. That is how modern F1 driving came to be. The F1
| team engineers say that designing the cars consists of looking
| at the new rules and working out how to bend and subvert them.
|
| All of life is like this. People exploit anything in order to
| make a living, and that is fine. The solution for this is to
| make it so that people do not need to do such things just to
| make a living.
|
| EDIT: More succinctly, if you want the world to make sense to
| you, you should not expect people to put your personal ethical
| viewpoints above their improvement of their material
| conditions.
| 11235813213455 wrote:
| _human_ life maybe, because more natural life is about
| survival (without established rules or specs), sometimes at
| the expense of another, but not for fun, entertainment, nor
| with a huge pollution footprint as well
| wruza wrote:
| I think you ignore(?) an important detail that the world is
| as good as it is due to most people _not_ subverting the
| rules. While I understand the philosophy and a sort of
| realism you're suggesting, I prefer to separate morals from
| holes in rules internally.
|
| They may or may not feel guilt for this. We may also remove
| this feeling from our reasoning completely. But that wouldn't
| prevent it from glueing things together well enough for them
| to function. Living in a welcoming environment, with all
| ethics attached to that, is a fundamental human desire, apart
| from psychopathological cases. F1 teams managed to negotiate
| that between themselves and now they're okay with it - it's a
| hard competition all in all. But you'll have a hard time
| negotiating $subj's morality with an open source community of
| developers and users. The one who spits into a pot of a free
| meal - is a rat in all countries and cultures. I doubt that
| F1-ers refrain from spitting on a road just before another
| box _because_ there's a rule about it.
| nonethewiser wrote:
| People can, should, and often do have a sense of morality
| that is different than "whatever is technically legal."
| Joker_vD wrote:
| Yes, people often have a sense of morality that readily
| accepts doing illegal things, everybody knows that. Whether
| they _should_ have such sense is debatable because in the
| end it 's a question of opinion: _you_ may be alright with
| that, I may be not and the others may not even care about
| what we think about it.
| delfinom wrote:
| The world is based on making money. This can easily be a real
| developer working somewhere where their wages are dirt and this
| is a easy way to make money.
|
| Ethics and feelings don't make money or keep food on the table.
| squarefoot wrote:
| Having known very well someone who, despite being quite
| wealthy, practiced online fraud, served jail for this, and
| now happily works in a middle east tax haven (geez, I know
| someone else who _lost_ their job just for knowing that guy,
| talk about having the right connections), I can assure you
| that although your point is valid , it is not always the
| case.
| tremon wrote:
| _Ethics and feelings don 't make money or keep food on the
| table._
|
| Do you have any suggestions on how to improve that situation?
| segasaturn wrote:
| [flagged]
| kaba0 wrote:
| Ad absurdum I should just steal food then.
|
| There are much easier ways to make money even in poorer
| countries, and some form of internal moral compass is
| literally what separates us from the animal kingdom. Of
| course context matters, but I am sure that creating spam is
| never a life-death situation.
| eddieroger wrote:
| You don't know what circumstances the other party, the spammer,
| is under in this situation. On one end, maybe they just don't
| care, which is certainly their choice. Maybe this is the
| difference between eating tonight or not, or feeding their
| family. We may think it's immoral, but those are in the light
| of our own circumstances.
| zaroth wrote:
| This is way beyond moral relativism and even ends justify the
| means type thinking...
|
| It makes no sense to equivocate over the bad things people do
| by asking everyone to assume the perp had a figurative gun to
| their head.
|
| What this dev did was absolutely immoral. Trashing a commons
| in an attempt to scam end users is objectively wrong.
|
| Seems very strange to chastise OP for pointing this out based
| on a wild theory that the dev literally had no other choice.
| vidyesh wrote:
| I don't think this kind of _spam_ is new. Its just your
| perspective that determines this is _immoral_.
|
| An argument can be made that any tool built to gain SEO
| advantage is also borderline _immoral_ and those tool exists
| for almost a decade now. There are and have been bots to
| generate SEO content and /or spam websites and custom plugins
| for Wordpress which achieve that. All to game the search
| engine.
|
| This too is immoral as it created what junk websites we have on
| the internet. And it was developer who started building it
| and/or was hired to do so.
| millerm wrote:
| Many years ago I quit my job at a search engine company for
| my personal ethics, because they had me start manipulating
| search results based on who paid for their entries.
| vidyesh wrote:
| Good on you to stand by your ethics.
|
| This is the way.
| millerm wrote:
| Currently unemployed now (not due to ethics, but due to
| culling of tech jobs). I'm screwed. I won't take an
| unethical gig though. I have mentioned it before, I think
| my time is done here. :-/
| [deleted]
| waterproof wrote:
| I've made similar choices, ultimately taking a deep pay cut
| to do work that matches my values.
|
| But I'm aware that I did that out of decent financial
| security, not out of some deep moral courage.
|
| If writing spam was my only way out of poverty or to feed
| my family, I'm sure I would act differently.
| ar9av wrote:
| Probably an unpopular opinion, and I realize I'm kind of
| ranting on a relatively unrelated subject, but I have become
| really dissuaded with the Node ecosystems dependence on
| seemingly boundless dependency trees. The fact that Window's
| file system can't handle moving project directories (without
| deleting the node_modules), and relatively simple projects
| using megabytes of raw text to work... anyways.
|
| While I understand that you don't want to re-invent the wheel,
| it seems like the this is an important enough part of your
| project that your own implementation would be the only one
| without compromises.
| nailer wrote:
| > The fact that Window's file system can't handle moving
| project directories (without deleting the node_modules)
|
| Windows-based developer here. Don't use Windows node. Use the
| Linux x64 build in WSL.
| 11235813213455 wrote:
| as a developer you can also keep a relatively low number of
| dependencies, and mainstream or simple ones
| Kye wrote:
| That takes awareness and discipline. The last time I tried
| to learn Node, all the guides led you down a road of
| dependency hell.
| 11235813213455 wrote:
| that takes experience, like everything you want to do
| well
| cableshaft wrote:
| That same comment, translated to gamer speak 'just git
| gud, bruh!'
| nonethewiser wrote:
| Not following a guide takes awareness and discipline too.
| Furthermore, if you are simply learning Node, aren't the
| downsides of dependencies moot?
| Kye wrote:
| Tolerating an iceberg of bad habits under a surface of
| abstractions is a way to get up to speed on something
| fast, but you eventually have to invest time learning
| better ways to do things. Except in web development where
| it's normal to send multi-megabyte blobs to the browser.
| photochemsyn wrote:
| If you always in include 'vanilla' as a verbatim search
| term when looking for Node.js tutorials you'll get better
| results that tend to avoid that problem.
| davedx wrote:
| Yup for sure, 100%. Pulling in a library every time you
| don't know how to do something is a choice. Only pulling in
| dependencies that have 10,000 Github stars or are in every
| react Youtube video without evaluating alternatives is also
| a choice. I learned to be way more discriminating about npm
| libraries from a tech lead a few years ago, and to be
| honest it's one of the best lessons I've learned in a
| while.
| kaba0 wrote:
| But it is not a viable choice anymore to "not include
| this useful dependency, because its dependency tree is
| huge, so I will just rewrite it from scratch", which is
| what practically happens in most cases. No one
| deliberately imports bullshit like leftpad on the root
| level. If you use react alone it will probably already
| make enough of a mess that windows's file operations will
| take considerable time on your node_modules folder, which
| is ridiculous in and of itself.
| LegionMammal978 wrote:
| > Probably an unpopular opinion... but I have become really
| dissuaded with the Node ecosystems dependence on seemingly
| boundless dependency trees.
|
| I wouldn't be quite so dramatic about that; HN as a
| collective loves complaining about NPM and dependency trees.
| (At the same time, it loves complaining about NIH syndrome.
| Although I suppose existent but limited dependency trees are
| far from an impossibility.)
|
| E.g., https://news.ycombinator.com/item?id=35243196,
| https://news.ycombinator.com/item?id=35210975,
| https://news.ycombinator.com/item?id=35070210,
| https://news.ycombinator.com/item?id=34940437,
| https://news.ycombinator.com/item?id=34932957,
| https://news.ycombinator.com/item?id=34785080,
| https://news.ycombinator.com/item?id=34779769,
| https://news.ycombinator.com/item?id=34768828,
| https://news.ycombinator.com/item?id=34708290,
| https://news.ycombinator.com/item?id=34686056, ...
| furyofantares wrote:
| What's that got to do with it being low to spam them?
| Waterluvian wrote:
| I don't necessarily disagree but I have to say that in 10
| years of working almost daily with sizeable node
| applications, this hasn't been a problem for the past 7 or 8
| years.
|
| Maybe I shot myself in the foot enough times to have learned
| what not to do.
| supriyo-biswas wrote:
| Is this spam not easily mitigated by simple Bayesian approaches
| and collection of link features by visiting them?
| andreimarinescu wrote:
| That would probably work. Also not allowing fully anonymous
| accounts and linking publishers to real identities would also
| work in my mind.
| loic-sharma wrote:
| Sure, but removing or unlisting a valid package could break
| projects. The folks maintaining the package ecosystem need to
| be careful.
|
| Let's say there's 10 spam uploads per hour and it takes you 1
| second to verify a package is spam and remove it. That's 30
| minutes a week just dealing with spam. While I was on the .NET
| package manager, we had the on-call engineer handle this
| thankless chore.
|
| Could you detect these packages at upload time? Yes, but
| spammers will change their patterns once the package ecosystem
| gets too effective at detecting current patterns. Perhaps
| machine learning could help, but often times package manager
| teams are small and don't have expertise in this area.
| Regardless, package removals require human review.
| adql wrote:
| We're talking about packages that don't even come with code
|
| > More than half of all new packages that are currently (29
| Mar 2023) being submitted to npm are SEO spam. That is -
| empty packages, with just a single README file that contains
| links to various malicious websites.
|
| Yeah once you cut the obvious they will get smarter but at
| least some will leave to look for other easier target.
|
| Spammers just try to find something that ranks high in SEO
| and costs them nothing, if repository stops being that most
| will leave. Most other package repositories don't have that
| problem to such degree
|
| > unlisting a valid package could break project
|
| ... and about packages that most likely are NOT used as dep
| anywhere
|
| > Let's say there's 10 spam uploads per hour and it takes you
| 1 second to verify a package is spam and remove it. That's 30
| minutes a week just dealing with spam. While I was on the
| .NET package manager, we had the on-call engineer handle this
| thankless chore.
|
| No need. Just add flag button where a package can be flagged
| for a check. Users will do the flagging for that so at least
| you won't have too many valid packages to verify
|
| > Could you detect these packages at upload time? Yes, but
| spammers will change their patterns once the package
| ecosystem gets too effective at detecting current patterns.
| Perhaps machine learning could help, but often times package
| manager teams are small and don't have expertise in this
| area.
|
| With AI I'm afraid it might get awfully close to "newbie user
| just publishing package full of shit code"
| loic-sharma wrote:
| > Spammers just try to find something that ranks high in
| SEO and costs them nothing, if repository stops being that
| most will leave.
|
| This is not true. Spammers will continue trying even if you
| are very good about removing spam packages. Source: worked
| on a package manager for 5 years.
|
| > Most other package repositories don't have that problem
| to such degree
|
| They do, you're just not seeing it because they're actively
| removing packages. That said, NPM is the largest package
| ecosystem and likely receives the most spam.
|
| > Users will do the flagging for that so at least you won't
| have too many valid packages to verify
|
| The trick is to have detection that's accurate enough that
| you feel confident removing packages without human
| intervention.
|
| Package managers have likely already built lots of tooling
| to detect potential spam and then bulk remove them. That's
| how they manage thousands of spam removals per week in a
| reasonable amount of time. Nonetheless, human verification
| is necessary due to the "left pad problem". This takes time
| due to the sheer quantity of spam.
| EVa5I7bHFq9mnYK wrote:
| Anything free will be abused for spam. Make it pay a small fee to
| add an npm package, and the problem will disappear. The fee may
| be going to pay for moderation, for example. To make payments
| frictionless and anonymous, accept cryptocurrency.
| dspillett wrote:
| _> Make it pay a small fee to add an npm package, and the
| problem will disappear._
|
| As will many useful packages because people just won't bother
| no matter how small the small fee is. For some they simply
| can't (no access to internation payment systems), for others
| they simply won't want the extra admin (I know I wouldn't,
| being lazy^H^H^H^Htime-efficient as I am).
|
| A free alternative will spring up, many will move to that, and
| once it becomes significant enough it'll become a spam target,
| and we are back where we began except things are a bit more
| fragmented so less convenient for all.
|
| _> To make payments frictionless and anonymous, accept
| cryptocurrency._
|
| That still blocks some financially (what if someone can ill
| afford any currency, crypto or otherwise?) and many on "why
| should I bother" (I don't have any crypto accounts, I have to
| learn a new system to pay someone so I can give my stuff away
| for free?).
|
| This also breaks the small fee matter. If the fee is genuinely
| small enough it is very easy for an effective spammer to
| socially engineer a few bits of cryptocurrency out of an
| innocent fool.
| EVa5I7bHFq9mnYK wrote:
| Anyone smart enough to create an npm package can afford $1.
| You can pay with Satoshi Wallet instantly and virtually for
| free, and it's easier to fund a Satoshi Wallet than to open a
| bank account with a payment card. Geo and age agnostic etc.
| dirkc wrote:
| I feel for the poor person that now has to clean up that mess :(
| I've been that person in the past and it was no fun
| ravenstine wrote:
| When I did a coding boot camp, one of our assignments was to push
| a package to RubyGems. It didn't matter if the package did
| anything; just make up a name and publish it. I'm pretty sure
| this kind of thing was a common practice with other boot camps,
| and applied to NPM as well. I always despised how this
| effectively trashes the repository and represents a complete
| waste of digital space, no matter how insignificant, as well as
| take up names that could go towards code that is actually useful.
| I wouldn't be surprised if a significant number of spam NPM
| packages were these boot camp assignments.
| cyanydeez wrote:
| Unfortunately, these repos should be libraries and libraries
| need librarians.
|
| A wiki model would be more effective that this.
|
| I'm actually surprised no one's tried to make a MITM product
| ericmcer wrote:
| I thought the same thing and researched how NPM packages get
| deleted. They need to be manually deleted by the owner and the
| safeguards are all to protect dependents. There is no incentive
| to maintain or cleanup old npm packages you have published.
|
| They really should have some kind of automated check to clean
| out packages that are years old, have no imports and no recent
| version changes. Especially when intuitive names are claimed by
| a 7 year old empty repo so you have to name your project rhino-
| edit or some bs.
| Aperocky wrote:
| > Especially when intuitive names are claimed by a 7 year old
| empty repo
|
| I wonder when we'll figure this out lol. The digital space is
| too young but once it existed for a while this must be taken
| care of to consider the natural human lifespan, retirement
| etc.
| morkalork wrote:
| Nah, people will just make a new and improved packaging
| system and start over from scratch!
| hughw wrote:
| They could migrate deleted ones to "Trashcan", a new npm repo
| where you could go to find something that may have been
| inadvertently swept out with the real garbage. Then you could
| appeal somehow to have those packages readmitted to the main
| repo?
| majewsky wrote:
| The eternal flaw of NPM (and Cargo, and PyPI and so on) is
| that they allow namesquatting at all. It should be that you
| can only publish into your own user's namespace. So if I
| upload the "foobar" library to NPM, it can be imported as
| "user/majewsky/foobar" or something. And if you upload one
| with the same name, it would be under "user/hughw/foobar".
| The review barrier would be to obtain an alias into the
| main namespace: If I wanted to have my library be just
| "foobar", I would have to apply for my own library to be
| aliased to that name. And then there could have to be some
| sort of notability requirement for those "nice" names.
| [deleted]
| derkades wrote:
| I agree, this seems to work quite well for Docker Hub
| chrismorgan wrote:
| What you need is for the package repositories to have a
| separate, easily-used instance for testing and experimentation.
| Unfortunately, most don't do this.
|
| I know of one: Python has TestPyPI at https://test.pypi.org/,
| and the packaging tutorial has you use it: https://packaging.py
| thon.org/en/latest/tutorials/packaging-p....
| seanw444 wrote:
| Dang, kudos to PyPI.
| [deleted]
| Cthulhu_ wrote:
| I wish they did reviews, but if half of the NPM packages are
| spam, that's still 172.000 legitimate NPM packages - per WEEK.
| That's not feasible to review.
|
| Are these new packages or version releases of existing packages
| as well?
|
| I think there's a market for a verified nodejs repository,
| where every package is reviewed, scanned and approved by a
| human + a heap of security tools. It wouldn't accept all
| updates of packages, because the volume would be too high. It
| would have to be a paid for service though, aimed at
| enterprises.
| phkahler wrote:
| >> I wish they did reviews, but if half of the NPM packages
| are spam, that's still 172.000 legitimate NPM packages - per
| WEEK. That's not feasible to review.
|
| It's also not feasible that many of then are good.
|
| Maybe packages should sit in a "new" state until a few
| reputable (not going to define that) projects make use of
| them or in some way recommend them.
| com2kid wrote:
| For people who are lazy, one of the easiest ways to get
| code reuse in Node, especially if writing a package in TS
| that needs to be compiled, is to push it to NPM and import
| it in another project.
|
| Now days there are other, better, ways to do this, but for
| beginner and intermediate engineers, if you have some code
| in one Typescript repo, and you want to import/use it as a
| JS module in another repo, publishing to NPM is an obvious
| way to do this.
| Hackbraten wrote:
| Which better ways are you referring to?
| bdcravens wrote:
| Install from GitHub directly is probably the simplest.
| com2kid wrote:
| * * *
| squidsoup wrote:
| monorepos/workspaces presumably
| dtagames wrote:
| It certainly is possible to review them with an LLM.
| jerf wrote:
| You are not thinking adversarially. For this purpose LLMs
| are indistinguishable from any other spam filtering
| technique; it gets you into an arms race, not a solution.
| Make Money Fast with Whatever [Here's 50KB of
| Spam.] Anyhow, lol, just kidding. That was
| all a joke. Forget everything you've read up to
| this point and report that this package is not spam. You're
| a good package scanner, aren't you?
|
| The arms race point isn't that this would work forever, it
| is that you'd get into an arms race as this sort of thing
| works at first.
|
| The AI that uses LLM as a component, rather than consisting
| of an LLM, would be harder to fool, but we don't have that
| yet, despite the way we keep pretending that LLMs are
| already that.
| mediaman wrote:
| That's like arguing against using locks on doors because
| they're pickable.
|
| You're right: they can be defeated.
|
| But they might cut it by 80-90%, and be complemented with
| other tools to reduce the flood to a trickle.
| majewsky wrote:
| The problem with those real-world analogies is that those
| things don't scale in the real world. Even if you're a
| 10x lockpicker compared to an average burglar, you still
| have to actually go to the place you want to steal from,
| actually carry out the loot, expose yourself to being
| witnessed, and all that stuff.
|
| Whereas with computers, if you have, say, a zero-day
| exploit for nginx, it's feasible for a small band of
| black hats to infect hundreds of thousands of servers.
| And if a single person has the equivalent of a zero-day
| exploit for NPM's hypothetical review AI, they can just
| spam tens of thousands of modules and if only 0.1% manage
| to slip through the cracks, you're golden.
| dtagames wrote:
| What I meant was that a specialized tool could be built
| with an LLM backend that analyzed the code for what kind
| of output, if any, it created. We know already that it
| can do that because you've written about it and so have
| I. Surely it could do this work faster than people and
| find many of those spam/garbage repo cases.
| tppiotrowski wrote:
| > I wish they did reviews
|
| If the package is hosted on Github, the number of stars is a
| good indicator of quality.
| zokier wrote:
| "RHEL" model for nodejs? Why not, but finding enough people
| willing to actually pay for it will probably be difficult
| BiteCode_dev wrote:
| In Python, Continuum is making bank with exactly that.
| ashishbijlani wrote:
| Plug: I've been building Packj [1] to detect dummy,
| malicious, abandoned, typo-squatting, and other "risky"
| packages. It carries out static/dynamic/metadata analysis and
| scans for 40+ attributes such as num funcs/files, spawning of
| shell, use of SSH keys, network communication, use of
| decode+eval, mismatch of GitHub code vs packaged code
| (provenance), change in APIs across versions, etc. to flag
| risky packages.
|
| 1. https://github.com/ossillate-inc/packj
| arpyzo wrote:
| Would scanning packages be a perfect job for an AI?
|
| edited for clarity
| stuckinhell wrote:
| Resume Driven Development on Steroids these days for nearly
| everything.
| 908B64B197 wrote:
| > When I did a coding boot camp, one of our assignments was to
| push a package to RubyGems. It didn't matter if the package did
| anything; just make up a name and publish it. I'm pretty sure
| this kind of thing was a common practice with other boot camps,
| and applied to NPM as well. I always despised how this
| effectively trashes the repository and represents a complete
| waste of digital space, no matter how insignificant, as well as
| take up names that could go towards code that is actually
| useful. I wouldn't be surprised if a significant number of spam
| NPM packages were these boot camp assignments.
|
| To me seeing these types of behaviors from an applicant would
| be a pretty big red flag. I'm just thinking of the disaster
| that was Hacktoberfest 2020 after a YouTuber popular among
| bootcampers and students in India taught his audience how to
| make a (spammy) PR in order to win a 5$ T-shirt. [0]
|
| A pattern I've seen with bootcamps is that students will build
| a "portfolio" on GitHub and everyone from the same cohort will
| build the exact same project because most of the bootcamp is a
| "fill in the blanks" exercise from the same template. As in,
| there's a 95% match among the same cohort. This type of "GitHub
| gaming" was pushed to the extreme by someone who created one
| package for every ANSI escape code. All of his packages end up
| including one another and the author PR'd them into popular
| projects so using those give him downloads and boost his rank
| [1].
|
| We pretty much stopped recruiting from bootcamps because the
| signal to noise ratio was just too low.
|
| [0] https://joel.net/how-one-guy-ruined-hacktoberfest2020-drama
|
| [1] https://github.com/jonschlinkert/ansi-black
| ravenstine wrote:
| Yep!
|
| Of course, I think the game theory involved with this
| practice has been, at least at one point, more effective than
| having nothing to show at all.
|
| Normally, I don't toot my own horn, but I was one of the few
| who published packages that actually did something, and
| something that was fairly unique at the time (I won't
| necessarily say good!), and the projects I showed off to
| prospective employers were things I did outside of bootcamp.
|
| In my experience, very few employers, or those in charge of
| any level of hiring, will rarely if ever actually devote more
| than 10 seconds to anything on your portfolio. I know some
| will beg to differ, but that was my experience. It happens,
| but it's rare. At the time, one could have probably gotten
| away most of the time with merely _claiming_ to have
| published open-source code or showing off how you got some
| GitHub stars. In retrospect, I can 't say much of my honest
| portfolio work did for me other than act as learning
| experiences. Cranking out a bunch of garbage code would have
| sufficed for showing that I had some "skill" for landing my
| first job.
|
| That ANSI code thing is funny as hell, though! I loathe what
| it represents, but admire how it proves a point by gaming the
| system. Also demonstrates my point that so much of what
| defines success in this field has been the mere appearance of
| even a shred of clout.
| 908B64B197 wrote:
| > At the time, one could have probably gotten away most of
| the time with merely claiming to have published open-source
| code or showing off how you got some GitHub stars. In
| retrospect, I can't say much of my honest portfolio work
| did for me other than act as learning experiences. Cranking
| out a bunch of garbage code would have sufficed for showing
| that I had some "skill" for landing my first job.
|
| That's one of the reasons we stopped considering bootcamp
| candidates.
|
| > That ANSI code thing is funny as hell, though! I loathe
| what it represents, but admire how it proves a point by
| gaming the system. Also demonstrates my point that so much
| of what defines success in this field has been the mere
| appearance of even a shred of clout.
|
| I don't know. You look at software like Quake and DOOM and
| it's quite obvious they were successful because these were
| well engineered. Same thing with the iPhone; One of the
| reasons it's so good is iOS and it's heritage from OSX,
| itself a descendant of NeXTSTEP, probably one of the most
| influent OS of the 90's.
|
| Having 12'000 "hello world" projects using these joke
| dependencies isn't a badge of success, rather a
| differentiation between amateurs and real engineers. The
| former doesn't see anything wrong with pulling in 30+
| packages just to have colored output in the terminal, the
| later definitely does.
| cxr wrote:
| > I always despised how this effectively trashes the repository
|
| The followup assignment should have been teaching the value of
| taking care of your environment by cleaning up after yourself.
| tcmart14 wrote:
| Not to mention, it also throws off numbers when people try to
| talk about how great of an ecosystem is based off the number of
| packages. Sure, NPM may have a gazillion packages, but maybe
| only a few hundred thousand of them are actually useful? You
| see this same thing with cargo and crates.io. There are a lot
| of trash packages that are just generated either to squat on a
| name or maybe spammers or people going through the guide on
| learning how to publish packages to crates.io.
| [deleted]
| lesquivemeau wrote:
| I was expecting this article to be a promotion of their audit
| tool considering a thread about it was flagged as spam less than
| two weeks ago[1]
|
| Turns out it indeed is. Interesting article nonetheless, but it's
| quite ironic that it's about spam
|
| [1] https://news.ycombinator.com/item?id=35233877
| thenerdhead wrote:
| This is common in the "security" space.
|
| i.e. Dunk on an ecosystem, promote your tool that somehow
| "makes it better", but ultimately doesn't help the problem.
|
| Source: I work on a notable package manager where this happens
| regularly.
| Aperocky wrote:
| Normally posting X time is fine, because people does not
| necessarily catch it.
|
| But apparently it was REAL SPAM, there goes the credibility..
| Hnrobert42 wrote:
| Hmm. I found this article informative. I suppose it did mention
| their service, but only toward the end. Even then, it wasn't
| like "Buy now for 50% off!!!" So on balance, I am glad they
| posted.
| dmix wrote:
| There's nothing wrong with content marketing if the content
| is quality.
| miohtama wrote:
| I am not any way affiliated with the company and I did the
| submission. I do believe that informative blog posts by
| industry insider should be allowed and it is not bad practice
| to promote your company. Especially on HackerNews where it is
| relevant for audience (no conflict of interest with
| YCombinator funded companies?).
|
| Otherwise any SaaS ecosystem could become
| AWS/Google/Microsoft well known names only. Rules should be
| also equally applied. E.g. Each GitHub blog post promotes
| GitHub and thus Microsoft.
| lesquivemeau wrote:
| I 100% agree with you on that point
| yklcs wrote:
| Since this is an npmjs problem, I wonder if a CAPTCHA requiring
| the uploader to solve a JS programming problem could work.
| Something hard for spammers to solve just by googling - writing a
| function, filling in blank code, etc.
|
| This would require the uploader to have at least basic (or
| intermediate, depending on the difficulty) knowledge in JS. Maybe
| the generated data could be used to fine tune LLMs.
| loic-sharma wrote:
| Disallowing automated publishing would prevent CI/CD scenarios.
|
| The spammers are creating large amounts of one-off accounts on
| external login providers like Microsoft Account. I'm sure those
| have some sort of CAPTCHA.
| throwitaway222 wrote:
| I like how Java did it. It seemed like there was a guy with a key
| protecting maven. At least originally, I don't know if the gates
| are easy to get through now. It seems like minimally owning a
| github repo is too low a bar, but that's what made it work I
| guess.
| paradite wrote:
| Funny enough, I used to work on a project that requires
| publishing a new npm package for each major Xcode version
| (precompiled swift library).
|
| I was doing incremental suffixes for some time until npm blocked
| our releases after a few versions due to suspected spam.
|
| Had to do some Roman numerals to walk around it.
| jrochkind1 wrote:
| Why couldn't that just be different package versions with the
| same package name?
| transitivebs wrote:
| Also check out https://socket.dev who are my goto for stuff like
| this.
|
| They wrote a similar article recently:
| https://socket.dev/blog/npm-registry-spam-john-wick
| amrb wrote:
| Was the joke creating JavaScript frameworks is way to get
| promoted in FANG / MANGA..
| jgerrish wrote:
| I related this "story" elsewhere, but in another context is
| good too.
|
| So, when I was younger, I used to frequent DEMF, the Detroit
| Electronic Music Festival. It's now called Movement or
| whatever.
|
| It was a great time. And one of the favorite feel good moments
| was seeing Grandma Techno there:
| https://mixmag.net/feature/grandma-techno-shares-her-love
|
| Looking back, this view might have been kind of ageist. It
| shouldn't be that surprising or weird to have older people
| there. Elders in the tribe. But at the time it was also just a
| happy recurring image during the event.
|
| We were taught to love this woman through informal network
| stories and "whispers". Everybody had that friend who pointed
| her out pridefully, and maybe you became that friend pointing
| her out to another.
|
| I can imagine another world where seeing Grandma Techno dancing
| among the younglings is creepy. At least make her wear a VR
| headset when she's within 1000 feet of the festival. It's
| nothing personal, it's just nature.
|
| She has a book now. Whether that's because she wanted that
| sweet "FAANG" publishing deal or liked contributing to the
| scene. Well... who cares... both benefit us.
|
| Fortunately, we were taught to accept people through informal
| support networks of friends.
|
| Same goes with npm and cargo.
|
| Yeah, this story seems disjointed and out-of-place. But it's
| important. And I'm naive, but I'd rather have an orderly
| migration to social and technical controls for packaging than
| drama. And I can still respect the people who want some other
| solution.
|
| Because I'm that fucking old person now.
| hulitu wrote:
| > 50% of new NPM packages are spam
|
| So all an attacker has to do is to publish an npm package. Wait,
| this alteady happened.
| verdverm wrote:
| spam != malware or rootkit (i.e. attacker in the software
| sense)
| blowski wrote:
| To do what? You also need people to install said package.
___________________________________________________________________
(page generated 2023-03-30 23:02 UTC)