[HN Gopher] Stripe's Monorepo Developer Environment
       ___________________________________________________________________
        
       Stripe's Monorepo Developer Environment
        
       Author : edran
       Score  : 339 points
       Date   : 2024-08-15 18:22 UTC (4 days ago)
        
 (HTM) web link (blog.nelhage.com)
 (TXT) w3m dump (blog.nelhage.com)
        
       | domenkozar wrote:
       | We've been building https://devenv.sh for that reason, I expect
       | more companies to go back to local development once they see DX
       | has improved locally.
        
         | evnix wrote:
         | How is this better or different from tools like dev which use
         | docker?
        
           | drakerossman wrote:
           | It (obviously) leverages Nix, which in turn means the
           | environment is declarative and fully reproducible (not
           | "reproducible" as in docker). Now, you can use just Nix's
           | devShells, but with devenv you have a middleground between
           | just Nix package manager and a full fledged NixOS module
           | system. Basically, write out one line of code - and you've
           | got your Postgres, another one - full linter set up for
           | whatever language you're using, etc.
        
             | tmerse wrote:
             | Can I also get the security/isolation benefits that a duly
             | configured docker/podman can provide (container can only
             | act on mounted volume, non-root user, other seccomp
             | settings?).
             | 
             | I feel better doing my "npm install"s in such an
             | environment (of course it's still not a VM - but that's
             | another topic).
             | 
             | When I read about nix, reproducibility is a goal, but
             | security/isolation is a non-goal.
        
               | ParetoOptimal wrote:
               | You can generate fully reproducible OCI/docker containers
               | with devenv, so yes I think.
               | 
               | https://devenv.sh/containers/
        
               | pxc wrote:
               | > When I read about nix, reproducibility is a goal, but
               | [...] isolation is a non-goal.
               | 
               | Generally, yes.
               | 
               | But you can use or put together something like this to
               | run Nix inside a devcontainer instead of locally:
               | https://github.com/xtruder/nix-devcontainer
               | 
               | So you can use them in conjunction (or alternation, if
               | for some projects you're okay running without a
               | container) without having to specify your development
               | environments twice.
               | 
               | > I feel better doing my "npm install"s in such an
               | environment (of course it's still not a VM - but that's
               | another topic).
               | 
               | There are basically two kinds of integration you can do
               | for a project with Nix, which I'll call deep and shallow.
               | In shallow integration, you just have Nix provide the
               | toolchain and then you build the project (manually, with
               | a script, with a Makefile, whatever). This is pretty
               | common and pretty easy, and gives you no protection from
               | malicious NPM build scripts.
               | 
               | For deep integration, you can actually have Nix build
               | your whole project. This has some downsides, like that it
               | can't really handle incremental builds. It also imposes
               | restrictions, like no network access by anything but Nix
               | at build time, all packages are built by special build
               | users with no homedirs and no perms to access anything,
               | etc. When you do that kind of build/install, you do get
               | some protection from crypto miners lurking in the NPM
               | registry or PyPI or whatever.
        
         | stavros wrote:
         | Nix is the right tool for this, developing a tool to make Nix's
         | UX easier is a great idea. Thanks for this!
        
           | dirtbag__dad wrote:
           | What about dev containers?
        
             | stavros wrote:
             | You mean Docker? They tend to rot much more than I'd like,
             | mostly because you forget to pin something at some point.
             | With Nix, you can't forget.
        
               | janjongboom wrote:
               | FYI, I've helped set up StableBuild
               | (https://www.stablebuild.com) to help pin stuff in Docker
               | that's normally virtually impossible to pin (e.g. OS
               | package repos, Docker base images, random files from the
               | internet, etc.)
        
               | 1oooqooq wrote:
               | did the word rot change meaning recently?
               | 
               | pin is what causes rot, not what solves it.
        
               | otabdeveloper4 wrote:
               | Good luck with your Docker containers in three years.
               | (You're gonna need it.)
        
               | 0x457 wrote:
               | Different kind of rot. With nix and flakes, I can come
               | back to a project 5 years later and as long as external
               | dependencies (i.e. package sources) are still available it
               | will bring me back straight to that environment like it
               | was yesterday.
               | 
               | If you have a Dockerfile from 5 years ago...well good
               | luck building it today.
        
             | ParetoOptimal wrote:
             | You can create them with devenv, but they are actually
             | reproducible:
             | 
             | https://devenv.sh/containers/
             | 
             | https://devenv.sh/integrations/codespaces-devcontainer/
        
               | 1oooqooq wrote:
               | i missed any description of the actual container content
               | on those examples.
        
               | 0x457 wrote:
               | IIRC, it uses what is defined for shell environment. Just
               | instead of activating on your machine, it produces OCI
               | image with that environment.
               | 
               | I have nixOS definitions that I can use to make a SD card
               | image, overtake a running linux system via ssh, deploy to
               | nixos via ssh, or deploy to a local system - all from one
               | definition.
        
             | pxc wrote:
             | Containers are a great deployment target, but they're not
             | really a great development environment for a few reasons
             | (e.g., they're Linux-specific, so they require extra
             | virtualization on non-Linux operating systems, the kind of
             | isolation they provide is more of a hindrance than a help
             | when it comes to working on your local filesystem, and for
             | them to be useful you have to set up infrastructure to push
             | and pull your private containers to and from).
             | 
             | Nix is a better fit for this, and when you're using Nix you
             | can also have Nix generated containers for deployment. I
             | think you can also use a container with Nix in it to provide
             | the devcontainers interface to devs who don't have Nix
             | installed locally, and have it in turn use Nix against your
             | project's flake to set up its environment.
        
           | eadmund wrote:
           | > Nix is the right tool for this
           | 
           | Or Guix, which has the advantage of a more pleasant language.
        
             | earthling8118 wrote:
             | The language isn't the problem with nix.
        
               | 0x457 wrote:
               | It's not "the problem", but it's a problem. It's better
               | than alternatives, but its hacky nature shows.
        
               | ParetoOptimal wrote:
               | Well, if you believe:
               | 
               | - discoverability is a problem in Nix
               | 
               | - Guix encourages or "shepherds" more discoverable
               | functions, modules, and abstractions
               | 
               | Then the language could be a problem.
        
               | nitsky wrote:
               | What is?
        
             | chpatrick wrote:
             | YMMV, I really don't like lisp braces personally.
        
             | otabdeveloper4 wrote:
             | That's, like, just your opinion, man.
             | 
             | Scheme and/or Lisp is literally the worst language choice
             | for this problem domain.
        
               | 0x457 wrote:
               | I wouldn't say it's the worst. I don't like Lisp and co,
               | but I think it's alright for this. I don't like Guix for
               | a very different reason.
        
               | ParetoOptimal wrote:
               | Why is Lisp the worst for this problem domain?
               | 
               | I will admit I find guix to be much more verbose than
               | Nix.
        
             | pxc wrote:
             | At my workplace, Guix's lack of macOS support takes away
             | some of the benefit of using something like Nix or Guix as
             | opposed to HVM solutions like Docker Desktop or Vagrant. I
             | imagine this situation is unfortunately common.
             | 
             | For teams where GNU/Linux is the primary development OS,
             | Guix seems like a great choice.
        
         | pxc wrote:
         | My small team uses devenv for all our development environments
         | and we really like it. Local DX is really important to me and
         | to our team, which is a big part of why we've chosen Nix and
         | devenv.
         | 
         | As we've started to use it more extensively, we've also found
         | that we want to add some enhancements, work out some bugs, and
         | experiment with our own customizations out-of-tree, etc. I'm
         | happy to report here on HN that devenv is well-documented and
         | easy to extend for Nix users who have some experience with Nix
         | module systems, and that Domen is really responsive to PRs. :)
        
       | reillys wrote:
       | I chatted to Nelson when I was designing brisk
       | (https://github.com/brisktest/brisk) and his insight informed the
       | development of it.
       | 
       | Among other things, Brisk allows you to run tests for your local
       | code changes in the cloud (basically the pay mini test piece but
       | for any test runner)
       | 
       | We also have a sync step much like the one described here and
       | allow users to run one off commands (linters, tsc etc)
        
         | IshKebab wrote:
         | Can't you achieve all that just using a build system with
         | reliable remote builds & caching e.g. Bazel, Buck, Please, etc?
         | 
         | That also avoids hacky sync scripts.
        
           | reillys wrote:
           | No you can't.
           | 
           | They don't work from your local development env and also work
           | in your CI env.
           | 
           | Mostly Brisk was designed to run your complete test suite on
           | every code save (i.e. local save) but it also works great from
           | your CI.
           | 
           | We can run entire test suites in seconds which is performance
           | you don't get with those systems you named (which are
           | generally for building/compiling)
        
             | reillys wrote:
             | To be clear the sync step is used for the test suite
             | execution not only the one off command running - it's just
             | something we can also easily do because we have a hot env
             | in the cloud
        
             | joshuamorton wrote:
             | > They don't work from your local development env and also
             | work in your CI env.
             | 
             | This is one of the biggest selling points of bazel-like
             | build systems. Like to the extent that, for some changes,
             | bazel can say "even though you changed this source file, I
             | can be 100% certain that that change didn't affect any
             | tests and so I will not run them"
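
The test selection described above can be sketched as a walk over a reversed dependency graph. This is only an illustration, not how Bazel is implemented (Bazel works from declared inputs and content hashes); the target names and the `DEPS` table are hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical dependency graph: each target maps to the targets it depends on.
DEPS = {
    "//lib:money": [],
    "//lib:http": [],
    "//api:charges": ["//lib:money", "//lib:http"],
    "//api:charges_test": ["//api:charges"],
    "//docs:site": [],
    "//docs:site_test": ["//docs:site"],
}


def affected_tests(deps, changed):
    """Return test targets that transitively depend on any changed target."""
    # Invert the graph so we can walk from a target to its dependents.
    dependents = defaultdict(set)
    for target, target_deps in deps.items():
        for dep in target_deps:
            dependents[dep].add(target)

    # Breadth-first walk from the changed targets through their dependents.
    seen = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for dependent in dependents[current]:
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)

    # Assumption: test targets are recognisable by a "_test" suffix.
    return sorted(t for t in seen if t.endswith("_test"))
```

With this table, a change to `//docs:site` selects only `//docs:site_test`, so the API tests can be skipped with certainty because nothing they depend on changed.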
        
             | Maxious wrote:
             | I'd suggest you revise your competitor analysis. Bazel
             | definitely has a test command that with remote execution
             | and caching absolutely allows you to run entire test suites
             | in seconds* both locally and in CI eg.
             | https://blog.aspect.build/typescript-with-rbe
        
               | reillys wrote:
               | This blog post says 2 and a half minutes not seconds.
               | 
               | I know Bazel is a build system which distributes builds
               | among remote machines.
               | 
               | In fact using any computer language you can achieve these
               | goals - you just need to program it.
               | 
               | So yes you could probably do all the things with all the
               | things, but Bazel does not solve this problem out of the
               | box.
               | 
               | I wonder why stripe didn't "just use Bazel".
        
               | IshKebab wrote:
               | > This blog post says 2 and a half minutes not seconds.
               | 
               | It's meaningless to say "we can run tests in seconds".
               | You can't run _my_ tests in seconds because they're
               | single threaded and take 10 minutes. The important thing
               | is the speedup, and they got a pretty good speedup.
               | Arguably the nop build/test time is important too but it
               | doesn't look like they measured that.
               | 
               | > Bazel does not solve this problem out of the box.
               | 
               | Yes it does.
               | 
               | > I wonder why stripe didn't "just use Bazel".
               | 
               | In my experience it's because setting up Bazel is a) more
               | work than setting up some ad-hoc build system (Make or
               | CMake or whatever) and b) difficult to switch to
               | retrospectively. So it only gets used where you have
               | people who are experienced enough to know that you _will_
               | wish you had started with it, and can convince the
               | inexperienced people that it's worth the effort.
               | 
               | Usually you get too many inexperienced people saying
               | "it's too difficult; we'll be fine with Make".
        
               | zrail wrote:
               | First release of Bazel was in 2015 when Stripe was
               | already 5 years old and the progenitor of this tooling
               | was already running with several dozen users.
        
               | jvolkman wrote:
               | Stripe does use Bazel.
               | 
               | https://stripe.com/blog/fast-secure-builds-choose-two
        
               | nkohari wrote:
               | Stripe does use Bazel. It just didn't exist before Stripe
               | built some of its own internal systems, but it's
               | gradually replacing ~everything from a build standpoint.
               | 
               | The one thing to know about Bazel is that it's both
               | incredibly impressive, and also one of the least
               | ergonomic pieces of software ever created. It's very
               | clearly an internal project which was cleaned up and open
               | sourced without any attempt to make it more usable
               | outside of Google.
               | 
               | Bazel's kind of like Kubernetes in a way -- you don't
               | actually get enough benefits to adopt it until you're at
               | a certain point in the company lifecycle, and to get to
               | that point you usually have to build other systems first.
               | Then you have to gradually replace those systems with
               | Bazel.
        
             | IshKebab wrote:
             | > They don't work from your local development env and also
             | work in your CI env.
             | 
             | Err yes they do? Unless you mean something really specific
             | that I'm not getting?
        
         | riffraff wrote:
         | > Brisk allows you to run tests for your local code changes in
         | the cloud
         | 
         | how does this work for interactive debugging?
         | 
         | I was going to ask the same about the system in TFA but I might
         | as well ask you :)
        
       | KolmogorovComp wrote:
       | > In addition, Stripe's monorepo was (to our knowledge) the
       | largest Ruby codebase in existence
       | 
       | Bigger than Shopify's?
        
         | Macha wrote:
         | So from a gut feeling that sounds right, finance is a pretty
         | complicated domain with a lot of per vendor interactions, and
         | Shopify outsources their payment stuff to Stripe.
         | 
         | Also on a headcount level, Google tells me Shopify has 3,500
         | employees to Stripe's 9,500. Obviously neither company is
         | comprised entirely of engineers, so this is a ballpark
         | estimate.
         | 
         | GitHub feels like the real case where there might be a larger
         | codebase. It's in the middle for employees (6,500), but it's
         | existed longer than Stripe (though not as much longer as my gut
         | feeling told me, interestingly)
        
         | spacemonkey92 wrote:
         | I also wonder how they handle merge requests in a monorepo,
         | especially when it comes to the code review process.
        
           | azthecx wrote:
           | Typically you have owner files or similar in the subprojects
           | that are read by automation tooling and humans alike
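
A minimal sketch of how automation might consume such owner files, assuming they have already been parsed into a directory-to-owners table. The `OWNERS` mapping, team names, and paths are hypothetical; real tooling differs in file format and matching rules.

```python
from pathlib import PurePosixPath

# Hypothetical directory -> owners table, as if parsed from per-directory
# owner files. The empty key stands for the repository root.
OWNERS = {
    "": ["@repo-admins"],
    "payments": ["@payments-team"],
    "payments/ledger": ["@ledger-team"],
}


def owners_for(path, owners=OWNERS):
    """Return the owners of the nearest ancestor directory that declares any."""
    directory = PurePosixPath(path).parent
    while True:
        # PurePosixPath represents the repository root as ".".
        key = "" if str(directory) == "." else str(directory)
        if key in owners:
            return owners[key]
        if key == "":
            return []
        directory = directory.parent


def reviewers_for(changed_paths, owners=OWNERS):
    """Collect the distinct owners that should review a set of changed files."""
    return sorted({o for p in changed_paths for o in owners_for(p, owners)})
```

The nearest-ancestor rule means a deep subproject can override the owners declared higher up, while files with no closer owner fall back to the root entry.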
        
           | popinman322 wrote:
           | It's possible to get stuck in merge hell where all your
           | reviewers ok the PR but someone merged a conflict 2 seconds
           | ago, or you've got a reviewer in Singapore while you're in SF
           | and conflicts appeared overnight.
           | 
           | In general it was pretty rare, in my experience. The code
           | bases were pretty well modularized.
        
           | shepwalker wrote:
           | Hi! I work at Stripe on this. What're you curious about?
        
         | froydnj wrote:
         | The most recent publicly available numbers (that I know of,
         | maybe there's a talk available somewhere that's more recent)
         | are from
         | https://stripe.com/blog/sorbet-stripes-type-checker-for-ruby
         | 
         | > currently amounting to over 15 million lines of code spread
         | across 150,000 files
         | 
         | The monorepo has only gotten bigger over the last two years
         | (source: I work at Stripe).
        
           | froydnj wrote:
           | I should also note that number is Ruby files only.
        
       | rvz wrote:
       | This isn't recommended practice really and there is nothing about
       | this which justifies having to maintain huge code bases in a
       | single folder or multiple folders in one larger one.
       | 
       | Won't be surprised to see that many would probably need a safari
       | map or README documentation in every single folder to navigate a
       | repository as large as Stripe's.
       | 
       | Sounds like an emergence of a new bad practice if you are having
       | to praise how large your code base is.
        
         | pavlov wrote:
         | Meta also has a massive monorepo accessed primarily through
         | cloud devservers.
         | 
         | When several of the world's most successful software companies
         | use this approach, it's hard to argue that it's inherently bad.
         | Of course it's sensible to discuss what lessons apply to
         | smaller companies who don't have the luxury of dedicated
         | tooling teams supporting the monorepo and dev environment.
        
           | n_ary wrote:
           | Just because some successful companies use some approach
           | doesn't make it the best practice. I have seen firsthand
           | nuisance of monorepo, which took almost 15 minutes to
           | correctly switch branches on Intel machines (and decently
           | spiked the CPU by causing windows defender to panic). It has
           | decent benefit of easy code sharing, but build and test are
           | soul sucking experiences and if someone decides to run some
           | updated formatter and linter rule accidentally, the whole MR
           | becomes a nightmare to correctly review (once had 2k+
           | changes and had to request to rollback and then only commit
           | what they actually wanted to change).
        
             | aidos wrote:
             | Why would you feel obliged to accept a MR in which someone
             | has accidentally changed large amounts of code?
        
             | tail_exchange wrote:
             | > took almost 15 minutes to correctly switch branches on
             | intel machines
             | 
             | This can probably be fixed with trivial tuning. Just
             | configuring Git to fetch only your branches would speed up
             | the branch switching significantly.
             | 
             | > build and test are soul sucking experiences
             | 
             | Why? It doesn't have to be. If you are going to build the
             | entire monorepo, then yes, but this should only happen when
             | you are running CI, and even then you can break down the
             | builds into smaller components.
             | 
             | > the whole MR becomes a nightmare to correctly review
             | 
             | Not if you set up code ownership properly. You also need to
             | think what happens in case of emergencies, so having a
             | selected list of "super users" and users with permissions
             | to bypass reviews is important.
             | 
             | It sounds like this company wanted a monorepo, but nobody
             | invested any money or time to actually think about
             | developer productivity. When this happens, yes, of course
             | it won't be good, because no project succeeds like this.
             | The nice thing about a monorepo is that instead of 1,000
             | repos with tooling all over the place and no specialist to
             | take care of them, you can have one repo with really good
             | tooling and a team dedicated to just keep it running
             | smoothly. But if nobody is actually taking care of the
             | monorepo, it will rot just like any other codebase.
        
             | riwsky wrote:
             | "Someone autoformatted the whole thing under new settings
             | at the same time as introducing a new feature" is hardly a
             | monorepo problem. That could be a pain in the ass to review
             | even in a single file. But the flip-side, of someone
             | cleanly wanting to do a mass autoformat or autorefactor,
             | is much easier in a monorepo than in split repos.
        
             | kccqzy wrote:
             | Nothing you describe is inherent to monorepos. Git is slow
             | yes but go use hg. Build and test are slow? That's a CI
             | problem: you didn't allocate enough resources to the build
             | system. Someone ran a formatter accidentally? That's that
             | someone's mistake.
        
           | mootoday wrote:
           | Meta also uses React and we know what mess that introduced to
           | the world...
        
         | ABS wrote:
         | very much recommended practice by many with, of course, caveats
         | and situations where perfect is the enemy of good, etc, etc
         | 
         | e.g. https://trunkbaseddevelopment.com/monorepos/
        
         | lijok wrote:
         | > Won't be surprised to see that many would probably need a
         | safari map or README documentation in every single folder to
         | navigate a repository as large as Stripe's.
         | 
         | No different to having thousands of smaller repos instead.
         | 
         | I personally dislike monorepos, for very niche, in-the-weeds
         | operational reasons (as an infra person), but their ergonomics
         | for DX cannot be understated.
        
           | __jonas wrote:
           | The 'ergonomics for DX' benefit is that you can share code
           | across projects without having to go down the path of
           | creating a package / library pushed to some internal registry
           | and pulled by each project right?
           | 
           | Or are there any other aspects to the monorepo architecture
           | that make it beneficial for large companies like that?
           | 
           | Just curious, I've never worked in such an environment
           | myself.
        
             | dezgeg wrote:
             | In addition to what you mentioned, the ability to
             | atomically commit to a library and all of its consumers.
             | And for a change to a library run the tests of all of its
             | consumers as well.
        
             | bastawhiz wrote:
             | Every host running a particular commit is running the code
             | you think it is. No submodules or internal packages. If you
             | updated the Button component in the design system, when
             | your commit is deployed, every service that gets deployed
             | has the new button now.
        
             | triceratops wrote:
             | Dependency versioning is much smoother.
             | 
             | Example: Service A requires version 1.1 of libFoo and
             | libFoo 1.1 requires version 0.1 of libBar. But Service A
             | also directly uses libBar version 0.2. Now you have a
             | conflict.
             | 
             | If libFoo and libBar are internal code stored in a monorepo
             | they're automatically version-compatible because there is
             | only one version of both.
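
The conflict in the example above can be reproduced with a small check over pinned requirements. This is a sketch under simplifying assumptions (exact pins only, no version ranges); the `REQUIREMENTS` table and the function name are hypothetical.

```python
# Hypothetical exact pins mirroring the example: Service A wants libFoo 1.1
# and libBar 0.2, while libFoo 1.1 itself wants libBar 0.1.
REQUIREMENTS = {
    "service-a": {"libFoo": "1.1", "libBar": "0.2"},
    "libFoo==1.1": {"libBar": "0.1"},
}

# In a monorepo every consumer builds against the one checked-in version.
MONOREPO = {
    "service-a": {"libFoo": "HEAD", "libBar": "HEAD"},
    "libFoo": {"libBar": "HEAD"},
}


def find_conflicts(requirements):
    """Return {package: sorted versions} for packages pinned to several versions."""
    wanted = {}
    for pins in requirements.values():
        for package, version in pins.items():
            wanted.setdefault(package, set()).add(version)
    return {pkg: sorted(vs) for pkg, vs in wanted.items() if len(vs) > 1}
```

Calling `find_conflicts(REQUIREMENTS)` reports libBar as requested at both 0.1 and 0.2, while `find_conflicts(MONOREPO)` is empty because only one version of each library exists.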
        
             | oftenwrong wrote:
             | To put it in the most general terms: It provides the same
             | value that using a VCS has for a project, but applied to
             | the entire company.
             | 
             | In a standalone project, would you accept a change that is
             | incompatible with other code in the project? For example,
             | would you allow a colleague to change a function in a way
             | that breaks the call sites? No, you probably would not.
             | 
             | The attitude within monorepo shops is that this level of
             | rigour should be applied to the entire company. Nobody
             | should be able to make a change anywhere if it would break
             | anything elsewhere, or they should only be permitted to do
             | so with intention. There are caveats to this, but that is
             | the general idea.
        
             | aylmao wrote:
             | I'd say there's 4 main advantages, summarizing what other
             | comments are saying but also from my own experience:
             | 
             | - atomic PRs. All changes for a migration/feature living in
             | one spot makes development much easier, especially when
             | dealing with api changes and migrations
             | 
             | - single history. This is useful when debugging. A commit
             | can more easily encapsulate the state of "the whole system"
             | as opposed to a single part of it. This makes reverting, if
             | necessary, easier
             | 
             | - environment consistency. updating the linting tool,
             | formatting tool, UI library, etc is never a priority, so
             | there's always drift, where an old repo gets stuck with old
             | tools, dependencies and an old environment
             | 
             | - not shipping your org chart is easier when everyone can
             | see and work on the whole codebase, as easily as
             | possible.
        
           | chrisweekly wrote:
           | understated -> overstated
        
         | papruapap wrote:
         | imo monorepos are great, but the tooling is not there,
         | especially the open-sourced ones. Most companies using
         | monorepos have their own tailored tools for it.
        
         | bastawhiz wrote:
         | > Won't be surprised to see that many would probably need a
         | safari map or README documentation in every single folder
         | 
         | Is...documentation a bad thing?
        
       | Aeolun wrote:
       | They decided to keep the code on the local machine, but the
       | language server on the remote one. That seems like a recipe for
       | inconsistency. You only get relevant results from your language
       | server once your code has synced.
        
         | Hackbraten wrote:
         | The article mentions that the LSP itself already has baked-in
         | support to enable editors to send chunks of unsaved edits to
         | the language server (LS) as they happen.
         | 
         | What Stripe's configuration introduced is that they used a
         | remote LS instead of the default local LS. Regardless, VS Code
         | already defers LSP communication until it feels idle, and
         | developers are used to that. So I wouldn't expect a remote LS
         | to significantly impact the level of inconsistency that
         | developers already accept when using a local LS.
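
For reference, the LSP mechanism for sending unsaved edits is the `textDocument/didChange` notification, framed with a `Content-Length` header. The sketch below builds one such message with full-document sync; the URI and text are made up, and whether a given editor sends full or incremental changes depends on what the server advertises.

```python
import json


def did_change_message(uri, version, new_text):
    """Frame a textDocument/didChange notification for the LSP base protocol."""
    payload = {
        "jsonrpc": "2.0",
        "method": "textDocument/didChange",
        "params": {
            # The version number increases with every change to the document.
            "textDocument": {"uri": uri, "version": version},
            # A change without a "range" replaces the whole document text.
            "contentChanges": [{"text": new_text}],
        },
    }
    body = json.dumps(payload).encode("utf-8")
    # Content-Length counts bytes of the JSON body, not characters.
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body
```

Because the message carries the buffer contents themselves, the server can analyse edits that were never written to disk, whether it runs locally or on a remote machine.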
        
         | bastawhiz wrote:
         | I was at Stripe until 2022 and inconsistency with the language
         | server was never an issue
        
           | aidos wrote:
           | Due to the work that this team put in though, right?
           | 
           | The choice to run dev environment far away from the files
           | puts you in the position of needing to engineer your way past
           | the inconsistency.
        
             | bastawhiz wrote:
             | Yes, almost certainly.
             | 
             | On the other hand, there was so much code that running
             | everything on your own laptop was essentially out of the
             | question. Doing a git pull after a long vacation locked up
             | your dev box for a hot minute while it checked all the
             | types--doing the same thing on your MacBook would be
             | painful at best.
        
         | paxys wrote:
         | The code syncs on every keystroke. Consistency isn't an issue
         | unless you are having connection issues. And if you are then
         | pretty much all development is broken anyways.
        
       | srvaroa wrote:
       | "This scale - the scale of devprod, and in turn the scale of the
       | overall organization, such that it could afford 10 FTEs on
       | tooling - was a major factor in our choices"
       | 
       | Is basically the summary for most mono/multi repo discussions,
       | and a bunch of other related ones.
        
         | mhh__ wrote:
         | Not sure.
         | 
         | I think a lot of this type of thing comes about because with
         | a monorepo you can actually see the problems to solve whereas
         | you can easily end up with the same N engineers firefighting
         | the same problems K times across all your polyrepos.
        
           | bluGill wrote:
           | You have different problems with both. Some problems are
           | hidden in one, but there is no one best answer. (unless your
           | project is small/trivial - which is what a lot of them are)
        
         | klodolph wrote:
         | Multirepo also comes with cost overhead. I think people talk
         | about it somewhat less. I've worked at multirepo and monorepo
         | places, both, before. My current company has a multirepo setup
         | and it sure seems like it comes with plenty of tooling to fetch
         | dependencies. That tooling has to be supported by FTEs.
        
           | hibikir wrote:
           | Internally, they definitely do. I worked at Stripe's monorepo
           | many years ago, and I am working at a larger company with
           | massive amounts of repos. The difference in pain has little
           | to do with mono v multi, but with the capabilities of your
           | tooling team.
           | 
           | If there's anything I'd say to low-level execs, the kind that
           | end up with a few hundred developers under them, it's that
           | mis-sizing the tooling team, in one way or the other, comes
           | with total productivity penalties that will appear invisible,
           | but will make everything expensive. Understanding how much of
           | a developer's day is toil is very important, but few really
           | try to figure that out.
        
           | aylmao wrote:
           | +1. I'd go as far as to say that multi-repo probably needs as
           | much, if not more effort to properly keep functioning, but
           | all that effort is better "hidden" so people assume monorepos
           | are more work.
           | 
           | With a monorepo, it's common to have a team focused on
           | tooling and maintaining the monorepo. The structure of the
           | codebase lends itself to that
           | 
           | With a multirepo codebase, it's usually up to different teams
           | to do the work associated with "multirepo issues"--
           | orchestrate releases, handle dependencies, dev environment
           | setup, etc. So all that effort just kinda gets "tucked away"
           | as overhead that each team assumes, and isn't quite as
           | visible
        
         | bluGill wrote:
         | It doesn't matter if you have a mono-repo or multi-repo, you
         | will need engineers on tooling to make it work if your project
         | is large. There are pros and cons to both multi-repo and mono-
         | repo with no one right answer (despite what some will tell
         | you). They are different pros and cons, but which is best
         | depends on your particular context.
        
           | srvaroa wrote:
           | Yeah that was my point. In the end both approaches can be
           | fine (depends on your context). The real difference is that
           | whatever choice you take, it will need the right investment
           | in tooling and support.
        
       | bool3max wrote:
       | Off-topic but the font on this blog is stunning - after some
       | digging it seems to be "Vollkorn".
        
       | delhanty wrote:
       | >Some caveats: It's been nearly five years, and I have no doubt
       | that I have misremembered some of the specific details, even
       | though I'm confident in the overall picture. I'm also certain
       | that Stripe has continued evolving and I make no claim this
       | document represents the developer experience at Stripe as of
       | today.
       | 
       | Are there any more recently ex-Stripe folks here willing and able
       | to comment on how Stripe's developer environment might have
       | evolved since the OP left in 2019?
        
         | artyom wrote:
         | Not ex-Stripe but in "close relationship" with them since its
         | inception and there's a clear mark in my calendar circa end of
         | 2018 when their decisions and output started to become...
         | weird, or ill-designed.
         | 
         | I don't think it has to do with the dev environment itself, but
         | I'd blame such thing for allowing to deliver "too fast" without
         | thinking twice. Combine that with new blood in management and
         | that's an accident waiting to happen *
         | 
         | They're the best in business still, but far from the well-
         | designed easy-to-use API-first developer-friendly initial
         | offering.
         | 
         | * Pure speculation based on very evident patterns
        
           | rattray wrote:
           | Ex-Stripe ('17-'20) here. Agree.
           | 
           | Though I am under the impression that things have gotten more
           | sensical internally over the last year or so.
           | 
           | Note also that the devprod team has largely been shielded
           | from the craziness, and may still be making good decisions
           | (but I don't know what they are in this realm personally).
        
         | nkohari wrote:
         | I spent 4.5 years at Stripe, and left in March.
         | 
         | The biggest difference not mentioned in the article is that
         | code is no longer kept on developer machines. The sync process
         | described in the article was well-designed, but also was a
         | fairly constant source of headaches. (For example, sometimes
         | the file watcher would miss an update and the code on your
         | remote machine would be broken in strange ways, and you'd have
         | to recognize that it was a sync issue instead of an actual
         | problem with your code.) As a result, the old devbox system was
         | superseded by "remote devboxes", which also host the code.
         | Engineers use VSCode remote development via SSH. It works
         | shockingly well for a codebase the size of Stripe's.
         | 
         | There are actually several different monorepos at Stripe, which
         | is a constant source of frustration. There have been lots of
         | efforts to try to unify the codebase into a single git repo,
         | but it was difficult for a lot of reasons, not the least of
         | which was the "main" monorepo was already testing the limits of
         | the solution used for git hosting.
         | 
         | Overall, maintaining good developer productivity is an
         | _extremely_ challenging problem. This is especially true for a
         | company like Stripe, which is both too large to operate as a
         | "small" company and too small to operate as a "big" company.
         | Even with a well-funded team of lots of super talented people
         | putting forth their best efforts, it's tough to keep all of the
         | wheels fully greased.
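
The failure mode described above (a watcher missing an update and leaving the remote copy silently stale) is the kind of thing a periodic content-hash comparison can surface. The sketch below is a hypothetical check, not Stripe's tooling: `tree_digests` and `find_drift` are invented names, and the digests for the "remote" side are assumed to have been computed the same way on the other machine.

```python
import hashlib
from pathlib import Path


def tree_digests(root: Path) -> dict[str, str]:
    """Map each file path (relative to root) to the SHA-256 of its contents."""
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def find_drift(local: dict[str, str], remote: dict[str, str]) -> dict[str, list[str]]:
    """Group the files that differ between the two sides by kind of mismatch."""
    shared = set(local) & set(remote)
    return {
        # Present locally but never synced.
        "missing_remote": sorted(set(local) - set(remote)),
        # Present remotely but deleted (or never created) locally.
        "extra_remote": sorted(set(remote) - set(local)),
        # Present on both sides with different contents.
        "stale": sorted(p for p in shared if local[p] != remote[p]),
    }
```

A non-empty result would tell a developer that a strange failure is a sync problem rather than a bug in their code, which is the diagnosis step the comment says was hard to make.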
        
           | jcmfernandes wrote:
           | Thanks for this. Can you share the experience of those who
           | don't use VS Code?
        
             | tail_exchange wrote:
             | IntelliJ is also supported. If you want to use something
             | else, like VIM, then you need to ssh into the remote devbox
             | machine. They have support for custom dotfiles, so you can
             | set up your cool VIM environment for all your remote
             | devboxes.
             | 
             | If you don't want remote devboxes, the regular devboxes
             | still work. You just need to deal with the additional pain
             | for syncing the files.
        
           | cynicalpeace wrote:
           | Glad to see that they moved to code living with the execution
           | environment. The code living separate from the execution
           | environment seemed like too much overhead and complexity for
           | not enough benefit.
           | 
           | Especially given VSCode, or Cursor ;), work so well via ssh.
           | 
           | To the engineers that don't want to use those IDE's it might
           | suck temporarily, but that's it.
        
         | chaosphere2112 wrote:
         | I was only there in 2022, but at that point there were in fact
         | three or more monorepos (forked roughly based on toolchain - go
         | and scala in one, primarily Ruby in the one detailed here, and
         | there was one for the client stripe api libs that was JS only).
         | There may have been more.
        
         | bhuga wrote:
         | Some important differences from 2019:
         | 
         | * Code is off of laptops and lives entirely on the dev server
         | in many (but not all) cases. This has opened up a lot of use
         | cases where devs can have multiple branches in flight at once.
         | 
         | * Big investments into bazel.
         | 
         | * Heavier investment into editor experiences. We find most
         | developers are not as idiosyncratic in their editor choices as
         | is commonly believed, and most want a pre-configured setup
         | where jump-to-def and such all "just work".
        
           | cynicalpeace wrote:
           | I'm glad to see that first bullet point. The code living
           | separate from the execution environment seemed like too much
           | overhead and complexity for not enough benefit.
        
           | eikenberry wrote:
           | That last point has long been a red flag when interviewing. A
           | developer who doesn't care about their tooling also tends to
           | not care about the quality of their work.
        
             | ParetoOptimal wrote:
             | It's also a red flag for me when a company mandates an IDE.
        
             | mvdtnz wrote:
             | I'd rather work with developers who are flexible and open
             | minded about the conditions they can work in than those who
             | get notoriously pissy if things aren't set up exactly the
             | way they like it. Especially when that way is ridiculously
             | elaborate and non-standard.
        
       | pjmlp wrote:
       | Yet another replay of timesharing development experiences, I
       | guess we need a couple of generations more to count how many
       | times does a pendulum swing back and forth during a developer's
       | lifetime.
        
       | jdtig wrote:
       | Does Stripe use RoR?
       | 
       | The author mentions the codebase was Ruby, but I didn't see if
       | they talked about Rails.
        
         | bastawhiz wrote:
         | It is Ruby but not rails
        
           | jdtig wrote:
           | Thanks. I wonder what the experience is like working on a
           | very large codebase with or without a framework. E.g. Stripe
           | vs Shopify.
           | 
           | Or if the framework is barely noticeable at that scale and
           | doesn't really matter anymore. That's the impression I get
           | for Instagram (which was built with Django).
        
             | esprehn wrote:
             | At that scale there's certainly a framework and many in
             | house libraries with opinions and patterns. It's just not
             | rails.
        
             | bastawhiz wrote:
             | They had their own ORM, and a web framework built on
             | Sinatra. It wasn't as though you needed to reach far for a
             | tool if you needed one
        
           | jcmfernandes wrote:
           | Do they use zeitwerk?
        
             | froydnj wrote:
             | We do, yes.
        
       | anonzzzies wrote:
       | We use similar practices in our 3.5 person team; we work via
       | code-server and Aider with our own tooling on VPSs and this gets
       | synced to execution VPSs which run dev versions, a lot of sentry
       | logging and tests (mostly playwright these days). There is also a
       | vps which does builds all day and logs to Sentry too. We can
       | almost instantly get on our own test versions and see what we
       | did, and, over the space of some seconds to minutes we see test
       | and build data coming in. It works incredibly well for many years
       | already. Onboarding people is easy and no one ever has 'it
       | doesn't build on my system' as that's not something we do (you
       | can of course, all scripts are there but why waste the time?).
       | 
       | I grew up with mainframes, minis and unix batch and/or multiuser
       | machines; for me this is the best way for business applications.
       | I didn't particularly like the move to local all that much.
        
       | aidos wrote:
       | Maybe a silly question, but why all this engineering effort when
       | you could host the dev environment locally?
       | 
       | By running a Linux VM on your local machine you get a consistent
       | environment that you can ssh to, remove the latency issues but
       | you remove all the complexity of syncing that they've created.
       | 
       | That's a setup that's worked well for me for 15 years but maybe
       | I'm missing some other benefit?
        
         | yeswecatan wrote:
         | I came to ask the same thing. We use docker-compose to describe
         | all our services which works fine.
        
           | JasonSage wrote:
           | This does not scale to a large number of services with a
           | certain amount of RAM/processing per service.
        
             | aidos wrote:
             | You could still run the proxy they have that lazy boots
             | services - that's a nice optimisation.
             | 
             | I don't think that many places are in a position where the
             | machines would struggle. They didn't mention that in the
             | article as a concern - just that they struggled to keep
             | environments consistent (brew install implies some are
             | running on osx etc).
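
The lazy-boot idea mentioned here can be sketched in a few lines: a front door that starts a service the first time something asks for it, so only the services a task actually touches consume resources. `LazyServices` and the starter callables are hypothetical; a real proxy would launch processes and forward traffic rather than call Python functions.

```python
from typing import Callable


class LazyServices:
    """Start each registered service on first use and reuse it afterwards."""

    def __init__(self, starters: dict[str, Callable[[], object]]):
        self._starters = starters
        self._running: dict[str, object] = {}

    def get(self, name: str) -> object:
        if name not in self._running:
            # First request for this service: boot it now.
            self._running[name] = self._starters[name]()
        return self._running[name]

    def running(self) -> list[str]:
        """Names of the services that have been started so far."""
        return sorted(self._running)
```

The same structure works whether the services live in a local VM or on a remote devbox; the design only decides when a service starts, not where.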
        
               | sulam wrote:
               | I think it's safe to assume that for something with the
               | scale and complexity of Stripe, it would be a tall order
               | to run all the necessary services on your laptop, even
               | stubs of them. They may not even do that on the dev
               | boxes, I'd be a little surprised if they didn't actually
               | use prod services in some cases, or a canary at any rate,
               | to avoid the hassles of having to maintain on-call for
               | what is essentially a test environment.
        
               | aidos wrote:
               | I don't know that's safe to assume. Maybe it is an issue
               | but it was not one of the issues they talk about in the
               | article and not one of the design goals of the system.
               | They have the proxy / lazy start system exactly so they
               | can limit the services running. That suggests to me that
               | they don't end up needing them all the time to get things
               | done.
        
             | recroad wrote:
             | If you have 100 services in your org, I don't have to have
             | 100 running at the same time in your local dev machine. I
             | only run the 5 I need for the feature I'm working on.
        
               | Daishiman wrote:
               | I've been on this path and as soon as you work on a
               | couple of concurrent branches you end up having 20
               | containers in your machine and setting these up to run
               | successfully ends up being its own special PITA.
        
               | layer8 wrote:
               | What exactly are the problems created by having a larger
               | number of containers? Since you're mentioning branches,
               | these presumably don't have to all run concurrently, i.e,
               | you're not talking about resource limitations.
        
               | Daishiman wrote:
               | Large features can require changing protocols or altering
               | schemas in multiple services. Different workflows can
               | require different services, etc. Keep track of different
               | service versions in a couple branches (not unusual IMO)
               | and it just becomes messy.
        
               | layer8 wrote:
               | What does this have to do with running locally vs. on a
               | dev server? You have to properly manage versions in any
               | case.
        
               | adamdecaf wrote:
               | We have 100 Go services (with redpanda) and a few
               | databases in docker-compose on dev laptops. It works well
               | when we buy the biggest memory MacBooks available.
               | 
               | https://moov.io/blog/education/moovs-approach-to-setup-
               | and-t...
        
               | JasonSage wrote:
               | Your success with this strategy correlates more strongly
               | with 'Go' than '100 services' so it's more anecdotal than
               | generally-acceptable that you can run 100 services
               | locally without issues. Of course you can.
               | 
               | Buying the biggest MacBook available as a baseline
               | criteria for being able to run a stack locally with
               | Docker Compose does not exactly inspire confidence.
               | 
               | At my last company we switched our dev environment from
               | Docker Compose to Nix on those same MacBooks and CPU
               | usage went from 300% to <10% overnight.
        
               | ikety wrote:
               | Have any details on how you've implemented Nix? For my
               | personal projects I use nix without docker and the
               | results are great. However I was always fearful that nix
               | alone wouldn't quite scale as well as nix + docker for
               | complicated environments.
               | 
               | I've used the FROM SCRATCH strat with nix:
               | 
               | https://mitchellh.com/writing/nix-with-dockerfiles
               | 
               | Is that how you implemented it?
        
               | adamdecaf wrote:
               | Buying the biggest Macs also lets developers run an
               | electron app or three (Slack, IDE, Spotify, browser, etc)
               | while running the docker-compose stack.
        
               | JasonSage wrote:
               | You're right. My coworkers remarked that they could run
               | Slack and do screensharing while running the apps locally
               | when we removed docker-compose.
        
               | stealthybox wrote:
               | That's a huge win -- has your team written about or spoken
               | on this anywhere?
        
               | JasonSage wrote:
               | No but I'd be happy to (I maintained the docker-compose
               | stack, our CLI, and did the transition to Nix).
        
               | adamdecaf wrote:
               | I'd like to learn more about switching compose to nix. We
               | will hit a wall with compose at some point.
        
               | stealthybox wrote:
               | I'm curious about the # of svc's / stack / company / team
               | size -- if you have your own blog -- would love to read
               | it when you publish
               | 
               | could be a cool lightning talk (or part of something
               | longer)
               | 
               | maybe it's a good piece for https://nixinthewild.com/ ?
               | 
               | I'm @capileigh on twitter and hachyderm.io if you wanna
               | reach out separately -- here is good tho too
        
         | n0us wrote:
         | You're limited by the resources available to you on your local
         | laptop and when you close that laptop the dev environment stops
         | running. Remote dev environments are more costly and
         | complicated to maintain but they can be shared, can scale
         | vertically (or horizontally) on demand, can persist when you
         | exit them, and managing access to various internal services
         | from dev environments can in some cases be simpler.
         | 
         | It also centralizes dev environment management to the platform
         | team that owns them and provides them as a service which cuts
         | down on support tickets related to broken dev environments.
         | There are certainly some trade offs though and for most
         | companies a local VM or docker compose file will be a better
         | choice.
        
           | giido wrote:
           | Also tends to have security advantages to mitigate/manage dev
           | risks. Typically hosts will have security tooling installed
           | (AV, EDR, etc) that may not be installed on local VMs, hosts
           | are ephemeral so quickly created and destroyed, network
           | restrictions, etc.
        
           | underdeserver wrote:
           | Most local laptops are much stronger than is needed to run
           | the entire stack of your average startup with no resource
           | issues.
           | 
           | And the dev environment stops running when you close the
           | laptop, but you also don't need it since you're not
           | developing.
           | 
           | Not saying it can work for absolutely all cases but it's
           | definitely good enough for a lot of cases.
        
             | anthonypasq wrote:
             | ... this is an article about Stripe, not your average
             | startup
        
               | underdeserver wrote:
               | And yet the discussion can go beyond that.
        
           | crabbone wrote:
           | Not even once did I want to share my dev. environment, nor
           | did anyone want to share mine. We are talking about 25-odd
           | years of being a developer.
           | 
           | Never in my life did I want to scale my dev. environment
           | vertically or horizontally or in any other direction. Unless
           | you work on a calculator, I don't know why would you need
           | that.
           | 
           | I have no problems with my environment stopping when I close
           | my laptop. Why is this a problem for anyone?
           | 
           | For overwhelming majority of programming projects out there
           | they fit on a programmer's laptop just fine. The rare
           | exceptions are the projects which require very specialized
           | equipment not available to the developers. In any case, a
           | simulator would be usually a preferable way to dealing with
           | this, and the actual equipment would be only accessed for
           | testing, not for development. Definitely not as a routine
           | development process.
           | 
           | Never in my life did I want development process to be
           | centralized. All developers have different habits, tastes and
           | preferences. Last thing I want is to have centralized
           | management of all environments which would create unwanted
           | uniformity. I've been only once in a company that tried to
           | institute a centrally-managed development environment in the
           | way you describe, and I just couldn't cope with it. I quit
           | after a few months of misery. The most upsetting aspect about
           | these efforts is stupidity. These efforts solve no problems,
           | but add a lot of pain that is felt continuously, all the time
           | you have to do anything work-related.
        
             | otabdeveloper4 wrote:
             | > For overwhelming majority of programming projects out
             | there they fit on a programmer's laptop just fine.
             | 
             | What? No. You live in a very sheltered world, my friend.
        
             | marcosdumay wrote:
             | I get a serious feeling that interpreted languages,
             | monorepos, environment orchestration, snapshot ecosystem
             | aggregators, and per-function execution environments are all
             | pushing software development into the wrong direction.
             | 
             | Those things are not bad by themselves. But people tend to
             | do bad things with them, and those bad things spread
             | remarkably well, disrupting every place they infect.
        
         | bhuga wrote:
         | I work on this at Stripe. There's a lot of reasons:
         | 
         | * Local dev has laptop-based state that is hard to keep in sync
         | for everyone. Broken laptops are _really hard_ to debug as
         | opposed to cloud servers I can deploy dev management software
         | to. I can safely say the oldest version of software that's in
         | my cloud; the laptops skew across literally years of versions
         | of dev tools despite a talented corpeng team managing them.
         | 
         | * Our cloud servers have a lot more horsepower than a laptop,
         | which is important if a dev's current task involves multiple
         | services.
         | 
         | * With a server, I can get detailed telemetry out of how devs
         | work and what they actually wait on that help me understand
         | what to work on next; I have to have pretty invasive spyware on
         | laptops to do the same.
         | 
         | * Servers in our QA environment can interact with QA services
         | in a way that is hard for a laptop to do. Some of these are
         | "real services", others are incredibly important to dev itself,
         | such as bazel caches.
         | 
         | There's other things; this is an abbreviated list.
         | 
         | If a linux VM works for you, keep working! But we have not been
         | able to scale a thousands-of-devs experience on laptops.
        
           | aidos wrote:
           | I want to double check we're talking about the same thing
           | here. I'm referring to running everything inside a single VM
           | that you would have total access to. It could have telemetry,
           | you'd know versions etc. I wonder if there's some confusion
           | around what I'm suggesting given your points above.
           | 
           | I'm sure there are a bunch of things that make it the right
           | choice for Stripe. Obviously if you just have too many things
           | to run at a time and a dev laptop can't handle it then it's a
           | dealbreaker. What's the size of the cloud instances you have
           | to run on?
        
             | drited wrote:
             | I see in another comment thread you mentioned downloading
             | the VM iso, presumably from a central source. Your comment
             | in this thread didn't mention that so perhaps this answer
             | (incorrectly) assumes the VM you are talking about was
             | locally maintained/created?
        
             | bhuga wrote:
             | > I'm referring to running everything inside a single VM
             | that you would have total access to. It could have
             | telemetry, you'd know versions etc. I wonder if there's
             | some confusion around what I'm suggesting given your points
             | above.
             | 
             | I don't think there's confusion. I only have total access
             | when the VM is provisioned, but I need to update the dev
             | machine constantly.
             | 
             | Part of what makes a VM work well is that you can make
             | changes and they're sticky. Folks will edit stuff in /etc,
             | add dotfiles, add little cron jobs, build weird little SSH
             | tunnels, whatever. You say "I can know versions", but with
             | a VM, I can't! Devs will update stuff locally.
             | 
             | As the person who "deploys" the VM, I'm left in a weird
             | spot after you've made those changes. If I want to update
             | everyone's VM, I blow away your changes (and potentially
             | even the branches you're working on!). I can't update
             | anything on it without destroying it.
             | 
             | In contrast, the dev servers update constantly. There's a
             | dozen moving parts on them and most of them deploy several
             | times a day without downtime. There's a maximum host
             | lifetime and well-documented hooks for how to customize a
             | server when it's created, so it's clear how devs need to
             | work with them for their customizations and what the
             | expectations are.
             | 
             | I guess it's possible you could have a policy about when the
             | dev VM is reset and get developers used to it? But I think
             | that would be taking away a lot of the good parts of a VM
             | when looking at the tradeoffs.
             | 
             | > What's the size of the cloud instances you have to run
             | on?
             | 
             | We have a range of options devs can choose, but I don't
             | think any of them are smaller than a high-end laptop.
        
               | aidos wrote:
               | So the devs don't have the ability to ssh to your cloud
               | instances and change config? Other than the size issue,
               | I'm still not seeing the difference. Take your point on
               | it needing to start before you have control, but other
               | than that a VM on a dev machine is functionally the same
               | as one in a cloud environment.
               | 
               | In terms of needing to reset, it's just a matter of git
               | branch, push, reset, merge. In your world that sync
               | complexity happens all the time, in mine just on reset.
               | 
               | Just to be clear, I think it's interesting to have a
               | healthy discussion about this to see where the tradeoffs
               | are. Feels like the sort of thing where people try to
               | emulate you and buy themselves a bunch of complexity
               | where other options are reasonable.
               | 
               | I have no doubt Stripe does what makes sense for Stripe.
               | I'd also wager that on balance it's not the best option
               | for most other teams.
               | 
               | PS thanks for chiming in. I appreciate the extra insights
               | and context.
        
               | bhuga wrote:
               | > So the devs don't have the ability to ssh to your cloud
               | instances and change config?
               | 
               | They do, but I can see those changes if I'm helping
               | debug, and more importantly, we can set up the most
               | important parts of the dev processes as services that we
               | can update. We can't ssh into a VM on your laptop to do
               | that.
               | 
               | For example, if you start a service on a stripe machine,
               | you're sending an RPC to a dev-runner program that
               | allocates as many ports as are necessary, updates a local
               | envoy to make it routable, sets up a systemd unit to keep
               | it running, and so forth. If I need to update that
               | component, I just deploy it like anything else. If
               | someone configures their host until that dev runner
               | breaks, it fails a healthcheck and that's obvious to me
               | in a support role.
               | 
               | > Just to be clear, I think it's interesting to have a
               | healthy discussion about this to see where the tradeoffs
               | are. Feels like the sort of thing where people try to
               | emulate you and buy themselves a bunch of complexity
               | where other options are reasonable.
               | 
               | 100% Agree! I think we've got something pretty cool, but
               | this stuff is coming from a well-resourced team; keeping
               | the infra for it all running is larger than many
               | startups. There's tradeoffs involved: cost, user support,
               | flexibility on the dev side (i.e. it's harder to add
               | something to our servers than to test out a new kind of
               | database on your local VM) come immediately to mind, but
               | there are others.
               | 
               | There are startups doing lighter-weight, legacy-free
               | versions of what we're doing that are worth exploring for
               | organizations of any size. But remote dev isn't the right
               | call for every company!
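
A minimal sketch of the dev-runner flow described in the comment above, assuming a request that names a service, a command, and a port count. Every name here (`handle_start`, `ServiceStart`, the `PORT_n` variables) is hypothetical and is not Stripe's tooling. The Envoy routing update and the actual systemd installation are left out; the sketch only allocates free ports and renders the unit text a runner might install.

```python
import socket
from dataclasses import dataclass


@dataclass
class ServiceStart:
    """Result of a hypothetical 'start service' request."""
    name: str
    ports: list[int]
    unit_text: str


def allocate_ports(count: int) -> list[int]:
    """Ask the OS for `count` distinct, currently free TCP ports on localhost."""
    sockets = []
    try:
        for _ in range(count):
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            s.bind(("127.0.0.1", 0))  # port 0 lets the kernel pick a free port
            sockets.append(s)
        # All sockets are still open here, so the ports are distinct.
        return [s.getsockname()[1] for s in sockets]
    finally:
        # Ports are released on close; a real runner would have to guard
        # against another process grabbing them before the service binds.
        for s in sockets:
            s.close()


def render_unit(name: str, command: str, ports: list[int]) -> str:
    """Render systemd unit text that restarts the service if it exits."""
    env = "\n".join(f"Environment=PORT_{i}={p}" for i, p in enumerate(ports))
    return (
        "[Unit]\n"
        f"Description=dev service {name}\n\n"
        "[Service]\n"
        f"{env}\n"
        f"ExecStart={command}\n"
        "Restart=always\n"
    )


def handle_start(name: str, command: str, port_count: int) -> ServiceStart:
    """Handle one start request: allocate ports, then render the unit."""
    ports = allocate_ports(port_count)
    return ServiceStart(name, ports, render_unit(name, command, ports))
```

Because the logic sits behind one entry point, whoever owns the runner can redeploy it without touching each developer's setup, which is the property the comment emphasizes.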
        
               | aidos wrote:
               | Ah! So that's a spot where we're talking past each other.
               | 
               | I'd anticipate you would be equally as able to ssh to VMs
               | on dev laptops. That's definitely a prerequisite for
               | making this work in the same way as you're currently
               | doing.
               | 
               | The only difference between what you do and what I'm
               | suggesting is the _location_ of the VM. That itself
               | creates some tradeoffs but I would expect absolutely
               | everything inside the machine to be the same.
        
               | bhuga wrote:
               | > I'd anticipate you would be equally as able to ssh to
               | VMs on dev laptops. That's definitely a prerequisite for
               | making this work in the same way as you're currently
               | doing.
               | 
               | Our laptops don't receive connections, but even if they
               | could, folks go on leave and turn them off for 9 months
               | at a time, or they don't get updated for whatever reason,
               | or other nutty stuff.
               | 
               | It's surprisingly common with a few thousand of them out
               | there that laptop management code that removes old
               | versions of a tool is itself removed after months, but
               | laptops still pop up with the old version as folks turn
               | them back on after a very long time, and the old tool
               | lingers. The services the tools interact with have long
               | since stopped working with the old version, and the
               | laptop behaves in unpredictable ways.
               | 
               | This doesn't just apply to hypothetical VMs, but various
               | CLI tools that we deploy to laptops, and we still have
               | trouble there. The VMs are just one example, but a
               | guiding principle for us has been that the less that's on the
               | laptop, the more control we have, and thus the better we
               | can support users with issues.
        
           | hibikir wrote:
           | To provide historical context, 10 years ago there was a local
           | dev infrastructure, but it was already so creaky as to be
           | unreliable. Just getting the ruby dependencies updated was a
           | problem. The local dev was also already cheating: All the
           | asynchronous work that was triggered via RabbitMQ/Kafka was
           | getting hacked together, because trying to run everything
           | that Infra/Queues did locally would have been very wasteful.
           | So magic occurred in the calls to the message queue that
           | instead triggered the crucial ruby code that would be hit in
           | the end.
           | 
           | So if this was a problem back then, when the company had less
           | than 1000 employees, I can't even imagine how hard it would
           | be to get local dev working now
        
           | underdeserver wrote:
           | The way these problems are stated might make it seem like
           | they're unsolvable without a lot of effort. I just want to
           | point out that I've worked at places that do use a local,
           | supported environment, and it works well.
           | 
           | Not saying it's the wrong choice for you, but it's a choice,
           | not a natural conclusion.
        
         | crabbone wrote:
         | Working in a configuration where your development environment
         | isn't on your computer is always a huge downgrade. Work with
         | VM? -- sooner or later you'll have problems with forwarding
         | your keyboard input to the VM. Work with containers? -- no good
         | way to save state, no good way to guarantee all containers are
         | in sync etc. God forbid any sort of Web browser-based solution.
         | The number of times I accidentally closed the tab or did
         | something else unintentionally because of key mapping that's
         | impossible to modify...
         | 
         | However, in some situations you must endure the pain of doing
         | this. For example, regulatory reasons. Some organizations will
         | not allow you to access their data anywhere but on some cloud
         | VM they give you very botched and very limited control over.
         | While, technically, these are usually easy to side-step, you
         | are legally _required_ to not move the data outside of the
         | boundaries defined for you by the IT. And so you are stuck in
         | this miserable situation, trying to engineer some semblance of
         | a decent utility set in a hostile environment.
         | 
         | Another example is when the infrastructure of your project is
         | too vast to be meaningfully reduced to your laptop, and a lot
         | of your work is exploratory in nature. I.e. instead of typical
         | write-compile-upload-test you are mostly modifying stuff on the
         | system you are working on to see how it responds. This is kind
         | of how my day-to-day goes: someone reported they fail to
         | install or use one of the utilities we provide in a particular
         | AWS region with some specific network settings etc. They'd give
         | me a tunnel to the affected cluster, and I'd have some hours to
         | spend there investigating the problem and looking for possible
         | immediate and long-term solutions. So, you are essentially
         | working in a tech-support role, but you also have to write
         | code, debug it, sometimes compile it etc.
        
           | aidos wrote:
           | Sounds like you're talking about something else (more like
           | the Citrix / virtual desktop type model - I don't know the
           | name).
           | 
           | The idea here is that you use a VM (cloud or local) to run
           | your compute. Most people can run it in the background
           | without explicitly connecting to it.
        
         | simonw wrote:
         | In my opinion the single most important feature of any
         | development environment is a reliable "reset" button.
         | 
         | The amount of time companies lose to broken development
         | environments is incredible. A developer can easily lose half a
         | day (or more) of productive time.
         | 
         | With cloud environments it's much easier to offer a "just give
         | me a brand new environment that works" button somewhere. That's
         | incredibly valuable.
        
           | aidos wrote:
           | For sure, _but_, a VM has that feature too. They have to run
           | _some_ services directly on the laptop to handle the code
           | syncing. So if you accept a certain amount of "need to do
           | some dev machine setup" as a cost, installing Parallels and
           | running a script to download an iso is a pretty small surface
           | area that allows for a full reset.
           | 
           | I don't doubt that Stripe have a setup that works well for
           | them, but I also bet they could have gone down a
           | different path that also worked well _and_ I suspect that
           | other path (local VMs) is a better fit for most other smaller
           | teams.
        
             | sam_perez wrote:
             | To be fair, it seems like the cloud development environment
             | choice was driven by the scale of Stripe's organization.
        
         | dheera wrote:
         | > By running a Linux VM
         | 
         | Or just run Linux on your local machine as the OS. I don't get
         | the obsession with Macs as dev workstations for companies whose
         | products run on Linux.
        
           | philwelch wrote:
           | Especially when they don't even deploy to ARM servers.
        
           | uncanneyvalley wrote:
           | The year of Linux on the laptop has yet to arrive for most of
           | us. Windows and MacOS both offer better battery life, if for
           | no other reason (and there are usually other reasons, like
           | suspend/wake issues, graphics driver woes, etc.)
        
         | trevor-e wrote:
         | From what I remember (left Stripe in late 2022) much of
         | Stripe's codebase was/is a Ruby tangled "big ball of mud"
         | monorepo due to lack of proper modules. Basically a lot of the
         | core modules all imported code from each other with little
         | layering so you couldn't deploy a lean service without pulling
         | in almost all of the monorepo code. And due to the way imports
         | worked it would load a ton of this code at runtime. This meant
         | that even a simple service would have extremely high memory
         | usage and be unsuitable for a local dev environment where you
         | have N of these bloated services running at the same time.
         | There was a big refactoring effort to get "strict modules" in
         | place to cut down on this bloat which had some promising
         | results. I'm not an expert in this area but I believe this was
         | the gist of it.
        
       | mleo wrote:
       | I use syncthing to manage the synchronization of files between
       | local laptop and remote development server. The software code
       | base is upwards of 20 years and has dependencies on Windows for
       | runtime. I can run unit tests locally on very fast MacBook Pro or
       | run it much slower on Windows VM. With syncthing I can easily
       | edit files locally or remotely and they are available locally for
       | source control.
       | 
       | The worst problem is refining the ignore settings to ensure only
       | code is synced preventing conflicts on derivative files and that
       | some rule doesn't overlap code file names.
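
A minimal sketch of one way to check the ignore-rule overlap described above: run a candidate pattern list against known paths and flag any source file a rule would swallow. It uses Python's `fnmatchcase` as a stand-in and does not implement Syncthing's actual ignore syntax; the patterns, suffixes, and function names are illustrative assumptions.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Hypothetical rules for derived files that should not be synced.
IGNORE_PATTERNS = ["*.obj", "*.pdb", "bin/*", "build/*"]

# Hypothetical suffixes that count as source code in this project.
SOURCE_SUFFIXES = {".cs", ".py", ".sql"}


def is_ignored(path: str, patterns=IGNORE_PATTERNS) -> bool:
    """True if the full path or its file name matches any ignore pattern."""
    name = PurePosixPath(path).name
    return any(
        fnmatchcase(path, pattern) or fnmatchcase(name, pattern)
        for pattern in patterns
    )


def overlapping_sources(paths, patterns=IGNORE_PATTERNS) -> list[str]:
    """Return source files that an ignore rule would accidentally exclude."""
    return [
        path
        for path in paths
        if is_ignored(path, patterns)
        and PurePosixPath(path).suffix in SOURCE_SUFFIXES
    ]
```

Running `overlapping_sources` over the repository's tracked files before changing the rules would surface a pattern like `bin/*` that also matches a script kept under `bin/`.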
        
         | nxicvyvy wrote:
         | Try unison, it's built for this use case.
         | 
         | https://www.cis.upenn.edu/~bcpierce/unison/
        
           | shepherdjerred wrote:
           | I like Unison, though I found Mutagen a bit better.
           | 
           | https://mutagen.io/
        
       | mootoday wrote:
       | I've worked with remote dev environments for many years,
       | including some time with one of the providers of such a service.
       | 
       | It became clear to me that cloud-only is not the way to go, but
       | instead a local-first, cloud-optional approach.
       | 
       | https://mootoday.com/blog/dev-environments-in-the-cloud-are-...
        
         | numbsafari wrote:
         | This is my biggest complaint with GitHub CodeSpaces.
         | 
         | I should be able to launch a local VM using the GitHub Desktop
         | App just as easily as I can an Azure-hosted instance.
        
           | ParetoOptimal wrote:
           | But then how will they lock you into paying them monthly?
        
             | numbsafari wrote:
             | They're just forcing me to stop paying them altogether.
        
       | truetraveller wrote:
       | "I've described a lot of fairly-involved custom tooling; we
       | needed enough engineers to build and maintain it, and enough
       | "customer" engineers for that investment to pay off."
       | 
       | This is so important when deciding to re-invent the wheel. I've
       | gotten bitten by this many times.
        
       | p-o wrote:
       | It's always so enlightening to have articles like this one shed
       | light on how companies at scale operate. It goes without saying
       | that many of the problems Stripe faced with their monorepo aren't
       | applicable to smaller businesses, but there are still bits and
       | pieces that are applicable to many of us.
       | 
       | I've been working on an ephemeral/preview environment operator
       | for Kubernetes (https://github.com/pier-oliviert/sequencer), and I
       | could agree with a lot of what OP said.
       | 
       | I think dev boxes are really the way to go, especially with all the
       | components that make up an application nowadays. But the
       | latency/synchronization issue is a hard topic and it's full of
       | tradeoffs.
       | 
       | A developer's laptop always ends up being a bespoke environment
       | (yes, Nix/Docker can help with that), and so, there's always a
       | confidence boost when you get your changes up on a standalone
       | environment. It gives you the proof that "hey things are working
       | like I expected them to".
        
         | draw_down wrote:
         | Right, dev boxes do not need to do double duty as a personal
         | computer plus development target, which allows them to more
         | closely resemble the machine your code will actually run on.
         | They also can be replaced easily, which can be helpful if you
         | ever suspect something is wrong with the box itself - if the
         | new one acts the same way, it wasn't the dev box.
         | 
         | I don't recall latency being a big problem in practice. In an
         | organization like this, it's best to keep branches up to date
         | with respect to master anyway, so the diffs from switching
         | between branches should be small. There was a lot of work done
         | to make all this quite performant and nice to use. The slowest
         | part was always CI.
        
           | tmpz22 wrote:
           | I feel like we're not getting the right lessons from this. It
           | feels like we're focusing on HOW we can do something versus
           | pausing for a brief moment to consider if we SHOULD in the
           | first place.
           | 
           | To me the root issue is the complexity of production
           | environments has expanded to the point of impacting
           | complexity in developer environments just to deploy or test -
           | this is in conjunction with expanding complexity of developer
           | environments just to develop - e.g. webpack.
           | 
           | For very large well resourced organizations like Stripe that
           | actually operate at scale that complexity may very well be
           | unavoidable. But most organizations are not Stripe. They
           | should consider decreasing complexity instead of investing in
           | complex tooling to wrangle it.
           | 
           | I'd go as far as to suggest both monorepos and dev-boxes are
           | complex toolchains that many organizations should consider
           | avoiding.
        
             | epinephrinios wrote:
             | Absolutely, I worked at tech behemoths and smaller
             | companies. The dev experience was significantly better when
             | all development was local. I even worked on initiatives to
             | move development _away_ from the cloud, and although other
             | devs were skeptical, they ended up loving it.
        
             | Eridrus wrote:
             | I think we don't have good solutions for scaling down prod.
             | 
             | Our relatively simple prod architecture has 5 containers &
             | a hosted database (so 6 containers when run locally), and
             | any less would impact our product goals.
             | 
             | I still find running prod locally valuable, and is the most
             | common way anyone does development here, but containers are
             | fairly heavyweight when you want to run everything on one
             | machine. It's also impossible if you have parts that need
             | special accelerators to get good latency, etc.
             | 
             | If you're willing to build everything from scratch, you can
             | have a framework that seamlessly lets you build conceptual
             | services and then separate the physical deployment
             | concerns, like Google has and sometimes even uses. But for
             | the rest of us where we're cobbling together a bunch of
             | different technologies, that's a luxury we can't really
             | afford.
        
             | jrochkind1 wrote:
             | > I'd go as far as to suggest both monorepos and dev-boxes
             | are complex toolchains that many organizations should
             | consider avoiding.
             | 
             | I'm not sure "monorepo" means the same thing to you as it
             | does to me? To me, it just means "keep all the code in one
             | repo, instead of trying to split things up into different
             | repos."
             | 
             | To me, it's the thing that _is_ the simple solution, it
             | just means "a repo" -- the reason it gets a name is
             | because it's _unusual_ for large orgs with enormous
             | codebases to have everything in one repo, it's unusual for
             | them to do the simple thing that works fine for a small org
             | with a normal codebase.
             | 
             | What is it you're suggesting a simple organization should
             | do instead of a "monorepo"?
        
               | ahtihn wrote:
               | How is a mono repo the simple solution compared to one
               | repo per independently releasable component ?
               | 
               | All the tooling is much easier to use when each
               | application has its own repo.
        
               | jrochkind1 wrote:
               | The argument would be that for a simple organization,
               | dividing things into independently releasable components
               | is less simple than just having one app. I think that's
               | what most simple organizations do, no? Why do you need
               | the complexity of independently releasable components for
               | your simple organization? Now you have to track
               | compatibility between things, ensure what version of what
               | independently releasable thing works with what version of
               | what independently releasable other thing, isn't that
               | added complexity? Why not just have one application,
               | isn't that simpler? You don't need to worry about
               | incompatibilities between your separately releasable
               | things -- every commit that passes CI on your single repo
               | means all the parts are compatible (sans untested bugs).
               | 
               | Usually it stops being "simpler" at a level of
               | organizational complexity or code size where it becomes a
               | mess. The "monorepo" is the attempt to do what everyone
               | was just doing anyway for simple orgs with simple
               | codebases, but keep doing it at huge sizes.
        
               | zten wrote:
               | If you're living in the same dysfunctional world I am,
               | then maybe your organization split things into repos that
               | are separately releasable, but are conceptually so
               | strongly coupled that you now need to create changes on 3
               | repos to make a change.
        
               | tmpz22 wrote:
               | > To me, it just means "keep all the code in one repo,
               | instead of trying to split things up into different
               | repos."
               | 
               | To me, and perhaps more from a Devops-like perspective,
               | mono repo means "one repo many _diverse_ deployment
               | environments and artifacts often across multiple
               | programming languages".
               | 
               | I'm advocating against the Google/Stripe situation of a
               | singular massive repo with complex build tools to make it
               | function - like Bazel. I think sometimes _small_
               | organizations get lured by ego and bad cost/benefit
               | analysis into implementing such an architecture and it
               | can tank entire product orgs in my experience (obviously
               | not for Stripe, Google, etc.).
        
         | hamandcheese wrote:
         | My main gripe with the dev box approach is that a cloud
         | instance with similar compute resources as a developer's MacBook
         | is hella expensive. Even ignoring compute, a 1TB ebs volume
         | with equivalent performance to a MacBook will probably cost
         | more than the MacBook every month.
        
           | axus wrote:
           | The article didn't actually say what "Stripe's cloud
           | environment" was, besides "outside of the production
           | environment". I assumed the company had their own hardware
           | but your assumption is more probable.
        
           | MetaWhirledPeas wrote:
           | Wouldn't this be a reasonable alternative? Asking because I
           | don't have experience with this.
           | 
           | 1. New shared builds update container images for applications
           | that comprise the environment
           | 
           | 2. Rather than a "devbox", devs use something like Docker
           | Compose to utilize the images locally. Presumably this would
           | be configured identically to the proposed devbox, except with
           | something like a volume pointing to local code.
           | 
           | I'm interested in learning more about this. It seems like a
           | way to get things done locally without involving too many
           | cloud services. Is this how most people do it?
        
             | DiggyJohnson wrote:
             | I manage a dev environment for a small, inexperienced (but
             | eager) team and I have a similar setup. I'll do a write up
             | at some point if I have time. It can work, and does for me,
             | but there are some funny consequences: you can end up
             | mediating the relationship between a developer's computer
             | and his code, which is a terrible place to be.
        
       | secondcoming wrote:
       | What's the easiest way of sharing things like protobuf
       | definitions across multiple separate repos and making sure things
       | are always in sync?
        
         | MrDarcy wrote:
         | buf.build
        
       | crabbone wrote:
       | NB. What the article describes isn't a developer environment in
       | the cloud. It's _testing_ in the cloud. The editor in their model
       | lives on the programmers' laptops, the editing happens there as
       | well and so on. The code is deployed to cloud infrastructure for
       | testing.
        
       | physicsguy wrote:
       | I think for smaller companies, you can get a long way towards a
       | lot of this with judicious use of docker-compose, and convenience
       | scripts in a Makefile. As long as you don't do anything stupid
       | like try and spin up 100 services when you're a team of 8, most
       | laptops these days are sufficiently capable of handling a
       | database, Redis, your codebase, and something like LocalStack.
        
         | PedroBatista wrote:
         | I would say you can even go a looong way without any Docker at
         | all.
         | 
         | And for the large majority of the companies/projects, if your
         | project is so complex and heavy of resources that it doesn't
         | fit on a modern laptop, the problem is not in the laptop, it's
         | in the whole project and the culture and cargo-cult around
         | "modern" software development.
        
           | vlovich123 wrote:
           | Containers/VMs are a nice way to isolate away any machine
           | configuration discrepancies. Conversely it does encourage the
           | use of non hermetic and deterministic build systems which
           | come with other issues too (eg speed differences surfacing
           | race conditions in the build)
        
           | elktown wrote:
           | - "A single-binary app behind a load-balancer might scale to
           | far beyond our needs, but the promotion/resume trade-off
           | can't be justified."
        
       | adamdecaf wrote:
       | We've been using a hundred repositories and a hundred Go services
       | in a local docker-compose setup that's worked fairly well. CI
       | runners can struggle if their disks can't keep up with Docker.
       | 
       | It comes up that we should make a devprod for front end folks to
       | make the backend abstracted more.
       | 
       | Overall a lot of people prefer local dev because it gives them
       | access to the entire stack, lets them run branch images easier,
       | and has better performance than remote boxes.
       | 
       | https://moov.io/blog/education/moovs-approach-to-setup-and-t...
        
       | prasoonds wrote:
       | I wonder if there's a devbox-as-a-service tool out there. I use a
       | MacBook Air for most of my work and on occasion would be
       | benefited by using a beefier machine in the cloud. I just don't
       | want to set up a machine, set up sync etc.
        
         | metachris wrote:
         | You could just rent a beefy server for like $40/month at
         | hetzner or OVH and use VS Code with the remote development
         | extension.
        
       | stealthybox wrote:
       | This is an awesome writeup of the tools and culture issues you
       | run into maintaining dev environments.
       | 
       | From the post, the problems that justified central dev boxes are
       | roughly:
       | 1. dependency / config mgmt / env drift on laptops
       | 2. collaboration / debugging between engineers
       | 3. compute scaling + optimization
       | 4. supporting devs with updates and infra changes
       | 
       | The last one is particularly interesting to me, because
       | supporting the dev env is a separate engineering role/task that
       | starts small and grows into teams of engineers supporting the
       | environment.
       | 
       | I'm helping build Flox. We're working on these pain points by
       | making environments (deps, vars, services, and builds) workable
       | across all kinds of Mac/Linux laptops and servers.
       | 1) a. Virtualize the pkg manager per-project
       |    b. Nix packages can install across OS/arch pretty well
       | 2) Imperative actions like `flox install`/`upgrade` always edit a
       |    declarative env manifest.toml -- share it via git
       | 3) fewer Docker VMs -- get more out of devteam MacBooks
       | 4) reduce toil with versioned, shareable envs --> less sending
       |    ad-hoc config and brew commands to people (as mentioned in the
       |    post.) Just `git pull && flox activate`.
       | 
       | I think on problem point #2, collab tools are advancing to where
       | pairing on features, bugs, and env issues can be done without
       | central SSH. (ex: tmate, vscode liveshare, screensharing, etc) --
       | however, that does sort of fall apart on laptops for async
       | debugging of env issues (ex: when devprod is in the US, and eng
       | is in London). Having universal telemetry on ephemeral cloud dev-
       | boxes with a registry and all of the other DNS and SSH goodies
       | could be the kind of infra to aspire to as your small teams run
       | into more big-team problems.
       | 
       | In the Stripe anecdote, adopting the centralized infra created
       | new challenges that their devprod teams were dedicated to
       | supporting:
       | - international latency from central, US-based VMs
       | - syncing code to the dev boxes
       |   (https://facebook.github.io/watchman/)
       | - linting, formatting, generating configs (run it locally or
       |   serverside?)
       | - a dev workflow CLI tool dedicated to dev-box workflows and
       |   sync'ing with watchman's clock
       | - IaaS, registry, config, glue for all the servers
       | 
       | This is all very non-trivial work, but maybe there's a future
       | where people can win some portability with Flox when they are
       | small and grow into those new challenges when it's truly needed
       | -- now their laptop environments just get a quick `flox activate`
       | on some new, shiny servers or Cloud IDE's.
       | 
       | I really like the notes from the author on how using Language
       | Server Protocol across a high latency link has great
       | optimizations that work alongside the watchman sync for
       | real-time code editing.
        
       | vfclists wrote:
       | How does a payment service wind up with over a 1000 engineers?
       | 
       | I understand that "engineers" may not mean "developers", it could
       | be DevOps, site reliability and all the bits and pieces that make
       | up a large service provider, but over a 1000?
       | 
       | Can someone please enlighten me?
        
         | itsjustjordan wrote:
         | Surely in 2024 we can't be classifying Stripe as just "a
         | payment service"
        
       | ronef wrote:
       | I love this. I believe I might have even interfaced with your
       | team around that time. I was leading Facebook's (now Meta)
       | Developer Products team and we were building against super
       | similar areas internally.
       | 
       | We ran back then a similar project that I coined "Developer
       | On-Demand" to tackle that same problem space. It's also what
       | eventually led me to find the magics of Nix and then build Flox.
       | 
       | I also agree with a lot of what was shared in other comments,
       | while the problems we tackled at large orgs such as Facebook,
       | Shopify, Uber, Google (to name a few teams I remember working
       | with) and obviously also Stripe, certain areas of the pain are
       | 100% universal regardless of team size.
       | 
       | On the Flox side, we're trying to help with a few of them today
       | and many more hopefully in the near future, very open for
       | thoughts! Things like - simple to use Nix for each of your
       | projects + keep deps and config up to date across everyone's
       | Macbooks and Linux boxes, etc -- even if you don't have a full
       | AWS team and Language Server team ready to support.
        
       ___________________________________________________________________
       (page generated 2024-08-19 23:01 UTC)