[HN Gopher] Bento: Jupyter Notebooks at Meta
       ___________________________________________________________________
        
       Bento: Jupyter Notebooks at Meta
        
       Author : Maro
       Score  : 147 points
       Date   : 2024-09-18 14:30 UTC (8 hours ago)
        
 (HTM) web link (engineering.fb.com)
 (TXT) w3m dump (engineering.fb.com)
        
       | web3aj wrote:
       | The internal tools at Meta are incredible tbh. There's an
       | ecosystem of well-designed internal tools that talk to each
       | other. That was my favorite part of working there.
        
         | Random_BSD_Geek wrote:
         | Polar opposite of my experience. To achieve the technical
         | equivalent of changing a lightbulb, spend the entire day
         | wrangling a dozen tools which are broken in different ways,
         | maintained by teams that no longer exist or have completely
         | rolled over, only to arrive at the finish line and discover we
         | don't use those lightbulbs anymore. Move things and break fast.
        
           | extr wrote:
           | Yeah 100%. I found it immensely frustrating to be using tools
           | with no community (except internally), so-so documentation,
           | and features that were clearly broken in a way that would be
           | unacceptable for a regular consumer product. If you have a
           | question or error not covered by an internal search or
           | documentation, good luck, you'll need it. Literally part of
           | the reason I left the company.
        
             | zer0zzz wrote:
             | Agreed. I often get my work done using open source build
             | instructions and tools and then when everything works I
             | port it to internal infra. Other people are the opposite
             | though, which for open source based code bases has a nasty
             | side effect of the work having no upstreamable tests!
        
             | landedgentry wrote:
             | Well, you're supposed to read the code and figure it out.
             | And if you can't, you're not good enough an engineer.
             | According to people at Meta.
        
               | extr wrote:
               | People probably think you're exaggerating but it's true.
               | Sometimes when I would get blocked the suggestion was to
               | "read the source code" or "submit a fix" on some far
               | flung internal project. Huge fucking waste of time and
               | effort, completely unserious.
        
               | moandcompany wrote:
               | Same as Google. Many internal tools have painful
               | interfaces and poor documentation because the hiring
               | bar was high and it was acceptable to assume that the
               | user's skill level is high enough to figure it out. That
               | attitude becomes a bigger problem when trying to sell
               | tools to the public (e.g. Google Cloud Platform).
        
               | yodsanklai wrote:
               | As an outsider, I was always under the impression that
               | Google had a tradition of engineering excellence (robust
               | tools, clean and well tested code following strict
               | guidelines), while Meta has more of a Hacker culture
               | (move fast and break things).
        
           | loeg wrote:
           | IMO there's a mix of a few really good, widely used, well-
           | supported tools as well as a long tail of random tiny tools
           | where the original team is gone that are cruftier.
        
           | uuddlrlrbaba wrote:
           | Mmm breakfast
        
             | grantsucceeded wrote:
             | haha the reason I stayed as long as i did
        
           | bozhark wrote:
           | Move Smooth and Fix Things (tm) is our nonprofit
           | corporation's version of this atrocious motto.
        
         | crabbone wrote:
         | A friend of mine is doing his PhD while being an intern at
         | Meta. He does _not_ share your excitement... at all. To
         | summarize his complaints: a framework written a long while ago
         | with design flaws that were cast in stone, that requires
         | exorbitant effort to accomplish simple things (under the
         | pretense of global integration that usually isn't needed, but
         | even if it was needed, would still not work).
        
           | slt2021 wrote:
           | how else can you build empire as Engineering Manager and get
           | promo?
           | 
           | fork open source, then demand resources to maintain this
           | monster.
           | 
           | easiest promotion + job security.
           | 
           | its even called "Platform Engineering" these days
        
         | jchonphoenix wrote:
         | Meta tools are best in class when the requirement is scale, or
         | when the external tools haven't matured yet
        
         | JohnMakin wrote:
         | One of the crazier things a L4 meta colleague of mine told me,
         | that I still don't believe entirely, is that meta pretty much
         | has their own fork of _everything_, even tools like git. is
         | this true?
        
           | sdenton4 wrote:
           | It wouldn't be terribly surprising. Forking everything
           | provides a liiiitle bit of protection against things like the
           | 'left pad' incident.
        
           | tqi wrote:
           | Facebook actually doesn't use git, they use mercurial
           | (https://graphite.dev/blog/why-facebook-doesnt-use-git).
           | 
           | That decision is also illustrative of why they end up forking
           | most things - Facebook's usage patterns are at the far extreme
           | end for almost any tool, and things that are non-issues with
           | fewer engineers or a smaller codebase become complete
           | blockers.
        
             | kridsdale3 wrote:
             | Yes when I used to talk about this to interviewees, I
             | described that every tool people commonly use is somewhere
             | on the Big-O curves for scaling. Most of the time we don't
             | really care if a tool is O(n) or O(10 n) or whatever.
             | 
             | At Meta, N tends to be hundreds of billions to hundreds of
             | trillions.
             | 
             | So your algorithm REALLY matters. And git has a Big-O that
             | is worse than Mercurial, so we had to switch.
        
             | LarsDu88 wrote:
             | They use sapling. An in-house clone of mercurial that was
             | open sourced 2 years ago
        
           | jamra wrote:
           | Meta doesn't use git. It uses mercurial. It does fork it
           | because they have a huge monorepo. They created a concept of
           | stacked commits which is a way of not having branches. Each
           | commit is in a stack and then merged into master. Lots of
           | things built for scaling.
        
           | ipsum2 wrote:
           | Yep. Zeus is a fork of Zookeeper, Hack is a fork of PHP, etc.
           | It's usually needed to make it work with the internal
           | environment.
           | 
           | The few things that don't have forks are usually the open
           | source projects like React or PyTorch, but even those have
           | some custom features added to make it work with FB internals.
        
             | gcr wrote:
             | This is also how things work at Google.
             | 
             | Google also maintains a monorepo with "forks" of all
             | software that they use. History diverges, but is
             | occasionally synchronized for things like security updates
             | etc.
        
               | zhengyi13 wrote:
               | Am I completely off-base/confused thinking that the GFE
               | originally started life (like back under csilver) as a
               | fork of boa[0]?
               | 
               | [0]: http://www.boa.org/
        
               | lacker wrote:
               | I thought it was GWS that originally started as a fork of
               | boa.
        
             | grantsucceeded wrote:
             | Few companies experienced the explosive growth fb did,
             | though many will claim to have done so. Hack made the
             | existing codebase of php scale to insane levels while
             | reaching escape velocity for the overall company to even
             | attempt to transition away or shrink the php codebase, as i
             | recall (i was an SRE, not a dev)
             | 
             | zeus likewise.
        
               | ipsum2 wrote:
               | You worked at FB, but you call yourself an SRE, not a PE?
               | ;)
        
               | fragmede wrote:
               | You still call it Facebook?
        
         | Qshdg wrote:
         | Looking at some of the bureaucracy in their open source
         | projects, I'd say that they need less tooling and more
         | thinking. These tools help to keep spaghetti code bases from
         | imploding totally.
        
         | moandcompany wrote:
         | My opinion: Many Meta tools and processes seem like they were
         | created by former Googlers that sought to recreate something
         | they previously had at Google, during the Google->FB Exodus,
         | but also changed aspects of the tool that were annoying or
         | diverged from their needs. This is not a bad thing.
         | 
         | Since Bento doesn't appear to be usable by the public,
         | a parallel version that lets people get a feel for cross-tool
         | integration would be Google's Colaboratory / Colab
         | notebooks (https://colab.research.google.com/) that have many
         | baked-in integrations driven by actual internal use (i.e.
         | dogfooding).
        
           | kridsdale3 wrote:
           | As someone from both, I confirm/support your opinion 100%.
        
         | baggiponte wrote:
         | Uuuh can you tell a bit more about wasabi, the Python LSP? Saw
         | a post years ago and been eager to see whether it'd be open
         | sourced (or why it wouldn't).
        
       | fauria wrote:
       | Can this be downloaded somewhere?
       | 
       | Couldn't find any link in the open source site:
       | https://opensource.fb.com/ nor the ELI5:
       | https://developers.facebook.com/blog/post/2021/09/20/eli5-be...
        
         | michaelmior wrote:
         | I don't believe Bento has been open-sourced.
        
           | make3 wrote:
           | interesting that they make external articles about it
        
             | rovr138 wrote:
             | "Oh that's cool.", "It'd be interesting to work on problems
             | like that.", "That's a neat solution"
             | 
             | If anyone's on the fence about applying, that could be
             | enough to nudge them in the direction. If anyone's worked
             | in similar areas, could be worth applying and looking at
             | the team, etc.
        
               | michaelmior wrote:
               | Totally agree, although odd that the post was tagged as
               | "open source."
        
               | tqi wrote:
               | I think that's because it's based on an open source
               | project
        
         | tqi wrote:
         | TBH the value of bento over other notebook offerings was almost
         | entirely how well it plays with the rest of the data and infra
         | stack within facebook. It was super easy to go from raw data
         | (entire DE and DI orgs responsible for ETL and cluster
         | maintenance) to a cleaned up table (usually built by DEs) to an
         | ad hoc table to support a specific use-case that could then be
         | accessed via bento, analyzed, and then published / shared to
         | anyone in the company.
        
         | jamra wrote:
         | If you use jupyterlite, you're using the same thing. Bento is
         | just the internal Meta version and the only potential benefit
         | is the internal integration.
        
         | ipsum2 wrote:
         | Probably not. It's written in Hack, and heavily tied to
         | internal frameworks, so it'll be practically impossible to
         | extract into a standalone package, unless they do a "clean
         | room" implementation (like they did for Sapling UI
         | https://sapling-scm.com/docs/addons/isl/).
         | 
         | But it has some cool features that notebook developers can take
         | inspiration from.
        
       | sk11001 wrote:
       | I kind of love Meta for all the seemingly unnecessary internal
       | stuff they do. They have so many projects that are absolutely not
       | critical for them, maybe not even net positive, but they spend
       | who knows how many hours building and maintaining them.
        
         | apwell23 wrote:
         | > Meta for all the seemingly unnecessary internal stuff they
         | do.
         | 
         | Netflix would like to have a word.
        
           | Narhem wrote:
           | Netflix's situation is caused by their business model.
        
             | fwip wrote:
             | Is it? It seems like 90% of what Netflix is (from a
             | technical PoV), is a CDN + video playback. There's a lot
             | more value in the content library they've negotiated and
             | the business agreements with ISPs than there is in the
             | software stack.
             | 
             | Apologies if this response is delayed, 6 posts today is
             | "too fast."
        
               | rNULLED wrote:
               | Netflix now builds many of the video production tools
               | they need to produce their own content. This now includes
               | games as well.
        
               | apwell23 wrote:
               | sure but i was alluding to stuff like this
               | 
               | https://netflixtechblog.com/maestro-netflixs-workflow-
               | orches...
        
               | scottyah wrote:
               | I'm not sure anyone has access to the real data, but I've
               | had a suspicion that Netflix is able to remain a lot more
               | profitable due to their superior tech. Cloud hosting and
               | streaming (not to mention labor) can get very expensive,
               | and I think while it's easier to set up nowadays (in
               | comparison to when they started) a lot of the other
               | companies are burning cash to try to keep up. HBO Max
               | (just Max now?) has always had poor streaming quality
               | compared to netflix and I imagine they're paying a lot
               | more for it.
        
         | bbor wrote:
         | Internal startups have the same value proposition as external
         | ones, I think; most fail, but every once in a while you hit a
         | React or a Gmail.
        
       | big-chungus4 wrote:
       | can I, a mere mortal, use it?
        
       | kyrrewk wrote:
       | this is cool! wish there was a commercial product that did this.
       | marimo does something similar, but you have to do the deployment
       | yourself
        
         | mscolnick wrote:
         | marimo has a playground to run notebooks via WebAssembly -
         | similar to Bento - without having to deploy yourself:
         | https://marimo.app/
        
       | Fraterkes wrote:
       | A bit off-topic, but my problem with any notebook type of tool
       | (ie you create a document that mixes code, the output of that
       | code, and text/media) is that they always feel like they're meant
       | to be these quick, off the cuff ways to present data. But when I
       | try to use them they just feel awkward and slow. (I tried doing a
       | jupyter notebook with the vscode plugin, and while everything was
       | very polished, it felt like I was ponderously coding in Word or
       | something. The same was true for R-notebooks in rstudio. Maybe
       | it's a better experience if you have a decently fast laptop)
        
         | Fraterkes wrote:
         | Also I always think it's a little sad that Jupyter was one of
         | the best shots for Julia to get more mainstream attention, and
         | instead the notebooks people write are basically exclusively
         | python
        
         | lamename wrote:
         | IME notebooks in VS Code are even worse (but improving).
         | Jupyter lab is faster...but that depends on how fast you prefer
         | ;)
        
           | wenc wrote:
           | I have the exact opposite experience -- VS Code notebooks are
           | much snappier and are possibly the best Jupyter
           | implementations I've ever used (better and more responsive
           | than vanilla Jupyter or Jupyter labs).
           | 
           | VS code notebooks also support LSPs with refactoring, typing
           | etc. Black is supported. Step by step debugging is supported.
           | Venv is built in.
           | 
           | There are so many conveniences in VS Code that whenever I
           | have to use Jupyter Lab I feel a lot of stuff is missing.
        
             | adolph wrote:
             | Killer feature of VS Code notebooks is Vim keybindings. It
             | also manages movement between cells, so you have to be very
             | aware of the current mode.
        
         | wenc wrote:
         | Sounds like you've diagnosed your issue in the last line.
         | 
         | Notebooks are usually not inherently slow -- I use Jupyter in
         | VS Code running off a remote server and it's snappy.
         | 
         | I have a MacBook Pro 2020.
        
         | taeric wrote:
         | I'm assuming you've seen
         | https://www.youtube.com/watch?v=7jiPeIFXb6U&t=61s? I know I
         | found it far more amusing than I should have when it was
         | released.
         | 
         | I will confess that I found Mathematica kind of neat back in
         | the day. I never got as good with it as peers did. I'm curious
         | if that would be different for me today.
        
       | talles wrote:
       | Tanya Rai - Introducing Bento: Jupyter Notebooks @ Facebook |
       | JupyterCon 2020 : https://www.youtube.com/watch?v=f3UfVX4_PD4
        
       | talles wrote:
       | I wish more people used marimo, so much better than jupyter
        
         | akshayka wrote:
         | For the curious: https://github.com/marimo-team/marimo
        
       | tantalor wrote:
       | Glad to see people using the term "serverless" to mean "actually
       | without a server" instead of what other places are doing.
        
       | mhh__ wrote:
       | I've been using Marimo along these lines recently. I'm a fan. So so
       | glad to not use Jupyter.
        
       ___________________________________________________________________
       (page generated 2024-09-18 23:00 UTC)