[HN Gopher] Bento: Jupyter Notebooks at Meta
___________________________________________________________________
Bento: Jupyter Notebooks at Meta
Author : Maro
Score : 147 points
Date : 2024-09-18 14:30 UTC (8 hours ago)
(HTM) web link (engineering.fb.com)
(TXT) w3m dump (engineering.fb.com)
| web3aj wrote:
| The internal tools at Meta are incredible tbh. There's an
| ecosystem of well-designed internal tools that talk to each
| other. That was my favorite part of working there.
| Random_BSD_Geek wrote:
| Polar opposite of my experience. To achieve the technical
| equivalent of changing a lightbulb, spend the entire day
| wrangling a dozen tools which are broken in different ways,
| maintained by teams that no longer exist or have completely
| rolled over, only to arrive at the finish line and discover we
| don't use those lightbulbs anymore. Move things and break fast.
| extr wrote:
| Yeah 100%. I found it immensely frustrating to be using tools
| with no community (except internally), so-so documentation,
| and features that were clearly broken in a way that would be
| unacceptable for a regular consumer product. If you have a
| question or error not covered by an internal search or
| documentation, good luck, you'll need it. Literally part of
| the reason I left the company.
| zer0zzz wrote:
| Agreed. I often get my work done using open source build
| instructions and tools and then when everything works I
| port it to internal infra. Other people are the opposite
| though, which for open source based code bases has a nasty
| side effect of the work having no upstreamable tests!
| landedgentry wrote:
| Well, you're supposed to read the code and figure it out.
| And if you can't, you're not good enough an engineer.
| According to people at Meta.
| extr wrote:
| People probably think you're exaggerating but it's true.
| Sometimes when I would get blocked the suggestion was to
| "read the source code" or "submit a fix" on some far
| flung internal project. Huge fucking waste of time and
| effort, completely unserious.
| moandcompany wrote:
| Same as Google. Many internal tools have painful
| interfaces and poor or no documentation because the hiring
| bar was high and it was acceptable to assume that the
| user's skill level is high enough to figure it out. That
| attitude becomes a bigger problem when trying to sell
| tools to the public (e.g. Google Cloud Platform).
| yodsanklai wrote:
| As an outsider, I was always under the impression that
| Google had a tradition of engineering excellence (robust
| tools, clean and well-tested code following strict
| guidelines), while Meta has more of a Hacker culture
| (move fast and break things).
| loeg wrote:
| IMO there's a mix of a few really good, widely used, well-
| supported tools as well as a long tail of random tiny tools
| where the original team is gone that are cruftier.
| uuddlrlrbaba wrote:
| Mmm breakfast
| grantsucceeded wrote:
| haha the reason I stayed as long as i did
| bozhark wrote:
| Move Smooth and Fix Things (tm) is our nonprofit
| corporation's version of this atrocious motto.
| crabbone wrote:
| A friend of mine is doing his PhD while being an intern at
| Meta. He does _not_ share your excitement... at all. To
| summarize his complaints: a framework written a long while ago
| with design flaws that were cast in stone, that requires
| exorbitant effort to accomplish simple things (under the
| pretense of global integration that usually isn't needed, but
| even if it was needed, would still not work).
| slt2021 wrote:
| how else can you build empire as Engineering Manager and get
| promo?
|
| fork open source, then demand resources to maintain this
| monster.
|
| easiest promotion + job security.
|
| it's even called "Platform Engineering" these days
| jchonphoenix wrote:
| Meta tools are best in class when the requirement is scale, or
| when the external tools haven't matured yet.
| JohnMakin wrote:
| One of the crazier things an L4 Meta colleague of mine told me,
| that I still don't believe entirely, is that Meta pretty much
| has their own fork of _everything_, even tools like git. Is
| this true?
| sdenton4 wrote:
| It wouldn't be terribly surprising. Forking everything
| provides a liiiittle bit of protection against things like the
| 'left pad' incident.
| tqi wrote:
| Facebook actually doesn't use git, they use mercurial
| (https://graphite.dev/blog/why-facebook-doesnt-use-git).
|
| That decision is also illustrative of why they end up forking
| most things - Facebook's usage patterns are at the far extreme
| end for almost any tool, and things that are non-issues with
| fewer engineers or a smaller codebase become complete
| blockers.
| kridsdale3 wrote:
| Yes when I used to talk about this to interviewees, I
| described that every tool people commonly use is somewhere
| on the Big-O curves for scaling. Most of the time we don't
| really care if a tool is O(n) or O(10 n) or whatever.
|
| At Meta, N tends to be hundreds of billions to hundreds of
| trillions.
|
| So your algorithm REALLY matters. And git has a Big-O that
| is worse than Mercurial, so we had to switch.
| LarsDu88 wrote:
| They use Sapling, an in-house clone of Mercurial that was
| open sourced 2 years ago.
| jamra wrote:
| Meta doesn't use git. It uses mercurial. It does fork it
| because they have a huge monorepo. They created a concept of
| stacked commits which is a way of not having branches. Each
| commit is in a stack and then merged into master. Lots of
| things built for scaling.
| ipsum2 wrote:
| Yep. Zeus is a fork of Zookeeper, Hack is a fork of PHP, etc.
| It's usually needed to make it work with the internal
| environment.
|
| The few things that don't have forks are usually the open
| source projects like React or PyTorch, but even those have
| some custom features added to make it work with FB internals.
| gcr wrote:
| This is also how things work at Google.
|
| Google also maintains a monorepo with "forks" of all
| software that they use. History diverges, but is
| occasionally synchronized for things like security updates
| etc.
| zhengyi13 wrote:
| Am I completely off-base/confused thinking that the GFE
| originally started life (like back under csilver) as a
| fork of boa[0]?
|
| [0]: http://www.boa.org/
| lacker wrote:
| I thought it was GWS that originally started as a fork of
| boa.
| grantsucceeded wrote:
| Few companies experienced the explosive growth fb did,
| though many will claim to have done so. Hack made the
| existing codebase of php scale to insane levels while
| reaching escape velocity for the overall company to even
| attempt to transition away or shrink the php codebase, as i
| recall (i was an SRE, not a dev)
|
| zeus likewise.
| ipsum2 wrote:
| You worked at FB, but you call yourself an SRE, not a PE?
| ;)
| fragmede wrote:
| You still call it Facebook?
| Qshdg wrote:
| Looking at some of the bureaucracy in their open source
| projects, I'd say that they need less tooling and more
| thinking. These tools help to keep spaghetti code bases from
| imploding totally.
| moandcompany wrote:
| My opinion: Many Meta tools and processes seem like they were
| created by former Googlers that sought to recreate something
| they previously had at Google, during the Google->FB Exodus,
| but also changed aspects of the tool that were annoying or
| diverged from their needs. This is not a bad thing.
|
| Since Bento doesn't appear to be usable by the public, a
| parallel version of this where people can get a feel for cross-
| tool integration would be Google's Colaboratory / Colab
| notebooks (https://colab.research.google.com/) that have many
| baked-in integrations driven by actual internal use (i.e.
| dogfooding).
| kridsdale3 wrote:
| As someone from both, I confirm/support your opinion 100%.
| baggiponte wrote:
| Uuuh can you tell a bit more about wasabi, the Python LSP? Saw
| a post years ago and been eager to see whether it'd be open
| sourced (or why it wouldn't).
| fauria wrote:
| Can this be downloaded somewhere?
|
| Couldn't find any link in the open source site:
| https://opensource.fb.com/ nor the ELI5:
| https://developers.facebook.com/blog/post/2021/09/20/eli5-be...
| michaelmior wrote:
| I don't believe Bento has been open-sourced.
| make3 wrote:
| interesting that they make external articles about it
| rovr138 wrote:
| "Oh that's cool.", "It'd be interesting to work on problems
| like that.", "That's a neat solution"
|
| If anyone's on the fence about applying, that could be
| enough to nudge them in the direction. If anyone's worked
| in similar areas, could be worth applying and looking at
| the team, etc.
| michaelmior wrote:
| Totally agree, although odd that the post was tagged as
| "open source."
| tqi wrote:
| I think that's because it's based on an open source
| project
| tqi wrote:
| TBH the value of bento over other notebook offerings was almost
| entirely how well it plays with the rest of the data and infra
| stack within facebook. It was super easy to go from raw data
| (entire DE and DI orgs responsible for ETL and cluster
| maintenance) to a cleaned up table (usually built by DEs) to an
| ad hoc table to support a specific use-case that could then be
| accessed via bento, analyzed, and then published / shared to
| anyone in the company.
| jamra wrote:
| If you use jupyterlite, you're using the same thing. Bento is
| just the internal Meta version and the only potential benefit
| is the internal integration.
| ipsum2 wrote:
| Probably not. It's written in Hack, and heavily tied to
| internal frameworks, so it'll be practically impossible to
| extract into a standalone package, unless they do a "clean
| room" implementation (like they did for Sapling UI
| https://sapling-scm.com/docs/addons/isl/).
|
| But it has some cool features that notebook developers can take
| inspiration from.
| sk11001 wrote:
| I kind of love Meta for all the seemingly unnecessary internal
| stuff they do. They have so many projects that are absolutely not
| critical for them, maybe not even net positive, but they spend
| who knows how many hours building and maintaining them.
| apwell23 wrote:
| > Meta for all the seemingly unnecessary internal stuff they
| do.
|
| Netflix would like to have a word.
| Narhem wrote:
| Netflix's situation is caused by their business model.
| fwip wrote:
| Is it? It seems like 90% of what Netflix is (from a
| technical PoV), is a CDN + video playback. There's a lot
| more value in the content library they've negotiated and
| the business agreements with ISPs than there is in the
| software stack.
|
| Apologies if this response is delayed, 6 posts today is
| "too fast."
| rNULLED wrote:
| Netflix now builds many of the video production tools
| they need to produce their own content. This now includes
| games as well.
| apwell23 wrote:
| sure but i was alluding to stuff like this
|
| https://netflixtechblog.com/maestro-netflixs-workflow-orches...
| scottyah wrote:
| I'm not sure anyone has access to the real data, but I've
| had a suspicion that Netflix is able to remain a lot more
| profitable due to their superior tech. Cloud hosting and
| streaming (not to mention labor) can get very expensive,
| and I think while it's easier to set up nowadays (in
| comparison to when they started) a lot of the other
| companies are burning cash to try to keep up. HBO Max
| (just Max now?) has always had poor streaming quality
| compared to netflix and I imagine they're paying a lot
| more for it.
| bbor wrote:
| Internal startups have the same value proposition as external
| ones, I think; most fail, but every once in a while you hit a
| React or a Gmail.
| big-chungus4 wrote:
| can I, a mere mortal, use it?
| kyrrewk wrote:
| this is cool! wish there was a commercial product that did this.
| marimo does something similar, but you have to do the deployment
| yourself
| mscolnick wrote:
| marimo has a playground to run notebooks via WebAssembly -
| similar to Bento - without having to deploy yourself:
| https://marimo.app/
| Fraterkes wrote:
| A bit off-topic, but my problem with any notebook type of tool
| (i.e. you create a document that mixes code, the output of that
| code, and text/media) is that they always feel like they're meant
| to be these quick, off the cuff ways to present data. But when I
| try to use them they just feel awkward and slow. (I tried doing a
| jupyter notebook with the vscode plugin, and while everything was
| very polished, it felt like I was ponderously coding in Word or
| something. The same was true for R-notebooks in rstudio. Maybe
| it's a better experience if you have a decently fast laptop)
| Fraterkes wrote:
| Also I always think it's a little sad that Jupyter was one of
| the best shots for Julia to get more mainstream attention, and
| instead the notebooks people write are basically exclusively
| python
| lamename wrote:
| IME notebooks in VS Code are even worse (but improving).
| Jupyter lab is faster...but that depends on how fast you prefer
| ;)
| wenc wrote:
| I have the exact opposite experience -- VS Code notebooks are
| much snappier and are possibly the best Jupyter
| implementations I've ever used (better and more responsive
| than vanilla Jupyter or Jupyter Lab).
|
| VS code notebooks also support LSPs with refactoring, typing
| etc. Black is supported. Step by step debugging is supported.
| Venv is built in.
|
| There are so many conveniences in VS Code that whenever I
| have to use Jupyter Lab I feel a lot of stuff is missing.
| adolph wrote:
| Killer feature of VS Code notebooks is Vim keybindings. It
| also manages movement between cells, so you have to be very
| aware of the current mode.
| wenc wrote:
| Sounds like you've diagnosed your issue in the last line.
|
| Notebooks are usually not inherently slow -- I use Jupyter in
| VS Code running off a remote server and it's snappy.
|
| I have a MacBook Pro 2020.
| taeric wrote:
| I'm assuming you've seen
| https://www.youtube.com/watch?v=7jiPeIFXb6U&t=61s? I know I
| found it far more amusing than I should have when it was
| released.
|
| I will confess that I found Mathematica kind of neat back in
| the day. I never got as good with it as peers did. I'm curious
| if that would be different for me today.
| talles wrote:
| Tanya Rai - Introducing Bento: Jupyter Notebooks @ Facebook |
| JupyterCon 2020 : https://www.youtube.com/watch?v=f3UfVX4_PD4
| talles wrote:
| I wish more people used marimo, so much better than jupyter
| akshayka wrote:
| For the curious: https://github.com/marimo-team/marimo
| tantalor wrote:
| Glad to see people using the term "serverless" to mean "actually
| without a server" instead of what other places are doing.
| mhh__ wrote:
| I've been using Marimo along these lines recently. I'm a fan. So so
| glad to not use Jupyter.
___________________________________________________________________
(page generated 2024-09-18 23:00 UTC)