[HN Gopher] I built Excel for Uber and they ditched it
___________________________________________________________________
I built Excel for Uber and they ditched it
Author : robdimarco
Score : 169 points
Date : 2023-09-15 19:01 UTC (3 hours ago)
(HTM) web link (basta.substack.com)
(TXT) w3m dump (basta.substack.com)
| itsthecourier wrote:
| "When I met the members of the Crystal Ball team, it was about
| roughly ... people"
|
| how many people there were at that point, @bastawhiz?
| romnon wrote:
| people be rough
| bastawhiz wrote:
| Sorry, obvious typo! It was four people.
| lph wrote:
| Excel is an ancient and complex beast. I get the appeal of
| building this project---it sounds fun---but trying to duplicate
| the Excel engine to the level of producing identical outputs is,
| frankly, bonkers. The author caught the one discrepancy they
| noticed, circular reference handling, but how many did they miss?
| How do they know different inputs won't cause it to deviate from
| Excel? I didn't get the sense from the blog post that this had
| extensive test coverage. Putting it into use for a business-
| critical financial calculation is a massive risk, but I guess
| that's how Uber rolls ¯\_(ツ)_/¯
|
| It would have been less fun but way, way less risky to wire a
| headless Excel up to a javascript front-end.
| bastawhiz wrote:
| No Windows, no Excel license, hundreds of concurrent users. How
| confident are you that each user is getting the calculations
| that they triggered and not someone else's? How are you
| deploying that spreadsheet? How are you version controlling the
| mutable parts of the system? How confident are you that the
| solution would be robust in the face of bad inputs and high
| load?
|
| I won't say there wasn't risk, but there was quite a bit of
| testing and a human always made the final call anyway (I never
| fully understood why we didn't eliminate humans from the
| processes altogether).
| erulabs wrote:
| I read this article looking forward to the complex bespoke code
| to be ripped out and deleted - but the author clearly grew as an
| engineer in a way I didn't expect:
|
| > Sometimes that's just how it is. The devops saying "Cattle, not
| pets" is apt here: code (and by proxy, the products built with
| that code) is cattle. It does a job for you, and when that job is
| no longer useful, the code is ready to be retired. If you treat
| the code like a pet for sentimental reasons, you're working in
| direct opposition to the interests of the business.
|
| A lot of code is fun to write. A lot of problems are fun to
| solve. But a business, especially a startup, needs to stay razor
| focused. My entire career is effectively to sit in meetings and
| tell young, passionate engineers not to build things. It's a bit
| depressing, but it's also vital.
|
| A good engineer can solve any problem with clever code. A great
| engineer knows what problems aren't really problems and probably
| an XLS download link updated daily would have been fine.
| emmo wrote:
| Give the time and labour you're paid for, don't give emotional
| energy.
| stocknoob wrote:
| Short term yes, but long term, life's too short. Pursue FI,
| especially if you're a craftsman, so you can work on what
| brings you joy, without an expiration date.
|
| "Cattle, not pets" may be a good way to run a business, but
| not your life.
| emmo wrote:
| Oh on your own projects and personal passions absolutely, I
| think emotional investment should be high. Not for someone
| else, though.
| sublinear wrote:
| > (from the article) - Having Excel in the browser was a useful
| solution, but the problem wasn't showing spreadsheets in the
| browser: the problem was getting a specific UI delivered to the
| right users quickly.
|
| > (from the above comment) - A good engineer can solve any
| problem with clever code. A great engineer knows what problems
| aren't really problems and probably an XLS download link
| updated daily would have been fine.
|
| I saw the bullet list further down the substack page and it's
| still not good enough for this level of requirements gathering.
| Those questions _describe_ the scenario, but _asking_ them
| would not have arrived at this simple solution. Checklist
| thinking is a crutch and just overcomplicates the problem. All
| the signals here were organizational and social, and not a
| matter of improving a process.
|
| This should be obvious, but people who are not involved with
| implementation details can't answer questions about
| implementation details.
|
| "Just make it like Excel" is a super low quality answer from
| someone who has a completely different set of objectives. The
| only way forward would have been to consult with someone closer
| to the actual users and counter-argue from there. What's
| missing here is the courage to recognize weak assumptions and
| deliberately avoid writing any code until enough details are
| pinned down to get to an agreement from _all_ parties, not just
| say yes to the person "in charge".
| bastawhiz wrote:
| > "Just make it like Excel" is a super low quality answer
| from someone who has a completely different set of
| objectives. The only way forward would have been to consult
| with someone closer to the actual users and counter-argue
| from there.
|
| The only contact we had with "actual users" was over WeChat
| because they were on the other side of the planet.
|
| > What's missing here is the courage to recognize weak
| assumptions and deliberately avoid writing any code until
| enough details are pinned down to get to an agreement from
| all parties, not just say yes to the person "in charge".
|
| Uber was pathologically bad in this sense. There was no time
| to get details pinned down. We had a product to ship in two
| weeks for non-technical stakeholders. If we didn't, the
| stated consequence was millions of dollars in losses to the
| business. Throwing up your hands until you get product
| clarity when you know you can solve the problem as-is is a
| great way to find yourself with a PIP.
| [deleted]
| Retric wrote:
| Exactly, in 2016 there were several off the shelf options for
| doing the exact same thing. It's a perfect example of a young
| engineer feeling a huge accomplishment from reinventing the
| wheel, and then realizing the clever solution wasn't actually
| worth anything like the effort required to create it.
|
| I had a long conversation to convince someone not to go down
| that path in 2006, and I am sure someone's going to do it in
| 2026.
|
| Pausing to think: _I wonder how someone else solved this exact
| problem_ is such a huge part of how you grow as a developer I
| wish schools would focus more on it.
| yard2010 wrote:
| I would say what you talk about is experience. To gain that
| experience, you must go down this path to realize what not to do.
| It feels like a catch 22 kind of thing.
| Retric wrote:
| Doing it well comes down to experience, but doing it at all
| comes down to asking which of your unconscious assumptions
| are hard requirements. Nobody is actually saying you have
| skills X, Y, Z, which you must use to solve this problem.
|
| Similarly stepping outside the existing tech stack could be
| worth it, or perhaps the problem isn't actually that
| important. Making that call takes experience, but realizing
| the possibility exists can be as simple as checking Stack
| Overflow.
| cush wrote:
| Any PM that walks in with "Just make it like Excel" hasn't spent
| a moment writing code.
| bastawhiz wrote:
| Even worse: he was a director
| Macha wrote:
| So a lot of my time as a more junior engineer was spent on a
| similar project that I've described as "rebuilding excel". In my
| case, it was in the form of a table widget. It was an inhouse
| widget for displaying tabular data. We were working on AngularJS
| 1, and had moved from a prior iteration of our application in
| ExtJS. Now ExtJS had a reasonably featureful table widget out of
| the box, and so when we were porting to AngularJS, it was
| considered important that we continued to have these UI features,
| but at the time (the Angular ecosystem was relatively immature at
| that time) there were not very many advanced table widgets
| available in open source, so we ended up building our own.
|
| However, while the ExtJS table widget had been treated by product
| management as pretty immutable "this is what ExtJS gives us, we
| can customise the colours and wording and that's it", the idea
| that we could customise the table widget started something
| amongst one of the PMs. And so we would get a constant stream of
| feature requests for the table widget to add stuff and enhance it
| and soon we were significantly more featureful than the ExtJS
| widget. It's still, to this day, the most featureful table widget
| I've seen in a web app. Everything Excel had in terms of resizing
| tables, sticky columns, scrolling behaviours, sorting, filtering,
| searching, etc., all saved in your config so it was synced across
| all your devices, as well as all the performance goodies like
| recycled rendering etc. The constant stream of feature requests
| meant there were dev team years invested in this table widget.
|
| As a more mature engineer looking back, it's clear that at some
| point this had stopped being about customer value and more about
| one PM's obsession with getting Excel-like functionality in an
| in-browser reporting tool, but at the time we just kept building
| those features.
|
| Now at this time, our company had acquired a sort of competitor
| of ours. This competitor had what was effectively the same
| product, but in a different market. And so the first merger of
| the functionality was basically to reskin both applications so
| they would pretend to be tabs in a unified application, and
| change some terminology, etc. They actually did happen to have a
| pretty similar tech stack to us, so some newer components were
| available in both applications.
|
| But it became clear that our users were not happy with two
| applications pretending to be one. They wanted to know why this
| other market was not accessible by, say, a dropdown in the
| configuration, and not an entire application which worked in
| different parts from subtly differently to entirely differently.
|
| So the discussion became about building a ground up unified
| interface for both of them. Of course, this ignited the
| discussion of "which table component do we use?". On the one
| hand, the acquired team were looking at our table, with its
| fifty billion options and single handedly accounting for half the
| page weight of the minified JS of our application and did not
| want something so bloated. On the other hand, our team were
| looking at their table widget which was effectively "for row in
| data, for column in row, print td" and dreading having to rebuild
| all these features for product management again.
|
| Ultimately the conflict was resolved by choosing to use an open
| grid that had less features than ours but more than theirs and
| telling the PM in question that table features were going to be
| prioritised much less heavily from then on unless there was a
| real user need for it.
| [deleted]
| sha16 wrote:
| > "take this [the spreadsheet] and put it in on the website
| [Wesley]."
|
| This should be taught in classrooms.
| abeppu wrote:
| I'm confused about the circular reference thing. Like, was there
| a reason to do the linear regression that way? Is there a secret
| story in a story where next week, in a spinoff / sequel episode,
| the data scientist responsible will explain why they took the
| weird/surprising choices they did?
| DwnVoteHoneyPot wrote:
| It's a common Excel trick in finance. For example, let's say you
| have $0. If you borrow $1,000,000 at 5% interest, by end of
| year you'll be short $50k. That means you actually needed to
| borrow $1,050,000. But the extra $50K causes more interest
| ($2,500)... so you needed to borrow $1,052,500, which causes
| more interest... and so on.
|
| Instead of doing some Excel Goal Seek or Solver or VBA macro,
| it's nice to let the excel "reactivity" handle it for you.
| andy81 wrote:
| I used it once to run a Monte-Carlo simulation in a
| spreadsheet.
|
| After enabling iterative calculation and manual calculation,
| every press of refresh runs a loop. Fun stuff.
| DubiousPusher wrote:
| Just a heads up for anyone who finds themselves with a similar
| requirement, there's a very robust set of office APIs available
| in .net. I would be surprised if you couldn't open and run a
| whole workbook with them though I have only used them for more
| tangential tasks.
| itsthecourier wrote:
| I'm glad I read this article and learned about the circ
| brap wrote:
| > He simply couldn't believe that I'd written a full spreadsheet
| engine that ran in the browser.
|
| I can't believe it either, and I don't mean this in a good way.
|
| Apache POI lets you run headless Excel. You import and interact
| with sheets programmatically in Java. We used this in my old
| workplace for exactly the same reason (functions, cell
| references, the whole thing), it worked great.
|
| You found the 'circ' problem with a bit of luck. What about all
| of the other hidden little quirks of Excel that you would
| ultimately run into down the road? Are you really going to build
| and maintain a full blown Excel clone in JS? Is this really the
| objective of the frontend team?
|
| It seems to me like a bit of googling and >90% of the work here
| could have been avoided. As an added bonus it would have been
| done by the backend team instead.
| bastawhiz wrote:
| > It seems to me like a bit of googling and >90% of the work
| here could have been avoided.
|
| I had a deadline and the only idea on the team for shipping a
| working product, and I shipped a working product on time.
|
| Uber ran (runs?) their own data center. Getting a Windows
| machine/VM procured to actually run Excel would have taken an
| act of god. I was able to spin up a new front-end service in
| about thirty minutes. And I had some code that sort of kind of
| already worked, so I wasn't starting from scratch. Keep in mind
| that this system needed to be used by multiple people with
| different sets of data simultaneously.
|
| > Are you really going to build and maintain a full blown Excel
| clone in JS? Is this really the objective of the frontend team?
|
| If they'd have kept asking for more features and Excel parity,
| I suppose we would have considered it. But they didn't.
|
| Certainly I don't expect many people would have chosen to do
| what I did. But the thing worked (and surprisingly well). If
| all you took away from the post is that it was a big
| complicated project, I'm afraid my writing has failed to convey
| the message it was attempting to convey.
| dihrbtk wrote:
| they still run data centers but there is a cloud migration
| ongoing.
| idkyall wrote:
| I think one of the biggest growth areas for junior engineers to
| reach mid-level and senior is recognizing when you're re-
| inventing the wheel. E.g. If you are given a programming task
| to do anything related to Excel or the Microsoft Office suite,
| it's worth googling it first, because some engineer somewhere
| was probably tasked with doing the same thing a decade ago and
| has written a blog post or made a GitHub repo for it.
| bastawhiz wrote:
| > some engineer somewhere was probably tasked with doing the
| same thing a decade ago
|
| *Seven years ago at Uber
| JohnMakin wrote:
| It's not just junior engineers. Senior/management can fall
| into this trap as well.
|
| At one of my former companies we had a small problem with
| whitelisting Cloudflare IPs that don't typically change
| super duper often but definitely cannot be assumed to be
| static. My boss at that time decided the solution was this
| big initiative he called "whitelist maker" and assigned it to
| me. I don't remember what implementation details he wanted,
| but it was some insane rube-goldberg machine to basically
| pull down this list: https://www.cloudflare.com/ips-v4 and
| then put it into some terraform code.
|
| I ended up quietly killing the project during a re-org and
| used the cloudflare provider, which conveniently provides the
| aforementioned IPv4 list as a data source in 1 line of code.
| Done, 5 mins work. He had scheduled out an entire quarter and
| half of a team's resources for it.
| idkyall wrote:
| That's true, it's misleading to say this is a mistake only
| junior engineers make. Perhaps the real lesson is in having
| the maturity to put your ego aside and reflect clearly on
| whether you are solving the right problem in a sustainable
| way before jumping into the how.
| rkangel wrote:
| To be fair, he wrote a spreadsheet engine that could run _one
| particular spreadsheet_. Admittedly a complex one, but it was a
| fixed set of functions that he needed to implement and not an
| endless tail of things that people expect Excel to do. I think
| I'd probably have argued more about the UI spec and done some
| Excel behind the scenes but it is a familiar UI if you've got
| lots of number inputs all over the place.
|
| I've always enjoyed this article about building a spreadsheet
| in 100 lines of F#: https://tomasp.net/blog/2018/write-your-
| own-excel/ The expansion from that to the feature set needed
| here is manageable.
| brap wrote:
| Project requirements are never fixed, they're always
| evolving.
|
| It was only a matter of time before users would've complained
| about features being missing/broken, especially since what
| they're used to is Excel and this was meant to replace it.
| bastawhiz wrote:
| > this was meant to replace it.
|
| I think that's a very generous read of what I said the
| requirements for this product were.
| brap wrote:
| > "The city teams only know how to use Excel, just make
| it like Excel."
|
| With those expectations, sooner or later someone would
| have said "hey wait a minute, why isn't this like Excel?
| Excel knows how to do X, but this can't do X! I thought
| we talked about this, just make it like Excel!", repeat
| until you have a full blown Excel.
| kazinator wrote:
| > _You see, when formulas create a circular reference, Excel will
| run that computation up to a number of times._
|
| Like almost every spreadsheet before it: Lotus 1-2-3, Borland
| Quattro Pro, VP Planner ...
|
| Spreadsheets iterating on circ references goes back to the 1980s.
|
| The first spreadsheet application, VisiCalc, didn't track
| dependencies: it evaluated cells left to right, top to bottom,
| IIRC.
|
| Microsoft had a product called Multiplan that competed with
| VisiCalc. Not sure if that did iteration.
|
| I think it used to be a setting in some programs whether circular
| references are flagged as errors, or iterate. Maybe it's still
| that way in Excel?
| itigtohft wrote:
| There are a couple of nice minimal spreadsheets in js:
|
| http://web.archive.org/web/20130606222859/http://thomasstree...
|
| Inspired by that, an even smaller one:
| https://jsfiddle.net/ondras/hYfN3/
|
| The latter catches circular references rather than trying to
| calculate the fixpoint
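Catching circular references, rather than iterating toward a fixpoint, amounts to finding a cycle in the cell dependency graph. A minimal sketch using Python's standard `graphlib` module follows. The `recalculation_order` helper and the dictionary shape are assumptions for illustration, not how the linked spreadsheets do it.

```python
from graphlib import CycleError, TopologicalSorter


def recalculation_order(dependencies):
    """Return cells ordered so each cell comes after the cells it reads.

    `dependencies` maps a cell name to the set of cells its formula
    references (a hypothetical representation). Raises ValueError that
    names the cycle if the references are circular.
    """
    try:
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError as error:
        # graphlib reports the offending cycle as the second exception arg.
        cycle = error.args[1]
        raise ValueError("circular reference: " + " -> ".join(cycle)) from error


order = recalculation_order({"C1": {"A1", "B1"}, "B1": {"A1"}})
```

An engine that flags circular references as errors would stop at the `ValueError`. One that iterates would instead catch it and fall back to repeated passes over the cells in the cycle.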
| ChrisMarshallNY wrote:
| _> It's easy to treat a particularly clever or elegant piece of
| code as a masterpiece. It might very well be a beautiful trinket!
| But we engineers are not in the business of beautiful trinkets,
| we're in the business of outcomes._
|
| This spoke to me.
|
| However, as anyone that has looked at my code can attest, I tend
| to also want my code (and its functionality) to be very pretty.
| I'm generally writing code that I will be maintaining, so it
| needs to be something that I can look at, in a year, and
| understand.
|
| I'm currently in the final phases of a project that I will never
| announce here, and don't plan on taking much credit for, but it
| really is da schizz. It's that way, because no one is paying for
| it, and no one will make money from it.
|
| Money both spoils everything, and also makes it all happen.
| ryukoposting wrote:
| I wonder if the author would view this situation differently had
| Uber/Box decided to claim the code as their own. It has to bring
| some catharsis to know that, even if the code never actually met
| its potential, at least the whole world can see and appreciate
| it.
|
| I created a whole programming language as an intern for <defense
| megacorp>. It was lazily evaluated and garbage collected.
| Unquoted MAC addresses were valid syntax, among other
| application-specific oddities. No bytecode or JIT shenanigans -
| the interpreter just pushed and popped stuff from a stack as it
| traversed the parse tree, and that was fast enough for what we
| were doing with it. The interpreter was written in pure ANSI C,
| and Valgrind was very happy with it. Maybe it has been totally
| forgotten, or maybe it became critical to their technical
| infrastructure. That code never left the airgapped lab where I
| wrote it, so I have no way of knowing. 3 years ago, as a recent
| college grad, that was by far the coolest piece of "actually
| useful software" I had ever written. It's still high on the list.
| Sometimes I wonder whatever happened to it.
| allanrbo wrote:
| Another approach would be to use the Excel APIs. Both the classic
| desktop Excel and the web version have APIs to read/write cells
| and recompute. Rebuilding is more fun of course :-)
| sr228822 wrote:
| ironically, I was backend at Uber 2014 - 2018 and used to ask a
| simple version of implementing an Excel formula engine as my go-to
| coding interview question. It's got a nice mix of data structures,
| algorithms, complexity, and implementation. Good
| candidates can get a reasonably efficient implementation handling
| cell references in an hour.
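For a sense of scale, an interview-sized formula engine with cell references can be sketched as below. This is an illustration under stated assumptions, not the question as actually asked. The `Sheet` class is hypothetical, formulas use Python's arithmetic syntax parsed with the standard `ast` module rather than Excel's grammar, and ranges and functions are left out.

```python
import ast
import operator

# Arithmetic operators this sketch supports.
OPERATORS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


class Sheet:
    def __init__(self, cells):
        self.cells = cells        # e.g. {"A1": 2, "B1": "=A1*3"}
        self.cache = {}           # memoized results, one per cell
        self.in_progress = set()  # cells currently being evaluated

    def value(self, name):
        if name in self.cache:
            return self.cache[name]
        if name in self.in_progress:
            raise ValueError(f"circular reference at {name}")
        self.in_progress.add(name)
        try:
            raw = self.cells.get(name, 0)  # empty cells count as 0
            if isinstance(raw, str) and raw.startswith("="):
                tree = ast.parse(raw[1:], mode="eval")
                result = self._evaluate(tree.body)
            else:
                result = raw
        finally:
            self.in_progress.discard(name)
        self.cache[name] = result
        return result

    def _evaluate(self, node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name):  # a cell reference such as A1
            return self.value(node.id)
        if isinstance(node, ast.BinOp) and type(node.op) in OPERATORS:
            left = self._evaluate(node.left)
            right = self._evaluate(node.right)
            return OPERATORS[type(node.op)](left, right)
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -self._evaluate(node.operand)
        raise ValueError("unsupported formula syntax")


sheet = Sheet({"A1": 2, "B1": "=A1*3", "C1": "=B1+A1-1"})
total = sheet.value("C1")
```

Memoizing each cell keeps evaluation linear in the number of cells and references. The `in_progress` set is what turns a circular reference into an error instead of unbounded recursion.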
|
| nice read. made me nostalgic for the wild days of Uber-China
| hacking.
| bastawhiz wrote:
| Amazing! I independently had an extremely similar interview
| question (because of this work) that also gave me great signal.
| That's really funny!
| mrintegrity wrote:
| I get a strangely dystopian feeling from this article, like it's
| almost about a character in a black mirror episode
| Tao3300 wrote:
| Starring Excel as the pig.
| Vt71fcAqt7 wrote:
| Is it possible that you thought this because the words "black
| mirror(ed)" are mentioned in the article itself?
| pagnol wrote:
| I reflexively swiped left when the inevitable newsletter modal
| started to appear at the bottom, so didn't finish reading the
| article, but would like to know what gave you the dystopian
| vibe?
| Tao3300 wrote:
| You should have kept going.
|
| Spoiler alert: It turns out they were fulfilling a Babylonian
| prophecy the whole time. The whole development cycle was a
| complicated sacrifice to Marduk.
| bastawhiz wrote:
| Man, what a commentary on my career
| lcnPylGDnU4H9OF wrote:
| Alternatively, they just accidentally (I assume) called you a
| good storyteller.
| dublinben wrote:
| Is it common to take code written for one employer and reuse it
| for another?
| arcbyte wrote:
| Some of us will strategically write generic code on our own
| time and machines and import it to employers machines when
| needed and customize it. For instance, how many times do you
| really need to write a spring OAuth server that integrates with
| LDAP? Or the guts of a simple CRUD app?
| Kwpolska wrote:
| > how many times do you really need to write a spring OAuth
| server that integrates with LDAP?
|
| I would expect this to be a library somewhere.
|
| > import it to employers machines
|
| How would you prove that you wrote the code long before
| getting hired by them?
| pc86 wrote:
| I would hope "import" in this sense means it gets pushed to
| your personal git[hub,lab,] repo, and forked from your work
| account.
| dartos wrote:
| Someone wrote the library for a reason, right? Many are
| made bc people needed it for their job.
| sokoloff wrote:
| > I would expect this to be a library somewhere.
|
| Indeed. That library is often written off-the-clock by
| person X and imported on-the-clock by person Y.
|
| Sometimes X === Y.
| bastawhiz wrote:
| And sometimes that person even writes about it on
| substack
| soperj wrote:
| > How would you prove that you write the code long before
| getting hired by them?
|
| Commit it to a repo somewhere.
| contravariant wrote:
| Of course those of us familiar with copyright law would
| probably not admit to doing so publicly, right?
| sublinear wrote:
| Quite the opposite.
|
| Software written off the clock that does not compete with
| the employer is not only not the property of the employer,
| but any contract attempting to gain such ownership is
| unenforceable.
|
| Many businesses even actively encourage their developers to
| contribute to open source projects.
| umanwizard wrote:
| > any contract attempting to gain such ownership is
| unenforceable
|
| I highly doubt that this is true, at least in the US. Can
| you cite case law?
|
| You can write a contract granting ownership of all the
| songs a musician performs, or all the books a writer
| writes during a specified time period. Why shouldn't the
| same be true of programmers and code?
| Tao3300 wrote:
| Because no programmer is going to sign that. They'll go
| somewhere else. Musicians and writers don't usually have
| as many options, if any.
| Alupis wrote:
| Your employment agreement or contract likely has some
| clause saying you transfer ownership, rights, etc to the
| organization.
|
| Which likely means your "free time" code you decided to
| do to make your _job_ easier now belongs to your employer
| since they asked you to write it (albeit indirectly in
| this situation).
|
| Will anything come of it for trivial stuff? Probably not,
| but that doesn't mean it's ok.
|
| Unless you have something in writing saying otherwise,
| best not to mix stuff like this because one day you might
| wind up on the wrong side of an army of lawyers.
| avgcorrection wrote:
| I don't.
| pc86 wrote:
| > _Which likely means your "free time" code you decided
| to do to make your job easier now belongs to your
| employer since they asked you to write it (albeit
| indirectly in this situation)._
|
| Especially when you have problem A at work, then some
| time later write "generic code" that solves problem A,
| then some time later "import" the code to your dayjob to
| solve problem A. And double so if nobody else ever uses
| this generic code and you never use it for anything else.
|
| As an industry we talk a lot about flexibility,
| particularly in scheduling and when we do our work, but
| you can't have it both ways. You can't be doing laundry
| and mowing the lawn and going grocery shopping in the
| middle of the work day because it helps you think or it
| helps your programming process, but then make the
| argument that because you wrote this code at 6 PM on a
| Sunday it's yours and not your employer's, when you
| committed it to your employer's git repo Monday morning.
| Not with a straight face, at least.
|
| I want to be clear, I'm all about getting shit done
| during the day. If I need to get a haircut at 2:30 PM, I
| will. But I'm also not pretending that my employer's code
| is mine or that I have any right to publish it.
| Clamchop wrote:
| Ethically, I'm not sure how to slice it. I'm operating on
| what you wrote here rather than this specific story.
|
| Some contracts stipulate that anything you write while
| employed is owned by your employer. (I'm settled in that
| this is unethical, but it's reasonable to comply.)
|
| But let's suppose there's no such stipulation.
|
| You get an idea while at work. Everyone gets ideas. You
| take your brain home with you (I hope) and start
| developing that idea. You think it's generally useful and
| doesn't depend on any or reveal anything about a trade
| secret or other proprietary work, nor reveal anything
| about them.
|
| Is it your choice to contribute that idea to your
| employer or to use it in an open source or some other
| unassociated project? Why or why not?
|
| Is it OK if you never use it for any of your employer's
| projects?
|
| If not, then is it OK to wait until after your employment
| to develop that idea on your own or for your next
| employer or even turn it into your really awesome startup
| that definitely won't fail? (I think all of you are
| willing to do the first, and most of you the second.) Why
| does that change the ethical quandary, or why doesn't it?
|
| Alright, so your employer specifically asked for this
| solution and you wrote one on the clock but it was
| minimal, maybe you didn't have enough time to make a more
| elaborate one, and you wrote a better one and did one of
| the above with it. Is that OK?
|
| I don't think this question is all that cut and dried.
| contravariant wrote:
| Well sure, but how on earth are you going to claim it
| does not compete when you _use_ that very code for the
| employer?
|
| By all means try stuff out with some hobby project but
| don't be an idiot and tell your employer you've reused
| 'their' code (or at least, code in their codebase) for
| other clients. Either get an agreement up front or keep
| it secret.
|
| A contract that grants your employer copyright to code
| you wrote and _used_ in their codebase is easily
| enforceable. An exception would be code you wrote before
| the contract, but in that case using the code without
| some kind of agreement up front is still dangerous.
| m00x wrote:
| That's if they didn't use _any_ resources from their
| employer.
|
| If that code touched their work laptop (which it did
| since he showed it), it's now company property.
|
| That code most likely belongs to Uber legally, but they
| probably don't care that much.
| [deleted]
| Reubend wrote:
| Love the part about circular references. I gotta say though - I'm
| having a difficult time imagining how complex these formulas are
| that reimplementing Excel's formula engine is easier than just
| porting the formulas into JS.
| bastawhiz wrote:
| The good news is that formulajs had a huge number of those
| functions implemented in JS already. Almost all of the time was
| spent on the engine, which wasn't a huge amount of code.
|
| The problem with porting the code to JS is that a.) nothing is
| named, b.) there's no real way to organize the code you've
| written because you're going from a spatial way of organizing
| code to imperative script, and c.) the actual design of the
| spreadsheet wasn't known to any engineers (it was designed by a
| data scientist, or perhaps an analyst). The work of translating
| would have meant really understanding what the thing is so that
| it can be turned into functions and modules. It also would have
| still required getting Excel function equivalents, since
| there's not a 1:1 equivalence between Excel and what's
| available in the JS standard lib.
| xen0 wrote:
| Honestly, if you're at 'thousands of formulae' across multiple
| sheets, I'd probably suggest writing a graph based execution
| engine too.
|
| The circular reference thing would have definitely thrown me
| for a loop though.
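One piece of the graph-based execution engine suggested above is working out which cells need recomputing when a single input changes. A minimal sketch follows, assuming the dependency graph is stored as a dictionary from each cell to the cells its formula reads. The function name and data shape are hypothetical.

```python
from collections import deque


def dependents_of(changed, dependencies):
    """Return every cell that directly or indirectly reads `changed`.

    `dependencies` maps a cell to the set of cells its formula references.
    """
    # Invert the graph: for each source cell, which cells read it?
    readers = {}
    for cell, inputs in dependencies.items():
        for source in inputs:
            readers.setdefault(source, set()).add(cell)

    # Breadth-first walk downstream from the changed cell.
    affected, queue = set(), deque([changed])
    while queue:
        for reader in readers.get(queue.popleft(), ()):
            if reader not in affected:
                affected.add(reader)
                queue.append(reader)
    return affected


graph = {"B1": {"A1"}, "C1": {"B1"}, "D1": {"A2"}}
stale = dependents_of("A1", graph)
```

Because visited cells are tracked, the walk terminates even when the sheet contains a circular reference. In that case the changed cell shows up in its own set of dependents.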
| Vt71fcAqt7 wrote:
| I love this article. It begins setting the scene with Uber's
| grand expectations for the Chinese market and then shows just a
| small piece of the work Uber spent on Uber China. Meanwhile Uber
| China itself failed spectacularly, with factories of phones
| claiming the free rides money.[0] I don't think I've ever
| experienced dramatic irony in a blog before, certainly not a
| technical blog. The post itself is a masterpiece.
|
| [0]https://www.forbes.com/sites/ywang/2016/09/27/ghost-
| drivers-...
| gobrrrmeme wrote:
| There's a service in SharePoint and SharePoint Online that lets
| you program against Excel. Just gonna leave this here.
|
| https://learn.microsoft.com/en-us/sharepoint/dev/general-dev...
| JohnMakin wrote:
| Reading this and having China cloud experience on my resume I
| wonder how many Chinese data laws they may have violated. You
| have to really strictly segregate data when it comes to Chinese
| users.
| acchow wrote:
| > Over the summer of 2016, we came up against a new twist on the
| project. We had a model that ran overnight to generate data for
| anticipated ridership in China. That data wasn't useful on its
| own, but if you fed it into a tab on a special Excel spreadsheet,
| you'd get a little interactive Excel tool for choosing driver
| incentives. Our job was to take that spreadsheet and make it
| available as the interface for this model's data.
|
| They eventually built a homegrown "Excel" clone as the UI for
| their model because "city teams only know how to use Excel".
|
| I would have done it the other way around - connected Excel to
| the data output by the model so the "city teams" could continue
| to use real Excel. I think most finance teams do something like
| this.
| bastawhiz wrote:
| Because the city teams were in China, we didn't have this
| luxury. Everything had to be behind Uber's beyondcorp
| equivalent, and there was no real way to auth folks from the
| Chinese mainland. Our only surface was the browser.
| MichaelZuo wrote:
| A dedicated satellite link across the Pacific is not that much
| money. Then again, maybe the CFO didn't know, so he took the
| easiest known option.
| bastawhiz wrote:
| I've always been fond of container ships loaded up with
| entangled qubits. Much lower latency!
| ftxbro wrote:
| > I would have done it the other way around - connected Excel
| to the data output by the model so the "city teams" could
| continue to use real Excel.
|
| Yeah except:
|
| "When you click in the cells of the spreadsheet you can see the
| formulas. You shouldn't be able to do that."
|
| "You said to make it just like Excel."
|
| "People working for Didi apply for intern jobs at Uber China
| and then exfiltrate our data. We can't let them see the
| formulas or they'll just copy what we do!"
| bjornlouser wrote:
| "'Growing as an engineer' means becoming a better engineer, and
| becoming a better engineer (directly or indirectly) means getting
| better at using your skills to create business value."
|
| Learning how the Excel model worked and then reimplementing it
| would have been a better example of 'getting better at using your
| skills to create business value'.
| dmd wrote:
| > So the data scientists have multiple laptops that they download
| the data to, then run the models overnight.
|
| This haphazard way of running compute jobs really stuck out to
| me. I can't imagine doing things this way (rather than having a
| central compute cluster running SLURM or similar) at a company
| bigger than, say, a dozen people - much less the scale of Uber.
| What's the rationale? Even if it's just a cluster of 3 or 4
| machines in a rack shoved in the corner, isn't that better than
| ... laptops?
| bastawhiz wrote:
| It was easy to go to IT and say "get us a pile of laptops" and
| let the data scientists do their thing. It was hard to hire
| engineers to solve the problem (I was one of the engineers).
| That particular problem was far enough down the priority list
| that it took until 2016 to solve.
| dmd wrote:
| Heh. Well, I guess... sorry I didn't come help :) (I
| interviewed but turned down my offer.)
| Znafon wrote:
| "Nothing came of it, but I took the code and shoved it into my
| back pocket for a rainy day.
|
| My idea was to take this code and spruce it up for Uber's use
| case."
|
| "My first reaction was to publish the code on Github."
|
| I'm very surprised by this, isn't the code property of Box, or
| Uber? The author does not mention their authorisation before
| releasing it under MIT license.
| sidewndr46 wrote:
| I believe this kind of story is the kind that gives most legal
| counsel nightmares.
| pc86 wrote:
| Especially the brazenness with which the author basically
| says "if they want to sue me for this verified and admitted
| IP theft, they can."
|
| Sure, they probably won't. But they might. And if they do,
| you'll lose immediately. Seems like a pretty high risk no
| reward scenario.
| sokoloff wrote:
| And if you lose immediately, you likely owe damages. Those
| damages, even if trebled, appear to be $0 here.
| pc86 wrote:
| Which would be a great solace to someone who just spent
| $10k or more like 2-3x that responding to a lawsuit. But
| that's also why I agree the odds of actually getting sued
| are near-zero.
| Alupis wrote:
| Some companies, armed with floors of attorneys and
| retained outside counsel, do that sort of thing just for
| the message alone. It costs them next to nothing,
| relatively, ruins the defendant regardless of outcome,
| and makes it clear for others to not mess around with IP.
| akozak wrote:
| Pretty sure that's not how copyright damages work. Don't
| take legal advice on HN folks.
| m00x wrote:
| I'm sure they could find damages amounting to a large
| enough sum that OP would regret it.
| [deleted]
| michael1999 wrote:
| Uber and the people they hired never struck me as particularly
| concerned by things like "laws" and "property".
| pavlov wrote:
| Oh, you're missing a few qualifiers. They're not concerned
| with laws applied to them, and other people's property. But
| in all other cases they're big believers in law and using it
| to protect their property.
| bastawhiz wrote:
| Author here. The code was originally written outside of work
| hours. I offered the code to Box and they didn't want it.
|
| If Uber wants a few thousand lines of JavaScript from over half
| a decade ago that didn't originate with them and that they used
| for less than a month, they can send me a letter.
| continuitylimit wrote:
| So come on man, let's be honest here. I got serious sacred
| masterpiece vibes from this story.
|
| This reminds me of some Hindu parable about people who let go
| of possessions and head out to become ascetics. So there is
| this wealthy man and wife and the wife is all upset because
| her brother keeps insinuating that he's gonna go ascetic and
| cut loose. The husband tells her to stop her crying and don't
| worry about it, he ain't going to do it. The wife asks him:
| 'but how can you be so sure?' Because, the husband says, this
| is how you do it, and then and there he rips open his shirt,
| tells her "you're my mother" and heads out to the woods.
| bastawhiz wrote:
| I just like telling fun stories from a long time ago
| [deleted]
| hitekker wrote:
| I might be dumb today but I think this parable is
| incomprehensible. Do you have a link to another version?
| deodar wrote:
| It's almost as incomprehensible as a Zen koan. I think
| the husband is showing the difference between talking and
| doing by, well, doing it. A radical way to demonstrate
| it.
| brtkdotse wrote:
| > they can send me a letter.
|
| Why am I reminded of this meme?
|
| https://amp.knowyourmeme.com/memes/what-are-you-gonna-do-
| sta...
| tempaway85751 wrote:
| _... Nothing came of it, but I took the code and shoved it
| into my back pocket for a rainy day ..._
|
| You can't really do this. It depends on your employment
| contract, but the copyright in code you write for an employer
| usually belongs to them.
|
| _... My first reaction was to publish the code on Github
| ..._
|
| You can't really do that either.
| bastawhiz wrote:
| I mean, I asked at the time, and I did it. If either
| company wants to start a legal fight over a pile of code
| that neither of them wanted that's old enough to be in
| elementary school, they know how to reach me.
| tempaway85751 wrote:
| Fair enough. But I'll leave my comment there as general
| advice for other readers.
|
| Enjoyed the article; the bit about Excel circular ref
| linear regression was wild.
| thebradbain wrote:
| Cool -- if one of the companies wants to issue a takedown
| request, they're free to make the case for it.
|
| It's funny there's this idea that a company _might_ be
| potentially injured over code they do not want or know they
| had being made open source by its actual author, even
| though many of those companies will gladly use open-source
| tooling without ever contributing anything back.
|
| Perhaps more soundly, though, in California - where Uber is
| headquartered - IP/Copyright for code is a huge legal
| question that the state and federal Supreme Courts have no
| clear answer to. Sure, you obviously can't secretly clone
| Uber's entire stack, slap a new company logo on it, and
| start up as a competitor. But if you, as an author, wrote
| some code for a company under an IP agreement, then no
| longer worked at said company, and then later adapted and
| expanded upon that code (or even started over, with the
| knowledge of what you learned from others' work): are you,
| as the originator, not legally allowed to be inspired by
| your past work? That's not something you, me, or even the
| company could decide.
| andrewxdiamond wrote:
| There are gray areas but I do not think you are in one.
|
| > and then later adapted and expanded upon that code (or
| even started over, with the knowledge of what you learned
| from others' work)
|
| These are extremely different scenarios. Starting with a
| copyrighted material and modifying it is not at all the
| same as reading material and starting over. The first is
| violating copyright, the second is a derivative work.
|
| If I read everything correctly, what you describe doing
| is taking code owned by the first company and modifying
| it for the second company. That's not at all a gray area.
| It's a copyright violation. You the engineer sign away
| your rights to the code when you built it for company 1
| while employed by them. Their employment contract for-
| sure states they own any work produced by you during your
| employment, and you agreed to this.
|
| If the first project was done off of company time, posted
| publicly on a private account, you might have a claim to
| the rights.
|
| I know you've dug your trench too deeply to change your
| mind at this point, but anyone reading your comments
| should know what you did was technically illegal and can
| get people in legal hot water.
| thebradbain wrote:
| I wrote the comment above, though I'm not the author of
| the code that you appear to think I am. But I am in
| agreement with him.
|
| > Their employment contract for-sure states they own any
| work produced by you during your employment, and you
| agreed to this.
|
| There are many open legal questions as to where this line
| is drawn. Surely the line falls somewhere between "every
| character I've ever typed on a keyboard" and "the
| verbatim code". I personally don't think he's crossed it.
| IP ownership is much more complex than portrayed in HBO's
| Silicon Valley. That is my opinion.
|
| Furthermore, when I worked at GitHub (now acquired by
| Microsoft, so I'm sure things have changed drastically)
| -- there were very lax IP ownership agreements in the
| employment contracts around code ownership, because the
| legal department was worried that if found in any way
| conflicting with California law it would render the
| entire IP claims null and void (which does have precedent
| in California).
|
| The point is we don't know, and I think OP would know
| better than us if it was disallowed or not.
| ryandrake wrote:
| Like many things, the answer is usually "You'll find out
| if you want to go up against an army of lawyers". The
| last three companies I worked for all claimed ownership
| of any IP I create, on or off the job, using company's
| equipment or using my own equipment. One of them
| explicitly called it out during the interview: You will
| have to stop working on open source or publishing side
| projects when working here. Can they do that? Maybe,
| probably not. It doesn't matter because I do not plan to
| bankrupt myself fighting their lawyers.
| xwdv wrote:
| It's trivial to tell ChatGPT to rewrite the code base so it
| no longer resembles the original and then publish it as a new
| thing. So yea, you can.
| migf wrote:
| I was also absolutely gobsmacked at this. Will they care?
| Probably not. Are you putting yourself at the absolute
| mercy of them deciding not to care? Absolutely.
|
| I would have a hard time sleeping... like this would be
| like being in IT and knowing the backups were bullshit.
| sebzim4500 wrote:
| Is this a thing in the US? Here, if the code was written
| of your own volition outside of work hours then it's
| yours.
| santoshalper wrote:
| That story just doesn't seem plausible. Maybe for Box,
| but it feels like a stretch, and definitely not for Uber.
| migf wrote:
| It depends on your employment agreement or contract. Most
| contracts I have seen say that any IP you develop related
| to what you're doing at work is the employer's.
| dragonwriter wrote:
| "Work hours" are less clear for salaried workers who may
| or may not take work home: if it was written to solve a
| problem for the employer, reviewed with other workers at
| work, but ultimately not further pursued, the status seems
| murky.
|
| The later derivative that was actively used by and
| updated for the requirements of another employer during
| the course of work seems more clearly to be their property
| as a derivative (but also murky because it is potentially
| an illegal derivative of the earlier work, _if_ that was
| owned by the earlier employer.)
| pc486 wrote:
| What a fantastic perspective from the former Uber BI team. I was
| on the Vertica team during this time period and the amount of
| effort spent on incentives was mind-boggling. Millions a day lost
| on downtime, product features, or engineering bandwidth was a
| common theme.
|
| A director asking for an exact spreadsheet to be the UI would
| have been par for the course, especially during the Uber China
| days. Heck, I personally loaded FX prices into Vertica from a
| spreadsheet emailed every month to the team. That process
| remained for more than a year as there just wasn't enough
| bandwidth to invert the control as automated ingestion.
|
| Thanks for digging up these memories, @bastawhiz. I'd love to see
| more. :)
| constantly wrote:
| I wouldn't try moving code written for one employer and using it
| for another employer. Unfortunately for me, my employers have been
| somewhat litigious; thankfully Uber doesn't have that reputation.
| bsder wrote:
| Side question: Why the hell is it so stupidly difficult to
| display and _edit_ tabular data from programming code?
|
| You can't drive Excel from Python/Rust/etc (Microsoft just
| announced to great fanfare that Excel can call Python--which is
| the wrong way around). All the editable table widgets for the web
| seem to suck. Nobody seems to have a TUI which you can drive from
| an external program.
|
| A poorly written spreadsheet with a driveable API seems like a
| component that has been built multiple times by lots of people
| yet seems to be unavailable.
|
| Is there some solution that I'm missing?
| renewiltord wrote:
| That's a tremendous war story. I _love_ it! Doing a thing that's
| considered unreasonable. Knowing that total value is the integral of
| instantaneous value over time (so short high-impact projects can
| easily outvalue long-lived projects).
| easton wrote:
| Excel circular reference doc, for the curious:
| https://support.microsoft.com/en-us/office/remove-or-allow-a...
|
| > Unless you're familiar with iterative calculations, you
| probably won't want to keep any circular references intact. If
| you do, you can enable iterative calculations, but you need to
| determine how many times the formula should recalculate. When you
| turn on iterative calculations without changing the values for
| maximum iterations or maximum change, Excel stops calculating
| after 100 iterations, or after all values in the circular
| reference change by less than 0.001 between iterations, whichever
| comes first. However, you can control the maximum number of
| iterations and the amount of acceptable change.
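|
| The stopping rule quoted above can be sketched roughly as
| follows. This is a simplification for illustration, not Excel's
| actual algorithm: the `iterate_circular` helper is hypothetical,
| every formula is recomputed from the previous pass's values, and
| Excel's real evaluation order is not modeled. Only the defaults
| (100 iterations, 0.001 maximum change) come from the quoted doc.

```python
def iterate_circular(formulas, initial, max_iterations=100, max_change=0.001):
    """Repeatedly recompute circularly dependent cells.

    `formulas` maps a cell name to a function of the current values;
    `initial` supplies a starting value for every cell in `formulas`.
    Stops after `max_iterations` passes, or earlier once every cell
    changes by less than `max_change` in a single pass.
    Returns (values, number_of_passes_performed).
    """
    values = dict(initial)
    passes = 0
    for passes in range(1, max_iterations + 1):
        # Compute every cell from the values of the previous pass.
        new_values = {name: fn(values) for name, fn in formulas.items()}
        largest = max(abs(new_values[n] - values[n]) for n in formulas)
        values.update(new_values)
        if largest < max_change:
            break
    return values, passes


# Hypothetical converging pair: A1 = 0.5*B1 + 1, B1 = 0.5*A1 + 1
# (the exact fixed point is A1 = B1 = 2).
converging = {
    "A1": lambda v: 0.5 * v["B1"] + 1,
    "B1": lambda v: 0.5 * v["A1"] + 1,
}
values, passes = iterate_circular(converging, {"A1": 0.0, "B1": 0.0})

# Hypothetical non-converging cell: A1 = A1 + 1 never settles,
# so the loop only stops at the iteration cap.
runaway, runaway_passes = iterate_circular({"A1": lambda v: v["A1"] + 1}, {"A1": 0})
```

| The second example shows why the iteration cap matters: a
| self-incrementing cell just ends up at whatever value the last
| permitted pass produced.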
___________________________________________________________________
(page generated 2023-09-15 23:00 UTC)