[HN Gopher] An oral history of Bank Python
___________________________________________________________________
An oral history of Bank Python
Author : todsacerdoti
Score : 825 points
Date : 2021-11-04 06:21 UTC (2 days ago)
(HTM) web link (calpaterson.com)
(TXT) w3m dump (calpaterson.com)
| viksit wrote:
| Somehow reading this article really made me think of lisp and
| some old jane street and k lang articles here on hn over the
| years. I wonder if it's the composability or the centralization?
| rich_sasha wrote:
| Compared to one major IB bank Python system, this is all
| extremely clean and neat.
|
| Consider a Python API that is a thin wrapper on COM calls
| intended to be used from Excel. Want to request some data? Fill
| in a 2D virtual Excel table. Want to pull some data? Query it and
| parse a text-dump of a table excerpt (remembering to parse #NA!
| etc. as NaNs). Want to automate a job? Enter it as a new row to a
| global spreadsheet. And for God's sake, do NOT edit any of the
| other rows, lest the whole house go down in flames!!!
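A minimal Python sketch of the "parse a text dump, treating Excel errors as NaN" step described above. It assumes a tab-separated, all-numeric dump; the names `parse_cell` and `parse_table` and the sample data are hypothetical, and this is not the bank's actual API.

```python
import math

# Standard Excel error markers, plus the "#NA!" spelling used in the comment.
EXCEL_ERRORS = {"#N/A", "#NA!", "#VALUE!", "#REF!", "#DIV/0!", "#NUM!", "#NAME?", "#NULL!"}


def parse_cell(cell: str) -> float:
    """Convert one cell to a float, mapping blanks and Excel errors to NaN."""
    cell = cell.strip()
    if cell == "" or cell in EXCEL_ERRORS:
        return math.nan
    return float(cell)


def parse_table(dump: str, sep: str = "\t") -> list[list[float]]:
    """Parse a text dump into rows of floats, skipping fully blank lines."""
    return [
        [parse_cell(c) for c in line.split(sep)]
        for line in dump.splitlines()
        if line.strip()
    ]


# Made-up sample dump: two rows, three columns.
dump = "1.5\t#N/A\t3\n#DIV/0!\t2\t\n"
rows = parse_table(dump)
```

Mapping every error marker to NaN discards the distinction between, say, a missing value and a division by zero, so a real wrapper might want to keep the original marker alongside the parsed value.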
| sfgweilr4f wrote:
| Or they could instead use CSVs. What could possibly go
| wrong?
| jeffrallen wrote:
| Everything is fine, as long as no Americans come and write
| 1,000 where obviously they should have written 1000 or 1'000.
| /s
| int_19h wrote:
| Or those pesky Europeans writing 12,34 when they obviously
| meant 12.34!
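A short Python sketch of why the separator jokes above are a real problem: the same text parses to different numbers depending on which convention the reader assumes. The helper `parse_number` is hypothetical and takes the convention explicitly rather than guessing it.

```python
def parse_number(text: str, thousands: str, decimal: str) -> float:
    """Parse a number given explicit thousands and decimal separators."""
    # Drop the grouping separator first, then normalise the decimal mark.
    cleaned = text.replace(thousands, "").replace(decimal, ".")
    return float(cleaned)


# "1,000" read with US conventions versus continental European conventions.
us_reading = parse_number("1,000", thousands=",", decimal=".")
eu_reading = parse_number("1,000", thousands=".", decimal=",")

# "1'000" with an apostrophe as the grouping separator.
swiss_reading = parse_number("1'000", thousands="'", decimal=".")

# "12,34" read with European versus US conventions.
eu_decimal = parse_number("12,34", thousands=".", decimal=",")
us_misread = parse_number("12,34", thousands=",", decimal=".")
```

Here `us_reading` is 1000.0 while `eu_reading` is 1.0, and `eu_decimal` is 12.34 while `us_misread` is 1234.0. A CSV file carries no record of which convention its writer used, which is the root of the pain described in this sub-thread.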
| [deleted]
| blitzar wrote:
| Or some american writes the date somewhere.
|
| edit: /s we love you american colleagues,
| p_l wrote:
| I can confirm that, in a non-bank financial institution,
| date formats involved with Excel->CSV->XML->Certain
| (shit) magic application were considerable pain point :|
| simonh wrote:
| Working with and for Americans is why I always use ISO
| year-month-day format.
| girvo wrote:
| That, and I like how they neatly sort.
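A minimal Python sketch of the sorting point above: ISO 8601 year-month-day strings sort chronologically under plain string comparison, while month-first strings do not. The sample dates are made up for illustration.

```python
from datetime import date

# Made-up sample dates, deliberately out of order.
dates = [date(2021, 11, 4), date(2020, 12, 31), date(2021, 2, 15)]

iso = [d.isoformat() for d in dates]          # year-month-day
us = [d.strftime("%m/%d/%Y") for d in dates]  # month/day/year

# A plain string sort of ISO dates matches the true chronological order.
iso_sorted = sorted(iso)
chronological_iso = [d.isoformat() for d in sorted(dates)]

# A plain string sort of month-first dates orders by month and ignores the year.
us_sorted = sorted(us)
chronological_us = [d.strftime("%m/%d/%Y") for d in sorted(dates)]
```

With these dates, `iso_sorted` equals `chronological_iso`, whereas `us_sorted` puts 02/15/2021 ahead of 12/31/2020 and so differs from `chronological_us`.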
| zzzeek wrote:
| this is the approach I'm familiar with, but back when I did it
| in Perl (yes Win32 perl with COM bindings) and Java (we wrote
| native Java plugins to do COM to Excel). all the important
| stuff is Excel VBA code that they spent years developing and
| can never replace, so any front end type of thing had to
| somehow get back to the Excel models.
|
| We eventually did rewrite the Excel models in Java, released
| something, and then the whole project probably got cancelled or
| something, 9/11 happened a few years later and the whole
| building in which all this code was written had to be
| demolished.
| mst wrote:
| I remember back in 2000 converting the windows line of
| business app team at $ISP (I was mostly on the provisioning
| automation side) over to using a COM component called
| JendaRex which wrapped the perl VM just to expose the regexp
| engine.
|
| This basically came about after the Nth time they asked for
| regexp help and I had a trivial solution that didn't work in
| whatever native implementation they were using and I
| basically gave them a choice between JendaRex and "not having
| me debugging their regexps anymore".
|
| They unanimously chose JendaRex and everybody ended up
| happier as a result.
| Thorrez wrote:
| Doesn't seem that strange compared to K[1] or Q[2], which are
| used by Wall Street banks. K encourages you to use single-letter
| variables and bunch your code up as tight as possible into long
| lines. Here's an example: [3]. Interestingly, their Github repo
| has some K-inspired C[4], Java[5], C#[6], and Javascript[7].
|
| [1] https://en.wikipedia.org/wiki/K_(programming_language)
|
| [2]
| https://en.wikipedia.org/wiki/Q_(programming_language_from_K...
|
| [3] https://github.com/KxSystems/kdb/blob/master/holiday.q
|
| [4] https://github.com/KxSystems/kdb/blob/master/c/c/odbc.c
|
| [5] https://github.com/KxSystems/kdb/blob/master/c/jdbc.java
|
| [6] https://github.com/KxSystems/kdb/blob/master/c/c.cs
|
| [7] https://github.com/KxSystems/kdb/blob/master/c/c.js
| trenchgun wrote:
| Very interesting.
|
| I am feeling the urge to learn array processing languages.
|
| There is something very tempting.
| dTal wrote:
| You mean like Numpy? =P
| FabHK wrote:
| Project Euler frequently has ultra-short solutions in
| K/J/whatever other single letter they're using at the moment.
| It is quite intriguing, but ultimately I put too much store
| in readability, so decided not to pursue these.
| smsm42 wrote:
| Thank you. Now when I'm thinking some code I have to untangle
| is bad, I'd always remind myself "at least it's not K-inspired
| C"...
| grayrest wrote:
| This took me a while to figure out but K/APL code is built on
| a different value system than most software. Specifically,
| the goal is to be able to see and operate on the entire
| program at once. Obviously this only works for programs up to
| a certain size but that size is larger than you'd expect when
| abstractions, variable names, and general legibility are
| sacrificed. I wouldn't write code this way but I can see how
| someone would find it valuable.
| leprechaun1066 wrote:
| The big banks don't write code in K4 though, managers generally
| encourage people not to write code in it due to the difficulty
| of finding developers who are fluent in it.
|
| They all use q and q is very wordy and highly readable if you
| speak english. It's mostly just developer defined functions
| which are compositions of the keywords of which there are not
| many: https://code.kx.com/q/ref/#keywords
|
| Most of the code you would see in a kdb+ system in an
| investment bank won't look like any of the links you've
| provided.
| mst wrote:
| Do you have any suggestions for q code to look at? Every time
| I try array languages I bounce off the ubercompact and I feel
| like there's actually a chance I could learn something more q
| like.
| liveranga wrote:
| I have a copy of Fun Q sitting in the tsundoku pile on my
| desk that seems pretty good from a quick flick through.
|
| https://www.fun-q.net/
| jrochkind1 wrote:
| Honestly, this sounds a lot _less_ insane than a lot of
| "conventional" stacks.
| barnabee wrote:
| Agree. If only all software engineering was data first...
| oxfordmale wrote:
| I worked for a hedge fund that built their own database on top of
| MongoDB. Data was serialised and stored as binary blobs. This
| bonkers implementation took away any advantage of using MongoDB.
| Unfortunately this system had a lot of political backing, and
| rather than addressing the shortcomings we were told to simplify
| datasets to ensure they could be stored in this bespoke database.
| I suspect a lot of trading signals were lost this way.
| rlewkov wrote:
| Are you talking about Arctic from Man AHL?
| https://github.com/man-group/arctic
| pepoluan wrote:
| > I suspect a lot of trading signals were lost this way.
|
| If you can prove & show evidence about this, I'm sure they will
| (grudgingly) accept the need to improve.
|
| Financial institutions are mostly risk-averse, "if it works,
| don't fucking fiddle with it!" They need something that will
| impact the bottom line in an (almost) direct way.
|
| I remember working with a financial institution back in 2010. I
| pushed for virtualization by presenting scenarios where the
| main servers are impacted, we have no warm backup servers, and
| calculate the impact to bottom line using known MTTR (Mean Time
| To Recovery) values of similar scenarios (gathered from
| incidents all over the world).
|
| Took several back-and-forth meetings with the BoD, but in the
| end they accepted the need to improve and allocated the funding
| + greenlight the project.
|
| (That was also back in the day when Microsoft had just released
| Hyper-V, and when we asked Microsoft to join in the project, I
| talked to their head engineer, and they respectfully declined
| because "their internal testing shows that Hyper-V, at the
| moment, is unable to fulfill the required parameters". Ended up
| with XenServer.)
| oxfordmale wrote:
| It is very hard to prove you have lost trading signal. You
| would have to redo your backtesting, which would be politically
| fraught (high risk of getting sacked). Often trading systems
| will perform worse in the real world anyway compared to the
| simulations, so it generally can only be proven if there is a
| political willingness to do so.
| harel wrote:
| I worked on Quartz for a while as a contractor. Hated every
| second of it. Python version was old (2.4 if I remember correctly
| when 3.x was already the popular version). But that wasn't it. It
| was the proprietary version of everything in the stack that got
| me. Proprietary ide, source control, libs etc. I noted that none
| of the others who have been there for years have any transferable
| skills that can carry them out of banking and into a startup for
| example. All were good devs but only knew quartz. I can say they
| were great quartz devs. The pay was great, but the work was soul
| crushing.
| jaromir_ wrote:
| I second your opinion (interned @ Athena, not Quartz) -
| compared to my current BigN experience everything was worse by
| a magnitude: the IDE, the source control, the review mechanism,
| the job scheduler and so on.
|
| I'd expect with so many devs working on this the DevX will be
| ironed out
| markus_zhang wrote:
| I wonder how did they make the IDE? Must be an interesting
| job for whoever got to write it, and hell for whoever is
| maintaining it and using it, lol.
| harel wrote:
| Some questions are best left unanswered... :)
| bantatoes wrote:
| I think this would have depended on the teams/individuals you
| worked with. I recall some devs being completely clueless about
| environments - my code (that depends on dozens of other files)
| in uat is producing different outputs compared to my code in
| dev, why? Some didn't know how to debug. Many went on to FAANG
| as senior engineers (including Uber when it was the next big
| thing), hedge funds (citadel, 2s), startups (twilio, is twitter
| still a startup?). And as far as python knowledge is concerned
| - I recall attending a number of python talks given by my
| former colleagues at Python conferences, various PEP
| discussions around whether a given PEP would help or be
| detrimental to qz. As a matter of fact one of the qz core
| engineers is now also a PSF core engineer. Quartz was
| polarizing, but there was/is plenty of talent among the
| engineers. P.S.: fwiw, when I left baml, the migration to 3.6
| was nearing completion, and the migration to 3.7 was in
| progress. I guess at some point we realized python and qz were
| not going away, we had to migrate, so infra was built out to
| make future migrations easier.
| harel wrote:
| I have no doubt the people who BUILT Quartz are top notch.
| Users of it... Mileage varies i suppose.
| [deleted]
| Chris2048 wrote:
| > Proprietary ide, source control, libs etc
|
| The reason this was a problem was that it meant investment was
| needed for each of these things, and as such fell behind.
|
| The IDE fell behind most modern IDEs, presumably because it
| didn't get the budget for it. The source control/ libs were
| usually modified versions of _existing_ libs, but now needing
| to be maintained internally to remain compatible with the
| mainstream versions (which, again, they did not, and so fell
| out of compatibility).
|
| > any transferable skills
|
| It puts you in a position of arguing that you _are_ familiar
| with <some-lib>, just a modified proprietary version of it..
| Makes those conversations a bit more difficult..
|
| > All were good devs but only knew quartz
|
| To be fair - this is their own fault. It's difficult providing
| proof in terms of "what you worked on in your last job"; but
| there's no reason a professional python dev couldn't become
| familiar with the popular versions of things on their own,
| given they are fairly close in functionality. Many of the
| quartz devs I knew already had backgrounds in Python, attended
| pycons etc; so knew more than Quartz.
| harel wrote:
| It's not just the "knowing some lib". It's a way of working
| that is not compatible with the outside world. I had every
| single character I typed having to get director approval. I
| stopped counting broken things that required human
| intervention (on rota). The style of programming is... well...
| bankish? I would have had to take a big gamble hiring most
| of these guys.
| harel wrote:
| Yes, it is their fault, but the organisation didn't even
| attempt to nurture professional development. Stagnation was a
| feature, not a bug. Arguably, these guys were paid very well
| so they would have taken a pay cut anywhere else anyway.
| seanhunter wrote:
| I was the person who first deployed Python at Goldman Sachs. At
| the time it was an "unapproved technology" but the partner in
| charge of my division called me and said (This is literally word
| for word because partners didn't call me every day so I remember)
| Err....hi Sean, It's Armen. Uhh.... So I heard you like
| python.... well if someone was to uhh.... install python on the
| train... they probably wouldn't be fired. Ok bye.
|
| "Installing python on the train" meant pushing it out to all the
| computers globally in the securities division that did risk and
| pricing. Within 30mins every computer in goldman sachs's
| securities division had python and I was the guy responsible for
| keeping the canonical python distribution up to date with the
| right set of modules etc on Linux, Solaris and Windows.
|
| Because it was unapproved technology I had a stream of people
| from technology coming to my desk to say I shouldn't have done
| it. I redirected them to Armen (who was a very important dude
| that everyone was frightened of).
|
| The core engineers from GS went on to build the Athena system at
| JP and the Quartz platform at BAML.
|
| //Edit for grammar
| vba wrote:
| What year was this out of interest?
| seanhunter wrote:
| I think 2002 or 3. I worked there from 2001 to about 2010 and
| it was pretty early in my time there.
| gjvc wrote:
| what version of Python was it?
| stevesimmons wrote:
| I worked on Athena at JPMorgan for 8 years, and loved it.
|
| Seeing Python at the core of trading, risk and post-trade
| processing for Commodities, FX, Credit etc was such a great
| developer experience.
|
| By the time I left JPM, there were 4500 devs globally making
| 20k commits weekly into the Athena codebase. (I did a PyData
| presentation on this [1] for more details).
|
| The one downside was the delayed transition from Py2.7 to 3; I
| left just as that was getting underway.
|
| [1] https://www.youtube.com/watch?v=ZYD9yyMh9Hk
| oogetyboogety wrote:
| That's funny they mentioned replayable financial message
| queues. Those are a hit here
| simonh wrote:
| I worked on Quartz for 3 years and loved it. Some devs grumbled
| about various aspects of it, but I come from an application
| support background and taught myself python, so I suppose I had
| fewer developer habits to un-learn.
|
| From what I understand, all this started with SecDB at Goldman,
| which was the prototype for all these systems but wasn't
| Python based. The lore is that SecDB was instrumental in
| Goldman being able to rapidly figure out what their exposure
| was during the 2008 crisis.
|
| Some of that team, led by Kirat Singh went on to start Athena
| at JP Morgan and then Quartz. I met Kirat once, he was
| considered a bit of a rock star in the bank tech community. He
| now runs Beacon, which is basically Bank Python as a service.
| Paladiamors wrote:
| I work at Beacon.io and it's an awesome place to be. Kirat is
| indeed a rockstar and it's awesome to work with a CEO that
| knows great code. We also landed a Series C last month and
| we're growing :)
|
| https://www.crunchbase.com/funding_round/beacon-series-c--
| 60...
|
| We've also got a bunch of positions open too for those that
| are interested in joining!
|
| https://www.beacon.io/careers/
| Icathian wrote:
| I took a quick look, it seems like all the postings are
| London or New York. What's the feeling internally about
| remote hires? I'm assuming that's still out of fashion in
| finance and Beacon feels the same?
| Paladiamors wrote:
| Because of Covid I've been remote since last year. It's
| still an evolving situation. But things have worked out
| pretty well.
|
| Our CEO was also remote during that time too and here's
| him giving a webinar from a cabin :)
|
| https://www.youtube.com/watch?v=fXPDXbrPdxI&ab_channel=Be
| aco....
| alecco wrote:
| s/rockstar/primadonna
|
| Build complex system, when things start to get messy move
| somewhere else. Rinse, repeat.
| pillefitz wrote:
| As a full stack quant dev who is struggling to go all-in on
| Beacon: Do people generally like to use Glint for complex
| applications? Are benefits of Beacon lost when interfacing
| from e.g. Angular? I'm afraid that professional frontend
| devs might be unwilling to work with a proprietary
| framework, but that's speculation from my side.
| Paladiamors wrote:
| Glint is an integrated framework with the platform but it
| does not limit you to just using the framework. The
| platform is designed to be as extensible as possible,
| worry not about being locked in :)
| seanhunter wrote:
| > From what I understand, all this started with SecDB at
| Goldman, which was the prototype for all these systems but
| wasn't Python based. The lore is that SecDB was instrumental
| in Goldman being able to rapidly figure out what their
| exposure was during the 2008 crisis.
|
| Correct. We used python for a bunch of infrastructure stuff
| (eg distributing all of secdb to all of the places it needed
| to go). The actual pricing and risk was written in Slang with
| a lot of guis that were "written" in slang but actually that
| caused autogeneration of JIT bytecode that was executed by a
| JVM. Most of the heavy lifting behind the scenes was C++. So
| a bit of everything.
| keithalewis wrote:
| My grandpappy always told me to cut out the middleman.
| Modern C++ was heavily influenced by the need to make it
| simple to use directly. If you are in the business of
| writing code instead of reminiscing, you can now leverage
| move semantics, lambdas, and smarter pointers to create
| software that is close to the silicon. Python might be
| great, but it sure is slow. Its success is founded on smart
| people making it easier for not so smart people to call C++
| that does the heavy lifting.
| seanhunter wrote:
| A big force multiplier in the old GS secdb model was
| simply the speed of the dev cycle vs speed of the code.
| As a strat you could push slang changes to pricing and
| risk literally in minutes with full testing, backout,
| object audit logging etc.
|
| C++ changes went out in a 2 week release cycle so changes
| were still fast by most standards but much slower. But
| yeah we had 20m + lines of C++ code so it was extensively
| used.
| seanhunter wrote:
| I know Kirat really well. Fun fact, one two-week dev cycle we
| had 667 distinct developers commit to the secdb code base
| which Kirat's boss described to me as "The number of the
| beast.... plus Kirat"
|
| Second fun thing. Kirat was advocating for lisp for secdb for
| a long time and used to rag on me for liking python when it's
| so slow.
| arthurcolle wrote:
| Had a lot of fun pulling things out of MetaDir and
| recreating what seemed like "early history" of SecDB when I
| was there, was a lot of fun :D
| arthurcolle wrote:
| the spirit of dubno lives on...
| keithalewis wrote:
| But does he? There seem to be a lot of people in the know
| on this thread. He seems to have disappeared after
| leaving BofA.
| breadandbeer wrote:
| deep sea diving was last i heard:
| https://www.youtube.com/watch?v=KKvJrCvIvOc and before
| that he was featured in a wsj article:
| https://www.wsj.com/articles/goldman-sachs-has-started-
| givin...
| noneeeed wrote:
| Interesting to be reading all this about SecDB. About 15
| years ago I was offered a job working on SecDB (I forget
| exactly what the position was now). It and Slang sounded
| really interesting.
|
| I do sometimes regret not taking the job because the people
| there were wickedly sharp and the tech sounded great, but
| in hindsight I'm not sure I would have thrived in a bank
| long term. I did a 3 month internship at Lehman's which I
| enjoyed, but I don't think I'd have suited a career in it.
| One thing I did get out of it was a total lack of fear
| around job interviews, if I could survive the 14 hours of
| interviews at GS and come out with an offer, then I can
| handle pretty much any recruitment process :)
| Maven911 wrote:
| It's amazing how a few people have left such a big mark on
| a part of the investment banking industry. I missed Kirat
| right before exiting BAML but met all his "disciples" and
| Dubno..including his miniature dinosaur and telescope in
| his office :). Very much felt like tech religion where no
| open debate on merits and drawbacks could be discussed. And
| a lot has changed in terms of engineering innovation with
| turnover since that era...
| mleonhard wrote:
| Why is the number of people who "left a big mark" so
| small?
|
| 1. An organization/industry can adopt each new technology
| only once. New technologies arise infrequently. Each time
| they arise, only a few people get to work on the projects
| introducing them. In other words, opportunities to leave
| a big mark are limited.
|
| 2. Credit for innovation is political capital. People
| hoard political capital and become powerful. They act as
| gatekeepers of innovation and take the credit for
| successful projects.
| bostik wrote:
| > _From what I understand, all this started with SecDB at
| Goldman, which was the prototype for all these systems but
| wasn't Python based._
|
| Correct. SecDB has its own language, S-lang, which you could
| probably best describe as a "lispy Smalltalk, with a hefty
| sprinkling of Frankenstein". The concept of SecDB was so good
| that other large banks wanted to get their own. Athena and
| Quartz have been mentioned several times in this thread, by
| people far more knowledgeable than I could ever be.
|
| It's not just banking, I know of at least one large
| pension/insurance company who are building their own version
| of SecDB, with direct lines to GS. (They don't try to hide
| it, btw: the company is Rothesay Life.) The last time I
| talked with them, they were looking for Rust devs.
|
| Disclosure: I work at Beacon.
| odiroot wrote:
| Very interesting story. I'm actually impressed, you're allowed
| to give us such "internals" about GS.
| lanevorockz wrote:
| Python Quants for the win! I was lazy enough to stay with R
| but it was nice of Wes to build pandas, made adoption super
| easy.
| anentropic wrote:
| Ah, now I finally know what it means when I get job ads for
| Python+Quartz ...!
|
| I could tell from the context it was nothing to do with
| QuickTime...
| govg wrote:
| For some context (as an ex-Goldman employee myself), "Armen" in
| the quote is most probably
| https://www.goldmansachs.com/insights/outlook/bios/armen-ava...
| , who has quite a legendary reputation within the firm for the
| work he's done. He was also one of the first to be hired as a
| "strat", which used to be how Goldman referred to its quants
| who sat between front office and tech systems and worked with
| both sides.
| FabHK wrote:
| SecDB/Slang originated around 1992 at the commodity trading
| shop J Aron, which GS had bought in 1981. Later (end of the 90s)
| there was a push to extend it to the rest of the firm, first
| fixed income, then equities. Armen flew the whole world-wide
| strat team to NY and gave a presentation, and to drive the
| point home, he played a clip from StarTrek: "You will be
| assimilated. Resistance is futile!"
| collinmanderson wrote:
| > Prior to joining the firm, Armen was a member of the
| technical staff at Bell Laboratories in Murray Hill.
|
| Wow. Bell Labs too.
| [deleted]
| shrike21 wrote:
| Hello Sean,
|
| I remember Slang when I first saw the code, a parse tree based
| evaluator in 1997. Come on folks. Separate parsing from
| evaluation. Opaque types with message passing. Inference
| anyone? Clearly no one read Hindley-Milner.
|
| Add parse time optimizations, add locals, hey globals and
| locals weren't handled properly. Python in the 90s anyone?
|
| Shitty KV store with 32K object limits related to some random
| OS2 btree limits. Add huge pages.
|
| Deal with random rubbish related to inconsistent non
| transactional indices.
|
| Figure out you should layer nodes in a dag. A dag is
| topologically 2 dimensions, fairly limiting.
|
| Figure out that's somewhat similar to layering databases, it's
| just another dimension.
|
| Hmm bottom up evaluators, you actually need top down as well,
| create a turing complete specification. well, limit it a bit.
|
| Ah KLL points out that layers on top of dimensions end up being
| combinatorial, but you can actually cache the meta traversal
| and it's small n.
|
| Lots of people point out category theory parallels. Haskell is
| pretty but completely unusable. I'm a math guy, and it's still
| unusable, I don't like feeling smart when I do simple things.
|
| But interestingly creating imperative functions with pure
| input/outputs with implied function calls is pretty
| interesting. You can create an OOP paradigm with self and args
| as known roots aka linda tuple spaces.
|
| Ah and each tuple space point can be scheduled independently,
| some issues with serialization and locality...
|
| Go to another bank and choose to use python, foolishly decide
| to rely on pickle. Do that twice. Bad idea.
|
| But write a much better petabyte scale multi-master optimistic
| locking database with 4 operations. Insert, Upsert, Update,
| delete. WAL as a first class API.
|
| Finally decide that writing a coding scheme to convert python
| objects to json is not really hard. And of course cloud native
| and cloud agnostic is really the only way to go nowadays.
|
| I'm always confused why people complain about Athena/Quartz,
| hell we wrote it all, fix it if you don't like it. Open source
| it if you want other people to contribute. If we made stupid
| decisions on pickling data, well there's a version id, add json
| serialization, it's not hard, don't take things as given.
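A minimal Python sketch of the idea in the comment above: encode objects as JSON inside an envelope that carries a version id, so old records can be migrated on read. The `Trade` type, the field names, and the version 1 to 2 migration are all hypothetical, and this is not the actual Athena or Quartz scheme.

```python
import json
from dataclasses import asdict, dataclass

SCHEMA_VERSION = 2  # hypothetical current version of the Trade encoding


@dataclass
class Trade:  # hypothetical example type
    instrument: str
    quantity: int
    price: float


def encode(trade: Trade) -> str:
    """Wrap the object's fields in an envelope with a type name and version."""
    envelope = {"type": "Trade", "version": SCHEMA_VERSION, "data": asdict(trade)}
    return json.dumps(envelope)


def decode(payload: str) -> Trade:
    """Rebuild a Trade, migrating older versions to the current layout."""
    envelope = json.loads(payload)
    if envelope["type"] != "Trade":
        raise ValueError(f"unexpected type: {envelope['type']!r}")
    data = dict(envelope["data"])
    if envelope["version"] == 1:
        # Hypothetical migration: version 1 stored the quantity under "qty".
        data["quantity"] = data.pop("qty")
    elif envelope["version"] != SCHEMA_VERSION:
        raise ValueError(f"unsupported version: {envelope['version']!r}")
    return Trade(**data)


trade = Trade("XAU", 10, 1792.5)
round_trip = decode(encode(trade))

# A record written under the hypothetical version 1 layout still loads.
old_payload = (
    '{"type": "Trade", "version": 1, '
    '"data": {"instrument": "XAU", "qty": 10, "price": 1792.5}}'
)
migrated = decode(old_payload)
```

Unlike unpickling, parsing JSON does not execute code from the payload, and the stored records stay readable from other languages. The cost is that every type needs an explicit encoding and a migration path per version.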
| rbanffy wrote:
| > foolishly decide to rely on pickle.
|
| This scared me a bit TBH. It's one of those decisions that
| come back to bite us repeatedly.
| tikkabhuna wrote:
| I joined the BAML grad scheme 10 or so years ago. We had a
| presentation from one of the Quartz guys and someone asked how
| they'd manage upgrading the version of Python. They were using
| something like 2.6.5. The whole move to 3.x was a thing. The
| Quartz guy just flat out said they wouldn't upgrade.
|
| Seemed crazy to a new grad back then, but now I wouldn't want
| to consider it either.
|
| Thanks for your contribution! It was amazing that even in my
| role where I didn't use Quartz, I could see and search all the
| code. Felt quite novel back then.
| lozenge wrote:
| BAML(Quartz) and JPM(Athena) both had Python 3 migrations
| well underway as of PyCon UK 2019.
|
| It took me more than half way down the article to identify I
| have not worked at the same bank as the author...
| [deleted]
| keithalewis wrote:
| A friend I've known since I first started on Wall Street now
| rides herd on the BofA Quartz libraries. One component of his
| job is to make developers aware of existing libraries they
| can use to solve business problems instead of reinventing the
| wheel. His theory of why they always have excuses not to do
| that is that they have no training in software development.
| They are still at the point in their learning where they are
| just excited that they can press keys on a computer and get
| it to do things they are barely able to understand.
| helsinki wrote:
| Ha, cool. I learned Python on your distribution ca. 2014.
| DrBazza wrote:
| There was no single person that introduced python at Citigroup
| that I am aware of. It came in via a variety of teams mostly
| because of the fact that the alternative was perl, and no one
| wanted to write perl (yet somehow kdb was acceptable a few
| years later).
| tomrod wrote:
| Beautiful!
|
| I worked a bit with Athena and was in the group that got
| Anaconda into the banking sector.
|
| Small world we live in. I think I remember your name from JPMC
| days, but it has been awhile.
| JoshTriplett wrote:
| There are a number of workplaces where I'd have been willing to
| rely on "probably wouldn't be fired", but a bank is definitely
| not one of them. Congratulations on shipping something useful
| in the face of that risk and uncertainty.
| simonh wrote:
| The trading community walk on a knife edge all the time, it's
| not a place for the faint of heart. I used to support
| derivatives trading systems and a few times there were issues
| that meant they'd lost control of orders on an exchange.
| Scary stuff. It requires a crazy mixture of careful,
| deliberate, calculated risk control on the one hand; but once
| you commit to something you jump in with both feet and throw
| everything into it.
|
| You need to be both meticulously risk averse, and also
| willing to do whatever needs to be done when it needs doing,
| and accept responsibility. It was great!
| gjvc wrote:
| An excellent description.
| hdjjhhvvhga wrote:
| > but a bank is definitely not one of them
|
| Investment banks are basically risk-management shops. The
| partner made an assessment and evaluated the potential
| benefits as higher than risks. Note the word "probably".
| qwhelan wrote:
| Also worth mentioning that "unapproved software" on bank
| infrastructure is what an aggressive prosecutor would call
| "felony bank hacking".
| AmericanBlarney wrote:
| Almost definitely false. Provided you weren't doing
| anything intentionally malicious with it, the risk would
| be that regulators might fine the bank for inadequate
| controls. As such, the bank might fire you for doing
| something that could lead to such a situation, but I
| don't see a criminal charge. There was actually quite a
| decent bit of "unapproved" software in use at one of the
| banks I worked in - mostly stuff that was in the process
| for approval, but that could take forever, so it was
| reasonably common for teams to run through the checks
| themselves (security scan, license review, etc) and move
| forward while the official review confirmed no issues.
| qwhelan wrote:
| Well the login message I was greeted with on every ssh
| connection certainly threatened criminal prosecution for
| unapproved software at the extremely large bank I worked
| at.
|
| Unlikely? Sure. But a lawyer somewhere thought it was
| worth reminding me 10x/day, so going to assume it's
| possible provided your unauthorized software caused a
| serious monetary loss.
| op00to wrote:
| Investment banks at that time were a little bit different.
| sizzle wrote:
| Curious why would this high-powered person go to bat for a
| technology decision they didn't seem to have done any risk
| assessment of? Wouldn't he be liable if something was exploited
| and hurt the company, like his head would be on the chopping
| block for giving the go ahead when they traced it up the chain
| of command?
| seanhunter wrote:
| That's precisely why he did it the way he did. He had total
| deniability. Here's how the conversation would go if it went
| wrong somehow:
|
| "Armen, did you tell Sean to install python?"
|
| "No".
|
| "Sean, did Armen tell you to install python?"
|
| "Err... no. He said I probably wouldn't be fired."
|
| "Well it turns out he's not right about everything. Here's a
| cardboard box for your things."
| Atothe4 wrote:
| Nah, I don't think that's why he did it.
|
| You say he reached out to you and asked if you liked
| Python. He probably wanted to roll out Python and was
| looking for someone who wanted to do it. If he told someone
| to do something they weren't passionate about, they would
| fail. He wanted to make sure it succeeded, so he reached
| out to you.
|
| If he's such a bigshot and everyone was frightened of him,
| he must not have been afraid of them. When he said, "you
| wouldn't get fired," he probably meant what he said. He was
| giving you air cover. And it worked. When the gnomes came
| out after you, you just sent them to him. And they didn't
| bother you again.
|
| I can imagine how the conversation went:
|
| "Armen, did you tell Sean to install Python"
|
| "No, did he"?
|
| "Yes he did!"
|
| "Great!!"
|
| Now the gnomes are on the back foot and have to defend why
| Sean shouldn't install Python. If this guy Armen told you to
| do it, Armen has to defend himself to them.
| seanhunter wrote:
| Hehe. Well I guess you would have a unique insight into
| his thought process. ;-) But yes indeed that's certainly
| another explanation and it did indeed work that way.
| Atothe4 wrote:
| Air cover is crucial, but you can't take land without
| good boots on the ground ;)
| seanhunter wrote:
| Well thanks for the air cover, and for all the other
| opportunities you provided for me and others at GS. I
| really appreciate it. It was an amazing time and I
| learned a great deal.
| simonh wrote:
| This is only my opinion, but I think the reason Armen said
| it like he did was because by not making it an order he's
| giving Sean the option of not doing it, if he's not up for
| accepting the risk. However the risk was both of them could
| have got fired.
|
| Armen must have known people would know Python had been put
| on these machines and that he authorised it, in fact what's
| the point of putting it on them if nobody knows and nobody
| uses it? I can guarantee you that within 24 hours someone
| was asking Armen why he'd authorised this and was
| justifying it. There cannot have been any possibility of
| dodging responsibility for this decision. If anyone got
| fired it would have been Armen, with a possibility of Sean
| going as collateral damage.
|
| This is the big league. You make your decisions and you
| accept responsibility for them.
| sizzle wrote:
| Exactly, so what is this Armen character getting out of
| this other than a potentially big amount of liability and
| unarticulated risk?
|
| The OP said he told people openly that Armen told him he
| could do it when asked.
|
| This makes no sense to me, what's the upside to Armen? If
| he is business savvy, he needs to be gaining something in
| exchange for having his name thrown around by OP as
| signing off on this.
| simonh wrote:
| He's doing his job, which is to ensure people have the
| tools and resources available to do their jobs. You know,
| furthering the goals of the organisation.
| tim333 wrote:
| Guess the Python was useful? From an FT article:
|
| >In 2011 Goldman Sachs put its top computer wizard, Armen
| Avanessians, in charge of the division. He has helped
| turn round its fortunes. The arm's assets under
| management reached a nadir of $38bn in 2012, but it now
| manages $91.8bn...
| sizzle wrote:
| Wow hope OP got a chunk of that!
| Sunspark wrote:
| It's a bank, so probably his reward for good performance
| was 10k at Christmas.
| dustintrex wrote:
| You may have dropped a couple of zeroes there.
| Sunspark wrote:
| Maybe that bank is more generous. The one I worked at
| begrudgingly counted out the pennies like it was coming
| out of the war orphans fund or something.
| boringg wrote:
| Hahahah 10k maybe if he was a new intern. Off by an order
| of magnitude
| sswaner wrote:
| Money or less risk of losing money. Partners originate
| deals and/or manage risk. He was looking for accelerated
| process to make informed decisions.
| smallnamespace wrote:
| Armen was very passionate about the value of "strats" (GS's own
| term for "quants", and later broadened to include software
| engineers and data scientists).
|
| A favorite quip of his: At GS, I'm like an arms
| dealer. When a desk has a problem, I send in the strats, and
| they blow away all the competition!
|
| Also, SecDB's core idea is not _just_ tight integration between
| the backend and development environment, but that all objects
| ("Security" in SecDB lingo) were functionally reactive.
|
| For example, you would write a pure function that defines:
| Price(Security) = F(stock price, interest rate, ...)
|
| When the market inputs changed, Price(Security) would
| automatically update (the framework handled the fiddly bits of
| caching intermediate values for you, so even an expensive Price
| function is not problematic).
|
| This is loosely the same idea that drives React, ObservableHQ,
| Kafka, and other event-streaming architectures, but I first
| encountered this ~15 years ago at a bank.
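|
| A minimal sketch of that idea in plain Python, assuming a toy
| in-process graph. The names Graph, set_input, define, and get are
| hypothetical and are not SecDB's API. A derived node is a pure
| function of its dependencies, its value is cached, and changing an
| input invalidates only the nodes downstream of it:

```python
class Graph:
    """Toy dependency graph: inputs are plain values, derived nodes are pure functions."""

    def __init__(self):
        self._inputs = {}      # name -> value
        self._formulas = {}    # name -> (function, tuple of dependency names)
        self._cache = {}       # name -> cached derived value
        self.evaluations = 0   # counts how often a formula actually ran

    def set_input(self, name, value):
        self._inputs[name] = value
        self._invalidate(name)

    def define(self, name, func, deps):
        self._formulas[name] = (func, tuple(deps))
        self._invalidate(name)

    def _invalidate(self, name):
        # Drop the cached value for `name`, then for every cached node downstream of it.
        self._cache.pop(name, None)
        for node, (_, deps) in self._formulas.items():
            if name in deps and node in self._cache:
                self._invalidate(node)

    def get(self, name):
        if name in self._inputs:
            return self._inputs[name]
        if name not in self._cache:
            func, deps = self._formulas[name]
            self.evaluations += 1
            self._cache[name] = func(*(self.get(d) for d in deps))
        return self._cache[name]


g = Graph()
g.set_input("stock_price", 100.0)
g.set_input("interest_rate", 0.05)
# Hypothetical pricing formula, chosen only to have something to recompute.
g.define("price", lambda s, r: s * (1 + r), deps=["stock_price", "interest_rate"])

first = g.get("price")             # computed
again = g.get("price")             # served from the cache
g.set_input("stock_price", 200.0)  # invalidates "price"
updated = g.get("price")           # recomputed on demand
```

| The formula ran twice here, not three times, which is the caching
| part. A real system would also have to handle distribution and
| persistence, which this sketch ignores entirely.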
| barrkel wrote:
| It's as old as VisiCalc, it's how spreadsheets work.
|
| I built a similarly reactive system for web UI binding back
| in 2004, running binding expressions on the back end with
| cached UI state to compute the minimal update to return to
| the front end, in the form of attributes to set on named
| elements.
| AmericanBlarney wrote:
| Yes, although doing it in distributed fashion at the scale
| of SecDB or Athena introduces quite a few more
| complexities.
| chestervonwinch wrote:
| > This is loosely the same idea that drives React,
| ObservableHQ, Kafka, and other event-streaming architectures,
| but I first encountered this ~15 years ago at a bank.
|
| See also the "observer pattern" [0]. It's a fun exercise to
| implement a reactive system in Python using the descriptor
| protocol [1]. IPython's traitlets library is an example of
| this in the wild [2].
|
| [0]: https://en.wikipedia.org/wiki/Observer_pattern
|
| [1]: https://docs.python.org/3/howto/descriptor.html
|
| [2]: https://github.com/ipython/traitlets
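|
| As a rough sketch of that exercise (my own toy version, not how
| traitlets is implemented), a data descriptor can intercept
| attribute writes and notify callbacks. Observable, Quote, and
| observe are hypothetical names:

```python
class Observable:
    """Data descriptor that calls registered callbacks when the attribute changes."""

    def __set_name__(self, owner, name):
        self.name = name
        self.private = "_" + name   # where the real value lives on the instance

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return getattr(obj, self.private, None)

    def __set__(self, obj, value):
        old = getattr(obj, self.private, None)
        setattr(obj, self.private, value)
        if old != value:
            # Only notify when the value actually changed.
            for callback in obj.__dict__.get("_observers", {}).get(self.name, []):
                callback(obj, old, value)


class Quote:
    price = Observable()

    def observe(self, name, callback):
        self.__dict__.setdefault("_observers", {}).setdefault(name, []).append(callback)


changes = []
q = Quote()
q.observe("price", lambda obj, old, new: changes.append((old, new)))
q.price = 10
q.price = 10   # unchanged value, so no notification
q.price = 12
```

| After this runs, changes holds one entry per real change of price.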
| jb_s wrote:
| This immediately struck me when I was reading this article.
|
| To be honest, this whole paradigm seems absurdly fucking
| efficient for the developers. But I wonder about stuff like
|
| * What happens if the data model needs to change? If you need
| to move something from db["some/path"]?
|
| * How is it coordinated at a larger scale, how does everyone
| know what is running and how it interacts with everything
| else - can you figure out what depends on an object? What if
| the data used by your Price(Security) object changes and
| breaks it?
| lmm wrote:
| > What happens if the data model needs to change?
|
| You write conversions and there's a registry where you
| register them to be picked up by the unpickler. If
| necessary you can also customize the logic that determines
| which version a given pickled datum uses to deserialize.
| There aren't so many guardrails when you're writing that
| stuff, but the infrastructure does its best to support you.
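|
| Roughly the shape of it, as a sketch with made-up names (the
| CONVERTERS registry, the converter decorator, and the record
| layout are all assumptions for illustration, not Minerva's actual
| API). Each stored datum carries a version, and load() walks it
| forward one registered conversion at a time:

```python
import pickle

CONVERTERS = {}                    # (type name, from_version) -> upgrade function
CURRENT_VERSION = {"Trade": 3}     # hypothetical current schema version per type


def converter(type_name, from_version):
    """Register a function that upgrades a state dict by exactly one version."""
    def register(func):
        CONVERTERS[(type_name, from_version)] = func
        return func
    return register


@converter("Trade", 1)
def add_currency(state):
    # Version 1 had no currency field; assume USD for old records.
    return {**state, "currency": "USD"}


@converter("Trade", 2)
def rename_qty(state):
    state = dict(state)
    state["quantity"] = state.pop("qty")
    return state


def load(blob):
    record = pickle.loads(blob)
    type_name, version, state = record["type"], record["version"], record["state"]
    # Apply conversions step by step until the state matches the current version.
    while version < CURRENT_VERSION[type_name]:
        state = CONVERTERS[(type_name, version)](state)
        version += 1
    return state


old_blob = pickle.dumps({"type": "Trade", "version": 1, "state": {"qty": 10}})
upgraded = load(old_blob)
```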
|
| > If you need to move something from db["some/path"]?
|
| There's support for both symlinks (db["some/path"] ->
| db["other/path"]) and for a kind of hardlink by making both
| paths point to the same inode-line id. You can usually find
| a way to do what you need to.
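|
| A toy version of those two kinds of link, assuming an in-memory
| store. Store, symlink, and hardlink are hypothetical names, and the
| real database is distributed, which this is not:

```python
class Store:
    """Toy path-keyed store with symlinks and hardlinks."""

    def __init__(self):
        self._paths = {}    # path -> ("inode", id) or ("symlink", target path)
        self._inodes = {}   # id -> stored value
        self._next_id = 0

    def _resolve(self, path):
        # Follow symlinks until we reach a path that points at an inode id.
        seen = set()
        while True:
            kind, ref = self._paths[path]
            if kind == "inode":
                return ref
            if path in seen:
                raise KeyError(f"symlink loop at {path}")
            seen.add(path)
            path = ref

    def __getitem__(self, path):
        return self._inodes[self._resolve(path)]

    def __setitem__(self, path, value):
        if path in self._paths:
            inode = self._resolve(path)   # write through existing links
        else:
            inode = self._next_id
            self._next_id += 1
            self._paths[path] = ("inode", inode)
        self._inodes[inode] = value

    def symlink(self, new_path, target_path):
        self._paths[new_path] = ("symlink", target_path)

    def hardlink(self, new_path, existing_path):
        self._paths[new_path] = ("inode", self._resolve(existing_path))


db = Store()
db["other/path"] = {"notional": 1_000_000}   # the data's new home
db.symlink("some/path", "other/path")        # the old path keeps working
db.hardlink("alias/path", "other/path")      # a second name for the same inode
db["some/path"] = {"notional": 2_000_000}    # a write through the symlink
```

| All three paths now read back the same updated value.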
|
| > How is it coordinated at a larger scale, how does
| everyone know what is running and how it interacts with
| everything else - can you figure out what depends on an
| object? What if the data used by your Price(Security)
| object changes and breaks it?
|
| There's a common model for the things that are shared, and
| that has a versioning and release/deprecation cycle.
| Otherwise every type has an owner and you probably had to
| request their permissions to read their data, so you should
| have a channel of communication with them. But yeah people
| do rely on the fundamental business entities not changing
| too quickly, and things do break when changes are made.
| KMag wrote:
| There's also a graph debugger that allows you to step
| through the dependency graph node-by-node across the
| various globally distributed databases.
| lmm wrote:
| True but not really helpful for this problem, because it
| can only tell you about the job you're debugging, whereas
| what you want to know is what code might ever depend on
| that data.
| jb_s wrote:
| Thanks! Interesting stuff.
| victor106 wrote:
| If you don't mind me asking which year was this?
| mirko22 wrote:
| Oh wow, I remember Quartz at BAML... Though this was several
| years after initial deployment and when core devs left.
|
| One day I will sit down and write a small poem about the
| insanity of software development based on my experience with
| Quartz. It will be an intriguing story of love and hate being
| told through a sensual dance between sales and engineering. The
| battles will not be epic but the consequences of one's actions
| will be far reaching.
|
| It was indeed an experience worth having.
| Maven911 wrote:
| Tell me more about what worked and didn't..I recall the pain
| of watching QzDesktop load and Bob/HUGS jobs failing..but
| what else, and what did you enjoy
| mirko22 wrote:
| All great stories use mundane circumstances in life to
| convey much deeper and abstract ideas.
|
| One such story, I humbly hope, would be my poem.
|
| Yes the b00bs were all over the place as they are in real
| life, and in my opinion how they got there is as
| interesting as real life.
|
| Original people solving the original problem had a good
| understanding of what they are dealing with.
|
| However with subsequent generations it became a monkey
| problem https://m.youtube.com/watch?v=5QuwPeH9P7Y
|
| When all you have is a DAG all solutions end up with sales
| person coming at your door problem. But years down the line
| we didn't have traveling salesman problem but out of date
| libraries problem.
|
| The monkeys tried to make their risk taking more safe by
| introducing all kinds of random constraints, not realising
| that what gave power to the whole idea was actually risk
| taking.
|
| And no, they didn't try to manage risk but completely
| failed to understand that risk is what made the product good
| in the first place.
|
| And this is how every great idea in human history fell
| apart: the original people had a different view of the current
| problem, however the following generations only understood
| the simplified problem, and down the line the solution to the
| original problem became the actual problem.
|
| I still believe the poem would do greater justice to the
| whole idea, but what I tried to explain here is that the
| simple things we have all witnessed actually hide much
| deeper truths about life in general.
|
| And from there I assume QzDesktop loading times were not
| the issue at the beginning of this deterministic chaotic
| system, but simply one of the possible generational
| products.
|
| I am yet to understand how to solve the monkey issue.
| arthurcolle wrote:
| I worked for GSAM for a bit as my first NAPA project - I guess
| you're referring to Armen Avanessians? Haha haven't heard
| references to the "train" in so long. Did you ever do any
| Slang/SecDB dev? I was mostly in FICC Tech so was pretty much
| slangin' slang most of my time there.
|
| JSI (Java Slang Integration) was just getting off the ground
| but there wasn't too much for the front office tech teams to do
| there until it was to mature in the coming years.
|
| Good times, thanks for sharing the history ;)
| seanhunter wrote:
| Yes, that's who I was referring to. I did absolutely tons of
| slang and a fair amount of TSecdb as well as a lot of work on
| the C++ infra and the build and distribution code.
| helsinki wrote:
| My former team, previously known as AIM, wrote JSI :)
| mumblemumble wrote:
| > There is an uncharitable view (sometimes expressed internally
| too) that Minerva as a whole is a grand exercise in NIH syndrome.
|
| My brief experience with this (in an adjacent area - proprietary
| trading) was that the more charitable view is that these firms
| need to be able to fully own their software stacks, and have the
| resources to pay for that luxury.
|
| Reading these descriptions from the article, I can't help drawing
| a connection to the Smalltalk ecosystem. It sounds like, to at
| least some extent, what these banks have built is a system that
| exhibits many of the more interesting characteristics of an
| enterprise Smalltalk system, only on top of a tech stack that
| they could own from top to bottom.
| igouy wrote:
| https://news.ycombinator.com/item?id=23826376
| carnitine wrote:
| Always funny to see the objections of new hires without finance
| experience to the use of floating point for pricing. It's more
| related to the inherent inaccuracy of any pricing model though,
| rather than clients not caring about pennies.
| Chris2048 wrote:
| Can cause issues if you are accounting for things, e.g.
|
| "The sum of these values need to equal the sum of these values"
|
| In that case you'd then need to avoid
|
| "sum_a == sum_b"
|
| and use instead
|
| "abs(sum_a - sum_b) < SOME_SMALL_VALUE"
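|
| In Python that comparison is available as math.isclose. A small
| sketch with made-up payment values; the tolerances are
| illustrative assumptions and would need to be chosen per use case:

```python
import math

# Add ten payments of 0.1 one at a time, as a naive ledger loop would.
sum_a = 0.0
for payment in [0.1] * 10:
    sum_a += payment
sum_b = 1.0

exact_match = sum_a == sum_b   # False: rounding error accumulated in sum_a
close_match = math.isclose(sum_a, sum_b, rel_tol=1e-9, abs_tol=1e-12)   # True
```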
| carnitine wrote:
| You don't do that though, there's no use case to account with
| the output of a model.
| Chris2048 wrote:
| I'm not sure what you mean - aside from speculative pricing
| models there are regulatory constraints too, that _are_ a
| part of the same codebase.
|
| Not to mention that there is a use case for auditing
| pricing models (as in external requirement, or internally)
| or comparing alternative models.
| chewbaxxa wrote:
| Sure we add up the numbers, but there are thresholds for
| everything anyway. We might not be concerned about
| something within a 1MM range let alone a floating point
| inaccuracy. The uncertainty is accounted for already.
| JackFr wrote:
| I remember someone demanding that we needed to run our Monte
| Carlo pricer on 1024 paths, that 256 just wasn't precise enough
| and one of the risk guys said "Well since we know our
| assumptions are wrong I'm not sure what difference it makes."
| brazzy wrote:
| The difference between precision and accuracy in a nutshell.
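|
| To put hypothetical numbers on it: the statistical error of a
| Monte Carlo estimate shrinks like 1/sqrt(N), so going from 256 to
| 1024 paths halves it, while any bias from wrong assumptions stays
| exactly where it was. The stdev and bias below are invented for
| illustration:

```python
import math


def standard_error(path_stdev, n_paths):
    """Statistical (precision) error of a Monte Carlo mean over n_paths paths."""
    return path_stdev / math.sqrt(n_paths)


path_stdev = 20.0   # hypothetical per-path standard deviation of the payoff
model_bias = 3.0    # hypothetical error caused by wrong model assumptions

errors = {n: standard_error(path_stdev, n) for n in (256, 1024)}
# errors[256] is 1.25 and errors[1024] is 0.625; model_bias is 3.0 in both cases.
```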
| brazzy wrote:
| Exactly. Floating point is inappropriate for anything related
| to accounting (payments, balances, etc.), but when the numbers
| you're producing are effectively forecasts or estimates, it's
| no different than what floats were originally invented for
| (numerical modelling in physics and engineering).
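|
| For the accounting side, a short sketch using Python's decimal
| module. The amounts and the half-up rounding rule are assumptions
| for illustration; real rounding rules depend on the product and
| jurisdiction:

```python
from decimal import Decimal, ROUND_HALF_UP

# Ten payments of 0.10 sum to exactly 1.00 when held as decimals.
payments = [Decimal("0.10")] * 10
balance = sum(payments, Decimal("0"))
balance_is_exact = balance == Decimal("1.00")

# Round an amount to whole cents with an explicit rounding rule.
interest = Decimal("1234.565").quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```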
| forgotmyoldacc wrote:
| Really interesting read. From what's described: Walpole
| (distributed job runner), Dagger (DAG that recalculates when
| dependencies change), Barbara (global key value store) and
| monorepo/fast deployment, it's not so different from some big tech
| companies.
| timkpaine wrote:
| You can see how it's tied together in this presentation
| https://youtu.be/M9o9SF5-Pzw
| databag wrote:
| At JPM, Athena:
|
| Walpole = Bob job runner
| Dagger = pixies
| Barbara = hydra
|
| The python 3 migration is still ongoing
| markus_zhang wrote:
| Jesus I can imagine how difficult it is to migrate the whole
| solution to Python3.
| tstordyallison wrote:
| Indeed.
|
| Thankfully, I think it can be said the light is now
| strongly visible at the end of the tunnel on Python 3 (but
| it's not completely done).
| sfgweilr4f wrote:
| I can see the benefits of this collection of tools within an all-
| in-one monolith. Ease of deployment is a big benefit. I can also
| see the costs. As a stack it's probably better in some ways than
| how a lot of other businesses operate as well as worse. There's
| probably a lot both ways.
|
| The mainframe mindset might be a factor here as well. The giant
| mainframe where all the magic happens is still a thing to behold
| and this is definitely part of banking's history and present.
| Mainframes are beasts and are still far from any kind of
| obsolescence. A monolithic Bank Python with a standardised set of
| libraries etc would slot right in to that mindset and way of
| thinking.
|
| The part about programming languages frequently not having tables
| is interesting. The closest as mentioned is the hash, but you
| lose so much in that abstraction eg the relational aspects. The
| counter argument then becomes the obvious: why aren't you using a
| database library, or in a pinch, sqlite? Rightly so. Why would
| you add relational tables to python rather than have a generic
| python database spec or a collection of database connector
| libraries. Databases are separate and large projects in
| themselves.
|
| I'd still be overly disturbed if they were running some old
| python 2.5 or similar. Just saying. That would be a source of
| pity.
| Zababa wrote:
| > The part about programming languages frequently not having
| tables is interesting. The closest as mentioned is the hash,
| but you lose so much in that abstraction eg the relational
| aspects. The counter argument then becomes the obvious: why
| aren't you using a database library, or in a pinch, sqlite?
| Rightly so. Why would you add relational tables to python
| rather than have a generic python database spec or a collection
| of database connector libraries. Databases are separate and
| large projects in themselves.
|
| This is covered in the article, in the distinction between
| "code-first" and "data-first". Databases means that you leave
| the interaction with data to a third party, and the only thing
| you do is send commands and receive results. This is very
| different from having all the data in your program, and
| starting from that. I'm not sure if "code-first" is the right
| word from it. Perhaps another way to put it would be that when
| data is the most important thing, you don't want to encapsulate
| it in a "database object", you want it to be right here.
| lmm wrote:
| > The part about programming languages frequently not having
| tables is interesting. The closest as mentioned is the hash,
| but you lose so much in that abstraction eg the relational
| aspects. The counter argument then becomes the obvious: why
| aren't you using a database library, or in a pinch, sqlite?
| Rightly so. Why would you add relational tables to python
| rather than have a generic python database spec or a collection
| of database connector libraries. Databases are separate and
| large projects in themselves.
|
| The separate datastore is the problem to be solved here -
| databases, especially relational databases, are extremely
| poorly integrated into programming languages and this makes it
| really painful to develop anything that uses them. You can just
| about use them as a place to dump serialized data to and from
| (not suitable for large systems because they're not properly
| distributed), but if you actually want to operate on data you
| need it to be in memory where you're running the code and you
| want it to be tightly integrated with your language and IDE and
| so on.
|
| (It's not even the main benefit, but just as an example of that
| kind of integration, when you're querying large datasets
| Minerva works a bit like Hadoop in that it will ship your code
| to where the data is and run it there)
| ttyprintk wrote:
| The first-blush conversion from Excel to this ecosystem only
| needs lookup tables. Excel has some static database I/O, but
| people who only know Excel use it as data input for lookup
| tables.
|
| The Python results of that first conversion need to test
| against Excel, so it'll have identical lookup tables.
| int_19h wrote:
| Funny thing is, databases _were_ tightly integrated into
| programming languages all the way back in the 80s - that's
| exactly what dBase was, and why it became so popular.
| FoxBASE/FoxPro, Clipper, Paradox etc were all similar in that
| respect.
|
| And yes, it made for some very powerful high-level tooling. I
| actually learned to code on FoxPro for DOS, and the
| efficiency with which you could crank out even fairly
| complicated line-of-business data-centric apps was amazing,
| and is not something I've seen in any tech stack since.
| pphysch wrote:
| > The separate datastore is the problem to be solved here -
| databases, especially relational databases, are extremely
| poorly integrated into programming languages and this makes
| it really painful to develop anything that uses them.
|
| Hence "Active Record" ORMs like Rails and Django being highly
| successful. They functionally embed the RDBMS into the
| language/app (almost literally if using SQlite), which is a
| huge boon for developer productivity...
|
| ...but also a significant footgun, because it means the
| database is now effectively owned by the Active Record ORM
| and its (SWE) team, and not by some app-agnostic data team.
|
| Want to reuse that juicy clean data managed by Django? Write
| a REST API driven by the app; don't try to access the data
| directly over SQL, although it may be tempting.
| lmm wrote:
| > Hence "Active Record" ORMs like Rails and Django being
| highly successful. They functionally embed the RDBMS into
| the language/app (almost literally if using SQlite), which
| is a huge boon for developer productivity...
|
| Right, those are a step in the right direction, but still a
| lot more cumbersome than properly integrating your
| datastore with your application.
| AmericanBlarney wrote:
| I knew a few people who quit Goldman's because of Slang, although
| the reactive graph portion of SecDB carried over around the
| industry.
| webmaven wrote:
| Huh. A lot of Bank Python sounds eerily similar to Zope related
| tech:
|
| 'Barbara' (which I suspect is JP Morgan's Hydra) sounds like a
| brutalist version of the ZODB created with scalability in mind.
|
| The hierarchical overlays sound like Zope's Acquisition and
| Containment.
|
| Even storing (and running) code in the object database has its
| equivalent in Zope's Python Script Objects.
| lyptt wrote:
| Reminds me of when I interviewed for an internship at Fidessa in
| London back in 2011-ish. I remember the team lead talking about
| an in-house programming language they used called FidessaC which
| used a mixture of C and SQL syntax.
|
| Seems like a lot of the banking world like to invent their own
| tech stacks.
| throwawayQuartz wrote:
| Sounds like the Quartz platform at Bank of America. When I
| interviewed with that team they joyfully espoused the virtues of
| their Principal Engineer who quote, "created the database
| software from scratch!"
|
| Edit: For the record, I implement third-party vendor Excel
| functions (DLLs written in C++) in C# and it's a great way to
| send useless processes to the shadow realm.
| dash2 wrote:
| > In order to deploy your app outside of Minerva you now need to
| know something about k8s, or Cloud Formation, or Terraform. This
| is a skillset so distinct from that of a normal programmer (let
| alone a financial modeller) that there is no overlap.
|
| This rang a bell. How did deployment become such an arcane skill?
| toyg wrote:
| 1. Sysadmins had to find new careers after cloud providers
| destroyed their livelihood.
|
| 2. cloud providers try very hard to lock you in, by offering
| all sorts of advanced goodies. They tend to come with a
| learning curve, and they all accumulate. Sooner or later
| someone comes up with cross-provider solutions, and they too
| have learning curves.
|
| 3. inventing new ecosystems means creating new work for
| advocates and ninjas. You don't become a rockstar by diligently
| doing what has been done before, but by finding (or inventing)
| a niche and becoming a guru.
|
| 4. some problems are indeed hard to solve, and the more
| products try to do that, the more complex they get.
|
| 5. everyone thinks they will have Facebook-scale problems, even
| when they never will.
| DrBazza wrote:
| > How did deployment become such an arcane skill?
|
| Prior to industry wide "solutions" such as Docker, or rather
| linux namespaces and cgroups, there was no obvious process
| isolation, so "deployment" was copying tarballs or using
| Windows installers.
|
| Also, investment banks want "support" from vendors, so hardware
| was either Windows servers (mostly for Exchange and AD), or Sun
| Solaris boxes.
|
| So although linux cgroups came around 2006 (?) and namespaces
| in 2001 (?), banks didn't do too much with linux until after
| 2005 (when Redhat were providing the aforementioned 'support').
| I don't think the 'industry' widely recognised the potential of
| cgroups and namespaces.
| di4na wrote:
| It is not in the value stream.
|
| Our job as sold by the zeitgeist is to write code for features.
| Fix bugs. And run production.
|
| The logistical part in the middle, how to get the code from
| commit to prod, is owned by no one and is not considered worth a
| budget.
|
| Therein lies the reason we have tools to configure prod, but
| nearly no tool to deploy code. Even docker and k8s dodge that.
| eb0la wrote:
| There are too many factors here.
|
| One of them is you don't own where your code runs anymore.
|
| This might scare some people until they realize they can get
| high availability without waiting 3 months for some server to
| arrive, but it makes deployment harder.
|
| I still believe the main reason people adopt CI/CD today is
| that "suddenly" deployment in a complex environment becomes
| easy and software gets tested. A lot.
| globular-toast wrote:
| > One of them is you don't own where your code runs anymore.
|
| That's not new. The first money I earnt from computers was
| making websites. We didn't own the webservers; we rented
| webspace from some company. To deploy it, we literally just
| uploaded PHP files to a place using an FTP client. It was
| that simple and it worked.
| tomrod wrote:
| Your example highlights the value chain of CI/CD!
| blitzar wrote:
| Nahh mate you want to talk to the facilities team, they deal
| with that sort of stuff.
| twic wrote:
| > Clubbing it all together
|
| Wherever the author works, they employ a lot of Indians!
| streamofdigits wrote:
| > Investment banks have a one-way approach to open source
| software: (some of) it can come in, but none of it can go out
|
| I'd say that's how they see society and their role in it more
| generally. Doing God's work is a surprisingly directed graph. It
| applies also to the broader banking world, but investment banks
| being the most lucrative (when not bailed out) attract talented
| people that in principle can close the loop and return something
| important back (much more so than the "sleepy" commercial bank or
| credit union).
|
| It would be interesting if the above (potentially biased) view could
| be backed up by computing an _open source leech ratio_ per
| industry sector. The amount of open source code used versus the
| amount contributed.
|
| NB: a high leech ratio does not necessarily make you the worst
| offender. If your business model is evil no amount of open source
| contribution will wash it out
| randomcarbloke wrote:
| it's also not entirely true, I know of some hedge funds that
| have made significant contributions to open source codebases
| (including Python)
| streamofdigits wrote:
| agree, it's not entirely true but if you look at the size of
| the financial industry (like > 10% of global GDP) its
| contribution is tiny.
|
| actually besides the tech industry itself I can only think of
| the bio/medical industry being an important contributor (E.g
| the entire R ecosystem)
| esonoi wrote:
| Not entirely true. Two things to consider:
|
| 1. Public sentiment. When Goldman Sachs open sourced their
| collections library on GitHub, it gained marginal traction
| (opinion: seemed more about PR to attract tech talent). When it
| was adopted by the Eclipse Foundation, usage rose by a
| non-trivial amount (based on usage stats from mvnrepository that
| aren't other eclipse projects).
|
| 2. There was a massive hiring frenzy, and their due diligence
| regarding IP was non-existent. Garden leave doesn't compensate
| for 'strategic' systems. Apart from "competitive advantage",
| when you have someone as tenured as he who shall not be named,
| you mitigate the risk of being sued by not making it obvious
| you're cloning a system developed for another firm.
|
| Bulge brackets are more risk avoidant than smaller firms, like
| hedge funds. Today, we have LMAX disruptor and Real Logic's
| Aeron (basis for Akka Remote) due to their liberal policy
| towards open source.
| alecco wrote:
| I worked at one of the largest of these systems. It seems to be
| the one referred to by the post.
|
| The global distributed store of pickled python objects using
| Event Sourcing was one of the most horrible and expensive
| database systems I've ever heard of. It runs on THOUSANDS of
| expensive servers with all data stored in-memory. To get the
| state of a single deal you had to open, decompress, deserialize,
| and merge hundreds if not thousands of instances. And 90% of the
| output was more often than not discarded.
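|
| To make the read path concrete, here is a deliberately tiny sketch
| of it. The event layout, the store_event and load_state helpers,
| and the "later event wins" merge rule are my assumptions for
| illustration, not the actual system's format:

```python
import pickle
import zlib


def store_event(event):
    """Serialize and compress one event, as it would sit in the store."""
    return zlib.compress(pickle.dumps(event))


def load_state(blobs):
    """Rebuild current state by decompressing, unpickling, and merging every event."""
    state = {}
    for blob in blobs:
        event = pickle.loads(zlib.decompress(blob))
        state.update(event)   # later events override earlier fields
    return state


log = [
    store_event({"deal_id": "D1", "notional": 1_000_000, "status": "new"}),
    store_event({"status": "amended", "notional": 1_500_000}),
    store_event({"status": "settled"}),
]
state = load_state(log)
```

| Every read of the deal pays for all three decompress and unpickle
| steps, even if the caller only wanted the status. Scale that to
| hundreds or thousands of events per deal and the cost adds up.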
|
| The Python interpreter extensions reveal the ignorance of Python
| by the original developers. There was no good reason to fork
| CPython.
|
| There were many small subsystems created and supported by lone
| rangers with impressive CVs and astronomical salaries. A JIT
| better than any other one out there (but with a lot of
| limitations). A meta-query system extremely elegant.
|
| But this all was a sham. The actual daily crunch/analytics was
| run on more classic SQL/Columnar clusters. From the distributed
| object database hours-long running batch jobs loaded stuff on old
| school DBs. And those blew up frequently. Sometimes those blow
| ups cost many millions in delayed regulatory reports. The queries
| running on top of SQL were beyond stupid and the DB engine could
| not optimize for them. And of course, people blamed SQL and not
| the ridiculous architecture and the OOP dogma.
|
| Don't work for old school banking, hedge funds, or anything like
| that. They are driven by tech cavemen and their primadonnas.
| Exceptions might be _some_ HFT and fintech shops.
| markus_zhang wrote:
| But it was good for the first generation who worked there. Big
| dollars, total control and even the opportunity to create a
| proprietary IDE...
| esonoi wrote:
| That proprietary IDE was a piece of crap. Monkey patching was
| prevalent, exponentially increasing startup time depending on
| the last time you opened it, and an "online" source tree
| where one could easily modify the source code in someone
| else's 'private' workspace.
|
| PyCharm was a move in the right direction, but the way it
| worked was absurd- it would run the internal IDE in the
| background and sync to the file system. Given the proprietary
| IDE took up more resources than PyCharm, you constantly had
| to shut down apps so the machine had enough memory.
|
| IDEs should not be a requirement- they are tools... but you
| had no choice but to use it and their totally flawed code
| completion. Their measure of success was tantamount to having
| a Jupyter notebook- write code and get back results
| immediately.
| markus_zhang wrote:
| I guess it was originally created as a productivity tool as
| there was not any good Python IDE back before the year
| 2010. But after a while it became a monstrosity and new
| tools emerged but it was rooted so deeply that it was
| impossible to implement the new tools.
|
| But as said I'd really love to work on those projects. Most
| of the people don't get the prestige to own one's own baby
| in a big corporation. 99% of the job is maintaining a shit
| mountain of code and piling new shits on top of it. It
| really took a lot of luck to be able to make it happen.
|
| I mean even for people who get to work for JetBrains or
| Microsoft Visual Studio team, they don't get to create new
| IDEs, they are buried deeply in a shit mountain of code and
| JIRA issues.
|
| Plus the pay and vacation is really good for the banks.
| regularfry wrote:
| It looks like sound ideas, poorly implemented. A lot of these
| capabilities crop up in BEAM or Smalltalk. I do wonder if
| Erlang had better developer ergonomics we wouldn't have seen
| that instead of Python as a basis.
| hcarvalhoalves wrote:
| This seems baroque but I quite like the idea, seems similar in
| spirit to having a shared LISP environment where you can live
| code and snapshot images with minimal hassle. I also see it as an
| improvement over the usual Excel mess, in particular there's some
| version control and automatic propagation of changes.
| bob229 wrote:
| Never work at a bank. I've just quit Barclays and they are
| ancient tech idiots doing a job that no one needs. Don't waste
| your time
| acosmism wrote:
| sounds like a company dealing with ..
| tbojanin wrote:
| Sounds like it could be JPM's 'Athena' platform?
|
| context: https://www.techrepublic.com/article/jpmorgans-athena-
| has-35...
| trenchgun wrote:
| Yes, but it is also probably similar to what they have in GS &
| BAML.
|
| Model seems to originate from GS:
| https://news.ycombinator.com/item?id=29104401
| duskwuff wrote:
| Certainly does. Some discussion here from a few years back:
|
| https://news.ycombinator.com/item?id=23819270
| curiousgal wrote:
| They are all similar but in this particular case this is
| definitely BAML's Quartz.
| simonh wrote:
| I think Minerva is clearly a reference to Athena.
| curiousgal wrote:
| Could be a misdirection because all of the rest fits Quartz
| to a tee.
|
| The Quartz database is called Sandra (referred to as
| Barbara here).
|
| The Quartz directed acyclic graph is called Dag (referred
| to as Dagger here)
|
| The Quartz job runner is called Bob (referred to as Walpole
| here, which is a reference to Robert Walpole, whose short name
| is... Bob)
|
| These and the horrible proprietary IDE make it obvious
| which particular system he's describing.
| simonh wrote:
| I think Athena has equivalents to all of these but I
| don't know what they're called. I only know Qz.
| le-chiffre wrote:
| The Athena database is Hydra; The Athena graph is Pixie;
| The Athena job runner is also Bob
| tstordyallison wrote:
| I find it amusing that Bob made it in most of the banks,
| but few of the other names stuck.
| cbzbc wrote:
| How are the Barbara databases synchronized - as multiple
| nodes are mentioned ? The description makes it sound like
| it's just a large set of pickles in something like a
| Berkeley DB?
| simonh wrote:
| Each server in a ring has a complete copy of the data for
| that ring. Each ring consists of a network of servers
| which may have nodes in different geographies. They're
| called rings, but are actually acyclic networks (IIRC).
|
| Replication occurs automatically so you need to manage
| consistency in your app architecture. For example if you
| have instances of an app running in different
| geographies, the specific data for those instances should
| be in different folders.
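|
| A minimal sketch of the folder-per-instance idea, assuming a toy
| in-memory store of pickles. RingNode, replicate and
| instance_path are hypothetical names for illustration, not the
| real Barbara/Sandra API, and replication here is just a naive
| dictionary copy.

```python
import pickle


class RingNode:
    """Hypothetical stand-in for one server in a ring: holds a full copy of the data."""

    def __init__(self):
        self.store = {}

    def put(self, path, obj):
        # Objects are stored as pickles, keyed by a folder-like path.
        self.store[path] = pickle.dumps(obj)

    def get(self, path):
        return pickle.loads(self.store[path])


def replicate(source, target):
    """Naive replication: copy every key from source to target (incoming copy wins)."""
    target.store.update(source.store)


def instance_path(app, region, name):
    """Give each regional app instance its own folder so replicas never collide."""
    return f"/{app}/{region}/{name}"


london, new_york = RingNode(), RingNode()
london.put(instance_path("pricer", "ldn", "state"), {"last_run": "2021-11-04"})
new_york.put(instance_path("pricer", "nyc", "state"), {"last_run": "2021-11-03"})

# Replicate in both directions; the two writes live under different folders,
# so neither overwrites the other.
replicate(london, new_york)
replicate(new_york, london)
```

| Had both instances written to the same path, whichever copy
| replicated last would silently win, which is the consistency
| problem the folder convention avoids.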
| [deleted]
| grvdrm wrote:
| I'm curious about experiences in other similar orgs:
|
| I work as a portfolio manager for a large reinsurance/insurance
| company but spend significant time in SQL, Python, Excel (not
| unexpected I'm sure).
|
| The Walpole platform in the article struck a chord with me. We
| built something roughly similar - call it Trek - that handles
| jobs. Jobs encompass lots of tasks - reading/writing from Excel,
| executing SQL, running Python, running C#. I could list many
| limitations, but realistically, the biggest limitation is that
| the platform can't handle something that it can't configure and
| run. In other words, the platform isn't set up with R - so no
| one creates data pipelines/jobs that use R. Lots of people here
| use R (among other tools).
|
| One key problem (maybe?) is that all this action happens inside
| the business. Trek was built by a talented actuary/programmer. No
| software engineering org involvement at all. I'm sure lots of
| folks here can imagine why: lots of red tape, general aversion
| to software that isn't already here, long stretches of time to
| get things done. Also, frankly, lots of our software devs write
| bad code.
|
| For folks familiar with the orgs in this article, and other
| similar orgs, is what the article discusses happening mostly with
| software devs in IT functions? Are these folks embedded in the
| business? And also are there folks using the more technical bits
| of the systems that are business-oriented - analysts, investment
| professionals, etc.?
|
| Realize the lines are very blurry these days, but interested to
| learn from everyone here about the types/roles of end-users
| 3maj wrote:
| I work in a Technical PM role at a large North American
| Insurance company and used to work at one of the largest Banks
| as a Sr. Business Analyst (or Sr.Systems Analyst depending
| where you are).
|
| >lots of our software devs write bad code
|
| Ultimately, you get what you pay for, all our full stack devs
| are making 6 figures... While it might not be FAANG money they
| also almost never work overtime and the stress levels are
| relatively low.
|
| >Are these folks embedded in the business?
|
| No, that's my job. As a Tech PM I'm supposed to know exactly
| the business requirements, what my guys can do (to manage
| expectations) and any limitations of the software/business. I
| find the best PMs are the ones that have some Dev experience
| but also have extensive people skills and understand how to
| manage stakeholders.
|
| >are there folks using the more technical bits of the systems
| that are business-oriented?
|
| It varies. I started off as that business-oriented person
| (corporate finance) and eventually made my way over to the Data
| side of things and finally some programming work, and now I'm
| running projects. While you won't get many analysts/portfolio
| managers doing dev work, I do try to get them to take a
| hands-on approach, especially when doing QA and UAT work.
| grvdrm wrote:
| > Ultimately, you get what you pay for, all our full stack
| devs are making 6 figures... While it might not be FAANG
| money they also almost never work overtime and the stress
| levels are relatively low.
|
| Yes, it's the blessing and the curse. They are lovely people.
| They have lives. They aren't working 24/7. But there is
| misalignment between senior folks who want to innovate and
| build internal tech around core IP and the talent level of
| the folks tasked with actually getting that done. Insurance is
| hardly unique in that regard, but it's an acute issue nonetheless.
| Firms like GS, JPM, etc have fatter margins (I think) and can
| afford to pay devs/strats/etc.
|
| Interesting to hear that you went from corporate finance to
| technical PM. Quite the journey. Would love to dig more into
| that if you're willing.
| 3maj wrote:
| >Firms like GS, JPM, etc have fatter margins (I think) and
| can afford to pay devs/strats/etc.
|
| From what I've seen on my end the Insurance firms have
| started paying somewhat of a premium to compensate for the
| lack of "excitement" that is associated with the insurance
| industry as a whole.
|
| >Would love to dig more into that if you're willing.
|
| Always willing to chat!
| klelatti wrote:
| Really interesting comments.
|
| I'd guess that you have proprietary (e.g. valuation / capital
| calculation) systems that need to interface with Trek in some
| way. Could you share how you've approached that at all?
|
| Also not clear why R couldn't be added to Trek alongside Python
| and C#?
| grvdrm wrote:
| Certainly - let me try to share a succinct version germane to
| my day-to-day:
|
| - We regularly perform group/segment level risk roll-ups.
| Involves running computationally expensive (by insurance
| standards) in-house and third-party models that estimate loss
| from hurricanes, earthquakes, etc. A lot of our insurance
| data is unsurprisingly stored in disparate systems that don't
| talk to each other, and in some cases, don't have any useful
| interface that someone like me can query/view. Things are
| changing, but still quite a lot of history to overcome
|
| - We also have lots of stuff from outside underwriting
| parties in form of Excel, CSV, MDF files.
|
| - We have to bring all that together to make sense of the
| portfolio, so we use Trek to do a lot of the various involved
| tasks like running the models, processing CSV data,
| processing Excel data, attaching databases, creating
| dashboards in PowerBI (tangent: hate it)
|
| - Sample pipeline: query portfolio data from one DB, read CSV
| file from another third-party, pull both into risk model,
| kick off analysis, then execute a script to pull together
| results in the model's databases or elsewhere.
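|
| The sample pipeline above could be sketched roughly like this.
| This is a toy under stated assumptions, not how Trek works: the
| table and column names, the CSV layout and the "model" (insured
| value times a damage ratio) are all invented for illustration,
| and an in-memory SQLite database stands in for the real systems.

```python
import csv
import io
import sqlite3


def query_portfolio(conn):
    """Step 1: pull portfolio rows from a database."""
    rows = conn.execute(
        "SELECT policy_id, insured_value FROM portfolio ORDER BY policy_id"
    )
    return [{"policy_id": pid, "insured_value": value} for pid, value in rows]


def read_third_party_csv(text):
    """Step 2: parse a CSV delivered by an outside party."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["policy_id"]: float(row["damage_ratio"]) for row in reader}


def run_model(portfolio, damage_ratios):
    """Step 3: toy risk model, expected loss = insured value * damage ratio."""
    return [
        {
            "policy_id": p["policy_id"],
            "expected_loss": p["insured_value"] * damage_ratios.get(p["policy_id"], 0.0),
        }
        for p in portfolio
    ]


def store_results(conn, results):
    """Step 4: write results back so downstream reports can query them."""
    conn.execute("CREATE TABLE IF NOT EXISTS results (policy_id TEXT, expected_loss REAL)")
    conn.executemany("INSERT INTO results VALUES (:policy_id, :expected_loss)", results)


# Hypothetical inputs standing in for the real portfolio DB and vendor file.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE portfolio (policy_id TEXT, insured_value REAL)")
conn.executemany(
    "INSERT INTO portfolio VALUES (?, ?)",
    [("P1", 1_000_000.0), ("P2", 250_000.0)],
)
csv_text = "policy_id,damage_ratio\nP1,0.5\nP2,0.25\n"

results = run_model(query_portfolio(conn), read_third_party_csv(csv_text))
store_results(conn, results)
total_loss = conn.execute("SELECT SUM(expected_loss) FROM results").fetchone()[0]
```

| Keeping each step a plain function is what lets a job runner
| chain, retry or swap them; the real work is in the connectors.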
|
| Happy to expand or answer other q's as you have them.
|
| As for your other comment about R - it's just a matter of the
| install. Someone has to install R so that Trek can use it.
| Not a major problem. Pointed this out as a contrast in our
| org that is probably smaller and has many fewer devs compared
| to what I'm reading in the post where "bank python" sort of
| feels like the platform in which everything happens /
| everything is configured.
| 3maj wrote:
| >read CSV file from another third-party
|
| Generally, if you're getting a CSV file from a client, that
| indicates to me that there's a good chance they're exporting
| info from their systems and sending you the CSV.
|
| Have you considered developing a client facing API that can
| be used to send and digest data?
| grvdrm wrote:
| We have, but that's not something my team (actuarial) can
| accomplish without help/blessing/oversight from IT.
|
| I don't know all the details but we do have connections
| to some of our MGAs through XML dumps or perhaps real-
| time feeds (I'm doubtful). But that data is often missing
| some of the details I need. It's useful for policy
| admin - not for all the other stuff.
|
| We've also explored portals but those are fraught with
| concerns about double-entry.
| klelatti wrote:
| Thanks for a really comprehensive reply. Enjoyed your
| comment on PowerBI.
|
| My background is mainly life which is dominated (at least
| in Europe) by computationally demanding proprietary
| liability modelling systems but I think Python / R is
| getting a foothold in capital calculation / aggregation.
|
| My perception is that there is a lot more use of in-house
| models in the GI / property-casualty worlds so more Python
| etc but sounds like you still have to interface with
| proprietary modelling systems.
| grvdrm wrote:
| Absolutely - and for quite a long time (I work mainly
| property).
|
| There's not much (if any) appetite to completely rebuild
| 3rd-party geophysical vendor models. Those folks have 20+
| years of work behind them and a different talent base
| (e.g. different types of scientists building the base
| models).
|
| But we do focus on all the other stuff. Making data input
| easier/more accurate. Same thing re: output. Also the
| vast majority of our capital and group-level risk work
| happens in-house - R, Python, etc.
| barrenko wrote:
| Readers of this thread might be interested in reading "My Life as
| a Quant" by Emanuel Derman.
| benjaminwootton wrote:
| This pattern in large part came from SecDB at Goldman, and then
| from a few people who moved to JPMC and BAML.
|
| Dependency graphs are an elegant solution to risk management and
| pricing etc. There's a reason this approach works in IBanks.
|
| Check out Beacon.io which is a SaaS implementation from the
| same team.
| lucozade wrote:
| > Dependency graphs are an elegant solution to risk management
| and pricing etc.
|
| Dependency graphs are not a solution to risk and pricing. They
| are, in certain circumstances, a very useful tool. That's all.
| They also scale notoriously painfully.
|
| Putting a dependency graph as a mandatory component in your
| risk system was one of the worst technical decisions I've come
| across (and I've been doing this lark a long time).
| goldenkey wrote:
| Wouldn't an observer pattern work better? The graph itself
| could even be used to instantiate subscriptions in a pub/sub
| system where changes in underlying pricing could be dealt
| with via an event queue. Compaction and debouncing could be
| applied on top of the queue to avoid lots and lots of
| redundant execution.
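|
| A rough sketch of that idea, assuming a single-process queue
| that keeps only the latest undelivered value per topic.
| CompactingQueue and the two-node graph are hypothetical, made
| up for illustration; a real pub/sub system would also need
| ordering, threading and failure handling.

```python
from collections import OrderedDict


class CompactingQueue:
    """Event queue that keeps only the latest pending value per topic."""

    def __init__(self):
        self.pending = OrderedDict()
        self.subscribers = {}

    def subscribe(self, topic, callback):
        self.subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic, value):
        # Compaction: a newer value replaces any undelivered one for the topic.
        self.pending[topic] = value

    def drain(self):
        """Deliver each pending topic once, oldest topic first; return callback count."""
        delivered = 0
        while self.pending:
            topic, value = self.pending.popitem(last=False)
            for callback in self.subscribers.get(topic, []):
                callback(value)
                delivered += 1
        return delivered


# Hypothetical dependency graph: each node lists the inputs it subscribes to.
graph = {"bond_price": ["yield_curve"], "swap_price": ["yield_curve"]}

queue = CompactingQueue()
recalcs = []
for node, inputs in graph.items():
    for topic in inputs:
        # Bind node as a default argument so each callback keeps its own name.
        queue.subscribe(topic, lambda value, node=node: recalcs.append((node, value)))

# Three rapid ticks arrive before the queue is drained...
for tick in (1.50, 1.51, 1.52):
    queue.publish("yield_curve", tick)

# ...but each subscriber reprices only once, with the latest value.
delivered = queue.drain()
```

| The trade-off is that intermediate values are dropped, which is
| fine for repricing but not for anything that must see every tick.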
| lucozade wrote:
| > Wouldn't an observer pattern work better?
|
| Better as a solution to what problem? In some cases a
| dependency graph is an excellent solution. In some cases
| it's not. In some cases it's fine for small graphs but
| scales poorly as it can be very hard to reason about (as
| attested by pretty much anyone who's supported a really big
| spreadsheet).
|
| But that's the point; it's a really useful tool. Sometimes.
| qorrect wrote:
| > You can achieve an awful lot with Excel: more, even, than some
| programmers can achieve without it.
|
| I've heard this a lot, but have never really used Excel. What can
| it do that a programmer can't ?
| dwohnitmok wrote:
| By virtue of Turing completeness there's nothing you can do in
| Excel that you can't do in a program. It's all a matter of
| speed.
|
| Having seen Excel wizards work their magic before, the dizzying
| ways they can slice and dice their data with the help of a
| combination of GUI affordances, formulas, and hot keys is truly
| astounding. Often times a person could build out a full set of
| data and charts in half an hour that might be something like >
| 100 lines of equivalent Python/Pandas code.
|
| And crucially often the report would have fewer bugs than the
| equivalent code because the analyst could see all the data in
| front of them as they were manipulating it and would naturally
| do spot checks along the way.
|
| Now note the "some" in that statement. A Python/Pandas master
| could also probably whip up the equivalent > 100 lines in half
| an hour. But it was really astounding just how fast Excel
| experts worked.
| jetbooster wrote:
| And I think herein lies the trap of Excel.
|
| You've built this brilliant report because you're an Excel
| wiz, but because of that you've gotten the attention of someone
| up the chain and now you need to do it every day or multiple
| times per day. Automating all your shortcuts, hotkeys and UI
| clicks with macros becomes a horrifying kludge, whereas
| investing earlier in something more automation-oriented would
| have produced a more resilient, repeatable solution.
| alexilliamson wrote:
| If you're an excel wiz, you build it once and when you need
| to update, you drop in new data and everything magically is
| updated.
| nonesuchluck wrote:
| Yep. The peculiarities that make this Excel workflow
| unreasonably effective are pretty easy to identify:
|
| 1. Tabular data. There's some tricks with named ranges etc,
| but for the most part your entire application state is spread
| out before you, scrollable in every direction. It's just
| tables, and clicking into a cell highlights relationships
| (data dependency) between cells.
|
| 2. Visible data, hidden code. =macros are hidden behind their
| result; the most obvious thing is to treat them as tiny black
| boxes, applying a single data transformation (or small set of
| logically related transforms), and immediately see the result
| applied to your data set. This is a tiny bit like functional
| programming, and a tiny bit like Pandas or Spark (immutable
| data, lazy evaluations). Except unlike those worlds, Excel
| pushes the data front and center, not the code.
|
| And prob a bunch more I'm forgetting. It doesn't even feel
| like programming, more like building data pipelines in Unix
| or something. Except you can easily preview the data at each
| step in the pipeline. What I really want is Excel or Siag,
| but with Python and SQLite and a nice spreadsheet UI.
| disgruntledphd2 wrote:
| > Often times a person could build out a full set of data and
| charts in half an hour that might be something like > 100
| lines of equivalent Python/Pandas code.
|
| But only ten lines of R! (Excel is kinda awesome though).
| airstrike wrote:
| WYSIWYG, in every sense of the expression. There's no _written_
| abstraction. You just see the data (and depending on how things
| are structured, sometimes the intermediate steps getting that
| data from point A to point B).
|
| With code, you type the abstractions while imagining the data
| in your head -- and then check to see if the final result is
| the right one. The average user will usually litter the code
| with debug print statements to help understanding what's going
| on live
|
| With Excel, you live in the runtime and you stare at the
| results of your "code" all of the time. The abstractions and
| the links between steps of the process either live in your head
| (because you _know_ how that financial model works) or are
| buried in formulas in the most asynchronous of layouts (cell A1
| may be the result of a calculation in cell Z99, for instance)
| natpat wrote:
| Completely off topic - but I love the aesthetic of the post.
| "Vanilla HTML" is a design that isn't used enough. It's something
| I tried to apply to my personal blog, but I think it's been done
| much better here.
| luord wrote:
| I have been a developer for eight years and yet I still get
| shocked about the places where Python will be used. I mean, it
| _is_ my favorite language, but in the communities I gravitate
| towards (basically communities like HN) it has so many detractors
| for being dynamically typed, not being functional enough, being
| slow, etc, that sometimes I'm tempted to think maybe it's
| actually a guilty pleasure of mine, and that I should look for
| better pastures.
|
| Then I read articles like this and I remember why I like it: it
| gets the job done, and quickly (for the developers at least).
| It's why it's so widely used and keeps climbing. Of course,
| nothing wrong with learning other languages and I do try to keep
| up, but Python will remain my go-to for the time being.
| mst wrote:
| People who make complaints like that are privileging their own
| personal aesthetics over pragmatism.
|
| Same mistake as the people who keep talking about perl being
| "dead" while they're deploying their production platforms on
| debian or red hat based systems and ignoring the fact that the
| packaging and release QA work for those distros is
| substantially dependent on - actively maintained by the distros
| in question - perl projects.
| lmm wrote:
| Sounds like someone else is putting their personal aesthetics
| over pragmatism. Perl is a dead language walking, the fact
| that there are some tools it hasn't been worth rewriting
| doesn't contradict that.
| mst wrote:
| I'm talking about actively chosen new development because
| it's still the dynamic language most oriented towards being
| comfortable as _part_ of a unix environment rather than
| simply running on top of one.
|
| Modern async/await + heavily OO based perl is not, I
| suspect, the language that you're thinking of when you made
| your comment.
| mtrovo wrote:
| > I've mentioned that programmers are far too dismissive of MS
| Excel. You can achieve an awful lot with Excel: more, even, than
| some programmers can achieve without it
|
| This is one of the most underrated topics in tech imho.
| The spreadsheet is probably the pinnacle of how tech can be made
| approachable to non-tech people, in the "bike for the mind"
| sense. We've come a long way downhill from there, when you need a
| specialist even to come up with a no-code solution to mundane
| problems.
|
| Sure the tech ecosystem evolved and became a lot more complex
| from there but I'm afraid the concept of a non-tech person
| opening a blank file and creating something useful from scratch
| has been lost along the way.
| pantsforbirds wrote:
| I think part of the problem with Excel (or clones) is that you
| can do so much haha. It's such a powerful tool that you end up
| doing things in it that it really wasn't optimized or designed
| for, and managing the change history in Excel is pretty tough.
|
| But for 95%+ of analysis you really can't beat it.
| FabHK wrote:
| Plus, a spreadsheet is basically purely functional (unless
| there's mucking around in VisualBasic), and has a beautiful
| dependency graph and calculation engine! (And that is a big
| part of what SecDB/Slang/Bank Python brought to the table.)
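|
| To make the "dependency graph and calculation engine" point
| concrete, here is a toy sketch assuming lazy recalculation with
| dirty flags. The Cell class and the example cells are invented
| for illustration; this is not how Excel, SecDB or any bank's
| graph is actually implemented.

```python
class Cell:
    """A spreadsheet-like cell: either a constant or a pure function of other cells."""

    def __init__(self, value=None, formula=None, inputs=()):
        self.value = value
        self.formula = formula
        self.inputs = list(inputs)
        self.dependents = []
        self.dirty = formula is not None  # formula cells start uncomputed
        self.evaluations = 0
        for cell in self.inputs:
            cell.dependents.append(self)

    def set(self, value):
        """Change a constant and mark everything downstream as stale."""
        self.value = value
        for cell in self.dependents:
            cell.invalidate()

    def invalidate(self):
        # Stop early if already stale: everything downstream is stale too.
        if not self.dirty:
            self.dirty = True
            for cell in self.dependents:
                cell.invalidate()

    def get(self):
        """Recompute lazily, and only if an input has changed."""
        if self.dirty:
            self.value = self.formula(*(c.get() for c in self.inputs))
            self.evaluations += 1
            self.dirty = False
        return self.value


spot = Cell(100.0)
quantity = Cell(10)
notional = Cell(formula=lambda s, q: s * q, inputs=[spot, quantity])
fee = Cell(formula=lambda n: n / 100, inputs=[notional])

first = fee.get()   # computes notional, then fee
again = fee.get()   # served from cache, no recomputation
spot.set(110.0)     # marks notional and fee stale
second = fee.get()  # recomputes only what depends on spot
```

| Because each formula is a pure function of its inputs, caching
| is safe and a change only touches the cells downstream of it.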
| DrBazza wrote:
| Reminds me of the investment banking dev cycle in the 2000s:
|
| * trader writes a pricing "app" in Excel
|
| * trader discovers MS Access db
|
| * traders (plural) start copying around Access db files
|
| * problems
|
| * "IT" gets involved
|
| * convert MS Access to oracle or sybase
|
| * write some server process(es) in C++
|
| * write some replacement front end (spend months arguing over
| best grid component to replace excel) in C++/MFC
|
| * trading system emerges...
|
| * rewrite in C#, Java
|
| * etc.
| sam0x17 wrote:
| Very true. I often prototype algorithms and things in google
| sheets. One time I had backpropagation working in there, with a
| little button to process the next "row" of training samples.
| excitom wrote:
| I used to work for a very successful company that produced
| mobile games. The entire logic of the game, the rules, etc.
| was all in Excel
| dolmen wrote:
| So Excel spreadsheets were deployed on mobile devices? What
| about the runtime?
| markus_zhang wrote:
| The problem is sometimes analysts turn into shadow BI or even
| DE who only know Excel. They know Excel so well that they
| create a whole monstrosity in Excel. MSFT has been sort of
| encouraging that too by introducing some Power BI feature and
| now Javascript into Excel.
| denimnerd42 wrote:
| problem as described to me is that excel starts being used for
| regulated processes and it's not well auditable, access
| controlled, change controlled, tracked, etc etc. Then people
| need to implement the exact same process across departments and
| they're all using a separate excel sheet and they all submit
| different numbers. becomes a huge mess and so much more
| complicated and expensive systems become commissioned.
| NovemberWhiskey wrote:
| There's an excellent example of this phenomenon in the JPM
| "London Whale" report where -- at various points -- poorly
| maintained and validated spreadsheets appear as minor
| villains in a $6.2bn loss.
| FabHK wrote:
| Fun story: I was at a bank that used Excel for everything. As
| you say, there came a complaint from the auditors that it's
| not well auditable, and there needed to be "a system".
|
| Solution: the bank put together a system that constructs
| (from Excel templates and the bank trading data and market
| data) Excel spreadsheets from scratch every day, then used
| those for the calculations, and stored them. But now it was
| "a system", so all good.
| joconde wrote:
| Well you can audit the code that generates spreadsheets,
| which seems to solve the audit problem. Kind of like I
| prefer reading a Dockerfile that builds a program from the
| GitHub repo, rather than downloading a pre-compiled package
| I can't trust.
| denimnerd42 wrote:
| sounds like a great system. we have something similar where
| we put excel in and out but doesn't sound as slick as that.
| on top of the system there is access control, versioning
| and such. the data gets approved and then stored in the
| backend to feed the regulated process.
| craftinator wrote:
| This describes what I've seen happen with Excel over and over
| again. I'm curious if the use of collaborative Google sheets
| could be a fix for this? Something where a portion of the
| sheet could be shared globally, but the rest of the document
| would be local to the instance working on it.
| brendoelfrendo wrote:
| The jargon for this is "user-developed application," and
| auditors do keep an eye out for these. Banks, from what I've
| seen at least, typically have some process to document these
| as they come up, replace them with supported solutions, and
| retire them. At least, that's the "happy path," where people
| are willing and able to get all that done before a big-three
| auditor comes in and tears you a new one.
| cstross wrote:
| This immediately sprang out at me:
|
| > Investment banks have a one-way approach to open source
| software: (some of) it can come in, but none of it can go out.
|
| I wonder how well this plays with the various open source
| software licenses?
| seanhunter wrote:
| That's actually an unfortunate side-effect of banks having
| weird requirements. When I was at GS we had this enormous
| source repo built on CVS. So we made improvements to CVS to try
| to make this more manageable. For example because branching in
| CVS absolutely _sucks_ we had to use tags (rather than
| branches) to identify releases. This meant you end up with lots
| of tags and when you look at them it's really hard to find
| (visually) whether code has a particular tag. So we patched CVS
| to sort the tags alphabetically. We tried to upstream this but
| the CVS devs didn't want to know. So we had to maintain it.
|
| Likewise a bunch of fixes to the timezone handling code that
| iirc glibc simply wouldn't upstream so we had to maintain even
| though they were bugfixes.
|
| We did used to upstream everything we could and I think the
| situation is improving.
| lozenge wrote:
| Most licenses just require your users to have access to the
| source code. As all the users are bank employees, this is
| usually easily achieved. If the license is violated it's only
| by accidental oversight.
|
| Pretty much everything described is a Python library not a
| change in the Python interpreter so can be under a proprietary
| license.
|
| The spirit of open source is a different matter.
| simonh wrote:
| As I understand it from a legal point of view the user in
| this case is the bank, not individual employees running it on
| the bank's behalf, and the bank already has the code so it's
| a non-issue.
|
| I know some people think this is contrary to the spirit of
| open source, but it isn't. One of the goals of open source is
| so that users can customise the code to their specific use
| case, with no obligation to share. That's all the banks are
| doing. They have the same rights as any other user.
| Mikeb85 wrote:
| This. Even RMS has said many times that a company is a
| single user/owner of said code, and it doesn't matter who
| works on it as long as it doesn't leave the company. It's
| all explained in the GPL but the gist is, if the company
| only uses it internally/doesn't try to sell the code, they
| can do whatever TF they want.
| freeone3000 wrote:
| Quite well. You are under no obligation to commit back your
| changes under any OSI license. The old Sun CDDL required it,
| and was denied OSI "open sourceness" as a result.
| globular-toast wrote:
| As others have mentioned, it's fine, even with GPL, as the
| licences only really kick in when they try to distribute the
| software. They are only really hurting themselves. When
| starting a private fork you force yourself to maintain it
| alone. That means either letting it rot (ie. it becomes
| insecure and obsolete with no new libraries supporting it) or
| keeping up with the mainstream yourself. Either way it's a lot
| of work that wouldn't be necessary if they upstreamed their
| changes. Maybe one day they'll get the message. You'd think
| that long-term investors would understand this concept better.
| streamofdigits wrote:
| what would be the implication wrt external banking services
| that use open source software? (echoes the cloud databases
| story...)
| Mikeb85 wrote:
| Are they distributing/selling the code? If the answer is no
| (and it pretty much always is) then there's no
| implications. Nothing in the GPL says you can't have a web
| front end to gather info that's then processed by your
| modified GPL code (which never leaves your possession) and
| the results spat back out to that web front-end.
| betwixthewires wrote:
| Well, with GPL particularly but open source licenses more
| generally, the user is allowed to do whatever they want with
| the code. It is only when the code is redistributed that the
| source must be provided. With AGPL the source must be provided
| also to anyone using a service running AGPL licensed code.
|
| So it plays perfectly with the licenses. This is the sort of
| thing free software was designed for, allowing everyone that
| uses a codebase to own it 100%.
| transitory_pce wrote:
| Title should be ".. of investment bank python", trading and risk
| has little in common with a retail digital bank like say N26.
|
| The problem with these projects is that the folks leading them
| have never built a real trading system in their entire lives (the
| ones who have been there for many years worked with end-of-day
| batch systems) and there is a layer of useless and incompetent
| "business analysts" who hide their incompetence by finding
| ways to malign developers.
|
| Pro tip: Don't work for a bank before assessing its open source
| repos. They have none? Run in the opposite direction.
| tecleandor wrote:
| When you say "The problem with these banks..." you mean the
| ones like N26 (not investment banks)? (Seems like coffee hasn't
| kicked in for me yet)
| transitory_pce wrote:
| Edited.
| tecleandor wrote:
| Thx
| redsid wrote:
| I worked on modeling/mapping market risk schema in Quartz a few
| years ago and used to wonder why they were "customizing" open
| source software/systems in house, when they could just as well
| have supported those initiatives directly and publicly. As a C++
| dev, I had already realized the world of software tooling had
| passed me by, but still used to wonder at the (over)engineering
| of everything in Quartz, which involved skills that were not
| transferable.
|
| My view now is that the value of these in-house systems is
| essentially a cost efficiency play on run costs, and there are
| very few revenue/growth opportunities for the business from these
| investments. Volcker (really the best thing to happen in the
| '10s) and the loss of prop trading mean all the market makers
| live off the spread, and so while there is value in minimizing
| operational costs, they are not worth the investments that have
| been made.
|
| Bottom line for me is investments within large firms in capital
| markets are unlikely to generate revenue/profits in scale - I am
| sure there are some exceptions and would like to know
| nraynaud wrote:
| it has a bit of a smalltalk flavor, where the runtime is a memory
| image, with objects and data in a giant jumble.
| p_l wrote:
| There's one smalltalk vendor whose main product is an object-
| oriented database that is also a smalltalk instance (and you
| use other smalltalks as IDEs to it).
|
| GemStone/S
| twic wrote:
| There was also a giant investment bank system written in
| Smalltalk, so that may be a direct influence:
|
| http://www.esug.org/data/ESUG2004/ValueOfSmalltalk.pdf
| musiciangames wrote:
| Can anyone confirm whether JP Morgan were able to
| decommission Kapital when they went to Athena? I've seen so
| many cases in banks where the old system is still running
| years and years after it was 'replaced'. And Kapital was used
| in so many, different, parts of the business.
| akapitalidea_tw wrote:
| The answer is no. Kapital persisted past Athena for many
| years and was not (seriously) considered for shutdown
| igouy wrote:
| In that case, here's a glossy --
|
| https://www.cincom.com/pdf/CS040819-1.pdf
| arnsholt wrote:
| I had the exact same thought! A custom IDE with all the source
| stored in a database is extremely smalltalk, although here the
| source is stored in a shared DB (it seems), rather than a per-
| user DB as it is with a regular smalltalk image.
| igouy wrote:
| Back in the day, Morgan Stanley & JPMorgan & ..., used
| ENVY/Developer -- fine-grained Smalltalk config-map / sub-
| application / class / method configuration & version control
| in a central database.
|
| "Mastering ENVY/Developer"
|
| https://www.google.com/books/edition/Mastering_ENVY_Develope.
| ..
|
| Some folk worked as "librarians" to promote code reuse across
| projects.
| markus_zhang wrote:
| That's why working as the FIRST generation of programmers in any
| big financial shops is so fun. You get total ownership to
| whatever you build and others totally rely on you for their job.
| Even better, you can take months to reply to a requirement, if it
| doesn't come from a key stakeholder.
|
| _Edit_ : also applies to any large corporation.
| 0xbadcafebee wrote:
| Before Open Source, you hired a company that "did"
| software/computers. You knew jack shit about it and couldn't do
| anything about it other than "run commands".
|
| After Open Source, you still did that, but you also occasionally
| hired people to download and cobble stuff together to save money,
| and maybe one or two people to write code to help cobble together
| stuff.
|
| Over time this evolved into managing entire "technology
| divisions" of people writing code and cobbling together stuff, to
| manage larger and larger internal projects, to support teams, to
| build components, to eventually be used by one internal product
| used by a customer. 50 different teams to build components, and 1
| team actually servicing a customer. And each component built is
| exactly the same as components built at other companies.
| Sometimes even exactly the same as other components in the same
| company.
|
| Nowadays a Big Bank might produce more software than Facebook,
| and none of it ever escapes into the world as Open Source. Whole
| oceans of software are birthed, live and die in the shadows.
| Millions of lines of code that live for half a decade. Constantly
| manufacturing their own hammers because they believe theirs will
| work better than an existing hammer, or because they're too lazy
| to learn how to use an existing hammer. And never sharing their
| custom-built hammers with the rest of the world. All because some
| clueless executives believe this solves their business case
| better than buying something off the shelf and making it work for
| their business case.
| voidfunc wrote:
| It is not an oral history if you write it down...
| cyberpunk wrote:
| Both frontarena and murex use python as their "vb" kind of
| language. If you thought your deployment pipelines were weird,
| ours have included putting entire python apps into single strings
| and inserting them to an oracle db, where a fat windows client
| selects them and runs them on a windows python interpreter ...
| via citrix... :/
| twic wrote:
| This reminds me of an e-commerce system which stored data in a
| mixture of Oracle, and text files on the local disk. We handled
| backups by loading the text files into blob columns in the
| database, and then just backing up the database.
| rlewkov wrote:
| The horror
| konschubert wrote:
| This made me laugh out loud. Technology is awesome.
| FabHK wrote:
| Username checks out :-)
| urban_winter wrote:
| BAML Quartz was conceived by a bunch of front-office quants who
| had not the first idea about the software needs of a big bank
| beyond the front office. There was an arrogant assumption that
| front office software is obviously the most complicated/difficult
| variety of software within a bank and therefore any system
| designed with front office requirements at the forefront would,
| of course, be perfect for universal use.
|
| This assumption was challenged at the time by various groups - I
| was closest to the Equities Operations software team (although
| not part of it) who absolutely dug in their heels and refused to
| use Quartz. The assumption was explosively invalidated when
| people started implementing in Quartz applications that fell
| under Sarbanes/Oxley regulations and Quartz picked up a severity
| 1 audit finding - because Quartz was explicitly designed for
| "Hyper Agility" (literal quote from the quartz docs) - and
| anyone-can-change-anything-at-any-time does not make for
| applications that the regulators trust.
|
| There was an interesting trajectory of Python hiring during my
| time at BAML. I joined just as Quartz was getting started and we
| managed to easily hire tens of python devs in London because it
| was easy to sell the fact that BAML was making a strategic
| investment in Python and therefore their (at the time relatively
| uncommon) skills would be highly valued. But as Quartz matured,
| Python developers generally came to dislike it (for reasons see
| original article) and it became hard to retain the best ones. And
| after a while Python 2.x became a massive embarrassment and, as
| Python became a more common skill in the marketplace, it became
| harder to hire good developers into BAML.
| simonh wrote:
| I was at BAML when the sev 1 audit finding happened. My view
| was from an application support team in Risk. For us Quartz was
| fantastic, and it had a pretty decent permissions system. The
| problem is there were two misaligned goals.
|
| On the one hand the goal was to build a single enterprise scale
| system with a holistic view of the bank's data to do rapid ad-
| hoc position evaluations and meet new needs rapidly.
|
| On the other hand, access to all that data and all the code is
| clearly a security concern. By the time I left the sev 1
| finding was well on the way to being mitigated, but for example
| it meant that instead of handing out quartz developer accounts
| and IDE access like candy it had to be restricted to technology
| personnel only.
| lucozade wrote:
| > BAML Quartz was conceived by a bunch of front-office
| quants...
|
| It was worse than that. What they actually built was a system
| designed to support complex hybrid structuring. It's what
| markets desks had been making a lot of money in prior to the
| crash esp GS. Unfortunately, post-crash there wasn't much money
| in structuring so the Front Office was more interested in
| investing in flow. Quartz was really, really bad at flow.
|
| It took a long time (and the departure of Mike, Kirat et al) to
| get Quartz to a position where it was a reasonably sane FO
| system for the world as was rather than as it had been.
|
| Fun times.
| odiroot wrote:
| Can you have a career in finance as an engineer with Python and
| without C/C++ (professional) experience?
|
| Your post really made me think about it; it's an attractive area to
| work in.
| urban_winter wrote:
| Yes. Many adverts will specify financial services experience
| but it's worth applying anyway. You'll probably find that
| roles in back-office technology areas (operations, finance
| etc) are less demanding in this respect. I hired mostly from
| outside the financial services industry because other
| industries had, on average, better-skilled developers, lower
| salaries and better development practices.
| MagnumOpus wrote:
| Absolutely yes.
|
| Depending on what kind of engineer, it is far better to go to
| the finance (front office quant, back office risk) side than
| the tech support side. They are less snobbish about
| autodidacts and pay is far better if you are willing to learn
| about things outside the dev sandbox.
|
| (Our front office has a few quants and ex-quants with
| electric engineering background, I don't know of any software
| engineers there.)
| odiroot wrote:
| Thanks for detailed pointers. What's the deal with
| front/back offices?
| FabHK wrote:
| Rule of thumb: the closer to the business (ie front
| office), the more money and stress.
|
| (Front office deals with clients, and in this context
| comprises sales, trading, structuring. Middle office run
| control functions, reporting, risk, compliance, etc. Back
| office would be settlement, accounting, operations, etc.)
| simonh wrote:
| Also consider Application Support. I know it's not sexy
| rockstar dev stuff, but if you can get into App Support on
| the Quartz (or Athena I suppose) environments you get a dev
| account and access to all the tools. You can view all the
| code, config and running systems. If you have a good
| relationship with your dev team you can submit patches e.g.
| to improve logging. The live log files of all your
| applications are just a URL away.
|
| If you're up for it, you'll spend a significant amount of
| time in the Quartz IDE. There are teams within App Support
| that develop monitoring and compliance reporting tools in Qz
| and do about 50% development. I know because I ran one. One
| of my team transferred into our dev team.
| twic wrote:
| Yes. Tons of it is in Java.
| toyg wrote:
| _> in London because it was easy to sell the fact that BAML was
| making a strategic investment in Python_
|
| I reckon I "felt" that push to hire at one of the early
| PyConUK, where your boys suddenly showed up with a big
| contingent. I even thought about applying, but I was not based
| in London - and there were some red flags, like running a
| pretty old Python version (I think it was 2.2 or 2.1, when
| 2.4/2.5 were the expected mainstream), that kinda sounded like
| I'd be signing up for the modern equivalent of mainframe
| maintenance.
| is0tope wrote:
| Having worked in an investment bank before this brought some
| flashbacks :)
| floe wrote:
| What makes this an 'oral history'?
| lmm wrote:
| Everything is second-hand; the primary written sources are not
| published publicly (and, given that these are secret systems,
| probably never will be).
| timkpaine wrote:
| There is a great talk on this as well:
|
| https://youtu.be/M9o9SF5-Pzw
| tstordyallison wrote:
| I was waiting to see when someone would link to this :)
| captainmuon wrote:
| Reminds me of what we used at the ATLAS experiment at CERN* .
| Python was tightly integrated with the application framework,
| Athena (which I just realize has the same name as JPM's Python
| framework!). You could use it as a job description language, and
| you would compose computation steps from classes you could write
| in C++. I think there was a separate `athena` executable that was
| just python with some packages pre-loaded. Because of all the
| binary modules, but even more so because of the minor syntax
| changes, the transition to Python 3 was really a problem (I hope
| they did it by now :-D).
|
| There was also a bespoke time-span database. You could store keys
| and values in there, but every data point had a start and end
| time. Then you could query what the values were between certain
| times, or run numbers (operational periods). We used it for
| example to store what configuration the detector was using when a
| certain dataset has been recorded.
|
| (* I've been out for a couple of years so I don't know what they
| use now, but I imagine it hasn't changed much.)
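|
| The time-span database described above can be sketched roughly as
| follows. This is only a toy in-memory illustration, not the ATLAS
| implementation: the names TimeSpanStore, put, at and between are
| hypothetical, and times are plain integers.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class Interval:
    """One stored value, valid for start <= t < end."""
    start: int
    end: int
    value: Any


class TimeSpanStore:
    """Toy key/value store where every value carries a validity interval."""

    def __init__(self) -> None:
        self._data: dict[str, list[Interval]] = {}

    def put(self, key: str, start: int, end: int, value: Any) -> None:
        if start >= end:
            raise ValueError("start must be before end")
        self._data.setdefault(key, []).append(Interval(start, end, value))

    def at(self, key: str, t: int) -> Optional[Any]:
        """Return the value valid at time t, or None if nothing covers t."""
        for iv in self._data.get(key, []):
            if iv.start <= t < iv.end:
                return iv.value
        return None

    def between(self, key: str, start: int, end: int) -> list[Interval]:
        """Return every interval overlapping [start, end), ordered by start."""
        hits = [iv for iv in self._data.get(key, [])
                if iv.start < end and start < iv.end]
        return sorted(hits, key=lambda iv: iv.start)


# Hypothetical usage: which detector configuration was active when?
store = TimeSpanStore()
store.put("detector_config", 0, 100, "config-A")
store.put("detector_config", 100, 250, "config-B")
config_at_50 = store.at("detector_config", 50)
configs_50_to_150 = store.between("detector_config", 50, 150)
```

| A real system would index the intervals rather than scan them, but
| the query shape (value at a time, values across a span) is the same.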
| kwertyoowiyop wrote:
| The Greek and Roman gods have always been a go-to for project
| names, LOL. We need to give some other cultures a shot!
| ttyprintk wrote:
| With particularly poor timing, I chose the Egyptian goddess
| Isis.
| tstordyallison wrote:
| One of the systems downstream of the JPM 'Minerva' (Athena)
| was called ISIS.
|
| It was renamed...
| airstrike wrote:
| I go for James Bond references, when I can. 'Moonraker' is
| always a great choice.
|
| Sometimes I'm constrained to a specific starting letter, so
| I've had to stretch it at times, like when I needed an 'S-'
| word... ended up going with 'Sinatra' since Nancy Sinatra
| performed 'You Only Live Twice' for that movie
| lifeisstillgood wrote:
| Sean ? Spectre ? Sexism ?
| airstrike wrote:
| Sean would be a weird codename and Spectre sounds a bit
| too ominous... We'll have to settle for Sexism
| lifeisstillgood wrote:
| Welcome to your first day on Project Sexism.
|
| Our PERT charts are really pert!
|
| We burn story points not bras round here !
|
| Agility is a core value ... nudge nudge ;-)
|
| ... oh well, The 1950s want their jokes back.
| tomrod wrote:
| I'm a fan of Norse!
| girvo wrote:
| I've been a fan of metals (and more broadly elements) for
| project code names lately. My current project is called
| Cobalt!
| zwaps wrote:
| I can report from another bank (in the top 10 globally), that
| recently moved from a more bespoke system (not even on Python) to
| having Python+Notebooks+Labs available to all - using Apache
| products and a global Anaconda-like Python distribution. The fact
| that you can use Python, R or whatever programming language
| seems to be a factor.
| fock wrote:
| Would you mind telling us which Apache products? I've been
| thinking about pushing that around my organization (mostly
| for replacing reporting/...) but especially the I/O-interfaces
| are not that great if you are not living in a central database
| yet.
|
| I imagine something container-like with the notebook/batch-job
| (can be anything really), hooking up to datasources such as
| SMB-shares, thus allowing people who want to automate
| generating report Z to just request access to folder X for
| their job and thus be able to seamlessly create dashboards/...
| even if a lot of the org still is using "traditional"
| workflows.
| klelatti wrote:
| > This kind of Big Enterprise technology however takes away that
| basic agency of those Excel users, who no longer understand the
| business process they run and now has to negotiate with ludicrous
| technology dweebs for each software change. The previous
| pliability of the spreadsheets has been completely lost.
|
| > Financiers are able to learn Python, and while they may never
| be amazing at it they can contribute to a much higher level and
| even make their own changes and get them deployed.
|
| Coming from a slightly different part of the finance world
| (insurance) this rang very true.
|
| I think there is a huge opportunity here to build on the Python
| ecosystem - which is gaining more and more ground - and provide
| much more powerful alternatives to Excel and legacy proprietary
| systems.
| alexilliamson wrote:
| Also in insurance. I'm a big fan of using python to generate
| "read-only" pretty formatted workbooks. It makes the process
| more reproducible but people who "need my data in excel" still
| get that.
| mwexler wrote:
| This storage of code and data in the same oddball data structure
| and the "walled ecosystem" reminds me of forth. Forth had great
| access to its hardware, but its "screens" and file structures
| were all unique to it... And extremely performant.
|
| Lots of "table-driven" systems live in bankland, and this python
| system sounds like the natural evolution of this...
| lmm wrote:
| I'm seeing a lot of people speculating about which bank this
| might be; I think the point is that it's all of them. I could
| loosely describe a previous job as implementing Morgan Stanley's
| Walpole and integrating more source code management into Minerva
| (even though that system wasn't actually Python-based).
|
| Having a global view on everything is large banks' value-add,
| it's why they haven't been outcompeted by their more nimble
| competitors. Being able to calculate the risk of the whole bank
| isn't just a cool feature, it's the core value proposition of
| this platform.
|
| Being able to just upload your code and run it is really cool,
| and if you squint it looks a bit like what the outside world is
| trying to set up with serverless/lambda-style platforms - just
| write a function, submit it, and there, it's running. (But it's
| worth remembering that Python is not a typical programming
| language; python build, dependency and deployment management is
| exceptionally awful in every respect, this isn't as big a pain
| point in other languages). Obviously there's a tension between
| this and having good version control, diffs, easy rollbacks etc.
| - but because Minerva is already designed to do all that for
| data (because you need that kind of functionality for
| modifications to your bonds or whatever), doing it this way
| strikes a much better compromise than something like editing PHP
| files directly on your live server.
|
| What this article calls data-first design has a lot in common
| with functional programming. I hope that as the outside world
| adopts more functional programming and non-relational datastores,
| Minerva-style programming will get more popular. It really is a
| much better way to write code in many ways. The difficulty of
| integrating with outside libraries is a shame though.
| joshu wrote:
| it's been 15 years and i'm still a bit traumatized by aurora
| sandGorgon wrote:
| Does everyone use a giant Pickle dump ? I mean - how big is
| that ? Petabytes ?
|
| I'm kind of surprised nobody monkey patched python
| serialisation to use a database (much like GitHub did with ssh
| key lookup in MySQL).
|
| What does the devops there look like ? Snapshot every minute ?
| lmm wrote:
| It's not a single giant pickle dump; each individual object
| gets pickled and stored in Minerva (which works more or less
| like Cassandra or something). It's a pretty similar high
| level design to what the likes of Google or Facebook do
| where you store everything as protobufs in BigTable - the
| bank uses pickle rather than protobuf because they put a
| higher priority on being able to store arbitrary objects and
| deal with robustness/compatibility later, rather than having
| to write a proto definition and a bunch of mapping code up
| front. You wouldn't want to use a relational database because
| they're not properly distributed (and, frankly, kind of bad
| and overrated).
|
| The Minerva I worked on was temporal and append-only, like a
| HBase that never did compactions (so "delete" actually just
| writes a tombstone row at a particular timestamp - there was
| an "obliterate" command but you needed special authorization
| to use that), and it was distributed (with availability zones
| even) so you didn't really worry about losing data; loading
| data as-of a particular timestamp was part of every query
| (and implemented efficiently). There were probably regular
| dumps somewhere too but I never needed to encounter those.
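|
| A rough sketch of the design described above: each object pickled
| individually, rows only ever appended, "delete" written as a
| tombstone, and every read taking an as-of timestamp. This is an
| in-memory toy under stated assumptions, not Minerva's actual API;
| AppendOnlyStore and its methods are hypothetical names.

```python
import pickle
from typing import Any, Optional


class AppendOnlyStore:
    """Toy temporal store: rows are appended, never updated or removed."""

    def __init__(self) -> None:
        # key -> list of (timestamp, pickled payload); None marks a tombstone.
        self._rows: dict[str, list[tuple[int, Optional[bytes]]]] = {}

    def write(self, key: str, ts: int, obj: Any) -> None:
        self._rows.setdefault(key, []).append((ts, pickle.dumps(obj)))

    def delete(self, key: str, ts: int) -> None:
        # "Delete" only appends a tombstone; older versions stay readable.
        self._rows.setdefault(key, []).append((ts, None))

    def read(self, key: str, as_of: int) -> Any:
        """Return the object as it was at time as_of, or raise KeyError."""
        visible = [row for row in self._rows.get(key, []) if row[0] <= as_of]
        if not visible:
            raise KeyError(key)
        # Stable sort: among equal timestamps the last-appended row wins.
        _, blob = sorted(visible, key=lambda row: row[0])[-1]
        if blob is None:
            raise KeyError(key)
        return pickle.loads(blob)


store = AppendOnlyStore()
store.write("bond/1", 10, {"coupon": 5})
store.write("bond/1", 20, {"coupon": 6})
store.delete("bond/1", 30)
as_of_15 = store.read("bond/1", as_of=15)
as_of_25 = store.read("bond/1", as_of=25)
```

| Reading as of time 35 raises KeyError because of the tombstone,
| while the earlier versions remain available to as-of queries.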
| sandGorgon wrote:
| So Minerva is like a distributed datastore, specifically
| for python object storage ?
|
| Interesting. Do you think you would do this today with a
| Cassandra/Hbase? Can it be done - let's say take python
| 3.10 and the latest Cassandra (or even better - something
| like Firebase or Cloud Spanner).
|
| Just curious that in a post AWS/Firebase world, can
| something like Minerva be built, without investing in
| writing the db store ground up.
| lmm wrote:
| The incarnation of Minerva I worked on actually used
| Cassandra as its storage backend. But it's something
| that's not particularly useful piecemeal; the great value
| of Minerva is that all the bank's data is there and it's
| all temporal, all access-controlled and all the rest. The
| most fragile and cumbersome parts of Minerva are the
| parts where it integrates with an external/legacy
| datastore - but if you tried to introduce a Minerva-style
| datastore as a small piece in a system that was otherwise
| using a "normal" technology stack, those integrations
| would be most of what you made.
| qorrect wrote:
| I use Pickle quite a lot for caching, a file read is almost
| always faster than a DB query.
|
| For long-term persistent data ? Seems very dangerous to me,
| even reading a pickle from say PyPy vs a CPython interpreter
| corrupts the damn thing.
| miohtama wrote:
| ZODB is the object oriented database as a giant pickle dump.
| Surprisingly, it works and scales quite well. The downside is
| that non-Python tools cannot access it at all.
|
| https://zodb.org/en/latest/
| stevesimmons wrote:
| I learnt Python via Zope in 2000, and attended the Zope
| Conference in Python that year.
|
| Joined JPMorgan in 2010 to work on Athena, and immediately
| had a real sense of deja vu... Athena's Hydra object db
| (essentially an append-only KV store of pickles) felt like
| a great grandchild of Zope's ZODB.
| rbanffy wrote:
| I remember explaining our tech stack (Python and Zope) to
| clients.
|
| "Where is the code for that page?"
|
| "It's in the database"
|
| "Oh... Like MySQL?"
|
| "No. It's an object database"
|
| "???"
|
| I called it "Martian Technology Syndrome". But it worked.
| At later stages we paid the price and had to serialize the
| datastore for migrations, but that's what you get for
| relying on pickles.
| joshuamorton wrote:
| This works until...
|
| Specifically, until you realize that pickle changes based on
| python version, so updating from py3.x to py3.x+1 will
| prevent your application from reading previously stored data.
| detaro wrote:
| This is wrong. pickle can read old files just fine _and_
| lets you generate files in old pickle format versions if
| you require backwards compatibility further than when the
| current protocol was introduced (it does not get increased
| with each python version).
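|
| A small illustration of the protocol point, using only the standard
| pickle module. The record contents are made up. It covers the wire
| format only; changes to your own class definitions are a separate
| compatibility question.

```python
import pickle

record = {"trade_id": 42, "notional": 1_000_000.0}

# Protocol 2 was introduced in Python 2.3; current Python 3 still
# reads and writes it when asked to.
old_style = pickle.dumps(record, protocol=2)

# The newest protocol this interpreter supports.
newest = pickle.dumps(record, protocol=pickle.HIGHEST_PROTOCOL)

# loads() detects the protocol from the data itself.
from_old = pickle.loads(old_style)
from_new = pickle.loads(newest)

# For protocol 2 and later, the stream starts with 0x80 followed by
# the protocol number.
header = old_style[:2]
```

| Both from_old and from_new compare equal to the original record.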
| rich_sasha wrote:
| I think it's mostly true. The complaint with this kind of
| "industrial global Python code base", be it at banks or
| elsewhere, is that often they are hastily cobbled together and
| depend on extreme user care to not flop over all the time.
|
| I guess banks are the archetypal places that care only about
| feature creation and not about maintenance or technical debt.
| When something does break in the end, someone senior just
| shouts at the poor devs until it works again - usually fixed
| with a hasty patch again.
|
| Similarly, documentation? Access control? Sanity? These seem to
| be left behind.
| misja111 wrote:
| > I guess banks are the archetypal places that care only
| about feature creation and not about maintenance or technical
| debt.
|
| It depends which department you are in, but in general:
| absolutely not. Actually the reverse is true. Banks have huge
| risks to manage: just imagine for instance what damage a hack
| of their account system could cause. Or a crash of their
| payment system. Therefore it is of the utmost importance that
| systems are stable and bug free. In most departments, feature
| development is only in second place: stability and
| reliability have priority one.
|
| This concern is so important that it is not just left to the
| responsibility of the banks themselves: for many systems,
| banks have to comply with external standards and are audited
| for that by external agencies.
| dolmen wrote:
| And that's why many old banks still use COBOL.
| simonh wrote:
| You could not be more wrong. Quartz has first class
| documentation, solid tooling, very well thought out and
| rigorous code review and access controls. Banks are regulated
| up to the eyeballs, so everything has to be audited and
| justified in detail.
|
| It's not nirvana, these are real working systems built by
| humans with human failings. There are tradeoffs. Not every
| application is suited to these sorts of platforms, but the
| people building these things are top notch technologists and
| know what they're doing.
| rich_sasha wrote:
| Well, I don't know about Quartz. I worked with DB systems
| and they were awful in that regard. They worked, but
| largely because people stuck to convention.
|
| For example, changing scheduler jobs required submitting a
| change in Excel and having it approved (twice...) by
| someone. Except the table was world-writable and changes
| not logged. So in principle only your appropriate superior
| could approve change, in practice anyone could, and you'd
| never even know.
| simonh wrote:
| Youch, that's nasty. I can completely believe it though,
| banks are huge unwieldy organisations. I spent a fair bit
| of time working with auditors though, so I know a huge
| amount of effort goes into rooting out things like that.
|
| The thing is just because a team in a bank did this
| thing, that doesn't mean "The Bank" thinks that's a good
| idea. Like any company, banks are communities. I'm not
| making excuses, the fact this system wasn't properly
| architected is a failure of governance, but I've been on
| the other side of this trying to get teams to fix their
| problems and adopt resilient processes and procedures.
| Every offender thinks their service is special and their
| violation of the standards is justified.
| kwertyoowiyop wrote:
| For some reason people think everything is as good as it
| can be at FAANG and other big name tech companies, and
| everyone else just walks around with their pants down at
| their ankles bumping into walls until 5:00. It's just not
| true.
| throwawaygh wrote:
| I mean, it's not just FAANG. _Literally everyone except
| banks_ use the standard Python environment. simonh is
| right about the reason for the forked tech stack.
|
| Actually, it's probably not FAANG folks at all. I'd
| expect ex-FAANG folks to be _more_ sympathetic to the
| forked python situation... FAANGs have an abundance of
| non-standard and frustrating infra (wasn't 5TB just
| posted yesterday?), and maybe even on steroids compared
| to banks (do any of the FAANGs _not_ have at least one
| custom linux kernel?) Hell, both As roll a shitload of
| their own _silicon_.
| hnfong wrote:
| Exactly.
|
| Facebook literally maintains a python fork:
| https://github.com/facebookincubator/cinder
|
| Google invented "NoSQL" before anyone else knew what it
| was, and all those "cloud" tools they used internally
| were obviously proprietary (except the ones they open
| sourced). Ex-Googlers I work with typically had to spend
| quite a bit of time re-adjusting to the "inferior" tools
| and processes in other companies.
|
| Microsoft invented their own development ecosystem, and
| the only reason it's "common" or "standard" in the tech
| community is because they sell it as a product. This is
| the same for Apple at least for iOS development, and
| Amazon for their cloud service offerings.
|
| When companies have millions of dollars to spend on
| maintaining a custom development environment that they
| think will give them a competitive edge, they will do it.
| It's the smaller shops that can't afford not to go with
| the flow, so to speak.
| IshKebab wrote:
| I mean Google has Bazel which has reliable hermetic
| distributed builds with modern statically typed
| languages. I don't think this weird custom Python system
| is really in the same league. Barely even playing the
| same game.
| Chris2048 wrote:
| > Quartz has ... solid tooling
|
| Is that how you'd describe the IDE and (integrated) VCS?
| simonh wrote:
| Yes, it's fine IMHO. Ok it's not your favourite Java IDE,
| but it's way, way better than some of the crap I've had
| to use at various places. But then I wrote a 5Kloc PyQt
| desktop app almost entirely in IDLE so yeah, maybe I'm
| not the best judge.
| lmm wrote:
| Hmm, I found the opposite - the fact that there was this
| global framework that managed all the data and code meant
| that access control was actually pretty good, better than
| most tech companies I've worked for. You had a single source
| of truth for what your access rights were, there was
| integrated Kerberos any time you needed to access a system
| outside Minerva. And having all the code in a managed place
| meant good deprecation cycles - not instant deprecation like
| the Google monorepo, but tracking and policy for which old
| versions of libraries were in use and how much that was
| tolerated. Documentation was at least attempted, and while
| platform stability/enhancement work did have to be coupled to
| business initiatives to a certain extent (e.g. "we're doing
| this performance work to enable us to run risk estimation
| more often to meet MIFID requirements at low cost") there was
| leadership that put a value on maintaining high quality code
| and this paid dividends.
| stevesimmons wrote:
| This was 100% my experience too.
|
| The biggest productivity gains were:
|
| - having a single source of truth for both data and code
| (in a closely coupled environment)
|
| - strong, battle-tested libraries to take care of all
| infrastructure concerns.
|
| - enforced code dev/test/review/deployment workflows
|
| This let the front-office devs be highly productive on
| adding real business value for their trading desks.
|
| Remember also that these systems at GS, JPMorgan and BAML
| started around 2007-2010. The infra we all take for granted
| today at AWS/GCP/Azure simply did not exist back then, and
| banks' data security policies at the time did not allow
| cloud processing.
| FabHK wrote:
| > Remember also that these systems at GS, JPMorgan and
| BAML started around 2007-2010.
|
| GS had "these systems" well before 2000 (via J Aron). I
| think around the time you mentioned they spread to other
| firms (in their Python reincarnation).
| andrewshadura wrote:
| > python build, dependency and deployment management is
| exceptionally awful in every respect, this isn't as big a pain
| point in other languages
|
| I'm not sure how to react to that, but these features in Python
| are miles ahead of what many other languages have (or actually
| don't have).
| bloblaw wrote:
| If you compare Python's deployment and dependency management
| to those of statically compiled languages like Go, Rust, Zig,
| or Nim, you quickly see the experience with Python is quite
| poor.
|
| In all the above languages, you simply ship a statically
| compiled binary (often just 1 file), and the user needs
| nothing else.
|
| With any sufficiently complex Python project, the user will
| need:
|
| 1. virtualenv
| 2. possibly a C compiler
| 3. recent versions of Python (and that keeps changing. 3.0->3.4
| are "ancient", and 3.6 seems to be the absolute minimum version
| these days --- due primarily to f-strings)
| 4. Or you ship a dockerfile and then the users need 600mb of
| Docker installed
|
| I sometimes joke that in the future every Python script will
| require a K8s deployment and people will call it "easy".
|
| Python is a great language, but deployment is a massive pain
| point for the language.
|
| When I know I am writing something that has to "just work" on
| a wide range of systems that I don't necessarily control,
| well I don't write the solution in Python. I pick Go, Nim, or
| Rust (Zig would be a good choice too).
| infecto wrote:
| Am I the only one who has never had enough headache over the
| years to say it's awful? I do think dependency management is a
| somewhat difficult problem to solve and a lot of systems have
| pros/cons but I never have had huge issues with Python's.
| [deleted]
| danmur wrote:
| I think there are too many options, or not enough direction
| for busy people. Once you understand how it all works and
| pick / build the right tools it all works pretty well.
| Icathian wrote:
| I would agree for all but deployment. I know my way around
| python reasonably well, but pyinstaller and friends still
| make me have bad days pretty regularly.
| pmontra wrote:
| Four Python projects, same customer, five different
| deployment systems. Docker, a Capistrano lookalike I
| coded in bash, git pull (their former standard), git
| format-patch plus scp, zip archives. Yes, python file.zip
| works if it contains the right files. Probably the latter
| is the easiest way, except it doesn't address the
| dependencies.
| kristjansson wrote:
| > except it doesn't address the dependencies
|
| It does if you put them all in the zip :)
|
| (and build for exactly the platform your customer is
| going to deploy on)
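|
| For reference, a minimal sketch of the "python file.zip" approach
| discussed above: the interpreter runs a zip archive if it finds a
| __main__.py at the archive root, and modules bundled in the archive
| are importable. The file names and contents here are made up, and
| this only covers pure-Python dependencies; compiled extension
| modules cannot be imported directly from a zip.

```python
import subprocess
import sys
import tempfile
import zipfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp) / "app.zip"
    with zipfile.ZipFile(archive, "w") as zf:
        # A pure-Python "dependency" bundled next to the entry point.
        zf.writestr("greeting.py",
                    "def hello():\n    return 'hello from the zip'\n")
        # The interpreter looks for __main__.py at the archive root.
        zf.writestr("__main__.py",
                    "import greeting\nprint(greeting.hello())\n")

    # Equivalent to running: python app.zip
    result = subprocess.run(
        [sys.executable, str(archive)],
        capture_output=True, text=True, check=True,
    )
    output = result.stdout.strip()
```

| The standard library's zipapp module automates building such
| archives.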
| IshKebab wrote:
| That's just untrue. Maybe compared to C++ Python isn't too
| bad. But try something modern like Go, Rust or Deno. Python
| is light-years behind.
| lmm wrote:
| Python is literally decades behind. Dependency resolution is
| nondeterministic by default. The way you run a build is still
| not remotely standard (e.g.: I've downloaded the source of
| one of the top 20 packages on PyPI. How do I run the tests?
| Perl had a standard way to do that back in the '90s).
| Deployment is so bad that people recommend using containers
| as a substitute for something like fat jars.
| jrochkind1 wrote:
| It sounds like there may not be any tests-written-in-code at all,
| let alone CI-style testing?
| stevesimmons wrote:
| Nice stuff!
|
| For a view of Python at JPMorgan, I did a talk at PyData London
| "Python at Massive Scale"
|
| - https://www.youtube.com/watch?v=ZYD9yyMh9Hk
| rwmj wrote:
| I'm getting serious MUMPS flashbacks with a critical hospital-
| wide database accessible through only one programming language
| with absolutely no access control or rollback.
| oconnor663 wrote:
| There's a lot here in common with the higher-prestige systems
| operated by major tech companies. Giant custom monorepos,
| sometimes with custom IDEs built into them. Big proprietary
| services for running asynchronous jobs and collecting logs and
| everything else. Data-driven frameworks for spinning up new
| services. Bespoke databases. All of it rings a bell.
|
| The one thing in there that really jumped out at me as "oh my god
| never ever do that" was using pickle for serious persisted data.
| I can see the upside around letting people who aren't dedicated
| programmers avoid thinking about serialization but...daaamn. My
| understanding is that this locks you into a "the API is all the
| code" situation where you can never change the data layout of any
| class once you've written it?
| tstordyallison wrote:
| It's not just straight pickle (usually) - there's a layer in
| between that allows at the very least for the handling of
| deprecated fields/new fields/renames etc.
|
| But - after a change - you have to choose either to leave that
| 'backward compatibility' in place (essentially forever), or put
| together a job to run (on that scheduler! Hah) to go rewrite the
| data in your new format. If you care enough, you might - and then
| you can remove your logic to handle the old names/shapes.
|
| The charm in a lot of it is in its simplicity. It doesn't claim
| to be very smart, but people get it - and are often remarkably
| productive.
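|
| One way such an in-between layer could look, as a rough sketch
| only: the real bank systems are not shown in this thread, so the
| Trade class, its field names, the version numbers and the
| dumps/loads helpers below are all hypothetical. The idea is to
| pickle a versioned record rather than the object itself, and to
| upgrade old records on load.

```python
import pickle
from dataclasses import asdict, dataclass

CURRENT_VERSION = 2  # hypothetical current layout of Trade


@dataclass
class Trade:
    counterparty: str
    notional: float
    currency: str = "USD"


def _upgrade_v1_to_v2(fields: dict) -> dict:
    """Translate the (hypothetical) version 1 layout into version 2."""
    fields = dict(fields)
    fields["counterparty"] = fields.pop("cpty")  # renamed field
    fields.setdefault("currency", "USD")         # new field with a default
    fields.pop("desk_code", None)                # deprecated field, dropped
    return fields


# Maps a stored version to the function that lifts it one version up.
MIGRATIONS = {1: _upgrade_v1_to_v2}


def dumps(trade: Trade) -> bytes:
    """Persist the fields plus a version tag instead of the raw object."""
    return pickle.dumps({"version": CURRENT_VERSION, "fields": asdict(trade)})


def loads(blob: bytes) -> Trade:
    """Load a record, applying migrations until it matches the current layout."""
    record = pickle.loads(blob)
    version, fields = record["version"], record["fields"]
    while version < CURRENT_VERSION:
        fields = MIGRATIONS[version](fields)
        version += 1
    return Trade(**fields)


if __name__ == "__main__":
    old_blob = pickle.dumps(
        {"version": 1, "fields": {"cpty": "ACME", "notional": 1000000.0}}
    )
    print(loads(old_blob))
```

| Under these assumptions the rewrite job is just
| dumps(loads(old_blob)) over every stored record; once nothing at
| version 1 remains, _upgrade_v1_to_v2 can be deleted.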
| oconnor663 wrote:
| > there's a layer in between that allows at the very least
| for the handling of deprecated fields/new fields/renames etc.
|
| That sounds pretty reasonable. Not so different from tossing
| JSON objects into MongoDB or something like that.
___________________________________________________________________
(page generated 2021-11-06 23:02 UTC)