[HN Gopher] Engineering principles for building financial systems
       ___________________________________________________________________
        
       Engineering principles for building financial systems
        
       Author : KothuRoti
       Score  : 324 points
       Date   : 2024-07-11 01:47 UTC (21 hours ago)
        
 (HTM) web link (substack.wasteman.codes)
 (TXT) w3m dump (substack.wasteman.codes)
        
       | ricericebaby wrote:
       | I'll have this saved for the day IRS comes knocking
        
       | Terr_ wrote:
       | > Use consistent rounding methodologies
       | 
       | This may also mean promoting them to a named part of the
       | business-domain in code, with their own functions, unit-tests,
       | stuff like "fetch Rounding Strategy Suite by country code", etc.
       | 
       | > Use integer representations of time. This one is a little
       | controversial but I stand by it. There are so many libraries in
       | different technologies that parse timestamps into objects, and
       | they all do them differently. Avoid this headache and just use
       | integers. Unix timestamp, or even integer based UTC datetimes
       | work perfectly fine.
       | 
       | Warning: This only works for times that are either firmly in the
       | past or which represent a future delay-offset which is totally
       | under your own exclusive control.
       | 
       | For example, suppose your company has a 48 hour cancellation
       | policy. You can probably compute a future timestamp and use that,
       | it won't depend on whether a customer is about to hit a DST
       | transition or leap-seconds or whatever. nsition.
       | 
       | In contrast, the nation of Ruritania may have their tax year
       | ending at December 30th 17:30 Ruritania Time, but you don't
       | _reeeeealy_ know how many seconds from now that will happen:
       | Tomorrow the Ruritania congress may pass a law altering their
       | tax-schedule or their entire national time-keeping. This means
       | your underlying source-of-truth needs to be the _matching
       | conditions_ , and that often means storing the time-zone.
        
         | szundi wrote:
         | I like this answer as usually the world changes that cannot be
         | easily ported to code makes software shit at the end.
        
           | Terr_ wrote:
           | I like to quip that many hard technical-problems are actually
           | three business-problems in a trenchcoat.
           | 
           | In this case, the technical outgrowth might be heroic over-
           | weekend crunch: Write some code that finds every affected
           | timestamp in the DB, reconstructs the various code-chains
           | that created them, divines an "original meaning", calculates
           | the difference, and somehow apply changes in a way that
           | doesn't make auditors angry or lock a key table for and hour
           | or leads to data-corruption from race conditions...
        
         | bigiain wrote:
         | Also:
         | 
         | > Granularity of your financial amounts should ...
         | 
         | ... comply with relevant regulations.
         | 
         | I worked on a gambling webapp a long time back - we were bound
         | by government regulations on gambling licences to perform all
         | calculations at a precision of one ten thousandth of a cent,
         | and were only permitted to round to cents once at the very end
         | just before displaying a dollar amount to a person.
         | 
         | (This suited me down to the ground because I pointed out to
         | everybody that Javascript couldn't be guaranteed to treat
         | numbers and calculations in a way that would keep the
         | regulators auditors happy, and managed to push _every_
         | calculation involving dollar amounts off to the Java backend
         | team.)
         | 
         | We also used integer UTC milliseconds for time, so we'd have
         | good enough for the auditors timestamps showing who placed a
         | valid bet at (say) 12:34:59.9997 and who else place an invalid
         | bet at 12:35:00.0003 for a wager with a cut off time of 12:35.
         | (We asked for and got a ruling from the regulators that the
         | time that mattered was the time the API call was processed by
         | the backend, since network latencies between the webapp and the
         | backend platform were impossible to predict. I have no idea if
         | the backend had millisecond accurate time, that wasn't my
         | problem and I _so_ didn't want to have them make it my problem
         | by asking the question.)
        
           | Terr_ wrote:
           | > We asked for and got a ruling from the regulators that the
           | time that mattered was the time the API call was processed by
           | the backend
           | 
           | I haven't been in that situation, but I imagine I would reach
           | for a "what if it was snail-mail" analogy. "It's dangerous to
           | honor whatever time the bettor wrote by hand, the postmark-
           | time may not be fair if the mailman got delayed because of a
           | half-illegible address or random USPS delays... So how about
           | we go for when it was actually delivered to us?"
        
             | davidw wrote:
             | I wonder if that's the sort of ruling that just got thrown
             | out by the Supreme Court in favor of having courts decide.
        
           | Sammi wrote:
           | Luckily javascript has bigint now, so you can do this stuff
           | safely in js now (just don't do it in the frontend ever,
           | because you cannot trust the frontend).
        
           | throwaway2037 wrote:
           | You wrote "a long time back", so maybe my point can be
           | ignored.
           | 
           | Did you not consider to use decimal math? According to MDN,
           | the largest int in Javascript is 2^53 - 1. If you steal four
           | digits for your fraction, that still leaves a huge number. I
           | will assume that no one was gambling more than one billion
           | currency units, so you had plenty of space. Do all your calcs
           | with ints, then divide by 10,000 for a final result.
           | 
           | Ref: https://developer.mozilla.org/en-
           | US/docs/Web/JavaScript/Refe...
        
           | HWR_14 wrote:
           | > This suited me down to the ground because I pointed out to
           | everybody that Javascript couldn't be guaranteed to treat
           | numbers and calculations in a way that would keep the
           | regulators auditors happy,
           | 
           | Javascript can represent dollar values up to $2.25 billion at
           | a resolution of ten-thousandths of a cent without any loss of
           | precision.
           | 
           | I would want all money calculations done by the backend team
           | as a matter of policy. But it's not a technical limitation.
        
             | withinboredom wrote:
             | While you are correct that it can _represent_ these numbers
             | with that precision... it cannot operate on them and retain
             | that precision, so DO NOT USE FLOATS TO REPRESENT MONEY.
             | Thank you for your time.                   let a = 35
             | let b = -34.99              console.log(a+b)
             | // output: 0.00999999999999801
        
               | zoover2020 wrote:
               | Brr almost got the shivers until I read your comment
        
         | amluto wrote:
         | > Unix timestamp, or even integer based UTC datetimes work
         | perfectly fine.
         | 
         | For serious work, it's worth noting that leap seconds are not
         | representable in this format. Many financial applications can
         | get away without representing leap seconds, but this is
         | fundamentally a kludge. If you actually need to represent _any_
         | point in time (according to UTC or any particular timezone),
         | use a representation that actually represents what you want.
         | 
         | This is especially true for future times. If you mean "10 AM,
         | August 1, 2025, New York time", then represent that! Do not
         | kludge it as an integer, do not use a different time zone as a
         | proxy, do not use UTC, and don't use a representation that
         | contains the terms "standard" or "daylight". If you do this
         | wrong, you will eventually regret it. Bonus points for also
         | being able to represent future leap seconds, I suppose, but
         | I've never encountered any need to do that.
         | 
         | Keep in mind that those properly represented future times _are
         | not guaranteed to ever happen_. Time zone definitions can and
         | do change.
         | 
         | (And, since this is finance, you also get to deal with bank
         | holidays, past, present, and future. Your legal department may
         | have opinions about what should happen when a future date, that
         | you thought wouldn't be a bank holiday, turns out to be a bank
         | holiday after all.)
         | 
         | (Also, since this is finance, sometimes people will want to
         | translate a _time_ into a _date_. This can be a nontrivial
         | operation. The date associated with a timestamp may or may not
         | matter for actual business purposes, and various parties may or
         | may not agree about what date a certain time belongs to and may
         | or may not implement their opinion correctly.)
        
           | throwaway2037 wrote:
           | Is there any financial system, regime, or product that
           | depends upon leap seconds? I have never heard of any. If so,
           | it is safe to ignore them.
           | 
           | When storing date/time stamps, I prefer to persist into two
           | separate fields: (a) date/time using UTC time zone and (b)
           | original time zone offset, e.g., UTC+8. When debugging, it is
           | useful to know the original time zone offset. And when
           | querying and processing data, it is easier to use UTC time
           | zone.
        
             | amluto wrote:
             | > Is there any financial system, regime, or product that
             | depends upon leap seconds? I have never heard of any.
             | 
             | They're certainly unusual.
             | 
             | > If so, it is safe to ignore them.
             | 
             | Really? Leap seconds have happened, and they could
             | plausibly happen again. If so, during those leap seconds,
             | events will occur. Do you need to record the time of those
             | events?
             | 
             | > When storing date/time stamps, I prefer to persist into
             | two separate fields: (a) date/time using UTC time zone and
             | (b) original time zone offset, e.g., UTC+8.
             | 
             | That covers most bases for past events. It is inappropriate
             | for many future events.
             | 
             | Hmm, maybe a good system library would have separate types
             | for timestamps that were actually observed (or interpolated
             | or otherwise computed after the fact, etc) and for times of
             | events that may not have occurred at the time of
             | computation.
        
             | HWR_14 wrote:
             | > Is there any financial system, regime, or product that
             | depends upon leap seconds?
             | 
             | According to the standard for increasing leap seconds, they
             | interact with the opening auction of the Tokyo Stock
             | Exchange by delaying it one second.
        
             | gradschool wrote:
             | > If so, it is safe to ignore them.
             | 
             | if you say so, but don't whinge if someone pulls off a
             | major heist during a leap second by counting on you to
             | ignore it
             | 
             | https://www.imdb.com/title/tt0137494/
        
             | yareally wrote:
             | Only a guess, but maybe financial markets that run 23 hours
             | a day, 6 days a week, such as futures?
        
           | jwarden wrote:
           | There's a difference between a point in physical time, and a
           | coordinate on a calendar. UTC works great for points in time,
           | both past and future. So UTC is very appropriate for logging,
           | including of financial transactions, where often what matters
           | is the exact point in time something happens and not how that
           | maps to someone's calendar or the time zone on a server.
           | 
           | But UTC doesn't work at all for a calendar app, where a
           | calendar entry could for example span a calendar day, in some
           | specific time zone, especially if it is in the future and
           | there is not yet a mapping between that future date and UTC.
        
             | cgio wrote:
             | The mapping to someone's calendar is important too, though,
             | when you start looking at accruals...
        
         | throwaway2037 wrote:
         | In Java, the original date/times were awful in the base
         | library. Eventually, a very good replacement appeared called
         | Joda-Time. Eventually, a "fix-all-the-design-flaws" 2.0 version
         | was officially added to the base library as part of JSR-310.
         | Excluding leap seconds (for which I have no practical need), I
         | have never once found something that it cannot handle -- no
         | matter how complex my conversion and calculation needs. A few
         | other languages have basically ported Joda-Time or JSR-310 into
         | their own. In 2024, date/time math feels like a hard, but
         | solved problem with the correct libraries.
         | 
         | And last: I really like your example about Ruritania and the 48
         | cancellation policy. Sure, the country name is intended as a
         | joke, but in the Real World all kinds of crazy things like
         | this. Even Islamic calendars can be quite complex in 2024 --
         | knowing what days will be holidays in the future is a non-
         | trivial task.
        
         | darby_nine wrote:
         | Honestly we should just abandon all that bullshit and just use
         | TAI for internal representation. Would probably save billions.
        
         | amelius wrote:
         | > but you don't reeeeealy know how many seconds from now that
         | will happen
         | 
         | So don't use a time-like object. Instead use a condition.
        
       | Galanwe wrote:
       | I wouldn't endorse many of the things stated there. A lot of the
       | points are inaccurate or straight up misleading.
       | 
       | > The three main goals of your accounting system are to be (1)
       | Accurate, (2) Auditable and (3) Timely.
       | 
       | You forgot "consistent", which is very different from "accurate".
       | Most financial transactions have multiple time dimensions: a
       | trade is requested, booked, filled, matched, reconciliated, paid,
       | settled at different times. These times often don't even follow a
       | perfectly logical order (you can request and fill a trade and
       | book it an hour later). These time dimensions are often why there
       | are multiple, layered, ledgers to store transactions. An
       | accounting system needs to provide consistent views that depend
       | on the dimension you want to look at (e.g. a trade that is filled
       | but not settled counts for front office accounting, not for back
       | office accounting).
       | 
       | > If there is an event that occurred but is not in the financial
       | record, than the system is not complete.
       | 
       | The base assumption of every (working) accounting system I have
       | worked with is the reverse. There WILL be transactions that flow
       | through the system in an untimely manner, your system needs to
       | handle that, thus the need for multiple layers of ledgers, and
       | the necessity to batch reconciliation processes between these
       | layers.
       | 
       | You will book trades after they are filled. You will pay
       | materials before inputing the bill. Etc.
       | 
       | > If you are only working with dollars, representing values in
       | cents might be sufficient. If you are a global business, prefer
       | micros or a DECIMAL(19, 4).
       | 
       | If you are a global business only working with euro and dollar
       | maybe. Otherwise most FX markets quote to 8 decimals, and that's
       | not counting swaps. I would recommend using at least 8 decimal
       | places for storage.
       | 
       | > Delay currency conversion as long as you can. Preemptively
       | converting currencies can cause loss of precision.
       | 
       | No! Delay currency conversion _until conversion occurs_! An
       | accounting system does not convert currencies at will. There is
       | no dollar equivalent to euro, you either have dollars or euro.
       | And it's not a matter of rounding errors, cash management is part
       | of legal and accounting duties.
        
         | eru wrote:
         | If you are only doing accounting (and not like high frequency
         | trading), you can even consider using arbitrary precision
         | rational numbers internally. They are most likely fast enough.
         | 
         | However, similar to what you suggested about currency
         | conversion, your business logic might already dictate to you
         | exactly what kind of precision and rounding rules you should be
         | using; and deviating from that might cause havoc.
        
         | blowski wrote:
         | Seems to be like you're in violent agreement with the OP on all
         | those points, just phrasing the same truth differently.
        
         | tomnipotent wrote:
         | > You forgot "consistent"
         | 
         | I once had an auditor tell me that the books are allowed to be
         | wrong, but should be wrong consistently. I've even run into the
         | timestamp issue, which generally turned out to be a non-issue
         | unless the wrong date was used (like order date vs. shipping
         | date for revenue recognition) or a considerable number of
         | errors caused dates to end up in a different reporting period.
         | If you're dates are consistently a few hours off, then they'll
         | be consistently off across reporting period boundaries and that
         | can easily be explained to an auditor.
         | 
         | > the necessity to batch reconciliation processes between these
         | layers
         | 
         | I've never seen a reporting period closed without accountants
         | making journal entries for manual adjustments. These can often
         | be material changes.
         | 
         | > An accounting system does not convert currencies at will
         | 
         | This is correct, but for internal reporting you might need to.
         | An accountant and cash flow will capture the important
         | conversion rate when the funds are expatriated, but your
         | finance dashboard showing sales in euros will have an executive
         | that wants to see it in US dollars. It's impractical to use the
         | conversion rate from the transfer that could happen weeks or
         | months after the reporting period, so you come to a compromise
         | of using a X-day FX moving average that's "close enough" and
         | everyone's happy. What goes in your 10-K/Q will be a different
         | story.
        
           | Galanwe wrote:
           | > I've never seen a reporting period closed without
           | accountants making journal entries for manual adjustments.
           | These can often be material changes.
           | 
           | The fiscal view is just one of the layers your accounting
           | system takes into considerations.
           | 
           | You typically perform fiscal reconciliation when the fiscal
           | year ends, your bookings are often reconciliated end of day,
           | your settlements usually between 1 or 2 days (depending on
           | the currency), your FX hedge often quarterly to enter IMM
           | interest rate swaps, your front office is often optimistic
           | realtime accounting (i.e. you consider what has been promised
           | just now to already be out of the stock), etc.
           | 
           | > This is correct, but for internal reporting you might need
           | to.
           | 
           | Right but this is more some form of Bi on top of the
           | accounting system than part of it. For the accounting system
           | itself, you will often want to know precisely the cash per
           | currency so that you can actually hedge it. e.g. if you're
           | ultimately getting paid in dollar and have exposure to some
           | salaries you pay in euro.
        
         | amluto wrote:
         | > Most financial transactions have multiple time dimensions: a
         | trade is requested, booked, filled, matched, reconciliated,
         | paid, settled at different times. These times often don't even
         | follow a perfectly logical order (you can request and fill a
         | trade and book it an hour later).
         | 
         | Similar issues exist with transaction identifiers. Every party
         | involved with any aspect of a transaction may assign that
         | aspect of the transaction zero or more identifiers, those
         | identifiers may or may not be unique over any particular set of
         | transactions and/or events related to transactions, and the
         | sources of those identifiers may or may not even correctly
         | document what, exactly, those identifiers identify. (And their
         | representatives may think they know, and they may be wrong.)
         | Your own code may need to generate identifiers for events
         | associated with transactions, and you may or may not want to
         | leak information in your choice of identifiers.
         | 
         | For new designs, my suggestion would be to treat these as one-
         | to-many relations. Have a table along the lines of (transaction
         | id, event, identifier issuer, identifier name, identifier). You
         | might also need a timestamp in there, and you may need to throw
         | in a row id of some sort as well. You might want to make an
         | allowance for identifiers having variable _type_ -- are they
         | bytes? Unicode? something else? You could plausibly get away
         | with making these be JSON keys in an event object instead, but
         | this gets awkward when someone tells you an identifier after
         | the fact.
        
           | Galanwe wrote:
           | Spot on.
           | 
           | Worst, these identifiers can even be point-in-time. I usually
           | store them as you do, with an additional start/end date.
        
           | simpaticoder wrote:
           | There is an interesting tension in software between
           | _measurement_ and _validation_. The two concerns are often
           | conflated, resulting in software systems that cannot measure
           | the world as-it-is. The world (more precisely, measurements
           | of the world) may be in (and often is in) an  'invalid' or
           | 'inconsistent' state. Software can only help you detect and
           | resolve such states if it is capable of representing them.
        
         | withinboredom wrote:
         | > No! Delay currency conversion _until conversion occurs_!
         | 
         | Sometimes these are legal requirements one way or the other. In
         | my country, invoices in other currencies are based on the ECB
         | rate of the date of the invoice, no matter what the actual
         | value is when converted.
        
       | JetSetIlly wrote:
       | Good summary but the article doesn't mention the engineering of
       | the user interface. In this field of accountancy, I think the UI
       | is an engineering problem at least as important as anything else
       | mentioned in the article.
       | 
       | As someone who works in accountancy (as a bookkeeper/accountant)
       | I have to be brutally honest and say that I've been very
       | disappointed in the UI of all the accounting software packages
       | I've used. None of them give me the immediacy or simplicity of a
       | well organised filing cabinet, they really don't.
       | 
       | I'm not sure what a good solution would look like, but I can't
       | help thinking that making the double-entry system more visible
       | and apparent to the user would be a good start.
        
         | jsnk wrote:
         | This was the same perspective i had as an engineer working on a
         | finance problem for accountants and sales. So i would build a
         | nice reporting tool that had modern ui. The people liked it at
         | first but they always regress to spreadsheet. They just ask me
         | to export the data in spreadsheet.
        
         | Ratelman wrote:
         | As someone not in accountancy but having worked in corporate
         | and been forced to work on accounting software for managing
         | budgets, approving transactions etc. - you have really gotten
         | the short end of the stick from general usability perspective.
         | There are so many things that don't make sense - even for the
         | accountants that use the system on day-to-day basis - they have
         | workarounds for (what I would consider) the most basic of
         | things (or over-complicated solutions for something basic like
         | manager approvals for logging expenses - why do I have to click
         | through several screens that don't follow any sort of clear,
         | logical flow). And it's so embedded in an organization - that
         | you can rest assured once that client is signed up they're not
         | switching that system. Same for HR software...
        
         | openrisk wrote:
         | > I think the UI is an engineering problem at least as
         | important as anything else mentioned in the article.
         | 
         | Indeed.
         | 
         | > I'm not sure what a good solution would look like
         | 
         | What is interesting is that on the business reporting side
         | (which is in any case deeply enmeshed with the accounting
         | system side) there is a defacto standard (xbrl and its various
         | specifications [1]) which via web technologies (iXBRL [2])
         | provides a path towards generic web UI patterns.
         | 
         | [1] https://www.xbrl.org/int/gl/2016-12-01/gl-
         | framework-2017-PWD...
         | 
         | [2] https://www.xbrl.org/the-standard/what/ixbrl/
        
         | pintxo wrote:
         | I agree.
         | 
         | While I managed our start ups finances, I so much preferred to
         | stay doing everything in Excel over having to use our
         | accountants system. And I get a glimpse into SAP from time to
         | time as my wive is an accountant. It's incredible how shitty
         | those systems are from the user perspective.
         | 
         | But I guess, this is what you get when the user is not also the
         | customer.
        
         | jakjak123 wrote:
         | I agree, but also our accountants kept asking for the craziest
         | of things. I guess its a problem that the real world of
         | transactions is full of async processes and weirdness. There is
         | always something new weird that will happen, so it always ends
         | up back in a spreadsheet software.
        
       | rwieruch wrote:
       | Related [0], but one week ago I have written about my experiences
       | using TypeScript for an invoicing system, if anyone is
       | interested. It's especially about rounding errors and how to
       | prevent them. Since then we have created hundreds of invoices
       | (and cancellations) and everything worked as expected with minor
       | hiccups in between.
       | 
       | [0] https://www.robinwieruch.de/javascript-rounding-errors/
        
         | blackhaj7 wrote:
         | Superb article Robin
         | 
         | Sounds like you really went though a rough time building the
         | invoicing system!
         | 
         | I have read many times about the difficulties of dealing with
         | decimals/rounding but never found an article that nailed the
         | explanation as well as you did
        
       | alphalima wrote:
       | On the best practices, as noted by others, there are probably
       | classes in your standard lib/ commons lib that cover this stuff
       | better than storing in integers (e.g. BigDecimal in Java and
       | Python Decimal have precision and rounding mode concepts built
       | in).
       | 
       | Something I've found valuable is on is managing euler
       | units/ratios (e.g. proportions 10^1, percentages 10^-2, basis
       | points 10^-4) . Enforcing a standard that all interchange, data
       | storage has either the same scale, _or is stored with its scale
       | (and not in the name)_ will reduce errors hugely significantly.
        
         | eru wrote:
         | You standard library might even already have libraries for eg
         | storing currencies or time, too.
        
       | 0wis wrote:
       | Another good engineering principle is to write a lot of tests. I
       | know it is a basic rule of engineering but it is not always
       | followed.
       | 
       | However, beware as the result of your test will also be seen by
       | auditors, so if you refactor a system, it is better to have a
       | write/solve approach than write all your tests firsts and solve
       | them afterwards.
        
       | noelwelsh wrote:
       | > Use integer representations of time
       | 
       | This sounds like a type system issue (and the linked Etsy post is
       | also a type system issue), or more precisely languages / systems
       | with a loose type system where you can just pretend a time is a
       | number without having to explicitly say what that number means.
       | 
       | Conventions are good, but (real) type systems are better because
       | they check everything every time.
        
         | eru wrote:
         | Yes. And in this context, it doesn't even matter (too much)
         | whether your type system is statically checked or done at
         | runtime.
         | 
         | The main difference is whether the compiler will yell at you,
         | or whether it'll be a runtime error. But the important point is
         | that your mixing up won't go through and silently do the wrong
         | thing.
        
       | mgaunard wrote:
       | Representing prices is difficult because a lot of financial
       | instruments have widely different values and price increments.
       | 
       | Decimal floating-point gives you the range to cover everything
       | but is generally very inefficient to process.
       | 
       | What you usually want is fixed-point decimal, with the caveat
       | that the scaling factor will be different per asset.
        
         | eru wrote:
         | Depending on how many numbers you process, inefficiency might
         | not matter.
         | 
         | You could probably use arbitrary precision rational numbers (eg
         | as implemented in Python
         | https://docs.python.org/3/library/fractions.html) for all your
         | accounting needs, and it would still be fast enough.
        
           | mgaunard wrote:
           | There are millions of financial instruments and their price
           | change 100 million times per day.
           | 
           | That's a lot to process, so yes, it matters.
        
       | hippich wrote:
       | Another interesting head scratcher (thank God I never had to deal
       | with myself, but saw the mess it can make in the data easily) are
       | government-enacted devaluations. And probably very similar to the
       | concept, stock splits.
        
       | 47 wrote:
       | A great companion read is Martin Fowler's "Accounting
       | Patterns"[1]. Having built and maintained systems that manage
       | financial events for over a decade, I wish I had read these
       | patterns earlier.
       | 
       | [1] https://martinfowler.com/apsupp/accounting.pdf
        
         | arsenico wrote:
         | I think Fowler's work is an underrated must-read for anyone who
         | works in domains related to moving money. Makes any kind of
         | engineering practices and architecture principles logical and
         | make sense.
        
       | willtemperley wrote:
       | > Use integers to represent financial amounts
       | 
       | I followed this mantra when building a trading system. This was
       | the worst engineering decision I ever made - it added complexity
       | and runtime overhead, plus readability was reduced when
       | debugging.
       | 
       | If your language has a proper decimal type like BigDecimal in
       | Java or Decimal in Swift, I'd suggest using it. They offer
       | perfect precision for financial applications, are battle tested
       | and are a natural way to represent financial amounts. Yes, Java
       | BigDecimal is unweildy, but it works.
       | 
       | If you're using JavaScript however, definitely use integers to
       | represent financial amounts.
        
         | erik_seaberg wrote:
         | There's also javax.money.
        
       | ibejoeb wrote:
       | We're basically talking about bookkeeping systems here, rather
       | than high finance. Use a good relational database.
       | 
       | 1. ACID, so you don't have to invent it.
       | 
       | 2. Arbitrary precision numeric data types with vetted operations
       | and rounding modes.
       | 
       | 3. It does time as well as anyone can
       | 
       | 4. Your computations and reporting can be done entirely in SQL.
       | 
       | 5. When you get good at SQL or hire someone who is, the reporting
       | is elegant.
       | 
       | 6. It's as fast as it needs to be, and a good database architect
       | can make it extremely fast.
       | 
       | 7. It has all of the tooling to do disaster prevention and
       | recovery.
       | 
       | I've built so many financial systems for the biggest
       | multinationals in the world, and I have never regretted doing the
       | bulk of the work in the database. Coca-Cola, GE, UPS, AIG, every
       | iteration of the Tycos, basically every pharma with their crazy
       | transfer pricing schemes... Whenever I experienced a performance
       | problem, there have been way to address it that were far easier
       | than what would have been to reinvent all of the hard tech that
       | underlies consistent, accurate financial systems.
        
         | tumetab1 wrote:
         | > 2. Arbitrary precision numeric data types with vetted
         | operations and rounding modes.
         | 
         | Be aware that a good rationale for choosing the database but
         | you should enforce a specific precision per table otherwise you
         | can get nasty accounting bugs because you're adding a 3.0095
         | amount with an 9.99 amount.
        
           | ibejoeb wrote:
           | Yes. The data type itself supports arbitrary precision. I'm
           | not suggesting that you chose the precision arbitrarily.
        
         | throwaway2037 wrote:
         | What is the meaning of "high finance"?
        
           | ibejoeb wrote:
           | Specialty stuff like high frequency trading, risk models,
           | bespoke instruments, and other non-retail products. Like if
           | you're doing HFT, you're probably rolling your own loops.
           | 
           | But the broader point is that this piece is about accounting-
           | related systems that don't typically deal with hard volume,
           | speed, or latency constraints. The term "finance" can mean
           | the corporate "finance & accounting department" but you want
           | to differentiate that from "finance" the industry. The
           | headline does really tell you what actual material is.
        
             | dmoy wrote:
             | Yea this article really should have used "accounting"
             | instead of "finance" (or more specifically, operational
             | accounting).
             | 
             | Right, like, if you're a programmer going into a project-
             | for-finance/accounting-team at your large megacorp, and
             | don't know if this article is the right subject area, just
             | ask the finance/accounting person about "year end close"
             | and see if they get a thousand yard stare or not. If yes,
             | this is the right article lol.
        
               | xbar wrote:
               | Most correctly: Financial Accounting.
        
             | seanhunter wrote:
             | For HFT, risk models, exotic products etc you still are
             | almost certainly better off doing your bookkeeping using a
             | normal database. In my experience[1] you want to make sure
             | the financial records are consistent and accurate and able
             | at all times to be reconciled to third parties no matter
             | what lifecycle events happen on your trades and you want
             | that record to be out of band with your actual
             | decisionmaking stuff which is the bit that needs to be fast
             | and/or complex. For exotics specifically, the contracts
             | often specify exactly how things like rounding etc should
             | be handled so you need to be able to implement custom logic
             | to do that accurately or you'll get constant annoying price
             | breaks of a few cents and your ops people will hate you
             | because they'll keep having to reconcile things with
             | counterparties.
             | 
             | [1] Fast equities trading and slow complex exotics and
             | bonds trading/pricing/risk. Strictly speaking the fast
             | thing was algo trading[2] rather than actual HFT but we
             | were counting microseconds etc so it's similar in the time
             | constraint.
             | 
             | [2] The distinction I would draw is algo trading is
             | algorithmic execution of client orders in a brokerage
             | setting vs HFT is proprietary trading like you would do in
             | a hedge fund. In algo trading a lot of thought goes into
             | how you can ensure your clients are not able to be ripped
             | off by HFTs in the market, so they are sort of the
             | adversary even though a lot of what you're doing is similar
             | from an execution pov.
        
               | nostrademons wrote:
               | You should do your _bookkeeping_ in a normal database,
               | but the models themselves usually need something
               | specialized. Ideally keep them in cache /RAM, because if
               | you have to hit the disk you'll probably get beaten to
               | execution by another HFT. If the data set is too large to
               | keep in RAM (and you can't afford to just buy more
               | computers), then page out portions to flash and mmap them
               | back in when you need to. (Ironically, this is sort of
               | how programs were constructed in the 1970s before the
               | days of virtual memory, filesystem abstractions, and file
               | formats.)
        
               | seanhunter wrote:
               | Absolutely. Our models didn't look anything like most
               | programs I have worked on elsewhere in terms of
               | architecture etc. But the execution/bookkeeping records
               | just went into a conventional sql db out of band from the
               | main routing/strategy logic.
        
         | throwaway2037 wrote:
         | The best systems that I have ever seen have two parts: (a) a
         | read/write relational database (w/SQL) to persist data and (b)
         | a read-only OLAP cube. With enough time, your powerusers will
         | create the most insane reports imaginable using the OLAP cube
         | view.
        
         | nyrikki wrote:
         | ACID and DTCs will get you in legal trouble with financial
         | systems.
         | 
         | That is part of the reason financial systems backed by SQL are
         | so complicated and fragile while old COBOL code keeps going.
         | 
         | RDBMS tutorials always use ATMs as an example, but you can't
         | 'roll back' with ATMs, you have to use compensating actions.
         | 
         | ACID is useful in many contexts, but actually adds
         | complications to financial transactions.
         | 
         | Think of the ledger as being the source of truth, with the DB
         | as simply a materialized view.
         | 
         | It has applications with edge caching in web apps too. Events
         | as the source of record can help get rid of cache invalidation
         | challenges compared to cache aside or write back.
         | 
         | Two sides errors that don't impact the trial balance is
         | something to search for if you want to know why an ACID DB as
         | the source of truth is problematic in accounting.
        
       | gushogg-blake wrote:
       | Regarding the types, I think it's easy to fall into the trap of
       | thinking you have to try and force the domain concept into one of
       | the commonly available data structures at the point of creation,
       | i.e. the price "PS3.99" has to become (int) 399 or (float)
       | 3.99000000001.
       | 
       | Actually, as another commentor mentioned, you can and should just
       | choose something that represents what you want. In many cases it
       | will be an object with various domain-specific fields. If you
       | need to be sure that "3.99" always comes back as "3.99", you can
       | use a string, which doesn't have any issues with precision in the
       | storage format, and will usually map exactly to how humans deal
       | with the concept. (Not necessarily recommending this, but it's
       | worth considering.)
       | 
       | Similarly with JavaScript precision issues, to take another
       | example that's been mentioned here - if you need numbers to be
       | precise and the performance cost is acceptable, JavaScript can
       | compute to arbitrary precision just as well as any other
       | language. Just don't assume you're forced to convert your concept
       | of a number to use the built-in numeric types and operations.
        
         | ibejoeb wrote:
         | The string "3.99" always comes back as a string, but when you
         | need do anything at all to it, it resolves down to the same
         | problem.
        
           | langcss wrote:
           | Ideally, one should make illegal state unrepresentable. The
           | problem with a string is there is no guarantee it represents
           | a money amount.
        
         | a_c_s wrote:
         | If you have a decimal then you should store it as a Decimal
         | type, not a float nor an int. This is why Decimal types exist -
         | the fact that most programming languages don't have a native
         | Decimal type baffles me.
        
       | gwright wrote:
       | The biggest misunderstanding I see in these types of discussions
       | is that there is a lossless conversion between binary floating
       | point representation and decimal representation.
       | 
       | Very few values have exact decimal and binary floating point
       | representations.
        
       | Havoc wrote:
       | Mostly agree though the concept of materiality seems out of place
       | here.
       | 
       | Decision making factors that in sure but system design absolutely
       | shouldn't. There is no reason why accounting systems can't be
       | accurate down to the penny. That's what computers are good at
       | after all.
       | 
       | Only place where I could see system like relevance is in
       | presentation- showing revenue on millions etc
        
         | infecto wrote:
         | Agree. I mostly ignored the rest of the post because this threw
         | red flags that the poster does not build real financial
         | systems. Financial systems should be capturing everything, now
         | on the reporting end there is a question on materiality but
         | most definitely not on the system design.
        
       | DarkContinent wrote:
       | > Batch is just a special case of streaming
       | 
       | No. Designing a system that is always up and running and can
       | process small amounts of data constantly is a completely
       | different problem from designing a system that runs occasionally
       | with a lot of data. For one thing, your output formats are
       | usually different in the latter case (maybe you're creating a PDF
       | for example). Also the high availability requirement just makes
       | things different at the design level.
       | 
       | Finally, the author claims it's not hard to switch between batch
       | and streaming. With a large volume of preexisting data, this is
       | just not true. For example, if you make a REST API call for each
       | document in a DB, it can take days or months to load that. If
       | batching together documents isn't a possibility, how do you move
       | data between stores easily? (This data movement is often required
       | when switching between batch and streaming.)
        
         | lkdfjlkdfjlg wrote:
         | I'm seconding this, and I have first hand experience in exactly
         | this problem, in finance. My first boss also had the view that
         | "batching is a special case of streaming where you stream N and
         | streaming is also a special case of batching where the batch
         | size is 1 and so it doesn't matter which one you implement".
         | This was never performant enough and he was eventually asked to
         | leave.
        
         | erik_seaberg wrote:
         | The key is working incrementally, not sitting idle for months
         | and then hammering production as hard as possible trying to get
         | all the deferred work done in exactly one batch.
        
       | woah wrote:
       | Cryptocurrencies tend to use very large integers (128 or 256 bit)
       | for everything throughout the system. Rounding doesn't happen
       | until a number hits the UI. Why would you design a system that
       | intentionally destroys data several times internally?
        
         | HWR_14 wrote:
         | Sometimes precision (and thus data) needs to be destroyed. Say
         | you owe me $1,000.211 and write me a check for $1,000.21
         | 
         | Do you still owe me a tenth-of-a-cent? Even if you do, does
         | anyone (me, you, the taxing authorities) care? In fact, does
         | anyone want to go through the effort of maintaining that debt?
         | Even if I repeat this with 100,000 customers, I'm out $100.
         | Again, does anyone care?
        
         | makeitdouble wrote:
         | That's usually how it's done in systems using int as well:
         | internal calculations are done with as much precision as
         | needed, and only the significant results are converted back to
         | int using a defined rounding strategy.
        
       | cvdub wrote:
       | Managing rounding and ensuring each set of entries balance can be
       | tricky, especially if you have to share data with a system that
       | can only handle currencies with two decimal places. There are
       | scenarios where it's actually not possible to have every set of
       | entries balance, and have the total sum of all entries equal the
       | correct balance.
       | 
       | For example, if you had three monthly payments of $5, $10, and
       | $10, you might book something like:
       | 
       | Cash (5) Expense 8.33 Deferred (3.33)
       | 
       | Cash (10) Expense 8.33 Deferred 1.67
       | 
       | Cash (10) Expense 8.33 Deferred 1.67
       | 
       | All three of those blocks of entries balance, but the sum of
       | expenses is 24.99 instead of 25.
       | 
       | I'm not sure there's a way around this issue if you're forced to
       | use two decimal places. Luckily the discrepancy is immaterial.
       | I'd love to know if anyone else has encountered this problem.
        
       | caseysoftware wrote:
       | Years ago, I worked on APIs for a top 10 bank in the US. In our
       | discovery/delving into their logic, the most fascinating thing
       | was interest rate.
       | 
       | Some of their systems stored - and therefore the APIs - presented
       | interest rate as a simple number (5% was 5) where other used a
       | decimal (5% was 0.05) and still others used basis points (5% was
       | 500).
       | 
       | One of their VPs told me they once saw an unexpected uptick in
       | loan applications and dug in to find out the quoted rate was
       | 0.05%
       | 
       | Consistency is key.
        
       | jroseattle wrote:
       | What I find more important than the timestamp format -- the
       | timestamp source.
       | 
       | Centralize from where the ts is set (the db is a great place to
       | do this.) Don't let client-code set the timestamp.
        
       ___________________________________________________________________
       (page generated 2024-07-11 23:00 UTC)