[HN Gopher] My Beancount books are 95% automatic after 3 years (...
       ___________________________________________________________________
        
       My Beancount books are 95% automatic after 3 years (2024)
        
       Author : leonry
       Score  : 154 points
       Date   : 2025-03-05 16:08 UTC (6 hours ago)
        
 (HTM) web link (fangpenlin.com)
 (TXT) w3m dump (fangpenlin.com)
        
       | achierius wrote:
       | Anyone tried this tool out for their personal accounting? My
       | books are currently in a big custom Google sheet system, and
       | building metrics on top of that has been a pain.
        
         | miroljub wrote:
         | It follows the same principles as ledger-cli and hledger.
         | 
         | You won't make a mistake using any of them.
         | 
         | I used and still use both, since they share the file format.
         | Beancount is more opinionated, with slightly different file
         | format, so I didn't bother adding it.
        
         | Sphax wrote:
         | Yes. I have maintained my ledger with beancount for 5/6 years
         | now. I don't automate downloading from my bank, it's not worth
         | the hassle especially now that every login requires 2FA with my
         | phone.
         | 
         | But I did write an importer for the csv files my bank provides
         | and with smart_importer I don't even have to categorize the
         | statements anymore (although there are mistakes sometimes). I
         | don't gather metrics though; I use fava to have a visual view
         | of my books.
         | 
         | I usually spend half an hour per month maintaining the ledger.
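A minimal sketch of the kind of CSV importer Sphax describes, in plain Python rather than Beancount's importer framework. The column names (`date`, `payee`, `amount`), the account names, and the keyword `RULES` table are all assumptions for illustration; smart_importer learns categories from past entries instead of using a fixed table.

```python
import csv
import io
from decimal import Decimal

# Hypothetical keyword rules, standing in for a learned categorizer.
RULES = {"SUPERMARKET": "Expenses:Groceries", "RAILWAY": "Expenses:Transport"}


def csv_to_beancount(text, account="Assets:Bank:Checking", currency="EUR"):
    """Turn CSV rows (date, payee, amount) into Beancount transaction text."""
    entries = []
    for row in csv.DictReader(io.StringIO(text)):
        amount = Decimal(row["amount"])
        payee = row["payee"]
        # First matching keyword wins; anything else is left for manual review.
        other = next(
            (acct for key, acct in RULES.items() if key in payee.upper()),
            "Expenses:Uncategorized",
        )
        entries.append(
            f'{row["date"]} * "{payee}"\n'
            f"  {account}  {amount:.2f} {currency}\n"
            f"  {other}  {-amount:.2f} {currency}\n"
        )
    return "\n".join(entries)
```

Anything that lands in `Expenses:Uncategorized` is the part that still needs a human, which matches the "there are mistakes sometimes" caveat.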
        
         | clucas wrote:
         | Yes, I have one big beancount file tracking everything back to
         | 2014. I use fava for visualization and git for history/backup.
         | I've occasionally gone back and split or moved accounts, based
         | on what metrics I want to track at that moment, it's pretty
         | easy.
         | 
         | However, contrary to the article's automated method, my
         | workflow is to manually input transactions every day (or every
         | couple days, depending on what's going on) and balance my
         | accounts. It's a bit of a ritual, but I like having a really
         | good handle on day-to-day spending. Plus, I find ~10 minutes
         | once a day way easier than (e.g.) ~3 hours once a month, even
         | if it's the same amount of time overall.
        
         | anon291 wrote:
         | I have used beancount and it is significantly better than an
         | Excel or Google Sheet.
         | 
         | For me, the main benefit is tax lot matching and an auditable
         | trail of sales should the IRS come knocking. It's impossible to
         | do this properly in a spreadsheet (I mean... it is possible,
         | but no one will complain if it's done wrong until it's too
         | late). With beancount, matching is much more straightforward. I
         | have a python script that does it automatically for each sale
         | using normal FIFO, tax loss minimization, etc. Due to
         | beancount's internal checks, I am certain that these are
         | correct once they're entered into the ledger. There is no
         | chance of failure, and if the IRS ever asks me to justify a
         | capital gain / loss figure, it's very easy to point out that I
         | keep track of all my shares and have for the past several years
         | and I maintain a solid record of the entire history of each
         | lot.
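A rough sketch of FIFO lot matching of the sort anon291 mentions, not their actual script. The lot layout (`[acquired_date, quantity, unit_cost]`) and the function name are assumptions; a real version would also emit the matched lots as Beancount postings so the ledger's own checks can verify them.

```python
from collections import deque
from decimal import Decimal


def fifo_match(lots, quantity, price):
    """Consume the oldest lots first; return (matches, realized_gain).

    lots: deque of [acquired_date, quantity, unit_cost], oldest first.
    The deque is mutated in place as lots are used up.
    """
    matches, gain = [], Decimal("0")
    remaining = quantity
    while remaining > 0:
        if not lots:
            raise ValueError("sale exceeds shares held")
        lot = lots[0]
        used = min(lot[1], remaining)
        matches.append((lot[0], used, lot[2]))
        gain += used * (price - lot[2])
        lot[1] -= used
        remaining -= used
        if lot[1] == 0:
            lots.popleft()
    return matches, gain
```

Other strategies (for example, picking the highest-cost lots first to minimize gains) only change which lot is chosen at each step; the bookkeeping stays the same.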
        
           | FredPret wrote:
           | Another very beneficial double-check is asserting account
           | balances at certain dates.
           | 
           | You give beancount the balances at the end of each month
           | straight from your bank / broker statements, and it throws an
           | error if your transactions don't match up.
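The idea behind those balance assertions can be sketched in a few lines of Python. This is an illustration of the check, not Beancount's implementation; the tuple layouts are assumptions. As with Beancount's `balance` directive, each assertion here is compared against the balance at the start of its date.

```python
from decimal import Decimal


def check_balances(transactions, assertions):
    """Return error strings for assertions the transactions don't satisfy.

    transactions: list of (iso_date, amount) for one account.
    assertions: list of (iso_date, expected_balance) from statements.
    Each assertion is checked against everything dated strictly before it.
    """
    errors = []
    for date, expected in sorted(assertions):
        actual = sum((amt for d, amt in transactions if d < date), Decimal("0"))
        if actual != expected:
            errors.append(f"{date}: expected {expected}, computed {actual}")
    return errors
```

An empty list means the books agree with the statements; any entry points at the period where a transaction is missing, duplicated, or mistyped.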
        
         | rockooooo wrote:
         | I don't use beancount but if you're already in sheets, Tiller
         | might be easier to switch to to get lots of built-in
         | dashboards/metrics
        
       | CaptainJack wrote:
       | I've used beancount extensively, spent many hours a few years
       | ago. Built importers parsing bank PDFs (in UK, plaid doesn't
       | work. Plus I'd rather also keep all the original statement PDFs).
       | 
       | Probably built 10+ importers, plus some plugins to do automated
       | transaction annotations.
       | 
       | I have not made any update for many years now, because:
       | 
       | - Downloading statements is still a pain, have to manually go
       | through all websites. Banks are bad at making the statements
       | available, and worse at making it possible to automate it.
       | 
       | - The root of the issue is actually that beancount is too slow.
       | Any change/update takes ages. Python is both a blessing (makes
       | it easy to add plugins/importers etc), and a curse (way slower
       | than some other languages).
       | 
       | I believe the creator of beancount has started working on v3 with
       | a mix of C++/python, relying on protobufs, a C++ core for
       | parsing, etc. AFAIK, that is not production-ready yet.
        
         | diftraku wrote:
         | I'd be really curious on how hard programmatic access to your
         | own, personal banking data might be in the PSD2-era.
         | 
         | I can link my secondary bank account to my main bank's app so I
         | can see the balance in one place, but the catch is that I need
         | to refresh this authorization through the app every 90 days.
         | 
         | Ideally, you'd just use your banking credentials to authorise
         | the API access and pull data through that. What this requires
         | in practice, I have no idea but it probably involves a bit of
         | bureaucracy.
        
           | jazzyjackson wrote:
           | Ran into this annoyance recently setting up new accounting
           | software, that the access my bank provides is last 6 months
           | only, so I still had to go and export a csv, rejigger the
           | column names and date format, to reimport the first 8 months
           | of 2024.
           | 
           | My thought for working around tracking new transactions
           | without a third party is to just set up email alerts so I get
           | a notification on every charge, deposit etc and set up some
           | cron job to read new emails and update my books.
        
         | mtlynch wrote:
         | v3 is out now and v2 is officially deprecated:
         | 
         | https://groups.google.com/g/beancount/c/iTdRuvZnE4E
         | 
         | I found the migration pretty confusing and haven't found good
         | documentation on how to go from v2 to v3.
         | 
         | The best I've found is this unofficial write-up from an
         | experienced Beancount user:
         | 
         | https://sgoel.dev/posts/moving-from-beancount-2x-to-3x/
        
           | CER10TY wrote:
           | As far as I can tell this is without the planned C++ rewrite
           | though, and the documentation at https://beancount.github.io/
           | still says to use v2.
           | 
           | Is there a point in migrating already?
        
             | mtlynch wrote:
             | I'm still waiting on better migration instructions.
             | 
             | The maintainer says here that v2 is officially deprecated:
             | 
             | > _You should not use v2 anymore._
             | 
             | https://groups.google.com/g/beancount/c/iTdRuvZnE4E/m/o9V91
             | W...
        
         | FredPret wrote:
         | I suspect Python isn't the limiting factor here - it's the file
         | format. You can end up with huge interconnected text files that
         | have to be fully parsed on every change.
         | 
         | If you have 1e5 - 1e6 lines of transactions, I think a
         | SQLite database would be a huge step forward. If you have much
         | more than that, you probably need an ERP system.
         | 
         | Of course the text files make it ~easy to enter transactions,
         | but maybe there's an elegant way to use those for ingestion
         | only; that does make the system much more complicated to use.
         | That might not be a problem for the kind of person using plain-
         | text accounting over the course of years though.
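A small sketch of what a SQLite-backed ledger could look like, using Python's standard `sqlite3` module. The table layout, the single hard-coded currency, and the function names are assumptions for illustration, not a proposal from the thread or an existing tool.

```python
import sqlite3


def open_ledger(path=":memory:"):
    """Create the (hypothetical) postings table and an index for balance queries."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS postings ("
        "txn_id INTEGER, date TEXT, account TEXT, cents INTEGER, currency TEXT)"
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_account_date ON postings (account, date)"
    )
    return conn


def add_transaction(conn, txn_id, date, postings):
    """Insert one transaction; refuse it unless its postings sum to zero."""
    if sum(cents for _, cents in postings) != 0:
        raise ValueError("transaction does not balance")
    conn.executemany(
        "INSERT INTO postings VALUES (?, ?, ?, ?, 'USD')",
        [(txn_id, date, account, cents) for account, cents in postings],
    )
    conn.commit()


def balance(conn, account, as_of):
    """Sum an account up to and including a date, in cents."""
    row = conn.execute(
        "SELECT COALESCE(SUM(cents), 0) FROM postings WHERE account = ? AND date <= ?",
        (account, as_of),
    ).fetchone()
    return row[0]
```

With the index in place, a balance query touches only one account's rows instead of re-parsing every file, which is the speedup being argued for. The cost is losing the plain-text diffability that git-based workflows rely on.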
        
         | chrislloyd wrote:
         | I have a very similar setup but with HLedger[1]. A "do-
         | nothing"[2] script helps me download statements: it opens bank
         | websites, waits for manual import and finally checks balances.
         | That makes it a lot less repetitive and error prone. Or at
         | least, I catch the errors faster.
         | 
         | I've found HLedger and Shake to be fast enough to process
         | almost a decade of finances. Dmitry Astapov has an extremely
         | well produced tutorial workflow[3].
         | 
         | How have you managed the PDF parsing? Mine has become a bit of
         | a mess dealing with slight variations in formatting as they
         | change over time. I've been considering using LLMs but have
         | been nervous about quality.
         | 
         | [1]: https://hledger.org [2]:
         | https://blog.danslimmon.com/2019/07/15/do-nothing-scripting-...
         | [3]: https://github.com/adept/full-fledged-hledger
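For readers unfamiliar with the pattern, a do-nothing script automates nothing at first: it just walks you through each manual step and waits. A minimal sketch, assuming made-up bank names and URLs; `prompt` and `open_url` are injectable only so the function can be exercised without a browser or keyboard.

```python
import webbrowser

# Hypothetical banks and URLs, for illustration only.
STEPS = [
    ("Blue Bank", "https://example.com/blue-bank/export"),
    ("Orange Bank", "https://example.com/orange-bank/export"),
]


def run(steps, prompt=input, open_url=webbrowser.open):
    """Walk through each manual step, pausing until the user confirms it."""
    done = []
    for name, url in steps:
        open_url(url)  # open the export page in the default browser
        prompt(f"Download the latest statement from {name}, then press Enter: ")
        done.append(name)
    return done
```

Calling `run(STEPS)` interactively opens each page in turn. Individual steps can later be swapped for real automation (or a balance check at the end) without changing the overall flow.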
        
           | Karrot_Kream wrote:
           | Why not spot check your PDF LLM outputs? I always make sure
           | my accounts balance by hand anyway. Though occasionally it's
           | really painful especially if it's a missing Venmo
           | transaction. It's rare that I need to really comb through my
           | accounts to account for some money but when I do it's really
           | time-consuming.
        
         | jxjnskkzxxhx wrote:
         | Banks in the UK allow you to export transactions in many
         | formats. Log in, pick time range, download in ofx format. Why
         | is this a pain?
        
           | BeetleB wrote:
           | Multiple bank accounts and multiple credit cards. Also,
           | figuring out the time range for each bank.
        
             | mgr86 wrote:
             | I run into the same issues here with banks in the US. It is
             | a real pain in the ass, and makes tracking this sort of
             | information way more time consuming than it needs to be.
             | 
             | My other issue is with stores like Costco that sell
             | household goods, groceries, clothes, and even misc kids
             | stuff. I like to track each separately. Which means I then
             | need to fetch and analyze the receipts.
        
               | BeetleB wrote:
               | > I like to track each separately. Which means I then
               | need to fetch and analyze the receipts.
               | 
               | That is a reality. To make my life easier, when I check
               | out at a store, I put all my grocery items first on the
               | belt. Then everything else. Usually "everything else" is
               | only a few items. So I categorize those additional items,
               | and then specify "Groceries" for the rest.
               | 
               | Often I buy _only_ groceries, and I throw those receipts
               | away. When I'm in a ledger/beancount session, if I don't
               | have a receipt, that means it was just Groceries.
               | 
               | This method alone really reduced my time dealing with
               | receipts.
        
               | cranky908canuck wrote:
               | For planning purposes, could you look at a year's
               | postings, then come up with "good enough" breakout
               | allocations going forward?
        
           | erikerikson wrote:
           | It makes it about them not about you. I don't care which
           | banks and other financial providers I use. I care about
           | managing my funds in a way that is efficient and healthy for
           | my life. The banks I use are simply service providers, a
           | subclass of service providers across all the dimensions of my
           | life. They have regulations they must abide by but in so
           | doing they attempt to force me to think and act in those
           | terms and I think they're poor.
        
             | jxjnskkzxxhx wrote:
             | The fact that I can download my data describing my
             | transactions in a format convenient to me, makes it about
             | them? Curious take.
        
           | cranky908canuck wrote:
           | "banks allowing export of transactions" is only the start.
           | 
           | I deal with two banks for credit cards.
           | 
           | One (call it "Blue Bank") allows me to download a statement.
           | I filter out a couple of things (payments mostly), check that
           | it matches the paper statement balance, and post it. About 15
           | minutes start to finish.
           | 
           | The other (call it "Orange Bank") allows me to download a
           | "statement". I filter out a couple of things (payments
           | mostly), check my previous month's transactions to see which
           | ones at the beginning of the file actually go in the current
           | billing period (not already paid), stare at the last
           | transactions to see which ones actually were posted to the
           | current billing period (not after the cutoff), run the script
           | to check the total (nope, doesn't match) then do that a
           | couple of times until it matches. The time they changed the
           | meaning of the "credit" column from "just confirming this is
           | a credit" to "it's a credit, you need to flip the sign" it
           | was 45 minutes.
           | 
           | But hey, it's all CSV!
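The "Orange Bank" cleanup can be sketched as a single function: keep only rows posted inside the billing period, and decide explicitly whether the credit flag means "flip the sign". The row layout (`posted`, `amount`, `credit`) and the `flip_credits` switch are assumptions for illustration, not the bank's actual export format.

```python
from decimal import Decimal


def statement_total(rows, start, end, flip_credits):
    """Sum charges posted within [start, end], as ISO date strings.

    rows: list of dicts with 'posted', 'amount' (string) and 'credit' (bool).
    flip_credits: True if credits are exported unsigned and must be negated,
    False if amounts are already signed and the flag is only informational.
    """
    total = Decimal("0")
    for row in rows:
        if not (start <= row["posted"] <= end):
            continue  # belongs to the previous or next billing period
        amount = Decimal(row["amount"])
        if flip_credits and row["credit"]:
            amount = -amount
        total += amount
    return total
```

Comparing the result against the printed statement balance under both settings of `flip_credits` is a quick way to notice when the export's semantics have silently changed.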
        
             | jxjnskkzxxhx wrote:
             | I guess you must have a more complex life than me. I never
             | filter out anything, and everything always matches.
        
               | cranky908canuck wrote:
               | Maybe. What I was trying to get at was, some banks (the
               | 'orange one') don't provide sane semantics, so even if
               | the input format is compatible, reconciliation can be a
               | nightmare. You may not be dealing with your 'orange
               | bank'; if I only dealt with the blue one I would not be
               | aware of the problems of the other (and it would not have
               | occurred to me that the orange one could botch it up).
        
         | BeetleB wrote:
         | > Downloading statements is still a pain, have to manually go
         | through all websites.
         | 
         | Have you considered using Playwright?
         | 
         | I used aider[0] recently to log into my work's payslips and
         | download all the relevant payslips into JSON format (with
         | values encrypted). It took about 3 hours, but that's mostly
         | because of my lack of knowledge of good CSS selectors.
        
       | jazzyjackson wrote:
       | As for automatically pulling transactions, it's still shocking to
       | me that anyone handed over their banking username and password to
       | a third party. Quicken et al. at least use a system now where the
       | bank grants read-only permission as an app integration, but still
       | I'm not a fan of every bank transaction becoming someone else's
       | anonymized data harvest.
       | 
       | I'm evaluating whether self hosting accounting software is at all
       | feasible to meet CMMC requirements and so far have my sights set
       | on ERPNext. I configured my bank to send me an email alert for
       | every transaction and intend to parse that and append it to my
       | ledger, but the API to do so is fairly annoying, so hearing that
       | beancount is meant to be automatable is intriguing indeed.
       | 
       | BTW the other thing that shocks me about Quicken Classic is lack
       | of version control - the database can become corrupt and your
       | only recourse is restoring from backup or sending the file to
       | support and having them manually fix it!
        
         | evrimoztamur wrote:
         | The space was absolutely insane prior to PSD2 in Europe.
         | Luckily most banks now have API endpoints, but there's a new
         | class of middlemen wrangling the APIs: the aggregators who pipe
         | all your data from the banks to apps. It's better than handing
         | out the keys to the old class of scraping aggregators, I
         | suppose, but the outcome is still not _that much_ better in my
         | opinion.
        
         | FredPret wrote:
         | I automated the flow of transaction -> text message -> MacOS ->
         | Beancount once, but it never added up.
         | 
         | - some transactions go through without a text
         | 
         | - some transactions generate two identical texts
         | 
         | - for Forex transactions, the amount in the text and the amount
         | on your statement will not match
         | 
         | - some texts are ambiguous in that they could have been
         | generated by two different kinds of transaction; especially
         | true of deposits & credit card payments
         | 
         | In the end I gave up, accepted that accounting will never be
         | smooth & simple, and now just generate a CSV every month by
         | hand.
        
         | Wronnay wrote:
         | Do you plan to open source or sell your Email to ledger /
         | ERPNext solution? I actually plan the exact same thing for my
         | use cases...
        
         | Helmut10001 wrote:
         | If you are in Europe, there is a very nice OSS software called
         | Hibiscus [1] with API connectivity for most banks, and (local)
         | web scrapers for those banks that have no APIs.
         | 
         | I am working on a plugin to pull categories and transactions
         | from the Hibiscus DB (H2 or PG) to Beancount [2]. It is not
         | there yet, but the process overall looks promising.
         | 
         | [1]: https://www.willuhn.de/
         | 
         | [2]: https://github.com/Sieboldianus/beancount-hibiscus-
         | importer
        
           | amaccuish wrote:
           | Another Hibiscus fan.
           | 
           | I'm pretty sure FinTS/HBCI is mostly only used here in
           | Germany sadly. Which is a shame because all this
           | "OpenBanking" stuff requires registering/paying/certificates
           | etc.
        
         | human_llm wrote:
         | ERPNext is fairly easy to automate. We have a number of Python
         | scripts that use the ERPNext API to create sales invoices, add
         | and reconcile transactions from PayPal, Stripe, etc.
         | 
         | We originally used Quickbooks Online and I'm glad we decided to
         | switch to ERPNext a few years ago.
        
         | fangpenlin wrote:
         | Hi, the author here.
         | 
         | If you are okay with Plaid[1], many of their bank connections
         | are now using OAuth-style authentication instead of password
         | sharing. I actually added a new feature called Direct
         | Connect[2] a while back to allow any plaintext accounting book
         | users to pull CSV directly via Plaid API through BeanHub. We
         | don't train AI models with our customers' transactions, and if
         | we want to, we will ask for explicit consent (not just ToS) and
         | anonymize customers' data.
         | 
         | If you're okay with the above, the key to achieving a high
         | automation level is the ability to pull CSV transaction files
         | directly from the bank in a standard format. Maybe you can give
         | it a try. We have a 30-day free trial period.
         | 
         | I am not so familiar with the CMMC requirements you mentioned,
         | but for us to access transactions from some banks, such as
         | Chase, Plaid requires us to pass an audit of our security
         | measures. Is CMMC compliance something your company needs to
         | meet before it can take a third-party software vendor into
         | consideration?
         | 
         | [1]: https://plaid.com
         | 
         | [2]: https://beanhub.io/blog/2025/01/16/direct-connect-
         | repository...
        
         | ashish01 wrote:
         | With a one-time setup, a low-fi solution is to receive an email
         | from your bank for every transaction and extract the data into
         | a ledger entry.
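A low-fi sketch of that extraction step, assuming the email body has already been fetched. The alert wording matched by `ALERT` is invented for illustration; every bank phrases these differently, so the pattern and the account names are assumptions. Entries are flagged with `!` (Beancount's "needs review" flag) rather than `*`.

```python
import re
from datetime import datetime
from decimal import Decimal

# Hypothetical alert wording: "A charge of $12.34 at COFFEE SHOP on 03/05/2025 ..."
ALERT = re.compile(
    r"charge of \$(?P<amount>[\d,]+\.\d{2}) at (?P<payee>.+?) on (?P<date>\d{2}/\d{2}/\d{4})"
)


def alert_to_entry(body, card="Liabilities:CreditCard", expense="Expenses:Uncategorized"):
    """Return Beancount text for one alert email, or None if it doesn't match."""
    m = ALERT.search(body)
    if m is None:
        return None
    amount = Decimal(m["amount"].replace(",", ""))
    date = datetime.strptime(m["date"], "%m/%d/%Y").date().isoformat()
    return (
        f'{date} ! "{m["payee"]}"\n'
        f"  {card}  -{amount} USD\n"
        f"  {expense}  {amount} USD\n"
    )
```

Because alerts can be missing, duplicated, or ambiguous, entries produced this way still need to be reconciled against the monthly statement.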
        
         | walterbell wrote:
         | _> not a fan of every bank transaction becoming someone else's
         | anonymized data harvest_
         | 
         | In 2024, CFPB had mandated all U.S. banks to open up data
         | history to SaaS/fintech vendors, if the customer gives
         | permission.
        
       | skwee357 wrote:
       | That's interesting. I, however, still spend between 20 and 30
       | minutes once a week to keep my beancount updated by manually
       | entering the transactions, a habit I have kept up for the past
       | 14 or so years.
       | 
       | At the same time, I did automate 90% of my business beancount
       | import by writing a custom stripe importer that imports
       | transactions from stripe. As for the expenses, I still enter them
       | manually in the aforementioned 20-30 minute session.
        
         | BeetleB wrote:
         | > I, however, still spend between 20 and 30 minutes once a week
         | to keep my beancount updated by manually entering the
         | transactions
         | 
         | That sounds like a lot - I spend less.
         | 
         | Consider entering the transactions into a software like
         | KMyMoney and write a script to export to beancount format.
         | Entering/importing is a lot easier in a SW like KMyMoney (e.g.
         | it does decent matching of the new transaction to prior
         | transactions of similar amounts).
        
           | jldugger wrote:
           | Honestly, there's a sort of Jevons paradox at work. Importing
           | credit card transactions isn't that hard, but now that it's
           | solved, I should really be monitoring my investment portfolio
           | more closely, or tracking intangible assets like unused
           | vacation hours, or recording all those taxes and expenses
           | listed on my paycheck.
        
       | zefhous wrote:
       | Love this! I have been using hledger for a while now and have a
       | pretty automated process for importing exported CSVs. I would
       | love a little more automation in terms of pulling down the data,
       | but on the bright side the manual process provides a good touch
       | point to keep up on accounting regularly in small doses. This is
       | great for just keeping an eye on things on a monthly basis.
       | 
       | I am starting a new business now and intend to see how far I can
       | take plain text accounting in that context. I plan to use mercury
       | for banking and want to automate as much as possible. I would
       | also like to associate invoices that are stored in a self-hosted
       | paperless-ngx instance.
        
         | abhiyerra wrote:
         | I wrote a script to download and categorize my Mercury
         | Transactions and it is quite straightforward. Highly recommend
         | it.
         | 
         | I am looking to deprecate my Quickbooks usage after this year
         | since it is such a pain to split payments into multiple chunks
         | automatically and I don't really know what I am getting for
         | $60/month.
        
       | joshstrange wrote:
       | I use YNAB and love it but I'm always interested in alternatives.
       | That said I opened the main website [0] and the animation [1]
       | shows a bunch of crypto logos. To my eye this seems to be a
       | product aiming itself at people using crypto. I don't think
       | that's the case given the blog post I just read but it's not a
       | good look. Anyone catering to the crypto market is at least a
       | little suspect in my book. Maybe it's just a feature of BeanHub
       | and you don't have to touch it at all but to be featured so
       | prominently is icky.
       | 
       | [0] https://beanhub.io/
       | 
       | [1] https://cs.joshstrange.com/5njjtGND
        
         | ndegruchy wrote:
         | Yeah, the site and service seem to be trying to attract a range
         | of people. However, `beancount` and the other `ledger-cli` like
         | tools are all just plain-text files with a semi-rigid markup
         | for recording transactions.
         | 
         | https://plaintextaccounting.org/ is the resource for most of
         | them, and has good resources for making them work for you. It's
         | not for everyone, though, many people just prefer spreadsheets
         | or apps, and that's fine.
        
         | shortrounddev2 wrote:
         | YNAB seemed WAY too involved for me. I spent so much time in
         | the fucking app and barely understood it. I had to "give each
         | dollar a job" which is a total inversion of the traditional
         | "set a dollar amount you want to spend on each category and
         | then exercise some self control". I pay for everything on a
         | credit card and then pay back that CC at the end of the month,
         | and doing so complicated the UI when it automatically pulled in
         | my spending from my bank account. I've found GnuCash to be a
         | bit more intuitive with a little bit of training.
         | 
         | I feel that there are some people who legitimately enjoy
         | looking at money on spreadsheets and implementing
         | budgets/categorizing purchases and I think that YNAB is great
         | for those people, but I personally hate even THINKING about
         | money, let alone interfacing with budgeting software every
         | couple days
        
           | jimbokun wrote:
           | > "set a dollar amount you want to spend on each category and
           | then exercise some self control"
           | 
           | How is that different from "give each dollar a job"? The only
           | difference I see is that it forces you to make the categories
           | add up to the amount of your paycheck.
        
             | shortrounddev2 wrote:
             | Yes, that's the difference. When I asked on help forums
             | questions like "Can I just set the budget with the
             | assumption that how much money I make this month will be
             | the same as next month", they said no, because I don't know
             | what will happen to my paycheck this month. I need to divvy
             | up real dollars, and not expected dollars, meaning that the
             | budget is a constantly moving target and I need to readjust
             | and give a job to my _actual_ dollars _every two weeks_
             | instead of just my expected paycheck.
        
               | jimbokun wrote:
               | Understood.
               | 
               | I copy over all the previous month's budget amounts and
               | tweak it to match the current paycheck, if it's
               | different. And frankly don't care too much if it's a few
               | dollars off.
               | 
               | I do this in a spreadsheet. I had written an app for
               | budgeting but it was too much hassle keeping it up to
               | date. New versions of Mac OS would break it in subtle
               | ways and wasn't worth the effort of all the bug fixes.
        
           | joshstrange wrote:
           | People should use what works best for them but I'd like to
           | respond a bit to the YNAB issues you ran into (not to
           | convince you to switch or anything).
           | 
           | YNAB with Credit Cards was difficult for me, as was envelope-
           | based budgeting (what "give every dollar a job" is called)
           | because I also was used to the typical "set limits on
           | categories and stay within them"-style budgeting (Like Mint,
           | at least Mint way back when it first came out, I haven't
           | touched it in years). YNAB is very different in that it
           | doesn't let you allocate dollars that are not in your bank
           | account. You can't say "$300 for eating out" unless you have
           | $300 in your bank account and YNAB doesn't care that you
           | might have that money available by the time you want to spend
           | it, it forces you to allocate the money you actually have and
           | every time you get paid you allocate it into categories with
           | the long-term goal of getting a month (or months) ahead in
           | your budgeting (not spending the money you made this month on
           | stuff you need in this month).
           | 
           | Credit cards were also hard to wrap my head around. In a
           | debit-only world it all made sense but CC's complicated
           | things for me. I really enjoyed Nick True's videos on
           | YouTube, they helped me with this a lot but a simplified way
           | to think about this is:
           | 
           | * You put $200 in your "groceries" category (aka envelope)
           | 
           | * You go to the store and spend $60 (on eggs I assume?) and
           | pay with your CC
           | 
           | * In YNAB-land you will record that transaction (or it will
           | be auto-imported) and you will assign it to the groceries
           | category but since YNAB knows you spent this on a CC (you
           | always record which account the transaction happened on) it
           | essentially takes $60 out of the "Groceries" envelope and
           | moves it to the "Chase Sapphire" (or whatever you name it)
           | envelope. You set aside the money for your CC purchases at
           | time of purchase and then when the CC bill comes due it's
           | paid out of that "envelope".
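The three steps above can be modeled as moving money between envelopes. This is only a toy model of the behavior described in the comment, not YNAB's implementation; the function names are made up, and the amounts are the ones from the example.

```python
def spend_on_card(envelopes, category, card, amount):
    """Record a card purchase: move the amount from the category to the card's envelope."""
    envelopes[category] -= amount
    envelopes[card] = envelopes.get(card, 0) + amount


def pay_card(envelopes, card, amount):
    """Paying the bill draws down the card's envelope."""
    envelopes[card] -= amount


envelopes = {"Groceries": 200}
spend_on_card(envelopes, "Groceries", "Chase Sapphire", 60)
# Groceries now holds 140 and the card envelope holds the 60 set aside for the bill.
```

The total across envelopes is unchanged by the purchase, which is the point: the money for the bill is reserved at the time of purchase, not when the statement arrives.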
           | 
           | In this way YNAB has become a layer on top of all my finances
           | and I care little about how much money is in any given
           | savings/checking account since YNAB tracks everything. I just
           | make sure there is enough to cover CC payments (there always
           | is) or any big transfers I want to make (like moving money to
           | a HYSA).
           | 
           | I've automated as much as I can with YNAB but I do spend
           | 30min-1hr every few weeks (this is not what they recommend,
           | but it works for me) reconciling my accounts. I totally get
           | if people don't want to do that or don't see the value in it.
           | Personally I love knowing where every dollar of mine is and
           | tracking every purchase/transfer.
        
         | fangpenlin wrote:
         | Hi, the author here.
         | 
         | So BeanHub is built on top of Beancount and uses double-entry
         | accounting. Handling arbitrary commodities is one of the
         | benefits of double-entry accounting. Many accounting packages
         | are not good at dealing with multiple currencies or custom
         | currencies. With Beancount, you can define any commodity you
         | want, create transactions, and convert between currencies
         | easily. For example, you can define a commodity TSM and create
         | transactions[1] like this:
         | 
         | 2025-01-01 commodity TSM
         | 
         | 2025-03-05 * "Purchase TSMC"
         |   Assets:US:Bank:WellsFargo:Checking  -2,000 USD
         |   Assets:US:Bank:Robinhood                20 TSM @ 100 USD
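         | 
         | A purchase like this can be sanity-checked by hand. As I
         | understand Beancount's price syntax, a posting "N X @ P Y"
         | counts as N * P of Y when balancing. The sketch below
         | assumes that rule; the Posting type and weights() helper
         | are hypothetical and not part of Beancount.

```python
from decimal import Decimal
from typing import Dict, List, NamedTuple, Optional


class Posting(NamedTuple):
    account: str
    units: Decimal
    currency: str
    price: Optional[Decimal] = None       # per-unit price, if annotated with @
    price_currency: Optional[str] = None


def weights(postings: List[Posting]) -> Dict[str, Decimal]:
    """Sum each posting's balancing weight per currency."""
    totals: Dict[str, Decimal] = {}
    for p in postings:
        if p.price is not None:
            amount, cur = p.units * p.price, p.price_currency
        else:
            amount, cur = p.units, p.currency
        totals[cur] = totals.get(cur, Decimal(0)) + amount
    return totals


# 20 TSM bought at 100 USD each, paid with 2,000 USD from checking.
purchase = [
    Posting("Assets:US:Bank:WellsFargo:Checking", Decimal("-2000"), "USD"),
    Posting("Assets:US:Bank:Robinhood", Decimal("20"), "TSM", Decimal("100"), "USD"),
]
print(weights(purchase))  # a balanced transaction sums to zero in every currency
```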
         | 
         | I think many people trade crypto, and traditional accounting
         | software may not be that friendly to them. That's why I put
         | some emphasis on the crypto audience. But you're right; I
         | should make it clearer that it's not just for crypto.
         | 
         | [1]:
         | https://beancount.github.io/docs/beancount_language_syntax.h...
        
       | egglemonsoup wrote:
       | Great article! A minor correction, Steph Ango is Obsidian's CEO,
       | not founder
        
         | fangpenlin wrote:
         | Hey! Thanks for pointing that out. I have already corrected it
         | in my article :)
        
       | idopmstuff wrote:
       | I own a couple of small businesses, and I've tried a few things
       | with my books - was on Bench for a year (thankfully not the year
       | they shut down, and they were so incompetent I dropped them
       | before that), tried to do them myself for a year, then upon
       | realizing the P&L generated did not match the numbers I expected,
       | hired bookkeepers on Upwork to fix them for me.
       | 
       | I really feel like I ought to be able to do them myself - it's
       | just following some rules, and my accounts aren't that complex.
       | Still, it was just enough of a pain that it was easier to hire
       | someone overseas for cheap (especially since I know what the
       | business' numbers should roughly come out to, so I can validate
       | their work).
       | 
       | But as I've been using the latest AI models, I really feel like
       | this is something that's going to be fully automated by AI in the
       | next 1-2 years (at least for my very simple use case). It's
       | simple enough that an AI agent should pretty trivially be able to
       | fetch the docs from the various places that I sell, upload them,
       | categorize transactions (this is already basically automated by
       | rules for me anyway) and then do whatever it is that bookkeepers
       | do to close the books.
       | 
       | I can't help but think that bookkeeper is not going to be a
       | profession in five years, and I'm just not sure where those
       | people go. It's not like automating bookkeeping will expand the
       | economic pie enough to create new jobs - I don't believe it's a
       | bottleneck to anything at this point.
        
         | fangpenlin wrote:
         | Hi, the author here.
         | 
         | Many customers have asked me about AI offerings, and I am
         | considering them. While this is doable with modern LLM
         | technologies, I need to consider many issues.
         | 
         | The first is that nobody, myself included, likes their data
         | being part of someone else's machine-learning training
         | pipeline. That's why I promised my users that I wouldn't use
         | their data for machine learning training without asking for
         | explicit consent (and, of course, anonymization will be
         | needed).
         | 
         | While I know everything involving AI sounds cool, do we
         | really need an LLM for a task like this? Maybe a rule-based
         | import engine could handle 95% of the repeating transactions?
         | That's why I built beanhub-import[1] in the first place. Then
         | comes another question: should the LLM generate the rules for
         | you, or generate the final transactions directly?
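         | 
         | To illustrate what a rule-based pass can look like, here is a
         | minimal sketch. The rule list and categorize() are invented
         | for this example and are not beanhub-import's actual rule
         | format or API.

```python
import re
from typing import Optional

# Hypothetical rules: (regex on the bank's description, account to post to).
RULES = [
    (re.compile(r"costco|safeway", re.I), "Expenses:Groceries"),
    (re.compile(r"netflix", re.I), "Expenses:Subscriptions"),
]


def categorize(description: str) -> Optional[str]:
    """Return the account of the first matching rule, or None for manual review."""
    for pattern, account in RULES:
        if pattern.search(description):
            return account
    return None


for desc in ["COSTCO WHSE #123", "Netflix.com", "UNKNOWN SHOP"]:
    print(desc, "->", categorize(desc))
```

         | Anything that falls through to None is the small remainder
         | that still needs a human (or some smarter fallback).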
         | 
         | Yet another question: every person's and every company's books
         | are different. Even if you can train a big model to deal with
         | the most common approaches, the outcome may not be what you
         | really need. So, I am thinking about possibly using your own
         | Git history as a source of training data to teach machine
         | learning models to generate transactions like you would. That
         | would be yet another interesting blog post, I guess, if I
         | actually build a prototype or really make it a feature for
         | BeanHub. But for now, it's still an idea.
         | 
         | [1]: https://beanhub-import-docs.beanhub.io/
        
       | asadjb wrote:
       | Love this! I haven't used BeanHub, but was a user of text based
       | accounting systems (beancount, hledger) for many years to track
       | personal expenses. I stopped doing it when I realized I wasn't
       | getting much out of it, but the knowledge of double-entry
       | accounting still helps me to this day when I keep track of my
       | business expenses in Xero.
       | 
       | One thing which I disagree with in this article is the focus on
       | file based data storage:
       | 
       | > That makes it 10 times harder because you need to parse the
       | text file, update it accordingly, and write it back. But I am
       | glad I did. That guarantees all my accounting books are in the
       | same open format.
       | 
       | This quote captures my issues with it. It just makes things so
       | much more difficult; and it makes the whole process slower as the
       | file grows. I remember that when I used hledger for tracking my
       | expenses over 3 years, I had to "close books" once a year and
       | consolidate all the transactions for the past year into 1 entry
       | in a new ledger file to keep entry/query operations fast.
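       | 
       | The yearly consolidation amounts to something like the sketch
       | below. It is simplified: close_books() is a made-up helper, and
       | a real close would usually also fold income and expense
       | accounts into equity.

```python
from collections import defaultdict
from decimal import Decimal
from typing import Dict, Iterable, Tuple


def close_books(postings: Iterable[Tuple[str, Decimal]]) -> Dict[str, Decimal]:
    """Collapse (account, amount) postings into one opening balance per account."""
    balances: Dict[str, Decimal] = defaultdict(Decimal)
    for account, amount in postings:
        balances[account] += amount
    # Accounts that net to zero need no line in the opening entry.
    return {acct: bal for acct, bal in balances.items() if bal != 0}


last_year = [
    ("Assets:Checking", Decimal("1000")),
    ("Income:Salary", Decimal("-1000")),
    ("Assets:Checking", Decimal("-60")),
    ("Expenses:Groceries", Decimal("60")),
]
print(close_books(last_year))
```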
       | 
       | I get the sentiment; you want open data formats that remain even
       | after your app is shut down. But you can get the same with other
       | open formats; maybe a SQLite DB is good enough for that?
       | 
       | The only thing that would be more complicated with a DB is
       | versioning & reviewing commits like this app does; which does
       | seem like a very exciting feature.
        
         | packetlost wrote:
         | I'm not at all familiar with text-based accounting tools, but I
         | feel like the performance issues could be addressed by using
         | multiple files instead of one big one.
        
           | jimbokun wrote:
           | Then you are slowly building a database engine.
           | 
           | When do you split the files? How do you track which data
           | resides in which files? Does one file represent one kind of
           | data (table)? Does it reflect data within a given time range?
           | Do you sometimes need to retrieve data that crosses file
           | boundaries?
           | 
           | You quickly lose the simplicity inherent in saving to just a
           | single file.
           | 
           | Which is where Sqlite shines. It's a single file. But with a
           | full, user defined schema. And can update it and query it
           | incrementally, without having to read and write the entire
           | thing frequently. It handles all of that complexity for you.
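           | 
           | A minimal sketch of that idea with Python's built-in sqlite3
           | module. The table layout and the balance() helper are
           | assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # a real setup would pass a file path instead
conn.execute("CREATE TABLE postings (date TEXT, account TEXT, amount_cents INTEGER)")
conn.executemany(
    "INSERT INTO postings VALUES (?, ?, ?)",
    [
        ("2025-03-01", "Expenses:Groceries", 6000),
        ("2025-03-01", "Liabilities:CreditCard", -6000),
        ("2025-03-04", "Expenses:Groceries", 2500),
        ("2025-03-04", "Liabilities:CreditCard", -2500),
    ],
)
conn.commit()


def balance(account: str) -> int:
    """Sum one account's postings without loading the rest of the book."""
    row = conn.execute(
        "SELECT COALESCE(SUM(amount_cents), 0) FROM postings WHERE account = ?",
        (account,),
    ).fetchone()
    return row[0]


print(balance("Expenses:Groceries"))
```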
        
             | fangpenlin wrote:
             | Some people, myself included, prefer text-based files as a
             | user interface. Like, some Vim users will never leave their
             | Vim session and would like to do everything in it.
             | While SQLite is immortal software and will probably be
             | there forever, using it means changing the UI/UX from text
             | files to SQL queries or other CLI/UI operations. I think
             | it's a preference for UI/UX style instead of a technical
             | decision. For that preference of UI/UX, we can push on the
             | technical end to solve some challenges.
        
             | packetlost wrote:
             | > Then you are slowly building a database engine.
             | 
             | > When do you split the files? How do you track which data
             | resides in which files? Does one file represent one kind of
             | data (table)? Does it reflect data within a given time
             | range? Do you sometimes need to retrieve data that crosses
             | file boundaries?
             | 
             | Not really. Splitting anywhere from per-day to per-year is
             | probably fine! Or split arbitrarily and merge the files at
             | runtime. Make it configurable! Add tools to split or merge
             | files, it's really _not_ that hard, a far cry from a
             | database engine.
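             | 
             | For instance, merging at runtime can be as small as the
             | sketch below. It assumes one file per year named so that
             | sorting by name is chronological; merge_ledgers() is a
             | made-up helper. (Beancount also has an include directive
             | that covers a similar need.)

```python
from pathlib import Path


def merge_ledgers(directory: Path, pattern: str = "*.bean") -> str:
    """Concatenate split ledger files in sorted (here: chronological) name order."""
    parts = [p.read_text(encoding="utf-8") for p in sorted(directory.glob(pattern))]
    return "\n".join(parts)
```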
             | 
             | > You quickly lose the simplicity inherent in saving to
             | just a single file.
             | 
             | No, you really don't.
             | 
             | > Which is where Sqlite shines. It's a single file. But
             | with a full, user defined schema. And can update it and
             | query it incrementally, without having to read and write
             | the entire thing frequently. It handles all of that
             | complexity for you.
             | 
             | That you need a particular tool or library to interact
             | with.
             | 
             | I'm not going to try and sell you on the benefits of using
             | plaintext tools because you've already clearly made up your
             | mind, but there are reasons even if you can't see them.
             | SQLite has like 160k lines of code complexity that isn't
             | necessary and is inherently less composable.
        
         | fangpenlin wrote:
         | Hi, the author here.
         | 
         | I get where you're coming from. My books are also growing big
         | right now, and indeed, they have become slower to process. Some
         | projects in the community, such as Beanpost [1], are actually
         | trying to solve the problem, as you said, by using an RDBMS
         | instead of plaintext.
         | 
         | But I still like text file format more for many reasons. The
         | first would be the hot topic, which is about LLM friendliness.
         | While I am still thinking about using AI to make the process
         | even easier, with text-based accounting books, it's much easier
         | to let AI process them and generate data for you.
         | 
         | Another reason is accessibility. Text-based accounting only
         | requires a text editor plus the command line. Surely, you can
         | build a friendly UI for SQLite-based books, but then you can do
         | the same for text-based accounting books.
         | 
         | Yet another reason is, as you said, Git or VCS (Version control
         | system) friendliness. With text-based, you can easily track all
         | the changes from commit to commit for free and see what's
         | changed. So, if I make a mistake in the book and I want to know
         | when I made the mistake and how many years I need to go back
         | and revise my reports, I can easily do that with Git.
         | 
         | Performance is a solvable technical challenge. We can break
         | down the text-based books into smaller files and have a smart
         | cache system to avoid parsing the same file repeatedly.
         | Currently, I don't have the bandwidth to dig into this rabbit
         | hole, but I already have many ideas about how to improve
         | performance when the file grows really big.
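         | 
         | One possible shape for such a cache, as a sketch only: reuse
         | the parsed result until the file's modification time or size
         | changes. parse_cached() and the in-memory dict are
         | hypothetical; nothing here is BeanHub or Beancount code.

```python
import os
from typing import Callable, Dict, Tuple

# path -> (mtime in ns, size in bytes, parsed result)
_cache: Dict[str, Tuple[int, int, object]] = {}


def parse_cached(path: str, parse: Callable[[str], object]) -> object:
    """Re-parse a file only when its mtime or size changed since the last call."""
    st = os.stat(path)
    key = (st.st_mtime_ns, st.st_size)
    hit = _cache.get(path)
    if hit is not None and hit[:2] == key:
        return hit[2]
    with open(path, encoding="utf-8") as f:
        result = parse(f.read())
    _cache[path] = (key[0], key[1], result)
    return result
```

         | With the books split into per-year files, only the file that
         | was just edited would be parsed again.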
         | 
         | [1]: https://github.com/gerdemb/beanpost
        
           | asadjb wrote:
           | Thanks for responding and your thoughts! Generally agreed
           | with all you said.
           | 
           | However, I feel like maybe a different approach could be to
           | store all the app state in the DB, and then export to this
           | text only format when needed; like when interacting with LLMs
           | or when someone wants an export of their data.
           | 
           | Breaking the file into smaller blocks would necessarily need
           | a cache system I guess, and then maybe you're implementing
           | your own DB engine in the cache because you still want all
           | the same functions of being able to query older records.
           | 
           | There's no easy answer I guess, just different solutions with
           | different tradeoffs.
           | 
           | But what you've built is very cool! If I was still doing text
           | based accounting I would have loved this.
        
       | bks wrote:
       | Is there any solution for statement downloading? Like many small
       | businesses, I have credit card and bank account statements that
       | I need to provide to my accountant.
       | 
       | While my "books" are synced via Quickbooks, they (accountants)
       | really seem to love having the PDF in hand. I just need the PDFs
       | and they do not send them via email...
        
         | hamiltont wrote:
         | Hi - I'm working on something like this because I needed it too
         | ;-)
         | 
         | It's not yet ready for release, but I should be ready for beta-
         | test within 2 months. If you're interested I would be happy to
         | add you to my list of "people to notify when I am ready to beta
         | test"
        
           | bks wrote:
           | yes please
        
       | bzmrgonz wrote:
       | Would using SQLite have been in line with the file-over-app
       | philosophy? I'm thinking yes.
        
         | fangpenlin wrote:
         | If there's anything like immortal software, SQLite is
         | definitely on the list
        
       | conradev wrote:
       | Language models completely upended my Beancount setup. To me,
       | there is no point fiddling with precise parsers when a language
       | model can read any PDF. I have the language model extract balance
       | assertions from the PDF (beginning/end balance) so that it
       | grounds its work in reality.
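       | 
       | The check itself is simple arithmetic. A sketch, assuming the
       | model returns the opening balance, the closing balance, and a
       | list of signed amounts (statement_balances() is a made-up name):

```python
from decimal import Decimal
from typing import Iterable


def statement_balances(begin: Decimal, amounts: Iterable[Decimal], end: Decimal) -> bool:
    """True when the opening balance plus every extracted amount equals the closing balance."""
    return begin + sum(amounts, Decimal(0)) == end


extracted = [Decimal("-60.00"), Decimal("25.50")]
print(statement_balances(Decimal("100.00"), extracted, Decimal("65.50")))
```

       | If the check fails, at least one extracted amount or balance
       | is wrong, and the statement can be flagged for review.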
       | 
       | I dream of a future where anyone can download a ZIP of all the
       | PDFs they've ever received from their bank, drop it onto a local
       | app, and wait while it creates an entire accounting setup for
       | you.
       | 
       | Edit: also not mentioned here is Fava, which is a really nice web
       | UI for Beancount (https://beancount.github.io/fava/). I share a
       | link with my accountants, and they find it convenient (for
       | downloading files, at least).
        
         | xyst wrote:
         | Lovely, now LLM can hallucinate how much I spent or earned in a
         | month, or year.
        
           | fangpenlin wrote:
           | For a use case like this, a locally running model would be
           | ideal. I wouldn't like to share my personal accounting books
           | with an LLM either.
        
           | conradev wrote:
           | That gets a lot harder if you check the balances!
        
       | daft_pink wrote:
       | So can this be used for business books and records? A lot of the
       | documentation I've sifted through since Intuit sh*ttified their
       | product made me think that ledger-style open source accounting
       | is more of a Quicken-like personal finance product.
        
         | zie wrote:
         | Double-entry bookkeeping can be used to track any resource of
         | any size, though it's particularly good at tracking money of
         | any size.
         | 
         | The upside of a "real database" like postgres is when you need
         | multiple people in the books making changes at the same time.
         | Until your accounting department grows into multiple people,
         | Beancount, hledger or any plain text accounting system would be
         | fine.
        
           | simonmic wrote:
           | If those multiple people are updating the files via VCS, or
           | via UIs that enforce append-only updates, it's still
           | reasonably fine.
        
       | zdw wrote:
       | A long time ago I wrote a python extension to the C ledger
       | implementation that did a basic RPN calculator:
       | 
       | https://github.com/zdw/ledgercalc
       | 
       | And fed it with a pile of scripts that extracted bank PDFs ->
       | Text -> ledger entries, and shoved it all in git.
       | 
       | This looks like it is some superset of that, but in general I
       | found the files + text to ledger process to scratch a great itch.
        
       | gen220 wrote:
       | FWIW I went down the path of automating as much as I could about
       | this process.
       | 
       | These days, though, my process is more manual. Around every 24 or
       | 48 hours, or immediately after making a transaction, I'll record
       | the transaction in my ledger, which I store in Google Sheets (!)
       | instead of a .ldg file. No more CSVs, no more pure functions of
       | bank output.
       | 
       | Sometimes I miss the .ldg format. But I don't really miss
       | maintaining the automated system. Google Sheets isn't as
       | expressive as Ledger, but it is sufficient for my needs. Charting
       | is a bit easier. YMMV!
       | 
       | To me, the most essential pivots to get me back into personal
       | accounting were to record expenses both manually and ~daily. If I
       | were to return to ledger again -- which I might! -- I'd focus on
       | those aspects more than the automation.
        
       ___________________________________________________________________
       (page generated 2025-03-05 23:00 UTC)