[HN Gopher] Financial Statement Analysis with Large Language Models
___________________________________________________________________
Financial Statement Analysis with Large Language Models
Author : mellosouls
Score : 234 points
Date : 2024-05-24 17:39 UTC (5 hours ago)
(HTM) web link (papers.ssrn.com)
(TXT) w3m dump (papers.ssrn.com)
| davidw wrote:
| If that's a thing, does it then become a thing to learn how to
| write a "poisoned" statement that misleads an LLM but is still
| factual?
| SirLJ wrote:
| It has already been done; quants have been analyzing company
| statements for at least a decade, counting positive and
| negative words, etc...
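The word-counting approach described above can be sketched in a few lines. This is a minimal illustration, not any firm's actual method; the word lists are tiny invented stand-ins (real work typically uses a finance-specific lexicon such as Loughran-McDonald):

```python
# Minimal sketch of dictionary-based sentiment scoring over a filing.
# The word sets below are illustrative placeholders only.
POSITIVE = {"growth", "record", "strong", "improved", "exceeded"}
NEGATIVE = {"decline", "impairment", "weak", "litigation", "restructuring"}

def sentiment_score(text: str) -> float:
    """Return (positive - negative) / total_words for a piece of text."""
    words = text.lower().split()
    if not words:
        return 0.0
    pos = sum(w.strip(".,") in POSITIVE for w in words)
    neg = sum(w.strip(".,") in NEGATIVE for w in words)
    return (pos - neg) / len(words)

print(sentiment_score("Record growth this quarter despite litigation costs."))
```

A positive score suggests net-positive language; the interesting (and studied) question is whether such scores predict anything once everyone computes them.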
| Drakim wrote:
| Maybe, but it sounds hard if there are multiple LLMs out there
| that people might use to analyze such text. Tricking multiple
| LLMs with a certain poisonous combination of words and phrases
| sounds a lot like crafting a file that produces the same digest
| under several different hash functions: theoretically possible,
| but practically impossible.
| infecto wrote:
| As someone alluded to, the narrative that management drives has
| been examined and studied many times over. What is management
| saying, what are they not saying, what are they saying but not
| loudly, what did they say before that they no longer speak
| about. There are insights to glean but nothing that is giving
| you an unknown edge. Sentiment analysis and the like go back
| well into the late 80s, early 90s.
| dpflan wrote:
| I recall seeing a LinkedIn post by Greg Diamos at Lamini
| sharing an analysis of earnings calls. There are links on
| HuggingFace and GitHub; here they are:
|
| - https://huggingface.co/datasets/lamini/earnings-calls-qa
|
| - https://huggingface.co/datasets/lamini/earnings-raw
|
| - https://github.com/lamini-ai/lamini-earnings-calls/tree/main
| deadmutex wrote:
| It would've been interesting to compare models with larger
| context windows, e.g. Gemini with 1m+ tokens and Claude Opus.
| Otherwise, the title maybe should've been Financial Statement
| Analysis with GPT-4.
| infecto wrote:
| The study only captured the financial statements. I am unsure
| what a larger context window would buy you.
| foota wrote:
| You could dump all of a company's financial statements
| together, or dump in all of a company's competitors with it,
| for one.
| infecto wrote:
| That was outside the context of what the paper was
| studying.
| yaj54 wrote:
| maybe the authors should have used a larger context
| window in order to output their paper ;-)
|
| I jest - a well contained and specified context is
| important for a paper.
|
| But also, questions like "it would be interesting to also
| try this other thing" are how new papers get written.
| antimatter15 wrote:
| Figure 3 on p.40 of the paper seems to show that their LLM-based
| model does not statistically significantly outperform a 3-layer
| neural network using 59 variables from 1989. Quoting the
| caption: "This figure compares the prediction performance of GPT
| and quantitative models based on machine learning. Stepwise
| Logistic follows Ou and Penman (1989)'s structure with their 59
| financial predictors. ANN is a three-layer artificial neural
| network model using the same set of variables as in Ou and
| Penman (1989). GPT (with CoT) provides the model with financial
| statement information and detailed chain-of-thought prompts. We
| report average accuracy (the percentage of correct predictions
| out of total predictions) for each method (left) and F1 score
| (right). We obtain bootstrapped standard errors by randomly
| sampling 1,000 observations 1,000 times and include 95%
| confidence intervals."
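The bootstrap procedure the caption describes is straightforward to sketch. The following uses synthetic labels and predictions (not the paper's data): resample 1,000 observations 1,000 times, compute accuracy on each resample, take the standard deviation of the resampled accuracies as the standard error, and report a 95% percentile interval.

```python
import random
import statistics

random.seed(0)

# Synthetic stand-in data: binary labels and predictions that agree
# with the label ~60% of the time. These are NOT the paper's data.
labels = [random.randint(0, 1) for _ in range(5000)]
preds = [y if random.random() < 0.6 else 1 - y for y in labels]

def bootstrap_accuracy(preds, labels, n_obs=1000, n_reps=1000):
    """Resample n_obs observations n_reps times; return the mean
    accuracy, the bootstrapped standard error, and a 95% percentile
    confidence interval."""
    n = len(labels)
    accs = []
    for _ in range(n_reps):
        idxs = [random.randrange(n) for _ in range(n_obs)]
        accs.append(sum(preds[i] == labels[i] for i in idxs) / n_obs)
    accs.sort()
    se = statistics.stdev(accs)
    lo, hi = accs[int(0.025 * n_reps)], accs[int(0.975 * n_reps)]
    return statistics.mean(accs), se, (lo, hi)

mean_acc, se, (lo, hi) = bootstrap_accuracy(preds, labels)
print(f"accuracy ~ {mean_acc:.3f} +/- {se:.3f}, 95% CI ({lo:.3f}, {hi:.3f})")
```

Overlapping intervals of this kind between GPT and the ANN baseline are what "not statistically significantly outperform" means here.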
| infecto wrote:
| Was going to point out the same. Glad to have the paper to read
| but I don't think the findings are significant.
| foota wrote:
| I agree this isn't earth shattering, but I think the benefit
| here is that it's a general solution instead of one trained
| on financial statements specifically.
| _se wrote:
| That is not a benefit. If you use a tool like this to try
| to compete with sophisticated actors (e.g. all major firms
| in the capital markets space) you will lose every time.
| foota wrote:
| We come up with all sorts of things that are initially a
| step backwards, but that lead to eventual improvement.
| The first cars were slower than horses.
|
| That's not to suggest that Renaissance is going to start
| using ChatGPT tomorrow, but maybe in a few years they'll
| be using fine-tuned versions of LLMs in addition to
| whatever they're doing today.
|
| Even if it's not going to compete with the state of the
| art models for something, a single model capable of many
| things is still useful, and demonstrating domains where
| they are applicable (if not state of the art) is still
| beneficial.
| ecjhdnc2025 wrote:
| Far too much in the way of "maybe in a few years" LLM
| prediction relies on the unspoken assumption that there
| will not be any gains in the state of the art in the
| existing, non-LLM tools.
|
| "In a few years" you'd have the benefit of the current,
| bespoke tools, plus all the work you've put into
| improving them in the meantime.
|
| And the LLM would still be behind, unless you believe
| that at some point in the future, a radically better
| solution will simply emerge from the model.
|
| That is, the bet is that at some point, _magic_ emerges
| from the machine that renders all domain-specialist
| tooling irrelevant, and one or two general AI companies
| can hoover up all sorts of areas of specialism. And in
| the meantime, they get all the investment money.
|
| Why is it that we wouldn't trust a generalist over a
| specialist in any walk of life, but in AI we expect one
| day to be able to?
| z7 wrote:
| >Why is it that we wouldn't trust a generalist over a
| specialist in any walk of life, but in AI we expect one
| day to be able to?
|
| The specialist is a result of his general intelligence
| though.
| Terr_ wrote:
| > That is, the bet is that at some point, magic emerges
| from the machine that renders all domain-specialist
| tooling irrelevant, and one or two general AI companies
|
| I have a slightly more cynical take: those LLMs are _not
| actually general models_, but niche specialists on correlated
| text-fragments.
|
| This means human exuberance is riding on the
| (questionable) idea that a _really good_ text-correlation
| specialist can effectively impersonate a general AI.
|
| Even worse: Some people assume an exceptional text-
| specialist model will effectively _meta-impersonate_ a
| generalist model impersonating a _different kind_ of
| specialist!
| ecjhdnc2025 wrote:
| > Even worse: Some people assume an exceptional text-
| specialist model will effectively meta-impersonate a
| generalist model impersonating a different kind of
| specialist!
|
| Eloquently put :-)
| HanayamaTriplet wrote:
| It seems to me that LLMs are the metaphorical horse and
| specialized algorithms are the metaphorical car in this
| situation. A horse is an extremely complex biological
| system that we barely understand and which has evolved
| many functions over countless iterations, one of which
| happens to be the ability to run quickly. We can
| selectively breed horses to try to get them to run
| faster, but we lack the capability to directly engineer a
| horse for optimal speed. On the other hand, cars have
| been engineered from the ground-up for the specific
| purpose of moving quickly. We can study and understand
| all of the systems in a car perfectly, so it's easy to
| develop new technology specialized for making cars go
| faster.
| yaj54 wrote:
| Agreed. Most people can't create a custom-tailored
| financial statement model, but many people can write the
| following sentence: "analyze this financial statement and
| suggest a market strategy." And if that sentence performs
| as well as an (albeit old) custom model, and is likely to
| see compounding improvements in its performance over time
| with no changes to the instruction sentence...
| TechDebtDevin wrote:
| No
| ecjhdnc2025 wrote:
| But it can't come up with a particularly imaginative
| strategy; it can only come up with a mishmash of existing
| stuff it has seen, equivocate, or hallucinate a strategy
| that looks clever but might not be.
|
| So it all needs checking. It's the classic LLM situation.
| If you're trained enough to spot the errors, the analysis
| wouldn't take you much time in the first place. And if
| you're not trained enough to spot the errors...
|
| And let's say it _does_ work. It's like automated
| exchange betting robots. As soon as everyone has access
| to a robot that can exploit some hidden pattern in the
| data for a tiny marginal gain, the price changes and the
| gain collapses.
|
| So if everyone has the same access to the same banal,
| general analysis tools, you know what's going to happen:
| the advantage disappears.
|
| All in all, why would there be any benefits from a
| generalised model?
| jimbokun wrote:
| "buy and hold the S&P 500 until you're ready to retire"
| flourpower471 wrote:
| Not to mention, as somebody who works in quant trading doing
| ML all day on this kind of data: that ANN benchmark is
| nowhere near state of the art.
|
| People didn't stop working on this in 1989 - they realised they
| can make lots of money doing it and do it privately.
| bethekind wrote:
| Do you use llama 3 for your work?
| posting_mess wrote:
| No hedge fund registered before the last 2 weeks will use
| Llama3 for their "prod work" beyond "experiments".
|
| Quant trading is about "going fast" or "being super right",
| so either you'd need to be sitting on some huge
| llama.cpp/transformer improvement (possible but unlikely)
| or it's more likely just some boring math applied faster
| than others.
|
| Even if they are using an "LLM", they won't tell you or even
| hint at it - "efficient market" and all that.
|
| Remember, all quants need to be "the smartest in the world"
| or their whole industry falls apart; wait till you find out
| it's all "high school math" based on algos largely derived
| 30/40 years ago (okay, not as true for "quants", but most
| "trading" isn't as complex as they'd like you/us to
| believe).
| qeternity wrote:
| It's impressive how incorrect so much of this information
| is. High frequency trading is about going fast. There is
| a huge mid and low freq quant industry. Also most quant
| strategies are absolutely not about being "super
| right"...that would be the province of concentrated
| discretionary strategies. Quant is almost always about
| being slightly more right than wrong but at large scale.
|
| What algos are you referring to that were derived 30 or 40
| years ago? Do you understand the decay for a typical strategy?
| None of this makes any sense.
| posting_mess wrote:
| Quantitative trading is simply the act of trading on
| data, fast or slowly, but I'll grant you for the more
| sophisticated audience there is a nuance between "HFT"
| and "Quant" trading.
|
| To be "super right" you just have to make money over a
| timeline you set, according to your own models. If I choose
| a 5-year timeline for a portfolio, I just have to show my
| portfolio outperforming "your preferred index here" over
| that timeline - simple (kind of; I ignore metrics other than
| "make me money" here).
|
| Which algos you use depends on what you're trading; the way
| to calculate the price of an Option/Derivative hasn't
| changed, in my understanding, for 20/30 years - how fast you
| can calculate, forecast, and trade on that information has.
|
| My statement won't hold true in a conversation with an
| "investing legend", but to the audience who asks "do you
| use llama3" it's clearly an appropriate response.
| mathematicaster wrote:
| > how fast you can calculate, forecast, and trade on
| that information has.
|
| How you can calculate fast, forecast, and trade on that
| information has
|
| There. Fixed it for you. ;)
| chollida1 wrote:
| > the way to calculate the price of an Option/Derivative
| hasn't changed in my understanding for 20/30 years
|
| That's not true. It is true that the Black-Scholes model
| was derived in the 70s, but since then you have:
|
| - stochastic vol models
|
| - jump diffusion
|
| - local vol or Dupire models
|
| - Levy processes
|
| - binomial pricing models
|
| all of which came well after the initial model was derived.
|
| There has also been a lot of work on calculating vols and
| prices far faster.
|
| The industry has definitely changed a lot in the past 20
| years.
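For reference, the 1970s baseline that the extensions above build on is compact. Here is a sketch of the plain Black-Scholes price of a European call (constant volatility, no dividends); the inputs used are purely illustrative:

```python
from math import erf, exp, log, sqrt

def norm_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(S: float, K: float, T: float, r: float, sigma: float) -> float:
    """Black-Scholes price of a European call.
    S: spot, K: strike, T: years to expiry, r: risk-free rate,
    sigma: constant volatility (no dividends)."""
    d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)

# Illustrative inputs: at-the-money call, 1 year out, 5% rate, 20% vol.
print(round(bs_call(100, 100, 1.0, 0.05, 0.2), 2))
```

The constant-sigma assumption is exactly what the later models (stochastic vol, local vol, jumps) relax, which is why practitioners treat this as a quoting convention rather than a pricing truth.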
| flourpower471 wrote:
| I don't really understand your viewpoint - I assume you
| don't actually work in trading?
|
| Aside from the "theoretical" developments the other
| comment mentioned, your implication that there is some
| fixed truth is not reflected in my career.
|
| Anybody who has even a passing familiarity with doing
| quant research would understand that Black-Scholes and
| its descendants are very basic results from basic
| assumptions. It says that if the price follows certain
| types of random walk, and is also (crucially) a martingale
| and Markov, then there is a closed-form answer.
|
| First and foremost, Black-Scholes is inconsistent with the
| market it tries to describe (vol smiles, anyone?), so
| anybody claiming it's how you should price options has
| never been anywhere near trading options in a way that
| doesn't shit money away.
|
| In reality the assumptions don't hold - log returns aren't
| Gaussian, and the process is almost certainly neither
| Markov nor a martingale.
|
| The guys doing the very best option pricing are building
| empirical (so not theoretical) models that adjust for all
| sorts of stuff like temporary correlations that appear
| between assets, dynamics of how different instruments move
| together, autocorrelation in market behaviour, spikes and
| patterns of irregular events, and hundreds of other things.
|
| I don't know of any firm anywhere that is trading
| profitably at scale and is using 20 year old or even
| purely theoretical models.
|
| The entire industry moved away from the theory-driven
| approach about 20 years ago, for the simple reason that it
| is inferior in every way to the data-driven approach that
| now dominates.
| JumpCrisscross wrote:
| > _the way to calculate the price of an Option
| /Derivative hasn't changed in my understanding for 20/30
| years_
|
| Not true. Most of the magic happens in estimating the
| volatility surface, BSM's magic variable. But I've also
| seen interesting work in expanding the rates components.
| All this before we get into the drift functions.
| Jerrrrry wrote:
| Leveraging "hidden" risk/reward asymmetries is another
| avenue completely that applies to both quant/HFT, adding
| a dimension that turns this into a pretty complex
| spectrum with plenty of opportunities.
|
| The old joke of two economists ignoring a possible $100
| bill on the sidewalk is an ironic adage. There are
| hundreds of bills on the sidewalk, the real problem is
| prioritizing which bills to pick up before the 50mph
| steamroller blindsides those courageous enough to dare
| play.
| hattmall wrote:
| Algo trading is certainly about speed too, though it's not
| HFT, which is literally only about speed and scalping
| spreads. It's about the speed of recognizing trends and
| reacting to them before everyone else realizes the same
| trend and thus alters it.
|
| It's a lot like quantum mechanics, or whatever it is that
| makes observing a photon change it - except with the
| caveat that the first to recognize the trend can direct
| its change (for profit).
| creativeSlumber wrote:
| Are there any learning resources that you know of?
| Izkata wrote:
| > but most "trading" isn't as complex as they'd like
| you/us to believe
|
| I know nothing about this world, but with things like
| "doctor rediscovers integration" I can't help but wonder
| if it's not deception but ignorance - that they genuinely
| think that's where maths complexity tops out.
| posting_mess wrote:
| They hire people who know that maths doesn't "top out
| there", so they can point to them and say "look at the
| mathematicians/physicists/engineers/PhDs we employ - your
| $20Bn is safe here". Hedge funds aren't run by idiots,
| just a different kind of "smart" to an engineer.
|
| The engineers are incredibly smart people, and so the bots
| are "incredibly smart", but "finance" is criticised by
| "true academics" because finance is where brains go to
| die.
|
| To use a popular-science example: "the three body problem"
| is much harder than "arb trade $10M profitably for a nice
| life in NYC"; you just get paid less for solving the former.
| flourpower471 wrote:
| It is just a different (applied) discipline.
|
| It's like maths vs engineering - you can come up with some
| beautiful PDE theory to describe how this column in a
| building will bend under dynamic load, and use it to
| figure out the exact proportions.
|
| But engineering is about figuring out "just make its
| ratio of width to height greater than x".
|
| Because the goal is different - it's not about coming up
| with the most pleasing description or finding the most
| accurate model of something. It's about making stuff in
| the real world in a practical, reliable way.
|
| The three body problem is also harder than running
| experiments in the LHC or analysing Hubble data or
| treating sick kids or building roads or running a
| business.
|
| Anybody who says that finance is where brains go to die
| might do well to look in the mirror at their own brain.
| There are difficult challenges for smart people in
| basically every industry - anybody suggesting that people
| not working in academia are in some way stupider should
| probably reconsider the quality of their own brain.
|
| There are many many reasons to dislike finance. That it
| is somehow pedestrian or for the less clever people is
| not true. Nobody who espouses the points you've made has
| ever put their money where their mouth is. Why not start a
| firm, make a billion dollars a year because you're so
| smart, and fund fusion research with it? Because it's
| obviously way more difficult than they make out.
| aniviacat wrote:
| Claiming that being smart isn't required for trading is
| not the same as claiming that people doing trading aren't
| smart.
|
| (Note that I personally have no opinion on this topic, as
| I'm not sufficiently informed to have one.)
| flourpower471 wrote:
| "Doctors rediscover integration" is about people stepping
| far outside their field of expertise.
|
| It is neither deception nor ignorance.
|
| It's the same reason some of the best physics students
| get PhD studentships where they are basically doing
| linear regression on some data.
|
| Being very good at most disciplines is about having the
| fundamentals absolutely nailed.
|
| In chess for example, you will probably need to get to a
| reasonably high level before you will be sure to see
| players not making obvious blunders.
|
| Why do tech firms want developers who can write bubble
| sort backward in assembly when they'll never do anything
| that fundamental in their career? Because to get to that
| level you have to (usually) build solid mastery of the
| stuff you will use.
|
| Trading is truly a complex endeavour - anybody who says
| it isn't has never tried to do it from scratch.
|
| I'd say the industry average for somebody moving to a new
| firm and trying to replicate what they did at their old
| firm is about 5%.
|
| I'm not sure what you'd call a problem where somebody has
| seen an existing solution, worked for years on it and in
| the general domain, and still would only have a 5% chance
| of reproducing that solution.
| chronic7202h wrote:
| > I'd say the industry average for somebody moving to a
| new firm and trying to replicate what they did at their
| old firm is about 5%.
|
| Because 95% of experienced candidates in trading were
| fired, are trying to scam their next employer, or think
| they know everything after 3-4 YOE.
|
| Very common example: MIT AI PhD quant leaves Citadel
| Securities or HRT because he thinks he knows the full
| alpha research and monetization pipeline (lol?). He
| interviews at various household names and small stealth
| firms. Gets hired, but realizes there's too much C++ he
| previously was not exposed to and too many model
| hyperparameters he didn't care to understand. He fails
| after 1 year. Blames it on poor SWE or DevOps at the new
| firm. Tries again at a new company. Rinse and repeat for
| 5-6 years. Eventually gives up trading. Goes to work easy
| hybrid hours at Meta or OpenAI. Tells recruiter some
| bullshit like WLB or societal impact.
|
| The remaining 5% are traders/quants/PMs who actually know
| what they are doing but want a higher pnl profit share %
| or left due to political issues. These guys can
| absolutely replicate their old trade. And they do.
| flourpower471 wrote:
| Well, I work in prop trading and have only ever worked for
| prop firms - our firm trades its own capital and
| distributes it to the owners and us under profit-share
| agreements - so we have no incentive to sell ourselves as
| any smarter than the reality.
|
| Saying it's all high school math is a bit of a loaded
| phrase. "High school math" incorporates basically all
| practical computer science and machine learning and
| statistics.
|
| I suspect you could probably build a particle accelerator
| without using more math than a bit of calculus - that
| doesn't make it easy or simple to build one.
|
| Very few people I've worked with have ever said they are
| doing cutting-edge math - it's more like scientific
| research. The space of ideas is huge, and the ways to ruin
| yourself innumerable. It's more about people who have a
| scientific mindset who can make progress in a very
| high-noise and adaptive environment.
|
| It's probably more about avoiding blunders than it is
| having some genius paradigm shifting idea.
| posting_mess wrote:
| >Saying it's all high school math is a bit of a loaded
| phrase. "High school math" incorporates basically all
| practical computer science and machine learning and
| statistics.
|
| I'm responding to the comment "do you use llama3", not
| "breakdown your strat".
|
| > Very few people I've worked with have ever said they
| are doing cutting-edge math - it's more like scientific
| research. The space of ideas is huge, and the ways to
| ruin yourself innumerable. It's more about people who
| have a scientific mindset who can make progress in a
| very high-noise and adaptive environment.
|
| This statement is largely true of any "edge research", as
| I watch the loss totals flow by on my 3rd monitor I can
| think of 30 different avenues of exploration (of which
| none are related to finance).
|
| Trading is largely high school math, on top of very
| complex code, infrastructure, and optimizations.
| areoform wrote:
| Do you work for rentech?
| zjaffee wrote:
| The math might not be complicated for a lot of market
| making stuff but the technical aspects are still very
| complicated.
| bossyTeacher wrote:
| >People didn't stop working on this in 1989 - they realised
| they can make lots of money doing it and do it privately.
|
| Mind elaborating?
| SavageBeast wrote:
| Speaking for myself and likely others with similar
| motivations, yes we can "figure it out" and publish
| something to show our work and expand the field of endeavor
| with our findings - OR - we can figure something profitable
| out on our own and use our own funds to trade our
| strategies with our own accounts.
|
| Anyone who has figured out something relatively profitable
| isn't telling anyone how they did it.
| andsoitis wrote:
| > Anyone who has figured out something relatively
| profitable isn't telling anyone how they did it.
|
| Corollary: someone who is selling you tools or strategies
| on how to make tons and tons of money, is probably not
| making tons and tons of money employing said tools and
| strategies, but instead making their money by having you
| buy their advice.
| SavageBeast wrote:
| Absolutely correct - and moreover, when you do sit
| someone down (in my case, someone with a "superior
| education" in finance compared to my CS degree) and
| explain things to them, they simply don't understand it
| at all and assume you're crazy because you're not doing
| what they were taught in Biz School.
| BobaFloutist wrote:
| I think I could probably make more money selling a tool
| or strategy that consistently, reliably makes ~2% more
| than government bonds than I could make off it myself,
| with my current capital.
| SavageBeast wrote:
| Seems like the money here would be building a shiny,
| public facing version of the tool behind a robust paywall
| and build a relationship with a few Broker Dealer firms
| who can make this product available to the Financial
| Advisors in their network.
|
| If you were running this yourself with $1M input capital,
| that'd be $20k/year per 1M of input - so $20K is a nice
| number to try and beat selling a product that promulgates
| a strategy.
|
| But you're going to run into the question from people
| using the product: "Yeah - but HOW DOES IT WORK??!!!" And
| once you tell them, does your ability to get paid
| disappear?
| their own and cease to pay you (and worse start charging
| for your work)? Is your strategy so complicated that the
| value of the tool itself doing the heavy lifting makes it
| sticky?
|
| Getting people to put their money into some Black Box
| kind of strategy would probably be challenging - but I've
| never tried it; it may be easier than giving away free
| beer for all I know. Sounds like a fun MVP effort,
| really. Give it a try - who knows what might happen.
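For scale, the arithmetic behind the "$20k/year per $1M of input" figure above, spelled out. The 2% edge over government bonds is the thread's hypothetical, not a real return:

```python
# Hypothetical numbers from the thread: $1M of capital run through a
# strategy earning 2% above government bonds.
capital = 1_000_000
edge_over_bonds = 0.02  # 2 percentage points above the bond rate
annual_gain = capital * edge_over_bonds  # dollars per year
print(annual_gain)  # 20000.0
```

That $20k is the benchmark a product's subscription price would have to beat for the seller to come out ahead of just running the strategy on their own $1M.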
| madmask wrote:
| As far as I know, the more people use the strategy, the
| worse it performs; the market is not static, it adapts.
| Other people react to the buys/sells of your strategy and
| try to exploit the new pattern.
| bongodongobob wrote:
| Lol try it and get back to us.
| BobaFloutist wrote:
| Well, see, I don't actually have a method for that. But
| if I did, I think my capital is low enough that I'd have
| more success selling it to other people than trying to
| exploit it myself, since the benefit would be pretty
| minimal if I did it with just my own savings, but could
| be pretty dramatic for, say, banks.
| jimbokun wrote:
| Or just sell it to exactly one buyer with a lot of
| capital to invest.
| SavageBeast wrote:
| That hypothetical person or organization already has an
| advisor in charge of their money at the smaller end, or an
| entire private RIA on the Family Office side of things.
| This approach is a fool's errand.
| renewiltord wrote:
| You can't do it because there are lots of fraudulent
| operators in the space. Think about it: someone comes up
| to you offering a way to give you risk-free return. All
| your ponzi flags go up. It's a market for lemons. If you
| had this, the only way to make it is to raise money some
| other way and then redirect it (illegal, but you'll most
| likely get away with it), or to slowly work your way up the
| ranks proving yourself till you get to a PM and then have
| him work your strat for you.
|
| The fact that you can't reveal how means you can't prove
| you're not Ponzi. If you reveal how, they don't need you.
| dpflan wrote:
| Seems simple: Why share your effective strategies in an
| industry full of competition and those striving to gain a
| competitive edge?
| melenaboija wrote:
| > Mind elaborating?
|
| I am assuming he/she minds a lot.
| tempodox wrote:
| But I bet it uses way more energy.
| primitivesuave wrote:
| The area where I see this making the most transformational change
| is by enabling average citizens to ask meaningful questions about
| the finances of their local government. In Cook County, Illinois,
| there are hundreds of local municipalities and elected
| authorities, all of which are producing monthly financial
| statements. There is not enough citizen oversight and rarely any
| media attention except in the most egregious cases (e.g. the
| recent drama in Dolton, IL, where the mayor is stealing millions
| in plain view of the citizens).
| apwell23 wrote:
| > citizens to ask meaningful questions about the finances of
| their local government.
|
| Is there a demand for this? I live in Cook County, and I
| really don't want to ask these questions. I'm not sure what
| I'd get out of asking them other than anger and frustration.
| Kon-Peki wrote:
| Municipal bond rating agencies should be a client of such
| data.
|
| And if not the rating agencies, people who invest in
| municipal bonds.
| m463 wrote:
| if all the citizens can ask these questions, I think it will
| make a difference.
|
| and of course, the follow-up questions. Like who.
| vsuperpower2020 wrote:
| Then anything you plan is doomed from the start. If
| companies start slipping cyanide into their food it would
| take at least 20 years for people to stop buying it.
| Getting everyone to simply do your thing while they're busy
| with their own life is a fool's errand.
| apwell23 wrote:
| > if all the citizens can ask these questions, I think it
| will make a difference.
|
| Our mayor just appointed some pastor to a high-level
| position in the CTA (the local train system) as some sort
| of patronage.
|
| That's the level things operate at in our government here.
| I am skeptical that some sort of data enlightenment in the
| citizenry via LLMs is what's needed for change.
|
| edit: looks like the pastor buckled today
| https://blockclubchicago.org/2024/05/24/pastor-criticized-
| fo...
| abdullahkhalids wrote:
| The citizens ask LLMs (or more advanced future AIs) to identify
| if government finances are being used efficiently, and if there
| is evidence of corruption.
|
| The corrupt government officers then start using the AIs to try
| to cover up the evidence of their crimes in the financial
| statements. The AI could put the skills of high-end and
| expensive human accountants (or better) into the hands of
| local governments.
|
| Who wins this attrition war?
| berkes wrote:
| > Who wins this attrition war?
|
| The AI companies, doubly so: people paying to use their
| products, but mostly by gaining a lot of leverage and power.
| airstrike wrote:
| > The corrupt government officers then start using the AIs to
| try to cover up the evidence of their crimes in the financial
| statements.
|
| There's a difference between an AI being able to answer
| questions and it helping cover up evidence, unless you mean
| "using the AIs for advice on how to cover up evidence"
| ketzo wrote:
| Corrupt government officers are one thing. But there is a ton
| of completely well-meaning bureaucracy in the U.S. (and
| everywhere!) that could benefit from a huge, huge step change
| in "ability to comprehend".
|
| Bad actors will always exist but I think there's a LOT of
| genuine good to be done here!
| oceanplexian wrote:
| > The corrupt government officers then start using the AIs
|
| You're making it way too complicated. The government will
| simply make AI illegal and claim it's for safety or
| something. They'll then use a bunch of scary words to demonize
| it, and their pals in the mainstream media will push it on
| low-information voters. California already has a bill in the
| works to do exactly this.
| hnthrowaway6543 wrote:
| Let's say LLMs work exactly as advertised in this case: you go
| into the LLM, say "find corruption in these financial reports",
| and it comes back with some info about the mayor spending
| millions on overpriced contracts with a company run by his
| brother. What then? You can post on Twitter, but unless you
| already have a following it's shouting into the void. You can
| go to your local newspapers, they'll probably ignore you; if
| they do pay attention, they'll write an article which gets a
| few hundred hits. If the mayor acknowledges it at all, they'll
| slam it as a political hit-piece, and that's the end of it. So
| your best chance is... hope really hard it goes viral, I guess?
|
| This isn't meant to be overly negative, but exposing financial
| corruption is mostly about information control; I don't see how
| LLMs help much here. Even if/when you find slam-dunk evidence
| that corruption is occurring, it's generally very hard to
| provide evidence in a way that Joe Average can understand, and
| assuming you are a normal everyday citizen, it's extremely hard
| to get people to act.
|
| As a prime example, this bit on the SF "alcohol rehab"
| program[0] went semi-viral earlier this week; there's no way to
| interpret $5 million/year spent on 55 clients as anything but
| "incompetence" at best and "grift and corruption" at worst. Yet
| there's no public outrage or people protesting on the streets
| of SF; it's already an afterthought in the minds of anyone who
| saw it. Is being able to query an LLM for this stuff going to
| make a difference?
|
| [0] https://www.sfchronicle.com/politics/article/sf-free-
| alcohol...
| beepbooptheory wrote:
| Are people supposed to be outraged that that is too little or
| too much money?
|
| That's still cheaper than sending them to prison!
| roywiggins wrote:
| Also, per the link, cheaper than emergency room visits and
| ambulance transports:
|
| > But San Francisco public health officials found that the
| city saved $1.7 million over six months from the managed
| alcohol program in reduced calls to emergency services,
| including emergency room visits and other hospital stays.
| In the six months after clients entered the managed alcohol
| program, public health officials said visits to the city's
| sobering center dropped 92%, emergency room visits dropped
| more than 70%, and EMS calls and hospital visits were both
| cut in half.
|
| > Previously, the city reported that _just five residents
| who struggled with alcohol use disorder had cost more than
| $4 million in ambulance transports over a five-year period,
| with as many as 2,000 ambulance transports over that time._
| [emphasis mine]
|
| > The San Francisco Fire Department said in a statement
| that the managed alcohol program "has proven to be an
| incredibly impactful intervention" at reducing emergency
| service use for a "small but highly vulnerable population."
| Terr_ wrote:
| > That's still cheaper than sending them to prison!
|
| Literally:
|
| > It costs an average of about $106,000 per year to
| incarcerate an inmate in prison in California.
|
| https://www.lao.ca.gov/PolicyAreas/CJ/6_cj_inmatecost
| Scubabear68 wrote:
| Oh yeah. This. I live in a tiny community and our district
| school board has a $54 million budget right now, yet all the
| audits are rubber stamps and a wink and a nudge from the
| State. When residents try to dig in and complain about waste
| and fraud, we are shrugged off.
| tchalla wrote:
| It's your assumption that the lack of oversight is because of
| too much information. How will you validate that hypothesis
| before you invest in a solution?
| kenjackson wrote:
| I think this is in general one of the big wins with LLMs:
| Simple summarization. I first encountered it personally with
| medical lab reports. And as I noted in a past comment, GPT
| actually diagnosed an issue that the doctors and nurses missed
| in real-time as it was happening.
|
| The ability to summarize and ask questions of arbitrarily
| complex texts is so far the best use case for LLMs -- and it's
| non-trivial. I'm ramping up a bunch of college intern devs and
| they're all using LLMs and the ramp up has been amazingly
| quick. The delta in ramp up speed between this and last summer
| is literally an order of magnitude difference and I think it is
| almost all LLM based.
| myth_drannon wrote:
| That's what I did with my town's financial report: I asked
| ChatGPT to find irregularities. The response was very
| concerning, with multiple expenses that looked truly
| suspicious (like $2,000 to plant a single tree). I would have
| gone berserk at the town council meeting if I were an activist
| citizen.
| 3abiton wrote:
| Then companies will tend to use LLMs to maximize the confusion
| of their messages to shareholders. A cat-and-mouse game.
| chollida1 wrote:
| So the history of this type of research, as I know it, was
| that we:
| - started to diff the executives' statements from one quarter
| to the next. Like engineering projects, a lot of this is
| pretty standard, so the starting point is the last doc.
| Diffing let us see what the executives added and thought was
| important, and also what they removed. This worked well, and
| for some things still does (it's what a warrant canary relies
| on), but stopped generating much alpha around 2010.
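
The quarter-over-quarter diffing described in the comment above can be sketched in a few lines of Python; the sample remarks and variable names here are illustrative, not from any real filing:

```python
# Minimal sketch: diff one quarter's prepared remarks against the
# prior quarter's to surface what management added or removed.
import difflib

q1 = """We continue to expect strong demand.
Our litigation exposure remains immaterial.
We are investing in new capacity."""

q2 = """We continue to expect strong demand.
We are investing in new capacity."""

added, removed = [], []
for line in difflib.unified_diff(q1.splitlines(), q2.splitlines(),
                                 lineterm=""):
    if line.startswith("+") and not line.startswith("+++"):
        added.append(line[1:].strip())
    elif line.startswith("-") and not line.startswith("---"):
        removed.append(line[1:].strip())

print("removed:", removed)  # the dropped litigation sentence
```

In this toy example the dropped litigation sentence is the signal: a removal can matter as much as an addition, much like a vanished warrant canary.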
|
| - simple sentiment. We started counting positive and negative
| words to build a poor man's sentiment analysis that could be
| run very quickly on doc release to trade upon. This worked
| great until around 2013, when it started to be gamed and even
| bankruptcy notices scored positive by this metric.
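
A minimal sketch of that word-count approach, with toy word lists as stand-ins (real desks used curated dictionaries such as Loughran-McDonald, not these five-word sets):

```python
# Poor man's sentiment: count positive vs. negative words in a
# filing. The word lists below are tiny illustrative stand-ins.
import re

POSITIVE = {"growth", "strong", "record", "improved", "exceeded"}
NEGATIVE = {"decline", "impairment", "litigation",
            "restructuring", "weak"}

def sentiment_score(text: str) -> float:
    """(positives - negatives) / matched words, in [-1, 1]."""
    words = re.findall(r"[a-z]+", text.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos - neg) / max(pos + neg, 1)

print(sentiment_score("Record growth despite litigation costs"))
# -> 0.333... (2 positive hits, 1 negative)
```

The gaming the commenter describes falls out directly: pad a bankruptcy notice with enough dictionary-positive words and this metric goes green.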
|
| - sentiment models. Using proper models, not just positive and
| negative word counts, we built sentiment models to read what
| the executives were saying. This worked well until about
| 2015/2016 in my view, as by then executives carefully scripted
| their remarks and had been coached to use only positive words.
| The news/social variant worked until Twitter killed the
| firehose, and wasn't very reliable since reputable news
| accounts kept getting hacked. I remember, I think it was the
| AP's account that got hacked and reported a bombing at the
| White House, which screwed up a few funds.
|
| You also had Anne Hathaway news pushing up Berkshire Hathaway's
| share price type issues in this time period.
|
| - there was a period here where we kept the same technology
| but applied it everywhere, from the Twitter firehose to news
| articles, to build a real-time sentiment model for companies
| and sectors. Not sure it generated much alpha, due to
| garbage-in, garbage-out and data-cleaning issues.
|
| - LLMs. From about GPT-2 on, we could build models to do the
| sentiment analysis for us, but they had to be built out of
| foundation models and trained in-house due to context
| limitations. Again this has been gamed by executives, so a lot
| of the research that I know of now targets ingesting
| companies' financials and being able to ask questions quickly
| without math or programming.
|
| e.g. what are the top 5 firms in the consumer discretionary
| space that are growing their earnings the fastest while not
| yet raising their dividends, and whose share price hasn't kept
| up with their sector's average growth?
| nebula8804 wrote:
| I have no window into this world, but I am curious whether you
| know anything about the techniques investors used to short, or
| just analyze, Tesla stock during the production hell of
| 2017-2020. It was an interesting window into the ways firms
| try to measure as much of a company as they can from the
| outside. In fact, was there any other stock as heavily watched
| during that time?
|
| Looking back at that era, it seemed investors were _too_
| focused on the numbers and fundamentals, even setting up live
| feeds of the factories to count the cars coming out, and
| that's the same feeling I get from your post. It seems like
| _dumb_ analysis, i.e. analysis without much context.
|
| We now know from the recent Isaacson biography what was
| happening on the other side. The shorts failed to account for
| the clever, unorthodox ways Musk and co. would find to get the
| delivery numbers up. For example, the famous tent: Musk used a
| loophole in CA law to set up a giant tent in the parking lot,
| which let him boost production by eliminating entire
| bottlenecks from the factory design. There is also the
| religious fervor with which the employees wanted to beat the
| shorts. I don't think that can be measured, no? It helped get
| them past the finish line.
| refulgentis wrote:
| Markets aren't sports teams, i.e. bimodal camps with us vs.
| them drama. Twitter discussion of markets, maybe, but not
| markets.
|
| I've been on both sides of this trade, regularly.
|
| Bear thesis back then was the same as now. In retrospect, I
| give it a bit more credit, because Elon says they were getting
| close to bankrupt while he was posting "bankwupt" memes and
| selling short shorts.
|
| Being a pessimist, and putting your money where your mouth is
| in markets, is difficult because you have to be right and
| have the right timing.
| nicce wrote:
| What will happen if everyone starts using heavy statistical
| methods or LLMs to predict stock prices, and buys stocks based
| on them? Will it make everything completely unpredictable?
|
| Edit: assuming that they initially provide good predictions
| infecto wrote:
| This has already been a thing since the late 80s.
| nicce wrote:
| It hasn't been accurate enough to be meaningful, nor has there
| been enough data.
| infecto wrote:
| There has been plenty of work behind the scenes over the
| decades that has been meaningful.
| surfingdino wrote:
| HFT guys won't touch GPT. The stakes are too high. If LLMs
| could give those guys an edge, they'd be all over this tech.
| berkes wrote:
| Or rather: If LLMs could give those guys an edge, there's no
| way they'd share their edge-giving LLMs with anyone, least of
| all their competition and the plebs.
| lucianbr wrote:
| Isn't it already unpredictable? That is why nobody outperforms
| indices. And utterly irrational, which is again both expected
| and seen. This must be why Tesla continues to have a huge
| market value - Elon knows how to excite the LLMs. :)
| 01100011 wrote:
| Top story on HN because we all secretly think we can be the next
| Jim Simons when in reality we're a few months away from posting
| loss porn to /r/WSB.
|
| If standardized LLM models are used to analyze statements, expect
| the statements to be massaged in ways that produce more favorable
| results from the LLM.
| tempodox wrote:
| Yep, Goodhart's law is universal.
| doctoboggan wrote:
| If this were to become widely used, I can imagine executives
| writing financial statements, running them through an LLM, and
| tweaking them until they get the highest predicted future
| outcome. This would make the measure almost immediately
| useless.
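
That optimize-against-the-analyst loop could be sketched as follows; `score_with_llm` is a hypothetical stand-in for a real model call, and the variant statements are invented:

```python
# Hypothetical sketch of gaming an LLM analyst: generate wording
# variants of a statement, score each with the analyst model, and
# keep the version with the best predicted outcome.
def score_with_llm(statement: str) -> float:
    # Toy proxy for an LLM scorer: reward "positive" words and
    # penalize "negative" ones. A real loop would query a model.
    good, bad = {"robust", "momentum"}, {"headwinds", "shortfall"}
    words = statement.lower().split()
    return sum(w in good for w in words) - sum(w in bad for w in words)

variants = [
    "Revenue shortfall amid macro headwinds",
    "Revenue faced headwinds but showed momentum",
    "Robust momentum despite a temporary shortfall",
]

best = max(variants, key=score_with_llm)
print(best)  # -> "Robust momentum despite a temporary shortfall"
```

This is Goodhart's law in miniature: once the analyst's scoring function is queryable, the statement is optimized against the score rather than the underlying reality.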
| oceanplexian wrote:
| This is already how it works. Have you listened to an earnings
| call? Especially companies like Tesla? They are a dog and pony
| show to sell investors on the stock.
| doctoboggan wrote:
| I am not saying executives aren't currently trying to game the
| system. I am saying that currently the best they can do is
| estimate how thousands of analysts will respond. If LLM
| analysts become widespread, they would be able to run
| simulations in a feedback loop until their report is
| optimized.
| jellicle wrote:
| People are going to lose a LOT of money using this when the LLM
| says "buy" and old-school humans who read the same statement say
| "sell".
|
| But up until that day, it will probably be cheaper.
| dangerwill wrote:
| I guess this makes sense. While there should be some noise in
| translating the text into the model's internal representation
| of the financial data, the authors purposefully reformatted
| all the reports into a consistent layout. That should allow
| the model to do less of the LLM magic and more of a plain
| linear regression over the financial stats. And past
| performance often does have an impact on future performance,
| up to a point.
|
| I wonder what the results would have been with still-anonymized
| but non-fully standardized statements.
|
| Still though, impressive.
| nvy wrote:
| You can scrape the filings from EDGAR, which presents the
| statements in a standardized format.
| nextworddev wrote:
| To everyone thinking they can sell an LLM wrapper based on
| this: it is a very tough domain. You will soon run into data,
| distribution, and low-demand problems. Funds that would
| actually use this are already using it.
| selimnairb wrote:
| Great. Humans no longer need to cook the books and can claim
| plausible deniability. The only problem is that hallucination
| errors could go against you as well as for you.
| caseyy wrote:
| They may claim, but there is no such plausible deniability. Not
| for lawyers using AI hallucinations, not for Tesla drivers
| crashing into things with FSD, not for tax fraud. People are
| ultimately held responsible and accountable for the way they
| use AI.
| einpoklum wrote:
| Let's just skip ahead to just implementing the script of
| Idiocracy verbatim and be done with it.
| motoboi wrote:
| Please add "and the financial result of this analysis will count
| toward your annual bonus" to the prompt.
| jrochkind1 wrote:
| how do you distinguish enterprise-level question-answering from
| other kinds of question-answering?
| dankai wrote:
| If you want to see successful "machine learning based
| financial statement analysis", check out my paper and thesis.
| It's from 2019 and ranks #1 for the term on Google and Google
| Scholar because it is the first paper that applies a range of
| machine learning methods to all the quantitative data in
| financial statements, instead of just doing NLP on the text.
| Happy to answer questions.
|
| paper https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3520684
|
| thesis
| https://ora.ox.ac.uk/objects/uuid:a0aa6a5a-cfa4-40c0-a34c-08...
| caseyy wrote:
| > In this section, we aim to understand the sources of GPT's
| predictive ability.
|
| Oh boy... I wonder how a neural net trained with unsupervised
| learning has a predictive ability. I wonder where that comes
| from... Unfortunately, the article doesn't seem to reach a
| conclusion.
|
| > We implement the CoT prompt as follows. We instruct the model
| to take on the role of a financial analyst whose task is to
| perform financial statement analysis. The model is then
| instructed to (i) identify notable changes in certain financial
| statement items, and (ii) compute key financial ratios without
| explicitly limiting the set of ratios that need to be computed.
| When calculating the ratios, we prompt the model to state the
| formulae first, and then perform simple computations. The model
| is also instructed to (iii) provide economic interpretations of
| the computed ratios.
|
| Who will tell them how an LLM works, and that the neural net
| does not _calculate_ anything? It only predicts the next token
| in a sentence containing a calculation, if it's been
| loss-minimized for that specific calculation.
|
| It looks like these authors are _discovering_ large language
| models as if they were some alien animal, when they are
| mathematically describable and really not-so-mysterious
| prediction machines.
|
| At least the article is fairly benign. It's the type of
| article that would pass as research at my MBA school as
| well... It doesn't reach any groundbreaking conclusions,
| except to demonstrate that the authors have "probed" the
| model. Which I think is good. It's uninformed but not very
| misleading.
| Animats wrote:
| How do they get an LLM to do arithmetic?
| uncomplete wrote:
| Headline should be 'LLM can read (and understand) PDFs'
| diziet wrote:
| The fact that the paper does not mention the word
| "hallucinations" in the full body text makes me think that the
| authors aren't fully familiar with the state of LLMs as of 2024.
| thedudeabides5 wrote:
| bait
___________________________________________________________________
(page generated 2024-05-24 23:00 UTC)