[HN Gopher] AI assists clinicians in responding to patient messa...
       ___________________________________________________________________
        
       AI assists clinicians in responding to patient messages at Stanford
       Medicine
        
       Author : namanyayg
       Score  : 51 points
       Date   : 2024-04-07 16:37 UTC (6 hours ago)
        
 (HTM) web link (med.stanford.edu)
 (TXT) w3m dump (med.stanford.edu)
        
       | lumb63 wrote:
       | Fantastic. An additional way by which someone looking for help or
       | support from a human can be denied that help and have their needs
       | offloaded to a technology that probably can't help them.
       | 
        | Here's how I see this playing out: practices start to use LLMs to
        | answer patient queries to save money. People realize their
        | queries aren't actually being answered fully (or at all?) and
        | stop contacting their doctors altogether. Practices save money
        | and patients get worse outcomes. Exactly as we've seen in
        | customer service over the years - talking to a human is nearly
        | impossible.
       | 
       | An alternate possibility is the LLM gives such bad advice that it
       | causes irreparable harm to a patient and nobody can be held
       | liable because the doctor didn't issue the advice and neither did
       | the company which produced the LLM.
       | 
        | The chance that an LLM used in this situation does anything
        | positive for the patient is vanishingly small. But I'm sure
        | providers and practices will love it, since it will effectively
        | allow providers to skip part of their job and practices to
        | increase their margins.
        
         | sandspar wrote:
         | Why is there always this kneejerk "bad!" reaction to AI? You
         | don't think there's any conceivable way that this may actually
         | help people? You don't think that Stanford Medicine may know a
         | bit more about the problem space than you do?
        
           | SoftTalker wrote:
            | I think it's conceivable, but if the goal were to help
            | people it would already have been done: address the systemic
            | roadblocks that prevent more people from studying medicine,
            | and increase the supply of doctors and clinicians so that
            | people don't have to wait months for a routine appointment.
        
             | jdlshore wrote:
             | That's a false dilemma. We can do both. Some people are
             | addressing roadblocks (or attempting to; politics are
             | hard), and some people are working on AI-assisted support.
             | 
             | I'm as skeptical of GenAI as they come, but let's not
             | pretend that the people working on GenAI customer support
             | are the same people who should be working with the AMA to
             | address the artificial constraints in medical education.
        
           | constantcrying wrote:
           | There is by far enough enthusiasm for AI already.
           | 
           | >You don't think there's any conceivable way that this may
           | actually help people?
           | 
            | I don't. I believe that, exactly as OP described, people
            | will notice they are talking to some AI and just give up, as
            | they correctly understand that the AI does not have the
            | abilities and understanding that might actually help them.
        
           | gpm wrote:
            | Because the vast majority of commercial deployments of "AI"
            | (LLMs) are extremely unhelpful chat bots that just waste
            | customers' time in an attempt to avoid having a human talk
            | to them.
        
             | choilive wrote:
              | And the vast majority of commercial deployments of real
              | humans to solve the same problem are extremely unhelpful
              | and just waste customers' time. What's your point?
        
             | wruza wrote:
              | Decent LLMs are < 1 year old. How much commercial
              | deployment have they seen?
        
               | unleaded wrote:
               | Too much
        
           | lostlogin wrote:
           | > Why is there always this kneejerk "bad!" reaction to AI?
           | 
           | I don't interpret the OP as slamming AI. It's criticism of
           | the way in which companies find ways to save a few dollars.
        
           | olliej wrote:
           | Because "AI" or LLMs have no understanding of what they're
           | producing, and every single attempt to use them in an
           | environment where basic factual correctness is involved has
           | confirmed this?
           | 
           | Or the fact that the entire sales pitch is "this will be
           | cheaper than actual people who know what they're doing",
           | maybe a side claim of "outperforms people on our training
           | set" coupled with "we accept no liability for inaccurate
           | advice or poor outcomes".
           | 
            | The sole purpose of programs like these is to save money. It
            | has nothing to do with helping anyone. The goal is to make
            | it sound like your system is providing the service people
            | paid for, without actually spending the money to provide
            | that service.
           | 
            | I will believe a company is doing the right thing when the
            | developers personally accept the criminal and legal
            | liability that an actual doctor or nurse providing the
            | service is subject to.
           | 
            | But then, to add insult to injury, this forces people to
            | deal with the "AI" before they get through to a further
            | reduced pool of actual specialists, whose wait times have
            | correspondingly increased.
           | 
            | This is not rocket science to understand, and literally
            | nothing I've said here is new - not the clear prioritization
            | of cost cutting, not the overselling of "AI", and not the
            | generally unsafe use of randomized statistical models to
            | provide factual information.
           | 
           | My question is why you think that this time is different vs
           | literally every other example?
        
           | tbalsam wrote:
            | Because this is what the tendency has been in practice for
            | many years. While the possibilities are good, living in the
            | world of what-if for ML is very similar to living in the
            | world of what-if for blockchain. It is very promising, but
            | unfortunately the past trend of reality, at both a surface
            | level and a deeper dynamics level, does not support it being
            | a solid motif long-term.
           | 
           | This is why people are pessimistic about ML/AI in people-
           | facing positions. It has happened with automation in the
           | past, it is _currently happening_ with ML/AI systems in
           | customer-facing applications, and the momentum of the field
           | is headed in that direction. To me, it would be very silly to
           | assume that the field would make a sudden 180 from decades of
           | momentum in a rather established direction, with no
           | established reason other than the potential for it to be
           | good.
           | 
            | This, as I understand it, is why people tend to be upset
            | about it. It's a good tool, and it is oftentimes not used
            | well.
        
           | rsynnott wrote:
           | Given the long, unfortunate history of computerisation of
           | medicine, I don't think there should be a presumption that
           | they necessarily know what they're doing. If this is a
            | disaster, it won't be the first, and some previous medical-
            | technical disasters have involved human factors: the
            | operator becoming either too trusting of the machine or too
            | casual with its use.
        
         | jncfhnb wrote:
          | Is it just me, or has customer service never been easier,
          | with real humans easily accessible for most major
          | corporations?
        
           | pbourke wrote:
           | Real humans who are unable to resolve your issue are quite
           | accessible indeed.
        
         | justrealist wrote:
         | I don't think you understand how stupid and googleable most
         | customer support questions are.
         | 
         | Doctors will have more time for real questions if they aren't
         | answering idiotic questions. You are living in a bubble.
        
           | SoftTalker wrote:
           | Doctors would also have more time for real questions if they
           | told fat people to exercise and quit eating so many pop-tarts
           | instead of giving them shots with a host of risky side-
           | effects that then have to be managed.
        
             | justrealist wrote:
             | Are you really trying to blame obesity on doctors?
             | Impressive.
        
               | lupusreal wrote:
               | No, he's criticizing doctors for wasting time by humoring
               | the obese instead of cutting losses and focusing on
               | patients they stand a better chance of helping.
        
               | ceejayoz wrote:
               | No, they're ignoring the medical fact that "tell people
               | to lose weight" has a basically zero percent success
               | rate.
               | 
               | The only intervention known to work reliably was
               | bariatric surgery until stuff like Ozempic showed up
               | recently.
        
               | lupusreal wrote:
               | Cutting losses to spend time with other patients isn't
               | ignoring that little medical fact. It's an
               | acknowledgement of it; all the effort spent treating the
               | symptoms of obesity is basically wasted, squandering the
               | resources (time particularly) of the medical system.
        
               | ceejayoz wrote:
               | Except the post directly _criticizes_ one of the major
               | new ways of reducing that effort.
               | 
               | Someone genuinely worried about obesity wasting doctors'
               | time should be all for stuff like Ozempic.
        
               | arcticbull wrote:
               | Are you suggesting that the purpose of the medical system
               | is not to treat medical conditions? I suppose if they
               | took that tack, yes, the system would be more efficient.
        
               | arcticbull wrote:
                | It is in a sense their fault for prescribing something
                | (diet and exercise) that we have known for over a
                | hundred years is utterly ineffective. Likely because
                | until now there weren't non-surgical alternatives, and
                | caloric restriction and exercise do have other, non-
                | weight-loss benefits.
        
             | ceejayoz wrote:
             | I promise you, doctors have been telling fat people it's
             | their fault for many, many decades. It's one of the most
             | common complaints of fat people in a medical context - that
             | the first response to nearly any medical issue is "well you
             | should lose weight".
             | 
             | https://www.nbcnews.com/health/health-news/doctors-move-
             | end-...
             | 
             | > When Melissa Boughton complained to her OB-GYN about dull
             | pelvic pain, the doctor responded by asking about her diet
             | and exercise habits.
             | 
             | > On this occasion, three years ago, the OB-GYN told
             | Boughton that losing weight would likely resolve the pelvic
             | pain. The physician brought up diet and exercise at least
             | twice more during the appointment. The doctor said she'd
             | order an ultrasound to put Boughton's mind at ease. The
             | ultrasound revealed the source of her pain: a 7-centimeter
             | tumor filled with fluid on Boughton's left ovary.
        
               | Gibbon1 wrote:
                | A silver lining of being mentally ill is that you don't
                | suffer from the diseases that normal people do. Doctors
                | know that, whatever the complaint, it's due to the
                | mental illness.
        
             | rsynnott wrote:
             | ... Doctors have, of course, been doing that for about a
             | century. It isn't particularly effective.
             | 
             | Ozempic actually is effective, and will likely lead to
             | significant improvements.
        
             | arcticbull wrote:
             | > Doctors would also have more time for real questions if
             | they told fat people to exercise and quit eating so many
             | pop-tarts instead of giving them shots with a host of risky
             | side-effects that then have to be managed.
             | 
              | There is no scientific evidence, whatsoever, that diet and
              | exercise is an effective way of losing a clinically
              | significant (>5%) amount of weight and keeping it off for
              | a long period of time (5y). Go ahead and try to find even
              | one study that shows this is the case.
             | 
             | When you diet and exercise, your basal metabolic rate slows
             | down as much as 20-30%, permanently, and your hunger
             | increases. Your BMR is where the vast majority of your
             | energy expenditure goes, no matter how much you work out.
              | In fact, there's reason to think that more exercise will
              | actually slow your BMR. Body weight set point is
              | principally genetic and epigenetic, as evidenced by twin
              | studies.
             | 
             | Maybe we'd make some progress on this particular topic if
             | we stopped throwing out tired tropes and blaming people.
             | The only scientifically proven methods of achieving
             | significant, long-term sustained weight loss for most
             | people are GLP-1/GIPs or bariatric surgery (but even there,
             | only a gastric sleeve or roux-en-y work, lap bands do not).
             | 
              | Here's a 29-study meta-analysis that walks through what I
              | said in more detail [1], and of course there's the famous
              | Biggest Loser study, where everyone on that show regained
              | all the weight in the six years following. The more they
              | lost on the show, the more they gained back. [2]
             | 
             | Let's not even get started on the use of insulin to treat
             | type 2.
             | 
             | It's pretty wild how backwards our approach to metabolic
             | health is from a clinical perspective. Your response here
             | is a perfect example.
             | 
             | Now look at Tirzepatide. 90% success vs. 5% success. [3]
             | 
             | [1] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5764193/
             | 
             | [2] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4989512/
             | 
             | [3] https://www.nejm.org/doi/suppl/10.1056/NEJMoa2206038/su
             | ppl_f...
        
         | userbinator wrote:
         | _An alternate possibility is the LLM gives such bad advice that
         | it causes irreparable harm to a patient and nobody can be held
         | liable because the doctor didn't issue the advice and neither
         | did the company which produced the LLM._
         | 
         | I think someone will be liable; this has already been tested in
         | at least one court with something less life-critical:
         | https://news.ycombinator.com/item?id=39378235
        
         | exe34 wrote:
          | The doctor who signs off will definitely be liable.
          | Unfortunately, we live in a society where somebody has to both
          | die and have litigious surviving relatives before the problem
          | becomes actionable.
        
       | ceejayoz wrote:
       | "Just review the AI draft" is likely gonna kill some people by
       | priming clinicians to agree with the result.
       | 
       | https://www.npr.org/sections/health-shots/2013/02/11/1714096...
       | 
       | > He took a picture of a man in a gorilla suit shaking his fist,
       | and he superimposed that image on a series of slides that
       | radiologists typically look at when they're searching for cancer.
       | He then asked a bunch of radiologists to review the slides of
       | lungs for cancerous nodules. He wanted to see if they would
       | notice a gorilla the size of a matchbook glaring angrily at them
       | from inside the slide.
       | 
       | > But they didn't: 83 percent of the radiologists missed it, Drew
       | says.
        
         | wccrawford wrote:
         | Based on the code that CoPilot has produced for me, I can 100%
         | see that happening, and rather a lot.
         | 
          | So far, the code it produces _looks_ really good at first
          | glance, but once I really dig into what it's doing, I find
          | numerous problems. I have to really think about that code to
          | overcome the things it does wrong.
         | 
         | I think many doctors are used to doing that already, since they
         | have to correct their interns/assistants/etc when they come up
         | with things that sound good, but ultimately aren't. But it's
         | definitely something that they'll fail at sometimes.
        
           | stouset wrote:
           | This is what I really worry about in programming contexts.
           | The norm is _already_ for a distressing proportion of
           | engineers to barely (if even that) understand code they're
           | working on, even when they're the only author.
           | 
           | In that situation it's extremely easy to just blindly trust
           | an "expert" third party, particularly one where you can't ask
           | it questions about how and why.
        
             | userbinator wrote:
             | They don't even deserve to be called "engineers". They're
             | just guess-and-test amateurs.
        
         | userbinator wrote:
         | From what I've seen so far, some will absolutely notice and
         | complain loudly about the mistakes that AI makes, and some
         | won't --- praising it as another great solution. I suspect this
         | will create another highly divisive issue.
        
         | gretch wrote:
         | If you put an ascii art comment block of a gorilla in a code
         | base and told me to "find the race condition", I would probably
         | skip over the gorilla.
         | 
         | At the end of the day the metric of success for these doctors
         | is to find cancer, not impossible images of gorillas.
         | 
          | The study is kinda "funny", but it doesn't really mean
          | anything. Even the article itself does not claim that this is
          | some kind of failure.
        
           | ceejayoz wrote:
           | The point of this study is that if you're told there's a race
           | condition in the code, you are much more likely to miss a big
           | security hole in it.
        
             | gretch wrote:
              | So give the reviewer a rubric of things you care about:
              | race conditions, security, etc.
              | 
              | The point is, an image of a gorilla is literally
              | completely impossible and has no relevance. That's why a
              | security vuln is not analogous.
        
               | ceejayoz wrote:
                | In the case of radiology, that rubric is "anything
                | clinically significant you can find". A kid being
                | treated for possible pneumonia might have signs on a
                | chest x-ray of cancer, rib fractures from child abuse,
                | an enlarged heart, etc.
                | 
                | There's a risk of "rule out pneumonia"-style guidance
                | resulting in a report that misses these things in favor
                | of "yep, it's pneumonia".
        
             | vrc wrote:
              | A better study would be to place an unlikely but
              | detection-worthy artifact into the image. Say you're
              | looking for a broken bone, and you put signs of a rare
              | disease in an unrelated part of the body that is still
              | visible in the radiograph. Bet the clinicians spot it.
              | Because in the real world, this is how things get caught.
              | But I'd love to know the missed detection rate.
        
               | ceejayoz wrote:
               | This has been studied in various ways. Lots of perception
               | effects documented. For example:
               | 
               | https://cognitiveresearchjournal.springeropen.com/article
               | s/1...
               | 
               | > For over 50 years, the satisfaction of search effect
               | has been studied within the field of radiology. Defined
               | as a decrease in detection rates for a subsequent target
               | when an initial target is found within the image, these
               | multiple target errors are known to underlie errors of
               | omission (e.g., a radiologist is more likely to miss an
               | abnormality if another abnormality is identified). More
               | recently, they have also been found to underlie lab-based
               | search errors in cognitive science experiments (e.g., an
               | observer is more likely to miss a target 'T' if a
               | different target 'T' was detected). This phenomenon was
               | renamed the subsequent search miss (SSM) effect in
               | cognitive science.
        
         | Zenzero wrote:
          | With all due respect, there is so much absurdity in the
          | assumptions made in your linked article that it is almost not
          | worth engaging with. However, I will, for educational
          | purposes.
         | 
         | As someone who is trained in and comfortable reading
         | radiographs but is not a radiologist, I can tell you that
         | putting a gorilla on one of the views is a poor measure of how
         | many things are missed by radiologists.
         | 
         | Effectively interpreting imaging studies requires expert
         | knowledge of the anatomy being imaged and the variety of ways
         | pathology is reflected in a visibly detectable manner. What
         | they are doing is rapidly cycling through what is effectively a
         | long checklist of areas to note: evaluate the appearance of
         | hilar and mediastinal lymph nodes, note bronchiolar appearance,
         | is there evidence of interstitial or alveolar patterns
         | (considered within the context of what would be expected for a
         | variety of etiologies such as bronchopneumonia, neoplasia,
         | CHF,...), do you observe appropriate dimensions of the cardiac
         | silhouette, do you see other evidence of consolidation within
         | the lungs, within the visible vertebrae do you observe
         | appropriate alignment, do the endplates appear abnormal, do you
         | observe any vertebral lucencies, on and on and on.
         | 
          | Atypical changes tend to cluster in expected ways. Often,
          | deviations from what is expected will trigger a bit more
          | consideration, but those expectations are subverted during
          | the course of going through your "checklist". No radiologist
          | has "look for a gorilla" in their evaluation.
         | 
          | It is pretty clear that the layperson's understanding of a
          | radiologist's job as "look at the picture and say anything
          | that is different" is a complete miss on what is actually
          | happening during their evaluation.
         | 
         | It's like if I asked you to show me your skills driving a car
         | around an obstacle course, and then afterwards I said you are a
         | bad driver because you forgot to check that I swapped out one
         | of the lug nuts on each wheel with a Vienna sausage.
        
           | ceejayoz wrote:
           | My dad is a radiologist and... not so dismissive of this
           | study. Missing other conditions on a reading due to a focus
           | on something specific is not uncommon.
           | 
           | Things like obvious fractures left out of a report.
           | 
           | https://cognitiveresearchjournal.springeropen.com/articles/1.
           | ..
           | 
           | > For over 50 years, the satisfaction of search effect has
           | been studied within the field of radiology. Defined as a
           | decrease in detection rates for a subsequent target when an
           | initial target is found within the image, these multiple
           | target errors are known to underlie errors of omission (e.g.,
           | a radiologist is more likely to miss an abnormality if
           | another abnormality is identified). More recently, they have
           | also been found to underlie lab-based search errors in
           | cognitive science experiments (e.g., an observer is more
           | likely to miss a target 'T' if a different target 'T' was
           | detected). This phenomenon was renamed the subsequent search
           | miss (SSM) effect in cognitive science.
        
       | nicklecompte wrote:
       | I haven't finished reading the paper closely, but it is
       | disturbing that this is the only real reference to _accuracy_ and
       | _relevance_ I could find:
       | 
       | > Barriers to adoption include draft message voice and/or tone,
       | content relevance (8 positive, 1 neutral, and 9 negative), and
       | accuracy (4 positive and 5 negative).
       | 
        | That's it! Outside of that, it looks like almost all the metrics
        | are about how this makes life easier for _doctors._ Maybe there
        | is more in the supplement; I have not read it closely.
       | 
        | Look: I am not a policymaker or a physician. It's quite plausible
        | that a few LLM hallucinations are an acceptable price to mitigate
        | the serious threat of physician overwork and sleep deprivation.
        | And GPT-4 is pretty good these days, so maybe the accuracy issues
        | were minor and had little impact on patient care.
       | 
       | What's not acceptable is motivated reasoning about productivity
       | leading people to treat the accuracy and reliability of software
       | as an afterthought.
        
         | potatoman22 wrote:
          | Those are just free-text comments from the ~80 physicians who
          | used this tool. Tables 2-4 show that the researchers measured
          | various characteristics like message read/write time,
          | utilization rate, workload, burnout, utility, and message
          | quality.
         | 
         | It's also worth noting that, from the abstract, the study's
         | objective isn't to study the LLM's accuracy. This is a study of
         | the effectiveness of this drafting system's implementation in
         | the hospital. I'm not saying the accuracy isn't an important
         | component of the system's effectiveness, but it's not the
         | question they're answering.
        
           | nicklecompte wrote:
           | My point is that "accuracy is not the question they're
           | answering" is a fatal flaw that makes this research
           | pointless.
           | 
           | Say I released a linear algebra library and extolled its
           | performance benefits + ease of development, then offhandedly
           | mentioned that "it has some concerns with accuracy" without
           | giving further details. You wouldn't say "ah, he focused on
           | performance rather than accuracy." You'd say "is this a
           | scam?" It is never acceptable for healthcare software
           | researchers and practitioners to ignore accuracy. LLMs don't
           | change that.
           | 
           | The only thing that sort of cynical laziness is good for is
           | justifying bad decision-making by doctors and hospital
           | administrators.
        
       | jvans wrote:
        | Notice how all the benefits they mention are for physicians and
        | not for the patients. That's a pretty backwards approach to
        | healthcare.
        
         | Calavar wrote:
          | It's a benefit for health system C-suite execs because it
          | makes physicians "more efficient," and it's ultimately the
          | health system C-suite who will be signing off on the contracts
          | for five years of AI messenger service or whatever, not the
          | physicians.
        
         | ipv6ipv4 wrote:
          | Stanford Medicine in a nutshell. Stanford Medicine has spread
          | like a cancer around me, and all the clinics it acquired
          | immediately became awful.
        
       | jmount wrote:
        | Likely paying top dollar only to deal with cheap first-line
        | support. Probably a "direct to human" fee available soon.
        
       | throwaway918274 wrote:
       | this is horrifying
        
       ___________________________________________________________________
       (page generated 2024-04-07 23:01 UTC)