[HN Gopher] AI assists clinicians in responding to patient messa...
___________________________________________________________________
AI assists clinicians in responding to patient messages at Stanford
Medicine
Author : namanyayg
Score : 51 points
Date : 2024-04-07 16:37 UTC (6 hours ago)
(HTM) web link (med.stanford.edu)
(TXT) w3m dump (med.stanford.edu)
| lumb63 wrote:
| Fantastic. An additional way by which someone looking for help or
| support from a human can be denied that help and have their needs
| offloaded to a technology that probably can't help them.
|
| Here's how I see this playing out: practices start to use LLMs to
| answer patient queries to save money. People realize their
| queries aren't actually being answered fully (or at all?) and
| stop contacting their doctors altogether. Practices save money
| and patients get worse outcomes. Exactly as we've seen in
| customer service over the years - talking to a human is nearly
| impossible.
|
| An alternate possibility is the LLM gives such bad advice that it
| causes irreparable harm to a patient and nobody can be held
| liable because the doctor didn't issue the advice and neither did
| the company which produced the LLM.
|
| The universe where an LLM used in this situation does anything
| positive for the patient is vanishingly small. But I'm sure
| providers and practices will love it, since it will effectively
| let providers skip part of their job and practices increase
| their margins, respectively.
| sandspar wrote:
| Why is there always this kneejerk "bad!" reaction to AI? You
| don't think there's any conceivable way that this may actually
| help people? You don't think that Stanford Medicine may know a
| bit more about the problem space than you do?
| SoftTalker wrote:
| I think it's conceivable, but if the goal were to help people,
| it would already have been done: address the systemic
| roadblocks that prevent more people from studying medicine,
| increase the supply of doctors and clinicians so that people
| don't have to wait months for a routine appointment.
| jdlshore wrote:
| That's a false dilemma. We can do both. Some people are
| addressing roadblocks (or attempting to; politics are
| hard), and some people are working on AI-assisted support.
|
| I'm as skeptical of GenAI as they come, but let's not
| pretend that the people working on GenAI customer support
| are the same people who should be working with the AMA to
| address the artificial constraints in medical education.
| constantcrying wrote:
| There is more than enough enthusiasm for AI already.
|
| >You don't think there's any conceivable way that this may
| actually help people?
|
| I don't. I believe that, exactly as OP described, people will
| notice they are talking to some AI and just give up, as they
| correctly understand that the AI does not have the abilities
| and understanding that might actually help them.
| gpm wrote:
| Because the vast majority of commercial deployments of "AI"
| (LLMs) are extremely unhelpful chat bots that just waste
| customers' time in an attempt to avoid having a human talk
| to them.
| choilive wrote:
| And the vast majority of commercial deployments of real
| humans to solve the same problem are extremely unhelpful and
| just waste customers' time. What's your point?
| wruza wrote:
| Decent LLMs are < 1 year old. How much commercial
| deployment have they seen?
| unleaded wrote:
| Too much
| lostlogin wrote:
| > Why is there always this kneejerk "bad!" reaction to AI?
|
| I don't interpret the OP as slamming AI. It's criticism of
| the way in which companies find ways to save a few dollars.
| olliej wrote:
| Because "AI" or LLMs have no understanding of what they're
| producing, and every single attempt to use them in an
| environment where basic factual correctness is involved has
| confirmed this?
|
| Or the fact that the entire sales pitch is "this will be
| cheaper than actual people who know what they're doing",
| maybe with a side claim of "outperforms people on our training
| set" coupled with "we accept no liability for inaccurate
| advice or poor outcomes".
|
| The sole purpose of programs like these is to save money. It
| has nothing to do with helping anyone. The goal is to sound
| like your system is providing the service people paid for,
| without actually spending the money to provide that service.
|
| I will believe a company is doing the right thing when
| the developers personally accept the criminal and legal
| liability that an actual doctor or nurse providing the
| service is subject to.
|
| But then, to add insult to injury, this just further increases
| wait times by forcing people to deal with the "AI" before they
| get through to a further reduced pool of actual specialists,
| with correspondingly longer waits there as well.
|
| This is not rocket science to understand, and literally
| nothing I've said here is new - not the clear prioritization
| of cost cutting, the overselling of "AI", or the generally
| unsafe use of randomized statistical models to provide
| factual information.
|
| My question is: why do you think this time is different from
| literally every other example?
| tbalsam wrote:
| This is because that is how the tendency has played out in
| practice for many years. While the possibilities are good,
| living in the world of what-if for ML is very similar to the
| world of what-if for blockchain. It is very promising, but
| unfortunately the past trend in reality, at both a surface
| level and a deeper dynamics level, does not seem to support
| it being a solid motif long-term.
|
| This is why people are pessimistic about ML/AI in people-
| facing positions. It has happened with automation in the
| past, it is _currently happening_ with ML/AI systems in
| customer-facing applications, and the momentum of the field
| is headed in that direction. To me, it would be very silly to
| assume that the field would make a sudden 180 from decades of
| momentum in a rather established direction, with no
| established reason other than the potential for it to be
| good.
|
| That, as I understand it, is why people generally tend to be
| upset about it. It's a good tool, and it is oftentimes not
| used well.
| rsynnott wrote:
| Given the long, unfortunate history of computerisation of
| medicine, I don't think there should be a presumption that
| they necessarily know what they're doing. If this is a
| disaster, it won't be the first, and some previous medical-
| technical disasters have involved human factors; the operator
| either becoming too trusting of the machine, or too casual
| with its use.
| jncfhnb wrote:
| Is it just me, or has customer service never been easier, with
| real humans easily accessible for most major corporations?
| pbourke wrote:
| Real humans who are unable to resolve your issue are quite
| accessible indeed.
| justrealist wrote:
| I don't think you understand how stupid and googleable most
| customer support questions are.
|
| Doctors will have more time for real questions if they aren't
| answering idiotic questions. You are living in a bubble.
| SoftTalker wrote:
| Doctors would also have more time for real questions if they
| told fat people to exercise and quit eating so many pop-tarts
| instead of giving them shots with a host of risky side-
| effects that then have to be managed.
| justrealist wrote:
| Are you really trying to blame obesity on doctors?
| Impressive.
| lupusreal wrote:
| No, he's criticizing doctors for wasting time by humoring
| the obese instead of cutting losses and focusing on
| patients they stand a better chance of helping.
| ceejayoz wrote:
| No, they're ignoring the medical fact that "tell people
| to lose weight" has a basically zero percent success
| rate.
|
| The only intervention known to work reliably was
| bariatric surgery until stuff like Ozempic showed up
| recently.
| lupusreal wrote:
| Cutting losses to spend time with other patients isn't
| ignoring that little medical fact. It's an
| acknowledgement of it; all the effort spent treating the
| symptoms of obesity is basically wasted, squandering the
| resources (time particularly) of the medical system.
| ceejayoz wrote:
| Except the post directly _criticizes_ one of the major
| new ways of reducing that effort.
|
| Someone genuinely worried about obesity wasting doctors'
| time should be all for stuff like Ozempic.
| arcticbull wrote:
| Are you suggesting that the purpose of the medical system
| is not to treat medical conditions? I suppose if they
| took that tack, yes, the system would be more efficient.
| arcticbull wrote:
| It is in a sense their fault for prescribing something
| (diet and exercise) we have known is utterly ineffective
| for over a hundred years. Likely because until now there
| weren't non-surgical alternatives, and caloric
| restriction and exercise do have other non-weight-loss
| benefits.
| ceejayoz wrote:
| I promise you, doctors have been telling fat people it's
| their fault for many, many decades. It's one of the most
| common complaints of fat people in a medical context - that
| the first response to nearly any medical issue is "well you
| should lose weight".
|
| https://www.nbcnews.com/health/health-news/doctors-move-end-...
|
| > When Melissa Boughton complained to her OB-GYN about dull
| pelvic pain, the doctor responded by asking about her diet
| and exercise habits.
|
| > On this occasion, three years ago, the OB-GYN told
| Boughton that losing weight would likely resolve the pelvic
| pain. The physician brought up diet and exercise at least
| twice more during the appointment. The doctor said she'd
| order an ultrasound to put Boughton's mind at ease. The
| ultrasound revealed the source of her pain: a 7-centimeter
| tumor filled with fluid on Boughton's left ovary.
| Gibbon1 wrote:
| A silver lining of being mentally ill is that you don't
| suffer from the diseases that normal people do. Doctors
| know that whatever the complaint, it's due to the mental
| illness.
| rsynnott wrote:
| ... Doctors have, of course, been doing that for about a
| century. It isn't particularly effective.
|
| Ozempic actually is effective, and will likely lead to
| significant improvements.
| arcticbull wrote:
| > Doctors would also have more time for real questions if
| they told fat people to exercise and quit eating so many
| pop-tarts instead of giving them shots with a host of risky
| side-effects that then have to be managed.
|
| There is no scientific evidence, whatsoever, that diet and
| exercise is an effective way of losing a clinically
| significant (>5%) amount of weight and keeping it off for a
| long period of time (5y). Go ahead and try to find even
| one study that shows this is the case.
|
| When you diet and exercise, your basal metabolic rate slows
| down by as much as 20-30%, permanently, and your hunger
| increases. Your BMR is where the vast majority of your
| energy expenditure goes, no matter how much you work out.
| In fact there's reason to think that more exercise will
| actually slow your BMR. Body weight set point is
| principally genetic and epigenetic, as evidenced by twin
| studies.
|
| Maybe we'd make some progress on this particular topic if
| we stopped throwing out tired tropes and blaming people.
| The only scientifically proven methods of achieving
| significant, long-term sustained weight loss for most
| people are GLP-1/GIPs or bariatric surgery (but even there,
| only a gastric sleeve or Roux-en-Y work; lap bands do not).
|
| Here's a 29-study meta-analysis which walks you through
| what I said in more detail [1], and of course the famous
| Biggest Loser study, where everyone on that show regained
| all the weight in the six years following. The more they
| lost on the show, the more they gained back. [2]
|
| Let's not even get started on the use of insulin to treat
| type 2.
|
| It's pretty wild how backwards our approach to metabolic
| health is from a clinical perspective. Your response here
| is a perfect example.
|
| Now look at Tirzepatide. 90% success vs. 5% success. [3]
|
| [1] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5764193/
|
| [2] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4989512/
|
| [3] https://www.nejm.org/doi/suppl/10.1056/NEJMoa2206038/suppl_f...
| userbinator wrote:
| _An alternate possibility is the LLM gives such bad advice that
| it causes irreparable harm to a patient and nobody can be held
| liable because the doctor didn't issue the advice and neither
| did the company which produced the LLM._
|
| I think someone will be liable; this has already been tested in
| at least one court with something less life-critical:
| https://news.ycombinator.com/item?id=39378235
| exe34 wrote:
| The doctor who signs off will definitely be liable.
| Unfortunately, we live in a society where somebody has to both
| die and have litigious surviving relatives before the problem
| becomes actionable.
| ceejayoz wrote:
| "Just review the AI draft" is likely gonna kill some people by
| priming clinicians to agree with the result.
|
| https://www.npr.org/sections/health-shots/2013/02/11/1714096...
|
| > He took a picture of a man in a gorilla suit shaking his fist,
| and he superimposed that image on a series of slides that
| radiologists typically look at when they're searching for cancer.
| He then asked a bunch of radiologists to review the slides of
| lungs for cancerous nodules. He wanted to see if they would
| notice a gorilla the size of a matchbook glaring angrily at them
| from inside the slide.
|
| > But they didn't: 83 percent of the radiologists missed it, Drew
| says.
| wccrawford wrote:
| Based on the code that CoPilot has produced for me, I can 100%
| see that happening, and rather a lot.
|
| So far, the code it produces _looks_ really good at first
| glance, but once I really dig into what it's doing, I find
| numerous problems. I have to really think about that code to
| overcome the things it does wrong.
|
| I think many doctors are used to doing that already, since they
| have to correct their interns/assistants/etc when they come up
| with things that sound good, but ultimately aren't. But it's
| definitely something that they'll fail at sometimes.
| stouset wrote:
| This is what I really worry about in programming contexts.
| The norm is _already_ for a distressing proportion of
| engineers to barely (if even that) understand code they're
| working on, even when they're the only author.
|
| In that situation it's extremely easy to just blindly trust
| an "expert" third party, particularly one where you can't ask
| it questions about how and why.
| userbinator wrote:
| They don't even deserve to be called "engineers". They're
| just guess-and-test amateurs.
| userbinator wrote:
| From what I've seen so far, some will absolutely notice and
| complain loudly about the mistakes that AI makes, and some
| won't --- praising it as another great solution. I suspect this
| will create another highly divisive issue.
| gretch wrote:
| If you put an ASCII art comment block of a gorilla in a code
| base and told me to "find the race condition", I would probably
| skip over the gorilla.
|
| At the end of the day the metric of success for these doctors
| is to find cancer, not impossible images of gorillas.
|
| The study is kinda "funny", but it doesn't really mean
| anything. Even the article itself does not claim that this is
| some kind of failure.
| ceejayoz wrote:
| The point of this study is that if you're told there's a race
| condition in the code, you are much more likely to miss a big
| security hole in it.
| gretch wrote:
| So give the reviewer a rubric of things you care about.
| Race conditions, security, etc.
|
| The point is, an image of a gorilla is literally completely
| impossible and has no relevance. That's why a security vuln
| is not analogous.
| ceejayoz wrote:
| In the case of radiology, that rubric is "anything
| clinically significant you can find". A kid being
| treated for possible pneumonia might have signs on a
| chest x-ray of cancer, rib fractures from child abuse, an
| enlarged heart, etc.
|
| There's a risk of "rule out pneumonia"-style guidance
| resulting in a report that misses these things in favor
| of "yep, it's pneumonia".
| vrc wrote:
| A better study would be to place an unlikely but detection-
| worthy artifact into the image. Say you're looking for a
| broken bone and you put a rare disease in an unrelated part
| of the body but visible in the radiograph. Bet the
| clinicians spot it. Because in the real world, this is how
| things get caught. But I'd love to know the missed
| detection rate.
| ceejayoz wrote:
| This has been studied in various ways. Lots of perception
| effects documented. For example:
|
| https://cognitiveresearchjournal.springeropen.com/articles/1...
|
| > For over 50 years, the satisfaction of search effect
| has been studied within the field of radiology. Defined
| as a decrease in detection rates for a subsequent target
| when an initial target is found within the image, these
| multiple target errors are known to underlie errors of
| omission (e.g., a radiologist is more likely to miss an
| abnormality if another abnormality is identified). More
| recently, they have also been found to underlie lab-based
| search errors in cognitive science experiments (e.g., an
| observer is more likely to miss a target 'T' if a
| different target 'T' was detected). This phenomenon was
| renamed the subsequent search miss (SSM) effect in
| cognitive science.
| Zenzero wrote:
| With all due respect, there is so much absurdity in the
| assumptions made in your linked article that it is almost not
| worth engaging with. However, I will, for educational
| purposes.
|
| As someone who is trained in and comfortable reading
| radiographs but is not a radiologist, I can tell you that
| putting a gorilla on one of the views is a poor measure of how
| many things are missed by radiologists.
|
| Effectively interpreting imaging studies requires expert
| knowledge of the anatomy being imaged and the variety of ways
| pathology is reflected in a visibly detectable manner. What
| they are doing is rapidly cycling through what is effectively a
| long checklist of areas to note: evaluate the appearance of
| hilar and mediastinal lymph nodes, note bronchiolar appearance,
| is there evidence of interstitial or alveolar patterns
| (considered within the context of what would be expected for a
| variety of etiologies such as bronchopneumonia, neoplasia,
| CHF,...), do you observe appropriate dimensions of the cardiac
| silhouette, do you see other evidence of consolidation within
| the lungs, within the visible vertebrae do you observe
| appropriate alignment, do the endplates appear abnormal, do you
| observe any vertebral lucencies, on and on and on.
|
| Atypical changes are typically clustered in expected ways.
| Often deviations from what is expected will trigger a bit more
| consideration, but those expectations are subverted during the
| course of going through your "checklist". No radiologist has
| "look for a gorilla" in their evaluation.
|
| It is pretty clear that the layperson's understanding of a
| radiologist's job as "look at the picture and say anything
| that is different" is a complete miss on what is actually
| happening during their evaluation.
|
| It's like if I asked you to show me your skills driving a car
| around an obstacle course, and then afterwards I said you are a
| bad driver because you forgot to check that I swapped out one
| of the lug nuts on each wheel with a Vienna sausage.
| ceejayoz wrote:
| My dad is a radiologist and... not so dismissive of this
| study. Missing other conditions on a reading due to a focus
| on something specific is not uncommon.
|
| Things like obvious fractures left out of a report.
|
| https://cognitiveresearchjournal.springeropen.com/articles/1...
|
| > For over 50 years, the satisfaction of search effect has
| been studied within the field of radiology. Defined as a
| decrease in detection rates for a subsequent target when an
| initial target is found within the image, these multiple
| target errors are known to underlie errors of omission (e.g.,
| a radiologist is more likely to miss an abnormality if
| another abnormality is identified). More recently, they have
| also been found to underlie lab-based search errors in
| cognitive science experiments (e.g., an observer is more
| likely to miss a target 'T' if a different target 'T' was
| detected). This phenomenon was renamed the subsequent search
| miss (SSM) effect in cognitive science.
| nicklecompte wrote:
| I haven't finished reading the paper closely, but it is
| disturbing that this is the only real reference to _accuracy_ and
| _relevance_ I could find:
|
| > Barriers to adoption include draft message voice and/or tone,
| content relevance (8 positive, 1 neutral, and 9 negative), and
| accuracy (4 positive and 5 negative).
|
| That's it! Outside of that it looks like almost all the metrics
| are about how this makes life easier for _doctors._ Maybe there
| is more in the supplement; I have not read it closely.
|
| Look: I am not a policymaker or a physician. It's quite plausible
| that a few LLM hallucinations are an acceptable price to mitigate
| the serious threat of physician overwork and sleep deprivation.
| And GPT-4 is pretty good these days, so maybe the accuracy issues
| were minor and had little impact on patient care.
|
| What's not acceptable is motivated reasoning about productivity
| leading people to treat the accuracy and reliability of software
| as an afterthought.
| potatoman22 wrote:
| Those are just free-text comments from the ~80 physicians who
| used this tool. Tables 2-4 show the researchers measured
| various characteristics like message read/write time,
| utilization rate, workload, burnout, utility, and message
| quality.
|
| It's also worth noting that, from the abstract, the study's
| objective isn't to study the LLM's accuracy. This is a study of
| the effectiveness of this drafting system's implementation in
| the hospital. I'm not saying the accuracy isn't an important
| component of the system's effectiveness, but it's not the
| question they're answering.
| nicklecompte wrote:
| My point is that "accuracy is not the question they're
| answering" is a fatal flaw that makes this research
| pointless.
|
| Say I released a linear algebra library and extolled its
| performance benefits + ease of development, then offhandedly
| mentioned that "it has some concerns with accuracy" without
| giving further details. You wouldn't say "ah, he focused on
| performance rather than accuracy." You'd say "is this a
| scam?" It is never acceptable for healthcare software
| researchers and practitioners to ignore accuracy. LLMs don't
| change that.
|
| The only thing that sort of cynical laziness is good for is
| justifying bad decision-making by doctors and hospital
| administrators.
| jvans wrote:
| Notice how all the benefits they mention are for physicians and
| not for the patients. That's a pretty backwards approach to
| healthcare.
| Calavar wrote:
| It's a benefit for health system C-suite execs because it makes
| physicians "more efficient," and it's ultimately the health
| system C-suite who will be signing off on the contracts for
| five years of AI messenger service or whatever, not the
| physicians.
| ipv6ipv4 wrote:
| Stanford Medicine in a nutshell. Stanford Medicine has spread
| like a cancer around me, and all the clinics that were acquired
| immediately became awful.
| jmount wrote:
| Likely paying some tall dollars to deal with cheap first-line
| support. Probably a "direct to human" fee available soon.
| throwaway918274 wrote:
| this is horrifying
___________________________________________________________________
(page generated 2024-04-07 23:01 UTC)