[HN Gopher] Half of women will get a false positive 3D mammogram...
___________________________________________________________________
Half of women will get a false positive 3D mammogram, study finds
Author : samizdis
Score : 196 points
Date : 2022-03-25 15:15 UTC (7 hours ago)
(HTM) web link (www.axios.com)
(TXT) w3m dump (www.axios.com)
| [deleted]
| haldujai wrote:
| This isn't quite as big of a problem as they're portraying and
| there is a lot of misinformation in these comments. For context
| I'm a radiologist at one of the academic centres involved in the
| ongoing TMIST (3D mammography) trial.
|
| For starters, false positives are an inherent part of medical
| diagnosis. Interpretation and diagnosis are probabilistic, based on
| ROC curves that balance sensitivity and specificity against disease
| prevalence as well as the significance of a missed/delayed
| diagnosis.
|
| Every concern that has been raised in these comments is factored
| into our reporting system. All radiologists use the BI-RADS
| reporting system which is internationally validated and expresses
| probabilities of malignancy to guide further management based on
| specific imaging features that have been validated against
| pathologic data from excised surgical specimens and biopsies.
| This was done to homogenize reporting practices.
|
| For example, a BI-RADS 5 lesion, which is based on specific
| imaging features rather than gestalt/expert opinion, denotes a 95%
| probability of malignancy and would generally result in urgent
| biopsy + surgical consultation. A BI-RADS 3 lesion has a 2%
| chance of malignancy and would generally be followed with serial
| imaging for a specified interval.
|
| BI-RADS 4 is the middle; it is broken down into A/B/C, but
| essentially anything with a 3%+ probability goes for biopsy. That
| means we expect that up to 97% of the lesions we biopsy will have
| been false positives.
|
| This is intentional and desired, because the harm of missing
| breast cancer is horrible: early detection results in treatment
| with lumpectomy + radiation vs advanced breast cancer needing
| systemic therapies and radical resections.
|
| Second, an abnormality on a screening mammogram does not go
| straight to biopsy. We use additional
| mammography views and ultrasound to help us find out what's going
| on before making that decision. We also have MRI to troubleshoot
| and increasingly for screening, certainly for high risk and
| extremely dense breasts.
|
| Tomosynthesis, or 3D mammography, is not a cure-all for false
| positives, and it's not intended for dense breasts either. The
| point of
| Tomosynthesis is to reduce callbacks for overlapping breast
| tissue which can sometimes look like a real cancer, and to
| increase sensitivity for architectural distortion (an occult or
| infiltrative cancer distorting the fibroglandular architecture)
| which can be really hard to pick up on 2D.
|
| It works really great for that. Patients with extremely dense
| breasts should really be getting screening MRIs (US as a lesser
| alternative) in today's age, and this is happening with
| increasing frequency.
|
| Further reading:
|
| https://en.wikipedia.org/wiki/Receiver_operating_characteris...
|
| https://www.acr.org/Clinical-Resources/Reporting-and-Data-Sy...
|
| https://www.nejm.org/doi/full/10.1056/NEJMoa1903986
|
| https://www.cancer.gov/about-cancer/treatment/clinical-trial...
| dontreact wrote:
| > For starters, false positives are an inherent part of medical
| diagnosis. Interpretation and diagnosis is probabilistic based
| on ROC curves balancing different sensitivities and
| specificities balanced with disease prevalence as well as the
| significance of missed/delayed diagnosis.
|
| I totally get this and it's technical terminology.
|
| I think my takeaway from trying to explain why this isn't a big
| deal on this thread is that calling a tomo or mammo "positive"
| is just a recipe for confusion.
|
| I understand why it's done this way historically, but the idea
| of a mammo or tomo being "positive for cancer" makes no sense
| since there will almost always be either an ultrasound, a biopsy,
| or something else before a diagnosis is made. It's a test for
| whether a more invasive/expensive test is worth it, not a test
| for cancer.
|
| Curious what your opinion is on AI for DBT? I think there is
| great long term potential here, even more so than the potential
| that we have seen so far in mammo (such as
| https://www.nature.com/articles/s41586-019-1799-6), because
| it's easier for AI to thoroughly look at things in 3d and spot
| new patterns that are not obvious to human eyes.
| [deleted]
| asdfasgasdgasdg wrote:
| Some clarifications:
|
| "Half" here is the 10 year cumulative probability of at least one
| false positive if you get a screening each year.
|
| The status quo ten-year false positive rate with the test used
| before 3D mammograms (digital mammography) is 56.3%, and the false
| positive rate with 3D mammograms (aka tomosynthesis) is 49.6%.
|
| So the study is actually reporting an improvement vs the previous
| state of the art! It's weird that the article is written to
| convey that the false positive rate of the new technique is a
| drawback when it's actually a benefit.
|
| (This comment has been heavily edited thanks to a correction I
| received below.)
| rkangel wrote:
| I think that the title is actually a pretty clear and accurate
| statement.
|
| [If you assume that all women have mammograms at the normal
| cadence, over a period of 10 years then] half of women will get
| a false positive.
| asdfasgasdgasdg wrote:
| The title is technically accurate, but you have to take into
| account uninformed people's priors in order to efficiently
| convey information that might update their beliefs. Just my
| personal opinion on this issue. If you say "new technique [X]
| has [high] false positive rate," you're going to make people
| who don't know the status quo think there is a deficit in the
| new technique. Also, I'm not just talking about the title.
| rkangel wrote:
| Yes, the article seemed to be making two different points
| and got a little directionless as a result:
|
| * Half of all women will get false positives - that's a
| lot, it's not great, people will think they have cancer and
| we should set expectations
|
| * Despite the introduction of a newer technology to reduce
| false positives (3D scanning) we've only improved the
| situation from 56% to 50%
| labcomputer wrote:
| > only improved the situation from 56% to 50%
|
| But isn't that _millions_ of fewer false positive
| results over 10 years?
|
| Assumptions:
|
| * US Women: ~150e6
|
| * Fraction getting annual mammograms: ~1/3 (life
| expectancy of 80yrs, and mammograms during 30 of those
| years, minus some with poor access to health care)
|
| * Reduction in cumulative FP rate over 10 year period: 6%
|
| 150e6 x 0.33 * 0.06 = ~3e6 fewer false positive results
| over a 10 year period. That seems like a significant
| effect size to me.
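|
| A minimal Python sketch of the same back-of-envelope estimate
| (the inputs are the rough assumptions above, not study data):
|
|     us_women = 150e6            # ~150 million US women
|     screened_fraction = 1/3     # share getting annual mammograms
|     fp_reduction = 0.06         # ~6-point drop in 10-yr FP rate
|
|     fewer_fps = us_women * screened_fraction * fp_reduction
|     print(f"{fewer_fps:.2e}")   # ~3.0e6 fewer over a decade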
| barbegal wrote:
| I think you've got the numbers mixed up according to the paper
| https://jamanetwork.com/journals/jamanetworkopen/fullarticle...
| "tomosynthesis [3D mammogram] vs digital mammography [previous
| test] for all outcomes: 49.6% vs 56.3%"
| asdfasgasdgasdg wrote:
| Yeah, I mixed up which name applied to the new technique. To
| be clear, I don't know the space, I just read the study and
| assumed that "digital mammography" sounded more 3d-ish than
| "tomosythesis". I'll update my comment accordingly.
|
| Edit: oh man. Now I see that I mixed up the direction too!
| What in the heck. The article was written like there was a
| big problem with 3d mammograms so I assumed it was performing
| worse on this metric than the baseline. But in fact it's
| better?
| gzer0 wrote:
| False-positive findings on screening mammography have been shown
| to cause psychosocial harm that persists 3 years later [1].
|
| False positives also have a major economic impact, to the tune of
| $4 billion USD [2].
|
| [1] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3601385/
|
| [2]
| https://www.healthaffairs.org/doi/full/10.1377/hlthaff.2014....
| dontreact wrote:
| This is all certainly true but it's important to look at the
| bigger picture. What is the overall impact of catching cancer
| earlier and saving lives? That is enormous as well. One way
| this is sometimes talked about in public health literature is
| by looking at the QALYs of an intervention. On the whole,
| breast cancer screening does seem to be an effective
| intervention
|
| https://pubmed.ncbi.nlm.nih.gov/31098856/
| Johnny555 wrote:
| My wife got a false positive a few years back, her doctor called
| her personally to discuss and reassure her that it was not
| necessarily a problem, just an area of suspicion. She wasn't all
| that anxious or worried about it. The worst thing she was worried
| about was the chance of needing a biopsy, not the chance of
| actual breast cancer.
|
| She went in for a followup ultrasound scan and was cleared.
|
| Given the survivability of early stage breast cancer versus late
| stage, a high false positive rate doesn't seem worse than the
| alternative of missing early stage breast cancer. Obviously,
| better imaging or other tools would be even better.
| conductr wrote:
| > Given the survivability of early stage breast cancer versus
| late stage, a high false positive rate doesn't seem worse than
| the alternative of missing early stage breast cancer
|
| I stand by this sentiment wholeheartedly. 3D mammo caught a
| 0.6mm diameter tumor for my wife and she never even felt the
| lump. I felt it once and then couldn't find it again. I had to
| drag her to the doctors office as she didn't believe me. Oddly
| enough, my college years were spent in a pathology lab and I
| knew exactly what a tumor felt like and knew her family history
| and my heart sank the first time I felt it, so I was quite
| persistent in forcing her to see the doctor.
|
| They rushed her surgery on the schedule and still the tumor was
| 3.5mm by the time they removed it. Time is absolutely of the
| essence. Had she waited for her annual exam or for some
| convincing evidence that she needed to go to the doctor, I fear
| it would have metastasized and I'd be a widower writing this
| comment.
|
| I feel like adding, in the US, at the time, we had to fight
| tooth and nail for the 3D mammo to be covered by insurance and
| it still had a pretty pricey copay. This really should be more
| widely available. Especially for certain demographic traits. My
| wife was 30ish (dense tissue), with a family history and known
| BRCA+, and should not have had to fight for this technology
| that is the obvious best option.
| glitcher wrote:
| My wife got a false negative screening and then a few months
| later was diagnosed with stage 4 metastatic breast cancer. The
| cost of dealing with false positives seems much less than the
| consequences of a false negative in our experience, so I agree
| with your sentiment.
| sebmellen wrote:
| I hope your wife is doing alright. It is always shocking how
| quickly life can change.
| mensetmanusman wrote:
| Eventually disaster strikes everyone. Knowing this helps me
| traverse life with humility.
| mh- wrote:
| My wife is a survivor of stage 3 metastatic. We deal with
| false positive scares around once a year at this rate, but
| I'll take them every time over not finding something in time.
|
| I'm very sorry about your wife.
| qiskit wrote:
| The absolute worst thing is false negatives. No tests should
| allow those. We can live with false positives to some degree.
| But we should nonetheless try and minimize the false positives
| since high false positives might do more collective harm than
| no test at all. What the false positive limit is, I guess it's
| up to society and the medical profession to decide.
| Aulig wrote:
| Unfortunately we treat lots of false positives, which leads to
| unnecessary chemotherapy and other treatments. These
| unnecessary treatments harm one's health so much that no study
| could show an increase in life expectancy by doing regular
| mammograms.
| dontreact wrote:
| No one just treats a mammography or tomography positive
| without other follow up tests. It's not the standard of care.
| The next step is to get an ultrasound and then a biopsy.
| Johnny555 wrote:
| No one receives chemotherapy from a screening mammogram
| alone, there are always further diagnostic tests before
| treatment. Though there are probably unnecessary biopsies.
|
| Screening mammograms do reduce breast cancer mortality:
|
| https://pubmed.ncbi.nlm.nih.gov/33141657/
|
| _Annual mammographic screening at the age of 40-49 years
| resulted in a relative reduction in mortality, which was
| attenuated after 10 years. It is likely that digital
| mammography with two views at all screens, as practised now,
| could improve this further. There was no evidence of
| overdiagnosis in addition to that which already results from
| the National Programme carried out at later ages._
| tomrod wrote:
| This is really the tradeoff we face in the binary
| classification space. Do we want false positives or false
| negatives reduced? There are more sophisticated approaches, but
| if the costs in one direction are orders of magnitude more than
| in the other, it becomes a clear preference.
| orzig wrote:
| There's also a subtle clinical communication aspect to this. I
| once had a doctor message me shortly after a routine test and
| say "I need you to come back _tomorrow_ ", and my pregnant wife
| was losing it until we could get through to him and find out
| that he just wanted to ensure I hadn't gotten an infection
| during the test. My wife wasn't wrong to be upset! Several
| hours of suffering could have been avoided with a good
| pamphlet.
| Johnny555 wrote:
| Yeah, communication is key - no doubt it would have been a
| different story if she got a voice-mail from her doctor's
| office saying "We found a suspicious lump in your mammogram,
| we're going to need you to come in for a followup tomorrow".
| She probably would have had a sleepless night wondering how
| serious it was.
| colinmhayes wrote:
| Yep, I've got family members who are doctors and they always
| tell me that bedside manner is the most important part of
| their job. They have to go to continual learning conferences
| every year and the biggest part of them is usually bedside
| manner stuff.
| Aulig wrote:
| I looked into this recently. From what I can tell, no study could
| show a reduction in deaths from doing mammograms. This is because
| they detect many cancers that would go away on their own, thus
| unnecessarily exposing you to chemotherapy, which brings other
| health risks. A good starting point on this topic is this video:
| https://youtu.be/_sg14En-Z7A I was doubtful about the validity of
| that video at first, but everything he says is backed up by
| studies.
| dontreact wrote:
| https://pubmed.ncbi.nlm.nih.gov/33141657/
|
| "Long-term follow-up of the UK Age RCT" "There was a
| statistically significant 25% reduction in mortality from
| breast cancers diagnosed during the intervention phase at 10
| years' follow-up"
| kmonad wrote:
| Thank you.
| cupofpython wrote:
| https://www.breastcancer.org/research-news/3d-mammos-outperf...
|
| https://www.cancer.org/cancer/breast-cancer/screening-tests-...
|
| Related articles on the subject.
|
| Something that should be kept in mind is that false positives
| should always be talked about alongside false negatives. There is
| no perfect test, and often in medicine there is an increase in
| one of those in order to decrease the other.
|
| "A false-negative mammogram looks normal even though breast
| cancer is present. Overall, screening mammograms miss about 1 in
| 8 breast cancers."
|
| It isn't discussed in the OP article, and I could not easily find
| something that differentiates false-negative rates in 2D vs 3D
| mammograms, but the cost of over-diagnosis might be worth it if
| that means fewer cancers go unnoticed in screenings.
| blackbear_ wrote:
| Note that this is not due to bad tests, just a counter-intuitive
| probabilistic result happening when one tries to detect an a
| priori very rare condition.
|
| It follows from a straightforward application of Bayes formula:
|
| - Suppose a person has 1% chance of getting cancer, so p(cancer)
| = 0.01
|
| - Suppose that the test has 99% sensitivity and specificity,
| i.e., p(positive | cancer) = p(negative | no cancer) = 0.99
|
| - To use Bayes we first need the probability of a positive result
| regardless of disease status, i.e. p(positive) = p(positive |
| cancer) * p(cancer) + p(positive | no cancer) * p(no cancer) =
| 0.99 * 0.01 + 0.01 * 0.99 = 0.0198
|
| - Then by Bayes we have p(cancer | positive) = p(positive |
| cancer) * p(cancer) / p(positive) = 0.99 * 0.01 / 0.0198 = 0.5
|
| And things get worse the rarer the condition is. When p(cancer) =
| 0.5 the computations above give p(cancer | positive) = 0.99, and
| when p(cancer) = 0.001 then p(cancer | positive) = 0.09.
|
| In other words, the rarer the condition, the more precise tests
| have to be.
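|
| A minimal Python sketch of the same Bayes calculation (the 99%
| sensitivity/specificity and the prevalences are the illustrative
| numbers above, not real mammography figures):
|
|     def p_cancer_given_positive(prevalence, sens=0.99, spec=0.99):
|         p_pos = sens * prevalence + (1 - spec) * (1 - prevalence)
|         return sens * prevalence / p_pos
|
|     for prev in (0.5, 0.01, 0.001):
|         print(prev, round(p_cancer_given_positive(prev), 2))
|     # 0.5   -> 0.99
|     # 0.01  -> 0.5
|     # 0.001 -> 0.09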
| hackernewds wrote:
| So in summary, it is a bad test for a rare case? Am I missing
| anything?
| ceras wrote:
| I think there's one nuance to add: the usefulness of the test
| depends on your base rate expectation for how likely the given
| _individual_ is to have the rare disease.
|
| The more evidence you have that this person is more likely to
| have the disease, the more useful the test becomes. Some
| examples:
|
| - if the disease is more common among people over age 65,
| it's more useful on people in that age group
|
| - if the person displays more symptoms associated with the
| disease, and not _also_ associated with more common diseases,
| then it 's more useful for someone with those symptoms
|
| - if a disease is common in an area of the world, and the
| person has traveled there, the test is more useful
|
| The more factors you have, the less likely a false positive.
| This is why it's often better to avoid medical tests unless
| there's a reason to suspect someone is at elevated risk for a
| disease: a positive result is more likely to be a false
| positive.
| [deleted]
| somenameforme wrote:
| There's a really interesting mathematical 'paradox' with false
| positives along the same lines as the birthday paradox. It's also
| an important one that many doctors get wrong. Let's say there
| is a terrible disease that 0.01% (1 in 10,000) of the population
| is infected with, and we develop a test that is 99% accurate. You
| go to the doctor and get a test. Oh no, it's positive. What are
| the chances that it's a false positive?
|
| Intuition would tell you 1% since the test is 99% accurate.
| Intuition, as it often is, is wrong. Take a sample size of 'x',
| we'll use 10,000. How many people will be infected? 1. How many
| false positives will there be? 1% * 10,000 = 100. So there will
| be 100 false positives, and 1 real positive. So it turns out
| there's a greater than 99% chance that your result is a false
| positive, even though the test is genuinely 99% accurate.
|
| Here's a fun one. So we decide to run the test twice. And AGAIN
| it comes up positive. What are the odds that it's a false
| positive? Again, the exact same logic holds - there are 101
| people we are testing. There will be about 1 false positive and
| about 1 real infection. So the odds of it being a false positive
| once again are about 50%.
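|
| A minimal Python sketch of the same counting argument (it assumes
| no false negatives and that the two runs err independently, which,
| as noted below, usually doesn't hold in practice):
|
|     population = 10_000
|     infected = 1                    # 0.01% of 10,000
|     error_rate = 0.01               # "99% accurate"
|
|     false_pos = (population - infected) * error_rate   # ~100
|     print(false_pos / (false_pos + infected))          # ~0.99
|
|     # Retest only the ~101 people who tested positive:
|     false_pos_2 = false_pos * error_rate               # ~1
|     print(false_pos_2 / (false_pos_2 + infected))      # ~0.5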
| giovannibonetti wrote:
| > So the odds of it being a false positive once again are about
| 50%.
|
| It's even worse in practice, given that the false positive rate
| is probably not independent between both measurements
| dontreact wrote:
| Yeah this is why in practice the followup to a mammogram is
| never going to be another mammogram. It's a different test
| which is more invasive but also produces far less false
| positives: taking a tissue sample.
| InitialLastName wrote:
| Note that the basic Bayesian analysis relies on the samples
| being independent, so if the same test is re-run on the same
| patient, equipment, process and facility, the error is likely
| correlated to the previous test, so the chance that it's still
| a false positive is even higher than an independent analysis
| would suggest.
| jdthedisciple wrote:
| Imo you are exaggerating a bit: What you are describing is
| specificity vs sensitivity [1] which I am fairly certain
| doctors get educated about pretty early on in their studies.
| Once you know about that your examples shouldn't be surprising
| to anyone.
|
| [1] https://en.m.wikipedia.org/wiki/Sensitivity_and_specificity
| sarchertech wrote:
| Any younger doctor has most certainly had sensitivity vs
| specificity hammered into them in med school.
|
| I remember reading studies about doctors not understanding
| the stats years ago, but my wife assures me that medical
| education has overcorrected in the last decade or two.
| tmoertel wrote:
| > Here's a fun one. So we decide to run the test twice. And
| AGAIN it comes up positive. What are the odds that it's a false
| positive? Again, the exact same logic holds - there are 101
| people we are testing. There will be about 1 false positive and
| about 1 real infection. So the odds of it being a false
| positive once again are about 50%.
|
| This conclusion assumes that running the test twice is
| equivalent to running two independent tests for the same
| condition. For many real tests, getting a false positive once
| predicts a higher chance of getting another false positive in a
| repeated test.
| dontreact wrote:
| Yes exactly. In fact mammograms are already double-read and
| agreement is quite high, so this is definitely one of those
| types of tests. There are benign masses that are
| radiologically impossible to distinguish until you get a
| biopsy.
| jvanderbot wrote:
| Seems to me the thesis that "mammogram positive" implies
| "possible cancer" is the faulty one.
|
| In fact, "mammogram indeterminate" is a better diagnosis.
| dragonwriter wrote:
| > Seems to me the thesis that "mammogram positive"
| implies "possible cancer" is the faulty one.
|
| It's not, though.
|
| Possible ≠ certain or even likely.
|
| Screening tests are used to determine whether there are
| indications for more invasive but more conclusive
| diagnostic tests.
| dontreact wrote:
| Yes, and this is how the standard of care treats things.
| You never proceed from positive mammo screen to
| treatment. You always first do some other steps like
| ultrasound or biopsy.
| lostlogin wrote:
| > There are benign masses that are radiologically
| impossible to distinguish
|
| Time to bring out the pigeons.
|
| Having just read the paper again, it's fascinating.
|
| https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4651348/
| belter wrote:
| https://youtu.be/R13BD8qKeTg
| [deleted]
| reincarnate0x14 wrote:
| Also depends a lot on the nature of the screening and what is
| being screened for.
|
| Prostate cancer is somewhat of an outlier: at least to
| my understanding, essentially all men will end up with some
| kind of tumor or another if they live long enough, but some
| large fraction of those won't become something that requires
| medical intervention before the individual would die from any
| number of other age-related causes. This is outside my area of
| expertise, but the relative size of the tissue that produces
| various breast cancers, as well as the proximity to lymph
| nodes, changes the calculus on that a bit.
|
| But I would definitely expect overall screening protocols to
| evolve as methodologies are developed that can potentially
| exclude the false positives from another methodology. While
| test A + A may still produce false positives due to the nature
| of the test failing the same way, if we have A + B + C that
| are all relatively low-impact and not horribly expensive,
| that may end up being a fairly good program if some
| combination of A, B, and C generally exclude the parameters
| that the other produces a false positive for.
| jrapdx3 wrote:
| Prevalence of prostate cancer increases with age >70. Of
| confirmed cases ~2/3 are low grade, require only "active
| surveillance". Around 20% are intermediate grade tumors and
| ~10% are high-grade, aggressive cancer. The basic screening
| tool is the high-specificity PSA lab test, with biopsy
| considered the definitive diagnostic procedure.
|
| Except for the highly malignant variety prostate cancer is
| generally an "indolent" tumor, that is slow-growing. So in
| the majority of cases chances are good that the patient
| would die of something else before the tumor becomes the
| cause of death. However prostate cancer is a leading cause
| of cancer death in men. In the US ~30k/yr men die of
| prostate cancer, so it's not a trivial condition.
| viraptor wrote:
| > as well as the proximity to lymph nodes change the
| calculus on that a bit.
|
| That sounds interesting. Why does lymph node proximity
| matter?
| reincarnate0x14 wrote:
| Again outside my area of expertise here, but to my
| understanding one of the significant risks to breast
| cancer is that if it spreads into the lymphatic tissue,
| it can subsequently metastasize across
| multiple other organs while compromising the lymphatic
| immune response that would normally help limit such a
| spread.
| avianlyric wrote:
| Not 100% sure. But I think the proximity means that its
| easy for the cancer cell to spread to the lymph nodes,
| then they'll get fast tracked around the body, spreading
| cancer everywhere. Once that happens you're basically
| dead: targeted forms of cancer treatment no longer work,
| because the target area is basically your entire
| body.
| DarylZero wrote:
| You're not "basically dead" -- your odds of surviving 5
| years drop from around 80% to 30%.
| dmurray wrote:
| > For many real tests, getting a false positive once predicts
| a higher chance of getting another false positive in a
| repeated test.
|
| Almost all of them, in fact. If improving accuracy was as
| easy as repeating the test, the doctors would just be told
| that the correct procedure for this test is to take three
| readings.
|
| You can expect that this is untrue only if the tests are
| especially expensive or invasive.
| HWR_14 wrote:
| I believe repeating the test is standard procedure for many
| tests - if the first one is positive. However the
| repetitions are usually temporally separated (in some cases
| intentionally, in others because the test takes time.) This
| is effective in those cases because although there are a
| lot of temporary causes for a false positive to spike,
| there are fewer where it's persistent.
| [deleted]
| tejohnso wrote:
| So we say something like, "the test is 99% accurate, giving a
| ~0.99 percent positive test result reliability across the
| tested population."
|
| Is there a specific metric word or phrase for this less than 1%
| positive test result reliability?
|
| Edit: Looks like hadmatter responded with the term I'm looking
| for!
| dragonwriter wrote:
| > Let's say there is a terrible disease that 0.01% (1 in
| 10,000) of the population is infected with, and we develop a
| test that is 99% accurate. You go to the doctor and get a test.
| Oh no, it's positive. What are the chances that it's a false
| positive?
|
| The correct answer is that the question doesn't give enough
| information to know.
|
| I mean, a test that always said no would be 99.99% accurate. A
| mere 99% accuracy could be no false negatives and 0.99% of all
| tests false positives, in which case the answer is that a
| positive result is 1% likely to be correct, and 99% likely to
| be a false positive. Or it could be, on the other extreme,
| 0.01% false negative and 0.98% of all tests false positives, in
| which case the chance that a positive result represents an
| actual case would be 0%.
|
| Presumably, though, approval processes for medical tests will
| constrain the actual distribution of results to be such that, a
| patient with positive result is not more likely to be clear of
| the condition than one with a negative result, so it won't be
| _as bad_ as the latter extreme, but it's worth noting that
| your assumption (essentially that the false/true ratio is equal
| for negative and positive results) is not certain at all.
| Apes wrote:
| What's interesting to me is that the three big criteria tests
| are rated on don't take this into account at all.
|
| In a population of 1,000,000:
|
| TP=99
|
| TN=989901
|
| FP=9999
|
| FN=1
|
| Accuracy = (TN + TP) / (TP+TN+FP+FN) = 99%
|
| Sensitivity = TP / (TP + FN ) = 99%
|
| Specificity = TN / (TN + FP ) = 99%
|
| Looks like a great test across the board!
|
| However, the actual chance someone with a positive test result
| has the disease:
|
| TP / ( TP + FP ) = ~1%
|
| So why isn't this ratio part of the major test criteria?
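|
| The ratio being asked about is the positive predictive value
| (PPV). A minimal Python sketch using the same counts as above:
|
|     TP, TN, FP, FN = 99, 989_901, 9_999, 1
|
|     accuracy    = (TP + TN) / (TP + TN + FP + FN)  # 0.99
|     sensitivity = TP / (TP + FN)                   # 0.99
|     specificity = TN / (TN + FP)                   # 0.99
|     ppv         = TP / (TP + FP)                   # ~0.0098
|     print(accuracy, sensitivity, specificity, ppv)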
| majormajor wrote:
| I'm not sure 99% is considered the gold standard you think it
| is. Sure, a lot of our tests can't even do that well, but we
| _know_ that 99% isn't a lot for large populations. We even
| have jargon like "five nines", "six sigma" around the fact
| that 99% isn't a lot.
| killjoywashere wrote:
| PPV is absolutely something that's tracked. In fact, if you
| read the instructions for use (equivalent to "package
| insert") for lab diagnostics, you'll find the manufacturer's
| truth tables and can calculate the PPV if it's not in there.
| FDA keeps them online, so you don't even need to buy the
| diagnostic. Look them up.
| thaumasiotes wrote:
| > So why isn't this ratio part of the major test criteria?
|
| Because it has nothing to do with the test. Sensitivity and
| specificity are facts about the test, and conceptually they
| will remain whatever they are if you start testing a
| different population.
|
| Whether doing a test has any value is a fact about the
| population ("olympic athletes are never anemic, so there's no
| point in testing them for anemia") that has nothing to do
| with the test. Why would you evaluate the test based on that?
| colinmhayes wrote:
| > So why isn't this ratio part of the major test criteria?
|
| I think it's because this number depends on the percent of
| the population that is positive which can change after the
| trial is conducted.
| giantg2 wrote:
| This doesn't sound quite right. I think it's ignoring
| randomness and not accounting for the thresholds correctly.
| [deleted]
| CTmystery wrote:
| I'm going to join you down at the bottom here. I agree with
| you it isn't quite right, and I _think_ the reason comes from
| a couple sleights of hand. One: healthy people don't go in
| testing for the presence of a rare disease. That is, if you
| are administering this test to 10,000 people and only one
| truly has the disease, then any statistic that you back out
| of this trial is well below any significance. Two: the
| colloquial use of 'accuracy' is not the statistical use of
| 'accuracy'. This post uses the former for our 'intuition' and
| the latter to demonstrate that our intuition is wrong.
| danachow wrote:
| There's not much wrong with the original other than using
| the term "accuracy" when what is actually meant is
| "sensitivity". But it is not colloquial.
|
| https://en.m.wikipedia.org/wiki/Sensitivity_and_specificity
|
| > healthy people don't go in testing for the presence of a
| rare disease.
|
| Well that's precisely the point. There's frequently the
| idea raised among lay people that it would be beneficial to
| test or screen for all kinds of rare or even just uncommon
| diseases that have high mortality in the general populace.
| This is the counter argument.
| giantg2 wrote:
| And many tests, likely the more invasive ones used to
| confirm these initial tests, carry risks. Even something
| as "routine" as a colonoscopy carries risks.
| SamBam wrote:
| I'm pretty sure it is quite right. It's standard Bayesian
| statistics.
|
| If the math is wrong, show us.
| rndgermandude wrote:
| It forgets to consider false negatives, which will not
| really change much for this example, as the accuracy at 99%
| is pretty high and the incidence is pretty low.
|
| However, when it comes to mammograms, where the false
| negative rate is up to 12.5% and the false positive rate is
| up to 50%, and the incidence of breast cancer in women
| (12.5% over a lifetime) is rather high, false negatives
| would play more of a role.
| SamBam wrote:
| It doesn't need to consider false-negatives. The premise
| is that you received a positive test.
|
| The math is, given the test shows a positive, what is the
| probability that you _actually_ have the disease?
| csee wrote:
| Well, if we assume there is a 0% chance of false
| negatives (which seems implicit in the post, even though
| the definition of "test accuracy" doesn't make that
| assumption), then we don't need to consider it.
|
| If we are not making that assumption, then the 50% chance
| claim at the end is incorrect, since there's a chance
| that _both_ of the two positive results are false
| positives and there is a false negative somewhere else in
| the remaining 99 (out of 101) people.
|
| The 50% chance of it being a false positive (conditional
| on two positive results being observed) is actually a
| lower bound, and will increase as a function of the false
| negative rate.
| giantg2 wrote:
| If it's right, show the proof (or at least the real math).
|
| Specifically, why would the 10,001st test carry a 50%
| chance of being a false positive?
| tetromino_ wrote:
| You misunderstood the scenario: 10,000 people would
| receive a test. Out of those, ~100 would be expected to
| test positive. Only those ~100 people would receive a
| second test.
|
| Thus, the population for tests 1..10,000 is different
| than the population for tests 10,001..10,100.
| giantg2 wrote:
| Ok... and why would that second population carry a 50%
| risk of the second test being a false positive?
|
| Edit: I misunderstood the scenario.
| dontreact wrote:
| Yes, it's not a 50% chance of false positive. It's 50%
| chance of false, given positive test.
| rndgermandude wrote:
| In this example? Because we know that one is a true
| positive. We tested 100 people with a false positive, and
| 1 with a true positive.
|
| If we test again, and the test is "fair", then of these
| 101 people, we will retest 99% (or 99) true negative, 1%
| (or 1) false positive, and 1 true positive. So two
| positives, one false positive and one true positive. 1
| out of 2 is 50%.
|
| (Tho, there is a chance also that the true positive might
| come back as a false negative, then 1 out of 1 or 100% of
| positive results would be false positive)
| jgeralnik wrote:
| The second population contains 100 people without the
| disease. Of them we expect one person to have a false
| positive. In addition the person who actually has the
| disease will test positive. So on average 2 people will
| test positive, one of whom has the disease and one of
| whom doesn't. If you are one of those two people you have
| a 50% chance of your (positive) diagnosis being false.
| hervature wrote:
| There is a 1/10000 (the OP said 0.01%) chance of having
| the disease. That is, in 10000 people, one person will
| have the disease. But also, if you test everyone with a
| 99% accurate test, 100 people will test positive. If you
| test them again, 1 person will get another positive test.
| So, at the end of the day, your test will identify the
| one person with the disease with a 99% certainty (or
| 98.01 if you require two tests to be positive). But also
| identify another individual as being doubly positive. So,
| after two rounds of tests, 50% of the marked individuals
| truly have the disease.
|
| The silly thing with this demonstration is that accuracy
| is poorly defined. Test metrics are normally conditioned on the
| actual presence of the disease or not.
| User23 wrote:
| A good example is screening men for breast cancer. It's
| possible for men to get breast cancer, but it's so vanishingly
| rare that the overwhelming majority of positive results will be
| false.
| croes wrote:
| And if you get two positive tests and were each time randomly
| selected for the test, the probability to be really positive is
| 99%
|
| https://github.com/khuyentran1401/Data-science/blob/master/s...
| Cpoll wrote:
| > You go to the doctor and get a test.
|
| Nitpick here. There's nothing wrong with the math of Bayesian
| probability, but the setup is usually something like an STD
| test, and in those cases the prior is wrong.
|
| Our prior probability shouldn't be "0.01% (1 in 10,000) of the
| population is infected," we really want % of infection for the
| population _that get tested_.
|
| This is really obvious with disease tests because people don't
| typically get tests without reasons (symptoms, contact with
| infected persons, etc.). The math as-is would only work if
| we're sampling people randomly.
| reincarnate0x14 wrote:
| This is, incidentally, the same basic reason why so many knee-
| jerk policies produce systems that can never do anything but
| fail. When trying to find needles in hay stacks, since the
| needle (let's say terrorists trying to board airplanes) is so
| tiny relative to the hay (the millions of other people doing
| same), the people responsible for reacting are essentially
| trained only on false positives and still miss the real events
| when they (highly infrequently) occur.
|
| Obviously with low-impact false positives like requiring more
| expensive or invasive, but not really dangerous, additional
| medical tests this is an easily accepted and accounted for set
| of outcomes, but as we try to do things like apply ML to ever
| more specific cases of complex questions it's worth keeping in
| mind.
| majormajor wrote:
| Going a level further, the claim that the systems fail is
| itself tough to support!
|
| Let's say two million people have been seriously
| inconvenienced by airport screening. We have to ask if that's
| an acceptable price to pay for the benefits. But do we know
| the benefits? _Can_ we know the benefits? We actually don't
| know if the policy deterred anything, or even if someone who
| wanted to do something bad had their knife or liquid
| explosive or whatever confiscated and just abandoned the
| plan. It's purely a hypothetical so you're going to support
| the view you want: "we don't need to do all this because we
| don't actually have a big threat" or "we have successfully
| prevented any more attacks."
|
| After all, we often see discussions here of foolish companies
| that underfund their security teams because "why are we
| paying so much for this when nobody has cyberattacked us?" ;)
| [deleted]
| idrios wrote:
| There's a version of this that I was once told and now repeat
| to everyone else:
|
| When you have a condition that affects only 1% of the
| population, a test that reads negative 100% of the time will be
| 99% accurate.
| MiddleEndian wrote:
| My favorite is the 50% test case. If you flip a coin as a
| test and ascribe heads to true, 50% of the time it will
| describe what's happening accurately.
|
| So if you want to know "Has the Sun exploded?" you can flip a
| coin, assign heads to "Yes the Sun has exploded", and 50% of
| the time it will be correct.
|
| You flip a coin, you get heads. You decide to flip the coin
| again, and you get heads again! What are the odds the Sun has
| exploded?
| thaumasiotes wrote:
| But tests are not generally described in terms of accuracy.
| They're described in terms of sensitivity and specificity,
| because those quantities are both "accuracy" but they're not
| closely related to each other.
|
| A test that is always negative has 0% sensitivity and 100%
| specificity. Nobody thinks that sounds good.
| lordnacho wrote:
| You can combine them in an F1 score. Seems like a
| reasonable way to get a single number, though there might
| be some drawbacks I haven't thought of.
| cortesoft wrote:
| That is how tests are described in scientific circles, but
| not with the general public. An overall `accuracy` number
| is usually given.
| [deleted]
| hadmatter wrote:
| Funnily enough, accuracy is a somewhat inaccurate term when it
| comes to medical testing. What we are interested in are
| sensitivity, specificity [0] and prevalence of the disease in
| the population being tested. With those three we can get
| positive predictive value (PPV) and negative predictive value
| (NPV) [1]. But PPV and NPV always depend on prevalence! That's
| why we don't want to test healthy people for all kinds of
| exotic diseases.
|
| [0] https://en.wikipedia.org/wiki/Sensitivity_and_specificity
| [1]
| https://en.wikipedia.org/wiki/Positive_and_negative_predicti...
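|
| A minimal Python sketch of how PPV swings with prevalence for a
| fixed, assumed 99% sensitivity and specificity (illustrative
| numbers only, not real test characteristics):
|
|     def ppv(prevalence, sens=0.99, spec=0.99):
|         tp = sens * prevalence
|         fp = (1 - spec) * (1 - prevalence)
|         return tp / (tp + fp)
|
|     for prev in (0.1, 0.01, 0.001):
|         print(prev, round(ppv(prev), 3))
|     # 0.1   -> 0.917
|     # 0.01  -> 0.5
|     # 0.001 -> 0.09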
| dontreact wrote:
| However, despite the high false positive rate, breast cancer
| screening is a net beneficial procedure (though not without
| controversy) in that when applied in a population, long term
| outcomes like mortality improve.
|
| (edit yes of course cancer is not beneficial thank you)
| gms7777 wrote:
| Just to expand/add nuance to your comment about
| controversy:
|
| It's pretty uncontroversial that cancer screening for women
| above 50 years old or women with high risk (e.g. familial
| or genetic evidence or risk, history of cancer/treatment)
| is a net beneficial procedure.
|
| For asymptomatic, low-risk women aged 40-49, the
| recommendations are a bit more mixed. While there is some
| evidence that overall mortality does improve, there are
| also considerable harms (physical, emotional, financial)
| associated with false positives and "overdiagnosis" (that
| is, identifying and treating low-grade non-invasive lesions
| that are unlikely to progress to breast cancer).
| pfortuny wrote:
| Yep, but so do anxiety disorders associated with a "possible
| cancer".
|
| Living happily is more important than living, for lots of
| people.
| dontreact wrote:
| You can still try to estimate this using something called
| QALYs which are life years adjusted for quality of life.
|
| A breast cancer scare is certainly harmful, but it's
| important to remember that it is going to be a short
| episode as it becomes clear that the tumor is not
| growing. So you have maybe a few months of scariness.
|
| On the other hand cancer is brutal, and death is
| permanent. Years of life cut short. The years leading up
| to that are extremely sad. The impacts of death extend
| beyond the person to their family and friends.
|
| When people try their best to measure these things on
| balance, breast cancer screening is overall a good
| intervention.
| scottlamb wrote:
| > However, despite the high false positive rate, breast
| cancer screening is a net beneficial procedure (though not
| without controversy) in that when applied in a population,
| long term outcomes like mortality improve.
|
| I've read that's also a statistically tricky statement.
| IIRC, they measure survival time from when the cancer was
| detected. So early detection by definition improves the
| survival stats even when there's no meaningful treatment.
|
| If I've read about this, I'm sure medical practitioners are
| well aware of it, but even so, I don't think it's easy to
| correct for it. How do you determine when a late-detected
| cancer actually started?
| dontreact wrote:
| You don't, you just measure long term outcomes like
| mortality in a treatment and control arm. Potentially you
| have to use a synthetic control.
|
| I am most familiar with NLST which actually was a long
| term randomized trial.
| scottlamb wrote:
| Thanks! Paraphrasing to check my understanding: you don't
| use the stat I mentioned, instead measuring whether the
| patients ultimately die from cancer or something else.
| You use a control, so after like detection method some
| patients get treated and some don't. But not treating may
| be unethical, so you use a "synthetic control", which I
| had to look up. Basically an invented group of not-
| treated patients based on statistics from other
| populations. [0] It sounds interesting but tricky to get
| right. "A healthy degree of scepticism on the use of
| synthetic controls is thus expected from the scientific
| community."
|
| [edit: or, looking at the NLST you mentioned [1], maybe
| "treatment" doesn't always mean what I think of as
| treatment. They are actually comparing two different
| detection methods, and they aren't using a synthetic
| control.]
|
| Do they ever use the "survival time after detection" stat
| I mentioned anymore, or has it been (/ should it be)
| abandoned?
|
| [0] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7218288/
|
| [1] https://www.cancer.gov/types/lung/research/nlst
| dontreact wrote:
| Nice follow up!
|
| Yes, NLST was actually a real randomized control trial
| for lung cancer screening, done before it was known
| whether or not it was helpful. The trial eventually
| stopped once they had enough outcome based evidence.
|
| The statistic you mentioned is not used in assessing the
| effectiveness of screening. It does closely relate to
| what is thought to be the causal mechanism that leads to
| effectiveness.
| [deleted]
| WaitWaitWha wrote:
| I am always (jadedly) suspicious of articles on 'false
| positives' within the medical field.
|
| Sure, statistics say that there are false positives. Personally,
| I'd rather have three false positives than one false negative.
| Taken to a bit of an extreme, the outcome of the first is money
| spent and stress, and the outcome of the latter is death. I am
| not a statistic to myself.
|
| I am sorry for the inconvenience to the medical field by having
| to re-test, and certainly feel for the insurance company having
| to pay for a second or third opinion. I care, I really do. Just,
| not that much.
|
| That said, it took me decades to recognize the need and add
| sympathy into my calculations.
| giantg2 wrote:
| I hate all these new 3D imaging things.
|
| I wonder how much difference there really is with the older tech
| based on looking at outcomes. For standard surveillance, does the
| new 3D dental xray actually benefit me? Granted the cost is low
| (<10% normal yearly exposure), but I still wonder.
| dontreact wrote:
| It will take a long time to sort this out. I think that
| eventually, the ability for AI to review 3D in greater detail
| will improve on false positives and false negatives.
|
| For CT vs. XRay there have already been 2 long term randomized
| controlled trials that look at outcomes for lung cancer
| screening. PLCO, the XRay study, was not effective in improving
| long term outcomes like mortality. NLST, the CT study, showed a
| 20% reduction in mortality. With more modern techniques that
| actually look at the tumor in 3D, NELSON in Europe has seen even
| better mortality reduction.
|
| DBT is newer so A) Radiologists are not as good at reading it
| yet B) the long term evidence hasn't accumulated yet, but I am
| definitely optimistic that it will present a big improvement
| over the previous state of the art.
|
| As mentioned elsewhere in the thread, this false positive
| number is already better than the existing state of the art for
| 2D.
| fedorareis wrote:
| Well, according to the article, in this case 3D imaging is
| reducing false positives by about 6%, which seems like an actual
| benefit. Would it be a benefit if it is easier to maintain or
| easier for the doctor to read?
| abxytg wrote:
| Often it's a volumetric view of the same data, which is usually
| viewed as 2D slices. Doctors tend to prefer the slices.
| giantg2 wrote:
| Yeah, but what are they doing with that to benefit me? Does
| it identify dental issues that traditional imaging misses?
| How common is that?
| wincy wrote:
| I remember almost fainting when my doctor showed me the MRI
| scans of my brain. It doesn't bother me to see someone else's
| scan, but I just couldn't stomach looking at mine at all,
| seeing my optic nerve and my eyeballs and everything.
| jhgb wrote:
| And I got my MRI brain scan and I had absolutely zero
| emotional reaction to it, except for curiosity. But then
| again I'm a schizoid so that was very likely anyway...
| abxytg wrote:
| Kind of a non sequitur but anyone else working on this stuff? I
| am writing a mammo rendering engine for the browser and it would
| be nice to have people to talk shop with. This shit is hard.
| husarcik wrote:
| I'm a radiology resident interested in this space. Feel free to
| DM me on Twitter, @husarcik. :-)
| dontreact wrote:
| It's good to be publishing stats like this so that the
| psychological harm of false positives is lower. It's scary to get
| a false positive.
|
| However, in my opinion, there has been a systemic resistance to
| screening because of statistics like this, and that is misguided.
| This is not the right number to use for determining whether or
| not screening is useful.
|
| A 3D mammogram will have false positives, yes. And some women
| will need to either get an unnecessary ultrasound (annoying, but
| not harmful) or, worse, an unnecessary tissue sample (biopsy).
| This type of biopsy is done with a needle and is certainly
| uncomfortable, but carries little to no risk. This is all true.
| But it needs to be balanced against the fact that catching cancer
| early gives you an immensely better chance of surviving.
|
| The right numbers to look at are more like: How much does a
| screening program reduce overall mortality rates? How much does
| the screening program reduce breast cancer mortality rate? What
| are the impacts of this in terms of QALY (quality adjusted life
| years)? What are the impacts of the false positives in quality
| adjusted life years?
|
| I believe on balance the answer to these questions is that
| screening for breast cancer is on the whole beneficial. Here is
| one example study https://pubmed.ncbi.nlm.nih.gov/31098856/
|
| Now the question is, how will 3D affect these questions? That is
| as of yet unanswered as far as I know but we shouldn't let
| numbers like this influence the conversation prematurely.
| Aulig wrote:
| You're totally right. However, the study you link does not
| discuss mortality while taking overdiagnosis into account from
| what I can tell. Other studies (first one I found:
| https://pubs.rsna.org/doi/full/10.1148/radiol.11110210 ) can't
| confirm a significant reduction in mortality.
| dontreact wrote:
| Yeah I just linked that study because it was looking at QALY
| which is an aggregate metric so it should in some sense take
| into account the effects of overdiagnosis even if
| overdiagnosis is not explicitly handled.
|
| Here is another study looking at all-cause mortality, which
| again does implicitly handle overdiagnosis.
| https://www.ajronline.org/doi/full/10.2214/AJR.14.12666
| aeternum wrote:
| Yes, the problem is mostly one of presentation. To most people
| a positive test means it is likely they have it, which is often
| a statistically invalid conclusion.
|
| We should instead present it as 'you currently have breast
| cancer with 13% probability' (population avg with no tests), a
| positive test means you have it with 25% probability, a
| negative test with 3% probability.
|
| That makes the test a lot less scary, after all you're only
| gaining 10% certainty either direction.
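|
| A minimal Python sketch of that pre-test/post-test framing, using
| the standard odds-times-likelihood-ratio update. The sensitivity
| and specificity here are made-up values chosen so the outputs land
| near the illustrative 13% / 25% / 3% figures above; they are not
| real mammography numbers:
|
|     def post_test_prob(pre_test_prob, sens, spec, positive):
|         pre_odds = pre_test_prob / (1 - pre_test_prob)
|         lr = sens / (1 - spec) if positive else (1 - sens) / spec
|         post_odds = pre_odds * lr
|         return post_odds / (1 + post_odds)
|
|     print(post_test_prob(0.13, 0.87, 0.61, positive=True))   # ~0.25
|     print(post_test_prob(0.13, 0.87, 0.61, positive=False))  # ~0.03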
| conductr wrote:
| I don't believe we should present it as statistics. Most
| people would not understand or feel comforted by that
| explanation. 'you currently have breast cancer' is all the
| person will see, and the numbers and implications will be
| lost on them.
|
| I feel like this is where bedside manner comes into play and
| the human component of being a physician. Any doctor ordering
| a 3D mammogram, then receiving a positive result, should
| communicate to their patient that imaging is just a screening
| tool and not diagnostic. Many doctors will not even use the
| word "Positive" in the test result conversation. They'll use
| vague wording like, "we did happen to see something we'd like
| to investigate further to be sure, it's really hard to see
| exactly what's going on in your body on a camera so while I
| was hoping to avoid it I do believe a biopsy should be the
| next step". Yes, a biopsy is now warranted, but the patient
| should try to keep their spirits up and not stress too
| much at this point. The biopsy will give more information
| about what is going on inside the body and we can come up
| with a plan for whatever that is once we know more.
|
| It's natural/unavoidable for the patient to be
| worried/anxious but should not yet be scared and that is best
| conveyed in a human interaction with a caring physician.
| matheusmoreira wrote:
| Of course. It's a high sensitivity test. It's meant to screen
| patients for more specific tests such as biopsies. You want it to
| be positive for anything that even remotely resembles breast
| cancer; the result can be confirmed by additional testing.
| stewbrew wrote:
| Funny. The original article says
|
| "The findings of this study suggest that digital breast
| tomosynthesis is associated with a lower cumulative probability
| of false-positive results compared with digital mammography"
|
| Which is an important finding since today such screenings rely on
| digital mammography. So, tomography is good.
|
| For whatever reason, the cited article turns this upside down
| into:
|
| "Half of all women getting 3D mammograms will experience a false
| positive over a decade of annual screening"
|
| Which suggests that tomography performs really badly.
|
| If you do a mammography screening every two years, you do 5
| mammographies in 10 years. With a false positive, you do 5 and an
| MRI. So what. Doctors should always tell their patients that a
| positive screening mammography doesn't necessarily imply they
| actually have cancer.
| SanchoPanda wrote:
| Here is a sample Bayes rule calculator made in my favorite web
| tool of all time, instacalc. This feels like the right way to do
| the math here. Random numbers provided, fill in your own as you
| deem appropriate.
|
| <https://instacalc.com/55494/embed?d=&c=QWN0dWFsX3Byb2JhYmlsa...>
| ggm wrote:
| Half of all women getting a 2d mammogram too. So.. the problem is
| not 3d vs 2d, from my reading of the axios story.
| tonymet wrote:
| I'm hoping that with Covid more people are familiar with testing
| theory: i.e. sensitivity, specificity, false positives and false
| negatives.
|
| These testing systems are just one of many indicators that are
| meant to be used. If used in isolation, very harmful outcomes
| will occur.
| dontreact wrote:
| Yes exactly. No one is going to put you on chemo because of a
| mammogram. There are several followup tests including
| ultrasound and biopsy.
| tonymet wrote:
| So the millions of false positives will mean hundreds who die
| from biopsy infection and hundreds given false chemo.
___________________________________________________________________
(page generated 2022-03-25 23:00 UTC)