[HN Gopher] Is Explainability the New Uncertainty?
___________________________________________________________________
Is Explainability the New Uncertainty?
Author : danso
Score : 31 points
Date : 2021-04-28 15:27 UTC (1 day ago)
(HTM) web link (statmodeling.stat.columbia.edu)
(TXT) w3m dump (statmodeling.stat.columbia.edu)
| kjjjjjjjjjjjjjj wrote:
| Can someone explain exactly _how_ and _why_ developers/ML people
| can't know why a neural network did something? I keep reading
| that it is a black box, but it absolutely has code running. I
| don't understand why that code just can't be analyzed.
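|
| (For illustration, a minimal sketch of why "it has code running"
| doesn't get you far: the executable part of a trained network is
| generic matrix arithmetic, and the behavior lives in the learned
| numbers. Sizes and values below are made up.)
|
|     import numpy as np
|
|     # Toy 2-layer "network": this code is the same for every model.
|     rng = np.random.default_rng(0)
|     W1, b1 = rng.normal(size=(4, 16)), np.zeros(16)  # learned weights
|     W2, b2 = rng.normal(size=(16, 1)), np.zeros(1)
|
|     def predict(x):
|         h = np.maximum(0, x @ W1 + b1)  # ReLU layer
|         return h @ W2 + b2              # output layer
|
|     # The "why" of any prediction is encoded in the numeric values
|     # of W1/W2, not in inspectable branches or rules. Printing the
|     # weights is easy; reading meaning out of them is the open
|     # problem people call the "black box".
|     print(predict(np.ones(4)))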
| behnamoh wrote:
| In what areas do we expect explainability to be crucial? It seems
| most recommender systems (think Amazon, Youtube, etc.) don't add
| much value by providing the reason as to why they're recommending
| a specific product/video. Are there areas where this could be
| problematic?
| potatoman22 wrote:
| Doctors generally won't trust or use a model unless they
| understand its reasoning.
| bumby wrote:
| I would expect any safety-critical application to have a
| higher bar for explainability.
|
| As an extreme example, if image recognition is used for anti-
| aircraft defense, I think it would be nice to understand how it
| differentiates between an enemy plane vs. a domestic air
| carrier. Knowing its accuracy is probably not enough.
| neatze wrote:
| Making all actions and effects explicitly explainable in terms
| of why they happen seems, to me, closer to crystal-ball
| territory than to science.
|
| An extreme example: gravity is not explainable, and yet I don't
| need analytical solutions to learn how to play tennis, baseball,
| or soccer.
| karpierz wrote:
| Explainability isn't needed when your training set encompasses
| your test set, in the conditions that are meaningful. You have
| a heuristic which says "stuff tends to fall down to Earth",
| which you learned by tossing things in the air. The relevant
| condition you trained it on was "am on the surface of the
| Earth", which turns out to be the only thing you really need.
|
| However, if gravity were very sensitive to the number of cars on
| the road, or the number of street signs, I think you'd run into
| a lot of trouble without an underlying model.
| andy99 wrote:
| Your example is good, but I would argue it points to a good
| confidence (uncertainty) estimate as the most important
| element of trusting a model, rather than a human-
| interpretable explanation.
|
| Like you say, we want to know that the test data is
| distributed the same as the training data, and that our model
| repeatably gets predictions right in the presence of
| irrelevant perturbations (those two statements are at least
| partially equivalent). If we have this information, and are
| happy with the train/validation performance, then the actual
| process that our model uses to make a prediction really is
| not that important.
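|
| (A minimal sketch of one crude way to probe "the test data is
| distributed the same as the training data": per-feature two-
| sample KS tests. The data here is synthetic and the threshold
| arbitrary; note this only checks marginal feature distributions,
| not which features actually matter to the model.)
|
|     import numpy as np
|     from scipy.stats import ks_2samp
|
|     rng = np.random.default_rng(0)
|     X_train = rng.normal(size=(1000, 3))  # synthetic training features
|     X_test = rng.normal(size=(200, 3))    # synthetic test features
|     X_test[:, 2] += 0.5                   # simulate drift in one feature
|
|     for j in range(X_train.shape[1]):
|         stat, p = ks_2samp(X_train[:, j], X_test[:, j])
|         status = "possible drift" if p < 0.01 else "ok"
|         print(f"feature {j}: KS={stat:.3f} p={p:.4f} -> {status}")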
| karpierz wrote:
| How do you know whether the test data is distributed
| similarly to the training data without an underlying model
| of how the system works? Put differently, how do you know
| what matters for your tests?
|
| If you have a machine-learned model, you've likely trained
| it on sparse data. How do you know that your sparse data
| covers the set of signals required to make accurate
| predictions, and how do you know that your model actually
| uses those signals?
|
| Edit: I would say that to have a notion of confidence
| requires that you have an underlying model of how the
| system works (i.e., explainability).
| neatze wrote:
| You can list all system variables without any explanation
| whatsoever of how those variables relate to each other, and
| then build a model with sufficient performance but no explicit
| explanation.
| neatze wrote:
| I might be wrong here, but you also need an uncertainty
| estimate for the training/test data set in relation to the
| empirical domain, at least in cases with a prohibitively large
| problem state space.
| titzer wrote:
| So, basically, teaching. The best AI is a good teacher. What good
| is an AI if we can't learn something from it, after all? I would
| argue it actually has negative value if it takes over decision-
| making, as we slip further into learned helplessness and our
| expanding lack of agency leads to both oppression and depression.
| xbar wrote:
| What's the idiom involving a researcher/reporter announcing
| their name?
| threatofrain wrote:
| What about automated theorem proving? The math community overall
| seems to have accepted evidence without explanation.
| 6gvONxR4sf7o wrote:
| Have they? I thought automated theorem proving was still super
| niche.
| glup wrote:
| No.
| taeric wrote:
| The article makes a decent discussion piece. It does seem that
| both are pitched as panacea cures for why the models sometimes
| don't work.
|
| Combined with the idea that folks think models would be better
| used if they presented their uncertainty, I can see the direct
| line to models needing explainability before we deploy them.
|
| To that end, why do you think "no?"
| bumby wrote:
| Not the OP, but maybe you can help me understand the
| relationship.
|
| As I understand it, uncertainty is a statement of risk.
| Explainability is a statement of understanding how a system
| works to produce an outcome. None of the four NIST principles
| seem to conflate the two.
|
| I can say I understand how my brakes on my car may fail to
| work because it's an explainable mechanical system with known
| failure modes. However, that's different than the statement
| about the uncertainty that the brakes _will_ work as
| intended. In the latter, there is a statistical probability
| that gets translated to a risk statement. I think one needs
| to have an explainable system in order to arrive at an
| uncertainty risk statement. They are both related to quality,
| but speak to different aspects of the problem.
| taeric wrote:
| You are just highlighting that they are different things.
| The article seems to be pointing out that they are now
| getting used for the same reasons/aims.
|
| That is, yes, they do ultimately tell you different things.
| But, per the article, both can be used to push back on
| using a model.
|
| That is to say, in prior years, folks pushed back on models
| for them not presenting their uncertainty. Seems there is a
| growing push to push back if they do not present
| explainable reasons.
| bumby wrote:
| Ok, that's a better way to frame it than I was originally
| thinking. In that context, I'd say 'explainability' is a
| blunter instrument than 'uncertainty' for pushing back on a
| model.
|
| IMO, if explainability is the new way to push back on
| models we're uncomfortable with, it shouldn't be.
| Uncertainty arguments can be mathematically quantified
| and defended. Can the same be said for explainability?
| (Genuinely asking). If not, it's really just a less
| rigorous way of saying "I'm not comfortable with this
| model but I can't explain why."
| taeric wrote:
| My gut is that it is too easy to make these conversations
| basically people yelling past each other.
|
| As an example, you are treating uncertainty as a form of
| tolerance. But you have to explain that, as well. Why is
| one model 10% uncertain, but another is 30%?
|
| You could just say it is computed over the data the model was
| trained on, but if you can tie it back to the parameters the
| model uses, that may make something more obvious. And it is
| hard to treat uncertainty based on training data as something
| that transfers to unseen data.
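|
| (A minimal sketch of where such a number can even come from:
| one common mechanism is disagreement across an ensemble, so the
| "10% vs. 30%" is itself an artifact of the model and its
| training data. Data and model below are made up.)
|
|     import numpy as np
|     from sklearn.ensemble import RandomForestClassifier
|
|     rng = np.random.default_rng(0)
|     X = rng.normal(size=(500, 4))
|     y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # synthetic labels
|
|     forest = RandomForestClassifier(n_estimators=100,
|                                     random_state=0).fit(X, y)
|
|     x_new = rng.normal(size=(1, 4))
|     votes = np.array([t.predict(x_new)[0]
|                       for t in forest.estimators_])
|     p = votes.mean()  # fraction of trees voting class 1
|     print(f"predicted P(class 1) ~ {p:.2f} "
|           "(i.e. how split the 100 trees are)")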
| troelsSteegin wrote:
| Explainability is the new credulity. The existence of an
| explanation system in adherence to the NIST's proposed guidelines
| will in itself signal that the underlying recommender can be
| trusted -- without needing to check the actual explanations. The
| NIST proposals are interesting, though, and it would be a good
| challenge to make them work.
|
| EDIT: adding commentary.
|
| The four principles (quoting Hullman quoting NIST) are: 1) AI
| systems should deliver accompanying evidence or reasons for all
| their outputs. 2) Systems should provide explanations that are
| meaningful or understandable to individual users. 3) The
| explanation correctly reflects the system's process for
| generating the output. 4) The system only operates under
| conditions for which it was designed or when the system reaches a
| sufficient confidence in its output. (The idea is that if a
| system has insufficient confidence in its decision, it should not
| supply a decision to the user.)
|
| What do I think Hullman is getting at with "the New Uncertainty"?
|
| A major theme in Hullman's research, reductively, is
| understanding, and then supporting, how people make sense of
| uncertainty as mediated through data visualization. In an
| explanation system, replace the viz with a narrative of a
| solution process, as carried out by some algorithmic agent, or
| with some rationale or justification carried out by the same. The
| process of judgment is now uncertain as well. Explanations are
| uncertain narratives. How does a person make sense of that?
|
| The NIST #2 bar for what is "meaningful or understandable",
| without relying explicitly on the authority of the recommender,
| seems pretty high to me.
| jerf wrote:
| "The existence of an explanation system in adherence to the
| NIST's proposed guidelines will in itself signal that the
| underlying recommender can be trusted"
|
| Not for long. This is trivially forgeable with current AI tech.
| It's easy. Your AI tells you that X is in category Y because
| there's an 80% match on reason 1, a 65% match on reason 2, a
| 45% match for 3, etc. etc. Reason number 2 is, for the sake of
| argument, outright racist, and the person running this AI knows
| that, so they simply hand you an explanation with that reason
| removed. You have no way of knowing whether this has happened,
| neither does anyone else, and people continue to accuse your AI
| of bias. (Especially after I "helpfully" normalize the reason
| factors for you against the list I handed you, not what came
| out of the AI.)
|
| Any human-comprehensible explanation produced by a program is
| certainly human-editable, and almost certainly practically
| editable by code.
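|
| (A minimal, entirely hypothetical sketch of that editing step --
| the names and scores are invented; the point is only how little
| code sits between "what the model reported" and "what the user
| is shown".)
|
|     # Reasons as reported by the (hypothetical) model.
|     raw_reasons = [("reason 1", 0.80), ("reason 2", 0.65),
|                    ("reason 3", 0.45)]
|
|     # Quietly drop the inconvenient one...
|     shown = [(n, s) for n, s in raw_reasons if n != "reason 2"]
|
|     # ...and "helpfully" renormalize so nothing looks missing.
|     total = sum(s for _, s in shown)
|     shown = [(n, s / total) for n, s in shown]
|
|     print(shown)  # the user can't tell reason 2 was ever there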
|
| If you were going to design an AI architecture to systematically
| provide you with parallel-construction reasons, it would be hard
| to produce something better than a neural net.
| tshaddox wrote:
| Wouldn't a good explanation need to account for everything
| it's purporting to account for and not contain any details
| that can be omitted or altered without changing what it
| accounts for? An explanation that omits reason number 2
| would not account for why the output changes when the input
| to reason number 2 changes in isolation.
| aeternum wrote:
| There's pretty convincing evidence that even human brains tend
| to make decisions first then rationalize them after.
|
| I wonder if we could tell whether or not a sufficiently complex
| NN or AI were doing the same, or if it even matters. It at
| least feels somewhat useless if system A makes the decision but
| then system B comes up with justification (even if the
| justification is plausible).
| akiselev wrote:
| _> There 's pretty convincing evidence that even human brains
| tend to make decisions first then rationalize them after._
|
| The problem with that is we can't simulate human brains to
| test the integrity of that rationalization. Assuming the
| algorithm is repeatable, the data from the rationalization can
| be used to generate adversarial inputs from the original that
| test whether the conclusion is actually responding to those
| factors. At the very least, this would give courts an avenue to
| validate the algorithms in practice.
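|
| (A minimal sketch of that kind of check, with a hypothetical
| stand-in for a repeatable black-box model: vary only the factor
| the explanation cites and see whether the decision ever moves.)
|
|     import numpy as np
|
|     w = np.array([2.0, 0.0, -1.0])       # hypothetical black box
|     def predict(x):
|         return float(x @ w > 0)
|
|     x0 = np.array([0.4, 1.0, 0.3])
|     cited = 1          # the explanation claims feature 1 drove it
|
|     base = predict(x0)
|     changed = False
|     for delta in np.linspace(-2, 2, 41):
|         x = x0.copy()
|         x[cited] += delta               # perturb only the cited factor
|         changed = changed or predict(x) != base
|
|     # If the decision never changes while the cited factor sweeps a
|     # wide range, the stated factor likely isn't what the model is
|     # actually responding to.
|     print("decision responds to the cited factor:", changed)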
| troelsSteegin wrote:
| XAI research talks about exactly that, system B "surrogate
| models" that are consistent with the recommender right around
| the locus of a prediction. If the surrogate model is easy to
| interpret, you can forgo directly explaining the base
| recommender. I think a thing to keep in mind is that
| recommenders may be proprietary, and thus pointedly black
| box. I view interpretability as a conversation between the
| recommender operator and the user, subject, or regulator of
| the recommender that is unfortunately mediated by
| confidentiality. I view explanation as being like debugging,
| something you do in house in dev, and it bewilders me that
| one would launch a system that one can't explain. But,
| sometimes we make deployments first and then rationalize them
| after ...
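|
| (A minimal LIME-style sketch of such a surrogate: sample points
| near the prediction of interest, query the black box, and fit an
| easy-to-read linear model locally. The black box and numbers
| here are invented.)
|
|     import numpy as np
|     from sklearn.linear_model import Ridge
|
|     rng = np.random.default_rng(0)
|
|     def black_box(X):                 # hypothetical recommender score
|         return np.tanh(3 * X[:, 0]) + 0.1 * X[:, 1] ** 2
|
|     x0 = np.array([0.2, -0.5, 1.0])   # the prediction to explain
|
|     # Query the box in a small neighborhood around x0.
|     Xs = x0 + rng.normal(scale=0.1, size=(500, 3))
|     ys = black_box(Xs)
|
|     # The surrogate's coefficients serve as the local "explanation".
|     surrogate = Ridge(alpha=1e-3).fit(
|         Xs - x0, ys - black_box(x0[None, :]))
|     print("local feature weights:", np.round(surrogate.coef_, 3))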
| dataflow wrote:
| > There's pretty convincing evidence that even human brains
| tend to make decisions first then rationalize them after.
|
| Does rationalize here mean "deduce that a supporting argument
| would exist", or does it mean "actually determine _what_ that
| argument is "?
|
| Like imagine if the brain is solving an approximate (or
| different) problem first, deducing that there's a solution
| that's very likely to be feasible for the original problem,
| then working out all the details afterward. Is that
| rationalizing after the fact or before the fact?
| duckfang wrote:
| It should be blatantly obvious that it is, especially in cases
| where software is making actionable decisions.
|
| Is the software trained on sexist, ableist, racist, etc content
| that is illegal or deadly when applied to decisions? To answer
| that requires having the AI algorithm explain why its decision
| came out the way it did. If we could see the explanation, we
| could either consider it or rule it out.
|
| https://www.technologyreview.com/2019/01/21/137783/algorithm...
| is one such piece of software that was trained on racist data.
| Black people get harassed and sentenced more by police than
| others, and so criminal data carries a strong bias. This AI
| software was trained with that data, and thus perpetuates the
| same bias.
|
| https://becominghuman.ai/amazons-sexist-ai-recruiting-tool-h...
| is a hiring AI that Amazon used - turns out when you don't hire
| many women, and train the AI on the data, it's very good at
| finding women's names to deselect from hiring. It's the same
| trend - poisoned prior data leads to future poisoned data at
| scale.
|
| https://abilitynet.org.uk/news-blogs/ai-making-captcha-incre...
| And this is for general internet users, primarily ones who have
| auditory or visual handicaps. CAPTCHAS are increasingly hostile
| to disabled people. A CAPTCHA might as well state "no cripples".
| But this is done in the spirit of 'we don't want automated bots
| here' - and real humans suffer.
|
| Why was Watson, the IBM AI, cancelled? Because it kept
| recommending "treatments" that would kill its patients. At least
| it was probably killing people in a non-biased way.
| https://www.theverge.com/2018/7/26/17619382/ibms-watson-canc...
___________________________________________________________________
(page generated 2021-04-29 23:02 UTC)