[HN Gopher] Show HN: Unsure Calculator - back-of-a-napkin probab...
___________________________________________________________________
Show HN: Unsure Calculator - back-of-a-napkin probabilistic
calculator
Author : filiph
Score : 799 points
Date : 2025-04-15 08:22 UTC (1 day ago)
(HTM) web link (filiph.github.io)
(TXT) w3m dump (filiph.github.io)
| rogueptr wrote:
| brilliant work, polished ui. although sometimes give wrong ranges
| for equations like 100/1~(200~2000)
| thih9 wrote:
| Can you elaborate? What is the answer you're getting and what
| answer would you expect?
| BrandoElFollito wrote:
| How do you process this equation ? 100 divided by something
| from one to ...?
| notfed wrote:
| > 100 / 4~6
|
| Means "100 divided by some number between 4 and 6"
| throwanem wrote:
| "...some number with a 95% probability of falling between
| 4.0 and 6.0 inclusive," I believe.
| BrandoElFollito wrote:
| Yes, but this is not what op has. Their formula is 100 /
| 1~(20~200), with a double tilde
| djoldman wrote:
| I perused the codebase but I'm unfamiliar with dart:
|
| https://github.com/filiph/unsure/blob/master/lib/src/calcula...
|
| I assume this is a montecarlo approach? (Not to start a flamewar,
| at least for us data scientists :) ).
| kccqzy wrote:
| Yes it is.
| porridgeraisin wrote:
| Can you explain how? I'm an (aspiring)
| kccqzy wrote:
| I didn't peruse the source code. I just read the linked
| article in its entirety and it says
|
| > The computation is quite slow. In order to stay as
| flexible as possible, I'm using the Monte Carlo method.
| Which means the calculator is running about 250K AST-based
| computations for every calculation you put forth.
|
| So therefore I conclude Monte Carlo is being used.
| constantcrying wrote:
| Line 19 to 21 should be the Monte-Carlo sampling algorithm.
| The implementation is maybe a bit unintuitive but
| apparently he creates a function from the expression in the
| calculator, calling that function gives a random value from
| that function.
| hawthorns wrote:
| It's dead simple. Here is the simplified version that
| returns the quantiles for '100 / 2 ~ 4'.
|     import numpy as np
|
|     def monte_carlo(formula, iterations=100000):
|         res = [formula() for _ in range(iterations)]
|         return np.percentile(res, [0, 2.5, *range(10, 100, 10), 97.5, 100])
|
|     def uncertain_division():
|         return 100 / np.random.uniform(2, 4)
|
|     monte_carlo(uncertain_division, iterations=100000)
| timothylaurent wrote:
| This reminds me of https://www.getguesstimate.com/ , a
| probabilistic spreadsheet.
| Recursing wrote:
| The authors of Guesstimate are now working on
| https://www.squiggle-language.com/
|
| Someone also turned it into the
| https://github.com/rethinkpriorities/squigglepy python library
| filiph wrote:
| Wow, this is fantastic! I did not know about squiggle
| language, and it's basically what I was trying to get to from
| my unsure calculator through my next project
| (https://filiph.github.io/napkin/). Squiggle looks and works
| much better.
|
| Thanks for the link!
| baq wrote:
| I was looking for this. Seen it (or a similar tool) ages ago.
|
| Want to use it every 3 months or so to pretend that we know
| what we can squeeze in the roadmap for the quarter.
| thih9 wrote:
| Feature request: allow specifying the probability distribution.
| E.g.: '~': normal, '_': uniform, etc.
| pyfon wrote:
| Not having this feature is a feature--they mention this.
| thih9 wrote:
| Not really, or at least not permanently; uniform distribution
| is mentioned in a github changelog, perhaps it's an upcoming
| feature:
|
| > 0.4.0
|
| > BREAKING: x~y (read: range from x to y) now means "flat
| distribution from x to y". Every value between x and y is as
| likely to be emitted.
|
| > For normal distribution, you can now use x+-d, which puts
| the mean at x, and the 95% (2 sigma) bounds at distance d
| from x.
|
| https://github.com/filiph/unsure/blob/master/CHANGELOG.md#04.
| ..
| tgv wrote:
| I think they should be functions: G(50, 1) for a Gaussian with
| u=50, s=1; N(3) for a negative exponential with l=3, U(0, 1)
| for a uniform distribution between 0 and 1, UI(1, 6) for an
| uniform integer distribution from 1 to 6, etc. Seems much more
| flexible, and easier to remember.
| rao-v wrote:
| This is terrific and it's tempting to turn into a little python
| package. +1 for notation to say it's ~20,2 to mean 18~22
| pvg wrote:
| Smol Show HN thread a few years ago
| https://news.ycombinator.com/item?id=22630600
| kccqzy wrote:
| I actually stumbled upon this a while ago from social media and
| the web version has a somewhat annoying latency, so I wrote my
| own version in Python. It uses numpy so it's faster.
| https://gist.github.com/kccqzy/d3fa7cdb064e03b16acfbefb76645...
| Thank you filiph for this brilliant idea!
| filiph wrote:
| Nice! Are you using your python script often?
|
| The reason I'm asking: unsure also has a CLI version (which is
| leaps and bounds faster and in some ways easier to use) but I
| rarely find myself using it. (Nowadays, I use
| https://filiph.github.io/napkin/, anyway, but it's still a web
| app rather than a CLI tool.)
| kccqzy wrote:
| Yes. I have Python on my phone so I just run it.
| throwanem wrote:
| I love this! As a tool for helping folks with a good base in
| arithmetic develop statistical intuition, I can't think offhand
| of what I've seen that's better.
| alexmolas wrote:
| is this the same as error propagation? I used to do a lot of that
| during my physics degree
| constantcrying wrote:
| It doesn't propagate uncertainty through the computation, but
| rather treats the expression as a single random variable.
| croisillon wrote:
| i like it and i skimmed the post but i don't understand why the
| default example 100 / 4~6 has a median of 20? there is no way of
| knowing why the range is between 4 and 6
| constantcrying wrote:
| The chance of 4~6 being less than 5 is 50%, the chance of it
| being greater is also 50%. The median of 100/4~6 has to be
| 100/5.
|
| >there is no way of knowing why the range is between 4 and 6
|
| ??? There is. It is the ~ symbol.
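The monotonicity argument above is easy to check by sampling (a sketch assuming the calculator's documented normal reading of 4~6, i.e. mean 5, sigma 0.5):

```python
import numpy as np

rng = np.random.default_rng(0)

# 4~6 per the calculator's docs: normal with mean 5 and 2 sigma = 1
samples = 100 / rng.normal(5, 0.5, size=250_000)

# Division by a positive variable is monotone, so the median of
# 100 / X is 100 / median(X) = 100 / 5 = 20.
median = np.median(samples)
```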
| perching_aix wrote:
| how do you mean?
| constantcrying wrote:
| An alternative approach is using fuzzy-numbers. If evaluated with
| interval arithmetic you can do very long calculations involving
| uncertain numbers very fast and with strong mathematical
| guarantees.
|
| It would especially outperform the Monte-Carlo approach
| drastically.
| sixo wrote:
| This assumes the inputs are uniform distributions, or perhaps
| normals depending on what exactly fuzzy numbers mean. M-C is
| not so limited.
| constantcrying wrote:
| No. It assumes the numbers aren't random at all.
|
| Although fuzzy-number can be used to model many different
| kinds of uncertainties.
| filiph wrote:
| I'm familiar with fuzzy numbers (e.g. see my
| https://filiph.net/fuzzy/ toy) but I didn't know there's
| arithmetic with fuzzy numbers. How is it done? Do you have
| a link?
| constantcrying wrote:
| There is a book by Hanss on it. It focuses on the
| sampling approach (he calls it "transformation method")
| though.
|
| If you want to do arithmetic and not a black box approach
| you just have to realize that you can perform them on the
| alpha-cuts with ordinary interval arithmetic. Then you
| can evaluate arbitrary expressions involving fuzzy
| numbers, keeping the strengths and weaknesses of interval
| arithmetic.
|
| The sampling based approach is very similar to Monte-
| Carlo, but you sample at certain well defined points.
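The alpha-cut idea above can be sketched in a few lines (my own illustration with triangular fuzzy numbers, not code from the book):

```python
# A triangular fuzzy number (a, b, c) has the alpha-cut
# [a + alpha*(b - a), c - alpha*(c - b)].
def alpha_cut(tri, alpha):
    a, b, c = tri
    return (a + alpha * (b - a), c - alpha * (c - b))

def interval_mul(p, q):
    # ordinary interval arithmetic for multiplication
    prods = [p[0] * q[0], p[0] * q[1], p[1] * q[0], p[1] * q[1]]
    return (min(prods), max(prods))

x = (4, 5, 6)   # "about 5"
y = (1, 2, 3)   # "about 2"

# Evaluate x * y level by level; each alpha-cut of the result is exact
# interval arithmetic, so no sampling is needed.
levels = [0.0, 0.5, 1.0]
result = [interval_mul(alpha_cut(x, a), alpha_cut(y, a)) for a in levels]
```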
| vessenes wrote:
| cool! are all ranges considered poisson distributions?
| re wrote:
| No:
|
| > Range is always a normal distribution, with the lower number
| being two standard deviations below the mean, and the upper
| number two standard deviations above. Nothing fancier is
| possible, in terms of input probability distributions.
| krick wrote:
| It sounds like a gimmick at first, but looks surprisingly useful.
| I'd surely install it if it was available as an app to use
| alongside my usual calculator, and while I cannot quite recall a
| situation when I needed it, it seems very plausible that I'll
| start finding use cases once I have it bound to some hotkey on my
| keyboard.
| NunoSempere wrote:
| > if it was available as an app
|
| Consider
| https://f-droid.org/en/packages/com.nunosempere.distribution...
| ttoinou wrote:
| Would be nice to retransform the output into an interval /
| gaussian distribution.
|
| > Note: If you're curious why there is a negative number (-5)
| > in the histogram, that's just an inevitable downside of the
| > simplicity of the Unsure Calculator. Without further
| > knowledge, the calculator cannot know that a negative number
| > is impossible
|
| The Drake Equation, or any equation multiplying probabilities,
| can also be seen in log space, where the uncertainty is on the
| scale of each probability, and the final probability is the
| product of exponentials of the log probabilities. And we
| wouldn't have this negative issue.
| hatthew wrote:
| The default example `100 / 4~6` gives the output `17~25`
| ttoinou wrote:
| Amazing, thank you !
| omoikane wrote:
| If I am reading this right, a range is expressed as a distance
| between the minimum and maximum values, and in the Monte Carlo
| part a number is generated from a uniform distribution within
| that range[1].
|
| But if I just ask the calculator "1~2" (i.e. just a range without
| any operators), the histogram shows what looks like a normal
| distribution centered around 1.5[2].
|
| Shouldn't the histogram be flat if the distribution is uniform?
|
| [1]
| https://github.com/filiph/unsure/blob/123712482b7053974cbef9...
|
| [2] https://filiph.github.io/unsure/#f=1~2
| hatthew wrote:
| Under the "Limitations" section:
|
| > Range is always a normal distribution, with the lower number
| being two standard deviations below the mean, and the upper
| number two standard deviations above. Nothing fancier is
| possible, in terms of input probability distributions.
| filiph wrote:
| Part of the confusion here is likely that the tool, as seen
| on the web, probably lags significantly behind the code. I've
| started using a related but different tool
| (https://filiph.github.io/napkin/).
|
| The HN mods gave me an opportunity to resubmit the link, so I
| did. If I had more time, I'd have also upgraded the tool to
| the latest version and fix the wording. But unfortunately, I
| didn't find the time to do this.
|
| Apologies for the confusion!
| marcodiego wrote:
| I put "1 / (-1~1)" and expected something ranging from -infinity
| to +infinity. It instead gave me -35~35.
|
| I really don't known how good it is.
| NunoSempere wrote:
| I'm guessing this is not an error. If you divide 1/normal(0,1),
| the full distribution would range from -inf to inf, but the 95%
| output doesn't have to.
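A quick numerical check of that guess (a sketch; assuming -1~1 is read as a normal with mean 0 and sigma 0.5, per the tool's documented convention):

```python
import numpy as np

rng = np.random.default_rng(0)

x = rng.normal(0.0, 0.5, size=250_000)   # -1~1: mean 0, 2 sigma = 1
y = 1.0 / x

# The full distribution of 1/x is unbounded, but the central 95%
# interval is finite -- analytically roughly -32 to +32, in the
# ballpark of the calculator's -35~35.
lo, hi = np.percentile(y, [2.5, 97.5])
```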
| SamBam wrote:
| I don't quite understand, probably because my math isn't good
| enough.
|
| If you're treating -1~1 as a normal distribution, then it's
| centered on 0. If you're working out the answer using a Monte
| Carlo simulation, then you're going to be testing out
| different values from that distribution, right? And aren't
| you going to be more likely to test values closer to 0? So
| surely the most likely outputs should be far from 0, right?
|
| When I look at the histogram it creates, it varies by run,
| but the most common output seems generally closest to zero
| (and sometimes is exactly zero). Wouldn't that mean that it's
| most frequently picking values closest to -1 or 1 in the
| denominator?
| pyfon wrote:
| Only 1 percent of values would end up being 100+ on a
| uniform distribution.
|
| For normal it is higher but maybe not much more so.
| lswainemoore wrote:
| That may be true, but if you look at the distribution it
| puts out for this, it definitely smells funny. It looks
| like a very steep normal distribution, centered at 0
| (ish). Seems like it should have two peaks? But maybe
| those are just getting compressed into one because of
| resolution of buckets?
| etbebl wrote:
| OK, but do we necessarily just care about the _central_
| 95% range of the output? This calculation has the weird
| property that values in the tails of the input correspond
| to values in the middle of the output, and vice versa. If
| you follow the intuition that the range you specify in
| the input corresponds to the values you expect to see,
| the corresponding outputs would really include -inf and
| inf.
|
| Now I'm realizing that this doesn't actually work, and
| even in more typical calculations the input values that
| produce the central 95% of the output are not necessarily
| drawn from the 95% CIs of the inputs. Which is fine and
| makes sense, but this example makes it very obvious how
| arbitrary it is to just drop the lowermost and uppermost
| 2.5%s rather than choosing any other 95/5 partition of
| the probability mass.
| gregschlom wrote:
| The ASCII art (well technically ANSI art) histogram is neat. Cool
| hack to get something done quickly. I'd have spent 5x the time
| trying various chart libraries and giving up.
| Retr0id wrote:
| On a similar note, I like the crude hand-drawn illustrations a
| lot. Fits the "napkin" theme.
| smartmic wrote:
| Here [1] is a nice implementation written in Awk. A bit rough
| around the edges, but could be easily extended.
|
| [1] https://github.com/stefanhengl/histogram
| NunoSempere wrote:
| I have written similar tools
|
| - for command line, fermi:
| https://git.nunosempere.com/NunoSempere/fermi
|
| - for android, a distribution calculator:
| https://f-droid.org/en/packages/com.nunosempere.distribution...
|
| People might also be interested in https://www.squiggle-
| language.com/, which is a more complex version (or possibly
| <https://git.nunosempere.com/personal/squiggle.c>, which is a
| faster but much more verbose version in C)
| NunoSempere wrote:
| Fermi in particular has the following syntax
|
| ```
|
| 5M 12M # number of people living in Chicago
|
| beta 1 200 # fraction of people that have a piano
|
| 30 180 # minutes it takes to tune a piano, including travel
| time
|
| / 48 52 # weeks a year that piano tuners work for
|
| / 5 6 # days a week in which piano tuners work
|
| / 6 8 # hours a day in which piano tuners work
|
| / 60 # minutes to an hour
|
| ```
|
| multiplication is implied as the default operation, fits are
| lognormal.
| NunoSempere wrote:
| Here is a thread with some fun fermi estimates made with that
| tool: e.g., number of calories NK gets from Russia:
| https://x.com/NunoSempere/status/1857135650404966456
|
| 900K 1.5M # tonnes of rice per year NK gets from Russia
|
| * 1K # kg in a tonne
|
| * 1.2K 1.4K # calories per kg of rice
|
| / 1.9K 2.5K # daily caloric intake
|
| / 25M 28M # population of NK
|
| / 365 # years of food this buys
|
| / 1% # as a percentage
| kqr wrote:
| Oh, this is very similar to what I have with Precel, less
| syntax. Thanks for sharing!
| antman wrote:
| I tried the unsure calc and the android app and they seem to
| produce different results?
| NunoSempere wrote:
| The android app fits lognormals, and 90% rather than 95%
| confidence intervals. I think they are a more parsimonious
| distribution for doing these kinds of estimates. One hint
| might be that, per the central limit theorem, sums of
| independent variables will tend to normals, which means that
| products will tend to be lognormals, and in the
| decompositions for which quick estimates are most useful,
| multiplications are more common.
| NunoSempere wrote:
| Another tool in this spirit is <https://carlo.app/>, which
| allows you to do this kind of calculation on google sheets.
| joshlemer wrote:
| Their pricing is absolutely out of this world though. Their
| BASIC plan is $2990 USD per year, the pro plan is $9990/year.
| https://carlo.app/pricing
| notpushkin wrote:
| Would be a nice touch if Squiggle supported the `a~b` syntax
| :^)
| alex-moon wrote:
| > The UI is ugly, to say the least.
|
| I actually quite like it. Really clean, easy to see all the
| important elements. Lovely clear legible monospace serif font.
| roughly wrote:
| I like this!
|
| In the grand HN tradition of being triggered by a word in the
| post and going off on a not-quite-but-basically-totally-
| tangential rant:
|
| There's (at least) three areas here that are footguns with these
| kinds of calculations:
|
| 1) 95% is usually a lot wider than people think - people take 95%
| as "I'm pretty sure it's this," whereas it's really closer to
| "it'd be really surprising if it were not this" - by and large
| people keep their mental error bars too close.
|
| 2) probability is rarely truly uncorrelated - call this the
| "Mortgage Derivatives" maxim. In the family example, rent is very
| likely to be correlated with food costs - so, if rent is high,
| food costs are also likely to be high. This skews the
| distribution - modeling with an unweighted uniform distribution
| will lead to you being surprised at how improbable the actual
| outcome was.
|
| 3) In general normal distributions are rarer than people think -
| they tend to require some kind of constraining factor on the
| values to enforce. We see them a bunch in nature because there
| tends to be negative feedback loops all over the place, but once
| you leave the relatively tidy garden of Mother Nature for the
| chaos of human affairs, normal distributions get pretty abnormal.
|
| I like this as a tool, and I like the implementation, I've just
| seen a lot of people pick up statistics for the first time and
| lose a finger.
| youainti wrote:
| > I've just seen a lot of people pick up statistics for the
| first time and lose a finger.
|
| I love this. I've never thought of statistics like a power tool
| or firearm, but the analogy fits really well.
| ninalanyon wrote:
| Unfortunately it's usually someone else who loses a finger,
| not the person wielding the statistics.
| btilly wrote:
| I strongly agree with this, and particularly point 1. If you
| ask people to provide estimated ranges for answers that they
| are 90% confident in, people on average produce roughly 30%
| confidence intervals instead. Over 90% of people don't even get
| to 70% confidence intervals.
|
| You can test yourself at https://blog.codinghorror.com/how-
| good-an-estimator-are-you/.
| Nevermark wrote:
| From link:
|
| > Heaviest blue whale ever recorded
|
| I don't think estimation errors regarding things outside of
| someone's area of familiarity say much.
|
| You could ask a much "easier" question from the same topic
| area and still get terrible answers: "What percentage of blue
| whales are blue?" Or just "Are blue whales blue?"
|
| Estimating something often encountered but uncounted seems
| like a better test. Like how many cars pass in front of my
| house every day. I could apply arithmetic, soft logic and
| intuition to that. But that would be a difficult question to
| grade, given it has no universal answer.
| yen223 wrote:
| I guess people didn't realise they are allowed to, and in
| fact are expected to, put very wide ranges for things they
| are not certain about.
| kqr wrote:
| I have no familiarity with blue whales but I would guess
| they're 1--5 times the mass of lorries, which I guess weigh
| like 10--20 cars which I in turn estimate at 1.2--2 tonnes,
| so primitively 12--200 tonnes for a normal blue whale. This
| also aligns with it being at least twice as large as an
| elephant, something I estimate at 5 tonnes.
|
| The question asks for the heaviest, which I think cannot be
| more than three times the normal weight, and probably no
| less than 1.3. That lands me at 15--600 tonnes using
| primitive arithmetic. The calculator in OP suggests 40--
| 320.
|
| The real value is apparently 170, but that doesn't really
| matter. The process of arriving at an interval that is as
| wide as necessary but no wider is the point.
|
| Estimation is a skill that can be trained. It is a generic
| skill that does not rely on domain knowledge beyond some
| common sense.
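That chain of guesses is plain interval multiplication, which can be reproduced directly (a sketch using the intervals from the comment):

```python
def mul(p, q):
    # all intervals here are positive, so endpoints multiply directly
    return (p[0] * q[0], p[1] * q[1])

lorries_per_whale = (1, 5)
cars_per_lorry = (10, 20)
tonnes_per_car = (1.2, 2.0)

normal_whale = mul(mul(lorries_per_whale, cars_per_lorry), tonnes_per_car)
# -> (12, 200) tonnes, as in the comment

heaviest = mul(normal_whale, (1.3, 3))
# -> about (15.6, 600) tonnes; the comment rounds to 15--600
```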
| peeters wrote:
| So the context of the quiz is software estimation, where I
| assume it's an intentional parable of estimating something
| you haven't seen before. It's trying to demonstrate that
| your "5-7 days" estimate probably represents far more
| certainty than you intended.
|
| For some of these, your answer could span orders of
| magnitude. E.g. my answer for the heaviest blue whale would
| probably be 5-500 tons because I don't have a good concept
| of things that weigh 500 tons. The important point is that
| I'm right around 9 times in 10, not that I had a precise
| estimate.
| duckmysick wrote:
| I don't know, an estimate spanning three orders of
| magnitude doesn't seem useful.
|
| To continue your example of 5-7 days, it would turn into
| an estimate of 5-700 days. So somewhere between a week or
| two years. And fair enough, whatever you're estimating
| will land somewhere in between. But how do I proceed from
| there with actual planning or budget?
| peeters wrote:
| I mean it's no less useful than a more precise, but less
| certain estimate. It means you either need to do some
| work to improve your certainty (e.g. in the case of this
| quiz, allow spending more than 10 minutes or allow
| research) or prepare for the possibility that it's 700
| days.
|
| Edit: And by the way given a large enough view, estimates
| like this can still be valuable, because when you add
| these estimates together the resulting probability
| distribution narrows considerably. e.g. at just 10 tasks
| of this size, you get a 95% CI of 245~460 per task. At
| 20, 225~430 per task.
|
| Note that this is obviously reductive as there's no way
| an estimate of 5-700 would imply a normal distribution
| centred at 352.5, it would be more like a logarithmic
| distribution where the mean is around 10 days. And
| additionally, this treats each task as independent...i.e.
| one estimate being at the high end wouldn't mean another
| one would be as well.
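The narrowing claimed above can be reproduced by simulation (a sketch under the comment's own reductive assumption that each task is normal with 5 and 700 as the 2-sigma bounds):

```python
import numpy as np

rng = np.random.default_rng(0)

# Each task ~ N(352.5, 173.75): mean midway between 5 and 700,
# sigma a quarter of the range.
mu, sigma = 352.5, (700 - 5) / 4
n_tasks = 10

totals = rng.normal(mu, sigma, size=(250_000, n_tasks)).sum(axis=1)
# 95% interval of the total, expressed per task:
lo, hi = np.percentile(totals, [2.5, 97.5]) / n_tasks
# shrinks to roughly 245~460 per task, matching the comment
```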
| throwup238 wrote:
| _> But how do I proceed from there with actual planning
| or budget?_
|
| You make up the number you wanted to hear in the first
| place that ostensibly works with the rest of the
| schedule. That's why engineering estimates are so useless
| - it's not that they're inaccurate or unrealistic - it's
| that if we insisted on giving them realistic estimates
| we'd get fired and replaced by someone else who is
| willing to appease management and just kick the can down
| the road a few more weeks.
| MichaelDickens wrote:
| It shouldn't matter how familiar you are with the question.
| If you're pretty familiar, give a narrow 90% credence
| interval. If you're unfamiliar, give a wide interval.
| pertdist wrote:
| I did a project with non-technical stakeholders modeling likely
| completion dates for a big GANTT chart. Business stakeholders
| wanted probabilistic task completion times because some of the
| tasks were new and impractical to quantify with fixed times.
|
| Stakeholders really liked specifying work times as t_i ~
| PERT(min, mode, max) because it mimics their thinking and
| handles typical real-world asymmetrical distributions.
|
| [Background: PERT is just a re-parameterized beta distribution
| that's more user-friendly and intuitive
| https://rpubs.com/Kraj86186/985700]
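The re-parameterization is compact enough to sketch (assuming the standard PERT form with shape weight 4; `pert` is my own helper, not code from the linked page):

```python
import numpy as np

rng = np.random.default_rng(0)

def pert(mn, mode, mx, lam=4.0, size=100_000):
    # PERT(min, mode, max) is a scaled Beta(alpha, beta) with
    #   alpha = 1 + lam * (mode - min) / (max - min)
    #   beta  = 1 + lam * (max - mode) / (max - min)
    alpha = 1 + lam * (mode - mn) / (mx - mn)
    beta = 1 + lam * (mx - mode) / (mx - mn)
    return mn + (mx - mn) * rng.beta(alpha, beta, size=size)

t = pert(2, 4, 10)   # asymmetric task-time estimate, in days
# mean of PERT is (min + lam*mode + max) / (lam + 2) = 28/6 here
```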
| baq wrote:
| arguably this is how it should always be done, fixed
| durations for any tasks are little more than wishful
| thinking.
| kqr wrote:
| This looks like a much more sophisticated version of PERT
| than I have seen used. When people around me have claimed to
| use PERT, they have just added together all the small
| numbers, all the middle numbers, and all the big numbers.
| That results in a distribution that is too extreme in both
| lower and upper bound.
| baq wrote:
| that... is not PERT. it's 'I read a tweet about three point
| estimates' and I'm using a generous interpretation of
| _read_
| jrowen wrote:
| This jibes with my general reaction to the post, which was that
| the added complexity and difficulty of reasoning about the
| ranges actually made me feel less confident in the result of
| their example calculation. I liked the $50 result, you can tack
| on a plus or minus range but generally feel like you're about
| breakeven. On the other hand, "95% sure the real balance will
| fall into the -$60 to +$220 range" feels like it's creating a
| false sense of having more concrete information when you've
| really just added compounding uncertainties at every step (if
| we don't know that each one is definitely 95%, or the true
| min/max, we're just adding more guesses to be potentially wrong
| about). That's why I don't like the Drake equation, every step
| is just compounding wild-ass guesses, is it really producing a
| useful number?
| kqr wrote:
| It is producing a useful number. As more truly independent
| terms are added, error grows with the square root while the
| point estimation grows linearly. In the aggregate, the error
| makes up less of the point estimation.
|
| This is the reason Fermi estimation works. You can test
| people on it, and almost universally they get more accurate
| with this method.
|
| If you got less certain of the result in the example, that's
| probably a good thing. People are default overconfident with
| their estimated error bars.
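The sqrt-vs-linear growth is visible in a few lines (a sketch with made-up 100+-20 terms):

```python
import numpy as np

rng = np.random.default_rng(0)

# n independent terms, each estimated at 100 +- 20 (so sigma = 10)
rel = {}
for n in (1, 4, 16, 64):
    sums = rng.normal(100, 10, size=(100_000, n)).sum(axis=1)
    # the point estimate (mean) grows like n, the error (std) like
    # sqrt(n), so relative error shrinks as 1/sqrt(n):
    # 10%, 5%, 2.5%, 1.25%
    rel[n] = sums.std() / sums.mean()
```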
| pests wrote:
| > People are default overconfident with their estimated
| error bars.
|
| You say this but yet roughly in a top level comment
| mentions people keep their error bars too close.
| bigfudge wrote:
| They are meaning the same thing. The original comment
| pointed out that people's qualitative description and
| mental model of the 95% interval means they are
| overconfident... they think 95 means 'pretty sure I'm
| right' rather than 'it would be surprising to be wrong'
| kqr wrote:
| Sorry, my comment was phrased confusingly.
|
| Being overconfident with error bars means placing them
| too close to the point estimation, i.e. the error bars
| are too narrow.
| jrowen wrote:
| Read a bit on Fermi estimation, I'm not quite sure exactly
| what the "method" is in contrast to a less accurate method,
| it's basically just getting people to think in terms of
| dimensional analysis? This passage from the Wikipedia is
| interesting:
|
| _By contrast, precise calculations can be extremely
| complex but with the expectation that the answer they
| produce is correct. The far larger number of factors and
| operations involved can obscure a very significant error,
| either in mathematical process or in the assumptions the
| equation is based on, but the result may still be assumed
| to be right because it has been derived from a precise
| formula that is expected to yield good results._
|
| So the strength of it _is_ in keeping it simple and not
| trying to get too fancy, with the understanding that it's
| just a ballpark/sanity check. I still feel like the Drake
| equation in particular has too many terms for which we
| don't have enough sample data to produce a reasonable
| guess. But I think this is generally understood and it's
| seen as more of a thought experiment.
| roughly wrote:
| I think the point is to create uncertainty, though, or to at
| least capture it. You mention tacking a plus/minus range to
| $50, but my suspicion is that people's expected plus/minus
| would be narrower than the actual - I think the primary value
| of the example is that it makes it clear there's a very real
| possibility of the outcome being negative, which I don't
| think most people would acknowledge when they got the initial
| positive result. The increased uncertainty and the decreased
| confidence in the result is a feature, not a bug.
| larodi wrote:
| Actually using it already after finding it a few days ago on HN
| jbjbjbjb wrote:
| I think to do all that you'd need a full on DSL rather than
| something pocket calculator like. I think adding a triangular
| distribution would be good though.
| rssoconnor wrote:
| Normal distributions are the maximum entropy distributions for
| a given mean and variance. Therefore, in accordance with the
| principle of maximum entropy, unless you have some reason to
| not pick a normal distribution (e.g. you know your values must
| be non-negative), you should be using a normal distribution.
| kqr wrote:
| > you should be using a normal distribution.
|
| ...if the only things you know about an uncertain value are
| its expectation and variance, yes.
|
| Often you know other things. Often you _don't_ know
| expectation and variance with any certainty.
| tgv wrote:
| At least also accept a log-normal distribution. Sometimes you
| need a factor like .2 ~ 5, but that isn't the same as N(2.6,
| 1.2).
| JKCalhoun wrote:
| > 2) probability is rarely truly uncorrelated
|
| Without having fully digested how the Unsure Calculator
| computes, it seems to me you could perhaps "weight" the ranges
| you pass to the calculator. Rather than a standard bell curve
| the Calculator could apply a more tightly focused -- or perhaps
| skewed curve for that term.
|
| If you think your salary will be in the range of 10 to 20, but
| more likely closer to 10 you could:
|
| 10<~20 (not to be confused with less-than)
|
| or: 10!~20 (not to be confused with factorial)
|
| or even: 10~12~20 to indicate a range of 10 to 20 ... leaning
| toward 12.
| gamerDude wrote:
| Great points. I think the idea of this calculator could just be
| simply extended to specific use cases to make the statistical
| calculation simple and take into account additional variables.
| Moving being one example.
| chris_wot wrote:
| There's an amazing scene in "This is Spinal Tap" where Nigel
| Tufnel had been brainstorming a scene where Stonehenge would be
| lowered from above onto the stage during their performance, and
| he does some back of the envelope calculations which he gives to
| the set designer. Unfortunately, he mixes the symbol for feet
| with the symbol for inches. Leading to the following:
|
| https://www.youtube.com/watch?v=Pyh1Va_mYWI
| vortico wrote:
| Cool! Some random requests to consider: Could the range x~y be
| uniform instead of 2 std dev normal (95.4%ile)? Sometimes the
| range of quantities is known. 95%ile is probably fine as a
| default though. Also, could a symbolic JS package be used instead
| of Monte-Carlo? This would improve speed and precision,
| especially for many variables (high dimensions). Could the result
| be shown in a line plot instead of ASCII bar chart?
| OisinMoran wrote:
| This is neat! If you enjoy the write up, you might be interested
| in the paper "Dissolving the Fermi Paradox" which goes even more
| in-depth into actually multiplying the probability density
| functions instead of the common point estimates. It has the
| somewhat surprising result that we may just be alone.
|
| https://arxiv.org/abs/1806.02404
| drewvlaz wrote:
| This was quite a fun read, thanks!
| baq wrote:
| a bit depressing TBH... but ~everyone on this site should read
| this for the methodology
| nritchie wrote:
| Here (https://uncertainty.nist.gov/) is another similar Monte
| Carlo-style calculator designed by the statisticians at NIST. It
| is intended for propagating uncertainties in measurements and can
| handle various different assumed input distributions.
| filiph wrote:
| I think I was looking at this and several other similar
| calculators when creating the linked tool. This is what I mean
| when I say "you'll want to use something more sophisticated".
|
| The problem with similar tools is that of the very high barrier
| to entry. This is what my project was trying to address, though
| imperfectly (the user still needs to understand, at the very
| least, the concept of probability distributions).
| ashu1461 wrote:
| So is it 250k calculations for every approximation window? So I
| guess it will only be able to calculate up to 3-4 approximations
| comfortably?
|
| Any reason why it was kept at 250k and not a lower number like
| 10k?
| Aachen wrote:
| https://qalculate.github.io can do this also for as long as I've
| used it (only a couple years to be fair). I've got it on my
| phone, my laptop, even my server with apt install qalc. Super
| convenient, supports everything from unit conversion to
| uncertainty tracking
|
| The histogram is neat, I don't think qalc has that. On the other
| hand, it took 8 seconds to calculate the default (exceedingly
| trivial) example. Is that JavaScript, or is the server currently
| very busy?
| filiph wrote:
| It's all computed in the browser so yeah, it's JavaScript.
| Still, 8 seconds is a lot -- I was targeting sub-second
| computation times (which I find alright).
| internetter wrote:
| Yes! (5+-6)*(9+-12) => 45+-81. Uncertainty propagation!
| explosion-s wrote:
| I made one that's much faster because it instead modifies the
| normal distribution instead of sending thousands of samples:
| https://gistpreview.github.io/?757869a716cfa1560d6ea0286ee1b...
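The analytic idea described here, propagating distribution parameters directly instead of sampling, can be sketched for addition, where it is exact for independent normals. This is my own illustration, not the linked gist's code:

```python
import math

# Read x~y as a normal: mean (x+y)/2, sd (y-x)/4 (range = 95% interval).
def from_range(lo, hi):
    return ((lo + hi) / 2, (hi - lo) / 4)

# For independent normals addition is exact: means add, variances add.
def add(a, b):
    return (a[0] + b[0], math.hypot(a[1], b[1]))

def to_range(dist):
    mean, sd = dist
    return (mean - 2 * sd, mean + 2 * sd)

print(to_range(add(from_range(4, 6), from_range(9, 11))))
```

The catch, as noted in the reply, is that non-linear operations like exponentiation don't stay normal, which is where the sampling approach keeps its flexibility.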
| etbebl wrote:
| This is more limited. I just tested and for one example,
| exponentiation seems not to be supported.
| lorenzowood wrote:
| See also Guesstimate https://getguesstimate.com. Strengths
| include treating label and data as a unit, a space for examining
| the reasoning for a result, and the ability to replace an
| estimated distribution with sample data => you can build a model
| and then refine it over time. I'm amazed Excel and Google Sheets
| still haven't incorporated these things, years later.
| montag wrote:
| Thank you, I would have mentioned this myself, but forgot the
| name of it.
| BOOSTERHIDROGEN wrote:
| awesome
| kqr wrote:
| I have made a similar tool but for the command line[1] with
| similar but slightly more ambitious motivation[2].
|
| I really like that more people are thinking in these terms.
| Reasoning about sources of variation is a capability not all
| people are trained in or develop, but it is increasingly
| important.[3]
|
| [1]: https://git.sr.ht/~kqr/precel
|
| [2]: https://entropicthoughts.com/precel-like-excel-for-
| uncertain...
|
| [3]: https://entropicthoughts.com/statistical-literacy
| nkron wrote:
| Really cool! On iOS there's a noticeable delay when clicking the
| buttons and clicking the backspace button quickly zooms the page
| so it's very hard to use. Would love it in mobile friendly form!
| your_challenger wrote:
| Very cool. This can also be used for LLM cost estimation.
| Basically any cost estimation I suppose. I use cloudflare workers
| a lot and have a few workers running for a variable amount of
| time. This could be useful to calculate a ball park figure of my
| infra cost. Thank you!
| danpalmer wrote:
| This is awesome. I used Causal years ago to do something similar,
| with perhaps slightly more complex modelling, and it was great.
| Unfortunately the product was targeted at high paying enterprise
| customers and seems to have pivoted into finance now, I've been
| looking for something similar ever since. This probably solves at
| least, err... 40~60% of my needs ;)
| chacha21 wrote:
| Chalk also supports uncertainty :
| https://chachatelier.fr/chalk/chalk-features.php (combined with
| arbitrary long numbers and interval arithmetic)
| NotAnOtter wrote:
| This is super cool.
|
| It seems to break for ranges including 0 though
|
| 100 / -1~1 = -3550~3500
|
| I think the most correct answer here is -inf~inf
| filiph wrote:
| I'd argue this is working as intended (WAI).
|
| It's hard for me to imagine _dividing_ by -1~1 in a real-world
| scenario, but let's say we divide by 0~10, which also includes
| zero. For example, we are dividing the income between 0 to 10
| shareholders (still forced, but ok).
|
| Clearly, it's possible to have a division by zero here, so "0
| shareholders would each get infinity". And in fact, if you try
| to compute 500 / 0, or even 500~1000 / 0, it will correctly
| show infinity.
|
| But if you divide by a range that merely _includes_ zero, I
| don't think it should give you infinity. Ask yourself this:
| do 95% of the results of 500 / 0~10 become infinity?
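A quick simulation of 500 / 0~10 under the normal reading of the range (my own sketch, not the calculator's Dart code) illustrates the point: individual samples near zero blow up, but the reported percentiles stay finite:

```python
import random

random.seed(0)

# Read 0~10 as a normal with mean 5 and sd 2.5 (range = 95% interval).
samples = [random.gauss(5, 2.5) for _ in range(100_000)]
results = sorted(500 / s for s in samples if s != 0)

# Draws near zero produce huge individual results, but they are rare,
# so the reported 2.5th/97.5th percentiles stay finite.
lo = results[int(0.025 * len(results))]
hi = results[int(0.975 * len(results))]
print(round(lo), round(hi))
```

Only a tiny fraction of draws land close enough to zero to explode, which is why the 95% interval never touches infinity even though individual outcomes can be arbitrarily large.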
| shubhamintech wrote:
| love it! gonna use this instead of calculating my own extremes
| now
| usgroup wrote:
| Interval/affine arithmetic are alternatives which do not make use
| of probabilities for these kinds of calculations.
|
| https://en.wikipedia.org/wiki/Interval_arithmetic
|
| I think arbitrary distribution choice is dangerous. You're bound
| to end up using lots of quantities that are integers, or positive
| only (for example). "Confidence" will be very difficult to
| interpret.
|
| Does it support constraints on solutions? E.g. A = 3~10, B = 4 -
| A, B > 0
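For contrast, a minimal sketch of interval arithmetic, including the kind of constraint asked about above (illustrative Python, not any particular library):

```python
# Plain interval arithmetic: hard bounds, no probabilities.
def i_sub(a, b):
    return (a[0] - b[1], a[1] - b[0])

def i_mul(a, b):
    p = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return (min(p), max(p))

def i_intersect(a, b):
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None  # None: constraint unsatisfiable

A = (3, 10)
B = i_sub((4, 4), A)                      # B = 4 - A = (-6, 1)
print(i_intersect(B, (0, float("inf"))))  # B > 0 narrows B to (0, 1)
```

Constraints become interval intersections here; systems like clpBNR (mentioned below) propagate such narrowing in both directions until a fixed point.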
| po1nt wrote:
| I love it! Now I need it in every calculator
| dmos62 wrote:
| Love it! I too have been toying with reasoning about uncertainty.
| I took a much less creative approach though and just ran a bunch
| of geometric brownian motion simulations for my personal finances
| [0]. My approach has some similarity to yours, though much less
| general. It displays the (un)certainty over time (using
| percentile curves), which was my main interest. Also, man, the
| UI, presentation, explanations: you did a great job, pretty
| inspiring.
|
| [0] https://dmos62.github.io/personal-financial-growth-
| simulator...
| 97-109-107 wrote:
| The histogram is great, nice work;
|
| I want to ask about adjacent projects - user interface libraries
| that provide input elements for entering ranges and approximate
| values. I'm starting my search around
| https://www.inkandswitch.com/ and
| https://malleable.systems/catalog/ but I think our collective
| memory has seen more examples.
| usgroup wrote:
| I think the SWI Prolog clpBNR package is the most complete
| interval arithmetic system. It also supports arbitrary
| constraints.
|
| https://github.com/ridgeworks/clpBNR
| elia_42 wrote:
| Interesting. I like the notation and the histogram that comes out
| with the output. I also like the practical examples you gave
| (e.g. the application of the calculator to business and marketing
| cases). I will try it out with simple estimates in my marketing
| campaigns.
| cluckindan wrote:
| "Without further knowledge, the calculator cannot know that a
| negative number is impossible (in other words, you can't have -5
| civilizations, for example)."
|
| Not true. If there are no negative terms, the equation cannot
| have negative values.
| kqr wrote:
| The calculator cannot know whether there are no negative terms.
| For example, if people's net worth is distributed 0.2-400,
| there's likely a significant chunk of people who are, on the
| whole, in debt. These will be represented as a negative term,
| even though their distribution was characterised by positive
| numbers.
| burning_hamster wrote:
| The range notation indicates 95% confidence intervals, not the
| minima and maxima. If the lower bounds are close enough to zero
| (and the interval is large enough), then there may be some
| residual probability mass associated with negative values of
| the variable.
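That residual mass is easy to quantify under the normal reading. A sketch (my own, assuming x~y means a 95% normal interval):

```python
import math

# Probability mass below zero for a normal whose 95% interval is lo~hi.
def p_negative(lo, hi):
    mean = (lo + hi) / 2
    sd = (hi - lo) / 4  # the 95% interval spans about 4 standard deviations
    z = -mean / sd
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))  # standard normal CDF

# The wide 304~10000 term alone puts a percent or two of mass below zero.
print(p_negative(304, 10000))
```

For a range like 4~6 the same function returns something astronomically small, which is why narrow ranges far from zero never show negative bars in the histogram.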
| ralferoo wrote:
| On the whole it seems like a nice idea, but there's a couple of
| weird things, such as:
|
| > Note: If you're curious why there is a negative number (-5) in
| the histogram, that's just an inevitable downside of the
| simplicity of the Unsure Calculator. Without further knowledge,
| the calculator cannot know that a negative number is impossible
| (in other words, you can't have -5 civilizations, for example).
|
| The input to this was "1.5~3 x 0.9~1.0 x 0.1~0.4 x 0.1~1.0 x
| 0.1~1.0 x 0.1~0.2 x 304~10000" - every single range was positive,
| so regardless of what this represents, it should be impossible to
| get a negative result.
|
| I guess this is a consequence of "I am not sure about the exact
| number here, but I am 95% sure it's somewhere in this range" so
| it's actually considering values outside of the specified range.
| In this case, 10% either side of all the ranges is positive
| except the large "304~10000".
|
| Trying with a simpler example: "1~2 x 1~2" produces "1.3~3.4" as
| a result, even though "1~4" seems more intuitive. I assume this
| is because the confidence of 1 or 4 is now only 90% if 1~2 was at
| 95%, but it still feels off.
|
| I wonder if the 95% thing actually makes sense, but I'm not
| especially good at stats, certainly not enough to be sure how
| viable this kind of calculator is with a tighter range. But just
| personally, I'd expect "1~2" to mean "I'm obviously not 100%
| sure, or else I wouldn't be using this calculator, but for this
| experiment assume that the range is definitely within 1~2, I just
| don't know where exactly".
| kqr wrote:
| The calculator in Emacs has support for what it is you request,
| which it calls "interval forms". Interval form arithmetic
| simply means executing the operations in parallel on both ends
| of the interval.
|
| It also has support for "error forms" which is close to what
| the calculator in OP uses. That takes a little more
| sophistication than just performing operations on the lower and
| upper number in parallel. In particular, the given points don't
| represent actual endpoints on a distribution, but rather low
| and high probability events. Things more or less likely than
| those can happen, it's just rare.
|
| > I'm not especially good at stats
|
| It shows! All the things you complain about make perfect sense
| given a little more background knowledge.
| OisinMoran wrote:
| Is it actually just doing it at both ends or something more
| complex? Because for example if I did 7 - (-1~2)^2 the actual
| range would be 3-7 but just doing both ends of the interval
| would give 3-6 as the function is maximised inside the range.
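The difference is easy to demonstrate with a sketch (my own illustration, independent of what Emacs calc actually does):

```python
# f is not monotonic on the interval, so endpoint evaluation misses
# the interior maximum at x = 0.
def f(x):
    return 7 - x ** 2

endpoints = [f(-1), f(2)]
print(min(endpoints), max(endpoints))  # 3 6: the naive endpoint answer

# A dense sweep over [-1, 2] recovers the true range 3~7.
values = [f(k / 100) for k in range((-100), 201)]
print(min(values), max(values))  # 3.0 7.0
```

Proper interval arithmetic avoids the sweep by composing per-operation rules (the square of [-1, 2] is [0, 4], not [1, 4]), which handles interior extrema correctly.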
| kqr wrote:
| Oh, maybe it's performing more complicated interval
| arithmetic. I had no idea. That's kind of cool!
| perlgeek wrote:
| > every single range was positive, so regardless of what this
| represents, it should be impossible to get a negative result.
|
| They explain that the range you give as input is seen as only
| being 95% correct, so the calculator adds low-probability
| values outside of the ranges you specified.
|
| I can see how that surprises you, but it's also a defensible
| design choice.
| constantcrying wrote:
| >The input to this was "1.5~3 x 0.9~1.0 x 0.1~0.4 x 0.1~1.0 x
| 0.1~1.0 x 0.1~0.2 x 304~10000" - every single range was
| positive, so regardless of what this represents, it should be
| impossible to get a negative result.
|
| Every single range here includes positive and negative numbers.
| To get the correct resulting distribution you have to take into
| account the entire input distribution. All normal distributions
| have a non-zero possibility to be negative.
|
| If you want to consider only the numbers inside the range you
| can look at interval arithmetic, but that does not give you a
| resulting distribution.
| ThouYS wrote:
| similar to guesstimate, which does the same but for spreadsheets:
| https://www.getguesstimate.com/
| dejongh wrote:
| Cool. It would be great to extend with a confidence operator.
| Something like:
|
| Without default confidence: 0~9
|
| With confidence: 0%100~9%95
|
| We are sure it is 0 or more and we are 95% certain it is 9 or
| less.
|
| Would that work?
| thomascountz wrote:
| This reminded me of this submission a few days ago: Napkin Math
| Tool[1].
|
| [1]: https://news.ycombinator.com/item?id=43389455
| godDLL wrote:
| So is it like plugging in a normal distribution into some
| arithmetic?
|
| Consider maybe 1 + 1 ~ +-2 like Q factor, if you know what I
| mean.
|
| That would help to filter out more probabilistic noise in using
| it to help reason with.
| constantcrying wrote:
| No. It is sampling the resulting distribution with Monte-Carlo.
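A minimal sketch of that Monte Carlo approach, in Python rather than the project's Dart, treating each x~y as a 95% normal interval (an assumption based on the article):

```python
import random

random.seed(42)

# Each x~y term becomes a sampler; the whole expression is evaluated
# on fresh samples many times to build the output distribution.
def unsure(lo, hi):
    return lambda: random.gauss((lo + hi) / 2, (hi - lo) / 4)

def simulate(expr, n=250_000):
    results = sorted(expr() for _ in range(n))
    # Report the 2.5th and 97.5th percentiles as the output range.
    return results[int(0.025 * n)], results[int(0.975 * n)]

a, b = unsure(4, 6), unsure(9, 11)
lo, hi = simulate(lambda: a() + b())
print(round(lo, 1), round(hi, 1))
```

Because every sample runs the full expression, arbitrary formulas work without any symbolic analysis, which is the flexibility the author cites for accepting the 250K-evaluation cost.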
| henryaj wrote:
| Also very very good is Guesstimate -
| https://www.getguesstimate.com/.
| trieloff wrote:
| https://www.getguesstimate.com/ is this, as a spreadsheet
| spzzz wrote:
| This is really useful, but is this correct?
|
| persons = 10~15 // - 10~15
|
| budget = persons * 1~2 // - 12~27
|
| Should it not say 10-30?
| wongarsu wrote:
| If they are truly independent of each other some of the
| uncertainty cancels out. 10 people and a budget of $1/person
| are both unlikely events, and two unlikely events occurring
| independently of each other is even more unlikely. And because
| the calculator is not about the full range of possible values
| but about the values in the 95% confidence interval this leads
| to the outer edges of the range now falling outside the 95%
| confidence interval
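A quick simulation confirms the cancellation (my own sketch under the normal reading of x~y):

```python
import random

random.seed(1)

def ci95(xs):
    s = sorted(xs)
    return s[int(0.025 * len(s))], s[int(0.975 * len(s))]

n = 200_000
persons = [random.gauss(12.5, 1.25) for _ in range(n)]  # 10~15
rate = [random.gauss(1.5, 0.25) for _ in range(n)]      # 1~2
budget = [p * r for p, r in zip(persons, rate)]

# Both extremes rarely coincide, so the product's 95% interval is
# narrower than the naive endpoint range 10*1 ~ 15*2 = 10~30.
print(ci95(budget))
```

The interval lands close to the 12~27 reported above, well inside the naive 10~30.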
| peeters wrote:
| Is there a way to do non-scalar multiplication? E.g if I want to
| say "what is the sum of three dice rolls" (ignoring the fact that
| that's not a normal distro) I want to do 1~6 * 3 = 1~6 + 1~6 +
| 1~6 = 6~15. But instead it does 1~6 * 3 = 3~18. It makes it
| really difficult to do something like "how long will it take to
| complete 1000 tasks that each take 10-100 days?"
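The two interpretations can be compared directly by simulation (my own sketch, again treating x~y as a 95% normal interval):

```python
import random

random.seed(7)

def sample():  # one draw from 1~6 read as a normal (mean 3.5, sd 1.25)
    return random.gauss(3.5, 1.25)

def ci95(xs):
    s = sorted(xs)
    return s[int(0.025 * len(s))], s[int(0.975 * len(s))]

n = 100_000
scaled = [3 * sample() for _ in range(n)]                    # 1~6 * 3
summed = [sample() + sample() + sample() for _ in range(n)]  # 1~6 + 1~6 + 1~6

# Scaling one draw triples its spread; summing three independent draws
# only grows the spread by sqrt(3), so the summed interval is tighter.
print(ci95(scaled))
print(ci95(summed))
```

For 1000 independent tasks the gap is dramatic: the sum's spread grows by sqrt(1000) = ~32, not 1000, so scalar multiplication wildly overstates the uncertainty of the total.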
___________________________________________________________________
(page generated 2025-04-16 17:01 UTC)