[HN Gopher] Guide to Software Project Estimation
___________________________________________________________________
Guide to Software Project Estimation
Author : todsacerdoti
Score : 63 points
Date : 2021-08-03 11:31 UTC (11 hours ago)
(HTM) web link (www.scalablepath.com)
(TXT) w3m dump (www.scalablepath.com)
| gilbetron wrote:
| Nothing interesting in the article at all; I have no idea why it
| is on the front page of HN.
|
| As usual with software estimation, unless you've read this paper:
| http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.19.3...
|
| and have a thoughtful response to it (of which I've read some
| solid ones), I've very little interest in hearing what you have
| to say about software estimation.
| diiq wrote:
| 1) The big if is written out in the abstract -- "if it is
| accepted that algorithmic complexity is an appropriate
| definition of the complexity of a programming project."
| Relating algorithmic complexity to "how long software takes to
| write" seems, to me, to ignore that the vast majority of my
| time as a developer is spent discovering and communicating
| requirements, handling human questions, not writing novel code.
| The conclusion touches on this, but ignores it.
|
| 2) Even if you accept that "if", this is like a halting-problem
| proof. Fine; it is impossible to estimate the complexity of ALL
| software. That does not mean that it's useless to quantify the
| complexity of software in limited but well-behaved problem
| spaces. How much of any commercial project is actually spent
| working on the cutting edge of computer science, facing
  | complete unknowns? An estimate being occasionally wrong is a
  | price worth paying for most estimates being mostly right.
|
| 3) Why do you consider a 20 year old paper that's only been
| cited 16 times to be critical reading about estimation, when a
| _vast_ body of research in forecasting exists, written by
| people who have _measurements_ of the accuracy of estimates to
| base their theoretical models on?
| gilbetron wrote:
| > 3) Why do you consider a 20 year old paper that's only been
| cited 16 times to be critical reading about estimation, when
| a vast body of research in forecasting exists, written by
| people who have measurements of the accuracy of estimates to
| base their theoretical models on?
|
| Have some links to this vast body?
|
| I've encountered very few over the years that actually
| qualify as "science" as much of it is fake or close to fake.
|
| http://shape-of-code.coding-
| guidelines.com/2021/01/17/softwa...
| codemac wrote:
| That paper makes such a huge, and largely incorrect assumption
| in the abstract. I've never seen anyone make this argument.
|
| The old parable about bugs, that it's 10x harder to fix a bug
| than it is to write the bug, shows us that the time to
| implement something complex is not related to its complexity.
|
| For example, if I wanted to build something that randomly
| selects a function from GitHub and then runs that code on your
| laptop - I bet I could estimate and implement that code... but
| good luck ever defining its features, complexity, etc. in any
| mathematical way.
|
| But you don't need to define things that way to have reasonable
| estimates, for the same reason that when you build a shed in
| your yard, you don't need to understand all the physics of how
| it's held up.
| nicholasjarr wrote:
| Never been good at estimation. Software Estimation by Steve
| McConnell has been on my reading list for a long time now. From
| the little I have seen of it, it looks good (I already read Code
| Complete by him and recommend it). Do you guys have any tips
| for estimation?
| smallerfish wrote:
| Start by listing out the features in a spreadsheet. For each
| feature, think through it and list out the stories, one per
| row. Create a section per feature in the spreadsheet (i.e. put
| a line under each group of rows).
|
| For each story, add an initial estimate (in terms of developer-
| days). This is your "low" estimate. Now in a second column, add
| a "high" (potential-but-reasonable "worst case") estimate. If
| you're looking at more than 10-15 days for either column for a
| story you should probably break the story up some more.
|
| Now add a 3rd/4th column, which are the low/high estimates
| multiplied by 1.3 ("fudged low" / "fudged high"). Total up all
| stories per feature in a row at the bottom of each feature's
| section. Divide by team size, divide by business days, round up
| to nearest integer, and you have your calendar weeks for each
| feature.
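  |
  | The arithmetic above can be sketched in a few lines of Python
  | (the fudge factor is the 1.3 from the comment; the team size
  | and story numbers are illustrative assumptions, not part of
  | the method):

```python
import math

FUDGE = 1.3        # the 1.3 multiplier described above
TEAM_SIZE = 2      # illustrative team size
DAYS_PER_WEEK = 5  # business days per calendar week

def feature_weeks(stories, team_size=TEAM_SIZE):
    """stories: list of (low, high) developer-day estimates, one per story."""
    low = sum(lo for lo, hi in stories) * FUDGE    # "fudged low" total
    high = sum(hi for lo, hi in stories) * FUDGE   # "fudged high" total
    to_weeks = lambda days: math.ceil(days / team_size / DAYS_PER_WEEK)
    return to_weeks(low), to_weeks(high)

# Three hypothetical stories for one feature:
print(feature_weeks([(2, 4), (5, 10), (3, 6)]))  # -> (2, 3)
```

  | One advantage of keeping this in code rather than a
  | spreadsheet: the fudge factor becomes a single constant you
  | can recalibrate against past projects.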
|
| When sales/marketing ask you for estimates, you then respond
| "between X and Y calendar weeks from [completion of previous
| feature]". Just be aware that they will hear X, so make sure Y
| is very clearly included in every communication where the dates
| are being discussed. If "previous feature" slips, make sure to
| communicate clearly that "next feature" has also pushed back by
| however many weeks. You'll be tempted to, but don't be
| optimistic with progress reports or estimates of where you are
| in the range - undersell and over-deliver, and you'll keep more
| allies on the business side.
| jbay808 wrote:
| Whatever number you come up with, treat it as the _median of a
| long-tailed distribution_ (if it matters, the lognormal).
|
| To get the mean (expectation), multiply your estimate by about
| 1.6. To get the 95% confidence bound, multiply by 5. To get the
| 99% confidence bound, multiply your estimate by 10.
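  |
  | Those three multipliers are mutually consistent: a single
  | lognormal shape parameter of roughly sigma = 0.97 (a value I'm
  | inferring from the 1.6x mean, not one stated here) reproduces
  | all of them. A quick Python check:

```python
import math
from statistics import NormalDist

SIGMA = 0.97  # assumed lognormal shape parameter (inferred, see above)

def mean_multiplier(sigma=SIGMA):
    # mean / median = exp(sigma^2 / 2) for a lognormal distribution
    return math.exp(sigma ** 2 / 2)

def quantile_multiplier(p, sigma=SIGMA):
    # q_p / median = exp(sigma * z_p), with z_p the standard normal quantile
    return math.exp(sigma * NormalDist().inv_cdf(p))

print(mean_multiplier())          # ≈ 1.60 (mean)
print(quantile_multiplier(0.95))  # ≈ 4.93 (95% bound)
print(quantile_multiplier(0.99))  # ≈ 9.55 (99% bound)
```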
|
| Understand why a distribution results in different numbers for
| different audiences, and why that's not the same as being
| inconsistent.
|
| Use the mean for calculating sprint workload and capacity
| planning, because the average is what matters for that, not the
| accuracy of any single job. If your manager understands
| probability then give them all these numbers, otherwise give
| them the 95% confident value, which you should also give others
| internally who depend on that specific job being done. Give
| marketing the 99% confident number even if they understand
  | probability, because they're looking for a committed deadline
| they can use externally. They will push hard for an early date
  | because they want the work done quickly, but they actually
  | _don't_ want to hear your optimistic estimate. It's easy to
  | make that mistake.
|
| When requirements are understood, experienced developers are
| actually very, very good at estimating median completion times
| even just by gut feeling, but often fail to account for the
| distribution, especially when communicating with stakeholders,
| which makes them take heat when they're sometimes wrong by a
| factor of ten.
| quietbritishjim wrote:
| > Whatever number you come up with, treat it as the median of
| a long-tailed distribution (if it matters, the lognormal). To
| get the mean (expectation), multiply your estimate by about
| 1.6. To get the 95% confidence bound, multiply by 5. To get
| the 99% confidence bound, multiply your estimate by 10.
|
| I like the idea of multipliers but the maths here is just
| meaningless fluff to justify a particular number. If your
| initial estimate really was a median then it would be an
| overestimate (i.e. the project ends up taking less time) in
| about 50% of cases. In practice I find that initial estimates
| are overestimates in about 0% of cases!
| jbay808 wrote:
| > If your initial estimate really was a median then it
| would be an overestimate (i.e. the project ends up taking
| less time) in about 50% of cases
|
| It is, yes. And it frequently happens that you go to fix
| something, which seems really difficult, and then you
| realize that it's actually an easy fix or not a problem at
| all. But when you fix one thing in half the time you
| expect, and another in twice the time you expect, this
| doesn't average out, because the average of 0.5 and 2 is
| not 1.0.
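      |
      | The asymmetry is easy to see numerically (a two-task toy
      | example, not real data):

```python
import math

# One task finished in half the estimated time, another in double it:
ratios = [0.5, 2.0]  # actual / estimated

arithmetic_mean = sum(ratios) / len(ratios)
geometric_mean = math.prod(ratios) ** (1 / len(ratios))

print(arithmetic_mean)  # -> 1.25: over- and under-runs don't cancel
print(geometric_mean)   # -> 1.0: only the *multiplicative* average does
```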
|
| You might just be discarding the cases where estimates were
| found to be conservative, either because delays are more
| impactful and memorable, because the underestimates were
| close enough to be treated as on-time, or because the slack
| in the schedule was used to buy time for something else,
| originally out of scope, that was lumped in.
|
      | Anyway, these numbers aren't just pulled out of a hat. They
      | come from studying vast amounts of high-quality (but
| unfortunately, not publicly available) data collected
| comparing developer estimates and measured outcomes.
| ohthehugemanate wrote:
| That's just story points with extra steps.
|
| You're already using an abstracted measure of time, by
| working with a derivative value of "developer estimated
| hours". You're already doing timeline projections on the
| average throughput of your "adjusted developer hours" unit.
| That's most of the value right there.
|
| You can get even better results, with a little less cognitive
| load, by applying the research that people are much more
| consistent in estimating complexity than time (note that your
| method relies on consistency, not accuracy, to succeed). A
| quick imagination exercise validates this point for most of
| us: You bought a new IKEA sofa - how much time will it take
| to build? Honestly hard to do, and we're never accurate. But
| consider instead: how hard is it? Way easier to answer. And
| if you already know how long it takes you on average to
| finish other tasks of similar apparent difficulty...
|
| Try using your exact same system, but ask people to estimate
| the task in terms of complexity. Use any scale you like, as
| long as the units have consistent value in your developers'
| minds (I like "cups of coffee", personally). Make your Dev
| team agree on the difficulty score for each Feature, to
| ensure that consistency.
|
| Side benefit: Devs stop worrying about time and taking
| shortcuts (aka "technical debt") to meet their time estimate
| that you don't believe anyway. They're also a lot more likely
| to consider hidden risks and sources of extra complexity in
| the estimate.
|
| Then you just track the actual throughput with a confidence
| interval, and use that to make timeline projections with a
| confidence interval based on that tracking.
|
| TLDR: try asking Devs to estimate complexity rather than
| time, and use a moving average with confidence interval
| rather than the static 1.6 multiplier to make timeline
| projections. You'll find your projections more accurate and
| developers less stressed about it. You'll also have
| reinvented story points.
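    |
    | A minimal sketch of that projection, assuming normal-ish
    | sprint-to-sprint variation (the window size, z-value, and
    | sprint numbers below are made up for illustration):

```python
import statistics

def projected_weeks(throughputs, remaining, window=5, z=1.96):
    """Project remaining story points to sprints using a moving average
    of past throughput plus a simple normal-approximation band."""
    recent = throughputs[-window:]
    mean = statistics.mean(recent)
    margin = z * statistics.stdev(recent) / len(recent) ** 0.5
    best = remaining / (mean + margin)            # fast-throughput bound
    worst = remaining / max(mean - margin, 1e-9)  # slow-throughput bound
    return best, worst

# e.g. five past sprints of measured throughput, 40 points remaining:
lo, hi = projected_weeks([8, 10, 12, 9, 11], 40)
print(f"{lo:.1f} to {hi:.1f} sprints")  # -> 3.5 to 4.6 sprints
```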
| jbay808 wrote:
| Unfortunately, that just masks the difficulty by using
| ambiguous terms that nobody knows if they agree on, and
| makes communication hard with 3rd party stakeholders who
| don't share your conventions. When marketing wants to know
| when something will be done, we can argue about whether
| dev-weeks or calendar dates are more appropriate, but I
| think I'd get told right off if I tried to tell them it
| would take a hundred "story points".
|
| There's no shortcut to avoid the requirement to present
| different summary statistics to different stakeholders.
| It's a consequence of decision theory. Unless they're
| equipped to understand the whole distribution.
|
| It's also the wrong sort of rounding. I think an ikea sofa
| might take an hour, but if it took all day I'd be pretty
| shocked. But with software tasks, it's important to accept
| that the distribution is _long-tailed_. Sometimes it really
      | will take 10x as long as you expected, and that's not your
| fault. Story points would have to abandon all meaning to
| capture that much variance.
|
| I don't recommend incentivizing estimates, though. A big
| benefit of recognizing a developer estimate as short-hand
| for the median of a distribution is that when the time
| doesn't match the estimate, it doesn't mean the estimate
| was "wrong" or "bad", and the developer shouldn't feel bad.
| ohthehugemanate wrote:
        | Sorry if I was unclear: when talking to management
        | outside of the project, you express in terms of
        | time/calendar dates. The arbitrary units are just a more
        | accurate and less pressured way of getting to time
        | values than "developer estimated hours times a static
        | multiplier."
| jbay808 wrote:
| It's _not_ a static multiplier; I thought I was clear
          | that it's very much a _context-sensitive multiplier_,
| which depends on risk tolerance (which you get,
| straightforwardly, from how far you integrate the tail of
| the distribution).
| diiq wrote:
| Would upvote twice if I could -- These are 5-star rules of
| thumb.
|
| (I'm glad to see lognormal making more inroads in software
| estimation. McConnell is great, but assuming the normal
| distribution leads to some weird edge cases.)
| matttrotter wrote:
    | And prepare for them to give you strange faces! I had a
    | manager who looked at my estimate and told me to multiply it
    | by 3, which I thought was ludicrous. It turned out to be
    | accurate.
| commandlinefan wrote:
| > Never been good at estimation
|
| Me neither. When I was first starting out, that really stressed
| me out a lot until I realized that I didn't work with anybody
| else who was "good" at it - that is, I didn't work with or know
| anybody who could take a list of requirements written out in
| English and produce a timeline that had any relationship to how
| long the software would take to be ready to use.
|
| Been doing this professionally since 1992. I still haven't met
| anybody who was "good" at estimation.
| handrous wrote:
| I've reached the point where I think that if you're not going
| to do the NASA Space Shuttle program thing of specifying the
| whole program to the smallest detail before you start writing
| the actual code, you may as well just start working, release
| often, and evaluate periodically whether the thing looks on-
| track to be worth the cost, cancelling if it's not. Just
| spend the estimation money on development instead.
| qznc wrote:
| Second McConnell. It is a great reference for all kinds of
| estimations around software development.
|
| That is also its downside. It is a reference. Not a textbook to
| learn things in a pedagogical structure.
| nicholasjarr wrote:
    | Yeah, I noticed this when I tried to read it the last time.
    | I was expecting something more like Code Complete. I don't
    | know, maybe it's the subject: code is way more interesting
    | than estimation :)
| stronglikedan wrote:
| > Do you guys have any tips for estimation?
|
| Stick to your guns when Sales tries to get you to change your
| estimate (and they will). Tell them they can discount the
| project, or change any other variable they need to satisfy the
| customer, but don't ever let them touch the time estimate. Not
| really a tip for making the time estimate, but keep your ass
| covered once you do.
| okl wrote:
| Read that damn book :D It's a treasure trove of information,
| not only for estimating software projects! For example,
| learning to differentiate between estimate, target/goal, and
| commitment.
|
| Personally, I'm often dumbfounded that folks still use planning
| poker when there are so many more reliable methods as discussed
| in the book, e.g., wideband Delphi.
| mmcdermott wrote:
  | I loved McConnell's books, having read Code Complete, Software
  | Estimation and Rapid Development.
|
| Besides what is covered in those books, I've found it extremely
| useful to document assumptions. Every single estimate has some
| mental model of the project to be done. Code to be reused,
| vendors to integrate and, most importantly, things that won't
| be done. The real project almost always breaks with some of
| those high-level assumptions, but that tends to be lost in the
| shuffle.
|
| Attaching assumptions to the estimate makes it much easier to
| do a post-mortem.
|
| _A powerful memory cannot compare with pale ink._
| diiq wrote:
  | McConnell's 50/90 approach makes a _big_ difference in my
  | opinion because it lets you encode your uncertainty. The extra
  | math means you don't need to be _good_ at estimation as long
  | as you know _roughly_ how bad you are at it.
|
  | If that seems like too much effort, I also run
  | quotes.vistimo.com, which takes a similar (if slightly more
  | advanced) statistical approach but does all the math for you.
| quietbritishjim wrote:
| I like the advice in _Thinking Fast and Slow_ by Daniel
| Kahneman about estimating, which is not specific to software
| but still very applicable to it:
|
| Start with a known past project that is in some way similar in
| magnitude and adjust from there. For example, "this is twice as
| complex as some other project I did, and that took 2 months so
| this one might take 4 months". Most importantly, resist the
| temptation to say "although 1 of those 2 months was because of
| unexpected thing X so I shouldn't include that". Overall, it's
| highly flawed, but much less highly flawed than anything else.
| This is called "reference class forecasting".
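  |
  | The technique is just anchor-and-scale. As a toy calculation
  | (numbers from the example above):

```python
# Reference class forecasting, numerically: anchor on an observed
# outcome from a comparable project and scale, instead of building a
# fresh bottom-up guess.
past_duration_months = 2    # a comparable finished project (observed)
relative_complexity = 2.0   # "this is twice as complex" (judgment call)

estimate = past_duration_months * relative_complexity
print(estimate)  # -> 4.0 months, before any further adjustment
```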
|
| He gave a really compelling explanation of why estimates are
| almost always underestimates by a significant amount, and this
| technique is the best defence against it, but I won't try to
| resummarise because I'll surely misrepresent it. But I do
| recall he gave an example where he and some colleagues were
| trying to make a school syllabus about deductive biases, and
| underestimated the effort required for their own project.
| AlbertCory wrote:
| Thank you, quietbritishjim. I actually met Dr. Kahneman at
| Google, although I didn't introduce him. I got to ask him at
| lunch:
|
| "Dr. Kahneman, you've been at this for 40 years. Do you think
| you've changed anyone's ways of thinking?"
|
| He smiled and said "No, not even my own!" and then recounted
| how in his personal life he'd made a mistake which he'd
| written about extensively (not the one about planning,
| though). It's a _human_ failing, not a methodological one.
|
| I'm also vague about his example, but I think it was a new
| textbook. He asked his committee to reflect on their own past
| experiences with similar books. "Two years" was the past
| experience. Then they decided that it really _should_ be six
    | months, and that's the estimate they went with.
|
| No one wants to accept that shit happens and it's going to
| happen again. That's why estimation is hard.
| nicholasjarr wrote:
| Interesting. Will add it to my toolbelt. Thanks.
| automatic6131 wrote:
| I have no opinion on the content of the article, because light
| grey on white is barely readable, and it would require far too
| much energy to read.
| [deleted]
| diego_moita wrote:
| Just one more consultant doing "branding".
|
| What causes estimates to fail are the unknowns: unavoidable
| surprises when implementing something new, unexpected change in
| requirements, etc.
|
| It would be easy to have accurate estimates if there were no
| unknowns. But every innovative project must always be a march to
| unknown territory.
| seph-reed wrote:
| I got pretty decent at software estimation at my last job.
|
| I would spend an entire day "pre-programming" everything in my
| head, estimating the length of each little chunk, adding them
| up, then multiplying by ~2.
|
| It worked for me. But I still would never trust the estimates.
| p0nce wrote:
  | Instead of 2: Multiply by π for new kinds of projects.
  | Multiply by φ for known projects.
| d6ba56c039d9 wrote:
| Another angle.
|
| 'How long did a project this size take last time?'.
|
| As an aside, years ago I worked at a company that did thorough
| (and inaccurate) bottom-up schedules. I got dinged for not
| using quarter hour accuracy in the various task estimates.
| xcambar wrote:
| I am and always will be skeptical about software estimation.
|
| I am even more skeptical about the promises of software
| evaluation methods.
|
| But what I am the most skeptical about is teams avoiding software
| estimation altogether because they share the two skepticisms
| above.
| commandlinefan wrote:
| > avoiding software estimation altogether because they share
| the two skepticisms above
|
| Well, to put your mind at ease, I don't avoid software
| estimation because I share your skepticisms (although I do), I
| avoid it because I've observed that it's a complete waste of
| time. Nobody, at least never in my 30-year career, has ever
| asked for an honest estimate of how long it would take to
| produce a software product. What they _have_ asked for is
| somebody to agree that the time that they have budgeted will be
| enough for the (vague, still being defined and still to be
| defined even beyond the timeline) software project and take the
| "blame" when it inevitably doesn't.
| xcambar wrote:
| I agree.
|
| > somebody to agree that the time that they have budgeted
| will be enough for the [...] software project
|
| For me, that's still software estimation.
| tupac_speedrap wrote:
| Yep, every story pointing session is basically just think of
| a Fibonacci number and round it up. Finishing early makes
| your scrum master leave you alone but finishing late makes
| you and your team look bad and "unagile" and then you get
| even less done next sprint because you are stuck in meetings.
| The scrum master always wins because nobody is ever doing
| enough Agile.
| zoomablemind wrote:
| Too often, software project estimates are driven by external
| constraints, not by an understanding of the effort or
| complexity.
|
| It's either some existing deadline or reporting/sales cycle, some
| budget caps, like in grant proposals, or some promises already
| made by/to 'important people', or some fear of 'small people' to
| underdeliver etc.
|
| The estimation would all be fine if all involved people shared
| the same goal and responsibility.
|
| I find it practical to split the desired outcome which needs an
| estimate into two variants: 1) the most
| desired/promised/advertised one, and 2) an at-least-viable
| variant.
|
| If no one can see the second variant, its viability, and the
| effort it needs, then some details or skills are clearly
| missing.
|
| If the second variant is estimable, then its estimate could be
| used as a basis for dealing with the external constraints.
|
| If the devs say that in a given timeframe they can at least get
| a prototype done, and you're fine with that, then no one should
| be blamed if that pans out to be exactly the case. So it has to
| be clear from the beginning whether it's at all acceptable to
| put such a variant/prototype into production.
___________________________________________________________________
(page generated 2021-08-03 23:02 UTC)