[HN Gopher] Guide to Software Project Estimation
       ___________________________________________________________________
        
       Guide to Software Project Estimation
        
       Author : todsacerdoti
       Score  : 63 points
       Date   : 2021-08-03 11:31 UTC (11 hours ago)
        
 (HTM) web link (www.scalablepath.com)
 (TXT) w3m dump (www.scalablepath.com)
        
       | gilbetron wrote:
       | Nothing interesting in the article at all, I have no idea why it
       | is on the front page of HN.
       | 
       | As usual with software estimation, unless you've read this paper:
       | http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.19.3...
       | 
       | and have a thoughtful response to it (of which I've read some
       | solid ones), I've very little interest in hearing what you have
       | to say about software estimation.
        
         | diiq wrote:
         | 1) The big if is written out in the abstract -- "if it is
         | accepted that algorithmic complexity is an appropriate
         | definition of the complexity of a programming project."
         | Relating algorithmic complexity to "how long software takes to
         | write" seems, to me, to ignore that the vast majority of my
         | time as a developer is spent discovering and communicating
         | requirements, handling human questions, not writing novel code.
         | The conclusion touches on this, but ignores it.
         | 
         | 2) Even if you accept that "if", this is like a halting-problem
         | proof. Fine; it is impossible to estimate the complexity of ALL
         | software. That does not mean that it's useless to quantify the
         | complexity of software in limited but well-behaved problem
         | spaces. How much of any commercial project is actually spent
         | working on the cutting edge of computer science, facing
          | complete unknowns? An occasional wrong estimate is a fair
          | price for most estimates being mostly right.
         | 
         | 3) Why do you consider a 20 year old paper that's only been
         | cited 16 times to be critical reading about estimation, when a
         | _vast_ body of research in forecasting exists, written by
         | people who have _measurements_ of the accuracy of estimates to
         | base their theoretical models on?
        
           | gilbetron wrote:
           | > 3) Why do you consider a 20 year old paper that's only been
           | cited 16 times to be critical reading about estimation, when
           | a vast body of research in forecasting exists, written by
           | people who have measurements of the accuracy of estimates to
           | base their theoretical models on?
           | 
           | Have some links to this vast body?
           | 
           | I've encountered very few over the years that actually
            | qualify as "science", as much of it is fake or close to fake.
           | 
           | http://shape-of-code.coding-
           | guidelines.com/2021/01/17/softwa...
        
         | codemac wrote:
          | That paper makes a huge, and largely incorrect, assumption
          | in the abstract. I've never seen anyone make this argument.
          | 
          | The old parable about bugs, that it's 10x harder to fix a bug
          | than it is to write the bug, shows us that the time to
          | implement something complex is not related to its complexity.
          | 
          | For example, if I wanted to build something that randomly
          | selects a function from github, and then runs that code on your
          | laptop - I bet I could estimate and implement that code... but
          | good luck ever defining its features, complexity, etc. in any
          | mathematical way.
         | 
         | But you don't need to define things that way to have reasonable
         | estimates, for the same reason that when you build a shed in
         | your yard, you don't need to understand all the physics of how
         | it's held up.
        
       | nicholasjarr wrote:
        | Never been good at estimation. Software Estimation by Steve
        | McConnell has been on my reading list for a long time now. From
        | the little I have seen of it, it looks good (I already read Code
        | Complete by him and recommend it). Do you guys have any tips
       | for estimation?
        
         | smallerfish wrote:
         | Start by listing out the features in a spreadsheet. For each
         | feature, think through it and list out the stories, one per
         | row. Create a section per feature in the spreadsheet (i.e. put
         | a line under each group of rows).
         | 
         | For each story, add an initial estimate (in terms of developer-
         | days). This is your "low" estimate. Now in a second column, add
         | a "high" (potential-but-reasonable "worst case") estimate. If
         | you're looking at more than 10-15 days for either column for a
         | story you should probably break the story up some more.
         | 
         | Now add a 3rd/4th column, which are the low/high estimates
         | multiplied by 1.3 ("fudged low" / "fudged high"). Total up all
         | stories per feature in a row at the bottom of each feature's
          | section. Divide by team size, then by 5 business days per
          | week, round up to the nearest integer, and you have your
          | calendar weeks for each feature.
         | 
         | When sales/marketing ask you for estimates, you then respond
         | "between X and Y calendar weeks from [completion of previous
         | feature]". Just be aware that they will hear X, so make sure Y
         | is very clearly included in every communication where the dates
         | are being discussed. If "previous feature" slips, make sure to
         | communicate clearly that "next feature" has also pushed back by
         | however many weeks. You'll be tempted to, but don't be
         | optimistic with progress reports or estimates of where you are
         | in the range - undersell and over-deliver, and you'll keep more
         | allies on the business side.
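The spreadsheet arithmetic above can be sketched in a few lines of Python (the story names and day counts are invented for illustration, and the team size of 3 is an assumption; the 1.3 fudge factor and 5-day week come from the comment):

```python
import math

FUDGE = 1.3        # multiplier applied to both low and high estimates
TEAM_SIZE = 3      # assumed number of developers
DAYS_PER_WEEK = 5  # business days per calendar week

# (story, low estimate, high estimate) in developer-days -- hypothetical
stories = [
    ("login form", 2, 4),
    ("password reset", 3, 6),
    ("session handling", 4, 8),
]

low = sum(lo for _, lo, _ in stories) * FUDGE    # "fudged low" total
high = sum(hi for _, _, hi in stories) * FUDGE   # "fudged high" total
weeks_low = math.ceil(low / TEAM_SIZE / DAYS_PER_WEEK)
weeks_high = math.ceil(high / TEAM_SIZE / DAYS_PER_WEEK)
print(f"between {weeks_low} and {weeks_high} calendar weeks")
```

The ceiling rounding is deliberate: a fraction of a calendar week reads as a whole week to the business side.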
        
         | jbay808 wrote:
         | Whatever number you come up with, treat it as the _median of a
         | long-tailed distribution_ (if it matters, the lognormal).
         | 
         | To get the mean (expectation), multiply your estimate by about
         | 1.6. To get the 95% confidence bound, multiply by 5. To get the
         | 99% confidence bound, multiply your estimate by 10.
         | 
         | Understand why a distribution results in different numbers for
         | different audiences, and why that's not the same as being
         | inconsistent.
         | 
         | Use the mean for calculating sprint workload and capacity
         | planning, because the average is what matters for that, not the
         | accuracy of any single job. If your manager understands
         | probability then give them all these numbers, otherwise give
         | them the 95% confident value, which you should also give others
         | internally who depend on that specific job being done. Give
         | marketing the 99% confident number even if they understand
          | probability, because they're looking for a committed deadline
         | they can use externally. They will push hard for an early date
          | because they want the work done quickly, but they actually
          | _don't_ want to hear your optimistic estimate. It's easy to make
         | that mistake.
         | 
         | When requirements are understood, experienced developers are
         | actually very, very good at estimating median completion times
         | even just by gut feeling, but often fail to account for the
         | distribution, especially when communicating with stakeholders,
         | which makes them take heat when they're sometimes wrong by a
         | factor of ten.
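Those multipliers can be sanity-checked in a few lines. The log-space spread sigma = 1.0 is my assumption (it is the value that reproduces the rule-of-thumb numbers above, not something stated in the comment):

```python
import math
from statistics import NormalDist

# If a gut estimate is the MEDIAN of a lognormal distribution, then
# the mean and upper quantiles are fixed multiples of that median.
sigma = 1.0  # assumed log-space spread

mean_mult = math.exp(sigma**2 / 2)                       # mean/median, ~1.65
p95_mult = math.exp(NormalDist().inv_cdf(0.95) * sigma)  # ~5.2
p99_mult = math.exp(NormalDist().inv_cdf(0.99) * sigma)  # ~10.2

estimate = 4  # days: a developer's gut "median" estimate
print(f"mean {estimate * mean_mult:.1f}d, "
      f"95% {estimate * p95_mult:.1f}d, 99% {estimate * p99_mult:.1f}d")
```

So the 1.6x / 5x / 10x rules of thumb are mutually consistent: they all correspond to the same single-parameter lognormal.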
        
           | quietbritishjim wrote:
           | > Whatever number you come up with, treat it as the median of
           | a long-tailed distribution (if it matters, the lognormal). To
           | get the mean (expectation), multiply your estimate by about
           | 1.6. To get the 95% confidence bound, multiply by 5. To get
           | the 99% confidence bound, multiply your estimate by 10.
           | 
           | I like the idea of multipliers but the maths here is just
           | meaningless fluff to justify a particular number. If your
           | initial estimate really was a median then it would be an
           | overestimate (i.e. the project ends up taking less time) in
           | about 50% of cases. In practice I find that initial estimates
           | are overestimates in about 0% of cases!
        
             | jbay808 wrote:
             | > If your initial estimate really was a median then it
             | would be an overestimate (i.e. the project ends up taking
             | less time) in about 50% of cases
             | 
             | It is, yes. And it frequently happens that you go to fix
             | something, which seems really difficult, and then you
             | realize that it's actually an easy fix or not a problem at
             | all. But when you fix one thing in half the time you
             | expect, and another in twice the time you expect, this
             | doesn't average out, because the average of 0.5 and 2 is
             | not 1.0.
             | 
             | You might just be discarding the cases where estimates were
             | found to be conservative, either because delays are more
             | impactful and memorable, because the underestimates were
             | close enough to be treated as on-time, or because the slack
             | in the schedule was used to buy time for something else,
             | originally out of scope, that was lumped in.
             | 
              | Anyway, these numbers aren't just pulled out of a hat.
              | They come from studying vast amounts of high quality (but
             | unfortunately, not publicly available) data collected
             | comparing developer estimates and measured outcomes.
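The point that symmetric ratio errors don't cancel can be checked with a tiny simulation (the half/double split is from the comment; the task count and seed are illustrative):

```python
import random

random.seed(0)
# Each task is estimated at 1.0 day; the actual takes half or double
# the estimate with equal probability -- symmetric in RATIO, not time.
actuals = [random.choice([0.5, 2.0]) for _ in range(100_000)]
avg = sum(actuals) / len(actuals)
print(avg)  # -> ~1.25, not 1.0
```

The early finishes save half a day each, but the late finishes cost a full day each, so the schedule drifts upward even with perfectly "unbiased" ratio estimates.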
        
           | ohthehugemanate wrote:
           | That's just story points with extra steps.
           | 
           | You're already using an abstracted measure of time, by
           | working with a derivative value of "developer estimated
           | hours". You're already doing timeline projections on the
           | average throughput of your "adjusted developer hours" unit.
           | That's most of the value right there.
           | 
           | You can get even better results, with a little less cognitive
           | load, by applying the research that people are much more
           | consistent in estimating complexity than time (note that your
           | method relies on consistency, not accuracy, to succeed). A
           | quick imagination exercise validates this point for most of
           | us: You bought a new IKEA sofa - how much time will it take
           | to build? Honestly hard to do, and we're never accurate. But
           | consider instead: how hard is it? Way easier to answer. And
           | if you already know how long it takes you on average to
           | finish other tasks of similar apparent difficulty...
           | 
           | Try using your exact same system, but ask people to estimate
           | the task in terms of complexity. Use any scale you like, as
           | long as the units have consistent value in your developers'
           | minds (I like "cups of coffee", personally). Make your Dev
           | team agree on the difficulty score for each Feature, to
           | ensure that consistency.
           | 
           | Side benefit: Devs stop worrying about time and taking
           | shortcuts (aka "technical debt") to meet their time estimate
           | that you don't believe anyway. They're also a lot more likely
           | to consider hidden risks and sources of extra complexity in
           | the estimate.
           | 
            | Then you just track the actual throughput with a confidence
            | interval, and use that tracking to make timeline
            | projections.
           | 
           | TLDR: try asking Devs to estimate complexity rather than
           | time, and use a moving average with confidence interval
           | rather than the static 1.6 multiplier to make timeline
           | projections. You'll find your projections more accurate and
           | developers less stressed about it. You'll also have
           | reinvented story points.
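The "throughput with a confidence interval" step might look like this sketch (the velocity history and backlog size are invented, and the normal approximation on per-sprint velocity is my simplification):

```python
import statistics

# Completed story points per sprint -- hypothetical history
velocity = [21, 18, 25, 19, 23, 20]

mean = statistics.mean(velocity)
sd = statistics.stdev(velocity)          # sample standard deviation
lo, hi = mean - 1.96 * sd, mean + 1.96 * sd  # rough 95% velocity band

backlog = 120  # story points remaining
print(f"roughly {backlog / hi:.1f} to {backlog / lo:.1f} sprints left")
```

A longer history tightens the band; with only a handful of sprints the interval is honest about how little you know.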
        
             | jbay808 wrote:
             | Unfortunately, that just masks the difficulty by using
             | ambiguous terms that nobody knows if they agree on, and
             | makes communication hard with 3rd party stakeholders who
             | don't share your conventions. When marketing wants to know
             | when something will be done, we can argue about whether
             | dev-weeks or calendar dates are more appropriate, but I
             | think I'd get told right off if I tried to tell them it
             | would take a hundred "story points".
             | 
             | There's no shortcut to avoid the requirement to present
             | different summary statistics to different stakeholders.
             | It's a consequence of decision theory. Unless they're
             | equipped to understand the whole distribution.
             | 
             | It's also the wrong sort of rounding. I think an ikea sofa
             | might take an hour, but if it took all day I'd be pretty
             | shocked. But with software tasks, it's important to accept
             | that the distribution is _long-tailed_. Sometimes it really
              | will take 10x as long as you expected, and that's not your
             | fault. Story points would have to abandon all meaning to
             | capture that much variance.
             | 
             | I don't recommend incentivizing estimates, though. A big
             | benefit of recognizing a developer estimate as short-hand
             | for the median of a distribution is that when the time
             | doesn't match the estimate, it doesn't mean the estimate
             | was "wrong" or "bad", and the developer shouldn't feel bad.
        
               | ohthehugemanate wrote:
                | Sorry if I was unclear: when talking to management
               | outside of the project, you express in terms of
               | time/calendar dates. The arbitrary units are just a more
               | accurate and less pressured way of getting to time
               | values, than "developer estimated hours times a static
               | multiplier."
        
               | jbay808 wrote:
               | It's _not_ a static multiplier; I thought I was clear
                | that it's very much a _context-sensitive multiplier_,
               | which depends on risk tolerance (which you get,
               | straightforwardly, from how far you integrate the tail of
               | the distribution).
        
           | diiq wrote:
           | Would upvote twice if I could -- These are 5-star rules of
           | thumb.
           | 
           | (I'm glad to see lognormal making more inroads in software
           | estimation. McConnell is great, but assuming the normal
           | distribution leads to some weird edge cases.)
        
           | matttrotter wrote:
            | And prepare for them to give you strange looks! I had a
            | manager look at my estimate and then tell me to multiply it
            | by 3, which I thought was ludicrous. Turned out to be
           | accurate.
        
         | commandlinefan wrote:
         | > Never been good at estimation
         | 
         | Me neither. When I was first starting out, that really stressed
         | me out a lot until I realized that I didn't work with anybody
         | else who was "good" at it - that is, I didn't work with or know
         | anybody who could take a list of requirements written out in
         | English and produce a timeline that had any relationship to how
         | long the software would take to be ready to use.
         | 
         | Been doing this professionally since 1992. I still haven't met
         | anybody who was "good" at estimation.
        
           | handrous wrote:
           | I've reached the point where I think that if you're not going
           | to do the NASA Space Shuttle program thing of specifying the
           | whole program to the smallest detail before you start writing
           | the actual code, you may as well just start working, release
           | often, and evaluate periodically whether the thing looks on-
           | track to be worth the cost, cancelling if it's not. Just
           | spend the estimation money on development instead.
        
         | qznc wrote:
         | Second McConnell. It is a great reference for all kinds of
         | estimations around software development.
         | 
         | That is also its downside. It is a reference. Not a textbook to
         | learn things in a pedagogical structure.
        
           | nicholasjarr wrote:
            | Yeah. I noticed this when I tried to read it the last time. I
           | was expecting something more like Code Complete. I don't
           | know, maybe it is the subject: code is way more interesting
           | than estimation :)
        
         | stronglikedan wrote:
         | > Do you guys have any tips for estimation?
         | 
         | Stick to your guns when Sales tries to get you to change your
         | estimate (and they will). Tell them they can discount the
         | project, or change any other variable they need to satisfy the
         | customer, but don't ever let them touch the time estimate. Not
          | really a tip for making the time estimate, but it will keep
          | your ass covered once you've made one.
        
         | okl wrote:
         | Read that damn book :D It's a treasure trove of information,
         | not only for estimating software projects! For example,
         | learning to differentiate between estimate, target/goal, and
         | commitment.
         | 
         | Personally, I'm often dumbfounded that folks still use planning
         | poker when there are so many more reliable methods as discussed
         | in the book, e.g., wideband Delphi.
        
         | mmcdermott wrote:
         | I loved McConnell's books, having read Code Complete, Software
          | Estimation and Rapid Development.
         | 
         | Besides what is covered in those books, I've found it extremely
         | useful to document assumptions. Every single estimate has some
         | mental model of the project to be done. Code to be reused,
         | vendors to integrate and, most importantly, things that won't
         | be done. The real project almost always breaks with some of
         | those high-level assumptions, but that tends to be lost in the
         | shuffle.
         | 
         | Attaching assumptions to the estimate makes it much easier to
         | do a post-mortem.
         | 
         |  _A powerful memory cannot compare with pale ink._
        
         | diiq wrote:
         | McConnell's 50/90 approach makes a _big_ difference in my
         | opinion because it lets you encode your uncertainty. The extra
          | math means you don't need to be _good_ at estimation as long
         | as you know _roughly_ how bad you are at it.
         | 
         | If that seems like too much effort, I also run
         | quotes.vistimo.com , which takes a similarly (if slightly more
         | advanced) statistical approach, but does all the math for you.
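One way to turn a 50/90 estimate pair into an expected duration is to fit a lognormal to the two quantiles. This is a reasonable sketch of the idea, not necessarily McConnell's exact procedure:

```python
import math
from statistics import NormalDist

def expected_days(p50, p90):
    """Fit a lognormal to a 50%/90% estimate pair; return its mean."""
    z90 = NormalDist().inv_cdf(0.90)               # ~1.2816
    mu = math.log(p50)                             # the median fixes mu
    sigma = (math.log(p90) - math.log(p50)) / z90  # spread from the gap
    return math.exp(mu + sigma**2 / 2)             # lognormal mean

# "Probably 5 days, but 9 times out of 10 it's under 15":
print(f"{expected_days(5, 15):.1f} expected days")
```

Note that the answer lands above the 50% figure: the wider the gap between your 50% and 90% numbers, the more the long tail drags the expectation up.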
        
         | quietbritishjim wrote:
         | I like the advice in _Thinking Fast and Slow_ by Daniel
         | Kahneman about estimating, which is not specific to software
         | but still very applicable to it:
         | 
         | Start with a known past project that is in some way similar in
         | magnitude and adjust from there. For example, "this is twice as
         | complex as some other project I did, and that took 2 months so
         | this one might take 4 months". Most importantly, resist the
         | temptation to say "although 1 of those 2 months was because of
         | unexpected thing X so I shouldn't include that". Overall, it's
         | highly flawed, but much less highly flawed than anything else.
         | This is called "reference class forecasting".
         | 
         | He gave a really compelling explanation of why estimates are
         | almost always underestimates by a significant amount, and this
         | technique is the best defence against it, but I won't try to
         | resummarise because I'll surely misrepresent it. But I do
         | recall he gave an example where he and some colleagues were
         | trying to make a school syllabus about deductive biases, and
         | underestimated the effort required for their own project.
        
           | AlbertCory wrote:
           | Thank you, quietbritishjim. I actually met Dr. Kahneman at
           | Google, although I didn't introduce him. I got to ask him at
           | lunch:
           | 
           | "Dr. Kahneman, you've been at this for 40 years. Do you think
           | you've changed anyone's ways of thinking?"
           | 
           | He smiled and said "No, not even my own!" and then recounted
           | how in his personal life he'd made a mistake which he'd
           | written about extensively (not the one about planning,
           | though). It's a _human_ failing, not a methodological one.
           | 
           | I'm also vague about his example, but I think it was a new
           | textbook. He asked his committee to reflect on their own past
           | experiences with similar books. "Two years" was the past
           | experience. Then they decided that it really _should_ be six
            | months, and that's the estimate they went with.
           | 
           | No one wants to accept that shit happens and it's going to
           | happen again. That's why estimation is hard.
        
           | nicholasjarr wrote:
           | Interesting. Will add it to my toolbelt. Thanks.
        
       | automatic6131 wrote:
       | I have no opinion on the content of the article, because light
       | grey on white is barely readable, and it would require far too
       | much energy to read.
        
       | [deleted]
        
       | diego_moita wrote:
       | Just one more consultant doing "branding".
       | 
       | What causes estimates to fail are the unknowns: unavoidable
       | surprises when implementing something new, unexpected change in
       | requirements, etc.
       | 
       | It would be easy to have accurate estimates if there were no
        | unknowns. But every innovative project is always a march into
        | unknown territory.
        
       | seph-reed wrote:
       | I got pretty decent at software estimation at my last job.
       | 
       | I would spend an entire day "pre-programming" everything in my
       | head, estimating the length of each little chunk, adding them up,
       | then multiply by ~2.
       | 
       | It worked for me. But I still would never trust the estimates.
        
         | p0nce wrote:
          | Instead of 2: Multiply by pi for new kinds of projects.
          | Multiply by phi for known projects.
        
         | d6ba56c039d9 wrote:
         | Another angle.
         | 
         | 'How long did a project this size take last time?'.
         | 
         | As an aside, years ago I worked at a company that did thorough
         | (and inaccurate) bottom-up schedules. I got dinged for not
          | using quarter-hour accuracy in the various task estimates.
        
       | xcambar wrote:
       | I am and always will be skeptical about software estimation.
       | 
       | I am even more skeptical about the promises of software
       | evaluation methods.
       | 
       | But what I am the most skeptical about is teams avoiding software
       | estimation altogether because they share the two skepticisms
       | above.
        
         | commandlinefan wrote:
         | > avoiding software estimation altogether because they share
         | the two skepticisms above
         | 
         | Well, to put your mind at ease, I don't avoid software
         | estimation because I share your skepticisms (although I do), I
         | avoid it because I've observed that it's a complete waste of
         | time. Nobody, at least never in my 30-year career, has ever
         | asked for an honest estimate of how long it would take to
         | produce a software product. What they _have_ asked for is
         | somebody to agree that the time that they have budgeted will be
         | enough for the (vague, still being defined and still to be
         | defined even beyond the timeline) software project and take the
         | "blame" when it inevitably doesn't.
        
           | xcambar wrote:
           | I agree.
           | 
           | > somebody to agree that the time that they have budgeted
           | will be enough for the [...] software project
           | 
           | For me, that's still software estimation.
        
           | tupac_speedrap wrote:
            | Yep, every story pointing session is basically "think of a
            | Fibonacci number and round it up". Finishing early makes
           | your scrum master leave you alone but finishing late makes
           | you and your team look bad and "unagile" and then you get
           | even less done next sprint because you are stuck in meetings.
           | The scrum master always wins because nobody is ever doing
           | enough Agile.
        
       | zoomablemind wrote:
        | Too often, software project estimates are driven by external
        | constraints, not by an understanding of the effort or
        | complexity.
        | 
        | It's either an existing deadline or reporting/sales cycle, a
        | budget cap (as in grant proposals), promises already made by/to
        | 'important people', or the fear among 'small people' of
        | underdelivering, etc.
       | 
       | The estimation would all be fine if all involved people shared
       | the same goal and responsibility.
       | 
        | I find it practical to split the desired outcome that needs an
        | estimate into two variants: 1) the most
        | desired/promised/advertised one, and 2) a minimally viable one.
        | 
        | If no one can see the second variant, its viability, and the
        | effort it needs, then some details or skills are clearly missing.
        | 
        | If the second variant can be estimated, then that estimate can be
        | used as a basis for dealing with the external constraints.
        | 
        | If the devs say that in a given timeframe they can at least get a
        | prototype done, and you're fine with that, then no one should be
        | blamed if that is exactly how it pans out. So it has to be clear
        | from the beginning whether it's at all acceptable to put such a
        | variant/prototype into production.
        
       ___________________________________________________________________
       (page generated 2021-08-03 23:02 UTC)