[HN Gopher] Big O Notation - Explained as Easily as Possible
       ___________________________________________________________________
        
       Big O Notation - Explained as Easily as Possible
        
       Author : optimalsolver
       Score  : 235 points
       Date   : 2021-01-16 16:33 UTC (6 hours ago)
        
 (HTM) web link (thatcomputerscientist.com)
 (TXT) w3m dump (thatcomputerscientist.com)
        
       | kadoban wrote:
       | That in _no_ way explained big-O notation. It explained the
       | basics of counting things to do with algorithms. Which is kind of
       | fine, really. But why bother saying it's going to explain big-O?
        
       | pmiller2 wrote:
       | Every one of these "Big-O explainers" says pretty much the same
       | thing: count (or bound) the number of steps, then take the most
       | significant term, and drop the constant associated with it. None
       | of them explain _why_ you take the most significant term or drop
       | constant factors.
       | 
       | I get why that is. You need the mathematical definition to
       | demonstrate why that is, and most "Big-O explainers" don't want
       | to assume any significant amount of mathematical background. But,
       | that definition isn't that hard. It's simply:
       | 
        | f(x) is O(g(x)) iff there exist a positive number M and an x_0
        | such that for all x > x_0, |f(x)| <= M|g(x)|.
       | 
       | And, if you're in an analysis of algorithms context, it's even
       | easier, because you typically don't have to worry about this
       | absolute value business.
       | 
       | Well, that M is essentially the reason you get to drop constant
       | multiples of f(x). And, you drop the least significant terms of
       | f(x) because g(x) dominates them, _i.e._ lim_{x - > \infty}
       | g(x)/f(x) = 0. (No need to prove this, because this is what makes
       | the "less significant" terms less significant.)
       | 
       | I would also like to add that the equals sign in f(x) = O(g(x))
       | is one of the most useful abuses of notation that I know of, but
       | it can be misleading. It doesn't behave at all like a real
       | equality because it's not symmetric, but it _is_ transitive and
       | reflexive. It actually acts more like set membership than
       | equality.
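
A quick numeric check of the definition above (the coefficients and the witnesses M and x_0 are made up for illustration):

```python
# f(n) = 3n^2 + 5n + 25 is O(n^2), witnessed by M = 4 and x_0 = 9.
# At n = 8 the bound fails (f(8) = 257 > 4 * 64 = 256), which is why
# the definition only requires the bound past some x_0.
def f(n):
    return 3 * n**2 + 5 * n + 25

def g(n):
    return n**2

M, x0 = 4, 9
assert f(8) > M * g(8)                                  # fails below x_0
assert all(f(n) <= M * g(n) for n in range(x0, 10_000)) # holds past x_0
```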
        
         | jakear wrote:
         | It acts exactly like set membership, because that's what it is.
        
         | casion wrote:
         | You say it isn't hard, but I have 2 graduate degrees and didn't
         | understand your explanation at all.
         | 
         | "Hard" is relative to prerequisite knowledge, which can vary
         | significantly.
        
           | mhh__ wrote:
           | Admittedly I am studying theoretical physics so I am supposed
           | to be able to, but it made sense to me?
        
             | pmiller2 wrote:
             | Excellent!
             | 
             | The level I was shooting for with my brief explanation was
             | that someone who understood limits at a calc 1 level should
             | be able to get it with a little thinking. I do wonder,
             | though: did you know those things before you read my
             | comment?
        
               | mhh__ wrote:
               | I did. For some scale of what I get up to I'm currently
               | banging my head against various differential geometry
               | textbooks.
               | 
               | Ultimately you're limited in not having graphs, which
               | always limits the intuitiveness of any calculus-
               | explainers.
        
               | pmiller2 wrote:
               | Cool. Sounds like you're in grad school right now, yeah?
               | I know differential geometry is not exactly the same as
               | differential topology, but I remember taking differential
               | topology in grad school. I was always amused that we
               | never actually evaluated any integrals that didn't come
               | out to a very simple number (generally 0).
               | 
               | If I were writing this as a web page, though, I would
               | definitely include a few graphs to explain the calculus
               | concepts. The amount of calculus you need here is pretty
               | intuitive once you draw a couple of pictures.
        
               | mhh__ wrote:
               | I'm actually in my second year of university but I
               | started learning physics "properly" at 14 so I had a head
               | start.
        
           | roywiggins wrote:
            | The short version is that the fastest-growing term dominates
            | all the others: for large x, the smaller terms become
            | negligible by comparison. Since big-O notation is about how
            | the complexity grows for large inputs, you can assume the
            | input is arbitrarily large, and then the complexity is
            | completely determined by the fastest-growing term.
           | 
           | You drop the constant because it doesn't alter how the
           | complexity grows as the input increases.
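
The "smaller terms become negligible" point can be seen numerically (coefficients are made up for illustration):

```python
# In 3n^2 + 5n + 25, the lower-order part shrinks relative to the
# leading term as n grows, which is why big-O keeps only n^2.
def lower_over_leading(n):
    return (5 * n + 25) / (3 * n**2)

for n in (10, 1_000, 1_000_000):
    print(n, lower_over_leading(n))  # ratio heads toward 0
```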
        
           | pmiller2 wrote:
           | Of course "hard" is relative. Because I was writing a HN post
           | and not a "Big-O explainer," I didn't provide you with any of
           | that prerequisite knowledge. But, the amount of prerequisite
           | knowledge one needs to understand this is very, very little,
           | and would easily fit in a digestible web page, provided you
           | have some basic fluency with functions of the real numbers.
           | And, I think that's a reasonable level of prerequisite to
           | assume for anyone who wants to be throwing around terms like
           | "Big-O."
           | 
           | If it's not too intrusive, may I ask what your graduate
           | degrees are?
        
             | sabas123 wrote:
             | > But, the amount of prerequisite knowledge one needs to
             | understand this is very, very little, and would easily fit
             | in a digestible web page
             | 
              | I'm really not sure about this one. It's only easy if you
              | already have an idea of: 1. how to graph things (runtime
              | vs. input size); 2. how to extend that graph toward
              | infinity; 3. how to compare the result to some reference
              | term (which isn't as tangible as most algorithms, imo).
              | 
              | And that's only for the intuition part. For people who got
              | into programming by doing some UI work, I can definitely
              | see why _a subset of them_ struggle with this.
        
             | andi999 wrote:
              | The important prerequisite knowledge is that every
              | polynomial is dominated by its largest exponential term
              | (for x to infty).
        
               | pmiller2 wrote:
               | It's a little more than that. You need to account for the
               | fact that n log n dominates n, but not n^2, as well.
               | That's not hard, but you should spell it out.
               | 
               | This discussion is making me think that a good way to
               | write a "Big-O explainer" would be sort of like a
               | progressive web app, _i.e._ "here's the explanation with
               | the highest level of mathematical sophistication. Click
               | here to get some of the prerequisite knowledge." Then,
               | the user just keeps clicking "break it down more" until
               | either they get it or they reach the most detailed
               | explanation.
        
               | mhh__ wrote:
               | Largest exponential? That should be monomial, right?
               | 
               | Anyway you can show that on the spot if you think about
               | what the derivative of that monomial looks like
        
               | pmiller2 wrote:
               | One easy way to show that more generally is that if f and
               | g are differentiable, f(x) dominates g(x) iff f'(x)
               | dominates g'(x), by L'Hopital's rule. Details left as an
               | exercise for the reader.
        
               | aborsy wrote:
               | |g(x)|<M|f(x)| does not imply |g'(x)|<=C|f'(x)|.
        
               | pmiller2 wrote:
               | Sure it does, for all functions f and g that we actually
               | care about in the CS context for big-O. Hint: what if f
               | and g are smooth?
        
               | aborsy wrote:
                | One function could be below another and have an
                | arbitrarily large derivative, even if both are smooth:
                | f(x)=sin(e^x) and g(x)=1.
        
           | andi999 wrote:
            | I've had such students as well. Whenever I asked anything,
            | they said "oh, I learned that as an undergrad" (implying
            | that it was too long ago to remember). I'm sad about such a
            | waste.
        
         | edflsafoiewq wrote:
         | In most practical cases f=O(g) is the same as saying f/g is
         | bounded.
        
         | uh_uh wrote:
         | > And, you drop the least significant terms of f(x) because
         | g(x) dominates them, i.e. lim_{x -> \infty} g(x)/f(x) = 0.
         | 
         | If g(x) dominates all the terms in f(x), then wouldn't lim_{x
         | -> \infty} g(x)/f(x) go to infinity?
        
       | axaxs wrote:
       | I think a minimal reproduction of each common complexity would be
       | a way more helpful addition.
       | 
       | Someone with little math background is going to have no idea what
       | log even means, much less what causes it.
        
       | alok-g wrote:
       | >> If you consider "addition" to be 1 operation then ...
       | 
        | What if we don't? Addition on computers is an O(1) operation only
       | because we use fixed bit-widths for numbers, e.g., a 32-bit
       | signed integer. How would we reformulate or express algorithmic
       | complexity if we were to talk about unbounded integer values or
       | infinite-precision mathematics?
        
         | Brian_K_White wrote:
          | Then just substitute whatever other operation makes you happy?
          | Or an imaginary one, if you mean to say there is no such thing
          | in real hardware.
        
         | qsort wrote:
         | It generally changes the analysis very little but makes it more
         | annoying, which is why arithmetic is usually taken to be O(1).
         | Formally speaking you'd have to specify a computational model
         | to be able to talk about complexity (there is no such thing as
         | 'complexity' but only 'complexity respective to a model'; e.g.
         | you can change the big-O complexity on Turing machines by using
         | more tapes).
         | 
         | The computational model of a book like CLRS is "arithmetic is
         | O(1), but we're not going to abuse that bug".
        
         | mhh__ wrote:
          | You'd need to know the exact details of your arithmetic
          | library's implementation, and the cost would depend on the
          | integers in question, which makes the analysis a little
          | harder: adding two numbers becomes some function O(f(n, m)),
          | where n and m are the two numbers or some property of them
          | (length, digits, etc.).
        
         | aweinstock wrote:
         | The contrast between Gauss-Jordan[0] and the Bareiss
         | algorithm[1] is a good example of explicitly handling the
         | length of the numbers in bits as part of the runtime.
         | 
         | Gauss-Jordan is O(n^3) if you consider addition and
         | multiplication to be constant-time, but if it operates on
         | arbitrary-precision integers, it can get exponentially slow[2].
         | 
         | The Bareiss algorithm avoids this worst-case behavior by
         | choosing the pivots in a slightly more involved manner, which
         | avoids building up large intermediate values.
         | 
         | [0]:
         | https://en.wikipedia.org/wiki/Gaussian_elimination#Computati...
         | [1]: https://en.wikipedia.org/wiki/Bareiss_algorithm [2]:
         | https://staff.itee.uq.edu.au/havas/fh97.pdf, Section 3.1's
         | "Construction A" is a specific matrix that exhibits the worst-
         | case behavior.
        
       | lamename wrote:
       | Also a nice quick ref here https://www.bigocheatsheet.com/
        
       | maweki wrote:
       | What I always find a bit missing in such articles is what the n
       | is. The author writes "If you consider "addition" to be 1
       | operation" and this seems kinda intuitive and is correct if we
       | talk about normal machine integers.
       | 
       | But adding two arbitrary integers might be somewhat linear in
       | bit-width. And there we have it: with a fixed bit-width, this
       | becomes a constant term.
       | 
        | So you might want to talk not only about the number of input
        | terms as n, but also about the width of your terms (and Python
        | integers are arbitrary precision, btw).
       | 
        | So yeah, this is an upper bound on how many "steps" you do for
        | every input, but often enough it's not really clear what a step
        | is, especially if you have several "steps" that relate to
        | different forms of data retrieval and organization (which are
        | often the culprits for bad performance once you're done
        | optimizing the number of loops). Sometimes you can hide behind
        | guarantees that your hashset has constant lookup. But did you
        | factor in whether the hash function is actually constant (or
        | even fast, for that matter)?
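
The bit-width point is easy to observe in Python, since its integers are arbitrary precision (a rough sketch; absolute timings will vary by machine):

```python
import timeit

# Adding two k-bit Python ints takes time roughly linear in k, because
# the interpreter must touch every limb of the big integers.
for bits in (10_000, 100_000, 1_000_000):
    a = (1 << bits) - 1
    t = timeit.timeit(lambda: a + a, number=1000)
    print(f"{bits:>9} bits: {t:.4f}s for 1000 additions")
```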
        
       | ineedasername wrote:
       | I'll admit I'm a little surprised to see a topic like this get
       | much attention. Have programming courses & curriculums changed so
       | much in the past ~20 years that this isn't simply part of every
       | introduction to the subject?
       | 
        | Yes, the topic of code performance as a whole is more complex
        | than just Big O, but as its own concept it was, in my time (get
        | off my lawn!), pretty effectively covered everywhere, and
        | certainly the moment "algorithms" were discussed in any learning
        | material.
        | 
        | Maybe it just goes back to the common topic here on HN that
        | there's a lot more inefficient code nowadays because faster
        | processors & more memory help to paper over the cracks. But if
        | something like Big O isn't taught as one of the most primitive
        | bits of knowledge in programming, then I can't be too surprised
        | by that trend either.
        
         | matttb wrote:
         | Many, many developers did not go to school and have no 'CS'
         | type learning
        
         | racl101 wrote:
          | I think this might be for the self-taught crowd of developers
          | who never formally took comp sci, yet who hold important
          | positions in software development that pay as well as, if not
          | better than, those of the people who did.
          | 
          | You'd be astounded how big this self-taught cohort is and how
          | much power it wields.
          | 
          | They do their jobs pretty well, and yet the basics of computer
          | science are something they never learned.
         | 
         | I met a senior guy the other day who had never heard of a
         | freaking truth table for fuck's sake.
        
           | jobbo wrote:
           | I was dropped into the tech lead position at my job last
           | year; I started in game design at art college, so having to
           | run and gun has been fascinating. I'd be lying if I said I
           | wasn't scared reading about much of the stuff in this comment
           | section that really should be bread and butter.
           | 
            | I'm talking to my boss to see if there's some kind of
            | training program I can pick up on the side to help me cover
            | the basics I've missed out on, although we're so overloaded
            | that finding the time and money is challenging. I'm lucky
            | it's mostly CRUD, but I can't help but worry that every
            | architecture decision I'm making is going to cost us
            | massively down the road.
        
           | amelius wrote:
           | Like the React team, claiming that they have found a
           | revolutionary solution which is faster than direct DOM
           | manipulation, where in reality it is perhaps only a constant
           | factor faster. Nice in practice for small data, but not very
           | interesting from a CS point of view.
        
       | tracyhenry wrote:
       | Remotely related: the Theory of Computation course I took (by
       | Michael Sipser) gave me a more fundamental understanding of
       | algorithmic complexity than I had through solving algorithmic
       | puzzles in high-level languages:
       | https://math.mit.edu/~sipser/18404/
        
       | hprotagonist wrote:
       | https://nedbatchelder.com/text/bigo/bigo.html is also a good
       | intro read.
        
       | xyzelement wrote:
        | It's great to have a handle on big O, but funnily enough I've
        | seen people index on it too much.
       | 
       | An algorithm can look "really bad" from Big-O point of view and
       | still be really good if: it's applied to a small enough input, or
       | it's implemented very efficiently.
        
         | windowojji wrote:
         | > it's applied to a small enough input
         | 
         | The whole point of big-O notation is to analyze algorithms as
         | they're applied to input of size `n` larger than some `n_0`. Of
         | course it's not useful for small inputs - it's explicitly about
         | large inputs.
         | 
         | > it's implemented very efficiently
         | 
          | On large inputs, it's very hard to see how, say, a linear
          | algorithm could lose to a quadratic algorithm, regardless of
          | how "efficiently" the latter is implemented. I assume you're
          | talking about something like cache friendliness or good
          | register allocation?
        
           | xyzelement wrote:
           | Yeah we are talking about the same thing. I agree that what
           | you are saying is deeply connected to value of Big O. I am
           | just pointing out that people sometimes forget to consider
           | these things.
        
         | musingsole wrote:
         | Most people jump to an extreme hyperbole instead of the reality
         | what they're grappling with. Big-O is just that
         | algorithmatised. Many software systems _must_ be designed to
         | anticipate orders of magnitude growth in scale.
         | 
         | However, I'm not sure about how _most_ software systems grow --
         | not sure anyone really knows. We all have narratives about it,
         | but it 's a small, small subset of projects I know about that
         | have an order of magnitude different number of users than when
         | I first learned about them (other than moving to effectively
         | 0).
        
       | infogulch wrote:
       | I noticed some comments here discussing the right prerequisite
       | knowledge to understand big-O and friends.
       | 
       | I propose limits. I think it would be a lot easier for someone to
       | understand how to think about big-O if they already understood
       | limits.
       | 
       | Lim [x -> inf] O(f(x))/O(g(x))
       | 
          | If you know limits, you know how and why you can ignore all
          | but the highest-power term, how to compare and simplify other
          | kinds of terms, etc. Though maybe that's too much to ask of
          | someone new.
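
The limit idea can be run numerically on a plain ratio of functions (coefficients are made up for illustration):

```python
# (3n^2 + 5n + 25) / n^2 tends to 3 as n grows, so the function is
# n^2 up to a constant -- and big-O lets you drop the 3 as well.
def ratio(n):
    return (3 * n**2 + 5 * n + 25) / n**2

for n in (10, 1_000, 1_000_000):
    print(n, ratio(n))  # settles toward 3
```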
        
         | windowojji wrote:
         | > Lim [x -> inf] O(f(x))/O(g(x))
         | 
          | I understand limits, and I've taught asymptotic analysis in
          | the past. O(f(x)) and O(g(x)) are sets. How does one divide
          | two sets? This explanation is ill-formed.
        
       | joe_91 wrote:
       | Much better explained than in my computer science lectures! Nice
       | simple examples.
       | 
       | It's such an important topic to grasp too if you're going to end
       | up working with software!
        
       | tromp wrote:
       | Note that f(n) = O(g(n)) denotes an asymptotic upper bound. So it
       | would be correct (if confusing) to say that 2n+3 = O(n^2). This
       | is important for use in algorithmic complexity because we may not
       | always be able to determine the running time even up to a
       | constant, but can more easily infer an upper bound.
       | 
       | Using Omega instead denotes an asymptotic lower bound. To denote
       | both, you use Theta: f(n) = Theta(g(n)) means that for two
       | constants 0 < c1 < c2 and large enough n, we have c1 < f(n)/g(n)
       | < c2.
       | 
       | Finally, little o denotes vanishing behaviour. f(n) = o(g(n))
       | when f(n)/g(n) goes to 0 in the limit.
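
The three bounds above can be checked numerically on the example (a sanity sketch, not a proof, since finite n can only be consistent with a limit):

```python
# 2n + 3 is O(n^2) -- indeed o(n^2), since the ratio tends to 0 --
# and Theta(n), since the ratio to n stays between constants,
# e.g. c1 = 1 and c2 = 3.
f = lambda n: 2 * n + 3

n = 10**6
assert f(n) / n**2 < 1e-4   # consistent with o(n^2)
assert 1 < f(n) / n < 3     # consistent with Theta(n)
```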
        
       | dragontamer wrote:
       | Big O is literally easier than the analytical alternative.
       | 
        | I didn't "understand" Big-O until I read Donald Knuth's "Art of
        | Computer Programming". I forget exactly where Knuth does this...
        | but he derives the _exact_ average runtime of some algorithm
        | (where "average" is defined over all possible permutations of
        | the input).
        | 
        | I forget why or where this derivation was, but it was along the
        | lines of 3 _n_ log(n) + 5n + 25, or something like that (I made
        | up the numbers).
       | 
       | Solving for the specific and exact runtime of a function is very,
       | very, very difficult. Instead of doing that, Comp. Sci has
       | decided that the easier Big-O is "good enough" for most purposes.
       | 
        | That's it. Big-O is ALWAYS easier to calculate than the exact
        | runtime. Yeah, it's still hard in some cases, and there are
        | situations like Radix sort (O(n)) vs. Quicksort (O(n log n)), or
        | Karatsuba multiplication (O(n^1.585)) vs. asymptotically faster
        | multiplication, where the "slower Big-O" is better in practice.
       | 
       | But such situations are rare, and are easily figured out through
       | profiling.
       | 
       | ---------
       | 
        | So let's do some real talk. Most programmers don't wait on their
        | code anymore. Computers are so fast that all this algorithmic
        | complexity is a red herring compared to other issues. By and
        | large, inefficient programming languages are being used in
        | grossly inefficient ways and no one cares.
       | 
       | For most practical purposes, profiling with a stopwatch is your
       | go-to methodology for analyzing algorithms. If that's not good
       | enough, then profiling with a dedicated profiler (which can
       | statistically count specific situations: like cache-hits or
       | branch-mispredictions, as well as how many times any particular
       | line of code ran).
       | 
       | That's the information you want and need. Take the runtimes, plot
       | it on a log-log plot, fit a growth-exponent on the curve and BAM,
       | you got O(n^whatever).
       | 
       | Where Big-O comes in are the situations where you're waiting for
       | your code to finish. You run your code, and you wait 10-minutes,
       | 1-hour, 10-hours, 2-days... will your code finish? Do you have an
       | infinite loop? Or is it actually making forward progress? If so,
        | how long do you estimate it to last? (And indeed: Hollywood
        | movies are known to take multiple days to render a single frame,
        | so these situations come up every now and then in practice.)
       | 
       | Under such a situation, you can't really run a profiler or use a
       | stopwatch. You probably can run a big-O analysis by pen-and-paper
       | over your code however, and then get an estimate on the size of
       | your data. You'll want to make sure that you're O(n) or
       | O(n*log(n)). If you're getting O(n^2) or O(n^3) from an algorithm
       | that's taking days to run... you might be in trouble.
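
The "fit a growth exponent on a log-log plot" step can be sketched deterministically by counting steps instead of using a stopwatch (a hypothetical all-pairs loop stands in for the profiled code):

```python
import math

# The slope between two sizes on a log-log scale estimates the
# growth exponent of the step count.
def steps(n):
    """Count iterations of an all-pairs double loop: n*(n-1)/2 steps."""
    count = 0
    for i in range(n):
        for j in range(i + 1, n):
            count += 1
    return count

n1, n2 = 100, 1000
slope = math.log(steps(n2) / steps(n1)) / math.log(n2 / n1)
print(f"estimated exponent: {slope:.2f}")  # close to 2: O(n^2)
```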
        
       | xiphias2 wrote:
       | Just read a book on algorithms. They are timeless, unlike APIs,
       | so it's worth the effort to study them.
        
       | lrossi wrote:
       | If you really want to make it easy to understand, make it
       | graphical.
       | 
       | That is: benchmark the code, varying the input size, and plot the
       | results. Almost anyone should be able to understand.
       | 
       | This might also reveal effects that are not taken into account by
       | Big O notation, as not all algorithms that have the same
       | complexity have the same performance. But I see it as a plus.
        
         | mhh__ wrote:
         | The coefficients are what can bite you.
         | 
          | Also amortized analysis: e.g., appending to a std::vector is
          | O(n) in the worst case but amortized O(1), since the O(n)
          | spikes come only at something like n = growth_factor^k *
          | initial_size for integer k.
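
The same amortized behavior is visible on a Python list (a CPython-specific sketch: its over-allocation factor is about 1.125, while C++ std::vector implementations typically use 1.5 or 2):

```python
import sys

# Most appends reuse spare capacity; only occasional appends trigger
# an O(n) reallocation, which shows up as a size change.
lst, reallocs = [], 0
last = sys.getsizeof(lst)
for _ in range(100_000):
    lst.append(0)
    size = sys.getsizeof(lst)
    if size != last:
        reallocs += 1
        last = size
print(f"{reallocs} reallocations over 100,000 appends")
```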
        
         | kevin_thibedeau wrote:
         | It isn't uncommon for algorithms with "worse" algorithmic
         | complexity to perform better than the faster alternative
         | because the constant overhead of setting up the "better"
         | algorithm eats all the performance gains. Sequential search is
         | faster than binary search for short lists. You need to test to
         | see where the crossover happens on your platform.
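
A sketch of that crossover test, using Python's bisect as the binary search (the absolute numbers and the crossover point will vary by platform and interpreter):

```python
import bisect
import timeit

# Time a linear scan against bisect on sorted lists of growing length.
def linear_search(lst, x):
    for i, v in enumerate(lst):
        if v >= x:
            return i
    return len(lst)

for n in (4, 64, 1024):
    lst = list(range(n))
    t_lin = timeit.timeit(lambda: linear_search(lst, n - 1), number=10_000)
    t_bin = timeit.timeit(lambda: bisect.bisect_left(lst, n - 1), number=10_000)
    print(f"n={n:>5}: linear {t_lin:.4f}s  binary {t_bin:.4f}s")
```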
        
       | adamnemecek wrote:
        | Count the number of nested loops. If one of the loops does
        | splitting (like binary search), it's O(log n) as opposed to
        | O(n).
       | 
       | That's literally all there is to it.
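
For the simple cases this recipe does cover, it can be made concrete:

```python
# A doubly nested loop over n items does n*n steps; a loop that
# halves its range does about log2(n) steps.
def nested_steps(n):
    count = 0
    for _ in range(n):
        for _ in range(n):
            count += 1
    return count

def halving_steps(n):
    count = 0
    while n > 1:
        n //= 2
        count += 1
    return count

assert nested_steps(100) == 10_000  # O(n^2)
assert halving_steps(1024) == 10    # O(log n)
```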
        
         | nkozyra wrote:
          | That covers most of the algorithms you'll see in a coding
          | interview test, but there are many more complexities outside
          | of search spaces, sorts, and nested O(n) loops.
         | 
         | Even some graph operations will leave familiar territory.
        
         | creata wrote:
         | That's not even close to all there is to it. Complexity
         | analysis is a relatively young field with lots of deceptively
         | simple unsolved problems.
         | 
         | For a simple example of how it's a lot more than that, follow
         | Tarjan's proof of the disjoint set's amortized time
         | complexity[0]. It's not at all obvious, even though the
         | disjoint set is a very simple and practical data structure.
         | 
         | [0]:
         | http://www.e-maxx.ru/bookz/files/dsu/Efficiency%20of%20a%20G...
        
           | andi999 wrote:
            | So what would be a deceptively simple unsolved problem?
        
             | creata wrote:
             | Wikipedia has a list of big-name unsolved problems in
             | complexity theory. Most of these have very simple problem
             | statements.
             | 
             | https://en.wikipedia.org/wiki/List_of_unsolved_problems_in_
             | c...
        
               | adamnemecek wrote:
               | P == NP on analog quantum computers. A light prism
               | performs a diagonalization which can be used to do
               | factorization in O(1).
        
               | rytill wrote:
               | Factorization of arbitrarily large numbers? I doubt it.
        
               | adamnemecek wrote:
               | Yes.
        
               | amelius wrote:
               | On constant-size hardware?
               | 
               | And doesn't getting the data in and out already take
               | O(N)?
        
         | dragontamer wrote:
          | The Strassen algorithm uses 7 multiplications for a 2x2 matrix
          | product. https://en.wikipedia.org/wiki/Strassen_algorithm
          | 
          | A matrix can always be split into sub-matrix blocks, and those
          | blocks themselves form a matrix. Applying the Strassen
          | algorithm recursively therefore gives O(n^log2(7)) ==
          | O(n^2.8ish).
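
The seven products for a single 2x2 block can be written out directly (standard Strassen formulas; a sketch of the base step, not a full recursive implementation):

```python
# Strassen's seven products m1..m7 replace the eight multiplications
# of the naive 2x2 product.
def strassen_2x2(A, B):
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    m1 = (a + d) * (e + h)
    m2 = (c + d) * e
    m3 = a * (f - h)
    m4 = d * (g - e)
    m5 = (a + b) * h
    m6 = (c - a) * (e + f)
    m7 = (b - d) * (g + h)
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4, m1 - m2 + m3 + m6]]

print(strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```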
        
           | adamnemecek wrote:
           | The article covers none of those.
        
         | jkaptur wrote:
         | What if there are recursive function calls?
        
           | miguelrochefort wrote:
            | branches^k
        
           | pmiller2 wrote:
           | Then you get to have fun solving a recurrence relation. :-)
        
             | johndough wrote:
             | If you don't like "fun", WolframAlpha can solve those for
             | you. :-)
             | 
             | https://www.wolframalpha.com/input/?i=f%281%29+%3D+1%2C+f%2
             | 8...
        
               | commandnotfound wrote:
                | Thank you, but the link provided only says that these
                | are the Fibonacci numbers and nothing more. Yes, you can
                | click on the F(n) (how would you even know it's a
                | link?), but anyway.
        
               | pmiller2 wrote:
               | Sure, but if you're not on the spot in an interview, the
               | master theorem is simple enough to apply: https://en.wiki
               | pedia.org/wiki/Master_theorem_(analysis_of_al...
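
A worked instance: the merge-sort recurrence T(n) = 2T(n/2) + n with T(1) = 1, which the master theorem puts in Theta(n log n). Evaluating at powers of two confirms the closed form T(2^k) = 2^k * (k + 1):

```python
from functools import lru_cache

# Solve the recurrence numerically by memoized recursion.
@lru_cache(maxsize=None)
def T(n):
    if n <= 1:
        return 1
    return 2 * T(n // 2) + n

assert T(2**10) == 2**10 * 11
assert T(2**20) == 2**20 * 21
```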
        
       | hisfastness wrote:
        | Not a bad intro, and I appreciate the intention of keeping it
        | short and simple, but personally I think Grokking Algorithms has
        | the best beginner explanations of Big O and basic algorithms /
        | data structures. If I had to share resources with a learner, I'd
        | suggest Grokking as an item to go deeper on after reading this
        | article.
        
       | [deleted]
        
       | lordnacho wrote:
       | The thing that didn't click for me was precisely the "try to
       | count the operations" thing that the author mentions. In fact
       | it's the wrong road to go down, it isn't the point of big-o, and
       | yet that's how you're invited to think about it when the issue is
       | presented in college. It's only natural to think "oh let's look
       | at all the operations and add them up".
       | 
       | I think of it pretty simply, but not in that formal big-
       | theta/big-omega kind of way, because you want to just quickly
       | have an idea of whether you'll write a really slow piece of code
       | in general, not some best or worst case.
       | 
       | The question is simply what growth model dominates the increase
       | of time/space for the algo for each of the inputs? Imagine the
       | algo is already processing millions of input A, and you now
       | increase A by a factor of 10, 100, etc.
       | 
        | This melts away all the setup costs: nothing machine-specific
        | like cache size matters, anything that isn't the dominating
        | factor is swamped, and all you're left thinking about is
        | probably how some loop expands. You also don't need to think
        | about the coefficient of that dominating term, which you would
        | if you tried to write an equation that took all the operations
        | into account.
        
       | breck wrote:
        | One thought I've never had: "I wish I hadn't spent so much time
        | thinking about Big O notation, power laws, and orders of
        | magnitude."
        
       | bjeds wrote:
       | If you are the kind of person who wants to read an article
       | titled "explained as easily as possible", I think you should
       | just avoid saying the phrase "big oh" and instead talk about
       | algorithm runtime more informally, like "quicksort has a
       | worst-case quadratic but average-case n log n runtime".
       | 
       | The risk is that otherwise you will shoot yourself in the
       | foot, maybe during an interview or another situation, as Big O
       | is just one of many members of a family of notations with very
       | specific mathematical definitions, other common ones being
       | small-o and big-theta.
       | 
       | Analysis of algorithms is more difficult than it appears. If you
       | implement an algorithm in a high level language like Python you
       | may get much worse runtime than you thought because some inner
       | loop does arithmetic with bignum-style performance instead of
       | hardware integer performance, for example. In such a case you
       | could talk of big-omega (your analysis is bounded below
       | instead of bounded above, asymptotically).
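       | A minimal Python illustration of that bignum point (sizes,
       | not timings; exact numbers are implementation details):

```python
# Python ints silently promote to arbitrary precision, so the "+"
# that is O(1) on machine words becomes O(number of digits) for huge
# values. Object sizes hint at the hidden cost: one value fits in a
# machine word, the other is a multi-kilobyte bignum.
import sys

small = 2 ** 30        # fits in a couple of internal digits
big = 2 ** 100_000     # ~100,000 bits of arbitrary precision

print(sys.getsizeof(small) < 100)   # True: word-sized object
print(sys.getsizeof(big) > 10_000)  # True: kilobytes of digits
```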
        
         | bitexploder wrote:
         | It is still a useful conceptual framework. Maybe this sparks
         | someone's interest. Agree it is much harder than it appears :)
        
         | zachrose wrote:
         | I appreciate that the author was specific about "if you count
         | addition as 1 operation." Without saying this, it's not obvious
         | that the notation is such a simplified abstraction over
         | operations and their costs, with all the limitations that come
         | with that simplification.
        
         | nicoburns wrote:
         | The problem with that is that informal use of big-O-like
         | notation is a lot more intuitive than the fancy language in
         | your explanation.
         | 
         | Most people who can program can grasp the informal meaning
         | of O(n^2) pretty easily. They may just not connect the word
         | "quadratic" to that same concept.
        
           | [deleted]
        
           | mkr-hn wrote:
           | I think it's a case of there being multiple ways to explain
           | the same thing, and different ones clicking with different
           | people for different purposes.
           | 
           | "O(n^2)" doesn't mean anything to me. Math has never been my
           | strong suit. I prefer words to lay out exactly what it means.
           | "Can take a long time, but might be best for [purpose of
           | specific algorithm]."
        
           | sabas123 wrote:
           | Why wouldn't you be able to just say "worst case n^2?"
        
         | bakuninsbart wrote:
         | Exactly, the link above starts well, but fails in some regards.
         | If you use big O, you should give the mathematical
         | definition and at least briefly explain it. The idea of an
         | upper bound isn't very hard, and it is actually important to
         | understand that an algorithm running in O(n) is also running
         | in O(n log(n)) and in O(n^2). It is at least necessary to
         | understand the other Greek letters, which really do come in
         | handy for a deeper understanding of algorithms.
         | 
         | The list of "common" O's is also _kinda_ bad. In particular,
         | and I see this all the time, I think it is a mistake to jump
         | from O(n^2) straight to O(c^n), as this is the step that
         | leaves polynomial time complexity, and it glosses over the
         | fact that each level of exponent constitutes a different
         | time complexity. Here the mathematical notation O(n^2),
         | O(n^3), ..., O(n^l) is indispensable. Nested loops are
         | probably one of the most commonly relevant applications of
         | O-notation, so this actually has an influence on real-world
         | implementations.
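         | A quick way to see the nesting point (a toy counter, not
         | from the article): count how many times the innermost body
         | of `depth` nested loops actually runs.

```python
# Each nesting level over range(n) multiplies the work by n, so two
# nested loops do n^2 units of innermost work, three do n^3, and so
# on. No constant factor moves an algorithm between these classes.
def nested_op_count(n, depth):
    count = 0
    def loop(level):
        nonlocal count
        if level == 0:
            count += 1  # one unit of innermost work
        else:
            for _ in range(n):
                loop(level - 1)
    loop(depth)
    return count

print(nested_op_count(20, 2))  # 400: O(n^2)
print(nested_op_count(20, 3))  # 8000: O(n^3), a different class
```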
        
         | Edmond wrote:
         | Agreed.
         | 
         | The use of "Big O Notation" itself as a way of referring to
         | algorithmic complexity seems like a misnomer, considering that
         | the topic is about analysis rather than the notation used to
         | express the results of such analysis.
         | 
         | Unfortunately academic textbooks have terrible "UX", so
         | students end up dealing with confusing presentation of topics,
         | hence we're stuck with labels such as "Big O Notation".
        
           | bjeds wrote:
           | I hear you.
           | 
           | Whether I like it or not, by now big o notation has fallen
           | into the category of "folklore" that working engineers use
           | and abuse informally without being very precise about it.
           | 
            | It's like the "proof by engineer's induction": if some
            | statement P(n) is true for P(0), P(1) and P(2), then P(n)
            | is true for all n \in Z. :-)
           | 
            | Similarly, if an engineer states that an algorithm has a
            | runtime of O(f(n)), that should probably be read as "as n
            | grows very large (whatever that means) the runtime
            | approximates (whatever that means) some bound (below,
            | above, whatever) f(n). yolo.".
           | 
           | But people should at least be _aware_ that they are being
           | imprecise about it.
           | 
           | If I read a blog post or StackOverflow post or whatever and I
           | see big-theta notation I know that the person is probably
           | precise with her definition. If I see big-o then it may be
           | correct, or accidentally correct (happens often due to the
           | nature of the definition) or mistaken.
        
             | tshaddox wrote:
             | There can be appropriate levels of imprecision. Arguably
             | all communication necessarily requires that. This is only a
             | problem if it leads to an unnoticed miscommunication. For
              | example, I suspect most of the time when an engineer
              | refers to an algorithm as being in O(n^2), they intend
              | to preclude the possibility that the algorithm is also
              | in O(n).
        
               | [deleted]
        
               | bonzini wrote:
                | Usually "average case" or "worst case" O(n^2) means
                | it's really big O, while "best case" or "always"
                | O(n log n) means big Theta.
        
         | melenaboija wrote:
         | I agree with what you say, and I'm not a Python fanatic,
         | but these high-level-language stigmas catch my attention.
         | 
         | If you try to implement your own, let's say, intersection
         | of sets, you will probably get at best the same performance
         | as Python, given a reasonable amount of implementation time.
         | 
         | I guess my point is that seeing the comment of Python in a
         | thread like this can also be confusing. Bad (or maybe I should
         | say "not appropriate for your use") implementations can exist
         | in any language.
        
         | tester756 wrote:
         | >Analysis of algorithms is more difficult than it appears. If
         | you implement an algorithm in a high level language like Python
         | you may get much worse runtime than you thought because some
         | inner loop does arithmetic with bignum-style performance
         | instead of hardware integer performance, for example. In such
         | case you could talk of big-omega (your analysis is bounded-
         | below instead of bounded-above, asymptotically).
         | 
         | Of course, that's why Big O says one thing while the
         | compiler, CPU and caches may say another.
        
         | cortesoft wrote:
         | I think the main purpose of an "explained as easily as
         | possible" article is to help the reader understand when SOMEONE
         | ELSE uses the term.
         | 
         | Sure, I can choose to use informal language, but I can't stop
         | someone else from using big o notation when speaking to me or
         | writing something I want to read.
        
         | indymike wrote:
         | I didn't really read that article as being for a developer. I
         | read it and thought, hey, this one would be good to share with
         | a few project managers & business unit managers. Any time
         | people who are not programmers (especially ones we have to work
         | with) start to better understand what we're really doing, it is
         | a good thing. Articles like this are superb for helping them
         | understand that developers do have a disciplined and rigorous
         | way of solving problems.
         | 
         | I do agree with you that these articles do leave a lot of
         | detail and precision out. They tend to give the reader a
         | superficial understanding of the subject... but a superficial
         | understanding may be enough to help.
        
         | cgriswald wrote:
         | For Python code, I usually just ask critics if they've tested
         | it (because I have). Frequently using a built-in will be faster
         | than a custom loop even if at the surface level you're going
         | over the data more than once.
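         | That "just test it" advice in executable form (a generic
         | benchmark sketch, not the parent's actual code):

```python
# Time a C-implemented built-in against an equivalent hand-written
# loop. Both are O(n); in CPython the constant factors differ a lot,
# which only measurement will show.
import timeit

data = list(range(100_000))

def manual_sum(xs):
    total = 0
    for x in xs:
        total += x
    return total

t_builtin = timeit.timeit(lambda: sum(data), number=20)
t_manual = timeit.timeit(lambda: manual_sum(data), number=20)
print(f"built-in sum: {t_builtin:.4f}s, manual loop: {t_manual:.4f}s")
```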
        
           | mhh__ wrote:
           | Anecdotally (it was a contrived sorting benchmark, so the
           | exact numbers don't really matter), Python started off
           | about 3 times slower than D and grew at what would be
           | considered the same O(), but the coefficient was enormous:
           | to the point where D was taking 3s for n-million-element
           | arrays and Python closer to one minute.
           | 
           | Node was actually very impressive, roughly as fast as D's
           | reference compiler (cutting edge optimisations maybe 2
           | decades ago) in release mode.
        
         | sixstringtheory wrote:
         | > If you implement an algorithm in a high level language like
         | Python you may get much worse runtime than you thought because
         | some inner loop does arithmetic with bignum-style performance
         | instead of hardware integer performance
         | 
         | If a language/library can change the dominant asymptotic term
         | of an algorithm like that, instead of just the constant factor,
         | that is a problem. Does that really happen with python or is
         | this exaggeration? I'm inclined to accept another reason to
         | dislike these super high level interpreted langs but I've never
         | seen something that e.g. appears written to be logarithmic to
         | become quadratic because of library implementation details.
        
           | Brian_K_White wrote:
           | The language doesn't matter, nor even whether it's high-
           | or low-level.
           | 
           | You can write a short function in any language that
           | _looks_ O(1) if you only look at the surface or high
           | level of it. Even assembly. Meanwhile, in the middle of
           | that routine is a single call to another function or
           | macro which may be O(1) or O(n!).
           | 
           | Python was just an example.
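           | A tiny (hypothetical) Python example of that hidden cost:

```python
# Looks O(1) at the surface: one statement, no visible loop. But
# list.index performs a linear scan, so this "constant-time" helper
# is really O(n) in the length of the list.
def position_of(items, target):
    return items.index(target)  # hidden O(n) call

print(position_of(["a", "b", "c"], "c"))  # 2
```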
        
             | a1369209993 wrote:
             | > Even assembly.
             | 
              | Although high-level languages like Python, or to a
              | lesser but less excusable degree C++, are _more_
              | susceptible to this, because more features can secretly
              | be greater-than-constant-time function calls.
        
         | Al-Khwarizmi wrote:
         | I know the rest of the notations in the family but, to be
         | honest, even in most algorithmics textbooks they tend to use
         | Big O like 90% of the time, even in contexts where Big Theta
         | would be more precise (e.g. "mergesort is O(n log n) in all
         | cases"). Let alone in more informal contexts.
         | 
         | I don't especially like it, but it's OK because it's not a lie.
         | And I do think for people who just want the gist of the
         | concept, like readers of this piece, it's enough and it's not
         | worth being fussy about it.
         | 
         | What irks me is people who use the equal sign as in T(n)=O(n
         | log n), though. Why would a function equal a class? Set
         | membership notation does the job just fine.
        
       ___________________________________________________________________
       (page generated 2021-01-16 23:00 UTC)