[HN Gopher] Zero one infinity rule
___________________________________________________________________
Zero one infinity rule
Author : azhenley
Score : 60 points
Date : 2023-03-18 18:01 UTC (4 hours ago)
(HTM) web link (en.wikipedia.org)
(TXT) w3m dump (en.wikipedia.org)
| readthenotes1 wrote:
| Zero one infinity (many) is a good rubric for test driving, and
| for thinking about how close to infinity your system can/should
| actually go
| furyofantares wrote:
| > It argues that arbitrary limits on the number of instances of a
| particular type of data or structure should not be allowed.
|
| Either this is a misuse of "arbitrary" or it's not really arguing
| for no limits beyond 1. In the Carmack example from another
| comment, if you know your solution isn't going to scale well
| beyond 1000 items, that's not an arbitrary limit, and it's not 0
| or 1 either. When we choose not to allow users to back up their
| hard drive in the username field of their account, that's not an
| arbitrary limit.
| xlii wrote:
| I have a feeling that a lot of commenters mix up structure design
| with structure implementation.
|
| The rule talks about the design (and I find the approach
| interesting), yet the implementation can (and most likely
| should) be limited.
|
| E.g. we are designing a world. In this design we say children can
| have toys. It doesn't make sense to design the universe so that
| children are limited to 5 toys or any other arbitrary number. But
| no room can fit an infinite number of toys, so the implementation
| is limited. For some it's 500 toys, but for others it's 500,000
| toys.
|
| Names or parents are good real-world examples. Sure, most people
| have 2 parents or at most 2 names, but this rule argues that if
| there can be more than one, it shouldn't be limited at all (which
| also makes sense in real life, even though an IT system might
| limit names to 256 characters because of hardware).
| eternalban wrote:
| Then change the name of the rule :)
|
| _Zero, One, or Many_. I think that would satisfy all
| concerned.
| xlii wrote:
| Absolutely agree. However that's for the rule creator. I
| think that infinity in this context sounds slightly poetic.
| antiquark wrote:
| RIP the byte.
| layer8 wrote:
| It's just one octet.
| travisjungroth wrote:
| A metarule for using rules like this: instead of thinking of it
| as a rule, think of it as an option you should usually consider.
| You get a lot of the benefits and less of the costs if you just
| ask questions like "Would it be easier to solve for N than 3?
| Easier to use and maintain?". The answer won't always be yes or
| no, but you'll be more likely to get the right answer if you ask
| the question.
| Kim_Bruning wrote:
| Am I the only person who sometimes makes an exception for "two" ?
| ChainOfFools wrote:
| I seem to recall that there are certain languages (human
| languages) that have cases specifically for talking about pairs
| of objects, situated semantically between singular and plural.
| I think one of these is an archaic form of Greek, maybe ancient
| or Hellenistic.
| mmphosis wrote:
| How about a sign bit?
|
| https://en.wikipedia.org/wiki/Unum_(number_format)
| playingalong wrote:
| It has always "bothered" me that AWS allowed for up to two API
| keys per IAM user.
| jdreaver wrote:
| That's probably so you can rotate your keys without downtime.
| adhesive_wombat wrote:
| A ping-pong buffer certainly seems like generalising to a ping-
| pong-pang-peng-pung-p(infinity)ng buffer would be overkill.
| hprotagonist wrote:
| i fairly regularly want pairwise, yep.
| jerf wrote:
| No.
|
| But I have the decency to feel bad and leave an apologetic
| comment.
| GauntletWizard wrote:
| Wherever I need "Two" I usually find that what I actually need
| is a one and a many; there's not a master and a replica,
| there's one master and many replicas. There's not two places
| that the data is when doing a raid restripe, there's many
| places the data probably is and one place it should be.
| grensley wrote:
| Two is quite relevant for relationships
| ChickeNES wrote:
| Well, if you ignore polyamory and polygamy...
| layer8 wrote:
| You can have n-ary relationships. Relational databases are
| built on that.
| alex_sf wrote:
| The argument against two would be: if you can do two, why can
| you not do three?
| GauntletWizard wrote:
| I learned this rule through, of all things, the children's series
| Animorphs. In one book they encounter a colony of ants, who
| mention
| that "You are one, we are many" and count that way; any finite
| number is one, and infinite numbers are many.
| hsjqllzlfkf wrote:
| John Carmack argued the opposite. He said he would hardcode
| limits into his data structures - limits that would never be hit
| under normal operating circumstances. He argued that when you
| design software you should have an idea of the circumstances
| under which it will run, and optimize for those. The fact that
| people normally don't do this is why software often lags - that
| algo you implemented worked just fine when its hashmap had 1,000
| keys, but not when it has 1M keys.
|
| I haven't articulated his point very well, but I lost interest
| in this comment midway through.
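To make the idea concrete, here is a minimal sketch (names, element type, and capacity are all assumed for illustration) of the hardcoded-limit style described above: a fixed-capacity list that fails loudly at its cap instead of growing.

```rust
// Fixed-capacity list in the hardcoded-limit style: CAP is chosen
// up front from the expected operating circumstances, and the
// structure refuses to grow past it rather than allocating.
struct Bounded<const CAP: usize> {
    len: usize,
    items: [u32; CAP],
}

impl<const CAP: usize> Bounded<CAP> {
    fn new() -> Self {
        Bounded { len: 0, items: [0; CAP] }
    }

    // Returns false instead of allocating when the limit is hit.
    fn push(&mut self, x: u32) -> bool {
        if self.len == CAP {
            return false; // "never hit under normal operation"
        }
        self.items[self.len] = x;
        self.len += 1;
        true
    }
}
```

The point of the style is that the `false` branch is an explicit, testable event rather than a silent slowdown.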
| cjfd wrote:
| Well, John Carmack is a game programmer, at least originally.
| In games one has strong upper limits on how long it can take to
| calculate the next frame. So, it may very well be that
| something can be said for this in this context. I am not sure
| in general, though. If one limits the length of the name of a
| person in some record one can start waiting until one day a
| person arrives that has a name that is one character longer and
| then a programmer needs to get involved to increase the limit.
| Okay, maybe there exists a ridiculously large maximum name
| length such that this problem is unlikely to occur. Then one
| may still have the problem that if all people would have names
| that approach this limit, the system still might get too slow
| as well. So maybe we should impose a limit on the sum of all
| the lengths of all the names in the system. Hmmm... Kind of
| gets a bit too complicated and arbitrary for my tastes. I think
| for many systems Carmack's advice really is not very good.
| layer8 wrote:
| There should be limits. What if someone pastes the text of
| the bible into the name field? What if you want to send an
| email containing the name, and you hit the maximum email size
| accepted by the SMTP server (they all have a limit)? What if
| you want to send a postal letter and print the name on the
| envelope?
|
| Not setting explicit limits either means that you still have
| implicit limits, but you don't know what they are and where
| and when you'll run into them, or it means you're opening
| yourself up to DoS attacks, because your database or your
| backups or your AWS bill will run amok.
| GauntletWizard wrote:
| I have legitimately used the complete works of William
| Shakespeare, unabridged, as a password. Even that is in
| megabytes, and not significant load for bcrypt2.
|
| Not that there should be no limits at all, but the upper
| bound should be relatively high.
| asynchronous wrote:
| Doesn't bcrypt essentially truncate every source input
| to no longer than 72 bytes?
| mhh__ wrote:
| > 1m keys
|
| As opposed to not working at all?
| quelsolaar wrote:
| I've come to agree with this. Say you have a struct that
| contains a name. If you limit the name size to, say, 64 bytes,
| then you can store it in the struct; otherwise you need a
| separate allocation and an indirection. This makes the code
| slower, more error prone and more complex to use. So think hard
| about when "infinite" is justified.
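A sketch of the tradeoff described above, using the 64-byte cap from the comment; the type names and constructor are hypothetical.

```rust
// Inline storage: the name lives inside the struct, so the whole
// record is one contiguous allocation with no indirection.
struct InlineRecord {
    name_len: u8,
    name: [u8; 64], // assumed 64-byte cap from the comment
}

// Unbounded alternative: the name needs its own heap allocation.
#[allow(dead_code)]
struct HeapRecord {
    name: String, // pointer + length + capacity, bytes elsewhere
}

// Hypothetical constructor enforcing the hard limit.
fn make_inline(name: &str) -> Option<InlineRecord> {
    let bytes = name.as_bytes();
    if bytes.len() > 64 {
        return None; // over the cap: the caller must handle it
    }
    let mut buf = [0u8; 64];
    buf[..bytes.len()].copy_from_slice(bytes);
    Some(InlineRecord { name_len: bytes.len() as u8, name: buf })
}
```

The cost of the limit is the `None` branch; the benefit is that every record has the same size and no pointer chasing.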
| meghan_rain wrote:
| > more error prone
|
| not when using Rust ;-)
| AlotOfReading wrote:
| You should also be careful about incorrectly imposing limits
| on human input, as every "falsehoods programmers should know
| about x" repeatedly hits on. To continue the name example,
| there are people with legal names and titles that take far
| more than 64 bytes to store and truncation is not an
| appropriate solution. You _should_ store the entire thing,
| even if that's a bit more difficult technically. The other
| limit also applies: some people don't have any characters in
| their legal names at all, like my mother (her birth
| certificate name was blank).
| PKop wrote:
| Or you can make a tradeoff and simply exclude these
| outliers from using your product, instead of optimizing for
| their unique case.
| quelsolaar wrote:
| Who said anything about the name being the name of a
| person?
| zeroonetwothree wrote:
| I think "should" is a bit strong here. There are always
| tradeoffs. Having support for very long names can cause
| other problems, like the opportunity for abuse, making it
| hard to format UIs, making human inspection of data much
| harder, or even limiting algorithmic efficiency.
|
| For example, US Passports have a relatively small limit for
| name length (I think it's around 20 characters each for
| first/middle/last).
| jiggawatts wrote:
| "Who you are is inconvenient for me."
| GMoromisato wrote:
| Conversely, "The needs of the many outweigh the needs of
| the few, or the one."
| zeroonetwothree wrote:
| This is true, but note that the "Zero one infinity rule" only
| applies to the _number of items_, not the size of an item.
| Limits on size are a lot more typical anyway.
|
| (Yes I know someone will point out that to store a list of
| items requires a linear amount of space, so therefore an
| unbounded number of items inherently implies an unbounded
| amount of space)
| MawKKe wrote:
| Sure if you know the operating domain, just hard code some
| sensible upper limit and move on. Not that this way is "better"
| in all cases. YAGNI and all that
| adhesive_wombat wrote:
| You can still have both: an algorithm that technically has no
| formal numerical limit, but have warnings or errors when it
| reaches some condition that is clearly outside the design
| scope.
|
| For example, you might have a system that warns somehow if you
| start inserting items at the front of a vector with over a
| million items (when you originally expected the vector to be
| used for tens of items). If the structure is suddenly being
| used for something way outside of its design scope, it's
| probably in need of a re-think.
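A minimal sketch of that warn-but-don't-forbid approach, with an assumed threshold of one million items (the function name and threshold are made up for illustration):

```rust
// Assumed threshold: the structure was designed for tens of
// items, so a million is taken as "outside design scope".
const DESIGN_SCOPE: usize = 1_000_000;

// No formal limit - the insert always succeeds - but we report
// when usage drifts far outside what the code was designed for.
fn push_front_checked(v: &mut Vec<i32>, x: i32) -> bool {
    let outside_scope = v.len() >= DESIGN_SCOPE;
    if outside_scope {
        eprintln!(
            "warning: front-insert on {} items; outside design scope",
            v.len()
        );
    }
    v.insert(0, x);
    outside_scope
}
```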
| stefan_ wrote:
| Nope. "No formal limit" or "infinity" is just another way of
| saying "dynamic memory allocation". There is a reason you
| usually find this advice in game adjacent places; to punt
| everything to the dynamic memory allocator makes it very
| difficult to know if you are keeping within the memory-
| starved limits of e.g. the game console you are developing
| for, or on mobile to keep a garbage collector from stepping
| in and ruining your day.
| adhesive_wombat wrote:
| "Nope." An "algorithm" can be about things other than
| memory allocation. Say you have a collection of sensors that
| you poll. ZOI is saying that the system that polls them
| shouldn't[1] have some kind of hard assumption that there
| will only ever be, say, specifically 4. You could still
| statically allocate the storage on a given system, for
| example in an embedded system where the software is
| compile-time configured for a given hardware.
|
| However, if you pass the "poll_sensors" function a size of
| 60 million when the system was designed for "about 4, I
| guess", it's likely that you're operating the algorithm in
| a regime it wasn't designed for. You may (or may not, this
| is just another trade-off) wish to know about it.
|
| [1]: of course you can always construct exceptions. If you
| follow every rule you subscribe to dogmatically, then
| you're doing something more akin to religion than
| engineering.
| shoo wrote:
| > when you design software you should have an idea under what
| circumstances it will run and optimize for
|
| This philosophy is sometimes referred to as "data-oriented
| design"
|
| see also: Mike Acton's CppCon 2014 talk:
| https://www.youtube.com/watch?v=rX0ItVEVjHc
|
| Data-oriented design seems appropriate for shipping high-
| performance games, and shipping them on schedule and on budget.
| The goal is not to design and code the algorithm with the best
| theoretical scalability characteristics - and publish it in a
| paper, or invent the most mathematically general version of the
| algorithm to be factored out into a library for use by 100
| hypothetical future projects. The goal is not to design a
| system using hexagonal architecture (or whatever) that can more
| flexibly accommodate arbitrary changes of requirements in
| future, at the expense of performance. The source code is not
| the product that the customers buy, the game is the product.
|
| The goal is to understand the problem you actually have -
| processing data of a certain finite size or distribution of
| sizes, on some fixed finite set of target hardware platforms,
| and figure out how to do this in a way that hits performance
| targets, with a limited budget of engineering time, to keep
| moving forward on schedule toward shipping the product.
| roflyear wrote:
| Only works when failing early is better than running slow,
| which IME is not often the case
| TazeTSchnitzel wrote:
| You can see this kind of design in action in the Source engine,
| perhaps inherited from Quake (which is partly Carmack's work of
| course). There are hard limits on things like how many objects
| can exist at once. Occasionally those get hit... usually due to
| a memory leak![0] So the artificial limit helps find real
| problems. That's also been my experience with PHP, which limits
| the amount of memory a request or script execution can consume
| by default.
|
| [0] https://www.youtube.com/watch?v=pw2X1yhrDdE -- check out
| shounic's other videos too, there's lots of interesting content
| there about how complex game systems fail, including about
| these limits
| at_a_remove wrote:
| I've had a different take on this that goes "Zero, One, Many."
| And that when you end up really using your software, the things
| that were "one" often become "many," and the unthinkable zeroes
| frequently become ones.
| zeroonetwothree wrote:
| While this has some theoretical merit, IME, limits are quite
| useful to catch bugs (or prevent degenerate cases). For example I
| was recently working on a permissions system wherein there can be
| members of groups. I set a reasonable limit on the size of a
| group based on a maximum of actual usage and what I could foresee
| being reasonable. A few days later, this limit was triggered, and
| I got a bug report. But it turned out there was some bad caller
| that had essentially an infinite loop that was adding more and
| more members to the group. Without the limit, this probably
| would only have been detected after it cascaded and caused
| other failures.
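A sketch of such a guard, with an assumed cap of 5,000 members; the names are hypothetical:

```rust
// Assumed cap, picked from actual usage plus foreseeable growth.
const MAX_GROUP_SIZE: usize = 5_000;

// Failing loudly here is what surfaces a runaway caller: the
// limit turns a slow cascade into an immediate bug report.
fn add_member(group: &mut Vec<u64>, user: u64) -> Result<(), String> {
    if group.len() >= MAX_GROUP_SIZE {
        return Err(format!(
            "group at limit ({}); refusing to add member",
            MAX_GROUP_SIZE
        ));
    }
    group.push(user);
    Ok(())
}
```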
|
| As another example, Facebook famously had (maybe still has? I
| have no idea) a limit of 5,000 friends. This was based on the
| idea that you were supposed to be friends with people you
| actually kind of know at least a little bit, which would probably
| fit within 5,000 for most people. After some more popular users
| started hitting the limit, it led to the development of public
| profiles/pages as an alternative product to handle that use case.
| So in this case the limit allowed the company to identify a new
| market that they could better serve in an alternative way.
|
| BTW, the George Gamow book that the name comes from (One, Two,
| Three... Infinity) is a great read for anyone interested in
| popular descriptions of physics or math. He also lived a very
| interesting life, I would recommend his autobiography "My World
| Line".
| josephg wrote:
| Yeah; expected limits are also fantastically useful in
| performance engineering. It's very common that your code needs
| to handle an arbitrarily sized input, but 99% of the time the
| input will be bounded (or generally simpler). Special-casing
| the common code path can make a lot of code run much faster.
|
| For example, in some code I'm writing at the moment I have
| lists of integers all over the place. I call them lists -
| usually they only have 1 element. Sometimes they have 2
| elements (10%) and very occasionally more than 2 elements or
| they're empty (<1%).
|
| If I used a language like Javascript, I'd use Arrays. But
| arrays are quite expensive performance wise - they need to be
| allocated and tracked by the GC and the array contents are
| stored indirectly.
|
| Instead, I'm using an array type which stores up to 2 items
| inline in the container object (or the stack) without
| allocating. It only allocates memory on the heap when there are
| 3 or more items. This decreases allocations by 2 orders of
| magnitude, which makes a really big difference for performance
| in my library. And my code is just as readable.
|
| I'm using the smallvec crate. There's plenty of libraries in C
| and Rust for this sort of thing in arrays and strings. Swift
| (like obj-c before it) builds small string optimizations into
| the standard library. I think that's a great idea.
|
| https://crates.io/crates/smallvec
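The smallvec crate implements this; a toy std-only version of the same idea (up to 2 items inline, spilling to the heap on the third) might look like:

```rust
// Up to 2 items stored inline with no allocation; the heap is
// only touched when a third item arrives.
enum SmallList {
    Inline { len: u8, items: [i32; 2] },
    Heap(Vec<i32>),
}

impl SmallList {
    fn new() -> Self {
        SmallList::Inline { len: 0, items: [0; 2] }
    }

    fn push(&mut self, x: i32) {
        match self {
            SmallList::Inline { len, items } if (*len as usize) < 2 => {
                items[*len as usize] = x;
                *len += 1;
            }
            SmallList::Inline { len, items } => {
                // Third item: spill the inline contents to a Vec.
                let mut v = items[..*len as usize].to_vec();
                v.push(x);
                *self = SmallList::Heap(v);
            }
            SmallList::Heap(v) => v.push(x),
        }
    }

    fn len(&self) -> usize {
        match self {
            SmallList::Inline { len, .. } => *len as usize,
            SmallList::Heap(v) => v.len(),
        }
    }
}
```

With the 1-element case dominating, almost all lists stay in the `Inline` variant and never allocate.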
| pyuser583 wrote:
| The Wiki page says the problem isn't with limits, but with
| _arbitrary_ limits.
|
| When a program limits me to 256 of something, it doesn't seem
| arbitrary.
|
| I've heard stories of programmers setting limits to multiples
| of two simply so nobody asks why.
| crazygringo wrote:
| Totally agreed. I think it's absolutely a best practice to set
| limits.
|
| Code is generally designed to operate within certain
| "reasonable" performance boundaries, and when it goes outside
| those you need to think whether code should be rewritten to
| accommodate it.
|
| Just a tiny example, but I regularly deal with long (800+ page)
| PDFs on my iPad, reading parts of them in the stock Books app.
| When I select text to highlight, the context menu unfortunately
| puts "select all" directly next to "highlight". Of course,
| every so often I accidentally hit "select all", and then I have
| to force-close the app because otherwise it just freezes for 10
| minutes as it extracts and selects the text on every single
| page.
|
| When really, it needs a limit to detect that, hey, if the PDF
| is over 20 pages long then just don't allow "select all"
| anymore, because this isn't the right tool for the job.
| psychphysic wrote:
| Although that's also ripe for the annoyance of not being able
| to select all in border cases.
|
| I'm much in favour of don't fuck with the basics (i.e. the
| core expected UI features like select, copy and paste).
| solveit wrote:
| I think a better way to formulate the same rule of thumb is
| that zero, one, and infinity are the only numbers that don't
| need to be _justified_. You should always have an answer to the
| question "why 5000 and not 6000?"
|
| Of course, the justification can be as simple as "it has to be
| some number and 5000 is as good as any", but that opens the
| door to the discussion of whether 5000 really is as good as any
| other number, which is often surprisingly enlightening.
| pixl97 wrote:
| What should the limits be? And are the limits clear to the
| users of the system? Are they clear to other components of the
| system?
|
| I agree we should have limits in software because we don't have
| unlimited memory and processing time, but I commonly find these
| limits are encoded by the imagination of the programmer working
| on the software at the time, and often you find limits that
| were not considered in the systems design of the product.
| HervalFreire wrote:
| >While this has some theoretical merit,
|
| What theoretical merit? It sounds like a completely made up
| rule of thumb based on a person's anecdotal experience.
| heyzk wrote:
| Which, I assume, led to one of my favorite idioms: there are only
| three numbers in computing: 0, 1, and n.
| deergomoo wrote:
| I think this is something programmers intuitively learn pretty
| quickly (in fact I'm only just learning that there is actually a
| name for this idea), but it's an interesting principle to explain
| to non-developers.
|
| Back when I was doing agency work, on more than one occasion I
| had clients question why modifying some entity to permit multiple
| related entities instead of one would be more work than they were
| anticipating.
|
| The question was usually along the lines of "would it help if we
| were to limit related $foobars to three?". But of course, from a
| structural perspective supporting three related records is
| usually exactly the same amount of work as supporting an
| arbitrary number. Yeah I could probably hack in some fixed
| amount of extra
| records, but that's just going to create an increasingly large
| mess when your business grows and you realise
| three...twelve...one hundred wasn't enough. Might as well do it
| properly from the start.
| dragontamer wrote:
| This is probably a rule I believed in as a beginner, and
| completely don't believe in anymore today.
|
| Lets take a look at an exceptionally well designed system:
| single-error correction double-error detection codes. Also known
| as "ECC RAM". The implementation of this only works for *exactly*
| 64 bits + 8-parity bits.
|
| Now sure, we can talk about how we can generalize SECDED ECC to
| different numbers of bits or whatever. But I'll also point out
| that the only implementation in widespread use is ECC Memory, the
| 64-bits + 8-parity bits (72-bits total) implementation.
| Nullabillity wrote:
| It is generalized, you can chain together infinitely many
| blocks.
| alex_sf wrote:
| I would argue your example is a special case of only allowing
| one.
| dragontamer wrote:
| But it detects two-bit errors.
|
| Which is nonzero, non-one, and non-infinity.
|
| --------
|
| SEC without DED is the classic implementation of Hamming
| error correction codes. The double error detection (aka DED)
| part is grafted on top of the theory, because it's useful in
| practice.
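For the curious, the same SECDED construction can be sketched at toy scale: a Hamming(7,4) code plus one overall parity bit, i.e. 4 data bits instead of ECC RAM's 64. Any single flipped bit is corrected; any two flipped bits are detected but not corrected.

```rust
// Toy SECDED: Hamming(7,4) plus an overall parity bit in bit 0.
// Encodes 4 data bits into 8; ECC RAM does the same with 64+8.

fn encode(data: u8) -> u8 {
    // d1..d4 are the low 4 bits of `data`.
    let d = [data & 1, (data >> 1) & 1, (data >> 2) & 1, (data >> 3) & 1];
    // Each parity bit covers the positions whose index has that bit set.
    let p1 = d[0] ^ d[1] ^ d[3]; // positions 1,3,5,7
    let p2 = d[0] ^ d[2] ^ d[3]; // positions 2,3,6,7
    let p3 = d[1] ^ d[2] ^ d[3]; // positions 4,5,6,7
    let cw = (p1 << 1) | (p2 << 2) | (d[0] << 3) | (p3 << 4)
        | (d[1] << 5) | (d[2] << 6) | (d[3] << 7);
    cw | (cw.count_ones() & 1) as u8 // bit 0: overall parity
}

// Some(data) if clean or correctable, None on a double error.
fn decode(mut cw: u8) -> Option<u8> {
    let bit = |w: u8, i: u8| (w >> i) & 1;
    // The syndrome is the position of a single flipped Hamming bit.
    let syndrome = (bit(cw, 1) ^ bit(cw, 3) ^ bit(cw, 5) ^ bit(cw, 7))
        | ((bit(cw, 2) ^ bit(cw, 3) ^ bit(cw, 6) ^ bit(cw, 7)) << 1)
        | ((bit(cw, 4) ^ bit(cw, 5) ^ bit(cw, 6) ^ bit(cw, 7)) << 2);
    let parity_ok = cw.count_ones() % 2 == 0;
    if syndrome != 0 && parity_ok {
        return None; // two bits flipped: detected, not correctable
    }
    if !parity_ok {
        cw ^= 1 << syndrome; // single flip (syndrome 0 => parity bit)
    }
    Some(bit(cw, 3) | (bit(cw, 5) << 1) | (bit(cw, 6) << 2) | (bit(cw, 7) << 3))
}
```

Going from 4 data bits to 64 just means more parity positions plus the overall bit, which is where the 72-bit DIMM layout comes from.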
| crazygringo wrote:
| I've read the article but I'm having trouble understanding the
| context of what this was _in response to_. The article quotes:
|
| > _I formulated it in the early 70s, when I was working on
| programming language design and annoyed by all the arbitrary
| numbers that appeared in some of the languages of the day._
|
| Was it just an issue with languages in the 70's then? Because I'm
| struggling to think of any "arbitrary number" limits in modern
| languages.
|
| Obviously we have numerical limits depending on if a variable is
| a signed 32-bit integer or whatever. But those aren't arbitrary.
|
| Like would older languages just arbitrarily say there couldn't be
| more than 100 elements in an array, or something?
| layer8 wrote:
| Lots of examples (e.g. maximum number of nesting levels), see
| for example:
|
| http://www.tendra.org/tdfc2-config/chapter2
|
| https://www.ibm.com/docs/en/epfz/5.3?topic=reference-limits
|
| One aim of finite implementation limits is to define which
| programs are guaranteed to compile successfully, so that you
| don't run into the situation that a program compiles on one
| implementation but not on another implementation, which would
| raise the question of whether it's the program or the compiler
| that is nonconforming. If both are conforming, you want the
| program to compile 100% of the time. This isn't possible if
| programs can have unlimited complexity.
|
| The mindset is similar to embedded programming, where you have
| limited memory and absolutely don't want to run into the
| situation that there's not enough memory left to perform the
| operation that needs to be performed.
| gdprrrr wrote:
| Early C compilers had a limit of 6 characters for an identifier
| neilv wrote:
| I first heard of "no arbitrary limits" in connection with GNU
| software. It was one of the ways that GNU software not only had
| different licensing than alternatives, but was also usually
| technically superior.
|
| (At the time, there were a lot of different workstation computer
| vendors, each with their own flavor of Unix. But where GNU tools
| existed, they'd usually be better than all of the Unix
| workstation ones. One exception was that the premium vendor C and
| C++ compilers generated better code and/or were more trusted than
| GCC at the time. GCC didn't take over until Linux, initially on
| x86.)
___________________________________________________________________
(page generated 2023-03-18 23:00 UTC)