[HN Gopher] The intricacies of implementing memoization in Ruby
___________________________________________________________________
The intricacies of implementing memoization in Ruby
Author : thunderbong
Score : 123 points
Date : 2024-12-23 13:37 UTC (1 days ago)
(HTM) web link (denisdefreyne.com)
(TXT) w3m dump (denisdefreyne.com)
| User23 wrote:
| It's fun how the runtime in seconds for the naive Fibonacci is:
|         36: 14930352   (1.2s)
|         37: 24157817   (2.0s)
|         38: 39088169   (3.2s)
|         39: 63245986   (5.2s)
|         40: 102334155  (8.3s)
|         41: 165580141  (13.5s)
|         42: 267914296  (21.8s)
|
| What a familiar pattern in those runtimes!
| izietto wrote:
| Fibonacci never stops surprising!
| JadeNB wrote:
| That's a neat observation! But isn't it for the naive reason
| that, with negligible overhead, the run time of executing
| `fib(n) = fib(n - 1) + fib(n - 2)` is the run time of executing
| `fib(n - 1)`, plus the run time of executing `fib(n - 2)`?
|
| On the other hand, if someone had asked me for an estimate on
| the run time, I probably would have tried breaking out the
| master theorem instead of thinking of this, so I don't want to
| downplay the observation.
| bmm6o wrote:
| Right. Also note that the non-optimized implementation boils
| down to computing F(n) = 1+1+1+1+...+1, with a function call
| tree that has 2F(n) nodes. So no matter what the relative
| cost is for addition and function calls, the total time
| should go like F(n).
| thih9 wrote:
| I know it's pointless but now I'd like to see a fibonacci
| implementation that returns these runtimes as the result.
| kazinator wrote:
| Another kind of fun with Fibonacci (sort of):
|
| https://news.ycombinator.com/item?id=14331627
|
| floor(x + 0.5) wants to be round(x), that function having been
| introduced in C99. Just remnants of old school C coding.
| thih9 wrote:
| > My preferred approach to [thread safety] is to explicitly not
| think about thread safety, and delegate this concern to the
| caller.
|
| Good call IMO. If I were the author I'd have implemented this; it
| would mostly work for current use cases, but it would be a pain to
| maintain and impossible to extend. I learned something today.
| tomchuk wrote:
| Can't mention Fibonacci and memoization in the same sentence
| without me breaking out my favorite Python party trick:
|         def fib(n, cache={0: 0, 1: 1}):
|             if n not in cache:
|                 cache[n] = fib(n-1) + fib(n-2)
|             return cache[n]
| jaggederest wrote:
| The directly translated ruby version (from stack overflow of
| course) is even shorter:
|         def fib(n, cache = Hash.new { |h, k| h[k] = k < 2 ? k : h[k-1] + h[k-2] })
|           cache[n]
|         end
|
| It runs out of stack around 7146 on my machine at least. The
| python one is limited by the recursion depth limit in my test
| but of course that's configurable at your own risk.
| vidarh wrote:
| If you're first going to golf it, endless-def:
|         def fib(n, cache = Hash.new { |h, k| h[k] = k < 2 ? k : h[k-1] + h[k-2] }) = cache[n]
| jaggederest wrote:
| It's actually kind of ungolfed. The default version would be just:
|         fib = Hash.new { |h, k| h[k] = k < 2 ? k : h[k-1] + h[k-2] }
|         fib[7145]
| inopinatus wrote:
| This is the proper Ruby form since it relies on the []
| operator and is therefore substitutable for any other
| callable esp. procs/lambdas, and able to wrap them as
| well.
|
| This isn't just golf, it's an excellent way to pass
| around a lazily computed, self-caching closure. I use it
| often in preference to memoization gems.
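|
| For instance (a small sketch), the caching Hash and a lambda can be
| used interchangeably anywhere #[] is expected:
|         fib = Hash.new { |h, k| h[k] = k < 2 ? k : h[k-1] + h[k-2] }
|         double = ->(n) { n * 2 }
|         [fib, double].map { |callable| callable[10] }   # => [55, 20]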
|
| Aside from noting the false/nil equivalence concern, the
| original article seems over-engineered to me. Excessive
| metaprogramming is a common trap for Ruby devs.
| nyrikki wrote:
| IMHO, the matmul form is even better for golf: F(n) in O(n
| log n)
| x86x87 wrote:
| RUBY_THREAD_VM_STACK_SIZE for Ruby
| jez wrote:
| The Python version and this Ruby version are not equivalent.
|
| In Ruby, default parameters are allocated for each call to
| the function, which means that they are not shared across
| calls.
|
| In Python, default parameters are allocated once and shared
| across all calls, allowing them to be mutated.
|
| This becomes obvious if we change the Python and Ruby
| versions to print when they're computing something that is
| not yet cached. For back-to-back calls to `fib(10)`, the
| Python version prints only on the first call to `fib(10)`.
| The Ruby version must recompute all values from 2 - 10 again.
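|
| A quick way to see the Ruby behavior (a minimal sketch; the default
| Hash is rebuilt on every top-level call, so nothing is shared):
|         def fib(n, cache = Hash.new { |h, k| h[k] = k < 2 ? k : h[k-1] + h[k-2] })
|           puts "cache has #{cache.size} entries on entry"
|           cache[n]
|         end
|
|         fib(10)   # cache has 0 entries on entry
|         fib(10)   # cache has 0 entries on entry -- recomputed from scratch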
| technion wrote:
| It's definitely something I remember being interested to find
| out about the hard way.
|
| Every recursion class talks about the Fib algorithm and why
| it's nice with recursion. But the iterative version doesn't
| have these stack limitations, presumably uses less memory and
| is just as fast. Doesn't that make it a better
| implementation?
| Jtsummers wrote:
| > Every recursion class talks about the Fib algorithm and
| why it's nice with recursion.
|
| It _is_ nice with recursion in the sense that it's a
| straightforward, easy algorithm (what college CS student
| doesn't know basic arithmetic?) that _illustrates_
| recursion. It's also trivial to write the recursive
| version based on the mathematical definition.
|
| It is not nice in that it's very slow and blows up due to
| the exponential growth of the recursive calls. No one
| outside of CS 101 or an algorithms class is going to ever
| write the recursive version again. But that not-nice aspect
| makes it nice again, because it's a good jumping off point
| in how to improve an algorithm's runtime by understanding
| its structure (the iterative version) or throwing more
| memory at it (the memoized version).
|
| > But the iterative version doesn't have these stack
| limitations, presumably uses less memory and is just as
| fast.
|
| There is no "presumably": unless you store every Fibonacci
| number and not just the last two, the iterative Fibonacci
| _does_ use less memory than the naive recursive Fibonacci.
| And there's no "just as fast". The iterative Fibonacci is
| faster, unless you've done something horribly wrong. It
| runs in linear time wrt N rather than exponential. If your
| linear algorithm is only just as fast as an exponential
| one, you haven't made a linear algorithm.
|
| > Doesn't that make it a better implementation?
|
| Yes, obviously. Which is why after CS 101 or an algorithms
| class you pretty much never write anything like that except
| maybe as a prototype. "This recursive program follows the
| definition closely, but it blows up memory/time." Start
| there and improve is a very reasonable thing. Start there
| and stop is just silly.
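|
| For reference, a minimal iterative sketch that keeps only the last
| two values (linear time, constant space):
|         def fib_iter(n)
|           a, b = 0, 1
|           n.times { a, b = b, a + b }
|           a
|         end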
| eru wrote:
| The stack is just an implementation detail of some
| languages.
|
| Some other languages have non-broken function calls, and
| don't need to grow the stack for all of them.
|
| In those languages, you don't need to have loops built in
| to work around broken function calls. You can just write
| your own loops as library functions.
|
| (Of course, the naive recursive definition of Fibonacci
| would still be slow. You'd want to write a version that can
| be efficiently evaluated. You are already familiar with the
| linear time version in the iterative form.
|
| You can also write a logarithmic time version of
| Fibonacci.)
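|
| A sketch of the logarithmic-time version, using the fast-doubling
| identities F(2k) = F(k)(2F(k+1) - F(k)) and F(2k+1) = F(k)^2 + F(k+1)^2:
|         # returns [F(n), F(n+1)]
|         def fib_pair(n)
|           return [0, 1] if n.zero?
|           a, b = fib_pair(n / 2)
|           c = a * (2 * b - a)        # F(2k)
|           d = a * a + b * b          # F(2k+1)
|           n.even? ? [c, d] : [d, c + d]
|         end
|
|         def fib_fast(n) = fib_pair(n).first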
| macrocosmos wrote:
| This is fun. Just wanted to say that both my 64 GB desktop
| and my 16 GB laptop reach the same stack limit at
| exactly 7692 when using pry, and at 7694 with irb.
| azhenley wrote:
| That's beautiful. Why didn't I think of that?!
| rplnt wrote:
| Probably because not using mutable default arguments is
| Python 101. Just something you don't do or think about. But
| this is indeed a clever use of that gotcha.
| kccqzy wrote:
| You should see the Mathematica version:
| fibonacci[0] = 0; fibonacci[1] = 1;
| fibonacci[n_] := fibonacci[n] = fibonacci[n-1] + fibonacci[n-2]
|
| It cleverly uses both = and := together. Usually people use =
| for immediate assignment (such as constants) and := for delayed
| assignment (such as functions) but this combines the two.
| kazinator wrote:
| Does that work due to Ruby having Python-like broken semantics
| for evaluating the default value expressions for optional
| arguments? (I have no idea.) Or does it work due to the object
| denoted by {...} being a real literal?
|
| If the default value expression is evaluated on each call to
| the function in which it is required rather than at function
| definition time, and if the { ... } syntax is a constructor
| that creates a fresh object, then this wouldn't work; one of
| those two conditions (or both) must be broken.
| x3n0ph3n3 wrote:
| Python's default argument is evaluated once. It's very subtle
| behavior that is unlike Ruby.
| kazinator wrote:
| TXR Lisp version:
|         (defun fib (n : (cache #H(() (0 1) (1 1))))
|           (or [cache n]
|               (set [cache n] (+ (fib (pred n)) (fib (ppred n))))))
|
| TXR Lisp does _not_ have broken Python semantics for evaluating
| the default expressions for optional arguments. The expressions
| are freshly evaluated on each call in which they are needed, in
| the lexical scope in which the prior parameters are already
| visible, as well as the scope surrounding the function
| definition.
|
| Why this works is that the #H hash literal syntax is a true
| literal. Every time that expression is evaluated, it yields the
| same hash table, which is mutable. This wouldn't work in an
| implementation of TXR Lisp in which literals are put into a ROM
| image or mapped into read only virtual memory. We will not
| likely see such a thing any time soon.
|
| If we change #H(...) to, say, (hash-props 0 1 1 1), it won't
| work any more.
| PittleyDunkin wrote:
| > Why this works is that the #H hash literal syntax is a true
| literal. Every time that expression is evaluated, it yields
| the same hash table, which is mutable.
|
| While this is cool and I think I grok the semantics, the
| identification of it as a "true literal" has me scratching my
| head. To me a literal is a syntactical term, so putting a
| literal into a rom image only makes sense in terms of storing
| the source itself (which is not necessary most of the time,
| but might make sense in a lisp).
|
| First-class support for partial evaluation looks really cool.
| I've been playing around with a scheme dialect built around
| this very concept.
| kazinator wrote:
| A literal is not purely a syntactical term. A literal is an
| object that is embedded in the program itself. As such, it
| is necessarily part of the syntax: the piece of syntax
| denoting the literal is understood to be that object
| itself.
|
| However, note that syntax disappears when the program is
| compiled. The literal itself does not!!! (Unless identified
| as dead code and eliminated, of course).
|
| Even using the C language as an example we can readily
| distinguish between literal syntax and semantics. The
| string literal syntax "abc\n" includes the double quotes
| and backslash. Yet the string literal object at run time
| has no double quotes or backslash, and the n after the
| backslash was turned into a newline (linefeed in ASCII). (C
| compilers on Unix-like platforms do in fact map string
| literals to read-only memory. It's not ROM, but write-
| protected VM. Firmware written in C and burned into ROM
| is a thing also. The literals are then in ROM.)
|
| In Lisp, we think of the objects as being source code, so
| it's a little different. The C idea of the double quotes
| and backslash being source is called "read syntax" in Lisp,
| particularly Common Lisp. Scheme might use "read syntax",
| I'm not sure. In any case, "surface syntax" or "character-
| level syntax" are also useful synonyms. Note
| level syntax" are also useful synonyms. Note that
| Lisp compilers and interpreters take the scanned-surface-
| syntax-turned-object as their input. We could call that
| deep syntax. So a quoted literal is literally a chunk of
| the deep syntax, turned into a datum for the program:
| (quote X) embeds the syntax X into the program, making it
| available as a datum, and the compiler will process that
| quote construct by propagating X as is into the compiled
| image, arranging it to be part of some table of literals or
| whatever.
|
| Some objects are self-quoting, like numbers or strings, or
| #(...) vectors in Common Lisp and Scheme.
|
| The TXR Lisp #H(...) hash syntax is like this. If it
| appears as an evaluated expression in code, then it is a
| literal. The compiler must propagate that to the compiled
| image, just like a vector, string, or number.
| PittleyDunkin wrote:
| > However, note that syntax disappears when the program
| is compiled. The literal itself does not!!!
|
| I would term this a "value". The word "literal" is
| inherently lexical, dealing with letters (ie a signifier)
| from an etymological standpoint as opposed to the value
| (the signified).
|
| Or to further clarify my confusion, a symbol is a value
| that can be evaluated to a different value at runtime but
| I wouldn't call it a "literal" outside the context of a
| legible source file.
|
| Perhaps the one exception I can think of is the use of
| macros, but macros are distinguished from other list
| transformations by being specifically applied to source
| code (and also runtime values, but at LEAST source code).
| So it makes sense that they have some value-form of
| literals--ie signified signifiers that are not themselves
| symbols or quoted values.
|
| I don't want to argue, I'm just pointing out the diction
| here is super confusing at the very least to me. A
| literal is just that: a lexical object that makes up part
| of a source file.
|
| > Yet the string literal object at run time
|
| I would again term this a "value" that the literal
| evaluates to at compile-time, or alternatively a "value"
| that the literal signifies to the reader. Even homoiconic
| languages like lisp differentiate between source and
| runtime forms. In the context of c I would consider the
| concept of a "runtime literal" to be a
| contradictory/paradoxical/incoherent/nonsensical one.
|
| That said, what you were showcasing remains very cool
| regardless of how we refer to the forms. A great
| demonstration on how there's only really one difficult
| problem--naming shit! Basically all other problems follow
| from this original one (yes, even cache invalidation and
| coloring and off-by-one errors).
| kazinator wrote:
| "literal" is short for "literal constant", which is in
| contrast to "symbolic constant".
|
| 3.1415926535 is a literal constant; pi is symbolic.
|
| The former constant literally is the value, whereas pi
| isn't literally that value.
|
| There's another term we could use for the runtime
| counterpart of the literal. In machine languages we have
| something called an immediate operand. For instance an
| instruction which moves a constant value into a register,
| like mov r1, 42 is said to have an immediate operand.
| This operand is baked right into the instruction code
| word, or an adjacent word.
|
| Thus, a source code literal constant, or possibly a
| symbolic constant, appearing in the assembly language,
| becomes an immediate operand in the running machine
| language.
|
| In higher level languages we just tend to use the word
| literal for both. And of course what that means is that a
| symbolic constant can become a literal one at runtime,
| due to a translation time substitution of symbolic
| constants by their values in place.
|
| An immediate operand is an object. What makes it special
| is that it lives inside the program code. Each time the
| instruction is executed which references the immediate
| operand, it references that one and only operand. The
| only way for that reference to get a different value
| would be if the program were modified at runtime: that
| is, if it were self-modifying.
|
| Literals, or shall we say immediate operands, being part
| of the program, are assumed not to be modified. Compilers
| can optimize them in various ways, like using one object
| for multiple occurrences of the same literal. Or they can
| arrange for these literals to be in unwritable storage.
| Some literals can be stuffed into the immediate operand
| fields of machine instructions.
|
| So yes, literals correspond to runtime objects, but
| objects which often have special treatment and rules
| regarding their use.
| ngcazz wrote:
| Have always appreciated Haskell's tail-recursive one:
|         fib = (fibs !!)
|           where fibs = 0 : 1 : zipWith (+) fibs (tail fibs)
|
|         main = putStrLn $ show $ fib 1000
| bmacho wrote:
| You probably just jest; either way fibs is not tail recursive
| [0], since its return value is a list constructor, and not
| a call to itself.
|
| [0] : https://stackoverflow.com/questions/33923/what-is-tail-
| recur...
| ngcazz wrote:
| No jest, I just recalled this formulation incorrectly as
| being tail-recursive
| darrenf wrote:
| Not memoization, but I like the Raku version:
| $ raku -e 'my @fib = 0,1,{$^a + $^b} ... *; @fib[0..40].say'
| (0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584
| 4181 6765 10946 17711 28657 46368 75025 121393 196418 317811
| 514229 832040 1346269 2178309 3524578 5702887 9227465 14930352
| 24157817 39088169 63245986 102334155)
| jmull wrote:
| In real code I've only actually seen memoization used as a cache.
| That leaves cache validation to some higher level which is in a
| poor position to do it very well. In my experience, most often it
| means someone has to figure out what things to manually restart,
| and when.
| kazinator wrote:
| You mean invalidation?
|
| Speaking of invalidation, the memoized Fibonacci will work even
| with a cache of 1 item that is replaced each time. (By work, I
| mean slash exponential time to linear.)
|
| If you calculate fib(n - 2) first and cache it, evicting
| everything else in the process, that is good enough: when fib(n -
| 1) needs fib(n - 2), it finds it instead of recursing, and so the
| multiply-and-surrender logic is disrupted, resulting in time
| proportional to n.
| JadeNB wrote:
| With all that it borrowed from Perl, I'm surprised Ruby didn't
| also borrow the defined-or operator `//=`
| (https://perldoc.perl.org/perlop#Logical-Defined-Or). (Or, if it
| did, I guess the author hadn't encountered it.)
| bradly wrote:
| I don't think Ruby has this natively, but Rails adds it with
| Object#presence.
| https://api.rubyonrails.org/classes/Object.html#method-i-pre...
| tprynn wrote:
| ActiveSupport adds tons of great convenience methods to Ruby
| and you can require it even outside of Rails! blank, present;
| date and time conversions like 1.month.from_now; except;
| enumerable methods like pluck... It's just lovely
| require 'active_support/all'
|
| https://guides.rubyonrails.org/v5.2/active_support_core_exte.
| ..
| JadeNB wrote:
| > I don't think Ruby has this natively, but Rails adds it
| > with Object#presence.
| > https://api.rubyonrails.org/classes/Object.html#method-i-pre...
|
| It's not immediately obvious to me how one could use that to
| write a Ruby analogue of `a //= b`. If I understand
| correctly, it would still need to be the relatively verbose
| `a = a.present? ? a : b`. (I think even `a = a.presence || b`
| wouldn't work in case `a` was defined but nil, which is
| exactly the edge case the defined-or operator is meant to
| handle.)
| the_sleaze_ wrote:
| I use this to memoize instance vars from time to time,
| whenever nil could be considered valid
|         return x if defined? x
|         x = ingest_stock_market
|
| `defined?` won't evaluate the parameter either which is
| nice whereas things like blank? present? and presence do
| JadeNB wrote:
| > return x if defined? x
|
| It's a natural pattern, and translates with obvious
| changes to Perl, but still `x = defined? x ? x : y` is
| longer than `x //= y`, so all I was meaning to say in my
| original comment was that I was surprised that Ruby
| didn't borrow the latter from Perl.
| inopinatus wrote:
| That is not the same, because Object#present? is a test for
| nonblankness, not definedness. So, for example, the empty
| string "" returns nil for #presence, which is exactly what we
| don't want when caching the result of a potentially expensive
| templating operation, external service query etc.
|
| The proper equivalent uses the defined? keyword, as in
| (defined? @result) ? @result : @result = ""
|
| which will produce @result with conditional assignment over
| definition. Note that defined? is a keyword, not an operator.
|
| There is no shorthand for this expression. There's less
| motivation because instance variables are better accessed
| symbolically. If you're not the accessor code then it's none
| of your business, and if you _are_ a memoizing accessor then
| your specialist action doesn 't really warrant an operator
| simply for golf. This isn't autovivification but Perl has
| that as an adjacent idea already, and all this adds up to the
| //= operator making a lot more idiomatic sense and holding
| more purpose in Perl than it does in Ruby.
| gls2ro wrote:
| Not sure I understand this example from Perl, but Ruby has ||=
| which checks for nil (and false) but not for definedness.
|
| You can do
|
| a ||= b
|
| And if a is nil then it will get the value from b else it will
| remain with the current value
| viraptor wrote:
| This really checks for truthy values, which is fine for nil and
| numbers. But it will fail you if you want to cache `false` or
| any other falsey object.
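|
| For example, with a hypothetical expensive_check that returns false,
| two versions of the same accessor:
|         def flag
|           @flag ||= expensive_check     # re-runs expensive_check on every call when it returns false/nil
|         end
|
|         # the defined? form caches falsey results too:
|         def flag
|           return @flag if defined?(@flag)
|           @flag = expensive_check
|         end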
| kazinator wrote:
| That's an anti-pattern. In a properly lexically scoped
| language, an undefined identifier is erroneous, diagnosable
| before you even run the code.
|
| If optional dynamic scoping is supported, it can make sense to
| deal with an undefined variable, but is still something you
| would want to avoid. E.g. Common Lisp programs almost always
| assiduously define dynamic variables via a top-level _defvar_.
|
| Testing for definedness is usually part of some hack that
| doesn't deserve its own syntax as a predefined language
| feature.
|
| The basic idea is that programs should usually know which of
| their variables are defined, at what point in the program. It
| should be statically obvious, or failing that, at least
| dynamically obvious. The remaining situations can be valid, but
| are low-priority hacks.
|
| (You can always have your macro for it, if you want it badly
| enough.)
|
| For the environment var example, we can use the regular _or_
| (which is much like ||), if the lookup yields _nil_ , which is
| false:
|         1> (or (getenv "HOME") (getenv "LOGDIR")
|                (error "You're homeless!"))
|         "/Users/kazinator"
|         2> (or (getenv "XXXHOME") (getenv "LOGDIR")
|                (error "You're homeless!"))
|         ** You're homeless!
|         ** during evaluation at expr-2:1 of form (error "You're homeless!")
|
| Looks like Perl made a mess of this simple thing, too.
|
| I understand some languages have an "undefined" value for a
| variable that is actually defined, but that's pretty lame too;
| it gives rise to multiple ways of being false. Awk is like
| this. Awk knows every variable in the program before executing
| it. GNU Awk builds a global symbol table of all variables at
| compile time. So every variable is considered defined in the
| sense of being known to the implementation and existing. It
| just has a kind of undefined value initially that serves as an
| empty string or zero, and tests false, but is neither a number
| nor string. It's a pretty cockamamie design from the POV of
| larger programs, but does lead to very short one-liners.
| JadeNB wrote:
| These thoughts on proper typing discipline are a valuable
| perspective on good programming-language design, but I think
| that they probably don't help much for writing code in Ruby,
| which, I believe, doesn't behave this way. (I'm not a Ruby-
| ist, so can't speak authoritatively to it.)
|
| Nonetheless, supposing that we're working in a language that
| behaves as you suggest--if memoization isn't built in, how
| would a user build it in a way that allows caching of the
| return value `false`? (For example, for testing whether a
| positive integer is prime, where getting that `false` value
| could be very expensive.)
| kazinator wrote:
| The problem is identical to how do we have a hash table
| (or other associative array or dictionary) in which we
| distinguish the presence of a _false_ value from the
| absence of a value.
|
| There are solutions for that. For instance, a lookup
| operation that returns the underlying dictionary entry, or
| _nil_ if there is no entry. If there is an entry, we can
| dereference it: _entry.value_ has the value. That 's a
| solution not requiring any special type system constructs
| like Optional or sum types.
|
| We can have a double lookup: if the lookup yields the
| ambiguous _false_ value, then we can do another lookup like
| contains(hash, key) that has a boolean result.
|
| Some languages like Common Lisp have multiple return
| values. The _gethash_ function in Common Lisp returns a
| second value which indicates whether the key is present. If
| the first value is _nil_ , but the second value is _t_ ,
| then it indicates that the _nil_ was actually pulled from
| the hash table.
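|
| In Ruby terms, a minimal sketch of the double-lookup idea for the
| primality example (names are made up):
|         PRIME_CACHE = {}
|
|         def prime?(n)
|           # key? distinguishes a cached false from "not computed yet"
|           return PRIME_CACHE[n] if PRIME_CACHE.key?(n)
|           PRIME_CACHE[n] = n > 1 && (2..Integer.sqrt(n)).none? { |d| (n % d).zero? }
|         end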
| JadeNB wrote:
| > Some languages like Common Lisp have multiple return
| values. The gethash function in Common Lisp returns a
| second value which indicates whether the key is present.
| If the first value is nil, but the second value is t,
| then it indicates that the nil was actually pulled from
| the hash table.
|
| But this seems to be what you said was an anti-pattern:
|
| > That's an anti-pattern. In a properly lexically scoped
| language, an undefined identifier is erroneous,
| diagnosable before you even run the code.
|
| You were speaking there, presumably, of bare variables,
| not of hash entries, but Perl's defined-or works just as
| well on lookups, like `$a{ key } //= "Hi there!"`, in
| which case it is just a less-verbose version of the two-
| step lookup you propose.
| mmahemoff wrote:
| I've come across memoize gems before and thought, "do I really
| need to introduce a dependency when I can save results into a
| hash", but this makes a convincing argument to leverage a gem
| like memo_wise. Clean DSL syntax and the perf gains are much
| nicer than micromanaging it.
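|
| If I remember the README correctly, usage is roughly this (a sketch,
| not verified against the current API):
|         class Example
|           prepend MemoWise
|
|           def slow_value
|             sleep 2
|             false
|           end
|           memo_wise :slow_value
|         end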
| jacobevelyn wrote:
| memo_wise maintainer here. No pressure from me to try it out,
| but if you do use it and have any feedback or suggestions
| please let us know!
| nixpulvis wrote:
| There are only 1 hard problems in computer science: memoization
| and ... shoot what was it again?
| kazinator wrote:
| There are only 2 hard problems in computer science:
|
| 1. Naming things
|
| 2. Memori^H^Hdamn it^H^H^H^H^H^H^Hization.
|
| P.S. I see what you did there. You botched your cache
| invalidation, and so lost the answer.
| drusepth wrote:
| 3. Off-by-1 errors
| nixpulvis wrote:
| _Segmentation fault_
| mtkd wrote:
| With any Ruby code under load, it's important to have a proper
| understanding of what's happening in memory -- if adding
| memoisation is yielding significant gains, there's likely much
| more potential for optimisation
|
| Many in the Ruby community have taken a 1974 quote from Donald
| Knuth out of context and used it as a justification for minimal
| or zero effort on memory efficiency
|
| The article is a very interesting read, but the implementation is
| heavily reliant on metaprogramming magic that can be something of
| a grenade in real-world projects, especially if anyone fairly new
| to Ruby has to fix it later
|
| Almost none of the gems I used 10 years ago still exist today --
| taking a small penalty for hand-rolling initially will usually
| save days or more later -- the author even says a similar earlier
| Gem he built is long gone
| bradly wrote:
| > if adding memoization is yielding significant gains, there's
| likely much more potential for optimization
|
| Most often when I see memoization in Ruby code it is when
| fetching data from either a db or external service. Memory
| isn't the concern in these cases.
|
| > Almost none of the gems I used 10 years ago still exist today
| -- taking a small penalty for hand-rolling initially will
| usually save days or more later -- the author even says a
| similar earlier Gem he built is long gone
|
| With a clean DSL, changing the gem or removing it in favor of your
| own solution should not be a problem.
| mtkd wrote:
| If the app is repeatedly asking the DB for the same data and it's
| immutable for a period -- the persistence choice is wrong
|
| There is potentially a whole lot of memory/callstack overhead
| involved in a request like that even when memoised
|
| It likely should be in a memory structure all the time or
| Redis if sufficiently complex / shared across processes etc.
| as all the other processes will likely be doing the same
|
| Only in Ruby world would someone say yeah what we need right
| there is a Gem with some define_method sprinkles
| wutwutwat wrote:
| > If app is repeatedly asking DB for same data and it's
| immutable for a period -- the persistance choice is wrong
|
| Says who? ACID databases are the norm for a reason.
|
| > There is potentially a whole lot of memory/callstack
| overhead involved in a request like that even when memoised
|
| So because there is overhead elsewhere you shouldn't
| optimize here?
|
| > It likely should be in a memory structure all the time or
| Redis if sufficiently complex / shared across processes
| etc. as all the other processes will likely be doing same
|
| Redis doesn't help if the issue is the network io.
| Accessing anything out of process memory will objectively
| be slower than accessing in process memory. Doesn't matter
| if the data is coming from pg, redis, or wherever. If its
| going across the wire, there will be io latency.
|
| > Only in Ruby world would someone say yeah what we need
| right there is a Gem with some define_method sprinkles
|
| Ah, I see this is based on a negative view of ruby in
| particular. Memoization is a common pattern in basically
| every language, and can net significant performance wins
| with very little effort (low hanging fruit). Is it a magic
| bullet? Nope. But it's nice to have in the tool belt when
| dealing with an app that's falling on its face and every
| improvement means you get to sleep better at night.
| bradly wrote:
| > Only in Ruby world would someone say yeah what we need
| right there is a Gem with some define_method sprinkles
|
| Oh sorry. I didn't mean for it to sound like that is what I
| was saying.
|
| What I meant was a gem with a memoization DSL shouldn't be
| hard to remove if you decide you no longer want it, so I do
| not think it is a great reason to not go with a gem if it
| solves your need.
|
| To be clear, in 18 years of Ruby I've never worked on a
| project that used a gem for this sort of thing. That being
| said, I think a better interface for memoization would make
| it easier to reach for the right tool at the right time.
| And I've worked on large enough projects where even a
| project-specific DSL in lib would be valuable across the project.
| ufmace wrote:
| The thing about metaprogramming magic in Ruby is that, it can
| work okay for one thing, but if you have 2 or 3 or more things
| all using it to do different things, it's very likely they'll
| conflict and blow up in a way that's really difficult to
| troubleshoot or solve.
| 15155 wrote:
| I see no mention of object shape optimization and why and how
| excessive memoization (or just instance variables in
| general) can be expensive.
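|
| Roughly, the concern (a sketch with made-up method names): ivars
| created lazily by memoization, in whatever order the methods happen
| to be called first, split otherwise-identical instances across
| object shapes:
|         class Widget
|           def price
|             @price ||= compute_price   # first call adds @price, transitioning the shape
|           end
|
|           def tax
|             @tax ||= compute_tax       # calling tax before price (or after) yields a different shape
|           end
|         end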
| baggy_trough wrote:
| It would be nice if Ruby had a syntax for memoizing a method
| that would handle the nil/false case and update the shape at
| init time.
| jez wrote:
| Along similar lines, it would be nice if there were a way to
| predeclare instance variables so that there was a canonical
| place to put type annotations for instance variables. Ruby
| type checkers right now all invent their own ways of
| declaring the types of instance variables.
|
| If Ruby supported something like
|         class A
|           def @instance_variable
|           def self.@singleton_class_instance_variable
|           def @@class_variable
|         end
|
| as a way to pre-declare the object shape at class load time,
| it would be an obvious place to also let type checkers
| declare the types of these instance variables.
|
| (Not saying that _this specific_ syntax is ideal, but rather
| just that having _some_ syntax for declaring instance
| variables at load time, not method run time, would solve both
| the "predetermined object shape" and "predetermined ivar
| types" problems.)
| jez wrote:
| For those curious, this article is a good introduction to the
| topic:
|
| https://railsatscale.com/2023-10-24-memoization-pattern-and-...
| bwilliams wrote:
| Great article, memoization is pretty complex and full of trade-
| offs. We recognized a lot of these same pitfalls as we worked
| through making memoization a consistent and common pattern in our
| (very large, monolithic Rails) codebase via a `memoize def...`
| helper. The `false` and `nil` pitfall is _surprisingly_ common,
| enough that it (and other pitfalls) warranted us writing our own
| memoization DSL.
|
| We took the opposite approach of most libraries though (and a
| reason we rolled our own instead of pulling in an existing gem)
| by avoiding handling the complex edge cases that hide trade-offs:
|
| 1. Class methods can't be memoized.
|
| 2. Methods with arguments can't be memoized (so no LRU caches,
| weak refs, tracking arguments, handling default args, hash/deep
| object equality, etc)
|
| The thought at the time was that those more complex scenarios
| deserve more thought than "prefix it with `memoize`". It's been a
| few years since we introduced the helper and after seeing it all
| throughout our codebase with no issues I'm even more confident
| this was the correct decision.
|
| I haven't revisited it in a while, but I'd love to add some extra
| niceties to it, like hooking into `ActiveRecord`'s `#reload`,
| although I dislike memoizing in models.
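|
| A minimal sketch of that style of helper (hypothetical names, not
| our actual implementation; zero-argument instance methods only),
| handling false/nil by checking whether the backing ivar is defined:
|         module Memoize
|           def memoize(name)
|             ivar = :"@__memo_#{name.to_s.delete('?!')}"
|             impl = :"__unmemoized_#{name}"
|             alias_method impl, name
|             define_method(name) do
|               if instance_variable_defined?(ivar)
|                 instance_variable_get(ivar)
|               else
|                 instance_variable_set(ivar, send(impl))
|               end
|             end
|             name
|           end
|         end
|
|         class Report
|           extend Memoize
|
|           memoize def expensive_flag?
|             sleep 1
|             false   # falsey results are cached correctly
|           end
|         end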
| kazinator wrote:
| In TXR Lisp, I implemented an original invention of mine called
| Parameter List Macros (Parameter Macros for short).
|
| A parameter macro works in any lambda syntax. Methods, macros.
|
| The documentation for Parameter List Macros shows an example
| implementation of rudimentary memoization.
|
| https://www.nongnu.org/txr/txr-manpage.html#N-C9CAE162
|
| You can use :memo in a method if you like, in a lambda
| expression, defmacro, macrolet, labels, ...: anywhere you have
| a lambda parameter list.
|
| There is a predefined parameter macro called :key which
| implements keyword parameters. (Common-Lisp-like keyword
| parameters are not native to the language run-time at all: you
| have to use this param macro if you want them.)
|
| Another predefined parameter macro called :match allows you to
| express a function with pattern matching. :match will take
| stock of your patterns and implicitly generate a parameter list
| which can take any of those patterns, and a function body which
| then further handles the cases.
| jweir wrote:
| We no longer use memoization except for older code.
|
| New code has anything that needs to be memoized defined in the
| initialize method. This solves a lot of problems and Sorbet
| typing is what encouraged us to go this route.
|
| Now there could be a branching problem - one branch will not need
| to call the expensive thing while another would. In reality I have
| found this never to be the case in our code base. May be
| different for your designs.
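|
| Concretely, something along these lines (hypothetical names), with
| the expensive value computed eagerly in initialize:
|         class Invoice
|           def initialize(order)
|             @order = order
|             @total = order.line_items.sum(&:price)   # computed once, up front; no nil/false ambiguity
|           end
|
|           attr_reader :total
|         end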
| revskill wrote:
| There are 2 hard problems in ruby: cache invalidation and magic.
___________________________________________________________________
(page generated 2024-12-24 23:01 UTC)