[HN Gopher] For RoR, see every method call, parameter and return...
___________________________________________________________________
For RoR, see every method call, parameter and return value in
production
Author : puuush
Score : 37 points
Date : 2023-11-22 19:07 UTC (3 hours ago)
(HTM) web link (callstacking.com)
(TXT) w3m dump (callstacking.com)
| aantix wrote:
| Here's the client code for those that would like to inspect.
|
| https://github.com/callstacking/callstacking-rails
| CPLX wrote:
| This seems ridiculously useful. What's the catch?
| jtokoph wrote:
| These types of profiling gems usually kill performance
| aantix wrote:
| The instrumentation has a performance overhead.
|
| You enable the instrumentation with a prepend_around_action,
|
| e.g.
|
| prepend_around_action :callstacking_setup, if: -> { params[:debug] == '1' }
|
| When the request is completed, the instrumented methods are
| removed (thus removing the overhead).
|
| You have to enable it judiciously. But for a problematic
| request, it gives the entire team a holistic view of what is
| really happening: which methods are called, with what
| parameters, and what they return.
|
| You no longer have to reconstruct production scenarios
| piecemeal via the rails console.
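|
| The on/off mechanics can be sketched with plain Ruby's
| Module#prepend (a hypothetical illustration, not the gem's
| actual implementation):

```ruby
# Hypothetical sketch of prepend-based tracing; not callstacking-rails itself.
# A generated module wraps the target method via Module#prepend and logs the
# call, its arguments, and its return value whenever tracing is switched on.
module TraceHelper
  @tracing = false

  class << self
    attr_accessor :tracing
  end

  def self.instrument(klass, method_name)
    tracer = Module.new do
      define_method(method_name) do |*args, &blk|
        result = super(*args, &blk)
        if TraceHelper.tracing
          puts "#{self.class}##{method_name}(#{args.inspect}) => #{result.inspect}"
        end
        result
      end
    end
    klass.prepend(tracer)
  end
end

class Greeter
  def greet(name)
    "Hello, #{name}!"
  end
end

TraceHelper.instrument(Greeter, :greet)
TraceHelper.tracing = true           # e.g. flipped on when params[:debug] == '1'
Greeter.new.greet("world")
TraceHelper.tracing = false          # "removed" once the request completes
```

| Prepended modules can't actually be un-prepended, which is why
| this sketch just toggles a flag; the wrapper becomes a no-op
| pass-through when tracing is off.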
| cschneid wrote:
| As far as I can tell, it only executes the trace when asked.
| It's not an APM like New Relic. Most likely the trace
| meaningfully slows down the individual request.
|
| When I was at ScoutAPM, we built a version of this that was
| stochastic instead of 100% predictable. We sampled the call
| stack every 10-50ms. Much lower overhead, and it caught the
| slower methods, which is quite helpful on its own, especially
| since slow behavior often isn't uniform; it tends to hit only
| a small handful of your biggest customers. But it certainly
| missed many fast-executing methods.
|
| Different approaches, for sure; they solve different issues.
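|
| That sampling approach can be sketched in a few lines of Ruby
| (a toy illustration, not ScoutAPM's implementation): a
| background loop snapshots another thread's backtrace on a
| timer, so frames where time is actually spent accumulate the
| most samples.

```ruby
# Toy stochastic profiler: sample the target thread's call stack on a timer.
# Slow methods appear in many samples; very fast methods are often missed.
def sample_stack(target, interval: 0.01, samples: 50)
  counts = Hash.new(0)
  samples.times do
    target.backtrace&.each { |frame| counts[frame] += 1 }
    sleep interval
  end
  counts.sort_by { |_frame, n| -n }   # hottest frames first
end

worker = Thread.new { 100.times { sleep 0.01 } }   # stand-in for a slow request
hottest = sample_stack(worker, interval: 0.01, samples: 20)
worker.join
```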
| progne wrote:
| This is a product where the SaaS doesn't seem to add much that
| couldn't be done as easily locally, other than monetizing the
| process.
| aantix wrote:
| Disagree.
|
| 1) In a large-scale production scenario, you typically do not
| have the data, nor the interaction flow, to reproduce the bug
| locally. The idea is that you enable Call Stacking on the fly,
| when needed. Turn it off when not needed.
|
| 2) Having multiple runtime captures of the same endpoint across
| two different deployments or time periods allows you to quickly
| compare for logic or data changes (argument values and return
| values are visible).
|
| 3) Commenting on individual lines of execution allows the team
| to have a specific discussion around logic changes.
| VeejayRampay wrote:
| local is never really a production environment though
| kayodelycaon wrote:
| Some of this already exists to some degree:
| https://github.com/MiniProfiler/rack-mini-profiler
| aantix wrote:
| Different emphasis.
|
| The goal is to quickly be able to see just the important,
| executed methods for a given request.
|
| E.g. you may have a 2,000-line User model, but Call Stacking
| allows you to pinpoint, "Oh, only these three methods are
| actually being called during authentication. And here are the
| subsequent calls that those methods make. And here's where the
| logic change occurred."
| stevepike wrote:
| How does this work on the backend? Does it only trace method
| calls when an exception is thrown, or does it profile the call
| stack of every request?
|
| Something I've been interested in is the performance impact of
| using https://docs.ruby-lang.org/en/3.2/Coverage.html to find
| unused code by profiling production. Particularly using that to
| figure out any gems that are never called in production. Seems
| like it could be made fast.
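|
| A rough sketch of that Coverage idea (the Demo class and temp
| file here are made up for illustration): turn on method-level
| coverage, exercise the code, then list methods whose call
| count is still zero.

```ruby
require "coverage"
require "tempfile"

# Stand-in for application code whose usage we want to measure.
src = Tempfile.new(%w[demo .rb])
src.write(<<~RUBY)
  class Demo
    def used   = :yes
    def unused = :no
  end
RUBY
src.close

Coverage.start(methods: true)    # method-level coverage (Ruby 2.6+)
load src.path
Demo.new.used                    # simulate production traffic

# Each key is [class, method, start_line, start_col, end_line, end_col].
never_called = Coverage.result.fetch(src.path)[:methods]
                       .select { |_sig, calls| calls.zero? }
                       .keys.map { |sig| sig[1] }
puts never_called.inspect        # the methods production never touched
```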
| aantix wrote:
| The idea is to turn it on for a given request when needed - via
| a parameter, feature flag, etc.
|
| prepend_around_action :callstacking_setup, if: -> { params[:debug] == '1' }
|
| Once the request completes, the instrumented methods are
| removed, eliminating the performance overhead.
| ysavir wrote:
| Would this mean that any data I happened to have in memory during
| the flow now permanently lives in callstacking's data stores? How
| does it handle all the data flowing through from a security
| perspective?
| aantix wrote:
| The same filtering mechanism you have in place for your
| application logs is applied to the argument hash before being
| sent to the server.
|
| https://github.com/callstacking/callstacking-rails/blob/599d...
| jxf wrote:
| It respects the normal RoR toolchain parameter filtering, so
| anything that you say is sensitive (or everything by default,
| if you'd like) also doesn't get sent to CallStacking.
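|
| The filtering idea, in simplified plain Ruby (a stand-in for
| Rails' config.filter_parameters / ActiveSupport::ParameterFilter;
| the filter keys here are made up):

```ruby
# Simplified sketch of Rails-style parameter filtering: keys matching any
# filter pattern are masked before the data ever leaves the process.
FILTERS = [:password, :token, /secret/i].freeze

def filter_params(hash)
  hash.to_h do |key, value|
    masked = FILTERS.any? do |f|
      f.is_a?(Regexp) ? key.to_s.match?(f) : key.to_s == f.to_s
    end
    if masked
      [key, "[FILTERED]"]
    elsif value.is_a?(Hash)
      [key, filter_params(value)]   # recurse into nested params
    else
      [key, value]
    end
  end
end

filter_params(email: "a@b.com", password: "hunter2", meta: { api_secret: "x" })
# => {email: "a@b.com", password: "[FILTERED]", meta: {api_secret: "[FILTERED]"}}
```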
| jmholla wrote:
| The calculator at the bottom of the page is doing some weird
| calculations. If you have no incidents a year, it still costs you
| money. So do incidents that take zero minutes to resolve. I took
| apart the code, and this seems to be the equation in use:
| (revenueTarget / 8760) * resolutionTimeTarget *
| numIncidentsTarget * resolutionTimeTarget + numEmployeesTarget *
| avgEmployeeTargeRate
|
| This means revenue lost is correlated to the square of the lost
| time and the cost from employees is a static yearly cost.
|
| There are a couple of things wrong with it. The third * should
| be switched with a +, and the last term needs to be multiplied
| by the number of incidents:
|
| (revenueTarget / 8760) * resolutionTimeTarget *
| numIncidentsTarget + resolutionTimeTarget *
| numEmployeesTarget * avgEmployeeTargeRate * numIncidentsTarget
|
| Which, if anyone at Call Stacking is here, just means changing
|
| o = (n / 8760) * e * t * e + i * r;
|
| to
|
| o = (n / 8760) * e * t + e * i * r * t;
|
| or more succinctly
|
| o = t * e * (n / 8760 + i * r)
|
| I'm assuming that's minified, so
| numIncidentsTarget * resolutionTimeTarget * (revenueTarget / 8760
| + numEmployeesTarget * avgEmployeeRateTarget)
|
| Edit: With the correct math, the example is wildly different. It
| should be $37,277.81, not $87,991.23.
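|
| For reference, the corrected formula in Ruby (variable names
| and sample inputs are mine; resolution time is assumed to be
| in hours, matching the revenue / 8760 term):

```ruby
# Corrected incident-cost formula:
#   cost = incidents * hours_per_incident * (hourly_revenue + employees * hourly_rate)
def incident_cost(annual_revenue:, incidents:, hours_per_incident:, employees:, hourly_rate:)
  hourly_revenue = annual_revenue / 8760.0    # 8760 hours in a year
  incidents * hours_per_incident * (hourly_revenue + employees * hourly_rate)
end

incident_cost(annual_revenue: 1_000_000, incidents: 0,
              hours_per_incident: 2, employees: 3, hourly_rate: 75)
# => 0.0  (zero incidents now costs nothing, unlike the original equation)
```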
| berkes wrote:
| The irony is that the issues this helps with _could_ be solved
| far before production. Compile time, or some local runtime even.
| Just not in Ruby.
|
| Nearly all the issues this shows you quickly are issues that
| static typing would prevent at compile time, or that type
| hints would show you at dev time.
|
| I've been doing full-time Rails for 12+ years now, PHP before
| that, C before that. But I've always developed side gigs in
| Java, C#, and other typed languages, and now I've finally
| moved full-time to Rust. They solve this.
|
| Before production. You want this solved before production.
| Really.
| aantix wrote:
| Of the bugs that I've experienced in large-scale, Rails
| production systems, typing is a small subset.
|
| Manually reconstructing logic errors based on a combination of
| user input and system data is the most time-consuming kind of
| issue to diagnose.
|
| When your codebase is 500,000+ lines of code, which code paths
| are relevant for a given endpoint? What methods were called and
| under what context? How do we begin to reconstruct this bug?
|
| These are the scenarios that Call Stacking gives instant
| visibility into.
| IshKebab wrote:
| Depends on the type system. I would say for Java / Python
| level static types they catch a small but significant
| fraction of bugs (10-20% according to the only objective
| measurement I've seen, which is easily worth it). However
| some languages like Rust, Haskell and OCaml let you express
| much more in the type system.
|
| Subjectively it feels like that catches more like 30-60% of
| bugs.
|
| So this thing is still useful, but berkes is right that you
| need it a lot less if you use better static types.
|
| > which code paths are relevant for a given endpoint?
|
| This is exactly the question that static types can answer...
| statically. You don't need a runtime log to find out.
|
| You do need a runtime log to see the actual values though. So
| it's not like a debugger is completely useless in Rust. But I
| definitely reach for it much less than in other languages.
| rco8786 wrote:
| The number of times I wish I had this tool for my production
| systems using Java, Kotlin, and Scala is...enormous.
|
| Typing is great, I am a fan. But seeing the _values_ that are
| running through those types is not solved at compile time.
| theonething wrote:
| This is advertised as a tool for production. Seems like it would
| be useful in development too.
| aantix wrote:
| And new engineer onboarding.
|
| You have a new engineer. Point them to the Call Stacking
| dashboard.
|
| "Here's a list of all of our endpoints for our application.
|
| Click on a trace, and you can see the relevant methods for that
| endpoint and the context in which they are called."
|
| This will get your new engineers up to speed much quicker.
___________________________________________________________________
(page generated 2023-11-22 23:00 UTC)