[HN Gopher] Kalman Filter Explained Simply
       ___________________________________________________________________
        
       Kalman Filter Explained Simply
        
       Author : RafelMri
       Score  : 275 points
       Date   : 2024-02-12 11:33 UTC (11 hours ago)
        
 (HTM) web link (thekalmanfilter.com)
 (TXT) w3m dump (thekalmanfilter.com)
        
       | pfdietz wrote:
       | My late father used these all the time during his career,
       | starting about the time they were invented. He worked on radar
       | and missile guidance systems.
        
         | lkdfjlkdfjlg wrote:
         | You mean, actual engineering.
        
           | willis936 wrote:
           | Plenty of EEs understand and apply DSP techniques today
           | without making weapons.
        
       | hprotagonist wrote:
       | "if you do it right, the noise washes out."
        
       | foobarbecue wrote:
       | I've always thought math would be much easier to learn if they
       | used descriptive variable names. Or, at least in an interactive
       | medium like the web, add some tooltips at a bare minimum.
       | Whenever I study math I spend 90% of the time looking up the
       | symbols.
       | 
       | Also when this person says the subscript "denotes the order of
       | the measurement" I'm trying to figure out what kind of order he's
       | talking about. I guess that's the index? It's been a while since
       | I did kalman filters:-p
        
         | Waterluvian wrote:
         | This was 90% of my problem with math.
         | 
          | Same with whether something is named descriptively (shield
          | volcano, star dunes) vs. after some person (Rayleigh
          | scattering).
         | 
         | It's just an extra layer to memorize and parse.
        
           | bonoboTP wrote:
            | I agree, but this is a surface-level issue impacting only
            | the very beginner phase. Once you get familiar with the
           | vocabulary, the hard stuff will be the actual material,
           | understanding the thing itself, not its description. This
           | differentiates fields that have an actual subject matter from
           | fields that are essentially terminology and categorizations
           | all the way through, a self-referential map with no
           | territory.
           | 
           | In math/physics/engineering the terminology is unfamiliar at
           | first but very precise and hence learnable. The vast majority
           | of STEM textbooks (at least from good US university
           | publishers) make their best effort in presenting the material
           | understandably without any intentional obscurantism or
            | additional fanciness. Academic journal/conference papers
            | do sometimes intentionally obfuscate to game the
            | publication metrics, but intro materials are an earnest
            | effort at
           | educating. The subject matter has some inherent complexity,
           | there's no need to prop it up artificially for prestige (that
           | happens more in other fields that are insecure about their
           | level of inherent complexity).
        
             | Waterluvian wrote:
             | Oh definitely! Eventually you've spent so much time that
             | the thetas and lambdas and Lorentzes and whatever become
             | your close intimate friends. I've most recently experienced
             | this with learning piano and how an ocean of white and
             | black keys all developed these individual "identities."
             | Like, "ah yes, <looks at A4> I won't forget you. We've had
             | so many dramatic moments before, and remember that time I
             | kept showing up at your neighbour's house instead?" ...Okay
             | maybe I'm just weird.
        
         | shiandow wrote:
         | People always seem to forget that mathematical notation is
         | designed to make algebraic manipulations easier to follow. It's
          | not really intended as something that makes sense on its
          | own; it's mostly physicists who think something like E=mc^2
          | should have any meaning.
         | 
          | The purer the mathematics, the shorter the scope of most
          | variables. Typically a variable is defined just before it's
          | used, with a scope no longer than the proof or derivation
          | it's in.
         | 
          | Also some of the choices in this article are just plain
          | silly, such as using P as both variable _and_ index, and
          | then using it for the covariance matrix when the _precision_
          | matrix is its exact inverse.
        
           | toxik wrote:
           | In computer science, we straddle this divide where the usual
           | way to describe things in scientific literature is using
           | algebraic notation, but the usual way to actually write up a
            | program is to use meaningful variable names. It's not _K_i_,
           | it's _current_gains_ or something like that. It leads to a
           | funny kind of style where especially computer science-related
           | fields start writing _K_{current}_ in scientific manuscripts,
           | and are then chided because that is not how you use
            | subscripts. The reverse is also true: theory-adjacent code
            | tends to use _a lot_ terser naming conventions.
        
             | michaelrpeskin wrote:
             | I often work with a guy who's a pure computer scientist. He
             | communicates in LaTeX and pseudocode only. I don't think he
             | could find a compiler if he tried. What I've learned to do
             | is keep his notation whenever I implement something. I
             | start the function with a comment that says what paper I'm
             | referencing when I implement it and then comment all of the
             | one-letter variables. (In this case I would use K[i]).
             | 
             | Usually the stuff I'm implementing is complicated enough
             | that reading the code without reading the paper won't have
              | much meaning. And if you're reading the paper, it's good
              | to have as close to a 1-1 representation as possible, to
              | help give confidence that I implemented it right.
             | 
              | There has been a non-zero number of times where I took
              | his pseudocode, pasted it into my IDE, did a search and
              | replace of begin/end to braces and "<-" to "=", and it
              | "just worked" on the first try. I always found it
              | amazing that he could write this stuff without ever
              | trying it (outside of his head).
        
             | Kim_Bruning wrote:
             | Just _DO_ let me have my k_P, k_I, k_D please! See also:
             | 
             | * r,g,b,a
             | 
             | * h,s,l
             | 
             | * x,y,z
        
           | atoav wrote:
           | To be honest a lot of things that are incredibly easy,
           | incredibly intuitive -- things of the "I could come up with
           | this myself"-kind -- can seem impenetrable when written down
           | in mathematical notation.
           | 
            | If you want people to _understand_, you skip the
            | mathematical notation and use pseudocode, or at least you
            | start with an explanation and, once people know what they
            | are looking at, go into the math notation.
           | 
           | I understand that mathematical notation can be very practical
           | for describing a certain class of problems concisely, but
           | especially if you teach or explain things, consider that
           | being concise is meaningless if people don't understand what
           | you describe.
           | 
            | Sometimes my feeling is that this is on purpose. People
            | not understanding what you are doing (while you come
            | across as _soo_ smart) is a feature to many. Even better:
            | the layperson cannot tell whether you are talking utter
            | gibberish or using the most precise language on earth.
           | 
            | I, however, think you can recognize _real_ smart people by
            | the phenomenon that every room they are in seems to become
            | smarter in total, because they lift people up in their
            | wake: they understand things so deeply that they can just
            | break them down into their very simple, manageable parts
            | and translate obscure domain-specific languages into
            | something people can grasp.
           | 
            | I too can go deep into domain-specific languages, be it in
            | philosophy, electrical regulation, film theory, programming
            | lingo, music theory, control theory, etc. But what is the
            | point of doing that without making the person opposite
            | _understand_ what you're waffling on about? Because either
            | you will seem like a knowledgeable person with a total
            | lack of self-perception, or like a brick that doesn't care
            | what message reaches the person opposite as long as they
            | can hear themselves talk.
           | 
           | That being said: Domain-specific language can be okay if it
           | is used between people within that domain. But I have met
           | many physicists or mathematicians who also think math
           | notation sucks, so maybe there could be something better.
        
             | bonoboTP wrote:
             | A great positive example would be Andrej Karpathy.
        
               | CamperBob2 wrote:
               | Great explainer -- incredibly so -- but not awesome in
               | the variable-naming department, having just gone through
               | his ML videos.
        
             | shiandow wrote:
              | Unnecessary use of mathematical notation / terminology
             | definitely happens. It's one of the things that annoy me to
             | no end. Especially if after unwrapping the unintelligible
             | mess of mathematical gibberish you're left with some inane
             | argument.
             | 
             | I still haven't quite forgiven the writers of UMAP for
              | writing a theorem that uses pi^(n/2) / Gamma(n/2+1) instead
             | of the more reasonable "volume of the unit sphere". It
             | makes it so confusing that they fail to spot the theorem
             | doesn't work for low dimensional objects embedded in high
             | dimensional spaces, which is their exact use-case. Luckily
             | their informal conclusion does work, mostly.
        
         | pinkmuffinere wrote:
         | > Also when this person says the subscript "denotes the order
         | of the measurement" I'm trying to figure out what kind of order
         | he's talking about. I guess that's the index? It's been a while
         | since I did kalman filters:-p
         | 
          | The order referred to is the index-in-time that a value
          | corresponds to. Eg, x_3 would be the state at the third time
          | step. I think their subscript "p" stands for prediction. x_p
          | at time 3 is the state we expect at time 4. But then when
          | time 4 comes around, we incorporate new measurements and
          | calculate x_4 including that new information. Just to be
          | explicit, this x_4 will be different from the x_p we
          | calculated at time 3, as our prediction is always a bit off.
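          | 
          | A minimal 1-D sketch of that loop (variable names and
          | numbers are mine, not the article's):
          | 
          |     def predict(x, p, q):       # q: process noise variance
          |         return x, p + q         # uncertainty grows
          | 
          |     def update(x_p, p_p, z, r): # r: measurement noise var.
          |         k = p_p / (p_p + r)     # Kalman gain
          |         return x_p + k * (z - x_p), (1 - k) * p_p
          | 
          |     x, p = 0.0, 1.0
          |     for z in [1.2, 0.9, 1.1]:
          |         x_p, p_p = predict(x, p, q=0.01)   # x_p at time k
          |         x, p = update(x_p, p_p, z, r=0.5)  # becomes x_{k+1}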
        
         | gromneer wrote:
         | > Also when this person says the subscript "denotes the order
         | of the measurement" I'm trying to figure out what kind of order
         | he's talking about.
         | 
          | I hate that the most when reading papers. Authors try to
          | sound abstract and academic, but only manage to be
          | frustratingly vague. AUTHORS, YOU STILL HAVE TO INSERT THE
          | SUBJECT INTO YOUR SENTENCES FOR THEM TO MAKE SENSE.
         | 
          | This aspect of research papers frustrates me more than
          | anything else. You must disambiguate. Use absolute
          | descriptors and do not use relative descriptors. Don't tell
          | me to look right, because I'll look left. Use absolute
          | descriptors! "then after spinning the prism the light cone
          | blah blah blah" SPIN!? SPIN IN WHAT DIRECTION????? LEFT?
          | RIGHT?! LATERAL? UP? DOWN????
         | How fast? How slow? You imagine all of these CRITICAL ASPECTS
         | in your head when writing such ambiguous sentences, but the
         | reader cannot read your mind.
        
         | foofie wrote:
         | > I've always thought math would be much easier to learn if
         | they used descriptive variable names.
         | 
         | I think the variable names are already picked to be
         | descriptive. No one is picking them to be more obscure or
         | harder to track. The problem is that those who are starting out
         | still haven't picked up concepts or learned the standard
         | notations for each problem domain, thus we are left with the
         | pain of ramping up on a topic.
        
         | max_ wrote:
          | I completely agree. And my insight, similar to yours, is
          | that the greatest math book no one has written is one where
          | the meanings of notation and variables, plus a clear
          | assortment of theorems across all topics, are well curated.
        
           | bonoboTP wrote:
            | Many math books nowadays come with Python code, e.g.
            | Jupyter notebooks.
        
             | max_ wrote:
              | Python code is a bit of an overkill.
              | 
              | I was, for example, reading a cryptography book and it
              | has symbols like the Hadamard product.
        
         | colechristensen wrote:
         | I've always thought that code would be much easier to
         | understand with shorter, less descriptive variable names.
         | Whenever I look at new code most of the confusion involves
         | searching through layers of abstraction for the part that
         | actually _does_ the thing as opposed to the layer upon layer of
         | connections between abstractions which would be much less
         | necessary if the entire behavior could be encoded in a single
         | line. You can only have a small number of descriptive variables
          | in an expression before it becomes entirely unreadable. That
          | is opposed to single characters with sub/superscripts, where
          | you can easily see what's happening with tens of variables
          | in a single line of math.
         | 
         | https://wikimedia.org/api/rest_v1/media/math/render/svg/a7d2...
         | 
         | Here's a formula for calculating the downstream Mach number in
          | a certain kind of supersonic flow. I cannot imagine any way
          | to write this in "descriptive variables" that makes the
          | formula understandable at all; you just could not see the
          | structure.
         | (from https://en.wikipedia.org/wiki/Oblique_shock )
        
       | shiandow wrote:
       | Kalman filters might be one of those weird cases in mathematics
       | where the 'simple' version is simplified beyond all recognition.
       | 
        | I mean what you're really doing is to take a measurement, then
        | simulate the possible future states, combine this information
        | with the next measurement, and repeat.
       | 
        | You can imagine e.g. taking multiple pictures of a tennis
        | ball, estimating its position and speed from the first
        | picture, simulating where it's going to end up, and comparing
        | this with the next picture to see which estimate is closer to
        | the truth. Or, more old school, measuring the inclination of
        | the sun and comparing the resulting line of possible locations
        | on a map to the spot you thought you were at.
       | 
        | Of course the exact calculations are beyond impractical, so
        | you use sampling to simplify. However, that is still
        | difficult, so you assume the distribution is somewhat close
        | to a Gaussian
       | distribution. And then you simplify even more by assuming the
       | evolution of the system is just a linear transformation. And
       | that's how you end up with the Kalman filter discussed here.
       | 
       | I'd be amazed if anyone could really understand what's going on
       | just based on the linear algebra.
        
         | jvanderbot wrote:
         | It's simpler than that. The linear algebra is actually easier.
         | 
          | The Kalman filter tries to guess the hidden input that
          | produced the measurements. It does so by forming the
          | minimization problem:
          | 
          | 'minimize over x, the function [ actual_measurement -
          | expected_measurement(x) ]^2/s^2', where 's' is the sigma of
          | the noise.
         | 
         | This follows from the state estimation problem:
         | 
          | 'maximize over x, the likelihood of seeing the
          | actual_measurement', because the only term that matters in
          | the likelihood function _is_ -([actual_measurement -
          | expected_measurement(x)]/s)^2 (look at the exponent of the
          | Normal distribution, or any exponential-family distribution
          | really).
         | 
          | 'actual_measurement' is a constant, so if it happens that
          | the function 'expected_measurement' is linear, this is
          | trivially solved directly as a convex optimization; take the
          | derivative, equate it to zero, and solve, and you'll get the
          | Kalman filter update step.
         | 
          | If it so happens that the function is non-linear, well, we
          | just make a single Newton-Raphson step by linearizing the
          | equation, minimizing, and returning the solution of the
          | "pretend linearization".
         | 
         | This is basic calc + linear algebra at an undergrad level, but
         | nobody bothered to tell you that.
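          | 
          | Worked out for the scalar case (a sketch: write z for
          | actual_measurement, take expected_measurement(x) = x, and
          | fold in a prior x_p with variance p - the prior term is
          | what turns plain least squares into the KF blend):
          | 
          |     J(x)  = (z - x)^2 / r + (x - x_p)^2 / p
          |     J'(x) = -2(z - x)/r + 2(x - x_p)/p = 0
          |     =>  x = x_p + k(z - x_p),  with  k = p/(p + r)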
         | 
         | ---
         | 
          | It's also completely wrong. It's a hack from the '60s to
          | maximize the likelihood function using a recursive, single-
          | step linearization like this. A misreading of the Cramér-Rao
          | lower bound has "proven" to generations of engineers that
          | this is optimal. It's not, not really.[^1]
         | 
         | Nowadays we have 10,000x more compute, and any one of the
         | following _will_ produce better performance:
         | 
          | * Forming and solving the non-linear equation using many
          | Newton-Raphson steps
          | 
          | * Keeping a long history of measurements, and solving using
          | many Newton-Raphson steps over this batch
          | 
          | * Using a sum-of-Gaussians representation to accommodate
          | multi-modal measurement functions, esp. when combined with
          | the prior bullets
         | 
          | All of these were well covered by state estimation research
          | from the '80s to now, but again, the textbooks seem to have
          | been written in stone in 1972.
         | 
          | [^1]: (The Cramér-Rao lower bound is only defined when all
          | measurement likelihood functions are linearized at the true
          | state - which is only possible asymptotically in a batch
          | that preserves all the measurements - and not possible
          | before time infinity, and not possible with a recursive
          | filter.)
        
           | palebluedot wrote:
           | > All of these were well covered by state estimation research
           | from 80s to now, but again, the textbooks seem to be written
           | in stone in 1972.
           | 
           | Do you have any good resources (online or textbook) you could
           | recommend, as an introduction to these concepts, that is more
           | modern / up-to-date?
        
             | jvanderbot wrote:
              | I don't, sadly. The only coverage of this that was
              | first-principles accessible was the course taught by
              | Prof. Stergios Roumeliotis where I went to grad school.
             | 
              | You could do fine by reading some old books by Bar-
              | Shalom. Any practical textbook like his would include
              | all the "other stuff" about the EKF that helps you
              | understand how poorly it often performs.
             | 
              | But the actual derivation of the EKF is probably only
              | one or two pages in such a textbook, so it's a damn
              | shame nobody includes it.
             | 
             | The background required is simply:
             | 
              | * Know the form of the exponential family of PDFs (like
              | the Normal/Gaussian)
             | 
              | * Bayes' rule
             | 
             | * Recognize that to maximize f~= exp(-a), you have to
             | minimize 'a'
             | 
              | * Know how to take the derivative of a matrix equation
              | ('a', above)
             | 
             | * Solve it
             | 
              | * Use the 'matrix inversion lemma' to transform the
              | solution into what the KF/EKF provides.
             | 
             | Ah hell, I'll just write it up.
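              | 
              | In the meantime, a compressed numpy sketch of those
              | steps (my notation: prior N(x_p, P), measurement
              | z = Hx + noise with covariance R):
              | 
              |     import numpy as np
              | 
              |     rng = np.random.default_rng(0)
              |     n, m = 3, 2
              |     x_p = rng.normal(size=n)     # prior mean
              |     P = np.eye(n)                # prior covariance
              |     H = rng.normal(size=(m, n))  # measurement model
              |     R = 0.5 * np.eye(m)          # measurement noise
              |     z = rng.normal(size=m)       # the measurement
              | 
              |     # Minimizing
              |     # (z-Hx)'R^-1(z-Hx) + (x-x_p)'P^-1(x-x_p)
              |     # yields the "information form" solution:
              |     Ri, Pi = np.linalg.inv(R), np.linalg.inv(P)
              |     b = Pi @ x_p + H.T @ Ri @ z
              |     x_info = np.linalg.solve(Pi + H.T @ Ri @ H, b)
              | 
              |     # The matrix inversion lemma turns that into
              |     # the familiar KF gain form:
              |     S = H @ P @ H.T + R
              |     K = P @ H.T @ np.linalg.inv(S)
              |     x_kf = x_p + K @ (z - H @ x_p)
              | 
              |     assert np.allclose(x_info, x_kf)  # same answer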
        
             | markisus wrote:
              | Probabilistic Robotics covers the Kalman Filter from a
              | first-principles probabilistic viewpoint, along with its
              | extension, the EKF. It's quite readable for someone with
              | basic
             | understanding of linear algebra, probability, and calculus.
             | I believe it also has a refresher on these basics in the
             | introduction.
             | 
             | http://www.probabilistic-robotics.org/
        
           | eutectic wrote:
           | Both the Bayesian perspective and the optimization
            | perspective are legitimate ways of understanding the Kalman
           | filter. I like the Bayesian perspective better.
        
             | jvanderbot wrote:
             | Forgive me, I'm thoroughly confused by that dichotomy. How
                | are they different? Approaching from Bayes' rule or a
             | "maximum likelihood" approach produces the same results.
             | 
             | The problems of the filter are present in both.
        
               | eutectic wrote:
               | Well, the derivations are different, and your comment
               | seemed to imply that the maximum likelihood perspective
               | was easier to understand.
        
               | hgomersall wrote:
               | The result is identical, the understanding is different.
               | I would suggest that the Bayesian perspective leads to
                | insights like the UKF [1], which IME is all-round much
                | better than the apparently better-known EKF for
                | approximating non-linear systems.
               | 
               | [1] That is, it is generally easier to approximate a
                | distribution than a non-linear function.
        
           | dwrtz wrote:
           | Yeah one needs to distinguish between the optimal state
           | sequence and the optimal state at a particular time. In other
           | words, a joint posterior (state trajectory up to now given
           | all measurements up to now) vs a marginal posterior (state
            | right now given all measurements up to now). The KF only
            | gives you a marginal posterior. They are not the same.
        
           | hgomersall wrote:
           | It all depends on your background. If you come from a
           | Bayesian background, looking at a Kalman filter as a
           | posterior update mechanism is very intuitive. A distribution
           | over state that gets updated as more information arrives...
        
         | foobarian wrote:
         | I don't know what it is about the Kalman filter but so many
         | explanations including the OP have this format: "It's very
         | simple! <complicated obtuse explanation listing the computation
         | steps>"
         | 
         | Your comment is the first I've seen actually providing
         | intuition about what is happening. It doesn't help perhaps that
         | the name itself is misleading as heck to computer people like
         | me: it's not a filter as in stream processing or SQL.
        
           | shiandow wrote:
           | I recall reading a very intuitive explanation including
           | animations of a point cloud to show how it works. I've had no
           | luck finding the article again though.
        
           | jvanderbot wrote:
           | The problem is 'simple' can mean a few things:
           | 
            | - simple to predict / understand what it would _do_:
            | Easily explained with pictures and hardly any math
           | 
           | - simple to understand at a higher level / see why it works:
            | Not easily explained without math, and fraught with
            | Bayesian vs optimization vs EE-type approaches
           | 
           | - simple to understand well enough to use: Not easily done
           | without _other relevant math_ that is not covered in KF
           | explanation, e.g., controls, matrix analysis, Jacobians, etc
           | 
           | - simple to understand why the equations are named what they
           | are, and why they work: Not easily explained without math
           | _and_ historical context that takes a page or so to explain.
           | 
           | > it's not a filter as in stream processing
           | 
           | As you say "filter" means 'remove noise', but it _also_ means
           | 'process in order of arrival', so it's _similar_ to your def
           | of filter.
           | 
           | So, we really need 4 guides.
        
           | dwrtz wrote:
           | yeah the "filter" term means something different in the KF
           | context and is confusing
           | 
           | filter: use measurements up to time t to estimate the state
           | at time t
           | 
           | smooth: use past and future measurements to estimate the
           | state at time t
           | 
           | predict: use measurements up to time t to estimate the state
           | at time t+n
        
       | visarga wrote:
        | Simply put, it is an "online" model, which means that it
        | learns on the fly. Specifically, it is the optimal online
        | learning algorithm for linear systems with Gaussian noise.
       | 
        | In a way it is like a primitive RNN: it has internal state,
        | inputs, and outputs.
        
       | nayuki wrote:
       | I like this previous explanation: https://www.bzarg.com/p/how-a-
       | kalman-filter-works-in-picture... ,
       | https://news.ycombinator.com/item?id=13449229
        
         | swyx wrote:
         | this is way better, and should be at the top.
        
         | Moduke wrote:
         | I enjoyed the simplicity of this explanation as well:
         | 
         | https://praveshkoirala.com/2023/06/13/a-non-mathematical-int...
         | 
         | https://news.ycombinator.com/item?id=36971975
        
         | fho wrote:
          | That's the one with the cute little robot, isn't it? I
          | always liked that one the most :-)
        
       | syntaxing wrote:
        | One thing that clicked for me is that combining two uncertain
        | distributions of measurements (each with high variance) makes
        | a "more certain" estimate (a narrower distribution). Take this
        | more certain estimate, combine it with the next measurement,
        | then rinse and repeat, and boom, you have a Kalman filter.
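        | 
        | A tiny sketch of that combining step (a precision-weighted
        | average; the names and numbers are mine):
        | 
        |     def fuse(m1, v1, m2, v2):
        |         v = 1.0 / (1.0/v1 + 1.0/v2)  # narrower than either
        |         m = v * (m1/v1 + m2/v2)      # precision-weighted mean
        |         return m, v
        | 
        |     fuse(10.0, 4.0, 12.0, 1.0)  # -> (11.6, 0.8)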
        
       | Symmetry wrote:
       | I've used them in robotics and for tracking satellites going
       | overhead via radar. Apparently they're also used by economists
       | for guessing the state of the economy, along with other filters
       | in the standard robotics toolkit.
        
         | mesofile wrote:
         | Purely idle curiosity - I've heard a lot about the Kalman
         | filter over the years, it's a popular subject here, but what
         | are the other filters in the standard robotics toolkit?
        
           | Symmetry wrote:
           | Particle filters would be the main other one I'd reach for.
           | The ability to represent multiple hypotheses is a big
           | advantage they have.
        
             | dplavery92 wrote:
             | You can also construct multiple hypothesis trackers from
             | multiple Kalman Filters, but there is a little more
             | machinery. For example, Interacting Multiple Models (IMM)
             | trackers may use Kalman Filters or Particle Filters, and a
             | lot of the foundational work by Bar-Shalom and others
             | focuses on Kalman Filters.
        
           | elbear wrote:
           | Someone in another comment mentioned a book called
           | Probabilistic Robotics. I think that covers everything.
        
           | dplavery92 wrote:
           | The Kalman filter has a family of generalizations in the
           | Extended Kalman Filter (EKF) and Unscented Kalman Filter
            | (UKF).
           | 
           | Also common in robotics applications is the Particle Filter,
           | which uses a Monte Carlo approximation of the uncertainty in
           | the state, rather than enforcing a (Gaussian) distribution,
           | as in the traditional Kalman filter. This can be useful when
           | the mechanics are highly nonlinear and/or your measurement
            | uncertainties are, well, very non-Gaussian. Sebastian
            | Thrun (a Stanford robotics professor in the DARPA "Grand
            | Challenge" days of self-driving cars) made an early
            | Udacity course on Particle Filters.
        
           | mesofile wrote:
           | Thanks to everyone who replied.
        
         | bafe wrote:
          | Ensemble Kalman filters (and similar techniques like
          | variational assimilation) are also used heavily in the
          | geosciences to assimilate measurements and model data in
          | order to obtain a "posterior observation", which can be
          | understood intuitively as an interpolation between model
          | and observation weighted by their relative uncertainty (and
          | covariance).
        
       | 01HNNWZ0MV43FF wrote:
       | If I really needed a Kalman filter I'm sure I could read this, or
       | the Wikipedia page, or an implementation's source code
       | (https://github.com/LdDl/kalman-rs/blob/master/src/kalman/kal...)
       | and figure it out.
       | 
       | But IME everyone in the entire world is a "visual learner" who
       | learns best by examples. So I'm surprised that the tutorial
       | midway through the page doesn't put any example numbers into the
       | formulas (maybe I glanced over it?) and the pictures only start
       | after a page of "what is a Kalman filter" text, and the pictures
       | are just of more formulas.
        
         | the__alchemist wrote:
         | Another comment pointed out variable naming conventions as an
         | obstacle to learning and understanding mathematical topics. I
         | am sympathetic to that perspective, but even more so to this
         | one you post. I am astounded by how common this is. A weaker
         | form of this exists in software libraries that don't include
         | code examples.
        
       | girzel wrote:
       | No thread on Kalman Filters is complete without a link to this
       | excellent learning resource, a book written as a set of Jupyter
       | notebooks:
       | 
       | https://github.com/rlabbe/Kalman-and-Bayesian-Filters-in-Pyt...
       | 
       | That book mentions alpha-beta filters as sort of a younger
       | sibling to full-blown Kalman filters. I recently had need of
       | something like this at work, and started doing a bunch of
        | reading. Eventually I realized that alpha-beta filters (and
        | the whole Kalman family) are very focused on predicting the
        | near future, whereas what I really needed was just a way to
        | smooth historical data.
       | 
       | So I started reading in that direction, came across "double
       | exponential smoothing" which seemed perfect for my use-case, and
       | as I went into it I realized... it's just the alpha-beta filter
       | again, but now with different names for all the variables :(
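        | 
        | A quick side-by-side sketch of that equivalence (function
        | names are mine; note Holt's trend coefficient comes out
        | scaled by alpha):
        | 
        |     def holt(level, trend, z, a, b):
        |         prior = level + trend                # forecast
        |         level = prior + a * (z - prior)
        |         trend = trend + a * b * (z - prior)
        |         return level, trend
        | 
        |     def alpha_beta(x, v, z, alpha, beta, dt=1.0):
        |         x_pred = x + v * dt                  # prediction
        |         r = z - x_pred                       # residual
        |         return x_pred + alpha * r, v + (beta / dt) * r
        | 
        |     # With dt=1, holt(l, t, z, a, b) gives the same numbers
        |     # as alpha_beta(l, t, z, alpha=a, beta=a*b).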
       | 
       | I can't help feeling like this entire neighborhood of math rests
       | on a few common fundamental theories, but because different
       | disciplines arrived at the same systems via different approaches,
       | they end up sounding a little different and the commonality is
       | obscured. Something about power series, Euler's number, gradient
       | descent, filters, feedback systems, general system theory... it
       | feels to me like there's a relatively small kernel of intuitive
       | understanding at the heart of all that stuff, which could end up
       | making glorious sense of a lot of mathematics if I could only
       | grasp it.
       | 
       | Somebody help me out, here!
        
         | bonoboTP wrote:
         | Maybe check out _Probabilistic Robotics_ by Dieter Fox,
         | Sebastian Thrun, and Wolfram Burgard. It has a coherent
         | Bayesian formulation with consistent notation on many Kalman-
          | related topics. Also, with the rise of AI/ML, classic
          | control theory ideas are being merged with reinforcement
          | learning.
        
           | girzel wrote:
           | Thanks for the recommendation! It would never have occurred
           | to me to look at robotics, but I can understand why that's
           | very relevant.
           | 
           | I read _Feedback Control for Computer Systems_ not too long
           | ago, which felt like yet another restatement of the same
           | ideas; I guess that counts as  "classic control theory".
        
           | esafak wrote:
           | I agree that Bayesian filtering is the most general and
           | logical approach. There are Bayesian derivations of the
           | Kalman filter too.
           | 
           | Here is a broad survey: https://people.bordeaux.inria.fr/pier
           | re.delmoral/chen_bayesi...
        
         | plasticchris wrote:
            | Hey, I had very similar thoughts many years ago! The trick
            | is: yes, many filters boil down to alpha/beta, and the
            | Kalman filter is (edit: can be) really a way to generate
            | those constants given a (linear) model (a set of equations
            | describing the dynamics, i.e. the future states) and good
            | knowledge of the noise (variance) in the measurements. So
            | if the measurements always have the same noise it will
            | just reduce the constants over time, and it is only really
            | useful when the measurement accuracy can be determined
            | well and also changes a lot.
        
           | girzel wrote:
           | Interesting. Are you characterizing Kalman filters mostly as
           | systems of control/refinement on top of alpha-beta filters?
           | 
           | I do feel like the core of it is essentially
           | exponential/logarithmic growth/decay, with the option to
           | layer multiple higher-order growth/decay series on top of one
           | another. Maybe that's the gist...
        
             | plasticchris wrote:
             | Yeah, because a lot of times the equations that fall out of
             | the KF look the same, only with variable values for
             | alpha/beta.
        
         | ndriscoll wrote:
         | Incidentally this is why people miss the mark when they get mad
         | about mathematicians using single letter variable names. Short
         | names let you focus on the structure of equations and
         | relationships, which lets you more easily pattern match and say
         | "wait, this is structurally the same as X other thing I already
         | know but with different names". It's not about saving paper or
         | making it easier to write (it is not easier to write Greek
         | letters with super/subscripts in LaTeX using an English
         | keyboard than it would be to use words). It is about
         | transmitting a certain type of information to the reader that
         | is otherwise very difficult to transmit.
         | 
         | While it uses letters so it looks vaguely like writing, math
         | notation is very pictorial in nature. Long words would obscure
         | the pictures.
        
           | elbear wrote:
           | I disagree. Single letter variables are meaningless. In order
           | to get the big picture, you have to remember what all those
           | meaningless letters stand for. Using meaningful variables
           | would make this easier.
        
             | brobdingnagians wrote:
             | If you work with them long enough it becomes second nature
             | to read them, and then it is easier to manipulate and
             | compose them. The rest of the context is the background
             | knowledge to understand the pithy core equations. Papers
             | are for explaining concepts, equations are for symbolic
              | manipulation. Meaningful variable names would be a
              | middle ground, not good at either; they might help
              | someone unfamiliar with the subject understand the
              | equation, but a lot of the symbols are so abstract that
              | they really need to be explained in more detail
              | elsewhere, or would be arbitrarily named.
        
               | np_tedious wrote:
               | If you're in an abstract/general mathematical function,
               | then sure: single letters. If you're doing more business
               | logic kind of stuff (iterating through a list of db/orm
               | objects or processing a request body) then the names
               | should be longer
        
         | duped wrote:
         | You're looking for the theory of linear (or nonlinear)
         | dynamical systems. Unfortunately it's not one kernel of
         | intuition backed by consistent notation, it's many with no
          | consistency. A good course on controls and signals/systems
          | will beat those intuitions into you, and you'll learn the
          | math/parlance without getting attached to any one notational
          | convention.
         | 
         | The real intuition is "everything is a filter." Everything else
         | is about analysis and synthesis of that idea.
        
         | thundercarrot wrote:
         | If Q and R are constant (as is usually the case), the gain
         | quickly converges, such that the Kalman filter is just an
         | exponential filter with a prediction step. For many people this
         | is a lot easier to understand, and even matches how it is
         | typically used, where Q and R are manually tuned until it
         | "looks good" and never changed again. Moreover, there is just
         | one gain to manually tune instead of multiple quantities Q and
         | R.
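          | 
          | A quick numerical check of that convergence (scalar case,
          | made-up numbers): iterate the variance/gain recursion with
          | constant Q and R and watch the gain settle.
          | 
          |     q, r, p = 0.1, 1.0, 1.0
          |     for step in range(10):
          |         p_pred = p + q               # predict
          |         k = p_pred / (p_pred + r)    # gain
          |         p = (1 - k) * p_pred         # update
          |         print(step, round(k, 4))
          |     # k settles near 0.27 within a few steps; after that
          |     # the KF acts as a fixed-coefficient exponential filter.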
        
         | ActorNightly wrote:
        | When you start dealing with linear systems and disturbances,
        | you end up with basically matrix math and covariance in some
        | form or another.
        | 
        | The thing about the Kalman filter is that it's pretty well
        | known and exists in many software packages (just like PID), so
        | it's fairly easy to implement. But because noise is often not
        | Gaussian, and systems are often not linear, it's more of a
        | "works well enough" for most applications.
        
       | ModernMech wrote:
       | Close your eyes and walk around for a while. Imagine where you
       | are. Now open your eyes. Is your actual location different from
       | where you thought you were?
       | 
       | That last bit, using an observation to update a belief on a state
       | variable, that's what the Kalman filter does.
        
       | cocostation wrote:
       | I really like this set of videos for explaining the KF. I 'got
        | it' more than I did with the material in the original post.
       | 
       | https://www.youtube.com/watch?v=CaCcOwJPytQ
        
       | lopatin wrote:
        | Does anyone here use the Kalman filter for pairs trading?
        
       | eclectic29 wrote:
        | Genuine question: why does the Kalman filter come up so
        | frequently on HN? Is this something I'm missing? I'm a machine
        | learning engineer, not a data scientist.
        
         | xchip wrote:
         | It is considered a difficult topic and people want to show they
         | understand it.
         | 
          | A similar thing happens with anything that has the word
          | "quantum" or "relativistic" in it. I'm a physicist and we
          | hardly talk about it, but here on HN we find people bringing
          | it up every other day.
        
         | scarmig wrote:
         | ML and Kalman filtering try to solve similar problems.
         | 
         | Some theories even suggest that Kalman filtering (or a similar
         | algorithm) provides a basis for neurobiological learning. See
         | predictive coding (e.g. https://arxiv.org/pdf/2102.10021.pdf)
         | 
         | (Why I'm interested in it, at least.)
        
       | chubs wrote:
        | Recently tasked with implementing a Kalman filter, I found it
        | very, very difficult to find good resources that explained it
        | in language that made sense to a developer like me. So after
        | spending a month learning it, I wrote a couple of posts on it;
        | perhaps someone might find them helpful?
       | 
       | https://www.splinter.com.au/2023/12/14/the-kalman-filter-for...
       | 
       | https://www.splinter.com.au/2023/12/15/the-kalman-filter-wit...
       | 
       | As a developer I found the maths made sense only after
       | implementing it, ironically. I guess we learn by building on top
       | of what we already know? Is there a term for that?
        
       ___________________________________________________________________
       (page generated 2024-02-12 23:00 UTC)