[HN Gopher] Kalman Filter Explained Simply
___________________________________________________________________
Kalman Filter Explained Simply
Author : RafelMri
Score : 275 points
Date : 2024-02-12 11:33 UTC (11 hours ago)
(HTM) web link (thekalmanfilter.com)
(TXT) w3m dump (thekalmanfilter.com)
| pfdietz wrote:
| My late father used these all the time during his career,
| starting about the time they were invented. He worked on radar
| and missile guidance systems.
| lkdfjlkdfjlg wrote:
| You mean, actual engineering.
| willis936 wrote:
| Plenty of EEs understand and apply DSP techniques today
| without making weapons.
| hprotagonist wrote:
| "if you do it right, the noise washes out."
| foobarbecue wrote:
| I've always thought math would be much easier to learn if they
| used descriptive variable names. Or, at least in an interactive
| medium like the web, add some tooltips at a bare minimum.
| Whenever I study math I spend 90% of the time looking up the
| symbols.
|
| Also when this person says the subscript "denotes the order of
| the measurement" I'm trying to figure out what kind of order he's
| talking about. I guess that's the index? It's been a while since
| I did Kalman filters :-p
| Waterluvian wrote:
| This was 90% of my problem with math.
|
| Same when something is named descriptively: shield volcano,
| star dunes, vs. some person's name like Rayleigh scattering.
|
| It's just an extra layer to memorize and parse.
| bonoboTP wrote:
| I agree but this is a surface level issue impacting the very
| beginner phase only. Once you get familiar with the
| vocabulary, the hard stuff will be the actual material,
| understanding the thing itself, not its description. This
| differentiates fields that have an actual subject matter from
| fields that are essentially terminology and categorizations
| all the way through, a self-referential map with no
| territory.
|
| In math/physics/engineering the terminology is unfamiliar at
| first but very precise and hence learnable. The vast majority
| of STEM textbooks (at least from good US university
| publishers) make their best effort in presenting the material
| understandably without any intentional obscurantism or
| additional fanciness. Academic journal/conference papers do
| sometimes intentionally confuse to game the publication
| metrics but intro materials are an earnest effort at
| educating. The subject matter has some inherent complexity,
| there's no need to prop it up artificially for prestige (that
| happens more in other fields that are insecure about their
| level of inherent complexity).
| Waterluvian wrote:
| Oh definitely! Eventually you've spent so much time that
| the thetas and lambdas and Lorentzes and whatever become
| your close intimate friends. I've most recently experienced
| this with learning piano and how an ocean of white and
| black keys all developed these individual "identities."
| Like, "ah yes, <looks at A4> I won't forget you. We've had
| so many dramatic moments before, and remember that time I
| kept showing up at your neighbour's house instead?" ...Okay
| maybe I'm just weird.
| shiandow wrote:
| People always seem to forget that mathematical notation is
| designed to make algebraic manipulations easier to follow. It's
| not really intended as something that makes sense on its own;
| it's mostly physicists who think something like E=mc^2 should
| have any meaning.
|
| The more pure the mathematics the shorter the scope of most
| variables. Typically a variable is defined just before it's
| used, with a scope no longer than the proof or derivation it's
| in.
|
| Also some of the choices in this article are just plain silly,
| such as using P as both variable _and_ index, and then using it
| for the covariance matrix when the _precision_ matrix is the
| exact inverse.
| toxik wrote:
| In computer science, we straddle this divide where the usual
| way to describe things in scientific literature is using
| algebraic notation, but the usual way to actually write up a
| program is to use meaningful variable names. It's not _K_i_ ,
| it's _current_gains_ or something like that. It leads to a
| funny kind of style where especially computer science-related
| fields start writing _K_{current}_ in scientific manuscripts,
| and are then chided because that is not how you use
| subscripts. The reverse is also true, theory-adjacent code
| tends to use _a lot_ more terse naming conventions.
| michaelrpeskin wrote:
| I often work with a guy who's a pure computer scientist. He
| communicates in LaTeX and pseudocode only. I don't think he
| could find a compiler if he tried. What I've learned to do
| is keep his notation whenever I implement something. I
| start the function with a comment that says what paper I'm
| referencing when I implement it and then comment all of the
| one-letter variables. (In this case I would use K[i]).
|
| Usually the stuff I'm implementing is complicated enough
| that reading the code without reading the paper won't have
| much meaning. And if you're reading the paper, it's good to
| have as close to a 1-1 representation as possible, to help give
| confidence that I implemented it right.
|
| I've had non-zero times where I took his pseudocode, pasted
| it into my IDE and did a search and replace of begin/end to
| braces and "<-" to "=" and it "just worked" on the first
| try. I always found that amazing that he could write this
| stuff without ever trying it (outside of his head).
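|
| For a rough idea of the style (a made-up sketch, not his
| actual code; the paper reference here is just an example):
|
|     import numpy as np
|
|     def kf_update(x, P, z, H, R):
|         # Update step from Welch & Bishop, "An Introduction
|         # to the Kalman Filter" (TR 95-041), their notation:
|         #   x: state estimate      P: state covariance
|         #   z: measurement         H: measurement matrix
|         #   R: measurement noise   K: Kalman gain
|         S = H @ P @ H.T + R               # innovation cov.
|         K = P @ H.T @ np.linalg.inv(S)    # Kalman gain
|         x = x + K @ (z - H @ x)           # corrected state
|         P = (np.eye(len(x)) - K @ H) @ P  # corrected cov.
|         return x, P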
| Kim_Bruning wrote:
| Just _DO_ let me have my k_P, k_I, k_D please! See also:
|
| * r,g,b,a
|
| * h,s,l
|
| * x,y,z
| atoav wrote:
| To be honest a lot of things that are incredibly easy,
| incredibly intuitive -- things of the "I could come up with
| this myself"-kind -- can seem impenetrable when written down
| in mathematical notation.
|
| If you want people to _understand_ , you skip the
| mathematical notation and use pseudocode, or at least you
| start with an explanation, and once people know what they are
| looking at you go into the math notation.
|
| I understand that mathematical notation can be very practical
| for describing a certain class of problems concisely, but
| especially if you teach or explain things, consider that
| being concise is meaningless if people don't understand what
| you describe.
|
| Sometimes my feeling is that this is on purpose. People not
| understanding what you are doing (but you come across _soo_
| smart), is a feature to many. Even better: The layperson
| cannot tell whether you are talking utter gibberish or using
| the most precise language on earth.
|
| I, however, think you can recognize _real_ smart people by
| the phenomenon that every room they are in seems to become
| smarter in total, because they lift people up in their wake
| of understanding things so deeply, they can just break them
| down into their very simple, manageable parts and translate
| obscure domain-specific languages into something people can
| grasp.
|
| I too can go deep into domain-specific languages, be it in
| philosophy, electrical regulation, film theory, programming
| lingo, music theory, control theory, etc. But what is the
| point of doing that without making the person opposite
| _understand_ what you're waffling on about? Because either
| you will seem like a knowledgeable person that has a total
| lack of self-perception, or like a brick that doesn't care
| what message reaches the person opposite as long as they can
| hear themselves talk.
|
| That being said: Domain-specific language can be okay if it
| is used between people within that domain. But I have met
| many physicists or mathematicians who also think math
| notation sucks, so maybe there could be something better.
| bonoboTP wrote:
| A great positive example would be Andrej Karpathy.
| CamperBob2 wrote:
| Great explainer -- incredibly so -- but not awesome in
| the variable-naming department, having just gone through
| his ML videos.
| shiandow wrote:
| Unnecessary use of mathematical notation / terminology
| definitely happens. It's one of the things that annoy me to
| no end. Especially if after unwrapping the unintelligible
| mess of mathematical gibberish you're left with some inane
| argument.
|
| I still haven't quite forgiven the writers of UMAP for
| writing a theorem that uses pi^(n/2) / Gamma(n/2 + 1) instead
| of the more reasonable "volume of the unit sphere". It
| makes it so confusing that they fail to spot the theorem
| doesn't work for low-dimensional objects embedded in high-
| dimensional spaces, which is their exact use-case. Luckily
| their informal conclusion does work, mostly.
| pinkmuffinere wrote:
| > Also when this person says the subscript "denotes the order
| of the measurement" I'm trying to figure out what kind of order
| he's talking about. I guess that's the index? It's been a while
| since I did kalman filters:-p
|
| The order referred to is the index-in-time that a value
| corresponds to. E.g., x_3 would be the state at the third time
| step. I think their subscript "p" stands for prediction: x_p at
| time 3 is the state we expect at time 4. But then when time 4
| comes around, we incorporate new measurements and calculate x_4
| including that new information. Just to be explicit, this x_4
| will be different from the x_p we calculated at time 3, as our
| prediction is always a bit incorrect.
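|
| A toy 1-D version of that distinction (numbers made up):
|
|     x_3 = 10.0         # updated state at step 3
|     v = 2.0            # assumed constant velocity
|     x_p = x_3 + v      # prediction for step 4 -> 12.0
|     z_4 = 12.6         # measurement arriving at step 4
|     K = 0.5            # some gain in (0, 1)
|     x_4 = x_p + K * (z_4 - x_p)   # updated state -> 12.3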
| gromneer wrote:
| > Also when this person says the subscript "denotes the order
| of the measurement" I'm trying to figure out what kind of order
| he's talking about.
|
| I hate that the most when reading papers. Authors trying to
| sound abstract and academic, but only accomplishing being
| frustratingly vague. AUTHORS YOU STILL HAVE TO INSERT THE
| SUBJECT INTO YOUR SENTENCES FOR THEM TO MAKE SENSE.
|
| I'm so frustrated at this aspect in research papers more than
| anything else. You must disambiguate. Use absolute descriptors
| and do not use relative descriptors. Don't tell me to look
| right, because I'll look left. Use absolute descriptors! "then
| after spinning the prism the light cone blah blah blah" SPIN!?
| SPIN IN WHAT DIRECTION????? LEFT? RIGHT?! LATERAL? UP? DOWN????
| How fast? How slow? You imagine all of these CRITICAL ASPECTS
| in your head when writing such ambiguous sentences, but the
| reader cannot read your mind.
| foofie wrote:
| > I've always thought math would be much easier to learn if
| they used descriptive variable names.
|
| I think the variable names are already picked to be
| descriptive. No one is picking them to be more obscure or
| harder to track. The problem is that those who are starting out
| still haven't picked up concepts or learned the standard
| notations for each problem domain, thus we are left with the
| pain of ramping up on a topic.
| max_ wrote:
| I completely agree. And my insight, similar to yours, is that
| the greatest math book that no one has written is one where the
| meaning of notation and variables, plus a clear assortment of
| theorems across all topics, is well curated.
| bonoboTP wrote:
| Many math books nowadays come with Python code, e.g. Jupyter
| notebooks.
| max_ wrote:
| Python code is a bit of an overkill.
|
| I was, for example, reading a cryptography book and it has
| symbols like the Hadamard product.
| colechristensen wrote:
| I've always thought that code would be much easier to
| understand with shorter, less descriptive variable names.
| Whenever I look at new code most of the confusion involves
| searching through layers of abstraction for the part that
| actually _does_ the thing as opposed to the layer upon layer of
| connections between abstractions which would be much less
| necessary if the entire behavior could be encoded in a single
| line. You can only have a small number of descriptive variables
| in an expression before it becomes entirely unreadable. That is
| opposed to single characters with sub-/superscripts where you
| can easily see what's happening with tens of variables in a
| single line of math.
|
| https://wikimedia.org/api/rest_v1/media/math/render/svg/a7d2...
|
| Here's a formula for calculating the downstream Mach number in
| a certain kind of supersonic flow. I cannot imagine any way to
| write this in "descriptive variables" which makes the formula
| understandable at all, you just could not see the structure.
| (from https://en.wikipedia.org/wiki/Oblique_shock )
| shiandow wrote:
| Kalman filters might be one of those weird cases in mathematics
| where the 'simple' version is simplified beyond all recognition.
|
| I mean, what you're really doing is taking a measurement, then
| simulating the possible future states, combining this
| information with the next measurement, and repeating.
|
| You can imagine e.g. taking multiple pictures of a tennis-ball,
| estimate its position and speed from the first picture, simulate
| where it's going to end up, and compare this with the next
| picture to see which estimate is closer to the truth. Or more old
| school, measure the inclination of the sun and compare the
| resulting line of possible locations on a map to the spot you
| thought you were.
|
| Of course the exact calculations are beyond impractical. So you
| use sampling to simplify. However that still makes it difficult
| so you assume the distribution is somewhat close to a Gaussian
| distribution. And then you simplify even more by assuming the
| evolution of the system is just a linear transformation. And
| that's how you end up with the Kalman filter discussed here.
|
| I'd be amazed if anyone could really understand what's going on
| just based on the linear algebra.
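|
| In code, that predict-then-compare loop might look roughly
| like this (a sketch assuming a constant-velocity ball and
| made-up noise levels):
|
|     import numpy as np
|
|     dt = 0.1
|     F = np.array([[1, dt], [0, 1]])  # constant-velocity model
|     H = np.array([[1.0, 0.0]])       # pictures: position only
|     Q = 0.01 * np.eye(2)             # process noise (assumed)
|     R = np.array([[0.25]])           # meas. noise (assumed)
|     x = np.array([0.0, 5.0])         # guess: position, speed
|     P = np.eye(2)                    # initial uncertainty
|
|     for z in [0.6, 1.1, 1.4, 2.1]:   # positions from pictures
|         x = F @ x                    # simulate where it goes
|         P = F @ P @ F.T + Q
|         y = z - H @ x                # compare with next picture
|         S = H @ P @ H.T + R
|         K = P @ H.T @ np.linalg.inv(S)
|         x = x + K @ y                # blend estimate & picture
|         P = (np.eye(2) - K @ H) @ P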
| jvanderbot wrote:
| It's simpler than that. The linear algebra is actually easier.
|
| The Kalman filter tries to guess the hidden input that produced
| the measurements. It does so by forming the minimization
| problem:
|
| 'minimize over x, the function [ actual_measurement -
| expected_measurement(x) ]^2 / s^2', where 's' is the noise
| sigma.
|
| This follows from the state estimation problem:
|
| 'maximize over x, the likelihood of seeing the
| actual_measurement', because the only term that matters in the
| likelihood function _is_ -([actual_measurement -
| expected_measurement(x)] / s)^2 (look at the exponent in the
| Normal distribution, or any exponential-family distribution,
| really).
|
| 'actual_measurement' is a constant, so if it happens that the
| function 'expected_measurement' is linear, this is trivially
| solved directly as a convex optimization, and if you take the
| derivative, equate it to zero, and solve, you'll get the Kalman
| filter update step.
|
| If it so happens that the function is non-linear, well, we just
| make a single Newton-Raphson step by linearizing the equation,
| minimizing, and returning the solution to the "pretend
| linearization".
|
| This is basic calc + linear algebra at an undergrad level, but
| nobody bothered to tell you that.
|
| ---
|
| It's also completely wrong. It's a hack from the 60s to
| maximize the likelihood function using a recursive, single-step
| linearization like this. A misreading of the Cramer-Rao Lower
| Bound has "proven" to generations of engineers that this is
| optimal. It's not, not really.[^1]
|
| Nowadays we have 10,000x more compute, and any one of the
| following _will_ produce better performance:
|
| * Forming and solving the non-linear equation using many
| Newton-Raphson steps
|
| * Keeping a long history of measurements, and solving using
| many Newton-Raphson steps over this batch
|
| * Using a sum-of-Gaussians representation to accommodate
| multi-modal measurement functions, especially when including
| the prior bullets
|
| All of these were well covered by state estimation research
| from the 80s to now, but again, the textbooks seem to be
| written in stone in 1972.
|
| [^1]: (the Cramer-Rao lower bound is only defined when all
| measurement likelihood functions are linearized at the true
| state - which is only possible asymptotically in a batch that
| preserves all the measurements - and not possible before time
| infinity, and not possible with a recursive filter)
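|
| (A quick 1-D sanity check of the update-step claim above, my
| own sketch with made-up numbers; the prior term is the one the
| filter carries forward from the previous step.)
|
|     import numpy as np
|
|     x0, p = 1.0, 4.0         # prior mean and variance
|     z, h, r = 2.5, 1.0, 1.0  # measurement, model, noise var
|
|     # set d/dx [ (z - h*x)^2/r + (x - x0)^2/p ] = 0, solve
|     x_opt = (h * z / r + x0 / p) / (h**2 / r + 1 / p)
|
|     # standard Kalman gain form of the same solution
|     K = p * h / (h**2 * p + r)
|     x_kf = x0 + K * (z - h * x0)
|
|     assert np.isclose(x_opt, x_kf)  # both give 2.2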
| palebluedot wrote:
| > All of these were well covered by state estimation research
| from 80s to now, but again, the textbooks seem to be written
| in stone in 1972.
|
| Do you have any good resources (online or textbook) you could
| recommend, as an introduction to these concepts, that is more
| modern / up-to-date?
| jvanderbot wrote:
| I don't, sadly. The only coverage of this that was first-
| principles accessible was the course taught by Prof. Stergios
| Roumeliotis where I went to grad school.
|
| You could do fine by reading some old books by Bar-Shalom.
| Any practical textbook like his would include all the
| "other stuff" about the EKF that helps you understand how
| poorly it often performs.
|
| But the actual derivation of the EKF is probably only one
| or two pages in such a textbook, which makes it a damn shame
| that nobody includes it.
|
| The background required is simply:
|
| * Know the form of the exponential family of PDFs (like the
| Gaussian)
|
| * Bayes rule
|
| * Recognize that to maximize f~= exp(-a), you have to
| minimize 'a'
|
| * Know how to take derivative of matrix equation ('a',
| above)
|
| * Solve it
|
| * Use 'matrix inversion lemma' to transform solution to
| what KF/EKF provides.
|
| Ah hell, I'll just write it up.
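|
| In the meantime, the last bullet can at least be checked
| numerically (a sketch with a random linear model of my own):
| the information-form solution and the gain form the KF uses
| agree via the matrix inversion lemma.
|
|     import numpy as np
|
|     rng = np.random.default_rng(0)
|     n, m = 3, 2
|     P = 2.0 * np.eye(n)                # prior covariance
|     H = rng.standard_normal((m, n))    # measurement model
|     R = 0.5 * np.eye(m)                # measurement noise
|
|     # information form: add precisions, then invert
|     P_info = np.linalg.inv(
|         np.linalg.inv(P) + H.T @ np.linalg.inv(R) @ H)
|
|     # gain form, what the KF/EKF update actually computes
|     K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
|     P_gain = (np.eye(n) - K @ H) @ P
|
|     assert np.allclose(P_info, P_gain)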
| markisus wrote:
| Probabilistic Robotics covers the Kalman filter from a
| first-principles probabilistic viewpoint, along with its
| extension, the EKF. It's quite readable for someone with a
| basic understanding of linear algebra, probability, and
| calculus. I believe it also has a refresher on these basics
| in the introduction.
|
| http://www.probabilistic-robotics.org/
| eutectic wrote:
| Both the Bayesian perspective and the optimization
| perspective are legitimate ways of understanding the Kalman
| filter. I like the Bayesian perspective better.
| jvanderbot wrote:
| Forgive me, I'm thoroughly confused by that dichotomy. How
| are they different? Approaching from Bayes' rule or a
| "maximum likelihood" approach produces the same results.
|
| The problems of the filter are present in both.
| eutectic wrote:
| Well, the derivations are different, and your comment
| seemed to imply that the maximum likelihood perspective
| was easier to understand.
| hgomersall wrote:
| The result is identical, the understanding is different.
| I would suggest that the Bayesian perspective leads to
| insights like the UKF [1], which IME is all-round much
| better than the apparently better-known EKF for
| approximating non-linear systems.
|
| [1] That is, it is generally easier to approximate a
| distribution than a non-linear function.
| dwrtz wrote:
| Yeah one needs to distinguish between the optimal state
| sequence and the optimal state at a particular time. In other
| words, a joint posterior (state trajectory up to now given
| all measurements up to now) vs a marginal posterior (state
| right now given all measurements up to now). the KF only
| gives you a marginal posterior. they are not the same
| hgomersall wrote:
| It all depends on your background. If you come from a
| Bayesian background, looking at a Kalman filter as a
| posterior update mechanism is very intuitive. A distribution
| over state that gets updated as more information arrives...
| foobarian wrote:
| I don't know what it is about the Kalman filter but so many
| explanations including the OP have this format: "It's very
| simple! <complicated obtuse explanation listing the computation
| steps>"
|
| Your comment is the first I've seen actually providing
| intuition about what is happening. It doesn't help perhaps that
| the name itself is misleading as heck to computer people like
| me: it's not a filter as in stream processing or SQL.
| shiandow wrote:
| I recall reading a very intuitive explanation including
| animations of a point cloud to show how it works. I've had no
| luck finding the article again though.
| jvanderbot wrote:
| The problem is 'simple' can mean a few things:
|
| - simple to predict / understand what it would _do_ : Easily
| explained with pictures and hardly any math
|
| - simple to understand at a higher level / see why it works:
| Not easily explained without math and fraught with bayesian
| vs optimization vs EE-type approaches
|
| - simple to understand well enough to use: Not easily done
| without _other relevant math_ that is not covered in KF
| explanation, e.g., controls, matrix analysis, Jacobians, etc
|
| - simple to understand why the equations are named what they
| are, and why they work: Not easily explained without math
| _and_ historical context that takes a page or so to explain.
|
| > it's not a filter as in stream processing
|
| As you say "filter" means 'remove noise', but it _also_ means
| 'process in order of arrival', so it's _similar_ to your def
| of filter.
|
| So, we really need 4 guides.
| dwrtz wrote:
| yeah the "filter" term means something different in the KF
| context and is confusing
|
| filter: use measurements up to time t to estimate the state
| at time t
|
| smooth: use past and future measurements to estimate the
| state at time t
|
| predict: use measurements up to time t to estimate the state
| at time t+n
| visarga wrote:
| Simply put, it is an "online" model, which means that it
| learns on the fly. Specifically, it is the optimal online
| learning algorithm for linear systems with Gaussian noise.
|
| In a way it is like a primitive RNN, it has internal state,
| inputs and outputs.
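|
| A 1-D sketch of that RNN analogy (scalar model and numbers
| made up): a step function threading hidden state.
|
|     F, H, Q, R = 1.0, 1.0, 0.01, 0.25
|
|     def kf_step(state, z):
|         x, P = state                   # hidden state: mean, var
|         x, P = F * x, F * P * F + Q    # recurrent transition
|         K = P * H / (H * P * H + R)    # trust in the new input
|         x = x + K * (z - H * x)        # absorb the input z
|         P = (1 - K * H) * P
|         return (x, P), x               # new state, output
|
|     state = (0.0, 1.0)
|     for z in [0.9, 1.1, 1.0]:          # stream of noisy inputs
|         state, out = kf_step(state, z)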
| nayuki wrote:
| I like this previous explanation: https://www.bzarg.com/p/how-a-
| kalman-filter-works-in-picture... ,
| https://news.ycombinator.com/item?id=13449229
| swyx wrote:
| this is way better, and should be at the top.
| Moduke wrote:
| I enjoyed the simplicity of this explanation as well:
|
| https://praveshkoirala.com/2023/06/13/a-non-mathematical-int...
|
| https://news.ycombinator.com/item?id=36971975
| fho wrote:
| That's the one with the cute little robot, isn't it? I always
| liked that one the most :-)
| syntaxing wrote:
| One thing that clicked for me is that combining two uncertain
| distributions of measurements (each with high variance) makes a
| "more certain" estimate (a narrower distribution). Take this
| more certain estimate, combine it with the next measurement,
| then rinse and repeat, and boom, you have a Kalman filter.
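|
| The arithmetic behind that (a worked sketch, numbers made up):
|
|     m1, v1 = 10.0, 4.0   # first estimate: mean, variance
|     m2, v2 = 12.0, 1.0   # second estimate
|
|     v = 1 / (1/v1 + 1/v2)      # fused variance: 0.8, narrower
|                                # than either input
|     m = v * (m1/v1 + m2/v2)    # fused mean: 11.6, pulled
|                                # toward the more certain one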
| Symmetry wrote:
| I've used them in robotics and for tracking satellites going
| overhead via radar. Apparently they're also used by economists
| for guessing the state of the economy, along with other filters
| in the standard robotics toolkit.
| mesofile wrote:
| Purely idle curiosity - I've heard a lot about the Kalman
| filter over the years, it's a popular subject here, but what
| are the other filters in the standard robotics toolkit?
| Symmetry wrote:
| Particle filters would be the main other one I'd reach for.
| The ability to represent multiple hypotheses is a big
| advantage they have.
| dplavery92 wrote:
| You can also construct multiple hypothesis trackers from
| multiple Kalman Filters, but there is a little more
| machinery. For example, Interacting Multiple Models (IMM)
| trackers may use Kalman Filters or Particle Filters, and a
| lot of the foundational work by Bar-Shalom and others
| focuses on Kalman Filters.
| elbear wrote:
| Someone in another comment mentioned a book called
| Probabilistic Robotics. I think that covers everything.
| dplavery92 wrote:
| The Kalman filter has a family of generalizations in the
| Extended Kalman Filter (EKF) and the Unscented Kalman Filter
| (UKF).
|
| Also common in robotics applications is the Particle Filter,
| which uses a Monte Carlo approximation of the uncertainty in
| the state, rather than enforcing a (Gaussian) distribution,
| as in the traditional Kalman filter. This can be useful when
| the mechanics are highly nonlinear and/or your measurement
| uncertainties are, well, very non-Gaussian. Sebastian Thrun
| (a Stanford robotics professor in the DARPA "Grand Challenge"
| days of self-driving cars) made an early Udacity course on
| Particle Filters.
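|
| The core loop is tiny; a toy sketch (with arbitrary nonlinear
| dynamics and noise levels of my own choosing):
|
|     import numpy as np
|
|     rng = np.random.default_rng(1)
|     N = 1000
|     particles = rng.normal(0.0, 1.0, N)  # state hypotheses
|
|     def pf_step(particles, z):
|         # push every hypothesis through nonlinear dynamics
|         particles = np.sin(particles) + rng.normal(0, 0.1, N)
|         # weight by measurement likelihood; no Gaussian
|         # posterior is ever assumed
|         w = np.exp(-0.5 * (z - particles) ** 2 / 0.25)
|         w /= w.sum()
|         # resample in proportion to weight
|         return rng.choice(particles, size=N, p=w)
|
|     particles = pf_step(particles, z=0.4)
|     estimate = particles.mean()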
| mesofile wrote:
| Thanks to everyone who replied.
| bafe wrote:
| Ensemble Kalman filters (and similar techniques like
| variational assimilation) are also used heavily in the
| geosciences to assimilate measurements and model data in order
| to obtain a "posterior observation", which can be understood
| intuitively as an interpolation between model and observation
| weighted by their relative uncertainty (and covariance).
| 01HNNWZ0MV43FF wrote:
| If I really needed a Kalman filter I'm sure I could read this, or
| the Wikipedia page, or an implementation's source code
| (https://github.com/LdDl/kalman-rs/blob/master/src/kalman/kal...)
| and figure it out.
|
| But IME everyone in the entire world is a "visual learner" who
| learns best by examples. So I'm surprised that the tutorial
| midway through the page doesn't put any example numbers into the
| formulas (maybe I glanced over it?) and the pictures only start
| after a page of "what is a Kalman filter" text, and the pictures
| are just of more formulas.
| the__alchemist wrote:
| Another comment pointed out variable naming conventions as an
| obstacle to learning and understanding mathematical topics. I
| am sympathetic to that perspective, but even more so to this
| one you post. I am astounded by how common this is. A weaker
| form of this exists in software libraries that don't include
| code examples.
| girzel wrote:
| No thread on Kalman Filters is complete without a link to this
| excellent learning resource, a book written as a set of Jupyter
| notebooks:
|
| https://github.com/rlabbe/Kalman-and-Bayesian-Filters-in-Pyt...
|
| That book mentions alpha-beta filters as sort of a younger
| sibling to full-blown Kalman filters. I recently had need of
| something like this at work, and started doing a bunch of
| reading. Eventually I realized that alpha-beta filters (and the
| whole Kalman family) are very focused on predicting the near
| future, whereas what I really needed was just a way to smooth
| historical data.
|
| So I started reading in that direction, came across "double
| exponential smoothing" which seemed perfect for my use-case, and
| as I went into it I realized... it's just the alpha-beta filter
| again, but now with different names for all the variables :(
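|
| For the curious, the shared update is tiny (a sketch; the
| constants are arbitrary):
|
|     alpha, beta, dt = 0.5, 0.3, 1.0
|     level, trend = 0.0, 0.0   # "smoothed value" and "slope"
|
|     for z in [1.0, 2.2, 2.9, 4.1]:
|         pred = level + trend * dt     # alpha-beta "prediction"
|         r = z - pred                  # residual / innovation
|         level = pred + alpha * r      # exponential smoothing
|         trend = trend + (beta / dt) * r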
|
| I can't help feeling like this entire neighborhood of math rests
| on a few common fundamental theories, but because different
| disciplines arrived at the same systems via different approaches,
| they end up sounding a little different and the commonality is
| obscured. Something about power series, Euler's number, gradient
| descent, filters, feedback systems, general system theory... it
| feels to me like there's a relatively small kernel of intuitive
| understanding at the heart of all that stuff, which could end up
| making glorious sense of a lot of mathematics if I could only
| grasp it.
|
| Somebody help me out, here!
| bonoboTP wrote:
| Maybe check out _Probabilistic Robotics_ by Dieter Fox,
| Sebastian Thrun, and Wolfram Burgard. It has a coherent
| Bayesian formulation with consistent notation on many Kalman-
| related topics. Also, with the rise of AI/ML, classic control
| theory ideas are being merged with reinforcement learning.
| girzel wrote:
| Thanks for the recommendation! It would never have occurred
| to me to look at robotics, but I can understand why that's
| very relevant.
|
| I read _Feedback Control for Computer Systems_ not too long
| ago, which felt like yet another restatement of the same
| ideas; I guess that counts as "classic control theory".
| esafak wrote:
| I agree that Bayesian filtering is the most general and
| logical approach. There are Bayesian derivations of the
| Kalman filter too.
|
| Here is a broad survey:
| https://people.bordeaux.inria.fr/pierre.delmoral/chen_bayesi...
| plasticchris wrote:
| Hey, I had very similar thoughts many years ago! The trick is
| yes, many filters boil down to alpha/beta, and the Kalman
| filter is (edit: can be) really a way to generate those
| constants given a (linear) model (set of equations describing
| the dynamics, ie the future states) and good knowledge of the
| noise (variance) in the measurements. So if the measurements
| always have the same noise it will just reduce the constants
| over time, and it is only really useful when the measurement
| accuracy can be determined well and also changes a lot.
| girzel wrote:
| Interesting. Are you characterizing Kalman filters mostly as
| systems of control/refinement on top of alpha-beta filters?
|
| I do feel like the core of it is essentially
| exponential/logarithmic growth/decay, with the option to
| layer multiple higher-order growth/decay series on top of one
| another. Maybe that's the gist...
| plasticchris wrote:
| Yeah, because a lot of times the equations that fall out of
| the KF look the same, only with variable values for
| alpha/beta.
| ndriscoll wrote:
| Incidentally this is why people miss the mark when they get mad
| about mathematicians using single letter variable names. Short
| names let you focus on the structure of equations and
| relationships, which lets you more easily pattern match and say
| "wait, this is structurally the same as X other thing I already
| know but with different names". It's not about saving paper or
| making it easier to write (it is not easier to write Greek
| letters with super/subscripts in LaTeX using an English
| keyboard than it would be to use words). It is about
| transmitting a certain type of information to the reader that
| is otherwise very difficult to transmit.
|
| While it uses letters so it looks vaguely like writing, math
| notation is very pictorial in nature. Long words would obscure
| the pictures.
| elbear wrote:
| I disagree. Single letter variables are meaningless. In order
| to get the big picture, you have to remember what all those
| meaningless letters stand for. Using meaningful variables
| would make this easier.
| brobdingnagians wrote:
| If you work with them long enough it becomes second nature
| to read them, and then it is easier to manipulate and
| compose them. The rest of the context is the background
| knowledge to understand the pithy core equations. Papers
| are for explaining concepts, equations are for symbolic
| manipulation. Meaningful variable names would be a middle
| ground and not good at either, except to help someone not
| familiar with the subject understand the equation; but a
| lot of the symbols are so abstract that they really need to
| be explained in more detail elsewhere or would be
| arbitrarily named.
| np_tedious wrote:
| If you're in an abstract/general mathematical function,
| then sure: single letters. If you're doing more business
| logic kind of stuff (iterating through a list of db/orm
| objects or processing a request body) then the names
| should be longer
| duped wrote:
| You're looking for the theory of linear (or nonlinear)
| dynamical systems. Unfortunately it's not one kernel of
| intuition backed by consistent notation, it's many with no
| consistency. A good course on controls and signals/systems will
| beat those intuitions into you and you learn the math/parlance
| without getting attached to any one notational convention.
|
| The real intuition is "everything is a filter." Everything else
| is about analysis and synthesis of that idea.
| thundercarrot wrote:
| If Q and R are constant (as is usually the case), the gain
| quickly converges, such that the Kalman filter is just an
| exponential filter with a prediction step. For many people this
| is a lot easier to understand, and even matches how it is
| typically used, where Q and R are manually tuned until it
| "looks good" and never changed again. Moreover, there is just
| one gain to manually tune instead of multiple quantities Q and
| R.
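|
| In 1-D the converged filter really is just that (a sketch,
| numbers made up):
|
|     K = 0.3              # steady-state gain, hand-tuned
|     x, v = 0.0, 1.0      # state and assumed velocity
|
|     for z in [1.2, 2.1, 2.8]:
|         x = x + v                 # prediction step
|         x = (1 - K) * x + K * z   # exponential blend with z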
| ActorNightly wrote:
| When you start dealing with linear systems and disturbances,
| you end up with basically matrix math and covariance in some
| form and way.
|
| The thing about the Kalman filter is that it's pretty well
| known and exists in many software packages (just like PID), so
| it's fairly easy to implement. But because noise is often not
| Gaussian, and systems are often not linear, it's more of a
| "works well enough" for most applications.
| ModernMech wrote:
| Close your eyes and walk around for a while. Imagine where you
| are. Now open your eyes. Is your actual location different from
| where you thought you were?
|
| That last bit, using an observation to update a belief on a state
| variable, that's what the Kalman filter does.
| cocostation wrote:
| I really like this set of videos for explaining the KF. I 'got
| it' more than I did with the material in the original post.
|
| https://www.youtube.com/watch?v=CaCcOwJPytQ
| lopatin wrote:
| Does anyone here use Kalman filter for pairs trading?
| eclectic29 wrote:
| Genuine question: why does the Kalman filter come up so
| frequently on HN? Is this something I'm missing? I'm a machine
| learning engineer, not a data scientist.
| xchip wrote:
| It is considered a difficult topic and people want to show they
| understand it.
|
| A similar thing happens with anything that has the word quantum
| or relativistic in it. I'm a physicist and we hardly talk about
| those, but here on HN we find people bringing them up every
| other day.
| scarmig wrote:
| ML and Kalman filtering try to solve similar problems.
|
| Some theories even suggest that Kalman filtering (or a similar
| algorithm) provides a basis for neurobiological learning. See
| predictive coding (e.g. https://arxiv.org/pdf/2102.10021.pdf)
|
| (Why I'm interested in it, at least.)
| chubs wrote:
| Recently tasked with implementing a Kalman filter, I found it
| very very difficult to find good resources that explained it in
| language that made sense to a developer like me. So after
| spending a month learning it, I wrote a couple of posts on it;
| perhaps someone might find them helpful?
|
| https://www.splinter.com.au/2023/12/14/the-kalman-filter-for...
|
| https://www.splinter.com.au/2023/12/15/the-kalman-filter-wit...
|
| As a developer I found the maths made sense only after
| implementing it, ironically. I guess we learn by building on top
| of what we already know? Is there a term for that?
___________________________________________________________________
(page generated 2024-02-12 23:00 UTC)