[HN Gopher] Category Theory [?] Machine Learning
___________________________________________________________________
Category Theory [?] Machine Learning
Author : bgavran
Score : 65 points
Date : 2023-03-07 20:04 UTC (2 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| AlexCoventry wrote:
| > _Category Theory has been finding increasing applications in
| machine learning_
|
| What's the most compelling application so far?
| umutisik wrote:
| It is tempting to believe that category theory will shed new
| light on and simplify machine learning, just like it did in
| algebraic geometry, algebraic topology and other mathematical
| things. This is wishful thinking. Folks who care about doing
| something useful should stay away from this content.
| rmdamiao wrote:
| Is this simply a consequence of exponential growth in CS
| publications driven by machine learning or is there something
| really going on here?
| bgavran wrote:
| OP here.
|
| The exponential growth in CS publication is much faster. This
| repository is simply a testament that CT is slowly ramping up.
|
| It's meant to show what kind of expressive power and breadth
| _current_ CT models have, which to my knowledge isn 't
| something that's well-known outside of our niche community.
| bgavran wrote:
| It's also meant to suggest where things are going (the kind
| of a chart I have in mind is this one
| https://twitter.com/bgavran3/status/1422206118688956420 ),
| though I understand this is something that deserves a much
| more substantial proof.
| adamnemecek wrote:
| The field needs better foundations. CT is pretty good.
| hgsgm wrote:
| Why? How?
|
| The OP GitHub site doesn't promote any material that
| introduces the concepts at all. The "survey" paper at the top
| is nigh-impenetrable. I'm sure the category theorists are
| having fun modelling machine learning, but it doesn't show
| how machine learning benefits from the category theory.
| bgavran wrote:
| There's a well-articulated essay from Christopher Olah that
| captures the undercurrent, and perhaps even the motivation
| of many of these papers
| https://colah.github.io/posts/2015-09-NN-Types-FP/
|
| You're completely right that the repo doesn't promote any
| introductory material. CT is notoriously difficult to get
| into, and this repository wasn't meant to be a pedagogical
| one, but rather a list of all the relevant papers.
|
| Though, I'll see about remedying this. I have a different
| repository that curates a list of (what I consider to be,
| as a CS major) relatively approachable CT introductory
| materials
| https://github.com/bgavran/Category_Theory_Resources
|
| and it might be a good idea to add a pointer to it.
| adamnemecek wrote:
| Residual connection serves as a feedback/trace a la trace
| in traced monoidal categories. That is one insight I have
| gleaned from CT.
| resource0x wrote:
| You beat me to this. Indeed, of all categories the
| monoidal ones are the most potent. Look how nicely they
| fit in crypto ledgers:
| https://www.cl.cam.ac.uk/events/syco/3/slides/Nester.pdf
|
| \s
| Yoric wrote:
| Category Theory (just as all mathematical models for
| programming or subsets thereof) are building blocks for
| reasoning on what we build. Past applications of such
| mathematical models include:
|
| - programming languages with semantics that are better
| adapted to specific problems (e.g. Rust's ownership);
|
| - better compilers (see e.g. Haskell's supercompiler, which
| puts to shame `constexpr`-style features);
|
| - better static analyzers (e.g. better type systems,
| abstract interpretation, model checkers).
|
| In the case of Machine Learning, it _might_ some day help
| us create Machine Learning that we can understand and trust
| better. Or it might fail. Or it might help us invent
| something different entirely, in 30 years.
| anfelor wrote:
| Intuition from category theory has certainly been useful
| for some programming language features (eg. algebraic
| effects, modal types or lenses), but I don't think it has
| helped with the things you cite. Rusts ownership model
| was created with little formal, academic understanding -
| or do you mean more foundational stuff like linear logic?
| And I don't think that Haskell does supercompilation
| today? It does optimize a whole lot but that is more due
| to laziness and other optimizations. But neither these
| optimizations nor supercompilation have much to do with
| category theory as far as I am aware.
| Yoric wrote:
| Yeah, my phrasing is probably unclear. I meant to give
| examples of applications of formal semantics (huddling
| everything together). I should probably sleep instead of
| commenting on HN :)
|
| But it is my understanding that Rust's ownership traces
| its roots back to both linear logic (well, affine logic)
| and region-based resource management, both of which have
| formal semantics.
|
| I haven't followed what got merged into GHC, but I
| remember seeing demos (and a paper) of a Haskell
| supercompiler during a conference many years ago, so it
| is something.
|
| In my sleep-deprived brain, supercompilation is directly
| related to algebraic effects (although probably not in
| Haskell itself), which are themselves related to category
| theory. I could be wrong.
| riku_iki wrote:
| why those approaches never picked up outside of some
| academia projects?..
| Yoric wrote:
| These days, Microsoft requires model-checking proofs
| before accepting new device drivers. That's their secret
| weapon that finally got (mostly) rid of the BSOD.
|
| I suspect that Apple is also using model-checking at
| various layers, but I have no proof :)
| KRAKRISMOTT wrote:
| No. It won't make a significant (if any at all) difference to
| effectiveness. Rewriting Pytorch in Haskell won't magically
| get you AGI.
| adamnemecek wrote:
| It's not about rewriting things in Haskell but about using
| CT to reason about architectures.
| solomonb wrote:
| No one was suggesting rewriting anything in Haskell
| afaict..
| adamnemecek wrote:
| I have recently written a paper on understanding machine learning
| via the lens of Hopf algebra https://arxiv.org/abs/2302.01834.
|
| Hopf algebras (which are really just tensors with recurrence
| relations built in) subsume convnets, transformers and diffusion
| model and also provide a theoretically better autodiff that
| operates within single layers as opposed to across entire graphs.
|
| Furthermore, there is a correspondence between Hopf algebra and
| cyclical linear logic and Hopf algebras are related to zonotopes,
| which are polyhedra that have been used in verified numerical
| computation. I'm strongly convinced the LL connection can provide
| proofs over zonotopes which paves the way towards interpretable
| AI and will be central for XAI.
|
| I know this sounds too good to be true but Persi Diaconis has
| also written a paper that shows how useful Hopf algebras are in
| the context of Markov chains https://arxiv.org/abs/1206.3620
|
| I'm working on a next gen Hopf algebra based machine learning
| framework.
|
| Join my discord if you want to discuss this further
| https://discord.cofunctional.ai.
|
| ====
|
| My account is currently rate limited so I will use this comment
| to respond to comments below.
|
| red_trumped: What about Hopf algebras do I not understand?
|
| gaze: Haha, it's been a while since I have commented about QC.
| What do I not understand about it? And what comment are you
| referring to?
| ianandrich wrote:
| I just read your Coinductive guide to inductive transformer
| heads paper.
|
| My mind is blown.
|
| Is the Hopf Algebra based ML framework you are working on on
| your github? I took a glance, but you have 1500 repositories
| and it wasn't on the first few of them.
| red_trumpet wrote:
| Your paper didn't pass my smell test at all, tbh. For example
| the formula you write about "product" and "coproduct" in
| section 3 is literally identical (as "=" is symmetric). In
| section 4.2 you write "the product is the standard tensor
| product" with a formula that doesn't at all involve the map m:
| A \otimes A \to A. The formula you write is the _induced_
| product on A \otimes A, assuming that you already have a
| product on A. The formula for "coproduct" is just an
| example[1] of a coproduct, not every coproduct has to look that
| way.
|
| [1] https://en.wikipedia.org/wiki/Coalgebra#Examples
| jurynulifcation wrote:
| adamnemecek has posted too many comments and is in cooldown
| phase, but he's asked me to post this comment: "It's the
| programmers equal sign. I think that the surrounding text
| provides a decent explanation what the deal is.
|
| You are right, there's a missing sentence fragment, "standard
| tensor product that satisfies the property...".
|
| Read the Diaconis paper. "
|
| --- This isn't a sock puppet and I hope this isn't against
| site rules. I just wanted to try and help facilitate good
| discussion. I think trumpet brought up some interesting
| criticisms and felt Adam had a legitimate interest in
| responding ASAP.
| LudwigNagasena wrote:
| > It's the programmers equal sign.
|
| That doesn't seem to make any sense.
| [deleted]
| gexaha wrote:
| could you advertise your research a bit less often, please? i
| see your post like literally almost every other day here
| [deleted]
| hgsgm wrote:
| Or at least explain it in more accessible way. Every time
| Adam posts about the paper, it gets confused comments and no
| engagement on the content, because it's pretty deep graduate
| level pure math, which is occasionally seen but rare on HN.
| red_trumpet wrote:
| As a maths PhD student that has seen Hopf algebras before
| (though I'm no expert, and the context was different), I'm
| not convinced Adam understands things about Hopf algebras.
| gaze wrote:
| He doesn't understand quantum computing either. I wish he
| would stop writing authoritatively about things he
| doesn't understand.
| donnowhy wrote:
| category theory is 'native 2-dimensional' math. i.e. category
| theory explains everything in terms of graphs, where a graph is
| made from two different sorts of 'entities', nodes and vertices
| i.e. categories and morphisms
|
| this being math, I wonder to which extent can category theory be
| re-expressed in terms of sets.
|
| perhaps a better question is if category theory can be re-
| expressed (or founded on) functions?
|
| lastly, I wonder if category theory can be expressed in terms of
| functions (i think maybe it can, without sets?) why shouldn't it
| be expressible in terms of sets (for some reason I don't think
| just sets are sufficient, may have to define functions (which
| possible in terms of sets) before 'expressing' categories
| starting with set theory)?
| voxl wrote:
| Set theory is fine, see the (Stack
| Project)[https://stacks.math.columbia.edu/browse] which
| develops a ton of modern Category Theory on ZFC (Zermelo-
| Fraenkel Set Theory with the Axiom of Choice) alone.
|
| Alternative foundations of mathematics (Set Theory, Category
| Theory, Type Theory, and all their variations) can all mutually
| interpret the other by just postulating sufficiently large
| universes. You don't pick or advocate one based off its ability
| to encode mathematics, but instead based on its ability to
| express your intention and ideas.
|
| Really its no different from programming language preference in
| my book.
| Koshkin wrote:
| > _i.e. categories and morphisms_
|
| Objects, not categories.
| justatdotin wrote:
| > I wonder to which extent can category theory be re-expressed
| in terms of sets...
|
| yoneda
| mydogcanpurr wrote:
| > for some reason I don't think just sets are sufficient
|
| The reason you're looking for is that the category of sets is
| not a set.
___________________________________________________________________
(page generated 2023-03-07 23:00 UTC)