Post AQZqr8m7fYnDMRC0Rc by exterm@layer8.space
 (DIR) More posts by exterm@layer8.space
 (DIR) Post #AQZTNvtgAmuS4npZdA by tiago@social.skewed.de
       2022-12-13T17:41:32Z
       
       0 likes, 0 repeats
       
       LLMs can only learn the statistical patterns of language, not logic, understanding, etc. This will never change, even if you train them with infinite data.

       But there's a more fundamental problem with what is called “AI” these days: it is all based on supervised learning, i.e. curve fitting that matches inputs to outputs.

       This means that the only kind of “creativity” they can produce is interpolation: yielding an output for an input that is similar, but not identical, to something seen in the training set.

       Sometimes these models yield results that look like compositions of new things, but it's just an illusion: the training set is so large that they can find enough examples to interpolate between, using some intrinsic regularization property.

       True creativity, which is a hallmark of human intelligence, is not achievable in this way. The system must already be imbued with the ability to reason, before any data is seen. It must be able to extrapolate way beyond the training set, and with a minimal amount of information.
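       The interpolation-only behaviour described above can be sketched in a few lines. This is a toy piecewise-linear "model" standing in for any interpolating learner (an illustrative stand-in, not a claim about how any actual LLM works): it does fine between training points and degenerates outside their range.

```python
# Toy illustration: a "model" that can only blend between training
# examples works inside the training range and freezes at the edge
# outside it -- it cannot extrapolate.

def interpolate(x, points):
    """Piecewise-linear 'model': blends the two nearest training points.
    Outside the training range it can only clamp to the boundary value."""
    points = sorted(points)
    xs = [p[0] for p in points]
    if x <= xs[0]:
        return points[0][1]   # "extrapolation": frozen at the left edge
    if x >= xs[-1]:
        return points[-1][1]  # "extrapolation": frozen at the right edge
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            t = (x - x0) / (x1 - x0)
            return y0 + t * (y1 - y0)

train = [(0, 0.0), (1, 1.0), (2, 4.0), (3, 9.0)]  # samples of y = x**2
print(interpolate(1.5, train))  # 2.5 -- near the true value 2.25
print(interpolate(5, train))    # 9.0 -- stuck at the edge; truth is 25
```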
       
 (DIR) Post #AQZTgYyDddp0TLq0Yq by exterm@layer8.space
       2022-12-13T17:44:45Z
       
       0 likes, 0 repeats
       
       @tiago I agree with all of this, but I fear that with just a few years of improvement to existing capabilities, none of this will matter. Correlation becomes virtually indistinguishable from causation, and random recombination from “true creativity”.
       
 (DIR) Post #AQZUQbFbUZWarjvBtQ by tiago@social.skewed.de
       2022-12-13T17:53:10Z
       
       0 likes, 0 repeats
       
       @exterm Oh, I disagree completely. It's a testament to our computing prowess that we can currently produce these shallow illusions with sheer brute force, but I think you underestimate the truly astronomical amount of data (never mind computing power) required to “substitute” for deductive and inductive reasoning.

       Also, no imaginable amount of data can recover causation from correlation via brute force. You need a causal model for that, and, in general, the ability to perform interventions.
       
 (DIR) Post #AQZWgw464XInxiSMSW by pedro@mastodon.social
       2022-12-13T18:18:34Z
       
       0 likes, 0 repeats
       
       @tiago true, but most of what we call creativity is interpolation. Everything is a remix 🙂
       
 (DIR) Post #AQZXxzg9zMq030xSiW by tiago@social.skewed.de
       2022-12-13T18:32:52Z
       
       0 likes, 0 repeats
       
       @pedro Of course this is not true. Everything that we know that we are not born with was invented at some point by someone. Language, mathematics, music, etc. There's plenty of remix, but there's also true creativity.
       
 (DIR) Post #AQZbmWzMiYt1hJjNAG by pedro@mastodon.social
       2022-12-13T19:15:35Z
       
       0 likes, 0 repeats
       
       @tiago imho true creativity (extrapolation) is a rare event; the space of interpolation is huge, and only a tiny part of it has been explored. And when a genius creates (extrapolates) something great, it also opens a very large new space of interpolation for AI. We can call it false creativity, but it will still change the world.
       
 (DIR) Post #AQZc1rDjmKr8mzlGSW by tiago@social.skewed.de
       2022-12-13T19:18:23Z
       
       0 likes, 0 repeats
       
       @pedro I still disagree. I think that we are creative all the time. We are creative when having this conversation. We are not interpolating between a huge set of points. We have neither the storage capacity nor the computational power to interpolate. We can't afford the brute force necessary.
       
 (DIR) Post #AQZfQVb1c9VWpB6GwK by pedro@mastodon.social
       2022-12-13T19:56:26Z
       
       0 likes, 0 repeats
       
       @tiago I think I am interpolating all the time, and that is ok 🙂 For example, if I create a beautiful new color palette never seen before, is this extrapolation or interpolation? I would say it is interpolation in the color space.
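       "Interpolation in the color space" can be sketched literally: a palette generated by blending two anchor colours. Every output lies on the line between the anchors, so nothing outside their RGB bounding box can ever be produced. (The anchor colours here are arbitrary illustrative choices.)

```python
# A "new" palette produced purely by interpolation: every colour is
# a convex blend of the two anchors, so the palette can never leave
# the segment between them in RGB space.

def blend(c1, c2, t):
    """Linear interpolation between two RGB colours, t in [0, 1]."""
    return tuple(round(a + t * (b - a)) for a, b in zip(c1, c2))

teal, coral = (0, 128, 128), (255, 127, 80)
palette = [blend(teal, coral, i / 4) for i in range(5)]
print(palette)  # five colours, all on the teal-coral line
```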
       
 (DIR) Post #AQZgJH1JVcmlXqVhlA by tiago@social.skewed.de
       2022-12-13T20:06:21Z
       
       0 likes, 0 repeats
       
       @pedro It's interpolation only if you already have the surrounding points inside your head!

       There is a huge difference between being creative and *being original*!

       We are creative when we solve a problem or make an invention from scratch and the solution is new to us. We are original only when the solution is new to others.

       When we are solving mathematical problems we are *almost never* interpolating; we are solving the problem from scratch, and the solution is new to us.

       It's hard to be original because there are many of us with similar reasoning abilities and access to the same kinds of information, so we arrive independently at the same answers, and therefore there is a large probability that someone arrived at a given answer before we did. But nevertheless we are creative all the time!
       
 (DIR) Post #AQZozP7cZo8UT43qfw by exterm@layer8.space
       2022-12-13T21:43:35Z
       
       0 likes, 0 repeats
       
       @tiago I'm not saying recovering causation is possible, I'm saying it may become irrelevant :)

       I'd argue that what we've seen over the last few years is an expansion of the use cases where that's already true. Corporations generally have no interest in causation; they want to see numbers go up. So that's been fertile ground for this.

       However, if you show me practical limitations that clearly can't be overcome by increasing training-set size and computational power, I'll happily revise that opinion.
       
 (DIR) Post #AQZqr8m7fYnDMRC0Rc by exterm@layer8.space
       2022-12-13T21:45:44Z
       
       0 likes, 0 repeats
       
       @tiago I'm not sure I understand the part about interventions. I was mostly thinking about correlation vs. causation in terms of prediction.

       Can you expand on why you think causation is necessary to perform interventions?
       
 (DIR) Post #AQZqr9D43OLMhznWRF by tiago@social.skewed.de
       2022-12-13T22:04:31Z
       
       0 likes, 0 repeats
       
       @exterm Prediction is not what defines causation! I can predict whether it's going to rain tomorrow by looking at the weather report today, but that is not what causes it! What defines causation is the relationship to interventions and counterfactuals, i.e. answers to “what would have happened if” questions.

       The rooster crows when the sun rises. This is correlation. But the latter causes the former, not the opposite. How do we know? By postulating interventions: if we kill the rooster, the sun will still rise. If we cover the sun, the rooster will be silent.

       We can observe the rooster crowing at sunrise for all eternity, and fill petabytes of observational data, but if we don't consider the intervention, we can never establish the flow of causation.

       You are very wrong about companies not being interested in causation. Why would they spend so much effort on A/B testing and so on? Because they want to know if a given treatment (e.g. a marketing campaign) causes users to change their behavior.

       I can also assure you that cigarette companies care very much about the causal relationship between smoking and cancer, even if their intent is to hide or mislead. If they could disprove causation, they would scream it at the top of their tar-filled lungs.
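       The rooster example can be written down as a tiny structural model, assuming the causal story stated above (sunrise causes crowing). The two postulated interventions reveal the asymmetry that pure observation of (sunrise, crow) pairs never can.

```python
# Minimal structural model of the rooster example: sunrise -> crow.
# Interventions on each variable expose the direction of causation.

def world(sun_up, rooster_alive=True):
    """The rooster crows iff the sun is up (and the rooster is alive)."""
    crow = sun_up and rooster_alive
    return {"sun_up": sun_up, "crow": crow}

# Intervention 1: remove the rooster. The sun still rises.
no_rooster = world(sun_up=True, rooster_alive=False)
# Intervention 2: cover the sun. The rooster falls silent.
covered_sun = world(sun_up=False)

print(no_rooster)   # {'sun_up': True, 'crow': False}
print(covered_sun)  # {'sun_up': False, 'crow': False}
```

       Intervening on the effect (the rooster) leaves the cause untouched; intervening on the cause (the sun) changes the effect. No amount of passive observation of the joint distribution distinguishes the two directions.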
       
 (DIR) Post #AQZyBJx4WY8VUlsrSa by exterm@layer8.space
       2022-12-13T23:26:34Z
       
       0 likes, 0 repeats
       
       @tiago I think we’ll continue to disagree on parts of this, at least!

       As for companies’ interest in causation… surely sometimes they are interested in it. But I’d argue that in most cases they are not. The A/B test is actually a great example: test two variants of a product, measure which one works better, then decide to use that one. Why does that one work better? The A/B test doesn’t say, and it doesn’t need to. The company is making more money.
       
 (DIR) Post #AQZzNRqF7DXJOW2Its by tiago@social.skewed.de
       2022-12-13T23:40:00Z
       
       0 likes, 0 repeats
       
       @exterm A/B testing is a randomized controlled experiment (a.k.a. an intervention) designed *precisely* to identify causation. If performed correctly, it tells you that a variant of a treatment *causes* a particular outcome relative to the original.

       Indeed, the mechanism behind it is not revealed, but the causal relationship is. So it's not simply about which product works better, but about whether the new features introduced in B that are not present in A are actually responsible for its performing better. It's deeper than what you are implying.

       If you are interested in learning more about causation I would recommend “The Book of Why” by Judea Pearl: https://en.wikipedia.org/wiki/The_Book_of_Why
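       The logic of the randomized experiment described here can be sketched in a few lines: because assignment to A or B is random, any systematic difference in outcome can only come from the treatment itself. The conversion rates below are made-up illustrative numbers, not data from any real test.

```python
# Sketch of a randomised A/B test: random assignment is the
# intervention that licenses a causal reading of the rate difference.
import random

random.seed(0)

def run_trial(n=10_000, rate_a=0.10, rate_b=0.12):
    """Randomly assign users to A or B; return observed conversion rates."""
    results = {"A": [0, 0], "B": [0, 0]}  # [conversions, assigned]
    for _ in range(n):
        arm = random.choice("AB")         # the randomised intervention
        rate = rate_a if arm == "A" else rate_b
        results[arm][0] += random.random() < rate
        results[arm][1] += 1
    return {arm: conv / total for arm, (conv, total) in results.items()}

rates = run_trial()
print(rates)  # observed rates near 0.10 (A) and 0.12 (B)
```

       A purely observational dataset could show the same rate difference for confounded reasons (e.g. variant B shown only to returning users); randomisation is what rules those out.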
       
 (DIR) Post #AQaMN3a3cDBz60FR0C by exterm@layer8.space
       2022-12-14T03:57:37Z
       
       0 likes, 0 repeats
       
       @tiago thank you for your patient explanations - I guess I have seen a lot of misuse of A/B testing then. I’ll take a look at that book. However, am I wrong in thinking that all commercial applications of machine learning are based on causation being unnecessary? I’m still not convinced otherwise. For example, machine learning used to detect payment fraud is purely based on correlation, from what I’ve seen in the past.
       
 (DIR) Post #AQaMTpB5YwmCSBnQOW by exterm@layer8.space
       2022-12-14T03:58:51Z
       
       0 likes, 0 repeats
       
       @tiago the distinction between mechanism and causal relationships is very helpful.
       
 (DIR) Post #AQaTvesM9tc8I1Fmam by tiago@social.skewed.de
       2022-12-14T05:22:20Z
       
       0 likes, 0 repeats
       
       @exterm Indeed, it's a *limitation* of virtually all ML methods currently employed that they cannot deal with causation. But that does not mean it is not needed!

       As I mentioned initially, this is a blind spot in AI that will not be fixed by throwing more data or compute at it. There's already quite a bit of research in this field, so hopefully things will start changing in practice soon.
       
 (DIR) Post #AQb6VvDGgtURdrknAG by exterm@layer8.space
       2022-12-14T12:34:37Z
       
       0 likes, 0 repeats
       
       @tiago I think my point is that what we had 5-10 years ago is sufficient for a lot of commercial use cases, and with these correlation machines (my interpretation) now being orders of magnitude better at the high end, the fraction of use cases that are commercially viable will increase drastically. Meaning, in effect, that causal relationships become less relevant. For my initial statements in this thread I just extrapolated from there.