Post ATynOEIsprcgXFkfZ2 by numist@xoxo.zone
(DIR) Post #ATyn1MbzmOMNdtmuUC by simon@fedi.simonwillison.net
2023-03-25T16:22:33Z
1 likes, 1 repeats
One of the big challenges in learning to use LLMs is overcoming the very human urge to generalize and quickly form assumptions: "I tried X and the result was terrible - hence LLMs are terrible at X"

That might be true! They're bad at lots of things. But making a snap judgement based on a single anecdotal example is not a good way to explore this very weird new space
(DIR) Post #ATynC8vF18fWN1viE4 by simon@fedi.simonwillison.net
2023-03-25T16:25:08Z
0 likes, 0 repeats
You do have to maintain a very critical eye at all times - it's so easy to develop an incorrect mental model, like thinking ChatGPT can retrieve and read a URL because it generates a very convincing looking (though actually purely hallucinated) summary when you paste one in https://simonwillison.net/2023/Mar/10/chatgpt-internet-access/
(DIR) Post #ATynOEIsprcgXFkfZ2 by numist@xoxo.zone
2023-03-25T16:27:59Z
0 likes, 0 repeats
@simon If we're honest, a lot of the reactions are pretty familiar to students of history https://mastodon.cloud/@loren/110081850840000086
(DIR) Post #ATynaT6NdpbduGjtx2 by SignalsAndSorcery@ravenation.club
2023-03-25T16:29:46Z
0 likes, 0 repeats
@simon I've also noticed many of my non-engineer friends/family seem to refer to AI as one entity. For example they say: "Get AI to do it". I am splitting hairs a bit, but their wording implies a single AI algo backing every app. If they do think like that and have bad results from one system, it's possible they conclude all systems are the same. Not sure though.
(DIR) Post #ATypHUrf419HmNw66K by simon@fedi.simonwillison.net
2023-03-25T16:49:06Z
0 likes, 0 repeats
@synx508 one of the areas I'd most like to see more research into is how non-experts' mental models of what these things can do evolve over time

What effect does it have the first time it clearly lies to them or makes an obvious and egregious mistake?
(DIR) Post #ATyq34cPwrnGHNUZu4 by tolmasky@mastodon.social
2023-03-25T16:56:18Z
0 likes, 0 repeats
@simon It’s weird because with worse technologies people understand that there’s a learning curve. Everything from “needing to learn how to think in SwiftUI” to “needing to learn how to ride a bike” (vs. trying it once, falling off, and being like “This bike can’t even stay upright! Everything I heard about bikes must be hype or an exaggeration.”). But due to the “uncanny valley magic” of LLMs, people drop this expectation and give up too easily.
(DIR) Post #ATyqGXFwmmxgrUOvJ2 by JessTheUnstill@infosec.exchange
2023-03-25T16:57:50Z
0 likes, 0 repeats
@simon We're still in the peak of inflated expectations. Is there likely to be utility to LLMs in the long run? Certainly!

Is slapping ChatGPT into your application going to magically solve your users' problems? Hell no. But we have to get past the peak before people will finally slow down, do the ground work to understand the problem space, and use them in responsible ways.
(DIR) Post #ATyqVSywxXZvwvZQNE by sayrer@mastodon.social
2023-03-25T16:58:02Z
0 likes, 0 repeats
@simon
(DIR) Post #ATyqhbw9UVlBCJ3K3U by simon@fedi.simonwillison.net
2023-03-25T16:59:57Z
0 likes, 0 repeats
@tolmasky I blame the chat UI: it makes it instantly feel like this must be the most obvious technology in the world to use - you just send it messages and it replies!

I'm spending a lot of time at the moment trying to spread the message that this technology is not at all easy to use - at least not effectively and in a way that avoids all kinds of unexpected traps
(DIR) Post #ATyxuJQCLEg7DV6sPw by apike@mastodon.social
2023-03-25T18:25:49Z
0 likes, 0 repeats
@simon I’ve seen a lot of misunderstandings like that, even just stemming out of the nondeterministic nature of the models. So somebody will say “GPT-4 has this bad output if you input Y” when the actual discovery was “GPT-4 occasionally has this bad output if you input Y” which is less dramatic.
(DIR) Post #ATz1QUbwZga337dlFA by savaran@hachyderm.io
2023-03-25T19:05:08Z
0 likes, 0 repeats
@simon I tried grinding my coffee with a hammer and it was terrible, I don’t know why people use hammers at all! /s
(DIR) Post #ATz2qLzhKn5ro9e90C by blackcoffeerider@social.saarland
2023-03-25T19:20:52Z
0 likes, 0 repeats
@simon But the inverse also holds... "It gave me great answers for this Z which is of category X, so it must be great at everything you throw at it in category X"
(DIR) Post #ATz31ffQWxzk0ZYrrs by simon@fedi.simonwillison.net
2023-03-25T19:22:12Z
0 likes, 0 repeats
@blackcoffeerider yes exactly! That's particularly dangerous: assuming that it can do a thing reliably because it produced a very convincing (but actually hallucinated) answer the first time you tried it

That one catches people out all the time
(DIR) Post #ATz3JNFVkNg269sfui by simon@fedi.simonwillison.net
2023-03-25T19:26:25Z
0 likes, 0 repeats
@synx508 I want to see research

My hunch is that some (many?) people will be less likely to fall for LLM hallucinations if they experience some that both contradict those people's existing biases and convictions AND can be easily disproved
(DIR) Post #ATz3Vq3WTEOGupDW2y by blackcoffeerider@social.saarland
2023-03-25T19:26:32Z
0 likes, 0 repeats
@simon I am very irritated on that point because I have seen people I assumed should know better (working in tech!) jump the gun on that. And you end up in really strange arguments when given a Y similar to Z, also of category X, which has a disaster of an answer - not just wrong but misleading in a way that is incredibly hard to catch...
(DIR) Post #ATz3uJ54Cy05subfZA by wolfr@mastodon.social
2023-03-25T19:30:31Z
0 likes, 0 repeats
@simon @synx508 but now it can https://openai.com/blog/chatgpt-plugins
(DIR) Post #ATz4JJ8X8ZkxIW9KAC by simon@fedi.simonwillison.net
2023-03-25T19:32:22Z
0 likes, 0 repeats
@synx508 I don't think it can be put back in the bottle at this point, hence my interest in figuring out how best to help people use it effectively and responsibly
(DIR) Post #ATz4YtgdWl1301XCQy by SnoopJ@hachyderm.io
2023-03-25T19:40:24Z
0 likes, 0 repeats
@simon I am still feeling punch-drunk from the sheer volume of people who were (are? 😬) absolutely convinced that the model was running an honest-to-gosh virtual machine for them
(DIR) Post #ATz4xIN7OrOwVd2TZI by simon@fedi.simonwillison.net
2023-03-25T19:41:04Z
0 likes, 0 repeats
@SnoopJ Urgh, yeah that was a particularly weird and misunderstood demo!
(DIR) Post #ATz59qAxXlwupDKkd6 by simon@fedi.simonwillison.net
2023-03-25T19:43:14Z
0 likes, 0 repeats
@synx508 one thing I've been enjoying playing with recently is getting it to answer in the form of an anthropomorphic animal - getting marketing advice from a golden eagle, that kind of thing

Feels more honest than getting it to imitate human beings!
(DIR) Post #ATz5KQWxETKKTOfZzM by SnoopJ@hachyderm.io
2023-03-25T19:46:52Z
0 likes, 0 repeats
@simon on the one hand, I'm happy to see that we have a really obvious case for why more fundamental HCI research is important, but this is a heck of a way to make the case
(DIR) Post #AU0uzLzcIWw75pJDKy by lewiscowles1986@phpc.social
2023-03-26T17:02:23Z
0 likes, 0 repeats
@simon Actually this tweet series did remind me of something it did for me. It came up with the jq syntax to update a JSON value based on a predicate.

I had to iterate on its 20 incorrect answers, but it was enough to get something done, and with a spreadsheet I expanded the pattern to ~50 repos.

I'm still not sure its answer is particularly elegant, and for whatever reason I had to redirect output to another file and then overwrite the original; but I noticed you also had it showing you jq.
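[The jq pattern described in that post - update a JSON value when a predicate matches, redirecting to another file and then overwriting the original, since jq cannot edit in place - might look roughly like this sketch. The file name, field names, and predicate here are hypothetical, not from the original exchange:]

```shell
# Hypothetical data: a list of repos; mark those whose name starts
# with "legacy-" as archived.
echo '[{"name":"legacy-api","archived":false},{"name":"web","archived":false}]' > repos.json

# Update based on a predicate. jq has no in-place mode, so write to a
# temp file first and overwrite the original afterwards.
jq 'map(if .name | startswith("legacy-") then .archived = true else . end)' \
  repos.json > repos.tmp.json
mv repos.tmp.json repos.json
```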
(DIR) Post #AU0vBlsTsskUqftO7M by simon@fedi.simonwillison.net
2023-03-26T17:04:33Z
0 likes, 0 repeats
@lewiscowles1986 yeah, I love it for jq - I didn't used to use that tool at all because I found the syntax too difficult to remember but now I use it several times a week
(DIR) Post #AU0xjivbYb4pSFff60 by lewiscowles1986@phpc.social
2023-03-26T17:33:17Z
0 likes, 0 repeats
@simon did it ever cross your mind: "Lots of people find this tool [jq] hard to use, including me, and I'm fairly technical. Maybe the problem is the tool and not the learner/user?"

If I had to score software on helping learners, I'd say Datasette is better than ChatGPT, because it's a reliable teacher. I don't have to solve the parallel problems of the SQL not working, setting up a SQL lab, creating a UI and exploring data. I can pick one, and work on that.
(DIR) Post #AU0yhOktggzGGV7M5w by simon@fedi.simonwillison.net
2023-03-26T17:44:18Z
0 likes, 0 repeats
@lewiscowles1986 yeah, jq is a pretty terrible tool! I'm using it so much because it's available in the environments I care about and it solves a need for me: incorporating JSON into command-line pipelines

Before ChatGPT I hardly used it at all

But now... my tolerance for weird DSLs is suddenly significantly higher
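[A minimal sketch of the "JSON in command-line pipelines" use described above - the data here is made up for illustration, not from the thread:]

```shell
# Feed a JSON array through jq in a pipeline: filter with a predicate,
# then use -r to emit the matching field as raw text for the next stage.
echo '[{"name":"datasette","stars":9000},{"name":"sqlite-utils","stars":2000}]' \
  | jq -r '.[] | select(.stars > 5000) | .name'
```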