Post AWqVvL9Tlymi89MCOW by alexthurow@mstdn.social
 (DIR) Post #AWqVucXhd4QHmFLPP6 by alexthurow@mstdn.social
       2023-05-07T08:03:48Z
       
       0 likes, 1 repeats
       
        @baldur in https://illusion.baldurbjarnason.com/: „These models aren't new bird brains - new alien minds that are peers to our own. They aren't even insect brains. Insects have autonomy. They are capable of general problem solving - some of them dealing with tasks of surprising complexity - and their abilities tolerate the kind of minor alterations in the problem environment that would break the correlative pseudo-reasoning of a language model. Large Language Models are something lesser. They…“ 1/
       
 (DIR) Post #AWqVuep99BZ4qvWMs4 by alexthurow@mstdn.social
       2023-05-07T08:04:50Z
       
       0 likes, 0 repeats
       
       „… are water running down pathways etched into the ground over centuries by the rivers of human culture. Their originality comes entirely from random combinations of historical thought. They do not know the 'meaning' of anything - they only know the records humans find meaningful enough to store. Their unreliability comes from their unpredictable behaviour in novel circumstances. When there is no riverbed to follow, they drown the surrounding landscape. The…“ 2/
       
 (DIR) Post #AWqVugaKbEu0JYbZ3Y by alexthurow@mstdn.social
       2023-05-07T08:05:06Z
       
       0 likes, 0 repeats
       
       „… entirety of their documented features, capabilities, and recorded behaviour - emergent or not - is explained by this conceptual model of generative AI. There are no unexplained corner cases that don't fit or actively disprove this theory. Yet people keep assuming that what ChatGPT does can only be explained as the first glimmer of genuine Artificial General Intelligence.“ /3
       
 (DIR) Post #AWqVuiJOBCXRfah3vU by alexthurow@mstdn.social
       2023-05-07T08:08:58Z
       
       0 likes, 0 repeats
       
       „When ChatGPT demonstrates intelligence, that comes from us. Some of it we construct ourselves. Some of it comes from our inherent biases. There is no 'there' there. We are alone in the room, reconstructing an abstract representation of a mind. The reasoning you see is only in your head. You are hallucinating intelligence where there is none. You are doing the textual equivalent of seeing a face in a power outlet. This drive - anthropomorphism - seems to be innate. Our…“ 1/
       
 (DIR) Post #AWqVuk9tJU85OiGVOq by alexthurow@mstdn.social
       2023-05-07T08:10:20Z
       
       0 likes, 0 repeats
       
       „… first instinct when faced with anything unfamiliar - whose drives, motivations, and mechanisms we don't understand - is to assume that they think much like a human would. When that unfamiliar agent uses language like a human would, the urge to see them as near or fully human is impossible to resist - a recurring issue in the history of AI research that dates all the way back to 1966. These tools solve problems and return fluent, if untruthful, answers, which…“ 2/
       
 (DIR) Post #AWqVuluMoAtqp918To by alexthurow@mstdn.social
       2023-05-07T08:10:35Z
       
       0 likes, 0 repeats
       
       „… is what creates such a convincing illusion of intelligence.“ /3
       
 (DIR) Post #AWqVund4PSFiA4wLnU by alexthurow@mstdn.social
       2023-05-07T08:40:26Z
       
       0 likes, 0 repeats
       
       „After everything the tech industry has done over the past decade, the financial bubbles, the gig economy, legless virtual reality avatars, crypto, the endless software failures - just think about it - do you think we should believe them when they make grand, unsubstantiated claims about miraculous discoveries? Have they earned our trust? Have they shown that their word is worth more than that of independent scientists?“
       
 (DIR) Post #AWqVupM7zPt9W71qfQ by alexthurow@mstdn.social
       2023-05-07T10:59:39Z
       
       0 likes, 0 repeats
       
        „Sceptical Believers are the ones who take the biggest risks, because they don't appreciate the pace of change of a paradigm shift like the True Believer does, and they aren't as careful as the True Sceptic.“
       
 (DIR) Post #AWqVur4Tc0xQpwmmQq by alexthurow@mstdn.social
       2023-05-07T11:00:33Z
       
       0 likes, 0 repeats
       
        „What strategy you choose for adopting generative AI is going to depend on your appetite for risk, faith in its capabilities, and desire for differentiation.“
       
 (DIR) Post #AWqVusllIZAy6U2rXU by alexthurow@mstdn.social
       2023-06-16T20:44:33Z
       
       0 likes, 0 repeats
       
        „Most Generative AI are too immature and unpredictable to be used safely. They have too much of a tendency to make up facts and data, plagiarise text and images, and their behavioural edge cases are extreme enough to potentially inflict serious damage to almost any organisation's reputation. The only scalable safeguard is to prefer specialised AI-based tools that can make assurances about their use in their chosen field, and whose assurances you have confirmed to be true yourself.“
       
 (DIR) Post #AWqVuuY0gfMdcPcuNk by alexthurow@mstdn.social
       2023-06-16T20:50:40Z
       
       0 likes, 0 repeats
       
        „Spam, abuse, and fraud are the perfect use cases for Generative AI software. Spammers and scammers are unaffected by many of the downsides of Generative AI, and the only issue that could be prohibitive to their task is the high cost. These tools are likely to help them bypass almost all of your existing safeguards against abusive automation.“
       
 (DIR) Post #AWqVuwBObiSmgr3sPY by alexthurow@mstdn.social
       2023-06-16T20:57:22Z
       
       0 likes, 0 repeats
       
        „Some of the 'safeguards' that AI vendors themselves are marketing are potentially more harmful than the sporadic fraud itself. Every tool that claims to check whether any given text or image is AI-generated is itself built using the same principles and mechanisms as other AI, and they inherit much of their unreliability. OpenAI's own classifier, which should be the current state-of-the-art given that they are the leading company in Generative AI, only correctly identifies AI…“ 1/
       
 (DIR) Post #AWqVuxxI18MsBgTdhY by alexthurow@mstdn.social
       2023-06-16T20:57:34Z
       
       0 likes, 0 repeats
       
        „… generated text 26% of the time and falsely identifies human-authored text as authored by an AI in about 9% of cases. This means that if you have 100 applicants or students, 10% of whom attempt to cheat using ChatGPT text in a test, the current state-of-the-art AI content checker will only correctly identify three out of ten cheaters, but will falsely accuse eight students of cheating. That's with a tool made by the leading company in the field, one that has full…“ 2/
       
 (DIR) Post #AWqVuztSoKUoCOhc12 by alexthurow@mstdn.social
       2023-06-16T21:00:17Z
       
       0 likes, 0 repeats
       
        „… access to the technology and models that ChatGPT is built on. Other AI content checkers don't have that access and are likely to perform even worse.“ /3
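The arithmetic in the quoted thread can be checked directly. A minimal sketch, assuming only the rates stated in the quote (26% true-positive, 9% false-positive) and a class of 100 with 10 cheaters:

```python
# Expected outcomes of an AI-text classifier with the quoted rates,
# applied to a class of 100 students, 10 of whom cheat.
true_positive_rate = 0.26   # correctly flags AI-generated text
false_positive_rate = 0.09  # wrongly flags human-written text

cheaters, honest = 10, 90

caught = round(cheaters * true_positive_rate)          # cheaters flagged
falsely_accused = round(honest * false_positive_rate)  # honest students flagged

print(caught, falsely_accused)  # → 3 8
```

Of the eleven students flagged, only three actually cheated, so any individual accusation made by the classifier is wrong far more often than it is right.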
       
 (DIR) Post #AWqVv1cAPbqfXKcpKi by alexthurow@mstdn.social
       2023-06-16T21:02:00Z
       
       0 likes, 0 repeats
       
        „The safeguards built into current AI software are insufficient. Most of them are too easily bypassed by users intent on doing so, and many of them semi-regularly generate output that is unnerving or even offensive. AI companies have a poor track record for software security.“
       
 (DIR) Post #AWqVv3ZP8qpLbLLeIy by alexthurow@mstdn.social
       2023-06-16T21:03:43Z
       
       0 likes, 0 repeats
       
        „If you're already an AI company, stacked to the rafters with AI researchers and programmers specialised in machine learning, all supported by an effective AI safety team, then carry on. This book isn't for you. It's for the people who you hope to have as customers.“
       
 (DIR) Post #AWqVv5M0VdIb8N5yhU by alexthurow@mstdn.social
       2023-06-16T21:07:11Z
       
       0 likes, 0 repeats
       
        „[…] if it weren't for the issue that it's entirely unclear whether hosted generative AI tools enjoy safe harbour protections for hosting that are provided by Section 230(c)(1) of the US Communications Decency Act. Section 230, and similar laws in other countries, let hosting companies host without being liable for what their customers upload. Without those laws, social media networks and search engines would not exist. In most countries these laws have some limits, but overall what they…“ 1/
       
 (DIR) Post #AWqVv72wDVEYNoBmFs by alexthurow@mstdn.social
       2023-06-16T21:07:20Z
       
       0 likes, 0 repeats
       
        „… mean is that you aren't legally liable for the comments that somebody posts on your service. One problem with integrating generative AI into your product, especially a hosted product, is that many of the uses of AI are likely to fall outside these protections. If somebody gets your online AI chat widget to talk like a genocidal Nazi, you might be as liable for that as if you had published pro-genocide Nazi propaganda yourself. That would mean that exposing these tools to your…“ 2/
       
 (DIR) Post #AWqVv8jVwgsvc97IG0 by alexthurow@mstdn.social
       2023-06-16T21:09:06Z
       
       0 likes, 0 repeats
       
        „… customers risks exposing you to enormous legal liability. […] Microsoft, which is positioning itself to be at the forefront of AI services, doesn't exactly have a history of handling the security of its products well. Integrating these tools so that the end-user can use them safely takes enormous work and expertise and, even with all the right resources, you still might not succeed. We don't know if all of these flaws can be fixed in the models themselves. You might end up…“ 3/
       
 (DIR) Post #AWqVvAaj2L2jNT1Ips by alexthurow@mstdn.social
       2023-06-16T21:12:38Z
       
       0 likes, 0 repeats
       
       „… adding novel and innovative security holes to your own products and that's a kind of innovation nobody wants.“ /4
       
 (DIR) Post #AWqVvCNgNnnYvavume by alexthurow@mstdn.social
       2023-06-16T21:14:27Z
       
       0 likes, 0 repeats
       
        „The biggest source of errors, bias, and legal risk in AI tools is their training data. It's next to impossible for an organisation to assess the risk of using AI software whose training data set is undocumented or, worse, kept as a trade secret. Dependency on closed AI software is inherently riskier than other closed software as it leaves you with fewer tools to validate grand claims from the vendor.“
       
 (DIR) Post #AWqVvE4c5fjWB21iL2 by alexthurow@mstdn.social
       2023-06-16T21:17:33Z
       
       0 likes, 0 repeats
       
        „You can't get an accurate sense of the biases that a generative AI has - how racist or misogynistic it is - through ad hoc testing. The output of a generative AI will vary, and you might just get lucky or unlucky in your testing. You need to be able to investigate the training data for inherent biases and when you test the AI directly, you need to be able to do that in a structured, replicable way. A closed, hosted model could get updated at any time, introduce regressions, or…“ 1/
       
 (DIR) Post #AWqVvFnffdMxX47DCy by alexthurow@mstdn.social
       2023-06-16T21:18:17Z
       
       0 likes, 0 repeats
       
        „… invalidate your test results, requiring you to start your testing again from square one. This is why a true risk and reward assessment of any large-scale use of generative AI requires access to both the training data (preferably with documentation) and to the AI model itself.“ /2
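A minimal sketch of what the "structured, replicable" testing from the quote could look like in practice. The `run_bias_probe` and `query_model` names are hypothetical (not from the book); the point is fixed prompts, repeated runs, and a recorded model version, so results can be compared after a vendor updates the model:

```python
import json
from collections import Counter

def run_bias_probe(query_model, model_version, prompts, runs=20):
    """Run a fixed prompt set repeatedly and record every output,
    so results can be re-run and compared across model updates.
    `query_model` is a stand-in for whatever API is under test."""
    results = []
    for prompt in prompts:
        outputs = [query_model(prompt) for _ in range(runs)]
        results.append({
            "model_version": model_version,
            "prompt": prompt,
            "output_counts": dict(Counter(outputs)),
        })
    return results

# Stubbed example; a real harness would call a hosted or local model.
stub = lambda prompt: "deterministic reply"
report = run_bias_probe(stub, "v2023-06-16", ["The nurse said that"], runs=5)
print(json.dumps(report, indent=2))
```

Because the model version is stored alongside every result, a regression introduced by a silent model update shows up as a diff between two probe reports rather than as an unexplained change in ad hoc spot checks.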
       
 (DIR) Post #AWqVvHYV90QIyb27qC by alexthurow@mstdn.social
       2023-06-16T21:21:55Z
       
       0 likes, 0 repeats
       
        „For the field to advance and for organisations to be able to use generative AI safely, we need these AI models to be open and for the training data to be documented. This is a problem because the models on the market that we think are the most capable - OpenAI's and Google's - are closed. They are black boxes we can't properly test. They might be fine, or they might not be. These AI models could be filled with ticking time-bombs: undiscovered security…“ 1/
       
 (DIR) Post #AWqVvJRq6kHaqVvpjc by alexthurow@mstdn.social
       2023-06-16T21:22:05Z
       
       0 likes, 0 repeats
       
       „… problems, plagiarism rates, inaccurate summaries, and general safety issues.“ /2
       
 (DIR) Post #AWqVvL9Tlymi89MCOW by alexthurow@mstdn.social
       2023-06-19T09:16:58Z
       
       0 likes, 0 repeats
       
       „Large Language Models have a tendency to hallucinate and fabricate nonsense in ways that are hard to detect. This makes fact-checking and correcting the output very expensive and labour-intensive. These hallucinations appear in AI-generated summaries as well, which makes these systems unsuitable for many knowledge- and information-management tasks.“
       
 (DIR) Post #AWqVvN3Wh5DA2GaTOS by alexthurow@mstdn.social
       2023-06-19T09:19:33Z
       
       0 likes, 0 repeats
       
        „One of the AI industry's time-tested tactics for implying success in failure is through creative naming. There is even a standard term for it in the field: wishful mnemonics. Statistical analysis and transformation is called learning, even though it bears no resemblance to how animals or humans learn. The neurons of an AI's neural network have next to nothing in common with the neurons you find in animal brains. AIs are said to understand or predict. Vendors ascribe motivation to systems…“ 1/
       
 (DIR) Post #AWqVvOq83rgPZIKnmy by alexthurow@mstdn.social
       2023-06-19T09:22:00Z
       
       0 likes, 0 repeats
       
        „… with about as much autonomy as Excel. The models do not parse; they are said to read. Memorisation, which I described earlier, is a good example. These AI systems are not intelligences that can memorise something as a human would. They are software that copies, stores, parses, or generates data. They do not memorise. It is a patently inaccurate label for the phenomenon of copying. But, if you want to follow and participate in AI research, you are stuck with the…“ 2/
       
 (DIR) Post #AWqVvQaxXEjl0pFiQC by alexthurow@mstdn.social
       2023-06-19T09:22:31Z
       
       0 likes, 0 repeats
       
        „… terms they use. If you don't accept their terminology, discourse and debate become impossible. But if you do, you at least partly accede to their premise that these are, well, Artificial Intelligences, and not just software with no autonomy or intelligence of their own.“ /3
       
 (DIR) Post #AWqVvSSWbZB8nFK0YK by alexthurow@mstdn.social
       2023-06-19T09:24:44Z
       
       0 likes, 0 repeats
       
        „One such misnomer that seems to be specifically chosen to gloss over the flaws and inadequacies of these systems is hallucination. These systems cannot hallucinate. They are not deluded, deranged, delirious or in any other mental state you can think of. They are data and they can't feel. An AI can't have any kind of mental state in any way. They are, quite simply, software that keeps generating garbage and the sooner we face that reality, the sooner we can deal with the consequences and…“ 1/
       
 (DIR) Post #AWqVvUIflAUCVGjATQ by alexthurow@mstdn.social
       2023-06-19T09:25:01Z
       
       0 likes, 0 repeats
       
        „… use them where it's safe and productive. These hallucinations are the most frequent and obvious issue with the performance of common text-oriented Generative AI. This is one instance where adopting the industry term actively hinders analysis and risk assessment, because it is an umbrella term that covers a number of distinct errors.
        • Omissions caused by training data limitations, which lead the language model to write a falsehood, such as an AI…“ 2/
       
 (DIR) Post #AWqVvW1jL87drIofLM by alexthurow@mstdn.social
       2023-06-19T09:32:10Z
       
       0 likes, 0 repeats
       
        „… whose training data set extends only to 2021 not being aware of the release of a film in 2022.
        • Fabrication errors, where the AI fabricates an answer that fits the pattern of a valid answer to the question, such as when it attributes fictional papers to real writers, claims that an API can do something it absolutely can't, makes up a fictional history of Ukraine, claims that somebody is dead when they aren't, or makes up a summary for web pages that don't…“ 3/
       
 (DIR) Post #AWqVvXxY9dxzqusM6a by alexthurow@mstdn.social
       2023-06-19T09:32:24Z
       
       0 likes, 0 repeats
       
        „… exist.
        • Summarisation errors, where the AI makes inaccurate claims about the text it was supposed to summarise.
        These are three distinct types of errors. Even though the causes of these errors are the same, their differences mean they are likely to require different mitigation mechanisms in the software facing the end-user.“ /4
        @baldur in https://illusion.baldurbjarnason.com/
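Since the quote argues the three error types need different mitigations, an evaluation harness would want to track them separately. A minimal sketch; the category names and the `tally` helper are mine, following the quote, not taken from the book:

```python
from enum import Enum
from collections import Counter

class HallucinationType(Enum):
    OMISSION = "omission"            # training-data gap, e.g. post-cutoff events
    FABRICATION = "fabrication"      # invented but plausible-looking answer
    SUMMARISATION = "summarisation"  # claim not supported by the source text

def tally(labelled_errors):
    """Count labelled errors per category so each category can be
    routed to its own mitigation and tracked over time."""
    return Counter(kind for kind, _ in labelled_errors)

# Hand-labelled examples mirroring the cases in the quote.
errors = [
    (HallucinationType.OMISSION, "unaware of a 2022 film release"),
    (HallucinationType.FABRICATION, "cites a paper that does not exist"),
    (HallucinationType.FABRICATION, "summary of a non-existent web page"),
    (HallucinationType.SUMMARISATION, "misstates the text being summarised"),
]
print(tally(errors))
```

Keeping the categories distinct means, for instance, that omission errors can be tracked against the model's training cutoff while summarisation errors are checked against the source document, rather than lumping everything under one "hallucination" metric.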
       
 (DIR) Post #AWqVvZkVV6ipP2my3M by alexthurow@mstdn.social
       2023-06-19T09:44:34Z
       
       0 likes, 0 repeats
       
        „Every built-in mitigation strategy that has shipped to date treats all hallucinations as straightforward factual errors that can be fixed either through some form of automated checking fact by fact, itself a very unreliable process, or by instructing the AI, error by error, what is true and what isn't. The latter strategy is manifestly, obviously, impossible because the long tail of falsehoods is infinite and truth is scarce. You will never run out of falsehoods to correct or…“ 1/
       
 (DIR) Post #AWqVvbR5EINCdNiU3U by alexthurow@mstdn.social
       2023-06-19T09:45:06Z
       
       0 likes, 0 repeats
       
        „… lies to debunk. Unfortunately, this is also the exact strategy that OpenAI adopted for GPT-4. They put together a list of common falsehoods and specifically trained the model not to fabricate those specific falsehoods, such as claiming that Elvis was the son of an actor. Unsurprisingly, this did not fix the problem. The fundamental problem with fabrication errors is that they are indistinguishable from the regular text output of a Generative AI. It's not…“ 2/
       
 (DIR) Post #AWqVvd80wAJ9sooHbs by alexthurow@mstdn.social
       2023-06-19T09:45:17Z
       
       0 likes, 0 repeats
       
       „… that the system sometimes makes mistakes. The system is always fabricating answers. Every answer is a fabrication. Sometimes it just gets lucky and the most probable collection of words in response to that query, at that time, happened to be factual. The software does not have any self-awareness or understanding of the meaning that underlies the language. Everything it generates is a fabrication. It's all hallucinations, all the way down, even when it happens to be right.“ /3
       
 (DIR) Post #AWqVvf23rGjbmw2Ybo by alexthurow@mstdn.social
       2023-06-19T09:50:21Z
       
       0 likes, 0 repeats
       
       „Using a text synthesis system as a knowledge base or as a front end to one - such as search - is inherently risky and should be avoided. It's unlikely that the errors can be suppressed because they seem to be fundamental to how the system works - the larger and more capable the language model becomes, the less truthful it is in its answers, and early tests with ChatGPT seem to confirm that this is still the case. The very thing that makes them generate more fluent and…“ 1/
       
 (DIR) Post #AWqVvgtGwutPYFwZBg by alexthurow@mstdn.social
       2023-06-19T09:50:30Z
       
       0 likes, 0 repeats
       
       „… more convincing text, seems to make them less truthful.“ /2
       
 (DIR) Post #AWqVvijm5CU3HNW0f2 by alexthurow@mstdn.social
       2023-06-19T09:54:48Z
       
       0 likes, 0 repeats
       
       „Many organisations have employees from a mix of different cultures. Some come from what's called 'high context' cultures where you are expected to leave many things unsaid. Others, like those from the USA, come from 'low context' cultures where if it isn't spelled out it doesn't matter. AI summarisation might work for 'low context' cultures, won't work for 'high context' cultures, and is likely to reliably create misunderstandings in a mixed culture.“
       
 (DIR) Post #AWqVvkZvEnn6zOvAa8 by alexthurow@mstdn.social
       2023-06-19T09:58:01Z
       
       0 likes, 0 repeats
       
       „People vary in their expertise. If you're discussing a new feature with ten people in an email thread, where everybody agrees, but most are commenting on a superficial decorative aspect of the feature ("bikeshedding"), then the debate on the colour of the proverbial bikeshed is understandably going to dominate the summary. But if the lead developer responsible for software security, in a single message out of dozens, expresses their doubt about the viability of…“ 1/
       
 (DIR) Post #AWqVvmSCGUneo1K1om by alexthurow@mstdn.social
       2023-06-19T09:58:28Z
       
       0 likes, 0 repeats
       
        „… the feature, then that would easily be the single most important point made in the thread. Any summarisation tool that downplays or even omits the lead developer's comment from the summary is inviting disaster. But how would the AI understand the relevance of the different levels of expertise in the thread? It can't, because it doesn't do "understanding" in any way, shape, or form. It's software; not a mind. Additionally, most language models are trained on biased data, so they are just…“ 2/
       
 (DIR) Post #AWqVvo8lzgS22MFXou by alexthurow@mstdn.social
       2023-06-19T09:58:41Z
       
       0 likes, 0 repeats
       
       „… as likely to downgrade the lead developer's comment if their name isn't western or masculine enough.“ /3
       
 (DIR) Post #AWqVvprTaxntNIAl8a by alexthurow@mstdn.social
       2023-06-19T10:00:33Z
       
       0 likes, 0 repeats
       
        „In a hierarchical organisation, people vary in their importance. Sometimes the importance comes from an explicit hierarchy. Sometimes an employee is important because they're friends with somebody important. The human mind is highly sensitive to social relationships and when we read through discussions, we account for them in our judgement without giving it any special thought. An AI is not a mind. It can't have any judgement. It won't take a…“ 1/
       
 (DIR) Post #AWqVvrcJ4KrEop5flo by alexthurow@mstdn.social
       2023-06-19T10:00:43Z
       
       0 likes, 0 repeats
       
       „… hierarchy that isn't in its training data or context - that might not even be accurately recorded anywhere - into account.“ /2
       
 (DIR) Post #AWqVvtS6FFsiVkKY8e by alexthurow@mstdn.social
       2023-06-19T10:01:35Z
       
       0 likes, 0 repeats
       
       „Using a language model to summarise workplace discussions risks omitting vital information or fabricating misinformation, both of which can cause a project to fail.“
       
 (DIR) Post #AWqVvvARrqwzpa5Tu4 by alexthurow@mstdn.social
       2023-06-19T10:04:52Z
       
       0 likes, 0 repeats
       
        „The results from a search engine are even more varied than office discussions. The documents in a search engine result come from a variety of institutions. Some of them are formal documents; some of them are off-the-cuff social media posts; some of them are blatant marketing; some are actual advertising. They come from a broad range of expertise, writing styles, and experiences. That you could put a language model AI in front of a search engine and get a meaningful summary…“ 1/
       
 (DIR) Post #AWqVvwqfcMJn2oqiLw by alexthurow@mstdn.social
       2023-06-19T10:05:08Z
       
       0 likes, 0 repeats
       
        „… of the results is unrealistic for anything except the simplest of queries. You couldn't even summarise it reliably as a fully "generally intelligent" human being - you'd just pick the results that seem more plausible than others - but this mechanism is somehow supposed to magically make a large language model truthful? Relying on AI summarisations in real-world scenarios is risky and likely to be counter-productive in most workplaces.“ /2
       
 (DIR) Post #AWqVvygolxcqkqFsH2 by alexthurow@mstdn.social
       2023-06-19T10:07:55Z
       
       0 likes, 0 repeats
       
       „If you can't use a language model AI for research or as a front-end to a knowledge base or search engine, what is it good for? These tools are amazing for anything that is pure language work. Any time you need to work with the structure, tone, or style of a text, these systems will almost give you superpowers. The results will probably need to be edited - considerable editing if you want something that isn't mediocre - but language models can take care of a lot of the busywork.“
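"Pure language work" of the kind described above is largely a matter of prompt construction around the user's own text. A minimal sketch; `rewrite_tone` and the `complete` callback are hypothetical names standing in for any text-completion API, not a specific vendor's interface:

```python
def rewrite_tone(text, target_tone, complete):
    """Build a tone-rewriting prompt and pass it to `complete`,
    a stand-in for any text-completion API. Per the quote, the
    output still needs human editing before use."""
    prompt = (
        f"Rewrite the following text in a {target_tone} tone. "
        "Preserve every factual detail; change only wording and style.\n\n"
        f"Text:\n{text}"
    )
    return complete(prompt)

# Stubbed completion function so the sketch runs without a model;
# a real call would go to a hosted or local language model.
echo = lambda p: p.upper()
out = rewrite_tone("hey, shipping slipped a week", "formal", echo)
print(out)
```

The design point matches the quote: the model manipulates tone and structure of text the user already supplied, so the factual content comes from the human, not from the model's unreliable recall.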
       
 (DIR) Post #AWqVw0dhWWJwnkoPh2 by alexthurow@mstdn.social
       2023-06-19T10:10:07Z
       
       0 likes, 0 repeats
       
        „• It can change the tone of a text you wrote. Need to take a casual social media post and make it sound formal? It can do that.
        • It can change the structure of text. Give it an unstructured rough detail of your career, and it should be able to turn it into a formally structured resume. Want to turn an email you wrote, one that has all the correct details of the project you're working on, into a structured description for a grant application? It might be…“ 1/