Posts by david_chisnall@infosec.exchange
(DIR) Post #B4pINE1BbctRMSC608 by david_chisnall@infosec.exchange
2026-03-31T16:09:41Z
0 likes, 0 repeats
Any recommendations for alternatives to GitHub for corporate use?

The current GitHub AI training thing means that GitHub cannot be used for anything confidential. There's no way of saying at an org level that data cannot be used to train AI, which means that anything in a GitHub private org used by more than one person is at risk.

I've mostly looked at alternatives for personal and F/OSS things. The requirements for corporate use also include decent ACLs, MFA, and so on.

#AICostsYouPayingCustomersYouMuppets
(DIR) Post #B4qX0xJIjmhvdtSyRM by david_chisnall@infosec.exchange
2026-04-01T09:56:17Z
0 likes, 0 repeats
@futurebird They also have a strong aversion to 'academic elites' and so fired a load of the experts who would have been able to tell them that the machine is talking nonsense.
(DIR) Post #B4ql6eVewvrgQerDkG by david_chisnall@infosec.exchange
2026-04-01T08:28:03Z
0 likes, 1 repeats
@cdarwin I never thought I would boost an article from the Saudi Royal Family’s propaganda arm but, aside from the Anthropic-ad tone, this is quite on point. A couple of bits of context:

This comes after the Saudis have pulled out of backing a load of ‘AI’ datacenter projects. Their money was important for making some of these possible. I wouldn’t be surprised if they’ve established a bunch of put positions and are now ready to cash in on the crash.

Machine learning, as deployed in these models (other techniques don’t always have this property), is all about generating mostly correct outputs from statistical correlations. There are some places where this is really good. Deep neural networks for computer vision outperform every other approach because we don’t really understand the problem (can you unambiguously describe what a bicycle or a chair looks like, in a way that would let someone write a program to recognise one in a picture where it’s shown at any angle, partially occluded?), but we have loads of examples, so we can just throw them at a correlation system and get good results. And this works because the ground truth changes rarely: if you train a model with a hundred million pictures of trees, you are unlikely to see many new trees that don’t look quite like several of the images in the training set.

War is very different because it is always an unusual set of conditions. No two wars are the same (there’s a line about the army always fighting the last war, because the things you learn in one conflict may not be applicable in the next). And you’re in an adversarial situation, where other people are intentionally trying to do things that you don’t expect, and which are therefore not present in any data used to build statistical models. This is why wargaming and military analysis usually depend on experts, not statistical models: there aren’t enough data points to build a good statistical model.
(DIR) Post #B4rMxaIj1Z0x1nuWR6 by david_chisnall@infosec.exchange
2026-04-01T14:12:39Z
1 likes, 0 repeats
@whitequark This is very close to where I parted ways with the FSF. There's always a tension between enabling people to create the desirable thing and enabling people to make the undesirable. Their view is that it should be very hard to make the undesirable thing, and slightly easier to make the desirable thing. My view is that you should make it so easy to make the desirable thing that people always have a choice and then, once the desirable thing exists, you can apply other pressures to get rid of the undesirable thing.

I don't think deskilling is the right framing for a lot of these things; it's about where you focus cognitive load. There's a line in the Stantec ZEBRA's manual (1956) that says the 150-instruction limit is not a real problem because no one could possibly write a working program that complex. Small children write programs more complex than that now. That's not a loss to the world: the fact that you don't have to think about certain things means you can think about other things, such as good algorithm and data structure design.

There was research 20ish years ago comparing C and Java programs, which found that the Java programs tended to be more efficient for the same amount of developer effort, because Java programmers would spend more time refining data structure and algorithmic choices and improving entire complexity classes, whereas C programmers spent the time tracking down annoying bug classes that are impossible in Java and doing micro-optimisations. Of course, under time pressure, Java developers would simply ship the first thing that worked and move on to new features rather than doing that optimisation. C programmers would take longer to get to the MVP level, and their poorly optimised code was often faster than poorly optimised Java.

I see LLMs as very different because they don't provide consistent abstractions. A programmer in a high-level language has a set of well-defined constraints on how their language is lowered to the target hardware and can reason about things, while allowing their run-time environment to make choices within those constraints. Vibe coding does not do this: it delegates thinking to a machine, which then generates code that is not working within a well-defined specification. This really is deskilling because it's not giving you a more abstract reasoning framework, it's removing your ability to reason.

Letting people accomplish more with less effort, in an environment where their requirements are finite, ends up shifting power to individuals, because it reduces the value of economies of scale.
(DIR) Post #B4rMxc3YUw4ITKpR4K by david_chisnall@infosec.exchange
2026-04-01T15:31:20Z
0 likes, 0 repeats
@giacomo @whitequark I think you're misunderstanding my point. The FSF decided to promote the creation of Free Software (a goal I agree with) by creating complex licenses. Developing software that reuses software under any license requires understanding that license. The FSF's licenses are sufficiently complex that I have had multiple conversations with lawyers (including some with the FSF's lawyers) where they have not been able to tell me whether a specific use case is permitted. This places a burden on anyone developing Free Software under FSF-approved licenses, because there are a bunch of use cases that the FSF would regard as ethical, but which their licenses do not clearly permit.

It places a larger burden on people doing things that the FSF disapproves of: they have to come up with exciting loopholes. Unfortunately, it turns out that this isn't that hard, and once you've found a loophole you can keep using it. The FSF responds with even more complex licenses.

EDIT: To be clear, the FSF and I have very similar goals. I just think that their strategy is completely counterproductive. Complex legal documents empower people who can afford expensive lawyers. We're increasingly seeing companies using AGPLv3 to control nominally-Free Software ecosystems.
(DIR) Post #B4t2ZTkO87k6z6zvJg by david_chisnall@infosec.exchange
2026-04-02T09:12:53Z
1 likes, 0 repeats
@0xabad1dea There are two nines in their uptime number!

No one said they had to be leading digits…
(DIR) Post #B4vSJnv0LuJs2H4O3s by david_chisnall@infosec.exchange
2026-04-03T18:57:13Z
0 likes, 0 repeats
@futurebird Around 25 years ago, there was a not-very-serious paper from MIT that pointed out that tinfoil hats are basically parabolic reflectors and so, rather than keeping out rays, they will focus them on the brain.

Possibly worth sharing with people who might buy this nonsense.
(DIR) Post #B4wfxYOaygs02R9fLU by david_chisnall@infosec.exchange
2026-04-04T09:04:41Z
0 likes, 0 repeats
@futurebird Destroying structures that cause pride in a country was long considered a good psychological warfare technique, a way to demoralise the enemy. Only, in the 20th century, some people with the relevant expertise got around to actually measuring the effect and found that it normally had the opposite impact, uniting the country against whoever was responsible. The USA got a good first-hand example of this in 2001.

There are two reasons why the current regime might have done this:

1. They have fired all of the experts and have the mindset of ‘big strong man blow stuff up, him strong!’ as their primary decision-making process.
2. They know regime change has failed and are actively trying to make the population support the new leader, because it makes it easier to blame the entire populace (war crimes that kill civilians are easier, politically, if the victims can be portrayed as actively supporting the people who are killing your people).

My money is on 1, though Bibi is a fan of 2, and so that may also factor in.
(DIR) Post #B4x2TFOvKiWjNxpI9I by david_chisnall@infosec.exchange
2026-04-04T08:45:10Z
1 likes, 1 repeats
The Anthropic code leak is showing that, contrary to claims made by AI sceptics, it is possible for humans to understand the output of LLM code generators on large and complex codebases. It has also shown limitations in conventional office chair design, and may require new health and safety rules to be instituted in workplaces that allow LLM use, due to the large number of reports of people falling out of their chairs laughing.
(DIR) Post #B55WNfRQ9egCIevfBw by david_chisnall@infosec.exchange
2026-04-08T07:21:05Z
1 likes, 1 repeats
@ZachWeinersmith The original Coverity paper found over 300 bugs, most of which had security implications. Static analysis has been great at finding exploitable vulnerabilities for a long time; this is just a new approach to doing static analysis.

The biggest problem is always the false-positive rate. If you run a tool and it finds a load of vulnerabilities, that’s great. Except that the same tool also finds a load of things that look like vulnerabilities, but aren’t. So now you have to triage them, and that takes effort. You also need to add annotations to silence the ones that aren’t real. With deterministic analysers, you can often provide some extra information (e.g. parameter attributes) that allows this information to be tracked across an analysis boundary; CBMC has a lot of these. But with a probabilistic tool, these may or may not work, so you’re left with just slapping on an annotation that says ‘ignore the warning here’. The bug I found a little while ago in some MISRA C code was of that form: their analyser had found it, someone had determined that it was not a bug, and they were wrong.

For a defender, if you spend too much time looking at and discounting false positives, you could have improved code quality more by spending the effort on something else. I’ve only looked at a few of the bugs Claude reported, but one was a missing bounds check that wasn’t actually a vulnerability because the bounds were checked in the caller. Its fix made things slower, but not less exploitable. A good static analyser would have had a way of annotating the function parameter to say ‘this is always at least n bytes’ and would then have checked that callers did this check (see the sketch after this post). Claude has nothing like this because it doesn’t actually have a model of how code executes; it just has a set of probabilities for what exploitable code looks like. Unfortunately (and this is one of the problems with C), correct and vulnerable code can look exactly the same with different call stacks.

The second problem is the asymmetry. To be secure, you need to investigate and fix all of the vulnerabilities that tools can find; an attacker needs only one vulnerability. The ROI for attackers is much higher. Imagine a tool with a 90% false-positive rate that finds 1,000 vulnerability-shaped objects. An attacker who triages 6-7 of them has around a 50% chance of finding an attack that they can use. A defender who does the same amount of work has a 50% chance of reducing the number of vulnerabilities discoverable by attackers using this or similar tools by 1%.

This is why I build things that deterministically prevent classes of vulnerabilities from being exploitable.
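As a sketch of what that kind of annotation can look like, here is GCC’s real ‘access’ attribute (the function itself is a hypothetical example, not code from any project mentioned above):

    /* Parameter 1 is read by the function, and parameter 2 gives the
     * number of elements it must provide.  An analyser that understands
     * this can verify the bounds check at each call site instead of
     * flagging the callee for a 'missing' check. */
    #include <stddef.h>

    __attribute__((access(read_only, 1, 2)))
    int parse_header(const unsigned char *buf, size_t len);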
(DIR) Post #B58PIhz79dLEXGFzqC by david_chisnall@infosec.exchange
2026-04-09T18:00:20Z
1 likes, 0 repeats
@EUCommission Imagine what you could have done if you had ignored a hype wave pushed by grifters and scammers and instead invested that money in things that will actually improve the economy and social wellbeing.
(DIR) Post #B59DvK8ek7K7a1KK8G by david_chisnall@infosec.exchange
2026-04-10T10:20:40Z
0 likes, 0 repeats
@aral @Mer__edith Note that simply turning on notifications is not sufficient for this exploit route to work; you must also allow notifications to be shown on the lock screen.

If you do this, then anyone with physical access to your device will see messages as they arrive, so your threat model must exclude people who can see your screen. And if your threat model excludes people who can see your screen, it should probably also exclude people who can connect to the OS and extract system state from the device.
(DIR) Post #B5BgZhItDlSA2c4baq by david_chisnall@infosec.exchange
2026-04-10T07:24:02Z
2 likes, 2 repeats
@EUCommission I don’t know if this account is actually monitored, or just a publishing place, but you may have noticed that this post has received almost overwhelmingly negative responses.

You could disregard this as Mastodon bias, but keep in mind that the biggest bias on Mastodon is that people who understand, and built, core parts of the information technology that you use every day are massively over-represented. This is probably the only place where you will get a lot of replies from people who both understand technology and do not have a financial incentive to hype things in order to get large amounts of government funding.

EDIT: I should add, I used machine learning during my PhD and there are a lot of problems for which it is a really good fit. But, in the current climate, it’s generally safe to interpret ‘AI’ as meaning ‘machine learning applied to a problem where machine learning is the wrong solution’. It isn’t a technology, it’s a branding term, and it’s a branding term used almost exclusively for things that have no social benefit.
(DIR) Post #B5BlrT6TFuUoz0YddY by david_chisnall@infosec.exchange
2026-04-11T11:42:34Z
0 likes, 1 repeats
It sounds as if electric trucks are great for long-range land transport. But they require heavy batteries, so rather than putting them on the road (where they'll damage the road surface), why don't we build special metal tracks for them to go on? And, on long trips, join a bunch of them together so that you only need one motor and driver for a load of them travelling in a convoy? I bet you could make freight transport a lot more efficient if you did that.
(DIR) Post #B5CM0d2JI80ykHHLWa by david_chisnall@infosec.exchange
2026-04-11T21:29:41Z
0 likes, 0 repeats
Yay, the CI system that we use is shutting down because OpenAI kills everything they touch. I guess we have a migration ahead of us.
(DIR) Post #B5HubHD5ltX801lh0y by david_chisnall@infosec.exchange
2026-04-14T11:15:14Z
0 likes, 0 repeats
Walking around Cambridge today, I saw a lot of Vote Green signs. More than for any other party (though the one person with three Vote Labour signs is trying to even the balance). Last election I only saw one or two.

The Liberal Democrats (with their incredibly arrogant 'Winning Here!' signs) seem to have vanished. I only saw one of their signs, and I think it's been up for two elections.

EDIT: And the flyer we just got from Labour is telling us to vote for their candidate to prevent a Liberal Democrat getting in. If the flags are in any way representative, there isn't much chance of that happening.

#GPEW #UKPol
(DIR) Post #B5LOEikOnq1OqLO6Hw by david_chisnall@infosec.exchange
2026-04-15T16:14:16Z
0 likes, 1 repeats
@davidgerard I found, with another vendor, that the way you get good support is not to talk to their customer support; it's to send a draft article to their press contact and ask for comment.
(DIR) Post #B5NujPmiwNSvrI4hqS by david_chisnall@infosec.exchange
2026-04-17T10:04:41Z
1 likes, 1 repeats
A few notes about the massive hype surrounding Claude Mythos:

The old hype strategy of 'we made a thing and it's too dangerous to release' has been done since GPT-2. Anyone who still falls for it should not be trusted to have sensible opinions on any subject.

Even their public (cherry-picked to look impressive) numbers for the cost per vulnerability are high. The problem with static analysis of any kind is that the false-positive rates are high. Dynamic analysis can be sound but not complete; static analysis can be complete but not sound. That's the tradeoff. Coverity is free for open source projects and finds large numbers of things that might be bugs, including a lot that really are. Very few projects have the resources to triage all of these. If the money spent on Mythos had been invested in triaging the reports from existing tools, it would have done a lot more good for the ecosystem.

I recently received a 'comprehensive code audit' on one of my projects from an Anthropic user. Of the top ten bugs it reported, only one was important to fix (and it should have been caught in code review, but it was in 15-year-old code from back when I was the only contributor, so there was no code review). Of the rest, a small number were technically bugs but were almost impossible to trigger (even deliberately), half were false positives, and two were not bugs and came with proposed 'fixes' that would have introduced performance regressions on performance-critical paths. But all of them looked plausible. And, unless you understood very well the environment in which the code runs and the things for which it's optimised, I can well imagine you'd just deploy those 'fixes' and wonder why performance got worse. Possibly Mythos is orders of magnitude better, but I doubt it.

This mirrors what we've seen with the public Mythos disclosures. One, for example, complained about a missing bounds check, yet every caller of the function did the bounds check (the pattern is sketched below), so introducing it just cost performance and didn't fix a bug. And, once again, remember that this is from the cherry-picked list that Anthropic chose to make their tool look good.

I don't doubt that LLMs can find some bugs that other tools don't find, but that isn't new in the industry. Coverity, when it launched, found a lot of bugs nothing else found. When fuzzing became cheap and easy, it found a load of bugs. Valgrind and AddressSanitizer both caused spikes in bug discovery when they were first released and deployed.

The one thing where Mythos is better than existing static analysers is that it can (if you burn enough money) generate test cases that trigger the bug. This is possible, and cheaper, with guided fuzzing, but no one does it because burning even 10% of the money that Mythos would cost is too expensive for most projects.

The source code for Claude Code was leaked a couple of weeks ago. It is staggeringly bad. I have never seen such low-quality code in production before. It contained things I'd have failed a first-year undergrad for writing. And, apparently, most of it was written with Claude Code itself. But the most relevant part is that it contained three critical command-injection vulnerabilities. These are the kinds of things that static analysis should be catching. And, apparently, at least one of the following is true:

1. Mythos didn't catch them.
2. Mythos doesn't work well enough for Anthropic to bother using it on their own code.
3. Mythos did catch them, but the false-positive rate is so high that no one was able to find the important bugs in the flood of useless ones.

TL;DR: If you're willing to spend half as much money as Mythos costs to operate, you can probably do a lot better with existing tools.
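For illustration, a hypothetical reconstruction of that caller-checked-bounds pattern (invented names, not the actual disclosed code):

    /* The callee omits the bounds check, which looks like a bug in
     * isolation, but every caller checks before calling, so the
     * 'missing' check is a false positive and adding it would only
     * cost cycles on a hot path. */
    #include <stddef.h>
    #include <stdint.h>

    static uint32_t read_u32(const uint8_t *buf)
    {
        /* No bounds check here: callers guarantee at least 4 bytes. */
        return (uint32_t)buf[0] | ((uint32_t)buf[1] << 8) |
               ((uint32_t)buf[2] << 16) | ((uint32_t)buf[3] << 24);
    }

    int parse_message(const uint8_t *buf, size_t len)
    {
        if (len < 4) /* The bounds check lives in the caller. */
            return -1;
        return (int)(read_u32(buf) & 0x7fffffff);
    }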
(DIR) Post #B5Pr209WSChkMZGY2C by david_chisnall@infosec.exchange
2026-04-18T10:47:50Z
1 likes, 1 repeats
@mjg59 I’ve heard this argument before and I disagree with it. My goal for Free Software is to enable users, but that requires users to have agency. Users being able to modify code to do what they want? Great! Users being given a black box that will modify their code in a way that might do what they want but will fail in unpredictable ways, without giving them any mechanism to build a mental model of those failure modes? Terrible!

I am not a carpenter, but I have an electric screwdriver. It’s great. It lets me turn screws with much less effort than a manual one. There are a bunch of places where it doesn’t work, but that’s fine: I can understand those and use the harder-to-use tool in the places where it won’t work. I can build a mental model of when not to use it, why it doesn’t work, and how it will fail. I love building the software equivalent of this, things that let end users change code in ways I didn’t anticipate.

But LLM coding is not like this. It’s like a nail gun that has a 1% chance of firing backwards. 99% of the time, it’s much easier than using a hammer. 1% of the time, you lose an eye. And you have no way of knowing which it will be. The same prompt, given to the same model two days in a row, may give you a program that does what you want one time and a program that looks like it does what you want but silently corrupts your data the next time. That’s not empowering users, that’s removing agency from users. Tools that empower users are ones that make it easy for users to build a (nicely abstracted, ignoring details that are irrelevant to them) mental model of how the system works, and therefore the ability to change it in precise ways. Tools that remove agency from users take away their ability to reason about how systems work and how to effect precise change.

I have zero interest in enabling tools that remove agency from users.
(DIR) Post #B5PzIvnShXFZPTWysC by david_chisnall@infosec.exchange
2026-04-18T08:19:14Z
1 likes, 0 repeats
The thing I wish someone would build, which I suspect would find bugs a lot more cheaply than Claude Mythos: integrate a static analyser with a fuzzer.

Static analysers will find paths and variable values that, if they occur, reach unhappy states in the program. But they can’t tell you that it is possible for the preconditions to occur.

Fuzzers can explore the state space of code rapidly by throwing random values at it and then refining the input to try to explore specific places in the state space.

I would love to see someone wire up clang’s analyser with libFuzzer, for example, so that you can throw the analyser at a big project and have it spit out the hooks for guided fuzzing, then try to generate the inputs that will trigger the possible bug. Bonus points if it then tries to minimise the test case (some existing fuzzers do this).

This would then give you a fully automated way of triaging static-analysis reports, by providing something you can use as a test case for the ones that are easy to trigger.
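For the fuzzing half, the generated hooks could be ordinary libFuzzer harnesses, along the lines of this minimal sketch (parse_record is a hypothetical stand-in for whatever function the analyser flagged):

    /* libFuzzer repeatedly calls this entry point with mutated,
     * coverage-guided inputs.  The idea is that a tool could emit one
     * of these automatically per analyser report; parse_record is a
     * hypothetical target. */
    #include <stddef.h>
    #include <stdint.h>

    int parse_record(const uint8_t *data, size_t size); /* flagged by the analyser */

    int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
    {
        parse_record(data, size); /* A sanitizer trap or crash confirms the report. */
        return 0;
    }

    /* Build sketch (assuming clang with the fuzzer and ASan runtimes):
     *   clang -g -fsanitize=fuzzer,address harness.c parse_record.c */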