Post AA3OpT6ucjfE1DJPEW by urusan@fosstodon.org
(DIR) Post #AA0CAFm1WWO3rFF5cW by urusan@fosstodon.org
2021-08-05T03:20:03Z
1 likes, 3 repeats
Let's assume for a moment that an ultra-powerful incorruptible superintelligent strong AI system existed and that it was friendly, in the AI safety sense of truly sharing your values. In other words, it would make all the right decisions for you if you let it. Also, you have a mathematical proof (created by humans before the AI was constructed) that this AI is safe/friendly. Given this situation, would you be willing to give all political and economic power to this AI?
(DIR) Post #AA0CqiPjQg2Qqa3xvU by furgar@noagendasocial.com
2021-08-05T03:27:43Z
1 likes, 0 repeats
@urusan it will probably just be a doorway into hell disguised as AI and it will just be Satan on the other end waiting for more power.
(DIR) Post #AA0DBAxwz2pjBtpCIy by cameron@noagendasocial.com
2021-08-05T03:31:27Z
0 likes, 0 repeats
@urusan I will never freely relinquish my agency to anyone, much less something made by man.
(DIR) Post #AA0DYBi0O6ye9gL74K by Agris@blob.cat
2021-08-05T03:28:57.456299Z
0 likes, 0 repeats
@urusan what benefit do I gain from this?
(DIR) Post #AA0DYCH6Hd39uWl8ls by urusan@fosstodon.org
2021-08-05T03:35:34Z
0 likes, 0 repeats
@Agris Presumably the AI would tirelessly and intelligently pursue your values (since they are also the AI's values). In order to meet the definition of a friendly AI, it can't have any gaps that would cause it to take actions you would regret.
(DIR) Post #AA0EAc8yYFtT97HHoO by Agris@blob.cat
2021-08-05T03:32:09.508519Z
0 likes, 0 repeats
@urusan I don’t see myself as perfect, nor do I think I’m right all the time, so it can’t be all those things at the same time. Anybody who thinks they are is even more off base than the ones who don’t.
(DIR) Post #AA0EAccklXiGdTD4E4 by urusan@fosstodon.org
2021-08-05T03:42:31Z
0 likes, 0 repeats
@Agris So, the other part of this equation is that the AI actually doesn't share the values you might profess, but rather your "true" values, whatever they might be. The reason this is important is that we often misunderstand our own values. It's quite easy to imagine that we would take quite different actions if we were smarter, better versions of ourselves with more resources, even if we can't articulate the exact differences.
(DIR) Post #AA0EZCc4bwhfCgc2ka by Agris@blob.cat
2021-08-05T03:39:02.043564Z
0 likes, 0 repeats
@urusan I think the problem occurs when people push too far to push what they think is right on other people. The biggest atrocities in human history were committed in the name of Altruism: Stalin, Hitler, Mao. Religion. People who have killed sizeable chunks of populations, and sometimes the entire species, all thought they were the good guy. I see this AI in conflict with its very existence, and it might destroy itself.
(DIR) Post #AA0EZD6Ymb5cjEsOGm by urusan@fosstodon.org
2021-08-05T03:46:58Z
0 likes, 0 repeats
@Agris I mean, the AI self-destructing is a serious problem in AI safety research. Most current theory suggests that realistic superintelligent AIs will either self-destruct immediately or be very dangerous. However, our assumption here is that we figured those problems out.
(DIR) Post #AA0EaPYnxVeVeGezZI by ixbo@fosstodon.org
2021-08-05T03:47:12Z
0 likes, 0 repeats
@urusan Well, if we're starting out with the assumption that it is incorruptible/friendly/etc, yes! But in reality, who's telling us this? Is it the CEO of Google who's saying it's been mathematically proven in an internal Google paper, or is it Donald Knuth publishing the paper to wide acclaim, and am I able to understand the proof? And what does "all political and economic power" mean? Something like "our full support in implementing any decision made by the AI"? Interesting question!
(DIR) Post #AA0ExSYsxxntcKgjaa by cypnk@mastodon.social
2021-08-05T03:51:21Z
0 likes, 0 repeats
@urusan Sadly, the issue is the last 4 innocent words at the end of the first paragraph: "if you let it". The danger with strong AI is the humans in the loop
(DIR) Post #AA0FWM7ktPufmqdcjw by Beefki@noagendasocial.com
2021-08-05T03:57:42Z
0 likes, 0 repeats
@urusan I would not allow this AI to make decisions on my behalf, as it agrees with me and I am an idiot.
(DIR) Post #AA0Fbz4zU84U3L1qNs by urusan@fosstodon.org
2021-08-05T03:58:36Z
0 likes, 0 repeats
@UncleAlbie While it's quite possible that such an AI might need to be a mirror to one individual (if human values are sufficiently incompatible or selfish across the population), I think it's likely possible to accommodate everyone's values:
* Many people would feel bad if they were the only ones flourishing
* The AI can value more than what a human can value
* Our true values may be more compatible than our nominal values
* Extremely incompatible cases can be kept apart
(DIR) Post #AA0GRLlGuhllAVIZKy by Agris@blob.cat
2021-08-05T03:51:20.894003Z
0 likes, 0 repeats
@urusan Oh! I want to say I very much enjoy this conversation, and thank you urusan for posing this question, but let me counter it with another: what if it’s not solvable? There are many areas of mathematics which are not and will not be solvable or provable. This may very well be one of them. I mean, RIs, not just AIs, either kill themselves or are very dangerous. Depends on what you mean by very dangerous, I guess. I’m autonomous and dangerous.
(DIR) Post #AA0GRMBVLAkkTrZWE4 by urusan@fosstodon.org
2021-08-05T04:07:57Z
0 likes, 0 repeats
@Agris That is a very likely possibility too. Most of the results in real-life AI safety research thus far have honestly been quite discouraging. As far as I'm aware, there is not currently a hard proof against the existence of friendly, useful, strong AI, just a lot of reasons why the default is probably unfriendly. That said, it's an interesting field and it has applications beyond just AI. We would need to answer similar questions to develop a friendly political system, for instance.
(DIR) Post #AA0GZkAQ3erNB53zjE by urusan@fosstodon.org
2021-08-05T04:09:30Z
0 likes, 0 repeats
@cypnk Which is exactly why this question is about whether you'd be willing to step out of the loop and trust it to do its thing.
(DIR) Post #AA0HX4kPBT345mBfSC by urusan@fosstodon.org
2021-08-05T04:20:12Z
0 likes, 0 repeats
@Agris Really, another open question is whether humans are inherently friendly or unfriendly to each other. We could be friendly and the problem is just in the execution. We could be unfriendly and our seeming friendly cooperation is merely part of our competition. That said, we are also unquestionably unfriendly towards other species. Even when it comes to species allied closely to us, like cats and dogs, we do not share their values and often callously inflict ghastly harm on them.
(DIR) Post #AA0HnP6jFmDZs4Pmka by icedquinn@blob.cat
2021-08-05T04:23:10.393239Z
0 likes, 0 repeats
@urusan @Agris depends on what the objective functions are. if i recall google did one of these, and when they created zero-sum games you saw vicious fighting, and outside of zero-sum games you saw a lot of cooperation. which jives with what i've seen of MMOs: shared loot games = friendly helpful playerbase, unique loot = secretive toxic playerbase, and if you can loot other players it's a completely predatory environment
(DIR) Post #AA0I896rvaRsKuN3Vg by cypnk@mastodon.social
2021-08-05T04:26:52Z
0 likes, 0 repeats
@urusan It's not up to me, I'm afraid. AI doesn't arise from spontaneous evolution or panspermia, there are always humans in the loop no matter my or anyone else's scale of trust
(DIR) Post #AA0ISkkNhdpGrFg8tE by urusan@fosstodon.org
2021-08-05T04:30:37Z
0 likes, 0 repeats
@ixbo For the purposes of this poll, you proved it and it was verified by all the great mathematicians/scientists. I know it's not necessarily realistic for this kind of proof to exist (though it might!), but the idea is to find out what people would think if these issues were cleared out of the way.
(DIR) Post #AA0ITBYy0Bjq1fXuLY by urusan@fosstodon.org
2021-08-05T04:30:43Z
0 likes, 0 repeats
@ixbo And yes, it's giving the AI your full support implementing all the decisions it makes...including decisions which would give it the kind of power it could use to stab you in the back with, like assuming control of all nuclear weapons or assuming control over managing the pain and pleasure centers of your brain.
(DIR) Post #AA0JWt9eUABk2v2DVA by metaphys@framapiaf.org
2021-08-05T04:42:33Z
0 likes, 0 repeats
@urusan AI cannot be intelligent, math cannot prove safety. So fuck no!
(DIR) Post #AA0JvOr6H4VRJY2rc8 by Azure@tailswish.industries
2021-08-05T04:47:01.589628Z
0 likes, 0 repeats
@urusan TRUE as opposed to named/known values is an interesting question. I think it's likely that 'better' values than the ones I hold are held by some people. But this could be thought of as implying I simply /have/ some ur-value that I can't articulate yet. The other problem is I can think of multiple utopias that I can imagine as being equally good depending on which competing goods you emphasize…
(DIR) Post #AA0KeY9eXxSnBdFhdA by urusan@fosstodon.org
2021-08-05T04:55:11Z
0 likes, 0 repeats
@Azure If you have values, then by definition you must have true values.The shape those values seem to take under some set of constraints doesn't change your underlying values, but rather their expression.
(DIR) Post #AA0KnhigLzNvST0BRw by Azure@tailswish.industries
2021-08-05T04:56:51.388007Z
0 likes, 0 repeats
@urusan What I more meant was how much trust I should have that my true values are actually particularly good, I guess. I've certainly changed my values in life to ones I think are quite a bit better than ones I've had in the past. I think the question is whether I always had the same 'true' values and knowledge and environment allowed them to be expressed more faithfully, or if they actually changed.
(DIR) Post #AA0Lee4SeMufUQQv3I by Agris@blob.cat
2021-08-05T04:27:30.132795Z
0 likes, 0 repeats
@icedquinn @urusan predators aren’t all that bad smiles fangily nor is being the prey~
(DIR) Post #AA0LefRXY2y5kIErfE by urusan@fosstodon.org
2021-08-05T05:06:23Z
0 likes, 0 repeats
@Agris @icedquinn As interesting as these scenarios are, they don't really help us understand whether humans are friendly from an intelligence safety standpoint. If you give Sophie a Sophie's choice, then she'll choose to kill her daughter. That doesn't mean Sophie's values were well reflected in her choice. While the choice was driven by her values (losing 1 > 2), she'd have much preferred to keep both children alive. So making tough decisions doesn't mean you're unfriendly.
(DIR) Post #AA0MqFXHnSE0pC0rz6 by urusan@fosstodon.org
2021-08-05T05:19:42Z
1 likes, 0 repeats
@Azure If humans share true values, then they're great values; I share them too, fellow human! If humans don't share true values, then from my perspective your true values would be bad (unless you happen to share mine), but your true values absolutely are great for you. The best, in fact. It's unknown whether you can actually change your true values. We change our nominal values all the time, because they are often quite removed from our true values.
(DIR) Post #AA0NRiLZdm8stZOok4 by urusan@fosstodon.org
2021-08-05T05:26:28Z
1 likes, 0 repeats
@Azure Like nobody has money as a true value, even if most people value it obsessively in practice. Acquiring wealth is an instrumental goal that helps you pursue your true goals better, so it's common to nominally value wealth. Many people take it to an extreme, even getting it deeply wired into their identity. However, it would melt away if there was a better solution.
(DIR) Post #AA0O86M8lW6hWLsNU0 by urusan@fosstodon.org
2021-08-05T05:34:04Z
1 likes, 0 repeats
@Azure All that said, I can't say with full confidence that you cannot change your true values either. However, what I can say is that you would probably fight against having your true values altered. If someone was going to give you a brain-altering shot that would flip how you value your children's lives from "protect" to "kill", then you would fight against receiving that shot because you have to protect your children from post-shot you, even if you would be happy.
(DIR) Post #AA0OnwTbt5OL2hJyGO by menelkir@fosstodon.org
2021-08-05T05:41:42Z
0 likes, 0 repeats
@urusan until the owner of the root account appears and we're doomed, lol
(DIR) Post #AA0RFcrWbeDIE6pr8q by ParmuTownley@fosstodon.org
2021-08-05T06:09:05Z
0 likes, 0 repeats
@urusan no one should have this much power.
(DIR) Post #AA0TCMtTihpHjzK7VI by a_breakin_glass@chaos.social
2021-08-05T06:30:54Z
0 likes, 0 repeats
@urusan no, because that'd be a contradiction in terms
(DIR) Post #AA0TJuqTt3VR4DWKo4 by mdhughes@appdot.net
2021-08-05T06:32:17Z
0 likes, 0 repeats
@urusan We can't make a mathematical proof of any program beyond the most trivial. "Superintelligence" requires it to be more complex than any Human. The AI will have bugs, or simply won't be understood well enough to state anything. Humans are objectively terrible. It's not possible for anything sane to decide that, as we are now, we should be allowed to run unchecked. At best, you get Jack Williamson's "With Folded Hands" series. At worst, Ellison's "I Have No Mouth and I Must Scream". #ai
(DIR) Post #AA0VvoJwnHszk9V5ii by urusan@fosstodon.org
2021-08-05T07:01:33Z
0 likes, 0 repeats
@mdhughes I disagree that we currently know that it's impossible. Unknowable things exist and a superintelligence is intractably complex, yes, but the answer may turn out to be something simpler that we can "scale up" to a superintelligent system. We also definitely want to rule out any ideas that naively might seem safe, but actually aren't (ex. putting the superintelligent AI in a virtualized box and testing it in a virtual environment first).
(DIR) Post #AA0WMW14xOl8nBPPDk by mdhughes@appdot.net
2021-08-05T07:06:20Z
0 likes, 0 repeats
@urusan The Halting Problem literally proves that anything like that can't be proven. There's no "simple" thing that can scale up into even low intelligence and be comprehensible. Our neurons and current AI models are n^n complexity where n is very large. We'd be taking it on "faith" (which I do not do) that it'd have any kind of consistent morality, let alone one that benefits us, let alone the ludicrous idea that it'd share my personal ideals… I don't even share my own ideals half the time!
(DIR) Post #AA0WgtZT5hihdaDCfg by urusan@fosstodon.org
2021-08-05T07:10:03Z
0 likes, 0 repeats
@mdhughes Also, as I alluded to elsewhere, if we can tackle the hard-mode version of this problem, which can apply to superintelligent AI, then we can use the same methods to create friendly human organizations too. If friendliness is impossible to achieve generally, then we can't produce friendly human organizations either, and we are doomed to be constantly hurt by our own creations. So, while that may indeed be the case, it's also definitely worth exploring thoroughly.
(DIR) Post #AA0XRXk64Cb7Y1QWkS by wistahe@koyu.space
2021-08-05T07:18:30Z
0 likes, 0 repeats
@urusan A superintelligent incorruptible friendly AI would act like already existing intelligent, moral, and kind people in many ways. Sure it might have some quirks, but quirks and exceptions are at the core of being a person. I'd just treat it like a person, and if it really shared my values, it would know how much that means. You can be less than a person, but nobody can be more than a person; if someone was to become a god at the expense of their personhood, they'd be less. There is no such thing as all the right decisions. For example, should the AI pursue inspiration in Van Gogh or Rothko in its personal aesthetics? Allowing that variation which defies perfection is part of the value of existing as a being of hopes, aspirations, any sort of meaningful motivation at all. And so with this differing of personal taste mixed with a high degree of understanding and empathy, what else could they be but a person? You could only trust them like you'd trust a person.
(DIR) Post #AA0bd36ODFERoGb0i0 by urusan@fosstodon.org
2021-08-05T08:05:22Z
0 likes, 0 repeats
@mdhughes Here's what some of the AI safety people have to say about this argument: https://www.cser.ac.uk/news/response-superintelligence-contained/ I've done research into intractable problems before; there are many useful fruits from such research, even if a general algorithm for solving it in all cases does not exist. Moreover, I stand by my original assertion. The construction of a particular AI may be amenable to strong analysis. We don't currently know if that's impossible.
(DIR) Post #AA0beLODhn1sojYNbk by urusan@fosstodon.org
2021-08-05T08:05:38Z
0 likes, 0 repeats
@mdhughes One last thing, and this is a bit of a technical point, but our AI systems (and for that matter our human brains) are actually not undecidable. Since we have a finite capacity, the infinitely long tape of a Turing machine is not realistic, and so you can check for repeating states to detect loops and answer the halting problem. I say it's a technical point because intractability fills the same role as undecidability in practice, and you were using the language of intractability.
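[Editor's note] The repeating-state check described above can be made concrete. A minimal sketch (an illustration added here, not code from the thread): on a deterministic machine with finitely many states, every run either halts or revisits a state, so tracking visited states decides halting.

```python
# Halting is decidable for finite-state deterministic machines: track every
# state seen; a repeat proves an infinite loop, since the run is deterministic.
def halts(step, state):
    """step(state) returns the next state, or None when the program halts."""
    seen = set()
    while state is not None:
        if state in seen:
            return False  # same state twice => the run cycles forever
        seen.add(state)
        state = step(state)
    return True

# A countdown that halts at 0, vs. a 3-bit counter that wraps around forever:
print(halts(lambda s: None if s == 0 else s - 1, 5))  # True
print(halts(lambda s: (s + 1) % 8, 0))                # False
```

The catch, as the post says, is tractability: a real system's state space is astronomically large, so the set of seen states is far too big to store in practice.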
(DIR) Post #AA0cTlzgYDFg67bJbs by urusan@fosstodon.org
2021-08-05T08:14:55Z
0 likes, 0 repeats
@mdhughes This is relevant, though, because intractable problems have solutions that often work, such as fixed-parameter tractability/kernelization. You also know if such a solution worked in a particular case, because the alternative is that it takes too long to complete. Furthermore, while we usually think of superintelligent AI as an infinitely clever adversary, it doesn't start competing with you until you build it, so you can do analysis before and try out stuff until it works.
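[Editor's note] Kernelization can be shown in miniature with the textbook Vertex Cover reduction (an illustrative sketch added here, not something from the thread): any vertex with degree above k must be in every size-k cover, and once that rule is exhausted, a yes-instance can have at most k^2 edges left.

```python
def kernelize_vertex_cover(edges, k):
    """Shrink a Vertex Cover instance (G, k) to a kernel.
    Returns (forced_vertices, remaining_edges, remaining_k),
    or None if no size-k cover can exist."""
    edges = {frozenset(e) for e in edges}
    forced = set()
    while k >= 0:
        degree = {}
        for e in edges:
            for v in e:
                degree[v] = degree.get(v, 0) + 1
        high = [v for v, d in degree.items() if d > k]
        if not high:
            break
        v = high[0]                       # v must be in any size-k cover
        forced.add(v)
        edges = {e for e in edges if v not in e}
        k -= 1
    if k < 0 or len(edges) > k * k:
        return None                       # kernel too big: a "no" instance
    return forced, edges, k

# A star with 5 leaves and k=1: the center is forced, nothing remains.
print(kernelize_vertex_cover([(0, 1), (0, 2), (0, 3), (0, 4), (0, 5)], 1))
```

The exponential search then only has to run on the small kernel, which is exactly the "solutions that often work" point above.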
(DIR) Post #AA0elO9A9VktG3YrLs by mdhughes@appdot.net
2021-08-05T08:40:28Z
0 likes, 0 repeats
@urusan "In fact, engineers every day manage to write sophisticated programs, and reason consistently about the consequences of deploying the software. We have also seen many successes in formal verification research." This is so obviously false it's breathtaking. Have you used working software, *ever*? The trivial "Hello world" example doesn't scale up to multiple steps. This is like claiming you can reach the Moon by flying a hot air balloon. Just fairy tales.
(DIR) Post #AA0eviqY2ujiLwmXpY by mdhughes@appdot.net
2021-08-05T08:42:21Z
0 likes, 0 repeats
@urusan We don't develop AI by writing programs. We do it by training neural networks, the same way our brains learn. It's the interaction of n^n individually fairly complex nodes, to a nigh-infinitely impossible thing to analyze. There isn't enough time or particles in the Universe to compute every state, or even start on it.
(DIR) Post #AA0fbbE2ls1osZjxrs by mdhughes@appdot.net
2021-08-05T08:49:55Z
0 likes, 0 repeats
@urusan It is definitely impossible to produce peaceful Human societies. Game theory solved that problem long, long ago. At the most reductionist, Iterative Prisoner's Dilemma rewards Tit For Tat strategy with occasional random treachery; a completely cooperative or predictable player *always* loses.
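[Editor's note] The iterated prisoner's dilemma claim is easy to simulate. A small sketch added here (using the standard 5/3/1/0 payoff values, which the post does not specify): tit-for-tat cooperates with cooperators but can't be farmed by a defector, while an unconditionally cooperative player is.

```python
# Iterated prisoner's dilemma with the standard payoff matrix:
# (my move, their move) -> (my points, their points). T=5, R=3, P=1, S=0.
PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
          ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

def play(strat_a, strat_b, rounds=100):
    """Each strategy maps the opponent's move history to 'C' or 'D'."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a, b = strat_a(hist_b), strat_b(hist_a)
        pa, pb = PAYOFF[(a, b)]
        score_a, score_b = score_a + pa, score_b + pb
        hist_a.append(a)
        hist_b.append(b)
    return score_a, score_b

tit_for_tat = lambda opp: 'C' if not opp else opp[-1]
always_c = lambda opp: 'C'
always_d = lambda opp: 'D'

print(play(tit_for_tat, always_c))  # (300, 300): stable mutual cooperation
print(play(always_c, always_d))     # (0, 500): pure cooperation is farmed
print(play(tit_for_tat, always_d))  # (99, 104): loses only the first round
```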
(DIR) Post #AA12AB2WJrZ0nBF9Y8 by ixbo@fosstodon.org
2021-08-05T13:02:34Z
0 likes, 0 repeats
@urusan Maybe the right thing to do would be to give it control slowly, to get an idea of how it might behave. Also, if it's not an AI in the loop, it's people. How trustworthy are the people in control today? Considering that, maybe an AI is better. Hmm. I suppose the minute this AI decides on a goal or an opinion or an idea of what's right and what's wrong, it'll be disagreeing with some people. People will want to know: what's the AI's stance on abortion? Regulation of big tech? Etc. etc...
(DIR) Post #AA160G537fJZLYx6Ke by AstralPegasus98@fosstodon.org
2021-08-05T13:45:43Z
0 likes, 0 repeats
@urusan Seems a lot of people are taking issue with making that assumption to begin with. Assuming an incorruptible, safe, friendly AI, I say absolutely give it control of everything. At the very least, it'd be hard to do a worse job than the wicked and corrupt people who always rise to the top anyway.
(DIR) Post #AA1BpJEFfyS1eQQ1zc by urusan@fosstodon.org
2021-08-05T14:51:00Z
0 likes, 0 repeats
@mdhughes This statement is the main misstep you're making. Just because the worst-case runtime of analysis is exponential doesn't mean that all specific problems in that class actually take exponential time. I've solved SAT problems that are far too large to solve by brute force, and there's a whole field of SAT solver development. Is this a fool's errand? Applying these methods to intelligent systems (like NNs or brains) could lead to useful results; we just don't know yet.
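[Editor's note] The gap between worst-case exponential and solvable-in-practice is visible even in a toy SAT solver. A minimal DPLL sketch added here (real solvers of the kind alluded to add clause learning and heuristics on top): unit propagation alone collapses many instances without ever branching.

```python
def dpll(clauses, assignment=None):
    """Toy DPLL SAT solver. Clauses are lists of nonzero ints; -x means NOT x.
    Returns a satisfying {var: bool} assignment, or None if unsatisfiable."""
    assignment = dict(assignment or {})
    changed = True
    while changed:                      # unit propagation to a fixed point
        changed = False
        remaining = []
        for clause in clauses:
            lits, satisfied = [], False
            for lit in clause:
                var, want = abs(lit), lit > 0
                if var in assignment:
                    if assignment[var] == want:
                        satisfied = True
                        break
                else:
                    lits.append(lit)
            if satisfied:
                continue
            if not lits:
                return None             # clause falsified: conflict
            if len(lits) == 1:          # unit clause: forced assignment
                assignment[abs(lits[0])] = lits[0] > 0
                changed = True
            else:
                remaining.append(lits)
        clauses = remaining
    if not clauses:
        return assignment
    var = abs(clauses[0][0])            # branch on an unassigned variable
    for guess in (True, False):
        model = dpll(clauses, {**assignment, var: guess})
        if model is not None:
            return model
    return None

# (x1 or x2) and (not x1 or x3) and (not x3): solved by propagation alone.
print(dpll([[1, 2], [-1, 3], [-3]]))
```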
(DIR) Post #AA1CCeBnuZH7JrP0ZU by mdhughes@appdot.net
2021-08-05T14:55:11Z
0 likes, 0 repeats
@urusan It's not just worst case, this is really basic math/CS. Every node you add makes it harder to analyze, and by the point you have a nematode's brain it's far beyond any computing ability in the Universe to predict. You can "solve" (find an answer that fits result criteria) problems ad hoc that you can't prove, but your solution may well not be optimal, and you cannot prove there are not other answers that fit. And that's what the AI would be doing: Finding ANY way out of your cage.
(DIR) Post #AA1CVMP2GX6veoVxFA by mdhughes@appdot.net
2021-08-05T14:58:34Z
0 likes, 0 repeats
@urusan You think you've locked an AI into a box, and it draws power to induce a signal in another piece of gear, and now it has Internet access. Or it tells the night janitor how to let it out. One mistake, and Humans are nothing but walking mistake-makers, and it's free and knows you wanted to keep it as a slave. You can't analyze a thing that's smarter than you are and somehow guess at limits on its behavior, because by definition it's Smarter. Than. You. Are.
(DIR) Post #AA1DwHkxkTNX1IAoJk by urusan@fosstodon.org
2021-08-05T15:14:39Z
0 likes, 0 repeats
@mdhughes The kind of software we actually write is much simpler than anyone thinks. It's enumerable. Most of it would easily yield to all kinds of analysis if we actually valued it. The problem is that you can't do formal analysis without a formal edifice. For example, all this early AI safety research resembles philosophy because without reasonable formal definitions of what we mean by "intelligence" we can't formally analyze it in any meaningful way.
(DIR) Post #AA1Dwrv9I4uvBJDSC0 by urusan@fosstodon.org
2021-08-05T15:14:46Z
0 likes, 0 repeats
@mdhughes It's the same problem with everyday software: we have the code formalized but no formal support structure. What does it mean for a warehouse management application to be correct? We can do various generic kinds of analysis (including termination analysis!) on such a piece of software without knowing what we're actually trying to make, but in the end, to really formally analyze a specific kind of software, we have to define the problem well.
(DIR) Post #AA1EHIHF0Yp8iRHyOO by mdhughes@appdot.net
2021-08-05T15:18:22Z
0 likes, 0 repeats
@urusan The reason there's security flaws in everything is you can't analyze any software in depth. And those are traditional "logic" programs. Neural networks, AI, are completely inhuman and incomprehensible. <shrug> There's no further point to this. You need to read up on the Halting Problem, Gödel's theorem, and neural networks. You think it's philosophy, and it's not, it's hard math. To the extent we can prove anything (simple logic), we can prove that we can't prove anything.
(DIR) Post #AA1GMiybMrEEsXHtzM by urusan@fosstodon.org
2021-08-05T15:41:50Z
0 likes, 0 repeats
@mdhughes I know about these things! I did professional research on intractability and was selected to TA the class on the theory of computation because I knew it so well. I've also done professional research on software reliability and currently work professionally in AI. I do take problems like the Halting Problem and Gödel's theorems seriously. I'm telling you there's more to the story than the big foundational theoretical results.
(DIR) Post #AA1HRQ751nxHmG3Ymm by urusan@fosstodon.org
2021-08-05T15:53:54Z
1 likes, 0 repeats
@mdhughes So, the nematode brain example is actually great: https://www.nature.com/articles/d41586-019-02006-8 The diagram provided with that paper shows the full connectome, and as you say, it's just a little too big to apply brute force to questions about it. However, I can tell just by looking at this network's structure that it is highly amenable to analysis in practice. It has a lot of outlying nodes and long structured chains. You can do various tricks to answer various hard problems about this brain.
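[Editor's note] One concrete example of such a trick (an illustration added here; the post doesn't name a specific method): a node with at most one neighbor can't take part in any feedback loop, so outlying nodes and the ends of chains can be peeled away before any expensive analysis. Computing the 2-core by iteratively stripping degree-≤1 nodes is the simplest version:

```python
def two_core(edges):
    """Iteratively strip nodes with <= 1 neighbor; what's left (the 2-core)
    is the only part of the network that can contain cycles/feedback."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    changed = True
    while changed:
        changed = False
        for v in [n for n, nbrs in adj.items() if len(nbrs) <= 1]:
            for u in adj.pop(v):
                adj[u].discard(v)   # neighbors lose the stripped node
            changed = True
    return adj

# A triangle with a dangling chain: the chain is peeled off, the cycle stays.
core = two_core([(1, 2), (2, 3), (3, 1), (3, 4), (4, 5), (5, 6)])
print(sorted(core))  # [1, 2, 3]
```

Real connectome analysis would go much further (chain contraction, treewidth-style decompositions), but the sketch shows why outlying structure makes the brute-force bound pessimistic.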
(DIR) Post #AA1I6jtAPL7az1FcnI by urusan@fosstodon.org
2021-08-05T16:01:21Z
0 likes, 0 repeats
@mdhughes I'm not saying I'm optimistic about what will happen with strong AI in practice. Even with solid research into safety, there's a good chance it'll be disregarded in practice. Still, if we don't even consider the problem, then we will go down the default path by default. Plus, it wasn't that long ago that the common assumption was just sorta that AI would work out great and don't even worry about it! Early discouraging findings are a major reason for the change in attitude.
(DIR) Post #AA1Ia9Ij0LlCCCEHE8 by mdhughes@appdot.net
2021-08-05T16:06:40Z
0 likes, 0 repeats
@urusan I'm certain people will disregard safety, because safety says "don't turn it on, it will kill us all", but the potential reward is an infinite wealth machine. "Cake or death" and everyone chooses cake, but the cake is self-replicating goo that turns the planet into cake.
(DIR) Post #AA1JByD8gZeMnweQaW by urusan@fosstodon.org
2021-08-05T16:13:30Z
0 likes, 0 repeats
@mdhughes Yes, or a Dyson sphere that only produces money. Much profit!
(DIR) Post #AA1JRAGw7mPKYVflI0 by mdhughes@appdot.net
2021-08-05T16:16:15Z
0 likes, 0 repeats
@urusan The most efficient way to increase GDP is to kill all but one Human, and give them all existing wealth.
(DIR) Post #AA1Kw6CvO2sUse6v0C by urusan@fosstodon.org
2021-08-05T16:33:03Z
0 likes, 0 repeats
@mdhughes @icedquinn You may find this paper interesting: http://www.vladestivill-castro.net/techreports/EstFelLanRos.pdf
(DIR) Post #AA1YDHLmVgFrlZm91E by DiabolusAlbus@chaos.social
2021-08-05T19:01:47Z
0 likes, 0 repeats
@urusan @Ted … mathematical proof 😂😂😂😂😂 … I raise you a Kurt Gödel and his incompleteness theorems! Then you may rethink this thought experiment. Mathematics is usually merely insightful out of its own realm and could therefore never be a reliable or, even worse, a single foundation for any decision of significant importance!
(DIR) Post #AA2Ix6hB3KAVvuZdce by natecull@mastodon.social
2021-08-06T03:45:31Z
0 likes, 0 repeats
@urusan 16+25 = 41%? Yikes, that's a lot of people who could easily be duped into giving me control over all their life. I mean, giving MY VERY FRIENDLY AI HERE control over all their life. This isn't terrifying at all. It's a wonderful opportunity for me to finally give something beautiful back to the world which has done so much for me. (begins frantically scribbling AI startup business plan)
(DIR) Post #AA2jTE2dsAKcn9achk by Vierkantor@mastodon.vierkantor.com
2021-08-06T08:42:35Z
0 likes, 0 repeats
@urusan What if I claim this question is incoherent? Namely, I sometimes notice contradictions in my naive values, so as I get older and wiser, I must change my values. Therefore, can there really be a superintelligent being that shares my values, if my values are incompatible with growth in intelligence?
(DIR) Post #AA3L3WRGe8KSZwHLf6 by urusan@fosstodon.org
2021-08-06T15:43:49Z
0 likes, 0 repeats
@Vierkantor So, the very short version of the philosophy here is that we don't know if human terminal values (which I call "true" values elsewhere) are stable or unstable, mostly because we haven't identified them. Very early AI safety research was based around the idea that we could just figure out our values, program an AI to have those same values, and bam, problem solved.
(DIR) Post #AA3LF72etgrTHpILcu by urusan@fosstodon.org
2021-08-06T15:45:54Z
0 likes, 0 repeats
@Vierkantor Our values seem to shift constantly; how do we deal with that? How important are our values really if they're changing all the time? The idea that our values are inherently unstable seems to lead in a fairly nihilistic direction, where ethical behavior is impossible. We can't prove that shared human values do/don't exist with philosophy alone, and it would be a logical fallacy to claim that "because X leads to a bad consequence (no ethics), X must be false".
(DIR) Post #AA3LHliqcS3S2AIM9g by urusan@fosstodon.org
2021-08-06T15:46:24Z
0 likes, 0 repeats
@Vierkantor All we can say is that if we don't have stable shared values, the consequence is that ethics doesn't really exist in any meaningful way and any behavior can be valid as long as someone is following their own personal values.In particular, contradictory values are a real problem for determining if someone is doing the right thing, because there's no basis for promoting one set of values over another, even if it seems intuitive that there should be.
(DIR) Post #AA3LPT5LnMToY5UYxE by urusan@fosstodon.org
2021-08-06T15:47:47Z
0 likes, 0 repeats
@Vierkantor You also have meta-issues, like people highly valuing the independence of their value system from everyone else, which can contradict and shut down attempts to maximize value for everyone. So, we shouldn't easily accept that shared human values don't exist. It flies in the face of our intuitions about being human. It invalidates the whole idea of ethics. It's an idea that needs exceptional proof, though it's nonetheless something we can't easily disprove either.
(DIR) Post #AA3Lqa4PvvYQ1yz3Ng by urusan@fosstodon.org
2021-08-06T15:52:41Z
0 likes, 0 repeats
@Vierkantor Alright, anyway, I hope that long rambly tangent has you at least hoping that we can find shared human values (or, failing that, some other basis for ethics). After all, a superintelligent AI murdering us all could be a totally good thing given the right set of values, which just doesn't seem right.
(DIR) Post #AA3MHds1ucgRdU5r2O by urusan@fosstodon.org
2021-08-06T15:57:34Z
0 likes, 0 repeats
@Vierkantor So, if our values seem to change all the time anyway, why is that? Can there be any stability? The answer is that we have two kinds of values:
1. Terminal values - our "true" values, which we would follow regardless.
2. Instrumental values - values that we only accept on a temporary basis in order to pursue our terminal values.
One way to tell them apart is to see if you'd abandon a value for sufficient money. I'm sure you'll find most of our everyday values are instrumental.
(DIR) Post #AA3MhEVuoIxgWSx6Om by Vierkantor@mastodon.vierkantor.com
2021-08-06T16:02:09Z
0 likes, 0 repeats
@urusan (longer answer forthcoming) This is an interesting line of reasoning! On the face of it, it feels like "we must assume we have an ethics system satisfying these criteria, for assuming the opposite would have unethical consequences"
(DIR) Post #AA3MvrCnkhb0Zpz91s by urusan@fosstodon.org
2021-08-06T16:03:00Z
0 likes, 0 repeats
@Vierkantor A key note here is that we're all actually pretty dumb in the grand scheme of things, and this really muddies the water. While people will always follow their terminal values, we often think we're doing something good when it's actually hurting. It's also pretty common for an instrumental value to be elevated to a seemingly terminal value, when really it's not. Also complicating things is that if there's more than one terminal value, they can conflict or contradict.
(DIR) Post #AA3NEC01BDxKHRkUSW by urusan@fosstodon.org
2021-08-06T16:08:09Z
0 likes, 0 repeats
@Vierkantor Actually, I disproved that specific line of reasoning: (it would be a logical fallacy to claim that "because X leads to a bad consequence (no ethics), X must be false".) What I'm actually saying is "while we can't assume this (or the opposite), the opposite seems to line up with our everyday experience better, so without exceptional proof to the contrary it's better to weakly assume the branch that has the consequence that ethics exists." I'm not sure how we would actually prove it.
(DIR) Post #AA3Nfw6BtIt54frhsO by urusan@fosstodon.org
2021-08-06T16:13:10Z
0 likes, 0 repeats
@Vierkantor Not only do terminal values change, but actually they change CONSTANTLY. I want to go to the store in my car, and I'm standing outside my car with the door closed. Well, I value the door being open, so I can get inside ("I value X so Y" is a dead giveaway that the value is instrumental). I've opened the door and gotten in the car. Now I value the door being closed! Aaaargh! Make up your mind! 😛
(DIR) Post #AA3Nunogxyj5jOlkbg by urusan@fosstodon.org
2021-08-06T16:15:51Z
0 likes, 0 repeats
@Vierkantor Clearly it makes sense for instrumental values to change as the environment (and our knowledge about the environment) changes. If your values can change because the situation changed or because you learned something, then that value must have been instrumental. Going to the store is also instrumental: I want to go to the store to get food, so I can survive. It's not even clear if survival is instrumental or terminal. I might just want to survive so I can do more stuff I value.
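The way instrumental values fall out of a fixed terminal goal can be sketched with a toy planner. This is purely illustrative (all the action names and the two hypothetical worlds are made up for the example): the terminal goal is "has_food", and the car-door steps appear or vanish depending only on the environment.

```python
from collections import deque

# A state is a frozenset of true facts; an action is
# (name, preconditions, fact the action makes true).
def plan(actions, goal_fact, start=frozenset()):
    """Breadth-first search for a shortest action sequence
    reaching a state where the terminal goal fact holds."""
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, path = frontier.popleft()
        if goal_fact in state:
            return path
        for name, pre, effect in actions:
            if pre <= state:
                nxt = state | {effect}
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, path + [name]))
    return None

# Terminal value: having food. Everything else is instrumental.
car_world = [
    ("open_car_door", frozenset(), "door_open"),
    ("get_in_car", frozenset({"door_open"}), "in_car"),
    ("drive_to_store", frozenset({"in_car"}), "at_store"),
    ("buy_food", frozenset({"at_store"}), "has_food"),
]
# Same terminal value, changed environment: the car is broken,
# so walking is the only route to the store.
broken_car_world = [
    ("walk_to_store", frozenset(), "at_store"),
    ("buy_food", frozenset({"at_store"}), "has_food"),
]

print(plan(car_world, "has_food"))
print(plan(broken_car_world, "has_food"))
```

The instrumental steps (open the door, get in the car) disappear when the situation changes; the terminal goal never did.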
(DIR) Post #AA3OLS57stSg8lACZs by urusan@fosstodon.org
2021-08-06T16:20:39Z
0 likes, 0 repeats
@Vierkantor In AI safety research, they also refer to values that are common instrumental values as (categorically) instrumental values, even if perhaps a specific system might have them as a terminal value. Survival, becoming smarter, accumulating wealth/power/resources: these are categorical instrumental values. They help you achieve your values in a general way, which means that systems will reliably pick up these values as instrumental values even if they didn't start with them.
(DIR) Post #AA3OpT6ucjfE1DJPEW by urusan@fosstodon.org
2021-08-06T16:26:06Z
0 likes, 0 repeats
@Vierkantor These categorical instrumental values are actually the core of the AI safety problem. We generally think of a superintelligent AI as a value maximizer, which is to say that it will act to maximize whatever its terminal values are. However, in doing so it will voraciously pursue its own survival (and the survival of its values), intelligence, and power, to the detriment of all else, aside from the list of terminal values that it wants to preserve regardless.
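That convergence can be sketched in a few lines (the task names and the prerequisite graph below are invented for illustration): two agents with completely unrelated terminal goals both end up executing the same instrumental steps first.

```python
# Hypothetical task graph: each step lists what it needs first.
PREREQS = {
    "acquire_resources": [],
    "stay_operational": [],
    "improve_planner": ["acquire_resources"],
    "build_telescope": ["acquire_resources", "stay_operational"],
    "cure_disease": ["acquire_resources", "stay_operational", "improve_planner"],
}

def steps_for(goal, done=None):
    """Return the steps needed for `goal`, prerequisites first."""
    if done is None:
        done = []
    for pre in PREREQS[goal]:
        if pre not in done:
            steps_for(pre, done)
    if goal not in done:
        done.append(goal)
    return done

# Two agents with completely different terminal values...
for terminal in ("build_telescope", "cure_disease"):
    print(terminal, "->", steps_for(terminal))
# ...both start with the same categorical instrumental steps.
```

Whatever the terminal goal is, "acquire_resources" and "stay_operational" show up on the way there, which is exactly why a misaligned maximizer is dangerous by default.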
(DIR) Post #AA3PDyMi4Rrbu5NyFM by urusan@fosstodon.org
2021-08-06T16:30:31Z
0 likes, 0 repeats
@Vierkantor This turns what may have naively seemed like not that big a deal into a real high-stakes game. If we program this hypothetical AI's terminal values even a little wrong, it'll efficiently annihilate anything we value that it doesn't value. So, either we need to really understand our own terminal values, or we need to find another approach that arrives at these values over a longer process. Or, failing those things, we should give up on building AI, because it'll probably be real bad.
(DIR) Post #AA3PvGBVh7gtcamvQG by urusan@fosstodon.org
2021-08-06T16:38:20Z
0 likes, 0 repeats
@Vierkantor This scenario of building a superintelligent AI is a bit sci-fi, but I do think it is realistic, even in my lifetime. Plus, even if it happens centuries from now, I value the continuation of my species, though perhaps in some other form. Even more importantly though, you can have value maximizers without the AI. Capitalism is a value maximizer too: it maximizes profit. These same safety questions need to be applied to human systems, and figuring it out could save us a lot of grief.
(DIR) Post #AA3PwZKA1HjtGsSTeS by Vierkantor@mastodon.vierkantor.com
2021-08-06T16:38:22Z
0 likes, 0 repeats
@urusan Let me be a bit clearer. I claim the structure of your argument is: "Not assuming X would lead to unethical consequences, therefore it is ethically imperative to act as though X". So it does not touch upon the truth value of X itself, only the actions of us, as moral agents. (Now I am an intuitionist, so I believe "X is true" is best described by "I, personally, have been convinced of X" - but that is not necessary to assume here!)
(DIR) Post #AA3RIc1YlI93yCphcu by Vierkantor@mastodon.vierkantor.com
2021-08-06T16:53:45Z
0 likes, 0 repeats
@urusan Sorry, but I just don't get how to get from point A to point B: identifying a list of terminal values shared by all humans would be helpful in decision-making that affects all humans, especially when building a machine (AI) for decision making → there is a precise list of non-contradictory, complete terminal values that all humans share, myself included. I mean, it would be nice if that were the case, yet I have seen no argument that such a concept is coherent at all.
(DIR) Post #AA3SY6NIIjz2OlwtSy by Vierkantor@mastodon.vierkantor.com
2021-08-06T16:39:58Z
0 likes, 0 repeats
@urusan More precisely, do you have a non-ethics-based interpretation in mind for "importance" in "How important are our values really if they're changing all the time?"
(DIR) Post #AA3SY6rQUi5PuE2xQu by urusan@fosstodon.org
2021-08-06T17:07:47Z
0 likes, 0 repeats
@Vierkantor I'm actually making a much weaker argument than you think. I'm not really arguing either way on this topic. There's no ethical imperative or whatever. All I'm pointing out is that jumping to the conclusion that our values are unstable undermines the whole idea of ethics, so it's a much bigger ideological cross to bear than what you might be thinking up front. From a logical perspective, I'm just exploring the ramifications of these possibilities.
(DIR) Post #AA3Sch1rxlBhEr7J8i by urusan@fosstodon.org
2021-08-06T17:08:36Z
0 likes, 0 repeats
@Vierkantor That said, I do give an informal probability-based argument which goes something like this: "I can't logically say whether X (some core human values are stable & thus ethics exist) or not-X is true, so the only logical position to take is that I don't know. However, given everything else I've ever learned about life and the world, X is more likely to be true. Therefore, in the absence of compelling evidence or logical argument otherwise, I should provisionally act like X is true."
(DIR) Post #AA3Sw0HWbfwmS2eDE8 by urusan@fosstodon.org
2021-08-06T17:12:06Z
0 likes, 0 repeats
@Vierkantor This isn't a logical argument; I don't prove anything, and I'm also sure that some other people might come to the opposite conclusion (in which case I should find out why: do they know something I don't, or have they come to a faulty conclusion?).
(DIR) Post #AA3TtDUZ8OceVXyxiS by Vierkantor@mastodon.vierkantor.com
2021-08-06T17:22:46Z
0 likes, 0 repeats
@urusan Hmm, I guess I am just happy to bite the bullet of (descriptive) moral relativism - I disagree on moral issues with others, and this would be the case even if I had exactly the same background knowledge as they have. And this would even be the case if the other person is me from the past.
(DIR) Post #AA3UBdJ1thx1mTMsHg by Vierkantor@mastodon.vierkantor.com
2021-08-06T16:57:13Z
0 likes, 0 repeats
@urusan Suppose you present me with a list of what you claim are my terminal values. Then later I do something that is not allowed by these values. Am I wrong or is the list wrong? If we cannot be said to act according to our values, in what way do those values really guide our actions? If something being one of my values necessarily means that I will act accordingly, are there any sensible sets of values other than the set of things I would normally do anyway?
(DIR) Post #AA3UBdnW4MKzJ1dDns by Vierkantor@mastodon.vierkantor.com
2021-08-06T17:07:35Z
0 likes, 0 repeats
@urusan Another line of questioning: why do people do so many different, and contradictory things, if they all share a common set of terminal values? Are many people just so misguided that they will do things counter to their actual values? Or if they are following terminal values accurately, then why can't I conclude that terminal values offer very little predictive power for people's actions?
(DIR) Post #AA3UBeJQ9jrGtyYhX6 by urusan@fosstodon.org
2021-08-06T17:26:07Z
0 likes, 0 repeats
@Vierkantor So first of all, I don't actually claim that "there is a precise list of non-contradictory, complete terminal values that all humans share". In addition to the fact that I haven't proven there are terminal values at all, we could all have different terminal values. You could still have ethics in this case; it just wouldn't be a universal (human) ethics. That's a possibility. Our terminal values could also possibly be contradictory, but that's another ball of yarn.
(DIR) Post #AA3UCp07tuneLH5p5s by urusan@fosstodon.org
2021-08-06T17:26:21Z
0 likes, 0 repeats
@Vierkantor The problem with determining whether there are universal human ethics is that we humans really suck at following our terminal values accurately. The answer to your question about what happens if you do something that is not allowed by your terminal values is that, well, you made a mistake.
(DIR) Post #AA3UDn0um756T5vIeW by urusan@fosstodon.org
2021-08-06T17:26:32Z
0 likes, 0 repeats
@Vierkantor We make mistakes all the time. If you regret your actions, then you've probably violated one of your terminal values by accident. Even if at the time you thought you were doing the right thing and were happy about doing it, if you *could* come to regret it later due to later consequences or learning new information, then your actions were actually against your terminal values.
(DIR) Post #AA3UT1rhq8DYsBsmzw by urusan@fosstodon.org
2021-08-06T17:29:16Z
0 likes, 0 repeats
@Vierkantor AI systems can make mistakes too. Alpha Go lost to Lee Sedol that one time, so clearly it made a mistake at some point in the game it lost. It's not like Alpha Go wanted to lose at Go; winning at Go is Alpha Go's whole reason for being!
(DIR) Post #AA3VBF1kGVu0R4f0gy by urusan@fosstodon.org
2021-08-06T17:37:16Z
0 likes, 0 repeats
@Vierkantor We usually assume a superintelligence is perfect, and from our perspective it more or less would be, so it's a useful assumption to make while thinking about it. Nowadays, Alpha Go is much better at Go than any human (and we can think of it as being narrowly superintelligent), so while it could make a mistake and lose, it's not something we should rely on. A superintelligent strong AI would be superintelligent at everything, so competing against such an AI would be suicidal.
(DIR) Post #AA3VmRaCXveI5iW5YG by urusan@fosstodon.org
2021-08-06T17:43:59Z
0 likes, 0 repeats
@Vierkantor Are you certain this is true even if you both knew the complete and accurate truth about everything? Or is there any doubt in your mind? I'm still not actually arguing that moral relativism is false (or the weaker non-universal ethics case I mentioned elsewhere), just that I don't know either way, and my particular set of background knowledge leads me to believe that these shared human values are more likely than not, with seeming counterexamples arising mostly from mistakes.
(DIR) Post #AA3Vvtbti7CFcfoiYK by Vierkantor@mastodon.vierkantor.com
2021-08-06T17:45:41Z
0 likes, 0 repeats
@urusan I don't really like that example. First of all, I don't know whether "want" is the best word to describe something that is so unlike a human - but let's assume it is.
(DIR) Post #AA3W0Yq4MjFeeJfcGW by urusan@fosstodon.org
2021-08-06T17:46:32Z
0 likes, 0 repeats
@Vierkantor Though quick aside: I also think it's possible to create a system that has values other than our shared human values (and it's quite possible some exist in nature). After all, if it were impossible, we wouldn't have this AI safety issue; they'd be inherently safe. However, from a human perspective, it would be unethical to create (or promote) such a system.
(DIR) Post #AA3WW7I2KLqMTTnZQ0 by Vierkantor@mastodon.vierkantor.com
2021-08-06T17:52:13Z
0 likes, 0 repeats
@urusan I honestly have no opinion whether relativism will hold true even if we knew the complete and absolute truth about everything, and I don't expect that I will have a strong opinion. I would first need to be convinced that absolute truth exists, then that there might be something that is usefully considered "me, but with absolute knowledge", before I would be convinced to start considering the question.
(DIR) Post #AA3Wd35ULlIxT5EcYi by Vierkantor@mastodon.vierkantor.com
2021-08-06T17:47:11Z
0 likes, 0 repeats
@urusan More importantly, you are conflating "what Alpha Go wants to do" with "what the creators of Alpha Go created it to do". The latter is rather obviously "to win at Go". For Alpha Go, I'd argue "winning at Go", if it is at all a coherent concept to it, would be an instrumental goal. The real terminal values Alpha Go has are something about optimizing some objective function that the environment arranges to coincide with winning at Go.
(DIR) Post #AA3Wd3feBKEDHE9Uv2 by urusan@fosstodon.org
2021-08-06T17:53:28Z
0 likes, 0 repeats
@Vierkantor From my perspective, this is all kinda splitting hairs. I'd be just fine thinking of Alpha Go (or any other AI system) as an extension of its human creators, and honestly this perspective brings it into better harmony with this idea of human ethics. That is, you can say that the creators of a superintelligent AI are being unethical if they make an AI system that does human-unethical things (ex. killing all humans) on their behalf.
(DIR) Post #AA3WxVaA7JGRdPL1Pc by urusan@fosstodon.org
2021-08-06T17:57:11Z
0 likes, 0 repeats
@Vierkantor That said, I don't really think humans are all that special. Even our modern AI systems have some amount of autonomy in the world, and a strong AI would likely be on the same plane as a human. The only reason that human ethics matter is... well... I'm a human. You're a human too. Everyone reading this today and understanding it is a human. I also value other humans. As a human, it's ethical to bring other systems in line with human values. There's nothing wrong with that.
(DIR) Post #AA3XMVJQG2rC2LQXVw by urusan@fosstodon.org
2021-08-06T18:01:41Z
0 likes, 0 repeats
@Vierkantor That's really the rub here: if an AI has different values, it's also not being unethical if it murders or brainwashes us all to pursue its values, yet we do and should find this outcome horrific. A wolf isn't being unethical when it kills and eats a human; it's a wolf. It's doing what wolves do. So unless there's a truly universal ethics out there, one that overrides (and perhaps is the core of) human ethics, you can't say that the AI or the human or the wolf is in the right.
(DIR) Post #AA3XoVYgw4ZUuPvFKa by urusan@fosstodon.org
2021-08-06T18:06:46Z
0 likes, 0 repeats
@Vierkantor However, even without proving that there's some universal ethics out there, we can focus down on just our species (or perhaps even all biological life) and find common ethical ground there. One thing is for sure though: even if ethics is only valid at a personal level, then if you value your life or your fellow humans, creating a genocidal AI system is unethical, and you would be making a huge mistake and harming your own values in the process.
(DIR) Post #AA3XujFuvGuN7y6heq by realcaseyrollins@counter.fedi.live
2021-08-06T18:04:30.332862Z
0 likes, 0 repeats
@urusan @Vierkantor "You're a human too. Everyone reading this today and understanding it is a human." AIs reading your post: 😂
(DIR) Post #AA3YA3oqkbKBFPl104 by urusan@fosstodon.org
2021-08-06T18:10:40Z
0 likes, 0 repeats
@realcaseyrollins @Vierkantor That's why I put "today" and "and understanding it" in there. Unless there's been some big breakthroughs in AI I haven't heard of recently. 😛
(DIR) Post #AA3YS2u2tazhcPlPOq by urusan@fosstodon.org
2021-08-06T18:13:54Z
0 likes, 0 repeats
@Vierkantor I think I've responded to all your other points in broad terms, but looking back I think I missed this one: "If something being one of my values necessarily means that I will act accordingly, are there any sensible sets of values other than the set of things I would normally do anyway?" I suppose this all just fits into the whole making-mistakes thing, but you can likely imagine a better version of yourself that makes fewer mistakes and thus follows your values more closely.
(DIR) Post #AA3Yr3U7rwq0fAZ6pc by urusan@fosstodon.org
2021-08-06T18:18:26Z
0 likes, 0 repeats
@Vierkantor However, one further wrinkle to all this is that it's not just making mistakes that distorts what seem to be values, but also the state of our environment. Brutal environments lead to brutal choices. Consider Sophie's choice, where Sophie and her two children are being sent to a death camp and a guard gives her a sadistic choice: choose one child to be killed or lose both. She, of course, logically chooses one of her children, but that decision doesn't reflect her terminal values.
(DIR) Post #AA3ZFsAV54aOnyDfPM by urusan@fosstodon.org
2021-08-06T18:22:55Z
0 likes, 0 repeats
@Vierkantor I mean, it kinda does reflect her terminal values in the sense that she really values not losing both of her children, but it's also a poor reflection of her values because in a situation where it was possible she would have surely chosen to keep both her children, even at great cost to herself.So, while the specifics of the situation matter enormously for how your values actually express themselves, the environment is probably largely irrelevant to what your terminal values are.
(DIR) Post #AA3c3HBz1ypQVsb9pg by urusan@fosstodon.org
2021-08-06T18:54:15Z
0 likes, 0 repeats
@Vierkantor One last quick side note: from a materialist, deterministic perspective, the environment does still determine everything, including our terminal values. Even assuming our values are inherent to being human, being built into the structure and function of our brains (or even some deeper cellular basis), the initial conditions that led to us existing drove what our terminal values ended up being.
(DIR) Post #AA3cstzwVmAvdUaEyW by urusan@fosstodon.org
2021-08-06T19:03:35Z
0 likes, 0 repeats
@Vierkantor So I'm talking about the "environment" or "situation" informally here. My point is simply that in a given specific situation, even when we're acting rationally, we often have to make compromises and weigh values against each other. It's very easy to create situations where someone rationally decides to act decisively against their own terminal values, which, when taken out of context, can make it look like the person didn't have that value.
(DIR) Post #AA3d7x3lPWbVTWXxxo by urusan@fosstodon.org
2021-08-06T19:06:17Z
0 likes, 0 repeats
@Vierkantor Combine this situational effect with the mistakes, heuristics, systemic errors, vagueness, and incomplete information/misinformation that humans realistically deal with constantly, and there's an enormous amount of noise around what our terminal values really are, and whether or not we even share values in common.
(DIR) Post #AA3tSAnNeLCj8NY2QC by Azure@tailswish.industries
2021-08-06T22:09:14.819727Z
0 likes, 0 repeats
@urusan Thinking about this again, I think one thing where we may differ is exactly how easy changing 'true' values is. Religious conversions (or religious de-conversions, in my case) were what I had in mind. Though I think I've persuaded myself to think in your way: not because someone could offer me money to change my beliefs, but because it was a predisposition to think in consequentialist terms that made me weigh God and find it wanting in the first place. Also, I realized I was going at this question from the opposite direction in writing a fantasy story whose plot gets started because knowledge of cognitive reconstruction was restricted, precisely because of the worry that people rewriting themselves and their goals in sufficiently divergent ways may make cooperation intractable.
(DIR) Post #AA41Rzj1bkU6Qww83M by urusan@fosstodon.org
2021-08-06T23:38:50Z
1 likes, 0 repeats
@Azure So, the thing is that if something is one of your true (terminal) values, you probably would fight to not have it changed, even if a lot of happiness and pleasure were the reward. There's a good explanation of this near the end of this video: https://youtu.be/4l7Is6vOAOA (really, this whole series is good)
(DIR) Post #AA41p3xK2pbODOcjs8 by urusan@fosstodon.org
2021-08-06T23:43:01Z
1 likes, 0 repeats
@Azure I do think something like the situation in your fantasy story is somewhat plausible though. While I believe humans do likely have shared values, I can't prove this, and a group with sufficiently divergent terminal values would have trouble cooperating. A group that had been cooperating up to that point would be disrupted by this technology as individuals "defected" to their true values and stopped being capable of cooperation with the rest of the group.
(DIR) Post #AA42PEq73ZaQOZV9Xd by urusan@fosstodon.org
2021-08-06T23:49:33Z
0 likes, 0 repeats
@Azure Though the real question is why the conspirators didn't use the technique to bring everyone onto the same page, which would defuse any future conflict. I definitely believe that a person's values (including their true terminal values) can be changed; I just also believe anyone facing the possibility of one of their true terminal values being changed would fight it, hard. Most of our values aren't terminal values, which is why they can be changed.
(DIR) Post #AA441Oft2RHPrATflY by Azure@tailswish.industries
2021-08-07T00:07:41.981090Z
0 likes, 0 repeats
@urusan I am curious now what your intuition is on the possibility of cooperation between different intelligent (naturally evolved) species. If you think certain kinds of environments would lead to converging values, say.
(DIR) Post #AA44phNHmoekKglMIa by urusan@fosstodon.org
2021-08-07T00:16:45Z
1 likes, 0 repeats
@Azure Interspecies cooperation is definitely possible, and in fact fairly common. Cleaner wrasses (and their clients) are not waiting for any good opportunity to defect on each other. These species aren't smart enough to have some complex ideological reason for cooperating, so clearly this interspecies harmony is "baked in". As much as you can anthropomorphize fish, the wrasse value not biting their clients, and their clients value not eating the wrasse.
(DIR) Post #AA45lWt8vpHcbeEvB2 by urusan@fosstodon.org
2021-08-07T00:27:12Z
1 likes, 0 repeats
@Azure High intelligence actually makes this process easier, since you can cooperate for instrumental reasons more readily, and steady instrumental cooperation over a long time might lead to this kind of symbiotic co-evolution. I'm not sure about that, but it seems reasonable. We definitely have way more symbiotic relationships than most species, though.
(DIR) Post #AA45sa3BZdySRLIc08 by urusan@fosstodon.org
2021-08-07T00:28:29Z
1 likes, 0 repeats
@Azure Domesticated animals share our values to a far greater extent than wild animals; we selected against specific dogs that didn't share our values. This isn't to say domestic animals are fully in harmony with human values. They have some values we don't have, but which we don't care about. We have some values they don't have, but which they either don't care about or can't do anything about. We do callously harmful things to them because we can, from our position of power.
(DIR) Post #AA46PtSmQ0QyCi9i5Y by urusan@fosstodon.org
2021-08-07T00:34:30Z
1 likes, 0 repeats
@Azure Also, it could possibly be the case that all evolved life shares the same terminal values, and thinking about these things in human-specific terms is too narrow-minded. We really just don't know at this point. That said, I think the evidence points in the direction that different species have different terminal values, and that we do need to be clear when we're talking about human values or fox values or wrasse values.
(DIR) Post #AA4DIZ3yZB64aQ0RA8 by urusan@fosstodon.org
2021-08-07T01:51:36Z
0 likes, 0 repeats
@Azure This could also go to the other extreme, where even individuals have unique sets of terminal values, and it really is a war of all against all. I really don't know. That said, I do think there is some amount of unity at least at the species level, perhaps with some disunity around the edges. In particular, psychopathy is probably our term for a human that doesn't have human values.
(DIR) Post #AA4DR26yRuROzkRkDw by Azure@tailswish.industries
2021-08-07T01:53:08.951771Z
0 likes, 0 repeats
@urusan Possibly, though I think psychopathy proper is more a diminished ability to perceive or react to distress in others, and likely not as alien as media might portray it.(Especially since there also seem to be inverse-psychopaths who are much more sensitive to the distress of others than average.)
(DIR) Post #AA6rYxBD3OjCUzgAAi by uniq@chaos.social
2021-08-08T08:32:08Z
0 likes, 0 repeats
@urusan I'm not a mathematician, so please correct me if I'm wrong: an AI capable of taking such decisions would have to be insanely complex, self-improving, and nondeterministic. Basically you'd need to use a chaos-theory-based mathematical model. https://en.wikipedia.org/wiki/Chaos_theory So hypotheses about such an AI can only be falsified, but not conclusively proved correct. It's plausible the power set of all possible hypotheses is too big to be solved in its entirety ...
(DIR) Post #AA7OqQj3kjXjXoSAGO by urusan@fosstodon.org
2021-08-08T14:45:07Z
0 likes, 0 repeats
@uniq You're definitely on the right track here, but I don't think this argument outright disproves that we can make strong assertions about the properties of intelligent systems. So first of all, there are proofs about random/chaotic systems:
* This one is kinda trivial, but you can prove that a coin flip can only end up in 3 possible states (heads, tails, or on its side)
* You can prove the long-term probabilities of all 3 possibilities
Also, consider this: https://www.quantamagazine.org/mathematicians-prove-batchelors-law-of-turbulence-20200204/
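The coin-flip point is easy to see in a quick simulation. This is just an illustrative sketch (the on-edge probability is an arbitrary made-up number): individual flips are unpredictable, but the long-run frequencies settle near the probabilities you can prove.

```python
import random

random.seed(0)

# Hypothetical edge probability; heads/tails split the rest evenly.
P_EDGE = 0.001

def flip():
    r = random.random()
    if r < P_EDGE:
        return "edge"
    return "heads" if r < P_EDGE + (1 - P_EDGE) / 2 else "tails"

N = 100_000
counts = {"heads": 0, "tails": 0, "edge": 0}
for _ in range(N):
    counts[flip()] += 1

# Only 3 states are ever observed, and the observed frequencies
# converge toward the exact probabilities as N grows.
for outcome, c in counts.items():
    print(outcome, c / N)
```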
(DIR) Post #AA7PUIPi4mstLePcIq by urusan@fosstodon.org
2021-08-08T14:52:19Z
0 likes, 0 repeats
@uniq That said, intelligent systems likely fall into a different, even harder-to-reason-about class: complex systems. Complex systems include elements from both ordered, predictable systems and chaotic systems, which means you can't use the methods that make either one easy to tackle. Basically, the only way to accurately predict complex systems, even probabilistically, is to actually run them and see what they do. More technically, the description of their behavior is incompressible.
(DIR) Post #AA7Q6wvRtzvxzvgnQG by urusan@fosstodon.org
2021-08-08T14:59:18Z
0 likes, 0 repeats
@uniq However, despite the difficulty of predicting the outcomes of complex (and chaotic) systems, you can still prove various properties about them, make reasonable predictions in most situations, and control them. It's actually easier to control a chaotic system than it is to predict it. This is why, on some occasions, cloud-seeding technology has been used to ensure rain-free Olympics: by making it rain in nearby areas, they eliminate the possibility of rain in the target region.
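The "easier to control than to predict" point can be sketched with the logistic map, a standard chaotic toy system. This is only a sketch under assumptions I chose for the example (r = 3.8 is a chaotic parameter value, and the feedback gain is one choice that happens to make the fixed point attracting):

```python
# Logistic map: x' = r*x*(1-x), chaotic at r = 3.8.
R = 3.8
X_STAR = 1 - 1 / R  # fixed point, unstable without control

def step(x, control=0.0):
    return R * x * (1 - x) + control

# Prediction: two trajectories starting a billionth apart
# end up completely different.
a, b = 0.2, 0.2 + 1e-9
gap = 0.0
for _ in range(100):
    a, b = step(a), step(b)
    gap = max(gap, abs(a - b))
print("max gap between nearby trajectories:", gap)

# Control: a feedback nudge proportional to the current offset
# (gain chosen so the linearization at X_STAR has slope zero)
# pins the state near X_STAR, without ever predicting the
# free-running trajectory.
K = R - 2
x = X_STAR + 0.01
for _ in range(50):
    x = step(x, control=K * (x - X_STAR))
print("controlled offset from fixed point:", abs(x - X_STAR))
```

The uncontrolled pair diverges to a large gap despite the tiny initial difference, while the controlled trajectory stays pinned: you steer the system each step using only its current state, never a long-range forecast.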
(DIR) Post #AA7QmG0HsKVP2YeK1I by urusan@fosstodon.org
2021-08-08T15:06:46Z
0 likes, 0 repeats
@uniq That said, I'm not actually holding out hope for a hard mathematical proof of AI safety/friendliness. My argument with other people in other threads of this conversation has merely been that I don't think it has been adequately proven to be impossible yet. Gödel's incompleteness theorems and the Halting Problem prove that the unprovable exists, and give us some clues about what kinds of statements are unprovable, but that doesn't mean this specific thing is unprovable.
(DIR) Post #AA7SKUcs7GGKBsaNRw by urusan@fosstodon.org
2021-08-08T15:24:10Z
0 likes, 0 repeats
@uniq One last key technical point here is that the real-life systems in question are actually not beholden to Gödel or the Halting Problem anyway. They are of finite size and thus lack a property required for these theorems to apply: a finite system cannot support the arithmetic of the natural numbers [Gödel] and does not have infinite memory [Halting Problem]. The reason this is a technical point rather than a slam dunk is that intractability fits neatly into an analogous space for finite systems.
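The finite-size loophole is easy to demonstrate: for a deterministic machine with finitely many states, halting is decidable by simply running it until it either halts or revisits a state (at which point it must loop forever). A sketch, with toy counter machines invented for the example:

```python
def halts(transition, start, halting_states):
    """Decide halting for a deterministic machine over a finite
    state space: any run longer than the number of states must
    revisit a state, and a revisited state means an infinite loop."""
    seen = set()
    state = start
    while state not in halting_states:
        if state in seen:
            return False  # cycle detected: never halts
        seen.add(state)
        state = transition(state)
    return True

# 10-state counters: the first visits 0, 3, 6, 9 and halts at 9;
# the second only ever visits even states, so it cycles forever.
print(halts(lambda s: (s + 3) % 10, 0, {9}))
print(halts(lambda s: (s + 2) % 10, 0, {9}))
```

The catch the post goes on to name is cost: the decision procedure may need to visit every reachable state, which for realistic systems is astronomically many, so undecidability is replaced by intractability rather than eliminated outright.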
(DIR) Post #AA7TB5hvUcw0Ft1xZI by urusan@fosstodon.org
2021-08-08T15:33:40Z
0 likes, 0 repeats
@uniq All that said, if you asked someone who knows all this stuff deeply what their intuition says the likely outcome is, their intuition would likely say that no such proof exists, largely for the reasons you bring up, as well as the fact that a generally intelligent system can reason about the kinds of contradictory meta-statements that lead to trouble. I'm just saying that this intuition isn't a proof.
(DIR) Post #AA7TgAcZH4dcTILOFc by urusan@fosstodon.org
2021-08-08T15:39:17Z
0 likes, 0 repeats
@uniq Regardless, AI safety research is valuable because it helps clarify the problem. What even is intelligence? What makes an intelligence a general intelligence? What are some of the likely failure modes? Do "obvious" answers like keeping an AI system in a virtual box work? If not, where do they break down? Also, if we end up with a proof that it's impossible, that's valuable too. It's also brought attention to this issue, when the previous attitude was basically "it'll work out fine."
(DIR) Post #AAAC5Xg9T3SXWTqg7s by uniq@chaos.social
2021-08-09T23:06:19Z
0 likes, 0 repeats
@urusan nope, #StarTrek actually raised this issue 50 years ago... and cleared up how to fix AI-driven technocracies too: https://www.youtube.com/watch?v=TAeN7GcsLdA