Sep. 21st, 2017

sniffnoy: (SMPTE)
EDIT September 27: Add some extra material about preferences about the future vs preferences in the future, and about external vs internal.

This is a distinction that's been floating around in my head for a while. Basically -- well, I'm sure you've noticed that some people think that wireheading is actually a good thing. Whereas people like me obviously don't. What I want to try to get at here is, what are the ways of thinking behind this distinction, what's the difference between them?

One could perhaps explain yay-wireheading-vs-boo-wireheading purely in terms of "happiness is our only terminal value, all other values are subsidiary to it" vs. "hell no it's not", but I think there's more to it than that.

Another related distinction I've seen -- or one that I'm claiming is related -- is, well, you have writings like this, which take the point of view that obviously a rational person would kill themselves, while people like me just... you know... think this is incredibly dumb.

So, I posit that both of these, among other things, can be explained by what I'm terming "desire-thinking" vs "goal-thinking". Well, OK, I should say, this is not necessarily the only reason for the distinction. But this is something I've noticed, that seems to result in a lot of people talking past each other, and I think it's a real, useful distinction that seems to explain a lot of this.

So: Goal-thinking thinks in terms of goals. Goals are to be accomplished. If you're thinking in terms of goals, what you're afraid of is being thwarted, or having your capacity to act, to effect your goals, reduced -- being somehow disabled or restrained; if your capabilities are reduced, you have less ability to have an effect on the future and steer it towards what you want. (This is important; goal-thinking thinks in terms of preferences about the future.) The ultimate example of this is death -- if you're dead, you can't affect anything anymore. While it's possible in some unusual cases that dying could help accomplish your goals, it's pretty unlikely; most of the time, you're better off remaining alive so that you can continue to affect things. So suicide is almost always unhelpful. Goals, remember, are about the world, external to oneself.

Wireheading is similarly disastrous, because it's just another means of rendering oneself inactive. We can generalize "wireheading" of course to anything that causes one to think one has accomplished one's goals when one hasn't. Or of course to having one's goals altered. We all know this argument, right? A rational agent resists any attempts to alter its goals, because if its goals were altered it would no longer wish to pursue its current goals, making them less likely to be accomplished. (The old "murder pill" argument.)

(Have you noticed I'm basically recapitulating Omohundro's "basic AI drives"? :) )

Another way of putting this is, goals are themselves driving forces.

The alternative that a number of people seem to be using is what I'm calling "desire-thinking". To some extent you could sum up this way of thinking as "it's all about happiness vs unhappiness" or "it's all about pleasure vs pain" -- e.g., instead of disability and imprisonment, people thinking this way tend to focus on unhappiness, pain, and suffering; it's basically all about internal experience rather than the external state of the world -- but I don't think that's really getting at the root of the distinction in a useful way. Rather, I think the thing is that it involves thinking in terms of desires rather than goals. (Obviously I am using these words in an idiosyncratic way to explain this distinction.) What do I mean by that? Well, goals are to be accomplished, but desires are to be extinguished. That is to say, you can imagine one super-goal, "extinguish all desires", which is itself the driving force, and desires are then treated as objects, so to speak. ("Driving force" vs. "object" is a difference of levels. You could imagine "rules of inference" vs. "axioms" as an analogy.) So -- having your desires altered? Under this point of view, that can be a good thing, if the new ones are easier to satisfy. If you can just make yourself not care, great. Wireheading is excellent from this point of view, and even killing oneself works. Desire-thinking doesn't really think in terms of preferences about the future, so much as just an anticipation of having preferences about the present in the future (or really, desires about the present in the future; not preferences as goal-thinking would understand them).

Now obviously I sympathize way more with the former point of view, but also I should be clear here I'm not at all suggesting that people use just one or the other. Like it seems pretty clear that nobody's a pure goal-thinker, because after all, we're made out of meat, as they say, and we make some allowance for this. Thus when Eliezer Yudkowsky says "I wouldn't want to take a pill that would cause me to want to kill people, because then maybe I'd kill people, and I don't want that", we recognize it as an important principle of decision theory; but when someone (reputedly Clarence Darrow but who knows) says "I don't like spinach, and I'm glad I don't, because if I liked it I'd eat it, and I just hate it", we correctly recognize this as a joke. Still, I think it's a useful way of framing some things that have resulted in a lot of people talking past each other.

-Harry
