About

AMMDI is an open-notebook hypertext writing experiment, authored by Mike Travers aka mtraven. It's a work in progress and some parts are more polished than others. Comments welcome!

Page Contents

Deliberate Agency
Gears Level Understanding of Yourself
Coherence and Consistency
Game Theoretic Soundness

Page Tree

Coherence Arguments Do Not Imply Goal Directed Behavior

Embedded Agency

Explaining Insight Meditation and Enlightenment in Non-Mysterious Terms

Meta-honesty

Naming the Nameless

The Rocket Alignment Problem

Varieties of Argumentation Experience

What Motivated Rescuers During the Holocaust?

A Map That Reflects the Territory

Incoming links

from LWMap/A Map That Reflects the Territory

LWMap/Being a Robust Agent (Raymond Arnold)

from anti-purpose

Purposefulness in itself is a key value of Rationalism (see LWMap/Being a Robust Agent). A good rationalist not only has goals, they have meta-goals about being more goal-oriented.

from Agency Made Me Do It

A caution: my goal is not to write a self-help book or a manual on how to acquire more agency. I guess this is something people might be looking for, given how it is basically the promise of a whole subindustry of productivity and self-help gurus, and a concern of Rationalism (see LWMap/Being a Robust Agent and High Agency).

Here's a rather extreme example, possibly a parody?

I wrote this book in three months while simultaneously attempting seventeen other missions, including running a startup, launching a hit iPhone app, learning to write 3,000 new Chinese words, training to attempt a four-hour marathon from scratch, learning to skateboard, helping build a successful cognitive testing website, being best man at two weddings, increasing my bench press by sixty pounds, reading twenty books, going skydiving, helping to start the Human Hacker House, learning to throw knives, dropping my 5K time by five minutes, and learning to lucid dream. I planned to do all this while sleeping eight hours a night, sending 1,000 emails, hanging out with a hundred people, going on ten dates, buying groceries, cooking, cleaning, and trying to raise my average happiness from 6.3 to 7.3 out of 10.

Twin Pages

LWMap/Being a Robust Agent

08 Mar 2022 08:39 - 06 Aug 2022 09:20

A review of the original post by Raymond Arnold

This essay valuably makes explicit an idea I detected in implicit form in LWMap/Agency: Introduction – that one should have the meta-goal of striving towards being "a robust, coherent agent" in contrast to the default human state of being "a kludgy bundle of impulses".

This passage is structured sort of as a Rationalist version of the Categorical Imperative:

Be the sort of agent who, if some AI engineers were white-boarding out the agent's decision making, they would see that the agent makes robustly good choices, such that those engineers would choose to implement that agent as software and run it.

I think this is supposed to be a kind of minimal commitment for rationalists, but to me it seems like a really weird thing to take on as a goal. It's too meta. Rather than wanting something concrete, like money, or idealistic, like peace and prosperity for others, it's wanting to want better. There's nothing obviously wrong with that, but something about it bothers me.

Let me try and unpack that feeling. First, it reminds me of William Blake's dictum:

He who would do good to another must do it in Minute Particulars: General Good is the plea of the scoundrel, hypocrite, and flatterer, For Art and Science cannot exist but in minutely organized Particulars, And not in generalizing Demonstrations of the Rational Power – from Jerusalem

Rationalism almost definitionally involves taking Blake's generalizing rational power to an extreme. "Utility" is the Rationalists' name for "general good", and the entire ideology is based around the idea that it can be separated from "minutely organized particulars". Quoting Blake is not an argument that's going to get very far in Rationalist precincts, of course, but it exposes the deep roots of the issue.

On the surface, this sort of drive towards generalized purposefulness is no worse than the productivity coaches and the like who offer to increase their customers' general effectiveness, to any arbitrary end. What's wrong with wanting to be more effective?

One of my heroes, Stewart Brand, famously promoted something akin to this stance with his Whole Earth Catalog mottos: "Access to Tools" and "We are as Gods and Might as well get Good at It". Tools for what? It doesn't matter, whatever you might want to do (in this case, "you" was a subset of hippies and people interested in alternative ways of living).

Rationalism by nature does not deal with specific goals and goods. They are relegated to abstractions like "values" or "utility", while the real focus of attention is on the powerful and fully general machinery of goal-satisfaction, for both human and computational agents.

The independence of goals and goal-satisfaction machines is a foundational principle of rationalism, under the name of the orthogonality thesis. This thesis is one of those assumptions that seems axiomatic to rationalists and completely wrong to me.

Wrong how? Well, it's obviously a completely wrong model of human motivation. It's sort of like an inverse of naive Freudianism. Freud's great method was to try to ascertain how our high-level goals were grounded in and powered by our low-level goals (notably sex), and how we never really escape from that fact. The Rationalist model of goals is the opposite: goals are radically ungrounded. Human thriving or making paperclips; it's all the same to the abstract optimization engine.

I think I've arrived at a compact understanding of what Rationalism is:

start with the natural goal-seeking and problem-solving abilities of actual humans

abstract this out so you have a model of goal-seeking in general.

assume that computational technology is going to make hyperaccelerated and hypercapable versions of this process (ignoring or confusing the relationship between abstract and actual goal-seekers)

notice that this is dangerous and produces monsters.

try to repair the problem with aftermarket alignment techniques (see LWMap/The Rocket Alignment Problem)

Because Rationalism is about an idealized version of thinking, it doesn't have much interest in the ways that humans (so far, the only examples we have of intelligent agents) actually work. It aims to make humans more closely approximate the ideal, even though the ideal is monstrous when taken to its logical extremes.

Here are four components of Robust Agency that the author identifies, with some snark by me:

Deliberate Agency

Don't just go along with whatever kludge of behaviors that evolution and your social environment cobbled together. Instead, make conscious choices about your goals and decision procedures that you reflectively endorse.

Just want to note that this is the opposite of Gregory Bateson's thesis that conscious purpose is usually bad and we should pay a lot more attention to evolution.

Gears Level Understanding of Yourself

One of the things I generally like about Rationalists is that they are good at introspection and regularly come up with creative new techniques for doing so.

However, labeling it as "gears level understanding of yourself" seems pretentious and misleading. You don't have access to your gears. You just have the ability to represent yourself and tell the same kind of stories about yourself that you tell about external objects.

Also, it ought to be obvious that introspection very often leads to paralysis rather than robust agency. There's an underlying assumption here that if we just understood our "gears" better we wouldn't be anomic or nihilistic, we'd just be better-functioning machines. This is contradicted by the evidence.

Coherence and Consistency

The hobgoblin of small minds. But OK, goal-consistency is important to getting anything done. Here's where I think the author ignores his previous bit of advice to look at the gears. If the gears are kludgy impulses, how do goal-coherent selves get built on top of that?

That is the subject of George Ainslie's work, which has not really been well understood or integrated. The author mentions "making trades with your future self", which suggests he's read Ainslie, but he also says "This is easier if your preferences aren’t contradictory". The whole point of Ainslie is that our preferences are contradictory, and that the only reason we have selves at all is to enable intertemporal bargaining.

Game Theoretic Soundness

This means acknowledging that we live in a world with other agents, with their own goals that might be in alliance or conflict with our own. I can't argue with this.

I'll quibble with this passage though:

Related to this is legibility. Your gears-level-model-of-yourself helps you improve your own decision making. But it also lets you clearly expose your policies to other people. This can help with trust and coordination.

Isn't one of the basic strategies of game play to not be legible, to hide your intentions? This seems to resonate with a theme I've noticed elsewhere (see conflict theory) – despite their love of games, rationalists tend to not take real competition seriously enough, and assume more cooperation than there actually is.

See illegibility

Further reading

Breakdown of Will, George Ainslie
