Artificially Intelligent

Any mimicry distinguishable from the original is insufficiently advanced.

Commitments are not "real"


There are two agents, Alice and Bob. There is an action a that Alice currently cannot take (perhaps because it can only be taken at a future point in time). Alice wants Bob to predict that Alice will take that action, perhaps so Bob doesn’t have to spend as much time monitoring Alice.

One strategy Alice has for doing this is saying the words “I promise to take action a.” Bob, upon hearing these words, would update his probability that Alice will take action a.
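As a toy illustration (the numbers below are made-up assumptions, not anything from the post), Bob’s shift in belief can be written as an ordinary Bayes-rule update on the observation “Alice said the promise”:

```python
# Toy Bayesian update: how much should hearing a promise move Bob's belief?
# All numbers are invented for illustration.

prior_takes_a = 0.3          # Bob's prior that Alice will take action a
p_promise_if_takes = 0.8     # P(Alice promises | she will in fact take a)
p_promise_if_not = 0.2       # P(Alice promises | she won't take a)

# P(promise) by the law of total probability
p_promise = (p_promise_if_takes * prior_takes_a
             + p_promise_if_not * (1 - prior_takes_a))

# Bayes' rule: P(takes a | promise)
posterior = p_promise_if_takes * prior_takes_a / p_promise
print(f"Bob's belief moves from {prior_takes_a:.2f} to {posterior:.2f}")
# -> from 0.30 to ~0.63: the promise is only as informative as the gap
#    between how promise-keepers and promise-breakers talk.
```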

Another strategy Alice has is trying to contort herself internally in a way that makes it more likely that she takes action a robustly. She can then try to expose some parts of this internal modification to Bob.

Unfortunately, if Alice is a human, she cannot perform an internal self-modification that will result in her almost always taking action a in a way that is legible to Bob. If she could, she could simply do that instead of making promises. To Bob, there is no real difference between Alice self-modifying to almost always take action a and Alice actually taking action a. The version of Alice that can perform such self-modification can 1) make some particular fact very likely to be true (“Alice will take a”) and 2) easily convince other people of this fact (have Bob also strongly and correctly predict Alice will take a). To agents able to perform such modifications, there is no privileged notion of time such that you can take actions “now” but can’t yet take actions “in the future.” (That’s why it’s called timeless decision theory, because you’re deciding how you want the entire “timeline” to be all at once.)
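Here is a minimal sketch of what legible self-modification could look like for a non-human agent, assuming hypothetical names and a cartoonishly simple policy representation; it is one way to picture the collapse described above, not a real framework:

```python
# Sketch: an agent whose internals are a plain, readable policy table.
# For such an agent, "committing" and "already acting" collapse together.

class LegibleAgent:
    """An agent whose policy is an inspectable mapping from situations to actions."""

    def __init__(self):
        self.policy = {}  # anyone can read this

    def self_modify(self, situation, action):
        # "Commitment" is literally editing the policy that will run later.
        self.policy[situation] = action

    def act(self, situation):
        return self.policy.get(situation, "improvise")


alice = LegibleAgent()
alice.self_modify("when the time comes", "a")

# Bob's "monitoring" reduces to reading Alice's policy table; there is no
# separate question of whether she will keep her word.
bob_prediction = alice.policy["when the time comes"]
assert bob_prediction == alice.act("when the time comes") == "a"
```

The point of the sketch is that Bob’s prediction and Alice’s future behavior are read off the same object, so a spoken promise adds nothing.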

To humans, promises, commitments, honor, honesty, integrity, etc. all serve the function of making oneself more predictable and legible to external agents, and thus able to make certain facts true and communicable that a person with less honor/integrity/etc. could not make true and communicable.

Some other random notes:

  • Similarly, “saying true things” isn’t agent-neutral; people have “different truths” in that they will update differently based on the same actions/communications.
  • We can also say “lying” isn’t “real”, and instead talk directly about taking actions deliberately designed to alter someone’s beliefs in a way that furthers your goals and detracts from theirs.

(Note: “real” is in scare quotes because I think honesty in humans is a real concept; I just think that’s particular to the way that humans are constructed. Sufficiently advanced agents might be able to just manipulate the underlying logical facts directly.)