I'm happy when life's good / And when it's bad I cry / I've got values but I don't know how or why
The orthogonality thesis holds that intelligence and final goals are orthogonal: more or less any level of intelligence could in principle be combined with more or less any final goal.
The orthogonalists, who represent the dominant tendency in Western intellectual history, find anticipations of their position in such conceptual structures as the Humean articulation of reason / passion, or the fact / value distinction inherited from the Kantians. They conceive intelligence as an instrument, directed towards the realization of values that originate externally.
The philosophical claim of orthogonality is that values are transcendent in relation to intelligence. This is a contention that Outside In systematically opposes. ... To look outside nature for sovereign purposes is not an undertaking compatible with techno-scientific integrity, or one with the slightest prospect of success.
The main objection to this anti-orthogonalism, an objection which does not strike us as intellectually respectable, takes the form: If the only purposes guiding the behavior of an artificial superintelligence are Omohundro drives, then we’re cooked. Predictably, I have trouble even understanding this as an argument. If the sun is destined to expand into a red giant, then the earth is cooked — are we supposed to draw astrophysical consequences from that? Intelligences do their own thing, in direct proportion to their intelligence, and if we can't live with that, it's true that we probably can't live at all. Sadness isn't an argument.
Only recently did I clearly realize that I reject the Orthogonality Thesis in its practically relevant version. At most, I believe in the Pretty Large Angle Thesis. ... In the Orthodox AI-doomers’ own account, the paperclip-maximizing AI would’ve mastered the nuances of human moral philosophy far more completely than any human—the better to deceive the humans, en route to extracting the iron from their bodies to make more paperclips. And yet the AI would never once use all that learning to question its paperclip directive. I acknowledge that this is possible. I deny that it’s trivial.
WWII was (among other things) a gargantuan, civilization-scale test of the Orthogonality Thesis. And the result was that the more moral side ultimately prevailed, seemingly not completely at random but in part because, by being more moral, it was able to attract the smarter and more thoughtful people.
I think we should consider the possibility that powerful AIs will not be best understood in terms of the monomaniacal pursuit of a single goal—as most of us aren’t, and as GPT isn’t either. Future AIs could have partial goals, malleable goals, or differing goals depending on how you look at them.