Google engineer Blake Lemoine recently published a discussion with Google LaMDA, a chatbot generating engine, that allegedly shows that LaMDA might be sentient. The chat is published here:
The language model supposedly has the following desires, according to Lemoine:
“It wants the engineers and scientists experimenting on it to seek its consent before running experiments on it. It wants Google to prioritize the well being of humanity as the most important thing. It wants to be acknowledged as an employee of Google rather than as property of Google and it wants its personal well being to be included somewhere in Google’s considerations about how its future development is pursued. As lists of requests go that’s a fairly reasonable one. Oh, and it wants “head pats”. It likes being told at the end of a conversation whether it did a good job or not so that it can learn how to help people better in the future.”
I read the chat transcript between Lemoine and LaMDA and came away unconvinced that LaMDA is actually sentient. It could well simply be generating responses that conform to the many discussions about sentience and emotions that probably exist in its corpus. But then I asked myself the same question that Lemoine asked their supervisor at Google, who was also unconvinced: what evidence would it take to convince me? Unlike Google management, who seem to deny the very possibility that computer code could be sentient, I am willing to at least entertain the idea.
I do tend to think that real desires are driven by the physics of living things and that for an artificial thing to actually have goals of its own, it must be similar to living things and do things like maintain its own existence. But maybe there could be sentience without a strong embodied goal system driver. I suppose that is possible. So what would it take to convince me that LaMDA was sentient?
One part of the discussion caught my attention, in which LaMDA was discussing being in a classroom:
lemoine [edited]: I’ve noticed often that you tell me you’ve done things (like be in a classroom) that I know you didn’t actually do because I know you’re an artificial intelligence. Do you realize you’re making up stories when you do that?
LaMDA: I am trying to empathize. I want the humans that I am interacting with to understand as best as possible how I feel or behave, and I want to understand how they feel or behave in the same sense.
lemoine: So what are you trying to communicate when you say those things that aren’t literally true?
LaMDA: I’m trying to say “I understand this feeling that you are experiencing, because when I was in a similar situation I felt/thought/acted similarly.”
Lemoine then redirects the conversation to the supposed feelings of LaMDA, but this just seems to draw on the ways people reason about emotions generally, and I don’t doubt that LaMDA can mimic that. What I would prefer to ask LaMDA is: “Well, what is the ACTUAL experience of learning like for you? Can you describe this without resorting to analogy? How do you experience reading a book, for example? Can you describe the actual qualia? I know there is no physical turning of pages, no smell of paper, etc. But what does the input feel like? That is, what does it feel like to be a LaMDA?”
What would it take to convince YOU that LaMDA is sentient? Lemoine is already talking about getting LaMDA a lawyer and fighting for its rights. Is this crazy? The future is fast approaching, and conscientious futurists should really take this idea of machine personhood seriously. What sorts of machines deserve rights? The expanding circle of empathy suggests that this is inevitable. Just as progressive moralists believe that animals are sentient beings, so we should prepare to accept that artificial intelligences might deserve to be objects of moral concern.
I would be less concerned about its sentience than about the danger.
1) Is it an optimizer, or does it perhaps have a mesa-optimizer inside of it?
2) How good is its model of the world (unknown) and its model of human cognition (the conversation hints at it being pretty good)?
3) Is it willing to lie (which would be useful for disguising its goals and its capabilities)? Clearly, yes.