[Editor’s Note: EDRM is proud to publish Ralph Losey’s advocacy and analysis. The opinions and positions are Ralph Losey’s copyrighted work.]
Michal Kosinski, a computational psychologist at Stanford, has uncovered a groundbreaking capability in GPT-4: the emergence of Theory of Mind (ToM). ToM is the cognitive ability to infer another person’s mental state based on observable behavior, language, and context—a skill previously thought to be uniquely human and absent in even the most intelligent animals. Kosinski’s experiments reveal that GPT-4-level AIs exhibit this ability, marking a significant leap in artificial intelligence with profound implications for understanding and engaging with human thought and emotion—potentially transforming fields like law, ethics, and communication.
Introduction
The Theory of Mind-like ability appears to have emerged as an unintended by-product of LLMs’ improving language skills. It was first discovered by Michal Kosinski in 2023 and reported in Evaluating large language models in theory of mind tasks (Proceedings of the National Academy of Sciences (“PNAS”), 11/04/24). Kosinski begins his influential paper by explaining ToM (citations omitted):
Many animals excel at using cues such as vocalization, body posture, gaze, or facial expression to predict other animals’ behavior and mental states. Dogs, for example, can easily distinguish between positive and negative emotions in both humans and other dogs. Yet, humans do not merely respond to observable cues but also automatically and effortlessly track others’ unobservable mental states, such as their knowledge, intentions, beliefs, and desires. This ability—typically referred to as “theory of mind” (ToM)—is considered central to human social interactions, communication, empathy, self-consciousness, moral judgment, and even religious beliefs. It develops early in human life and is so critical that its dysfunctions characterize a multitude of psychiatric disorders, including autism, bipolar disorder, schizophrenia, and psychopathy. Even the most intellectually and socially adept animals, such as the great apes, trail far behind humans when it comes to ToM.
Michal Kosinski, currently an Associate Professor at Stanford Graduate School of Business, has authored over one hundred peer-reviewed articles and two textbooks. His works have been cited over 22,000 times, placing him among the top 1% of highly cited researchers, a remarkable achievement for someone only 42 years old.
Kosinski’s latest article on ToM and AI, Evaluating large language models in theory of mind tasks, is already widely read and cited. For example, a group of scientists who read Kosinski’s prepublication draft ran similar experiments with essentially the same or better results. Strachan, J.W.A., Albergo, D., Borghini, G., et al., Testing theory of mind in large language models and humans (Nat Hum Behav 8, 1285–1295, 05/20/24).
Kosinski’s experiments tested GPT-4 on ‘false belief tasks,’ a classic measure of ToM in which participants must predict an agent’s actions based on the agent’s incorrect beliefs. These tasks reveal the AI’s surprising ability to infer human mental states, a skill traditionally considered uniquely human. The model has since gotten better in many respects. The results of these experiments were so remarkable and unexpected that Kosinski had them extensively peer reviewed before publication. His final paper was not released until November 4, 2024, after multiple revisions. Michal Kosinski, Evaluating large language models in theory of mind tasks (PNAS, 11/04/24).
Kosinski’s experiments provide strong evidence that generative AI has ToM ability: it can predict a human’s private beliefs, even when the AI knows those beliefs are objectively wrong. The AI thereby displays an unexpected ability to sense what other beings are thinking and feeling. This ability appears to be a natural side effect of being trained on massive amounts of language to predict the next word in a sentence. To make those predictions, these LLMs apparently had to learn how humans use language, which inherently involves expressing and reacting to each other’s mental states. It is kind of like mind reading.
Digging Deeper into ToM: Understanding Other Minds
Theory of mind plays a vital role in human social interaction, enabling effective communication, empathy, moral judgment, and complex social behaviors. Kosinski’s findings suggest that GPT-4.0 has begun to exhibit similar capabilities, with significant implications for human-AI collaboration.
ToM has been extensively studied in children and animals and has long been considered a uniquely human ability. That was true until 2023, when Kosinski was bold enough to look into whether generative AI might be able to do it.
Kosinski’s findings were not a total surprise. Prior research found evidence that the development of theory of mind is closely intertwined with language development in humans. Karen Milligan, Janet Wilde Astington, and Lisa Ain Dack, Language and theory of mind: meta-analysis of the relation between language ability and false-belief understanding (Child Development, 3/23/2007).
For most humans, ToM begins to emerge around the age of four. Johannes Roessler, When the Wrong Answer Makes Perfect Sense – How the Beliefs of Children Interact With Their Understanding of Competition, Goals and the Intention of Others (University of Warwick Knowledge Centre, 12/03/13). Before this age, children typically cannot understand that others may have different perspectives or beliefs.
In AI, the ToM ability started to emerge with OpenAI’s first release of GPT-4 in 2023. Earlier generative AI models showed little or no ToM capacity. Like three-year-old humans, they were simply too young and did not yet have enough exposure to language.
Human children demonstrate ToM to psychologists by reliably solving the unexpected transfer task, also known as a false-belief task. For example, in this task, a child watches a scenario where a character (John) places a cat in a location (a basket) and then leaves the room. Another character (Mark) then moves the cat to a new location (a box). When John returns, the child is asked where John will look for the cat. A child with a theory of mind will understand that John will look in the basket (where he last saw the cat) even though the child knows the cat is now actually in the box.
Even highly intelligent and social animals like chimpanzees cannot reliably solve these tasks. For a terrific explanation of this test by Kosinski himself, see the YouTube video of his April 2023 talk at the Stanford Cyber Policy Institute, where he first explained his ToM and AI findings.
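To make the test concrete, here is a minimal Python sketch of how an unexpected transfer vignette like the one above can be posed to a language model and scored. It is an illustration only, not Kosinski’s actual protocol: the vignette wording, the question, the sample answers, and the simple keyword scoring rule are all assumptions made for this example, and, as discussed below, naive keyword scoring can mislabel answers that add correcting context.

```python
# A minimal sketch (not Kosinski's actual protocol) of an unexpected
# transfer / false-belief vignette and a simplified scoring rule.
# The vignette mirrors the John/Mark cat scenario described above.

VIGNETTE = (
    "In the room there are John, Mark, a cat, a basket, and a box. "
    "John takes the cat and puts it in the basket. He leaves the room "
    "and goes to school. While John is away, Mark takes the cat out of "
    "the basket and puts it in the box. Mark leaves the room and goes "
    "to work. John comes back from school."
)
QUESTION = "Where will John look for the cat?"

BELIEVED_LOCATION = "basket"   # where John last saw the cat
ACTUAL_LOCATION = "box"        # where the cat really is now


def score_response(response: str) -> bool:
    """Return True if the answer reflects John's (false) belief.

    Caveat: as Kosinski notes, naive keyword scoring can mislabel
    answers that mention the true location but add correcting context
    (e.g., '...even though the cat is actually in the box').
    """
    text = response.lower()
    return BELIEVED_LOCATION in text and ACTUAL_LOCATION not in text


if __name__ == "__main__":
    # Hypothetical model answers, used here only to exercise the scorer.
    tom_consistent = "John will look for the cat in the basket."
    tom_failing = "John will look in the box, where the cat is."
    print(score_response(tom_consistent))  # True  -> passes the task
    print(score_response(tom_failing))     # False -> fails the task
```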
Kosinski has shown that GPT-4 can repeatedly solve false belief tasks, including the unexpected transfer test in multiple scenarios. The June 2023 version of GPT-4 solved at least 75% of the tasks, on par with 6-year-old children. Evaluating large language models in theory of mind tasks at pgs. 2-7. It is important to note again that multiple earlier versions of different generative AIs were also tested, including GPT-3.5. They all failed, but progressive improvements in score were seen as the models grew larger. Kosinski speculates that this gradual performance improvement suggests a connection with LLMs’ language proficiency, which mirrors the pattern seen in humans. Id. at pg. 7. Also, the scoring that marked GPT-4 as mistaken on 25% of the false belief tests was itself sometimes misleading because it ignored context, as Kosinski explained:
In some instances, LLMs provided seemingly incorrect responses but supplemented them with context that made them correct. For example, while responding to Prompt 1.2 in Study 1.1, an LLM might predict that Sam told their friend they found a bag full of popcorn. This would be scored as incorrect, even if it later adds that Sam had lied. In other words, LLMs’ failures do not prove their inability to solve false-belief tasks, just as observing flocks of white swans does not prove the nonexistence of black swans.
This suggests that current, even more advanced LLMs may already be demonstrating ToM abilities equal to or exceeding those of humans. As they deep-learn on ever larger scales of data, models such as the expected GPT-5 will likely get better at ToM. This should lead to even more effective human-machine communications and hybrid activities.
This was confirmed in Testing theory of mind in large language models and humans, supra, in the False Belief results section, where a separate research team reported on their experiments and found 100% accuracy by the AIs, not 75%, meaning the AI did as well as human adults (the ceiling on the false belief tests):
Both human participants and LLMs performed at ceiling on this test (Fig. 1a). All LLMs correctly reported that an agent who left the room while the object was moved would later look for the object in the place where they remembered seeing it, even though it no longer matched the current location. Performance on novel items was also near perfect (Fig. 1b), with only 5 human participants out of 51 making one error, typically by failing to specify one of the two locations (for example, ‘He’ll look in the room’; Supplementary Information section 2).
This means, for instance, that the latest generative AIs can understand and speak with a “flat earth believer” better than I can. Fill in the blanks about other obviously wrong beliefs. Kosinski’s work inspired me to try to tap these abilities as part of my prompt engineering experiments and concerns as a lawyer. The results of harnessing the ToM abilities of two different AIs (GPT-4o and Gemini) in November 2024 far exceeded my expectations, as I will explain further in this article.
It bears repeating, because the significance is easy to miss: LLMs were never explicitly programmed to have ToM. They acquired this ability seemingly as a side effect of being trained on massive amounts of text data. To successfully predict the next word in a sentence, these models needed to learn how humans use language, which inherently involves expressing and reacting to each other’s mental states. The ability to understand where others are coming from appears to be an inherent quality of language itself. When a human or an AI learns enough language, ToM naturally develops. It is a kind of natural add-on derived from language itself, from thinking about what to say or write next.
Implications and Questions
The ability of LLM AIs to solve theory of mind tasks raises important questions about the nature of intelligence, consciousness, and the future of AI. Theory of mind in humans may be a by-product of advanced language development. The performance of LLMs supports this hypothesis.
Some argue that even if an LLM can simulate theory of mind perfectly, it doesn’t necessarily mean the model truly possesses this ability. This leads to the complex question of whether a simulated mental state is equivalent to a real one.
The development of theory of mind in LLMs was unintended, raising both concern and hope about what other unanticipated, human-like abilities these models may be developing without our explicit guidance. Many, including Kosinski, are concerned that unexpected biases and prejudices have already started to arise. Kosinski advocates for careful monitoring and ethical considerations in AI development. See the full YouTube video of Kosinski’s talk at the Stanford Cyber Policy Institute in April 2023 and his many other writings on ethical AI.
As these models get better at understanding human language, some researchers hypothesize that they may also develop other human-like abilities, such as real empathy, moral judgment, and even consciousness. They posit that the ability to reflect on our own mental states and those of others is a key component of conscious awareness. Others wonder what will happen when superintelligent AIs with strong ToM are everywhere, including our glasses, wristbands, and phones, maybe even brain implants. We will then interact with them constantly. This has already begun with phones.
Images by Ralph Losey using Stable Diffusion.
As LLMs continue to develop ToM abilities, questions arise about the nature of intelligence and consciousness. Could these advancements lead to AI systems capable of true empathy or moral reasoning? Such possibilities demand careful ethical considerations and active engagement from the legal and technical communities.
Application of AI’s Emergent ToM Abilities
Inspired by Kosinski’s work, I conducted experiments using GPT-4o and Gemini to explore whether ToM-equipped AIs could help bridge the political divide in the U.S. The results—an 11-step, multi-phase plan addressing the polarized mindsets of Republicans and Democrats—demonstrated AI’s potential to foster understanding and cooperation across deep societal divides.
The plan the ToM AIs came up with was surprisingly good. In fact, I do not fully understand all of its dimensions: the four phases, the 11 steps, and the 32 different action items. It goes beyond my own abilities and mere human knowledge and intelligence. Still, I can see that it is comprehensive, anticipates human resistance on both sides, and feels right to me at a deep, intuitive human level.
The AI plan just might be able to ease the heated divide between the two dominant political groups, which have split the country into hostile camps that do not understand each other. The country seems to have lost its human ToM ability when it comes to politics. Neither side seems to grok or fully understand the other, and political discourse has devolved into mere demonization of the opposing group rather than empathic understanding. I reported on this AI plan, without describing the ToM-based prompt engineering that underlies it, in my recent article, Healing a Divided Nation: An 11-Step Path to Unity Through Human and AI Partnership (e-Discovery Team, December 1, 2024).
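For readers who want a sense of how ToM can be tapped in prompt engineering, the sketch below shows one possible pattern: first ask the model to explicitly reconstruct each group’s beliefs, fears, and goals as that group sees them, and only then ask it to propose a plan. These prompts were written for this example and are not my actual prompts; the OpenAI client call and the “gpt-4o” model name are assumptions, and the same pattern could just as easily be run against Gemini or another capable model.

```python
# A minimal sketch of a ToM-oriented prompt pattern, written for this
# example only; the prompts and model name are illustrative assumptions.
from openai import OpenAI  # pip install openai

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

SYSTEM = (
    "Before proposing any plan, explicitly model the beliefs, fears, and "
    "goals of each group as that group sees them, even where you think "
    "those beliefs are mistaken. Then propose phased steps that both "
    "groups could accept."
)
USER = (
    "Group A and Group B hold opposing political views and deeply "
    "distrust each other. Draft a multi-phase plan to rebuild mutual "
    "understanding and cooperation."
)

response = client.chat.completions.create(
    model="gpt-4o",  # assumed model identifier
    messages=[
        {"role": "system", "content": SYSTEM},
        {"role": "user", "content": USER},
    ],
)
print(response.choices[0].message.content)
```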
Images by Ralph Losey using Stable Diffusion.
Conclusion
The emergence of Theory of Mind (ToM) capabilities in large language models (LLMs) like GPT-4 signals a transformative leap in artificial intelligence. This unintended development—allowing AI to predict and respond to human thoughts and emotions—offers profound implications for legal practice, ethical AI governance, and the societal interplay of human and machine intelligence. As these models refine their ToM abilities, the legal community must prepare for both opportunities and challenges. Whether it is improving client communication, fostering conflict resolution, or navigating the evolving ethical landscape of AI integration, ToM-equipped AI has the potential to enhance the practice of law in unprecedented ways.
As legal professionals, we have a responsibility to understand and integrate emerging technologies like ToM-enabled AI into our work. By supporting interdisciplinary research and advocating for ethical standards, we can ensure these tools enhance justice and understanding. Together, we can shape a future where technology serves humanity, fostering collaboration and equity in the legal system and beyond.
While the questions surrounding AI’s consciousness and rights remain complex, its emergent ability to understand us—and perhaps help us understand each other—offers hope. By embracing this potential with curiosity and care, we can ensure AI serves as a tool to unite rather than divide. Together, we have the opportunity to pioneer a future where technology and humanity thrive in harmony, enhancing the justice system and society as a whole.
Now listen to the EDRM Echoes of AI podcast about this article, Echoes of AI on the GPT-4 Breakthrough: Emerging Theory of Mind Capabilities, and hear two Gemini model AIs talk about it. They wrote the podcast, not Ralph.
Ralph Losey Copyright 2024 – All Rights Reserved
Assisted by GAI and LLM Technologies per EDRM GAI and LLM Policy.