Five Faces of the Black Box: How AI ‘Thinks’ and Makes Decisions

By Ralph Losey
Image: Ralph Losey with Gemini AI.

[EDRM Editor’s Note: EDRM is proud to publish Ralph Losey’s advocacy and analysis. Writing and images in the article are by Ralph Losey using Gemini AI. Originally published on EDRM.net. This article remains the intellectual property of Ralph Losey and is shared with permission.]


We are currently living through a “Gutenberg Moment,” but with a complex, digital twist: our new printing press is alive, probabilistic, and prone to “confident delusions.” While AI may be humanity’s most transformative invention, it remains an enigma to most.

Five people looking at the black box
Five Faces of the Black Box. My choices. My direction. Image: Ralph Losey with Gemini AI.

My recent article, What People Want To Know About AI: Top 10 Curiosity Index, revealed that the primary thing people want to know is how the machine actually works. They are asking the most difficult question in the field: How does AI “think” or make decisions?

This article answers that question by providing a structured understanding of Large Language Models (LLMs) across five levels of technical complexity:

  1. The Smart Child: The world’s best guessing game.
  2. The High School Graduate: Statistical probability at a global scale.
  3. The College Graduate: Mapping meaning in Latent Space.
  4. The Computer Scientist: The logic of the Transformer and Self-Attention.
  5. The Tech-Minded Legal Professional: Navigating probabilistic advocacy.
Same five people using AI
Image: Ralph Losey with Gemini AI.

There is a meta-lesson here too that goes beyond the words on this page. Some of my favorite explanations of complex subjects emulate the fresh, clear speech of fifth graders. You will often find deep creativity when AI models parrot their language.

I chose five kinds of speech to describe how AI works. There are hundreds more that I could have picked. I also could have asked for explanations that use story or humor, much like Abraham Lincoln liked to do. It is fun to learn to tell AI what to do so that you can better communicate. It empowers a level of creativity never before possible. Maybe next time I will use comedy or poetry. For now, let’s peel back the curtain using these five.

Parrot head on suited man reading a law book.
For background see: Stochastic Parrots: the hidden bias of large language model AI. Image: Ralph Losey with Gemini AI.

1. The Smart Child Level: The World’s Best Guessing Game

Definition: Generative AI is like a magic “Fill-in-the-Blank” machine that has played the game trillions of times with almost every book ever written.

Imagine you are playing a game. If I say, “The peanut butter and…”, you immediately think of the word “jelly.” You don’t need to look at a jar of jelly to know that word fits. You’ve heard those words together so many times that your brain just knows they belong together.

An AI is a computer that has “listened” to almost everyone in the world talk and “read” almost every story ever told. It doesn’t “know” what a sandwich is, and it doesn’t have a stomach that feels hungry. It simply knows that in the history of human writing, the word “jelly” follows “peanut butter” more than almost any other word.

But it’s even smarter than that. If you say, “I am at the library and I am reading a…”, the AI knows that “book” is a much better guess than “sandwich”. It looks at all the words you give it—the “clues”—to narrow down the billions of possibilities into one likely answer. It makes decisions by picking the word that is most likely to come next to complete a pattern that makes sense to us. It isn’t “thinking” about the story; it’s just very, very good at predicting the next piece of the puzzle.
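To make the guessing game concrete, here is a minimal sketch in Python: a toy "fill-in-the-blank" machine that counts which word follows which in a tiny made-up corpus. The corpus is, of course, an invented stand-in for the trillions of real examples a true model trains on.

```python
from collections import Counter, defaultdict

# A toy corpus standing in for "almost every story ever told."
corpus = (
    "i ate peanut butter and jelly . "
    "she made peanut butter and jelly sandwiches . "
    "he spread peanut butter and honey on toast . "
    "i am at the library reading a book . "
).split()

# Count which word follows which (a bigram model).
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def guess_next(word):
    """Return the word seen most often after `word` in the corpus."""
    return follows[word].most_common(1)[0][0]

print(guess_next("and"))     # "jelly" beats "honey" two votes to one
print(guess_next("peanut"))  # always "butter" in this corpus
```

The machine never tastes jelly; it simply counts. Scale the counting up by many orders of magnitude, and replace raw counts with learned weights, and you have the intuition behind a Large Language Model.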

Crossword table in a library with a robot hand putting the word "jelly" on a scrabble board near "butter".
Image: Ralph Losey with Gemini AI.

2. The High School Level: Statistical Probability at Global Scale

Definition: AI is a Prediction Engine. It uses “Big Data” to calculate the statistical likelihood of the next piece of information.

Most of us use the “Autofill” feature on our smartphones every day. As you type a text, the phone suggests the next likely word based on your past habits. If you often text “I’m on my way,” the phone learns that “way” usually follows “my.” Generative AI—specifically Large Language Models—is essentially Autofill scaled to include the vast majority of digitized human knowledge.

During its “training” phase, the model does not “memorize” facts like a traditional database. If you ask it for the date of the Magna Carta, it isn’t looking it up in a digital encyclopedia. Instead, it has learned through billions of examples that the words “Magna Carta” and “1215” have a very high statistical correlation.

This explains why AI can sometimes be “confidently wrong.” It isn’t “lying” in the human sense; it is simply following a statistical path that leads to a mistake. If the data it was trained on contains a common error, the AI will repeat that error because, in its mathematical world, that error is the “most likely” next word. It recognizes the “shape” of human thought without actually having a human mind.

iPhone superimposed on a world map, suggesting global scale and infrastructure.
High School Graduate Level Speech Using Statistical Probabilities. Image: Ralph Losey with Gemini AI.

3. The College Graduate Level: Mapping the Latent Space

Definition: AI organizes information using Vector Embeddings, which convert words into numerical coordinates on a massive, multi-dimensional map called Latent Space.

To understand how AI moves beyond mere word-matching, we have to look at how it “maps” meaning. In a physical library, books are organized in one dimension (the call number on the spine) or two (the shelf grid). AI organizes information in a “map” that has thousands of dimensions.

  • Vectoring (The Coordinate System): Every word or concept is assigned a “Coordinate”—a long string of numbers. For example, the word “Stealing” is mathematically plotted very close to “Larceny” but far away from “Charity”.
  • Conceptual Proximity: Think of this as the “Relativity” of language. If you ask the AI about “theft,” it doesn’t look for that specific word. It navigates to those coordinates in Latent Space and finds all the “neighboring” concepts like “property,” “intent,” and “deprivation.”
  • Vector Arithmetic: Researchers discovered that you can actually perform “logic” using these numbers. A famous example is: King – Man + Woman = Queen. The model “understands” the relationship between these concepts because the mathematical distance between “King” and “Man” is the same as the distance between “Queen” and “Woman.”
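The famous King – Man + Woman = Queen example can be reproduced with toy numbers. The sketch below uses hand-made two-dimensional "embeddings" (real models learn thousands of dimensions); the vectors are illustrative assumptions, not real model weights.

```python
import math

# Hand-made 2-D "embeddings": axis 0 = royalty, axis 1 = gender (toy values).
vec = {
    "king":  [1.0,  1.0],
    "queen": [1.0, -1.0],
    "man":   [0.0,  1.0],
    "woman": [0.0, -1.0],
}

def cosine(a, b):
    """Cosine similarity: 1.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

# Compute king - man + woman, then find the nearest word on the map.
target = [k - m + w for k, m, w in zip(vec["king"], vec["man"], vec["woman"])]
best = max(vec, key=lambda word: cosine(vec[word], target))
print(best)  # lands closest to "queen"
```

Subtracting “man” removes the gender direction from “king” while keeping the royalty direction; adding “woman” points the result at “queen.” Real embeddings behave the same way, just in far more dimensions.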

When you provide a prompt, the AI identifies the coordinates of your request. It then “walks” through the nearby clusters of meaning to synthesize an answer. The “Black Box” is the result of the sheer scale of this map. With thousands of dimensions and hundreds of billions of learned connections, the path the AI takes is so complex that no human can trace the logic of a single output back to a single “rule.”

A visual representation of legal terms and criminal acts, featuring nodes and connections depicting concepts like larceny, fraud, contract law, and violent crimes.
College Graduate Level Speech Mapping Latent Space. Image: Ralph Losey with Gemini AI.

4. The Computer Scientist Level: The Decoder-Only Transformer

Definition: Generative AI is a system powered by neural network architectures—most notably the Decoder-only Transformer—that is specifically tuned to generate the next piece of information by mathematically looking back at everything that came before it. Rather than relying on rigid rules, these models evaluate entire inputs using a mathematical weighting system called Self-Attention to determine the contextual relationship between every element.

To achieve this generative capability, the architecture relies on several complex mathematical mechanisms:

A. The “Query, Key, and Value” System: To decide how much “weight” to give a word, the AI creates three numerical identities for every token. The Query represents what the token is looking for (like a pronoun searching for a subject), the Key represents what the token offers (like a subject offering its identity), and the Value represents the token’s actual semantic meaning.

Query moving to token, with a key to derive a value.  Looks like brain neurons.
AI system deciding how much Weight to give a word. Image: Ralph Losey with Gemini AI.

B. The Logic of Self-Attention: Imagine a judge sitting through a long trial. When a witness says the word ‘It,’ the judge immediately looks back at previous exhibits to see what ‘It’ refers to. The AI does this mathematically by comparing the Query of one word against the Keys of every other word in the sequence. For example, in the sentence “The court sanctioned the attorney because his motion was meritless,” the AI calculates the relationship between “his” and the surrounding words. The Query for “his” finds a high match with the Key for “attorney,” allowing the model to assign a high Attention Weight to “attorney” so the word “his” inherits the correct context.
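The Query/Key comparison can be sketched in a few lines of Python. The vectors below are made-up two-dimensional toys (real models use hundreds of dimensions per attention head), chosen so that the Query for “his” lines up with the Key for “attorney.”

```python
import math

def softmax(xs):
    """Turn raw scores into weights that sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention for a single Query (a minimal sketch)."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    blended = [sum(w * v[i] for w, v in zip(weights, values))
               for i in range(len(values[0]))]
    return weights, blended

# Toy tokens and invented 2-D vectors for the example sentence.
tokens = ["court", "sanctioned", "attorney", "his"]
keys   = [[1.0, 0.0], [0.0, 1.0], [4.0, 0.0], [0.1, 0.1]]
values = [[1.0, 0.0], [0.0, 1.0], [0.0, 2.0], [0.5, 0.5]]

query_his = [4.0, 0.0]  # the Query for "his" is "looking for" a person
weights, blended = attend(query_his, keys, values)
top = tokens[weights.index(max(weights))]
print(top)  # "his" attends most strongly to "attorney"
```

The output for “his” is a blend of every token’s Value, weighted by how well each Key matched the Query; the strongest match dominates the blend.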

Robot judge with human in witness box, big screen on the wall.
Futuristic courtroom where a cyborg judge compares the Query of one word to the Keys of all others to build context. Image: Ralph Losey with Gemini AI.

C. Multi-Head Attention (Parallel Deliberation): The model doesn’t just evaluate the text once; it runs these calculations dozens of times in parallel. Different “Heads” focus on different aspects simultaneously—one might evaluate syntax and grammar, another focuses on technical legal definitions, and a third assesses the overall tone or sentiment.

Left and right brain representation with legal symbols in the center.
AI brain split into three parallel sections working simultaneously. The left side scans floating grammar and punctuation. The middle analyzes legal definitions. The right side evaluates holographic floating masks of human emotions.

D. The Decision Layer (Feed-Forward Networks): After attention weights are settled, the data moves into a decision-making layer consisting of billions of Weights (connection strengths) and Biases (baseline leanings). These act as the model’s “institutional knowledge,” which was grown during training to satisfy the objective of predicting the next token.
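A feed-forward block is, at its core, multiplication by Weights plus Biases, with a non-linearity in between. The sketch below uses tiny invented numbers; a real model’s Weights are learned during training and number in the billions.

```python
def relu(x):
    """The standard non-linearity: negative signals are silenced."""
    return max(0.0, x)

def feed_forward(x, w1, b1, w2, b2):
    """A tiny two-layer block: expand with w1/b1, apply ReLU, project with w2/b2."""
    hidden = [relu(sum(xi * wij for xi, wij in zip(x, row)) + b)
              for row, b in zip(w1, b1)]
    return [sum(hi * wij for hi, wij in zip(hidden, row)) + b
            for row, b in zip(w2, b2)]

# Made-up Weights and Biases standing in for learned "institutional knowledge."
x  = [1.0, -1.0]                              # one token's input vector
w1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]     # expand 2 -> 3
b1 = [0.0, 0.0, 0.5]
w2 = [[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]]       # project 3 -> 2
b2 = [0.1, -0.1]

out = feed_forward(x, w1, b1, w2, b2)
print(out)
```

Notice that nothing here is an “If X, then Y” rule; the block’s behavior lives entirely in the numeric values of its Weights and Biases, which is why it cannot be audited like ordinary code.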

Neuron drawing with input layer, network data flow and weights.
FFN where thickness of neural connections represents weights. Image: Ralph Losey with Gemini AI.

E. The Softmax Verdict: Finally, the model uses a Softmax function to produce a probability list of every possible word in its vocabulary. It calculates the exact odds—for example, assigning “Court” an 85% probability and “Sandwich” a 0.01% probability—and then mathematically samples the winner to generate the next word. Since the Softmax Verdict generates words based on statistical odds rather than verified facts, it is crucial for lawyers to verify the output, which we will also discuss in more detail later in this article.
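The Softmax function itself is short enough to show in full. The candidate scores below are invented for illustration; the probabilities it produces are what the model samples from to choose the next word.

```python
import math

def softmax(logits):
    """Turn raw final-layer scores into probabilities that sum to 1."""
    m = max(logits.values())                       # subtract max for stability
    exps = {w: math.exp(s - m) for w, s in logits.items()}
    total = sum(exps.values())
    return {w: e / total for w, e in exps.items()}

# Invented final-layer scores for three candidate next words.
logits = {"Court": 6.0, "Judge": 4.0, "Sandwich": -3.0}
probs = softmax(logits)
for word, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(f"{word}: {p:.4f}")
```

With these toy scores, “Court” takes the overwhelming share of the probability while “Sandwich” is vanishingly unlikely, yet no candidate is ever truly at zero, which is exactly why verification matters.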

List of most likely words in descending order of probability
Softmax Verdict predicts “Court” to be the most likely next word. Image: Ralph Losey with Gemini AI.

5. The Tech-Minded Legal Professional Level: Navigating Probabilistic Advocacy

Definition: For the legal professional, Generative AI is not a database, but a Probabilistic Inference Engine. It does not “find” data in the traditional sense; it infers the most likely response based on the conceptual coordinates of your request and the mathematical “gravity” of the language it was trained on.

A. From Search to Inference

For fifty years, the legal industry’s relationship with technology was deterministic. Traditional legal databases use rigid logic gates: Does Document A contain Word X AND Word Y? If the words are present, it is a ‘hit’; if not, it is ignored, functioning as a simple ‘On/Off’ switch. The Transformer changes this completely. It is not a search database, but a Probabilistic Inference Engine. When you ask it to ‘analyze a witness’s credibility,’ it doesn’t just look for the word ‘credibility’; it infers a conclusion by weighing the context of every word in the record.
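The contrast between the On/Off switch and probabilistic-style scoring can be sketched as follows. The “concept vocabulary” here is a hand-made stand-in for the learned neighborhoods of Latent Space, not anything a real model stores explicitly.

```python
def boolean_hit(doc, terms):
    """Deterministic keyword search: a 'hit' only if every term is present."""
    words = set(doc.lower().split())
    return all(t in words for t in terms)

# A hand-made "concept neighborhood" standing in for Latent Space proximity.
credibility_concepts = {"credibility", "inconsistent", "contradicted", "recanted"}

def concept_score(doc):
    """Score by overlap with the concept neighborhood, not the exact keyword."""
    words = set(doc.lower().split())
    return len(words & credibility_concepts) / len(credibility_concepts)

docs = [
    "the witness contradicted her earlier statement and later recanted",
    "the parking invoice was paid on time",
]
hits = [boolean_hit(d, ["credibility"]) for d in docs]
scores = [concept_score(d) for d in docs]
print(hits)    # neither document contains the literal word "credibility"
print(scores)  # but the first scores high on the credibility *concept*
```

The Boolean search misses the key testimony entirely because the magic word never appears; the concept-based score surfaces it anyway. That, in miniature, is the shift from search to inference.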

On/off deterministic keyword search contrasted with a probabilistic word cloud showing connections.
Legal Tech Tools and Search Based on AI Probabilistic Analysis. Image: Ralph Losey with Gemini AI.

B. Navigating the Latent Space

To perform this analysis, the model navigates the Latent Space coordinates of your query. It uses the Self-Attention weights discussed in Level 4 to “infer” a conclusion by weighing the context of every word in the record. It identifies the “Intent” and “Sentiment” within millions of documents in a second. Such tasks were previously impossible for deterministic software.

C. The Weight of the Legal Oath

While the machine provides the “Magic Guesses” of a child and the “Neural Weights” of a scientist, it lacks the professional standing to be an advocate.

  • The Black Box as an Invitation: The “Black Box” is not an excuse for ignorance; it is an invitation to a higher level of legal practice.
  • The Human Validator: We use the machine to find the “needle” (the insight), but we use our human judgment to prove it is evidence and not a hallucination.
  • The Ultimate Weight: In this new era, the most important “Weight” in the entire system is the one held by the human professional.
Scales of justice in a high tech courtroom, black box lighter than human judgement (gavel).
Heavy Weight of the Legal Oath. Image: Ralph Losey with Gemini AI.

6. The “Growing, not Building” Concept: The Genesis of the Black Box

To understand why even the creators of these models cannot always explain a specific output, we have to understand that AI is trained into complexity, rather than just hard-coded with logic.

  • The Old World of Software: In the past, we built programs based on rigid, transparent logic. If the code said “If X, then Y,” but it did something else, it was a “bug” to be corrected within a deterministic machine.
  • The New World of Generative AI: This technology is created through Self-Supervised Learning. We don’t provide the model with logic blueprints; instead, we provide an ocean of data and a single objective: “Predict the next piece of information.”
  • The “Growth” of Intelligence: The model then “grows” its own internal pathways—billions of connections known as Weights and Biases—to satisfy that objective.

Think of it like a massive vine growing through a lattice. As engineers, we provide the lattice (the Transformer architecture), but the vine (the intelligence) grows itself. By the time training is finished, there are hundreds of billions of connections. There is no “Master Code” for a human to read or audit. The “Black Box” is not a wall; it is a forest so dense that no human can map every leaf.

In the era of AI Entanglement, we must judge the AI by its results (the fruit) rather than its process (the roots).

Tree of life like entangled vines.
Image: Ralph Losey with Gemini AI.

7. The “Context Window” as a Trial Record

In the computer scientist level we discussed the Transformer’s ability to look at a whole document simultaneously. In practice, this capability is governed by the Context Window. In AI, the Context Window is the specific amount of data the model can “Attend” to at any one time. When you upload a 100-page contract, the AI holds that text in a temporary “workspace.”

The Judicial Analogy: Think of the Context Window as a judge’s Active Memory during a hearing.

The Risk of Loss: If a trial lasts for ten days, but the judge can only remember the last two hours of testimony, they will lose the thread of the case.

Hallucination via Omission: They might “hallucinate” a fact not because they are lying, but because they have lost the beginning of the record.

Legal Strategy: For the tech-minded lawyer, you must manage the “Active Record” of your conversation to ensure the model maintains access to critical early facts. In a similar way, a judge relies on a court reporter who makes a transcript of the record to ensure nothing is lost to the passage of time.
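Managing the “Active Record” can be as simple as pinning the critical early facts while trimming older turns to fit the budget. This is a simplified sketch (real systems count tokens rather than words, and often summarize old turns instead of dropping them):

```python
def fit_to_window(messages, max_tokens, count=lambda m: len(m.split())):
    """Keep the first message (the critical early facts) pinned, then add
    the most recent messages that still fit the budget (simplified sketch)."""
    pinned, rest = messages[0], messages[1:]
    budget = max_tokens - count(pinned)
    kept = []
    for msg in reversed(rest):  # walk from newest to oldest
        cost = count(msg)
        if cost > budget:
            break
        kept.append(msg)
        budget -= cost
    return [pinned] + list(reversed(kept))

history = [
    "Key fact: the contract was signed on March 3",
    "day one of testimony " * 3,
    "day two of testimony " * 3,
    "closing question about the signing date",
]
kept_history = fit_to_window(history, max_tokens=25)
print(kept_history)  # the pinned fact survives; the middle testimony is trimmed
```

Like the court reporter’s transcript, the pinned message guarantees the beginning of the record is never lost, no matter how long the hearing runs.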

Court reporter and judge in digital age.
Image: Ralph Losey with Gemini AI.

8. Anatomy of a Hallucination

A “Case Study” of a hallucination, viewed through the lens of Latent Space, will help us understand how these errors arise.

Suppose you ask an AI for a case supporting a specific point of Florida law. The AI navigates to the “Neighborhood” of Florida Law and the “Street” of that specific legal issue. It sees a cluster of real cases—Smith v. Jones and Doe v. Roe.

Because it is a Probabilistic Inference Engine, the AI doesn’t naturally “check” a verified list of real cases. Instead, it follows the mathematical pattern of how Florida cases are typically named and cited.

The AI then “generates” Brown v. State—a case that sounds perfectly correct because its coordinates are exactly where a real case should be based on the surrounding patterns. It has followed the statistical “gravity” of the neighborhood, but it has drifted into a sequence of words that is factually untethered from reality.

It is a perfectly logical mathematical guess that happens to be a factual lie. This is the primary reason why we must cross-examine our assistants. We use our human judgment to prove the output is a needle of truth and not a hallucination of the “Black Box.” See Cross-Examine Your AI: The Lawyer’s Cure for Hallucinations (12/17/25).
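The cross-examination step can itself be partly automated: extract every citation-shaped string from the AI’s answer and flag any that is not on a verified list. The case names below are hypothetical, and the regular expression is deliberately crude; a real workflow would check each citation against an authoritative reporter or citator.

```python
import re

# A verified list of real authorities (the names here are hypothetical).
verified_cases = {"smith v. jones", "doe v. roe"}

def check_citations(ai_answer):
    """Flag every 'X v. Y' pattern not found on the verified list."""
    cited = re.findall(r"\b[A-Z][a-z]+ v\. [A-Z][a-z]+\b", ai_answer)
    return [c for c in cited if c.lower() not in verified_cases]

answer = "Under Smith v. Jones and Brown v. State, the motion should be denied."
flagged = check_citations(answer)
print(flagged)  # ['Brown v. State'] -- the citation to cross-examine
```

The script cannot tell you the law; it can only tell you which “neighbors” in the answer were never verified to exist. The final judgment remains with the human validator.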

Street from above, with buildings representing cases, along "precedent avenue"
Latent Space Can Generate AI Hallucinations. Image: Ralph Losey with Gemini AI.

Conclusion: A Symphony of Five Understandings

We have traveled from the magic toy box to the multi-dimensional math of the Transformer. To close, let’s look at the “Black Box” one last time through all five lenses.

The Smart Child sees a magic friend who is the best guesser in the world. To the child, the lesson is simple: the magic friend is fun, but sometimes they make up stories. Enjoy the story, but don’t bet your lunch money on it.

The High Schooler sees a massive “Autocomplete” engine. They understand that the AI is just a mirror of everything we’ve ever written. The lesson: the mirror is only as good as the light you shine into it.

The College Graduate sees the “Latent Space”—a map of human culture turned into math. They realize that meaning is not found in isolated words, but in the mathematical distance and relationship between them.

The Computer Scientist sees the Decoder-only Transformer—a masterpiece of matrix multiplication and Self-Attention weights. They know that “thinking” is just the sound of billions of Query and Key vectors finding their mathematical match.

The Tech-Minded Legal Professional—the “Human in the Loop”—sees a revolution. We see a tool that can navigate the “Intent” and “Sentiment” of millions of documents in a heartbeat using Probabilistic Inference. But we also see the weight of our professional oath.

Five people representing the faces of the black box
Five Faces of the Black Box. My choices. My direction. Writing and images assisted by Gemini AI.

Our New Role: From Searcher to Validator. Electronic discovery professionals are no longer just “Searchers” of data; we are the Validators of a new, probabilistic reality.

We are the ones who must take the “Magic Guesses” of the child, the “Statistical Patterns” of the high schooler, the “Latent Map” of the college graduate, and the “Neural Weights” of the scientist, and forge them into Evidence.

The “Black Box” is not an excuse for ignorance; it is an invitation to a higher level of practice. We use the machine to find the needle, but we use our human judgment to prove it is a needle and not a hallucination.

In the era of AI Entanglement, the most important “Weight” in the entire system is the human in charge: You.

In the era of ai entanglement, the most important "weight" is the human in charge: You.  Superwoman in robot form.
Assume your place in the AI command chair. Image: Ralph Losey with Gemini AI.

Ralph Losey Copyright 2026 – All Rights Reserved. Originally published on edrm.net.
Assisted by GAI and LLM Technologies per EDRM’s GAI and LLM Policy.

Author

  • Ralph Losey

Ralph Losey is a lawyer, tech researcher, educator and writer. After 45 years of legal practice with several local and national law firms, Ralph retired in 2026. He continues his service to the profession as CEO of Losey AI, LLC, providing non-legal educational services on AI and quantum law.

    Ralph has long been a leader among the world's tech lawyers. He has presented at hundreds of legal conferences and CLEs around the world and written over two million words on AI, e-discovery, quantum and other tech-law subjects, including seven books.

    Ralph has been involved with computers, software, legal hacking, and the law since 1980. Ralph had the highest peer AV rating as a lawyer and was consistently selected as a Best Lawyer in America in four categories: E-Discovery and Information Management Law, Information Technology Law, Commercial Litigation, and Employment Law - Management. For his full resume and list of publications, see his e-Discovery Team blog.

    Ralph has been married to Molly Friedman Losey, a mental health counselor in Winter Park, since 1973 and is the proud father of two children and grandfather of two more.

    View all posts