Intelligence is Intelligence, even if it’s Artificial

Insight categories: AI and ML, Perspective, Technology

I had a stimulating conversation with the head of our GenAI practice, Suhail Khaki, a few weeks ago. Suhail remarked that the more he works with GenAI, the more it strikes him that it’s less like conventional computer software and more like a person in the way it interacts. He summed it up: “Intelligence is Intelligence”. That got me thinking: a lot of so-called “issues” with GenAI are actually attributable to the fact that it’s modeled on the way people think. It’s really not GenAI that’s to blame—to a large extent, it’s just surprisingly good at behaving the way we humans do.

If someone asks you the same question on two different occasions, how likely is it that you will give exactly the same answer, word-for-word each time? You won’t, unless it’s some memorized speech. If you ask two different developers to implement the same algorithm, how likely is it that each of them will write exactly the same code? It won’t happen. They may both get it ‘right’, but the two programs will be different—slightly, or even radically.

So why does it surprise and frustrate us when GenAI behaves exactly the same way? Humans give different responses to the same question because many variables influence our behavior, including what we ate for breakfast that morning, who our audience is, how the question was phrased (including intonation), and what we’ve learned and thought about between the first iteration of the question and the second. GenAI has different factors that influence it—no need for it to eat breakfast, yet—but it essentially behaves in a ‘human’ way when it gives a different answer to the same question. Similarly for coding. There are many correct answers to the same software development problem. Which one a given developer picks, or which one the same developer picks on different occasions, is determined by a lot of internal and external variables, not least of which is the sum of our previous experiences and training.

What we call “hallucinations” in GenAI, likewise, are also common to us human beings. In the US, politicians on both sides of the aisle give us ample demonstrations of made-up facts to cover lapses of memory or inconvenient truths. We can argue about whether these political misstatements are deliberate or not, but sometimes human hallucinations occur with no bad intent at all. An elderly woman I knew had vascular dementia, a brain condition that cuts off access to certain memories or faculties. Her intelligence, however, was largely unaffected. If you asked her about her day, she would happily make up a story about activities that, on the surface, sounded very plausible—but in fact never occurred. I don’t believe for a moment that she did this intentionally, or with any attempt to deceive. Instead, absent the actual facts in her memory, I think her brain creatively generated a response that sounded plausible, but was unfiltered by the truth. She was not diagnosed until she was formally interviewed by a psychologist who asked her objectively factual questions, such as the names of her children. It was only then that it became obvious that she had a medical condition, and that her responses in normal conversation were largely made up.

While I’m not a psychologist, I suspect that human intelligence, when denied access to appropriate information but finding itself in circumstances compelling a real-time response, tends to fill in the blanks—or make stuff up. We’d prefer our politicians and my elderly friend with vascular dementia to simply say “I’m sorry, I don’t know”, “I’d rather not say”, or “I don’t remember”. But where the person feels an imperative to give an answer regardless of missing or internally suppressed information, we get “fake news”, false memories or hallucinations. The same is true of GenAI—it defaults to a plausible-sounding but invalid response when it can’t find an accurate one.

My wife is a psychologist, and she tells me that in the human brain there is a concept called “filling in the missing gestalt”. The brain tries different strategies and options to fill in missing data. This presentation of options contributes to human creativity and problem-solving behavior. We’ve all experienced this when we’ve been puzzled trying to solve a problem, and then suddenly the answer comes to us. This happens largely subconsciously, below our level of awareness. When our brain doesn’t sufficiently reject the wrong alternatives, we get human confabulation to “fill in the blanks”, even though the best option might be to ‘leave it blank’ and say you don’t know. But where our brains make a good choice among the generated alternatives, we get originality, spontaneity and invention.

In an LLM, this is controllable to some degree by setting a parameter called the “temperature”, which, essentially, governs the degree of randomness used to generate alternative responses. While lowering the temperature limits hallucinations in an LLM, it also reduces the number of good alternatives that are being considered. The downside of fewer alternatives is that the ‘better’ and ‘best’ alternatives may not be generated at all, limiting the effective ‘intelligence’ of the AI. Rather than suppressing the generation of alternatives, the right answer, in my view, is better filtration of multiple generated alternatives. Indeed, a number of GenAI startups are working on hallucination prevention by intelligently filtering generated responses. But the generation of alternative responses, even wrong ones, is actually a characteristic of human-type intelligence itself—it’s a “feature”, not a “bug”. We’re just at a relatively early state of the art in terms of filtration—though I’m convinced that is coming.
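For readers curious what the temperature knob actually does, here is a minimal, illustrative Python sketch of temperature-scaled sampling over a model’s output scores. It is not any particular vendor’s API; the numbers are toy values chosen only to show how low temperature collapses the choices while high temperature keeps more alternatives in play.

```python
import numpy as np

def sample_with_temperature(logits, temperature=1.0, rng=None):
    """Pick one option from a set of model scores ("logits"), scaled by temperature.

    Lower temperature sharpens the distribution (more deterministic, fewer
    alternatives considered); higher temperature flattens it (more randomness,
    more alternatives get a real chance of being chosen).
    """
    rng = rng or np.random.default_rng()
    scaled = np.asarray(logits, dtype=float) / max(temperature, 1e-8)
    scaled -= scaled.max()          # stability shift before softmax
    probs = np.exp(scaled)
    probs /= probs.sum()
    return rng.choice(len(probs), p=probs)

# Toy example: three candidate "next words" with different scores.
logits = [2.0, 1.0, 0.1]
print(sample_with_temperature(logits, temperature=0.2))   # almost always picks index 0
print(sample_with_temperature(logits, temperature=1.5))   # indices 1 and 2 appear far more often
```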

Why do these ‘human’ inconsistencies and confabulations surprise and annoy us when they come from GenAI? Most of us have grown up with computers. While they can be frustrating or bewildering to deal with at times, computers are also predictable. In particular, when programmed, computers do the same thing in the same way every time, and consistently give you the same answer to the same question. We experience computers as machines or ‘robotic’ (in the narrow sense) in the interactions we have with them.

GenAI is not that way. While it runs on a machine, it acts in important ways more like a person. Compared to a programmatic device, GenAI is relatively unpredictable and inconsistent.

I would argue that the unpredictability and inconsistency of GenAI is an essential feature of any intelligence that tries to emulate, in some respects, the human brain. Perhaps inconsistency is a feature of intelligence in general. It may not be a feature we always like, but if we want the advantages of intelligence in our machines, I think we will also learn to make do with its quirks.

Does that mean we can’t use GenAI for useful work? I would argue that, despite our own foibles and lapses, we have used fallible people to do useful work for many, many generations. We can follow some of these same practices in using GenAI.

When managing people, we often have multiple specialists assigned to different aspects of the same activity. Often the work is overseen by a manager, who ensures the consistency and quality of the output. For critical tasks, we have documented procedures that people are required to follow. And in emergency situations, or those requiring real-time body control (like sports), we fall back on training. Trained responses are those where people follow pre-defined or pre-learned guidelines—essentially programming—automatically, and largely without thinking. These same principles of human work can be, and are being, applied to GenAI today.

Consciously or subconsciously, analogs to human organization are being developed and applied to GenAI today, with more in the works. “Ensembles” of specialized LLMs are being orchestrated by “agents” and other technologies to leverage the strengths of each model, analogous to a human team with complementary skillsets. Like a human supervisor, management approaches such as “LLMs for LLMs” and programmatic analysis of model outputs are emerging to filter and evaluate the quality of an AI’s output. These managers can also trap hallucinations, and send the AI—or team of AIs—back to the drawing board to come up with a better answer. For critical or end-user facing tasks, implementations may combine the best features of programmed approaches and GenAI models. For example, a customer support application might use Dialogflow [https://cloud.google.com/dialogflow] for the structured element of dialogs, together with one or more LLMs for call routing, information gathering and results summarization.
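To make that supervisory pattern a little more concrete, here is a rough Python sketch of one model reviewing another’s output and sending rejected drafts back for another attempt. The function `call_llm` (and the wrapper `answer_with_review`) are hypothetical placeholders standing in for whichever LLM client your stack provides, not any specific product’s API.

```python
def call_llm(prompt: str) -> str:
    # Hypothetical placeholder: wire this to your LLM provider of choice.
    raise NotImplementedError("Connect this to a real LLM client.")

def answer_with_review(question: str, max_attempts: int = 3) -> str:
    feedback = ""
    for _ in range(max_attempts):
        draft = call_llm(
            "Answer the question below. If you do not know, say so.\n"
            f"Question: {question}\n{feedback}"
        )
        # A second model plays the role of the human supervisor,
        # checking the draft for unsupported or made-up facts.
        verdict = call_llm(
            "You are reviewing another model's answer for made-up facts.\n"
            f"Question: {question}\nAnswer: {draft}\n"
            "Reply PASS if the answer is well supported; otherwise explain the problem."
        )
        if verdict.strip().upper().startswith("PASS"):
            return draft
        # Send the AI back to the drawing board with the reviewer's notes.
        feedback = f"A reviewer rejected your previous answer: {verdict}\nPlease try again.\n"
    return "I'm sorry, I don't know."  # prefer an honest blank over a confabulation
```

The design choice worth noting is the final line: when the reviewer keeps rejecting drafts, the system gives the honest “I don’t know” we wish for from people, rather than returning its most plausible-sounding confabulation.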

The final frontier is, perhaps, machine or industrial control systems, or control of life-critical real-time systems. For these systems, we need deterministic outputs. Creativity may be useful in some situations, but even with humans in the loop, we generally use trained responses and documented step-by-step procedures that we expect people to robotically follow. This is because in an emergency there is rarely time or mental energy to improvise—and documented procedures have been researched, vetted and tested. The robotic following of directions is probably the least human thing that we do, but it’s necessary sometimes—for example, in an emergency situation like steering your car out of a skid when it slips on the ice. Improvising from scratch is the wrong approach in that case—we’re better off if we have trained ourselves to turn into the skid to regain control, without having to process the physics in real-time. For activities like sports, piloting an aircraft in an emergency, and other real-time decision-making, learned and trained skills are an important foundation. Creativity is still beneficial at times, but only on top of a firm foundation of learned skills.

Like human trained responses to emergency or real-time sports situations, control systems operated by computer tend to be more automatic, rules-based and deterministic. That’s not to rule out AI entirely. We already have good examples of conventional, non-GenAI models playing an important role in such systems: for example, your car’s advanced driver assistance controls for lane-following, collision avoidance and adaptive cruise control have important aspects that are AI-based. My experience is that these AI-based systems add substantially to my safety. However, I think all of us, myself included, would hesitate before putting our lives in the hands of a technology that is subject to hallucinations. On the other hand, I drove myself for many years without any driver assistance at all, and my fallible human brain still enabled me to survive. So maybe—properly supervised—GenAI has a role here too. After all—intelligence is intelligence, even if it’s artificial.
