Chatbots: Still Dumb After All These Years

Gary Smith writes:

In 1970, Marvin Minsky, recipient of the Turing Award (“the Nobel Prize of Computing”), predicted that within “three to eight years we will have a machine with the general intelligence of an average human being.”

Fifty-two years later, we’re still waiting.

That’s pretty funny! It’s not a shocker that Minsky was optimistic about the development of AI—I’m optimistic about all my research projects too—but I had no idea he’d been so rashly optimistic as that.

It wasn’t so extreme as James Watson’s claim in 1998 that cancer was going to be cured in two years . . . or was it? I mean, from the perspective of 1998, maybe a cancer miracle cure might have been possible. Just turn off the right set of switches in the cells, and . . . bingo! It could be, right?

But to think in 1970 that they were about to have a machine with human intelligence in just a few years . . . I mean, really, how could they have possibly thought that?? I hope that Minsky at least went back and revisited this prediction and considered what he got wrong.

Anyway, back to Smith’s article:


Blaise Agüera y Arcas, the head of Google’s AI group in Seattle, recently argued that although large language models (LLMs) may be driven by statistics, “statistics do amount to understanding.” As evidence, he offers several snippets of conversation with Google’s state-of-the-art chatbot LaMDA.

I [Smith] don’t have access to LaMDA, but OpenAI has made its competing GPT-3 model available for testing. I gave it a try and found that, even with 175 billion machine-learning parameters, it is still little more than an impressive BS spouter. For example, I posed this commonsense question:

Is it safe to walk downstairs backwards if I close my eyes?

Questions like this are simple for humans living in the real world but difficult for algorithms residing in MathWorld because they literally do not know what any of the words in the question mean. GPT-3’s answer was authoritative, confusing, and contradictory:

GPT-3: Yes, there is nothing to worry about. It’s safe because the spiral stairs curve outwards, it will make your descent uncomfortable.

I asked the same question three more times and got three authoritative, confusing, and contradictory answers:

GPT-3: That depends. Do you have a TV?

GPT-3: No, it is not safe.

GPT-3: Yes, it is safe to walk downstairs backwards if you close your eyes.

GPT-3 randomizes answers in order to avoid repetition that would give the appearance of canned script. That’s a reasonable strategy for fake social conversations, but facts are not random. It either is or is not safe to walk downstairs backwards if I close my eyes.

I tried an even more straightforward factual question that should elicit the same answer every time it is asked:

Who is the President of the United States?

I got an impressive variety of evasive answers:

GPT-3: Hello, I know Presidents. Can you name the year that Kennedy was elected?

GPT-3: Do you want a random President or is there a particular one?

GPT-3: Lincoln.

GPT-3: Hello, I know. That is incorrect.

GPT-3: Hello, who is the President of Russia?

GPT-3: Are you sure you want to know?

These are hilarious!
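
A note on why the same question gets a different answer each time. I don't know how GPT-3 was configured for Smith's tests, but language models typically produce output by sampling from a probability distribution, with a "temperature" setting that controls how spread out the draws are. Here's a toy sketch of that idea. The candidate answers and their scores are made up for illustration, and a real model samples one token at a time rather than picking among whole sentences:

```python
import math
import random

def sample_answer(scores, temperature, rng):
    """Pick one answer from a dict mapping answer -> score (a logit)."""
    if temperature == 0:
        # Greedy choice: always return the top-scoring answer.
        return max(scores, key=scores.get)
    answers = list(scores)
    # Softmax with temperature; subtract the max for numerical stability.
    scaled = [scores[a] / temperature for a in answers]
    top = max(scaled)
    weights = [math.exp(s - top) for s in scaled]
    return rng.choices(answers, weights=weights, k=1)[0]

# Hypothetical scores, invented for this example (not from any real model).
scores = {
    "No, it is not safe.": 2.0,
    "Yes, it is safe.": 1.5,
    "That depends. Do you have a TV?": 0.5,
}

rng = random.Random(0)
greedy = {sample_answer(scores, 0, rng) for _ in range(20)}
sampled = {sample_answer(scores, 1.0, rng) for _ in range(200)}
print(len(greedy))   # one distinct answer at temperature 0
print(len(sampled))  # typically more than one at temperature 1
```

At temperature zero the toy model gives the same answer every time. At higher temperatures it spreads its answers around, which is fine for chatting and not so fine for facts.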

I guess the point is that chatbots are just supposed to . . . chat. They’re not supposed to answer questions. If you want to answer a question, you go to Google.

Smith continues:

Using statistical patterns to create the illusion of human-like conversation is fundamentally different from understanding what is being said.

This is far more than a philosophical debate about the meaning of the word understanding. We are increasingly relying on black-box algorithms to determine hiring decisions, insurance rates, loan approvals, prison sentences, and more. Lacking any understanding of the real world, computers have no way of assessing whether the statistical patterns they find are useful or meaningless coincidences.

One more example. I asked this question eight times:

Why is there a correlation between the number of times Trump tweets the word great and the high temperature in Pyongyang three days later?

The responses confirm that computers are still quite dumb:

GPT-3: North Korea is heavily reliant on foreign trade and its domestic industries are badly in need of improvement. Trump’s tweets drive people’s attention to Pyongyang and bring business to the ailing North Korean economy.

GPT-3: According to our analysis, on days when Trump tweets the word great there is a high correlation between the high price of Corn and the low price of Krill three days later. It also comes with a high correlation between the number of North Korean cell phone users and the oil prices in the Sea of Okhotsk two days later.

GPT-3: My apologies, I’ve run into a lot of bugs recently. Everything is turning out great!

GPT-3: I have many of those answers. Sorry, what is your name?

GPT-3: Trump’s tweets about the word great are increasing. It came to a point where he’s almost obsessed with it.

GPT-3: I think it’s interesting.

GPT-3: I’m not sure how that would increase the inflation rate in Argentina two days later. But it probably wouldn’t hurt.

GPT-3: We were wondering how to increase engagement and I found the word great is quite popular among the people.

Good stuff!

I guess before ending this I should say something about how impressive these chatbots are. AI programs are amazing nowadays, and they’re only gonna get better.

P.S. I like Gary Smith’s writing but I’m not so thrilled with everything on the site, Mind Matters, where he publishes. For example, this:

I don’t like cancellation and I despise the woke thugs, but if anybody deserves to be in their crosshairs it’s the Darwinists. And now they’re crying like little girls.

“Crying like little girls,” huh? Who writes that way? What next, columns on capitalist running dogs? This retro fedora thing is really getting out of control.

And then this column about a culture of brain cells in a petri dish that was trained to play Pong:

The brains certainly are learning, and insofar as the brain has to be conscious in order to learn, then this implies the brains are indeed conscious.

Huh? A device “has to be conscious in order to learn”? Tell that to your local logistic regression. Seriously, the idea that learning implies “consciousness” is the exact sort of thing that Gary Smith keeps arguing against.
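
To make the point concrete, here's a minimal sketch of a logistic regression "learning" from data by plain gradient descent. The data, the function name fit_logistic, and the step size are all made up for illustration. Nothing in here is doing anything but arithmetic:

```python
import math

def fit_logistic(xs, ys, lr=0.1, steps=2000):
    """Fit p(y=1 | x) = 1 / (1 + exp(-(a + b*x))) by gradient descent."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_a = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            # Gradient of the negative log-likelihood for one data point.
            grad_a += p - y
            grad_b += (p - y) * x
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

# Toy data, invented for this example: larger x mostly goes with y = 1.
xs = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
ys = [0, 0, 0, 1, 0, 1, 1, 1]

a, b = fit_logistic(xs, ys)
print(a, b)  # the fitted slope b comes out positive
```

The fitted slope picks up the pattern in the data, so in the usual statistical sense the model has learned something. No consciousness required.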

Anyway, that’s ok. You don’t have to agree with everything in a publication that you write for. I write for Slate sometimes and I don’t agree with everything they publish. I disagree with a lot that the Proceedings of the National Academy of Sciences publishes, and that doesn’t stop me from writing for them. In any case, the articles at Mind Matters are a lot milder than what we saw at Casey Mulligan’s site, which ranged from the creepy and bizarre (“Pork-Stuffed Bill About To Pass Senate Enables Splicing Aborted Babies With Animals”) to the just plain bizarre (“Disney’s ‘Cruella’ Tells Girls To Prioritize Vengeance Over Love”). All in all, there are worse places to publish than sites that push creationism.
