Apple warns: GenAI still isn’t very smart

Filling the void in the few hours before WWDC begins, Apple’s machine learning team raced out of the gate with a report designed to make people think twice about artificial intelligence, arguing that while the intelligence is artificial, it’s only superficially smart.

Some seem to think that Apple is attempting to mask its slow progress in AI development as its competitors push ahead toward artificial general intelligence (AGI). Others warn that perhaps we should all think about these findings, given the speed with which the technology is being deployed across almost every part of our society. Are we really becoming dependent on tech that doesn’t really work and will simply repeat the prejudices of its owners?

Because that’s what you get when you rely on what is essentially the glorified pattern matching Apple’s researchers describe in their latest paper, “The Illusion of Thinking.”

The Illusion of Thinking

In the 32-page paper, Apple’s machine learning experts argue that despite the impressive results they can achieve, AI models aren’t actually capable of authentic reasoning, relying more on pattern matching than deduction. For its analysis, the team tested models from OpenAI, Google DeepMind, Anthropic, and DeepSeek.

“Recent generations of frontier language models have introduced Large Reasoning Models (LRMs) that generate detailed thinking processes before providing answers,” they write. “Through extensive experimentation across diverse puzzles, we show that frontier LRMs face a complete accuracy collapse beyond certain complexities. Moreover, they exhibit a counterintuitive scaling limit: their reasoning effort increases with problem complexity up to a point, then declines despite having an adequate token budget.”

They also claim that while both kinds of system have their strengths, standard LLMs win at low-complexity tasks, LRMs do better on medium-complexity tasks, and both types of model “experience complete collapse” when handling high-complexity tasks.

“We found that LRMs have limitations in exact computation: they fail to use explicit algorithms and reason inconsistently across puzzles. We also investigate the reasoning traces in more depth, studying the patterns of explored solutions and analyzing the models’ computational behavior, shedding light on their strengths, limitations, and ultimately raising crucial questions about their true reasoning capabilities,” the researchers wrote.
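The puzzles are the point: games like Tower of Hanoi, one of the tasks the paper uses, have solutions that can be verified mechanically, and their difficulty scales predictably (n disks need 2^n − 1 moves), so “complexity” becomes a single dial. As a rough illustration of why that makes reasoning measurable, here is a minimal sketch in Python. It is a toy harness of my own, not Apple’s evaluation code, and the textbook solver stands in for a model’s answer.

```python
# Minimal sketch: puzzles like Tower of Hanoi suit this kind of study
# because answers are mechanically checkable and difficulty is a single
# dial (n disks require 2**n - 1 moves). Illustrative only; this is not
# Apple's actual evaluation harness.

def check_hanoi(moves: list[tuple[int, int]], n_disks: int) -> bool:
    """Verify that a proposed move list solves n-disk Tower of Hanoi (pegs 0-2)."""
    pegs = [list(range(n_disks, 0, -1)), [], []]  # peg 0 holds all disks, largest at bottom
    for src, dst in moves:
        if not pegs[src]:
            return False                          # illegal: moving from an empty peg
        disk = pegs[src][-1]
        if pegs[dst] and pegs[dst][-1] < disk:
            return False                          # illegal: larger disk onto a smaller one
        pegs[dst].append(pegs[src].pop())
    return len(pegs[2]) == n_disks                # solved only if all disks reach peg 2

def solve_hanoi(n: int, src: int = 0, aux: int = 1, dst: int = 2) -> list[tuple[int, int]]:
    """Textbook recursive solution: exactly 2**n - 1 moves."""
    if n == 0:
        return []
    return solve_hanoi(n - 1, src, dst, aux) + [(src, dst)] + solve_hanoi(n - 1, aux, src, dst)

# In a real study, 'moves' would be parsed from a model's answer; here the
# explicit algorithm stands in, showing how accuracy can be scored exactly.
for n in (3, 7, 10):
    moves = solve_hanoi(n)
    print(f"{n} disks: {len(moves)} moves, solved = {check_hanoi(moves, n)}")
```

Because correctness is scored programmatically, a model’s accuracy can be plotted against that complexity dial, which is how the researchers can speak of a “complete accuracy collapse” beyond a certain point.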

It raises the critical question: are the genAI systems we think are capable of reasoning actually doing so, or have they simply been taught to appear to reason, even when they aren’t?

Between AI ideas and reality, falls the shadow

To prove its point, Apple shows how existing systems can be fed irrelevant or fake data they don’t recognize, leading to failures. The implication is that while these machines are good at identifying and repeating the patterns they know, genuine creative problem-solving is beyond their abilities.
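To make that failure mode concrete, here is a hedged sketch. The kiwi problem echoes an example reported around Apple’s earlier GSM-Symbolic research; query_model and its canned replies are placeholders of my own, not a real API, standing in for whatever model you would actually call.

```python
# Hedged sketch of the "irrelevant data" failure described above.
# query_model is a placeholder, not a real API: its canned replies mimic
# the behavior reported around Apple's earlier GSM-Symbolic work, where
# models subtracted a number that had no bearing on the arithmetic.

def query_model(prompt: str) -> str:
    # Swap in a real model call here; canned output for illustration only.
    return "185" if "smaller than average" in prompt else "190"

base = ("Oliver picks 44 kiwis on Friday and 58 on Saturday. On Sunday he "
        "picks double the number he picked on Friday. How many kiwis does "
        "Oliver have?")
distractor = " Five of Sunday's kiwis were a bit smaller than average."

# The correct answer is 44 + 58 + 88 = 190 either way; the distractor
# changes nothing. A system doing genuine deduction ignores it; a pattern
# matcher that has learned "subtract the odd number out" often does not.
for prompt in (base, base + distractor):
    print(query_model(prompt))
```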

AI struggles with abstraction and generalization, which means the notion that we’re anywhere near AGI doesn’t hold up; it’s not going to happen anytime soon.

Apple isn’t alone in warning against the hype around AI. Critic Gary Marcus has long been raising red flags about the true intelligence of these systems, pointing to their over-reliance on training data. That reliance means that while we’re promised genAI will change everything and make everything better and more productive, it probably won’t achieve anything like those promises. The gulf between promise and reality is already driving many in the industry to reassess their expectations.

What this means, of course, is that rather than becoming reliant on these systems, it makes more sense to remain pragmatic and develop a hybrid approach, both toward AI deployment and AI development.

Hold the door

Apple would say this, I suppose. We know it is facing unexpected challenges realizing the promises it made for Apple Intelligence at WWDC last year. “We need more time to complete our work on these features, so they meet our high quality bar,” Apple CEO Tim Cook said during last month’s earnings call. “We are making progress, and we look forward to getting these features into customers’ hands.”

It’s not impossible that part of the reason Apple has delayed the introduction of some AI features is quite simply that they don’t work as they should. It’s possible this reflects what Apple’s research shows, with the abilities of Apple Intelligence constrained by the same challenges its report describes.

To some degree, Apple Intelligence doesn’t matter. What is important is that any move to bet your entire future, or that of your business, on AI is doomed to fail right now, if only because, at root, even automated pattern matching will eventually repeat the errors of the past. In the future? Who knows. The effort to create machines with human-level intelligence is evidently underway, though most humans will probably prove more than a little resistant to being replaced by machines.

In the meantime, Apple will bring us a rebranded Siri, APIs so developers can build Apple Intelligence into their apps, and more partnerships with AI providers; the open questions are whether Apple can evolve an iteration of AI that gives it a truly competitive edge, and how long the company can play for time to achieve that.

You can follow me on social media! Join me on BlueSky, LinkedIn, and Mastodon.