Apple researchers: Mainstream AI models still fall short of the reasoning level expected for AGI.

According to Gate News, Apple researchers argued in a paper titled "The Illusion of Thinking," published in June, that leading artificial intelligence (AI) models still struggle with reasoning, and that the race to develop artificial general intelligence (AGI) therefore still has a long way to go.

The paper notes that the latest updates to mainstream large language models (LLMs), such as OpenAI's ChatGPT and Anthropic's Claude, have added large reasoning models (LRMs), but that their fundamental capabilities, scaling behavior, and limitations "are still not fully understood."

Current evaluations focus mainly on established mathematical and coding benchmarks, "emphasizing the accuracy of the final answer." The researchers argue that this approach does not probe the models' actual reasoning capabilities, standing in stark contrast to expectations that general artificial intelligence could be achieved within just a few years.

To go beyond standard mathematical benchmarks, the researchers designed a set of puzzle games to test the "thinking" and "non-thinking" variants of Claude Sonnet, OpenAI's o3-mini and o1, and the DeepSeek-R1 and V3 chatbots.

They found that "state-of-the-art large reasoning models (LRMs) face a complete collapse in accuracy beyond a certain level of complexity," failing to generalize their reasoning effectively, with their advantages diminishing as complexity grows, contrary to expectations about the capabilities of artificial general intelligence (AGI).

Source: Cointelegraph
