{"id":1540774,"date":"2025-06-09T21:10:00","date_gmt":"2025-06-10T01:10:00","guid":{"rendered":"https:\/\/bugaluu.com\/news\/?p=1540774"},"modified":"2025-06-09T21:10:00","modified_gmt":"2025-06-10T01:10:00","slug":"ai-models-still-far-from-agi-level-reasoning-apple-researchers","status":"publish","type":"post","link":"https:\/\/bugaluu.com\/news\/ai-models-still-far-from-agi-level-reasoning-apple-researchers\/1540774\/","title":{"rendered":"AI Models Still Far From AGI-Level Reasoning: Apple Researchers"},"content":{"rendered":"<p><span class=\"field field--name-title field--type-string field--label-hidden\">AI Models Still Far From AGI-Level Reasoning: Apple Researchers<\/span><\/p>\n<div class=\"clearfix text-formatted field field--name-body field--type-text-with-summary field--label-hidden field__item\">\n<p><a href=\"https:\/\/cointelegraph.com\/news\/artificial-general-intelligence-long-way-off-apple\"><em>Authored by Martin Young via CoinTelegraph.com,<\/em><\/a><\/p>\n<p>The race to develop artificial general intelligence (AGI) still has a long way to run, according to Apple researchers who found that leading AI models still have trouble reasoning.\u00a0<\/p>\n<p><a href=\"https:\/\/cms.zerohedge.com\/s3\/files\/inline-images\/01975288-e6e9-7b63-a0a0-7e4ab94e.jpg?itok=3RseTkyV\"><\/a><\/p>\n<p>Recent updates to leading AI large language models (LLMs) such as OpenAI\u2019s ChatGPT and\u00a0<a href=\"https:\/\/cointelegraph.com\/news\/anthropic-launches-latest-ai-whistleblowing-backlash\">Anthropic\u2019s Claude<\/a>\u00a0have included large reasoning models (LRMs), but their fundamental capabilities, scaling properties, and limitations \u201cremain insufficiently understood,\u201d said the Apple researchers in a June\u00a0<a href=\"https:\/\/machinelearning.apple.com\/research\/illusion-of-thinking\">paper<\/a>\u00a0called \u201cThe Illusion of Thinking.\u201d\u00a0<\/p>\n<p><strong>They noted that current evaluations primarily focus on established 
mathematical and coding benchmarks, \u201cemphasizing final answer accuracy.\u201d\u00a0<\/strong><\/p>\n<p>However, this evaluation does not provide insights into the reasoning capabilities of the AI models, they said.\u00a0<\/p>\n<p>The research contrasts with an\u00a0<a href=\"https:\/\/cointelegraph.com\/news\/human-level-ai-as-early-as-2026-anthropic-ceo\">expectation\u00a0<\/a>that artificial general intelligence is just a few years away.<\/p>\n<h2>Apple researchers test \u201cthinking\u201d AI models<\/h2>\n<p>The researchers devised different puzzle games to test \u201cthinking\u201d and \u201cnon-thinking\u201d variants of Claude Sonnet, OpenAI\u2019s o3-mini and o1, and DeepSeek-R1 and V3 chatbots beyond the standard mathematical benchmarks.\u00a0<\/p>\n<p>They discovered that \u201cfrontier LRMs face a complete accuracy collapse beyond certain complexities,\u201d don\u2019t generalize reasoning effectively, and their edge disappears with rising complexity, contrary to expectations for AGI capabilities.<\/p>\n<p><em><strong>\u201cWe found that LRMs have limitations in exact computation: they fail to use explicit algorithms and reason inconsistently across puzzles.\u201d<\/strong><\/em><\/p>\n<p><a href=\"https:\/\/cms.zerohedge.com\/s3\/files\/inline-images\/019752da-0126-74a1-bdae-73095d63.jpg?itok=dennNQpr\"><\/a><\/p>\n<p><em>Verification of final answers and intermediate reasoning traces (top chart), and charts showing non-thinking models are more accurate at low complexity (bottom charts). 
Source:\u00a0<\/em><a href=\"https:\/\/ml-site.cdn-apple.com\/papers\/the-illusion-of-thinking.pdf\"><em>Apple Machine Learning Research<\/em><\/a><em>\u00a0<\/em><\/p>\n<h2>AI chatbots are overthinking, say researchers<\/h2>\n<p>They found inconsistent and shallow reasoning with the models and also observed overthinking, with AI chatbots generating correct answers early and then wandering into incorrect reasoning.<\/p>\n<p>The researchers concluded that LRMs mimic reasoning patterns without truly internalizing or generalizing them, which falls short of AGI-level reasoning.<\/p>\n<p><em><strong>\u201cThese insights challenge prevailing assumptions about LRM capabilities and suggest that current approaches may be encountering fundamental barriers to generalizable reasoning.\u201d<\/strong><\/em><\/p>\n<p><a href=\"https:\/\/cms.zerohedge.com\/s3\/files\/inline-images\/019752dc-6617-7291-af19-a86f4863.jpg?itok=OrSyM5Ya\"><\/a><\/p>\n<p><em>Illustration of the four puzzle environments. Source: Apple<\/em><\/p>\n<h2>The race to develop AGI<\/h2>\n<p>AGI is the holy grail of\u00a0<a href=\"https:\/\/cointelegraph.com\/news\/ai-arms-race\">AI development<\/a>, a state where the machine can think and reason like a human and is on a par with human intelligence.\u00a0<\/p>\n<p>In January, OpenAI CEO Sam Altman\u00a0<a href=\"https:\/\/cointelegraph.com\/news\/first-ai-agents-join-workforce-2025-sam-altman\">said<\/a>\u00a0the firm was closer to building AGI than ever before. \u201cWe are now confident we know how to build AGI as we have traditionally understood it,\u201d he said at the time.\u00a0<\/p>\n<p>In November, Anthropic CEO Dario Amodei\u00a0<a href=\"https:\/\/cointelegraph.com\/news\/human-level-ai-as-early-as-2026-anthropic-ceo\">said<\/a>\u00a0that AGI would exceed human capabilities in the next year or two. 
<em><strong>\u201cIf you just eyeball the rate at which these capabilities are increasing, it does make you think that we\u2019ll get there by 2026 or 2027,\u201d he said.\u00a0<\/strong><\/em><\/p>\n<\/div>\n<p>      <span class=\"field field--name-uid field--type-entity-reference field--label-hidden\"><a title=\"View user profile.\" href=\"https:\/\/cms.zerohedge.com\/users\/tyler-durden\" class=\"username\">Tyler Durden<\/a><\/span><br \/>\n<span class=\"field field--name-created field--type-created field--label-hidden\">Mon, 06\/09\/2025 &#8211; 17:10<\/span><\/p>\n<p><a href=\"https:\/\/www.zerohedge.com\/technology\/ai-models-still-far-agi-level-reasoning-apple-researchers\" target=\"_blank\" class=\"\">https:\/\/www.zerohedge.com\/technology\/ai-models-still-far-agi-level-reasoning-apple-researchers<\/a>\u00a0<\/p>\n","protected":false},"excerpt":{"rendered":"<p>AI Models Still Far From AGI-Level Reasoning: Apple Researchers Authored by Martin Young via CoinTelegraph.com, The race to develop artificial general intelligence (AGI) still 
has&#8230;<\/p>\n","protected":false},"author":0,"featured_media":1540775,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-1540774","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-news","wpcat-1-id"],"_links":{"self":[{"href":"https:\/\/bugaluu.com\/news\/wp-json\/wp\/v2\/posts\/1540774","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/bugaluu.com\/news\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/bugaluu.com\/news\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/bugaluu.com\/news\/wp-json\/wp\/v2\/comments?post=1540774"}],"version-history":[{"count":0,"href":"https:\/\/bugaluu.com\/news\/wp-json\/wp\/v2\/posts\/1540774\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/bugaluu.com\/news\/wp-json\/wp\/v2\/media\/1540775"}],"wp:attachment":[{"href":"https:\/\/bugaluu.com\/news\/wp-json\/wp\/v2\/media?parent=1540774"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/bugaluu.com\/news\/wp-json\/wp\/v2\/categories?post=1540774"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/bugaluu.com\/news\/wp-json\/wp\/v2\/tags?post=1540774"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}