Date: March 20th, 2025 12:42 PM
Author: Contagious Angry Shitlib
https://metr.org/blog/2025-03-19-measuring-ai-ability-to-complete-long-tasks/?utm_source=www.therundown.ai&utm_medium=newsletter&utm_campaign=ai-s-moore-s-law-emerges&_bhlid=8ff10f37396291f5cb09a5431c0a47b50c620201
The Rundown: Researchers at METR just published new data showing that the length of tasks AI agents can complete autonomously has been doubling approximately every 7 months since 2019, revealing a "Moore's Law" for AI capabilities.
The details:
The study tracked human and AI performance across 170 software tasks ranging from 2-second decisions to 8-hour engineering challenges.
Top models like Claude 3.7 Sonnet have a "time horizon" of 59 minutes, meaning they complete tasks that take skilled humans this long with at least 50% reliability.
Older models like GPT-4 can handle tasks requiring about 8-15 minutes of human time, while 2019 systems struggle with anything beyond a few seconds.
If the exponential trend continues, AI systems will be capable of completing month-long human-equivalent projects with reasonable reliability by 2030.
Moore's Law is the observation that the number of transistors on a chip doubles roughly every two years, which is why devices get faster and cheaper over time.
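
The 2030 extrapolation in the list above can be sanity-checked with a little arithmetic. Below is a rough sketch, not METR's actual method: it takes the 59-minute horizon and the 7-month doubling time from the summary, and assumes (my number, not the article's) that a "month-long" project means about 160 working hours. The function name is made up for illustration.

```python
import math

def months_until_horizon(current_minutes, target_minutes, doubling_months=7.0):
    """Months for the time horizon to grow from current to target,
    assuming it keeps doubling at a constant rate."""
    doublings = math.log2(target_minutes / current_minutes)
    return doublings * doubling_months

# From the summary: 59-minute horizon, doubling every 7 months.
CURRENT_HORIZON_MIN = 59
# Assumption (not from the article): one work-month = 4 weeks x 40 hours.
WORK_MONTH_MIN = 160 * 60

months = months_until_horizon(CURRENT_HORIZON_MIN, WORK_MONTH_MIN)
print(f"{months:.1f} months, about {months / 12:.1f} years")
```

Under those assumptions the answer comes out a bit over four years from the March 2025 post, which lands in 2029 and fits the "by 2030" claim. A different definition of a work-month or a slightly different doubling time shifts the date by months, not decades.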
Why it matters: A predictable growth pattern in AI capabilities gives the industry an important forecasting tool. Systems that can independently handle longer and more complex tasks, including ones that take humans months, will reshape how businesses across the world approach AI and automation.
(http://www.autoadmit.com/thread.php?thread_id=5697123&forum_id=2:#48765527)