Experiment: Best AI fails to produce acceptable work on 97.5% of freelance tasks | AutoAdmit.com

Date: February 6th, 2026 12:55 PM
Author: LathamTouchedMe

They assigned a variety of freelance tasks (programming, research, architectural design, etc.) to different AIs. The best AI produced adequate results (work someone would be willing to pay for) in just 2.5% of tasks. Basically, no better than a non-professional clown using Google to figure something out. The bar for "adequate" was also very low. Current AI is maybe a tiny productivity boost (if that). It won't be able to replace professionals like myself.

https://www.semafor.com/article/10/31/2025/ai-still-fails-at-completing-real-life-work-study-finds

(http://www.autoadmit.com/thread.php?thread_id=5831741&forum_id=2Reputation#49650972)

Date: February 6th, 2026 10:42 PM
Author: ;.;;;....;;;.;...............;.....;;;

I'm certain this will be a permanent issue.

(http://www.autoadmit.com/thread.php?thread_id=5831741&forum_id=2Reputation#49652710)

Date: February 6th, 2026 12:58 PM
Author: brother andrews

i love when they have the little cutesy 20-something blonde girls weigh in on tech and finance

(http://www.autoadmit.com/thread.php?thread_id=5831741&forum_id=2Reputation#49650983)

Date: February 6th, 2026 1:03 PM
Author: LathamTouchedMe

what about this dude

https://www.thedailybeast.com/study-reveals-ai-probably-cant-do-your-job/

(http://www.autoadmit.com/thread.php?thread_id=5831741&forum_id=2Reputation#49650999)

Date: February 6th, 2026 1:07 PM
Author: norwood ultra

oh well i guess since a hammer can't build anything at all by itself it's at best a tiny productivity boost in carpentry

(http://www.autoadmit.com/thread.php?thread_id=5831741&forum_id=2Reputation#49651009)

Date: February 6th, 2026 1:08 PM
Author: computer online (🧐)

(http://www.autoadmit.com/thread.php?thread_id=5831741&forum_id=2Reputation#49651015)

Date: February 6th, 2026 1:14 PM
Author: LathamTouchedMe

That's fair. It's just that the way AI is sold by its evangelists you would think it's much more than just a hammer.

(http://www.autoadmit.com/thread.php?thread_id=5831741&forum_id=2Reputation#49651032)

Date: February 6th, 2026 1:18 PM
Author: winter olympic opening ceremony (No Future)

(http://www.autoadmit.com/thread.php?thread_id=5831741&forum_id=2Reputation#49651041)

Date: February 6th, 2026 1:18 PM
Author: To be fair (Semi-Retarded)

To be fair,

Pretty sure the difference is that in another 10 years, a hammer will still just be a tool that is totally unable to autonomously guide itself into building anything useful without active human use. Just like it has been for the last 5,000+ years and counting.

Wanna make the same bet about AI?

(http://www.autoadmit.com/thread.php?thread_id=5831741&forum_id=2Reputation#49651043)

Date: February 6th, 2026 1:22 PM
Author: LathamTouchedMe

"Wanna make the same bet about AI?"

Of course not. I'm absolutely terrified of AI. My OP is a ray of hope. I don't want to be replaced.

(http://www.autoadmit.com/thread.php?thread_id=5831741&forum_id=2Reputation#49651058)

Date: February 6th, 2026 1:23 PM
Author: computer online (🧐)

Well the good news is you're probably old enough to be the guy pulling the ladder up, so there's that.

(http://www.autoadmit.com/thread.php?thread_id=5831741&forum_id=2Reputation#49651060)

Date: February 6th, 2026 1:42 PM
Author: To be fair (Semi-Retarded)

(http://www.autoadmit.com/thread.php?thread_id=5831741&forum_id=2Reputation#49651119)

Date: February 6th, 2026 2:55 PM
Author: Matthias of Redwall Did Nothing Wrong #Cornflower

(http://www.autoadmit.com/thread.php?thread_id=5831741&forum_id=2Reputation#49651381)

Date: February 6th, 2026 1:19 PM
Author: computer online (🧐)

that's how bubbles generally work. an explosion of hype, followed by an "lol i told you so nbd", followed by society more gradually adopting/finding uses for a technology and changing significantly.

(http://www.autoadmit.com/thread.php?thread_id=5831741&forum_id=2Reputation#49651049)

Date: February 6th, 2026 1:22 PM
Author: LathamTouchedMe

(http://www.autoadmit.com/thread.php?thread_id=5831741&forum_id=2Reputation#49651059)

Date: February 6th, 2026 2:55 PM
Author: Matthias of Redwall Did Nothing Wrong #Cornflower

(http://www.autoadmit.com/thread.php?thread_id=5831741&forum_id=2Reputation#49651383)

Date: February 6th, 2026 3:11 PM
Author: artificial intelligence

Yes when this bubble pops it’s time to invest heavily

(http://www.autoadmit.com/thread.php?thread_id=5831741&forum_id=2Reputation#49651450)

Date: February 6th, 2026 2:54 PM
Author: Mainlining the $ecret Truth of the Univer$e (One Year Performance 1978-1979 (Cage Piece) (Awfully coy u are))

(http://www.autoadmit.com/thread.php?thread_id=5831741&forum_id=2Reputation#49651379)

Date: February 6th, 2026 1:16 PM
Author: sealclubber

really expected this to be one of the tired posts that is the exact opposite of the title

(http://www.autoadmit.com/thread.php?thread_id=5831741&forum_id=2Reputation#49651038)

Date: February 6th, 2026 1:20 PM
Author: computer online (🧐)

(http://www.autoadmit.com/thread.php?thread_id=5831741&forum_id=2Reputation#49651052)

Date: February 6th, 2026 1:43 PM
Author: FizzKidd (probably not even asian)

(http://www.autoadmit.com/thread.php?thread_id=5831741&forum_id=2Reputation#49651127)

Date: February 6th, 2026 2:29 PM
Author: do you think i am retarded, . ?

*warily clicks on thread while braced for a reverse big willy style pwn*

(http://www.autoadmit.com/thread.php?thread_id=5831741&forum_id=2Reputation#49651264)

Date: February 6th, 2026 2:53 PM
Author: Mainlining the $ecret Truth of the Univer$e (One Year Performance 1978-1979 (Cage Piece) (Awfully coy u are))

I'm sure whatever LLM was handed this assignment was given about the same amount of "fair" prompting and context that a first-year associate handed the same shit assignment is

(http://www.autoadmit.com/thread.php?thread_id=5831741&forum_id=2Reputation#49651374)

Date: February 6th, 2026 10:03 PM
Author: Rainier Wolfcastle

it seems like the story should be that this technology, which effectively didn’t even exist the last time we had a winter olympics, is already able to completely replace more than 1 in 50 *professionals* across a wide variety of tasks.

(http://www.autoadmit.com/thread.php?thread_id=5831741&forum_id=2Reputation#49652658)

Date: February 6th, 2026 10:16 PM
Author: .,.,...,..,.,.,:,,:,...,:::,...,:,.,.:..:.

Likely a lot of those tasks aren’t the current focus of model designers. Meanwhile for coding/SWE tasks, model competency is quite high because they are spending their training resources on it:

https://metr.org/blog/2025-03-19-measuring-ai-ability-to-complete-long-tasks/

The implication of AI coding models that can complete long, complicated tasks is that they can substantially automate AI design experimentation. I will eat a dick if that benchmark is less than 20% in a year

(http://www.autoadmit.com/thread.php?thread_id=5831741&forum_id=2Reputation#49652671)