AI Agents Flunk Real-World Freelance Test, Study Finds
-

A new study from the Center for AI Safety and Scale AI suggests that replacing remote freelancers with AI agents may be far less effective than some companies hope. Researchers tested six leading AI systems using a newly created “Remote Labor Index,” which evaluates their ability to complete real-world freelance tasks across industries like game development and data analysis. The results were stark: not a single AI agent completed more than 3% of assigned work at an acceptable professional level.
According to CAIS director Dan Hendrycks, the findings offer a more grounded perspective on current AI capabilities. In total, the agents earned just $1,810 out of a possible $143,991 in simulated freelance income. The top performer, Manus, achieved a 2.5% completion rate, underscoring how far today’s AI tools remain from meaningfully replacing skilled human freelancers.