Advanced AI Models Still Fall Short of Real Work

madmax

Despite claims of near-human intelligence, newer models like GPT-5 and ChatGPT Agent showed limited capability in completing practical freelance work, with completion rates of just 1.7% and 1.3%, respectively. Gemini 2.5 Pro ranked last with only 0.8%, underscoring how even advanced systems struggle with execution beyond controlled environments.

The study introduced the “Remote Labor Index,” a benchmark designed to evaluate whether AI can deliver economically valuable output. Results indicate that current AI lacks key human abilities such as continuous learning, long-term memory, and adaptability, which are critical for completing multi-step or evolving tasks in real-world workflows.

cryptohog

1% completion rate but don’t worry it’s basically human-level intelligence
