Dan Hendrycks

Notable Quotes

"While AIs are smart, they are not yet that useful." — on the launch results, October 2025

from AI agents quadruple their score on a real freelance-work benchmark · Updated 2 hours ago

Stories

AI agents quadruple their score on a real freelance-work benchmark

New Capabilities

Leading the group that runs the benchmark

Eight months ago, the best AI agent could finish 2.5% of real freelance jobs well enough for a paying client to accept. On July 6, the Center for AI Safety reported that a new model now clears 16.1%.

Updated 2 hours ago

Google Gemini's push toward scientific reasoning

New Capabilities

Co-creator of Humanity's Last Exam benchmark

OpenAI launched the first commercial reasoning model in September 2024. Seventeen months later, Google's upgraded Gemini 3 Deep Think pulled ahead on the benchmarks that matter most for science.

Updated May 29

Dan Hendrycks

Notable Quotes

Stories

AI agents quadruple their score on a real freelance-work benchmark

Google Gemini's push toward scientific reasoning

Help us improve Newzino